Chapter 8: Object-Oriented Databases
n Need for Complex Data Types
n The Object-Oriented Data Model
n Object-Oriented Languages
n Persistent Programming Languages
n Persistent C++ Systems
Need for Complex Data Types
n Traditional database applications in data processing had conceptually simple data types
é Relatively few data types, first normal form holds
n Complex data types have grown more important in recent years
é E.g. Addresses can be viewed as a
Ø Single string, or
Ø Separate attributes for each part, or
Ø Composite attributes (which are not in first normal form)
é E.g. it is often convenient to store multivalued attributes as-is, without creating a separate relation to store the values in first normal form
n Applications
é computer-aided design, computer-aided software engineering
é multimedia and image databases, and document/hypertext databases.
Object-Oriented Data Model
n Loosely speaking, an object corresponds to an entity in the E-R model.
n The object-oriented paradigm is based on encapsulating the code and data related to an object into a single unit.
n The object-oriented data model is a logical data model (like the E-R model).
n Adaptation of the object-oriented programming paradigm (e.g., Smalltalk, C++) to database systems.
Object Structure
n An object has associated with it:
é A set of variables that contain the data for the object. The value of each variable is itself an object.
é A set of messages to which the object responds; each message may have zero, one, or more parameters.
é A set of methods, each of which is a body of code to implement a message; a method returns a value as the response to the message
n The physical representation of data is visible only to the implementor of the object
n Messages and responses provide the only external interface to an object.
n The term message does not necessarily imply physical message passing. Messages can be implemented as procedure invocations.
Messages and Methods
n Methods are programs written in a general-purpose language with the following features:
é only variables in the object itself may be referenced directly
é data in other objects are referenced only by sending messages.
n Methods can be read-only or update methods
é Read-only methods do not change the value of the object
n Strictly speaking, every attribute of an entity must be represented by a variable and two methods, one to read and the other to update the attribute
é e.g., the attribute address is represented by a variable address and two messages get-address and set-address.
é For convenience, many object-oriented data models permit direct access to variables of other objects.
Object Classes
n Similar objects are grouped into a class; each such object is called an instance of its class
n All objects in a class have the same
é Variables, with the same types
é message interface
é methods
They may differ in the values assigned to variables
n Example: Group objects for people into a person class
n Classes are analogous to entity sets in the E-R model
Class Definition Example
class employee {
/*Variables */
string name;
string address;
date start-date;
int salary;
/* Messages */
int annual-salary();
string get-name();
string get-address();
int set-address(string new-address);
int employment-length();
};
n Methods to read and set the other variables are also needed with strict encapsulation
n Methods are defined separately
é E.g. int employment-length() { return today() - start-date; }
int set-address(string new-address) { address = new-address; }
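The class definition above can be sketched in a real language. The following is a minimal Python rendering, not the slide's own notation: identifiers use underscores instead of hyphens, `today` is passed in as a parameter so the result is reproducible, and treating `salary` as a monthly amount is an assumption made only for illustration.

```python
from datetime import date


class Employee:
    """Illustrative Python version of the employee class (names adapted)."""

    def __init__(self, name: str, address: str, start_date: date, salary: int):
        # Leading underscores mark the variables as private by convention;
        # Python itself does not enforce encapsulation.
        self._name = name
        self._address = address
        self._start_date = start_date
        self._salary = salary

    # Read-only methods: they do not change the value of the object.
    def get_name(self) -> str:
        return self._name

    def get_address(self) -> str:
        return self._address

    def annual_salary(self) -> int:
        # Assumption: salary is stored per month.
        return self._salary * 12

    def employment_length(self, today: date) -> int:
        # Number of days between the start date and the given day.
        return (today - self._start_date).days

    # Update method: changes the value of the object.
    def set_address(self, new_address: str) -> None:
        self._address = new_address


e = Employee("Ann", "1 Main St", date(2020, 1, 1), 4000)
e.set_address("2 Oak Ave")
```

Callers go through `get_address` and `set_address` rather than touching `_address`, which mirrors the message interface described for strict encapsulation.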
Inheritance
n E.g., class of bank customers is similar to class of bank employees, although there are differences
é both share some variables and messages, e.g., name and address.
é But there are variables and messages specific to each class e.g., salary for employees and credit-rating for customers.
n Every employee is a person; thus employee is a specialization of person
n Similarly, customer is a specialization of person.
n Create classes person, employee and customer
é variables/messages applicable to all persons associated with class person.
é variables/messages specific to employees associated with class employee; similarly for customer
n Place classes into a specialization/IS-A hierarchy
é variables/messages belonging to class person are inherited by class employee as well as customer
n Result is a class hierarchy
Note analogy with ISA Hierarchy in the E-R model
Class Hierarchy Definition
class person{
string name;
string address;
};
class customer isa person {
int credit-rating;
};
class employee isa person {
date start-date;
int salary;
};
class officer isa employee {
int office-number;
int expense-account-number;
};
n Full variable list for objects in the class officer:
é office-number, expense-account-number: defined locally
é start-date, salary: inherited from employee
é name, address: inherited from person
n Methods are inherited in the same way as variables.
n Substitutability — any method of a class, say person, can be invoked equally well with any object belonging to any subclass, such as subclass officer of person.
n Class extent: set of all objects in the class. Two options:
1. Class extent of employee includes all officer, teller and secretary objects.
2. Class extent of employee includes only employee objects that are not in a subclass such as officer, teller, or secretary.
é This is the usual choice in OO systems.
é Can access extents of subclasses to find all objects of subtypes of employee.
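The hierarchy, substitutability, and the two class-extent options can be sketched as follows. This is an illustrative Python version under stated assumptions: the helpers `mailing_label`, `shallow_extent` and `deep_extent` and the list `db` are hypothetical names, and a plain list stands in for the database.

```python
class Person:
    def __init__(self, name, address):
        self.name = name
        self.address = address


class Customer(Person):
    def __init__(self, name, address, credit_rating):
        super().__init__(name, address)
        self.credit_rating = credit_rating


class Employee(Person):
    def __init__(self, name, address, start_date, salary):
        super().__init__(name, address)
        self.start_date = start_date
        self.salary = salary


class Officer(Employee):
    def __init__(self, name, address, start_date, salary,
                 office_number, expense_account_number):
        super().__init__(name, address, start_date, salary)
        self.office_number = office_number
        self.expense_account_number = expense_account_number


def mailing_label(p: Person) -> str:
    # Written against Person; substitutability lets it accept any subclass.
    return f"{p.name}, {p.address}"


def deep_extent(objects, cls):
    """Option 1: objects of cls and of all its subclasses."""
    return [o for o in objects if isinstance(o, cls)]


def shallow_extent(objects, cls):
    """Option 2: only objects whose most specific class is cls."""
    return [o for o in objects if type(o) is cls]


# Hypothetical sample data; a list stands in for the database.
db = [
    Employee("Ann", "1 Main St", "2020-01-01", 4000),
    Officer("Bob", "2 Oak Ave", "2018-06-01", 6000, 12, 9001),
    Customer("Cy", "3 Elm Rd", 700),
]
```

With option 2, finding every employee means combining the shallow extents of `Employee` and its subclasses, which is what `deep_extent` computes directly.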
Example of Multiple Inheritance
Class DAG for banking example.
Multiple Inheritance
n With multiple inheritance a class may have more than one superclass.
é The class/subclass relationship is represented by a directed acyclic graph (DAG)
é Particularly useful when objects can be classified in more than one way, which are independent of each other
Ø E.g. temporary/permanent is independent of Officer/secretary/teller
Ø Create a subclass for each combination of subclasses
– Need not create subclasses for combinations that are not possible in the database being modeled
n A class inherits variables and methods from all its superclasses
n There is potential for ambiguity when a variable/message N with the same name is inherited from two superclasses A and B
é No problem if the variable/message is defined in a shared superclass
é Otherwise, do one of the following
Ø flag as an error,
Ø rename variables (A.N and B.N)
Ø choose one.
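The three ways of handling a name clash can be illustrated in Python, which by default takes the "choose one" route through its method resolution order. The classes and the helper `ambiguous_names` below are hypothetical, and the helper only inspects names defined directly in the listed superclasses, so it is a rough sketch rather than a complete check.

```python
class Temporary:
    def status(self):
        return "temporary"


class Teller:
    def status(self):
        return "teller"


class TemporaryTeller(Temporary, Teller):
    # "Choose one": Python picks Temporary.status because Temporary is
    # listed first among the superclasses.

    # "Rename": expose both inherited versions under distinct names.
    def temporary_status(self):
        return Temporary.status(self)

    def teller_status(self):
        return Teller.status(self)


def ambiguous_names(*bases):
    """Names defined directly in more than one superclass ("flag as an error").

    Names inherited from a shared superclass are not reported, because
    vars() lists only what each class defines itself.
    """
    seen, clashes = set(), set()
    for base in bases:
        for name in vars(base):
            if name.startswith("__"):
                continue  # skip Python's own bookkeeping attributes
            if name in seen:
                clashes.add(name)
            seen.add(name)
    return clashes


t = TemporaryTeller()
```

A system that prefers to reject ambiguous definitions could run a check like `ambiguous_names` before accepting a new class.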
More Examples of Multiple Inheritance
n Conceptually, an object can belong to each of several subclasses
é A person can play the roles of student, teacher or footballPlayer, or any combination of the three
Ø E.g., a student teaching assistant who also plays football
n Can use multiple inheritance to model “roles” of an object
é That is, allow an object to take on any one or more of a set of types
n But many systems insist an object should have a most-specific class
é That is, there must be one class that an object belongs to which is a subclass of all other classes that the object belongs to
é Create subclasses such as student-teacher and student-teacher-footballPlayer for each combination
é When many combinations are possible, creating subclasses for each combination can become cumbersome
Object Identity
n An object retains its identity even if some or all of the values of variables or definitions of methods change over time.
n Object identity is a stronger notion of identity than in programming languages or data models not based on object orientation.
é Value – data value; e.g. primary key value used in relational systems.
é Name – supplied by user; used for variables in procedures.
é Built-in – identity built into data model or programming language.
Ø no user-supplied identifier is required.
Ø Is the form of identity used in object-oriented systems.
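The difference between value-based and built-in identity can be seen in a small Python sketch. The `Person` class here is hypothetical, and Python's `is` only gives identity within one running program, so it illustrates the concept rather than the persistent identity a database needs.

```python
class Person:
    def __init__(self, name, address):
        self.name = name
        self.address = address

    def __eq__(self, other):
        # Value-based comparison: equal if all variable values match.
        return (isinstance(other, Person)
                and (self.name, self.address) == (other.name, other.address))


a = Person("Ann", "1 Main St")
b = Person("Ann", "1 Main St")   # same values, different object
alias = a                        # second reference to the same object

equal_before = (a == b)          # values match at this point
a.address = "2 Oak Ave"          # the value changes, the identity does not
```

After the update, `alias` still refers to the same object as `a` and sees the new address, while `b` is simply a different object that no longer has equal values.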
Object Identifiers
n Object identifiers used to uniquely identify objects
é Object identifiers are unique:
Ø no two objects have the same identifier
Ø each object has only one object identifier
é Can be stored as a field of an object, to refer to another object.
Ø E.g., the spouse field of a person object may be an identifier of another person object.
é Can be
Ø system generated (created by database) or
Ø external (such as social-security number)
é System generated identifiers:
Ø Are easier to use, but cannot be used across database systems
Ø May be redundant if unique identifier already exists
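A minimal sketch of system-generated identifiers, assuming a hypothetical in-memory `ObjectStore` that hands out integers from a counter. Real systems generate and encode identifiers differently; the point is only that a field can hold the identifier of another object.

```python
import itertools


class ObjectStore:
    """Hypothetical store that assigns a fresh identifier to each object."""

    def __init__(self):
        self._next_oid = itertools.count(1)
        self._objects = {}

    def insert(self, obj) -> int:
        oid = next(self._next_oid)   # never reused, so identifiers are unique
        self._objects[oid] = obj
        return oid

    def deref(self, oid):
        return self._objects[oid]


store = ObjectStore()
ann = store.insert({"name": "Ann", "spouse": None})
bob = store.insert({"name": "Bob", "spouse": ann})   # field holds Ann's identifier
store.deref(ann)["spouse"] = bob
```

These identifiers are meaningful only inside this one store, which matches the point that system-generated identifiers cannot be used across database systems.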
Object Containment
n Each component in a design may contain other components
n Can be modeled as containment of objects. Objects containing other objects are called composite objects.
n Multiple levels of containment create a containment hierarchy
é links interpreted as is-part-of, not is-a.
n Allows data to be viewed at different granularities by different users.
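A containment hierarchy and views at different granularities can be sketched as below. The `Component` class, its `view` method and the bicycle parts are hypothetical examples chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    # Links to contained objects are read as is-part-of, not is-a.
    parts: list = field(default_factory=list)

    def view(self, depth: int) -> list:
        """Names visible down to `depth` levels of containment (0 = this object only)."""
        names = [self.name]
        if depth > 0:
            for part in self.parts:
                names.extend(part.view(depth - 1))
        return names


bicycle = Component("bicycle", [
    Component("wheel", [Component("rim"), Component("tire")]),
    Component("frame"),
])
```

A user interested only in the top-level design might call `bicycle.view(1)`, while a wheel designer would look at `bicycle.view(2)` or start directly from the wheel component.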
Object-Oriented Languages
n Object-oriented concepts can be used in different ways
é Object-orientation can be used as a design tool, and be encoded into, for example, a relational database
Ø (analogous to modeling data with an E-R diagram and then converting to a set of relations)
é The concepts of object orientation can be incorporated into a programming language that is used to manipulate the database.
Ø Object-relational systems – add complex types and object-orientation to relational language.
Ø Persistent programming languages – extend object-oriented programming language to deal with databases by adding concepts such as persistence and collections.
Persistent Programming Languages
n Persistent Programming languages allow objects to be created and stored in a database, and used directly from a programming language
é allow data to be manipulated directly from the programming language
Ø No need to go through SQL.
é No need for explicit format (type) changes
Ø format changes are carried out transparently by system
Ø Without a persistent programming language, format changes become a burden on the programmer
– More code to be written
– More chance of bugs
é allow objects to be manipulated in-memory
Ø no need to explicitly load from or store to the database
– Saved code, and saved overhead of loading/storing large amounts of data
n Drawbacks of persistent programming languages
é Due to power of most programming languages, it is easy to make programming errors that damage the database.
é Complexity of languages makes automatic high-level optimization more difficult.
é Do not support declarative querying as well as relational databases
Persistence of Objects
n Approaches to make transient objects persistent include establishing
é Persistence by Class – declare all objects of a class to be persistent; simple but inflexible.
é Persistence by Creation – extend the syntax for creating objects to specify that an object is persistent.
é Persistence by Marking – an object that is to persist beyond program execution is marked as persistent before program termination.
é Persistence by Reachability - declare (root) persistent objects; objects are persistent if they are referred to (directly or indirectly) from a root object.
Ø Easier for programmer, but more overhead for database system
Ø Similar to garbage collection used e.g. in Java, which also performs reachability tests
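Persistence by reachability amounts to a graph traversal from the root objects. The sketch below assumes a hypothetical representation in which `refs` maps each object identifier to the identifiers it refers to. It shows the general idea, not how any particular system implements it.

```python
def reachable(roots, refs):
    """Return the set of object ids reachable from the given root ids.

    `refs` maps an object id to the list of ids it refers to.
    """
    persistent = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in persistent:
            continue              # already visited; also guards against cycles
        persistent.add(oid)
        stack.extend(refs.get(oid, []))
    return persistent


# Hypothetical object graph: "tmp" refers to "c" but nothing refers to "tmp".
refs = {
    "root": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": ["a"],                   # cycle between a and c
    "tmp": ["c"],
}
persistent = reachable(["root"], refs)
```

Objects outside the returned set, such as `tmp` here, are transient. The traversal itself is the extra overhead the database system takes on.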
Object Identity and Pointers
n A persistent object is assigned a persistent object identifier.
n Degrees of permanence of identity:
é Intraprocedure – identity persists only during the execution of a single procedure
é Intraprogram – identity persists only during execution of a single program or query.
é Interprogram – identity persists from one program execution to another, but may change if the storage organization is changed
é Persistent – identity persists throughout program executions and structural reorganizations of data; required for object-oriented systems.
n In O-O languages such as C++, an object identifier is actually an in-memory pointer.
n Persistent pointer – persists beyond program execution
é can be thought of as a pointer into the database
Ø E.g. specify file identifier and offset into the file
é Problems due to database reorganization have to be dealt with by keeping forwarding pointers
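A persistent pointer made of a file identifier and an offset, with forwarding pointers left behind when an object moves, can be sketched as follows. `PersistentPointer`, `Database` and its methods are hypothetical names, and dictionaries stand in for files on disk.

```python
from typing import NamedTuple


class PersistentPointer(NamedTuple):
    file_id: int
    offset: int


class Database:
    """Hypothetical store addressed by (file_id, offset) locations."""

    def __init__(self):
        self._slots = {}      # location -> object
        self._forward = {}    # old location -> new location

    def store(self, ptr, obj):
        self._slots[ptr] = obj

    def move(self, old, new):
        # Reorganization: relocate the object and leave a forwarding pointer.
        self._slots[new] = self._slots.pop(old)
        self._forward[old] = new

    def deref(self, ptr):
        # Follow forwarding pointers until the current location is reached.
        while ptr in self._forward:
            ptr = self._forward[ptr]
        return self._slots[ptr]


db = Database()
original = PersistentPointer(file_id=1, offset=0)
db.store(original, "person: Ann")
db.move(original, PersistentPointer(2, 64))
db.move(PersistentPointer(2, 64), PersistentPointer(3, 128))
```

Pointers saved before the reorganization still resolve, at the cost of one extra lookup per forwarding hop.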
Storage and Access of Persistent Objects
How to find objects in the database: