Overview
SDO concepts
• Type and Property
• DataObject
• Sequence
Data Access Services
Why choose Service Data Objects?
• Reduced database overhead
• Single API for data
• Knowledge of the structure
Relationship to PHP Data Objects and SimpleXML
Contact scenario
• Contact edit use case
• Retrieving the contact entry
• More on SDO navigation
• Modifying the data
• More on SDO modification
Summary
Resources
About the Authors
Introduction
Service Data Objects (SDOs) have been around in the Java technology world since November 2003. They are designed as a means of simplifying and unifying working with heterogeneous data sources. In February 2005, IBM and Zend announced a strategic partnership to collaborate on the development and support of the PHP environment. One aspect of this collaboration has been the definition and implementation of SDOs for PHP. This article gives an overview of SDOs and the motivations for using them in the PHP environment. A simple contact management scenario is used to illustrate key concepts.
Overview
Service Data Objects (SDOs) are designed to simplify and unify the way applications handle data. Using SDOs, application programmers can uniformly access and manipulate data from heterogeneous data sources, including relational databases, XML data sources, Web services, and other such enterprise information systems.
SDOs are based on the concept of disconnected data graphs. A data graph is a collection of data objects. Under the disconnected data graphs architecture, a client retrieves a data graph from a data source, changes the data graph, and can then apply the data graph updates back to the data source.
The task of connecting applications to data sources is performed by a Data Access Service (DAS) (see Figure 1). Client applications can query a DAS and get a data graph in response, modify the data graph and send the updated data graph to a DAS to have the changes applied to the original data source. Clients may also use DASes to read from one data source and write to another -- for example, reading an XML RSS feed and writing the results to a relational database. This SDO/DAS architecture allows applications to deal principally with data graphs and data objects.
The PHP implementation of SDOs involves mapping to a dynamic and weakly typed language. The result is a greatly simplified API, with the collapsing of the many type-specific getters/setters. In addition to this, SDOs also exploit PHP's ability to implement objects that can be manipulated as if they were arrays, including support for iteration, state testing and unsetting. The result is a powerful data object technology with a simple, intuitive interface.
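The array-like behaviour described above rests on standard PHP interfaces and magic methods. As a rough illustration only, the toy class below (a hypothetical ToyDataObject, not part of SDO and far simpler than the real implementation) shows how one object can offer property syntax, array syntax by name or index, iteration, isset() and unset(). It assumes PHP 8 for the mixed type.

```php
<?php
// Toy stand-in, NOT the SDO implementation: one object that supports
// property syntax, array syntax, iteration, isset() and unset().
final class ToyDataObject implements ArrayAccess, IteratorAggregate
{
    /** @var array<string, mixed> property name => value, in insertion order */
    private array $values = [];

    public function __get(string $name): mixed { return $this->values[$name] ?? null; }
    public function __set(string $name, mixed $value): void { $this->values[$name] = $value; }
    public function __isset(string $name): bool { return isset($this->values[$name]); }
    public function __unset(string $name): void { unset($this->values[$name]); }

    // Array syntax accepts a property name or a zero-based property index.
    private function key(mixed $offset): ?string
    {
        if (is_int($offset)) {
            return array_keys($this->values)[$offset] ?? null;
        }
        return (string) $offset;
    }

    public function offsetExists(mixed $offset): bool
    {
        $key = $this->key($offset);
        return $key !== null && isset($this->values[$key]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        $key = $this->key($offset);
        return $key === null ? null : ($this->values[$key] ?? null);
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        $key = $this->key($offset);
        if ($key === null) {
            throw new OutOfRangeException('Unknown property index');
        }
        $this->values[$key] = $value;
    }

    public function offsetUnset(mixed $offset): void
    {
        $key = $this->key($offset);
        if ($key !== null) {
            unset($this->values[$key]);
        }
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->values);
    }
}

$person = new ToyDataObject();
$person->name = 'Fred Fish';   // property syntax
$person['age'] = 35;           // array syntax by name
$age = $person[1];             // array syntax by index (second property)
```

Unlike a real data object, this toy has no model: any property name is accepted and indexes simply follow insertion order.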
SDO for PHP currently has support for relational and XML data sources, and a service provider interface to enable the implementation of support for others.
SDO concepts
Core SDO concepts are defined by the SDO model and a small number of interfaces that enable working with instance data managed in accordance with the model. Figure 2 shows a UML class diagram of these concepts. Let's take a few moments to understand the roles of the different elements.
Type and Property
At the heart of every SDO instance is a model that defines the permitted structure of
a data object. You can think of this as being like a blueprint for the data object. It
covers concepts such as parent-child data object relationships, cardinality of
relationships, permitted properties on a data object, and default values. The SDO model
is described in terms of Types and Properties. Types can be primitives,
such as string (SDO calls these data types and defines a set
mapped to PHP types), or complex types to represent an order or address (SDO calls
these data object types). Data object types contain properties. Each property
has a type that can be a data type or a data object type.
This Type and Property model allows data objects to represent what SDO refers to as data graphs. The essence of a data graph is a tree structure of data objects, navigable via their containment references (you can think of these as container, or aggregation, parent-child relationships), plus noncontainment references that point to data objects within the data graph (these are not aggregation relationships and can, therefore, point to any other data object within the tree).
Let's use the example of a Person data object to illustrate the important
concepts of Type and Property. The diagram below shows an example of a Person
SDO instance on the left and its corresponding model on the right. The Person
SDO instance has its structure defined by a type with name of Person.
The Person type has been defined to have two properties with the
names name and age. The types of these two properties
are SDO data types of String and Int. On the left, we
can see that the Person instance has had these two properties set
to the values of "Fred Fish" and 35, respectively.
In addition to the attributes mentioned, types also have attributes that say whether they are open (supporting additional instance properties not defined in the type), sequenced (preserving order across properties), or abstract (a base type capable of being extended by another SDO type). Properties also have attributes that describe things like cardinality (for example, a many-valued property to represent the departments in a company), any default value, and whether the property is read-only. Not all these concepts are supported in the PHP implementation of SDO. The documentation accompanying the SDO for PHP project describes the status of the various capabilities. All these concepts are covered in detail in the Java SDO specification.
Note that SDO for PHP does not currently support read-only properties, default values, abstract types, open types, or bidirectional relationships.
DataObject
The core interface for working with data objects is the DataObject (no surprise there). This supports all the capabilities one would expect when working with a data structure, such as setting and getting properties, creating child structures, property querying via an augmented subset of XPath, and structure navigation.
If we consider the earlier example, we could use the DataObject interface to get the age property value, 35, and then use it to update the age to 36, say. The use of the DataObject interface is described in more detail in Contact scenario.
Sequence
The sequence interface is used to manipulate data objects when ordering across a data object's properties is required (the data object is said to be a sequenced type). This is particularly useful when working with XML data where the order of property values is important, but varies for each instance and is, therefore, not explicitly defined by the model (for mixed XML content, for example).
Data Access Services
As mentioned in the introduction, SDOs rely on the existence of DASes. The role of a DAS is to retrieve data from, and write data to, a data source. When using a DAS, the client works with SDOs, and is, therefore, insulated from any data source-specific data representation.
A DAS can also act as a factory for data objects, creating new instances that conform to some predefined data source schema (for example, a database or XML schema).
Two DAS implementations are provided as part of the SDO for PHP project. These are the XML DAS for working with XML files or XML/HTTP sources, and the Relational Data Access Service (Relational DAS), implemented using PHP Data Objects (PDO) for accessing relational data sources.
Why choose Service Data Objects?
There are a number of reasons why you might consider using SDOs. The main ones are outlined below.
Reduced database overhead
At the heart of SDOs is support for disconnected working. Data objects can automatically record their change history, which a DAS can then use to detect collisions when applying changes back to an enterprise information system.
This technique is often referred to as optimistic concurrency, or optimistic offline locking (described in Patterns of Enterprise Application Architecture, by Martin Fowler). It is best suited to scenarios where there is a low risk of collision, perhaps due to infrequent edits, or the edits being governed by some external process that reduces the likelihood of multiple people concurrently working on the same data.
The benefits are even greater if, in addition to the low risk of collision, there is also high edit latency (significant time between checking out to make edits and edits being committed), or the data is passed around in ways that make it difficult to manage database connections and locks (for example, in service-oriented application architectures).
The major benefit to using this technique is the removal of the need for the application to hold connections and locks in the enterprise information system while some user or application is working on the data.
Interestingly, there is nothing in the SDO architecture that precludes the creation of DASes that employ a pessimistic concurrency model.
Single API for data
In addition to the optimistic concurrency support, another major benefit to using SDO is realized when working with multiple heterogeneous data sources. SDOs provide a single API for manipulating data independent of the data's originating data source.
In addition to the expected data structure manipulation support, SDOs also provide the ability to set and retrieve the contents of an SDO based on an XPath-like expression, including queries. This removes the burden from the client of having to navigate the data structure in order to identify substructures based on instance data. For example, we might choose to extend our Person example to include a new data object type that contains a collection of Person data objects. To identify an individual by name without XPath would require us to iterate through all the Person objects and test the name property until a match was found. To do this with XPath requires just a single call, specifying the property name and the value to be matched.
Knowledge of the structure
As we have already seen, SDOs carry internal knowledge of the structure of the data they represent (the model). This knowledge is used to ensure that the creation and modification of instance data conforms to the structure and type rules for the data object.
Data capabilities, such as SimpleXML and PDO, do not employ this approach and, therefore, must delegate this responsibility to other technologies when creating and validating data. As a consequence, the benefit that SimpleXML and PDO have over SDO is they do not require the developer to specify the model.
An additional benefit to having a model is the capability to introspect it at run time. For example, this can enable developers to write flexible user interfaces that can adapt to changes in their data structures (SDO schema) at run time. SDO model introspection is not fully enabled in the PHP implementation, but we expect this to change over time.
Relationship to PHP Data Objects and SimpleXML
PDO (not to be confused with SDO) aims to provide a consistent API for the common capabilities found in most relational database APIs. This greatly simplifies creating Web applications designed to support different database vendors by encapsulating the differences under a common API. PDO provides a simple object view of results, but does not attempt to normalize those results (one row of the result set equals one object, regardless of whether there are multiple tables represented in the result). The ease of use of PDO makes it a natural choice when working directly with databases, and for this very reason is the technology chosen for the Relational DAS implementation provided with SDO.
As mentioned, the focus of SDO is on providing a flexible data object representation for data from heterogeneous data sources and built-in support for optimistic concurrency. We have also described how having a "knowledge of the structure" enables SDOs to provide a single API for the complete life cycle of the data, including creation and validation. If these capabilities are important to your application, then SDO is probably an appropriate choice, using the Relational DAS (itself implemented on top of PDO) for relational data source support.
SimpleXML provides a simple way for working with XML instance documents. Documents are loaded, and can be navigated and manipulated through a simple API. The interface surfaces some specifics of XML (for example, the syntax differentiates between elements and attributes).
When parsing and processing an instance document, SimpleXML is an excellent technology choice. However, if significant manipulation is required, it is important that the technology understands the model for the document, and in that case, SDO is probably a more appropriate choice.
Contact scenario
You should now have a reasonable understanding of the main concepts of SDO. To help clarify, we will illustrate them with a simple scenario.
The following sections provide an overview of SDO for PHP capabilities, described in the context of a personal contacts example. This example demonstrates the following attributes of SDO:
- Disconnected working (optimistic concurrency)
- SDO navigation and manipulation
- The role of a DAS
Contact edit use case
In this scenario, a contacts Web application has been written to support the management of contact information. The contact information is stored in a relational database containing the following two tables:
Table 1. "contact" table definition
| Column | Type | Example |
|---|---|---|
| shortname (primary key) | string | "Charlie" |
| fullname | string | "Charles Babbage" |
Table 2. "address" table definition
| Column | Type | Example |
|---|---|---|
| id (auto-generated primary key) | integer | 1 |
| shortname (foreign key) | string | "Charlie" |
| addressline1 | string | "Analytical House" |
| addressline2 | string | "1 Engine Close" |
| city | string | "Walworth Road" |
| state | string | "London" |
| zip | string | "XX11 1ZZ" |
| telephone | string | "555-555-5555" |
To illustrate the use of SDOs, we have selected the use case of modifying a contact. The main steps to modifying a contact are as follows:
- Retrieve the contact to be modified from the database
- Make the modifications to the contact
- Apply the modification back to the database
Note: DAS APIs are not specified as part of the SDO specification, and, therefore, any sample code is necessarily specific to a particular implementation. The code snippets shown below have been created to match the APIs of the Relational DAS provided with SDO for PHP.
Retrieving the contact entry
The first page the user sees is the contact management main.php page. It has a single input field for entering the shortname of the contact to be edited, along with an Edit button to submit the request.
Entering a shortname and clicking Edit transitions to the edit.php page, passing
the specified shortname in the $_POST array. The edit.php page contains
the following main steps:
- Get the entered shortname
- Create a relational DAS instance
- Execute the query to retrieve the contact SDO
- Populate the edit form with the contact details
- Store the contact SDO in the session
1. Get the shortname
The welcome page form configuration resulted in the shortname being placed in
$_POST['shortname']. At this point, and also in the later section
on "Modifying the data", we would normally validate the user input to prevent
the database from being compromised.
// get the shortname from posted variables
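The original listing is cut off in this copy after its first comment. As a minimal sketch of this step, including the input validation the text recommends, something like the following could be used. The helper name read_shortname and the allowed character pattern are assumptions made for illustration; the article does not define them.

```php
<?php
/**
 * Return a validated shortname from posted form data, or null if it is
 * missing or malformed. The allowed pattern is an assumption of this
 * sketch, not something the article specifies.
 */
function read_shortname(array $post): ?string
{
    $raw = $post['shortname'] ?? null;
    if (!is_string($raw)) {
        return null;
    }
    $shortname = trim($raw);
    // Letters, digits, underscore and hyphen only, 1 to 32 characters.
    return preg_match('/\A[A-Za-z0-9_-]{1,32}\z/', $shortname) === 1
        ? $shortname
        : null;
}

// In edit.php this would be called as read_shortname($_POST).
$shortname = read_shortname(['shortname' => 'Charlie']);
```

Rejecting unexpected characters early complements, but does not replace, quoting or parameter binding when the value later reaches a SQL statement.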
2. Create a Relational DAS instance
An important aspect of creating the Relational
DAS is describing the database schema it should use. The Relational DAS will take this
schema and use it to define the model for the Service Data Objects it can create. This
information is often required by other Relational DAS instances and is, therefore, a
candidate for placing in a separate script which is then included.
// Describe the structure of the contact table
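The original listing is truncated here. The sketch below describes the two tables from Tables 1 and 2 using the array layout documented for the Relational DAS in the PHP manual (name, columns, PK, FK, plus parent/child containment entries). Treat the layout as an assumption to check against the SDO release in use; the variable names are illustrative.

```php
<?php
// Table descriptions in the array layout documented for the Relational DAS
// (assumed here; verify against the SDO release you are using).
$contact_table = [
    'name'    => 'contact',
    'columns' => ['shortname', 'fullname'],
    'PK'      => 'shortname',
];

$address_table = [
    'name'    => 'address',
    'columns' => ['id', 'shortname', 'addressline1', 'addressline2',
                  'city', 'state', 'zip', 'telephone'],
    'PK'      => 'id',
    // address.shortname is a foreign key pointing at the contact table.
    'FK'      => ['from' => 'shortname', 'to' => 'contact'],
];

$database_metadata = [$contact_table, $address_table];

// A contact contains its addresses (one-to-many containment).
$containment_metadata = [
    ['parent' => 'contact', 'child' => 'address'],
];
```

Keeping these arrays in one included script, as the text suggests, means the retrieval and update pages share a single description of the model.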
Note: The Relational DAS assumes that all containment relationships are cardinality one-to-many. So in this example, the contact can contain zero or more address DataObject instances.
Having defined the model, we can now create an instance of the Relational DAS.
// Create the Relational Data Access Service telling it the database
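The original listing is truncated here. A hedged sketch of creating the DAS follows, assuming the SDO_DAS_Relational constructor as documented in the PHP manual (table metadata, application root type, containment metadata) and assuming contact is the application root type. The wrapper function create_contact_das is hypothetical and returns null when the PECL SDO extension is not installed, so the snippet can be loaded anywhere.

```php
<?php
/**
 * Build a Relational DAS for the contact model, or return null when the
 * PECL SDO extension is not available.
 *
 * @param array $database_metadata    table descriptions (name, columns, PK, FK)
 * @param array $containment_metadata parent/child containment entries
 */
function create_contact_das(array $database_metadata, array $containment_metadata): ?object
{
    if (!class_exists('SDO_DAS_Relational')) {
        return null; // extension missing: nothing to construct
    }
    // 'contact' is assumed to be the application root type for this model.
    return new SDO_DAS_Relational($database_metadata, 'contact', $containment_metadata);
}
```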
3. Execute the query
We can now use the Relational DAS to retrieve the contact
information.
// connect to the database. This connection will be released when the
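The original listing is truncated here. The sketch below assumes the executeQuery(PDO, SQL, column specifier) method documented for SDO_DAS_Relational in the PHP manual; the function name fetch_contact_graph, the SQL text and the column list are illustrative assumptions based on Tables 1 and 2. It returns the root of the data graph, from which the contact would be reached as $root['contact'][0].

```php
<?php
/**
 * Run the contact query through the Relational DAS and return the root of
 * the resulting data graph. $das is expected to be an SDO_DAS_Relational
 * instance and $dbh an open PDO connection.
 */
function fetch_contact_graph(object $das, PDO $dbh, string $shortname): object
{
    // Quote the user-supplied value before placing it in the SQL text.
    $sql = 'SELECT c.shortname, c.fullname, a.id, a.addressline1, '
         . 'a.addressline2, a.city, a.state, a.zip, a.telephone '
         . 'FROM contact c, address a '
         . 'WHERE a.shortname = c.shortname AND c.shortname = '
         . $dbh->quote($shortname);

    // Column specifier: names the table each result column belongs to.
    $columns = [
        'contact.shortname', 'contact.fullname',
        'address.id', 'address.addressline1', 'address.addressline2',
        'address.city', 'address.state', 'address.zip', 'address.telephone',
    ];

    return $das->executeQuery($dbh, $sql, $columns);
}
```

Because this query uses an inner join, a contact with no address rows would not be returned; a real application would need to decide how to handle that case.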
The resulting $contact SDO is shown below. This shows the data objects,
their properties (property index shown in the square brackets, followed by the property
name), and the property values. As mentioned, the cardinality of the address containment
within a contact is assumed by the Relational DAS to be one-to-many. In the diagram
below, the notation [0] DataObject has been used to signify the first entry
in the list of address DataObjects.
4. Populate the edit form
Given the contact SDO, we can populate the form to
allow the user to edit the data.
<!-- Create and populate the form with the contact details -->
5. Store the data object in the session
Because we are disconnected from the
database, we need to store the contact data object in the session to make it available
to the next page.
// store the contact data object in the session
More on SDO navigation
The previous section briefly touched on accessing the properties of an SDO (see the code in Step 4, "Populate the edit form"). In addition to accessing primitive properties, most SDO applications also require navigation up and down parent-child data object relationships (between contact and address, for example). Some also require query capabilities to identify parts of the data graph.
The code snippets below give a quick overview of the ways of navigating the contact SDO data graph.
As we saw in Step 4, SDOs support property access using the object property syntax:
// get the fullname using the object property
We can also access fullname using the property index (the position as
defined by the data object's model, as shown in the square brackets in Figure 4):
// get the second contact property (fullname)
We can access many-valued child data object properties, such as address,
using the same syntax:
// get the list of address data objects via the object property
We can access individual elements of many-valued properties, such as the first address, using array syntax:
// get the first address from a list of address data objects
We can also directly reference properties within child data object properties, such as the ZIP code from the first address:
// access the zip code from the first address via the object properties
We can also iterate over the properties of a data object:
// Iterate over the properties of the first address
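The navigation listings in this section are truncated in this copy. As a rough sketch of three of the forms described so far (object property syntax, array syntax on a many-valued property, and iteration over a data object's properties), consider the hypothetical helper below. Plain PHP objects stand in for SDO data objects so that it runs without the extension; access by property index and by XPath expression is not shown because it needs real SDO data objects.

```php
<?php
/**
 * Collect "name: value" lines for a contact and its first address using
 * property syntax, list indexing and property iteration.
 */
function describe_first_address(object $contact): array
{
    $lines = ['contact: ' . $contact->fullname];   // object property syntax
    if (!isset($contact->address[0])) {            // first element of the list
        return $lines;                             // contact has no address yet
    }
    foreach ($contact->address[0] as $name => $value) {  // iterate properties
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Plain objects stand in for SDO data objects so the sketch runs anywhere.
$contact = (object) [
    'shortname' => 'Charlie',
    'fullname'  => 'Charles Babbage',
    'address'   => [(object) ['city' => 'Walworth Road', 'zip' => 'XX11 1ZZ']],
];
$lines = describe_first_address($contact);
```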
Finally, we can access the properties using XPath-like support, the simplest form being the property name:
// use property names (XPath) to access the zip property
XPath can also be used to navigate and query data objects. The Relational DAS
returns multiple results as a many-valued child property of a root data object,
which in this example we have named $root. If we had retrieved a number of
contacts in the Relational DAS query, we could identify an individual from its
first address line, for example:
// Get the address object that contains the addressline1 of "1 Engine Close"
Modifying the data
The following shows a simple example of a contact edit page, edit.php, where the contact has a single address:
This page allows the user to modify individual property values. When the Update button
is clicked, all the values are placed in the $_POST array, regardless of
whether they have been modified, and the application transitions to the confirm.php
page. The confirm page performs the following main steps:
- Retrieves the contact SDO from the session
- Updates the contact SDO
- Creates a Relational DAS instance
- Writes the changes back to the database
- Informs the user of the outcome
1. Retrieve the contact SDO from the session
The final step in the execution of
the edit page was to place the contact SDO into the session. We now retrieve this contact
to make the updates.
// retrieve the contact from the session
2. Update the contact
Now that we have the contact, we can go about making the
updates posted from the edit page. This is done by comparing the posted value with the
old value, and if they are different, setting the posted value on the contact. We do
this to avoid setting a value unnecessarily and causing SDO to record a change in the
change summary (holds the old values for data objects that have been modified). It
would be nice if the SDO implementation were to do this test on our behalf.
// update the fullname if changed
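The original listing is truncated here. A minimal sketch of the compare-then-set idea described above follows; the helper name set_if_changed is hypothetical, and a stdClass object stands in for the contact data object so the example runs without the SDO extension.

```php
<?php
/**
 * Copy a posted value onto a data object only when it differs from the
 * current value, so unchanged fields are not recorded as modifications.
 * Returns true when an assignment was made.
 */
function set_if_changed(object $target, string $property, string $posted): bool
{
    if ((string) $target->$property === $posted) {
        return false; // same value: leave the object untouched
    }
    $target->$property = $posted;
    return true;
}

// stdClass stands in for the contact data object in this sketch.
$contact = new stdClass();
$contact->fullname = 'Charles Babbage';

$unchanged = set_if_changed($contact, 'fullname', 'Charles Babbage');
$renamed   = set_if_changed($contact, 'fullname', 'Charles B. Babbage');
```

The same helper could be applied to each posted address field in turn.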
3. Create a Relational DAS
The next step is to create the Relational DAS
used to write the updates to the database. This code is identical to that used in
retrieval and, as mentioned, is best placed in a separate script (contact_model.inc.php,
for example).
// initialize the Relational Data Access Service
4. Write the changes back to the database
The next step is to apply the
changes back to the database. The call below shows how this is done for the Relational
DAS. There is no need to specify a SQL statement for updates because the Relational DAS
derives this from the model and the contact data object's change summary.
// apply the changes back to the database
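The original listing is truncated here. The sketch below wraps the applyChanges() call in error handling, assuming the applyChanges(PDO, root data object) signature documented for SDO_DAS_Relational in the PHP manual. The function name save_changes is hypothetical, and catching the generic Exception class is a deliberate simplification because the specific exception type is not given in this article.

```php
<?php
/**
 * Apply the recorded changes and report the outcome.
 * $das is expected to be an SDO_DAS_Relational instance, $dbh an open PDO
 * connection and $root the root data object of the graph being saved.
 */
function save_changes(object $das, PDO $dbh, object $root): bool
{
    try {
        $das->applyChanges($dbh, $root);
        return true;
    } catch (Exception $e) {
        // The DAS is described as rolling back and throwing on a collision.
        error_log('Contact update failed: ' . $e->getMessage());
        return false;
    }
}
```

The boolean result gives the confirm page a simple way to choose between a success message and a prompt to reload and retry.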
The applyChanges() call is deceptively simple. Under the covers, it is:
- Ordering SDO updates to ensure the correct results (for example, creates before updates).
- Generating SQL INSERT, UPDATE, and DELETE statements to apply the changes. The UPDATE and DELETE statements are qualified with the original values of the data, so that if the data has changed in the database in the meantime, this will be detected.
- Executing the SQL statements. If any of the SQL statements fails to execute, this is an indication that a collision has occurred, and the Relational DAS rolls back all changes and throws an exception. If all statements succeed, all the changes are committed to the database. The client application can then continue to work with the data object, make more changes, and apply them, or can discard it.
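To make the idea of an UPDATE qualified with original values concrete, here is a small stand-alone sketch using PDO with an in-memory SQLite database (it assumes the pdo_sqlite driver is available). It is not the SQL the Relational DAS actually generates; the function name update_fullname is hypothetical.

```php
<?php
// Collision-detecting update: the WHERE clause repeats the value read
// earlier, so the statement matches no row if someone else changed it.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE contact (shortname TEXT PRIMARY KEY, fullname TEXT)');
$dbh->exec("INSERT INTO contact VALUES ('Charlie', 'Charles Babbage')");

function update_fullname(PDO $dbh, string $shortname, string $original, string $new): bool
{
    $stmt = $dbh->prepare(
        'UPDATE contact SET fullname = :new
          WHERE shortname = :shortname AND fullname = :original'
    );
    $stmt->execute([
        ':new'       => $new,
        ':shortname' => $shortname,
        ':original'  => $original,
    ]);
    // Zero affected rows means the original value no longer matches.
    return $stmt->rowCount() === 1;
}

// First writer succeeds; a second writer holding the stale value does not.
$first  = update_fullname($dbh, 'Charlie', 'Charles Babbage', 'C. Babbage');
$second = update_fullname($dbh, 'Charlie', 'Charles Babbage', 'Chas Babbage');
```

No lock is held between reading and writing; the conflict is only discovered, and reported, at update time.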
5. Inform the user of the outcome
The final task is to notify the user of
the outcome. If no collisions are detected by the Relational DAS, and the update
is successful, all is well.
There are two common schemes employed for detecting conflicts:
- Add a version column (might be based on a timestamp) to each table, updated each time a row is modified. Comparing versions tells us if there is a conflict.
- Record all the original values and compare with the current ones to see if any have been modified.
More on SDO modification
The previous section briefly touched on accessing and setting an SDO property (see the code in Step 2, "Update the contact"). In addition to setting primitive properties, most SDO applications also require the creation of child data objects and the deletion of parts of the data structure. The code snippets below give a quick overview of the other types of modification one might wish to perform on the contact SDO.
The techniques described for getting individual properties are also available for setting:
// set the fullname via the object property
We can create child data objects. For example, the edit user interface could allow adding a new address to a contact. When new address details were posted, we might perform the following:
// create a child address data object
Note: This child data object is automatically inserted into the graph and
$address is simply a reference to that position in the graph. So
for example, if this were the first address added, the following would both set
the ZIP code on the contact's address.
// set the address's zip
We can test and unset individual instance properties of the contact. For example, if the user cleared the string in the interface, we could use this to signify unsetting:
// if the fullname value was cleared on the edit page and the fullname
Finally, we probably want to enable the deletion of a contact, or contact's address in the edit page, again implemented using unset:
// test and unset the first address
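The original listing is truncated here. A minimal sketch of the test-and-unset pattern follows; the helper name remove_first_address is hypothetical, and a plain PHP object with an array of addresses stands in for the contact data object so the example runs without the SDO extension.

```php
<?php
/**
 * Remove the first address from a contact if there is one.
 * Returns true when an address was removed.
 */
function remove_first_address(object $contact): bool
{
    if (!isset($contact->address[0])) {
        return false; // nothing to remove
    }
    unset($contact->address[0]);
    return true;
}

// Plain objects stand in for SDO data objects in this sketch.
$contact = (object) [
    'fullname' => 'Charles Babbage',
    'address'  => [(object) ['zip' => 'XX11 1ZZ']],
];
$removed = remove_first_address($contact);
$again   = remove_first_address($contact);
```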
Summary
SDOs add some interesting capabilities for working with data in PHP, whilst maintaining the simple, easy-to-use interfaces PHP developers expect. SDOs can represent complex data structures from heterogeneous data sources, whilst allowing their manipulation through a single API similar to that of SimpleXML and PDO. Optimistic concurrency support is built into SDO, allowing disconnected data manipulation without requiring the application to implement change tracking and conflict detection.
This article has given a taste of some of the capabilities of SDO, but there are many that have not been covered. We expect subsequent articles to cover these topics, including:
- Different classes (or types) of data objects: sequenced, open, abstract
- Relational DAS details
- XML DAS details
- Implementing a DAS using SDO service provider interface
This article was first published by IBM developerWorks at http://www-128.ibm.com/developerworks/opensource/library/os-sdophp/
Resources
The SDO for PHP implementation is delivered as a PECL extension, and can be downloaded from the SDO project page.
The SDO for PHP documentation can be found in the livedocs build of the PHP manual.
The Java SDO specification is available for download from IBM's developerWorks library.
About the Authors
Graham Charters is a Senior Software Engineer working at IBM's development lab in Hursley, England. Past roles have included IBM WebSphere® Application Server development, and architecture responsibilities in WebSphere Business Integration, and Adapters. His current interests are in the relationships between open source technologies, such as those of Linux®, Apache, MySQL, PHP/Perl/Python (LAMP) and the WebSphere platform. He holds degrees in computer science, numerical analysis, and machine vision, all from the University of Manchester.
Matthew Peters works at IBM's development lab in Hursley, England, as a Software Engineer. He has worked in various roles on IBM's CICS® and MQSeries® products, and also spent a number of years working with partners in scientific and technical computing and large-scale parallel processing. In recent years, he worked on the garbage collector in the IBM JVM. He has a degree in mathematics from Queens' College in Cambridge and a master's in software engineering from Oxford University.
Caroline Maynard is a Software Engineer at IBM's development lab in Hursley, England, where she has worked in diverse areas, including networking, graphics, and voice. Most recently, she led the development of the IBM Java ORB, which underpins the WebSphere Application Server EJB container. She is interested in the integration of IBM offerings with open source Linux, Apache, MySQL, PHP/Perl/Python (LAMP) technologies. She holds a degree in mathematics from the University of Sussex.
Anantoju Veera Srinivas works as a Staff Software Engineer at IBM's development lab in Bangalore, India. Past roles have included JVM development on Linux® and AIX®. His current interests are in Web technologies and databases, such as Linux, Apache, MySQL, PHP/Perl/Python (LAMP) in the open source world, and the IBM WebSphere® and DB2® platforms. He holds degrees in electrical and electronics engineering from Sri Venkateshwara University, Andhra Pradesh, India.
All four authors worked together to create the SDO extension for PHP.
