An introduction to Service Data Objects for PHP
Graham Charters (charters@uk.ibm.com), Senior Software Engineer, IBM U.K. Laboratories
Matthew Peters (matthew_peters@uk.ibm.com), Software Engineer, IBM U.K. Laboratories
Caroline Maynard (caroline.maynard@uk.ibm.com), Software Engineer, IBM U.K. Laboratories
Anantoju Srinivas (srinivas.anantoju@in.ibm.com), Staff Software Engineer, IBM Bangalore Laboratories
First published by IBM at IBM developerWorks
(www.ibm.com/developerworks). All rights retained by IBM and the authors.
Used By Permission
Service Data Objects (SDOs) have been around in the Java™ technology world since November 2003. They are designed as a means of simplifying and unifying working with heterogeneous data sources. In February 2005, IBM and Zend announced a strategic partnership to collaborate on the development and support of the PHP environment. One aspect of this collaboration has been the definition and implementation of SDOs for PHP. This article gives an overview of SDOs and the motivations for using them in the PHP environment. A simple contact management scenario is used to illustrate key concepts.
Service Data Objects (SDOs) are designed to simplify and unify the way applications handle data. Using SDOs, application programmers can uniformly access and manipulate data from heterogeneous data sources, including relational databases, XML data sources, Web services, and other such enterprise information systems.
SDOs are based on the concept of disconnected data graphs. A data graph is a collection of data objects. Under the disconnected data graphs architecture, a client retrieves a data graph from a data source, changes the data graph, and can then apply the data graph updates back to the data source.
The task of connecting applications to data sources is performed by a Data Access Service (DAS) (see Figure 1). Client applications can query a DAS and get a data graph in response, modify the data graph and send the updated data graph to a DAS to have the changes applied to the original data source. Clients may also use DASes to read from one data source and write to another -- for example, reading an XML RSS feed and writing the results to a relational database. This SDO/DAS architecture allows applications to deal principally with data graphs and data objects.
Figure 1. The Role of a DAS
The PHP implementation of SDOs involves mapping to a dynamic and weakly typed language. The result is a greatly simplified API, with the collapsing of the many type-specific getters/setters. In addition to this, SDOs also exploit PHP's ability to implement objects that can be manipulated as if they were arrays, including support for iteration, state testing and unsetting. The result is a powerful data object technology with a simple, intuitive interface.
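The language hooks behind this array-like behaviour can be sketched in plain PHP. The class below is not part of the SDO extension; the name ArrayLikeObject is a hypothetical stand-in that only shows how one object can offer property syntax, array syntax, iteration, state testing and unsetting.

```php
<?php
// Plain-PHP stand-in (NOT the SDO extension); ArrayLikeObject is a hypothetical name.
class ArrayLikeObject implements ArrayAccess, IteratorAggregate
{
    private $values = [];

    // Object property syntax: $o->name, isset($o->name), unset($o->name)
    public function __get($name) { return $this->values[$name] ?? null; }
    public function __set($name, $value) { $this->values[$name] = $value; }
    public function __isset($name) { return isset($this->values[$name]); }
    public function __unset($name) { unset($this->values[$name]); }

    // Array syntax: $o['name'], isset($o['name']), unset($o['name'])
    public function offsetExists($offset): bool { return isset($this->values[$offset]); }
    #[\ReturnTypeWillChange]
    public function offsetGet($offset) { return $this->values[$offset] ?? null; }
    public function offsetSet($offset, $value): void { $this->values[$offset] = $value; }
    public function offsetUnset($offset): void { unset($this->values[$offset]); }

    // Iteration: foreach ($o as $name => $value)
    public function getIterator(): Traversable { return new ArrayIterator($this->values); }
}

$person = new ArrayLikeObject();
$person->name = 'Fred Fish';   // set with property syntax
$person['age'] = 35;           // set with array syntax

$seen = [];
foreach ($person as $name => $value) {
    $seen[$name] = $value;     // visits name, then age
}

unset($person['age']);               // unsetting
$ageStillSet = isset($person->age);  // state testing
```

Both syntaxes read and write the same underlying values, which is why a value set with one form is visible through the other.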
Core SDO concepts are defined by the SDO model and a small number of interfaces that enable working with instance data managed in accordance with the model. Figure 2 shows a UML class diagram of these concepts. Let's take a few moments to understand the roles of the different elements.
Figure 2. SDO Model and Interfaces
Type and Property: At the heart of every SDO instance is a model that defines the permitted structure of a data object. You can think of this as being like a blueprint for the data object. It covers concepts such as parent-child data object relationships, cardinality of relationships, permitted properties on a data object, and default values. The SDO model is described in terms of Types and Properties. Types can be primitives, such as string (SDO calls these "data types" and defines a set mapped to PHP types), or complex types to represent an order or address (SDO calls these data object types). Data object types contain properties. Each property has a type that can be a data type or a data object type.
This Type and Property model allows data objects to represent what SDO refers to as data graphs. The essence of a data graph is a tree structure of data objects, navigable via their containment references (you can think of these as container, or aggregation, parent-child relationships), plus noncontainment references that point to data objects within the data graph (these are not aggregation relationships and can, therefore, point to any other data object within the tree).
Let's use the example of a Person data object to illustrate the important concepts of Type and Property. The diagram below shows an example of a Person SDO instance on the left and its corresponding model on the right. The Person SDO instance has its structure defined by a type with name of "Person." The Person type has been defined to have two properties with the names "name" and "age." The types of these two properties are SDO data types of "String" and "Int." On the left, we can see that the Person instance has had these two properties set to the values of "Fred Fish" and 35, respectively.
Figure 3. Example Person SDO instance and model
In addition to the attributes mentioned, types also have attributes that say whether they are open (i.e., support additional instance properties not defined in the type), sequenced (preserve order across properties), or abstract (an abstract base type to be extended by another SDO type). Properties also have attributes that describe things like cardinality (for example, a many-valued property to represent the departments in a company), any default value, and whether the property is read-only. Not all these concepts are supported in the PHP implementation of SDO. The documentation accompanying the SDO for PHP project describes the status of the various capabilities. All these concepts are covered in detail in the Java SDO specification (see Resources).
DataObject: The core interface for working with data objects is the DataObject (no surprise there). This supports all the capabilities one would expect when working with a data structure, such as setting and getting properties, creating child structures, property querying via an augmented subset of XPath, and structure navigation.
If we consider the earlier example, we could use the DataObject interface to get the age property value, 35, and then use it to update the age to 36, say. The use of the DataObject interface is described in more detail in Contact scenario.
Sequence: The sequence interface is used to manipulate data objects when ordering across a data object's properties is required (the data object is said to be a sequenced type). This is particularly useful when working with XML data where the order of property values is important, but varies for each instance and is, therefore, not explicitly defined by the model (for mixed XML content, for example).
As mentioned in the introduction, SDOs rely on the existence of DASes. The role of a DAS is to retrieve data from, and write data to, a data source. When using a DAS, the client works with SDOs, and is, therefore, insulated from any data source-specific data representation.
A DAS can also act as a factory for data objects, creating new instances that conform to some predefined data source schema (for example, a database or XML schema).
Two DAS implementations are provided as part of the SDO for PHP project. These are the XML DAS for working with XML files or XML/HTTP sources, and the Relational Data Access Service (Relational DAS), implemented using PHP Data Objects (PDO) for accessing relational data sources.
Why choose Service Data Objects?
There are a number of reasons why you might consider using SDOs; the main ones are outlined below.
Reduced database overhead: At the heart of SDOs is support for disconnected working. Data objects can automatically record their change history, which a DAS can then use to detect collisions when applying changes back to an enterprise information system.
This technique is often referred to as optimistic concurrency, or optimistic offline locking (described in Patterns of Enterprise Application Architecture, by Martin Fowler). It is best suited to scenarios where there is a low risk of collision, perhaps due to infrequent edits, or the edits being governed by some external process that reduces the likelihood of multiple people concurrently working on the same data.
The benefits are even greater if, in addition to the low risk of collision, there is also high edit latency (significant time between checking out to make edits and edits being committed), or the data is passed around in ways that make it difficult to manage database connections and locks (for example, in service-oriented application architectures).
The major benefit to using this technique is the removal of the need for the application to hold connections and locks in the enterprise information system while some user or application is working on the data.
Interestingly, there is nothing in the SDO architecture that precludes the creation of DASes that employ a pessimistic concurrency model.
Single API for data: In addition to the optimistic concurrency support, another major benefit to using SDO is realized when working with multiple heterogeneous data sources. SDOs provide a single API for manipulating data independent of the data's originating data source.
In addition to the expected data structure manipulation support, SDOs also provide the ability to set and retrieve the contents of an SDO based on an XPath-like expression, including queries. This removes the burden from the client of having to navigate the data structure in order to identify substructures based on instance data. For example, we might choose to extend our Person example to include a new data object type that contains a collection of Person data objects. To identify an individual by name without XPath would require us to iterate through all the Person objects and test the name property until a match was found. To do this with XPath requires just a single call, specifying the property name and the value to be matched.
Knowledge of the structure: As we have already seen, SDOs carry internal knowledge of the structure of the data they represent (the model). This knowledge is used to ensure that the creation and modification of instance data conforms to the structure and type rules for the data object.
Data capabilities, such as SimpleXML and PDO, do not employ this approach and, therefore, must delegate this responsibility to other technologies when creating and validating data. As a consequence, the benefit that SimpleXML and PDO have over SDO is they do not require the developer to specify the model.
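To make the idea of model-enforced structure concrete, here is a minimal plain-PHP sketch built around the Person example. It is not the SDO implementation: the class name ModelCheckedObject and its strict type check are assumptions made for illustration, and a real SDO model carries far more (cardinality, containment, defaults).

```php
<?php
// Hypothetical plain-PHP sketch; ModelCheckedObject is illustrative, not an SDO API.
class ModelCheckedObject
{
    private $model;        // property name => expected gettype() result
    private $values = [];

    public function __construct(array $model)
    {
        $this->model = $model;
    }

    public function __set($name, $value)
    {
        // Reject properties the model does not declare.
        if (!isset($this->model[$name])) {
            throw new InvalidArgumentException("Unknown property: $name");
        }
        // Reject values of the wrong type (a simplification chosen for this sketch).
        if (gettype($value) !== $this->model[$name]) {
            throw new InvalidArgumentException("Property $name expects {$this->model[$name]}");
        }
        $this->values[$name] = $value;
    }

    public function __get($name)
    {
        return $this->values[$name] ?? null;
    }
}

// The Person type: two properties, "name" and "age".
$person = new ModelCheckedObject(['name' => 'string', 'age' => 'integer']);
$person->name = 'Fred Fish';
$person->age = 35;

$rejected = false;
try {
    $person->email = 'fred@example.com'; // not declared in the model
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

The point of the sketch is only that the object itself, rather than the calling code, knows which properties and types are permitted.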
An additional benefit to having a model is the capability to introspect it at run time. For example, this can enable developers to write flexible user interfaces that can adapt to changes in their data structures (SDO schema) at run time. SDO model introspection is not fully enabled in the PHP implementation, but we expect this to change over time.
Relationship to PHP Data Objects and SimpleXML
PDO (not to be confused with SDO) aims to provide a consistent API for the common capabilities found in most relational database APIs. This greatly simplifies creating Web applications designed to support different database vendors by encapsulating the differences under a common API. PDO provides a simple object view of results, but does not attempt to normalize those results (one row of the result set equals one object, regardless of whether there are multiple tables represented in the result). The ease of use of PDO makes it a natural choice when working directly with databases, and for this very reason it is the technology chosen for the Relational DAS implementation provided with SDO.
As mentioned, the focus of SDO is on providing a flexible data object representation for data from heterogeneous data sources and built-in support for optimistic concurrency. We have also described how having a "knowledge of the structure" enables SDOs to provide a single API for the complete life cycle of the data, including creation and validation. If these capabilities are important to your application, then SDO is probably an appropriate choice, using the Relational DAS implemented using PDO for relational data source support.
SimpleXML provides a simple way for working with XML instance documents. Documents are loaded, and can be navigated and manipulated through a simple API. The interface surfaces some specifics of XML (for example, the syntax differentiates between elements and attributes).
When parsing and processing an instance document, SimpleXML is an excellent technology choice. However, if significant manipulation is required, it is important that the technology understands the model for the document, and in that case, SDO is probably a more appropriate choice.
You should now have a reasonable understanding of the main concepts of SDO. To help clarify, we will illustrate them with a simple scenario.
The following sections provide an overview of SDO for PHP capabilities, described in the context of a personal contacts example. This example demonstrates the following attributes of SDO:
- Disconnected working (optimistic concurrency)
- SDO navigation and manipulation
- The role of a DAS
Code snippets are provided to help explain the concepts and development requirements. However, it is not our goal to describe a complete working example. We expect future articles to cover working examples and elaborate on the use of the relational and XML DASes.
In this scenario, a contacts Web application has been written to support the management of contact information. The contact information is stored in a relational database containing the following two tables:
Table 1. "contact" table definition
| Column | Type | Example |
| shortname (primary key) | string | "Charlie" |
| fullname | string | "Charles Babbage" |
Table 2. "address" table definition
| Column | Type | Example |
| id (auto-generated primary key) | integer | 1 |
| shortname (foreign key) | string | "Charlie" |
| addressline1 | string | "Analytical House" |
| addressline2 | string | "1 Engine Close" |
| city | string | "Walworth Road" |
| state | string | "London" |
| zip | string | "XX11 1ZZ" |
| telephone | string | "555-555-5555" |
To illustrate the use of SDOs, we have selected the use case of modifying a contact. The main steps to modifying a contact are as follows:
- Retrieve the contact to be modified from the database
- Make the modifications to the contact
- Apply the modification back to the database
These steps are described in more detail below.
Note: DAS APIs are not specified as part of the SDO specification; therefore, any sample code is necessarily specific to a particular implementation. The code snippets shown below have been created to match the APIs of the Relational DAS provided with SDO for PHP.
The first page the user sees is the contact management main.php page. It has a single input field for entering the shortname of the contact to be edited, along with an Edit button to submit the request.
Figure 4. The contact management main page
Entering a shortname and clicking Edit transitions to the edit.php page, passing the specified shortname in the $_POST array. The edit.php page contains the following main steps:
- Get the entered shortname
- Create a relational DAS instance
- Execute the query to retrieve the contact SDO
- Populate the edit form with the contact details
- Store the contact SDO in the session
Each of these steps is described in more detail below.
1. Get the shortname: The welcome page form configuration resulted in the shortname being placed in $_POST['shortname'].
NOTE: At this point, and also in the section "Modifying the data," we would normally validate the user input to prevent the database from being compromised.
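A minimal sketch of the kind of check meant here, assuming shortnames are limited to letters, digits, underscores and hyphens, up to 32 characters. That rule and the function name validateShortname are illustrative assumptions, not something the article or the Relational DAS defines.

```php
<?php
// Hypothetical helper: returns the cleaned shortname, or null if it is not acceptable.
function validateShortname($input): ?string
{
    if (!is_string($input)) {
        return null;
    }
    $candidate = trim($input);
    // \A and \z anchor the whole string, so a trailing newline is not accepted.
    if (preg_match('/\A[A-Za-z0-9_-]{1,32}\z/', $candidate) !== 1) {
        return null;
    }
    return $candidate;
}

// In the edit page this would be called as:
//   $shortname = validateShortname($_POST['shortname'] ?? null);
$ok  = validateShortname(' Charlie ');
$bad = validateShortname("x' OR '1'='1");
```

Validation of this kind complements, rather than replaces, passing the value to the database as a bound parameter.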
2. Create a Relational DAS instance: An important aspect of creating the Relational DAS is describing the database schema it should use. The Relational DAS will take this schema and use it to define the model for the Service Data Objects it can create. This information is often required by other Relational DAS instances and is, therefore, a candidate for placing in a separate script that is then included.
Listing 1. Describing the database schema
Note: The Relational DAS assumes that all containment relationships are cardinality one-to-many. So in this example, the contact can contain zero or more address DataObject instances.
Having defined the model, we can now create an instance of the Relational DAS.
Listing 2. Creating a Relational DAS instance
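As a rough sketch of how the schema description and the DAS construction fit together: the column names come from Tables 1 and 2, the metadata key names ('name', 'columns', 'PK', 'FK' with 'from' and 'to') are assumed from the Relational DAS documentation, and the containment references are given as a list of arrays. Check all of these against the version you have installed.

```php
<?php
// Table metadata mirroring Tables 1 and 2. The key names used here are an
// assumption based on the Relational DAS documentation; verify them locally.
$contact_table = array(
    'name'    => 'contact',
    'columns' => array('shortname', 'fullname'),
    'PK'      => 'shortname',
);

$address_table = array(
    'name'    => 'address',
    'columns' => array('id', 'shortname', 'addressline1', 'addressline2',
                       'city', 'state', 'zip', 'telephone'),
    'PK'      => 'id',
    'FK'      => array('from' => 'shortname', 'to' => 'contact'),
);

$database_metadata = array($contact_table, $address_table);

// One entry per parent-child containment relationship, as a list of arrays.
$containment_references = array(
    array('parent' => 'contact', 'child' => 'address'),
);

// Construct the DAS only when its class has been loaded (it ships as PHP code
// in SDO/DAS/Relational.php alongside the SDO extension).
$das = null;
if (class_exists('SDO_DAS_Relational')) {
    $das = new SDO_DAS_Relational($database_metadata, 'contact', $containment_references);
}
```

Keeping the metadata in its own script means the retrieval page and the update page can include the same definition.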
3. Execute the query: We can now use the Relational DAS to retrieve the contact information.
Listing 3. Use the Relational DAS to retrieve contact information
The resulting $contact SDO is shown below. This shows the data objects, their properties (property index shown in the square brackets, followed by the property name), and the property values. As mentioned, the cardinality of the address containment within a contact is assumed by the Relational DAS to be one-to-many. In the diagram below, the notation "[0] DataObject" has been used to signify the first entry in the list of address DataObjects.
Figure 5. Contact SDO instance
4. Populate the edit form: Given the contact SDO, we can populate the form to allow the user to edit the data.
Listing 4. Populate the edit form
5. Store the data object in the session: Because we are disconnected from the database, we need to store the contact data object in the session to make it available to the next page.
The previous section briefly touched on accessing the properties of an SDO (see the code in Step 4, "Populate the edit form"). In addition to accessing primitive properties, most SDO applications also require navigation up and down parent-child data object relationships (between contact and address, for example). Some also require query capabilities to identify parts of the data graph.
The code snippets below give a quick overview of the ways of navigating the contact SDO data graph.
As we saw in Step 4, SDOs support property access using the object property syntax:
We can also access the fullname using the property index (the position as defined by the data object's model, as shown in the square brackets in Figure 5):
We can access many-valued child data object properties, such as address, using the same syntax:
Listing 5. Access many-valued child data object properties
We can access individual elements of many-valued properties, such as the first address, using array syntax:
We can also directly reference properties within child data object properties, such as the ZIP code from the first address:
Listing 6. Reference properties within child data object properties
We can also iterate over the properties of a data object:
Listing 7. Iterate over the properties of a data object
Finally, we can access the properties using XPath-like support, the simplest form being the property name:
Listing 8. Access the properties using XPath-like support
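The access forms described in this section can be tried out against a stand-in. The sketch below builds a small contact graph from PHP's built-in ArrayObject rather than from a real SDO, so it only mirrors the property, array and iteration syntax; property-index access and XPath queries are not reproduced. The helper name node is hypothetical.

```php
<?php
// Stand-in data graph built from ArrayObject; a real contact SDO would come from a DAS.
function node(array $values): ArrayObject
{
    // ARRAY_AS_PROPS lets entries be read as $o->name as well as $o['name'].
    return new ArrayObject($values, ArrayObject::ARRAY_AS_PROPS);
}

$contact = node([
    'shortname' => 'Charlie',
    'fullname'  => 'Charles Babbage',
    'address'   => [
        node(['addressline1' => 'Analytical House', 'city' => 'Walworth Road', 'zip' => 'XX11 1ZZ']),
    ],
]);

$fullname  = $contact->fullname;         // object property syntax
$sameName  = $contact['fullname'];       // array syntax with the property name
$addresses = $contact->address;          // many-valued child property
$first     = $contact->address[0];       // one element of the many-valued property
$zip       = $contact->address[0]->zip;  // property of a child data object

$names = [];
foreach ($contact as $name => $value) {  // iterate over the properties
    $names[] = $name;
}
```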
XPath can also be used to navigate and query data objects. If we had retrieved a number of contacts in the Relational DAS query, we could identify an individual from its first address line. The Relational DAS returns multiple results as a many-valued child property of a root data object, which in the example we have named $root.
Listing 9. Use XPath to navigate and query data objects
The following shows a simple example of a contact edit page, edit.php, where the contact has a single address:
Figure 6. The contact edit page
This page allows the user to modify individual property values. When the Update button is clicked, all the values are placed in the $_POST array, regardless of whether they have been modified, and the application transitions to the confirm.php page. The confirm page performs the following main steps:
- Retrieves the contact SDO from the session
- Updates the contact SDO
- Creates a Relational DAS instance
- Writes the changes back to the database
- Informs the user of the outcome
Each of these steps is described in more detail below.
1. Retrieve the contact SDO from the session: The final step in the execution of the edit page was to place the contact SDO into the session. We now retrieve this contact to make the updates.
2. Update the contact: Now that we have the contact, we can go about making the updates posted from the edit page. This is done by comparing the posted value with the old value, and if they are different, setting the posted value on the contact. We do this to avoid setting a value unnecessarily and causing SDO to record a change in the change summary (which holds the old values for data objects that have been modified). It would be nice if the SDO implementation were to do this test on our behalf.
Listing 10. Update the contact
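The compare-then-set logic can be sketched as follows. A stdClass object stands in for the contact SDO, and the helper name applyPostedChanges is hypothetical; with a real data object the same loop would keep untouched properties out of the change summary.

```php
<?php
// Hypothetical helper: copy posted values onto the target only where they differ.
// Returns the names of the fields that were actually changed.
function applyPostedChanges($target, array $posted, array $fields): array
{
    $changed = [];
    foreach ($fields as $field) {
        if (!array_key_exists($field, $posted)) {
            continue; // nothing posted for this field
        }
        $current = $target->$field ?? null;
        // Posted form values are strings; non-string properties would need
        // converting before this strict comparison.
        if ($current !== $posted[$field]) {
            $target->$field = $posted[$field];
            $changed[] = $field;
        }
    }
    return $changed;
}

// stdClass stands in for the contact SDO retrieved from the session.
$contact = new stdClass();
$contact->shortname = 'Charlie';
$contact->fullname  = 'Charles Babbage';

$posted  = ['shortname' => 'Charlie', 'fullname' => 'Charles B. Babbage'];
$changed = applyPostedChanges($contact, $posted, ['shortname', 'fullname']);
```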
3. Create a Relational DAS: The next step is to create the Relational DAS used to write the updates to the database. This code is identical to that used in retrieval and, as mentioned, is best placed in a separate script ("contact_model.inc.php", for example).
4. Write the changes back to the database: The next step is to apply the changes back to the database. The call below shows how this is done for the Relational DAS. There is no need to specify a SQL statement for updates because the Relational DAS derives this from the model and the contact data object's change summary.
The applyChanges() call is deceptively simple. Under the covers, it is:
- Ordering SDO updates to ensure the correct results (for example, creates before updates).
- Generating SQL INSERT, UPDATE, and DELETE statements to apply the changes. The UPDATE and DELETE statements are qualified with the original values of the data so that, should the data have changed in the database in the meantime, this will be detected.
- Executing the SQL statements. If any of the SQL statements fails to execute, this is an indication that a collision has occurred and the Relational DAS rolls back all changes and throws an exception. If all statements succeed, all the changes are committed to the database. The client application can then continue to work with the data object, make more changes, and apply them, or can discard it.
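To illustrate what "qualified with the original values" can mean in practice, here is a small sketch that builds such an UPDATE. It is not the Relational DAS code: the function name buildQualifiedUpdate is hypothetical, and qualifying on the primary key plus the original values of the changed columns is an assumption about one reasonable strategy.

```php
<?php
// Hypothetical sketch of an optimistic-concurrency UPDATE builder.
// $original: column => value as first read; $changed: column => new value.
// Table and column names must come from trusted metadata, never from user input.
// Original values that are NULL would need "IS NULL" instead of "= ?".
function buildQualifiedUpdate(string $table, string $pk, array $original, array $changed): array
{
    $set = [];
    $setParams = [];
    $where = ["$pk = ?"];
    $whereParams = [$original[$pk]];

    foreach ($changed as $column => $newValue) {
        $set[] = "$column = ?";
        $setParams[] = $newValue;
        // Qualify with the value we originally read for this column.
        $where[] = "$column = ?";
        $whereParams[] = $original[$column];
    }

    $sql = "UPDATE $table SET " . implode(', ', $set)
         . ' WHERE ' . implode(' AND ', $where);

    return [$sql, array_merge($setParams, $whereParams)];
}

[$sql, $params] = buildQualifiedUpdate(
    'contact',
    'shortname',
    ['shortname' => 'Charlie', 'fullname' => 'Charles Babbage'],
    ['fullname' => 'Charles B. Babbage']
);

echo $sql, "\n";
```

If another user has changed fullname in the meantime, the WHERE clause no longer matches any row, and a caller could treat an affected-row count of zero as a collision.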
5. Inform the user of the outcome: The final task is to notify the user of the outcome. If no collisions are detected by the Relational DAS, and the update is successful, all is well.
Figure 7. The confirmation page
The previous section briefly touched on accessing and setting an SDO property (see the code in Step 2, "Update the contact"). In addition to setting primitive properties, most SDO applications also require the creation of child data objects and the deletion of parts of the data structure. The code snippets below give a quick overview of the other types of modification one might wish to perform on the contact SDO.
The techniques described for getting individual properties are also available for setting:
Listing 11. Setting individual properties
We can create child data objects. For example, the edit user interface could allow adding a new address to a contact. When new address details were posted, we might perform the following:
Listing 12. Create child data objects
Note: This child data object is automatically inserted into the graph and $address is simply a reference to that position in the graph. So for example, if this were the first address added, the following would both set the ZIP code on the contact's address.
Listing 13. Set ZIP code
We can test and unset individual instance properties of the contact. For example, if the user cleared the string in the interface, we could use this to signify unsetting:
Listing 14. Test and unset individual instance properties
Finally, we probably want to enable the deletion of a contact, or of a contact's address, in the edit page, again implemented using unset:
Listing 15. Enable deletion of a contact
SDOs add some interesting capabilities for working with data in PHP, whilst maintaining the simple, easy-to-use interfaces PHP developers expect. SDOs can represent complex data structures from heterogeneous data sources, whilst allowing their manipulation through a single API similar to that of SimpleXML and PDO. Optimistic concurrency support is built into SDO, allowing disconnected data manipulation without requiring the application to implement change tracking and conflict detection.
This article has given a taste of some of the capabilities of SDO, but there are many that have not been covered. We expect subsequent articles to cover these topics, including:
- Different classes (or types) of data objects: sequenced, open, abstract
- Relational DAS details
- XML DAS details
- Implementing a DAS using the SDO service provider interface
Learn
- The SDO for PHP implementation is delivered as a PECL extension, and can be downloaded from the SDO project page.
- The SDO for PHP documentation can be found in the "Function Reference" section of the latest builds of the PHP documentation.
- IBM and BEA are collaborating on specifications for programming models and APIs for Java 2 Enterprise Edition (J2EE) application servers that provide programmers with simpler and more powerful ways of building portable server applications. Read about it in "Service Data Objects, WorkManager, and Timers" (developerWorks, June 2005).
- "Connecting PHP to Apache Derby" shows you how to install and configure PHP on Windows®.
- Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products.
Get products and technologies
- Innovate your next open source development project with IBM trial software, available for download or on DVD.

Comments
Dear Cal,
I tried the example code of your interesting article but the construction of the SDO_DAS_Relational object in Listing 2 fails because of an error in ContainmentReferencesModel.php,v 1.2 2005/08/03 15:48:26.
Either the example should be changed from
$address_reference = array('parent' => 'contact', 'child' => 'address');
to
$address_reference[] = array('parent' => 'contact', 'child' => 'address');
for this to work, or lines 59-61 in ContainmentReferencesModel.php should be
$this->full_set_containment_references[] = new SDO_DAS_Relational_ContainmentReference($containment_references_metadata);
instead of
foreach ($containment_references_metadata as $cref) { $this->full_set_containment_references[] = new SDO_DAS_Relational_ContainmentReference($cref); }
because the SDO_DAS_Relational_ContainmentReference constructor already handles the parent and child references by itself.
I tried the example with sdo 1.0.2 and PHP 5.1.4.
Best regards,
Rene
P.S.: Here is the xdebug output:
Warning: array_key_exists(): The second argument should be either an array or an object in SDO/DAS/Relational/ContainmentReference.php on line 49
Call Stack:
0.0087 1. {main}() test_sdo.php:0
0.1733 2. SDO_DAS_Relational->__construct() test_sdo.php:32
0.1818 3. SDO_DAS_Relational_ContainmentReferencesModel->__construct() SDO/DAS/Relational.php:86
0.2169 4. SDO_DAS_Relational_ContainmentReference->__construct() SDO/DAS/Relational/ContainmentReferencesModel.php:60
0.2169 5. array_key_exists() SDO/DAS/Relational/ContainmentReference.php:49
Fatal error: Uncaught exception 'SDO_DAS_Relational_Exception' with message 'The metadata for a reference did not contain a parent field.' in SDO/DAS/Relational/ContainmentReference.php:52
Stack trace:
#0 SDO/DAS/Relational/ContainmentReferencesModel.php(60): SDO_DAS_Relational_ContainmentReference->__construct('contact')
#1 SDO/DAS/Relational.php(86): SDO_DAS_Relational_ContainmentReferencesModel->__construct('contact', Array)
#2 test_sdo.php(32): SDO_DAS_Relational->__construct(Array, 'contact', Array)
#3 {main}
thrown in SDO/DAS/Relational/ContainmentReference.php on line 52
Thanks Rene!
=C=
Otherwise great article :)
// construct the SQL query for contact retrieval
$stmt = "select * from contact, address where contact.shortname=$shortname and contact.shortname=address.contact_id";
// execute the query to retrieve the contact
$contact = $das->executeQuery($dbh, $stmt);
could and should be:
// construct the SQL query for contact retrieval
$stmt = $dbh->prepare("select * from contact, address where contact.shortname=? and contact.shortname=address.contact_id");
// execute the query to retrieve the contact
$contact = $das->executePreparedQuery($dbh, $stmt, array($shortname));
While I realize that this code is being supplied as OSS, not having support for multiple FKs is crippling.
In order to do this, could one hack the package's PHP code, or do you have to go into the C?
Potential contributors want to know....