An introduction to Service Data Objects for PHP


Graham Charters (charters@uk.ibm.com), Senior Software Engineer, IBM U.K. Laboratories
Matthew Peters (matthew_peters@uk.ibm.com), Software Engineer, IBM U.K. Laboratories
Caroline Maynard (caroline.maynard@uk.ibm.com), Software Engineer, IBM U.K. Laboratories
Anantoju Srinivas (srinivas.anantoju@in.ibm.com), Staff Software Engineer, IBM Bangalore Laboratories

First published by IBM at IBM developerWorks (www.ibm.com/developerworks). All rights retained by IBM and the authors.
Used By Permission

Service Data Objects (SDOs) have been around in the Java™ technology world since November 2003. They are designed as a means of simplifying and unifying working with heterogeneous data sources. In February 2005, IBM and Zend announced a strategic partnership to collaborate on the development and support of the PHP environment. One aspect of this collaboration has been the definition and implementation of SDOs for PHP. This article gives an overview of SDOs and the motivations for using them in the PHP environment. A simple contact management scenario is used to illustrate key concepts.
Currently supported data sources

SDO for PHP currently has support for relational and XML data sources, and a service provider interface to enable the implementation of support for others.

Service Data Objects (SDOs) are designed to simplify and unify the way applications handle data. Using SDOs, application programmers can uniformly access and manipulate data from heterogeneous data sources, including relational databases, XML data sources, Web services, and other such enterprise information systems.

SDOs are based on the concept of disconnected data graphs. A data graph is a collection of data objects. Under the disconnected data graphs architecture, a client retrieves a data graph from a data source, changes the data graph, and can then apply the data graph updates back to the data source.

The task of connecting applications to data sources is performed by a Data Access Service (DAS) (see Figure 1). Client applications can query a DAS and get a data graph in response, modify the data graph and send the updated data graph to a DAS to have the changes applied to the original data source. Clients may also use DASes to read from one data source and write to another -- for example, reading an XML RSS feed and writing the results to a relational database. This SDO/DAS architecture allows applications to deal principally with data graphs and data objects.


Figure 1. The Role of a DAS

The PHP implementation of SDOs involves mapping to a dynamic and weakly typed language. The result is a greatly simplified API, with the collapsing of the many type-specific getters/setters. In addition to this, SDOs also exploit PHP's ability to implement objects that can be manipulated as if they were arrays, including support for iteration, state testing and unsetting. The result is a powerful data object technology with a simple, intuitive interface.
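To make the "objects manipulated as if they were arrays" idea concrete, here is a minimal sketch in plain PHP (8.0 or later). The class `SketchDataObject` is a hypothetical stand-in written for this illustration; it is not the SDO extension's implementation, and it only shows the language features (property overloading, ArrayAccess, and iteration) that make such an interface possible.

```php
<?php
// Hypothetical stand-in class, NOT the SDO extension's implementation.
final class SketchDataObject implements ArrayAccess, IteratorAggregate
{
    /** @var array<string, mixed> property values keyed by property name */
    private array $values = [];

    // Object property syntax: $person->name
    public function __get(string $name): mixed { return $this->values[$name] ?? null; }
    public function __set(string $name, mixed $value): void { $this->values[$name] = $value; }
    public function __isset(string $name): bool { return isset($this->values[$name]); }
    public function __unset(string $name): void { unset($this->values[$name]); }

    // Array syntax: $person['name']
    public function offsetExists(mixed $offset): bool { return isset($this->values[$offset]); }
    public function offsetGet(mixed $offset): mixed { return $this->values[$offset] ?? null; }
    public function offsetSet(mixed $offset, mixed $value): void { $this->values[$offset] = $value; }
    public function offsetUnset(mixed $offset): void { unset($this->values[$offset]); }

    // Iteration: foreach ($person as $name => $value)
    public function getIterator(): Iterator { return new ArrayIterator($this->values); }
}

$person = new SketchDataObject();
$person->name = 'Fred Fish';   // set via property syntax
$person['age'] = 35;           // set via array syntax
```

Both syntaxes read and write the same underlying values, so `isset()`, `unset()`, and `foreach` behave consistently whichever style the caller prefers.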

SDO concepts

Core SDO concepts are defined by the SDO model and a small number of interfaces that enable working with instance data managed in accordance with the model. Figure 2 shows a UML class diagram of these concepts. Let's take a few moments to understand the roles of the different elements.


Figure 2. SDO Model and Interfaces

Type and Property: At the heart of every SDO instance is a model that defines the permitted structure of a data object. You can think of this as a blueprint for the data object. It covers concepts such as parent-child data object relationships, the cardinality of relationships, the permitted properties on a data object, and default values. The SDO model is described in terms of Types and Properties. Types can be primitives, such as string (SDO calls these "data types" and defines a set mapped to PHP types), or complex types that represent, for example, an order or an address (SDO calls these "data object types"). Data object types contain properties. Each property has a type that can be a data type or a data object type.

This Type and Property model allows data objects to represent what SDO refers to as data graphs. The essence of a data graph is a tree structure of data objects, navigable via their containment references (you can think of these as container, or aggregation, parent-child relationships), plus noncontainment references that point to data objects within the data graph (these are not aggregation relationships and can, therefore, point to any other data object within the tree).

Let's use the example of a Person data object to illustrate the important concepts of Type and Property. The diagram below shows an example of a Person SDO instance on the left and its corresponding model on the right. The Person SDO instance has its structure defined by a type with name of "Person." The Person type has been defined to have two properties with the names "name" and "age." The types of these two properties are SDO data types of "String" and "Int." On the left, we can see that the Person instance has had these two properties set to the values of "Fred Fish" and 35, respectively.


Figure 3. Example Person SDO instance and model
SDO for PHP model limitations

SDO for PHP does not currently support read-only properties, default values, abstract types, open types, or bidirectional relationships.

In addition to the attributes mentioned, types also have attributes that say whether they are open (that is, they support additional instance properties not defined in the type), sequenced (they preserve order across properties), or abstract (an abstract base type to be extended by another SDO type). Properties also have attributes that describe things like cardinality (for example, a many-valued property to represent the departments in a company), any default value, and whether the property is read-only. Not all these concepts are supported in the PHP implementation of SDO. The documentation accompanying the SDO for PHP project describes the status of the various capabilities. All these concepts are covered in detail in the Java SDO specification (see Resources).

DataObject: The core interface for working with data objects is the DataObject (no surprise there). This supports all the capabilities one would expect when working with a data structure, such as setting and getting properties, creating child structures, property querying via an augmented subset of XPath, and structure navigation.

If we consider the earlier example, we could use the DataObject interface to get the age property value, 35, and then use it to update the age to 36, say. The use of the DataObject interface is described in more detail in Contact scenario.

Sequence: The sequence interface is used to manipulate data objects when ordering across a data object's properties is required (the data object is said to be a sequenced type). This is particularly useful when working with XML data where the order of property values is important, but varies for each instance and is, therefore, not explicitly defined by the model (for mixed XML content, for example).

Data Access Services

As mentioned in the introduction, SDOs rely on the existence of DASes. The role of a DAS is to retrieve data from, and write data to, a data source. When using a DAS, the client works with SDOs and is, therefore, insulated from any data source-specific data representation.

A DAS can also act as a factory for data objects, creating new instances that conform to some predefined data source schema (for example, a database or XML schema).

Two DAS implementations are provided as part of the SDO for PHP project. These are the XML DAS for working with XML files or XML/HTTP sources, and the Relational Data Access Service (Relational DAS), implemented using PHP Data Objects (PDO) for accessing relational data sources.

Why choose Service Data Objects?

There are a number of reasons why you might consider using SDOs. The main ones are outlined below.

Reduced database overhead: At the heart of SDOs is support for disconnected working. Data objects can automatically record their change history, which a DAS can then use to detect collisions when applying changes back to an enterprise information system.

This technique is often referred to as optimistic concurrency, or optimistic offline locking (described in Patterns of Enterprise Application Architecture, by Martin Fowler). It is best suited to scenarios where there is a low risk of collision, perhaps due to infrequent edits, or the edits being governed by some external process that reduces the likelihood of multiple people concurrently working on the same data.

The benefits are even greater if, in addition to the low risk of collision, there is also high edit latency (significant time between checking out to make edits and edits being committed), or the data is passed around in ways that make it difficult to manage database connections and locks (for example, in service-oriented application architectures).

The major benefit to using this technique is the removal of the need for the application to hold connections and locks in the enterprise information system while some user or application is working on the data.

Interestingly, there is nothing in the SDO architecture that precludes the creation of DASes that employ a pessimistic concurrency model.

Single API for data: In addition to the optimistic concurrency support, another major benefit to using SDO is realized when working with multiple heterogeneous data sources. SDOs provide a single API for manipulating data independent of the data's originating data source.

In addition to the expected data structure manipulation support, SDOs also provide the ability to set and retrieve the contents of an SDO based on an XPath-like expression, including queries. This removes the burden from the client of having to navigate the data structure in order to identify substructures based on instance data. For example, we might choose to extend our Person example to include a new data object type that contains a collection of Person data objects. To identify an individual by name without XPath would require us to iterate through all the Person objects and test the name property until a match was found. To do this with XPath requires just a single call, specifying the property name and the value to be matched.
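For comparison, here is a sketch of the manual search that an XPath-like query replaces. It uses plain PHP objects rather than SDOs, the helper name `findByName` is hypothetical, and the second person is invented purely for illustration.

```php
<?php
// Plain PHP objects standing in for Person data objects (illustrative data only).
$people = [
    (object) ['name' => 'Fred Fish', 'age' => 35],
    (object) ['name' => 'Ada Lovelace', 'age' => 36],
];

// Hypothetical helper: iterate and test the name property until a match is found.
function findByName(array $people, string $name): ?object
{
    foreach ($people as $person) {
        if ($person->name === $name) {
            return $person;
        }
    }
    return null; // no match
}

$match = findByName($people, 'Fred Fish');
```

With an SDO, the loop collapses into a single bracketed expression that names the property and the value to match, in the style of `person[name='Fred Fish']` (the containing property name `person` is an assumption here).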

Knowledge of the structure: As we have already seen, SDOs carry internal knowledge of the structure of the data they represent (the model). This knowledge is used to ensure that the creation and modification of instance data conforms to the structure and type rules for the data object.

Data capabilities, such as SimpleXML and PDO, do not employ this approach and, therefore, must delegate this responsibility to other technologies when creating and validating data. As a consequence, the benefit that SimpleXML and PDO have over SDO is they do not require the developer to specify the model.
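The following sketch shows, in plain PHP (8.0 or later), what it means for an object to carry a model and check modifications against it. `TypedObject` is a hypothetical class written for this illustration, not part of the SDO extension. It simply rejects unknown properties and mismatched types; how strictly the SDO implementation enforces or converts types is not covered here.

```php
<?php
// Hypothetical sketch of a model-checked object; not the SDO extension's API.
final class TypedObject
{
    /** @var array<string, mixed> current property values */
    private array $values = [];

    /** @param array<string, string> $model property name => type name as reported by get_debug_type() */
    public function __construct(private array $model) {}

    public function __set(string $name, mixed $value): void
    {
        if (!isset($this->model[$name])) {
            throw new InvalidArgumentException("Unknown property: $name");
        }
        if (get_debug_type($value) !== $this->model[$name]) {
            throw new InvalidArgumentException("Wrong type for property: $name");
        }
        $this->values[$name] = $value;
    }

    public function __get(string $name): mixed
    {
        return $this->values[$name] ?? null;
    }
}

// The Person model from the earlier example: a string name and an int age.
$person = new TypedObject(['name' => 'string', 'age' => 'int']);
$person->name = 'Fred Fish';
$person->age = 35;
```

Assigning to a property the model does not define, or assigning a string to `age`, throws an InvalidArgumentException instead of silently changing the structure.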

Model introspection limitation

SDO for PHP currently only has limited support for model introspection (can determine the type of a data object, but not its structure), but we envisage this being improved over time.

An additional benefit to having a model is the capability to introspect it at run time. For example, this can enable developers to write flexible user interfaces that can adapt to changes in their data structures (SDO schema) at run time. SDO model introspection is not fully enabled in the PHP implementation, but we expect this to change over time.

Relationship to PHP Data Objects and SimpleXML

PDO (not to be confused with SDO) aims to provide a consistent API for the common capabilities found in most relational database APIs. This greatly simplifies creating Web applications designed to support different database vendors by encapsulating the differences under a common API. PDO provides a simple object view of results, but does not attempt to normalize those results (one row of the result set equals one object, regardless of whether there are multiple tables represented in the result). The ease of use of PDO makes it a natural choice when working directly with databases, and for this very reason is the technology chosen for the Relational DAS implementation provided with SDO.
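The "one row equals one object" behavior can be seen in a small sketch. It assumes the pdo_sqlite driver is available and uses an in-memory database with a cut-down version of the contact and address tables; the second address is invented for illustration. The final loop shows the normalization a client would have to do by hand to turn flat rows into a parent with children.

```php
<?php
// Assumes the pdo_sqlite driver; an in-memory database keeps the sketch self-contained.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec("CREATE TABLE contact (shortname TEXT PRIMARY KEY, fullname TEXT)");
$dbh->exec("CREATE TABLE address (id INTEGER PRIMARY KEY, contact_id TEXT, city TEXT)");
$dbh->exec("INSERT INTO contact VALUES ('Charlie', 'Charles Babbage')");
$dbh->exec("INSERT INTO address (contact_id, city) VALUES ('Charlie', 'Walworth Road')");
$dbh->exec("INSERT INTO address (contact_id, city) VALUES ('Charlie', 'Marylebone')");

// PDO returns one object per joined row: the contact columns repeat for each address.
$rows = $dbh->query(
    "SELECT c.shortname, c.fullname, a.id, a.city
       FROM contact c JOIN address a ON a.contact_id = c.shortname
      ORDER BY a.id"
)->fetchAll(PDO::FETCH_OBJ);

// Normalizing by hand: fold the flat rows into one contact with child addresses.
$contacts = [];
foreach ($rows as $row) {
    $contacts[$row->shortname] ??= ['fullname' => $row->fullname, 'address' => []];
    $contacts[$row->shortname]['address'][] = ['id' => (int) $row->id, 'city' => $row->city];
}
```

The query yields two row objects for a single contact, and only the hand-written loop recovers the one-contact, two-address shape.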

As mentioned, the focus of SDO is on providing a flexible data object representation for data from heterogeneous data sources and built-in support for optimistic concurrency. We have also described how having a "knowledge of the structure" enables SDOs to provide a single API for the complete life cycle of the data, including creation and validation. If these capabilities are important to your application, then SDO is probably an appropriate choice, using the Relational DAS (which is implemented on PDO) for relational data source support.

SimpleXML provides a simple way of working with XML instance documents. Documents are loaded, and can be navigated and manipulated through a simple API. The interface surfaces some specifics of XML (for example, the syntax differentiates between elements and attributes).

When parsing and processing an instance document, SimpleXML is an excellent technology choice. However, if significant manipulation is required, it is important that the technology understands the model for the document, and in that case, SDO is probably a more appropriate choice.





Contact scenario

You should now have a reasonable understanding of the main concepts of SDO. To help clarify, we will illustrate them with a simple scenario.

The following sections provide an overview of SDO for PHP capabilities, described in the context of a personal contacts example. This example demonstrates the following attributes of SDO:

  • Disconnected working (optimistic concurrency)
  • SDO navigation and manipulation
  • The role of a DAS

Code snippets are provided to help explain the concepts and development requirements. However, it is not our goal to describe a complete working example. We expect future articles to cover working examples and elaborate on the use of the relational and XML DASes.

Contact edit use case

In this scenario, a contacts Web application has been written to support the management of contact information. The contact information is stored in a relational database containing the following two tables:

Table 1. "contact" table definition

Column                  | Type   | Example
shortname (primary key) | string | "Charlie"
fullname                | string | "Charles Babbage"

Table 2. "address" table definition

Column                          | Type    | Example
id (auto-generated primary key) | integer | 1
shortname (foreign key)         | string  | "Charlie"
addressline1                    | string  | "Analytical House"
addressline2                    | string  | "1 Engine Close"
city                            | string  | "Walworth Road"
state                           | string  | "London"
zip                             | string  | "XX11 1ZZ"
telephone                       | string  | "555-555-5555"

To illustrate the use of SDOs, we have selected the use case of modifying a contact. The main steps to modifying a contact are as follows:

  • Retrieve the contact to be modified from the database
  • Make the modifications to the contact
  • Apply the modification back to the database

These steps are described in more detail below.

Note: DAS APIs are not specified as part of the SDO specification, so any sample code is necessarily specific to a particular implementation. The code snippets shown below have been created to match the APIs of the Relational DAS provided with SDO for PHP.

Retrieving the contact entry

The first page the user sees is the contact management main.php page. It has a single input field for entering the shortname of the contact to be edited, along with an Edit button to submit the request.


Figure 4. The contact management main page

Entering a shortname and clicking Edit transitions to the edit.php page, passing the specified shortname in the $_POST array. The edit.php page contains the following main steps:

  1. Get the entered shortname
  2. Create a relational DAS instance
  3. Execute the query to retrieve the contact SDO
  4. Populate the edit form with the contact details
  5. Store the contact SDO in the session

Each of these steps is described in more detail below.

1. Get the shortname: The form on the main page places the entered shortname in $_POST['shortname'].


// get the shortname from posted variables
$shortname = $_POST['shortname'];

NOTE: At this point, and also in the section "Modifying the data," we would normally validate the user input to prevent the database from being compromised.
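As a minimal sketch of the validation mentioned in the note, the helper below accepts only short names made of a conservative set of characters. The function name `validateShortname`, the permitted characters, and the length limit are all assumptions made for this illustration; a real application would choose rules that match its own data.

```php
<?php
// Hypothetical helper; the allowed character set and maximum length are assumptions.
function validateShortname(mixed $input): ?string
{
    if (!is_string($input)) {
        return null; // arrays or missing values are rejected
    }
    $input = trim($input);
    // Letters, digits, underscore, and hyphen only, 1 to 32 characters.
    return preg_match('/\A[A-Za-z0-9_-]{1,32}\z/', $input) === 1 ? $input : null;
}

// get the shortname from posted variables, or null if it is absent or invalid
$shortname = validateShortname($_POST['shortname'] ?? null);
if ($shortname === null) {
    // Reject the request here rather than passing the value on to a query.
}
```

Validation of this kind complements, rather than replaces, quoting or parameter binding when the value is later placed in a SQL statement.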

2. Create a Relational DAS instance: An important aspect of creating the Relational DAS is describing the database schema it should use. The Relational DAS will take this schema and use it to define the model for the Service Data Objects it can create. This information is often required by other Relational DAS instances and is, therefore, a candidate for placing in a separate script that is then included where needed.


Listing 1. Describing the database schema


// Describe the structure of the contact table
$contact_table = array(
    'name' => 'contact',
    'columns' => array('shortname', 'fullname', 'telephone'),
    'PK' => 'shortname'
    );

// Describe the structure of the address table
$address_table = array (
    'name' => 'address', 
    'columns' => array('id', 'contact_id', 'addressline1', 'addressline2', 
                       'city', 'state', 'zip'),
    'PK' => 'id', 
    'FK' => array ('from' => 'contact_id', 'to' => 'contact')
    );

$table_metadata = array($contact_table, $address_table);

// Describe the parent-child relationship.  This is information required
// by the Relational DAS to help map from the relational database
// representation to the data graph representation of SDO.
$address_reference = array('parent' => 'contact', 'child' => 'address');


Note: The Relational DAS assumes that all containment relationships are cardinality one-to-many. So in this example, the contact can contain zero or more address DataObject instances.

Having defined the model, we can now create an instance of the Relational DAS.


Listing 2. Creating a Relational DAS instance


// Create the Relational Data Access Service, telling it the database
// schema, which table should be considered the root of the graph,
// and finally the additional information for the object model.
$das = new SDO_DAS_Relational($table_metadata, 'contact', $address_reference);

3. Execute the query: We can now use the Relational DAS to retrieve the contact information.


Listing 3. Use the Relational DAS to retrieve contact information


// connect to the database.  This connection will be released when the
// $dbh variable is cleaned up.
$dbh = new PDO('mysql:dbname=contactdb;host=localhost', DB_USER, DB_PASSWORD);

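// construct the SQL query for contact retrieval, quoting the user-supplied
// shortname so that it is treated as a string literal
$stmt = 'select * from contact, address where contact.shortname=' .
    $dbh->quote($shortname) .
    ' and contact.shortname=address.contact_id';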

// execute the query to retrieve the contact
$contact = $das->executeQuery($dbh, $stmt);

The resulting $contact SDO is shown below. This shows the data objects, their properties (property index shown in the square brackets, followed by the property name), and the property values. As mentioned, the cardinality of the address containment within a contact is assumed by the Relational DAS to be one-to-many. In the diagram below, the notation "[0] DataObject" has been used to signify the first entry in the list of address DataObjects.


Figure 5. Contact SDO instance

4. Populate the edit form: Given the contact SDO, we can populate the form to allow the user to edit the data.


Listing 4. Populate the edit form


<!-- Create and populate the form with the contact details -->
<form action= ... method="POST">
  <input type="text" name="fullname"
         value="<?php echo htmlspecialchars($contact->fullname); ?>">

  ...
  <input type="text" name="addressline1"
         value="<?php echo htmlspecialchars($contact->address[0]->addressline1); ?>">
  ...
</form>

5. Store the data object in the session: Because we are disconnected from the database, we need to store the contact data object in the session to make it available to the next page.


// store the contact data object in the session
$_SESSION['contact_sdo'] = $contact;

More on SDO navigation

The previous section briefly touched on accessing the properties of an SDO (see the code in Step 4, "Populate the edit form"). In addition to accessing primitive properties, most SDO applications also require navigation up and down parent-child data object relationships (between contact and address, for example). Some also require query capabilities to identify parts of the data graph.

The code snippets below give a quick overview of the ways of navigating the contact SDO data graph.

As we saw in Step 4, SDOs support property access using the object property syntax:


// get the fullname using the object property
$fullname = $contact->fullname;

We can also access the fullname using the property index (the position as defined by the data object's model, as shown in the square brackets in Figure 5):


// get the second contact property (fullname)
$fullname = $contact[1];

We can access many-valued child data object properties, such as address, using the same syntax:


Listing 5. Access many-valued child data object properties


// get the list of address data objects via the object property
$addresses = $contact->address;

// get the list of address data objects via the property index
$addresses = $contact[2];

We can access individual elements of many-valued properties, such as the first address, using array syntax:


// get the first address from a list of address data objects
$address1 = $contact->address[0];

We can also directly reference properties within child data object properties, such as the ZIP code from the first address:


Listing 6. Reference properties within child data object properties


// access the zip code from the first address via the object properties
$zip = $contact->address[0]->zip;

// access the zip code via the property indices.
// Note: this style is not recommended since it leads to virtually 
// unserviceable code.  The XPath-style (described later) or defining 
// constants lead to much more readable code.
$zip = $contact[2][0][5];

We can also iterate over the properties of a data object:


Listing 7. Iterate over the properties of a data object


// Iterate over the properties of the first address
// $name is assigned each property name (e.g. "id", "addressline1", ...)
// $value is assigned each property value (e.g. 1, "Analytical House", ...)
foreach ($contact->address[0] as $name => $value) {
    echo "$name $value";
}

Finally, we can access the properties using XPath-like support, the simplest form being the property name:


Listing 8. Access the properties using XPath-like support


// use property names (XPath) to access the zip property
$zip = $contact['address'][0]['zip'];

// use single XPath expression with array index notation
// to access the zip property
// Note: XPath array indices start at 1.
$zip = $contact['address[1]/zip'];

// use single XPath expression with dotted index notation
// to access the zip property
// Note: SDO dotted notation indices start at 0.
$zip = $contact['address.0/zip'];

XPath can also be used to navigate and query data objects. The Relational DAS returns multiple results as a many-valued child property of a root data object, which in this example we have named $root. If we had retrieved a number of contacts in the Relational DAS query, we could identify an individual from the first line of an address, for example:


Listing 9. Use XPath to navigate and query data objects


// Get the address object that contains the addressline1 of "1 Engine Close"
$address = $root["contact/address[addressline1='1 Engine Close']"];

// Get the contact with the matching address.  getContainer() navigates to the
// parent of a data object, in this case the contact SDO.
$contact = $address->getContainer();

Modifying the data

The following shows a simple example of a contact edit page, edit.php, where the contact has a single address:


Figure 6. The contact edit page

This page allows the user to modify individual property values. When the Update button is clicked, all the values are placed in the $_POST array, regardless of whether they have been modified, and the application transitions to the confirm.php page. The confirm page performs the following main steps:

  1. Retrieves the contact SDO from the session
  2. Updates the contact SDO
  3. Creates a Relational DAS instance
  4. Writes the changes back to the database
  5. Informs the user of the outcome

Each of these steps is described in more detail below.

1. Retrieve the contact SDO from the session: The final step in the execution of the edit page was to place the contact SDO into the session. We now retrieve this contact to make the updates.


// retrieve the contact from the session
$contact = $_SESSION['contact_sdo'];

2. Update the contact: Now that we have the contact, we can make the updates posted from the edit page. This is done by comparing each posted value with the old value and, if they are different, setting the posted value on the contact. We do this to avoid setting a value unnecessarily and causing SDO to record a change in the change summary (which holds the old values for data objects that have been modified). It would be nice if the SDO implementation were to do this test on our behalf.


Listing 10. Update the contact


// update the fullname if changed
if ($contact->fullname != $_POST['fullname']) {
    $contact->fullname = $_POST['fullname'];
}
...
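The compare-then-set pattern can be generalized to several fields. The sketch below is one possible way to do so; the helper `applyPostedChanges` is hypothetical, and a plain PHP object stands in for the contact SDO so that the code runs without the SDO extension. Only fields named in the whitelist are ever copied.

```php
<?php
// Hypothetical helper: copies only posted values that differ from the current ones,
// and returns the names of the fields it changed.
function applyPostedChanges(object $target, array $posted, array $fields): array
{
    $changed = [];
    foreach ($fields as $field) {
        if (!array_key_exists($field, $posted)) {
            continue; // field was not on the form
        }
        // Same loose comparison as the listing above.
        if ($target->$field != $posted[$field]) {
            $target->$field = $posted[$field];
            $changed[] = $field;
        }
    }
    return $changed;
}

// A plain object stands in for the contact data object in this sketch.
$contact = (object) ['shortname' => 'Charlie', 'fullname' => 'Charles Babbage'];
$changed = applyPostedChanges(
    $contact,
    ['shortname' => 'Charlie', 'fullname' => 'Charles Babbage FRS'],
    ['fullname'] // whitelist of editable fields
);
```

In confirm.php the second argument would be $_POST, and unchanged fields would never be assigned, so they would not appear in the change summary.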

3. Create a Relational DAS: The next step is to create the Relational DAS used to write the updates to the database. This code is identical to that used in retrieval and, as mentioned, is best placed in a separate script ("contact_model.inc.php", for example).


// initialize the Relational Data Access Service
require_once('contact_model.inc.php');

4. Write the changes back to the database: The next step is to apply the changes back to the database. The call below shows how this is done for the Relational DAS. There is no need to specify a SQL statement for updates because the Relational DAS derives this from the model and the contact data object's change summary.


// apply the changes back to the database
$das->applyChanges($dbh, $contact);

Collision detection schemes

There are two common schemes employed for detecting conflicts:

  1. Add a version column (might be based on a timestamp) to each table, updated each time a row is modified. Comparing versions tells us if there is a conflict.
  2. Record all the original values and compare with the current ones to see if any have been modified.

The Relational DAS implements the second scheme since this does not require the database to be modified in order to use it.
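For contrast, the first scheme (a version column) might look like the following sketch. It assumes the pdo_sqlite driver and an in-memory database; the `version` column and the helper `updateIfVersionMatches` are hypothetical names introduced for this illustration and are not something the Relational DAS uses.

```php
<?php
// Assumes pdo_sqlite; the "version" column is the schema change that scheme 1 requires.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec("CREATE TABLE contact (
    shortname TEXT PRIMARY KEY,
    fullname  TEXT,
    version   INTEGER NOT NULL DEFAULT 1
)");
$dbh->exec("INSERT INTO contact (shortname, fullname) VALUES ('Charlie', 'Charles Babbage')");

// Hypothetical helper: succeeds only if the row still carries the version we read.
function updateIfVersionMatches(PDO $dbh, string $shortname, string $fullname, int $readVersion): bool
{
    $stmt = $dbh->prepare(
        'UPDATE contact SET fullname = ?, version = version + 1
          WHERE shortname = ? AND version = ?'
    );
    $stmt->bindValue(1, $fullname);
    $stmt->bindValue(2, $shortname);
    $stmt->bindValue(3, $readVersion, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->rowCount() === 1; // zero rows means someone else updated first
}

$first  = updateIfVersionMatches($dbh, 'Charlie', 'C. Babbage', 1);   // row moves to version 2
$second = updateIfVersionMatches($dbh, 'Charlie', 'Chas Babbage', 1); // stale version, no match
```

The second call matches no row because the version it read is out of date, which is exactly the conflict signal this scheme relies on.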

The applyChanges() call is deceptively simple. Under the covers, it is:

  • Ordering SDO updates to ensure the correct results (for example, creates before updates).
  • Generating SQL INSERT, UPDATE, and DELETE statements to apply the changes. The UPDATE and DELETE statements are qualified with the original values of the data so that should the data have changed in the database in the meantime this will be detected.
  • Executing the SQL statements -- If any of the SQL statements fails to execute, this is an indication that a collision has occurred and the Relational DAS rolls back all changes and throws an exception. If all statements succeed, all the changes are committed to the database. The client application can then continue to work with the data object, make more changes, and apply them, or can discard it.
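The idea of qualifying an UPDATE with the original values can be sketched directly with PDO. This is an illustration of the technique only, not the Relational DAS source; it assumes the pdo_sqlite driver, and the helper `updateFullname` is a hypothetical name.

```php
<?php
// Assumes pdo_sqlite; illustrates the idea only, not the Relational DAS implementation.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec("CREATE TABLE contact (shortname TEXT PRIMARY KEY, fullname TEXT)");
$dbh->exec("INSERT INTO contact VALUES ('Charlie', 'Charles Babbage')");

// Hypothetical helper: the WHERE clause repeats the value originally read, so the
// statement matches no row if the data was changed in the database in the meantime.
function updateFullname(PDO $dbh, string $shortname, string $original, string $new): void
{
    $dbh->beginTransaction();
    $stmt = $dbh->prepare(
        'UPDATE contact SET fullname = ? WHERE shortname = ? AND fullname = ?'
    );
    $stmt->execute([$new, $shortname, $original]);
    if ($stmt->rowCount() !== 1) {
        $dbh->rollBack();
        throw new RuntimeException('Collision detected: the row changed since it was read');
    }
    $dbh->commit();
}

// Succeeds: the stored fullname still equals the original value we read.
updateFullname($dbh, 'Charlie', 'Charles Babbage', 'Charles Babbage FRS');
```

A later call that still passes 'Charles Babbage' as the original value would match no row, roll back, and throw, which mirrors the roll-back-and-exception behavior described above.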

5. Inform the user of the outcome: The final task is to notify the user of the outcome. If no collisions are detected by the Relational DAS, and the update is successful, all is well.


Figure 7. The confirmation page
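One way to turn the outcome into a message for the user is sketched below. The helper `describeOutcome` and the message wording are hypothetical, and catching the base Exception class is an assumption of this sketch; the article states only that the Relational DAS throws an exception when a collision is detected.

```php
<?php
// Hypothetical helper: $apply wraps whatever call writes the changes back.
function describeOutcome(callable $apply): string
{
    try {
        $apply();
        return 'Your changes have been saved.';
    } catch (Exception $e) {
        // Assumption: any exception from the write-back is reported as a failed update.
        return 'Your changes could not be applied. Please reload the contact and try again.';
    }
}

$message = describeOutcome(function () {
    // In confirm.php this would be: $das->applyChanges($dbh, $contact);
});
echo htmlspecialchars($message);
```

Keeping the write-back inside a callable lets the same reporting code serve any DAS call that signals failure by throwing.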

More on SDO modification

The previous section briefly touched on accessing and setting an SDO property (see the code in Step 2, "Update the contact"). In addition to setting primitive properties, most SDO applications also require the creation of child data objects and the deletion of parts of the data structure. The code snippets below give a quick overview of the other types of modification one might wish to perform on the contact SDO.

The techniques described for getting individual properties are also available for setting:


Listing 11. Setting individual properties


// set the fullname via the object property
$contact->fullname = 'Alan Turing';

// set the fullname via the property index
$contact[1] = 'Alan Turing';

// set the fullname via the property name (XPath)
$contact['fullname'] = 'Alan Turing';

We can create child data objects. For example, the edit user interface could allow adding a new address to a contact. When new address details were posted, we might perform the following:


Listing 12. Create child data objects


// create a child address data object
$address = $contact->createDataObject('address');

// set the address's addressline1 property from the posted value
$address->addressline1 = $_POST['addressline1'];

Note: This child data object is automatically inserted into the graph and $address is simply a reference to that position in the graph. So for example, if this were the first address added, the following would both set the ZIP code on the contact's address.


Listing 13. Set ZIP code


// set the address's zip
$address->zip = 'XY11 2ZZ';

// set the address's zip via the contact data object
$contact->address[0]->zip = 'XY11 2ZZ';


We can test and unset individual instance properties of the contact. For example, if the user cleared the string in the interface, we could use this to signify unsetting:


Listing 14. Test and unset individual instance properties


// if the fullname value was cleared on the edit page and the fullname
// was previously set then unset the fullname property
if (empty($_POST['fullname']) && isset($contact->fullname)) {
    unset($contact->fullname);
}

Finally, we probably want to enable the deletion of a contact, or of a contact's address, in the edit page. This is again implemented using unset:


Listing 15. Enable deletion of a contact's address


// test and unset the first address
if (isset($contact->address[0])) {
    unset($contact->address[0]);
}





Summary

SDOs add some interesting capabilities for working with data in PHP, whilst maintaining the simple, easy-to-use interfaces PHP developers expect. SDOs can represent complex data structures from heterogeneous data sources, whilst allowing their manipulation through a single API similar to that of SimpleXML and PDO. Optimistic concurrency support is built into SDO, allowing disconnected data manipulation without requiring the application to implement change tracking and conflict detection.

This article has given a taste of some of the capabilities of SDO, but there are many that have not been covered. We expect subsequent articles to cover these topics, including:

  • Different classes (or types) of data objects: sequenced, open, abstract
  • Relational DAS details
  • XML DAS details
  • Implementing a DAS using the SDO service provider interface




Resources

Learn
  • The SDO for PHP implementation is delivered as a PECL extension, and can be downloaded from the SDO project page.

  • The SDO for PHP documentation can be found in the "Function Reference" section of the latest builds of the PHP documentation.

  • IBM and BEA are collaborating on specifications for programming models and APIs for Java 2 Enterprise Edition (J2EE) application servers that provide programmers with simpler and more powerful ways of building portable server applications. Read about it in "Service Data Objects, WorkManager, and Timers" (developerWorks, June 2005).

  • "Connecting PHP to Apache Derby" shows you how to install and configure PHP on Windows®.

  • Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products.


Get products and technologies
  • Innovate your next open source development project with IBM trial software, available for download or on DVD.
