Work with XML Data in the Zend Framework


by Salvador Ledezma
IBM Corporation


One of the incubator projects in the Zend Framework is Zend_Db_Xml. Zend_Db_Xml, also known as the XML Content Store (XCS), allows web applications that use XML data to easily update, save, and otherwise manage this data. In this article I will introduce the XCS persistence API and discuss an implementation using IBM’s DB2 9 database with its pureXML technology. Finally, I’ll discuss a sample social networking application to show how easy and fun it is to develop XML-centric applications using the XCS.

XML Content Store (XCS)

One of the major advantages of using an application development framework, such as the Zend Framework (ZF), is that it provides an abstraction to the database layer for data-driven Web sites. Data abstraction is helpful because it allows the developer to concentrate on the behavior of data rather than the often tedious details of database access and manipulation. We have seen this with the Zend_Db package in the ZF, with such objects as Zend_Db_Adapter, Zend_Db_Table, and Zend_Db_Select. With the ever-increasing proliferation of XML data over the Web, for better or for worse, there is also a need to abstract the mechanics of persisting XML data, including Create, Read, Update, and Delete (CRUD) operations.

The XCS is an incubator project in the ZF and provides both a persistence data access layer and an API for managing XML data easily. As an introduction to the XCS, I’ll provide an architectural overview by first describing the components that make up the XCS and by explaining how they work together. Then we will put it all in action by using the XCS to create a small social networking application built on top of the ZF.

XMLContent (Zend_Db_Xml_XmlContent)

As developers, we may encounter XML data in many forms, such as Web Service messages, RSS/Atom feeds, or configuration files. Once you determine that your application must “talk” XML and that the XML needs to be saved somewhere, you can assume several things about this data.

  • First, we need a way to uniquely identify it, so that once it is saved, it can easily be found and retrieved programmatically. The unique name can be a numeric id or a user-provided name.
  • The second assumption is that the XML data will be stored as is. It will not be modified or changed in any way. No additional header or metadata elements or attributes will be added and certainly nothing will be removed. Any modifications that are required to the actual data elements will be done by the application outside the XCS. Internally, the XML data is stored as a DOM document, but an application is free to access the data as a file stream, a string, or several other convenient access methods which may or may not be implementation-dependent.
  • Third, if metadata is needed, the capability will be provided to add it, but it will be saved separately from the XML data. For example, if the XML data is a blog entry, perhaps the application would care to know the date and title of the entry, or the hostname where the entry originated. The metadata is saved in an “about” property and is also XML.
  • Finally, XML data is often accompanied by binary data, such as .jpeg, .pdf, or .doc files. An “attachment” property will associate this binary data with the XML data. In the current implementation of Zend_Db_Xml_XmlContent, the attachment property holds at most one item, though a future version could allow any number of items.

The XML data and its properties (id, about, and attachment) are encapsulated in an object called Zend_Db_Xml_XmlContent. Zend_Db_Xml_XmlContent objects are the fundamental components of the XCS as they are the XCS representation of XML data. As we will see next, the Zend_Db_Xml_XmlContentStore component needs to know about the persistence technology (for example, a relational database) used and how to access it, but Zend_Db_Xml_XmlContent objects need not know anything about it.
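To make the four properties concrete, here is a minimal sketch of a value object with the same shape. The class name XmlContentSketch, its constructor, and the sample element names are assumptions made for illustration; this is not the Zend_Db_Xml_XmlContent API, which lives in the incubator and may differ.

```php
<?php
// Hypothetical stand-in for illustration only; not the real Zend_Db_Xml_XmlContent.
class XmlContentSketch
{
    public $id;          // unique identifier (numeric id or user-provided name)
    public $data;        // the XML payload, held as a DOMDocument and stored as is
    public $about;       // optional XML metadata, kept separate from the payload
    public $attachment;  // optional binary data (at most one item in this sketch)

    public function __construct($id, DOMDocument $data, ?DOMDocument $about = null, $attachment = null)
    {
        $this->id = $id;
        $this->data = $data;
        $this->about = $about;
        $this->attachment = $attachment;
    }
}

$data = new DOMDocument();
$data->loadXML('<entry><title>Hello</title><body>First post</body></entry>');

$about = new DOMDocument();
$about->loadXML('<about><date>2006-11-01</date><host>example.org</host></about>');

$content = new XmlContentSketch(1, $data, $about);

// The payload is untouched: the metadata lives only in $about.
echo $content->data->saveXML($content->data->documentElement), "\n";
echo $content->about->getElementsByTagName('host')->item(0)->textContent, "\n";
```

Keeping the metadata in its own document means the payload can be handed back to the application exactly as it arrived.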

XMLContentStore (Zend_Db_Xml_XmlContentStore)

Zend_Db_Xml_XmlContentStore is an abstract class that represents a repository of XML documents. It is responsible for updating the data source based on changes made to a Zend_Db_Xml_XmlContent object in the repository as well as retrieving Zend_Db_Xml_XmlContent objects from the data source based on search or id criteria. A data source is defined very generally as the persistence layer where the XML data is stored. It can be a relational database, an XML database, or a file system, and it stores the XML data in its own format. When a Zend_Db_Xml_XmlContentStore object is instantiated, it receives a “connection handle” which describes in a meaningful way what the data source is.

In the ZF, it becomes very convenient to allow the connection handle to be a Zend_Db_Adapter object, which allows the persistence layer to be a relational database. Then, a call to the insert() method on a Zend_Db_Xml_XmlContentStore object will allow the underlying Zend_Db_Adapter object to build an appropriate SQL insert statement based on the structure of the underlying tables used and the contents of the Zend_Db_Xml_XmlContent object. It will connect to the database and execute the statement. Other CRUD methods work in a similar fashion and include update(), delete(), deleteById(), and selectAll().

Zend_Db_Xml_XmlContentStore also contains a simple search facility that retrieves Zend_Db_Xml_XmlContent by its id or by searching within the XML data or the “about” metadata in Zend_Db_Xml_XmlContent. The search on XML data is done using XPath expressions. These methods are find() and findById(). There is also a method, executeXPathPredicateQuery(), that does simple XPath searches on the data.
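The repository idea can be sketched without any database at all. The class below is a hypothetical in-memory stand-in: the names insert(), findById(), and deleteById() echo the methods mentioned above, but their signatures here are assumptions, and findByXPath() is an invented helper that uses PHP's DOMXPath to show what an XPath search over stored documents might look like.

```php
<?php
// Hypothetical in-memory stand-in; the real store persists to a data source.
class InMemoryStoreSketch
{
    private $docs = array();  // id => DOMDocument

    public function insert($id, DOMDocument $doc)
    {
        $this->docs[$id] = $doc;
    }

    public function findById($id)
    {
        return isset($this->docs[$id]) ? $this->docs[$id] : null;
    }

    public function deleteById($id)
    {
        unset($this->docs[$id]);
    }

    // Return the ids of all documents in which the XPath expression matches.
    public function findByXPath($expression)
    {
        $ids = array();
        foreach ($this->docs as $id => $doc) {
            $xpath = new DOMXPath($doc);
            $nodes = $xpath->query($expression);
            if ($nodes !== false && $nodes->length > 0) {
                $ids[] = $id;
            }
        }
        return $ids;
    }
}

// Small helper to turn an XML string into a DOMDocument.
function xmlDoc($xml)
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    return $doc;
}

$store = new InMemoryStoreSketch();
$store->insert(1, xmlDoc('<item><title>Zend news</title><category>php</category></item>'));
$store->insert(2, xmlDoc('<item><title>Gardening</title><category>home</category></item>'));
$store->insert(3, xmlDoc('<item><title>DOM tips</title><category>php</category></item>'));

$phpItems = $store->findByXPath('/item[category="php"]');
echo implode(', ', $phpItems), "\n";
```

A database-backed store would push the XPath evaluation down to the data source rather than looping in PHP, but the calling code would look much the same.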

An Implementation using the DB2 Database

DB2 Express C V9 provides innovative pureXML technology to store and manage XML data as a native data structure. This makes it very easy to store and retrieve XML data without having to map or “shred” the XML data into relational columns. The class Zend_Db_Xml_XmlContentStore_Db2 is implemented using the Zend_Db_Adapter for DB2. The adapter uses the ibm_db2 CLI driver. You can get a copy of DB2 Express C V9 and the ibm_db2 driver by installing the Zend Core for IBM product. Please see the resources section at the end of this article to learn where to download Zend Core for IBM.

Because DB2 V9 supports a native XML data type, one Zend_Db_Xml_XmlContentStore_Db2 object maps to one table with four columns. These columns are:

  • id, a unique integer used as the primary key of the table
  • data, defined as an XML column
  • about, also defined as an XML column
  • attachment, defined as a BLOB column

Using DB2, each row in the table represents one Zend_Db_Xml_XmlContent object.

The following classes are helper classes that allow for easier processing of Zend_Db_Xml_XmlContent objects.

It is possible that a search returns one Zend_Db_Xml_XmlContent or a set of Zend_Db_Xml_XmlContent objects. In the case where a set is returned, the Zend_Db_Xml_XmlIterator class is used to easily iterate over the set of XML documents that meet the search criteria. Zend_Db_Xml_XmlIterator implements the Iterator interface so it knows several essential things about the set of Zend_Db_Xml_XmlContent objects over which it is iterating. These include its current location in the set, how to retrieve the next object in the set, how to go back to the beginning of the set, and when it has reached the last item in the set. This allows the developer to assign behavior on the XML data at each iteration, using a foreach construct for example, without having to worry about the details of loop control.
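The four things an iterator needs to know map directly onto the methods of PHP's built-in Iterator interface. The class below is a hypothetical sketch over a plain array of DOM documents, not the actual Zend_Db_Xml_XmlIterator; the class name and sample documents are assumptions.

```php
<?php
// Hypothetical stand-in for illustration; not the real Zend_Db_Xml_XmlIterator.
class XmlSetIteratorSketch implements Iterator
{
    private $docs;
    private $position = 0;

    public function __construct(array $docs)
    {
        $this->docs = array_values($docs);
    }

    // Current location in the set.
    #[\ReturnTypeWillChange]
    public function current() { return $this->docs[$this->position]; }

    #[\ReturnTypeWillChange]
    public function key() { return $this->position; }

    // How to move to the next object.
    public function next(): void { $this->position++; }

    // How to go back to the beginning.
    public function rewind(): void { $this->position = 0; }

    // Whether the end of the set has been reached.
    public function valid(): bool { return isset($this->docs[$this->position]); }
}

$docs = array();
foreach (array('First', 'Second', 'Third') as $title) {
    $doc = new DOMDocument();
    $doc->loadXML('<entry><title>' . $title . '</title></entry>');
    $docs[] = $doc;
}

// foreach drives rewind(), valid(), current(), key(), and next() for us.
$titles = array();
foreach (new XmlSetIteratorSketch($docs) as $position => $doc) {
    $titles[$position] = $doc->getElementsByTagName('title')->item(0)->textContent;
}
echo implode(', ', $titles), "\n";
```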

Zend_Db_Xml_XmlUtil is a utility class that provides static convenience methods for passing XML data back and forth from the application to Zend_Db_Xml_XmlContent, either for the raw XML data or for the “about” metadata. Though the XML is stored internally as a DOM, Zend_Db_Xml_XmlUtil allows an application to use strings, file streams, SimpleXML, or any other implementation-specific object representation of XML data. Convenience methods for converting back and forth between these different types of representations and DOM are provided.
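The conversions themselves rest on functions that ship with PHP. The sketch below uses only those standard functions (DOMDocument::loadXML(), simplexml_import_dom(), dom_import_simplexml(), and DOMDocument::saveXML()) rather than the Zend_Db_Xml_XmlUtil methods, whose exact names are not shown here; the sample document is an assumption.

```php
<?php
$xml = '<profile><name>George Smith</name></profile>';

// String -> DOM
$dom = new DOMDocument();
$dom->loadXML($xml);

// DOM -> SimpleXML (both views share the same underlying tree)
$sx = simplexml_import_dom($dom);
$sx->addChild('city', 'San Jose');

// SimpleXML -> DOM
$node = dom_import_simplexml($sx);
$domAgain = $node->ownerDocument;

// DOM -> string (passing the root element omits the XML declaration)
$out = $domAgain->saveXML($domAgain->documentElement);
echo $out, "\n";
```

Because the SimpleXML object and the DOM document wrap the same tree, an edit made through one view is visible through the other.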

Some XCS Applications

So now you have seen the components that make up the XCS: Zend_Db_Xml_XmlContent, Zend_Db_Xml_XmlContentStore, Zend_Db_Xml_XmlIterator, and Zend_Db_Xml_XmlUtil. But what types of applications would use the XCS?

There are several types of applications that will benefit from the XCS architecture. Here are a few examples:

  • RSS/Atom Feed Aggregator
    An RSS/Atom Feed Aggregator application is well-suited for the XCS. A typical use case is that a user can input different feed URLs and the application can periodically go out and retrieve feed updates and store them in the XCS. These feeds can be displayed, searched, and possibly be published as a “feed of feeds” as well.
  • Content Management Systems
    A content management system (CMS) is a computer software system for organizing and facilitating collaborative creation of documents and other content. Storing content as XML in the XCS allows easy retrieval and search. “about” metadata can be used for workflow management and processing. Though the data is stored as XML, a CMS application can export documents as needed in .pdf, .doc, HTML, etc. by transforming the XML into the required format. Storing the data as XML will be transparent to the user of the application.
  • Web Services and Mashups
    Web Services send messages using XML data. Many applications can benefit by storing data to be served in the XCS. As a web service request comes in, the result can be easily retrieved or even composed of several Zend_Db_Xml_XmlContent objects. Similarly, a mashup is a website or web application that seamlessly combines content from more than one source into an integrated experience. The content used in mashups typically comes from a third party via a public interface or API. Most of these APIs return data as XML, and the data is often refreshed using AJAX, which frequently returns XML data as well. The XCS can be used to maintain and manage this data. By querying and joining the data, interesting scenarios can be created dynamically, in essence creating a “mashup of mashups”.

I have only listed a few applications of the XCS, but essentially, any application that requires the storage, processing, and interchange of XML data can be easily implemented using the XCS. In the world of Web 2.0, this may mean all applications!

A Sample Application: my.Net.wrk

Enough of the theory behind the XCS. Let’s see some code!

Sites such as LinkedIn and Friendster allow people to interact with each other and to form social networks. We will build a simple social networking site using the ZF and the XCS. It will not contain the full functionality and features that the other sites contain, but it will contain the same basic functionality:

  • Create and maintain a user profile for social/professional networking
  • Search for people within networks
  • Make contacts with people and collaborate with them

Application Overview and Architecture

Users will be able to register and log into the my.Net.wrk application. User login information will be stored in a member table in the database. This data will be updated and searched using the standard Zend_Db_Adapter.

By registering, a user is also creating a profile with their interests and experience. A user’s profile will be stored as XML data in the XCS. A typical profile for someone hypothetically named George Smith might look like this:

If you recall, a Zend_Db_Xml_XmlContent object has an “about” property. This is a convenient place to store a user’s contacts and relationships. We don’t have to store all the contact information. Since the Zend_Db_Xml_XmlContent object also has an id property, knowing a contact’s id is enough to pull the information when we need it. George’s contact list might look like this:

The id gives us an easy way to look up information on George’s contacts. Also notice how one person can have multiple relationships with someone.

This is the basic data model. The XCS and the Zend Framework will help put it all together.
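As a rough picture of this data model, the sketch below parses one hypothetical profile document and one hypothetical contact list with SimpleXML. Apart from <entry> and <relationship>, every element and attribute name is an assumption made for illustration and may not match the application's actual documents.

```php
<?php
// Element names below are assumptions, apart from <entry> and <relationship>.
$profileXml = <<<XML
<profile>
  <name><first>George</first><last>Smith</last></name>
  <interests>
    <interest>hiking</interest>
    <interest>PHP</interest>
  </interests>
  <experience>Web developer</experience>
</profile>
XML;

// One contact can carry several relationships; only the contact's id is stored.
$contactsXml = <<<XML
<contacts>
  <entry id="7">
    <relationship>friend</relationship>
    <relationship>coworker</relationship>
  </entry>
  <entry id="12">
    <relationship>classmate</relationship>
  </entry>
</contacts>
XML;

$profile  = simplexml_load_string($profileXml);
$contacts = simplexml_load_string($contactsXml);

echo $profile->name->first, ' ', $profile->name->last, "\n";
foreach ($contacts->entry as $entry) {
    $relationships = array();
    foreach ($entry->relationship as $r) {
        $relationships[] = (string) $r;
    }
    echo 'contact ', $entry['id'], ': ', implode(', ', $relationships), "\n";
}
```

In the application, the first document would live in the “data” property and the second in the “about” property of the same Zend_Db_Xml_XmlContent object.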

Plugging into the Zend Framework

The application is built on top of the ZF. I will assume that you are already familiar with the basic MVC setup of applications in the framework so I will only describe the important pieces of the application as they pertain to the XCS. I will also assume that you are using the Apache httpd server. Finally, I won’t talk too much about the views. I will just say that the views grab and output data used by the application. For more information about the ZF, please see the resources at the end of this article.

To use the XCS functionality, you will need the following directory from version 0.1.4 or later of the ZF: /incubator/library/Zend/Db/Xml. Copy this directory to lib/Zend/Db/Xml of your actual framework installation under the Apache htdocs directory. The general framework set up on my machine looks like this under htdocs/zframework:

In the enclosed zip file that you can download, you will find the following files that plug right into the framework:


Name                 Type              Description
index.php            Bootstrap         XCS is initialized here
style1.css           CSS               Style sheet for HTML presentation
IndexController.php  Controller        Main controller
AddController.php    Controller        Creating a profile and adding contacts
ViewController.php   Controller        Main logic for navigating the site
index.php            View              Initial view
member.php           View              Member profile data
searchResult.php     View              Displays search results
view.php             View              Used to get new user profile data
thanks.php           View              Acknowledgement of successful user action
error.php            View              Displays error messages
Database.php         Encapsulates XCS  XCS and database functions

Setting up the Database

The database for the my.Net.wrk application is simple. We will store login information in a table called db2admin.member. Remember that profiles and contact information will be stored in the XCS as XML data.

We will use the Zend_Db_Xml_XmlContent “data” property to track the user profile and the “about” property to track a user’s set of contacts.

There are only a couple of DB2 administration commands needed. These may be executed on the DB2 command line:

That’s it, we are done with the database! Notice that we didn’t create the XCS. When the Zend_Db_Xml_XmlContentStore_Db2 object is instantiated, the underlying DB2 table is created automatically (that is, if it didn’t exist already – we wouldn’t want to delete existing data)! The XCS checks for this.

Let’s look a little closer at how the Zend_Db_Xml_XmlContentStore_Db2 object is created in the bootstrap index.php file. The factory() method for the DB2 Zend_Db_Adapter is called and its result is passed into the constructor for the Database object. The Database object is then placed in the ZF registry since we will use the database in various places throughout the application. The Database class encapsulates the XCS.

The Database Class

Below are the first few lines of the Database class, including the constructor:

The constructor instantiates the XCS with a DB2 Zend_Db_Adapter and assigns it to the $_db1 instance variable. We also have one more database table that we will need to access so we will keep the Zend_Db_Adapter available by assigning it to the $_db instance variable.

The first time an operation occurs on $_db1, it checks to make sure the underlying table exists. If the name of the table is not passed in the constructor, it assumes the table is called “xmldata”. If the table does not exist it creates it automatically along with some indexes on the XML data to improve performance for searches.

All database operations on the XCS and on the MEMBER table are encapsulated in the Database class.

Figure 1 is a screenshot of the main page:

XCS: A Simple Example

Let’s see a simple example of how to use the XCS. When a new user joins, he/she creates a profile. The AddController::memberAction() method takes care of grabbing all the form data and creating a Zend_Db_Xml_XmlContent object with the XML data.

Figure 2 is the registration screen:

After performing some basic validation steps on the data, it creates the XML document using the PHP DOMDocument object:
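As a rough illustration of this step, the following sketch builds a small profile document with DOMDocument. The function name buildProfileDocument() and the field names are assumptions, not the application's actual code.

```php
<?php
// Build a profile document from already validated form fields.
// The field and element names are assumptions for illustration.
function buildProfileDocument(array $fields)
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $root = $doc->appendChild($doc->createElement('profile'));

    foreach ($fields as $name => $value) {
        $element = $doc->createElement($name);
        // createTextNode escapes characters such as & and < on output.
        $element->appendChild($doc->createTextNode($value));
        $root->appendChild($element);
    }
    return $doc;
}

$doc = buildProfileDocument(array(
    'firstname' => 'George',
    'lastname'  => 'Smith',
    'interests' => 'hiking & PHP',
));
$profileXml = $doc->saveXML($doc->documentElement);
echo $profileXml, "\n";
```

Adding user input through text nodes, rather than concatenating strings into markup, keeps the document well-formed whatever the user types.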

You probably noticed that once the XML document is created, it is sent to the Database class with a call to the saveNew() method. It looks like this:

$entry is a Zend_Db_Xml_XmlContent object, and to save it, we simply call the insert() method on the $_db1 XCS instance variable. The point is that the developer does not have to worry about the SQL or other implementation details of data persistence and can spend most of his/her time on other aspects of the application, such as implementing business rules or making a really nice, interactive GUI, perhaps using AJAX.

Figure 3 is a screen shot of a new user profile (notice that this user does not have any contacts set up as of yet):

Figure 3: User Profile

XCS: A More Interesting Example

Now, let’s look at a slightly more interesting and complex example. What happens when a member wants to add a contact?

This happens in AddController::contactAction(). Again, after performing some validation and making sure the user is actually logged in, we add the contact:

We let the Database class do the work of adding the contact by calling addContact(). It returns a list of contacts, including the one we just added, so that the view can display the latest list of contacts and relationships. Also, notice that all this happens only if the id you are adding is different from your own. You wouldn’t want to add yourself as a contact for yourself!

The fun code is in the Database class addContact() method:

We get the user information by searching by the user’s unique id. Recall that the contact information is stored in the ‘about’ property as a DOM. Sometimes working with DOM can be cumbersome, so let’s use a utility method to extract it as a SimpleXML object.

The code checks to see if the person is already a contact. If so, and the new relationship is different than any existing ones, then we add the new <relationship>. If the person is not yet a contact, we add a new <entry> element with the new relationship information. Finally, import the SimpleXML back into a DOM and save the update by calling the update() method.
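The branching described above can be sketched with SimpleXML alone. The function addContactSketch() below is a hypothetical illustration, not the Database class's addContact() method: it leaves out the lookup and update() calls, and it assumes a <contacts> root element and an id attribute on each <entry>.

```php
<?php
// Add a relationship to a contact list held as a DOM document.
// Returns true when the document changed. Names are assumptions for illustration.
function addContactSketch(DOMDocument $about, $contactId, $relationship)
{
    // SimpleXML view over the same tree, so edits show up in $about.
    $contacts = simplexml_import_dom($about);

    foreach ($contacts->entry as $entry) {
        if ((string) $entry['id'] === (string) $contactId) {
            foreach ($entry->relationship as $existing) {
                if ((string) $existing === $relationship) {
                    return false;  // already recorded, nothing to do
                }
            }
            // Existing contact, new relationship.
            $entry->addChild('relationship', $relationship);
            return true;
        }
    }

    // Not yet a contact: add a new <entry> with the relationship.
    $entry = $contacts->addChild('entry');
    $entry->addAttribute('id', (string) $contactId);
    $entry->addChild('relationship', $relationship);
    return true;
}

$about = new DOMDocument();
$about->loadXML('<contacts><entry id="7"><relationship>friend</relationship></entry></contacts>');

$first  = addContactSketch($about, 7, 'coworker');   // new relationship on an existing contact
$second = addContactSketch($about, 7, 'friend');     // duplicate, ignored
$third  = addContactSketch($about, 12, 'classmate'); // new contact

$result = $about->saveXML($about->documentElement);
echo $result, "\n";
```

In the application, the modified document would then be written back through the store's update() method.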

The method returns an updated list of contacts so that the view can display it. The processContacts() method is a helper function that extracts the contacts, looks up the names, and returns a nice array for easy manipulation by the view.

Figure 4 is a screenshot of the profile of a user who has added a couple of contacts:

Figure 4: User profile, including contacts

XCS: A Complex Example

As users of the ZF, we are familiar with its design philosophy (I am paraphrasing a bit): Simple enough to easily meet the needs of 80% of use-cases, while powerful and flexible enough that advanced developers can implement the remaining more difficult 20%, if desired.

The XCS was designed with the same philosophy. The XCS API allows developers to essentially forget about mundane CRUD tasks by hiding these tasks in API calls. But sometimes the type of query you would like to issue just cannot be done in a simple API call. When this situation arises, the flexibility of the XCS allows us to take control of the database ourselves and issue an SQL query, an SQL/XML query, or an XQuery directly to the XCS table.

As you may recall, member names and passwords are stored in the MEMBER table, while the member profile is stored in the XCS. So what happens when we would like to view a member’s profile by doing a search on someone’s name? What if we only know a first name or only a last name or only a portion of a name? The solution is a case-insensitive fuzzy search on a join of two tables. Some of the data is in a relational table while some of it is stored as XML in the XCS. Oh my!

Never fear. Let’s get our hands dirty and bypass the XCS API by issuing a SQL/XML query directly to DB2. Here is the search method found in the Database class:

We are selecting some information to display by joining, by id, the “member” table and the XCS table, which is called “xmldata”. To get data from within the XML document, we use the SQL/XML function XMLQuery(). XMLQuery() can be used to execute XQueries, but for our purposes we use it with a simple XPath expression. XMLSerialize() will convert the XML to a string so that we can easily manipulate the data in PHP. Finally, our where clause checks to see what parameter was passed in: last name, first name, or both. And to return as many results as might match our search, we ignore case, use wildcard characters, and use the OR operator.
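A query of this general shape can be assembled as follows. This is a sketch under stated assumptions, not the search method from the Database class: the function name buildMemberSearch(), the column names firstname and lastname, and the XPath inside XMLQUERY are all invented for illustration. Only the table names “member” and “xmldata” and the use of XMLQUERY and XMLSERIALIZE come from the description above.

```php
<?php
// Build a case-insensitive, wildcard name search joining the relational
// member table with the XCS table. Column names and the XPath are assumptions.
function buildMemberSearch($firstName, $lastName)
{
    // Nowdoc: no interpolation, so the XQuery variable $d is left alone.
    $select = <<<'SQL'
SELECT m.id, m.firstname, m.lastname,
       XMLSERIALIZE(
           XMLQUERY('$d/profile/interests' PASSING x.data AS "d")
           AS VARCHAR(1000)) AS interests
  FROM member m
  JOIN xmldata x ON m.id = x.id
SQL;

    $conditions = array();
    $params = array();
    if ($firstName !== '') {
        $conditions[] = 'UPPER(m.firstname) LIKE ?';
        $params[] = '%' . strtoupper($firstName) . '%';
    }
    if ($lastName !== '') {
        $conditions[] = 'UPPER(m.lastname) LIKE ?';
        $params[] = '%' . strtoupper($lastName) . '%';
    }

    $sql = $select;
    if ($conditions) {
        // OR keeps the match as broad as possible.
        $sql .= "\n WHERE " . implode(' OR ', $conditions);
    }
    return array($sql, $params);
}

list($sql, $params) = buildMemberSearch('sal', '');
echo $sql, "\n";
echo implode(', ', $params), "\n";
```

Passing the search terms as bound parameters, rather than splicing them into the SQL string, avoids quoting problems with user input. The resulting pair could then be handed to the adapter's fetch method.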

Since we have crossed the boundary outside the XCS and into SQL, we use the Zend_Db_Adapter, $_db, to fetch the results.

Figure 5 is a screenshot of a search result. The search term was “sal” as a first name and no last name was entered. Two “Sals” and one “Salvador” were returned. By clicking on one of the links, you will be directed to their profile and you will be given the option to add this person to your contacts list. This is shown in Figure 6.

Figure 5: Search Results

Figure 6: Viewing a Member Profile and Making a Contact

We have touched on a series of use cases, from simple to complex, for managing XML data in an application. The remaining pieces of the application are variations of these examples. In the interest of space, I will only list the controller methods and what they do, but please feel free to browse the application source yourself and experiment by using the application. Who knows, maybe you will find ways to improve the implementation or add some cool features.

Also, please note that some of the links in the application, such as “Our Privacy Policy”, “About My.Net.wrk”, and “Customer Service”, were not implemented, but hopefully you get the idea of what kind of view should be rendered by each of those links.

indexAction: Renders the main page where a user can register, login, or search for someone.

newAction: Renders the registration page for someone to register

memberAction: Processes the form data from the registration page to add a new member

contactAction: Adds a new contact relationship for a member


searchAction: Executes a search by calling the appropriate Database class search method. Renders results.

displayAction: Looks up and displays a member profile

_call(): Processes a user login and starts a session for the user. If the login is successful, the user’s profile and contacts are displayed.


So now you’ve seen the XCS, its components and API, and a cool social networking application built on top of the ZF and XCS. There is already a DB2 implementation in the class Zend_Db_Xml_XmlContentStore_Db2 that takes advantage of DB2’s pureXML technology. There may be a need for other implementations using your favorite database engine or other persistence mechanisms, such as a file system or cache. Or alternatively, download DB2 Express C V9 edition or Zend Core for IBM and give the XCS a spin to see if it suits your needs. As the XCS is still in the incubator, there are plenty of opportunities to provide comments and suggestions for improvement.

If you do have any questions, comments, suggestions, or critiques on this paper or the XCS in general, please send them to me at


I would like to thank Stephen Brodsky (IBM Silicon Valley Lab) for his contributions to this paper and overall mentoring and guidance for the XCS project.

I would also like to thank Rakesh Ranjan (also from the IBM Silicon Valley Lab) for his help in development of the Social Networking Application and for his review and comments on this paper.


Sample code from this article

Zend Framework
Zend Core for IBM
DB2 9 Express C
A more extensive discussion of the XCS presented at XTech 2006