Getting Started with MongoDB and PHP

Generation Next

Over the last year or so, there’s been a small revolution taking place in the database world, with the advent of “schema-less” database systems like Apache CouchDB. These databases follow a different approach to data storage as compared to the traditional relational model, and they’re quickly becoming popular with Web developers on account of their flexibility, simplicity and easy integration with modern Web technologies such as JSON.

In this article, I’ll introduce you to MongoDB, one of the new generation of schema-less database systems that is quickly gaining the attention of open source developers. Over the next few pages, I’ll guide you through the process of getting started with MongoDB, showing you how to install it, set up a data store, connect to it and read and write data using PHP. Let’s get started!

Start Me Up

In the words of its official Web site, MongoDB is “a scalable, high-performance, open source, document-oriented database”. It is licensed under the GNU AGPL, and is currently available for a variety of different platforms, including Microsoft Windows, Linux, Mac OS X and Solaris.

Although MongoDB is similar to CouchDB, in that both products offer schema-less data storage, there are some important differences:

  • MongoDB developers must use native language drivers for database access, while CouchDB developers can access a CouchDB database using REST.
  • MongoDB supports a larger number of datatypes than CouchDB.
  • MongoDB scalability is based on sharding, while CouchDB scalability is based on replication.
  • MongoDB queries use BSON objects, while CouchDB queries/views are generated using JavaScript.

The MongoDB documentation explains these (and a few other) differences in greater detail, and also has a nice comparison of MongoDB, CouchDB and MySQL.

With that brief introduction out of the way, it’s time to download and install MongoDB. In most cases, the standard binary package will suffice; this contains the MongoDB server and command-line client, a set of backup and restore tools, and a utility to store/retrieve binary files in a MongoDB database. To begin, download a version suitable for your platform and extract the package contents to a directory on your system, as shown below:

If you’re using Ubuntu, you can also install MongoDB using aptitude. To do this, add the following line to your /etc/apt/sources.list file:

Then, install the package with aptitude by executing the following commands:

Once the package has been installed, start up the MongoDB server, as shown below:

Note that, by default, the MongoDB server assumes the data storage directory to be /data/db, and will exit with an exception if this directory does not exist. Passing the --dbpath option on the command line specifies an alternative location for the data storage directory.

You can now use the command-line client to access the MongoDB database server, as below:

Here’s what you should see:

You can now issue commands to the server, just as you would with the MySQL client. Here are some examples, which list the server version and available databases:

If it’s all working as advertised, let’s get started with collections!

Collecting Ideas

In the MongoDB universe, the equivalent of a relational table is a “collection”. Just as a table can have multiple records, so too can a collection have multiple “documents”. These documents are represented as JSON objects, with fields and values represented as key-value pairs, and serialized to BSON (Binary JSON) for storage. Here’s an example of one such document:

Since MongoDB’s basic unit is a document expressed in JSON, and JSON supports hierarchical data, you can easily nest, or embed, one document in another. Further, since documents are serialized to BSON for storage, MongoDB can just as easily search nested documents. Here’s an example:

To create a new MongoDB collection, fire up the command-line client and try running the following command, which creates a collection named “items” and adds some documents to it:

Go ahead and add a few more documents as shown above. Then, to display a complete list of the documents in this collection, use the find() method without any arguments, as below:

Notice the special ‘_id’ key that appears with each document. When you save a new document to a collection, MongoDB automatically attaches a unique identifier to it. This identifier can be used to retrieve or modify a specific document, similar to an auto-incrementing ID in a relational database.

To display a list of documents matching specific criteria, add those criteria to the find() method, again as a JSON object. Here’s an example, which displays items with quantities greater than 9 and price less than 1:

Now, let’s try doing the same thing with PHP!

Hooking Things Up

MongoDB support in PHP comes through the MongoDB PHP extension, which provides a full-fledged API for accessing MongoDB collections. This extension is currently maintained by Kristina Chodorow and is freely available from PECL under the Apache License. The extension is currently stable and allows you to perform most of the common tasks related to accessing and using a MongoDB database from within a PHP application.

To get started with the MongoDB PHP extension (v1.0.7), install it using the automated PECL installer, as below:

Alternatively, you can download the source code archive and compile it into a loadable PHP module, as below:

At this point, you should have a loadable PHP module in your PHP modules directory. Enable this extension in the php.ini configuration file, restart the Web server, and check that the extension is active with a quick call to phpinfo():

Now, take the PHP extension out for a quick spin, by using it to query the MongoDB database and retrieve the ‘items’ collection:
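A minimal sketch of such a script follows. It assumes the Mongo class provided by the PECL extension discussed in this article, a server running on localhost, a database named 'test' and documents with 'name', 'quantity' and 'price' keys; the database name, the key names and the two function names are illustrative assumptions, not part of the extension.

```php
<?php
// Sketch only: 'test', the document keys and both function names are assumptions.

// Build a printable summary of every document in a collection.
function describeItems($collection)
{
    $cursor = $collection->find();   // no criteria: match every document
    $lines = array($cursor->count() . ' document(s) found.');
    foreach ($cursor as $doc) {
        $lines[] = 'Name: ' . $doc['name']
            . ', Quantity: ' . $doc['quantity']
            . ', Price: ' . $doc['price'];
    }
    return $lines;
}

// Connect to the server, print the 'items' collection and trap errors.
function printItems($host)
{
    try {
        $conn = new Mongo($host);    // open the connection
        $db = $conn->test;           // same as $conn->selectDB('test')
        $collection = $db->items;    // same as $db->selectCollection('items')
        echo implode("\n", describeItems($collection)), "\n";
        $conn->close();
    } catch (MongoConnectionException $e) {
        echo 'Error connecting to MongoDB server', "\n";
    } catch (MongoException $e) {
        echo 'Error: ', $e->getMessage(), "\n";
    }
}
```

Calling printItems('localhost') runs the whole sequence. The formatting is kept in its own function so that it can be tried against any object offering a compatible find() method.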

This script begins by initializing a new Mongo object, passing the object constructor the information needed to establish a connection to the database server – in this case, the server host name. This Mongo connection object is used for all subsequent communication with the MongoDB server.

The next step is to get a handle to the database. This can be done either by using the Mongo object’s selectDB() method or – more simply – calling the database as a member of the parent Mongo object using magic methods. Once the database object has been retrieved, a collection from the database can be retrieved either by using the MongoDB object’s selectCollection() method or via a magic method call. Collections are represented as MongoCollection objects.

Every MongoCollection object exposes a find() method, which can be used to query the corresponding collection. This method accepts two arrays: an array of query parameters, and an array of fields to be returned in the result set. The return value of the find() method is a MongoCursor object, which represents the result of the query. Since this MongoCursor object implements the Iterator pattern, it’s quite simple to iterate over it using a foreach() loop and process each document of the result set. The MongoCursor object also exposes a count() method that returns the number of documents in the result set.

The Mongo PHP extension is fully compliant with the PHP 5.x exception model, and defines five types of exceptions: MongoConnectionException for connection-related errors; MongoCursorException and MongoCursorTimeoutException for query-related errors; MongoGridFSException for file storage/retrieval errors; and a base MongoException class for all other errors. As the previous example demonstrates, it’s a good idea to wrap your Mongo code in a try-catch block so that you can trap and resolve these different error types in a graceful manner.

Here’s an example of what the output of the script looks like:

Addition And Subtraction

Adding new documents to a collection is also quite simple. Consider the next example, which illustrates:
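The sketch below shows one way this might look. It assumes $collection is a MongoCollection handle for the 'items' collection; the addItem() function name and the 'name', 'quantity' and 'price' keys are illustrative assumptions.

```php
<?php
// Sketch only: addItem() and the document keys are hypothetical.

// Insert one document and return its new identifier as a string.
function addItem($collection, $name, $quantity, $price)
{
    $item = array(
        'name' => $name,
        'quantity' => $quantity,
        'price' => $price,
    );
    // insert() adds the document and fills in $item['_id'].
    $collection->insert($item);
    // Casting the identifier to a string yields its hexadecimal form.
    return (string) $item['_id'];
}
```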

To add a new document to a collection, simply create a new PHP array containing the key-value pairs to be inserted (nested arrays are supported and will be automatically converted to embedded documents) and then pass this array to the MongoCollection object’s insert() method. This method will add the document to the collection and then update the original array with an ‘_id’ key containing the unique document identifier. This identifier is represented by the special MongoId data type, which returns a hexadecimal representation of the identifier. This also means that it is possible to obtain the ID of the newly-added document without needing to perform a second query to the database.

Here’s an example of the output:

Deleting documents is accomplished with the remove() method, which accepts an array of criteria, and removes all documents matching those criteria. Typically, the remove() method returns Boolean true or false; however, passing it the special ‘safe’ option as second argument will instead make it return an array with more information, including the number of documents removed. Here’s an example:
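A small sketch, assuming $collection is a MongoCollection handle and that the array returned in 'safe' mode reports the number of removed documents under the 'n' key; the removeByName() function and the 'name' key are illustrative assumptions.

```php
<?php
// Sketch only: removeByName() and the 'name' key are hypothetical.

// Remove every document with the given name; return how many were removed.
function removeByName($collection, $name)
{
    $result = $collection->remove(
        array('name' => $name),      // criteria
        array('safe' => true)        // ask for a result array instead of a Boolean
    );
    // 'n' is assumed to hold the count of affected documents.
    return isset($result['n']) ? $result['n'] : 0;
}
```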

It’s also possible to remove a document using its ID, so long as the ID is passed to the remove() method as a MongoId object and not a regular PHP string. Here’s an example:
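A short sketch, assuming $collection is a MongoCollection handle and $id is the hexadecimal identifier string of an existing document; the removeById() function name is an illustrative assumption.

```php
<?php
// Sketch only: removeById() is a hypothetical helper.

// Remove the single document whose identifier matches the given string.
function removeById($collection, $id)
{
    // A plain string would not match; wrap it in a MongoId object first.
    return $collection->remove(array('_id' => new MongoId($id)));
}
```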

Finally, updating a document can be accomplished using the save() method, as shown below:
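The following sketch assumes $collection is a MongoCollection handle; the renameItem() function and the 'name' key are illustrative assumptions. It checks the findOne() result before saving, so that a failed lookup does not quietly create a new document.

```php
<?php
// Sketch only: renameItem() and the 'name' key are hypothetical.

// Change the name of the first matching document; return true on success.
function renameItem($collection, $oldName, $newName)
{
    $item = $collection->findOne(array('name' => $oldName));
    if ($item === null) {
        return false;                // nothing matched, so nothing is saved
    }
    $item['name'] = $newName;
    // The array still carries its '_id', so save() updates the document in place.
    $collection->save($item);
    return true;
}
```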

Note that if the call to findOne() does not result in a match, calling save() will result in a new document being added to the database.

Asking Questions

When it comes to querying, MongoDB offers a fair amount of flexibility. You’ve already seen the find() method, which returns a set of documents matching the specified search criteria. There’s also the findOne() method, which returns a single document instead of a set. Multiple search criteria can be used as well – simply add them as elements of the array passed to the find() or findOne() methods, and MongoDB will automatically perform an AND query to return only those documents matching all the specified criteria.

Here’s an example:
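A brief sketch, assuming $collection is a MongoCollection handle; the findByNameAndQuantity() function and the 'name' and 'quantity' keys are illustrative assumptions.

```php
<?php
// Sketch only: the function name and the document keys are hypothetical.

// Return a cursor over documents that match BOTH criteria.
function findByNameAndQuantity($collection, $name, $quantity)
{
    // Each element of the array is one criterion; together they form an AND query.
    return $collection->find(array(
        'name' => $name,
        'quantity' => $quantity,
    ));
}
```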

MongoDB also supports a variety of conditional and logical operators that can be used to create reasonably complex queries. Add in support for regular expressions, and you’ve got a party! Here’s an example, which lists all items with quantities between 10 and 50 and whose names end with ‘es’:
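A sketch of how such a query array could be built, assuming the extension's MongoRegex class and documents with 'name' and 'quantity' keys; the buildRangeQuery() function and the key names are illustrative assumptions, and "between" is treated here as inclusive.

```php
<?php
// Sketch only: buildRangeQuery() and the document keys are hypothetical.

// Build criteria for a quantity range plus a name suffix.
function buildRangeQuery($min, $max, $suffix)
{
    return array(
        // Single quotes stop PHP from treating $gte and $lte as variables.
        'quantity' => array('$gte' => $min, '$lte' => $max),
        // Case-insensitive match on names ending with the given suffix.
        'name' => new MongoRegex('/' . preg_quote($suffix, '/') . '$/i'),
    );
}
```

The resulting array can be passed straight to find(), for example $collection->find(buildRangeQuery(10, 50, 'es')).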

Here’s an example of the output:

You can also limit the number of documents returned by a search, or sort them by a specified key, using the limit() and sort() methods. Here’s an example:
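A sketch, assuming $collection is a MongoCollection handle and documents carry a numeric 'price' key; the cheapestItems() function and the key name are illustrative assumptions.

```php
<?php
// Sketch only: cheapestItems() and the 'price' key are hypothetical.

// Return a cursor over the $howMany lowest-priced documents.
function cheapestItems($collection, $howMany)
{
    $cursor = $collection->find();
    $cursor->sort(array('price' => 1));   // 1 = ascending, -1 = descending
    $cursor->limit($howMany);             // cap the size of the result set
    return $cursor;
}
```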

Incidentally, if you’re curious about how MongoDB is internally performing a query, you can use the MongoCursor object’s explain() method to “look inside” the query processing subsystem, similar to MySQL’s EXPLAIN command. Here’s an example:
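A minimal sketch, assuming $collection is a MongoCollection handle; the explainQuery() function name is an illustrative assumption, and the keys of the returned array depend on the server version.

```php
<?php
// Sketch only: explainQuery() is a hypothetical helper.

// Return the server's description of how it would run the given query.
function explainQuery($collection, $criteria)
{
    $cursor = $collection->find($criteria);
    return $cursor->explain();   // an array describing the query plan
}
```

Passing the result to print_r() is a quick way to inspect it.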

Here’s the output:

Rank And File

In addition to documents, MongoDB also supports storage of binary data. Binary data up to 4 MB in size can be stored as regular documents, and files larger than this can be stored using a little thingummy called GridFS.

GridFS is a specification for splitting up and storing large files in a MongoDB database. Typically, GridFS makes use of two collections: a ‘files’ collection which stores metadata about each file, and a ‘chunks’ collection which stores the actual file content, split into chunks. Each file in the ‘files’ collection has a unique identifier, just like other documents stored in MongoDB; this identifier can be used to retrieve or modify the file.

The MongoDB PECL extension provides a set of MongoGridFS classes, which can be used to interact with files stored using GridFS. Individual files are represented as instances of the MongoGridFSFile class, and each MongoGridFS object exposes methods to add, remove and find these files.

To better understand this, consider the following example, which illustrates the process of adding a binary image file to a MongoDB database:
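A sketch of that process, assuming $db is a MongoDB database object obtained from the connection and $path points to a readable file on disk; the storeImage() function name is an illustrative assumption.

```php
<?php
// Sketch only: storeImage() is a hypothetical helper.

// Save a file from disk into GridFS and return its identifier.
function storeImage($db, $path)
{
    $gridfs = $db->getGridFS();       // handle to the default GridFS collections
    return $gridfs->storeFile($path); // splits the file into chunks and stores it
}
```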

This example first retrieves an instance of the primary MongoGridFS object using the MongoDB object’s getGridFS() method, and then uses the object’s storeFile() method to save a file to the MongoDB database. If you run this example and then look inside the database, you’ll see two new collections, ‘fs.files’ and ‘fs.chunks’. If you dig a little deeper and look inside the ‘fs.files’ collection, you should see the file that was added through the script.

An alternative to the storeFile() method is the storeUpload() method, which is designed specifically for use with PHP file uploads. Using this is also very easy: simply pass the storeUpload() method the name of the input field containing the file, and MongoDB will do the rest. You can also specify an optional filename for the uploaded file as a second argument, if you wish.

Here’s a quick-and-dirty example that demonstrates how to do this:
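The sketch below assumes $db is a MongoDB database object and that the page was reached through an HTML form with a file input; the storeUploadedFile() function name and the field name used in the usage note are illustrative assumptions.

```php
<?php
// Sketch only: storeUploadedFile() is a hypothetical helper.

// Save an uploaded file into GridFS, optionally under a different filename.
function storeUploadedFile($db, $field, $filename = null)
{
    $gridfs = $db->getGridFS();
    if ($filename === null) {
        return $gridfs->storeUpload($field);         // keep the uploaded name
    }
    return $gridfs->storeUpload($field, $filename);  // store under a new name
}
```

With a form field declared as a file input named 'picture', the call would be storeUploadedFile($db, 'picture').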

Once you’ve got the file into the database, how do you get it back out? Just use the document identifier to retrieve the file (MongoGridFS extends MongoCollection so you can use the regular find() and findOne() methods) as a MongoGridFSFile object, and then either save it to disk with the object’s write() method or send it to the output device with the getBytes() method. If you’re sending it to the user’s browser, remember to attach appropriate headers as necessary!

Here’s a quick example:
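A sketch, assuming $db is a MongoDB database object and $id is the hexadecimal identifier string of a stored file; the fetchFileBytes() function name is an illustrative assumption.

```php
<?php
// Sketch only: fetchFileBytes() is a hypothetical helper.

// Return the contents of a stored file, or null if no file has that identifier.
function fetchFileBytes($db, $id)
{
    $gridfs = $db->getGridFS();
    // findOne() on a MongoGridFS object yields a MongoGridFSFile (or null).
    $file = $gridfs->findOne(array('_id' => new MongoId($id)));
    if ($file === null) {
        return null;
    }
    return $file->getBytes();
}
```

To send the result to a browser, emit a suitable header first, for example header('Content-Type: image/jpeg') for a JPEG image, and then echo the returned bytes. The MIME type here is an assumption about the stored file.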

Boys And Their Toys

While the PECL MongoDB extension makes it quite easy to interact directly with a MongoDB database, you might often prefer a slightly higher level of abstraction – say, for example, if you’re trying to integrate MongoDB into an existing framework-based application. In this case, you should consider Morph, a “high-level PHP library for MongoDB”, which is freely available under the GNU GPL.

Morph follows the ActiveRecord pattern, allowing you to create definitions of MongoDB objects by extending the basic Morph_Object class. These Morph_Object instances map directly to MongoDB documents, and expose properties and methods that can be easily manipulated using standard object notation.

To illustrate, assume for a moment that you wish to create a product database of toys. The typical attributes of a toy would include a name, description, age suitability, gender suitability and price. You could represent all of this information in a Morph_Object, as follows:

You can now create new documents from this template, by instantiating new Toy objects, setting their properties, and saving them with save(). Here’s an example:

In a similar vein, you can retrieve an object by ID, update it and save it back to the database:

As these examples illustrate, MongoDB provides a solid, feature-rich implementation of a schema-less database system. Availability for different platforms, easy integration with PHP and other languages, and extensive documentation (plus a very cool interactive online shell for experimentation) make it ideal for developers looking for a modern, document-oriented database. Try it out sometime, and see what you think!

Copyright Melonfire, 2010. All rights reserved.