Using YAML With PHP and PECL

October 1, 2007


Extending Yourself

One of the things I like best about PHP is its support for a diverse array of technologies and formats. Take, for instance, YAML. I’ve been using YAML for a while to manage configuration data for a Ruby-based application, but I recently needed to use it in a PHP-based project for the first time. Although PHP doesn’t come with built-in support for this format, a quick Google search revealed a PECL extension, ext/syck, that allowed me to quickly add YAML parsing support to my PHP build and begin reading and managing YAML-formatted files through my PHP application.

If you’ve ever encountered the same need, or if you’re just curious about the YAML format and how it can be used with PHP, then today is your lucky day. Over the next few pages, I’m going to give you a crash course in YAML and in PHP’s ext/syck extension, showing you how it can be used to efficiently translate data structures from PHP to YAML, and vice-versa. Come on in, and let’s get started!

The YAML Files

I’ll begin with the answer to a basic, yet important, question: what the heck is YAML anyhow? Well, YAML, also known as YAML Ain’t Markup Language and Yet Another Markup Language, is “a straightforward machine parsable data serialization format designed for human readability and interaction with scripting languages such as Perl and Python” (http://www.yaml.org).
Translated from Geek, all this means is that YAML is a way of formatting information such that it is easy to read by both humans and machines.
It’s tempting to think of YAML as being similar to XML, but in reality it’s not. Unlike XML, YAML doesn’t use elements and attributes to mark up data; rather, indentation is used to denote nested relationships, and punctuation elements like dashes (-) and colons (:) are used to mark lists and hashes of data items. To illustrate, consider the following simple YAML document, which sets up a hash containing five key-value pairs:

--- 
luke: good guy
darth: bad guy
emperor: even worse guy
han: cool renegades
yoda: jedi master

Or how about this one, which holds a list of values, with list items preceded by dashes:

--- 
- one
- two
- three

You can also combine hashes and lists, as in the following example, where each hash key holds a list of values:

---
droids: 
    - r2d2
    - c3p0
heroes:
    - luke
    - han

As these examples illustrate, YAML documents are much simpler than XML documents; they also have a fairly easy-to-decipher internal structure, which makes them good for applications that are used by both humans and programs, such as configuration files or log files. By virtue of their simple format, YAML files are also much easier to generate (through a script program) than equivalent XML files.
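To connect these documents to what follows, here’s a rough sketch of the PHP arrays I’d expect each of the three YAML documents above to correspond to. The variable names are my own, purely for illustration, and nothing here needs a YAML parser; it’s just the shape of the data:

```php
<?php
// Illustrative PHP equivalents of the three YAML documents above.
// Variable names are hypothetical and not part of any library.

// A YAML hash (key: value pairs) corresponds to an associative array.
$characters = array(
    'luke'    => 'good guy',
    'darth'   => 'bad guy',
    'emperor' => 'even worse guy',
    'han'     => 'cool renegades',
    'yoda'    => 'jedi master',
);

// A YAML list (dash-prefixed items) corresponds to a numerically-indexed array.
$numbers = array('one', 'two', 'three');

// Nesting by indentation corresponds to arrays nested inside arrays.
$cast = array(
    'droids' => array('r2d2', 'c3p0'),
    'heroes' => array('luke', 'han'),
);

// Pick the second droid out of the nested structure.
echo $cast['droids'][1], "\n";
```

Keeping this mapping in mind makes the conversion functions discussed later much easier to follow.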

Getting Syck

Now that you’ve understood the basics of YAML, let’s look at how to read and write YAML files in PHP. YAML support in PHP comes through PECL’s ext/syck extension, which is maintained by Alexey Zakhlestin and provides a simple API to parse YAML documents and convert them to PHP data structures (and vice-versa). This ext/syck extension, in turn, depends on the Syck library, which must be compiled and installed to your development environment before ext/syck can be built.

Note that at the current time, a Windows version of ext/syck is not available; the following steps assume a *NIX system.

The first step, then, is to download, compile and install the Syck library. Visit http://www.whytheluckystiff.net/syck/, download the source code archive (v0.55 at this time) and install it using the standard configure-make-make install cycle:

shell# cd syck-0.55
shell# ./configure
shell# make
shell# make install

Once libsyck is installed, proceed to download ext/syck (v0.9.1 at this time) from http://pecl.php.net/package/syck, and compile it into a loadable PHP module:

shell# cd syck-0.9.1
shell# phpize
shell# ./configure
shell# make

At this point, you should have a loadable PHP module named syck.so in your ./modules directory. Copy this to your PHP extension directory, and enable the extension in the php.ini configuration file (typically by adding the line extension=syck.so). Restart your Web server, and check that the extension is enabled with a quick call to phpinfo(); syck should appear among the loaded extensions.
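If you’d rather not scan through a full phpinfo() page, a short script can do the same check. This is a minimal sketch using PHP’s standard extension_loaded() and function_exists() functions; it assumes the module registers itself under the name “syck”, which matches the file name but is worth confirming on your own build:

```php
<?php
// Check whether the syck extension is loaded and exposes its two functions.
// Assumption: the extension registers itself under the name 'syck'.
$loaded = extension_loaded('syck');
$hasFunctions = function_exists('syck_load') && function_exists('syck_dump');

if ($loaded && $hasFunctions) {
    $status = 'syck is enabled';
} else {
    $status = 'syck is NOT enabled; check the extension line in php.ini';
}

echo $status, "\n";
```

Run it through the Web server rather than only from the command line, since the two may read different php.ini files.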

Country Bumpkin

Let’s begin with something simple – dynamically constructing a YAML document using ext/syck methods. Here’s the code:

<?php
// define PHP array
$data = array(
    'a' => 'America',
    'b' => 'Brazil',
    'c' => 'Canada',
    'd' => 'Denmark',
    'e' => 'England',
);

// convert to YAML and print
$yaml = syck_dump($data);
echo $yaml;
?>


Feast your eyes on the output:

--- 
a: America
b: Brazil
c: Canada
d: Denmark
e: England

The output of the script is a correctly-formatted YAML document. It was fairly easy to create: all one needs to do is pass the corresponding PHP data structure (in this case, an associative array) to the syck_dump() function, which turns it into YAML.

Here’s another example, this one creating a nested series of arrays:

<?php
// define PHP array
$data = array (
    'droids' =>
        array('r2d2', 'c3po'),
    'heroes' =>
       array(
        'one' => array('luke', 'leia'),
        'two' => array('han', 'chewbacca'))
);

// convert to YAML and print
$yaml = syck_dump($data);
echo $yaml;
?>

And here’s the output YAML (notice how indentation is used to maintain the hierarchical relationship between the various array elements):

--- 
droids: 
  - r2d2
  - c3po
heroes: 
  one: 
    - luke
    - leia
  two: 
    - han
    - chewbacca

If you’d prefer to write the YAML output to a file (instead of dumping it to the screen), simply pass the output of syck_dump() to fwrite() or file_put_contents(). The ext/syck API doesn’t currently include functions to read or write YAML content from or to disk files, although this functionality is planned for future releases (according to the extension’s TODO file).
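As a rough sketch of that file-writing step, here’s a small helper built on file_put_contents(). The function name write_yaml_file() is my own invention, not part of ext/syck, and the YAML string below is a hand-written stand-in so the snippet runs even without the extension; with ext/syck installed you’d pass in the result of syck_dump() instead:

```php
<?php
// Hypothetical helper (not part of ext/syck): write a YAML string to disk.
function write_yaml_file($path, $yaml)
{
    // LOCK_EX requests an exclusive lock for the duration of the write.
    $bytes = file_put_contents($path, $yaml, LOCK_EX);

    // file_put_contents() returns false on failure; writing 0 bytes is
    // still a successful (empty) write, so compare strictly against false.
    return $bytes !== false;
}

// Stand-in YAML string; with ext/syck this would be syck_dump($data).
$yaml = "--- \na: America\nb: Brazil\n";

// Temporary file used only for this demonstration.
$path = tempnam(sys_get_temp_dir(), 'yaml');

$ok = write_yaml_file($path, $yaml);
echo $ok ? "Saved to $path\n" : "Could not save\n";
```

The strict comparison against false matters: a plain truthiness test would report an empty (zero-byte) write as a failure.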

Of Apples And Oranges

Now, how about doing the reverse – converting a YAML document into a PHP data structure? With the syck_load() function, this is a snap. Consider the following example, which illustrates the process:

<?php
// define YAML string
$yaml = <<<END
---
- apples
- oranges
- bananas
- grapes
END;

// convert to PHP array
$data = syck_load($yaml);
print_r($data);
?>

Here, the syck_load() function accepts a YAML string as input and converts it into a PHP data structure – in this case, a numerically-indexed array. Here’s the output:

Array
(
    [0] => apples
    [1] => oranges
    [2] => bananas
    [3] => grapes
)

Here’s another example, this time with a YAML document containing key-value pairs:

---
name : John Doe
tel  : 123-4567
fax  : 987-6543
email:
  - john@domain.com
  - jdoe@domain.com

And here’s the PHP code to read this YAML file and convert it into a PHP data structure. Note the use of file_get_contents() to read the source YAML into a string and then pass this to syck_load():

<?php
// get YAML data
$yaml = file_get_contents('sample.yaml');

// convert to PHP data structure and print
$data = syck_load($yaml);
print_r($data);
?>

Here’s the output:

Array
(
    [name] => John Doe
    [tel] => 123-4567
    [fax] => 987-6543
    [email] => Array
        (
            [0] => john@domain.com
            [1] => jdoe@domain.com
        )

)

Here, the syck_load() function has first converted the colon-separated pairs of values in the YAML file into key-value pairs of a PHP associative array and then turned the two email addresses, indented and preceded by a dash, into a numerically-indexed sub-array.
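The file-reading example above assumes that the file exists and that ext/syck is available. Here’s a slightly more defensive sketch of the same idea; load_yaml_file() is a hypothetical helper of my own, and it simply returns null when anything is missing rather than trying to handle parse errors:

```php
<?php
// Hypothetical helper (not part of ext/syck): read a YAML file and return
// a PHP array, or null if the file or the extension is unavailable.
function load_yaml_file($path)
{
    if (!is_readable($path)) {
        return null; // missing or unreadable file
    }
    if (!function_exists('syck_load')) {
        return null; // ext/syck is not installed
    }

    $data = syck_load(file_get_contents($path));

    // This sketch only accepts documents that map to arrays (hashes or lists).
    return is_array($data) ? $data : null;
}

// The file does not exist, so the helper returns null.
$data = load_yaml_file('no-such-file.yaml');
var_dump($data);
```

Calling load_yaml_file('sample.yaml') with the extension installed should give the same array as the earlier listing.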

(Con)Figuring It All Out

PHP’s ext/syck extension currently only exposes the two functions you’ve seen in previous examples, but this is more than enough to enable some fairly useful applications. To illustrate, consider the next example, which demonstrates how to build a PHP application to manage application configuration variables using YAML for the configuration file format:

<html>
    <head>
        <style type="text/css">
        body {
          font-family: Verdana;
          font-size: 10pt;
        }
        </style>
    </head>
    <body>

<?php
// define name and path to config file
define ('CONFIG_FILE', './conf/local.config.yaml');

// read configuration
// (start with an empty array in case no config file exists yet)
$CONFIG = array();
if (file_exists(CONFIG_FILE)) {
    $CONFIG = syck_load(file_get_contents(CONFIG_FILE));
}

// if form not submitted
// display form
if (!isset($_POST['submit'])) {
?>

        <form method="post" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
            Global background color: <br/> <input type="text" name="bgcolor" value="<?php echo isset($CONFIG['bgcolor']) ? $CONFIG['bgcolor'] : ''; ?>" /> <p/>

            Feedback email address: <br/> <input type="text" name="email"  value="<?php echo isset($CONFIG['email']) ? $CONFIG['email'] : ''; ?>" /> <p/>

            Number of news items displayed per sector: <br/> <input type="text" name="news"  value="<?php echo isset($CONFIG['news']) ? $CONFIG['news'] : ''; ?>" /> <p/>

            Home page URL: <br/> <input type="text" name="url" value="<?php echo isset($CONFIG['url']) ? $CONFIG['url'] : ''; ?>" /> <p/>

            Signature appended to all outgoing email: <br/> <textarea name="sig" cols="30" rows="5"><?php echo isset($CONFIG['sig']) ? $CONFIG['sig'] : ''; ?></textarea> <p/>
            <input type="submit" name="submit" value="Save" />
        </form>
<?php
} else {
    // if form submitted
    // validate input

    // example validation below
    // use more stringent validation in 
    // production environments!

    // email address
    if (preg_match('/^([a-z0-9_-])+([.a-z0-9_-])*@([a-z0-9-])+(\.[a-z0-9-]+)*\.([a-z]{2,6})$/', $_POST['email'])) {
        $CONFIG['email'] = $_POST['email'];
    } else {
        die ('Improper email address format');
    }

    // number of news items
    if (ctype_digit($_POST['news'])) {
        $CONFIG['news'] = $_POST['news'];
    } else {
        die ('Improper number format');
    }

    // URL
    if (preg_match('/^(http|https|ftp):\/\/([a-z0-9]([a-z0-9_-]*[a-z0-9])?\.)+[a-z]{2,6}$/', $_POST['url'])) {
        $CONFIG['url'] = $_POST['url'];
    } else {
        die ('Improper URL format');
    }

    // background color
    if (trim($_POST['bgcolor']) != '') {
        $CONFIG['bgcolor'] = $_POST['bgcolor'];
    } else {
        die ('Missing background color');
    }

    // signature
    if (trim($_POST['sig']) != '') {
        $CONFIG['sig'] = $_POST['sig'];
    } else {
        die ('Missing outgoing signature');
    }

    // save to config file
    if (file_put_contents(CONFIG_FILE, syck_dump($CONFIG))) {
        echo 'Configuration successfully saved!';
    } else {
        echo 'Configuration could not be saved!';
    }
}
?>
    </body>
</html>

This script appears fairly long and complex, but it’s actually quite easy to understand. There are two sections to it: the first section generates a form for users to enter configuration values for a (fictional) application, while the second section validates user input and saves the values to a file in YAML format.

  • In the first section of the script, file_get_contents() reads the YAML-formatted configuration file (whose path is stored in the constant CONFIG_FILE), and syck_load() converts its contents into a PHP associative array named $CONFIG. Next, a check is performed on the $_POST['submit'] variable to see if the form has been submitted; if this test returns false, a form is generated to allow the user to enter values for the various configuration options. Notice that the form is pre-filled with the existing configuration values, which are retrieved from the $CONFIG array.
  • Once the form is submitted, the user’s input is checked and if correct, the various configuration values are formatted into an associative array, which is then turned into YAML with the syck_dump() function and written back to the configuration file with PHP’s file_put_contents() function.
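The script’s own comments ask for more stringent validation in production. One possible direction, sketched here under my own assumptions, is PHP’s filter extension (available since PHP 5.2) in place of hand-rolled regular expressions. The validate_config() helper is hypothetical, and it collects error messages instead of calling die():

```php
<?php
// Hypothetical helper: validate the format-checked fields with filter_var().
// Returns an array of error messages; an empty array means the input passed.
function validate_config(array $input)
{
    $errors = array();

    // Email address
    if (!isset($input['email'])
        || filter_var($input['email'], FILTER_VALIDATE_EMAIL) === false) {
        $errors[] = 'Improper email address format';
    }

    // Number of news items: a non-negative integer.
    // Compare strictly with false, because a valid "0" is returned as int 0.
    $intOptions = array('options' => array('min_range' => 0));
    if (!isset($input['news'])
        || filter_var($input['news'], FILTER_VALIDATE_INT, $intOptions) === false) {
        $errors[] = 'Improper number format';
    }

    // URL (note: this filter accepts any scheme, not just http, https and ftp)
    if (!isset($input['url'])
        || filter_var($input['url'], FILTER_VALIDATE_URL) === false) {
        $errors[] = 'Improper URL format';
    }

    return $errors;
}

$errors = validate_config(array(
    'email' => 'feedback@domain.com',
    'news'  => '5',
    'url'   => 'http://www.domain.com',
));
echo count($errors), " error(s)\n";
```

Returning a list of errors lets the form be redisplayed with every problem flagged at once, rather than stopping at the first bad field.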

Here’s an example of the form in action:

And once the form is submitted, here’s what you should see in the configuration file:

---
email: feedback@domain.com
news: 5
url: http://www.domain.com
bgcolor: "#00ff00"
sig: |-
  --
  This email comes to you from http://www.some-domain.com. We respect your privacy and will never sell your email address.

As this example illustrates, although ext/syck only exposes some very basic functions (at this time), these functions are more than sufficient for some fairly complex YAML read/write operations in PHP. Try it out for yourself the next time you have some data to serialize…and happy YAML-ing!

Copyright Melonfire 2007, all rights reserved

4 Responses to “Using YAML With PHP and PECL”

  1. pbsyndicator Says:

    I just found out that installing syck on a Zend Server CE environment in OS X is a-b-c nowadays … quick & dirty:

    || Install Apple’s Developer Tools (if you haven’t already…)

    N.B.: 1GB+ is a bit steep, only to get ‘configure, make, install’ running on your machine, but it works … I never bothered to find out how else to do this quick & dirty. Plus it’s got lots of yummies you’ll never use, but that look good…

    || In your Terminal run:

        sudo pecl install syck

    || Edit /usr/local/zend/etc/php.ini and add:

        extension=syck.so

    somewhere under ‘Dynamic Extensions’ (or probably at the bottom, where Zend Server puts extensions like memcache; I haven’t tried that, because I always put the line under Dynamic Extensions).

    || In your browser go to: http://localhost:10081/ZendServer
    || Select the ‘Server Setup’ tab
    || Select ‘Extensions’

    You’ll see the syck extension is already seen by Zend Server but not active.

    || Click ‘Restart PHP’

    … et voilà: "Currently turned on || syck" … and you can even turn it off with the click of a button…

  2. emanaton Says:

    Greetings All,

    Thank you for the GREAT article! It helped me to create my own implementation of Zend Config for YML files (http://www.emanaton.com/code/php/zendconfigyml). You rock!

    Regards,

    Sean P. O. MacCath-Moran
    http://www.emanaton.com

  3. markusfischer Says:

    Why would additional functions for, e.g., direct file access be necessary? Is performance so important here that a temporary variable would be a problem when using, e.g., file_get_contents()?

  4. shahar Says:

    I wrote a blog post some time ago about using syck / YAML with Zend_Config to process configuration files: http://prematureoptimization.org/blog/archives/39 . It’s quite easy if you have syck installed.