PHP 101 (part 11): Sinfully Simple


Easy Peasy
The Bad Old Days
Petting Zoo
Sin City
The Shape Of Things To Come
X Marks The Spot

An Evening At The Moulin Rouge


Easy Peasy

Unless you’ve been hiding in a cave for the last few years, you’ve heard about XML – it’s
the toolkit that more and more Web publishers are switching to for content markup. You may
even have seen an XML document in action, complete with user-defined tags and markup, and
you might have wondered how on earth one converts that tangled mess of code into
human-readable content.

The answer is, not easily.

While PHP has included support for the two standard methods of parsing (read: making sense
of) XML – SAX and DOM – since version 4.0, the complexity and inherent geekiness of these
methods often turned off all but the most dedicated XML developers. All that has changed,
however, with PHP 5.0, which introduces a brand-spanking-new XML extension named SimpleXML
that takes all (and I do mean all) the pain out of processing XML documents. Keep reading,
and find out how.


The Bad Old Days

In order to understand why SimpleXML is so cool, a brief history lesson is in order.

In the days before SimpleXML, there were two ways of processing XML documents. The first,
SAX or the Simple API for XML, involved traversing an XML document and calling
specific functions as the parser encountered different types of tags. For example, you might
have called one function to process a starting tag, another function to process an ending tag,
and a third function to process the data between them. The second, DOM or the Document
Object Model, involved creating a tree representation of the XML document in memory, and then
using tree-traversal methods to navigate it. Once a particular node of the tree was reached,
the corresponding content could be retrieved and used.
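
To make the contrast concrete, here is a minimal sketch of both approaches reading the same tiny document. It is an illustration only, not code from the PHP 4 era: it assumes a current PHP build with the XML Parser and DOM extensions enabled, uses closures (PHP 5.3 and later) for the SAX handlers, and uses the PHP 5 DOMDocument class rather than the older DOM functions. The $xmlString and $events variables are made up for this example.

```php
<?php
// Illustrative document; the element names mirror the pet example used later on.
$xmlString = '<pet><name>Polly Parrot</name><age>3</age></pet>';

// --- SAX: register callbacks and let the parser fire them as it reads ---
$events = [];
$parser = xml_parser_create();
// Report element names as written rather than upper-cased (the default).
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler(
    $parser,
    function ($p, $name, $attrs) use (&$events) { $events[] = "start:$name"; },
    function ($p, $name) use (&$events) { $events[] = "end:$name"; }
);
xml_set_character_data_handler(
    $parser,
    function ($p, $data) use (&$events) { $events[] = "data:$data"; }
);
xml_parse($parser, $xmlString, true);
unset($parser);

// One line per event, in the order the parser reported them.
echo implode("\n", $events) . "\n";

// --- DOM: build the whole tree in memory, then navigate to a node ---
$dom = new DOMDocument();
$dom->loadXML($xmlString);
$name = $dom->getElementsByTagName('name')->item(0)->nodeValue;
echo "Name via DOM: $name\n";
```

Even for a two-element document, the SAX half needs three handlers and the DOM half needs a method chain just to reach one value.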

Neither approach was particularly user-friendly. SAX required the developer to hand-craft
event handlers for each type of element encountered in an XML file. The DOM approach used
an object-oriented paradigm that tended to throw developers off, and it was memory-intensive
and thus inefficient with large XML documents. More broadly, the various XML extensions in
PHP 4 were built on different backend libraries, which led to inconsistencies in the way
they worked and created interoperability concerns (as well as a fair amount of confusion
for developers).

With PHP 5.0, a concerted effort was made to fix this problem, by adopting the libxml2
library (http://www.xmlsoft.org/)
as the standard library for all XML extensions and by getting the various XML extensions
to operate more consistently. The biggest change in the PHP 5 XML pantheon, though, is
the SimpleXML extension developed by Sterling Hughes, Rob Richards and Marcus Börger,
which attempts to make parsing XML documents significantly more user-friendly than it
was in PHP 4.

SimpleXML works by converting an XML document into an object, and then turning the
elements within that document into object properties which can be accessed using standard
object notation. This makes it easy to drill down to an element at any level of the XML
hierarchy to access its content. Repeated elements at the same level of the document tree
are represented as arrays, while custom element collections can be created using XPath
location paths (of which, more later); these collections can then be processed using PHP’s
standard loop constructs. Accessing element attributes is as simple as accessing the keys
of an associative array – there’s nothing new to learn, and no special code to write.

In order to use SimpleXML and PHP together, your PHP build must include support for
SimpleXML. This support is enabled by default in both the UNIX and Windows versions of
PHP 5. Read more about this at http://www.php.net/manual/en/ref.simplexml.php. If you’re
a PHP 4 user, you’re out of luck – SimpleXML is only available for PHP 5.


Petting Zoo

To see how SimpleXML works, consider the following XML file:


<?xml version="1.0"?>
<pet>
    <name>Polly Parrot</name>
    <age>3</age>
    <species>parrot</species>
    <parents>
        <mother>Pia Parrot</mother>
        <father>Peter Parrot</father>
    </parents>
</pet>

Now, you need a way to get to the content enclosed between the <name>,
<age>, <species> and <parents> elements. With SimpleXML, it’s a snap:


<?php
// set name of XML file
$file = "pet.xml";

// load file
$xml = simplexml_load_file($file) or die ("Unable to load XML file!");

// access XML data
echo "Name: " . $xml->name . "\n";
echo "Age: " . $xml->age . "\n";
echo "Species: " . $xml->species . "\n";
echo "Parents: " . $xml->parents->mother . " and " . $xml->parents->father . "\n";
?>

The action begins with the simplexml_load_file() function, which accepts
the path and name of the XML file to be parsed. The result of parsing the file is a
PHP object, whose properties correspond to the elements under the root element. The
character data within an element can then be accessed using standard
object->property notation, beginning with the root element and moving
down the hierarchical path of the document.
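
One detail worth knowing: each property is itself a SimpleXML object rather than a plain string. Echoing it works because PHP converts it to a string for you, but for comparisons or arithmetic an explicit cast is the safer route. The sketch below assumes a cut-down version of the pet document, loaded from a string so that the example stands alone.

```php
<?php
$xml = simplexml_load_string(
    '<pet><name>Polly Parrot</name><age>3</age><species>parrot</species></pet>'
);

// Each property is a SimpleXMLElement object, not a plain string.
$isObject = $xml->name instanceof SimpleXMLElement;

// Cast explicitly when a scalar is needed for comparison or arithmetic.
$name = (string) $xml->name;
$ageNextYear = (int) $xml->age + 1;

echo $name . " will be " . $ageNextYear . " next year\n";
```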

Just as you can read, so also can you write. SimpleXML makes it easy to alter the
contents of a particular XML element – simply assign a new value to the corresponding
object property. Here’s an example:


<?php
// set name of XML file
$file = "pet.xml";

// load file
$xml = simplexml_load_file($file) or die ("Unable to load XML file!");

// modify XML data
$xml->name = "Sammy Snail";
$xml->age = 4;
$xml->species = "snail";
$xml->parents->mother = "Sue Snail";
$xml->parents->father = "Sid Snail";

// write new data to file
file_put_contents($file, $xml->asXML());
?>

Here, the original XML file is first read in, and then the character data enclosed within
each element is altered by assigning new values to the corresponding object properties. The
asXML() method, which returns the XML tree as a string, is then combined with the
file_put_contents() function to overwrite the original XML document with the new data.
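
As a small aside, asXML() can also do the writing itself when it is handed a filename. The following sketch shows both forms side by side; the temporary file created with tempnam() is an assumption made purely so the example can run anywhere without touching pet.xml.

```php
<?php
$xml = simplexml_load_string('<pet><name>Polly Parrot</name><age>3</age></pet>');
$xml->name = 'Sammy Snail';

// With no argument, asXML() returns the document as a string.
$asString = $xml->asXML();

// With a filename, asXML() writes the document to that file instead.
$target = tempnam(sys_get_temp_dir(), 'pet'); // throwaway path for this sketch
$written = $xml->asXML($target);

// Read the file back in to confirm the change was saved, then clean up.
$reloaded = simplexml_load_file($target);
echo $reloaded->name . "\n";
unlink($target);
```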


Sin City

Repeated elements at the same level of the XML hierarchy are represented as array elements,
and can be accessed using numeric indices. To see how this works, consider the following XML file:


<?xml version="1.0"?>
<sins>
    <sin>pride</sin>
    <sin>envy</sin>
    <sin>anger</sin>
    <sin>greed</sin>
    <sin>sloth</sin>
    <sin>gluttony</sin>
    <sin>lust</sin>
</sins>

Here’s the PHP script that reads it and retrieves the data from it:


<?php
// set name of XML file
$file = "sins.xml";

// load file
$xml = simplexml_load_file($file) or die ("Unable to load XML file!");

// access each <sin>
echo $xml->sin[0] . "\n";
echo $xml->sin[1] . "\n";
echo $xml->sin[2] . "\n";
echo $xml->sin[3] . "\n";
echo $xml->sin[4] . "\n";
echo $xml->sin[5] . "\n";
echo $xml->sin[6] . "\n";
?>

If you’d prefer, you can even iterate over the collection with a foreach()
loop, as in this next, equivalent listing:


<?php
// set name of XML file
$file = "sins.xml";

// load file
$xml = simplexml_load_file($file) or die ("Unable to load XML file!");

// iterate over <sin> element collection
foreach ($xml->sin as $sin) {
    echo "$sin\n";
}
?>
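
If you need to know how many repeated elements there are, for instance to drive an index-based loop without guessing at an upper limit, count() should do the job on reasonably recent PHP 5 releases and later. The sketch below assumes a shortened sins list, loaded from a string so that it runs on its own.

```php
<?php
$xml = simplexml_load_string(
    '<sins><sin>pride</sin><sin>envy</sin><sin>anger</sin></sins>'
);

// count() reports how many <sin> elements sit at this level.
$total = count($xml->sin);

// An index-based loop, bounded by the real element count.
$lines = [];
for ($i = 0; $i < $total; $i++) {
    $lines[] = ($i + 1) . ". " . $xml->sin[$i];
}
echo implode("\n", $lines) . "\n";
```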


The Shape Of Things To Come

SimpleXML handles element attributes as transparently as it does elements and their
content. Attribute-value pairs are represented as members of a PHP associative array,
and can be accessed like regular array elements. To see how this works, take a look
at this script:


<?php
// create XML string
$str = <<<XML
<?xml version="1.0"?>
<shapes>
    <shape type="circle" radius="2" />
    <shape type="rectangle" length="5" width="2" />
    <shape type="square" length="7" />
</shapes>
XML;

// load string
$xml = simplexml_load_string($str) or die ("Unable to load XML string!");

// for each shape
// calculate area
foreach ($xml->shape as $shape) {
    if ($shape['type'] == "circle") {
        $area = pi() * $shape['radius'] * $shape['radius'];
    }
    elseif ($shape['type'] == "rectangle") {
        $area = $shape['length'] * $shape['width'];
    }
    elseif ($shape['type'] == "square") {
        $area = $shape['length'] * $shape['length'];
    }
    echo $area."\n";
}
?>

Unlike previous examples, which used an external XML file, this one creates the XML
dynamically and loads it into SimpleXML with the simplexml_load_string() function.
A foreach() loop then iterates over the <shape> elements, and the area of each shape
is calculated according to the value of that element’s type attribute. The listing
above demonstrates how attribute values can be accessed as keys of the attribute array
associated with each element property.
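
When you don't know in advance which attributes an element carries, the attributes() method lets you loop over all of them, and isset() tells you whether a particular one is present. The following is a minimal sketch under the assumption of a single rectangle; the $attrs and $hasRadius variables exist only for this illustration.

```php
<?php
$xml = simplexml_load_string(
    '<shapes><shape type="rectangle" length="5" width="2" /></shapes>'
);
$shape = $xml->shape[0];

// attributes() yields name => value pairs for one element.
$attrs = [];
foreach ($shape->attributes() as $name => $value) {
    $attrs[$name] = (string) $value; // cast: each value is a SimpleXMLElement
}

// isset() reports whether an attribute exists, which helps with optional ones.
$hasRadius = isset($shape['radius']);

// Explicit casts make the arithmetic independent of implicit conversions.
$area = (float) $shape['length'] * (float) $shape['width'];
echo "Area: $area\n";
```

The same array-style access works on the root element too, so an attribute on the document's outermost tag can be read directly from the object returned by the loader.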


X Marks The Spot

SimpleXML also supports custom element collections, through XPath location paths. For
those of you new to XML, XPath is a standard addressing mechanism for an XML
document, allowing developers to access collections of elements, attributes or text
nodes within a document. Read more about XPath at http://www.w3.org/TR/xpath.html and
http://www.melonfire.com/community/columns/trog/article.php?id=83.

To see how this works, consider the following XML document:


<?xml version="1.0"?>
<ingredients>
    <item>
        <desc>Boneless chicken breasts</desc>
        <quantity>2</quantity>
    </item>
    <item>
        <desc>Chopped onions</desc>
        <quantity>2</quantity>
    </item>
    <item>
        <desc>Ginger</desc>
        <quantity>1</quantity>
    </item>
    <item>
        <desc>Garlic</desc>
        <quantity>1</quantity>
    </item>
    <item>
        <desc>Red chili powder</desc>
        <quantity>1</quantity>
    </item>
    <item>
        <desc>Coriander seeds</desc>
        <quantity>1</quantity>
    </item>
    <item>
        <desc>Lime juice</desc>
        <quantity>2</quantity>
    </item>
</ingredients>

Now, let’s suppose you want to print all the <desc> elements. You
could do it by iterating over the array of <item> elements, as
discussed earlier…or you could just create a custom collection of only the
<desc> elements with the xpath() method, and
iterate over that instead:


<?php
// set name of XML file
$file = "ingredients.xml";

// load file
$xml = simplexml_load_file($file) or die ("Unable to load XML file!");

// get all the <desc> elements and print
foreach ($xml->xpath('//desc') as $desc) {
    echo "$desc\n";
}
?>

Using XPath, you can get even fancier than this – for example, by creating a collection
of only those <desc> elements whose corresponding quantities are two or more.


<?php
// set name of XML file
$file = "ingredients.xml";

// load file
$xml = simplexml_load_file($file) or die ("Unable to load XML file!");

// get the <desc> elements of items with a quantity greater than 1 and print
foreach ($xml->xpath('//item[quantity > 1]/desc') as $desc) {
    echo "$desc\n";
}
?>


Without XPath, accomplishing this would be far more complicated than the five lines of code
above…try it for yourself and see!
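
XPath predicates can test attributes as well as child elements, which is handy when you want to pick out elements by an identifier or a type. Here is a minimal sketch, assuming a small shapes document like the one from the previous section; the variable names are invented for the example.

```php
<?php
$xml = simplexml_load_string(
    '<shapes>'
    . '<shape type="circle" radius="2" />'
    . '<shape type="square" length="7" />'
    . '<shape type="circle" radius="3" />'
    . '</shapes>'
);

// @type addresses an attribute; the predicate keeps only the circles.
$circles = $xml->xpath('//shape[@type="circle"]');

$radii = [];
foreach ($circles as $circle) {
    $radii[] = (int) $circle['radius'];
}
echo implode(", ", $radii) . "\n";

// A path that matches nothing gives an empty result (an empty array on
// current PHP versions), so check before indexing into it.
$triangles = $xml->xpath('//shape[@type="triangle"]');
if (empty($triangles)) {
    echo "No triangles found\n";
}
```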


An Evening At The Moulin Rouge

Now that you’ve seen what SimpleXML can do, let’s wrap this up with an example of how you
might actually use it. Let’s suppose you have a bunch of movie reviews marked up in XML,
like this:


<?xml version="1.0"?>
<review id="57" category="2">
    <title>Moulin Rouge</title>
    <teaser>
        Baz Luhrmann's over-the-top vision of Paris at the turn of the century
        is witty, sexy...and completely unforgettable
    </teaser>
    <cast>
        <person>Nicole Kidman</person>
        <person>Ewan McGregor</person>
        <person>John Leguizamo</person>
        <person>Jim Broadbent</person>
        <person>Richard Roxburgh</person>
    </cast>
    <director>Baz Luhrmann</director>
    <duration>120</duration>
    <genre>Romance/Comedy</genre>
    <year>2001</year>
    <body>
        A stylishly spectacular extravaganza, Moulin Rouge is hard to
        categorize; it is, at different times, a love story, a costume drama,
        a musical, and a comedy. Director Baz Luhrmann (well-known for the
        very hip William Shakespeare's Romeo + Juliet) has taken some simple
        themes - love, jealousy and obsession - and done something completely
        new and different with them by setting them to music.
    </body>
    <rating>5</rating>
</review>

Now, you want to display this review on your Web site. So, you need a PHP script to
extract the data from this file and place it in the appropriate locations in an HTML
template. With everything you’ve learned so far, this is a snap…as the code below
illustrates:


<?php
// set name of XML file
// normally this would come through GET
// it's hard-wired here for simplicity
$file = "57.xml";

// load file
$xml = simplexml_load_file($file) or die ("Unable to load XML file!");
?>
<html>
<head><basefont face="Arial"></head>
<body>

<!-- title and year -->
<h1><?php echo $xml->title; ?> (<?php echo $xml->year; ?>)</h1>

<!-- slug -->
<h3><?php echo $xml->teaser; ?></h3>

<!-- review body -->
<?php echo $xml->body; ?>

<!-- director, cast, duration and rating -->
<p align="right">
<font size="-2">
Director: <b><?php echo $xml->director; ?></b>
<br />
Duration: <b><?php echo $xml->duration; ?> min</b>
<br />
Cast: <b><?php foreach ($xml->cast->person as $person) { echo "$person "; } ?></b>
<br />
Rating: <b><?php echo $xml->rating; ?></b>
</font>
</p>

</body>
</html>

Pretty simple, huh?
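
The comment in the script mentions that the file name would normally come through GET, that is, from a parameter in the page's URL. Here is one hedged sketch of how that might look. The reviewFileFromId() helper, the "id" parameter name and the review.php?id=57 URL are all assumptions invented for this illustration, and the sample title is made up to show why escaping matters when XML text is dropped into HTML.

```php
<?php
// Hypothetical helper: map an "id" request parameter to a review file name.
// Only all-digit ids are accepted, so a value like "../secret" is rejected.
function reviewFileFromId($id)
{
    if (!is_string($id) || !ctype_digit($id)) {
        return null;
    }
    return $id . ".xml";
}

// In a real page the value would arrive as $_GET['id'] (e.g. review.php?id=57).
// The fallback to '57' keeps this sketch runnable from the command line.
$id = isset($_GET['id']) ? $_GET['id'] : '57';
$file = reviewFileFromId($id);

// Escape XML text before placing it in HTML, since it may contain < or &.
$xml = simplexml_load_string('<review><title>Tom &amp; Jerry</title></review>');
$safeTitle = htmlspecialchars((string) $xml->title, ENT_QUOTES, 'UTF-8');
echo "<h1>$safeTitle</h1>\n";
```

Rejecting anything that isn't purely numeric means a visitor cannot use the parameter to point the script at some other file on the server.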

That’s about all for the moment. In Part Twelve of PHP 101, I’ll be telling you all
about the new exception handling model in PHP 5, showing you how you can use it to
catch your scripts before they crash and burn. See you there!


Copyright Melonfire, 2004 (http://www.melonfire.com). All rights reserved.


13 Responses to “PHP 101 (part 11): Sinfully Simple”

  1. gelie Says:

    <?php

    $file="stats.xml";

    $xml = simplexml_load_file($file) or die('Unable to load XML file!!');

    foreach($xml->player_stats->stat as $data)
    echo $data['name']." = ".$data."<br />\n";

    ?>

  2. everlof Says:

    Thank you for these tutorials, they are really nice!

    But I’ve got a problem that I couldn’t find covered anywhere in the tutorial.

    <stats>
    <player_stats aid='6'>
    <stat name='nickname'>Maliken</stat>
    <stat name='acc_games_played'>37</stat>
    <stat name='acc_wins'>29</stat>
    <stat name='acc_losses'>8</stat>
    </player_stats>
    </stats>

    How do I access 'Maliken', '37', '29' and '8'?

    Thanks!

  3. mgkingston Says:

    I’m running on Windows XP, with WAMP Server running. I created the file, 57.xml, and created a .php file for the php code, but I keep getting "Unable to load XML file!". First, I thought the problem was the name of the file, being all numeric, so I changed the file name and also the php code, but I get the same message. So, I thought, maybe there’s a conflict from the previous example, so I changed the file name to "ingredients.xml" and, wouldn’t you know, it sort of worked. However, all I get are the headings for "Director", "Cast", "Duration", and "Rating"… no data from the xml. I even tried starting all over with a fresh browser session (Firefox, by the way), but no luck.

    All the previous examples worked well.

    Any ideas?

    Thanks,

    Merrill

  4. whrkit Says:

    These are great examples, but how would one handle a XML response coming from a 3rd party API? Save it as a file first? I do have the XML response as a variable and can echo it out, but not sure on how to parse it.

  5. _____anonymous_____ Says:

    Here is the link to the next lesson…njoy!

    http://devzone.zend.com/article/652-PHP-101-part-12-Bugging-Out—Part-1

  6. edarroyo Says:

    // normally this would come through GET

    What do you mean by GET?

  7. _____anonymous_____ Says:

    Hi,

    and thanks for these tutorials, they are making such a difference to me!

    However, the link to the next lesson is broken (as was the link from the previous lesson to this one).

    Could they possibly be fixed, or could you please provide some working nav so I (and many others, I suspect), can continue with the tutorials?

    It would be so appreciated!

    Many thanks.

  8. kjetiltroan Says:

    If I have a xml file like this:
    <?xml version="1.0" ?>
    <database>
    <article id="1">
    <title>title1</title>
    <head>this is an interesting article</head>
    <author>Me</author>
    <text>This is the looooong story!!</text>
    </article>
    <article id="2">
    <title>title2</title>
    <head>this is an interesting article No.2</head>
    <author>Me</author>
    <text>This is another looooong story!!</text>
    </article>
    </database>

    How can I find the article by ID instead of ->article[0] and ->article[1]?

    Also, when adding articles, how can I iterate through the IDs and create a new ID ONLY if that ID is not taken? More like a traditional database, I know.

    Another thing:

    Lets say I have a bunch of <comment><title></title><name></name><commenttext></commenttext></comment> inside each article. How can I delete one of them only? Like deleting unwanted comments, spam, abuse and so on…

    Thanks for any reply:)

  9. _____anonymous_____ Says:

    <b>h</b><span>

  10. manujdel Says:

    <?php

    // load string
    $xml = simplexml_load_file("test.xml") or die ("Unable to load XML file!");

    $N=500; // bad thing to do

    for ($i=0;$i<$N;$i++)
    {
    $temp=$xml->item[$i]->desc; //main point to show

    if($temp=="")
    {echo "break"; break;}
    else
    if($xml->item[$i]->quantity > 1) //main point to show
    echo $temp;

    }

    ?>

  11. nickweavers Says:

    So, in the "An Evening At The Moulin Rouge" example, how would you echo the review id and category attributes directly? I have seen it done using the attributes() method in a foreach loop something like this:

    // Load up the root element attributes
    foreach($xml->attributes() as $name=>$attr) {
    $res[$name]=$attr;
    }

    However, I would like to know if I can use a direct path?

    Thanks,
    Nick.

  12. alexcory Says:

    Hey, My name is Alex.

    I just got hired by Dominion Enterprises as a PHP developer and they directed me to these tutorials. I noticed your comment and I may have a solution to your problem (even though it’s been maybe a month or so).

    anyway take a look at what I got below, this is for writing a .html file, I’m assuming you can use this for .xml too. I work under php version 5 so I don’t know if it works with earlier versions.

    Here is the code:

    $filename = "testfile.html";
    $contents = <<< FILE_DONE
    <html>
    <head>
    <title>testfile</title>
    </head>
    <body>
    This is a test file for .html code
    </body>
    </html>
    FILE_DONE;

    file_put_contents($filename, $contents) or die('could not write to file');

    //END CODE

    the <<< FILE_DONE is an interesting way of combining a set of tags/commands/strings all at once. it will set $contents equal to everything until it sees another FILE_DONE with a ; at the end. FILE_DONE can also be changed to ALL_DONE or any other name so long as you end with that name.

    WARNING!!! using this method the ending FILE_DONE; MUST be set all the way to the left of your code, if it is tabbed over in any way it will provide a parse error.

    I hope this helps with your problem, c ya later.

    –AlexCory

  13. mercury7 Says:

    Whilst I found this tutorial very useful, I regret the lack of a code example to create & write to a new file.

    I am able to write to a new file if I create it separately and it contains all the required XML tags. However, I’d like to know if it’s possible to write the file creation (like the string creation in ‘The Shape Of Things To Come’) into the script so the file is created on the fly & then written to.

    Any help on this that could be posted to this comment page would be appreciated.

    Thanks

    Mark