Original Content
 ,  

Recursion In PHP: Tapping Unharnessed Power

By Robert Peake

November 18, 2004

Overview
Introduction
Recursion Defined: When A Function Calls A Function
Iteration
Recursion in Context
A Recursive Example: Factorials
Another Example: File System Traversal
Conclusion: Appropriate Use
Further Reading

About the Author


Overview

I liken PHP to the Punk Rock movement of the late 1970s. The raw power of the movement
has attracted tremendous participation. And just as anyone who could strum power chords
and scream along to a melody could start their own punk band, likewise anyone who can
search php.net and download sample code can learn to program PHP. The lack of barriers
present in other languages – like licensing fees, proprietary approaches, and privately
guarded examples – makes PHP all the more attractive and available to the programmer on
the street. Frankly, this thrills me, because it is this conscious commitment to openness
that has made PHP one of the most popular programming languages of our time.

The accessibility of PHP also means that the PHP community is composed of programmers
from a very diverse blend of backgrounds, many of them self taught. Such programmers
sometimes miss out on some of the finer points of programming that can be learned through
academic training. Next to object oriented programming, normalization of relational data
structures and refactoring, recursion is one of the topics in computer science most often
overlooked by the self-taught PHP programmer. This article is an attempt to help redress
this situation.


Introduction

In a recent survey I conducted ( "http://forums.dreamincode.net/index.php?showtopic=9158&hl=recursion"
target="_blank">http://forums.dreamincode.net/index.php?showtopic=9158&hl=recursion
), nearly
half the PHP programmers I polled didn’t really know what recursion was or how to use it.
In this article, we will define recursion, contrast it with iteration, and step through
two examples: factorials and file system traversal. We will also look at more advanced
implications, such as the use of global variables and depth versus breadth traversal.
Finally, we will clarify when it is most appropriate to use recursion and when iteration
will suffice.

This article is aimed at novice to intermediate PHP programmers, particularly those
without a formal background in computer science, who want to understand the full power of
recursion.


Recursion Defined: When A Function Calls A Function

(To be sung to the tune of “When the Moon Hits Your Eye”:) “When a function calls a
function in a Turing machine … that’s recu-u-u-u-ursion!”

Your first question is probably, “What’s this ‘Turing machine’ thing doing in place of
the usual lyrics, ‘big pizza pie’?” Don’t worry about it. The short answer is to just
consider that the PHP runtime environment is a Turing-complete environment -
and we all know that the PHP runtime environment is almost as near and dear to a
programmer’s heart as pizza.

To put it simply: recursion means anything that references itself. In computer science,
we mostly talk about recursive functions – functions that call themselves. We will get
into some examples momentarily.


Iteration

You are probably familiar with the two most popular iterative constructs in PHP:
for and while. Here are two examples of iterating over an
array:


<?php

for ($i = 0; $i < sizeof($array); $i++) {

        print $array[$i];

}

?>



<?php

$i = 0;

while (
$i < sizeof($array)) {

        print $array[$i];

        
$i++;

}

?>



Both examples print out all the elements of a one-dimensional array. Both examples are

iterative.

Simply put, iteration is doing something inside a loop. If you have never heard of
recursion, you probably never realized that there is an alternative approach. This is
because it is relatively easy to solve simple problems using loops. Recursion, on the
other hand, seems to involve a shift in thinking for some programmers. The shift is well
worth it, however, since some problems can be solved more elegantly using recursion.


Recursion in Context

Some languages are designed with recursion in mind. A classic example is LISP. Most
of a LISP programmer’s life is spent writing recursive functions that traverse
linked-list data structures. The language itself is tuned for recursion, so that
recursive functions execute very quickly and efficiently.

PHP, by contrast, chooses to focus on iteration to do most of its heavy lifting. This is
probably because it draws on the heritage of C and (increasingly) Java, which do the same.
None the less, recursion is not only possible in these languages – sometimes it is desirable.


A Recursive Example: Factorials

Mathematically speaking, a factorial (denoted by “!”) is the result of a number
multiplied by all positive integers less than that number. Non-integer and negative
values are considered undefined. Zero factorial (0!), however, is defined as 1.

6! = 6 * 5 * 4 * 3 * 2 * 1 or 720

Here is one way to write the factorial function using iteration:


<?php

//iterative factorial function

function factorial($number) {

    $result = 1;

    while (
$number > 0) {

        print
"result = $result, number = $number\n";

        $result *= $number;

        
$number--;

        }

    return
$result;

}

?>

Running print factorial(6); should produce the output:

result = 1, number = 6
result = 6, number = 5
result = 30, number = 4
result = 120, number = 3
result = 360, number = 2
result = 720, number = 1
720

As you can see, we start out with the default values of $result and
$number in the first line. Then we multiply $result by

$number minus one until $number reaches one.

Now let’s do the same thing recursively:


<?php

//recursive factorial function

function factorial($number) {

    if ($number < 2) {

        return
1;

    } else {

        return (
$number * factorial($number-1));

    }

}

print factorial(6);

?>




As you can see, we take a totally different approach here. Instead of setting a

$result variable, we use the $number variable exclusively. How
can we get away with this? Instead of keeping track of a counter variable (like
$i in the for/while examples above), in simple
recursion we are keeping track of the operations on a single variable.

The if statement in this example serves two purposes. The first, obvious, purpose
is to return the value 1 if you call factorial(0) or

factorial(1) from somewhere else inside your code.

But there is a second, even more important, purpose to this statement. It serves as what
is called the end condition for the recursion. Just as the end condition for
ending the while loop is that $number is no longer greater than
zero, here the end condition is when $number is less than two.

Again, the best way to see what this function is doing is to add an output line. In this
case, let’s add a more descriptive line that tells us about the recursive function call.
Just after the else clause, we add:


print "factorial($number) is ($number * factorial(".($number-1)."))\n";


Now, running print factorial(6); gives us:

factorial(6) is (6 * factorial(5))
factorial(5) is (5 * factorial(4))
factorial(4) is (4 * factorial(3))
factorial(3) is (3 * factorial(2))
factorial(2) is (2 * factorial(1))
720

Where is the running total? There isn’t one. Instead of multiplying out a
$return value, we are simply multiplying the number by the factorial of the
number minus one. It is the multiplication that is aggregating our code all
together. For this reason, the multiplication operator (*) is said to be our

aggregator.

In the end, the last line evaluates to:

return 6 * 5 * 4 * 3 * 2 * 1;

instead of the last line of the iterative version, which evaluated to:

return 720;

Because this particular type of recursion is performing the recursive function call as
the second part of the return line, it is called tail recursion. If we
rewrote the line this way:


return (factorial($number-1) * $number);


we would be using head recursion, and in this case, would get an identical
result. When dealing with numbers, tail and head recursion often return the same thing.
When dealing with data structures, however, it can be a different matter altogether.

As you can see, recursion takes a bit of a shift in thinking. But once you make that
shift, you can reap two benefits:

  • No need for a counter
  • More natural thinking

By natural thinking, I mean that we are thinking about the factorial problem
in much the same way that a mathematician might define a factorial. Once you
become more comfortable with recursion, this shift in mindset can help you to think about
the problem at the level of the problem.

A word of caution: note that both functions will return incorrect values if you pass in a
negative or non-integer value. So, adding the line:


if ($number < 0 || !is_int($number)) {

    return
NULL;

}



will help make these functions safe for general use by returning a value we can equate
with the mathematical term “undefined” – in this case, NULL.


Another Example: File System Traversal

The next example is far more useful:


<?php

//a simple file tree traversal - takes in a path and returns a nested array

$delim = strstr(PHP_OS, "WIN") ? "\\" : "/";

function retrieveTree($path)  {

    global
$delim;

    if ($dir=@opendir($path)) {

        while (($element=readdir($dir))!== false) {

            if (
is_dir($path.$delim.$element) && $element!= "." && $element!= "..") {

                $array[$element] = retrieveTree($path.$delim.$element);

            } elseif (
$element!= "." && $element!= "..") {

                $array[] = $element;

            }

        }

        
closedir($dir);

    }

    return (isset(
$array) ? $array : false);

}

?>




Assuming the following file structure:

foo
foo/bar
foo/baz
foo/baz/bop
foo/baz/bop/file4.txt
foo/baz/file2.txt
foo/baz/file3.txt
foo/file1.txt

a call to:


print_r(retrieveTree('foo'));

will yield the result:

Array
(
    [bar] =>
    [baz] => Array
        (
            [bop] => Array
                (
                    [0] => file4.txt
                )

            [0] => file2.txt
            [1] => file3.txt
        )

    [0] => file1.txt
)

How does this function achieve a complete traversal of the file system? The function
actually iterates through all the elements of a directory. Then it examines each element.
If the element is a file, it is added to the array we are constructing. If it is a
directory, the result of a call to retrieveTree($element) is added into the
array.

A word of caution about global variables: In this case, we imported a global variable to
decide how to delimit a file path. That’s fine. However, setting global variables from
inside a recursive function is generally frowned upon. This is because setting a global
variable typically means you are trying to keep track of something outside the recursive
scenario you have set up. In the end, global variables referenced from inside recursive
functions can end up confusing you, your code, and all who look at it.

This function is termed depth-first, because as soon as the case structure detects
a directory it chases off down that directory with a recursive call. The other form of
tree traversal, breadth-first, involves examining all the files in a particular
subdirectory (including other directories) first, then going down to the next level by
examining a subdirectory.

Since we can not control the order in which the opendir() function gives us
directory elements, if we are going to consider all the elements before moving on to the
next subdirectory, we need to separate out the directories when we encounter them so that
we can traverse them only when we have looked at everything else in the directory. One
way to do this is to push all the directories we encounter into a temporary array, and
then iterate through that array to make recursive function calls on each directory. This
example uses another iterative loop to defer the recursive calls until the end:


<?php

//a breadth-first file tree traversal

$delim = strstr(PHP_OS, "WIN") ? "\\" : "/";

function retrieveTree($path) {

    global
$delim;

    if ($dir = @opendir($path)) {

        while (($element = readdir($dir)) !== false) {

            if (
is_dir($path.$delim.$element) && $element != "." && $element != "..") {

                //$array[$element] = NULL;

                
$temp_array[] = $element;

            } elseif (
$element != "." && $element != "..") {

                
$array[] = $element;

            }

        }

        if (isset($temp_array)) {

            for (
$i = 0; $i < sizeof($temp_array); $i++) {

                $element = $temp_array[$i];

                
$array[$element] = retrieveTree($path.$delim.$element);

            }

        }

        closedir($dir);

    }

    return (isset(
$array) ? $array : false);

}

?>




Notice that we never need to reset our $temp_array variable, because each
function call happens within a separate environment space, so the
$temp_array of the calling function does not get passed down to the next
function call unless it is passed as an argument to the function.

The result of:


print_r(retrieveTree('foo'));



is now:

Array
(
    [0] => file1.txt
    [bar] =>
    [baz] => Array
        (
            [0] => file2.txt
            [1] => file3.txt
            [bop] => Array
                (
                    [0] => file4.txt
                )

        )

)

Scanning from top-to-bottom, this new printout reveals that we have dealt with all the
elements in the directory before moving on to examine subdirectories, even if dealing
with a directory just means pushing it into a temporary array. If we wanted to preserve
the original array structure we achieved with the depth-first search, but go about
creating that structure in a breadth-first way, we could uncomment the line:


$array[$element] = NULL;



just before the directory $element is pushed into the
$temp_array. This would put the key value for the directory into the array
immediately, so that it can be set later with our recursive call. The result would look
like the results of our depth-first tree, except that the approach to getting that result
would be a breadth-first traversal.

Obviously, the purpose of distinguishing between breadth-first and depth-first searches
is to construct efficient searches. If you are more likely to meet your exit
condition
(e.g. find the file you are looking for) by looking through everything in
a directory before moving on, then you should be implementing the breadth-first algorithm.


Conclusion: Appropriate Use

Computer science theory tells us that everything that can be done iteratively can be done
recursively, and vice versa. However, just because it is possible does not mean it is
always appropriate or even easy to implement.

As we have seen, implementing a factorial function recursively is
slightly more elegant in terms of the number of variables involved and the way you get to
think about the problem. Approaching the problem recursively also means thinking about
the factorial problem in much the same way that a mathematician might define a factorial.
It is therefore arguably a more natural approach to an algorithm than using a
counting variable.

When implementing a breadth-first search, we employed both recursion and iteration;
however, the critical working component was recursion – since the recursive case statement
outside the while loop was the part of the code that determined when to stop
moving through sub-trees. The iterative component really just grabbed all the elements at
each successive level of the file system. It was recursion that allowed us to go deeper.

The more you get comfortable with recursion, the more you will see how elegant and
powerful it can be. An easy way to get started is to attempt to write functions that you
would normally write iteratively in a recursive fashion. Remember to trace your function
calls – the easiest way is to print the recursive function call. This is what we did in
the factorial exercise. Another, more simplified example of tracing function calls is as
follows:


<?php

function foo($bar) {

   if(
$bar > 0 && !($bar < 0)) { //if $bar is greater than zero and not negative

     $bar--;

     print(
"Called foo($bar)");

     
foo($bar);

   }

}

?>




The performance difference between recursion and iteration is typically pretty
negligible, and the benefits gained make wrapping your head around recursion a worthwhile
exercise.

In the case of tree traversals (e.g. file systems), recursion is clearly the best
approach. In fact, any time the depth or breadth of a structure is unknown, using
recursion makes much more sense than using iteration. With file system traversals it is
possible but much, much more difficult to implement those functions iteratively. The
process involves pushing information into “stacks” (in our case, we could use PHP arrays)
for later retrieval and processing. This process is tantamount to writing an assembly-level
process (a stack) in a high-level language like PHP. The fact that this can be done does
not necessarily mean that it is the fastest, most elegant, or even easiest way to write code.

For these reasons it is clear that recursion has a valuable niche in computer science. It
provides a more natural path through unknown structures, a more intuitive way of writing
many algorithms, and a more elegant approach to writing some kinds of code. Hopefully this
article has helped to establish recursion as an important and useful tool in your retinue
of PHP tactics. Happy coding!


Further Reading

target="_blank">http://en.wikipedia.org/wiki/Recursive_function

target="_blank">http://en.wikipedia.org/wiki/Recursion


About The Author

Robert Peake taught programming languages to computer science
students at Berkeley before earning his degree in Poetry. He recently
resigned as an IT Manager and took up consulting to spend more time
with his family and PHP. He lives in Los Angeles with his wife,
Valerie, and their two laptops.

Robert can be contacted at robert@peakepro.com.

Published: November 18th, 2004 at 12:00
Categories: Tutorials
Tags: ,

3 comments to “Recursion In PHP: Tapping Unharnessed Power”

Thank you for this guide…great approach to recursion.
Antonio

_____anonymous_____
August 20th, 2009 at 12:50 am

Thank you for the article.

Thanks so much for this example, it inspired me to figure out the coding for a recursive user tree based on parent I was writing for WordPress. Awesome Awesome. Graet explanation and detail.