PHP Patterns: The Visitor Pattern (continued)
[ continued from part 1 of PHP Patterns: The Visitor Pattern ]
Discussion
Our Visitor implementation could bear one significant improvement. At present,
you need to implement accept() for every component in the visited
structure. This is because each accept() method must call its
corresponding visit* method; for example,
WebText::accept() has to call MyListVisitor::visitWebText().
This is annoying, given that most accept() methods are identical
apart from the MyListVisitor invocation.
We can address this problem by delegating the choice of a specialized method
to the MyListVisitor class itself. Here's how we might amend
MyListVisitor:
abstract class MyListVisitor {
|
The generic visit() method accepts a MyListComponent
object and a $level integer. It acquires the object's class name
with get_class(), and uses it to construct a method name. If the
method exists, visit() invokes it.
What does this win us? Well for a start there is a small performance penalty
in adding this extra layer. This may be worth monitoring for some kinds of
application or data structure. On the positive side, the generic method can
be called by any visited component without losing specificity. This means
that we can implement accept() in base classes only. Here's the
MyListComponent base class:
abstract class MyListComponent {
|
This version of accept() defines the functionality for node
classes; that is, those that don't maintain children. We override this behavior
in MyListComposite:
abstract class MyListComposite extends MyListComponent {
|
This ensures that accept() is called on all children of a composite
object. With these methods implemented in the MyList* base classes,
concrete children no longer need to provide implementations unless they want to
do something substantially different with their MyListVisitor object.
One thing to note about Visitors is their tendency to flatten the structure they
traverse. A visitor object is passed around without any automatic sense of where
it currently is in the visited structure. This is often not desired or irrelevant.
Sometimes, though, you need to work to build a sense of context into a visitor.
This is one reason I often build the $level argument into my visitor
interfaces.
Let's say that we need to create a Visitor that outputs a
MyListComponent tree to XML. We can send a visitor around to gather
data about each component with no problem, but how does it know where to put each
new element in relation to the last? Every time visit() is called,
it could relate to the previous component as either child or sibling. This is an
important consideration when you are building an XML tree, and it is a
surprisingly knotty problem.
First of all, I'll define a constructor for the class:
class DomVisitor extends MyListVisitor {
|
As you can see, here I'm using the DOM extension to build the XML document. The
constructor performs some initial setup, creating a root element and pushing it
onto the end of an empty array property named $elementStack. A
stack is a good way of keeping track of enclosing elements. You often see them
used in SAX scripts. Stacks of this kind follow the LIFO (last in first out)
rule, with the endmost object representing the innermost open element reached
thus far. So we push the open root element onto the stack. Anything it contains
(that is, everything in the document) will be pushed on top of it. As we close
an element we will pop it from the top of the stack, revealing its enclosing parent.
Here is one of our specialized visit* methods, visitNote():
function visitNote(Note $comp, $level) {
|
So creating the XML element is pretty easy. I query the relevant methods on the
Note object and use the data to construct sub-elements. It's only
when I've finished all this that the thorny issue arises: where do I put my new
element?
As you can see, the answer lies in the handleParent() method.
handleParent() is passed the new child element, and the
$level argument handed along by the MyListComponent
object (in this case, Note). It also has access to the
$elementStack array, from whence it will retrieve the parent
element and another property, $currentLevel, which stores the
depth of the previous element.
Here's how handleParent() works with these raw materials to
retrieve the parent element for the newly-created DomElement object:
function handleParent(DomElement $child, $level) {
|
The fact that this is not pretty is ample testimony to the fact that visitors
like to flatten out data structures. I have to work hard here to maintain an
awareness of a tree structure. I do this by comparing the depth of the current
component with that of the last. I know that an equal value makes the current
element the sibling of the previous element. A greater $level
argument means that the element is a child of the last, and a lesser value
means that the new element is the sibling of a parent or ancestor. Based on
this information, I 'close' elements appropriately by popping them off the
end of the element stack. The currently open element is the one left at the
top of the stack.
handleParent(), then, has two roles. Firstly it acquires and
returns the parent element for the new child, and secondly it adds the child
to the top of the stack after closing off (removing) any necessary elements.
Here I generate an XML representation of the MyListComponent tree
I populated earlier:
$bob = new Note("Bob's notes", "General meanderings");
|
Figure 2 shows the XML output, saved to file and displayed in the Firefox browser.
Whether this kind of juggling is worth the effort is entirely a matter of judgement. You might decide that outputting XML is close enough to a type's core function that the operation can be embedded within the components themselves. This would be easier because components could then simply provide their corresponding XML elements to their children on generation. This brings us full circle though, and carries the risk that you may begin to weigh down your type with extraneous functionality.
We have shown the Visitor pattern in the context of a Composite pattern structure. In fact any kind of structure can be adapted to work with a Visitor. In principle, visitor objects don't care where they are passed, they simply make their visit methods available for calling. There is no reason all visited components should even share the same type - a visitor class specifies individual types in its visit methods, and nothing enforces a shared supertype.
Another issue to consider is the responsibility for traversal. I tend to embed this functionality within the target components, so a parent passes its visitor object on to its children, which call the relevant visit method and then pass the object on. Of course, the visitor itself could be the engine of its own movement from component to component. You could also create a separate controller to move the visitor around.
As with the Observer pattern, there is a danger here that target classes may reveal too much of their inner workings. Generally a visited object passes itself to the visitor's visit method. In order to work with this object the visitor must then call back to get state or other information. This generally involves some loosening of encapsulation as components make more of their data publicly available. The extent of this is worth monitoring. You could change the way that components call visitor methods so that they pass data rather than an instance of themselves. This is safer, but harder to maintain, and less flexible. It's another judgement call.
Typical Pattern Characteristics
The visitor pattern can effectively decouple components from context, encapsulating context-specific functionality within visitor implementations.
In using the Visitor pattern, you code to interface not implementation. A visited component has no knowledge of the specific functionality hidden behind the visitor interface - it only knows that it has been passed a visitor object, and that it must invoke a particular method. This makes it easy to extend a system, dropping new visitor implementations into place without the need to update your components. In reality, of course, you may need to adjust the visibility of the occasional method, or write an accessor or two, but the principle holds good.
About the Author
Matt Zandstra is a developer, teacher and writer specializing in object oriented Internet applications. He currently works as an engineer at Yahoo!, where he helps develop a core template management system. He is the author of SAMS Teach Yourself PHP in 24 Hours, and the more recent PHP 5: Objects, Patterns and Practice. He has written articles about PHP for Linux Magazine, IBM, Zend.com and bgz-consultants.com. Matt's own Web site can be found at http://www.getinstance.com.

Comments
Gosh - what is this? An English paper for English 101 ????
Terrible explanations. Sorry - it really pissed me off.
Great work Matt, your book is officially on my amazon wish list.
Great work, Matt, keep it up!
Thanks!