Zend Weekly Summaries Issue #236

TLK: PHP-GTK corner
TLK: Overloading __new
FIX: ext/soap parse error
TLK: Margaritas and DOM
CVS: pecl/stats, array_product, CLI completion
PAT: Quiet week

TLK: PHP-GTK corner

Markus Fischer had quite a lot to do with the shaping of PHP-GTK 2 this week. It
began when he wondered briefly why exceptions in PHP-GTK had the prefix
PhpGtk, whereas everything else has the prefix Gtk. Andrei
Zmievski explained that exceptions don’t exist in GTK+; they’re purely a PHP
feature. Markus then asked whether there shouldn’t be a base exception class named
something like PhpGtkException. Andrei agreed, and made
PhpGtkException the parent class for all exceptions in PHP-GTK 2.

Markus then went on to post a patch implementing some GtkTreeStore
methods, which Andrei later committed following a few adjustments. Andrei felt that
GtkTreePath should be the next item on his list for implementation, as
it crops up a lot in GTK+ code – in fact Markus’ patch had been to some extent a
workaround for the lack of it.

Christian Weiske was also busy this week. He published his
‘simple reflection browser’,
and went on to ask Andrei a series of complex questions about the
Reflection API. He wanted to know whether it would be possible to extend the PHP-GTK
source generator to supply reflection information, assuming that classes needed to
have special comments or some equivalent that the Reflection API could parse.

Andrei confirmed that PHP extensions need to implement special structures
describing the types and names of parameters, and explained that he hadn’t
implemented this in the PHP-GTK 2 source generator because it would increase the
size of the generated files. ‘Is this really necessary?’ he asked.

Christian argued that it was necessary: if PHP extensions don’t support
reflection, the Reflection API loses its usefulness. He referred to his experience
with the Reflection API in Java, which works on all Java beans. Andrei replied that it
might be very cool to be able to browse methods and see their parameters, but he’d
guarantee that implementing support for reflection would adversely affect the build
time of PHP-GTK. Christian felt this was unimportant; most users wouldn’t need to
compile PHP-GTK often, and a small configuration switch would mean it could be
disabled at will. Andrei argued that ‘in the grand scheme of things’ the
ability to query information about method parameters at runtime was not crucial to
the widespread use of PHP-GTK 2. The difference between using reflection for Java
beans and using it for PHP extensions was that Java beans were concerned primarily
with userland rather than C methods.

Christian hadn’t fully appreciated this last point, and double-checked: ‘So
Reflection is for userland classes/functions/properties, not for standard PHP
ones?’ Andrei again confirmed that, in PHP, this was where the Reflection API’s
usefulness lay.

For reasons best known to himself, Christian then went ahead and added reflection
information for GObject, the GTK base class. He also altered the code
generator to write reflection information for normal functions if
#define ENABLE_REFLECTION 1 happened to be in config.h, and left a note asking
Andrei to add a configure switch.

Short version: PHP development in microcosm.

TLK: Overloading __new

PHP user Leonardo Pedretti sparked not one, but two concurrent flames, when he
asked if it were possible – for code cleanliness purposes – to make new
return a reference to a pre-existing object under certain circumstances without
using a factory.

Jason Barnett wondered why, since it was only ‘under certain circumstances’ that
Leonardo wanted this, he couldn’t just use factory methods? Or perhaps something
like a Singleton pattern?

Jochem Maas felt that this wasn’t a PHP internals matter at all, and thoughtfully
fanned the flame across to PHP general. He was prepared to bet that none of the PHP
or Zend developers would be keen to introduce this kind of ‘magic’ into the Engine
(right). He also made the (good and often missed) point that patching the Engine to
get the functionality into your own PHP build meant that your code would only work
in that build, i.e. (usually) locally. A factory provided a clean method of handling
the behaviour, and also had the benefit that nobody looking through the code would
misinterpret occurrences of the new keyword. Factory example:


class Test
{
    function __construct() {}

    // Declared static so that it can be called without an instance.
    static function get($construct_args) {}
}

$var = Test::get($construct_args);

From the safety of the PHP general list, Rory Browne felt that built-in support
for Singletons without the use of static functions would be nice. He wouldn’t use
new though; something like existing perhaps, that would
return an existing instance if one existed and a new one if it didn’t. Still, he
concluded, ‘When all’s said and done though, it’s still just syntactic sugar. It
all depends on how sweet the devs consider it’.

Leonardo still saw this as an internals issue, and cc’d the thread back to that
list along with his response. He wanted to be able to do the same task through a
static or global function, but he’d found that if he built a cache in an array and
then checked whether the requested object was already loaded via
__new(), objects that entered the cache never left it. They were still
referenced in the array, and so would never be destroyed until the array itself was
destroyed at the end of script execution. His cache would grow to include all
possible objects. It was possible, if ugly, to implement methods to emulate
reference counting and remove the object from the cache at the appropriate time, but
the main disadvantage of this approach was that any request for an object needed to
be matched with a __release() function call. He was looking for a way
of checking for and returning existing instances of a class without implementing the
cache system.
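
For the curious, the cache-plus-release approach Leonardo found ugly might look
something like the sketch below. It is not his code: the ObjectCache class and its
acquire(), release() and size() methods are names invented here for illustration,
and the syntax assumes a current PHP version. It shows why every request for an
object has to be paired with a release call before the cache will let go of it.

```php
<?php
// Hypothetical registry; none of these names come from the original thread.
final class ObjectCache
{
    // Cached instances, keyed by an application-defined id.
    private static $instances = array();
    // Number of outstanding acquire() calls per id (emulated reference count).
    private static $counts = array();

    // Return the cached instance for $id, creating it on first use.
    public static function acquire(string $id, callable $create)
    {
        if (!isset(self::$instances[$id])) {
            self::$instances[$id] = $create();
            self::$counts[$id] = 0;
        }
        self::$counts[$id]++;
        return self::$instances[$id];
    }

    // Must be called once per acquire(); the cache entry is dropped when
    // the last user has released it.
    public static function release(string $id): void
    {
        if (!isset(self::$counts[$id])) {
            return;
        }
        if (--self::$counts[$id] === 0) {
            unset(self::$instances[$id], self::$counts[$id]);
        }
    }

    // Number of objects currently held by the cache.
    public static function size(): int
    {
        return count(self::$instances);
    }
}

$make = function () {
    return new stdClass();
};

$a = ObjectCache::acquire('user:1', $make);
$b = ObjectCache::acquire('user:1', $make);
var_dump($a === $b);            // bool(true): same instance both times

ObjectCache::release('user:1');
ObjectCache::release('user:1');
var_dump(ObjectCache::size());  // int(0): the cache has let go of it
```

Forget one release() call and the entry stays in the cache until the script ends,
which is exactly the discipline problem Leonardo was hoping to avoid.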

Wez Furlong entered the fray briefly to point out the chicken and egg aspect of
the situation: ‘You can’t keep the object alive without a reference. There is no
way to achieve what you want, and we’re not going to change the behaviour of the
new operator, nor are we going to build some kind of singleton factory
thingy into the core language. You have the tools to do it “the right way”, please
use them and kill these two threads.’

Leonardo argued: ‘Not so chicken and egg if, say, a
ReflectionClass object could inform all the instances of that class in
a simple array or with an iterator’.

Enter Duncan McIntyre, who had found similar problems. His internals
solution would be to define a magic static method __free(), which would
be given an instance of the class every time a reference to that instance was
deleted. Given that there is no such method, users need to code in a disciplined
way; he provided an example, and won a new fan in Leonardo. Thus endeth Flame A.

Flame B followed Derrell Lipman (Derrell please don’t mangle threads in
future!), whose response to Leonardo’s initial mail was to refer him to
his own proposal at the end of last year for an overloaded __new() function.
He added, ‘I’d love for this to be revisited now that there’s someone else with a
similar desire’.

Justin Hannus revisited it and found it interesting, despite feeling that
‘adding too many __methods is ugly’. He had also seen the need for more
advanced overloading features in his PHP 5 code…

Johannes Schlüter hurriedly referred Justin to the summary archives on the subject.

George Schlossnagle, avowedly not an OO expert, wondered: ‘Doesn’t this break
a fundamental semantic of OOP – namely that new() returns a new object
of the specified class?’ Wasn’t this the whole reason the Singleton pattern
existed? Lukas Smith agreed, but said that there was an unresolved bug that made it
hard to write a working Singleton, even if you inherited from the base class that
implements the pattern. Cue a short intermission while everyone went to read the
bug report.

Sebastian Bergmann backed Marcus Börger’s case in that report:
‘self is bound at runtime, so the behaviour is correct. If you do
not like this late binding just do not use self but the name of the
class’. Lukas retorted that this wasn’t useful when writing a Singleton static
method for a base class from which you could inherit. He understood that the ‘bug’
was a nice performance tweak, but he wanted an easy way to call a static method in
the current context, without the overhead of clumsy workarounds.

Ondrej Ivanic introduced him to the Singleton Factory,
but Lukas was quick to point out that it wouldn’t work if you were already using a
factory – you couldn’t have $name::factory(). Thomas Richter suggested
call_user_func(array($name, 'Factory'));, which was one of the clumsy
workarounds Lukas had previously alluded to.
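
A rough sketch of what Lukas was running into, under the assumption that the
base class builds its instance with new self(). The class names Base, Child and
Other are invented for illustration and do not come from the bug report. An
inherited static factory hands back the base class, so a subclass has to repeat
the method and name itself, and a class name held in a string has to go through
call_user_func() as Thomas suggested.

```php
<?php
// Illustrative classes only; not taken from the thread or the bug report.
class Base
{
    // `self` here refers to Base, even when the call comes via a subclass.
    public static function factory()
    {
        return new self();
    }
}

// Inherits factory() unchanged, so it still produces a Base.
class Child extends Base
{
}

// The code-duplication workaround: repeat the method, naming the class.
class Other extends Base
{
    public static function factory()
    {
        return new Other();
    }
}

$viaChild = Child::factory();
echo get_class($viaChild), "\n";   // prints "Base", not "Child"

// Calling a static method on a class whose name is held in a string.
$name = 'Other';
$viaName = call_user_func(array($name, 'factory'));
echo get_class($viaName), "\n";    // prints "Other"
```

The duplication in Other is the price of getting the right class back, which is
the ‘code duplication and/or crappy code’ trade-off the rest of the thread argues
about.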

Alex Knaub backed Lukas, saying that every genuine [sic] OOP and dynamic
language binds self at runtime. Marcus took some time out of his busy life to
explain to Alex the difference between a class based language, where all objects
refer to a given class whose members cannot be altered dynamically, and a prototype
based language. He’d already said that what Alex wanted to see was possible,
it just wasn’t desirable; it would slow down script execution for one, and it was
also not the correct approach in a class based language. Alex argued that it was the
only correct way to use factories in a dynamic language; ‘everything else
leads to code duplication and/or crappy code’. Marcus reiterated that adding
support for this into PHP would mean making PHP slower. Besides, where was the
problem in a little code duplication? Perhaps Alex needed templates? Or perhaps he
was using the wrong language…

Short version: Still too much magic.

FIX: ext/soap parse error

Adam Maccabee Trachtenberg hit a parse error when compiling ext/soap from
CVS HEAD, and sent in a patch that seemed to fix his problem. Could someone please
review and apply?

Now normally this kind of thing wouldn’t be covered in full here, but

  1. it was a very quiet week because half the devs were in Mexico for php|tropics,
  2. it was after midnight over in St. Petersburg so SOAP maintainer Dmitry Stogov
    wasn’t around to review Adam’s fix, and
  3. largely as a result of 2, figuring out the proper fix disclosed potentially
    useful information for extension hackers.

So bear with me.

Antony Dovgal, who has CVS karma for the SOAP extension, asked if simply moving
TSRMLS_FETCH() to the end of the declarations wouldn’t fix the problem.
He added that it compiled fine as it stood under gcc-3.3.4, and asked which compiler
Adam was using.

Adam replied that he hadn’t been sure how TSRMLS_FETCH() affected
the executor (EG()) and compiler (CG()) global macros. And
it didn’t compile fine as it stood under gcc-2.95.4.

Rasmus Lerdorf, seeing the same issue as Adam, reported that Antony’s solution
worked for him on his old FreeBSD box.

The strangely named 10t3k posted a red herring, saying he’d had similar issues
with his own extensions with ‘undeclared identifier’ errors appearing under VS.NET
unless TSRMLS_FETCH() appeared after all other declarations. Antony was
convinced, and committed the patch. Some minutes later (list mail was lagging), Rob
Richards reported that he couldn’t build CVS HEAD under Windows after applying
Antony’s proposed solution locally.

Andi Gutmans’ mail finally came through, confirming that Adam had been correct
regarding the macros; CG() and EG() can’t be accessed before
TSRMLS_FETCH() is called. (He didn’t say it, but this is strictly true
only for ZTS builds.)

Adam double-checked: ‘So we need to declare the vars, call
TSRMLS_FETCH(), and then assign values from
CG()/EG()?’ as he submitted a patch that did just
that. Andi verified this. Antony committed the fix in the light of this new
information, and Dmitry thanked both him and Adam for rectifying the problem – as
soon as his time zone and the mail lag allowed.

Short version: Global village, global solution.

TLK: Margaritas and DOM

George Schlossnagle wrote from the hackers’ table at php|tropics to complain that
he’d run into an annoying limitation in ext/dom while working on a PHPDoc -> WSDL
generator. He’d found he couldn’t add a namespace to a
domDocument unless there was an element in that namespace, and needed
to be able to specify an xml-schema type as an attribute type. Adding the namespace
‘manually’ as an attribute on an element didn’t work, so he’d done the gentlemanly
thing and written a new method for the extension,
DomElement::addNS($uri, $alias). Admittedly it wasn’t part of the DOM spec, but it
was useful; did any of the XML people mind if he committed it to CVS HEAD?

Thanks to the mailing list lag, XML person Rob Richards saw George’s blog item on
the subject before George’s mail arrived in his inbox. He replied to both,
explaining that


$root->setAttributeNS('http://www.w3.org/2000/xmlns/', 'xmlns:xsd',
'http://www.w3.org/2001/XMLSchema');

should do exactly what George wanted.

PHP user Jared Williams wasn’t far behind Rob, with a fuller version of the same
code for SVG:


define('NS_NS', 'http://www.w3.org/2000/xmlns/');
define('NS_XLINK', 'http://www.w3.org/1999/xlink');

$document = new DOMDocument();
$root = $document->createElement('root');
$document->appendChild($root);
$root->setAttributeNS(NS_NS, 'xmlns:xlink', NS_XLINK);
echo $document->saveXML();

George acknowledged that their methods worked, adding apologetically,
‘Apparently I had one too many margaritas yesterday while I was writing this…’

Short version: There’s such a thing as ‘too many margaritas’?

CVS: pecl/stats, array_product, CLI completion

Andi kept his head above the parapet for long enough to veto Andrey Hristov’s
naming of math_std_dev(), as Jani Taskinen had foretold, with an email
that simply read ‘Damn right!’. He followed this by explaining that naming in
PHP usually follows the PHP coding standards rather than a rule of commonly used
names, and also std_dev probably wasn’t common enough even if it had been the other
way about. math_standard_deviation() was clearer and therefore better,
and it wasn’t likely to be so frequently used that the length of the name became a
big issue.

Andrey complied with the request and renamed the function. For some off-list
reason, he subsequently moved both math_standard_deviation() and
math_variance() out of the PHP core and into pecl/stats, where
they are known as stats_standard_deviation() and
stats_variance(), respectively.

Another new function, array_product(), also came from Andrey this
week. This one stayed in the core; it returns the product of an array’s entries.
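
As a quick illustration of what array_product() does, here is a minimal sketch
comparing it with a hand-rolled loop. The variable names and sample values are
made up for the example.

```php
<?php
// Sample values chosen for illustration.
$factors = array(2, 3, 4);

// Built-in: multiplies all the array's entries together.
$product = array_product($factors);
echo $product, "\n";   // prints 24

// Userland equivalent, for comparison.
$manual = 1;
foreach ($factors as $factor) {
    $manual *= $factor;
}
echo $manual, "\n";    // prints 24
```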

Georg Richter succumbed to Jani’s nagging and renamed
mysqli_set_character_set() to mysqli_set_charset(). Jani’s
still nagging him over mysqli_character_get_name().

Antony went to fix ext/pgsql bug #32904
by making pg_get_notify() take notice of the
result_type parameter. He was amused to find, on porting his fix to the
4_3 branch, that result_type had never been passed into the internal
function there at all. The report had been open against PHP 5 only.

On Friday 13th, Andi gave Antony (who said he prefers being called Tony, OK
noted) CVS karma for the Zend Engine. Tony promptly went in and fixed a memory leak
in set_error_handler() (bug #29975).

Marcus ended his week in the sun by adding command completion to PHP CLI’s
interactive mode (php -a). That should make a few people happy.

Short version: blah

PAT: Quiet week

Jani applied a patch from ric at arizona dot edu (with Jani’s own help) to fix
an ext/snmp bug.

Tony applied an ext/sockets bugfix
across all three CVS branches, thanks to jwozniak23 at poczta dot onet dot pl.

And Wez committed Christopher Kings-Lynne’s changes to the PDO_PGSQL driver,
noting only that the large-ish patch had been ‘slightly modified’.

Short version: Why don’t people leave their names on bug reports?