Zend Weekly Summaries Issue #219

TLK: php.net bug
FIX: html_entity_decode
TLK: Type hinting
TLK: buildconf vs autoconf
TLK: static method invoked by …?
TLK: To BC or not to BC?
NEW: PIC/non_PIC
NEW: sqlite_key
TLK: External internals
FIX: mbstring
PAT: Commercial interests


TLK: php.net bug

Gareth Ardron opened the week again, this time with the observation that the
php.net downloads and anonymous CVS pages were redirecting to somewhere strange. Jed
Smith called him out, saying that there was already a bug report open about this and
that the internals list was not the appropriate forum to request site fixes, even of
this nature. For future reference, we have webmaster at php dot net for this kind of
thing, as well as the bug reporting mechanism.

Short version: Right thought, wrong time, wrong place.


FIX: html_entity_decode

Kamesh Jayachandran wrote in to report a segfault in the NetWare version of PHP
5.0.3 when calling:

html_entity_decode(' ', ENT_QUOTES, 'UTF-8');

He’d tracked down the problem to ext/standard/html.c, where some of the
arrays of entity values were being accessed with an index greater than their size.
He listed the problem arrays and suggested that either the entity_map
struct should have the relevant end characters reduced or that the arrays should be
extended. Kamesh added that it was possible to see the accessing indices by
putting


printf("k - entity_map[j].basechar = %d\n", k - entity_map[j].basechar);


into the for loop at line 898.

Moriyoshi Koizumi fixed the problem in CVS and thanked Kamesh for his
impressively clear bug report.

Short version: If only they were all like that…


TLK: Type hinting

Leonardo Pedretti wrote to the internals list to ask whether there were any plans
to include type hinting with basic types and for variables, saying that it would be
useful to be able to declare the type of a data member of a class when automating
data retrieval.

Derick Rethans replied that, of the basic types, type hinting was planned only
for array, as far as he knew, and that property type declaration was
definitely not on the TODO. ‘PHP is weakly typed and will remain so.’

Andi Gutmans added helpfully that phpDoc supports adding basic data types by
using comments, and this could be useful when auto-generating code or SQL
statements.

Leonardo responded with an impassioned email refuting the idea that having the
ability to strongly type certain variables in local scope would make PHP strongly
typed. He felt that having type hinting for object properties would be ‘heaven’,
saying that it was straightforward to emulate basic type hinting (with
is_int() and friends) but impossible to know the type of a data member
prior to its assignment. Having to assign a value in the constructor means creating
a specific constructor per type, when creating classes that derive from or implement
an interface that has functions using that type. Code would be so much cleaner if it
were possible to type hint an object property declaration, e.g.:

class specific_class extends abstract_base {

    private SomeClass $var1;
    private SomeOtherClass $var2;
}

rather than:

class specific_class extends abstract_base {

    function __construct() {
        // assign to the object's properties, not to local variables
        $this->var1 = new SomeClass;
        $this->var2 = new SomeOtherClass;
    }

    private $var1;
    private $var2;
}
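
For illustration, the hand-rolled emulation Leonardo refers to (checking basic types with is_int() and friends) might look something like the sketch below. The Order class, its quantity property and the setter name are hypothetical, invented for this example only. It also shows his underlying complaint: until the setter runs, nothing about the property records which type it is meant to hold.

```php
<?php
// Hypothetical class, for illustration only.
class Order
{
    private $quantity; // no declared type: holds NULL until assigned

    // Emulate a basic type hint by checking the argument by hand.
    public function setQuantity($quantity)
    {
        if (!is_int($quantity)) {
            throw new InvalidArgumentException('quantity must be an integer');
        }
        $this->quantity = $quantity;
    }

    public function getQuantity()
    {
        return $this->quantity;
    }
}

$order = new Order();
var_dump($order->getQuantity()); // NULL: the intended type is not known yet

$order->setQuantity(3);
var_dump($order->getQuantity()); // int(3)

try {
    $order->setQuantity('three'); // wrong type, rejected by the manual check
} catch (InvalidArgumentException $e) {
    echo 'rejected: ', $e->getMessage(), "\n";
}
```

The check works, but only at assignment time and only where somebody remembered to write it, which is roughly the gap a declared property type would have closed.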

That stunned everyone into silence.

Short version: PHP is not C.


TLK: buildconf vs autoconf

Sebastian Bergmann wrote to say that he hadn’t built CVS HEAD on Linux for some
time, but that when he tried to do so ‘buildconf didn’t like my autoconf’. He
attached his build output, which reported autoconf version 2.13 (ok)
before screaming:

FATAL ERROR: Autoconf version 2.50 or higher is required for this script

Sebastian actually had both autoconf 2.13 and 2.59 installed, wrapped
by autoconf-wrapper. He dug around a little, and discovered that buildconf was
finding autoconf 2.13 because buildcheck.sh explicitly checks for that
version, but was using autoconf 2.59 – the default on his system – because plain
vanilla autoconf was then called.

Derick guessed that there might be a problem with Sebastian’s wrapper script, but
Sebastian pointed out that buildcheck.sh’s explicit call to 2.13 did not
involve the wrapper script. He had to backtrack on that when he found that setting
the WANT_AUTOCONF option in his wrapper script to 2.1 also failed for
him. However, he repeated that his setup used to work fine until recently. He dug
around a little more and found that reverting build/build2.mk to revision
1.30 from CVS resolved his issue. The checks introduced in late December found both
the autoconf versions available on his system.

Magnus Määttä wondered, ‘Doesn’t libtool 1.5.10 require autoconf >= 2.50?’ and
added that downgrading libtool to 1.4.3 had fixed his build system to
work again with autoconf 2.13.

Jani Taskinen pointed out that only libtool 1.4.3 was supported in any case.

Sebastian felt that the new detection code Jani had added for multiple autoconf
versions was still wrong, despite that.

Short version: You might find this information useful if your build
broke recently.


TLK: static method invoked by …?

PHP user Torsten Roehr wrote to the internals list, explaining that he’d tried
everywhere else but there didn’t seem to be a way to solve his problem.

class Car {
    function drive() {
        // I need the name of the calling class here
        // in my case it should be 'Porsche'
    }
}

class Porsche extends Car {
}

Porsche::drive();

Torsten explained that, in PHP 4, it was possible to find the name of the class
that invoked the method in drive() by using
debug_backtrace(). The behaviour has changed in PHP 5. How could he
find that information now?

Johannes Schlüter referred to a short internals discussion on the same topic,
some time last Spring, which led to the conclusion that there was currently no way
to pick up the correct class name. Torsten had found this already, and asked if
anyone had any more ideas; ‘Something so straightforward and fundamental should
be possible!?!’

Derrell Lipman did a small sanity check and found that Torsten’s original
question had been about a static function. He pointed out that Torsten hadn’t
declared drive() as a static function, and that if it needn’t be static
he could do something like:

class Car {
    function drive() {
        echo get_class($this) . "\n";
    }
}

class Porsche extends Car {
}

$carType = new Porsche();

$carType->drive();

Torsten replied that Derrell was correct, but actually he’d simply forgotten to
declare the function static. Unfortunately he did need it as a static call, so
get_class() wasn’t an option open to him.

Daniel Convissor came up with:

class Car {
    function drive($child) {
        echo "parent driven in a $child\n";
    }
}

class Porsche extends Car {
    function drive() {
        parent::drive(get_class());
    }
}

Porsche::drive();

but, as Torsten pointed out, this solution requires the definition of
drive() in every subclass. An alternative would be passing in the class
name as a parameter, but ‘this seems kind of stupid, doesn’t it?’

Christian Schneider suggested that this was less of a fundamental issue than
Torsten believed; he’d never come across the issue himself. Then again, he uses
classes with static calls only when emulating namespaces. He felt that Torsten
should rethink the problem he was trying to solve, as there was probably a more
elegant solution.

Torsten explained that he was trying to write a load method in a superclass that
could be used by all subclasses, passing in the SQL statement as a parameter. The
load method commits the query, loops through the result set and returns a collection
of objects. The method needed to ‘know’ the name of the class in order to create
objects of that class, e.g.

$persons = Person::load($sql); // returns collection of Person objects
$cars = Car::load($sql); // returns collection of Car objects

He added that there were quite a few people on the php-general list facing the same
problem.

Christian offered some simple solutions:

1) $persons = DB::load('Person', $sql);
   foreach ($persons as $person)...
2) $person = new Person($sql);
   while ($person->iterate())...
3) Adding get_class() to all DB classes

He felt that 1) was close to Torsten’s own solution, although he
himself uses 2) because it is more efficient when there is a large number of results
returned, as only one object is needed per class. He added that he also creates
classes automatically at runtime from database tables, removing the need to write
stub classes and making 3) feasible.
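
As a rough sketch of what option 2) could look like: a single object is created once and refilled for each row, rather than one object being built per result. Only the iterate() name comes from Christian’s outline. The constructor argument here is a hypothetical in-memory array of rows standing in for the SQL result, and the name property is invented for the example.

```php
<?php
// Hypothetical stand-in for a database-backed class: the rows are passed
// in as a plain array instead of being fetched with an SQL statement.
class Person
{
    private $rows;
    private $position = -1;
    public $name;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // Load the next row into this same object; return false when exhausted.
    public function iterate()
    {
        $this->position++;
        if (!isset($this->rows[$this->position])) {
            return false;
        }
        $this->name = $this->rows[$this->position]['name'];
        return true;
    }
}

$person = new Person(array(
    array('name' => 'Ada'),
    array('name' => 'Linus'),
));

while ($person->iterate()) {
    echo $person->name, "\n";
}
```

Because the same object is reused for every row, memory use stays flat however many results come back, which matches the efficiency argument Christian gives for preferring this approach.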

Andrey Hristov wrote saying that he had faced this same limitation, and that his
solution had been to use __CLASS__ in the extending classes:

class Foo {
    function getInstanceInt($params, $class_name = __CLASS__) {
        var_dump($params, $class_name);
        /*...*/
    }
}

class Bar extends Foo {
    function getInstance() {
        Foo::getInstanceInt(func_get_args(), __CLASS__);
    }
}

$a = Bar::getInstance(1, 2, 3);

Torsten thanked Andrey and summarized the thread: the only viable solution was to
pass the class name as a parameter.
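
A minimal sketch of that conclusion, applied to Torsten’s load() scenario, follows. The Model base class is hypothetical, and a plain array of rows stands in for the SQL statement and result set so that the example runs on its own. It is one way the idea could be written, not the code from the thread.

```php
<?php
// Hypothetical base class; $rows replaces the real query result.
class Model
{
    public $data;

    public function __construct(array $row)
    {
        $this->data = $row;
    }

    // The caller names the class to instantiate, because the static
    // method cannot discover which subclass it was invoked through.
    public static function load($className, array $rows)
    {
        $objects = array();
        foreach ($rows as $row) {
            $objects[] = new $className($row);
        }
        return $objects;
    }
}

class Person extends Model {}
class Car extends Model {}

$persons = Person::load('Person', array(array('name' => 'Ada')));
$cars = Car::load('Car', array(array('make' => 'Porsche')));

echo get_class($persons[0]), "\n"; // Person
echo get_class($cars[0]), "\n";    // Car
```

The redundancy Torsten objected to is visible in the calls: the class name appears twice, once before the :: and once as an argument.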

Short version: Don’t look back.


TLK: To BC or not to BC?

Dmitry Stogov of Zend mailed Moriyoshi off-list to say that a patch Moriyoshi had
applied in the Virtual Machine code was incorrect. He explained that
zend_vm_execute.h should not be edited directly; it’s generated by
zend_vm_gen.php from zend_vm_def.h, where any changes should be made.
He also felt that isset($str['str']); should not return
TRUE, and had modified the patch to return FALSE and throw
a warning about incorrect usage of dimension operators.

Moriyoshi thanked him publicly for the fixed patch, but argued that although the
behaviour was semantically wrong, there was backward compatibility to consider, and
so the behaviour ought not to change from Zend Engine 1. He hadn’t added the warning
for the same reason.

Dmitry replied, quite reasonably, that BC didn’t need to be adhered to for bugs;
but Moriyoshi pointed out that certain long-standing behaviours might be considered
as idiosyncrasies rather than as bugs by the user base, so there was a need for
caution in that area.

Andi intervened at this point, saying that he’d looked into it and agreed with
Moriyoshi that BC should not be broken in this case. Dmitry promptly restored the
original behaviour.
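
For readers unfamiliar with the construct under discussion, the snippet below applies isset() to string offsets. The variable name and values are made up for the example. The first two calls use numeric offsets and are uncontroversial. The third is the disputed case, and its result is deliberately left unstated because it depends on the PHP version in use: the thread above ends with the Zend Engine 1 behaviour (TRUE) being kept, presumably because the non-numeric key is treated as offset 0.

```php
<?php
$str = 'abc';

var_dump(isset($str[0]));     // bool(true): offset 0 exists
var_dump(isset($str[10]));    // bool(false): offset 10 is past the end

// The disputed case: a non-numeric key used as a string offset.
// The result varies between PHP versions, so run it to see what yours does.
var_dump(isset($str['str']));
```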

Short version: When is a bug not a bug?


NEW: PIC/non_PIC

Following Jani’s $host_alias-forcing fix last week, Joe Orton came up with an
elegant and simple patch which nobody argued with and which Jani subsequently
applied. On Linux and FreeBSD systems where non-PIC is known to be safe, the default
build is now non-PIC. Should you for any reason need to force a build in PIC mode on
those platforms (unlikely), you will need to run ./configure --with-pic. The new
default setting should speed up all versions of PHP
noticeably on the affected platforms; other systems will build PHP in PIC mode as of
yore.

Short version: Finally, it’s in!


NEW: sqlite_key

Marcus Börger implemented an iterator interface in SQLite mid-week, meaning that
you should now be able to use $current_index = sqlite_key($result); on
a buffered query result.

Short version: Proto, Marcus?


TLK: External internals

Joe applied a compiler warning fix to a file in libmbfl (the underlying library
behind the mbstring extension), with the commit message ‘Don’t scribble all over
the stack’. Derick promptly wrote to Joe, explaining that the libmbfl code is
LGPL’d and that the patch should be applied to the original library too in order to
avoid license problems. As Joe was swift to point out, however, in the case of
libmbfl the original library’s authors include three members of the PHP development
team. He had already mailed Moriyoshi (one of those authors) the patch. Should he
back it out? Were there restrictions on making changes to the library from within
the PHP source tree? (Mm, generally, yes.)

Moriyoshi responded, saying that he had been reluctant to apply the patch because
there had been an ongoing effort to clear the ‘stack-scribbling’ issue. However, as
Joe had already applied it in the PHP source tree, he’d commit it to the original
source tree too.

In another area, Joe fixed bug #30446 by
working around an Apache 2 issue with subrequests and
internal redirects. Again, it was Derick who pounced:

Why not fix the Apache bug instead? It feels a bit wrong adding
workarounds to PHP for bugs in other software.

Joe explained that the Apache fix would take longer, and that the workaround had
minimal risk and impact.

Short version: OK, but don’t do it again!


FIX: mbstring

Moriyoshi himself was busy trying to resolve the issue of request-local settings
leaking into other requests in the mbstring extension. He applied a patch that
should fix this critical issue at the end of this week.

Short version: Calling all Japanese PHP users: please test.


PAT: Commercial interests

Joe was busy in other areas too this week. He provided a Zend engine patch to
support glibc’s new RTLD_DEEPBIND flag for DL_LOAD; this
flag will effectively cause the loading process to search for a dl()’d
shared object’s undefined symbols in that shared object and its dependencies before
looking in the global symbol table, thereby avoiding symbol namespace collisions.
There had been no response at the time of writing, so Joe’s patch is sitting in
the PAT directory awaiting review.

Andi meanwhile responded to August Software’s patch to provide support for their
ODBC Router, asking for a new patch against PHP 5 and also wanting to know how the
product differed from existing ODBC bridge solutions. The company agreed to this,
and offered an explanation of their product that didn’t actually answer the base
question, i.e. ‘what is it?’, but which focused on criticism of competing products.
As they later explained, there was then a conference in their field – the result
being that most of the subsequent exchange of mail on the PHP internals list went
unremarked by them.

Michael Sims kicked off by bringing the company’s site to the list’s attention,
saying that there was ‘possibly libelous FUD’ about FreeTDS there. The
company responded to his mail, accusing him of quoting out of context and saying
that there was case evidence to support their assertions there. This was enough to
make Dan Kalowsky, the maintainer of the affected extension, extremely wary. He
wanted to know how the product differed from an ODBC Driver or an ODBC Driver
Manager. Was this just another driver manager?

Andi agreed with the company that unixODBC and iODBC weren’t the best solutions
in the field, and put the view that if this was an equivalent product, there was no
good reason PHP shouldn’t support it.

Wez wondered whether ODBC Router might turn out to be ‘a compatible ODBC
driver that can be loaded via unixODBC or iODBC, requiring no patch’, but
conceded that if running via a manager introduced a performance hit, running a
driver directly would be better. He then read up on the software and found that it
didn’t implement ODBC 3.0 APIs. He concluded that ‘it can’t be used directly for
the PDO ODBC extension, but it should be loadable via unixODBC or iODBC’ and
added that the patch itself was trivial and looked ‘safe’ to commit.

Dan pointed out to Andi that the reason iODBC and unixODBC run slowly under PHP
is that PHP uses a dynamic cursor. He explained that this is mostly due to BC and
his own ‘unwillingness to deal with the fallout of changing it (again)’. He
saw no problem with including ODBC Router if it happened to be another driver
manager, but didn’t want to include a proprietary driver, and couldn’t tell from the
literature available online which category the product fell into.

Mike Robinson backed up Dan, saying that he’d found the information on the
company website to be both misleading and self-aggrandizing, and he didn’t see any
good reason to include support for their product in PHP’s core.

Wez felt that the way the company came across was less important than the fact
that there was no technical reason the patch shouldn’t be in PHP, given that other
ODBC drivers are already supported. He then committed the patch to CVS HEAD, leaving
it to the Release Masters to decide whether they wanted it in their upcoming
releases; Andi responded, saying that it should stay in HEAD for PHP 5 (i.e. it will
be in PHP 5.1). He added, to Dan, that he didn’t see how such a trivial patch could
be so terrible for the odbc extension.

Dan stated that if the product was not a driver manager, adding support for it
would move the extension closer to being a driver manager itself – something he’d
been trying to avoid. He was against continuing the process of directly linking into
DB-specific drivers, mostly to cut down on support issues, and wanted to move the
extension forward in time. He felt that Wez had taken his own route with PDO and
that he should be allowed to take his own with uODBC.

Wez retorted that ext/odbc has been idle for over a year in development terms,
that this was a trivial patch, and that vendor-supplied patches do not imply support
issues. He concluded, ‘The fact of the matter is that the patch has no negative
impact while providing more functionality to PHP.’

At this point the people behind August Software returned from their conference,
and mailed in to thank Dan, Wez, Mike, Andi and the list for taking part in the
debate. They claimed that the effect of the patch would be to connect PHP into the
official Microsoft ODBC Driver Manager hosted on a centrally administered Windows
box, allowing all data sources to be simply referenced by name from any PHP client.
They considered that this would be of substantial value to a significant number of
PHP users. They also offered support for any technical issues arising.

Back to normality: John Carter re-submitted his patch for bug #29334, correcting DST in win32’s
mail function – a one-liner, but one that Derick couldn’t test due to its being
OS-specific. John’s patch is actually attached to the bug report; he just wanted to
draw attention to its existence.

Finally, the wonderfully-named Quanah Gibson-Mount offered a patch that gives
working SASL support for LDAP, against PHP 5.0.3. He added that there is also a bug
report, #30819, open against
the lack of SASL support in PHP 5 currently.

Short version: RTLD_DEEPBIND (Zend), win32/sendmail.c and ext/ldap fixes
are sitting in the PAT directory.

Published: January 17th, 2005 at 12:00