Zend Weekly Summaries Issue #360

REQ: Compiled variables defined
NEW: MySQL/PDO commitment
NEW: PHP_5_3
BUG: Over-enthusiasm from __toString?
TLK: Test suite changes
TLK: Internals newbies
NEW: Release Master for the PHP 5.3 series
TLK: PHP 5.2.5 release cycle
CVS: All hell breaking loose
PAT: ODBC and OpenSSL

23rd September – 29th September 2007

REQ: Compiled variables defined

One Paul Biggar wrote to internals@ with a request. He’d come across
references to compiled variables and backpatching in the interpreter, both in
list archives and in code, but had never seen either term defined. If an
overview existed, could somebody please point him to it? If not, he would
appreciate answers to the following questions:

  • What is a compiled variable?
  • How does it differ from a non-compiled variable?
  • Why were compiled variables introduced?
  • What performance impact do compiled variables have?
  • What is backpatching?
  • Is backpatching related to compiled variables, or are they different
    concepts?
  • Why is backpatching necessary?
  • What performance impact does backpatching have?

Stas Malyshev was game. He explained that, when the Zend Engine encounters a
standard variable $a, it searches the current symbol table for a
matching entry a and uses the zval* stored there to determine
the value of the variable. Once that entry has been found, however, later
references to $a from within the same scope no longer need a symbol
table lookup to locate the entry (the value is a different matter).
Compiled variables skip the repeat lookups; and yes, the reason is performance.

Paul understood this as ‘a bit of caching’, and wondered what happens
if the symbol table is rehashed. Does the original compiled variable continue
to work, or does it need to start over with a new lookup? He also wanted to
know whether there is a single compiled variable for each occurrence of
$a within a scope, or one per bytecode operand. In other words,
in a function that uses $a repeatedly, is there a lookup for
each bytecode that uses the variable, or just one lookup for the entire
function?

Stas believed rehashing wouldn’t have any impact, given that both the hash
table and compiled variables contain pointers rather than actual
zvals. It would only move the pointers around. As to the rest,
since each variable name has its own compiled variable, every reference to
$a within the same op_array would be to the same
compiled variable. CVs are local per op_array and local per
function call; if the same function is run twice, $a would have
the same CV for both calls, but could potentially have a different value
stored in that CV on each. There is therefore one symbol table lookup per
function call. In a function that uses $a repeatedly, all CV
lookups except the first in each function call would be saved.
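
For readers who think better in code, here is a minimal sketch in C of the
caching Stas describes. It is a toy model and not Zend Engine source: every
name in it (toy_symbol_table, toy_call_frame, toy_fetch_cv and friends) is
hypothetical, a linear search stands in for the real hash lookup, and a plain
int stands in for a zval.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_VARS 8

/* One entry of a toy symbol table: a name and a pointer to the value. */
typedef struct {
    const char *name;
    int *value;               /* stands in for the zval* held by the table */
} toy_symbol;

typedef struct {
    toy_symbol entries[TOY_MAX_VARS];
    size_t count;
    unsigned lookups;         /* how many by-name searches were performed */
} toy_symbol_table;

/* Per-call state: one cache slot for each variable name the function uses. */
typedef struct {
    toy_symbol_table *symbols;
    const char *cv_names[TOY_MAX_VARS];  /* fixed for the "op_array" */
    int **cv_slots[TOY_MAX_VARS];        /* NULL until first use in this call */
} toy_call_frame;

/* Slow path: search the symbol table by name. */
int **toy_lookup(toy_symbol_table *t, const char *name)
{
    t->lookups++;
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) {
            return &t->entries[i].value;
        }
    }
    return NULL;
}

/* Fast path: reuse the cached slot; pay for one lookup on first use only. */
int *toy_fetch_cv(toy_call_frame *f, size_t cv)
{
    assert(cv < TOY_MAX_VARS);
    if (f->cv_slots[cv] == NULL) {
        f->cv_slots[cv] = toy_lookup(f->symbols, f->cv_names[cv]);
    }
    return f->cv_slots[cv] ? *f->cv_slots[cv] : NULL;
}
```

Fetching slot 0 several times through one toy_call_frame bumps the lookups
counter once, and the fetched value still tracks later changes to the
variable because only the location is cached. A fresh frame for a second call
pays for one more lookup, which is the ‘one symbol table lookup per function
call’ behaviour described above.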

Short version: Nobody knows what backpatching is, apparently.

NEW: MySQL/PDO commitment

Lukas Smith returned from the developers’ meeting in Heidelberg with some
good news. It seems that MySQL AB are committed to fixing up and maintaining
the PDO_MYSQL driver, and have come to accept that PDO is the future.
The mysqli extension will also be actively maintained, but the new
mysqlnd library will be made to ‘play nicely’ with PDO too.

The company even have a budget allocated for PDO development, and Lukas had
been told that one of their developers will be assigned to it. It was
anticipated that the entire PDO test suite would also benefit from this.
Furthermore, MySQL AB have allocated someone from their documentation team to
check the PHP manual entries for ext/mysql and ext/mysqli. Lukas
had committed himself to ‘poking the relevant people at regular intervals’
to ensure that any MySQL-specific features in PDO would also be fully
documented.

Alexey Zakhlestin thanked Lukas for his efforts and Ilia Alshanetsky welcomed
the news, saying that he looked forward to MySQL AB developer participation in
PDO development.

Short version: A major boost for PDO adoption.

NEW: PHP_5_3

Ilia, wearing his Release Master hat, announced that the PHP_5_3 branch had
been created in CVS and was now open for development. He asked that all
developers remember to merge their patches from branch or head to the new
branch when making their commits.

Etienne Kneuss was overjoyed, and re-presented his patch offering dynamic
access of static members (as in $foo::myFunc()). Johannes Schlüter committed
it so quickly that Sebastian Bergmann didn’t even have time to bump the
version numbers before it was in. Ilia thanked Johannes.

Stas, who seems to have a bee in his bonnet about maintaining three CVS
branches simultaneously, asked Ilia whether he might start the PHP 5.2.5
release cycle soon. He hoped to ‘get rid of pending fixes’ that
would need to be merged across. Ilia repeated what he’d said last week: that
he didn’t believe 5.2.5 would be the last release in the 5.2 series before
the 5.3.0 release, however much he’d like it to be that way. The need to
maintain the 5_2 branch – and to merge three ways – would continue after the
5.2.5 release. That said, Ilia didn’t disagree with the idea of starting the
release process in the next week or so. Stas agreed that a week would be fine
with him too; he just didn’t want to keep a backlog in the 5_2 branch when all
the ‘real’ work would be in 5_3.

Jani Taskinen pointed out that merges to a pure bugfix branch, such as
PHP_5_2 now is, are the responsibility of the Release Master for that branch
– in this case, Ilia. While on the subject, Jani added that one thing had
been overlooked: ‘Who is the RM for 5.3?’

Lukas proposed Johannes as ‘fresh new blood’, and Jani immediately
seconded him. Pierre-Alain Joye wrote that he’d actually like to see Lukas as
RM; ‘From a management, feedback or compromises ready point of view, you
have already proven your abilities.’ Tony Dovgal wrote bluntly that Ilia
was good at the job; he’d prefer to see him take on the 5.3 series and
someone new to take over PHP_5_2 maintenance. I backed Pierre’s compliment to
Lukas, but wanted to know what Ilia thought best. Andi Gutmans was with Tony;
he thought it would be a good idea to have Ilia continue for the 5.3.0
release at least, given that this is a major release for PHP. If this wasn’t
an option, he’d at least like to get a better understanding of the
candidates; the RM job required someone with strong technical understanding,
as well as management skills.

Ilia responded to my query. He firmly believed that two active branches
necessitated two Release Masters; there would be simply too much work for one
person. His suggestion, had Lukas not beaten him to it, would have been that
he oversaw the 5.3.0 release, and someone new took over the PHP_5_2 branch.
That person would then switch to running the PHP_5_3 branch once they had
become familiar with the process. That said, Ilia noted that
‘historically, every minor/major release had a new RM’.

Tony and Andi both agreed with Ilia’s plan. Lukas, however, didn’t see why
there should be a delayed handover; previous RMs have managed to take over
during the process, and these days there is even a check list for the purpose
– something that didn’t exist before. The main challenge was managing the
politics on the internals list, although there are of course technical issues
that need taking care of too. Lukas believed that switching RMs regularly
ensures that the experience remains shared within the group and guards
against person-specific processes. Regardless, Lukas offered his ongoing
secretarial support for any incoming RM, since he lacks the technical
expertise to do the job himself.

I backed Ilia, arguing that the PHP project has grown massively over the last
five years, and a delayed handover makes much more sense now than it would
have in, say, the PHP 4.2 series. Still, the question of ‘who?’
remained. I didn’t see anyone fighting the proposal for Johannes, but I also
felt that Pierre had made a good point when he hoped that Lukas himself might
be willing to fill the slot. Did there need to be a vote, or was there only
one volunteer?

Lukas reiterated that he didn’t have the technical skills to be a good fit
for RM; ‘I would definitely change the job scope.’ He believed that
Johannes was the best candidate for a delayed takeover, being both fully able
to fulfill the role and young enough that training him ensures some kind of
fallback expertise in future decades. (Note: Lukas had a smiley on this
part.)

Mike Wallner wrote that he liked the idea of a deferred handover, but didn’t
believe Johannes would have major problems starting with PHP 5.3.0. Whichever
way it went, he’d like to see Johannes as RM for the PHP_5_3 branch, whether
it be now or later in the series.

It was around then I started to wonder if we were posting to the right list.
Nobody had argued.

Short version: Good luck, Johannes!

BUG: Over-enthusiasm from __toString?

PHP user Martin Alterisio wanted to confirm the behaviour he was seeing in
PHP 5.2.4 before submitting a bug report. He had the following test code:


<?php

class Foo {
    static public $foo;

    function __toString() {
        self::$foo = $this;
        return 'foo';
    }
}

$foo = (string) new Foo();
var_dump(Foo::$foo);

?>

After running this script, Martin was seeing:


string(3) "foo"

– not the object, but the string returned by __toString().
Paweł Stradomski tested Martin’s script, and came back with an even
stranger result:


string(3) "foo"
ALERT - canary mismatch on efree() - heap overflow detected (attacker 'REMOTE_ADDR not set', file 'unknown')

He agreed with Martin’s diagnosis, that var_dump() should report
the object and not the string. Moritz Bechler confirmed the same behaviour
using an older CVS build, and added that on his system it was possible to
trigger a segfault by calling var_dump() twice. He offered a backtrace along with
this information.

Martin thanked everyone for confirming his own suspicions, and added that
he’d now submitted a bug report.

Short version: That’ll be bug #….?

TLK: Test suite changes

Mike Wallner had found some bewildering new failures in his extension’s
test suite, and finally tracked them down to a change that Nuno Lopes had
made to the run-tests.php script a couple of weeks ago.
He wrote to Nuno to complain, pointing out that – unlike bundled extension
tests – PECL extension tests have no way to be version agnostic when it comes
to the test suite controller.

Zoe Slattery immediately took Mike’s point. She explained that all but four
of the tests in the PHP core had actually relied on %s never
matching beyond the end of a line, so changing the regex to meet that
expectation had been a positive change there. Zoe wanted to know whether the
situation was very different in PECL.

Nuno Lopes backed her, saying that although the change would break a handful
of tests, the vast majority should continue to work correctly. Given that some
test failures had been hidden before, he felt it was an important enough issue
to justify the BC break. Besides, anyone needing the old behaviour in PECL
could use the section headed


--EXPECTREGEX--

to adjust the expectation.
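
The effect of the change can be sketched with POSIX regular expressions in C.
This is an illustration only: run-tests.php is a PHP script, and the two
regex fragments below are assumptions chosen to show the before and after
behaviour, not copies of what the script contains. toy_full_match() and the
TOY_S_* macros are hypothetical names.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Two candidate translations of %s (assumed for illustration). */
#define TOY_S_LOOSE  ".+"        /* may run across line breaks */
#define TOY_S_STRICT "[^\r\n]+"  /* stops at the end of the line */

/*
 * Returns true when `text` matches the POSIX extended regex `pattern`.
 * REG_NEWLINE is deliberately not set, so '.' is allowed to match '\n'
 * and '^' / '$' anchor to the whole string rather than to each line.
 */
bool toy_full_match(const char *pattern, const char *text)
{
    regex_t re;
    bool matched;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return false;            /* treat an invalid pattern as "no match" */
    }
    matched = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}
```

With the expectation "^Notice: " TOY_S_STRICT " on line 3$", a one-line
notice still matches, but output where an unexpected extra line sits between
the message and " on line 3" no longer does. The loose form accepts both,
which is how a failure of that kind could previously stay hidden.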

Short version: Tickle it, you PECLers!

TLK: Internals newbies

Someone named Alon came along with a prime newbie question. How could he use
zend_parse_parameters() to accept the object of a given class as
a parameter?

It transpired that Alon was actually trying to build a method that would
accept a built-in DateTime object. He understood that the
correct syntax would be something along the lines of:


zval *obj;
zend_class_entry *ce; /* "O" expects a pointer; it must point at the DateTime class entry */

if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "O", &obj, ce) == FAILURE) {
    return;
}

but he didn’t know how to specify the DateTime type.

Mike wrote that the class entry pointer, or a function returning the class
entry pointer, would need to be exported by the date extension’s API
for this to work. For now at least, this isn’t the case.

Johannes offered another way. It’s possible to get the ce (AKA
zend_class_entry) by feeding the class name to the Zend API
function zend_fetch_class(). The code to do this would look
something like:


zval *obj;
/* Resolve the class entry by name at runtime */
zend_class_entry *ce = zend_fetch_class("DateTime", sizeof("DateTime") - 1, ZEND_FETCH_CLASS_DEFAULT TSRMLS_CC);

if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "O", &obj, ce) == FAILURE) {
    return;
}

When referencing an internal class, it should be possible to call
zend_fetch_class() just once and cache the value inside your
extension.

Short version: Nice’n’easy (when you know how).

NEW: Release Master for the PHP 5.3 series

Following some behind-the-scenes discussions, Ilia had another announcement
to make:

David Coallier, Sebastian Bergmann and Andi were all swift to thank Ilia for
his RM work on the PHP 5 series to date, and also for his past and future
contributions to PHP security and PHP in general. Andi wrote that Ilia had
‘shown true leadership throughout the releases and been very good at
making the quality/features trade-offs.’

They also joined in wishing Johannes the best of luck. Again quoting Andi,
‘PHP 5.3 is going to be a huge release for PHP and I’m sure you’ll be
successful in helping us get there.’

Johannes thanked everyone for their support, particularly Ilia himself, and
formally accepted the role of RM for PHP 5.3.

Short version: Speech!

TLK: PHP 5.2.5 release cycle

One small aside that I didn’t quote from Ilia’s announcement:

Short version: Just in case anybody missed that…

CVS: All hell breaking loose

Changes in CVS that you should probably be aware of include:

Prior to the PHP_5_3 branch opening:

  • Core bug #42739 (mkdir() doesn’t like a trailing slash when safe_mode is
    enabled) was fixed in the 5_2 branch only (deliberately so) [Ilia]
  • In the CGI SAPI, bug #42699 (PHP_SELF duplicates path) was fixed [Dmitry]
  • In ext/mbstring, bug #39404 was fixed when support was added for “entity”
    as a substitute_character setting [Rui Hirokawa]
  • Zend Engine bug #42767
    (highlight_string() truncates trailing comment) was fixed
    [Ilia]

Following the PHP_5_3 branch opening – take a deep breath now:

  • Read support for dynamic access of static members was backported to the
    PHP_5_3 branch [written by Etienne Kneuss, committed by Johannes]
  • Version numbers were bumped to PHP_VERSION "5.3.0-dev" and
    ZEND_VERSION "2.3.0" in the 5_3 branch [Sebastian Bergmann]
  • Following improvements to the CGI code, FastCGI is always enabled and
    cannot be disabled in PHP_5_3 and CVS HEAD [Dmitry]
  • The openssl extension now has support for OpenSSL digest and
    cipher functions, as well as access to the internal values of DSA, RSA and DH
    keys, in 5_3 and HEAD [Dmitry]
  • In ext/reflection, Coverity issue #411 was fixed in 5_2,
    5_3 and HEAD [Tony]
  • In ext/iconv, Coverity issue #412 was fixed in the 5_2 branch only [Tony]
  • The SOAP extension now supports element names in the context of
    XMLSchemas (<any>) in PHP_5_3 branch and CVS HEAD [Dmitry]
  • Zend Engine bug #42657
    (ini_get() returns incorrect value when default is
    NULL) was fixed in 5_3 and HEAD [Jani]
  • Also in the Zend Engine, Coverity issue #470 (Uninitialized integer
    value used inside zend_ini_boolean_displayer_cb()) was fixed in
    5_2 only [Ilia]
  • In a patch that involved practically every source file in PHP_5_3 and
    HEAD, memory usage was improved by moving constants to read-only memory
    [Dmitry, Pierre]
  • In another huge commit, php.ini handling was completely updated
    in 5_3 and HEAD [Jani]
  • In CVS HEAD, support for namespaces in dynamic calls was added
    [Dmitry]
  • Following that last change, namespace support was backported to the
    PHP_5_3 branch [Dmitry]
  • Support for late static binding was added in CVS HEAD and the PHP_5_3
    branch [Dmitry, Etienne Kneuss]
  • Support for the __callStatic() magic method was backported
    to the 5_3 branch [written by Sara Golemon, backported by Dmitry]
  • A new “compact” handler for Zend MM storage was added in CVS HEAD and
    the 5_3 branch [Dmitry]
  • A little late perhaps… the ZEND_EXTENSION_API_NO and
    ZEND_MODULE_API_NO were bumped in both PHP_5_3 and CVS HEAD
    [Dmitry]

There isn’t much else to tell in the way of CVS news – just a few snippets of
exchanges and additional information.

There was a nice moment when Marcus asked Dmitry to start adding
documentation following the initial LSB commit (into CVS HEAD). Dmitry
replied coolly that the whole lot needed documenting… ‘namespaces, lsb,
dynamic calls, __call_static(), …’. Marcus, recognizing that the onus
was on him too, went very quiet.

Another nice moment came when Nuno opened the commit mail detailing the move
to read-only memory for constants. He couldn’t have been happier if he’d won
the lottery, and immediately wrote to Dmitry and Pierre: ‘Many many many
many many many thanks! When I opened this e-mail I thought I was dreaming,
but no, it finally happened :) Thank you both!’

Jani’s enormous INI handling commit had a great many parts to it. For a
start, it fixed two bugs: #27372 (parse error loading browscap.ini at Apache
startup) and #42069 (parse_ini_file() allows using some non-alpha numeric
characters). It added
.htaccess-style user-defined INI support for CGI/FastCGI, as discussed
in recent weeks; it added support for PATH sections in
php.ini that cannot be overridden in user-defined INI files or during
runtime in the specified path; and it improved php.ini handling in a
myriad of small ways. Jani noted cheerfully in his commit message that Pierre
had promised to handle the documentation.

With that out of the way, Jani had time to notice Dmitry’s ‘new “compact”
handler for Zend MM storage’ going into CVS: ‘A what? Care to explain
it a bit more than one line in NEWS? Something to use in scripts or
what?’ Dmitry had marginally less time, and it was a while before Jani
got a response. Dmitry explained that the Zend Memory Manager has a storage
layer that manages a large memory block. Until now, that storage layer had
five callbacks: init(), dtor(),
alloc(), realloc() and free(). It now
has a sixth callback, compact(), which should return all unused
memory to the operating system. This handler had been implemented as a hack
in PHP_5_2 in order to maintain binary compatibility; it was needed for the
Windows storage manager.
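
To make the shape of that interface concrete, here is a rough sketch in C of
a storage layer with the six callbacks Dmitry listed. It is not the Zend
Memory Manager’s code: the struct layout, the toy_ names, the fixed segment
size and the idea of caching freed segments are all assumptions made for the
sake of the example.

```c
#include <stdlib.h>

#define TOY_SEGMENT_SIZE 4096
#define TOY_CACHE_SLOTS  4

typedef struct toy_storage {
    void  *cache[TOY_CACHE_SLOTS];  /* unused segments kept for quick reuse */
    size_t cached;
} toy_storage;

/* Hypothetical handler table: five older callbacks plus compact(). */
typedef struct {
    toy_storage *(*init)(void);
    void  (*dtor)(toy_storage *s);
    void  (*compact)(toy_storage *s);
    void *(*seg_alloc)(toy_storage *s, size_t size);
    void *(*seg_realloc)(toy_storage *s, void *seg, size_t size);
    void  (*seg_free)(toy_storage *s, void *seg);
} toy_storage_handlers;

toy_storage *toy_init(void)
{
    return calloc(1, sizeof(toy_storage));
}

/* Hand every unused segment back to the system allocator. */
void toy_compact(toy_storage *s)
{
    while (s->cached > 0) {
        free(s->cache[--s->cached]);
    }
}

void toy_dtor(toy_storage *s)
{
    toy_compact(s);
    free(s);
}

void *toy_seg_alloc(toy_storage *s, size_t size)
{
    if (size > TOY_SEGMENT_SIZE) {
        return NULL;                  /* the toy only knows one segment size */
    }
    if (s->cached > 0) {
        return s->cache[--s->cached]; /* reuse before asking the system */
    }
    return malloc(TOY_SEGMENT_SIZE);
}

void *toy_seg_realloc(toy_storage *s, void *seg, size_t size)
{
    (void)s;
    return size <= TOY_SEGMENT_SIZE ? seg : NULL;
}

void toy_seg_free(toy_storage *s, void *seg)
{
    if (s->cached < TOY_CACHE_SLOTS) {
        s->cache[s->cached++] = seg;  /* keep it around instead of releasing */
    } else {
        free(seg);
    }
}

const toy_storage_handlers toy_handlers = {
    toy_init, toy_dtor, toy_compact,
    toy_seg_alloc, toy_seg_realloc, toy_seg_free
};
```

In this sketch a freed segment lingers in the cache until something calls
toy_handlers.compact(), which is one way a storage layer could end up holding
memory that the operating system would like back.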

Short version: This was possibly the only week in PHP history that
CVS commits outweighed internals intrigues.

PAT: ODBC and OpenSSL

Alexandra Shpindovsky chased up on the patch she’d posted last week to
fix ext/odbc bug #37527. This
time around, Dmitry looked into it. He wrote that he wasn’t very sure about
the patch; it effectively called zend_list_delete() on the
connection twice. Also, it was possible that it might cause the connection to
be deleted before the related results were deleted. Would that be allowed?

Undeterred, Alexandra sent another ext/odbc patch, this time to fix bug #40695, which involved the wrong
error being reported by PHP because the error code returned from the database
was never checked. Dmitry was able to okay this one, but asked if it would be
possible to write a .phpt test for the test suite, with a proper
--SKIPIF-- condition, to go with the fix.

Finally, Moritz Bechler responded – following a 2-month gap – to a similar
request from Pierre concerning his CRL patch for ext/openssl. He made the
tests and a small CA required for test data available online. The email that
accompanied this contained the results of Moritz’ investigations into the
state of ext/openssl, which quite possibly went out of date two days
later when Dmitry committed his changes to the extension. Hopefully someone
better placed to judge will pick up on this summary.

Short version: Please can someone check out that email?