Zend Weekly Summaries Issue #353

TLK: getopt revisited
TLK: PDO/Firebird [again]
TLK: TSRM in CLI
TLK: Taint update
TLK: Namespaces or packages?
REQ: Pre-destructor magic function
CVS: Nothing major going down
PAT: That win32 filepath issue [again]

6th August – 11th August 2007

TLK: getopt revisited

David Soria Parra kicked off the week with his second attempt to bring
longopt and Windows support to
getopt(), following a long discussion on IRC. The idea now was
to centralize PHP’s own getopt() implementation in the core
rather than duplicate it in the CGI and CLI SAPIs, which would also allow a
clean-up of the currently necessary checks in the php_getopt()
function in ext/standard/basic_functions.c. The minor changes to the
implementation that already exists in CLI were to enable long options on the
command line under both Linux and Windows. The function syntax would look
like:


$opt = getopt("a", array("param:", "param2"));


where : means that the marked parameter takes an argument.
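
As a rough sketch of how a script might consume such options: this assumes the proposed syntax behaves like the existing getopt(), where an option given without a value is stored as false. The script name and option names are purely illustrative.

```php
<?php
// "a" is a short flag, "param:" is a long option that requires a value,
// and "param2" is a long flag that takes no value.
$opt = getopt("a", array("param:", "param2"));

// Example invocation (hypothetical script name):
//   php script.php -a --param=value --param2
// Flags without a value are stored as false, so presence is tested with
// isset() rather than by checking truthiness.
$hasA      = is_array($opt) && isset($opt["a"]);
$hasParam2 = is_array($opt) && isset($opt["param2"]);
$param     = (is_array($opt) && isset($opt["param"])) ? $opt["param"] : null;

echo "a given:      ", ($hasA ? "yes" : "no"), "\n";
echo "param2 given: ", ($hasParam2 ? "yes" : "no"), "\n";
echo "param value:  ", var_export($param, true), "\n";
```

Testing with isset() matters here because a bare flag such as -a would otherwise look "empty" to a simple truthiness check.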

Marcus Börger agreed that it might be a smart move to consolidate the
implementation in /main, and asked Jani Taskinen to get that part
under way. Jani agreed too, but thought it best to wait until the 5_3 branch
opens before commencing open-heart surgery on the PHP core. Johannes
Schlüter more cautiously noted that there is currently just one test
case for getopt(); he didn’t like the idea of rewriting the
implementation without regression testing. Johannes added that the offered
patch isn’t threadsafe either, and really should be. Finally,
php_getopt() is not marked PHPAPI in either the
current implementation or the patch, meaning that it can only be used
locally.

Jani wondered quietly why anyone would need thread safety in a CGI/CLI
environment, but Johannes felt there would be no way to guarantee those
environments once the function becomes part of the PHP API. Stas Malyshev
wanted to know where the options would come from in other environments, given
that there are no other command line versions of PHP. Somehow his query
backfired; David immediately rewrote the history of his patch, and Johannes
reiterated the arguments for moving it into /main.

Short version: David’s patch is available here.

TLK: PDO/Firebird [again]

Diving neatly sideways from the discussion following the PHP 5.2.4 RC1
release last week, Lester Caine came up with a list of arguments against the
emphasis on PDO development. PDO’s lack of support for PHP 4 was the first on
that list, despite the PHP 4 end of life announcement. ADOdb (Lester’s
database abstraction layer of choice)
has support for PDO drivers, but at the cost of performance. PDO’s internal
restrictions mean that PDO drivers can only access the database via SQL; this
doesn’t take Firebird’s services interface, used for backup, user management
and the event handler, into consideration. PDO’s transaction control hides
the transaction mode and cannot retain the context of a transaction following
a commit or rollback; there’s no way to select a more appropriate transaction
mode. Finally, as Lester understood it, PDO returns an entire
BLOB object rather than a simple handle that would allow access
to a subset of its data. All these were restrictions that made it unlikely,
if not impossible, that any Firebird user would step up to develop
PDO_FIREBIRD… unless he’d misunderstood something.

Larry Garfield posted a link to gophp5.org,
a laudable effort from the developers of symfony, Typo3, phpMyAdmin, Drupal,
Propel and Doctrine to get other mainstream PHP applications and ISPs to
commit to a minimal PHP 5.2 requirement in the first quarter of 2008. Lack of
support for PHP 4 is no longer an argument against anything. Larry went on to
demolish the performance myth, pointing out that PDO does rather more than
simply pass along an SQL query string. Where native drivers are used,
escaping mechanisms need to be implemented separately – and usually in
userland PHP code. That said, his own experience
of moving Drupal to PDO_MYSQL had resulted in only marginally
better performance than that achieved with the native MySQL driver. The
benefits of PDO were more in the way of cleaner code and type-safe prepared
statements than speed. Larry believed that PDO does in fact allow non-SQL
strings as queries, but referred Lester to Wez Furlong to be certain. How
this might affect Firebird particularly, he didn’t know – but it seemed
pointless to Larry to write cross-database compatible code when you’re
certain you’re only going to be using one RDBMS.

Lukas Smith disputed this, writing that the whole point of PDO was to have a
common set of methods and a common infrastructure to support them. Any PDO
driver could, however, implement driver-specific methods, prefixed with the
driver name. He therefore saw no reason why ‘RDBMS-specific goodies’
in the native drivers shouldn’t be ported to PDO. Dan Scott pointed out that
we’ve been here before and are still awaiting Lester’s (or anyone else’s)
patch for PDO_FIREBIRD
after two and a half years. Lester retorted that he was simply trying to
re-open the discussion on a generic issue, i.e. transaction handling. Lukas
agreed that there is actually an issue here: PDO doesn’t support nested
transactions. Then again, as far as he knew none of the other RDBMS
extensions supports it, including the Firebird native driver. Still, this
didn’t prevent PDO_FIREBIRD having specific methods to handle nested
transactions…

Lester wrote that he just wanted to be able to turn auto-commit
on or off and switch transaction modes, something that many RDBMS have
support for. He’d noticed, however, that other generic drivers only make this
available via the connect string, which isn’t the way it works in
Firebird. There, he can wrap an update in its own transaction, which can then
be rolled back and reworked if necessary. Until now, Lester had been unaware
that most of the native drivers in PHP don’t provide access to those
facilities, despite the fact that the underlying RDBMS has support for them
in at least three cases (Oracle, DB2 and MSSQL). Further, both Firebird and
PostgreSQL can use parameters to handle the transfer of values to a query,
whereas – as far as Lester could see – MySQL cannot. PDO wraps this ‘quite
nicely’, except in the case of BLOBs. In his view, ‘most
database engines would benefit from being able to handle BLOB elements
independent of the general fetch.’
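
For reference, the level of transaction control that PDO itself exposes can be sketched as below. This is only an illustration: it assumes the pdo_sqlite driver and an in-memory database so that it runs standalone, and the table and column names are invented. It does not show any way of selecting a transaction mode, which is the gap Lester was describing.

```php
<?php
// Assumption: pdo_sqlite is available; an in-memory database keeps the
// example self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)');
$pdo->exec('INSERT INTO accounts (balance) VALUES (100)');

// beginTransaction() suspends auto-commit until commit() or rollBack().
$pdo->beginTransaction();
$pdo->exec('UPDATE accounts SET balance = balance - 500 WHERE id = 1');

// Inspect the result of the update before deciding what to do with it.
$balance = (int) $pdo->query('SELECT balance FROM accounts WHERE id = 1')
                     ->fetchColumn();

if ($balance < 0) {
    // The update went too far: undo it so it can be reworked.
    $pdo->rollBack();
} else {
    $pdo->commit();
}

$final = (int) $pdo->query('SELECT balance FROM accounts WHERE id = 1')
                   ->fetchColumn();
echo "final balance: $final\n";
```

Once the rollback has run, the connection is back in auto-commit mode and the original balance is intact.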

Finally, Lester wrote some surprising things. Firstly, the idea behind PDO is
good. The core just needed a little tidying to make it fully practical.
Secondly, he was actually looking forward to not having to support PHP 4 any
more…

Lukas pointed out that pg_query_params() values can only be
literals, not identifiers, and the differences in placeholder support come
down to whether the placeholders are named (PGSql, Oracle) or not (db2,
MySQL). PDO offers support for both named and unnamed placeholders in all
drivers. As for BLOBs, they’re implemented in PDO using the
streams API; they aren’t actually raw content. That said, Lukas agreed with
Lester that adding support for BLOB IDs would be a good move,
assuming they aren’t supported already. Lukas was uncertain of that because
the documentation – another area where PDO is intended to reduce duplicate
effort – is behind the times at present.
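
A minimal sketch of the two placeholder styles Lukas mentioned, as seen from userland. It assumes the pdo_sqlite driver and an in-memory database only so that the example is self-contained; the table and values are invented. In both styles the placeholder stands for a value, not for an identifier such as a table or column name.

```php
<?php
// Assumption: pdo_sqlite is available.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Unnamed (positional) placeholder: values are bound in order.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (?)');
$insert->execute(array('lester'));
$insert->execute(array('lukas'));

// Named placeholder: values are bound by name.
$select = $pdo->prepare('SELECT id FROM users WHERE name = :name');
$select->execute(array(':name' => 'lukas'));

// Depending on the PHP version the driver may return the id as a string,
// so cast it before comparing.
$id = (int) $select->fetchColumn();
echo "id of lukas: $id\n";
```

The same prepared statement can be executed repeatedly with different values, as the two inserts show.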

Short version: Finally moving forward with the PDO_FIREBIRD
debate.

TLK: TSRM in CLI

Stas had found some odd code in the CLI SAPI. In ZTS builds,
compiler_globals and executor_globals were being
allocated their respective ts_resource(*_globals_id). This made
no sense to him because the globals are already allocated in
zend_startup(), which is called during the SAPI module startup.
Those globals must be holding either tsrm_ls if they’d been
initialized to 0, or ‘random crap’ otherwise. Could
anyone explain to him why/how this block of code in CLI worked?

Following a brief silence, Stas reported his discovery that the
compiler_globals pointer isn’t even used in ZTS mode. The
code in question was therefore completely redundant, and he planned to delete
it unless anyone had objections.

The silence held. Stas looked into it a little more, and eventually killed
off two related blocks of code.

Short version: Stas finally started talking to himself.

TLK: Taint update

Wietse Venema, who posted an RFC about taint support on the internals list at
the end of last year, wrote to say that he has now resumed work on the
project. Wietse is currently working on a rough prototype to add taint
support to the PHP
core and standard extensions. He has chosen the path of developer impact
rather than performance impact, meaning that overhead should be minimal; his
prototype simply sets and tests ‘a few normally unused bits in the zval
structure’. Wietse added, however, that he doesn’t expect to have actual
performance data available until his first implementation is ready, some time
in September.

Guilherme Blanco had tried to do something similar with a Poka-Yoke implementation in the
past, and was very certain that the PHP development team wouldn’t be
interested in incorporating Wietse’s taint support in the core. He felt that
Wietse should concentrate on data validation with userland PHP rather than
going under the hood.

Richard Quadling wrote helpfully about Marco Tabini’s php|architect article on
Poka-Yoke. Guilherme agreed that this was a useful resource, although he
claimed that his own implementation takes things a step further.

Short version: It seems the internals folk are waiting until the
implementation turns up before they comment further.

TLK: Namespaces or packages?

Having picked up on the brief exchange about renaming namespaces
to packages a couple of weeks ago, Johannes provided a
patch to accomplish this and another to alter any affected tests
accordingly. He wrote that he believed consensus had
been achieved (erm… if so, it was largely off-list), but wanted to check
that there were no objections before committing his changes.

Marcus promptly thanked him, but Dmitry Stogov soon followed his post with a
complaint that the decision to use namespace,
package or even packet had yet to be taken. Derick
Rethans argued that the implementation is closer to package than
namespace support, ‘and “packet” doesn’t have any meaning in
this case’. Did Dmitry have a better suggestion? Marcus agreed; ‘I
actually only saw pros on the question.’ Dmitry – who was all for calling
it packages from the start – explained that he just wanted to
wait a bit, presumably anticipating on-list agreement over the feature’s
name. Christian Schneider intervened to say that the feature itself was more
important than its name, and mentioned ‘the color of the
bikeshed’ in passing. Derick disagreed with him; calling this
namespace support ‘just for marketing reasons’ would say
nothing about the kind of functionality that had been implemented. Jani wrote
rather impatiently that the majority had agreed already, and Johannes should
just commit. Stas – who was against renaming at the start – wanted to know
when this vote had been held. He also wondered where everyone else had found
the definition that explains the difference between namespaces
and packages – did C++ now have a trademark on the name? That
said, he could live with packages if necessary; he just liked
namespaces better because the concept was easier for PHP users
to grasp.

Jeremy Privett felt that namespaces was a confusing name for
anyone with a C++ or C# background, purely because the behaviour’s so
different in PHP. However, in line with most of the PHP users contributing to
this thread, he only really cared that the functionality was there. Robert
Cummings started singing ‘tomayto, tomahto’ to himself at this point.
Larry Garfield mentioned the similarity with XML namespaces for the first
time – something that seemingly hadn’t occurred to anyone until now – and
Lars Gunther brought up ECMAScript 4/Javascript 2 namespaces, ditto. Stas
meanwhile pulled out the Wikipedia page on namespaces
and demanded to know in what way the PHP
implementation fell outside that definition. And Tijnema – who has lost his
exclamation mark over the summer – went on one of his inventive naming
sprees:

Short version: PHP’s roots are showing.

REQ: Pre-destructor magic function

Someone rejoicing in the name of Nathanael D. Noblet wrote to internals@ with
a request. He had an object containing members that were themselves objects.
These member objects each held references to the container:


class A {
    var $B;   // holds an instance of B
}

class B {
    var $A;   // holds a reference back to the containing A
}


Nathanael had found that when calling something like:


$obj = new A();


in a loop, the object is never released from memory because of those internal
references. Although this wouldn’t normally be a problem during the lifetime
of an HTTP request, Nathanael’s objects happened to be in a batch script that
looped through 22,000 records and were eating memory. He’d worked around the
issue by creating a special function in the object to manually release the
recursive references, but thought it would be good to have a magic function
that could be called on an object so that it would ‘know’ when it’s being
deleted – something like __preDestruct(). Another approach would
be to implement better handling for recursive references in objects…
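
Nathanael’s own code wasn’t posted, but the workaround he described can be sketched roughly as follows. The class names, the release() method and the destructor counter are all hypothetical, and exist only to make the effect visible.

```php
<?php
// Hypothetical container whose member object points back at it.
class Container {
    public static $destroyed = 0;   // counts destructor calls, for illustration
    public $child;

    public function __construct() {
        $this->child = new Child($this);
    }

    // Hypothetical helper: break the cycle by hand before dropping the object.
    public function release() {
        if ($this->child !== null) {
            $this->child->parent = null;
            $this->child = null;
        }
    }

    public function __destruct() {
        self::$destroyed++;
    }
}

class Child {
    public $parent;

    public function __construct(Container $parent) {
        $this->parent = $parent;
    }
}

for ($i = 0; $i < 1000; $i++) {
    $obj = new Container();
    // ... process one record here ...
    $obj->release();   // without this, the A <-> B cycle keeps both alive
}
unset($obj);

echo Container::$destroyed, " containers destroyed\n";
```

With the cycle broken by hand, each container’s reference count drops to zero as soon as the variable is reassigned, so it is freed immediately rather than lingering for the rest of the batch run.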

Arnold Daniels intervened, and made Nathanael aware of David Wang’s Google
Summer of Code project to implement a garbage collector for circular
references in PHP. Nathanael cheerfully admitted to not having followed
internals traffic in the past, and volunteered to test David’s code.

Short version: That’ll be one of those real-world cases, then.

CVS: Nothing major going down

Changes in CVS that you should probably be aware of include:

  • AIX configuration bug #42195
    (C++ compiler required always) was fixed [Jani]
  • In ext/openssl, bug #42222
    (php_openssl_make_REQ() buffer overflow) was fixed
    [Pierre-Alain Joye]
  • Core bug #42233 (Problems with
    æøå in extract()) was fixed [Jani]
  • In ext/ldap, bugs #41973
    (./configure --with-ldap=shared fails with
    LDFLAGS="-Wl,--as-needed") and #42247 (ldap_parse_result()
    not defined under win32) were fixed [Nuno Lopes and Jani, respectively]
  • In the CGI SAPI, bugs #42198
    (SCRIPT_NAME and PHP_SELF truncated when inside a
    userdir and using PATH_INFO) and #31892 (PHP_SELF incorrect
    without cgi.fix_pathinfo, but turning on screws up
    PATH_INFO) were fixed [Dmitry]
  • Two gdImageCreate() crash fixes, libgd bug #94 and (presumably #11, reported as #101) were merged into
    PHP 5_2 and HEAD [Mattias Bengtsson]
  • In ext/bz2, bug #42117
    (bzip2.compress loses data in internal buffer) was fixed [Ilia,
    Tony Dovgal]
  • In ext/dbase, bug #42261
    (header wrong for date field) was fixed [Ilia, Tony]
  • A core bug affecting Linux users, #42243 (copy() does not
    output an error when the first arg is a dir) was fixed [Ilia, Tony]
  • In ext/sybase, bug #42242
    (sybase_connect() crashes) was fixed in 5_2 only [Ilia]
  • Support for building ext/oci8 with Oracle 11g was added to both
    CVS HEAD and the 5_2 branch, the latter with Ilia’s permission [Christopher
    Jones]

Unusually, there wasn’t much else going on in the CVS mailing lists this
week, give or take the large number of test commits and various issues
associated with those. Jani started the week with a small storm over
non-merged fixes in the 5_2 branch – nothing new there – and Tony Dovgal
obliged by merging those of Ilia’s patches that Jani either couldn’t or
wouldn’t deal with himself.

Dmitry, though, commented after Johannes slipped Etienne Kneuss’ support for
dynamic references to static class members into the 5_2 branch:

Short version: Dynamic referencing comes under the microscope.

PAT: That win32 filepath issue [again]

A patch offered by crrodriguez at suse dot de via bug report #42208
(substr_replace() crashes when the same array is passed more
than once) was applied in the 5_2 branch by Ilia. Bug reporter Andrew Minerd
also posted a fix, this time for streams bug #42237
(stream_copy_to_stream() returns invalid values for memmapped
streams). Ilia again applied the patch; Jani later fixed the variable names.
Neither fix is needed in CVS HEAD.

Rob Richards applied a fix for ext/dom bug #42082 (NodeList length
zero should be empty). The patch came from Hannes Magnusson.

Johannes applied an array.c patch from MySQL AB’s Ulf Wendel to fix
the build in CVS HEAD.

Richard Quadling chased up his patch from last week to
fix bug #25361 (Program
Execution Functions fail in certain cases) on both NT and 9x systems,
although noting that the problem of whitespace in directory paths is in fact
a Windows issue rather than a bug in PHP. Nuno referred him to a previous
attempt to fix a duplicate bug and the ensuing discussion, but Richard still
didn’t see
why no fix had ever been committed. Tim Starling, who authored the previous
fix, explained that nobody has yet submitted a patch that takes backward
compatibility into consideration. Simply adding the /s switch
and/or wrapping the path in quotes in all cases would break existing scripts
that wrap the path in quotes.

Tony offered up a patch to resolve the universal binary issues
raised a couple of weeks back, and asked Uwe Schindler to
test it. Uwe explained that he currently has no way of obliging, and passed
the baton to Christian Speich, the originator of the complaint.

Short version: Note to Richard: support for win9x was dropped some time ago.