Zend Weekly Summaries Issue #353


TLK: getopt revisited
TLK: PDO/Firebird [again]
TLK: TSRM in CLI
TLK: Taint update
TLK: Namespaces or packages?
REQ: Pre-destructor magic function
CVS: Nothing major going down
PAT: That win32 filepath issue [again]

6th August - 11th August 2007

TLK: getopt revisited

David Soria Parra kicked off the week with his second attempt to bring longopt and Windows support to getopt(), following a long discussion on IRC. The idea now was to centralize PHP's own getopt() implementation in the core rather than duplicate it in the CGI and CLI SAPIs, which would also allow a clean-up of the currently necessary checks in the php_getopt() function in ext/standard/basic_functions.c. The minor changes to the implementation that already exists in CLI were to enable long options on the command line under both Linux and Windows. The function syntax would look like:

$opt = getopt("a", array("param:", "param2"));

where : means that the marked parameter takes an argument.
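As a rough sketch, a CLI script might use that syntax as shown below. The option names come from the example above. The invocation and the helper summarizeOptions() are invented for illustration, and the return values described are those of getopt() as it behaves in PHP 5.3 and later, which may not match David's patch in every detail.

```php
<?php
// Hypothetical invocation: php script.php -a --param=value --param2
// "a" and "param2" are plain flags; "param:" requires an argument.
$opt = getopt("a", array("param:", "param2"));

// Hypothetical helper: turn getopt()'s result into readable lines.
// getopt() reports an option that takes no argument with the value false.
function summarizeOptions(array $opt)
{
    $lines = array();
    foreach ($opt as $name => $value) {
        if ($value === false) {
            $lines[] = "$name: set";
        } else {
            // A repeated option is returned as an array of values.
            $lines[] = "$name: " . (is_array($value) ? implode(',', $value) : $value);
        }
    }
    return $lines;
}

// getopt() returns false on failure, so fall back to an empty list.
echo implode("\n", summarizeOptions($opt === false ? array() : $opt)), "\n";
```

The helper is kept separate from the getopt() call so that the interpretation of the result can be checked without a real command line.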

Marcus Börger agreed that it might be a smart move to consolidate the implementation in /main, and asked Jani Taskinen to get that part under way. Jani agreed too, but thought it best to wait until the 5_3 branch opens before commencing open-heart surgery on the PHP core. Johannes Schlüter more cautiously noted that there is currently just one test case for getopt(); he didn't like the idea of rewriting the implementation without regression testing. Johannes added that the offered patch isn't threadsafe either, and really should be. Finally, php_getopt() is not marked PHPAPI in either the current implementation or the patch, meaning that it can only be used locally.

Jani wondered quietly why anyone would need thread safety in a CGI/CLI environment, but Johannes felt there would be no way to guarantee those environments once the function becomes part of the PHP API. Stas Malyshev wanted to know where the options would come from in other environments, given that there are no other command line versions of PHP. Somehow his query backfired; David immediately rewrote the history of his patch, and Johannes reiterated the arguments for moving it into /main.

Short version: David's patch is available here.

TLK: PDO/Firebird [again]

Diving neatly sideways from the discussion following the PHP 5.2.4 RC1 release last week, Lester Caine came up with a list of arguments against the emphasis on PDO development. PDO's lack of support for PHP 4 was the first on that list, despite the PHP 4 end of life announcement. ADOdb (Lester's database abstraction layer of choice) has support for PDO drivers, but at the cost of performance. PDO's internal restrictions mean that PDO drivers can only access the database via SQL; this doesn't take Firebird's services interface, used for backup, user management and the event handler, into consideration. PDO's transaction control hides the transaction mode and cannot retain the context of a transaction following a commit or rollback; there's no way to select a more appropriate transaction mode. Finally, as Lester understood it, PDO returns an entire BLOB object rather than a simple handle that would allow access to a subset of its data. All these were restrictions that made it unlikely, if not impossible, that any Firebird user would step up to develop PDO_FIREBIRD... unless he'd misunderstood something.

Larry Garfield posted a link to gophp5.org, a laudable effort from the developers of symfony, Typo3, phpMyAdmin, Drupal, Propel and Doctrine to get other mainstream PHP applications and ISPs to commit to a minimal PHP 5.2 requirement in the first quarter of 2008. Lack of support for PHP 4 is no longer an argument against anything. Larry went on to demolish the performance myth, pointing out that PDO does rather more than simply pass along an SQL query string. Where native drivers are used, escaping mechanisms need to be implemented separately - and usually in userland PHP code. That said, his own experience of moving Drupal to PDO_MYSQL had resulted in only marginally better performance than that achieved with the native MySQL driver. The benefits of PDO were more in the way of cleaner code and type-safe prepared statements than speed. Larry believed that PDO does in fact allow non-SQL strings as queries, but referred Lester to Wez Furlong to be certain. How this might affect Firebird particularly, he didn't know - but it seemed pointless to Larry to write cross-database compatible code when you're certain you're only going to be using one RDBMS.

Lukas Smith disputed this, writing that the whole point of PDO was to have a common set of methods and a common infrastructure to support them. Any PDO driver could, however, implement driver-specific methods, prefixed with the driver name. He therefore saw no reason that 'RDBMS-specific goodies' in the native drivers shouldn't be ported to PDO. Dan Scott pointed out that we've been here before and are still awaiting Lester's (or anyone else's) patch for PDO_FIREBIRD after two and a half years. Lester retorted that he was simply trying to re-open the discussion on a generic issue, i.e. transaction handling. Lukas agreed that there is actually an issue here: PDO doesn't support nested transactions. Then again, as far as he knew none of the other RDBMS extensions supports it, including the Firebird native driver. Still, this didn't prevent PDO_FIREBIRD having specific methods to handle nested transactions...

Lester wrote that he just wanted to be able to turn auto-commit on or off and switch transaction modes, something that many RDBMS have support for. He'd noticed, however, that other generic drivers only make this available via the connect string, which isn't the way it works in Firebird. There, he can wrap an update in its own transaction, which can then be rolled back and reworked if necessary. Until now, Lester had been unaware that most of the native drivers in PHP don't provide access to those facilities, despite the fact that the underlying RDBMS has support for them in at least three cases (Oracle, DB2 and MSSQL). Further, both Firebird and PostgreSQL can use parameters to handle the transfer of values to a query, whereas - as far as Lester could see - MySQL cannot. PDO wraps this 'quite nicely', except in the case of BLOBs. In his view, 'most database engines would benefit from being able to handle BLOB elements independent of the general fetch.'

Finally, Lester wrote some surprising things. Firstly, the idea behind PDO is good. The core just needed a little tidying to make it fully practical. Secondly, he was actually looking forward to not having to support PHP 4 any more...

Lukas pointed out that pg_query_params() values can only be literals, not identifiers, and the differences in placeholder support come down to whether the placeholders are named (PGSql, Oracle) or not (db2, MySQL). PDO offers support for both named and unnamed placeholders in all drivers. As for BLOBs, they're implemented in PDO using the streams API; they aren't actually raw content. That said, Lukas agreed with Lester that adding support for BLOB IDs would be a good move, assuming they aren't supported already. Lukas was uncertain of that because the documentation - another area where PDO is intended to reduce duplicate effort - is behind the times at present.
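A minimal sketch of the two placeholder styles in PDO follows. It assumes the pdo_sqlite driver is available and uses an in-memory database; the notes table and its contents are invented for illustration and have nothing to do with Firebird.

```php
<?php
// Assumes pdo_sqlite is installed; the "notes" table is hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE notes (id INTEGER PRIMARY KEY, author TEXT, body TEXT)');

// Unnamed (positional) placeholders, bound in order.
$insert = $db->prepare('INSERT INTO notes (author, body) VALUES (?, ?)');
$insert->execute(array('lester', 'transaction modes'));
$insert->execute(array('lukas', 'placeholders'));

// Named placeholders: same API, different marker style.
// Placeholders stand for values only, never for identifiers such as
// table or column names.
$select = $db->prepare('SELECT body FROM notes WHERE author = :author');
$select->execute(array(':author' => 'lukas'));
$body = $select->fetchColumn();

echo $body, "\n";
```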

Short version: Finally moving forward with the PDO_FIREBIRD debate.

TLK: TSRM in CLI

Stas had found some odd code in the CLI SAPI. In ZTS builds, compiler_globals and executor_globals were being allocated their respective ts_resource(*_globals_id). This made no sense to him because the globals are already allocated in zend_startup(), which is called during the SAPI module startup. Those globals must be holding either tsrm_ls if they'd been initialized to 0, or 'random crap' otherwise. Could anyone explain to him why/how this block of code in CLI worked?

Following a brief silence, Stas reported his discovery that the compiler_globals pointer isn't even used in ZTS mode. The code in question was therefore completely redundant, and he planned to delete it unless anyone had objections.

The silence held. Stas looked into it a little more, and eventually killed off two related blocks of code.

Short version: Stas finally started talking to himself.

TLK: Taint update

Wietse Venema, who posted an RFC about taint support on the internals list at the end of last year, wrote to say that he has now resumed work on the project. Wietse is currently working on a rough prototype to implement taint support in the PHP core and standard extensions. He has chosen the path of developer impact rather than performance impact, meaning that overhead should be minimal; his prototype simply sets and tests 'a few normally unused bits in the zval structure'. Wietse added, however, that he doesn't expect to have actual performance data available until his first implementation is ready, some time in September.

Guilherme Blanco had tried to do something similar with a Poka-Yoke implementation in the past, and was very certain that the PHP development team wouldn't be interested in incorporating Wietse's taint support in the core. He felt that Wietse should concentrate on data validation with userland PHP rather than going under the hood.

Richard Quadling wrote helpfully about Marco Tabini's php|architect article on Poka-Yoke. Guilherme agreed that this was a useful resource, although claiming that his own implementation takes things a step further.

Short version: It seems the internals folk are waiting until the implementation turns up before they comment further.

TLK: Namespaces or packages?

Having picked up on the brief exchange about renaming namespaces to packages a couple of weeks ago, Johannes provided a patch to accomplish this and another to alter any affected tests accordingly. He wrote that he believed consensus had been achieved (erm... if so, it was largely off-list), but wanted to check that there were no objections before committing his changes.

Marcus promptly thanked him, but Dmitry Stogov soon followed his post with a complaint that the decision to use namespace, package or even packet had yet to be taken. Derick Rethans argued that the implementation is closer to package than namespace support, 'and "packet" doesn't have any meaning in this case'. Did Dmitry have a better suggestion? Marcus agreed; 'I actually only saw pros on the question.' Dmitry - who was all for calling it packages from the start - explained that he just wanted to wait a bit, presumably anticipating on-list agreement over the feature's name. Christian Schneider intervened to say that the feature itself was more important than its name, and mentioned 'the color of the bikeshed' in passing. Derick disagreed with him; calling this namespace support 'just for marketing reasons' would say nothing about the kind of functionality that had been implemented. Jani wrote rather impatiently that the majority had agreed already, and Johannes should just commit. Stas - who was against renaming at the start - wanted to know when this vote had been held. He also wondered where everyone else had found the definition that explains the difference between namespaces and packages - did C++ now have a trademark on the name? That said, he could live with packages if necessary; he just liked namespaces better because the concept was easier for PHP users to grasp.

Jeremy Privett felt that namespaces was a confusing name for anyone with a C++ or C# background, purely because the behaviour's so different in PHP. However, in line with most of the PHP users contributing to this thread, he only really cared that the functionality was there. Robert Cummings started singing 'tomayto, tomahto' to himself at this point. Larry Garfield mentioned the similarity with XML namespaces for the first time - something that seemingly hadn't occurred to anyone until now - and Lars Gunther brought up ECMAScript 4/JavaScript 2 namespaces, ditto. Stas meanwhile pulled out the Wikipedia page on namespaces and demanded to know in what way the PHP implementation fell outside that definition. And Tijnema - who has lost his exclamation mark over the summer - went on one of his inventive naming sprees:

classgroups?
phpspaces?
codebundles?
...?

Short version: PHP's roots are showing.

REQ: Pre-destructor magic function

Someone rejoicing in the name of Nathanael D. Noblet wrote to internals@ with a request. He had an object containing members that were themselves objects. These member objects each held references to the container:

class A {
    var $B; // holds an instance of B
}

class B {
    var $A; // points back to the containing A
}

Nathanael had found that when calling something like:

$obj = new A();

in a loop, the object is never released from memory because of those internal references. Although this wouldn't normally be a problem during the lifetime of an HTTP request, Nathanael's objects happened to be in a batch script that looped through 22,000 records and were eating memory. He'd worked around the issue by creating a special function in the object to manually release the recursive references, but thought it would be good to have a magic function that could be called on an object so that it would 'know' when it's being deleted - something like __preDestruct(). Another approach would be to implement better handling for recursive references in objects...
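The workaround Nathanael describes might look something like the sketch below. The class names, the release() method and the destructor counter are all invented for illustration; his actual code was not posted in this summary.

```php
<?php
// Hypothetical classes mirroring the A/B example: each child points back
// at its container, which creates a reference cycle.
class Container
{
    public $children = array();
    public static $destroyed = 0;

    public function add(Child $child)
    {
        $child->parent = $this;      // back-reference closes the cycle
        $this->children[] = $child;
    }

    // Manual "pre-destructor": break the cycle before dropping the object.
    public function release()
    {
        foreach ($this->children as $child) {
            $child->parent = null;
        }
        $this->children = array();
    }

    public function __destruct()
    {
        self::$destroyed++;          // counts objects that were really freed
    }
}

class Child
{
    public $parent;
}

for ($i = 0; $i < 3; $i++) {
    $obj = new Container();
    $obj->add(new Child());
    $obj->release();   // without this, the cycle outlives the unset() below
    unset($obj);       // last reference gone, so __destruct() runs here
}

echo Container::$destroyed, "\n";
```

The drawback is the one Nathanael ran into: the caller has to remember to call release() explicitly, which is what a magic method or a cycle-aware garbage collector would make unnecessary.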

Arnold Daniels intervened, and made Nathanael aware of David Wang's Google Summer of Code project to implement a garbage collector for circular references in PHP. Nathanael cheerfully admitted to not having followed internals traffic in the past, and volunteered to test David's code.

Short version: That'll be one of those real-world cases, then.

CVS: Nothing major going down

Changes in CVS that you should probably be aware of include:

  • AIX configuration bug #42195 (C++ compiler required always) was fixed [Jani]
  • In ext/openssl, bug #42222 (php_openssl_make_REQ() buffer overflow) was fixed [Pierre-Alain Joye]
  • Core bug #42233 (Problems with æøå in extract()) was fixed [Jani]
  • In ext/ldap, bugs #41973 (./configure --with-ldap=shared fails with LDFLAGS="-Wl,--as-needed") and #42247 (ldap_parse_result() not defined under win32) were fixed [Nuno Lopes and Jani, respectively]
  • In the CGI SAPI, bugs #42198 (SCRIPT_NAME and PHP_SELF truncated when inside a userdir and using PATH_INFO) and #31892 (PHP_SELF incorrect without cgi.fix_pathinfo, but turning on screws up PATH_INFO) were fixed [Dmitry]
  • Two gdImageCreate() crash fixes, libgd bug #94 and (presumably) #11 (reported as #101), were merged into PHP 5_2 and HEAD [Mattias Bengtsson]
  • In ext/bz2, bug #42117 (bzip2.compress loses data in internal buffer) was fixed [Ilia, Tony Dovgal]
  • In ext/dbase, bug #42261 (header wrong for date field) was fixed [Ilia, Tony]
  • A core bug affecting Linux users, #42243 (copy() does not output an error when the first arg is a dir) was fixed [Ilia, Tony]
  • In ext/sybase, bug #42242 (sybase_connect() crashes) was fixed in 5_2 only [Ilia]
  • Support for building ext/oci8 with Oracle 11g was added to both CVS HEAD and the 5_2 branch, the latter with Ilia's permission [Christopher Jones]

Unusually, there wasn't much else going on in the CVS mailing lists this week, give or take the large number of test commits and various issues associated with those. Jani started the week with a small storm over non-merged fixes in the 5_2 branch - nothing new there - and Tony Dovgal obliged by merging those of Ilia's patches that Jani either couldn't or wouldn't deal with himself.

Dmitry, though, commented after Johannes slipped Etienne Kneuss' support for dynamic references to static class members into the 5_2 branch:

I think this feature shouldn't go into 5.2, especially after
5.2.4 RC1 release. Maybe into 5.3
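
For readers unfamiliar with the feature, here is a small sketch of dynamic access to static class members. It assumes the patch refers to the $class::member syntax that PHP 5.3 and later accept; the Registry class is made up for illustration.

```php
<?php
// Hypothetical class used only for illustration.
class Registry
{
    const NAME = 'registry';
    public static $count = 2;

    public static function describe()
    {
        return self::NAME . ':' . self::$count;
    }
}

$class = 'Registry';          // class name held in a variable

$count = $class::$count;      // static property
$name  = $class::NAME;        // class constant
$text  = $class::describe();  // static method

echo "$count $name $text\n";
```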

Short version: Dynamic referencing comes under the microscope.

PAT: That win32 filepath issue [again]

A patch offered by crrodriguez at suse dot de via bug report #42208 (substr_replace() crashes when the same array is passed more than once) was applied in the 5_2 branch by Ilia. Bug reporter Andrew Minerd also posted a fix, this time for streams bug #42237 (stream_copy_to_stream() returns invalid values for memmapped streams). Ilia again applied the patch; Jani later fixed the variable names. Neither fix is needed in CVS HEAD.

Rob Richards applied a fix for ext/dom bug #42082 (NodeList length zero should be empty). The patch came from Hannes Magnusson.

Johannes applied an array.c patch from MySQL AB's Ulf Wendel to fix the build in CVS HEAD.

Richard Quadling chased up on his patch from last week to fix bug #25361 (Program Execution Functions fail in certain cases) on both NT and 9x systems, although noting that the problem of whitespace in directory paths is in fact a Windows issue rather than a bug in PHP. Nuno referred him to a previous attempt to fix a duplicate bug and the ensuing discussion, but Richard still didn't see why no fix had ever been committed. Tim Starling, who authored the previous fix, explained that nobody has yet submitted a patch that takes backward compatibility into consideration. Simply adding the /s switch and/or wrapping the path in quotes in all cases would break existing scripts that wrap the path in quotes.

Tony offered up a patch to resolve the universal binary issues raised a couple of weeks back, and asked Uwe Schindler to test it. Uwe explained that he currently has no way of obliging, and passed the baton to Christian Speich, the originator of the complaint.

Short version: Note to Richard: support for win9x was dropped some time ago.
