Zend Weekly Summaries Issue #367


TLK: Visual Studio 2005 (again)
TLK: PDO performance
TLK: Optional scalar type hinting
RFC: Accessible compiler info
TLK: Question about superglobals
REQ: Bring back allow_call_time_pass_reference
TLK: Restore output buffer in case of exceptions
CVS: Oracle 8 support dropped
PAT: Etienne joins the clan; str_split patch

11th November - 17th November 2007

TLK: Visual Studio 2005 (again)

Elizabeth Smith revisited the topic of providing PHP builds created with VS 2005. She knew it would be impossible to provide only VS 2005 builds until Apache - and various other third-party libraries - made the move, since 'the runtimes will crash and things will blow up'. However, she would like to provide non-threadsafe VS 2005 builds for Windows users. The process needed to be well planned; 'PHP users are notorious for not testing things that aren't available right on the downloads page, and people at Microsoft have made the request that binaries be made available on the php.net site.' Elizabeth was prepared to create VS 2005 builds of the third-party libraries that link into PHP to avoid runtime issues that would otherwise originate from them, but noted that the actual problems would remain unknown until there is widespread testing. She also recognized that this would be 'a big job', but felt it important for php.net to be proactive about the migration.

Stas Malyshev has been compiling his work builds with VS 2005 for some time, and wrote that he hadn't come across any problems. However, VS 2005 binaries are linked against 'libraries which may not be installed by default' (read: a different C runtime), and in some cases users would need to install this package for them to work at all. Elizabeth, who has been quietly providing unofficial VS 2005 and 2008 builds of PHP for a while, acknowledged this point. Although she personally hadn't hit any runtime issues, she was aware that others had. However, official php.net builds were needed so that people could start testing and/or using them on a broader scale. Stas was all for it, commenting that VC8 is a better compiler (than VC6).

Andi Gutmans also backed the proposal, noting that there are actually VC8-specific optimizations in the Zend Engine. He added that he'd recently had a similar discussion with Edin Kadribasic and John Mertic, and both had been supportive in principle. He suggested that Elizabeth contact Edin directly and offer her assistance.

Then I butted in. One of the disadvantages of running late with these archives is that threads long forgotten by everyone else are right in front of me. I'd only recently covered the 'let's kill off VC6' discussion from October, so I probably came across as unhinged here... whatever. It made no sense to me to lose the "runs on your toaster" advantage of the lowest common denominator, or to be tied to shipping the C runtime, or to drop native support for Windows systems pre-dating XP (as per the October discussion). I believed that multiple CRT versions of PHP on the download page would lead to user confusion; even Edin's NTS option had proven confusing for some. The only thing that made sense was offering test builds (as per the current discussion). To that end, the only real issue was that of incompatible third-party libraries, since PHP itself has a CL build system that is known to work across all current MS compiler versions. I wasn't at all happy about php.net being pushed into distributing our own builds of third-party libraries just so we could continue to support Windows. However, if we were going to do that, we should set up edge case tests in the known problem areas (multiple I/O calls, data structures passed around), and run them over binaries built with the full range of compilers. That would be genuinely useful.

Stas likes a good argument too. He pointed out that the l.c.d. compiler is 10 years old and produces slow code, whereas VC8 is the best available today. He wasn't against keeping the VC6 build for the official PHP 5.x releases, but felt that snapshots were a good candidate for test builds. Besides, all the platforms that natively include the version 6 CRT are able to support the version 8 CRT library... I asked if Stas had read to the end of my post. All I wanted was to find a way of upgrading and testing that wouldn't alienate the PHP user base. Having more than one official build would mean having to explain 'C runtime' to practically every PHP newbie, which didn't bear thinking about, whereas third-party "experimental" library collections built with the various compilers would be far less intrusive. Stas agreed it would be useful to have those (or at least, two sets - VC6 and VC8), but he'd like the PHP binary itself to be made available. I saw limiting ourselves to two sets as shortsighted, particularly as we'd need the raw code for all those third-party libraries in order to build them in the first place. Stas and I got sidetracked into a pointless discussion about when would be the right moment to assume majority native support for VC8 binaries (who can say?), which kept us both entertained until Elizabeth picked up her mail.

I'd definitely confused her by combining my responses to two threads, and more so because Elizabeth's initial assumption was that anyone not 100% for Microsoft must be 100% against them. She made the valid complaint that most of the open source libraries linked by PHP have Windows support 'as an afterthought', but ascribed this to a 'negative attitude' on the part of the open source movement. Elizabeth agreed that compiler-specific library collections would be useful; in fact, she'd already been working on this to some extent. All she needed was a test area... the snaps box perhaps, or the gcov site? Another option might be for Microsoft to distribute the builds on their own site and php.net to provide a link. Was there a problem with that, or were Microsoft being held to a different standard than other platforms? She ended her post with the observation that I obviously had an issue with the Microsoft runtime changes; would anyone other than myself or Stas care to weigh in?

Rob Richards did, to point out that the tried and tested VC6 builds aren't going anywhere in a hurry. However, he agreed that an upgrade would eventually be needed, and that there would need to be serious testing beforehand. It made sense to provide a download, even if it was only updated now and again, and he really didn't see what the problem was with that. Richard Quadling wanted to see some actual performance statistics, but - assuming VS 2005 really did offer major benefits - believed there'd be no problem in bundling the runtime with the PHP distribution, given that Windows users generally prefer MSI installers.

I explained to Elizabeth about the two threads thing, and asked if anyone had actually suggested linking to Microsoft? The snaps box idea struck me as a bad one, since anyone reporting a PHP bug is sent there to test the fix. The QA site, on the other hand, was intended for testing stuff; if there must be differently-abled binaries, why not distribute them from there? There was no reason the QA site couldn't be linked from either the php.net home page or the download page, or both. As for building third-party OSS libraries from scratch - I have plenty of experience with that, and offered to help Elizabeth with her efforts when I get time.

The culture clash was more difficult. I suggested talking to the open source teams involved without making assumptions about their attitude, and pointed out that the only thing preventing Microsoft from rolling their own distributions was the fact that they would then have to support them. Whereas an open source team can guarantee support for as long as there is active community interest in a project, companies have different sets of interests and resources, and very different ideas about "long term". Then I messed up by announcing to all and sundry that Elizabeth was actually being paid by Microsoft to get the new build onto the official PHP download page, which I believed to be true at the time. It proved not so, and I ended up withdrawing that statement and apologizing to both parties. Back on track, Elizabeth amended her initial request for space on the official download page and agreed that the QA site plus prominent links would be fine. Her sticking point had been my recommendation that only third-party libraries were provided. At this stage, I happily agreed that we'd reached a resolution we could both live with.

Mario Brandt wrote to confirm that compiling Apache and PHP 5.2.4 with VS8 had boosted performance for him. However, he disputed Richard's assertions that PHP users would prefer an MSI and bundled runtime. Mario proved to be one of the many Windows-based developers who dislike the way MSI installers take control over the environment; to him, a zipped bundle was much easier to install, test and delete. Marcus Börger asked if Mario had any figures to back his performance claims, and Mario helpfully provided some. Richard didn't see how Mario could possibly deal with a new CRT for PHP that way, and gloomily foresaw hordes of users failing to download the new runtime and complaining that their copy of PHP didn't work. I retorted that finding a way to avoid that was precisely what this entire discussion had been about. Pierre-Alain Joye pointed out that users will complain however php.net approaches the matter, and added that nobody had said the VC6 builds would end. He also believed that Microsoft would be likely to provide help with the transition.

Short version: PHP binaries and associated libs built with VS 2005 will be made available on qa.php.net.

TLK: PDO performance

PHP user Andrew Mason was considering a switch from ext/mysqli to the PDO_MYSQL driver. He liked the look of the PDO API better, but wondered if there would be any performance or latency issues when using it with MySQL 5. He didn't trust his own benchmarking because PDO had been faster for some items there, which went against everything he'd heard.

Alexey Zakhlestin wrote that if there is any difference, it's very small. Besides, the overhead was as nothing compared to typical query execution time. He'd been using PDO for three years, and could confirm that it is 'a pleasure to work with'.

Brian Moon also responded. He hadn't done any benchmarking in the past year, but last year's results were still available online. Essentially, the performance hit depended on the way in which PDO_MYSQL was being used.

Short version: The advent of mysqlnd is likely to tip the balance in favour of mysqli until mysqlnd supports PDO as well.

TLK: Optional scalar type hinting

A Sam Barrow had found an old patch written by Derick Rethans that allows scalar type hinting. Sam had adapted the patch to work with the latest PHP_5_3 snapshot, and added the ability to type hint for resource and support for secondary keywords (double, real, long) for all basic types. Since the type hinting is 100% optional, the patch maintains back compatibility in full. Sam didn't actually supply the patch at this stage; he wasn't sure where it should be submitted. However, he asked if it could be added to 'the next PHP release'.

Richard Quadling wanted to know whether the argument would be cast to the hinted type in this patch, and pointed out that the idea of type hinting for arrays and classes was to ensure that the incoming data was appropriate. This was less applicable to basic types, where the same data could imply an empty string, a zero integer or FALSE. Wouldn't it mean that all the arguments had to be explicitly cast? Jeremy Privett assumed not; it should 'just bomb out' if the incoming data is of the wrong type. However, he felt it would only be useful to the kind of developer who always uses === and tries to make PHP behave as if it were strongly typed.
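The two behaviours Richard and Jeremy had in mind can be emulated in userland, which makes the difference easier to see. The following is only a sketch under stated assumptions: the function names are hypothetical, and it does not reproduce what Sam's (unpublished) patch actually did.

```php
<?php
// Behaviour 1 (hypothetical): the hinted argument is cast to the type.
function repeat_cast($text, $times)
{
    $times = (int) $times;            // '2' quietly becomes 2
    return str_repeat($text, $times);
}

// Behaviour 2 (hypothetical): the call "just bombs out" on a wrong type.
function repeat_strict($text, $times)
{
    if (!is_int($times)) {
        throw new InvalidArgumentException('$times must be an integer');
    }
    return str_repeat($text, $times);
}

$cast = repeat_cast('ab', '2');       // the numeric string is accepted

try {
    repeat_strict('ab', '2');         // the same numeric string is rejected
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

With the second behaviour, every caller holding request data (which arrives as strings) would have to cast explicitly before the call, which is the cost Richard was asking about.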

Marcus wrote that the subject had been discussed 'a million times already', and the conclusion had been that type hinting shouldn't be allowed for basic types in PHP. Cristian Rodriguez ignored this, and mentioned that Hannes Magnusson had offered a good implementation that had 'unfortunately' never been merged. David Coallier, though, shared the feedback he'd received when asking for this feature: if he wanted Java, he should just use Java.

Short version: Nope.

RFC: Accessible compiler info

Following on from the VS 2005 discussion, I came up with the idea of adding compiler version information to the php -v output. I'd initially been thinking of ways to make it easier to sort the bug reports once the Windows test releases are available, but it seemed to me that this would be a useful thing across all platforms.

Everyone agreed, for once, but with the caveat that compiler information should be part of the phpinfo() output rather than the -v output. That stymied me (I know how to add information to -v but phpinfo() needs actual thought), but I promised to look into it.

Short version: Another item for the TODO.

TLK: Question about superglobals

A Sam Barrow announced that he was developing a patch for personal use to allow custom superglobals, and could do with some advice. It appeared to work, but only when he hard coded his superglobals into the source files; otherwise, despite the fact that the variable appeared to register as expected, he was unable to access it from his script. Was there some kind of restriction on setting superglobals at runtime?

Hans-Peter Oeri and Johannes both immediately referred Sam to the runkit extension in PECL, but Sam already was aware of it; he'd decided against it because of its beta status and the fact that he only wanted support for superglobals. Besides, in runkit there is a restriction; superglobals can only be specified through php.ini, whereas Sam wanted to use a superglobal keyword to define them. He wasn't sure that the team would want this in the core, but if they liked the idea he'd be happy to provide the implementation.

PHP user Michael McGlothlin thought it a great idea, but Stas was quick to squash it. If everybody were to add variables into the global space and make them superglobal just to save a few keystrokes, things would quickly get messy. He advised Michael to declare a class and use statics or singletons instead. Sam pointed out that it's easy enough to create messy code with many existing features in PHP; he didn't see that as a good reason to deny serious developers a useful tool. In his medium-sized application, Sam was now able to have a single line:

superglobal $mod, $sec, $cfg;

in his root include file, rather than having to specify his three universal variables as global in every single function and method.

Sean Coates promptly demonstrated why Sam shouldn't need to do that, and Stas pointed out that changing the behaviour of unrelated code is almost always a bad idea. Michael, though, still felt that PHP should have 'more options for both tighter and looser control of variables.' He would like user-defined superglobals; he would also like a way to make variables local to a chunk of code without needing to use a function as a container. Besides, if superglobals were so pointless, why did PHP have them at all? With regard to Sean's demo code, Michael wanted to know whether the idea was now to inappropriately force everything to be a class? Sean argued that this was wholly appropriate; classes were designed to be user-definable, whereas superglobals were not. Defining the configuration in a class made it easy for the application maintainer to determine where the data was defined; superglobals, on the other hand, might be defined anywhere in the source. Sam queried Sean's assumption that superglobals were not designed to be user-definable and pointed to C and C++, where they are. (Robert Cummings later wrote to point out that locally declared variables take precedence over global variables in both languages, which is not the case in PHP.) Rasmus Lerdorf pulled out a piece of history to explain why the concept of having to declare your globals exists in PHP; briefly, global variables can be hell to debug. A finite number of clearly labelled global arrays ($_GET, $_POST), and even the later addition of $GLOBALS, meant that the chance of someone mistaking a superglobal for something else and 'ending up with strange side effects' is remote. In case Sam wasn't sure, Rasmus was very against destroying something he considered 'a rather good design decision' just to avoid a couple of keystrokes. However inconvenient it may be to have to declare your globals, it was as nothing compared to the trauma of trying to find a bug caused by a global side effect in someone else's code.
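Sean's demo code isn't reproduced in this summary, so what follows is only a guess at the general shape of the class-with-statics approach that Stas and Sean were advocating. The variable names come from Sam's example; the class name, the function and the sample value are invented for illustration.

```php
<?php
// Hypothetical container class standing in for Sam's three superglobals.
class App
{
    public static $mod;
    public static $sec;
    public static $cfg = array();
}

// Root include file: one visible place where the values are defined.
App::$cfg = array('debug' => true);

function is_debug()
{
    // No "global" declaration is needed here, because a class name is
    // reachable from every function and method.
    return App::$cfg['debug'];
}

$result = is_debug();
```

A maintainer who meets App::$cfg in unfamiliar code can search for the class definition, which is the traceability argument Sean made against user-defined superglobals.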

Sam conceded that Rasmus had a point, but disagreed anyway; he believed that the benefits of superglobals outweigh the risks. Larry Garfield shared a piece of history of his own that involved watching a colleague driven slowly mad by a predecessor who had used global variables as a primary means of internal communication. He backed Rasmus; 'undeclared globals are a form of extreme sadism.' Sam double-checked; his superglobals do, after all, need to be declared somewhere. Rasmus pointed out that that declaration isn't necessarily going to be transparent to a developer working on a different area of the project. Sam suggested it might be helpful to have a php.ini directive, allow_superglobals, defaulting to off. Robert exploded:

Does your code live in a bubble? Allowance of arbitrary super global
definitions would open a can of local variable clobbering worms. A
php.ini directive won't make this reality any more palatable.

Sam retorted that functions and variables are perfectly capable of causing problems too, not to mention user input. He saw flexibility as a goal in programming languages, and it should be up to the programmer to use it well. Rasmus explained that flexibility, in this case, had the potential to cause more confusion than it was worth. Sam's patch didn't offer a way to do something that can't be done currently; it merely changed a minor syntax and a concept that had been in PHP for the last 12 years. He concluded, 'To be completely blunt, this change has no chance of making it into PHP'.

Short version: Not that good an idea.

REQ: Bring back allow_call_time_pass_reference

PHP user Karoly Negyesi made a heartfelt plea for the return of allow_call_time_pass_reference, which has been deprecated since Adam was a lad. In fact, Karoly was surprised it hadn't been de-deprecated in PHP 5, since there are things you can't do in PHP without it...

Brian Moon felt 'froggy' enough to ask for details of those things. All he could think of were the bad things that could happen when allow_call_time_pass_reference is switched on, and in OOP objects are always passed by reference anyway. Karoly gave the example of a CMS where some functions needed to modify a single array and others needed to modify two. If these weren't wrapped into an "arguments" array, func_get_args() would 'butcher your references'.

Stas wasn't alone in failing to follow this explanation. He requested a short code example. Karoly offered a blog entry dating back a couple of years that explained the issue more fully. In an attempt to be more constructive, he asked if there couldn't be a new parameter added to func_get_arg() to obtain the argument by reference? Then he caught up with the universal confusion brought on by his initial explanation, and apologized. What was needed was the ability to pass a variable number of arguments, some of which may be references. He believed this impossible without support for call-time by-ref calls. A simple flag would suffice in func_get_arg(); he was less sure how to deal with func_get_args(), but perhaps a parameter that would accept all the by-ref arguments would be the answer. Stas, having read the blog post, suggested that Karoly submit his idea as a feature request via bugs.php.net, this time including the code examples given in the blog.
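For readers who share the confusion, here is a minimal sketch of the underlying problem. It is not the code from Karoly's blog entry, and the function names are hypothetical. func_get_args() returns copies of the arguments, so a function taking a variable number of arguments cannot modify the caller's variables, whereas references wrapped in a single "arguments" array survive the call.

```php
<?php
// func_get_args() hands back copies, so these writes never reach the caller.
function append_via_args()
{
    $args = func_get_args();
    foreach (array_keys($args) as $i) {
        $args[$i][] = 'changed';      // modifies the local copy only
    }
}

// Workaround: the caller wraps references in one array argument.
function append_via_array(array $refs)
{
    foreach (array_keys($refs) as $i) {
        $refs[$i][] = 'changed';      // writes through the stored reference
    }
}

$a = array('a');
$b = array('b');

append_via_args($a, $b);
$after_args = array($a, $b);          // both arrays are unchanged

append_via_array(array(&$a, &$b));
$after_array = array($a, $b);         // both arrays gained an element
```

The second form works because reference elements inside an array stay references when the array itself is copied.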

Short version: Clarity is good.

TLK: Restore output buffer in case of exceptions

Someone with a rather wonderful name, Mehmet Yavuz Selim Soyturk, wrote to internals@. He admitted up front to being neither a Web programmer nor a PHP developer; he had just come across something he thought might be useful. When an exception occurs, PHP 5 unwinds the stack to restore the state of the script, but neglects to restore the output. Wouldn't it be a good idea if it also did some sort of output buffer unwinding? To demonstrate:

exceptional_ob_start();

echo "Begin\n";
try {
    echo "Exception\n";
    throw new Exception();
} catch (Exception $e) {}
echo "End\n";

exceptional_ob_end_flush();

// Begin
// End

Tony Dovgal pointed out that if this approach were taken to output buffering, the same would need to be applied to file descriptors, database transactions and network connections et al. This didn't make a lot of sense. Alexey corrected him; it made perfect sense, it would just be 'almost impossible' to implement. He actually liked the idea of having the output buffer respond in the way Mehmet suggested, but was concerned that edge cases would be less than obvious.

Evert|Rooftop wrote bluntly that in his code, the parts that start an output buffer are also responsible for closing it. Something like:

ob_start();
try {
    do_something();
} catch (Exception $e) {
    ob_end_clean();
    throw $e;
}

$data = ob_get_clean();

would be typical. Edward Z. Yang agreed, explaining that Mehmet would need to implement 'rollback functionality' in the catch block. Mehmet hadn't been aware until now that output buffers in PHP can be nested; this made it possible to manually translate a standard try-catch block to something like:

ob_start();
try {
    // ... code
    ob_end_flush();
} catch (Exception $e) {
    ob_end_clean();
    // ... code
}

Stas' post explaining precisely this arrived after Mehmet's post with the code.
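Putting the pieces together, Mehmet's original example can be made to behave the way he wanted using nothing but nested buffers. This is a sketch, assuming PHP 7.1 or later for the closure and type declarations; run_buffered() is a hypothetical helper name, not an existing PHP function.

```php
<?php
// Hypothetical helper: runs $block inside its own output buffer and
// discards whatever it printed if it throws an Exception.
function run_buffered(callable $block): void
{
    ob_start();
    try {
        $block();
        ob_end_flush();   // success: pass the output up to the enclosing buffer
    } catch (Exception $e) {
        ob_end_clean();   // failure: throw the partial output away
    }
}

ob_start();               // outer buffer, so the final result can be captured
echo "Begin\n";
run_buffered(function () {
    echo "Exception\n";
    throw new Exception();
});
echo "End\n";
$output = ob_get_clean();

echo $output;             // prints "Begin" and "End" on separate lines
```

Because the inner buffer is stacked on top of the outer one, cleaning it removes only the output produced inside the failed block.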

Short version (thanks Stas): Buffers are stackable in PHP.

CVS: Oracle 8 support dropped

Changes in CVS that you should probably be aware of include:

  • Zend Engine bug #42937 (__call() method not invoked when methods are called on parent from child class) was fixed in PHP_5_2, PHP_5_3 and CVS HEAD [Dmitry Stogov]
  • In ext/xmlrpc, bug #42736 (xmlrpc_server_call_method() crashes) was fixed across all three branches [Tony]
  • Zend Engine bug #43183 ("use" of the same class in different scripts results in a fatal error) was fixed in the PHP_5_3 branch and CVS HEAD [Dmitry]
  • Core bug #43182 (file_put_contents() LOCK_EX does not work properly on file) was fixed across all three branches [Ilia Alshanetsky]
  • In ext/simplexml, bug #43221 (SimpleXML adding default namespace in addAttribute()) was fixed across all three branches [Rob Richards]
  • TSRM bug #43248 (backward compatibility break in realpath()) was fixed, again across all three branches [Dmitry]
  • Oracle 8 support was dropped from ext/oci8 in the PHP_5_3 branch and CVS HEAD [Tony]
  • In ext/soap, bug #42692 (Procedure int1 not present with doc/lit SoapServer) was fixed across all three branches [Dmitry]
  • In ext/pgsql, bug #43279 (pg_send_query_params() converts all elements in params to strings) was fixed across all three branches [Ilia]
  • Multiple segfaults in getopt() were fixed across all three branches, closing bug #43293, and support for numeric options was added in 5_3 and HEAD [Hannes Magnusson]
  • In the PDO_FIREBIRD driver, bug #43271 (closeCursor() not implemented), feature request #43296 (ATTR_FETCH_TABLE_NAMES support) and bug #43244 (closeCursor() w/o returned data kills process) were fixed in the PHP_5_3 branch [Lars Westermann]
  • Safe mode bug #43276 (Incomplete fix for bug #42739, mkdir() under safe_mode) was fixed in the PHP_5_2 and PHP_5_3 branches [Ilia]
  • In ext/mbstring, bug #43301 (mb_ereg*_replace() crashes when replacement string is invalid PHP expression and e option is used) was fixed across all three branches [Jani Taskinen]
  • In ext/curl, bug #43092 (curl_copy_handle() crashes with > 32 chars long URL) was fixed across all three branches [Jani]

In other CVS news, the command used for building PHP under Windows using CL changed from 5_3 up when Johannes Schlüter committed a fix to make the buildconf batch file work properly. Instead of typing:

cscript /nologo buildconf.js --whatever

you can now simply type:

buildconf --whatever

(The old way will still work too.) Elizabeth followed this with a commit that does the same for the generated config.nice.bat across all three branches.

Wearing his Release Master hat, Johannes entered into the discussion about Ilia's fix for PDO bug #43130 (Bound parameters cannot have - in their name) a couple of weeks back. He didn't like the idea of having vendor specific rules in the PDO parser, but a subset that works with most (all?) backends seemed a good compromise. The regex suggested by Lukas Smith seemed the way to go. Lorenzo Alberton promptly came up with a couple of safer regular expressions for bound parameter checking. Ilia, however, remained immovable: 'I don't see why PDO should follow Oracle's rules for generic functionality.' He was happy with the current implementation, and saw no need to change it.

Short version: The Windows CL build system is now fully in line with Linux syntax.

PAT: Etienne joins the clan; str_split patch

Two new Zend Engine patches from Etienne Kneuss were applied by Johannes and Tony respectively; one to fix bug #43126 (Unexpected T_STATIC in parser error) and one to disallow multiple access modifiers and 'abstract abstract' methods. Derick, seeing a common theme emerging, gave Etienne access to the php-src module. Marcus went one better, and gave Johannes enough karma to allow him to control other people's CVS access.

Hans-Peter Oeri offered a fix for duplicated PDO_FIREBIRD bug #43271/#43246 (closeCursor() not implemented) and followed with a patch offering ATTR_FETCH_TABLE_NAMES support. Marcus looked through the patches, offered some advice about coding standards and eagerly asked Hans-Peter if he'd like CVS access to the module. Hans-Peter, who hadn't expected this, wrote 'yes please', but wanted to know whether there wasn't an active maintainer for the driver already. Marcus didn't know of one... he added that the firebird installation on the gcov site could do with some attention. Nuno Lopes quickly backed Marcus in this; he had no idea how to get it to work. At this point Lars Westermann, the active maintainer of the PDO_FIREBIRD driver, introduced himself to all concerned and explained that he'd already fixed that particular bug in CVS earlier in the week. Marcus pointed to Hans-Peter's second patch and asked whether Lars would like another person to have access to the module at this point. Lars committed support for the attribute, but there was no word on the idea of a second maintainer for the driver.

Meanwhile Marcus had been looking at Andrew Minerd's patch from last week to allow __sleep() to return NULL. He wrote that the patch looked fine, but asked Andrew to include the now outdated test changes in both the 5_3 and HEAD versions of an updated patch.

And finally, a Claudio Cherubino posted a one-line patch against CVS HEAD to fix bug #42866 (str_split() returns extra char when given string size is not multiple of length). He explained that, for example, if the string size is given as 22 and the split length as 5, the final element of the returned array currently contains 5 or more characters rather than the expected 2. This bug was unique to PHP 6 with unicode.semantics switched on.
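For reference, the expected behaviour Claudio described looks like this with an ordinary (non-Unicode) string. The sample string is made up for illustration.

```php
<?php
$string = 'abcdefghijklmnopqrstuv';   // 22 characters
$chunks = str_split($string, 5);      // split into pieces of 5

// 22 = 4 * 5 + 2, so there should be five chunks and the last one
// should hold only the 2 leftover characters.
$count = count($chunks);
$last  = end($chunks);

printf("%d chunks, last one is '%s'\n", $count, $last);
```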

Short version: You wait ages for a PDO_FIREBIRD maintainer and then two come along at once.
