Zend Weekly Summaries Issue #282
TLK: pg_execute_error
TLK: FastCGI and STDIN
REQ: Arbitrary precision datatype
TLK: Late static binding
NEW: PHP 5.1.3 RC2
TLK: GD development @ php.net
BUG: __set/ __get behaviour
REQ: Zend API bump
TLK: PHP-GTK corner
CVS: Mostly Unicode
PAT: Quiet week
TLK: pg_execute_error
Unusually, ext/pgsql maintainer Yasuo Ohgaki mailed the internals list for
advice on how to deal with a perennial PostgreSQL issue. The story was that
pgsql.c currently has support for prepared statements, but
pg_execute() raises an E_WARNING if the query plan has not
been prepared before the function is called. Yasuo termed this 'annoying',
particularly when the database connection is persistent in the Web environment.
He hoped to be able to work with PHP code such as:
if (!pg_execute($db, 'myquery', array())) {
without raising the warning at all. Yasuo could see four possible approaches to the problem:
- ignore errors from pg_execute() - identify them via the return status
- add pg_is_prepared()
- add an optional boolean parameter to pg_execute(), e.g. pg_execute(resource connection, string stmtname, array params, bool ignore_error)
- ignore pg_execute() errors only if the params array is NULL
Did anyone have any comments?
Lukas Smith wanted to know how Yasuo planned to implement option 3 on that list.
As far as Lukas knew, there is no way to discover prepared statements in the current
session, although he believed this is slated to change in PostgreSQL 8.2. The only
way to find out whether a statement has been prepared is simply to try it, and
anticipate an error on failure. That error could trigger 'all sorts of error
handlers' on the database side, producing unexpected behaviour in any PHP
function that had an ignore_error parameter switched to
TRUE.
Yasuo, agreeing that this was exactly the issue, went ahead and applied a fix. He
replied to the list mail saying that he'd simply killed the E_WARNING
in pg_execute(); it seemed the most efficient way to deal with the
problem. An alarmed Lukas fired off another email to clarify: calling
pg_execute() on an unprepared statement will cause the transaction to be
rolled back on the next commit. Encouraging the use of pg_execute() to
find out whether the statement has been prepared is, therefore, 'simply
wrong'. The appropriate place to address the issue would be in userspace code,
by using error suppression in the error handler.
Wez Furlong and Edin Kadribasic were both quick to back Lukas' position, and
called for Yasuo to revert his patch. Yasuo argued that it was as though
file_exists() were to raise an E_ERROR on failure;
PostgreSQL doesn't provide a means to check whether a plan is already defined, so
the script developer can't design around it. Users are advised to prepare the
statement before getting into the transaction block, and check the return status of
pg_execute()... how about if he disabled the E_WARNING,
but allowed other errors to be caught? Would that be okay?
Lukas reiterated his view that PostgreSQL users should use
@pg_execute() and then put something in the error handler like:
function ErrorHandler($errno, $errstr, $errfile, $errline) {
Admittedly this was a bit of a hack; the 'beautiful' alternative would be to manage the prepared statements in some persistent layer.
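A minimal sketch of the error-handler side of Lukas' suggestion follows. The original listing is truncated, so this is not his code: the handler body, the $reported array and the fakeExecute() stand-in for a failing pg_execute() are all assumptions made for illustration. It relies on the fact that PHP still invokes a user error handler for errors raised under @, so the handler has to consult error_reporting() itself.

```php
<?php
// Messages the handler decided to report (hypothetical bookkeeping).
$reported = [];

// PHP calls user handlers even for @-suppressed errors, so check the
// current error_reporting() mask before doing anything with the error.
set_error_handler(function ($errno, $errstr, $errfile, $errline) use (&$reported) {
    if (!(error_reporting() & $errno)) {
        return true; // raised under @ (or masked out): swallow it
    }
    $reported[] = $errstr;
    return true;     // handled; skip PHP's built-in handler
});

// Hypothetical stand-in for pg_execute() on a plan that was never prepared.
function fakeExecute($name)
{
    trigger_error("plan '$name' does not exist", E_USER_WARNING);
    return false;
}

$first  = @fakeExecute('myquery'); // suppressed: nothing is recorded
$second = fakeExecute('myquery');  // not suppressed: the message is recorded

restore_error_handler();
```

Only the second call leaves an entry in $reported, which is the behaviour a script would want when it deliberately probes with @pg_execute().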
Yasuo pointed out that it's not usually recommended to use the @
operator, but Lukas argued that hiding limitations in third-party libraries causes
more problems than it solves - not least since PostgreSQL itself is slated to solve
this particular issue in version 8.2. He offered an alternative to multiple
suppression; prepare the statement again before entering the loop. This would mean
only having to silence pg_prepare() once, although it would of course
add some overhead.
Wez intervened to say that it was a bad idea to commit such a large change in behaviour into the stable branch partway through the release process anyway, regardless of the soundness of the idea, and again asked Yasuo to revert his patch. Yasuo took his point and reverted it in the PHP_5_1 branch immediately. He continued the discussion, however, saying he'd heard the intention for the future of PostgreSQL execution calls is something like 'execute; if it fails, prepare', with version 8.2 backends having a view for currently registered plans. That meant clients would need to send requests over the network just to discover whether a plan was defined, which he saw as a waste of resources:
if (!pg_execute('plan')) {
not least because pg_prepare() - an expensive function at the best
of times - raises an E_WARNING if the plan is already defined.
PostgreSQL users would only be able to take advantage of the performance benefits in
using prepared statements when working over a persistent connection. Perhaps a good
way would be to remove the E_WARNING thrown when
pg_prepare() discovers duplicate plans; would this be a
satisfactory approach?
Lukas agreed that it would be nice to have something like
pg_is_prepared() in place, but reiterated that it would be best to wait
until there is a native solution in PostgreSQL and then work with it. It's possible
to work around the limitations in userland code; Yasuo's patch should therefore be
reverted in all PHP branches.
Yasuo wrote that he hadn't expected to come across such strong opposition to
removing an E_WARNING, given that it seemed to him the best way of
resolving the issue, but - given that this was the case - went on to revert.
Short version (thanks Jani): Nothing to see here, move along.
TLK: FastCGI and STDIN
A PHP user named Matthew needed some help figuring out how to use FCGI_STDIN with
a running PHP script. He explained that he'd written a server in C++, which has
until recently been running PHP through a CGI interface. He'd recently implemented a
FastCGI interface to replace that, but had found that interactive command line
scripts that used to work under CGI were now failing. It seemed that
fopen("php://stdin", "r"); isn't the correct way to read command line
input under FastCGI?
Also, he wanted to keep the connection open to php-cgi in order to avoid connect/accept calls, if that was possible. He realized it would be possible to bind a second port to pipe requests from PHP streams to the server, but was there a simpler approach?
Wez suggested briefly that Matthew should write a real daemon using
stream_socket_server() rather than 'abusing fastcgi/cgi to work in
that way'.
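For readers unfamiliar with the function Wez names, here is a rough sketch of a stream_socket_server() listener. It is not taken from the thread: the loopback address, the handleClient() helper and the upper-casing "protocol" are assumptions, and the script connects to itself only so that it can run in one process. A real daemon would loop on stream_socket_accept() instead.

```php
<?php
// Bind to an ephemeral loopback port (port 0 lets the OS pick one).
$server = stream_socket_server('tcp://127.0.0.1:0', $errno, $errstr);
if ($server === false) {
    throw new RuntimeException("bind failed: $errstr ($errno)");
}
$address = stream_socket_get_name($server, false); // local "ip:port"

// Hypothetical request handler: read one line, answer with it upper-cased.
function handleClient($conn)
{
    $line = fgets($conn);
    fwrite($conn, strtoupper($line));
    fclose($conn);
}

// Demonstration only: act as our own client.
$client = stream_socket_client("tcp://$address", $errno, $errstr, 5);
fwrite($client, "hello\n");

// Accept the pending connection and serve it.
$conn = stream_socket_accept($server, 5);
handleClient($conn);

$reply = fgets($client);
fclose($client);
fclose($server);
```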
Matthew took exception to the term 'abuse', pointing out that both protocols were
in fact designed to support this kind of usage. He didn't understand why PHP, alone
among programming languages, should have no support for using FCGI_STDIN. The
simplest approach he could see toward making a FastCGI client application would be
to close STDIN and STDOUT, dup2() to the end
of a pipe, grab data there, wrap it in a FastCGI header and print it to the server.
STDIN data from the server would enter a second thread or a
multiplexing select() core, use the request ID to find the correct
STDIN pipe, and send the data there.
Wez, pointing out that 'a simple dup2() is not sufficient',
argued that it is always better to tackle a problem "the right way" rather than
trying to force something to work in a way that it doesn't. The use of threads to
"solve" this non-problem was also a bad idea - and where did Matthew intend to put
the dispatcher for multiplexing? He strongly advised Matthew to step back, take a
deep breath and look at the problem afresh; 'and remember: KISS'.
Short version: That's KISS as in Keep It Simple Stupid, not marital guidance.
REQ: Arbitrary precision datatype
Andreas Korthaus hijacked an elderly thread
(subject: 'Floored!') from the php-general mailing list. In that thread, a PHP
beginner had come across the problem of floating point precision for the first time.
Rasmus Lerdorf had referred the newbie to the manual page
on the subject, explaining in passing that the options are to work entirely in
integers or to introduce a little "fuzz factor" when operating on floating point
values. Andreas used the thread as a springboard to launch an impassioned plea for
something like GMP or BC to be implemented as part of the core; something that could
be used transparently (like float), but having arbitrary precision
(like java.math.BigDecimal, PostgreSQL's NUMERIC, GNUCash, etc).
Would anyone really care if it slowed down floating-point calculation?
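The two options Rasmus mentioned can be illustrated with the well-known floor() surprise. This is a sketch, not code from the thread; the tolerance value and the choice of tenths as the integer unit are assumptions.

```php
<?php
// 0.1 + 0.7 is stored as slightly less than 0.8 in binary floating point.
$naive = floor((0.1 + 0.7) * 10);             // 7.0, not 8.0

// Option 1: add a small "fuzz factor" before truncating.
$epsilon = 0.00001;                           // assumed tolerance
$fuzzy = floor((0.1 + 0.7) * 10 + $epsilon);  // 8.0

// Option 2: work in integers (here: tenths) and format only for display.
$tenths = 1 + 7;                              // exactly 8
$display = sprintf('%d.%d', intdiv($tenths, 10), $tenths % 10);
```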
Andreas pooh-poohed Rasmus' suggestion (again in the old thread) that computers cannot accurately represent a fraction; the way he saw it, if a child in elementary school can do it, there's no reason a computer shouldn't do it the same way. It should simply be made to emulate the child's reasoning. Arbitrary precision numbers could be stored in a struct, alongside an array of the digits (as integers) and the position of the decimal point. Then two arbitrary precision numbers could be calculated with the same steps the child learned in school. Sure, it'd take up more memory and more CPU cycles but, given the amount of resources used by the average PHP script anyway, he didn't see this as an issue.
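The representation Andreas describes (digits plus a remembered decimal point position, added column by column with carries) might look roughly like this. The schoolAdd() function is hypothetical, handles only non-negative decimal strings, and makes no attempt at the speed of the bcmath or gmp extensions.

```php
<?php
// Schoolbook addition of two non-negative decimal strings, e.g. "0.1" + "0.7".
function schoolAdd(string $a, string $b): string
{
    // Split into integer and fraction digits; a missing fraction becomes "".
    [$ai, $af] = array_pad(explode('.', $a, 2), 2, '');
    [$bi, $bf] = array_pad(explode('.', $b, 2), 2, '');

    // Pad with zeros so both numbers line up on the decimal point.
    $fracLen = max(strlen($af), strlen($bf));
    $intLen  = max(strlen($ai), strlen($bi));
    $x = str_pad($ai, $intLen, '0', STR_PAD_LEFT) . str_pad($af, $fracLen, '0');
    $y = str_pad($bi, $intLen, '0', STR_PAD_LEFT) . str_pad($bf, $fracLen, '0');

    // Add digit by digit from the right, carrying as on paper.
    $carry = 0;
    $digits = '';
    for ($i = strlen($x) - 1; $i >= 0; $i--) {
        $sum = (int) $x[$i] + (int) $y[$i] + $carry;
        $digits = ($sum % 10) . $digits;
        $carry = intdiv($sum, 10);
    }
    if ($carry > 0) {
        $digits = $carry . $digits;
    }

    // Put the decimal point back at its remembered position.
    if ($fracLen === 0) {
        return $digits;
    }
    return substr($digits, 0, -$fracLen) . '.' . substr($digits, -$fracLen);
}

$small = schoolAdd('0.1', '0.7');
$carry = schoolAdd('999.95', '0.05');
```

Because every digit is an exact integer, the result of '0.1' plus '0.7' is exactly '0.8', at the cost of one loop iteration per digit.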
He went on to denounce the recommendation of a "fuzz factor" 'in days of CPUs with billions of cycles/second', simply to calculate financial data. In his experience, very few PHP users resort to workarounds like that or to the bcmath/gmp extensions; either their applications work through sheer luck, or else they are able to overlook the errors caused by floating point arithmetic - or unaware of them.
Andreas concluded by writing that '64-bit integer and an "arbitrary precision numbers" datatype are the last major features missing in PHP', and - surprisingly meekly after all that fire and thunder - asked whether the 64-bit integer would make it into PHP 6?
PHP user Leon Matthews wrote to point out that the 64-bit integer already exists in PHP 5, and told a tale of regression test failure where the test had expected an error when dealing with UNIX timestamps post-2038. With 64-bit support, there is no error; '64-bit timestamps should keep track of time nicely until sometime after the heat death of the universe....' he ended happily. But as Andreas pointed out, there's a difference between having a generic integer type that will support 64-bit processing and a 64-bit integer type that is guaranteed to always be 64-bit regardless of the platform.
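Andreas' distinction can be checked on any given build with the PHP_INT_SIZE constant. The snippet below is only a sketch of that check; the variable names are made up for illustration.

```php
<?php
// PHP_INT_SIZE is the width of PHP's integer type, in bytes, on this build.
$bits = PHP_INT_SIZE * 8;

// One second past the largest 32-bit signed UNIX timestamp.
$past2038 = 2147483647 + 1;

// On a 64-bit build this is still an integer; on a 32-bit build PHP
// silently converts the overflowing result to a float.
$kind = is_int($past2038) ? 'int' : 'float';
```

The same script therefore reports 'int' on one machine and 'float' on another, which is exactly why a platform-sized integer is not the same thing as a guaranteed 64-bit type.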
Short version: Arbitrary precision is a frequent topic. A detailed analysis of the problem is here; some attempts at resolving it are here and here.
TLK: Late static binding
Dmitry Stogov finally found time to look into Mike Lively's late static binding
patch and make it work in all the cases he could think of. He returned the improved
patch and test cases - unusually, in a format that the internals list attachment
stripper can handle (tar.gz), so we could all see it. Dmitry added bluntly
that he still didn't like the name static, and wasn't convinced that
the concept was needed in PHP at all.
Jochem Maas wrote somewhat acerbically that the new Zend Framework needs it if the team there want to implement something like
Person::findAll($myFilter)
and 'every half-assed PHPer doing OO in PHP 5' would love to know how they intended implementing it otherwise! If there is a clean way of doing this without introducing static late binding, he has been unable to find it. Current solutions tend to be something like
$peeps = Person::findAll('Person', $myFilter)
or
$p = new Person; $peeps = $p->findAll($myFilter);
and, wrote Jochem, that last example feels to him like 'some OO principles are being thoroughly raped'.
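What Jochem is asking for can be sketched as follows, assuming the static:: spelling used by the patch under discussion (the keyword was still being debated at this point) and a PHP version that supports it. The Model base class and the returned array are hypothetical; a real findAll() would query a data store.

```php
<?php
class Model
{
    // Reports which class each keyword resolves to, plus the filter.
    public static function findAll($filter)
    {
        $viaSelf   = self::class;   // always "Model": fixed where the method is defined
        $viaStatic = static::class; // the class the call was actually made on
        return [$viaSelf, $viaStatic, $filter];
    }
}

class Person extends Model
{
}

$result = Person::findAll('age > 30');
```

With late binding the inherited method can tell that it was invoked as Person::findAll(), so neither the extra class-name argument nor the throwaway instance is needed.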
Mike Lively went through Dmitry's changes; his only query was over both
executor_globals and execute_data being used to store the
caller_scope pointer. Dmitry explained that
EX(caller_scope) is a temporary value set in
ZEND_INIT_METHOD_CALL and then copied into
EG(caller_scope) during ZEND_DO_FCALL_BY_NAME. Something
like
Foo::bar(test());
would cause the method call to occur several times before the first
DO_FCALL; to handle this situation, EX(caller_scope) is
stored into a special stack.
OO fan Marcus Börger pleaded for the patch to be committed as it is; the keyword
could always be changed at a later date, and he needed late static binding for
SPL_Types. Andi Gutmans was more cautious; he still had some questions about
the patch for Dmitry before he'd be happy to apply it. He also reiterated that
this:: was a better keyword than static::, pointing out
that this:: had received fairly widespread support.
Dmitry wrote to Marcus explaining that the reason he wasn't happy about committing the patch as it stood was that 'this seldom-used feature' will slow down each PHP call. He intended to measure the performance loss over the coming week, and invited Marcus to do the same.
Short version: It's problematic.
NEW: PHP 5.1.3 RC2
Ilia Alshanetsky, as Release Master for the PHP 5.1 series, announced the second Release Candidate for PHP 5.1.3 as follows:
PHP 5.1.3RC2 has just been released, about a week late, but better late then never ;-). Please test this RC as much as possible, if it proves to be stable, this release will be published as final next week Thursday. The source packages can be found here: http://downloads.php.net/ilia/php-5.1.3RC2.tar.bz2 MD5: 8ad7bddc9a3b4dbcd2ecb1d6f5446970 http://downloads.php.net/ilia/php-5.1.3RC2.tar.gz MD5: 1e66780413580bc4a0742fa302735c99 Win32 binaries will be available for download shortly.
Edin Kadribasic, as ever, wasn't far behind him with the Windows binaries:
http://downloads.php.net/edink/php-5.1.3RC2-Win32.zip http://downloads.php.net/edink/pecl-5.1.3RC2-Win32.zip
Ron Korving noticed that two of the minor code cleanups he'd suggested during the optimization discussions hadn't actually been addressed. Ilia thanked him for pointing them out and applied fixes in CVS; but as Marcus wrote, 'if only these two spots were all problems we had. :-)'.
Short version: Download, test, report bugs to the usual place.
TLK: GD development @ php.net
The original author of the GD graphics library, Thomas Boutell, posted a lengthy missive to the PHP internals list 'offering the bazaar-keepers the keys to the cathedral'. He wrote that - due to lack of time - he hadn't released a new GD update in some time, and the project is effectively forked at present because many of the fixes and improvements only exist in the PHP version. It made sense to him to move GD development to php.net.
Thomas would still like to maintain the project home page, and he would also hope to contribute to GD development as an individual developer; but he wanted to relinquish control. He asked whether the PHP community were interested in taking it on, and raised issues over licensing, support for GD usage outside PHP, documentation for the C API and patch management.
Rasmus immediately confirmed that php.net are very much interested in the GD project, and that there is the infrastructure in place to handle the move. He saw no problem with either the existing licensing or with providing support for the C API - given that PHP uses it - beyond abstracting the PHP-specific 'hacks to make GD play nice with the memory manager' so that any kind of memory manager override could be allowed. GD would live in its own top-level repository in cvs.php.net, and there are ACLs on php.net CVS access - it would be straightforward to add GD-only accounts for developers wanting to work on the project. The only problem he could see was that there needed to be a volunteer project lead; he himself didn't have the time to do the job either.
Pierre-Alain Joye, writing as the lead maintainer of the GD library embedded in PHP, was very much interested. He backed Rasmus' points regarding the licensing, the C API support and php.net's infrastructure, and was interested to know how the current documentation for the GD C API is maintained. The patches sitting in Thomas' inbox could be forwarded to him; he'd add them as soon as the issue tracker was up and running. Obviously, added Pierre, he was volunteering to lead the project if there was the need. He hoped, however, to get other GD maintainers and users involved. He ended by thanking Thomas for his decision in this matter.
Thomas later confirmed his official agreement both with the move and with the choice of Pierre as GD maintainer. He added that he will make an announcement on the project home page directing users to php.net at the point of the next release.
Short version: Outside recognition for years of hard work.
BUG: __set/ __get behaviour
Jochem wrote again. He'd found that the following piece of code:
<?php
gave him
Setting insideArray
under both PHP 5.0.4 and PHP 5.1.0. He had expected either the line commented
with SET 2 to trigger a failed call to __set(), or the key
test to be set in the array returned by __get() but not in
$this->array['insideArray']. Shouldn't __set() protect
the elements that already exist in an object?
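Jochem's listing did not survive, so the following is not his code. It is a hypothetical sketch of the same kind of access pattern, with made-up class and method names, showing how a current PHP build (an assumption; the report concerned 5.0.4 and 5.1.0) treats an array write through __get(): __set() is not called for the second assignment, the stored array is left untouched, and PHP raises a notice about indirect modification.

```php
<?php
class Container
{
    public $log = [];
    private $array = ['insideArray' => []];

    public function __get($name)
    {
        return $this->array[$name]; // returned by value: the caller gets a copy
    }

    public function __set($name, $value)
    {
        $this->log[] = "Setting $name";
        $this->array[$name] = $value;
    }

    public function stored($name)
    {
        return $this->array[$name];
    }
}

// Collect any diagnostics instead of printing them.
$notices = [];
set_error_handler(function ($errno, $errstr) use (&$notices) {
    $notices[] = $errstr;
    return true;
});

$c = new Container;
$c->insideArray = ['a' => 1];  // SET 1: goes through __set()
$c->insideArray['test'] = 2;   // SET 2: __get() is used, __set() is not

restore_error_handler();
$log = $c->log;
$stored = $c->stored('insideArray');
```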
Following criticism from other PHP users (not the development team) regarding user level questions being asked on the internals list, Jochem went on to say that a variation of this code currently doing the rounds on the php-general list actually segfaults under PHP 5.0.4. Under PHP 5.1.1 it throws a fatal error:
FATAL: emalloc(): Unable to allocate 1916888421 bytes
Surely this was an internals issue? He complained that it was all too easy to
make PHP segfault when using __get(), __set() and
__call(), and he'd come to believe this was a problem in the Zend
Engine.
Wez pointedly remarked that it would be more useful to file some solid bug reports than to bitch about the problem, either on the internals list or anywhere else.
Short version: That mystery URL is http://bugs.php.net. Got it?
REQ: Zend API bump
Pierre had some comments to make regarding a Zend Engine change committed to the
5_1 branch by Antony Dovgal and affecting several of the newer extensions. The patch
fixed bug #36898
(__set() seems to leak memory when extending internal classes) by
adding new functions to initialize and destroy zend_object structs.
Pierre agreed that it was an important fix, but queried the wisdom of adding two new
functions to the Zend API during the Release Candidate phase for PHP 5.1.3. He also
pointed out that it is no longer possible to compile the PECL extensions Tony had
altered to make use of those functions against the current release of PHP 5.1.
Wouldn't a Zend API bump be in order?
Wez pointed out that extensions not using the new API would in fact continue to
work; they would simply continue to leak when __set() is used with
them. However, he agreed that the Zend API version number should be bumped, allowing
extensions to make use of the new API. Pierre went further, saying that the Zend API
number should be bumped every time something new is added; 'it is getting really
hard to know when and what was introduced'.
Short version: The Zend API number's still 20050922 at the time of writing...
TLK: PHP-GTK corner
Andrei Zmievski had one of those 'aha!' moments at the beginning of the week, and
added the gtype object property into CVS. That means that
$obj->gtype;
will return the object's type - which is useful, because we don't have
$obj->get_type() exposed, so $obj->get_name() has
been the only way to reach that information until now.
Anant Narayanan nudged Andrei about his waiting patches, and Andrei subsequently
added GtkAboutDialog into CVS (but not GtkPlug or
GtkSocket).
Madeleine Drake wrote in with an unusual request; she wanted a Windows binary of PHP-GTK 1 compiled against PHP 4.4.2. She explained that she hoped to file a bug report about an exit hang under Windows 98, and felt that php.net wouldn't talk to her unless she'd tested with the 'latest and greatest' version of PHP. I said I'd make her one if she hassled me, but suspected her bug report would get short shrift anyway given that the issue only arises with the Win9x/PHP 4/PHP-GTK combination.
Christian Weiske came up with an idea for a new method,
GtkWidget::set_visible(). He wrote that he needed to call
show()/hide() on menu items dynamically in response to a
GtkListStore value, and although it was perfectly possible to do this with an
if block, it was an ugly approach. Since Christian felt that this
routine was likely to arise frequently in PHP-GTK programming, it would be nice to
be able to call it in a single line. Scott Mattocks agreed, and wrote a patch
implementing the suggestion but with a second optional boolean parameter to
determine whether show()/hide() or
show_all()/hide_all() should be called. He added that his
patch didn't actually work, and he had no idea why; it compiled, the method was
callable, the values passed to it were correct, it just didn't toggle
show()/hide(). Andrei wrote that he was wary of adding
PHP-specific methods to GTK+ widgets, but did it anyway once he'd tracked down the
reason Scott's otherwise perfect patch failed. (Scott had declared his variables as
the integer type gboolean rather than as the unsigned char type
zend_bool, so the result was always TRUE.)
Finally, Andrei looked into the segfault Christian and Anant had both reported in
GdkDrawable::draw_rgb_image_dithalign(). He ended by throwing out the
*_dithalign() methods completely, and writing new wrappers for
draw_rgb_image() and draw_rgb_32_image(). Christian
wondered if this meant GdkPixbuf animation should work now, but Andrei
was unsure whether it should; he only knew it hadn't worked on his box.
Short version: Getting closer all the time.
CVS: Mostly Unicode
Changes in CVS that you should probably be aware of include:
- Bug #36869 (memory leak in output buffering when using chunked output) was fixed in HEAD and 5_1 [Tony]
- There are several new ext/mbstring functions in CVS HEAD: mb_list_mime_names(), mb_strstr(), mb_strrchr(), mb_stripos() and mb_strripos() [Seiji Masugata]
- ext/pdo bugs #35671 and PECL #6504 - caused by the fix for #35332 - were fixed in CVS HEAD (only) [Wez]
- Also in ext/pdo, bug #36342 (ODBC won't let you bind variables by buffer after "long" columns) was fixed in CVS HEAD and 5_1. The maximum length for column names was increased as part of the same patch. [Wez]
- Zend Engine bugs #36878 (error messages are printed even though an exception has been thrown) and #36897 (debug_print_backtrace() doesn't return void but array(0) {}) were fixed in the HEAD and 5_1 branches [Tony]
- A build issue and bug #36887 were fixed in PHP_4_4 branch (only) [Tony]
- Bug #36886 (User filters can leak buckets in some situations) was fixed in PHP_5_1 branch (only) [Ilia]
- In ext/mysqli, bug #36922 (missing MYSQLI_REPORT_STRICT constant in userspace) was fixed in CVS HEAD and 5_1 [Tony]
- ext/spl bug #36941 (ArrayIterator does not clone itself) was fixed in CVS HEAD and 5_1 [Marcus]
Derick Rethans queried the new additions to ext/mbstring, pointing out that the full Unicode support in CVS HEAD makes the extension obsolete there. Seiji replied that a) he couldn't add them into PHP_5_1 branch during the release process - but intends to following the 5.1.3 release - and b) not every application currently relying on mbstring functionality will immediately adapt to use Unicode when PHP 6 is released.
Meanwhile in CVS HEAD, Sara Golemon and Andrei both had another busy week. The
chief change Andrei made that the development team needs to be aware of is the
introduction of the U and S type specifiers in the
parameter parsing API. These are intended for use when a function wants to accept
only Unicode or binary strings (without type conversion).
Sara beavered away at her streams work, moving Unicode conversion to the filter layer. It is now possible to set encoding on a stream in userspace either by context:
$ctx = stream_context_create(NULL, array('encoding' => 'latin1'));
via stream_encoding():
$fp = fopen('somefile', 'r+');
or through a filter:
$fp = fopen('somefile', 'r+');
She also made php_stream_passthru() Unicode-friendly, which means
that the userspace functions fpassthru() and readfile()
are up to date. readfile()'s signature was altered slightly along the
way, and the optional boolean parameter use_include_path is now a
bitmask "flags" parameter, with the value of FILE_USE_INCLUDE_PATH
'coincidentally' set at a BC-friendly 1. Sara went on to apply
the same principle to file_get_contents(), again with that signature
change.
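Why a value of 1 keeps old callers working can be seen in a small sketch. The describeFlags() function is hypothetical and stands in for any function whose boolean parameter has become a bitmask; only the FILE_USE_INCLUDE_PATH constant comes from PHP itself.

```php
<?php
// Hypothetical function taking a bitmask where a boolean used to be.
function describeFlags($flags = 0)
{
    // Legacy callers pass TRUE; bitwise AND treats it as the integer 1,
    // which is the value of FILE_USE_INCLUDE_PATH.
    return ($flags & FILE_USE_INCLUDE_PATH)
        ? 'search include_path'
        : 'direct path only';
}

$legacy = describeFlags(true);                  // old boolean style
$modern = describeFlags(FILE_USE_INCLUDE_PATH); // new flag style
$plain  = describeFlags(0);                     // no flags set
```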
Following this, Sara waded deep into streams support, adding API hooks and the
new .ini setting unicode.filesystem_encoding - this to cope with
Unicode conversions of filename entries. Protocols other than straightforward
file:// can override the directive. She ended her week by rewriting a
handful of tests for ext/bz2, noting in her commit message that
'compression is just a binary thing... Write unicode and suffer my
wrath!'
Short version: Unicode-aware streams support is pretty much there.
PAT: Quiet week
Hannes Magnusson posted a patch [dead link] to make streams respect POSIX error retrieval functions. This would allow code like:
$fp = @fopen("/unwritabledirectory/filename", "w");
and would also fix bug #36868, which expresses the perceived need for it.
Wez reviewed the patch, and came back with the news that Hannes was probably
attempting to capture the POSIX errno value too late in the procedure.
In fact, he wrote, the failure code might not even be an errno;
getaddrinfo(), for example, returns a failure code outside the
errno "protocol", and many kinds of streams fail at a level that
doesn't even allow errno to be set.
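For comparison, userland code can recover the text of a suppressed stream warning without any errno at all. This is not Hannes' patch and it does not expose the POSIX error number; it is only a sketch that assumes a PHP version providing error_get_last(), and the path used is made up so that the open is guaranteed to fail.

```php
<?php
// A path that cannot be opened: its parent directory does not exist.
$path = sys_get_temp_dir() . '/no-such-directory-for-demo/filename';

$fp = @fopen($path, 'w');  // warning suppressed; fopen() returns false

// error_get_last() still holds the suppressed warning, as text only.
$err = ($fp === false) ? error_get_last() : null;
$message = ($err !== null) ? $err['message'] : '';
```

The message is whatever the streams layer chose to report, which fits Wez's point that not every stream failure maps onto an errno value.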
Short version: Streams stuff is more complicated than it looks.
