TLK: About those hexdump manipulation functions...
TLK: Return value of convert_to_*
TLK: Dropping support for Windows 98/ME [cont]
BUG: Missing SSL warning
TLK: Not enough error messages
REQ: ext/fileinfo
CVS: ext/filter code review
PAT: is_numeric_string optimization in 5_2
TLK: About those hexdump manipulation functions...
Alexander Demin chased up a patch he submitted to internals@ a couple of weeks ago introducing
hexdump(), packhex(), smart_hexdump() and
smart_packhex() into the PHP_5_2 core. This time, he had a response.
Tony Dovgal asked Alexander to elaborate on the usage of the proposed functions, the
reason(s) he felt they should be in the core and not in PECL, and the reason(s) they
couldn't be implemented in userspace PHP.
Alexander was quick to reply. He explained that hexdump manipulation is very
useful when it comes to debugging communication protocols, any algorithms with
binary data or any kind of logging. In fact, he'd found it so useful in the past
that he'd included his userland implementation into several of his own projects over
the last two years or so. Implementing the functions in C makes them faster, but -
more to the point - Alexander felt they belonged in the PHP core because the more
complex (and so less useful) pack() and unpack() functions
are there already.
Tony didn't see a reason to add something into the core that could be easily implemented in PHP itself, and was against adding every function that might be useful for 'a couple of persons using PHP', particularly given that scenario. He suggested that Alexander might offer his userland code as a PEAR package instead.
Stas Malyshev agreed that it sounds like a useful addition to the PEAR repository. He pointed out that, since the main use of hexdump functionality is in debugging, performance isn't really an issue. Alexander saw what the pair meant, and agreed to alter his userland source to follow PEAR coding standards.
Short version: Interesting PEAR package incoming, assuming it survives the PEPr process.
TLK: Return value of convert_to_*
Andrei Zmievski took issue with an apparently minor change made by Tony to the
internal functions convert_to_unicode() and
convert_to_string(). The commit message promised that it would
'restore old behaviour when casting objects to strings and NULL'. Andrei
felt that a failed object/string conversion really
should return FAILURE, as it had until very recently. Tony argued that,
when converting an objects to a string emits an E_RECOVERABLE_ERROR, it
should still be possible to do the conversion if there is a user-defined error
handler. He offered up the following code to demonstrate his point:
<?php
|
Andrei pointed out that the object $o would also be seen by
zend_parse_parameters() as a string, rather than as an object - was
that really Tony's intention?
Following a long gap in list traffic, Andrei wrote to internals@ saying that he
and Tony had held 'a spirited discussion' on IRC on the subject of
IS_OBJECT conversion. Tony felt that, since all that matters is the end
result, the function should return SUCCESS regardless of whether the
object could genuinely convert to a string or a forced conversion to the string
"Object" had occurred. Andrei did not share this point of view, writing
that the whole point of his adding return values in those functions in the first
place had been to distinguish between genuine and forced conversions.
zend_parse_parameters() really ought to know the difference, for a
start; and how did it make sense for users to get an
E_RECOVERABLE_ERROR on forced conversion but internal code be unable to
catch it?
Tony pointed out that a forced conversion doesn't actually throw a recoverable
error, even in his scheme of events. A failed conversion does, though, just
before it bails out; to Tony, anything that doesn't fail counts as a successful
conversion. He wrote that zend_parse_parameters() already throws an
E_NOTICE (via convert_to_string()) when a forced string
conversion occurs; he didn't see a need for a new error there. Tony argued that
'it's perfectly legal (to pass an object) in 5.2', and he didn't see any
reason to change that in PHP 6. It would be the equivalent of making that
E_RECOVERABLE_ERROR into a fatal E_ERROR.
Andrei wanted the conversion function to return FAILURE when the
E_RECOVERABLE_ERROR is triggered, which in turn would be used to
trigger an E_WARNING: "foo() expects parameter 1 to be string,
object given". He didn't consider the string "Object" a valid
conversion for this purpose, and argued that returning FAILURE from
convert_to_string() would simply alert the calling code to such a
conversion. Tony retorted that a failure in zend_parse_parameters()
would prevent the function itself from being executed; this would be a major change
in behaviour, and one he didn't think desirable.
Short version: Is it just me, or are there some crossed wires here?
TLK: Dropping support for Windows 98/ME [cont]
PHP user Richard Quadling was a little alarmed over the universal enthusiasm for dropping of support for Windows 98 and ME. He wanted to know what the ramifications might be for application developers. What exactly would not be supported any more, once support is dropped? When code is removed, will the Changelog specify that that code affected Windows 98/ME? Will it be possible to maintain compatibility using userland PHP solutions?
Frank Kromann explained that PHP will now use win32 API calls that were not implemented in those older versions of Windows. It will not be possible to run newer versions of PHP on those systems, and this will affect both PHP 5 and PHP 6. He added comfortingly that PHP built from source that pre-dates the decision to drop support, will still work on the affected platforms.
Short version: PHP 5.2.0 ran under Windows 98/ME; PHP 5.2.1 most likely will not.
BUG: Missing SSL warning
Long-time-missing Mehdi Achour, of the PHP Documentation Group, mailed the internals list out of the blue. He wrote that he'd been looking at bug #37799, and believed there should be 'a little warning' upon SSL connection failure rather than the current silent fallback to a standard connection. Along with the bug's reporter, Mehdi felt that this behaviour leads to a false sense of security. He'd tracked it down to line 269 of ftp.c, which reads:
if (ftp->resp != 334) {
|
unlike all the other SSL-related checks in ftp_login(), which either
return 0 or trigger an E_WARNING.
Pierre-Alain Joye responded. He wrote that he'd rather have
ftp_ssl_connect() simply return FALSE on failure as
currently; network connection failure is an expected error and so shouldn't really
trigger a warning, in his opinion. The user should have the option of trying again
using plain ftp_connect().
Nuno Lopes agreed; functions in other PHP extensions return FALSE on
failure, so this would be a consistent approach.
Mehdi wrote that he agreed with them both in principle, but pointed out that
ftp_ssl_connect() isn't actually the problem, since it isn't in control
of the SSL connection. The silent switch to SSL is made at the point where
ftp_login() is called. If anything was going to return
FALSE, it should be ftp_login() - and that would make
perfect sense to him.
Short version (Thanks Nuno): Nice comeback, didou ;-) (But this still needs fixing.)
TLK: Not enough error messages
Daniel Convissor had fallen across 'a subtle bug' which had been very hard to track down because PHP didn't throw an error message. He thought it probably should have. The code looked something like:
<?php
|
some_func(), which was supposed to return an array, had actually
returned void, making the value of $a unexpectedly NULL.
Daniel felt there should have been some helpful 'Undefined index'
E_NOTICE thrown at this point. Likewise, added Daniel:
$a
= 12;
|
gives no error. Worse still,
$a
= 'value';
|
returns the wholly counter-intuitive and undocumented v.
What were the internals folks' thoughts on these matters?
PHP user Jochem Maas pointed out that a more defensive coding style might have
prevented Daniel's original problem. He could have forced or checked for an array,
called isset(), or any combination of the three - and probably should,
in future.
Sara Golemon pointed to the source code in
zend_fetch_dimension_address()responsible for Daniel's headaches:
switch(Z_TYPE_P(container)) {
|
To Sara, it was clearly intentional that there is no error message when
attempting to access a NULL value using [] in a read-only
context. She didn't necessarily see this as correct behaviour, but noted that
'since BC trumps reason, this isn't going to change'.
Short version: Discretion is the better part of valour.
REQ: ext/fileinfo
PHP user Kevin 'Shai' Waterson wrote to internals@ complaining that, now
ext/mime_magic is deprecated, there is no way to determine a file's MIME type
from the PHP core without using some ugly and non-portable hack like
exec('file -bi'..). Kevin felt that PHP users should have access to the
functionality without having to install PECL or PEAR packages, since that might not
be possible in a shared hosting environment.
Tony pointed out that 'deprecated' is not the same as 'broken'; the extension is still part of the PHP_5_2 core, and in fact had some TLC (read: maintenance work) from him during this week. He listed some of the many extensions that were once part of the PHP core and are now in PECL, and pointed out that PECL is now the only way to use the functionality they represent. There are also a number of useful extensions that were 'born in PECL' and will never move to core. Installing an extension from PECL is no more complicated than installing a core extension for those with administrative rights, and those without administrative rights can't install or enable a core extension anyway.
Kevin pointed out that being marked DEPRECATED doesn't bode well for mime_magic's future as a core extension; at some point, it will be moved. He felt that, if the extension is likely to go, its functionality should be replaced with 'a simple core function' available on all platforms. When PECL becomes the only option, portability will be lost, and along with it the only 'clean' way of determining the type of an uploaded file. Tony pointed out that core extensions aren't necessarily enabled on all hosts in any case, and particularly not those with external dependencies. He didn't quite see how having the extension in PECL made it less portable, given this situation.
PHP user Vlad Bosinceanu commented that it might be a good idea to put together a list of recommended extensions for hosting providers briefly stating the functionality offered by each module, its dependencies, runtime overhead and openness to abuse. Although this wouldn't fully solve the problem of core vs PECL, it might encourage hosts to consider installing extensions other than those bundled with the default PHP distribution. Tony agreed that such a list would be appreciated, but doubted whether it would change anything.
Short version: The user base still need to be convinced over PECL.
CVS: ext/filter code review
Changes in CVS that you should probably be aware of include:
- Zend Engine bugs #39944 (References broken) and #39825 (
foreachproduces memory error) were fixed [Dmitry Stogov] - Configuration bug
#39890 (using autoconf 2.6x and
--with-layout=GNUbreaks PEAR install path) was fixed across all current branches of PHP [Tony] - Also across all current branches, ext/pgsql bug #39971
(
pg_insert/pg_updatedo not allownow()to be used for timestamp fields) was fixed [Ilia Alshanetsky] - In ext/simplexml, bug #39760 (cloning fails on nested
SimpleXML-Object) was fixed [Rob Richards] - Core bug #38542
(
proc_get_status()returns wrongPIDon windows) was fixed [Nuno] - CGI SAPI bug #39984
(redirect response code in
header()could be ignored in CGI sapi) was fixed [Ilia]
In other CVS news, Dmitry completed and committed a full code review of ext/filter in both the PHP_5_2 branch and CVS HEAD. He cleaned up a few overflow/underflow bugs, added 'more strict IP validation', and reimplemented float validation in the extension.
Hannes Magnusson challenged a buffer boundary protection fix that Ilia had
applied across all current branches of PHP, in ext/imap. The code used a new
constant, SENDBUFLEN, which didn't exist in Hannes' c-client library.
Ilia went ahead and added a handler 'for older c-client libs', again across
all live branches.
Meanwhile, in the snowy wastes of CVS HEAD, Tony busied himself with some of that
'low-hanging fruit'. All the LOB functions in ext/oci8, most of
the ext/posix functions and the new system shutdown wrapper,
stream_socket_shutdown(), are now marked Unicode safe.
Tony also worked with Andrei to bring Unicode support to sscanf()
and fscanf(), which Andrei committed later in the week.
Short version: PostgreSQL users can use now() at long last.
PAT: is_numeric_string optimization in 5_2
Pierre committed a fix for a zlib configuration issue in ext/zip to PHP_5_2 and CVS HEAD. The patch came from another old-timer, 'judas dot iscariote'.
All the hard work that Matt Wilmas put into optimizing
is_numeric_string() bore fruit when Ilia committed it into the PHP_5_2
branch this week. He also committed another, much smaller, Zend Engine patch from
Matt to 'simplify and optimize code', again in the 5_2 branch only.
Christopher Jones' ext/oci8 tests continued to come in, and were duly committed into 5_2 and HEAD by Tony.
The final offering of the year came from jdolecek at netbsd dot org, fixing a
proc_open()/proc_close() leak that had been reported under
Windows. Nuno applied it in 5_2 and HEAD just a few hours before the bells began to
ring in the New Year.
Short version: Nothing new for the PAT directory as we roll into 2007. Happy new year!


Comments (Login to leave comments)