Zend Weekly Summaries Issue #321

TLK: About those hexdump manipulation functions…
TLK: Return value of convert_to_*
TLK: Dropping support for Windows 98/ME [cont]
BUG: Missing SSL warning
TLK: Not enough error messages
REQ: ext/fileinfo
CVS: ext/filter code review
PAT: is_numeric_string optimization in 5_2


TLK: About those hexdump manipulation functions…

Alexander Demin chased up a patch he submitted to internals@ "/node/view/id/1525#Heading7">a couple of weeks ago introducing
hexdump(), packhex(), smart_hexdump() and
smart_packhex() into the PHP_5_2 core. This time, he had a response.
Tony Dovgal asked Alexander to elaborate on the usage of the proposed functions, the
reason(s) he felt they should be in the core and not in PECL, and the reason(s) they
couldn’t be implemented in userspace PHP.

Alexander was quick to reply. He explained that hexdump manipulation is very
useful when it comes to debugging communication protocols, any algorithms with
binary data or any kind of logging. In fact, he’d found it so useful in the past
that he’d included his userland implementation into several of his own projects over
the last two years or so. Implementing the functions in C makes them faster, but -
more to the point – Alexander felt they belonged in the PHP core because the more
complex (and so less useful) pack() and unpack() functions
are there already.

Tony didn’t see a reason to add something into the core that could be easily
implemented in PHP itself, and was against adding every function that might be
useful for ‘a couple of persons using PHP‘, particularly given that scenario.
He suggested that Alexander might offer his userland code as a PEAR package instead.

Stas Malyshev agreed that it sounds like a useful addition to the PEAR
repository. He pointed out that, since the main use of hexdump functionality is in
debugging, performance isn’t really an issue. Alexander saw what the pair meant, and
agreed to alter his userland source to follow PEAR coding standards.

Short version: Interesting PEAR package incoming, assuming it survives
the PEPr process.


TLK: Return value of convert_to_*

Andrei Zmievski took issue with an apparently minor change made by Tony to the
internal functions convert_to_unicode() and
convert_to_string(). The commit message promised that it would
restore old behaviour when casting objects to strings and NULL‘. Andrei
felt that a failed object/string conversion really
should return FAILURE, as it had until very recently. Tony argued that,
when converting an objects to a string emits an E_RECOVERABLE_ERROR, it
should still be possible to do the conversion if there is a user-defined error
handler. He offered up the following code to demonstrate his point:


<?php

function "color: #0000BB">foo( "color: #0000BB">$errno, "color: #0000BB">$errstr,
$errfile "color: #007700">, "color: #0000BB">$errline) {
    
"color: #0000BB">var_dump "color: #007700">($errstr "color: #007700">);
}
set_error_handler "color: #007700">("foo" "color: #007700">);

$o "color: #007700">= new "color: #0000BB">stdclass "color: #007700">;
"color: #0000BB">var_dump "color: #007700">(settype "color: #007700">($o "color: #007700">, "color: #DD0000">"string" "color: #007700">));
"color: #0000BB">var_dump "color: #007700">($o "color: #007700">); // here we got a string,
not an object

?>

Andrei pointed out that the object $o would also be seen by
zend_parse_parameters() as a string, rather than as an object – was
that really Tony’s intention?

Following a long gap in list traffic, Andrei wrote to internals@ saying that he
and Tony had held ‘a spirited discussion‘ on IRC on the subject of
IS_OBJECT conversion. Tony felt that, since all that matters is the end
result, the function should return SUCCESS regardless of whether the
object could genuinely convert to a string or a forced conversion to the string
"Object" had occurred. Andrei did not share this point of view, writing
that the whole point of his adding return values in those functions in the first
place had been to distinguish between genuine and forced conversions.
zend_parse_parameters() really ought to know the difference, for a
start; and how did it make sense for users to get an
E_RECOVERABLE_ERROR on forced conversion but internal code be unable to
catch it?

Tony pointed out that a forced conversion doesn’t actually throw a recoverable
error, even in his scheme of events. A failed conversion does, though, just
before it bails out; to Tony, anything that doesn’t fail counts as a successful
conversion. He wrote that zend_parse_parameters() already throws an
E_NOTICE (via convert_to_string()) when a forced string
conversion occurs; he didn’t see a need for a new error there. Tony argued that
it’s perfectly legal (to pass an object) in 5.2‘, and he didn’t see any
reason to change that in PHP 6. It would be the equivalent of making that
E_RECOVERABLE_ERROR into a fatal E_ERROR.

Andrei wanted the conversion function to return FAILURE when the
E_RECOVERABLE_ERROR is triggered, which in turn would be used to
trigger an E_WARNING:foo() expects parameter 1 to be string,
object given
“. He didn’t consider the string "Object" a valid
conversion for this purpose, and argued that returning FAILURE from
convert_to_string() would simply alert the calling code to such a
conversion. Tony retorted that a failure in zend_parse_parameters()
would prevent the function itself from being executed; this would be a major change
in behaviour, and one he didn’t think desirable.

Short version: Is it just me, or are there some crossed wires
here?


TLK: Dropping support for Windows 98/ME [cont]

PHP user Richard Quadling was a little alarmed over the universal enthusiasm for
dropping of support for Windows 98 and ME. He wanted to know what the ramifications
might be for application developers. What exactly would not be supported any more,
once support is dropped? When code is removed, will the Changelog specify that that
code affected Windows 98/ME? Will it be possible to maintain compatibility using
userland PHP solutions?

Frank Kromann explained that PHP will now use win32 API calls that were not
implemented in those older versions of Windows. It will not be possible to run newer
versions of PHP on those systems, and this will affect both PHP 5 and PHP 6. He
added comfortingly that PHP built from source that pre-dates the decision to drop
support, will still work on the affected platforms.

Short version: PHP 5.2.0 ran under Windows 98/ME; PHP 5.2.1 most likely
will not.


BUG: Missing SSL warning

Long-time-missing Mehdi Achour, of the PHP Documentation Group, mailed the
internals list out of the blue. He wrote that he’d been looking at "http://bugs.php.net/37799">bug #37799, and believed there
should be ‘a little warning‘ upon SSL connection failure rather than the
current silent fallback to a standard connection. Along with the bug’s reporter,
Mehdi felt that this behaviour leads to a false sense of security. He’d tracked it
down to line 269 of ftp.c, which reads:


if (ftp->resp != 334) {
    ftp->use_ssl = 0;
}

unlike all the other SSL-related checks in ftp_login(), which either
return 0 or trigger an E_WARNING.

Pierre-Alain Joye responded. He wrote that he’d rather have
ftp_ssl_connect() simply return FALSE on failure as
currently; network connection failure is an expected error and so shouldn’t really
trigger a warning, in his opinion. The user should have the option of trying again
using plain ftp_connect().

Nuno Lopes agreed; functions in other PHP extensions return FALSE on
failure, so this would be a consistent approach.

Mehdi wrote that he agreed with them both in principle, but pointed out that
ftp_ssl_connect() isn’t actually the problem, since it isn’t in control
of the SSL connection. The silent switch to SSL is made at the point where
ftp_login() is called. If anything was going to return
FALSE, it should be ftp_login() – and that would make
perfect sense to him.

Short version (Thanks Nuno): Nice comeback, didou ;-) (But this still
needs fixing.)


TLK: Not enough error messages

Daniel Convissor had fallen across ‘a subtle bug‘ which had been very hard
to track down because PHP didn’t throw an error message. He thought it probably
should have. The code looked something like:


<?php

function "color: #0000BB">some_func() {}

$a "color: #007700">= "color: #0000BB">some_func();
if (
$a "color: #007700">[ "color: #DD0000">'do_something'] ==
true) {
    
// Do
it.
}

?>

some_func(), which was supposed to return an array, had actually
returned void, making the value of $a unexpectedly NULL.
Daniel felt there should have been some helpful ‘Undefined index’
E_NOTICE thrown at this point. Likewise, added Daniel:


$a
"color: #0000BB">12;
echo
$a "color: #007700">['k' "color: #007700">]

gives no error. Worse still,


$a
"color: #DD0000">'value';
echo
$a "color: #007700">['k' "color: #007700">] . "
"
"color: #007700">;

returns the wholly counter-intuitive and undocumented v.

What were the internals folks’ thoughts on these matters?

PHP user Jochem Maas pointed out that a more defensive coding style might have
prevented Daniel’s original problem. He could have forced or checked for an array,
called isset(), or any combination of the three – and probably should,
in future.

Sara Golemon pointed to the source code in
zend_fetch_dimension_address()responsible for Daniel’s headaches:


switch(Z_TYPE_P(container)) {
    /* ... */
    case IS_NULL: {
        /* for read-mode only */
        if (result) {
            result->var.ptr_ptr
= &EG(uninitialized_zval_ptr);
            PZVAL_LOCK(*result->var.ptr_ptr);

        }
        if (type==BP_VAR_W ||
type==BP_VAR_RW) {
            zend_error(E_WARNING,
"Cannot use a NULL value as an array");
        }
        break;
    }
    /* ... */
}

To Sara, it was clearly intentional that there is no error message when
attempting to access a NULL value using [] in a read-only
context. She didn’t necessarily see this as correct behaviour, but noted that
since BC trumps reason, this isn’t going to change‘.

Short version: Discretion is the better part of valour.


REQ: ext/fileinfo

PHP user Kevin ‘Shai’ Waterson wrote to internals@ complaining that, now
ext/mime_magic is deprecated, there is no way to determine a file’s MIME type
from the PHP core without using some ugly and non-portable hack like
exec('file -bi'..). Kevin felt that PHP users should have access to the
functionality without having to install PECL or PEAR packages, since that might not
be possible in a shared hosting environment.

Tony pointed out that ‘deprecated’ is not the same as ‘broken’; the extension is
still part of the PHP_5_2 core, and in fact had some TLC (read: maintenance work)
from him during this week. He listed some of the many extensions that were once part
of the PHP core and are now in PECL, and pointed out that PECL is now the
only way to use the functionality they represent. There are also a number of
useful extensions that were ‘born in PECL’ and will never move to core. Installing
an extension from PECL is no more complicated than installing a core extension for
those with administrative rights, and those without administrative rights
can’t install or enable a core extension anyway.

Kevin pointed out that being marked DEPRECATED doesn’t bode well for mime_magic’s
future as a core extension; at some point, it will be moved. He felt that, if the
extension is likely to go, its functionality should be replaced with ‘a simple core
function’ available on all platforms. When PECL becomes the only option, portability
will be lost, and along with it the only ‘clean’ way of determining the type of an
uploaded file. Tony pointed out that core extensions aren’t necessarily enabled on
all hosts in any case, and particularly not those with external dependencies. He
didn’t quite see how having the extension in PECL made it less portable, given this
situation.

PHP user Vlad Bosinceanu commented that it might be a good idea to put together a
list of recommended extensions for hosting providers briefly stating the
functionality offered by each module, its dependencies, runtime overhead and
openness to abuse. Although this wouldn’t fully solve the problem of core vs PECL,
it might encourage hosts to consider installing extensions other than those bundled
with the default PHP distribution. Tony agreed that such a list would be
appreciated, but doubted whether it would change anything.

Short version: The user base still need to be convinced over
PECL.


CVS: ext/filter code review

Changes in CVS that you should probably be aware of include:

  • Zend Engine bugs #39944 (References broken) and >#39825 (foreach produces memory error) were fixed
    [Dmitry Stogov]
  • Configuration bug
    #39890
    (using autoconf 2.6x and --with-layout=GNU breaks PEAR
    install path) was fixed across all current branches of PHP [Tony]
  • Also across all current branches, ext/pgsql "http://bugs.php.net/39971">bug #39971
    (pg_insert/pg_update do not allow now() to
    be used for timestamp fields) was fixed [Ilia Alshanetsky]
  • In ext/simplexml, bug #39760 (cloning fails on nested SimpleXML-Object)
    was fixed [Rob Richards]
  • Core bug #38542
    (proc_get_status() returns wrong PID on windows) was
    fixed [Nuno]
  • CGI SAPI bug #39984
    (redirect response code in header() could be ignored in CGI sapi) was
    fixed [Ilia]

In other CVS news, Dmitry completed and committed a full code review of
ext/filter in both the PHP_5_2 branch and CVS HEAD. He cleaned up a few
overflow/underflow bugs, added ‘more strict IP validation‘, and reimplemented
float validation in the extension.

Hannes Magnusson challenged a buffer boundary protection fix that Ilia had
applied across all current branches of PHP, in ext/imap. The code used a new
constant, SENDBUFLEN, which didn’t exist in Hannes’ c-client library.
Ilia went ahead and added a handler ‘for older c-client libs‘, again across
all live branches.

Meanwhile, in the snowy wastes of CVS HEAD, Tony busied himself with some of that
‘low-hanging fruit’. All the LOB functions in ext/oci8, most of
the ext/posix functions and the new system shutdown wrapper,
stream_socket_shutdown(), are now marked Unicode safe.

Tony also worked with Andrei to bring Unicode support to sscanf()
and fscanf(), which Andrei committed later in the week.

Short version: PostgreSQL users can use now() at long last.


PAT: is_numeric_string optimization in 5_2

Pierre committed a fix for a zlib configuration issue in ext/zip to
PHP_5_2 and CVS HEAD. The patch came from another old-timer, ‘judas dot
iscariote’.

All the hard work that Matt Wilmas put into optimizing
is_numeric_string() bore fruit when Ilia committed it into the PHP_5_2
branch this week. He also committed another, much smaller, Zend Engine patch from
Matt to ‘simplify and optimize code‘, again in the 5_2 branch only.

Christopher Jones’ ext/oci8 tests continued to come in, and were duly
committed into 5_2 and HEAD by Tony.

The final offering of the year came from jdolecek at netbsd dot org, fixing a
proc_open()/proc_close() leak "http://bugs.php.net/36427">that had been reported under
Windows. Nuno applied it in 5_2 and HEAD just a few hours before the bells began to
ring in the New Year.

Short version: Nothing new for the PAT directory as we roll into 2007. Happy new year!

Published: January 1st, 2007 at 12:00
Categories: Uncategorized
Tags: