TLK: About those hexdump manipulation functions...
TLK: Return value of convert_to_*
TLK: Dropping support for Windows 98/ME [cont]
BUG: Missing SSL warning
TLK: Not enough error messages
REQ: ext/fileinfo
CVS: ext/filter code review
PAT: is_numeric_string optimization in 5_2

TLK: About those hexdump manipulation functions...

Alexander Demin chased up a patch he submitted to internals@ a couple of weeks ago introducing hexdump(), packhex(), smart_hexdump() and smart_packhex() into the PHP_5_2 core. This time, he had a response. Tony Dovgal asked Alexander to elaborate on the usage of the proposed functions, the reason(s) he felt they should be in the core and not in PECL, and the reason(s) they couldn't be implemented in userspace PHP.

Alexander was quick to reply. He explained that hexdump manipulation is very useful when it comes to debugging communication protocols, any algorithms with binary data or any kind of logging. In fact, he'd found it so useful in the past that he'd included his userland implementation into several of his own projects over the last two years or so. Implementing the functions in C makes them faster, but - more to the point - Alexander felt they belonged in the PHP core because the more complex (and so less useful) pack() and unpack() functions are there already.

Tony didn't see a reason to add something into the core that could be easily implemented in PHP itself, and was against adding every function that might be useful for 'a couple of persons using PHP', particularly given that scenario. He suggested that Alexander might offer his userland code as a PEAR package instead.

Stas Malyshev agreed that it sounds like a useful addition to the PEAR repository. He pointed out that, since the main use of hexdump functionality is in debugging, performance isn't really an issue. Alexander saw what the pair meant, and agreed to alter his userland source to follow PEAR coding standards.

Short version: Interesting PEAR package incoming, assuming it survives the PEPr process.

TLK: Return value of convert_to_*

Andrei Zmievski took issue with an apparently minor change made by Tony to the internal functions convert_to_unicode() and convert_to_string(). The commit message promised that it would 'restore old behaviour when casting objects to strings and NULL'. Andrei felt that a failed object/string conversion really should return FAILURE, as it had until very recently. Tony argued that, when converting an objects to a string emits an E_RECOVERABLE_ERROR, it should still be possible to do the conversion if there is a user-defined error handler. He offered up the following code to demonstrate his point:

<?php

function foo($errno, $errstr, $errfile, $errline) {
    
var_dump($errstr);
}
set_error_handler("foo");

$o = new stdclass;
var_dump(settype($o, "string"));
var_dump($o); // here we got a string, not an object

?>

Andrei pointed out that the object $o would also be seen by zend_parse_parameters() as a string, rather than as an object - was that really Tony's intention?

Following a long gap in list traffic, Andrei wrote to internals@ saying that he and Tony had held 'a spirited discussion' on IRC on the subject of IS_OBJECT conversion. Tony felt that, since all that matters is the end result, the function should return SUCCESS regardless of whether the object could genuinely convert to a string or a forced conversion to the string "Object" had occurred. Andrei did not share this point of view, writing that the whole point of his adding return values in those functions in the first place had been to distinguish between genuine and forced conversions. zend_parse_parameters() really ought to know the difference, for a start; and how did it make sense for users to get an E_RECOVERABLE_ERROR on forced conversion but internal code be unable to catch it?

Tony pointed out that a forced conversion doesn't actually throw a recoverable error, even in his scheme of events. A failed conversion does, though, just before it bails out; to Tony, anything that doesn't fail counts as a successful conversion. He wrote that zend_parse_parameters() already throws an E_NOTICE (via convert_to_string()) when a forced string conversion occurs; he didn't see a need for a new error there. Tony argued that 'it's perfectly legal (to pass an object) in 5.2', and he didn't see any reason to change that in PHP 6. It would be the equivalent of making that E_RECOVERABLE_ERROR into a fatal E_ERROR.

Andrei wanted the conversion function to return FAILURE when the E_RECOVERABLE_ERROR is triggered, which in turn would be used to trigger an E_WARNING: "foo() expects parameter 1 to be string, object given". He didn't consider the string "Object" a valid conversion for this purpose, and argued that returning FAILURE from convert_to_string() would simply alert the calling code to such a conversion. Tony retorted that a failure in zend_parse_parameters() would prevent the function itself from being executed; this would be a major change in behaviour, and one he didn't think desirable.

Short version: Is it just me, or are there some crossed wires here?

TLK: Dropping support for Windows 98/ME [cont]

PHP user Richard Quadling was a little alarmed over the universal enthusiasm for dropping of support for Windows 98 and ME. He wanted to know what the ramifications might be for application developers. What exactly would not be supported any more, once support is dropped? When code is removed, will the Changelog specify that that code affected Windows 98/ME? Will it be possible to maintain compatibility using userland PHP solutions?

Frank Kromann explained that PHP will now use win32 API calls that were not implemented in those older versions of Windows. It will not be possible to run newer versions of PHP on those systems, and this will affect both PHP 5 and PHP 6. He added comfortingly that PHP built from source that pre-dates the decision to drop support, will still work on the affected platforms.

Short version: PHP 5.2.0 ran under Windows 98/ME; PHP 5.2.1 most likely will not.

BUG: Missing SSL warning

Long-time-missing Mehdi Achour, of the PHP Documentation Group, mailed the internals list out of the blue. He wrote that he'd been looking at bug #37799, and believed there should be 'a little warning' upon SSL connection failure rather than the current silent fallback to a standard connection. Along with the bug's reporter, Mehdi felt that this behaviour leads to a false sense of security. He'd tracked it down to line 269 of ftp.c, which reads:

if (ftp->resp != 334) {
    ftp->use_ssl = 0;
}

unlike all the other SSL-related checks in ftp_login(), which either return 0 or trigger an E_WARNING.

Pierre-Alain Joye responded. He wrote that he'd rather have ftp_ssl_connect() simply return FALSE on failure as currently; network connection failure is an expected error and so shouldn't really trigger a warning, in his opinion. The user should have the option of trying again using plain ftp_connect().

Nuno Lopes agreed; functions in other PHP extensions return FALSE on failure, so this would be a consistent approach.

Mehdi wrote that he agreed with them both in principle, but pointed out that ftp_ssl_connect() isn't actually the problem, since it isn't in control of the SSL connection. The silent switch to SSL is made at the point where ftp_login() is called. If anything was going to return FALSE, it should be ftp_login() - and that would make perfect sense to him.

Short version (Thanks Nuno): Nice comeback, didou ;-) (But this still needs fixing.)

TLK: Not enough error messages

Daniel Convissor had fallen across 'a subtle bug' which had been very hard to track down because PHP didn't throw an error message. He thought it probably should have. The code looked something like:

<?php

function some_func() {}

$a = some_func();
if (
$a['do_something'] == true) {
    
// Do it.
}

?>

some_func(), which was supposed to return an array, had actually returned void, making the value of $a unexpectedly NULL. Daniel felt there should have been some helpful 'Undefined index' E_NOTICE thrown at this point. Likewise, added Daniel:

$a 12;
echo
$a['k']

gives no error. Worse still,

$a 'value';
echo
$a['k'] . " ";

returns the wholly counter-intuitive and undocumented v.

What were the internals folks' thoughts on these matters?

PHP user Jochem Maas pointed out that a more defensive coding style might have prevented Daniel's original problem. He could have forced or checked for an array, called isset(), or any combination of the three - and probably should, in future.

Sara Golemon pointed to the source code in zend_fetch_dimension_address()responsible for Daniel's headaches:

switch(Z_TYPE_P(container)) {
    /* ... */
    case IS_NULL: {
        /* for read-mode only */
        if (result) {
            result->var.ptr_ptr = &EG(uninitialized_zval_ptr);
            PZVAL_LOCK(*result->var.ptr_ptr);
        }
        if (type==BP_VAR_W || type==BP_VAR_RW) {
            zend_error(E_WARNING, "Cannot use a NULL value as an array");
        }
        break;
    }
    /* ... */
}

To Sara, it was clearly intentional that there is no error message when attempting to access a NULL value using [] in a read-only context. She didn't necessarily see this as correct behaviour, but noted that 'since BC trumps reason, this isn't going to change'.

Short version: Discretion is the better part of valour.

REQ: ext/fileinfo

PHP user Kevin 'Shai' Waterson wrote to internals@ complaining that, now ext/mime_magic is deprecated, there is no way to determine a file's MIME type from the PHP core without using some ugly and non-portable hack like exec('file -bi'..). Kevin felt that PHP users should have access to the functionality without having to install PECL or PEAR packages, since that might not be possible in a shared hosting environment.

Tony pointed out that 'deprecated' is not the same as 'broken'; the extension is still part of the PHP_5_2 core, and in fact had some TLC (read: maintenance work) from him during this week. He listed some of the many extensions that were once part of the PHP core and are now in PECL, and pointed out that PECL is now the only way to use the functionality they represent. There are also a number of useful extensions that were 'born in PECL' and will never move to core. Installing an extension from PECL is no more complicated than installing a core extension for those with administrative rights, and those without administrative rights can't install or enable a core extension anyway.

Kevin pointed out that being marked DEPRECATED doesn't bode well for mime_magic's future as a core extension; at some point, it will be moved. He felt that, if the extension is likely to go, its functionality should be replaced with 'a simple core function' available on all platforms. When PECL becomes the only option, portability will be lost, and along with it the only 'clean' way of determining the type of an uploaded file. Tony pointed out that core extensions aren't necessarily enabled on all hosts in any case, and particularly not those with external dependencies. He didn't quite see how having the extension in PECL made it less portable, given this situation.

PHP user Vlad Bosinceanu commented that it might be a good idea to put together a list of recommended extensions for hosting providers briefly stating the functionality offered by each module, its dependencies, runtime overhead and openness to abuse. Although this wouldn't fully solve the problem of core vs PECL, it might encourage hosts to consider installing extensions other than those bundled with the default PHP distribution. Tony agreed that such a list would be appreciated, but doubted whether it would change anything.

Short version: The user base still need to be convinced over PECL.

CVS: ext/filter code review

Changes in CVS that you should probably be aware of include:

  • Zend Engine bugs #39944 (References broken) and #39825 (foreach produces memory error) were fixed [Dmitry Stogov]
  • Configuration bug #39890 (using autoconf 2.6x and --with-layout=GNU breaks PEAR install path) was fixed across all current branches of PHP [Tony]
  • Also across all current branches, ext/pgsql bug #39971 (pg_insert/pg_update do not allow now() to be used for timestamp fields) was fixed [Ilia Alshanetsky]
  • In ext/simplexml, bug #39760 (cloning fails on nested SimpleXML-Object) was fixed [Rob Richards]
  • Core bug #38542 (proc_get_status() returns wrong PID on windows) was fixed [Nuno]
  • CGI SAPI bug #39984 (redirect response code in header() could be ignored in CGI sapi) was fixed [Ilia]

In other CVS news, Dmitry completed and committed a full code review of ext/filter in both the PHP_5_2 branch and CVS HEAD. He cleaned up a few overflow/underflow bugs, added 'more strict IP validation', and reimplemented float validation in the extension.

Hannes Magnusson challenged a buffer boundary protection fix that Ilia had applied across all current branches of PHP, in ext/imap. The code used a new constant, SENDBUFLEN, which didn't exist in Hannes' c-client library. Ilia went ahead and added a handler 'for older c-client libs', again across all live branches.

Meanwhile, in the snowy wastes of CVS HEAD, Tony busied himself with some of that 'low-hanging fruit'. All the LOB functions in ext/oci8, most of the ext/posix functions and the new system shutdown wrapper, stream_socket_shutdown(), are now marked Unicode safe.

Tony also worked with Andrei to bring Unicode support to sscanf() and fscanf(), which Andrei committed later in the week.

Short version: PostgreSQL users can use now() at long last.

PAT: is_numeric_string optimization in 5_2

Pierre committed a fix for a zlib configuration issue in ext/zip to PHP_5_2 and CVS HEAD. The patch came from another old-timer, 'judas dot iscariote'.

All the hard work that Matt Wilmas put into optimizing is_numeric_string() bore fruit when Ilia committed it into the PHP_5_2 branch this week. He also committed another, much smaller, Zend Engine patch from Matt to 'simplify and optimize code', again in the 5_2 branch only.

Christopher Jones' ext/oci8 tests continued to come in, and were duly committed into 5_2 and HEAD by Tony.

The final offering of the year came from jdolecek at netbsd dot org, fixing a proc_open()/proc_close() leak that had been reported under Windows. Nuno applied it in 5_2 and HEAD just a few hours before the bells began to ring in the New Year.

Short version: Nothing new for the PAT directory as we roll into 2007. Happy new year!