Zend Weekly Summaries Issue #316

TLK: Feature request I
TLK: Feature request II
TLK: .phps line numbering
TLK: Accessing SERVER_NAME
TLK: “Hooks for tracking file upload progress were introduced”
TLK: Over in PECL
CVS: ftp_put/ASCII bug fixed
PAT: pack/unpack revisited

TLK: Feature request I

Quiet week… So quiet, in fact, that the biggest thread in it came out of
Richard Lynch’s suggestion last week that Mathias Bank should supply a patch
introducing a LISP-like
macro keyword. Mathias – and a handful of others – assumed a) that
Richard had intended sarcasm rather than simply not known the ramifications of the
change, and b) that Richard is a core PHP developer rather than an interested user.
François Laupretre, being one of that handful, mentioned his proposal for a formal
RFC process several weeks ago and
the fact that it had had no reply. He wanted to know what Richard thought about it.
Richard, realizing the error, replied very carefully: ‘For an outsider, an RFC would
be a great mechanism. This will let us users see what future development is planned
without having to be part of the inner group’. He also felt that having a formalized process would
give the core developers space to get their work done, whereas the current mix of
IRC (invisible to a lot of people) and email (too difficult to track down) does not.
Even a ‘rigidly administrated forum’ would do; Richard suggested using phpBB
for this. François agreed happily with every word. It would allow outsiders to keep
track of events without having to resort to asking the development team and without
repeating previously discussed ideas. It would allow more detail, both in the RFC
itself and in the comments on it, whereas currently ideas are filtered according to
the ease with which they can be understood. It would force people to think things
through before submitting an RFC. It would enable a wider audience to join the
discussion. Best of all, it would allow a real voting process.

In short, François was all for it, to the extent that he offered to implement,
house and administer the forum. Naturally, nobody on the development team responded.
Why naturally? – perhaps because informality, problematic tho’ it be at times,
allows far greater flexibility, openness to new ideas and new blood, and easier
access to ‘the top guys’, than any other option might – particularly an option
rigidly administrated by someone who doesn’t understand, and has little sympathy
with, the current process.

Short version: A small prize is offered to the first non-core dev to
guess which of the listed ‘bonuses’ would actually be good.

TLK: Feature request II

Richard continued on his solo mission to keep me in employment by commenting on
another ancient request – the one for foo()[0] syntax. He was decidedly against
the idea; he could see people using it to write unintelligible code like


echo {$$(foo()[4])}()[17][13];

which it might become his job to maintain. Envisaging this hazardous
future, Richard decided he’d ‘rather flip burgers’. Mathias pointed out that
maintainable code was the responsibility of the programmer rather than the
responsibility of the language; he rather liked the possibilities offered by the
syntax, and could see uses for it in helper classes among other things. Someone
named Konstantin Käfer was for it too; he’d like to be able to write
parse_url($url)['host'] instead of having to store the return value in
a new variable every time. Core developer Hannes Magnusson cracked at this point and
introduced Konstantin to the useful parse_url() function that has been
in the PHP core forever.

Meanwhile Richard was struggling to get his head around Stas Malyshev’s comment
that an assignment to the returned value would be meaningless unless it was returned
by reference. Surely if foo() returned a string, assigning a letter to
foo()[0] would alter the first letter in that string? That would be
what he expected to happen. He’d no idea, though, whether it was altering the
original string or a copy. Perhaps it would be better to make it read only, but he
personally wouldn’t have expected that if ()[] were supported… Fellow
PHP user Rick Widmer broke the soliloquy with a note saying he was having a hard
time understanding the thread. (He wasn’t alone.) He liked the idea of
foo()[0] returning the first element of an array returned by
foo(), and thought f(){0} should do the same thing if the
return value were a string, but thought assigning to that return value should
trigger a syntax error, plain and simple. Functions, by their nature, return
values. ‘You can’t assign to a function, so why should you be able to assign to
an element of it?’ asked Rick. Jasper Bryant-Greene explained that
foo()[0] refers to the first element of the array after it has been
returned by foo(), not to the first element of foo(). Writing
foo()[0] = 'a'; was therefore perfectly
valid code. However, for the assignment to work, that returned array would need to
be either returned by reference or else be an object with array-access syntax.
Personally, Jasper didn’t see this as a problem.
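
Stas’s point about copies versus references is easier to see in code. The sketch below is only an analogy in C, not PHP internals: get_copy() and get_ref() are hypothetical helpers, the first returning a struct by value (a copy, roughly like an ordinary PHP return) and the second returning a pointer into the original storage (roughly like return-by-reference).

```c
/* Hypothetical storage standing in for "the original string". */
struct word {
    char text[8];
};

static struct word original = { "hello" };

/* Returns a copy of the data, like a function returning by value. */
struct word get_copy(void)
{
    return original;
}

/* Returns a pointer to the original data, loosely like return-by-reference. */
char *get_ref(void)
{
    return original.text;
}

/* Writes to a copy, then reports the first letter of the original. */
char first_after_copy_write(void)
{
    struct word w = get_copy();
    w.text[0] = 'J';        /* only the local copy changes */
    return original.text[0];
}

/* Writes through the returned pointer, then reports the first letter. */
char first_after_ref_write(void)
{
    get_ref()[0] = 'J';     /* indexes straight into the returned value */
    return original.text[0];
}
```

Writing to the copy leaves the original untouched, which is why an assignment to a by-value result would be thrown away; writing through the pointer changes the original, which is the case Jasper described.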

Short version: The discussion that mattered was over a week ago. The
verdict was, ‘too confusing’.

TLK: .phps line numbering

PHP user Kevin Waterson reminded internals@ readers of a discussion held back in March, about the inclusion of a
line numbering patch for .phps files in PHP 5.2. He wrote that most people
thought the idea was a good one at the time, and the patch didn’t impact on anything
else. Was anybody willing to commit it, or could there at least be a vote on the
matter?

Sean Coates, evidently having read the archives, explained that the patch hadn’t
been applied because ‘there is no reliable way to number lines and maintain
copy-and-paste’. Kevin offered ‘a simple CSS solution’:


<html>
<head>
<style>
.numberline{
  display: inline;
  float: left;
  width: 15px;
}
.code{
  display: inline;
}
</style>
</head>

<body>
<div class="numberline">1<br />2<br />3<br />4<br />5<br /></div>
<div class="code">
&lt;?php<br />
while(1){<br />
echo "foo";<br />
?&gt;<br />
}<br />
</div>
</body>
</html>

But Sean countered with:


<div class="numberline">1<br />2<br />3<br />4<br />5<br />6<br />7<br />8<br /></div>
<div class="code">
&lt;?php<br />
// here's a really long line that will mess up the line numbers, so what you call "not rocket science" is actually quite difficult, even though you fail to believe this is true.<br />
while(1){<br />
echo "foo";<br />
// note that this is _actually line #5<br />
// oops.. the last line doesn't have a number <br />
?&gt;<br />
}<br />
</div>

Core developer Johannes Schlüter, who authored the actual patch, wrote to the
list with an overview of the current situation. The approach he’d originally used
worked across almost all browsers; only the ‘rarely used’ Firefox – and
friends – spurned it. Under earlier versions of Firefox, a # appeared
before every line during the copy/paste operation. Under later versions, the number
itself was added. In his own opinion, this wasn’t a major problem; his
implementation simply adds one (or more, he wasn’t sure) optional parameter(s) to
the highlight_*() functions rather than altering the .phps
output (see comments dating back forever). Its only impact would be on those who
had actively requested line numbering.
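
For readers wondering what a layout that survives Sean’s long-line objection might look like, here is a minimal sketch in C. It is not the approach taken in Johannes’s patch, and render_numbered() is a hypothetical helper, not a PHP function. It assumes a two-cell table: all the numbers go in one pre element and the escaped source in another. Since pre does not wrap long lines by default, each number should stay level with its line as long as both cells share a font and line height. Whether a browser lets you select the code cell without the numbers is something you would have to check per browser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Appends text to out (capacity cap, current length *len).
 * Returns 0 on success, -1 if the text does not fit. */
static int append(char *out, size_t cap, size_t *len, const char *text)
{
    size_t n = strlen(text);
    if (*len + n + 1 > cap)
        return -1;
    memcpy(out + *len, text, n + 1);    /* copies the terminating NUL too */
    *len += n;
    return 0;
}

/* Counts source lines; a last line without a trailing newline still counts. */
static unsigned count_lines(const char *src)
{
    unsigned lines = 0;
    size_t i, n = strlen(src);
    for (i = 0; i < n; i++)
        if (src[i] == '\n')
            lines++;
    if (n > 0 && src[n - 1] != '\n')
        lines++;
    return lines;
}

/* Hypothetical helper: renders src as a two-cell table, numbers on the left
 * and HTML-escaped code on the right. Returns 0 on success, -1 on overflow. */
int render_numbered(const char *src, char *out, size_t cap)
{
    size_t len = 0, i;
    unsigned line, lines;
    char tmp[16];

    assert(src != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    lines = count_lines(src);

    /* The newline after <pre> is the one HTML parsers drop, so the
     * content's own first line break is never swallowed. */
    if (append(out, cap, &len, "<table><tr><td><pre>\n") < 0)
        return -1;
    for (line = 1; line <= lines; line++) {
        snprintf(tmp, sizeof tmp, "%u\n", line);
        if (append(out, cap, &len, tmp) < 0)
            return -1;
    }
    if (append(out, cap, &len, "</pre></td><td><pre>\n") < 0)
        return -1;
    for (i = 0; src[i] != '\0'; i++) {
        const char *piece;
        char one[2] = { src[i], '\0' };
        switch (src[i]) {
        case '<': piece = "&lt;";  break;
        case '>': piece = "&gt;";  break;
        case '&': piece = "&amp;"; break;
        default:  piece = one;     break;
        }
        if (append(out, cap, &len, piece) < 0)
            return -1;
    }
    return append(out, cap, &len, "</pre></td></tr></table>");
}
```

The caller supplies the output buffer, so an over-long source is reported as -1 rather than silently truncated.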

In a quiet week, when the majority of the core team appeared to be engaged in
other work, Kevin remained the only adoring fan for that patch. The chance of its
going into CVS HEAD (it was never likely for the PHP_5_2 branch) slipped by once
more.

Short version: Maybe next March?

TLK: Accessing SERVER_NAME

Internals newbie Glenn Richmond wanted to access
$_SERVER['SERVER_NAME'] from within the PHP source, as part of a tweak
to protect against inappropriate directory access in a shared environment without
resorting to safe_mode.


if (zend_hash_find(PG(http_globals)[TRACK_VARS_SERVER]->value.ht, "SERVER_NAME",
        sizeof("SERVER_NAME"), (void **) &server_name) != FAILURE) {
    // Got server name
    php_error_docref(NULL TSRMLS_CC, E_WARNING, "Base directory: %s, %s, %s",
        basedir, path, server_name);
    return 0;
} else {
    // Unable to find server name
    php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to find server_name hash.");
    return 0;
}

The value reported as SERVER_NAME was “lEÃ???Ã??Ã?·Ã??Ã?¨EÃ???Ã??Ã?·_EÃ???Ã??Ã?·“, and
he had no idea why… right up to the moment of pressing ‘send’, when he presumably
re-read what he’d just written and realized he was printing out a hash. He posted a
working version of his code for the benefit of future newbies:


zval **server_name = NULL;
char* lookup_server_name;

if (strcmp(basedir, "VIRTUAL_DOCUMENT_ROOT") == 0) {
    if (!PG(http_globals)[TRACK_VARS_SERVER] ||
        zend_hash_find(PG(http_globals)[TRACK_VARS_SERVER]->value.ht, "SERVER_NAME",
            sizeof("SERVER_NAME"), (void **) &server_name) == FAILURE) {
        // Unable to find server name
        php_error_docref(NULL TSRMLS_CC, E_WARNING,
            "SERVER_NAME variable is not set, cannot determine server name.");
        return 0;
    } else {
        // Convert the hash to a string
        convert_to_string_ex(server_name);

        // Convert the hash value to a char* for C processing
        lookup_server_name = estrndup(Z_STRVAL_PP(server_name), Z_STRLEN_PP(server_name));
        // Non-zero result, error occurred
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Base directory: %s, %s, %s",
            basedir, path, lookup_server_name);
        return 0;
    }
}

Pierre-Alain Joye wrote to let Glenn know there’s an easier and cleaner way to
write this, using the HASH_OF macro:


if (zend_hash_find(HASH_OF(PG(http_globals)[TRACK_VARS_SERVER]), "SERVER_NAME",
        sizeof("SERVER_NAME"), (void **) &server_name) == SUCCESS) {
    ...
}

Short version: If you use this code, don’t forget to ignore the
comments and return values.

TLK: “Hooks for tracking file upload progress were introduced”

Stickman – a PHP user with, one assumes, a difficult childhood behind him –
wanted to know about the ‘hooks’ that were introduced in PHP 5.2.0.
He couldn’t find anything about them, even using
get_defined_functions(). Could someone please give him some
pointers?

Rasmus Lerdorf obliged, explaining that these are C-level hooks – they haven’t
been exposed to userspace PHP yet, except in pecl/apc. Not for the first
time, Rasmus linked to his demo and the code behind it, and explained that
eventually there will be more extensions that manage server-side storage and
have support for the hooks.

Stickman thanked Rasmus for the reply, but wrote that he’d found a patch on
Christian ‘Chregu’ Stocker’s blog that makes use of the hooks. It wasn’t quite
‘official’ enough for him to use, though. (See: Week 311, ‘Upload progress (yet
again)’.) Nick Loeve wrote to say that he maintains an extension that will scan a
file on upload using ClamAV, which is available online. However, Nick failed to
point out that the reason his extension isn’t in PECL is that ClamAV itself is
released under the GPL.

Short version: That one-line announcement has caused a disproportionate
amount of user confusion.

TLK: Over in PECL

OK, so it’s too quiet – time for a quick look at the week’s PECL releases,
although it has to be said that there wasn’t a great deal of activity there
either.

First off the starting line was Dave Renshaw’s SAM (Simple Asynchronous
Messaging) package, now at version 0.2.0 (beta). The extension allows PHP
applications to send and receive messages to/from a number of different middleware
message and queuing systems.

Kellen Bombardier was next, releasing the stable PDO_IBM extension with the
very respectable version number 1.0.0. This PDO driver supports the IBM DB2
Universal Database, IBM Cloudscape and Apache Derby databases.

Also from IBM, Caroline Maynard rolled a stable release of SCA_SDO 1.1.0. This is the
first release of the SDO package under the new project name, which itself springs
from the fact that the project now implements Service Component Architecture
(the SCA part of the new name). Caroline’s release notes explained
that the ‘stable’ part refers to the SDO component rather than to the alpha-quality
and experimental SCA part.

Short version: SCA is likely to be more useful than it sounds. Check it
out.

CVS: ftp_put/ASCII bug fixed

Changes in CVS that you should probably be aware of include:

  • Core bugs #39576
    (array_walk() doesn’t separate userdata zval) and
    #39575
    (move_uploaded_file() no longer working (safe mode related)) were
    fixed [Tony Dovgal]
  • Zend Engine bug #39602
    (Invalid session.save_handler crashes PHP) was fixed [Dmitry
    Stogov]
  • Streams bug #39551
    (Segfault with stream_bucket_new in user filter) was fixed in CVS
    HEAD only [Tony]
  • Core bug #39548
    (ZMSG_LOG_SCRIPT_NAME not routed to OutputDebugString()
    on Windows) was fixed [Dmitry]
  • In ext/ftp, bug
    #39583
    (ftp_put() does not change transfer mode to ASCII) was
    fixed across all three current branches of PHP [Tony]
  • Core bug #39621
    (str_replace() is not binary safe on strings with equal length) was
    fixed [Tony]
  • phpinfo() output now includes a meta tag to prevent search
    engines indexing the page, in all current branches of PHP [Ilia Alshanetsky]

Up in the rarified atmosphere of CVS HEAD, Marcus Börger added a new optional
parameter, use_keys, to the SPL function
iterator_to_array(). He credited Kevin (Waterson, I presume) with the
idea. Meanwhile Tony fixed a memory leak in zend_register_functions(),
which wouldn’t be news if it weren’t for his bemused note that ‘the Zend memory
manager said nothing about it’.

Andrei Zmievski ploughed grimly on through ext/standard, providing Unicode
support for highlight_string() (isn’t this round two?),
import_request_variables(), ftok(),
get_html_translation_table() and the tick functions. Tony later added
CG(literal_type) initialization to fix the backticks operator.

Short version: Googling for phpinfo() generated pages is no longer an
option for hackers seeking open servers.

PAT: pack/unpack revisited

Tony reported that one of Ilia’s tests for his
pack()/unpack() fix last week failed under AMD64, although
it worked fine under i386 systems. He posted a diff to show Ilia the problem, but
Ilia replied that the diff had identical values on the same lines tagged as
‘different’. David Soria Parra, whose patch for the same bug had passed everyone by,
intervened to say that there was definitely a bug there. To avoid it, you’d need to
simulate an overflow if the size of long was greater than the requested
size. He attached a patch to his mail that would do this, as well as a few other
bits and pieces that sailed way over my head. David directly asked Tony to test his
patch this time. Tony did so, and agreed that the patch indeed fixed the problem,
but asked David not to use C++ comments in C code. David meekly posted a cleaned-up
version. Ilia, meanwhile, had taken a look at David’s solution, and argued that it
was incorrect to fake an overflow in all instances because ‘some pack formats are
machine size, and in those instances 64-bit and 32-bit results should differ’.
David replied that a programmer packing a 32-bit integer would expect
the overflow; in fact, if PHP internally represented the value as an integer on
64-bit machines rather than as a long, the overflow would appear anyway… Ilia
argued that programmers generally pack an integer without specifying the bit size,
and that PHP doesn’t fake an overflow of 64-bit integers to simulate 32-bit behaviour
anywhere else. He didn’t see why pack() should be an exception. David
argued that in the pack options N is always declared as 32-bit – so why
should it behave as 64-bit? That was the only reason he’d attempted to make it
behave in this way; if the existing behaviour was correct, then the manual page
needed altering so that it didn’t promise a 32-bit integer.

Pierre had in the meantime looked at David’s other outstanding patch, for
fgetcsv(), and applied it with some modifications, fixing bug #39538.

Zoe Slattery sent Tony another new test, this time for the core
levenshtein() function. Tony added it to PHP_5_2 branch and CVS
HEAD.

Matt Wilmas meanwhile had been busy fiddling with his by now massive speedup for
zend_u_strtod(), and posted the latest version online for Tony to evaluate.

One Mathieu Carbonneaux had developed a new implementation of the FastCGI module,
and told the list all about it in highly original prose. The only snag in his
implementation was that the FastCGI
SAPI would need to be modified to handle static and index files, but he didn’t see
that as a problem since it would only affect that SAPI. How could he add his
contribution to the PHP trunk?

Andi didn’t like the sound of that solution, and wrote that only PHP requests
should go to PHP’s own FastCGI – which does in fact work, including remotely. Static
content should be redirected to a Web server. He didn’t see any reason to add
Mathieu’s patch into PHP itself, but wondered whether Mathieu had considered sharing
it with the Apache httpd development team? Mathieu pointed out, among other things,
that the Apache FastCGI license makes it difficult for the community to modify the
source; besides, the last update to the project was made in 2004. That was why he’d
developed his own implementation. It transpired that he’d already tried to contact
the development team there and had no reply. Mathieu went on to say that
modifications in a scripting language would necessarily be embedded in the language
itself, and he’d developed his FastCGI SAPI modification for PHP with this in mind.
Without it, the reverse proxy FastCGI implementation didn’t work correctly.

Andi still didn’t see why PHP should serve static files, and gave several
different existing solutions for serving them. James Aylett agreed, and wrote that
the niche for serving static files out of the FastCGI process would be where more
security than a single Web server was needed, but the project wasn’t big enough to
need to manage dynamic and static content separately. He wasn’t convinced this was a
big niche, though. He agreed with Andi; while a better FastCGI implementation
would be good, there was no need for it to serve static files.

The silent but productive Andy Wharmby fixed three PHP bugs this week; one in the
Zend Engine, bug #39534
(Error in maths to calculate of ZEND_MM_ALIGNED_MIN_HEADER_SIZE),
applied by Dmitry; one in ext/imap, bug #39613 (Possible segfault in imap initialization due to
missing module dependency), applied by Tony; and bug #39623 (thread safety fixes on *nix for
putenv() & mime_magic), applied by Ilia.

Short version: Nothing new for the PAT directory this week.