Zend Weekly Summaries Issue #356


TLK: Safe mode
TLK: Namespaces and __autoload
TLK: What should be in 5.3?
TLK: Windows installer extensions
NEW: PHP 5.2.4
FIX: Magic methods and by-ref arguments
TLK: PHP 5.3 TODO
CVS: Life after 5.2.4
PAT: SQLite data type support

26th August – 1st September 2007

TLK: Safe mode

Systems administrator Mark Krenz had just read the PDM notes
from… oh, ages ago, when we were all young and Santa Claus was still real.
Mark was unhappy about the proposals for the future of safe_mode
and open_basedir, and wanted his voice to be heard.

In Mark’s experience, PHP is ‘a real pain in the ass to lock down
completely’. The only way to achieve it was by using modules like
mod_suphp that run scripts in a similar way to
suexec, but this wasn’t an option for every user on his system
because, wrote Mark, it affects simplified URLs. Application developers –
even those writing popular software – wouldn’t take responsibility for their
programs not working, and Mark saw the PHP core team’s attempts to regard
security as an Apache or OS issue as simply irresponsible. Despite the
vulnerabilities, in Mark’s view removing safe_mode could only
make things worse, and he felt that it should stay until the team could come
up with a 100% secure way to run PHP. His own security model was partly based
around safe_mode, alongside suexec and restrictive
permissions on home directories; take safe_mode away, and his
users would be able to run scripts that could read other users’ files. Were
there any plans for increasing rather than decreasing the security of PHP?
He’d hoped the team might jump on Apache 2.0’s ability to have modules run as
the user, but had seen nothing along these lines.

Deep breaths all round.

Stas Malyshev bravely insisted that PHP is just as secure as any other
scripting language, and that security measures belong at the level of the
operating system. The whole point of removing safe_mode was to
publicly acknowledge that solving the problem of security at language level
is not feasible; it had been a mistake to even try. Having a security
solution that works 90% of the time is no better than having none at all. As
for running multiple-user servers on a single Apache instance – sadly, that
had never really been supported.

Richard Lynch, Nate Gordon and Cristian Rodriguez also supported the removal
of the beast, and offered up various solutions. Cristian even referred Mark
to Ilia’s article on the subject of safe_mode. But Mark didn’t really
want to hear all this, and went unerringly for Stas’ throat. He rubbished the
OS line, arguing that anything that runs through CGI can utilize
suexec. Solutions like SELinux take completely the wrong
approach to the problem, since applying SELinux or AppArmor would mean
maintaining rules on a per-user basis. Having a secure server should be easy;
offering only complex solutions risks the situation where a majority of
sysadmins would leave the whole thing insecure rather than figure it all out.
Stas could hardly suggest that PHP might be too high-level to provide
security; it’s no more high-level than Perl. Mark believed the problems with
PHP stem from its being overly configurable; programming languages simply
shouldn’t be configurable, full stop. Finally, when it came to Apache
2.0, it seems that Mark had personally met one of the Apache development team
several years ago, and had been told at the time that Apache 2 modules would
support per-vhost user permissions. He’d specifically asked about the
possibility of PHP being able to utilize this, which had been confirmed. Mark
had been left with the impression that the PHP development team simply weren’t
interested.

Stas retorted that CGI/suexec actually is an OS approach
in his book, and there’s absolutely nothing to prevent PHP being run in that
way. He sympathized with the lack of good solutions for Mark’s situation, but
argued that this didn’t mean PHP itself should attempt to become that
solution. Perl doesn’t have a safe mode; it has a taint mode, which is a very
different thing, and indeed is something that PHP might well have in the
future. Finally, Stas referred Mark to the Apache 2.2 documentation, which
clearly states that the use of the User directive is no longer supported in
<VirtualHost> and recommends the use of suexec. Worse, the perchild MPM page
in the Apache 2.0 documentation adds “Do not use perchild unless you are a
programmer willing to help fix it”. The PHP development team would be very
interested indeed, if the thing worked – but it doesn’t.

Rasmus Lerdorf came up with a bit of history to add some colour to Stas’ bald
“it doesn’t work”, and
recommended that Mark use a layered approach to security. PHP isn’t the bad
guy here; other languages, including Perl, also have no way to protect
users from malicious users on the same server. Rasmus had written safe_mode
at a time when there was no real means of protection; he hadn’t thought it a
brilliant solution even at the time, but it became ever less effective as
PHP’s collection of third-party libraries grew. Given that some of those
libraries aren’t even open source and have no hooks that might allow PHP to
override file access, the team now consider safe_mode an
intractable problem.

Short version: It’s gone, and it ain’t coming back. Not no-how.

TLK: Namespaces and __autoload

Dmitry Stogov came up with a patch for review that implements
__autoload() support for namespaces, noting that the changes
touch on SPL code as well as the Zend Engine. The problem the patch aimed to
solve was that there is no way to know whether the exception thrown in:

namespace Foo;
throw new Exception;

is from the current namespace, or an internal PHP class. PHP’s initial lookup
was for the namespace version, Foo::Exception. This might well
call __autoload(), which could in turn emit an error or throw an
exception, leading to an unseemly crash. Dmitry’s solution was to provide an
additional Boolean argument to __autoload() that could be used
to confirm whether the loading class is actually required; if the value were
FALSE, no userland errors would be thrown. As an aside, Dmitry
noted that this is ‘the last semantic patch we have for namespaces’,
and syntax will be the next area of focus. Hopefully nobody noticed.

Stas immediately saw a problem with the proposed flag. For a start, every
call to an internal class would involve a call to __autoload()
followed by disk access, potentially more than once. The performance impact
didn’t bear thinking about. Besides, every library with autoloaders would
need to be rewritten to support the new mode. He proposed instead that, when
faced with an unresolved unqualified name, there should be a check to see
whether the class was known in that namespace at compile-time; if not, the
check should be repeated at run-time. If that also failed, the next check
should be for a known internal class of that name. Only if the third check
failed should there be an attempt to call __autoload(), and if
that failed, there should be an “undefined class” error. Admittedly
this was a more complex approach than that offered by Dmitry, but it had two
distinct advantages: it allowed resolution without additional file system
calls, and it didn’t require autoloader modification.
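
The lookup order Stas described can be sketched roughly as follows. This is only a userland simulation, not engine code: the function name resolve_unqualified(), the array-based class tables and the Boolean-returning autoloader are all assumptions made for illustration. It keeps the '::' separator from the 2007 proposals and assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical simulation of the proposed lookup order; this is not engine code.
// Class tables are plain arrays of names, and '::' is the separator used in the
// 2007 namespace proposals, not the backslash PHP eventually shipped with.
function resolve_unqualified(
    string $name,
    string $namespace,
    array $compileTimeClasses,
    array $runTimeClasses,
    array $internalClasses,
    callable $autoload
): string {
    $qualified = $namespace . '::' . $name;

    // 1. Is the class known in this namespace at compile time?
    if (in_array($qualified, $compileTimeClasses, true)) {
        return $qualified;
    }
    // 2. If not, repeat the check at run time.
    if (in_array($qualified, $runTimeClasses, true)) {
        return $qualified;
    }
    // 3. Next, look for a known internal class of that name.
    if (in_array($name, $internalClasses, true)) {
        return $name;
    }
    // 4. Only now call the autoloader, which reports whether it defined the class.
    if ($autoload($qualified)) {
        return $qualified;
    }
    // 5. Everything failed.
    throw new RuntimeException("undefined class $qualified");
}

$autoloadCalls = [];
$autoload = function (string $class) use (&$autoloadCalls): bool {
    $autoloadCalls[] = $class;
    return $class === 'Foo::Something'; // pretend only this file exists on disk
};

// The internal class wins at step 3, so the autoloader is never consulted.
echo resolve_unqualified('Exception', 'Foo', [], [], ['Exception'], $autoload), "\n";
// Unknown everywhere else, so step 4 finally calls the autoloader.
echo resolve_unqualified('Something', 'Foo', [], [], ['Exception'], $autoload), "\n";
```

In this sketch an internal name such as Exception resolves without any autoloader call or disk access, which is the performance advantage Stas was after. The price is that a namespaced class of the same name is only picked up if it is already known when the lookup happens.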

Dmitry promptly offered another patch implementing Stas’ ideas, but internals
regular Jochem Maas grumbled that both solutions had ‘a wtf factor’, in
that a class with a name matching an internal class name would behave
differently. He felt it would be better to simply make internal class names
illegal in namespaces.

Stas agreed that this was an option, but doubted that it was the best of the
bunch. Still, the matter was open to debate… Christian Schneider took him up on
the offer, and pointed out that making internal class names illegal in
namespaces would defeat the entire point of having namespaces. The only way
it made any sense was if the restriction only applied to internal classes in
the global namespace, and then only if the use of the global namespace for
internal classes is to be phased out. Even so, he’d prefer there to be no
restriction at all.

Greg Beaver’s post making the same points arrived mere moments later. He
followed up with a much longer email about Stas’ proposal, arguing that the
need to rewrite autoloaders is hardly an issue given that back compatibility
has already been broken by the introduction of namespace support in PHP.
Besides that, he saw flaws in the process outlined by Stas. For example, the
following code would throw an Exception, and not a
Foo::Exception as of yore:

<?php
//Foo/Exception.php

namespace Foo;

class Exception extends ::Exception {}

?>

<?php
//Foo/Something.php

namespace Foo;

function __autoload($class) {
    include str_replace('::', '/', $class) . '.php';
}

class Something
{
    function __construct($param) {
        if ($param == 3) {
            throw new Exception('oops');
        }
    }
}

$a = new Something(3);

?>


As Greg noted, ‘this would mean that all naming conflicts with internal
classes must have a different import or be fully qualified.’ He wondered
whether calling:

import Foo::Exception;


immediately after the namespace declaration in Foo/Something.php might
be a good way to get around the problem? Stas agreed that it might, but
pointed out that you could also do:

require 'Foo/Exception.php';


and achieve the same result. Although this wasn’t pretty, Stas felt that the
alternatives – of always using :: or else waving goodbye to
performance – were worse.

Short version: You have to feel for the documentation folk.

TLK: What should be in 5.3?

Lukas Smith, crystal ball firmly in hand, predicted that the PHP 5.3 release
would be ‘very important’ in terms of forward compatibility,
regardless of the future of the Unicode switch in PHP 6. He hoped to open a
debate on that issue, and also to clarify the development philosophy
regarding strictness. To be fair to Lukas, he also wrote that he had no
intention of re-opening elderly cans of worms. Some hope.

Apart from all that stuff, Lukas wanted the core development team and
hangers-on to check his wiki and see what is currently planned for both PHP 5.3 and PHP 6.0, with the aim of
bringing their direction a little closer.

Jani Taskinen wrote bluntly that in his opinion PHP 6.0 and PHP 5.3
should be identical apart from the Unicode support, ‘which is the only
thing PHP 6 is about.’ Marcus Börger simply added “MFH namespace” to
Lukas’ TODO list, noting that his own task list for PHP 5.3 will take some
time to work through.

Sebastian Bergmann put in a request for both the new garbage collector and
late static binding in 5.3. Andrei Zmievski promptly hauled him up over the
former: ‘Don’t you think that a far-reaching thing like a garbage
collector is best left until a major release?’ Sebastian argued weakly
that 5.3 is just a number, but Andi Gutmans was with Andrei on this, pointing
out that he’d yet to see a production quality implementation. Until that
happened, the garbage collector should remain a PHP 6 item.

Derick Rethans wondered if Andi had seen David Wang’s latest patch, which
appeared to him to have addressed all the issues that had been raised in the
past. Cristian Rodriguez popped his head above the parapet to claim that the
last time he’d tested the patch it had worked fine, and the reduction in
memory consumption had been ‘really big’. Jani explained to Cristian
that the problem was political rather than technical and that the GC would be
in 5.3 regardless. Andi replied carefully; in the Zend team’s experience, even
relatively small changes in the memory manager take a long time to stabilize.
Sebastian promptly argued that in that case, David’s implementation should go
immediately into CVS HEAD for testing.

Andi, realizing this was rapidly morphing into an ‘us vs them’ situation,
explained that he isn’t actually against the idea of having a garbage
collector; in fact he thought it would be very good for PHP. He’d even had a
go at implementing one himself in the past. The Zend team would simply like
the opportunity to review, test and benchmark David’s implementation, and
Andi asked David directly to send him his latest patch. Sebastian pointed out
that merging David’s code into CVS HEAD would expose it to a wider audience
for testing. Less confrontationally, he added that the patch would be smaller
– and the code review easier – if all David’s code to move the direct
manipulation of reference counters into macros could be committed, regardless
of the fate of the rest of the patch. Stas wondered aloud how putting code
into a branch that nobody uses in production could possibly expose it to the
kind of testing needed for memory management changes. He wasn’t
against GC either, but the emphasis on having the code in CVS HEAD made no
sense to him. Pierre-Alain Joye retorted that CVS HEAD is supposed to be the
development branch, and he couldn’t see how the core team could be
expected to achieve anything much without one.

Rasmus was more interested to know precisely what kind of code Cristian had
been testing; ‘It takes some very specific code for you to see noticeable
savings with this.’ Cristian confessed to having used ‘some very specific
code’, since without it there would have been no way to check the enhancement.
However, he had also run average-sized and larger applications under his test
copy of PHP to check for regression, and had detected no problems.

Somewhere in the middle of all that, Andi produced a lengthy response to
Lukas’ initial post (y’all remember Lukas?), with the emphasis on taking care
over scope and high risk items in PHP 5.3. The Zend team had produced a list,
separated into ‘must-haves, should-haves and nice-to-haves’. The
must-haves were the ICU extension, OpenSSL modifications for OpenID, dynamic
class access, the (binary) operator, FastCGI always on,
__callStatic() and friends, and the removal of the warning when
var is used. The should-haves included the Unicode extension,
late static binding, namespaces, pluggable per-request memory management, the
removal of (undocumented) string support in list(), and Gwynne
Raskind’s nowdocs (non-parsed heredocs). Nice-to-haves would be
cookie2 support, stat cache, mysqlnd, goto,
__construct in interfaces, compiled functions and classes, and
the ability to evaluate static expressions at compile-time.

Andrey Hristov of MySQL AB noted that mysqlnd is currently in beta and
undergoing stabilization, and added that anyone wanting to know about its
memory consumption and tuning should read Ulf Wendel’s blog entry on the
subject.

Short version: Garbage collection is high on the agenda for
many.

TLK: Windows installer extensions

Having followed John Mertic’s sporadic struggles to define which PECL
extensions should be included in the Windows installer for some weeks, Pierre
suggested gently that it might be best not to install any. John gratefully
agreed, and asked Edin Kadribasic to stop adding the PECL libraries to the
installer when building it. He noted ruefully that things would be much
easier all round if the PEAR command pecl install actually
worked on Windows, as in “links in with pecl4win and downloads working binaries
on request”. Maybe one day…

Short version: The PECL infrastructure strikes again.

NEW: PHP 5.2.4

Ilia Alshanetsky, as Release Manager for the PHP 5.2 series, was finally able
to announce the release of PHP 5.2.4:

Short version: It’s out. Roll on PHP 5.3!

FIX: Magic methods and by-ref arguments

Tony Dovgal had prepared patches for both CVS HEAD and the PHP_5_2 branch
that would prevent any magic method declaration accepting arguments by
reference. As he wrote, it makes no sense to pass by-ref arguments to a magic
method in the first place…

<?php

class test {
    function __set(&$name, $val) { }
}

$t = new test;
$name = "prop";
$t->$name = 1;

?>

currently throws a fatal error with a high wtf factor – “Couldn't execute
method test::__set in Unknown on line 0” – on reaching the assignment to
$t->$name, although Tony didn’t actually tell us that part. With his patch in
place, the declaration of __set(&$name, $val) wouldn’t even make it past the
parser before throwing the fatal error “Method test::__set() cannot take
arguments by reference in %s on line %d”.

Assuming there were no objections, Tony planned to commit his patches later
the same day.

Richard Quadling wondered about __call(), but Dmitry and Marcus
both gave the patches their approval on the grounds that they would
‘prevent stupid errors’. It seems that the Zend Engine always passes
arguments to magic methods by value, so they couldn’t be modified in any case.
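
To see why a by-reference declaration would buy nothing, here is a minimal sketch with the magic method declared the permitted way. The class name Recorder and its $seen property are invented for the illustration.

```php
<?php
// Illustrative class; the name Recorder and its $seen property are made up.
class Recorder
{
    public $seen = [];

    // Declared without &, as the patched engine requires.
    public function __set($name, $val)
    {
        $this->seen[] = $name; // $seen is declared, so this does not recurse into __set()
        $name = 'changed';     // only alters the method's local copy
    }
}

$r = new Recorder;
$name = 'prop';
$r->$name = 1;      // 'prop' is not a declared property, so __set() runs

echo $name, "\n";   // prop
```

The caller's $name is untouched after the assignment, because the method only ever received a copy of the property name.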

Stas didn’t see why the error should be fatal, not realizing that it already
was. Tony suggested he might run the example code, and deftly slipped his
patches into CVS.

Short version: It’s still a fatal error, but now it tells
you why before you even reach it.

TLK: PHP 5.3 TODO

Still wearing his Release Master hat, Ilia posted an important message to the
internals list:

Short version: FAO everyone who wants namespace support, a native
MySQL library and much, much more…

CVS: Life after 5.2.4

The changes in CVS listed here all came after the PHP 5.2.4 release:

  • The Boolean optional parameter to debug_backtrace()
    proposed last week,
    provide_object, went into both 5_2 and CVS HEAD [Sebastian]
  • PDO bug #42452 (PDO classes do
    not expose Reflection API information) was fixed in 5_2 only [Hannes
    Magnusson]
  • In ext/dom, bug #42462
    (Segmentation when trying to set an attribute in DOMElement) was
    fixed [Rob Richards]
  • In the SOAP extension, bugs #42326 (SoapServer crash),
    #42086 (SoapServer
    return Procedure '' not present for WSIBasic compliant wsdl) and #42359 (xsd:list type not
    parsed) were fixed [Dmitry]
  • CGI SAPI bug #42453 (CGI SAPI
    does not shut down cleanly with -i/-m/-v cmdline options) no
    longer exists [Dmitry]
  • In ext/pdo_oci, the attributes ATTR_SERVER_VERSION,
    ATTR_SERVER_INFO, ATTR_CLIENT_VERSION and
    ATTR_AUTOCOMMIT can now be accessed using
    $dbh->getAttribute [Christopher Jones]
  • The bundled pcrelib library was updated to PCRE 7.3 [Nuno Lopes]
  • Core bug #42512
    (ip2long('255.255.255.255') should return 4294967295 on 64-bit
    PHP) was fixed [Derick]
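
As a rough illustration of the ip2long() behaviour behind bug #42512, the following sketch assumes a current PHP build. It uses sprintf('%u') to get the unsigned value, so the printed result does not depend on the platform's integer width.

```php
<?php
$long = ip2long('255.255.255.255');

// %u formats the value as an unsigned integer, so the printed result is the
// same whether the platform integer is 32 or 64 bits wide.
$unsigned = sprintf('%u', $long);
echo $unsigned, "\n"; // 4294967295

if (PHP_INT_SIZE >= 8) {
    // On 64-bit builds the address fits in a native integer.
    var_dump($long === 4294967295);
}
```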

In other CVS news, Pierre had fun and games trying to fix GD library bug 106
(imagerectangle draws 1×1 rectangles as 1×3 rectangles). The fix
went into CVS HEAD alright, but his attempt to add it to the 5_2 branch drew
fire from Ilia and was quickly withdrawn. Pierre had been blissfully unaware
that there was a complete code freeze in operation at the time. He complained
in his reversion message that communication was lacking and besides, a single
week for a Release Candidate is ‘definitively too short’. He reiterated these
complaints in a follow-up mail, but was more concerned to make Ilia aware that
the libraries used to build the Windows distribution still hadn’t been
updated, and he had no way to do the job himself. If those relevant to GD
were updated at this point, there should be another RC just to test them. If
they weren’t updated, or if he couldn’t commit fixes once they were
updated, Pierre felt there were serious problems in the release process.

Something presumably occurred off-list to mend the rift; Pierre was allowed
to commit an alternative fix to accommodate ‘old’ versions of the libraries,
prior to the 5.2.4 release.

Jani left one of his more inscrutable messages lying about in Zend Engine’s
CVS repository: ‘Revert the revert: this is not causing any problems (or
we have lot bigger issues), the bug is elsewhere.’ A mystified Stas asked
politely whether Jani could explain what the problem had been, and how it had
been solved…

Short version: The process gets another hammering, but the fixes go on (eventually).

PAT: SQLite data type support

In a relatively quiet week on the patch front, Gwynne Raskind came up with a patch offering basic
support for storing typed integer and Boolean data in SQLite 3 databases. She
had a little rant about the lack of such support in PHP database driver
implementations generally, but neglected to supply any tests for its presence
along with her patch. Stas noticed the omission.

Multibyte king Rui Hirokawa noticed François Laupretre’s fix for bug #42396 at the end of last week,
which would prevent binary encoding being wrongly detected as Unicode when
--enable-zend-multibyte is switched on. Rui explained that it
was very wrong to call this behaviour a bug when it isn’t one; the behaviour
adheres to the specification, which rightly assumes that null bytes found in
a normal script indicate Unicode encoding. What François was really
after was a change to accommodate a new feature. Whether he would be prepared
to accommodate the accommodation, Rui didn’t actually say.

And finally, Johannes Schlüter slipped a patch from Etienne Kneuss into
the Zend Engine. This one never made an appearance on the list, at least
according to my archives. It makes it possible to use binary strings in PHP 6
method names.

Short version: Semantics, shemantics.