Zend Weekly Summaries Issue #333
TLK: Anonymous functions
TLK: GSoC – dbobj [continued]
REQ: getpass
TLK: GSoC 07 [continued]
TLK: More GSoC 07…
BUG: IMAP/GSSAPI auth failure
BUG: String BC break
TLK: Dealing with the old stuff
CVS: LDAP maintenance era
PAT: Black box free zone
TLK: Anonymous functions
Wez Furlong posted a proposal and
patch (http://pastebin.ca/400952) for anonymous functions in PHP.
This, he wrote, would allow users to do things like:
Wez felt strongly that pulling the behaviour into the core would be
preferable to ‘the travesty that is create_function()’, and explained
that the changes needed to do so are minor. His approach was to have an
anonymous function declaration, used as an expression,
evaluate to the (generated) name of the anonymous function, so that
assigning such an expression to $foo
would set $foo to a string such as __zend_anon_1,
which can then be passed
around internally as a callback name. This is similar to the way
create_function() works, but with the advantage that ‘you
don’t need to use crazy quoting’ to declare complex functions. Wez noted
that his current patch wasn’t perfect (and in fact found other minor problems
during testing). However, his real question was whether anonymous functions
should be part of PHP at all.
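To illustrate the quoting point, here is a minimal sketch written in the anonymous function syntax that PHP adopted later, in version 5.3. It is not Wez's patch: in PHP 5.3 and later the expression yields a Closure object rather than a generated function name, and the $words array is invented for illustration.

```php
<?php
// Hypothetical sample data, not taken from the original thread.
$words = array('pear', 'fig', 'banana');

// The comparison callback is written inline as ordinary PHP code,
// so none of the string quoting needed by create_function() applies.
usort($words, function ($a, $b) {
    return strlen($a) - strlen($b);
});

echo implode(', ', $words), "\n";
```

The callback body is checked by the parser like any other code, which is the readability gain the proposal was after.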
Jan Lehnardt, Jim Wilson, David Zülke and Gwynne, Daughter of the Code
(no really) all backed the proposal, prompting a request from Tony Dovgal for
the enthusiasts to write tests for the patch. Stas Malyshev was less certain.
He wondered what would happen if the body of the anonymous function used a
variable, $rev, that was defined outside it.
Stas believed it would be quite hard to make that work in PHP, because
$rev would be in the wrong scope. And, in a second variation on the same
code, would this maintain a correct value in $rev? Wez explained
that ‘doing anything fancy with scoping’ would a) complicate the
implementation and b) mean explicitly referencing the global scope in order
to break out of the function scope. In summary, ‘it would be cool if the
lexical scope was inherited, but maybe not cool enough to warrant making it
work’. Stas argued that the difference was that inheriting lexical scope
would make it a closure. Without that, he simply saw the patch as ‘a nice
way to save a couple of keystrokes’, and warned that people coming to PHP
from other languages might expect it to act like a closure because of the way it
looked.
Marcus Börger, Sebastian Bergmann, Jacob Santos and Christian Schneider
backed the initial proposal. Christian added that bringing closures into PHP
wasn’t a good idea in his own opinion; he regarded them as having ‘a high
WTF factor’, citing his own experience with closures and
this in Javascript as a cautionary tale. That said, he wondered
whether generating an object method rather than a function wouldn’t resolve
the scoping issue?
Jon Parise took the time to review the actual patch, and came up with
a couple of suggestions (http://news.php.net/php.internals/28426)
for improving it.
Sean Coates couldn’t see a way for PHP to offer full closure support, given
that – unlike Javascript – there is no way in PHP to access the parent scope
unless it happens to be the global scope. That said, he liked the syntax Wez
proposed for anonymous function declaration, although he had some queries
about how it would tokenize. If it went the way he believed it would – with
the anonymous function tokenized as T_FUNCTION rather than as
T_STRING – it should be possible to compile it at compile time
rather than at runtime. Stas agreed that this is a major advantage of the
proposal, since it allows the function to be cached.
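Sean's tokenizing question can be explored on a current PHP build with the tokenizer extension. The sketch below assumes a PHP version that already supports anonymous functions (5.3 or later), so it shows today's lexer rather than the 2007 patch, and the sample source string is made up.

```php
<?php
// Tokenize a snippet that uses the anonymous function syntax and collect
// the names of the tokens the lexer produces (needs ext/tokenizer).
$source = '<?php $f = function () { return 1; };';

$names = array();
foreach (token_get_all($source) as $token) {
    // Multi-character tokens arrive as arrays of (id, text, line);
    // single characters such as "=" or ";" arrive as plain strings.
    if (is_array($token)) {
        $names[] = token_name($token[0]);
    }
}

echo implode(' ', $names), "\n";
```

The function keyword shows up as T_FUNCTION, which is consistent with the declaration being handled when the script is compiled.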
Thinking it over, though, Sean recognized that create_function()
is most often used to create dynamic functions, and sent the discussion back
to scoping, noting that variables used by such functions aren’t necessarily
declared in global scope. Wez had been thinking about ways to get around this
too, and suggested a keyword to mark variables used in dynamic functions so
that they could be inherited from the lexical scope in which the function was
defined, using a variable named $ver in his example.
However, Wez – admitting unfamiliarity with Zend Engine internals – remained
uncertain what would happen if the function were called after the hash table
representing its scope had been destroyed. He wondered whether it could be
solved by storing a reference to $ver when the function is
bound, but wasn’t sure of the implications of this and suspected that a
realistic solution would be far more complex.
Sean liked the idea of using a keyword to grab scope. He went on to explain
how closures work in Javascript; the user-defined function maintains access
to the parent scope even after the parent would normally have been destroyed.
That said, functions in PHP are fundamentally different to functions in
Javascript, which are objects that have access to variables from all parent
scopes… Stas intervened to point out that variables aren’t interpreted by
the compiler in any case. Wez suggested that the compiler could make a list
of variable names to import and store them in the zend_function
struct, allowing the variable reference to be treated in the same way (he
believed) as a global variable. Stas explained that global references are
actually created at runtime, and binding to scope couldn’t work in this way.
That said, adding binding capabilities to DECLARE_FUNCTION could
work, although it wasn’t clear what would happen in the case of a loop – which
in itself would be difficult to deal with at compile time. Stas also wondered
how variable values might be added to the function symbol table at runtime.
Would they be references, or would they be copies?
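Stas's closing question, references or copies, can be made concrete with the closure syntax that later shipped in PHP 5.3, where the programmer chooses explicitly. This is a sketch of that later syntax, not of anything proposed in the thread, and the function name capture_demo is hypothetical.

```php
<?php
// Hypothetical helper showing both capture modes side by side.
function capture_demo()
{
    $n = 1;

    // Captured by value: the closure keeps the value $n had at creation.
    $byCopy = function () use ($n) {
        return $n;
    };

    // Captured by reference: the closure sees later changes to $n.
    $byRef = function () use (&$n) {
        return $n;
    };

    $n = 2;

    return array($byCopy(), $byRef());
}

list($copy, $ref) = capture_demo();
echo "copy=$copy ref=$ref\n";
```

The copy keeps the value 1 while the reference reports 2, which is exactly the difference the question was probing.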
In what had been a sub-thread, Robert Cummings asked whether it wouldn’t be
reasonable to assume that the relevant scope was the immediate parent of the
function scope. Stas explained that, although the parent scope is known at
compile time, functions are actually called during runtime; there is no way
for PHP to ‘know’ the scope of variables passed to, say, usort()
with regard to the scope of the anonymous function calling it. Besides,
storing the function name as a variable could easily move the scope away from
the original declaration. Robert wrote that he hadn’t been arguing for the
preservation of a variable value at the point of function creation; he simply
wanted that value to reflect whatever is defined in the parent scope. Sean
explained exactly how horrible this would be when it came to debugging
somebody else’s code, but admitted that a single layer of scope – something
like $_PARENT – might be useful.
Richard Lynch held a torch for the introduction of metadata
(__FILE__ and friends) for anonymous functions. Wez gently
pointed out that his patch actually offers that.
Andi Gutmans suggested compiling the anonymous function itself at compile
time with placeholders, i.e. the actual closure would be created at runtime.
He proposed a new global variable, $_SCOPE['var'], that would
reference the current $var during runtime.
Regular variables would be treated as they currently are in
create_function(). Andi noted that ‘fix-up time’ – when
$_SCOPE is populated – would be faster than compiling the code.
Stas replied that this more or less was what Wez had already proposed, and
went on to outline how it could be achieved. However, he wasn’t certain that
bringing full closure support to PHP would be a good thing, pointing out that
it could lead to very messy code. Lukas Smith agreed; he felt that polluting
the global namespace for something not intended for re-use would be a bad
move, and suggested artificial limitations for closure usage. Andi pointed
out that the namespace would remain polluted, limitations or no limitations.
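Wez intervened to say that the $_SCOPE idea wouldn’t quite work
out for anonymous functions declared inside a loop and collected in an array
such as $funcs,
assuming that $_SCOPE would take a copy of the loop variable $i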
during fix-up. Since the function would only be compiled once, there would
only be one place to store the variable values for $_SCOPE, and
this would leave the result of calling any of the functions listed in
$funcs undefined. Wez could see two ways to resolve this. The
first was to generate a unique function name as an alias at each iteration
through the loop and use that as a key to access the appropriate stored
value(s) using something like get_scope_vars(). The second way
was to have a first-class callable type that would store the information in
the return value from the function declaration. It would also store a pointer
to the op_array and a hash table used to initialize the local
scope, based on the information in $_SCOPE. Wez had a personal
preference for the second solution, but noted that it would require a lot of
work. It would be less invasive to have a callable class type store the
information.
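The loop problem, and the way a per-declaration callable value avoids it, can be sketched with the Closure objects of PHP 5.3 and later. This is an illustration under that assumption rather than Wez's code; each evaluation of the declaration produces its own object carrying its own captured values.

```php
<?php
$funcs = array();

for ($i = 0; $i < 3; $i++) {
    // Each pass creates a separate Closure object holding its own copy of $i.
    $funcs[] = function () use ($i) {
        return $i;
    };
}

$results = array();
foreach ($funcs as $f) {
    $results[] = $f();
}

echo implode(', ', $results), "\n";
```

Because the captured values live in the callable itself rather than in the single compiled function, the three calls return 0, 1 and 2 instead of an undefined result.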
Andi had ideas of his own, ‘including some funky parameter passing
games’, but wanted to bring the discussion back to whether the feature
was actually wanted in PHP in the first place, and if so, to what extent?
Andrei Zmievski, arriving late at the ball, supported the initial proposal
but added that he’d been wanting a true first-class callable type in
PHP for some time. Stas was less certain; he thought the only way to
implement it would be to explicitly declare the imported variables. Besides,
why would anyone need a callable type? Wez explained; the only other option
would be to store closure information in the op_array, which
wasn’t a good idea because there would be no way to know when it could be
freed – it would have to be stored for the lifetime of the request. Stas
agreed that this was a big argument in its favour, but pointed out that a new
type would add complexity to the language and tools; it would also require
modifications wherever callbacks are used. He proposed a middle way: the
callback type could be modified to accept array($object, $name, $arguments), where
$arguments are the captured closed
variables. That way, no changes would be required anywhere in the source that
closure variables aren’t needed. There’d still need to be a decision made over
the scoping of closures created in a class context…
A couple of people – “boots” of Smarty fame and PHP user Jim Wilson – wrote
that they didn’t need closures in PHP anyway, just the anonymous function
syntax Wez had proposed in the first place.
Short version: Closures are a different thing altogether.
TLK: GSoC – dbobj [continued]
There was much interest in the native ORM project that Ádám
Bankó proposed for GSoC. He posted a link to his
project homepage (http://dbobj.sourceforge.net/bzr/dbobj/), noting
that his test applications currently act as stand-ins for documentation. He
also warned anyone downloading the code that the current version is so
unstable that it will probably segfault; he’d only just finished modularizing
the database layer. He would, however, like to receive crash reports -
preferably with some debugging information.
Propel user Tony Bibbs agreed with
Lukas that only selected bottleneck areas of PHP code should be ported to C.
He felt the only thing worth considering would be a native extension capable
of collecting the metadata required by an ORM. Lukas argued that DB
abstraction of any kind should stay in userland. Getting it to work usually
meant a bit of hackery, so putting it into C would create ‘a maintenance
nightmare’.
Stas took a couple of Ádám’s assertions about the advantages of C with
a pinch of salt. Ádám explained; there is, for example, no way for a
PHP class to handle certain accesses
where MyClass::i doesn’t exist and the access should be relayed to
some magic function like __set(). He also didn’t know how well
__get() and __set() behave with references. Using C
would allow him to do things like cache the mapping configuration in memory,
and it would mean he didn’t need to worry about the performance cost of large
structures.
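The interaction between __get(), __set() and references is easy to trip over in userland. The sketch below is not taken from dbobj; the Record class and its property names are hypothetical, and it only shows one known pattern, returning by reference from __get() so that nested writes reach the stored data.

```php
<?php
// Hypothetical class for illustration; the names are not from dbobj.
class Record
{
    private $data = array();

    // Called when an inaccessible property is assigned directly.
    public function __set($name, $value)
    {
        $this->data[$name] = $value;
    }

    // Returning by reference lets callers modify nested values in place.
    public function &__get($name)
    {
        if (!array_key_exists($name, $this->data)) {
            $this->data[$name] = null;
        }
        return $this->data[$name];
    }
}

$r = new Record();
$r->tags = array('php');   // routed through __set()
$r->tags[] = 'orm';        // routed through __get(), appended via the reference

echo implode(', ', $r->tags), "\n";
```

Without the ampersand on __get(), the append would act on a temporary copy and PHP would report an indirect modification notice, which is the kind of behaviour that makes these hooks awkward for an ORM base class.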
Doctrine fan Guilherme Blanco,
in the throes of an ORM implementation himself, posted a link to
an IBM white paper about the persistence layer
(http://www.ambysoft.com/essays/persistenceLayer.html). He didn’t see how a compiled ORM tool
would have any great advantage over one written in PHP, but was all for the
idea of having a bundled ORM tool. Ádám wrote that, among other
things, having what amounts to a bundled base class written in C (but
extensible in PHP) would help make it standard. Jacob Santos agreed that this
would be the optimal solution, but Lukas argued again that it was a question
of maintainability. A tool to reverse engineer a database schema, for
example, really shouldn’t be written in C because it needed a low entry
barrier; users should be able to quickly fix anomalies on encountering a new
or obscure RDBMS. Ádám agreed entirely on this point; he didn’t
consider database schema discovery a core ORM feature. In fact, his current
implementation supplies a callback hook that allows a PHP script to pull out
this data from a multitude of sources.
Andrey Hristov suggested writing a reference implementation in PHP and
porting it to C as appropriate, citing Marcus’ initial SPL implementation as
a template for this. Lukas could see this working, as it would make it easier
to figure out what should or shouldn’t be written in C. Ádám was less
certain – he didn’t like the idea of doing everything twice over – but Jacob
agreed that this approach enables speedy extension development. The important
thing was to achieve community consensus over a standard API. Ádám,
having put a lot of work into his project, asked list followers to take a
look at his example scripts and decide whether his current API was or was not
a good starting point. He would be willing to implement whatever API is agreed
upon, and wrote the initial
abstract for his GSoC application (http://news.php.net/php.internals/28535)
to reflect this flexibility.
Short version: All sounds promising. Fingers crossed.
REQ: getpass
Daniel Rozsnyo, working on a CLI script, wanted to allow the user to type a
password without it being echoed back to the screen. He’d found the function
he believed he needed for this in unistd.h – getpass() -
but had found it marked obsolete in the manpage. Was there any way the
getpass() function could be included in the next release of PHP,
or should he patch his own copy? The only alternative he could see was to use
a small binary and call it using the backtick operator.
Sara Golemon thought it was a really bad idea to wrap an obsolete function as
part of the PHP core, but wrote that supplying it as an extension would be a
different matter – perhaps even as a PECL extension. That said, it would only
work with the CGI or CLI SAPI, and not even then if output buffering came into
the equation. Sara also thought it unlikely that getpass()
integrates with PAM, meaning that it would fail alongside distributed
authentication schemes like LDAP and kerberos. Overall, the idea wasn’t a
good one, full stop.
Edin Kadribasic suggested a workaround PHP function for *nix systems.
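One common userland approach on Unix-like systems is to switch terminal echo off with the external stty utility while the line is read. The sketch below is an assumption about the general shape of such a workaround, not Edin's actual code; the function name read_password and its optional stream parameter are hypothetical.

```php
<?php
// Sketch only: read a line without echoing it, relying on the external
// stty utility found on Unix-like systems. The function name is hypothetical.
function read_password($prompt, $in = null)
{
    $in = $in === null ? STDIN : $in;

    // Only touch terminal settings when the input really is a terminal.
    $tty = function_exists('stream_isatty') && stream_isatty($in);

    echo $prompt;
    if ($tty) {
        shell_exec('stty -echo');   // turn terminal echo off
    }
    $line = fgets($in);
    if ($tty) {
        shell_exec('stty echo');    // restore echo afterwards
        echo "\n";
    }

    return $line === false ? '' : rtrim($line, "\r\n");
}

// Typical CLI use: $password = read_password('Password: ');
```

Like getpass(), this only makes sense under the CLI SAPI, and it leaves echo disabled if the script is killed between the two stty calls.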
Daniel hadn’t thought of offering his code as a PECL extension. He wrote to
Sara explaining that he only really wanted getpass() as a safe
password entry for CLI scripts – safe in the sense of no shell history,
password visibility or storage. Despite the obsolescence, both the mysql and
openssl clients use getpass(), and there are probably other
mainstream clients that use it too. Daniel wrote that he would take his code
to PECL after the next cleanup – maybe not getpass() directly,
but the version from apr_getpass.c, which has the advantage of
a fallback implementation.
Short version: Not as crazy as it seemed on first sight.
TLK: GSoC 07 [continued]
Tijnema had been reading the ideas on the PHP project GSoC planning list. The
test writing idea would be the simplest, in his opinion… but he had some
ideas of his own, too. He’d like to see support for handling audio and video
files in PHP, to do music processing, or to create music streams directly
from a website. Marcus introduced him to PECL, pointing out that strong C
skills are definitely required for that kind of project, and recommending
that he choose an idea from the existing list (read: we really, really want
those tests). Tijnema’s response was that he could only find
pecl/oggvorbis.
Richard Lynch mentioned the useful but unmaintained pecl/id3, which he
uses to splice ID3 tags onto the front of an MP3 stream
on one of his sites (http://uncommonground.com/). Tijnema became
very excited and started listing the libraries that he’d like to see as part
of a file format conversion extension. Still, he’d rather implement video
support… Alexey Zakhlestin pointed him towards ffmpeg (GPL’d), but Tony
Dovgal wrote that nobody sane would do audio encoding and video resizing in
PHP. It would result in an impossibly slow page load. There are plenty of
open source utilities for converting WAV to MP3 or AVI to OGG… transcoder,
lame, oggenc, to name but a few. That said, he’d be happy to see a PECL
extension capable of reading video files and grabbing screenshots, and the
existing sound file
extension in PECL has never been released.
Tijnema wrote seriously that his dreams for audio would extend the limits of
PHP; hadn’t Tony ever wanted to be a web DJ? As for video files, they are
simply sets of frames; any PECL extension capable of reading them has already
achieved the hardest part of the conversion process. Tony – who has never had
any desire whatever to be a web DJ – wondered whether Tijnema really intended
to create videos on the fly? Tijnema didn’t see why not. Why not have a movie
stream on your homepage? Tony explained gently that he’d need a Cray cluster
to handle it if his homepage ever became popular. Vlad Bosinceanu backed
Tony; PHP really isn’t suited for massive processing tasks. Audio or video
processing may be useful in CLI applications, but even there he saw no gain
in interfacing highly complex and specialized tasks from PHP. Resizing videos
for online use meant encoding the video – and any accompanying audio – to
begin with and fiddling with various properties in the process. Even if there
happens to be a single library suited to do all this, bringing it to PHP would
mean exposing a very complicated API.
Robert Cummings found Tony’s idea of general PHP usage very limited, and
Richard Quadling backed him in this. It seems it’s fairly standard to use the
same PHP classes across Web, CLI and GUI environments. That said, he didn’t
think he’d want to do video encoding with PHP… though it would be nice
sometimes to do things in PHP via an extension to an existing library, and he
mentioned Delphi’s JEDI project as a
worthy template for a potential GSoC project. Tony showed him
ext/dangerous (http://pecl.php.net/package/ffi), which does much the
same thing.
Short version: Dreaming’s okay, but not on the internals list maybe.
TLK: More GSoC 07…
GSoC 06 participant William Candillon wrote to internals@ saying that he’d
like to spend this summer writing an Eclipse plugin for the phpAspect project
he produced last year. Sadly, there was no immediate response.
One David Duong had seen the GSoC 07 announcement on php.net, and wrote in
search of a mentor for a project he had in mind. HyperWiki would be a
hypertext distribution system providing a minimal CMS, the ‘gateway’, which
could be administrated by non-programmers. In response to a user’s search
request, the gateway would provide aggregated search results, possibly
alongside a list of the other linked systems searched. The user could add,
edit or delete entries on all linked systems.
Although noting that this project doesn’t add anything to PHP itself, David
intended to go ahead with his proposal on the grounds that it would provide a
showcase application for PHP 5, and possibly for PHP 6.
Marcus wrote simply that the deadline for submitting proposals is looming.
Short version: All kinds of everything…
BUG: IMAP/GSSAPI auth failure
IMAP user Mustafa wrote asking whether there had been any movement on the
IMAP/GSSAPI authorization issue reported
some time ago. Michael Allen responded, with a possible solution for the
problem using his company’s PHP
extension – assuming that Mustafa is on an Active Directory network. Mustafa
replied that he’s actually using MIT kerberos and OpenLDAP, and dovecot is
the IMAP server. He has no problems with GSSAPI except with the PHP IMAP
call, which doesn’t try plain auth when GSSAPI fails. Mustafa added that
ldap_sasl_auth() has no GSSAPI support either. Michael pointed
out that ldap_sasl_bind() does in fact support GSSAPI binds with
the kerberos mechanism – he even had an example script for this.
Michael saw no reason Mustafa shouldn’t be able to get this working using
mod_auth_kerb with the appropriate option set,
although he’d noticed in the past that using KRB5_KTNAME to
specify a keytab file from which to get credentials doesn’t work. Mustafa
thanked Michael for the example script and wrote that he’d test later and
confirm his findings – but didn’t.
Short version: Hard to tell without user feedback…
BUG: String BC break
Christian Schneider wrote to say that he’d found an apparently undocumented
change in behaviour when curly braces in a double-quoted string are escaped
with a backslash.
He wasn’t sure when the change had been introduced, but some third-party code
using that construct had failed during a PHP 5 migration. Was it an
intentional BC break, and if so, shouldn’t it be documented
in the manual (http://www.php.net/manual/en/migration5.incompatible.php)?
Tomas Kuliavas replied saying that it had changed somewhere between PHP 5.1.0
and PHP 5.1.1. The manual
page on string behaviour says that curly brackets are not escaped
with a backslash, but escaping did work in older PHP versions. Tomas ended
his mail with references to two
closed bug reports (http://bugs.php.net/31341 and
http://bugs.php.net/35411), but Christian wrote that
these bugs weren’t quite the same as his. Moreover, in PHP 5.1.5 and PHP
5.2.1 the curly braces are escaped, it’s just that the backslash is
output too. If they weren’t escaped at all he’d be seeing abc.
Whatever, the PHP 4 way seemed to him to have the lowest WTF factor. Back to
the original question: is this behaviour intentional, or is it a bug?
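The behaviour under discussion can be checked on a current PHP build with a short script. Christian's original construct is not shown here, so the string "a\{$b}c" is an assumption inferred from his remark that unescaped braces would give abc.

```php
<?php
$b = 'b';

$escaped = "a\{$b}c";   // backslash before the opening brace
$complex = "a{$b}c";    // ordinary complex (curly) syntax
$literal = "a{\$b}c";   // backslash before the dollar sign instead

echo $escaped, "\n", $complex, "\n", $literal, "\n";
```

On current releases the first form still interpolates $b but keeps the backslash in the output, the second collapses to abc, and the third, which escapes the dollar sign, is the form that yields a literal brace followed by a dollar sign.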
Short version: Hard to tell without developer feedback…
TLK: Dealing with the old stuff
Tony posted a proposal to change the severity of the error triggered in CVS
HEAD when enabling magic_quotes or safe_mode, from
E_ERROR to E_WARNING. Since E_ERROR is
supposed to be used only for things that leave the Zend Engine in an unstable
state, he believed it was misused here. Besides, it’s impossible to give a
filename and line number when an error is triggered by a php.ini
directive. He planned to commit his patch later if there were no objections.
Alexey wrote that he thought it would be better to make users disable those
directives manually, but Tony explained that this wasn’t the issue. The
problem was that the error messages refer to an unknown file.
Johannes Schlüter asked if he could completely remove
get_magic_quotes_gpc() and similar functions from HEAD, rather
than have them result in a fatal error. Derick Rethans backed his request,
and also wrote that there should be a better error message for INI directives
- something that would determine the configuration file in which the offending
directive was set.
Short version: Thread hijack alert!
CVS: LDAP maintenance era
Changes in CVS that you should probably be aware of include:
- Zend Engine bugs #40833 (Crash when using unset() on an ArrayAccess object
  retrieved via __get()), #40899 (http://bugs.php.net/40899, memory leak when
  nesting list()) and #40883 (mysql_query() is allocating memory incorrectly)
  were fixed [Dmitry, Tony]
- In ext/imap, bug #40854 (imap_mail_compose() creates an invalid terminator
  for multipart e-mails) was fixed in 5_2 branch [Ilia]
- In ext/soap, bug #36226 (Inconsistent handling when passing nullable
  arrays) was fixed [Dmitry]
- In ext/spl, bug #40872 (inconsistency in offsetSet, offsetExists treatment
  of string enclosed integers) was fixed [Marcus]
- Issues with the long form of CLI options were fixed across all current
  branches of PHP [Marcus, Johannes]
- Random crashes seen in ext/ldap should be a thing of the past
  from 5_2 up [Doug Goldstein]
In other CVS news, Jani Taskinen finally gave up the unequal battle to stay
out of PHP development and started committing little bits and pieces once
more.
Dmitry made some changes to the Zend Memory Manager ‘to guarantee
reasonable time for worst cases of best-fit free block searching
algorithm’. (That’ll be a speedup then.) He also worked on the SOAP
extension during the week, and it’s now possible to encode arrays using the
SOAP-ENC:Array type rather than WSDL. You can
activate this by using the option SOAP_USE_XSI_ARRAY_TYPE in
your SoapClient or SoapServer constructor.
Wez Furlong figured a way out of the problem of local SQLite installs and the
clashes they bring under Windows. He added a new DLL,
php_pdo_sqlite_external.dll, to the build system, thereby allowing
users to provide their own version of sqlite3.dll rather than the
SQLite 3 library bundled in the PHP core.
Short version: Doug Goldstein takes responsibility for ext/ldap,
long CLI options work, and SOAP-ENC:Array and php_pdo_sqlite_external.dll are
born.
PAT: Black box free zone
Richard Quadling produced a patch to fix
bug #33664 (http://bugs.php.net/33664) by preventing the DOS box
from firing up under Windows when exec() is called from PHP CLI.
There had been no comment at the time of writing.
Short version: Couldn’t get much shorter.

