Zend Weekly Summaries Issue #194

FIX: empty_string
TLK: Memory leak (again)
BUG: require_once()
FIX: html_entity_decode()
FIX: Abstract reflection
NEW: Array functions
TLK: Internals newbies
NEW: Diana Coggeshall
TLK: PHP-GTK lives!
NEW: Stream functions
NEW: realpath() caching
NEW: Full gif support
PAT: fp guru required

FIX: empty_string

Andi Gutmans nuked an elderly Zend Engine constant, empty_string,
replacing it with "" where it didn’t need to be malloc’d, or
STR_EMPTY_ALLOC() where it did – much to Sterling Hughes’ approval.
Andi cited slow general-case behaviour when the constant was used, and possible
stability issues where it was used incorrectly. He asked extension maintainers to
check that their extensions weren’t broken by this change, and to amend them
accordingly if they were.

Antony Dovgal reported a segfault when freeing SID after building these changes,
but this was quickly fixed with one-line patches from himself and Marcus Börger.

Michael Sisolak posted a patch that effectively worked around the
empty_string constant for PHP 5.0 to fix bug #28929, and suggested
simply returning STR_EMPTY_ALLOC() in HEAD to achieve the same effect,
thanks to Andi’s change. Edin Kadribasic committed the patch.

Short version: The speeding-up process has begun.

TLK: Memory leak (again)

This time it was Andi who started the discussion, and Sterling who ended it with
a howl of “dooooooooooooooooooalllllllllocccccccccccccccccaaaaaaaaaaaaaaaaaaa,
damnit”.

Acknowledging that Sterling wasn’t someone he’d like to meet in a dark alleyway,
Ilia Alshanetsky agreed in principle to revert the Zend Engine alloca()
patches. However, Marcus had already spent some time diagnosing and fixing the 0-byte
leak and, for the present at least, everything appears to have stayed as it was.

Short version: The end of all alloca()-related discussions. Please…

BUG: require_once()

Jakub Vruna noticed that require_once() under PHP 4 on Windows treats
differently-cased spellings of the same filename as different files, so that if the
file ‘a.php’ existed you could successfully include it twice by calling

php -r "require_once('a.php');require_once('A.php');"

He wanted to know whether this behaviour should be fixed in the source, or just
documented in the manual.

Wez Furlong asked whether the behaviour was the same in both PHP 4 and PHP 5, but
didn’t really get a clear answer. FYI Wez: no, it isn’t.

Short version: This works as advertised in PHP 5; PHP 4 users need to
be more careful.

FIX: html_entity_decode()

Among his many other fixes this week (iconv, ctype), Moriyoshi Koizumi fixed a
bug whereby html_entity_decode() was returning the UTF-8 encoded euro
symbol as a fractional slash. He also, at Sara Golemon’s request, introduced
safe_pemalloc() to the PHP source, which allows the safe persistent memory
allocation needed in streams internals. Sara promptly used it in the zlib and bz2
stream filter support she was working on at the time.
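
As a quick way to check the behaviour this fix restores, the snippet below decodes the named euro entity to UTF-8 and compares the bytes with the encodings of U+20AC (euro sign) and U+2044 (fraction slash). It is a minimal sketch for current PHP releases; the variable names are invented for the example.

```php
<?php
// Decode the named entity, asking for UTF-8 output explicitly.
$decoded = html_entity_decode('&euro;', ENT_QUOTES, 'UTF-8');

// UTF-8 byte sequences for the two characters involved in the bug.
$euro          = "\xE2\x82\xAC"; // U+20AC, euro sign
$fractionSlash = "\xE2\x81\x84"; // U+2044, fraction slash

// On a build with the fix, the first comparison should hold and the second should not.
var_dump($decoded === $euro);
var_dump($decoded === $fractionSlash);
```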

Short version: A busy week for Moriyoshi.

FIX: Abstract reflection

Marcus fixed a bug that made it impossible to declare a ReflectionClass as
abstract. This was his second attempt, and he closed bug report #28895 with a request for
volunteers to write a much-needed test suite for the Reflection API.

Later in the week, Marcus added a new method,
ReflectionParameter::isOptional(), to the Reflection API.
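
A short sketch of how the new method might be used from userland follows. The function greet() and the $optional array are hypothetical names introduced only for this illustration.

```php
<?php
// Hypothetical function used only for this illustration.
function greet($name, $greeting = 'Hello')
{
    return "$greeting, $name";
}

$optional = [];
$function = new ReflectionFunction('greet');
foreach ($function->getParameters() as $parameter) {
    // isOptional() reports true for parameters that may be omitted by the caller.
    $optional[$parameter->getName()] = $parameter->isOptional();
}

var_dump($optional);
```

Here $name is required and $greeting has a default value, so only the latter should be reported as optional.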

Short version: Anyone interested in writing a test suite?

NEW: Array functions

Cristiano Duarte mailed the list with a patch he’d found useful at work when he
needed to intersect an array of database records indexed by the primary key, with an
array with named keys. His patch introduces two new functions –
array_intersect_key() and array_intersect_ukey().

Andrey Hristov suggested that array_diff_key() and
array_diff_ukey() might also be implemented at this point, and he and
Cristiano both created versions of these.

Andi advised some minor changes to the code, and allowed Andrey to commit all
four functions to the PHP 5.0 branch as well as to HEAD. This means that they
will be available in PHP 5.0.1.
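
To make the use case concrete, here is a minimal sketch of key-based intersection and difference. In released PHP versions the four functions are spelled array_intersect_key(), array_intersect_ukey(), array_diff_key() and array_diff_ukey(). The sample records and key sets are invented for the example, and using strcasecmp as the key comparison callback is just one possible choice.

```php
<?php
// Hypothetical records indexed by primary key.
$records = [
    10 => ['name' => 'Ada'],
    20 => ['name' => 'Grace'],
    30 => ['name' => 'Linus'],
];

// Only the keys of this array matter; the values are ignored.
$wanted = [20 => true, 30 => true, 40 => true];

$kept    = array_intersect_key($records, $wanted); // entries whose key is in $wanted
$dropped = array_diff_key($records, $wanted);      // entries whose key is not

// The "u" variants take a user-supplied key comparison callback.
// Here keys are compared case-insensitively.
$row     = ['Id' => 1, 'NAME' => 'x'];
$allowed = ['id' => 0, 'email' => 0];

$common = array_intersect_ukey($row, $allowed, 'strcasecmp');
$extra  = array_diff_ukey($row, $allowed, 'strcasecmp');

var_dump(array_keys($kept), array_keys($dropped), $common, $extra);
```

The values of the first array are always the ones returned; the later arrays only contribute their keys.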

Short version: Four useful array functions that work with keys.

TLK: Internals newbies

Kamesh Jayachandran, having followed the list’s doings for a couple of weeks (and
also having provided a handful of small Zend Engine improvements), finally asked
‘What is MFH?’. Various people wrote back eagerly to explain MFH (Merge From HEAD),
and also BFN (Bug Fixing News) and MFB (the to-be-avoided-at-all-times acronym for
Merge From Branch).

At a heavier level, John Lim wrote in asking for assistance with the ADODB
extension he’s currently trying to port to PHP 5, particularly in speeding up calls
to PHP functions. Andi explained that he could cache his function lookups using
zend_call_function().

Finally, Øyvind Stegard needed advice on overriding PHP’s file operations in
order to create extended access control in the environment where PHP would be
running. Sara suggested that by far the most modular (and least hassle) approach
would be in the form of a PHP extension, rather than directly patching PHP source at
every upgrade. She also confirmed that Øyvind was looking into the correct
areas of the source for file operation functionality.

Short version: ‘Community’ works at all levels.

NEW: Diana Coggeshall

The newest of the PHP babies was born on July 19th to Hollie and John.
Anyone seeing John at OSCON and not wishing to discuss diapers, consider yourself
forewarned.

Short version: Ahh.

TLK: PHP-GTK lives!

The PHP-GTK project bounced back onto its feet as PHP 5.0.0’s release generated a
whole new set of confused newbies on the general list. Suddenly there is a new
documentation effort (headed by Christian Weiske) and a website revival (headed by
Ben Ramsey), and Andrei Zmievski concluded that there is enough interest in the
project to justify his working on PHP-GTK 2.

Three PHP-GTK commits later he ran into trouble, and wrote to the internals list
asking the reason why objects, arrays and resources couldn’t be declared as object
properties. Andi responded, explaining that object and resource types are shut down
at request shutdown, and arrays need to be emalloc’d. David Sklar and Marcus then
joined forces to produce a possible ‘objects as arrays’ solution, along with a way
to give emalloc’d zvals a pointer. Andi regarded this as a bad move due to the
performance impact and the extra memory required, but held out a ray of hope by
saying that it should be possible to create statics at request init and destroy them
at request shutdown, and he would look into it. Andrei followed this up with the
suggestion that objects, arrays and resources could be allowed as non-persistent
constants on a per-request basis, and added that he’d really like to see something
like this in 5.0.1.

Meanwhile, Frank Kromann – the Windows part of the PHP-GTK development team –
wrote in to request that iconv support be separated from libxml support during the
build process. Unfortunately, the means for allowing this to happen only exists in
PHP HEAD.

Short version: Still some ZE 2 fiddling needed before PHP-GTK 2 work
can properly begin.

NEW: Stream functions

Wez created two new userland stream functions:
stream_context_get_default(), which returns the default context and
optionally lets you set stream options for the entire script;
and stream_socket_enable_crypto(), which allows you to turn a supported
encryption layer on or off.

Again, as this is new functionality, it is only available in CVS HEAD.
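
A minimal sketch of both functions, written against current PHP releases, might look like this. The option values are illustrative, and upgrade_to_tls() is a hypothetical helper that is defined but not called, since it needs an already-connected socket.

```php
<?php
// Passing options sets them on the default context for the rest of the script.
$default = stream_context_get_default([
    'http' => ['timeout' => 5, 'user_agent' => 'example-agent/1.0'], // illustrative values
]);

// Read the options back from the default context.
$options = stream_context_get_options($default);
var_dump($options['http']['timeout']);

// Hypothetical helper: switch encryption on for an already-connected client socket.
// Passing false as the second argument would switch it off again.
function upgrade_to_tls($socket)
{
    return stream_socket_enable_crypto($socket, true, STREAM_CRYPTO_METHOD_TLS_CLIENT);
}
```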

Short version: Streams are starting to look pretty smart.

NEW: realpath() caching

Andi submitted for review a proposed patch for PHP 5.1 that implements internal
realpath() caching. After a few days’ complete silence, he sent out an
enquiry after it. Rasmus Lerdorf posted a long and detailed analysis, concluding
that there was an extra stat call still in the process that shouldn’t be there, and
also suggesting a different approach to implementation – having an ‘own’
realpath() replacement rather than trying to fix the way
realpath() is called, and caching the stat() call instead.
Stefan Esser was quick to back Rasmus in this, saying that realpath()
was flawed by nature from a security perspective (although the example he gave to
illustrate this wasn’t something anyone should be doing anyway). Andi replied that
caching the stat() call was likely to result in hard-to-trace bugs, and
that implementing a ‘partial realpath()’ was neither straightforward
nor likely to be bug-free, but agreed that the extra stat call should be nuked.
Rasmus sent in another bunch of suggestions that basically centred on avoiding
realpath() usage altogether, and the thread went quiet.
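
The general idea of caching realpath() results can be sketched in userland as below. This is only an illustration of the technique, not Andi's patch, which lives inside the engine and is not reproduced here; cached_realpath() is a hypothetical helper.

```php
<?php
// Hypothetical userland helper, not the engine-level patch under review.
function cached_realpath($path)
{
    static $cache = [];

    if (isset($cache[$path])) {
        return $cache[$path];      // served from the cache, no filesystem access
    }

    $resolved = realpath($path);   // touches the filesystem
    if ($resolved !== false) {
        $cache[$path] = $resolved; // remember successful lookups only
    }
    return $resolved;
}

$dir    = sys_get_temp_dir();
$first  = cached_realpath($dir . '/.'); // resolved via realpath()
$second = cached_realpath($dir . '/.'); // answered from the cache
var_dump($first === $second);
```

A cache like this can return stale results if a path is renamed or a symlink is retargeted during the request, which is the usual trade-off of this kind of caching.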

Somewhere in there, a gleeful Gareth Ardron popped up with the news that he’d
been doing a little light benchmarking on Andi’s patch, and found it shaved 30% off
realpath() execution time on his box.

Short version: Speed-up slowed down.

NEW: Full gif support

Towards the end of the week, Stefan merged the changes from GD 2.0.28 to all
current branches of PHP. Marcus wrote in to say that the configuration files should
have the GIF_CREATE definition removed from them. Stefan wrote back asking ‘Why? –
Otherwise the build system does not know that bundled GD has GIF create support.’
It’s a fair bet Marcus didn’t know it had until then, either.

Short version: Possibly the quietest announcement in the annals of PHP history.

PAT: fp guru required

Florian Schaper’s patch to execute destructors earlier in the shutdown process
was applied by Marcus this week.

Someone known only as Mostapha mailed a version of hijri.c, which is an
implementation of the Islamic calendar. However, he doesn’t know how to incorporate
it into PHP’s calendar extension, and asked for help in doing this. The lack of
response is most likely due to a lack of expertise in hijri amongst the available
core devs, which makes it difficult for them to check his code. The team need new
contributors in this area; in the meantime, Mostapha’s hijri.c is in the PAT
directory, not least because it’s almost impossible to find this file online.

George Whiffen posted one of those emails that generally gets a Golden Email Award
from these pages, in which he explained precisely what was wrong with PHP’s existing
floating point implementation across various systems (C), and submitted code that he
felt might go some way to fix the issues that people not using bcmath will find when
they reach very high figures. In the ensuing discussion he said that for this to go
any further, a PHP lead developer and a numerical analyst/floating point guru would
need to work together to find a definitive implementation; he’s gone as far as he can
with it.

Short version: The diffs for George’s initial fp solution and hijri.c
are likely to stay in the PAT directory for some time.