Zend Weekly Summaries Issue #251

TLK: E_RECOVERABLE_ERROR
REQ: Thread-safety flag
TLK: 5.0.5 Release Candidates
TLK: Unicode strings impl proposal
FIX: Returning references from internal functions
TLK: Unicode support design document [continued]
TLK: PHP 6.0 wishlist [ad infinitum]
BUG: Schroedinger objects
TLK: Property Overloading RFC [continued]
CVS: Script encoding resolved
PAT: Module info and zlib fix


TLK: E_RECOVERABLE_ERROR

Bringing the type hinting/fatal error debate
back into the limelight, Derick Rethans offered to create an implementation of a
preventable ‘fatal error’. He was uncertain of the way this should be signalled from
a user defined error handler, but suggested that returning FALSE should
be enough to tell PHP to handle the error and stop the application from running.
Zeev Suraski gave Derick’s offer his blessing, but wondered why you wouldn’t simply
call exit() if you intended to stop the application? Derick explained
that he didn’t want to stop it, and Zeev suggested that in that case
zend_error_cb() should simply call the user error handler where it was
available, or E_ERROR if there was nothing in place. Andrei Zmievski
produced an elderly zend.c patch for this, which Derick received gratefully
although (as he said) ‘it doesn’t solve the problem of the catchable fatals’.
He had already prepared a patch to achieve this, but it was still at the test
stage.
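
For orientation, the handler contract under discussion can be sketched in userland. This is a minimal sketch against PHP as it later shipped, not Derick’s patch: demo_handler and $demo_log are made-up names, and a user-level warning stands in for the type hint error. A handler registered with set_error_handler() that returns FALSE hands the error back to PHP’s built-in handling.

```php
<?php
// Sketch only: demo_handler and $demo_log are hypothetical names.
$demo_log = array();

function demo_handler($errno, $errstr)
{
    global $demo_log;
    if ($errno === E_USER_WARNING) {
        // Treated as recoverable in this sketch: record it and carry on.
        $demo_log[] = $errstr;
        return true;   // handled here; PHP's own handler is skipped
    }
    return false;      // not ours; PHP's built-in handling takes over
}

set_error_handler('demo_handler');

// A user-level warning stands in for the type hint failure.
trigger_error('type hint mismatch (simulated)', E_USER_WARNING);

echo count($demo_log), " error(s) handled, script still running\n";
```

The point of the sketch is only the return value: TRUE keeps the script alive, FALSE defers to the engine.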

Marcus Börger confused Zeev by asking whether an E_ERROR by default
was good enough for him, or whether he would like to always have an exception thrown
for the type hinting error. Not one of nature’s OO lovers, Zeev retorted ‘I’m not
sure what happened, but I don’t want exceptions at all, let alone always :)’. He
only wanted type hint errors to be trappable by users, which would mean having the
potential for an E_ERROR to be caught using a user error handler when
the Engine and PHP are in a stable state. George Schlossnagle intervened to suggest
converting E_ERRORs to exceptions in an extension, and Derick pointed
out that his patch would allow extension authors to do that for the new
E_CATCHABLE error condition. Derick and Zeev both agreed that this was
a horrible name… George went on to write that his original proposal had been to
move irrecoverable errors to something like E_UNRECOVERABLE_ERROR or
E_FATAL and have only recoverable errors designated
E_ERROR. Zeev liked the idea but expressed concern over back
compatibility, and Marcus agreed with him over this.
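
George’s suggestion of turning errors into exceptions was aimed at an extension, but the same idea can be approximated in userland with the built-in ErrorException class. A rough sketch, in which demo_risky() is a hypothetical function and E_USER_WARNING stands in for a recoverable engine error:

```php
<?php
// Sketch only: every error that reaches the handler becomes an exception.
set_error_handler(function ($errno, $errstr, $errfile, $errline) {
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
});

// Hypothetical function that raises a recoverable error part-way through.
function demo_risky()
{
    trigger_error('recoverable failure (simulated)', E_USER_WARNING);
    return 'not reached';
}

try {
    $demo_result = demo_risky();
} catch (ErrorException $e) {
    // The original error level survives as the exception's severity.
    $demo_result = 'caught severity ' . $e->getSeverity();
}

echo $demo_result, "\n";
```

Whether this style is desirable is exactly what the thread was arguing about; the sketch only shows that the conversion itself is small.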

Derick meanwhile had prepared a patch
(http://files.derickrethans.nl/e_fatal-20050823.diff.txt)
implementing ‘the George version’ and introducing E_FATAL,
and produced this for inspection. Zeev argued that, despite its elegance, replacing
the meaning of E_ERROR would create compatibility issues at source code
level, affecting cross-version extensions. Extension authors would need to terminate
execution in two different ways, every time. He’d prefer to keep the existing
meaning of E_ERROR and introduce a new error level,
E_RECOVERABLE_ERROR.

Derick argued that, as this change was only intended for PHP 6, all the
extensions would need to be updated anyway to work with the Unicode API, and he
didn’t see a problem with forcing a more thorough revision at that point. He was
prepared to adopt Zeev’s proposal, but felt it was a less elegant solution and asked
for other opinions.

I wrote supporting Zeev, partly due to version compatibility issues and partly
out of a concern that E_USER_ERROR was about to become
incomprehensible. I also inadvertently re-routed the discussion to PHP 6 extensions
when I wondered why back compatibility isn’t achievable there, given that PHP 6 has
the capability of running without Unicode support?

Rasmus Lerdorf pointed out that it obviously isn’t possible to retain binary
compatibility in PHP 6 extensions, due to changes in the underlying internal
structures. It might be possible to retain compatibility at the source level, but
‘having extensions that fall over when you turn on unicode_semantics would be a
real pain’. Perhaps the team should consider breaking non-compatible PHP 6
extensions outright, and supplying an upgrade FAQ for extension authors regarding
Unicode support. Zeev registered his view that it was probably not a good idea to
break source level compatibility on purpose, and I asked Rasmus for clarification.
He responded that PHP 6 extensions should handle IS_UNICODE strings
intelligently, regardless of whether unicode_semantics were turned on
or not.

Zeev suggested giving extensions a means to indicate Unicode compatibility,
assuming non-compatibility in all cases where that flag was missing. Non-compatible
extensions would not be loaded, and would throw an error. Rasmus felt that API
versioning would cope adequately with that situation. He did, however, have concerns
about extensions that don’t use zend_parse_parameters() to handle
arguments; he couldn’t see a way to maintain source compatibility in those cases,
and felt it might be a good idea to prevent such extensions from even building. Zeev
agreed with me that source compatibility really shouldn’t be broken when
unicode_semantics was turned off; ‘if Foo Inc. wrote a module for
PHP 4, and they want to migrate to PHP 6 – it should be a simple recompile.’

Rasmus asked what would happen in such instances when the extension was passed an
IS_UNICODE string, given that Unicode strings exist independently of
the state of the .ini directive. Zeev felt that this should work reasonably
well, thanks to PHP’s typeless nature: ‘Of course you’d be able to shoot yourself
in the foot by explicitly creating unicode strings and using them with non-unicode
functions with unicode_semantics turned off, but the wound wouldn’t be too bad
:)’ Andi Gutmans backed him, saying that so long as a function was expecting
strings, Unicode strings would simply be converted to native strings.

He and Stas, meanwhile, were both thinking about flagging Unicode-enabled
extensions and how to deal with those core extensions, such as GD, that are unlikely
to need any changes. Stas felt that the development team should take responsibility
for ‘rubber-stamping’ core extensions that didn’t need changes, and the respective
maintainers should do the same for PECL or external extensions once the upgrade
procedure was properly documented. He also wanted to add an error message referring
to the upgrade documentation, to be thrown when a non-Unicode enabled extension was
loaded in Unicode mode.

Back on the original subject, Derick created an initial patch for E_RECOVERABLE
(http://files.derickrethans.nl/patches/e_fatal-20050825.diff.txt)
and asked on the list for reviews. Ilia Alshanetsky, Zeev
and Marcus all felt that the longer but more descriptive
E_RECOVERABLE_ERROR was a better name, and Derick modified his patch
accordingly. Ilia asked whether the Release Managers for PHP 5.1 and PHP 5.0 would
allow the new error type into their respective branches, and Edin agreed that adding
it to the pre-release PHP 5_1 branch would be a good idea.

Michael Sims and John LeSueur both hoped to see the new error level thrown when
attempting to call a method on a non-object, but Derick explained that this would
have to remain a fatal E_ERROR because it leaves the Engine in an
unstable state.

Short version: The patch is there, but hasn’t been applied at this
time.


REQ: Thread-safety flag

John Coggeshall, responding to Zeev’s earlier suggestion of a Unicode
compatibility flag, asked whether a similar flag couldn’t be provided to signal
thread safety in extensions? He would like to see PHP ‘unhappy’ about loading
non-thread-safe extensions when operating in thread safety mode.

Stanislav Malyshev pointed out that it wasn’t always possible to know whether an
extension was really thread safe, particularly when external libraries were
involved. John agreed, but suggested levels of thread safety flags;
ZEND_THREAD_SAFE, ZEND_THREAD_UNSAFE and
ZEND_TS_UNKNOWN. An extension flagged UNSAFE would ‘just
blow up’ in ZTS mode.
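
John’s three-level idea is really a small decision table. The sketch below models it in plain PHP purely for illustration: the DEMO_* constants and demo_load_policy() are invented here, nothing like them exists in the engine, and the choice to merely warn about an unknown extension is an assumption rather than something the thread settled.

```php
<?php
// Hypothetical stand-ins for the proposed ZEND_THREAD_SAFE,
// ZEND_THREAD_UNSAFE and ZEND_TS_UNKNOWN flags.
const DEMO_THREAD_SAFE   = 'safe';
const DEMO_THREAD_UNSAFE = 'unsafe';
const DEMO_TS_UNKNOWN    = 'unknown';

// Decide what a ZTS-aware loader might do with an extension's flag.
function demo_load_policy($flag, $zts)
{
    if (!$zts) {
        return 'load';            // non-ZTS builds do not care
    }
    switch ($flag) {
        case DEMO_THREAD_SAFE:
            return 'load';
        case DEMO_THREAD_UNSAFE:
            return 'refuse';      // the "just blow up" case
        default:
            return 'warn';        // assumption: unknown means load, but complain
    }
}

// PHP_ZTS is the real predefined constant telling you how PHP was built.
echo demo_load_policy(DEMO_TS_UNKNOWN, (bool) PHP_ZTS), "\n";
```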

Edin Kadribasic noted that thread safety is a) difficult to verify, b) platform
dependent and c) can be changed in any direction at every commit, but Marcus liked
the idea, saying that this would finally allow thread-safe PHP to ‘go with the
faster Apache way’. Andi argued that ‘the faster Apache way with thread-safe
PHP is slower than the slower Apache way with non-thread-safe PHP’, and he for
one wouldn’t use that combination. This alarmed Marcus, who wanted to know how ZTS
could be slower? Zeev explained that there is a slight slowdown whenever a shared
resource is touched, which primarily happens during the much-used
TSRMLS_FETCH(). Passing on the context at all stages also produces a
slight slowdown, but this is so slight as to be ‘barely noticeable’. However,
the end result is that ZTS is a little slower and more bug-prone than non-ZTS
builds. Andi pointed out that the difference is marginal, and architectural
decisions should be based upon stability, ease of deployment and management as well
as performance.

John felt that the arguments against flagging thread safe extensions were
outweighed by the potential benefits of doing so, and that PHP should offer solid
support for multithreading. Zeev argued that ‘there are almost no advantages to
multithreaded PHP’ but there are known disadvantages, including some
reduction in both stability and performance. As the person responsible for writing
and implementing PHP’s thread safety resource management, Zeev is in a good position
to back Edin’s claim that every commit is a potential breakage in this area; he
called TSRM ‘a bitch to stabilize’. He went on to recommend FastCGI, which
offers cross-process isolation, and said that the main reason for retaining
thread-safety mode was that it doesn’t bother those who don’t use it, and is useful
to those who do.

John wondered why ZTS couldn’t simply be disabled and a default FastCGI install
provided in its place, but Wez Furlong pointed out that certain environments
need to run a ZTS-enabled PHP build. Wez liked the idea of giving warnings
about potential stability issues, however.

Short version: Possibly, after a fashion.


TLK: 5.0.5 Release Candidates

Zeev, as Release Manager for the PHP 5.0.* series, announced the first release
candidate for PHP 5.0.5 at the start of the week in the hope of rolling the full
release by the end of it, and asked everyone to test it.

Sebastian Bergmann was quick to point out that the bundled PCRE library needed to
be updated to 6.2 due to a security issue
(http://www.securitytracker.com/alerts/2005/Aug/1014744.html)
in the existing version, and Rasmus added that the same applied
to PEAR, which still contained the vulnerable version of XML_RPC in this tarball.
PEAR’s Greg Beaver wrote in to say that the PEAR bundle in PHP_5_0 branch hadn’t
been updated at all yet, and he didn’t have time to do so immediately – ‘probably
tomorrow’.

Wez mailed Zeev to advise him that there might be a configure-related problem
affecting the SQLite extension’s ability to be built as shared, but he was also
short of time and didn’t expect to be able to look into it until Thursday. Could
someone please test?

Zeev gave it a couple of days, then asked whether there was anything still
outstanding, apart from PEAR, before Release Candidate 2 could be rolled? PHP user
Sonke Ruempler was quick to reply that some existing applications running happily
under PHP 5.0.4 were throwing fatal errors under PHP 5.0.5RC1. The errors in
question were reference fixes, which he had assumed would only affect the PHP_4_4
and PHP_5_1 branches; in PHP_5_0 branch they constituted a BC break. Zeev looked
into the issue and tracked it down to a genuine fix for bug #31525
(http://bugs.php.net/31525) – but as Sonke said,
there’s a big difference between E_NOTICE and E_ERROR.
Derick argued that PHP was simply trying to prevent memory corruption, but
Pierre-Alain Joye backed Sonke, saying that this was ‘a real problem’ and
should be treated as such.

Meanwhile Greg reported that he’d finally synched php-src/pear with
pear-core in the 5_0, 5_1 and 4_4 branches, adding rather wistfully ‘May
God bring the day when this is no longer necessary sooner rather than later’.
Rasmus sympathized, but noted that the PHP_5_0 tree couldn’t be killed off until
there is a PHP 5.1 release. Zeev commented that neither PHP_5_0 nor PHP_4_4 could be
allowed to die any time soon, adding ironically ‘May God bring the day when we no
longer have to support security issues/major bugs in production releases…’

This startled Lukas Smith, who had expected the 5_0 branch to cease at the point of
PHP 5.1’s release and PHP 4.* support to be dropped early next year. He called for a
clear policy statement regarding version support on php.net,
saying that this would help users make more educated
planning decisions. Derick, who had also anticipated dropping PHP_5_0 very soon,
argued that ‘having to deal with 4 branches is a little bit too much’, but
agreed with Zeev that PHP 4.* support should be viewed as a long-term commitment.
Rasmus agreed over maintaining PHP 4.4.*, but argued for dropping PHP_5_0 branch
‘a couple of releases into 5.1.*’. Zeev stood firm over the issue, saying
that PHP 5.1 would need to prove itself stable before the ~200,000 servers with PHP
5.0 installed could be expected to upgrade, and that the architectural changes
between PHP 5.0 and PHP 5.1 are as substantial as those between PHP 4.* and PHP 5.0.
He added:

Look, I take no joy at committing to 4 branches any more than anybody
else, but let's be realistic here... 5.0, which was the best PHP
version we offered as of mid 2004, will have to be supported for
critical fixes for quite some time.

Derick felt that a lot depended on the definition of ‘critical fix’,
and Edin agreed that fixes should be firmly limited. Andi foresaw a likelihood of
new versions of PHP 5.0 branch due to security issues, but didn’t think there was a
need for much further maintenance in the branch. Lukas agreed that limiting
maintenance there to security fixes after the 5.1 release was a sensible decision,
given the team’s limited resources. He also noted that certain bug fixes were
already being limited to the 5_1 branch at this stage, and wanted to check that
everyone was ‘on the same page’ regarding the development roadmap. Jani
Taskinen, in his capacity as Evil Pixie, suggested that PHP 5.0.5 should never see
release and that PHP 5.1 should be pushed instead, but everyone ignored him.

Not one to give in easily, our Zeev:

http://downloads.php.net/zeev/php-5.0.5RC2.tar.gz
http://downloads.php.net/zeev/php-5.0.5RC2.tar.bz2

Greg reported that PEAR had been successfully updated this time, but
Mike Robinson seized the moment to respond to Wez’s earlier plea for SQLite shared
build tests with a confirmation that it was broken under Debian woody. Wez promptly
created a fix for that problem, but then Marcus Bointon reported an OSX
configuration failure in RC2, and Uwe Schindler reported problems with gmake test
under Solaris in RC1.

Short version: It’s not ready to roll.


TLK: Unicode strings impl proposal

Rolland Santimano, a man much given to writing technical txt msgs, posted some
proposals for Unicode implementations of PHP’s string functions for review. He was
mainly concerned about the start and length parameters in
substr_replace() and substr_count(), combining characters
in strtok(), the nature of padding in str_pad() and
whether the Levenshtein algorithm should be expressed in graphemes or codepoints when
converted to Unicode. Andrei affirmed that start/length parameters – and pretty much
everything else involving character substitution, insertion or deletion – should
always be interpreted in codepoint context, but saw no reason not to accept single
‘combining characters’ as delimiters in strtok(). However, Andrei
suggested deferral to Tex Texin’s opinion regarding the role of combining sequences
in str_pad() – Tex being Yahoo! Inc.’s Internationalization
Architect.
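
The grapheme-versus-codepoint question is easiest to see with a combining sequence. The sketch below assumes UTF-8 input and leans on PCRE: with the /u modifier a dot matches one code point, and \X matches one extended grapheme cluster. The variable names are illustrative only, and this says nothing about how the Unicode string functions were eventually meant to be implemented.

```php
<?php
// "e" followed by U+0301 COMBINING ACUTE ACCENT, then "x", as UTF-8 bytes.
$demo = "e\xCC\x81x";

// Bytes: what a byte-oriented strlen() sees.
$demo_bytes = strlen($demo);

// Code points: "e", the combining accent, and "x".
$demo_codepoints = preg_match_all('/./us', $demo, $demo_cp_matches);

// Graphemes: the accented "e" counts once, then "x".
$demo_graphemes = preg_match_all('/\X/u', $demo, $demo_gr_matches);

echo "$demo_bytes bytes, $demo_codepoints code points, $demo_graphemes graphemes\n";
```

A start/length pair of (0, 1) therefore means three different things depending on which unit is chosen, which is why Andrei’s ‘always codepoints’ rule matters.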

The functions viewed as non-problematic included addslashes(),
stripslashes(), addcslashes(),
stripcslashes() and strip_tags(), although Andrei foresaw
code duplication ‘on a large scale’.

Short version: Trust Andrei’s crystal ball.


FIX: Returning references from internal functions

Sara Golemon came up with a one-line patch for Zend/zend_vm_def.h allowing
internal functions to return by reference, and asked for permission to commit it in
CVS HEAD. Andi was quick to respond that actually this should work already in both
PHP_5_1 and HEAD, and set Dmitry Stogov to work on a fix for the new-found bug.

Short version: Sara strikes again!


TLK: Unicode support design document [continued]

Makoto Tozawa picked up on Andrei’s Unicode support design document
with great pleasure, but had a few questions. He had no
knowledge of any HTTP request header that would specify the request encoding. If the
intention was to honor ACCEPT-CHARSET, there could be a problem because
there was no guarantee that the encoding there was the same as the encoding used to
escape characters in the query string.

Andrei looked into it, and found that RFC 2616 does not specify whether user
agents should send a charset parameter in the Content-Type header of a
POST request. He thought it would be safer to rely on
http_input_encoding and output_encoding settings than to
rely on ACCEPT-CHARSET, for the reasons Makoto had given.
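
One way to picture the fallback Andrei describes: honour an explicit charset parameter on the request’s Content-Type if a client happens to send one, and otherwise use the configured input encoding. This is a sketch of one possible precedence, not something the thread agreed on; demo_request_charset() and its arguments are hypothetical, and the second argument merely stands in for a setting such as http_input_encoding.

```php
<?php
// Hypothetical helper: pick the encoding to assume for request data.
function demo_request_charset($contentType, $configured)
{
    // Look for a charset parameter, e.g. "...; charset=utf-8".
    if (is_string($contentType)
        && preg_match('/;\s*charset\s*=\s*"?([A-Za-z0-9._:-]+)"?/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    // Nothing usable in the header: fall back to the configured value.
    return $configured;
}

echo demo_request_charset('application/x-www-form-urlencoded; charset=utf-8', 'ISO-8859-1'), "\n";
echo demo_request_charset('application/x-www-form-urlencoded', 'ISO-8859-1'), "\n";
```

Note that ACCEPT-CHARSET plays no part here, for the reasons Makoto gave.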

Adam Maccabee Trachtenberg came up with a couple of useful links
(including http://intertwingly.net/slides/2005/etcon/72.html)
for Andrei and Tex to look into on the subject of HTTP, HTML and
XML encodings, precedence rules and negotiation.

Tex went on to investigate form encoding, initially believing that setting the
accept-charset attribute for the FORM element in HTML
would also set the ACCEPT-CHARSET request header. Makoto queried this,
and asked him to double-check that Netscape was actually sending
ACCEPT-CHARSET=xxx rather than simply making the input characters in
the request content use xxx encoding. Tex looked again, and agreed with
Makoto’s findings. He concluded that the accept-charset value
definitely couldn’t be used to trigger the encoding. James Aylett agreed, pointing
to part 4.5 of RFC 2388 and enctype='multipart/form-data', which
presumes a default Content-Type of text/plain. Even so, James wasn’t
certain of the degree of browser support for charset/MIME settings there, and made
the point that ASCII is still the only encoding guaranteed to work everywhere.

Makoto also asked whether there was any way to keep byte semantics (as opposed to
unicode semantics) for existing functions? For example, a function calling
strlen('áéí') currently would expect the byte size of the string, which
is 6 under UTF-8 encoding. The same function would throw an error under
Unicode-enabled PHP, where strlen('áéí') would return 3… Tex pointed
out that all functions would need to provide reasonable behaviour for Unicode, that
it was still possible to leave unicode off and simply convert the data to UTF-8, and
that functions providing the raw byte length would still be available. Code relying
on hard-coded byte sizes was unlikely to be widespread, and at least this way most
code would ‘do the right thing’. Makoto replied that he’d asked because the Back
Compatibility section in the document claimed that existing data types and functions
must work as they have always done. Although the character semantics would remain
the same under Unicode-enabled PHP, both for functions written for single-byte
encoding and for those written for multi-byte encoding using mb_str*()
functions, he’d hoped to save those functions written for multi-byte encoding by
‘abusing the str*() functions’.
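
Makoto’s strlen() example comes down to counting bytes versus counting code points. A small sketch of the two counts, assuming the string is UTF-8 encoded and using PCRE’s /u modifier to walk code points (the $demo_* names are illustrative; where ext/mbstring is available, its mb_strlen() gives the character count directly):

```php
<?php
// "áéí" spelled out as UTF-8 bytes so the file's own encoding does not matter.
$demo = "\xC3\xA1\xC3\xA9\xC3\xAD";

// Byte semantics: each accented letter occupies two bytes in UTF-8.
$demo_bytes = strlen($demo);

// Character semantics: count code points instead.
$demo_chars = preg_match_all('/./us', $demo, $demo_matches);

echo "$demo_bytes bytes, $demo_chars characters\n";
```

Code that sizes a buffer or a database column from the first number would quietly change behaviour if strlen() began returning the second.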

Tex hadn’t realized that ext/mbstring users might manipulate
byte/character orientation to this extent, and thanked Makoto for the observation,
saying that it would bear some additional consideration. Still, looking through the
documentation, he noted that the str*() functions could be overloaded
to use mb_str*() character semantics. If a high proportion of users
happened to be using the extension in this way, there’d be little to gain by
altering the original proposal. Makoto agreed with his analysis, recognizing that
there was no clear way to support str*-abuse.

Short version: Everything’s always more complicated than it
looks.


TLK: PHP 6.0 wishlist [ad infinitum]

Mikko Rantalainen sparked a long debate when he responded to the PHP 6.0 wishlist
thread (sponsored by Duracell) with yet another plea for anonymous functions;
the real stuff, not just some odd string passed to create_function().
Marcus recognized that others had already asked for this feature and asked the list
to consider, seriously, whether it could be achieved?

Stas wondered aloud what was bad about create_function() and how
‘the real stuff’ should be different. The Clayton formerly known as 10t3k provided
an answer; create_function() does an expensive eval() at
runtime, whereas ‘real’ anonymous functions would be handled in the compiler. Stas
pointed out that, since function code is compiled once anyway, it shouldn’t really
make any difference when the compilation took place. George came up with two further
differences; syntax and garbage collection. He added that the runtime compilation as
it stands works for him, as the whole point is to allow the function definition to
be varied at runtime. He also didn’t see garbage collection as a critical issue.
Syntax, however, was a different matter; he claimed that the code required was too
obtuse to be used in the majority of cases because of the amount of escaping needed
to make anonymous functions in PHP appear standard and, while heredocs could
help, there were major limitations there. He’d like to be able to do something
like


$max = function ($a, $b) {
    return $a < $b ? $b : $a;
}

Stas pointed out that, if George used advanced concepts such as closures, his
code was bound to appear obtuse. He felt that the lack of garbage collection was a
valid concern, but fixing it would mean creating a new type, which seemed like
overkill to him. Compilation at runtime works better because it only occurs when
needed. He ended with


$max = create_function('$a,$b', 'return $a < $b?$b:$a;');

and asked, ‘Is it that different?’

Michael Walter offered ‘the ability to capture variables from the lexical
environment’ as a feature of ‘the real stuff’, and gave a couple of examples


function adder($a) {return function($b) {return $a+$b;}}
function index($i) {return function($a) {return $a[$i];}}

Stas pointed out that PHP already has


function adder($a) {
    return create_function('$b', 'return '.(integer)$a.'+$b;');
}

and that there is no substantial difference here. Dmitry, however, liked the
syntax Michael espoused, but said the problem with it would be in the
implementation. He didn’t see a way to compile the function at compile time, rather
than evaluating it at runtime, and still capture variables from the lexical scope.
Zeev wondered, ‘Since when do we consider moving towards LISP a good thing?’
and broke the news that create_function() is actually intended
to be cumbersome, on the grounds that nobody should be creating functions on the fly
unless they absolutely have to. In response to Dmitry’s comments about
implementation, Zeev wrote that the idea of ‘function factories’ is fundamentally
alien to PHP, and ‘we shouldn’t be wasting time on it, let alone consider
introducing language constructs to support it’. Michael started to argue for the
LISP approach, but realized what Zeev was saying part-way through: ‘As I believe
you said, it is certainly *not* a question of how/whether we want that
functionality, but whether we want to encourage this particular style of
programming.’ Andi confirmed that we do not, just to be certain.

Noah Botimer wrote that the ability to hand a code string to a function was
useful but wasn’t always what the doctor ordered, and suggested that there could be
some benefit in having an anonymous lambda function that ‘fits in’ with the rest of
the language.

Andi brought the discussion to an end, calling anonymous functions ‘a niche
feature’ in a language designed for ease-of-use, and making the point that any
power user in need of them would be fully capable of getting by with
create_function().
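
For readers comparing this thread with later PHP: anonymous functions with an explicit use clause arrived in PHP 5.3, some years after this discussion. The sketch below rewrites the examples from the thread in that later syntax, purely for comparison; the demo_ prefixes are added here, and nothing in it reflects a design anyone proposed at the time. Captured values are copied when the closure is created, which sidesteps the lexical-scope problem Dmitry raised.

```php
<?php
// Michael's adder: $a is captured by value through the use clause.
function demo_adder($a)
{
    return function ($b) use ($a) {
        return $a + $b;
    };
}

// Michael's index: returns a function that picks element $i of an array.
function demo_index($i)
{
    return function ($a) use ($i) {
        return $a[$i];
    };
}

// George's max, with the trailing semicolon the later syntax requires.
$demo_max = function ($a, $b) {
    return $a < $b ? $b : $a;
};

$demo_add2 = demo_adder(2);
echo $demo_add2(3), ' ', $demo_max(3, 7), "\n";
```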

Next up, Joseph Crawford wanted to see multiple inheritance in a future
version of PHP. Rasmus said darkly that if multiple inheritance already existed he’d
have put it on his wishlist for removal. Lukas suggested that Joseph should use
overloading and interfaces to achieve the same end, and Zeev pointed out that that
is precisely why PHP has interfaces in the first place. Marcus agreed that MI could
be emulated with __call(), but went on to discuss an idea he had for
implementing delegation in order to promote code-reuse in interfaces, much to
Andi’s distress. Andi suggested that a userland or internal Proxy class
might be more appropriate for PHP, allowing objects to aggregate and have their
priority defined by the order in which they were added. Perhaps also a method to
override that priority by allowing methods to be mapped to objects. This would allow
advanced users to have fine-grained control over delegation, without adding any new
and over-complex language constructs. However, he was torn over whether even this
much should be added…

Greg wrote in to highlight the existence of PEAR_Delegator at this point, saying
that the code there might provide an interesting starting point for an
implementation in ext/SPL or equivalent.

Marcus replied that he didn’t see any good reason to add a proxy/aggregation
implementation, although it would be simple enough to do so in SPL. It wouldn’t
solve the problems of multiple inheritance, and was incompatible with the concept of
interfaces; he saw it as counter-productive. Andi argued that it wouldn’t
necessarily be incompatible with interfaces. The debate grew heavily technical at
this point, with Marcus insisting that Andi’s proposal would need dynamic interfaces
(aka a full rewrite of object/interface internals in PHP) to work properly, and Andi
maintaining that his proposal would work, albeit in a more limited way than
Marcus hoped. The mention of OOP pissing contests brought this particular diversion
to an abrupt end.

One Jordan Miller, new to the internals list, wanted to see comparison operator
expressions that looked more like real mathematics, e.g.


if (2 < $x <= 4) {}

Would adding this syntax to PHP be incredibly difficult or lead to performance
slowdowns, wondered Jordan? Tex pointed out quickly that it would actually change
the meaning of existing code.
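
The range test Jordan was after can of course be spelled out with two comparisons joined by a logical AND. A trivial sketch, with demo_in_range() being a made-up helper name:

```php
<?php
// True when $low < $x <= $high, written as two explicit comparisons.
function demo_in_range($x, $low, $high)
{
    return $low < $x && $x <= $high;
}

var_dump(demo_in_range(3, 2, 4));
```

It is wordier than the mathematical notation, but it leaves the meaning of each comparison operator untouched.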

Ovidiu Farauanu wrote a long and indignant email about PHP’s annoyances, and the
way they ‘give headaches to any C programmer’. Stas, evidently on a roll that
day, replied: ‘That’s like saying “American cars suck” because they have gear
shifting done differently from European ones…’

More seriously, Daniel Convissor wanted to deal with the problem of
short_open_tag and XML (http://bugs.php.net/bug.php?id=25987). Although Ilia was
quick to mention the existence of
<?php and Jani equally quick to advocate nuking every other form of
PHP tag (an ongoing theme), Sara recognized that Daniel was only hoping to have the
scanner recognize <?xml. Jani voted -1000000 to this, but Sara
knocked up ‘a sloppy implementation’ (she should’ve known better) which
aroused instant criticism from Johannes Schlüter, who mentioned
<?xml-stylesheet in passing. Paul Reinheimer suggested having the
scanner recognize <?w (where w == whitespace) might be more
appropriate and BC-friendly, as he had never come across an instance where a space
or line break wasn’t used following the short open tag. Rasmus, who had come across
several thousand such instances, mentioned single PHP variables in HTML forms. David
Kingma, meanwhile, had been busy creating his own implementation based around Sara’s
and presented it to the list, but Johannes was quick to point out that it suffered
from the same misconception regarding whitespace.
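
Paul’s whitespace rule, and the reason it is not BC-safe, can be shown with a toy matcher. This is only a regular-expression sketch of the idea, not the scanner patch that Sara or David wrote; demo_is_php_open() is a hypothetical name, and the last sample is simply one way of writing a short tag with no whitespace after it.

```php
<?php
// Treat "<?" as a PHP open tag only when followed by "php" plus whitespace,
// or by whitespace directly. Everything else is left alone.
function demo_is_php_open($source)
{
    return preg_match('/^<\?(?:php)?\s/', $source) === 1;
}

$demo_samples = array(
    '<?php echo 1;',           // long tag: recognised
    '<? echo 1;',              // short tag plus space: recognised
    '<?xml version="1.0"',     // XML declaration: ignored, as Daniel wanted
    '<?xml-stylesheet href',   // Johannes' case: ignored as well
    '<?echo $name',            // short tag without whitespace: no longer recognised
);

foreach ($demo_samples as $demo_sample) {
    echo str_pad($demo_sample, 24), demo_is_php_open($demo_sample) ? 'PHP' : 'passed through', "\n";
}
```

The final sample is the kind of existing code that a whitespace-based rule would silently stop executing.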

Finally, Cyril Pierre de Geyer mailed internals@, calling all the regulars
‘quite PHP geek’ (hm?) and bringing the results of his wishlist poll of
French PHP users.

Proposal                                  For   Against   Huh?   Responses
Remove register_globals                   86%   13%       0%     120
Remove magic_quotes_*                     59%   18%       21%    128
Add input filter extension                95%   4%        0%     92
Include opcode cache by default           74%   0%        25%    108
Remove safe_mode, focus on open_basedir   27%   31%       40%    88
Remove long-term deprecated items         82%   14%       3%     112
Make identifiers case-sensitive           75%   21%       3%     112
Remove function aliases                   58%   34%       6%     116

Marcus thanked Cyril for this helpful feedback.

Short version: Enough of this wishing already…


BUG: Schroedinger objects

Rasmus wrought some amusement from bug report #34199, which highlighted the discovery that
if(!$object) and if($object) are handled by PHP in
completely different ways. if($object) would always return
TRUE for objects regardless of the actuality, whereas
if(!$object) would call the object’s cast_object handler,
if it existed. If the object’s cast_object handler was implemented and
could return FALSE under certain conditions, that object would be
neither there nor not there (or both). The example Rasmus gave was:


<?php

$a = simplexml_load_string('<a></a>');
if($a && !$a) echo "BUG!";

?>

Rasmus proposed fixing IS_OBJECT in zend_execute.h’s
i_zend_is_true() to test for a cast_object handler and
either call it or call convert_to_boolean(). The alternative he could
see would be to clean up the mechanism so that the code paths for the two cases were
brought into line.

Andi agreed with him, and voted for taking the ‘checking for cast’ route. Marcus
wondered whether this meant a conversion would be supplied by default, but Andi
pointed out that the return value for an instantiated object should simply be
TRUE.

Short version: As I was walking up the stair/I met a man who wasn’t
there/He wasn’t there again today/I wish that man would go away


TLK: Property Overloading RFC [continued]

Marian Kostadinov felt that it would be good to have real properties in PHP -
like those in C#, but more flexible, with getter and setter allowed different
visibility. Christian Schneider pointed out that Marian’s sample code could already
be emulated in PHP as it is, but Marian argued that the current possibilities don’t
solve issues of property visibility or provide ‘good’ overloading rules. Derick
backed him, saying there are several problems with the current approach
(http://www.zend.com/lists/php-dev/200508/msg00017.html). He then announced an
update (http://files.derickrethans.nl/property_overloading.html)
to his earlier proposal, and suggested
that this would be a good time for him to start implementing it, assuming nobody
else had any better ideas.

Christian and Edin both disliked the proposal altogether, with Christian calling
it overly complex, ‘a worst of both worlds kind of compromise’, and hoping not
to see it implemented. Derick retorted that there were three inherent problems in
the existing implementation, ‘and how can you possibly argue that this is more
complex than all the other OO crap that people are suggesting here?’ Edin seized
the moment to register a heartfelt plea for filtering out OO feature requests
altogether – a position both Derick and Christian could sympathize with. Even so,
Derick continued, at present there is an inherently flawed implementation of
__get() and __set(), and he hoped ‘to see the OO crap
that we have working in a useful way
‘.

Stas took the proposal to pieces. He felt (and Lukas backed him in this) that it
would be more sensible to add a keyword to doxygen/phpDocumentor than to create a
new PHP language construct just to support documentation. Derick pointed out that
introducing the keyword property was a big help in resolving the other
problems in the current implementation, and not purely a documentation issue.

Stas also wanted to know whether __have_prop() would be overridable.
If so, he didn’t see why it should need specific support in PHP. And how was it
different from the __isset() implementation in CVS HEAD? Derick
explained that __isset() simply checks whether something is set,
whereas __have_prop() would check whether something had been declared
as a property in the class. Stas wondered exactly when
__have_prop() would be called by the Engine. Jeff Moore was also
wondering about this, as double underscore methods are generally reserved for those
called by PHP itself.
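
The difference between ‘set’ and ‘declared’ can be seen with functions that
already exist. The sketch below uses property_exists() purely as a stand-in for
the idea; it is not the proposed __have_prop(), which was only an RFC item, and
the Article class is hypothetical.

```php
<?php
// Hypothetical class with one declared property that has no value yet.
class Article
{
    public $title = null;
}

$article = new Article();

var_dump(isset($article->title));               // bool(false): declared, but NULL
var_dump(property_exists($article, 'title'));   // bool(true): declared in the class
var_dump(property_exists($article, 'summary')); // bool(false): never declared
```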

Finally, Stas disputed that returning FALSE would resolve the
recognized problem with diagnostics in the __get/__set handlers, as
FALSE might represent a failure other than the non-existence of the
property in the handler. Derick explained that the idea was to allow
the Engine to throw an error on the line __get() had been called from,
rather than inside __get() itself, where it made for unhelpful
debugging information. Stas understood this, but pointed out that you still wouldn’t
know what was wrong in the reported line.
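
For comparison, a userland workaround for the ‘wrong line’ problem is sketched
below. It is not the Engine-level behaviour Derick proposed: it assumes a
hypothetical Record class and uses debug_backtrace() to report the file and
line of the property access rather than the line inside __get().

```php
<?php
// Hypothetical class; the reporting approach is a userland workaround,
// not the Engine-level behaviour proposed in the RFC.
class Record
{
    private $data = ['id' => 1];

    public function __get($name)
    {
        if (array_key_exists($name, $this->data)) {
            return $this->data[$name];
        }
        // Frame 0 describes the call into __get(), so its file and line
        // belong to the code that accessed the property.
        $trace = debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS, 1);
        $file  = isset($trace[0]['file']) ? $trace[0]['file'] : 'unknown';
        $line  = isset($trace[0]['line']) ? $trace[0]['line'] : 0;
        trigger_error(
            sprintf('Undefined property %s::$%s in %s on line %d', __CLASS__, $name, $file, $line),
            E_USER_NOTICE
        );
        return null;
    }
}

// Collect notices instead of printing them, so the message can be inspected.
$messages = [];
set_error_handler(function ($errno, $errstr) use (&$messages) {
    $messages[] = $errstr;
    return true; // handled; skip PHP's default output
}, E_USER_NOTICE);

$record = new Record();
echo $record->id, "\n";     // 1
$value = $record->missing;  // recorded as a notice naming this line
restore_error_handler();

echo $messages[0], "\n";
```

As Stas observed, the message now names the accessing line, but it still only
says that something failed there, not why.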

Marcus just wanted __have_prop() to be renamed to
__exists().

Lukas didn’t see why the existence of a virtual property couldn’t be checked via
a non-public set of arrays containing information about the property and its
visibility level. He did like the idea of having the __get/__set
handlers return FALSE on error, however, and felt there should be
something similar in place for __call().

David Zulke and Jeff didn’t understand why virtual properties would need to be
declared beforehand at all. Marcus explained that following the declaration you’d be
able to perform lazy initialization, and this was the whole point. Then Andi
intervened, saying that the proposal was syntax bloat as far as he was concerned,
and would only please OO fanatics; ‘“it isn’t worse than the other OO crap” only
means that the other OO crap should also be left out of PHP’. The benefits of
PHP lie in its development time, ease-of-use and training period for developers, and
his primary aim was to keep the language good at what it does best.

Derick argued that he was simply trying to fix broken behaviour in encapsulation
and visibility, but Andi disputed that __call(), __get()
or __set() should be resolving visibility at all, saying that the main
goal is to allow the exposure of a dynamic public interface.

Jeff promptly proved Andi’s point with this week’s Golden Email Award winner
(http://www.zend.com/lists/php-dev/200508/msg00999.html).

Short version: Timing is everything.


CVS: Script encoding resolved

Changes in CVS that you should probably be aware of include:

  • zend_is_callable() and zend_make_callable() now
    return the readable function name as a zval (affects CVS HEAD
    internals only) [Dmitry]
  • fread() now returns bool(false) on error rather than
    an empty string [Dmitry]
  • is_a() and is_subclass_of() no longer call
    __autoload() [Dmitry]
  • declare(encoding=...) is now required prior to any opcodes
    running, in CVS HEAD. This allows op arrays to ‘know’ which script encoding they
    were compiled from, which in turn allows intelligent conversion of inline HTML
    blocks to the output encoding [Andrei]

Andrei and Dmitry both made a lot of changes to the Unicode support in
CVS HEAD throughout the week, and Derick continued his task of fine-tuning the
date/time functionality in CVS HEAD and PHP_5_1 branch.

Ilia updated the bundled SQLite library in PDO to 3.2.5 in 5_1 and HEAD, and made
ext/dba support Berkeley DB 4.3 across all current PHP branches.

Marcus asked Dmitry whether he could remove support for call-time
pass-by-reference in CVS HEAD, as it has been marked deprecated for a number
of years now. Dmitry agreed that it would be straightforward to remove it at parser
level, and said he would be prepared to do so following team consensus.
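
For readers unfamiliar with the term, the supported alternative is to declare
the reference once, in the function signature. The sketch below uses a
hypothetical add_one() function; the deprecated call-time form appears only in
a comment, since PHP versions that have dropped the feature refuse to compile
it.

```php
<?php
// The reference is declared once, in the function signature.
function add_one(&$number)
{
    $number++;
}

$count = 1;
add_one($count);    // no & at the call site
echo $count, "\n";  // 2

// Deprecated call-time form, for comparison only:
//     add_one(&$count);
```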

Short version: Busy week.


PAT: Module info and zlib fix

Johannes offered up a Zend Engine patch
[dead link] that records, in the zend_internal_function struct, the
module that registered each internal function, allowing the module information to
be used by the Reflection API. His patch also provides a more helpful error message
when such a function is re-declared – it tells you which module the function came
from. Andi approved the patch, and Marcus committed it in CVS HEAD (only) at the end
of the week.
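
In current PHP versions, the Reflection API can report which extension a
function belongs to through getExtensionName(). Whether that is exactly the
interface this patch led to is an assumption; the sketch below only shows the
kind of lookup that module information makes possible. The local_helper()
function is hypothetical.

```php
<?php
// Ask the Reflection API which extension registered an internal function.
$function  = new ReflectionFunction('array_map');
$extension = $function->getExtensionName(); // string, or false for userland functions

printf("%s() is provided by the %s extension\n", $function->getName(), $extension);

// A userland function has no owning extension.
function local_helper() {}
$local = new ReflectionFunction('local_helper');
var_dump($local->getExtensionName()); // bool(false)
```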

Xuefer provided a fix for a data corruption bug in ext/zlib, which Ilia
subsequently committed across all four current PHP branches.

Short version: Nothing new in the PAT directory this week.

Published: August 29th, 2005 at 12:00