TLK: The accidental death of --disable-all
REQ: 5.2 serialization change [continued]
TLK: INI includes
RFC: Moving core extensions to PECL
TLK: PHP 5.2.1 status
RFC: unset($object)
CVS: Ming API upgrade, PDO in HEAD
PAT: make test -n, CV experiments
TLK: The accidental death of --disable-all
It was one of those weeks. First there was a bundle of threads completely
unrelated to PHP core development, and then there was Andrey Hristov's funny moment.
Andrey had decided to disable a few extensions to speed up his development efforts,
and had discovered that --disable-all was 'kind of borked' in
CVS HEAD. PHP wouldn't build, complaining about Unicode stuff. OK, so Andrey went to
disable extensions on an individual basis, starting with the XML support:
'--disable-xml'
SPL suddenly started complaining, and PHP still wouldn't build:
'--disable-spl'
Now he couldn't build ext/standard, because count() relies on
SPL. Maybe it's time to make disabling SPL a non-option... Further experimentation
showed that the CLI SAPI can't build if ext/reflection is disabled,
either, and at that point Andrey gave up and mailed the list.
Ilia Alshanetsky wrote bemusedly that there is absolutely no reason for
count() to require SPL. Besides, it should definitely be possible to
compile PHP with --disable-all and no other flags, perhaps with the
exception of ICU in CVS HEAD 'for obvious reasons'. Andrey posted the error
message that had prompted him to mention count() in the first
place:
ext/standard/array.o(.text+0x6c2): In function `zif_count':
'make clean!' chorused Hannes Magnusson and Tony Dovgal in unison.
Short version: Sometimes the biggest problems are the simplest to fix.
REQ: 5.2 serialization change [continued]
Andrei Zmievski realized that his patch had zero chance of getting into the
PHP_5_2 branch if it broke back compatibility. He wrote sadly that he couldn't see a
way to make decoding work for all cases in both CVS HEAD and PHP 5; 'I guess
we'll have to leave this task to PHP_Compat'. Thomas Seifert asked if he was
kidding. Why would he even think it might be OK to break all the strings stored as
serialized in the PHP 5.x series? He could understand that kind of breakage in PHP
6, but 'not in some minor release'... Robert Cummings stepped up to explain
to Thomas (and anyone else who'd missed Ilia's rejection of the patch) that this is
a forward compatibility issue. If people don't mind not being able to read
serialized data from future PHP versions in 5.2 and over, there is no BC problem. He
proposed checking the start of the serialized string for a version indicator prior
to decoding. If no version indicator is found, unserialize() should
fall back to the original semantics.
Stefan Esser had a better idea. He suggested introducing a new variable type,
S, for serialized strings. Zeev Suraski pondered that for a while
before responding that he actually couldn't think of a single reason why not. Could
anyone else? Ilia could; he pointed out that older versions of PHP would be totally
incapable of parsing a PHP 6 serialized string, creating problems for anyone using
serialization as a means of passing data between applications. He thought the
problem would be worse if the new type were added to a pre-6.0 release. Stefan
replied that the idea was to enable PHP 5.2.x to read data serialized with PHP 6,
not to enable PHP 5.2.x to generate it.
Andrei fired off a haughty note to Thomas explaining that the BC break would occur if you used PHP 6 serialized content in PHP 5 - not the other way around. It isn't possible to use the same escape format in PHP 5.2, for BC reasons; the support would need to come from PHP_Compat, as he'd originally noted.
PHP user Chad Daelhousen had a plan, too. He wanted a version indicator at the
start of strings serialized in PHP 6 as per Robert's suggestion; no changes to
serialize() in earlier PHP versions; an optional flag for
serialize() in PHP 6 to force old-style serialization; and
unserialize() in PHP 5.2.x to understand the PHP 6 version
indicator.
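The version-indicator idea from Robert and Chad can be sketched in userland roughly as follows. This is only an illustration of the dispatch logic: the 'v6:' prefix and both function names are hypothetical, since the thread never settled on an actual marker, and the sketch does not attempt to decode any new escape format.

```php
<?php
// Hypothetical marker; nothing in the thread fixes its real form.
const VERSION_PREFIX = 'v6:';

// Serialize with the assumed version marker, or old-style when $legacy is true.
function versioned_serialize($value, $legacy = false)
{
    $data = serialize($value);
    return $legacy ? $data : VERSION_PREFIX . $data;
}

// Strip the marker if it is present; otherwise fall back to the
// original semantics and hand the string straight to unserialize().
function versioned_unserialize($data)
{
    if (strncmp($data, VERSION_PREFIX, strlen(VERSION_PREFIX)) === 0) {
        $data = substr($data, strlen(VERSION_PREFIX));
    }
    return unserialize($data);
}
```

Because no existing serialized string begins with a 'v', older data passes through the fallback branch untouched, which is the point of checking the marker before decoding.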
Short version: Everybody wants to save the world.
TLK: INI includes
Brian Shire had made a patch to support INI includes before he came across a prior discussion of them in the MARC archives. He wrote that INI includes would be a great help in simplifying large configurations, particularly when using a configuration management tool. Was this still of interest to the current PHP team?
Mathieu Carbonneaux wrote that he'd made a similar patch making it possible to
modify the configuration of scandir for additional INI files. He found the
--with-config-file-scan-dir configure option very useful, but limited.
His solution, on the other hand, allowed him to dynamically modify the directories
to be scanned or to add an include parameter in php.ini.
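For anyone wanting to see what the existing scan directory mechanism actually picked up, a small sketch along these lines may help. It assumes a PHP version that provides php_ini_loaded_file() alongside php_ini_scanned_files(); the variable names are purely illustrative.

```php
<?php
// Report the main php.ini and any extra INI files found in the scan directory.
$main = php_ini_loaded_file();          // path string, or false if none was loaded
$scannedRaw = php_ini_scanned_files();  // comma-separated list, or false if none

$scanned = array();
if ($scannedRaw !== false) {
    foreach (explode(',', $scannedRaw) as $file) {
        $file = trim($file);            // entries are separated by commas and newlines
        if ($file !== '') {
            $scanned[] = $file;
        }
    }
}

echo 'Main php.ini: ', ($main === false ? '(none)' : $main), "\n";
foreach ($scanned as $file) {
    echo 'Scanned: ', $file, "\n";
}
```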
Short version: Funnily enough, John Mertic was hoping for this for the Windows installer too...
RFC: Moving core extensions to PECL
Ilia mailed the internals list to sound out his idea about moving the com_dotnet, mhash and sockets extensions to PECL for the PHP 5.2.0 and PHP 6 releases. His reasons for picking on these three were explained in some detail.
The com_dotnet extension (Ilia called it 'COM') has no maintainer at present and a high number of bug reports open against it - many of which are crashes. It's a Windows-only extension; PECL Windows binaries are readily available, and most Windows users don't compile the extension from source anyway. A move to PECL would allow for an independent release cycle, which would make it possible to deploy any fixes quickly; and Ilia hoped it might also encourage individuals or companies to take an interest in maintaining the extension.
The mhash extension has been superseded by ext/hash, which is enabled by default and requires no external libraries. Similarly, the sockets extension - which is unmaintained - has largely been superseded by the more consistent and stable streams API. This much said, Ilia opened the floodgates by asking people for their thoughts on the matter.
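As a rough illustration of what ext/hash offers out of the box, the sketch below computes a plain digest and a keyed digest. The message and key are made up for the example, and no claim is made here about how individual mhash calls map onto these functions.

```php
<?php
// ext/hash needs no external library. Inputs are invented for illustration.
$message = 'hello';
$key = 'secret';

$algos = hash_algos();                     // names of the available algorithms
$digest = hash('md5', $message);           // hex-encoded digest
$mac = hash_hmac('sha1', $message, $key);  // hex-encoded keyed digest (HMAC)

echo count($algos), " algorithms available\n";
echo "md5:       $digest\n";
echo "hmac-sha1: $mac\n";
```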
Straight away, the difference between *nix-only developers and those who occasionally code under win32 showed up. Sebastian Bergmann and Tony both agreed with Ilia on all counts, but Stas Malyshev wrote point-blank that com_dotnet is an incredibly useful extension when running PHP under Windows. He didn't see that moving it to PECL would improve the bug count any, and it doesn't cost much to keep it in the core. Ilia argued that com_dotnet isn't enabled by default under Windows anyway - so how could it make any difference whether it stays in the core or moves to PECL? Stas - and pretty much everybody else who ever used Windows - argued that not only is it both built-in and enabled by default, it also happens to provide core win32 functionality.
Pierre-Alain Joye and Frank Kromann both wrote that they were prepared to do what they could about the bug reports mounting up there. Wez Furlong arrived on the scene to explain that, although he no longer has time to maintain the extension, he still has a vested interest in it and would be happy to review any patches. He suspected the bug reports would prove to be 'mostly duplicates' (now there's an interesting phrase) and wrote that it works fine for most of its users, most of the time. Wez wasn't aware of the extension being critically flawed to the point where it should be removed from the core. Besides which, 'it's a bit like suggesting that we make the exec family of functions an optional download. In theory, it sounds like a great idea, in practice, it's a pain in the ass and will leave people wondering what you were smoking :-)'.
Marcus Börger was simply concerned over maintenance; he didn't mind keeping ext/com_dotnet so long as somebody was willing to take responsibility for it. Faced with the evidence - all of which was fully backed by the majority of the core dev team - Ilia dropped the idea of moving com_dotnet, crossing his fingers that somebody would find time to go through its bug reports. On to the rest...
Stas had nothing against moving ext/mhash, except that ext/hash needed better coverage in the manual beforehand (to which Ilia agreed). He felt that the sockets extension probably shouldn't be moved in a minor PHP version; people use it in applications, and would need to rewrite their code to a significant degree to migrate those apps to use streams instead. Ilia agreed, but pointed out that there is nobody to support the extension's users; they could be leading themselves to a dead end if they started using ext/sockets in its final days. Stas suggested putting a notice in the manual to say it would be leaving the core in PHP 6 and recommending streams instead. His chief concern was existing code, rather than new users. Derick Rethans and Pierre both expressed similar concerns over ext/sockets. Frank thought the extension shouldn't go to PECL until everything it handles can be handled using the streams API. Mike Wallner simply wrote '+1 for PHP 6' against ext/sockets.
Johannes Schlüter - the only developer to show any sign of caring about ext/mhash - didn't think any extension should be removed from the core 'on a minor release'. He'd like to mark them as deprecated instead, to give the users a chance to see the warnings and update their code beforehand. Ilia agreed that it would be wrong to move anything as part of a patch release, and promised not to move anything before PHP 5.3.0. It would be nice, he added, if someone from the docs team could add a notice to the ext/mhash and ext/sockets pages in the PHP Manual indicating that those two extensions wouldn't be core for much longer.
Short version: The lack of concern over ext/mhash is a measure of the success of ext/hash.
TLK: PHP 5.2.1 status
Ilia, wearing his PHP 5.2 series Release Master hat, made an announcement regarding his plans for the PHP 5.2.1 release:
Just a quick notice to everyone that I'll be making RC1 of 5.2.1 on Thursday (November 14th), after which only bug fixes will be allowed into the tree. So, if you have any major commits pending, now is the time to make them. This will be the only RC this year and will be followed by RC2 in the first week of January. If all goes well expect the PHP 5.2.1 final release in late January.
Short version: He meant December.
RFC: unset($object)
Following on from a brief on-list discussion between PHP users about the
difficulty of destroying cross-referenced objects, Sebastian Bergmann posted an RFC
suggesting internal support for this. Having proved that - as
Arnold Daniels had found earlier - calling unset() on a parent object
does nothing because the child still references it, Sebastian suggested a 'magic'
method that would automatically be called when unset() is called on an
object that implements it, allowing the user to do something like this (given that
the name __unset() is already taken):
public function __new_magic_method()
{
Sebastian asked whether anyone was aware of any solution to the problem other
than explicitly unsetting the $children array, adding that even if one
exists he believed the proposed method would be useful.
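The userland workaround Sebastian alludes to, explicitly emptying the children before dropping the parent, might look something like the sketch below. The Node class, its release() method and the destructor counter are all hypothetical names invented for this illustration; release() merely stands in for what the proposed magic method would do automatically.

```php
<?php
// Hypothetical classes for illustration; not taken from the thread.
class Node
{
    public static $destroyed = 0;   // counts destructor calls
    public $parent = null;
    public $children = array();

    public function addChild(Node $child)
    {
        $child->parent = $this;      // child points back at its parent
        $this->children[] = $child;  // parent points at the child: a cycle
    }

    // Userland stand-in for the proposed magic method: break the cycle by hand.
    public function release()
    {
        foreach ($this->children as $child) {
            $child->release();
            $child->parent = null;
        }
        $this->children = array();
    }

    public function __destruct()
    {
        self::$destroyed++;
    }
}

$parent = new Node();
$parent->addChild(new Node());

$parent->release();  // without this call, unset() alone would not free the pair
unset($parent);

echo Node::$destroyed, " objects destroyed\n";
```

The drawback is the one raised in the thread: every caller has to remember to invoke the cleanup method, and it does nothing for references held elsewhere.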
Etienne Kneuss suggested using references:
function &setup() {
Sara Golemon got involved. She wrote that the code would provide better analysis if there were notification of the current state of both relevant reference counts:
class foo {
That code assumes there is only one child. Multiple children will
'probably' share a single object store reference through a common
zval with multiple references:
class foo {
However, 'that still doesn't cover the case where there are multiple object
store references spread out among the children (and possibly other variables not
contained in the object itself)'. It is not possible to aggregate the total
number of variable->zval->object references at this point; only
the zval in the process of being dereferenced is known. Sara added that this is part
of the reason PHP doesn't have delete().
Sebastian failed to see why anyone would need to know the current state of the
reference counts; if his new magic method existed there should be no problem with
unsetting the $children property:
unset($parent) -> "magic method" ->
$children = array() -> no references to $child objects
although he recognized that nothing would happen where other references to the
$child object exist.
Arnold Daniels suggested hunting down circular references on
unset(). Although noting that 'this might be bad for performance', he
felt that memory leak prevention was the responsibility of the language rather than
that of the user. When an object is destroyed because its reference count is
0, unset() should be called for all its children;
unset() should also be called on a variable leaving the call stack
unless destroy() was. Ants Aasma reckoned Arnold's solution would
definitely be too slow; most of the other languages he knows only check visibility
periodically.
Matthias Pigulla wrote that the issue of cross-referenced object destruction
comes up regularly on PHP user mailing lists and in occasional bug reports;
while workarounds are possible in userspace code, they are 'painful and
messy'. If the only problem with Arnold's solution is that detection is too
slow, would it be possible to add a userspace function called something like
gc_cleanup() to perform the scan? Those using PHP in a request/response
environment wouldn't need it because everything is freed at the end of a request,
but users working in other environments could find a good place to make the call,
and are unlikely to complain about the delay.
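Later PHP releases (5.3 onward) did gain an explicit cycle collector, exposed as gc_collect_cycles(), which is close in spirit to the gc_cleanup() call Matthias describes. The sketch below assumes such a version; the Item class is a hypothetical example.

```php
<?php
// Assumption: PHP 5.3 or later, where the cycle collector is available.
class Item
{
    public $other = null;
}

gc_enable();
gc_collect_cycles();   // clear out anything already pending

$a = new Item();
$b = new Item();
$a->other = $b;
$b->other = $a;        // $a and $b now reference each other

unset($a, $b);         // reference counts stay above zero, so nothing is freed yet

// Explicit scan at a moment of the caller's choosing.
$collected = gc_collect_cycles();
echo "collected: $collected\n";
```

A long-running script could make this call at a quiet point in its loop, which matches the suggestion that such users are unlikely to mind the delay.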
Short version: A problem without an owner.
CVS: Ming API upgrade, PDO in HEAD
Changes in CVS that you should probably be aware of include:
- Ext/pdo_mysql now defaults to use buffered queries and prepared statement emulation [Ilia]
- PDO bugs #39483 (Problem with handling of char in prepared statements), #38252 (Incorrect PDO error message on invalid default fetch mode), #38319 (Remove bogus warnings from persistent PDO connections) and #36798 (Error parsing named parameters with queries containing high-ascii chars) were fixed [Ilia]
- In ext/session, bug #37627 (session.save_path check checks the parent directory) was fixed [Ilia]
- Zend Engine bugs #38274 (Memlimit fatal error sent to "wrong" stderr when using fastcgi), #39721 (Runtime inheritance causes data corruption) and #39775 ("Indirect modification ..." message is not shown) were fixed [Dmitry Stogov]
- tolower() related functions were improved in VC2005 builds by caching locale and using tolower_l() - giving a 10-18% speedup in benchmark tests [Stas]
- In ext/openssl, bug #39571 (fsockopen() timeout param does not affect ssl/tls handshake) was fixed [Ilia]
- In ext/xsl, bug #39625 (Apache crashes on importStylesheet call) was fixed [Rob Richards]
- POSIX bug #39754 (Some POSIX extension functions not thread safe) was fixed [Ilia]
- In ext/pdo_mysql, bug #39759 (Can't use stored procedures fetching multiple result sets) was fixed [Ilia]
- In ext/oci8, bug #39732 (oci_bind_array_by_name doesn't work on Solaris 64-bit) was fixed [Tony]
- Heap corruption when adding/caching typelib in ext/com_dotnet was fixed, closing bug #39606 [Rob]
- Ancient safe_mode bug #29840 (is_executable() does not honor safe_mode_exec_dir setting) was fixed [Ilia]
- Hartmut Holzgraefe's new-ish function, sys_get_temp_dir(), was backported to the PHP_5_2 branch [Hannes]
- In ext/gd, bug #39780 (PNG image with CRC/data error raises fatal error) was fixed [Pierre]
CVS HEAD perked up a little this week. Andrei continued implementing Unicode
support for myriad core functions, including var_export(),
http_build_query(), parse_url(), dl() and (in
ext/date) strptime() which, along with
version_compare(), uses runtime encoding for conversion. He made
headers-related functions accept Unicode strings ('but only if their contents can
be converted to ASCII'), and made the iptc* family of functions
Unicode safe. The latter, however, went into CVS HEAD untested, 'cause I know
crap about IPTC'. Sara joined the party, supplying a Unicode upgrade for
fgetcsv() and adding str_getcsv() before Andrei challenged
her over the portability of her code. She subsequently changed fgets()
behaviour to be back compatible again.
Ilia meanwhile worked his way steadily through ext/curl. Andrei wanted to
know how he was dealing with POSTFIELDS, commenting that they would
need to be able to cope with Unicode content. Ilia agreed that they present a
problem, mainly because the first character needed checking for @
(signifying a file upload). Perhaps data could be posted as UTF-8, but the form
might not be expecting that, and there was no way to be certain what the form
actually did expect. Posting binary as-is, and Unicode as UTF-8, might
resolve that issue, but he had concerns over possible side effects. Ilia
subsequently applied a fix to allow the submission of Unicode data in UTF-8
form.
Tony's work on ext/oci8 appeared to be running along smoothly, with
oci_statement_type() covered explicitly and 'most of the OCI8
functions' now marked as Unicode aware. Rob seemed equally on top of the large
and sprawling DOM extension, and marked a vast array of its functions as Unicode
safe in one huge commit.
Frank Kromann updated the ext/ming API in PHP_5_2 and CVS HEAD, bringing
several new PHP methods to life in the process. Ming users now have access to the
SWFVideoStream methods init(), setDimention()
[sic] and getNumFrames(); swfprebuiltclip::init();
SWFMovie::namedAnchor() and SWFMovie::protect(); and a new
function, ming_setSWFCompression(). There are also two new
SWFTextField constants, SWFTEXTFIELD_USEFONT and
SWFTEXTFIELD_AUTOSIZE, and several for SWFSound:
SWF_SOUND_NOT|ADPCM|MP3|NELLY_COMPRESSED,
SWF_SOUND_NOT_COMPRESSED_LE, SWF_SOUND_5|11|22|44_KHZ,
SWF_SOUND_8|16_BITS and SWF_SOUND_MONO|STEREO.
Tony committed a patch introducing a BSD licensed implementation of
double-to-string utilities to replace the previous LGPL'd version. He noted that the
change also fixes thread safety issues in zend_strtod(). Matt Wilmas
noticed some changes to formatted_print.c in there. He had a vested interest
in the file, having had a patch for it since August. He wrote that Tony had added the
specifiers g/G and E - part of his own patch
- but missed the F specifier, although it was still present in
php_formatted_print(). Matt was prepared to accept that the locale
decimal point might be handled differently now, but thought F should be
in there for back compatibility. Tony promptly added the F specifier,
and took a look at Matt's old patch. He couldn't see any obvious problems with it,
but wrote that he'd need to play with it before he was certain enough to commit
it.
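To see the specifiers under discussion side by side, a quick sketch such as the following can be run against any build that supports them all. The sample value is arbitrary; the one documented difference worth noting is that f honours the locale's decimal point while F does not.

```php
<?php
// Format one arbitrary value with each specifier mentioned above.
$value = 1234.5678;

$formats = array('%e', '%E', '%f', '%F', '%g', '%G');
$results = array();
foreach ($formats as $format) {
    $results[$format] = sprintf($format, $value);
}

// Print a small two-column table: specifier, then formatted result.
foreach ($results as $format => $text) {
    echo str_pad($format, 4), $text, "\n";
}
```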
Stefan kicked up a fuss following a commit from Ilia to fix bug #39763 (magic quotes are applied
twice by ext/filter in parse_str()) in PHP_5_2. He wrote that a
comment suggesting php_register_variable_safe() was responsible for
adding the magic_quotes slashes was simply wrong; ext/filter
does that job now. Further, a previous commit there by Tony had broken
magic_quotes_gpc completely, introducing potential SQL injection
vulnerabilities. Stefan wasn't entirely happy with the filter extension anyway; he
thought it should be rewritten to support daisy chaining, work as a filter
rather than registering the variables itself, and support cookies 'correctly'.
Ilia survived the onslaught. He wrote that Stefan was wrong about the comment in
the first place; php_register_variable_safe() does indeed put the
slashes there. That was precisely why Tony's earlier changes had been correct; magic
quotes shouldn't be applied for PARSE_STRING() because the slash-adding
function would be executed on the returned value. Ilia thought daisy chaining should
be supported by providing hooks and having them call the stock filter functions. The
variable registration is as it is in ext/filter because it makes the API
simpler, and he didn't know of any really good reason to have it any other way.
However, he was curious over Stefan's mention of incorrect cookie support, and asked
him to elaborate, Stefan being a bit of an expert in that area.
Stefan looked again at the code, and agreed about the slashes. However, this
simply illustrated the need for variable registration to be kept separate from
filtering. As for hooks, although he had no problem with providing his own, it made
no sense to him to have input filter hooks in PHP 5.1 and then find them unusable in
PHP 5.2, where ext/filter takes them over. In fact, that abuse of the input
filtering hooks was the main reason he could see for moving variable registration to
a single place; several codepaths currently lead to different results. In
fact, wrote Stefan, if ext/filter didn't insist on registering
variables itself, the bug Ilia had just fixed would never have arisen. As for the
cookie issue, Stefan gave a clue. Somewhere down the line,
php_register_variable_ex() had been changed to handle cookies
differently from other variables; cookies with the same name would be dropped after
the first was registered. In ext/filter RAW this (correct)
behaviour still stood, but the filtered variables behaved differently...
Pierre retorted that if it were possible to work around the filter extension,
filter itself would make no sense. magic_quotes had known issues, which
was why it was set to disappear in PHP 6 anyway. As for the business over the
cookie, if Stefan had found a bug he should report it at bugs.php.net in the normal way; it would
make it easier for the team to track it. Stefan replied loftily that his bug reports
weren't wanted at bugs.php.net because he uses a patched version of PHP; he was
therefore not submitting any more bug reports. However, Pierre should feel free to
submit one of his own.
Ilia wrote that actually, daisy chaining wasn't a bad idea; it's just that nobody had expressed the need for it before. The variable registration, though, is as it is largely in an attempt to reduce memory usage. He suspected there might be a better solution to prevent duplication of data... Ilia concluded his post with a promise to look into the cookie business, and subsequently did. He found he'd fixed the discrepancy himself earlier in the week, in the course of fixing an entirely different bug.
Wez wound up the week with a surprise move when he merged PDO (but not the drivers, yet) from the PHP_5_2 branch to CVS HEAD. He noted in his commit message that the source currently compiles against both PHP 5 and PHP 6, and asked that anyone 'poking around in here' ensures it stays that way.
Short version [thanks Andrei]: Looking forward to the next installment of "PDO in Unicode Land"!
PAT: make test -n, CV experiments
Marcus took a look at the make test -n flag patch used by the Gentoo
PHP distro, and wrote to Luca Longinotti that the PHP team couldn't use it. It would
prevent shared extensions being loaded, and they couldn't use dl() in
their test scripts. The only workable approach would be to have a new
make call that passes -n to the test script, because the
whole point was to test the entire PHP deployment rather than just the core. That
said, there is already an environmental variable TEST_PHP_ARGS that
serves the purpose; you could set TEST_PHP_ARGS=-n and the problem
would be gone.
Brian Shire wanted to know whether Marcus envisaged a second version of
make test that included the -n option, whether
-n should be included in the default make test, whether a
completely different solution was needed, or whether he should drop the whole idea.
He'd thought of a more complex solution but not explored it yet; it might be
possible to ignore duplicate extensions during make test. However, this would mean
complicating the CLI SAPI for a single, specific scenario. Brian went on to say that
he hadn't known TEST_PHP_ARGS existed, and wondered whether it might be
sensible to have shared extensions set it? Marcus, however, wasn't convinced that a
solution was even needed. It seemed to him more of a documentation issue.
Later in the week, Brian provided another patch optimizing the
extract() function by eliminating unnecessary calls to
strlen(). Ilia, calling it a 'good catch', promptly applied the
patch in the PHP_5_2 branch and CVS HEAD.
Alexey Zakhlestin, who had been looking into the problem of persistent memory
management, believed he'd sorted out most of the issues. He wanted to know if there
was an 'official' way to deep copy a non-persistent zval into a
persistent zval, given that zval_copy_ctor() doesn't have
a flag for this. If not, he could see a need for
zval_pcopy_ctor(zval *value, zend_bool persistent)
and
#define zval_copy_ctor(v) zval_pcopy_ctor(v, 0)
similar to the *alloc() functions.
Sara had been idly toying with compiled variables. As an experiment, she'd
applied them to pi(), and had seen a consistent gain of approximately
18% using the simple test:
for ($i = 0; $i < 10000000; $i++)
    pi();
Although recognizing that the test is not normal usage, and that her patch doesn't address dynamic function calls, method calls or class resolution, Sara felt the results probably meant this was a good time to start discussing CV in functions.
Ilia was interested, but doubted whether the overall speed improvement would be
anywhere near so high. That said, it would certainly make PHP faster than before.
Sara agreed entirely; reality, as she'd already mentioned, is not reflected in her
test script. Also, her figures were based on unicode.semantics=off. She
thought optimized class fetches would probably help in several cases too, but wasn't
prepared to put in the work for that, 'unless it sounds worth doing to enough
people'. Ilia pointed out that optimized class fetches would only be useful for
native classes.
Dmitry called it an 'interesting patch', and wrote that the Zend team had
had a very similar idea in the past. He showed Sara that it's possible to optimize
function calls much more by optimizing ZEND_INIT_FCALL_BY_NAME as well
as ZEND_DO_FCALL, and that the same cached entries can be reused for
all the op_arrays from a given PHP file. He offered to send a demo
patch in the next few days. Sara replied that she'd only presented this as a rough
idea, to spark discussion. She agreed that there was a lot that could be done to
improve it, both in terms of coverage and in the actual implementation. Making the
cache per-file rather than per-scope was definitely a good idea, but she hadn't
found a simple way to track the current file. All in all, she looked forward to
seeing Dmitry's patch.
One Kevin Hoffman offered up perhaps the tidiest bug report
and patch ever seen on these lists, for bug #39751 (putenv causes string copy of freed memory
region, causing crash). Edin promptly applied Kevin's fix.
Tony applied Matt's zend_u_strtod() implementation mid-week,
announcing it in the commit message as a 'major speedup when using floats in
Unicode mode, also fixing several problems with the current code'. Matt himself,
however, had moved on to pastures new; he posted a patch for
zend_u_strtol() and HANDLE_U_NUMERIC() allowing only ASCII
digits and sign characters. His initial tests showed all was well, and the change
brought a performance increase too.
Short version: Compiled variables could be a way forward.

