FIX: RETURN_RT_STRING
FIX: OS X and .dylib
TLK: Coverity
BUG: zend_highlight core dump
TLK: DB2 support
REQ: PDO_Firebird maintainer
CVS: Zend API bump averted
PAT: PCRE, ext/filter and LDAP
FIX: RETURN_RT_STRING
Still working her way through the code base for PHP 6, Sara Golemon noticed a bunch of failed tests due to memory leaks and tracked down the cause. In Unicode mode, the family of macros with the
RETURN_STRING(s, 0);
|
semantic become
RETURN_RT_STRING(s, 0);
|
with the binary value being translated to Unicode using a conversion function
called by the macro. The value of the original s was leaking memory
during the conversion process. She'd found some places in the source where this
issue had been handled, sort of:
RETVAL_RT_STRING(s,
0);
|
which is, as she said, perfectly valid in itself, but 'a little
inconsistent' with the standard RETURN_STRING()family semantics,
not to mention problematic for RETURN_RT_STRING() usage in general.
Sara offered up two solutions; either modify the macro definitions to call
efree() where there is a copy, or modify the conversion macro to accept
a flag that would trigger a call to efree(), and alter the
RETURN_RT_STRING() macros to support it. She asked for feedback.
Dmitry Stogov replied, saying that the first solution would fail because a string
passed to ZVAL_RETURN_RT_STRING() cannot use emalloc().
The second solution would work, but had Sara considered reusing the second argument
so that 2 could mean 'duplicate and free'?
Sara wrote that she'd pasted the wrong version of the first solution into her
initial post; it should've read 'modify the macro definitions where there is NOT a
copy'. The second parameter, duplicate, should only ever be set to
0 where the string is allocated memory with
emalloc(), 'otherwise the Engine would get in trouble freeing
it later on'. Given this consideration, she offered a further solution; leave
the existing prototypes as they are and have them call efree() where
there is no copy. This would involve creating a new set of macros,
*_RT_STRINGL_NOFREE(s, len)
|
to be used when duplication is not needed but s should not be freed,
allowing edge cases to reuse s. Sara added that a RETURN
variant wouldn't be needed in this case, and in fact would be guaranteed to leak in
Unicode mode. She and Andrei Zmievski had discussed Dmitry's idea of reusing the
duplicate argument, but had rejected it on the grounds that it was
inconsistent with the existing macros; duplicate has always been a
binary value until now.
Dmitry pointed out that there is in fact a way to avoid Engine problems without
calling emalloc(). 'You can use ZVAL_RT_STRING(&fname,
"strlen", 0), then call zend_call_function(&fname) and do
not destroy fname'. Sara, who hadn't thought of that, followed
through with an updated list of options:
- Assume
auto_freeforRETURN_RT_STRING(s, 0)only - Expand
*_RT_STRING(s, 0)to include a newauto_freeargument - Overload the existing
duplicateargument to include auto-free logic - Duplicate the macros to one auto-free, and one non-auto-free version
Her own preference was for a combination of 1) and 2), but Dmitry pointed out
that 3) would require fewer changes, and demonstrated how it could work by drafting
two flags and rewriting the ZVAL_U_STRINGL() macro to use them:
#define S_DUPLICATE (1<<0)
|
The advantage of the approach would be that only those calls that actually leak
would need to be altered. Sara agreed, but changed the prefix for the new constants
to ZSTR_ in the full patch she provided. Dmitry okayed the patch, and
asked her to update the *_ASCII_STRING() family in the same way; he'd
found that they suffered from the same problem.
Usage is as follows, with duplicate (now flags) having
the option of a new value, 2, to trigger automatic freeing of the
original string during Unicode conversion:
/* Uses s as-is for non-unicode, or frees s
after converting to unicode */
|
Short version: A lot of deep thought went into this pluggage. Appreciate it.
FIX: OS X and dylib
Christian 'Chregu' Stocker wrote to internals@ to report that iconv detection no longer worked in the PHP_5_1 branch under Mac OS X, following an earlier M4 patch from Andrei. He didn't want to revert the patch because he figured Andrei must have had some reason for making the change...?
Andrei responded, explaining that SHLIB_SUFFIX_NAME is used to
represent both .so (bundles) and .dylib (dynamic libraries) under OS X
during compilation. They should be treated separately; he'd considered introducing
an additional variable specifically for the platform, but had hit a snag at the
point where PHP_SHLIB_SUFFIX is registered; which variable should that
be?
For some reason Andrei and I were chatting that night, and I'd been up to my neck in M4 code for the PHP-GTK project for the previous two weeks, so between us we came up with a workable solution fairly quickly. Andrei posted the resulting patch [dead link] to the list, commenting that it worked on his own Mac but he needed feedback from people on other platforms before he felt confident enough to apply it. Chregu tried it and found the typo, but reported that it worked otherwise. I posted the working version to him off-list to double-check; he'd fixed a symptom rather than the cause. Success! Andrei subsequently applied the patch in CVS, sans typo.
Later in the week, Jani Taskinen came home from Afghanistan on leave and went ploughing through the entire PHP code base, muttering under his breath as he went. Our brave new macro was one of the many things that got changed along the way.
Short version: Presumably it still works...?
TLK: Coverity
Nuno Lopes wrote to the internals list asking whether anyone else had 'received the Coverity report'? He was hoping to help fix the remaining detected bugs reported by the software on the site, but had had no response to his application.
Tony Dovgal replied that several people - himself included - have had access for some time, and advised Nuno that applications can take a while to process. He added pointedly, 's/bugs/pieces of code that Coverity doesn't like/' - there were a lot of false positives in among the initial 204 bugs reported for the PHP source. Nuno wondered whether any real bugs had actually been found through the Coverity scan, and Tony admitted to having found a couple of potential problems that way. Marcus Börger, presumably catching up on this exchange, started labelling his Coverity-related commit messages later in the week; but Nuno - and also Michael Vergoz - were still waiting for their accounts by the end of it.
Short version: We're still not sure how useful Coverity is at this stage.
BUG: zend_highlight core dump
A fairly bemused Rasmus Lerdorf posted to internals@. He'd spent a full day
trying to track down a repeated crash under current PHP_5_1 on one of the servers
Yahoo! donated to php.net, and had run out of ideas. The crash was always on a
request to the
MySQL reference page in the manual, and occurred during a call to
highlight_string() for one particular user note. He'd taken the
raw note data from
the backend files and attempted to reproduce the crash with it, but it was fine in
itself. 'Somehow CG(op_array) is getting set to crap coming into
that call, but only sometimes. Yet if it is random memory corruption, why is the
crash always exactly the same?'
He provided a backtrace, but went on to say that he'd had no luck when trying to reproduce the crash under valgrind either. It could be 64-bit specific, or he might have somehow set up the environment differently, or there could be some kind of re-entrancy problem...
Later, Rasmus posted again to report that the most recent core, although still in
highlight_string(), had been sparked by the first user note on the
Brasilian Portuguese copy of the register_shutdown_function()
manual page. Which rules out re-entrancy, at least.
Short version: Mystery crash in 5_1 branch, probably 64-bit related.
TLK: DB2 support
IBM developer Kellen F Bombardier wrote to the PECL development list to complain that, although he'd been able to create the new directory pecl/pdo_db2 without any difficulty, he was unable to commit any files to it. Could someone please give him the karma to access the directory?
A furious Wez Furlong - known as 'the King of PECL' for the very good reason that he generally controls what happens in the repository there - let Kellen feel the long end of his whip. (He didn't say it, but creating a directory for a new PECL extension without mentioning its existence to the community beforehand is generally considered quite rude.) Wez has been keeping PDO_ODBC - which has DB2 support - 'clean IP' from the start, in the hope that IBM would eventually contribute to it. Dan Scott - who is no longer working for IBM - had given him reason to believe there was a strong likelihood of this happening. Many of the developers working on the PHP project are unhappy about not having access to the module, and of course Wez has also had to support PDO_ODBC entirely unaided as well as taking the flak behind the scenes from those unhappy developers. 'And now you're (IBM) going to release your own driver anyway?' he asked incredulously.
Wez went on to ask for a clarification of IBM's position, pointing out that the rest of php.net can't contribute to IBM code without a CLA. He personally had been keeping everyone away from his own code where it might become part of an IBM project, because there is no clear process in place. If IBM planned to create a further PDO_DB2 driver that the rest of php.net couldn't contribute to, concluded Wez, it was probably time that IBM kept their code in their own repository and simply published their package releases via PECL. He could remove IBM code from the PECL repository and the ACLs from the PDO_ODBC driver, and allow everyone else to contribute once more.
Kellen replied that IBM's position is based on their concern for a stable PDO driver for DB2. They felt that DB2-specific nuances might not be catered for in a non-specific driver. As far as the political aspect of the situation went, he had started internal communications on the topic and the appropriate groups within IBM would work to find a solution; he promised to keep the community updated over this. In the meantime, Kellen had removed the PDO_DB2 project page and pledged to work towards a more community-based approach to fulfill the company's needs. However, at this point he wasn't certain whether this would be through PDO_ODBC, PDO_DB2, or some entirely new module; he cited a couple of bugs he'd opened against PDO_ODBC that had led to the perceived need for a specific DB2 driver.
Wez pointed out that 'having two pieces of code that are > 90% the same, maintained by two people that are not legally allowed to communicate code, doesn't help either piece of code'. Not only would there be a duplication of effort, but the differences in implementation over time would be likely to lead to ever more subtle problems. On the other hand, he had absolutely no problem over adding conditional DB2 specific code to PDO_ODBC. Wez agreed that having a single code base - any single code base - the teams could collaborate on would be a step in the right direction, but argued that the whole point of keeping the PDO_ODBC module clear in the first place had been to allow precisely that. Noting (silently) that both Kellen's bug reports were against the PHP 5.1.2 release rather than a PHP_5_1 branch snapshot, Wez added that he believed he'd fixed related problems in PDO and PDO_ODBC recently, and this later proved to be the case. He ended by reiterating that there was absolutely no point in replicating code, before thanking Kellen for his efforts on the political front.
Short version: When worlds collide...
REQ: PDO_Firebird maintainer
PEAR contributor Lorenzo Alberton wrote in to ask about the current status of the PDO_Firebird module. He'd tested it a little, and believed it wasn't on par with the other PDO drivers. Was there some comparison table or roadmap he could refer to to check on the features that had been implemented in individual PDO drivers? Was Firebird/InterBase guru Ard Biesheuvel still active in PHP development?
Marcus confirmed that there is no such table; theoretically at least, all the PDO drivers should support the full range of functionality once they were working, but he hadn't been able to generate a gcov report for PDO_Firebird. The module appeared to have been abandoned. He also hadn't heard from Ard in a while, and put out a call for him on the list.
Firebird user Lester Caine wrote that the lack of support for certain key Firebird/Interbase features in PDO meant that the native driver would still need to be used alongside any 'limited PDO port'. He implied that this was the reason Ard hadn't put any further effort into the driver, which isn't entirely the case; it was more that the aims for PDO in the early stages of the PHP 5.1 series were irrelevant to Firebird. Moreover, taking PDO_Firebird to the point where it becomes useful will involve a lot more work than has been the case for any of the other PDO drivers, thanks to the unique design goals in Firebird itself.
Lorenzo hoped Ard would still be interested in working on the PDO driver, given that he'd made such a good job of the original interbase extension in PHP 5, but Wez informed him that he was actively looking for someone to maintain PDO_Firebird. At this point Lorenzo did two remarkable things. He posted 'a call for help' on the Firebird news page, and wrote a full online report regarding the current status of the driver. If only everyone could be so pro-active.
Short version (thanks Lorenzo): The PDO_Firebird driver needs A LOT of love.
CVS: Zend API bump averted
Changes in CVS that you should probably be aware of include:
- A new function,
mb_check_encoding(), was added to the mbstring extension in the PHP_4_4 branch. The function is security-related; it provides a means of detecting a potential encoding attack. [Seiji Masugata] - Also in ext/mbstring, new functions
mb_stristr()andmb_strrichr()were added - this time in CVS HEAD only [Seiji] - Functions using the new Zend API functions
zend_object_std_init()andzend_object_std_dtor()were#ifdef'd out in the tidy, simplexml, xmlwriter and xmlreader extensions to avoid the need for a Zend API number change in the PHP 5.1 series. (The check will be removed in PHP_5_2 branch.) [Tony] - In ext/spl, bug
#36941 (
ArrayIteratordoes not clone itself) was fixed in CVS HEAD and PHP_5_1 [Marcus] - ext/mssql bug #33694 (invalid SQL or timeouts makes it impossible to reuse persistent connections) was fixed across all active PHP branches [Frank Kromann]
- Core bugs #36957
(
serialize()does not handle recursion) and #36875 (is_*()functions do not account foropen_basedir) were fixed in HEAD and 5_1 [Ilia Alshanetsky] - Zend Engine bug #36944
(
strncmp()&strncasecmp()do not returnFALSEon negative string length) was fixed in HEAD and 5_1 [Tony] - In ext/oci8, bug
#36934 (
OCILob->read()doesn't move internal pointer when reading 0's) was fixed in HEAD and 5_1. Note: there is no support for UnicodeCLOBs prior to Oracle 10. [Tony] - ext/spl bug
#36981 (
SplFileObject->fgets()ignoresmax_length) was fixed in HEAD and 5_1 [Tony] - pdo_mysql bug #5827
was fixed in 5_1 and HEAD, making the MySQL driver
responsible for 'gobbling up result sets in
closeCursor()and thestmtdestructor' [Wez] - Also in ext/pdo_mysql, bugs #36602 (persistent connections don't work with MySQL 5.0.3+) and #6262 (correctly fallback to emulated statements when the server version is too old) were fixed. Code was also included to default to using emulated prepared statements, but was not implemented due to the stage in the 5_1 release cycle [Wez]
- A new general attribute,
PDO_ATTR_EMULATE_PREPARES, was added to ext/pdo in HEAD and 5_1, replacing the custom attributes in the MySQL and Postgres drivers [Wez] - PHP 5.1.3 RC3 was rolled [Ilia]
Short version: More about RC3 next week.
PAT: PCRE, ext/filter and LDAP
Nuno had been looking into ext/pcre bug #33151,
and wrote to Andrei - as maintainer of the PCRE extension - to suggest that the
MATCH_LIMIT and MATCH_LIMIT_RECURSION parameters should
be tuned 'to more realistic limits'. He could see several approaches to this,
but Andrei replied that he'd been considering simply adding a couple of new .ini
settings to set them with. Nuno promptly posted
a patch [dead link]
to provide these, and wrote that he also felt there should be an
E_WARNING thrown when pcre_exec() fails. Derick Rethans
argued that this should only happen if the regular expression itself was the only
possible cause of failure, but Andrei pointed out that the subject string and
pattern work hand in hand; a 'bad' pattern might work perfectly well on one string
but fail on another. Nuno's point was that currently there is no way to report the
error returned by PCRE. Perhaps every error returned by
pcre_exec() could be influenced by the subject string, but even having
an E_NOTICE would be helpful when it came to debugging. Looking into
it, Nuno went on to report that none of the potential return values could be
triggered by a 'malicious' string, with the exception of
PCRE_ERROR_BADUTF8 - and that, he argued, should have been validated
anyway.
Pierre-Alain Joye, meanwhile, had been looking into pecl/filter, and had
noticed that many of the extension's test failures were related to the
FILTER_VALIDATE_REGEXP mode. He'd found that the mode uses PCRE
natively and is incompatible with ext/pcre, which 'makes its usage a bit
tricky'. Pierre supplied a
patch to make the mode use the ext/pcre API, commenting
that it would be better if every addition to ext/filter were to use it.
Ignacio Arenaza - evidently unaware of the extensive existing patches for the LDAP
module - posted his modifications
allowing it to use pagedResults. He'd tested his patch with W2003 AD
and OpenLDAP 2.1.x servers, using a Linux-based PHP client, and posted diffs against
all current PHP branches (and also PHP_4_3). We still don't have a maintainer for
the LDAP module at present, which is why the existing patches are still sitting in
PAT; it would be good to find one, or at least someone equipped to fully test and
evaluate these offerings.
Finally, Ants Aasma's patch introducing a new ext/gmp function,
gmp_nextprime(), was applied by Tony in CVS HEAD this week.
Short version: The LDAP extension urgently needs a maintainer.

Comments