TLK: More about licensing
FIX: CVS Repository
TLK: Avoiding circular references
NEW: Restarting php-i18n
REQ: Multithreading
TLK: ZTS mode bugs?
TLK: Extension writing docs
NEW: CVS server
CVS: Testing, testing, 123
PAT: LDAP maintenance offer
TLK: More about licensing
Derick Rethans came back from his holiday, and discovered the full horrors of the date wars thread. Being restrained by friends, he contented himself with commenting that ext/xmlwriter had better switch its licensing before it went into core. Pierre-Alain Joye replied that that had been done a week ago (Derick was still catching up on his email), but added that he still hadn't heard a good explanation for 'this new requirement'. Lukas Smith backed him, surprisingly, saying that the chief difference between the license Pierre had hoped to use and the PHP License was that no copyright was granted to the PHP Group in Pierre's version. Given that the PHP Group don't have an army of lawyers at their disposal, he didn't see an issue there. On the other hand, Lukas could see how facing users with a single license for everything that comes with PHP might be 'an automatic gain', regardless of the nature of any additional licenses.
Stefan Walk wondered whether dual licensing would be a problem for a core extension. It would allow users to choose which license they wanted to use it under...
Rasmus Lerdorf pointed out to Pierre that, far from being a new requirement, consistency in licensing was a) the status quo and b) common sense. It means, as Lukas had already noted, that 'users don't have to wade through a license soup trying to figure out if they meet all the requirements'. It would be possible to add extensions with differing, but compatible licenses - but there would have to be an extremely good reason to do so.
Pierre, arguing that 'common sense' can differ, went on to say that his only issue was over copyright. He was reluctant to give copyright to the PHP Group and thereby lose it altogether; would it be possible to retain copyright for the original authors as well as for the Group? Pierre also wanted to know where underlying extension libraries stood; should they now follow the same policy as extensions? He made it clear that he had no intention of applying a restrictive license to anything in the PHP core; he was thinking only of BSD style licensing.
Zeev Suraski got involved to explain the legal ramifications of having dual copyright in part of the PHP core:
Does it mean that someone who wants to use PHP in his product name has to also come to you in addition to getting permission from group@php.net? Does it mean we need to get your OK on any future license changes, as minor as they may be?
It all went quiet after that.
Short version: Thus endeth the license wars.
FIX: CVS Repository
William A. Rowe mailed the internals list to issue a warning that php.net's CVS repository had been corrupt for the past two weeks. Someone had added a config0.m4 file by duplicating config.m4,v directly in the repository, causing duplicate builds of the affected extension (ext/pcre) and breaking every historical PHP checkout. He went on:
Of course, if one wants to do this in order to preserve history, it's still essential to remove all prior tags from the new ,v file - to avoid exactly this sort of duplicate checkout -by tag-. Of course checkouts by date would still check out duplicate files. And of course nobody wants the true history concealed by removing the old, now removed file from Attic/. So the patch to the true config0.m4,v is attached for someone to modify this directly in cvs. Note this is specific to today's untagged state, and within days this patch would become invalid when config0.m4,v is tagged with new, non- config.m4,v tags.
Jani Taskinen was as confused as most readers would be by this, and wanted to know which old PHP branch William was using. Edin Kadribasic argued that William was right; it should remain possible to check out old versions of PHP directly from CVS. Jani thought a lot more files than this config0.m4 had been moved around, so the possibility was probably already lost. Being uncertain, he gave Edin his blessing to go ahead and patch the file directly in the repository, so long as it didn't break CVS HEAD or the PHP_5_1 branch.
Marcus Börger understood William's email, and confirmed that the solution on offer was correct. Without the patch, the old branches would magically inherit the new file, as if the past had been changed. He then muttered something about 'one more reason to switch to SVN', which is an argument that comes up from time to time.
William, amused by Jani's assumption that the problem was branch-specific, made
it clear that every historical CVS checkout was broken currently - including
those made in the past month. Rasmus - who also understood William's first email -
went ahead and applied the patch, commenting to Marcus that SVN has its own set of
problems. William agreed, saying that importing CVS history is problematic with SVN;
tweaks made to CVS could confuse the importer, and it was 'quite a chore' to
identify and fix all the subtly corrupt ,v files. He added that the
advantage of SVN is that it allows files and directories to be renamed while
preserving history. Realizing he was about to get into one of those SVN vs CVS
arguments, he concluded with a hasty 'not that I'm advocating either way' and
disappeared.
Short version: We're back to building a single ext/pcre. Thanks William.
TLK: Avoiding circular references
PHP 5 user Alan Pinstein had figured out a way to avoid circular references in some data import scripts, saving 'unbounded amounts' of memory, but wasn't sure whether it was legitimate code that could be relied on to work into the future.
He has objects in a parent-child relationship, both linking to each other. The
parent object has an addChildObject() method:
function addChildObject($c) { ... }
To avoid incrementing the reference count when storing the link, the child
object's setParent() method needs a non-refcounted pointer to the
parent:
function setParent(&$parent) { ... }
It was the $this->this part that bothered him. Could it be relied
upon?
Tony Dovgal tried to send Alan to the PHP general mailing list but David Zülke
defended him, saying this was an Engine query and he was interested in what the
developers might say about it too. Alan said he'd tried the general list already,
and had been told there that passing objects by reference is a legitimate way to
avoid incrementing the refcount - but nobody there knew about the
$this->this part. His question was related to a discussion on internals@ dating from last October; it
was about reference counting of objects when assigned by value and by reference. The
ability to have an object handle that is not reference counted is a
prerequisite for preventing circular reference deadlocks and the accompanying memory
leaks in object oriented projects, and he needed to know the officially recommended
way to achieve this.
Alan went on to ask why, in PHP 5.0.4 at least, $obj =
&$this->this; doesn't increment the object's reference count, whereas
$obj = &$this; does. He maintained that references to
$this are essential for proper memory management, unless there is
another way to get a weak reference. He also felt that making
&$this throw an error was wrong, but that changing the value of
$this probably should - even though it's legal to do so in other
languages.
Andi Gutmans agreed that if you create a circular reference, either by reference or by value, there would be a 'memory leak' until the end of the request. This is true for arrays as well as for objects, and is a side effect of reference counting systems. However, given that PHP is generally used within a request/response paradigm, this shouldn't be a problem in real life usage; garbage collecting systems are potentially much slower given that context.
Alan retorted that using a PHP CLI script to import bulk data could, in his eyes, be considered 'real life usage'. He went on to talk about the way the issue has been resolved in Cocoa, which works like PHP 99% of the time but has the concept of a weak reference that is specifically used - in userland code - to prevent circular references. However, his original question was this: he'd found a way that appears to create a weak reference in PHP, and he needed to know whether his approach is condoned or just lucky.
Andi felt that it is possible to design an application in a way that avoids the
issue altogether. Alan agreed, but argued that doing so would both reduce the
functionality of the object layer and add 'an unreasonable burden' on the
object client to undo the circular reference. Besides, in his current situation,
designing around the problem wasn't an option (he was using Propel) because the parent/child
'pointers' were required. Andi then suggested that it's possible, if not perfect, to
create weak references in PHP by using indirect property access. On being pressed to
clarify that, the best solution he could come up with was some kind of id-to-object
mapping, perhaps achieved by overloading __get/__set.
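Andi's id-to-object idea might look something like the sketch below. It is only an illustration, not code from the thread: the ObjectRegistry, ParentNode and ChildNode names are invented here, and the registry itself still holds the parent until release() is called, which is presumably the 'not perfect' part.

```php
<?php
// Hypothetical registry (not from the thread): maps integer ids to objects,
// so a child can remember its parent's id instead of holding a handle to it.
class ObjectRegistry
{
    private static $objects = array();
    private static $nextId = 1;

    public static function register($obj)
    {
        $id = self::$nextId++;
        self::$objects[$id] = $obj;
        return $id;
    }

    public static function lookup($id)
    {
        if ($id === null || !isset(self::$objects[$id])) {
            return null;
        }
        return self::$objects[$id];
    }

    public static function release($id)
    {
        unset(self::$objects[$id]);
    }
}

class ParentNode
{
    public $id;
    public $children = array();

    public function __construct()
    {
        $this->id = ObjectRegistry::register($this);
    }

    public function addChildObject($c)
    {
        $this->children[] = $c;
        $c->parent = $this; // undeclared property, so ChildNode::__set() runs
    }
}

class ChildNode
{
    private $parentId = null;

    public function __set($name, $value)
    {
        if ($name === 'parent') {
            $this->parentId = $value->id; // keep the id, not the object handle
        }
    }

    public function __get($name)
    {
        if ($name === 'parent') {
            return ObjectRegistry::lookup($this->parentId);
        }
        return null;
    }
}

$p = new ParentNode();
$c = new ChildNode();
$p->addChildObject($c);
var_dump($c->parent === $p);     // bool(true): resolved through the registry
ObjectRegistry::release($p->id);
var_dump($c->parent);            // NULL: the child never held the parent itself
```

The child stores nothing but an integer, so parent and child no longer form a cycle; the price is that somebody has to call release() when the parent is finished with.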
Jochem Maas was also interested in the concept of weak references, and referred to a previous list enquiry about making a distinction between object handles and references to them - again, an inconclusive discussion. He added that he also has an OO import script that eats huge amounts of memory; documentation indicating strategies for overcoming that issue would be of great benefit to him, and probably to many other developers.
Wez Furlong suggested that if a piece of code 'feels wrong' you should probably not rely on it to work forever. He went on to say that PHP isn't always the right tool for the job; finding a limitation that can't be avoided without nasty hacks is probably a symptom of that.
Meanwhile, Alan had been discussing his $this->this approach on
the Propel development list, and had become convinced it's a bad idea - mainly
because of the way references behave in PHP. He'd missed the point that a reference
is bound to the same place as the original; when the original changes, so does the
target of the stored link. However, it concerned him that Wez had implied PHP is
only intended as a web tool, rather than as a general programming language. This had
the effect of making him seriously reconsider whether to use PHP in the long term;
if he had to develop his core model in, say, Java, he wasn't at all certain he'd opt
to use PHP to write the front-end.
Short version: It'd be really good to find a way to avoid that memory munching.
NEW: Restarting php-i18n
A message from Andrei Zmievski:
I'd like to resurrect php-i18n list and shift all Unicode/i18n discussions over there so that people interested in these topics can better focus on them. To that end I have subscribed a few of you (haha) to it already so don't be surprised.
Short version: You know who you are.
REQ: Multithreading
Even though he recognized he might be the thousandth person to do so, Bart de Boer went ahead anyway and asked whether there are any future plans for native multithreading support in PHP.
Rasmus denied it, saying that asynchronous mechanisms offer a more efficient approach. Bart asked for clarification: by 'asynchronous mechanisms', did Rasmus mean calling other scripts, such as web services, from within the main script? He'd regard that as a good alternative, apart from the extra overhead, scripts and code...
Ilia Alshanetsky also denied it, and suggested fork(), or
pcntl_fork(), saying that on *nix systems fork() is
'nearly as fast as threads and much safer to boot', although much slower
under win32. Bart felt that sounded 'sufficient enough'; Wez suggested he try
proc_open() instead.
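For the curious, here is a rough sketch of the proc_open() route Wez mentioned. It is an illustration only, not code from the thread: the runInParallel() helper is invented here, and it assumes a CLI build where PHP_BINARY points at a usable interpreter. The children are separate processes, not threads.

```php
<?php
// Start one child PHP process per snippet, then collect each child's stdout.
// The children run concurrently, because nothing is read until all have started.
function runInParallel(array $snippets)
{
    // Only stdout is piped; stdin and stderr are inherited from the parent.
    $spec = array(1 => array('pipe', 'w'));

    $jobs = array();
    foreach ($snippets as $key => $code) {
        $cmd = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($code);
        $pipes = array();
        $proc = proc_open($cmd, $spec, $pipes);
        if ($proc === false) {
            continue; // could not start this child; skip it
        }
        $jobs[$key] = array($proc, $pipes[1]);
    }

    $results = array();
    foreach ($jobs as $key => $job) {
        list($proc, $stdout) = $job;
        $results[$key] = stream_get_contents($stdout); // blocks until the child exits
        fclose($stdout);
        proc_close($proc);
    }
    return $results;
}

$out = runInParallel(array(
    'slow' => 'usleep(200000); echo 6 * 7;',
    'fast' => 'echo 2 + 3;',
));
echo $out['slow'], ' ', $out['fast'], "\n"; // 42 5
```

Both children are started before either is read, so the total wait is roughly that of the slowest child rather than the sum of the two.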
Sara Golemon, who never thinks like anyone else, announced that she currently has two embryonic plans running with regard to this concept. One was 'a very non-true-threading approach' involving a combination of ticks and Runkit_Sandbox, which she hopes to work on in the near future. The other was a long-range plan specifically for apache2-worker, using some of the extra threads there to link multiple interpreters together and allow one request chain to start or shut down another in a sibling relationship. But, she added, 'don't look for this anytime soon' - it's still on her 'cool things to try' list.
Joseph Crawford argued that PHP should definitely aim to support multithreading in the future, particularly if it wasn't just going to be geared towards Web applications any more. Rasmus was swift to point out that this is not a development goal for PHP. Meanwhile Andrei hoped to see 'something based on libevent'. Wez introduced him gently to the libevent wrapper pecl/event, much to Andrei's joy.
J. Allen Dove wrote to confirm that using forking and daemon CLI classes for asynchronous processing in PHP 'works like a charm'. He said that his team had written a robust KISS async mechanism that works perfectly, even when processing several million transactions per day. He added that this approach is easily scalable through adding more hardware according to the load.
Bart thanked everyone for their feedback, but said he felt that all the options sound like workarounds for the lack of native multithreading support. Lukas retorted that writing proper multithreaded code is hard work, and would clog up the server anyway. Besides, PHP, being a glue language, should not require multithreading in order to get decent performance/latency for a request. Most of the work is performed elsewhere, e.g. inside the database server. Lukas went on to argue that the fact that PHP can also be used to write daemons, hardcore number crunching and desktop applications should not dominate decisions regarding the core of the language, and so Sara's approach of using extensions to provide additional functionality was to be welcomed.
Short version: For the thousandth time, nope.
TLK: ZTS mode bugs?
PHP user David Oren had encountered crashes during shutdown in the ZTS build of
PHP 5.1. Looking into the source, he'd discovered two issues, the first of which he
felt was almost certainly a bug. He pointed to some recent changes Dmitry had made
to class entry static_members in the Zend Engine, saying that - among
other things - support for runtime declaration (read: dl()) of static
class members was 'completely broken' under ZTS by that patch. He could
provide a patch to fix this if needed.
He was less certain of the second issue: it appeared that the persistent list isn't unlinked cleanly, leading to the global persistent list being effectively shared with the startup thread. It 'felt wrong' to him, but it was also possible that this is intended behaviour, and he'd appreciate feedback on it.
Jani responded that patches are preferable to long stories (note: the original email was a fair bit longer than this), and a short script reproducing the problem is preferable to either. Andi wasn't far behind in asking David for a short reproducing script. David said he'd do his best to oblige, but couldn't guarantee that it was possible to produce one - presumably referring to the second of his reported issues rather than the first.
Short version: Another of those hard-to-reproduce bugs? - maybe.
TLK: Extension writing docs
Joseph Crawford wondered where he could get hold of documentation about creating PHP extensions for use under Windows and *nix-based systems. Sara pointed him towards the internals tutorials she wrote for zend.com and George Schlossnagle's 'Advanced PHP Programming' (which contains a chapter or two on extension development), and also took the opportunity to announce her forthcoming tome 'Extending and Embedding PHP', due out early in 2006. Beyond that, she advised Joseph to ask extension-related questions on the PECL development list rather than on internals@.
Bob Silva wrote to say that he'd started out by copying ext/skel and gone from there. Looking at existing extensions had helped him, as had Sara's zend.com articles, which offer an in-depth introduction to zval structure and the parameter parsing API. Andi intervened to say his own book also has a chapter on extension writing, and is a free .pdf download. Unfortunately this was the week the mailing list server took a fancy to Andi's email, and what should have been a one-off plug reached the list seven times over the next few hours, much to the amusement of all.
Short version: Lots of resources these days, especially if you read Andi's book seven times over :)
NEW: CVS server
Rasmus announced an upcoming switch to a new and much faster CVS server, y1.php.net, over the weekend. There would be 'some downtime and some DNS lag', and he advised CVS users to put the new machine's IP address in their hosts file while waiting for their DNS to catch up.
The job took two and a half hours, after which Rasmus was able to confirm that the new CVS server is now handling requests. Like a proud parent, he went on to describe the machine as 'a monster dual 3GHz CPU machine with 4G of ram running 64-bit FreeBSD6 with 6 73G 10k rpm SCSI drives in raid 10'. He'd timed a full php-src module checkout at 10 seconds, and a CVS update at around 5 seconds. viewcvs was already up and running, and lxr and bonsai would be swift to follow. Anyone seeing any 'weirdness' with the new server should write to systems@ to report it.
Finally, Rasmus thanked Yahoo! for donating the machine and the bandwidth, his colleague Paul Saab 'for clearing all the hurdles to finally get it up', and Wez for his help in the migration to the new box.
Note: bonsai has since been dropped, as viewcvs offers the same functionality.
Short version: w00t!
CVS: Testing, testing, 123
Not a lot went on this week, beyond the multitude of tests everyone wrote thanks to the shiny new gcov report, which shows where tests are most required. Marcus, meanwhile, continued working to perfect the test suite in CVS HEAD.
Jani set out on a self-imposed mission to nuke all things PHP 3 related in CVS HEAD and the PHP_5_1 branch. Unfortunately he broke a few bits and pieces in the process, and Wez made him stop as a result.
Rob Richards updated the libxml library for Windows in the 5_0, 5_1 and HEAD branches to 2.6.22. Edin adjusted the .defs files (exported functions) accordingly, announcing in his commit message that anyone finding their build broken as a result should update their copy of libxml and libxslt from Rob's site.
Apart from that, the only thing of note was that the new ext/hash was tagged stable ready for a PECL release. Thanks go to Michael Wallner and Sara for their work there.
Short version: About time everyone took a break.
PAT: LDAP maintenance offer
One Marcin Obara mailed in a patch to fix a couple of glob() bugs
he'd identified in PHP 4.4.1. The first was that glob() can cause a
crash in ZTS mode where the pattern contains something like
./../../../../* , resulting in an illegal area of memory being
read. The second was that the current code assumes any glob pattern will match files
from one directory only, whereas (claims Marcin) actually a pattern such as
./*/* will match files from several. His solution was to disallow
regular expressions prior to the last slash in the pattern when
safe_mode or open_basedir is set, and to expand the
pattern using virtual_file_ex(). He also wanted to add some missing
calls to globfree() and a zero terminator in line 453. He admitted that
his patch was less than beautiful, but hoped it would help in fixing the bugs he'd
identified. It's way beyond my capabilities to assess this one, so it's sitting in
PAT waiting for someone to yay or nay it.
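Marcin's second point, at least, is easy to check from userland. The sketch below is an illustration only: the directoriesMatched() helper and the throwaway directory layout are invented here, and it says nothing about the safe_mode or open_basedir code paths his patch touches.

```php
<?php
// Return the distinct directories that the matches of a glob pattern live in.
function directoriesMatched($pattern)
{
    $matches = glob($pattern);
    if ($matches === false) {
        return array();
    }
    $dirs = array();
    foreach ($matches as $path) {
        $dirs[dirname($path)] = true; // array keys de-duplicate the directories
    }
    return array_keys($dirs);
}

// Build a throwaway tree: <tmp>/a/one.txt and <tmp>/b/two.txt
$base = sys_get_temp_dir() . '/globdemo_' . uniqid();
mkdir($base . '/a', 0777, true);
mkdir($base . '/b', 0777, true);
touch($base . '/a/one.txt');
touch($base . '/b/two.txt');

// A wildcard before the last slash spreads the matches over two directories.
$dirs = directoriesMatched($base . '/*/*');
sort($dirs);
echo implode(', ', array_map('basename', $dirs)), "\n"; // a, b

// Clean up the temporary tree.
unlink($base . '/a/one.txt');
unlink($base . '/b/two.txt');
rmdir($base . '/a');
rmdir($base . '/b');
rmdir($base);
```

So a check that looks only at one directory derived from the pattern would not cover every file the pattern can return, which appears to be the assumption Marcin was objecting to.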
PHP Group member Thies Arntzen wrote that he had an issue with a fatal error in PDO:
Fatal error: Uncaught exception 'PDOException' with message 'SQLSTATE
[42P05]: Duplicate prepared statement: 7 ERROR: prepared statement
"pdo_pgsql_stmt_086eebf4" already exists' in table.php:670
Investigation showed that his prepared statement failed in executing but not in
preparing, so his code would assign a different value to the bound variables and try
again, resulting in the error. His workaround was to add an is_prepared
flag to pdo_pgsql_stmt and use it to protect against multiple
prepares.
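The logic of that workaround can be modelled in a few lines of userland PHP. To be clear, this is a toy and not the pdo_pgsql C code: FakeServer and GuardedStatement are invented for illustration, and the 'negative parameter fails' rule is just a stand-in for whatever made Thies' execute fail.

```php
<?php
// Toy stand-in for a database server that rejects duplicate statement names.
class FakeServer
{
    private $statements = array();

    public function prepare($name, $sql)
    {
        if (isset($this->statements[$name])) {
            throw new RuntimeException("prepared statement \"$name\" already exists");
        }
        $this->statements[$name] = $sql;
    }

    public function execute($name, array $params)
    {
        // Assumption for the demo: a negative first parameter makes execution fail.
        return $params[0] >= 0;
    }
}

// Hypothetical statement wrapper that prepares lazily, and only once.
class GuardedStatement
{
    private $server;
    private $name;
    private $sql;
    private $isPrepared = false; // plays the role of the is_prepared flag

    public function __construct(FakeServer $server, $name, $sql)
    {
        $this->server = $server;
        $this->name = $name;
        $this->sql = $sql;
    }

    public function execute(array $params)
    {
        if (!$this->isPrepared) {
            $this->server->prepare($this->name, $this->sql);
            $this->isPrepared = true; // a failed execute must not trigger a re-prepare
        }
        return $this->server->execute($this->name, $params);
    }
}

$stmt = new GuardedStatement(new FakeServer(), 'stmt_1', 'INSERT INTO t VALUES ($1)');
var_dump($stmt->execute(array(-1))); // bool(false): execution fails, statement stays prepared
var_dump($stmt->execute(array(1)));  // bool(true): retry succeeds without a second prepare
```

Without the flag, the retry would send the same statement name to prepare() again and hit the 'already exists' error.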
Ilia took the point, and later committed a fix along the lines Thies had suggested.
Another kind of Ilia, Ilya M. Slepnev, mailed in a patch to fix
flush() under FastCGI. Tony, Michael and Dmitry all looked into the
patch, with Michael being the first to recognize that the patch was wrong but the
bug was real. He made an attempt to fix it too, but it was Dmitry who eventually
found the cause of the problem - PHP no longer spawns additional processes by
default - and fixed it in CVS HEAD, PHP_5_1 and PHP_5_0.
Hardcore LDAP fan Pierangelo Masarati was the last one up this week, posting the latest version of his LDAP API extension patch, along with a test script and a note that the patch should be preceded by the OpenLDAP C API cleanup patch. Once again, Pierangelo asked for feedback about his work; this time, he offered to maintain the LDAP extension if the team would like him to do so. In any case, he promised to continue posting further patches as soon as his work is integrated into PHP source.
Short version: Some stuff to look through this week in PAT.

