Zend Weekly Summaries Issue #343
REQ: New zend_class_entry callback
TLK: Changes, merges and SVN
TLK: Late static binding (again)
REQ: Test merge
TLK: MTOM support
TLK: Bitwise operations and Unicode
TLK: Session security
TLK: Berkeley DB XML
TLK: Bug tracker merging
TLK: Names and numbers
TLK: PECL vs core [continued]
CVS: Norwegian docs
PAT: User streams and remote includes
REQ: New zend_class_entry callback
Wez Furlong had been building an Objective-C runtime bridge for PHP, but hit
a snag along the way. The idea was to dynamically interrogate the Objective-C
runtime and map the interfaces and classes found there into PHP. In
Objective-C, it's possible to enumerate static methods defined in interfaces
but not in classes. There is also something approaching PHP's
__call mechanism that allows classes and instances to magically
handle methods that are not explicitly defined. Wez had hoped to support the
syntax:
$NSApp = NSApplication::sharedApplication();
but had found that static methods don't appear among the enumerated list of methods at runtime, meaning that he couldn't simply add it to the PHP function table when the class is registered. To work around the problem, he'd need to do something like:
$NSApp = NSApplication::__staticInvoke('sharedApplication');
to check whether the static method sharedApplication exists and,
if so, invoke it appropriately. This isn't particularly nice syntax, so Wez
hoped to have an additional callback in the zend_class_entry
struct specifically for checking whether a static method exists. He envisaged
it operating like the existing get_method() callback but taking
the struct itself, rather than the object pointer, as a parameter. This would
enable him to use a more natural syntax in his application, and would probably
be helpful for other bridging extensions in the future.
Marcus Börger agreed this would be handy, and added that another missing
element at present is __method_exists():
__method_exists(string $name, [bool $static = false]);
Short version: Them as does it, gets it.
TLK: Changes, merges and SVN
Wez sparked by far the longest thread of the week when he picked up on last week's exchange over the spate of missing merges to CVS HEAD. He offered up some sympathy for Ilia Alshanetsky's position, and went on to introduce the idea that all changes should have a bug reference ID. This would make it possible to set up an automated review of the change/merge situation.
Pierre-Alain Joye was all for it, with the exceptions of WS (whitespace fixes), CS (coding standard adherence) and inline documentation. Sean Finney spoke for all third party distributors of PHP when he gave it his vote too, but Lukas Smith pointed out that not every PHP_5_2 patch is appropriate in CVS HEAD. There'd need to be some flag to make clear which branches of PHP need the change, to avoid false positives.
Wearing my 'archivist' hat, I could foresee problems too. Many of the unidentified changes are minor performance or security updates replicated throughout the code base. This usually means there are a lot of one-line 'catch-up' fixes applied in subsequent weeks to comply with the latest approach. Perhaps only the first of a batch of commits should need an ID - but wouldn't the need to look up that ID be enough to put most devs off making a simple one-line fix?
Wez responded. At OmniTI they've got around this problem by creating a single maintenance ticket per-milestone (i.e. at point releases) and using its ID as a catch-all reference number. Any commit that doesn't contain a reference is rejected outright by their system. It seemed to Wez that translating their approach to the PHP project would be straightforward. I disagreed, partly because it's almost completely the opposite of current practice and partly because bugs.php.net isn't a closed system. Wouldn't it make sense to have a completely separate database for maintenance tickets, and have the ID assigned and the corresponding report opened automatically during the commit process? That way, only maintenance merges and genuine bug fixes/merges would require the developer to quote a reference number at all. Wez thought not; 'it's much simpler to have one source of ticket numbers'. He'd prefer to extend the existing bugs database; having two separate databases for PECL and the PHP core is complicated enough. The existing authorization system could easily be used to prevent PHP users from originating maintenance bugs. Wez proceeded to list his recommendations:
- add a maintenance category to bug db
- add merge status to bug db
- add commit hook to detect a bug number in the commit log
- add some method of automatically commenting on bug reports with the commit message and urls to the diffs
There might eventually be a cron script to analyze tickets with missing merge activity, but this wouldn't be needed 'if we're good at respecting the merge status'. I argued that Wez' plan would lead to an overly complex procedure for simple commits, and started dreaming about the ways my own ideas could be made to work.
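The commit-hook item in Wez's list could be sketched roughly as follows. This is only an illustration: the function name `commit_reference`, the `#41516`-style reference format and the `WS:`/`CS:`/`DOC:` prefixes (standing in for Pierre-Alain's proposed exemptions) are all assumptions, not anything the list agreed on, and the type hints assume a much newer PHP than the ones under discussion.

```php
<?php
// Hypothetical sketch of a commit-log check; not an existing php.net hook.
// Assumed conventions: bug references look like "#41516", and exempt
// commits start with "WS:", "CS:" or "DOC:".

/**
 * Returns the referenced bug numbers, an empty array for exempt
 * commits, or null when the commit should be rejected.
 */
function commit_reference(string $log): ?array
{
    // Exempt categories: whitespace, coding standards, inline docs.
    if (preg_match('/^(WS|CS|DOC):/', ltrim($log))) {
        return [];
    }
    // Collect every "#<digits>" reference found in the log message.
    if (preg_match_all('/#(\d+)/', $log, $m) > 0) {
        return array_map('intval', $m[1]);
    }
    return null; // no reference found: reject the commit
}

var_dump(commit_reference('Fixed bug #41516 (fgets() length handling)'));
var_dump(commit_reference('WS: trailing whitespace'));
var_dump(commit_reference('oops, forgot a semicolon'));
```

The first call yields the bug number 41516, the second an empty array (exempt), and the third null. A real hook would also have to decide which branches the referenced ticket applies to, which is the point Lukas raised.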
Andi Gutmans woke me (and everyone else) up with a post claiming that it would be no bad thing to upgrade the php.net infrastructure. A couple of lines into his mail, it became evident that what he meant by this was "switch to SVN".
Rasmus Lerdorf, while equally keen to move to Subversion, wrote that he'd rather wait for the 1.5 release, which will offer 'a real technological incentive' for the switch. I threw up my hands in horror and wrote at length about the mayhem introducing new tools could bring to php.net. Wez was pretty much with me on this point; he'd wanted a quick fix to address the current merge situation before it becomes a real problem, not a major change in the development environment. Lukas wrote helpfully that there are 'plenty of sub-projects' that could be used as testing grounds prior to the internals move to SVN. I agreed that this would allow people to familiarize themselves with the new tool, but argued that the problems of PECL vs core extension development should be addressed before any major changes, lest the current ways become set in stone.
Stas Malyshev wrote to inform us that the difference between SVN and CVS commands is so negligible that familiarization wouldn't be an issue. He was less certain when it came to the admin side, but felt sure this would affect fewer people. Rasmus, who has attempted the conversion process in the past, wrote that CVS commit history will almost certainly be lost during the move; making the transition will be as much as a fortnight's work. William A. Rowe, having experienced migration with the ASF, agreed with both Stas and Rasmus: SVN is 'nicer' from a user perspective, but a large repository can't be expected to migrate cleanly.
Marcus Börger didn't quite see where problems might arise, but thought it would be a good idea to analyze how well SVN 1.5 will handle the workarounds that have been added to the PHP CVS repository over the years. Bringing a completely new argument to the table, Marcus added that in his opinion only those areas currently undergoing active development should be migrated, and the rest left to die a slow death in the CVS repository. Before we had a chance to recover from this, Marcus offered his coup de grâce: the PHP, PECL and PEAR projects should all share the same bugs database.
Rasmus wrote firmly that his own comments about migration had been based on actual experience, and asked whether there were any volunteers willing to try a dry run at home? Christian Schneider went into 'offering mode', writing cautiously that he has some migration experience. He asked which CVS features are currently used by php.net... just as Jani Taskinen got around to reading the thread and exploding. The only thing Jani liked about any of it was the initial, basic idea of having bug reports linked to CVS commits. Were there any material benefits in moving the entire repository to SVN? Because he'd heard none so far!
Rasmus obliged with a list of direct benefits to the PHP project (Sean Coates later added to this), but noted a few disadvantages in switching too. He cited the loss of real tagging, the loss of commit history on some files, the potential learning curve for the less technically adept project contributors, and the fact that the development process will be impacted. Andi reiterated Stas' point about most of the pain belonging to those actually responsible for the migration, and suggested that the current CVS repository might be stored on museum.php.net for reference.
Pierre opened up a discussion about the merits of Git over SVN for large projects, but Rasmus was less keen. He felt that Bazaar offered more than Git; but in either case, the majority of PHP repository users would have a steep learning curve. A big advantage of SVN, once it has support for merge tracking and cherry picking, is that 'you get much of the same benefits without turning the world upside down'. Stefan Walk added that Git would require Windows users to install cygwin, which Pierre didn't see as a problem. There was also a small warning note in that exchange: SVN, wrote Stefan, is 'a pain in the ass when it comes to merging' at present, though it shouldn't be by the time version 1.5 comes out. Pierre pointed out that the Subversion team had made similar claims for version 1.0. Christian helpfully introduced svnmerge.py, a tool that can be used to maintain SVN merges once the 'inevitable' switch to Subversion has been made.
Short version: Nobody has enough time right now.
TLK: Late static binding (again)
Stas meanwhile had read through his backlog of mail, and responded late to Ken Stanley's
query about late static binding. He explained that the idea of having
dynamic scope that refers to the class named during the static call is a
recurring one. The problem is that the Zend Engine currently does not
preserve that class scope, meaning that it's not accessible inside the static
call. The name Ken had suggested, child::, made no sense, since
an object can have many children; the same fact gets in the way of a unique
link from parent to child.
Richard Lynch made much the same point about multiple children, explaining
that PHP wouldn't know which child:: to call. Jochem Maas,
though, felt that the problem was simply down to the name, and something like
super:: would be better. The concept itself was simple:
<?php
Bart de Boer liked it, but noted that you could just do:
$tableName = $this->getTableName();
from within the base class without changing PHP at all. That said, he'd still
like to be able to use $className::getTableName().
Richard liked it less, and confessed to having 'no idea what the heck is
going on' in Jochem's code snippet. He didn't like super::
as a keyword either, preferring static::. Bart didn't like that
idea, noting it implies that every other keyword followed by ::
is not static. Perhaps it would be better to drop the whole keyword
thing and simply allow objects to access their own static members in the same
way as all the others?
class Base {
It seemed to Bart that all the Engine would need to do to achieve this would
be create an object member of the same name that references the class static
variable... Ken Stanley wrote in to say that, on reflection, he agreed with
the other posters over static:: being the best keyword
candidate. There definitely would need to be a keyword though; the whole
point of the static model is the ability to access class members without
having to instantiate the class. Bart argued that if there were no class
instance PHP wouldn't know which child class to target, and Ken realized he
was probably right. That said, surely if there needed to be a class instance
the whole point of having late static binding in the first place would be
lost.
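The scope problem Stas describes can be seen in a small sketch. `Base` and `Article` are hypothetical classes, loosely following the `getTableName()` example in the thread. The `describeLate()` method assumes PHP 5.3 or later, where `static::` was eventually adopted as the keyword; it would not have run on the engine being discussed here.

```php
<?php
// Illustrative only: Base and Article are hypothetical classes.
class Base
{
    public static function getTableName()
    {
        return 'base';
    }

    public static function describeEarly()
    {
        // self:: is bound to the class where this method is defined (Base),
        // regardless of which class name was used in the static call.
        return self::getTableName();
    }

    public static function describeLate()
    {
        // static:: refers to the class named in the call (PHP 5.3+ only).
        return static::getTableName();
    }
}

class Article extends Base
{
    public static function getTableName()
    {
        return 'article';
    }
}

echo Article::describeEarly(), "\n"; // prints "base"
echo Article::describeLate(), "\n";  // prints "article"
```

No instance is created in either call, which is the property Ken wanted to keep.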
Short version: Eh.
REQ: Test merge
Pierre came up with an innovative request. He'd been working on support for external GD libraries in CVS HEAD, and reported that this now passes almost all the relevant tests in the PHP test suite. As a result of his work, users will have the option to build PHP 6 with support for an external GD library with all features enabled, such as imagefilter and antialiasing. The innovative part, though, came in Pierre's request to merge those test changes back to the PHP_5_2 branch. As he wrote, 'It makes my life easier to maintain both libraries and keep them 100% compatible'.
Ilia was all for it, asking only that Pierre hold back his change until PHP 5.2.3 is released.
Short version: Developing in HEAD and merging to branches - whatever next?
TLK: MTOM support
Someone named Gal wrote to internals@ to enquire about PHP's support for moving large amounts of XML data around the Internet. More specifically, he wanted to know whether there are any standard extensions that allow a large amount of data to be shifted via a SOAP response, and whether there are any plans to support MTOM in PHP.
Andi responded with the news that 'the guys from WSO2' are working on WS-* support for PHP, in the form of the Axis2 extension currently in PECL. He believed they were also attempting to make their API compatible with the SOAP extension. Pierre updated the news; project development for Axis2 has now moved to http://wso2.org, which hosts all related documentation and sources.
Short version: It seems the Axis2 folks prefer SVN.
TLK: Bitwise operations and Unicode
Tony Dovgal wondered whether the team thought the Zend Engine should support bitwise operators and Unicode strings and, if so, how they believed it should work? To illustrate the current situation, Tony gave an example:
$a = "1";
Under CVS HEAD, this scriptlet outputs 3 in native mode
but throws a fatal error (Unsupported operand types) in Unicode
mode. Tony saw this behaviour as inconsistent; it should just work. There
were two possible implementations, in his view; either apply the operator to
each element of the Unicode string separately, or convert the string to
binary before doing so. Of course, there were also the options of allowing it
to fail, or of dropping native string support, but neither seemed useful to
him.
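Tony's snippet is cut short above, so the following is only a sketch of the native-mode behaviour he describes, not a reconstruction of his code. The second operand (`"2"`) and the longer strings are assumptions chosen to reproduce the output of 3 and to show where string and integer operands part ways.

```php
<?php
// Sketch of native (non-Unicode) string behaviour; operand values are assumptions.
$a = "1";
$b = "2";

// Both operands are strings, so | works byte by byte on the character
// codes: 0x31 | 0x32 = 0x33, which is the character "3".
var_dump($a | $b);             // string(1) "3"

// Casting first gives ordinary integer arithmetic: 1 | 2 = 3.
var_dump((int) $a | (int) $b); // int(3)

// The two approaches diverge for longer numeric strings.
var_dump("12" | "3");          // string(2) "32"
var_dump(12 | 3);              // int(15)
```

The last two lines show why the choice of implementation matters: a byte-wise operation and an integer operation only agree by coincidence.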
Pierre failed to see any point in offering support for bitwise operations with Unicode strings in the first place, and wrote that there should be a cast to integer prior to any non-string operations. He saw the current behaviour as correct, and would be happy to drop the native support too for the sake of consistency. Marcus Börger grumbled that it was 'far from KISS' to either have no types or strict typing depending on the operator, but Tony pointed out that he was 7 years late with that complaint. An alarmed Richard Quadling checked that PHP 6 will retain loose typing. Tony reassured him; it's just that bitwise operators aren't, and never were, loosely typed.
Richard Lynch believed there are probably 'a bazillion PHP scripts written by newbies' that use bitwise operators on strings, and regarded it as 'crucial' to Unicode adoption that type juggling continues to work as of yore. He wondered how Unicode strings generally are type-juggled when they are used as integers, regardless of the operation? Tony explained again that numeric strings behave differently with bitwise operations, and put his view that very few PHP scripts are likely to rely on bitwise operations with strings due to this behaviour. Jared Williams came out of nowhere to produce an example from the PEAR repository that does precisely that.
Andi, when he caught up with his mail, wrote simply that the option of binary conversion would retain BC, and there seemed very little point in breaking it.
Short version: No change.
TLK: Session security
PHP user Stut wrote to the internals list in search of an opinion over a discussion on php-general@. Why do PHP sessions not use the user agent to validate a session ID?
Rasmus pointed out briefly that anyone in the business of session hijacking
would be more than able to hijack a user agent string. Another user, Xing
Xing, pointed out that some HTTP proxies modify the user agent. He saw
session validation as the responsibility of the script, and suggested hashing
the user agent with a secret key to prevent forgery. Robert Cummings agreed,
but suggested that support for the user agent approach might offer 'a
teensy bit more protection from casual hijacking' - as, for example, when
a posted URL includes a PHPSESSID.
Rasmus commented that 'the session store is just a session store'; a generated session cookie shouldn't be used for authentication in the first place, and there should be a separate cookie for that task. Stas promptly argued that if it's possible to steal a session cookie it's also possible to steal any other cookie that happens to be lying around. There is no way to securely identify the client, other than by external means such as an SSL client certificate. After a bit of a circular discussion, Rasmus wrote that session authentication should be left to the application. IP checking might be appropriate where the user base aren't on dynamic connections; something like the user agent might help in other cases, or 'an RSA token plugin thing'. The right answer is different for everyone, which is why it shouldn't be hard-coded into PHP.
Christian Schneider suggested adding a session variable that contains client metadata such as the user agent and IP, and checking it during the application startup code. He agreed that this shouldn't be done by PHP itself, but was fairly sure something in PEAR can help PHP users with this. Lukas confirmed that PEAR::LiveUser supports it; he was less certain about PEAR::Auth.
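An application-level check along the lines Xing Xing and Christian suggest might look like the sketch below. The helper names, the `fingerprint` session key and the way the secret is passed are all assumptions made for illustration; this is not what PEAR::LiveUser does internally. As Rasmus noted, a user agent is easy to forge, so this guards against casual hijacking at best.

```php
<?php
// Hypothetical helpers; the names and the $secret handling are assumptions.

/** Build a keyed hash of client metadata (here just the user agent). */
function client_fingerprint(string $userAgent, string $secret): string
{
    return hash_hmac('sha256', $userAgent, $secret);
}

/** Store the fingerprint on first use, compare it on later requests. */
function session_matches_client(array &$session, string $userAgent, string $secret): bool
{
    $current = client_fingerprint($userAgent, $secret);
    if (!isset($session['fingerprint'])) {
        $session['fingerprint'] = $current;
        return true;
    }
    // hash_equals() compares the two hashes in constant time.
    return hash_equals($session['fingerprint'], $current);
}

// In application startup code this would be called with $_SESSION and
// $_SERVER['HTTP_USER_AGENT']; a plain array keeps the sketch standalone.
$session = [];
var_dump(session_matches_client($session, 'Mozilla/5.0', 's3cret')); // bool(true)
var_dump(session_matches_client($session, 'Mozilla/5.0', 's3cret')); // bool(true)
var_dump(session_matches_client($session, 'curl/7.0', 's3cret'));    // bool(false)
```

An IP address could be mixed into the hashed string in the same way where the user base is not on dynamic connections.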
Short version: Each to his own.
TLK: Berkeley DB XML
Arnold Daniels wrote to internals@ asking why the PHP extension offering support for Berkeley DB XML isn't available through php.net? Ilia didn't think anyone had written one, but Wez owned up; he and George Schlossnagle have, and it's available through Sleepycat. Marcus wrote that there was a licensing issue, and went on to speculate on the reasons Oracle (who now own Sleepycat) hadn't wanted the extension to be made available via PECL. Wez refused to comment on the reasons, pointing out that he didn't actually know them; the point was that Sleepycat have the extension, and if anyone wanted it in PECL they should ask there. Andi offered an option: DB2 Express-C has XML support, is free, and is supported by the ibm_db2 extension in PECL. He commented that XML DBs are useful for those having to deal with large sets of XML data, but should be viewed as complementary to RDBMS rather than as a replacement for them.
Short version (thanks Andi): Some useful links on DB2 Express-C are available here.
TLK: Bug tracker merging
Following on from the earlier discussion, Greg Beaver wrote that he had some
qualms over the 'headlong rush' to refactor the PHP bug tracking
system to include PEAR and PECL bug reports. His main point was that the
current PEAR bug tracking system is far superior to that used for the PHP
core; he didn't want to lose it. OK so currently it has a dependency on the
pearweb core, but the PEAR web team are working on
making it independent. I wrote a 'DON'T PANIC' post to explain that there
isn't actually any consensus to add PEAR bugs into the mix; the problems are
with PECL, not PEAR. I added that the whole discussion had come about in the
first place because of the difficulties of merging from the 5_2 branch into
CVS HEAD; the code can be very different, and merging isn't always a
straightforward process. Greg was in favour of moving the PECL bugs over to
bugs.php.net, and also agreed - coming
from his recent experience with the 'uber-simple'
__HALT_COMPILER() patch - that merging to HEAD is problematic.
Jani hijacked the thread to update the rest of us over the subsequent IRC discussion. He had looked into the PEAR version of the bug tracker and agreed with Greg that this was 'the best starting point for a unified code base'. Greg had already created a new CVS module, pear/Bugtracker, from that code, and this would be the base for 'the übertracker', as Jani termed it. Since the PEAR and PECL trackers already share the concept of 'package/type', Jani intended to add support for this (phase I) and 'project' to the PHP tracker before merging the databases (phase II). Ultimately, http://bugs.php.net will be the only bug tracker for all php.net projects. Adding a new project to it will be straightforward. In a postscript, Jani made it plain that help was not required and - characteristically - issues such as coding style and whether or not PEAR classes should be used are not open to discussion.
I wondered whether this would do anything to help the merge issue, but Jani pointed out that he wasn't focused on that right now. That said, there are features in the PEAR tracker that might be useful; he just wasn't familiar enough with them to know what the patch tracker and roadmaps were capable of doing.
Short version: Never fear - Jani's in control.
TLK: Names and numbers
Stas had noticed that some build targets and file names in CVS HEAD still have "php5" in the name, and asked whether this shouldn't be "php6" throughout. Jani promptly suggested moving the version number out of the equation altogether, but Tony pointed out that it isn't possible everywhere; there needs to be a version number in libphp6.so, for example. Jani asked why this was necessary, given that PHP 6 is a totally new major version? Tony suggested that some users might want to have both libphp5.so and libphp6.so installed, but only one of them enabled. Jani mentioned the concept of renaming libraries, and pointed out that it's not possible to have two PHP versions enabled in the same Apache instance any more anyway.
Lukas wondered why users should have this option only for major PHP versions, and Stefan Walk suggested using a constant to represent the version throughout the source to satisfy everybody's aims. Tony retorted that this is the way it's always been. Although he agreed it wasn't a good idea to have the version number as part of the file name, changing a long-established behaviour on a whim isn't a good idea either - 'even if it looks stupid'. Jani pointed out that this breaks nothing, since libphp4.so and libphp5.so wouldn't be affected. Tony argued, rather weakly, that the World Order ™ would. Stas pointed out that it's quite common to have the major version number as part of a Unix shared library name. Still, he added, Jani was probably correct in saying there's no need for source and building system files to do likewise - unless there were some limitation imposed by Unix build tools. He didn't believe there was any such limitation under Windows.
Richard Lynch wrote that he'd rather not have his existing libphp*.so overwritten when installing a new version, thank you, but Stas pointed out that this situation already exists when updating minor PHP versions. Richard also recalled a time when it was possible to run two different PHP module versions simultaneously, which had made it important to have different names for them. Stas wasn't sure that had ever worked correctly, but pointed out that you can run as many PHP versions together as you like using FastCGI.
Arnold Daniels wondered why the PHP Apache module isn't appended with the
same suffix as the rest of the binaries, and suggested a
--program-suffix configuration option. Jani pointed out that it
would need to be --library-suffix, since the module isn't the
only binary built, and added that patches are welcome in this case.
Short version: Generated binary names that include the version number seem a likely outcome.
TLK: PECL vs core [continued]
Windows user Richard Quadling pounced on the discussion about the lack of synchronicity between extensions in the PHP core and their PECL equivalents. He wrote that his understanding was the snaps site should provide the most recent successful PHP compilation, with a few extensions built-in and a few more alongside. The pecl4win site, on the other hand, should offer binaries for those extensions that are not part of the standard package. Wez agreed with this definition, adding that the pecl4win binaries are generated on the same box as the official Windows releases.
Gaetano Giunta asked whether anyone else thought using pecl4win to distribute binaries corresponding to the official PECL releases, as well as to snapshots, would be a good idea. Pierre did; he wrote that he is 'not a fan of using snapshots on production servers'. Caroline Maynard also agreed; her IBM team have taken to developing their extension in a branch and merging their changes to HEAD at each PHP release to get around the issues.
Lukas felt the thread was wandering off track, and returned to the subject of Philip Olson's original post. What, precisely, is the relationship between PECL and the PHP core, particularly when it comes to PECL packages that are adopted as part of the core? Are core packages removed from PECL, for instance? Should the team start looking at PHP releases as if they were a Linux distro, i.e. take the current kernel and all relevant stable PECL releases, QA the whole thing and then release? How far should the team support users running older minor versions of PHP? Should there be independent releases of core extensions? Lukas also felt that ensuring API versions are updated is extremely important, and should be taken care of. Is there no common standard when it comes to updating the API version? Are tools needed to ensure compliance? In Lukas' eyes, the time for discussions about the ways in which PECL and the PHP core are related is now.
Short version: Discussions about the relationship between PECL and core are long overdue.
CVS: Norwegian docs
Changes in CVS that you should probably be aware of include:
Prior to the PHP 5.2.3 release:
- Core bug #41516 (fgets() returns a line of text when length parameter is <= 0) was fixed in CVS HEAD only [Tony]
- The current module is now set in internal_function->module [Tony]
Following the PHP 5.2.3 release:
- Apache 2 bug #39330 (apache2handler does not call shutdown actions before apache child dies) was fixed [Tony]
- Core bug #41518 (file_exists() warns of open_basedir restriction on non-existent file) was fixed [Tony]
- Configuration bugs #41555 (configure failure: regression caused by fix for #41265) and #41576 (misbehaviour when using --without-apxs) were fixed [Jani]
- printf %u no longer truncates numbers at 32 bits [Brian Shire]
- In ext/gd, several version-related constants were added: GD_MAJOR_VERSION, GD_MINOR_VERSION, GD_RELEASE_VERSION, GD_EXTRA_VERSION and GD_VERSION_STRING [Pierre]
- In ext/zip, PECL bug #11216 (addEmptyDir crashes if the directory already exists) was fixed [Pierre]
In other CVS news, PHP manual editor Philip Olson opened up a new directory, phpdoc-no, for the Norwegian translation of the manual and gave one Knut Urdalen karma to it.
Meanwhile Ilia was kept busy for a day or two, merging everything he'd ever committed into 'PHP_5_2 branch only' to CVS HEAD. In an off-list exchange, Ilia explained that this was a clean-up operation. He intends to post all future unmerged patches in a public space, thereby allowing others to deal with them more easily.
Short version: A simple solution to the problem of unmerged patches was found.
PAT: User streams and remote includes
In response to Greg's RFC a couple of weeks back, Stas posted a patch for review that will restrict user streams from executing dangerous operations within the include context. The patch was against CVS HEAD and did not include changing the names of the INI/structure fields, although he wrote that this might also be a good idea.
Marcus asked why the INI options couldn't be broadened to six settings:
allow_url_fopen_local, allow_url_fopen_user,
allow_url_fopen_remote and the same three ranges for
allow_url_include_*. Making the _remote settings an
alias would retain full back compatibility. Stas saw no need for so many
settings. Besides, why would anybody want to prohibit local file access and
includes? The whole point was to make user streams behave more like built-in
streams, while ensuring that random errors in user implementations couldn't
allow the inclusion of remote code.
François Laupretre wondered if the Boolean argument Stas had added to
stream_wrapper_register() could be altered to an integer,
allowing a set of enumerated values that would make it straightforward to
extend stream options. Stas didn't see a problem with that, but didn't see an
immediate use for it either.
Etienne Kneuss - backed by Marcus - believed he'd come across a bug in the way certain callbacks are handled. For example, in
array('MyClass', 'parent::who');
the class name would only be used for inheritance checks within the current
scope, but the actual resolution would be determined using the current scope
and 'parent::who'. Etienne posted patches against the PHP_5_2
branch and CVS HEAD
for consideration, but Stas wasn't even sure such a thing should be supported.
He wrote that he'd expect that code to call a MyClass method
named 'parent::who', which would fail because there is no way to
define such a method name. Was there a reason somebody might need this?
Further, part of the patch altered an executor globals value
(EG(scope)) even though no actual call is made; Stas thought a
value that depends on the context should be stored in a variable instead.
Etienne pointed out that the code already is supported, if buggy; he
was simply trying to fix it for odd cases. He'd altered
EG(scope) to temporarily store the value because
zend_u_fetch_class() in CVS HEAD uses it to get the current
class, and he'd assumed that zend_lookup_class() in PHP_5_2 does
the same. Having now realized that it doesn't, Etienne provided an updated
version of the 5_2 patch that directly uses
calling_scope/ce_org rather than the current scope.
Stas remained unconvinced; he thought the sample code shouldn't work in the
first place, and that having it do anything other than error out would lead
to attempts to support
array('MyClass', 'AnotherClass::foo')
- 'and that'd be a real mess'.
Marcus wrote that the reason the sample code doesn't error out is that
array($obj, 'parent::func')
needs to work. Stas wondered why... Mike Wallner added that you might want to use:
((ParentClass) $child)->virtualMethod();
if it were possible. A bemused Stas asked why Mike didn't simply call the child's method and allow it to pass control to the parent as needed. Jochem Maas sympathized with Stas; 'it smells like bad OO'.
Scott MacVicar provided a TSRM patch to fix the problems with
lstat() and symbolic link resolution reported here last week, and
Tony subsequently applied it in CVS HEAD and the PHP_5_2 branch.
And finally, Gaetano Giunta posted his revision of the build process used by pecl4win, again with no on-list response.
Short version: The lstat()/symlink problems are fixed; Stas' user streams/allow_url_include=off patch is in PAT.

Comments
This Bazaar document makes for good reading:
http://bazaar-vcs.org/BzrForCVSUsers
And Linus Torvalds' Git presentation at a Google Tech Talk is good viewing:
http://www.youtube.com/watch?v=4XpnKHJAok8
SVN 1.5 will include the changelists feature: changesets based on file paths (not content):
http://blogs.open.collab.net/svn/2007/07/one-quality-of-.html
cvs2svn - CVS to Subversion conversion
http://cvs2svn.tigris.org/
SVN Importer - Migrate to SVN from CVS
http://www.polarion.org/index.php?page=overview&project=svnimporter
SVK has some advantages over SVN (including star merge) and a CVS mirror tool: MirrorVCP
http://svk.elixus.org/view/MirrorVCP
Bazaar has the cvsps-import plugin
http://bazaar-vcs.org/CVSPSImport
I agree that Bazaar is the technological leader: real tags and true renaming, excellent merge (even of file renames), distributed/centralized/mixed working possible.
It's no big difference to CVS on the command line, but although many other clients are under development, the availability of excellent clients is better for SVN.
http://bazaar-vcs.org/3rdPartyTools
There's a lot wrong with all the alternatives.