Zend Weekly Summaries Issue #248

August 8, 2005

TLK: libxml2 errors
TLK: PHP-GTK corner
TLK: Property overloading RFC
TLK: Moving extensions to PECL
CfP: International Open Source Database Conference
TLK: CVS vs SVN
TLK: Streams, URI handling and XML
CVS: allow_url_fopen and SOAP
PAT: More namespace stuff


TLK: libxml2 errors

Ron Korving opened this week’s mail by asking whether he’d stumbled across a bug;
one of the three warnings he was seeing over a non-existent XSD file had a
urlencoded file path. Dmitry Stogov agreed that it could well be a bug, and
suggested Ron post a proper bug report for his attention, which Ron did. Wez Furlong
pointed out that the file name contained spaces, which are illegal in URLs, so
encoding them seemed sane to him – but he’d let the XML people decide whether it was
actually a bug or not. Ron explained that his XSD file wasn’t a remote source…

XML person Rob Richards confirmed that the URL was encoded internally – for a
warning thrown by libxml, which is not under the PHP development team’s control. He
suggested that Ron could either suppress the errors on a per-function basis or use
the new error handling for XML in PHP 5.1, which would allow errors to be retrieved
following the function call rather than displayed directly.

Ron felt that three warnings for one bad statement was over the top, and asked
whether there couldn’t just be one. The other XML person, Christian ‘Chregu’
Stocker, took a look at Ron’s bug report and explained that all three warnings
originated in libxml2:

We would have to start intercepting those warnings and decide which
one we want to show and which not... Too much hassle IMHO. In your
scenario, which should be shown?

Chregu pointed Ron to some background information
(http://www.php.net/libxml_use_internal_errors) on libxml’s error handling,
including one of his own conference slides
(http://php5.bitflux.org/xml5_1/slide_13.php) containing PHP 5.1 examples. PHP
Documentation Group member Nuno Lopes offered a further example, but noted that
libxml always returns 0 for the column number in the sample errors returned by
print_r(libxml_get_errors()). Was this a bug?

Rob explained that ‘column numbers in errors are not yet implemented
everywhere in libxml and 0 is often returned’. Nuno promptly added this helpful
piece of information to the PHP manual.
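
As a rough illustration of the PHP 5.1 error handling discussed in this thread, the sketch below buffers libxml's errors instead of letting them display, then inspects them after the call. The helper name collect_xml_errors() and the sample XML string are invented for this example; libxml_use_internal_errors(), libxml_get_errors() and libxml_clear_errors() are the documented functions.

```php
<?php
// Hypothetical helper (not from the mailing list thread): parse an XML
// string and return libxml's errors instead of letting PHP display them.
function collect_xml_errors($xml)
{
    // Switch libxml to internal error buffering; remember the old setting.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadXML($xml);

    // Each entry is a LibXMLError object with level, code, column,
    // message, file and line properties.
    $errors = libxml_get_errors();

    // Empty the buffer and restore the previous behaviour.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $errors;
}

// Deliberately malformed input, so that at least one error is recorded.
$errors = collect_xml_errors('<root><unclosed></root>');
foreach ($errors as $error) {
    // The column may well be reported as 0, as Rob explains above.
    printf("line %d, column %d: %s\n", $error->line, $error->column, trim($error->message));
}
```

With the errors in an array, a script can choose to report only the first one, which is roughly what Ron was asking for.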

Short version: You can control your libxml error display – the dev team can’t.


TLK: PHP-GTK corner

Andrei Zmievski started a debate over the depth of property display that should
be expected from var_dump() on GTK+ objects:

If you run var_dump() on a realized widget right now, you will
get a few thousand lines of output, owing to the deeply nested structure
of widgets and especially the multitude of properties that the GtkStyle
object has. This is even without exposing GdkGC properties yet.

He offered two options: expose everything, or expose everything except
GtkStyle, whose properties would remain accessible via other
means.

Scott Mattocks voted for total exposure, suggesting that someone might want to
run foreach() across the style object for some reason. I argued that
GtkStyle isn’t even cast as an object in GTK 1.3 (I didn’t have GTK 2
handy at the time) and that get_object_vars() and
var_dump() shouldn’t be expected to return the properties of a
third-party struct. Reflection fan Christian Weiske voted with Scott, saying there
are cases in which all sub-elements of an object or widget should be reflected. Rob
Richards, unusually, intervened with ‘don’t do it’. He advised that his
experience with the DOM extension had been that nesting goes too deep; besides, many
of the properties there are recursive. Returning thousands of lines makes debugging
harder, not easier.

Christian asked whether there was a way to determine which level we were at and
limit the depth that way. I said not, adding ‘Dmitry killed recursion now’
but reiterating my point about third-party structs. Andrei came back into the
discussion to point out that actually GtkStyle is a real object
nowadays, but he agreed there is no way to limit the depth of
var_dump() operations from within PHP-GTK. Rob, on being pressed,
repeated his opinion that having too much information was worse than having none,
but added that in ext/dom he’d gone for consistency – all property info there
has to be retrieved in some other way than var_dump().

Meanwhile I put my development environment together (thanks Edin), finally
downloaded GTK 2 (Andrei was right about GtkStyle) and started work on
the PHP-GTK 2 win32 port, with some background help from the ever-patient Rob, and
later from Wez Furlong. Most of the rest of the week’s list mail was taken up by my
‘wtf?’ and ‘aha!’ moments as a result.

Short version: Optimism prevails.


TLK: Property overloading RFC

Derick Rethans posted an RFC (http://www.zend.com/lists/php-dev/200508/msg00017.html) asking for
feedback regarding changes he’d like to see in the way the property overloading
methods __get() and __set() are implemented. The problems
he specifically wanted to address were:

  1. No way to document ‘virtual’ properties
  2. No clean way to check for the existence of a ‘virtual’ property
  3. No precision in the error data where a property does not exist

The solution he offered involved introducing a keyword to define a property as
virtual (e.g. virtual), which would resolve issue 1. There was an
existing patch by Marcus Börger, which used the abstract keyword for
this purpose. Alongside the keyword, there would need to be an easy way to check
whether a passed property name had been declared as virtual, allowing
something like:


<?php

class Base {
    // Proposed syntax, not valid PHP: declares $x as a 'virtual' property
    abstract public $x = 1;

    function __get($name) {
        if (!self::isVirtual($name)) {
            /* throw error */
        }
    }
}

$b = new Base();
echo $b->foo;

?>

The third problem could be resolved by introducing an optional by-reference
parameter to __get() and __set(), allowing
&$error to be passed. If the parameter were used and returned
FALSE, the error thrown by the Zend Engine would report the correct
file/line combination. Another option would be to use a different
__get() and __set() function for such properties.

Christian Schneider objected that checking ‘virtual’ properties would make PHP
less dynamic, and pointed to class inheritance. Would the extending class need to
know about the get/set methods and attribute declarations?
He added that, in his opinion, the currently available mechanism of checking the
property name against an array in the object is sufficient, and there was no point
in adding a language feature that would add complexity with so little gain.
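
For comparison, the existing mechanism Christian refers to might look something like the sketch below, which checks each requested name against an array held in the object. The class name Record, its keys and the error wording are all invented for illustration.

```php
<?php
// Hypothetical class: the names and values are invented for this example.
class Record
{
    // The only 'virtual' properties this object supports.
    private $data = array('title' => 'Untitled', 'author' => null);

    public function __get($name)
    {
        // array_key_exists() rather than isset(), so that properties
        // holding NULL still count as existing.
        if (!array_key_exists($name, $this->data)) {
            trigger_error("Undefined virtual property: $name", E_USER_NOTICE);
            return null;
        }
        return $this->data[$name];
    }

    public function __set($name, $value)
    {
        if (!array_key_exists($name, $this->data)) {
            trigger_error("Undefined virtual property: $name", E_USER_NOTICE);
            return;
        }
        $this->data[$name] = $value;
    }
}

$record = new Record();
$record->title = 'Weekly Summary';
echo $record->title, "\n";
```

A notice raised this way points at the trigger_error() line inside __get() or __set() rather than at the calling code, which is the third problem on Derick's list.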

James Crumpton had concerns over the self::isVirtual() syntax,
wondering whether ‘virtual’ members would act like static members, in that
inheriting classes wouldn’t be able to access ‘virtual’ base class members.

Greg Beaver went straight to the theme of documentation, recommending heartily
that virtual properties should be documented inside the docblock for
__get() and __set(). He added, ‘most implementations I
have seen of __get()/__set() use them because it isn’t possible to know
the names of the variables in advance’, making the keyword proposal unusable.
That said, he suggested


<?php

class Base {
    // Proposed syntax, not valid PHP
    virtual public $x = 1;

    function __get($name) {
        if (!$this->isVirtual($name)) {
            /* throw error */
        }
    }
}

?>

would be less confusing to users than Derick’s initial example;
virtual could not be confused with abstract classes, and the use of
$this-> rather than self:: would allow inheritance
without redefining __get() and __set(). Finally, Greg
suggested that the proposed error parameter should be allowed to be set to
FALSE or an exception object, allowing users to customize the error
message.

Marcus took issue with Greg’s anti-keyword argument, asking ‘How about lazy
initialization? You cannot have a declared property for those at the moment’. He
also didn’t see the need for a brand new keyword to be introduced;
abstract should be sufficient. Something like
property_is_virtual($this, $name) would be preferable to
$this->isVirtual($name), and (finally) he’d already suggested
exceptions should be used here, but there were several strong arguments against
it.

Andrei pointed out to Derick that the whole point of ‘virtual’ properties was
that their names can be parameterized. How could you declare, say, 1000 properties
coming in from a database, as virtual? He suggested implementing a
__have_prop() method to test whether a given virtual property exists.
Derick replied that the declaration was supposed to be optional, but that not
declaring the properties as virtual would preclude the introspective
__have_prop(), which in itself wasn’t a bad idea so long as it could be
called both statically and dynamically.

Lukas Smith wondered whether it wouldn’t be better to be able to force
__set() and __get() to be triggered for all properties,
which Derick was firmly against; not only would it break BC, it would also not
solve the error data issue. Lukas retorted that ‘be able to’ was not the same as
‘always behave this way’; he’d been thinking of flagging an object to behave in this
way. ‘Heck this flag could be “allow now virtual properties, and trigger
__set()/__get() always and throw the proper error otherwise”.’

Smarty developer Boots asked whether it would be reasonable to allow
__get() and __set() to be called without any parameters,
returning an array or object containing the properties supported by them? He added
that it would be nice if there were a way for the property overloading mechanism to
support references… Marcus replied that this would be too complicated, and would
also mean having one thing having two completely different purposes, which confuses
everyone. Boots, not a man to give up easily, asked how his suggestion was more
complicated than adding new keywords or property mechanisms? All he wanted to see
was an optional parameter allowing __get() and __set() to
be self-describing.

Stas Malyshev didn’t see the point of the entire discussion:

I must be missing something, because I don't understand one simple
thing: if you need a mechanism that would allow your classes to know
which properties they have, and you need custom logic to decide which
properties exist, why don't you write an interface that has a
WhatAreMyProperties() method, implement it in your classes and use it?
Why bother with RFCs and all the __magic stuff?
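
A minimal sketch of what Stas describes, using nothing beyond ordinary interfaces, could look like this. Only the method name WhatAreMyProperties() comes from his mail; the interface name PropertyAware, the class Row and its data are assumptions made for the example.

```php
<?php
// Hypothetical interface following Stas's suggestion; only the method
// name WhatAreMyProperties() comes from his mail.
interface PropertyAware
{
    public function WhatAreMyProperties();
}

class Row implements PropertyAware
{
    private $columns;

    public function __construct(array $columns)
    {
        $this->columns = $columns;
    }

    // Custom logic decides which properties exist; here, the column names.
    public function WhatAreMyProperties()
    {
        return array_keys($this->columns);
    }

    public function __get($name)
    {
        // Unknown names simply yield NULL in this sketch.
        return in_array($name, $this->WhatAreMyProperties(), true)
            ? $this->columns[$name]
            : null;
    }
}

$row = new Row(array('id' => 7, 'name' => 'dbx'));
print_r($row->WhatAreMyProperties());
```

Because the list is built at run time, this approach also copes with the case Andrei raises, where the property names come from a database and cannot be declared in advance.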

Ilia Alshanetsky agreed with Greg’s earlier diagnosis; allowed property
values should be documented within the comment block. He also stated that the whole
concept behind the feature of ‘virtual’ properties was that the values weren’t known
in advance; if they were known, you might as well use normally declared
property and method names. As to the third optional argument, wrote Ilia, this
should be an array of acceptable values for the name parameter,
throwing a standard warning message where there was a mismatch. Derick argued that
the doc block approach would only work if __set() and
__get() were also called for declared properties; furthermore, the Zend
Engine, which actually calls the property overloading methods, wouldn’t have
knowledge of the contents of the suggested list, hence the need for a keyword.

Short version: Complicated stuff.


TLK: Moving extensions to PECL

Ext/dbx maintainer Marc Boeren noted that his extension had been moved to
PECL recently, ‘without any discussion, but that’s not the point’, but that
there was no package on pecl.php.net for it.
Shouldn’t it be policy that the necessary PECL packaging was required prior to
a CVS source move?

Wez noted that the person doing the moving was supposed to coordinate their
efforts with the maintainer to set this up, and asked Marc whether he’d mind
creating the package.xml file and tarball and releasing the extension via
PECL. Marc didn’t mind, but was short of both time and documentation for the
packaging process. He also didn’t know whether he had CVS karma for
pecl/dbx.

Jani Taskinen, who did the moving, noted that the extension had only been moved
from PHP 5.1-dev, and no released versions of PHP were affected. He also pointed out
that there was in fact a package.xml file in place already; Hartmut
Holzgraefe added these for all existing PHP 5 core extensions last year. CVS access
should be provided by Derick.

Marc felt that there was some work to be done in making the transition to PECL
easier for maintainers. For example, the extension was not in the package list
there, and the docs didn’t mention it was now available on PECL. What was needed to
add pecl/dbx to the package list?

Lukas explained the process:

  1. It needs to become registered as a package: http://pecl.php.net/package-new.php
  2. In order to be downloadable it needs a release: http://pecl.php.net/release-upload.php
  3. For both these steps you need the necessary PECL Karma, for which you can
    apply with this form:
    http://pecl.php.net/account-request.php

but Marc pointed out that he’d quite like to keep his existing php.net username,
and asked Wez for karma accordingly – just as Derick granted it to him.

Short version: Oops.


CfP: International Open Source Database Conference

Software & Support Verlag, the producer of internationally renowned
conferences such as JAX, International PHP Conference, ApacheCon Europe and others,
announces a new conference for the international Free Software/Open Source
community:

Open Source Database Conference 05
November 7 to 9, 2005
Frankfurt, Germany

It is our pleasure to invite you to become part of this new conference
by sending you the official Call for Papers. Your submissions would
be very much appreciated. Please find further information on the
conference and on the submission proceeding below.

++++++++++++++++++++++++++++++++++++++

THE CONFERENCE AT A GLANCE

* Open Source Database Conference 05: November 7 to 9, 2005
* Event location: NH-Hotel Frankfurt-Mörfelden
* Main conference: November 8 and 9, 2005
* Power workshops (tutorials): November 7, 2005

* Duration of a session: 75 minutes
* Duration of a power workshop (whole day): ~ 6 hours
* Duration of a power workshop (half-day): ~ 3 hours

* Entry form: http://input.entwickler.com
* Closing date: August 19, 2005

++++++++++++++++++++++++++++++++++++++

The main conference is divided into the following tracks:

- Database Fundamentals
- Database Development
- Database Administration
- Business Intelligence
- Free Software/Open Source Database Business

Conference topics to be covered are:

- Database Administration
- Migration to Free Software/Open Source Databases
- Performance tuning and optimization
- APIs/Connectors
- New Technologies
- Lowering TCO with Free Software and Open Source RDBMS
- Case studies
- Community-related topics

Languages, technologies: all (Java, PHP...)
Free Software/Open Source

We are looking forward to your submission and wish you all the best!

Frank Stepan
Software & Support Verlag

Short version: Thanks to Georg Richter for forwarding this to internals@.


TLK: CVS vs SVN

Pasha Zubkov wondered why his deleted ZendEngine2 directory wasn’t restored from
the repository when he ran cvs up -APd. How could he update his
copy?

Jani explained that buildconf handles the Zend directory version automagically
and it shouldn’t be removed, but Pasha argued that it should still be possible to
update the source to the repository status. Derick took the time to explain that
repository links weren’t resolved with cvs updates, but only on checkout.

At which point the SVN argument came up again, with Derick asking Pasha ‘How
do you want to convert 1100 users instantly without causing disruptions?’ and
Jochem Maas muttering ‘magicwand.php’ in the background. Andrey Hristov
pointed out that CVS works, and the team is happy with it so far, but Pasha disputed
that statement, saying that the Zend directory issue proved otherwise. SVN had a
better version control mechanism; if the PHP project was ever going to ‘update’ to
SVN, why not do it now?

Derick pointed out that SVN has ‘other annoyances’ and ended his part in
the discussion by stating that the project won’t move from CVS any time soon. Jani
backed him, saying that CVS works fine so long as you don’t try to outsmart it, and
SVN wasn’t better in his opinion. The move to SVN would come about, wrote Jani,
‘over my smoking carcass…’

Short version: That’ll be a ‘no’, then.


TLK: Streams, URI handling and XML

Rob Richards wondered whether he’d come across a couple of streams bugs.

Firstly, the following script:


<?php

$handle = fopen("file.txt", "w");
fwrite($handle, "SOME DATA");
// Delete the file while the stream is still open
unlink("file.txt");
if (fwrite($handle, "SOMEMORE") === FALSE) {
    print "CANNOT WRITE";
} else {
    print "Wrote Data";
}
fclose($handle);

?>

when run under Linux would delete the file and print “Wrote Data”, even though
the last call to fwrite() did nothing. Under Windows, the call to
unlink() would throw a ‘permission denied’ error. Shouldn’t Linux also
refuse to delete a file when there was an open stream?

Secondly, Rob had found that stream URIs needed to be escaped, but this wasn’t
true of URIs within the file system:


$test = file_get_contents("t%20e"); // results in error
$test = file_get_contents("t e");   // reads the file "t e"

regardless of OS. According to the RFC, he wrote, spaces should be escaped. Why
shouldn’t file system paths have to be escaped, when other protocols do?
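
The difference Rob describes can be reproduced with a small script along these lines. This is only a sketch: the scratch directory, the file contents and the example.com URL are arbitrary choices, and the URL is built but never fetched.

```php
<?php
// Scratch directory and file name are arbitrary choices for this example.
$dir = sys_get_temp_dir();
$plain = $dir . DIRECTORY_SEPARATOR . 't e';
file_put_contents($plain, 'hello');

// The literal name, space included, is what the file system wrapper expects.
$literal = file_get_contents($plain);

// The escaped form is treated as a different (non-existent) file name.
// @ silences the expected warning.
$escaped = @file_get_contents($dir . DIRECTORY_SEPARATOR . 't%20e');

// For a URL, by contrast, each path segment is escaped before use.
$url = 'http://example.com/' . rawurlencode('t e');

unlink($plain);
var_dump($literal, $escaped, $url);
```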

Tony Dovgal explained that, under Linux, data is physically deleted only when the
number of hard links to it and open descriptors becomes 0. In fact opening and
deleting a file, and then continuing to write to and read from it, was one of the
methods used to get a temporary file descriptor.
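
The temporary-file technique Tony mentions can be sketched as follows, assuming a POSIX system such as Linux. The 'zws' prefix and the variable names are made up for the example; on Windows the unlink() call would be expected to fail while the handle is open, as Rob observed.

```php
<?php
// Assumes a POSIX system such as Linux; on Windows the unlink() call
// fails while the handle is open, as described above.
$path = tempnam(sys_get_temp_dir(), 'zws');   // 'zws' is an arbitrary prefix
$handle = fopen($path, 'w+');

// Remove the directory entry; the open descriptor keeps the data alive.
unlink($path);
clearstatcache();
$stillListed = file_exists($path);            // false on Linux

fwrite($handle, "SOME DATA");
rewind($handle);
$contents = fread($handle, 1024);             // reads back what was written

// Closing the last descriptor is what finally frees the data.
fclose($handle);

var_dump($stillListed, $contents);
```

The data written after unlink() is therefore real and readable; it simply has no name in the file system any more.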

Later, Rob realized that there was a genuine issue with URI handling and libxml;
non-file system URIs only worked when the URI was double-escaped. He prepared a
patch against ext/libxml for this, but recognized that it could break existing
applications that double-escape only the arguments, rather than the entire URI. On
the other hand, he wrote, it would make libxml’s URI handling work like the rest of
the stream-based functions, while still allowing the loading of a URI that had been
completely escaped; the latter worked for libxml under PHP 5.0, although not for
other stream-based functions. He hoped to commit a fix in time for the PHP 5.1
release cycle, but wasn’t sure whether to commit it immediately, following more
testing, or not at all – although he felt that the benefits of the change would
greatly outweigh the potential breakage.

Short version: A proposal to alter URI handling in libxml is in the air.


CVS: allow_url_fopen and SOAP

Changes in CVS that you should probably be aware of include:

  • Dmitry’s fix to make SOAP work when allow_url_fopen is turned
    off. Dmitry also fixed bugs #33723 (php_value overrides php_admin_value) and
    #33999 (object remains object when cast to int), amongst many lesser changes
  • Frank Kromann’s changes to allow ext/sybase and ext/mssql to be
    used simultaneously under Windows
  • Andrey’s fixes to allow ext/mysqli to work in a 64-bit environment

In an attempt to keep test output homogenous, Derick requested that Ilia (and
anyone else fixing ext/date bugs) should use the DATE_ISO8601
constant for the format in the extension’s test suite.
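
For reference, a date formatted with that constant looks like this. The sketch pins the timezone to UTC and uses timestamp 0 purely so that the result is predictable; neither choice comes from Derick's request.

```php
<?php
date_default_timezone_set('UTC');   // fixed zone so the output is stable
echo date(DATE_ISO8601, 0), "\n";   // 1970-01-01T00:00:00+0000
```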

Zeev Suraski, meanwhile, had a short discussion with Marcus over the naming of
classes in ext/spl. He was concerned that PHP’s class namespace was becoming
polluted, and asked if there were any particular reason for not prefixing the SPL
classes with Spl. Marcus argued that all SPL classes and interfaces
could go into a Spl namespace once there was namespace support; he
didn’t want to end up with spl:splclassname or
spl:spl_classname, calling the latter in particular ‘hyper ugly’. However, if
Zeev insisted on his doing this, he’d use Spl without an underscore as a prefix
for the upcoming classes.

Dmitry fixed a possible compile-time memory corruption of foreach($a as $b),
and Derick promptly requested a test case for it (is that possible?). Tony wrote
to say he’d committed a test for a still-existing memory leak, which is only
visible under valgrind with the Zend memory manager disabled. Dmitry’s fix for a
memory leak in foreach() when a variable is undefined followed shortly.

Short version: ‘Homogenous’ is the keyword of the week.


PAT: More namespace stuff

Edin committed M. Sisolak’s long-time PAT resident to allow ext/gd to
build on Windows systems without t1lib, which will save win32 users from having to
see that warning at every build.

Marcus committed some run-tests.php changes to add ENV
support from Michael Wallner, but then realized it could break any test script using
__FILE__. Michael responded that currently his own extension
pecl/http was the only one using the ENV section.

Jessie Hernandez mailed in the latest version of his namespace patch, beta 1. The
shift/reduce conflict had been fixed, and its features are as follows:

- Simple imports: import ns:class1;
- Import aliases: import ns:class1 as my_alias;
- Namespace imports: import namespace ns;
- Anonymous namespaces: namespace { class file_class{} }
- Namespace-private classes: namespace ns{ private class prv_class{} }

Jessie had also added two new functions to support namespace
imports.


array get_imported_namespaces([string classname])

returns an array of imported namespaces for the current file, and


void autoload_import_class(string classname, string namespacename)

which is only used inside __autoload(), imports a class in a given
namespace for the currently executing file.

There is sample usage of both functions in
tests/classes/namespace_autoload.php in Jessie’s patch.

Jessie felt that all the issues that had originally been presented against
namespace solutions were now resolved. He wrote that imports and namespace imports
are handled by the user with __autoload(), meaning there are now no
restrictions on the class file naming or directory structure. Private classes are
honored, and classes defined within an anonymous namespace can only be used within
the file declaring that namespace.

Following some discussion with John LeSueur over the finer points of the patch,
Jessie asked Marcus whether a default implementation following the PEAR convention
could be provided in SPL’s __autoload(), thereby allowing namespace
imports to be used out of the box. Marcus agreed that out of the box was good, but
felt that the simplest default should be offered, i.e. the directory separator would
replace the namespace colon. Jessie pointed out that this would only allow class
imports to work out of the box, and he’d like to provide fast namespace imports by
default too; he offered to provide a patch for spl_autoload()
shortly.
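
The ‘simplest default’ Marcus has in mind amounts to a string mapping, which might be sketched like this. The function name namespaced_class_to_path() and the '.php' suffix are assumptions for illustration; neither comes from Jessie’s patch or from SPL, and since the colon syntax exists only in the patch, the example handles the class name purely as a string.

```php
<?php
// Hypothetical helper: neither the function name nor the '.php' suffix
// comes from Jessie's patch or from SPL.
function namespaced_class_to_path($className)
{
    // 'ns:class1' becomes 'ns/class1.php' (or 'ns\class1.php' on Windows).
    return str_replace(':', DIRECTORY_SEPARATOR, $className) . '.php';
}

echo namespaced_class_to_path('ns:class1'), "\n";
```

A mapping like this resolves one class at a time, which is why Jessie notes that it covers class imports but not whole-namespace imports.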

Andi Gutmans wrote in to say that he hadn’t had a chance to evaluate either the
namespace patch or the several emails on the subject yet, and hoped to have time for
it following PHP 5.1 RC1. He added that it would be interesting to see whether it
really did address the issues raised previously; he felt some of them weren’t
solvable, although limiting namespace support to classes might get around those. He
ended with a reminder that ‘we need to make sure we don’t complicate the language
too much…’ and promised more constructive feedback in the fairly near
future.

James Crumpton wondered aloud whether it might be possible to feed a formatted
string to spl_autoload() somehow to set the naming convention, but had
no idea how this might be achieved. The strangely named l0t3k asked about extension
support; how could he place his extension-created class into a namespace? Would it
work if he simply declared the class name as Ns:Class? Jessie assured
him that it would. He then produced Beta 2, which added some memory leak fixes and a
new constant, __NAMESPACE__, following it swiftly with a copy
containing ZTS fixes. The latter is the version currently in the PAT directory
(http://devzone.zend.com/article/1432).

On a much more mundane level, Jani committed my copyright year changes (plus
several of his own). php -v, which has been irritating me since
January, reports 2005 now.

Kamesh Jayachandran asked permission to add
PHP_SUBST(SOCKETS_SHARED_LIBADD) and
PHP_SUBST(EXIF_SHARED_LIBADD) to ext/exif and
ext/sockets’ respective .m4 files, to mend the PHP build under
NetWare. Marcus gave permission for the change to ext/exif, but Jani
suggested that this might not be the best fix for the problem, and the patch was
never committed as a result.

Short version: Apart from namespaces, not a lot going on.
