Zend Weekly Summaries Issue #366


TLK: Safe mode [again]
TLK: Ignored patches
BUG: always_inline symbol clash
REQ: conf.d-like support
TLK: Preliminary taint support [continued]
NEW: PHP 5.2.5
TLK: Parallel database queries
CVS: T_IMPORT vs T_USE resolved, yay
PAT: __sleep

4th November - 10th November 2007

TLK: Safe mode [again]

Mark Krenz picked up the trail of his sporadic one-man campaign to have safe_mode reinstated in PHP 6, explaining that he hadn't had time to deal with the discussion back in the day. In his search for 100% security in a shared environment, Mark was now preparing to look into mpm_itk as a means to control user access. However, the best solution he'd found for securing PHP to date was still to run PHP in safe mode and rely on safe_mode_exec_dir to prevent users from running arbitrary executables. After lengthy security analysis of the options available to him, he still hadn't found anything to replace it. If the PHP development team could just provide a replacement for safe_mode_exec_dir, Mark would be much happier. open_basedir wasn't a solution; a user could still upload cat and run something like:

print exec('/home/myuser/www/cat /home/otheruser/private/mysqlinfo');

Mark realized he'd probably be advised to disable all the exec functions, 'but tell that to joe user who wants to be able to run some popular photo gallery software or blog that needs to run an external command like ImageMagick'. He also recognized that an execution directory restriction would be vulnerable to arbitrary arguments passed to the restricted programs, but still saw this approach as safer than any of the other options available. The usual advice, to run Apache in a chroot jail, struck Mark as 'unreasonable', since it would entail running 200+ instances of Apache on a single server; it also wasn't trivial to set up.
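
To make the idea concrete, here is a minimal userland sketch of the kind of check an execution directory restriction performs. It is not how safe_mode_exec_dir is implemented inside PHP; the helper names is_allowed_binary() and restricted_exec() are hypothetical, and a wrapper like this only protects code that chooses to call it, which is presumably why Mark wanted the restriction enforced by PHP itself.

```php
<?php
// Hypothetical helper, not part of PHP: true only when $binary resolves
// (after symlinks and ".." are expanded) to a file directly inside $execDir.
function is_allowed_binary($binary, $execDir)
{
    $realBinary = realpath($binary);
    $realDir    = realpath($execDir);
    if ($realBinary === false || $realDir === false) {
        return false;
    }
    return is_file($realBinary) && dirname($realBinary) === $realDir;
}

// Hypothetical wrapper: runs $binary with shell-escaped arguments, or refuses
// by returning false. Escaping the arguments does not make them harmless to
// the program that receives them, which is the weakness Mark acknowledged.
function restricted_exec($binary, array $args, $execDir)
{
    if (!is_allowed_binary($binary, $execDir)) {
        return false;
    }
    $command = escapeshellarg(realpath($binary));
    foreach ($args as $arg) {
        $command .= ' ' . escapeshellarg($arg);
    }
    $output = array();
    $status = 0;
    exec($command, $output, $status);
    return array('status' => $status, 'output' => $output);
}
```

With this in place, the uploaded cat from the example above would be refused unless it had been placed in the permitted directory by an administrator.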

Cristian Rodriguez immediately argued that the kind of directory restriction Mark wanted belongs at the level of the operating system. Removing safe_mode simply eliminates a false sense of security and makes it clearer that people should secure their servers better; 'whoever convinced you that it is a good thing does not have a clue.' Mark replied tiredly that he felt like he was 'up against a religion'. He'd done his own security analysis and knew what he was talking about, but the term "safe mode" seemed to short-circuit a lot of people's brains. He asked rhetorically what could sanely be done at OS level to prevent random program execution by Apache and its modules. All the suggestions he'd heard so far seemed to come from those who'd never run a shared user environment; either they simply wouldn't work, or they'd be cost-prohibitive.

At this stage Mark altered his plea slightly and made a more reasonable request for 'a transition period in which sane solutions are presented to the community'. Simply dropping safe_mode and telling everyone to deal with it was irresponsible; tools for dealing with it should be made available, and those tools should include such things as a setting to prevent execution outside a named directory.

Alexey Zakhlestin recommended a combination of FastCGI and suexec, which would give every user their own instance of PHP with the uid as owner, and cited Textdrive/Joyent as a hosting provider happily using this approach. Mark promptly produced a document suggesting they had poor security, and argued that for him at least, relying on security by obscurity would mean sleepless nights. Alexey queried this; had Mark simply ignored the FastCGI part? Mark replied in the negative; he simply didn't feel that fastcgi/suexec/mod_suphp were able to handle everything. Besides, wasn't the whole point of PHP originally supposed to be that it was part of Apache? (It later became clear that Mark hadn't actually tried the FastCGI approach in recent years.)

Michael McGlothlin wondered why Mark didn't just give every user their own virtual machine. Mark explained that he offers tiered hosting; some of his users pay for precisely that, but mega-cheap hosting is only financially viable if there are a couple of hundred users per machine. Michael ran through a few possibilities before conceding that any good solution would be more resource intensive than safe_mode. The best he could offer for 'people who only want to pay $5/mo' was to keep them on PHP 5 forever.

Nate Gordon provided some backing for Mark's arguments; he pointed out that it's not always possible to run a script as a single user in a shared environment, because the content may be owned by a group of users. He didn't see how it could be difficult to lock down execution to a specific directory on a per-vhost basis via PHP, given that PHP provides the means for execution. Nate added that he would be the first to acknowledge that the basic premise of safe_mode is broken. What he really needed was the ability to disable execution of anything other than PHP on a per-vhost basis. Stefan Esser's suhosin project provides a per-vhost function disabling feature, but Nate really didn't understand why it should be left to an extension to provide that. He'd also like a per-vhost exec_dir limit... 'People are too quick to throw out the baby with the bath water on safe_mode. It isn't completely useless to everyone.'

Peter Brodersen noted that the idea of unbundling safe_mode_exec_dir from safe_mode had come up before and been shelved for "later"; perhaps now was later. Basically, the need was for a central switch for exec functions, rather than a long and changeable list under disable_functions.

Mark couldn't have agreed more. His biggest concern was that Linux distros would start bundling PHP 6 before it had that feature. He therefore saw this as urgent, and wrote that he'd be willing to write documentation or a migration guide for php.net, if someone could only provide the C skills to get safe_mode_exec_dir - or some equivalent - into CVS HEAD very soon.

Short version: Retaining safe_mode_exec_dir has been mooted several times in the past and never rejected (sorry Tony).

TLK: Ignored patches

Greg Beaver had discovered for himself what it feels like to be on the blunt end of the patch review process, and he didn't like it much. Two of his recent patches - one to implement multiple namespaces per file (sans brackets), and one to remove keyword restrictions for methods - had seemingly fallen through the cracks. Greg wrote rather bitterly that he'd like a review and feedback or even a commit, 'so we can still pretend that outsider contributions have an impact on PHP, even those from annoying people like me.'

Stas Malyshev wrote that one of those patches (removing keyword restrictions for methods) should probably be applied, but he wasn't sure which of Greg's many patches it was; there had been two along those lines, as he recalled. As for multiple namespace support, it brought too many complications to both the syntax and the Zend Engine, and Stas really wasn't convinced the end was worth the means.

Short version: Not ignored so much as 'on hold'.

BUG: always_inline symbol clash

Wez Furlong took time out to report a build problem in the PHP_5_3 branch on Mac OS X. It seems that the system headers on the platform use __attribute__((always_inline)), and zend.h now defines always_inline to 'something else', causing problems when the compiler tries to resolve that attribute name. Wez suggested prefixing the defines used in the Zend Engine with zend_ or a similar namespacing token. In fact, he'd assumed this was standard practice. Any similar updates should also be fixed in the same way.

Dmitry Stogov wondered if it might not be better just to define always_inline as inline on Mac OS X, but Wez explained that this wasn't a platform-specific issue. Symbol leakage has the potential to break any library using that feature - he'd just happened to notice it on OS X. It would be best to rename the symbols to avoid conflict. Dmitry asked for more information about the compiler and the error, and whether there were existing reports about the issue. Wez patiently demonstrated the problem with a faked system definition, and gave his compiler information (GCC 4.0.1) as requested. Dmitry okayed Wez' original patch at this stage, and asked him to commit it - but Wez was all out of time, and that rough demo patch only covered one small area in any case.

Short version: Symbol leakage needs attention.

REQ: conf.d-like support

A Sriram Natarajan wrote to the internals list with a request for "'Include' file/directory support (like 'conf.d' in Apache httpd)". His idea was that loaded extensions could be defined in a separate file, rather than in a single php.ini file. Sriram believed that some Linux distributions already do this, but wanted to know whether the facility could be considered for the standard PHP distribution.

Cristian Rodriguez introduced Sriram to the --with-config-file-scan-dir configuration option utilized by those Linux distributions; 'no other black magic involved'.
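
For anyone wanting to check what their own build does, the following sketch lists the configuration files the running PHP binary actually read. It assumes PHP 5.2.4 or later for php_ini_loaded_file(); the helper name loaded_ini_files() is made up for illustration. A scan directory is normally chosen at build time, for example with ./configure --with-config-file-scan-dir=/etc/php.d, where the path is only an assumed example.

```php
<?php
// Hypothetical helper: collect php.ini plus every file picked up from the
// configured scan directory, in the order PHP reports them.
function loaded_ini_files()
{
    $files = array();

    $main = php_ini_loaded_file();        // path to php.ini, or false if none
    if ($main !== false) {
        $files[] = $main;
    }

    $scanned = php_ini_scanned_files();   // comma-separated list, or false
    if ($scanned !== false) {
        foreach (explode(',', $scanned) as $path) {
            $path = trim($path);          // each comma is followed by a newline
            if ($path !== '') {
                $files[] = $path;
            }
        }
    }
    return $files;
}

foreach (loaded_ini_files() as $file) {
    echo $file, "\n";
}
```

If no scan directory was configured, only php.ini (if any) is listed.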

Short version: Sometimes things are less complicated than they seem.

TLK: Preliminary taint support [continued]

Cristian was, however, still finding it a complicated business to build PHP with taint support. He'd tried Wietse Venema's most recent tarball, which included code to update the apache2 SAPI, but it still wasn't compiling for him. The compiler complained about a casting issue somewhere in the CGI SAPI and then bailed out.

Wietse tried it himself, but couldn't reproduce the problem.

Christian Schneider didn't have any build problems either, and wrote to express his happiness with the patch. He posted a small patch of his own adding taint support to func_get_arg[s](), and suggested that the taint functions should probably be namespaced with a taint_ prefix before being integrated into the PHP core. That said, he saw taint mode as such a useful tool that he planned to patch PHP on his team's development boxes, and promised Wietse more feedback in the near future if he was authorized to go ahead with this.

Wietse thanked Christian for his patch and explained that he intended to revise the user interface after catching up with PHP 5.2.5, since that took priority.

Short version: Surely that should read 'PHP_5_3'?

NEW: PHP 5.2.5

Ilia Alshanetsky, as Release Master for the PHP 5.2 series, announced the release of PHP 5.2.5 as follows:

The PHP development team would like to announce the immediate
availability of PHP 5.2.5. This release focuses on improving the
stability of the PHP 5.2.x branch with over 60 bug fixes, several of
which are security related. All users of PHP are encouraged to
upgrade to this release.

Security Enhancements and Fixes in PHP 5.2.5:
---------------------------------------------
- Fixed dl() to only accept filenames
  Reported by Laurent Gaffie
- Fixed dl() to limit argument size to MAXPATHLEN (CVE-2007-4887)
  Reported by Laurent Gaffie
- Fixed htmlentities/htmlspecialchars not to accept partial multibyte
  sequences
  Reported by Rasmus Lerdorf
- Fixed possible triggering of buffer overflows inside glibc
  implementations of the fnmatch(), setlocale() and glob() functions
  Reported by Laurent Gaffie
- Fixed "mail.force_extra_parameters" php.ini directive not to be
  modifiable in .htaccess due to the security implications
  Reported by SecurityReason
- Fixed bug #42869 (automatic session id insertion adds sessions id
  to non-local forms)
- Fixed bug #41561 (Values set with php_admin_* in httpd.conf can be
  overwritten with ini_set())

Key enhancements in PHP 5.2.5 include:
--------------------------------------
- Upgraded PCRE to version 7.3
- Updated timezone database to version 2007.9
- Added ability to control memory consumption between requests using
  ZEND_MM_COMPACT environment variable
- Improved speed of array_intersect_key(), array_intersect_assoc(),
  array_uintersect_assoc(), array_diff_key(), array_diff_assoc() and
  array_udiff_assoc()
- Fixed bug #43139 (PDO ignores ATTR_DEFAULT_FETCH_MODE in some
  cases with fetchAll())
- Fixed bug #42785 (json_encode() formats doubles according to locale
  rather then following standard syntax)
- Fixed bug #42549 (ext/mysql failed to compile with libmysql 3.23)
- Over 60 bug fixes

For users upgrading from PHP 5.0 and PHP 5.1, an upgrade guide is
available here, detailing the changes between those releases and PHP 5.2.5.

For a full list of changes in PHP 5.2.5, see the ChangeLog.

Gaetano Giunta went immediately to download the new release and undertake a full analysis of the versioning information. He reported that, of the 83 extensions shipped with PHP, 7 had had changes in their source since the PHP 5.2.4 release but had not updated the versioning information as reported by phpversion(). Of these 7, only ext/tidy had an updated version number in the global phpinfo() page. Another extension - ext/oci8 - had updated version information but no changes in the code. Gaetano wasn't certain about ext/mysql; changes had been made there, but they may have had no functional effect. Five other extensions had no versioning information whatsoever, and a whopping 46 extensions had no versioning information available via phpversion().
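
A survey of this kind can be approximated from userland. The sketch below is not Gaetano's actual method, which isn't described in the thread; it simply asks phpversion() about every loaded extension and records null where no version is reported. The function name extension_versions() is hypothetical.

```php
<?php
// Hypothetical helper: map each loaded extension to the version string
// phpversion() reports for it, or null when it reports none.
function extension_versions()
{
    $versions = array();
    foreach (get_loaded_extensions() as $extension) {
        $version = phpversion($extension);   // false when no version is set
        $versions[$extension] = ($version === false || $version === '')
            ? null
            : $version;
    }
    ksort($versions);
    return $versions;
}

foreach (extension_versions() as $extension => $version) {
    echo $extension, ': ', ($version === null ? '(no version)' : $version), "\n";
}
```

Results will differ between builds, and this only covers extensions that are loaded, not everything shipped in the source tree.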

Tony Dovgal explained that ext/oci8 is released through PECL, and its release cycle isn't synchronized with the PHP core; the same ought to apply to far more extensions in theory than it does in practice. That said, he felt that core extensions should only have a version update following major changes or additions, and this would be a rare event in a bug-fix only branch.

Gaetano argued that PECL extensions should sync better with the PHP release cycle. He'd found that pecl/oci8 had been updated to version 1.2.4 a single day after the PHP 5.2.4 release, making the extension shipped with PHP 5.2.4 appear different to the PECL version even though the code was exactly the same. Even 1.2.4-dev in the core would have been better than leaving it at 1.2.3. Still, misleading version information was better than none at all... One problem with the core and PECL versions not being in sync was that there is movement between the two; another was that it's perfectly possible for a user to load a PECL extension to replace a core extension. That makes it impossible to rely on the PHP version when checking for the presence of a given feature or bug, which wouldn't be the case if every fix were accompanied by a change to the extension's version number.

Tony pointed out that there is a $Revision$ CVS tag for that purpose, and a version number doesn't mean quite the same thing. Gaetano disagreed; to him, it seemed the CVS tag was intended for those compiling from source and reporting or fixing bugs, and not as something to be accessed and used by the general PHP coder.

Short version: Gaetano also has a one-man campaign on the go.

TLK: Parallel database queries

Arend van Beelen was doing some research. He hoped to develop a shared library that could perform database queries to multiple databases in parallel, and he hoped to be able to use it from within PHP. His main concern was thread safety. Arend could see three possible approaches:

  • Use multi-threading within the library, but have it expose a blocking single-threaded API
  • Use a single thread and asynchronous socket communication
  • Use a daemon on the localhost as a middle man, allowing PHP to connect once and then handling all the database work itself before passing back a result

Given that the aims included stability and minimal overhead, could anyone advise him about the pros and cons of these solutions?

One Donal McMullen pointed out that a 'maxed out' CPU on a web server won't go away on its own, but if that wasn't the issue, the curl_multi_* functions can be useful. A cheap way to parallelize database or data-object access would be to implement a services-oriented architecture and call that library from a script using said functions. The advantage of this approach was its quick and easy implementation; it would, however, introduce latency into data retrieval, making it slower for most applications.

Arend thanked Donal for his suggestion, and agreed it might provide some quick solutions in a small scale venture. He'd omitted to explain that his own situation involved literally hundreds of servers, and the farm was growing. Each time the number of web servers increased, the databases became the bottleneck. Although parallel querying wouldn't magically resolve the bottlenecks, the specific problem Arend was addressing was that of tables that are divided over multiple database clusters. The aim of parallelization techniques, in this case, would be to remove the strain of dealing with distributed databases from the PHP application.

Lukas Smith shared some insights about existing databases. The pgsql extension already has support for asynchronous queries in pg_send_query() and friends. It should be possible to use MySQL Proxy to create something that splits a single query into multiple queries and then rejoins them. Finally, since MySQL AB are actively developing a native PHP library, Arend might want to talk to them about his ideas.

Arend replied that, while Lukas' suggestion of query splitting was exactly what he hoped to achieve, MySQL Proxy didn't appear to be the best way of approaching it. Adding another proxy layer between the web servers and the database servers wouldn't only mean additional overhead; it would bring new potential bottlenecks and points of failure. That was precisely why Arend hoped to move the functionality onto the web servers themselves. That said, some of MySQL Proxy's functionality is exactly the kind of thing he needs, and contacting MySQL AB about the possibility of re-using some of its components might well be a good idea.

Rasmus Lerdorf recommended a simple single-threaded event-driven approach, and suggested that Arend look into the source behind the curl_multi_* implementation, since he was essentially planning to do the same thing. Writing a threaded library would mean dealing with a lot of issues (portability, threading clashes, signal handling and so on), and Rasmus didn't see how intra-request thread scheduling would help at all when it came to busy web servers.
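
The single-threaded, event-driven pattern Rasmus describes can be sketched in PHP itself with stream_select(). This is only a simulation under stated assumptions: each 'database connection' is faked with a local socket pair that already holds its reply, and the names make_fake_backend() and gather_replies() are invented for the example. A real library would hold non-blocking sockets to the database servers and speak their wire protocols.

```php
<?php
// Simulated backend (assumption): one end of a socket pair plays the server,
// writes its reply and closes; the other end is returned as the "connection".
// STREAM_PF_UNIX is assumed; on Windows, STREAM_PF_INET would be needed.
function make_fake_backend($reply)
{
    $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    fwrite($pair[1], $reply);
    fclose($pair[1]);
    stream_set_blocking($pair[0], false);
    return $pair[0];
}

// One thread, one loop: wait until any connection is readable, read what is
// there, and drop connections that have reached end of stream.
function gather_replies(array $connections, $timeoutSeconds = 5)
{
    $results = array_fill_keys(array_keys($connections), '');
    while ($connections) {
        $read   = array_values($connections);
        $write  = null;
        $except = null;
        $ready  = stream_select($read, $write, $except, $timeoutSeconds);
        if ($ready === false || $ready === 0) {
            break;                           // error or timeout: give up
        }
        foreach ($read as $stream) {
            $name  = array_search($stream, $connections, true);
            $chunk = fread($stream, 8192);
            if ($chunk !== false && $chunk !== '') {
                $results[$name] .= $chunk;
            }
            if (feof($stream)) {
                fclose($stream);
                unset($connections[$name]);
            }
        }
    }
    return $results;
}

$replies = gather_replies(array(
    'cluster_a' => make_fake_backend('rows from cluster A'),
    'cluster_b' => make_fake_backend('rows from cluster B'),
));
```

The loop never blocks on a single slow backend, which is the property Arend was after, and it avoids the threading issues Rasmus lists.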

Short version: This sounds like a scarily big project.

CVS: T_IMPORT vs T_USE resolved, yay

Changes in CVS that you should probably be aware of include:

  • GD library bug #43121 (gdImageFill() with IMG_COLOR_TILED crashes httpd) was fixed in PHP_5_2, PHP_5_3 and CVS HEAD [Mattias Bengtsson]
  • In the core, the copy() function's optional third parameter context was backported to the 5_3 branch [Jani Taskinen]
  • Core bug #43197 (array_intersect_assoc() does not emit warning messages for error inputs) was fixed in 5_3 and HEAD [Ilia]
  • Following the addition of zend_mm_set_custom_handlers() in the Zend API, user defined malloc(), realloc() and free() are supported in PHP_5_3 and CVS HEAD (affects internals only) [Dmitry]
  • T_IMPORT is now T_USE [Dmitry]
  • There is now a glob stream wrapper, glob://, in PHP_5_3 branch and CVS HEAD [Marcus]
  • Core bug #43196 (array_intersect_assoc() crashes with non-array input) was fixed in 5_2, 5_3 and HEAD [Jani]
  • Zend Engine bugs #43201 (Crash on using unitialized vals and __get()/__set()) and #43175 (__destruct() throwing an exception with __call() causes segfault) were fixed in the PHP_5_3 branch and CVS HEAD [Dmitry]
  • Streams bug #43216 (stream_is_local() returns FALSE on file://) was fixed in 5_3 and HEAD [Dmitry]
  • A bunch of ext/interbase bugs were finally fixed and/or closed in the PHP_5_3 branch and CVS HEAD - #30690, #30907, #32143, #39056, #39397, #39700 and #42284. See history for details. [Lars Westermann]
  • Following the PHP 5.2.5 release, the fixes for Zend Engine bugs #43175 (__destruct() throwing an exception with __call() causes segfault) and #43201 (Crash on using unitialized vals and __get/__set) and streams bug #43216 (stream_is_local() returns FALSE on file://) were merged to the 5_2 branch [Dmitry]
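
As an illustration of the glob:// wrapper mentioned in the list above, the following sketch lists the file names matching a pattern by handing a glob:// URL to DirectoryIterator. It assumes a PHP version that includes the wrapper (5.3 or later); the helper name glob_names() is hypothetical.

```php
<?php
// Hypothetical helper: return the sorted base names of all entries that
// match $pattern, read through the glob:// stream wrapper.
function glob_names($pattern)
{
    $names = array();
    foreach (new DirectoryIterator('glob://' . $pattern) as $entry) {
        $names[] = $entry->getFilename();
    }
    sort($names);
    return $names;
}

// Example: PHP files in the current working directory.
print_r(glob_names(getcwd() . '/*.php'));
```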

In other CVS news, Jani Taskinen added support for special [PATH=/opt/httpd/www.example.com/] and [HOST=www.example.com] INI sections; these are intended for admins, and cannot be overridden in user-defined INI files. He also backported support for loading modules using full paths, via the extension directive.

Andrei Zmievski welcomed Bob Majdak, another new contributor, into the PHP-GTK fold. Wez did something far more mysterious: a new module named php-objc appeared overnight in the php.net CVS repository.

Short version: An Objective-C bridge appears without fanfare - and there will definitely be a PHP 5.2.6.

PAT: __sleep

Sara Golemon helped David Zülke polish up his patch adding a new option, ignore_errors, for the HTTP fopen wrapper. This is now in CVS HEAD and the PHP_5_3 branch, and offers a way to pick up HTTP response headers regardless of status.
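
A minimal sketch of how the option might be used, assuming a PHP version that includes it. The helper names lenient_http_context() and fetch_with_headers() are hypothetical, and the URL in the trailing comment is a placeholder. With ignore_errors set, a 4xx or 5xx response no longer makes fopen() fail, so the status line and headers can still be inspected.

```php
<?php
// Hypothetical helper: a stream context asking the HTTP wrapper to hand back
// the response even when the status code signals failure.
function lenient_http_context()
{
    return stream_context_create(array(
        'http' => array('ignore_errors' => true),
    ));
}

// Hypothetical helper: fetch $url and return headers and body, or false
// when the connection itself could not be made.
function fetch_with_headers($url)
{
    $stream = @fopen($url, 'r', false, lenient_http_context());
    if ($stream === false) {
        return false;
    }
    $meta = stream_get_meta_data($stream);   // 'wrapper_data' holds the headers
    $body = stream_get_contents($stream);
    fclose($stream);
    return array(
        'headers' => isset($meta['wrapper_data']) ? $meta['wrapper_data'] : array(),
        'body'    => $body,
    );
}

// Usage (placeholder URL):
// $response = fetch_with_headers('http://www.example.com/missing-page');
// echo $response['headers'][0];   // the status line, whatever the status
```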

One Andrew Minerd offered a patch against CVS HEAD and the PHP_5_3 branch that would allow the magic __sleep() function to return NULL 'to continue the normal serialization process'. This, he wrote, would allow the function to clean up without having to resort to the Reflection API. Andrew had checked his patch against the test suite, and included a separate patch correcting a wrong EXPECTF in one of the existing tests.
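
For context, __sleep() as it stands must return an array of property names, so a class that only wants to clean up before serialization still has to enumerate its own properties. The sketch below shows one common workaround using get_object_vars(); the class CachedReport is invented for the example. This approach does not see private properties declared in parent classes, which is presumably where the Reflection API comes in. Under Andrew's proposal, the last line of __sleep() could simply return NULL instead.

```php
<?php
class CachedReport
{
    public $title;
    protected $rows = array();
    protected $handle;              // stands in for an open resource

    public function __construct($title)
    {
        $this->title  = $title;
        $this->handle = 'open';
    }

    public function hasOpenHandle()
    {
        return $this->handle !== null;
    }

    public function __sleep()
    {
        $this->handle = null;       // the actual clean-up
        // Current requirement: name every property that should be kept.
        return array_keys(get_object_vars($this));
    }
}

$copy = unserialize(serialize(new CachedReport('weekly')));
```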

Johannes Schlüter applied a Zend Engine patch from Andrey Hristov bringing persistency support to zend_ptr_stacks (affects internals only).

Stas went looking for ignored patches (this does happen from time to time) and came across Martin Jensen's patch to unify phpinfo() output for PDO drivers. He wondered why the patch was PECL-specific when PDO is not, and moved on to Wez Furlong's large file support patch. Stas still had some concerns stemming from 64-bit compatibility and the change to the FILE* structure, and wondered if Wez knew how LF-enabled code copes with non-LF-enabled code. Again, there was no response; the original patch being three weeks old, it's likely Wez missed the query.

And finally, Johannes committed a trivial patch from imagick developer Mikko Koppanen, replacing #ifdef with #if defined in request initialization/shutdown in the PHP_5_2 branch of ext/mysqli.

Short version: LFS needs serious research before it stands a chance.
