Blueprint for PHP Applications: Bootstrapping (part 1)


[ previously in this series: Blueprint for PHP Applications: Cornerstone ]

Bootstrapping alludes to a German legend about Baron Munchhausen, who was able to lift himself out of a swamp by pulling himself up by his own hair. In later versions he was using his own boot straps to pull himself out of the sea which gave rise to the term bootstrapping.

That definition from Wikipedia has more flair to it than I can probably give this article, but bootstrapping is nonetheless an incredibly important part of a PHP web application. It is the ignition, the launch pad, the booster rockets, and even the safety gear for our trip through PHP Best Practices. Without a well-designed bootstrap, our application will never get off the ground.

What is Bootstrapping?

Many PHP applications funnel server requests into one (or a few) PHP source files that set up the environment and configuration for the application, manage sessions and caching, and invoke the dispatcher for the MVC framework. These files can do more, but their main job is to take care of the consistent needs of every page of a web application.

In our Blueprint for PHP Applications, we will have a core bootstrapper that receives all dynamic requests for an application and applies a template for application behavior that we can later extend. It will allow us to later customize the functionality for each unique application.

Feeding the Bootstrapper

Typically a PHP application will use Apache rewrite rules (or an IIS ISAPI URL rewriter) to reshape URLs into prettier, search-engine- and human-friendly works of art. Applications also use rewrite rules as the funnel for their bootstrap code. Some of these rewrite rules can be quite complex. For example, this one from MediaWiki:

   RewriteRule /wiki/([\w:]+)(?!/)(?:(?:\?)(.+))? /wiki/index.php\?title=$1&$2 [I]
   RewriteRule /wikiDir/(?!Special)([\w:]+)/((?:view)|(?:watch)|(?:unwatch)|(?:delete)|(?:revert)|(?:rollback)|(?:protect)|(?:unprotect)|(?:info)|(?:markpatrolled)|(?:validate)|(?:render)|(?:deletetrackback)|(?:print)|(?:dublincore)|(?:creativecommons)|(?:credits)|(?:submit)|(?:edit)|(?:history)|(?:raw)|(?:purge))(?:\?(.+))? /wikiDir/index.php\?title=$1&action=$2&$3 [I]
   RewriteRule /wikiDir/(?=Special)([\w:]+)/([\w:]+) /wikiDir/index.php\?title=$1&target=$2 [I]

If you are a regular expression fan, these rules probably come through crystal clear. For the rest of us, they can be problematic and may do more work than we actually need. The loss of simplicity is not worth the gain.

Instead, why not keep the funnel simple and do the rest of the rewriting ourselves in PHP code? When using the Zend Framework front controller this is the suggested tack, and for new application development it makes sense to do so. Not only will this simplify our Apache configuration, but our code is also more likely to port to other web servers. To implement this concept, we will rely on everything being pushed directly to one file, our index.php. This would be done using a rewrite rule similar to this one from WordPress:

  RewriteEngine On
  RewriteBase /
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteRule (.*) /frontcontroller.php?foo=$1 [L]

This is close to what we want, but this rule excludes real files and directories from the rewriting and lets them pass through untouched. Therefore any actual, living, breathing file on our file system will be served as normal. Is that what we want?

It could be argued that anything under directories containing PHP, include, configuration, or other application files should be protected from being served directly by the web server. Either all application requests should go through the bootstrapper, or static files should be served in a way that exposes just the safe files and nothing more. This model protects the application from having anything read directly from the file system that wasn’t intended. For example, have you ever had an .inc file served as plain text, accidentally exposing the inner passwords of your site? Regardless of your answer, let’s follow the path of safety and make sure it never happens in the future. Therefore our rewrite rule will push everything to the bootstrapper, allow a small set of safe static file types to be served directly, and then require subdirectories to expose their own safe files for further exceptions. Those exceptions will mostly be made in directories containing only static HTML, images, CSS, and client scripts.

Here is the entire proposed .htaccess file for the root directory of our application:

  RewriteEngine on
  RewriteBase /
  RewriteRule !\.(js|ico|txt|gif|jpg|png|css)$ index.php
  php_flag magic_quotes_gpc off

This does the job of funnelling to index.php, but it allows some basic file types to be served directly (.js, .ico, .txt, .gif, .jpg, .png, and .css, but not .html or any others). If we get fancy and start generating dynamic CSS on the fly, or custom JavaScript files from PHP, or want to expose additional safe file types, we will come back and change the rule to do our bidding.
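To make the effect of that rule concrete, here is a minimal PHP sketch that mirrors its extension whitelist. The function name isServedDirectly and the sample paths are made up for illustration, and the sketch assumes PHP 7 or later. It only imitates the pattern match; Apache does the real work.

```php
<?php
// Hypothetical helper: mirrors the extension whitelist of the RewriteRule.
// It only imitates the pattern match; Apache performs the actual rewriting.
function isServedDirectly(string $path): bool
{
    // Same pattern as the rule, without the leading "!" negation.
    // Like the rule, the match is case-sensitive.
    return preg_match('/\.(js|ico|txt|gif|jpg|png|css)$/', $path) === 1;
}

// Illustrative paths only; none of these files are part of the article.
foreach (['css/site.css', 'includes/db.inc', 'about.html'] as $path) {
    echo $path, ' => ', isServedDirectly($path) ? 'served directly' : 'index.php', "\n";
}
```

Anything the function rejects, including .inc and .html files, would be handed to index.php instead of being read straight off the disk.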

In the real world it is likely that you are not running every application from the root of the web server. When you are in another directory, the RewriteBase setting should change to match the alias you are running under. For example, I am running the above .htaccess under an alias “test”, and therefore to make a URL like http://myhost/test/ work, I should change my file to:

  RewriteEngine on
  RewriteBase /test/
  RewriteRule !\.(js|ico|txt|gif|jpg|png|css)$ index.php
  php_flag magic_quotes_gpc off
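Once an application runs under an alias, the PHP side usually needs to know the same base path so that it can work with application-relative paths. The following is a rough sketch of one way to do that, assuming PHP 7 or later; the function name appRelativePath is hypothetical, and the base path is simply whatever value was given to RewriteBase.

```php
<?php
// Hypothetical helper: returns the request path relative to the application's
// base path (the value used for RewriteBase), without the query string.
function appRelativePath(string $requestUri, string $basePath): string
{
    // Drop the query string, if any.
    $path = explode('?', $requestUri, 2)[0];

    // Normalize the base: "/test/" becomes "/test", "/" stays "/".
    $base = '/' . trim($basePath, '/');

    // Strip the base only when it matches a whole leading path segment.
    if ($base !== '/' && ($path === $base || strpos($path, $base . '/') === 0)) {
        $path = (string) substr($path, strlen($base));
    }

    return '/' . ltrim($path, '/');
}

echo appRelativePath('/test/images/good.gif?v=2', '/test/'), "\n";
```

With this in place, the rest of the application can be written as if it lived at the web root, and only the base path changes between deployments.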

If you are curious about the line containing “php_flag magic_quotes_gpc off” at the end of the .htaccess file, it turns off magic quoting, which is slated for deprecation in future versions of PHP (it was deprecated in PHP 5.3.0 and removed in PHP 5.4.0) and should be avoided as a best practice. You should deliberately protect your use of input data with input filtering rather than relying on magic quoting (or other forms of voodoo).
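As a small illustration of what intentional input filtering can look like, here is a sketch built on PHP’s filter extension (filter_var, available since PHP 5.2). The helper name readPositiveInt and the “page” parameter are assumptions made for this example, and the type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: reads a positive integer from request data and returns
// null when the value is missing or does not validate.
function readPositiveInt(array $input, string $key): ?int
{
    if (!isset($input[$key])) {
        return null;
    }

    // FILTER_VALIDATE_INT returns the integer on success and false on failure.
    $value = filter_var($input[$key], FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1],
    ]);

    return $value === false ? null : $value;
}

// Usage: fall back to page 1 when the parameter is absent or invalid.
$page = readPositiveInt($_GET, 'page') ?? 1;
```

The point is that each value is checked against what the application expects, rather than being blindly escaped on the way in.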

Lastly, before using .htaccess files, you should read the Apache Tutorial for .htaccess Files so that you understand the full impact of using them, and the alternative of including the same information in the main httpd.conf file. Don’t be too alarmed by the comments about performance, as many servers will hold the .htaccess file in the disk or OS cache, keeping performance high; you can always test both methods to find what works best in your environment. Another solid resource of .htaccess tips and tricks is also available, and it walks through many different uses.

Funnel in Place, Now What?

First things first. Test the funnel. To do this, I created the following files in my web directory:

/.htaccess

  RewriteEngine on
  RewriteBase /
  RewriteRule !\.(js|ico|txt|gif|jpg|png|css)$ index.php
  php_flag magic_quotes_gpc off

/index.php

  <?php
    echo 'Hello Baron';
  ?>

/bad.php

  <?php
    echo 'You cannot see me';
  ?>

/bad.html

  <html>
    <head>
    </head>
    <body>
      You cannot see me
    </body>
  </html>

/images/good.gif

Once those files have been created (or downloaded and unzipped) in your web or alias root, you can go to the following list of URLs to test the funnel (remember to update the rewrite rule if you are not running out of the web root).

Note that these steps assume you are running locally on the same machine and also from the root of the web server:

  1. http://localhost/ will show “Hello Baron”
  2. http://localhost/bad.html will show “Hello Baron” ignoring the bad.html file
  3. http://localhost/images/good.gif will show the “Allowed Image” graphic
  4. http://localhost/bad.php will show “Hello Baron” ignoring the bad.php file

If any of these fail, it is normally due to a setting in your Apache configuration that does not allow the .htaccess file to do its job at a per-directory level. See the documentation for AllowOverride. You can either adjust that configuration, or move the settings into the main httpd.conf file.

Now that you have the funnel working, we can talk more about the bootstrapper’s responsibilities.

Role of the Bootstrapper

If you review the diagram from PHP Best Practices: Creating a Blueprint for PHP Applications, you will note that there is a long list of issues to be dealt with under the heading of “Pages.” Many of the top-level items are touched on by the bootstrapper: settings, mapping, application state, filtering, resources, cookies, sessions, caching, messages, headers, and dispatching, to name a few.

You may ask, what do these have to do with the bootstrapper? Well, as the first script run for any page request, the bootstrapper can set up, configure, and gift wrap everything that all pages have in common. It can make sure that the include path is set up correctly, that the environment is configured correctly, that all PHP settings are set, that common model objects are loaded, and that the front controller is invoked to begin your MVC processing. If you set up the bootstrapper correctly, your application code will be more manageable and consistent, along with being easier to write in the first place.
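To give a feel for those responsibilities, here is a deliberately small sketch of what an index.php bootstrapper might look like. It is not the bootstrapper this series goes on to build: the “library” directory, the route table, and the function names resolveRoute and dispatch are all hypothetical, and the code assumes PHP 7 or later.

```php
<?php
// A minimal bootstrap sketch. The "library" directory, the route table, and
// the function names are hypothetical and not taken from the series.

// 1. Environment: report every error and make shared code includable.
error_reporting(E_ALL);
set_include_path(__DIR__ . '/library' . PATH_SEPARATOR . get_include_path());

// 2. Mapping: turn the funneled request URI into a route name.
function resolveRoute(string $requestUri): string
{
    $path = explode('?', $requestUri, 2)[0]; // ignore the query string
    $route = trim($path, '/');
    return $route === '' ? 'home' : $route;
}

// 3. Dispatch: hand the route to its handler, or fall back to a
//    not-found message when no handler is registered.
function dispatch(string $route, array $handlers): string
{
    if (!isset($handlers[$route])) {
        return 'Not found';
    }
    return $handlers[$route]();
}

$handlers = [
    'home' => function (): string {
        return 'Hello Baron';
    },
];

// REQUEST_URI is absent on the command line, so default to the root.
echo dispatch(resolveRoute($_SERVER['REQUEST_URI'] ?? '/'), $handlers);
```

A real bootstrapper would replace the handler array with a front controller and add sessions, caching, and configuration loading, but the order of work stays the same: prepare the environment, map the request, then dispatch.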

... Continued in Blueprint for PHP Applications: Bootstrapping – part 2
