Dynamically Creating Compressed Zip Archives With PHP

June 4, 2007


Ancient History

I remember the very first time I encountered a ZIP file. It was back in the early 1990s, when I was over at a friend’s house to celebrate his purchase of a new computer. The computer vendor had generously included a bunch of shareware games with the system, all of which were compressed using the PKZip format. It took us a while to figure out how to get them out; we eventually succeeded, but not without first getting a crash course (over the telephone!) in the basics of using the command-line PKZip tool.

The Internet has changed quite a bit since those early days, but the ZIP format has nevertheless remained a fairly constant presence in my digital existence (as I suspect it has for many other developers and even casual Web surfers). As a way of compressing files for archival, for transport over the Internet, or for storage on space-limited hardware, it remains an efficient, reliable, portable, easy to use and fairly popular format. It’s also well-supported in most operating systems and programming toolkits, either natively or through the use of add-on extensions.

PHP has included support for the ZIP format since PHP 4.x, but it was only recently, while idly browsing the PHP manual, that I realized that PHP 5.2.0 includes a re-engineered version of the ext/zip extension, one built on the libzip library (which itself depends on zlib). Bored and not a little intrigued, I decided to try it out. And over the next few pages, I’m going to tell you what I found.

Kicking The Tyres

In order to read and write ZIP files with PHP, your PHP build must include support for ext/zip. On UNIX, this is accomplished by adding the --enable-zip option to the configure script when building PHP. On Windows, a pre-built DLL named php_zip.dll is included with the Windows PHP distribution, and must be activated in the Windows php.ini configuration file. Read more about this at http://www.php.net/manual/en/ref.zip.php

The statements above assume that you’re using PHP 5.2.0 or better. However, if you’re using an older version of PHP, all is not lost – you can still download and use the ext/zip extension from the PECL repository, at http://pecl.php.net/package/zip

UNIX PHP users should also note that the PHP configure script expects the zlib libraries to be installed and available in a standard location, in order to enable ext/zip. If your system has these libraries in a non-standard location, or if the configure script can’t find them, you can explicitly point the script to these libraries with the additional --with-zlib-dir option.
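If you’d rather confirm from code that the extension made it into your build, a minimal sketch might look like the one below. The helper name zipSupportAvailable() is made up for this illustration; extension_loaded() and class_exists() are standard PHP functions.

```php
<?php
// Hypothetical helper (not part of ext/zip): reports whether the ZIP
// extension is loaded and the ZipArchive class is usable in this build.
function zipSupportAvailable() {
    return extension_loaded('zip') && class_exists('ZipArchive');
}

if (zipSupportAvailable()) {
    echo "ext/zip is available\n";
} else {
    echo "ext/zip is missing; rebuild PHP with --enable-zip or enable php_zip.dll\n";
}
```

A check like this is handy at the top of a script that is going to be deployed on servers you don’t control.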

Assuming you’ve got the extension installed and ready to use – a quick call to phpinfo() is always useful to verify this – let’s get things rolling with a simple example: reading an existing ZIP file with PHP, and printing a list of its contents:

<?PHP
// create object
$zip = new ZipArchive();   

// open archive 
if ($zip->open('app-0.09.zip') !== TRUE) {
    die ("Could not open archive");
}

// get number of files in archive
$numFiles = $zip->numFiles;

// iterate over file list
// print details of each file
for ($x=0; $x<$numFiles; $x++) {
    $file = $zip->statIndex($x);
    printf("%s (%d bytes)", $file['name'], $file['size']);
    print "\n";
}

// close archive
$zip->close();
?>

A quick word on terminology before I explain this script in detail. Individual files within the ZIP archive are referred to as entries. Each entry is referred to by a numeric index and, as you’ll shortly see, this index is required by most methods of the ZipArchive class to perform operations on the ZIP archive. For example, the statIndex() method used in the previous listing accepts an index as input parameter and returns detailed information on the file corresponding to that index, as an associative array.

The script begins by instantiating a ZipArchive object; this object serves as the entry point to all of ext/zip’s functions. Then, the object’s open() method is used to open the target ZIP archive for reading. If successful, this method returns Boolean true and populates various object properties with summary information about the ZIP archive; if unsuccessful, it returns an error code.

One such property is the numFiles property, which stores the number of files contained within the archive. It’s then easy to iterate over the file list using a for() loop, call statIndex() with the file index, and retrieve and display file information, such as the name and size, in a neat little list. Once this is done, the close() method is used to close the ZIP archive and otherwise clean things up.

Here’s a snippet of what the output looks like:

...
config/database.php (2670 bytes)
config/sql/db-setup.sql (26773 bytes)
scripts/m_a/m1.php (886 bytes)
scripts/m_a/m2.php (487 bytes)
scripts/m_b/m7.php (716 bytes)
webroot/css/admin_default.css (1458 bytes)
webroot/css/default.css (8591 bytes)
webroot/favicon.ico (4286 bytes)
webroot/img/add.gif (182 bytes)
webroot/img/arrow-next.gif (63 bytes)
webroot/img/arrow-prev.gif (63 bytes)
webroot/img/myprofile-tab.gif (259 bytes)
webroot/index.php (3111 bytes)
webroot/js/vendors.php (1369 bytes)
...
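Going back to the return value of open() for a moment: since a failure yields an integer error code rather than a message, it can be useful to translate that code into something readable. The sketch below does this with a hypothetical helper, describeZipOpenError(), which is not part of ext/zip; the ER_* constants it uses are defined by the ZipArchive class, and the descriptions are my own wording.

```php
<?php
// Hypothetical helper (not part of ext/zip): maps the integer error code
// returned by ZipArchive::open() to a short human-readable description.
function describeZipOpenError($code) {
    $messages = array(
        ZipArchive::ER_EXISTS => 'File already exists',
        ZipArchive::ER_INCONS => 'Archive is inconsistent',
        ZipArchive::ER_INVAL  => 'Invalid argument',
        ZipArchive::ER_MEMORY => 'Memory allocation failure',
        ZipArchive::ER_NOENT  => 'No such file',
        ZipArchive::ER_NOZIP  => 'Not a ZIP archive',
        ZipArchive::ER_OPEN   => 'Cannot open file',
        ZipArchive::ER_READ   => 'Read error',
        ZipArchive::ER_SEEK   => 'Seek error',
    );
    return isset($messages[$code]) ? $messages[$code] : "Unknown error ($code)";
}

// Same archive name as in the listing above; it may not exist on your system,
// in which case the error branch is taken.
$zip = new ZipArchive();
$result = $zip->open('app-0.09.zip');
if ($result !== TRUE) {
    echo "Could not open archive: " . describeZipOpenError($result) . "\n";
} else {
    echo "Archive opened; it contains " . $zip->numFiles . " entries\n";
    $zip->close();
}
```

Only a subset of the available error constants is listed here; anything else falls through to the "Unknown error" branch.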

Being Selective

To retrieve information on a particular file (rather than the whole shebang), simply call statIndex() with the index number of the corresponding entry. How do you get the index number? With the ZipArchive’s locateName() method, of course:

<?PHP
// create object
$zip = new ZipArchive();   

// open archive 
if ($zip->open('app-0.09.zip') !== TRUE) {
    die ("Could not open archive");
}

// find index of named file in archive
// print file details
$x = $zip->locateName('index.php', ZIPARCHIVE::FL_NODIR);
if ($x !== FALSE) {
    $file = $zip->statIndex($x);
    print_r($file);
}

// close archive
$zip->close();
?>

Here, the locateName() method accepts the name of the file, and scans the archive looking for matches; it returns the index of the first matching file. This index is then passed to the statIndex() method, which retrieves information on that file, including information on its name, compressed and uncompressed size, modification time and CRC. Take a look at the output:

Array
(
    [name] => webroot/index.php
    [index] => 91
    [crc] => -1611935277
    [size] => 3111
    [mtime] => 1176437912
    [comp_size] => 1296
    [comp_method] => 8
)

Wondering about the second parameter to locateName()? It’s a flag that tells the method to ignore the directory component when scanning the archive for a match. You can also use the flag ZIPARCHIVE::FL_NOCASE, which tells locateName() to ignore case when performing the name search.
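To see the effect of these flags side by side, here is a small self-contained sketch. It builds a throwaway archive in the system’s temporary directory (the archive name and its two entries are invented for the demonstration) and then looks up the same entry in several ways.

```php
<?php
// Build a small throwaway archive (name and contents are made up for this demo).
$archive = sys_get_temp_dir() . '/locate-demo-' . uniqid() . '.zip';
$zip = new ZipArchive();
if ($zip->open($archive, ZipArchive::CREATE) !== TRUE) {
    die("Could not create archive");
}
$zip->addFromString('webroot/index.php', '<?php // placeholder');
$zip->addFromString('webroot/css/default.css', 'body {}');
$zip->close();

// Re-open it and look up the same entry in different ways.
$zip = new ZipArchive();
if ($zip->open($archive) !== TRUE) {
    die("Could not open archive");
}
$exact  = $zip->locateName('webroot/index.php');                       // full path, exact case
$noDir  = $zip->locateName('index.php', ZipArchive::FL_NODIR);         // directory part ignored
$noCase = $zip->locateName('WEBROOT/INDEX.PHP', ZipArchive::FL_NOCASE); // case ignored
$miss   = $zip->locateName('index.php');                               // no flag, no directory
$zip->close();
unlink($archive);

var_dump($exact, $noDir, $noCase, $miss);
```

The first three lookups should all resolve to the same index, while the last one returns false, because without FL_NODIR the bare file name does not match the stored path.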

Decompression Chamber

The ZipArchive object also allows you to extract the contents of a compressed ZIP file with its extractTo() method, which accepts the target directory as argument. Consider the following example, which demonstrates by extracting the contents of the named archive to the temporary directory:

<?PHP
// create object
$zip = new ZipArchive();   

// open archive 
if ($zip->open('app-0.09.zip') !== TRUE) {
    die ("Could not open archive");
}

// extract contents to destination directory
$zip->extractTo('/tmp/extracted/');

// close archive
// print success message
$zip->close();    
echo "Archive extracted to directory";
?>

A quick look in /tmp/extracted after running the script above will show you that the files have been successfully extracted.

shell> ls -lR /tmp/extracted/
/tmp/extracted/:
total 2
drwxr-xr-x 3 nobody nobody 104 2007-05-22 15:25 config/
drwxr-xr-x 4 nobody nobody  96 2007-05-22 15:25 scripts/
drwxr-xr-x 5 nobody nobody 184 2007-05-22 15:25 webroot/

/tmp/extracted/config:
total 5
-rw-r--r-- 1 nobody nobody 2670 2007-05-21 21:44 database.php
drwxr-xr-x 2 nobody nobody  112 2007-05-22 15:25 sql/

/tmp/extracted/config/sql:
total 36
-rw-r--r-- 1 nobody nobody 26773 2007-05-14 21:25 db-setup.sql
-rw-r--r-- 1 nobody nobody  5384 2007-05-04 17:20 dummy.sql
...

Note that if the target directory does not exist, the extractTo() method will attempt to create it for you.

You can also selectively extract certain files from the source archive, by passing the extractTo() method an additional parameter: an array containing a list of the file names to be pulled out of the archive. Here’s an example:

<?PHP
// create object
$zip = new ZipArchive();   

// open archive 
if ($zip->open('app-0.09.zip') !== TRUE) {
    die ("Could not open archive");
}

// extract selected contents to destination directory
$fileList = array(
            'config/database.php', 
            'scripts/m_a/m1.php',
            'webroot/img/top-curve.gif',
);
$zip->extractTo('/tmp/extracted', $fileList);

// close archive
// print success message
$zip->close();    
echo "Selected files from archive extracted to directory";
?>

It’s important to note that when extracting selected files from an archive, you must specify the full path to each file as it is stored within the ZIP archive. Invalid paths will simply be ignored by the extractTo() method.
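If you’d rather not rely on how extractTo() treats names it can’t find, one cautious approach is to filter the list with locateName() first and to check extractTo()’s return value. The following is only a sketch under that assumption; the archive, its entries and the output directory are all created in the temporary directory for the purposes of the demonstration.

```php
<?php
// Throwaway archive and output directory (names are made up for this demo).
$base    = sys_get_temp_dir() . '/extract-demo-' . uniqid();
$archive = $base . '.zip';
$target  = $base . '-out';

$zip = new ZipArchive();
if ($zip->open($archive, ZipArchive::CREATE) !== TRUE) {
    die("Could not create archive");
}
$zip->addFromString('config/database.php', "<?php // settings\n");
$zip->addFromString('scripts/m_a/m1.php', "<?php // script\n");
$zip->close();

// Re-open the archive and keep only the requested names that really exist in it.
$zip = new ZipArchive();
if ($zip->open($archive) !== TRUE) {
    die("Could not open archive");
}
$wanted  = array('config/database.php', 'webroot/img/top-curve.gif');
$present = array();
foreach ($wanted as $name) {
    if ($zip->locateName($name) !== FALSE) {
        $present[] = $name;
    } else {
        echo "Skipping missing entry: $name\n";
    }
}

// Extract the filtered list and report the outcome.
$ok = $zip->extractTo($target, $present);
$zip->close();
echo $ok ? "Extracted " . count($present) . " file(s) to $target\n"
         : "Extraction failed\n";
```

Note that the demo leaves its archive and output directory behind in the temporary directory, so you can inspect them afterwards.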

Adding It All Up

Just as you can read and extract ZIP archives, so too can you create new ZIP archives from within a PHP script. The easiest way to do this is with the addFile() method, which accepts a file path and adds it to the archive. The following example demonstrates:

<?PHP
// create object
$zip = new ZipArchive();

// open archive 
if ($zip->open('my-archive.zip', ZIPARCHIVE::CREATE) !== TRUE) {
    die ("Could not open archive");
}

// list of files to add
$fileList = array(
    'fstab2.php',
    'images/branch.gif',
    'files/php1.php'
);

// add files
foreach ($fileList as $f) {
    $zip->addFile($f) or die ("ERROR: Could not add file: $f");   
}
    
// close and save archive
$zip->close();
echo "Archive created successfully.";    
?> 

The first thing you’ll notice here is the additional argument to the open() method, the ZIPARCHIVE::CREATE flag. This flag tells the open() method to create a new archive file, if one does not already exist with the same name. The addFile() method can be used to add files to it, in a loop. Once the loop has completed, the close() method is used to close and save the archive to disk.

Note that if an archive with the same name already exists, the open() and addFile() methods will simply append new files to it. If this behaviour is not what you want, replace the ZIPARCHIVE::CREATE flag with the ZIPARCHIVE::OVERWRITE flag, to force the call to open() to generate a fresh archive file.
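The difference between appending and starting afresh is easy to verify with a short experiment. The sketch below uses a throwaway archive in the temporary directory and adds entries with addFromString() so that it needs no input files. It combines the two flags as ZipArchive::CREATE | ZipArchive::OVERWRITE; that combination is my own choice here, made on the assumption that it behaves the same whether or not the archive already exists.

```php
<?php
$archive = sys_get_temp_dir() . '/flags-demo-' . uniqid() . '.zip';

// Run 1: the archive does not exist yet, so CREATE makes a new one.
$zip = new ZipArchive();
if ($zip->open($archive, ZipArchive::CREATE) !== TRUE) {
    die("Could not open archive");
}
$zip->addFromString('first.txt', 'one');
$zip->close();

// Run 2: CREATE on an existing archive opens it, and new entries are appended.
$zip = new ZipArchive();
if ($zip->open($archive, ZipArchive::CREATE) !== TRUE) {
    die("Could not open archive");
}
$zip->addFromString('second.txt', 'two');
$afterAppend = $zip->numFiles;
$zip->close();

// Run 3: adding OVERWRITE discards the existing contents first.
$zip = new ZipArchive();
if ($zip->open($archive, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== TRUE) {
    die("Could not open archive");
}
$zip->addFromString('third.txt', 'three');
$afterOverwrite = $zip->numFiles;
$zip->close();

printf("Entries after append: %d, after overwrite: %d\n", $afterAppend, $afterOverwrite);
unlink($archive);
```

The append step should leave two entries in the archive, and the overwrite step just one.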

Now, if you were to look inside the output ZIP archive at your console prompt, you should see the selected files:

shell> unzip -t my-archive.zip
Archive:  my-archive.zip
    testing: fstab2.php               OK
    testing: images/branch.gif        OK
    testing: files/php1.php           OK
No errors detected in compressed data of my-archive.zip.

Different Strokes

The addFile() method can end up being somewhat tedious when adding a large number of files to an archive, or when recursively compressing a directory tree. That’s why you can improve the previous listing by using some of PHP’s built-in tools for file and directory manipulation.

First up, the RecursiveDirectoryIterator, which provides an easy way to compress and archive a directory, complete with all its sub-directories:

<?PHP
// increase script timeout value
ini_set('max_execution_time', 300);

// create object
$zip = new ZipArchive();

// open archive 
if ($zip->open('my-archive.zip', ZIPARCHIVE::CREATE) !== TRUE) {
    die ("Could not open archive");
}

// initialize an iterator
// pass it the directory to be processed
$iterator = new RecursiveIteratorIterator(new RecursiveDirectoryIterator("app/"));

// iterate over the directory
// add each file found to the archive
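foreach ($iterator as $key=>$value) {
    // skip anything that is not a regular file
    // (newer PHP versions also return the "." and ".." entries)
    if (!$value->isFile()) {
        continue;
    }
    $zip->addFile(realpath($key), $key) or die ("ERROR: Could not add file: $key");
}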

// close and save archive
$zip->close();
echo "Archive created successfully.";    
?>

Here, a RecursiveDirectoryIterator is initialized with the app/ directory as the starting point; it’s then passed to a RecursiveIteratorIterator to ensure that it continues iterating for so long as a sub-directory exists in the descendant tree. On each iteration, the current file is added to the archive via the addFile() method.

Look inside the ZIP archive, and you’ll see something like this:

shell> unzip -t my-archive.zip
Archive: my-archive.zip
    testing: app/config/database.php   OK
    testing: app/config/sql/db-setup.sql   OK
    testing: app/config/sql/dummy.sql   OK
    testing: app/scripts/m_a/m1.php   OK
    testing: app/scripts/m_a/m2.php   OK
    testing: app/scripts/m_a/m3.php   OK
    testing: app/scripts/m_b/m4.php   OK
    testing: app/scripts/m_b/m5.php   OK
    testing: app/scripts/m_b/m6.php   OK
    testing: app/scripts/m_b/m7.php   OK
    testing: app/webroot/css/admin_default.css   OK
    testing: app/webroot/css/default.css   OK
    testing: app/webroot/favicon.ico   OK
    testing: app/webroot/img/add.gif   OK
    testing: app/webroot/img/arrow-next.gif   OK
   ...

Let me go off on a quick tangent here. Notice the second parameter to the addFile() method in the previous listing – this allows you to specify the path and name to the file, as it appears within the archive. This comes in handy if you need to exert more fine-grained control over the directory structure of the ZIP archive and its output.

What does this mean? Well, the second parameter to addFile() lets you control the directory structure inside the ZIP archive, and thereby determine the output when the archive is uncompressed. So, if you wanted (for example) all files within the archive to ultimately be extracted into a tools/0.57/ directory, you could revise the script above as follows:

<?PHP
// increase script timeout value
ini_set('max_execution_time', 300);

// create object
$zip = new ZipArchive();

// open output file for writing
if ($zip->open('my-archive.zip', ZIPARCHIVE::CREATE) !== TRUE) {
    die ("Could not open archive");
}

// initialize an iterator
// pass it the directory to be processed
$iterator  = new RecursiveIteratorIterator(new RecursiveDirectoryIterator("app/"));

// iterate over the directory
// add each file found to the archive
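foreach ($iterator as $key=>$value) {
    // skip anything that is not a regular file
    // (newer PHP versions also return the "." and ".." entries)
    if (!$value->isFile()) {
        continue;
    }
    $zip->addFile(realpath($key), 'tools/0.57/' . $key) or die ("ERROR: Could not add file: $key");
}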

// close and save archive
$zip->close();
echo "Archive created successfully.";    
?>

And now, when you inspect the ZIP archive, you’ll notice a different directory structure:

shell> unzip -t my-archive.zip
Archive:  my-archive.zip
    testing: tools/0.57/app/config/database.php   OK
    testing: tools/0.57/app/config/sql/db-setup.sql   OK
    testing: tools/0.57/app/config/sql/dummy.sql   OK
    testing: tools/0.57/app/scripts/m_a/m1.php   OK
    testing: tools/0.57/app/scripts/m_a/m2.php   OK
    testing: tools/0.57/app/scripts/m_a/m3.php   OK
    testing: tools/0.57/app/scripts/m_b/m4.php   OK
    testing: tools/0.57/app/scripts/m_b/m5.php   OK
    testing: tools/0.57/app/scripts/m_b/m6.php   OK
    testing: tools/0.57/app/scripts/m_b/m7.php   OK
    testing: tools/0.57/app/webroot/css/admin_default.css   OK
    testing: tools/0.57/app/webroot/css/default.css   OK
    testing: tools/0.57/app/webroot/favicon.ico   OK
    testing: tools/0.57/app/webroot/img/add.gif   OK
    testing: tools/0.57/app/webroot/img/arrow-next.gif   OK
    testing: tools/0.57/app/webroot/img/arrow-prev.gif   OK
    testing: tools/0.57/app/webroot/img/arrow-up.gif   OK
    testing: tools/0.57/app/webroot/img/attach-cv.gif   OK
    testing: tools/0.57/app/webroot/img/banner.gif   OK
    testing: tools/0.57/app/webroot/img/black-bullet.gif   OK
    testing: tools/0.57/app/webroot/img/blue-bg-arrow.png   OK
   ...
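While we’re on this tangent, the same idea can be wrapped in a reusable function that stores each file under a prefix plus its path relative to the source directory (so the source directory’s own name doesn’t end up inside the archive). This is a sketch only: addDirectoryToZip() is a hypothetical helper, the demo tree is created in the temporary directory, and the FilesystemIterator::SKIP_DOTS flag it relies on assumes PHP 5.3 or later.

```php
<?php
// Hypothetical helper: adds every regular file under $sourceDir to $zip,
// naming each entry $prefix plus the file's path relative to $sourceDir.
// Returns the number of files added.
function addDirectoryToZip(ZipArchive $zip, $sourceDir, $prefix = '') {
    $root = realpath($sourceDir);
    if ($root === false || !is_dir($root)) {
        throw new InvalidArgumentException("Not a directory: $sourceDir");
    }
    $root = rtrim($root, DIRECTORY_SEPARATOR);

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    $count = 0;
    foreach ($iterator as $file) {
        if (!$file->isFile()) {
            continue;
        }
        // path relative to the source directory, with forward slashes
        $relative  = substr($file->getPathname(), strlen($root) + 1);
        $localName = $prefix . str_replace(DIRECTORY_SEPARATOR, '/', $relative);
        if (!$zip->addFile($file->getPathname(), $localName)) {
            throw new RuntimeException("Could not add file: " . $file->getPathname());
        }
        $count++;
    }
    return $count;
}

// Usage: build a tiny throwaway tree and archive it under tools/0.57/
$src = sys_get_temp_dir() . '/tree-demo-' . uniqid();
mkdir($src . '/config/sql', 0777, true);
file_put_contents($src . '/config/database.php', "<?php // settings\n");
file_put_contents($src . '/config/sql/db-setup.sql', "-- schema\n");

$archive = $src . '.zip';
$zip = new ZipArchive();
if ($zip->open($archive, ZipArchive::CREATE) !== TRUE) {
    die("Could not open archive");
}
$added = addDirectoryToZip($zip, $src, 'tools/0.57/');
$zip->close();
echo "Added $added file(s) to $archive\n";
```

Keep in mind that the source files need to stay in place until close() has been called, which is why the demo does not remove its tree before closing the archive.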

Back to business. Another option is to use the glob() function in combination with addFile() to only compress files matching a certain pattern, as in the following example:

<?PHP
// create object
$zip = new ZipArchive();

// open archive 
if ($zip->open('my-archive.zip', ZIPARCHIVE::CREATE) !== TRUE) {
    die ("Could not open archive");
}

// add all .php files in directory to archive
foreach (glob ('*.php') as $f) {
    // store each file under its own name rather than its absolute path
    $zip->addFile(realpath($f), $f) or die ("ERROR: Could not add file: $f");
}

// close and save archive
$zip->close();
echo "Archive created successfully.";    
?> 

This listing generates a ZIP archive containing only files with the ‘.php’ extension from the current directory.

If you like, you can also add a file to a ZIP archive, by directly specifying its contents as a binary-safe string. The method to do this is the addFromString() method, and it’s illustrated in the next example:

<?PHP
// create object
$zip = new ZipArchive();

// open output file for writing
if ($zip->open('my-archive.zip', ZIPARCHIVE::CREATE) !== TRUE) {
    die ("Could not open archive");
}

// add file from disk
$zip->addFile('app/webroot/img/arrow-prev.gif', 'webroot/img/arrow-prev.gif') or die ("ERROR: Could not add file");        

// add text file as string
$str = "<?PHP die('Access denied'); ?>";
$zip->addFromString('webroot/index.php', $str) or die ("ERROR: Could not add file");        

// add binary file as string
$str = file_get_contents('app/webroot/img/arrow-next.gif');
$zip->addFromString('webroot/img/arrow-next.gif', $str) or die ("ERROR: Could not add file");        

// close and save archive
$zip->close();
echo "Archive created successfully.";    
?>

The addFromString() method accepts two arguments: the path and name of the destination file (as it should appear within the ZIP archive), and the content of the file. The example above demonstrates the method with two types of files: a simple text file containing PHP code, and a binary image file which is first read into a string using the binary-safe file_get_contents() function and then written to the archive with addFromString().
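Since addFromString() doesn’t need a file on disk, it’s a natural fit for content generated on the fly. Here’s a small sketch that packs a generated CSV report into an archive and then reads it back with the getFromName() method to confirm the round trip; the report data and the entry name reports/files.csv are invented for the example.

```php
<?php
$archive = sys_get_temp_dir() . '/report-demo-' . uniqid() . '.zip';

// Build some CSV content in memory (sample data, made up for this demo).
$rows = array(
    array('id', 'name'),
    array(1, 'add.gif'),
    array(2, 'arrow-next.gif'),
);
$csv = '';
foreach ($rows as $row) {
    $csv .= implode(',', $row) . "\n";
}

// Write the string straight into a new archive.
$zip = new ZipArchive();
if ($zip->open($archive, ZipArchive::CREATE) !== TRUE) {
    die("Could not open archive");
}
$zip->addFromString('reports/files.csv', $csv) or die ("ERROR: Could not add file");
$zip->close();

// Re-open the archive and read the entry back as a string.
$zip = new ZipArchive();
if ($zip->open($archive) !== TRUE) {
    die("Could not open archive");
}
$readBack = $zip->getFromName('reports/files.csv');
$zip->close();
unlink($archive);

echo $readBack;
```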

Power Tools

In addition to reading, creating and unpackaging ZIP archives, the ZipArchive class also comes with various utility methods to delete and rename files inside a ZIP archive, and to set or read a descriptive comment for the archive file. Let’s look at a few examples that demonstrate these methods in action.

First up, deleting files from within an archive. There are two ways to do this: with the deleteIndex() method, which accepts the file’s index, or with the deleteName() method, which accepts the file’s name. The following example demonstrates both:

<?PHP
// create object
$zip = new ZipArchive();

// open archive 
if ($zip->open('my-archive.zip') !== TRUE) {
    die ("Could not open archive");
}

// delete file from archive using index
$zip->deleteIndex(1) or die("ERROR: Could not delete file");

// delete file from archive using name
$zip->deleteName("m2.php") or die("ERROR: Could not delete file");

// close and save archive
$zip->close();
echo "File(s) successfully deleted from archive";
?>

You can also rename files within the ZIP archive with the renameIndex() and renameName() methods, which accept the file’s index and name respectively, together with the new file name. Here’s an example:

<?PHP
// create object
$zip = new ZipArchive();

// open archive 
if ($zip->open('my-archive.zip') !== TRUE) {
    die ("Could not open archive");
}

// rename file in archive using index
$zip->renameIndex(1, 'file1.txt') or die("ERROR: Could not rename file");

// rename file in archive using name
$zip->renameName("zip3.php", "zip_3.php") or die("ERROR: Could not rename file");

// close and save archive
$zip->close();
echo "File(s) successfully renamed in archive";
?>

Note that when using deleteName() or renameName(), you must provide the full path and name of the file within the ZIP archive, for the operation to be successful.
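If you only know a file’s base name, one way to obtain the full path these methods need is to combine locateName() (with the FL_NODIR flag) and getNameIndex(). The sketch below tries this on a throwaway archive whose entry names are borrowed from the earlier listings; treat it as one possible approach rather than the only one.

```php
<?php
// Build a throwaway archive with three entries.
$archive = sys_get_temp_dir() . '/edit-demo-' . uniqid() . '.zip';
$zip = new ZipArchive();
if ($zip->open($archive, ZipArchive::CREATE) !== TRUE) {
    die("Could not create archive");
}
foreach (array('scripts/m_a/m1.php', 'scripts/m_a/m2.php', 'zip3.php') as $name) {
    $zip->addFromString($name, "<?php // $name\n");
}
$zip->close();

// Re-open, resolve the full name of m2.php, then delete and rename.
$zip = new ZipArchive();
if ($zip->open($archive) !== TRUE) {
    die("Could not open archive");
}
$index = $zip->locateName('m2.php', ZipArchive::FL_NODIR);
if ($index !== FALSE) {
    $fullName = $zip->getNameIndex($index);   // the path as stored in the archive
    $zip->deleteName($fullName) or die("ERROR: Could not delete file");
}
$zip->renameName('zip3.php', 'zip_3.php') or die("ERROR: Could not rename file");
$zip->close();

// Re-open once more and list what is left.
$zip = new ZipArchive();
if ($zip->open($archive) !== TRUE) {
    die("Could not open archive");
}
$names = array();
for ($x = 0; $x < $zip->numFiles; $x++) {
    $names[] = $zip->getNameIndex($x);
}
$zip->close();
unlink($archive);

print_r($names);
```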

You can set a comment for a ZIP archive with the setArchiveComment() method:

<?PHP
// create object
$zip = new ZipArchive();

// open archive 
if ($zip->open('my-archive.zip') !== TRUE) {
    die ("Could not open archive");
}

// set archive comment
$zip->setArchiveComment("Yesterday's backup");

// close and save archive
$zip->close();
echo "Archive comment saved";
?>

Or retrieve a previously-assigned comment with the – yup, you guessed it! – getArchiveComment() method:

<?PHP
// create object
$zip = new ZipArchive();

// open archive 
if ($zip->open('my-archive.zip') !== TRUE) {
    die ("Could not open archive");
}

// get archive comment
echo $zip->getArchiveComment();

// close archive
$zip->close();
?>
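To try the two comment methods together without touching a real archive, here’s a quick round-trip sketch. It creates a throwaway archive in the temporary directory, with a single made-up entry so the archive isn’t empty, sets a comment, and reads it back after re-opening the file.

```php
<?php
$archive = sys_get_temp_dir() . '/comment-demo-' . uniqid() . '.zip';

// Create an archive with one entry and attach a comment to it.
$zip = new ZipArchive();
if ($zip->open($archive, ZipArchive::CREATE) !== TRUE) {
    die("Could not open archive");
}
$zip->addFromString('notes.txt', 'backup contents');
$zip->setArchiveComment("Yesterday's backup");
$zip->close();

// Re-open the archive and read the comment back.
$zip = new ZipArchive();
if ($zip->open($archive) !== TRUE) {
    die("Could not open archive");
}
$comment = $zip->getArchiveComment();
$zip->close();
unlink($archive);

echo "Archive comment: $comment\n";
```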

Z-Ray Vision

Now that you have a fair idea about ext/zip’s capabilities, let’s put what we’ve learned to the test, with a simple application. The script below accepts a ZIP archive for upload and prints its contents using the statIndex() method:


 
 
 
<html>
 <head></head>
 <body>
<?PHP
if (!isset($_POST['submit'])) {
?>
    <form action="/myform.php" method="POST" enctype="multipart/form-data">
     Select a file:
     <input type="file" name="file">
     <p>
     <input type="Submit" name="submit" value="Send File">
     </p>
    </form>
<?PHP
} else {
    
    // check to make sure this is a ZIP file   
    if ($_FILES['file']['type'] != "application/zip") {
        die("Unsupported file type!");
    }
    
    // add some more file security checks
    // eg: file size > 0
    
    // create object
    $zip = new ZipArchive();
    if (is_uploaded_file($_FILES['file']['tmp_name'])) {        
?>        
    <table border="1" cellspacing="5" cellpadding="5">
        <tr>
            <td><b>Filename</b></td>
            <td><b>Uncompressed size</b></td>
            <td><b>Compressed size</b></td>
            <td><b>Pack ratio</b></td>
            <td><b>Last Modified</b></td>
        </tr>
<?PHP
        $filename = $_FILES['file']['tmp_name'];
        
        // open uploaded file
        if ($zip->open($filename) !== TRUE) {
            die ("Could not open archive");
        }
        
        // get number of files
        $numFiles = $zip->numFiles;
        
        // iterate over file list
        // print details of each file
        for ($x=0; $x<$numFiles; $x++) {
            $file = $zip->statIndex($x);
            echo "<tr>";
            // escape the entry name, since it comes from the uploaded archive
            echo "<td>" . htmlspecialchars($file['name']) . "</td>";
            echo "<td>" . $file['size'] . "</td>";
            echo "<td>" . $file['comp_size'] . "</td>";
            if ($file['size'] > 0) { 
                echo "<td>" . sprintf("%3.2f", (($file['size'] - $file['comp_size']) / $file['size']) * 100)  . " %</td>";
            } else {
                echo "<td>-</td>";
            }
            echo "<td>" . date("d M Y H:i", $file['mtime']) . "</td>";
            echo "</tr>";
        }
        
        // close archive
        $zip->close();  
?>    
        </table>
<?PHP
    } else {
        die ('Invalid file!');
    }
}
?>

 </body>
</html>

The script above is divided into two main parts, separated from each other by an if() condition:

  1. The first part of the script checks if the form has been submitted and, if not, displays a file selection box which the user can use to select a file for upload. Note that since this POST transaction involves a file transfer, the encoding type of the form must be set to multipart/form-data.
  2. Once a file has been uploaded, the second half of the script examines the $_FILES array and checks the file type to make sure that it is, in fact, a ZIP archive. Assuming it is, the open() method is used to open the archive and the numFiles property is used, in combination with a loop and the statIndex() method, to display the contents of the archive in a neatly-formatted HTML table. Notice that the script uses information on each entry’s compressed and uncompressed size to calculate a “packing ratio” for each file in the archive.

Here’s an example of what the output looks like:

Note that the script above is illustrative only – allowing users to upload files to your Web application is an inherently dangerous process and one which opens up multiple security holes. If you plan to use this example in a live environment, you should beef up the security checks within the code to avoid malicious uploads.
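As a starting point for those extra checks, here is a hedged sketch of two helper functions. Both names, isReadableZip() and validateZipUpload(), are hypothetical, and the 5 MB size limit is an arbitrary assumption. The idea is to avoid trusting the browser-supplied MIME type on its own, and instead to confirm that ext/zip can really open the uploaded file. These checks are not exhaustive.

```php
<?php
// Hypothetical helper: returns true only if $path can actually be opened
// as a ZIP archive and contains at least one entry.
function isReadableZip($path) {
    if (!is_file($path) || filesize($path) === 0) {
        return false;
    }
    $zip = new ZipArchive();
    if ($zip->open($path) !== TRUE) {
        return false;
    }
    $hasEntries = $zip->numFiles > 0;
    $zip->close();
    return $hasEntries;
}

// Hypothetical helper: runs a few basic checks on one $_FILES entry.
// Returns null if the upload looks acceptable, or an error message otherwise.
// The default limit of 5 MB is an arbitrary choice for this sketch.
function validateZipUpload(array $upload, $maxBytes = 5242880) {
    if (!isset($upload['error']) || $upload['error'] !== UPLOAD_ERR_OK) {
        return 'Upload failed';
    }
    if ($upload['size'] <= 0 || $upload['size'] > $maxBytes) {
        return 'File is empty or too large';
    }
    if (!is_uploaded_file($upload['tmp_name'])) {
        return 'Invalid file';
    }
    if (!isReadableZip($upload['tmp_name'])) {
        return 'Not a valid ZIP archive';
    }
    return null;
}

// Usage inside the upload handler (the form field is named "file", as above)
if (isset($_FILES['file'])) {
    $error = validateZipUpload($_FILES['file']);
    if ($error !== null) {
        die($error);
    }
}
```

Real-world code would likely go further, for example by limiting the number of entries or the total uncompressed size before doing anything with the archive’s contents.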

And that’s about it for this tutorial. Over the last few pages, I took you on a whirlwind tour of the new ZIP extension in PHP 5.2.0, showing you the basics of reading, writing and extracting ZIP archives with it. In case you’d like to read more about the topics discussed in this article, you should consider bookmarking the following sites:

Until next time…happy archiving!

Copyright Melonfire 2007, all rights reserved

2 Responses to “Dynamically Creating Compressed Zip Archives With PHP”

  1. adamcole83 Says:

    The zip file is created with the ZipArchive class and being uncompressed in Windows or Mac.

  2. adamcole83 Says:

    When I try to unzip this file in Windows or Mac, it gives me an error saying Operation not permitted or Access denied. My code is exactly what you have presented here. How can I fix this?