PDF Generation Using Only PHP - Part 1
Intended Audience
Overview
Learning Objectives
Prerequisites
How It Works
- Setting up Class Variables
- The Factory Method
- Writing Content
- Starting the Document
- Adding a Page
- Output of Simple Text
- Closing the Document
- Document output
The Script
- The complete class
- Example Use
About the Author
Intended Audience
This tutorial is intended for the PHP programmer who needs to incorporate PDF generation in a script without using external libraries such as PDFlib (often unavailable due to licensing restrictions or lack of funds).
This tutorial covers only the basics, which should give you a good start. PDF has a vast set of features and possibilities which cannot be covered in a short tutorial. If you need more than what is covered here, you might want to look at some similar yet more complete solutions, such as the excellent work done by Olivier Plathey on the FPDF class (http://fpdf.org), on which this tutorial is based.
Of course, you may wish to take your own route, and for that there is also the PDF reference (be warned: it’s 1,172 pages!).
Basic familiarity with using PHP classes is assumed. Knowledge of PDF file structure is not required, as all references are explained.
Overview
PDF files are, after all, just plain text files with specific markup syntax that describes what should happen to objects within the document, such as text and images. It follows that, armed with some PDF logic, anyone can create a PDF file. In this tutorial you will be shown the basic features of the PDF language, to enable you to put together your own PDF document.
Learning Objectives
At the end of this first part of the tutorial you should be able to put together a simple PDF class that can:
- Create and output a PDF document;
- Set up page size and orientation;
- Insert simple text into the page;
- Handle simple font attributes;
- Activate compression.
Prerequisites
You need to have a fully functional PHP install (either PHP 4 or PHP 5 will work here) and a running web server to output the PDF file from your script.
Acrobat Reader, XPDF, or an equivalent is required to see the results of your work.
You do not need any external library, either separate or compiled into PHP, to generate your PDF files.
How It Works
The best approach is to set the code up as a class. This allows for greater flexibility later.
The primary (public) methods deal with the main operations on a PDF document: setting it up, adding pages, setting font, adding text, activating compression, and output of the document.
We shall review the various methods and features of the PDF language, and then eventually put it all together as one class.
Setting up Class Variables
We will need a few class variables to keep track of output, pages, objects, settings, etc.
The following is a list of the essential variables, with brief comments. You will later see each one of these variables in its context, which will give you a better idea of how they are used. For now just briefly get yourself acquainted with them.
var $_buffer = '';              // Buffer holding in-memory PDF.
var $_state = 0;                // Current document state.
var $_page = 0;                 // Current page number.
var $_n = 2;                    // Current object number.
var $_offsets = array();        // Array of object offsets.
var $_pages = array();          // Array containing the pages.
var $_w;                        // Page width in points.
var $_h;                        // Page height in points.
var $_fonts = array();          // An array of used fonts.
var $_font_family = '';         // Current font family.
var $_font_style = '';          // Current font style.
var $_current_font;             // Array with current font info.
var $_font_size = 12;           // Current font size in points.
var $_compress;                 // Flag to compress or not.
var $_core_fonts = array('courier'      => 'Courier',
                         'courierB'     => 'Courier-Bold',
                         'courierI'     => 'Courier-Oblique',
                         'courierBI'    => 'Courier-BoldOblique',
                         'helvetica'    => 'Helvetica',
                         'helveticaB'   => 'Helvetica-Bold',
                         'helveticaI'   => 'Helvetica-Oblique',
                         'helveticaBI'  => 'Helvetica-BoldOblique',
                         'times'        => 'Times-Roman',
                         'timesB'       => 'Times-Bold',
                         'timesI'       => 'Times-Italic',
                         'timesBI'      => 'Times-BoldItalic',
                         'symbol'       => 'Symbol',
                         'zapfdingbats' => 'ZapfDingbats');
The Factory Method
This method will give us the PDF object with which we can build our document. It sets the initial values for the document, such as page orientation and size, and returns the object.
function &factory($orientation = 'P', $format = 'A4')
{
    /* Create the PDF object. */
    $pdf = &new PDF();
    /* Page format, as array(width, height) in points. */
    $format = strtolower($format);
    if ($format == 'a3') {              // A3 page size.
        $format = array(841.89, 1190.55);
    } elseif ($format == 'a4') {        // A4 page size.
        $format = array(595.28, 841.89);
    } elseif ($format == 'a5') {        // A5 page size (148 x 210 mm).
        $format = array(419.53, 595.28);
    } elseif ($format == 'letter') {    // Letter page size.
        $format = array(612, 792);
    } elseif ($format == 'legal') {     // Legal page size.
        $format = array(612, 1008);
    } else {
        die(sprintf('Unknown page format: %s', $format));
    }
    $pdf->_w = $format[0];
    $pdf->_h = $format[1];
    /* Page orientation. */
    $orientation = strtolower($orientation);
    if ($orientation == 'l' || $orientation == 'landscape') {
        /* Landscape: swap width and height. */
        $w = $pdf->_w;
        $pdf->_w = $pdf->_h;
        $pdf->_h = $w;
    } elseif ($orientation != 'p' && $orientation != 'portrait') {
        die(sprintf('Incorrect orientation: %s', $orientation));
    }
    /* Turn on compression by default. */
    $pdf->setCompression(true);
    return $pdf;
}
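The page sizes in factory() are expressed in points, where one point is 1/72 of an inch. As a rough sketch of where numbers such as 595.28 come from, the snippet below converts millimetres to points. The mm_to_points() helper is hypothetical and not part of the tutorial class.

```php
<?php
// Hypothetical helper, not part of the tutorial class: convert a length
// in millimetres to PDF points (1 point = 1/72 inch, 1 inch = 25.4 mm).
function mm_to_points($mm)
{
    return round($mm * 72 / 25.4, 2);
}

// A4 is 210 x 297 mm.
printf("A4: %.2f x %.2f\n", mm_to_points(210), mm_to_points(297));

// Letter is 8.5 x 11 inches, so it comes out as whole points.
printf("Letter: %d x %d\n", 8.5 * 72, 11 * 72);
```

Running this prints "A4: 595.28 x 841.89" and "Letter: 612 x 792", matching the values used for those two formats.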
The actual setCompression() method is as follows:
function setCompression($compress)
{
    /* If no gzcompress function is available then default to
     * false. */
    $this->_compress = (function_exists('gzcompress') ? $compress : false);
}
Writing Content
We will not be writing directly to the PDF file; instead, the content is buffered as it is created. Only after the PDF document is closed, and after some rearranging, will it be sent as a PDF file to the browser for download. So, as a first step, we will need to create a function to buffer the output. As it will be used internally within the PDF class, let's make it a private function.
function _out($s)
{
    if ($this->_state == 2) {
        /* A page is open: append to that page's buffer. */
        $this->_pages[$this->_page] .= $s . "\n";
    } else {
        /* Otherwise append to the main document buffer. */
        $this->_buffer .= $s . "\n";
    }
}
The $_state variable keeps track of four different states that the PDF document can be in:
0 = initialised
1 = opened but no page opened
2 = page opened
3 = document closed
The state is important in this method for determining how to buffer the output. If there is an open page, output is sent to the $_pages array. For any other state it is sent to the main buffer held in the $_buffer variable.
This distinction is necessary because page content is handled as a separate object within PDF and hence will need extra work on it when it is finally written to the main buffer.
As you will later see, the $_state variable is used elsewhere to similarly add logic according to the document state.
It is recommended to use the newline (“\n”) following each output, as it is required in some cases (for example certain PDF instructions have to begin on a new line). Also, remember that PDF is case sensitive, so always follow the exact spelling of PDF syntax.
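To make the two-buffer idea concrete, here is a minimal stand-alone sketch. The BufferDemo class and its property names are made up for this illustration; it only mirrors the branching in _out() and is not the tutorial class itself.

```php
<?php
// Simplified stand-in for the tutorial class; only buffering is shown.
class BufferDemo
{
    public $buffer = '';        // Main document buffer.
    public $pages  = array();   // One string of content per page.
    public $page   = 0;         // Current page number.
    public $state  = 1;         // 1 = opened, 2 = page opened.

    function out($s)
    {
        if ($this->state == 2) {
            $this->pages[$this->page] .= $s . "\n";
        } else {
            $this->buffer .= $s . "\n";
        }
    }
}

$demo = new BufferDemo();
$demo->out('%PDF-1.3');                         // No page open: main buffer.

$demo->page = 1;                                // Simulate opening a page.
$demo->pages[1] = '';
$demo->state = 2;
$demo->out('BT 100.00 700.00 Td (Hi) Tj ET');   // Page open: page buffer.

echo 'Main buffer: ', $demo->buffer;
echo 'Page 1:      ', $demo->pages[1];
```

The header line ends up in the main buffer, while the text operator is held back in the page buffer until the document is closed.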
Starting the Document
The following two lines of code are required for initialising the document. They must be called before any other output:
function open()
{
    $this->_state = 1;          // Set state to opened (no page yet).
    $this->_out('%PDF-1.3');    // Output the PDF header.
}
This tutorial will not be covering anything exotic, so you might as well stick with version 1.3. If you do start incorporating the more advanced PDF features found in 1.5 you will need to change the version number.
Adding a Page
We can now add a page to our document. The following code is quite straightforward.
One point worth noting is the $_font_family check. For any text to be written to a page we need to set the font. However, we have to take into account the possibility that the font was set before any page was added, or that the font was set for a previous page in the current document. Either way we need to check the font class variable, and output the font information to the page. The function setFont() is used for this, which we shall cover later. Because setFont() returns early when the requested font is already the current one, the stored family is cleared first so that the font is always written to the new page.
function addPage()
{
    $this->_page++;                     // Increment page count.
    $this->_pages[$this->_page] = '';   // Start the page buffer.
    $this->_state = 2;                  // Set state to page opened.
    /* Check if font has been set before this page. */
    if ($this->_font_family) {
        /* Clear the stored family so that setFont() does not treat
         * the font as unchanged and skip writing it to this page. */
        $family = $this->_font_family;
        $this->_font_family = '';
        $this->setFont($family, $this->_font_style, $this->_font_size);
    }
}
Output of Simple Text
As mentioned earlier, before any text can be output, font information must be supplied. We therefore need a function to define which font will be used. PDF specifications offer a core set of fonts which can be used with no extra information supplied to the PDF reader. You can also embed your own custom fonts into a PDF file, but for this you need to create font definitions, which are beyond the scope of this tutorial.
For now, limit your output to the following fonts:
Courier, Courier-Bold, Courier-Oblique, Courier-BoldOblique;
Helvetica, Helvetica-Bold, Helvetica-Oblique, Helvetica-BoldOblique;
Times-Roman, Times-Bold, Times-Italic, Times-BoldItalic;
Symbol;
ZapfDingbats.
The following method sets the font family name, and also (optionally) a style such as bold, italic or both, and a font size.
function setFont($family, $style = '', $size = null)
{
    $family = strtolower($family);
    if ($family == 'arial') {                   // Use helvetica.
        $family = 'helvetica';
    } elseif ($family == 'symbol' ||            // No styles for
              $family == 'zapfdingbats') {      // these two fonts.
        $style = '';
    }
    $style = strtoupper($style);
    if ($style == 'IB') {                       // Accept any order
        $style = 'BI';                          // of B and I.
    }
    if (is_null($size)) {                       // No size specified,
        $size = $this->_font_size;              // use current size.
    }
    if ($this->_font_family == $family &&       // If font is already
        $this->_font_style == $style &&         // current font
        $this->_font_size == $size) {           // simply return.
        return;
    }
    /* Set the font key. */
    $fontkey = $family . $style;
    if (!isset($this->_fonts[$fontkey])) {      // Test if cached.
        $i = count($this->_fonts) + 1;          // Increment font
        $this->_fonts[$fontkey] = array(        // object count and
            'i'    => $i,                       // store cache.
            'name' => $this->_core_fonts[$fontkey]);
    }
    /* Store current font information. */
    $this->_font_family  = $family;
    $this->_font_style   = $style;
    $this->_font_size    = $size;
    $this->_current_font = $this->_fonts[$fontkey];
    /* Output font information if at least one page has been
     * defined. */
    if ($this->_page > 0) {
        $this->_out(sprintf('BT /F%d %.2f Tf ET',
                            $this->_current_font['i'],
                            $this->_font_size));
    }
}
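The way a family and style are combined into a key for the core font table can be tried out in isolation. The following is a small sketch only: font_key() is a hypothetical helper that copies the normalisation steps from setFont(), and the array is a subset of the $_core_fonts table.

```php
<?php
// A subset of the core font table from the class variables.
$core_fonts = array('helvetica'   => 'Helvetica',
                    'helveticaBI' => 'Helvetica-BoldOblique',
                    'timesI'      => 'Times-Italic');

// Hypothetical helper mirroring the normalisation steps in setFont().
function font_key($family, $style = '')
{
    $family = strtolower($family);
    if ($family == 'arial') {       // Arial is mapped to Helvetica.
        $family = 'helvetica';
    }
    $style = strtoupper($style);
    if ($style == 'IB') {           // Accept B and I in any order.
        $style = 'BI';
    }
    return $family . $style;
}

$key = font_key('Arial', 'ib');
echo $key, ' => ', $core_fonts[$key], "\n";
```

This prints "helveticaBI => Helvetica-BoldOblique". A key that is missing from the table would mean the requested font is not one of the core fonts, so a more defensive version might check it with isset() before use.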
The following method changes only the font size, and can be used as a shortcut to the setFont() function.
function setFontSize($size)
{
    if ($this->_font_size == $size) {   // If already current
        return;                         // size simply return.
    }
    $this->_font_size = $size;          // Set the font size.
    /* Output font information if at least one page has been
     * defined. */
    if ($this->_page > 0) {
        $this->_out(sprintf('BT /F%d %.2f Tf ET',
                            $this->_current_font['i'],
                            $this->_font_size));
    }
}
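The text() method writes the text itself. You will need to pass to this method the x/y position of your text, as well as the actual text.
function text($x, $y, $text)
{
    $text = $this->_escape($text);      // Escape any harmful
                                        // characters.
    $out = sprintf('BT %.2f %.2f Td (%s) Tj ET',
                   $x, $this->_h - $y, $text);
    $this->_out($out);
}
Note that PDF places the origin of its coordinate system at the bottom-left corner of the page. To let you measure y from the top of the page instead, the method subtracts it from the page height ($this->_h - $y).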
Also note how actual text needs to be escaped to ensure that it is safely inserted into the file. Since text in the PDF file is denoted using parentheses around it, any parentheses in the text itself should be escaped.
The best solution is to create a separate function to handle any cases when text needs to be inserted safely. This will be used a couple of times in this tutorial, but it will also be useful if you add more functionality to this class later.
function _escape($s)
{
    $s = str_replace('\\', '\\\\', $s);     // Escape any '\'
    $s = str_replace('(', '\\(', $s);       // Escape any '('
    return str_replace(')', '\\)', $s);     // Escape any ')'
}
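The escaping and the coordinate flip can be checked without the class. The sketch below uses two stand-alone functions; pdf_escape() copies the logic of _escape(), while pdf_text_op() is a hypothetical helper that takes the page height as a parameter instead of reading it from the object.

```php
<?php
// Stand-alone copy of the escaping logic, for experimentation.
function pdf_escape($s)
{
    $s = str_replace('\\', '\\\\', $s);     // Backslashes first.
    $s = str_replace('(', '\\(', $s);
    return str_replace(')', '\\)', $s);
}

// Hypothetical helper: build a text operator for a page of height $h,
// with $y measured from the top of the page.
function pdf_text_op($x, $y, $text, $h)
{
    return sprintf('BT %.2f %.2f Td (%s) Tj ET',
                   $x, $h - $y, pdf_escape($text));
}

// 100 points from the left and 100 points from the top of an A4 page.
echo pdf_text_op(100, 100, 'Total (net)', 841.89), "\n";
```

This prints BT 100.00 741.89 Td (Total \(net\)) Tj ET. Backslashes are escaped before parentheses so that the backslashes added for the parentheses are not escaped a second time.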
Closing the Document
The closing function is a bit more involved: we need to clean up a bit, set some PDF tags, and create a few references. This is the code that does most of the work in setting up the buffered content to finally look like a PDF file.
Begin by checking that there is at least one page, and setting the state to “page closed”.
function close()
{
    if ($this->_page == 0) {    // If no page has been added yet,
        $this->addPage();       // add one to make this a valid
    }                           // PDF.
    $this->_state = 1;          // Set the state to page closed.
    /* Pages and resources. */
    $this->_putPages();
    $this->_putResources();
Next comes the document information object, started with the _newobj() function. You could add other information to this section, such as author, subject, title, keywords, etc. For now we'll just put in the producer.
    /* Print some document info. */
    $this->_newobj();
    $this->_out('<<');
    $this->_out('/Producer (My First PDF Class)');
    $this->_out(sprintf('/CreationDate (D:%s)',
                        date('YmdHis')));
    $this->_out('>>');
    $this->_out('endobj');
    /* Print catalog. */
    $this->_newobj();
    $this->_out('<<');
    $this->_out('/Type /Catalog');
    $this->_out('/Pages 1 0 R');
    $this->_out('/OpenAction [3 0 R /FitH null]');
    $this->_out('/PageLayout /OneColumn');
    $this->_out('>>');
    $this->_out('endobj');
The cross-reference table follows, and this is where the $_offsets array that has appeared before comes in. PDF stores a byte offset reference to all objects in the document. This allows the PDF reader to read objects in a random access way, without having to load the entire document.
    /* Print cross reference. */
    $start_xref = strlen($this->_buffer);   // Get the xref offset.
    $this->_out('xref');                    // Announce the xref.
    $this->_out('0 ' . ($this->_n + 1));    // Number of objects.
    $this->_out('0000000000 65535 f ');
    /* Loop through all objects and output their offset. */
    for ($i = 1; $i <= $this->_n; $i++) {
        $this->_out(sprintf('%010d 00000 n ', $this->_offsets[$i]));
    }
The final lines to be printed are the PDF trailer.
    /* Print trailer. */
    $this->_out('trailer');
    $this->_out('<<');
    /* The total number of objects. */
    $this->_out('/Size ' . ($this->_n + 1));
    /* The root object. */
    $this->_out('/Root ' . $this->_n . ' 0 R');
    /* The document information object. */
    $this->_out('/Info ' . ($this->_n - 1) . ' 0 R');
    $this->_out('>>');
    $this->_out('startxref');
    $this->_out($start_xref);   // Where to find the xref.
    $this->_out('%%EOF');
    $this->_state = 3;          // Set the document state to
                                // closed.
}
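Each line of the cross-reference table written in close() has a fixed layout: a 10-digit byte offset, a 5-digit generation number, and the letter n (in use) or f (free). Together with the line ending, every entry is exactly 20 bytes long, which is why the format string ends in a space before the newline is appended. The sketch below illustrates this; xref_entry() is a hypothetical helper and the offsets are made up.

```php
<?php
// Hypothetical helper: format one in-use cross-reference entry,
// including the newline that _out() would normally append.
function xref_entry($offset)
{
    return sprintf('%010d 00000 n ', $offset) . "\n";
}

// Example offsets, made up for illustration.
$offsets = array(1 => 312, 2 => 401, 3 => 9);

foreach ($offsets as $num => $offset) {
    $entry = xref_entry($offset);
    echo 'object ', $num, ' (', strlen($entry), ' bytes): ', $entry;
}
```

Every entry reports a length of 20 bytes, regardless of how large the offset is.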
The _newobj() function above is used simply to keep track of objects added to the document.
function _newobj()
{
    /* Increment the object count. */
    $this->_n++;
    /* Save the byte offset of this object. */
    $this->_offsets[$this->_n] = strlen($this->_buffer);
    /* Output to buffer. */
    $this->_out($this->_n . ' 0 obj');
}
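To see how object numbers, byte offsets, the cross-reference table and the trailer fit together, it can help to assemble a tiny document by hand. The following is a sketch only: add_object() is a hypothetical helper, and the object numbering is simplified, so it differs from the numbering used by the class in this tutorial.

```php
<?php
// Build a one-page PDF in a string, recording each object's offset.
// All names here are illustrative and not part of the tutorial class.
$buffer  = "%PDF-1.3\n";
$offsets = array();

function add_object(&$buffer, &$offsets, $num, $body)
{
    $offsets[$num] = strlen($buffer);       // Byte offset of the object.
    $buffer .= $num . " 0 obj\n" . $body . "\nendobj\n";
}

$content = 'BT /F1 24.00 Tf 100.00 700.00 Td (Hello) Tj ET';

add_object($buffer, $offsets, 1, '<</Type /Catalog /Pages 2 0 R>>');
add_object($buffer, $offsets, 2,
           '<</Type /Pages /Kids [3 0 R] /Count 1 ' .
           '/MediaBox [0 0 595.28 841.89]>>');
add_object($buffer, $offsets, 3,
           '<</Type /Page /Parent 2 0 R ' .
           '/Resources <</Font <</F1 5 0 R>>>> /Contents 4 0 R>>');
add_object($buffer, $offsets, 4,
           '<</Length ' . strlen($content) . ">>\n" .
           "stream\n" . $content . "\nendstream");
add_object($buffer, $offsets, 5,
           '<</Type /Font /Subtype /Type1 /BaseFont /Helvetica>>');

/* Cross-reference table: one free entry plus five objects. */
$start_xref = strlen($buffer);
$buffer .= "xref\n0 6\n0000000000 65535 f \n";
for ($i = 1; $i <= 5; $i++) {
    $buffer .= sprintf("%010d 00000 n \n", $offsets[$i]);
}

/* Trailer, pointing at the catalog and at the xref table. */
$buffer .= "trailer\n<</Size 6 /Root 1 0 R>>\n";
$buffer .= "startxref\n" . $start_xref . "\n%%EOF\n";

echo strlen($buffer), " bytes generated\n";
// file_put_contents('hello.pdf', $buffer);   // Uncomment to view it.
```

The important detail is that each offset is taken with strlen() immediately before the object is appended, exactly as _newobj() does.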
The _putPages() function handles the output of the page content. Here we go through the $_pages array that has been buffering the page content separately, and output it to the main buffer.
If compression is required, page content will be passed through the gzcompress() function before being written to output. Here you can also see why the $_n object counter starts from 2. We set the root of the page tree (the parent of every page) as object number 1, and later you will see that we set resources as object number 2. This is just so that it is easier for us to reference these when required, for example in each page object.
function _putPages()
{
    /* If compression is required set the compression tag. */
    $filter = ($this->_compress) ? '/Filter /FlateDecode ' : '';
    /* Print out pages, loop through each. */
    for ($n = 1; $n <= $this->_page; $n++) {
        $this->_newobj();                   // Start a new object.
        $this->_out('<</Type /Page');       // Object type.
        $this->_out('/Parent 1 0 R');
        $this->_out('/Resources 2 0 R');
        $this->_out('/Contents ' . ($this->_n + 1) . ' 0 R>>');
        $this->_out('endobj');
        /* If compression required gzcompress() the page content. */
        $p = ($this->_compress) ? gzcompress($this->_pages[$n]) : $this->_pages[$n];
        /* Output the page content. */
        $this->_newobj();                   // Start a new object.
        $this->_out('<<' . $filter . '/Length ' . strlen($p) . '>>');
        $this->_putStream($p);              // Output the page.
        $this->_out('endobj');
    }
    /* Set the offset of the first object. */
    $this->_offsets[1] = strlen($this->_buffer);
    $this->_out('1 0 obj');
    $this->_out('<</Type /Pages');
    $kids = '/Kids [';
    for ($i = 0; $i < $this->_page; $i++) {
        $kids .= (3 + 2 * $i) . ' 0 R ';
    }
    $this->_out($kids . ']');
    $this->_out('/Count ' . $this->_page);
    /* Output the page size. */
    $this->_out(sprintf('/MediaBox [0 0 %.2f %.2f]',
                        $this->_w, $this->_h));
    $this->_out('>>');
    $this->_out('endobj');
}
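The expression 3 + 2 * $i in the /Kids loop follows from the numbering described above: objects 1 and 2 are reserved, and every page uses two objects (the page itself and its content stream), so page objects land on 3, 5, 7 and so on. A quick sketch, using a hypothetical page_kids() helper:

```php
<?php
// Hypothetical helper: list the page object references for a document
// with the given number of pages, as _putPages() would write them.
function page_kids($page_count)
{
    $kids = '/Kids [';
    for ($i = 0; $i < $page_count; $i++) {
        $kids .= (3 + 2 * $i) . ' 0 R ';
    }
    return $kids . ']';
}

echo page_kids(3), "\n";    // /Kids [3 0 R 5 0 R 7 0 R ]
```

The content streams of those three pages would be objects 4, 6 and 8.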
The stream data itself is written by a small helper method, _putStream(). We could have included the code in the actual _putPages() function; however, since this method is required for other objects (such as images), we might as well separate it out now.
function _putStream($s)
{
    $this->_out('stream');
    $this->_out($s);
    $this->_out('endstream');
}
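The effect of compression on a stream and its dictionary can be tried separately. The sketch below assumes the zlib extension may or may not be present and falls back to uncompressed data, in the same spirit as setCompression(). The sample content is made up for the illustration.

```php
<?php
// Made-up page content: the same text operator repeated 50 times.
$content = str_repeat("BT 100.00 700.00 Td (Hello) Tj ET\n", 50);

if (function_exists('gzcompress')) {
    $data   = gzcompress($content);         // zlib (deflate) data.
    $filter = '/Filter /FlateDecode ';
} else {
    $data   = $content;                     // No zlib: leave as is.
    $filter = '';
}

/* /Length must describe the bytes actually written to the stream,
 * that is, the compressed size when a filter is used. */
$dict = '<<' . $filter . '/Length ' . strlen($data) . '>>';

echo $dict, "\n";
printf("%d bytes of content, %d bytes in the stream\n",
       strlen($content), strlen($data));
```

Note that the length is taken after compression, which is also the order used in _putPages().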
Next, the _putResources() method writes the resources object, which is where the fonts used in the document are referenced.
function _putResources()
{
    /* Output any fonts. */
    $this->_putFonts();
    /* Resources are always object number 2. */
    $this->_offsets[2] = strlen($this->_buffer);
    $this->_out('2 0 obj');
    $this->_out('<</ProcSet [/PDF /Text]');
    $this->_out('/Font <<');
    foreach ($this->_fonts as $font) {
        $this->_out('/F' . $font['i'] . ' ' . $font['n'] . ' 0 R');
    }
    $this->_out('>>');
    $this->_out('>>');
    $this->_out('endobj');
}
The _putFonts() method, called from the _putResources() method, includes any font names into the PDF file. As we are only covering core fonts in this tutorial, nothing more than listing the font names is done here.
function _putFonts()
{
    /* Print out font details. */
    foreach ($this->_fonts as $k => $font) {
        $this->_newobj();
        $this->_fonts[$k]['n'] = $this->_n;     // Remember object number.
        $name = $font['name'];
        $this->_out('<</Type /Font');
        $this->_out('/BaseFont /' . $name);
        $this->_out('/Subtype /Type1');
        if ($name != 'Symbol' && $name != 'ZapfDingbats') {
            $this->_out('/Encoding /WinAnsiEncoding');
        }
        $this->_out('>>');
        $this->_out('endobj');
    }
}
Document output
The following function, the actual output of the document, does nothing more than make sure the document is closed, send a few headers according to browser type, and echo the buffered data.
function output($filename)
{
    if ($this->_state < 3) {    // If document not yet closed
        $this->close();         // close it now.
    }
    /* Make sure no content already sent. */
    if (headers_sent()) {
        die('Unable to send PDF file, some data has already been output to browser.');
    }
    /* Offer file for download and do some browser checks
     * for correct download. */
    $agent = isset($_SERVER['HTTP_USER_AGENT']) ? trim($_SERVER['HTTP_USER_AGENT']) : '';
    if ((preg_match('|MSIE ([0-9.]+)|', $agent, $version)) ||
        (preg_match('|Internet Explorer/([0-9.]+)|', $agent, $version))) {
        header('Content-Type: application/x-msdownload');
        header('Content-Length: ' . strlen($this->_buffer));
        /* $version[1] holds the captured version number. */
        if ($version[1] == '5.5') {
            header('Content-Disposition: filename="' . $filename . '"');
        } else {
            header('Content-Disposition: attachment; filename="' . $filename . '"');
        }
    } else {
        header('Content-Type: application/pdf');
        header('Content-Length: ' . strlen($this->_buffer));
        header('Content-Disposition: attachment; filename="' . $filename . '"');
    }
    echo $this->_buffer;
}
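When checking the browser version, keep in mind that preg_match() fills its third argument with an array: element 0 is the whole match and element 1 is the first captured group. The sketch below isolates that step. The msie_version() helper is hypothetical and the user agent string is only a sample.

```php
<?php
// Hypothetical helper: extract the Internet Explorer version number
// from a user agent string, or return null if it is not present.
function msie_version($agent)
{
    if (preg_match('|MSIE ([0-9.]+)|', $agent, $matches)) {
        return $matches[1];     // Element 0 holds the whole match.
    }
    return null;
}

// Sample user agent string, used only for illustration.
$agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)';

var_dump(msie_version($agent) == '5.5');    // bool(true)
```

Comparing the whole array against '5.5' would never be true, so the version test has to use the captured element.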
The Script
The complete class
You can download the entire class for use with Part 1 of this tutorial.
Example Use
<?php
require 'PDF.php';                      // Require the lib.
$pdf = &PDF::factory('p', 'a4');        // Set up the pdf object.
$pdf->open();                           // Start the document.
$pdf->setCompression(true);             // Activate compression.
$pdf->addPage();                        // Start a page.
$pdf->setFont('Courier', '', 8);        // Set font to Courier 8 pt.
$pdf->text(100, 100, 'First page');     // Text at x=100 and y=100.
$pdf->setFontSize(20);                  // Set font size to 20 pt.
$pdf->text(100, 200, 'HELLO WORLD!');   // Text at x=100 and y=200.
$pdf->addPage();                        // Add a new page.
$pdf->setFont('Arial', 'BI', 12);       // Set font to Arial bold italic 12 pt.
$pdf->text(100, 100, 'Second page');    // Text at x=100 and y=100.
$pdf->output('foo.pdf');                // Output the file named foo.pdf.
?>
About the Author
Marko Djukic works and lives in Florence, Italy, running his own company http://oblo.com with the goal of bringing innovative Open Source solutions to local government and SMEs. He is also a core developer for the Horde Project (http://horde.org).
Marko can be reached directly at marko@oblo.com

Comments
if ($this->_line_width != 1) {
$this->_out($this->_line_width);
}
However, _line_width is just an integer, and won't output correctly
ex:
.8
instead of:
.80 w
I changed mine to:
if ($this->_line_width != 1) {
$this->setLineWidth($this->_line_width);
}
since setLineWidth uses the _out function, outputting the line width correctly