Creating Multi-Language Web Applications with Zend_Translate

Lingua Franca

Most of today’s popular applications, including Google, Facebook and Flickr, are available in multiple languages. This type of localization is a key component of any global strategy: by ensuring that users across the world are able to browse and use an application in their native language, a company is able to attract a wider user base and ensure a consistent experience for all.

If you’re a Web developer building an application for global consumption, it’s important for you to build in a framework for multi-language support right from the start. Fortunately, there are a number of ready-made components that can help with this task. This article will introduce you to one such component, Zend_Translate, and demonstrate how you can use it to add multi-language support to your PHP application.

Good Morning, France!

Zend_Translate is part of the Zend Framework. It provides an API for managing strings in different languages, and for retrieving and interpolating those strings into output at run-time. It can read language strings from different data sources, including PHP arrays, CSV files and XML documents. It’s capable of detecting the user’s default language from the locale, and it can also automatically detect and import translation source files from standard directory structures.

To get started using Zend_Translate, you need to first download and install the Zend Framework. Installation is typically as simple as uncompressing the distribution archive and adding the location of the resulting Zend/ directory to the PHP include path. In case you have problems, take a look at the installation instructions in the online manual.

Assuming you’ve got it all set up, let’s take it out for a spin:
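A minimal sketch of such a script follows. It assumes Zend Framework 1.x is on the PHP include path; the translation keys and French strings are illustrative placeholders rather than part of any real application.

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Translation source: human-readable keys mapped to French strings.
// These keys and strings are placeholders chosen for illustration.
$french = array(
    'home'     => 'Accueil',
    'products' => 'Produits',
    'help'     => 'Aide',
);

// Arguments: adapter name, translation source, locale.
$translate = new Zend_Translate('array', $french, 'fr_FR');

// Look up the French string for a key and interpolate it into the output.
echo $translate->translate('home', 'fr_FR');
```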

This is a very simple example that serves to illustrate some basic concepts when using Zend_Translate. It begins by setting up the Zend auto-loader, which takes care of automatically loading Zend Framework components as needed. It also sets up a translation source, which in this example consists of a PHP array with human-readable keys mapped to the equivalent translations in a local language – in this case, French.

The script then initializes a Zend_Translate object and passes it three arguments: the name of the translation adapter to use, the translation source, and the locale to which the translation applies. Zend_Translate offers adapters for nine different formats, including CSV, INI, XLIFF, TBX, TMX, QT and XMLTM, and also supports PHP arrays and Gettext binary files. You’ll find a complete list of adapters in the Zend_Translate chapter of the Zend Framework manual.

Once translation sources have been mapped to locales, you can use the Zend_Translate API to retrieve the language-specific translation for a key, simply by calling the object’s translate() method with the translation key and locale as arguments. Zend_Translate will use these arguments to retrieve the translated string, and it can then be interpolated into your output, as shown in the previous example.

The Local Advantage

You can add multiple translation sources to Zend_Translate by specifying them with the addTranslation() method. Here’s an example, which adds a German source to the adapter in addition to the French one shown previously:
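The sketch below shows one way this might look, using the same array adapter. The keys, the strings and the 'de_DE' locale are illustrative assumptions.

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Placeholder translation sources for French and German.
$french = array('home' => 'Accueil',    'help' => 'Aide');
$german = array('home' => 'Startseite', 'help' => 'Hilfe');

// Start with the French source...
$translate = new Zend_Translate('array', $french, 'fr_FR');

// ...then attach the German source to the same object.
$translate->addTranslation($german, 'de_DE');

// Each call names the locale it wants.
echo $translate->translate('home', 'fr_FR'), "\n";
echo $translate->translate('home', 'de_DE'), "\n";
```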

You can also set a default locale for translations with the setLocale() method. This ensures that if translate() is invoked without the second locale argument, the correct default locale is used. Here’s an example:
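A short sketch, again with placeholder keys and strings, of how setLocale() could be combined with translate() calls that omit the locale argument:

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Placeholder translation sources.
$french = array('home' => 'Accueil',    'help' => 'Aide');
$german = array('home' => 'Startseite', 'help' => 'Hilfe');

$translate = new Zend_Translate('array', $french, 'fr_FR');
$translate->addTranslation($german, 'de_DE');

// Make German the default locale for subsequent lookups.
$translate->setLocale('de_DE');

// No second argument: the default locale set above is used.
echo $translate->translate('help'), "\n";

// An explicit locale still overrides the default for a single call.
echo $translate->translate('help', 'fr_FR'), "\n";
```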

Rank And File

The previous examples have all used PHP arrays to hold the translation keys and their equivalent local-language strings, and these arrays have been defined within the PHP script itself. In reality, a Web application may have many hundreds of such strings to be translated, so defining the translation source within the script itself isn’t very feasible (or maintainable).

With this in mind, you can extract your translation strings into separate files, and specify file names, rather than variables, when adding each translation to the adapter. Here’s an example of what a translation file might look like:
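For the array adapter, one plausible layout is a PHP file that simply returns the translation array. The file name lang/fr.php and the keys shown are hypothetical.

```php
<?php
// lang/fr.php (hypothetical file name)
// French translation strings, returned as an array so that the
// array adapter can load them when given this file's path.
return array(
    'home'     => 'Accueil',
    'products' => 'Produits',
    'help'     => 'Aide',
);
```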

And here’s how you’d use it in a script:
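A sketch of such a script, assuming two hypothetical files, lang/fr.php and lang/de.php, that sit next to the script and each return an array of translations:

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Hypothetical location of the translation files.
$langDir = dirname(__FILE__) . '/lang';

// Pass file names, rather than array variables, as the translation source.
$translate = new Zend_Translate('array', $langDir . '/fr.php', 'fr_FR');
$translate->addTranslation($langDir . '/de.php', 'de_DE');

// Print the French version of each (placeholder) key.
$translate->setLocale('fr_FR');
foreach (array('home', 'products', 'help') as $key) {
    echo $translate->translate($key), "\n";
}
```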

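When the script runs, each key passed to translate() appears in the output as its equivalent string from the translation file for the selected locale.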

Remember that you should use a UTF-8-capable editor when saving files containing language strings, to avoid data corruption. Notepad++ (Windows) and Vim (Linux) are two good open-source options.

Intelligent Automation

If you have a lot of translation sources, importing them one by one with addTranslation() can be a tedious task; it might also create maintenance problems if you later change a file name or location. Zend_Translate comes with a very useful feature to help with this problem: it can automatically scan a directory tree, and read and import all translation source files stored in that tree. It will also automatically map each file to its correct locale, assuming that the locale identifier appears in either the directory name or the file name.

An example will help to make this clearer. Consider a directory tree that has a separate subdirectory for each locale. Within each subdirectory is a file containing translation strings for that locale. Here’s an example of what this structure might look like:
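One possible layout is sketched below. The directory name languages/ and the file name menu.php are made up for illustration; only the locale-named subdirectories matter.

```text
languages/
    de/
        menu.php
    en/
        menu.php
    fr/
        menu.php
```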

This approach is convenient, because it produces a separate directory for each locale or language. It also makes it possible to add a second level of organization to translation files (for example, by having a separate translation file for each module of the application).

Since the locale information for each file is embedded in its directory name, Zend_Translate can automatically import all the translation strings, without any manual intervention. This is accomplished by specifying the source directory in the Zend_Translate constructor, together with the special 'scan' option, as shown in the example below:
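A sketch of the constructor call, assuming a hypothetical languages/ directory next to the script with one subdirectory per locale (en/, fr/, de/), each containing a PHP file that returns a translation array:

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Hypothetical root of the translation tree.
$root = dirname(__FILE__) . '/languages';

// Pass the directory instead of a single file, and ask Zend_Translate
// to take each file's locale from the name of its parent directory.
$translate = new Zend_Translate(
    'array',
    $root,
    null,
    array('scan' => Zend_Translate::LOCALE_DIRECTORY)
);

// Choose one of the imported locales and look up a placeholder key.
$translate->setLocale('fr');
echo $translate->translate('home');
```

Because the locale argument is null here, the initial locale is detected automatically, so a notice may appear if the detected locale has no translation source.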

In this example, the 'scan' option tells Zend_Translate to use the directory name as the locale identifier (Zend_Translate::LOCALE_DIRECTORY) when adding new translation sources.

An alternative approach is to embed the locale information within the file name itself, rather than within the parent directory name. If you wish to store all translation files within the same directory, rather than in separate sub-directories, this is the approach you should use. Here’s what you might end up with:
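A flat layout of this kind might look like the sketch below. The directory name is hypothetical, and each file is named after its locale.

```text
languages/
    de.php
    en.php
    fr.php
```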

In this case, you need to instruct Zend_Translate to read the locale identifier from the filename (Zend_Translate::LOCALE_FILENAME), as shown below:
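A sketch of this variant, assuming a hypothetical languages/ directory that holds en.php, fr.php and de.php side by side:

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Hypothetical directory holding one translation file per locale.
$root = dirname(__FILE__) . '/languages';

// This time the locale is read from each file name (fr.php -> 'fr').
$translate = new Zend_Translate(
    'array',
    $root,
    null,
    array('scan' => Zend_Translate::LOCALE_FILENAME)
);

$translate->setLocale('fr');
echo $translate->translate('home');
```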

Both of the examples above produce the same output, since the same translation strings are loaded in each case.

Speaking GNU

As discussed earlier, Zend_Translate comes with adapters for a wide array of different file formats, so you’re not restricted to encoding your translated strings only as PHP arrays. A common alternative is GNU gettext, which provides a standard library for creating multi-lingual applications. Zend_Translate offers the Zend_Translate_Adapter_Gettext implementation, which makes it possible to read and use gettext files within your PHP application.

To create gettext translation files under UNIX, you must have GNU gettext installed on your system. Most UNIX-based distributions will come with this pre-installed; in case yours doesn’t, it should be fairly easy to download and install it using your distribution’s package manager. You can also download and compile the source code from the GNU gettext Web page.

If you’re using Windows or Mac OS X (or if you simply don’t like the UNIX command line), you can install the free, cross-platform Poedit tool, which provides a graphical interface for creating gettext files on Windows, Linux and Mac OS X. Once you’ve got either gettext or Poedit installed, the documentation for each tool explains the process of creating translation source files (.PO files) and then converting them to binary files (.MO files).

Here’s an example of what a completed translation source file (.PO file) for a language looks like:
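The fragment below is a hand-written sketch of a small French .PO file. The header values and message strings are placeholders; a file generated by Poedit or the gettext tools will usually carry more header fields.

```text
# French translations (illustrative sample)
msgid ""
msgstr ""
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "home"
msgstr "Accueil"

msgid "products"
msgstr "Produits"

msgid "help"
msgstr "Aide"
```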

You can now use Poedit or msgfmt to generate binary files (.MO files) from these translation sources. Once these binary files are prepared, arrange them in the standard gettext directory structure, which looks like this:
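As a sketch, with a hypothetical root directory named locale/ and a hypothetical text domain named messages, the conventional layout places each binary file under a locale directory and an LC_MESSAGES subdirectory:

```text
locale/
    de/
        LC_MESSAGES/
            messages.mo
    fr/
        LC_MESSAGES/
            messages.mo
```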

All that’s left now is to tell Zend_Translate to use its Zend_Translate_Adapter_Gettext implementation, and point it to the root directory of the translation sources:
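A sketch of that call, assuming a hypothetical locale/ directory next to the script that contains fr/LC_MESSAGES/messages.mo and de/LC_MESSAGES/messages.mo:

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Hypothetical root of the gettext directory structure.
$root = dirname(__FILE__) . '/locale';

// 'gettext' selects the gettext adapter; the 'scan' option reads each
// locale from the directory names under the root.
$translate = new Zend_Translate(
    'gettext',
    $root,
    null,
    array('scan' => Zend_Translate::LOCALE_DIRECTORY)
);

$translate->setLocale('fr');
echo $translate->translate('home');
```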

X Marks The Spot

If you prefer XML, you can use the Zend_Translate_Adapter_Xliff implementation, which will read strings from XLIFF 1.1 files. Here’s an example of one such file:
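The file below is a small hand-written sketch. The 'original' attribute and the strings are placeholders, and each trans-unit id is deliberately kept identical to its source text so that the lookup key is the same whether the adapter keys translations by id or by source string.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.1" xmlns="urn:oasis:names:tc:xliff:document:1.1">
  <file original="menu" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="home">
        <source>home</source>
        <target>Accueil</target>
      </trans-unit>
      <trans-unit id="help">
        <source>help</source>
        <target>Aide</target>
      </trans-unit>
    </body>
  </file>
</xliff>
```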

Here’s an example of how you can use Zend_Translate with translation sources in XLIFF format:
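A sketch, assuming a hypothetical file lang/fr.xliff next to the script whose file element declares French ('fr') as its target language:

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Hypothetical path to an XLIFF 1.1 translation file.
$file = dirname(__FILE__) . '/lang/fr.xliff';

// 'xliff' selects the XLIFF adapter; the locale matches the file's
// target-language attribute.
$translate = new Zend_Translate('xliff', $file, 'fr');

echo $translate->translate('home', 'fr');
```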

You can read more about XLIFF in the specification published by OASIS.

In a similar vein, you’ll also find adapters for CSV files, INI files, QT files, TMX files and TBX files. The Zend_Translate chapter of the Zend Framework manual has more information on the supported adapters.

Sniff Sniff

Zend_Translate also integrates nicely with Zend_Locale: it can use Zend_Locale’s locale auto-detection features to identify the client’s preferred language, and display text in that language if available. Consider the following example, which illustrates:
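The sketch below shows one way to wire the two components together. It assumes a hypothetical languages/ directory with one subdirectory per locale, and placeholder menu keys; the try/catch block is a defensive pattern in case locale detection raises an exception.

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Detect the browser's preferred locale, falling back to English.
try {
    $locale = new Zend_Locale('browser');
} catch (Zend_Locale_Exception $e) {
    $locale = new Zend_Locale('en');
}

// Register the locale so that other components can find it.
Zend_Registry::set('Zend_Locale', $locale);

// No locale is passed here, so the registered locale is picked up.
$translate = new Zend_Translate(
    'array',
    dirname(__FILE__) . '/languages',
    null,
    array('scan' => Zend_Translate::LOCALE_DIRECTORY)
);

// Print a simple menu in the detected language.
foreach (array('home', 'products', 'help') as $key) {
    echo htmlspecialchars($translate->translate($key)), "<br />\n";
}
```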

This script uses Zend_Locale to automatically detect the browser’s preferred locale. If no locale can be detected, a fallback locale (‘en’) is used instead. This locale is then registered with Zend_Registry, so that other Zend Framework components like Zend_Translate can access it. If you don’t like the idea of using Zend_Registry, you can also pass the Zend_Locale object directly to Zend_Translate, as the third argument to the object constructor.

To see this in action, change your browser’s preferred language to French or German, then browse to the script above. Zend_Locale should automatically detect the browser’s preferred language and display the menu in that language.

If you switch your browser’s preferred language to an unsupported language, you’ll end up with just the translation keys displayed, since no translation can be performed in this case.

In case you see a bunch of error notices from Zend_Translate about the unsupported language, remember that you can always turn these off by passing the ‘disableNotices’ option to the constructor. You’ll find a complete list of Zend_Translate options in the Zend Framework manual.
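As a sketch, the option goes into the array passed as the fourth constructor argument. The translation strings and the choice of Japanese ('ja') as the unsupported locale are illustrative.

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Placeholder French translation source.
$french = array('home' => 'Accueil', 'help' => 'Aide');

// The fourth argument carries adapter options; 'disableNotices'
// suppresses notices about missing languages.
$translate = new Zend_Translate(
    'array',
    $french,
    'fr',
    array('disableNotices' => true)
);

// 'ja' has no translation source, so the key itself is returned,
// this time without an accompanying notice.
echo $translate->translate('home', 'ja');
```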

Down The Rabbit Hole

It’s interesting to note that if Zend_Translate is unable to find a translation for a more specific locale (example: Spanish/Mexico or ‘es_MX’), it will automatically downgrade the locale to only the language identifier (example: Spanish or ‘es’) and then check if a translation for this more generic locale exists.

The following example will make this clear:
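In this sketch, only a generic Spanish ('es') source is registered, and the lookup asks for Mexican Spanish ('es_MX'). The key and string are placeholders.

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Only the generic Spanish locale has a translation source.
$spanish = array('home' => 'Inicio');
$translate = new Zend_Translate('array', $spanish, 'es');

// There is no 'es_MX' source, so the lookup is downgraded to 'es'
// and the generic Spanish string is used.
echo $translate->translate('home', 'es_MX');
```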

That’s why it’s important to ensure that you always include translations for top-level locales, such as ‘en’ and ‘fr’, as these provide a fallback mechanism when translations for more specific locales, such as ‘en_US’ and ‘fr_CH’, cannot be found.

Manual Override

If you’d also like to offer users a “manual override” to the locale auto-detection, it’s possible to add a language selector to the page, and have the script use the selected language instead of attempting to auto-detect it. Here’s a revision of the previous example, which illustrates:
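One way to sketch this is with a hypothetical 'lang' query-string parameter, checked against the loaded translations with isAvailable(). The languages/ directory, the list of language codes and the menu keys are all assumptions.

```php
<?php
// Assumption: the Zend Framework 1.x library directory is on the include path.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

// Auto-detect the browser locale first, falling back to English.
try {
    $locale = new Zend_Locale('browser');
} catch (Zend_Locale_Exception $e) {
    $locale = new Zend_Locale('en');
}
Zend_Registry::set('Zend_Locale', $locale);

$translate = new Zend_Translate(
    'array',
    dirname(__FILE__) . '/languages',
    null,
    array('scan' => Zend_Translate::LOCALE_DIRECTORY, 'disableNotices' => true)
);

// Manual override: use the requested language only if it was loaded.
if (isset($_GET['lang']) && is_string($_GET['lang'])
    && $translate->isAvailable($_GET['lang'])) {
    $translate->setLocale($_GET['lang']);
}

// Language selector (hypothetical list of supported codes).
foreach (array('en', 'fr', 'de') as $code) {
    echo '<a href="?lang=' . $code . '">' . $code . '</a> ';
}
echo "<hr />\n";

// Menu in the selected or auto-detected language.
foreach (array('home', 'products', 'help') as $key) {
    echo htmlspecialchars($translate->translate($key)), "<br />\n";
}
```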

And now, when you visit the page, you should see a language selector at the top. Select a language, and you’ll see the menu in that language.

If you attempt to specify an unsupported language, or access the script without specifying a language, locale auto-detection will take place, as shown earlier.

As these examples illustrate, the Zend_Translate component provides an easy-to-use and sophisticated API for adding multi-language support to a Web application. Its support for multiple formats means that it’s easy to integrate with other tools, and its support for auto-retrieval of translation sources from standard file or directory structures ensures that your translations remain easy to maintain, even as you extend them over time. Play with it sometime, and see what you think!

Copyright Melonfire, 2011. All rights reserved.