In the current global environment, companies are pushed more than ever to provide an intimate and personalized experience for all of their customers. This of course includes offering localized content for customers of different nationalities and languages. Unfortunately, this feature is often overlooked in content management systems (CMSs). According to IDC, however, customers are four times more likely to purchase from a site in their native language, which suggests that almost all sites should offer some sort of localization solution in order to stay competitive in this global age.
In this series I will investigate several different content management systems and review their localization support. More specifically, I will review WordPress, Drupal, and Hippo CMS, along with highlighting features of enterprise systems like Ektron and Ingeniux.
In this first post, I will begin by describing the ideal qualities a multilingual CMS should have and how localization is typically supported in current systems. This will provide the criteria against which I will evaluate these content management solutions throughout the rest of the series.
translation of content
At the most basic level, a system should have the ability to store and provide user content in multiple languages. As of today, this feature is nearly a given in all systems, whether it be out of the box or through a plugin.
taxonomies and assets
Other assets, however, are not so readily translatable. Taxonomies, for example, are typically managed separately from individual posts or pages. As a result, the CMS needs a separate process for translating each taxonomy term. Assets such as images and PDFs are even harder to manage because a separate copy of the same asset has to be uploaded for every language whenever the asset itself contains localized content. A good CMS would provide a way to organize assets so the original is always on hand. For example, a PSD could be conveniently uploaded and grouped with its derivatives so a new localized version could be easily generated by a marketing team.
administrator panels and configurations
Depending on the business and whether or not it has a globalized group of contributors, the admin panel may or may not also need to be localized. Most CMSs support multiple languages and have initiatives to achieve wider support. Plugins and modules written by third parties, however, typically do not have the same support as an entire CMS, so greater care needs to be taken with CMSs that rely heavily on plugins.
A robust workflow is essential to providing a usable system for businesses that have a large team of writers and editors. While this support is typical of most CMSs, support for a workflow that also works in tandem with localization is harder to come by. Two aspects that are particularly important are separation of workflows and update notifications.
It is essential that each language have a separate workflow because it allows each language to publish and edit its content regardless of the status of the other languages. For example, a site would probably want to publish the English version of a page before the German or Dutch versions if the content has yet to be translated.
In the same vein, each language should have its own set of permissions. This is important if you only want translators to have access to content written in their language.
Maintaining synchronization between different translations is also an incredibly important feature a robust workflow should facilitate. Notifications should be pushed out to all relevant translators whenever a document has been updated. This allows translators to then update their content to reflect the original.
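The two workflow requirements above, separate per-language states and update notifications, can be sketched in a few lines. This is a minimal illustration rather than the mechanism of any particular CMS; the `Document` and `State` names and the `update_source` method are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"
    OUTDATED = "outdated"  # the source changed after this translation was written


@dataclass
class Document:
    source_lang: str
    # One workflow state per language, so each language moves independently.
    states: dict = field(default_factory=dict)

    def publish(self, lang: str) -> None:
        self.states[lang] = State.PUBLISHED

    def update_source(self) -> list:
        """Record an edit to the source text and return the languages to notify."""
        stale = [lang for lang in self.states if lang != self.source_lang]
        for lang in stale:
            self.states[lang] = State.OUTDATED
        return stale


doc = Document("en", {"en": State.DRAFT, "de": State.DRAFT, "nl": State.DRAFT})
doc.publish("en")                # English goes live before German and Dutch
to_notify = doc.update_source()  # translators for these languages get a notice
```

Keeping the state per language is what lets the English page go live on its own, and the list returned by `update_source` is the hook a notification system would consume.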
synchronization of non-translatable content
As mentioned before, maintaining synchronized content is incredibly important. Not all content needs to be translated, however. Prices, images, and numbers are the same in every language. Such content is tagged with the ISO 639-2 code "und" (undetermined language), referred to as UND in the rest of this series. It is crucial that changes to UND content are automatically propagated to all languages. Otherwise, translators are left with the burdensome task of maintaining data integrity.
A CMS should also be able to synchronize related content such as likes, votes, signups, and event attendance. This information should be aggregated in some manner so admins do not have to manage multiple sources of data. At the same time, however, administrators should still have the ability to separate information by language and location. This is essential for providing regional statistics and further aids the creation of targeted content.
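One way to satisfy both needs is to record the language alongside each interaction and derive the totals on demand. The following is a rough sketch under that assumption; the event tuples and the `totals` and `by_language` helpers are hypothetical and not taken from any of the systems reviewed.

```python
from collections import Counter

# Hypothetical interaction log: (document id, language, kind of interaction)
events = [
    ("post-1", "en", "like"),
    ("post-1", "de", "like"),
    ("post-1", "de", "like"),
    ("post-1", "nl", "signup"),
]


def totals(events, doc_id):
    """Aggregate interactions for a document across all languages."""
    return Counter(kind for d, _, kind in events if d == doc_id)


def by_language(events, doc_id, kind):
    """Break one kind of interaction down by language for regional statistics."""
    return Counter(lang for d, lang, k in events if d == doc_id and k == kind)


all_likes = totals(events, "post-1")["like"]
likes_per_lang = by_language(events, "post-1", "like")
```

Administrators see a single source of data, while the per-language breakdown remains available for targeted content.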
Lastly, a multilingual CMS should have some sort of ability to handle comments from different languages. In an ideal system, comments could be translated and synchronized between various languages. This is difficult if not impossible to find in current systems because of the inaccuracy of machine translation and the exorbitant cost of professional translators. Therefore, efforts should be made to ensure comments from different localities are separated from each other to prevent the user experience from being degraded.
machine and human translation
The ability to provide translation services is perhaps one of the most important qualities of a multilingual system. Most if not all CMSs allow for human translation, but machine translation is also a beneficial feature that can be packaged within a system. While machine translation does not produce perfect results, it is useful on smaller projects where accurate translation comes second to getting a message across quickly to a new or established market.
third party integration
To help with human translation, multiple third-party sites have been created to provide translation services for businesses. A company that does not do its translations internally may end up depending on such a service. One example is ICanLocalize. In such cases integration is key. ICanLocalize, for example, readily integrates with Drupal and WordPress.
Support for XLIFF (XML Localization Interchange File Format) also aids in providing human translation. XLIFF is an OASIS standard that is widely supported by translation tools, which makes it a common interchange format for localized content. By supporting importing and exporting of XLIFF files, large portions of a site can be translated with relative ease by a professional translator.
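To give a feel for what an export involves, here is a minimal sketch that writes a set of translatable fields as an XLIFF 1.2 file using only Python's standard library. The `export_xliff` function and its parameters are assumptions for illustration. A real CMS exporter would also carry notes, inline markup, and state attributes.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def export_xliff(fields, source_lang, target_lang, original):
    """Serialize {field id: source text} as a minimal XLIFF 1.2 document."""
    ET.register_namespace("", XLIFF_NS)  # emit XLIFF as the default namespace
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {
        "original": original,
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for field_id, text in fields.items():
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": field_id})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = text
        ET.SubElement(unit, f"{{{XLIFF_NS}}}target")  # left empty for the translator
    return ET.tostring(root, encoding="unicode")


xml_text = export_xliff({"title": "Hello", "body": "Welcome"}, "en", "de", "post-1")
```

Each field becomes one `trans-unit`, so the translator's tool can return the same file with the `target` elements filled in and the CMS can match them back to fields by `id`.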
Lastly, CMSs should have some ability to provide local themes or sub-sites. While most businesses opt to have a unified look between their different channels, some businesses may require unique presentations for different locations. This makes it easier for a company to target an audience by designing a theme that appeals directly to a localized region.
Today there are two predominant models that describe how translated content can be structured within a system. In this series I have given them the names multi- and single-node translation.
multi-node translation model
The first, and more frequently used, model involves duplicating a document or node multiple times, once for each language. In the following diagram, English is treated as the root node with the other duplicated nodes radiating out from it.
The multi-node translation model allows for each language to act as a first-class citizen, so each document or node can behave as a separate, unique document. Applying distinct workflows and permissions for each node is therefore incredibly easy.
There is, however, a fundamental problem with this model: all of the data is copied into every node, resulting in data redundancy. This is particularly detrimental to a system that has UND content because every node needs to be manually updated whenever a simple change is made. Secondly, because the nodes are disjoint, each one has its own references to a different set of related information, making it difficult to aggregate that information.
Examples of CMSs that use this model include Hippo CMS, Drupal content translation, and the WPML plugin in WordPress.
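The trade-off is easier to see in a small sketch. The `Node` class and `set_price` helper below are hypothetical and simplified. They are not the schema of Hippo CMS, Drupal, or WPML, only an illustration of one node per language linked back to a root.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    node_id: int
    lang: str
    title: str
    price: float                          # language-neutral, yet copied into every node
    translation_of: Optional[int] = None  # id of the root node, None for the root itself
    published: bool = False               # each node has its own workflow state


en = Node(1, "en", "Red bicycle", 199.0, published=True)
de = Node(2, "de", "Rotes Fahrrad", 199.0, translation_of=1)
nl = Node(3, "nl", "Rode fiets", 199.0, translation_of=1)
nodes = [en, de, nl]


def set_price(nodes, root_id, price):
    """Changing one language-neutral value means touching every copy."""
    for node in nodes:
        if node.node_id == root_id or node.translation_of == root_id:
            node.price = price


set_price(nodes, 1, 179.0)
```

Independent publishing comes for free because `published` lives on each node, but the price has to be pushed to three places, which is the redundancy described above.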
single-node translation model
The second model, on the other hand, encapsulates all of the languages into a single document or node with multiple languages for each field.
As can be seen from this model, a UND field is easy to create: the field is simply restricted to a single language. This model also easily synchronizes related content such as likes and event attendance because all of the information is stored in one location.
The problem, however, is that each translation is no longer a first-class citizen, and as a result, all of the languages use a single workflow. The fact that all of the languages are in one node also makes it difficult to assign different permissions to the various languages and to separate comments out by their language.
Examples include Drupal entity translation and the qTranslate plugin for WordPress.
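A comparable sketch of the single-node model, again using made-up names rather than the actual Drupal or qTranslate data structures, stores every field as a mapping from language code to value and falls back to the UND value when no translation exists.

```python
from dataclasses import dataclass, field

UND = "und"  # language-neutral values are stored once under this key


@dataclass
class Node:
    node_id: int
    # field name -> {language code -> value}
    fields: dict = field(default_factory=dict)
    likes: int = 0  # related data lives in one place for all languages

    def get(self, name, lang):
        """Return the value for lang, falling back to the UND value if present."""
        values = self.fields[name]
        return values.get(lang, values.get(UND))


node = Node(1, {
    "title": {"en": "Red bicycle", "de": "Rotes Fahrrad"},
    "price": {UND: 199.0},
})

node.fields["price"][UND] = 179.0  # one update, visible in every language
german_title = node.get("title", "de")
german_price = node.get("price", "de")
```

The price is updated once and the like counter is shared, but there is only one node to attach a workflow state or permission to, which is the limitation described above.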
Having investigated the various requirements of multilingual systems and the two models typically used to construct them, we can see that providing localization support is not a trivial task. In fact, the rudimentary analysis I have given for each model suggests that several of the key features required of an ideal multilingual system are mutually exclusive. Thankfully, novel approaches have been developed to bridge the gap between these two models, but there are still enough differences to make choosing between them incredibly difficult.
The remainder of this series will investigate the aforementioned content management systems. In the process I will analyze their approach and see how well they have been able to augment the features of each model. At the end of the series I will provide a comprehensive report of the pros and cons of each CMS to further help you decide on the most beneficial CMS for your needs.