Over the past two and a half months, Sony Music rolled out some of their top musical artist sites in new multilingual versions based on Drupal 6. P!nk, Beyonce, Britney Spears, John Legend, Kelly Clarkson and The Fray were among the first to go multilingual, with many more on the way.
Before introducing multilingual support, Sony Music had a solid platform built on Drupal 5 from which they delivered dozens of major artist websites with interactive and social networking features. But having their sites in English only presented significant barriers. Going multilingual would allow them to expand fan bases and reach out to new audiences.
In addition to their team of in-house developers and other contractors, Sony Music sponsored CivicActions to improve internationalization support in Drupal, contributing back all development work. CivicActions guided the internationalization of Sony Music's platform and ensured stability as they upgraded from Drupal 5 to Drupal 6.
Accomplishing these aims required an intensive focus on reviewing and upgrading the large selection of core and contributed modules Sony Music relies on. Sony Music's investment in improving the Drupal platform demonstrates the sound business case for contributing back.
Here's a summary of how we approached the work and how this project produced internationalization improvements throughout Drupal core and contrib that can be used on any site wanting to serve users in multiple languages and countries.
Multilingual support in Drupal 6
Sony Music picked a great time to internationalize their platform. Improving internationalization support was a key aim when Drupal founder Dries Buytaert chose a maintainer for Drupal 6. Thanks to efforts led largely by Gábor Hojtsy and Jose Reyero, Drupal 6 features major internationalization enhancements. Highlights include:
- The new content translation module that allows content to be in multiple languages.
- Automatic import of translation files.
- A newly expanded localization system that can accept custom text ("strings") as well as the text contained in code files.
Yet there were also challenges. Because they were new, content translation and other internationalization improvements in Drupal core still had some rough edges. And very few of the major contributed modules that Sony Music relied on had been fully upgraded to support internationalization improvements in Drupal 6.
To ensure that Sony Music's high traffic sites had a stable and fully functional internationalized platform, we had our work cut out for us.
Code quality and stability
As a first step, we did a code-level review of every contributed module Sony Music used, using the excellent Coder module supplemented with manual reviews.
In many cases we were able to detect and fix bugs and Drupal 6 upgrade errors that had not yet been reported. We also extended the existing set of Coder tests with new rules for analyzing the quality of code translation. The result was over fifty "patches" (proposed code-level changes) with improvements and bug fixes to dozens of contributed Drupal modules.
Dates formatted by locale
Whether it's concert schedules or album releases, date information is a key part of why fans come to a musician's site. It was important for Sony Music that date information be formatted appropriately for different countries and languages.
By default, Drupal offers a single set of long, medium, and short date formats per site with no way to customize them for users from different locations.
The contributed Date module provided some additional flexibility for date formatting, but didn't yet allow formats to be customized by location.
Adding support for locale-specific date formats turned out to require some significant changes to the data structure and user interface in the Date module. But Date module maintainer Karen Stevenson welcomed the contribution and gave our patches the considerable time and effort needed to review and improve them.
Weeks of work later, the outcome was a highly configurable solution that allows both module-defined and admin-entered date formats, all of which can be localized into specific formats by language.
So concert-goers in France will see upcoming performance dates in a format they recognize and understand.
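As a rough illustration of the idea (not the Date module's actual code), per-language format overrides can be modeled as a lookup that falls back to the site-wide default. The names `DEFAULT_FORMATS`, `LOCALE_FORMATS`, and `format_date` are hypothetical, the language codes and patterns are sample values, and the patterns use Python's strftime syntax rather than the PHP date syntax Drupal uses.

```python
from datetime import datetime

# Site-wide defaults, keyed by format type (hypothetical sample values).
DEFAULT_FORMATS = {
    "short": "%m/%d/%Y",
    "medium": "%m/%d/%Y - %H:%M",
}

# Per-language overrides; any format type not listed falls back to the default.
LOCALE_FORMATS = {
    "fr": {"short": "%d/%m/%Y"},
    "fr-CA": {"short": "%Y-%m-%d"},
}


def format_date(value, format_type, langcode):
    """Format a datetime using the language's override, else the site default."""
    overrides = LOCALE_FORMATS.get(langcode, {})
    pattern = overrides.get(format_type, DEFAULT_FORMATS[format_type])
    return value.strftime(pattern)


show = datetime(2009, 7, 4, 20, 30)
print(format_date(show, "short", "fr"))  # day first for French
print(format_date(show, "short", "en"))  # no override, so the site default applies
```

The fallback means an administrator only has to enter formats for the languages that differ from the default.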
The challenges of working with translation sets
Many of the most challenging issues on multilingual sites relate to how, in Drupal 6, an original "source translation" is translated into multiple languages. The different language versions of a particular piece of content form a "translation set".
Issues arise when users or administrators interact with the content through actions like voting, flagging, and adding to queues. Should the interaction be with the translation set as a whole (identified by the ID of the source translation)? Or with a particular translation?
For example, consider voting. You don't want votes on the French Canadian translation of a song to be completely separate from those on the Australian English translation. It's the same song, after all.
To address this issue, first we enhanced modules including Flag and Node queue to work with translation sets as well as individual nodes. In practice, this meant that a flag type or node queue could be applied to individual translations or to translation sets.
For voting, we worked on extending the voting API module to support custom voting axes (tags) that could then be treated as either per-translation or per-translation-set. (That patch is still pending.)
Then we wrote a helper module and accompanying patch on Drupal core to address a quirky issue--the need to know if the "source" translation in a translation set has changed (since now Flag and Node queue and other modules were tracking information by this datum). Sound tricky and obscure? We thought so too, which is why we put this functionality in a contributed module that could save others the headache of trying to figure it out.
Other tweaks and changes related to translation sets included:
- Tracking page views by translation set in Google Analytics.
- Recognizing translated references to other content ("node references") when translating in Content Construction Kit.
Together, these various improvements helped bring widely used Drupal modules up to speed with the specific requirements of internationalized sites using content translation.
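To make the per-translation versus per-translation-set choice concrete, here is a minimal sketch, not the code from the Flag, Node queue, or Voting API patches. It assumes each node record carries its own ID (`nid`) and the ID of its source translation (`tnid`, with 0 meaning the node is not part of a translation set), as in the Drupal 6 node table. The function names and sample data are hypothetical.

```python
from collections import defaultdict


def vote_target(node, per_translation_set):
    """Return the ID that a vote on this node should be recorded against."""
    # A tnid of 0 means the node is not part of any translation set.
    if per_translation_set and node["tnid"]:
        return node["tnid"]
    return node["nid"]


def tally(votes, nodes, per_translation_set=True):
    """Sum (nid, value) votes, grouping by translation set when requested."""
    totals = defaultdict(int)
    for nid, value in votes:
        totals[vote_target(nodes[nid], per_translation_set)] += value
    return dict(totals)


# Node 10 is the source; node 11 is its translation; node 12 stands alone.
nodes = {
    10: {"nid": 10, "tnid": 10},
    11: {"nid": 11, "tnid": 10},
    12: {"nid": 12, "tnid": 0},
}
votes = [(10, 1), (11, 1), (12, 1)]
print(tally(votes, nodes))                             # both translations share one total
print(tally(votes, nodes, per_translation_set=False))  # each node counted separately
```

Because totals are keyed on the source translation's ID, they depend on that ID staying stable, which is why a change of source translation needs to be detected.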
Multiple languages per country
A key requirement for Sony Music was to have content presented to users based on the users' locations. If someone is visiting The Fray's site from Canada, she or he should get a version of the site specific to Canada--in Canadian English or French, with concert and album release dates for Canada.
By default, Drupal comes with only a single version of most languages. Rather than using these default languages, such as English (en) and French (fr), Sony Music sites use country-specific languages like 'en-CA' for Canadian English and 'fr-CA' for Canadian French.
Matching site designers' requirements with Drupal implementations required a couple of tricks, implemented in a small, custom module written by Sony Music developer Roger López.
First, the specs called for country-specific sites to be available at country code addresses rather than at language ones. Drupal optionally uses "path prefixes" (language-specific parts of the page address) to determine the language to be used on pages. Taking advantage of the fact that the prefix used for a language doesn't have to be the same as the language code, Sony Music used just the country code as the prefix for the primary language for each country--e.g. 'ca' for 'en-CA'.
Second, a bug in Drupal core meant that anonymous users could get the wrong language served from a page cache. To address this, on pages that lack a language prefix in their address, the custom module always forwards to a new prefixed page address. This approach means that language-neutral pages are never cached so users don't get an incorrect language.
Visitors coming from IP addresses that are outside of the supported countries on a given artist's site get a global rather than country-specific version.
Through these small enhancements of Drupal's built-in language negotiation, users see country-appropriate content from their first page load.
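The prefix handling described above can be sketched roughly as follows. This is an illustration only, not the custom module's code: the prefix tables, the 'global' prefix, and the `negotiate` function are hypothetical, and the visitor's country is assumed to have been resolved from the IP address elsewhere.

```python
# Path prefix -> language code. The prefix need not match the language code.
PREFIX_TO_LANGUAGE = {
    "ca": "en-CA",      # country code used for the country's primary language
    "ca-fr": "fr-CA",   # hypothetical prefix for the secondary language
    "fr": "fr-FR",
    "global": "en",     # hypothetical fallback site version
}

# Visitor country -> prefix of that country's primary language.
COUNTRY_TO_PREFIX = {"CA": "ca", "FR": "fr"}
GLOBAL_PREFIX = "global"


def negotiate(path, visitor_country):
    """Return ("serve", language, rest_of_path) or ("redirect", new_path)."""
    stripped = path.strip("/")
    first, _, rest = stripped.partition("/")
    if first in PREFIX_TO_LANGUAGE:
        return ("serve", PREFIX_TO_LANGUAGE[first], rest)
    # Unprefixed address: always redirect, so a language-neutral page is
    # never served (or cached) in the wrong language.
    prefix = COUNTRY_TO_PREFIX.get(visitor_country, GLOBAL_PREFIX)
    return ("redirect", f"/{prefix}/{stripped}".rstrip("/"))


print(negotiate("/ca/tour", "US"))  # prefixed address is served as-is
print(negotiate("/tour", "CA"))     # unprefixed address is redirected by country
print(negotiate("/tour", "JP"))     # unsupported country gets the global version
```

Only the unprefixed case involves any custom logic; once the address carries a prefix, ordinary prefix-based negotiation takes over.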
Stabilizing the i18n package
In Drupal 6, the Internationalization (i18n) package provides many of the multilingual features that are not yet supported in Drupal core. Examples are the ability to have blocks and menus function in multiple languages.
The i18n module set was the origin of many of the internationalization improvements that came in Drupal 6, which was a good thing but also meant that i18n required fundamental reworking in D6. This work was still underway and the i18n package hadn't yet reached a stable release.
High traffic sites like P!nk's have a way of making problems show up sooner rather than later. After one of the servers was nearly brought down by a stray query, we turned our attention to stabilizing i18n as quickly as possible.
Working closely with i18n maintainer Jose Reyero, we combed the issue queue and the code, tackling every bug and performance issue we could uncover. Following up on the first major performance issue we hit, we reviewed every module in the set that had related code, detecting several previously unknown issues in the process.
It took some intensive and focused work, but we managed to close every major issue and Jose posted the first Drupal 6 i18n stable release. Sony Music developers could rest a lot easier in terms of the reliability and performance of their internationalized platform.
Like most major Drupal sites, Sony Music's artist sites rely heavily on Views--not surprising, since Views author Earl Miles is on staff.
The first need in upgrading Views multilingual support was to ensure multilingual data was exposed to Views. The node language field was already views-enabled. Building on that, we introduced a new "node translation" group and exposed additional fields showing whether a piece of content has been translated, what its source translation is, and whether a translation is up to date.
Next up was the locale data tables, where non-node translations are stored. The next stable release of Views will include a new "Locale source" data type where both text to be translated and text that has already been translated can be filtered and displayed.
Together, the node translation and locale Views data can be used to build customized views useful in multilingual environments. Views can be created for conditions like all translated content in French that needs updating or all taxonomy term names that haven't yet been translated into German.
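As a sketch of the first condition, the filter such a view expresses might look like the following. This is plain Python over sample records, not Views configuration. It assumes each node record has `language`, `tnid` (the source translation's ID, 0 if none), and `translate` (non-zero when the translation is flagged as outdated), mirroring the Drupal 6 node table. The function name and sample data are hypothetical.

```python
def translations_needing_update(nodes, langcode):
    """Return IDs of translations in `langcode` that are flagged as outdated."""
    return [
        node["nid"]
        for node in nodes
        if node["language"] == langcode
        and node["tnid"] not in (0, node["nid"])  # in a set, but not the source
        and node["translate"]                     # flagged as needing an update
    ]


nodes = [
    {"nid": 1, "tnid": 1, "language": "en", "translate": 0},  # source
    {"nid": 2, "tnid": 1, "language": "fr", "translate": 1},  # outdated translation
    {"nid": 3, "tnid": 0, "language": "fr", "translate": 0},  # untranslated content
    {"nid": 4, "tnid": 5, "language": "fr", "translate": 0},  # up-to-date translation
]
print(translations_needing_update(nodes, "fr"))
```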
With Views newly aware of multilingual data, it was time to make sure that Views' own data were fully translatable. Views can contain admin-entered text in various places, including headers, footers, and the text given when no matching content is found (the "empty text").
An approach to exposing these data for translation was already sketched in but not fully fleshed out. To improve translation of these strings, we introduced a new kind of Views plugin, a localization plugin. While not quite complete yet, the patch should provide a flexible and secure way to make Views header, footer, and other text multilingual.
Improvements for Drupal core
Fixes and tweaks
Content and interface translation in Drupal core were fairly solid, but as we ran the systems through their paces we hit several edge cases and issues. So throughout the project we identified and submitted patches for core issues (many of which are still awaiting some further review).
Internationalization in core code sprint
As we took stock near the end of the project, we realized that these tweaks and bugfixes weren't enough. Consolidating the improvements we'd made would require a solid focus on Drupal core improvements.
Many of the multilingual features we were relying on in the Internationalization package were workarounds or dated approaches required to fill gaps in Drupal core. It was time to make sure that internationalization improvements in Drupal 7 kept pace with the great start in Drupal 6. And this was a task far beyond what we could accomplish alone.
The answer? We helped organize and participated in a "code sprint" focused on internationalization improvements in Drupal core.
Traditionally, a code sprint is when you get a bunch of developers together in the same room for a period of one or more days to intensively collaborate on solving development issues.
In this case, though, the relevant developers were in many different countries and we didn't have a budget for flying people around the world. However, most of us were accustomed to working remotely--in fact, the three main project developers hadn't yet met each other in person. So we organized a "virtual" code sprint, inviting one and all to come together from wherever they were located to work together on identifying and tackling some key internationalization problems.
The result was a remarkable success. In addition to our team, Francesco Placella, Jose Reyero, and Daniel Kudwien participated in the full sprint and several other developers joined in for parts. In all there were participants from at least seven countries--an international internationalization team!
We focused on several quicker fixes and two ambitious new issues.
Quicker fixes completed and applied from the sprint included:
- Streamlining the process of editing translations by maintaining context, as other Drupal editing pages (such as the content editing page) already do.
- Facilitating further development by increasing the test coverage of locale module functionality.
- Fixing a bug in locale module's uninstall routine.
The two big issues we tackled in the sprint centered on extending the types of data that can be translated in Drupal core.
First, several sprint developers worked on enabling the newly minted core Fields API to include multilingual support. One of the biggest new features of Drupal 7, "fields in core," as it's known, will allow any data type (content, user, whatever) to be extended with custom fields, like the CCK module does for content in Drupal 5 and 6. The actively emerging translatable fields patch would allow all of those fields to be multilingual, effectively extending multilingual support to any fieldable data type.
Next, we tackled the need to have translation of user-defined text strings - e.g., custom menu titles - handled in Drupal core alongside the existing support for code-defined strings. In the space of the sprint we wrote a whole new set of locale data handling methods for core, and even segued into a robust set of methods that could be used wherever there's a need to read records.
While neither of these two ambitious patches has been completed yet, they are both making great progress, thanks to the focused collaboration of skilled contributors from many areas of Drupal development. Together, these patches could vastly improve and solidify the Drupal core support for flexible and powerful multilingual websites.
For more details, see the sprint writeup.
Multiple languages in practice
All of these development issues can seem pretty abstract though. What has all this code-level work meant on the Sony Music artist sites?
Well, there's probably no better place to see the new multilingual content in action than the Britney Global Fan Fiction Contest.
Fans from around the world can enter their stories in more than a dozen languages. The entries pouring in from many countries are a clear measure of the success of the internationalization effort. Music fans are celebrating the chance to read about and interact with their favourite artists in their own languages.
In closing, here are some of the key lessons we've drawn from this experience.
Engage with the community. The challenge of improving internationalization in Drupal core and contrib was not purely a technical one--it also called for a lot of working with people. We wrote developer documentation where it was lacking. We initiated community dialogue and discussion through a series of posts to the Internationalization group. And we communicated closely with the many contributed module maintainers whose work we were enhancing, working to understand and address their needs and ideas.
This attention to communication, community engagement, and dialogue was key to getting our contributions reviewed, critiqued, improved, and ultimately committed.
Work with Drupal core, not against it. Our first attempts to handle the needs of country-based language negotiation involved a lot of overrides and substitutes for Drupal's built in language handling. While those efforts were educational, in the end the best approach was the simplest, relying wherever possible on Drupal's native functionality and intervening only in the smallest ways that could achieve the basic objectives.
Fix it in core too. Drupal's flexible architecture makes it possible in many cases to avoid, override, or compensate for problems. But that extra work can increase complexity, introduce redundancy, and hurt performance. Even when we found workarounds, it was worthwhile also to submit the core patches that would make it possible to drop those workarounds when upgrading to the next Drupal version.
Developers like getting to do it right. For us on the project team, it was a privilege to code consistently to the highest standards. Often, development projects are driven by the individual needs of a particular site. No matter how open source savvy the client, a lot of work ends up being custom and limited to that site's specific requirements. For our developers, it was energizing and rewarding to engage with our peers, get review on every patch we submitted, and write code that was consistently generic rather than site-specific.
Internationalization in Drupal is coming of age. The major Sony Music sites running in multiple languages demonstrate the flexibility, stability, and performance of internationalized sites on Drupal 6. If sites have been holding off on going multilingual, it's a good time to reconsider.
Open source means open solutions. Maybe the most important development lesson of the Sony Music internationalization project is that contributing back pays off.
A firm commitment to contributing back has its challenges. It takes time and energy. In internationalizing their sites, Sony Music could have gone the route that many companies choose--make optimizations to their own code base and pretty much leave it at that. But doing so misses out on a lot of the key benefits of open source.
By contributing back and engaging the community, Sony Music achieved improvements throughout the Drupal codebase. In doing so, Sony Music has strengthened the overall stability and performance of their chosen platform and at the same time helped ensure long term improvements in the next major Drupal version.
Music may be the universal language, but some stuff in your own language doesn't hurt either. Through page visits, multilingual comments, and new signups, music fans have given the new internationalized artist sites their clear seal of approval.
Every line of code that CivicActions produced over the six months of this project was contributed back in the form of fixes and improvements to the Drupal code base. Engaging the community led to improvements far beyond the scope of what our limited development team could have achieved on our own.
Sony Music's approach to investing in free and open source software is an example that other companies choosing Drupal can learn and benefit from.
CivicActions development work was led by Stella Power, Nat Catchpole and Nedjo Rogers, with contributions from many other CivicActions developers. Project management was by Jenn Sramek and Sadie Honey.
We also had the amazing support and collaboration of Sony Music's in-house developers and other contractors, specifically Suzi Arnold, Roger López and Earl Miles at Sony Music and Nate Haug at Lullabot. Their knowledgeable feedback and commitment to quality and open solutions made all the difference.