I’ve worked with applications with different language variants for years. Here are some things I’ve learnt along the way. This is written from the point of view of an English speaker, working with an English application. In honour of that, there are 26 tips – one for each letter of the alphabet in the en-GB culture.
Terms
Globalisation is the process of ensuring that your application can support multiple cultures and languages.
Localisation is the process of adapting your application to one target language. It includes but is not limited to translating the application.
Globalisation
1. Build globalisation into your application from the start
Modern programming frameworks come with support for globalisation and localisation but make sure you use it from the earliest opportunity. It’s way harder to retrofit this capability and frankly it’s a huge slog. If you must do it – get someone else in to do the work. I’m only half joking.
2. Extended Character Sets
You’ll have to use extended character sets to support more than the 26 letters and 10 digits you are used to: accents, umlauts and so forth, or perhaps an entire Cyrillic alphabet. UTF-8 is the correct choice on the web. Make sure that your database can store it correctly too – nvarchar for SQL Server, for example.
Weirdly, people can rapidly occupy the moral high ground when talking about text encoding and you wonder if you are mad or stupid for even mentioning it. But unless you actively monitor it – there will be somewhere in your application that doesn’t properly support it and you will find that out at a very inopportune moment.
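One way to monitor encoding support is a round-trip check on sample text. The following is a minimal sketch, not a complete test: the helper name survives_round_trip and the sample words are made up for illustration. It shows that UTF-8 copes with Cyrillic where Latin-1 does not, and what a mis-decoded value looks like.

```python
# Hypothetical helper: checks whether text survives an encode/decode cycle.
def survives_round_trip(text: str, encoding: str) -> bool:
    """Return True if text can be encoded and decoded without loss."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # The encoding has no way to represent at least one character.
        return False

# Illustrative samples: an umlaut and a Cyrillic word.
umlaut_ok_in_latin1 = survives_round_trip("Müller", "latin-1")
cyrillic_ok_in_latin1 = survives_round_trip("Привет", "latin-1")
cyrillic_ok_in_utf8 = survives_round_trip("Привет", "utf-8")

# What a mis-decoded value looks like: UTF-8 bytes read back as Latin-1.
mojibake = "Müller".encode("utf-8").decode("latin-1")
```

If strings like the mojibake value turn up in your UI or database, some layer is reading UTF-8 bytes with the wrong encoding.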
3. String length and truncated text
Translating text often results in longer or shorter strings. German is notorious for super long words for instance. Short strings might make your application look a bit weird. Long strings might truncate unhelpfully in your UI or might break your UI layout completely. Long strings can also break your database if the field length isn’t long enough.
Design your database and UI to deal with a range of string sizes as soon as you can.
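A cheap safeguard is to scan your translations for strings that exceed a known limit. This is only a sketch under assumptions: the 20-character limit, the catalogue layout and the function name too_long are all hypothetical, and the German strings are illustrative.

```python
MAX_LABEL_CHARS = 20  # hypothetical limit, e.g. a column width or button size

# Hypothetical catalogue layout: key -> culture -> text.
translations = {
    "save": {"en": "Save", "de": "Speichern"},
    "speed_limit": {"en": "Speed limit", "de": "Geschwindigkeitsbegrenzung"},
}

def too_long(catalogue, limit):
    """Return (key, culture, length) for every string longer than limit."""
    return [
        (key, culture, len(text))
        for key, by_culture in catalogue.items()
        for culture, text in by_culture.items()
        if len(text) > limit
    ]

offenders = too_long(translations, MAX_LABEL_CHARS)
```

Here the German speed limit entry is the only one flagged. Note that len counts characters, not bytes, so a database limit expressed in bytes would need a separate check.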
4. UTC Dates
If possible, store your dates in UTC (Coordinated Universal Time) or an equivalent. Store the user’s time zone alongside it and you’ll be able to reconstruct the local time and compare it with events in other time zones. Without it, you’ll just know that something important happened at 18:00, which, depending on the time zone you are in, could be any time at all.
This might not seem important when shipping to the UK. It’s a tiny, tiny, crowded island where the natives only have one time zone to share between them. Then you sell to Australia with 5 time zones and UTC dates suddenly seem much more important.
Generally, dates are complex – the Babylonians didn’t invent our calendar to make it easy for software developers. But make it easier for yourself wherever you can.
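The idea of storing UTC together with the user’s zone can be sketched as follows. This is an illustration, not a recommendation for a specific schema: the function names are hypothetical, and a fixed offset in minutes stands in for the time zone. A real system would more likely store a named zone so that daylight saving is handled.

```python
from datetime import datetime, timedelta, timezone

def to_stored(local_time: datetime) -> tuple[str, int]:
    """Convert an aware local time to (UTC ISO string, offset in minutes)."""
    offset = local_time.utcoffset()
    if offset is None:
        raise ValueError("local_time must be timezone-aware")
    utc_text = local_time.astimezone(timezone.utc).isoformat()
    return utc_text, int(offset.total_seconds() // 60)

def from_stored(utc_text: str, offset_minutes: int) -> datetime:
    """Rebuild the user's wall-clock time from the stored pair."""
    user_zone = timezone(timedelta(minutes=offset_minutes))
    return datetime.fromisoformat(utc_text).astimezone(user_zone)

# Assumption: a fixed +10:00 offset, which ignores daylight saving.
east_australia = timezone(timedelta(hours=10))
event = datetime(2024, 6, 1, 18, 0, tzinfo=east_australia)

stored = to_stored(event)         # 18:00 at +10:00 is 08:00 UTC
restored = from_stored(*stored)   # back to 18:00 at +10:00
```

Two events stored this way can be compared directly on the UTC value, while the offset lets you show each user the time they actually saw on the clock.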
5. Database text
This is another obvious one, but any text in your database will also need translating. It’s easy to get so enamoured with all the amazing support that your UI technology has for globalisation that you forget that 75% of the text in your application comes out of a database. Likely you’ll need a separate process to deal with it.
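One common shape for that separate process is a side table of translated text keyed by culture, with a fallback to the original text. The sketch below uses an in-memory SQLite database purely for illustration; the table names, columns and the product_name function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE product_text (
        product_id INTEGER NOT NULL REFERENCES product(id),
        culture    TEXT NOT NULL,
        name       TEXT NOT NULL,
        PRIMARY KEY (product_id, culture)
    );
    INSERT INTO product VALUES (1, 'Bandage'), (2, 'Thermometer');
    -- Only product 1 has a German translation so far.
    INSERT INTO product_text VALUES (1, 'de-DE', 'Verband');
""")

def product_name(conn, product_id, culture):
    """Return the translated name, or the original name if none exists."""
    row = conn.execute(
        """
        SELECT COALESCE(t.name, p.name)
        FROM product AS p
        LEFT JOIN product_text AS t
               ON t.product_id = p.id AND t.culture = ?
        WHERE p.id = ?
        """,
        (culture, product_id),
    ).fetchone()
    return row[0] if row else None
```

With this layout, product_name(conn, 1, "de-DE") returns the German text, while product 2 falls back to the English name until someone translates it.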
6. External APIs
If you use any external APIs for content, then you’ll have to make sure that they support all the target languages. If they don’t then find an alternative, translate the input yourself on the fly (not easy) or learn to live without them.
7. You might need a different UI entirely for non-roman languages
I’ve only ever worked with European languages – the standard ones. We once floated the idea of translating our application into Mandarin Chinese and everyone got nervous. I don’t think our UI would have withstood it.
If you are going to access those big, big markets then it’s a whole new level of globalisation – are you ready for it?
8. Do you need to do it?
It’s worth taking a step back and asking if it does need to be translated at all. If your application is in English and your target market is professional, then they might take it as is – particularly if they can get it quicker and cheaper. We sold into the Gulf states, and they were happy to take it in English – which is as well really. The option for them was English or nothing. They took English.
Localisation
9. Managing provisional translations
Translators are expensive and you probably only want to employ them at the end. You’ll likely end up putting provisional translations in during development and initial testing, then bundling off your text to translators after that, when the application is more stable.
Anticipate that and devise a way to put in provisional translations, know that they are provisional and swap them out for confirmed translations when needed.
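A simple way to do this is to carry a status flag with every translation. The following is a minimal sketch with hypothetical names (Translation, provisional_keys, confirm); a real catalogue would probably live in a file or database rather than a dictionary.

```python
from dataclasses import dataclass

@dataclass
class Translation:
    text: str
    confirmed: bool = False  # False means the text is still provisional

# Hypothetical catalogue keyed by (resource key, culture).
catalogue = {
    ("save", "de"): Translation("Speichern", confirmed=True),
    ("cancel", "de"): Translation("Abbrechen"),
}

def provisional_keys(catalogue):
    """List every entry that still needs a translator's sign-off."""
    return sorted(key for key, entry in catalogue.items() if not entry.confirmed)

def confirm(catalogue, key, culture, text):
    """Swap a provisional entry for the confirmed text."""
    catalogue[(key, culture)] = Translation(text, confirmed=True)
```

The list returned by provisional_keys is what you would bundle off to the translators, and a release check could refuse to ship while that list is not empty.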
10. Language variants
You’ll want to make sure that you can properly distinguish language variants (Austrian and Swiss German, US and UK English) and know which words and phrases need a custom translation for each variant. At a minimum, it’s annoying for people to read phrases that aren’t from their culture, e.g. color instead of colour in the UK market. At worst, your application will seem unprofessional and even borderline illiterate.
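Variants are often handled with a fallback chain: look in the specific variant first, then the base language, then a default. This is a sketch under assumptions; the resource layout and the names fallback_chain and lookup are hypothetical, and most frameworks provide their own version of this.

```python
def fallback_chain(culture: str, default: str = "en") -> list[str]:
    """Return cultures to try in order, e.g. 'de-AT' -> ['de-AT', 'de', 'en']."""
    chain = [culture]
    if "-" in culture:
        chain.append(culture.split("-")[0])
    if default not in chain:
        chain.append(default)
    return chain

# Only the entries that differ need to be stored against a variant.
resources = {
    "en": {"colour": "Color", "january": "January"},
    "en-GB": {"colour": "Colour"},
    "de": {"colour": "Farbe", "january": "Januar"},
    "de-AT": {"january": "Jänner"},
}

def lookup(key: str, culture: str) -> str:
    """Return the most specific text available for the culture."""
    for candidate in fallback_chain(culture):
        text = resources.get(candidate, {}).get(key)
        if text is not None:
            return text
    return key  # last resort: show the key so the gap is visible
```

An Austrian user gets the Austrian word for January but shares every other German string, so the variant costs only as many translations as it has differences.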
11. Don’t translate single words
Obvious maybe, but it just doesn’t work to translate single words and stitch them together. Slightly less obviously, your units of translation may have to be larger than you initially think – sentences rather than phrases, or even whole passages of text.
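In practice this usually means translating whole sentences with named placeholders, so the translator controls the word order. A minimal sketch follows; the templates and the function name deleted_message are hypothetical, and the German sentence is illustrative rather than checked by a translator.

```python
# One complete sentence per culture. The placeholders can move freely.
templates = {
    "en": "{count} files were deleted from the folder {folder}.",
    "de": "Aus dem Ordner {folder} wurden {count} Dateien gelöscht.",
}

def deleted_message(culture: str, count: int, folder: str) -> str:
    """Fill the culture's sentence template with the runtime values."""
    return templates[culture].format(count=count, folder=folder)

english = deleted_message("en", 3, "Reports")
german = deleted_message("de", 3, "Berichte")
```

The folder name comes first in the German sentence and last in the English one, which is exactly what stitching together separately translated words cannot reproduce.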
12. Don’t duplicate translations
This conflicts with a lot of other tips but if possible don’t duplicate translations. You’ll have a list (xml, json, csv, excel etc…) somewhere with all the terms you are translating. If you have 400 line items all of which are ‘Save’ you will delight your external translator. They will translate it once, copy it down 400 lines then charge you for 400 translations. Try to avoid delighting your external translator in this way.
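One way to avoid paying for the same word 400 times is to collapse identical source strings before the list goes out, then fan the results back out afterwards. This is a sketch with hypothetical key names and helper functions.

```python
# Hypothetical UI strings: resource key -> English source text.
ui_strings = {
    "orders.save_button": "Save",
    "profile.save_button": "Save",
    "orders.cancel_button": "Cancel",
}

def unique_sources(strings):
    """Map each distinct source text to the keys that use it."""
    grouped = {}
    for key, text in strings.items():
        grouped.setdefault(text, []).append(key)
    return grouped

def apply_translations(grouped, translated):
    """Copy each translated text back to every key that shares the source."""
    return {
        key: translated[text]
        for text, keys in grouped.items()
        for key in keys
    }

grouped = unique_sources(ui_strings)  # send list(grouped) to the translator
german = apply_translations(grouped, {"Save": "Speichern", "Cancel": "Abbrechen"})
```

This is where the conflict with other tips shows up: the same English word can need different translations in different contexts, so some duplicates are deliberate and should be kept apart.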
13. Invest in a translation tool
Or perhaps don’t. It’s a big undertaking to localise an application and keep all the translations up to date so take all the help you can get. Consider buying in any third-party product you think can help you.
However, when I looked there wasn’t anything amazing. We bought one and it did help but it wasn’t a panacea. I would look though and see if there is anything that fits your organisation’s requirements. You might have to code up a translation tool yourself.
14. Translate installation files
If you’ve got installation files, msi or other utilities you’ll want to translate those. At the very least be aware that you haven’t translated them, so you aren’t shocked when the first thing the client sees is an installer rattling away in English.
15. Date format
Date formats change depending on locality e.g. dd-mm-YYYY for the UK, mm-dd-YYYY for the US and YYYY-mm-dd for China. Make sure yours change too.
More subtly, make sure you don’t accidentally translate formatting strings. For instance, the formatting string in
MyDate.ToString("dd/MM/yyyy");
if translated to German by Google Translate, would become
MyDate.ToString("TT/MM/JJJJ");
which is nonsense as far as string formatting is concerned and will break your application.
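One way to keep formatting strings safe is to hold them in code, keyed by culture, and never put them in the files that go to translators. The sketch below uses Python’s strftime directives rather than the .NET patterns above. The table name DATE_PATTERNS and the function format_date are hypothetical, and the three patterns simply mirror the formats listed in this tip.

```python
from datetime import date

# Format patterns live in code, not in the translatable resources.
DATE_PATTERNS = {
    "en-GB": "%d-%m-%Y",
    "en-US": "%m-%d-%Y",
    "zh-CN": "%Y-%m-%d",
}

def format_date(value: date, culture: str) -> str:
    """Format a date using the pattern registered for the culture."""
    return value.strftime(DATE_PATTERNS[culture])

sample = date(2024, 3, 7)
uk_text = format_date(sample, "en-GB")
us_text = format_date(sample, "en-US")
cn_text = format_date(sample, "zh-CN")
```

The same date reads as 7 March in the UK output and could be mistaken for 3 July by a UK reader looking at the US output, which is why the pattern has to follow the culture.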
16. Text on images
This is an amusing surprise – even when your UI, every database and every external data source is translated, and you are unveiling the application, there will be an image somewhere with English text on it. Maybe it’s a shop or a road sign, or maybe someone has photoshopped a great swathe of text over a large image.
Your images need to be culture neutral or be able to be swapped out for culture appropriate ones.
17. Job Titles
An interesting case is job titles. A job title in your home language might be entirely non-existent in your target language. You can’t translate something if it doesn’t exist.
As a broader point – concepts may differ or change between cultures in non-obvious ways. Official documents will be different and how people identify themselves might be different. Even things like date of birth might have to become optional or be able to support ranges. There are places in the world where this information might not be available.
Testing
18. Your automated tests might break
So, you’ve spent a heap of time creating an automated suite with a truly astonishing test coverage. Well, they might break in your localised application. Behaviour driven testing and UI testing are particularly vulnerable to this but even the humble unit test could start to blink red at you. Factor this into your planning.
19. Don’t translate system error information and logging
Keep your logs and error messages in the language you actually speak. Your support staff will not thank you if faced with a 30 MB error log file in Hungarian.
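One way to do this is to log a fixed English message with the details as fields, and translate only what the user sees. This is a sketch, assuming user-facing messages are still localised; the message table, the logger name and the function report_save_failure are hypothetical, and the German sentence is illustrative.

```python
import logging

# Only the text shown to the user is translated.
USER_MESSAGES = {
    "en": {"save_failed": "Your changes could not be saved."},
    "de": {"save_failed": "Ihre Änderungen konnten nicht gespeichert werden."},
}

log = logging.getLogger("app")  # hypothetical logger name

def report_save_failure(culture: str, record_id: int) -> str:
    """Log in English for support staff, return localised text for the user."""
    log.error("save_failed record_id=%s culture=%s", record_id, culture)
    messages = USER_MESSAGES.get(culture, USER_MESSAGES["en"])
    return messages["save_failed"]
```

The log line looks the same whatever language the user is working in, so support staff can search for save_failed across every installation.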
20. Consider using test environments with language specific OS/Database etc…
Not mandatory by any means but you might want to consider setting up your test environment to have the target language for its operating system, databases and all other software. You might detect some hard-to-find bugs before it hits production.
Balanced against that, it might be a pain to maintain your own on-premise test servers in a radically different language. But with infrastructure on demand, it should be less of an issue and worth thinking about.
21. Account for increase in test time
Each language you add, even a language variant, will add to your test effort. Budget for it and don’t overpromise in your project estimates.
22. Employ testers with good language skills
Good language skills are useful anywhere in the team, but they are really useful in testing. Also, if your testers are good then you might be able to use fewer external translators who, as I’ve said before, can be expensive.
Broadly, it’s useful to speak the target language to some extent but, scarily, not actually necessary. I’m pretty poor at languages (high school Spanish and that’s it) but I’ve worked with French and German drug databases for years. Other than the odd embarrassing bug (an entire section of data in the wrong language that I didn’t notice), it hasn’t made much of a difference. But then I’m not a tester.
Final Words
23. Other things to watch for
There are many, many other things to bear in mind. Here are a few more:
- Number separators – different in different cultures. Can your application support a comma as the decimal separator?
- Currency symbols – need to be swappable and correct
- Post (zip) codes – even within very similar cultures they are radically different. The Australian postcode is 4 digits and the UK postcode is alphanumeric and up to 8 characters long with a space. Make sure your form validation can cope
- International telephone numbers – might be formatted differently
- Sorting might break – it did for me.
- Gender in languages – get this wrong and your application looks like it’s been translated by a 5-year-old.
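Two items from the list above, number separators and postcodes, can be sketched in code. Everything here is an assumption for illustration: the separator table, the function names, and especially the UK pattern, which is a simplified shape check rather than the official postcode rules.

```python
import re
from decimal import Decimal

# Hypothetical per-culture separator table.
SEPARATORS = {
    "en-GB": {"decimal": ".", "group": ","},
    "de-DE": {"decimal": ",", "group": "."},
}

def parse_decimal(text: str, culture: str) -> Decimal:
    """Parse user input using the culture's decimal and grouping separators."""
    sep = SEPARATORS[culture]
    cleaned = text.replace(sep["group"], "").replace(sep["decimal"], ".")
    return Decimal(cleaned)

# Simplified shape checks only, not authoritative validation.
POSTCODE_PATTERNS = {
    "AU": re.compile(r"\d{4}"),
    "GB": re.compile(r"[A-Z]{1,2}\d[A-Z\d]? \d[A-Z]{2}"),
}

def is_valid_postcode(text: str, country: str) -> bool:
    """Check the postcode against the pattern for its country."""
    pattern = POSTCODE_PATTERNS[country]
    return pattern.fullmatch(text.strip().upper()) is not None
```

The point is that both the parsing and the validation take the culture or country as an input, instead of assuming the rules you grew up with.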
24. Google Translate is your friend
Google Translate is just an amazing help during initial development. Pump in all your English text and get provisional translations out.
This is really handy for spotting areas in your application that you have forgotten to translate, and it gives a good indication what the difference in text length is doing to your UI. Also, if you have problems with your encoding then you’ll see it straightaway when faced with weird characters in your text. You just get an early view of what your application looks like in the target language which is invaluable.
25. Google Translate is not your friend
But do not rely on Google Translate for your final translations and be very careful not to inadvertently leave it in the final release.
We had much hilarity from a client when skin peel (medical) was translated as lemon peel (cooking). They thought it was funny. Your client might think it’s woefully unprofessional. Don’t take the risk.
26. It’s a lifelong task
Translating an application is a pretty sizable task. Once done, there will be further releases and each one will have more stuff to translate. You need a good method to work the translations into your processes. The task will never end. Embrace it.
Good Luck
Not a tip – just good luck, buona fortuna, bonne chance and buena suerte* with your localisation endeavours. You are doing a good thing.
*All translations were provided by Google Translate and are provisional