Localization with OmegaT - beyond ultimate solution

Last modified by tms320c on Sat, November 5, 2011 12:13

The mystery with OmegaT and Magento localization has been here for quite a long time.

This article was inspired by, and is a continuation of, the article “The ultimate guide to translating magento using translation memory software”, which describes an OmegaT plugin.

This document presents a complete OmegaT application with a built-in Magento CE CSV text filter.

The build is based on the most recent version of the OmegaT translation memory software (2.0.3_1 at the time of writing) and even implements alignment functionality. This means that previously translated CSV files can be used to create TMX data that is immediately usable in a new translation project.

The software has been tested in the field: all ru_RU localization files for Magento CE and versions have been successfully created using exactly this build. The initial translation memory was created by aligning against the 1.4 translation (after manually cleaning out untranslated segments with the help of Notepad++).

A detailed description can be found in OmegaT for Magento localization, along with links to the binary and source code downloads.

The key points are:

  • The OmegaT for Magento binary is a JAR file, so you need a JRE installed.
  • Download and install the latest OmegaT. The Magento JAR can be used on its own, but it includes neither documentation nor plugins. Copy the JAR file to the OmegaT directory and run it from there.
  • If you want to create a TMX for initial automatic translation, carefully clean up the existing translation. All untranslated strings must be filtered out; otherwise they can be marked as “translations that are equal to original text” during alignment.
  • I recommend switching off sentence-level segmentation.
  • The filter automatically detects the encoding of source files and uses UTF-8 for output files by default. This can be configured in the same way as for any other OmegaT text filter. RTFM.
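The cleanup step is easy to get wrong by hand, so here is a minimal Python sketch of it. It assumes the usual two-column Magento locale CSV ("source","translation") and drops rows whose translation is empty or identical to the source. The names `drop_untranslated`, `clean_csv_text` and the `KEEP_IDENTICAL` allowlist are made up for this illustration and are not part of OmegaT or Magento; you would have to fill the allowlist yourself with strings such as “HTML” that are legitimately the same in both languages.

```python
import csv
import io

# Hypothetical allowlist: source strings whose translation is
# legitimately identical to the original text.
KEEP_IDENTICAL = {"HTML"}


def drop_untranslated(rows, keep_identical=KEEP_IDENTICAL):
    """Return only the rows that look translated.

    A row survives when its translation is non-empty and differs from
    the source, or when the source string is on the allowlist.
    """
    kept = []
    for row in rows:
        if len(row) < 2:
            continue  # blank or malformed line: nothing to align
        source, target = row[0], row[1]
        if not target.strip():
            continue  # empty translation
        if source == target and source not in keep_identical:
            continue  # untranslated copy of the source
        kept.append(row)
    return kept


def clean_csv_text(text, keep_identical=KEEP_IDENTICAL):
    """Filter CSV text and return it with every field quoted."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(drop_untranslated(reader, keep_identical))
    return out.getvalue()
```

To use it on real files, read each translated CSV as UTF-8, pass the text through `clean_csv_text`, and write the result to your alignment folder. Review the output anyway: the sketch cannot tell an intentional identical translation from a forgotten one unless the string is on the allowlist.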

The alignment process is rather simple and fast:

  • Prepare clean and neat translated CSV files (remove ALL untranslated rows, except rows like “HTML”,”HTML” where the translation is naturally equal to the original text) and put them into a folder such as “/translatedFiles/” (the name used in the command below).
  • Create a new project, import the source files from app/locale/en_US, set the project options as you see fit (again, I recommend switching off sentence-level segmentation), make sure that the Magento CSV filter is enabled, and save the project.

After that, run the following command (from the command line):

  java -jar OmegaT-magento.jar --mode=console-align /projectDir --alignDir=/translatedFiles/

where “/projectDir” is the root directory of your new project.

Upon successful completion you should find a new file “/projectDir/omegat/align.tmx”, which should be moved to “/projectDir/tm/auto” (if you want automatic translation).
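If you repeat this for several projects, the command and the move can be wrapped in a small script. The following is only a sketch: it reuses the exact command line and paths given above, while the function names (`build_align_command`, `install_align_tmx`, `align_and_install`) are invented for this example. It also assumes that `java` is on your PATH and that the JAR sits in the current directory.

```python
import shutil
import subprocess
from pathlib import Path


def build_align_command(project_dir, align_dir, jar="OmegaT-magento.jar"):
    """Build the console-align command line described in this article."""
    return [
        "java", "-jar", str(jar),
        "--mode=console-align", str(project_dir),
        f"--alignDir={align_dir}",
    ]


def install_align_tmx(project_dir):
    """Move omegat/align.tmx into tm/auto and return the new path."""
    project = Path(project_dir)
    source = project / "omegat" / "align.tmx"
    if not source.is_file():
        raise FileNotFoundError(f"alignment output not found: {source}")
    auto_dir = project / "tm" / "auto"
    auto_dir.mkdir(parents=True, exist_ok=True)  # create tm/auto if missing
    target = auto_dir / source.name
    shutil.move(str(source), str(target))
    return target


def align_and_install(project_dir, align_dir, jar="OmegaT-magento.jar"):
    """Run the alignment, then install the resulting TMX."""
    # check=True raises CalledProcessError if java exits with an error,
    # so a failed alignment never leads to a move.
    subprocess.run(build_align_command(project_dir, align_dir, jar), check=True)
    return install_align_tmx(project_dir)
```

Calling `align_and_install("/projectDir", "/translatedFiles/")` would then do both steps in one go. Keeping the move separate from the run makes it easy to inspect align.tmx first and only call `install_align_tmx` once you are happy with it.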

Reload your project. Mission complete.

A final note and warning. The alignment functionality is robust but, well, a little too simple-minded. If you miss untranslated strings in your CSV files, you will most probably find all that garbage in your target files and in your project TMX, and OmegaT will believe that all of it is properly translated. Cleaning this up is a paragon of a humanly unbearable job. A fresh start is the only rescue option.
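One way to catch the problem early is to inspect align.tmx before moving it into tm/auto. The sketch below assumes the standard TMX layout (`tu` elements containing `tuv` elements, each with a `seg`) and lists every translation unit whose segments are all identical. The function name `identical_units` is made up for this example; it is not an OmegaT feature.

```python
import xml.etree.ElementTree as ET


def identical_units(root):
    """Return the text of every translation unit whose segments all match.

    `root` is the parsed <tmx> element. Units with fewer than two
    segments are ignored, since there is nothing to compare.
    """
    suspects = []
    for tu in root.iter("tu"):
        # itertext() also collects text nested inside inline tags.
        segs = ["".join(seg.itertext()) for seg in tu.iter("seg")]
        if len(segs) >= 2 and len(set(segs)) == 1:
            suspects.append(segs[0])
    return suspects
```

For a real file, something like `identical_units(ET.parse("/projectDir/omegat/align.tmx").getroot())` gives you the list to review. Entries such as “HTML” are expected there; anything else is probably an untranslated string that slipped through the cleanup.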