Guide to Translation Memory (TM)
August 22, 2013
A translation memory is a database that keeps track of translations per language.
Translation memory (TM) is a database of previously translated strings or sentences, together with the software that lets translators reuse them. All previous translations are accumulated in the TM as source and target language pairs, called translation units, and reused so that you never have to translate the same sentence twice. The more you build up your translation memory, the faster you can translate content.
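To make the idea of translation units concrete, here is a minimal sketch in Python of a TM for a single language pair. The class and method names (TranslationMemory, add, lookup) are hypothetical and chosen only for illustration; real TM tools store far more metadata per unit.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TranslationMemory:
    """Toy TM for one language pair (hypothetical structure, for illustration)."""

    source_lang: str
    target_lang: str
    # Each entry is one translation unit: source segment -> target segment.
    units: dict[str, str] = field(default_factory=dict)

    def add(self, source: str, target: str) -> None:
        # Store (or overwrite) a translation unit.
        self.units[source] = target

    def lookup(self, source: str) -> Optional[str]:
        # Return the stored translation, or None if the segment is new.
        return self.units.get(source)


tm = TranslationMemory("en", "es")
tm.add("Save your changes.", "Guarde los cambios.")

print(tm.lookup("Save your changes."))     # Guarde los cambios.
print(tm.lookup("Discard your changes."))  # None
```

Once a segment has been stored, the same sentence never has to be translated again; it only has to be looked up.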
How does a translation memory work?
Translation memory works at the sentence level. When using TM, the source document is broken down into its component sentences or segments. The term segment is used because in some cases, a chunk of text may not be a complete sentence, for example, in the case of headings. The segment is the smallest unit of text that can be reused when working with TM. Smaller units of text, such as individual words, are not used, because they may occur in different contexts and thus require different translations, and word-for-word translation generally does not produce usable results.
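As a rough illustration of segmentation, the Python sketch below splits a document into segments at line breaks and after sentence-ending punctuation. The function name segment and the splitting rule are assumptions made for this example; production tools use configurable, language-specific rules and handle abbreviations, numbers, and markup, which this sketch does not.

```python
from __future__ import annotations

import re


def segment(text: str) -> list[str]:
    """Split text into segments (hypothetical, simplified rule set)."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Break after '.', '!' or '?' when followed by whitespace.
        # A line with no such punctuation (a heading, for example)
        # stays as a single segment.
        parts = re.split(r"(?<=[.!?])\s+", line)
        segments.extend(part for part in parts if part)
    return segments


doc = "Getting Started\nInstall the software. Restart your computer."
print(segment(doc))
# ['Getting Started', 'Install the software.', 'Restart your computer.']
```

Note that the heading "Getting Started" becomes a segment of its own even though it is not a complete sentence.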
Repetitions, 100% Matches, and Fuzzy Matches
As the translator works, each segment for translation is compared to what is stored in the TM, and matches are presented to the translator automatically. A segment in the TM that is identical to the segment for translation is considered a 100% match. At some point in the past, this exact segment was encountered and a translation was provided and stored in the TM. In theory, it can be used exactly as is. If there is no exact match, but there are segments in the TM that are similar to the one being translated, then these are presented as fuzzy matches. Each is ranked by a percentage ranging from 0% to 99%, where a higher percentage means the match is closer in content to the sentence being translated. A 99% match might differ only in a single letter or punctuation mark, whereas a 75% match might have several different words. Generally, matches below the 70% mark are not useful.
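The lookup described above can be sketched in Python as follows. This is only an approximation: it scores similarity with the standard library's difflib.SequenceMatcher, which is a stand-in and not the algorithm any particular TM tool uses. The function names match_percent and best_match and the sample TM contents are hypothetical; the default threshold of 70 reflects the rule of thumb mentioned above.

```python
from difflib import SequenceMatcher


def match_percent(a: str, b: str) -> int:
    """Similarity of two segments as a whole percentage (rounded down).

    Rounding down means only identical segments score 100.
    """
    return int(SequenceMatcher(None, a, b).ratio() * 100)


def best_match(segment, tm, threshold=70):
    """Return (score, source, target) for the closest TM entry, or None."""
    best = None
    for source, target in tm.items():
        score = match_percent(segment, source)
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


# Hypothetical TM contents: source segment -> stored translation.
tm = {
    "Click the Save button.": "Haga clic en el botón Guardar.",
    "Restart your computer.": "Reinicie el equipo.",
}

print(best_match("Click the Save button.", tm))  # exact match, score 100
print(best_match("Click the Save button!", tm))  # fuzzy match, score below 100
print(best_match("Zzz", tm))                     # None: nothing above threshold
```

A fuzzy match still returns the stored translation, but the translator has to edit it to account for the differences.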
When a document contains several identical segments that are not currently in the TM, these segments are known as repetitions. Most translation memory tools can identify potential repetitions before translation begins. The advantage of repetitions is that after the first occurrence is translated, the rest will become 100% matches. As the translator works, each newly translated sentence is added to the TM. Thus, that new sentence can become a 100% match or even a fuzzy match for other sentences in the document. In other words, repetitions are segments that have the potential to become 100% matches.
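A minimal Python sketch of how repetitions could be spotted before translation begins is shown below. It assumes the document has already been split into segments and that the first occurrence of an unseen segment counts as new while later identical occurrences count as repetitions; tools differ in how they count the first occurrence. The function name classify and the sample data are hypothetical.

```python
def classify(segments, tm_sources):
    """Label each segment as '100% match', 'repetition', or 'new'."""
    seen = set()  # segments already encountered in this document
    labels = []
    for seg in segments:
        if seg in tm_sources:
            labels.append("100% match")   # already stored in the TM
        elif seg in seen:
            labels.append("repetition")   # identical to an earlier new segment
        else:
            labels.append("new")          # first occurrence, not in the TM
            seen.add(seg)
    return labels


segments = [
    "Click OK.",
    "Enter your name.",
    "Click OK.",
    "Click Cancel.",
    "Click OK.",
]
tm_sources = {"Click Cancel."}  # hypothetical existing TM content

print(classify(segments, tm_sources))
# ['new', 'new', 'repetition', '100% match', 'repetition']
```

Once the first "Click OK." is translated and stored, the two later occurrences would be served from the TM as 100% matches.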
How do we use the TM during the localization process?
Before translation begins, the file is analyzed against the TM. This analysis reports the file's total word count and the number of words that fall into repetitions, 100% matches, and fuzzy matches. The report is also called a TM breakdown, because the total word count of the file is broken down into several TM match categories. For example, a common breakdown is:
- New Words
- Repetitions / 100% matches
- 95% - 99% matches
- 85% - 94% matches
- 75% - 84% matches
Each category may have its own price or discount. 100% matches and repetitions are generally less expensive, as they need little effort to be translated, while the lower percentage matches require more.
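The breakdown and its effect on price can be sketched in Python as follows. Everything numeric here is an assumption made for illustration: the rate factors are hypothetical and not actual pricing, segments scoring below 75% are counted as new words, and words are counted by splitting on whitespace. The function names category, breakdown, and cost are likewise hypothetical.

```python
from collections import Counter

# Hypothetical fraction of the full per-word rate charged in each category.
RATE_FACTORS = {
    "New Words": 1.00,
    "Repetitions / 100% matches": 0.25,
    "95% - 99% matches": 0.40,
    "85% - 94% matches": 0.60,
    "75% - 84% matches": 0.80,
}


def category(score, is_repetition=False):
    """Map a best-match percentage to a breakdown category."""
    if is_repetition or score == 100:
        return "Repetitions / 100% matches"
    if score >= 95:
        return "95% - 99% matches"
    if score >= 85:
        return "85% - 94% matches"
    if score >= 75:
        return "75% - 84% matches"
    return "New Words"  # assumption: anything below 75% counts as new


def breakdown(analyzed):
    """Sum word counts per category.

    analyzed: iterable of (segment_text, best_score, is_repetition).
    """
    counts = Counter()
    for text, score, is_repetition in analyzed:
        counts[category(score, is_repetition)] += len(text.split())
    return counts


def cost(counts, rate_per_word):
    """Weighted cost of a breakdown at a given full per-word rate."""
    return sum(
        words * rate_per_word * RATE_FACTORS[cat]
        for cat, words in counts.items()
    )


# Hypothetical analysis result for a four-segment file.
analyzed = [
    ("Click the Save button.", 100, False),
    ("Enter your user name and password.", 0, False),
    ("Restart your computer now.", 88, False),
    ("Enter your user name and password.", 0, True),
]

counts = breakdown(analyzed)
for cat, words in counts.items():
    print(f"{cat}: {words} words")
print(f"Estimated cost: {cost(counts, rate_per_word=0.20):.2f}")
```

With these made-up numbers, 10 of the 20 words fall into the cheapest category, which is where the savings from a TM come from.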
Why keep a translation memory?
We use translation memories to reduce translation costs for our clients and to deliver faster turnaround times. A TM also helps keep translations consistent: because it automatically suggests matches to the translator while they work, the translator is more likely to use terminology and phrases consistent with previous translations, which increases quality.
Useful resources on translation and localization services
Globalization Partners International (GPI) has created a series of blogs and website resource pages to help you understand key concepts and vocabulary used in the translation and localization process:
- Planning a Website Localization Project? Where do I start?
- Localization Engineers - what do they do?
- Importance of Client Review Cycle in Translation
- 3 Metrics for Measuring Language Translation Quality
- What Should You Expect From Your Localization Partner?
You may contact GPI at firstname.lastname@example.org or at 866-272-5874 with your specific questions about translation memories and their use in projects and accounts. You may also request a Translation Quote for your project.
Federico Pascual - Director: Digital Delivery Services
Federico has over 12 years' experience as a globalization engineer managing a wide range of software and website globalization projects (internationalization I18n + localization L10n). His expertise spans software and website internationalization and localization processes, standards and tools as well as locale specific SEO. Federico has completed hundreds of successful globalization engagements serving as lead I18n architect involving different programming languages. He is a certified developer in several content management systems and helps clients create world-ready applications, utilizing development practices that are faster, more economical, and more localization-friendly.