Preparing Source Files for Language Translation

September 05, 2011

In several previous blogs, GPI has referred to "source language" files or "source files." For clients new to language translation services, the biggest surprise may be the number of source files that a complex project, such as an extensive website localization project, can require. But just what is a source file?

In most instances, source files will refer to project assets/files which contain text that must be translated and localized. This could include graphics files (e.g. Adobe Illustrator or Adobe Photoshop) which contain editable text layers. The most common examples of a text-based source file would be a Microsoft Word file, a FrameMaker file or an XML file. In this blog we will explore some of the pre-translation and post-translation file preparation steps that are necessary to complete the translation process.

Source Files for Document Translation

Documentation source formats for language translation can include different types of files, such as Word, Excel, PowerPoint, InDesign, FrameMaker and plain text (e.g. comma-separated text files). How your language translation services provider prepares source files for document translation depends a great deal on the file format.

Some document file formats require little or no pre-translation preparation by engineering resources, and are essentially "ready to go." For example, a Word document can be prepared for translation in its native format, unless the project has special instructions for particular text or for externalizing tags.

File format complexity for document translation varies

More complex document file formats like Adobe InDesign and unstructured Adobe FrameMaker cannot be prepared for translation in their native, binary format. InDesign documents must first be exported to the .INX format before being further modified to work with the language translation software used by linguists. Similarly, unstructured FrameMaker files must be saved as .MIF before being further modified for translation. With regular FrameMaker documents, your language translation services company must also take some further steps, such as turning off change bars and hyphenation.

Preparation uses different tools depending on the file format, such as Trados TagEditor for formats like PowerPoint or Excel and Trados S-Tagger for unstructured FrameMaker.

The preparation process also includes analyzing source files using a tool like Trados Workbench. An accurate analysis of documentation source files depends on how thoroughly the files were prepared against the project requirements.

Post translation file processing steps

After translation, document files go through another round of linguistic engineering preparation before further steps towards a final deliverable can be taken. As with source file preparation, final preparation depends on the file format. In this step, intermediate bilingual files (which still display the source and target language) need to be reviewed and prepared so that they can be delivered in the same format as the original document source file. TagEditor and S-Tagger are also among the tools used in this final process.

The role of Translation Memory (TM)

Translation Memory is a very important tool used during source file preparation, during the translation process itself and during post-translation file processing.

Translation Memory is a database that stores so-called "segments", which can be sentences or sentence-like units (headings, titles or elements in a list) that have previously been translated. Translation memory may be product line- or project-specific.

During analysis, the Translation Memories are used to calculate the word count for new and previously translated segments. Updating Translation Memories is a very important step after translation. Over time, Translation Memories decrease translation costs, as more and more previously translated segments are stored and accessed in future projects.
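To make the analysis step more concrete, here is a minimal Python sketch of how a word count could be split between exact matches, near ("fuzzy") matches and new segments. It is an illustration only, not how Trados Workbench works internally: the sentence splitter is naive, the similarity measure comes from Python's standard difflib module, and the function names, the 0.75 threshold and the sample memory are all assumptions made for the example.

```python
import re
from difflib import SequenceMatcher


def split_segments(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def analyze(text, memory, fuzzy_threshold=0.75):
    """Count source words per match category against a translation memory.

    `memory` is a hypothetical TM: a dict that maps previously translated
    source segments to their stored translations.
    """
    counts = {"exact": 0, "fuzzy": 0, "new": 0}
    for segment in split_segments(text):
        words = len(segment.split())
        if segment in memory:
            counts["exact"] += words
            continue
        # Best similarity between this segment and any stored segment.
        best = max(
            (SequenceMatcher(None, segment, known).ratio() for known in memory),
            default=0.0,
        )
        counts["fuzzy" if best >= fuzzy_threshold else "new"] += words
    return counts


# Illustrative memory and source text (invented for this sketch).
memory = {
    "Click the Save button.": "Haga clic en el botón Guardar.",
    "Close the file.": "Cierre el archivo.",
}
report = analyze(
    "Click the Save button. Click the Print button. Restart your computer now.",
    memory,
)
print(report)
```

In this sample, the first sentence is already stored in the memory, the second differs from a stored segment by one word, and the third has no close counterpart, so each lands in a different category. Adding the newly translated segments to the memory after the project is what lets later analyses report fewer new words.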

Source files for website translation

For website localization, the same preparation tools (e.g. TagEditor) can be used with source files, but additional steps are needed in the pre-translation and final preparation stages. Depending on the website's programming language, website files should be prepared by externalizing tags and embedded code, such as HTML tags and C++, JavaScript or PHP code.
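As a rough illustration of what externalizing tags means, the Python sketch below swaps each HTML tag for a numbered placeholder so that only the translatable text is exposed, then restores the tags after translation. This is a simplified assumption of the idea, not the behavior of TagEditor or any other commercial tool: the regular expression only recognizes simple tags (not scripts or embedded PHP), and the function names and the `{1}` placeholder style are invented for the example.

```python
import re

# Matches simple HTML tags only; real files need a proper parser.
TAG_PATTERN = re.compile(r"<[^>]+>")


def externalize_tags(markup):
    """Replace each tag with a numbered placeholder.

    Returns the protected text and the list of original tags.
    """
    tags = []

    def stash(match):
        tags.append(match.group(0))
        return "{%d}" % len(tags)

    return TAG_PATTERN.sub(stash, markup), tags


def restore_tags(text, tags):
    """Put the original tags back in place of their placeholders."""
    return re.sub(r"\{(\d+)\}", lambda m: tags[int(m.group(1)) - 1], text)


source = '<p>Read the <a href="/guide.html">user guide</a> first.</p>'
protected, tags = externalize_tags(source)
print(protected)  # the translator sees text and placeholders, not markup

# A translator may reorder words but keeps the placeholders intact.
translated = "{1}Lea primero la {2}guía del usuario{3}.{4}"
result = restore_tags(translated, tags)
print(result)
```

Because the link address and other attributes never appear in the text handed to the translator, they cannot be altered by accident, and the placeholders can move with the words they wrap when the target language uses a different word order.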

Also, an overall website analysis is important to ensure that the application supports the target language requirements, such as right-to-left languages like Arabic. Final preparation of translated website source files includes a few more steps, such as reviewing the internal tags and code to make sure that they were not changed during the translation process. A final Quality Assurance (QA) step is also needed to complete an overall review in the target languages.
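The tag review described above can be partly automated. The following Python sketch, which is an assumption about one possible check rather than a description of GPI's process, compares the tags found in a source string with those in its translation and reports any that went missing or appeared unexpectedly. The function name and sample strings are hypothetical, and the simple regular expression does not cover scripts or server-side code.

```python
import re
from collections import Counter

# Matches simple HTML tags only; real files need a proper parser.
TAG_PATTERN = re.compile(r"<[^>]+>")


def tag_differences(source_markup, translated_markup):
    """Return (missing, unexpected) tags in the translation.

    Tags are compared as multisets, so a translation may reorder tags
    (word order differs between languages) without being flagged.
    """
    source_tags = Counter(TAG_PATTERN.findall(source_markup))
    target_tags = Counter(TAG_PATTERN.findall(translated_markup))
    missing = list((source_tags - target_tags).elements())
    unexpected = list((target_tags - source_tags).elements())
    return missing, unexpected


source = "<p>Contact <b>support</b> today.</p>"
translated = "<p>Contacte con <b>soporte hoy.</p>"  # closing </b> was lost

missing, unexpected = tag_differences(source, translated)
print("Missing tags:", missing)
print("Unexpected tags:", unexpected)
```

A check like this only flags candidates for review. Some differences are intentional, for example a direction attribute changed for a right-to-left target language, so a person still needs to confirm each reported difference during QA.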

Additional resources on globalization and internationalization issues

To further understand the entire globalization process, you should download our PDF "Document Globalization Guide." You may also benefit from three of our previous blogs, "Why Internationalize Your Code Base," "12 Steps to Website Globalization" and "Software Translation, Software Localization and Software".

Each software globalization project is unique. GPI will be happy to assist you. Request a Quote online, or contact GPI at 866-272-5874 with your specific questions about your target global markets and your project goals.



Ayman is a native Arabic speaker with extensive expertise in Arabic software and website localization. He has been a Microsoft Certified Professional (MCP) since 2001 and has earned several certifications, including MCSD, MCAD, MCTS and MCPD. He has over 12 years' experience in software and website engineering using Microsoft programming tools, including C#, ASP.NET, SQL Server and Visual Studio, as well as other technologies such as HTML, JavaScript, XML and Ajax.