Translate Toolkit & Pootle

Tools to help you make your software local

Process Information

Process information is information included in the files to record who performed which actions on the file and when, as well as possible goals for the file.

The goals of storing this information are:

  • To reduce the work by knowing what work was done in the prior phases (for example, a reviewer only needs to review translations done after the file was last reviewed).
  • To understand if new processes need to be performed (for example, if glossary information was included in an XLIFF file, and we can see that the glossary has been upgraded, then we need to run the process again to ensure that glossary information in the XLIFF file is correct).

Process information can be included in XLIFF, TMX and TBX files.

For the moment we only consider what should go inside XLIFF files.

Process information in XLIFF files

Process information in an XLIFF file might refer to the whole file (and in this case it will be placed in the <header> of the file) or to one of the translations, in which case it will be included inside the corresponding <trans-unit>.

Information that pertains to the whole file

Structure inside the XLIFF file (what the standard says)

The information must be placed in a <phase-group> element inside the <header>. Inside the <phase-group>, each piece of information must be included in a <phase> element.

The <phase> contains metadata about the tasks performed in a particular process. The optional phase-name attribute uniquely identifies the phase for reference within the file. The required process-name attribute identifies the kind of process the phase corresponds to; e.g. “proofreading”. The optional company-name attribute identifies the company performing the task. The optional tool-id attribute references the <tool> used in performing the task. The optional date attribute provides a timestamp indicating when the task was performed. The optional job-id attribute allows an ID to be assigned to the job. The optional contact-name, contact-email, and contact-phone attributes all refer to the person performing the task.

Required attributes:

phase-name, process-name. [ERRATA: phase-name is considered optional in the text above, but mandatory here]

Optional attributes:

company-name, tool, tool-id, date, job-id, contact-name, contact-email, contact-phone.

Contents:

Zero, one or more <note> elements.

+- <file>+
   +--- <header>?
   |    |
   |    +--- <skl>?
   |    |    |
   |    |    +--- (<internal-file> | <external-file>)1
   |    |
   |    +--- <phase-group>?
   |    |    |
   |    |    +--- <phase>+
   |    |         |
   |    |         +--- <note>*

(legend: 1 = one
       + = one or more
       ? = zero or one
       * = zero, one or more)
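
The structure above can be sketched in code. This is a minimal sketch using Python's standard xml.etree.ElementTree, assuming a namespace-free tree for brevity (real XLIFF files declare the XLIFF namespace). The helper name add_phase and the phase name "creation1" are hypothetical.

```python
import xml.etree.ElementTree as ET

def add_phase(header, phase_name, process_name, **optional):
    """Append a <phase> to the header's <phase-group>, creating the group if needed."""
    group = header.find("phase-group")
    if group is None:
        group = ET.SubElement(header, "phase-group")
    attrs = {"phase-name": phase_name, "process-name": process_name}
    # Optional attributes arrive as keywords, e.g. contact_name -> contact-name.
    for key, value in optional.items():
        attrs[key.replace("_", "-")] = value
    return ET.SubElement(group, "phase", attrs)

header = ET.Element("header")
phase = add_phase(header, "creation1", "Template Creation",
                  date="2006-01-25T21:06:00Z", contact_name="The Creator")
print(ET.tostring(header, encoding="unicode"))
```

The sketch appends the <phase-group> at the end of the <header>; it does not enforce the element order that the XLIFF schema expects.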

=== Information and phases that we have and might consider including ===

== Template XLIFF file creation phase ==

  • Date in which the template file was created
    • This information might be useful to ensure that we always upgrade template files with information from files that are older.
    • It is also useful to keep this date in a block of general information about the file.
  • Group project and version of the project the file belongs to
    • Important, to ensure that different versions of the project are not confused by the translator and uploaded to the wrong version in Pootle (for example, uploading OpenOffice HEAD to the location of OpenOffice 2.0.3)
  • Path inside the Project tree in which the file should be
    • This information might be useful when the file is uploaded back by a translator, to ensure that it does not replace another file of the same name that is somewhere else in the tree for the same project.
  • By whom it was created
    • Only for informational purposes, in case files for the same project could have different origins.
  process-name="Template Creation" 
  date = "2006-01-25T21:06:00Z" 
  contact-name="The Creator"
  x-project="OpenOffice 2.0.3"

During the creation of this phase, it is also interesting to count messages and include the count information in the file.

FIXME {Javier: Define which counters and how they are coded. All counters can be included and initialised to 0, or we can consider that counters that do not exist are 0, but this might be too much to assume sometimes}

== File import into Pootle phase ?? ==

  • Date in which the template file was integrated in Pootle
  • By whom it was integrated
    • I see no reason to include this information or to have a phase for this

== Instantiation and initialisation (Upgrade from an old version) ==

  • Date in which the file was initialised for this language
  • Version of the file (project name) from which info was taken to initialise
  • Information about phases in the file from which info was taken
  • Name and version of the tool that was used to do the upgrade
    • This whole set of information does not influence the process, but it is useful when the origin of the information needs to be tracked. The existence of the phase itself shows that a step was taken, and that the old project was probably removed from Pootle.

All the upgrade, translation, review and approval phases of the old project are copied right after the Creation phase of this file. Glossary and TM inclusion phases are not kept, as this information will not be copied from the file when upgrading. Finally, a new Upgrade phase is added.

  date = "2006-01-25T21:06:00Z" 
  x-prior-project="OpenOffice 2.0.2"
  tool="Translate Toolkit pot2po 0.9"
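
The carry-over of phases described above can be sketched as follows. This is a minimal sketch with a namespace-free tree; the function name carry_over_phases, the process-name values in KEPT and the phase names are assumptions, not names fixed by this page. It does not rename copied phases, so phase-name clashes between the two files would still need to be handled.

```python
import copy
import xml.etree.ElementTree as ET

# Assumed process-name values for the phases that are carried over.
KEPT = {"Upgrade", "Translation", "Review", "Approval"}

def carry_over_phases(old_header, new_header, upgrade_attrs):
    """Copy the kept phases of old_header into new_header, then add an Upgrade phase."""
    group = new_header.find("phase-group")
    if group is None:
        group = ET.SubElement(new_header, "phase-group")
    for phase in old_header.iter("phase"):
        # Creation and TM/glossary inclusion phases of the old file are skipped.
        if phase.get("process-name") in KEPT:
            group.append(copy.deepcopy(phase))
    attrs = {"process-name": "Upgrade"}
    attrs.update(upgrade_attrs)
    return ET.SubElement(group, "phase", attrs)

old = ET.fromstring(
    "<header><phase-group>"
    '<phase phase-name="creation1" process-name="Template Creation"/>'
    '<phase phase-name="tm1" process-name="TM and Glossary Inclusion"/>'
    '<phase phase-name="translation1" process-name="Translation"/>'
    "</phase-group></header>"
)
new = ET.fromstring(
    "<header><phase-group>"
    '<phase phase-name="creation2" process-name="Template Creation"/>'
    "</phase-group></header>"
)
carry_over_phases(old, new, {"phase-name": "upgrade1",
                             "date": "2006-01-25T21:06:00Z",
                             "x-prior-project": "OpenOffice 2.0.2"})
print([p.get("process-name") for p in new.iter("phase")])
```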

FIXME {Javier update counter information, adding counts of translated and fuzzy messages, as well as reviewed and approved messages. It might well be that the file was not yet reviewed or approved when the Upgrade took place}

== Inclusion of Glossary info in XLIFF phase ==

  • Date of the TBX file from which a glossary for this XLIFF file was created
  • Name of the TBX file

== Inclusion of TM info in XLIFF phase ==

  • Date of the TMX file used to populate this XLIFF file.
  • Name of the TMX file.

== Translation phase ==

  • Name and contact data of a translator who has worked on this file
  • When the file was last edited in a given phase

== Review phase ==

  • Name and contact data of a reviewer who has worked on this file
  • When the file was reviewed for the last time in a given phase

== Approval phase ==

  • Name and contact data of an approver who has worked on this file
  • When the file was approved for the last time in a given phase

== Uniqueness of phase names ==

All phase-name values in a file must be unique. This is especially important when an upgrade brings in phases from another file: the phase name of the Creation phase, for example, must not clash with any other phase name. Maybe the phase name should be the name of the phase followed by a number. This way the phase names will always be manageable.
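
The "name followed by a number" idea could look like this. This is a minimal sketch with a namespace-free tree; the function name unique_phase_name and the base names are hypothetical.

```python
import xml.etree.ElementTree as ET

def unique_phase_name(header, base):
    """Return base followed by the lowest number not yet used as a phase-name."""
    used = {p.get("phase-name") for p in header.iter("phase")}
    n = 1
    while f"{base}{n}" in used:
        n += 1
    return f"{base}{n}"

header = ET.fromstring(
    "<header><phase-group>"
    '<phase phase-name="translation1" process-name="Translation"/>'
    '<phase phase-name="translation2" process-name="Translation"/>'
    "</phase-group></header>"
)
print(unique_phase_name(header, "translation"))  # translation3
print(unique_phase_name(header, "review"))       # review1
```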

== Goal information and Accomplishment ==

It is unclear whether this information should be included in a phase:

  • Date for which the file is due (it might include a percentage of it being translated).
  • Maybe, a job ID.

==== Information that pertains to only one translation Unit ====

=== Structure inside the XLIFF file ===

The <phase-group> and <phase> elements do not exist inside the <trans-unit> element. It is, however, possible to reference a phase that is included in the <header> (for example, through the phase-name attribute of a <target>).
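
Following such a reference back to the header can be sketched as below. This is a minimal sketch with a namespace-free tree; the function name phase_of and the sample content are hypothetical.

```python
import xml.etree.ElementTree as ET

def phase_of(file_elem, element):
    """Return the header <phase> that element's phase-name points to, or None."""
    name = element.get("phase-name")
    if name is None:
        return None
    for phase in file_elem.iterfind("header/phase-group/phase"):
        if phase.get("phase-name") == name:
            return phase
    return None

doc = ET.fromstring(
    "<file><header><phase-group>"
    '<phase phase-name="translation1" process-name="Translation"'
    ' contact-name="Alberto Martinez" date="2006-01-25T21:06:00Z"/>'
    "</phase-group></header><body>"
    '<trans-unit id="1"><source>File</source>'
    '<target phase-name="translation1">Fichier</target></trans-unit>'
    "</body></file>"
)
target = doc.find("body/trans-unit/target")
print(phase_of(doc, target).get("contact-name"))
```

With this lookup, the translator's name and the date do not need to be repeated on every unit; they are read from the referenced phase.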

=== Information that we might want to include ===

  • Name of the translator
  • Date of the translation
  • Status of the translation
  • Errors found in the translation

==== Processes that create a new phase ====

  • Creation of Template
  • Instantiation
  • Inclusion of TM and glossary information
  • Translation
  • Review
  • Approval

=== Creation of Template ===

Fill in the header entries:

FIXME {DB what do we consider to be the minimal entries}

=== Instantiation of language specific XLIFF files ===

Copies of all the XLIFF Templates pertaining to this project are made for the language. We also initialise each file's global target-language.

<file target-language="af-ZA">
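
Instantiation can be sketched as a copy followed by setting the attribute. This is a minimal sketch with a namespace-free tree; the function name instantiate and the sample template are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

def instantiate(template_root, language):
    """Return a copy of the template tree with target-language set on every <file>."""
    root = copy.deepcopy(template_root)
    for file_elem in root.iter("file"):
        file_elem.set("target-language", language)
    return root

template = ET.fromstring(
    '<xliff version="1.1">'
    '<file source-language="en-US" datatype="plaintext"><body/></file>'
    "</xliff>"
)
afrikaans = instantiate(template, "af-ZA")
print(afrikaans.find("file").get("target-language"))
```

The template itself is left untouched, so the same template can be instantiated for each language in turn.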

=== Upgrading ===

After instantiating new XLIFF files from the XLIFF Templates for a specific language, if there are old, translated XLIFF files for the same project, we fold those translations into the new XLIFF files. We do not regard this as a phase because we are not adding new information that would be required by people in the process, e.g. what TM was used or who translated this.

=== Inclusion of TM and glossary information in an XLIFF file ===

  • Create a phase if the current phase is not “TM and Glossary Inclusion”. Otherwise just change the date, eliminate all the entries that have this phase name and run the TM algorithm again.
  process-name="TM and Glossary Inclusion" 
  date = "2006-01-25T21:06:00Z" 
  • Create the <alt-trans> units.

For each match that is considered interesting for the translator, create an <alt-trans> unit with all the information available in the TMX file.

  <source xml:lang="en-US">
    Source message, but only if the match is less than 100%
    Chevaliers de la table ronde, goutez mois si le vin est bon.
    <context context-type="x-openoffice">
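
The two steps above (eliminating the entries of an earlier run, then creating <alt-trans> units) can be sketched as follows. This is a minimal sketch with a namespace-free tree. The function names, the sample strings and the choice to record the phase on the <target> inside each <alt-trans> are assumptions, not decisions made on this page.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def clear_matches(trans_unit, phase_name):
    """Remove the <alt-trans> entries whose <target> belongs to phase_name."""
    for alt in trans_unit.findall("alt-trans"):
        target = alt.find("target")
        if target is not None and target.get("phase-name") == phase_name:
            trans_unit.remove(alt)

def add_match(trans_unit, source_text, target_text, quality, phase_name, lang):
    """Append one TM match as an <alt-trans>; quality is a percentage (0-100)."""
    alt = ET.SubElement(trans_unit, "alt-trans", {"match-quality": f"{quality}%"})
    if quality < 100:
        # The matched source is only included when the match is not exact.
        ET.SubElement(alt, "source").text = source_text
    target = ET.SubElement(alt, "target", {XML_LANG: lang, "phase-name": phase_name})
    target.text = target_text
    return alt

unit = ET.fromstring('<trans-unit id="1"><source>Open the file</source></trans-unit>')
add_match(unit, "Open a file", "Ouvrir un fichier", 85, "tm1", "fr")
add_match(unit, "Open the file", "Ouvrir le fichier", 100, "tm1", "fr")
print(len(unit.findall("alt-trans")))
clear_matches(unit, "tm1")
print(len(unit.findall("alt-trans")))
```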

FIXME {Javier: After we include the information, we count how many translations/exact-matches/fuzzy-matches the file includes, and we include them in the <count-group> of the header, attached to this phase.}

  • Parallel translations

If a translation team requires another reference language and has indicated this in its configuration, this parallel language is added as <alt-trans> items to the <trans-unit>. The user also configures the minimum match quality for parallel translations that should be included.

  <source xml:lang="en-US">
    xml:lang="fr" >

=== Translation ===

If the current phase does not match the data of the translator (phase-name and contact-name), then a new phase is created; otherwise the current (last) phase is considered as still active. In the latter case, the date of the phase is updated to the current date (we assume that we are interested in having the last date on which work on the file took place).

  date = "2006-01-25T21:06:00Z" 
  contact-name="Alberto Martinez"
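
The "reuse or create" rule can be sketched as one helper shared by the translation, review and approval steps. This is a minimal sketch with a namespace-free tree. It compares process-name and contact-name, which is an interpretation of "the data of the translator"; the function name current_phase and the generated phase names are hypothetical.

```python
import xml.etree.ElementTree as ET

def current_phase(header, process_name, contact_name, now):
    """Return the active phase for this person, creating a new one if needed."""
    group = header.find("phase-group")
    if group is None:
        group = ET.SubElement(header, "phase-group")
    phases = group.findall("phase")
    last = phases[-1] if phases else None
    if (last is not None
            and last.get("process-name") == process_name
            and last.get("contact-name") == contact_name):
        # Same kind of work by the same person: only refresh the date.
        last.set("date", now)
        return last
    name = f"{process_name.lower().replace(' ', '-')}{len(phases) + 1}"
    return ET.SubElement(group, "phase", {
        "phase-name": name,
        "process-name": process_name,
        "contact-name": contact_name,
        "date": now,
    })

header = ET.Element("header")
first = current_phase(header, "Translation", "Alberto Martinez", "2006-01-25T21:06:00Z")
again = current_phase(header, "Translation", "Alberto Martinez", "2006-01-26T09:00:00Z")
print(first is again, first.get("date"))
```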

The translation editor might be configured to place TM matches over X% in the <target>. These must be labeled in the target's state-qualifier attribute as exact-match, fuzzy-match or id-match, and these targets will be assigned to the translator's phase. These marks are only removed after review by the translator, and no fuzzy messages will be considered as translated. It is up to the translation editor to decide when a <target> is considered to be reviewed by the translator, and to remove the fuzzy-match or exact-match mark.

<target phase-name="xxx321" state="needs-translation|new" state-qualifier="fuzzy-match" ....

Each time the translator creates or modifies a <target>, the <target> must be associated to the current phase through the phase-name attribute and its state must be changed to “needs-review-translation” (it has been translated but not yet reviewed; once it is reviewed, the state will be changed to “translated”).
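
What an editor does on each edit can be sketched as below. This is a minimal sketch; the value is written to the XLIFF state attribute, the function name record_translation and the sample strings are hypothetical, and treating every edit as having checked the TM match is an assumption (the page leaves that decision to the editor).

```python
import xml.etree.ElementTree as ET

def record_translation(target, text, phase_name):
    """Store the translator's text and tag the <target> with the current phase."""
    target.text = text
    target.set("phase-name", phase_name)
    target.set("state", "needs-review-translation")
    # Assumption: an edit by the translator counts as having checked the TM match.
    target.attrib.pop("state-qualifier", None)

target = ET.fromstring(
    '<target phase-name="tm1" state="needs-translation"'
    ' state-qualifier="fuzzy-match">Ouvrir un fichier</target>'
)
record_translation(target, "Ouvrir le fichier", "translation1")
print(ET.tostring(target, encoding="unicode"))
```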


The translator may wish to qualify a translation, e.g. to explain why they did not use a glossary word or why they have not used correct grammar. In this case they can add a note.


A translator might also wish to add a note that helps translators working in other languages. In this case the note uses xml:lang="en-US" (so that others can read the message) and annotates="source".

FIXME {DB: how do we use notes to report errors in source? So a more serious comment then the previous paragraph}

=== upload ===

FIXME {Javier: when the file is uploaded after translation, the counts are made again, and the data is stored in relation to the translator's phase}

The file is merged with the current file on the server, in case the file has been updated or somebody else has translated something in it while this translator was working on it.

=== Review ===

If the current phase does not match the data of the reviewer (phase-name and contact-name), then a new phase is created; otherwise the current (last) phase is considered as still active. In the latter case, the date of the phase is updated to the current date (we assume that we are interested in having the last date on which work on the file took place).

  date = "2006-01-25T21:06:00Z" 
  contact-name="Alberto Martinez's reviewer"

For each <target>, the reviewer might accept the string, correct it or reject it.

Each time the reviewer reviews or modifies a <target> and considers it correct, the <target> must be associated to the current phase through the phase-name attribute and its state must be changed to “translated”. If the state-qualifier is fuzzy-match, the reviewer can fix it or leave it as it is, as if it were a non-translated message.

FIXME {Javier: The problem here is that we lose information. If there are two translation phases, followed by a review phase, we will not know in which of the two translation phases the translation took place}


The reviewer might reject a translation without fixing it. In this case, the state-qualifier will be set to rejected-* and the state to needs-translation. If needed, a qualifying note can also be added.

   File is a noun in this context not a verb
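
A rejection can be sketched as follows. This is a minimal sketch with a namespace-free tree; the four rejected-* values are the ones defined for state-qualifier in XLIFF, while the function name reject, the sample unit and the choice of annotates="target" for the note are assumptions.

```python
import xml.etree.ElementTree as ET

REJECTIONS = {"rejected-grammar", "rejected-inaccurate",
              "rejected-length", "rejected-spelling"}

def reject(trans_unit, reason, note_text=None):
    """Mark the unit's <target> as rejected and optionally attach a note."""
    if reason not in REJECTIONS:
        raise ValueError(f"unknown rejection reason: {reason}")
    target = trans_unit.find("target")
    if target is None:
        raise ValueError("trans-unit has no <target> to reject")
    target.set("state", "needs-translation")
    target.set("state-qualifier", reason)
    if note_text:
        note = ET.SubElement(trans_unit, "note", {"annotates": "target"})
        note.text = note_text

unit = ET.fromstring(
    '<trans-unit id="1"><source>File</source>'
    '<target state="needs-review-translation">Classer</target></trans-unit>'
)
reject(unit, "rejected-inaccurate", "File is a noun in this context not a verb")
print(ET.tostring(unit, encoding="unicode"))
```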

FIXME {DB: how do we handle some automatic checking like glossary alignment. Do we use state-qualifiers before sending to review. Also how does a translator override this to say yeah I know its not aligned}

FIXME {DB: we need to allow a reviewer to perform a global rejection}

=== Approval ===

If the current phase does not match the data of the approver (phase-name and contact-name), then a new phase is created; otherwise the current (last) phase is considered as still active. In the latter case, the date of the phase is updated to the current date (we assume that we are interested in having the last date on which work on the file took place).

  date = "2006-01-25T21:06:00Z" 
  contact-name="Big Boss"

The approver can accept or reject translations. When a translation is accepted, its state is changed to final and any state-qualifier is removed.


If rejected they can use the same reasons and process used by the reviewer.

Once approved, we extract any new glossary and TM information and send it to whatever process handles new TM and glossary items.

FIXME {DB a separate process defines acceptance of new TM and Glossary data. Still to be defined} FIXME {DB we accept TM at the minimal approval level}

After approval, all TM and glossary data is scrubbed.

=== Compilation ===

Once approved, the translation can be used by the translation farm.

A small team can define a lower level at which translations can be used. For instance, a one-person team could define state="translated" as the minimal acceptable level. Any change in state beyond that is an added bonus.
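
A configurable minimum level can be sketched as below. This is a minimal sketch with a namespace-free tree; the ordering in LEVELS follows the process described on this page, and the function name usable_units is hypothetical.

```python
import xml.etree.ElementTree as ET

# States in the order the process on this page moves through them.
LEVELS = ["new", "needs-translation", "needs-review-translation",
          "translated", "final"]

def usable_units(body, minimum="final"):
    """Return the <trans-unit> elements whose <target> state is at least minimum."""
    threshold = LEVELS.index(minimum)
    usable = []
    for unit in body.iter("trans-unit"):
        # Direct child only, so targets inside <alt-trans> are ignored.
        target = unit.find("target")
        state = target.get("state") if target is not None else None
        if state in LEVELS and LEVELS.index(state) >= threshold:
            usable.append(unit)
    return usable

body = ET.fromstring(
    "<body>"
    '<trans-unit id="1"><source>a</source><target state="final">x</target></trans-unit>'
    '<trans-unit id="2"><source>b</source><target state="translated">y</target></trans-unit>'
    '<trans-unit id="3"><source>c</source>'
    '<target state="needs-review-translation">z</target></trans-unit>'
    "</body>"
)
print([u.get("id") for u in usable_units(body)])
print([u.get("id") for u in usable_units(body, minimum="translated")])
```

Units without a <target>, or with a state outside LEVELS, are treated as not usable.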