Translate Toolkit & Pootle

Tools to help you make your software local

Placeables design

The ODF Translation project is driving the need for proper placeables support.

The following issues need to be taken into account when designing the placeables code:

Different classes of placeables

Some placeables are substitutes for complex structures:

<x id="1"/>

instead of

<text:note text:id="ftn2" text:note-class="footnote">
<text:note-citation>2</text:note-citation>
<text:note-body>
<text:p text:style-name="Footnote">A footnote</text:p>
</text:note-body>
</text:note>.
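
The substitution above can be sketched in a few lines of Python. This is only an illustration, not toolkit code: the function name `extract_placeables` and the numbering scheme are assumptions, and the text is joined without XML escaping for brevity.

```python
import xml.etree.ElementTree as ET


def extract_placeables(paragraph_xml):
    """Replace each child element of a paragraph with an <x id="N"/> marker.

    Returns the flattened string and a dict mapping each id to the
    serialized original element. Hypothetical helper for illustration.
    """
    root = ET.fromstring(paragraph_xml)
    parts = [root.text or ""]
    originals = {}
    for index, child in enumerate(list(root), start=1):
        tail = child.tail or ""
        child.tail = None  # keep the trailing text out of the stored markup
        originals[str(index)] = ET.tostring(child, encoding="unicode")
        parts.append('<x id="%d"/>' % index)
        parts.append(tail)
    return "".join(parts), originals


sample = (
    '<text:p xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    'See the note<text:note text:id="ftn2" text:note-class="footnote">'
    '<text:note-citation>2</text:note-citation>'
    '<text:note-body>'
    '<text:p text:style-name="Footnote">A footnote</text:p>'
    '</text:note-body>'
    '</text:note>.</text:p>'
)
flattened, originals = extract_placeables(sample)
```

The translator would then see only the flattened string, while the stored markup is kept aside to be restored when the translated document is rebuilt.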

Other placeables mask pieces of text that must remain present in the translation. Variables are an example of this. Thus, one might want to mask

%(a_variable)s

as

<it>"%(a_variable)s"</it>
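
As a rough sketch of this kind of masking, the Python snippet below wraps named format specifiers in `<it>` elements. The regular expression, the function name `mask_variables` and the bare `<it>` element without attributes are all assumptions made for illustration; real variable detection would need to cover more formats.

```python
import re
from xml.sax.saxutils import escape

# Matches Python named format specifiers such as %(a_variable)s or %(count)d.
VARIABLE_RE = re.compile(r"%\([A-Za-z_][A-Za-z0-9_]*\)[sdifr]")


def mask_variables(text):
    """Wrap each variable in <it>...</it>, XML-escaping everything else."""
    pieces = []
    last = 0
    for match in VARIABLE_RE.finditer(text):
        pieces.append(escape(text[last:match.start()]))
        pieces.append("<it>%s</it>" % escape(match.group(0)))
        last = match.end()
    pieces.append(escape(text[last:]))
    return "".join(pieces)


masked = mask_variables("Hello %(name)s & co")
```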

Lokalize's approach

In Lokalize inline elements are divided into 3 categories:

  1. “unclassified” placeables - <bpt>, <ept>, <ph> and <it>
  2. replacement containers - <g>, <mrk>, <sub>
  3. paired delimiters - <x>, <bx>, <ex>

This is expressed in the enum InlineElement, found in tagrange.h.

This roughly corresponds to the XLIFF standard, which distinguishes three types of placeables:

  1. Elements that have content, and for which this content is the actual native code of the original data (escaped for XML if necessary). These elements are: <bpt>, <ept>, <it>, and <ph>.
  2. Elements that are empty and act as placeholders for a native code that is either in the Skeleton file or generated automatically. These elements are: <g>, <bx/>, <ex/>, and <x/>.
  3. The <sub> element, which can be inside <bpt>, <ept>, <it>, and <ph> to delimit a translatable run of text within a native inline code, for example the value of an ALT attribute in an <IMG> element in HTML.
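
The three XLIFF categories listed above could be captured in a simple lookup table. The following Python sketch is only one possible representation; the category names and the function `classify_inline` are invented for this example and do not come from XLIFF, Lokalize or the toolkit.

```python
# Category names are hypothetical labels for the three XLIFF groups.
XLIFF_INLINE_CATEGORIES = {
    "native_content": {"bpt", "ept", "it", "ph"},
    "placeholder": {"g", "bx", "ex", "x"},
    "sub_flow": {"sub"},
}


def classify_inline(tag):
    """Return the category name for an inline tag, or None if unknown."""
    for category, tags in XLIFF_INLINE_CATEGORIES.items():
        if tag in tags:
            return category
    return None
```

A table like this makes it easy to decide, per element, whether its content should be shown as non-editable native code or replaced by a marker.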

Considerations for the Translate Toolkit

The toolkit should deal with elements which have native content. This content would be displayed as non-editable text in Virtaal & Pootle.

The code should be backwards compatible

The attributes .source and .target in the TranslationUnit derived types should only return text. If there are placeables, these attributes should strip out the placeables and return the resulting text.

We should add new attributes to obtain text which includes the placeables.

We could make .source and .target do exactly what they do at the moment: give all the text between the markup of the XLIFF. That means that the text of replacing tags will disappear, and the text of markup tags will simply appear in .source and .target.
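
To make the two options concrete, here is a minimal Python sketch that extracts text from a `<source>` element in both ways. It is not the toolkit's implementation: the function `plain_text`, its `strip_native` flag and the namespace-free sample are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Inline elements whose content is native code rather than translatable text.
NATIVE_CONTENT_TAGS = {"bpt", "ept", "it", "ph"}


def plain_text(source_xml, strip_native=True):
    """Return the text of a <source> element.

    With strip_native=False every text node is kept (all text between the
    markup). With strip_native=True the content of native-code elements
    is dropped as well, so only plain text remains.
    """
    def walk(elem):
        out = [elem.text or ""]
        for child in elem:
            if not (strip_native and child.tag in NATIVE_CONTENT_TAGS):
                out.append(walk(child))
            out.append(child.tail or "")
        return "".join(out)

    return walk(ET.fromstring(source_xml))


sample = (
    '<source>Click <g id="1">here</g><x id="2"/> to see '
    '<ph id="3">%(name)s</ph>.</source>'
)
stripped = plain_text(sample)
everything = plain_text(sample, strip_native=False)
```

In both modes the empty `<x/>` element contributes nothing, which is the "text of replacing tags will disappear" behaviour described above.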

Lokalize's solution

Lokalize uses a data structure called CatalogString. It contains the text, with each placeable represented by the Unicode code point 65532 (U+FFFC, OBJECT REPLACEMENT CHARACTER). It also contains a list of TagRange structures that match tags to the occurrences of that code point.
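
A simplified Python analogue of this idea is sketched below. It is loosely modelled on the description above, not on Lokalize's actual C++ classes: the fields of `TagRange` and the helper `build_catalog_string` are assumptions chosen to keep the example short.

```python
from dataclasses import dataclass, field

PLACEHOLDER = "\ufffc"  # code point 65532, OBJECT REPLACEMENT CHARACTER


@dataclass
class TagRange:
    position: int  # index of the placeholder character in the text
    tag: str       # element name, e.g. "x"
    id: str        # value of the element's id attribute


@dataclass
class CatalogString:
    text: str = ""
    ranges: list = field(default_factory=list)


def build_catalog_string(chunks):
    """Build a CatalogString from plain strings and (tag, id) tuples."""
    result = CatalogString()
    for chunk in chunks:
        if isinstance(chunk, str):
            result.text += chunk
        else:
            tag, tag_id = chunk
            result.ranges.append(TagRange(len(result.text), tag, tag_id))
            result.text += PLACEHOLDER
    return result


cs = build_catalog_string(["See note", ("x", "1"), "."])
```

Because each placeable occupies exactly one character, ordinary string operations keep working on the text, and the list of ranges says which tag each placeholder stands for.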

Current toolkit placeables solution

New methods called getmarkedtarget and getmarkedsource were added to TranslationUnit in storage/base.py. These methods return lists with chunks of text and placeable objects (actually, the current code returns 2-tuples in which the first element is occupied by text and the second by placeable objects; this is likely to change soon).
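
As an illustration of how such a list might be consumed, the Python sketch below separates the text from the placeables. The exact shape of the tuples is an assumption here (each 2-tuple holds a text chunk, a placeable object, or both, with `None` for the missing part), and the `Placeable` class and `split_marked` function are hypothetical stand-ins rather than toolkit code.

```python
class Placeable:
    """Hypothetical stand-in for the toolkit's placeable objects."""

    def __init__(self, tag, id):
        self.tag = tag
        self.id = id


def split_marked(chunks):
    """Split a list of (text, placeable) tuples into text and placeables.

    Assumes either member of a tuple may be None.
    """
    text_parts = []
    placeables = []
    for text, placeable in chunks:
        if text:
            text_parts.append(text)
        if placeable is not None:
            placeables.append(placeable)
    return "".join(text_parts), placeables


marked = [("See note", None), (None, Placeable("x", "1")), (".", None)]
text, found = split_marked(marked)
```

The joined text corresponds to what a placeable-free .source or .target would return, while the placeables remain available to tools that know how to display them.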