Translate Toolkit & Pootle

Tools to help you make your software local

Farzaneh Sarafraz, from the OpenOffice Persian team, comments on the OpenOffice localization list:

Here is a summary of what goes wrong when trying to check the Persian translation using common filters.

* acronyms: We translate most of the acronyms into Persian, e.g. CD or URL. The test fails.

* brackets: It works well with parentheses. When it comes to square brackets, it fails, since in right-to-left languages (including Persian) ] stands for the opening bracket and comes before [.

* doublequoting & singlequoting: The characters used for them in Persian are « and ». So the test fails whenever the message contains quotation marks.

* escapes: Causes problems when there is \" in the string, because of the quoting problem above.

* doublewords: The test should check for double words in the original message and should not alert when the original contains double words as well. There are messages such as “Dot Dot Dash” among OO.o strings.

* endpunc, startpunc, purepunc & puncspacing: Persian doesn't use English punctuation marks. The test fails when we use ؟ for ? and ٪ for %. The comma and semicolon change to ، and ؛ in Persian (as well as in other right-to-left scripts), so the test fails whenever the string contains any punctuation mark other than the period. Another problem is that the translated string doesn't necessarily keep the word order. For example, there is a string “Automatic *bold* and _underline_”. The word-by-word Persian translation would be “*bold* and _underline_ automatic”. Both startpunc and endpunc are violated.

* numbers: Persian doesn't use European digits for numbers. The test is of no use with Persian digits (Extended Arabic-Indic digits).

* sentencecount: It mistakes the decimal point in floating-point numbers for a full stop. Persian uses a different character for the decimal separator (U+066B). Abbreviations written with dots also cause problems in counting sentences.

* simplecaps & startcaps: Might not be useful even for Latin scripts, since the capitalization pattern could be different from English. Not good for other scripts.
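
The punctuation and digit items above could, in principle, be handled by normalizing the translation before comparing it with the English source. The following is only a minimal sketch of that idea in plain Python. It is not pofilter's implementation, and the names `PUNCT_MAP`, `DIGIT_MAP`, `normalize`, `same_numbers` and `same_end_punctuation` are hypothetical helpers invented for this illustration.

```python
import re

# Hypothetical mapping from Persian/Arabic punctuation to the ASCII
# marks that an English-oriented check expects.
PUNCT_MAP = {
    "\u061F": "?",  # ARABIC QUESTION MARK
    "\u066A": "%",  # ARABIC PERCENT SIGN
    "\u060C": ",",  # ARABIC COMMA
    "\u061B": ";",  # ARABIC SEMICOLON
    "\u066B": ".",  # ARABIC DECIMAL SEPARATOR
    "\u00AB": '"',  # left-pointing double angle quotation mark
    "\u00BB": '"',  # right-pointing double angle quotation mark
}

# Extended Arabic-Indic digits U+06F0..U+06F9 mapped to 0..9.
DIGIT_MAP = {chr(0x06F0 + i): str(i) for i in range(10)}

_TABLE = str.maketrans({**PUNCT_MAP, **DIGIT_MAP})

# Punctuation considered at the end of a string (after normalization).
END_PUNCT = set(".,;:?!%")


def normalize(text):
    """Replace Persian punctuation and digits with ASCII equivalents."""
    return text.translate(_TABLE)


def numbers(text):
    """Return the numbers found in the normalized text, as strings."""
    return re.findall(r"\d+(?:\.\d+)?", normalize(text))


def same_numbers(source, target):
    """True if both strings contain the same numbers, in any order."""
    return sorted(numbers(source)) == sorted(numbers(target))


def end_punctuation(text):
    """Return the run of punctuation at the end of the normalized text."""
    s = normalize(text).rstrip()
    i = len(s)
    while i > 0 and s[i - 1] in END_PUNCT:
        i -= 1
    return s[i:]


def same_end_punctuation(source, target):
    """True if both strings end with equivalent punctuation."""
    return end_punctuation(source) == end_punctuation(target)
```

With this sketch, a translation ending in ؟ is treated as ending in ?, and a number written with Extended Arabic-Indic digits and U+066B compares equal to its European form. It does not address the word-order problem described for startpunc and endpunc.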

Disabling all these filters is almost equivalent to not using pofilter at all. In fact, msgfmt just checks the syntactic correctness of the file. It doesn't do anything more than what pofilter does, but the number of false positives from pofilter is what pushes our translators away.
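
One way to reduce false positives without disabling a test outright is to make it relative to the source string, as the doublewords item suggests. The sketch below illustrates that behaviour under simple assumptions: words are split with the regular expression `\w+`, and the function names `repeated_words` and `doublewords_ok` are hypothetical, not part of pofilter.

```python
import re


def repeated_words(text):
    """Return the set of words that occur twice in a row in the text."""
    words = re.findall(r"\w+", text.lower())
    return {a for a, b in zip(words, words[1:]) if a == b}


def doublewords_ok(source, target):
    """Pass unless the target repeats a word and the source does not."""
    if not repeated_words(target):
        return True
    # The source repeats a word too (e.g. "Dot Dot Dash"), so a
    # repetition in the translation is probably intentional.
    return bool(repeated_words(source))
```

Under this rule, a translation of “Dot Dot Dash” that repeats a word passes, while a repeated word in the translation of a source with no repetition is still reported.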