The *ODF Internationalization Test Suite* can be used to validate any tool that converts OpenDocument Format (ODF) files into translation files (such as PO or XLIFF).
This test suite is a derivative work of the OpenDocument Fellowship test suite, which has a different focus: it aims to validate how much of ODF is supported by different office applications, like OpenOffice and KOffice.
This test suite is available under the terms and conditions of the Creative Commons Attribution license version 3.0 (CC BY 3.0).
The table below summarizes which functional areas of the OpenDocument Format are currently covered by this test suite.
|1. Simple word processing||Incomplete|
|2. Basic tables||Complete|
|4. Basic accessibility||Not covered||Because ODG files are not supported|
|5. Footnotes, headers, and footers||Complete|
|7. Bibliography entries||Not covered||Not covered by the original test suite|
|8. Ruby text||Not covered||Not covered by the original test suite|
|9. Headers and footers||Complete|
Sometimes it is unclear whether a certain field should be extracted for translation. For instance, the author's name may be left untranslated, but it may also be useful to add a phonetic rendering of the original name. Our general policy for these unclear cases has been: *when in doubt, translate*.
Can we perhaps support different “levels”? A level for “definitely translate” and another for “maybe translate”? --- Friedel 2008/07/30 07:44
Segmentation defines how a paragraph is split into the sentences (translation units) that make up the translation file (PO or XLIFF). There is more than one way to do this. So far, the choice for this test suite has been to split a paragraph on the dot (.), semicolon (;), exclamation mark (!), question mark (?) and colon (:) characters.
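The punctuation-based rule described above can be illustrated with a minimal Python sketch. It is not the code used to build the test suite. The function name `segment` is hypothetical, and requiring whitespace after the punctuation mark is an added assumption, made so that strings like "3.14" or "10:30" are not split.

```python
import re
from typing import List

# The five characters named in the text: . ; ! ? :
# Assumption (not stated in the text): a boundary only counts when the
# punctuation mark is followed by whitespace.
_BOUNDARY = re.compile(r"(?<=[.;!?:])\s+")


def segment(paragraph: str) -> List[str]:
    """Split a paragraph into translation units at the boundary characters."""
    # The whitespace after each boundary character is consumed by the split,
    # so every unit keeps its own closing punctuation mark.
    return [unit for unit in _BOUNDARY.split(paragraph.strip()) if unit]


if __name__ == "__main__":
    for unit in segment("Open the file. Is it valid? Note: save it; then close it!"):
        print(unit)
```

A rule like this is tied to languages that use these punctuation marks, which is the limitation raised in the discussion that follows.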
We'll need to go far more advanced than that, but I don't think it is required to commit to anything now. We want to support many source languages, including languages which don't use the same punctuation marks. The Translate Toolkit has sentence segmentation rules for many languages. But even then, we probably should only segment on a block level, not sentence. Ideally it should be up to the translation software to segment, otherwise the translator can't recover from a segmentation mistake. --- Friedel 2008/07/30 07:44
We segment at the sentence level because we do not use translation software able to segment by itself (by the way, do you know of any free software tool that does so?). We have handled the problem of segmentation mistakes by being conservative: when in doubt, do not segment. However, I believe the best approach would be to support the SRX standard, so the segmentation rules would not be hardcoded; do you think it is feasible? --- J. David 2008/08/12 03:18
If we segment too finely, I don't see how a translator can recover from that. Segmenting everything perfectly in all languages at all times just seems a bit hard, and we really don't need to do that now. I think it is easiest if we just get block level units to convert, and eventually we can plan to do lots of fancy things, perhaps providing some options to the user. SRX is also on our radar for other things. --- Friedel 2008/08/14 03:00