Translate Toolkit & Pootle

Tools to help you make your software local

Merging and updating

These are just some thoughts and notes about merging.

Current problem

Currently (2006-08) we have several implementations of merging code in Pootle and the toolkit that do similar things: pot2po, pomerge, the code used for file upload in Pootle, and the code used for CVS update.

For Pootle issues, see upload_and_download_merging_policy.

Status quo

pot2po

  • POT file is input and defines:
    • layout of the final document
    • which messages to keep and which to make obsolete
    • order of messages
  • old PO file is template (in -t sense) and defines existing translations
  • header values are updated:
    • information from PO file is reused
    • values such as POT-Creation-Date are carried from POT
  • take .target from PO file and inject it into POT, method:
    • location, or
    • .source, or
    • fuzzy match - in that case set fuzzy state
      • uses fuzzy matching based on Levenshtein distance
      • obsolete messages can be reused in matching
  • Comments:
    • automatic comments from POT are used
    • translator comments from PO are used
    • msgidcomments? (these should come from POT - Dwayne has patch)
  • Special comments:
    • locations from POT are used
    • unused messages from PO become obsolete
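
The matching order described above (location, then source, then fuzzy match) can be illustrated with a minimal sketch. It is not the toolkit's pot2po code: the `Unit` class, `update_from_pot()` and the `min_similarity` threshold of 0.75 are hypothetical names and values chosen for illustration, and headers and comments are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical stand-in for a PO unit (not the toolkit's pounit)."""
    source: str
    target: str = ""
    locations: list = field(default_factory=list)
    fuzzy: bool = False
    obsolete: bool = False


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def update_from_pot(pot_units, old_units, min_similarity=0.75):
    """Return new units in POT order, reusing translations from old_units."""
    by_location = {loc: u for u in old_units for loc in u.locations}
    by_source = {u.source: u for u in old_units}
    result, used = [], set()
    for pot_unit in pot_units:
        # Layout, order and locations always come from the POT.
        new = Unit(pot_unit.source, locations=list(pot_unit.locations))
        match = next((by_location[loc] for loc in pot_unit.locations
                      if loc in by_location), None)
        if match is None:
            match = by_source.get(pot_unit.source)
        if match is not None:
            new.target, new.fuzzy = match.target, match.fuzzy
        else:
            # Fall back to fuzzy matching; obsolete units may be reused too.
            scored = [(similarity(pot_unit.source, u.source), u)
                      for u in old_units if u.target]
            best = max(scored, key=lambda pair: pair[0], default=None)
            if best is not None and best[0] >= min_similarity:
                match = best[1]
                new.target, new.fuzzy = match.target, True
        if match is not None:
            used.add(id(match))
        result.append(new)
    # Translated units that were never reused are kept as obsolete.
    for old in old_units:
        if id(old) not in used and old.target:
            result.append(Unit(old.source, old.target,
                               fuzzy=old.fuzzy, obsolete=True))
    return result
```

Only a fuzzy match sets the fuzzy state here; an exact location or source match simply carries over whatever state the old unit had, which is an assumption of this sketch.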

pomerge

  • updated or snippet PO file is input
  • existing PO file is template, usually current
  • Header
    • Header is usually missing from snippets
    • If only header is present nothing happens
    • If a header is present, the one in the input is not used to update the existing header
  • overwrites .target from input over template's targets
  • automatic comments from template (original PO) are used
  • translator comments are merged (concatenated) (this might not be happening now… could be a problem)
  • locations should be unchanged from template (currently they are merged)
  • no action for obsolete messages
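
As a rough illustration of the behaviour listed above, the following sketch overwrites template targets from an input snippet. The `Unit` class and `merge_snippet()` are hypothetical and not the toolkit's API. Copying the fuzzy flag from the input and skipping duplicate comments are assumptions of the sketch, not documented behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical stand-in for a PO unit (not the toolkit's pounit)."""
    source: str
    target: str = ""
    translator_comments: list = field(default_factory=list)
    fuzzy: bool = False


def merge_snippet(template_units, input_units):
    """Overwrite template targets in place; return input units with no match."""
    by_source = {u.source: u for u in template_units}
    unmatched = []
    for incoming in input_units:
        if incoming.source == "":
            # The PO header has an empty msgid; the input header is ignored.
            continue
        existing = by_source.get(incoming.source)
        if existing is None:
            unmatched.append(incoming)
            continue
        existing.target = incoming.target
        existing.fuzzy = incoming.fuzzy  # assumption: fuzzy state follows input
        # Translator comments are concatenated (exact duplicates skipped).
        for comment in incoming.translator_comments:
            if comment not in existing.translator_comments:
                existing.translator_comments.append(comment)
    return unmatched
```

Automatic comments and locations are deliberately absent from the sketch: they stay exactly as they are in the template.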


(used for file upload)

  • uploaded file is input
  • current PO file is template
  • update certain entries from input header (set last-translator externally, i.e. from Pootle?)
  • overwrite .target from input if template is empty, otherwise enter it as suggestion
  • (currently) if there is a new unit and the user has admin rights, add the unit
  • (currently) use pounit.merge() if we merge (merge comments except automatic comments) - probably not good (don't want to merge locations)
  • no action for obsolete messages
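
The upload rule above (fill empty targets, otherwise record a suggestion) might look like the sketch below. It models a file as a plain mapping of source to target; `merge_upload()` and the `can_add_units` parameter are hypothetical names, and headers and comments are ignored.

```python
def merge_upload(template, uploaded, can_add_units=False):
    """Merge an uploaded file into the current one.

    Both arguments map source text to target text.
    Returns (merged, suggestions); suggestions maps source to a list of
    proposed targets that were not applied.
    """
    merged = dict(template)
    suggestions = {}
    for source, target in uploaded.items():
        if source not in merged:
            # New units are only added for users with admin rights.
            if can_add_units:
                merged[source] = target
            continue
        if not target or target == merged[source]:
            continue  # nothing new to contribute
        if merged[source] == "":
            merged[source] = target  # template is empty: accept the upload
        else:
            suggestions.setdefault(source, []).append(target)
    return merged, suggestions
```

For example, uploading `{"Close": "Toe"}` against a template that already has `"Close": "Sluit"` leaves the template untouched and records `"Toe"` as a suggestion.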

(used for updating from version control)

Three files are involved: the current local file, the original file from VCS and the new file from VCS.

  • reuses some code from the file upload code. It should make unused messages obsolete, but doesn't.
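
With three files available, a per-unit three-way merge is one possible approach. The sketch below is an assumption about how that could work, not a description of the existing Pootle code. Files are modelled as mappings of source to target, `three_way_merge()` is a hypothetical name, and keeping the local translation on a clash is an arbitrary choice.

```python
def three_way_merge(local, original, new):
    """Merge per unit using the original VCS file as the common ancestor.

    All three arguments map source text to target text.
    Returns (merged, clashes); clashes lists sources changed on both sides.
    """
    merged, clashes = {}, []
    for source, new_target in new.items():
        base = original.get(source, "")
        mine = local.get(source, "")
        if mine == base or mine == new_target:
            merged[source] = new_target  # no local change, or same change
        elif new_target == base:
            merged[source] = mine        # only the local file changed
        else:
            merged[source] = mine        # both changed: keep local, report it
            clashes.append(source)
    return merged, clashes
```

The layout follows the new file from VCS, and when the local file equals the original the result is an exact copy of the new file. Units missing from the new file are simply dropped here, where a fuller version would mark them obsolete.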


We want to abstract it all into one code base, where new features can be shared by all users. The merging can then be customised by parameters or a configuration.


  • Define a document template - this dictates the basic format and layout of the output (order)
  • (ideally, perhaps not worth it) Define the reference PO - to dictate message layout if present. FIXME DB does this mean you would have a template for general layout and new comments, a reference PO so that we can update it with minimal changes and an update PO?
  • If there is no header, should one be added? FIXME DB - in most cases no. The header is usually not there for a reason, although we could add headers by default if we can handle all cases of updating. That way pogrep, pofilter, etc. can make use of the header to feed that back into the final header when merged.
  • If we have a header, which entries should be updated from input: PO-Revision-Date, X-Editor, Last-Translator, Language-Team, Plural-Forms
    • update certain header entries from parameters (Last-Translator); it would be useful to have an option --no-update-header
    • Should PO-Revision-Date automatically be updated? Yes, if it is newer
  • overwrite translated (what if input is fuzzy) - depends on the case. With pomerge we usually have a reason to make something fuzzy, so then yes, we should overwrite and set fuzzy. In other cases we probably don't want to update if it's fuzzy. In the Pootle case we want to make that a suggestion.
  • overwrite fuzzy (what if input is fuzzy) - the same applies as above
  • Whose comments are used for what - programmer comments should come from the latest authority, i.e. the template; translator comments from input
  • Who is authority on locations - template has authority
  • Unused messages in input become obsolete? That might make sense as then all work is preserved.
  • Do we want fuzzy matching, with TM? - only in an update scenario not in a merging scenario
  • Should new messages be added - Yes from templates not from input
  • What to do with a clash (callback?) - clash? On Pootle we can make them suggestions
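
One way to express these choices as parameters is a small policy object passed to a shared merge routine. The sketch below is hypothetical throughout: `MergePolicy`, its fields and `resolve()` are illustrative names, and only the overwrite, fuzzy-input and clash questions are covered. Header and matching options could be added as further fields in the same way.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class MergePolicy:
    """Hypothetical set of switches shared by all merging front ends."""
    overwrite_translated: bool = True   # may input replace an existing target?
    accept_fuzzy_input: bool = False    # may fuzzy input replace anything?
    # Called as on_clash(source, existing_target, incoming_target).
    on_clash: Optional[Callable[[str, str, str], None]] = None


def resolve(policy, source, existing, incoming):
    """Decide which translation to keep for one unit.

    existing and incoming are (target, fuzzy) pairs; the chosen pair
    is returned.
    """
    existing_target, _ = existing
    incoming_target, incoming_fuzzy = incoming
    if not incoming_target or incoming_target == existing_target:
        return existing
    if incoming_fuzzy and not policy.accept_fuzzy_input:
        return existing
    if existing_target and not policy.overwrite_translated:
        if policy.on_clash is not None:
            policy.on_clash(source, existing_target, incoming_target)
        return existing
    return incoming
```

Under these assumptions a pomerge-like tool would use `MergePolicy(accept_fuzzy_input=True)`, while Pootle would use `overwrite_translated=False` with an `on_clash` callback that stores the incoming text as a suggestion.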

Sanity checks

The following sanity checks are basic things to check in addition to the complex issues described above.

  • If a file is unchanged locally, a CVS update should produce an exact copy of the CVS file
  • A CVS update for a template should not involve intelligence
  • Uploading a file when there have been no changes on the server since download should result in a copy of the uploaded file
  • pot2po: reinitialising a PO file with the same POT file should make no difference to the file
  • Merging with a template should reflect the layout of the template
  • Merging with a reference file should present the reference file layout