
itsjeyd/wikitrans-pootle

 
 


Documentation

Introduction

WikiTrans is an open-source machine translation project that is intended for use with the Wikipedia community. Using Pootle as a platform, users are able to request translations of Wikipedia articles into various languages. The user community can review and post-edit machine translations before submitting the new article to Wikipedia. Eventually, these post-edits will be used to update the MT systems.

Dependencies

As described below, an installation script can be used to automatically install all dependencies listed in this section.

  • translate-toolkit, version 1.9
  • python-protobuf
  • python-yaml
  • libxslt-dev
  • libevent-dev
  • build-essential
  • python-setuptools
  • python-dev
  • django, version 1.3.1
  • lxml
  • simplejson
  • apyrtium
  • BeautifulSoup
  • django-uni-form
  • goopytrans
  • nltk, version 2.0b7
  • polib, version 0.5.3
  • pycountry
  • wikipydia

About half of these dependencies can be installed using apt-get; the remaining dependencies should be installed using pip. For details, consult the installation script.

Installation

  1. Get code:

    <pre>$ git clone https://github.com/itsjeyd/wikitrans-pootle.git</pre>
    
  2. Enter project directory:

    <pre>$ cd wikitrans-pootle</pre>
    
  3. Run installation script:

    <pre>$ ./dependencies/install.sh</pre>
    
  4. Set up the database:

    <pre>$ python manage.py syncdb</pre>
    

    Answer yes when asked whether or not you would like to create a superuser. You will be prompted for a user name, an email address, and a password.

  5. Populate the database:

    <pre>$ python manage.py initdb</pre>
    
  6. Precalculate statistics about existing translation projects:

    <pre>$ python manage.py refresh_stats</pre>
    

    Please note that

    • this will take a while, and
    • doing it manually is optional.
      If you decide to skip this step, these precalculations will be done when you try to access your WT installation through your browser for the first time.
  7. Set up the translation host (MT Serverland):

    <pre>$ python manage.py serverland_init</pre>
    
  8. Start the server:

    <pre>$ ./PootleServer</pre>
    
  9. In your browser, access the following URL:

    <pre>http://localhost:8080/wikitrans</pre>
    

    This will take you to the main page of WikiTrans:

    WT main page
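If you prefer to check from a script that the server started in step 8 is answering, the following is a minimal sketch using only the Python 3 standard library. The URL is the one given in step 9; the function name `is_up`, the `opener` parameter, and the five-second timeout are illustrative choices and not part of WikiTrans.

```python
from urllib.error import URLError
from urllib.request import urlopen

# URL from step 9 of the installation instructions.
WT_URL = "http://localhost:8080/wikitrans"


def is_up(url=WT_URL, opener=urlopen, timeout=5):
    """Return True if a GET request to `url` ends in an HTTP 200 response.

    `opener` is a hypothetical hook (defaults to urllib's urlopen) so the
    check can be exercised without a running server.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("WikiTrans reachable:", is_up())
```

Note that this helper is a standalone Python 3 script; it does not need to run under the same interpreter as the WT installation itself.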

Usage

N.B.: The shell commands listed throughout this section consistently assume you are in the root directory of your WT installation; unless you cloned the project into a custom folder or have renamed the folder since you cloned it, the name of this directory should be wikitrans-pootle.

Logging in

Clicking the "Login" button on the main page brings up the login form for WT:

WT login

Enter the superuser credentials you provided when you first ran the syncdb command.

After successful login, you will be taken to your "Dashboard":

WT dashboard

Requesting Articles

From your dashboard, navigate to the Articles view by clicking on the "Articles" tab. On the left hand side you will see a list of all Wikipedia articles that have been imported into the system; if you are doing this for the first time, this list will of course be empty. The form on the right hand side can be used to request additional Wikipedia articles you would like to translate (or make available to your end users for translating):

WT no-articles

To request an article, enter its full title in the "Title" field. Then, from the drop-down menu next to "Title language", choose the language in which the article is written, and click "Submit Query".

WT request-article

Note that doing this will not cause the article to be imported right away. Instead, WikiTrans provides a custom command that takes care of importing requested articles and making them available to registered users:

$ python manage.py update_wiki_articles

This command can be run manually by the administrator or automatically at specific intervals, e.g. by scheduling it as a cron job.
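As an illustration of the cron option, here is a small sketch that assembles a crontab entry for the command. The installation path `/srv/wikitrans-pootle`, the 15-minute interval, and the helper name `cron_line` are assumptions chosen for the example; adjust them to your setup.

```python
import shlex


def cron_line(project_dir, command, minute="*/15", python="python"):
    """Build a crontab entry that runs one manage.py command.

    The entry changes into the WT root directory first, because the
    commands in this section assume that working directory.
    """
    schedule = f"{minute} * * * *"  # minute, hour, day of month, month, weekday
    job = f"cd {shlex.quote(project_dir)} && {python} manage.py {command}"
    return f"{schedule} {job}"


if __name__ == "__main__":
    # Hypothetical installation path; replace with your own.
    print(cron_line("/srv/wikitrans-pootle", "update_wiki_articles"))
```

The printed line can be pasted into the crontab of the user that owns the WT installation (`crontab -e`).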

After running the command, the Articles view lists the newly imported article(s):

WT articles

Viewing and Editing Articles

As soon as a newly requested article has been imported into the system and shows up in the list of articles, you can start working with it.

You can access the content of a specific article by clicking its name in the listing:

WT article

When importing articles, WT tries its best to preserve their general structure and formatting. If you do spot a mistake in an article, or if you want to add or remove some parts from the original text before translating it, you can click on "Fix Article" to bring up an editing interface for that article:

WT fix-article

There are only two things you need to keep in mind when editing an article:

  1. Each sentence needs to be on a separate line.
  2. Paragraphs are separated using single blank lines.
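To make the expected layout concrete, the sketch below reads text that follows these two rules and groups it into paragraphs of sentences. It is only an illustration of the format, not the parser WT uses internally; the function name `split_article` is hypothetical.

```python
def split_article(text):
    """Split article text into paragraphs, each a list of sentences.

    Assumes one sentence per line and blank lines between paragraphs.
    Unlike the editing rules above, this sketch also tolerates several
    consecutive blank lines.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        sentence = line.strip()
        if sentence:
            current.append(sentence)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(current)
            current = []
    if current:
        paragraphs.append(current)
    return paragraphs


if __name__ == "__main__":
    sample = "A first sentence.\nA second sentence.\n\nA new paragraph.\n"
    for number, paragraph in enumerate(split_article(sample), start=1):
        print(number, paragraph)
```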

Please note that any changes you make to an article need to be saved before you request translations for it. Changes made after that point will not cause any errors, but they will not be incorporated into the source text that is displayed in the interface for editing translations.

Creating Projects

To be able to request translations for an article, you first have to create a project for it. You can do this simply by clicking on the "Create new project" link next to the article. When you are done, the article list should look like this:

WT project-created

Additionally, the system needs to know which target language(s) you would like to work on for a given article. Adding one or more languages to a project works like this:

  1. Expand the "Target Language" drop-down and click "Add Language":

    WT add-language

    This will take you to a page showing a list of available languages to choose from.

  2. Hold down the Ctrl key and mark (i.e., left-click) all languages you would like to add to the project:

    WT add-language-list

    Note that if you only want to add a single language, there is no need to hold down the Ctrl key.

  3. When you are done, click "Submit Changes". This will take you back to the Articles view.

Accessing projects

After adding some languages to a project, you can access it by clicking on its name (e.g. "de:Hafenbecken") in the article list:

WT project

The project view lists all target languages selected for a given source article and project.

Requesting Translations

On the project page for a given source article, you can request a translation into a specific target language like this:

  1. Expand the corresponding drop-down menu and choose one of the workers listed:

    WT choose-worker

  2. Click "Go!"

If WT lists different sets of workers for different target languages, this is because not every worker can handle every pair of languages available through Pootle.

Forwarding requests

Since WT depends on external MT services for translating Wikipedia articles, each translation request made by you or one of your end users needs to be forwarded to a translation host. For this purpose, the following command can be used:

$ python manage.py request_translations

Fetching translations

To import translation results from the translation host into WT, use the following command:

$ python manage.py fetch_translations

Deleting requests (on the MT host)

While request_translations and fetch_translations are the most important commands for dealing with translation requests and therefore need to be run most often, there is another command you should run on a somewhat regular basis:

$ python manage.py delete_finished_requests

For each finished request in the system, WT tells the translation host to delete that request. A request is considered to be "finished" if the translation produced for it by the translation host has been imported into WT successfully.
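The three commands above can be chained in a small maintenance script. The following is a sketch under stated assumptions: the command names come from this section, but the order in which they are run, the helper names `build_argv` and `run_cycle`, and the injectable `run` parameter are illustrative. Depending on how quickly the translation host responds, you may prefer to schedule the commands separately.

```python
import subprocess

# Command names as documented in this section; the order is an assumption.
COMMANDS = (
    "request_translations",
    "fetch_translations",
    "delete_finished_requests",
)


def build_argv(command, python="python"):
    """Return the argument list for one manage.py command."""
    return [python, "manage.py", command]


def run_cycle(project_dir, run=subprocess.check_call, python="python"):
    """Run the maintenance commands in order from the WT root directory.

    With the default runner, subprocess.check_call raises
    CalledProcessError on a non-zero exit status, so the cycle stops at
    the first failing command.
    """
    executed = []
    for command in COMMANDS:
        run(build_argv(command, python), cwd=project_dir)
        executed.append(command)
    return executed


if __name__ == "__main__":
    # Prints the commands without running them; call run_cycle(".")
    # from the WT root directory to execute them.
    for name in COMMANDS:
        print(" ".join(build_argv(name)))
```

The `python` argument lets you point the script at whichever interpreter your WT installation uses, which may differ from the one running this helper.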

Viewing and Editing Translations

For each of the target languages chosen for a given article, WT provides an overview page showing progress information. To open it, click the name of the language on the project page.

WT translation-overview

Clicking on "article.po" takes you to the editing interface for a specific source article and translation, which looks like this:

WT edit-translation

Individual sentences can be edited in place; clicking "Submit" saves the changes.

Deleting Projects

Individual projects can be deleted simply by clicking the corresponding "Delete" button in the article listing.

About

Integration of WikiTrans with the Pootle Framework. (Previous master has been overwritten with the upgrade branch.)
