WikiTrans is an open-source machine translation project that is intended for use with the Wikipedia community. Using Pootle as a platform, users are able to request translations of Wikipedia articles into various languages. The user community can review and post-edit machine translations before submitting the new article to Wikipedia. Eventually, these post-edits will be used to update the MT systems.
As described below, an installation script can be used to automatically install all dependencies listed in this section.
- translate-toolkit, version 1.9
- python-protobuf
- python-yaml
- libxslt-dev
- libevent-dev
- build-essential
- python-setuptools
- python-dev
- django, version 1.3.1
- lxml
- simplejson
- apyrtium
- BeautifulSoup
- django-uni-form
- goopytrans
- nltk, version 2.0b7
- polib, version 0.5.3
- pycountry
- wikipydia
About half of these dependencies can be installed using apt-get; the remaining dependencies should be installed using pip. For details, consult the installation script.
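After the installation script has finished, it can be useful to confirm that the Python-level dependencies are importable. The following is a minimal sketch and not part of WT: the helper name `missing_modules` is hypothetical, and the import names in `WT_MODULES` are assumptions that cover only a subset of the dependencies listed above (for example, `python-yaml` is assumed to be importable as `yaml`).

```python
import importlib

# Import names for a subset of the dependencies listed above.
# Assumption: these names match what the installation script installs.
WT_MODULES = ["django", "lxml", "simplejson", "nltk", "polib", "pycountry", "yaml"]


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_modules(WT_MODULES):
        print("Missing dependency: " + name)
```

The check only reports whether a module can be imported; it does not verify the version numbers given in the list.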
-
Get code:
<pre>$ git clone https://github.com/itsjeyd/wikitrans-pootle.git</pre>
-
Enter project directory:
<pre>$ cd wikitrans-pootle</pre>
-
Run installation script:
<pre>$ ./dependencies/install.sh</pre>
-
Set up the database:
<pre>$ python manage.py syncdb</pre>
Answer yes when asked whether you would like to create a superuser. You will be prompted for a user name, an email address, and a password.
-
Populate the database:
<pre>$ python manage.py initdb</pre>
-
Precalculate statistics about existing translation projects:
<pre>$ python manage.py refresh_stats</pre>
Please note that
- this will take a while, and
- doing it manually is optional.
If you decide to skip this step, these precalculations will be done when you try to access your WT installation through your browser for the first time.
-
Set up the translation host (MT Serverland):
<pre>$ python manage.py serverland_init</pre>
-
Start the server:
<pre>$ ./PootleServer</pre>
-
In your browser, access the following URL:
<pre>http://localhost:8080/wikitrans</pre>
This will take you to the main page of WikiTrans:
N.B.: The shell commands listed throughout this section consistently assume you are in the root directory of your WT installation; unless you cloned the project into a custom folder or have renamed the folder since you cloned it, the name of this directory should be wikitrans-pootle.
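Because every command assumes the WT root as the working directory, a quick check before running one can save confusion. The sketch below is an illustration only: the helper name `missing_root_files` is hypothetical, and it looks solely for the three files that the commands in this section refer to (`manage.py`, `PootleServer`, and `dependencies/install.sh`).

```python
import os

# Paths that the commands in this section reference relative to the WT root.
EXPECTED_PATHS = [
    "manage.py",
    "PootleServer",
    os.path.join("dependencies", "install.sh"),
]


def missing_root_files(directory):
    """Return the expected paths that do not exist under `directory`."""
    return [path for path in EXPECTED_PATHS
            if not os.path.exists(os.path.join(directory, path))]


if __name__ == "__main__":
    missing = missing_root_files(os.getcwd())
    if missing:
        print("Not in the WT root directory? Missing: " + ", ".join(missing))
    else:
        print("Current directory looks like a WT root.")
```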
Clicking the "Login" button on the main page brings up the login form for WT:
Enter the superuser credentials you provided when you first ran the syncdb command.
After successful login, you will be taken to your "Dashboard":
From your dashboard, navigate to the Articles view by clicking on the "Articles" tab. On the left hand side you will see a list of all Wikipedia articles that have been imported into the system; if you are doing this for the first time, this list will of course be empty. The form on the right hand side can be used to request additional Wikipedia articles you would like to translate (or make available to your end users for translating):
To request an article, enter its full title in the "Title" field. Then, from the drop-down menu next to "Title language", choose the language in which the article is written, and click "Submit Query".
Note that doing this will not cause the article to be imported right away. Instead, WikiTrans provides a custom command that takes care of importing requested articles and making them available to registered users:
<pre>$ python manage.py update_wiki_articles</pre>
This command can be run manually by the administrator or automatically at specific intervals, e.g. by scheduling it as a cron job.
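As a rough illustration of the scheduling option, the sketch below shows a small wrapper script that a cron job could call. It is not part of WT: the names `build_argv` and `run_manage_command` are hypothetical, `/path/to/wikitrans-pootle` is a placeholder, and the sketch assumes that `python` on the PATH is the interpreter WT was installed for.

```python
import subprocess
import sys


def build_argv(command, python="python"):
    """Build the argument list for `python manage.py <command>`."""
    return [python, "manage.py", command]


def run_manage_command(command, wt_root, python="python"):
    """Run a management command from the WT root; return its exit status."""
    return subprocess.call(build_argv(command, python), cwd=wt_root)


if __name__ == "__main__":
    # Placeholder path: replace with the root directory of your WT installation.
    sys.exit(run_manage_command("update_wiki_articles", "/path/to/wikitrans-pootle"))
```

Setting the working directory explicitly matters here because cron does not start jobs in the WT root directory, while the command expects to find `manage.py` there.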
After running the command, the Articles view lists the newly imported article(s):
As soon as a newly requested article has been imported into the system and shows up in the list of articles, you can start working with it.
You can access the content of a specific article by clicking its name in the listing:
When importing articles, WT tries its best to preserve their general structure and formatting. If you do spot a mistake in an article, or if you want to add or remove some parts from the original text before translating it, you can click on "Fix Article" to bring up an editing interface for that article:
There are only two things you need to keep in mind when editing an article:
- Each sentence needs to be on a separate line.
- Paragraphs are separated using single blank lines.
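To make the two rules concrete, the sketch below reads text in this layout and returns its paragraphs as lists of sentences. It only illustrates the expected format and is not WT's own parsing code; the function name `parse_article` is hypothetical.

```python
def parse_article(text):
    """Split article text into paragraphs, each a list of sentences.

    Assumes one sentence per line and a blank line between paragraphs.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        sentence = line.strip()
        if sentence:
            current.append(sentence)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(current)
            current = []
    if current:
        paragraphs.append(current)
    return paragraphs


example = "First sentence.\nSecond sentence.\n\nNew paragraph."
print(parse_article(example))
# [['First sentence.', 'Second sentence.'], ['New paragraph.']]
```

Note that this sketch tolerates several consecutive blank lines, whereas the editing interface asks for exactly one.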
Please note that any changes you decide to make to an article need to be saved before requesting translations for it. Making changes later on will not result in any errors, but they will not be incorporated into the source text that is displayed in the interface for editing translations.
To be able to request translations for an article, you first have to create a project for it. You can do this simply by clicking on the "Create new project" link next to the article. When you are done, the article list should look like this:
Additionally, the system needs to know which target language(s) you would like to work on for a given article. Adding one or more languages to a project works like this:
-
Expand the "Target Language" drop-down and click "Add Language":
This will take you to a page showing a list of available languages to choose from.
-
Hold down the Ctrl key and mark (i.e., left-click) all languages you would like to add to the project:
Note that if you only want to add a single language, there is no need to hold down the Ctrl key.
-
When you are done, click "Submit Changes". This will take you back to the Articles view.
After adding some languages to a project, you can access it by clicking on its name (e.g. "de:Hafenbecken") in the article list:
The project view lists all target languages selected for a given source article and project.
On the project page for a given source article, you can request a translation into a specific target language like this:
If WT lists different sets of workers for different target languages, this is because not every worker can handle every pair of languages available through Pootle.
Since WT depends on external MT services for translating Wikipedia articles, each translation request made by you or one of your end users needs to be forwarded to a translation host. For this purpose, the following command can be used:
<pre>$ python manage.py request_translations</pre>
To import translation results from the translation host into WT, use the following command:
<pre>$ python manage.py fetch_translations</pre>
While request_translations and fetch_translations are the most important commands for dealing with translation requests and therefore need to be run most often, there is another command you should run on a somewhat regular basis:
<pre>$ python manage.py delete_finished_requests</pre>
For each finished request in the system, WT tells the translation host to delete that request. A request is considered to be "finished" if the translation produced for it by the translation host has been imported into WT successfully.
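The three commands can be chained into one maintenance cycle. The sketch below is one possible arrangement and not something WT ships: the names `CYCLE` and `run_cycle` are hypothetical, and running the cleanup step on every cycle is a simplification, since it needs to run less often than the other two. Whether results are already available right after a request depends on the translation host.

```python
import subprocess

# Forward requests, import finished results, then clean up on the host.
CYCLE = ["request_translations", "fetch_translations", "delete_finished_requests"]


def run_cycle(run=None):
    """Run each command in CYCLE in order and stop at the first failure.

    `run` takes an argument list and returns an exit status; it defaults
    to subprocess.call. Returns the commands that completed successfully.
    """
    if run is None:
        run = subprocess.call
    completed = []
    for command in CYCLE:
        if run(["python", "manage.py", command]) != 0:
            break
        completed.append(command)
    return completed


if __name__ == "__main__":
    # Must be started from the root directory of the WT installation.
    print("Completed: " + ", ".join(run_cycle()))
```

Stopping at the first failure keeps the cleanup step from running when requesting or fetching did not succeed.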
For each of the target languages chosen for a given article, WT provides an overview page showing progress information. This overview page can be accessed from the project page by clicking the name of the language.
Clicking on "article.po" takes you to the editing interface for a specific source article and translation, which looks like this:
Individual sentences can be edited in place; clicking "Submit" saves the changes.
Individual projects can be deleted simply by clicking the corresponding "Delete" button in the article listing.