Machine translation in wesnoth using gplv3 apertium software
I want to introduce another set of tools that can provide alternative Machine Translation (MT) services based on apertium.
First, this is a gplv3 tool that can be installed locally and for which you get all the code. In practice, you get all the rules and dictionaries used during MT and you're free to edit them! Second, the project is a spin-off of academic work with the explicit goal of focusing on less widely used languages. Finally, this is a multi-language project that doesn't revolve around US English.
People involved in translation have noticed the improvement of MT in the recent past. Some wesnoth translation teams list DeepL, google-translate and other MT tools as part of their current process. This gives good results because:
- they translate from wesnoth's original US English, which is the best supported language
- they often translate into other widely used languages
- they do a human proof-reading of the MT result, with manual edits when required
- with experience, they can avoid translatable strings that are badly translated by MT (races, unit names...)
Apertium makes a different approach possible:
- MT from different full translations in alternative widely used western languages
- MT to western languages with fewer speakers (based on language proximity)
- less need for human proof-reading as MT is done on similar languages
- possibility to improve MT rules and dictionaries thanks to the gplv3 license
For example from cs to sk, from de to da, from ru to uk, with no falling back to English. The theory is that MT could do a better job at translating between two close languages than from US English. That could reduce the need for the final human check, which is difficult to get in languages with fewer speakers (see my failed attempt at getting translation review).
The last problem with MT is that most tools are web-based and require copy-pasting the translatable strings one by one. That is another benefit of using apertium, as the local install links with pology, which can apply different processes on po files (the format used in wesnoth), such as feeding them to apertium.
A full po file MT translation using pology/apertium basically looks like:
pomtrans apertium -s ita -t srd -d /usr/local/share/apertium/apertium-sr-ita -p srd.:it. po/wesnoth-units/srd.po
[edit] Removed a link to a post on the wesnoth forum. It was a link on "some teams list DeepL, ..." and not on "they translate from ...".[/edit]
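To run the same pomtrans call over several text domains, a small wrapper can help. The following is only a sketch: it reuses exactly the options shown in the command above, and assumes that every domain follows the same po/<domain>/srd.po layout. The function names and the list of domains are mine, not part of pology.

```python
import shlex
import subprocess

# Value copied from the command in the post; adjust it to your own install.
APERTIUM_PAIR_DIR = "/usr/local/share/apertium/apertium-sr-ita"


def pomtrans_command(domain, source="ita", target="srd",
                     pair_dir=APERTIUM_PAIR_DIR, parallel="srd.:it."):
    """Build the pomtrans argument list for one wesnoth text domain.

    The -p value is tied to the source/target pair of the post, so it has
    to be changed by hand when another pair is used.
    """
    return [
        "pomtrans", "apertium",
        "-s", source,
        "-t", target,
        "-d", pair_dir,
        "-p", parallel,
        f"po/{domain}/{target}.po",  # assumed layout: po/<domain>/<target>.po
    ]


def translate_domains(domains, dry_run=True):
    """Print one pomtrans call per domain and run it unless dry_run is set."""
    for domain in domains:
        command = pomtrans_command(domain)
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run only: nothing is executed, the commands are just printed.
    translate_domains(["wesnoth", "wesnoth-lib", "wesnoth-units", "wesnoth-help"])
```

Keeping the dry run as the default makes it easy to check the generated commands before any po file is modified.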
Last edited by demario on June 17th, 2023, 7:47 am, edited 3 times in total.
Re: Machine translation in wesnoth using gplv3 apertium software
When discussing apertium MT in wesnoth, I will work based on the translation data for 1.14 and will apply any MT to the strings from wesnoth in that version.
The reason is twofold:
- as the MT translation will most probably not be reviewed by a native speaker, I would like people who are interested in testing the MT results to take an explicit action (like downloading the add-on containing the core 1.14), so that they use them in full knowledge of the limitations.
- out of respect for the work of translators in these languages, I want to give them a head start so their translations are appreciated in their own right.
There are different factors that can impact the usefulness of apertium MT in wesnoth:
- the maturity of the translation between a pair of languages. The apertium project maintains different levels of progress (trunk, staging, nursery, incubator) and each pair of languages is associated with one of these levels. The result of a translation between two languages in trunk will be better than between two languages in nursery. Most pairs in incubator were last updated long ago and sometimes have few commits. The activity around apertium seems to have taken a hit around 2021.
- one of the two languages of the pair needs to be actively translated for BfW version 1.14. Besides English, wesnoth 1.14 is roughly fully translated in several European languages(*): cs, de, fr, it, es, ru.
- in the other language of the pair, it is better to have some translation available for the domains that contain the strings for generic game information (#wesnoth, #wesnoth-units, #wesnoth-lib...).
As I said before, I will focus on pairs of languages that belong to the same (or a close) linguistic family. If I use wikipedia as reference for European languages, the family tree looks like:
(figure: Indo-European linguistic families)
The European languages are grouped as follows:
(figure: wesnoth linguistic groups, "full" translation shown in bold)
So taking the apertium language pairs into account, we have the following paths for spreading translations for wesnoth:
(figure: apertium translation paths)
Finally, taking into account the apertium state of progress, we end up with two kinds of pairs that could be useful:
- translation into languages currently not available in wesnoth:
French - Arpitan ; French - Occitan ; Italian - Sardinian ; Russian - Kazakh ; English - Welsh ; Spanish - Aragonese
As the target language is not present in wesnoth, all wesnoth-specific words will end up untranslated (i.e. identical to the string in the source language), so either the word must be close enough in the target language or the target audience needs to be bilingual.
The second problem is that the language is not defined in wesnoth, so even if the translation is generated, it can't be selected in the language selection dialog.
- translation into languages currently present in wesnoth:
(in parentheses, translation percentage for mainline core domains in BfW 1.14)
French/Spanish - Catalan (98.98%) ; Czech - Polish (93.74%) ; Russian - Ukrainian (93.07%) ; French - Portuguese (87.66%)
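For anyone who wants to check such percentages on their own po files, here is a minimal sketch that counts translated, fuzzy and untranslated entries. It is an assumption on my side that the figures above count only non-fuzzy translated strings; the parser is deliberately simplified (no unescaping, an entry with plural forms counts as translated as soon as one msgstr has text) and the sample strings are made up.

```python
import re


def _quoted(line):
    """Return the text between the first and last double quote of a line."""
    start, end = line.find('"'), line.rfind('"')
    return line[start + 1:end] if 0 <= start < end else ""


def po_stats(po_text):
    """Count translated, fuzzy and untranslated entries in po file text."""
    stats = {"translated": 0, "fuzzy": 0, "untranslated": 0}
    # Entries in a po file are separated by blank lines.
    for block in re.split(r"\n\s*\n", po_text.strip()):
        fields, current, is_fuzzy = {}, None, False
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("#,") and "fuzzy" in line:
                is_fuzzy = True
            if not line or line.startswith("#"):
                continue  # comments, flags and obsolete (#~) lines
            if line.startswith("msgid_plural"):
                current = "plural"
            elif line.startswith("msgid "):
                current = "msgid"
            elif line.startswith("msgstr"):
                current = "msgstr"  # also covers msgstr[0], msgstr[1], ...
            elif line.startswith("msgctxt"):
                current = "msgctxt"
            elif not line.startswith('"'):
                continue  # unknown keyword, ignored
            if current is not None:
                fields.setdefault(current, []).append(_quoted(line))
        if not "".join(fields.get("msgid", [])):
            continue  # header entry, obsolete entry or comment-only block
        if not "".join(fields.get("msgstr", [])):
            stats["untranslated"] += 1
        elif is_fuzzy:
            stats["fuzzy"] += 1
        else:
            stats["translated"] += 1
    return stats


def percent_translated(stats):
    """Share of non-fuzzy translated entries, in percent."""
    total = sum(stats.values())
    return 100.0 * stats["translated"] / total if total else 0.0


# Made-up sample, only there to exercise the parser.
SAMPLE = r'''
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Elf"
msgstr "XX Elf"

#, fuzzy
msgid "Orc"
msgstr "XX Orc"

msgid "Troll"
msgstr ""

msgid ""
"Long "
"text"
msgstr ""
"XX long "
"text"
'''

if __name__ == "__main__":
    sample_stats = po_stats(SAMPLE)
    print(sample_stats, percent_translated(sample_stats))
```

For real work, msgfmt --statistics from gettext gives the same three counters and is the safer choice.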
(*) the same thinking can be applied from tr to various Central Asian languages.
Last edited by demario on September 10th, 2023, 11:48 pm, edited 10 times in total.
Re: Machine translation in wesnoth using gplv3 apertium software
demario wrote: June 4th, 2023, 12:55 pm
Some wesnoth translation teams list using DeepL, google-translate and other MT tools as part of their current process. It gives good results as:
- they translate from wesnoth original US English which is the best supported language
- they often translate into other widely used languages
- they do a human proof-reading of the MT result with manual edits when required
- with experience, they can avoid translatable strings that are badly translated by MT (races, unit names...)
I get a different interpretation from the post that you've linked to. I think Michal- is doing a human translation, and then sometimes using MT as a sanity-check to compare to, rather than using the MT as the actual translation.
- Lord-Knightmare
Re: Machine translation in wesnoth using gplv3 apertium software
I wanted to use this to help my effort to translate the game to BN, but I see that it's not in the supported languages list, so I will use a tool where it is supported (shown as an option at least).
Re: Machine translation in wesnoth using gplv3 apertium software
demario wrote:
we end up in two kinds of pairs that could be useful:
- translation in language currently not available in wesnoth:
Italian - Sardinian ; ...
So I have put up an experimental Sardinian translation of wesnoth in the core 1.14 ("Bienvenue"). It is machine translated with the apertium platform from the Italian text. The four campaigns in "Bienvenue à Wesnoth ! (Welcome to Wesnoth)" are also available in that language if you download the add-on and load the core "Bienvenue (1.14)". The translation has not been reviewed.
You will have to select the Burmese (mranmabhasa) language to see the Sardinian translation instead, when you start BfW with the option --all-translations (or edit the file data/languages/my_MM.cfg to boost the percent=0 over 80).
I put in attachment the BfW 1.14 #wesnoth-help domain (all MTed strings; no fuzzy, no 'mtrans' marker) in Sardinian for those who want to check it out without the boilerplate.
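For the cfg edit mentioned above, here is a small sketch that raises the percent value so that it ends up over 80. The only things taken from the post are the file name, the percent=0 key and the "over 80" threshold; the layout of the sample [locale] block and the function names are assumptions for illustration.

```python
import re
from pathlib import Path

# Matches a "percent=<number>" line, keeping its indentation.
PERCENT_LINE = re.compile(r"^([ \t]*percent[ \t]*=[ \t]*)(\d+)", re.MULTILINE)


def raise_percent(cfg_text, minimum=81):
    """Return cfg_text with every percent value raised to at least minimum."""
    def bump(match):
        current = int(match.group(2))
        return match.group(1) + str(max(current, minimum))
    return PERCENT_LINE.sub(bump, cfg_text)


def patch_language_file(path, minimum=81):
    """Rewrite the given language cfg file in place (keep a backup first)."""
    cfg_path = Path(path)
    cfg_path.write_text(raise_percent(cfg_path.read_text(encoding="utf-8"), minimum),
                        encoding="utf-8")


# Assumed layout of data/languages/my_MM.cfg, for illustration only.
SAMPLE_CFG = """[locale]
    name="Burmese (mranmabhasa)"
    locale=my_MM
    percent=0
[/locale]
"""

if __name__ == "__main__":
    print(raise_percent(SAMPLE_CFG))
```

Values that are already high enough are left untouched, so the function can be run more than once without side effects.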
demario wrote:
- translation in languages currently present in wesnoth:
(in parenthesis, translation percentage for mainline core domains in BfW 1.14)
French/Spanish - Catalan (98.98%) ; ...
I have used apertium to "complete" the translation of the #wesnoth-help domain from BfW 1.14 in Catalan (out of personal convenience, I worked from French). I picked this domain as it is somewhat less specific to wesnoth, so the vanilla apertium fra-cat pair will possibly lead to some positive results. From checking the results, I can see some untranslated words from French (copiage, collage...) that I'll need to fix later, but it still looks like a foreign language to me lol.
You can check it out with the Catalan translation in attachment (42 MTed strings are identified as fuzzy, as the original 'mtrans' doesn't show up).
Attachments
- wesnoth-1.14.17.po.wesnoth-help.ca.po.gz: BfW 1.14 wesnoth-help m-translated in Catalan (from French) (110.11 KiB)
- wesnoth-1.14.17.po.wesnoth-help.srd.po.gz: BfW 1.14 wesnoth-help m-translated in Sardinian (101.57 KiB)
Last edited by demario on June 17th, 2023, 8:08 am, edited 1 time in total.
Re: Machine translation in wesnoth using gplv3 apertium software
You are right.
For example, I discovered many pop culture references in the DiD achievements which were previously unknown to me, and to translate them accurately I had to find and watch the movies in Czech. Using MT first in this case would probably have hidden some of them.
But I have tried a different approach recently: automatically converting untranslated messages to fuzzies using msgattrib, potrans (DeepL) and msgmerge, and then translating from these DeepL fuzzies, which is much quicker and very tempting.
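The idea behind that workflow can be sketched in a few lines. This is not the actual msgattrib/potrans/msgmerge pipeline, whose exact invocations are not given here. It only shows the principle on in-memory entries: untranslated messages get an MT proposal and are flagged fuzzy, so that a human still has to confirm them. The Entry class and the translate callable are hypothetical names.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Entry:
    """Minimal stand-in for one po entry (hypothetical, for illustration)."""
    msgid: str
    msgstr: str = ""
    fuzzy: bool = False


def prefill_with_mt(entries: Iterable[Entry],
                    translate: Callable[[str], str]) -> List[Entry]:
    """Fill untranslated entries with MT output and mark them as fuzzy.

    Entries that already have a msgstr are returned unchanged, so human
    translations are never overwritten by the machine.
    """
    result = []
    for entry in entries:
        if entry.msgstr:
            result.append(entry)
        else:
            result.append(replace(entry, msgstr=translate(entry.msgid), fuzzy=True))
    return result


if __name__ == "__main__":
    # str.upper stands in for a real MT backend such as DeepL.
    catalog = [Entry("Elf", "Elf (human)"), Entry("Orc")]
    for entry in prefill_with_mt(catalog, str.upper):
        print(entry)
```

Because the proposals are only fuzzies, the game does not display them until a translator has reviewed them and removed the flag.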