[mainline] there is a need for a en_US translation
Each time a translatable string is changed in mainline content (code for the UI, or WML/Lua for content), the corresponding translations in all languages become "fuzzy". This flags that someone needs to check the translation again to see whether it is still valid.
It is up to the translation team to remove the fuzzy status. But statistics show that few teams are able to keep up with the changes, basically only the most active ones: Brazilian Portuguese, Italian, Turkish, Russian (with Spanish, Japanese and Chinese to a lesser extent). Most of the translation teams are overwhelmed by the constant changes, and the fuzzy strings accumulate.
When the translation of a string is still "fuzzy" in a given language at the time of release, running the game in that language displays the English sentence instead, breaking the flow of the text in the selected language.
The fact is that many changes are made to add flavor and color to the dialogs, or to fix semantics, typos or grammar in the original English text. But it is very unlikely that the same flavor can be added in all languages, or that the translations have the same grammatical errors.
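As a rough illustration of the fallback described above, here is a minimal Python sketch. It is not Wesnoth's actual loading code: the `Entry` type, the `build_catalog` and `display_text` helpers and the sample strings are all hypothetical, and it assumes the common gettext behaviour where fuzzy entries are left out of the catalog that the game loads.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One message as a translator would see it in a .po file."""
    msgid: str           # English source string from the code or WML
    msgstr: str          # translation into the target language
    fuzzy: bool = False  # set when the source string changed after translation


def build_catalog(entries):
    """Keep only usable translations: fuzzy or empty entries are skipped."""
    return {e.msgid: e.msgstr for e in entries if e.msgstr and not e.fuzzy}


def display_text(catalog, msgid):
    """Return the translation if there is one, else the English source string."""
    return catalog.get(msgid, msgid)


# Hypothetical French entries, for illustration only.
entries = [
    Entry("End Turn", "Fin du tour"),
    # The English source was reworded, so the old translation is now fuzzy.
    Entry("We must leave now.", "Nous devons partir.", fuzzy=True),
]
catalog = build_catalog(entries)

print(display_text(catalog, "End Turn"))            # translated text
print(display_text(catalog, "We must leave now."))  # English shows through
```

The second lookup returns the English source string even though a translation exists, which is the "English sentence in the middle of translated text" effect.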
Let me quote an example from Descent in Darkness:
1.14:
«By all rights, I should have you executed on the spot, Malin. I cannot believe you let that necromancer corrupt you.»
1.16:
«By all rights, Malin, I should have ya kill’d on tha spot. I can’t believe ya let that necromancer corrupt you.»
There is no change in meaning, and the new text is actually harder to understand for the average person who knows English as a foreign language.
The translation team member who reported this issue described it as leading to "unhappiness", "regression" and "frustration". So we reach a situation where, in order to please US-English speakers, we make the experience of many non-English speakers worse.
There is a way to sort out this situation: offer the best US-English text in an en_US translation.
The strings in the code and mainline content should target stability and ease of translation. It is up to each translation team to add as much flavor as they like.
This is certainly not the only example. I looked for others in the commits from the past year, but I had to stop when the file compiling the changes reached more than 700 lines. I have attached what I collected.
Attachments
- translation.en_US.md: Recent changes in mainline creating "fuzzy" translations (48.22 KiB)
- Pentarctagon
- Project Manager
- Posts: 5603
- Joined: March 22nd, 2009, 10:50 pm
- Location: Earth (occasionally)
Re: [mainline] there is a need for a en_US translation
I don't know how practical this would end up being, so I guess I'm neutral to the idea. It's worth pointing out though that this sort of thing happening is inevitable given the people translating the game into other languages seem to interact with the rest of the dev team rarely if at all. Issues can't be addressed if nobody tells us they exist.
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
Re: [mainline] there is a need for a en_US translation
It's not a lack of interaction, and it's not a fault of the translation teams. Iris commented about it recently on Discord (March 24th, in the voice-acting discussion); I knew about it but assumed the problem was the number of changes between 1.16.0 and 1.16.2 rather than the number of changes between 1.14.x and 1.16.
There's a communication failure in the other direction, in that we're not providing useful information to the translators about which changes are meant to change the meaning of text. Taking the biggest example in the upcoming 1.16.3 update, wesnoth-sof, there are 82 changes between 1.16.2 and 1.16.3, of which
- 1 is a new string
- 2 are clarifications changing "dwarves" to "Shorbear dwarves"
- 1 is meant to fix a plot hole #6554
- 78 would be limited to the en_US translation in demario's suggestion
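For a sense of scale, the breakdown above can be tallied in a few lines of Python. The category labels are shorthand for the list items, not official terminology.

```python
# Counts taken from the list above (wesnoth-sof, 1.16.2 -> 1.16.3).
WORDING_ONLY = "wording only (en_US-only under the proposal)"
changes = {
    "new string": 1,
    "clarification (Shorbear dwarves)": 2,
    "plot-hole fix": 1,
    WORDING_ONLY: 78,
}

total = sum(changes.values())
wording_only = changes[WORDING_ONLY]
still_reach_translators = total - wording_only

print(f"total changes: {total}")
print(f"would still reach translators: {still_reach_translators}")
print(f"share limited to en_US: {wording_only / total:.1%}")
```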
I can see the logic of having an en_US translation and writing any new strings in simplified English. I haven't thought through and seen the downsides yet.
Re: [mainline] there is a need for a en_US translation
It seems to violate the DRY principle having an en_US.po file which mostly just duplicates the strings in the source code. That will make it more work for people editing the English-language text because they will often have to make changes in two different places - inevitably people will sometimes make a change to one and forget to change the other.
Re: [mainline] there is a need for a en_US translation
That's the point of Demario's proposal - that a lot of what's currently changing in the source code would become a single change in en_US.po instead. Yes, currently strings in the source are en_US, but new ones would be written in (whatever the ISO code for Simple English or English for TEFL is), so forgetting to add an en_US translation would show up as a simplified string to en_US users.
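A minimal Python sketch of the lookup being described, assuming the source strings are written in simplified English and en_US is loaded like any other catalog. The strings and the `lookup` helper are made up for illustration; this is not how Wesnoth currently resolves text.

```python
def lookup(msgid, catalog):
    """Return the catalog's text for msgid, or the source string if missing."""
    return catalog.get(msgid, msgid)


# Source strings (msgids), written in simplified English.
SIMPLE_THREAT = "I should have you executed immediately."
SIMPLE_NEW_LINE = "You let the necromancer corrupt you."

# Hypothetical en_US catalog carrying the flavored wording.
en_us = {
    SIMPLE_THREAT: "By all rights, I should have ya kill'd on tha spot.",
    # SIMPLE_NEW_LINE was added to the source, but nobody wrote an en_US entry.
}

print(lookup(SIMPLE_THREAT, en_us))    # flavored en_US text
print(lookup(SIMPLE_NEW_LINE, en_us))  # falls back to the simplified string
```

A forgotten en_US entry degrades to plainer English for en_US users; it does not touch any other language's catalog.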
- Pentarctagon
Re: [mainline] there is a need for a en_US translation
octalot wrote: ↑April 8th, 2022, 8:44 am
It's not a lack of interaction, and it's not a fault of the translation teams. Iris commented about it recently on Discord (March 24th, in the voice-acting discussion); I knew about it but assumed the problem was the number of changes between 1.16.0 and 1.16.2 rather than the number of changes between 1.14.x and 1.16.
There's a communication failure in the other direction, in that we're not providing useful information to the translators about which changes are meant to change the meaning of text.

That might be true. But also case in point, I didn't know the mailing list that demario linked even existed. And that was created (I assume, since its archive history only goes back to 2019) despite us having IRC/Discord, a translations forum here, and our own i18n mailing list (that's mostly not used any more it seems like).
Re: [mainline] there is a need for a en_US translation
For certain minor changes, yes. For a more substantive change which alters the meaning of the text, the writer would need to make a change in both the source code and in the en_US.po file.
It just seems a strange, complex technical solution (is there any other project which does translations that way?) for something which is more of a process issue rather than a technical problem. I.e., why is there so much churn occurring in the translatable strings? (Why are there 80 changes to a single campaign in a bugfix release of the stable branch?)
Re: [mainline] there is a need for a en_US translation
The underlying meaning may not have changed, but I believe this particular revision was part of a deliberate campaign-wide revision/rewrite on nemaara's part. Perhaps there's better communication to be had in making these large changes - I believe the intent, in the example you cite, is to have Malin's townsfolk be more simple-minded (and therefore less open to the idea of necromancy as an acceptable weapon of war) than in previous versions and their speech was rewritten to be more typical of US country 'yokels'. I'm not sure how you'd go about translating/localising such nuances, but in this case it's not exactly accurate to say there is absolutely no change in meaning.
Soli Deo Gloria
Re: [mainline] there is a need for a en_US translation
Interesting thoughts; thanks, everyone, for writing them down.
Pentarctagon wrote: ↑April 8th, 2022, 5:00 am
It's worth pointing out though that this sort of thing happening is inevitable given the people translating the game into other languages seem to interact with the rest of the dev team rarely if at all. Issues can't be addressed if nobody tells us they exist.

What should they warn the dev team about? That changing translatable strings in source is breaking translation?
Or that they are unhappy and frustrated to see the result of their work being reset repeatedly to accommodate US English speakers?

Wedge009 wrote: ↑April 8th, 2022, 2:34 pm
The underlying meaning may not have changed, but I believe this particular revision was part of a deliberate campaign-wide revision/rewrite on nemaara's part. Perhaps there's better communication to be had in making these large changes - I believe the intent, in the example you cite, is to have Malin's townsfolk be more simple-minded (and therefore less open to the idea of necromancy as an acceptable weapon of war) than in previous versions and their speech was rewritten to be more typical of US country 'yokels'. I'm not sure how you'd go about translating/localising such nuances, but in this case it's not exactly accurate to say there is absolutely no change in meaning.

Man, you're the native speaker here. When I say "The meaning of the sentence hasn't changed", you think this is not an accurate statement?
The breakdown that follows only shows a change in the writer's intent; to me, the meaning of the sentence hasn't changed. The complex references to "townsfolk" and "yokels" are probably lost on a large number of translators, and thus unlikely to be found in any translation.
All your detailed explanation (thank you for teaching us) serves as a confirmation to me that all these changes should be limited to an en_US translation.
This is what I describe as flavor in the text.

gnombat wrote: ↑April 8th, 2022, 1:03 pm
It seems to violate the DRY principle having an en_US.po file which mostly just duplicates the strings in the source code. That will make it more work for people editing the English-language text because they will often have to make changes in two different places - inevitably people will sometimes make a change to one and forget to change the other.

There is no need to make changes in 2 different places. If it is a bug fix (4 of 83 cases by octalot's statistics), you fix it in the code, then you will have to translate it during the string freeze (like any language). For the other cases (79 out of 83), you keep the code unchanged and you change only the en_US translation in the po file. At the same time, all translation teams are saved from checking 79 useless fuzzy strings (only 4 new/fuzzy remaining).
Updating a translation is easy. You don't have to wonder which file the string is in; you just open the po file and save it after the change.
It is kind of surprising that you find this process so complex. That is how it is done for every language but US English. The double work you refer to is what all translation teams are required to repeat each time a translation is fuzzy.
Of course, when the additional work is pushed to other people, it may look to devs like things are done in the most efficient way.
- Pentarctagon
Re: [mainline] there is a need for a en_US translation
demario wrote: ↑April 8th, 2022, 11:44 pm
Pentarctagon wrote: ↑April 8th, 2022, 5:00 am
It's worth pointing out though that this sort of thing happening is inevitable given the people translating the game into other languages seem to interact with the rest of the dev team rarely if at all. Issues can't be addressed if nobody tells us they exist.
What should they warn the dev team about? That changing translatable strings in source is breaking translation?
Or that they are unhappy and frustrated to see the result of their work being reset repeatedly to accommodate US English speakers?

I'll go out on a limb and say that the majority of the current dev team, myself included, have approximately 0% understanding of what goes into translating. So, yes, if they're frustrated about something then they do need to communicate that. If all that happens is translation updates are wordlessly sent in, my assumption is simply "well, I guess it's not that bad".
Re: [mainline] there is a need for a en_US translation
demario wrote: ↑April 8th, 2022, 11:44 pm
There is no need to make changes in 2 different places. If it is a bug fix (4 of 83 cases by octalot's statistics), you fix it in the code, then you will have to translate it during the string freeze (like any language). For the other cases (79 out of 83), you keep the code unchanged and you change only the en_US translation in the po file. At the same time, all translation teams are saved from checking 79 useless fuzzy strings (only 4 new/fuzzy remaining).
Updating a translation is easy. You don't have to wonder which file the string is in; you just open the po file and save it after the change.

But, in this case, en_US isn't like any other language - it is the original source language of the text. It's the language the campaign was originally written in, and it's the language that writers will be working with in future revisions and edits. If it were just like any other language, there would be an en_US translator or translation team - but there isn't one. So the burden of maintaining both the text in the .wml source file and the en_US.po file will fall upon whoever is writing and editing the text of the campaign.
In practice, this is what I expect will likely happen:
- Writers and editors will probably work mostly with the en_US.po file and make most of their changes there.
- Over time, the text in the en_US.po file and the text in .wml source files will gradually drift apart as editors make changes to the en_US.po file but forget to update the .wml file (even in cases where the change is substantive and the .wml file should be updated).
- This will make things more difficult for the translators, as they will find that some of the English strings they are translating in their .po file (e.g., fr.po) are out of date and out of sync with the rest of the campaign text.
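One way to make part of this drift visible would be an automated check. The Python sketch below assumes the source strings and the en_US.po entries have already been extracted into plain Python collections; `find_drift` and the sample strings are hypothetical, and this is an illustration rather than an existing Wesnoth tool.

```python
def find_drift(source_msgids, en_us_catalog):
    """Compare source strings with the msgids found in en_US.po.

    Returns (stale, missing):
      stale   - en_US entries whose msgid no longer exists in the source
      missing - source strings that have no en_US entry yet
    """
    source = set(source_msgids)
    translated = set(en_us_catalog)
    stale = sorted(translated - source)
    missing = sorted(source - translated)
    return stale, missing


# Hypothetical data: one source string was removed and another was added.
source_msgids = ["The dwarves attack at dawn.", "We must leave now."]
en_us_catalog = {
    "The dwarves attack at dawn.": "Them dwarves'll be on us come sunrise.",
    "The orcs are coming.": "Orcs! Comin' this way!",
}

stale, missing = find_drift(source_msgids, en_us_catalog)
print("stale en_US entries:", stale)
print("source strings without en_US text:", missing)
```

A check like this only catches entries that have gone stale or missing. It cannot tell whether a change made only in en_US.po altered the meaning, which is the harder case raised in the list above.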
demario wrote: ↑April 8th, 2022, 11:44 pm
It is kind of surprising that you find this process so complex. That is how it is done for every language but US English. The double work you refer to is what all translation teams are required to repeat each time a translation is fuzzy.
Of course, when the additional work is pushed to other people, it may look to devs like things are done in the most efficient way.

I'm not so much concerned about the double work (although it is more work for writers and editors, as I noted above) as I am about the fact that there will be two different English texts with neither one clearly and unambiguously the actual source text. I think this will make things more difficult for translators rather than less. For example, consider the French translation team. Previously, they really only needed to worry about maintaining the fr.po file. Now, there will likely be cases where the English strings in the fr.po file are out of date (as I explained above), so the translators will have to inspect 3 different files - their own fr.po file, the en_US.po file, and the .wml source file - to try to reconcile the differences between the three texts (the French text, the en_US text, and the simplified English text in the WML).
Re: [mainline] there is a need for a en_US translation
gnombat wrote: ↑April 9th, 2022, 2:31 am
In practice, this is what I expect will likely happen:
- Writers and editors will probably work mostly with the en_US.po file and make most of their changes there.
- Over time, the text in the en_US.po file and the text in .wml source files will gradually drift apart as editors make changes to the en_US.po file but forget to update the .wml file (even in cases where the change is substantive and the .wml file should be updated).
...
[Some catastrophic outcome]

Yep
You know, I can probably make up some very bad omen of my own too, starting from the premises that people make mistakes and they don't like to follow rules.
Thanks for contributing yours to the discussion
Re: [mainline] there is a need for a en_US translation
gnombat wrote: ↑April 9th, 2022, 2:31 am
In practice, this is what I expect will likely happen:
- Writers and editors will probably work mostly with the en_US.po file and make most of their changes there.
- Over time, the text in the en_US.po file and the text in .wml source files will gradually drift apart as editors make changes to the en_US.po file but forget to update the .wml file (even in cases where the change is substantive and the .wml file should be updated).
- This will make things more difficult for the translators, as they will find that some of the English strings they are translating in their .po file (e.g., fr.po) are out of date and out of sync with the rest of the campaign text.

Please try adding the steps that a translation team takes to that walkthrough, including the "stare at an entire paragraph that's been marked as changed and work out exactly what changed" parts. Then write down a walkthrough of what happens when the writers and editors make the same changes, but in the current "en_US is the primary source" situation.
The risk of a writer forgetting to update the .wml source file for a significant change is probably less than the chance of a translator mistaking a significant change for a trivial one. The writer's work is also pushed through source control, and it's far more likely that someone will review the writer's work as part of an individual change - whereas the translator is going to get an update with a single file per textdomain, bundling all changes since the last release.
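To illustrate the "work out exactly what changed" step, here is a small Python sketch that uses the standard `difflib` module to list word-level differences between the 1.14 and 1.16 strings quoted earlier in the thread. The `changed_words` helper is hypothetical; this is not a tool the translation teams are known to use.

```python
import difflib

OLD = ("By all rights, I should have you executed on the spot, Malin. "
       "I cannot believe you let that necromancer corrupt you.")
NEW = ("By all rights, Malin, I should have ya kill’d on tha spot. "
       "I can’t believe ya let that necromancer corrupt you.")


def changed_words(old, new):
    """Return (removed, added) word lists from a word-level diff."""
    removed, added = [], []
    for token in difflib.ndiff(old.split(), new.split()):
        if token.startswith("- "):
            removed.append(token[2:])
        elif token.startswith("+ "):
            added.append(token[2:])
        # Lines starting with "  " (unchanged) or "? " (hints) are ignored.
    return removed, added


removed, added = changed_words(OLD, NEW)
print("removed:", removed)
print("added:  ", added)
```

A word list like this shows where the text moved, but not whether the meaning did. That judgement is still left to whoever reads the diff.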
Re: [mainline] there is a need for a en_US translation
Any serious engineering discussion will take into account the possibility (really, the inevitability) that people will make mistakes.
octalot wrote: ↑April 9th, 2022, 2:39 pm
Please try adding the steps that a translation team takes to that walkthrough, including the "stare at an entire paragraph that's been marked as changed and work out exactly what changed" parts. Then write down a walkthrough of what happens when the writers and editors make the same changes, but in the current "en_US is the primary source" situation.

I'm not denying that there are tradeoffs involved here. (As I noted in a previous post, it is no doubt painful for translators when they see that there are 80 different text strings changed in one campaign in a single bugfix release.) I'm suggesting it would be wise to take into account all the positives and negatives when considering an idea like this (and also to consider possible alternatives).
Re: [mainline] there is a need for a en_US translation
Yes, there are some small issues on the dev side with forgetting to update the WML file when necessary, but on the flip side this would be really useful for me. I've actually hesitated to add more "flavor" to the text in some areas like Liberty because I was worried people would be completely unable to translate it. With this, I could go crazy with the en_US translation while leaving a more basic, broadly understandable version in the WML file.