[mainline] there is a need for a en_US translation

Brainstorm ideas of possible additions to the game. Read this before posting!

Moderator: Forum Moderators

demario
Posts: 131
Joined: July 3rd, 2019, 1:05 pm

[mainline] there is a need for a en_US translation

Post by demario »

Each time a translatable string is changed in mainline content (code for UI or WML/Lua for content), the corresponding translations in all languages become "fuzzy". This flags that someone needs to check the translation again to see whether it is still valid.
It is up to the translation team to remove the fuzzy status. But statistics show that few teams are able to keep up with the changes, basically only the most active ones: Brazilian Portuguese, Italian, Turkish, Russian (with Spanish, Japanese and Chinese to a lesser extent). Most of the translation teams are overwhelmed by the constant changes and the fuzzy strings accumulate.

When a string's translation in a given language is still "fuzzy" at release time, running the game in that language displays the English sentence instead, breaking the flow of the text in the selected language.
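
The fallback described above can be sketched as follows. This is a minimal illustration, assuming the usual gettext convention that fuzzy entries are left out of the compiled catalog; `Entry`, `lookup` and the sample strings are hypothetical and not taken from the Wesnoth code base.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One catalog entry: the translated text plus its fuzzy flag."""
    msgstr: str
    fuzzy: bool = False


def lookup(catalog: dict, msgid: str) -> str:
    """Return the translation, or the source string if it is missing or fuzzy."""
    entry = catalog.get(msgid)
    if entry is None or entry.fuzzy or not entry.msgstr:
        return msgid  # fall back to the English source text
    return entry.msgstr


# Hypothetical French catalog: one valid entry, one still marked fuzzy.
catalog = {
    "Attack": Entry("Attaquer"),
    "End Turn": Entry("Fin du tour", fuzzy=True),
}

print(lookup(catalog, "Attack"))    # Attaquer
print(lookup(catalog, "End Turn"))  # End Turn (English shown to the player)
```

The fuzzy entry still holds a translation, but the player never sees it until a translator confirms it.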

The fact is that many changes are made to add flavor and color to the dialogs, or to fix semantics, typos or grammar in the original English text. But it is very unlikely that the same flavor can be added in all languages or that the translations have the same grammatical errors.
Let me quote an example from Descent into Darkness:


1.14:
«By all rights, I should have you executed on the spot, Malin. I cannot believe you let that necromancer corrupt you.»
1.16:
«By all rights, Malin, I should have ya kill’d on tha spot. I can’t believe ya let that necromancer corrupt you.»
There is no change in meaning, the new text being actually harder to understand for average people knowing English as a foreign language.

The translation team member who reported this issue described it as leading to "unhappiness", "regression" and "frustration". So we reach a situation where, in order to please US-English speakers, we make the experience of many non-English speakers worse.

There is a way to resolve this situation: offer the best US-English text in an en_US translation.

The strings in the code and mainline content should aim for stability and ease of translation. It is up to each translation team to add as much flavor as they like.

This is certainly not the only example. I looked for others in the commits from the past year, but I had to stop when the file compiling the changes reached more than 700 lines :annoyed: I have attached what I collected.
Attachments
translation.en_US.md
Recent changes in mainline creating "fuzzy" translations
(48.22 KiB) Downloaded 223 times
Pentarctagon
Project Manager
Posts: 5526
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Re: [mainline] there is a need for a en_US translation

Post by Pentarctagon »

I don't know how practical this would end up being, so I guess I'm neutral to the idea. It's worth pointing out though that this sort of thing happening is inevitable given the people translating the game into other languages seem to interact with the rest of the dev team rarely if at all. Issues can't be addressed if nobody tells us they exist.
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
octalot
General Code Maintainer
Posts: 783
Joined: July 17th, 2010, 7:40 pm
Location: Austria

Re: [mainline] there is a need for a en_US translation

Post by octalot »

It's not a lack of interaction, and it's not a fault of the translation teams. Iris commented about it recently on Discord (March 24th, in the voice-acting discussion); I knew about it but assumed the problem was the number of changes between 1.16.0 and 1.16.2 rather than the number of changes between 1.14.x and 1.16.

There's a communication failure in the other direction, in that we're not providing useful information to the translators about which changes are meant to change the meaning of text. Taking the biggest example in the upcoming 1.16.3 update, wesnoth-sof, there are 82 changes between 1.16.2 and 1.16.3, of which
  • 1 is a new string
  • 2 are clarifications changing "dwarves" to "Shorbear dwarves"
  • 1 is meant to fix a plot hole #6554
  • 78 would be limited to the en_US translation in demario's suggestion
79 of those are detected as fuzzies, with the clarifications and plot hole fix among those 79. There's currently no indication to the translators that those 3 are meant to be significant.
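
As a quick check of the figures above, the breakdown can be tallied as below. The counts are the ones quoted in this post; the category labels are shorthand chosen for this sketch.

```python
# Counts quoted in the post above for wesnoth-sof, 1.16.2 -> 1.16.3.
changes = {
    "new string": 1,
    "clarification (dwarves -> Shorbear dwarves)": 2,
    "plot hole fix (#6554)": 1,
    "wording only (en_US candidates)": 78,
}

total = sum(changes.values())
wording_only = changes["wording only (en_US candidates)"]
significant = total - wording_only

print(total)                        # 82
print(significant)                  # 4
print(f"{wording_only / total:.0%}")  # 95%
```

So roughly 95% of the changes in this update would not need to touch the translatable source strings under the proposal.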

I can see the logic of having an en_US translation and writing any new strings in simplified English. I haven't thought through and seen the downsides yet.
gnombat
Posts: 682
Joined: June 10th, 2010, 8:49 pm

Re: [mainline] there is a need for a en_US translation

Post by gnombat »

octalot wrote: April 8th, 2022, 8:44 am I haven't thought through and seen the downsides yet.
It seems to violate the DRY principle having an en_US.po file which mostly just duplicates the strings in the source code. That will make it more work for people editing the English-language text because they will often have to make changes in two different places - inevitably people will sometimes make a change to one and forget to change the other.
octalot
General Code Maintainer
Posts: 783
Joined: July 17th, 2010, 7:40 pm
Location: Austria

Re: [mainline] there is a need for a en_US translation

Post by octalot »

That's the point of Demario's proposal - that a lot of what's currently changing in the source code would become a single change in en_US.po instead. Yes, currently strings in the source are en_US, but new ones would be written in (whatever the ISO code for Simple English or English for TEFL is), so forgetting to add an en_US translation would show up as a simplified string to en_US users.
Pentarctagon
Project Manager
Posts: 5526
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Re: [mainline] there is a need for a en_US translation

Post by Pentarctagon »

octalot wrote: April 8th, 2022, 8:44 am It's not a lack of interaction, and it's not a fault of the translation teams. Iris commented about it recently on Discord (March 24th, in the voice-acting discussion); I knew about it but assumed the problem was the number of changes between 1.16.0 and 1.16.2 rather than the number of changes between 1.14.x and 1.16.

There's a communication failure in the other direction, in that we're not providing useful information to the translators about which changes are meant to change the meaning of text.
That might be true. But also case in point, I didn't know the mailing list that demario linked even existed. And that was created (I assume, since its archive history only goes back to 2019) despite us having IRC/Discord, a translations forum here, and our own i18n mailing list (that's mostly not used any more it seems like).
gnombat
Posts: 682
Joined: June 10th, 2010, 8:49 pm

Re: [mainline] there is a need for a en_US translation

Post by gnombat »

octalot wrote: April 8th, 2022, 1:44 pm That's the point of Demario's proposal - that a lot of what's currently changing in the source code would become a single change in en_US.po instead.
For certain minor changes, yes. For a more substantive change which alters the meaning of the text, the writer would need to make a change in both the source code and in the en_US.po file.

It just seems a strange, complex technical solution (is there any other project which does translations that way?) for something which is more of a process issue than a technical problem. I.e., why is there so much churn occurring in the translatable strings? (Why are there 80 changes to a single campaign in a bugfix release of the stable branch?)
Wedge009
Developer
Posts: 17
Joined: June 24th, 2009, 11:17 am
Location: Sydney, Australia

Re: [mainline] there is a need for a en_US translation

Post by Wedge009 »

demario wrote: April 8th, 2022, 3:58 am There is no change in meaning, the new text being actually harder to understand for average people knowing English as a foreign language.
The underlying meaning may not have changed, but I believe this particular revision was part of a deliberate campaign-wide revision/rewrite on nemaara's part. Perhaps there's better communication to be had in making these large changes - I believe the intent, in the example you cite, is to have Malin's townsfolk be more simple-minded (and therefore less open to the idea of necromancy as an acceptable weapon of war) than in previous versions and their speech was rewritten to be more typical of US country 'yokels'. I'm not sure how you'd go about translating/localising such nuances, but in this case it's not exactly accurate to say there is absolutely no change in meaning.
Soli Deo Gloria
demario
Posts: 131
Joined: July 3rd, 2019, 1:05 pm

Re: [mainline] there is a need for a en_US translation

Post by demario »

Interesting thoughts, thanks everyone for writing them down.
Pentarctagon wrote: April 8th, 2022, 5:00 am It's worth pointing out though that this sort of thing happening is inevitable given the people translating the game into other languages seem to interact with the rest of the dev team rarely if at all. Issues can't be addressed if nobody tells us they exist.
What should they warn the dev team about? That changing translatable strings in source is breaking translation?
Or that they are unhappy and frustrated to see the result of their work being reset repeatedly to accommodate US English speakers?
Wedge009 wrote: April 8th, 2022, 2:34 pm The underlying meaning may not have changed, but I believe this particular revision was part of a deliberate campaign-wide revision/rewrite on nemaara's part. Perhaps there's better communication to be had in making these large changes - I believe the intent, in the example you cite, is to have Malin's townsfolk be more simple-minded (and therefore less open to the idea of necromancy as an acceptable weapon of war) than in previous versions and their speech was rewritten to be more typical of US country 'yokels'. I'm not sure how you'd go about translating/localising such nuances, but in this case it's not exactly accurate to say there is absolutely no change in meaning.
Man, you're the native speaker here. When I say the meaning of the sentence hasn't changed, you think this is not an accurate statement?
The breakdown that follows just shows a change in what the writer intended, but to me the meaning of the sentence hasn't changed. The complex references to "townsfolk" and "yokels" are probably lost on a large number of translators, and thus unlikely to be found in any translation.
Your detailed explanation (thank you for teaching us) serves as confirmation to me that all these changes should be limited to an en_US translation :lol:
This is what I describe as flavor in the text.
gnombat wrote: April 8th, 2022, 1:03 pm It seems to violate the DRY principle having an en_US.po file which mostly just duplicates the strings in the source code. That will make it more work for people editing the English-language text because they will often have to make changes in two different places - inevitably people will sometimes make a change to one and forget to change the other.
There is no need to make changes in 2 different places. If it is a bug fix (4 of 83 cases by octalot's statistics), you fix it in the code, and then you have to translate it during the string freeze (like any language). For the other cases (79 out of 83), you keep the code unchanged and you change only the en_US translation in the po file. At the same time, all translation teams are saved from checking 79 useless fuzzies (only 4 new/fuzzy remaining).
Updating a translation is easy. You don't have to wonder which file the string is in; you just open the po file and save it after the change.
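
The two kinds of change described above can be sketched like this. It is a toy model under stated assumptions, not how the Wesnoth tooling works: catalogs are plain dictionaries keyed by source string, `flavor_edit` and `substantive_edit` are hypothetical helpers, the "Shorbear" sentence and the `<... text>` placeholders are invented for illustration, and the source string is assumed to keep the plainer 1.14 wording.

```python
def flavor_edit(catalogs, msgid, new_en_us):
    """Wording-only change: touch en_US alone; the source string stays as it is."""
    catalogs["en_US"][msgid] = {"msgstr": new_en_us, "fuzzy": False}
    return []  # no other team has anything to review


def substantive_edit(catalogs, old_msgid, new_msgid):
    """Meaning change: the source string changes, so existing translations go fuzzy."""
    affected = []
    for lang, catalog in catalogs.items():
        if old_msgid in catalog:
            entry = catalog.pop(old_msgid)
            catalog[new_msgid] = {"msgstr": entry["msgstr"], "fuzzy": True}
            affected.append(lang)
    return affected


SRC = "By all rights, I should have you executed on the spot, Malin."
FLAVOR = "By all rights, Malin, I should have ya kill’d on tha spot."

catalogs = {
    "en_US": {},
    "fr": {
        SRC: {"msgstr": "<French text>", "fuzzy": False},
        "The dwarves attack.": {"msgstr": "<French text>", "fuzzy": False},
    },
    "it": {
        SRC: {"msgstr": "<Italian text>", "fuzzy": False},
        "The dwarves attack.": {"msgstr": "<Italian text>", "fuzzy": False},
    },
}

# Flavor goes into en_US only: nothing becomes fuzzy elsewhere.
reviewers = flavor_edit(catalogs, SRC, FLAVOR)
print(reviewers)  # []

# A clarification changes the source string: every team that had it must re-check.
affected = substantive_edit(catalogs, "The dwarves attack.", "The Shorbear dwarves attack.")
print(affected)  # ['fr', 'it']
```

In this model a wording-only edit costs the other teams nothing, while a change of meaning still reaches every language that had translated the string.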

It is kind of surprising that you find this process so complex. That is how it is done for every language but US English. The double work you refer to is what all translation teams are required to do each time a translation becomes fuzzy.
Of course, when the additional work is pushed onto other people, it may look to devs like things are done in the most efficient way.
Pentarctagon
Project Manager
Posts: 5526
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Re: [mainline] there is a need for a en_US translation

Post by Pentarctagon »

demario wrote: April 8th, 2022, 11:44 pm
Pentarctagon wrote: April 8th, 2022, 5:00 am It's worth pointing out though that this sort of thing happening is inevitable given the people translating the game into other languages seem to interact with the rest of the dev team rarely if at all. Issues can't be addressed if nobody tells us they exist.
What should they warn the dev team about? That changing translatable strings in source is breaking translation?
Or that they are unhappy and frustrated to see the result of their work being reset repeatedly to accommodate US English speakers?
I'll go out on a limb and say that the majority of the current dev team, myself included, have approximately 0% understanding of what goes into translating. So, yes, if they're frustrated about something then they do need to communicate that. If all that happens is translation updates are wordlessly sent in, my assumption is simply "well, I guess it's not that bad".
gnombat
Posts: 682
Joined: June 10th, 2010, 8:49 pm

Re: [mainline] there is a need for a en_US translation

Post by gnombat »

demario wrote: April 8th, 2022, 11:44 pm There is no need to make changes in 2 different places. If it is a bug fix (4 of 83 cases by octalot's statistics), you fix it in the code, and then you have to translate it during the string freeze (like any language). For the other cases (79 out of 83), you keep the code unchanged and you change only the en_US translation in the po file. At the same time, all translation teams are saved from checking 79 useless fuzzies (only 4 new/fuzzy remaining).
Updating a translation is easy. You don't have to wonder which file the string is in; you just open the po file and save it after the change.
But, in this case, en_US isn't like any other language - it is the original source language of the text. It's the language the campaign was originally written in, and it's the language that writers will be working with in future revisions and edits. If it were just like any other language, there would be an en_US translator or translation team - but there isn't one. So the burden of maintaining both the text in the .wml source file and the en_US.po file will fall upon whoever is writing and editing the text of the campaign.

In practice, this is what I expect will likely happen:
  • Writers and editors will probably work mostly with the en_US.po file and make most of their changes there.
  • Over time, the text in the en_US.po file and the text in .wml source files will gradually drift apart as editors make changes to the en_US.po file but forget to update the .wml file (even in cases where the change is substantive and the .wml file should be updated).
  • This will make things more difficult for the translators, as they will find that some of the English strings they are translating in their .po file (e.g., fr.po) are out of date and out of sync with the rest of the campaign text.
demario wrote: April 8th, 2022, 11:44 pm It is kind of surprising that you find this process so complex. That is how it is done for every language but US English. The double work you refer to is what all translation teams are required to do each time a translation becomes fuzzy.
Of course, when the additional work is pushed onto other people, it may look to devs like things are done in the most efficient way.
I'm not so much concerned about the double work (although it is more work for writers and editors, as I noted above) as I am about the fact that there will be two different English texts with neither one clearly and unambiguously the actual source text. I think this will make things more difficult for translators rather than less. For example, consider the French translation team. Previously, they really only needed to worry about maintaining the fr.po file. Now, there will likely be cases where the English strings in the fr.po file are out of date (as I explained above), so the translators will have to inspect 3 different files - their own fr.po file, the en_US.po file, and the .wml source file - to try to reconcile the differences between the three texts (the French text, the en_US text, and the simplified English text in the WML).
demario
Posts: 131
Joined: July 3rd, 2019, 1:05 pm

Re: [mainline] there is a need for a en_US translation

Post by demario »

gnombat wrote: April 9th, 2022, 2:31 am In practice, this is what I expect will likely happen:
  • Writers and editors will probably work mostly with the en_US.po file and make most of their changes there.
  • Over time, the text in the en_US.po file and the text in .wml source files will gradually drift apart as editors make changes to the en_US.po file but forget to update the .wml file (even in cases where the change is substantive and the .wml file should be updated).
    ...
[Some catastrophic outcome]
Yep :hmm:

You know, I can probably make up some very bad omen of my own too, starting from the premises that people make mistakes and they don't like to follow rules.
Thanks for contributing yours to the discussion :mrgreen:
octalot
General Code Maintainer
Posts: 783
Joined: July 17th, 2010, 7:40 pm
Location: Austria

Re: [mainline] there is a need for a en_US translation

Post by octalot »

gnombat wrote: April 9th, 2022, 2:31 am In practice, this is what I expect will likely happen:
  • Writers and editors will probably work mostly with the en_US.po file and make most of their changes there.
  • Over time, the text in the en_US.po file and the text in .wml source files will gradually drift apart as editors make changes to the en_US.po file but forget to update the .wml file (even in cases where the change is substantive and the .wml file should be updated).
  • This will make things more difficult for the translators, as they will find that some of the English strings they are translating in their .po file (e.g., fr.po) are out of date and out of sync with the rest of the campaign text.
Please try adding the steps that a translation team takes to that walkthrough, including the "stare at an entire paragraph that's been marked as changed and work out exactly what changed" parts. Then write down a walkthrough of what happens when the writers and editors make the same changes, but in the current "en_US is the primary source" situation.

The risk of a writer forgetting to update the .wml source file for a significant change is probably less than the chance of a translator mistaking a significant change for a trivial one. The writer's work is also pushed through source control, and it's far more likely that someone will review the writer's work as part of an individual change - whereas the translator is going to get an update with a single file per textdomain, bundling all changes since the last release.
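
To give a feel for the "work out exactly what changed" step, here is a small sketch that lists the word-level differences between the two versions of the line quoted in the first post. It uses Python's standard `difflib`; `word_changes` is a hypothetical helper and not part of any translation tool.

```python
import difflib

OLD = ("By all rights, I should have you executed on the spot, Malin. "
       "I cannot believe you let that necromancer corrupt you.")
NEW = ("By all rights, Malin, I should have ya kill’d on tha spot. "
       "I can’t believe ya let that necromancer corrupt you.")


def word_changes(before, after):
    """Return (operation, old words, new words) for every span that differs."""
    a, b = before.split(), after.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


for tag, removed, added in word_changes(OLD, NEW):
    print(f"{tag:8} {removed!r} -> {added!r}")
```

A listing like this shows where the wording moved, but it cannot say whether a given difference is meant to change the meaning, which is the information the translators are missing.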
gnombat
Posts: 682
Joined: June 10th, 2010, 8:49 pm

Re: [mainline] there is a need for a en_US translation

Post by gnombat »

demario wrote: April 9th, 2022, 9:03 am You know, I can probably make up some very bad omen of my own too, starting from the premises that people make mistakes and they don't like to follow rules.
Any serious engineering discussion will take into account the possibility (really, the inevitability) that people will make mistakes.
octalot wrote: April 9th, 2022, 2:39 pm Please try adding the steps that a translation team takes to that walkthrough, including the "stare at an entire paragraph that's been marked as changed and work out exactly what changed" parts. Then write down a walkthrough of what happens when the writers and editors make the same changes, but in the current "en_US is the primary source" situation.
I'm not denying that there are tradeoffs involved here. (As I noted in a previous post, it is no doubt painful for translators when they see that there are 80 different text strings changed in one campaign in a single bugfix release.) I'm suggesting it would be wise to take into account all the positives and negatives when considering an idea like this (and also to consider possible alternatives).
nemaara
Developer
Posts: 333
Joined: May 31st, 2015, 2:13 am

Re: [mainline] there is a need for a en_US translation

Post by nemaara »

Yes, there are some small issues on the dev side with forgetting to update the WML file when necessary, but on the flip side this would be really useful for me. I've actually hesitated to add more "flavor" to the text in some areas like Liberty because I was worried people would be completely unable to translate it. With this, I could go crazy with the en_US translation while leaving a more basic, broadly understandable version in the WML file.