[mainline] there is a need for a en_US translation

octalot
General Code Maintainer
Posts: 783
Joined: July 17th, 2010, 7:40 pm
Location: Austria

Re: [mainline] there is a need for a en_US translation

Post by octalot »

The main aim of the proposal was to reduce the number of changes causing fuzzies. The conversation overnight on Discord focused on the idea of simplified language, but that's a detour from the main proposal - that suggestion came from Nemaara and myself, not the translators.

To determine whether a string has changed meaning, we should ask the opposite question: when someone opens a PR with string changes, ask them which ones are intended to change the meaning, and then we can do a better review of whether their changes match their intent. Being able to pass that information on to the translators is an additional benefit.
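As a rough illustration of recording that intent, here is a minimal Python sketch. The `StringChange` record and the `translators_changelog` helper are hypothetical names invented for this example; nothing like them exists in the Wesnoth tooling as far as this thread says.

```python
from dataclasses import dataclass


@dataclass
class StringChange:
    old: str
    new: str
    meaning_changed: bool  # declared by the PR author, then checked in review


def translators_changelog(changes):
    """Split changes into those needing retranslation and purely cosmetic ones."""
    needs_review = [c for c in changes if c.meaning_changed]
    cosmetic = [c for c in changes if not c.meaning_changed]
    return needs_review, cosmetic


# Hypothetical example strings, not taken from any campaign.
changes = [
    StringChange("The Elves are near.", "The elves are near.", False),
    StringChange("Attack at dawn.", "Attack at dusk.", True),
]
needs_review, cosmetic = translators_changelog(changes)
print(len(needs_review), len(cosmetic))
```

Only the first list would need a translator's attention; the second is the set of changes that ideally would never become fuzzy in the first place.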
Celtic_Minstrel wrote: April 12th, 2022, 11:58 pm This tool already exists to avoid fuzzying strings that contain only typo fixes and other changes that do nothing to the meaning (such as the letter case change you highlighted in SoF).
Is this referring to pofix? It's not that capable a tool, and the letter case changes are context-sensitive; for example, "elves" should usually start with a lower-case letter, but not always.
Celtic_Minstrel wrote: April 12th, 2022, 11:58 pm
octalot wrote: April 11th, 2022, 1:08 am The tool doesn't unwrap the word-wrapped lines in .pot files, which is why the cut&paste would need to be from a diff of the .pot files rather than the diffs of the .cfg files; either way, it's a lot of work.
Maybe it's worth updating the pofix tool to fix this?
Even with that fixed, I don't think pofix would be up to the job; it's a script that needs each ("old", "new") pair to be added to the script itself. For something where a single match could fix multiple lines, such as converting the ASCII apostrophe in Haldric's, that's reasonable, as would fixing #6624. However, if 100 changed lines mean 100 ("old", "new") pairs then it doesn't seem to be the tool for the job.
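To make the scaling problem concrete, here is a minimal Python sketch of a pair-based fix-up in the style described above. It is not pofix's actual code; the `FIXES` table and `apply_fixes` function are made up for the example.

```python
# Hypothetical fix-up table; each entry is an ("old", "new") pair.
FIXES = [
    ("Haldric's", "Haldric\u2019s"),  # ASCII apostrophe -> typographic apostrophe
]


def apply_fixes(msgid, fixes=FIXES):
    """Return msgid with every ("old", "new") pair applied in order."""
    for old, new in fixes:
        msgid = msgid.replace(old, new)
    return msgid


# Hypothetical msgids, only for illustration.
old_msgids = [
    "Haldric's army marches north.",
    "This is Haldric's sword.",
    "The elves retreat.",
]
new_msgids = [apply_fixes(m) for m in old_msgids]
changed = sum(1 for a, b in zip(old_msgids, new_msgids) if a != b)
print(changed)
```

One pair repairs every string containing the apostrophe, which is the good case. When every changed line is unique, the table needs one pair per line, so it grows as fast as the changes themselves.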
Soliton
Site Administrator
Posts: 1680
Joined: April 5th, 2005, 3:25 pm
Location: #wesnoth-mp

Re: [mainline] there is a need for a en_US translation

Post by Soliton »

If pofix is not as usable by people caring for translations now as it used to be in the past decade then maybe it should be improved according to the new needs.

The basic issue is the same as before though, right? Text changes that do not change meaning should need no action from translators. Or are we trying to solve a different issue here? Maybe an additional issue is that there are too many text changes in a stable version? I have not been paying attention there but mentioning 100 changed lines in the context of pofix sounds like it could be the case.
"If gameplay requires it, they can be made to live on Venus." -- scott
Jarom
Posts: 110
Joined: January 4th, 2015, 8:23 pm
Location: Green Isle, Irdya or Poland, Earth - I'm not quite sure

Re: [mainline] there is a need for a en_US translation

Post by Jarom »

Soliton wrote: April 14th, 2022, 5:36 pm If pofix is not as usable by people caring for translations now as it used to be in the past decade then maybe it should be improved according to the new needs.

The basic issue is the same as before though, right? Text changes that do not change meaning should need no action from translators. Or are we trying to solve a different issue here? Maybe an additional issue is that there are too many text changes in a stable version? I have not been paying attention there but mentioning 100 changed lines in the context of pofix sounds like it could be the case.
As far as I'm aware, we're discussing two problems here that en_US translation could solve:
- Small changes breaking translations
- Dialect or other fancy writing being unintelligible for many

The first one is, at least in theory, solvable using some sort of tool or procedure, without an extra translation file.
The second one would call for a separate translation file, if it's decided that it should be addressed.

There's also the issue of the scope of changes covered by a possible new translation file vs the WML source.
octalot
General Code Maintainer
Posts: 783
Joined: July 17th, 2010, 7:40 pm
Location: Austria

Re: [mainline] there is a need for a en_US translation

Post by octalot »

Soliton wrote: April 14th, 2022, 5:36 pm The basic issue is the same as before though, right? Text changes that do not change meaning should need no action from translators.
Yes, that's the basic issue.
Soliton wrote: April 14th, 2022, 5:36 pm Maybe an additional issue is that there are too many text changes in a stable version? I have not been paying attention there but mentioning 100 changed lines in the context of pofix sounds like it could be the case.
The numbers are much larger than you expect:
  • Between 1.16.0 and 1.16.3, it's about 600 changes, of which 500 don't change the meaning.
  • Between late 1.14 and 1.16.0, I think the languages that were at 100% of all strings for 1.14.x went down to 83% complete. So 3000 new or fuzzy strings.
  • Between 1.14.0 and now, around 25 strings were added to pofix. None look like they'd fix more than a couple of lines (excluding the website fixups).
It's not "too many", it's "the number that were required". I say that merely because reducing the number of changes in strings shown to an en_US player shouldn't be an option, although changing how that's implemented is an option.
H-Hour
Posts: 222
Joined: April 14th, 2010, 12:27 pm

Re: [mainline] there is a need for a en_US translation

Post by H-Hour »

This may not apply to Wesnoth's translation workflow, but at my work we use the fuzzy distinction to address this precise problem. We mark a msgid entry as fuzzy to indicate to the translator that they may wish to review their translation, but don't need to because its meaning is broadly the same. If a msgid entry has changed enough that a translation is obsolete, we remove it completely from the po file of all other languages, so that a translator must re-translate it to reach 100% coverage.
Celtic_Minstrel
Developer
Posts: 2166
Joined: August 3rd, 2012, 11:26 pm
Location: Canada

Re: [mainline] there is a need for a en_US translation

Post by Celtic_Minstrel »

Jarom wrote: April 14th, 2022, 6:12 pm As far as I'm aware, we're discussing two problems here that en_US translation could solve:
- Small changes breaking translations
- Dialect or other fancy writing being unintelligible for many
I don't believe a separate en_US translation is an effective solution to either of those problems.

Many cases of small changes breaking translations can be handled by pofix, I think. We could also make a policy to avoid changes that can't be handled by pofix in a stable series, which I imagine would help quite a bit (although it is admittedly somewhat unfortunate that this means even relatively minor campaign revisions may have to wait for the next stable series). Alternatively, a translator's changelist could help translators focus on the most important changes, giving them insight into which changes are ignorable and which definitely require attention. An en_US translation is not an effective solution here because it means the base text is now incorrect (as in, it contains errors in spelling, capitalization, punctuation, perhaps even unintended word substitutions).

As for dialect, the proper solution if it is believed that it may be unintelligible is to add a translator's note explaining the intended meaning and connotation. The translator can then use that note to inform their translation. An en_US translation is not an effective solution here because it removes those nuances from the source text.
Author of The Black Cross of Aleron campaign and Default++ era.
Former maintainer of Steelhive.
demario
Posts: 131
Joined: July 3rd, 2019, 1:05 pm

Re: [mainline] there is a need for a en_US translation

Post by demario »

Celtic_Minstrel wrote: April 12th, 2022, 11:58 pm That might be true, but that makes the resulting translation a poor translation. That nuance may be subtle but it does mean something, and a good translation would alter the translated text to give a similar nuance. For example, dialectical language is commonly translated to a dialect of the target language that has similar connotations to speakers of that language.
So after allowing yourselves the right to break hard earned translations, you now feel like you should also have a say on what is getting in?
It is so easy for you to put such high standards as you are not the one doing the work to reach them :augh:
Iris wrote: Say you got your hands on a Lord of the Rings translation where the translator misinterpreted half of what was being said in English and simplified the rest to the point it reads more like the kind of literature you'd find in elementary school. Would you still say that's a faithful Lord of the Rings translation?
Right, the killing argument must be an analogy :doh:
Should I take the analogy of imported cans of food? Would they be on sale if only half of the ingredients were translated for the local market? I won't, because I would feel stupid bringing up such an unrelated topic. Analogy is just a way to turn a problem into something different that is closer to your point.

Let me break it down for you how your analogy is irrelevant:
- Lord of the Rings is a book; there is nothing to appreciate other than the text itself. Wesnoth is a game: even if you don't get the right feeling from the text, you can still play the map.
- Lord of the Rings is a successful commercial product. Translations are done by professionals for a living. Wesnoth is translated by willing volunteers.
- Lord of the Rings is the work of one person. Wesnoth is made by collaboration of different people with different skills.
- Lord of the Rings is a world standard-setting fiction. Wesnoth campaigns are... work of amateurs :oops:
Celtic_Minstrel wrote: April 12th, 2022, 11:58 pm All that flavour that people spent time putting into the English text would just be missing from the translated texts… and you might even find non-English players complaining that the writing is boring.
rofl. What world are you living in?
In both French and German there is not one single campaign that can be played fully in local language from master branch. But you wouldn't know that, would you?
Celtic_Minstrel wrote: April 12th, 2022, 11:58 pm In order to do a good job, especially on story prose, a translator needs to have a certain level of fluency in both the source and target languages. If Wesnoth's translators can't speak English fluently, then they're making things unnecessarily difficult for themselves. If they still want to translate and do a good job, they should spend some time studying English to improve their fluency. (That said, there is something to be said for having a basic translation as well; at least, it's usually better than no translation as long as it's been proofread by a native speaker of the target language.)
Go ahead, apply your criteria to select translation teams to work with this development team. We will see what kind of organization you will be able to gather by being so elitist :roll:

Beyond the carelessness toward the output of the translation teams' work, this thread is showing contempt for the people who need localization of this game (disguised behind "we care more deeply than you"). You could use this opportunity to wonder about how some lack of language diversity inside the development team might be driving the response to this request. I will not hold my breath on that though :lol:
Last edited by demario on November 8th, 2023, 8:01 pm, edited 2 times in total.
octalot
General Code Maintainer
Posts: 783
Joined: July 17th, 2010, 7:40 pm
Location: Austria

Re: [mainline] there is a need for a en_US translation

Post by octalot »

Celtic_Minstrel wrote: April 15th, 2022, 3:57 am Many cases of small changes breaking translations can be handled by pofix, I think.
Have you tried to do this for the scale and type of changes happening in 1.16, rather than just for unique proper nouns like "Tarrynth"? For example, changing the capitalisation of "elves" in lots of strings without affecting Northern Elves, High Lords of the Elves, etc?
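A small Python sketch of why this is harder than a plain search-and-replace. The `PROTECTED` list and the "preceded by a lower-case word" rule are assumptions made up for this illustration, not an agreed style rule; a real pass would need a reviewed list of proper names.

```python
import re

# Hypothetical exception list of phrases where "Elves" must stay capitalised.
PROTECTED = ["Northern Elves", "High Lords of the Elves"]


def lowercase_elves(text):
    """Lower-case 'Elves' except in protected phrases or at a sentence start."""
    # Hide protected phrases behind placeholders so the regex cannot touch them.
    for i, phrase in enumerate(PROTECTED):
        text = text.replace(phrase, "\x00%d\x00" % i)
    # Only change 'Elves' that directly follows a lower-case letter and a space.
    text = re.sub(r"(?<=[a-z] )Elves\b", "elves", text)
    # Restore the protected phrases.
    for i, phrase in enumerate(PROTECTED):
        text = text.replace("\x00%d\x00" % i, phrase)
    return text


print(lowercase_elves("We met the Elves near the river."))
print(lowercase_elves("We met the Northern Elves near the river."))
print(lowercase_elves("Elves live here."))
```

Even this toy version needs an exception list and a context rule, neither of which fits the one-pair-per-string model.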
Celtic_Minstrel wrote: April 15th, 2022, 3:57 am An en_US translation is not an effective solution here because it means the base text is now incorrect (as in, it contains errors in spelling, capitalization, punctuation, perhaps even unintended word substitutions).
What's the problem here? Although the "English (stable)" text isn't as good as it could be, it's still going to be readable for English readers, and it's more likely to be translated for non-English readers.
demario wrote: April 15th, 2022, 5:38 am In both French and German there is not one single campaign that can be played fully in local language from master branch.
The German translation team only works on 1.16 at the moment, ignoring master because 1.18 won't be released until 2024; we don't even submit .po changes for master (and still, the fuzzies in 1.16 cause a lot of effort). I'd incorrectly assumed that the other translation teams did the same. Would it save much effort for the other teams to only work on 1.16 for the moment?

While a lot of the strings are shared, I'd assumed that the churn of a single string changing multiple times was OK in master.
Celtic_Minstrel
Developer
Posts: 2166
Joined: August 3rd, 2012, 11:26 pm
Location: Canada

Re: [mainline] there is a need for a en_US translation

Post by Celtic_Minstrel »

octalot wrote: April 15th, 2022, 11:15 am
Celtic_Minstrel wrote: April 15th, 2022, 3:57 am An en_US translation is not an effective solution here because it means the base text is now incorrect (as in, it contains errors in spelling, capitalization, punctuation, perhaps even unintended word substitutions).
What's the problem here? Although the "English (stable)" text isn't as good as it could be, it's still going to be readable for English readers, and it's more likely to be translated for non-English readers.
But it's incorrect. At some unspecified point in the future, you'll still need to propagate any changes in the en_US translation back to the source. If an en_US translation is solely a mechanic to make minor changes that don't affect the meaning without fuzzying strings, then it's probably not the worst way to handle that, but in that case you can't really think of it as a "translation". But even if that is its purpose, having the text exist in two places feels like a recipe for confusion. And we need an easy way to propagate those changes back to the source, which probably means a script that updates C++, Lua, and WML files that use those strings to instead use the "translated" version from the en_US po file. Obviously this would only be done at the beginning of an unstable series, and we'd delete the en_US po file at the same time.
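A minimal Python sketch of what such a propagation script might do for WML. It assumes translatable WML strings are written as `_ "text"`, and it ignores escaped quotes, multi-line strings, and the C++ and Lua sources. The `propagate` function and the `en_us` mapping are hypothetical names for this example.

```python
import re


def propagate(source, en_us):
    """Replace each translatable WML string with its en_US 'translation'.

    en_us maps a source msgid to the corrected text from the en_US po file.
    """
    def swap(match):
        text = match.group(1)
        # Fall back to the original text when en_US has no entry for it.
        return '_ "' + en_us.get(text, text) + '"'

    return re.sub(r'_ ?"([^"]*)"', swap, source)


# Hypothetical WML line and en_US mapping, only for illustration.
wml = 'message= _ "The Elves are here."'
en_us = {"The Elves are here.": "The elves are here."}
print(propagate(wml, en_us))
```

Once every source file has been rewritten this way, the en_US po file would carry no information that the source lacks, so it could be deleted.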
H-Hour wrote: April 14th, 2022, 10:58 pm This may not apply to Wesnoth's translation workflow, but at my work we use the fuzzy distinction to address this precise problem. We mark a msgid entry as fuzzy to indicate to the translator that they may wish to review their translation, but don't need to because its meaning is broadly the same. If a msgid entry has changed enough that a translation is obsolete, we remove it completely from the po file of all other languages, so that a translator must re-translate it to reach 100% coverage.
I think this idea is missing the fact that strings are marked fuzzy automatically by the msgmerge(?) tool. It's not like the fuzzy marking is something intentionally added by a developer.

As I mentioned earlier, there's always the option of stripping out all or some of the fuzzy markings for a release, which may help with the issue, but I'm not entirely sure if that is a good idea. It would mean that all those strings with minor changes in the source text still display the translated string, though… but there may be cases where that translated string is no longer a correct translation of the source.
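For reference, stripping the markings is mechanically simple. The Python sketch below works on the text of a .po file and only illustrates the idea; I believe gettext's msgattrib offers a --clear-fuzzy option for the same purpose, which would be the safer choice in practice. The `clear_fuzzy` function and the sample entry are made up for this example.

```python
def clear_fuzzy(po_text):
    """Remove the 'fuzzy' flag and '#|' previous-msgid comments from PO text."""
    out = []
    for line in po_text.splitlines():
        if line.startswith("#|"):
            continue  # previous msgid, only meaningful while the entry is fuzzy
        if line.startswith("#,"):
            flags = [f.strip() for f in line[2:].split(",")]
            flags = [f for f in flags if f and f != "fuzzy"]
            if not flags:
                continue  # the flag line held nothing but 'fuzzy'
            line = "#, " + ", ".join(flags)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical fuzzy entry, as msgmerge --previous might leave it.
sample = """#, fuzzy, c-format
#| msgid "The Elves have %d gold."
msgid "The elves have %d gold."
msgstr "Die Elfen haben %d Gold."
"""
print(clear_fuzzy(sample))
```

The script cannot tell a capitalisation fix from a change of meaning, which is exactly the risk described above.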
demario wrote: April 15th, 2022, 5:38 am elitist
I don't really think it's elitist to ask people to do a good job even though they're volunteers. Okay, sure, perhaps some of the volunteer translators will be less fluent than they really should be for the job, but they really want to help out as best they can. Then they should at least strive to make up for their reduced fluency through research. If there's something they don't understand, they can look it up or ask us. (And some of that burden should rest on us as well, adding translator's notes where the meaning may be unclear.)

I do think we shouldn't allow people to translate if they don't have a sufficient grasp of English to communicate in the language, though. If they really want to translate that badly, they should start out by learning the language.
Author of The Black Cross of Aleron campaign and Default++ era.
Former maintainer of Steelhive.
Pentarctagon
Project Manager
Posts: 5531
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Re: [mainline] there is a need for a en_US translation

Post by Pentarctagon »

Some things we could reasonably do to try and reduce work from string churn:
  • Have a "translators changelog" of the string changes done since the previous point release. This would need to be done for stable and master, since even if nobody should be translating the master branch yet, it will eventually become a new stable branch.
  • Ignore fuzzy strings.
  • Start using pofix again.
  • More specifically decide which campaigns are being actively worked on, or perhaps decide which campaigns won't have work done to them in the near to medium future, and declare those not being worked on as fully string frozen all the time until significant work needs to be done on them.
  • Make an en_US translation, which in some ways sounds like it would function sort of like a "super pofix" - instead of putting trivial changes in the WML/Lua/C++ source and then needing to update the pofix script, trivial changes would be put into the en_US translation and then that would be it. It also sounds like there's some debate on what exactly "trivial" would mean however.
What isn't going to happen, whether or not an en_US translation ends up getting added, is the entire game getting permanently string frozen.
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
H-Hour
Posts: 222
Joined: April 14th, 2010, 12:27 pm

Re: [mainline] there is a need for a en_US translation

Post by H-Hour »

Celtic_Minstrel wrote: April 15th, 2022, 1:59 pm I think this idea is missing the fact that strings are marked fuzzy automatically by the msgmerge(?) tool. It's not like the fuzzy marking is something intentionally added by a developer.
My bad. At my work the fuzzy designation is manually added by developers when needed.
egallager
Posts: 576
Joined: November 19th, 2020, 7:27 pm
Location: Concord, New Hampshire

Re: [mainline] there is a need for a en_US translation

Post by egallager »

H-Hour wrote: April 15th, 2022, 7:53 pm
Celtic_Minstrel wrote: April 15th, 2022, 1:59 pm I think this idea is missing the fact that strings are marked fuzzy automatically by the msgmerge(?) tool. It's not like the fuzzy marking is something intentionally added by a developer.
My bad. At my work the fuzzy designation is manually added by developers when needed.
There are command-line flags for that; I encourage translators to look at their msgmerge --help output; here's mine:

$ msgmerge --help
Usage: msgmerge [OPTION] def.po ref.pot

Merges two Uniforum style .po files together.  The def.po file is an
existing PO file with translations which will be taken over to the newly
created file as long as they still match; comments will be preserved,
but extracted comments and file positions will be discarded.  The ref.pot
file is the last created PO file with up-to-date source references but
old translations, or a PO Template file (generally created by xgettext);
any translations or comments in the file will be discarded, however dot
comments and file positions will be preserved.  Where an exact match
cannot be found, fuzzy matching is used to produce better results.

Mandatory arguments to long options are mandatory for short options too.

Input file location:
  def.po                      translations referring to old sources
  ref.pot                     references to new sources
  -D, --directory=DIRECTORY   add DIRECTORY to list for input files search
  -C, --compendium=FILE       additional library of message translations,
                              may be specified more than once

Operation mode:
  -U, --update                update def.po,
                              do nothing if def.po already up to date

Output file location:
  -o, --output-file=FILE      write output to specified file
The results are written to standard output if no output file is specified
or if it is -.

Output file location in update mode:
The result is written back to def.po.
      --backup=CONTROL        make a backup of def.po
      --suffix=SUFFIX         override the usual backup suffix
The version control method may be selected via the --backup option or through
the VERSION_CONTROL environment variable.  Here are the values:
  none, off       never make backups (even if --backup is given)
  numbered, t     make numbered backups
  existing, nil   numbered if numbered backups exist, simple otherwise
  simple, never   always make simple backups
The backup suffix is '~', unless set with --suffix or the SIMPLE_BACKUP_SUFFIX
environment variable.

Operation modifiers:
  -m, --multi-domain          apply ref.pot to each of the domains in def.po
      --for-msgfmt            produce output for 'msgfmt', not for a translator
  -N, --no-fuzzy-matching     do not use fuzzy matching
      --previous              keep previous msgids of translated messages

Input file syntax:
  -P, --properties-input      input files are in Java .properties syntax
      --stringtable-input     input files are in NeXTstep/GNUstep .strings
                              syntax

Output details:
      --lang=CATALOGNAME      set 'Language' field in the header entry
      --color                 use colors and other text attributes always
      --color=WHEN            use colors and other text attributes if WHEN.
                              WHEN may be 'always', 'never', 'auto', or 'html'.
      --style=STYLEFILE       specify CSS style rule file for --color
  -e, --no-escape             do not use C escapes in output (default)
  -E, --escape                use C escapes in output, no extended chars
      --force-po              write PO file even if empty
  -i, --indent                indented output style
      --no-location           suppress '#: filename:line' lines
  -n, --add-location          preserve '#: filename:line' lines (default)
      --strict                strict Uniforum output style
  -p, --properties-output     write out a Java .properties file
      --stringtable-output    write out a NeXTstep/GNUstep .strings file
  -w, --width=NUMBER          set output page width
      --no-wrap               do not break long message lines, longer than
                              the output page width, into several lines
  -s, --sort-output           generate sorted output
  -F, --sort-by-file          sort output by file location

Informative output:
  -h, --help                  display this help and exit
  -V, --version               output version information and exit
  -v, --verbose               increase verbosity level
  -q, --quiet, --silent       suppress progress indicators

Report bugs in the bug tracker at <https://savannah.gnu.org/projects/gettext>
or by email to <bug-gettext@gnu.org>.
$
Personally I find the --previous flag to be helpful for when there are fuzzies.
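With --previous, a fuzzy entry keeps the old msgid in a `#| msgid` comment next to the new one, so the two can be compared. A minimal Python sketch of such a comparison, using the standard difflib module; the `word_diff` helper and the example strings are made up for this illustration.

```python
import difflib

# Hypothetical previous and current msgids of one fuzzy entry.
previous = "The Elves of the forest are wary of strangers."
current = "The elves of the forest are wary of strangers."


def word_diff(old, new):
    """Return (old_words, new_words) pairs for each changed run of words."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes


print(word_diff(previous, current))
```

Seeing only the changed words makes it quick to decide whether the existing translation still holds.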
Celtic_Minstrel
Developer
Posts: 2166
Joined: August 3rd, 2012, 11:26 pm
Location: Canada

Re: [mainline] there is a need for a en_US translation

Post by Celtic_Minstrel »

I think there's a reasonably high possibility that many of the translators do not use msgmerge, however. Po editors often have the functionality built in so that translators don't need to know how to use the command-line. I have no idea whether they offer similar options; presumably depends on the editor.
Author of The Black Cross of Aleron campaign and Default++ era.
Former maintainer of Steelhive.
demario
Posts: 131
Joined: July 3rd, 2019, 1:05 pm

Re: [mainline] there is a need for a en_US translation

Post by demario »

egallager wrote: April 15th, 2022, 9:12 pm Personally I find the --previous flag to be helpful for when there are fuzzies.
The process applied by wesnoth already uses this option to merge po files. This option is ultimately only useful if the editor is able to perform diff (see this thread). At the time (2008), this function was only available in KDE lokalize. Most of the translators use poedit (546 occurrences over 711 records); lokalize is not available on Windows.

Do you have another experience to share on the benefits of using this option?

Only the following teams have recorded use of lokalize at some time: el, fi, fr, gl, he, hu, id, lt, sr@ijekavianlatin, sr@ijekavian, sr@latin, sr, vi. None of them is among the most active translations (highest is fr rating 11 for mainline campaigns on branch 1.16).
Of course different translators use different tools (eg. lokalize was used by French team only on wesnoth-tsg)
egallager
Posts: 576
Joined: November 19th, 2020, 7:27 pm
Location: Concord, New Hampshire

Re: [mainline] there is a need for a en_US translation

Post by egallager »

demario wrote: April 16th, 2022, 10:32 am
egallager wrote: April 15th, 2022, 9:12 pm Personally I find the --previous flag to be helpful for when there are fuzzies.
The process applied by wesnoth already uses this option to merge po files. This option is ultimately only useful if the editor is able to perform diff (see this thread). At the time (2008), this function was only available in KDE lokalize. Most of the translators use poedit (546 occurrences over 711 records); lokalize is not available on Windows.

Do you have another experience to share on the benefits of using this option?
Well, pull requests on GitHub (specifically Czech translators for "A Little Adventure" sending me translation PRs for it)