[mainline] there is a need for a en_US translation
Moderator: Forum Moderators
Forum rules
Before posting a new idea, you must read the following:
Re: [mainline] there is a need for a en_US translation
The main aim of the proposal was to reduce the number of changes causing fuzzies. The conversation overnight on Discord focused on the idea of simplified language, but that's a detour from the main proposal - that suggestion came from Nemaara and myself, not the translators.
For determining whether a string has changed meaning, we should ask the opposite question: when someone opens a PR with string changes, ask them which ones are intended to change the meaning, and then we can do a better review of whether their changes match their intent. Being able to pass that information on to the translators is an additional benefit.
Celtic_Minstrel wrote: ↑April 12th, 2022, 11:58 pm This tool already exists to avoid fuzzying strings that contain only typo fixes and other changes that do nothing to the meaning (such as the letter case change you highlighted in SoF).
Is this referring to pofix? It's not that capable a tool, and the letter case changes are context-sensitive; for example, "elves" should usually start with a lower-case letter, but not always.
Celtic_Minstrel wrote: ↑April 12th, 2022, 11:58 pm Maybe it's worth updating the pofix tool to fix this?
Even with that fixed, I don't think pofix would be up to the job; it's a script that needs each ("old", "new") pair to be added to the script itself. For something where a single match could fix multiple lines, such as converting the ASCII apostrophe in "Haldric's", that's reasonable, as would fixing #6624. However, if 100 changed lines mean 100 ("old", "new") pairs then it doesn't seem to be the tool for the job.
Re: [mainline] there is a need for a en_US translation
If pofix is not usable by people caring for translations now as it used to be in the past decade, then maybe it should be improved according to the new needs.
The basic issue is the same as before though, right? Text changes that do not change meaning should need no action from translators. Or are we trying to solve a different issue here? Maybe an additional issue is that there are too many text changes in a stable version? I have not been paying attention there but mentioning 100 changed lines in the context of pofix sounds like it could be the case.
"If gameplay requires it, they can be made to live on Venus." -- scott
- Jarom
- Posts: 110
- Joined: January 4th, 2015, 8:23 pm
- Location: Green Isle, Irdya or Poland, Earth - I'm not quite sure
Re: [mainline] there is a need for a en_US translation
Soliton wrote: ↑April 14th, 2022, 5:36 pm If pofix is not usable by people caring for translations now as it used to be in the past decade, then maybe it should be improved according to the new needs.
The basic issue is the same as before though, right? Text changes that do not change meaning should need no action from translators. Or are we trying to solve a different issue here? Maybe an additional issue is that there are too many text changes in a stable version? I have not been paying attention there but mentioning 100 changed lines in the context of pofix sounds like it could be the case.
As far as I'm aware, we're discussing two problems here that an en_US translation could solve:
- Small changes breaking translations
- Dialect or other fancy writing being unintelligible for many
The first one is, at least in theory, solvable using some sort of tool or procedure, without an extra translation file.
The second one would call for a separate translation file, if it's decided that it should be addressed.
There's also the issue of the scope of changes covered by a possible new translation file vs. the WML source.
Re: [mainline] there is a need for a en_US translation
Yes, that's the basic issue.
The numbers are much larger than you expect:
- Between 1.16.0 and 1.16.3, it's about 600 changes, of which 500 don't change the meaning.
- Between late 1.14 and 1.16.0, I think the languages that were at 100% of all strings for 1.14.x went down to 83% complete. So 3000 new or fuzzy strings.
- Between 1.14.0 and now, around 25 strings were added to pofix. None look like they'd fix more than a couple of lines (excluding the website fixups).
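To illustrate the difference discussed above between exact ("old", "new") pairs and a context-sensitive rule, here is a minimal Python sketch. It is not pofix's actual code or data: the pair list, the protected phrases, and both function names are assumptions made up for illustration, and the capitalisation rule is only a rough heuristic.

```python
import re

# Hypothetical exact-match pairs, in the spirit of the ("old", "new") pairs
# described in this thread. Every changed string needs its own entry.
PAIRS = [
    ("The Elves have arrived.", "The elves have arrived."),
]

# Assumed list of phrases in which the capital letter must be kept.
PROTECTED = ("Northern Elves", "High Lords of the Elves")


def apply_pairs(msgid, pairs):
    """Return the replacement for msgid if an exact pair exists, else msgid."""
    for old, new in pairs:
        if msgid == old:
            return new
    return msgid


def lowercase_elves(msgid):
    """Lower-case 'Elves' unless it is in a protected phrase or starts a sentence."""
    # Record where the protected phrases occur so matches inside them are skipped.
    spans = []
    for phrase in PROTECTED:
        for m in re.finditer(re.escape(phrase), msgid):
            spans.append((m.start(), m.end()))

    def repl(m):
        if any(start <= m.start() < end for start, end in spans):
            return m.group(0)
        before = msgid[:m.start()].rstrip()
        if not before or before[-1] in ".!?":
            return m.group(0)  # sentence start keeps its capital
        return "elves"

    return re.sub(r"\bElves\b", repl, msgid)
```

The pair approach is exact but grows by one entry per changed string; the rule covers many strings at once but depends on a hand-maintained exception list, which is the context-sensitivity problem raised earlier in the thread.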
Re: [mainline] there is a need for a en_US translation
This may not apply to Wesnoth's translation workflow, but at my work we use the fuzzy distinction to address this precise problem. We mark a msgid entry as fuzzy to indicate to the translator that they may wish to review their translation, but don't need to because its meaning is broadly the same. If a msgid entry has changed enough that a translation is obsolete, we remove it completely from the po file of all other languages, so that a translator must re-translate it to reach 100% coverage.
- Celtic_Minstrel
- Developer
- Posts: 2222
- Joined: August 3rd, 2012, 11:26 pm
- Location: Canada
Re: [mainline] there is a need for a en_US translation
I don't believe a separate en_US translation is an effective solution to either of those problems.
Many cases of small changes breaking translations can be handled by pofix, I think. We could also make a policy to avoid changes that can't be handled by pofix in a stable series, which I imagine would help quite a bit (although it is admittedly somewhat unfortunate that this means even relatively minor campaign revisions may have to wait for the next stable series). Alternatively, a translator's changelist could help translators focus on the most important changes, giving them insight into which changes are ignorable and which definitely require attention. An en_US translation is not an effective solution here because it means the base text is now incorrect (as in, it contains errors in spelling, capitalization, punctuation, perhaps even unintended word substitutions).
As for dialect, the proper solution if it is believed that it may be unintelligible is to add a translator's note explaining the intended meaning and connotation. The translator can then use that note to inform their translation. An en_US translation is not an effective solution here because it removes those nuances from the source text.
Re: [mainline] there is a need for a en_US translation
Celtic_Minstrel wrote: ↑April 12th, 2022, 11:58 pm That might be true, but that makes the resulting translation a poor translation. That nuance may be subtle but it does mean something, and a good translation would alter the translated text to give a similar nuance. For example, dialectical language is commonly translated to a dialect of the target language that has similar connotations to speakers of that language.
So after allowing yourselves the right to break hard-earned translations, you now feel like you should also have a say on what is getting in?
It is so easy for you to set such high standards when you are not the one doing the work to reach them.
Iris wrote: Say you got your hands on a Lord of the Rings translation where the translator misinterpreted half of what was being said in English and simplified the rest to the point it reads more like the kind of literature you'd find in elementary school. Would you still say that's a faithful Lord of the Rings translation?
Right, the killing argument must be an analogy
Should I take the analogy of imported cans of food? Would they be on sale if only half of the ingredients were translated for the local market? I won't, because I would feel stupid bringing up such an unrelated topic. Analogy is just a way to turn a problem into something different that is closer to your point.
Let me break it down for you how your analogy is irrelevant:
- Lord of the Rings is a book; there is nothing else to appreciate than the text itself. Wesnoth is a game. If you don't get the right feeling from the text, you can still play the map.
- Lord of the Rings is a successful commercial product. Translations are done by professionals for a living. Wesnoth is translated by willing volunteers.
- Lord of the Rings is the work of one person. Wesnoth is made by collaboration of different people with different skills.
- Lord of the Rings is a world standard-setting fiction. Wesnoth campaigns are... the work of amateurs
Celtic_Minstrel wrote: ↑April 12th, 2022, 11:58 pm All that flavour that people spent time putting into the English text would just be missing from the translated texts… and you might even find non-English players complaining that the writing is boring.
rofl. What world are you living in?
In both French and German there is not one single campaign that can be played fully in local language from master branch. But you wouldn't know that, would you?
Celtic_Minstrel wrote: ↑April 12th, 2022, 11:58 pm In order to do a good job, especially on story prose, a translator needs to have a certain level of fluency in both the source and target languages. If Wesnoth's translators can't speak English fluently, then they're making things unnecessarily difficult for themselves. If they still want to translate and do a good job, they should spend some time studying English to improve their fluency. (That said, there is something to be said for having a basic translation as well; at least, it's usually better than no translation as long as it's been proofread by a native speaker of the target language.)
Go ahead, apply your criteria to select translation teams to work with this development team. We will see what kind of organization you will be able to gather by being so elitist
Beyond the carelessness for the output of the translation teams' work, this thread is showing contempt for the people who need localization of this game (disguised behind "we care more deeply than you"). You could use this opportunity to wonder about how some lack of language diversity inside the development team might be driving the response to this request. I will not hold my breath on that though
Last edited by demario on November 8th, 2023, 8:01 pm, edited 2 times in total.
Re: [mainline] there is a need for a en_US translation
Celtic_Minstrel wrote: ↑April 15th, 2022, 3:57 am Many cases of small changes breaking translations can be handled by pofix, I think.
Have you tried to do this for the scale and type of changes happening in 1.16, rather than just for unique proper nouns like "Tarrynth"? For example, changing the capitalisation of "elves" in lots of strings without affecting Northern Elves, High Lords of the Elves, etc?
Celtic_Minstrel wrote: ↑April 15th, 2022, 3:57 am An en_US translation is not an effective solution here because it means the base text is now incorrect (as in, it contains errors in spelling, capitalization, punctuation, perhaps even unintended word substitutions).
What's the problem here? Although the "English (stable)" text isn't as good as it could be, it's still going to be readable for English readers, and it's more likely to be translated for non-English readers.
The German translation team only works on 1.16 at the moment, ignoring master because 1.18 won't be released until 2024; we don't even submit .po changes for master (and still, the fuzzies in 1.16 cause a lot of effort). I'd incorrectly assumed that the other translation teams did the same. Would it save much effort for the other teams to only work on 1.16 for the moment?
While a lot of the strings are shared, I'd assumed that the churn of a single string changing multiple times was OK in master.
- Celtic_Minstrel
- Developer
- Posts: 2222
- Joined: August 3rd, 2012, 11:26 pm
- Location: Canada
Re: [mainline] there is a need for a en_US translation
Celtic_Minstrel wrote: ↑April 15th, 2022, 3:57 am An en_US translation is not an effective solution here because it means the base text is now incorrect (as in, it contains errors in spelling, capitalization, punctuation, perhaps even unintended word substitutions).
octalot wrote: ↑April 15th, 2022, 11:15 am What's the problem here? Although the "English (stable)" text isn't as good as it could be, it's still going to be readable for English readers, and it's more likely to be translated for non-English readers.
But it's incorrect. At some unspecified point in the future, you'll still need to propagate any changes in the en_US translation back to the source. If an en_US translation is solely a mechanic to make minor changes that don't affect the meaning without fuzzying strings, then it's probably not the worst way to handle that, but in that case you can't really think of it as a "translation". But even if that is its purpose, having the text exist in two places feels like a recipe for confusion. And we need an easy way to propagate those changes back to the source, which probably means a script that updates C++, Lua, and WML files that use those strings to instead use the "translated" version from the en_US po file. Obviously this would only be done at the beginning of an unstable series, and we'd delete the en_US po file at the same time.
H-Hour wrote: ↑April 14th, 2022, 10:58 pm This may not apply to Wesnoth's translation workflow, but at my work we use the fuzzy distinction to address this precise problem. We mark a msgid entry as fuzzy to indicate to the translator that they may wish to review their translation, but don't need to because its meaning is broadly the same. If a msgid entry has changed enough that a translation is obsolete, we remove it completely from the po file of all other languages, so that a translator must re-translate it to reach 100% coverage.
I think this idea is missing the fact that strings are marked fuzzy automatically by the msgmerge(?) tool. It's not like the fuzzy marking is something intentionally added by a developer.
As I mentioned earlier, there's always the option of stripping out all or some of the fuzzy markings for a release, which may help with the issue, but I'm not entirely sure if that is a good idea. It would mean that all those strings with minor changes in the source text still display the translated string, though… but there may be cases where that translated string is no longer a correct translation of the source.
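As a rough illustration of what "stripping out the fuzzy markings" means at the file level, here is a minimal Python sketch that removes the fuzzy flag from the "#," flag lines of PO text. The function name is hypothetical and this is not part of Wesnoth's tooling; gettext's msgattrib also has a --clear-fuzzy option that would likely be the more robust choice in practice.

```python
def clear_fuzzy(po_text):
    """Remove the 'fuzzy' flag from every '#,' flag line in PO-format text.

    Other flags on the same line (for example c-format) are kept. A flag line
    that contained only 'fuzzy' is dropped entirely.
    """
    out = []
    for line in po_text.splitlines():
        if line.startswith("#,"):
            flags = [flag.strip() for flag in line[2:].split(",")]
            flags = [flag for flag in flags if flag and flag != "fuzzy"]
            if not flags:
                continue  # nothing left on this flag line
            line = "#, " + ", ".join(flags)
        out.append(line)
    return "\n".join(out)
```

Note that this does nothing to check whether the old translation still matches the changed source text, which is exactly the risk described above.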
I don't really think it's elitist to ask people to do a good job even though they're volunteers. Okay, sure, perhaps some of the volunteer translators will be less fluent than they really should be for the job, but they really want to help out as best they can. Then they should at least strive to make up for their reduced fluency through research. If there's something they don't understand, they can look it up or ask us. (And some of that burden should rest on us as well, adding translator's notes where the meaning may be unclear.)
I do think we shouldn't allow people to translate if they don't have a sufficient grasp of English to communicate in the language, though. If they really want to translate that badly, they should start out by learning the language.
- Pentarctagon
- Project Manager
- Posts: 5565
- Joined: March 22nd, 2009, 10:50 pm
- Location: Earth (occasionally)
Re: [mainline] there is a need for a en_US translation
Some things we could reasonably do to try and reduce work from string churn:
- Have a "translators changelog" of the string changes done since the previous point release. This would need to be done for stable and master, since even if nobody should be translating the master branch yet, it will eventually become a new stable branch.
- Ignore fuzzy strings.
- Start using pofix again.
- More specifically decide which campaigns are being actively worked on, or perhaps decide which campaigns won't have work done to them in the near to medium future, and declare those not being worked on as fully string frozen all the time until significant work needs to be done on them.
- Make an en_US translation, which in some ways sounds like it would function sort of like a "super pofix" - instead of putting trivial changes in the WML/Lua/C++ source and then needing to update the pofix script, trivial changes would be put into the en_US translation and then that would be it. It also sounds like there's some debate on what exactly "trivial" would mean however.
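The "super pofix" idea in the last item can be sketched in a few lines of Python. This is a toy model, not Wesnoth's implementation: the catalogs are plain dictionaries, and the strings, the German text, and the translate function are all made up for illustration.

```python
# Source string as it appears in WML/Lua/C++. It stays untouched during the
# stable series, so the msgid never changes and nothing is marked fuzzy.
SOURCE_MSGID = "The Elves wont help us."

# Hypothetical en_US catalog: trivial fixes live here instead of in the source.
EN_US = {SOURCE_MSGID: "The elves won't help us."}

# Hypothetical German catalog, still keyed by the unchanged msgid.
DE = {SOURCE_MSGID: "Die Elfen werden uns nicht helfen."}


def translate(msgid, catalog):
    """Look up msgid in a catalog, falling back to the source text itself."""
    return catalog.get(msgid, msgid)
```

English players would see the corrected text via the en_US catalog, while every other catalog keeps matching the old msgid. The cost raised earlier in the thread remains: the corrections eventually have to be moved back into the source.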
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
Re: [mainline] there is a need for a en_US translation
Celtic_Minstrel wrote: ↑April 15th, 2022, 1:59 pm I think this idea is missing the fact that strings are marked fuzzy automatically by the msgmerge(?) tool. It's not like the fuzzy marking is something intentionally added by a developer.
My bad. At my work the fuzzy designation is manually added by developers when needed.
Re: [mainline] there is a need for a en_US translation
Celtic_Minstrel wrote: ↑April 15th, 2022, 1:59 pm I think this idea is missing the fact that strings are marked fuzzy automatically by the msgmerge(?) tool. It's not like the fuzzy marking is something intentionally added by a developer.
H-Hour wrote: ↑April 15th, 2022, 7:53 pm My bad. At my work the fuzzy designation is manually added by developers when needed.
There are command-line flags for that; I encourage translators to look at their msgmerge --help output; here's mine:
Code:
$ msgmerge --help
Usage: msgmerge [OPTION] def.po ref.pot
Merges two Uniforum style .po files together. The def.po file is an
existing PO file with translations which will be taken over to the newly
created file as long as they still match; comments will be preserved,
but extracted comments and file positions will be discarded. The ref.pot
file is the last created PO file with up-to-date source references but
old translations, or a PO Template file (generally created by xgettext);
any translations or comments in the file will be discarded, however dot
comments and file positions will be preserved. Where an exact match
cannot be found, fuzzy matching is used to produce better results.
Mandatory arguments to long options are mandatory for short options too.
Input file location:
def.po translations referring to old sources
ref.pot references to new sources
-D, --directory=DIRECTORY add DIRECTORY to list for input files search
-C, --compendium=FILE additional library of message translations,
may be specified more than once
Operation mode:
-U, --update update def.po,
do nothing if def.po already up to date
Output file location:
-o, --output-file=FILE write output to specified file
The results are written to standard output if no output file is specified
or if it is -.
Output file location in update mode:
The result is written back to def.po.
--backup=CONTROL make a backup of def.po
--suffix=SUFFIX override the usual backup suffix
The version control method may be selected via the --backup option or through
the VERSION_CONTROL environment variable. Here are the values:
none, off never make backups (even if --backup is given)
numbered, t make numbered backups
existing, nil numbered if numbered backups exist, simple otherwise
simple, never always make simple backups
The backup suffix is '~', unless set with --suffix or the SIMPLE_BACKUP_SUFFIX
environment variable.
Operation modifiers:
-m, --multi-domain apply ref.pot to each of the domains in def.po
--for-msgfmt produce output for 'msgfmt', not for a translator
-N, --no-fuzzy-matching do not use fuzzy matching
--previous keep previous msgids of translated messages
Input file syntax:
-P, --properties-input input files are in Java .properties syntax
--stringtable-input input files are in NeXTstep/GNUstep .strings
syntax
Output details:
--lang=CATALOGNAME set 'Language' field in the header entry
--color use colors and other text attributes always
--color=WHEN use colors and other text attributes if WHEN.
WHEN may be 'always', 'never', 'auto', or 'html'.
--style=STYLEFILE specify CSS style rule file for --color
-e, --no-escape do not use C escapes in output (default)
-E, --escape use C escapes in output, no extended chars
--force-po write PO file even if empty
-i, --indent indented output style
--no-location suppress '#: filename:line' lines
-n, --add-location preserve '#: filename:line' lines (default)
--strict strict Uniforum output style
-p, --properties-output write out a Java .properties file
--stringtable-output write out a NeXTstep/GNUstep .strings file
-w, --width=NUMBER set output page width
--no-wrap do not break long message lines, longer than
the output page width, into several lines
-s, --sort-output generate sorted output
-F, --sort-by-file sort output by file location
Informative output:
-h, --help display this help and exit
-V, --version output version information and exit
-v, --verbose increase verbosity level
-q, --quiet, --silent suppress progress indicators
Report bugs in the bug tracker at <https://savannah.gnu.org/projects/gettext>
or by email to <bug-gettext@gnu.org>.
$
I find the --previous flag to be helpful for when there are fuzzies.
Wesnoth-related GitHub repos:
General mods collection, SotBEEE, AToTBWaTD, The Earth's Gut, A Little Adventure, FtF
Social media: Mastodon: @egallager@treehouse.systems, Steam: egallager
- Celtic_Minstrel
- Developer
- Posts: 2222
- Joined: August 3rd, 2012, 11:26 pm
- Location: Canada
Re: [mainline] there is a need for a en_US translation
I think there's a reasonably high possibility that many of the translators do not use msgmerge, however. Po editors often have the functionality built in so that translators don't need to know how to use the command-line. I have no idea whether they offer similar options; presumably depends on the editor.
Re: [mainline] there is a need for a en_US translation
The process applied by wesnoth already uses this option to merge po files. This option is ultimately only useful if the editor is able to perform diff (see this thread). At the time (2008), this function was only available in KDE lokalize. Most of the translators use poedit (546 occurrences over 711 records); lokalize is not available on Windows.
Do you have another experience to share on the benefits of using this option?
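For editors that cannot show such a diff, the comparison can be approximated with a small script. With --previous, msgmerge keeps the old source text of a fuzzy entry in a "#| msgid" comment; the Python sketch below reads that comment and reports which words changed. The sample entry and both function names are assumptions for illustration, and the parsing only handles single-line msgids.

```python
import difflib
import re

# Made-up example of a fuzzy entry as written with --previous.
ENTRY = '''#, fuzzy
#| msgid "The Elves have arrived."
msgid "The elves have arrived."
msgstr "Die Elfen sind angekommen."'''


def previous_and_current(entry):
    """Return (previous msgid, current msgid) from one single-line PO entry."""
    prev = re.search(r'^#\| msgid "(.*)"$', entry, re.M)
    cur = re.search(r'^msgid "(.*)"$', entry, re.M)
    return (prev.group(1) if prev else None, cur.group(1) if cur else None)


def word_diff(old, new):
    """Return a list of (old words, new words) pairs for each changed region."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes
```

Calling word_diff(*previous_and_current(ENTRY)) on the sample shows that only the capitalisation of one word changed, which is the kind of hint a translator needs to decide whether the old translation can simply be confirmed.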
Only the following teams have recorded use of lokalize at some time: el, fi, fr, gl, he, hu, id, lt, sr@ijekavianlatin, sr@ijekavian, sr@latin, sr, vi. None of them is among the most active translations (highest is fr rating 11 for mainline campaigns on branch 1.16).
Of course different translators use different tools (e.g. lokalize was used by the French team only on wesnoth-tsg)
Re: [mainline] there is a need for a en_US translation
demario wrote: ↑April 16th, 2022, 10:32 am The process applied by wesnoth already uses this option to merge po files. This option is ultimately only useful if the editor is able to perform diff (see this thread). At the time (2008), this function was only available in KDE lokalize. Most of the translators use poedit (546 occurrences over 711 records); lokalize is not available on Windows.
Do you have another experience to share on the benefits of using this option?
Well, pull requests on GitHub (specifically Czech translators for "A Little Adventure" sending me translation PRs for it)
Wesnoth-related GitHub repos:
General mods collection, SotBEEE, AToTBWaTD, The Earth's Gut, A Little Adventure, FtF
Social media: Mastodon: @egallager@treehouse.systems, Steam: egallager