[mainline] there is a need for a en_US translation
- Jarom
- Posts: 110
- Joined: January 4th, 2015, 8:23 pm
- Location: Green Isle, Irdya or Poland, Earth - I'm not quite sure
Re: [mainline] there is a need for a en_US translation
As a person who had been involved in the Polish translation at some point, I can support the claim about minor changes breaking translations. I admit that I sometimes only skimmed texts marked as fuzzy and accepted the translation without changes, after several previous changes that amounted to no more than adding a comma, article or preposition, possibly missing an actually important addition like the aforementioned "Shorbear dwarves" just because it was unreasonable to scan the whole text looking for a single comma.
Quote: I'll go out on a limb and say that the majority of the current dev team, myself included, have approximately 0% understanding of what goes into translating. So, yes, if they're frustrated about something then they do need to communicate that. If all that happens is translation updates are wordlessly sent in, my assumption is simply "well, I guess it's not that bad".
Well, I'll voice my complaint while we're at it: campaigns get reinvented every major version, and until someone spends several hours, or even days, translating them again from scratch, there will be no translation. I don't think anything can or should be done about it though; it's part of the progress.
Quote: I'm not so much concerned about the double work (although it is more work for writers and editors, as I noted above) as I am about the fact that there will be two different English texts with neither one clearly and unambiguously the actual source text. I think this will make things more difficult for translators rather than less. For example, consider the French translation team. Previously, they really only needed to worry about maintaining the fr.po file. Now, there will likely be cases where the English strings in the fr.po file are out of date (as I explained above), so the translators will have to inspect 3 different files - their own fr.po file, the en_US.po file, and the .wml source file - to try to reconcile the differences between the three texts (the French text, the en_US text, and the simplified English text in the WML).
You're taking it wrong. There is usually only one file to inspect: your own .po, because all the source strings are already there, and they are the actual source text you have to translate, taken directly from the wml and lua files. You only look at the wml files when you're unsure what the context is (i.e. you haven't played this campaign or haven't encountered this particular string). In this case adding en_US would only provide an alternative way to look at the text, possibly providing some disambiguation, not an actual source you need to check. As a bonus, I'll mention the probably well-known fact that many programs don't have the actual strings in their translation files as Wesnoth does, only a translation key (an obscure label serving as an id), so you have to look for the English text in a different file. I wouldn't opt for such a change here though.
Quote: Man, you're the native speaker here. When I say "The meaning of the sentence hasn't changed", you think this is not an accurate statement? All the break-down that follows just shows a change of meaning of the writer, but the sentence meaning hasn't changed to me. All the complex references to "townsfolk", "yokels" are probably lost on a large number of translators, thus unlikely to be found in any translation.
Actually I'd say that while the literal meaning remains the same, the whole meaning does change and could be translated differently in many cases, reflecting the circumstantial differences of the speakers and their personalities. The practicality of such a solution is disputable though, as putting work into making sentences more obscure to speakers of the common variant of a language (I mean, not some countryside lingo) is like putting the cart before the horse.
Quote: Yes there's some small issues on the dev side with forgetting to update the WML file when necessary but on the flip side this would be really useful for me. I've actually hesitated to add more "flavor" to the text in some areas like Liberty because I was worried people would be completely unable to translate it. With this, I could go crazy with the en_US translation while leaving a more basic broadly understandable version in the WML file.
I just want to remind everyone once again that missing translations get replaced by the source text. This is probably the most important reason to create en_US for "flavored" text. For just the comma issue, it would be much better to have someone check incoming changes in the .pot source and somehow un-fuzzy insignificant changes in all translations, rather than violate DRY and maintain two different files. Also, languages that have little or no translation at all might have speakers who know some degree of English, but not all the intricacies and lingo.
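To illustrate the fallback mentioned above, here is a minimal sketch using Python's standard gettext module as a stand-in. Wesnoth's own loading code is not shown in this thread, so the lookup helper, the "xx" language code and the empty directory are assumptions made only for the demonstration. It shows that when no catalogue exists for a language, the player is shown the source string itself.

```python
import gettext
import tempfile


def lookup(msgid: str, localedir: str, lang: str, domain: str = "wesnoth-sof") -> str:
    """Look up msgid for lang; with no catalogue available, the msgid itself comes back."""
    # fallback=True makes gettext return a NullTranslations object instead of
    # raising an error when no .mo file is found for the requested language.
    catalogue = gettext.translation(
        domain, localedir=localedir, languages=[lang], fallback=True
    )
    return catalogue.gettext(msgid)


# An empty directory stands in for a language that has no translation at all
# (hypothetical language code "xx").
with tempfile.TemporaryDirectory() as empty_localedir:
    shown = lookup("Ay, the Sceptre of Fire.", empty_localedir, "xx")

print(shown)  # the untranslated source text is what the player would see
```

So whatever ends up in the WML is also what players of untranslated or partly translated languages read.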
Re: [mainline] there is a need for a en_US translation
Jarom wrote: April 9th, 2022, 7:47 pm
As a person who had been involved in Polish translation at some point, I can support the claim about minor changes breaking translations.
[...]
I just want to remind everyone once again, that missing translations get replaced by source text. This is probably the most important reason to create en_US for "flavored" text. For just the comma issue, it would be much better to have someone check incoming changes in .pot source, and somehow un-fuzzy insignificant changes in all translations, not to violate DRY and maintain two different files. Also languages that have little or no translations at all might have speakers knowing some degree of English, but not all the intricacies and lingo.
Many thanks, Jarom; nothing better than the hard-earned experience of a translator.
Let me summarise where we are now:
- The development team has been informed of the original problem
- The problem was confirmed by one independent translator
- Three ways of fixing the problem are being considered:
  - Improve communication with the translation teams to inform them of the nature of string changes (but we must then maintain such a list independently)
  - Build a tool (or have someone) clear, before translation, the "fuzzy" flag of strings that were changed only to improve the flavor of the US English text
  - Use an en_US translation
- It is time for action...
As octalot has pointed out, there are many changes to the text from Sceptre of Fire in the pipe that would repeat this problem.
I have created an en_US.po file for the wesnoth-sof domain, which I am attaching. Its source strings correspond to version 1.16.2 of BFW (git revision a623cc25), while the en_US translation corresponds to the 1.16.3 text (current rev 6a37f9f). The strings that have been changed are still in the fuzzy state to highlight the change.
Using it as such is a bit of trouble: you need to clear the fuzzy flags and change po/LINGUAS to define the en_US language, but you also have to generate an en_US.po for each text domain (they can be blank) and then run the compilation chain again (to generate the .mo file that BFW needs).
Alternatively you can use your current copy of BFW 1.16.2 (or go back to rev a623cc25 on the BFW 1.16 branch), then unzip the attached wesnoth-sof.mo file and copy it to a new directory translations/en_US/LC_MESSAGES/ and you should be good to go. The fuzzy strings have already been cleared, and on the first dialog of the campaign you should see "Ay, the Sceptre of Fire. The Sceptre [...]" instead of the original "Ay, the Sceptre of Fire. The sceptre [...]". Yep, that's the kind of change we are talking about. If it works for you, you can play the campaign with the most up-to-date US English, without feeling any guilt that what you are enjoying is making the experience miserable for Wesnoth players who are not native speakers of US English.
Benefits of an en_US translation:
1. All the infrastructure already exists
2. It can be implemented on a text-domain basis (for example on [some] campaigns only)
3. It would allow more freedom for writers
4. Could potentially make the translation easier by using simpler text
5. It can be tested right away
Attachments
- wesnoth-sof.mo.gz: Compiled en_US translation for 1.16.3 version Sceptre of Fire (47.18 KiB)
- Pentarctagon
- Project Manager
- Posts: 5564
- Joined: March 22nd, 2009, 10:50 pm
- Location: Earth (occasionally)
Re: [mainline] there is a need for a en_US translation
To be clear, I think more communication from the translation teams is desirable entirely independently of anything else that happens.
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
Re: [mainline] there is a need for a en_US translation
I've just realised it's not enough to just blindly unfuzzy them; that works if the file was 100% translated before, but if it wasn't then the logic ought to preserve the fuzzy status of the old string. So my plan to send out an email saying "here are the 4 significant changes in wesnoth-sof, 2 more are insignificant changes but the system thinks they're new strings, and for the other 94 you can simply unset the flag" doesn't work; and yes, while I was thinking it would work that way I added another 18 insignificant changes to the batch.
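To spell out the rule I mean, here's a rough sketch in Python. It is not an existing tool: the function name and its three inputs are made up for illustration, and deciding whether a change is "significant" is still a human judgement that the sketch simply takes as given.

```python
def fuzzy_after_update(was_translated: bool, was_fuzzy: bool,
                       change_is_significant: bool) -> bool:
    """Decide whether an entry should be fuzzy after its source string changed.

    Hypothetical helper: it only encodes the rule described in the post.
    """
    if not was_translated:
        # Nothing to reuse: the entry stays untranslated, so no fuzzy flag.
        return False
    if change_is_significant:
        # The meaning changed, so the translator has to look at it again.
        return True
    # Insignificant change: keep whatever status the old string had, so a
    # translation that was already fuzzy is not silently promoted.
    return was_fuzzy
```

The last line is the part that a blanket "unset the flag on these 94 strings" instruction gets wrong for files that weren't fully translated beforehand.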
The game engine can load .po files for mainline campaigns too, so it doesn't need the .mo file to be generated, and it doesn't need the LINGUAS file to be edited. However, it doesn't (yet) have an option to load the .po files from the directories where they're kept in source control.
- Find the directory called translations, under which the .mo files are stored. For example, there's a translations/fr/LC_MESSAGES/wesnoth-sof.mo file.
- If there's already a translations/en_US/LC_MESSAGES/wesnoth-sof.mo file, delete it. Otherwise the engine will load the .mo instead (I think, haven't tested that).
- Create a directory translations/wesnoth-sof. Yes, translations should now contain one folder named after a campaign and about fifty named after languages.
- Unzip demario's file and rename it to translations/wesnoth-sof/en_US.po.
- Start Wesnoth with the command-line argument --language en_US.
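The file shuffling in the steps above could be scripted roughly like this. It's only a sketch: install_po and its parameters are hypothetical names, the location of the translations directory depends on your install, and the script only moves files around (it says nothing about what the engine then does with them).

```python
from pathlib import Path
import shutil


def install_po(translations: Path, po_file: Path,
               domain: str = "wesnoth-sof", lang: str = "en_US") -> Path:
    """Place an unzipped .po file where the steps above say it should go."""
    # Remove any compiled catalogue for this language and domain, since it
    # may take precedence over the .po file (untested, as noted above).
    stale_mo = translations / lang / "LC_MESSAGES" / f"{domain}.mo"
    stale_mo.unlink(missing_ok=True)  # needs Python 3.8+

    # translations/<domain>/<lang>.po is the layout described in the steps.
    target_dir = translations / domain
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{lang}.po"
    shutil.copyfile(po_file, target)
    return target


# Example call (both paths are placeholders for your own setup):
# install_po(Path("/path/to/wesnoth/translations"), Path("en_US.po"))
```

After that, starting the game with --language en_US is still a manual step.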
Re: [mainline] there is a need for a en_US translation
That is indeed a good question. Either way, though, if those text changes do not actually change the meaning, the pofix tool should be used to adjust all translations so that there is no need for translators to do anything. No new technical solution is needed.
"If gameplay requires it, they can be made to live on Venus." -- scott
Re: [mainline] there is a need for a en_US translation
The pofix tool hasn't been part of the workflow for a long time (I'm saying that based just on the set of strings that are in it). If pofix is the answer, please could you walk us through how it should be used for the scale of changes that are happening?
I can see it working well on the conversion of ASCII apostrophes to their typographical versions, which will reduce some of the burden. However, handling commas and many of the capitalisation changes looks like it's going to need line-by-line cutting and pasting from a diff of the .pot files. The tool doesn't unwrap the word-wrapped lines in .pot files, which is why the cut&paste would need to be from a diff of the .pot files rather than the diffs of the .cfg files; either way, it's a lot of work.
Handling Nemaara's conversion of Liberty to rural slang seems completely outside the scope of pofix.
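For the word-wrapping part, something like the following could at least produce unwrapped msgids to diff against. This is a sketch and not part of pofix: the function name is made up, escape sequences such as \n are left exactly as written in the file, and msgctxt and msgid_plural lines are ignored. The sample string is a shortened illustration, not an actual entry.

```python
def unwrapped_msgids(pot_text: str) -> list[str]:
    """Return every non-empty msgid in pot_text as a single unwrapped string."""
    msgids = []
    parts = None  # quoted pieces of the msgid being read, or None
    # The trailing empty line makes sure the last msgid gets flushed.
    for raw in pot_text.splitlines() + [""]:
        line = raw.strip()
        if parts is not None and line.startswith('"'):
            parts.append(line)  # continuation line of a wrapped msgid
            continue
        if parts is not None:
            # Strip the surrounding quotes from each piece and join them.
            joined = "".join(piece[1:-1] for piece in parts)
            if joined:  # skips the empty msgid of the file header
                msgids.append(joined)
            parts = None
        if line.startswith("msgid "):
            parts = [line[len("msgid "):]]
    return msgids


SAMPLE = '''msgid ""
"Ay, the Sceptre of Fire. "
"The Sceptre"
msgstr ""
'''

print(unwrapped_msgids(SAMPLE))
```

Running it on the old and new .pot files would give two lists that can be compared string by string, but somebody still has to decide which differences are insignificant.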
Re: [mainline] there is a need for a en_US translation
Interesting undocumented feature. It would work only after the fuzzy flags are cleared though.
Oh you're a non-native speaker that doesn't consider improving grammar in the US English text as a bug fix, right? That's why it is done.
There are benefits of seeing US English as just another language. The speakers of this language have a right to the best reading experience too.
Only the new translations should be put in an update of a stable release, so that the experience for players in all languages improves with each fix release.
And that's why these fixes should be done as part of an en_US translation update instead.
- loonycyborg
- Windows Packager
- Posts: 295
- Joined: April 1st, 2008, 4:45 pm
- Location: Russia/Moscow
Re: [mainline] there is a need for a en_US translation
Having a separate textdomain for text cleanup sounds nice, yes, though still a bit hackish. Just how in sync with the original text strings is it supposed to be? Will it be synced before a new stable release or on some other schedule?
"meh." - zookeeper
Re: [mainline] there is a need for a en_US translation
The aim would be strings that only change if the meaning changes. So they wouldn't be synced up with the en_US translation.
- Pentarctagon
- Project Manager
- Posts: 5564
- Joined: March 22nd, 2009, 10:50 pm
- Location: Earth (occasionally)
Re: [mainline] there is a need for a en_US translation
So if:
1. Yumi (SP content maintainer) thinks having an en_US translation would be beneficial.
2. Nobody else making significant text contributions to mainline has objections to an en_US translation.
3. Translators think having an en_US translation would be beneficial.
That said though, I'm not too clear on where #3 stands, or what would even be a good way to try and get a better answer for it.
- loonycyborg
- Windows Packager
- Posts: 295
- Joined: April 1st, 2008, 4:45 pm
- Location: Russia/Moscow
Re: [mainline] there is a need for a en_US translation
What if someone changes a base string with the intent to change its meaning, as part of adding features and the like? Will they be expected to update the en_US translation on their own too?
"meh." - zookeeper
Re: [mainline] there is a need for a en_US translation
Excluding the special case of wanting to do a regional accent, I think people only need to change en_US.po if that exact base string already exists in en_US.po:
Graphviz source for image:
Re: [mainline] there is a need for a en_US translation
Wow, pretty awesome octalot.
A couple of comments:
"Is it meant to be a dialect(Texan drawl, etc)?"
I think this case in your diagram refers to a new string (as a simple English version needs to be put in the WML).
I am not sure we should go to that level of cleanness. I think that in the first version of a campaign, writers should be able to input any text that meets the standard for content in Wesnoth. If it is in dialect/slang/... I am sure the translators will do all the necessary research/guesses/shortcuts to come up with a translation. The goal here is for this translation to stay valid after the original text is "improved".
"Is it a new string?"
Right, when should that process be applied? For me, any change can be made to text in WML until the text enters the first string freeze for the first stable release. From then on, only qualified (see later) changes can be made to text in WML (including in the development branch). Other changes go in po files.
"Has the text changed meaning?"
That is the biggest problem here. As long as we can find native speakers who see a difference of meaning between:
«By all rights, I should have you executed on the spot, Malin. I cannot believe you let that necromancer corrupt you.»
«By all rights, Malin, I should have ya kill’d on tha spot. I can’t believe ya let that necromancer corrupt you.»
we will always have the same kind of problem. We should avoid arguments about how changing "," to ".", "." to "!", "." to "...", "sceptre" to "Sceptre" are changes of meaning (hence changing the WML, thus breaking the translations). So we need a harder check to decide if the changes are qualified to be done to WML in .cfg files.
I would go for something like "Is the change of text reported in a github issue? Was it accepted?".
If it is not worth reporting a bug, it is definitely not worth breaking translations.
OMG I made a diagram too
Last edited by demario on April 12th, 2022, 11:07 am, edited 1 time in total.
Re: [mainline] there is a need for a en_US translation
The tools for building .po files seem to have an unexpected feature - when building en_US.po or en_AU.po, they automatically fill in the translated text. The same doesn't happen for fr_AU.po. Once the .po file has been edited, they don't overwrite translations, it's just the initial creation step which is surprising, and I don't see options to control it.
Poedit warns that the source and destination languages are the same when editing an en_US.po, but not an en_AU.po. No idea where Poedit is getting "the source is en_US" from, it doesn't seem to be in the file itself.
These aren't insurmountable problems, but they suggest there may be more surprises ahead.
- Celtic_Minstrel
- Developer
- Posts: 2222
- Joined: August 3rd, 2012, 11:26 pm
- Location: Canada
Re: [mainline] there is a need for a en_US translation
That might be true, but that makes the resulting translation a poor translation. That nuance may be subtle but it does mean something, and a good translation would alter the translated text to give a similar nuance. For example, dialectal language is commonly translated to a dialect of the target language that has similar connotations to speakers of that language.
In order to do a good job, especially on story prose, a translator needs to have a certain level of fluency in both the source and target languages. If Wesnoth's translators can't speak English fluently, then they're making things unnecessarily difficult for themselves. If they still want to translate and do a good job, they should spend some time studying English to improve their fluency. (That said, there is something to be said for having a basic translation as well; at least, it's usually better than no translation as long as it's been proofread by a native speaker of the target language.)
This isn't such a big deal for user interface text. If there are multiple translators and some are less fluent in English, I'd recommend the less fluent ones focus on the shorter user interface strings.
Note that translation is not a science. It's an art in and of itself. The translator needs to rewrite the entire text in their target language, and do so in a way that gives speakers of the target language as similar an impression as possible as speakers of the source language would get from the source text.
First of all, I don't think this is the correct approach. In order to get an accurate translation, all translators should work off the source text, which in Wesnoth's case is the en_US text. If you instead set simple English as the source with an en_US translation, then translators are translating the wrong source and will produce a less accurate translation. All that flavour that people spent time putting into the English text would just be missing from the translated texts… and you might even find non-English players complaining that the writing is boring.
This tool already exists to avoid fuzzying strings that contain only typo fixes and other changes that do nothing to the meaning (such as the letter case change you highlighted in SoF). However, it would be incorrect to use it on a string where flavour was added, for example by using dialectal vocabulary. It would also be incorrect to use it on most punctuation changes, as those usually alter the meaning subtly as well. So, I'm afraid this solution can't actually solve the problem of string churn. It can only alleviate it a little.
There may be another tool-based approach that could help, though. I think the idea that fuzzy strings are simply not shown (falling back to the source text) is not necessarily the correct choice with a complex prose-based game like Wesnoth. We could have some tool that defuzzes strings satisfying certain criteria as part of the release process, so that the translator will receive the properly-fuzzied strings, but some of those fuzzy strings will actually be shown in the release even if the translator does not update them. This might be harder than it sounds, as those "certain criteria" could end up being quite complex, and indeed I'm not sure the approach is viable (perhaps those criteria are just too complex to automate it). But it could be something to think about.
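As a very rough idea of what the simplest such criterion might look like, here is a sketch in Python. It is not an existing tool, and the rule it encodes is an assumption chosen only for illustration: two source strings count as trivially different if they match after ignoring letter case, typographic versus ASCII quotes, and whitespace. Punctuation is deliberately not ignored, since those changes usually do shift the meaning.

```python
# Map typographic quotes and apostrophes to their ASCII counterparts.
_QUOTES = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # single quotes / apostrophe
    "\u201c": '"', "\u201d": '"',   # double quotes
})


def normalise(text: str) -> str:
    """Reduce a string to the form used for the 'trivial change' comparison."""
    collapsed = " ".join(text.translate(_QUOTES).split())
    return collapsed.casefold()


def is_trivial_change(old: str, new: str) -> bool:
    """True if old and new differ only in case, quote style or whitespace."""
    return old != new and normalise(old) == normalise(new)


print(is_trivial_change("The sceptre", "The Sceptre"))   # case change only
print(is_trivial_change("executed", "kill\u2019d"))      # real rewording
```

Anything beyond this (added flavour, dialect, reordered clauses) would fall outside such a check, which is exactly where the criteria start getting too complex to automate.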
So ultimately I think the only thing we can really do is produce a "translations changelog" for each release. I don't know what exactly this would look like, but the purpose would be to communicate to the translators which fuzzy strings they should prioritize for updates and which ones would not lose much if they did not update them (merely clearing the fuzzy status).
demario wrote: April 10th, 2022, 10:41 am
Benefits of an en_US translation:
1. All the infrastructure already exists
2. It can be implemented on a text-domain basis (for example on [some] campaigns only)
3. It would allow more freedom for writers
4. Could potentially make the translation easier by using simpler text
5. It can be tested right away
So basically, #4 is not a benefit. It is a net negative, which will actually serve to make the translations worse.
Maybe it's worth updating the pofix tool to fix this?
Yeah, that's definitely outside the scope of pofix.