Encodings && font problems
Moderator: Forum Moderators
Encodings && font problems
OK, Dave. I downloaded the latest version via CVS and added the line
encoding=UTF-8
to my file czech.cfg.
It's better than it was before, but there are squares instead of some characters. That's not because of a bug in your code; it's because the Unicode font vera.ttf that ships with BfW is very small. It is only about 70 KB, so nobody should expect it to contain all the characters Unicode supports. Alas, it lacks, for example, some characters that are important for my language. I deleted vera.ttf and renamed the font file cyberbit.ttf (about 13 MB; it's one of the fonts that support nearly all languages) to vera.ttf, so the game loaded this font instead of Vera.
The result was:
1) Unexpected slowdown. Maybe you are reopening the font file and refreshing its image in memory too often, instead of using it in some more efficient way?
2) I saw that all Czech characters can be used in the game if a proper font is supplied. That's good.
3) I noticed that if a phrase that replaces a weapon name in the unit window on the right ends with non-English characters, these "foreign" characters aren't displayed. That is probably a bug in the BfW code, I think.
I understand that it's hard for English-speaking developers to support unusual encodings, because it's hard for them to test their work. Well, I think I can spend some time testing.
Re: Encodings && font problems
Sofronius wrote: The result was:
1) Unexpected slowdown. Maybe you are reopening the font file and refreshing its image in memory too often, instead of using it in some more efficient way?
No, I'm loading the font file once and caching it in memory until the game exits. I'm not sure what would cause the slowdown.
Sofronius wrote: 3) I noticed that if a phrase that replaces a weapon name in the unit window on the right ends with non-English characters, these "foreign" characters aren't displayed. That is probably a bug in the BfW code, I think.
Okay, well, send me the translation file, a link to where I can download your font file, and an example of a string that doesn't display properly, and I'll look into it.
David
Hi, Dave
Dave wrote: No, I'm loading the font file once and caching it in memory until the game exits. I'm not sure why the slowdown would be.
Hmm. I noticed the slowdown with big font files only when starting the game or starting a scenario. If I start the game and then the tutorial, I see the message:
Opening font file:/fonts/Vera.ttf
about six times. So I thought that... Hm. Maybe that's some feature of SDL.
(Later: Okay, I searched the sources and I understand why the message is printed six times. But I still don't understand the slowdown.)
Dave wrote: Okay, well send me the translation file, and a link to where I can download your font file, and an example of a string that doesn't display properly, and I'll look into it.
Honestly, I tried, but I haven't found any usable free font that can be downloaded freely and supports my language.
On Linux, I am using some TrueType fonts from Windows and some fonts that were once free to download (i.e. without cost) but can no longer be officially downloaded for free. The fonts from
http://savannah.nongnu.org/projects/freefont/
have a line-spacing bug at the moment, so they are unusable. I hope they will repair it.
So I tried some bug tracking on my own, as it's far more interesting than browsing in vain for free fonts.
The unwanted stripping was found in file config.cpp, function strip, line
str.erase(std::find_if(str.rbegin(),str.rend(),isgraph).base(),str.end());
Changing isgraph to notspace helped me (a change similar to the last one mentioned in CVS, just five lines down). I would be very glad if a change of this sort could be made in CVS. It's the same issue as before: UTF-8 uses bytes with values over 127, so they should not be stripped away.
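To illustrate the change described above, here is a minimal sketch of a strip function that trims only ASCII whitespace and leaves bytes above 127 alone. It is not the actual code from config.cpp: the body of notspace and the signature of strip are assumptions made for the example.

```cpp
#include <algorithm>
#include <string>

// Hypothetical predicate: true for every byte that is not ASCII whitespace.
// Bytes >= 0x80 (parts of multi-byte UTF-8 characters) count as non-space,
// unlike isgraph, which may reject them depending on the locale.
inline bool notspace(char c)
{
    const unsigned char uc = static_cast<unsigned char>(c);
    return !(uc == ' ' || uc == '\t' || uc == '\n' ||
             uc == '\r' || uc == '\v' || uc == '\f');
}

// Trim whitespace from both ends of str in place (assumed signature).
inline std::string& strip(std::string& str)
{
    // Erase leading whitespace.
    str.erase(str.begin(), std::find_if(str.begin(), str.end(), notspace));
    // Erase trailing whitespace; base() turns the reverse iterator into
    // the position just past the last non-space byte.
    str.erase(std::find_if(str.rbegin(), str.rend(), notspace).base(), str.end());
    return str;
}
```

With this predicate, a value that ends in a two-byte character such as š (bytes 0xC5 0xA1) keeps both bytes, while surrounding spaces and newlines are still removed.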
I think two more things should be done for full UTF-8 support:
1) All file preprocessing code should be rewritten so that it can handle even unusual UTF-8 characters. For example, a character consisting of two bytes, the second one being ".
I am afraid that such wide characters are used in UTF-8, so there is a chance we will meet them in some translation one day. You should know about it.
But this is low priority, as no one has encountered this problem yet, and it may be that all problematic characters come from UTF-8 subsets (Braille, math symbols, etc.) that will never be used with this game (we can hope).
2) Because there are no free (as in freedom) fonts supporting most languages, it's clear that we cannot even think about distributing fonts for all translations. It would be strange if we did so, anyway.
But we should give users a chance to use some of the fonts they probably already have (for example with a localized Windows). I think there should be an option in the config file, "Path_to_font" or something similar.
And a new tag for translation files, something like:
SpecialFontRequired=Yes/No
When Yes is written in the language file, then after this language is selected, a dialog box will appear with a message like
"Dear user, unfortunately we are not able to supply you with a proper font for this translation. If you want the translation to be displayed correctly, please enter the full path to one of your localized TrueType files (*.ttf) in the option Path_to_font in file ....blabla"
It's less dirty than renaming some font to Vera.ttf.
And I hope it's easy to add.
Thanks for the support and all the work you do.
Sofronius wrote: Hmm. I noticed the slowdown with big font files only when starting the game or starting a scenario. If I start the game and then the tutorial, I see the message:
Opening font file:/fonts/Vera.ttf
about six times. So I thought that... Hm. Maybe that's some feature of SDL.
Well, SDL requires loading of the font file for each different font size used. The function is,
TTF_Font* TTF_OpenFont(const char* file, int ptsize);
And that's probably the slowdown: each one is cached in memory, so there'll be something like 6 x 13 megs in memory = a lot of memory usage.
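A rough sketch of the caching behaviour described here, assuming one loaded font object per (file, size) pair. FontHandle, FontCache and the loader callback are hypothetical names for illustration, not the game's code; a real loader would call TTF_OpenFont, which is replaced by a callback here so the example stays self-contained.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for a loaded font; a real build would wrap the SDL_ttf handle.
struct FontHandle {
    std::string path;
    int size;
};

// Hypothetical cache: each (path, size) pair is loaded at most once.
class FontCache {
public:
    using Loader =
        std::function<std::shared_ptr<FontHandle>(const std::string&, int)>;

    explicit FontCache(Loader loader) : loader_(std::move(loader)) {}

    // Return the cached font for (path, size), loading it on first use only.
    std::shared_ptr<FontHandle> get(const std::string& path, int size)
    {
        const auto key = std::make_pair(path, size);
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            it = cache_.emplace(key, loader_(path, size)).first;
        }
        return it->second;
    }

    // Number of distinct (path, size) entries currently held in memory.
    std::size_t loaded_count() const { return cache_.size(); }

private:
    Loader loader_;
    std::map<std::pair<std::string, int>, std::shared_ptr<FontHandle>> cache_;
};
```

With six sizes in use, loaded_count() reaches six. If every entry holds its own copy of a 13 MB file, that adds up to the memory usage estimated above.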
Sofronius wrote: So I tried some bug tracking on my own, as it's far more interesting than browsing in vain for free fonts.
The unwanted stripping was found in file config.cpp, function strip, line
str.erase(std::find_if(str.rbegin(),str.rend(),isgraph).base(),str.end());
Changing isgraph to notspace helped me (a change similar to the last one mentioned in CVS, just five lines down). I would be very glad if a change of this sort could be made in CVS. It's the same issue as before: UTF-8 uses bytes with values over 127, so they should not be stripped away.
Oops. I fixed the bug where it was stripping characters at the front of the string a while ago, but missed fixing the same bug at the end of the string. I have committed a fix for this to CVS.
Sofronius wrote: I think two more things should be done for full UTF-8 support:
1) All file preprocessing code should be rewritten so that it can handle even unusual UTF-8 characters. For example, a character consisting of two bytes, the second one being ".
I am afraid that such wide characters are used in UTF-8, so there is a chance we will meet them in some translation one day. You should know about it.
But this is low priority, as no one has encountered this problem yet, and it may be that all problematic characters come from UTF-8 subsets (Braille, math symbols, etc.) that will never be used with this game (we can hope).
Yes, this is a problem that I have considered, but I am still working on a solution. There could also be character sequences that contain '\n', null, or whitespace (that gets stripped).
I'm not sure how we're going to solve this problem. I'm still thinking about it.
Sofronius wrote: 2) Because there are no free (as in freedom) fonts supporting most languages, it's clear that we cannot even think about distributing fonts for all translations. It would be strange if we did so, anyway.
But we should give users a chance to use some of the fonts they probably already have (for example with a localized Windows). I think there should be an option in the config file, "Path_to_font" or something similar.
And a new tag for translation files, something like:
SpecialFontRequired=Yes/No
When Yes is written in the language file, then after this language is selected, a dialog box will appear with a message like
"Dear user, unfortunately we are not able to supply you with a proper font for this translation. If you want the translation to be displayed correctly, please enter the full path to one of your localized TrueType files (*.ttf) in the option Path_to_font in file ....blabla"
It's less dirty than renaming some font to Vera.ttf.
And I hope it's easy to add.
Yes, I think something like this is the best solution. I'll look into it.
David
For anyone reading this thread in the future:
We have discovered that UTF-8 encoding is no problem for the current parsing code, because the character sequences we had been afraid of are not used in UTF-8: every byte of a multi-byte character has a value above 127, so an ASCII byte such as " or a newline can never appear inside one. See for example
http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2279.html
especially the end of page 2.
So UTF-8 encoding is fully supported.
In less technical terms: we are almost certainly able to display all the characters your language needs, if a proper font is supplied.
And you are encouraged to translate the game into any of the not yet supported languages.
Don't be afraid if you have never heard of UTF-8, or of encodings at all. We can help you, so your only task will be to translate the texts from English to your language using your favourite text editor on your favourite operating system.
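The property this conclusion rests on can be checked mechanically. The sketch below is only an illustration with hypothetical function names. It accepts sequences of up to four bytes, as in the later RFC 3629 (RFC 2279 allowed up to six), and it checks structure only, not overlong forms or surrogates.

```cpp
#include <cstddef>
#include <string>

// Number of bytes in the UTF-8 sequence introduced by lead byte `lead`,
// or 0 if `lead` cannot start a sequence (continuation or invalid byte).
inline std::size_t utf8_sequence_length(unsigned char lead)
{
    if (lead < 0x80) return 1;            // 0xxxxxxx: plain ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// True if every multi-byte sequence in `text` is a lead byte followed by
// the right number of continuation bytes of the form 10xxxxxx. In such
// text an ASCII byte like '"' or '\n' can only be a whole character.
inline bool is_wellformed_utf8(const std::string& text)
{
    std::size_t i = 0;
    while (i < text.size()) {
        const std::size_t len =
            utf8_sequence_length(static_cast<unsigned char>(text[i]));
        if (len == 0 || i + len > text.size()) return false;
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(text[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
        }
        i += len;
    }
    return true;
}
```

The feared case, a two-byte character whose second byte is ", is rejected by this check because 0x22 does not have the 10xxxxxx form. That is why a parser that scans for ASCII delimiters byte by byte is safe on valid UTF-8 input.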
Re: Encodings && font problems
Sofronius wrote: 3) I noticed that if a phrase that replaces a weapon name in the unit window on the right ends with non-English characters, these "foreign" characters aren't displayed. That is probably a bug in the BfW code, I think.
Dave wrote: Okay, well send me the translation file, and a link to where I can download your font file, and an example of a string that doesn't display properly, and I'll look into it.
This is something I also encountered when using the standard font and a word ending with 'ß'. The 'ß' is not displayed. I will check again whether that is "standard" behaviour, and under what circumstances it happens if not.
Re: Encodings && font problems
Arndt wrote: This is something I encountered also when using the standard font and a word ending with 'ß'. The 'ß' is not displayed. I will check again, if that is "standard" behaviour and the circumstances if not.
You can just send me your translation so I can search for this bug, if you want...
Re: Encodings && font problems
Arndt wrote: This is something I encountered also when using the standard font and a word ending with 'ß'. The 'ß' is not displayed. I will check again, if that is "standard" behaviour and the circumstances if not.
Have you tried on a very recent CVS? This should hopefully be fixed now.
David
- Viliam
- Translator
characters required
Hi! I have finished a Slovak translation (of the 0.6 version), but there were 6 characters missing in the Vera font:
0x010e, 0x010f: D/d + caron (Ď ď)
0x0139, 0x013a: L/l + acute (Ĺ ĺ)
0x013d, 0x013e: L/l + caron (Ľ ľ)
0x0147, 0x0148: N/n + caron (Ň ň)
0x0154, 0x0155: R/r + acute (Ŕ ŕ)
0x0164, 0x0165: T/t + caron (Ť ť)
So, unless someone has already added these characters in 0.6.99.3 (which I have not checked yet, as I am waiting for the Windows binaries), would you please add these characters to the font? I hope there will be no copyright problems.
Also please add the following characters for Esperanto (will be done in a few weeks):
0x0108, 0x0109: C/c + circumflex (Ĉ ĉ)
0x011c, 0x011d: G/g + circumflex (Ĝ ĝ)
0x0124, 0x0125: H/h + circumflex (Ĥ ĥ)
0x0134, 0x0135: J/j + circumflex (Ĵ ĵ)
0x015c, 0x015d: S/s + circumflex (Ŝ ŝ)
0x016c, 0x016d: U/u + breve (Ŭ ŭ)
If the whole font must be loaded into memory, it could be good in the future to use different font files for different groups of characters, e.g. one for Latin characters, another for Cyrillic characters, and one more for Chinese/Japanese characters (this would eat a lot of memory anyway). The only place in the game where all characters are needed is the "choose language" dialog... perhaps bitmaps could be used there.
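For translators who want to see how the code points listed above end up in a UTF-8 translation file, here is a small sketch of a code point encoder. encode_utf8 is a hypothetical helper written for this example, not a function from the game, and it does not reject surrogates or values above U+10FFFF.

```cpp
#include <string>

// Encode one Unicode code point (assumed to be <= U+10FFFF) as UTF-8.
inline std::string encode_utf8(char32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        // One byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // Two bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

All of the Slovak and Esperanto letters listed lie below 0x0800, so each one is stored as two bytes; 0x010e (Ď), for example, becomes the bytes 0xC4 0x8E.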