[accepted, implemented] Context-free Grammar for unit names

skeptical_troll
Posts: 401
Joined: August 31st, 2015, 11:06 pm

Re: How about using Context-free Grammar to generate unit na

Post by skeptical_troll » April 8th, 2016, 7:05 pm

Just out of curiosity, is the current algorithm based on MCMC on letters, not syllables or blocks? If the length is the problem, isn't it possible just to include it in the likelihood by hand, so that the probability of long names goes down?
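
One way to read the suggestion above is as a reweighting: draw names from whatever generator exists, then keep each one with a probability that shrinks with its length. The Python sketch below does that by rejection sampling, so the accepted names follow the original distribution multiplied by the length weight. The names `length_weight`, `sample_with_length_penalty`, `soft_limit`, `rate` and `max_tries` are invented for illustration; none of this is taken from Wesnoth.

```python
import math
import random

def length_weight(length, soft_limit=8, rate=0.7):
    """Weight 1 up to soft_limit letters, then exponential decay per extra letter."""
    return math.exp(-rate * max(0, length - soft_limit))

def sample_with_length_penalty(draw_name, rng=random, max_tries=1000):
    """Rejection sampling: keep a drawn name with probability length_weight(len)."""
    name = draw_name()
    for _ in range(max_tries):
        if rng.random() < length_weight(len(name)):
            return name
        name = draw_name()
    return name  # give up after max_tries and return the last draw

# Usage with a stand-in generator (hypothetical; any callable returning a string works).
if __name__ == "__main__":
    stand_in = lambda: "a" * random.randint(3, 15)
    print([len(sample_with_length_penalty(stand_in)) for _ in range(10)])
```

The cost is that long names are generated and then thrown away, which is cheap for short strings but does nothing about the quality of the names that survive.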

Spixi
Posts: 51
Joined: August 23rd, 2010, 7:22 pm

Re: How about using Context-free Grammar to generate unit na

Post by Spixi » April 8th, 2016, 7:17 pm

skeptical_troll wrote: Just out of curiosity, is the current algorithm based on MCMC on letters, not syllables or blocks? If the length is the problem, isn't it possible just to include it in the likelihood by hand, so that the probability of long names goes down?
You can find the current implementation in /src/race.cpp.

I wonder if there is already an existing library which does exactly the opposite of what GNU flex does.

Dugi
Posts: 4823
Joined: July 22nd, 2010, 10:29 am
Location: Carpathian Mountains

Re: How about using Context-free Grammar to generate unit na

Post by Dugi » April 8th, 2016, 7:49 pm

skeptical_troll wrote: Just out of curiosity, is the current algorithm based on MCMC on letters, not syllables or blocks? If the length is the problem, isn't it possible just to include it in the likelihood by hand, so that the probability of long names goes down?
If you are too lazy to look it up: it uses the previous pair of letters to pick the letter that follows. It can be configured to use the last triplet or more, but as far as I know that is not used anywhere.
I have tried the algorithm with the next letter determined from the single previous letter, but it sucked.
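
For readers who want to see the pair-of-letters idea in code, here is a minimal Python sketch of a letter-level Markov name generator with a configurable context length. It is an illustration only, not the code in /src/race.cpp; the function names, the `^` and `$` markers, the `max_len` cap and the sample names are all made up.

```python
import random
from collections import defaultdict

def build_chain(names, order=2):
    """Map each `order`-letter context to the letters seen after it."""
    chain = defaultdict(list)
    for name in names:
        padded = "^" * order + name.lower() + "$"  # ^ pads the start, $ marks the end
        for i in range(len(padded) - order):
            chain[padded[i:i + order]].append(padded[i + order])
    return chain

def generate_name(chain, order=2, max_len=12, rng=random):
    """Walk the chain from the start context until the end marker or max_len."""
    context, out = "^" * order, []
    while len(out) < max_len:
        nxt = rng.choice(chain[context])
        if nxt == "$":
            break
        out.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(out).capitalize()

# Usage with a hypothetical training list.
if __name__ == "__main__":
    sample = ["Aldric", "Alwyn", "Bedric", "Cedwyn", "Edric"]
    pairs = build_chain(sample, order=2)
    print([generate_name(pairs, order=2) for _ in range(5)])
```

Setting `order=1` gives the single-letter variant mentioned above; with so little context the output drifts away from the training names much faster.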

EDIT:
Some improved grammars are significantly better than before in terms of the average number of recruits until a pair of namesakes appears. No experiment was run, because the figure can be estimated mathematically; the numbers below are rough estimates.
Male elves (around 70)
Female elves (around 70)
Male humans (around 80)
Female humans (around 80)
Orcs (around 90)
Other grammars were already better than the current Markov generator.
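
The post does not say how these estimates were computed. One standard way, assuming every name a grammar can produce is equally likely (which a real grammar does not guarantee), is the birthday problem: with N possible names, the expected number of recruits until the first repeat is roughly sqrt(pi * N / 2). The Python sketch below computes the exact expectation under that assumption, the approximation, and its inverse; all function names are made up for illustration.

```python
import math

def expected_draws_until_repeat(n_names):
    """Exact expected number of draws until some name repeats (uniform names)."""
    total, p_distinct = 0.0, 1.0  # p_distinct = P(first k draws are all distinct)
    for k in range(n_names + 1):
        total += p_distinct
        p_distinct *= 1.0 - k / n_names
    return total

def approx_draws_until_repeat(n_names):
    """Birthday-problem approximation sqrt(pi * N / 2)."""
    return math.sqrt(math.pi * n_names / 2)

def names_needed(target_draws):
    """Invert the approximation: equally likely names needed for a target."""
    return 2 * target_draws ** 2 / math.pi

if __name__ == "__main__":
    for target in (70, 80, 90):
        print(target, round(names_needed(target)))
```

Under the uniform assumption, "around 70" corresponds to a grammar with roughly 3,100 distinct names. Skewed probabilities make repeats arrive sooner, so a real grammar would need more than that.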

Some further improvements were made, and the pull request was accepted; it will be part of Wesnoth 1.13.5 and later.

GunChleoc
Translator
Posts: 436
Joined: September 28th, 2012, 7:35 am

Re: [accepted, implemented] Context-free Grammar for unit na

Post by GunChleoc » January 27th, 2017, 9:43 am

I have finally been playing with translating the name generation. I am running into a few difficulties with the town names:
  1. I have prefixed the base names with "XXX", but I get names with "XX", "XXXX" or "XXXXXX" in them. This should not happen - I should see "XXX" only here.
  2. I have prefixed the rule-generated base names with "NOCOM". I don't see any of those at all.
  3. Segmentation of base names is broken. It seems that blank space is used in addition to the comma (,) when parsing the names into a list, resulting in nonsense names.
  4. There are town names that consist of base names only. Since my base names need to be in the genitive case for the composition rules, I need to get rid of pure base town names without any prefixes. I haven't found a way to do that.
I am attaching the current state of my translation file for the wesnoth textdomain, gd locale.
Attachments
wesnoth.zip
(173.6 KiB) Downloaded 62 times

Dugi
Posts: 4823
Joined: July 22nd, 2010, 10:29 am
Location: Carpathian Mountains

Re: [accepted, implemented] Context-free Grammar for unit na

Post by Dugi » January 29th, 2017, 10:50 am

Hello,

All mentions that I have contributed something at all were removed from my forum profile, so this is not my responsibility any more.

However, because it's you, I will give you some advice. The implementation tries to find the name generator. If it fails to find it, it falls back to the old Markov chains. That is how Markov-chain-generated names behave: if you add XXX before all base names, the result may contain a random number of X letters. You have prefixed the context-free-grammar-generated village names with NOCOM, but the code is malformed (the second line is missing a newline), so the parsing fails and it falls back to the old method. I do not know how the old name system is implemented, so I have no idea what could be broken in it and cause the segmentation issue.
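
The fallback behaviour described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Wesnoth's implementation: it assumes a simplified grammar syntax with one `name=alt1|alt2` rule per line and `{name}` references, it treats a second `=` on a line as a sign that two rules were fused by a missing newline, and the names `parse_grammar`, `expand` and `make_name` are invented here.

```python
import random
import re

def parse_grammar(text):
    """Parse lines of the form `name=alt1|alt2`; raise ValueError on a bad line."""
    rules = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, sep, body = line.partition("=")
        if not sep or "=" in body:  # not a rule, or two rules fused on one line
            raise ValueError(f"cannot parse rule: {line!r}")
        rules[name.strip()] = body.split("|")
    if "main" not in rules:
        raise ValueError("grammar has no 'main' rule")
    return rules

def expand(rules, symbol, rng):
    """Pick one alternative for `symbol` and expand every {reference} in it."""
    choice = rng.choice(rules[symbol])
    return re.sub(r"\{(\w+)\}", lambda m: expand(rules, m.group(1), rng), choice)

def make_name(grammar_text, fallback, rng=random):
    """Use the grammar if it parses and expands, otherwise the fallback generator."""
    try:
        return expand(parse_grammar(grammar_text), "main", rng)
    except (ValueError, KeyError):
        return fallback()

if __name__ == "__main__":
    good = "main=NOCOM{prefix}{suffix}\nprefix=Ash|Oak\nsuffix=ford|ton\n"
    fused = "main=NOCOM{prefix}{suffix}\nprefix=Ash|Oaksuffix=ford|ton\n"
    old_method = lambda: "name-from-old-generator"  # stand-in for the Markov fallback
    print(make_name(good, old_method))
    print(make_name(fused, old_method))
```

In this sketch the fused grammar never produces a NOCOM name, because the parse error silently routes every request to the fallback. That matches the symptom of not seeing the prefix at all.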

HTH,
-Dugi

GunChleoc
Translator
Posts: 436
Joined: September 28th, 2012, 7:35 am

Re: [accepted, implemented] Context-free Grammar for unit na

Post by GunChleoc » January 29th, 2017, 4:16 pm

Thanks, Dugi. Seems like that's not the only thing that's broken in my code though, so I need to find a way to really debug this thing.

Dugi
Posts: 4823
Joined: July 22nd, 2010, 10:29 am
Location: Carpathian Mountains

Re: [accepted, implemented] Context-free Grammar for unit na

Post by Dugi » January 29th, 2017, 8:25 pm

You can use my website to debug it (most of the links I have posted point to it). I have expanded its functionality since then, but you probably won't come across any of the new syntactic features.

You may need to add that \n at the end of each line. The code needs the newlines to be there, but I am not sure how the translation files deal with newlines.

GunChleoc
Translator
Posts: 436
Joined: September 28th, 2012, 7:35 am

Re: [accepted, implemented] Context-free Grammar for unit na

Post by GunChleoc » January 30th, 2017, 12:07 pm

Thanks, that is a very helpful tool! It's working via the website now, but not in Wesnoth.

I compiled Wesnoth on my Linux box and added some debug output. I spent a few hours digging into the code and it seems like calling generate() for "main" always returns an empty string, even for English. This means that the $base variable is then filled by the Markov generator.

So, this is definitely a bug in the context-free generator in Wesnoth, but I have no idea what's wrong with it yet.

GunChleoc
Translator
Posts: 436
Joined: September 28th, 2012, 7:35 am

Re: [accepted, implemented] Context-free Grammar for unit na

Post by GunChleoc » January 31st, 2017, 11:53 am

I found the bug :)

https://github.com/wesnoth/wesnoth/pull/921

I am still getting pure base names though.
