Machine Learning Recruiter

Post by **SeattleDad** » September 21st, 2012, 7:22 am

So I'm beginning to code up a variant of Sapient's suggested goodness metric, which I'm calling the gold yield metric. Here's how I see it:

Basic Damage Metric: Target unit cost * (Damage inflicted/target max HP) (Sapient's suggestion). The concept is that you cost your opponent this much gold by destroying this fraction of the unit. Obviously in any given attack, we would calculate this for both the attacker and the defender.
Village Capture: capturing_unit.variables.ml_gold_yield += wesnoth.game_config.village_income. # My take is that it's important to credit the capturing unit for this. I'm thinking that fast units tend to get more captures than slow units at all stages of the game and this may be a way to give units credit for being fast.
Poison: Treat the same as Basic Damage Metric by crediting for the amount of damage done in that turn. On the turn in which the unit is cured, credit the poisoner with Target Unit Cost * (8/target max HP) to reflect the damage that it would have healed if it hadn't been poisoned (obviously, lessened if it has less than 8 HP of damage)
Slowing: When a unit is on defense and it slows the attacker, give no extra credit to the defender because the attacker just unslows at the end of its turn. When you slow a unit as the attacker, it's debatable. I'm thinking credit the attacker for 1/4 * defender unit cost.
Healing: healing_unit.variables.ml_gold_yield += Healed_unit_cost * (healed amount/healed unit max HP) # Directly analogous to the Basic Damage Metric
Walking Corpse Creation: When a unit's plague ability turns a kill into a new Walking Corpse, credit that unit with 8 gold (the value of a Walking Corpse).
Leadership: Credit the leader for the bonus damage as per Sapient's suggestion.
Maintenance: It seems to me that Level 0 units should get some sort of credit for the fact that they don't require maintenance. Not sure exactly how to handle this.
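
To make the arithmetic in the list above concrete, here is a minimal sketch in plain Lua. The function names and the example numbers are made up for illustration and are not taken from the ML Recruiter patch; the formulas are just the ones stated in the list, and the 1/4 slow credit is the tentative value mentioned there.

```lua
-- Hypothetical helpers illustrating the gold yield arithmetic.
-- None of these names come from the ML Recruiter patch itself.

-- Basic Damage Metric: gold "destroyed" by removing a fraction of the target.
local function damage_yield(target_cost, damage, target_max_hp)
    return target_cost * (damage / target_max_hp)
end

-- Healing: same shape as the damage metric, credited to the healer.
local function healing_yield(healed_cost, healed_amount, healed_max_hp)
    return healed_cost * (healed_amount / healed_max_hp)
end

-- Poison cure: credit up to 8 HP of healing that the poison prevented,
-- capped by how much damage the target actually had left to heal.
local function poison_cure_yield(target_cost, target_hp, target_max_hp)
    local missed_healing = math.min(8, target_max_hp - target_hp)
    return target_cost * (missed_healing / target_max_hp)
end

-- Slowing as the attacker: the tentative 1/4 of the defender's cost.
local function slow_yield(defender_cost)
    return defender_cost / 4
end

-- Example with made-up numbers: a 20-gold unit with 24 max HP takes 12 damage,
-- so the attacker is credited with half of the unit's cost.
print(damage_yield(20, 12, 24))
```

In the real add-on these values would presumably be accumulated into something like unit.variables.ml_gold_yield, as in the Village Capture and Healing items.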

I may not get to all of these for the next drop of the ML Recruiter Patch. They are listed roughly in the order in which I'll tackle them. Suggestions for anything I might be missing are welcome.

Post by **mattsc** » September 21st, 2012, 2:05 pm

SeattleDad wrote:
mattsc wrote: However, wouldn't it be easier to write the above code snippet as:
Code:
local cant_poison = defender.status.poisoned or defender.status.not_living

EDIT: No, unfortunately that doesn't work. Well, I was wrong, that does work. Thanks, SD, I learned something today...

First, you have to go through defender.__cfg to access this. Second, have a look at how this is set up in the unit table:

Code:

...
    [13] = {
                   [1] = "filter_recall",
                   [2] = {
                             }
               },
    [14] = {
                   [1] = "status",
                   [2] = {
                                 poisoned = true
                             }
               },
    [15] = {
                   [1] = "attack",
...

Note that it is not always index [14] that holds the status, so you have to loop through defender.__cfg to find it, which is what helper.get_child does for you:

Code:

function helper.get_child(cfg, name, id)
        -- ipairs cannot be used on a vconfig object
        for i = 1, #cfg do
                local v = cfg[i]
                if v[1] == name then
                        local w = v[2]
                        if not id or w.id == id then return w, i end
                end
        end
end

As a secondary concern, accessing __cfg is expensive (as is the loop in helper.get_child, of course), so you want to minimize its use. That is why that code bit is set up the way it is, rather than as a one-liner:

Code:

                local cant_poison = H.get_child(defender.__cfg, "status").poisoned or H.get_child(defender.__cfg, "status").not_living

which would also work.
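
For illustration, here is a small self-contained sketch of the "look it up once" pattern. It runs outside Wesnoth: the defender table is a mock standing in for a real unit, and get_child is a local stand-in with the same shape as helper.get_child, so both are assumptions rather than the actual game API.

```lua
-- Local stand-in with the same shape as helper.get_child (no id filter).
local function get_child(cfg, name)
    for i = 1, #cfg do
        local v = cfg[i]
        if v[1] == name then return v[2], i end
    end
end

-- Mock unit: a plain table standing in for a real unit (assumption).
local defender = {
    __cfg = {
        { "filter_recall", {} },
        { "status", { poisoned = true } },
        { "attack", {} },
    },
}

-- Read __cfg once, find the status child once, then reuse the result.
-- The "or {}" guards against a unit table without a status child.
local status = get_child(defender.__cfg, "status") or {}
local cant_poison = status.poisoned or status.not_living

print(cant_poison)  -- true for this mock unit
```

Both the __cfg access and the loop then happen only once per unit, instead of once per status flag.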

Post by **SeattleDad** » September 21st, 2012, 2:22 pm

mattsc wrote: No, unfortunately that doesn't work.

So this code seems to be doing the right thing. Am I missing something?

Code:

            if defender.status.not_living == true  then
                print(defender.id .. " is not living")
            else
                print(defender.id .. " is living")
            end

Post by **mattsc** » September 21st, 2012, 2:31 pm

Umm, yes, you are right, that is working, for both not_living and poisoned. Not sure why I thought it wasn't, I thought I had tried it and it didn't. Sorry for that, and thanks!

(I'm going to edit my post above to that effect.)

Post by **taptap** » September 25th, 2012, 9:32 am

An easy improvement to AI recruiting would be banking, i.e. not recruiting when possible but not necessary to save gold and upkeep to have more of it later. Of course this won't improve performance on AI vs. AI or MP in general, but this would toughen the AI in SP where it often fields considerably more units than it can bring to bear efficiently - while paying upkeep for all of them. While impossible to implement with an XP metric, it might be well doable with a gold metric (wasted upkeep).

Damage metric: The first 10 HP damage should generally count less than the last 10 HP. As a way to implement it, I would account "reduced availability cost" (absence for healing / expected match length) * unit cost for all kinds of availability reducing effects (damage, poison) and reduced availability cost + killing value = unit cost. It will help in giving credit to wolves, poachers and similar finishing units, also a 1 HP mage is worth more than 1 gold.

Post by **SeattleDad** » September 26th, 2012, 1:04 pm

taptap wrote:An easy improvement to AI recruiting would be banking, i.e. not recruiting when possible but not necessary to save gold and upkeep to have more of it later. ... it might be well doable with a gold metric (wasted upkeep).

As you say, this idea would come into play on one of those campaign scenarios in which the AI recruits a horde of units, but there's some sort of choke point so that most of these units are standing around waiting to reach the battle. I imagine that you could implement this by 1. Implementing a feature which gives the number of units unable to make an attack because a friendly unit is in their way and 2. Changing the metric from "gold yield" to "gold yield over the next x turns". If the neural net predicts that gold yield over the next x turns will be very low (using this feature to help it make this prediction), then it will just bank the gold.
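
As a rough sketch of the decision rule only (not of the prediction itself), something like the following could work. Everything here is hypothetical: the function name, the shape of the predictions table, and the min_yield threshold are assumptions for illustration, and the predicted values would have to come from the neural net.

```lua
-- Hypothetical sketch: pick the candidate with the highest predicted
-- "gold yield over the next x turns", or bank the gold if even the best
-- candidate falls below a threshold.
-- predictions: table mapping unit type name -> predicted gold yield
-- min_yield:   assumed cutoff below which recruiting is not worth it
local function choose_recruit(predictions, min_yield)
    local best_name, best_yield = nil, -math.huge
    for name, yield in pairs(predictions) do
        if yield > best_yield then
            best_name, best_yield = name, yield
        end
    end
    if best_yield < min_yield then
        return nil  -- nothing is worth its cost right now: bank the gold
    end
    return best_name
end

-- Made-up numbers: behind a choke point every prediction is low, so bank.
print(choose_recruit({ ["Orcish Grunt"] = 2.5, ["Mage"] = 1.0 }, 5))  -- nil
```

The threshold could plausibly be tied to the unit's cost plus expected upkeep, but that is a design choice that would need experimentation.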

taptap wrote: Damage metric: The first 10 HP damage should generally count less than the last 10 HP. As a way to implement it, I would account "reduced availability cost" (absence for healing / expected match length) * unit cost for all kinds of availability reducing effects (damage, poison) and reduced availability cost + killing value = unit cost.

That's an interesting idea, but it seems like it would be a little tricky to implement, since I'd have to separately predict expected match length in a given situation. This is very doable (it's just another metric), but now we're combining two predictions, which would take some experimentation. Also, what would the killing value be?

As an alternative, I imagine that you could give a unit credit for the change in the expected metric for the enemy unit before the attack and after the attack. In other words, the neural net expects that a Mage with 24 HP will have a metric of 18 going forward. The Orcish Grunt then does 15 HP of damage to the Mage, so the neural net now predicts that the Mage will have a metric of 7 going forward. The Orcish Grunt then gets credited with a metric of 11 (18 - 7 = 11). This is a little tricky to wrap your mind around conceptually, but it seems doable and it would give more credit to finishing units, as you suggest.
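
Here is a minimal sketch of that before/after credit, using the Mage numbers from the example. The lookup table is a stub standing in for the neural net, and treating a killed unit as having a metric of 0 going forward is my assumption, not something that has been decided.

```lua
-- Stub predictor: hard-coded values standing in for the neural net.
-- Keys are the Mage's current HP, values are its predicted future metric.
local predicted_metric = { [24] = 18, [9] = 7 }

local function predict(hp)
    if hp <= 0 then
        return 0  -- assumption: a dead unit yields nothing going forward
    end
    return predicted_metric[hp]
end

-- Credit the attacker with the drop in the target's predicted metric.
local function attack_credit(hp_before, damage)
    local hp_after = hp_before - damage
    return predict(hp_before) - predict(hp_after)
end

print(attack_credit(24, 15))  -- 18 - 7 = 11
print(attack_credit(9, 9))    -- finishing blow: 7 - 0 = 7
```

Under this scheme the finishing blow collects whatever predicted value the damaged unit still had, which is how finishing units would pick up extra credit.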

By the way, I've updated the ML Recruiter Wiki to include a discussion of the three metrics I've looked at so far: http://wiki.wesnoth.org/Machine_Learnin ... ss_Metrics

Post by **SeattleDad** » October 26th, 2012, 3:23 pm

ML Recruiter 0.3 is out and posted at https://gna.org/patch/index.php?3479.

The Machine Learning Recruiter is a brand new recruiting algorithm which achieves dramatically better performance than the current RCA recruiter through the use of the latest neural net technology. The ML Recruiter automatically trains itself by playing thousands of games to determine which units perform best under which circumstances. Read all about it at http://wiki.wesnoth.org/Machine_Learning_Recruiter.

New features of ML Recruiter 0.3 are:

New "gold yield" metric (http://wiki.wesnoth.org/Machine_Learnin ... Gold_Yield) for judging a unit's goodness
Several new ML features to aid in prediction: alignment, race, time of day, map size, friendly and enemy leader hit point percentage remaining, and nearest enemy unit to friendly leader
The above leads to improved performance: Defeats ML Recruiter 0.2 58% of the time and recruits a greater variety of units (especially units with a poison attack)
Runs on all 2-player maps except for Hornshark Island, Thousand Stings, Caves of the Basilisk, and Dark Forecast
Greatly improved ai_test2.py script for running thousands of games to test AI and gather data for the neural net
New script (run_model_and_make_new_model.py) for running games and building a new neural net based on the data gathered from those games

Post by **Sapient** » October 27th, 2012, 2:38 am

Those results sound very exciting. I like your improvements.
What would it take to make it work with those other maps?

Post by **SeattleDad** » October 27th, 2012, 5:07 pm

Sapient wrote:Those results sound very exciting. I like your improvements.

Thanks! As you can see, I implemented your idea for a new metric and I think it helped.

Sapient wrote:What would it take to make it work with those other maps?

I'm pretty sure I know the issues:

Thousand Stings and Caves of the Basilisk: These have petrified units. I just need to switch to get_live_units() from AI-demos rather than wesnoth.get_units()
Hornshark Island: Some of the units for the neutral side have names with commas in them, like "Rukhos, Chosen of Death". This is blowing up some of my code which works with comma-separated values. I'll fix this.
Dark Forecast: I'm less sure I can deal with this since it's a "Survival" scenario, but I'll take a look. It might not be a big deal.
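
For the comma problem, one common approach is to quote fields when writing each row. This is only a sketch of the usual CSV quoting convention, not necessarily the fix that will go into the patch; csv_escape and csv_row are hypothetical names, and whatever reads the file would need to understand quoted fields too.

```lua
-- Quote a field if it contains a comma, a double quote, or a line break;
-- embedded double quotes are doubled, as in the usual CSV convention.
local function csv_escape(field)
    local s = tostring(field)
    if s:find('[,"\r\n]') then
        s = '"' .. (s:gsub('"', '""')) .. '"'
    end
    return s
end

-- Join a list of fields into one comma-separated row.
local function csv_row(fields)
    local out = {}
    for i = 1, #fields do
        out[i] = csv_escape(fields[i])
    end
    return table.concat(out, ",")
end

print(csv_row({ "Rukhos, Chosen of Death", 42 }))
-- "Rukhos, Chosen of Death",42
```

An alternative would be to switch to a separator that cannot appear in unit names, which avoids touching the reading side.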

More generally, I've laid out the roadmap I'm currently thinking about here: http://wiki.wesnoth.org/Machine_Learnin ... nt_roadmap. The two things I'd add to that roadmap are 1) I'd like to get ML Recruiter integrated into the mainline code, at least for multiplayer maps and 2) I'm hoping to turn this into a scientific paper. I'm targeting "Computational Intelligence in Games" (http://eldar.mathstat.uoguelph.ca/dashlock/CIG2013/), which has a due date of March 1.

Post by **SeattleDad** » October 29th, 2012, 1:50 am

For anyone who wants to learn more about ML Recruiter, I've just updated the Wiki at http://wiki.wesnoth.org/Machine_Learning_Recruiter.

New items are faction vs. faction statistics for ML Recruiter 0.3, faction-by-faction unit recruitment percentages, a further breakdown of unit recruitment percentages vs. the Undead to show MLR's adaptability, and some basic documentation on the new script that lets you train your own ML Recruiter.

Post by **Quitch** » December 27th, 2012, 6:44 pm

I'm interested in trying this and tried selecting it as part of a 1.11.1 game, but it pointed me to the wiki in order to get it to run. Do I need to manually patch this in?

Post by **Alarantalara** » December 27th, 2012, 6:49 pm

Quitch wrote:I'm interested in trying this and tried selecting it as part of a 1.11.1 game, but it pointed me to the wiki in order to get it to run. Do I need to manually patch this in?

Yes, you do. The Waffles code is a C++ library and so cannot be included in an add-on. The add-on contains as much of the code as possible, to minimize the need to reapply patches.

Post by **Quitch** » December 27th, 2012, 9:20 pm

Thanks for the clarification.

Post by **Quitch** » December 27th, 2012, 10:12 pm

So I grabbed the patch and the source and CMake, but before using CMake to compile I apparently need to use the following line:

patch -p1 -i [path to patch file]

But what's patch? I cannot find such a file.

Post by **SeattleDad** » December 27th, 2012, 11:04 pm

Instructions for how to apply the patch can be found here: http://wiki.wesnoth.org/Machine_Learnin ... _the_patch.

As it says at the above link, you get the patch from https://gna.org/patch/?3479. Take the latest version, which is currently ML Recruiter 0.4.
