[engine] A mathematical framework for game balancing
Dear Wesnoth developers and users,
I have been thinking about a framework for balancing Battle for Wesnoth. The core idea behind these few mathematical principles is to automate balancing: instead of playing actual games to balance units, many games can be simulated without any display, and the results can be fed to algorithms that fit the unit characteristics. Since the formalism is quite general, there may be other applications I have not thought of.
I have some programming skills but I'm not used to Lua scripting, so I'm open to any suggestions about how this could be implemented. I would also like to open a general discussion about how game balancing could be automated using mathematics, since this could be of tremendous help for modders, and even core game developers.
A theory to automatically balance Battle for Wesnoth unit/terrain characteristics
Battle for Wesnoth is a turn-based game where factions fight to kill the other leaders. The complexity of the interactions between units, and of the unit/terrain interactions, prevents a complete theory that would predict the winner from any starting situation. However, repeated tests on a sufficiently big map, with a sufficiently large number of units, over a sufficiently large number of turns, may make it possible to calculate an overall contribution for each unit type at each turn for a given situation. Game balancing could then be achieved by examining the average contribution of all unit types over a representative set of situations.
Assuming that the particular size and configuration of the map do not influence the final outcome, and ignoring alliances, a situation is characterized by a set of numbers:
 The proportion matrix P, whose entries give the proportion of each k-th unit and terrain for each i-th player;
 The number of players N.
The outcome is the victory vector v, in which each element v_i is 1 if player i wins and 0 otherwise.
Let us assume a multilinear function f of the proportions of each unit and terrain for each player, bounded between 0 and 1, whose rounded value is v_i:
( 1 )
( 2 )
In eq. 1, [ ] denotes the rounding function.
Or, generalized for all N players as a vector equation:
( 3 )
For each situation, the objective is to find the vector of contributions c, containing all c_k, that minimizes the distance between f and v. The error vector is:
( 4 )
The problem is to find the c that minimizes the sum of the squared errors e_i. This is an ordinary least squares (OLS) minimization problem, which can be solved in Python using pandas, for example.
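A minimal sketch of this least-squares step for a single situation. It assumes NumPy rather than pandas, a layout where P has one column per player, and an invented two-player situation; the name `fit_contributions` and all numbers are illustrative and not from the post. With fewer players than unknowns the system is underdetermined, in which case `numpy.linalg.lstsq` returns the minimum-norm solution.

```python
import numpy as np

def fit_contributions(P, v):
    """Return the c that minimizes ||P.T @ c - v||^2.

    P: one row per unit/terrain proportion, one column per player
       (this layout is an assumption of the sketch).
    v: 0/1 victory vector, one entry per player.
    """
    P = np.asarray(P, dtype=float)
    v = np.asarray(v, dtype=float)
    c, *_ = np.linalg.lstsq(P.T, v, rcond=None)
    return c

# Hypothetical situation: rows are (unit A, unit B, terrain X, terrain Y).
P = np.array([[0.8, 0.2],
              [0.2, 0.8],
              [0.5, 0.5],
              [0.5, 0.5]])
v = np.array([1.0, 0.0])   # player 1 wins

c = fit_contributions(P, v)
f = P.T @ c                # fitted value for each player
e = f - v                  # error vector
print(np.round(c, 3), np.round(f, 3))
```

Because a single game gives only N equations for all the c_k, the fit for one situation is not unique; the minimum-norm choice is just one convention.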
Over a representative set of M situations, the average of the contribution vectors is the balance vector b:
( 5 )
For a fully balanced game, the elements of the b vector should ideally all have similar values, provided all terrains have the same weight.
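The numbered equations were images and did not survive in this copy. One reading that is consistent with the surrounding text is sketched below. The index layout (P_{ki} as the proportion of unit or terrain k for player i), the sign convention of the error, and the superscript (m) labelling situations are assumptions, not the author's confirmed notation.

```latex
\begin{align}
v_i &= \left[ f(P_i) \right] \tag{1} \\
f(P_i) &= \sum_{k} c_k \, P_{ki}, \qquad 0 \le f(P_i) \le 1 \tag{2} \\
\mathbf{v} &= \left[ P^{\mathsf{T}} \mathbf{c} \right] \tag{3} \\
\mathbf{e} &= P^{\mathsf{T}} \mathbf{c} - \mathbf{v} \tag{4} \\
\mathbf{b} &= \frac{1}{M} \sum_{m=1}^{M} \mathbf{c}^{(m)} \tag{5}
\end{align}
```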
Starting from this representation, several algorithms could be imagined. We could, for example, simulate several starting situations with the same terrain but different proportions of units, and then tune unit statistics so that the average over all unit proportions yields a balance vector with nearly equal b_k elements. We could also imagine tuning unit statistics so that maps with a lot of forests give elves the same advantage that maps with a lot of hills/mountains give dwarves. To tune these statistics, a general meta-algorithm could be imagined (it might take some computation time) that would automatically minimize the distance between a target balance vector and the balance vector actually obtained, by tuning the unit characteristics (HP, damage, capabilities...), using, for example, a Monte Carlo or simulated annealing approach, or some combination of optimization methods.
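A rough sketch of what such a meta-algorithm could look like with simulated annealing. Everything here is an assumption made for illustration: `balance_vector` is a toy stand-in (a real version would simulate games, fit c for each, and average), and the unit names and stats are invented.

```python
import math
import random

def balance_vector(stats):
    """Toy stand-in for 'simulate games, fit c, average into b'.

    Purely illustrative: each unit's contribution is hp * damage / cost.
    """
    return [u["hp"] * u["damage"] / u["cost"] for u in stats]

def distance(b, target):
    """Euclidean distance between the obtained and target balance vectors."""
    return math.sqrt(sum((x - t) ** 2 for x, t in zip(b, target)))

def anneal(stats, target, steps=2000, t0=1.0, seed=0):
    rng = random.Random(seed)
    current = [dict(u) for u in stats]          # work on copies
    cur_loss = distance(balance_vector(current), target)
    best, best_loss = [dict(u) for u in current], cur_loss
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9   # linear cooling schedule
        cand = [dict(u) for u in current]
        unit = rng.choice(cand)
        key = rng.choice(["hp", "damage"])
        unit[key] = max(1, unit[key] + rng.choice([-1, 1]))
        loss = distance(balance_vector(cand), target)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if loss < cur_loss or rng.random() < math.exp((cur_loss - loss) / temp):
            current, cur_loss = cand, loss
            if loss < best_loss:
                best, best_loss = [dict(u) for u in cand], loss
    return best, best_loss

units = [
    {"name": "unit_a", "hp": 30, "damage": 7, "cost": 14},
    {"name": "unit_b", "hp": 45, "damage": 9, "cost": 17},
]
target = [20.0, 20.0]
start_loss = distance(balance_vector(units), target)
tuned, end_loss = anneal(units, target)
print(start_loss, end_loss)
```

In practice the expensive part would be the call hidden behind `balance_vector`, since each evaluation means a batch of simulated games.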
All the best, in the hope that this opens an interesting discussion! Do not hesitate to ask if you have any questions, as the mathematical formalism may pose some difficulties!
Re: [engine] A mathematical framework for game balancing
Well, it has been discussed before.
Without understanding 100% of what you stated, I suspect you are underestimating the difficulty of calculating a "contribution vector." Am I right?
http://www.wesnoth.org/wiki/User:Sapient... "Looks like your skills saved us again. Uh, well at least, they saved Soarin's apple pie."
Re: [engine] A mathematical framework for game balancing
I know that the whole mathematical problem is complex. For this reason, instead of attempting to find an a priori formulation (in other words, an analytical solution), I propose to infer the solution from a series of simulations using the same AI for a given number of factions. For each simulated game, an OLS procedure yields a contribution vector. The difficulty of this procedure scales with the number of elements of the P matrix, which depends on the number of degrees of freedom we choose to include in the problem: it could be all characteristics or only a few. I do not underestimate it; I just see that with such a general linear-algebraic formalism we can tune that difficulty.
Once we have a set of contribution vectors, we simply average them and look at the result. The result can also be treated faction-wise, by averaging the b values over all units of a given faction and then comparing the factions. I feel this formalism is quite flexible.
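The averaging step could be sketched as follows. The contribution values are invented for illustration, and the names `C`, `faction_of` and `faction_score` are assumptions of this sketch.

```python
import numpy as np

# Hypothetical contribution vectors from M = 3 simulated games,
# one entry per unit type (values invented for illustration).
unit_names = ["spearman", "mage", "elvish_fighter", "elvish_archer"]
faction_of = {"spearman": "loyalists", "mage": "loyalists",
              "elvish_fighter": "rebels", "elvish_archer": "rebels"}
C = np.array([[0.30, 0.10, 0.25, 0.35],
              [0.20, 0.30, 0.25, 0.25],
              [0.40, 0.20, 0.10, 0.30]])

# Balance vector b: mean of the contribution vectors over all games.
b = C.mean(axis=0)

# Faction-wise view: mean of b over the units of each faction.
faction_score = {}
for faction in sorted(set(faction_of.values())):
    idx = [i for i, name in enumerate(unit_names) if faction_of[name] == faction]
    faction_score[faction] = float(b[idx].mean())

print(dict(zip(unit_names, np.round(b, 3))), faction_score)
```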
I read the post you pointed to, and it does have some similarities with this proposal; the main difference is that it used fixed empirical contributions for each characteristic. This framework is general, and many efficient algorithms exist to solve OLS.
The only difficulty is to evaluate how much time it would take to simulate, e.g., a game between 2 AIs, 100 times, on a homogeneous terrain (all mixed), while changing the proportions of each unit used by each faction. If we can simulate games at a reasonable cost, any general algorithm for multivariate optimization could be used to tune the unit characteristics so that the b vector has homogeneous values.
I have some professional experience with fitting and I think that, given the flexibility of the problem, it should be feasible.
Re: [engine] A mathematical framework for game balancing
You can't infer anything from AI simulations, as the AI is not sufficiently intelligent to be used as the basis for any but the most rudimentary balancing. You could have very balanced factions which AI simulations indicate are unbalanced, and completely imbalanced factions which AI simulations indicate are balanced and statistically evenly matched. Regardless of what kind of formula you use, having a million AI simulations give the result that Drakes consistently lose against Rebels simply does not show that there is a balance problem between Drakes and Rebels.
Re: [engine] A mathematical framework for game balancing
If no reliable data can be gathered from AI matches, then a database of human multiplayer games could be used instead (but it could only diagnose balance with the present parameters, and could not simulate parameter changes, because each change would need tens of new MP games to be played and stored).
This math framework could still be used with AI simulations if a sufficiently strong AI is developed later.
I admit I have no experience in AI. What are the current general limitations of the Wesnoth AI that prevent it from being representative of faction balance?
Re: [engine] A mathematical framework for game balancing
Glxblt76 wrote: I admit I have no experience in AI. What are the current general limitations of Wesnoth AI that prevent it from being representative of factions balance?
I suppose one big high-level general limitation is that it can't really plan ahead; it can't make strategic plans several turns in the future, and I believe it can't really even plan a sequence of actions during the same turn, such as killing an enemy unit to allow another unit to attack another otherwise unreachable enemy.
Re: [engine] A mathematical framework for game balancing
I think the most obvious issue with the AI is that it doesn't use specials (slow, backstab etc.) properly and also ignores abilities like leadership or healing, so these units would end up looking really useless.
I had some (off topic now I guess) questions about the equations, sorry if they're stupid (but it would be a shame not to discuss them after you spent all the time writing them)
Spoiler:
Screenshot playthroughs: Let's Play Dead Water, Let's Play Invasion from the Unknown and Let's Play After the Storm
 Pentarctagon
Re: [engine] A mathematical framework for game balancing
Glxblt76 wrote: If no reliable data can be gathered from AI matches, then a database of human multiplayer games could be used instead (but it could only diagnose balancing with present parameters, and could not simulate parameters changes then because each change would need tens of new MP games to be tested and stored)
Unfortunately the closest thing to a database of matches is https://replays.wesnoth.org/, which I'd imagine would be a pain to parse through.
99 little bugs in the code, 99 little bugs
take one down, patch it around
2,147,483,648 little bugs in the code
Re: [engine] A mathematical framework for game balancing
It seems to me that it would be invaluable to be able to auto-generate balance. However, it seems (from the discussion) that an even more useful thing would be to improve the AI. With mathematical skills such as you have demonstrated, I believe you would be able to come up with formulas to improve the current AI.
Of course this is not the goal of your topic, and you may not even be interested in modifying the AI, which is perfectly reasonable.
As to the current topic which is being discussed:
I cannot say I fully understand the equations you have proposed, but it seems to me that they relate victory to the players based upon the number of players and some constant. Is this a reasonable generalization? (Like I said, I do not fully understand them, so please tell me if I am incorrect.)
Assuming that my previous assumption is correct, the challenge now would be to determine what exactly this constant C is. That, however, seems like a very hard problem. Fundamentally, the problems I see with this idea are:
How do you propose to factor in gold and how it is spent? (Players could buy many cheap units such as spearmen or goblins, or they could focus on specialists such as mages and berserkers.)
How are you planning on accounting for resistances and attack types? (For example, pierce does virtually nothing against undead; to compensate, the loyalists faction has mages with fire damage, which are invaluable. A second example: when I was balancing factions, I added the elvish sorceress to a faction I had created, instead of the red mage. It then suddenly became massively unbalanced versus the drake faction. I was surprised by this, since the red mage and the elvish sorceress are similar units with similar damage; however, the elvish sorceress has arcane and the red mage has fire, which versus drakes (who have strong fire resistance and weak arcane resistance) made a massive difference.)
These two seem to be the biggest problems, because in order to create equations that balance resistance and cost you must take into account the entire faction, its units, and the opposing faction.
(This may look like I am trying to discourage your idea; if it does, I sincerely apologize, because that is not at all the intention. I am genuinely interested in the idea you have. However, I am slightly pessimistic at this point about whether something like this can actually be created.)
Creator of: The Reign of The Lords Era, The Gnats Franken Dungeon
Co-designer of the (not-wesnoth) space combat video game Planet Bounce.

Re: [engine] A mathematical framework for game balancing
It seems to me that, rather than calculating causality, it will instead gather statistical data.
The problems with the first assumption have been addressed, so I will raise a problem with the other assumption (map layout not unduly affecting the statistics). Map layout has a very big effect on the outcome of a match, particularly with regard to how it favors certain factions. Map layout also affects player decision making, not only in terms of the avenues a player could take, but also in the availability of resources and information.
This is not to say that the idea isn't useful, though, but we have to be aware of its limitations.
As for a database of useful replays by human players, why not use the publicly available ladder replays? Not only do you have curated human replays (sometimes with comments), but you also have additional data about the players (such as Elo ranking, etc.).
http://wesnoth.gamingladder.info/
Re: [engine] A mathematical framework for game balancing
Hello all,
Thanks for the interesting points you raise!
Inky wrote: I think the most obvious issue with the AI is that it doesn't use specials (slow, backstab etc.) properly and also ignores abilities like leadership or healing, so these units would end up looking really useless.
I'm not an AI specialist, but from my naive point of view, there might be ways to teach the computer how to use them properly on the basis of our experience as players (like slowing the unit which can do the most damage). I think that in general, tactics are easier to teach to an AI than strategy. But even for strategy, as soon as we can draw general principles from our experience, there may be ways to encode them into the AI. I don't know what the computational cost would be, though...
Inky wrote: I had some (off topic now I guess) questions about the equations, sorry if they're stupid (but it would be a shame not to discuss them after you spent all the time writing them)
No problem, it's a kind of "game" for me to try to find ways to formalize Wesnoth, as I'm between two research contracts, which means months without formalizing scientific stuff every day. I would be curious whether other people have already worked out coherent sets of equations to describe some aspects of Wesnoth.
Inky wrote: It wasn't quite clear to me what P_i is. "The proportion matrix of each kth unit and terrain for each ith player P" seems to imply that P_i is a matrix, but from the equations it seems that P_i is a vector with k entries?
P is a matrix, and P_i is the vector of proportions for player i. The P matrix is simply the concatenation of all P_i vectors. P_i lists all proportions of units and terrains. Let's say there are only two units (spearman and archer), two terrains (grass and desert) and 2 players in the game. We could imagine a situation defined by:
P_1 = {0.8 ; 0.2 ; 0.5 ; 0.5}
P_2 = {0.2 ; 0.8 ; 0.5 ; 0.5}
This means that player 1 has 80% spearmen and 20% archers, player 2 has 20% spearmen and 80% archers, and the terrain is 50% grass and 50% desert across the various maps on which the situation is simulated.
Obviously there will be repetitions in the matrix (its bottom rows will have the same value in every column), as the terrain proportions are the same for all players. However, the contributions of each terrain will differ between players, as some will win and others lose. This is simply a trick to make the problem multilinear and amenable to OLS. I assume linearity because it is equivalent to a first-order approximation, and in most problems most of the information can be captured by a linear form.
Inky wrote: And instead of doing least squares for every different vector v and then averaging the results, wouldn't it be much faster/easier to first average all the v's and then do least squares on that? Or does that not work?
I think it would be problematic, because then we would do OLS on an averaged P matrix. Ideally, each individual P matrix should get the most suitable c_k coefficients for each unit and each terrain, to predict which player will win the individual game corresponding to that P matrix.
Regarding the player database, the only data needed for each game would be, in the end:
 the proportions of each unit created (it's not a problem that different players have different sets of units available; we would simply have a sparse matrix, as obviously a Loyalist would not create a Saurian Skirmisher, so the Saurian Skirmisher proportion would be 0 for a normal Loyalist faction);
 the proportions of each terrain type (which would be repeated at the end of each P_i column)
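The two items above could be turned into a P_i column along these lines. This is only a sketch: the helper names `proportions` and `player_column`, the recruit lists and the tile counts are all invented for illustration, and no replay parsing is attempted.

```python
from collections import Counter

def proportions(counts, names):
    """Proportion of each name in `names`, in order; 0.0 for absent names."""
    total = sum(counts.get(n, 0) for n in names)
    if total == 0:
        return [0.0 for _ in names]
    return [counts.get(n, 0) / total for n in names]

def player_column(recruits, terrain_tiles, unit_names, terrain_names):
    """Build P_i: unit proportions followed by terrain proportions."""
    return (proportions(Counter(recruits), unit_names)
            + proportions(Counter(terrain_tiles), terrain_names))

unit_names = ["Spearman", "Bowman", "Saurian Skirmisher"]
terrain_names = ["grass", "desert"]
tiles = ["grass"] * 50 + ["desert"] * 50   # hypothetical map

# Hypothetical recruit lists for two players of different factions.
p1 = player_column(["Spearman"] * 8 + ["Bowman"] * 2, tiles, unit_names, terrain_names)
p2 = player_column(["Saurian Skirmisher"] * 5, tiles, unit_names, terrain_names)
print(p1)
print(p2)
```

Units a faction cannot recruit simply show up as zeros, which is the sparsity mentioned in the first item.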
The_Gnat wrote: How do you propose to factor in gold and how it is spent. (players could buy many cheap units such as spearman, or goblins, or they could focus on specialists such as mages and berserkers.)
How are you planning on calculating for resistances and attack types. (For example pierce does virtually nothing against undead, to compensate the loyalists faction has mages with fire damage which are invaluable. A second example is that when i was balancing factions i added the elvish sorceress to a faction i had created (instead of the red mage). It then suddenly became massively unbalanced versus the drake faction. I was surprised by this since they both the red mage and elvish sorceress are similar units with similar damage however the elvish sorceress has arcane and the red mage has fire which versus drakes (who have strong fire resistance and weak arcane resistance) was an massive difference
Regarding the particular modifications of units: if a sufficiently proficient AI is available, then for me this really is a matter of supervised optimization. Every parameter you can imagine (including resistances, gold cost, and even particular characteristics such as leadership, which can be modeled as booleans...) can be tuned using standard stochastic optimization methods for complex problems. Monte Carlo alone may explore too large a space and lead to absurd results, so, e.g., a perturbation method that proposes new hypotheses while staying close to the old values, and checks the effect on the overall balance, could do the trick. We could combine it with a bit of Monte Carlo, so that some optimization steps check whether big changes could provide more balance. And we could manually constrain the optimization not to change some unit characteristics, either particular units or particular characteristics, if, for example, we think that some units are perfect from an RP point of view and we don't want to change them.
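The perturbation scheme described in the reply above (small steps near the current values, occasional larger jumps, and some parameters held fixed) might be sketched like this. The function name `propose`, the step sizes and the example stats are assumptions for illustration only.

```python
import random

def propose(stats, frozen, rng, small=0.05, big=0.5, p_big=0.1):
    """Return a perturbed copy of `stats` (dict: parameter name -> value).

    Most proposals are small relative perturbations around the current
    value; with probability `p_big` a larger jump is tried instead.
    Parameters listed in `frozen` are never changed.
    """
    free = [k for k in stats if k not in frozen]
    new = dict(stats)
    if not free:
        return new
    key = rng.choice(free)
    scale = big if rng.random() < p_big else small
    if isinstance(stats[key], bool):
        # Boolean traits such as leadership: flip only on a big jump.
        if scale == big:
            new[key] = not stats[key]
    else:
        # Relative perturbation; note that a value of exactly 0 stays 0.
        new[key] = stats[key] * (1.0 + rng.uniform(-scale, scale))
    return new

rng = random.Random(1)
# Invented example parameters, not real unit data.
stats = {"hp": 36.0, "cost": 14.0, "blade_resistance": 0.2, "leadership": False}
candidate = propose(stats, frozen={"hp"}, rng=rng)
print(candidate)
```

Each candidate would then be scored by re-running the simulations and comparing the resulting balance vector with the target.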
Computer_Player wrote: The problems of the first assumption has been addresed, I will also raise a problem about the other assumption (map layout not unduly affecting the statistics). This is because map layout has a very big effect on the outcomes of the match, particularly with regards to how it favors certain factions. Map layout also affects player decision making not only in terms of the avenues it could take, but also about the availability of resources and information.
This is not to say that the idea isn't useful tho.. but we have to be aware of its limitations.
I know that there are limitations, since obviously, for an individual game, the particular configuration of the terrain and the order in which the units were created (i.e. the different proportions of units at each turn) will have an impact on the odds of victory. What I hope is that, for a big enough set of games, these effects average out: in principle, for a normal distribution of games, an advantageous temporal arrangement of unit proportions or spatial arrangement of tiles would be compensated by a disadvantageous one. This is a kind of statistical-mechanical (or thermodynamic) hypothesis. If you take a snapshot of a small cube containing a few molecules for a short time, the pressure in this cube will be quite different from one snapshot to the next. If you take a sufficient number of these snapshots, you can expect the average pressure to be representative of the pressure you would measure for a big cube over a long time.
Re: [engine] A mathematical framework for game balancing
Thanks for explaining!
Even if it's not practical, it sounds like a cool idea in theory. It's always very interesting to see real-life (Irdyan-life, I guess) applications of math. (Maybe you will be the first to write a math paper about Wesnoth.)
 skeptical_troll
Re: [engine] A mathematical framework for game balancing
I praise your brave formalization attempt, but, unless I am missing something (I'd lie if I said I understood everything you wrote), you'll need to add some element (or a prior/constraint on your minimization) to preserve diversity. There is an obvious way of making all units/factions perfectly balanced for any map: giving them the same HP, gold, resistances etc., which is of course very boring. Even if you get your algorithm to work, how do you prevent it from simply making the stats very homogeneous, if not identical?
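One way such a constraint could enter the minimization is as a penalty term that grows when unit stats become too similar. The sketch below is only an illustration of that idea: the diversity measure, the threshold `min_diversity`, the weight and all numbers are assumptions, not something proposed in the thread.

```python
import numpy as np

def balance_loss(b, target):
    """Squared distance between obtained and target balance vectors."""
    return float(np.sum((np.asarray(b) - np.asarray(target)) ** 2))

def diversity(stats):
    """Mean pairwise distance between unit stat rows.

    Each column is divided by its mean so that, e.g., HP and cost are
    comparable. Assumes at least two units and positive column means.
    """
    X = np.asarray(stats, dtype=float)
    X = X / X.mean(axis=0)
    n = len(X)
    dists = [np.linalg.norm(X[i] - X[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))

def penalized_loss(b, target, stats, min_diversity, weight=10.0):
    """Balance loss plus a penalty when diversity falls below a floor."""
    shortfall = max(0.0, min_diversity - diversity(stats))
    return balance_loss(b, target) + weight * shortfall ** 2

# Two hypothetical rosters (columns: HP, cost) with the same balance vector.
same = [[30, 14], [30, 14]]
varied = [[24, 12], [42, 19]]
b = [0.5, 0.5]
target = [0.5, 0.5]
print(penalized_loss(b, target, same, 0.3), penalized_loss(b, target, varied, 0.3))
```

With a term like this, the degenerate solution of identical units is no longer the cheapest way to reach a flat balance vector.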
On the other hand, I don't think the dependence on the map is too relevant: MP map design is part of the balancing problem, and if the factions are diverse enough there will always be maps on which one systematically prevails. So you can solve the issue by training your algorithm on a limited set of maps which are considered to be balanced. What I don't see is how you could do it with a human-battles database: you need battles with the new stats you want to test, right? I don't see how you can get that without virtual AI games.
Re: [engine] A mathematical framework for game balancing
Glxblt76 wrote: Assuming that the particular size and configuration of the map does not influence the final outcome
Afraid your initial assumption is way off. Size certainly has huge implications for balancing, as does placement of terrain. Compare a mountain on the edge of the map with one next to a frontline vill, for example...
As for AI testing, I'm afraid the current AI is very weak compared to human players for NvN multiplayer (no disrespect meant to whoever coded it, it's an incredibly tough problem). A mediocre human player should beat the AI almost every game. AI games are useless for balancing, and the AI would need massive improvements before they were useful: balancing is based on the top players, so you'd need an AI that could beat the top human players. As has been mentioned, there's a huge range of obstacles here, not least lack of knowledge (how to infer what your opponent is doing in the fog), adaptability, and strategic management. Even tactical situations are tough, due to the huge range of possible outcomes from just a few unit interactions, ordering problems etc.
I think you're underestimating or missing many of the variables that impact balance, and thus the time it would take to 'optimise across all variables', but that's a pretty academic point when you have no way of generating a fitness metric for each set. Since a vastly improved AI is needed for your idea, perhaps focus on that as the first step, since any improvements to it would also be valuable in themselves?
Have you played much multiplayer (against humans, against AI)? It might be useful to get some idea of the complexity involved before getting too invested in this project...
Re: [engine] A mathematical framework for game balancing
AI is a topic that could definitely use a mathematician: http://yieldthought.com/post/9572288205 ... betterai