Ladder Site Online...

Discussion of all aspects of multiplayer development: unit balancing, map development, server development, and so forth.

Moderator: Forum Moderators

Gallifax
Multiplayer Moderator
Posts: 137
Joined: October 23rd, 2006, 5:36 pm
Location: Who cares?

Re: Ladder Site Online...

Post by Gallifax »

To all of you. You missed something!

eyerouge has stepped back to the second row for the moment. However much any of us liked or disliked the recent cap troubles, many thanks for all the work put into the project so far! :) Of course this also applies to all the other guys/girls involved; I know chains is one of them.


Somehow I felt this needed to be said.

Cheers
TheMasterOfBattle
Posts: 161
Joined: October 24th, 2008, 1:13 pm
Location: My War Council

Re: Ladder Site Online...

Post by TheMasterOfBattle »

Well, to clear up a few things.
eyerouge wrote:My question to everyone that supports the system is why we should measure exactly the same skills differently just because a player has played more games. Obviously it makes no sense and it clearly distorts the rating of that game unless it really does compensate for match spamming. And if it does compensate, how on earth would one know if the compensation is fair and/or enough? By i.e. over-compensating one would just distort Elo.
This is why we plan to use a soft cap: I don't see many people managing to play the 21 games in the past seven days needed to trigger it. Even after that, it would be set so the K value goes down extremely slowly, so it shouldn't affect many people.

Then again, I haven't seen much of a match spamming problem, so maybe it would be best to keep this in the code as a solution we can activate if a match spamming problem ever arises.
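
A minimal sketch of how such a switchable soft cap could look. Only the 21-games-in-seven-days trigger comes from the discussion; `BASE_K`, `DECAY_PER_GAME`, `MIN_K_FACTOR` and the function name are made-up illustration values, not the ladder's actual code or config.

```python
from datetime import datetime, timedelta

BASE_K = 32.0          # assumed normal K value, not the ladder's real setting
CAP_GAMES = 21         # games within the window that trigger the cap (from the post)
WINDOW = timedelta(days=7)
DECAY_PER_GAME = 0.02  # assumed: K shrinks by 2% per game from the 21st onward
MIN_K_FACTOR = 0.5     # assumed floor so K never drops below half


def soft_capped_k(game_times, now, enabled=True):
    """Return the K value to use for a player's next game.

    game_times: datetimes of the player's already reported games.
    enabled:    lets admins keep the cap in the code but switched off.
    """
    if not enabled:
        return BASE_K
    # Count only games played inside the last seven days.
    recent = sum(1 for t in game_times if now - t <= WINDOW)
    # The cap starts to bite once 21 recent games are on record.
    extra = max(0, recent - CAP_GAMES + 1)
    factor = max(MIN_K_FACTOR, 1.0 - DECAY_PER_GAME * extra)
    return BASE_K * factor
```

With these assumed numbers a player with 20 recent games still plays for the full K, and each further game inside the window shaves off only a small slice, which is the "goes down extremely slowly" behaviour described above.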


As to user-made maps: no, I would not allow just any user-made map to be approved, if any at all. A map would have to be proven balanced to the standard of the mainline maps to even be considered for ladder use. And if it is deemed worthy, then only that map would be added to the list of legal ladder maps.


As for a 2v2 ladder, yes, as eyerouge and Pelopidas have said, we plan to get one in the future.
Gallifax wrote:To all of you. You missed something!

eyerouge has stepped back to the second row for the moment. However much any of us liked or disliked the recent cap troubles, many thanks for all the work put into the project so far! :) Of course this also applies to all the other guys/girls involved; I know chains is one of them.


Somehow I felt this needed to be said.

Cheers
I couldn't agree more Gallifax. :)
chains
Posts: 76
Joined: January 9th, 2007, 5:02 am
Location: Portland OR

Re: Ladder Site Online...

Post by chains »

Over several recodes the provisional system has been broken.

The system was created to prevent the 'Owlface' problem. Owlface played 40 games against people he knew were bad and had never played before. These players often only played a few games, and they were all against him. He became the number 1 ladder player for months and refused to take serious games.

Obviously it's bad that someone could be a top 10 player by playing nothing but these crappy players, now called provisional players. So, to fix this, I coded the provisional system, which we stole from real chess websites; they have the same inflated rating problems we do because people can cherry-pick their opponents.

To prevent this sort of exploit, I halved the number of points gained or lost to provisional players, and halved the number they gained or lost. This makes it impossible to climb to the top off nothing but provisionals, and makes it impossible for 1500 newbies to inflate their ratings in only 10 games.
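
A rough sketch of the halving rule as described here, on top of the textbook Elo formula. The K value of 32 and the function names are assumptions for illustration; the ladder's real code and settings may differ.

```python
PROVISIONAL_GAMES = 10  # players are provisional for their first 10 games (per the thread)
K = 32.0                # assumed K value; the ladder's real setting lives in its config


def expected_score(rating, opponent_rating):
    """Standard Elo win expectancy for `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def rating_changes(winner, loser):
    """Return (winner_delta, loser_delta).

    Each player is a (rating, games_played) pair.
    """
    (w_rating, w_games), (l_rating, l_games) = winner, loser
    delta = K * (1.0 - expected_score(w_rating, l_rating))
    if w_games < PROVISIONAL_GAMES or l_games < PROVISIONAL_GAMES:
        # Halve the swing for both sides whenever a provisional is involved.
        delta /= 2.0
    return delta, -delta
```

For two equally rated established players this gives a swing of 16 points; if either of them is still provisional, the same result moves both by only 8.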

This of course makes people who want instant results unhappy: they want to play ten games and be in the top 30. They don't want to play 10 games and earn their way to being a real rated player with a rating somewhere between 1450 and 1550.

Elo isn't an instant system; it takes 40-50 games before any sort of accuracy is attained. That gives you something to work for. 50 games isn't really that hard to do. I play that many games in a week.

Now the exploit in its worst and most obvious form. In the current system, as a 1900-rated player, I can lose 10 games to a provisional. This costs me 40-60 points. Once they hit their 11th game, I can beat them 20 times. This gains me 200-300 points. So I can climb to the top of the ladder simply by "sandbagging" to unrated players, inflating their rating at almost no cost to me, then turning around and beating them a few times for easy points. The system as I coded it made this really unproductive, because after you feed a newbie a few games he is still rated really low, so when he comes out of provisional you get very few points off him, and the net gain is very small.

Obviously no one has quite exploited the system to that degree, yet many players play and lose games to provisional players because of bad luck, because of sloppy play since they know the game barely counts, or because the provisional isn't that bad. This causes the provisionals to be overrated. So players looking for an edge find these just-out-of-provisional players and beat their socks off... sometimes 2, 3 or 4 times in a row...

I am not going to mention any names because the system is designed to be exploited. Anyone who wants to inflate their rankings is forced to target players with 10 games in who are inflated.

Example
P = provisional / E = eyerouge / C = chains
I am sloppy because I'm playing a provisional, and losing to one barely hurts.
P beats C * 2: P gains 100 points, C loses 12 points.
P comes out of provisional after smoking me.
E beats P * 2: E gains 80 points.
C beats E * 2: C gains 80 points.

C is inflated 68 points.
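
The bookkeeping behind that example can be checked with a few lines. The gains and C's 12-point loss are the figures from the example; the losses in the last two steps are an assumption (games between non-provisional players treated as zero-sum).

```python
# (description, {player: point change}) for each step of the example.
# Gains and C's loss come from the example; the -80 entries are assumed.
steps = [
    ("P beats C twice", {"P": +100, "C": -12}),
    ("E beats P twice", {"E": +80, "P": -80}),
    ("C beats E twice", {"C": +80, "E": -80}),
]

net = {}
for _, changes in steps:
    for player, delta in changes.items():
        net[player] = net.get(player, 0) + delta

# Points that entered the ladder without anyone paying for them.
created = sum(net.values())
print(net, created)
```

Under that assumption C ends up 68 points ahead, E breaks even, P keeps 20, and 88 points in total have appeared from nowhere, all of them created in the first step where P gained far more than C lost.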
eyerouge
Posts: 380
Joined: June 29th, 2007, 4:37 am
Location: wtactics.org

Re: Ladder Site Online...

Post by eyerouge »

I support chains' notions; ehrm, hard not to, considering how it all has turned out after some of the changes.

When they were applied we got heat because people "climbed too slowly". Luckily it isn't really hard to change: to escape the inflated provisional ratings you'd just need to give the provisional players a normal K, or even better, half or a quarter of the normal K value (all settable in the config already). This of course will, again, raise the question of what that does to the newcomer: will he quit when he notices he has played 10 games and gotten nowhere? Or will he understand, by reading the FAQ etc., that it's just for 10 games? Good questions.

While we're tweaking values, I honestly also think that the protection the normal player gets from the provisional is set too high: if you lose against a provisional player today you only lose 25% of the points you would have lost if you had played him in his 11th or later game, when he's no longer provisional. That may contribute to the problem by making provisional players too attractive to play. I wouldn't know, just raising the question. It could be an idea to make losses against provisionals 50% of the normal loss instead of 25%. Then again, if you removed the protection altogether, high-ranked veterans might play provisional players less often, since they a) don't know the provisional player's real rating and b) fear they'll lose a huge chunk of points if they lose because of a).
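
To get a feel for the three options (25%, 50%, no protection), here is a small sketch using the textbook Elo formula. The K of 32, the example ratings and the function names are assumptions, not values taken from the ladder config.

```python
K = 32.0  # assumed K value


def expected_score(rating, opponent_rating):
    """Standard Elo win expectancy for `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def loss_to_provisional(rating, provisional_rating, protection):
    """Points an established player loses when beaten by a provisional.

    `protection` is the fraction of the normal loss that is actually charged:
    0.25 is the setting described above, 0.5 the suggested alternative,
    1.0 means no protection at all.
    """
    normal_loss = K * expected_score(rating, provisional_rating)
    return normal_loss * protection


# A 1900 player losing to a provisional currently listed at 1500 (made-up ratings).
for protection in (0.25, 0.5, 1.0):
    print(protection, round(loss_to_provisional(1900, 1500, protection), 1))
```

The higher-rated the veteran, the larger the unprotected loss, which is where the worry about veterans avoiding provisionals comes from.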

Also, a while ago we changed the win expectancy to account for RNG effects. Looking at the results of many of the top players, who are veterans in the general community, I believe we might have overdone it, and that the better-rated player should have a somewhat higher win expectancy than he currently has. This is also easy to change in the config.
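
How the expectancy was actually adjusted isn't spelled out here. One possible way to do it, shown purely as an assumption, is to widen the scale divisor in the Elo expectancy formula (400 in the textbook version), which pulls the favourite's expectancy towards 50%.

```python
def win_expectancy(rating, opponent_rating, scale=400.0):
    """Elo win expectancy; a larger `scale` flattens the curve.

    scale=400 is the textbook value. Anything else is a hypothetical
    "luck compensation" setting, not the ladder's real parameter.
    """
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / scale))


# Favourite is 200 points ahead; compare the textbook curve with a flatter one.
print(win_expectancy(1700, 1500))         # standard curve
print(win_expectancy(1700, 1500, 600.0))  # flatter: the favourite is expected to win less often
```

Raising the better player's expectancy again would then just mean moving the divisor back towards 400.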

Notice that any of the above changes would require a total rerank, after which everyone(?) would get a different rating, of course; it's not something that should happen all too often. In a year I think we have reranked about twice. The last time was after the win expectancy was lowered at the request of Wintermute (see other posts in this thread for that discussion, and also the PDFs of outcomes under different settings).
eyerouge
Posts: 380
Joined: June 29th, 2007, 4:37 am
Location: wtactics.org

Re: Ladder Site Online...

Post by eyerouge »

http://ladder.subversiva.org/opposition ... e=Gallifax

..is what I have so far. I will perhaps add more next weekend and include it in the next release. You can change the name at the end of the URL to view your own info.

Notice: There are tooltips in the column headers, so hold the mouse over them to get an explanation. The main use comes when you start clicking the headers to sort the columns; there's plenty of info.
Doc Paterson
Drake Cartographer
Posts: 1973
Joined: February 21st, 2005, 9:37 pm
Location: Kazakh

Re: Ladder Site Online...

Post by Doc Paterson »

TMOB wrote:As to user-made maps, no, I would not allow just any user made map to be approved if any. The maps would have to be proved to be balanced to the standards of the mainline maps in order to even be considered for being allowed for use of the ladder.
Hooray for the voice of sanity. :D
I will not tell you my corner / where threads don't get locked because of mostly no reason /
because I don't want your hostile disease / to spread all over the world.
I prefer that corner to remain hidden /
without your noses.
-Nosebane, Sorcerer Supreme
TheMasterOfBattle
Posts: 161
Joined: October 24th, 2008, 1:13 pm
Location: My War Council

Re: Ladder Site Online...

Post by TheMasterOfBattle »

Doc Paterson wrote:
TMOB wrote:As to user-made maps, no, I would not allow just any user made map to be approved if any. The maps would have to be proved to be balanced to the standards of the mainline maps in order to even be considered for being allowed for use of the ladder.
Hooray for the voice of sanity. :D
Hehe, I didn't know I was considered to be sane. :lol2:
Fosprey
Posts: 254
Joined: January 25th, 2008, 8:13 am

Re: Ladder Site Online...

Post by Fosprey »

I want to throw out an idea. I don't know if it's good or not, so here it is.
Give the option to start a "small gamble" game, that is, a game where the K value is halved.
Sometimes I want to play on the ladder but don't want to think too much, or just want to try stuff. I've lost LOTS of points doing that on the ladder,
so I would like to risk fewer points but still have the game count for the ladder.
I really don't care that much, so I will keep doing it anyway. But it could encourage people to try new strats on the ladder. And of course it would be better for me.
As I said, just hear the suggestion; I don't know if it's actually good :)
Pelopidas
Posts: 18
Joined: November 5th, 2008, 7:42 pm

Re: Ladder Site Online...

Post by Pelopidas »

@chains:
Your system is a first approach, but as you noticed, it has serious shortcomings.
You want to halve point changes for both newcomers AND their opponents. That this does not really work after the cap has been removed is entirely obvious. It also does not remove the problem, since superior players can still siphon points off the newcomer; only the size of the rating change is halved.

One needs a system that quickly adjusts the ratings of new players, before others can siphon off points or lose significant points. So what one needs is a strong rating change for the new player and a very weak rating change for the established player. You may want to work through some examples for yourself. This is also the rating system chess servers apply. When I join a chess server, my rating adapts very quickly: the first rating changes are of the order of 100 points, then decline quickly, and the rating converges to its fair value instead of creeping slowly towards it.
This is the only sensible way of getting rid of point stealing, and it is exactly what the RD-system I suggested does.

@eyerouge:
Concerning your doubts about the RD-system:
An RD-system is designed precisely NOT to alter the gain/loss ratio of a match.
I think this misunderstanding arose from some earlier attempts, and from the recently set-up cap coding done by Admiral, which actually would distort the ratings of active players. One has to alter the K value for each player separately, but must NOT treat the winning K and the losing K separately.
I think the best thing might be to discuss this in some quiet moment next weekend, possibly by ICQ or in a chat, where we can hopefully clear up misunderstandings instantly.
Back to the RD-system:
You already calculate the win/loss ratio of points by using an expectation value for winning or losing a match and then demanding that the expected point change be 0 at a fair rating. This is left entirely unaltered by the RD-system.
The point where the RD-system enters the scheme is the K value, which has to be the same for a player no matter whether he loses or wins. The RD-system determines the K value, which is lower the more active the player himself is, and higher the more active the opponent is (so you cannot siphon points from a newcomer, but HIS rating will change very quickly over his first matches).
Ladder distortion will exist as long as new players do not reach a fair rating quickly enough that no one can siphon significant amounts of points from them.
Think about a grandmaster of Wesnoth entering the ladder. In the current system he will need, say, 50 matches to get into the top region. During this time he is severely distorting the ladder.
On the other hand, a complete newcomer who hardly knows the rules also needs 30-40 matches to get to the bottom, where he actually should have started. On his way down, he delivers huge amounts of points to other players.
chains' system does not work against this; it actually tends to slow these changes down even further, without fundamentally altering the amount of points that is siphoned off.
With the RD-system it is different: if you win your first 10 matches against the top players, you will already be found in the top ranks, which is your fair place.
If you lose your first 10 matches against medium to bad players you will quickly be where you should be, at a newcomer rank in the ladder, say 1100-1200.
The good thing about the opponent setting is that no matter what happens, the opponents' ratings are not going to be significantly distorted, since their point change is minimal due to the high RD of the newcomer.
I hope that shed some light; for details a separate discussion might be more appropriate.
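
For readers who want to try numbers: the exact formulas of the proposed RD-system are not given in this thread, so the sketch below substitutes Glicko-1, a published rating system built around a rating deviation (RD), as a stand-in. The starting RDs of 350 and 50 and the equal ratings are made-up example values.

```python
import math

Q = math.log(10) / 400.0


def g(rd):
    """Glicko attenuation: the less certain a rating is, the less it counts."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def glicko_update(rating, rd, opp_rating, opp_rd, score):
    """One-game Glicko-1 update; score is 1 for a win, 0.5 draw, 0 loss.

    Returns the player's new (rating, rd).
    """
    g_opp = g(opp_rd)
    expected = 1.0 / (1.0 + 10 ** (-g_opp * (rating - opp_rating) / 400.0))
    d_squared = 1.0 / (Q**2 * g_opp**2 * expected * (1.0 - expected))
    denom = 1.0 / rd**2 + 1.0 / d_squared
    new_rating = rating + (Q / denom) * g_opp * (score - expected)
    new_rd = math.sqrt(1.0 / denom)
    return new_rating, new_rd


# A newcomer (high RD) beats a veteran (low RD); both are listed at 1500.
newcomer = glicko_update(1500, 350, 1500, 50, 1.0)
veteran = glicko_update(1500, 50, 1500, 350, 0.0)
print(newcomer, veteran)
```

In this example the newcomer's rating jumps by well over 100 points while the veteran's moves by only a few, and the same step size applies to each player whether he wins or loses. That is the asymmetry described above: fast adjustment for the new player, almost no distortion for the established one.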

@Fosprey
I fear I did not understand your point exactly. Do you mean you would like to play ladder matches with low rating changes?

@TMOB
Well, as I stated, I am not attached to any particular kind of map being played on the ladder, as long as the maps are OK. I would think that players will not want to play heavily unbalanced maps themselves. I am not sure why one should ban players from agreeing on newly designed maps. At the moment the number of maps to choose from is relatively small, and I would be happy if there were more in the future. It definitely has to be guaranteed that no one has to face a new map that he does not want to play on. On the other hand, the ladder is at the moment increasing its share of players in the Wesnoth community, as far as I can see on the server, so our actions will necessarily have an impact on whether new maps get developed or not. So it is also a strategic decision what one wants to allow and what to ban. One could also create a rule that opponents have to choose random race on a not-yet-canonical map. Your own map, TMOB, is for example a nice map that I would like to test in ladder play; in free matches it is a problem to find a suitable opponent. From this perspective one might even think about some registration for new maps at the ladder, giving them a test status and introducing something like a discussion section for them (for this purpose one might, in my eyes, open a separate thread in the Wesnoth forum, since it is about map development).
Yogibear
Retired Developer
Posts: 1086
Joined: September 16th, 2005, 5:44 am
Location: Hamburg, Germany

Re: Ladder Site Online...

Post by Yogibear »

Pelopidas wrote:On this perspective one might even think about some registration for new maps at the ladder, giving them a test-status and introducing something like a discussion section on them (for this purpose one might in my eyes get an own thread in the Wesnoth forum, since it is about map development).
I agree that new maps are desirable. On the other hand, I think new maps should undergo an approval step by Doc as well. Why should we find out problems the hard way, investing dozens of hours of play, if Doc's trained eyes can identify problems at first glance?
Smart persons learn out of their mistakes, wise persons learn out of others mistakes!
ElvenKing
Posts: 105
Joined: February 7th, 2008, 7:02 am
Location: Melbourne, Australia

Re: Ladder Site Online...

Post by ElvenKing »

Pelopidas wrote:I fear I did not understand your point exactly. Do you mean you would like to play ladder matches with low rating changes
That is exactly what he meant. He wants to play matches that have less points riding on them. He wants it for the times he just wants to muck around, try new strategies, etc. Seems a bit of a weird request to me since you could just play non-ladder to do those things.

Still I guess he might have trouble finding reasonable opponents for a non-ladder 1vs1. I haven't played in months so I don't know how hard it is to find good opponents for a non-ladder 1vs1.

The major problem I would have with introducing a system where you could agree to play for less points is the potential for disputes. There would be cases where people would dispute whether or not they had agreed to play for half points, not that there would be many of course, in fact there probably would be very few; but I don't think the ladder needs any more potential for disputes.
Last edited by ElvenKing on November 10th, 2008, 12:24 pm, edited 1 time in total.
"if nothing we do matters... , then all that matters is what we do."
Angel- Angel the Series

"Sore thumbs. Do they stick out? I mean, have you ever seen a thumb and gone 'wow, that baby is sore'?"
Willow Rosenberg- Buffy the Vampire Slayer
Zlodzei
Posts: 44
Joined: January 6th, 2007, 10:31 am
Location: Belarus, Minsk

Re: Ladder Site Online...

Post by Zlodzei »

eyerouge wrote:http://ladder.subversiva.org/opposition ... e=Gallifax

..is what I have this far. Will add more next weekend perhaps and include it in next release. You can change the name at the end of the URL and view your own info.
Bug: Go to http://ladder.subversiva.org/opposition ... me=Fosprey and try to sort by "E.Gained". The column is sorted as text, not as numbers. I can't reproduce it for other players. ;)
I can see you!...
eyerouge
Posts: 380
Joined: June 29th, 2007, 4:37 am
Location: wtactics.org

Re: Ladder Site Online...

Post by eyerouge »

Zlodzei wrote:Bug: Go http://ladder.subversiva.org/opposition ... me=Fosprey and try to sort by "E.Gained". It is sorted as text, not as numbers. Can't reproduce it for other players. ;)
I can't reproduce it at all: I just checked and it sorts correctly here. I'm using Firefox 3.0.3 on Ubuntu (Linux). Maybe it's your browser and/or platform combination?
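
For anyone unsure what "sorted as text" looks like, here is a tiny illustration with made-up cell values (not real ladder data, and not the site's actual sorting code). Why it would only show up on one player's page is not established here; a column the sorter fails to recognise as numeric is just one guess.

```python
values = ["9", "-12", "100", "25"]  # made-up column cells, not real ladder data

as_text = sorted(values)              # compares character by character
as_numbers = sorted(values, key=int)  # compares the numeric values

print(as_text)     # ['-12', '100', '25', '9']
print(as_numbers)  # ['-12', '9', '25', '100']
```

In the text ordering "100" lands before "25" and "9", which is the typical symptom to look for in the column.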
eyerouge
Posts: 380
Joined: June 29th, 2007, 4:37 am
Location: wtactics.org

Re: Ladder Site Online...

Post by eyerouge »

ElvenKing wrote:That is exactly what he meant. He wants to play matches that have less points riding on them. He wants it for the times he just wants to muck around, try new strategies, etc. Seems a bit of a weird request to me since you could just play non-ladder to do those things.
1. As Elven writes, why can't a normal game just be played? It sounded like "there weren't decent non-ladder 1vs1 players around"... and if I got that part right, it's ridiculous: of course there are. Nor is one automagically a decent 1vs1 player just because one happens to play ladder games. If one just ends up playing crap players outside the ladder, maybe one should revise one's criteria for picking opponents. What about, e.g., actually talking with each other for 3-5 minutes and getting to know the person?

2. Allowing players to barter/agree/wager on how many points they will play for is a superb way to ensure that the rating becomes totally useless, since each time they agreed on something other than the standard Elo thingie it would distort the rating and add to its inaccuracy. The whole rating would become more like a game element in Monopoly or whatever. While I think it would be cool to implement a "gold coins" system where people can "wager" gold on a game, collect it and whatnot, such crap is just fluff and has no priority whatsoever, at least not for my fingers.
Yogi wrote:I agree that new maps are desirable. On the other hand, i think new maps should undergo an approval step by Doc as well. Why should we find out problems the hard way, investing dozens of hours of play, if Doc's trained eyes can identify problems at first glance
It sounds great, if Doc really wants to put time into such a project. As far as I have seen, nobody has asked him.

But, really, how many maps can there be a need for? I know variation is cool, but with too much variation you make it very, very hard for players to master any map at all, since maps, opponents and/or factions keep changing in the majority of games. Let's not forget that there are very few players who truly master the official maps equally well. Most can improve on some of them, if not all. Map knowledge is always important and an advantage in a game. By having many maps one makes it, once again, harder to compete with hardcore gamers. Why? Because a hardcore gamer would know all the maps, since he has more time to play, and that in turn leads to better gameplay by him against the average sucker who goes up against him semi-clueless about the map.

Sure, it can all be agreed on and then the problem is solved. Or is it? The more maps there are, the harder it should be to agree on something. It's like coming to the game store with your boyfriend when it's Christmas again: the mere number of options makes a decision harder. But let's not take this all too seriously, I'm just playing with the idea...

Lastly, the more variables that vary between different games, the more general and the more useless the rating becomes. As I wrote to KnightKunibert the other day:

The more game variations you allow in a rating system the less accurate and useful the rating becomes. This is a mathematical truth that probably won't ever be proven wrong.

Let me give an example with Wesnoth. Imagine the ladder allowed different Eras. For each era we allow, the less we can compare players with each other. So, if I compare my 1500 with your 1600, we can't really say much about our skills. Why? Well, because it would require different skills to master different Eras, perhaps.

Now, let's forget about the Eras. Let's make it so people can choose how much gold they start with, or any other of the many factors they can set up and customize. For each one they are free to decide, the value of the ratings and statistics gets lower and they become more meaningless. It has to do with mathematics, statistics, and measurements. It's kind of like mixing €uro, dollar$ and yen in a bag and then writing "I have 1500 in the bag". 1500, okay. 1500 what? It's not 1500 yen, nor dollars, nor euros. It's 1500-something-unknown. Also, in the example with the currencies, you can easily convert a euro to yen and so on. But when it comes to skills and factors in Wesnoth, you can't convert a victory in one era into x points in another era.. and so on.

a) To make a long story short, some factors can easily be converted and measured, and others can't be in any rational way. The ladder is already doing a balancing act on this issue and walking a thin line, with very, very liberal rules compared to almost any other rating system where people try to act professionally and run a high-quality rating system (e.g. chess).

b) The official maps have had countless hours of playtesting and revision put into them by the official developers, people who understand the game and who are veterans. Their expertise also ensures that one of their main goals is reached as well as anyone could reach it: the balance of a map. It is an assurance that the map is better balanced than any custom map made by a random guy from the community, and an assurance that it's a quality map that offers strategic depth. By allowing custom-made maps you increase the problems in a); you make it very easy for people to cheat (just create an unbalanced map); you make it harder to enter the ladder and play well, since you can never be quite sure that you will ever play on the same map again; and you make it hard to find games, since many will want to play on their own maps and many will want to avoid others', so they'll all create their own games rather than join already created ones. These are just a few reasons, but there are plenty of others as well.

Pelopidas:
I'd like to participate in a chat sometime if possible, mainly, as you suggest yourself, to clear up all our misconceptions :P That said, I won't touch that part of the code unless Admiral declines to, since I think he knows it better than me.
Wintermute
Inactive Developer
Posts: 840
Joined: March 23rd, 2006, 10:28 pm
Location: On IRC as "happygrue" at: #wesnoth-mp

Re: Ladder Site Online...

Post by Wintermute »

Yogi Bear wrote:
Pelopidas wrote:On this perspective one might even think about some registration for new maps at the ladder, giving them a test-status and introducing something like a discussion section on them (for this purpose one might in my eyes get an own thread in the Wesnoth forum, since it is about map development).
I agree that new maps are desirable. On the other hand, i think new maps should undergo an approval step by Doc as well. Why should we find out problems the hard way, investing dozens of hours of play, if Doc's trained eyes can identify problems at first glance?
In my mind, the single best aspect of the ladder is the way it encourages serious games (games where both players are attempting to play very well). Serious games are where we get to see which strategies really work and where potential faction balance/map balance problems are. These things are great for the Wesnoth community IMO, and should be encouraged. It also seems obvious to me that Doc (or even any group of people) will not physically have enough time to seriously review *every* map that players make. Not to mention that posting a map on the forums for review can be a daunting process for at least some people. We take our maps seriously in Wesnoth, but that doesn't make it any less hard for a new player to post a map only to be told how bad they are at making maps (or rather, how many factors go into making a good map that they have not considered).

I think it would be wonderful to allow the ladder community to test out maps (of course, only some players will be interested in this, and that's fine). That way players can get feedback on maps, make changes and then post a map that has already been playtested a bit and is likely to be much smoother. That seems like a more efficient use of everyone's time (i.e. it gives Doc and F8 a break from telling new mapmakers the same basic points over and over before they can move on to better maps). However, I understand that just playing ladder games on untested maps could be a very frustrating experience, so I will address that below.

This dovetails nicely into two other points kicking around in this thread.
1) I am interested in Fosprey's "half value" games. I also feel I have lost plenty of points in ladder games in situations where I really didn't have time for a full game, or wanted to play but was very tired. I would definitely use half value games in those cases, even if they were called something diminutive, like yellow-belly matches. :wink: I would also be more likely to branch out and play maps that I am not as confident on (we all have our favorites, right?). And of course, new maps! I am very much in favor of increasing the number of high-quality maps in Wesnoth, and I think that encouraging playtesting will allow some interesting new maps to bubble up to the top, so to speak, and get some attention. Of course, I agree with Fosprey: it's not that big a deal really. I will still play ladder games even when I am not "at my peak" and probably lose a lot of them, which is fine. But the point is to find fun games, right? I think that this idea might contribute to that goal.

2) What other settings should the ladder be testing? My answer to this is exactly what eyerouge has already said (paraphrased): you can't compare apples to oranges. In other words, game settings should be the official Wesnoth settings. Messing around with other XP settings, gold settings or eras is a terrible idea IMO, as it will fracture and distort the rating system in place. That said, I personally think that it's fine to "mess around" within the framework of default. For example, there is really nothing stopping a player from playing HODOR (only recruiting outlaws) in a ladder game; it's perfectly legit IMO. However, if players decide that they don't want to use fog, that is a problem: if a bunch of players play games without fog, then theoretically a player might rise quite high on the ladder by beating everyone else who plays no-fog games. But that ranking would be meaningless, since it shows nothing about the ability of that player relative to the rest of the (real) ladder. Similar logic holds for other eras, starting games with 800 gold, etc.

***
Also: I had some interest in becoming a ladder admin a while back, when the openings were posted. I talked to eyerouge about it a bit, but at the time I had several reasons that I felt held me back from throwing my hat in the ring. I have now changed my mind. I have strong opinions about the ladder, mostly in the sense of preserving the good things that I feel the ladder is contributing to the community while not introducing anything that I feel would harm the community. eyerouge has done a good job of this and I want to continue the work. The one hesitation that is still an issue for me is that I will be quite busy at times (I have kids, etc.), and I don't want to be in a position where everything falls apart if I can't reply to email for a few days. But since there are a bunch of admins, that shouldn't be a problem.

That said, there are a bunch of admins already, and I realize that I am trying to jump in after the fact. Still, if there is room for me, I would like to step up.
Thanks,
'mute
"I just started playing this game a few days ago, and I already see some balance issues."