Ladder Site Online...

Discussion of all aspects of multiplayer development: unit balancing, map development, server development, and so forth.

Moderator: Forum Moderators

cookie
Posts: 171
Joined: December 21st, 2010, 6:57 am

Re: Ladder Site Online...

Post by cookie »

OKAY! For all those who know me well... you know what I'm like with accounts. xD
Rigor / ANY Ladder ADMIN, I would like to request the list of nicknames registered to my email under the username Cookie.
Please post the complete list so I can remember and decide which one to keep.
I think there should be fewer than 10 :S
Bye says the cookie.
anoel
Posts: 23
Joined: October 6th, 2010, 12:05 am

Re: Ladder Site Online...

Post by anoel »

@cookie: have a look at this, maybe it can help you :wink:

http://ladder.subversiva.org/ip.php
Alexandria, Cookie, CookieBite, Flare, Jamie-Ann, naye, PrincessPeach, Rei-Akira, Rita, Rossan, shootingstar, Thanh
@milwac: referring to your suggested improvement c)

i think you are talking about something like variance. it is quite easy to calculate if you have the needed data. let us assume we know all past ratings a player ever had. then we can easily calculate the mean (= average rating), and the standard deviation is this: :eng:

the square root of the sum of all squared distances (= (each value - mean)²) divided by the number of values. as a formula: square root ( sum ((x - mean)²) / N ). if the data are normally distributed (this can be tested, for example, with the Kolmogorov–Smirnov test), the standard deviation is really easy to interpret, as the following example shows:

player A has a current rating of 1721 and his past ratings were 1704 1711 1721 1726 1711 1704 1701. then his average rating is about 1711 and the SD (standard deviation) is about 9. because the values are normally distributed, you can make these statements:

about 68.3% of his ratings are between 1702 and 1720 (+/- 1 SD)
about 95.4% of his ratings are between 1693 and 1729 (+/- 2 SD)
about 99.7% of his ratings are between 1684 and 1738 (+/- 3 SD)

you can also say that this player has a far more constant rating than a player with an SD of 138.79 (just an example).
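A minimal Python sketch of the calculation described above, using player A's seven past ratings from the post. The function names are made up for illustration. It divides by N, as in the formula given; dividing by N - 1 (the sample SD) would give a slightly larger value. Computed directly, the mean comes out at roughly 1711.1 and the SD at roughly 8.6 (about 9.3 with N - 1), which is in line with the rounded figures used in the example.

```python
import math

def mean_and_sd(ratings):
    """Return (mean, SD) using sqrt(sum((x - mean)^2) / N)."""
    n = len(ratings)
    mean = sum(ratings) / n
    variance = sum((x - mean) ** 2 for x in ratings) / n
    return mean, math.sqrt(variance)

def sd_bands(mean, sd, widths=(1, 2, 3)):
    """Return {k: (low, high)} for the interval mean +/- k * SD."""
    return {k: (mean - k * sd, mean + k * sd) for k in widths}

# Player A's past ratings, as listed in the post.
past_ratings = [1704, 1711, 1721, 1726, 1711, 1704, 1701]

mean, sd = mean_and_sd(past_ratings)
print(f"mean = {mean:.2f}, SD = {sd:.2f}")
for k, (low, high) in sd_bands(mean, sd).items():
    print(f"+/- {k} SD: {low:.0f} to {high:.0f}")
```

The 68.3 / 95.4 / 99.7 percentages only apply if the ratings really are close to normally distributed; the code computes the bands but does not check that assumption.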

so much for the theory. ^_^ i have no idea whether the past ratings are saved somewhere, or how difficult it would be to put such a calculation into the existing code. :whistle:
i hope it was understandable. :hmm: i am not a native english speaker.
Dauntless
TGT Champion
Posts: 196
Joined: October 14th, 2008, 10:16 pm

Re: Ladder Site Online...

Post by Dauntless »

Anoel: I'm not an expert on statistics, but every player starts at 1500. Even if he played at a steady 2000 for some time, wouldn't his initial ratings of 1500, 1600 and so on throw the SD off?

Cheers

DL

Edit: After a second look, it also seems irrelevant what ratings a player has had and how much they shifted. What speaks to the current level of a player isn't his rating (even if I played like a 1500 now, my rating would still be 2200+ for many games) but the ratings of the opponents I beat or lose to. Even a highly volatile player will start to stagnate around some Elo value with a fairly low SD, even if the actual game performance varies a lot...
anoel
Posts: 23
Joined: October 6th, 2010, 12:05 am

Re: Ladder Site Online...

Post by anoel »

Dauntless: if the rating increases or decreases a lot, the SD will increase. if the rating is close to constant, the SD will decrease. there are no problems with beginners. if their rating increases a lot until they reach, for example, 2000 and then stays around 2000 for a while, the SD will increase in the beginning and later drop again. the SD tells you how much the values scatter around the mean. in the beginning the mean will be in the middle between 1500 and 2000, but later it gets closer and closer to 2000. in the end the mean is almost 2000 and most values are around 2000, so the SD will be quite low.

for more clarification let's think about another example. player B rises from 1500 to 2000, stays around 2000 for a short while, then drops to 1700 over the next games. in the end he climbs again and manages to reach 2300. the mean will be quite similar to example one (around 2000), but he will have a higher SD.

for further clarification: the more games a player has, the less a single new game influences the mean and the SD.
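A rough Python sketch of the beginner case, with an entirely hypothetical rating history: a climb from 1500 to 2000 in 50-point steps, followed by 200 games at exactly 2000. It tracks the SD over all ratings so far after each game, so the rise during the climb and the later drop can be seen. The history and the function name are assumptions for illustration only.

```python
import statistics

def running_sd(ratings):
    """SD (dividing by N) of ratings[0..i], for every i >= 1."""
    return [statistics.pstdev(ratings[: i + 1]) for i in range(1, len(ratings))]

# Hypothetical history: steady climb, then a long plateau at 2000.
climb = list(range(1500, 2001, 50))   # 1500, 1550, ..., 2000
history = climb + [2000] * 200

sds = running_sd(history)
print(f"SD at the end of the climb: {sds[len(climb) - 2]:.0f}")
print(f"highest SD seen:            {max(sds):.0f}")
print(f"SD after the plateau:       {sds[-1]:.0f}")
```

In this made-up history the SD does come down once the rating settles, but only gradually, because the early ratings stay in the data set. How fast it drops depends on how many games are played at the new level.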

all in all i don't think the SD is a useful number for a normal ladder gamer, because it won't be self-explanatory for most players. something like this is more intuitive:
68.3% of PlayerA's ratings are between 1702 and 1720
because everyone can understand it. whether it is interesting enough to add, and whether it is possible at all... i don't know. :mrgreen:
another example, from usual wesnoth fights (less abstract):
a wose and an elvish fighter are attacking orcs on 50% defence terrain like crazy. in 16 turns we would expect something like this:
[table with damage of wose and fighter]
the tree has a higher average damage and a higher standard deviation (there are more values 'far' from the mean than in the fighter's case).
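Since the damage table did not survive, here is a small Python sketch of the same comparison. The unit stats are assumptions, not taken from the post: a wose attacking 13x2 and an elvish fighter attacking 5x4 in melee, with no resistances or time-of-day modifiers. Each strike is treated as an independent hit with 50% chance, so the number of hits per attack follows a binomial distribution.

```python
import math

def attack_stats(damage, strikes, hit_chance):
    """Mean and SD of total damage for one attack with independent strikes."""
    mean = damage * strikes * hit_chance
    sd = damage * math.sqrt(strikes * hit_chance * (1 - hit_chance))
    return mean, sd

HIT_CHANCE = 0.5  # orcs on 50% defence terrain, as in the post

# Assumed stats: (name, damage per strike, number of strikes).
units = [("wose", 13, 2), ("elvish fighter", 5, 4)]

for name, damage, strikes in units:
    mean, sd = attack_stats(damage, strikes, HIT_CHANCE)
    print(f"{name}: mean {mean:.1f}, SD {sd:.1f} per attack")
```

Under these assumptions the wose has both the higher mean and the higher SD per attack, which matches the statement about the tree above.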
Dauntless
TGT Champion
Posts: 196
Joined: October 14th, 2008, 10:16 pm

Re: Ladder Site Online...

Post by Dauntless »

anoel, my point was that the rating fluctuates much more slowly than the actual playing level. If I play each odd game like a 2300 and each even game like a 1700, my rating will be more or less the same as if I played at a steady 2000. So the SD will be about the same too...
anoel
Posts: 23
Joined: October 6th, 2010, 12:05 am

Re: Ladder Site Online...

Post by anoel »

well, you cannot play more steadily than winning all (or most) odd games and losing all (or most) even games. so i cannot see why your two cases differ and why they should have different statistics. :hmm:

if your player A has these results vs a 2000-player: 1 0 1 0 1 0 1 0 (1 = win, 0 = loss), where is the difference to your steady player B? (my point is: your player A is playing steadily)
milwac
Posts: 29
Joined: April 9th, 2010, 11:40 am

Re: Ladder Site Online...

Post by milwac »

@anoel : Yes, the RD is something like variance. But the calculation is not that simple :) Actually you can say that the RD becomes lower and lower as one becomes increasingly certain of the mean. And of course more recent games contribute more to this (RD is calculated incrementally as opposed to taking into account all past records, which will be difficult to do as well), also for stagnant periods the RD is supposed to go up. But I am still not certain by how much. I hope to come up with something soon, until then there's nothing much to say.

Cheers!

PS: Happy to see leocrotta's account unblocked.
anoel
Posts: 23
Joined: October 6th, 2010, 12:05 am

Re: Ladder Site Online...

Post by anoel »

milwac, what do you mean by RD?
RiceMuncher
Posts: 1
Joined: October 18th, 2009, 5:52 am

Re: Ladder Site Online...

Post by RiceMuncher »

Funny how the people who play/care about the ladder are the same people from 3/4 years ago haha.
Just my two cents. I will try not to repeat what's already been said.

I agree that there should be one account only, because multiple accounts affect the ELO/Ladder system in ways the original "founding father" (eyerouge?) did not intend. For reasons already discussed, it CAN have a negative impact on the ladder. The rule simply helps eliminate the possible abuse that might stem from multiple accounts. It also helps the admins do their jobs by keeping things simple.
Ultimately, as already discussed, multiple accounts will lead to (major or minor) inaccuracies in ELO ratings, as we have a very small community.

As for blocking Leo's account, I do not agree with it, even though there is supposedly a joint solution via players/admins coming soon. However, the admins do have the right to exact whatever punishment they see fit, as there are no clear guidelines for punishment and all registered players have agreed to the ladder rules.

Anyway, I actually have a simpler proposal.
Since the ladder community is relatively small,
I suggest we
RESET THE LADDER,
MODIFY/HIGHLIGHT RULES (such as bolding the single-account rule)
GET MORE ADMINS

The ladder has not been reset since its inception, regardless of multiplayer balance changes, which just seems stupid.
It's really been a long, long time. A reset would give newcomers a chance to be at the top (you have to win like 100 games in a row to get into the top 3) and maybe attract new players. It will also hopefully kill off every single multiple-account user, as everyone's stats will be reset anyway. =p
A "season" of maybe one year, with the top 10/20 going into some sort of forum honor thread, would do well to keep the ladder active and deter multiple accounts, as multiple accounts would have less chance of achieving anything with "seasons" in play. (Cheating would also be more obvious because the games could be scrutinised more readily.)


Change rules to make it more clear that multiple accounts will not be tolerated.
More admins to help keep community active/enforce rules.

Happy RNG !
Crendgrim
Moderator Emeritus
Posts: 1328
Joined: October 15th, 2010, 10:39 am
Location: Germany

Re: Ladder Site Online...

Post by Crendgrim »

Creating "seasons" and/or resetting the ladder (regularly even?!) would break the ELO system completely. Players would have no idea who is better if everyone had to climb up again and again, and I think this would scare more good players away than new players would join.
Besides, even if the ranking system were reset, it would in no way affect the actual strength of players; so if I don't want to enter the ladder now, why should I want to do so after the reset? Only to have a better position at first, and then fall, fall, fall, fall down? Then lose against a top player who wasn't around immediately after the reset and now gets many, many points from people with higher ELO who are in fact much worse?

And finally, I doubt this would really prevent people from using multiple aliases. Actually, I would expect more people to do so, because, hey!, it doesn't matter, right? The ranking will be reset in a few months no matter what I do. So, who cares?

Sorry, but I don't think your idea is the best choice.. anyone feel free to convince me, though. :P


Crend
UMC Story Images — Story images for your campaign!
Doc Paterson
Drake Cartographer
Posts: 1973
Joined: February 21st, 2005, 9:37 pm
Location: Kazakh
Contact:

Re: Ladder Site Online...

Post by Doc Paterson »

Crendgrim wrote: Sorry, but I don't think your idea is the best choice.. anyone feel free to convince me, though. :P


Crend
You're absolutely right. I don't think a single sentence of that post was logically sound, and considering all of the rehashing of old points, the writer clearly did not actually read the preceding arguments. This "what would the founding father want" concept is also particularly irrelevant to the current ladder, and the phrasing kind of made me laugh (seeing as eyerouge is female). 8)
I will not tell you my corner / where threads don't get locked because of mostly no reason /
because I don't want your hostile disease / to spread all over the world.
I prefer that corner to remain hidden /
without your noses.
-Nosebane, Sorcerer Supreme
Madlok
Posts: 80
Joined: April 24th, 2008, 1:26 pm
Location: Poland
Contact:

Re: Ladder Site Online...

Post by Madlok »

Rigor, and what about Primus_pilus?
You don't want people to say that you're removing only the clones that are ranked higher than you :twisted:
Quick bats are quick.
eyerouge
Posts: 380
Joined: June 29th, 2007, 4:37 am
Location: wtactics.org
Contact:

Re: Ladder Site Online...

Post by eyerouge »

milwac wrote: I am not that great of a PHP/MySQL coder but can certainly help maintain the code and implement bits and pieces of new stuff every now and then. (The broken database search, for example.)
As announced in the first post of this thread and elsewhere, the ladder is in great need of an active developer. It hardly matters that you are not a PHP expert: I used the project myself as a learning ground and knew zero PHP before it. You can do wonders even with limited knowledge as long as you have the will and patience.

The project resides at https://sourceforge.net/projects/gamingladder and you can be added by me or Tessa maybe. Contact me via mail eyerouge thething eyerouge thedot com if you fail to get in touch with anyone and really mean what you write. When I began coding on it I imagined it should be turned into something that is easy to customize and possible to set it up for almost any game, not just Wesnoth, and I think it already is half way there. It just needs some love and addition of great new features.

About your Glicko suggestions: Mr russ or chains actually wrote a PHP class for the ladder that lets the ladder admin choose Glicko instead of ELO in the ladder config, but I don't know if it was ever released or died with his hd :P ...in any case it is "easily" doable and would make the ladder script even more attractive to other gaming communities.

I hope you take this opportunity and start working on the code, not just to control the specifics of the Ladder of Wesnoth, but because you're interested and want to contribute to the open source community. Sadly it seems there has been hardly any work done at all on it since I disappeared.

With regards
/e
Rigor
Posts: 941
Joined: September 27th, 2007, 1:40 am

Re: Ladder Site Online...

Post by Rigor »

so far i have gathered plenty of responses from the ladder mainpage, and i have no worries that we will soon have highly motivated active developers. https://github.com/moserware/PHPSkills/ ... ulator.php should help to implement trueskill in php (for those motivated candidates who want to take a look at it already). or you can scroll down to the bottom of the page, see what the trueskill guy wrote us, and maybe ask him specific stuff via his website (all the information is in this link) http://forums.wesnoth.org/viewtopic.php?t=32037 - i PMed everybody my own idea for the ladder as well, maybe an easier task than that.
milwac
Posts: 29
Joined: April 9th, 2010, 11:40 am

Re: Ladder Site Online...

Post by milwac »

Hi all,

Sorry I am a bit late, I was busy with exams and had time just for a couple of games every now and then during breaks :) I'll try to address two issues in this post.

Which ranking system is better?

Well, all this while I was trying to see whether Glicko fares better than Elo. I read about TrueSkill but got lost in the implementation. It seems as if there is not much difference between Glicko and TrueSkill, except that the latter takes team games into account. But for our simple 1v1 ladder, Glicko seemed like a good bet.

Quetzalcoatl helped me get hold of all reported games since the beginning of time, which I processed in the order in which the games were reported. Unfortunately quite a lot of time passed since he gave me this list (28th July 2011 to be precise) and hence the most recent game that I processed was bkerin vs analtract (28/07/2011). Apologies for this.

Digging into the existing code, I felt that the current system doesn't work the way the code suggests :) I could be wrong, but it could also be that the rankings were calculated with different parameters in the past, which were changed over time without past games being migrated to reflect the change. Come what may, Elo proved to be very bad even with slight changes in the parameters. My Elo ranking list differs a lot from the list you see on the ladder website. That's because I couldn't reproduce the exact calculations and parameters used. This has nothing to do with the rankings being right or wrong. In fact, it would be better if no one paid too much attention to the Elo rankings I generated.

Here I discuss the parameters I used and the results obtained. This is related to the mathematical formulations of the rating systems, so if you're not familiar with these you can jump directly to the results.

Elo Parameters :

Starting rating = 1500
Provisional period = 10 games
K factor for all players between ratings 0 and 1799 = 32
K factor for all players between ratings 1800 and 2099 = 24
K factor for all players above rating 2100 = 16
K factor for all players who played against a provisional player and lost = <Original K factor> / 2

I didn't take into account a higher K factor for provisional players because I didn't see it anywhere in the code. But evidently the ladder code is using one; I couldn't figure out where. Apologies again.
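To make the Elo parameters above concrete, here is a minimal Python sketch of a single-game update with those K tiers. This is the textbook Elo formula, not the ladder's actual PHP code, and the function names are made up. Two details are assumptions: a rating of exactly 2100 is treated as the top tier (the list leaves that boundary open), and non-integer ratings between the tiers fall into the lower one.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score on the usual 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))

def k_factor(rating):
    """K tiers from the parameter list; boundary handling is an assumption."""
    if rating < 1800:
        return 32
    if rating < 2100:
        return 24
    return 16

def elo_update(rating, opponent_rating, score, opponent_provisional=False):
    """New rating after one game; score is 1 for a win, 0 for a loss."""
    k = k_factor(rating)
    if opponent_provisional and score == 0:
        k = k / 2  # halved K when losing to a provisional player
    return rating + k * (score - expected_score(rating, opponent_rating))

# Example: a 1500 player beats another 1500 player.
print(elo_update(1500, 1500, 1))
```

A provisional period of 10 games would additionally need a per-player game counter, which is left out here.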

Glicko parameters :
Starting rating = 1500
Starting RD = 150

The reason for the low starting RD is that it is possible to have negative ratings in Glicko, and this shows up with a high RD. (Some current 1800 players ended up with negative ratings, so I had to lower the RD.) To keep things simple, and also because I didn't have enough information, I didn't consider the 'wesbreaks' when a ladder player is inactive for a long time. (In effect, the RD should increase in such cases.)
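For readers wondering what RD does in practice, here is a short Python sketch of the published Glicko-1 update formulas. It is not the code used to produce the rankings in this post. The function names are made up, the inactivity constant `c` has to be chosen by whoever runs the ladder, and the 350 cap on RD is the default from the Glicko description rather than anything stated here.

```python
import math

Q = math.log(10) / 400.0

def g(rd):
    """Weight that shrinks the impact of games against uncertain opponents."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)

def expected(r, r_j, rd_j):
    """Expected score against an opponent with rating r_j and deviation rd_j."""
    return 1.0 / (1.0 + 10 ** (-g(rd_j) * (r - r_j) / 400.0))

def inflate_rd(rd, periods_inactive, c, max_rd=350.0):
    """RD grows during inactivity ('wesbreaks'); c is a tuning constant."""
    return min(math.sqrt(rd**2 + c**2 * periods_inactive), max_rd)

def glicko_update(r, rd, games):
    """games: list of (opponent_rating, opponent_rd, score) in one period."""
    if not games:
        return r, rd
    inv_d2 = 0.0
    delta = 0.0
    for r_j, rd_j, score in games:
        e = expected(r, r_j, rd_j)
        inv_d2 += Q**2 * g(rd_j) ** 2 * e * (1.0 - e)
        delta += g(rd_j) * (score - e)
    denom = 1.0 / rd**2 + inv_d2
    return r + (Q / denom) * delta, math.sqrt(1.0 / denom)

# A new player (1500, RD 150, as in the parameters above) beats a 1600 player.
print(glicko_update(1500, 150, [(1600, 80, 1)]))
```

The new RD is always smaller than the old one after a rated game, which is why a high starting RD lets early results move the rating so far.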

The rankings also do not reflect active periods, they contain all players ever ranked.

Elo Rankings
http://xntrick.comuf.com/wesnoth/elo_ranks.html

Glicko Rankings

http://xntrick.comuf.com/wesnoth/glicko_ranks.html

Ok, now coming back to my original intention, which was to see how well Glicko performs compared to Elo. For this I looked at a couple of particular players and checked how long it took them to come really close to their current rating for the first time. (All these results are based on the ranking lists above and have nothing to do with the rankings on the ladder website.) This measure is not an indication of a stable rating, but it is useful nonetheless. This is what I discovered.

leocrotta
Needed 250 games to reach within 5% of his current rating and 52 games to reach within 10% in the Elo system
Needed 49 games to reach within 5% of his current rating and 24 games to reach within 10% in the Glicko system

Dauntless
Needed 53 games to reach within 5% of his current rating and 44 games to reach within 10% in the Elo system
Needed 105 games to reach within 5% of his current rating and 37 games to reach within 10% in the Glicko system

[This is actually not that bad, as it could be that this player's rating fluctuated quite a bit in his early periods, since he has played quite a few games. Also, the measure does not indicate a stable rating; it only records the first occurrence.]

Demogorgon
Needed 50 games to reach within 5% of his current rating and 29 games to reach within 10% in the Elo system
Needed 39 games to reach within 5% of his current rating and 4 games to reach within 10% in the Glicko system

Goldilocks
Needed 60 games to reach within 5% of his current rating and 31 games to reach within 10% in the Elo system
Needed 56 games to reach within 5% of his current rating and 19 games to reach within 10% in the Glicko system

From these observations I could more or less conclude that Glicko converges faster than Elo to a player's true rating. Also, for players who are already experienced but are new to the ladder (and/or aliases), Glicko converges much faster.
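The "games needed to reach within 5% / 10% of the current rating" measure can be sketched in a few lines of Python. The function name and the sample history are hypothetical; the assumption is that `history[i]` holds the player's rating after game i + 1 and that the last entry is the current rating.

```python
def games_to_reach(history, tolerance):
    """1-based number of the first game after which the rating is within
    `tolerance` (e.g. 0.05 for 5%) of the final rating; None if empty."""
    if not history:
        return None
    current = history[-1]
    for game_number, rating in enumerate(history, start=1):
        if abs(rating - current) <= tolerance * abs(current):
            return game_number
    return None

# Hypothetical rating history, one entry per reported game.
history = [1516, 1530, 1580, 1650, 1720, 1810, 1790, 1850, 1910, 1950, 2000]

print(games_to_reach(history, 0.05))  # first game within 5% of 2000
print(games_to_reach(history, 0.10))  # first game within 10% of 2000
```

Because it records only the first time the rating enters the band, a player whose rating later leaves the band again still gets the early game number.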

I could also calculate how ratings were affected by the introduction of aliases, but I didn't do this because of two reasons : first, not all aliases are revealed, only a handful, and second, I am not too interested in this issue anymore :)

Do we need a new ladder?

My initial impression of Quetzalcoatl's and Justendturn's new ladder project was great (I don't have the link as it seems to have moved; this is the thread btw http://forums.wesnoth.org/viewtopic.php ... =trueskill). I always wanted to see more features, more images and a livelier interface. On top of that, the old ladder code in PHP is difficult to maintain and is expected to become more complicated in the future as more and more active developers wish to work on it. I have experience in PHP and JSP (it's just been a long time, which is why I said I am not much of a web developer), but I am willing to take up Ruby on Rails (as Quetzalcoatl has), which is a great framework for such a large project. Once a prototype is ready we could open it for a test phase, and if the community feels positive, all the dirty work of migrating past games into the new system could be done. In addition, new features like individual player stats could be introduced with very little effort. This is just my 2 cent take on this.

Cheers,
-milwac