Ideas for handling disconnects/lag in Multiplayer

linuxminter
Posts: 11
Joined: October 11th, 2013, 8:31 pm

Ideas for handling disconnects/lag in Multiplayer

Post by linuxminter »

I don't have any suggestions for Wesnoth solo. In my opinion, things are great. The game is balanced, and I like it.

For Wesnoth multiplayer, I do have suggestions. I play Wesnoth multiplayer at several locations. When I have wide pipes, I can play without any problems. When I have a "budget" Internet connection, the case is reversed, with lag time and disconnection being common. This probably deters some players from playing multiplayer. I'd wish for multiplayer to be more tolerant of lower bandwidth Internet connections. Specifically, I wonder if it would be possible to transmit less data, only the bare minimum required, to indicate location and existence/health/xp of units and hexes only, with all graphics being stored locally on the user's hard drive or downloaded at the point that they actually join the game, even before they select their faction and leader.

Since a hex can only be one of, say, 30 types, each hex needs 5 bits (2^5 = 32 covers 30 types), so a 100x100 map can be transmitted using 10,000 x 5 / 8 = 6,250 bytes, assuming no compression. However, if there is no change for a certain range of hexes, a "NC" bit can be set, indicating that there is no change from the previous transmission until the "NC" bit is reset. This will allow compression so that considerably less than 6,250 bytes can transmit a 100x100 map each turn. It may be possible to provide updates for each player using around 500 - 600 bytes per player-turn. This would reduce server load and speed up gameplay for all users, not just ones with lower quality connections.
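
A rough sketch for checking this kind of size estimate, assuming 30 terrain types and fixed-width packing. With 30 types each hex needs 5 bits, so a 100x100 map packs into 6,250 bytes before any compression. The function names (`bits_per_hex`, `pack_terrain`, `unpack_terrain`) are made up for illustration and are not part of Wesnoth.

```python
def bits_per_hex(num_terrain_types: int) -> int:
    """Smallest whole number of bits that can distinguish this many types."""
    return max(1, (num_terrain_types - 1).bit_length())


def pack_terrain(codes, bits):
    """Pack small integer terrain codes into bytes, `bits` bits per hex."""
    acc = 0
    for code in codes:
        if not 0 <= code < (1 << bits):
            raise ValueError(f"terrain code {code} does not fit in {bits} bits")
        acc = (acc << bits) | code
    total_bits = bits * len(codes)
    pad = (-total_bits) % 8          # pad the tail up to a whole byte
    acc <<= pad
    return acc.to_bytes((total_bits + pad) // 8, "big")


def unpack_terrain(data, bits, count):
    """Inverse of pack_terrain: recover `count` terrain codes."""
    acc = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    mask = (1 << bits) - 1
    return [(acc >> (total_bits - bits * (i + 1))) & mask for i in range(count)]


if __name__ == "__main__":
    codes = [i % 30 for i in range(100 * 100)]   # hypothetical 100x100 map
    bits = bits_per_hex(30)
    packed = pack_terrain(codes, bits)
    print(bits, "bits per hex,", len(packed), "bytes for the whole map")
```

A "no change" scheme would sit on top of this: only hexes that differ from the previous transmission would need to be sent.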

I had a player disconnect from a Multiplayer game once. When they reconnected and rejoined, they had to experience all 50 - 60 turns that had elapsed in the game. I did not understand what they were telling me when they kept typing "turn 13....turn 20...turn 21..." etc at regular intervals. But eventually I realized that the game was replaying, blow by blow, all that had occurred. This seems like a deterrent to most users. Perhaps some are very interested to learn what exactly transpired, and for these, there could be an option in "User Preferences." But for others, it would be a nice default, if disconnection for some reason occurs, and a player rejoins, then instead of updating all that happened, only the current realtime scenario can be transmitted, using 500 - 600 bytes and delivering the realtime game in less than 2 seconds to the end user. That way if a player disconnects, then reconnects later, instead of having to experience all that happened, he can just plunge right in.
pauxlo
Posts: 1047
Joined: September 19th, 2006, 8:54 pm

Re: Ideas for handling disconnects/lag in Multiplayer

Post by pauxlo »

No graphics are transmitted during a MP game.

What is transmitted is the WML of the scenario (from the host to the server to the other players) at the start of the game, and all game actions done by players or the AI. Also, for each fight action the server provides some random numbers which help to know whether your units (and the enemy ones) hit or not.

For the map, only a terrain code for each hex is transmitted. (This does not have the most efficient representation like the one you describe, but it still is not that much. And only transmitted once at start of a game.)

The server is quite dumb: It doesn't keep a state of the game, but the scenario and a list of users actions ... this is why you see a replay when you rejoin – this is when your client recalculates the new game state. Of course this could be optimized: your client could do all this again (and calculate the new state) without showing it.
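
To make the last point concrete, here is a minimal sketch of a client rebuilding the game state from a stored action list without showing anything. The action format, `GameState`, `apply_action` and the `animate` flag are all hypothetical; the real client works on WML commands and a far richer state.

```python
from dataclasses import dataclass, field


@dataclass
class GameState:
    turn: int = 1
    units: dict = field(default_factory=dict)   # (x, y) -> hit points


def apply_action(state, action, animate=False):
    """Apply one logged action; optionally 'show' it (here: just print)."""
    kind = action["kind"]
    if kind == "recruit":
        state.units[action["at"]] = action["hp"]
    elif kind == "move":
        state.units[action["to"]] = state.units.pop(action["from"])
    elif kind == "attack":
        # The damage is stored in the log, so replaying needs no new
        # random numbers and always reaches the same state.
        target = action["target"]
        state.units[target] -= action["damage"]
        if state.units[target] <= 0:
            del state.units[target]
    elif kind == "end_turn":
        state.turn += 1
    else:
        raise ValueError(f"unknown action kind: {kind}")
    if animate:
        print(f"turn {state.turn}: {action}")


def replay(actions, animate=False):
    """Recalculate the current state from the full action list."""
    state = GameState()
    for action in actions:
        apply_action(state, action, animate)
    return state


if __name__ == "__main__":
    log = [
        {"kind": "recruit", "at": (1, 3), "hp": 30},
        {"kind": "recruit", "at": (4, 5), "hp": 10},
        {"kind": "move", "from": (1, 3), "to": (3, 5)},
        {"kind": "attack", "target": (4, 5), "damage": 12},
        {"kind": "end_turn"},
    ]
    print(replay(log, animate=False))   # same state, nothing shown on the way
```

The work done is identical with `animate=True` or `False`; only the display cost differs.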
Pentarctagon
Project Manager
Posts: 5564
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Re: Ideas for handling disconnects/lag in Multiplayer

Post by Pentarctagon »

pauxlo wrote:The server is quite dumb: It doesn't keep a state of the game, but the scenario and a list of users actions ... this is why you see a replay when you rejoin – this is when your client recalculates the new game state. Of course this could be optimized: your client could do all this again (and calculate the new state) without showing it.
The reconnecting player could also just get the current gamestate from the host and not have to recalculate anything.
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
linuxminter
Posts: 11
Joined: October 11th, 2013, 8:31 pm

Re: Ideas for handling disconnects/lag in Multiplayer

Post by linuxminter »

pauxlo wrote:No graphics are transmitted during a MP game.

What is transmitted is the WML of the scenario (from the host to the server to the other players) at the start of the game, and all game actions done by players or the AI. Also, for each fight action the server provides some random numbers which help to know whether your units (and the enemy ones) hit or not.

For the map, only a terrain code for each hex is transmitted. (This does not have the most efficient representation like the one you describe, but it still is not that much. And only transmitted once at start of a game.)

The server is quite dumb: It doesn't keep a state of the game, but the scenario and a list of users actions ... this is why you see a replay when you rejoin – this is when your client recalculates the new game state. Of course this could be optimized: your client could do all this again (and calculate the new state) without showing it.
I am not sure why one of my locations has experienced lags and disconnects, because I have broadband (actually about 330K/second) DSL and play chess, torrent and use the web otherwise, although I have had problems playing Dungeon Crawl Stone Soup online. There may be something about online gaming that is more demanding in one aspect than other scenarios. Maybe the problem is not the amount of data transmitted per turn, but rather the timing of the transmissions and acknowledgments. That's something that can only be unravelled in testing. My suggestion is to make the game more tolerant of Internet connections that may not be ideal in one aspect or another. Maybe the problem is something as simple as the transmission being rejected by the client, due to noise on the line or something else, and the server needing to retransmit. Maybe the server could break up the transmission into smaller packages, so that if one package fails, then the time lost in retransmitting is less.

The second idea, avoiding the lengthy replay when rejoining a game from which the player has disconnected, could be implemented by a "User Preferences" option that will have the following influence. Upon resuming an already existing game (game turn > 1), Wesnoth will not display the graphics of all the previous turns, but instead a window that says "Please wait...recalculating game status.". Whether that saves enough time for the user depends on how fast these calculations can be made. I suspect most of the delay is caused by the scrolling and animation of graphics and that the calculations could probably be made pretty quickly, because they are just moves and attacks rather than AI, which can become slow in some scenarios (i.e. the late game of some SXC 0.2.1.2 maps).
iceiceice
Posts: 1056
Joined: August 23rd, 2013, 2:10 am

Re: Ideas for handling disconnects/lag in Multiplayer

Post by iceiceice »

Pentarctagon wrote:
pauxlo wrote:The server is quite dumb: It doesn't keep a state of the game, but the scenario and a list of users actions ... this is why you see a replay when you rejoin – this is when your client recalculates the new game state. Of course this could be optimized: your client could do all this again (and calculate the new state) without showing it.
The reconnecting player could also just get the current gamestate from the host and not have to recalculate anything.
Technical question: If the server is able to generate replay files, it must have the current snapshot of the game? Couldn't it just send this instead of the replay info it currently sends?

Or is the snapshot compiled from the moves by the server when the game ends?
linuxminter
Posts: 11
Joined: October 11th, 2013, 8:31 pm

Re: Ideas for handling disconnects/lag in Multiplayer

Post by linuxminter »

iceiceice wrote:Technical question: If the server is able to generate replay files, it must have the current snapshot of the game? Couldn't it just send this instead of the replay info it currently sends?

Or is the snapshot compiled from the moves by the server when the game ends?
No, according to the kind developer above, the server does not store any snapshot of the game. It is a "dumb" server ("dumb" also means fast, I think, and speed may be the intention). The server only stores user actions and the results of fighting, which it must retransmit entirely if someone disconnects and reconnects. The snapshot is recompiled from the moves by the client, that is, the end user's Wesnoth on the local machine.
iceiceice
Posts: 1056
Joined: August 23rd, 2013, 2:10 am

Re: Ideas for handling disconnects/lag in Multiplayer

Post by iceiceice »

I know that the server is designed simply to echo transmitted WML between the different players, and not think about it itself.

However, the server also hosts replays at replays.wesnoth.org of every game that is played on any official mp server. AFAIK these replays are from "observer" or "server"'s point of view, i.e. they also see private obs chat. So I believe the server generates these itself. My question is when does it do this, only when the game is over, or does it amortize the cost of building the replay over the course of the game? If the latter it could just send the current state to any connecting player.

Edit: I guess it is also possible that these replays are generated by the host, and that the host does in fact receive the private obs chat and chooses not to display it. But if so then this would mean that if the host segfaults the replay won't be on replays.w.o, and in fact I think the replay always ends up there.
gfgtdf
Developer
Posts: 1432
Joined: February 10th, 2013, 2:25 pm

Re: Ideas for handling disconnects/lag in Multiplayer

Post by gfgtdf »

The replays hosted at replays.wesnoth.org normally don't contain a snapshot. They only contain a [replay_start] and a [replay].
The clients communicate during the game by sending replays for their current turn; that way the server can easily store these and generate a replay.
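
A toy model of that arrangement, assuming the server does nothing more than append whatever commands each client sends and forward them to everyone else. `RelayServer` and its methods are invented for this sketch, and plain Python dicts stand in for WML; only the [replay_start] and [replay] names come from the post above.

```python
class RelayServer:
    """Stores the starting scenario plus every command it has relayed."""

    def __init__(self, replay_start):
        self.replay_start = replay_start   # scenario as sent by the game's creator
        self.replay = []                   # every command received so far, in order
        self.outboxes = {}                 # client name -> commands waiting for it

    def connect(self, name):
        """A (re)joining client gets the start plus the whole history so far."""
        self.outboxes[name] = []
        return {"replay_start": self.replay_start, "replay": list(self.replay)}

    def receive(self, sender, commands):
        """Store a client's commands and forward them to all other clients."""
        self.replay.extend(commands)
        for name, outbox in self.outboxes.items():
            if name != sender:
                outbox.extend(commands)

    def export_replay(self):
        """What ends up in the replay file: no snapshot, just start + commands."""
        return {"replay_start": self.replay_start, "replay": list(self.replay)}


if __name__ == "__main__":
    server = RelayServer({"scenario": "hypothetical_2p_map"})
    server.connect("alice")
    server.connect("bob")
    server.receive("alice", [{"kind": "move"}, {"kind": "end_turn"}])
    print(server.outboxes["bob"])
    print(server.connect("carol"))   # late joiner receives the full history
```

Because the server never interprets the commands, it has no current game state to hand out; a late joiner can only be given the history.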
Scenario with Robots SP scenario (1.11/1.12), allows you to build your units with components, PYR No preperation turn 1.12 mp-mod that allows you to select your units immideately after the game begins.
iceiceice
Posts: 1056
Joined: August 23rd, 2013, 2:10 am

Re: Ideas for handling disconnects/lag in Multiplayer

Post by iceiceice »

Ah i see, thanks for info.
iceiceice
Posts: 1056
Joined: August 23rd, 2013, 2:10 am

Re: Ideas for handling disconnects/lag in Multiplayer

Post by iceiceice »

linuxminter wrote: The second idea, avoiding the lengthy replay when rejoining a game from which the player has disconnected, could be implemented by a "User Preferences" option that will have the following influence. Upon resuming an already existing game (game turn > 1), Wesnoth will not display the graphics of all the previous turns, but instead a window that says "Please wait...recalculating game status.". Whether that saves enough time for the user depends on how fast these calculations can be made. I suspect most of the delay is caused by the scrolling and animation of graphics and that the calculations could probably be made pretty quickly, because they are just moves and attacks rather than AI, which can become slow in some scenarios (i.e. the late game of some SXC 0.2.1.2 maps).
You know that there is the "quick replays" checkbox in the mp lobby right?

If that is not good enough, then as Pentarctagon suggested, the joining clients could ask the host rather than the server for the replay. But how would this actually work? Would there be a WML tag "client req" which expects a snapshot from the host in response, and which gets written into the replay? If the communication is via the server then this is what would have to happen, as I understand it. If lots of clients and observers join and rejoin, this might lead to seriously bloated replays. Currently I don't think there is any way for the clients to connect directly to each other, so it would take a significant amount of coding to get that to work. If it's buggy it could lead to a lot of grief, with players not being able to join games for vague network reasons -- the current system is at least pretty reliable in that respect.

An alternative that might be easier to implement is that there could be a special menu button that makes it as easy as possible to save and rehost a game, which results in the host snapshot being transmitted to everyone. I'm imagining that in an mp game, you would be able to go Menu->Auto Rehost, which would save and immediately rehost your game on the server. It would also transmit a wml note before leaving the current game to the other player clients with a pointer to the new game, and those clients would automatically try to join it. If someone DC'd then hopefully they find it quickly and it starts right up again.

Players can already do this, and rehosting this way is already recommended whenever OOS is encountered. The problem is you never really know if the other players will come back :/. So auto rehost would hopefully improve the chances.
Pentarctagon
Project Manager
Posts: 5564
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Re: Ideas for handling disconnects/lag in Multiplayer

Post by Pentarctagon »

iceiceice wrote:
linuxminter wrote: The second idea, avoiding the lengthy replay when rejoining a game from which the player has disconnected, could be implemented by a "User Preferences" option that will have the following influence. Upon resuming an already existing game (game turn > 1), Wesnoth will not display the graphics of all the previous turns, but instead a window that says "Please wait...recalculating game status.". Whether that saves enough time for the user depends on how fast these calculations can be made. I suspect most of the delay is caused by the scrolling and animation of graphics and that the calculations could probably be made pretty quickly, because they are just moves and attacks rather than AI, which can become slow in some scenarios (i.e. the late game of some SXC 0.2.1.2 maps).
If that is not good enough, then as Pentarctagon suggested, the joining clients could ask the host rather than the server for the replay. But how would this actually work? Would there be a WML tag "client req" which expects a snapshot from the host in response, and which gets written into the replay? If the communication is via the server then this is what would have to happen, as I understand it. If lots of clients and observers join and rejoin, this might lead to seriously bloated replays. Currently I don't think there is any way for the clients to connect directly to each other, so it would take a significant amount of coding to get that to work. If it's buggy it could lead to a lot of grief, with players not being able to join games for vague network reasons -- the current system is at least pretty reliable in that respect.
By "host" I simply meant "the person who put the game in the MP lobby". Whether the information gets sent by the host's computer or the server doesn't really matter until somebody gets to the point of actually implementing it.

But regardless of where the gamestate comes from, the way I was imagining it would work would be that the person reconnecting would only receive information that's relevant to the current turn. So map dimensions and terrain, plus the relevant side information such as unit positions, recall list (for MP campaigns), and statistics. There is no need to show where every unit moved or who they each attacked for every single turn (much less replay the entire thing). The entirety of the game history could then be sent later if a replay is needed (Save Replay button/completion of the scenario).
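
A minimal sketch of that idea, assuming the game state can be represented as a plain dictionary. The field names (`turn`, `map_size`, `terrain`, `sides`, `history`) and the JSON-plus-zlib encoding are assumptions for illustration only; Wesnoth's own save format is WML.

```python
import json
import zlib

# Fields a reconnecting player needs right now; "history" is deliberately absent.
CURRENT_TURN_FIELDS = ("turn", "map_size", "terrain", "sides")


def make_snapshot(state: dict) -> bytes:
    """Serialise only the current-turn part of the state, compressed."""
    current = {key: state[key] for key in CURRENT_TURN_FIELDS}
    return zlib.compress(json.dumps(current, sort_keys=True).encode("utf-8"))


def load_snapshot(blob: bytes) -> dict:
    """Rebuild the current-turn state on the reconnecting client."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


if __name__ == "__main__":
    host_state = {
        "turn": 14,
        "map_size": [3, 2],
        "terrain": ["Gg", "Gg", "Hh", "Ww", "Gg", "Ch"],
        "sides": [
            {"side": 1, "gold": 42,
             "units": [{"type": "Spearman", "x": 1, "y": 2, "hp": 30, "xp": 5}],
             "recall_list": []},
            {"side": 2, "gold": 17,
             "units": [{"type": "Orcish Grunt", "x": 3, "y": 1, "hp": 21, "xp": 0}],
             "recall_list": []},
        ],
        "history": ["...every move and attack since turn 1..."],
    }
    blob = make_snapshot(host_state)
    print(len(blob), "bytes on the wire")
    print(load_snapshot(blob)["turn"])
```

The full history stays where it is and would only be transferred if the player later asks for a replay.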
linuxminter
Posts: 11
Joined: October 11th, 2013, 8:31 pm

Re: Ideas for handling disconnects/lag in Multiplayer

Post by linuxminter »

Pentarctagon wrote:By "host" I simply meant "the person who put the game in the MP lobby". Whether the information gets sent by the host's computer or the server doesn't really matter until somebody gets to the point of actually implementing it.
Perhaps "initiator" is a better term, because "host" means server, from the tech point of view, and "creator" would mean the person who designed and created the scenario.
Pentarctagon wrote:But regardless of where the gamestate comes from, the way I was imagining it would work would be that the person reconnecting would only receive information that's relevant to the current turn. So map dimensions and terrain, plus the relevant side information such as unit positions, recall list (for MP campaigns), and statistics. There is no need to show where every unit moved or who they each attacked for every single turn (much less replay the entire thing). The entirety of the game history could then be sent later if a replay is needed (Save Replay button/completion of the scenario).
Yes, that was the gist of my suggestion. There is no need to scroll through the entire game blow-by-blow if a player reconnects to an existing game. I am not sure what iceice was talking about when he referred to a check-box in the multiplayer lobby. Does that represent an already existing solution? I do not remember seeing any option to bypass the blow-by-blow "movie" which can go on for a considerable length of time in long games.

To the resuming or reconnecting player, in most cases (I would imagine, although I haven't taken a survey), only the current, realtime state is relevant, not what happened in the past. The current state consists of the map and the locations and stats of monsters and players. I believe it may be possible to transmit the current state in less than 2000 - 5000 bytes for most maps, and DSL broadband speeds allow this to be transmitted in considerably less than one second. If each hex's display is controlled independently in the code, then if the hex's content has not changed, it does not need to be changed. Only the minimum amount of display-updating should take place, with the code comparing the new map-state with the old map-state and only updating where necessary, in the interest of efficiency.
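
The comparison step could look roughly like this, assuming each map-state is a dictionary from hex coordinates to whatever is shown on that hex. `changed_hexes` and `redraw` are hypothetical names, and the "display" here is just another dictionary.

```python
def changed_hexes(old_state: dict, new_state: dict) -> set:
    """Hexes whose content differs between the old and new map-state."""
    all_hexes = old_state.keys() | new_state.keys()
    return {pos for pos in all_hexes if old_state.get(pos) != new_state.get(pos)}


def redraw(display: dict, old_state: dict, new_state: dict) -> int:
    """Update only the changed hexes; return how many were touched."""
    dirty = changed_hexes(old_state, new_state)
    for pos in dirty:
        if pos in new_state:
            display[pos] = new_state[pos]
        else:
            display.pop(pos, None)     # hex is now empty
    return len(dirty)


if __name__ == "__main__":
    old = {(1, 3): "Spearman 30hp", (4, 5): "Grunt 21hp", (7, 7): "Mage 24hp"}
    new = {(3, 5): "Spearman 30hp", (4, 5): "Grunt 9hp", (7, 7): "Mage 24hp"}
    screen = dict(old)
    touched = redraw(screen, old, new)
    print(touched, "hexes updated;", screen == new)
```

In this example three hexes are touched (the vacated hex, the newly occupied hex and the damaged unit), while the unchanged Mage is left alone.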

I suggest a new advanced option in Settings, off by default in order to accommodate those who prefer the old way of doing business. However, if a player does not wish to receive sequential updates, due to problems with lag or disconnects, then his updates may simply be map-updates, with one possible exception, those attacks which directly impact his own unit(s)--the player may wish to observe those attacks in animation.

The above is one of my suggestions, but it does not attack the original reason for my thread and my main motivation. I think it is a good idea, but I'm not sure whether it will help me. My problem with Wesnoth multiplayer is that I can only play it reliably from a location with wide pipes. I cannot play it reliably using DSL broadband with 330K download / 35K upload per second. What happens is that the game bogs down into severe lag. Other players are forced to wait on me because my side has not updated. Sometimes it may never update at all. Other times, it will update after a 1 - 10 minute delay, during which time people are chatting and speculating on what the problem might be. The problem is the internet connection. When I play Wesnoth multiplayer at a different location, using broadband with 500K download / 300K upload per second, there is no problem for me whatsoever. I observe other players suffering from the by now familiar problem, but it does not affect me. I can play all day long and experience not a single problem.

I have examined my firewalls in both cases and the firewall is not a problem. In both cases, the firewall is set to allow outgoing, and deny incoming, with a few exceptions here and there to permit torrent traffic. According to Wesnoth documentation, that is right for a client.

I have also tried setting the ping timeout to 10 seconds (from the default of 0), but that does not seem to help. I thought at first that the reason there was a problem was due to the amount of data being transmitted, but the developer in this thread disabused me of that false assumption. There must be another reason, but I'm not sure what it is. I wonder whether the slow upload speed might be a factor?
gfgtdf
Developer
Posts: 1432
Joined: February 10th, 2013, 2:25 pm

Re: Ideas for handling disconnects/lag in Multiplayer

Post by gfgtdf »

The terrain data is normally only a small part of a game snapshot. A relatively complicated scenario has a map file of ~50 KB uncompressed, and the snapshot part of the save file is 2.5 MB uncompressed. I think you'll get less network traffic if the other users just send you their actions, as is currently done ("unit at (1,3) attacks unit at (4,5) with attack 2"), than by sending snapshots. A normal turn has ~5 KB of uncompressed data, of which >50% is additional checkup data to detect OOS errors.
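
A back-of-the-envelope check of those figures, taking them at face value: 2.5 MB for one uncompressed snapshot and about 5 KB of uncompressed data per turn. Decimal units are assumed, compression is ignored, and the post does not say whether "turn" means one side's turn or a full round, so treat the result as an order of magnitude only.

```python
SNAPSHOT_BYTES = 2_500_000   # "2.5 mb" uncompressed snapshot (decimal units assumed)
TURN_BYTES = 5_000           # "~5kb" of uncompressed data per normal turn


def replay_bytes(turns: int, turn_bytes: int = TURN_BYTES) -> int:
    """Total data needed to send the action history for `turns` turns."""
    return turns * turn_bytes


def break_even_turns(snapshot_bytes: int = SNAPSHOT_BYTES,
                     turn_bytes: int = TURN_BYTES) -> float:
    """Number of turns at which the history becomes as large as one snapshot."""
    return snapshot_bytes / turn_bytes


if __name__ == "__main__":
    print("history for 60 turns:", replay_bytes(60), "bytes")
    print("one snapshot:        ", SNAPSHOT_BYTES, "bytes")
    print("break-even at turn:  ", break_even_turns())
```

On these numbers the action history for a 60-turn game is about 300,000 bytes, roughly an eighth of a single snapshot, and the history only grows as large as one snapshot after about 500 turns.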
linuxminter
Posts: 11
Joined: October 11th, 2013, 8:31 pm

Re: Ideas for handling disconnects/lag in Multiplayer

Post by linuxminter »

gfgtdf wrote:The terrain data is normally only a small part of a game snapshot. A relatively complicated scenario has a map file of ~50 KB uncompressed, and the snapshot part of the save file is 2.5 MB uncompressed. I think you'll get less network traffic if the other users just send you their actions, as is currently done ("unit at (1,3) attacks unit at (4,5) with attack 2"), than by sending snapshots. A normal turn has ~5 KB of uncompressed data, of which >50% is additional checkup data to detect OOS errors.
I tried playing Wesnoth multiplayer again the other night from a location with 330K/second down / 37K / second up bandwidth, and again suffered lag on the 2nd turn and had to quit a multiplayer game, after notifying the other players of the problem. I have had to quit about 20 - 50 multiplayer games from this location. In the beginning, I was not aware that the problem was local, and thought it might be the server or another player. But I have come to a conclusion recently after comparing my multiplayer experience between one location and another. Sometimes I waited for half an hour for the lag to resolve. Of course, the other players had to wait as well in those cases. It is only natural that other players may conclude someone stepped away from the keyboard, when in reality the problem is a technical, network issue. I am not sure of the reason why, and the documentation does not offer any insight. I have researched this problem extensively on Google and read many forum messages and blog posts and change-logs written by the developers. If there were already information available on troubleshooting the problem, then I would have pursued that angle. That is why I was speculating that Wesnoth is transmitting a massive amount of data during gameplay. When I play Wesnoth multiplayer at a location with wide pipes, which you may understand to be 350K/second down and 350K/second upload speed, there are zero lags and zero disconnects. So it appears to me that Wesnoth functions well in scenarios with very broad broadband, but not well in scenarios with average or budget broadband. I do not think I am the only player with this problem, because I have observed other players having a similar issue, although they too are often puzzled about the cause. I have learned over time, through trial and error, after 20 - 50 attempts, to not try multiplayer in one location, but that it is okay to play it in another.
iceiceice
Posts: 1056
Joined: August 23rd, 2013, 2:10 am

Re: Ideas for handling disconnects/lag in Multiplayer

Post by iceiceice »

When you say "in one location", does this mean you are using the same machine but connecting to wifi in different locations, or using entirely different machines?

i am not an expert in networking by any means, but I can tell you anyways that there are possibly many other relevant parameters to the network connection than its bandwidth, and it is usually pretty difficult to diagnose these things.