[scyld-users] SCYLD/MOSIX for Game Server SSI/Process Migration?
scott at trinitygames.com
Tue Feb 8 13:36:11 PST 2005
I'm new to the list, first post. Thank you for allowing me to post.
We've begun the long and painful process of exploring various cluster
We run online game servers, so for us, our applications are:
- Many separate serial tasks
- Varying load per task
- Task/job/applications run all the time
We know we won't gain many of the benefits people generally expect from
parallel systems. This is OK for us.
Our primary objective is to use our servers more efficiently by load
balancing on the fly, i.e. a single system image (SSI) with process migration.
Our game server daemons, like all others, start with a burst of CPU to
set up, then sit mostly idle until players join, after which CPU use rises
with player load. This is where we hope process migration will help us to
better utilize our server space. We don't know in advance which of the
hundreds of game servers we operate will actually have players, or
when, so a single system image with process migration among many nodes
appears ideal for us. With separate systems and manual load balancing, we
end up with many idle systems and some that are overloaded.
So far, the cluster solutions we've studied are:
If SCYLD can do what we want, and is affordable, it sure seems like the
obvious choice due to apparent ease of installation and configuration.
However, a major concern is that a migration event will cause a "lag spike"
on the game server daemon being migrated or other gaming processes on the
system -- this is a real show stopper for game servers, and our users would
not tolerate it.
Our processes can be compared to near real-time applications like streaming
video or audio, and any hiccup is very noticeable.
In a paper written in Nov. 2002, Carlo Daffara raises this issue and
works around the problem by using iproute2 queue controls. Here is an excerpt
from the paper:
"Another problem appeared during testing: since the game server memory
footprint is large (around 80 Mbytes each), we discovered that the migration
of processes slowed down the remaining network activity, introducing
significant packet latency (especially perceptible, since packets are very
small). So, we used the linux iproute2 queue controls to establish a
stochastic fair queuing discipline to the ethernet channels used for
internode communications; this works by creating a set of network "bins"
that host the individual network flows, marked using hashes generated from
the originating and destination IP addresses and the other part of the
header. The individual bins are then emptied in round robin, thus
prioritizing small packets over large transfer and not penalizing large
transfers (like process migration)."
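To check my own understanding of the binning and round-robin emptying described in the excerpt, here is a minimal Python sketch. It is a toy model, not the kernel's SFQ implementation: the names `flow_bin` and `sfq_order`, the bin count, and the CRC32 hash are all my assumptions, and it sends one packet per bin per round, whereas the real queuing discipline works on byte quanta and periodically perturbs its hash.

```python
import zlib
from collections import OrderedDict, deque

NUM_BINS = 1024  # illustrative value, not taken from the paper


def flow_bin(flow, num_bins=NUM_BINS):
    """Map a flow tuple (src_ip, dst_ip, src_port, dst_port) to a bin index."""
    key = "|".join(str(part) for part in flow).encode()
    return zlib.crc32(key) % num_bins


def sfq_order(packets, num_bins=NUM_BINS):
    """Return packets in the order a per-packet round robin over bins sends them.

    packets: iterable of (flow, payload) pairs in arrival order.
    """
    bins = OrderedDict()  # bin index -> queue of packets, in first-seen order
    for flow, payload in packets:
        bins.setdefault(flow_bin(flow, num_bins), deque()).append((flow, payload))

    sent = []
    while bins:
        # One pass over the active bins: each bin sends at most one packet.
        for index in list(bins):
            queue = bins[index]
            sent.append(queue.popleft())
            if not queue:
                del bins[index]
    return sent


# Hypothetical traffic: a bulk migration transfer arrives just ahead of
# two small game packets. Addresses and ports are made up.
migration = ("10.0.0.1", "10.0.0.3", 4660, 4660)
game = ("10.0.0.1", "10.0.0.2", 27015, 27015)
arrivals = [(migration, "chunk%d" % i) for i in range(5)]
arrivals += [(game, "update0"), (game, "update1")]

for flow, payload in sfq_order(arrivals):
    print(payload)
```

With a plain FIFO the two game packets would wait behind all five migration chunks. In this model, as long as the two flows hash to different bins, each game packet waits behind at most one chunk per round.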
So, the questions raised so far in our quest are:
- Does Scyld support process migration and load balance like [MOSIX]?
- Will the process migration event cause a hiccup as described by Daffara?
- Does our GigE network [help to] overcome this problem?
- Is it necessary (or even possible) to use the iproute2 queue controls on
I certainly would appreciate anyone's input on any of these or other related
This is our available test hardware:
Twin Xeon 2.8, 2G, 80G SATA primary for root/boot, some big RAID for the
'common' filesystem (tbd).
Nodes: P4 3.0/800 1G, Diskless. PXE/Gigabit NIC.
Dedicated Gig. Switch, GigE/PXE in every node.
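
As a rough back-of-envelope for the GigE question above, this Python sketch estimates how long the raw transfer of one process image would occupy the link. It assumes the 80 Mbyte footprint from the Daffara excerpt applies to our daemons (we have not measured ours), that the whole image is sent, and that the link runs at full line rate with no protocol overhead, so it is a lower bound on wire time only. The function name and the `efficiency` parameter are mine.

```python
def transfer_seconds(size_mbytes, link_mbit_per_s, efficiency=1.0):
    """Estimate seconds needed to move size_mbytes over a link.

    efficiency is the assumed fraction of the nominal rate actually
    achieved (1.0 means ideal line rate).
    """
    bits = size_mbytes * 8_000_000
    return bits / (link_mbit_per_s * 1_000_000 * efficiency)


# 80 Mbytes over gigabit versus 100 Mbit Ethernet, ideal case
print(transfer_seconds(80, 1000))
print(transfer_seconds(80, 100))
```

At ideal line rate this works out to about 0.64 s on GigE and about 6.4 s on 100 Mbit, so a faster link shortens the window but does not by itself keep small game packets from queuing behind the transfer.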
We haven't installed any O/S yet. I'm still trying to find out how to
obtain SCYLD. We are waiting for a reply to an email we sent to the
address the site lists for finding vendors.