[Beowulf] Re: dealing with lots of sockets

Perry E. Metzger perry at piermont.com
Wed Jul 2 18:06:55 PDT 2008


"Robert G. Brown" <rgb at phy.duke.edu> writes:
> I'm not quite sure what you mean by "vast numbers of teeny high
> latency requests" so I'm not sure if we really are disagreeing or
> agreeing in different words.

I mostly have worried about such schemes in the case of, say, 10,000
people connecting to a web server, sending an 80 byte request, and
getting back a few k several hundred ms later. (I've also dealt a bit
with transaction systems with more stringent throughput requirements,
but rarely with things that require an ack really, really fast.) That
said, I'm pretty sure event systems win over threads if you're
coordinating pretty much anything...

>> Sure, but it is way inefficient. Every single process you fork means
>> another data segment, another stack segment, which means lots of
>> memory. Every process you fork also means that concurrency is achieved
>> only by context switching, which means loads of expense on changing
>> MMU state and more. Even thread switching is orders of magnitude worse
>> than a procedure call. Invoking an event is essentially just a
>> procedure call, so that wins big time.
>
> Sure, but for a lot of applications, one doesn't have a single server
> with umpty zillion connections

Well, often one doesn't build things that way, but that's sort of a
choice, isn't it? Your machine has only one or two or eight
processors, and any other processes/threads above that which you
create are not actually operating in parallel but are just a
programming abstraction. It is perfectly possible to structure almost
any application so there is just the one thread per core and you
otherwise handle the programming abstraction with events instead of
additional threads, processes or what have you.
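To make that concrete, here is roughly the shape I mean -- a pre-forked
worker per core, each running its own select() loop, with everything
else handled as events. This is just a sketch: plain POSIX, no error
handling, a trivial echo standing in for real work, and the port and
worker count are placeholders rather than recommendations.

/* One event-driven worker per core; all workers share the listening
 * socket inherited across fork(). Losers of the accept() race just
 * get EAGAIN because the listener is non-blocking. */
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

#define NWORKERS 4                     /* roughly one per core */

static void worker(int lfd)
{
    fd_set master;
    int maxfd = lfd;

    FD_ZERO(&master);
    FD_SET(lfd, &master);

    for (;;) {
        fd_set rfds = master;
        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
            continue;

        for (int fd = 0; fd <= maxfd; fd++) {
            if (!FD_ISSET(fd, &rfds))
                continue;
            if (fd == lfd) {           /* "new connection" event */
                int c = accept(lfd, NULL, NULL);
                if (c >= 0) {
                    FD_SET(c, &master);
                    if (c > maxfd)
                        maxfd = c;
                }
            } else {                   /* "data ready" event */
                char buf[4096];
                ssize_t n = read(fd, buf, sizeof buf);
                if (n <= 0) {          /* peer closed or error */
                    close(fd);
                    FD_CLR(fd, &master);
                } else {
                    /* echo back; a real server would parse the
                     * request and buffer the reply instead of
                     * risking a blocking write to a slow peer */
                    write(fd, buf, n);
                }
            }
        }
    }
}

int main(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sa;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(8080);
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(lfd, (struct sockaddr *)&sa, sizeof sa);
    listen(lfd, 128);
    fcntl(lfd, F_SETFL, O_NONBLOCK);

    for (int i = 0; i < NWORKERS - 1; i++)
        if (fork() == 0)               /* children stop forking */
            break;
    worker(lfd);                       /* every process runs one loop */
    return 0;
}

Swap select() for epoll/kqueue or a library like libevent and that is
the skeleton of most servers that handle serious connection counts.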

> If the connection is persistent, the overhead associated with task
> switching is just part of the normal multitasking of the OS.

That overhead is VERY high. Incredibly high. Most people don't really
understand how high it is.

If you compare the performance of an http server that manages 10,000
simultaneous connections with events versus one that handles them with
threads, you'll see there is no comparison -- events always beat
threads into the ground, because you can't get away from every thread
needing its own stack, and you can't get away from the fact that a
context switch is far more expensive than a procedure dispatch.
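
Back of the envelope (assuming typical glibc defaults, so treat the
exact numbers loosely): 10,000 threads at the usual 8 MB default stack
is on the order of 80 GB of reserved address space, and even trimmed to
64 KB per stack you are still reserving over 600 MB just so blocked
connections have somewhere to sleep. An event-driven server keeps one
stack per core plus a small per-connection state record.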

> Similarly, many daemon-driven tasks tend to be quite bounded.  If a
> server load average is down under 0.1 nearly all the time, nobody cares,

That implies almost nothing is ever in the run queue. For an HPC
system, one hopes that the load is hovering around 1 per processor.
Less means you're wasting processor, more means you're spending too
much time context switching. But I digress...

> Still, it is important to understand why there are a lot of applications
> that are.  In the old days, there were limits on how many processes, and
> open connections, and open files, and nearly any other related thing you
> could have at the same time, because memory was limited.

Believe it or not, memory is still limited, and context switch time is
still pretty bad. Changing MMU contexts is unpleasant. Even if you
don't have to do that, because you're using another thread in the same
MMU context rather than a process, the overhead is still quite
painful.

Seeing is believing. There are lots of good papers out there on
concurrency strategies for systems with vast numbers of sockets to
manage, and there is no doubt what the answer is -- threads suck
compared to events, full stop. Event systems scale linearly for far
longer.

> Or maybe not.  If you make writing event driven network code as easy,
> and as well documented, as writing standard socket code and standard
> daemon code, the forking daemon may become obsolete.  Maybe it IS
> obsolete.

It is pretty easy. The only problem is getting your mind wrapped
around it and getting experience with it. Most people have been
writing fully linear programs for their whole careers. If you tell them to
try events, or try functional programming, or other things they're not
used to, they almost always scream in agony for weeks until they get
used to it. "Weeks" is often more overhead than people are willing to
suffer. That said, I am comfortable with both of those paradigms...

> So, what do you think?  Should one "never" write a forking daemon, or
> inetd?

It depends. If you're doing something where there is going to be one
socket talking to the system a tiny percentage of the time, why would
you bother building an event driven server? If you're building
something to serve files to 20,000 client machines over persistent TCP
connections and the network interface is going to be saturated, then
hell yes: don't use 20,000 threads for that, write the thing event
driven or you'll die.

It is all about the right tool for the job. Apps that are all about
massive concurrent communication need events. Apps that are about very
little concurrent communication probably don't need them.

>> Event driven systems can also avoid locking if you keep global data
>> structures to a minimum, in a way you really can't manage well with
>> threaded systems. That makes it easier to write correct code.
>>
>> The price you pay is that you have to think in terms of events, and
>> few programmers have been trained that way.
>
> What do you mean by events?  Things picked out with a select statement,
> e.g. I/O waiting to happen on a file descriptor?  Signals?

More the former than the latter. Event driven programming typically
uses registered callbacks that are triggered by a central "Event Loop"
when events happen. In such a system, one never blocks for anything --
all activity is performed in callbacks, and one simply returns from a
callback if one can't proceed further. The programming paradigm is
quite alien to most people.

I'd read the libevent man page to get a vague introduction. 
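
Just to make the callback style concrete, a toy echo server written
against the classic libevent calls looks something like this -- a
sketch only, with no error handling or write buffering, and the echo
standing in for real work:

#include <event.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

static void on_data(int fd, short what, void *arg)
{
    struct event *ev = arg;            /* the event that fired */
    char buf[4096];
    ssize_t n = read(fd, buf, sizeof buf);

    if (n <= 0) {                      /* peer went away: tear down */
        event_del(ev);
        free(ev);
        close(fd);
        return;
    }
    write(fd, buf, n);                 /* do the work, then just return
                                          to the event loop */
}

static void on_connect(int lfd, short what, void *arg)
{
    int c = accept(lfd, NULL, NULL);
    struct event *ev;

    if (c < 0)
        return;
    ev = malloc(sizeof *ev);
    event_set(ev, c, EV_READ | EV_PERSIST, on_data, ev);
    event_add(ev, NULL);               /* register the callback and
                                          go back to the loop */
}

int main(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sa;
    struct event lev;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(8080);
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(lfd, (struct sockaddr *)&sa, sizeof sa);
    listen(lfd, 128);

    event_init();
    event_set(&lev, lfd, EV_READ | EV_PERSIST, on_connect, NULL);
    event_add(&lev, NULL);
    event_dispatch();                  /* everything above runs as
                                          callbacks from this loop */
    return 0;
}

The point is the shape of it: register a callback, do a little work
when it fires, return to the loop, and never, ever block.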


-- 
Perry E. Metzger		perry at piermont.com


