<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

  <title></title>

</head>

<body bgcolor="#ffffff" text="#000000">

Rahul Nabar wrote:

<blockquote

 cite="mid:c4d69730809171418t61104974hf1b09c9552251992@mail.gmail.com"

 type="cite">

  <pre wrap="">On Wed, Sep 17, 2008 at 4:05 PM, Eric Thibodeau <a class="moz-txt-link-rfc2396E" href="mailto:kyron@neuralbs.com">&lt;kyron@neuralbs.com&gt;</a> wrote:

  </pre>

  <blockquote type="cite">

    <pre wrap="">Well, apart from the fact that ssh is compressed, as Digo pointed out, and that 47 MB/sec is probably your HDD's transfer capacity, as Shannon pointed out, also keep in mind your bus's capacity ( <a class="moz-txt-link-freetext" href="http://en.wikipedia.org/wiki/List_of_device_bandwidths">http://en.wikipedia.org/wiki/List_of_device_bandwidths</a> is a nice list). So, unless you've got both NICs on PCI-E (or independent PCI channels, which I've only heard of in high-end Compaq servers with hotswap PCI interfaces), you're saturating your bus.

    </pre>

  </blockquote>

  <pre wrap=""><!---->

Thanks for all those responses, guys! Eric, I'll check my bus speed; my

server is not very high end. These are Dell PowerEdge 1435s. But

after I first posted I ran a couple more diagnostics:

(1) As Shannon pointed out earlier, I gave netperf a shot. The funny

result is this:

If I netperf from Machine A to B I get only 1 Gbps.

If I start two netperfs on A and try to talk to B, each gets 0.5 Gbps.

Thus the aggregate is still 1 Gbps.

BUT if I start two netperfs on A and one talks to B and another to C

each gets 1 Gbps. Thus I got an aggregate of 2 Gbps out [desired

result]

In the last situation if I disable one link then I fall back to 0.5

Gbps each. So this is my (almost) perfect situation.

This forces me to conclude that I am _not_ disk, bus, or I/O limited. What

do you think?

The sad thing though is this: I could never get a peer-to-peer (A

talks to B alone) mode that would give me an aggregate of 2 Gbps. This is

frustrating. These are 8 cpu/node servers and frequently even a 16 cpu

job will span across only 2 compute-nodes. Then if I cannot use both

the eth cards it seems an awful waste of capacity. Just think about

this: If two-processes talk from A-to-B I get 1 Gbps aggregate. But if

I have two processes and just route one through a

passive-forwarding-machine C (thus A-to-B and A-to-C-to-B) then I will

end up with an aggregate of 2 Gbps. This seems a very strange,

non-intuitive and undesirable outcome of the current bonding setup, I

feel.

I might have to actually _force_ jobs to span more than two servers

just to be able to use both my eth cards! Feels very strange to me.

I tried both modes 4 and 6. Rick Jones, the netperf maintainer, gave me

a very promising suggestion: I might be able to modify my bonding

hash algorithm so that it spreads traffic from two different processes

on the same node across both links. Currently I cannot. Has anybody

else given this a shot?

I'm eager to hear any other comments people might have.

  </pre>

</blockquote>

Well, I don't have "bondable" hardware, so I'm really interested in how

you technically manage this in the end.<br>

<br>

Eric<br>

<br>
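Note on the hash suggestion quoted above: the behaviour Rahul measured is what one would expect if the transmit hash only looks at the two endpoints. The sketch below is a simplified Python model, not the kernel code. It assumes the formulas described in the Linux bonding documentation for the xmit_hash_policy values "layer2" and "layer3+4"; the MAC addresses, IP addresses and port numbers are made up for illustration, and the function names are hypothetical.<br>
<br>
<pre wrap="">
```python
import ipaddress


def layer2_hash(src_mac, dst_mac, slave_count):
    """Simplified layer2 policy: XOR the last byte of each MAC address,
    then take the result modulo the number of slave interfaces."""
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return (src_last ^ dst_last) % slave_count


def layer34_hash(src_ip, dst_ip, src_port, dst_port, slave_count):
    """Simplified layer3+4 policy: XOR the ports with the low 16 bits of
    the XORed IP addresses, then take the result modulo the slave count."""
    ip_xor = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    low16 = ip_xor % 0x10000  # keep only the low 16 bits
    return ((src_port ^ dst_port) ^ low16) % slave_count


# Hypothetical hosts A and B, each with a two-slave bond.
MAC_A, MAC_B = "00:11:22:33:44:0a", "00:11:22:33:44:0b"
IP_A, IP_B = "10.0.0.1", "10.0.0.2"

# Two processes on A talking to the same port on B (made-up port numbers).
flows = [(40000, 12865), (40001, 12865)]

for src_port, dst_port in flows:
    l2 = layer2_hash(MAC_A, MAC_B, 2)
    l34 = layer34_hash(IP_A, IP_B, src_port, dst_port, 2)
    print("src port", src_port, "layer2 slave:", l2, "layer3+4 slave:", l34)
```
</pre>
In this model both flows land on the same slave under the layer2 formula, because the MAC pair never changes, while the layer3+4 formula can separate them because the source ports differ. Whether a given switch balances the return traffic the same way is a separate question that this sketch does not cover.<br>
<br>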

</body>

</html>