From e.scott.atchley at gmail.com Mon Mar 2 14:08:31 2026
From: e.scott.atchley at gmail.com (Scott Atchley)
Date: Mon, 2 Mar 2026 09:08:31 -0500
Subject: [Beowulf] [EXTERNAL] IB vs. Ethernet
In-Reply-To: <8A12224D-EDAA-4A1C-96FC-28B5DACCCB31@serissa.com>
References: <1f8b8ac8-2ff7-4496-8d68-a6356da7b342@ucar.edu>
 <20260221082852.GA10552@rd.bx9.net>
 <83B6174F-BAA2-4F44-BA3B-BEA56CA07044@serissa.com>
 <8A12224D-EDAA-4A1C-96FC-28B5DACCCB31@serissa.com>
Message-ID:

On Wed, Feb 25, 2026 at 9:04 PM Lawrence Stewart wrote:

> Arista has published 10G latency measurements for QSFP based copper and
> optical links from 1-6 meters
>
> Copper latency looks like about 5 ns per meter while optical is a little
> slower for short cables and a little faster for long ones.
>
> For 400 GB link modules, apparently you can use "analog" optical
> transceivers with 20 ns delays plus fiber delay up to 100 meters. You can
> also use DSP based ones that could be 100 ns
>
> The Optical Analog/Clock and Data Recovery cables are much lower latency
> than the Active Optical Cables with retimers in them and perhaps equalizers.
>
> For connections within a rack, you can also use Direct Attach Copper,
> which is just a twinax parallel cable, up to about 5 meters. Or there are
> Active Electrical Cables with equalizers that are a bit slower.
>
> The price tags for the optical 400G cables are eye-popping.
>
> I realize that most AI work is bandwidth-focussed, and a microsecond is
> fine, but I have a soft spot for SHMEM 8 byte puts and gets, and there is
> always a role for Barrier and small AllGathers.
>
> -L
>

How much does FEC add? I have been under the impression that it is now
mandatory ≥100Gbps.

>
> > On Feb 25, 2026, at 19:20, Lux, Jim (US 430E) wrote:
> >
> > -----Original Message-----
> > From: Beowulf On Behalf Of Lawrence Stewart
> > Sent: Saturday, February 21, 2026 4:34 AM
> > To: beowulf at beowulf.org
> > Cc: Lawrence Stewart
> > Subject: [EXTERNAL] Re: [Beowulf] IB vs. Ethernet
> >
> >> On Feb 21, 2026, at 3:28 AM, Greg Lindahl wrote:
> >>
> >> On Thu, Jan 15, 2026 at 08:28:36PM -0500, Lawrence Stewart wrote:
> >>
> >>> I think a 64 byte store at a core should directly become a packet.
> >>> No on-die-network, no coherence, no root complex, no host-fabric
> >>> adapter. Incoming short messages should be delivered directly to a
> >>> fifo in the relevant core.
> >>
> >> I think that's a great idea!
> >>
> >> -- greg
> >>
> >
> > As Greg, I think, is hinting, this idea was a thing that QLogic HFI's
> > did, using the core write combining buffers to good effect. It seems
> > like it is also the basic idea behind MOVDIR64B, which specifies that
> > a 64 byte write will be atomic all the way down.
> >
> > Using core registers for messaging is much older, with Transputers,
> > Tilera, Dally's J Machine and arguably Cray E-registers.
> >
> > What this is really about is end to end latency. We've been stuck at 1
> > microsecond since the Cray T3D 30 years ago, in spite of 100x
> > improvements in link speed. If we can eliminate all the middlemen and
> > get switches back to 50 ns forwarding, I think we should be able to
> > get 300 ns end to end in a good size system.
> >
> > -Larry
> >
> > Indeed, I suspect the 1 microsecond probably ties to something else
> > that was convenient - If you're not running parallel wires (lanes)
> > then sending 1000 bits at 1Gbps takes 1 microsecond.
> >
> > And if the actual link gets faster, the messages get bigger, so that
> > they still take 1 microsecond.
> >
> > There are some practical issues - As your symbol rate gets higher on
> > the wire, things like impedance discontinuities causing reflections
> > become more important. You have a transition from die to package, one
> > from package to board, one from board to connector/cable. And those
> > all have ~1-10 ns kind of time scales. Stack all those up and it can
> > take a long time for the cascade of reflections to die out.
> >
> > The fix, today, is to put equalizers (preferably adaptive equalizers)
> > that essentially "undistort" the waveform. But those equalizers have
> > to look at many symbol times to work (typically, they're implemented
> > as a tapped delay line with weights on each tap and summed - a FIR
> > filter), which then means that your first bit out is delayed by
> > however many symbols are in the filter's delay line. I suspect that
> > for "commodity" hardware, there's a particular length of delay line
> > that is long enough to accommodate all possible wiring configurations.
> >
> > Let's look at Ethernet - the maximum ethernet run for GigE is 100
> > meters, which not so oddly, is about 500 ns long (propagation speed is
> > ~0.66c due to the dielectric and capacitance/inductance of the twisted
> > pair). So the time for a reflection to get back to the sending end is,
> > hmmm, 1 microsecond.
> >
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
>
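
Editorial aside (not from any poster): the back-of-envelope figures in
Jim's quoted message above can be reproduced with a few lines of Python.
This is only a sketch under stated assumptions. The function names and the
5-tap weights are made up for illustration; only the 1000 bits, 1 Gbps,
100 m and 0.66c values come from the thread. The tapped delay line is a
plain fixed-weight FIR filter, not an adaptive equalizer.

```python
"""Back-of-envelope link timing, using figures quoted in the thread."""

C_M_PER_S = 299_792_458.0  # speed of light in vacuum, metres per second


def serialization_ns(bits, rate_bps):
    """Time to clock `bits` onto one serial lane running at `rate_bps`."""
    return bits / rate_bps * 1e9


def propagation_ns(length_m, velocity_factor):
    """One-way flight time when the signal travels at velocity_factor * c."""
    return length_m / (velocity_factor * C_M_PER_S) * 1e9


def fir_equalize(samples, taps):
    """Tapped delay line: each output is the weighted sum of the most
    recent len(taps) input samples (a FIR filter)."""
    delay_line = [0.0] * len(taps)
    out = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]  # shift the newest sample in
        out.append(sum(w * d for w, d in zip(taps, delay_line)))
    return out


if __name__ == "__main__":
    # 1000 bits at 1 Gbps on a single lane: about 1000 ns, i.e. 1 microsecond.
    print("serialization ns:", serialization_ns(1000, 1e9))

    # 100 m of twisted pair at ~0.66c: roughly 505 ns one way, so a
    # reflection needs about 1 microsecond to return to the sender.
    one_way = propagation_ns(100, 0.66)
    print("one-way ns:", one_way, "round-trip ns:", 2 * one_way)

    # Hypothetical 5-tap equalizer with its main tap in the middle.
    # Feeding it a single pulse shows the peak emerging two symbols late.
    taps = [0.0, -0.1, 1.0, -0.1, 0.0]
    pulse = [1.0] + [0.0] * 6
    response = fir_equalize(pulse, taps)
    print("main tap delay in symbols:", response.index(max(response)))
```

The delay in symbols is set by where the main tap sits in the delay line,
so a longer line sized for the worst-case wiring costs that latency on
every link, which is the point Jim makes about "commodity" hardware.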

From jcducom at gmail.com Tue Mar 10 18:47:08 2026
From: jcducom at gmail.com (Jean-Christophe Ducom)
Date: Tue, 10 Mar 2026 11:47:08 -0700
Subject: [Beowulf] Job posting: Senior HPC engineer at Scripps Research Institute, San Diego, CA
Message-ID: <6bce2716-50cd-4029-b46c-fc6ee8c6cd27@gmail.com>

Scripps Research Institute is looking for a Senior HPC engineer.
See link below for job description
https://recruiting2.ultipro.com/SCR1003TSRI/JobBoard/98759e7d-7ede-4c0b-ac7b-2c6293c7b522/OpportunityDetail?opportunityId=84056ddc-527e-4308-9c3f-f8b9ed9e7fee

JC Ducom

From lindahl at pbm.com Mon Mar 16 06:15:13 2026
From: lindahl at pbm.com (Greg Lindahl)
Date: Sun, 15 Mar 2026 23:15:13 -0700
Subject: [Beowulf] [EXTERNAL] IB vs. Ethernet
In-Reply-To:
References: <1f8b8ac8-2ff7-4496-8d68-a6356da7b342@ucar.edu>
 <20260221082852.GA10552@rd.bx9.net>
 <83B6174F-BAA2-4F44-BA3B-BEA56CA07044@serissa.com>
 <8A12224D-EDAA-4A1C-96FC-28B5DACCCB31@serissa.com>
Message-ID: <20260316061513.GA21908@rd.bx9.net>

On Mon, Mar 02, 2026 at 09:08:31AM -0500, Scott Atchley wrote:

> How much does FEC add? I have been under the impression that it is now
> mandatory ≥100Gbps.

You're correct that it is now required by ethernet, at 100Gbit and also
1 lane 25 Gbit. I have only worked with an FPGA implementation and it
wasn't latency sensitive. It also seems to be the kind of thing that
would be extremely implementation-dependent.

For a history lesson, one of the first things Qlogic wanted to do with
Infinipath after buying us was to add FEC. Even below 100 gbits the
error rates left all of us nervous.

-- greg