[Beowulf] Fastest way to compute Euclidean distance [spin off from: Building new cluster - estimate]

Eric Thibodeau kyron at neuralbs.com
Thu Aug 7 10:02:58 PDT 2008


>
>>>   Most of the arguments I have heard are "oh but it's compiled with 
>>> -O3" or whatever. Any decent HPC code person will tell you that that 
>>> is most definitely not a guaranteed way to a faster system ...
>> Hey...as I stated above, one would have to be quite silly to claim 
>> -O3 as the all well and all good optimization solution. At least you 
>> can rest assured your solutions will add up correctly with GCC. To get a 
> Well, sometimes.  You still need to be careful with it.
>
> This said, I am not sure icc/pgi/... are uniformly better than gcc.  I 
> did an admittedly tiny study of this http://scalability.org/?p=470 
> some time ago.  What I found was that gcc really held its own.  It did 
> a very good job on a very simple test case.
Very nice post, thanks for that. It so happens that I am going through the 
exact same steps, trying to optimize a very simple piece of code that 
computes the Euclidean distance, and I was a little stumped to find that 
the simple C code outperforms BLAS (both GOTO and MKL). If you have 
gnuplot, a BLAS library with a cblas interface, and icc installed, all you 
have to do is put the three attached files in the same directory and run 
`make`, and you'll get nice plots of what's going on. I'm also attaching an 
example run with:

icc 10.1.017
gcc 4.3.1
GOTO BLAS 1.24

Eric
PS: regular disclaimers about crappy code writing apply ;)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: EuclideanDist.c
Type: text/x-csrc
Size: 3596 bytes
Desc: not available
Url : http://www.scyld.com/pipermail/beowulf/attachments/20080807/63baf0c3/EuclideanDist.bin
-------------- next part --------------
### Global, generic variables ###
SHELL     = /bin/bash
ARCH      = $(shell uname -m)
GCC       = gcc-4.3.1
#GCCFLAGS  = -Wall -march=native -mfpmath=sse,387 -O3 -fomit-frame-pointer -fkeep-inline-functions -funsafe-loop-optimizations -freorder-blocks-and-partition -fno-math-errno -ffinite-math-only -fno-trapping-math -fno-signaling-nans -fwhole-program --param l1-cache-line-size=1 --param l1-cache-size=64 --param l2-cache-size=4096
GCCFLAGS  = -Wall -march=native -O3 

# For ICC 
# on Opteron: -xW
#ICCFLAGS    = -xW
# on Core2 Duo -xT
ICCFLAGS    = -xT 

LIBS      = -lcblas -lblas -lm
LDFLAGS   = $(LIBS)

### TAU specific variables ###
TAU_MAKEFILE = ~/TAU/TAU/$(ARCH)/lib/Makefile.tau-pdt
TAU_CXX      = tau_cxx.sh
TAU_CC       = tau_cc.sh
TAU_OPTS     = -optNoRevert -optLinking="$(LIBS)" -optTauCC="$(CC)" -optCPPOpts="$(GCCFLAGS)" -tau_makefile=$(TAU_MAKEFILE)

PROGNAM    = EuclideanDist
# name of the executable
PROGRAM    = $(PROGNAM)
PROGOUT    = $(PROGNAM)_$(ARCH)
TAU_PROG   = $(PROGOUT)_TAU
# source files
SRCS       = $(PROGNAM).c
# object files
OBJS       = $(PROGNAM).o

MKL_LIBS = -liomp5 -lpthread -I/opt/intel/mkl/10.0.3.020/include/ -L/opt/intel/mkl/10.0.3.020/lib/em64t/

.SUFFIXES: .c .o
.cpp.o:
	$(CXX) -c $(GCCFLAGS) $<
.c.o:
	$(CC) -c $(GCCFLAGS) $<

# Targets
.PHONY: default all tau set_mkl_blas set_goto_blas icc gcc clean tests icctest gcctest plots
default: all
# all: $(PROGRAM) icc gcc
all: $(PROGRAM) icc gcc tests plots
$(PROGRAM):
	$(GCC) $(SRCS) -o $(PROGOUT) $(LDFLAGS)

tau:
	$(TAU_CC) $(TAU_OPTS) $(SRCS) -o $(TAU_PROG)

set_mkl_blas:
	sudo eselect blas set mkl-gfortran

set_goto_blas:
	sudo eselect blas set goto

icc:
	icc $(ICCFLAGS) $(SRCS) -o $(PROGOUT)_ICC $(MKL_LIBS) $(LIBS)

gcc:
	$(GCC) $(GCCFLAGS) $(SRCS) -o $(PROGOUT)_GCC $(LIBS)

clean:
	/bin/rm -f $(OBJS) $(PROGOUT) $(PROGOUT)_ICC $(PROGOUT)_GCC $(TAU_PROG) *.dat

tests: icctest gcctest

icctest:
	./$(PROGOUT)_ICC     > icc.dat

gcctest:
	./$(PROGOUT)_GCC     > gcc.dat

plots:
	gnuplot Plot.gp
-------------- next part --------------
set title "BLAS vs. C execution time for\nEuclidean distance computation"
set xlabel "Vector Size (bytes)"
set ylabel "Time (sec)"
set logscale xy
set grid xtics mxtics
set key top left
set key box
#set term post enh         # enhanced PostScript, essentially PostScript
                           # with bounding boxes
#set term postscript
#set term png
set term postscript enhanced color

set out 'BlasVsC.eps'

plot "icc.dat" using 1:2 title 'icc-BLAS' w l lw 1 , \
"icc.dat" using 1:3 title 'icc' w l lw 1 , \
"gcc.dat" using 1:2 title 'gcc-BLAS'  w l lw 1 , \
"gcc.dat" using 1:3 title 'gcc' w l lw 1  
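
The plot commands above imply a layout for icc.dat and gcc.dat: column 1 is 
the vector size (the x label gives it in bytes), column 2 the BLAS time and 
column 3 the plain C time, both in seconds. As a rough sketch of how a 
benchmark could emit rows in that layout, assuming whitespace-separated 
columns: the helper names cpu_seconds and format_row are hypothetical and 
do not come from the scrubbed EuclideanDist.c.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* CPU seconds consumed by one call of fn(arg), measured with clock(). */
double cpu_seconds(void (*fn)(void *), void *arg)
{
    assert(fn != NULL);
    clock_t t0 = clock();
    fn(arg);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

/* Format one data row: size, BLAS time, plain C time.
 * Returns what snprintf returns (characters needed, excluding the NUL). */
int format_row(char *buf, size_t len, size_t size_bytes,
               double t_blas, double t_c)
{
    return snprintf(buf, len, "%zu %.6e %.6e\n", size_bytes, t_blas, t_c);
}
```

Because the script sets a logarithmic scale on both axes, a measured time of 
exactly zero cannot be drawn. clock() has limited resolution, so for small 
vectors it helps to repeat the kernel many times inside the timed call and 
divide by the repetition count.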

