[Beowulf] Detecting binaries that are limited to an architecture?

Paul McIntosh paul.mcintosh at monash.edu
Thu May 5 18:53:03 PDT 2016


I am thinking that this may not be as useful as I’d hoped…

 

Short Answer:

I am finding new opcodes all over the place on the old system, so their presence is not necessarily an indication of an issue. They are more like a “smell” that may or may not indicate something is off.

 

Long Answer:

 

Even if code is compiled on an old system, it will not necessarily run on that system if the source code explicitly uses new features, e.g.

http://www.codeproject.com/Articles/874396/Crunching-Numbers-with-AVX-and-AVX

]$ gcc -mavx -o hello_avx hello_avx.c

]$ ./hello_avx

Illegal instruction

 

Yes – opcode.sh will find the issue…

]$ objdump -D -M intel hello_avx | ~/opcode.sh -s AVX -m 2

  40058d:       c5 fa 10 45 a0          vmovss xmm0,DWORD PTR [rbp-0x60]

  400598:       c4 e3 79 21 95 ec fe    vinsertps xmm2,xmm0,DWORD PTR [rbp-0x114],0x10

 

However, there are lots of system libraries on the old system that contain new opcodes…

 

]$ objdump -D -M intel /usr/lib64/libfreetype.so.6 | ~/opcode.sh -s AVX

  3d51800aa6:   f3 c5 09 67 2a          repz vpackuswb xmm13,xmm14,XMMWORD PTR [rdx]

  3d5180b691:   c5 a9 51 3d 00 00 00    vsqrtpd xmm7,XMMWORD PTR [rip+0x7000000]        # 3d5880b699 <_end+0x6d6efe9>

  3d5180b6a9:   c5 a9 51 3d 00 00 00    vsqrtpd xmm7,XMMWORD PTR [rip+0x7000000]        # 3d5880b6b1 <_end+0x6d6f001>

  3d5180b6c1:   c5 a9 51 3d 00 00 00    vsqrtpd xmm7,XMMWORD PTR [rip+0x7000000]        # 3d5880b6c9 <_end+0x6d6f019>

  3d5180b6f1:   c5 a9 51 3d 00 00 00    vsqrtpd xmm7,XMMWORD PTR [rip+0x7000000]        # 3d5880b6f9 <_end+0x6d6f049>

 

So I am guessing that things like freetype are built to support multiple architectures and contain opcodes that never get hit.
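
One way to put a number on that "smell" would be to count hits per mnemonic instead of eyeballing the listing. The sketch below is only an illustration: it parses objdump lines shaped like the ones above, skips prefix tokens such as repz, and tallies mnemonics found in a small hand-written set. The names AVX_SAMPLE, PREFIXES, mnemonic_of and count_hits are hypothetical, and the set holds just the four mnemonics quoted in this message, not a complete AVX list. Note also that objdump -D disassembles every section, data included, while -d is limited to sections expected to contain instructions, so some of the hits in a library may simply be data decoded as instructions.

```python
import re
from collections import Counter

# Illustrative subset only: the four mnemonics quoted in this message.
AVX_SAMPLE = {"vmovss", "vinsertps", "vpackuswb", "vsqrtpd"}

# Prefix tokens that objdump may print in front of the real mnemonic.
PREFIXES = {"rep", "repz", "repnz", "lock", "data16", "addr32"}

# Address, colon, one or more hex byte pairs, then the instruction text.
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2}\s)+)\s*(.*)$")


def mnemonic_of(line):
    """Return the mnemonic on one objdump line, or None if there is none."""
    match = LINE_RE.match(line)
    if match is None:
        return None  # header, label, or blank line
    for token in match.group(3).split():
        if token not in PREFIXES:
            return token
    return None  # byte-only continuation line


def count_hits(lines, wanted=AVX_SAMPLE):
    """Count how often each mnemonic in `wanted` appears in `lines`."""
    hits = Counter()
    for line in lines:
        name = mnemonic_of(line)
        if name in wanted:
            hits[name] += 1
    return hits
```

Feeding count_hits the saved output of objdump -d and of objdump -D for the same library and comparing the two tallies might show how many of the hits sit outside the code sections.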

 

Interestingly, dmesg has more information on the issue, so it might be a better avenue for alerting on software that is not running correctly…

 

]$ dmesg

hello_avx[11258] trap invalid opcode ip:40058d sp:7fffca0f34c0 error:0 in hello_avx[400000+1000]
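
If dmesg were used for alerting, lines like the one above could be picked apart mechanically. The following is a minimal sketch that assumes the message keeps exactly the shape shown here (the wording may differ between kernel versions); TRAP_RE and parse_invalid_opcode are hypothetical names.

```python
import re

# Matches lines shaped like the dmesg entry quoted above. Other kernels may
# word the message differently, so treat this pattern as an assumption.
TRAP_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]:? trap invalid opcode "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>\d+)"
    r"(?: in (?P<image>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_invalid_opcode(line):
    """Return a dict describing an invalid-opcode trap, or None if no match."""
    match = TRAP_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    info["ip"] = int(info["ip"], 16)
    if info["base"] is not None:
        # Offset of the faulting instruction inside the mapped image.
        info["offset"] = info["ip"] - int(info["base"], 16)
    return info
```

For the line above, the parsed ip is 0x40058d, which is the address of the vmovss instruction in the objdump listing of hello_avx.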

 

Cheers,

 

Paul

 

From: Peter St. John [mailto:peter.st.john at gmail.com] 
Sent: Friday, 6 May 2016 9:46 AM
To: Paul McIntosh <paul.mcintosh at monash.edu>
Subject: Re: [Beowulf] Detecting binaries that are limited to an architecture?

 

If you ran the compiler (for your current target architecture) on the assembly (from decompiling the binaries you want to port), you'd get error messages, but I've never done that myself.

Peter

 

On Thu, May 5, 2016 at 6:39 PM, Paul McIntosh <paul.mcintosh at monash.edu> wrote:

Yes – tried that and it gets the opcodes, but then I am back to the issue of not knowing which opcodes are related to which architecture. My current train of thought is to find something in the gcc install that works out how to generate the opcodes for an architecture and see if it can be used to reverse them.

 

Paul

 

From: Peter St. John [mailto:peter.st.john at gmail.com]
Sent: Friday, 6 May 2016 7:23 AM
To: Paul McIntosh <paul.mcintosh at monash.edu>
Cc: Beowulf List <beowulf at beowulf.org>
Subject: Re: [Beowulf] Detecting binaries that are limited to an architecture?

 

You might run it through a decompiler; then you'd be looking at the assembler at least.

Peter

 

On Thu, May 5, 2016 at 4:51 PM, Paul McIntosh <paul.mcintosh at monash.edu> wrote:

All,

I am wondering if there is an easy way to detect if a binary makes use of
opcodes which are not available on a specific architecture?

We have /usr/local mounted across nodes with some Intel Xeon X5650
(Westmere) and some E5-2670 (SandyBridge). Some code spits out "Illegal
Instruction" when run on the old nodes and it appears to be due to hitting
shared libraries compiled on the newer nodes. We are going to have a similar
situation on the newer clusters also.

I have been putting together a test suite for our software stack and would
like to add the ability to sanity check binaries for such errors. I thought
there would be an easy way to do this by looking at the opcodes (objdump) and
comparing them to what the architecture provides. However, this requires
knowing all the opcodes from the Intel manuals for a chip.

I have been playing with opcode.sh
(https://gist.github.com/rindeal/72af275f05d44e10ebca) which looks promising
but will need a bit of manual work to get it to do what I want (and still
may be incomplete/inaccurate).

Has anyone done this? Or know of a way to easily get a computer-readable list
of opcodes per cpu (note /proc/cpuinfo flags just shows features, not
opcodes)?
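
As a point of comparison, the feature flags themselves are at least easy to read programmatically. The sketch below parses /proc/cpuinfo-style text and reports which of a requested set of flags are missing. It says nothing about individual opcodes, and mapping an opcode to a flag would still need a separate table. The names cpu_flags, missing_features and read_cpuinfo are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no flags line found


def missing_features(required, cpuinfo_text):
    """Return the required flags that the CPU does not advertise, sorted."""
    return sorted(set(required) - cpu_flags(cpuinfo_text))


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file; the default path assumes a Linux node."""
    with open(path) as handle:
        return handle.read()
```

Calling missing_features({"avx"}, read_cpuinfo()) on a node would be expected to return ["avx"] when the CPU does not advertise AVX and an empty list otherwise.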

Cheers,

Paul
--
Dr Paul McIntosh
  Senior HPC Consultant, Technical Lead,
    Multi-modal Australian ScienceS Imaging and Visualisation Environment
(www.massive.org.au)
       Monash University, Ph: 9902 0439 Mob: 0434 524935



_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

 

 
