Me getting a lot of hits from reddit.com/r/programming
Yesterday I wrote about a small script that visualizes control flow in ASM dumps with arrows. When a friend of mine posted it to reddit.com/r/programming, I suddenly got a lot more traffic than usual, along with some useful feedback that helped me considerably improve the robustness and applicability of the script.
This morning I noticed that I’d also been getting a few referrer hits from this unlikely source: http://llvm.org/bugs/show_bug.cgi?id=16297.
Reading that “request for feature” got me thinking that it ought to be dead easy to modify the existing script to also handle disassemblies from llvm-objdump.
An example of the result can be seen below.
     80486cb: 55 push EBP
     80486cc: 89 e5 mov EBP, ESP
     80486ce: 53 push EBX
     80486cf: 83 ec 10 sub ESP, 16
     80486d2: c7 45 f8 00 00 00 00 mov [EBP - 8], 0
 ,---80486d9: eb 29 jmp 41
 |,->80486db: 8b 45 f8 mov EAX, [EBP - 8]
 ||  80486de: 8b 55 08 mov EDX, [EBP + 8]
 ||  80486e1: 8d 0c 02 lea ECX, [EDX + EAX]
 ||  80486e4: 8b 1d 4c a0 04 08 mov EBX, [134520908]
 ||  80486ea: a1 54 a0 04 08 mov EAX, 134520916
 ||  80486ef: 89 c2 mov EDX, EAX
 ||  80486f1: 01 da add EDX, EBX
 ||  80486f3: 0f b6 12 movzx EDX, BYTE PTR [EDX]
 ||  80486f6: 88 11 mov BYTE PTR [ECX], DL
 ||  80486f8: 83 45 f8 01 add [EBP - 8], 1
 ||  80486fc: 83 c0 01 add EAX, 1
 ||  80486ff: a3 54 a0 04 08 mov 134520916, EAX
 '|->8048704: 8b 15 54 a0 04 08 mov EDX, [134520916]
  |  804870a: a1 50 a0 04 08 mov EAX, 134520912
  |  804870f: 39 c2 cmp EDX, EAX
,-|--8048711: 7d 14 jge 20
| |  8048713: 8b 15 4c a0 04 08 mov EDX, [134520908]
| |  8048719: a1 54 a0 04 08 mov EAX, 134520916
| |  804871e: 01 d0 add EAX, EDX
| |  8048720: 0f b6 00 movzx EAX, BYTE PTR [EAX]
| |  8048723: 3c 0a cmp AL, 10
| '--8048725: 75 b4 jne -76
'--->8048727: a1 54 a0 04 08 mov EAX, 134520916
     804872c: 83 c0 01 add EAX, 1
     804872f: a3 54 a0 04 08 mov 134520916, EAX
     8048734: 8b 45 f8 mov EAX, [EBP - 8]
     8048737: 83 c4 10 add ESP, 16
     804873a: 5b pop EBX
     804873b: 5d pop EBP
     804873c: c3 ret
What made this version of the script a bit trickier to write was that, where the classical objdump is kind enough to translate the target of a jump instruction into one of the addresses seen in the first column (e.g. 80486db), llvm-objdump only does a literal translation: the instruction name followed by the relative offset in bytes as a signed integer.
And with the x86 architecture’s notorious variable instruction length, finding the target address isn’t just a matter of dividing the offset by the instruction size to get a number of lines, as it would be in a fixed-length instruction set, but rather a matter of counting every byte on the way there. A slight added complication is that the offset is counted from the instruction following the jump, not from the jump itself. In the listing above, the jmp 41 at 80486d9 is two bytes long, so counting starts at 80486db and 41 bytes further on lands at 8048704.
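The address arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not the code from the actual script: the function name jump_target, the regular expression, and the assumed line shape (hex address, colon, two-digit hex bytes, mnemonic, operand) are all my own assumptions based on the dump shown earlier.

```python
import re

# Assumed line shape: "<hex address>: <hex bytes...> <mnemonic> <operands>"
# (hypothetical pattern modelled on the dump above, not taken from the script)
LINE_RE = re.compile(r'^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2}\s+)+)(\S+)\s*(.*)$')

def jump_target(line):
    """Return the absolute target address of a relative jump, or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    address = int(m.group(1), 16)
    length = len(m.group(2).split())   # instruction size = number of byte tokens
    mnemonic, operand = m.group(3), m.group(4).strip()
    if not mnemonic.startswith('j'):   # only jmp/jcc style instructions
        return None
    try:
        offset = int(operand)          # signed decimal offset, e.g. "41" or "-76"
    except ValueError:
        return None                    # operand is not a plain integer offset
    # The offset is relative to the instruction *following* the jump.
    return address + length + offset

print(hex(jump_target("80486d9: eb 29 jmp 41")))
```

Running the three jumps from the dump through this gives 8048704, 8048727 and 80486db, which are the rows the arrows point at. Since the result is always an address that already appears in the first column, a lookup from address to line number is enough to know where each arrow should end.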
If I had to work with assembly output like that, I would be begging for something like the sort of annotation this script provides ;)
The LLVM version of the script is available for download here, and I assume that a diff between that and the vanilla objdump version would make an excellent starting point for anyone wanting to adapt the script to some other disassembly dialect.