-
Notifications
You must be signed in to change notification settings - Fork 10
/
tool_ideas.text
92 lines (68 loc) · 3.42 KB
/
tool_ideas.text
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
#section Decompiling for validation
Someone recently bragged about how cool common lisp
compilers are. The guy had a snippet of intel disassembly
dump for some trivial code. I thought "oh that's short" and
begun deciphering what the dump means.
After checking for instruction references a while, I about
recognized where it does an addition and modulo operation.
The addition was implemented with LEA -insruction and modulo
used IDIV. There were lots of other things I didn't
understood well about that stream of bytes.
Metacognition kicks in and asks "what am I doing?" Oh I see
I am trying to figure out what's the effect of each
instruction so that I can track which computations take
place. "can I automate it?" Yes I guess. "Is there a gain in
figuring it out?" Definitely, the results of doing this tool
could be used for analysis, and for validifying or
even for generating compilers.
To automate something like this, I would need a disassembly
table that also contain a chart, program or expression of
some sort for every single instruction. I need enough
information to write a simulator for an x86 instruction set.
So I started to ask around and seek if there's something
like that. Here's what I have found so far.
#url http://ref.x86asm.net/
Good old x86 opcode and instruction reference. Whole lot of
intel instruction set encoding tables you could use to
implement an assembler and a disassembler.
It would serve as a start if nothing else.
#url http://udis86.sourceforge.net/manual/
Seems to be a nice and simple C library to disassemble code.
It's not enough but it could be nice to try it. It appears
simple enough that I could already generate headers for it
to use it in Lever, but lets wait for a small moment until I
get the new C header generator out.
#url http://www.cl.cam.ac.uk/~acjf3/arm/
This is very weird one. But it'd seem to attempt to specify
several instruction sets with a common language. The results
may be really useful if I can use them.
Well.. I got superb parsing tools so I definitely can. But
is the information correct or sufficient?
#url http://www.capstone-engine.org/
Capstone seems like it's got all the chops to do exactly
what I want, and I indeed found few decompilers using it to
produce C code. Producing C code is far above of what I
desired to do in the first place, so the chances are this
will provide me all the tools to do exactly what I need.
#url http://www.msreverseengineering.com/research
It appears to be a some kind of solver that already does lot
of what I'm interested about. I might like to take a look.
#url http://angr.io/
Whee. A python framework for analysing binaries. I'll
definitely give this a spin some time!
#break
Native machine code isn't the only interesting target here.
There's also the SPIR-V and WebAssembly.
I know roughly how the SPIR-V instructions are encoded, and
have vague idea how it forms programs. I may have to play
with it some more to extend an understanding.
WebAssembly seem to be coming from very similar directions
as SPIR-V. It is currently too limited for me to use as a
target, but it is starting to seem promising and getting to
go into the right direction for me to use it. It's got
three-layered design to provide greatly compressed binaries
in not-so-complex packaging.
Guessing it's a weekend project to check one thing out in
this list. I'll see when I get to time it. There are whole
bunch of things to do, and lots of time left before I must
concentrate on this.