BinaryNinja
Howdy y'all,
Binary Ninja is the latest serious competitor among reverse engineering CAD tools. It might be more expensive than Radare2, but it's cheaper than IDA Pro and includes a very smooth UI and a clean, modern API. Unfortunately, Binary Ninja doesn't have dialogs for loading raw images to arbitrary addresses; fortunately, it's very easy to write a plugin that places things at the right addresses. It's also easy to write plugins for importing symbols.
The plugin described in this article was written in a weekend as a way
to learn Binary Ninja scripting. It can be found, perhaps with some
cleanup, as md380tools/playground/binaryninja in the repo.
git clone https://github.com/travisgoodspeed/md380tools
Before we get started, I need to remind you of a sore point in ARM reversing: there are multiple instruction sets! Even though the MD380's firmware only uses the Thumb2 instruction set, which mixes 16-bit and 32-bit wide instructions, Binary Ninja also supports the classic ARM instruction set, in which every instruction is 32 bits wide.
An ARM CPU uses the least significant bit of the program counter to
indicate the instruction set, but ignores it when doing a fetch. An
ARM function at 0x0800F000 would be called as 0x0800F000, but a
Thumb2 function at the same address would be called at 0x0800F001
instead.
Binary Ninja does it both ways. In the GUI, you must hit the P key,
which defines a function, on the byte after the start of the
function. So just like the CPU, to define a Thumb2 function at
0x0800F000, you must navigate to 0x0800F001 and hit P.
In the scripting environment, it does things the other way. There,
you must manually specify the Thumb2 instruction set and then give the
even address, 0x0800F000. If you provide the odd address that the
GUI expects, it will run off the rails and try to interpret unaligned
instructions!
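The convention above fits in a couple of lines of plain Python. This is only a sketch for illustration: the helper names split_call_target() and call_target() are made up here and are not part of Binary Ninja or MD380Tools.

```python
def split_call_target(target):
    """Split a branch target into (even address, instruction set name).

    Hypothetical helper: the low bit selects Thumb2, and the address
    actually fetched from is the target with that bit cleared.
    """
    if target & 1:
        return target & ~1, "thumb2"
    return target, "arm"


def call_target(address, thumb):
    """Build the value a caller would branch to for a function whose
    first instruction sits at the (even) address given."""
    return (address | 1) if thumb else address


if __name__ == "__main__":
    even, mode = split_call_target(0x0800F001)
    print("0x%08x is %s code at 0x%08x" % (0x0800F001, mode, even))
```

In these terms, the GUI wants the value from call_target(), while the scripting API wants the even address from split_call_target() plus an explicit choice of instruction set.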
Unlike IDA, Hopper, and Radare2, the Binary Ninja GUI doesn't provide
a loader for raw binary files. At best, you can define functions in a
Raw view at address 0x00000000, but anything more sophisticated
requires a plugin. Luckily, the plugin API is very clean and we can
toss together a loader plugin from the examples in short order.
Loader plugins work by extending the BinaryView class to present a
view of the executable. You can find an example of this in the
nds.py example loader for Nintendo DS ROM images, and it was this
example that I forked for an MD380 loader.
from binaryninja.binaryview import BinaryView
from binaryninja.architecture import Architecture
from binaryninja.log import log_error
from binaryninja.types import Symbol
from binaryninja.enums import SymbolType, SegmentFlag
from binaryninja import PluginCommand
from binaryninja.interaction import get_open_filename_input
import struct
import traceback

class MD380View(BinaryView):
    """This class implements a view of the loaded firmware, for any image
    that might be a firmware image for the MD380 or related radios loaded
    to 0x0800C000.
    """

    def __init__(self, data):
        BinaryView.__init__(self, file_metadata = data.file, parent_view = data)
        self.raw = data

    @classmethod
    def is_valid_for_data(self, data):
        hdr = data.read(0, 0x160)
        if len(hdr) < 0x160 or len(hdr)>0x100000:
            return False
        if ord(hdr[0x3]) != 0x20:
            # First word is the initial stack pointer, must be in SRAM around 0x20000000.
            return False
        if ord(hdr[0x7]) != 0x08:
            # Second word is the reset vector, must be in Flash around 0x08000000.
            return False
        return True

    def init_common(self):
        self.platform = Architecture["thumb2"].standalone_platform
        self.hdr = self.raw.read(0, 0x100001)
        print("Loaded %d bytes." % len(self.hdr))

    def init_thumb2(self, adr=0x08000000):
        try:
            self.init_common()
            self.thumb2_offset = 0
            self.arm_entry_addr = struct.unpack("<L", self.hdr[0x4:0x8])[0]
            self.thumb2_load_addr = adr #struct.unpack("<L", self.hdr[0x38:0x3C])[0]
            self.thumb2_size = len(self.hdr)

            # Add segment for SRAM, not backed by file contents
            self.add_auto_segment(0x20000000, 0x20000, #128K at address 0x20000000.
                                  0, 0,
                                  SegmentFlag.SegmentReadable | SegmentFlag.SegmentWritable | SegmentFlag.SegmentExecutable)
            # Add segment for TCRAM, not backed by file contents
            self.add_auto_segment(0x10000000, 0x10000, #64K at address 0x10000000.
                                  0, 0,
                                  SegmentFlag.SegmentReadable | SegmentFlag.SegmentWritable)
            #Add a segment for this Flash application.
            self.add_auto_segment(self.thumb2_load_addr, self.thumb2_size,
                                  self.thumb2_offset, self.thumb2_size,
                                  SegmentFlag.SegmentReadable | SegmentFlag.SegmentExecutable)

            #Define the RESET vector entry point, stripping the Thumb bit.
            self.define_auto_symbol(Symbol(SymbolType.FunctionSymbol,
                                           self.arm_entry_addr&~1, "RESET"))
            self.add_entry_point(self.arm_entry_addr&~1)
            return True
        except:
            log_error(traceback.format_exc())
            return False

    def perform_is_executable(self):
        return True

    def perform_get_entry_point(self):
        # Strip the Thumb bit here too, matching add_entry_point() above.
        return self.arm_entry_addr&~1

class MD380AppView(MD380View):
    """MD380 Application loaded to 0x0800C000."""
    name = "MD380"
    long_name = "MD380 Flash Application"

    def init(self):
        return self.init_thumb2(0x0800c000)

MD380AppView.register()
You can load this plugin by dropping it into ~/.binaryninja/plugins/
on Linux, or into the equivalent plugin directory on your platform.
Then run binaryninja D013.020.img to load the firmware.
You can select the viewer plugin from the bottom-right of the screen,
which will auto-analyze the RESET vector as it first loads.
Unfortunately, the RESET vector only reveals a dozen or so functions, so we'll continue with the interrupt handlers.
The raw firmware has no real header, but it does begin with a series
of 32-bit little-endian pointers: the interrupt vector table. We can
extend the init_thumb2() function of our viewing class to include
those interrupt handler addresses, and to give them names that
indicate their status.
            #Define other entries of the Interrupt Vector Table (IVT)
            for ivtindex in range(8,0x184+4,4):
                ivector=struct.unpack("<L", self.hdr[ivtindex:ivtindex+4])[0]
                if ivector>0:
                    #Create the symbol, then the entry point.
                    self.define_auto_symbol(Symbol(SymbolType.FunctionSymbol,
                                                   ivector&~1, "vec_%x"%ivector))
                    self.add_function(ivector&~1)
This grows the identified functions to a few dozen, but many are useless infinite loops that just branch to themselves. We'll need to load symbols from the Radare2 annotations or the GNU LD linker scripts used in compiling the code.
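If you want to poke at the vector table without starting Binary Ninja, the same parsing can be sketched in plain Python. The function names and the synthetic test image below are made up for illustration. The table length of 0x188 bytes simply mirrors the loop bounds used in the plugin, and the sanity check mirrors is_valid_for_data().

```python
import struct


def parse_vector_table(image, count=0x188 // 4):
    """Return (initial_sp, reset_vector, handlers) from the start of a raw image.

    handlers lists (table_offset, even_address) for every non-zero entry
    after the reset vector. Hypothetical helper, not part of the plugin.
    """
    if len(image) < count * 4:
        raise ValueError("image too short for a %d-entry vector table" % count)
    words = struct.unpack("<%dL" % count, image[:count * 4])
    initial_sp, reset = words[0], words[1]
    handlers = [(i * 4, w & ~1) for i, w in enumerate(words) if i >= 2 and w != 0]
    return initial_sp, reset, handlers


def looks_like_md380(image):
    """Same test as the loader: stack pointer in SRAM, reset vector in Flash."""
    sp, reset, _ = parse_vector_table(image, count=2)
    return (sp >> 24) == 0x20 and (reset >> 24) == 0x08


if __name__ == "__main__":
    # Synthetic image with invented values, only to exercise the parser.
    image = struct.pack("<4L", 0x20001000, 0x0800C101, 0x0800C201, 0)
    image += b"\x00" * 0x200
    sp, reset, handlers = parse_vector_table(image)
    print("SP=0x%08x RESET=0x%08x" % (sp, reset))
    for offset, address in handlers:
        print("vector at +0x%03x -> 0x%08x" % (offset, address))
```

Pointing it at a real unwrapped image is a matter of reading the file in binary mode and passing the bytes in.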
In md380tools/annotations you will find annotations of the firmware
that were produced in Radare2, and designed for use in the same.
You'll note that each line is just an r2 command to define a flag,
which is r2's term for a named address.
x270% cat annotations/d13.020/flash.r | head
f md380_menu_id @ 0x2001e915
f md380_menu_mem_base @ 0x2001b274
f md380_menu_memory @ 0x2001d5cc
f mn_editbuffer_poi @ 0x200049fc
f md380_menu_edit_buf @ 0x2001cb9a
f md380_menu_0x2001d3f0 @ 0x2001e946
f md380_menu_depth @ 0x20004acc
f md380_menu_0x2001d3ef @ 0x2001e945
f md380_menu_0x2001d3f1 @ 0x2001e947
x270%
Some have a slightly different format, defining both the function name and its size.
af+ 0x0809a4c0 4 gfx_font_small
Using them in Binary Ninja is no problem. In the same plugin file, we'll just define a plugin that opens a file dialog, then shotgun parses that file for all R2 definitions, importing them to the local database.
def importr2symbols(bv,filename):
    """Janky shotgun parser to import Radare2 symbols file to Binary Ninja."""
    f=open(filename,"r")
    for l in f:
        words=l.strip().split()
        try:
            if words[0][0]=="#":
                pass
            elif words[0]=="f" and words[2]=="@":
                #f name @ 0xDEADBEEF
                name=words[1]
                adrstr=words[3]
                adr=int(adrstr.strip(";"),16)
                #Functions are in Flash
                if adr&0xF8000000==0x08000000:
                    bv.define_auto_symbol(Symbol(SymbolType.FunctionSymbol, adr&~1, name))
                    bv.add_function(adr&~1)
                    print("Imported function symbol %s at 0x%x"%(name,adr))
                #Data in SRAM (0x20000000) or TCRAM (0x10000000), possibly unaligned.
                elif adr&0xC0000000==0:
                    bv.define_auto_symbol(Symbol(SymbolType.DataSymbol, adr, name))
                    print("Imported data symbol %s at 0x%x"%(name,adr))
            elif (words[0]=="af+" or words[0]=="f") and int(words[2],16)>0:
                #af+ 0xDEADBEEF size name
                name=words[3]
                adrstr=words[1]
                adr=int(adrstr.strip(";"),16)
                #Functions are in Flash
                if adr&0xF8000000==0x08000000:
                    bv.define_auto_symbol(Symbol(SymbolType.FunctionSymbol, adr&~1, name))
                    bv.add_function(adr&~1)
                    print("Imported function symbol %s at 0x%x"%(name,adr))
                #Data in SRAM (0x20000000) or TCRAM (0x10000000), possibly unaligned.
                elif adr&0xC0000000==0:
                    bv.define_auto_symbol(Symbol(SymbolType.DataSymbol, adr, name))
                    print("Imported data symbol %s at 0x%x"%(name,adr))
            else:
                print("Ignoring: %s"%words)
        except:
            if len(words)>3:
                print("Ignoring: %s\n"%words)
                #log_error(traceback.format_exc())
    f.close()

def md380r2symbols(view):
    """This loads an MD380Tools symbols file in Radare2 format."""
    filename=get_open_filename_input("Select Radare2 symbols file from MD380Tools.")
    if filename:
        print("Opening: %s"%filename)
        importr2symbols(view,filename)
    else:
        print("Aborting.")

PluginCommand.register("Load MD380 R2 Symbols",
                       "Load Radare2 symbols from MD380Tools",
                       md380r2symbols)
The plugin is invoked by right-clicking on the background.
And sure enough, we have our symbols.
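The two line formats handled by the importer can also be parsed without Binary Ninja, which is handy for checking an annotation file before loading it. The following is a sketch only: parse_r2_line() is a made-up name, and reading the af+ size field as hexadecimal is an assumption carried over from the plugin code rather than something checked against r2.

```python
def parse_r2_line(line):
    """Return (name, address, size) for a recognized r2 line, else None.

    size is None for plain 'f name @ addr' flags. Hypothetical helper.
    """
    words = line.strip().split()
    if not words or words[0].startswith("#"):
        return None
    try:
        if words[0] == "f" and len(words) >= 4 and words[2] == "@":
            # f name @ 0xDEADBEEF
            return words[1], int(words[3].rstrip(";"), 16), None
        if words[0] == "af+" and len(words) >= 4:
            # af+ 0xDEADBEEF size name  (size base is an assumption)
            return words[3], int(words[1], 16), int(words[2], 16)
    except ValueError:
        # Not a number where we expected one; treat the line as unknown.
        return None
    return None


if __name__ == "__main__":
    for sample in ("f md380_menu_id @ 0x2001e915",
                   "af+ 0x0809a4c0 4 gfx_font_small",
                   "# a comment"):
        print("%r -> %r" % (sample, parse_r2_line(sample)))
```

Keeping the parsing separate from the Binary Ninja calls also makes it easier to see which lines a file contains that the importer would silently skip.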
Our Radare2 symbols are great for their completeness, but they require
manual identification. Luckily, MD380Tools includes a custom tool
called symgrate that identifies identical Thumb2 functions between
Flash revisions. Let's load the symbols for firmware version D03.008,
to which MD380Tools has never been ported.
First, use the symbols migration tool to generate the symbols.
x270% cd md380tools/symbols
x270% make
189 symbols_d02_034
189 symbols_d03_008
189 symbols_d03_020
189 symbols_d13_009
189 symbols_d13_014
189 symbols_d13_020
189 symbols_s03_012
189 symbols_s13_012
189 symbols_s13_020
1701 total
These are temporary symbol files that will be overwritten.
Move them elsewhere before editing.
x270% head symbols_d03_008
/* Symbols for ../firmware/unwrapped/D003.008.img imported from ../firmware/unwrapped/D013.020.img. */
md380_create_main_menu_entry = 0x0800c189; /* 1024 byte match */
md380_create_menu_entry = 0x0800c72f; /* 86 byte match */
gfx_drawtext10 = 0x0800ded9; /* 26 byte match */
gfx_drawtext = 0x0800def7; /* 56 byte match */
draw_datetime_row = 0x0800df1b; /* 20 byte match */
draw_zone_channel = 0x0800e538; /* 10 byte match */
md380_menu_entry_back = 0x0800fc85; /* 40 byte match */
Create_Menu_Utilies = 0x080134a1; /* 520 byte match */
md380_menu_entry_programradio = 0x080136c1; /* 936 byte match */
x270%
Then extend the Binary Ninja plugin to handle the GNU LD format used by this file. We're shotgun parsing it, so arithmetic operations won't be handled.
def importldsymbols(bv,filename):
    """Janky parser to import a GNU LD symbols file to Binary Ninja."""
    f=open(filename,"r")
    for l in f:
        words=l.strip().split()
        try:
            name=words[0]
            adrstr=words[2]
            adr=int(adrstr.strip(";"),16)
            #Function symbols are odd address in Flash.
            if adr&0xF8000001==0x08000001:
                bv.define_auto_symbol(Symbol(SymbolType.FunctionSymbol, adr&~1, name))
                bv.add_function(adr&~1)
                print("Imported function symbol %s at 0x%x"%(name,adr))
            #Data symbols are in SRAM or TCRAM with unpredictable alignment.
            elif adr&0xC0000000==0:
                bv.define_auto_symbol(Symbol(SymbolType.DataSymbol, adr, name))
                print("Imported data symbol %s at 0x%x"%(name,adr))
            else:
                print("Uncategorized adr=0x%08x."%adr)
        except:
            # Print warnings when our janky parser goes awry.
            if len(words)>0 and words[0]!="/*" and words[0]!="*/":
                print("#Warning in: %s\n"%words)
                log_error(traceback.format_exc())
    f.close()

def md380ldsymbols(view):
    """This loads an MD380Tools symbols file in GNU LD format."""
    filename=get_open_filename_input("Select GNU LD symbols file from MD380Tools.")
    if filename:
        print("Opening: %s"%filename)
        importldsymbols(view,filename)
    else:
        print("Aborting.")

PluginCommand.register("Load MD380 LD Symbols",
                       "Load GNU LD symbols from MD380Tools",
                       md380ldsymbols)
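For comparison, here is a slightly less janky take on the same parsing, using a regular expression and no Binary Ninja calls. It is a sketch under the same limitation: only plain "name = 0xADDRESS;" assignments are recognized, and anything with arithmetic is skipped. The names LD_ASSIGNMENT and parse_ld_line() are invented for this example.

```python
import re

# Matches lines such as:  gfx_drawtext = 0x0800def7; /* 56 byte match */
LD_ASSIGNMENT = re.compile(r"^\s*([A-Za-z_]\w*)\s*=\s*(0[xX][0-9a-fA-F]+)\s*;")


def parse_ld_line(line):
    """Return (name, address) for a simple LD symbol assignment, else None."""
    match = LD_ASSIGNMENT.match(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2), 16)


def is_thumb_function(address):
    """Same test as the plugin: an odd address inside the Flash range."""
    return address & 0xF8000001 == 0x08000001


if __name__ == "__main__":
    sample = "md380_create_menu_entry = 0x0800c72f; /* 86 byte match */"
    name, address = parse_ld_line(sample)
    print("%s at 0x%08x, function=%s" % (name, address, is_thumb_function(address)))
```

Note that an even Flash address, such as the draw_zone_channel entry in the listing above, fails the function test and falls through to the "Uncategorized" branch of the plugin.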
Then unwrap that firmware revision and load it up in Binary Ninja.
x270% cd md380tools/firmware
x270% make unwrapped/D003.008.img
"make" -f Makefile_orig unwrapped/D003.008.img
make[1]: Entering directory '/home/travis/svn/md380tools/firmware'
../md380-fw --unwrap bin/D003.008.bin unwrapped/D003.008.img
DEBUG: reading "bin/D003.008.bin"
INFO: base address 0x800c000
INFO: length 0xf3000
DEBUG: writing "unwrapped/D003.008.img"
make[1]: Leaving directory '/home/travis/svn/md380tools/firmware'
x270% binaryninja unwrapped/D003.008.img
The auto-analyzer will identify the interrupt table, and then you can
right click and select Load MD380 LD Symbols to open
symbols_d03_008. We now have an annotated binary for a new firmware
revision without having manually reverse engineered a single function!
By this point, you should see that the symbols from the MD380Tools project are easily accessible, and easy to use with other tools. You should also have learned that Binary Ninja is pretty damned easy to script; none of this code is particularly advanced.
I took the lazy way out and included assumptions about the MD380's
architecture, such as that firmware is loaded to 0x0800C000 instead
of 0x08000000 or 0x0, or that the code is all Thumb2 with no
native ARM. With a bit of polish, you could add generic support for
embedded ARM.
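One possible first step toward that polish would be to move the hard-coded addresses into a table, so that a different chip only needs a different table. The sketch below is an assumption about how that might look, not code from the plugin. The Region type and find_region() are invented names, and the Flash length is taken from the unwrap log of D003.008 shown earlier, so other images may differ.

```python
from collections import namedtuple

# Hypothetical description of one memory region of the target.
Region = namedtuple("Region", "name base size executable")

# Values taken from the segments the loader defines for the MD380.
MD380_REGIONS = [
    Region("flash_app", 0x0800C000, 0xF3000, True),
    Region("tcram", 0x10000000, 0x10000, False),
    Region("sram", 0x20000000, 0x20000, True),
]


def find_region(address, regions=MD380_REGIONS):
    """Return the region containing address, or None if it is unmapped."""
    for region in regions:
        if region.base <= address < region.base + region.size:
            return region
    return None


if __name__ == "__main__":
    for address in (0x0800C189, 0x2001E915, 0x08000000):
        region = find_region(address)
        print("0x%08x -> %s" % (address, region.name if region else "unmapped"))
```

A loader built on such a table could loop over it to create segments, and the symbol importers could use find_region() in place of their bit-mask tests.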