WIP: Add heap tracking #25

Open
wants to merge 105 commits into
base: master
a54e97c
pintool: Add callstack and sha1 files
aewag Mar 16, 2022
19eca92
pintool: Fix for new pin versions
aewag Mar 18, 2022
9f2637e
pintool: Make clang-format happy
aewag Mar 18, 2022
3477dc1
pintool: Add debug prints
aewag Mar 22, 2022
0314479
pintool: Fix RecordFreeBefore args
aewag Mar 22, 2022
567e35e
pintool: Fix RTN_InsertCall args
aewag Mar 22, 2022
2fa8524
pintool: Refactor *allocWrapper to Before & After
aewag Mar 22, 2022
bbe1fff
pintool: Remove unused code in doalloc()
aewag May 17, 2022
f6035ca
pintool: Refactor to use LA for heap address only
aewag Mar 22, 2022
da1b763
pintool: Refactor getLogicalAddress for readability
aewag May 17, 2022
38cf0af
pintool: Fix LA recovery from known objects
aewag Apr 28, 2022
5dd32ac
pintool: Refactor the use of LAs
aewag Apr 28, 2022
f811935
pintool: Add assert for heap tracking
aewag May 17, 2022
463e069
pintool: Refactor calculate_sha1_hash()
aewag May 11, 2022
61d594f
pintool: Remove unused heapcache
aewag May 12, 2022
1d3a8aa
pintool: Refactor doalloc()
aewag May 12, 2022
95e2844
pintool: Remove item from heap if freed
aewag May 12, 2022
0180151
pintool: Refactor dofree()
aewag May 12, 2022
55c397a
pintool: Refactor doalloc()
aewag May 12, 2022
939cf1a
analysis: Delete apply of masks for heap
aewag Mar 23, 2022
fd9d3b5
Set debug print mostly to level 1
aewag May 13, 2022
d14d023
pintool: Refactor to demangle symbols
aewag Jun 29, 2022
cf5e64d
cryptolib: Add selftest-heap
aewag Apr 28, 2022
21bd603
GHA: Add clang-format stage
aewag May 16, 2022
56f636b
Make clang-format happy
aewag Jun 29, 2022
c857bab
alloc: Add MREMAP and MUNMAP
aewag Jul 13, 2022
6f4cb70
pintool: Fix grep call in execute_commands()
aewag Jul 27, 2022
a168723
pintool: Move printheap() up
aewag Jul 27, 2022
67cd7d8
pintool: Remove never executed statement
aewag Jul 27, 2022
9969847
pintool: calculate_sha1_hash() wrap debug statements into one context
aewag Jul 27, 2022
67c49f9
pintool: main() wrap debug statements into one context
aewag Jul 27, 2022
ee535fb
pintool: Print always debug statements of wrapper
aewag Jul 27, 2022
5801f72
pintool: instrumentMainAndAlloc() wrap debug statements into one context
aewag Jul 27, 2022
e30183f
pintool: Add tracking of sections in instrumentMainAndAlloc()
aewag Jul 27, 2022
8e4790c
pintool: Heap tracking with syscalls
aewag Jan 20, 2023
d66865a
pintool/addrtrace: Add Hook and addr LU for brk
aewag Jan 25, 2023
6e34c8d
pintool/addrtrace: Add print_proc_map
aewag Jan 25, 2023
eeeee2a
pintool/addrtrace: Fix indents
aewag Jan 25, 2023
997ee5c
pintool/addrtrace: Fix log in case of corrupted heap
aewag Jan 25, 2023
9cefb6e
pintool/addrtrace: Print heap and proc_map only if in debug
aewag Jan 25, 2023
2ff9c73
pintool/addrtrace: Allow nested [m,mun]map calls
aewag Jan 25, 2023
873adcd
pintool/addrtrace: Allow overlapping MMAP object within heap vector
aewag Jan 25, 2023
435acc3
pintool/addrtrace: Print heap only in debug
aewag Jan 26, 2023
8cc0e4f
pintool/addrtrace: Rm commented out getLogicalAddress calls
aewag Jan 26, 2023
72686ad
pintool/addrtrace: Allow brk access only from owner
aewag Jan 27, 2023
00a7c49
pintool/addrtrace: Rm obsolete startaddr from imgobj_t
aewag Jan 27, 2023
25b5652
pintool/addrtrace: Use imgobj_t within program_break_obj_t
aewag Jan 27, 2023
91f0102
pintool/addrtrace: Move Image tracking section up
aewag Jan 27, 2023
14816cc
pintool/addrtrace: Rename imghash to hash
aewag Jan 27, 2023
f28f455
pintool/addrtrace: Use imgobj_t for stack tracking
aewag Jan 27, 2023
038a273
pintool/addrtrace: Rm virtual address prepare for stack and heap
aewag Jan 27, 2023
66e9b61
pt/addrtrace: Rm obsolete map objects
aewag Jan 27, 2023
462cf64
pt/addrtrace: Use imgobj_t for heap tracking
aewag Jan 27, 2023
6efd9a0
pt/addrtrace: Refactor getLogicalAddress to log only for debug
aewag Jan 27, 2023
42e0c60
pt: Make clang-format happy
aewag Jan 27, 2023
0e86009
pt/addrtrace: Fix assert for brk range check
aewag Jan 30, 2023
9665bb7
pt/addrtrace: Fix cmp images
aewag Jan 31, 2023
d42cc28
pt/addrtrace: Restructure logging
aewag Feb 2, 2023
41ab66c
pt/addrtrace: Allow for multiple brk sections
aewag Feb 3, 2023
8a270d5
pt/addrtrace: Priotize and structure logging
aewag Feb 3, 2023
b1d1099
pt/addrtrace: Add callstack printing
aewag Feb 3, 2023
5e6c99a
pt/addrtrace: Print heap and proc_map during forced exit
aewag Feb 3, 2023
3ff7aad
pt/addrtrace: Always trace Syscalls
aewag Feb 3, 2023
62e9b46
pt/addrtace: Donot doubly trace Syscalls
aewag Feb 3, 2023
0c3a689
pt/addrtrace: Disallow unassignable virt_addr in fast_recording mode
aewag Feb 3, 2023
07fb97b
pt/addrtrace: Repriotize logging
aewag Feb 6, 2023
0d95f50
pt/addrtrace: Track MREMAP as syscall
aewag Feb 6, 2023
701056c
pt/addrtrace: Minor adaption to logging
aewag Feb 6, 2023
2950e1f
pt/addrtrace: Modify doalloc to do realloc in-place
aewag Feb 6, 2023
53935f6
pt/addrtrace: Minor in doalloc
aewag Feb 6, 2023
d28d2e0
pt/addrtrace: doalloc allow nested mremap object
aewag Feb 6, 2023
f0a7e66
pt/addrtrace: dofree during realloc only if object was moved
aewag Feb 6, 2023
19dac02
pt/addrtrace: Remove unused options (heapData, logaddr, allocmap)
aewag Feb 7, 2023
f380255
pt/addrtrace: Remove unused imgdata unkown
aewag Feb 7, 2023
dd9b093
pt/addrtrace: Do not print id for heap
aewag Feb 7, 2023
b1555f5
pt/addrtrace: Rework handling of allocation operations
aewag Feb 7, 2023
cfbede8
pt/addrtrace: Simplify RecordFree handling
aewag Feb 7, 2023
d2cdc67
pt/addrtrace: Pass (re)alloc state ptr to doalloc
aewag Feb 7, 2023
c14cfff
pt/addrtrace: Remove force from syscall handlers
aewag Feb 7, 2023
938c7ff
pt/addrtrace: Low prio debug if virt_addr not found
aewag Feb 9, 2023
3511bbc
pt/addrtrace: Get callsite offset for [m,re]alloc
aewag Feb 9, 2023
6d07da1
pt/addrtrace: Change logging in memory handler
aewag Feb 9, 2023
4fc0186
pt/addrtrace: Add print_allocmap
aewag Feb 14, 2023
b67ff88
pt/call-stack: Log demangled function name
aewag Feb 14, 2023
d0ef7c1
pt/addrtrace: Repriotize logging
aewag Feb 14, 2023
e556513
pt/addrtrace: Restructure instrumentMainAndAlloc
aewag Feb 15, 2023
bba6024
pt/addrtrace: Use counter for unique log addrs
aewag Feb 16, 2023
b2f983d
pt/addrtrace: Don't use obj size for log addr hash
aewag Feb 16, 2023
c23d083
pt/addrtrace: Remove callsite from memobj_t
aewag Feb 16, 2023
79640f1
pt/addrtrace: Simplify getcallstack routine
aewag Feb 16, 2023
893b67e
pt/addrtrace: Always record allocations
aewag Feb 16, 2023
ee7e1f0
pt/addrtrace: Don't store sections in imgvec
aewag Feb 16, 2023
17d8dba
analysis/datastub/SymbolInfo: Get debug file using gdb
aewag Feb 17, 2023
f34b93a
pt/addrtrace: Disable tracing within alloc rtn
aewag Feb 17, 2023
8d9991c
pt/addrtrace: Undecorate symbol name for rtn list
aewag Feb 17, 2023
4536f02
pt/addrtrace: Remove callsite from [re]alloc_state_t
aewag Feb 17, 2023
be3634c
pt/addrtrace: Add C++ vector tracing
aewag Feb 27, 2023
1843c0a
pt/addrtrace: Minor changes debug printing
aewag Feb 28, 2023
1ed9600
pt/addrtrace: Make clang-format happy
aewag Feb 28, 2023
7657f63
analysis/datastub: Use GDB to resolve not found symbols
aewag Feb 28, 2023
2a2f36a
analyze/datastub/printer: Minor restructure
aewag Feb 28, 2023
44c6a59
analyze/datastub/printer: Sort entries within XML
aewag Feb 28, 2023
c4a25ba
pt/addrtrace: Handle nullptr in dofree
aewag Mar 2, 2023
b0e8c1c
WIP
aewag Mar 2, 2023
7481c8b
pt: Adapt Makefile to include several sources
aewag Mar 2, 2023
8 changes: 8 additions & 0 deletions .github/workflows/lint.yml
@@ -30,3 +30,11 @@ jobs:
      - run: |
          black --version
          black --check --diff .
  clang-format:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: sudo apt-get install clang-format
      - run: |
          clang-format -style="{IndentWidth: 4}" --dry-run --Werror -i *.{c*,h*,H}
        working-directory: pintool
27 changes: 9 additions & 18 deletions analysis/analyze.py
@@ -67,7 +67,6 @@
SPLeak,
TraceQueue,
Type,
MaskType,
Leak,
)
import multiprocessing
@@ -320,9 +319,6 @@ def iterate_queue(files, fast=True):
assert e1.data != 0
assert e2.data != 0
assert queues[0].callstack == queues[1].callstack
if Type(e1.type) in (Type.HREAD, Type.HWRITE):
e1.data &= 0x00000000FFFFFFFF
e2.data &= 0x00000000FFFFFFFF
if e1.data != e2.data:
report_dataleak(queues[0].callstack, e1, e2)
else:
@@ -333,11 +329,6 @@ def iterate_queue(files, fast=True):
# Mixture of heap and non-heap read/write. Maybe, heap tracking is imprecise
# We require that both elements are either (h)read or (h)write
debug(0, "Imprecise heap tracking @ %08x", (e1.ip))
# assert((e1.type | MaskType.HEAP.value) == (e2.type | MaskType.HEAP.value))
if (e1.type | MaskType.HEAP.value) > 0:
e1.data &= 0x00000000FFFFFFFF
if (e2.type | MaskType.HEAP.value) > 0:
e2.data &= 0x00000000FFFFFFFF
report_dataleak(queues[0].callstack, e1, e2)
else:
# This should never happen. We miss some conditional branches in the code
@@ -610,7 +601,7 @@ def generic_leakage_test(fixed, random):
debug(1, "")

# iterate over leaks
debug(0, "Got %d trace differences.", (len(fixedleaks)))
debug(1, "Got %d trace differences.", (len(fixedleaks)))
sys.stdout.flush()
for (idx, (fl, rl)) in enumerate(zip(fixedleaks, randomleaks)):
msg = {"warning": "", "leak": ""}
@@ -638,7 +629,7 @@ def generic_leakage_test(fixed, random):
if _glt_sanity_check_abort([len(fl.evidence), len(rl.evidence)]):
msg["warning"] += f" warning: {len(fl.evidence)} evidences for fixed\n"
msg["warning"] += f" warning: {len(rl.evidence)} evidences for random\n"
debug(0, msg["warning"])
debug(1, msg["warning"])
continue

(fnum, fnum_uniq, fdic) = _glt_gather_information(fl)
@@ -760,7 +751,7 @@ def specific_leakage_test(random, callback, keys, LeaksOnly=True, mp=False):

# process leaks
randomleaks = extract_leakdiff_to_array(random, LeaksOnly=LeaksOnly)
debug(0, "Got %d leaks.", (len(randomleaks)))
debug(1, "Got %d leaks.", (len(randomleaks)))
sys.stdout.flush()

# convert keys with callback
@@ -806,7 +797,7 @@ def specific_leakage_test(random, callback, keys, LeaksOnly=True, mp=False):

# sanity check
if len(rl.evidence) == 0:
debug(0, "Warning: no evidences")
debug(1, "Warning: no evidences")
continue

# gather information -- leaks
@@ -873,7 +864,7 @@ def specific_leakage_test(random, callback, keys, LeaksOnly=True, mp=False):
assert len(X_labels) == X.shape[1]

if X.shape[0] != len(keys):
debug(0, "Warning: callback returned wrong matrix!")
debug(1, "Warning: callback returned wrong matrix!")
continue

######
@@ -941,9 +932,9 @@ def report_spleak(rl, cleak):
# Print progress
if len(randomleaks) > 100:
if (rli % int(len(randomleaks) / 10)) == 0:
debug(0, "[Progress] %6.2f%%", ((rli * 100.0) / len(randomleaks)))
debug(1, "[Progress] %6.2f%%", ((rli * 100.0) / len(randomleaks)))
else:
debug(0, "[Progress] %d/%d", (rli + 1, len(randomleaks)))
debug(1, "[Progress] %d/%d", (rli + 1, len(randomleaks)))
sys.stdout.flush()
if mp:
pool_size = multiprocessing.cpu_count()
@@ -1005,7 +996,7 @@ def precompute_single(input):

def get_rdc_single(N, alpha):
limit = rdctest.RDC.rdc_sigthres(N, alpha)
debug(0, "RDC_limit=%f for N=%d, alpha=%f", (limit, N, alpha))
debug(1, "RDC_limit=%f for N=%d, alpha=%f", (limit, N, alpha))


"""
@@ -1126,7 +1117,7 @@ def collapse_leaks(leaks, collapse_cfleaks, granularity, resfilter=""):
if len(resfilter) > 0:
filterarr = resfilter.replace('"', "").replace("'", "").split(";")
for f in filterarr:
debug(0, "Filtering results for: " + f)
debug(1, "Filtering results for: " + f)
if mask == -1 and not collapse_cfleaks:
# Nothing to collapse
return leaks
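
The `iterate_queue` hunks above drop the step that truncated heap event data to its lower 32 bits before the two traces were compared. The following is a minimal sketch of what that truncation does to a comparison. The helper names and the sample values are hypothetical and chosen only for illustration; they are not taken from this PR.

```python
# Hypothetical illustration: helper names and values are assumptions,
# not code from this PR.
LOW32_MASK = 0x00000000FFFFFFFF


def masked_equal(a: int, b: int) -> bool:
    """Compare two values after discarding their upper 32 bits (old behaviour)."""
    return (a & LOW32_MASK) == (b & LOW32_MASK)


def plain_equal(a: int, b: int) -> bool:
    """Compare two values unchanged (behaviour once the mask is removed)."""
    return a == b


# Two values that differ only in their upper 32 bits.
x = 0x0000000100001000
y = 0x0000000200001000

print(masked_equal(x, y))  # True
print(plain_equal(x, y))   # False
```

With the mask in place, values that differ only in the upper half compare as equal; without it, the full value decides. Separately, the removed guard `(e1.type | MaskType.HEAP.value) > 0` used a bitwise OR, so, assuming the mask value is nonzero, it held for every event type rather than only for heap events.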
49 changes: 47 additions & 2 deletions analysis/datastub/SymbolInfo.py
@@ -24,9 +24,11 @@
# @version 0.3


import sys
import copy
import os.path
import shlex
import subprocess
import sys
from operator import itemgetter
from datastub.SortedCollection import SortedCollection
from datastub.utils import debug
@@ -35,6 +37,38 @@
*************************************************************************
"""

DEBUG_SYMBOLS = dict()

def getdebugsymbol(sym, address):
    # Return the cached symbol name if this address was resolved before
    if address in DEBUG_SYMBOLS:
        debug(3, f"found symbol {DEBUG_SYMBOLS[address]} at {hex(address)}")
        return DEBUG_SYMBOLS[address]
    # Ask gdb to disassemble one instruction at the image-relative offset
    offset = address - sym.img.lower
    command = f"gdb -ex 'set print asm-demangle on' -ex 'x/i {hex(offset)}' -ex quit {sym.img.name}"
    output = subprocess.check_output(shlex.split(command)).decode("utf-8")
    line = str()
    lines = output.splitlines()
    # Search from the end for the line that starts with the requested offset
    for line in reversed(lines):
        tmp = line.lstrip().split(" ", 1)[0]
        if tmp == hex(offset):
            break
    # Keep the text between the first "<" and the last ">"
    line = line.split("<", 1)[1]
    line = line[::-1].split(">", 1)[1]
    line = line[::-1]
    DEBUG_SYMBOLS[address] = line
    return line


def getdebugelf(fname):
    command = f"gdb -ex quit {fname}"
    output = subprocess.check_output(command.split(" ")).decode("utf-8")
    lines = output.splitlines()
    assert lines[-2].find(fname) != -1
    if lines[-1].find("No debugging symbols found") != -1:
        return None
    assert lines[-2].find("Reading symbols from") != -1
    return lines[-1].split(" ")[-1].split("...")[0]


def readelfsyms(fname, image):
try:
@@ -53,7 +87,13 @@ def readelfsyms(fname, image):
return None

if lines is None or len(lines) == 0:
return None
debug(0, f"No symbols found in {fname}")
fname = getdebugelf(fname)
if fname is None:
debug(0, "GDB did not find any debug file")
return None
debug(0, f"GDB found debug file: {fname}")
return readelfsyms(fname, image)

syms = []
for line in lines:
@@ -175,6 +215,11 @@ def lookup(cls, address):
assert cls.instance is not None
try:
(_, sym) = cls.instance.symbols.find_le(address)
if sym.name[0].find("_init") >= 0:
sym = copy.deepcopy(sym)
sym_name = getdebugsymbol(sym, address)
sym.name[0] = sym_name
return sym
return sym
except ValueError:
return None
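
`getdebugsymbol` resolves a name by scanning gdb's `x/i` output for the line that starts with the requested offset, then keeping the text between the first `<` and the last `>`. The sketch below isolates that parsing step so it can be exercised without running gdb. The function name `extract_symbol` and the sample output string are assumptions made for illustration; real gdb output may be laid out differently.

```python
# Hypothetical sketch: extract_symbol and the sample text are assumptions,
# not part of this PR.
from typing import Optional


def extract_symbol(gdb_output: str, offset: int) -> Optional[str]:
    """Return the text between the first '<' and the last '>' on the
    line whose first field equals hex(offset), or None if no line matches."""
    wanted = hex(offset)
    # Scan from the end, as the PR code does.
    for line in reversed(gdb_output.splitlines()):
        fields = line.split()
        if not fields or fields[0] != wanted:
            continue
        start = line.find("<")
        end = line.rfind(">")
        if start == -1 or end <= start:
            return None
        return line[start + 1 : end]
    return None


# Assumed shape of a disassembly line; not captured from a real session.
sample = "Reading symbols from ./a.out...\n   0x1139 <main+4>:\tmov    $0x0,%eax\n"
print(extract_symbol(sample, 0x1139))  # main+4
```

Unlike the PR code, this sketch returns `None` when no line matches or when the brackets are missing, instead of raising `IndexError`. Taking the last `>` keeps demangled names that themselves contain angle brackets intact, at the cost of also swallowing any bracketed operand that follows on the same line.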