blint-db

blint-db is a family of binary symbol databases (pre-compiled SQLite files) generated by building a collection of open-source libraries and applications across various platforms and architectures, then generating an SBOM with OWASP blint. The project started with C/C++ projects that can be built using the wrapdb and vcpkg package managers, and we plan to extend it to other ecosystems.

blint-db is published as GitHub packages and as a Hugging Face dataset.

Use cases

Use blint-db to:

  • Improve the precision of generated SBOMs and SCA for C/C++ projects
  • Vectorize data to train ML models for component identification and risk prediction from binaries
  • And much more

Build pipeline

Native binaries vary with several factors, such as build configuration, build tools, operating system, and CPU architecture. The project aims to generate and publish multiple versions of the database, built in both debug and stripped modes without optimizations. The following OS and architecture matrix is currently available:

  • Ubuntu 24.04: amd64, arm64
  • Alpine Linux: amd64
  • macOS 15: arm64

Database Schema

The schema design is not currently stable and is likely to change as we add more build pipelines.

Table: Exports

Index of all exported symbols.

CREATE TABLE Exports ( infunc VARCHAR(4096) PRIMARY KEY )

Table: Projects

Contains information about the projects indexed in the database, including each project's name, purl, and additional metadata.

source_sbom is currently unused.

CREATE TABLE Projects (
    pid INTEGER PRIMARY KEY AUTOINCREMENT,
    pname VARCHAR(255) UNIQUE,
    purl TEXT UNIQUE,
    metadata BLOB,
    source_sbom BLOB
)

Table: Binaries

A given project can produce multiple binary files during the build process. This table maps each generated binary to its parent project.

bbom is currently unused.

CREATE TABLE Binaries (
    bid INTEGER PRIMARY KEY AUTOINCREMENT,
    pid INTEGER,
    bname VARCHAR(500),
    bbom BLOB,
    FOREIGN KEY (pid) REFERENCES Projects(pid)
)

Table: BinariesExports

This table maps exported symbols to individual binaries. Each eid is the rowid of the corresponding row in Exports.

CREATE TABLE BinariesExports (
    bid INTEGER,
    eid INTEGER,
    PRIMARY KEY (bid, eid),
    FOREIGN KEY (bid) REFERENCES Binaries(bid),
    FOREIGN KEY (eid) REFERENCES Exports(eid)
)
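
To see how the four tables link together, the following is a minimal sketch that recreates the schema in an in-memory SQLite database using Python's standard sqlite3 module. The project name, purl, binary name, and symbols are made-up sample data, and build_sample_db and list_links are hypothetical helper names; this is not how the published databases are produced.

```python
import sqlite3

# Schema copied from the table definitions in this README.
SCHEMA = """
CREATE TABLE Exports ( infunc VARCHAR(4096) PRIMARY KEY );
CREATE TABLE Projects ( pid INTEGER PRIMARY KEY AUTOINCREMENT, pname VARCHAR(255) UNIQUE,
                        purl TEXT UNIQUE, metadata BLOB, source_sbom BLOB );
CREATE TABLE Binaries ( bid INTEGER PRIMARY KEY AUTOINCREMENT, pid INTEGER, bname VARCHAR(500),
                        bbom BLOB, FOREIGN KEY (pid) REFERENCES Projects(pid) );
CREATE TABLE BinariesExports ( bid INTEGER, eid INTEGER, PRIMARY KEY (bid, eid),
                               FOREIGN KEY (bid) REFERENCES Binaries(bid),
                               FOREIGN KEY (eid) REFERENCES Exports(eid) );
"""


def build_sample_db():
    """Create an in-memory database holding one hypothetical project."""
    conn = sqlite3.connect(":memory:")
    # Assumption: foreign-key enforcement stays off (SQLite's default),
    # because eid is stored as the rowid of the Exports row.
    conn.execute("PRAGMA foreign_keys = OFF")
    conn.executescript(SCHEMA)
    pid = conn.execute(
        "INSERT INTO Projects (pname, purl) VALUES (?, ?)",
        ("zlib", "pkg:generic/zlib@1.3.1"),  # sample values, not real data
    ).lastrowid
    bid = conn.execute(
        "INSERT INTO Binaries (pid, bname) VALUES (?, ?)", (pid, "libz.so")
    ).lastrowid
    for symbol in ("deflate", "inflate"):
        # lastrowid of the Exports insert is the value stored as eid.
        eid = conn.execute(
            "INSERT INTO Exports (infunc) VALUES (?)", (symbol,)
        ).lastrowid
        conn.execute(
            "INSERT INTO BinariesExports (bid, eid) VALUES (?, ?)", (bid, eid)
        )
    conn.commit()
    return conn


def list_links(conn):
    """Return (symbol, binary name, project name) for every mapping row."""
    return conn.execute(
        "SELECT e.infunc, b.bname, p.pname FROM BinariesExports be "
        "JOIN Exports e ON e.rowid = be.eid "
        "JOIN Binaries b ON b.bid = be.bid "
        "JOIN Projects p ON p.pid = b.pid ORDER BY e.infunc"
    ).fetchall()
```

Calling list_links(build_sample_db()) walks from each symbol through BinariesExports and Binaries to the owning project.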

Search by symbol

Given a list of symbols, use the query below to identify the matching binary IDs and export IDs.

SELECT eid, group_concat(bid)
FROM BinariesExports
WHERE eid IN (SELECT rowid FROM Exports WHERE infunc IN ({symbols_list}))
GROUP BY eid
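
One way to fill in {symbols_list} safely is to expand it into one bound parameter per symbol instead of pasting symbol names into the SQL text. The sketch below assumes Python's standard sqlite3 module and an already opened connection to a blint-db file; find_binaries_by_symbols is a hypothetical helper name, not part of blint or blint-db.

```python
import sqlite3


def find_binaries_by_symbols(conn, symbols):
    """Map each matching export ID to the list of binary IDs exporting it."""
    # De-duplicate while keeping the caller's order.
    symbols = list(dict.fromkeys(symbols))
    if not symbols:
        return {}
    # One "?" per symbol, so values are bound rather than string-formatted.
    placeholders = ", ".join("?" for _ in symbols)
    query = (
        "SELECT eid, group_concat(bid) FROM BinariesExports "
        f"WHERE eid IN (SELECT rowid FROM Exports WHERE infunc IN ({placeholders})) "
        "GROUP BY eid"
    )
    # group_concat returns a comma-separated string such as "1,2".
    return {
        eid: [int(bid) for bid in bids.split(",")]
        for eid, bids in conn.execute(query, symbols)
    }
```

SQLite caps the number of bound parameters per statement, so a very long symbol list may need to be split into chunks and the results merged.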

You can then use a binary ID to retrieve the binary's name and its parent project's name and purl.

SELECT bname, pname, purl
FROM Binaries
JOIN Projects ON Binaries.pid = Projects.pid
WHERE Binaries.bid = ?
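
The lookup above can be wrapped in a small function. This is a sketch using Python's standard sqlite3 module; lookup_project is a hypothetical helper name, and the returned dictionary layout is an illustrative choice.

```python
import sqlite3


def lookup_project(conn, bid):
    """Return the binary name, project name, and purl for a binary ID, or None."""
    row = conn.execute(
        "SELECT bname, pname, purl FROM Binaries "
        "JOIN Projects ON Binaries.pid = Projects.pid "
        "WHERE Binaries.bid = ?",
        (bid,),  # bound parameter for the "?" placeholder
    ).fetchone()
    if row is None:
        return None  # unknown binary ID
    bname, pname, purl = row
    return {"bname": bname, "pname": pname, "purl": purl}
```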

Apply heuristics, such as a ranking algorithm based on the number and type of matches, to reduce false positives in the results.
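
As an illustration of such a heuristic, the sketch below scores each binary by the symbols it matched, giving less weight to symbols that many binaries export. This is only one possible ranking under stated assumptions, not the algorithm used by blint: rank_binaries and min_matches are hypothetical names, the input is assumed to be a mapping of export ID to binary IDs as produced by the symbol search, and the type of each match is not considered.

```python
from collections import defaultdict


def rank_binaries(matches, min_matches=2):
    """Rank binary IDs from a mapping of export ID -> list of binary IDs.

    Returns (bid, score, match_count) tuples, best candidate first.
    """
    scores = defaultdict(float)
    counts = defaultdict(int)
    for bids in matches.values():
        unique_bids = set(bids)
        if not unique_bids:
            continue
        # A symbol exported by many binaries is less distinctive,
        # so its weight is split across all of them.
        weight = 1.0 / len(unique_bids)
        for bid in unique_bids:
            scores[bid] += weight
            counts[bid] += 1
    # Drop binaries with too few matching symbols to limit false positives.
    ranked = [
        (bid, scores[bid], counts[bid])
        for bid in scores
        if counts[bid] >= min_matches
    ]
    # Highest score first; ties broken by the lower binary ID.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked
```

The min_matches threshold trades recall for precision: raising it discards binaries that share only a handful of common symbols with the input.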

Funding

This project is funded through NGI Zero Core, a fund established by NLnet with financial support from the European Commission's Next Generation Internet program. Learn more at the NLnet project page.


Citation

@misc{blint-db,
  author = {Team AppThreat},
  month = Mar,
  title = {{AppThreat blint-db}},
  howpublished = {{https://huggingface.co/datasets/AppThreat/blint-db}},
  year = {2025}
}