hohn/codeql-operational-view

An Operational View of CodeQL

These are the notes used to develop the slides in ./operational-view.key and their PDF version. They are handy for running the examples and are simpler to read than the slides.

A diagram of how codeql is used is in ./notes/codeql-build.drawio. To edit, view, or print it, use the open-source version of drawio, which runs in the browser or can be downloaded. For simpler viewing, a PDF is provided.

The codeql / C compiler comparison

sql injection problem, compiler view

Think Compiler (C) with library:

# Prepare System
./admin -c

# Convert data if needed
cat users.txt

# Edit your code
edit add-user.c

# Compile & run your code
clang -Wall add-user.c -lsqlite3 -o add-user
for user in `cat input.txt` ; do echo "$user" | ./add-user 2>> users.log ; done

# Examine results
./admin -s

sql injection problem, codeql view

Think Compiler (CodeQL) with library:

# Prepare System
export PATH=$HOME/local/vmsync/codeql250:"$PATH"

# Convert data if needed
SRCDIR=.
DB=add-user.db
cd $SRCDIR &&                                                           \
    codeql database create --language=cpp                               \
           -s . -j 8 -v                                                 \
           $DB                                                          \
           --command='clang -Wall add-user.c -lsqlite3 -o add-user'

# Edit your code
edit SqlInjection.ql

# Compile & run your code
RESULTS=cpp-sqli.sarif
codeql database analyze                         \
       -v --ram=14000 -j12 --rerun              \
       --search-path ~/local/vmsync/ql          \
       --format=sarif-latest                    \
       --output=$RESULTS                        \
       --                                       \
       $DB                                      \
       $SRCDIR/SqlInjection.ql

# Examine results
# Plain text, look for
#     "results" : [ {
#     and
#     "codeFlows" : [ {
edit $RESULTS
# Or
jq --raw-output --join-output -f sarif-summary.jq < $RESULTS | less
# Or use VS Code's SARIF viewer
# Or use the GHAS integration via Actions
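
When jq is not at hand, a plain grep can locate the two landmarks named in the comments above. The following is a minimal sketch: sarif_landmarks is a hypothetical helper name, and the sample file is made up and heavily trimmed for illustration; real SARIF output carries far more fields.

```shell
# Hypothetical helper: print the line numbers of the "results" and
# "codeFlows" arrays in a pretty-printed SARIF file.
sarif_landmarks() {
    grep -n -E '"(results|codeFlows)"[[:space:]]*:[[:space:]]*\[' "$1"
}

# Made-up sample standing in for the real output file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
  "runs" : [ {
    "results" : [ {
      "ruleId" : "made-up-rule-id",
      "codeFlows" : [ {
      } ]
    } ]
  } ]
}
EOF

sarif_landmarks "$sample"
```

Against a real run, the call would be sarif_landmarks $RESULTS; the reported line numbers are the places to jump to in your editor.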

Connecting to the compiler core

  • IDEs: use VS Code for full functionality, or any LSP-capable editor for completion and jump-to-source
  • choose the repository layout that best fits how you develop your custom queries
  • check for library X support in the ql/ library; better yet, check for the particular function names you use
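
Checking for particular function names can be scripted. The sketch below is one possible approach, not part of the CodeQL tooling: ql_mentions is a hypothetical helper, and the demo runs against a made-up stand-in directory, so its "not found" line says nothing about the real library.

```shell
# Hypothetical helper: for each function name, report whether any file
# under the given library tree mentions it.
ql_mentions() {
    tree=$1
    shift
    for fn in "$@"; do
        if grep -q -R -F -- "$fn" "$tree"; then
            echo "$fn: found"
        else
            echo "$fn: not found"
        fi
    done
}

# Demo against a made-up stand-in tree; in practice, point the helper
# at your own ql/ checkout instead.
demo=$(mktemp -d)
mkdir -p "$demo/cpp/security"
echo 'function = "sqlite3_exec" and arg = 1' > "$demo/cpp/security/Security.qll"

ql_mentions "$demo" sqlite3_exec sqlite3_prepare_v2
```

A hit only shows that the name appears somewhere in the tree; you still need to read the matching .ql or .qll file to see how the function is modeled.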

Best practice CodeQL

Key Ideas

All of the following ideas follow from one simple observation: the CodeQL CLI is a compiler, and you should treat it as such.

  • GHAS setup and integration are almost completely independent of query customization; take advantage of this.

    If you can build your code on your desktop/laptop/own server, you don’t have to wait for GHAS integration to produce codeql databases.

    In fact, you should start on your desktop/laptop/server to find issues around the build: memory / thread requirements, ensuring the build system runs correctly when invoked from codeql, etc.

  • use desktop-based code scanning earlier in your workflow
  • do the CLI setup and analysis first, as a prototype for your GitHub admins to work from
  • customize the scanning tools so they actually produce results, drawing on:
    • bug bounty programs
    • known entry / exit points for services
  • Just as your CI/CD pipeline encapsulates your compiler CLI tools, GitHub and GHAS encapsulate the CodeQL CLI tools.

    So you can always think about what makes sense for the cli, try it there, and then update your GHAS workflow.
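
Trying things on the CLI first can start with sizing the build to the machine at hand. The sketch below only prints a database-create command rather than running it; codeql_create_cmd is a hypothetical helper, and using the online processor count for -j is an assumption you should tune for your own build.

```shell
# Hypothetical helper: print (do not run) a `codeql database create`
# command whose thread count matches this machine.
codeql_create_cmd() {
    db=$1
    build=$2
    # Number of online processors; fall back to 1 if getconf cannot tell.
    threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    printf 'codeql database create --language=cpp -s . -j %s %s --command=%s\n' \
        "$threads" "$db" "'$build'"
}

codeql_create_cmd add-user.db 'clang -Wall add-user.c -lsqlite3 -o add-user'
```

Once the printed command works on your desktop, the same flags and values are what you hand to whoever maintains the GHAS workflow.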

Some Q&A via compiler analogy

Q: Is the C standard library supported?

A: Much of it, typically modeled at a conceptual level.

To find the supported APIs, search the ql/ library source tree.

For example, for a top-down search, start with cpp.qll and notice the line import semmle.code.cpp.commons.Printf. Follow it to find the cpp.commons module and see what it models:

Alloc.qll       Dependency.qll   NullTermination.qll   StringAnalysis.qll
Assertions.qll  Environment.qll  PolymorphicClass.qll  StructLikeClass.qll
Buffer.qll      Exclusions.qll   Printf.qll            Synchronization.qll
CommonType.qll  File.qll         Scanf.qll             VoidContext.qll
DateTime.qll    NULL.qll         Strcat.qll            unix/
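
Following an import by hand relies on the fact that a dotted module name corresponds to a file path. The sketch below turns plain import lines into the .qll paths to open next; ql_import_paths is a hypothetical helper, and it assumes simple one-per-line imports resolved relative to the library's source root.

```shell
# Hypothetical helper: read QL source on stdin and print, for each plain
# `import a.b.C` line, the relative path of the file it refers to.
ql_import_paths() {
    sed -n 's/^import[[:space:]]\{1,\}\([A-Za-z0-9_.]\{1,\}\)[[:space:]]*$/\1/p' |
        tr . / |
        sed 's/$/.qll/'
}

echo 'import semmle.code.cpp.commons.Printf' | ql_import_paths
```

In a library checkout you would feed it the file itself, for example ql_import_paths < cpp.qll, and then list the directory that holds the resulting paths.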

Q: Is library X supported?

A: If it is, you’ll find it in the ql/ library source tree. A whole-tree search, grep-style, is easiest.

For example, to check support for sqlite:

hohn@gh-hohn ~/local/vmsync/ql/cpp/ql/src
0:$ grep -l -R sqlite *
Security/CWE/CWE-313/CleartextSqliteDatabase.ql
Security/CWE/CWE-313/CleartextSqliteDatabase.c
semmle/code/cpp/security/Security.qll

So we have a query (.ql) and a library (.qll); look at both to get some ideas:

Security/CWE/CWE-313/CleartextSqliteDatabase.ql has some info in the header:

/**
 * @name Cleartext storage of sensitive information in an SQLite database
 * @description Storing sensitive information in a non-encrypted
 *              database can expose it to an attacker.
 */

and a promising class:

class SqliteFunctionCall extends FunctionCall {
    SqliteFunctionCall() { this.getTarget().getName().matches("sqlite%") }

    Expr getASource() { result = this.getAnArgument() }
}

semmle/code/cpp/security/Security.qll has some very promising entries:

/**
 * Extend this class to customize the security queries for
 * a particular code base. Provide no constructor in the
 * subclass, and override any methods that need customizing.
 */
class SecurityOptions extends string {
    ;;
    predicate sqlArgument(string function, int arg) {
        ;;
        // SQLite3 C API
        function = "sqlite3_exec" and arg = 1
    }
    ;;
    /**
     * The argument of the given function is filled in from user input.
     */
    predicate userInputArgument(FunctionCall functionCall, int arg) {
        ;;
        fname = "scanf" and arg >= 1
        ;;
    }
    ;;
}

This is a library, so some sample uses would be nice. Another search via

grep  -nH  -R SecurityOptions *

finds documentation:

docs/codeql/ql-training/cpp/global-data-flow-cpp.rst:59:The library class ``SecurityOptions`` provides a (configurable) model of what counts as user-controlled data:

and an extension point:

cpp/ql/src/semmle/code/cpp/security/SecurityOptions.qll:16:class CustomSecurityOptions extends SecurityOptions

The definition in that file:

/**
 * This class overrides `SecurityOptions` and can be used to add project
 * specific customization.
 */
class CustomSecurityOptions extends SecurityOptions {...}

Q: How should we go about modeling our libraries with CodeQL?

A: Follow the way you use a C library, say sqlite3. Your code includes only sqlite3.h; you use, but don’t care about, libsqlite3.a.

Thus for CodeQL: don’t try to model the library internals, only model the parts of the API you actually use.

The same holds for other languages: you only need to model the exposed API.

Q: Should we use the most recent version of codeql at all times?

A: Follow the way you use your compiler. Do you use the most recent version of your compiler at all times, or do you follow a rolling release cycle?

To get your current version’s info:

hohn@gh-hohn ~/local/vmsync/ql/cpp/ql/src
0:$ codeql --version
CodeQL command-line toolchain release 2.5.0.
Copyright (C) 2019-2021 GitHub, Inc.
Unpacked in: /Users/hohn/local/vmsync/codeql250
   Analysis results depend critically on separately distributed query and
   extractor modules. To list modules that are visible to the toolchain,
   use 'codeql resolve qlpacks' and 'codeql resolve languages'.

You should match the CodeQL cli version to the CodeQL library version; the library releases have codeql-cli/<VERSION> tags to allow matching with the binaries.

When using git for the library, check out the tag that matches your CLI version, e.g. for the 2.5.0 release shown above:

cd $HOME/local/vmsync/ql && git checkout codeql-cli/v2.5.0
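
The tag can also be derived from the CLI's own version line, so the two stay in step. This is a minimal sketch: codeql_tag is a hypothetical helper, and it assumes the first line of the version output has the form shown above.

```shell
# Hypothetical helper: read `codeql --version` output on stdin and print
# the library tag for that release.
codeql_tag() {
    sed -n 's/^CodeQL command-line toolchain release \([0-9][0-9.]*[0-9]\)\.$/codeql-cli\/v\1/p'
}

# Demo on the version line shown above.
echo 'CodeQL command-line toolchain release 2.5.0.' | codeql_tag
```

With the CLI on your PATH, the checkout could then be written as git checkout "$(codeql --version | codeql_tag)"; confirm the tag exists in your clone before relying on it.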
