An open-source, prototype implementation of property graphs for JavaScript based on the esprima parser and the Mozilla SpiderMonkey Parser API. JAW can be used for analyzing the client-side of web applications and JavaScript-based programs.
This project is licensed under GNU AFFERO GENERAL PUBLIC LICENSE V3.0. See here for more information.
JAW has a GitHub Pages website available at https://soheilkhodayari.github.io/JAW/.
The architecture of JAW is shown below.
JAW can be used in two distinct ways:
Option 1. Modeling and analyzing an arbitrary JavaScript program using JAW.
In this case, the only input to the tool is the path of the target JavaScript program in the file system.
Option 2. Analyzing a web application given a single seed URL.
JAW has a stand-alone, JavaScript-enabled web crawler (based on chromium) that can collect the application web resources, and use them for analysis.
Test Case Scripts. Optionally, a so-called test case script or a state script can be provided together with the seed URL of the application in order for the crawler to reach a certain, pre-defined state (e.g., logged in state) before the crawling session starts.
For more information about how to create such a state script, see here.
Crawler Output. The crawler outputs the JavaScript code as well as the State Values for each web page found.
JavaScript Code: for each web page found, JAW creates a single JavaScript file preserving the execution order of the program.
State Values: state values are a collection of concrete values observed during the execution of a web page. These include the snapshot of the initial and rendered HTML page, fired events, HTTP requests and responses, and cookies. Optionally, the crawler can collect any JavaScript property accessible within the web page.
The collected JavaScript code and state values are then used to build a hybrid property graph (HPG). The built graph is imported into a Neo4j database.
Alternatively, the graph is built for the (arbitrary) JavaScript program given as the input by the tester.
Optional Input: the HPG construction module can optionally be provided with a mapping of semantic types to arbitrary, tester-defined JavaScript language tokens. For example, a semantic type REQ can be assigned to all low-level, JavaScript functions that send an HTTP request (e.g., the Fetch API, or XMLHttpRequest).
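As a rough illustration of what such a mapping expresses, the idea can be sketched in Python. This is not JAW's configuration format: the dictionary layout and the helper name `semantic_types_for` are hypothetical, and only the `REQ` type with the `fetch` and `XMLHttpRequest` tokens comes from the description above.

```python
# Hypothetical mapping of semantic types to JavaScript tokens.
# Only REQ, fetch and XMLHttpRequest are taken from the description above;
# the data layout itself is an assumption made for illustration.
SEMANTIC_TYPES = {
    "REQ": {"fetch", "XMLHttpRequest"},
}


def semantic_types_for(token, mapping=SEMANTIC_TYPES):
    """Return the sorted list of semantic types assigned to a JavaScript token."""
    return sorted(name for name, tokens in mapping.items() if token in tokens)


print(semantic_types_for("fetch"))        # ['REQ']
print(semantic_types_for("console.log"))  # []
```

A tester could extend the dictionary with further types and tokens suited to the analysis at hand.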
Finally, the constructed Neo4j graph database can be queried for analysis.
JAW provides a series of utility traversals for data flow analysis, control flow and reachability analysis, or pattern matching, which can be leveraged for writing custom traversals for security analyses. Also, JAW includes traversals to detect client-side CSRF vulnerabilities.
Please install the following dependencies before proceeding to the installation step:
- latest version of npm package manager (node js)
- any stable version of python 3.x
- python pip package manager
Please follow the steps below in order for a smooth installation process.
In the project root directory, run:
$ pip3 install -r requirements.txt
In the project root directory, run:
$ cd hpg_construction
$ npm install
Then, again from the project root directory:
$ cd hpg_construction/lib/jaw/dom-points-to
$ npm install
Finally, again from the project root directory:
$ cd hpg_construction/lib/jaw/normalization
$ npm install
1- Install Java.
Follow the tutorial here to install the latest version of Java.
2- Install Neo4j.
This prototype has been tested with Neo4j 3.5.9, community edition.
You can download the 3.5.x version from the neo4j download center.
You can also install it via, among others, apt-get or homebrew, as shown below.
2.1- Installing for Linux
$ cd installation
$ chmod +x neo4j_installation.sh
$ ./neo4j_installation.sh
For more information, see here, or here.
2.2- Installing for macOS
$ cd installation
$ brew install ./neo4j.rb
Note: the graph import commands and database activation may slightly differ across neo4j versions.
If you want to use another version of neo4j, you need to change the graph import command (i.e., NEO4J_IMPORT_COMMAND), as well as the database activation logic (i.e., dbms.active_database) in the API_neo4j_prepare function of the hpg_neo4j/db_utility package.
3- Set the initial Neo4j Password.
By default, the password should be set as root for the user neo4j. If you set any other password, you also need to change it in constants.py.
$ neo4j-admin set-initial-password root
Note: the default username and password should be neo4j and neo4j, respectively. But this has to be changed so that neo4j allows driver connections. If the above command did not work, try using the cypher-shell:
connect to cypher shell via:
$ cypher-shell -u neo4j -p neo4j
then run:
CALL dbms.changePassword('root');
:exit
If you choose a different password, you must set it in .env with NEO4J_PASS=your-password.
4- Make sure you can see the following lines (uncommented) in your neo4j.conf file:
dbms.connector.bolt.enabled=true
dbms.connector.bolt.listen_address=0.0.0.0:7687
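A quick way to verify this is to scan the configuration text for the two settings. The following is a minimal sketch, not part of JAW: the helper name `missing_settings` is hypothetical, and the location of neo4j.conf depends on your installation, so reading the file is left to the caller.

```python
# The two settings listed above, with the values JAW expects.
REQUIRED_SETTINGS = {
    "dbms.connector.bolt.enabled": "true",
    "dbms.connector.bolt.listen_address": "0.0.0.0:7687",
}


def missing_settings(conf_text, required=REQUIRED_SETTINGS):
    """Return required keys that are absent, commented out, or set differently."""
    found = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments and malformed lines
        key, value = line.split("=", 1)
        found[key.strip()] = value.strip()
    return [key for key, value in required.items() if found.get(key) != value]


# Example: the first setting is still commented out.
sample = (
    "#dbms.connector.bolt.enabled=true\n"
    "dbms.connector.bolt.listen_address=0.0.0.0:7687\n"
)
print(missing_settings(sample))  # ['dbms.connector.bolt.enabled']
```

To check a real installation, pass the contents of your neo4j.conf file to `missing_settings`; an empty list means both lines are present and uncommented.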
Copy the example.env file and rename the copy to .env.
- Set your operating system, using one of:
PLATFORM=linux
PLATFORM=macos
- If you choose a different password than that of step 3, you must set it in .env with NEO4J_PASS=your-password.
Note: you may use the tool on Windows. This requires changing the neo4j configuration variables (e.g., NEO4J_CONF) in constants.py. In addition, you should change the API_neo4j_prepare function in main.py by replacing the sed bash command with its Windows equivalent (or simply provide a similar copying logic).
Neo4j example DB import syntax:
$ neo4j-admin import --mode=csv --database=graph.db --nodes=nodes.csv --relationships=rels.csv --delimiter='¿'
See: Neo4j import documentation for more!
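If you script the import yourself, the same invocation can be assembled as an argument list in Python. This is a sketch that reuses only the flags from the example above (Neo4j 3.5.x syntax); the function name `build_import_command` and its defaults are hypothetical, and this is not how JAW itself builds the command.

```python
import shlex


def build_import_command(database="graph.db", nodes="nodes.csv",
                         rels="rels.csv", delimiter="¿"):
    """Return the neo4j-admin import invocation as an argument list."""
    return [
        "neo4j-admin", "import",
        "--mode=csv",
        f"--database={database}",
        f"--nodes={nodes}",
        f"--relationships={rels}",
        f"--delimiter={delimiter}",
    ]


command = build_import_command()
# Print a shell-quoted form for inspection (shlex.join needs Python 3.8+).
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with the unusual `¿` delimiter.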
Environment configuration on OS X:
$ export NEO4J_HOME="/usr/local/Cellar/neo4j/3.5.9/libexec"
This module collects the data (i.e., JavaScript code and state values of web pages) needed for testing.
If you want to test a specific JavaScript file that you already have on your file system, you can skip this step.
JAW can crawl and collect the web resources for a given set of web applications.
The inputs to the tool are thus a list of sites' seed URLs, and an optional test case or state script for each site under test.
Action 1: Specify the set of sites for testing by adding new entries in the SITES_MAP dictionary in hpg_crawler/sites/sitemap.py.
As shown below, you should assign each site an integer id, a name, and a seed URL.
SITES_MAP = {
    '1': ('example-name-1', 'https://example1.com', ),
    '2': ('example-name-2', 'https://example2.com', ),
}
Action 2 (Optional): For each site specified, you can input a custom test case or state script that gives instructions to the crawler to reach a certain state of the application (e.g., logged in) before the crawling session starts.
- To create a new state script for a site, copy the sites/template folder and rename it to sites/<site-id>, where <site-id> is the id you set in sitemap.py for that site, e.g., sites/1.
- Add your selenium-based state functions in sites/<SITE_ID>/scripts/Auth.py for the target site. See the example Auth.py file for more information.
In order to start the crawler for a given <site-id>, run:
$ cd hpg_crawler
$ python3 driver.py <site-id>
Alternatively, you can run the crawler for a list of sites, i.e., from an integer <start-site-id> to the integer <end-site-id> (inclusive), according to the integer id values specified in hpg_crawler/sites/sitemap.py. Thus, you can run:
$ cd hpg_crawler
$ python3 driver.py <start-site-id> <end-site-id>
For example, python3 driver.py 1 5 crawls websites with id one to five from hpg_crawler/sites/sitemap.py.
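The inclusive id range can be pictured with a short Python sketch. It is only an illustration of the selection semantics, not the code of driver.py: the helper `select_sites` is hypothetical, and skipping ids that are missing from the map is an assumption, since the crawler's behavior in that case is not described here.

```python
# Same shape as the SITES_MAP dictionary in hpg_crawler/sites/sitemap.py.
SITES_MAP = {
    '1': ('example-name-1', 'https://example1.com', ),
    '2': ('example-name-2', 'https://example2.com', ),
}


def select_sites(sites_map, start_id, end_id=None):
    """Return (id, name, seed_url) tuples for ids in [start_id, end_id]."""
    if end_id is None:
        end_id = start_id  # single-site run
    selected = []
    for site_id in range(start_id, end_id + 1):  # end_id is inclusive
        entry = sites_map.get(str(site_id))
        if entry is None:
            continue  # assumption: ids missing from the map are skipped
        name, seed_url = entry
        selected.append((site_id, name, seed_url))
    return selected


print(select_sites(SITES_MAP, 1, 5))
# [(1, 'example-name-1', 'https://example1.com'), (2, 'example-name-2', 'https://example2.com')]
```

Note that the map keys are strings while the command-line ids are integers, which is why the sketch converts each id with `str` before the lookup.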
- Specify which website you want to crawl in docker-compose.yml under the command field.
- Spawn an instance of the crawler by running ./run.docker.sh.
For more information about the web crawler of JAW, see here.
In order to create a hybrid property graph for the output of the crawler or a given JavaScript program, you can use the following general command:
$ python3 -m hpg_construction.api <path> --js=<program.js> --import=<bool> --hybrid=<bool> --reqs=<requests.out> --evts=<events.out> --cookies=<cookies.pkl> --html=<html_snapshot.html>
Specification of Parameters:
<path>: base path to the folder containing the program files for analysis (must be under the hpg_construction/outputs folder).
--js=<program.js>: name of the JavaScript program for analysis (default: js_program.js).
--import=<bool>: whether the constructed property graph should be imported to an active neo4j database (default: true).
--hybrid=<bool>: whether the hybrid mode is enabled (default: false). This implies that the tester wants to enrich the property graph by inputting files for any of the HTML snapshot, fired events, HTTP requests and cookies, as collected by the JAW crawler.
--reqs=<requests.out>: for hybrid mode only, name of the file containing the sequence of observed network requests; pass the string false to exclude (default: request_logs_short.out).
--evts=<events.out>: for hybrid mode only, name of the file containing the sequence of fired events; pass the string false to exclude (default: events.out).
--cookies=<cookies.pkl>: for hybrid mode only, name of the file containing the cookies; pass the string false to exclude (default: cookies.pkl).
--html=<html_snapshot.html>: for hybrid mode only, name of the file containing the DOM tree snapshot; pass the string false to exclude (default: html_rendered.html).
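When driving the construction step from a script, the options above can be assembled programmatically. The sketch below is a convenience wrapper, not part of JAW: the function name `build_hpg_command` is hypothetical, and rendering Python booleans as the lowercase strings true and false is an assumption based on the defaults listed above.

```python
def build_hpg_command(path, js=None, do_import=None, hybrid=None,
                      reqs=None, evts=None, cookies=None, html=None):
    """Assemble the argument list for `python3 -m hpg_construction.api`.

    Options left as None are omitted so that the tool's defaults apply.
    """
    def as_text(value):
        # Assumption: booleans are passed as lowercase "true" / "false".
        return str(value).lower() if isinstance(value, bool) else str(value)

    options = [("--js", js), ("--import", do_import), ("--hybrid", hybrid),
               ("--reqs", reqs), ("--evts", evts), ("--cookies", cookies),
               ("--html", html)]
    command = ["python3", "-m", "hpg_construction.api", path]
    command += [f"{flag}={as_text(value)}" for flag, value in options
                if value is not None]
    return command


# Hybrid mode with the cookies file excluded, all other files at their defaults.
example = build_hpg_command(
    "hpg_construction/outputs/example-app/example-page",
    hybrid=True, cookies=False)
print(" ".join(example))
```

The resulting list can be handed to `subprocess.run` from the project root directory.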
For more information, you can use the help CLI provided with the graph construction API:
$ python3 -m hpg_construction.api -h
Note: To add and assign custom semantic types to JavaScript language tokens (suitable for the specific type of analysis you are doing), you can input them in hpg_construction/lib/jaw/semantictypes.js.
Note: Alternatively, one can use the following two steps for graph construction, which is more suited for debugging purposes.
The JavaScript analyzer module creates the nodes.csv and rels.csv files for analysis (the property graph).
In the project root directory, try:
$ node hpg_construction/main.js -js <RELATIVE_PATH_TO_TEST_FILE> -o <OUTPUT_FOLDER_NAME>
For example:
$ node hpg_construction/main.js -js hpg_construction/test-inputs/test.js -o myfolder
In the project root directory, run:
$ python3 -m hpg_neo4j.hpg_import <path-to-the-folder-of-the-csv-files> --nodes=nodes.csv --edges=rels.csv
Run the help CLI command for more information:
$ python3 -m hpg_neo4j.hpg_import -h
To construct an HPG for an arbitrary JavaScript file, simply run:
$ python3 -m hpg_construction.api <path> --js=<program.js>
In the above command, the parameter <path> is the base path to the folder containing the program files for analysis (must be under the hpg_construction/outputs folder), and the parameter --js=<program.js> specifies the name of the JavaScript program for analysis.
To construct an HPG for a given web page output by the crawler, simply run:
$ python3 -m hpg_construction.api <path>
In the above command, the <path> parameter is the location of the folder containing the target web page for analysis (output by the crawler), e.g., <BASE_PATH>/hpg_construction/outputs/example-app/example-page.
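Before running the command on a crawled page, it can help to check which input files are present in that folder under their default names. The sketch below assumes the default file names from the parameter specification; the helper `find_inputs` is hypothetical and the demo uses a temporary folder rather than real crawler output.

```python
import tempfile
from pathlib import Path

# Default file names as listed in the parameter specification.
DEFAULT_FILES = {
    "js": "js_program.js",
    "reqs": "request_logs_short.out",
    "evts": "events.out",
    "cookies": "cookies.pkl",
    "html": "html_rendered.html",
}


def find_inputs(page_folder):
    """Map each option name to whether its default file exists in the folder."""
    folder = Path(page_folder)
    return {name: (folder / filename).is_file()
            for name, filename in DEFAULT_FILES.items()}


# Demo on a temporary folder that only contains the JavaScript program.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "js_program.js").write_text("var a = 1;")
    demo = find_inputs(tmp)
print(demo)
```

Options whose file is missing could then be excluded by passing the string false for the corresponding parameter.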
The constructed HPG can then be queried using Cypher or the NeoModel ORM.
You should place and run your queries in hpg_analysis/<ANALYSIS_NAME>.
You can use the NeoModel ORM to query the HPG. To write a query:
(1) Check out the HPG data model and syntax tree.
(2) Check out the ORM model for HPGs.
(3) See the example query file provided: exampleorm.py in the hpg_analysis/example folder.
$ python3 -m hpg_analysis.example.exampleorm
For more information, please see here.
You can use Cypher to write custom queries. For this:
(1) Check out the HPG data model and syntax tree.
(2) See the example query file provided: example.py in the hpg_analysis/example folder.
$ python3 -m hpg_analysis.example.example
For more information, please see here.
This section describes how to configure and run the client-side CSRF analyzer, and how to interpret its output.
In the root directory, run
$ python3 -m hpg_analysis.cs_csrf.main
This will build the property graph, create a neo4j database, and query the database for client-side CSRF vulnerabilities.
Note: By default, the testing process is done for the unit tests specified under hpg_construction/unit_tests/cs_csrf.
However, the analysis script has three different run modes:
- (i) Detecting vulnerabilities in a specific web page
- (ii) Detecting vulnerabilities in a specific web site, i.e., all web pages of that site
- (iii) Detecting vulnerabilities in unit test files
The active run mode can be adjusted by changing the hardcoded value of the ACTIVE_MODE parameter at the top of the script.
ENUM_TEST_WEB_PAGE = 0 # case (i)
ENUM_TEST_WEB_SITE = 1 # case (ii)
ENUM_UNIT_TEST = 2 # case (iii)
ACTIVE_MODE = ENUM_UNIT_TEST # change this value to change the active run mode of the script
For cases (i) and (ii), to specify the exact site or URL to test, modify the hardcoded site_identifer and url variables at the end of the file.
The outputs will be stored in a file called template.out in the same folder as that of the input. For each HTTP request detected, JAW outputs an entry marking the set of semantic types (a.k.a. semantic tags or labels) associated with the elements constructing the request (i.e., the program slices). For example, an HTTP request marked with the semantic type ['WIN.LOC'] is forgeable through the window.location injection point. However, a request marked with ['NON-REACH'] is not forgeable.
An example output entry is shown below:
[*] Tags: ['WIN.LOC']
[*] NodeId: {'TopExpression': '86', 'CallExpression': '87', 'Argument': '94'}
[*] Location: 29
[*] Function: ajax
[*] Template: ajaxloc + "/bearer1234/"
[*] Top Expression: $.ajax({ xhrFields: { withCredentials: "true" }, url: ajaxloc + "/bearer1234/" })
1:['WIN.LOC'] variable=ajaxloc
0 (loc:6)- var ajaxloc = window.location.href
This entry shows that on line 29, there is a $.ajax call expression, and this call expression triggers an ajax request with the url template value of ajaxloc + "/bearer1234/", where the parameter ajaxloc is a program slice reading its value at line 6 from window.location.href, and is thus forgeable through ['WIN.LOC'].
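For post-processing many such entries, the header lines of an entry can be parsed with a few lines of Python. This sketch is based solely on the example entry shown above, so the exact file layout may differ in practice. The helper names are hypothetical, and treating every tag other than NON-REACH as forgeable is an assumption that generalizes the two cases described in this section.

```python
import ast

# Header lines copied from the example entry above.
SAMPLE_ENTRY = """\
[*] Tags: ['WIN.LOC']
[*] NodeId: {'TopExpression': '86', 'CallExpression': '87', 'Argument': '94'}
[*] Location: 29
[*] Function: ajax
[*] Template: ajaxloc + "/bearer1234/"
"""


def parse_entry(text):
    """Parse the '[*] Key: value' header lines of one output entry into a dict."""
    entry = {}
    for line in text.splitlines():
        if not line.startswith("[*] "):
            continue  # program slice lines such as "0 (loc:6)- ..." are ignored
        key, _, value = line[4:].partition(": ")
        entry[key] = value
    # Tags are printed as a Python list literal, so literal_eval can read them.
    entry["Tags"] = ast.literal_eval(entry.get("Tags", "[]"))
    if "Location" in entry:
        entry["Location"] = int(entry["Location"])
    return entry


def is_forgeable(entry):
    """Assumption: forgeable unless untagged or tagged NON-REACH."""
    return bool(entry["Tags"]) and "NON-REACH" not in entry["Tags"]


entry = parse_entry(SAMPLE_ENTRY)
print(entry["Function"], entry["Location"], entry["Tags"], is_forgeable(entry))
```

A full parser would also need to split template.out into individual entries, which is not covered here because the separator between entries is not shown in the example.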
For more information, visit our wiki page here. Below is a table of contents for quick access.
Pull requests are welcome. Please see the contributor code of conduct.