
Dark Web Markets Crawler

Crawler that collects data from Dark Web marketplaces and stores it in databases for further analysis.


Code · Markets code · Analysis code

Table of Contents

  1. About The Project
  2. Getting Started
  3. Roadmap
  4. Contact

About The Project

Currently, the Dark Web is a key platform for the online trading of illegal products and services. Analysing the .onion sites hosting marketplaces is of interest to law enforcement and security researchers. This work presents a study of 123k listings obtained from 6 different Dark Web markets. While most current works leverage existing datasets, these are outdated and might not contain new products, e.g., those related to the 2020 COVID-19 pandemic. Thus, we built a custom focused crawler to collect the data. Being able to conduct analyses on current data is of considerable importance, as these marketplaces continue to change and grow, both in terms of products offered and users. In addition, anti-crawling mechanisms are continuously being improved, making this task more difficult and, consequently, reducing the amount of data obtained from these marketplaces in recent years. We conduct a data analysis evaluating multiple characteristics of the products, sellers, and markets. These characteristics include, among others, the number of sales, the categories present in the markets, and the origin of the products and sellers. Our study sheds light on the products and services currently offered in these markets.

Moreover, we conducted a case study on a particularly productive and dynamic drug market, Cannazon. Our initial goal was to understand its evolution over time, analyzing the variation in products in stock and their prices longitudinally. During the period of study, however, the market suffered a DDoS attack that damaged its reputation and eroded users' trust in it, a potential reason for its subsequent closure by its operators.

Consequently, our study provides insights into the last days of operation of such a productive market and showcases the effectiveness of a potential intervention approach based on disrupting the service and fostering mistrust.

(back to top)

Getting Started

To use this tool, you need the Tor Browser, Python, and some libraries: Selenium, to scrape the market websites, and sqlite3, to store the information obtained in a SQLite database.
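The snippet below is a minimal sketch (not the repository's actual crawler) of how Selenium can be routed through Tor's local SOCKS proxy and how scraped data can be stored with sqlite3. The proxy port (9050), the placeholder .onion URL, and the `listings` table are illustrative assumptions.

```python
# Minimal sketch (not the repository's crawler): route Selenium through Tor's
# local SOCKS proxy and persist scraped data with sqlite3. The proxy port,
# the .onion URL and the `listings` table are illustrative assumptions.
import sqlite3

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.add_argument("--headless")
# Tunnel all browser traffic through the local Tor SOCKS proxy (assumed on 9050).
options.set_preference("network.proxy.type", 1)
options.set_preference("network.proxy.socks", "127.0.0.1")
options.set_preference("network.proxy.socks_port", 9050)
options.set_preference("network.proxy.socks_remote_dns", True)

driver = webdriver.Firefox(options=options)
driver.get("http://exampleonionaddress.onion")  # placeholder market URL

# Store the visited URL and page title in a local SQLite database.
conn = sqlite3.connect("market.db")
conn.execute("CREATE TABLE IF NOT EXISTS listings (url TEXT, title TEXT)")
conn.execute("INSERT INTO listings VALUES (?, ?)", (driver.current_url, driver.title))
conn.commit()
conn.close()
driver.quit()
```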

Prerequisites

  1. Installing Tor
  sudo add-apt-repository ppa:micahflee/ppa
  sudo apt update
  sudo apt install torbrowser-launcher
  2. Installing Python
  sudo apt install python3

Installation

  1. Clone the repo
 git clone https://github.com/vicviclablab/darkmarkets_crawler
  2. Install dependencies and libraries
 sudo pip3 install -r requirements.txt

Execution

 python3 crawler.py market_name port file.txt
  • market_name is the market to crawl.
  • port is the local port the crawler uses to connect to the market through Tor.
  • file.txt is the file where the collected URLs will be stored for later analysis.
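As a hypothetical sketch of how these three positional arguments could be handled (the actual crawler.py may differ), they can be parsed with argparse:

```python
# Hypothetical sketch of how crawler.py's three positional arguments could be
# parsed; the real script may handle them differently.
import argparse

parser = argparse.ArgumentParser(description="Dark Web markets crawler")
parser.add_argument("market_name", help="market to crawl")
parser.add_argument("port", type=int, help="local port used to reach the market through Tor")
parser.add_argument("url_file", help="text file where the collected URLs are stored")
args = parser.parse_args()

print(f"Crawling {args.market_name} through Tor port {args.port}; saving URLs to {args.url_file}")
```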

(back to top)

Roadmap

  • Generic crawler structure
  • Create logs for the crawler
  • Create market-specific crawlers
  • Create an option to parallelize processes
  • Create scripts to organize data from the databases
  • Crawl new markets
  • Integrate free automatic CAPTCHA solvers
  • Automation
    • Automatically check the completeness of crawled data
    • Automatically resume the crawler after crashes
  • Integrate an AI module to crawl any new market based on the markets already seen

(back to top)

Contact

Víctor Labrador - [email protected]

(back to top)
