Scrapes the entire Kattis website, downloads all problems and helps you perform complex queries to find interesting problems.

ChrisVilches/Kattis-Scraper

Kattis Scraper

Scrapes the entire Kattis website, downloads all problems and helps you perform complex queries to find interesting problems.

Install

Install using:

npm install

Make sure you've installed the appropriate Node version:

nvm use

# or

fnm use

Note: The current Node version is found in .nvmrc.

How to Run

Scraped pages are cached and each URL is downloaded only once, so you can rerun the script without using additional network resources.

Remove the .cache folder if you wish to clear the cached data.
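
The caching behaviour described above can be pictured with a minimal sketch. This is not the project's actual implementation: the hashing scheme, the `.html` extension, and the names `cachePathFor` and `fetchCached` are assumptions made for illustration. Only the `.cache` folder name comes from this README. The sketch also assumes Node 18 or later for the global `fetch`.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Folder name taken from this README; the layout inside it is an assumption.
const CACHE_DIR = ".cache";

// Map a URL to a stable file path by hashing it (hypothetical scheme).
function cachePathFor(url: string, dir: string = CACHE_DIR): string {
  const digest = createHash("sha256").update(url).digest("hex");
  return join(dir, `${digest}.html`);
}

// Return the cached body when present; otherwise download once and store it.
async function fetchCached(
  url: string,
  download: (url: string) => Promise<string> = async (u) => (await fetch(u)).text(),
  dir: string = CACHE_DIR,
): Promise<string> {
  const file = cachePathFor(url, dir);
  if (existsSync(file)) {
    return readFileSync(file, "utf8");
  }
  const body = await download(url);
  mkdirSync(dir, { recursive: true });
  writeFileSync(file, body, "utf8");
  return body;
}
```

With a scheme like this, deleting the folder simply forces every page to be downloaded again on the next run.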

Export as CSV

Run the scraper and export the data as a CSV file:

npm run csv
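
Problem statements can contain commas, quotes, and line breaks, so a CSV export has to quote such cells. The sketch below shows one common way to do that. It is an illustration only: the function names and the choice of columns are hypothetical, and the exporter in this repository may work differently.

```typescript
type Cell = string | number | null | undefined;

// Quote a cell only when it contains a character that would break the row,
// doubling any embedded quotes.
function escapeCsvCell(value: Cell): string {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize rows using the given column order. Lines end with "\n" here;
// strict RFC 4180 output would use "\r\n" instead.
function toCsv(columns: string[], rows: Record<string, Cell>[]): string {
  const header = columns.map(escapeCsvCell).join(",");
  const lines = rows.map((row) =>
    columns.map((column) => escapeCsvCell(row[column])).join(","),
  );
  return [header, ...lines].join("\n") + "\n";
}

// Usage with made-up data:
const sampleCsv = toCsv(
  ["slug", "statement"],
  [{ slug: "hello", statement: 'Print "Hello, World!"' }],
);
console.log(sampleCsv);
```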

Populate a SurrealDB Database

Make sure you have installed SurrealDB before starting.

Start the SurrealDB server (this command also starts the built-in API):

surreal start --log debug --user root --pass root memory

Scrape the Kattis website and populate the database:

SURREALDB_USER=root SURREALDB_PASS=root npm run surrealdb

Then you can query the SurrealDB API. Read the SurrealDB docs to learn how to query it with cURL or Postman. Set both the NS and DB headers to kattis. For example:

DATA="INFO FOR DB;"
curl --request POST \
	--header "Accept: application/json" \
	--header "NS: kattis" \
	--header "DB: kattis" \
	--user "root:root" \
	--data "${DATA}" \
	http://localhost:8000/sql
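
The same request can be sent from Node. The sketch below mirrors the cURL call above (same endpoint, headers, and credentials); the function names `buildSqlRequest` and `runQuery` are hypothetical. It assumes Node 18 or later for the global `fetch` and `btoa`. The header names follow this README, and other SurrealDB versions may expect different ones.

```typescript
interface SqlRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the request separately from sending it, so it can be inspected.
function buildSqlRequest(
  query: string,
  user: string = "root",
  pass: string = "root",
  baseUrl: string = "http://localhost:8000",
): SqlRequest {
  return {
    url: `${baseUrl}/sql`,
    init: {
      method: "POST",
      headers: {
        Accept: "application/json",
        NS: "kattis",
        DB: "kattis",
        // Equivalent of curl's --user flag (HTTP Basic authentication).
        // btoa only accepts Latin-1 input, which is fine for root:root.
        Authorization: `Basic ${btoa(`${user}:${pass}`)}`,
      },
      body: query,
    },
  };
}

// Send the query and return the parsed JSON response.
async function runQuery(query: string): Promise<unknown> {
  const { url, init } = buildSqlRequest(query);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`SurrealDB returned HTTP ${response.status}`);
  }
  return response.json();
}

// Usage (requires the server started above):
// runQuery("INFO FOR DB;").then((result) => console.log(JSON.stringify(result, null, 2)));
```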

For advanced queries, refer to the official SurrealDB documentation.

Example #1: Filter by Difficulty

SELECT slug, minDifficulty FROM problem WHERE minDifficulty > 9.3 LIMIT 5;

"result": [
  {
    "minDifficulty": "9.6",
    "slug": "connectdots"
  },
  {
    "minDifficulty": "9.6",
    "slug": "magicalmysteryknight"
  },
  {
    "minDifficulty": "9.5",
    "slug": "cameramakers"
  },
  {
    "minDifficulty": "9.4",
    "slug": "textprocessor"
  },
  {
    "minDifficulty": "9.4",
    "slug": "callacab"
  }
]
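
In the sample output above, minDifficulty appears as a JSON string ("9.6") rather than a number. Client code that wants to sort or do arithmetic on it may therefore need to convert it first. A minimal sketch, with hypothetical type and function names:

```typescript
interface RawRow {
  slug: string;
  minDifficulty: string; // shaped like the sample output above
}

interface Problem {
  slug: string;
  minDifficulty: number;
}

// Convert the string difficulty to a number, dropping rows that fail to parse.
function parseRows(rows: RawRow[]): Problem[] {
  return rows
    .map((row) => ({
      slug: row.slug,
      minDifficulty: Number.parseFloat(row.minDifficulty),
    }))
    .filter((problem) => !Number.isNaN(problem.minDifficulty));
}

// Return a new array sorted from hardest to easiest.
function hardestFirst(problems: Problem[]): Problem[] {
  return [...problems].sort((a, b) => b.minDifficulty - a.minDifficulty);
}

// Usage with two rows copied from the sample output:
const parsed = hardestFirst(
  parseRows([
    { slug: "callacab", minDifficulty: "9.4" },
    { slug: "connectdots", minDifficulty: "9.6" },
  ]),
);
console.log(parsed);
```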

Example #2: Find Geometry Problems

SELECT subdomain, slug FROM problem
WHERE statement CONTAINS "coordinate"
AND statement CONTAINS "distance"
AND statement CONTAINS "polygon"
LIMIT 4;

"result": [
  {
    "slug": "randommanhattan",
    "subdomain": "open"
  },
  {
    "slug": "marshlandrescues",
    "subdomain": "open"
  },
  {
    "slug": "puzzle2",
    "subdomain": "open"
  },
  {
    "slug": "tracksmoothing",
    "subdomain": "open"
  }
]

Example #3: Full Text Search

Under construction. At the time of writing, SurrealDB does not support full-text search.

Example #4: Count the Number of Scraped Problems

SELECT subdomain, count(subdomain) AS total
FROM problem
GROUP BY subdomain;

"result": [
  {
    "subdomain": "icpc",
    "total": 112
  },
  {
    "subdomain": "open",
    "total": 3551
  }
]

Subdomains

Currently, problems are downloaded from the following URL patterns:

https://icpc.kattis.com/problems/*
https://open.kattis.com/problems/*

More subdomains can easily be added by modifying the source code.
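
As an illustration of what such a change might look like, the sketch below keeps the subdomains in a single list and derives the URL scope from it. The names `SUBDOMAINS` and `isInScope` are hypothetical and do not refer to identifiers in this repository; the `example` subdomain is a placeholder.

```typescript
// The two subdomains listed above. Adding another would mean extending this list.
const SUBDOMAINS: readonly string[] = ["icpc", "open"];

// Check whether a URL matches https://<subdomain>.kattis.com/problems/*
// for one of the configured subdomains.
function isInScope(url: string, subdomains: readonly string[] = SUBDOMAINS): boolean {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return false; // not a valid absolute URL
  }
  const knownHost = subdomains.some(
    (subdomain) => parsed.hostname === `${subdomain}.kattis.com`,
  );
  return (
    parsed.protocol === "https:" &&
    knownHost &&
    parsed.pathname.startsWith("/problems/")
  );
}

// Usage:
console.log(isInScope("https://open.kattis.com/problems/hello"));
console.log(isInScope("https://example.kattis.com/problems/hello", [...SUBDOMAINS, "example"]));
```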

Tools Used

  • Node
  • TypeScript
  • SurrealDB
