CS Word Cloud

Requirements

Python 3.x
numpy, wordcloud, and any other misc pip/conda packages
Golang
GNU make
GNU coreutils
Bash or Zsh

How To Build

Install the prerequisites
Set up your environment by setting the variables in Makefile. The main variables to set are APIKEY, INITMATCH, and MATCH_COUNT.

For example, before:

APIKEY?=TODO # Add server api key here from https://developers.faceit.com/
INITMATCH?=TODO # Add any recent faceit match ID here 
MATCH_COUNT=1000 # Number of demos to download
SHELL=/bin/bash # Need this just so I can use pipefail :/

After:

APIKEY?=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx # Keep this a secret
INITMATCH?=1-a993a412-8987-4d11-a682-dbe2fae3a761 
MATCH_COUNT=5 # Let's only do 5 demos for a short test
SHELL=/bin/zsh # Say I have a MacBook, let's use zsh

Run make all
Get a cool word cloud like this:

How It Works

Step 1: Traverse through FACEIT API for some matches

The basic traversal goes like this: given some initial match id, choose a random player in that match. Then choose a random match in their recent match history. And so on. This gave me a decently "random" sample of demos from a variety of regions and skill levels.
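The traversal described above can be sketched roughly as follows. This is a minimal illustration, not the code in scrapeGames.py: the names random_walk, get_players, and get_recent_matches are hypothetical, and the two lookup callables stand in for whatever FACEIT API requests you use. The tiny in-memory tables exist only so the sketch runs without an API key.

```python
import random


def random_walk(initial_match, count, get_players, get_recent_matches, rng=None):
    """Collect up to `count` distinct match IDs, starting from `initial_match`.

    get_players(match_id) and get_recent_matches(player_id) are hypothetical
    callables that would wrap the real API requests.
    """
    rng = rng or random.Random()
    found = [initial_match]
    seen = {initial_match}
    current = initial_match
    max_steps = count * 20  # stop eventually if the walk keeps revisiting matches
    steps = 0
    while len(found) < count and steps < max_steps:
        steps += 1
        players = get_players(current)
        if not players:
            break  # nothing to pick from, so the walk cannot continue
        player = rng.choice(players)  # random player in the current match
        history = get_recent_matches(player)
        if not history:
            continue  # try another player from the same match
        current = rng.choice(history)  # random match from that player's history
        if current not in seen:
            seen.add(current)
            found.append(current)
    return found


# Made-up data standing in for API responses (assumption, for illustration only).
MATCH_PLAYERS = {"m1": ["alice", "bob"], "m2": ["bob", "carol"], "m3": ["carol", "alice"]}
PLAYER_HISTORY = {"alice": ["m1", "m3"], "bob": ["m1", "m2"], "carol": ["m2", "m3"]}

if __name__ == "__main__":
    print(random_walk("m1", 3, MATCH_PLAYERS.get, PLAYER_HISTORY.get, random.Random(0)))
```

Tracking the matches already seen keeps duplicates out of the sample, and the step cap keeps the walk from looping forever in a small pool of players.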

Step 2: Download the demos

I'm sure there are other ways to do this. However, I was able to download all 1000 demos in my dataset through these 4 URLs in cdns.txt:

https://storage.googleapis.com/demos-us-central1.faceit-cdn.net
https://storage.googleapis.com/demos-europe-west1.faceit-cdn.net
https://storage.googleapis.com/demos-europe-west2.faceit-cdn.net
https://storage.googleapis.com/demos-asia-southeast1.faceit-cdn.net

The download.sh script handles the demo requests automatically, as long as cdns.txt is present.
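For anyone who wants the same idea outside of Bash, here is a rough Python sketch of trying each CDN in turn until one serves the file. It is an assumption about the general approach, not a port of download.sh: the function names and the demo path layout are hypothetical, and http_fetch reads the whole file into memory, which is fine for a sketch but wasteful for large demos.

```python
import urllib.error
import urllib.request


def load_cdns(text):
    """Parse the contents of cdns.txt: one base URL per line, blank lines ignored."""
    return [line.strip().rstrip("/") for line in text.splitlines() if line.strip()]


def http_fetch(url, timeout=30):
    """Return the response body, or None if the request fails (e.g. HTTP 404)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except urllib.error.URLError:  # HTTPError is a subclass of URLError
        return None


def download_demo(demo_path, cdn_bases, fetch=http_fetch):
    """Try each CDN base in order; return (url, data) for the first hit, else None.

    demo_path is a hypothetical relative path; the real layout may differ.
    """
    for base in cdn_bases:
        url = f"{base}/{demo_path.lstrip('/')}"
        data = fetch(url)
        if data is not None:
            return url, data
    return None
```

Passing fetch as a parameter makes it easy to swap in a fake for testing, or a streaming downloader for real use.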

Step 3: Parse the words

All I had to do was write a small method in Go in main.go using the API provided by https://github.com/markus-wa/demoinfocs-golang. It dumps all the chat text to stdout. Then I just cat the outputs together for the word cloud generator.

Step 4: Generate the word cloud

I mainly followed this example: https://github.com/amueller/word_cloud/blob/main/examples/masked.py. I made my own stencil with GIMP and played around with the parameters. You can see the parameters I chose in csgoman.py.
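To give a feel for the masked approach, here is a minimal sketch that counts words and draws them inside a stencil. It is not the contents of csgoman.py: the helper names, the skip_words parameter, and the parameter values are assumptions for illustration. The wordcloud, numpy, and Pillow imports sit inside render_cloud so the counting helper can be tried without them.

```python
import re
from collections import Counter


def word_frequencies(chat_text, skip_words=()):
    """Count lowercase words in the chat dump, ignoring any in skip_words."""
    skip = {w.lower() for w in skip_words}
    words = re.findall(r"[a-z0-9']+", chat_text.lower())
    return Counter(w for w in words if w not in skip)


def render_cloud(frequencies, mask_path, out_path):
    """Draw the cloud into the non-white region of the stencil image."""
    # Imported lazily so word_frequencies works without these packages installed.
    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud

    mask = np.array(Image.open(mask_path))  # pure white pixels are masked out
    cloud = WordCloud(background_color="white", mask=mask, max_words=2000)
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(out_path)


if __name__ == "__main__":
    freqs = word_frequencies("gg wp GG nice gg", skip_words=["nice"])
    print(freqs.most_common(2))
    # render_cloud(freqs, "mask.png", "cloud.png")  # hypothetical file names
```

Counting the words yourself and calling generate_from_frequencies gives you full control over filtering before anything is drawn.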

Using this work to make new large CSGO datasets

The scripts in this repo may be of interest to those doing data science or statistics on CSGO games across the general population of players. Just use the scrapeGames.py and download.sh scripts, and you should be able to get pretty large datasets in no time. I was able to get 1000 demos using 150GB and only a handful of hours.
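If you are planning a dataset of a different size, the figures above give a rough back-of-the-envelope estimate. The sketch below assumes demo sizes are roughly uniform and that the 150GB figure scales linearly with the number of demos. Both are assumptions, so treat the result as a ballpark only.

```python
GB_USED = 150        # from the run described above
DEMOS_FETCHED = 1000


def estimate_gb(n_demos, gb_per_demo=GB_USED / DEMOS_FETCHED):
    """Estimate the data volume in GB for n_demos, assuming a uniform demo size."""
    return n_demos * gb_per_demo


if __name__ == "__main__":
    for n in (5, 1000, 5000):
        print(f"{n} demos: about {estimate_gb(n):.2f} GB")
```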
