Salary Database

Technology

sqlite3
python

Setup

make install
make collect_2019_data to populate the salary db with just 2019 data (if using sqlite)
docker-compose up --build to build and start the Docker container (if using postgres)

Info

To obtain the H1B data, go to the site here. Then, click on the "Disclosure Data" tab. Once there, scroll down to LCA Programs and download the report for the given fiscal year. In my sample, I ran this for 2019. After downloading one of the Excel files, open it in your desktop client and export it to CSV. Then, use the "Data Ingestion" notebook to clean it up.

Currently, all the data lives in the data folder. I added the steps for moving it into place to the Data Ingestion notebook. We can probably automate those steps so that the process is idempotent.
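One way the ingestion could be made idempotent is to key each row on its case number and let sqlite skip rows it has already seen. The following is only a minimal sketch under that assumption: the `salary` table, the `load_rows` helper, and the column names are hypothetical and do not come from the notebook or the real LCA export.

```python
import csv
import io
import sqlite3

# Hypothetical column names; the real LCA export uses its own headers.
COLUMNS = ("case_number", "employer_name", "job_title", "prevailing_wage")


def load_rows(conn, csv_file):
    """Insert CSV rows into a `salary` table and return the table's row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS salary ("
        "case_number TEXT PRIMARY KEY, employer_name TEXT, "
        "job_title TEXT, prevailing_wage REAL)"
    )
    reader = csv.DictReader(csv_file)
    rows = [tuple(row[col] for col in COLUMNS) for row in reader]
    # INSERT OR IGNORE skips rows whose primary key already exists,
    # so running the same file twice leaves the table unchanged.
    conn.executemany("INSERT OR IGNORE INTO salary VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM salary").fetchone()[0]


if __name__ == "__main__":
    sample = (
        "case_number,employer_name,job_title,prevailing_wage\n"
        "I-200-1,ACME,ENGINEER,100000\n"
    )
    conn = sqlite3.connect(":memory:")
    load_rows(conn, io.StringIO(sample))
    # Second run with the same data adds nothing.
    print(load_rows(conn, io.StringIO(sample)))
```

Keying on the case number is a design guess; if the real data can repeat case numbers across fiscal years, a composite key would be needed instead.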

Feel free to access the GraphQL Viewer!

Using https://salary-database.herokuapp.com/graphql, feel free to take a look at the schema and query the data. Here is an example input:

{
  salaries(limit: 10, employer: "AIRBNB", year:"2020"){
    caseNumber
    employerName
    jobTitle
    prevailingWage
    employmentStartDate
  }
}

with the following link: Query Link
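The same query can also be sent from a script rather than the viewer. The sketch below assumes the endpoint accepts the common GraphQL-over-HTTP convention of a POST with a JSON body of the form {"query": ...}; this README only documents the viewer, so treat that as an assumption. The helper names `build_request` and `run_query` are made up for illustration.

```python
import json
import urllib.request

GRAPHQL_URL = "https://salary-database.herokuapp.com/graphql"

QUERY = """
{
  salaries(limit: 10, employer: "AIRBNB", year: "2020") {
    caseNumber
    employerName
    jobTitle
    prevailingWage
    employmentStartDate
  }
}
"""


def build_request(query, url=GRAPHQL_URL):
    """Wrap a GraphQL query string in a JSON POST request (not yet sent)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, url=GRAPHQL_URL):
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query, url), timeout=30) as resp:
        return json.load(resp)
```

Calling run_query(QUERY) performs the network request, so it only works while the Heroku app is up and reachable.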

Production

Deploying to Heroku

To deploy the Docker container to Heroku:

heroku create (one time step)
heroku container:push web
heroku container:release web
heroku open

Pushing Salary Data from Local DB to Production DB

You'll need:

postgresql
- install with: brew install postgresql
pgloader
- install with: brew install pgloader
Heroku CLI

In the future, when we want to update the database, the steps to push our local sqlite database to the production heroku database are:

use pgloader to load our local sqlite db to a local postgres database
- pgloader data/salary.sqlite postgresql:///[name of postgres dev db]
reset remote db
- heroku pg:reset DATABASE_URL --app salary-database
push local postgres database to heroku
- heroku pg:push postgresql:///[name of postgres dev db] DATABASE_URL --app salary-database

NOTE: you can use heroku pg:info --app salary-database to get info about the production database and check whether we are near the row limit.
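If these steps get run often, they could be wrapped in a small script. The sketch below only assembles the three commands listed above and prints them by default. The function names and the `dry_run` flag are hypothetical, and the name of the local Postgres dev database still has to be supplied by the caller.

```python
import subprocess


def build_push_commands(pg_db, sqlite_path="data/salary.sqlite", app="salary-database"):
    """Return the pgloader, reset, and push commands from the steps above, in order."""
    target = f"postgresql:///{pg_db}"
    return [
        ["pgloader", sqlite_path, target],
        ["heroku", "pg:reset", "DATABASE_URL", "--app", app],
        ["heroku", "pg:push", target, "DATABASE_URL", "--app", app],
    ]


def push(pg_db, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in build_push_commands(pg_db):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence as soon as one step fails.
            subprocess.run(cmd, check=True)
```

Keeping the dry run as the default is deliberate: the reset step wipes the production database, so it seems safer to review the printed commands before running them for real.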
