- sqlite3
- python
make install
make collect_2019_data
to populate salary db with just 2019 data(if using sqlite)docker-compose up --build
to build and start docker container(if using postgres)
To obtain the H1B data, go to the site here. Then, click on the "Disclosure Data" tab. Once we're here, scroll down to LCA Programs and download the report for the given fiscal year. In my sample, I ran this for 2019. After downloading one of the excel file, open it in your desktop client and export it to csv. Then, use the "Data Ingestion" notebook to clean it up.
Currently, all the information exists in the data folder. I added the steps for moving information in the data ingestion notebook. We can probably automate that to make this idempotent.
Using https://salary-database.herokuapp.com/graphql, feel free to take a look at the schema and query the data. Here is an example input
{
salaries(limit: 10, employer: "AIRBNB", year:"2020"){
caseNumber
employerName
jobTitle
prevailingWage
employmentStartDate
}
}
with the following link: Query Link
Deploying to Heroku
To deploy Docker container to heroku:
- heroku create (one time step)
- heroku container:push web
- heroku container:release web
- heroku open
Pushing Salary Data from Local DB to Production DB
You'll need:
- postgresql
- install with:
brew install pgloader
- install with:
- pgloader
- install with:
brew install postgresql
- install with:
- Heroku CLI
In the future, when we want to update the database, the steps to push our local sqlite database to the production heroku database are:
- use pgloader to load our local sqlite db to a local postgres database
pgloader data/salary.sqlite postgresql:///[name of postgres dev db]
- reset remote db
heroku pg:reset DATABASE_URL --app salary-database
- push local postgres database to heroku
heroku pg:push postgresql:///[name of postgres dev db] DATABASE_URL --app salary-database
NOTE: you can use heroku pg:info --app salary-database
to get info about the the production database and check if we are near the row limit.