Update redwood / Add Docker / Fix Scraper #13

Open
wants to merge 17 commits into
base: main
from

Conversation

@bozdoz bozdoz commented May 12, 2021

I had issues bringing up the project and realized everything was out of date. I ran yarn rw update and compared the result with a brand-new project to fix any issues.

I also began fixing the scrape function. It appears to be getting the same data for each graph, so this may still be a WIP:

image

image

@netlify

netlify bot commented May 12, 2021

Deploy request for countrycovid19 pending review.

Review with commit 36c24b1

https://app.netlify.com/sites/countrycovid19/deploys

@vercel

vercel bot commented May 12, 2021

Someone is attempting to deploy a commit to a Personal Account owned by @lachlanjc on Vercel.

@lachlanjc first needs to authorize it.

@bozdoz
Author

bozdoz commented May 12, 2021

Resolves #12

@vercel

vercel bot commented May 12, 2021

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/lachlanjc/covid19/ASmkjC8tyoKFtPo4x7nKrNiDcuwJ
✅ Preview: https://covid19-git-fork-bozdoz-update-redwood-lachlanjc.vercel.app

@lachlanjc
Owner

Wow, thanks for bringing this back to life!! Yeah, this was one of the first releases of Redwood we ran it on. Some observations/questions:

  • Looks like netlify.toml needs an update — the build logs say the db up command has changed to prisma migrate
  • We implemented the redwood-hack because the database transactions wouldn't run without it, can we drop that now?
  • Since this project is just using JS without TS, and has no authentication, why are the new types files needed?
  • How does the new scraping work? We just go to the URL of that function to update the database? Will it backfill to when I stopped manually updating it last summer?
  • Looks like the chart data fetching isn't working in prod: https://covid19-qhsp5dim1-lachlanjc.vercel.app/

@bozdoz
Author

bozdoz commented May 12, 2021

Hmm, that's curious.

I will try the site out in docker, to make sure there's a straightforward approach to getting this set up. That should help figure out netlify (I don't have much experience with it).

I imagine redwood-hack is useless, but I'll test it out in docker.

Those new types files were added automatically (and updated frequently!). Might be worth adding them to .gitignore.

The new scraping just alters the selector for the script tag and parses the data differently; that's it. There shouldn't be anything new. Maybe the script tags are in a different format now, and maybe that's why the daily stats numbers are wrong. I'm not sure what it was doing before, though.

@lachlanjc
Owner

Interesting. Yeah, let's ignore the types. The scraper never worked consistently—that was a project we started with the hopes the site could update itself, but I ended up just manually entering the numbers into the database using SQL every day until last summer, when the visualization wasn't that useful anymore.

api/prisma/migrations/migration_lock.toml
@@ -0,0 +1,38 @@
-- CreateTable
CREATE TABLE "Day" (
Author

This new migration is incompatible with production. It likely needs to be reset.

api/prisma/schema.prisma
api/prisma/seed.js (outdated)
-const services = importAll('api', 'services')
+import schemas from 'src/graphql/**/*.{js,ts}'
+import { db } from 'src/lib/db'
+import services from 'src/services/**/*.{js,ts}'
Author

All copied from a new Redwood project.

deleteCountry
} from './countries'

describe('countries', () => {
Author

I'm guessing Redwood generated these somehow too.

package.json
web/src/components/Chart.js
@bozdoz
Author

bozdoz commented May 13, 2021

Docker was a bit tough, but I got it up eventually.

Added a few more countries, changed the default country to be an environment variable, and cached one of the queries to prevent re-renders. Removed the redwood-hack directory and everything seems to work fine. I also think I fixed the scraper.

Try it out if you have docker: docker-compose up --build

The scraper seems to take about 10 minutes.

@bozdoz
Author

bozdoz commented May 13, 2021

proof of scraper fix (I think): 🤔

image

{
  iso: 'cad',
  worldometersSlug: 'canada',
  name: 'Canada',
Author

closes #6

*
* @template T
* @param {Array<T>} array
* @param {(item: T) => void} callback
Author

async function asyncForEach(array, callback) {
  const promises = array.map((item) => callback(item))

  return Promise.allSettled(promises)
Author

allSettled means that some or all could fail 🤷
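
Because Promise.allSettled never rejects, failures stay hidden unless the caller inspects the results. A minimal sketch of how they could be surfaced, assuming the asyncForEach helper from this diff; runAll and its return shape are hypothetical names used only for illustration:

```javascript
// Same helper as in the diff: start every callback, wait for all to settle.
async function asyncForEach(array, callback) {
  const promises = array.map((item) => callback(item))
  return Promise.allSettled(promises)
}

// Hypothetical wrapper: collects rejected results so partial failures
// are reported instead of silently dropped.
async function runAll(items, worker) {
  const results = await asyncForEach(items, worker)
  const failed = []
  results.forEach((result, index) => {
    if (result.status === 'rejected') {
      failed.push({ item: items[index], reason: result.reason })
    }
  })
  return { succeeded: results.length - failed.length, failed }
}
```

A caller could log the failed list and still return a 200, or pick a different status code when failed is non-empty.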

Comment on lines +22 to +24
cors: {
  origin: '*',
  credentials: true,
Author

cors is just to make docker work; if you want I could make this conditional
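
One way this could be made conditional, as a sketch only. Browsers refuse credentialed requests when the allowed origin is the wildcard, so the example names a single origin and enables CORS only outside production. buildCorsOptions and the WEB_ORIGIN variable are hypothetical; the 8910 default matches the port in the compose file, and returning undefined in production assumes the web and API sides share an origin there.

```javascript
// Hypothetical helper: returns CORS options for local/docker use only.
function buildCorsOptions(env) {
  if (env.NODE_ENV === 'production') {
    // Assumption: same-origin in production, so no CORS headers are needed.
    return undefined
  }
  return {
    // A specific origin is required when credentials are allowed.
    origin: env.WEB_ORIGIN || 'http://localhost:8910',
    credentials: true,
  }
}
```

The handler options would then use cors: buildCorsOptions(process.env) in place of the hard-coded object.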

htmlAttempt = $(`div#${chartId} + script`).html()
const hackedChartJSON = htmlAttempt.split('Highcharts.chart(')[1];

const xAxisCategories = hackedChartJSON.match(/xAxis:.*?{.*?categories\:.*?(\[.*?\])/s)
Author

here's where we just get xAxis categories directly


const xVals = JSON.parse(xAxisCategories[1])

const seriesData = hackedChartJSON.match(/series:.*?data:.*?(\[.*?\])/s)
Author

and this gets series data directly
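
To illustrate the two regexes together, here is a self-contained sketch that runs them against a made-up script string. The sample text is an assumption about the page's shape, not captured data, and parseChart is a hypothetical name; the null checks are an addition so that a changed page format does not throw.

```javascript
// Hypothetical sample of the inline script that follows a chart <div>.
const sampleScript = `
Highcharts.chart('coronavirus-cases-linear', {
  xAxis: {
    categories: ["Feb 15, 2020","Feb 16, 2020","Feb 17, 2020"]
  },
  series: [{
    name: 'Cases',
    data: [15,15,25]
  }]
});
`

// Pulls the x-axis labels and the first series' data out of the script text.
function parseChart(scriptText) {
  const parts = scriptText.split('Highcharts.chart(')
  if (parts.length < 2) return null

  const chartSource = parts[1]
  // The s flag lets . match newlines; lazy quantifiers stop at the first match.
  const categories = chartSource.match(/xAxis:.*?{.*?categories:.*?(\[.*?\])/s)
  const data = chartSource.match(/series:.*?data:.*?(\[.*?\])/s)
  if (!categories || !data) return null

  const xVals = JSON.parse(categories[1])
  const yVals = JSON.parse(data[1])
  return xVals.map((label, index) => ({ label, value: yVals[index] }))
}
```

Note that JSON.parse only works here because the captured arrays happen to be valid JSON (double-quoted strings, plain numbers); single-quoted labels would need a different approach.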

Comment on lines -27 to +40
-// need to manually set year to 2020
-let day = new Date(val + ", 2020")
+const day = new Date(val)
Author

I assume that's not true anymore
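
This change relies on the chart labels now carrying their own year. A small guard would make that assumption explicit, so a label that no longer parses fails loudly instead of producing an Invalid Date. parseChartDate is a hypothetical helper, and the sample label format in the comment is an assumption about what the page returns; parsing of non-ISO strings is engine-dependent, though Node accepts this form.

```javascript
// Hypothetical guard around new Date(val): rejects labels that do not parse.
function parseChartDate(label) {
  const day = new Date(label) // e.g. "Feb 15, 2020" (assumed label format)
  if (Number.isNaN(day.getTime())) {
    throw new Error(`Unrecognized date label: ${label}`)
  }
  return day
}
```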

Comment on lines +170 to +173
return {
  statusCode: 200,
  body: JSON.stringify('Success!')
}
Author

new format for redwood function handlers
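
For context, a sketch of a complete handler in this shape, with an error path added. createHandler is a hypothetical factory, and the 500 response is an assumption rather than something in this diff; in a Redwood function file the result would be exported as handler.

```javascript
// Hypothetical factory: wraps an async task in a Lambda-style handler
// that returns { statusCode, body } as in the diff above.
function createHandler(task) {
  return async (event, context) => {
    try {
      await task(event, context)
      return {
        statusCode: 200,
        body: JSON.stringify('Success!'),
      }
    } catch (error) {
      // Assumption: report failures as a 500 with the error message.
      return {
        statusCode: 500,
        body: JSON.stringify({ error: error.message }),
      }
    }
  }
}
```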


services:
  web:
    build: .
Author

builds Dockerfile

web:
  build: .
  ports:
    - 8910:8910
Author

binds host port 8910 to container port 8910

build: .
ports:
  - 8910:8910
command: yarn rw dev web
Author

main command to bring up web side

Comment on lines +10 to +12
environment:
  REDWOOD_ENV_DEFAULT_COUNTRY: usa
  REDWOOD_ENV_DEFAULT_COUNTRY_NAME: United States
Author

dev could easily change these if they wanted
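
On the code side, these values could be read with a fallback so the app still works when they are unset. A sketch, assuming the fallbacks should match the compose defaults; getDefaultCountry is a hypothetical helper. The variables are referenced directly by name, since Redwood exposes REDWOOD_ENV_-prefixed variables to the web side.

```javascript
// Hypothetical helper: reads the default country from the environment,
// falling back to the values used in docker-compose.
function getDefaultCountry() {
  return {
    slug: process.env.REDWOOD_ENV_DEFAULT_COUNTRY || 'usa',
    name: process.env.REDWOOD_ENV_DEFAULT_COUNTRY_NAME || 'United States',
  }
}
```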

restart: always
volumes:
  # keeps database after container stops
  - db:/app/api/prisma
Author

this is a named volume which docker keeps internally

@@ -1,5 +1,5 @@
 [build]
-command = "cp redwood-hack/api-dbInstance.js node_modules/@redwoodjs/api/dist/dbInstance.js && yarn rw db up --no-db-client && yarn rw build"
+command = "yarn rw prisma migrate dev && yarn rw build"
Author

not sure

export const beforeQuery = () => {
  return {
    // never change query with new variables
    variables,
Author

@bozdoz bozdoz May 13, 2021

the query (below) never uses variables, so this prevents it from fetching what it thinks is a new query

variables,
fetchPolicy: 'cache-and-network',
// countries don't change mid-session
nextFetchPolicy: 'no-cache'
Author

meant for this to be cache-only, but maybe doesn't matter

Author

cache-only breaks. Maybe should revert to cache-first (default)
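
For reference, a sketch of what reverting could look like. In Apollo Client, cache-first serves the query from the cache when the data is complete and goes to the network otherwise, while cache-only fails if the cache is missing any requested field, which may be what broke here. The empty variables object is an assumption standing in for whatever the Cell defines above beforeQuery; in the Cell file beforeQuery would be exported.

```javascript
// Defined once at module scope so every call passes the same reference.
const variables = {}

const beforeQuery = () => {
  return {
    // never change query with new variables
    variables,
    fetchPolicy: 'cache-and-network',
    // countries don't change mid-session, so later reads can use the cache
    nextFetchPolicy: 'cache-first',
  }
}
```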

@bozdoz changed the title from "WIP: Update redwood" to "Update redwood / Add Docker / Fix Scraper" on May 17, 2021
2 participants