
Switch to a plan-centered instead of gush-centered data model #58

Closed
niryariv opened this issue Jan 6, 2014 · 6 comments

@niryariv
Owner

niryariv commented Jan 6, 2014

Currently the data model is gush-centered: as it works now, we scrape one gush at a time and then store all of that gush's plans in the DB.

That is a leftover from the original sprint to get the code working, and it isn't clean. It is already causing issues: e.g., when a plan exists in several gushim, it is stored several times. Since MMI has a quirky protocol in which a plan appears in ALL gushim at certain stages, some plans end up duplicated ~500 times, which is what required the whole blacklisting system, etc.

The idea here is to switch to a plan-centric model: plan_id is the index, each plan appears once, and each plan has an array of the gushim it belongs to.
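The switch described above can be sketched in plain Python. This is a hypothetical illustration, not the project's actual code; the field names (`plan_id`, `gushim`) follow the ticket, but the input shape is assumed:

```python
# Sketch of the proposed plan-centric model. Instead of storing a copy of
# each plan per gush (the current gush-centered model), we key plans by
# plan_id and accumulate the gushim each plan belongs to.

def merge_scraped_plans(scraped):
    """scraped: list of (gush_id, plan_dict) pairs, as a per-gush scrape
    might produce them. Returns a dict keyed by plan_id, with each plan
    stored once and carrying a 'gushim' array."""
    plans = {}
    for gush_id, plan in scraped:
        entry = plans.setdefault(plan["plan_id"], {**plan, "gushim": []})
        if gush_id not in entry["gushim"]:
            entry["gushim"].append(gush_id)
    return plans

# A plan that appears in several gushim is now stored exactly once,
# instead of once per gush (the duplication this ticket is about):
scraped = [
    ("30649", {"plan_id": "101-0000001", "status": "approved"}),
    ("30650", {"plan_id": "101-0000001", "status": "approved"}),
]
merged = merge_scraped_plans(scraped)
# merged holds a single entry whose "gushim" list is ["30649", "30650"]
```

Under this model the blacklisting workaround becomes unnecessary, since duplication is eliminated at the storage level rather than filtered afterwards.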

@alonisser
Collaborator

I think a plan-centered model is better. I also think we should aim for meaningful, saveable URLs.

@niryariv
Owner Author

niryariv commented Jan 7, 2014

I'm not sure we'd have a plan URL - I added text above to explain what this ticket is about.

In any case, keep in mind that we want to store as little of the plan data as possible on our side, and instead link to the plan's URL on the canonical site (i.e. MMI or the Ministry of Interior). This avoids sync issues later between our copy of the data and the data on the canonical site.

@alonisser
Collaborator

I don't think we need most of the plan data - just the plan number, gush, city, and street. These don't change for a given plan, so the data won't go stale. We would have to build "static snapshots" of the SPA and serve them to Google's crawlers; see this example - it's Angular-specific, but it could work with our SPA.

Also: what I was referencing is actually an opentaba-client issue.
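Combining the two comments above, a minimal plan record might look like the sketch below. The field names and the URL pattern are assumptions for illustration (the real canonical link would point at the actual MMI / Ministry of Interior plan page):

```python
# Hypothetical minimal plan record: store only stable identifiers plus a
# link to the canonical page, so our copy cannot drift out of sync with
# the data on the source site.

def minimal_plan_record(plan_id, gushim, city, street=None):
    """Build a minimal plan document. All field names are assumptions;
    the URL below is a placeholder, not the real MMI scheme."""
    return {
        "plan_id": plan_id,
        "gushim": gushim,      # every gush this plan belongs to
        "city": city,
        "street": street,
        # Placeholder canonical link -- in practice this would be the
        # plan's page on the MMI or Ministry of Interior site.
        "canonical_url": "https://example.gov.il/plans/%s" % plan_id,
    }

record = minimal_plan_record("101-0000001", ["30649", "30650"], "Jerusalem")
```

Anything beyond these identifiers would be fetched from the canonical URL at display time rather than copied into our DB.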

@niryariv
Owner Author

niryariv commented Jan 7, 2014

I didn't save the explanation text I wrote... rewrote it now - see above.

@alonisser
Collaborator

@niryariv great idea and a better data model.

@florpor
Collaborator

florpor commented Jul 2, 2014

done in 511186d 5ed3023 0e3cb47 b1a4374

@florpor florpor closed this as completed Jul 2, 2014