Several Crawls in one DB #38

Open
jogli5er opened this issue Dec 5, 2019 · 0 comments

jogli5er commented Dec 5, 2019

New table: Crawls, with an ID and a human-readable ID

Paths: the secure flag can change over time, so it could be moved to the content, meaning that a content was found with the secure flag set to true or false

Contents: Add

  • Crawl ID foreign key
  • secure flag

Links: Add

  • Source Content
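
The schema changes listed above could be expressed as a Sequelize migration roughly as follows. This is a minimal sketch, not the project's actual migration: the table names (`crawls`, `contents`, `links`), the column names (`humanReadableId`, `crawlId`, `secure`, `sourceContentId`), and the column types are all assumptions made for illustration. Only the query interface methods (`createTable`, `addColumn`, `removeColumn`, `dropTable`) are standard Sequelize.

```javascript
// Sketch of a Sequelize migration for the changes listed above.
// All table and column names are hypothetical; adjust them to the real models.
// In a real migration file these two functions would be exported,
// e.g. `module.exports = { up, down };`.

async function up(queryInterface, Sequelize) {
  // New table: one row per crawl, with a human-readable identifier.
  await queryInterface.createTable("crawls", {
    id: { type: Sequelize.INTEGER, primaryKey: true, autoIncrement: true },
    humanReadableId: { type: Sequelize.STRING, allowNull: false, unique: true },
  });

  // Contents: record which crawl found the content and the secure flag seen.
  await queryInterface.addColumn("contents", "crawlId", {
    type: Sequelize.INTEGER,
    references: { model: "crawls", key: "id" },
  });
  await queryInterface.addColumn("contents", "secure", {
    type: Sequelize.BOOLEAN,
  });

  // Links: record the content in which the link was found.
  await queryInterface.addColumn("links", "sourceContentId", {
    type: Sequelize.INTEGER,
    references: { model: "contents", key: "id" },
  });
}

async function down(queryInterface) {
  // Undo the steps of `up` in reverse order.
  await queryInterface.removeColumn("links", "sourceContentId");
  await queryInterface.removeColumn("contents", "secure");
  await queryInterface.removeColumn("contents", "crawlId");
  await queryInterface.dropTable("crawls");
}
```

Keeping `down` as the exact reverse of `up` makes it easier to test the migration in both directions before it is applied to a database that already holds crawl data.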

Whenever the crawler is run:
Configuration: tag and random ID => find control ID and use this
Paths: we consider all paths that finished before the current crawl started
Assumption: we only ever run one crawl at a time
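
The path rule can be illustrated with a small filter. This is a minimal sketch in plain JavaScript, assuming each path record carries a `finishedAt` timestamp (a hypothetical field name) that is `null` for paths that never finished. `pathsToConsider` is likewise an illustrative helper, not a function from the project.

```javascript
// Select the paths a new crawl should consider: those that finished
// before the current crawl started.
// `finishedAt` is a hypothetical field; null means "never finished".
function pathsToConsider(paths, crawlStart) {
  return paths.filter(
    (p) => p.finishedAt !== null && p.finishedAt.getTime() < crawlStart.getTime()
  );
}

const crawlStart = new Date("2019-12-05T12:00:00Z");
const paths = [
  { path: "/a", finishedAt: new Date("2019-12-04T08:00:00Z") }, // finished earlier
  { path: "/b", finishedAt: null },                             // never finished
  { path: "/c", finishedAt: new Date("2019-12-05T13:00:00Z") }, // finished later
];

const selected = pathsToConsider(paths, crawlStart).map((p) => p.path);
console.log(selected); // only "/a" qualifies
```

A plain timestamp comparison relies on the assumption that only one crawl runs at a time. With overlapping crawls, the filter would likely also need the crawl ID.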

jogli5er changed the title from "Sever Crawls in one DB" to "Several Crawls in one DB" Dec 5, 2019
jogli5er added a commit that referenced this issue Dec 5, 2019
This commit initializes the structure so that we can update the
sequelize models continuously. First, this lets us streamline the
database updates required to run several crawls on the same
database. That in turn lets us compile all runs into one dataset
instead of splitting them across multiple databases. The current
commit is untested; the migrations in particular need testing
before we can use them.

For details on the changes required, please refer to issue #38.