Architecture review: should we support constant injection and scraping while the user is browsing? #75

Open
justincy opened this issue Mar 29, 2017 · 2 comments

Comments

@justincy
Member

genscrape is currently architected to be injected on every page the user visits. This means the page is scraped (and sometimes AJAX endpoints are hit) even if the user never intends to use the data. RootsSearch is implemented this way, and the user acts on the scraped data only 0.3% of the time that genscrape is injected. That's a lot of wasted resources for everyone.

Always injecting has two advantages:

  1. We can notify the user when data is available
  2. We can track how often the extension is used vs. how often it could be used

We can still meet those objectives if we just match the current URL against all registered scrapers.
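A minimal sketch of what URL matching could look like, assuming each scraper registers a list of URL patterns. The registry shape, the example domains, and the names `findScraper`, `onPageVisited`, and `onUserAction` are all hypothetical and are not part of genscrape's current API.

```typescript
// Hypothetical registration record: each scraper declares the URLs it handles.
interface ScraperRegistration {
  name: string;
  patterns: RegExp[];
}

// Example registry with made-up sites (assumption, for illustration only).
const registeredScrapers: ScraperRegistration[] = [
  { name: "example-records", patterns: [/^https:\/\/records\.example\.com\/record\/\d+/] },
  { name: "example-trees", patterns: [/^https:\/\/trees\.example\.org\/person\//] },
];

// Returns the first scraper with a pattern matching the URL, or undefined.
// Nothing is injected and the page is not scraped; only the URL is inspected.
function findScraper(
  url: string,
  scrapers: ScraperRegistration[] = registeredScrapers
): ScraperRegistration | undefined {
  for (const scraper of scrapers) {
    if (scraper.patterns.some((pattern) => pattern.test(url))) {
      return scraper;
    }
  }
  return undefined;
}

// Counters for "could be used" (available) vs. "was used" (used).
const usage = { available: 0, used: 0 };

// Called on each page visit: tells the caller whether to notify the user.
function onPageVisited(url: string): boolean {
  const match = findScraper(url);
  if (match) {
    usage.available += 1;
  }
  return match !== undefined;
}

// Called only when the user asks for the data: returns which scraper to run.
function onUserAction(url: string): string | undefined {
  const match = findScraper(url);
  if (match) {
    usage.used += 1;
  }
  return match ? match.name : undefined;
}
```

With something like this, both objectives (notifying the user and tracking available vs. actual use) depend only on the URL, so the scraper itself would run only after `onUserAction` picks one.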

Benefits of not always running:

  1. Consumes fewer resources
  2. Simpler scraper implementations (scrapers never have to support single-page apps or listen for URL changes)

It does mean that genscrape and all scrapers must be written so that they can be injected or run multiple times on the same page. That shouldn't be a problem once we get rid of all URL change listeners.
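One way to picture a scraper that is safe to run repeatedly is a plain function with no module-level state and no listeners. The sketch below uses a hypothetical `PageLike` object in place of the real DOM so it can run outside a browser; `scrapePage` and the `name` field are assumptions for illustration, not genscrape's actual interface.

```typescript
// Hypothetical stand-in for the page, so the sketch runs without a DOM.
interface PageLike {
  url: string;
  fields: Record<string, string>;
}

interface ScrapeResult {
  url: string;
  name: string | null;
}

// A scraper as a plain function: it keeps no state between calls and
// registers no URL-change listeners. Each call reads the page as it is now.
function scrapePage(page: PageLike): ScrapeResult {
  const name = page.fields["name"];
  return { url: page.url, name: name !== undefined ? name : null };
}
```

Because nothing persists between calls, a single-page app that changes its URL and content needs no special handling: the next user-initiated call to `scrapePage` simply sees the current state.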

@justincy
Member Author

Another benefit of this change (removing a use case) is added simplicity, both in the abstraction and in the code base.

@justincy
Member Author

This is a breaking change, though, and I'm generally not in favor of those. Thankfully, the only consumers are products I support, so migration is trivial.
