Scrapping virus exchange #23

Open
wants to merge 2 commits into
base: main

Conversation

Sarieh-M

@Sarieh-M Sarieh-M commented Nov 3, 2024

No description provided.

@Sarieh-M Sarieh-M requested a review from rothoma2 November 3, 2024 20:06
@rothoma2
Contributor

I looked at this and we have a few issues to work on.

  1. I don't think we should add Selenium as a dependency here. Selenium is great, but it requires setting up a browser and keeping it installed locally, which makes the tool more complicated to set up and harder for most people to use.

We should first see whether we can crawl the site with lower-level tools such as requests and bs4 (BeautifulSoup). Most sites don't require JavaScript rendering, and if this one does, we still have some options before resorting to Selenium.

  2. I see the flake8 hooks are failing, so can you fix that as well? If you are unfamiliar with flake8, it is a linter that checks Python code against a set of style rules to keep it clean and consistent.
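
To illustrate the first point, here is a minimal sketch of pulling links out of a page with BeautifulSoup instead of driving a browser. The `/samples/` URL pattern, the `example.invalid` host, and the function name `extract_sample_links` are assumptions made up for illustration; the real site's markup may look quite different.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_sample_links(html: str, base_url: str) -> list[str]:
    """Return absolute URLs of links whose href contains '/samples/'.

    The '/samples/' pattern is a hypothetical placeholder, not the
    actual layout of any particular site.
    """
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips <a> tags that have no href attribute at all.
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        if "/samples/" in href:
            # Resolve relative paths against the page's base URL.
            links.append(urljoin(base_url, href))
    return links


if __name__ == "__main__":
    page = '<a href="/samples/abc123">abc123</a><a href="/about">About</a>'
    print(extract_sample_links(page, "https://example.invalid"))
```

In practice the `html` argument would come from something like `requests.get(url).text`, so no browser or driver needs to be installed.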

import time
from pathlib import Path
from datetime import datetime as dt
from selenium import webdriver
Contributor

Let's refactor this so it does not depend on Selenium.

self.wait = WebDriverWait(self.driver, 10)

def login(self, email, password):
# Login to the Virus Exchange site
Contributor

Do they not have an API? Do we really need to log in to download samples?
Could we use requests to send a POST to log in, keep the session cookie, and then send it with a GET request to download the samples?
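
As a rough sketch of that idea: `requests.Session` stores cookies set by a response and sends them on later requests, so a login POST followed by a GET could look like the code below. The form field names `email` and `password`, the URLs, and the function name `fetch_with_login` are assumptions for illustration and are not taken from the actual site.

```python
import requests


def fetch_with_login(login_url, sample_url, email, password, session=None):
    """Log in with a POST, then reuse the same session for a GET.

    `session` should behave like requests.Session; one is created if
    none is passed. The form field names are hypothetical and must be
    matched to the real login form.
    """
    session = session or requests.Session()

    # Cookies set by the login response are stored on the session.
    login_response = session.post(
        login_url,
        data={"email": email, "password": password},
        timeout=30,
    )
    login_response.raise_for_status()

    # The stored cookies are sent automatically with this request.
    sample_response = session.get(sample_url, timeout=30)
    sample_response.raise_for_status()
    return sample_response.content
```

A call might look like `fetch_with_login("https://example.invalid/login", "https://example.invalid/samples/abc123", email, password)`, with placeholder URLs. Some sites also expect a CSRF token taken from the login page, which this sketch does not handle.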

Author

The API didn't work

"Are we able to maybe use request to send a post, to login, keep the cookie and then send it in another request to get the samples via get?"

I tried it and it didn't work

3 participants