SKAL TRIAL SCRAPING
scrapy startproject projectname
scrapy genspider spidername URL
Scrapy version used: 2.1.0
Splash with Scrapy for accessing JavaScript-rendered content
Docker for the Splash browser:
sudo docker run -it -p 8050:8050 --name splash scrapinghub/splash
Splash requires knowledge of Lua scripts
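A minimal sketch of how a page could be requested through the Splash container, assuming it is published on localhost:8050 as in the docker command. It only builds a URL for Splash's render.html HTTP endpoint; the helper name splash_render_url and the 0.5 second wait are illustrative assumptions, not part of these notes.

```python
from urllib.parse import urlencode

# Assumption: Splash is reachable on localhost:8050 (the port published by docker run).
SPLASH_BASE = "http://localhost:8050"


def splash_render_url(target_url, wait=0.5, base=SPLASH_BASE):
    """Build a render.html URL; Splash returns the page HTML after JavaScript has run.

    `wait` is the number of seconds Splash waits after the page loads (hypothetical default).
    """
    # urlencode percent-escapes the target URL so it survives as a single query value.
    query = urlencode({"url": target_url, "wait": wait})
    return f"{base}/render.html?{query}"
```

The resulting URL can be opened in a browser or fetched from a spider in place of the original page URL. No Lua is needed for this simple case; Lua scripts come in for multi-step interactions.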
Beginner friendly: Selenium for a headless browser https://github.com/clemfromspace/scrapy-selenium
ChromeDriver (Chrome Selenium driver) download path https://chromedriver.storage.googleapis.com/index.html?path=81.0.4044.69/
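A sketch of the settings.py entries that scrapy-selenium reads, following the pattern in that project's README but switched to Chrome. It assumes the downloaded chromedriver binary is on PATH; if it is not, replace the which() lookup with the full path to the binary.

```python
from shutil import which

# Use Chrome through Selenium (assumption: chromedriver was downloaded and is on PATH).
SELENIUM_DRIVER_NAME = "chrome"
SELENIUM_DRIVER_EXECUTABLE_PATH = which("chromedriver")  # None if not found on PATH
SELENIUM_DRIVER_ARGUMENTS = ["--headless"]  # run Chrome without opening a window

# Route requests through the scrapy-selenium downloader middleware.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_selenium.SeleniumMiddleware": 800,
}
```

In the spider, pages that need the browser are then requested with SeleniumRequest from scrapy_selenium in place of scrapy.Request.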
Install pymongo:
pip3 install pymongo
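Run MongoDB in Docker (the network some-network must already exist; MongoDB listens on port 27017):
docker run -d --network some-network --name some-mongo -e MONGO_INITDB_ROOT_USERNAME=mongoadmin -e MONGO_INITDB_ROOT_PASSWORD=secret -p 27017:27017 mongo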
or
use the mongodb.yaml file with docker-compose:
docker-compose -f mongodb.yaml up -d
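A minimal sketch of connecting to the MongoDB container from Python, assuming it is reachable on localhost at MongoDB's default port 27017 with the root credentials from the docker command. The helper names mongo_uri and save_item, and the database and collection names, are hypothetical.

```python
from urllib.parse import quote_plus


def mongo_uri(user, password, host="localhost", port=27017):
    """Build a MongoDB connection URI, escaping special characters in the credentials."""
    # Root users created via MONGO_INITDB_ROOT_* live in the admin database.
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/?authSource=admin"
    )


def save_item(item, uri, db_name="scraping", collection="items"):
    """Insert one scraped item (a dict); db_name and collection are placeholder names."""
    from pymongo import MongoClient  # imported here so the URI helper works without pymongo

    client = MongoClient(uri)
    try:
        return client[db_name][collection].insert_one(dict(item)).inserted_id
    finally:
        client.close()
```

For example, save_item({"title": "x"}, mongo_uri("mongoadmin", "secret")) would store one document; in a Scrapy project the same call would usually sit inside an item pipeline's process_item method.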
Image downloads need Pillow:
pip3 install pillow
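Pillow is what Scrapy's built-in ImagesPipeline depends on. A sketch of the settings.py entries that enable it, assuming that pipeline is the one being used; the "images" storage directory is a placeholder.

```python
# Enable Scrapy's built-in image pipeline (it requires Pillow).
ITEM_PIPELINES = {
    "scrapy.pipelines.images.ImagesPipeline": 1,
}

# Directory where downloaded images are stored (placeholder path).
IMAGES_STORE = "images"
```

Items then carry the URLs to download in an image_urls field, and the pipeline writes the download results to an images field.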