SWHL/BaiduImageCrawling

🕷️ Baidu Image Crawling

Introduction

A super lightweight Baidu image crawler, modified from https://github.com/kong36088/BaiduImageCrawling

Installation

pip install baidu_image_crawling

Python usage

from baidu_image_crawling.main import Crawler

crawler = Crawler(0.05, save_dir="outputs")  # crawl delay of 0.05

# Crawl the keyword "美女": 2 pages in total, starting at page 1, 30 images per page, i.e. 2*30=60 images in total
crawler(word="美女", total_page=2, start_page=1, per_page=30)
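The page arithmetic in the comment above can be sketched as a small helper. This is only an illustration: `max_image_count` is a hypothetical name that is not part of `baidu_image_crawling`, and the result is an upper bound that assumes every requested image is actually returned and downloaded.

```python
def max_image_count(total_page: int, per_page: int = 30) -> int:
    """Upper bound on images requested: pages multiplied by images per page.

    Hypothetical helper for illustration; not part of baidu_image_crawling.
    """
    if total_page < 0 or per_page < 0:
        raise ValueError("total_page and per_page must be non-negative")
    return total_page * per_page


# The call in the example above requests 2 pages of 30 images each.
requested = max_image_count(total_page=2, per_page=30)
print(requested)
```

The number of files actually saved may be lower if some downloads fail.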

Command-line usage

baidu_image_crawling -w 美女 -tp 1 -sp 1 -pp 2

View the parameter documentation:

$ baidu_image_crawling -h
usage: baidu_image_crawling [-h] -w WORD -tp TOTAL_PAGE -sp START_PAGE [-pp [PER_PAGE]] [-sd SAVE_DIR] [-d DELAY]

options:
  -h, --help            show this help message and exit
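  -w WORD, --word WORD  keyword to crawl
  -tp TOTAL_PAGE, --total_page TOTAL_PAGE
                        total number of pages to crawl
  -sp START_PAGE, --start_page START_PAGE
                        starting page number
  -pp [PER_PAGE], --per_page [PER_PAGE]
                        number of images per page
  -sd SAVE_DIR, --save_dir SAVE_DIR
                        directory where images are saved
  -d DELAY, --delay DELAY
                        crawl delay (interval between requests)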
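
For readers who want to see how such an interface fits together, here is a minimal `argparse` sketch that mirrors the flags in the help output above. It is not the package's actual source. The argument types and the default values (30, "outputs", 0.05) are assumptions borrowed from the Python usage example, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (illustrative sketch only)."""
    parser = argparse.ArgumentParser(prog="baidu_image_crawling")
    parser.add_argument("-w", "--word", type=str, required=True,
                        help="keyword to crawl")
    parser.add_argument("-tp", "--total_page", type=int, required=True,
                        help="total number of pages to crawl")
    parser.add_argument("-sp", "--start_page", type=int, required=True,
                        help="starting page number")
    # nargs="?" makes the value optional, matching "[PER_PAGE]" in the usage line.
    # The default of 30 is an assumption, not a documented value.
    parser.add_argument("-pp", "--per_page", type=int, nargs="?",
                        const=30, default=30,
                        help="number of images per page")
    # Defaults below are assumptions taken from the Python usage example.
    parser.add_argument("-sd", "--save_dir", type=str, default="outputs",
                        help="directory where images are saved")
    parser.add_argument("-d", "--delay", type=float, default=0.05,
                        help="crawl delay (interval between requests)")
    return parser


# Parse the same arguments as the terminal example above.
args = build_parser().parse_args(["-w", "美女", "-tp", "1", "-sp", "1", "-pp", "2"])
print(args.word, args.total_page, args.start_page, args.per_page)
```

Passing an explicit argument list to `parse_args` keeps the sketch runnable without a real command line.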