A collection of website spiders. At this time, we only have one spider (mailcollect).
- Install Scrapy from your distro's package repositories, or see https://github.com/scrapy/scrapy
- Download the spiders you want to use:
  - download them manually from https://github.com/evait-security/scrapy-collector/tree/main/scrapy_collector/spiders/
  - or use svn: `svn export https://github.com/evait-security/scrapy-collector/trunk/scrapy_collector/spiders/`
- Run a spider with `scrapy runspider <path-to-spider> <options>` (see examples below)
Familiarize yourself with Scrapy.
git clone git@github.com:evait-security/scrapy-collector.git
cd scrapy-collector/
pipenv shell
(or your preferred way to initiate a virtualenv)
pipenv install
(pip users: python -m pip install -r requirements.txt)
cd scrapy_collector/
scrapy crawl <spider> <options>
(see examples below)
Tries to collect email addresses from a given domain. Follows internal links, including subdomains. Collected addresses are not filtered by domain: every address found is included in the results, even if it belongs to another domain. Optionally outputs the crawled paths.
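Roughly, the crawl starts at the target domain, scans each page body for email addresses, and follows every internal link it finds. The sketch below only illustrates that pattern with Scrapy's `LinkExtractor`; it is not the actual `mailcollect.py` implementation, and the class name, start URL scheme, and `email` output field are assumptions.

```python
# Illustrative sketch only -- not the real mailcollect spider.
# Assumptions: HTTPS start URL, an "email" output field, no show-paths handling.
import re

import scrapy
from scrapy.http import HtmlResponse
from scrapy.linkextractors import LinkExtractor


class MailSketchSpider(scrapy.Spider):
    name = "mail-sketch"  # hypothetical name

    EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

    def __init__(self, target=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Only links on the target domain (and its subdomains) are followed.
        self.link_extractor = LinkExtractor(allow_domains=[target])
        self.start_urls = [f"https://{target}/"]

    def parse(self, response):
        if not isinstance(response, HtmlResponse):
            return  # skip binary responses (images, PDFs, ...)
        # Yield every address found in the page body; addresses are not
        # filtered by domain, matching the behaviour described above.
        for address in set(self.EMAIL_RE.findall(response.text)):
            yield {"email": address}
        # Follow internal links, including linked subdomains.
        for link in self.link_extractor.extract_links(response):
            yield response.follow(link.url, callback=self.parse)
```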
Option | Description
---|---
`-a target=<target-domain>` | The domain to be crawled. Subdomains are included automatically (if they are linked within the page).
`-a show-paths=true` | Optional. Include the crawled paths in the output file.
`-O outfile.json` | Write the results to `outfile.json` in JSON format. Other formats are available as well (see https://docs.scrapy.org/en/latest/topics/feed-exports.html#serialization-formats).
scrapy runspider mailcollect.py -a target=<target-domain> -O outfile.json
scrapy runspider mailcollect.py -a target=<target-domain> -O outfile.json -a show-paths=true
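Since the results are not filtered by domain, a small post-processing step can narrow them down to the target. This is a minimal sketch under two assumptions that may not hold for the real spider: the feed was written with `-O outfile.json` (a JSON array) and each item carries an `email` field.

```python
# Post-processing sketch: keep only addresses on the target domain (and
# its subdomains). Assumes a JSON array feed and an "email" field per item;
# adjust the field name to whatever mailcollect actually emits.
import json
import sys

target = sys.argv[1].lower()  # e.g. example.com

with open("outfile.json") as fh:
    items = json.load(fh)

kept = sorted(
    {
        item["email"].lower()
        for item in items
        if "email" in item
        and (
            item["email"].lower().endswith("@" + target)
            or item["email"].lower().endswith("." + target)
        )
    }
)
print("\n".join(kept))
```

Run it as e.g. `python filter_mails.py example.com` (hypothetical file name) from the directory containing `outfile.json`.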