crawler-Course-Example: Course Material for Automatically Scheduled Web Crawling

A fully automated web scraping solution. GitHub Actions schedules and triggers each run, Puppeteer (running on Node.js) collects the data, and GitHub Pages hosts the results so that others can use them.

Techniques

Node.js
GitHub Actions: a DevOps solution for automatically scheduled execution.
Puppeteer: a browser automation tool for Node.js.
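
To make the Puppeteer step more concrete, here is a minimal sketch of a collection function. It is not the repository's actual crawler: the function name `scrapeLinks`, the `a` selector, and the launch arguments are assumptions chosen for illustration, and it presumes that `puppeteer` has been installed with `npm install puppeteer`.

```javascript
// Illustrative sketch only; scrapeLinks and the 'a' selector are hypothetical.
async function scrapeLinks(url) {
  // Loaded lazily so that this file can be parsed without puppeteer installed.
  const puppeteer = require('puppeteer');

  // The sandbox flags are an assumption; CI containers often need them.
  const browser = await puppeteer.launch({
    headless: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
  });

  try {
    const page = await browser.newPage();
    // Wait until network activity has mostly settled before reading the DOM.
    await page.goto(url, { waitUntil: 'networkidle2' });

    const title = await page.title();
    // Collect the visible text and target of every link on the page.
    const links = await page.$$eval('a', (anchors) =>
      anchors.map((a) => ({ text: a.textContent.trim(), href: a.href })));

    return { title, links };
  } finally {
    // Always release the browser, even if navigation fails.
    await browser.close();
  }
}
```

A scheduled job could then call something like `scrapeLinks('https://example.com').then(console.log)`, where the URL is a placeholder for the real target site.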

Slide

Web Crawler Implementation - 112-1 Information Storage and Retrieval

Citation

Chen, Y.-T. (2024). Crawler-Course-Example (20240518.210053) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.11214115

Memo

Finally, remember to store the data using the following fields:

id: required; every record must have an id.
dc.title
dc.creator
dc.subject
dc.description
dc.publisher
dc.contributor
dc.date: converting to ISO format is recommended.
dc.type
dc.format
dc.identifier
dc.source
dc.language
dc.relation
dc.coverage
dc.rights
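
The field list above can be sketched as a small record builder. This is one possible arrangement, not the repository's code: the helper names (`buildRecord`, `toIsoDate`, `toCsv`), the column order, and the choice to fill missing fields with empty strings are all assumptions. The only rules taken from the memo are that `id` is mandatory and that `dc.date` should be converted to ISO format.

```javascript
// Field names come from the memo; the ordering is an assumption.
const FIELDS = [
  'id', 'dc.title', 'dc.creator', 'dc.subject', 'dc.description',
  'dc.publisher', 'dc.contributor', 'dc.date', 'dc.type', 'dc.format',
  'dc.identifier', 'dc.source', 'dc.language', 'dc.relation',
  'dc.coverage', 'dc.rights',
];

// Convert a date-like value to an ISO 8601 string; '' if it cannot be parsed.
function toIsoDate(value) {
  const date = new Date(value);
  return Number.isNaN(date.getTime()) ? '' : date.toISOString();
}

// Build a record that has every field; throws when id is missing.
function buildRecord(raw) {
  if (raw.id === undefined || raw.id === null || raw.id === '') {
    throw new Error('Every record must have an id');
  }
  const record = {};
  for (const field of FIELDS) {
    record[field] = raw[field] ?? '';
  }
  if (record['dc.date'] !== '') {
    record['dc.date'] = toIsoDate(record['dc.date']);
  }
  return record;
}

// Quote a cell when it contains a comma, a quote, or a line break.
function csvCell(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize records to CSV text with a header row.
function toCsv(records) {
  const lines = [FIELDS.join(',')];
  for (const record of records) {
    lines.push(FIELDS.map((field) => csvCell(record[field])).join(','));
  }
  return lines.join('\n');
}
```

Keeping every column present, even when empty, means each row has the same shape, which tends to make the resulting CSV easier for others to consume.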

API

https://pulipulichen.github.io/crawler-Course-Example/data.csv
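
As a rough illustration of how a consumer might read this file, the sketch below downloads the CSV and turns each row into an object keyed by the header row. The helper names (`parseCsv`, `rowsToObjects`, `loadData`) are hypothetical, the parser is a deliberately minimal one, and the code assumes both that the file begins with a header row and that it runs on Node.js 18 or later, where `fetch` is available globally.

```javascript
const DATA_URL = 'https://pulipulichen.github.io/crawler-Course-Example/data.csv';

// Minimal CSV parser: handles quoted cells, doubled quotes, and CRLF line ends.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let cell = '';
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        cell += '"'; // an escaped quote inside a quoted cell
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        cell += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ',') {
      row.push(cell);
      cell = '';
    } else if (ch === '\n') {
      row.push(cell);
      rows.push(row);
      row = [];
      cell = '';
    } else if (ch !== '\r') {
      cell += ch;
    }
  }
  // Flush the last row when the text does not end with a newline.
  if (cell !== '' || row.length > 0) {
    row.push(cell);
    rows.push(row);
  }
  return rows;
}

// Turn parsed rows into objects keyed by the header row.
function rowsToObjects(rows) {
  const [header, ...body] = rows;
  if (!header) return [];
  return body.map((cells) =>
    Object.fromEntries(header.map((name, i) => [name, cells[i] ?? ''])));
}

// Download and parse the published data.
async function loadData(url = DATA_URL) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return rowsToObjects(parseCsv(await response.text()));
}
```

Calling `loadData().then((records) => console.log(records.length))` would print the number of records, provided the file is reachable.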
