| | |
|---|---|
| Author | Hrvoje Nikšić (all…) |
| Maintainer(s) | |
| Released | 1996 |
| Source | wget.git |
| Man | |
| Info | |
GNU Wget is a file retrieval utility that can use the HTTP, HTTPS, or FTP protocols. Its features include the ability to keep working in the background after you log out, recursive retrieval of directories, file name wildcard matching, remote file timestamp storage and comparison, use of REST with FTP servers and Range with HTTP servers to resume downloads over slow or unstable connections, support for proxy servers, and configurability.
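As a rough illustration of how the features above map to command-line options, here is a minimal sketch. The function names are hypothetical wrappers invented for this example and are not part of Wget; the options themselves (`--continue`, `--timestamping`, `--recursive`, `--no-parent`, `--accept`, `--background`) are documented in wget(1).

```shell
#!/usr/bin/env bash
# Hypothetical wrappers; the function names are illustrative only.

# Resume a partial download (REST on FTP, Range on HTTP).
resume_download() { wget --continue "$1"; }

# Re-download only if the remote file is newer than the local copy.
fetch_if_newer() { wget --timestamping "$1"; }

# Recursively fetch files whose names match a wildcard, e.g. '*.gz',
# without ascending to the parent directory.
fetch_matching() { wget --recursive --no-parent --accept "$2" "$1"; }

# Detach immediately and keep running after logout;
# when no log file is given, output goes to wget-log.
fetch_in_background() { wget --background "$1"; }
```

For example, `fetch_matching "$URL" '*.gz'` would retrieve only the gzip files below `$URL`, assuming the server exposes a browsable index.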
Download an entire website recursively, including page requisites, while ignoring robots.txt:
~ $ wget --random-wait -r -p -e robots=off -U mozilla WEBSITE_URL
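The same command can be spelled out with long options, which makes each flag's purpose easier to see. This is only a sketch: the default URL `https://example.org/` stands in for the WEBSITE_URL placeholder, and `--wait=1` is an addition not present in the command above (according to wget(1), `--random-wait` varies the interval set by `--wait`). The script prints the command instead of running it.

```shell
#!/usr/bin/env bash
# WEBSITE_URL is a placeholder; the default below is an assumption.
WEBSITE_URL="${WEBSITE_URL:-https://example.org/}"

wget_args=(
  --wait=1               # base delay between requests (added for this sketch)
  --random-wait          # randomise the delay set by --wait
  --recursive            # -r: follow links and retrieve them too
  --page-requisites      # -p: also fetch images, stylesheets, etc.
  --execute robots=off   # -e: ignore robots.txt
  --user-agent=mozilla   # -U: send this User-Agent string
)

# Dry run: print the command for review. Remove "echo" to execute it.
echo wget "${wget_args[@]}" "$WEBSITE_URL"
```

Keeping the options in an array avoids quoting problems if any argument later contains spaces.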
Download a certain file from a list of files on a server with a known structure:
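#!/usr/bin/env bash
# Directory listing to scan for links.
URL="https://dumps.wikimedia.org/other/pageviews/2016/2016-01/"
# PCRE lookarounds: match only the value of each href attribute.
FILE_RE='(?<=<a href=")[^"]+(?=">)'
# Read one link per line into an array (requires GNU grep for -P).
mapfile -t FILE_NAMES < <(curl -s "$URL" | grep -oP "$FILE_RE")
# Fetch the third link in the listing (array indices start at 0).
wget "$URL${FILE_NAMES[2]}"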
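The script above assumes the listing contains at least three links. A slightly more defensive variant might separate link extraction from selection and fail cleanly when the index is out of range. The helper names `extract_links` and `pick_link` are hypothetical, and the sketch assumes GNU grep with `-P` support.

```shell
#!/usr/bin/env bash
# extract_links (hypothetical): read HTML on stdin, print href values, one per line.
extract_links() {
  grep -oP '(?<=<a href=")[^"]+(?=">)'
}

# pick_link (hypothetical): print the link at a 0-based index read from stdin HTML,
# or return 1 with a message on stderr if the listing is too short.
pick_link() {
  local index="$1" html
  local -a links
  html="$(cat)"
  mapfile -t links < <(printf '%s\n' "$html" | extract_links)
  if (( index >= ${#links[@]} )); then
    echo "no link at index $index" >&2
    return 1
  fi
  printf '%s\n' "${links[index]}"
}
```

It could then be used as `name="$(curl -s "$URL" | pick_link 2)" && wget "$URL$name"`, so that `wget` only runs when a file name was actually found.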
See also: declare, curl(1), grep(1)