finnito/link-checker


What is This?

This is a Python script that reads a website's sitemap.xml, checks the HTTP status of every <a> element it finds on the site, and produces a report, which is stored on your server and emailed to you (if enabled)!

Simply add it to your crontab to run it on a regular basis and stay aware of any dead links that need fixing!
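If you're curious how a check like this can work, here is a minimal sketch using only the standard library. It is not the script's actual code: the names `sitemap_pages`, `page_links` and `status_of` are made up for illustration, and it assumes the sitemap uses the standard sitemaps.org namespace.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin

# Namespace used by standard XML sitemaps (assumed here).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_pages(sitemap_xml):
    """Return the page URLs listed in the <loc> elements of a sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


class AnchorCollector(HTMLParser):
    """Collects the href of every <a> element, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)
        # Only http(s) links have an HTTP status; skip mailto:, tel:, etc.
        if absolute.startswith(("http://", "https://")):
            self.links.append(absolute)


def page_links(page_url, html):
    """Return the absolute http(s) links found in a page's HTML."""
    collector = AnchorCollector(page_url)
    collector.feed(html)
    return collector.links


def status_of(url, timeout=10):
    """Return the HTTP status code for url, or None if it is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a dead link
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...
```

A report is then just a matter of calling `status_of` on each link of each page. Note that some servers reject `HEAD` requests, so a real checker may need to fall back to `GET`.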

Sound good?

✅ Getting Started

  1. Clone the repo

         cd ~/
         git clone [email protected]:Finnito/link-checker.git
         cd link-checker

  2. Install pipenv

         sudo apt install pipenv

  3. Install the project

         pipenv install

  4. Configure some settings

         cp config.example config
         nano config

     The config file looks like this:

         [default]
         sitemap_url     = http://localhost:1313/sitemap.xml     # The script only supports XML sitemaps
         email_log       = yes                                   # Set to "no" to disable email logs
         to_email        = [email protected]                        # Who's getting the email reports?
         from_email      = [email protected]                        # Ensure the domain meets your mail server requirements
         email_subject   = Link Checker reports                  # Customise the subject in case you don't like it!

  5. Test the script!

         python3 -m pipenv run link-checker

  6. Set up a crontab entry. I prefer mine to run at 6am each Monday, like so:

         crontab -e

         # Paste the following
         0 6 * * 1 cd ~/link-checker/ && python3 -m pipenv run link-checker
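If you want to sanity-check your config before the first cron run, something like the sketch below may help. It assumes the file is INI-style and readable by Python's `configparser`; the README doesn't say how the script itself parses it. The function name `load_settings` and the two `example.com` addresses are placeholders, not part of the project. The one thing to notice is `inline_comment_prefixes`: by default `configparser` keeps trailing `# ...` comments as part of the value.

```python
import configparser

# Same shape as config.example; the email addresses are hypothetical.
EXAMPLE_CONFIG = """\
[default]
sitemap_url     = http://localhost:1313/sitemap.xml     # The script only does XML sitemaps
email_log       = yes                                   # Set to "no" to disable email logs
to_email        = reports@example.com                   # placeholder recipient
from_email      = checker@example.com                   # placeholder sender
email_subject   = Link Checker reports                  # Customise the subject
"""


def load_settings(text):
    """Parse the [default] section, stripping trailing '# ...' comments."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(text)
    section = parser["default"]
    return {
        "sitemap_url": section["sitemap_url"],
        # getboolean understands yes/no, on/off, true/false and 1/0.
        "email_log": section.getboolean("email_log"),
        "to_email": section["to_email"],
        "from_email": section["from_email"],
        "email_subject": section["email_subject"],
    }


if __name__ == "__main__":
    for key, value in load_settings(EXAMPLE_CONFIG).items():
        print(f"{key} = {value!r}")
```

To check your real file, read it with `open("config").read()` and pass the text to `load_settings`.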

Extra for Experts (not really)

Alternatively, you can pass the sitemap URL as an argument to the script. This overrides the sitemap_url in the config file, so you can check multiple sites with one script. It might look like this in your crontab:

0 6 * * 1 cd ~/link-checker/ && python3 -m pipenv run link-checker https://finn.lesueur.nz/sitemap.xml
5 6 * * 1 cd ~/link-checker/ && python3 -m pipenv run link-checker https://science.lesueur.nz/sitemap.xml
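The override described above boils down to "use the argument if there is one, otherwise fall back to the config". A rough sketch of that rule follows; `resolve_sitemap_url` is a hypothetical helper, not a function from the script.

```python
import sys


def resolve_sitemap_url(argv, configured_url):
    """Pick the sitemap URL: the first CLI argument wins over the config value.

    argv is a sys.argv-style list: [program_name, optional_sitemap_url].
    """
    if len(argv) > 1 and argv[1].strip():
        return argv[1].strip()
    return configured_url


if __name__ == "__main__":
    # The fallback here stands in for sitemap_url from the config file.
    print(resolve_sitemap_url(sys.argv, "http://localhost:1313/sitemap.xml"))
```

Because each crontab line passes its own URL, the two entries above can share a single config for the email settings.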
