All notable changes to this project will be documented in this file. The format is based on Keep a Changelog.
This project has its own custom versioning scheme. Version numbers consist of three parts (e. g. `3.1.5`).
- The first number is increased on major rewrites or changes. What classifies as a major change is up to the maintainers. This is pretty rare and a PFERD version 4 should hopefully not be necessary.
- The second number is increased on backwards-incompatible changes in behaviour. This refers to any change that would make an existing setup behave differently (e. g. renaming options or changing crawler behaviour). If this number is increased, it may be necessary for you to adapt your own setup.
- The third number is increased on backwards-compatible changes (e. g. adding new options or commands, changing documentation, fixing bugs). Updates that only increase this number should be safe and not require manual intervention.
We will try to correctly classify changes as backwards-compatible or backwards-incompatible, but may occasionally make mistakes or stumble across ambiguous situations.
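As a rough illustration of the scheme above, the following sketch reports which part of the version number changed between two releases. It is not part of PFERD; the names `Bump`, `parse_version` and `classify_update` are made up for this example, and it assumes plain three-part version strings.

```python
from enum import Enum


class Bump(Enum):
    MAJOR = "major rewrite or change"
    INCOMPATIBLE = "backwards-incompatible change"
    COMPATIBLE = "backwards-compatible change"
    NONE = "no change"


def parse_version(text: str) -> tuple[int, int, int]:
    """Split a three-part version string such as "3.1.5" into integers."""
    first, second, third = (int(part) for part in text.split("."))
    return first, second, third


def classify_update(old: str, new: str) -> Bump:
    """Report the most significant part that differs between two versions."""
    old_parts, new_parts = parse_version(old), parse_version(new)
    if new_parts[0] != old_parts[0]:
        return Bump.MAJOR
    if new_parts[1] != old_parts[1]:
        # An existing setup may need to be adapted.
        return Bump.INCOMPATIBLE
    if new_parts[2] != old_parts[2]:
        # Should be safe without manual intervention.
        return Bump.COMPATIBLE
    return Bump.NONE
```

For example, `classify_update("3.1.5", "3.2.0")` yields `Bump.INCOMPATIBLE`, which is the case where you may need to adapt your setup.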
- Support for MOB videos in page descriptions
- Clickable links in the report to directly open new/modified/not-deleted files
- Support for non-KIT Shibboleth login
- Remove videos from description pages
- Perform ILIAS cycle detection after processing the transform to allow ignoring duplicated elements
- Parse headings (h1-h3) as folders in kit-ipd crawler
- Personal desktop/dashboard/favorites crawling
- Crawling of nested courses
- Downloading of links with no target URL
- Handle row flex on description pages
- Add `<!DOCTYPE html>` heading to forum threads to fix mime type detection
- Generic `ilias-web` crawler and `ilias-web` CLI command
- Support for the course overview page. Using this URL as a target might cause duplication warnings, as subgroups are listed separately.
- Support for named capture groups in regex transforms
- Crawl custom item groups as folders
- Normalization of meeting names in cards
- Sanitization of slashes in exercise container names
- Crawling of personal desktop with ILIAS 8
- Crawling of empty personal desktops
- Support for ILIAS 8
- Video name deduplication
- `no-delete-prompt-override` conflict resolution strategy
- Support for ILIAS learning modules
- `show_not_deleted` option to stop printing the "Not Deleted" status or report message. This combines nicely with the `no-delete-prompt-override` strategy, causing PFERD to mostly ignore local-only files.
- Support for mediacast video listings
- Crawling of files in info tab
- Remove size suffix for files in content pages
- Crawling of courses with the timeline view as the default tab
- Crawling of file and custom opencast cards
- Crawling of button cards without descriptions
- Abort crawling when encountering an unexpected ILIAS root page redirect
- Sanitize ASCII control characters on Windows
- Crawling of paginated past meetings
- Ignore SCORM learning modules
- Missing documentation for `forums` option
- Clear up error message shown when multiple paths are found to an element
- IPD crawler unnecessarily appending trailing slashes
- Crawling opencast when ILIAS is set to English
- Recognize and crawl content pages in cards
- Recognize and ignore surveys
- Forum crawling crashing when a thread has no messages at all
- Forum crawling crashing when a forum has no threads at all
- ILIAS login failing in some cases
- Crawling of paginated future meetings
- IPD crawler handling of URLs without trailing slash
- Download of page descriptions
- Forum download support
- `pass` authenticator
- Add `cpp` extension to default `link_regex` of IPD crawler
- Mention hrefs in IPD crawler's `--explain` output for users of `link_regex` option
- Simplify default IPD crawler `link_regex`
- IPD crawler crashes on some sites
- Meeting name normalization for yesterday, today and tomorrow
- Crawling of meeting file previews
- Login with new login button HTML layout
- Descriptions for courses are now placed in the correct subfolder when downloading the whole desktop
- Message when Shibboleth entitlements need to be manually reviewed
- Links to unofficial packages and repology in the readme
- Increase minimum supported Python version to 3.9
- Support video listings with more columns
- Use UTF-8 when reading/writing the config file
- Crash during authentication when the Shibboleth session is still valid
- ILIAS login
- Local video cache if `windows_paths` is enabled
- A KIT IPD crawler
- Support for ILIAS cards
- (Rudimentary) support for content pages
- Support for multi-stream videos
- Support for ILIAS 7
- Interpolation in config file
- Crawling of recursive courses
- Crawling files directly placed on the personal desktop
- Ignore timestamps at the Unix epoch as they crash on Windows
- `--skip` command line option
- Support for ILIAS booking objects
- Using multiple path segments on left side of `-name->` now results in an error. This was already forbidden by the documentation but silently accepted by PFERD.
- More consistent path printing in some `--explain` messages
- Nondeterministic name deduplication due to ILIAS reordering elements
- More exceptions are handled properly
If your config file doesn't do weird things with transforms, it should continue to work. If your `-re->` arrows behave weirdly, try replacing them with `-exact-re->` arrows. If you're on Windows, you might need to switch from `\` path separators to `/` in your regex rules.
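To see why the separator matters for regex rules, here is a small sketch in plain Python. It is not PFERD's implementation: the helper `to_rule_path` and the example path are hypothetical, and it only assumes that a rule's regex is matched against a path written with `/` separators.

```python
import re
from pathlib import PureWindowsPath


def to_rule_path(windows_path: str) -> str:
    """Render a Windows-style path with forward slashes."""
    return PureWindowsPath(windows_path).as_posix()


path = to_rule_path(r"Lectures\03.pdf")

# A pattern written with "/" matches the normalized path.
new_style = re.compile(r"Lectures/(\d+)\.pdf")
new_match = new_style.fullmatch(path)

# A pattern still written with an escaped "\" no longer matches.
old_style = re.compile(r"Lectures\\(\d+)\.pdf")
old_match = old_style.fullmatch(path)
```

Here `new_match` captures the lecture number while `old_match` is `None`, which is the kind of silent mismatch a rule with leftover `\` separators would run into.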
- `skip` option for crawlers
- Rules with `>>` instead of `>` as arrow head
- `-exact-re->` arrow (behaves like `-re->` did previously)
- The `-re->` arrow can now rename directories (like `-->`)
- Use `/` instead of `\` as path separator for (regex) rules on Windows
- Use the label to the left for exercises instead of the button name to determine the folder name
- Video pagination handling in ILIAS crawler
- `credential-file` authenticator
- `--credential-file` option for `kit-ilias-web` command
- Warning if using concurrent tasks with `kit-ilias-web`
- Cookies are now stored in a text-based format
- Date parsing now also works correctly in non-group exercises
- Proper config files
- Concurrent crawling
- Crawl external ILIAS links
- Crawl uploaded exercise solutions
- Explain what PFERD is doing and why (`--explain`)
- More control over output (`--status`, `--report`)
- Debug transform rules with `--debug-transforms`
- Print report after exiting via Ctrl+C
- Store crawler reports in `.report` JSON file
- Extensive config file documentation (`CONFIG.md`)
- Documentation for developers (`DEV.md`)
- This changelog
- Rewrote almost everything
- Better error messages
- Redesigned CLI
- Redesigned transform rules
- ILIAS crawling logic (paths may be different)
- Better support for weird paths on Windows
- Set user agent (`PFERD/<version>`)
- Backwards compatibility with 2.x
- Python files as config files
- Some types of crawlers