Identify snapshots on archiving websites with their source #8344

Open
0x7969 opened this issue Nov 28, 2023 · 2 comments

0x7969 commented Nov 28, 2023

Hey, I'm using archive.ph a lot to save social media posts and articles, and Hypothesis works great for annotating them. However, if someone else accesses the source page directly (without using archive.ph), the annotations aren't shared. Recognizing that an archived URL is a snapshot of the original URL, and then sharing the annotations between the two, would be a great feature.

Contributor

acelaya commented Nov 28, 2023

We usually try to identify equivalent content from different URLs, in order to load the right annotations.

For example, if page A has a <link rel="canonical" href="..." /> pointing to page B, we load annotations created on page B when page A is loaded.
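To make the canonical-link behaviour described above concrete, here is a minimal Python sketch. It is not the actual Hypothesis implementation; the names `CanonicalLinkParser` and `annotation_lookup_url` are hypothetical, and it assumes the lookup simply prefers the first `<link rel="canonical">` found in the page and otherwise falls back to the page's own URL.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class CanonicalLinkParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> elements matter, and only the first canonical one.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list; attribute values may be None.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "canonical" in rel_tokens and href:
            self.canonical = href


def annotation_lookup_url(page_url: str, html: str) -> str:
    """Return the URL whose annotations should be loaded for this page.

    Assumption: a canonical link, if present, wins over the page URL.
    """
    parser = CanonicalLinkParser()
    parser.feed(html)
    if parser.canonical:
        # Resolve relative hrefs against the URL of the page itself.
        return urljoin(page_url, parser.canonical)
    return page_url
```

With this sketch, loading page A whose markup contains a canonical link to page B would yield B's URL as the lookup key, while a page without such a link would be looked up under its own URL.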

Can you share some equivalent URL examples? We can take a look and see if archive.ph is already adding something to identify the original content.

Author

0x7969 commented Nov 28, 2023

Hey @acelaya,
thanks. I checked the page source, but a corresponding reference in any of the usual formats seems to be missing. Meh. I'll ask them whether they can add a link rel pointing to the source.
They do include a <link rel="canonical" ...>, but it points to a longer version of the initially provided shortlink. That longer URL contains the source URL and looks like this: http://archive.today/2023.11.28-100610/https://www.faz.net/aktuell/wirtschaft/scholz-urteil-schafft-neue-realitaet-fuer-alle-kuenftigen-regierungen-19344914.html
I doubt that custom parsing of that URL would be a good idea, though.
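
For reference, custom parsing of that canonical URL might look roughly like the Python sketch below. The URL shape (host, then a `YYYY.MM.DD-HHMMSS` timestamp, then the original URL) is an assumption inferred only from the single example above, the function name `source_url_from_snapshot` is hypothetical, and only the two hostnames mentioned in this thread are matched. It also illustrates why this approach is fragile: any change to the archive's URL scheme would silently break it.

```python
import re
from typing import Optional

# Assumed shape, based only on the one example URL in this thread:
#   http://archive.today/<YYYY.MM.DD-HHMMSS>/<original URL>
_SNAPSHOT_RE = re.compile(
    r"^https?://archive\.(?:today|ph)/"
    r"(?P<timestamp>\d{4}\.\d{2}\.\d{2}-\d{6})/"
    r"(?P<source>https?://.+)$"
)


def source_url_from_snapshot(canonical_url: str) -> Optional[str]:
    """Return the embedded source URL, or None if the shape does not match."""
    match = _SNAPSHOT_RE.match(canonical_url)
    return match.group("source") if match else None
```

A shortlink that carries no embedded source URL would not match and would return `None`, so a caller could fall back to the normal lookup in that case.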
