Implement a file verification mechanism #219

Open
nfebe opened this issue Aug 17, 2023 · 4 comments

@nfebe
Contributor

nfebe commented Aug 17, 2023

Implement a file verification mechanism that allows users to generate a file (e.g., CSV) containing a list of file paths and their corresponding cryptographic hashes (e.g., SHA256). This file can serve as a reference for users to quickly compare the integrity of their uploaded files with their local copies.

This would ideally happen in the backend repo.

Originally reported by @danfuzz

I tried to upload another bunch of files yesterday via SFTP and ran into some trouble which I think is worth reporting:

● I used the SFTP command "put -R dir-name" (recursively send a directory). On Permanent, the dir-name in question didn't already exist, and the first time I tried the command, I got an error complaining that the directory didn't exist. A bit odd, because the command should make the directory, but not terrible, so I went ahead and made the directory via the Permanent website. Then I retried the command. It worked, but to my surprise it ended up making a new directory with the same name, so via the web I could see that there were two dir-name directories, one empty, and one with my uploaded files.
● During a post-upload spot check, I found one file which seems to have gotten corrupted. It is a file with a .txt extension, e.g. file-name.txt. On the website it (unsurprisingly) shows up as just file-name, but unlike other text files, if I use the "download" link, the menu indicates that it has an "original format: unknown" instead of "original format: .txt" and it also has a blank icon instead of a text-document icon. Selecting the download link does nothing (no download). If I use SFTP, I can navigate to the directory, and ls shows file-name.txt, but the command get file-name.txt seems to hang forever.

I am particularly concerned about the spot check, in that, generally speaking, I can't have confidence that what I tried to upload was actually uploaded. By way of suggestion, it'd be super useful if there were a way via the Permanent website to generate a file (e.g. CSV) containing file paths and corresponding hashes (e.g. SHA256), so I could quickly compare with my local copy and gain confidence in what's stored at Permanent.
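
To make the request above concrete, here is a minimal Python sketch of the local half of the comparison: hash every file under a directory, write a path/hash CSV, and diff it against a manifest obtained elsewhere. Everything here is an assumption for illustration. The two-column CSV layout (`path`, `sha256`) and the function names are hypothetical, and nothing in this issue specifies what Permanent would actually export.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root, '/'-separated) to its digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def write_manifest_csv(manifest, csv_path):
    """Write the manifest as a CSV with a header row (assumed layout)."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "sha256"])
        for path, digest in sorted(manifest.items()):
            writer.writerow([path, digest])


def read_manifest_csv(csv_path):
    """Read a CSV written by write_manifest_csv back into a dict."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return {row["path"]: row["sha256"] for row in csv.DictReader(handle)}


def compare_manifests(local, remote):
    """Return (missing_remotely, extra_remotely, mismatched) as sorted lists."""
    missing = sorted(set(local) - set(remote))
    extra = sorted(set(remote) - set(local))
    mismatched = sorted(
        path for path in set(local) & set(remote) if local[path] != remote[path]
    )
    return missing, extra, mismatched
```

With a remote manifest in the same layout, `compare_manifests(build_manifest("dir-name"), read_manifest_csv("remote.csv"))` would list files that never arrived, files that exist only remotely, and files whose contents differ (`remote.csv` is a placeholder name).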

@jasonaowen
Contributor

Instead of a CSV, I suggest that it would be more useful to use the format expected by tools like sha256sum(1). From the manpage I have locally:

The default mode is to print a line with checksum, a space, a character indicating input mode ('*' for binary, ' ' for text or where binary is insignificant), and name for each FILE.

This produces output like

$ sha256sum /dev/null 
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  /dev/null

When saved to a file, that can then be checked:

$ sha256sum -c devnull.sha256
/dev/null: OK
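
For a service that wants to emit this format itself, the line layout quoted from the manpage is simple enough to sketch in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the whole file is read into memory for brevity, and the escaping that sha256sum applies to unusual file names (such as names containing a newline) is not handled.

```python
import hashlib
from pathlib import Path

HEX_DIGITS = set("0123456789abcdefABCDEF")


def format_line(digest, name, binary=False):
    """Render one line: checksum, a space, a mode character, then the name."""
    return f"{digest} {'*' if binary else ' '}{name}"


def parse_line(line):
    """Split a checksum line into (digest, name); the mode character is dropped."""
    line = line.rstrip("\n")
    digest, rest = line[:64], line[64:]
    well_formed = (
        len(digest) == 64
        and set(digest) <= HEX_DIGITS
        and len(rest) >= 3
        and rest[0] == " "
        and rest[1] in " *"
    )
    if not well_formed:
        raise ValueError(f"not a sha256sum-style line: {line!r}")
    return digest.lower(), rest[2:]


def check(manifest_text, root="."):
    """Yield (name, status) per line, loosely mirroring `sha256sum -c`."""
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = parse_line(line)
        try:
            actual = hashlib.sha256((Path(root) / name).read_bytes()).hexdigest()
        except OSError:
            yield name, "FAILED open or read"
            continue
        yield name, "OK" if actual == expected else "FAILED"
```

Calling `format_line` with the digest shown above and the name `/dev/null` reproduces the sample output line, so a manifest built this way should be consumable by `sha256sum -c` for ordinary file names.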

@slifty
Contributor

slifty commented Oct 12, 2023

In addition to supporting sha256sum, it would also be good (arguably important) for the SFTP service itself to expose checksum data, where possible, in a way that rclone can use.

@danfuzz

danfuzz commented Apr 14, 2024

I don't suppose there's any news on this feature request…

@slifty
Contributor

slifty commented Apr 17, 2024

@danfuzz I'm no longer working on this project but maybe @cecilia-donnelly can weigh in!
