Bag completeness for tag files #124

Open
rvanheest opened this issue Jul 20, 2018 · 2 comments · May be fixed by #125

Comments

@rvanheest

The BagIt v16 spec says the following about the completeness of a bag:

  1. Every file listed in every tag manifest MUST be present.

Likewise, an earlier version of the spec (v14) says:

  1. Every file in every tag manifest MUST be present. Tag files not listed in a tag manifest MAY be present.

However, the BagVerifier only verifies completeness against the payload manifests, not against the tag manifests. Shouldn't that check be added as well? Or is it done somewhere else?

@jscancella
Contributor

@rvanheest and @acdha, the PayloadVerifier actually does verify all files listed in all manifests. It should probably be renamed/refactored to make this more obvious.

It does this by first collecting all the files listed in all manifests (see https://github.com/LibraryOfCongress/bagit-java/blob/master/src/main/java/gov/loc/repository/bagit/verify/PayloadVerifier.java#L115-L134) and then checking that each of them exists (see https://github.com/LibraryOfCongress/bagit-java/blob/master/src/main/java/gov/loc/repository/bagit/verify/PayloadVerifier.java#L102-L103).
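For readers who don't want to open the linked source, here is a rough sketch of the two steps just described: collect every path listed in any manifest (payload or tag), then report the ones that do not exist. This is not the bagit-java implementation. The class `ManifestCompletenessSketch` and its methods are hypothetical, and the parsing is simplified: it assumes each manifest line is a checksum, whitespace, then a path relative to the bag root, and it ignores percent-encoding and path-safety checks.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of bagit-java.
final class ManifestCompletenessSketch {

    // Step 1: collect every path listed in any payload manifest (manifest-*.txt)
    // or tag manifest (tagmanifest-*.txt) found directly in the bag root.
    static Set<Path> filesListedInAllManifests(Path bagRoot) throws IOException {
        Set<Path> listed = new TreeSet<>();
        try (DirectoryStream<Path> manifests =
                 Files.newDirectoryStream(bagRoot, "{manifest,tagmanifest}-*.txt")) {
            for (Path manifest : manifests) {
                for (String line : Files.readAllLines(manifest, StandardCharsets.UTF_8)) {
                    // Assumed entry shape: "<checksum><whitespace><relative path>".
                    String[] parts = line.trim().split("\\s+", 2);
                    if (parts.length == 2) {
                        listed.add(bagRoot.resolve(parts[1]).normalize());
                    }
                }
            }
        }
        return listed;
    }

    // Step 2: every listed file must be present; return the ones that are not.
    static Set<Path> missingFiles(Path bagRoot) throws IOException {
        Set<Path> missing = new TreeSet<>();
        for (Path file : filesListedInAllManifests(bagRoot)) {
            if (!Files.exists(file)) {
                missing.add(file);
            }
        }
        return missing;
    }
}
```

Because the glob matches both `manifest-*.txt` and `tagmanifest-*.txt`, a tag file that is listed in a tag manifest but absent from the bag ends up in the result of `missingFiles`, which is the completeness rule quoted from the spec.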

@rvanheest
Author

Thanks for the quick response (as always!). I now see where my confusion came from: in PayloadVerifier I mainly looked at lines 105-109 and skimmed past lines 102-103. Lines 102-103 do the check mentioned in my original post. Lines 105-109 check the opposite: "all files in the payload directory should be listed in all manifests".

public void verifyPayload(final Bag bag, final boolean ignoreHiddenFiles)
    throws IOException, MaliciousPathException, UnsupportedAlgorithmException,
    InvalidBagitFileFormatException, FileNotInPayloadDirectoryException, InterruptedException {
  // Every file listed in any manifest must exist.
  final Set<Path> allFilesListedInManifests = getAllFilesListedInManifests(bag);
  checkAllFilesListedInManifestExist(allFilesListedInManifests);
  // The opposite direction: every file in the payload directory must be listed.
  if (bag.getVersion().isOlder(new Version(1, 0))) {
    checkAllFilesInPayloadDirAreListedInAtLeastOneAManifest(allFilesListedInManifests, PathUtils.getDataDir(bag), ignoreHiddenFiles);
  } else {
    CheckAllFilesInPayloadDirAreListedInAllManifests(bag.getPayLoadManifests(), PathUtils.getDataDir(bag), ignoreHiddenFiles);
  }
}
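
The "opposite" check mentioned above (every file in the payload directory must be listed) could look roughly like the sketch below. It is an illustration under assumptions, not the library's code: `PayloadListingSketch` and `unlistedPayloadFiles` are hypothetical names, and hidden-file handling is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not part of bagit-java.
final class PayloadListingSketch {

    // Walks the payload directory and returns every regular file
    // that does not appear in the given set of listed paths.
    static Set<Path> unlistedPayloadFiles(Path dataDir, Set<Path> listedFiles) throws IOException {
        Set<Path> unlisted = new TreeSet<>();
        try (Stream<Path> walk = Files.walk(dataDir)) {
            walk.filter(Files::isRegularFile)
                .map(Path::normalize)
                .filter(file -> !listedFiles.contains(file))
                .forEach(unlisted::add);
        }
        return unlisted;
    }
}
```

Passing the union of all manifest entries gives the "listed in at least one manifest" variant used for versions older than 1.0. Calling it once per payload manifest, with only that manifest's entries, gives the "listed in all manifests" variant.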

Thanks for the clarification. That helped a lot.
Yes, refactoring this a bit probably won't hurt.

@jscancella jscancella linked a pull request Jul 22, 2018 that will close this issue
rvanheest pushed a commit to DANS-KNAW/dans-bagit-lib that referenced this issue Feb 20, 2019
…fy all manifests, not just payload manifest