Commit
* Add checkLargeFiles to checks.R that checks if files with more than 5 MB exist in 'data/' and if yes recommends ExperimentHub/AnnotationHub
* Add unit test: creates large file with for loop
* Add call to BiocCheck.R that calls checkLargeFiles() on real package. Uses same flag as checkIndivFileSizes
@@ -219,6 +219,29 @@ checkIndivFileSizes <- function(pkgdir)
    }
}

checkLargeFiles <- function(pkgdir){
    pkgType <- getPkgType(pkgdir)
    if (is.na(pkgType) || pkgType == "Software") {
        maxSize <- 5*10^6 ## 5MB
        allFiles <- list.files(file.path(pkgdir, "data"), all.files=TRUE)
        allFilesFullName <- file.path(pkgdir, "data", allFiles)
lshep
        sizes <- file.size(allFilesFullName)
        largeFiles <- paste(allFiles[sizes > maxSize], collapse=" ")
        if (any(sizes > maxSize)) {
            handleWarning(
                "Found the following large files (over 5MB) in the ",
                "'data/' folder: ",
                paste0("'", largeFiles, "'", collapse = " "),
                "\nConsider uploading them to ExperimentHub ",
                "<https://www.bioconductor.org/packages/ExperimentHub/> or ",
                "AnnotationHub <https://bioconductor.org/packages/AnnotationHub/>."
            )
            return(TRUE)
        }
    }
}

checkBiocViews <- function(pkgdir)
{
    dirty <- FALSE
1 comment
on commit 5fad21c
Looking at the code, it might make more sense to implement the check for large data files directly in the checkIndivFileSizes function, since currently both functions would pick up on large data files, as you pointed out.
You could filter the files picked up by list.files into those in the data (maybe also inst/extdata?) directory versus others. Make sure to test that it functions properly with or without a data directory present.
The above is probably the recommended approach. If you continue with two separate checks, which is not as ideal:
- I would move the call in BiocCheck.R into the section for
if (is.null(dots[["no-check-file-size"]])){
}
- consider changing the name, perhaps to checkLargeDataFiles or something else indicating data
- make sure the two checks don't pick up the same files
Be sure to test the edge case of the data directory being present or absent, since it does not have to exist in a package.
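For the missing-directory case, a minimal sketch of what base R does when data/ is absent may help. The directory name `noDataDirDemo` is made up for illustration, and the 5 MB threshold simply mirrors the one in the commit:

```r
## A throwaway package directory with no data/ subdirectory
pkgdir <- file.path(tempdir(), "noDataDirDemo")
dir.create(pkgdir, showWarnings = FALSE)

## list.files() returns character(0) when the directory does not exist
dataFiles <- list.files(file.path(pkgdir, "data"), full.names = TRUE)
sizes <- file.size(dataFiles)

## Comparing an empty vector gives logical(0), and any(logical(0)) is FALSE,
## so no warning would be triggered for a package without data/
hasLarge <- any(sizes > 5 * 10^6)
```

Under these assumptions the check falls through quietly, but an explicit unit test for both cases is still worth having.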
Unit Testing:
Do not create a large file. This should be avoided, as we don't want to use up tmp space on the builders or on anyone's individual machine.
I would suggest making a helper function to pass in file sizes and a max size. Then in the test make the max size low to trigger the failure for testing.
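One possible shape for such a helper, as a rough sketch only: `findLargeFiles` is a hypothetical name, not an existing BiocCheck function, and treating NA sizes as "not large" is an assumption made here.

```r
## Hypothetical helper (not part of BiocCheck): returns the names of the
## files whose size exceeds maxSize.
findLargeFiles <- function(fileNames, sizes, maxSize = 5 * 10^6) {
    stopifnot(length(fileNames) == length(sizes))
    ## file.size() yields NA for files it cannot stat; treat those as not large
    isLarge <- !is.na(sizes) & sizes > maxSize
    fileNames[isLarge]
}

## In a test, lower the threshold instead of writing a big file to disk
flagged <- findLargeFiles(c("a.rda", "b.rda"), sizes = c(10, 200), maxSize = 100)
```

Because the sizes are passed in, the test controls them directly and nothing large ever lands in tmp space.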
I haven't run this, but are you appending the full path to the file names here? If so, you can use
list.files(..., full.names = TRUE)
to do this in one step.
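A small sketch of the one-step form, using a made-up directory `fullNamesDemo` and a tiny file. It leaves out all.files=TRUE, which the commit uses and which also lists "." and ".." unless no.. = TRUE is set:

```r
## Throwaway package layout with one tiny file in data/
pkgdir <- file.path(tempdir(), "fullNamesDemo")
dir.create(file.path(pkgdir, "data"), recursive = TRUE, showWarnings = FALSE)
writeLines("x", file.path(pkgdir, "data", "tiny.txt"))

## full.names = TRUE prepends the directory path to each file name
dataFiles <- list.files(file.path(pkgdir, "data"), full.names = TRUE)
sizes <- file.size(dataFiles)

## basename() recovers the bare names for use in messages
shortNames <- basename(dataFiles)
```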