create dataset folder only when needed #459
Open
Issue I encountered
I was trying to run inference in an AWS Lambda function, which has a read-only filesystem, and I got an error saying that the dataset folder could not be created. Looking at the stack trace, I found that Surprise tries to create the dataset folder even when the user does not want to use the built-in datasets.
Fix
In my opinion, it is better to create the dataset folder only when it is actually needed. I have made the necessary code change (see the changes in this pull request). I ran all the tests in the tests folder after this change and they all pass, but please let me know if you think more tests are needed.
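For readers who want the general idea without opening the diff, here is a minimal sketch of lazy folder creation. It is not the actual Surprise code: the function names, the `base_dir` parameter, and the default path are hypothetical and chosen only for illustration. The point is that nothing touches the filesystem at import time; the folder is created only when a caller asks for a dataset path.

```python
import os


def get_dataset_dir(base_dir=None):
    """Return the dataset folder, creating it on first use.

    Hypothetical helper: `base_dir` and the default location are
    assumptions for this sketch, not the library's real settings.
    """
    if base_dir is None:
        base_dir = os.path.join(os.path.expanduser("~"), ".example_dataset_dir")
    # The directory is created here, at call time, not at import time.
    os.makedirs(base_dir, exist_ok=True)
    return base_dir


def builtin_dataset_path(name, base_dir=None):
    """Return the path where a built-in dataset called `name` would live."""
    # Only code paths that need a built-in dataset reach get_dataset_dir().
    return os.path.join(get_dataset_dir(base_dir), name)
```

With this shape, importing the module on a read-only filesystem does not fail; an error can only occur if a built-in dataset is actually requested.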