Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 2 and 3: Add documentation to Deposit and DepositProperties #22

Merged
merged 7 commits into from
Aug 7, 2018

Conversation

lindareijnhoudt
Copy link
Contributor

@lindareijnhoudt lindareijnhoudt commented Aug 3, 2018

fixes #3
fixes #2

Where should the reviewer @DANS-KNAW/easy start?

some questions in the code, as TODO's

@@ -367,6 +384,21 @@ object Deposit {
} yield new Deposit(baseDir, bag, properties)
}

//TODO wouldn't it be better to be able to provide the payloadDir and a baseDir for the Deposit separately?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or as a separate method. I guess it could be used by easy-deposit-api to copy the deposit from the draft-area to the staging-area

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, a deposit is created in the draft-area and then moved/copied to the staging area. I don't see the need for a dans-bag-lib method for that. Just use better.files.File.move/copy

What I meant to question with this is: 'Would it make sense to have an extra method that doesn't create the bag in-place, but rather at a different location, indicated by baseDir?'. However, after thinking somewhat more about this, the user of this library can just do that by itself by first copying the original data to baseDir and then calling this method.

If you agree, feel free to remove this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jo-pol and i thought to use such a method to automatically check the result of the copy by validating the checksums

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to check if a Bag is complete/valid/virtually-valid, you can use the methods described in #1 (PR in #13), I guess.

Besides, what do you want to compare the checksums against? The manifest files will only be calculated during the execution of this method. Hence, I really would not like to do that twice within one method... Of course I'm perfectly fine with comparing these calculated checksums in the manifests with the checksums of the original files (where the files where copied from, provided they weren't moved rather than copied), but that would probably something to be taken care of by the user of the library, as that is the one that is gonna do the copying/moving anyway, right?

@@ -39,6 +39,12 @@ case class DepositProperties(creation: Creation = Creation(),
springfield: Springfield = Springfield(),
staged: Staged = Staged()) {

//TODO shouldn't this always be 'deposit.properties'? in the current Deposit.baseDir?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or am i missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an old comment I forgot to remove. The current philosophy is that DepositProperties doesn't know it belongs to a bag, or at which location it is stored. That information is only known by the Deposit that holds this instance of DepositProperties. So it is correct that this save method takes the file: File as an argument.

@lindareijnhoudt lindareijnhoudt changed the title Issue #3: Add documentation to Deposit and DepositProperties Issue 2 and 3: Add documentation to Deposit and DepositProperties Aug 3, 2018
@@ -336,6 +336,11 @@ class Deposit private(val baseDir: File,
withDepositProperties(newProperties)
}

/**
* Most methods in this library only manipulate the Bag and Deposit in memory.
* Saves the in-memory deposit to the file system, like calling flush() in a Stream. It writes all the bag-files and the deposit.properties.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. There is no flush() on a Stream. Perhaps remove the whole like calling flush() in a Stream all together. It doesn't add much here.
  2. Refer to the bag.save() and props.save functions (in the text and maybe also using @see.
  3. Don't forget to fill in the @return line.

@@ -351,6 +356,18 @@ object Deposit {

val depositPropertiesName = "deposit.properties"

/**
* Creates a new Deposit with an empty Bag inside.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(for all these comments, they also apply to createFromData and other documentation below)

  1. please format the docs such that they will line out nicely
  2. usually the @param lines don't start their documentation with a capital letter; just use lowercase letters
  3. "if none provided SHA1 is used." --> "If none are provided, SHA-1 is used."
  4. use backticks for the state.label, depositor.userId and bagId
  5. use hyperlink syntax for 'nl.knaw.dans.bag.Deposit'

@@ -367,6 +384,21 @@ object Deposit {
} yield new Deposit(baseDir, bag, properties)
}

//TODO wouldn't it be better to be able to provide the payloadDir and a baseDir for the Deposit separately?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, a deposit is created in the draft-area and then moved/copied to the staging area. I don't see the need for a dans-bag-lib method for that. Just use better.files.File.move/copy

What I meant to question with this is: 'Would it make sense to have an extra method that doesn't create the bag in-place, but rather at a different location, indicated by baseDir?'. However, after thinking somewhat more about this, the user of this library can just do that by itself by first copying the original data to baseDir and then calling this method.

If you agree, feel free to remove this line.

@@ -393,6 +431,11 @@ object Deposit {
} yield new Deposit(baseDir, bag, properties)
}

/**
* Returns the `baseDir` of the Deposit object
* @param deposit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the deposit to extract the `baseDir` from

or something along those lines

@@ -39,6 +39,12 @@ case class DepositProperties(creation: Creation = Creation(),
springfield: Springfield = Springfield(),
staged: Staged = Staged()) {

//TODO shouldn't this always be 'deposit.properties'? in the current Deposit.baseDir?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an old comment I forgot to remove. The current philosophy is that DepositProperties doesn't know it belongs to a bag, or at which location it is stored. That information is only known by the Deposit that holds this instance of DepositProperties. So it is correct that this save method takes the file: File as an argument.

* @param depositor the accountname of the depositor
* @param bagStore the bagId to be used for this deposit
* @return returns a new DepositProperties
*/
def empty(state: State, depositor: Depositor, bagStore: BagStore): DepositProperties = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe rename this empty to from, since it is not empty at all...

//TODO shouldn't this always be 'deposit.properties'? in the current Deposit.baseDir?
/**
* Writes the DepositProperties to `file` on the filesystem
* @param file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some lines here and below are not yet filled in

Copy link
Contributor

@jo-pol jo-pol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So much confusion for me that I stopped reviewing. Am I overlooking a glossary and/or directory structure? IMO it should be possible to understand a library without specs.

@@ -336,6 +336,13 @@ class Deposit private(val baseDir: File,
withDepositProperties(newProperties)
}

/**
* Most methods in this library only manipulate the Bag and Deposit in memory.
* Saves the in-memory deposit to the file system. It saves all the bag files by calling `DansBag.save()`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Saves the in-memory deposit to the file system. should be the first line. The rest is detail.
  2. I don't hope you mean you keep all payload files in memory. At first sight all the bag files seems to suggest that. Please be precise in distinguishing payload files and tag files.

* Creates a new Deposit, as a parent-directory to the `payloadDir`, moving the files in the
* `payloadDir` to a Bag in the Deposit.
*
* @param payloadDir the directory containing the payload files for the bag. Here the Deposit will
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the uploaded files. i.e. the data folder in the DansV0Bag, or also files like datasetmetadata and files.xml? This is still confusing for me.

* @param bagInfo The entries to be added to `bag-info.txt`
* @param state The state to be set in the deposit.properties' state.label
* @param depositor The depositor to be set in the deposit.properties' depositor.userId
* @param bagStore The bagId for the bag-dir in this Deposit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confusing for me: from a bagStore I expect something containing multiple bags hence not an id of a bag. Rather call it how you start the description: bagId.

* @param state The state to be set in the deposit.properties' state.label
* @param depositor The depositor to be set in the deposit.properties' depositor.userId
* @param bagStore The bagId for the bag-dir in this Deposit
* @return if successful, returns a `nl.knaw.dans.bag.Deposit` object representing the deposit located at `payloadDir`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above you say the payloadDir is moved into the deposit, now it is the location of the deposit.

@rvanheest rvanheest added this to the Dans-Bag milestone Aug 6, 2018
rvanheest pushed a commit to rvanheest-DANS-KNAW/dans-bag-lib that referenced this pull request Aug 7, 2018
@rvanheest rvanheest merged commit 73cdce6 into master Aug 7, 2018
@rvanheest rvanheest deleted the issue-03 branch August 7, 2018 12:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Documentation for DepositProperties Documentation for Deposit
3 participants