sbtools: modifying data #115

Closed
lindsayplatt opened this issue Jan 20, 2017 · 16 comments

@lindsayplatt
Contributor

lindsayplatt commented Jan 20, 2017

Parent: sbtools: accessing cloud-based data

Adding/modifying data

  • Creating ScienceBase items and "folders"
  • Uploading your files
  • Organizing and querying your files
  • Replacing and deleting your files

9999-06-25

@lindsayplatt
Contributor Author

edit existing things

items_update
items_upsert
item_append_files
item_move
item_rename_files
item_replace_files
item_update
item_update_identifier
item_upsert

make/remove things

folder_create
items_create
item_create
item_rm
item_rm_files
item_upload_create

@aappling-usgs
Member

you know, this lesson might actually be the best place to introduce user_id, which returns the home item for a user. seems like it mostly gets used in the context of functions like item_create and folder_create, and in examples. (it's currently slated for the getting data lesson)

this lesson might benefit from a paragraph up top outlining the logic behind the item_ vs items_ functions. i think the items_ functions deal with more items per http request, possibly saving time. i'm not sure what the benefit of the item_ functions is...greater usability? availability for all actions rather than just some?

if you start with make/remove things, the student can make some scratch items that they can safely edit in the following section.

@lindsayplatt
Contributor Author

I ended up switching this a little bit before you commented. Check out this commit: lindsayplatt@78d87d9

@lindsayplatt
Contributor Author

Realizing that I mention user_id, but don't really introduce it. I'll make a note to do that.

@aappling-usgs
Member

@lindsaycarr wrote (Alison made a couple of edits):
Some additional unexpected behavior for sbtools functions. I'm trying to update an existing item's title.

# 1. creates a new item even though one already exists with this title
test_item <- item_upsert(title="books.json") 

# 2. created another item with the name "books.json" with this message:
#   "title is NULL - re-using title from input SB item"
item_upsert(test_item, title=NULL, info=list(title = "sbtools stuff"))

# 3. created a new item named "sbtools stuff" under the 
# books.json item
item_upsert(test_item, title="sbtools stuff")

Agreed that example 1 is not as expected. If a new title matches an old title, the old item should be updated. I made an sbtools issue: DOI-USGS/sbtools#239

Example 2 seems like a mildly annoying constraint: because title was NULL, item_upsert looked to the title of the parent, but then info also has title such that the body of the POST request contains title twice:

Browse[2]> body
$parentId
[1] "595e7a28e4b0d1f9f057035f"

$title
[1] "books.json"

$title
[1] "sbtools stuff"

and then I suppose sciencebase just ignores the second one. Possible fixes for this could be for item_upsert to override title with info$title if info$title exists, or to always throw an error if title is included in info. OR item_upsert could be enhanced to allow for all sorts of item queries, as the ScienceBase API does (https://my.usgs.gov/confluence/display/sciencebase/ScienceBase+Catalog+Item+REST+Services#ScienceBaseCatalogItemRESTServices-UPSERTRequests)...maybe by changing the title arg to query. I guess this calls for another issue: DOI-USGS/sbtools#240
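A minimal sketch of the first two fixes, in plain R with no sbtools calls. `resolve_title` and its `on_conflict` argument are hypothetical names used only for illustration; nothing like this exists in sbtools as far as this thread shows. The last few lines just demonstrate that an R list can carry the same name twice, which is how the body above ended up with two titles.

```r
# Hypothetical helper (not part of sbtools): decide which title to send
# when both `title` and `info$title` may be supplied.
resolve_title <- function(title = NULL, info = list(),
                          on_conflict = c("prefer_info", "error")) {
  on_conflict <- match.arg(on_conflict)
  has_info_title <- !is.null(info$title)
  if (has_info_title && !is.null(title) && on_conflict == "error") {
    stop("supply the title either as `title` or in `info`, not both")
  }
  # Option 1: info$title overrides the title argument
  if (has_info_title) title <- info$title
  # Drop title from info so the request body carries it only once
  info$title <- NULL
  c(list(title = title), info)
}

# A list can hold the same name twice; `$` only returns the first match.
body <- list(parentId = "595e7a28e4b0d1f9f057035f",
             title = "books.json", title = "sbtools stuff")
sum(names(body) == "title")  # 2
body$title                   # "books.json"
```

Which of the two behaviors is preferable is still an open question for DOI-USGS/sbtools#240.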

For example 3, I think that the parent_id should and does specify the parent of the new item, rather than the item itself, such that the creation of a new "sbtools stuff" item beneath "books.json" is exactly what I expected. It does raise the question of how/whether you can use item_upsert to modify an item with a known item ID, but that's what item_update does, right?
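For the known-ID case, a rough sketch, assuming `item_update()` takes the item (or its ID) first and a named `info` list second, as its documentation describes. `rename_item` and its `update_fun` argument are hypothetical, added only so the logic can be tried without a ScienceBase session.

```r
# Hypothetical wrapper (not part of sbtools): retitle an existing item
# when its ScienceBase ID is already known.
# Assumes sbtools is installed and a session is authenticated when the
# default `update_fun` is actually used.
rename_item <- function(sb_id, new_title,
                        update_fun = sbtools::item_update) {
  stopifnot(is.character(new_title), length(new_title) == 1)
  # Only the fields listed in `info` are sent for updating
  update_fun(sb_id, info = list(title = new_title))
}

# Intended use (needs an authenticated session, so left commented out):
# rename_item(test_item, "sbtools stuff")
```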

@lindsayplatt
Contributor Author

lindsayplatt commented Jul 6, 2017

I assumed that item_upsert had the same behavior as item_update when the item already existed, based on the definition: "Either creates or updates (if item already exists)"

@aappling-usgs
Member

Yeah, I don't think that assumption can be met. It searches based on title, updates based on info, and can't be directed (i don't think) to search for an item based on its sbid

@lindsayplatt
Contributor Author

So how is item_upsert different from item_create?

@aappling-usgs
Member

Isn't that what I just answered? It's different because it takes different arguments. It should also be different because it can modify an item, with those different arguments, if a single matching item already exists. But see sbtools issues 239 and 240.

@lindsayplatt
Contributor Author

Oh sorry, thought you were answering how item_upsert differed from item_update.

@aappling-usgs
Member

yeah, my bad, was just writing an apology/clarification

@aappling-usgs
Member

Still - item_upsert should be different from item_create because it can sometimes modify rather than create an item, if a single matching item already exists. But see sbtools issues 239 and 240.

@aappling-usgs
Member

item_upsert is basically a single API call that does the equivalent of

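# conceptual pseudocode only: the real item_exists, item_update and
# item_create take different arguments than a bare id
if (item_exists(id)) {
  item_update(id)
} else {
  item_create(id)
}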

and because it has this fancier functionality, it takes different arguments than either item_update or item_create

@aappling-usgs
Member

@lindsaycarr , regarding your slack issue with trying to create three items at once, repeated here:

new_folder <- item_create(title='bigitem')
add_mult <- items_create(parent_id = c(new_folder, new_folder, user_id()),
                         title = c("item 1", "item 2", "top-level item"))
## Error: If parent_id length > 1, it must be of same length as title and info

making parent_id a list rather than a vector, or a simple vector of character IDs, should work:

add_mult <- items_create(parent_id = list(new_folder, new_folder, user_id()),
                         title = c("item 1", "item 2", "top-level item"))
# OR
add_mult <- items_create(parent_id = c(new_folder$id, new_folder$id, user_id()),
                         title = c("item 1", "item 2", "top-level item"))

but in both cases, all three items get created under the first parent_id:

* bigitem
  * item 1
  * item 2
  * top-level item

...so i guess that's a new sbtools issue for us: DOI-USGS/sbtools#242

@lindsayplatt
Contributor Author

Oh man, another issue 😬

What do you think we should do for these examples in the meantime?

@aappling-usgs
Member

yeah, 😬 ! hmm...use a single, replicated parent_id, i guess? and include a link to the issue?
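something like this, maybe. It's only a sketch of the single-parent workaround: `create_under_one_parent` and its `bulk_fun` argument are hypothetical names, and it assumes `items_create()` recycles a length-one character `parent_id` across all titles, which is what the error message above implies.

```r
# Hypothetical helper (not part of sbtools): create several items under ONE
# parent in a single bulk call, avoiding the mixed-parent case from
# DOI-USGS/sbtools#242.
create_under_one_parent <- function(parent_id, titles,
                                    bulk_fun = sbtools::items_create) {
  # Insist on a single character ID so every item shares the same parent
  stopifnot(is.character(parent_id), length(parent_id) == 1)
  bulk_fun(parent_id = parent_id, title = titles)
}

# Intended use (needs an authenticated sbtools session, so left commented out):
# new_folder <- sbtools::item_create(title = "bigitem")
# create_under_one_parent(new_folder$id, c("item 1", "item 2"))
# # the top-level item then gets its own call under the default parent
# sbtools::item_create(title = "top-level item")
```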
