Finalize format of master list. #1
Comments
Some questions that we should answer:
|
@lossyrob Yeah, thanks for kicking this off. These are good questions. Here's an initial take on a format -- focused on s3 and object stores:
Pros: this is super simple on the catalog indexing side. |
Interesting, this is the sort of format I was imagining for the 'grouping' metadata. What I thought of was to have the register be a flat file of a list of URIs, and those URIs would point to a json file that looked just like what you have suggested for the register entries. Either works, I'll have to think through the pros and cons of those two options, but this is a good idea.
|
The major difference I can see between keeping the JSON entry metadata in the register and having the register hold URIs that point to provider-hosted JSON entry metadata is how updates happen. In the former case, a provider who needs to update their entry metadata would have to make a pull request to edit the information; in the latter case, the provider could modify the entry metadata on their side without making a PR to the register. If we want to make it hard for providers to update information (and review all updates), then we should keep the entries in the register. If we want providers to be able to update their own entries easily (for example, if they move some imagery to another bucket and want to provide the new bucket name), it would be advantageous to keep the provider entry metadata on the provider side. |
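To make the second option concrete, here is a minimal sketch of how an indexer might resolve a register that is just a flat list of URIs. Everything named here is an assumption for illustration: the `resolve_register` helper, the one-URI-per-line layout, the `#` comment convention, and the example URI are hypothetical and not part of any agreed format. The `fetch` callable is injected so the sketch runs without network access.

```python
import json
from typing import Callable, Dict, List


def resolve_register(register_text: str, fetch: Callable[[str], str]) -> List[Dict]:
    """Resolve a flat register (one URI per line) into provider entries.

    `fetch` takes a URI and returns the JSON text hosted there. It is passed
    in so callers can plug in HTTP, S3, or a test stub (assumption: the real
    transport has not been decided in this thread).
    """
    entries = []
    for line in register_text.splitlines():
        uri = line.strip()
        # Skip blank lines and comment lines (hypothetical convention).
        if not uri or uri.startswith("#"):
            continue
        entries.append(json.loads(fetch(uri)))
    return entries


# Stand-in for provider-hosted metadata; the URI is a made-up placeholder.
hosted = {
    "https://provider.example.org/oin.json": json.dumps(
        {"name": "Some provider", "locations": [{"type": "s3", "bucket_name": "somebucket-1"}]}
    )
}
register_text = "# hypothetical register\nhttps://provider.example.org/oin.json\n"

entries = resolve_register(register_text, hosted.__getitem__)
```

With this layout, a provider who changes the JSON they host changes what the indexer sees on the next run, and the register file itself is never touched. That is the trade-off described above: easy updates for providers, but no review step on the register side.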
I was envisioning the pointer to the JSON entry so the provider could update it themselves. Is there an in-between solution? Basic registration, but with the detailed metadata on the provider side? Perhaps that makes things too complicated. |
Yeah, these are valid points @lossyrob. Couple follow ups:
I don't think we expect people to be moving imagery between buckets a lot, do we? I would think that using something like S3 allows providers to dump into a bucket and then not worry about it.
To me, it's not about making it harder for providers but about what is better for the system and having control over the inputs into the system. Ultimately, this is a short-term, non-scalable solution. Keeping control in the repo and ensuring providers add valid data means less time spent debugging or checking. This doesn't really scale when we're above 100 objects in the file. I also don't think we're talking about a lot of data: just bucket location, name, and contact information so you can follow up. We also don't want to prescribe how someone structures their node. They may or may not want to use folders, and indexing can recursively go over a folder structure. Here's an updated format with some names adjusted for the case where a provider has multiple buckets:
{
"nodes": [
{
"name": "Some provider",
"contact": "[email protected]",
"locations": [
{
"type": "s3",
"bucket_name": "somebucket-1"
},
{
"type": "s3",
"bucket_name": "somebucket-2"
},
{
"type": "s3",
"bucket_name": "somebucket-3"
}
]
}, ...
]
}
Would it be worthwhile to pin this and start with a test HOT node? We can see how it works and then evaluate what could be improved. |
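As a rough illustration of the "ensuring providers are adding valid data" point, here is a small sketch of a check that could run against a register in the format proposed above. It is only a sketch under stated assumptions: the `validate_register` and `bucket_names` helpers are hypothetical, the rule that `name`, `contact`, and `locations` are all required is my reading of the example rather than a settled spec, and the contact address is a placeholder.

```python
import json

# Sample register in the proposed format (contact address is a placeholder).
REGISTER_TEXT = """
{
  "nodes": [
    {
      "name": "Some provider",
      "contact": "contact@example.org",
      "locations": [
        {"type": "s3", "bucket_name": "somebucket-1"},
        {"type": "s3", "bucket_name": "somebucket-2"}
      ]
    }
  ]
}
"""

# Assumption: every node needs these keys; the thread has not fixed this.
REQUIRED_NODE_KEYS = ("name", "contact", "locations")


def validate_register(register):
    """Return a list of human-readable problems; an empty list means valid."""
    nodes = register.get("nodes")
    if not isinstance(nodes, list):
        return ["top-level 'nodes' must be a list"]
    problems = []
    for i, node in enumerate(nodes):
        for key in REQUIRED_NODE_KEYS:
            if key not in node:
                problems.append(f"nodes[{i}]: missing '{key}'")
        for j, loc in enumerate(node.get("locations", [])):
            # Only "s3" appears in the proposal, so anything else is flagged.
            if loc.get("type") != "s3":
                problems.append(f"nodes[{i}].locations[{j}]: unsupported type")
            elif not loc.get("bucket_name"):
                problems.append(f"nodes[{i}].locations[{j}]: missing 'bucket_name'")
    return problems


def bucket_names(register):
    """Flatten the register into the list of S3 bucket names to index."""
    return [
        loc["bucket_name"]
        for node in register["nodes"]
        for loc in node["locations"]
        if loc.get("type") == "s3"
    ]


register = json.loads(REGISTER_TEXT)
problems = validate_register(register)
buckets = bucket_names(register)
```

A check like this could run on each pull request to the list, so a malformed entry is caught at review time rather than when the indexer runs.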
I was imagining not moving imagery between buckets, but specifying keys within a bucket...say if a provider wanted to add another folder to the set of folders specified under one bucket (assuming we could have a I see advantages and disadvantages to both sides, but also I'm big on an implement-and-refactor workflow, so I totally agree with your last couple of sentences. |
Cool, I'll get a HOT bucket set up and use that as the first pull request to the list. |
The register acts as a list of URIs of endpoints that allow participants of OIN to be discoverable. What are the requirements on a URI that is included in the list?