-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AUDR analysis updates RFC #91
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking really good; clear and to the point. images are great. just minor updates requested.
6. For files that are in both envelopes but don’t need updating, do nothing. | ||
|
||
![](../images/0000-update-file-relationships.png) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be good to describe that updated files are not really updated but deleted and then an independent new file is created and why. Also that this is a short term compromise to simplify implementation at the expense of a more complex data mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it appears this is no longer the case and the figure just needs updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For an updated analysis the figure tries to show the possible scenarios of a) not changing a file, b) updating a file, c) adding a file, d) deleting a file. But it sounds like this isn't easy to see and I should improve the diagram?
On the content, yes on further consideration deleting and re-adding all the other files seems a worse idea since it means some special case work and I'm not sure it ultimately helps analysis much. This is a change from what was discussed so I need feedback from @rexwangcc and @samanehsan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justincc can you clarify why the old files need to be deleted in order to add analysis files that have new uuids?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@samanehsan Because everything in the previous version of that bundle will automatically carry forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you confirm that 2 analyses triggering in quick succession is a scenario that's not sufficiently theoretical for us to ignore temporarily?
@justincc I don't think the scenario is unrealistic because it could happen if a primary submitter makes/updates a submission even an hour or two apart. Looking at this production workflow as an example, after analysis creates a submission envelope in ingest it takes 2 hours to upload the files and confirm the submission. So if an update came in less than 2 hours later, we would have this "simultaneous" submission problem.
Also in this case, the confirm submission step only took 4 minutes but the workflow can wait up to 2 hours for the submission to validate. And then there is additional time for the submission to go through ingest and appear as a results bundle in the data store. So in the worst-case scenario, "quick succession" actually means 2+ hours.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justincc, this is the concern that @rexwangcc and I mentioned in tech-arch today ^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think the answer here is that ingest may need to do complex engineering to ensure non-interference between submissions. Does that address the issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify what that will look like in ingest? Is the idea that analysis would still specify which files we think are new/updated/deleted and then ingest confirms? That seems like we would be duplicating effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment there's no plan for ingest to confirm anything, it just does what you say or some known interpretation thereof. If it's an update then the new update (which is given as the file, not a delta) will replace whatever the old version was. If it's addition of a file that already exists then it could just replace what is already there if this is the easiest behaviour. Or if it's a deletion of something that's already deleted then we could ignore that if this is the easiest behaviour.
Many people are confused about usage here: * #91 (comment) * #90 (comment) * ... have observed others
Co-Authored-By: Matt Weiden <[email protected]>
Co-Authored-By: Matt Weiden <[email protected]>
* Hint that the title name should be replaced Many people are confused about usage here: * #91 (comment) * #90 (comment) * ... have observed others * Update rfc-template.md * Update rfc-template.md
6. For files that are in both envelopes but don’t need updating, do nothing. | ||
|
||
![](../images/0000-update-file-relationships.png) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justincc I'm a little confused as to why the old files from the original analysis bundle need to be deleted. Why would you copy forward files from the old analysis bundle to the new analysis bundle? Shouldn't the copy forward be from the new primary bundle to the new analysis bundle?
…ubmission envelopes Also clarify what a submission envelope is
…h old and new analyses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work!
* Hint that the title name should be replaced Many people are confused about usage here: * HumanCellAtlas#91 (comment) * HumanCellAtlas#90 (comment) * ... have observed others * Update rfc-template.md * Update rfc-template.md
…CellAtlas/dcp-community into justincc-rfc-audr-analysis-updates
@justincc there's an "rfc-paused" label you can add as well |
* Hint that the title name should be replaced Many people are confused about usage here: * HumanCellAtlas#91 (comment) * HumanCellAtlas#90 (comment) * ... have observed others * Update rfc-template.md * Update rfc-template.md
August 12: Last call for community reviewAugust 23: Last day for oversight review
During community review, reviewers