Use Case 10: Discussion #10
Comments
I think it is a good idea to separate these two requirements in Use Case 10. We can also say that being able to select part of a dataset is important for implementing parallel I/O, where independent processes each access their own portion of the data. Perhaps this could be added to Requirement 10, or to a new requirement that describes accessing a dataset in parallel.
I've added your parallel I/O wording to Use Case 10, but it seems a little bit bolted on. Could you write a separate Use Case about parallel I/O access, perhaps around a large-dataset scenario? I worry we are missing some important aspects of this functionality, and of capturing requirements for large datasets in general. Note that I've also added a parallel I/O requirement.
Quick note that Requirement 12: Parallel I/O Support seems to overlap with Requirement 10: partial read of format. I'm not sure whether these are really different. I've linked them in the wiki so that the issue is highlighted, but opinions on this matter would be welcome.
To add to the discussion: agreed that parallel I/O is an important requirement. While parallel I/O would be useful for a lot of things, here's a specific use case for parallel write in radio astronomy: an FX correlator breaks up the cross-correlation into frequency subbands over several compute nodes. To reconstruct the full spectrum, each compute node needs to write its subband to a single file (or file-like object). And for parallel read: a user wishes to image several subbands of a wide-bandwidth visibility dataset produced by a correlator. Data access should be parallelizable over both time and frequency, so that multiple parallel data reduction pipelines can be run at once on the same dataset.
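To make the access pattern concrete, here is a minimal sketch of the subband scenario. It is only an illustration under stated assumptions: the flat binary file, the float32 values, the array sizes, and the helper names `subband_slices`, `write_subband` and `read_selection` are all hypothetical, and a sequential loop stands in for the independent compute nodes. It says nothing about which format or library should be used.

```python
import os
import tempfile

import numpy as np


def subband_slices(n_channels, n_nodes):
    """Split the channel axis into contiguous, non-overlapping subbands."""
    edges = np.linspace(0, n_channels, n_nodes + 1).astype(int)
    return [slice(int(lo), int(hi)) for lo, hi in zip(edges[:-1], edges[1:])]


def write_subband(path, shape, chan_slice, data):
    """Open the shared file and write only this node's block of channels."""
    out = np.memmap(path, dtype=np.float32, mode="r+", shape=shape)
    out[:, chan_slice] = data
    out.flush()


def read_selection(path, shape, time_slice, chan_slice):
    """Read a (time, frequency) selection without loading the whole file."""
    view = np.memmap(path, dtype=np.float32, mode="r", shape=shape)
    return np.array(view[time_slice, chan_slice])


# Hypothetical sizes: 4 time samples, 16 channels, 4 compute nodes.
n_times, n_channels, n_nodes = 4, 16, 4
shape = (n_times, n_channels)
path = os.path.join(tempfile.mkdtemp(), "visibilities.dat")

# Create the single output file once, at its full size.
np.memmap(path, dtype=np.float32, mode="w+", shape=shape).flush()

# Each "node" writes its own subband; in practice these would be
# separate processes, here a loop stands in for them.
slices = subband_slices(n_channels, n_nodes)
for node, chan_slice in enumerate(slices):
    n_sub = chan_slice.stop - chan_slice.start
    block = np.full((n_times, n_sub), node, dtype=np.float32)
    write_subband(path, shape, chan_slice, block)

# A reader selects the first two time samples of the third subband.
selection = read_selection(path, shape, slice(0, 2), slices[2])
```

Because the subbands do not overlap, the writers never touch the same bytes, which is why a partial-selection capability is enough to allow independent writers and readers on one file.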
@telegraphic Thank you. Those are good details. Do you think you could meld them into Use Case 10?
I think it makes sense to have a separate use case for parallel I/O, because selecting part of a large dataset as described in Use Case 10 can have other important applications. The radio astronomy example for parallel data analysis is very good. I can write a use case for distributed data access that will be related to the new Requirement 12; please feel free to add more details specific to the radio astronomy case.
Use Case 17 is looking good!
Thanks @telegraphic, could you add the example of parallel data access in radio astronomy to Use Case 17?
Added it in; feel free to edit as required.
Great, thank you!
I've tried to extract requirements from this Use Case, but I fear I may have misunderstood the intent. Currently I have extracted Requirements 10 and 11 from this use case (they appear to me to be different requirements). Please review and give feedback on any needed changes or problems.