Update extreme_scale_workflows.md #648

Merged
merged 2 commits into from
Feb 6, 2024
10 changes: 10 additions & 0 deletions _projects/extreme_scale_workflows.md
@@ -239,6 +239,16 @@ Today's science campaigns consist of multiple tasks with wide-ranging data and c

We published our results at the 2021 IEEE Workshop on Workflows in Support of Large-Scale Science (WORKS) {% cite yildiz2021dynamic --file jlesc.bib %}. In 2022, we will continue to apply our approach, dynamic heterogeneous in situ workflows, to several science use cases, including elastic distributed deep learning. This is a joint project between Orcun Yildiz, Bogdan Nicolae, and Tom Peterka of ANL, and we are actively looking for students to help us with this project.

## Results for 2023/2024

### LowFive: In Situ Data Transport for High-Performance Workflows

* Tom Peterka, Orcun Yildiz, and Bogdan Nicolae of ANL

We describe LowFive, a new data transport layer for in situ workflows, based on the HDF5 data model. Executables using LowFive can communicate in situ (using in-memory data and MPI message passing), read and write traditional HDF5 files on physical storage, or combine the two modes. Programs that already use HDF5 need minimal, and often no, source-code modification. LowFive maintains either deep copies of datasets or shallow references to them, as configured by the user. More than one task can produce (write) data, and more than one task can consume (read) data, which accommodates fan-in and fan-out in the workflow task graph. LowFive supports data redistribution from n producer processes to m consumer processes. We demonstrate these features in a series of experiments featuring both synthetic benchmarks and a representative use case from a scientific workflow, and we compare LowFive with other data transport solutions in the literature.
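
To illustrate what redistribution from n producers to m consumers involves, here is a minimal sketch in plain Python. It is not LowFive's implementation or API: it assumes a one-dimensional dataset split into contiguous, near-equal blocks on both sides, and the helper names `block_bounds` and `redistribution_plan` are hypothetical. It only computes which index ranges each producer would need to send to each consumer.

```python
def block_bounds(total, parts, rank):
    """Half-open [start, stop) range owned by `rank` when `total` items
    are split into `parts` contiguous, near-equal blocks (assumed layout)."""
    base, extra = divmod(total, parts)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def redistribution_plan(total, n_producers, m_consumers):
    """Return (producer, consumer, start, stop) tuples, one for each
    non-empty overlap between a producer block and a consumer block."""
    plan = []
    for p in range(n_producers):
        p_start, p_stop = block_bounds(total, n_producers, p)
        for c in range(m_consumers):
            c_start, c_stop = block_bounds(total, m_consumers, c)
            lo, hi = max(p_start, c_start), min(p_stop, c_stop)
            if lo < hi:  # the two blocks overlap
                plan.append((p, c, lo, hi))
    return plan


if __name__ == "__main__":
    # 10 elements, 3 producers, 2 consumers
    for transfer in redistribution_plan(10, 3, 2):
        print(transfer)
```

In an actual in situ workflow, each tuple would correspond to a message between two processes; multidimensional datasets would need the same overlap test applied per dimension.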

We published our results at the 2023 International Parallel and Distributed Processing Symposium (IPDPS).

## Visits and meetings

* Pierre-Louis Guhur of ENS, 9 months at ANL in 2016