Add ideas from LBNL. Bin. (#320)
* Add project ideas from LBNL. Bin

* update index files

* Fix tags format issues index.md

* Update issues with authors index.md

---------

Co-authored-by: Carlos Maltzahn <[email protected]>
BinDong314 and carlosmalt authored Jan 23, 2024
1 parent dfd47e8 commit 1bb2398
Showing 4 changed files with 49 additions and 0 deletions.
Binary file added content/project/osre24/lbl/aiio/featured.jpg
24 changes: 24 additions & 0 deletions content/project/osre24/lbl/aiio/index.md
@@ -0,0 +1,24 @@
---
title: "AIIO / Graph Neural Network"
authors: [bindong, "Suren Byna"]
author_notes: ["Research Scientist, Lawrence Berkeley Lab", "The Ohio State University (OSU)"]
tags: ["osre24", "uc", "AI", "Bottleneck Analysis"]
date: 2024-01-17T10:15:56-07:00
lastmod: 2024-01-17T10:15:56-07:00
---

[AIIO](https://github.com/hpc-io/aiio) revolutionizes how users automatically tune the I/O performance of applications on HPC systems. It currently relies on linear regression models, but there is an opportunity to extend it to heterogeneous data, such as programming information. Doing so requires extending the linear regression model to more complex models, such as heterogeneous graph neural networks. The proposed work includes developing a graph neural network-based model to predict I/O performance and to interpret the predictions.

### AIIO / Graph Neural Network

- **Topics**: `AIIO/Graph Neural Network`
- **Skills**: Python, GitHub, Machine Learning
- **Difficulty**: Difficult
- **Size**: Large (350 hours)
- **Mentor**: {{% mention bindong %}}, Suren Byna

The specific tasks of the project include:
- Develop a data pre-processing pipeline that converts I/O logs into the formats required by the Graph Neural Network.
- Build and test the Graph Neural Network to model the I/O performance of HPC applications.
- Test and evaluate the accuracy of the Graph Neural Network with test cases from AIIO.
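
As a rough illustration of the first task, the sketch below turns a handful of I/O log records into a small heterogeneous graph with two node types (processes and files) and one edge type. The record fields (`rank`, `file`, `bytes_read`, `bytes_written`), the node and edge type names, and the output layout are all assumptions made for this example; they are not AIIO's actual log schema or data format.

```python
# Hypothetical, simplified I/O log records. Real I/O logs carry many more
# counters; every field name here is an assumption for illustration only.
SAMPLE_LOG = [
    {"rank": 0, "file": "a.h5", "bytes_read": 4096, "bytes_written": 0},
    {"rank": 1, "file": "a.h5", "bytes_read": 0, "bytes_written": 8192},
    {"rank": 1, "file": "b.h5", "bytes_read": 1024, "bytes_written": 1024},
]


def build_hetero_graph(records):
    """Convert log records into a two-node-type graph with a COO edge list."""
    rank_ids, file_ids = {}, {}
    edge_src, edge_dst, edge_feat = [], [], []
    for rec in records:
        # Give each distinct rank and file a dense integer index,
        # in order of first appearance.
        src = rank_ids.setdefault(rec["rank"], len(rank_ids))
        dst = file_ids.setdefault(rec["file"], len(file_ids))
        edge_src.append(src)
        edge_dst.append(dst)
        # Edge features: per-access byte counters.
        edge_feat.append([rec["bytes_read"], rec["bytes_written"]])
    return {
        "nodes": {"process": list(rank_ids), "file": list(file_ids)},
        "edges": {
            ("process", "accesses", "file"): {
                "index": [edge_src, edge_dst],
                "features": edge_feat,
            }
        },
    }


graph = build_hetero_graph(SAMPLE_LOG)
```

The `(source type, relation, destination type)` key and the two-row edge index follow a layout that heterogeneous GNN libraries commonly accept, so a structure like this could be converted to library-specific tensors in a later pipeline step.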

Binary file added content/project/osre24/lbl/fastensor/featured.png
25 changes: 25 additions & 0 deletions content/project/osre24/lbl/fastensor/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
---
title: "FasTensor / Stream Processing"
authors: [bindong, kwu]
author_notes: ["Research Scientist, Lawrence Berkeley Lab", "Senior Computer Scientist, Lawrence Berkeley Lab"]
tags: ["osre24", "uc", "Streaming Processing"]
date: 2024-01-17T10:15:56-07:00
lastmod: 2024-01-17T10:15:56-07:00
---

[FasTensor](https://github.com/BinDong314/FasTensor) is a generic tensor processing engine that scales from a single node to thousands of nodes on HPC systems. FasTensor supports applications ranging from traditional SQL queries to complex DFT solvers in scientific applications. It has a 1000X performance advantage over MapReduce and Spark in supporting generic data processing functions on tensor structures. In this project, we propose to expand FasTensor with streaming functionality to support online data processing. Specifically, participants in this project will develop a stream endpoint for retrieving live data output from applications, such as DAS. The stream endpoint maintains a pointer to the data, which could be an n-dimensional subset of a tensor.

### FasTensor / Stream Processing

- **Topics**: `FasTensor/Streaming Processing`
- **Skills**: C++, GitHub
- **Difficulty**: Difficult
- **Size**: Large (350 hours)
- **Mentor**: {{% mention bindong %}}, {{% mention kwu %}}

The specific tasks of the project include:
- Building a mock workflow based on our DAS application (https://github.com/BinDong314/DASSA) to test stream processing. The mock workflow comprises a data producer, which generates DAS data, and a data consumer, which processes the data.
- Developing a Stream Endpoint (e.g., I/O driver) to iteratively read dynamically increasing data from a directory. The stream endpoint essentially includes open, read, and write functions, and a pointer that tracks the current file position.
- Integrating the Stream Endpoint into the FasTensor library.
- Evaluating the performance of the mock workflow with the new Stream Endpoint.
- Documenting the execution mechanism.
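
The stream endpoint described in the tasks above can be sketched as follows. This is a minimal illustration in Python rather than FasTensor's C++, and the class name `StreamEndpoint`, its method signatures, and the file naming scheme are hypothetical; none of them come from the FasTensor API. It assumes that file names sort in production order and that a file is fully written before the consumer sees it.

```python
import tempfile
from pathlib import Path


class StreamEndpoint:
    """Hypothetical sketch: reads files from a growing directory, one at a time."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.position = 0      # pointer: number of files consumed so far
        self.is_open = False

    def open(self):
        self.directory.mkdir(parents=True, exist_ok=True)
        self.is_open = True

    def write(self, name, payload):
        """Producer side: add one new file to the stream directory."""
        (self.directory / name).write_bytes(payload)

    def read(self):
        """Consumer side: return (name, bytes) for the next unread file, or None."""
        if not self.is_open:
            raise RuntimeError("endpoint is not open")
        # Assumption: names sort in production order (e.g. zero-padded index).
        files = sorted(p.name for p in self.directory.iterdir())
        if self.position >= len(files):
            return None  # no new data yet; the caller may poll again later
        name = files[self.position]
        data = (self.directory / name).read_bytes()
        self.position += 1
        return name, data


# Mock workflow: a producer writes two files while a consumer polls.
results = []
with tempfile.TemporaryDirectory() as tmp:
    endpoint = StreamEndpoint(tmp)
    endpoint.open()
    endpoint.write("das_0000.bin", b"first")
    results.append(endpoint.read())   # picks up the first file
    results.append(endpoint.read())   # nothing new yet
    endpoint.write("das_0001.bin", b"second")
    results.append(endpoint.read())   # picks up the newly arrived file
```

Keeping the pointer as a count of consumed files is the simplest choice for this sketch. A real endpoint would also need to handle partially written files and, as the project description suggests, might track an n-dimensional offset into a tensor instead of a file index.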
