Improve parallel POSIX read performance #641

Draft: wants to merge 1 commit into base branch-25.04

kingcrimsontianyu commented Feb 22, 2025

Not for review

Investigating #629

Number of subtasks per task

KVIKIO_NUM_SUBTASKS_PER_TASK
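
The description does not spell out how KVIKIO_NUM_SUBTASKS_PER_TASK is consumed. As a rough sketch only, assuming it is a positive integer read from the environment and that a task's byte range is divided evenly into that many subtasks, the idea might look like the following. The helpers `getenv_or` and `split_task` are hypothetical names for illustration, not KvikIO functions, and the fallback value is arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: read a positive integer from an environment variable.
// Falls back to `default_value` if the variable is unset, empty, contains
// anything other than decimal digits, or parses to zero.
inline std::size_t getenv_or(char const* name, std::size_t default_value)
{
  char const* raw = std::getenv(name);
  if (raw == nullptr) { return default_value; }
  std::string const s{raw};
  if (s.empty() || s.find_first_not_of("0123456789") != std::string::npos) {
    return default_value;
  }
  unsigned long long const parsed = std::strtoull(s.c_str(), nullptr, 10);
  return parsed == 0 ? default_value : static_cast<std::size_t>(parsed);
}

// Hypothetical helper: split the byte range [offset, offset + size) into at
// most `num_subtasks` contiguous (offset, size) pieces of near-equal size.
inline std::vector<std::pair<std::size_t, std::size_t>> split_task(
  std::size_t offset, std::size_t size, std::size_t num_subtasks)
{
  assert(num_subtasks > 0);
  // Ceiling division so that the pieces cover the whole range.
  std::size_t const chunk = (size + num_subtasks - 1) / num_subtasks;
  std::vector<std::pair<std::size_t, std::size_t>> pieces;
  for (std::size_t done = 0; done < size; done += chunk) {
    pieces.emplace_back(offset + done, std::min(chunk, size - done));
  }
  return pieces;
}
```

A caller could then write `split_task(offset, task_size, getenv_or("KVIKIO_NUM_SUBTASKS_PER_TASK", 1))` and issue one read per piece. Whether the PR splits tasks this way is an assumption.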

kingcrimsontianyu added the improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change), and c++ (Affects the C++ API of KvikIO) labels Feb 22, 2025
kingcrimsontianyu self-assigned this Feb 22, 2025

copy-pr-bot bot commented Feb 22, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.


kingcrimsontianyu commented Feb 22, 2025

(Outdated) Performance checkpoint 926fbb8

  • branch-25.04, 72 threads, GH200
    This figure shows the task profiles from 4 out of 72 threads. The first task takes 85 ms to complete. cuMemHostAlloc takes 1 ms per thread.
    image

  • This PR, 72 threads, GH200
    Now the first task takes 79 ms. cuMemHostAlloc takes 27 ms per thread.
    image

Conclusion: the latency spike shrinks only slightly, from 85 ms to 79 ms. It may be worth adding a page-locked memory pool, or at least pre-allocating a single page-locked memory block of size nthreads * task_size.
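
To make the pre-allocation idea concrete, here is a minimal sketch, not the PR's implementation: one block of nthreads * task_size bytes is allocated once, and each thread gets its own fixed slice, so no per-task allocation or locking is needed. The class name `BounceBufferBlock` is hypothetical. The allocator is injectable, and `std::malloc`/`std::free` are only placeholders so that the sketch compiles without CUDA; a real version would pass wrappers around `cuMemHostAlloc` and `cuMemFreeHost`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>
#include <utility>

// Hypothetical sketch: a single up-front allocation of nthreads * task_size
// bytes, carved into one fixed, non-overlapping slice per thread.
class BounceBufferBlock {
 public:
  using AllocFn = std::function<void*(std::size_t)>;
  using FreeFn  = std::function<void(void*)>;

  // The default allocator pair is a stand-in. For page-locked memory, pass
  // callables that wrap cuMemHostAlloc and cuMemFreeHost instead.
  BounceBufferBlock(
    std::size_t nthreads,
    std::size_t task_size,
    AllocFn alloc  = [](std::size_t n) { return std::malloc(n); },
    FreeFn dealloc = [](void* p) { std::free(p); })
    : nthreads_{nthreads}, task_size_{task_size}, dealloc_{std::move(dealloc)}
  {
    assert(nthreads > 0 && task_size > 0);
    // The only allocation this object ever performs.
    base_ = static_cast<char*>(alloc(nthreads * task_size));
    if (base_ == nullptr) { throw std::bad_alloc{}; }
  }

  ~BounceBufferBlock() { dealloc_(base_); }

  BounceBufferBlock(BounceBufferBlock const&)            = delete;
  BounceBufferBlock& operator=(BounceBufferBlock const&) = delete;

  // Slice reserved for thread `thread_idx`. Slices never overlap, so threads
  // can use them concurrently without synchronization.
  char* slice(std::size_t thread_idx) const
  {
    assert(thread_idx < nthreads_);
    return base_ + thread_idx * task_size_;
  }

  std::size_t slice_size() const { return task_size_; }
  std::size_t total_size() const { return nthreads_ * task_size_; }

 private:
  std::size_t nthreads_;
  std::size_t task_size_;
  FreeFn dealloc_;
  char* base_{nullptr};
};
```

With 72 threads, `BounceBufferBlock block(72, task_size)` would pay the allocation cost once at startup rather than on each thread's first task. A real pool would also have to handle tasks larger than `task_size` and thread pools that are resized at runtime, which this sketch ignores.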

kingcrimsontianyu changed the title from "Lock free bounce buffer" to "Improve parallel I/O read performance" Feb 27, 2025
kingcrimsontianyu changed the title from "Improve parallel I/O read performance" to "Improve parallel POSIX read performance" Feb 27, 2025
kingcrimsontianyu force-pushed the lock-free-bounce-buffer branch 7 times, most recently from 6d77d41 to 9cfed2d, March 7, 2025 20:02
Labels: c++ (Affects the C++ API of KvikIO), DO NOT MERGE, improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change)