Threads WG Meeting 06 26 2018
Manjunath Gorentla Venkata edited this page Oct 2, 2018 · 1 revision
(Thanks Swaroop for the notes)
- Issue #223 (https://github.com/openshmem-org/specification/issues/223)
- Updates on PR #103 (https://github.com/openshmem-org/specification/pull/103)
-
Issue 223
- What is the OSH safe way to create child processes ?
- Users have used vfork(): it creates a child process while the parent process is suspended.
- Child has lower memory overhead
- same shared memory
- POSIX.1-2001 deprecated it (marked it obsolescent); it is dangerous if the child does not call _exit()
- posix_spawn() - works with all osh implementations
-
Should OSH put constraints such that the implementation should support forking like posix_spawn()?
- ORNL - Leave it undefined: as multithreaded + forking can get complicated
- What would we need to standardize this behavior ?
- Intel : Does the fork call OSH ?
- Children are not expected to shmem_init().
- Cray: If it does not work, something is wrong with the implementation.
- Cray already supports it.
- Intel: Concerns regarding its interaction with other components of the software stack.
- Intel: This should work on all commodity distributions.
- Are we trying to specify the semantics wrt symmetric variables (updates by child process) ?
- ORNL: Does OMPI-OSH support it ?
- Segfaults.
- DoD: Use case: children copy symmetric memory (maybe at a checkpoint) but do not use it as symmetric memory.
- ORNL: Need to understand the implications in greater detail.
- Action Items:
- Nick: Create ticket, supply test code for everyone to try.
- Manju: Find OpenMPI's support for fork.
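As an illustration of the posix_spawn() option discussed above, here is a minimal sketch of a PE launching a helper program and waiting for it. The function name `spawn_and_wait` is hypothetical and not part of OpenSHMEM or of the ticket. The sketch assumes the PE calls it between shmem_init() and shmem_finalize(), and that the child never calls shmem_init(), as noted in the discussion. Whether this is safe on a given OpenSHMEM implementation is exactly the open question of Issue 223.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: spawn argv[0] (looked up on PATH) without
 * fork()/vfork() in user code, then wait for the child to finish.
 * Returns the child's exit status, or -1 on any failure. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp() returns an error number directly instead of setting errno. */
    int rc = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (rc != 0) {
        fprintf(stderr, "posix_spawnp: %s\n", strerror(rc));
        return -1;
    }

    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }

    /* Report only normal termination; signals count as failure here. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child runs a fresh program image, so it does not inherit a live view of the parent's symmetric heap; the DoD use case of copying symmetric memory in a child would need a different mechanism.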
-
PR 103
- Discussing the changes to PR from last meeting.
- Text changes - more to come with chapter edits
- Language clarification in shmem_wait_nbe
- Rename API - shmem_wait_nbe and shmem_test_nbe
- since wait is actually blocking
- Merged handles with multiple requests have been removed from this PR
- Questions:
- DoD:
- Is there no ordering guarantee between operations on the same merged request, or between different merged requests?
- Both
- The relationship between memory allocation behind the scenes and state of the request handle is not clear.
- what does the data structure look like ?
- Opaque to user
- State should be query-able
- ORNL: We have the distinction internally.
- Does wait uninitialize ?
- Yes, if the associated operations are completed.
- API is not clear about handle allocation
- There is both implicit and explicit support in the OSH-X implementation
- Explicit was removed with the merged handle semantics.
- Exposing state makes sense for explicit.
- How much space is required to track a request ?
- small - Don't require allocation request
- significant - Have allocation call
- A: 2 words, so an allocation request is not required.
- Context variants of non-blocking calls ?
- Not at this time.
- Con: There will be a ctx and req object.
- Intel:
- Return value shmem_request_allocate passed by ref or value ?
- By Ref (for allocate and put)
- Put routine could be allocating a request ?
- Yes
- This is confusing.
- Why reuse handles? What is the stale value that wait resets?
- Allocate provides hints to the runtime
- Does quiet affect the completion ?
- Wait gives remote completion.
- Why do it this way? - Usually wait gives a local completion.
- Throughput vs. tracking
- Cray has a way to chain nbe operations.
- Add use cases to the proposal.
- Action Items:
- Swen: Make requirements and states more explicit.
- Nick and Jim: Comments on GitHub
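To make the handle questions above concrete, here is a toy model of a two-word request handle with a queryable state and a test call that resets the handle once all attached operations are complete. Every name here (`toy_req_t`, `toy_req_attach`, `toy_req_test`, and so on) is hypothetical. This is not the PR 103 API or any implementation's data structure, only a sketch of the semantics discussed, assuming completion is tracked with two counters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word handle: small enough that no allocation call is needed. */
typedef struct {
    uint64_t issued;     /* operations attached to this handle */
    uint64_t completed;  /* attached operations known to be complete */
} toy_req_t;

#define TOY_REQ_INIT ((toy_req_t){0, 0})

typedef enum { TOY_REQ_IDLE, TOY_REQ_PENDING, TOY_REQ_DONE } toy_req_state_t;

/* A non-blocking operation attaches itself to the handle. */
void toy_req_attach(toy_req_t *r) { r->issued++; }

/* Stand-in for the runtime observing one completion. */
void toy_req_complete_one(toy_req_t *r)
{
    if (r->completed < r->issued)
        r->completed++;
}

/* The state is queryable without changing the handle. */
toy_req_state_t toy_req_state(const toy_req_t *r)
{
    if (r->issued == 0)
        return TOY_REQ_IDLE;
    return (r->completed == r->issued) ? TOY_REQ_DONE : TOY_REQ_PENDING;
}

/* Non-blocking test: returns true and resets the handle to its initial
 * state only if no attached operation is still pending. */
bool toy_req_test(toy_req_t *r)
{
    if (toy_req_state(r) == TOY_REQ_PENDING)
        return false;
    *r = TOY_REQ_INIT;
    return true;
}
```

In this model a handle is reusable immediately after a successful test, which is one possible reading of "wait uninitializes if the associated operations are completed".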