Prevent race between look-up and set deletion
Fix a race condition where a set delete request from a peer could
invalidate the maps while ldmsd is handling a rendezvous lookup and is about to
submit a remote read request. Check that the remote and local map handles of
the set are valid, and hold the set lock during the entire read submission to
ensure the maps remain valid.

This is a corner case. The time window between the server deleting the set
and responding to a lookup request is very small. In practice, this could
happen when a set is very short-lived.
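
The check-then-submit pattern described in the commit message can be sketched in isolation. This is a minimal illustration, not the ldmsd code: `demo_set`, `demo_submit_read`, `demo_set_delete`, and `demo_lookup_read` are hypothetical names, the read submission is a stub standing in for `zap_read`, and returning `ENOENT` when the maps are gone is an assumption (the actual patch jumps to its `callback` label instead).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a set that owns local and remote map handles. */
struct demo_set {
	pthread_mutex_t lock;
	void *lmap;          /* local map handle; NULL once the set is deleted */
	void *rmap;          /* remote map handle; NULL once the set is deleted */
	int reads_submitted; /* counts successful read submissions */
};

/* Stub for the read submission; it must never see an invalidated map. */
static int demo_submit_read(struct demo_set *s)
{
	assert(s->lmap && s->rmap);
	s->reads_submitted++;
	return 0;
}

/* Deletion path: invalidates both maps while holding the set lock. */
static void demo_set_delete(struct demo_set *s)
{
	pthread_mutex_lock(&s->lock);
	s->lmap = NULL;
	s->rmap = NULL;
	pthread_mutex_unlock(&s->lock);
}

/*
 * Lookup path: validate the maps and submit the read under a single
 * lock hold, so a concurrent delete cannot slip in between the two.
 * ENOENT is an assumed error code chosen for this sketch only.
 */
static int demo_lookup_read(struct demo_set *s)
{
	int rc;

	pthread_mutex_lock(&s->lock);
	if (!s->lmap || !s->rmap) {
		pthread_mutex_unlock(&s->lock);
		return ENOENT;
	}
	rc = demo_submit_read(s);
	pthread_mutex_unlock(&s->lock);
	return rc;
}
```

Because the delete path takes the same lock before clearing the handles, a lookup either submits its read with both maps intact or observes the cleared handles and bails out; it cannot see them change between the check and the submission.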
nichamon authored and tom95858 committed Jan 29, 2025

1 parent 6a331d7 commit 23fc75e
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions ldms/src/core/ldms_xprt.c
@@ -2608,11 +2608,17 @@ static void handle_rendezvous_lookup(zap_ep_t zep, zap_event_t ev,
 	rd_ctxt->rc = ctxt->rc;
 	pthread_mutex_unlock(&x->lock);
 	assert((zep == x->zap_ep) && (x == rd_ctxt->x));
+	pthread_mutex_lock(&lset->lock);
+	if (!lset->lmap || !lset->rmap) {
+		pthread_mutex_unlock(&lset->lock);
+		goto callback;
+	}
 	rc = zap_read(zep,
 		      lset->rmap, zap_map_addr(lset->rmap),
 		      lset->lmap, zap_map_addr(lset->lmap),
 		      __le32_to_cpu(lset->meta->meta_sz),
 		      rd_ctxt);
+	pthread_mutex_unlock(&lset->lock);
 	if (rc) {
 		x->zerrno = rc;
 		rc = zap_zerr2errno(rc);
