
rpcclient: handle RPC requests based on their priority #2027

Closed
wants to merge 9 commits into from

Conversation

yyforyongyu
Collaborator

This PR introduces a new flag lowPriority on RPC requests to differentiate their priorities so that more urgent RPC requests can be processed earlier. This is needed because bitcoind will block when too many RPC requests are made at the same time, and our recent addition of the mempool notifier puts a lot of pressure on it by calling getrawtransaction for every new transaction, causing other RPC requests to halt. To solve this, we introduce a new method GetRawTransactionLazy, which makes those RPC calls friendlier and leaves room for other, more urgent RPC calls to go through.

Resulted from running `go get "golang.org/x/sync/errgroup"` and `go mod tidy`.
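For context, a rough usage sketch of where each call is meant to live. This assumes GetRawTransactionLazy mirrors GetRawTransaction's signature; the helper below is purely illustrative:

```go
package example

import (
	"log"

	"github.com/btcsuite/btcd/chaincfg/chainhash"
	"github.com/btcsuite/btcd/rpcclient"
)

// fetchTx illustrates the intended split: urgent lookups keep using the
// normal call, while bulk mempool lookups use the low priority variant.
func fetchTx(client *rpcclient.Client, txHash *chainhash.Hash, urgent bool) {
	if urgent {
		// High priority: goes onto the normal queue.
		tx, err := client.GetRawTransaction(txHash)
		if err != nil {
			log.Printf("urgent lookup failed: %v", err)
			return
		}
		log.Printf("got %v", tx.Hash())
		return
	}

	// Low priority: yields to more urgent RPCs (assumed signature).
	tx, err := client.GetRawTransactionLazy(txHash)
	if err != nil {
		log.Printf("lazy lookup failed: %v", err)
		return
	}
	log.Printf("got %v", tx.Hash())
}
```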
@coveralls

coveralls commented Aug 30, 2023

Pull Request Test Coverage Report for Build 6038665903

  • 0 of 152 (0.0%) changed or added relevant lines in 2 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.1%) to 55.121%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| rpcclient/rawtransactions.go | 0 | 19 | 0.0% |
| rpcclient/infrastructure.go | 0 | 133 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| rpcclient/rawtransactions.go | 1 | 0.0% |

Totals Coverage Status
  • Change from base Build 5957307420: -0.1%
  • Covered Lines: 26914
  • Relevant Lines: 48827

💛 - Coveralls

Collaborator

@guggero guggero left a comment

Great idea and optimizations! Though I think some assumptions around reading from channels have changed and might introduce new blocking behavior.

@@ -892,7 +929,7 @@ out:
// Send any messages ready for send until the shutdown channel
// is closed.
select {
case jReq := <-c.highPriorityPostQueue:
case jReq := <-c.nextPostRequest():
Collaborator

This now no longer considers the c.shutdown channel, because we'll block on the method call c.nextPostRequest() and only actually enter the select once that returns. So I think we need to consider c.shutdown within nextPostRequest() and then we can simply return the next request (instead of a channel) or a boolean that indicates we're shutting down.
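Something along these lines is what I'd have in mind (a sketch only, reusing this PR's queue fields; the actual fix may look different):

```go
// nextPostRequest returns the next request to send, always preferring the
// high priority queue. The boolean is false once the client is shutting
// down, so the caller can exit its loop.
func (c *Client) nextPostRequest() (*jsonRequest, bool) {
	// Give high priority requests a chance to win first.
	select {
	case jReq := <-c.highPriorityPostQueue:
		return jReq, true
	case <-c.shutdown:
		return nil, false
	default:
	}

	// Otherwise block on either queue or on shutdown.
	select {
	case jReq := <-c.highPriorityPostQueue:
		return jReq, true
	case jReq := <-c.lowPriorityPostQueue:
		return jReq, true
	case <-c.shutdown:
		return nil, false
	}
}
```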

Collaborator Author

good catch, fixed!

@@ -905,7 +942,7 @@ out:
cleanup:
for {
select {
case jReq := <-c.highPriorityPostQueue:
case jReq := <-c.nextPostRequest():
Collaborator

Same here, this breaks assumptions and will block forever on shutdown.

Collaborator Author

This one is a bit different since we want to drain the channels here, so I changed it to read the channels directly.
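For reference, the drain-on-shutdown shape being described looks roughly like this (a sketch; the exact diff may differ):

```go
// Fail any queued requests directly during shutdown rather than going
// through nextPostRequest, so both priority queues are fully drained.
cleanup:
	for {
		select {
		case jReq := <-c.highPriorityPostQueue:
			jReq.responseChan <- &Response{err: ErrClientShutdown}

		case jReq := <-c.lowPriorityPostQueue:
			jReq.responseChan <- &Response{err: ErrClientShutdown}

		default:
			break cleanup
		}
	}
```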

// TODO(yy): Ideally we should pass the `-rpcthreads` used in
// `bitcoind` here to better increase the performance.
const numThreads = 4
eg.SetLimit(numThreads)
Collaborator

Don't we ever need to wait for this error group? Or we're basically using it as a simple "thread pool"?

Collaborator Author

Yeah, it's more like limiting concurrent calls, but I think we should also wait at the end, so I've now added that.
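For readers unfamiliar with the pattern, here's a minimal, standalone illustration of errgroup used as a bounded worker pool with a final Wait (not the PR's code):

```go
package main

import (
	"fmt"

	"golang.org/x/sync/errgroup"
)

func main() {
	var eg errgroup.Group

	// Cap the number of in-flight goroutines, mirroring bitcoind's default
	// of 4 RPC threads.
	const numThreads = 4
	eg.SetLimit(numThreads)

	for i := 0; i < 10; i++ {
		i := i

		// Go blocks here once numThreads goroutines are already running.
		eg.Go(func() error {
			fmt.Println("handling request", i)
			return nil
		})
	}

	// Wait for all outstanding work before returning.
	if err := eg.Wait(); err != nil {
		fmt.Println("error:", err)
	}
}
```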

eg.SetLimit(numThreads)

// lowReqCounter counts the number of low priority requests.
lowReqCounter := 0
Collaborator

Should we use an explicit uint64 type here to avoid rolling over into negative? Wouldn't really matter too much as we're only using the modulus, but would give us some more explicit behavior.

Collaborator Author

updated

// requests to bitcoind. This is only used by low priority messages.
// For every 4 messages, we would sleep this duration before making
// more post requests.
rpcRequestInterval = time.Millisecond * 10
Collaborator

Could we maybe make this configurable when instantiating the client?

Collaborator Author

@yyforyongyu yyforyongyu Aug 31, 2023

added a new commit to achieve this
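A hypothetical shape for that knob (the field name below is illustrative only, not necessarily what the new commit uses):

```go
// ConnConfig (excerpt): a hypothetical option controlling the low priority
// throttle; when zero, the client falls back to the 10ms default.
type ConnConfig struct {
	// ... existing fields elided ...

	// RPCRequestInterval is how long the post handler sleeps between
	// batches of low priority requests.
	RPCRequestInterval time.Duration
}

// requestInterval returns the configured throttle or the default.
func (c *Client) requestInterval() time.Duration {
	if c.config.RPCRequestInterval != 0 {
		return c.config.RPCRequestInterval
	}
	return rpcRequestInterval
}
```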

// associated server and returns a response channel on which the reply will be
// delivered at some point in the future. It handles both websocket and HTTP
// POST mode depending on the configuration of the client.
func (c *Client) SendCmdLazy(cmd interface{}) chan *Response {
Collaborator

Low prio: To me, the word "lazy" in programming indicates "don't do it immediately, only when we actually need the result". Which kind of fits, but also not fully. Wasn't able to come up with a better suffix though, so I asked ChatGPT. Not sure if any of those fit better:

SendCmdLight - Using "Light" suggests a less intensive or lower priority operation.
SendCmdLazy - "Lazy" often implies that the action isn't urgently prioritized.
SendCmdIdle - "Idle" might indicate that the task is of low urgency.
SendCmdSlow - Directly indicates that it's a less priority-sensitive function.
SendCmdBack - As in "backburner", where it's not the immediate focus.
SendCmdMinor - Signifies lesser importance without being too long.
SendCmdRelax - Suggests that the function can "relax" and doesn't need to hurry.
SendCmdSoft - A softer version, possibly less critical.
SendCmdCasual - Implying the task is non-urgent.
SendCmdChill - A colloquial way to denote low urgency.

Collaborator Author

Maybe SendCmdWithLowPriority? kinda long...

Collaborator

Yeah, it's a bit long. My favorite from the list is actually SendCmdSlow.

Collaborator Author

switched to slow

This commit adds a new message queue `lowPriorityPostQueue` and a new flag on `jsonRequest` so we can split the requests based on priorities.

This commit changes how the messages are handled based on their priorities. For low priority messages, they are processed sequentially and the processing would sleep for 10ms for every four messages. For high priority messages, they will be handled in goroutines so they will be sent to `bitcoind` faster.

This commit adds a new method `GetRawTransactionLazy` to handle low priority requests. To support it, the underlying `SendCmd` is refactored to make use of the priority flag.
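The low priority path described in the second commit could look roughly like this (reconstructed from the commit message, not the exact diff):

```go
// handleLowPriority sends a low priority request sequentially and sleeps
// for rpcRequestInterval after every fourth request, leaving room for more
// urgent calls to reach bitcoind.
func (c *Client) handleLowPriority(jReq *jsonRequest, lowReqCounter *uint64) {
	c.handleSendPostMessage(jReq, c.shutdown)

	*lowReqCounter++
	if *lowReqCounter%4 == 0 {
		time.Sleep(rpcRequestInterval)
	}
}
```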
@guggero guggero self-requested a review August 31, 2023 15:23
@Roasbeef
Member

Roasbeef commented Sep 1, 2023

What about taking a more generic approach here? The request type is the parameter, and we can have a generic method that'll handle the prioritization.
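One way to read that suggestion (names and shapes here are mine, purely illustrative, not an agreed-upon API):

```go
// Priority is carried on every outgoing request instead of adding a
// per-command "Lazy" variant.
type Priority int

const (
	PriorityHigh Priority = iota
	PriorityLow
)

// sendCmdWithPriority is a single entry point that routes any command onto
// the right queue based on the supplied priority. buildJSONRequest stands in
// for whatever SendCmd does today to marshal the command.
func (c *Client) sendCmdWithPriority(cmd interface{}, prio Priority) chan *Response {
	jReq := c.buildJSONRequest(cmd)

	if prio == PriorityLow {
		c.lowPriorityPostQueue <- jReq
	} else {
		c.highPriorityPostQueue <- jReq
	}

	return jReq.responseChan
}
```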

I also think we should step back a bit before proceeding to make sure we're actually fixing something here:

  • For bitcoind, the client runs in POST mode.
  • In POST mode, a single goroutine handles all the incoming requests (it'll block and retry up to 10 times before giving up):

```go
// sendPostHandler handles all outgoing messages when the client is running
// in HTTP POST mode. It uses a buffered channel to serialize output messages
// while allowing the sender to continue running asynchronously. It must be run
// as a goroutine.
func (c *Client) sendPostHandler() {
out:
	for {
		// Send any messages ready for send until the shutdown channel
		// is closed.
		select {
		case jReq := <-c.sendPostChan:
			c.handleSendPostMessage(jReq, c.shutdown)

		case <-c.shutdown:
			break out
		}
	}

	// Drain any wait channels before exiting so nothing is left waiting
	// around to send.
cleanup:
	for {
		select {
		case jReq := <-c.sendPostChan:
			jReq.responseChan <- &Response{
				result: nil,
				err:    ErrClientShutdown,
			}

		default:
			break cleanup
		}
	}

	c.wg.Done()
	log.Tracef("RPC client send handler done for %s", c.config.Host)
}
```
  • Similarly, bitcoind also has a set of RPC threads that consume the requests, and block, eventually returning an error if they can't be processed.

In short, adding another channel to consume requests off of doesn't actually address the issue of head-of-line blocking. If we're sending 10 getblock requests, and one of them blocks because of bitcoind hitting a queue limit, then the entire sequence will block. I think what we're after here is either utilizing JSON-RPC batching properly, or doing a re-architecture to make the POST mode as concurrent as the normal ws mode.

The ws mode has proper pipelining: a message request adds an entry to match the response, then sends out the request, not blocking on the response:

```go
// Add the request to the internal tracking map so the response from the
// remote server can be properly detected and routed to the response
// channel. Then send the marshalled request via the websocket
// connection.
if err := c.addRequest(jReq); err != nil {
	jReq.responseChan <- &Response{err: err}
	return
}
log.Tracef("Sending command [%s] with id %d", jReq.method, jReq.id)
c.sendMessage(jReq.marshalledJSON)
```

This can further be improved by having multiple worker goroutines read off the queue to handle the responses.
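A minimal version of that worker-pool idea could look like this (illustrative only; numWorkers is an assumed knob):

```go
// startPostWorkers is an illustrative alternative to the single
// sendPostHandler goroutine: several workers consume from the same queue,
// so one slow request no longer holds up everything queued behind it.
func (c *Client) startPostWorkers(numWorkers int) {
	for i := 0; i < numWorkers; i++ {
		c.wg.Add(1)
		go func() {
			defer c.wg.Done()
			for {
				select {
				case jReq := <-c.sendPostChan:
					c.handleSendPostMessage(jReq, c.shutdown)
				case <-c.shutdown:
					return
				}
			}
		}()
	}
}
```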

@yyforyongyu
Collaborator Author

The initial idea was to implement a special case that can be used by our mempool poller, as it consumes a lot of RPC resources. The original observation was that a lot of getrawtransaction requests were queued, causing the getblockchaininfo call made via getinfo in lnd to be blocked.

However, after digging into bitcoind's debug log, the cause of the slowness seemed to be on the bitcoind side - whenever a new block tip is connected, or a new peer is connected, the daemon acquires a lock, blocking the RPC responses until the aforementioned tasks are finished, as shown in the logs:

2023-08-31T09:17:35Z ThreadRPCServer method=getrawtransaction user=yy
2023-08-31T09:17:35Z ThreadRPCServer method=getrawtransaction user=yy
2023-08-31T09:17:39Z New outbound peer connected: version: 70016, blocks=805566, peer=24 (outbound-full-relay)
2023-08-31T09:17:42Z New outbound peer connected: version: 70016, blocks=805566, peer=22 (outbound-full-relay)
2023-08-31T09:17:42Z ThreadRPCServer method=getrawtransaction user=yy
2023-08-31T09:17:42Z ThreadRPCServer method=getrawtransaction user=yy
2023-08-31T09:17:48Z ThreadRPCServer method=getrawtransaction user=yy
2023-08-31T09:05:35Z ThreadRPCServer method=getrawtransaction user=yy
2023-08-31T09:05:39Z UpdateTip: new best=00000000000000000003134988bcace49938ffe175379a14c66529580c202473 height=805559 version=0x20400000 log2_work=94.390304 tx=886852495 date='2023-08-31T08:29:35Z' progress=0.999993 cache=53.6MiB(319643txo)
2023-08-31T09:05:48Z ThreadRPCServer method=getrawtransaction user=yy

Thus this PR is no longer the right fix. If we are looking for async RPC calls, we should instead use rpcclient.NewBatch as suggested by @Roasbeef.
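For reference, the batching route looks roughly like this (based on my reading of rpcclient's batch mode; exact details may differ):

```go
package example

import (
	"log"

	"github.com/btcsuite/btcd/chaincfg/chainhash"
	"github.com/btcsuite/btcd/rpcclient"
)

// batchFetch queues many getrawtransaction calls and sends them as a single
// JSON-RPC batch instead of one POST per transaction.
func batchFetch(cfg *rpcclient.ConnConfig, hashes []*chainhash.Hash) {
	// NewBatch creates a client in batch mode: Async calls are queued
	// locally until Send is called.
	client, err := rpcclient.NewBatch(cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Shutdown()

	futures := make([]rpcclient.FutureGetRawTransactionResult, 0, len(hashes))
	for _, h := range hashes {
		futures = append(futures, client.GetRawTransactionAsync(h))
	}

	// Send flushes the whole batch in one request.
	if err := client.Send(); err != nil {
		log.Fatal(err)
	}

	for _, f := range futures {
		tx, err := f.Receive()
		if err != nil {
			log.Printf("lookup failed: %v", err)
			continue
		}
		log.Printf("got tx %v", tx.Hash())
	}
}
```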

@yyforyongyu yyforyongyu closed this Sep 4, 2023
@yyforyongyu yyforyongyu deleted the priority-queue branch September 4, 2023 04:02