
chainntfns: Panics during initial sync blow up consumers. #827

Closed
JoeGruffins opened this issue Nov 4, 2022 · 4 comments · Fixed by #870

@JoeGruffins
Contributor

Specifically, if the wallet is locked during recovery, btcwallet panics. Because the panic originates inside btcwallet, it takes down whatever app is using it when a user simply locks the wallet at the wrong time. I also don't think there is currently a way to check whether btcwallet is doing recovery.

I wonder if it's possible to remove the panic logic in *Wallet.handleChainNotifications altogether and handle errors more gracefully?

The panic call that is causing the pain:

err = w.syncWithChain(birthdayBlock)
if err != nil && !w.ShuttingDown() {
	panic(fmt.Errorf("unable to synchronize "+
		"wallet to chain: %v", err))
}

The problematic recovery loop, which we currently cannot escape without shutting down. Locking the wallet while it runs triggers the panic above.

btcwallet/wallet/wallet.go

Lines 665 to 792 in 27f244e

// recovery attempts to recover any unspent outputs that pay to any of our
// addresses starting from our birthday, or the wallet's tip (if higher), which
// would indicate resuming a recovery after a restart.
func (w *Wallet) recovery(chainClient chain.Interface,
	birthdayBlock *waddrmgr.BlockStamp) error {

	log.Infof("RECOVERY MODE ENABLED -- rescanning for used addresses "+
		"with recovery_window=%d", w.recoveryWindow)

	// Wallet locking must synchronize with the end of recovery, since use of
	// keys in recovery is racy with manager IsLocked checks, which could
	// result in encrypting data with a zeroed key.
	syncer := &recoverySyncer{done: make(chan struct{})}
	w.recovering.Store(syncer)
	defer close(syncer.done)

	// We'll initialize the recovery manager with a default batch size of
	// 2000.
	recoveryMgr := NewRecoveryManager(
		w.recoveryWindow, recoveryBatchSize, w.chainParams,
	)

	// In the event that this recovery is being resumed, we will need to
	// repopulate all found addresses from the database. Ideally, for basic
	// recovery, we would only do so for the default scopes, but due to a
	// bug in which the wallet would create change addresses outside of the
	// default scopes, it's necessary to attempt all registered key scopes.
	scopedMgrs := make(map[waddrmgr.KeyScope]*waddrmgr.ScopedKeyManager)
	for _, scopedMgr := range w.Manager.ActiveScopedKeyManagers() {
		scopedMgrs[scopedMgr.Scope()] = scopedMgr
	}
	err := walletdb.View(w.db, func(tx walletdb.ReadTx) error {
		txMgrNS := tx.ReadBucket(wtxmgrNamespaceKey)
		credits, err := w.TxStore.UnspentOutputs(txMgrNS)
		if err != nil {
			return err
		}
		addrMgrNS := tx.ReadBucket(waddrmgrNamespaceKey)
		return recoveryMgr.Resurrect(addrMgrNS, scopedMgrs, credits)
	})
	if err != nil {
		return err
	}

	// Fetch the best height from the backend to determine when we should
	// stop.
	_, bestHeight, err := chainClient.GetBestBlock()
	if err != nil {
		return err
	}

	// Now we can begin scanning the chain from the wallet's current tip to
	// ensure we properly handle restarts. Since the recovery process itself
	// acts as rescan, we'll also update our wallet's synced state along the
	// way to reflect the blocks we process and prevent rescanning them
	// later on.
	//
	// NOTE: We purposefully don't update our best height since we assume
	// that a wallet rescan will be performed from the wallet's tip, which
	// will be of bestHeight after completing the recovery process.
	var blocks []*waddrmgr.BlockStamp
	startHeight := w.Manager.SyncedTo().Height + 1
	for height := startHeight; height <= bestHeight; height++ {
		if atomic.LoadUint32(&syncer.quit) == 1 {
			return errors.New("recovery: forced shutdown")
		}

		hash, err := chainClient.GetBlockHash(int64(height))
		if err != nil {
			return err
		}
		header, err := chainClient.GetBlockHeader(hash)
		if err != nil {
			return err
		}
		blocks = append(blocks, &waddrmgr.BlockStamp{
			Hash:      *hash,
			Height:    height,
			Timestamp: header.Timestamp,
		})

		// It's possible for us to run into blocks before our birthday
		// if our birthday is after our reorg safe height, so we'll make
		// sure to not add those to the batch.
		if height >= birthdayBlock.Height {
			recoveryMgr.AddToBlockBatch(
				hash, height, header.Timestamp,
			)
		}

		// We'll perform our recovery in batches of 2000 blocks. It's
		// possible for us to reach our best height without exceeding
		// the recovery batch size, so we can proceed to commit our
		// state to disk.
		recoveryBatch := recoveryMgr.BlockBatch()
		if len(recoveryBatch) == recoveryBatchSize || height == bestHeight {
			err := walletdb.Update(w.db, func(tx walletdb.ReadWriteTx) error {
				ns := tx.ReadWriteBucket(waddrmgrNamespaceKey)
				for _, block := range blocks {
					err := w.Manager.SetSyncedTo(ns, block)
					if err != nil {
						return err
					}
				}
				return w.recoverScopedAddresses(
					chainClient, tx, ns, recoveryBatch,
					recoveryMgr.State(), scopedMgrs,
				)
			})
			if err != nil {
				return err
			}

			if len(recoveryBatch) > 0 {
				log.Infof("Recovered addresses from blocks "+
					"%d-%d", recoveryBatch[0].Height,
					recoveryBatch[len(recoveryBatch)-1].Height)
			}

			// Clear the batch of all processed blocks to reuse the
			// same memory for future batches.
			blocks = blocks[:0]
			recoveryMgr.ResetBlockBatch()
		}
	}

	return nil
}
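
For context on why this blows up consumers: handleChainNotifications runs in a goroutine spawned by btcwallet itself, and in Go a panic in one goroutine cannot be recovered from another, so the whole process exits no matter what the embedding app does. A minimal standalone sketch (hypothetical consumer code, not from btcwallet) illustrating that:

// Hypothetical standalone program illustrating the failure mode: a panic
// raised in a goroutine started by a library cannot be caught by the
// consumer, so the process dies despite the recover() below.
package main

import (
	"fmt"
	"time"
)

// startSync stands in for wallet.SynchronizeRPC, which spawns the
// notification handler in its own goroutine.
func startSync() {
	go func() {
		panic(fmt.Errorf("unable to synchronize wallet to chain: %v",
			"wallet is locked during recovery"))
	}()
}

func main() {
	// This recover only protects the main goroutine; it never sees the
	// panic from the library's goroutine.
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r) // never reached
		}
	}()

	startSync()
	time.Sleep(time.Second) // the process crashes before this returns
}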

@Roasbeef
Member

Roasbeef commented Nov 4, 2022

Ah, this makes sense; much of the recent work we've done in btcwallet was motivated by lnd. For lnd, the wallet is unlocked once at startup and never locked again.

What's missing here is that we properly thread through a context.Context (or something like that), so it can be cancelled to ensure we can do things like unsafe shutdown.

I agree that the panic doesn't really need to be there either; it should be handled gracefully. So we'd either need to retry once the wallet is unlocked again, or move it to something periodic that attempts to sync again once things are clear.
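
A rough sketch of that context threading (the ctx parameter and the select check are assumptions about one possible shape, not the current btcwallet API): the recovery loop would watch ctx.Done() instead of polling the atomic quit flag, so callers could cancel it on lock or shutdown.

// Sketch only: a cancellable recovery loop, assuming a context.Context is
// threaded through from the caller. The signature is hypothetical; the
// surrounding identifiers follow the existing function quoted above.
func (w *Wallet) recovery(ctx context.Context, chainClient chain.Interface,
	birthdayBlock *waddrmgr.BlockStamp) error {

	_, bestHeight, err := chainClient.GetBestBlock()
	if err != nil {
		return err
	}

	startHeight := w.Manager.SyncedTo().Height + 1
	for height := startHeight; height <= bestHeight; height++ {
		// Honor cancellation from the caller (shutdown, wallet lock,
		// etc.) instead of checking an atomic quit flag.
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
		}

		// ... fetch the block, add it to the batch, and commit the
		// batch exactly as in the current implementation ...
	}

	return nil
}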

@chappjc
Contributor

chappjc commented Nov 11, 2022

I wonder if it's possible to remove the panic logic in *Wallet.handleChainNotifications altogether and handle errors more gracefully?

This would be good. There are numerous ways this explicit panic in handleChainNotifications can be triggered:

@chappjc
Contributor

chappjc commented Nov 14, 2022

Just got this recovering a mainnet wallet:

panic: unable to synchronize wallet to chain: unable to perform wallet recovery: couldn't retrieve block 00000000000000000004d901bfc6973cad6a39d65f88ac6af77c2c212d1635fd from network

goroutine 4830583 [running]:
github.com/btcsuite/btcwallet/wallet.(*Wallet).handleChainNotifications(0xc0050d6900)
	github.com/btcsuite/[email protected]/wallet/chainntfns.go:117 +0x1065
created by github.com/btcsuite/btcwallet/wallet.(*Wallet).SynchronizeRPC
	github.com/btcsuite/[email protected]/wallet/wallet.go:213 +0x25c

The neutrino log had no direct error at the time, but shortly before, the node to which that getdata block request was made had dropped and reconnected a couple of times.
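
One way to keep a transient fetch failure like that from becoming a fatal panic would be to retry the lookup a few times before surfacing the error. A small sketch (the helper name, retry count, and backoff are made up; it assumes the chain and chainhash packages the wallet already imports):

// Hypothetical helper: retry a block hash lookup a few times with a short
// backoff before treating the error as fatal, since the backend peer may
// briefly drop and reconnect during recovery.
func getBlockHashWithRetry(chainClient chain.Interface,
	height int64) (*chainhash.Hash, error) {

	const maxAttempts = 3

	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		hash, err := chainClient.GetBlockHash(height)
		if err == nil {
			return hash, nil
		}
		lastErr = err
		time.Sleep(time.Duration(attempt) * time.Second)
	}

	return nil, lastErr
}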

@JoeGruffins
Contributor Author

This seems to come up a bit while testing dcrdex. Will attempt to stop the panics as simply as possible if that's alright. Will look through the code some more and come up with a plan.
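
For illustration, the simplest shape of that change might be something like the following (a sketch against the snippet quoted at the top of this issue, not necessarily how #870 resolved it): log the sync error and return from the notification handler instead of panicking.

// Sketch: surface the sync error instead of crashing the consumer. Assumes
// the wallet package's existing log instance.
err = w.syncWithChain(birthdayBlock)
if err != nil && !w.ShuttingDown() {
	log.Errorf("Unable to synchronize wallet to chain: %v", err)
	return
}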
