-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Server resets its latest block from time to time #400
Comments
I launched a server using master branch at db2795a
|
mainnet.lightwalletd.com is currently reset to a low height again
|
Please post here the result of I haven't seen this myself, and I can't think of what could cause this problem. Can you provide more detail? Let's say you were running Or, after reporting the very low height, would it slowly return to the correct height (like maybe advancing several hundred blocks per second)? In other words, I'd like to understand the timing behavior. That would help narrow down the source of the problem. Thanks! |
Sorry, I forgot to post GetLightInfod results though I tried it. The server height reported is the correct one, i.e., Getlastblock resets but not GetLightInfod. Getlastblock indeed slowly goes back up as if block ingestion is repeated. Eventually, the server is back on track. However the process takes a few hours. It is hard to tell when it happens and it seems to fix itself after a while. However, wallet users are reporting lost funds (they are not lost) because wallets think they are synced but actually haven't processed the later blocks. |
Okay, but I was asking for the My guess is that this is some sort of infrastructure problem, rather than in the lightwalletd code itself. We (ECC) don't run the
If something outside of lightwalletd deletes or corrupts these files, the result would be what you're seeing. There's no way for us to know if that's actually what's happening, I'm just saying it's something like this. Maybe the file system holding these files runs out of space? Although I think lightwalletd would just halt; I don't know why it would back up many blocks. Who runs that instance, isn't it Nighthawk wallet? Do they have a GitHub repo that you can open a ticket on for this? |
I contacted him on telegram and he assured me that the ECC was working actively on it as a top priority. I thought that you work at the ECC? |
Yes, I do, but I'm not aware of any bug that makes the block height back up like that. There have been a few fixes lately, maybe you can try it again and make sure it's still happening (if you don't mind). |
From a client's point of view, it happens randomly. But when it does, there is no clear indication that something is wrong. I could check both GetLastBlock and GetLightdInfo to see if they match but it seems to be a kludge since it is not supposed to happen anyway. Btw, a user reported an issue 2 days ago, so I think it's still happening. |
You definitely should not need to call both, I'll dig into this and try to figure out what's going on. |
I just noticed that
This checks every 10 seconds to see if the reported block height drops below 1753466 (just what the tip height happened to be when I started the script), and it hasn't printed anything (meaning no block was reported below that height). But that server has been down, since I don't know when. I just started the script to my local instance, just to see if I can reproduce it that way. |
Update, it's been running for about 16 hours, and no block height resets have occurred. There's been some speculation that this could have been caused by #408 (which should be fixed by #409, awaiting review), the idea being that if the lightwalletd doesn't cleanly shut down, the blocks cache could be in a strange state that required resynchronizing from a long way back in the chain. This is superficially plausible. But I don't think it's right. I just killed my lightwalletd without giving it a chance to cleanly shut down, and when I restarted it, it didn't reset (resync many blocks), it picked up exactly where it was when I killed it. (And, knowing how the code works, this is what I would expect.) |
Is this problem still being seen? I haven't made any progress in reproducing it. |
Tbh, I am not using this server anymore. But I haven't had any report of
issues either.
I will close this bug. Thanks for your help.
…On Thu, Aug 11, 2022, 9:44 AM Larry Ruane ***@***.***> wrote:
Is this problem still being seen? I haven't made any progress in
reproducing it.
—
Reply to this email directly, view it on GitHub
<#400 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABWIH4W5JCW4ITCCFSPQOTVYRLIFANCNFSM5234LTZQ>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
What is the bug?
Occasionally, but not rarely, the server reports a very low height.
For example:
Additional context
The height returned by the server info call is accurate though. It looks like
lightwalletd
is performing a deep reorg.After a while, it gets back in sync.
The text was updated successfully, but these errors were encountered: