mining thread panics due to incorrect difficulty #154
Comments
Some things that may or may not be related:
OK, I wanted more concrete evidence that something is wrong (or not) with the difficulty adjustment and block times, so I wrote a couple of quick PHP scripts that (a) download info for all blocks from explorer.neptune.cash and (b) statistically summarize the information in terms of (1) difficulty, (2) duration between blocks, and (3) expected blocks for the entire time period. I then wanted to compare the results with Bitcoin for a similar number of blocks, so I modified the scripts to do the same for Bitcoin. In summary, the Neptune block times are both too high and too low, and there are difficulty anomalies. Here are the stats for Neptune and Bitcoin, side by side.
There are several things to note here:
regarding methodology:
Here are the PHP scripts used to generate the stats. This also includes raw data for the Neptune blocks.

[1] https://bitcoin.stackexchange.com/questions/25293/probablity-distribution-of-mining
Thanks for the elaborate research. Instead of trying to formulate a well-rounded conclusion, let me just put words to my consciousness as it rapidly jumps around.

Bitcoin mining is a Poisson process resulting in an exponential distribution of block times. Neptune is supposed to be a Poisson process resulting in an exponential distribution as well. Any deviation from this objective is worthy of investigation and fixing.

That said, the mechanism by which the difficulty is updated differs dramatically. Bitcoin, if memory serves, calculates the average block time of the last 2016 blocks and adjusts the difficulty (within clamping bounds) based on that. Neptune only looks at the previous block, and computes the new difficulty based on a linear process control mechanism (in fact, it is the dumbest possible control mechanism: PID with I=D=0), with the exception that the new difficulty is lower-bounded by 2. The rate at which the difficulty adjusts to the hash power of the network is a parameter that can be tuned to fit the desired results.

You mention that 2 is the most common difficulty for Neptune. That's because the blocks are found too far apart and the difficulty control mechanism wants to set it even lower, but the lower bound clamps it up to 2.
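To make the "PID with I=D=0" description concrete, here is a minimal sketch of a proportional-only difficulty update with a lower bound of 2. The formula, the gain `p_gain`, and the function name are assumptions for illustration only; this is not the code neptune-core actually uses.

```python
def next_difficulty(old_difficulty: float, block_interval: float,
                    target_interval: float, p_gain: float = 0.1,
                    min_difficulty: float = 2.0) -> float:
    """Proportional-only (P) controller. Formula and gain are hypothetical."""
    # Relative error: positive when the block arrived faster than the target.
    error = (target_interval - block_interval) / target_interval
    # Proportional adjustment only; there is no integral or derivative term.
    adjusted = old_difficulty * (1.0 + p_gain * error)
    # Clamp to the lower bound mentioned in the comment above.
    return max(min_difficulty, adjusted)
```

With a rule of this shape, a block that takes far longer than the target pushes the raw value below the floor, so the clamp returns 2. That matches the explanation for why 2 shows up so often.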
Meaning that the difficulty has adjusted to the hash power after essentially two blocks? The relevant counterpart is the 2016 blocks it takes for Bitcoin's difficulty to adjust, and even then it might not have caught up to the hash power.

I wonder whether running the Neptune regtest on a single machine skews the results. In principle, even a single-threaded miner is a Poisson process, so intuitively the answer is "no". That said, if the node crashes from time to time, that behavior deviates from the Poisson process and has the potential to bias the results. But if you run the network on many machines that can each crash independently, then you even out the disturbances.

It should be possible to rapidly generate these statistics for Neptune based on the current set of control parameters. I say "rapidly" because you would be sampling the next block time from the current exponential distribution, as opposed to running the proof-of-work algorithm. I think that's a worthwhile unit test and will provide valuable information to update the control parameters.

This test case could also serve as a basis for analyzing ahead of time the robustness of the network against mining attacks. What if the hash power increases 100x and then decreases to where it started from in the span of one week? How does the network react? Under which conditions is selfish mining profitable?
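The sampling idea can be sketched roughly as follows: instead of hashing, draw each block interval from an exponential distribution whose mean is `difficulty / hash_rate`, then feed the sampled interval into the difficulty update. The proportional update rule, the gain, and all names here are assumptions for illustration, not the parameters of the real network.

```python
import random


def simulate_block_intervals(hash_rate: float, target_interval: float,
                             num_blocks: int, p_gain: float = 0.1,
                             min_difficulty: float = 2.0, seed: int = 0):
    """Sample block intervals without doing any proof-of-work.

    Assumes the expected time to find a block is difficulty / hash_rate
    and a hypothetical proportional-only difficulty update.
    """
    rng = random.Random(seed)
    difficulty = min_difficulty
    intervals = []
    for _ in range(num_blocks):
        mean_time = difficulty / hash_rate
        # Exponential inter-arrival time of a Poisson process (rate = 1 / mean).
        interval = rng.expovariate(1.0 / mean_time)
        intervals.append(interval)
        # Proportional update from the previous block only, clamped from below.
        error = (target_interval - interval) / target_interval
        difficulty = max(min_difficulty, difficulty * (1.0 + p_gain * error))
    return intervals
```

Comparing `sum(intervals)` with `num_blocks * target_interval` for different gains gives a quick feel for how the control parameters behave. A hash-power shock could be modeled by letting `hash_rate` vary with the block index.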
This does skew the results technically speaking, but by a negligible amount as far as I can tell.
Right. Or just a single machine, sampled for a period of time in which we are certain there was no crash or restart. In lieu of certainty, I sampled a few smaller subsets and found similar mean durations. Another thing I've thought of doing is to set the target time very small, e.g. 1 second, and run for 10 minutes or so with unrestricted mining. It seems that should generate a pretty good data set quickly. I might give it a try today.
thoughts:
Yes, I think it's important to test and to have a better understanding of these kinds of things.
Well by accident I discovered a way to reliably generate the difficulty error/panic. I just tried setting the When I try to mine with regtest from genesis block I repeatedly (3 tries in a row) get the error on the very first block:
That's a clear indicator that the error is related to timing/interval calcs.

edit: Thinking the genesis timestamp might play a role, I put in a quick hack to skip difficulty validation for block height == 1. It then mined 4 blocks right away, and hit the error on height == 5. So it seems the 1 second target doesn't trigger it for every block, but probably is somehow a contributing factor.
Hallelujah! I finally found the cause of this panic and have a fix. It was caused by the earlier change that updates the header timestamp in the mining loop so that the timestamp reflects the time the block is found. It turns out that the difficulty value depends on the timestamp value, but the difficulty was not being updated, so it was based on the initial timestamp when mining started. After the block is mined, validation checks the difficulty using the newer timestamp and there is (only sometimes) a mismatch, e.g. off-by-one. With the fix in place, I was finally able to mine hundreds of blocks quickly by setting the target interval to 1 second. I will commit the fix for this tomorrow.
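The stale-difficulty mismatch described above can be illustrated with a small model. The integer difficulty rule and the header layout below are hypothetical stand-ins, not neptune-core's actual definitions. The only property that matters is that the difficulty is a function of the header timestamp.

```python
MIN_DIFFICULTY = 2


def difficulty_for(prev_difficulty, prev_timestamp, timestamp, target_interval):
    """Hypothetical integer difficulty rule that depends on the header timestamp."""
    elapsed = timestamp - prev_timestamp
    error = target_interval - elapsed
    adjusted = prev_difficulty + (prev_difficulty * error) // (10 * target_interval)
    return max(MIN_DIFFICULTY, adjusted)


def is_valid(header, prev, target_interval):
    # The validator recomputes the difficulty from the header's final timestamp.
    expected = difficulty_for(prev["difficulty"], prev["timestamp"],
                              header["timestamp"], target_interval)
    return header["difficulty"] == expected


def header_with_stale_difficulty(prev, start_ts, found_ts, target_interval):
    # Bug: difficulty derived from the timestamp at which mining started,
    # while the header carries the timestamp at which the block was found.
    difficulty = difficulty_for(prev["difficulty"], prev["timestamp"],
                                start_ts, target_interval)
    return {"timestamp": found_ts, "difficulty": difficulty}


def header_with_fresh_difficulty(prev, found_ts, target_interval):
    # Fix: difficulty recomputed from the same timestamp the header carries.
    difficulty = difficulty_for(prev["difficulty"], prev["timestamp"],
                                found_ts, target_interval)
    return {"timestamp": found_ts, "difficulty": difficulty}
```

In this model the stale header validates only when both timestamps happen to round to the same difficulty, which is consistent with the failure showing up only some of the time.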
Addresses #154. Adds test `mine_ten_blocks_in_ten_seconds()` which tests two things: 1. That 10 blocks can be mined and validated correctly. 2. That 10 blocks can be mined in approx 10 seconds with a target block interval of 1 second. Both (1) and (2) are failing with the present code. The goal is to fix them in future commits. Also: Minor modifications are made to a few fn to enable simulated mining with a custom target block interval.
I had made the fix mentioned in #154 (comment) which ended the panics and seemed to work well, but I wanted to check if the block intervals seemed right or not. So I added a test, mine_ten_blocks_in_ten_seconds, that sets the block target interval to 1 second, mines ten blocks, and then compares the elapsed time to expected time. Initially I got these results, running the test case 10 times, indicating blocks are taking longer than expected:
After more digging I discovered that the "fix" was incomplete because the mining loop calculates a threshold value from the difficulty and that is what each digest is compared against. This threshold was not being recalculated along with the timestamp and difficulty, so I fixed that. After this change, the values were better, but now a bit lower than expected.
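A rough sketch of a mining loop that keeps all three values in step is shown below. SHA-256, the `MAX_DIGEST // difficulty` threshold, the difficulty rule, and the header fields are all assumptions chosen to keep the example self-contained. The real loop may refresh the timestamp less often than once per nonce.

```python
import hashlib
import itertools

MAX_DIGEST = 2**256 - 1


def threshold_for(difficulty):
    # Assumed mapping: a higher difficulty means a smaller acceptable digest.
    return MAX_DIGEST // difficulty


def difficulty_for(prev, timestamp, target_interval):
    # Hypothetical timestamp-dependent rule, lower-bounded by 2.
    error = target_interval - (timestamp - prev["timestamp"])
    adjusted = prev["difficulty"] + (prev["difficulty"] * error) // (10 * target_interval)
    return max(2, adjusted)


def digest_of(payload, timestamp, nonce):
    data = payload + timestamp.to_bytes(8, "big") + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def mine(prev, payload, clock, target_interval):
    for nonce in itertools.count():
        # Refresh the timestamp, then recompute everything derived from it.
        timestamp = clock()
        difficulty = difficulty_for(prev, timestamp, target_interval)
        threshold = threshold_for(difficulty)
        digest = digest_of(payload, timestamp, nonce)
        if digest <= threshold:
            return {"timestamp": timestamp, "difficulty": difficulty,
                    "nonce": nonce, "digest": digest}
```

If `threshold` were computed once before the loop, a block could satisfy a threshold that no longer corresponds to the difficulty in its header.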
I modified the test case to ignore blocks 1 and 2 which are normally found very fast. The results now look pretty much as expected:
I ran the test case one final time using the standard target block interval of 9.8 minutes. This also passed within the allowed variance limit of 1.3. The mean interval was 9.19 minutes.
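For reference, the kind of check described here might look like the sketch below. It assumes the "allowed variance limit" is a symmetric ratio bound, so the measured mean must lie between `expected / 1.3` and `expected * 1.3`. That reading, and both helper names, are assumptions rather than the test's actual code.

```python
def mean_interval(timestamps, skip=2):
    """Mean time between consecutive blocks, ignoring the first `skip` intervals."""
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    sampled = intervals[skip:]
    return sum(sampled) / len(sampled)


def within_variance(actual, expected, allowed=1.3):
    """True if actual is within a factor of `allowed` of expected (assumed meaning)."""
    ratio = actual / expected
    return 1.0 / allowed <= ratio <= allowed
```

Under that reading, a mean of 9.19 minutes against a 9.8 minute target gives a ratio of about 0.94, which falls inside the bound.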
The sample size is small, but it looks like we may now be a little bit on the "too fast" side. It might be that block height 3 is too early to start sampling at and is skewing the results a bit. Anyway, this seems a solid improvement, so I've closed the issue. I plan to generate stats from the block explorer after it has run for a few days with the new code.

edit: I should mention that in order to make this test work I also modified some functions to accept an optional target_block_interval parameter, which enables simulating mining and block validation with custom target intervals, e.g. 1 second. See 0458d77 and 7fa9c69.
Well this is very encouraging. I just ran the test for longer (nearly 3 hours) and the actual result matched expected to 0.9999 accuracy. Parameters were:
Results
I noticed that the block explorer neptune-core instance (running regtest) stops generating new blocks at times and must be restarted to resume.
The log shows a warning that the new difficulty is incorrect, and then a panic, presumably from a failed assertion.
needs review.
note: neptune-core was built at 5ef2423