Take into account network latency when syncing #55
Merged
6 commits
f57b9a8 Take into account network latency when syncing from a node to avoid g… (heifner)
4c4689c Fix overflow bug introduced in port of original eosio/eos PR (heifner)
7790347 use int64_t instead of long long (heifner)
80bb777 Handle negative network latency (clock skew) (heifner)
8c3f475 Merge remote-tracking branch 'origin/main' into fsh-sync-to-chain (heifner)
3d6fc91 Clarify comment (heifner)
@@ -0,0 +1,112 @@
#!/usr/bin/env python3
from testUtils import Utils, WaitSpec
from Cluster import Cluster
from WalletMgr import WalletMgr
from TestHelper import TestHelper
import signal
import platform
import subprocess
import time
import re

###############################################################
# p2p connection in a high-latency network for a cluster with one producer and one syncing node.
#
# This test simulates p2p connections in a high-latency network. There is one producer
# and one syncing node, and a latency of 1100ms is introduced on their p2p connection.
# The expected behavior is that the producer recognizes the network latency and does not send lib catchup to the syncing node.
# Because the syncing node is always behind, sending lib catchup is useless: the producer/peer node would get caught in an
# infinite loop of sending lib catchup to the syncing node.
###############################################################

def readlogs(node_num, net_latency):
    filename = 'var/lib/node_0{}/stderr.txt'.format(node_num)
    f = subprocess.Popen(['tail', '-F', filename],
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    latRegex = re.compile(r'\d+ms')
    t_end = time.time() + 80  # the cluster runs for 80 seconds while the logs are processed
    try:
        while time.time() <= t_end:
            line = f.stdout.readline().decode("utf-8")
            print(line)
            if 'info' in line and 'Catching up with chain, our last req is ' in line:
                Utils.Print("Syncing node is catching up with chain, however it should not due to net latency")
                return False
            if 'debug' in line and 'Network latency' in line:
                match = latRegex.search(line)
                # skip lines that mention latency but carry no "<digits>ms" value
                if match and float(match.group()[:-2]) < 0.8 * net_latency:
                    Utils.Print("Network latency is lower than expected.")
                    return False
    finally:
        f.kill()  # stop the background tail process

    return True

def exec(cmd):
    process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = process.communicate()
    process.wait()
    process.stdout.close()
    process.stderr.close()
    return err, process.returncode

Print=Utils.Print

args = TestHelper.parse_args({"--dump-error-details","--keep-logs","-v","--leave-running","--clean-run"})
Utils.Debug=args.v

producers=1
syncingNodes=1
totalNodes=producers+syncingNodes
cluster=Cluster(walletd=True)
dumpErrorDetails=args.dump_error_details
keepLogs=args.keep_logs
dontKill=args.leave_running
killAll=args.clean_run

testSuccessful=False
killEosInstances=not dontKill

specificExtraNodeosArgs={}
producerNodeId=0
syncingNodeId=1

specificExtraNodeosArgs[producerNodeId]=" --p2p-listen-endpoint 0.0.0.0:{}".format(9876+producerNodeId)
specificExtraNodeosArgs[syncingNodeId]="--p2p-peer-address 0.0.0.0:{}".format(9876+producerNodeId)

try:
    TestHelper.printSystemInfo("BEGIN")
    cluster.killall(allInstances=killAll)
    cluster.cleanup()
    traceNodeosArgs=" --plugin eosio::trace_api_plugin --trace-no-abis --plugin eosio::producer_plugin --produce-time-offset-us 0 --last-block-time-offset-us 0 --cpu-effort-percent 100 \
        --last-block-cpu-effort-percent 100 --producer-threads 1 --plugin eosio::net_plugin --net-threads 1"
    if cluster.launch(pnodes=1, totalNodes=totalNodes, totalProducers=1, useBiosBootFile=False, specificExtraNodeosArgs=specificExtraNodeosArgs, extraNodeosArgs=traceNodeosArgs) is False:
        Utils.cmdError("launcher")
        Utils.errorExit("Failed to stand up eos cluster.")

    cluster.waitOnClusterSync(blockAdvancing=5)
    Utils.Print("Cluster in Sync")
    cluster.biosNode.kill(signal.SIGTERM)
    Utils.Print("Bios node killed")
    latency = 1100 # 1100 milliseconds
    # add latency to all inbound and outbound traffic
    Utils.Print( "adding {}ms latency to network.".format(latency) )
    if platform.system() == 'Darwin':
        cmd = 'sudo dnctl pipe 1 config delay {} && \
            echo "dummynet out proto tcp from any to any pipe 1" | sudo pfctl -f - && \
            sudo pfctl -e'.format(latency)
    else:
        cmd = 'tc qdisc add dev lo root netem delay {}ms'.format(latency)
    err, ReturnCode = exec(cmd)
    if ReturnCode != 0:
        print(err.decode("utf-8")) # print error details of the network slowdown initialization commands
        Utils.errorExit("failed to initialize network latency, exited with error code {}".format(ReturnCode))
    # process the logs to make sure the syncing node doesn't get into lib catchup mode
    testSuccessful=readlogs(syncingNodeId, latency)
    if platform.system() == 'Darwin':
        cmd = 'sudo pfctl -f /etc/pf.conf && \
            sudo dnctl -q flush && sudo pfctl -d'
    else:
        cmd = 'tc qdisc del dev lo root netem'
    err, ReturnCode = exec(cmd)
    if ReturnCode != 0:
        print(err.decode("utf-8")) # print error details of the network slowdown termination commands
        Utils.errorExit("failed to remove network latency, exited with error code {}".format(ReturnCode))
finally:
    TestHelper.shutdown(cluster, None, testSuccessful, killEosInstances, False, keepLogs, killAll, dumpErrorDetails)

# report failure to the caller when the log checks did not pass
exit(0 if testSuccessful else 1)
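
The latency check that readlogs applies to each log line can be looked at in isolation. The following is a minimal sketch, assuming a log line that contains the phrase "Network latency" and a value such as 1100ms. The helper name `latency_too_low` and the sample lines are hypothetical and are not actual nodeos output.

```python
import re

LAT_REGEX = re.compile(r'\d+ms')  # same pattern as latRegex in readlogs


def latency_too_low(line, net_latency_ms, tolerance=0.8):
    """Return True if the line reports a latency below tolerance * net_latency_ms.

    Hypothetical helper for illustration; it mirrors the 0.8 factor used in the test.
    """
    if 'Network latency' not in line:
        return False
    match = LAT_REGEX.search(line)
    if match is None:
        # the line mentions latency but carries no "<digits>ms" value
        return False
    measured_ms = float(match.group()[:-2])  # strip the trailing "ms"
    return measured_ms < tolerance * net_latency_ms


# Made-up log lines, not real nodeos output.
ok_line = "debug sample: Network latency is 1102ms"
low_line = "debug sample: Network latency is 40ms"

print(latency_too_low(ok_line, 1100))
print(latency_too_low(low_line, 1100))
```

With 1100ms of injected delay the threshold is 880ms, so a measured value near 1100ms passes while a value near the unmodified loopback latency is flagged.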
Adding a note to this PR to document why this code was removed: it could never have worked. time is in microseconds whereas msg.time is in nanoseconds, so time - msg_time is always negative.
Also, there is no way to do what this code was trying to do. You don't know how much network latency is involved, so you have no idea how much clock skew is involved.
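
Both points can be illustrated with a few lines of arithmetic. This is a minimal sketch, not the removed code: every name and sample value below is hypothetical. It assumes only what the comment states, namely that one timestamp is in microseconds and the other in nanoseconds, and it models a single one-way timestamp difference as the sum of network latency and clock skew.

```python
# Hypothetical instant, in seconds since the epoch.
now_s = 1_700_000_000

local_time_us = now_s * 1_000_000      # local clock, in microseconds
msg_time_ns = now_s * 1_000_000_000    # timestamp carried in the message, in nanoseconds

# Mixing units: the nanosecond value is about 1000x larger than the
# microsecond value for the same instant, so the difference is negative.
mixed_diff = local_time_us - msg_time_ns
print(mixed_diff < 0)


def observed_offset_ns(latency_ns, skew_ns):
    """Receive time on the receiver's clock minus send time on the sender's clock.

    A single one-way measurement only yields latency + skew, never each term alone.
    """
    send_time_sender_clock = 0
    receive_time_receiver_clock = send_time_sender_clock + latency_ns + skew_ns
    return receive_time_receiver_clock - send_time_sender_clock


# Two different (latency, skew) pairs produce the same observation.
high_latency_no_skew = observed_offset_ns(latency_ns=1_100_000_000, skew_ns=0)
low_latency_big_skew = observed_offset_ns(latency_ns=100_000_000, skew_ns=1_000_000_000)
print(high_latency_no_skew == low_latency_big_skew)
```

Because both scenarios yield the same observed offset, the receiver cannot attribute the difference to latency or to skew without additional information.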