
EOS-EVM-Node: Does Not Retry After Ship Node Connection Issue #579

Closed
Johnaverse opened this issue Jun 8, 2023 · 3 comments · Fixed by eosnetworkfoundation/eos-evm-node#3
Labels: enhancement, 👍 lgtm

@Johnaverse (Contributor):

Issue: EOS-EVM-Node does not retry after the SHiP node connection drops; it exits instead

Here are the logs:

Jun 06 07:54:53 jungle4evm-arch41 silkworm[6699]:   WARN [06-06|07:54:53.967 UTC] Can't link new block #80'597'773 (id:04cdd30d04626f864de35171812ce9feb1b55f1c2bcbef171c03827e26633255,prev:>
Jun 06 07:54:53 jungle4evm-arch41 silkworm[6699]:   WARN [06-06|07:54:53.967 UTC] Fork at Block #80'597'772 (id:04cdd30c28aca266dc41714d9c0039ffa0ef0f8c0a4fac6796f6cde28d0b4c4f,prev:04cdd30>
Jun 06 07:54:53 jungle4evm-arch41 silkworm[6699]:   WARN [06-06|07:54:53.967 UTC] Removing forked native block #80'597'773 (id:04cdd30d7dfcfaa7243bb6911e7be5cb23067d5efc04d805928b94b4b38031>
Jun 06 07:54:53 jungle4evm-arch41 silkworm[6699]:   WARN [06-06|07:54:53.967 UTC] Reset upper bound for EVM Block #6'501'275, txs:0, hash:7deed7d547c8256c1ca6c83261345eb9bd4b4a87ae2465d6028>
Jun 06 07:57:58 jungle4evm-arch41 consul[6636]: 2023-06-06T07:57:58.418Z [INFO]  agent: Synced check: check=service:jungle4evm-silkworm
Jun 06 08:00:06 jungle4evm-arch41 silkworm[6699]:   CRIT [06-06|08:00:06.285 UTC] SHiP read failed : End of file
Jun 06 08:00:06 jungle4evm-arch41 silkworm[6699]:  ERROR [06-06|08:00:06.831 UTC] [2/10 BlockHashes]                 function=forward exception=kAborted
Jun 06 08:00:06 jungle4evm-arch41 silkworm[6699]:  ERROR [06-06|08:00:06.831 UTC] [2/10 BlockHashes]                 op=Forward returned=kAborted
Jun 06 08:00:06 jungle4evm-arch41 silkworm[6699]:  ERROR [06-06|08:00:06.831 UTC] SyncLoop                           function=work exception=kAborted

Description: I am facing an issue with the EOS-EVM-Node where it does not retry after encountering a SHiP node connection issue. It is reasonable to assume that a few connection issues will occur while the program is running. This behaviour poses a problem for the stability and reliability of the eos-evm-node.

Steps to Reproduce:

Start the EOS-EVM-Node.
Simulate a ship node connection issue by disconnecting the ship node from the network.
Observe the EOS-EVM-Node's behavior.
Expected Behavior: The EOS-EVM-Node should attempt to reconnect to the ship node after detecting the connection issue.

Actual Behavior: The EOS-EVM-Node does not make any retry attempts after encountering a ship node connection issue. It remains in a disconnected state, preventing the synchronization of data and disrupting the blockchain network's operations.

Impact: This issue significantly affects the availability and stability of the eos-evm-node and, consequently, EOS EVM operations. Without a proper retry mechanism, a failure to connect to the SHiP node causes the eos-evm-node to stop serving.

@taokayan (Contributor) commented Jun 8, 2023:

If eos-evm-node exits gracefully after SHiP is disconnected, one solution is to have a batch/Python script find the next available SHiP endpoint and restart eos-evm-node (see the sketch after the list below).

In a highly available setup, you can have:

  • leap(nodeos) node 1 with SHIP in VM 1
  • leap node 2 with SHIP in VM2
  • eos-evm-node 1 in VM3, managed by a script to automatically select the available leap node to connect
  • eos-evm-node 2 in VM4, managed by a script to automatically select the available leap node to connect
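
A minimal sketch of such a selection/restart script, assuming hypothetical SHiP endpoints, a placeholder binary path, and a placeholder `--ship-endpoint` flag (substitute your deployment's actual values and the node's real CLI options):

```python
#!/usr/bin/env python3
"""Watchdog sketch: pick a reachable SHiP endpoint and (re)start eos-evm-node."""
import socket
import subprocess
import time

# Hypothetical SHiP (state history) endpoints exposed by the leap/nodeos nodes.
SHIP_ENDPOINTS = [("vm1.example", 8080), ("vm2.example", 8080)]
EOS_EVM_NODE_CMD = ["/usr/local/bin/eos-evm-node"]  # placeholder path/flags
RETRY_DELAY_SECONDS = 10


def ship_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Cheap TCP reachability probe; a real check might open the SHiP websocket."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_endpoint():
    """Return the first reachable SHiP endpoint, or None if all are down."""
    for host, port in SHIP_ENDPOINTS:
        if ship_is_reachable(host, port):
            return host, port
    return None


def main():
    while True:
        endpoint = pick_endpoint()
        if endpoint is None:
            time.sleep(RETRY_DELAY_SECONDS)
            continue
        host, port = endpoint
        # Flag name is a placeholder for however the node is pointed at SHiP.
        subprocess.run(EOS_EVM_NODE_CMD + [f"--ship-endpoint={host}:{port}"])
        # If the node exits (e.g. after a SHiP EOF), wait and pick an endpoint again.
        time.sleep(RETRY_DELAY_SECONDS)


if __name__ == "__main__":
    main()
```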

@yarkinwho (Contributor) commented:

After some discussion, the behavior should be:
1. The node retries the connection even if a reconnection attempt itself fails.
2. It retries a configurable number of times before exiting (setting the number to 0 effectively reproduces the current behavior).
3. The delay between retries is configurable, defaulting to 10 s.
4. The LIB (last irreversible block) is cached for each block, so that on reconnection there is a reliable LIB to restart from.
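
To make the intended behaviour concrete, here is an illustrative Python sketch (the node itself is C++; `connect_and_sync`, `max_retries`, and `retry_delay` are hypothetical names, not the actual implementation):

```python
import time


def run_sync(connect_and_sync, max_retries: int = 5, retry_delay: float = 10.0):
    """Sketch of the proposed retry policy (points 1-4 above).

    connect_and_sync stands in for the node's real SHiP connection/sync
    routine; it is assumed to raise ConnectionError when the SHiP link drops,
    to return only on a clean shutdown, and to update state["lib"] as blocks
    become irreversible.  max_retries=0 reproduces today's fail-fast behaviour.
    """
    state = {"lib": None}  # cached LIB, giving a reliable point to restart from
    failures = 0
    while True:
        try:
            connect_and_sync(state)  # resumes from state["lib"] on reconnection
            return  # clean shutdown, nothing more to do
        except ConnectionError:
            failures += 1
            if failures > max_retries:
                raise  # exhausted the configured retries: exit as the node does today
            time.sleep(retry_delay)  # configurable delay between retries, default 10 s
```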
