Releases: ObolNetwork/charon
v0.17.2 - 2023-11-08
This release updates a timeout related to how Charon DKG interacts with the Obol API, as well as fixing an issue related to pre-generated builder registrations.
Full Changelog: v0.17.1..v0.17.2
What's Changed
v0.17.1 - 2023-10-19
We are thrilled to announce the release of v0.17.1 of Charon, delivering Holesky support and bug fixes to the v0.17 line of releases.
Full Changelog: v0.17.0..v0.17.1
Feature
Bug
What's Changed
- beaconmock: add MIN_EPOCHS_FOR_BLOCK_REQUESTS by @dB2510 in #2627
- gomod: upgrade libp2p to v0.29.2 by @dB2510 in #2626
- cmd/combine: ensure output dir exists by @dB2510 in #2625
- eth2util: add holesky by @dB2510 in #2628
- eth2util/deposit: bump deposit cli version by @dB2510 in #2630
- eth2util: fix fork version by @dB2510 in #2629
- *: bump go version by @gsora in #2648
- testutil/genchangelog: force fetch tags by @gsora in #2647
Full Changelog: v0.17.0...v0.17.1
v0.17.0 - 2023-08-21
We are thrilled to announce the release of v0.17.0 of Charon, delivering major enhancements in terms of MEV support (still in beta), network performance, and duty result analysis.
Notable Improvements:
- #2172 introduces improved MEV support for newly formed clusters using the latest v1.7.0 cluster definition and lock files. Keep in mind, this enhanced MEV support is still in the beta phase for this release, but we anticipate general availability in the near future.
- #2203 provides a significant performance improvement by decreasing P2P networking bandwidth by more than 50%. This reduction is achieved by serialising data as SSZ instead of JSON, greatly benefiting consensus on block proposals with large payloads.
- #2200 offers improved chain inclusion tracking as part of duty result analysis. As a result, duties will now only be marked successful if they have been included on-chain. This improves on the previous behaviour, which marked duties as successful as soon as they were broadcast to an upstream beacon node.
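To illustrate why switching from JSON to SSZ (#2203) shrinks messages, here is a minimal sketch that encodes the same three fields both ways. The record and its field names are hypothetical and much simpler than Charon's actual wire types. It relies only on the SSZ rule that a container of fixed-size fields is the concatenation of little-endian integers and raw bytes, so it does not reproduce the bandwidth figure quoted above.

```python
import json
import struct

# Hypothetical, simplified record; not one of Charon's actual wire types.
slot = 6_000_000
committee_index = 3
beacon_block_root = bytes.fromhex("ab" * 32)

# JSON form: integers as decimal strings and the root as 0x-prefixed hex,
# with every field name repeated in every message.
json_bytes = json.dumps({
    "slot": str(slot),
    "index": str(committee_index),
    "beacon_block_root": "0x" + beacon_block_root.hex(),
}).encode()

# SSZ form of the same fixed-size fields: two little-endian uint64 values
# followed by the raw 32-byte root, concatenated without field names.
ssz_bytes = struct.pack("<QQ32s", slot, committee_index, beacon_block_root)

print("json bytes:", len(json_bytes))
print("ssz bytes: ", len(ssz_bytes))
```

Hex encoding alone doubles the size of every hash and signature, which is why the saving is largest for big payloads such as block proposals.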
Breaking Changes
⚠️ ⚠️ ⚠️ #2203 introduces network changes that aren't compatible with v0.15.0, so v0.17.0 is only compatible with v0.16.0.
Make sure all nodes in the cluster are upgraded to v0.16.0 before proceeding with the upgrade to v0.17.0.
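The upgrade rule above can be sketched as a pre-upgrade check. This is an illustrative helper, not part of Charon. The table is transcribed from the compatibility statements in these release notes, and the function names are hypothetical.

```python
# Oldest peer minor version each release line can run alongside,
# transcribed from the compatibility statements in these release notes.
OLDEST_COMPATIBLE_MINOR = {17: 16, 16: 14, 15: 14, 14: 13, 13: 11}


def minor(version):
    """Extract the minor number from a tag such as 'v0.17.0'."""
    return int(version.lstrip("v").split(".")[1])


def compatible(a, b):
    """True if two versions can run in the same cluster."""
    newer, older = max(minor(a), minor(b)), min(minor(a), minor(b))
    # Release lines missing from the table are conservatively treated as
    # compatible only with themselves.
    return older >= OLDEST_COMPATIBLE_MINOR.get(newer, newer)


def safe_to_upgrade(target, peer_versions):
    """True if every peer is on a version compatible with the target."""
    return all(compatible(target, peer) for peer in peer_versions)


print(safe_to_upgrade("v0.17.0", ["v0.16.0", "v0.16.0", "v0.15.0"]))
```

The example call reports that the upgrade is not yet safe, because one peer is still on v0.15.0.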
Full Changelog: v0.16.0..v0.17.0
Feature
- load cluster manifest and lock file #2334 (#2358)
- Instrument supported peer versions #2293 (#2295)
- Create backup manifests #2345 (#2348)
- Switch to SSZ eth2 encoding in protobufs #2203 (#2286)
- Create deposit data file for add-validators-solo command #2379 (#2395,#2388)
- charon combine should have --no-verify flag #2321 (#2352)
- Add inclusion results to tracker for accurate success/failure metrics #2200 (#2299)
- Add command to view cluster manifests #2381 (#2408)
- Create keystore files for new validators #2382 (#2406)
- Implement alpha add validator local command #1887 (#2377,#2332,#2306,#2280)
- Add node_signatures to lock file #2204 (#2240,#2250)
Bug
- Inclusion checker bug #2415 (#2424)
- Ensure that ENRs are unique when verifying the definition | Charon #2296 (#2297)
- Workflows using wrong image tag #2367 (#2368)
- dkg: private key lock file not deleted properly on successful run #2258 (#2257)
- p2p: stream scope not attached to a protocol errors #2259 (#2260)
- Fix unmarshal issues with go-eth2-client version v0.17.0 #2333 (#2353)
- Cluster failing attestations with mixed node versions #2386 (#2387)
- Missing flags error on create cluster command #2444 (#2445)
- goroutine leak in charon-perf-1 #2439 (#2452,#2447,#2448)
- align hash timestamp to eth2 #2436 (#2430)
- Outbound stream limit reached #2277 (#2290,#2289,#2278)
- new memory leak alert on v.0.17-dev #2438 (#2443)
Refactor
- Clone explicit builder version to eth2util #2466 (#2470)
- Eager dlinear reporting consensus duration greater than legacy inc timer #2337 (#2425)
- incorrect error message when number of addresses don't match number of validators #2340 (#2372)
- parsigex can't unmarshal blinded blocks #2433 (#2462,#2459,#2434)
- Make the folder output structure from create cluster compatible with charon run defaults #2302 (#2392)
- Pre-generate validator registrations #2172 (#2238)
- manifest file should contain SignedMutationList #2472 (#2473)
- Remove default values for critical flags #2341 (#2380)
- Remove support for stream-delimited wire protocols #1934 (#2350,#2350)
- Track duty failing reasons on aggregate dashboard #1382 (#2242)
Test
- Fix panic in testutil/validatormock/synccomm.go #2347 (#2349)
- Nightly tests are failing #2239 (#2273,#2267)
- Develop a test suite to ensure regressions don't happen on blinded blocks #2317 (#2356)
Misc
v0.16.0 - 2023-06-20
We are excited about this new v0.16.0 release of charon, which includes the fixes from our soon-to-be-announced audit, as well as a number of performance and stability improvements.
Notable improvements:
- #1991 is the tracking ticket for multiple fixes related to the Sigma Prime audit. These fixes mostly relate to more robust data validation.
- #2117 adds a new feature flag that A/B tests a new QBFT round timing strategy. The aim is to monitor the performance of this new strategy compared to the existing strategy and hopefully enable it by default in v0.17. This A/B testing feature can be enabled via the --feature-set=alpha or --feature-set-enable=qbft_timers_ab_test flag.
- #1382 introduces a new metric core_tracker_failed_duty_reason_total that counts the number of failed duties by type and reason, which makes it much easier to identify why duties are failing.
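As a rough sketch of how the core_tracker_failed_duty_reason_total metric could be used, the snippet below sums failures per reason from Prometheus text exposition output. The label names (duty, reason), the label values, and the sample data are assumptions made up for illustration, so check the labels your own node exports.

```python
import re
from collections import Counter

# Made-up sample in Prometheus text exposition format. The label names
# ("duty", "reason") and their values are assumptions for illustration.
SAMPLE = '''\
# TYPE core_tracker_failed_duty_reason_total counter
core_tracker_failed_duty_reason_total{duty="attester",reason="reason_a"} 3
core_tracker_failed_duty_reason_total{duty="attester",reason="reason_b"} 1
core_tracker_failed_duty_reason_total{duty="proposer",reason="reason_a"} 2
'''

LINE = re.compile(r'^core_tracker_failed_duty_reason_total\{(.*)\}\s+(\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def failures_by_reason(text):
    """Sum the counter across duty types, grouped by the reason label."""
    totals = Counter()
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # skip comment lines and unrelated metrics
        labels = dict(LABEL.findall(match.group(1)))
        totals[labels.get("reason", "unknown")] += float(match.group(2))
    return totals


for reason, count in failures_by_reason(SAMPLE).most_common():
    print(reason, count)
```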
Note that v0.16 doesn't introduce any breaking changes, so it has the same backwards compatibility as v0.15: it is compatible with both v0.15 and v0.14.
Full Changelog: v0.15.0..v0.16.0
Feature
- Pre-generate validator registrations #2172 (#2238,#2219,#2214,#2208,#2205,#2202)
- Validate duty data #1922 (#2198)
- Validate received DKG messages #1888 (#2107)
- Upgrade all libp2p wire protocols to use length-delimited messages #1884 (#2212)
- Introduce cluster state data structure #1886 (#2199,#2197,#2193,#2190,#2187,#2182,#2175)
- Add lodestar to compose #1897 (#2128)
- create dkg config check (OBOL-19) #2093 (#2136)
- Improve tracker participation metric #2034 (#2112,#2080,#2075)
- Teku migrated to SSZ block creation by default #1537 (#2063)
- verify reconstructed signature in sigagg (OBOL-17) #2122 (#2123)
- Switch to SSZ eth2 encoding in protobufs #2203 (#2210)
- Cache validators by pubkey in eth2client #1396 (#2121)
- Add deployment workflows for relays #2031 (#2082,#2032)
- dkg: generate deposit data #491 (#2150)
- Double QBFT timers for lead rounds #2092 (#2129,#2096)
- Extend inclusion delay logic to track failed attestation #1538 (#2124,#2120)
- Fix Sigma Prime Audit Results #1991 (#2115)
- Mitigate side-effects due to multiple charon commands running at the same time #1918 (#2109)
- Add node_signatures to lock file #2204 (#2240,#2250,#2224)
- Regularly attempt direct dials even if relay conns exist #2114 (#2222)
Bug
- RLP Length in Bits Rather Than Bytes (OBOL-13) #2076 (#2081)
- Verify QBFT justifications #1923 (#2079)
- p2p: stream scope not attached to a protocol errors #2259 (#2260)
- dkg: private key lock file not deleted properly on successful run #2258 (#2257)
- Incorrect attestation inclusion check #2130 (#2223,#2168)
- Duplicate Keys Allowed in ENR (OBOL-14) #2054 (#2073)
- Prevent mutexes from holding locks while doing I/O #2028 (#2027)
- Outbound stream limit reached #2277 (#2291)
- Lack of Size Checks When Slicing Arrays (OBOL-05) #2052 (#2077)
- sigagg Does Not Ensure t Partials Are Received (OBOL-16) #2053 (#2061)
- Error while combining 1000 distributed keys #2151 (#2178,#2167)
- Incorrectly ignoring attestation aggregation failures #1348 (#2086)
Refactor
- Benchmark different QBFT timing strategies #2117 (#2213,#2116)
- update outdated dependencies (OBOL-12) #2083 (#2089)
- Complete herumi migration #2055 (#2100,#2091)
- Track duty failing reasons on aggregate dashboard #1382 (#2242)
Test
- Nightly test doing long-running DKG #1971 (#2173,#2111)
- Flapping deadliner test #1721 (#2047)
- Fix issues found by beacon mock fuzz #1962 (#2137,#2105)
Misc
- Add linter to disallow import of testutil into production packages #1683 (#2143)
- Set app/version at build time for tagged releases #2097 (#2232)
- Tagged release versions not overwritten to app.Version. #2270 (#2276,#2275,#2274,#2272)
- nil Pointer References from Protobuf Messages. #1997 (#2090,#2088)
- Update deposit cli version to v2.5.0. #2067 (#2074)
- Use main/release branch version when tagging docker images #2098 (#2233)
- Change current config version to v1.6 #2246 (#2247)
v0.15.0 - 2023-04-12
We are excited about this new v0.15.0 release of charon which focuses on performance and stability.
Notable improvements:
- #1904 completes the QBFT wire protocol migration, resulting in a substantial decrease in bandwidth requirements for block proposals. This addresses sporadic block proposal failures in large geo-distributed clusters.
- #1889 refactors the DKG wire protocol to use "reliable-broadcast" making our DKG more robust against byzantine peers.
- #1895 enforces matching DKG minor versions, which mitigates problems caused by users running different versions of the DKG protocol.
Breaking Changes
⚠️ ⚠️ ⚠️
- #1904 removed support for v0.13.0 legacy consensus wire protocol. v0.15.0 is therefore only compatible with v0.14.0. Only upgrade to v0.15.0 if all other nodes in the cluster are on v0.14.0 or newer.
- The --network configuration flag in the charon create cluster and charon create dkg commands now defaults to mainnet. It no longer defaults to goerli.
Full Changelog: v0.14.2..v0.15.0
Feature
- Fix race conditions in charon #1867 (#1966,#1952,#1953,#1950,#1949,#1944,#1947,#1946,#1939,#1942,#1940,#1938,#1937)
- Log peer info at the start of DKG #1912 (#1941,#1936)
- Default on CLI commands to mainnet #1969 (#1986)
- p2p: add support for delimited wire protocol upgrades #1885 (#1890)
- Refactor DKG to use "reliable broadcast" #1889 (#1893)
- dkg/bcast: implement reliable broadcast protocol #1893 (#1896)
- Reduce QBFT throughput #1552 (#1904)
- Upgrade all libp2p wire protocols to use length-delimited messages #1884 (#1885)
- Teku migrated to SSZ block creation by default #1537 (#2064)
- PoC fuzzing beaconmock #1848 (#1929)
- Display "BeaconNode far behind" on grafana #1967 (#1995)
Bug
- Verify commitments lengths in DKG #1999 (#2007)
- DKG doesn't support peerinfo protocol for relays #1978 (#1982)
- Improve BN health check #1732 (#1961)
- Enhance dkg.checkWrites() logic #1919 (#1935)
- DKG doesn't fail server side if version mismatch is detected #2002 (#2003)
- Concurrent relay dial errors #1898 (#1921,#1903)
- Incorrectly ignoring attestation aggregation failures #1348 (#2087)
- Multi beacon node issues if http times out #2037 (#2038)
- fix relay conns #1892 (#1891)
- DKG reconnect deadlock bug #1925 (#1924)
- Relay peerinfo protocol is failing #1956 (#1958)
- Issues with keymanager API feature #1928 (#1930)
- lock file post does not add content type to request #1906 (#1909,#1908,#1907)
- DKG must error when peer runs different version #1895 (#1901)
- v0.15.0-dev doesn't advertise supported protocols #1948 (#1954)
Refactor
Misc
v0.14.3 - 2023-03-21
This release introduces a fix to the unexpected libp2p relay NO_RESERVATION errors which result in network connectivity issues in charon dkg and charon run.
This is a high priority patch release and all users are strongly encouraged to upgrade as soon as possible.
Full Changelog: v0.14.2..v0.14.3
Bug
- Unexpected NO_RESERVATION errors #1913
v0.14.2 - 2023-03-16
This release introduces a fix to the container health check mechanism. It also fixes a bug in the --split-existing-keys argument.
Full Changelog: v0.14.1..v0.14.2
Bug
v0.14.1 - 2023-03-13
Release v0.14.1 fixes an issue introduced in v0.14.0 related to parsing charon-enr-private-key with trailing newlines, see #1875.
This is a low priority release only required for operators facing the following error: Error: load priv key: decode private key hex: encoding/hex: invalid byte.
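The class of bug described here can be reproduced with any strict hex decoder. The sketch below is a Python analogue, not Charon's Go code or the actual fix from #1875: a strict decode rejects file contents that end in a newline, while trimming whitespace first succeeds. The key value is a made-up placeholder.

```python
import binascii


def decode_key_hex(text):
    """Decode a hex-encoded key, tolerating surrounding whitespace."""
    return binascii.unhexlify(text.strip())


# Hypothetical file contents: a hex string followed by a trailing newline,
# as many editors and shell redirections produce.
key_file_contents = "0a1b2c3d\n"

try:
    binascii.unhexlify(key_file_contents)  # strict decode of raw contents
    strict_ok = True
except binascii.Error as err:
    strict_ok = False
    print("strict decode failed:", err)

print("trimmed decode:", decode_key_hex(key_file_contents).hex())
```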
Full Changelog: v0.14.0..v0.14.1
v0.14.0 - 2023-03-10
We are excited about this new v0.14.0 release of charon. This release will be our first under a new Business Source License, which will be used to guide the gradual roll out of DVT onto mainnet over the coming year. This release also includes the first step of reducing the consensus protocol network bandwidth requirements, along with general bug fixes and performance improvements. Most importantly, this release adds provisional attestation support for Nimbus, Lodestar and Prysm validator clients, meaning every Ethereum validator client can attest as part of a distributed validator cluster. 🎉 Finishing the remainder of the duties is a work in progress, and you can view the latest client support at https://dvt.obol.tech/
⚠️ ⚠️ ⚠️ Breaking Changes.
- #1552 removed support for v0.12.0 legacy consensus wire protocol. v0.14.0 is therefore only compatible with v0.13.0. Only upgrade to v0.14.0 if all other nodes in the cluster are on v0.13.0 or newer.
- --withdrawal-address and --fee-recipient-address have been renamed to --withdrawal-addresses and --fee-recipient-addresses, to allow you to specify an array of addresses if you want to have different addresses per validator in the cluster. As these flags are only used once-off to prepare a cluster, the impact of the change should be negligible.
The following flags have been removed and are no longer supported from this release:
- p2p-bootnodes: Renamed to p2p-relays
- p2p-bootnode-relay: Always enabled.
- p2p-bootnodes-from-lockfile: Not supported anymore.
- p2p-udp-address: Discv5 not supported anymore.
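When updating start-up scripts for this release, the flag changes listed above can be applied mechanically. The helper below is a hypothetical sketch, not a Charon tool. It assumes flags are written in the --name=value form, and it does not handle flags passed as two separate arguments or via environment variables.

```python
# Flag changes transcribed from the list above.
RENAMED = {"--p2p-bootnodes": "--p2p-relays"}
REMOVED = {
    "--p2p-bootnode-relay",
    "--p2p-bootnodes-from-lockfile",
    "--p2p-udp-address",
}


def migrate_args(args):
    """Return (new_args, dropped) for a list of '--name=value' arguments."""
    new_args, dropped = [], []
    for arg in args:
        name, sep, value = arg.partition("=")
        if name in REMOVED:
            dropped.append(arg)  # no longer supported, so drop it
        elif name in RENAMED:
            new_args.append(RENAMED[name] + sep + value)
        else:
            new_args.append(arg)  # unrelated flag, keep unchanged
    return new_args, dropped


# The relay URL, UDP address, and --some-other-flag are placeholders.
old = [
    "--p2p-bootnodes=https://relay.example/enr",
    "--p2p-udp-address=0.0.0.0:3630",
    "--some-other-flag=1",
]
new, dropped = migrate_args(old)
print(new)
print(dropped)
```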
The following port is no longer in use by DiscV5, and can be closed if it was opened/port forwarded previously.
- UDP/3630
Some notable features and fixes:
- #1866 fixed a data race condition in scheduler
- #1827 fixed a memory leak
- #1626 replaced the deprecated kryptology BLS crypto library in charon with Herumi BLS.
- #1503 adds support to import DKG generated keys to a keymanager-api such as Web3Signer.
- #1552 upgrades the consensus wire protocol to support reduced bandwidth in the next release.
All operators are encouraged to upgrade to this release as soon as possible to ensure a smooth upgrade to v0.15.0, which will greatly reduce the consensus network bandwidth requirements and thereby remove support for v0.13.0.
Full Changelog: 2cddcf0...v0.14.0
Feature
- Set a constant gauge for upstream /eth/v1/node/version #1773 (#1825)
- Make it easier to recombine keys #1311 (#1799)
- Relay to serve multiaddrs instead of ENR #1632 (#1646)
- Warn if a relay without https is specified #1637 (#1704)
- Implement NodeVersion for charon #1747 (#1765)
- Charon cannot start if any beacon node is down #1312 (#1760)
- Use withdrawal-addresses from cluster-definition v1.5.0 #1650 (#1687)
- Support Keymanager API #1502 (#1663,#1657)
- Reduce QBFT throughput #1552 (#1838)
- Incorrect failed duty reason when deadline reached - tracker v2 #1478 (#1853,#1830,#1706,#1666)
- Create a --publish flag on the dkg and create cluster commands that pushes the produced lockfile to the obol api #1492 (#1781)
- Add daily validator averages to promrated reported statistics #1656 (#1740)
- herumi_bls feature flag #1743 (#1856,#1745)
- Prioritise block proposal strategy #1652 (#1762,#1751,#1731)
- Port geth eip-712 package #1698 (#1710,#1707)
- Instrument beacon node peer count #1603 (#1681,#1675)
- Add --keymanager-address as a flag to charon dkg #1504 (#1671)
- Add vouch to compose #1403 (#1814)
- Include deposit data in the lockfile #1775 (#1815,#1813)
- Use fee-recipient addresses from cluster-definition v1.5.0 #1651 (#1679)
- Add multiple withdrawal addresses to cluster-definition #1645 (#1674)
- Support multiple addresses via CLI #1756 (#1832)
Bug
- Issue with p2p connection type metric #1790 (#1798,#1792,#1791)
- Promrated needs authentication token and is failing silently #1738 (#1746,#1739)
- Cluster definition hash incorrect for empty addresses or signatures #1689 (#1695)
Refactor
- Replace geth crypto package #1626 (#1699,#1694,#1685,#1682)
- Remove discv5 completely #1648 (#1653)
- Refactor tbls usage to tbls/v2 package #1744 (#1789,#1784,#1774,#1768,#1766,#1754)
- Add --keymanager-address as a flag to create cluster #1503 (#1680,#1662)
- Refactor VC readyz status #1612 (#1761)
- Design a BLS12-381 library abstraction #1658 (#1692)
Test
Misc
v0.13.0 - 2023-01-17
We are excited about this new v0.13.0 release of charon that introduces a big upgrade to the networking model, along with general fixes and performance improvements from our recent load-testing efforts.
⚠️ ⚠️ ⚠️ #1555 removed support for v0.10.0 legacy consensus wire protocol. v0.13.0 is therefore only compatible with v0.11.0 and v0.12.0. Only upgrade to v0.13.0 if no other node in the cluster is on v0.10.0.
The UDP-based discv5 peer discovery protocol has been deprecated (and will be removed completely in the next release). It has been replaced with the native libp2p-based peer discovery using external relays. This requires all peers in a cluster to connect to at least one common relay to find and establish a direct p2p connection with one another at boot.
We also took the opportunity to rebrand these "bootnodes" as "relays", as these connections can occur not only at the boot phase but also when a direct connection between peers fails to be established (this is not suitable for production, as your validator will be slower than with a direct connection). It also aligns the naming more closely with an mev-relay, which, although not the same, shares some similarities and as such can help a new user become familiar with the trust assumptions entailed by their selection of relay. Obol intends to make it easy to self-host independent relays, and will work with other organizations in the space to host some alternative relays to the Obol hosted ones to improve network resilience and decentralization.
One other change to be aware of is that we have started to monitor the health and readiness of validator clients that connect to charon, by confirming whether they are requesting information about all of the distributed validator key shares this client will operate. This monitoring feeds into Charon's health and readiness endpoints. If you notice your charon client's container health failing, flapping, or otherwise behaving unexpectedly after this release, do let us know.
The following flags have been deprecated (and will be removed in the next release):
- p2p-bootnodes: Renamed to p2p-relays
- p2p-bootnode-relay: Will always be enabled from next release.
- p2p-bootnodes-from-lockfile: Will not be supported from next release.
- p2p-udp-address: Will be removed in next release.
The following bootnode URL has been deprecated (and will be removed in the future once enough operators have upgraded to a version supporting the new default relay):
http://bootnode.lb.gcp.obol.tech:3640/enr
As part of the networking change, this release adds a new bootnode/relay URL alongside the old one. Charon clients are capable of relying on multiple relays and only need one common relay between peers to function. Once every operator in your cluster is running v0.13 or newer, it will be safe to remove all references to the old bootnode URL. If you remove references to it too soon, operators that have not updated to the new relay will be left behind and will be unable to find your client at the old bootnode.
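The "one common relay" requirement can be checked with a set intersection before anyone removes the old bootnode URL. This is a minimal sketch with hypothetical operator names. NEW_RELAY is a placeholder, since these notes do not state the new relay URL.

```python
OLD_BOOTNODE = "http://bootnode.lb.gcp.obol.tech:3640/enr"
NEW_RELAY = "https://new-relay.example/enr"  # placeholder, not the real URL


def common_relays(relays_by_operator):
    """Return the relays that every operator in the cluster is configured with."""
    sets = [set(relays) for relays in relays_by_operator.values()]
    return set.intersection(*sets) if sets else set()


# Operator B has not upgraded yet and only knows the old bootnode.
mixed = {"operator-a": [OLD_BOOTNODE, NEW_RELAY], "operator-b": [OLD_BOOTNODE]}
print(common_relays(mixed))

# If operator A drops the old URL too soon, no common relay remains.
too_soon = {"operator-a": [NEW_RELAY], "operator-b": [OLD_BOOTNODE]}
print(common_relays(too_soon))
```

An empty result means at least two peers would be unable to discover each other through a relay.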
All operators are strongly encouraged to upgrade to this release as soon as possible to ensure a smooth upgrade to the subsequent v0.14.0 release.
Full Changelog: v0.12.0..v0.13.0
Feature
- Detect all validators queried by VC #1501 (#1609,#1599,#1597,#1566)
- Add support for p2p external IP and Hostname to relay discovery #1590 (#1604)
- Use case v1 priority protocol #1421 (#1485)
- Track attestation inclusion distance #1254 (#1468)
- Synthetic block proposals #1486 (#1544,#1539,#1533,#1499,#1497,#1487)
- Add metrics for consensus components #1247 (#1526)
- Introduce "partial cluster definition" which contains the creator data but without operators or a definition hash (empty operators and empty definition_hash). #1488 (#1520)
- peerinfo protocol to bootnode #1521 (#1536)
- Scheduler duties not trimmed #1584 (#1596)
- AggSigDB data not trimmed #1582 (#1598)
- Improve error message when VC key share index doesn't match charon index #947 (#1601,#1586)
- Prometheus metrics for bootnode observability #1506 (#1534,#1512)
- Investigate dropping discv5 for pure libp2p stack #712 (#1496)
- core/tracker: track inclusion distance #1446 (#1473)
- Add cluster name, hash, peer and network to pushed loki labels #1509 (#1511)
Bug
- Go routine leak in validatormock #1527 (#1528)
- Periodic clock offset spikes #1445 (#1474)
- Remove deprecated consensus wire protocol fields #1555 (#1556,#1554)
- core/dutydb: trim expired duties #978 (#1595)
- UponRule deduplication issue in QBFT #1493 (#1505)
- Create promrated utility to report validator metrics #1540 (#1575,#1574,#1569,#1562,#1553,#1548)
Refactor
- Rebrand bootnodes as relays #1524 (#1610)
- Deprecate discV5 #1606 (#1607)
- Clock diff still includes connection opening delay #1494 (#1592)
- Make DKG program fault tolerant during 'connecting to peers' phase. #1475 (#1495)
- Discv5 stale bootnode do not resolve #139 (#1489)
- Revert partial definition concept #1591 (#1602)