Remove fail-fast behaviour on cluster bootstrap when peers discovery fails (#1513)

(cherry picked from commit 5f50950)
thampiotr committed Aug 23, 2024
1 parent dfc3968 commit 3f64531
Showing 2 changed files with 13 additions and 4 deletions.
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,16 @@ This document contains a historical list of changes between releases. Only
 changes that impact end-user behavior are listed; changes to documentation or
 internal API changes are not present.

+v1.3.1
+-----------------
+
+### Bugfixes
+
+- Changed the cluster startup behaviour, reverting to the previous logic where
+a failure to resolve cluster join peers results in the node creating its own cluster. This is
+to facilitate the process of bootstrapping a new cluster following user feedback (@thampiotr)
+
+
 v1.3.0
 -----------------

7 changes: 3 additions & 4 deletions internal/service/cluster/cluster.go
@@ -274,10 +274,9 @@ func (s *Service) Run(ctx context.Context, host service.Host) error {

 	peers, err := s.getPeers()
 	if err != nil {
-		// Fatal failure on startup if we can't discover peers to prevent a split brain and give a clear signal to the user.
-		// NOTE: currently returning error from `Run` will not be handled correctly: https://github.com/grafana/alloy/issues/843
-		level.Error(s.log).Log("msg", "fatal error: failed to get peers to join at startup - this is likely a configuration error", "err", err)
-		os.Exit(1)
+		// Warn when failed to get peers on startup as it can result in a split brain. We do not fail hard here
+		// because it would complicate the process of bootstrapping a new cluster.
+		level.Warn(s.log).Log("msg", "failed to get peers to join at startup; will create a new cluster", "err", err)
 	}

 	// We log on info level including all the peers (without any abbreviation), as it's happening only on startup and