IR additional health status #2934
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #2934 +/- ##
==========================================
+ Coverage 23.88% 23.91% +0.03%
==========================================
Files 775 776 +1
Lines 45610 45721 +111
==========================================
+ Hits 10892 10936 +44
- Misses 33861 33925 +64
- Partials 857 860 +3
d451e1e to f16d6d6
pkg/services/control/ir/types.proto
Outdated
  // IR application is started and serves all services.
- READY = 2;
+ READY = 3;
I think you have broken older versions of the CLI, which expect 2 as READY:
neofs-node/cmd/neofs-cli/modules/control/healthcheck.go
Lines 70 to 72 in 4a1bc79
if healthStatus != control.HealthStatus_READY {
	os.Exit(1)
}
It is an internal API, but I still think the break is not worth it, @roman-khimov
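The concern can be sketched in a few lines. This is only an illustration: `oldReady`, `newReady` and `oldCLIExitCode` are hypothetical names, and the only value taken from the thread is READY being 2 before the change and 3 after it. An older CLI build compares the number it receives against the constant it was compiled with, so a renumbered READY looks like "not ready".

```go
package main

import "fmt"

// READY as an already released CLI build knows it (value from the thread).
const oldReady int32 = 2

// READY as a server built from the renumbered proto would send it.
const newReady int32 = 3

// oldCLIExitCode mimics the quoted healthcheck: any status other than the
// READY value the CLI was compiled with produces a non-zero exit code.
// The function itself is a hypothetical stand-in, not the real CLI code.
func oldCLIExitCode(statusOnWire int32) int {
	if statusOnWire != oldReady {
		return 1
	}
	return 0
}

func main() {
	fmt.Println("old server, old CLI:", oldCLIExitCode(oldReady))
	fmt.Println("new server, old CLI:", oldCLIExitCode(newReady))
}
```

With the renumbered enum, a healthy IR would be reported as unhealthy by every CLI released before the change.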
Also, FYI, if we do such things, we usually add something to our CHANGELOG to prepare our users for their catastrophic update, e.g.:
Lines 154 to 170 in 4a1bc79
### Updating from v0.40.1
Remove `notification` section from all SN configuration files: it is no longer
supported. All NATS servers running for this purpose only are no longer needed.
If your app depends on notifications transmitted to NATS, do not update and
create an issue please.

Stop attaching `__NEOFS__NETMAP*` X-headers to NeoFS API requests. If your app
is somehow tied to them, do not update and create an issue please.

Notice that this is the last release containing `blobovnicza-to-peapod`
migration utility. Blobovniczas were removed from the node since 0.39.0, so
if you're using any current NeoFS node version it's not a problem. If you're
using 0.38.0 or lower with blobovniczas configured, please migrate ASAP.

Remove `grpc.tls.use_insecure_crypto` from any storage node configuration.

Remove `timers.emit` from any inner ring configuration.
Compatibility had better be kept; it doesn't cost a lot.
Added the new statuses after the existing ones, so compatibility is no longer broken.
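One way to keep this property from regressing is a small check that pins the numbers released clients rely on. The sketch below is an assumption-laden illustration: `legacyNumbers` and `changedNumbers` are hypothetical helpers, and the number 4 for `INITIALIZING_NETWORK` is assumed (only `READY = 2` is stated in this thread).

```go
package main

import (
	"fmt"
	"sort"
)

// legacyNumbers pins the wire numbers that released clients depend on.
// Only READY = 2 is taken from the discussion.
var legacyNumbers = map[string]int32{"READY": 2}

// changedNumbers returns, sorted, the legacy names whose number is missing
// or different in the current enum definition.
func changedNumbers(legacy, current map[string]int32) []string {
	var changed []string
	for name, want := range legacy {
		if got, ok := current[name]; !ok || got != want {
			changed = append(changed, name)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	// New status appended after the existing ones (number is assumed).
	appended := map[string]int32{"READY": 2, "INITIALIZING_NETWORK": 4}
	// New status inserted before READY, shifting READY to 3.
	renumbered := map[string]int32{"INITIALIZING_NETWORK": 2, "READY": 3}

	fmt.Println("appended:", changedNumbers(legacyNumbers, appended))
	fmt.Println("renumbered:", changedNumbers(legacyNumbers, renumbered))
}
```

Appending leaves the pinned set untouched, while inserting in the middle is reported immediately.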
pkg/innerring/innerring.go
Outdated
 func New(ctx context.Context, log *zap.Logger, cfg *viper.Viper, errChan chan<- error) (*Server, error) {
 	var err error
 	server := &Server{log: log}

-	server.setHealthStatus(control.HealthStatus_HEALTH_STATUS_UNDEFINED)
+	server.setHealthStatus(control.HealthStatus_CREATE_SERVER)
I think `HealthStatus_CREATE_SERVER` was not a problem at all. You started a server and it has only created `&Server{log: log}`, so it really is undefined at this point. I guess the initial issue was mostly about extending the codes: you set the status to something like "STARTING_BLOCKCHAIN" and then call `server.bc.Run(ctx)`; you set it to something like "DEPLOYING_NETWORK" and then call `deploy.Deploy(ctx, deployPrm)`, etc. This lets an admin understand more about what is happening: starting IR may take minutes now, and all an admin gets is minutes of "undefined" that then goes immediately to READY.
The rule of thumb may be: give a status to any operation that is not local CPU work but I/O that may potentially block and that an admin should know about.
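A minimal sketch of that rule of thumb, assuming a simplified `Server` that is not the real inner ring type: each potentially blocking step is paired with the status that announces it, and the status is stored before the step runs. The status names echo the examples in the comment above; their numbers, the `step` type and the `start` method are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// HealthStatus is a stand-in for the generated enum; numbers other than
// READY = 2 are assumptions made for this sketch.
type HealthStatus int32

const (
	StatusUndefined          HealthStatus = 0
	StatusReady              HealthStatus = 2
	StatusStartingBlockchain HealthStatus = 4
	StatusDeployingNetwork   HealthStatus = 5
)

// Server is a simplified stand-in that only tracks the health status.
type Server struct {
	health atomic.Int32
}

func (s *Server) setHealthStatus(hs HealthStatus) { s.health.Store(int32(hs)) }

// HealthStatus returns the status that was stored most recently.
func (s *Server) HealthStatus() HealthStatus { return HealthStatus(s.health.Load()) }

// step pairs a status with the potentially blocking operation it announces.
type step struct {
	status HealthStatus
	run    func() error
}

// start publishes each step's status before running it, so an admin polling
// the health check sees which I/O phase the server is currently in.
func (s *Server) start(steps []step) error {
	for _, st := range steps {
		s.setHealthStatus(st.status)
		if err := st.run(); err != nil {
			return err
		}
	}
	s.setHealthStatus(StatusReady)
	return nil
}

func main() {
	s := &Server{}
	var seen []HealthStatus
	// observe stands in for blocking I/O and records the status visible meanwhile.
	observe := func() error { seen = append(seen, s.HealthStatus()); return nil }

	err := s.start([]step{
		{StatusStartingBlockchain, observe},
		{StatusDeployingNetwork, observe},
	})
	fmt.Println(seen, s.HealthStatus(), err)
}
```

Pure in-memory construction, such as creating `&Server{log: log}`, gets no status of its own under this scheme.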
Added new statuses for the steps that seem to involve I/O and can block the function, but I'm not sure about them.
f16d6d6 to 9acfa84
9acfa84 to 9232078
CHANGELOG.md
Outdated
@@ -5,6 +5,8 @@ Changelog for NeoFS Node

 ### Added
 - More effective FSTree writer for HDDs, new configuration options for it (#2814)
+- New health statuses in inner ring (#2934)
Let's add this record with the corresponding commit.
Signed-off-by: Andrey Butusov <[email protected]>
New health status for time-spending process - initializing Neo network by blockchain: `INITIALIZING_NETWORK`. Closes #2923. Signed-off-by: Andrey Butusov <[email protected]>
Expose health status of ir via Prometheus. Signed-off-by: Andrey Butusov <[email protected]>
9232078 to 5f97a86
idk why the lint action failed
Looks a lot like nspcc-dev/neo-go#3416, to be fixed with #2940.
Closes #2923.
I'm not sure I've found the right name for the status, and I was also thinking that I could add an additional status before updating contracts, since this is the longest process.
I made the metric the same type as in the storage node (a number from the enum), but I also considered using a string parameter.