feat: add elastic search ingester module #82
base: master
Conversation
@david-vavra PTAL
Force-pushed from b977bb1 to 33d8209.
Force-pushed from 33d8209 to 7c3c671.
Signed-off-by: Martin Chodur <[email protected]>
Force-pushed from 7c3c671 to bc3bfef.
Ok, some tests are there and documentation too, I think it's gtg.
@rudo-thomas PTAL, and please feel free to rewrite my Czenglish in the documentation if it's not clear enough 🙏
```
@@ -5,6 +5,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## Unreleased
### Added
- [#82](https://github.com/seznam/slo-exporter/pull/82) New module `elasticSerachIngester`, for more info see [the docs](./docs/modules/elasticsearch_ingester.md)
```
Suggested change:
```suggestion
- [#82](https://github.com/seznam/slo-exporter/pull/82) New module `elasticSearchIngester`, for more info see [the docs](./docs/modules/elasticsearch_ingester.md)
```
```yaml
# OPTIONAL Debug logging
debug: false
```
Once I checked what this option actually does, I would prefer it to be refactored to something like an ES client log level.
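A minimal sketch of that refactor, assuming a hypothetical `logLevel` config option and reusing the `elasticV7` alias and `opts`/`logger` names from this PR; olivere/elastic exposes separate trace/info/error logger options:

```go
// Sketch of a log-level option replacing the debug boolean; config.LogLevel
// and the level names are illustrative, not part of this PR.
switch config.LogLevel {
case "trace":
	opts = append(opts, elasticV7.SetTraceLog(logger), elasticV7.SetInfoLog(logger), elasticV7.SetErrorLog(logger))
case "info":
	opts = append(opts, elasticV7.SetInfoLog(logger), elasticV7.SetErrorLog(logger))
case "error":
	opts = append(opts, elasticV7.SetErrorLog(logger))
}
```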
timeout: "5s" | ||
# Enable/disable sniffing, autodiscovery of other nodes in Elasticsearch cluster | ||
sniffing: true | ||
# Enable/disable healtchecking of the Elasticsearch nodes |
Suggested change:
```suggestion
# Enable/disable healthchecking of the Elasticsearch nodes
```
```yaml
# REQUIRED Document field name containing a timestamp of the event
timestampField: "@timestamp"
# REQUIRED Golang time parse format to parse the timestamp from the timestampField
timestampFormat: "2006-01-02T15:04:05Z07:00" # See https://www.geeksforgeeks.org/time-formatting-in-golang/ for common examples
```
Let's reference something more official here :)
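For what it's worth, the official reference for Go layout strings is https://pkg.go.dev/time#pkg-constants; as a standalone check, the example layout above is exactly `time.RFC3339`:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	const layout = "2006-01-02T15:04:05Z07:00" // identical to time.RFC3339
	ts, err := time.Parse(layout, "2021-03-02T14:05:00+01:00")
	if err != nil {
		panic(err)
	}
	fmt.Println(ts.UTC()) // 2021-03-02 13:05:00 +0000 UTC
}
```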
```go
}

var clientCaCertPool *x509.CertPool
if config.CaCertFile != "" {
```
I don't know if I understand the `TLSClientConfig` correctly, but this seems rather confusing to me. I suggest refactoring this to one of the two:
- either append `config.CaCertFile` to both `TLSClientConfig.ClientCAs` and `TLSClientConfig.RootCAs`,
- or use explicit config options so that it is clear how `config.CaCertFile` is applied. This means dropping `config.CaCertFile` and replacing it with something like `config.clientCaCertFile` and `config.serverCaCertFile`.
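A minimal sketch of what the second option might look like; the function name, the `serverCaCertFile` option, and the path are illustrative, not the PR's code. Note that in Go's `crypto/tls` a client verifies the server through `RootCAs`, while `ClientCAs` is only consulted on the server side, which is why splitting the options makes the intent explicit:

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"os"
)

// buildTLSConfig loads a CA bundle used to verify the Elasticsearch
// server's certificate. RootCAs is what a *client* uses for server
// verification; ClientCAs only matters on the server side.
func buildTLSConfig(serverCaCertFile string) (*tls.Config, error) {
	tlsConfig := &tls.Config{}
	if serverCaCertFile != "" {
		pemData, err := os.ReadFile(serverCaCertFile)
		if err != nil {
			return nil, fmt.Errorf("reading CA cert file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pemData) {
			return nil, fmt.Errorf("no certificates found in %q", serverCaCertFile)
		}
		tlsConfig.RootCAs = pool
	}
	return tlsConfig, nil
}

func main() {
	cfg, err := buildTLSConfig("/etc/ssl/es-ca.pem") // illustrative path
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(cfg.RootCAs != nil)
}
```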
```go
	if config.Debug {
		opts = append(opts, elasticV7.SetTraceLog(logger), elasticV7.SetInfoLog(logger))
	}
	if config.Username != "" || config.Password != "" {
```
Shouldn't this be mutually exclusive with TLS auth?
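For illustration, the kind of validation that would enforce it; `config.ClientCertFile` is an assumed field name standing in for whatever the TLS client-certificate option is actually called:

```go
// Hypothetical validation sketch: reject configs combining basic auth
// with TLS client-certificate auth, since only one should be in effect.
if (config.Username != "" || config.Password != "") && config.ClientCertFile != "" {
	return nil, fmt.Errorf("username/password and TLS client certificate authentication are mutually exclusive")
}
```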
```go
	client *elasticV7.Client
}

func (v *v7Client) RangeSearch(ctx context.Context, index, timestampField string, since time.Time, size int, query string, timeout time.Duration) ([]json.RawMessage, int, error) {
```
I know we are not really consistent with this across the slo-exporter code, but please add documentation for the public methods.
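For example, a possible doc comment for the method above; the wording is inferred from the signature, not taken from the PR:

```go
// RangeSearch searches index for documents whose timestampField falls
// after since, returning at most size documents as raw JSON together with
// the total number of hits. query optionally narrows the search, timeout
// bounds the request, and ctx allows cancellation.
func (v *v7Client) RangeSearch(ctx context.Context, index, timestampField string, since time.Time, size int, query string, timeout time.Duration) ([]json.RawMessage, int, error) {
```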
Help: "Timestamp of the last processed document, next fetch will read since this timestamp", | ||
}) | ||
missingRawLogField = prometheus.NewCounter(prometheus.CounterOpts{ | ||
Name: "missing_raw_log_filed_total", |
Suggested change:
```suggestion
		Name: "missing_raw_log_field_total",
```
```go
	fields map[string]string
}

type tailer struct {
```
Please document the fields, at least/especially `timestamp*` and `rawLog*`.
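A sketch of what the requested documentation might look like; apart from `timestampField`, the field names and types are guesses based on the surrounding diff, not the PR's actual struct:

```go
// Guessed field set; only timestampField appears verbatim in the diff above.
type tailer struct {
	// timestampField names the document field that holds the event timestamp.
	timestampField string
	// timestampFormat is the Go time layout used to parse timestampField.
	timestampFormat string
	// rawLogField, if non-empty, names the field carrying an unstructured
	// log line to be parsed with rawLogParseRegexp's named groups.
	rawLogField       string
	rawLogParseRegexp *regexp.Regexp
}
```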
```go
		}
	}

	timeFiled, ok := newDoc.fields[t.timestampField]
```
`timeFiled` -> `timeField`
Just some nits, documentation and typo fixes needed. Otherwise ok, gj. :)
Adding a new module with support for reading events from Elasticsearch.
Tested locally, it looks OK and works just right.
The module remembers the last timestamp from each log (the user needs to specify which field holds the timestamp and its format).
It periodically (interval configurable) queries Elasticsearch for at most X documents (also configurable) since the last timestamp.
So if it is slow or very much behind, it should keep catching up until it reads the most recent documents.
It converts every key in the root of the Elasticsearch document to a key in the event. Each value is rendered using `fmt.Sprint`, so for example nested JSON ends up as the string representation of a Go map, as the sketch below illustrates.
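A standalone sketch of that flattening behaviour (assumed field names, not the module's actual code):

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// A document as it might come back from Elasticsearch.
	doc := []byte(`{"@timestamp":"2021-03-02T14:05:00Z","statusCode":200,"nested":{"a":1}}`)

	var root map[string]interface{}
	if err := json.Unmarshal(doc, &root); err != nil {
		panic(err)
	}

	// Every root key becomes an event field; values go through fmt.Sprint,
	// so nested objects end up as Go's map string representation.
	fields := make(map[string]string, len(root))
	for k, v := range root {
		fields[k] = fmt.Sprint(v)
	}
	fmt.Println(fields["statusCode"]) // "200"
	fmt.Println(fields["nested"])     // "map[a:1]"
}
```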
In case the user does not use structured logging, the module supports parsing the field with the source log (configurable) the same way the tailer module does, by specifying a regular expression with named groups.

Still to be done: