Bug: LogQL regexp extracts only one named group and sets the key to "undefined" #573

Closed
bcarlock-mycarrier opened this issue Sep 13, 2024 · 2 comments

@bcarlock-mycarrier

Qryn version: 3.2.32-bun
Grafana version: v11.1.3

Given this log line:

Back-off restarting failed container vector in pod vector-events-54f8b45d76-4mmr7_monitoring(3741f37c-383d-4b42-ba79-fe121d1d12f3)

and this LogQL query:
... | regexp "Back-off restarting failed container (?P<container>\w+) in pod (?P<pod>[\w.\-]*)"

The expected result is the addition of two new labels:

container=vector
pod=vector-events-54f8b45d76-4mmr7_monitoring

Testing in the Grafana LogQL Analyzer returns the following result:
[screenshot: Grafana LogQL Analyzer result]

However, the return from qryn looks like this:
[screenshot: qryn result]

Upon researching, we discovered that qryn runs the following query (note the ['undefined','undefined'] array passed as the first argument to arrayZip, where the label names should be):

WITH idx_sel AS (select `sel_1`.`fingerprint`
                 from (select `fingerprint` from `logs`.`time_series_gin` where ((`key` = 'k8sEvent') and (`val` = 'BackOff'))) as `sel_1`
                 inner any join (select `fingerprint` from `logs`.`time_series_gin` where ((`key` = 'cluster') and (match(val, 'production.*') = 1))) as `sel_2`
                 on `sel_1`.`fingerprint` = `sel_2`.`fingerprint`),
     sel_a AS (select `samples`.`string` as `string`,
                      `samples`.`fingerprint` as `fingerprint`,
                      samples.timestamp_ns as `timestamp_ns`,
                      JSONExtractKeysAndValues(time_series.labels, 'String') as `labels`,
                      arrayFilter(x -> x.1 != '' AND x.2 != '', arrayZip(['undefined','undefined'], arrayMap(x -> x[length(x)], extractAllGroupsHorizontal(string, 'Back-off restarting failed container (?P<container>\w+) in pod (?P<pod>[\w.\-]*)')))) as `extra_labels`
               from logs.samples_v3 as `samples`
               left any join (select `fingerprint`, `labels` from `logs`.`time_series` as `time_series` where ((`time_series`.`fingerprint` in (idx_sel)) and (date >= toDate(fromUnixTimestamp(intDiv(1726238936619000000, 1000000000)))) and (date <= toDate(fromUnixTimestamp(intDiv(1726249736619000000, 1000000000)))))) AS time_series
               on `samples`.`fingerprint` = time_series.fingerprint
               where ((`samples`.`timestamp_ns` between 1726238936619000000 and 1726249736619000000)
                 and (`samples`.`type` in (0, 0)))
                 and (samples.fingerprint IN idx_sel)
               order by `timestamp_ns` desc
               limit 1000)
select *
from sel_a
order by `labels` desc, `timestamp_ns` desc
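
For illustration only: the first argument to arrayZip is where the label names would be expected, i.e. ['container','pod'] rather than ['undefined','undefined']. The TypeScript sketch below shows one way such a name list could be derived from a pattern while accepting both the (?P<name>...) and (?<name>...) spellings. The helper groupNames and the variable goStyleRegex are hypothetical names made up for this example; this is not qryn's actual code, and how qryn derives the names is not shown in this thread.

```typescript
// Hypothetical helper (NOT qryn's implementation): list the named capture
// groups of a pattern in order of appearance, accepting both the Go/RE2
// spelling (?P<name>...) and the JavaScript spelling (?<name>...).
function groupNames(pattern: string): string[] {
  const names: string[] = [];
  // Lookbehinds "(?<=" and "(?<!" are not matched, because a group name
  // must start with a letter or underscore.
  // Limitation of this sketch: escaped parentheses are not treated specially.
  const nameRe = /\(\?P?<([A-Za-z_][A-Za-z0-9_]*)>/g;
  let m: RegExpExecArray | null;
  while ((m = nameRe.exec(pattern)) !== null) {
    names.push(m[1]);
  }
  return names;
}

// The pattern from the query above, written as a TypeScript string literal.
const goStyleRegex =
  "Back-off restarting failed container (?P<container>\\w+) in pod (?P<pod>[\\w.\\-]*)";

console.log(groupNames(goStyleRegex)); // [ 'container', 'pod' ]
```

With a name list like this, the zipped pairs would carry container and pod as keys.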
@akvlad
Collaborator

akvlad commented Sep 16, 2024

Hello @bcarlock-mycarrier.
Can you please try using (?<container> and (?<pod> instead of (?P<container> and (?P<pod> in the regexp?

There are some syntax deviations between the Go and JavaScript regex implementations.
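
As a rough illustration of that deviation, the TypeScript sketch below checks that a JavaScript engine rejects the (?P<name>...) spelling, rewrites the pattern to (?<name>...), and extracts both labels from the log line in this issue. The helpers toJsNamedGroups, compiles and extractLabels are hypothetical names invented for this example. They are not part of qryn, and the rewrite is a naive text replacement that assumes the pattern contains no escaped literal "(?P<" sequence.

```typescript
// Hypothetical helper: rewrite Go/RE2-style named groups (?P<name>...) into
// the (?<name>...) form accepted by JavaScript's RegExp.
// Naive on purpose: it does not distinguish escaped parentheses.
function toJsNamedGroups(pattern: string): string {
  return pattern.replace(/\(\?P</g, "(?<");
}

// Returns true if the JavaScript engine accepts the pattern.
function compiles(pattern: string): boolean {
  try {
    new RegExp(pattern);
    return true;
  } catch {
    return false;
  }
}

// Runs the pattern against the input and returns the named groups as labels.
function extractLabels(pattern: string, input: string): Record<string, string> {
  const match = new RegExp(pattern).exec(input);
  return match && match.groups ? { ...match.groups } : {};
}

const logLine =
  "Back-off restarting failed container vector in pod " +
  "vector-events-54f8b45d76-4mmr7_monitoring(3741f37c-383d-4b42-ba79-fe121d1d12f3)";

const goPattern =
  "Back-off restarting failed container (?P<container>\\w+) in pod (?P<pod>[\\w.\\-]*)";
const jsPattern = toJsNamedGroups(goPattern);

console.log(compiles(goPattern)); // false: (?P<...> is a syntax error in JavaScript
console.log(compiles(jsPattern)); // true
console.log(extractLabels(jsPattern, logLine));
// { container: 'vector', pod: 'vector-events-54f8b45d76-4mmr7_monitoring' }
```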

@bcarlock-mycarrier
Author

That does work! Is there some documentation of the deviations necessary from standard LogQL? I looked at the documentation and the only thing I was able to find was that regexp was supported, but not anything related to differences in syntax.
