Skip to content

Conversation

cosmo0920
Copy link
Contributor

@cosmo0920 cosmo0920 commented Sep 26, 2025

When migrating to remote accessing feature from the parity feature on Fluentd, I didn't find a way to reconnect stale connections.
Now, I found a way to handle this.
The clue is distinguish an error that is ERROR_CANCELLED or ERROR_INVALID_HANDLE.
This is one of the key points to handle disconnect on subscriptions.

Closes #10937.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
    After applying this patch, retrying to connect mechanism is starting to work:
[2025/09/26 18:38:45.838808500] [ warn] [in_winevtlog] EvtNext failed (err=1726), will reconnect
[2025/09/26 18:39:04.992279600] [error] [input:winevtlog:winevtlog.0] reconnect: EvtSubscribe failed on 'Application' (err=1726)
[2025/09/26 18:39:04.992385700] [ warn] [input:winevtlog:winevtlog.0] reconnect attempt 1 failed for 'Application' (err=1726), next at +20002ms
[2025/09/26 18:39:04.992399900] [debug] [input:winevtlog:winevtlog.0] read 205880 bytes from 'Application'

Or

[ warn] [in_winevtlog] EvtNext failed (err=6), will reconnect
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Added automatic, non-blocking reconnection for Windows Event Log input with configurable exponential backoff, jitter and retry scheduling.
    • New user settings: reconnect.base_ms, reconnect.max_ms, reconnect.multiplier, reconnect.jitter_pct, reconnect.max_retries.
    • Added explicit cancel API to request cancellation without triggering reconnect.
  • Bug Fixes

    • Corrected wide-character allocation sizing in subscribe/reconnect paths.
    • Enforced safe defaults, bounds and validation for reconnect settings; ensures max_ms ≥ base_ms and caps values.

Copy link

coderabbitai bot commented Sep 26, 2025

Walkthrough

Implements configurable exponential reconnect backoff for in_winevtlog: normalizes/clamps backoff params at init, exposes reconnect options in config, adds per-channel reconnect state and scheduling, detects unexpected cancellations, and performs non-blocking retry attempts with bookmark-aware subscription recreation.

Changes

Cohort / File(s) Summary
Backoff configuration & normalization
plugins/in_winevtlog/in_winevtlog.c
Adds config_map entries for reconnect.*; parses backoff_multiplier_str; normalizes/clamps backoff.base_ms, backoff.max_ms, backoff.jitter_pct, backoff.max_retries; computes/stores backoff.multiplier_x1000; logs warnings for invalid or swapped values; applied during init before session creation.
Reconnection control flow & channel state
plugins/in_winevtlog/winevtlog.c
Adds per-channel fields (cancelled_by_us, reconnect_needed, last_error, retry_attempts, next_retry_deadline, prng_state) and initializes them. Detects unexpected cancellations in read/next paths, records last_error, schedules jittered exponential backoff retries, and attempts non-blocking reconnects that recreate session/subscription and restore bookmark state. Adds helpers: backoff_defaults, prng16, calc_backoff_ms, winevtlog_schedule_retry, winevtlog_try_reconnect, and winevtlog_request_cancel. Adjusts cancel/subscribe flows and wide-char allocations.
Public structs and APIs
plugins/in_winevtlog/winevtlog.h
Adds struct winevtlog_backoff; extends winevtlog_config with backoff and backoff_multiplier_str; extends winevtlog_channel with reconnect fields; declares winevtlog_try_reconnect, winevtlog_schedule_retry, and winevtlog_request_cancel.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Reader as in_winevtlog (read loop)
  participant Channel as winevtlog_channel
  participant EVT as Windows Event Log
  participant Cfg as Backoff Config

  rect rgba(230,245,255,0.5)
    note over Reader,EVT: normal subscription & reads
    Reader->>EVT: Subscribe / read events
    EVT-->>Reader: Events
  end

  rect rgba(255,240,240,0.5)
    note over EVT,Reader: unexpected cancellation / error
    EVT-->>Reader: Cancel/Error
    Reader->>Channel: mark reconnect_needed, set last_error
    Reader->>Channel: winevtlog_schedule_retry(Cfg) -> compute next_retry_deadline
  end

  alt deadline reached && retries remain
    Reader->>Channel: winevtlog_try_reconnect(Cfg)
    Channel->>EVT: Recreate session/subscription (bookmark aware)
    alt reconnect success
      Channel->>Channel: reset retry_attempts, clear flags
      EVT-->>Reader: Events resume
    else reconnect failed
      Channel->>Channel: increment retry_attempts
      Channel->>Channel: winevtlog_schedule_retry(Cfg)
    end
  else not yet due or exhausted
    note over Reader: wait until deadline or stop after max_retries
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Suggested reviewers

  • edsiper
  • koleini
  • fujimotos

Poem

I thump my paws at logs grown still,
I count the ticks, then plot a pill—
A jittered hop, a doubled try,
Bookmarks guide me to the sky. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 6.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly captures the primary enhancement—adding a reconnect-on-cancel mechanism for the in_winevtlog plugin—and is specific to the main functionality introduced without extraneous details.
Linked Issues Check ✅ Passed The changes implement error logging on disconnect, detect cancellation or invalid handle, and automatically reestablish subscriptions with backoff retries as specified in issue #10937.
Out of Scope Changes Check ✅ Passed All modifications—config map extensions, backoff logic, reconnection flow, and state tracking—directly support the automatic reconnection objectives and no unrelated functionality has been introduced.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch cosmo0920-try-to-reconnect-on-cancel

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (3)
plugins/in_winevtlog/in_winevtlog.c (1)

192-201: Parse float multiplier robustly (use strtod and validate).

atof lacks error detection and is locale-sensitive. Prefer strtod with endptr + errno handling to catch invalid inputs.

Apply this diff:

-    if (ctx->backoff_multiplier_str && ctx->backoff_multiplier_str[0] != '\0') {
-        mult = atof(ctx->backoff_multiplier_str);
-        if (mult <= 0.0) {
-            flb_plg_warn(in, "invalid reconnect.multiplier='%s', fallback to 2.0",
-                         ctx->backoff_multiplier_str);
-            mult = 2.0;
-        }
-    }
+    if (ctx->backoff_multiplier_str && ctx->backoff_multiplier_str[0] != '\0') {
+        char *endp = NULL;
+        double parsed;
+        errno = 0;
+        parsed = strtod(ctx->backoff_multiplier_str, &endp);
+        if (errno != 0 || endp == ctx->backoff_multiplier_str || *endp != '\0' || parsed <= 0.0) {
+            flb_plg_warn(in, "invalid reconnect.multiplier='%s', fallback to 2.0",
+                         ctx->backoff_multiplier_str);
+        }
+        else {
+            mult = parsed;
+        }
+    }

Add this include at the top of the file:

#include <errno.h>
plugins/in_winevtlog/winevtlog.c (2)

842-843: Remove shadowed variable.

int rc = 0; is unused and later shadowed. Drop it.

Apply this diff:

-    int rc = 0;

897-917: Avoid repeated “reconnect exhausted” log spam after max retries.

After reaching max retries, winevtlog_next() will set reconnect_needed again on every cycle, spamming logs. Either close handles to stop polling or gate setting reconnect_needed when exhausted.

Option A (stop polling this channel cleanly):

-                else {
-                    flb_plg_error(ctx->ins, "reconnect exhausted for '%s' (last err=%lu)", ch->name, ch->last_error);
-                    ch->reconnect_needed = FALSE;
-                }
+                else {
+                    flb_plg_error(ctx->ins, "reconnect exhausted for '%s' (last err=%lu). stopping channel.",
+                                  ch->name, ch->last_error);
+                    ch->reconnect_needed = FALSE;
+                    /* prevent further EvtNext() attempts */
+                    close_handles(ch);
+                }

Option B (track exhaustion): add a boolean reconnect_exhausted to winevtlog_channel and check it in winevtlog_next() before setting reconnect_needed. Requires a small header addition.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d21e8de and 39afed2.

📒 Files selected for processing (3)
  • plugins/in_winevtlog/in_winevtlog.c (3 hunks)
  • plugins/in_winevtlog/winevtlog.c (6 hunks)
  • plugins/in_winevtlog/winevtlog.h (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
plugins/in_winevtlog/winevtlog.h (1)
plugins/in_winevtlog/winevtlog.c (3)
  • winevtlog_try_reconnect (703-822)
  • winevtlog_schedule_retry (694-701)
  • winevtlog_request_cancel (197-201)
plugins/in_winevtlog/winevtlog.c (1)
include/fluent-bit/flb_mem.h (1)
  • flb_free (126-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: PR - fuzzing test
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
🔇 Additional comments (7)
plugins/in_winevtlog/in_winevtlog.c (2)

164-166: LGTM: local vars for backoff parsing

mult default and temporary tmp_ms look fine.


497-522: Config map additions look good; align docs and types.

Options and defaults are sensible. Please add/adjust user docs for these keys. If available in the framework, consider parsing reconnect.multiplier as a float type rather than string.

plugins/in_winevtlog/winevtlog.h (4)

30-37: LGTM: backoff struct addition

Fields and units (multiplier_x1000) are clear and match usage.


60-62: LGTM: config additions to context

Holds backoff config and raw multiplier string for parsing.


78-85: LGTM: per-channel reconnect state fields

These enable non-blocking backoff, error propagation, and scheduling.


146-150: LGTM: reconnection API surface

Prototypes reflect the implementation and intended non-blocking usage.

plugins/in_winevtlog/winevtlog.c (1)

147-153: LGTM: initialize reconnect state and PRNG seed

Good defaults and a simple per-channel seed using GetTickCount64 XOR pointer.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/in_winevtlog/winevtlog.c (1)

109-121: Free wide strings and close signal_event on early returns to prevent leaks.

On remote handle or subscribe failure paths, wide_channel/wide_query and signal_event are not released.

Apply this diff:

         if (err != ERROR_SUCCESS) {
             flb_plg_error(ctx->ins, "cannot create remote handle '%s' in %ls (%i)",
                       channel, session->server, err);
+            if (signal_event) { CloseHandle(signal_event); }
+            if (wide_channel) { flb_free(wide_channel); }
+            if (wide_query) { flb_free(wide_query); }
             flb_free(ch->name);
             if (ch->query != NULL) {
                 flb_free(ch->query);
             }
             flb_free(ch);
             return NULL;
         }
@@
     if (!ch->subscription) {
         flb_plg_error(ctx->ins, "cannot subscribe '%s' (%i)", channel, err);
+        if (signal_event) { CloseHandle(signal_event); }
+        if (wide_channel) { flb_free(wide_channel); }
+        if (wide_query) { flb_free(wide_query); }
         flb_free(ch->name);
         if (ch->query != NULL) {
             flb_free(ch->query);
         }
         if (ch->remote) {
             EvtClose(ch->remote);
         }
         flb_free(ch);
         return NULL;
     }

Also applies to: 131-145

🧹 Nitpick comments (2)
plugins/in_winevtlog/in_winevtlog.c (1)

506-531: Config types: consider unsigned for time-like fields.

base_ms, max_ms, jitter_pct, max_retries represent non-negative quantities. Using unsigned config map types (if available) or keeping INT but clamping at init avoids negative assignments. Optional but reduces footguns.

plugins/in_winevtlog/winevtlog.c (1)

896-916: Avoid shadowing rc; use the existing variable.

Minor clarity fix.

Apply this diff:

-            int rc = winevtlog_try_reconnect(ch, ctx);
+            rc = winevtlog_try_reconnect(ch, ctx);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 39afed2 and 524301c.

📒 Files selected for processing (2)
  • plugins/in_winevtlog/in_winevtlog.c (3 hunks)
  • plugins/in_winevtlog/winevtlog.c (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
plugins/in_winevtlog/winevtlog.c (1)
include/fluent-bit/flb_mem.h (1)
  • flb_free (126-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: PR - fuzzing test
🔇 Additional comments (5)
plugins/in_winevtlog/winevtlog.c (5)

147-153: Good initialization of per-channel reconnect state.

Fields are set to safe defaults; PRNG seeded for jitter.


193-201: Helper to mark self-cancel is fine.

Matches the reconnection flow and avoids spurious reconnects.


618-635: Promoted cancellation/next errors to WARN and set reconnect flags.

This surfaces disconnects as intended.


1016-1028: LGTM on WCHAR-sized allocation.

Corrects wide-char buffer sizing.


702-821: Set ch->last_error on reconnect failures for accurate diagnostics.

winevtlog_read logs ch->last_error after failures, but try_reconnect does not assign it on all failure paths.

Apply this diff:

     new_signal = CreateEvent(NULL, TRUE, TRUE, NULL);
-    if (!new_signal) {
+    if (!new_signal) {
+        ch->last_error = GetLastError();
         flb_free(wide_channel);
         if (wide_query) {
             flb_free(wide_query);
         }
         return -1;
     }
@@
     if (ch->session) {
         new_remote = create_remote_handle(ch->session, &err);
-        if (err != ERROR_SUCCESS || !new_remote) {
+        if (err != ERROR_SUCCESS || !new_remote) {
+            ch->last_error = err;
             flb_plg_error(ctx->ins, "reconnect: cannot create remote handle '%s' in %ls (err=%lu)",
                           ch->name, ch->session->server, err);
             CloseHandle(new_signal);
             flb_free(wide_channel);
             if (wide_query) {
                 flb_free(wide_query);
             }
             return -1;
         }
     }
@@
-    if (!new_sub) {
-        DWORD sub_err = GetLastError();
+    if (!new_sub) {
+        DWORD sub_err = GetLastError();
+        ch->last_error = sub_err;
         flb_plg_error(ctx->ins, "reconnect: EvtSubscribe failed on '%s' (err=%lu)", ch->name, sub_err);
         if (new_remote) EvtClose(new_remote);
         CloseHandle(new_signal);
         flb_free(wide_channel);
         if (wide_query) {
             flb_free(wide_query);
         }
         return -1;
     }

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
plugins/in_winevtlog/in_winevtlog.c (3)

192-201: Parse multiplier robustly and clamp before casting to DWORD (avoid overflow/NaN/locale pitfalls).

atof is locale-dependent and doesn’t detect errors; casting an extreme mult to DWORD before clamping can wrap. Parse with strtod, validate, and clamp the double prior to the cast.

Apply this diff:

-    if (ctx->backoff_multiplier_str && ctx->backoff_multiplier_str[0] != '\0') {
-        mult = atof(ctx->backoff_multiplier_str);
-        if (mult <= 0.0) {
-            flb_plg_warn(in, "invalid reconnect.multiplier='%s', fallback to 2.0",
-                         ctx->backoff_multiplier_str);
-            mult = 2.0;
-        }
-    }
-    ctx->backoff.multiplier_x1000 = (DWORD)(mult * 1000.0);
+    if (ctx->backoff_multiplier_str && ctx->backoff_multiplier_str[0] != '\0') {
+        char *end = NULL;
+        errno = 0;
+        mult = strtod(ctx->backoff_multiplier_str, &end);
+        if (errno != 0 || end == ctx->backoff_multiplier_str) {
+            flb_plg_warn(in, "invalid reconnect.multiplier='%s', fallback to 2.0",
+                         ctx->backoff_multiplier_str);
+            mult = 2.0;
+        }
+    }
+    /* clamp before casting to DWORD: 0.5x .. 10x */
+    if (mult < 0.5)  { mult = 0.5; }
+    if (mult > 10.0) { mult = 10.0; }
+    ctx->backoff.multiplier_x1000 = (DWORD)(mult * 1000.0);

Add headers once at file top if not present:

#include <errno.h>
#include <stdlib.h>

243-249: Clamp multiplier_x1000 earlier to prevent cast issues; keep this as a second safety net.

Given pre-cast clamping (proposed above), this block becomes a harmless guard. Leaving it is fine.


515-540: Config map additions LGTM; clarify semantics and prefer unsigned where applicable.

  • Clarify in help strings: units, jitter range [0–100], and max_retries semantics for 0/unlimited.
  • Consider FLB_CONFIG_MAP_UINT for milliseconds and jitter to avoid negatives at the source.
plugins/in_winevtlog/winevtlog.c (1)

887-887: Remove unused/shadowed variable.

int rc = 0; is unused and later shadowed. Drop it.

Apply this diff:

-    int rc = 0;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 524301c and 870a4c8.

📒 Files selected for processing (2)
  • plugins/in_winevtlog/in_winevtlog.c (3 hunks)
  • plugins/in_winevtlog/winevtlog.c (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
plugins/in_winevtlog/winevtlog.c (1)
include/fluent-bit/flb_mem.h (1)
  • flb_free (126-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
🔇 Additional comments (10)
plugins/in_winevtlog/in_winevtlog.c (1)

222-241: Bounds and ordering normalization looks good.

Upper caps and base/max swap are correct and mitigate pathological configs.

plugins/in_winevtlog/winevtlog.c (9)

193-199: Per‑channel reconnect state initialization looks correct.

Flags, counters, and PRNG seed are properly set.


239-241: Marking cancellation as “by us” before EvtCancel is correct.

Prevents false-positive reconnect scheduling.


243-248: Helper to request cancel is fine.

Keeps call sites simple.


709-713: PRNG helper LGTM.

Simple LCG gives adequate jitter for backoff.


715-737: Backoff math OK; ensure delay never goes negative (already effectively guaranteed).

With jitter clamped to 0..100 and ms >= 0, with_jitter stays >= 0. No change required.


739-747: Scheduling uses attempt count properly.

First reconnect is immediate; subsequent attempts back off using incremented attempt number.


942-962: Retry loop semantics: confirm intended meaning of max_retries=0.

With current logic, you attempt one immediate reconnect, then stop when max_retries==0. If you intend 0=unlimited, we should special‑case it.


1062-1062: LGTM.

Correct sizeof(WCHAR) allocation fix.


748-867: Set last_error on all reconnect failure paths; validate second-stage conversions.

Missing assignments lead to misleading logs, and conversion calls are unchecked.

Apply this diff:

@@
-    len = MultiByteToWideChar(CP_UTF8, 0, ch->name, -1, NULL, 0);
+    len = MultiByteToWideChar(CP_UTF8, 0, ch->name, -1, NULL, 0);
     if (len == 0) {
         return -1;
     }
     wide_channel = flb_malloc(sizeof(WCHAR) * len);
     if (!wide_channel) {
         return -1;
     }
-    MultiByteToWideChar(CP_UTF8, 0, ch->name, -1, wide_channel, len);
+    if (0 == MultiByteToWideChar(CP_UTF8, 0, ch->name, -1, wide_channel, len)) {
+        ch->last_error = GetLastError();
+        flb_free(wide_channel);
+        return -1;
+    }
@@
-        MultiByteToWideChar(CP_UTF8, 0, ch->query, -1, wide_query, len);
+        if (0 == MultiByteToWideChar(CP_UTF8, 0, ch->query, -1, wide_query, len)) {
+            ch->last_error = GetLastError();
+            flb_free(wide_query);
+            flb_free(wide_channel);
+            return -1;
+        }
@@
-    new_signal = CreateEvent(NULL, TRUE, TRUE, NULL);
-    if (!new_signal) {
+    new_signal = CreateEvent(NULL, TRUE, TRUE, NULL);
+    if (!new_signal) {
+        ch->last_error = GetLastError();
         flb_free(wide_channel);
         if (wide_query) {
             flb_free(wide_query);
         }
         return -1;
     }
@@
-        if (err != ERROR_SUCCESS || !new_remote) {
+        if (err != ERROR_SUCCESS || !new_remote) {
+            ch->last_error = err;
             flb_plg_error(ctx->ins, "reconnect: cannot create remote handle '%s' in %ls (err=%lu)",
                           ch->name, ch->session->server, err);
             CloseHandle(new_signal);
             flb_free(wide_channel);
             if (wide_query) {
                 flb_free(wide_query);
             }
             return -1;
         }
     }
@@
-    if (!new_sub) {
-        DWORD sub_err = GetLastError();
+    if (!new_sub) {
+        DWORD sub_err = GetLastError();
         if (sub_err == ERROR_EVT_QUERY_RESULT_STALE) {
             flb_plg_warn(ctx->ins, "reconnect: bookmark stale on '%s' (err=%lu), falling back to latest",
                          ch->name, sub_err);
             flags = ctx->read_existing_events ? EvtSubscribeStartAtOldestRecord
                                               : EvtSubscribeToFutureEvents;
             new_sub = EvtSubscribe(new_remote, new_signal, wide_channel, wide_query,
                                    NULL, NULL, NULL, flags);
         }
     }
 
-    if (!new_sub) {
-        DWORD sub_err = GetLastError();
+    if (!new_sub) {
+        DWORD sub_err = GetLastError();
+        ch->last_error = sub_err;
         flb_plg_error(ctx->ins, "reconnect: EvtSubscribe failed on '%s' (err=%lu)", ch->name, sub_err);
         if (new_remote) EvtClose(new_remote);
         CloseHandle(new_signal);
         flb_free(wide_channel);
         if (wide_query) {
             flb_free(wide_query);
         }
         return -1;
     }

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/in_winevtlog/winevtlog.c (1)

219-226: Also free wide_query on bookmark-create failure path.

wide_query leaks in this branch.

Apply this diff:

             if (signal_event) {
                 CloseHandle(signal_event);
             }
             flb_plg_error(ctx->ins, "cannot subscribe '%s' (%i)", channel, GetLastError());
             flb_free(wide_channel);
+            if (wide_query) {
+                flb_free(wide_query);
+            }
             flb_free(ch->name);
             if (ch->query != NULL) {
                 flb_free(ch->query);
             }
             flb_free(ch);
             return NULL;
🧹 Nitpick comments (2)
plugins/in_winevtlog/winevtlog.c (2)

771-785: Check MultiByteToWideChar conversion results in reconnect path.

Avoid using partially filled buffers; clean up on failure.

Apply this diff:

-    MultiByteToWideChar(CP_UTF8, 0, ch->name, -1, wide_channel, len);
+    if (0 == MultiByteToWideChar(CP_UTF8, 0, ch->name, -1, wide_channel, len)) {
+        flb_free(wide_channel);
+        return -1;
+    }
@@
-        MultiByteToWideChar(CP_UTF8, 0, ch->query, -1, wide_query, len);
+        if (0 == MultiByteToWideChar(CP_UTF8, 0, ch->query, -1, wide_query, len)) {
+            flb_free(wide_query);
+            flb_free(wide_channel);
+            return -1;
+        }

946-965: Avoid shadowing rc; reuse the existing variable.

Minor cleanup to reduce confusion.

Apply this diff:

-            int rc = winevtlog_try_reconnect(ch, ctx);
+            rc = winevtlog_try_reconnect(ch, ctx);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 870a4c8 and f52fb1f.

📒 Files selected for processing (1)
  • plugins/in_winevtlog/winevtlog.c (11 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
plugins/in_winevtlog/winevtlog.c (1)
include/fluent-bit/flb_mem.h (1)
  • flb_free (126-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-centos-7
🔇 Additional comments (9)
plugins/in_winevtlog/winevtlog.c (9)

156-167: Close signal_event and free wide buffers on remote-handle creation failure (leak).

On this path, signal_event, wide_channel, and wide_query leak.

Apply this diff:

         if (err != ERROR_SUCCESS) {
             flb_plg_error(ctx->ins, "cannot create remote handle '%s' in %ls (%i)",
                       channel, session->server, err);
+            if (signal_event) {
+                CloseHandle(signal_event);
+            }
+            if (wide_channel) {
+                flb_free(wide_channel);
+            }
+            if (wide_query) {
+                flb_free(wide_query);
+            }
             flb_free(ch->name);
             if (ch->query != NULL) {
                 flb_free(ch->query);
             }
             flb_free(ch);
             return NULL;
         }

181-191: Fix resource leaks on EvtSubscribe failure (handle + buffers).

signal_event, wide_channel, and wide_query are not released here.

Apply this diff:

     if (!ch->subscription) {
         flb_plg_error(ctx->ins, "cannot subscribe '%s' (%i)", channel, err);
+        if (signal_event) {
+            CloseHandle(signal_event);
+        }
         flb_free(ch->name);
         if (ch->query != NULL) {
             flb_free(ch->query);
         }
         if (ch->remote) {
             EvtClose(ch->remote);
         }
+        if (wide_channel) {
+            flb_free(wide_channel);
+        }
+        if (wide_query) {
+            flb_free(wide_query);
+        }
         flb_free(ch);
         return NULL;
     }

667-676: Good: promote unexpected CANCELLED to WARN and consume self-cancel flag.

This aligns with the objective to surface disconnects and avoids false positives on intentional cancels.


733-735: Good: jitter percentage clamped to 0..100.

Prevents span overflow/odd modulo behavior with large jitter values.


194-200: LGTM: initialize per-channel reconnect state.

Sane defaults; prng_state seeding looks fine.


244-248: LGTM: add non-blocking cancel request helper.

Explicit API to mark self-cancel and invoke EvtCancel is useful for control flow.


512-513: LGTM: plumb remote handle into publisher metadata lookup.

Passing remote for EvtOpenPublisherMetadata supports remote sources; callers updated accordingly.

Also applies to: 559-559


787-795: Set ch->last_error on reconnect failures for accurate diagnostics.

Without this, logs show err=0 after failures.

Apply this diff:

     new_signal = CreateEvent(NULL, TRUE, TRUE, NULL);
     if (!new_signal) {
+        ch->last_error = GetLastError();
         flb_free(wide_channel);
         if (wide_query) {
             flb_free(wide_query);
         }
         return -1;
     }
@@
     if (ch->session) {
         new_remote = create_remote_handle(ch->session, &err);
         if (err != ERROR_SUCCESS || !new_remote) {
+            ch->last_error = err;
             flb_plg_error(ctx->ins, "reconnect: cannot create remote handle '%s' in %ls (err=%lu)",
                           ch->name, ch->session->server, err);
             CloseHandle(new_signal);
             flb_free(wide_channel);
             if (wide_query) {
                 flb_free(wide_query);
             }
             return -1;
         }
     }
@@
     if (!new_sub) {
         DWORD sub_err = GetLastError();
+        ch->last_error = sub_err;
         flb_plg_error(ctx->ins, "reconnect: EvtSubscribe failed on '%s' (err=%lu)", ch->name, sub_err);
         if (new_remote) EvtClose(new_remote);
         CloseHandle(new_signal);
         flb_free(wide_channel);
         if (wide_query) {
             flb_free(wide_query);
         }
         return -1;
     }

Also applies to: 796-808, 834-844


701-708: Preserve explicit 0 values in backoff_defaults (don’t treat 0 as “unset”).

Zero jitter/retries should be honored (e.g., disable jitter or retries).

Apply this diff:

-static inline void backoff_defaults(struct winevtlog_backoff *dst, const struct winevtlog_backoff *src)
-{
-    dst->base_ms          = (src && src->base_ms)          ? src->base_ms          : 500;
-    dst->max_ms           = (src && src->max_ms)           ? src->max_ms           : 30000;
-    dst->multiplier_x1000 = (src && src->multiplier_x1000) ? src->multiplier_x1000 : 2000; /* x2.0 */
-    dst->jitter_pct       = (src && src->jitter_pct)       ? src->jitter_pct       : 20;
-    dst->max_retries      = (src && src->max_retries)      ? src->max_retries      : 8;
-}
+static inline void backoff_defaults(struct winevtlog_backoff *dst, const struct winevtlog_backoff *src)
+{
+    if (src) {
+        *dst = *src; /* preserve explicit zeros */
+    }
+    else {
+        dst->base_ms          = 500;
+        dst->max_ms           = 30000;
+        dst->multiplier_x1000 = 2000; /* x2.0 */
+        dst->jitter_pct       = 20;
+        dst->max_retries      = 8;
+    }
+}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
plugins/in_winevtlog/winevtlog.c (3)

156-167: Fix resource leaks on remote-handle creation failure path.

When create_remote_handle fails, signal_event, wide_channel, and (if present) wide_query leak. Close/free them before returning.

Apply this diff:

     if (session != NULL) {
         remote_handle = create_remote_handle(session, &err);
         if (err != ERROR_SUCCESS) {
             flb_plg_error(ctx->ins, "cannot create remote handle '%s' in %ls (%i)",
                       channel, session->server, err);
+            if (signal_event) {
+                CloseHandle(signal_event);
+            }
+            if (wide_channel) {
+                flb_free(wide_channel);
+            }
+            if (wide_query) {
+                flb_free(wide_query);
+            }
             flb_free(ch->name);
             if (ch->query != NULL) {
                 flb_free(ch->query);
             }
             flb_free(ch);
             return NULL;
         }
     }

218-236: Free wide_query on bookmark creation failure path.

On this error path, wide_query isn’t freed, causing a leak.

Apply this diff:

             if (signal_event) {
                 CloseHandle(signal_event);
             }
             flb_plg_error(ctx->ins, "cannot subscribe '%s' (%i)", channel, GetLastError());
             flb_free(wide_channel);
+            if (wide_query) {
+                flb_free(wide_query);
+            }
             flb_free(ch->name);
             if (ch->query != NULL) {
                 flb_free(ch->query);
             }
             flb_free(ch);
             return NULL;

1156-1166: Fix re-subscribe handle transfer: remote not copied; re_ch leaked; potential NULL remote in downstream and handle leaks.

After re-subscribe with bookmark, the code copies only a subset of fields, leaving ch->remote NULL and leaking the temporary re_ch struct/handles. This breaks remote-dependent calls (e.g., get_description) and leaks memory/handles.

Apply this diff to transfer handles safely and dispose of re_ch:

                 re_ch = winevtlog_subscribe(ch->name, ctx, bookmark, ch->query, ch->session);
                 if (re_ch != NULL) {
                     close_handles(ch);
 
-                    ch->bookmark = re_ch->bookmark;
-                    ch->subscription = re_ch->subscription;
-                    ch->signal_event = re_ch->signal_event;
-                    ch->session = re_ch->session;
+                    /* Steal handles from re_ch */
+                    ch->bookmark      = re_ch->bookmark;      re_ch->bookmark = NULL;
+                    ch->subscription  = re_ch->subscription;  re_ch->subscription = NULL;
+                    ch->signal_event  = re_ch->signal_event;  re_ch->signal_event = NULL;
+                    ch->remote        = re_ch->remote;        re_ch->remote = NULL;
+                    ch->session       = re_ch->session;
+
+                    /* Reset reconnection state */
+                    ch->cancelled_by_us     = FALSE;
+                    ch->reconnect_needed    = FALSE;
+                    ch->retry_attempts      = 0;
+                    ch->next_retry_deadline = 0;
+                    ch->last_error          = 0;
+                    ch->count               = 0;
+
+                    /* Dispose of temporary container (names/queries are private to it) */
+                    winevtlog_close(re_ch);
                 }
                 else {
                     flb_plg_error(ctx->ins, "Failed to subscribe with bookmark XML: %s\n", record.bookmark_xml);
                     ch->bookmark = EvtCreateBookmark(NULL);
                 }
🧹 Nitpick comments (4)
plugins/in_winevtlog/winevtlog.c (4)

70-70: Remove unused variable ‘buf’.

‘buf’ is unused; drop it to avoid warnings.

Apply this diff:

-    void *buf;

910-911: Avoid shadowing ‘rc’; remove the unused outer variable.

The outer ‘rc’ is unused and shadowed inside the reconnect block, producing warnings.

Apply this diff:

-    int rc = 0;

47-56: Always scrub credentials on failure too.

Zero EVT_RPC_LOGIN before any return to avoid leaving sensitive pointers in stack memory.

Apply this diff:

     remote = EvtOpenSession(EvtRpcLogin, &credentials, 0, 0);
     if (!remote) {
         *error_code = GetLastError();
+        SecureZeroMemory(&credentials, sizeof(EVT_RPC_LOGIN));
         return remote;
     }
 
     SecureZeroMemory(&credentials, sizeof(EVT_RPC_LOGIN));

735-760: Guard against pathological backoff configs (multiplier/base).

If multiplier_x1000 < 1000 or base_ms is unreasonably large, backoff can stall at 0ms or overflow. Clamp multiplier to >= 1000 and optionally cap base/max to sane bounds.

Apply this diff:

 static DWORD calc_backoff_ms(struct winevtlog_channel *ch, const struct winevtlog_backoff *cfg, DWORD attempt)
 {
@@
-    double mult = (double)cfg->multiplier_x1000 / 1000.0;
-    double t = (double)cfg->base_ms;
+    double mult = (double)(cfg->multiplier_x1000 < 1000 ? 1000 : cfg->multiplier_x1000) / 1000.0;
+    double t = (double)cfg->base_ms;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f52fb1f and 82a745a.

📒 Files selected for processing (2)
  • plugins/in_winevtlog/in_winevtlog.c (3 hunks)
  • plugins/in_winevtlog/winevtlog.c (11 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/in_winevtlog/in_winevtlog.c
🧰 Additional context used
🧬 Code graph analysis (1)
plugins/in_winevtlog/winevtlog.c (1)
include/fluent-bit/flb_mem.h (1)
  • flb_free (126-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-centos-7
🔇 Additional comments (9)
plugins/in_winevtlog/winevtlog.c (9)

92-115: Good OOM and conversion checks for channel name.

The new checks around wide-channel allocation and MultiByteToWideChar error handling are correct and prevent NULL derefs.


119-144: Good OOM and conversion checks for query.

The added checks and early cleanups on query allocation/conversion failures look correct.


676-685: CANCELLED handling: warn-level logging and flag consumption look good.

The WARN promotion and “Consume this flag” fix match the PR goals and previous feedback.


687-691: Reconnect on EvtNext errors (non-ENO_MORE_ITEMS): LGTM.

Setting reconnect_needed and last_error on unexpected EvtNext errors aligns with the reconnection workflow.


710-727: Preserve explicit zero values in backoff config (effective defaults): LGTM.

Copying src wholesale preserves 0 jitter/retries, fixing the earlier “zero treated as unset” behavior.


751-754: Jitter clamped to 0..100: LGTM.

Prevents overflow/odd modulo behavior for large jitter.


521-575: Remote-aware description lookup: LGTM.

Passing the remote handle into EvtOpenPublisherMetadata is correct for remote sessions; falls back to local when NULL.


1085-1096: Allocate WCHAR-sized buffer in convert_str: LGTM.

The corrected allocation fixes a potential underrun.


771-890: Set last_error on reconnect failures for accurate diagnostics.

winevtlog_read logs ch->last_error after failed reconnects, but try paths don’t consistently set it (CreateEvent, remote handle, EvtSubscribe failures).

Apply this diff:

     new_signal = CreateEvent(NULL, TRUE, TRUE, NULL);
     if (!new_signal) {
+        ch->last_error = GetLastError();
         flb_free(wide_channel);
         if (wide_query) {
             flb_free(wide_query);
         }
         return -1;
     }
@@
     if (ch->session) {
         new_remote = create_remote_handle(ch->session, &err);
         if (err != ERROR_SUCCESS || !new_remote) {
+            ch->last_error = err;
             flb_plg_error(ctx->ins, "reconnect: cannot create remote handle '%s' in %ls (err=%lu)",
                           ch->name, ch->session->server, err);
             CloseHandle(new_signal);
             flb_free(wide_channel);
             if (wide_query) {
                 flb_free(wide_query);
             }
             return -1;
         }
     }
@@
     if (!new_sub) {
         DWORD sub_err = GetLastError();
+        ch->last_error = sub_err;
         flb_plg_error(ctx->ins, "reconnect: EvtSubscribe failed on '%s' (err=%lu)", ch->name, sub_err);
         if (new_remote) EvtClose(new_remote);
         CloseHandle(new_signal);
         flb_free(wide_channel);
         if (wide_query) {
             flb_free(wide_query);
         }
         return -1;
     }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

in_winevtlog: remote logging fails to reconnect or even log errors when service suffers disconnect.
1 participant