-
Notifications
You must be signed in to change notification settings - Fork 1.8k
out_azure_kusto: fix SIGSEGV in nested mk_list_foreach_safe loops #11212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
out_azure_kusto: fix SIGSEGV in nested mk_list_foreach_safe loops #11212
Conversation
WalkthroughUpdated the Azure Kusto output plugin to use a dedicated inner safety iterator variable; added a new runtime test that verifies buffering/backlog processing with a two-phase Fluent Bit run and accompanying recursive cleanup helpers. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧠 Learnings (6)📚 Learning: 2025-11-21T06:23:29.770ZApplied to files:
📚 Learning: 2025-08-31T12:46:11.940ZApplied to files:
📚 Learning: 2025-09-08T11:21:33.975ZApplied to files:
📚 Learning: 2025-08-29T06:24:26.170ZApplied to files:
📚 Learning: 2025-08-29T06:24:55.855ZApplied to files:
📚 Learning: 2025-08-31T12:46:11.940ZApplied to files:
🔇 Additional comments (6)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
8d08559 to
55130bb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/runtime/out_azure_kusto.c (1)
217-286: Consider adding test directory cleanup for reliability.The test uses a hardcoded buffer directory
/tmp/fluent-bit-kusto-test/without cleanup. Leftover files from previous runs could cause flaky behavior or mask test failures.Consider adding cleanup before and after the test:
void flb_test_azure_kusto_buffering_backlog(void) { int i; int ret; int bytes; char sample[] = "{\"k\":\"v\"}"; size_t sample_size = sizeof(sample) - 1; flb_ctx_t *ctx; int in_ffd; int out_ffd; + + /* Clean up any leftover files from previous runs */ + system("rm -rf /tmp/fluent-bit-kusto-test/"); /* First run: enable buffering and write data to disk */And at the end:
flb_stop(ctx); flb_destroy(ctx); + + /* Clean up test directory */ + system("rm -rf /tmp/fluent-bit-kusto-test/"); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/out_azure_kusto/azure_kusto.c(2 hunks)tests/runtime/out_azure_kusto.c(3 hunks)
🔇 Additional comments (2)
plugins/out_azure_kusto/azure_kusto.c (1)
410-431: LGTM! Correct fix for the nested loop iterator collision.The introduction of
f_tmpas a dedicated temporary iterator for the innermk_list_foreach_safeloop properly resolves the SIGSEGV. Previously, both loops sharedtmp, causing the inner loop to overwrite the outer loop's saved "next" pointer, corrupting iteration state upon returning to the outer loop.tests/runtime/out_azure_kusto.c (1)
33-33: LGTM! Forward declaration and test list entry properly added.Also applies to: 42-42
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/runtime/out_azure_kusto.c (1)
244-245: Consider test isolation and cleanup for the buffer directory.The hardcoded path
/tmp/fluent-bit-kusto-test/may cause issues with parallel test runs or leave stale files between runs. Consider:
- Using a unique temp directory (e.g., with
mkdtemppattern or PID suffix).- Cleaning up the buffer directory at test start and/or end.
For a regression test focused on preventing SIGSEGV, this is a minor concern, but cleanup would prevent accumulation of test artifacts.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/out_azure_kusto/azure_kusto.c(2 hunks)tests/runtime/out_azure_kusto.c(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- plugins/out_azure_kusto/azure_kusto.c
🧰 Additional context used
🧬 Code graph analysis (1)
tests/runtime/out_azure_kusto.c (1)
src/flb_lib.c (10)
flb_stop(1011-1055)flb_destroy(228-263)flb_create(143-225)flb_service_set(652-678)flb_input(266-276)flb_input_set(305-335)flb_output(279-289)flb_output_set(520-551)flb_start(983-994)flb_lib_push(843-870)
🔇 Additional comments (2)
tests/runtime/out_azure_kusto.c (2)
33-33: LGTM!The function declaration and TEST_LIST registration follow the existing patterns in this file.
Also applies to: 42-42
260-285: Regression test structure looks correct for its purpose.The two-phase approach (write buffered data, restart to process backlog) correctly exercises the
ingest_all_chunkscode path where the nested loop SIGSEGV occurred. If the test completes without crashing, the fix is validated.For enhanced verification, you could optionally add a comment clarifying that the test's success criterion is "no crash" rather than output validation.
55130bb to
ea67b3e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/runtime/out_azure_kusto.c (1)
280-283: Consider adding verification that backlog processing occurred.While the test successfully exercises the buffering/backlog code path (its primary goal is crash prevention), it could be more robust by verifying that:
- Buffered data was actually loaded from disk
- Backlog processing completed
- The buffer directory state changed appropriately
This is optional since the test's main purpose—ensuring no SIGSEGV—is achieved if execution completes successfully.
Example verification (optional):
/* After line 283, verify buffer directory state or add logging check */ /* Note: This would require inspecting flb logs or buffer directory contents */
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/out_azure_kusto/azure_kusto.c(2 hunks)tests/runtime/out_azure_kusto.c(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- plugins/out_azure_kusto/azure_kusto.c
🔇 Additional comments (1)
tests/runtime/out_azure_kusto.c (1)
33-33: LGTM!The function prototype and test list entry follow the established pattern in the file.
Also applies to: 42-42
The ingest_all_chunks() function had nested mk_list_foreach_safe loops that both used the same 'tmp' variable as the iterator. The macro stores the 'next' pointer in its second argument for safe iteration during list modification. When the inner loop overwrote 'tmp', it corrupted the outer loop's iteration state, causing undefined behavior and a SIGSEGV crash when processing buffered backlog data on startup. Fix: Add a dedicated 'f_tmp' variable for the inner loop to prevent iterator corruption. Also adds a regression test (buffering_backlog) that exercises the buffering/backlog restart code path to guard against future regressions. Signed-off-by: Yash Ananth <[email protected]> Signed-off-by: Yashwanth Anantharaju <[email protected]>
ea67b3e to
e9e2389
Compare
2929d37 to
03323cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/runtime/out_azure_kusto.c (2)
40-49: Consider checkingnftwreturn value.The
nftwreturn value is not checked. While this is acceptable for cleanup code, checking it and logging failures would improve debuggability if cleanup doesn't complete.Apply this diff to add return value checking:
static void flb_kusto_rm_rf(const char *path) { + int ret; struct stat st; if (stat(path, &st) != 0) { return; } - nftw(path, flb_kusto_unlink_cb, 64, FTW_DEPTH | FTW_PHYS); + ret = nftw(path, flb_kusto_unlink_cb, 64, FTW_DEPTH | FTW_PHYS); + if (ret != 0) { + fprintf(stderr, "Warning: nftw cleanup failed for %s\n", path); + } }
286-291: Consider verifying that buffered chunks were created.The test could verify that files were written to
buffer_dirafter phase 1 to confirm buffering actually occurred. This would make the test more robust and explicit about what it's testing.Add verification after phase 1 (after line 294):
/* Verify that buffering wrote files to disk */ struct stat st; ret = stat(buffer_dir, &st); TEST_CHECK(ret == 0); /* Could also check that directory is not empty using readdir */
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/runtime/out_azure_kusto.c(5 hunks)
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-11-21T06:23:29.770Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.770Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.
Applied to files:
tests/runtime/out_azure_kusto.c
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.
Applied to files:
tests/runtime/out_azure_kusto.c
📚 Learning: 2025-09-08T11:21:33.975Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10851
File: include/fluent-bit/flb_simd.h:60-66
Timestamp: 2025-09-08T11:21:33.975Z
Learning: Fluent Bit currently only supports MSVC compiler on Windows, so additional compiler compatibility guards may be unnecessary for Windows-specific code paths.
Applied to files:
tests/runtime/out_azure_kusto.c
📚 Learning: 2025-08-29T06:24:26.170Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.170Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.
Applied to files:
tests/runtime/out_azure_kusto.c
📚 Learning: 2025-08-29T06:24:55.855Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.
Applied to files:
tests/runtime/out_azure_kusto.c
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.
Applied to files:
tests/runtime/out_azure_kusto.c
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.
Applied to files:
tests/runtime/out_azure_kusto.c
🧬 Code graph analysis (1)
tests/runtime/out_azure_kusto.c (1)
src/flb_lib.c (10)
flb_create(143-225)flb_service_set(652-678)flb_input(266-276)flb_input_set(305-335)flb_output(279-289)flb_output_set(520-551)flb_start(983-994)flb_lib_push(843-870)flb_stop(1011-1055)flb_destroy(228-263)
🔇 Additional comments (4)
tests/runtime/out_azure_kusto.c (4)
3-5: LGTM!The
_XOPEN_SOURCE 700definition is necessary fornftw()(used in the cleanup helpers) and properly guarded against redefinition.
27-31: LGTM!All includes are necessary for the new cleanup helpers and test implementation (
getpid,stat,mkdir,nftw,PATH_MAX).
59-59: LGTM!Test prototype and registration follow the established pattern in this test file.
243-325: Effective regression test for the SIGSEGV fix.The two-phase approach correctly simulates the restart scenario that triggered the nested loop bug. The test exercises the
ingest_all_chunks()code path with buffered backlog, which is the critical path for this regression.Using PID for the buffer directory effectively addresses the previous review concern about hardcoded paths and test isolation.
Signed-off-by: Yashwanth Anantharaju <[email protected]>
03323cf to
5602e1f
Compare
|
@edsiper @cosmo0920 👋 any feedback on this PR? thank you! |
Disclaimer: Used copilot to investigate the crash we faced and took it's help to write this
Summary
Fixes a SIGSEGV crash in the Azure Kusto output plugin when processing buffered backlog data on startup.
Root Cause
The
ingest_all_chunks()function had nestedmk_list_foreach_safeloops that both used the sametmpvariable as the iterator:The
mk_list_foreach_safemacro stores the "next" pointer in its second argument for safe iteration during list modification. When the inner loop overwrotetmp, it corrupted the outer loop's iteration state, causing undefined behavior and eventually a SIGSEGV when the outer loop tried to continue iteration with a corrupted pointer.Crash Stack Trace (from production)
Fix
Add a dedicated
f_tmpvariable for the inner loop to prevent iterator corruption:Testing
flb_test_azure_kusto_buffering_backlogregression test that exercises the buffering/backlog restart code pathChanges
plugins/out_azure_kusto/azure_kusto.c: Addf_tmpiterator variable for inner loop (+2 lines, -1 line)tests/runtime/out_azure_kusto.c: Add buffering backlog regression test (+73 lines)Summary by CodeRabbit
Refactor
Tests
✏️ Tip: You can customize this high-level summary in your review settings.