Conversation

@zoonage zoonage commented Nov 3, 2025

Description

#34054 - When fluentforward is used as a Firelens target by a sidecar, logs can be dropped due to delays in sending/receiving.

Link to tracking issue

#34054

Testing

Verified locally that there is a substantial difference between the 1s, 10s, and unset configuration options; unsure what the best method of testing this with a unit test would be.

Documentation

I believe this is auto-generated?

@zoonage zoonage requested review from a team and dmitryax as code owners November 3, 2025 17:32
@github-actions github-actions bot added the first-time contributor PRs made by new contributors label Nov 3, 2025

github-actions bot commented Nov 3, 2025

Welcome, contributor! Thank you for your contribution to opentelemetry-collector-contrib.

Important reminders:

A maintainer will review your pull request soon. Thank you for helping make OpenTelemetry better!

package fluentforwardreceiver // import "github.com/open-telemetry/opentelemetry-collector-contrib/receiver/fluentforwardreceiver"

import "time"

// Config defines configuration for the fluentforward receiver.
Contributor

This will not pass CI; please move the import up into the import block.

if r.listener == nil {
return nil
}
time.Sleep(r.waitBeforeShutdownDuration)
Contributor

This doesn't seem like a good solution to me. It is possible that more requests will be made to the port while we wait.

Author

That's intended; this is to catch the case where (in my infra) AWS Firelens hasn't sent everything yet, so the delay ensures those logs are still received.

Contributor

Something is way off. I would build a test that continuously sends requests and counts them, then compares that count to the number of logs successfully received. On the first refused request, stop the test and make sure no request was dropped.

The issue seems to come from the server.go file; it's doing a few too many things with goroutines, which can be problematic.

Author

Just to confirm: is that test intended to confirm that logs are being dropped from Firelens, or is it a unit test that could be added to check the behaviour here?


atoulme commented Nov 5, 2025

Please add a changelog and tests.


zoonage commented Nov 5, 2025

Please add a changelog and tests.

I'm not sure exactly how to test this. Verify that a given amount of time has passed between issuing a shutdown call to the receiver and it actually shutting down? Or is there a way that doesn't involve making the CI sleep?


atoulme commented Nov 8, 2025

To add a changelog, type make chlog-new and follow the instructions.
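For reference, make chlog-new generates a YAML stub under .chloggen/ to be filled in. A completed entry for this change might look like the following (field values here are illustrative, not the actual changelog text):

```yaml
change_type: enhancement
component: fluentforwardreceiver
note: Add a configurable delay before shutdown so in-flight logs from Firelens sidecars are not dropped.
issues: [34054]
```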


atoulme commented Nov 10, 2025

@zoonage please take a look at #44133

It shows how to create a test that checks for some level of atomicity around receiving data.
