
[service]: use configured logger whenever possible #13081


Merged
merged 4 commits into main on May 27, 2025

Conversation

TylerHelmuth
Member

Description

I was using the supervised collector today and ran into an issue where the agent (collector) was crashing on startup, and I wasn't seeing the logs exported via the configured logger, even though I knew the logger existed because I was getting the Setting up own telemetry... log.

Turns out that in service.New, once we've created the logger, we aren't using it to log any subsequent errors. Instead, they are returned by service.New and handled by the fallbackLogger.

I propose that, since we have a logger, we use it.

As a follow-up, it would be nice if any confmap errors could also be reported using the instantiated logger, but that would be a bigger refactor.

Testing

Tested locally with the following config:

receivers:
  nop:
exporters:
  nop:
  otlphttp:
    endpoint: "${MISSING_ENV_VAR}:4318"
service:
  pipelines:
    traces:
      receivers: [nop]
      processors: []
      exporters: [otlphttp]
  telemetry:
    logs:
      processors:
        - batch:
            exporter:
              otlp:
                endpoint: https://api.honeycomb.io:443
                headers:
                  - name: x-honeycomb-team
                    value: "[REDACTED]"
                protocol: http/protobuf

console output:

2025-05-22T17:25:30.764-0600	info	service/service.go:200	Setting up own telemetry...	{"resource": {}}
2025-05-22T17:25:30.764-0600	error	service/service.go:223	failed to initialize service graph	{"resource": {}, "error": "failed to build pipelines: failed to create \"otlphttp\" exporter for data type \"traces\": endpoint must be a valid URL"}
go.opentelemetry.io/collector/service.New
	/Users/tylerhelmuth/projects/opentelemetry-collector/service/service.go:223
go.opentelemetry.io/collector/otelcol.(*Collector).setupConfigurationComponents
	/Users/tylerhelmuth/projects/opentelemetry-collector/otelcol/collector.go:197
go.opentelemetry.io/collector/otelcol.(*Collector).Run
	/Users/tylerhelmuth/projects/opentelemetry-collector/otelcol/collector.go:312
go.opentelemetry.io/collector/otelcol.NewCommand.func1
	/Users/tylerhelmuth/projects/opentelemetry-collector/otelcol/command.go:39
github.com/spf13/cobra.(*Command).execute
	/Users/tylerhelmuth/go/1.24.0/pkg/mod/github.com/spf13/[email protected]/command.go:1015
github.com/spf13/cobra.(*Command).ExecuteC
	/Users/tylerhelmuth/go/1.24.0/pkg/mod/github.com/spf13/[email protected]/command.go:1148
github.com/spf13/cobra.(*Command).Execute
	/Users/tylerhelmuth/go/1.24.0/pkg/mod/github.com/spf13/[email protected]/command.go:1071
main.runInteractive
	/Users/tylerhelmuth/projects/opentelemetry-collector/cmd/otelcorecol/main.go:57
main.run
	/Users/tylerhelmuth/projects/opentelemetry-collector/cmd/otelcorecol/main_others.go:10
main.main
	/Users/tylerhelmuth/projects/opentelemetry-collector/cmd/otelcorecol/main.go:50
runtime.main
	/Users/tylerhelmuth/go/1.24.0/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:283
2025-05-22T17:25:30.758-0600	warn	envprovider/provider.go:61	Configuration references unset environment variable	{"name": "MISSING_ENV_VAR"}
Error: failed to build pipelines: failed to create "otlphttp" exporter for data type "traces": endpoint must be a valid URL
2025/05/22 17:25:30 collector server run finished with error: failed to build pipelines: failed to create "otlphttp" exporter for data type "traces": endpoint must be a valid URL

Proof that the error log exported:
(screenshot of the exported error log)

@TylerHelmuth TylerHelmuth requested a review from a team as a code owner May 22, 2025 23:27
@TylerHelmuth TylerHelmuth requested a review from dmitryax May 22, 2025 23:27

codecov bot commented May 22, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.59%. Comparing base (392b705) to head (b0a9b66).
Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13081      +/-   ##
==========================================
+ Coverage   91.58%   91.59%   +0.01%     
==========================================
  Files         505      505              
  Lines       28476    28479       +3     
==========================================
+ Hits        26079    26085       +6     
+ Misses       1883     1880       -3     
  Partials      514      514              


@songy23 songy23 added the collector-telemetry healthchecker and other telemetry collection issues label May 23, 2025
@TylerHelmuth
Member Author

TylerHelmuth commented May 23, 2025

Another approach to this, if we don't like logging an error we're returning, is to refactor so that the logger and the rest of the SDK are instantiated outside the service and passed into the service via settings. Then that logger can be used for the error returned by New instead of the fallback logger.

@@ -192,6 +192,7 @@ func New(ctx context.Context, set Settings, cfg Config) (*Service, error) {

 	tracerProvider, err := telFactory.CreateTracerProvider(ctx, telset, &cfg.Telemetry)
 	if err != nil {
+		logger.Error("failed to create tracer provider", zap.Error(err))
Member

One alternative way of doing this is to do something like this after line 190:

defer func() {
	if err != nil {
		logger.Error("error found during service initialization", zap.Error(err))
	}
}()

(The message is a bit less specific, but at least we don't have to add a new log call in each place.)

Contributor

Either works. One thing I would ask, though: if the defer func is used, please add a comment, otherwise I'll ask myself the same "why is there a defer func() here?" question every time I come across the code :D

Member Author

I like this idea, as it means that any future errors added after the logger is initialized won't need their own logging statement.

Member Author

@mx-psi updated. One downside to this strategy is that the error returned from sdk.Shutdown cannot be returned from the function. We might be able to use named return values, but I don't care for those.

Member

I feel like this should be fine; maybe the only case I would be concerned about is if there are no errors other than the Shutdown one.

@mx-psi
Member

mx-psi commented May 26, 2025

> Another approach to this, if we don't like logging an error we're returning, is to refactor so that the logger and all the sdk are instantiated outside the service and passed into the service via settings. Then that logger can be used for the error returned by New instead of the fallback logger.

I would personally prefer this, see #4970 for some discussion on how to do this, though it would be a bigger refactor

@TylerHelmuth
Member Author

TylerHelmuth commented May 27, 2025

> I would personally prefer this, see #4970 for some discussion on how to do this, though it would be a bigger refactor

I ultimately prefer this as well, but I think logging the error ourselves for now helps a lot, especially for the opampsupervisor. I can also volunteer to start moving this issue forward again if there is capacity to review, but I think merging this now is worth it.

@mx-psi
Member

mx-psi commented May 27, 2025

> I would personally prefer this, see #4970 for some discussion on how to do this, though it would be a bigger refactor

> I ultimately prefer this as well, but I think logging the error ourselves for now helps a lot, especially for the opampsupervisor. I can also volunteer to start moving this issue forward again if there is capacity to review, but I think merging this now is worth it.

Happy to review PRs related to this

@codeboten codeboten added this pull request to the merge queue May 27, 2025
Merged via the queue into open-telemetry:main with commit 9b4911b May 27, 2025
56 checks passed
Labels
collector-telemetry healthchecker and other telemetry collection issues