Emit outcome: failure in obsconsumer #13234

Merged

Conversation

jade-guiton-dd
Contributor

Description

The last remaining part of #12676 is to implement the `outcome: failure` part of the Pipeline Component Telemetry RFC (see here). This is done by introducing a downstream error wrapper struct to distinguish errors originating in the next component from errors bubbled up from further downstream.
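For illustration, a minimal sketch of what such a wrapper could look like follows; the names (`downstreamError`, `NewDownstream`, `IsDownstream`) are hypothetical and not necessarily the API this PR adds to `consumererror`:

```go
// Hypothetical sketch of a downstream error wrapper; names are illustrative,
// not necessarily the consumererror API introduced by this PR.
package consumererror

import "errors"

// downstreamError marks an error as having originated further downstream,
// i.e. past the component whose instrumentation layer first observed it.
type downstreamError struct {
	err error
}

func (e downstreamError) Error() string { return e.err.Error() }
func (e downstreamError) Unwrap() error { return e.err }

// NewDownstream wraps err so that instrumentation layers further upstream can
// tell that the failure was not caused by their own "next" component.
func NewDownstream(err error) error {
	if err == nil {
		return nil
	}
	return downstreamError{err: err}
}

// IsDownstream reports whether err, or any error it wraps, carries the
// downstream marker.
func IsDownstream(err error) bool {
	var de downstreamError
	return errors.As(err, &de)
}
```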

Important note

This PR implements things slightly differently from what the text of the RFC describes.

If a pipeline contains components A → B and an error occurs in B, this PR records:

  • `otelcol.component.outcome = failure` in the `otelcol.*.consumed.*` metric for B
  • `otelcol.component.outcome = refused` in the `otelcol.*.produced.*` metric for A

whereas the RFC would set both outcomes to `failure`.

This is programmatically simpler — no need to have different behavior between the obsconsumer around the output of A and the one around the input of B — but more importantly, I think it is clearer for users as well: `outcome = failure` only occurs on metrics associated with the component where the failure actually occurred.
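As a rough sketch of this outcome mapping (the function `outcomeFor` and the `isDownstream` predicate are hypothetical names standing in for the helpers sketched above, not the PR's actual obsconsumer code):

```go
// Hypothetical sketch of the outcome mapping described above; not the PR's
// actual obsconsumer code.
package obsconsumer

// outcomeFor maps the error returned by the next consumer to the value of the
// otelcol.component.outcome attribute. isDownstream stands in for a helper
// like the IsDownstream sketch above.
func outcomeFor(err error, isDownstream func(error) bool) string {
	switch {
	case err == nil:
		return "success"
	case isDownstream(err):
		// The failure happened in a later component; this component's data
		// was merely refused by its consumer.
		return "refused"
	default:
		// The next component itself failed. The instrumentation layer would
		// then wrap the error as downstream before returning it, so layers
		// further upstream record "refused" rather than "failure".
		return "failure"
	}
}
```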

This subtlety wasn't discussed in depth in #11956, which introduced `outcome = refused`, so I took the liberty of making this change. If necessary, I can file another RFC amendment to match, or, if there are objections, implement the RFC as written instead.

Link to tracking issue

Fixes #12676

Testing

I've updated the existing tests in obsconsumer to expect a downstream-wrapped error to exit the obsconsumer layer. I may add more tests later.
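For example, such a test could look roughly like this (a sketch with hypothetical helper names mirroring the wrapper above, not the actual test code):

```go
// Hypothetical test sketch: an error returned through the instrumentation
// layer should exit wrapped as a downstream error. Names are illustrative.
package obsconsumer

import (
	"errors"
	"testing"
)

type downstreamError struct{ err error }

func (e downstreamError) Error() string { return e.err.Error() }
func (e downstreamError) Unwrap() error { return e.err }

// wrapDownstream stands in for the error wrapping an obsconsumer would apply
// before returning an error from the next component.
func wrapDownstream(err error) error {
	if err == nil {
		return nil
	}
	return downstreamError{err: err}
}

func TestErrorExitsWrappedAsDownstream(t *testing.T) {
	nextErr := errors.New("next component failed")
	got := wrapDownstream(nextErr)

	var de downstreamError
	if !errors.As(got, &de) {
		t.Fatalf("expected a downstream-wrapped error, got %v", got)
	}
}
```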

Documentation

None.

codecov bot commented Jun 18, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.45%. Comparing base (f68d710) to head (61ba663).
Report is 14 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #13234   +/-   ##
=======================================
  Coverage   91.44%   91.45%           
=======================================
  Files         533      534    +1     
  Lines       29564    29596   +32     
=======================================
+ Hits        27034    27066   +32     
  Misses       1998     1998           
  Partials      532      532           


github-actions bot commented Jul 8, 2025

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jul 8, 2025
github-merge-queue bot pushed a commit that referenced this pull request Jul 9, 2025
#### Description

This PR updates the [Pipeline Component Telemetry
RFC](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/rfcs/component-universal-telemetry.md)
with the following changes:
- Reflect implementation choices that have been made since the RFC was
  written:
  1. using instrumentation scope attributes instead of datapoint attributes
     to identify component instances (see discussion in #12217 and
     open-telemetry/opentelemetry-go#6404)
  2. automatically injecting these attributes, without changes to component
     code
  3. changing the instrumentation scope name used for pipeline metrics
- Slightly change the semantics of `outcome = refused`:

The current planned behavior (from #11956) is that, in the case of a
pipeline A → B where component B returns an error, the "consumed" metric
for B and the "produced" metric for A should both have `outcome =
failure`.

I fear that this may lead users to think that a failure occurred in A,
and would like to restrict `outcome = failure` to only be associated
with the component that "failed", i.e. component B. The "produced" metric
associated with A would instead have `outcome = refused`.

This incidentally makes implementation slightly easier, since an
instrumentation layer will not need different error wrapping behavior
between the "producer" layer and the "consumer" layer.

See draft PR #13234 for an example implementation.

As this is a non-trivial change to an RFC, it may need to follow the RFC
process.

Co-authored-by: Alex Boten <[email protected]>
@mx-psi
Member

mx-psi commented Jul 10, 2025

Is this ready for review now that the RFC amendment has been merged?

@jade-guiton-dd
Contributor Author

Not quite, I'm thinking of adding some additional tests. Not sure when I'll have the time to get to it though 😅

@jade-guiton-dd jade-guiton-dd marked this pull request as ready for review July 15, 2025 16:39
@jade-guiton-dd jade-guiton-dd requested a review from a team as a code owner July 15, 2025 16:39
@jade-guiton-dd jade-guiton-dd requested a review from dmitryax July 15, 2025 16:39
@jade-guiton-dd
Contributor Author

Looks like the contrib failures are due to #13364. This should be ready for review.

Member

@mx-psi mx-psi left a comment

LGTM. @evan-bradley could you review the consumererror bits?

@mx-psi mx-psi requested a review from evan-bradley July 18, 2025 10:22
@jade-guiton-dd
Contributor Author

I think I've addressed your comments, could you take another look @evan-bradley?

Contributor

@evan-bradley evan-bradley left a comment

Looks good to me. Thanks for your patience with all my questions. 🙂

@jade-guiton-dd
Contributor Author

We discussed whether `errors.Join(downstream, notDownstream)` should be considered downstream or not during the Collector stability meeting (see this comment thread for context). Some points that were raised:

  • A more idiomatic way to add context to a downstream error would be `fmt.Errorf("error in <context>: %w", downstream)`, so it's not clear that we should support the "additional context" interpretation of `errors.Join`.
  • If a component experiences an issue, but still succeeds in forwarding data to the next component, that issue was clearly not fatal to the processing of that payload. It should thus probably be surfaced as a warning log, not as an error returned to the caller with `errors.Join`. So it's not clear that we should support the "unrelated failure" interpretation of `errors.Join` either.
  • Given that we can't really think of an idiomatic use case for this `errors.Join` call, we may want to consider it "undefined behavior", and default to the simplest implementation using `errors.As` (see the sketch below). This is the current state of the PR.

I don't remember very well, but I think @dmitryax raised the point that perhaps we should instead default to the interpretation which is least likely to "hide" internal failures. Do you have objections to the current logic in the PR?
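
For reference, a small self-contained example of how an `errors.As`-based check classifies this edge case, assuming a wrapper type like the hypothetical `downstreamError` sketched in the PR description (not the PR's actual code):

```go
// Self-contained illustration of the errors.Join edge case discussed above,
// using a hypothetical downstreamError wrapper; not the PR's actual code.
package main

import (
	"errors"
	"fmt"
)

type downstreamError struct{ err error }

func (e downstreamError) Error() string { return e.err.Error() }
func (e downstreamError) Unwrap() error { return e.err }

func main() {
	downstream := downstreamError{err: errors.New("exporter failed")}
	local := errors.New("processor-internal problem")

	joined := errors.Join(downstream, local)

	// errors.As walks every error in the join, so the combined error is
	// classified as downstream even though it also carries a local failure.
	var de downstreamError
	fmt.Println(errors.As(joined, &de)) // prints: true
}
```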

@mx-psi
Copy link
Member

mx-psi commented Jul 23, 2025

@open-telemetry/collector-approvers Based on the above comment I think we can merge this by EOW (after the merge conflicts are resolved) unless there are objections. If we explicitly consider this edge case to be unspecified behavior, I think we can go ahead with the choice we made.

@mx-psi mx-psi added this pull request to the merge queue Jul 25, 2025
Merged via the queue into open-telemetry:main with commit 545866f Jul 25, 2025
56 checks passed