[RFC] - Configuration Merging revamped #13256

Open
wants to merge 10 commits into
base: main

Contributor

@VihasMakwana VihasMakwana commented Jun 24, 2025

Description

This RFC is a follow-up to #12097. The first PR introduced the feature gate to merge the components' lists and left out the options to configure the merging behaviour.

This RFC proposes an approach to extend the current behaviour by enabling merging of specified config parts and supporting different merge modes.

Link to tracking issue

Relates:

Thanks to @mx-psi @dmitryax and @evan-bradley for their feedback on the first PR!!

codecov bot commented Jun 24, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.64%. Comparing base (3bcdf80) to head (74862cb).

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13256      +/-   ##
==========================================
+ Coverage   91.62%   91.64%   +0.01%     
==========================================
  Files         522      522              
  Lines       29208    29208              
==========================================
+ Hits        26763    26767       +4     
+ Misses       1926     1923       -3     
+ Partials      519      518       -1     

Member

@douglascamata douglascamata left a comment

Good stuff, @VihasMakwana. I like this. I left comments but they are non-blockers to me.


## Open questions

- What to do if an invalid option is provided for `merge_mode` or `merge_paths`?

Member

My personal opinion on this is that we should error out and not start the Collector. If we log an error and merge the default way, the resulting configuration might behave very differently from what the user expected.

Contributor Author

Yeah, I agree.
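
A minimal sketch of the fail-fast behaviour agreed on above, assuming the two modes currently named in the RFC text (`prepend` and `append`). The helper name `validateMergeMode` is hypothetical and not part of the Collector codebase.

```go
package main

import "fmt"

// validModes lists the modes named in the RFC text under review.
var validModes = map[string]bool{"append": true, "prepend": true}

// validateMergeMode is a hypothetical helper: it rejects unknown values
// instead of silently falling back to the default merge behaviour.
func validateMergeMode(mode string) error {
	if !validModes[mode] {
		return fmt.Errorf("invalid merge_mode %q: must be one of append, prepend", mode)
	}
	return nil
}

func main() {
	for _, mode := range []string{"append", "apend"} {
		if err := validateMergeMode(mode); err != nil {
			// A real caller would return this error and abort startup.
			fmt.Println("refusing to start:", err)
			continue
		}
		fmt.Println("merge_mode accepted:", mode)
	}
}
```

Returning the error to the caller, rather than logging and continuing, keeps a typo in the option from producing a silently different configuration.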

Comment on lines +39 to +48
We will support new parameters to config URIs as follows:
1. `merge_paths`: A comma-separated list of glob patterns used during config merging.
   - This setting controls which paths the user wants to merge from the given config.
   - Example:
     - `otelcol --config main.yaml --config extra.yaml?merge_paths=service::extensions,service::**::receivers`
       - In this example, we will merge the lists of extensions and receivers from the pipelines, excluding lists in the rest of the config.
     - `otelcol --config main.yaml --config ext.yaml?merge_paths=service::extensions --config rec.yaml?merge_paths=service::**::receivers`
       - In this example, we will merge the list of extensions from `ext.yaml` and the list of receivers from `rec.yaml`, excluding lists in the rest of the config.
2. `merge_mode`: One of `prepend` or `append`.
   - This setting controls the ordering of the merged list.

Member

Do you know how technically feasible this is today, @VihasMakwana? I am thinking about questions like:

  • Does koanf support this syntax of URI parameters in the file paths? Or will we have to "pre-process" the CLI args before passing them to koanf?
  • Can we support these glob-like patterns with **? Maybe we should start simpler to leave some room for experimentation before adding more advanced path logic?

Contributor Author

@VihasMakwana VihasMakwana Jun 26, 2025

> Does koanf support this syntax of the URI parameters in the file paths? Or will we have to "pre-process" the CLI args before passing them to koanf?

We will do the preprocessing and merge as per the user's CLI args.

> Can we support these glob-like patterns of **? Maybe we should start simpler to have some room for experimentation before adding more advanced path logic?

We already have logic in place that does this, but only for components:

patterns := []string{
	"service::extensions",
	"service::**::receivers",
	"service::**::exporters",
}
var globs []glob.Glob
for _, p := range patterns {
	// Patterns that fail to compile are skipped.
	if g, err := glob.Compile(p); err == nil {
		globs = append(globs, g)
	}
}

TL;DR:

  • It constructs the globs.
  • We flatten the map, go through each key, and match it against the globs.
  • If a match is found, we merge the lists.
  • Otherwise, we continue and don't touch the unmatched key.

       - In this example, we will merge the lists of extensions and receivers from the pipelines, excluding lists in the rest of the config.
     - `otelcol --config main.yaml --config ext.yaml?merge_paths=service::extensions --config rec.yaml?merge_paths=service::**::receivers`
       - In this example, we will merge the list of extensions from `ext.yaml` and the list of receivers from `rec.yaml`, excluding lists in the rest of the config.
2. `merge_mode`: One of `prepend` or `append`.

Member

@dmitryax dmitryax Jun 26, 2025

We probably need more options. For example, I may want to add a receiver in a merging config when it may already be defined in the base config. We also need the default replace behavior. What about the following list:

  • append: append entries to the list
  • append_unique: append only unique entries to the list
  • prepend: insert entries in front of the list
  • prepend_unique: insert only unique entries in front of the list
  • replace (default): overwrite the list
  • keep: keep the original list

Also, I would call the options list_merge_mode and list_merge_paths to make it clearer.

Contributor Author

@dmitryax I see. I'll update the RFC

Member

@douglascamata douglascamata Jun 27, 2025

I think you touch on a good point there with the _unique modifiers, @dmitryax. Shouldn't both append and prepend always ensure that lists containing only "basic types" (strings, ints, etc.) have no duplicated values? I don't know of any scenario in the Collector involving lists of basic types where duplicated values wouldn't be a problem.

Contributor Author

For service components, duplicates will cause an error.
I'm not sure about other places, though.

1. `merge_paths`: A comma-separated list of glob patterns used during config merging.
   - This setting controls which paths the user wants to merge from the given config.
   - Example:
     - `otelcol --config main.yaml --config extra.yaml?merge_paths=service::extensions,service::**::receivers`

Member

I think using the query params is fine. However, we need to keep in mind that some confmap providers use the same pattern to pass extra information. With this approach, they won't be able to use these keys which is probably fine.

If anyone else has any other ideas, please share.

cc @open-telemetry/collector-approvers

Contributor Author

@VihasMakwana VihasMakwana Jun 27, 2025

Yeah, we can extract the parameters here:

for _, uri := range mr.uris {

and process them for our merge function.

Then we can remove our parameters from the URI and let the resolver do its job:

u, _ := url.Parse("otel.yaml?key=val&merge_mode=append")
q := u.Query()
q.Del("merge_mode") // drop the merge-specific parameter
u.RawQuery = q.Encode()
// u.String() is now "otel.yaml?key=val"

Contributor

We discussed this in this RFC and came up with a few issues with query parameters that we'll want to consider before going that route.

@dmitryax I believe you, @mx-psi, and I had an offline conversation at KubeCon and tacitly agreed we should go the route of configuring these inside a config file (this option in the RFC). Do you still think we should go this route? I think we should probably at least explore it before deciding on an approach.

Member

mx-psi commented Jun 27, 2025

I haven't had time to look into this proposal in detail, but one thing I would like to see explored is whether we can use YAML custom tags (https://yaml.org/spec/1.2.2/#tags) to specify this in the YAML itself, per array.

@VihasMakwana
Contributor Author

@mx-psi Thanks for sharing your thoughts! I'll go through the documentation and see if they can be integrated. I'm all in if we can get it to work through just YAML.

Contributor Author

VihasMakwana commented Jul 1, 2025

@mx-psi I had a brief look at YAML tags and their support in Go. Here are my thoughts:

  1. We rely on koanf for configuration management, and it deals with map[string]any. It does not support the concept of YAML tags. To implement this feature using YAML tags, we need to extract the tags somewhere around here:
    // NewRetrievedFromYAML returns a new Retrieved instance that contains the deserialized data from the yaml bytes.
    // * yamlBytes the yaml bytes that will be deserialized.
    // * opts specifies options associated with this Retrieved value, such as CloseFunc.
    func NewRetrievedFromYAML(yamlBytes []byte, opts ...RetrievedOption) (*Retrieved, error) {
    	var rawConf any
    	if err := yaml.Unmarshal(yamlBytes, &rawConf); err != nil {
    • One thing to note: We use https://pkg.go.dev/sigs.k8s.io/yaml, which is a fork of gopkg.in/yaml.v3. I believe it should support YAML tags. I'll get back to you if I face any blockers.
    • Pseudo code for extracting tags:
      var node yaml.Node
      err := yaml.Unmarshal(data, &node)
      // Recursively walk the node and its children to find the nodes with tags and store their paths.
  2. I still need to create a proof of concept to be sure it works. I'm planning to update the RFC with both approaches and decide on the best one after evaluating the pros and cons of each.

Here are some initial pros and cons that come to mind:

  1. Pros of yaml tag approach:

    • Straightforward from the end user's perspective. No need for extra CLI flags; everything can be specified in configuration files.
  2. Cons of yaml tag approach:

    • More preprocessing:
      • As koanf converts the given config into map[string]any, we need to do the preprocessing of tags before converting it to a koanf instance or else we'll lose the tags.
      • We need to go down the yaml tree to extract nodes with tags.

Let me know your thoughts!

@VihasMakwana
Contributor Author

@mx-psi I just saw that you raised #13308. That makes things a lot easier for YAML tags. Thanks!
