
WIP: Add the OTEL collector service #67

Draft
wants to merge 5 commits into
base: main

ezimuel (Collaborator) commented Jul 8, 2025

This PR adds the OTEL collector to start-local, enabled with the --otel option. It uses the otel_collector Docker service reported here.
This PR should address the request in #55.

To test the --otel option we can run the following command:

curl -fsSL https://raw.githubusercontent.com/elastic/start-local/refs/heads/feature/otel/start-local.sh | sh -s -- --otel

To be done:

  • test the OTEL collector in tests
  • document the --otel option in README.md


exporters:
  elasticsearch:
    endpoint: http://elasticsearch:9200

Member

Suggested change
- endpoint: http://elasticsearch:9200
+ endpoints: [ "http://elasticsearch:9200" ]

Shouldn't this be plural?

Collaborator Author

I just copied and pasted it from here, where it's singular.

xrmx (Member), Jul 8, 2025

The source of truth should be the collector documentation:
https://www.elastic.co/docs/reference/opentelemetry/edot-collector/config/default-config-standalone

This is the example config for the collector in agent mode (i.e. it exports to ES), which also collects logs and generates metrics from the host: https://raw.githubusercontent.com/elastic/elastic-agent/refs/tags/v9.0.3/internal/pkg/otel/samples/linux/logs_metrics_traces.yml

I think it's a bit too much, but maybe hostmetrics are a good source of data for some smoke testing, i.e. you can check that you can query them in Elasticsearch.
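
A rough sketch of what such a smoke test could look like, assuming the metrics land in indices matching `metrics-*` and that `ES_LOCAL_URL` and `ES_LOCAL_PASSWORD` are available in the environment. The index pattern, the `elastic` user, and both helper names are assumptions for illustration, not something this PR defines.

```shell
#!/bin/sh
# Sketch of a smoke test: ask Elasticsearch how many metric documents exist.
# Assumptions (not from the PR): the `metrics-*` index pattern, the `elastic`
# user, and the ES_LOCAL_URL / ES_LOCAL_PASSWORD variables being set.

# Extract the numeric "count" field from a _count API response.
extract_count() {
  printf '%s' "$1" | sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Return success if at least one metrics document is present.
has_metrics() {
  response=$(curl -s -u "elastic:${ES_LOCAL_PASSWORD}" "${ES_LOCAL_URL}/metrics-*/_count")
  count=$(extract_count "$response")
  [ "${count:-0}" -gt 0 ]
}
```

A test could then call `has_metrics && echo "hostmetrics are queryable"` once the collector has had time to export a first batch.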

Collaborator Author

I need help with this configuration for docker-compose.yml. The goal is to offer default settings for the EDOT Collector (Standalone) that can be used locally to start using the Observability stack in Elastic. I need the config here for the service and the Docker specification here.

@rogercoll let's chat to get your support on generating a configuration that lets us deploy the EDOT collector as a gateway for the start-local effort. So far I have referenced the one we use for the K8s gateway, but we might need to hardcode the endpoints and enrichment processors.

Both the endpoint and endpoints settings are supported by the Elasticsearch exporter. The first exists because the exporter embeds the upstream confighttp configuration: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/elasticsearchexporter/config.go#L69C13-L69C25

The latter was added earlier for alignment with the elasticsearch-go client configuration: https://github.com/elastic/go-elasticsearch/blob/main/elasticsearch.go#L71

Not a strong opinion, but maybe use endpoints, since it is the one used in our public docs: https://www.elastic.co/docs/reference/opentelemetry/edot-collector/config/default-config-standalone#data-export

SylvainJuge (Member)

Do you plan to also generate a dedicated API key for otel ingestion, as is currently done for ES clients? Without that, users would have to perform another manual step to generate the API key before being able to send otel data to the otel collector.

ezimuel and others added 2 commits July 8, 2025 16:40
Co-authored-by: Riccardo Magliocchetti <[email protected]>
Co-authored-by: Riccardo Magliocchetti <[email protected]>

ezimuel (Collaborator, Author) commented Jul 8, 2025

@SylvainJuge is this API key different from the ES one? Can you point me to any documentation about it? Thanks.

I don't know if it's different, but there is a dedicated UI for it in Kibana. You can also use https://www.elastic.co/docs/api/doc/kibana/operation/operation-createagentkey to create a new one for APM agents (which also includes EDOT SDKs).
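
Whether otel ingestion needs a key distinct from the ES one is left open in this thread. If a dedicated key were generated, one possible shape is sketched below. It assumes the Elasticsearch create API key endpoint (`POST /_security/api_key`) rather than the Kibana agent-key endpoint linked above, and the helper names, the `elastic` user, and the `ES_LOCAL_URL` / `ES_LOCAL_PASSWORD` variables are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: request a dedicated API key from Elasticsearch.
# Assumptions: ES_LOCAL_URL and ES_LOCAL_PASSWORD are set, and the key name
# passed in (for example "otel-ingest") is a hypothetical label.

# Build the JSON body for the create API key request.
api_key_body() {
  printf '{"name":"%s"}' "$1"
}

# Send the request; the JSON response is printed to stdout.
create_otel_api_key() {
  curl -s -u "elastic:${ES_LOCAL_PASSWORD}" \
    -X POST "${ES_LOCAL_URL}/_security/api_key" \
    -H "Content-Type: application/json" \
    -d "$(api_key_body "$1")"
}
```

Calling `create_otel_api_key otel-ingest` would return a JSON document from which the script could extract the key, in the same way it handles the key for ES clients.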

if [ -z "${esonly:-}" ]; then
  if [ "$otel" = "true" ]; then
    cat >> .env <<- EOM
ES_LOCAL_JAVA_OPTS="-Xms2g -Xmx2g"

Do we set Xms to 2GB intentionally? I don't think we'll compare well to Grafana's stack and others like it (we would look "big", "bloated", ...). IMO we should still initialize the heap as small as possible, with the necessary room to grow if needed, even if we take a small performance hit when the heap has to grow.

Collaborator Author

@xeraa this is the default configuration proposed here. I'm not an expert on the EDOT collector, and I have asked the OTel team to help with this.

Yeah, IMO this can (should) use the same settings and general approach used for the other options. We still want this to start really lightweight.
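
To make the "start small, grow if needed" idea concrete, here is a minimal sketch in the script's heredoc style. The `-Xms128m -Xmx2g` values are illustrative assumptions rather than values agreed in this thread, and the sketch writes to a scratch file instead of the real .env.

```shell
#!/bin/sh
# Sketch: a small initial heap (-Xms) with a larger ceiling (-Xmx), so the
# JVM starts lightweight and only grows when it has to.
# The 128m / 2g values and the scratch file are illustrative assumptions.
env_file="$(mktemp)"

cat >> "$env_file" <<EOM
ES_LOCAL_JAVA_OPTS="-Xms128m -Xmx2g"
EOM

cat "$env_file"
```

With this shape, the --otel branch would not need its own heap setting unless the otel workload proves to need a larger ceiling.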

mlunadia left a comment

Added comments about the image used and the collector config.

if [ "$otel" = "true" ]; then
  cat >> uninstall.sh <<- EOM
if docker rmi docker.elastic.co/elastic-agent/elastic-otel-collector:${es_version} >/dev/null 2>&1; then
  echo "Image docker.elastic.co/elastic-agent/elastic-otel-collector:${es_version} removed successfully"

@ezimuel we should use the Elastic Agent image and trigger otel mode; there is a flag that makes the Elastic Agent container start in otel mode (= EDOT).

@mlunadia Are we interested in any Elastic Agent binaries/features other than the Otel collector? If we are only interested in the EDOT collector, I would recommend using the elastic-agent/elastic-otel-collector image, as it is smaller than the main elastic-agent image: elastic/elastic-agent#7173

mlunadia, Aug 7, 2025

Just the EDOT collector for start-local; for edge we might add examples later.

# Add the OTLP configs in docker-compose.yml
cat >> docker-compose.yml <<-'EOM'
configs:
# This is the minimal yaml configuration needed to listen on all interfaces

We'll need to work on this config, as we are missing some of the core processors like batch. As a reference, we can start from this configuration, with the exception of the inframetrics processor, which is marked for removal in it: https://github.com/elastic/elastic-agent/blob/main/internal/pkg/otel/samples/linux/gateway.yml

connectors:
  elasticapm:

processors:

Could we add a couple of batch processor configurations: batch for the logs and traces pipelines, and batch/metrics for the metrics pipelines? Sample configuration: https://github.com/elastic/elastic-agent/blob/main/internal/pkg/otel/samples/linux/gateway.yml#L38-L44

(note that they need to be referenced in the pipeline's configuration too)
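
As a sketch of what this could look like in the script's heredoc style: two batch processors, each referenced from its pipeline. The numeric values and the `otlp` / `elasticsearch` component names in the pipelines are assumptions for illustration, and this is only a fragment (receivers and exporters are not defined here), written to a scratch file rather than docker-compose.yml.

```shell
#!/bin/sh
# Sketch: a collector config fragment with `batch` for logs/traces and
# `batch/metrics` for metrics. Values and pipeline component names are
# illustrative assumptions, not the final start-local configuration.
config_file="$(mktemp)"

cat > "$config_file" <<'EOM'
processors:
  batch:
    send_batch_size: 1000
    timeout: 1s
    send_batch_max_size: 1500
  batch/metrics:
    send_batch_max_size: 0
    timeout: 1s

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [elasticsearch]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [elasticsearch]
    metrics:
      receivers: [otlp]
      processors: [batch/metrics]
      exporters: [elasticsearch]
EOM

cat "$config_file"
```

Defining a processor without listing it under a pipeline's `processors` has no effect, which is why both halves appear together.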

Comment on lines +765 to +770
logs_dynamic_index:
  enabled: true
metrics_dynamic_index:
  enabled: true
traces_dynamic_index:
  enabled: true

This should be removed, as it is deprecated:

No-op. Documents are now always routed dynamically unless logs_index is not empty. Will be removed in a future version.

https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/elasticsearchexporter/README.md#elasticsearch-document-routing

@@ -93,6 +106,8 @@ startup() {
kibana_container_name="kibana-local-dev${ES_LOCAL_DIR:+-${ES_LOCAL_DIR}}"
# Kibana settings container name
kibana_settings_container_name="kibana-local-settings${ES_LOCAL_DIR:+-${ES_LOCAL_DIR}}"
# OTEL container name
otel_container_name="otel-collector${ES_LOCAL_DIR:+-${ES_LOCAL_DIR}}"

What do you think of renaming the container to elastic-otel-collector or edot-collector, to differentiate it from the upstream image?

Collaborator Author

We can have different services depending on the version (--v option), so we need different names to avoid conflicts.

Sorry, I should have worded that better. My suggestion is to include the edot keyword in the container name prefix:

Suggested change
- otel_container_name="otel-collector${ES_LOCAL_DIR:+-${ES_LOCAL_DIR}}"
+ otel_container_name="edot-collector${ES_LOCAL_DIR:+-${ES_LOCAL_DIR}}"

(EDOT = Elastic Distributions of OpenTelemetry)
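
For readers less familiar with the `${VAR:+...}` expansion used here, a small sketch of how the name is built. The helper name `edot_container_name` and the sample directory value are hypothetical; the point is that the suffix is only appended when the directory variable is set and non-empty, which is what keeps names distinct across installations.

```shell
#!/bin/sh
# Sketch: ${dir:+-${dir}} expands to "-<dir>" when dir is non-empty,
# and to nothing otherwise. The helper name is hypothetical.
edot_container_name() {
  dir="$1"
  printf '%s\n' "edot-collector${dir:+-${dir}}"
}

edot_container_name ""        # prints: edot-collector
edot_container_name "v9-test" # prints: edot-collector-v9-test
```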

processors:
  elastictrace:

exporters:

Could we add the debug exporter with the default configuration? That is really helpful to quickly check if data is being received by the collector, similar to https://github.com/elastic/elastic-agent/blob/main/internal/pkg/otel/samples/linux/gateway.yml#L48
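
A minimal sketch of this request, again as a heredoc fragment written to a scratch file: the debug exporter is declared with no settings (so it uses its defaults) and listed next to the Elasticsearch exporter in a pipeline. The `otlp` receiver name and the single logs pipeline are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: declare the debug exporter with its default configuration and
# reference it from a pipeline. Component names in the pipeline are
# illustrative assumptions; this is a fragment, not a full collector config.
config_file="$(mktemp)"

cat > "$config_file" <<'EOM'
exporters:
  debug:
  elasticsearch:
    endpoints: [ "http://elasticsearch:9200" ]

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [debug, elasticsearch]
EOM

cat "$config_file"
```

With the debug exporter in a pipeline, received data shows up in the collector's own log output, which can be read from the container logs.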

ezimuel and others added 2 commits August 7, 2025 16:34
Co-authored-by: Roger Coll <[email protected]>
Co-authored-by: Roger Coll <[email protected]>
6 participants