Description
In our environment, we are asked not to put the ingestion token in plaintext in the SPLUNK_ACCESS_TOKEN environment variable, since anyone who can describe the Lambda via the AWS console or APIs can read it. To work around this, we created our own Lambda layer containing a Lambda exec wrapper that wraps the Splunk-provided wrapper. Our wrapper expects an AWS Secrets Manager ARN as an environment variable; it fetches the secret, parses out the token, and sets the SPLUNK_ACCESS_TOKEN environment variable. It then calls the Splunk wrapper so startup continues as normal.
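For illustration, here is a minimal sketch of what our exec wrapper does; the environment variable name, the secret's JSON key, and the path to the Splunk-provided wrapper are assumptions for this example, not the exact values from our layer.

#!/usr/bin/env python3
# Sketch of our custom exec wrapper (illustrative names, not the real script).
# It resolves the token from Secrets Manager, exports SPLUNK_ACCESS_TOKEN, and
# then chains to the Splunk-provided wrapper so initialization continues as normal.
import json
import os
import sys

import boto3


def main() -> None:
    # SPLUNK_TOKEN_SECRET_ARN is our own variable name (an assumption for this sketch).
    secret_arn = os.environ["SPLUNK_TOKEN_SECRET_ARN"]
    secret = boto3.client("secretsmanager").get_secret_value(SecretId=secret_arn)["SecretString"]
    # We store the secret as JSON, e.g. {"token": "..."}; the key name is illustrative.
    os.environ["SPLUNK_ACCESS_TOKEN"] = json.loads(secret)["token"]
    # Hand off to the Splunk wrapper with the original runtime command line.
    # The wrapper path below is assumed; substitute the path shipped in the Splunk layer.
    splunk_wrapper = "/opt/otel-instrument"
    os.execv(splunk_wrapper, [splunk_wrapper, *sys.argv[1:]])


if __name__ == "__main__":
    main()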
The change in #114 has broken this flow for us. It looks like the OTEL collector starts up before our own wrapper is able to execute and set the environment variable.
Is there a way we can delay the OTEL collector starting up? Is there another way to keep the token secret and out of the AWS Lambda console as plaintext?
Or could a mechanism be added to the Lambda layer that fetches the token from a secret supplied as an environment variable? The script could either use the plaintext value of the secret, or expect JSON and use syntax similar to AWS ECS, where the reference specifies which key of the JSON secret to pull the token from, e.g. arn:aws:secretsmanager:region:aws_account_id:secret:secret-name:json-key:version-stage:version-id.
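A rough sketch of what we have in mind, assuming the layer read a hypothetical environment variable (say, SPLUNK_ACCESS_TOKEN_SECRET_ARN) holding either a plain Secrets Manager ARN or the ECS-style extended form:

# Sketch only: resolves a Secrets Manager reference in either the plain ARN form or the
# ECS-style extended form arn:...:secret-name:json-key:version-stage:version-id.
import json

import boto3


def resolve_token(secret_ref: str) -> str:
    parts = secret_ref.split(":")
    json_key = version_stage = version_id = ""
    if len(parts) > 7:
        # A plain secret ARN has 7 colon-separated fields; anything after that follows
        # the ECS convention: json-key, version-stage, version-id (each may be empty).
        json_key, version_stage, version_id = (parts[7:10] + ["", "", ""])[:3]
        secret_ref = ":".join(parts[:7])

    kwargs = {"SecretId": secret_ref}
    if version_stage:
        kwargs["VersionStage"] = version_stage
    if version_id:
        kwargs["VersionId"] = version_id
    secret = boto3.client("secretsmanager").get_secret_value(**kwargs)["SecretString"]

    # Plaintext secret: use the value directly. JSON secret: pull the requested key.
    return json.loads(secret)[json_key] if json_key else secret

# The layer would then export the result as SPLUNK_ACCESS_TOKEN before starting the collector.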
Our wrapper mechanism works with arn:aws:lambda:us-east-2:254067382080:layer:splunk-apm:222. Trying it with the latest version of the Lambda layer, arn:aws:lambda:us-east-2:254067382080:layer:splunk-apm:365, this is what we see in the logs.
Lambda starts up
INIT_START Runtime Version: python:3.9.v18 Runtime Version ARN: arn:aws:lambda:us-east-2::runtime:edb5a058bfa782cb9cedc6d534ac8b8c193bc28e9a9879d9f5ebaaf619cd0fc0
We see this error, which we have always gotten and which doesn't seem to cause a problem, but it would be nice if it went away.
2023/03/23 01:13:16 [ERROR] Exporter endpoint must be set when SPLUNK_REALM is not set. To export data, set either a realm and access token or a custom exporter endpoint.
The commit SHA of the Splunk wrapper is logged.
[splunk-extension-wrapper] splunk-extension-wrapper, version: 4552de7
The OTEL collector listening on localhost starts up successfully. At this point, SPLUNK_ACCESS_TOKEN has not yet been set in our case.
{
"level": "info",
"ts": 1679533996.8630877,
"msg": "Launching OpenTelemetry Lambda extension",
"version": "v0.69.1"
}
{
"level": "info",
"ts": 1679533996.8672311,
"logger": "telemetryAPI.Listener",
"msg": "Listening for requests",
"address": "sandbox:53612"
}
{
"level": "info",
"ts": 1679533996.8673244,
"logger": "telemetryAPI.Client",
"msg": "Subscribing",
"baseURL": "http://127.0.0.1:9001/2022-07-01/telemetry"
}
TELEMETRY Name: collector State: Subscribed Types: [Platform]
{
"level": "info",
"ts": 1679533996.8688502,
"logger": "telemetryAPI.Client",
"msg": "Subscription success",
"response": "\"OK\""
}
{
"level": "info",
"ts": 1679533996.874017,
"caller": "service/telemetry.go:90",
"msg": "Setting up own telemetry..."
}
{
"level": "Basic",
"ts": 1679533996.8743467,
"caller": "service/telemetry.go:116",
"msg": "Serving Prometheus metrics",
"address": ":8888"
}
{
"level": "info",
"ts": 1679533996.8772216,
"caller": "service/service.go:128",
"msg": "Starting otelcol-lambda...",
"Version": "v0.69.1",
"NumCPU": 2
}
{
"level": "info",
"ts": 1679533996.8773112,
"caller": "extensions/extensions.go:41",
"msg": "Starting extensions..."
}
{
"level": "info",
"ts": 1679533996.8773668,
"caller": "service/pipelines.go:86",
"msg": "Starting exporters..."
}
{
"level": "info",
"ts": 1679533996.877425,
"caller": "service/pipelines.go:90",
"msg": "Exporter is starting...",
"kind": "exporter",
"data_type": "traces",
"name": "otlphttp"
}
{
"level": "info",
"ts": 1679533996.8788476,
"caller": "service/pipelines.go:94",
"msg": "Exporter started.",
"kind": "exporter",
"data_type": "traces",
"name": "otlphttp"
}
{
"level": "info",
"ts": 1679533996.8789244,
"caller": "service/pipelines.go:98",
"msg": "Starting processors..."
}
{
"level": "info",
"ts": 1679533996.8789926,
"caller": "service/pipelines.go:110",
"msg": "Starting receivers..."
}
{
"level": "info",
"ts": 1679533996.8790362,
"caller": "service/pipelines.go:114",
"msg": "Receiver is starting...",
"kind": "receiver",
"name": "otlp",
"pipeline": "traces"
}
{
"level": "warn",
"ts": 1679533996.8790877,
"caller": "internal/warning.go:51",
"msg": "Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks",
"kind": "receiver",
"name": "otlp",
"pipeline": "traces",
"documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"
}
{
"level": "info",
"ts": 1679533996.8791919,
"caller": "[email protected]/otlp.go:94",
"msg": "Starting GRPC server",
"kind": "receiver",
"name": "otlp",
"pipeline": "traces",
"endpoint": "0.0.0.0:4317"
}
{
"level": "warn",
"ts": 1679533996.8792677,
"caller": "internal/warning.go:51",
"msg": "Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks",
"kind": "receiver",
"name": "otlp",
"pipeline": "traces",
"documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"
}
{
"level": "info",
"ts": 1679533996.8793197,
"caller": "[email protected]/otlp.go:112",
"msg": "Starting HTTP server",
"kind": "receiver",
"name": "otlp",
"pipeline": "traces",
"endpoint": "0.0.0.0:4318"
}
{
"level": "info",
"ts": 1679533996.879386,
"caller": "service/pipelines.go:118",
"msg": "Receiver started.",
"kind": "receiver",
"name": "otlp",
"pipeline": "traces"
}
{
"level": "info",
"ts": 1679533996.8794274,
"caller": "service/service.go:145",
"msg": "Everything is ready. Begin running and processing data."
}
Our own wrapper starts executing, fetching the token from the input secret and setting the SPLUNK_ACCESS_TOKEN environment variable.
[WRAPPER] - INFO - START
[WRAPPER] - INFO - Fetching Splunk token
[WRAPPER] - INFO - Fetching arn:aws:secretsmanager:us-east-2:my-aws-acct-id:secret:splunk-token-secret
[WRAPPER] - INFO - END
The Splunk extension begins executing, as called from our own wrapper. With the change in #114, this script unsets SPLUNK_ACCESS_TOKEN so that traces are sent to the localhost collector, which is assumed to already be set up with the token.
EXTENSION Name: collector State: Ready Events: [INVOKE, SHUTDOWN]
EXTENSION Name: splunk-extension-wrapper State: Ready Events: [INVOKE, SHUTDOWN]
We get a request. Ingesting traces through the localhost collector fails with 401 Unauthorized, and the retries eventually time out the Lambda.
START RequestId: 2bdc5088-8c42-42eb-9013-79f41f191fd4 Version: $LATEST
[WARNING] 2023-03-23T01:13:20.564Z 2bdc5088-8c42-42eb-9013-79f41f191fd4 Invalid type NoneType for attribute value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
{
"level": "error",
"ts": 1679534000.7317784,
"caller": "exporterhelper/queued_retry.go:394",
"msg": "Exporting failed. The error is not retryable. Dropping data.",
"kind": "exporter",
"data_type": "traces",
"name": "otlphttp",
"error": "Permanent error: error exporting items, request to https://ingest.us1.signalfx.com:443/v2/trace/otlp responded with HTTP Status Code 401",
"dropped_items": 8,
"stacktrace": "go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/queued_retry.go:394\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*tracesExporterWithObservability).send\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/traces.go:137\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).send\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/queued_retry.go:294\ngo.opentelemetry.io/collector/exporter/exporterhelper.NewTracesExporter.func2\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/traces.go:116\ngo.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces\n\tgo.opentelemetry.io/collector/[email protected]/traces.go:36\ngo.opentelemetry.io/collector/receiver/otlpreceiver/internal/trace.(*Receiver).Export\n\tgo.opentelemetry.io/collector/receiver/[email protected]/internal/trace/otlp.go:55\ngo.opentelemetry.io/collector/receiver/otlpreceiver.handleTraces\n\tgo.opentelemetry.io/collector/receiver/[email protected]/otlphttp.go:47\ngo.opentelemetry.io/collector/receiver/otlpreceiver.(*otlpReceiver).registerTraceConsumer.func1\n\tgo.opentelemetry.io/collector/receiver/[email protected]/otlp.go:210\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2084\nnet/http.(*ServeMux).ServeHTTP\n\tnet/http/server.go:2462\ngo.opentelemetry.io/collector/config/confighttp.(*decompressor).wrap.func1\n\tgo.opentelemetry.io/[email protected]/config/confighttp/compression.go:162\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2084\ngo.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP\n\tgo.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:210\ngo.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP\n\tgo.opentelemetry.io/[email protected]/config/confighttp/clientinfohandler.go:39\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2916\nnet/http.(*conn).serve\n\tnet/http/server.go:1966"
}
{
"level": "error",
"ts": 1679534000.731938,
"caller": "exporterhelper/queued_retry.go:296",
"msg": "Exporting failed. Dropping data. Try enabling sending_queue to survive temporary failures.",
"kind": "exporter",
"data_type": "traces",
"name": "otlphttp",
"dropped_items": 8,
"stacktrace": "go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).send\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/queued_retry.go:296\ngo.opentelemetry.io/collector/exporter/exporterhelper.NewTracesExporter.func2\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/traces.go:116\ngo.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces\n\tgo.opentelemetry.io/collector/[email protected]/traces.go:36\ngo.opentelemetry.io/collector/receiver/otlpreceiver/internal/trace.(*Receiver).Export\n\tgo.opentelemetry.io/collector/receiver/[email protected]/internal/trace/otlp.go:55\ngo.opentelemetry.io/collector/receiver/otlpreceiver.handleTraces\n\tgo.opentelemetry.io/collector/receiver/[email protected]/otlphttp.go:47\ngo.opentelemetry.io/collector/receiver/otlpreceiver.(*otlpReceiver).registerTraceConsumer.func1\n\tgo.opentelemetry.io/collector/receiver/[email protected]/otlp.go:210\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2084\nnet/http.(*ServeMux).ServeHTTP\n\tnet/http/server.go:2462\ngo.opentelemetry.io/collector/config/confighttp.(*decompressor).wrap.func1\n\tgo.opentelemetry.io/[email protected]/config/confighttp/compression.go:162\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2084\ngo.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP\n\tgo.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:210\ngo.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP\n\tgo.opentelemetry.io/[email protected]/config/confighttp/clientinfohandler.go:39\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2916\nnet/http.(*conn).serve\n\tnet/http/server.go:1966"
}
[WARNING] 2023-03-23T01:13:20.734Z 2bdc5088-8c42-42eb-9013-79f41f191fd4 Transient error Internal Server Error encountered while exporting span batch, retrying in 1s.
[WARNING] 2023-03-23T01:13:21.797Z 2bdc5088-8c42-42eb-9013-79f41f191fd4 Transient error Internal Server Error encountered while exporting span batch, retrying in 2s.
[WARNING] 2023-03-23T01:13:23.856Z 2bdc5088-8c42-42eb-9013-79f41f191fd4 Transient error Internal Server Error encountered while exporting span batch, retrying in 4s.
[WARNING] 2023-03-23T01:13:27.919Z 2bdc5088-8c42-42eb-9013-79f41f191fd4 Transient error Internal Server Error encountered while exporting span batch, retrying in 8s.
{
"level": "error",
"ts": 1679534015.9845555,
"caller": "exporterhelper/queued_retry.go:394",
"msg": "Exporting failed. The error is not retryable. Dropping data.",
"kind": "exporter",
"data_type": "traces",
"name": "otlphttp",
"error": "Permanent error: error exporting items, request to https://ingest.us1.signalfx.com:443/v2/trace/otlp responded with HTTP Status Code 401",
"dropped_items": 8,
"stacktrace": "go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/queued_retry.go:394\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*tracesExporterWithObservability).send\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/traces.go:137\ngo.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).send\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/queued_retry.go:294\ngo.opentelemetry.io/collector/exporter/exporterhelper.NewTracesExporter.func2\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/traces.go:116\ngo.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces\n\tgo.opentelemetry.io/collector/[email protected]/traces.go:36\ngo.opentelemetry.io/collector/receiver/otlpreceiver/internal/trace.(*Receiver).Export\n\tgo.opentelemetry.io/collector/receiver/[email protected]/internal/trace/otlp.go:55\ngo.opentelemetry.io/collector/receiver/otlpreceiver.handleTraces\n\tgo.opentelemetry.io/collector/receiver/[email protected]/otlphttp.go:47\ngo.opentelemetry.io/collector/receiver/otlpreceiver.(*otlpReceiver).registerTraceConsumer.func1\n\tgo.opentelemetry.io/collector/receiver/[email protected]/otlp.go:210\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2084\nnet/http.(*ServeMux).ServeHTTP\n\tnet/http/server.go:2462\ngo.opentelemetry.io/collector/config/confighttp.(*decompressor).wrap.func1\n\tgo.opentelemetry.io/[email protected]/config/confighttp/compression.go:162\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2084\ngo.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP\n\tgo.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:210\ngo.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP\n\tgo.opentelemetry.io/[email protected]/config/confighttp/clientinfohandler.go:39\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2916\nnet/http.(*conn).serve\n\tnet/http/server.go:1966"
}
{
"level": "error",
"ts": 1679534015.984702,
"caller": "exporterhelper/queued_retry.go:296",
"msg": "Exporting failed. Dropping data. Try enabling sending_queue to survive temporary failures.",
"kind": "exporter",
"data_type": "traces",
"name": "otlphttp",
"dropped_items": 8,
"stacktrace": "go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).send\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/queued_retry.go:296\ngo.opentelemetry.io/collector/exporter/exporterhelper.NewTracesExporter.func2\n\tgo.opentelemetry.io/[email protected]/exporter/exporterhelper/traces.go:116\ngo.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces\n\tgo.opentelemetry.io/collector/[email protected]/traces.go:36\ngo.opentelemetry.io/collector/receiver/otlpreceiver/internal/trace.(*Receiver).Export\n\tgo.opentelemetry.io/collector/receiver/[email protected]/internal/trace/otlp.go:55\ngo.opentelemetry.io/collector/receiver/otlpreceiver.handleTraces\n\tgo.opentelemetry.io/collector/receiver/[email protected]/otlphttp.go:47\ngo.opentelemetry.io/collector/receiver/otlpreceiver.(*otlpReceiver).registerTraceConsumer.func1\n\tgo.opentelemetry.io/collector/receiver/[email protected]/otlp.go:210\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2084\nnet/http.(*ServeMux).ServeHTTP\n\tnet/http/server.go:2462\ngo.opentelemetry.io/collector/config/confighttp.(*decompressor).wrap.func1\n\tgo.opentelemetry.io/[email protected]/config/confighttp/compression.go:162\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2084\ngo.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP\n\tgo.opentelemetry.io/contrib/instrumentation/net/http/[email protected]/handler.go:210\ngo.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP\n\tgo.opentelemetry.io/[email protected]/config/confighttp/clientinfohandler.go:39\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2916\nnet/http.(*conn).serve\n\tnet/http/server.go:1966"
}
[WARNING] 2023-03-23T01:13:35.985Z 2bdc5088-8c42-42eb-9013-79f41f191fd4 Transient error Internal Server Error encountered while exporting span batch, retrying in 16s.
[WARNING] 2023-03-23T01:13:50.564Z 2bdc5088-8c42-42eb-9013-79f41f191fd4 Timeout was exceeded in force_flush().
END RequestId: 2bdc5088-8c42-42eb-9013-79f41f191fd4