Description
What is the issue?
HTTP/1.1 keep-alive connections are being prematurely closed by the Linkerd proxy after 30 seconds of idleness, and there is currently no way to configure this timeout.
We are using [VictoriaMetrics](https://victoriametrics.com/) to collect metrics from services running inside a Kubernetes cluster. Many of our scrape intervals are configured to 60 seconds. We’ve observed that the `vmagent` scraper is encountering frequent connection closures from the Linkerd sidecar:
INFO ThreadId(01) linkerd_app_core::serve: Connection closed error=read header from client timeout client.addr=10.244.0.13:37704 server.addr=xxx.xxx.xxx.xxx:xxx
This indicates that persistent HTTP/1 connections used by `vmagent` to scrape metrics are being closed before the next request is sent. Capturing traffic with `tcpdump` shows that `vmagent` uses HTTP keep-alive and only sends a request every 60 seconds, so each HTTP/1 connection stays idle for longer than Linkerd's 30-second default.
When Linkerd is enabled, these connections are closed after 30 seconds of inactivity. From debug logs, we see the following error:
[ 1861.475759s] DEBUG ThreadId(01) linkerd_proxy_http::server: The client is shutting down the connection res=Err(hyper::Error(HeaderTimeout))
We traced this to the underlying [Hyper](https://github.com/hyperium/hyper/blob/c88df7886c74a1ade69c0b4c68eaf570c8111622/src/server/conn/http1.rs#L79) implementation used by Linkerd. Hyper's `Builder` sets a default `h1_header_read_timeout` of 30 seconds. This timeout is triggered when no new request headers are received during that period on an idle HTTP/1 connection.
h1_header_read_timeout: Dur::Default(Some(Duration::from_secs(30))),
Currently, the Linkerd proxy does not expose this setting, and there’s no way to override it via annotations or configuration.
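The effect of such a timeout on an idle keep-alive connection can be imitated locally without Linkerd. The following is a minimal sketch using Python's standard `http.server`, not Linkerd or Hyper code: the handler's `timeout` attribute is a plain socket read timeout, so it only approximates Hyper's timer. `IdleTimeoutHandler` and `make_server` are names made up for this illustration.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class IdleTimeoutHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 makes the handler keep the connection open between requests.
    protocol_version = "HTTP/1.1"
    # Socket read timeout in seconds. It also covers the wait for the next
    # request line, so an idle connection is closed once it expires.
    timeout = 30

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 0, idle_timeout: float = 30.0) -> ThreadingHTTPServer:
    """Build a local server that drops connections idle for idle_timeout seconds."""
    # Subclass per call so each server can carry its own timeout value.
    handler = type("Handler", (IdleTimeoutHandler,), {"timeout": idle_timeout})
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Running `make_server(8080).serve_forever()` and scraping it every 60 seconds over a single keep-alive connection should show the same pattern as above: the second request hits a connection the server has already closed.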
How can it be reproduced?
Establish an HTTP/1.1 keep-alive connection (e.g., using Python’s `requests.Session`) and leave it idle without sending further requests for over 30 seconds. The Linkerd proxy will close the connection with a `HeaderTimeout` error.
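A reproduction along these lines might look like the sketch below. It uses the standard library's `http.client` instead of `requests.Session` to stay dependency-free. `connection_survives_idle` is a hypothetical helper, and the host, port, and path are whatever meshed service is being tested.

```python
import http.client
import time


def connection_survives_idle(host: str, port: int, path: str, idle_seconds: float) -> bool:
    """Return True if a keep-alive connection still works after sitting idle."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        # The first request opens the TCP connection; HTTP/1.1 keeps it alive.
        conn.request("GET", path)
        conn.getresponse().read()

        # Stay idle, as a scraper with a long interval would.
        time.sleep(idle_seconds)

        # Disable http.client's automatic reconnect so the second request
        # has to reuse the original socket or fail.
        conn.auto_open = 0
        try:
            conn.request("GET", path)
            conn.getresponse().read()
            return True
        except (http.client.HTTPException, OSError):
            # The peer closed the idle connection (reset, broken pipe, or EOF).
            return False
    finally:
        conn.close()
```

Called with `idle_seconds=35` against a service behind the proxy, this would be expected to return `False` if the connection is closed at 30 seconds, and `True` for an idle period well below that.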
Logs, error output, etc
[450882.624436s] INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=read header from client timeout client.addr=10.246.13.59:50568 server.addr=10.246.12.16:9793
[450881.544872s] INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=read header from client timeout client.addr=10.246.13.59:47154 server.addr=10.246.15.207:9180
[450878.959567s] INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=read header from client timeout client.addr=10.246.13.59:44612 server.addr=10.246.14.56:9793
[450878.672225s] INFO ThreadId(01) outbound: linkerd_app_core::serve: Connection closed error=read header from client timeout client.addr=10.246.13.59:44486 server.addr=10.246.9.45:9180
output of linkerd check -o short
linkerd-version
---------------
‼ cli is up-to-date
is running version 25.5.5 but the latest edge version is 25.6.2
see https://linkerd.io/2/checks/#l5d-version-cli for hints
control-plane-version
---------------------
‼ control plane is up-to-date
is running version 25.5.5 but the latest edge version is 25.6.2
see https://linkerd.io/2/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
---------------------------
‼ control plane proxies are up-to-date
some proxies are not running the current version:
* linkerd-destination-97c46bc4d-mst8p (edge-25.5.5)
* linkerd-destination-97c46bc4d-r4pdm (edge-25.5.5)
* linkerd-destination-97c46bc4d-vsfhf (edge-25.5.5)
* linkerd-identity-7b5c4b4f75-gvgw5 (edge-25.5.5)
* linkerd-identity-7b5c4b4f75-l6hln (edge-25.5.5)
* linkerd-identity-7b5c4b4f75-wsj9m (edge-25.5.5)
* linkerd-proxy-injector-c58469d8f-257lh (edge-25.5.5)
* linkerd-proxy-injector-c58469d8f-8f4v2 (edge-25.5.5)
* linkerd-proxy-injector-c58469d8f-tcqv4 (edge-25.5.5)
see https://linkerd.io/2/checks/#l5d-cp-proxy-version for hints
Status check results are √
Environment
- Kubernetes Version: 1.32
- Cluster Environment: RKE2
- Host OS: Ubuntu
- Linkerd Version: 25.5.5
Possible solution
Expose the `h1_header_read_timeout` setting in the Linkerd proxy configuration, possibly via annotations or config fields, so that users can adjust the HTTP/1 idle timeout behavior to better suit long-polling or low-frequency scraping use cases.
Additional context
No response
Would you like to work on fixing this bug?
maybe