Closed
Description
Component(s)
receiver/hostmetrics
What happened?
Description
The process.cpu.utilization
metric is expected to be a value between 0 and 1, but it can be greater than 1.
My guess here is that process.cpu.utilization does not properly divide by the number of system cores, so the value can actually be between 0 - ${num_cores}
Steps to Reproduce
- Run a process with heavy load on a multi-core machine
- Scrape with the hostmetric receiver
- Note that process metrics may exceed 1
Expected Result
No metrics exceed 1 (in fact, I'd expect the sum of all processes to be < 1)
Actual Result
I'm getting a metric of 9.5 (this process was taking ~950% cpu in my activity monitor)
{
"resource": {
"attributes": [
{ "key": "process.pid", "value": { "intValue": "89633" } },
{ "key": "process.parent_pid", "value": { "intValue": "89617" } },
{
"key": "process.executable.name",
"value": { "stringValue": "load" }
},
{
"key": "process.executable.path",
"value": {
"stringValue": "/var/folders/qn/13392c0n4rl4j4cplc65_66h0000gn/T/go-build2476508403/b001/exe/load"
}
},
{
"key": "process.command",
"value": {
"stringValue": "/var/folders/qn/13392c0n4rl4j4cplc65_66h0000gn/T/go-build2476508403/b001/exe/load"
}
},
{
"key": "process.command_line",
"value": {
"stringValue": "/var/folders/qn/13392c0n4rl4j4cplc65_66h0000gn/T/go-build2476508403/b001/exe/load"
}
},
{
"key": "process.owner",
"value": { "stringValue": "brandonjohnson" }
},
{ "key": "host.name", "value": { "stringValue": "Brandons-MBP" } },
{ "key": "os.type", "value": { "stringValue": "darwin" } }
]
},
"scopeMetrics": [
{
"scope": {
"name": "otelcol/hostmetricsreceiver/process",
"version": "v1.44.0"
},
"metrics": [
{
"name": "process.cpu.utilization",
"description": "Percentage of total CPU time used by the process since last scrape, expressed as a value between 0 and 1. On the first scrape, no data point is emitted for this metric.",
"unit": "1",
"gauge": {
"dataPoints": [
{
"attributes": [
{ "key": "state", "value": { "stringValue": "user" } }
],
"startTimeUnixNano": "1708529361162000000",
"timeUnixNano": "1708529423825350000",
"asDouble": 9.532629437655167
},
{
"attributes": [
{ "key": "state", "value": { "stringValue": "system" } }
],
"startTimeUnixNano": "1708529361162000000",
"timeUnixNano": "1708529423825350000",
"asDouble": 0.07040537266051189
},
{
"attributes": [
{ "key": "state", "value": { "stringValue": "wait" } }
],
"startTimeUnixNano": "1708529361162000000",
"timeUnixNano": "1708529423825350000",
"asDouble": 0
}
]
}
}
]
}
],
"schemaUrl": "https://opentelemetry.io/schemas/1.9.0"
}
Collector version
v0.94.0
Environment information
Environment
OS: macOS 14.2.1
Compiler(if manually compiled): go 1.22
OpenTelemetry Collector configuration
No response
Log output
No response
Additional context
I used this quick go program to quickly generate load:
package main
import "time"
func main() {
for i := 0; i < 100; i++ {
go func() {
v := 0
for {
v++
}
}()
}
time.Sleep(3 * time.Minute)
}