Skip to content

Commit 5b2cbf2

Browse files
committed
add memory limiter extension, a copy of memorylimiter processor
1 parent 9475ac7 commit 5b2cbf2

File tree

14 files changed

+1173
-0
lines changed

14 files changed

+1173
-0
lines changed
Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
# Use this changelog template to create an entry for release notes.
2+
3+
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
4+
change_type: [new_component]
5+
6+
# The name of the component, or a single word describing the area of concern, (e.g. otlpreceiver)
7+
component: extension_memoryLimiter
8+
9+
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
10+
note: move memory limiter processor to extension to allow connection limits at the receiver
11+
12+
# One or more tracking issues or pull requests related to the change
13+
issues: [8632]
14+
15+
# (Optional) One or more lines of additional information to render under the primary note.
16+
# These lines will be padded with 2 spaces and then inserted directly into the document.
17+
# Use pipe (|) for multiline entries.
18+
subtext:
19+
20+
# Optional: The change log or logs in which this entry should be included.
21+
# e.g. '[user]' or '[user, api]'
22+
# Include 'user' if the change is relevant to end users.
23+
# Include 'api' if there is a change to a library API.
24+
# Default: '[user]'
25+
change_logs: []

extension/memorylimiter/Makefile

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
include ../../Makefile.Common

extension/memorylimiter/README.md

Lines changed: 101 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,101 @@
1+
# Memory Limiter extension
2+
3+
| Status | |
4+
|--------------------------|-------------------|
5+
| Stability | [alpha] |
6+
| Distributions | [core], [contrib] |
7+
8+
The memory limiter extension is used to prevent out of memory situations on
9+
the collector. Given that the amount and type of data the collector processes is
10+
environment specific and resource utilization of the collector is also dependent
11+
on the configured extensions, it is important to put checks in place regarding
12+
memory usage.
13+
14+
The memory_limiter extension allows to perform periodic checks of memory
15+
usage if it exceeds defined limits will begin refusing data and forcing GC to reduce
16+
memory consumption.
17+
18+
The memory_limiter uses soft and hard memory limits. Hard limit is always above or equal
19+
the soft limit.
20+
21+
When the memory usage exceeds the soft limit the extension will enter the memory limited
22+
mode and will start refusing the data by returning errors to the preceding component.
23+
The preceding component should be normally a receiver.
24+
25+
In memory limited mode the error returned by CheckMemory function is a
26+
non-permanent error. When receivers see this error they are expected to retry sending
27+
the same data. The receivers may also apply a backpressure to their data sources
28+
in order to slow down the inflow of data into the Collector and allow the memory usage
29+
to go below the limits.
30+
31+
When the memory usage is above the hard limit in addition to refusing the data the
32+
extension will forcedly perform garbage collection in order to try to free memory.
33+
34+
When the memory usage drop below the soft limit, the normal operation is resumed (data
35+
will no longer be refused and no forced garbage collection will be performed).
36+
37+
The difference between the soft limit and hard limits is defined via `spike_limit_mib`
38+
configuration option. The value of this option should be selected in a way that ensures
39+
that between the memory check intervals the memory usage cannot increase by more than this
40+
value (otherwise memory usage may exceed the hard limit - even if temporarily).
41+
A good starting point for `spike_limit_mib` is 20% of the hard limit. Bigger
42+
`spike_limit_mib` values may be necessary for spiky traffic or for longer check intervals.
43+
44+
Note that while the extension can help mitigate out of memory situations,
45+
it is not a replacement for properly sizing and configuring the
46+
collector. Keep in mind that if the soft limit is crossed, the collector will
47+
return errors to all receive operations until enough memory is freed. This may
48+
eventually result in dropped data since the receivers may not be able to hold back
49+
and retry the data indefinitely.
50+
51+
Please refer to [config.go](./config.go) for the config spec.
52+
53+
The following configuration options **must be changed**:
54+
- `check_interval` (default = 0s): Time between measurements of memory
55+
usage. The recommended value is 1 second.
56+
If the expected traffic to the Collector is very spiky then decrease the `check_interval`
57+
or increase `spike_limit_mib` to avoid memory usage going over the hard limit.
58+
- `limit_mib` (default = 0): Maximum amount of memory, in MiB, targeted to be
59+
allocated by the process heap. Note that typically the total memory usage of
60+
process will be about 50MiB higher than this value. This defines the hard limit.
61+
- `spike_limit_mib` (default = 20% of `limit_mib`): Maximum spike expected between the
62+
measurements of memory usage. The value must be less than `limit_mib`. The soft limit
63+
value will be equal to (limit_mib - spike_limit_mib).
64+
The recommended value for `spike_limit_mib` is about 20% `limit_mib`.
65+
- `limit_percentage` (default = 0): Maximum amount of total memory targeted to be
66+
allocated by the process heap. This configuration is supported on Linux systems with cgroups
67+
and it's intended to be used in dynamic platforms like docker.
68+
This option is used to calculate `memory_limit` from the total available memory.
69+
For instance setting of 75% with the total memory of 1GiB will result in the limit of 750 MiB.
70+
The fixed memory setting (`limit_mib`) takes precedence
71+
over the percentage configuration.
72+
- `spike_limit_percentage` (default = 0): Maximum spike expected between the
73+
measurements of memory usage. The value must be less than `limit_percentage`.
74+
This option is used to calculate `spike_limit_mib` from the total available memory.
75+
For instance setting of 25% with the total memory of 1GiB will result in the spike limit of 250MiB.
76+
This option is intended to be used only with `limit_percentage`.
77+
78+
Examples:
79+
80+
```yaml
81+
extensions:
82+
memory_limiter:
83+
check_interval: 1s
84+
limit_mib: 4000
85+
spike_limit_mib: 800
86+
```
87+
88+
```yaml
89+
extensions:
90+
memory_limiter:
91+
check_interval: 1s
92+
limit_percentage: 50
93+
spike_limit_percentage: 30
94+
```
95+
96+
Refer to [config.yaml](./testdata/config.yaml) for detailed
97+
examples on using the extension.
98+
99+
[beta]: https://github.com/open-telemetry/opentelemetry-collector#beta
100+
[contrib]: https://github.com/open-telemetry/opentelemetry-collector-releases/tree/main/distributions/otelcol-contrib
101+
[core]: https://github.com/open-telemetry/opentelemetry-collector-releases/tree/main/distributions/otelcol

extension/memorylimiter/config.go

Lines changed: 51 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,51 @@
1+
// Copyright The OpenTelemetry Authors
2+
// SPDX-License-Identifier: Apache-2.0
3+
4+
// Package memorylimiter provides an extension for OpenTelemetry Service
5+
// that refuses data according to the current state of memory usage.
6+
package memorylimiter // import "go.opentelemetry.io/collector/extension/memorylimiter"
7+
8+
import (
9+
"time"
10+
11+
"go.opentelemetry.io/collector/component"
12+
)
13+
14+
// Config defines configuration for memory memoryLimiter extension.
15+
type Config struct {
16+
// CheckInterval is the time between measurements of memory usage for the
17+
// purposes of avoiding going over the limits. Defaults to zero, so no
18+
// checks will be performed.
19+
CheckInterval time.Duration `mapstructure:"check_interval"`
20+
21+
// MemoryLimitMiB is the maximum amount of memory, in MiB, targeted to be
22+
// allocated by the extension.
23+
MemoryLimitMiB uint32 `mapstructure:"limit_mib"`
24+
25+
// MemorySpikeLimitMiB is the maximum, in MiB, spike expected between the
26+
// measurements of memory usage.
27+
MemorySpikeLimitMiB uint32 `mapstructure:"spike_limit_mib"`
28+
29+
// MemoryLimitPercentage is the maximum amount of memory, in %, targeted to be
30+
// allocated by the extension. The fixed memory settings MemoryLimitMiB has a higher precedence.
31+
MemoryLimitPercentage uint32 `mapstructure:"limit_percentage"`
32+
33+
// MemorySpikePercentage is the maximum, in percents against the total memory,
34+
// spike expected between the measurements of memory usage.
35+
MemorySpikePercentage uint32 `mapstructure:"spike_limit_percentage"`
36+
}
37+
38+
var _ component.Config = (*Config)(nil)
39+
40+
// Validate checks if the extension configuration is valid
41+
func (cfg *Config) Validate() error {
42+
if cfg.CheckInterval <= 0 {
43+
return errCheckIntervalOutOfRange
44+
}
45+
46+
if cfg.MemoryLimitMiB == 0 && cfg.MemoryLimitPercentage == 0 {
47+
return errLimitOutOfRange
48+
}
49+
50+
return nil
51+
}
Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
// Copyright The OpenTelemetry Authors
2+
// SPDX-License-Identifier: Apache-2.0
3+
4+
package memorylimiter
5+
6+
import (
7+
"path/filepath"
8+
"testing"
9+
"time"
10+
11+
"github.com/stretchr/testify/assert"
12+
"github.com/stretchr/testify/require"
13+
14+
"go.opentelemetry.io/collector/component"
15+
"go.opentelemetry.io/collector/confmap"
16+
"go.opentelemetry.io/collector/confmap/confmaptest"
17+
)
18+
19+
func TestUnmarshalDefaultConfig(t *testing.T) {
20+
factory := NewFactory()
21+
cfg := factory.CreateDefaultConfig()
22+
assert.NoError(t, component.UnmarshalConfig(confmap.New(), cfg))
23+
assert.Equal(t, factory.CreateDefaultConfig(), cfg)
24+
}
25+
26+
func TestUnmarshalConfig(t *testing.T) {
27+
cm, err := confmaptest.LoadConf(filepath.Join("testdata", "config.yaml"))
28+
require.NoError(t, err)
29+
factory := NewFactory()
30+
cfg := factory.CreateDefaultConfig()
31+
assert.NoError(t, component.UnmarshalConfig(cm, cfg))
32+
assert.Equal(t,
33+
&Config{
34+
CheckInterval: 5 * time.Second,
35+
MemoryLimitMiB: 4000,
36+
MemorySpikeLimitMiB: 500,
37+
}, cfg)
38+
}
39+
40+
func TestValidateConfig(t *testing.T) {
41+
cm, err := confmaptest.LoadConf(filepath.Join("testdata", "config.yaml"))
42+
require.NoError(t, err)
43+
factory := NewFactory()
44+
cfg := factory.CreateDefaultConfig()
45+
// Valid Config
46+
assert.NoError(t, component.UnmarshalConfig(cm, cfg))
47+
assert.NoError(t, cfg.(*Config).Validate())
48+
// Invalid Interval
49+
pCfg := cfg.(*Config)
50+
pCfg.CheckInterval = 0
51+
assert.Error(t, pCfg.Validate(), errCheckIntervalOutOfRange)
52+
// Invalid mem limit
53+
pCfg = cfg.(*Config)
54+
pCfg.CheckInterval = 1
55+
pCfg.MemoryLimitMiB = 0
56+
pCfg.MemoryLimitPercentage = 0
57+
assert.Error(t, pCfg.Validate(), errLimitOutOfRange)
58+
}

extension/memorylimiter/factory.go

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
// Copyright The OpenTelemetry Authors
2+
// SPDX-License-Identifier: Apache-2.0
3+
4+
package memorylimiter // import "go.opentelemetry.io/collector/extension/memorylimiter"
5+
6+
import (
7+
"context"
8+
9+
"go.opentelemetry.io/collector/component"
10+
"go.opentelemetry.io/collector/extension"
11+
)
12+
13+
const (
14+
// The value of "type" Attribute Key in configuration.
15+
typeStr = "memory_limiter"
16+
)
17+
18+
// NewFactory returns a new factory for the Memory Limiter extension.
19+
func NewFactory() extension.Factory {
20+
return extension.NewFactory(
21+
typeStr,
22+
createDefaultConfig,
23+
createExtension,
24+
component.StabilityLevelAlpha)
25+
}
26+
27+
// CreateDefaultConfig creates the default configuration for extension. Notice
28+
// that the default configuration is expected to fail for this extension.
29+
func createDefaultConfig() component.Config {
30+
return &Config{}
31+
}
32+
33+
func createExtension(_ context.Context, set extension.CreateSettings, cfg component.Config) (extension.Extension, error) {
34+
return newMemoryLimiter(cfg.(*Config), set.TelemetrySettings.Logger)
35+
}
Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
// Copyright The OpenTelemetry Authors
2+
// SPDX-License-Identifier: Apache-2.0
3+
4+
package memorylimiter
5+
6+
import (
7+
"context"
8+
"testing"
9+
"time"
10+
11+
"github.com/stretchr/testify/assert"
12+
"github.com/stretchr/testify/require"
13+
"go.opentelemetry.io/collector/extension/extensiontest"
14+
15+
"go.opentelemetry.io/collector/component/componenttest"
16+
)
17+
18+
func TestCreateDefaultConfig(t *testing.T) {
19+
factory := NewFactory()
20+
require.NotNil(t, factory)
21+
22+
cfg := factory.CreateDefaultConfig()
23+
assert.NotNil(t, cfg, "failed to create default config")
24+
assert.NoError(t, componenttest.CheckConfigStruct(cfg))
25+
}
26+
27+
func TestCreateExtension(t *testing.T) {
28+
factory := NewFactory()
29+
require.NotNil(t, factory)
30+
31+
cfg := factory.CreateDefaultConfig()
32+
// This extension can't be created with the default config.
33+
tp, err := factory.CreateExtension(context.Background(), extensiontest.NewNopCreateSettings(), cfg)
34+
assert.Nil(t, tp)
35+
assert.Error(t, err, "created extension with invalid settings")
36+
37+
// Create extension with a valid config.
38+
pCfg := cfg.(*Config)
39+
pCfg.MemoryLimitMiB = 5722
40+
pCfg.MemorySpikeLimitMiB = 1907
41+
pCfg.CheckInterval = 100 * time.Millisecond
42+
43+
tp, err = factory.CreateExtension(context.Background(), extensiontest.NewNopCreateSettings(), cfg)
44+
assert.NoError(t, err)
45+
assert.NotNil(t, tp)
46+
// test if we can shutdown a monitoring routine that has not started
47+
assert.ErrorIs(t, tp.Shutdown(context.Background()), errShutdownNotStarted)
48+
assert.NoError(t, tp.Start(context.Background(), componenttest.NewNopHost()))
49+
50+
assert.NoError(t, tp.Shutdown(context.Background()))
51+
// verify that no monitoring routine is running
52+
assert.Error(t, tp.Shutdown(context.Background()))
53+
}

0 commit comments

Comments
 (0)