|
| 1 | +# Memory Limiter extension |
| 2 | + |
| 3 | +| Status | | |
| 4 | +|--------------------------|-------------------| |
| 5 | +| Stability | [alpha] | |
| 6 | +| Distributions | [core], [contrib] | |
| 7 | + |
| 8 | +The memory limiter extension is used to prevent out-of-memory situations on |
| 9 | +the collector. Given that the amount and type of data the collector processes is |
| 10 | +environment-specific and resource utilization of the collector also depends |
| 11 | +on the configured extensions, it is important to put checks in place regarding |
| 12 | +memory usage. |
| 13 | + |
| 14 | +The memory_limiter extension performs periodic checks of memory usage. If usage |
| 15 | +exceeds the defined limits, it begins refusing data and forcing GC to reduce |
| 16 | +memory consumption. |
| 17 | + |
| 18 | +The memory_limiter uses soft and hard memory limits. The hard limit is always greater than |
| 19 | +or equal to the soft limit. |
| 20 | + |
| 21 | +When the memory usage exceeds the soft limit, the extension enters memory-limited |
| 22 | +mode and starts refusing data by returning errors to the preceding component. |
| 23 | +The preceding component should normally be a receiver. |
| 24 | + |
| 25 | +In memory-limited mode, the error returned by the CheckMemory function is a |
| 26 | +non-permanent error. When receivers see this error, they are expected to retry sending |
| 27 | +the same data. The receivers may also apply backpressure to their data sources |
| 28 | +in order to slow down the inflow of data into the Collector and allow the memory usage |
| 29 | +to go below the limits. |
| 30 | + |
| 31 | +When the memory usage is above the hard limit, in addition to refusing data the |
| 32 | +extension will force a garbage collection in order to try to free memory. |
| 33 | + |
| 34 | +When the memory usage drops below the soft limit, normal operation resumes (data |
| 35 | +will no longer be refused and no forced garbage collection will be performed). |
| 36 | + |
| 37 | +The difference between the soft and hard limits is defined via the `spike_limit_mib` |
| 38 | +configuration option. The value of this option should be selected so that |
| 39 | +memory usage cannot increase by more than this value between memory checks |
| 40 | +(otherwise memory usage may exceed the hard limit, even if temporarily). |
| 41 | +A good starting point for `spike_limit_mib` is 20% of the hard limit. Bigger |
| 42 | +`spike_limit_mib` values may be necessary for spiky traffic or for longer check intervals. |
| 43 | + |
| 44 | +Note that while the extension can help mitigate out-of-memory situations, |
| 45 | +it is not a replacement for properly sizing and configuring the |
| 46 | +collector. Keep in mind that if the soft limit is crossed, the collector will |
| 47 | +return errors to all receive operations until enough memory is freed. This may |
| 48 | +eventually result in dropped data since the receivers may not be able to hold back |
| 49 | +and retry the data indefinitely. |
| 50 | + |
| 51 | +Please refer to [config.go](./config.go) for the config spec. |
| 52 | + |
| 53 | +The following configuration options **must be changed**: |
| 54 | +- `check_interval` (default = 0s): Time between measurements of memory |
| 55 | +usage. The recommended value is 1 second. |
| 56 | +If the expected traffic to the Collector is very spiky, decrease the `check_interval` |
| 57 | +or increase `spike_limit_mib` to avoid memory usage going over the hard limit. |
| 58 | +- `limit_mib` (default = 0): Maximum amount of memory, in MiB, targeted to be |
| 59 | +allocated by the process heap. Note that typically the total memory usage of |
| 60 | +the process will be about 50MiB higher than this value. This defines the hard limit. |
| 61 | +- `spike_limit_mib` (default = 20% of `limit_mib`): Maximum spike expected between the |
| 62 | +measurements of memory usage. The value must be less than `limit_mib`. The soft limit |
| 63 | +value will be equal to (`limit_mib` - `spike_limit_mib`). |
| 64 | +The recommended value for `spike_limit_mib` is about 20% of `limit_mib`. |
| 65 | +- `limit_percentage` (default = 0): Maximum percentage of total memory targeted to be |
| 66 | +allocated by the process heap. This configuration is supported on Linux systems with cgroups |
| 67 | +and is intended to be used on dynamic platforms like Docker. |
| 68 | +This option is used to calculate the hard limit from the total available memory. |
| 69 | +For instance, a setting of 75% with a total memory of 1GiB results in a limit of 768 MiB. |
| 70 | +The fixed memory setting (`limit_mib`) takes precedence |
| 71 | +over the percentage configuration. |
| 72 | +- `spike_limit_percentage` (default = 0): Maximum spike expected between the |
| 73 | +measurements of memory usage. The value must be less than `limit_percentage`. |
| 74 | +This option is used to calculate `spike_limit_mib` from the total available memory. |
| 75 | +For instance, a setting of 25% with a total memory of 1GiB results in a spike limit of 256 MiB. |
| 76 | +This option is intended to be used only with `limit_percentage`. |
| 77 | + |
| 78 | +Examples: |
| 79 | + |
| 80 | +```yaml |
| 81 | +extensions: |
| 82 | +  memory_limiter: |
| 83 | +    check_interval: 1s |
| 84 | +    limit_mib: 4000 |
| 85 | +    spike_limit_mib: 800 |
| 86 | +``` |
| 87 | +
|
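As a rough illustration of the fixed-limit example above, the following Python sketch derives the soft limit from `limit_mib` and `spike_limit_mib` and classifies a single memory reading according to the soft/hard limit behaviour this README describes. It is not the extension's Go implementation: the function names and the returned state strings are hypothetical and chosen only for this example.

```python
# Illustrative sketch only. The function names and state strings are
# hypothetical; they are not part of the extension's API.


def soft_limit_mib(limit_mib: int, spike_limit_mib: int) -> int:
    """Return the soft limit: the hard limit minus the expected spike."""
    if spike_limit_mib >= limit_mib:
        raise ValueError("spike_limit_mib must be less than limit_mib")
    return limit_mib - spike_limit_mib


def classify(usage_mib: int, limit_mib: int, spike_limit_mib: int) -> str:
    """Classify one memory reading against the soft and hard limits."""
    soft = soft_limit_mib(limit_mib, spike_limit_mib)
    if usage_mib > limit_mib:
        # Above the hard limit: refuse data and force garbage collection.
        return "refuse_and_force_gc"
    if usage_mib > soft:
        # Above the soft limit: refuse data with a non-permanent error.
        return "refuse"
    # At or below the soft limit: normal operation.
    return "normal"


if __name__ == "__main__":
    # Values taken from the example: limit_mib=4000, spike_limit_mib=800.
    print(soft_limit_mib(4000, 800))
    for usage in (3000, 3500, 4100):
        print(usage, classify(usage, 4000, 800))
```

With the example values the soft limit is 3200 MiB, so a reading of 3500 MiB would already cause data to be refused even though it is below the 4000 MiB hard limit.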
| 88 | +```yaml |
| 89 | +extensions: |
| 90 | +  memory_limiter: |
| 91 | +    check_interval: 1s |
| 92 | +    limit_percentage: 50 |
| 93 | +    spike_limit_percentage: 30 |
| 94 | +``` |
| 95 | +
|
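The percentage-based options can be worked through the same way. The Python sketch below assumes that both percentages are applied to the total available memory and that results are rounded down to whole MiB. The rounding rule and the function name are assumptions made for illustration, not behaviour confirmed by this README.

```python
# Illustrative sketch only. The function name and the round-down behaviour
# are assumptions, not the extension's confirmed implementation.


def limits_from_percentages(
    total_mib: int, limit_percentage: int, spike_limit_percentage: int
) -> tuple[int, int, int]:
    """Return (hard_limit_mib, spike_limit_mib, soft_limit_mib)."""
    if not 0 < limit_percentage <= 100:
        raise ValueError("limit_percentage must be in (0, 100]")
    if spike_limit_percentage >= limit_percentage:
        raise ValueError(
            "spike_limit_percentage must be less than limit_percentage"
        )
    # Both percentages are taken from the total available memory.
    hard = total_mib * limit_percentage // 100
    spike = total_mib * spike_limit_percentage // 100
    # The soft limit is the hard limit minus the expected spike.
    return hard, spike, hard - spike


if __name__ == "__main__":
    # 1 GiB of total memory with the percentages from the example above.
    print(limits_from_percentages(1024, 50, 30))
```

Under these assumptions, a host with 1 GiB of total memory and the example percentages gets a hard limit of 512 MiB, a spike limit of 307 MiB, and a soft limit of 205 MiB.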
| 96 | +Refer to [config.yaml](./testdata/config.yaml) for detailed |
| 97 | +examples on using the extension. |
| 98 | +
|
| 99 | +[alpha]: https://github.com/open-telemetry/opentelemetry-collector#alpha |
| 100 | +[contrib]: https://github.com/open-telemetry/opentelemetry-collector-releases/tree/main/distributions/otelcol-contrib |
| 101 | +[core]: https://github.com/open-telemetry/opentelemetry-collector-releases/tree/main/distributions/otelcol |