This commit respects online CPU count for process_cpu_cores_available metric. Since CPUQuota may exceed it.
Detailed description:
There might be a case when `getCPUQuota()` returns a value bigger than
available logic CPU cores. In this case the lower value should be used.
These changes also reflect upcoming go1.25 changes
https://tip.golang.org/doc/go1.25#container-aware-gomaxprocs
> If the CPU bandwidth limit is lower than the number of logical CPUs
available, GOMAXPROCS will default to the lower limit
In practice that happens with [CPU
Manager](https://kubernetes.io/blog/2022/12/27/cpumanager-ga/) enabled.
It requires setting `/sys/devices/system/cpu/online` which shows the
total number of cores on a node. The container doesn't have any cgroup
limits, but it's pinned to a limited subset of cores.
```
root@vmselect:/# cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us
-1
root@vmselect:/# cat /sys/fs/cgroup/cpu/cpu.cfs_period_us
100000
root@vmselect:/# cat /sys/devices/system/cpu/online
0-255
```
go1.25 and go.uber.org/automaxprocs don't look into
`/sys/devices/system/cpu/online`, VictoriaMetrics does
Previously, if `cpu.max` file has only `max` resource defined without
`period`, it was parsed incorrectly and silently drop error. While this
syntax is valid and actually used by some container runtimes. If period
is not defined, default value for it 100_000 must be used.
This commit fixes parsing function by using default value for period.
In addition, it adds zero value check, which fixes possible panic if
period has 0 value.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8808
Historically some of VictoriaMetrics components were optimized for the low rate of memory allocations.
These are: vmagent, single-node VictoriaMetrics and vmstorage. These components benefit from the low
GOGC value, since this allow reducing their memory usage in steady state on typical workloads.
Other VictoriaMetrics components aren't optimized for the reduced rate of memory allocations.
This results in the increased CPU usage spent on garbage collection (GC) in these components,
since it must be triggered at higher rate. See https://tip.golang.org/doc/gc-guide#GOGC for details.
These components do not use too much memory, so it is OK increasing the GOGC for these components
from 30 to 100 - this won't affect the most users.
Keep GOGC to 30 only for vmagent, single-node VictoriaMetrics and vmstorage components.
See 077193d87c and 54b9e1d3cb .
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7902
Rounding GOMAXPROCS to the upper interger value of cpuQuota increases chances of CPU starvation,
non-optimimal goroutine scheduling and additional CPU overhead related to context switching.
So it is better to round GOMAXPROCS to the lower integer value of cpuQuota.
GOGC can be already set via environment variable. There is no need in adding
new approaches for setting the GOGC (such as command-line flag), since they complicate operations.
The ioutil.{Read|Write}File is deprecated since Go1.16 -
see https://tip.golang.org/doc/go1.16#ioutil
VictoriaMetrics needs at least Go1.18, so it is safe to remove ioutil usage
from source code.
This is a follow-up for 02ca2342ab
This reduces memory usage under production workloads by up to 10%,
while CPU spent on GC remains roughly the same.
The CPU spent on GC can be monitored with go_memstats_gc_cpu_fraction metric
This metric shows the number of CPU cores available to the process.
This allows creating alerting rules on CPU saturation with the following query:
rate(process_cpu_seconds_total[5m]) / process_cpu_cores_available > 0.9
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2107