lib/storage: report per-idb cache stats only once

`tagFiltersCache` and `dateMeticIDCache` are now per-indexDB. Currently
we have 2 instance of indexDBs (prev and curr) and therefore 2 instances
of each cache.

When the storage stats is collected, the stats of individual caches is
added together. For example, is the `sizeMaxBytes` of each
tagFiltersCache is `100MB` and the `sizeBytes` of each instance is
`10MB` and `99MB`, then the resulting stats will be `sizeMaxBytes ==
200MB, sizeBytes == 109MB`.

While this is accurate, this stats hides a potential problem. It says
that the cache utilization is slightly above `50%` (109/200) and
everything seems to be okay. But in reality one of the caches is
utilized by 99% and soon will start evicting existing records to make
room for new ones, potentially slowing down the data retrieval. Ops
won't see it and will not take necessary action.

The solution is to report stats only for one instance of cache whose
utilization is the highest.

Alternatives considered:
- #10123. Might work, but breaks the encapsulation and can potentially
be slower
- Do not aggregate the stats and report is per-indexDB. This increases
the number of metrics and makes it dependent on the number of indexDB
instances (which can be many once #8134 is released).

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8134
This commit is contained in:
Artem Fetishev
2025-12-09 12:43:50 +01:00
committed by GitHub
parent 76f5def301
commit f62893c151
2 changed files with 20 additions and 12 deletions

View File

@@ -37,6 +37,7 @@ See also [LTS releases](https://docs.victoriametrics.com/victoriametrics/lts-rel
* BUGFIX: [vmctl](https://docs.victoriametrics.com/victoriametrics/vmctl/): properly handle process termination during prompt confirmation. Previously, termination signal was ignored and process was still waiting for user input. See[#10104](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10104).
* BUGFIX: [vmgateway](https://docs.victoriametrics.com/victoriametrics/vmgateway/): properly recover from proxy requests errors. Previously, vmgateway may return empty response.
* BUGFIX: [vmui](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#vmui): always add `/prometheus` suffix while generating backend URL. See [#10097](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10097).
* BUGFIX: [vmsingle](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) and `vmstorage` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/): report stats only for most utilized instances of `indexdb/tagFiltersToMetricIDs` and `indexdb/date_metricID` caches. This makes it clear when a cache is full and an action needs to be taken (such as adding more memory or adjusting cache limits). See PR [#10131](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10131) for details.
* BUGFIX: [vmsingle](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) and `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/): disable rollup result cache for [instant queries](https://docs.victoriametrics.com/keyConcepts.html#instant-query) that contain [`rate`](https://docs.victoriametrics.com/MetricsQL.html#rate) function with a lookbehind window larger than `-search.minWindowForInstantRollupOptimization`. Previously, utilizing the cache might yield incorrect results when time series samples are not continuous. See [#10098](https://github.com/VictoriaMetrics/victoriaMetrics/issues/10098) for more details.
* BUGFIX: all VictoriaMetrics components: properly validate remaining system memory limit. Previously it could have negative values. See this issue [#10083](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10083) for details.
* BUGFIX: `vmstorage` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/): do not wait after closing the last connections from vminsert when shutting down. The bug was introduced in [#9487](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9487). See [#10136](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10136) for detail.

View File

@@ -244,20 +244,27 @@ func (db *indexDB) UpdateMetrics(m *IndexDBMetrics) {
m.CompositeFilterSuccessConversions = compositeFilterSuccessConversions.Load()
m.CompositeFilterMissingConversions = compositeFilterMissingConversions.Load()
m.TagFiltersToMetricIDsCacheSize += uint64(db.tagFiltersToMetricIDsCache.Len())
m.TagFiltersToMetricIDsCacheSizeBytes += uint64(db.tagFiltersToMetricIDsCache.SizeBytes())
m.TagFiltersToMetricIDsCacheSizeMaxBytes += uint64(db.tagFiltersToMetricIDsCache.SizeMaxBytes())
m.TagFiltersToMetricIDsCacheRequests += db.tagFiltersToMetricIDsCache.Requests()
m.TagFiltersToMetricIDsCacheMisses += db.tagFiltersToMetricIDsCache.Misses()
m.TagFiltersToMetricIDsCacheResets += db.tagFiltersToMetricIDsCache.Resets()
// Report only once and for an indexDB instance whose tagFiltersCache is
// utilized the most.
if db.tagFiltersToMetricIDsCache.SizeBytes() > m.TagFiltersToMetricIDsCacheSizeBytes {
m.TagFiltersToMetricIDsCacheSize = uint64(db.tagFiltersToMetricIDsCache.Len())
m.TagFiltersToMetricIDsCacheSizeBytes = db.tagFiltersToMetricIDsCache.SizeBytes()
m.TagFiltersToMetricIDsCacheSizeMaxBytes = db.tagFiltersToMetricIDsCache.SizeMaxBytes()
m.TagFiltersToMetricIDsCacheRequests = db.tagFiltersToMetricIDsCache.Requests()
m.TagFiltersToMetricIDsCacheMisses = db.tagFiltersToMetricIDsCache.Misses()
m.TagFiltersToMetricIDsCacheResets = db.tagFiltersToMetricIDsCache.Resets()
}
// Report only once and for an indexDB instance whose dateMetricIDCache is
// utilized the most.
dmcs := db.dateMetricIDCache.Stats()
m.DateMetricIDCacheSize += dmcs.Size
m.DateMetricIDCacheSizeBytes += dmcs.SizeBytes
m.DateMetricIDCacheSizeMaxBytes += dmcs.SizeMaxBytes
m.DateMetricIDCacheSyncsCount += dmcs.SyncsCount
m.DateMetricIDCacheResetsCount += dmcs.ResetsCount
if dmcs.SizeBytes > m.DateMetricIDCacheSizeBytes {
m.DateMetricIDCacheSize = dmcs.Size
m.DateMetricIDCacheSizeBytes = dmcs.SizeBytes
m.DateMetricIDCacheSizeMaxBytes = dmcs.SizeMaxBytes
m.DateMetricIDCacheSyncsCount = dmcs.SyncsCount
m.DateMetricIDCacheResetsCount = dmcs.ResetsCount
}
m.IndexDBRefCount += uint64(db.refCount.Load())
m.DateRangeSearchCalls += db.dateRangeSearchCalls.Load()