mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2026-05-23 03:36:31 +03:00
lib/storage: report per-idb cache stats only once
`tagFiltersCache` and `dateMetricIDCache` are now per-indexDB. Currently there are 2 indexDB instances (prev and curr) and therefore 2 instances of each cache. When the storage stats are collected, the stats of the individual caches are added together. For example, if the `sizeMaxBytes` of each tagFiltersCache is `100MB` and the `sizeBytes` of the two instances is `10MB` and `99MB`, then the resulting stats will be `sizeMaxBytes == 200MB, sizeBytes == 109MB`.

While this is accurate, these stats hide a potential problem. They say that the cache utilization is slightly above `50%` (109/200) and everything seems to be okay. But in reality one of the caches is 99% utilized and will soon start evicting existing records to make room for new ones, potentially slowing down data retrieval. Ops won't see it and won't take the necessary action.

The solution is to report stats only for the cache instance whose utilization is the highest.

Alternatives considered:

- #10123. Might work, but breaks the encapsulation and can potentially be slower.
- Do not aggregate the stats and report them per-indexDB. This increases the number of metrics and makes it dependent on the number of indexDB instances (which can be many once #8134 is released).

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8134
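The arithmetic in the example can be checked with a short Go sketch. The `cacheStats` type and the helper names are hypothetical and exist only for this illustration; they are not the VictoriaMetrics types. The sketch compares the utilization derived from summed stats with the utilization of the most utilized instance.

```go
package main

import "fmt"

// cacheStats is a hypothetical stand-in for the stats of one cache instance.
type cacheStats struct {
	sizeBytes    uint64
	sizeMaxBytes uint64
}

// utilizationPercent returns sizeBytes as a whole percentage of sizeMaxBytes
// (integer division, so the result is rounded down).
func utilizationPercent(s cacheStats) uint64 {
	if s.sizeMaxBytes == 0 {
		return 0
	}
	return s.sizeBytes * 100 / s.sizeMaxBytes
}

// summed adds the stats of all instances together.
func summed(instances []cacheStats) cacheStats {
	var total cacheStats
	for _, s := range instances {
		total.sizeBytes += s.sizeBytes
		total.sizeMaxBytes += s.sizeMaxBytes
	}
	return total
}

// mostUtilized returns the stats of the instance with the highest utilization.
func mostUtilized(instances []cacheStats) cacheStats {
	var best cacheStats
	for _, s := range instances {
		if utilizationPercent(s) > utilizationPercent(best) {
			best = s
		}
	}
	return best
}

func main() {
	const mb = 1 << 20
	instances := []cacheStats{
		{sizeBytes: 10 * mb, sizeMaxBytes: 100 * mb}, // first cache instance
		{sizeBytes: 99 * mb, sizeMaxBytes: 100 * mb}, // second cache instance
	}
	fmt.Printf("summed: %d%%\n", utilizationPercent(summed(instances)))
	fmt.Printf("most utilized: %d%%\n", utilizationPercent(mostUtilized(instances)))
}
```

With the numbers from the example, the summed stats give 54% (109/200, rounded down), while the most utilized instance is at 99%.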
@@ -37,6 +37,7 @@ See also [LTS releases](https://docs.victoriametrics.com/victoriametrics/lts-rel
 * BUGFIX: [vmctl](https://docs.victoriametrics.com/victoriametrics/vmctl/): properly handle process termination during prompt confirmation. Previously, termination signal was ignored and process was still waiting for user input. See[#10104](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10104).
 * BUGFIX: [vmgateway](https://docs.victoriametrics.com/victoriametrics/vmgateway/): properly recover from proxy requests errors. Previously, vmgateway may return empty response.
 * BUGFIX: [vmui](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#vmui): always add `/prometheus` suffix while generating backend URL. See [#10097](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10097).
+* BUGFIX: [vmsingle](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) and `vmstorage` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/): report stats only for most utilized instances of `indexdb/tagFiltersToMetricIDs` and `indexdb/date_metricID` caches. This makes it clear when a cache is full and an action needs to be taken (such as adding more memory or adjusting cache limits). See PR [#10131](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10131) for details.
 * BUGFIX: [vmsingle](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) and `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/): disable rollup result cache for [instant queries](https://docs.victoriametrics.com/keyConcepts.html#instant-query) that contain [`rate`](https://docs.victoriametrics.com/MetricsQL.html#rate) function with a lookbehind window larger than `-search.minWindowForInstantRollupOptimization`. Previously, utilizing the cache might yield incorrect results when time series samples are not continuous. See [#10098](https://github.com/VictoriaMetrics/victoriaMetrics/issues/10098) for more details.
 * BUGFIX: all VictoriaMetrics components: properly validate remaining system memory limit. Previously it could have negative values. See this issue [#10083](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10083) for details.
 * BUGFIX: `vmstorage` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/): do not wait after closing the last connections from vminsert when shutting down. The bug was introduced in [#9487](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9487). See [#10136](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10136) for detail.
@@ -244,20 +244,27 @@ func (db *indexDB) UpdateMetrics(m *IndexDBMetrics) {
 	m.CompositeFilterSuccessConversions = compositeFilterSuccessConversions.Load()
 	m.CompositeFilterMissingConversions = compositeFilterMissingConversions.Load()
 
-	m.TagFiltersToMetricIDsCacheSize += uint64(db.tagFiltersToMetricIDsCache.Len())
-	m.TagFiltersToMetricIDsCacheSizeBytes += uint64(db.tagFiltersToMetricIDsCache.SizeBytes())
-	m.TagFiltersToMetricIDsCacheSizeMaxBytes += uint64(db.tagFiltersToMetricIDsCache.SizeMaxBytes())
-	m.TagFiltersToMetricIDsCacheRequests += db.tagFiltersToMetricIDsCache.Requests()
-	m.TagFiltersToMetricIDsCacheMisses += db.tagFiltersToMetricIDsCache.Misses()
-	m.TagFiltersToMetricIDsCacheResets += db.tagFiltersToMetricIDsCache.Resets()
+	// Report only once and for an indexDB instance whose tagFiltersCache is
+	// utilized the most.
+	if db.tagFiltersToMetricIDsCache.SizeBytes() > m.TagFiltersToMetricIDsCacheSizeBytes {
+		m.TagFiltersToMetricIDsCacheSize = uint64(db.tagFiltersToMetricIDsCache.Len())
+		m.TagFiltersToMetricIDsCacheSizeBytes = db.tagFiltersToMetricIDsCache.SizeBytes()
+		m.TagFiltersToMetricIDsCacheSizeMaxBytes = db.tagFiltersToMetricIDsCache.SizeMaxBytes()
+		m.TagFiltersToMetricIDsCacheRequests = db.tagFiltersToMetricIDsCache.Requests()
+		m.TagFiltersToMetricIDsCacheMisses = db.tagFiltersToMetricIDsCache.Misses()
+		m.TagFiltersToMetricIDsCacheResets = db.tagFiltersToMetricIDsCache.Resets()
+	}
 
+	// Report only once and for an indexDB instance whose dateMetricIDCache is
+	// utilized the most.
 	dmcs := db.dateMetricIDCache.Stats()
-	m.DateMetricIDCacheSize += dmcs.Size
-	m.DateMetricIDCacheSizeBytes += dmcs.SizeBytes
-	m.DateMetricIDCacheSizeMaxBytes += dmcs.SizeMaxBytes
-	m.DateMetricIDCacheSyncsCount += dmcs.SyncsCount
-	m.DateMetricIDCacheResetsCount += dmcs.ResetsCount
-
+	if dmcs.SizeBytes > m.DateMetricIDCacheSizeBytes {
+		m.DateMetricIDCacheSize = dmcs.Size
+		m.DateMetricIDCacheSizeBytes = dmcs.SizeBytes
+		m.DateMetricIDCacheSizeMaxBytes = dmcs.SizeMaxBytes
+		m.DateMetricIDCacheSyncsCount = dmcs.SyncsCount
+		m.DateMetricIDCacheResetsCount = dmcs.ResetsCount
+	}
 	m.IndexDBRefCount += uint64(db.refCount.Load())
 
 	m.DateRangeSearchCalls += db.dateRangeSearchCalls.Load()
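The update pattern in the hunk above can be sketched in isolation as follows. This is a minimal illustration, not the VictoriaMetrics code: the `cache` and `metrics` types and their fields are hypothetical, with `metrics` playing the role of the accumulator that every instance updates in turn.

```go
package main

import "fmt"

// cache is a hypothetical per-instance cache; only the fields needed for
// the illustration are modelled.
type cache struct {
	entries      uint64
	sizeBytes    uint64
	sizeMaxBytes uint64
}

// metrics is a hypothetical accumulator shared by all cache instances.
type metrics struct {
	CacheSize         uint64
	CacheSizeBytes    uint64
	CacheSizeMaxBytes uint64
}

// updateMetrics overwrites the accumulator only when this instance holds
// more bytes than the instance recorded so far. All fields are replaced
// together, so they always describe one and the same instance.
func (c *cache) updateMetrics(m *metrics) {
	if c.sizeBytes > m.CacheSizeBytes {
		m.CacheSize = c.entries
		m.CacheSizeBytes = c.sizeBytes
		m.CacheSizeMaxBytes = c.sizeMaxBytes
	}
}

func main() {
	prev := &cache{entries: 100, sizeBytes: 10, sizeMaxBytes: 100}
	curr := &cache{entries: 900, sizeBytes: 99, sizeMaxBytes: 100}

	var m metrics // must start zeroed for every collection round
	prev.updateMetrics(&m)
	curr.updateMetrics(&m)
	fmt.Printf("%+v\n", m)
}
```

Two properties of this sketch are worth noting. Comparing `sizeBytes` selects the most utilized instance only under the assumption that all instances share the same `sizeMaxBytes`. And because the comparison is strict and the accumulator starts at zero, an instance whose cache is empty never overwrites it, so all fields stay at zero until some instance holds data.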