Compare commits

...

300 Commits

Author SHA1 Message Date
Max Kotliar
d4208f2cde fix 2026-01-29 16:44:44 +02:00
Max Kotliar
cce1b50df5 app/vmselect/promql: Add test that demo unstable sort behavior
Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10189

Debug notes
https://github.com/VictoriaMetrics/debug-notes/tree/main/gh10189
2026-01-29 16:39:54 +02:00
Hui Wang
1db7597e45 vmalert: disallow setting the -notifier.url command-line flag to a null value
Previously, running a vmalert with an empty notifier.url does not produce an error and leads to vmalert which will never send a notification successfully.

 This commit properly validates notifier.url empty value.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10355
2026-01-28 14:09:18 +01:00
Artem Fetishev
23fe7db35c lib/storage: follow-up for making searchAndMerge profile-friendly
Follow-up for c705da74f6
2026-01-28 14:07:26 +01:00
Hui Wang
817f2dc9e7 app/vmselect/promql: fix gaps at changes() functions
After changing the scrape interval from a smaller value (e.g., 30s) to a larger value (e.g., 60s), the changes() function starts to yield non-zero values even when the underlying values have not changed.

 This commit keeps unchanged series values when a large gap occurs between samples or when the scrape interval decreases.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10280
2026-01-28 14:06:51 +01:00
Max Kotliar
731ba17962 docs: Update vmctl flags in docs with a command (#10357)
### Describe Your Changes

The commit extends make docs-update-flags command so it updates vmctl
flags as well. It creates one md file with global flags and several
files per supported mode.


### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-28 14:12:45 +02:00
Max Kotliar
bb163692ba docs: add avilable_from to request body buffering vmauth doc
Follow-up for
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10310 and
e31abfc25c
2026-01-28 12:50:25 +02:00
Nikolay
952ef51cd1 lib/fs: properly check for partially deleted directories (#10342)
Commit 83da33d8cf introduced a check to
detect directories partially removed via IsPartiallyRemovedDir.

However, the check was performed using the full path, while de.Name()
returns only the current entry name (without the path). As a result, the
check always succeeded and the function did not behave as intended.
2026-01-28 10:30:35 +01:00
Nikolay
1fc548b63a lib/fs: add fs.disableMincore flag
This flag allows disabling the mincore() syscall introduced in
50fc48ac47. On older ZFS filesystems,
mincore() may trigger a bug related to ZFSÕs own in-memory cache. Mixing
reads from mmap()ed files and direct disk reads can corrupt the ZFS ARC
cache and lead to data read corruption.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10327
2026-01-27 20:29:01 +01:00
Nikolay
aa5236877c lib/storage: properly aggregate per IndexedDB cache stats
Commit f62893c151 added an attempt to fix
stats for `tagFiltersCache`, `metricIDCache`, and `dateMetricIDCache`.
Instead of aggregated stats, it returned the largest cache stats by
cache size.

This resulted in possible counter decreases for counter metric types. It
made aggregated metrics less usable.

This commit changes cache stats aggregation by metric type:
* size-related gauge metrics are returned based on max cache size usage
* metric counters are reported as a sum of all counters

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10275
2026-01-27 20:27:41 +01:00
Artem Fetishev
c705da74f6 lib/storage: make pt and legacy idbs visible in golang profiles
Rewrite the searchAndMerge so that golang profiles could show exactly
how much resources is consumed by each idb type.
2026-01-27 20:26:39 +01:00
Aliaksandr Valialkin
2e9bda2bff lib/{mergeset,storage}: add a comment explaining why the strange construct with anonymous function is needed
This is a follow-up for the commit 2a0e382a99

Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/1020
2026-01-27 19:44:49 +01:00
Jiekun
e1413536fc chore: add build version information to the home page for consistency with other projects
The build version added to:
- victoria-metrics
- vmagent
- vmalert

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10249

Co-authored-by: Hui Wang <haley@victoriametrics.com>
Signed-off-by: Zhu Jiekun <jiekun@victoriametrics.com>
2026-01-27 18:28:15 +02:00
Jayice
1a438a04ba introduce new alert for vmagent persistenqueue capacity 2026-01-27 18:14:02 +02:00
Aliaksandr Valialkin
4ad47d6fe3 docs/victoriametrics/README.md: remove obsolete docs about staleness markers during deduplication after the commit 7bd5d19f62
Staleness markers are ignored on the deduplication interval if there are other numeric samples exist on that interval.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10196
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5587
2026-01-27 16:08:45 +01:00
Aliaksandr Valialkin
879443f915 lib/storage/dedup.go: remove obsolete comment from DeduplicateSamples - it doesnt keep stale NaNs on purpose after the commit 7bd5d19f62
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5587
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10196
2026-01-27 16:08:44 +01:00
Max Kotliar
bd6788cb8f docs/changelog: fix ordering after merging pr.
related pr https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10320
2026-01-27 16:37:49 +02:00
Jayice
22696f378c lib/promscrape: apply promscrape.maxScrapeSize to decompressed data
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9481
2026-01-27 16:30:38 +02:00
Artur Minchukou
7205f479aa app/vmui: fix build of vmui by handling playground env variable correctly (#10354)
### Describe Your Changes

Fixed build of vmui by handling playground env variable correctly.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-27 16:24:18 +02:00
Yury Moladau
5c7031c000 vmui: fix "Percentage from total" for multiple metrics in Cardinality Explorer (#10323)
### Describe Your Changes

In the Cardinality Explorer, when filtering, a "Percentage from total"
stat appears. This stat is documented as "the share of these series in
the total number of time series".

This works for pages for individual metrics. However, if using a filter
that returns *multiple* metrics, the value of "Percentage from total"
will only account for the size of the *first* metric. One can have a
filter that returns, say, 10k time series (out of, say, 100k in the VM
cluster), and if the first metric returned has 1k time series, then
"Percentage from total" will show 1%, not 10%.

This PR fixes that calculation.

Credits to @PleasingFungus for the original fix (PR #10288).

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: PleasingFungus <PleasingFungus@users.noreply.github.com>
2026-01-27 15:37:34 +02:00
Artur Minchukou
a227128467 app/vmui: move node from ci to docker and update build steps (#10299)
### Describe Your Changes

Moved node from CI to make command and update build steps.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-27 15:23:35 +02:00
Nikolay
777c8913b3 follow-up after e35a9a366c
Commit e35a9a366c changed the order of wg.Add calls in the Graphite transform package. Previously, all wg.Add calls were made upfront, but after that change it became possible for wg.Wait to exit earlier than expected.

This commit fixes the issue by spawning all background goroutines first and starting the goroutine that calls wg.Wait afterward.
2026-01-27 13:50:07 +01:00
Aliaksandr Valialkin
e35a9a366c all: consistently use sync.WaitGroup.Go() instead of sync.WaitGroup.Add(1) + sync.WaitGroup.Done()
This improves code readability a bit.
2026-01-27 00:29:47 +01:00
JAYICE
6bbc03ecf8 app/vmagent: support configuring different -remoteWrite-queues per url
Previously vmagent had remoteWrite.queues as a global setting that was be applied to every persistentqueue. However, it could be useful to specify remotewrite.queues per remotewrite.url.

Considering each rw might have different workload(latency, throughput, and availability), so it will be more flexible for tuning if we can set remoteWrite.queues separately for specific rw.

This commit, makes `-remoteWrite-queues` configurable per remoteWrite.url. 

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10270
2026-01-26 20:09:35 +01:00
Max Kotliar
ca34ae48b4 docs/changelog: chore changelog
- rename `these docs` link to a more explisit link
- Add thank you for contribution.
2026-01-26 18:45:04 +02:00
Max Kotliar
f18fd37433 docs: run make docs-update-flags 2026-01-26 18:43:35 +02:00
Zhu Jiekun
f191a052dc lib/promscrape: ceiling the last scrape size
ceiling the last scrape size as an integer in bytes or kilobytes to
avoid misleading dots.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10307
2026-01-26 12:46:24 +01:00
Max Kotliar
0fdd5cb435 app/vmauth: fix backend healthcheck for url prefixes defined inside url_map
Previously health checks for url prefixes defined inside `url_map` were
not properly stopped. See STR in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10334#issuecomment-3791401822

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10334
2026-01-26 11:47:43 +01:00
f41gh7
76dd8f4adb lib/storage: properly search searchTenantsOnDate
Initial implementation of searchTenantsOnDate used a index scan for the given prefix (index prefix + tenant + date).
It did not check whether the date prefix was actually outside the current date.

This commit adds the missing date check and makes the tenant search results accurate.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10295
2026-01-26 11:35:27 +01:00
Aliaksandr Valialkin
e31abfc25c app/vmauth: allow buffering request body before proxying it to the backend
This should help reducing load on backends when many concurrent clients
send requests over slow networks (for example, when many IoT devices send metrics
to vmauth over slow connections).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10309

This commit is based on top of https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10310
Thanks to @makasim for the initial idea.
2026-01-26 03:02:32 +01:00
Aliaksandr Valialkin
ac6d9d632f app/vmauth: properly increment vmauth_user_concurrent_requests_limit_reached_total and vmauth_unauthorized_user_concurrent_requests_limit_reached_total metrics when the request is rejected because of the concurrency limit
These metrics must be incremented when the request couldn't be processed because of the configured per-user concurrency limit.
The commit 76176ac1d3 moved the counter increase to the place when the current request
is put in the wait queue because of the concurrency limit is reached. This is incorrect, since such requests
can still be successfully processed during -maxQueueDuration . This also contradicts the docs at https://docs.victoriametrics.com/victoriametrics/vmauth/#concurrency-limiting

There is a small practical sense in counting the number of times the concurrency limit is reached,
while the request is successfully processed during the -maxQueueDuration after that.

Add missing alerting rule for rejected unauthorized requests because of the concurrency limit.

Add missing grouping by instance for per-user counter of rejected queries because of the concurrency limit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
2026-01-25 21:43:44 +01:00
Aliaksandr Valialkin
e43de2a2b3 app/vmauth: put comments into the correct places after the commit 5f67f04f6b 2026-01-25 21:19:01 +01:00
Aliaksandr Valialkin
efe4a3b2dd vendor: update github.com/VictoriaMetrics/VictoriaLogs from v1.36.2-0.20251008164716-21c0fb3de84d to v0.0.0-20260125191521-bc89d84cd61d 2026-01-25 20:24:04 +01:00
Aliaksandr Valialkin
3bf5f0297b LICENSE: update the end copyright year from 2025 to 2026 2026-01-25 20:14:07 +01:00
Aliaksandr Valialkin
5632ccc64a lib/logger: count both printed and suppressed logs at vm_log_messages_total metric
This simplifies troubleshooting by investigating the vm_log_messages_total metric
when logs are unavailable. The logs may be unavailable when the -loggerLevel command-line
flag is set to value other than INFO. The logs may be unavailable when clients
use Monitoring of Monitoring service ( https://victoriametrics.com/products/mom/ ),
which provides metrics, but doesn't provide logs from VictoriaMetrics components
running at the client side.

Add `is_printed` label to the `vm_log_messages_total` metric in order to detect whether
the given log has been suppressed or printed.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10304

While at it, make more readable the description for the TooManyLogs alert,
which is based on the vm_log_messages_total metric.
Also return back the `level!="info"` instead of `level="error"` filter
in the query for this alerting rule, in order to be consistent with queries
at the official dashboards for VictoriaMetrics components.
TODO: investigate too high warnings rate at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2760
and fix it at the source of these warnings instead of modifying the query
for the TooManyLogs alert.
2026-01-25 17:43:20 +01:00
Nikolay
446452857c lib/storage: tsdb stats fallback to legacy idb
Add fallback to legacy indexDB for stats search

After introducing the new partition index
(f97f627f79), storage stopped returning
stats for date ranges outside the partition index. This made the
migration backward incompatible, as there was no way to retrieve stats
for dates prior to the migration.

This change adds a fallback to the legacy indexDB search when the status
search on the current partition index returns zero series.

This is an imperfect solution: due to tag filters, the TSDB status
search may legitimately return empty results. However, the additional
overhead is small and acceptable.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10315
2026-01-23 16:43:13 +01:00
Max Kotliar
4ff409eb27 docs/changelog: mention already fixed bug fix in vmauth
The bug was fixed in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10233 before we
realized it was a bug, at that time we considered it as improvement - do
less retries. But later in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10318 we
realized that we actually fixed a bug.

Adding postfactum a record about bugfix to the changelog
2026-01-22 16:57:58 +02:00
Phuong Le
1b7f0172d2 fsutil: fix a typo related to default concurrent goroutines working with files
s/265/256
2026-01-20 21:46:38 +01:00
Andrii Chubatiuk
1c77ee9527 app/vmui: removed anomaly ui (#10316)
The vmanomaly has been moved to a separate repository. This means that the functionality related to vmanomaly is no longer needed in the app/vmui located in the VictoriaMetrics repository.

This commit removes all the functionality and unnecessary abstractions related to vmanomaly from the app/vmui repository. This should help improving long-term maintenance of the code.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9755
2026-01-20 21:46:09 +01:00
Phuong Le
2a0e382a99 lib/storage, lib/mergeset: avoid deadlock on panic while merging
Related to
https://github.com/VictoriaMetrics/VictoriaLogs/issues/1020#issuecomment-3763912067
2026-01-20 21:43:12 +01:00
Max Kotliar
02c8ea5a48 docs/changelog: fix typo in security upgrade 2026-01-20 21:53:52 +02:00
Aliaksandr Valialkin
34f242a6b8 vendor: run make vendor-update 2026-01-19 15:29:25 +01:00
f41gh7
bc8f6c5688 docs: point examples to the v1.134.0 release 2026-01-19 14:28:00 +01:00
f41gh7
c0fe67c2db docs: cut LTS releases v1.110.28 and v1.122.13
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2026-01-19 14:26:36 +01:00
Fred Navruzov
ede1c2cde9 docs/vmanomaly: release v1.28.5 (#10311)
### Describe Your Changes

- Adjusted vmanomaly docs for v1.28.5
- Added missing `server` page at /anomaly-detection/components/server

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-17 21:52:48 +02:00
Aliaksandr Valialkin
ad34a5eb53 lib/protoparser/protoparserutil: reduce memory usage in ReadUncompressedData() when processing big number of incoming connections
Wait for the first byte from the reader passed to ReadUncompressedData()
before obtaining concurrency token from -maxConcurrentInserts and before allocating
buffers needed for reading the request body in memory.
This should limit the amounts of memory needed for processing a big number of concurrent
HTTP requests via Prometheus remote_write protocol and via other HTTP-based data ingestion
protocols where every request contains a single block of data to process.
Now the maximum memory usage is limited by -maxConcurrentInserts, while the server
can process much more than -maxConcurrentInserts concurrent HTTP requests by pausing the excess requests.

Previously the memory usage wasn't limited by -maxConcurrentInserts, since buffers for reading the data
from concurrent connections were allocated before obtaining the concurrency token from -maxConcurrentInserts.

While at it, use protoparserutil.ReadUncompressedData() in lib/protoparser/promremotewrite/stream.Parse()
for the sake of consistency across parsers for protocols, which send the full block of data per every incoming HTTP request.

This is a follow-up for the commit d107dee9c7
2026-01-17 15:49:53 +01:00
f41gh7
eaf7a68c92 CHANGELOG.md: cut v1.134.0 release 2026-01-16 20:49:31 +01:00
Max Kotliar
c5e43e1c91 docs: use canonical link 2026-01-16 19:09:49 +02:00
f41gh7
b343f541f0 make vmui-update 2026-01-16 16:46:26 +01:00
f41gh7
a23a902953 deployment: update Go builder from v1.25.5 to v1.25.6
See https://github.com/golang/go/issues?q=milestone%3AGo1.25.6%20label%3ACherryPickApproved
2026-01-16 16:26:27 +01:00
Hui Wang
54c60706ca lib/streamaggr: prefer numerical values over stale markers when sample share the same timestamp during deduplication (#10300)
follow up
7bd5d19f62,
apply the same logic in stream aggregation.
2026-01-16 16:14:09 +01:00
Nikolay
cd2e11b7cf lib/storage: increase rotation time for daily metricID cache
This is follow-up for c5713a09d3

Originally, dateMetricID cache was fully rotate every 20 minutes. It
made daily-index pre-creation less efficient and caused CPU usage spikes
for index records lookup at midnight.

storage pre-fills index records for the next day in 1 hour before night.
But this rotation made only last 20 minutes before midnight visible in
the cache.

 This commit changes rotation period from 20 minutes to 2 hours ( 1 hour
 tick interval). While
it could slighlty increase cache memory usage ( in practice it shouldn't
be noticeable). It prevents from CPU usage spikes.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10064
2026-01-16 16:12:59 +01:00
Aliaksandr Valialkin
5423d5e93a docs/victoriametrics/relabeling.md: add an alias seen in wild - https://docs.victoriametrics.com/victoria-metrics/relabeling/
Google sends users to this alias according to the report on 404 pages.
2026-01-16 15:46:55 +01:00
Aliaksandr Valialkin
48819b6781 docs/victoriametrics/CaseStudies.md: added alias for this page seen on the Internet - https://docs.victoriametrics.com/casestudies.html
Google sends users to this url according to the report on 404 pages.
2026-01-16 15:32:40 +01:00
JAYICE
c4bff27f46 lib/storage: properly search for LabelNames and LabelValues
Issue was introduced at d6ef8a807b commit.

Due to variable shadowing, if filter matched more than 100_000 metricIDs, it's fallback to the indexDB scan.
But because of type, `filter` value was not properly updated. And it triggered incorrect results.

 This commit fixes this typo and adds test to verify this case.


fixes  https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10294
2026-01-16 13:53:29 +01:00
Max Kotliar
432b313a48 docs: cleanup changelog a bit before release 2026-01-16 12:51:08 +02:00
Haley Wang
7bd5d19f62 lib/storage: prefer numerical values over stale markers when samples share the same timestamp during deduplication
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10196

Prefer the non StaleNaN  value when both StaleNaN and non-StaleNaN samples share the timestamp during deduplication(downsampling). The scenario can occur when:
1. Multiple vmagent instances scrape the same target(without -promscrape.cluster.name flag), one instance fails to scrape due to issues such as network, while others succeed.
2. Multiple vmalert instances evaluate the same recording rule, with one instance receiving a partial response while others receive a complete response.

In both cases, since the samples share the same timestamp and represent the metric state at that moment,
the non-StaleNaN value is entirely valid, whereas the StaleNaN could be caused by other unknown issues.
Therefore, it is reasonable to prioritize the non-StaleNaN value.
2026-01-16 09:25:11 +02:00
Haley Wang
8d18bc288f vmselect: use the last 20 raw samples to auto-calculate the lookbehind during range query
Previously, the first 20 raw samples were used for calculation.
But compare to the first 20 samples, the last 20 samples represent the latest state of the metrics,
so the lookbehind window calculated from them should be more accurate when applied to the most recent samples,
resulting in better query results for recent time ranges.

For example,if the scrape interval changes at day4, and the query range is set to last 7 days.
Applying the window derived from the first 20 samples(the old scrape interval) to new samples could result in consistently incorrect results from day4 through day11.
Conversely, applying the window derived from the last 20 samples (the new scrape interval) could lead to incorrect results for [day0-day4),
which are old states and generally less important.

This pull request does not address any specific bug, but change the general behavior, so there is no changelog.

Inspired by https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10280, but not the fix for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10280.
2026-01-16 09:00:51 +02:00
Nikolay
ff6e5c2983 app/vmstorage: reduce default value for storage.vminsertConnsShutdownDuration
This commit reduces default value for
`storage.vminsertConnsShutdownDuration` flag from `25s` to `10s`
seconds.
This change should help to reduce probability of ungraceful storage
shutdown at Kubernetes based environments, which has 30 seconds default
graceful termination period value (terminationGracePeriodSeconds).

Related issue
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10063
2026-01-15 17:26:12 +01:00
Yury Moladau
23af0086d8 app/vmui: fix heatmap rendering for uniform or sparse histogram buckets (#10292)
* Fixed a heatmap crash that happened when all visible cells had the
same value (division by zero produced invalid color indices).
* Improved how histogram buckets are chosen for display when the data is
very sparse, so the heatmap doesn’t look empty or drop the only
meaningful bucket.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10240
2026-01-15 16:55:24 +01:00
Yury Moladau
8657470068 app/vmui: bump package versions (#10291)
### Describe Your Changes

Updated project dependencies to the latest versions.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2026-01-15 16:53:23 +02:00
Aliaksandr Valialkin
3f16bc7cb2 docs/victoriametrics/Articles.md: add https://www.keyvalue.systems/blog/kubernetes-observability-with-victoriametrics-loki-grafana/ 2026-01-15 13:53:26 +01:00
Aliaksandr Valialkin
655a0eb0c3 app/vmstorage/main.go: typo fix after the commit 7cbd2a8600: partition -> snapshot 2026-01-15 12:50:24 +01:00
Aliaksandr Valialkin
7cbd2a8600 app/vmstorage: delete just created snapshot if the client canceled the request for creating the snapshot
It is better to delete the snapshot, since the client is no longer interested in it.
This should prevent from creating many unused snapshots when clients cancel creating snapshots
because of timeouts. This is the real production case from one of VictoriaMetrics users:
the disk IO subsystem became very slow, so creating a snapshot took a lot of time, so vmbackup
was canceling creating the snapshot because of the timeout. But vmstorage was still continue
creating the snapshot. This resulted in the increasing number of created but unused snapshots.
2026-01-15 12:36:48 +01:00
Max Kotliar
5f67f04f6b app/vmauth: measure client cancelled requests
Without measuring this, we have a blind spot. Exposing it as a metric
improves visibility and should save time during future debugging
sessions.

Inspired by review commit
c9596a0364 (r173621968)
2026-01-15 12:13:35 +01:00
Nikolay
2056e5b46d lib/mergeset: do no cache inmemoryBlock with single item
indexDB mergeset has an edge for single item inmemoryBlock. It stores
such items blocks in-memory at blockheader firstItem. So there is no
need to perform on-disk read operations and storing copy of it at cache.

 It also may result in incorrect search results, inmemoryBlock with a
 single item has always zero index block offset. Which causes collisions
if it's cached with the next index block at part.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10239
Probably fixes
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10063
2026-01-15 12:12:08 +01:00
Hui Wang
4d1f262ec4 vmalert: add support for $isPartial variable in alerting rule annotation templating
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4531
2026-01-15 12:10:07 +01:00
Vadim Rutkovsky
afca599a46 app/vmauth: Ffx typo in auth config warning message 2026-01-15 12:09:36 +01:00
Yury Moladau
d667f694bc app/vmui: fix tenant ID handling via URL path (#10287)
**Problem**

* VMUI had two tenant ID sources:

  * URL path: `/select/<accountID>/vmui/`
  * Query param: `tenantID`
* These could differ, causing confusion and inconsistent behavior.

**Solution**

* Removed the legacy `tenantID` query parameter.
* Use the URL path as the single source of truth for tenant ID.
* Changing the tenant in the UI now updates the URL path.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10232
2026-01-15 12:05:12 +01:00
Aliaksandr Valialkin
fe2c60c79b dashboards: follow-up for the commit 36460f6297
Use $__range duration instead of 1h duration for the 'Retention errors' stats panel
in the similar way it was done in the commit 36460f6297
for the 'Backup errors' stats panel.

While at it, run `make dashboards-sync` in order to sync the dashboards
in the dasbhoards/vm/ folder. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards/README.md
for details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10279
2026-01-14 23:30:08 +01:00
Stephan Burns
36460f6297 Make stats panel use the range specified in grafana (#10279)
### Describe Your Changes

The Backups errors panel uses a hard coded rate, when looking over a
large period of time this number would likely stay low do to the hard
coded rate when in reality the amount of errors is much larger.

This change addresses this by using the __rate variable in Grafana so
the rate will align with the date/time range in Grafana.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-14 23:21:16 +01:00
Aliaksandr Valialkin
d107dee9c7 lib/writeconcurrencylimiter: remove Reader.DecConcurrency() method
Call decConcurrency() inside Reader.Read() before calling the Read() at the underlying reader.
This reduces chances of improper use of the writeconcurrencylimiter.Reader by callers.

While at it, move the creation of writeconcurrencylimiter.GetReader() to the top of stream parser functions
at lib/protoparser/* packages, and call incConcurrency() inside GetReader() call.
This reduces the frequency of decConcurrency() / incConcurrency() calls
for typical buffered reads when parsing the incoming data. This, in turn,
reduces the contention on the concurrencyLimitCh.
2026-01-14 22:55:17 +01:00
Max Kotliar
b33d7c3ef9 dashboards: remove timezone from vmagent dashboard
The bug introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10267 and breaks
helm charts customization, see discussion
415ff27c74 (r174600675)
2026-01-14 13:28:35 +02:00
JAYICE
d3848f6802 vmagent: fix calculation of vm_persistentqueue_free_disk_space_bytes (#10271)
### Describe Your Changes

follow up https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10242,
see discussion in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10267#issuecomment-3729577415
for more context

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-13 20:12:31 +02:00
Jayice
415ff27c74 dashboards: add Persistent queue Full ETA panel to the Drilldown section in vmagent dashboard 2026-01-13 20:03:43 +02:00
Max Kotliar
90f59383b2 docs: Add docs-update-flags step to release. (#10284)
### Describe Your Changes

Previously, we had to manually update flags in documentation whenever we
made flag-related changes in source code. Someone did it by hand, others
compiled enterprise binaries, executed them with `-help` flag, and
copied and pasted output to the documentation.

In https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9632 a
command `make docs-update-flags` was introduced. It automates the whole
process. It compiles binaries, runs `-help,` and syncs output to changes
automatically.

Now, we can **omit updating doc flags in the PR** and do it once before
releasing a new version.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-13 17:09:49 +02:00
Fred Navruzov
8fec7005d0 docs/vmanomaly-release-v1.28.4 (#10283)
### Describe Your Changes

Docs upgrades, including v1.28.4 adjustments, some diagrams refinement
and deprecations

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-13 15:32:03 +02:00
Max Kotliar
4d42b291e5 docs: run make docs-update-flags 2026-01-13 10:56:14 +02:00
Max Kotliar
50f4fbf28e lib/flagutil: Add explicit month duration unit (M) for -retentionPeriod.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10181
2026-01-13 10:37:02 +02:00
Max Kotliar
a5da6afb88 docs: run make docs-update-flags 2026-01-13 10:28:16 +02:00
Max Kotliar
71f9e7f2c4 app/vmctl: fix link to documentation
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10268

Co-authored-by: Danijel Tasov <data@consol.de>
Signed-off-by: Danijel Tasov <data@consol.de>
2026-01-12 20:53:35 +02:00
Max Kotliar
eb7c5df65e dashboards: run make dashboards-sync 2026-01-09 19:07:03 +02:00
Max Kotliar
5af493297a docs/changelog: move 2025 changes to CHANGELOG_2025.md, create CHANGELOG_2026.md 2026-01-09 13:24:47 +02:00
JAYICE
2f61fa867e Makefile: Move enterprise-only flags to a separate block in document (#10241)
### Describe Your Changes

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10218.

Improve `docs-update-flags` command to move enterprise-only flags to a
separate block.

<img width="936" height="964" alt="image"
src="https://github.com/user-attachments/assets/f96a3515-4acc-4a65-94b1-55e01fab6e25"
/>


### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-08 19:42:33 +02:00
Max Kotliar
729b1099d8 dashboards: Enhance VictoriaMetrics - single-node dashboard stats raw. (#10260)
### Describe Your Changes

Currently, the stats are small and hard to read (see screenshot in the
PR). In addition, the version and uptime panels work well for a single
vmsingle, but become inconvenient when multiple instances are present,
since only one is visible.

This PR changes the version and uptime panels from single stat to time
series, aligning them with the VictoriaMetrics – cluster dashboard. It
also enlarges the remaining stats so the values are easier to read,
consistent with the cluster dashboard (see screenshot in the PR).

Follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10187 and
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10132

Before:
<img width="1512" height="364" alt="Screenshot 2026-01-07 at 21 38 17"
src="https://github.com/user-attachments/assets/8d8baa86-b31b-4c58-ae22-cef94a1607e6"
/>

After:
<img width="1512" height="670" alt="Screenshot 2026-01-07 at 22 07 10"
src="https://github.com/user-attachments/assets/9e60596d-72ec-4060-af11-a69ce554d3b1"
/>

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Hui Wang <haley@victoriametrics.com>
2026-01-08 13:49:46 +02:00
Yury Moladau
945ca569b9 app/vmui: add localStorage availability checks
* Added browser `localStorage` availability checks with user-facing
error reporting.
* Introduced `VMUI:`-prefixed `localStorage` keys to avoid key
collisions.
* Added migration logic for existing unprefixed `localStorage` keys.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10085
2026-01-08 11:13:38 +01:00
Hui Wang
7fb8a8a0b2 vmalert: skip alert annotation templating in replay mode
In alerting rules, annotations are only attached to alert messages that
are sent to the notifier (such as Alertmanager). These annotations
typically contain human-readable information, such as instructions for
resolving the alert.

In [replay
mode](https://docs.victoriametrics.com/victoriametrics/vmalert/#rules-backfilling),
vmalert does not send alert messages to the notifier at all(no notifier
is configured), as these alerts are outdated. Therefore, it does not
need to template the annotations in this mode.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10262
2026-01-08 11:09:21 +01:00
JAYICE
89f95f74ed vmagent: add metric for persistentqueue capacity
This commit adds new metric `vm_persistentqueue_free_disk_space_bytes`, which helps
to track free space for persistent queue.

part of implementation for
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10193
2026-01-08 11:07:28 +01:00
Hui Wang
46e13fe0ca vmselect: expose vm_rollup_result_cache_requests_total metric
which tracks the number of requests to the query rollup cache


As described in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10117, when
retrieving cached data from the rollup result cache, there can be mixed
`get()` and `getBig()` calls to the underlaying fastcache. And it's
unpredictable how many times `getBig()` will call `get()`, so the
metrics from fastcache cannot be used to indicate query cache miss
ratio.
Exposing a new counter `vm_rollup_result_cache_requests_total` to track
the number of requests to the query rollup cache, together with the
existing `vm_rollup_result_cache_miss_total`, allows for monitoring the
rollup cache miss rate per query (or subquery), which is more
user-facing.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10117
related to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5056
2026-01-08 11:05:25 +01:00
Fred Navruzov
50d8ad6733 docs/vmanomaly-release-v1.28.3 (#10258)
### Describe Your Changes

Docs update for vmanomaly v1.28.3 release + `retention` doc section for
model artifacts

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-07 20:40:32 +02:00
JAYICE
3b8550adb1 dashboard: refine vmsingle dashboard and align it to vmcluster dashboard (#10187)
### Describe Your Changes

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10132

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-01-07 12:58:49 +02:00
Max Kotliar
1708b73312 lib/promscrape: show (N/A) instead of hiding target response link when original labels are dropped (#10244)
Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10237,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9901

When `-promscrape.dropOriginalLabels=true` is enabled, original target
labels are unavailable. These labels are required to compute the target
ID used by the /target_response endpoint, so the response link cannot be
generated. See

7a5003212e/lib/promscrape/targetstatus.qtpl (L236)

Previously, the link silently disappeared from the UI. Now the UI shows
(N/A) Not available, explicitly indicating that required data is
missing.
2026-01-06 12:31:18 +01:00
f41gh7
57defe7ab4 apptest: add zabbixconnector integration test
follow-up for 859435a8df
2026-01-06 12:27:01 +01:00
Sinotov Vladimir
d58cfb7f36 app/vminsert: properly route zabbixconnector requests
Previously VictoriaMetrics: Single-node version used

`http://<victoriametrics-addr>:8428/zabbixconnector/api/v1/history`
resulted in a missing path error. The issue was introduced during changes back-porting from vmagent.


Additionally, the http response was fixed. Zabbix expects a 200 status
code during normal operation.

 Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10214
2026-01-06 11:17:09 +01:00
Cancai Cai
a244750bc6 doc/table: fix typo (#10243)
### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: cancaicai <2356672992@qq.com>
2026-01-05 21:42:01 +02:00
Max Kotliar
f06e7f9a6e app/vmagent: replace go.yaml.in/yaml/v3 package with gopkg.in.yaml.v2
It address the comment:
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10213/files#r2662305818

The reasons:
- It was decidede to use v2 for now and do not upgrade to v3.
- The later package is used in more places so it is better to use it
here too.
2026-01-05 21:34:08 +02:00
Artem Fetishev
7a5003212e docs: bump VictoriaMetrics components version
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-05 19:22:53 +01:00
Artem Fetishev
846392405e deployment/docker: bump VictoriaMetrics component version
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-05 19:18:49 +01:00
Artem Fetishev
37c3d8c26b CHANGELOG.md: fix issue links
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-05 19:15:50 +01:00
Artem Fetishev
8bc0475ee7 docs: update LTS releases
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-05 18:42:21 +01:00
Zhu Jiekun
89414062bf bugfix: allow reloading when init with empty remote write relabeling flags (#10213)
### Describe Your Changes

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10211

This pull request adds `flagSet bool` field to `relabelConfigs` struct.
And use this flagSet value as the result of `isSet()` function.

The reloading should be available when at least one of the command-line
flags `-remoteWrite.relabelConfig` / `-remoteWrite.urlRelabelConfig` is
set.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Hui Wang <haley@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-01-05 12:53:52 +02:00
JAYICE
67c51b009d document: guide users to use --data-binary in curl when import multi lines influx data (#10198)
### Describe Your Changes

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10165.

Refer to [curl docs](https://curl.se/docs/manpage.html#--data).

> When --data is told to read from a file like that, carriage returns,
newlines and null bytes are stripped out

If users import multiple lines of data in file via `/api/v2/write`, he
may follow the example we gave to use `-d` to instruct curl, then
newlines will be stripped out, hence the parse error in VictoriaMetrics.

It's not VictoriaMetrics' bug, but it will be better to guide users to
use `--data-binary` just like how
[/api/v1/import](https://docs.victoriametrics.com/victoriametrics/url-examples/#apiv1import)
did.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Zhu Jiekun <jiekun@victoriametrics.com>
2026-01-05 10:54:47 +02:00
Artem Fetishev
e8160fc8fb CHANGELOG.md: cut v1.133.0 release
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-02 12:17:38 +00:00
Artem Fetishev
e3a4ceaef3 deployment/docker: upgrade base docker image (Alpine) from 3.22.2 to 3.23.2
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-02 10:22:41 +00:00
Andrii Chubatiuk
e9cedca8c8 docs: replace old grafana datasource page with links to a new one (#10231)
### Describe Your Changes

fixes https://github.com/VictoriaMetrics/vmdocs/issues/192

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-01 18:54:52 +02:00
Andrii Chubatiuk
b720e55c13 vmsingle: properly proxy requests to all supported vmalert paths (#10179)
### Describe Your Changes

modify initial request path before sending request to vmalert with a
proper value
sync vmalert proxy implementation with one in cluster branch
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10178

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Hui Wang <haley@victoriametrics.com>
2026-01-01 16:28:55 +02:00
Artem Fetishev
ab1429c896 lib/storage: fix tagFiltersCache stats collection (#10230)
Since the cache may be reset too often, using the sizeBytes as an
indicator that this is the first met indexDB to collect tfssCache stats
is unreliable because it often can be zero all indexDB instances. Use
Requests metric instead because it is never reset.

Follow-up for #10204.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-31 13:35:54 +01:00
JAYICE
74b03c93a6 makefile: support vmauth in docs-update-flags command (#10222)
### Describe Your Changes

implement
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10221

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-12-30 19:14:06 +02:00
Max Kotliar
0e9bb5a42d docs: sync flags in docs with acutal binaries 2025-12-30 18:59:33 +02:00
Max Kotliar
f1a88e57cf docs/changelog: fix link to PR
follow up on
1792b6bd9a
2025-12-30 17:38:48 +02:00
Max Kotliar
76176ac1d3 app/vmauth: increase concurency limit reached before waiting in queue
Follow up on
c9596a0364 (r173413964)

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
2025-12-30 17:23:10 +02:00
Max Kotliar
c08adb31bb docs: remove available from placeholder from code block
The {{% available_from "#" %}} placeholder does not work inside code
blocks. Replacing it with hard coded value.

Introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10168.

See comment
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10168/files#r2651440620
for more details.
2025-12-30 16:10:55 +02:00
Artem Fetishev
b49b0471ef lib/storage: move legacy code to legacy files (#10215)
Follow-up for f97f627 (#8134)

The code was moved as is, no changes were made to moved code.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-30 13:16:24 +01:00
Artem Fetishev
13102045a7 changelog: update v1.132.0 release notes with a note on ungraceful shutdown
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-30 10:29:24 +01:00
Artem Fetishev
d226e5b95f lib/ingestserver: Actually close the first vminsert connection (#10224)
Since the first connection is not closed, the vmstorage will never
terminate gracefully which will cause the reset of all caches on the
start-up.

Follow-up for 244769a00d (#10136)

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-29 15:13:30 +01:00
Hui Wang
30bbb5660b docs: clarify recording rule labels do not support templating (#10186)
fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10183
2025-12-29 15:29:45 +02:00
Max Kotliar
1792b6bd9a docs/changelog: Add PR\issue links, fix typo in tip section 2025-12-29 12:58:07 +02:00
Artem Fetishev
f97f627f79 lib/storage: implement partition index (#8134)
This should reduce disk space occupied by indexDBs as they get deleted along
with the corresponding partitions once those partitions become outside the
retention window.

- Motivation: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7599
- What to expect: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8134

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Andrei Baidarov <baidarov@nebius.com>
2025-12-24 18:53:49 +01:00
Phuong Le
785c1fd053 issues/question-template: fix typos (#947) 2025-12-24 11:37:34 +01:00
Aliaksandr Valialkin
697bfd5cee app/vmauth: properly verify whether the request has been canceled by the client in handleConcurrecnyLimitError()
The `err` may contain information about request cancelation performed by the server code.
In such cases the error must be logged. The error must be ignored only if the client canceled the request.

This is a follow-up for the commit c9596a0364

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
2025-12-24 11:31:36 +01:00
Artem Fetishev
f0ac6d9ac9 lib/storage: log the beginning and end of saving metric name usage stats to file (#10205)
This is to debug cases when metric name tracker resets the tsid cache
after restart. It could be due vmstorage not having enough time to stop
gracefully. Logs should provide this info.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-23 17:25:43 +01:00
Artem Fetishev
f0b251d967 lib/storage: fix per-idb cache stats (#10204)
This fixes the following corner case: if all instances of a cache have
zero size, the stats won't be set at all. This results in some weird
graphs if the cache is reset very often (such as tfssCache): the cache
sizeMaxBytes alternates between the actual value and zero.

Follow-up for f62893c151

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-23 17:06:10 +01:00
Nikolay
c3346ae8fd app/victoria-metrics: properly add prometheus metrics metadata (#10192)
Commit 5a587f2006 was not properly ported
to the single node branch. Since single node is able to perform both
promscrape and self-scrape, it's required to add metadata add methods to
those paths.

 This commit fixes missing metadata add to the storage.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10175
2025-12-23 13:57:19 +01:00
Jinlin
0ffb3fdfce lib/storage: fix log typo 2025-12-23 13:50:40 +01:00
Zakhar Bessarab
4e234ccbd1 docs/enterprise: add description of license key update (#10194)
Describe Your Changes:

- describe options of updating the enterprise license key
- fix a few typos
2025-12-23 13:37:36 +01:00
Alexander Frolov
943589ca31 lib/promscrape: fix isAutoMetric to recognize all auto-generated metrics
Previously, `scrape_labels_limit` was missing from the check.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10197
2025-12-23 13:36:50 +01:00
Aliaksandr Valialkin
c9596a0364 app/vmauth: add -maxQueueDuration command-line flag for graceful handling of short spikes in the number of concurrent requests
Previously a short spike in the number of concurrent requests immediately led to `429 Too Many Requests` errors
when the number of concurrent requests exceeds -maxConcurrentRequests or -maxConcurrentPerUserRequests.

This commit allows processing short spikes in the number of concurrent requests during the -maxQueueDuration timeout.
The requests are rejected only if they couldn't be served accroding to the concurrency limits during the -maxQueueDuration.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10112
2025-12-22 16:39:01 +01:00
Aliaksandr Valialkin
e7b0a00493 app/vmauth: follow-up for the commit 7f689df824
- Introduce backendURLs struct, which holds all the backend urls and allows stopping
  all the health checkers across all the backend urls with a single call to backendURLs.stopHealthChecks().

- Immediately cancel the pending Dial call to the backend when backendURLs.stopHealthChecks() is called.
  Use lib/netutil.Dialer.DialContext() for this.

- Replace a fragile closing of stopHealthCheckCh channel via stopHealthCheckOnce.Do()
  with easier to maintain call of cancel() func for the corresponding healthChecksContext.

- Wait until health checker goroutines are finished before return from UserInfo.stopHealthChecks().
  Previously the health checker goroutines could run for some time trying to dial the backend
  after the return from UserInfo.stopHealthChecks().

- Try dialing the broken backend for https urls. It is better if the broken backend logs the error
  instead of routing client requests to the broken backend.

- Log dial errors to the broken backend, so users could troubleshoot the backend connectivity issue with more details.

- Refer the correct issue - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9997 -
  in the comments explaining why periodic dialing of the broken backend is needed.
  Previously the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9890 was incorrectly referred.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9997
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10147
2025-12-22 15:20:51 +01:00
Hui Wang
be0fe546e5 vmauth: skip a redundant request if all backends are broken with least_loaded policy (#10202)
similar to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10170
2025-12-22 13:06:12 +01:00
Hui Wang
13911db316 vmauth: add new counters to track the number of user request errors
follow up https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10177

Add `vmauth_user_request_backend_requests_total` and
`vmauth_unauthorized_user_request_backend_requests_total` which track
the number of user request errors, and aligned with
`vmauth_user_requests_total`.

The existing `vmauth_http_request_errors_total` currently only counts
requests with `invalid_auth_token`. Once authorization has passed, any
subsequent request errors are tracked under
`xxx_user_request_backend_requests_total`.
2025-12-22 13:05:54 +01:00
Artem Fetishev
0cb90f91fc lib/storage: follow-up for d9c07dbc0b (#10169) - fix changelog
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-19 08:44:10 +01:00
Alexander Frolov
bdf65dde88 app/vmagent: make sure vmagent_rows_inserted_total counts samples (#10191)
As vminsert does

4d9b69b5a6/app/vminsert/newrelic/request_handler.go (L68)

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10191
2025-12-18 16:37:37 +01:00
Max Kotliar
4d9b69b5a6 docs/changelog: add known issue note related to memory leak on OpenTelemetry parsing code. 2025-12-18 12:39:12 +02:00
Nikolay
692a9be5fa lib/storage: check indexDB refCount at MustClose
In order to gracefully stop indexDB, refCount must be checked during
storage graceful shutdown.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10063
2025-12-17 18:48:53 +01:00
Kirill Kobylyanskiy
c8742ab120 lib/promscrape: add global sampleLimit support
This commit introduces the global `sampleLimit` setting to restrict the number
of samples accepted per scrape target, mirroring the behavior of
Prometheus.

Motivation:
1) The existing `-promscrape.seriesLimitPerTarget` flag currently takes
precedence over any `sample_limit` setting defined directly on the
scrape target. The new `sampleLimit` implementation ensures that the
target configuration is able to override the global setting, allowing
users to define specific limits per target.
2) The existing series limit flag uses memory-intensive Bloom filters,
resulting in high RAM consumption under high-cardinality scraping
scenarios. The `sampleLimit` provides a much simpler, low-overhead
alternative.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10145
2025-12-17 18:47:05 +01:00
Aliaksandr Valialkin
b6f8128273 Makefile: update golangci-lint from v2.4.0 to v2.7.2
See https://github.com/golangci/golangci-lint/releases/tag/v2.7.2
2025-12-17 16:59:02 +01:00
Aliaksandr Valialkin
bed7cbd0a4 all: consistently use encoding.DecompressZSTD* instead of zstd.Decompress* across the codebase
The encoding.DecompressZSTD* consistently updates the vm_zstd_block_decompress_calls_total metric.

Also make the follwing improvements after the commit 10f7cd2ffc:

- Add encoding.DecompressZSTDLimited() function and use it instead of zstd.DecompressLimited,
  so it properly updates vm_zstd_block_decompress_calls_total metric.

- Clarify description for the encoding.DecompressZSTD* and zstd.Decompress* functions.
2025-12-17 16:48:06 +01:00
Artem Fetishev
d9c07dbc0b lib/storage: rotate dateMetricIDCache instead of resetting (#10169)
Currently, `dateMetricIDCache` is reset when it is full and it is never
reset is not full but the data it stores is no longer needed. This leads
to the following problems:
- During regular data ingestion the cache sizeBytes may exceed max
allowed size and the cache gets reset which may potentially slow down
data ingestion (see #10064)
- The cache is per-indexDB. This means that in partition index (#8134)
there will be as many instances of this cache as the number of
partitions. If someone performs a backfill across all partitions, this
will fill all caches and they will never get reset even if no more
historical data is ingested.

So the solution is to periodically rotate the cache. After first
rotation the data is not deleted but moved to `prev` storage. After
second rotation `prev` gets deleted. This gives the cache an opportunity
to restore the `prev` data if it is still in use. Based on #10167.

This PR also removes the introduced recently introduced
`-storage.cacheSizeIndexDBDateMetricID` flag (see #10135). This should
be safe since it is new and its use case is very niche, i.e. no one
would really use it.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-17 15:43:05 +01:00
Artem Fetishev
20ad9cd395 lib/storage: introduce metricIDCache
The cache serves the same purpose as `dateMetricIDCache` but is used for
caching metricIDs from global index.
The cache was introduces in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10167 and it has been decided to add it in a separate commit to reduce diff.

Related  PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10167
2025-12-17 13:31:11 +01:00
Hui Wang
8b3fe9cdec app/vmauth: add new counters to track the number of requests sent to backends
We have `vmauth_user_requests_total` and
`vmauth_unauthorized_user_requests_total` to track requests from the
user side. However, in scenarios such as request timeouts or when the
response code matches `retry_status_code`, a single request may be
retried across multiple backends.

Exposing counters `vmauth_user_request_backend_requests_total` and
`vmauth_unauthorized_user_request_backend_requests_total` that track the
number of requests sent to backends provides insight into the routing
logic and can help identify if requests are being consistently retried,
which may contribute to increased request duration.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10171
2025-12-17 13:27:08 +01:00
Hui Wang
e1e367b3cb app/vmauth: properly increment metric xxx_user_request_backend_errors_total
Currently, backendErrors may be counted twice if a request to the
backend fails due to context.DeadlineExceeded.

9bc7a17d80/app/vmauth/main.go (L328)

9bc7a17d80/app/vmauth/main.go (L294)

And we increment this counter in a way that is somewhat inconsistent.
Given that the counter's name is `xx_request_backend_errors_total`, it
should only increase when a backend request returns an error. This value
can exceed the user request error count if multiple backend requests
fail for a single user request.
The `xxx_request_backend_errors_total` counter should be used in
conjunction with the `xxx_request_backend_requests_total` introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10171.
2025-12-17 13:24:26 +01:00
Hui Wang
f40c6fcad1 app/vmauth: skip a redundant request if all backends are broken with first_available policy
There is no reason to send a request to the first backend if all
backends are marked as broken.
Also, 
>// getFirstAvailableBackendURL returns the first available backendURL,
which isn't broken.


The fix only skips a redundant request when all backends are
unavailable, it doesn't introduce any changes from user's perspective,
so I skipped changelog.
2025-12-17 13:22:37 +01:00
Aliaksandr Valialkin
b6bc186013 docs/victoriametrics/Articles.md: add https://developer-friendly.blog/blog/2024/06/17/unlocking-the-power-of-victoriametrics-a-prometheus-alternative/ 2025-12-16 15:46:23 +01:00
Aliaksandr Valialkin
9bc7a17d80 lib/protoparser/opentelemetry: typo fix: wince -> since
This is a follow-up for the commit 293d80910c
2025-12-15 20:13:45 +01:00
f41gh7
9ce548dcb5 docs: update release version to latest 2025-12-15 10:37:35 +01:00
f41gh7
82e583338d docs: update LTS releases 2025-12-15 10:34:43 +01:00
Aliaksandr Valialkin
19009836c7 vendor: update github.com/valyala/fastjson from v1.6.5 to v1.6.7 2025-12-14 23:09:43 +01:00
Max Kotliar
c2362ab670 docs: review links in changelogs 2025-12-12 19:43:15 +02:00
f41gh7
d04a42e846 make vmui-update 2025-12-12 12:50:13 +01:00
f41gh7
0d930dda16 CHANGELOG.md: cut v1.132.0 release 2025-12-12 12:45:34 +01:00
Artem Fetishev
e026215701 lib/storage: Document post-delete cache resets (#10158)
When the time series deletion is performed some of the storage caches
need to be reset but some not. This PR reviews all storage caches and
documents why there are reset or not and also places all the resetting
logic (and comments) in one place.
2025-12-12 11:11:30 +01:00
JAYICE
34a542c324 lib/storage: include last sample when query at the last millisecond of the day
One millisecond shouldn't be subtracted from the `tr.MaxTimestamp`, and
related test cases will be added

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9804
2025-12-12 11:01:06 +01:00
Fred Navruzov
ff0aaa38b7 docs/vmanomaly: release v1.28.2 (#10160)
### Describe Your Changes

Update docs and assets (visualizations) for /anomaly-detection section
with `v1.28.2` release

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-12-11 20:56:59 +02:00
Max Kotliar
0e2f0ac95f lib/protoparser/opentelemetry: fix typo in code
#
github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/pb
lib/protoparser/opentelemetry/pb/pb.go:1683:19: undefined: lctx

Bug introduced in
1dc71212f8
2025-12-11 18:37:04 +02:00
Max Kotliar
7f689df824 app/vmauth: validate backend with a dial check before marking it healthy (#10147)
### Describe Your Changes

Previously, a backend was considered healthy as soon as its
'bu.brokenDeadline' deadline expired, even if it was still unavailable.
This caused avoidable request failures and retries.

Now vmauth performs a TCP dial (1s timeout) before restoring the backend
to the healthy
pool. This avoids routing traffic to backends that are still down.

The dial check also covers cases where a route to the backend cannot be
resolved. Without this check, user requests would hang until the
connection timeout, leading to long waits
or errors. The new check fails fast and doesn't impact real user
requests.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9997


### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-12-11 18:26:59 +02:00
Max Kotliar
bd725bdd69 dashboards: add usseful links to dashboards
Dashboards:

- Add a link to proper docs section
- Add a link to troubleshooting page
- Add links to community and enterprise support

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9904
2025-12-11 18:11:07 +02:00
Aliaksandr Valialkin
712b7cfeeb lib/promscrape: allow scraping targets with responses equal to c.maxScrapeSize
Return "too big response size" error only for responses bigger than c.maxScrapeSize
(this option can be set either via max_scrape_size option inside scrape config
or via -promscrape.maxScrapeSize command-line flag).

Previously responses with sizes equal to c.maxScrapeSize were incorrectly rejected.
2025-12-11 16:15:46 +01:00
Aliaksandr Valialkin
1dc71212f8 lib/protoparser/opentelemetry/pb: reset the decoderContext.ls.Labels length to zero after clearing all the references to the original byte slice
This is a follow-up for 25f49e6f54
2025-12-11 15:39:32 +01:00
Aliaksandr Valialkin
25f49e6f54 lib/protoparser/opentelemetry: explicitly clear all the references to the underlying byte slice at decoderContext.ls.Labels up to its capacity
This should prevent from the excess memory usage because of dangling source byte slices
referred by decoderContext.ls.Labels.

This is a change similar to 63a68edb05
2025-12-11 15:33:13 +01:00
Max Kotliar
dcf9f0eb7b lib/promscrape: Add a warning to active targets panel if -dropOriginalLabels=true (some debug info not available)
Previously the original labels were preserved (
-dropOriginalLabels=false). In
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9772 the default
behavior was changed. Now vmagent\vmsingle drops origianal labels. The
change
created some confusion related to UI. For example, debug relabling
column is completly hidden when the labels are not available. It created
a steram of questions.

This commit adds a warning similar to one we have at "Discovered
targets" tab, and also always show the "Debug relabeling" column. When
there is not info for it "N/A" printed.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9901
Follow-up https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9772
2025-12-11 16:24:50 +02:00
Aliaksandr Valialkin
606382178b lib/protoparser/protoparserutil: do not store too big buffers to the pool at ReadUncompressedData if only a small part of the buffer is used last time
This should prevent from excess memory usage because of inefficiently used buffers.

This should help the case at https://github.com/VictoriaMetrics/VictoriaLogs/issues/869
2025-12-11 15:11:44 +01:00
Artem Fetishev
220249f023 lib/storage: use lrucache to implement tagFilters loops cache
The tagFilters loop cache is per-indexDB which means that currently
there are two instances, one for idbCurr and one for idbPrev. When the
partition index (#8134) is released, there will be as many instances of
this cache as there will be partitions.

The cache is implemented using workingsetcache. Which occupies at least
30MB even when unused. Given that only the latest indexDB is used most
of the time, a lot of memory can be wasted.

Therefore the cache implementation is changed to lrucache because it
does not consume memory when it is unused and also has timeout-based
eviction.

This is a follow-up for 4cd727a511
(https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10072).
2025-12-11 08:42:58 +01:00
Max Kotliar
c6731f964c dashboards: add memory usage breakdown panels into Drilldown sections
Right now we have two separate panels: RSS memory % usage and RSS
anonymous memory % usage. This makes trend comparison difficult because
one have to visually correlate two independent panels. Another problem
is that these panels don't show Go runtime allocations at all. The same
applies to memory allocated in C. There are allocations in C (zstd) one
should account for but there is no even a metric to expose it.

The commit adds Memory usage breakdown panel into Drilldown section. It
provides insight into Go Stack, Go Heap, Go Heap Released, Go Other,
Mmap: VM Cache, File cache memory distribution

It should help spot trends changes in memory by type or invistigate
issues such as #10069 and #10028 easier.

Panel info:
This panel shows memory usage by category.

How to use:
- Start from the high-level RSS panel.
- Identify an instance with unexpected or abnormal memory growth.
- Filter to that instance to inspect the detailed breakdown here.

Interpretation
- A steadily rising Go Heap usually indicates a memory leak. Collect
pprof memory profile.
- A growing Go Stack commonly points to a goroutine leak.

<img width="1508" height="628" alt="Screenshot 2025-12-08 at 13 18 44"
src="https://github.com/user-attachments/assets/0e794324-e86d-468e-b926-8bb11f5a2043"
/>
<img width="1503" height="674" alt="Screenshot 2025-12-08 at 13 19 34"
src="https://github.com/user-attachments/assets/62fc3fff-33b3-4dfe-ad3f-ad0526a8a606"
/>

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10139
2025-12-11 08:39:00 +01:00
Sinotov Vladimir
859435a8df lib/protoparser: added push data with zabbix connector (#6087)
Support receiving data from the Zabbix connector with API `/zabbixconnector/api/v1/history`

Labels:
    - The metric name is added to the `__name__` label.
    - Host name to `host` label.
    - Visible name  to `hostname` label.

The returned response complies with the requirements of the Zabbix

 See the following doc for connector [protocol](https://www.zabbix.com/documentation/current/en/manual/config/export/streaming).

Useful links:
- Zabbix Streaming to external systems
(https://www.zabbix.com/documentation/current/en/manual/config/export/streaming)
- Zabbix Newline-delimited JSON expor
(https://www.zabbix.com/documentation/current/en/manual/appendix/protocols/real_time_export)

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6087
2025-12-10 17:00:27 +01:00
Max Kotliar
5b12fd35d7 app/vminsert: improve slowness-based rerouting logic
Adjust slowness-based rerouting logic.

Rerouting now occurs only from the slowest node, and only if the cluster
as a whole has enough available capacity to handle the additional load.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9890
2025-12-10 16:25:44 +01:00
Aliaksandr Valialkin
293d80910c lib/protoparser/opentelemetry: eliminate memory allocations during parsing of samples send via OpenTelemetry protocol
This increases the parser performance by 4x-6x.

This commit uses the technique similar to https://github.com/VictoriaMetrics/VictoriaLogs/pull/720

goos: linux
goarch: amd64
pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/stream
cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics
                                                    │   old.txt    │               new.txt               │
                                                    │    sec/op    │   sec/op     vs base                │
ParseStream/default-metrics-labels-formatting-16      15.565µ ± 1%   2.150µ ± 3%  -86.19% (p=0.000 n=10)
ParseStream/prometheus-metrics-labels-formatting-16   24.228µ ± 2%   4.355µ ± 1%  -82.02% (p=0.000 n=10)
ParseStream/prometheus-metrics-formatting-16          23.028µ ± 2%   3.395µ ± 1%  -85.26% (p=0.000 n=10)
geomean                                                20.55µ        3.168µ       -84.59%

                                                    │   old.txt    │                new.txt                 │
                                                    │     B/s      │      B/s       vs base                 │
ParseStream/default-metrics-labels-formatting-16      127.9Mi ± 1%    918.3Mi ± 3%  +617.82% (p=0.000 n=10)
ParseStream/prometheus-metrics-labels-formatting-16   82.19Mi ± 2%   453.32Mi ± 1%  +451.57% (p=0.000 n=10)
ParseStream/prometheus-metrics-formatting-16          86.47Mi ± 2%   581.56Mi ± 1%  +572.52% (p=0.000 n=10)
geomean                                               96.88Mi         623.3Mi       +543.34%

                                                    │   old.txt    │                 new.txt                  │
                                                    │     B/op     │    B/op      vs base                     │
ParseStream/default-metrics-labels-formatting-16      12.53Ki ± 0%   0.00Ki ± 0%  -100.00% (p=0.000 n=10)
ParseStream/prometheus-metrics-labels-formatting-16   21.15Ki ± 1%   0.00Ki ±  ?  -100.00% (p=0.000 n=10)
ParseStream/prometheus-metrics-formatting-16          20.74Ki ± 1%   0.00Ki ±  ?  -100.00% (p=0.000 n=10)
geomean                                               17.65Ki                     ?                       ¹ ²
¹ summaries must be >0 to compute geomean
² ratios must be >0 to compute geomean

                                                    │  old.txt   │                new.txt                 │
                                                    │ allocs/op  │ allocs/op  vs base                     │
ParseStream/default-metrics-labels-formatting-16      426.0 ± 0%    0.0 ± 0%  -100.00% (p=0.000 n=10)
ParseStream/prometheus-metrics-labels-formatting-16   514.0 ± 0%    0.0 ± 0%  -100.00% (p=0.000 n=10)
ParseStream/prometheus-metrics-formatting-16          514.0 ± 0%    0.0 ± 0%  -100.00% (p=0.000 n=10)
geomean                                               482.8                   ?                       ¹ ²
2025-12-10 16:11:59 +01:00
Artem Fetishev
bc4d98b358 app/vmstorage: properly name dateMetricIDCache metrics
The following dmc metrics were given standard names, i.e.:

- vm_date_metric_id_cache_resets_total became
vm_cache_resets_total{type="indexdb/date_metricID"}
- vm_date_metric_id_cache_syncs_total became
vm_cache_syncs_total{type="indexdb/date_metricID"}

This change should be safe since these metrics are currently not used in
VictoriaMetrics Gragana dashboards.

Additionally, other cache metrics were organized within the code so that
each metric has the same order.
2025-12-10 14:57:52 +01:00
Alexander Frolov
ad153f72ef lib/storage: utilize persisted hourMetricIDs cache to avoid redundant indexDB lookups after vmstorage restart
This commit optimizes the performance of the storage by improving the utilization of persisted hourMetricIDs cache to avoid redundant indexDB lookups after vmstorage restart. The change refactors the hour-based cache checking logic using a switch statement to handle multiple hour scenarios more efficiently.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10114
2025-12-10 14:56:05 +01:00
Vadim Rutkovsky
f2578a9764 docs/victoriametrics: update LTS-releases.md (#10153)
### Describe Your Changes

Doc update to mention fresh patch releases - 1.122.10 and 1.110.25

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-12-10 15:11:26 +02:00
Aliaksandr Valialkin
d5e19717b7 Makefile: use the correct -trim_path at pprof-cpu
It shouldn't end with @.

The `PPROF_FILE=/path/to/cpu.pprof make pprof-cpu` is good for investigating profiles received from production builds.
2025-12-10 13:41:10 +01:00
Max Kotliar
5c40328e5f docs: mention Grafana panel that can help with swap related issues 2025-12-10 14:08:43 +02:00
Yury Moladau
1117437456 app/vmui: improve legend auto-collapse threshold, warning and toggle (#10140)
### Describe Your Changes

This PR improves the legend auto-collapse behavior in vmui:
- Increase the legend auto-collapse threshold from `20` to `100` series.
- Add a warning message when the legend is collapsed by default, showing
the actual series count.
- Add a user setting to disable automatic legend collapsing (enabled by
default).

Related issue: #10075

<img width="352" alt="image"
src="https://github.com/user-attachments/assets/22ee2ef9-6369-47a8-87a1-c63a0e17fccd"
/>
<img width="1618" height="197" alt="image"
src="https://github.com/user-attachments/assets/791eb9b6-4397-476d-ad44-5152e50d1975"
/>


### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2025-12-10 13:59:16 +02:00
Aliaksandr Valialkin
094a7cf3f9 lib/protoparser/opentelemetry/stream: benchmark cases when prometheus-compatible naming for metrics and labels is enabled 2025-12-10 11:50:46 +01:00
Artem Fetishev
538e489497 docs: Update cache tuning section (#10149)
- Remove mentions of `Caches` section in Grafana dashobards since this section does not exist anymore.
- Rewrite a bit the description of cache panels in Troubleshooting section.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-12-10 11:42:33 +01:00
Aliaksandr Valialkin
744aa3fe9f lib/protoparser/opentelemetry/stream: make the BenchmarkParseStream closer to real production cases
- Add more metrics to the protobuf to parse.
- Measure scan speed of the original protobuf at bytes/sec. Previously the number of ParseStream() calls per second was measured.
2025-12-10 11:21:57 +01:00
Aliaksandr Valialkin
44a3885f97 lib/protoparser/opentelemetry/stream: avoid memory allocations for bytes.NewBuffer() on every iteration of BenchmarkParseStream
Re-use benchReader for reading the same data on every iteration of BenchmarkParseStream.
2025-12-10 11:10:39 +01:00
Aliaksandr Valialkin
f43264f9f2 lib/ioutil: add missing package after the commit 2da010495c 2025-12-10 11:07:15 +01:00
Aliaksandr Valialkin
e07bc7a74e lib/prompb: move all the code related to WriteRequestUnmarshaler to a separate file - write_request_unmarshaler.go
This should improve code maintenance a bit.

This is a follow-up for the commit b98e592752
2025-12-10 10:45:59 +01:00
Aliaksandr Valialkin
d1680063f5 lib/prompb: rename MetricMetadataType to MetricType
Also rename MetricMetadata* constants to MetricType* constants.

This makes the code a bit more readable.

This is a follow-up for the commit 25cd5637bc

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9306
2025-12-10 01:18:44 +01:00
Aliaksandr Valialkin
2da010495c all: pool io.LimitedReader in order to save a memory allocation and reduce CPU usage a bit 2025-12-10 01:18:43 +01:00
Artem Fetishev
7c78f95f2e docs: Update flags (#10148)
Follow-up for dc5d7aa4ce
(https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10135)

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-09 17:48:40 +01:00
Kirill Yurkov
5bd67c5f49 docs: recommend disabling swap (#10113)
add swap disable commands in install recommendations to prevent
performance issues

---------

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-12-09 15:50:13 +02:00
Max Kotliar
c618f471ca apptest: make results order stable in test special query regression
Sometimes test fails with error:

--- FAIL: TestClusterSpecialQueryRegression (15.57s)
special_query_regression_test.go:76: unexpected /api/v1/export
response (-want, +got):
          &apptest.PrometheusAPIV1QueryResponse{
          	... // 1 ignored field
          	Data: &apptest.QueryData{
          		... // 1 ignored field
          		Result: []*apptest.QueryResult{
          			&{
          				Metric: map[string]string{
          					"__name__":
"prometheus.sensitiveRegex",
- 					"label":
"SensitiveRegex",
+ 					"label":
"sensitiveRegex",
          				},
          				Sample:  nil,
          				Samples: {&{Timestamp:
1707123456700, Value: 10}},
          			},
          			&{
          				Metric: map[string]string{
          					"__name__":
"prometheus.sensitiveRegex",
- 					"label":
"sensitiveRegex",
+ 					"label":
"SensitiveRegex",
          				},
          				Sample:  nil,
          				Samples: {&{Timestamp:
1707123456700, Value: 10}},
          			},
          		},
          	},
          	ErrorType: "",
          	Error:     "",
          	IsPartial: false,
          }

FAIL
FAIL	github.com/VictoriaMetrics/VictoriaMetrics/apptest/tests
	18.676s
FAIL
2025-12-09 15:20:01 +02:00
Artem Fetishev
f62893c151 lib/storage: report per-idb cache stats only once
`tagFiltersCache` and `dateMeticIDCache` are now per-indexDB. Currently
we have 2 instance of indexDBs (prev and curr) and therefore 2 instances
of each cache.

When the storage stats is collected, the stats of individual caches is
added together. For example, is the `sizeMaxBytes` of each
tagFiltersCache is `100MB` and the `sizeBytes` of each instance is
`10MB` and `99MB`, then the resulting stats will be `sizeMaxBytes ==
200MB, sizeBytes == 109MB`.

While this is accurate, this stats hides a potential problem. It says
that the cache utilization is slightly above `50%` (109/200) and
everything seems to be okay. But in reality one of the caches is
utilized by 99% and soon will start evicting existing records to make
room for new ones, potentially slowing down the data retrieval. Ops
won't see it and will not take necessary action.

The solution is to report stats only for one instance of cache whose
utilization is the highest.

Alternatives considered:
- #10123. Might work, but breaks the encapsulation and can potentially
be slower
- Do not aggregate the stats and report is per-indexDB. This increases
the number of metrics and makes it dependent on the number of indexDB
instances (which can be many once #8134 is released).

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8134
2025-12-09 12:43:50 +01:00
JAYICE
76f5def301 dashboard: fix page fault panel (#10141)
add `[$__rate_interval]` to fix page fault panel introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9977
2025-12-09 12:41:28 +01:00
Artem Fetishev
3be5ed0e32 Revert "lib/storage: after deleting series, reset tsid only once" (#10143)
This reverts commit dbe71700b5.

tsidCache is persistent and must be reset before deletedMetricID records
are added to the index. THis is needed to handle ungraceful shutdowns
properly.
2025-12-09 10:57:08 +01:00
Aliaksandr Valialkin
4ac40d955b lib/prompb: use MetricMetadataType type for MetricMetadata.Type field
This eliminates the need of manual conversion between MetricMetadataType and uint32 / int32.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2974

This is a follow-up for the commit 5a587f2006
2025-12-08 20:31:29 +01:00
Artem Fetishev
dc5d7aa4ce lib/storage: properly report dateMetricIDCache stats
A number of changes to `dateMetricIDCache` stats and configuration:

1. Export `SizeMaxBytes` metric and make the size configurable via a
flag
2. Fix `EntriesCount` and `SizeBytes` stats. Previously the cache
reported this stats for its immutable part only. Whereas there are cases
when the number of entries in its mutable part is comparable with the
number in immutable part. The stats from the mutable part remains
invisible until it is sync'ed to the immutable part. It is also possible
that the cache gets reset after the sync because the cache size exceeds
the max allowed size. Reporting the stats for both mutable and immutable
parts should provide a clear picture of the cache utilization.

Together, SizeBytes and SizeMaxBytes should enable tracking the cache
utilization properly. And take appropriate actions if necessary (such as
adjusting the memory resources and/or cache size limit via a flag).

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10064
2025-12-08 14:17:58 +01:00
JAYICE
244769a00d vmstorage: skip last sleep when closing vminsertSrv connections
After closing last connection to vminsert, vmstorage will still wait for
an interval, causing actual shutdown time will be always longger than
configurations.

This commit just skip the last sleep

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10136
2025-12-08 14:10:17 +01:00
Max Kotliar
8e81d54851 Revert "dashboards: add memory usage breakdown panels into Drilldown sections"
This reverts commit 5117cde8bc.
2025-12-08 13:42:10 +02:00
Max Kotliar
5117cde8bc dashboards: add memory usage breakdown panels into Drilldown sections
Right now we have two separate panels: RSS memory % usage and RSS
anonymous memory % usage. This makes trend comparison difficult because
one have to visually correlate two independent panels. Another problem
is that these panels don't show Go runtime allocations at all. The same
applies to memory allocated in C. There are allocations in C (zstd) one
should account for but there is no even a metric to expose it.

The commit adds Memory usage breakdown panel into Drilldown section. It
provides insight into Go Stack, Go Heap, Go Heap Released, Go Other,
Mmap: VM Cache, File cache memory distribution

It should help spot trends changes in memory by type or invistigate
issues such as #10069 and #10028 easier.

Panel info:
This panel shows memory usage by category.

How to use:
- Start from the high-level RSS panel.
- Identify an instance with unexpected or abnormal memory growth.
- Filter to that instance to inspect the detailed breakdown here.

Interpretation
- A steadily rising Go Heap usually indicates a memory leak. Collect
pprof memory profile.
- A growing Go Stack commonly points to a goroutine leak.
2025-12-08 13:39:34 +02:00
Artem Fetishev
85367cae38 Idb blockcache metrics unittest (#10050)
indexDB has 3 block caches. These caches export metrics. Storage
collects these
metrics for each indexDB it has (currently prev and curr only).

There is a potential problem:
- These caches are shared by all indexDBs
- Each indexDB reports the block cache metrics.
- Storage collects the metrics of all indexDBs by adding them together.

I.e. it is possible to count block cache metrics several times.
It is not the case in current implementation because the addition of the
metrics
is not performed intentionally.

The added unit test 1) demonstrates that the resulting counts are
reported
correctly and 2) protects from future unintentional changes in this
behavior.

Additionally a code comment is added to explain why block cache metrics
are not summed up.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-06 18:14:52 +01:00
Aliaksandr Valialkin
159b71cabb lib/protoparser/influx: properly clean references to underlying byte slices from tagsPool and fieldsPool inside unmarshalContext
This should prevent from memory leaks when unmarshalContext fields point to unused byte slices.
2025-12-06 11:52:57 +01:00
Aliaksandr Valialkin
78b8c773ae docs/victoriametrics/: remove misleading statement about extending ext4 partition to 16TB+
It is enough to recommend the given format options for disks with 1TB+ sizes
2025-12-05 23:00:47 +01:00
Nikolay
aab92d3c0f protoparser/influx: reduce memory allocation (#10109)
Previously, influx parser allocated a new slice byte for
unescape of Row fields. It adds extra pressure at GC and increases CPU
usage.

 This commit changes escape to in-place updates for provided []byte.
Since request for parsing is actually a []byte converted into the
string, it's safe to update it in-place. To be able to interact with
[]byte directly, this commit changes parser API and accepts []byte
instead of string.

Benchstat:
```
                                 │   before    │                after                │
                                 │   sec/op    │   sec/op     vs base                │
RowsUnmarshalUnescape-10           74.68n ± 4%   54.23n ± 5%  -27.38% (p=0.000 n=10)
RowsUnmarshalUnescapeNoEscape-10   40.41n ± 2%   42.59n ± 1%   +5.39% (p=0.000 n=10)
geomean                            54.93n        48.06n       -12.51%

                                 │    before    │                after                 │
                                 │     B/s      │     B/s       vs base                │
RowsUnmarshalUnescape-10           1.035Gi ± 4%   1.425Gi ± 5%  +37.72% (p=0.000 n=10)
RowsUnmarshalUnescapeNoEscape-10   1.613Gi ± 2%   1.531Gi ± 1%   -5.11% (p=0.000 n=10)
geomean                            1.292Gi        1.477Gi       +14.32%

                                 │   before    │                after                 │
                                 │    B/op     │    B/op     vs base                  │
RowsUnmarshalUnescape-10           149.00 ± 0%   96.00 ± 0%  -35.57% (p=0.000 n=10)
RowsUnmarshalUnescapeNoEscape-10    80.00 ± 0%   80.00 ± 0%        ~ (p=1.000 n=10) ¹
geomean                             109.2        87.64       -19.73%
¹ all samples are equal

                                 │   before   │                after                 │
                                 │ allocs/op  │ allocs/op   vs base                  │
RowsUnmarshalUnescape-10           5.000 ± 0%   1.000 ± 0%  -80.00% (p=0.000 n=10)
RowsUnmarshalUnescapeNoEscape-10   1.000 ± 0%   1.000 ± 0%        ~ (p=1.000 n=10) ¹
geomean                            2.236        1.000       -55.28%
```

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10053

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2025-12-05 18:18:07 +01:00
Nikolay
2bef26288e lib/memory: add validation for remaining system memory
Previously, if user defined value for `memory.allowedBytes` flag
exceeded system memory limit, remaining memory could take negative
value. It results into incorrect memory auto-detect calculations for
various components. Such as vmstorage unique timeseries limit and parts
size.

 This commit adds negative value check. And also logs system memory
limit at start-up of vm components.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10083
2025-12-05 18:14:04 +01:00
Hui Wang
c14dbad33b vmselect: disable rollup result cache for instant queries that contain rate function
Previously, in order to cache results for `rate`, we consider
`rate(m[d])` as `(increase(m[d]) / d)` and cache the `increase` result.
However, in MetricsQL, `rate(d) = (lastValue - firstValue) /
(lastTimestamp - firstTimestamp)`, so it does not equal to
`increase(d)/d` if `d != (lastTimestamp - firstTimestamp)`.
Although the issue primarily arises when the time series samples are not
continuous, but the discrepancy is hard to debug and can be confusing to
users. Because the range query doesn't use this optimization, causing
recording rule results to
differ from raw query results in VMUI. 
Therefore, it is better to disable the usage and only enable it when we
can cache it correctly.

fixes https://github.com/VictoriaMetrics/victoriaMetrics/issues/10098
2025-12-05 17:38:32 +01:00
Artem Fetishev
dbe71700b5 lib/storage: after deleting series, reset tsid only once
As indexDBs became independent from each other, the tsidCache is now
reset more than once when the DeleteSeries() operation is performed. But
it needs to be performed only once. Thus, move the deletion from indexDB
to Storage.

Follow-up for 16d75ab0bd.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10119
2025-12-05 17:38:02 +01:00
Hui Wang
d4fa326659 vmselect: reset rollup result cache with -search.disableCache when necessary
There’s no need to call `c.Reset()` for rollup result cache if it’s not
persisted(`-cacheDataPath` not specified) or has already been cleared by
`-search.resetRollupResultCacheOnStartup`, as it is already newly
created.


Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10095
2025-12-05 17:37:30 +01:00
Andrei Baidarov
040ef931d1 vmalert: do not increment errors counter on cancel context errors
Follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10027

`vmalert_alerting_rules_errors_total` increments on any error


445f30a4a6/app/vmalert/rule/alerting.go (L455-L460)

while `vmalert_execution_errors_total` only on non-cancellation ones


445f30a4a6/app/vmalert/rule/group.go (L747-L756)

This commit ignores cancellation errors in
`vmalert_alerting_rules_errors_total` too
2025-12-05 17:36:42 +01:00
JAYICE
474009a7f1 dashboard: add page faults panel for vmsingle&vmcluster (#9977)
### Describe Your Changes
add page fault panel in `Troubleshooting`section for vmcluster and
vmsingle. fix
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9974

The query
```
sum(rate(process_minor_pagefaults_total{job=~"$job", instance=~"$instance"})) by (job,instance)

sum(rate(process_major_pagefaults_total{job=~"$job", instance=~"$instance"})) by (job,instance)
```

<img width="1088" height="306" alt="image"
src="https://github.com/user-attachments/assets/4b4ac884-5372-4141-a429-ac0b296dc926"
/>
2025-12-05 18:04:44 +02:00
Nikolay
1b1442d91b app/vmgateway: properly handle proxy request errors
Previously vmgateway didn't handle http.Abort error.
It could lead to the unexpected panic at webserver.

This commit adds panic recover and prevent app from crash.
2025-12-05 16:32:50 +01:00
Aliaksandr Valialkin
3e359dc920 lib/protoparser/influx: remove IgnoreErrors field from Rows and replace it with the explicit skipInvalidLines arg at Rows.Unmarshal()
This improves the maintainability of the code, since the caller of Rows.Unmarshal() always knows
whether invalid lines must be skipped.

While at it, add missing error checks returned from Rows.Unmarshal().

This is a follow-up for the commit daa7183749

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7090
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7165
2025-12-05 16:24:54 +01:00
Hui Wang
e41f642a59 add flag description for -selectNode (#10022) 2025-12-05 14:53:06 +02:00
Artem Fetishev
7a2cc7fbad lib/storage: use deadline instead is.deadline
This makes SearchTSIDs() consistent with SearchMetricNames().

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-05 02:08:14 +01:00
Aliaksandr Valialkin
a7b99dd164 vendor: update github.com/VictoriaMetrics/easyproto from v0.1.4 to v1.0.0 2025-12-04 21:47:20 +01:00
Andrii Chubatiuk
647f107576 vmui: always add /prometheus prefix while generating backend url 2025-12-04 18:09:47 +02:00
dependabot[bot]
04f8296c85 build(deps-dev): bump js-yaml from 4.1.0 to 4.1.1 in /app/vmui/packages/vmui (#10017)
Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 4.1.0 to 4.1.1.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md">js-yaml's
changelog</a>.</em></p>
<blockquote>
<h2>[4.1.1] - 2025-11-12</h2>
<h3>Security</h3>
<ul>
<li>Fix prototype pollution issue in yaml merge (&lt;&lt;)
operator.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="cc482e7759"><code>cc482e7</code></a>
4.1.1 released</li>
<li><a
href="50968b862e"><code>50968b8</code></a>
dist rebuild</li>
<li><a
href="d092d86603"><code>d092d86</code></a>
lint fix</li>
<li><a
href="383665ff42"><code>383665f</code></a>
fix prototype pollution in merge (&lt;&lt;)</li>
<li><a
href="0d3ca7a27b"><code>0d3ca7a</code></a>
README.md: HTTP =&gt; HTTPS (<a
href="https://redirect.github.com/nodeca/js-yaml/issues/678">#678</a>)</li>
<li><a
href="49baadd52a"><code>49baadd</code></a>
doc: 'empty' style option for !!null</li>
<li><a
href="ba3460eb9d"><code>ba3460e</code></a>
Fix demo link (<a
href="https://redirect.github.com/nodeca/js-yaml/issues/618">#618</a>)</li>
<li>See full diff in <a
href="https://github.com/nodeca/js-yaml/compare/4.1.0...4.1.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=js-yaml&package-manager=npm_and_yarn&previous-version=4.1.0&new-version=4.1.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-04 17:42:17 +02:00
dependabot[bot]
1c3e64e9ad build(deps): bump actions/checkout from 4 to 6 (#10082)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to
6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/releases">actions/checkout's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>v6-beta by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2298">actions/checkout#2298</a></li>
<li>update readme/changelog for v6 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2311">actions/checkout#2311</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5.0.0...v6.0.0">https://github.com/actions/checkout/compare/v5.0.0...v6.0.0</a></p>
<h2>v6-beta</h2>
<h2>What's Changed</h2>
<p>Updated persist-credentials to store the credentials under
<code>$RUNNER_TEMP</code> instead of directly in the local git
config.</p>
<p>This requires a minimum Actions Runner version of <a
href="https://github.com/actions/runner/releases/tag/v2.329.0">v2.329.0</a>
to access the persisted credentials for <a
href="https://docs.github.com/en/actions/tutorials/use-containerized-services/create-a-docker-container-action">Docker
container action</a> scenarios.</p>
<h2>v5.0.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5...v5.0.1">https://github.com/actions/checkout/compare/v5...v5.0.1</a></p>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
<li>Prepare v5.0.0 release by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2238">actions/checkout#2238</a></li>
</ul>
<h2>⚠️ Minimum Compatible Runner Version</h2>
<p><strong>v2.327.1</strong><br />
<a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Release
Notes</a></p>
<p>Make sure your runner is updated to this version or newer to use this
release.</p>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v4...v5.0.0">https://github.com/actions/checkout/compare/v4...v5.0.0</a></p>
<h2>v4.3.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Port v6 cleanup to v4 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v4...v4.3.1">https://github.com/actions/checkout/compare/v4...v4.3.1</a></p>
<h2>v4.3.0</h2>
<h2>What's Changed</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/blob/main/CHANGELOG.md">actions/checkout's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<h2>V6.0.0</h2>
<ul>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
</ul>
<h2>V5.0.1</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<h2>V5.0.0</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
</ul>
<h2>V4.3.1</h2>
<ul>
<li>Port v6 cleanup to v4 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li>
</ul>
<h2>V4.3.0</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
<li>Adjust positioning of user email note and permissions heading by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li>
<li>Update CODEOWNERS for actions by <a
href="https://github.com/TingluoHuang"><code>@​TingluoHuang</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li>
<li>Update package dependencies by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li>
</ul>
<h2>v4.2.2</h2>
<ul>
<li><code>url-helper.ts</code> now leverages well-known environment
variables by <a href="https://github.com/jww3"><code>@​jww3</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li>
<li>Expand unit test coverage for <code>isGhes</code> by <a
href="https://github.com/jww3"><code>@​jww3</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li>
</ul>
<h2>v4.2.1</h2>
<ul>
<li>Check out other refs/* by commit if provided, fall back to ref by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li>
</ul>
<h2>v4.2.0</h2>
<ul>
<li>Add Ref and Commit outputs by <a
href="https://github.com/lucacome"><code>@​lucacome</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1180">actions/checkout#1180</a></li>
<li>Dependency updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>- <a
href="https://redirect.github.com/actions/checkout/pull/1777">actions/checkout#1777</a>,
<a
href="https://redirect.github.com/actions/checkout/pull/1872">actions/checkout#1872</a></li>
</ul>
<h2>v4.1.7</h2>
<ul>
<li>Bump the minor-npm-dependencies group across 1 directory with 4
updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1739">actions/checkout#1739</a></li>
<li>Bump actions/checkout from 3 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1697">actions/checkout#1697</a></li>
<li>Check out other refs/* by commit by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1774">actions/checkout#1774</a></li>
<li>Pin actions/checkout's own workflows to a known, good, stable
version. by <a href="https://github.com/jww3"><code>@​jww3</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1776">actions/checkout#1776</a></li>
</ul>
<h2>v4.1.6</h2>
<ul>
<li>Check platform to set archive extension appropriately by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1732">actions/checkout#1732</a></li>
</ul>
<h2>v4.1.5</h2>
<ul>
<li>Update NPM dependencies by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1703">actions/checkout#1703</a></li>
<li>Bump github/codeql-action from 2 to 3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1694">actions/checkout#1694</a></li>
<li>Bump actions/setup-node from 1 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1696">actions/checkout#1696</a></li>
<li>Bump actions/upload-artifact from 2 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1695">actions/checkout#1695</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1af3b93b68"><code>1af3b93</code></a>
update readme/changelog for v6 (<a
href="https://redirect.github.com/actions/checkout/issues/2311">#2311</a>)</li>
<li><a
href="71cf2267d8"><code>71cf226</code></a>
v6-beta (<a
href="https://redirect.github.com/actions/checkout/issues/2298">#2298</a>)</li>
<li><a
href="069c695914"><code>069c695</code></a>
Persist creds to a separate file (<a
href="https://redirect.github.com/actions/checkout/issues/2286">#2286</a>)</li>
<li><a
href="ff7abcd0c3"><code>ff7abcd</code></a>
Update README to include Node.js 24 support details and requirements (<a
href="https://redirect.github.com/actions/checkout/issues/2248">#2248</a>)</li>
<li><a
href="08c6903cd8"><code>08c6903</code></a>
Prepare v5.0.0 release (<a
href="https://redirect.github.com/actions/checkout/issues/2238">#2238</a>)</li>
<li><a
href="9f265659d3"><code>9f26565</code></a>
Update actions checkout to use node 24 (<a
href="https://redirect.github.com/actions/checkout/issues/2226">#2226</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/checkout/compare/v4...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/checkout&package-manager=github_actions&previous-version=4&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-04 17:40:42 +02:00
Hui Wang
4212491031 vmalert: clarify templating in alerting rule labels (#10121)
follow up
38dd971f58.

Labels only support limited templating variables in
https://docs.victoriametrics.com/victoriametrics/vmalert/#templating,
including `$labels`, `$value` and `expr`, to avoid breaking alert states
or causing cardinality issue with results.
2025-12-04 17:35:27 +02:00
Zakhar Bessarab
f76bc956ca app/vmctl: respect context cancellation during user prompts
Previously, context cancellation was ignored when reading user response
for the prompt. That leads to ignoring of "Ctrl+C" and other termination
signals to vmctl until user finishes the input.

Fix that by properly propagating the context and respecting the
cancellation of the context.


Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-12-04 15:57:31 +04:00
Aliaksandr Valialkin
655074c3e0 lib/protoparser/opentelemetry/pb: remove code related to parsing logs in OTEL format
This code is no longer needed after the commit 4ffb74448d

See https://github.com/VictoriaMetrics/VictoriaLogs/pull/720
2025-12-04 00:49:54 +01:00
Aliaksandr Valialkin
5e95fdf23e docs/victoriametrics/FAQ.md: add a link to the guide on how to calculate the needed disk space at VictoriaLogs at why indexdb size is so large? chapter
This is a follow-up for 68f670cbc5
2025-12-03 15:51:40 +01:00
Aliaksandr Valialkin
ffcfb74b17 deployment: update Go builder from v1.25.4 to v1.25.5
See https://github.com/golang/go/issues?q=milestone%3AGo1.25.5%20label%3ACherryPickApproved
2025-12-03 15:20:11 +01:00
Max Kotliar
fe803bfc6e Capitalize titles in operator.json
Signed-off-by: d3spair <git@agrshv.dev>
2025-12-03 13:43:39 +02:00
Andrii Chubatiuk
8ee466ab06 dashboard: add panels for operator flags and global params 2025-12-03 13:28:59 +02:00
Sylvain Rabot
6ca48d5025 lib/vmbackup/s3backup: support custom SSE KMS key id and ACL
Add more S3 configurations.

- SSES3KeyID allows to push to a bucket that is another account as the
KMS key it uses to encrypt data server side.
- ACL allows configure which permissions are given to the object
uploaded on the bucket (usefull when bucket policy expect a given
permission such as `bucket-owner-full-control`).

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
2025-12-03 10:06:57 +04:00
Fred Navruzov
70eb9d39d5 docs/vmanomaly: release v1.28.1 (#10111)
### Describe Your Changes

Updates of docs and examples to `vmanomaly` v1.28.1

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-12-02 21:08:17 +02:00
Zakhar Bessarab
1985c79a4d deployment: update references to the latest release 2025-12-01 21:16:14 +04:00
Zakhar Bessarab
f0dafacfd3 docs: update references to the latest release 2025-12-01 21:15:13 +04:00
Zakhar Bessarab
6c01f5d50f docs/changelog: backport LTS changelogs 2025-12-01 20:44:43 +04:00
f41gh7
84658e77da docs/changelog: sort changelog entries
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-12-01 11:11:54 +01:00
Zakhar Bessarab
4dc32ff1d7 app/vminsert/netstorage: fix list of nodes used for SD
Previously, vminsert was using original list of addrs instead of
discovered addrs. Properly use discovered list of addrs.
2025-12-01 11:11:53 +01:00
Artem Fetishev
08a1b2e75c lib/lrucache: do not reset requests and misses after cache reset
Follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10072.

Do not reset requests and misses metrics since cache reset implies the
reset of the storage only.
2025-12-01 10:12:47 +01:00
Zakhar Bessarab
7e5b68fc1f docs/changelog: cut v1.131.0 2025-11-28 20:20:08 +04:00
Zakhar Bessarab
dcc130603c docs: update availble from tags 2025-11-28 20:13:42 +04:00
Zakhar Bessarab
9842ad2299 app/vmselect: run make vmui-update 2025-11-28 20:01:08 +04:00
Aliaksandr Valialkin
63c0cf673f Makefile: generate quicktemplate output files only at lib and app directories
Previously the output files were incorrectly generated inside unexpeted directories such as vendor
2025-11-28 16:07:22 +01:00
Nowa Ammerlaan
7f51bb4ce7 protoparser/influx: account for excess white spaces before timestamp
Some influx clients ( such as nimon monitoring client) adds excess white spaces in the influx line and does not set a
timestamp. Since Influx protocol requires whitespace before timestamp only when it set, it could present without timestamp. Whitespace before omitted timestamp confuses parser.

This commit adds check for the skipped timestamp and test case for it.

Fixes: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10049
2025-11-28 14:36:35 +01:00
Nikolay
38df52ea08 app/vmselect: improve performance for multi-level requests
Previously, proxy vmselect (aka 1st level vmselect) performed parsing
of MetricBlock received from vmstorage before forwarding it into top vmselect. It required an additional CPU and Memory, which greatly slowed down query requests.

This commit changes lib/vmselectapi iterator API, instead of MetricBlock, it returns encoded MetricBlock as a byte slice.
It allows to save CPU and memory at proxy vmselect by eliminating need of decoding MetricBlock received from storage.

In addition, it adds the following optimizations for proxy vmselect:
* reduces memory allocations by using iterator pool
 * add per storageNode workerItem for iterator

Also, it adds optimization for vmstorage, it no longer performs extra memory copy of MetricName for MetricBlock.

vmselect and vmstorage metrics vm_vmselect_metric_rows_read_total and vm_metric_rows_read_total were removed, it's not used at any dashboards and rules. New Iterator API doesn't support it.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9899
2025-11-28 13:04:55 +01:00
Max Kotliar
023a13435c dashboards: make dashboards-sync 2025-11-27 16:52:45 +02:00
Max Kotliar
1ddcbed6d7 dashboards: Show "Disk space usage % by type" as stacked graph in Cluster dashboard. (#10089)
### Describe Your Changes

VictoriaMetrics - cluster dashboard.

vmstorage -> Disk space usage % by type pane.

Switch panel to 100% stacked view to show space distribution.

The goal is to highlight how space is split between datapoints and
indexdb types; Simple time-series values made this hard to see. A 100%
stacked layout makes the distribution immediately visible.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9932

was: <img width="1201" height="609" alt="Image"
src="https://github.com/user-attachments/assets/1d199e65-5a20-4c63-a251-b7087020f42a"
/>


now: 
<img width="1208" height="608" alt="Screenshot 2025-11-27 at 13 14 51"
src="https://github.com/user-attachments/assets/96aa32f3-1243-486b-bac8-2d3c0f4bdb7a"
/>


### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-27 16:50:15 +02:00
Aliaksandr Valialkin
edd02cdb5b docs/victoriametrics/goals.md: clarify that bugs, which affect a small number of users at rare edge cases, can be fixed later 2025-11-27 14:29:17 +01:00
Artem Fetishev
4cd727a511 lib/storage: use lrucache for tfss cache (#10072)
The purpose of this PR is the same as #10000, except `lrucache` is used
for implementing tfss cache.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-27 14:18:03 +01:00
Andrii Chubatiuk
19c0477976 chore(app/vmui): conditionally render accordion children (#10068)
### Describe Your Changes

revert change, that was introduced in
483e00ffb9
since rendering of all nested children significantly impacts alerting
tab performance in case of multiple items
@Loori-R @arturminchukov , what do you think about using react-virtuoso
additionally for alerting tab to decrease dom size?

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-27 14:31:34 +02:00
Ben Randall
4fdd8f0906 lib/protoparser/opentelemetry: use separate loggers for unsupported delta temporality/metric type logs (#10021)
A throttled logger will continue to log messages occasionally with a
suffix indicating how many similar logs were throttled. Using the same
logger for multiple log messages can result in certain logs being
entirely suppressed and invisible in the logs. This updates most of the
loggers used in `appendFromScopeMetrics` to be their own logger so that
"unsupported delta temporality/metric type" logs will be visible for all
metric types. Additionally, `skippedSampleLogger` is only used by
`appendSamplesFromHistogram` so this was moved closer to that function.

Related to #9447
Related to #9498

- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
2025-11-27 14:19:43 +02:00
Andrii Chubatiuk
9897872ca9 lib/flagutil: clarify usage of quotes in array flag values 2025-11-27 14:17:07 +02:00
Hui Wang
b8bbb07431 dashboard: tidy vmauth panels (#10088)
before:
<img width="2498" height="1042" alt="image"
src="https://github.com/user-attachments/assets/0bbd7cc2-7062-494f-827b-96d86133537f"
/>
after:
<img width="2497" height="968" alt="image"
src="https://github.com/user-attachments/assets/6256ccc2-2f8f-40ea-a23b-a1a20e242b3c"
/>
which is more consistent with other dashabords.
2025-11-27 14:12:53 +02:00
Max Kotliar
eb1c8dd67d docs: add links to issues in changelog 2025-11-27 14:09:02 +02:00
Aliaksandr Valialkin
50fc48ac47 lib/fs: avoid Go runtime stalls on Linux when all the GOMAXPROCS threads are blocked in major pagefaults while reading the data from memory-mapped files
Go runtime executes all the goroutines on GOMAXPROCS operating system threads.
Go runtime cannot switch the OS thread to another goroutine if the current goroutine
is stuck in the major pagefault while reading the data from memory-mapped file,
because Go runtime doesn't distiguinsh between reading from regular memory and reading
from memory-mapped file. So the OS thread becomes stuck while waiting until the OS
reads the data from file at the requested memory address and returns back control to Go application.

In the worst case it is possible that all the GOMAXPROCS threads are stuck in major pagefaults,
so Go runtime pauses executing all the goroutines. This state is possible in environments
with small GOMAXPROCS and high-latency disks such as NFS or small HDD-based disks at AWS.

See https://valyala.medium.com/mmap-in-go-considered-harmful-d92a25cb161d for more details.

This commit protects from such stalls by verifying whether the given memory location from memory-mapped file
is already loaded in the OS page cache before reading from that memory.
If the location isn't in the OS page cache, then it falls back to pread() syscall for reading the data from file.
Go runtime allocates extra OS threads for long-running syscalls, so it can continue executing goroutines
across all the GOMAXPROCS threads while reading the data from slow storage via pread() syscall.

This commit uses mincore() syscall for detecting whether the given memory page is available in the OS page cache.
It also caches mincore() results for up to a minute in order to reduce the overhead for the mincore() syscall.

This commit reduces the increase rate for the process_major_pagefaults_total metric by multiple orders of magnitude
on systems with high-latency disks.
2025-11-26 20:52:27 +01:00
Artem Fetishev
3bd9c75acc lib/lrucache: use uint64 for SizeBytes() and SizeMaxBytes() (#10077)
Currently, `lrucache.Cache` `SizeBytes()` and `SizeMaxBytes()` return
type is `int`. The cache `Entry.SizeBytes()` also returns `int` value.
Changing the type to `uint64` will allow using `uint64set.Set` as the
cache entry type (see #10072).

Please note that using `uint64` regardless the cpu architecture is set
is not entirely correct, because in 32-bit systems the size won't ever
get bigger than `2^32`, so the `uint64` will too much. However current
type (`int`) is not correct either since it is signed and will only
allow to store values up to `2^31`. Alternatively, all `SizeBytes()`
methods should return `uint`.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-26 11:39:07 +01:00
Max Kotliar
9c0683f8d1 dashboards: run make dashboards-sync 2025-11-25 20:13:06 +02:00
Max Kotliar
bf4660912f .github: Add changelog tip linter 2025-11-25 13:41:48 +02:00
Yury Molodov
bb54b5e661 app/vmui: improve alert styles for better readability (#10012)
### Describe Your Changes

This PR improves vmui alert styles by adding borders between rows,
introducing a hover state for easier row identification, and aligning
badges to the left.

Related issue: #9856

| Before | After |
|--------|--------|
| <img width="1427" height="1310" alt="image"
src="https://github.com/user-attachments/assets/68f3469e-95df-449f-a85d-1c0285520e2d"
/> | <img width="1427" height="1310" alt="Image"
src="https://github.com/user-attachments/assets/89501efb-c66f-402a-9d14-01c86930a5e2"
/> |

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-11-25 13:38:24 +02:00
Andrii Chubatiuk
200a729565 app/vmui: fixed ability to select multiple metrics in explore metrics tab (#10008)
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9995

change in only `Select` component leads to infinite
ExploreMetricsGraphItem component refresh since each time array has a
new reference

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-25 13:29:45 +02:00
Yury Molodov
7303495ae1 app/vmui: fix rendering of multiple points at the same timestamp (#10010)
### Describe Your Changes

1. Removed the *step* control from the **Raw Query** page, as it didn’t
affect chart rendering and caused confusion.
2. Fixed rendering of multiple points with the same timestamp -
previously, the second point was hidden.
3. Added proper visualization for points with the same timestamp and
identical values: such points are now shown as a square, and the tooltip
displays the number of duplicates.

**Example:**

```json
{
  "values": [1, 22, 10, 10, 5, 6],
  "timestamps": [
    1761955247950,
    1761955247950,
    1761955248960,
    1761955248960,
    1761955251980,
    1761955252990
  ]
}
```

<img width="500" height="1120" alt="image"
src="https://github.com/user-attachments/assets/192aa43e-8008-4f03-8966-00f59e52ec40"
/>
<img width="300" height="676" alt="image"
src="https://github.com/user-attachments/assets/8e361cb3-1286-452a-a687-b6b40ba7807b"
/>

Related issues: #9667 and #9666

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2025-11-25 11:00:00 +02:00
Andrei Baidarov
98b5288e9c vmselect: do not immediately fail request if vmstorage returns search… (#10030)
….maxConcurrentRequests error

If `vmstorage` is currently overloaded it could return
maxConcurrentRequests error. Now `vmselect` immediately fails the whole
request even if `replicationFactor` is set up and other replicas could
respond without errors.

This PR treats them as regular errors, not fatal ones.

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-24 20:37:37 +02:00
Cancai Cai
d7f9cd971d docs/notes: fix syntax errors (#10019)
### Describe Your Changes

I'm not sure if this is a mistake.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: cancaicai <2356672992@qq.com>
2025-11-24 20:37:05 +02:00
cancaicai
2cb08095c6 docs/storage: fix typo
Signed-off-by: cancaicai <2356672992@qq.com>
2025-11-24 15:46:40 +02:00
dependabot[bot]
8bc41f4c79 build(deps): bump golang.org/x/crypto from 0.43.0 to 0.45.0 (#10052)
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from
0.43.0 to 0.45.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="4e0068c009"><code>4e0068c</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="e79546e28b"><code>e79546e</code></a>
ssh: curb GSSAPI DoS risk by limiting number of specified OIDs</li>
<li><a
href="f91f7a7c31"><code>f91f7a7</code></a>
ssh/agent: prevent panic on malformed constraint</li>
<li><a
href="2df4153a03"><code>2df4153</code></a>
acme/autocert: let automatic renewal work with short lifetime certs</li>
<li><a
href="bcf6a849ef"><code>bcf6a84</code></a>
acme: pass context to request</li>
<li><a
href="b4f2b62076"><code>b4f2b62</code></a>
ssh: fix error message on unsupported cipher</li>
<li><a
href="79ec3a51fc"><code>79ec3a5</code></a>
ssh: allow to bind to a hostname in remote forwarding</li>
<li><a
href="122a78f140"><code>122a78f</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="c0531f9c34"><code>c0531f9</code></a>
all: eliminate vet diagnostics</li>
<li><a
href="0997000b45"><code>0997000</code></a>
all: fix some comments</li>
<li>Additional commits viewable in <a
href="https://github.com/golang/crypto/compare/v0.43.0...v0.45.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=golang.org/x/crypto&package-manager=go_modules&previous-version=0.43.0&new-version=0.45.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 15:13:50 +02:00
Zhu Jiekun
bb1e0d8f3b opentsdb: Avoid blocking when a connection doesn't send anything (#10045)
### Describe Your Changes

fix #9987 

Avoid blocking when a connection to `-opentsdbListenAddr` doesn't send
any data. This issue blocked other connections from being handled.

> This bug can be tested with:
> 1. Start VictoriaMetrics Single-node with `-opentsdbListenAddr=:4242`.
> 2. Run: `telnet 127.0.0.1 4242` without typing any data after
connection established.
> 3. Run (in another terminal, after step 2): `curl -H 'Content-Type:
application/json' -d
'{"metric":"x.y.z","value":2222222.34,"tags":{"t1":"v1","t2":"v2"}}'
http://localhost:4242/api/put`
> 
> Before the change:
> - Step 3 was blocked infinitely.
> 
> Expect result after the change:
> - Step 3 was executed.
> - Connection established by step 2 will be closed after 5 seconds.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-11-24 14:31:19 +02:00
Mathias Palmersheim
70be2e7ea3 Remove threshold from available cpu panel (#10056)
### Describe Your Changes

fixes #9988 by removing the cpu threshold from the Available CPU panel

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-24 14:16:35 +02:00
Kirill Yurkov
61796e355a docs: link faq for large indexdb (#10061)
Clarified the index size note in
docs/guides/understand-your-setup-size/README.md to steer readers toward
the FAQ when indexdb feels oversized, noting typical ratios and
troubleshooting guidance.
2025-11-24 14:04:05 +02:00
dependabot[bot]
2ae3fd47eb build(deps): bump actions/checkout from 5 to 6 (#10060)
Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to
6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/releases">actions/checkout's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>v6-beta by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2298">actions/checkout#2298</a></li>
<li>update readme/changelog for v6 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2311">actions/checkout#2311</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5.0.0...v6.0.0">https://github.com/actions/checkout/compare/v5.0.0...v6.0.0</a></p>
<h2>v6-beta</h2>
<h2>What's Changed</h2>
<p>Updated persist-credentials to store the credentials under
<code>$RUNNER_TEMP</code> instead of directly in the local git
config.</p>
<p>This requires a minimum Actions Runner version of <a
href="https://github.com/actions/runner/releases/tag/v2.329.0">v2.329.0</a>
to access the persisted credentials for <a
href="https://docs.github.com/en/actions/tutorials/use-containerized-services/create-a-docker-container-action">Docker
container action</a> scenarios.</p>
<h2>v5.0.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5...v5.0.1">https://github.com/actions/checkout/compare/v5...v5.0.1</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/blob/main/CHANGELOG.md">actions/checkout's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<h2>V6.0.0</h2>
<ul>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
</ul>
<h2>V5.0.1</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<h2>V5.0.0</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
</ul>
<h2>V4.3.1</h2>
<ul>
<li>Port v6 cleanup to v4 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li>
</ul>
<h2>V4.3.0</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
<li>Adjust positioning of user email note and permissions heading by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li>
<li>Update CODEOWNERS for actions by <a
href="https://github.com/TingluoHuang"><code>@​TingluoHuang</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li>
<li>Update package dependencies by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li>
</ul>
<h2>v4.2.2</h2>
<ul>
<li><code>url-helper.ts</code> now leverages well-known environment
variables by <a href="https://github.com/jww3"><code>@​jww3</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li>
<li>Expand unit test coverage for <code>isGhes</code> by <a
href="https://github.com/jww3"><code>@​jww3</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li>
</ul>
<h2>v4.2.1</h2>
<ul>
<li>Check out other refs/* by commit if provided, fall back to ref by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li>
</ul>
<h2>v4.2.0</h2>
<ul>
<li>Add Ref and Commit outputs by <a
href="https://github.com/lucacome"><code>@​lucacome</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1180">actions/checkout#1180</a></li>
<li>Dependency updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>- <a
href="https://redirect.github.com/actions/checkout/pull/1777">actions/checkout#1777</a>,
<a
href="https://redirect.github.com/actions/checkout/pull/1872">actions/checkout#1872</a></li>
</ul>
<h2>v4.1.7</h2>
<ul>
<li>Bump the minor-npm-dependencies group across 1 directory with 4
updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1739">actions/checkout#1739</a></li>
<li>Bump actions/checkout from 3 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1697">actions/checkout#1697</a></li>
<li>Check out other refs/* by commit by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1774">actions/checkout#1774</a></li>
<li>Pin actions/checkout's own workflows to a known, good, stable
version. by <a href="https://github.com/jww3"><code>@​jww3</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1776">actions/checkout#1776</a></li>
</ul>
<h2>v4.1.6</h2>
<ul>
<li>Check platform to set archive extension appropriately by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1732">actions/checkout#1732</a></li>
</ul>
<h2>v4.1.5</h2>
<ul>
<li>Update NPM dependencies by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1703">actions/checkout#1703</a></li>
<li>Bump github/codeql-action from 2 to 3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1694">actions/checkout#1694</a></li>
<li>Bump actions/setup-node from 1 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1696">actions/checkout#1696</a></li>
<li>Bump actions/upload-artifact from 2 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1695">actions/checkout#1695</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1af3b93b68"><code>1af3b93</code></a>
update readme/changelog for v6 (<a
href="https://redirect.github.com/actions/checkout/issues/2311">#2311</a>)</li>
<li><a
href="71cf2267d8"><code>71cf226</code></a>
v6-beta (<a
href="https://redirect.github.com/actions/checkout/issues/2298">#2298</a>)</li>
<li><a
href="069c695914"><code>069c695</code></a>
Persist creds to a separate file (<a
href="https://redirect.github.com/actions/checkout/issues/2286">#2286</a>)</li>
<li><a
href="ff7abcd0c3"><code>ff7abcd</code></a>
Update README to include Node.js 24 support details and requirements (<a
href="https://redirect.github.com/actions/checkout/issues/2248">#2248</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/checkout/compare/v5...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/checkout&package-manager=github_actions&previous-version=5&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 14:00:22 +02:00
Max Kotliar
ebad7e5496 docs: Describe relation between slow inserts and unsorted labels. 2025-11-24 13:35:23 +02:00
Max Kotliar
e52de06ee5 docs: sync flags in docs with actual binaries 2025-11-24 13:16:22 +02:00
Aliaksandr Valialkin
38dd971f58 docs/victoriametrics/vmalert.md: clarify that templates can be used inside rule labels
Rule labels can contain templates in the same way as annotations.
See aad6ab009e/app/vmalert/rule/alerting_test.go (L1192)
and https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#templating

Document this, since users sometimes ask this question.
2025-11-24 10:50:55 +01:00
Artem Fetishev
aad6ab009e lib/storage: minor metricNameSearch fixes (#10065)
- Fix comment
- Re-use dst instead introducing a new variable.

This change has been requested to be in a separated PR during the
pt-index (#8134) code review.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 20:06:20 +01:00
Artem Fetishev
2c125e14c7 lib/storage: also create parts.json on parition creation (#10051)
Currently, when a partition is created its corresponding parts.json file
is not created right away (see createNewParition()). Its creation is
delayed until the first part files are created on disk (see
swapSrcWithDstParts()). However, the parts.json file is created for a
possibly empty partition when an existing partition is opened (see
mustOpenPartition()) and when a partition snapshot is create (see
MustCreateSnapshotAt()).

I.e. `parts.json` is an important part of a partition, since it is an
artifact that describes the partition contents. And it should be created
on pt creation even if its contents is empty.

To be honest, this change is mostly a no-op for the current storage
implementation. It only makes the code consistent, i.e. the parts.json
is created along with the partition.

However having it created when a partition is created becomes in
pt-index (#7599, #8134), because it allows having partitions with no
data and therefore without parts.json file. Still not a big deal but the
unit tests start failing.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 14:19:53 +01:00
Artem Fetishev
13dc60e257 lib/storage: refactoring - move dateMetricIDCache code to a separate file (#10055)
dateMetricIDCache does not belong to storage anymore since it has been
moved to indexDB. Instead moving the case to index_db.go, move it to a
separate file in order to navigate the code more easily.

No changes have been done to the code or tests.

Follow up for: #9983

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Alexander Frolov <9749087+fxrlv@users.noreply.github.com>
2025-11-21 13:52:33 +01:00
Artem Fetishev
ed64c90e7a lib/storage: fix comments related to nextDayMetricIDs
Follow-up for 49b0a4fb16

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 13:31:20 +01:00
Artem Fetishev
49b0a4fb16 lib/storage: refactoring - simplify nextDayMetricIDs data structure (#10058)
The data structure used for holding the nextDayMetricIDs is too complex
and can be simplified (flattened).

Follow up for: #9983

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 13:02:02 +01:00
Artem Fetishev
5141496c43 lib/storage: add overlapsWith() and contains() methods to TimeRange (#10059)
The change was introduced in pt-index PR (#8134) and is extracted into a
separate PR.

Currently used in partition_search and partition. If you see more places
like this, please let me know.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 12:24:40 +01:00
Andrii Chubatiuk
24fac64875 docs: add warning blockquote regarding latest backup lifecycle policy (#10054)
Update formatting for warning text.

<img width="732" height="432" alt="image"
src="https://github.com/user-attachments/assets/1549e69a-fc65-445f-b567-9b5e4e1a8617"
/>
2025-11-20 13:46:34 +04:00
Aliaksandr Valialkin
8250f469a7 docs/victoriametrics/Articles.md: add https://medium.com/@kanakaraju896/backing-up-victoriametrics-data-a-complete-guide-24473c74450f 2025-11-20 08:36:59 +01:00
Aliaksandr Valialkin
7fb0f0e015 docs/victoriametrics/Articles.md: add https://blackmetalz.github.io/why-i-switched-to-victoriametrics-scaling-from-small-business-to-enterprise.html 2025-11-20 08:33:19 +01:00
Andrii Chubatiuk
563dbeaea1 app/vmalert: do not increment errors counter on cancel context
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10027
2025-11-19 13:32:16 +01:00
Nikolay
7e6468c1e3 lib/storage: properly increment missing tsids metric
Bug was introduced at 2380e4829d

Due to typo vm_missing_tsids_for_metric_id_total metric was incremented instead of vm_missing_metric_names_for_metric_id_total for missing metricName for metricID search.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10041
2025-11-19 13:29:44 +01:00
Hui Wang
328f33202f chore: clarify vmalert -external.label usage (#10042)
To clarify that HA vmalert doesn't need to specify `-external.label`.
2025-11-19 13:28:36 +01:00
Fred Navruzov
951331db80 docs/vmanomaly: release v1.28.0 (#10031)
### Describe Your Changes

Upgraded vmanomaly docs & guides to release v1.28.0 (UI v1.2.0)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-18 21:47:03 +02:00
Andrii Chubatiuk
e6139be8ba docs/vmbackupmanager: mention version since which -backupTypeTagName flag is available (#10038)
Mention version since which `backupTypeTagName` flag is available
2025-11-18 18:56:19 +04:00
Andrii Chubatiuk
77e5920014 app/vmbackupmanager: set backup type tag on backup's items
* app/vmbackupmanager: set VMBackupType tag on backup's items

* address review comments
2025-11-18 16:30:13 +04:00
Zakhar Bessarab
78049e991b docs/cluster: remove mention of select for metadata (#10034)
vmselect does not have a flag to enable metadata querying, remove
invalid reference to it from the docs.
2025-11-18 15:32:20 +04:00
Artem Fetishev
c972d70f00 docs: update VictoriaMetrics components version to v1.130.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 22:03:17 +01:00
Artem Fetishev
b947562f2b deployment/docker: update VM components version to v1.130.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 21:56:42 +01:00
Artem Fetishev
344a81fa20 docs: bump last LTS versions
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 20:14:07 +00:00
Artem Fetishev
4b022ea8a8 docs/CHANGELOG.md: update changelog with LTS release notes
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 20:08:16 +00:00
Artem Fetishev
04c24fc831 lib/workingsetcache: Fix bytesSize metric calculation (#10025)
Follow-up for 3e6fc445a9

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 13:49:17 +01:00
Artem Fetishev
d2f78e4b2b docs/CHANGELOG.md: cut v1.130.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-14 17:45:20 +00:00
Max Kotliar
3995837c58 docs: update latest version in docs to v1.130.0 2025-11-14 19:37:14 +02:00
Artem Fetishev
1d53496f98 make vmui-update
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-14 17:21:49 +00:00
Artem Fetishev
73a1ce2dd6 lib/storage: Move dateMetricIDCache to indexDB (#9983)
Looks like the `dateMetricIDCache` must be per indexDB:

- the use of this cache and `is.hasDateMetricID()` often go in pairs. So
it makes
  sense to use this cache in that method.
- The same is true for `createPerDayIndexes()`: everytime the index
entry is
  created, a corresponding entry is added to the cache.
- As a result the generation field is also removed from the cache.

Related to #7599 and #8134.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-14 16:03:28 +01:00
Aliaksandr Valialkin
daa88f6a43 docs/victoriametrics: cross-link rebalancing section at VictoriaMetrics cluster docs and the corresponding question at the FAQ page 2025-11-14 15:36:57 +01:00
Aliaksandr Valialkin
7bff73b0f7 docs/victoriametrics/Cluster-VictoriaMetrics.md: add rebalancing chapter, which explains how to rebalance data among vmstorage nodes
This is very frequent question from new users of VcitoriaMetrcs who migrate from other solutions
with automatic data rebalancing among storage nodes, so it is a good idea to cover it in the docs.
2025-11-14 15:32:29 +01:00
Max Kotliar
bf3b1cf6b6 lib/storage/metricsmetadata: ensure deterministic sorting for identical metric names across tenants
Metrics metadata is loaded from a per-tenant storage map
(perTenantStorage map[uint64]map[string]*Row), so result rows order is
non-deterministic. The existing sortRows implementation only sorts by
metric name and ingestion time, which means rows that differ only by
tenant/account ID still sorted undeterministically.

This change updates `sortRows` to include account\project identifiers in
the comparison, ensuring stable and deterministic ordering for metadata
entries that share the same metric name and timestamp.

First discovered as flaky test:

--- FAIL: TestStorageRead (0.00s)
    storage_test.go:337: unexpected rows get result (-want, +got):
          []*metricsmetadata.Row{
          	&{
          		... // 2 ignored and 1 identical fields
          		Help:      "uselesshelp1",
          		Unit:      "seconds1",
        - 		AccountID: 1,
        + 		AccountID: 0,
        - 		ProjectID: 1,
        + 		ProjectID: 0,
          		Type:      1,
          	},
          	&{
          		... // 2 ignored and 1 identical fields
          		Help:      "uselesshelp1",
          		Unit:      "seconds1",
        - 		AccountID: 0,
        + 		AccountID: 1,
        - 		ProjectID: 0,
        + 		ProjectID: 1,
          		Type:      1,
          	},
          }
FAIL

https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/actions/runs/19361594138/job/55394642029#step:4:133
2025-11-14 15:22:27 +02:00
Max Kotliar
a10ff67354 docs/changelog: Add links to changelog 2025-11-14 13:41:59 +02:00
Haley Wang
9a8463df42 lib/storage: add a value check for retentionFilter to ensure it does not exceed retentionPeriod 2025-11-14 12:50:46 +02:00
Max Kotliar
7e22b169f1 docs: Add metrics metadata how to use in docs
follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9487
2025-11-14 10:37:15 +01:00
f41gh7
80c1af5af1 apptest: add metrics metadata test for vmsingle
related issue github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
2025-11-14 10:29:28 +01:00
f41gh7
5a587f2006 app/{vmstorage,vmselect,vminsert}: introduce metrics metadata storage
This commits adds storage part and cluster RPC methods for metrics metadata.

 Key concepts:
* vmstorage persists metadata in-memory only.
* vmstorage evicts metadata records older than 1 hour.
* vmstorage stores only the last value of metadata for time series
  metric name.
* vminsert opens an additional TCP connection to the vmstorage for
  metadata write requests.
* vmselect doesn't support `limit_per_metric_name`.

This feature is available optional and must be enabled via flag - `-enableMetadata` provided to vminsert/vmsingle.

Fixes github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
2025-11-14 10:24:38 +01:00
Aliaksandr Valialkin
847cd1e336 docs/guides/understand-your-setup-size/README.md: remove the misleading recommendation for having at least 2vCPU cores per each vmstorage node
vmstorage nodes work perfectly with one CPU core and even with 10% of a single CPU core
if the allocated CPU resources matches their workload.

It is better to recommend allocating the an interger number of CPU cores to vmstorage
in order to achieve an optimal performance, since vmstorage allocates internal resources
according to the available CPU cores. If there is a fractional number of CPU cores,
then the allocation of internal resources may be not so optimal.

Fractional number of CPU cores may also lead to increased latencies and stalls
because some P threads at Go runtime won't be able to run goroutines from their ready queues
in a timely manner becasue of the lack of CPU time. See https://victoriametrics.com/blog/kubernetes-cpu-go-gomaxprocs/
2025-11-14 09:48:30 +01:00
Aliaksandr Valialkin
c86857b269 docs/victoriametrics/vmagent.md: mention that it isn't recommended increasing the -maxConcurrentRequests command-line flag value in general case
Too big values for the -maxConcurrentRequests command-line flag increase memory usage
and increase CPU overhead for processing incoming requests in most cases.
The only valid reason for increasing the value for -maxConcurrentRequests command-line flag
is when many clients send data to vmagent over very slow network.
2025-11-14 09:40:31 +01:00
Hui Wang
c93937101c Improve vmalert UI tip (#9998) 2025-11-13 21:04:39 +01:00
Aliaksandr Valialkin
cca7380dd3 docs/victoriametrics: fix broken link to /api/v1/rules docs at Prometheus 2025-11-13 19:40:10 +01:00
Aliaksandr Valialkin
ca3b9b18b5 docs/victoriametrics/README.md: add context links to the FAQ entry describing why IndexDB size may be too large 2025-11-13 19:36:18 +01:00
Nikolay
10f7cd2ffc lib/encoding/zstd: properly apply size limits
Previously, zstd Decoder didn't take in account Request Size limits
applied by VictoriaMetrics components.  And in case of incorrectly formed zstd block, VictoriaMetrics
component may allocate extra memory. Which may lead to the OOM errors.

This commit makes ingest endpoints check frame content size and window size headers based on MaxRequest Limits.
2025-11-13 18:13:23 +01:00
Hui Wang
fa85726a82 vmalert: print the error message as value if templating fails in alerting rule
For users, if an alerting rule has a misconfigured annotation, it's more
important to deliver the alert when the rule triggers rather than skip
it with templating error logs.
Then users can see the faulty annotation in alert message and fix it.

Note: the previous behavior is retained in replay mode because errors
there should be noticed immediately; hiding them could waste time,
resources and require a re-replay after fixes.
Also the rule's status in the vmalert UI remains unhealthy if templating
failed.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9853
2025-11-13 17:34:55 +01:00
Hui Wang
567c084d6d vmalert: drop labels with empty values in generated alerts and time series
In prometheus ecosystem, a label with an empty value equals no label,
since a query like `test{something=""}` matches all the series without
label `something`.
So for vmalert, preserving empty-value labels in generated alerts or
time series is unnecessary and can cause alert hash mismatches during
[restore](https://docs.victoriametrics.com/victoriametrics/vmalert/#alerts-state-on-restarts).
The empty-value label shouldn't come from datasource response since they
follow the same rule(omit empty-value labels), it may come from
`-external.label` or rule labels, but the empty value could be caused by
occasionally templating failures, which is hard to check there.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
2025-11-13 17:24:27 +01:00
Hui Wang
12a1388fbc vmalert: fix a potential race condition in web api during rule hot reload
Group rules are not protected by
[m.groupsMu](03c784e3e3/app/vmalert/manager.go (L25)),
they could be updated(with config hot reload) during `/api/v1/rule`,
`/api/v1/alert` and `/api/v1/alerts` API calls. This fix takes a
snapshot by calling `group.ToAPI()` first, making all reads safe.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9551
2025-11-13 17:22:25 +01:00
JAYICE
62c19b386a lib/httputl: fix failing to access http2 sd service by the shadow copy of http.DefaultTransport
Clone `http.DefaultTransport` and disable HTTP2 without resetting
`TLSClientConfig.NextProtos` in the shadow copy of
`http.DefaultTransport` will cause the request to HTTP/2 server to fail.
See https://github.com/golang/go/issues/39302.

To reproduce it, use a scrape config like:
```
scrape_configs:
  - job_name: test
    yandexcloud_sd_configs:
      - service: compute
        api_endpoint: https://api.cloud.yandex.net
```
Before the fix, access to the SD service would fail.

A solution is to specify `http/1.1` in  `TLSClientConfig.NextProtos`.

Related golang issue: https://github.com/golang/go/issues/39302

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9981
2025-11-13 17:19:15 +01:00
1968 changed files with 116920 additions and 98362 deletions

View File

@@ -5,7 +5,7 @@ body:
- type: textarea
id: describe-the-component
attributes:
label: Is your question request related to a specific component?
label: Is your question related to a specific component?
placeholder: |
VictoriaMetrics, vmagent, vmalert, vmui, etc...
validations:

48
.github/scripts/lint-changelog-tip.sh vendored Executable file
View File

@@ -0,0 +1,48 @@
#!/usr/bin/env sh
set -e
CHANGELOG_FILE="docs/victoriametrics/changelog/CHANGELOG.md"
GITHUB_BASE_REF=${GITHUB_BASE_REF:-"master"}
GIT_REMOTE=${GIT_REMOTE:-"origin"}
git diff "${GIT_REMOTE}/${GITHUB_BASE_REF}"...HEAD -- $CHANGELOG_FILE > diff.txt
if ! grep -q "^+" diff.txt; then
echo "No additions in CHANGELOG.md"
exit 0
fi
ADDED_LINES=$(grep "^+\S" diff.txt | sed 's/^+//')
START_TIP=$(grep -n "^## tip" "$CHANGELOG_FILE" | head -1 | cut -d: -f1)
if [ -z "$START_TIP" ]; then
echo "ERROR: ${CHANGELOG_FILE} does not contain a ## tip section"
exit 1
fi
END_TIP=$(awk "NR>$START_TIP && /^## / {print NR; exit}" "${CHANGELOG_FILE}")
if [ -z "$END_TIP" ]; then
END_TIP=$(wc -l < "$CHANGELOG_FILE")
fi
BAD=0
while IFS= read -r line; do
# Grep exact line inside the file and get line numbers
MATCHES=$(grep -n -F "$line" "$CHANGELOG_FILE" | cut -d: -f1)
for m in $MATCHES; do
if [ "$m" -lt "$START_TIP" ] || [ "$m" -gt "$END_TIP" ]; then
echo "'$line' on line ${m} is outside ## tip section (lines ${START_TIP}-${END_TIP})"
BAD=1
fi
done
done << EOF
$ADDED_LINES
EOF
if [ "$BAD" -ne 0 ]; then
echo "CHANGELOG modifications must be placed inside the ## tip section."
exit 1
fi
echo "CHANGELOG modifications are valid."

View File

@@ -61,7 +61,7 @@ jobs:
arch: amd64
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go

19
.github/workflows/changelog-linter.yml vendored Normal file
View File

@@ -0,0 +1,19 @@
name: 'changelog-linter'
on:
pull_request:
paths:
- "docs/victoriametrics/changelog/CHANGELOG.md"
jobs:
tip-lint:
runs-on: 'ubuntu-latest'
steps:
- uses: 'actions/checkout@v6'
with:
# needed for proper diff
fetch-depth: 0
- name: 'Validate that changelog changes are under ## tip'
run: |
GITHUB_BASE_REF=${{ github.base_ref }} ./.github/scripts/lint-changelog-tip.sh

View File

@@ -8,7 +8,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v5
uses: actions/checkout@v6
with:
fetch-depth: 0 # we need full history for commit verification

View File

@@ -29,7 +29,7 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Set up Go
id: go

View File

@@ -16,12 +16,12 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
with:
path: __vm
- name: Checkout private code
uses: actions/checkout@v5
uses: actions/checkout@v6
with:
repository: VictoriaMetrics/vmdocs
token: ${{ secrets.VM_BOT_GH_TOKEN }}

View File

@@ -32,7 +32,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go
@@ -71,7 +71,7 @@ jobs:
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go
@@ -97,7 +97,7 @@ jobs:
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go

View File

@@ -32,35 +32,41 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Node
uses: actions/setup-node@v6
- name: Cache node_modules
id: cache
uses: actions/cache@v5
with:
node-version: '24.x'
path: app/vmui/packages/vmui/node_modules
key: vmui-deps-${{ runner.os }}-${{ hashFiles('app/vmui/packages/vmui/package-lock.json', 'app/vmui/Dockerfile-build') }}
restore-keys: |
vmui-deps-${{ runner.os }}-
- name: Cache node-modules
uses: actions/cache@v4
with:
path: |
app/vmui/packages/vmui/node_modules
key: vmui-artifacts-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
restore-keys: vmui-artifacts-${{ runner.os }}-
- name: Install dependencies
if: steps.cache.outputs.cache-hit != 'true'
run: make vmui-install
- name: Run lint
id: lint
run: make vmui-lint
continue-on-error: true
env:
VMUI_SKIP_INSTALL: true
- name: Run tests
id: test
run: make vmui-test
continue-on-error: true
env:
VMUI_SKIP_INSTALL: true
- name: Run typecheck
id: typecheck
run: make vmui-typecheck
continue-on-error: true
env:
VMUI_SKIP_INSTALL: true
- name: Annotate Code Linting Results
uses: ataylorme/eslint-annotate-action@v3

View File

@@ -175,7 +175,7 @@
END OF TERMS AND CONDITIONS
Copyright 2019-2025 VictoriaMetrics, Inc.
Copyright 2019-2026 VictoriaMetrics, Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

View File

@@ -17,7 +17,7 @@ EXTRA_GO_BUILD_TAGS ?=
GO_BUILDINFO = -X '$(PKG_PREFIX)/lib/buildinfo.Version=$(APP_NAME)-$(DATEINFO_TAG)-$(BUILDINFO_TAG)'
TAR_OWNERSHIP ?= --owner=1000 --group=1000
GOLANGCI_LINT_VERSION := 2.4.0
GOLANGCI_LINT_VERSION := 2.7.2
.PHONY: $(MAKECMDGOALS)
@@ -435,7 +435,7 @@ release-vmutils-windows-goarch: \
vmctl-windows-$(GOARCH)-prod.exe
pprof-cpu:
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics@ $(PPROF_FILE)
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics $(PPROF_FILE)
fmt:
gofmt -l -w -s ./lib
@@ -471,7 +471,23 @@ integration-test:
apptest:
$(MAKE) victoria-metrics vmagent vmalert vmauth vmctl vmbackup vmrestore
go test ./apptest/... -skip="^TestCluster.*"
go test ./apptest/... -skip="^Test(Cluster|Legacy).*"
integration-test-legacy: victoria-metrics vmbackup vmrestore
OS=$$(uname | tr '[:upper:]' '[:lower:]'); \
ARCH=$$(uname -m | tr '[:upper:]' '[:lower:]' | sed 's/x86_64/amd64/'); \
VERSION=v1.132.0; \
VMSINGLE=victoria-metrics-$${OS}-$${ARCH}-$${VERSION}.tar.gz; \
VMCLUSTER=victoria-metrics-$${OS}-$${ARCH}-$${VERSION}-cluster.tar.gz; \
URL=https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/$${VERSION}; \
DIR=/tmp/$${VERSION}; \
test -d $${DIR} || (mkdir $${DIR} && \
curl --output-dir /tmp -LO $${URL}/$${VMSINGLE} && tar xzf /tmp/$${VMSINGLE} -C $${DIR} && \
curl --output-dir /tmp -LO $${URL}/$${VMCLUSTER} && tar xzf /tmp/$${VMCLUSTER} -C $${DIR} \
); \
VM_LEGACY_VMSINGLE_PATH=$${DIR}/victoria-metrics-prod \
VM_LEGACY_VMSTORAGE_PATH=$${DIR}/vmstorage-prod \
go test ./apptest/tests -run="^TestLegacySingle.*"
benchmark:
GOEXPERIMENT=synctest go test -bench=. ./lib/...
@@ -500,7 +516,8 @@ app-local-windows-goarch:
CGO_ENABLED=0 GOOS=windows GOARCH=$(GOARCH) go build $(RACE) -ldflags "$(GO_BUILDINFO)" -tags "$(EXTRA_GO_BUILD_TAGS)" -o bin/$(APP_NAME)-windows-$(GOARCH)$(RACE).exe $(PKG_PREFIX)/app/$(APP_NAME)
quicktemplate-gen: install-qtc
qtc
qtc -dir=lib
qtc -dir=app
install-qtc:
which qtc || go install github.com/valyala/quicktemplate/qtc@latest

View File

@@ -134,6 +134,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, "<h2>Single-node VictoriaMetrics</h2></br>")
fmt.Fprintf(w, "Version %s<br>", buildinfo.Version)
fmt.Fprintf(w, "See docs at <a href='https://docs.victoriametrics.com/'>https://docs.victoriametrics.com/</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
httpserver.WriteAPIHelp(w, [][2]string{

View File

@@ -10,9 +10,11 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/metricsmetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeserieslimits"
)
@@ -27,11 +29,9 @@ var selfScraperWG sync.WaitGroup
func startSelfScraper() {
selfScraperStopCh = make(chan struct{})
selfScraperWG.Add(1)
go func() {
defer selfScraperWG.Done()
selfScraperWG.Go(func() {
selfScraper(*selfScrapeInterval)
}()
})
}
func stopSelfScraper() {
@@ -48,6 +48,7 @@ func selfScraper(scrapeInterval time.Duration) {
var bb bytesutil.ByteBuffer
var rows prometheus.Rows
var metadataRows prometheus.MetadataRows
var mrs []storage.MetricRow
var labels []prompb.Label
t := time.NewTicker(scrapeInterval)
@@ -57,8 +58,12 @@ func selfScraper(scrapeInterval time.Duration) {
appmetrics.WritePrometheusMetrics(&bb)
s := bytesutil.ToUnsafeString(bb.B)
rows.Reset()
// VictoriaMetrics components don't expose metadata yet, only need to parse samples
rows.UnmarshalWithErrLogger(s, nil)
// Parse metrics and optionally metadata when enabled
if prommetadata.IsEnabled() {
rows, metadataRows = prometheus.UnmarshalWithMetadata(rows, metadataRows, s, nil)
} else {
rows.UnmarshalWithErrLogger(s, nil)
}
mrs = mrs[:0]
for i := range rows.Rows {
r := &rows.Rows[i]
@@ -91,6 +96,19 @@ func selfScraper(scrapeInterval time.Duration) {
if err := vmstorage.AddRows(mrs); err != nil {
logger.Errorf("cannot store self-scraped metrics: %s", err)
}
if len(metadataRows.Rows) > 0 {
mms := make([]metricsmetadata.Row, 0, len(metadataRows.Rows))
for _, mm := range metadataRows.Rows {
mms = append(mms, metricsmetadata.Row{
MetricFamilyName: bytesutil.ToUnsafeBytes(mm.Metric),
Help: bytesutil.ToUnsafeBytes(mm.Help),
Type: mm.Type,
})
}
if err := vmstorage.AddMetadataRows(mms); err != nil {
logger.Errorf("cannot store self-scraped metrics metadata: %s", err)
}
}
}
for {
select {

View File

@@ -27,6 +27,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/zabbixconnector"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
@@ -244,6 +245,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, "<h2>vmagent</h2>")
fmt.Fprintf(w, "Version %s<br>", buildinfo.Version)
fmt.Fprintf(w, "See docs at <a href='https://docs.victoriametrics.com/victoriametrics/vmagent/'>https://docs.victoriametrics.com/victoriametrics/vmagent/</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
httpserver.WriteAPIHelp(w, [][2]string{
@@ -350,6 +352,17 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
firehose.WriteSuccessResponse(w, r)
return true
case "/zabbixconnector/api/v1/history":
zabbixconnectorHistoryRequests.Inc()
if err := zabbixconnector.InsertHandlerForHTTP(nil, r); err != nil {
zabbixconnectorHistoryErrors.Inc()
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusBadRequest)
fmt.Fprintf(w, `{"error":%q}`, err.Error())
return true
}
w.WriteHeader(http.StatusOK)
return true
case "/newrelic":
newrelicCheckRequest.Inc()
w.Header().Set("Content-Type", "application/json")
@@ -644,6 +657,17 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
}
firehose.WriteSuccessResponse(w, r)
return true
case "zabbixconnector/api/v1/history":
zabbixconnectorHistoryRequests.Inc()
if err := zabbixconnector.InsertHandlerForHTTP(at, r); err != nil {
zabbixconnectorHistoryErrors.Inc()
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusBadRequest)
fmt.Fprintf(w, `{"error":%q}`, err.Error())
return true
}
w.WriteHeader(http.StatusOK)
return true
case "newrelic":
newrelicCheckRequest.Inc()
w.Header().Set("Content-Type", "application/json")
@@ -765,6 +789,9 @@ var (
opentelemetryPushRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/opentelemetry/v1/metrics", protocol="opentelemetry"}`)
opentelemetryPushErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/opentelemetry/v1/metrics", protocol="opentelemetry"}`)
zabbixconnectorHistoryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/zabbixconnector/api/v1/history", protocol="zabbixconnector"}`)
zabbixconnectorHistoryErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/zabbixconnector/api/v1/history", protocol="zabbixconnector"}`)
newrelicWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/newrelic/infra/v2/metrics/events/bulk", protocol="newrelic"}`)
newrelicWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/newrelic/infra/v2/metrics/events/bulk", protocol="newrelic"}`)

View File

@@ -78,7 +78,7 @@ func insertRows(at *auth.Token, rows []newrelic.Row, extraLabels []prompb.Label)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(len(rows))
rowsInserted.Add(samplesCount)
if at != nil {
rowsTenantInserted.Get(at).Add(samplesCount)
}

View File

@@ -25,7 +25,7 @@ var (
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="opentelemetry"}`)
)
// InsertHandler processes metrics from given reader.
// InsertHandlerForReader processes metrics from given reader.
func InsertHandlerForReader(at *auth.Token, r io.Reader, encoding string) error {
return stream.ParseStream(r, encoding, nil, func(tss []prompb.TimeSeries, mms []prompb.MetricMetadata) error {
return insertRows(at, tss, mms, nil)

View File

@@ -15,7 +15,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/awsapi"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding/zstd"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -203,14 +202,10 @@ func (c *client) init(argIdx, concurrency int, sanitizedURL string) {
c.retriesCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_retries_count_total{url=%q}`, c.sanitizedURL))
c.sendDuration = metrics.GetOrCreateFloatCounter(fmt.Sprintf(`vmagent_remotewrite_send_duration_seconds_total{url=%q}`, c.sanitizedURL))
metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_queues{url=%q}`, c.sanitizedURL), func() float64 {
return float64(*queues)
return float64(concurrency)
})
for i := 0; i < concurrency; i++ {
c.wg.Add(1)
go func() {
defer c.wg.Done()
c.runWorker()
}()
for range concurrency {
c.wg.Go(c.runWorker)
}
logger.Infof("initialized client for -remoteWrite.url=%q", c.sanitizedURL)
}
@@ -554,9 +549,9 @@ func getRetryDuration(retryAfterDuration, retryDuration, maxRetryDuration time.D
// For more details, see: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9417
func repackBlockFromZstdToSnappy(zstdBlock []byte) ([]byte, error) {
plainBlock := make([]byte, 0, len(zstdBlock)*2)
plainBlock, err := zstd.Decompress(plainBlock, zstdBlock)
plainBlock, err := encoding.DecompressZSTD(plainBlock, zstdBlock)
if err != nil {
return nil, fmt.Errorf("zstd: decompress: %s", err)
return nil, err
}
return snappy.Encode(nil, plainBlock), nil

View File

@@ -48,11 +48,7 @@ func newPendingSeries(fq *persistentqueue.FastQueue, isVMRemoteWrite *atomic.Boo
ps.wr.significantFigures = significantFigures
ps.wr.roundDigits = roundDigits
ps.stopCh = make(chan struct{})
ps.periodicFlusherWG.Add(1)
go func() {
defer ps.periodicFlusherWG.Done()
ps.periodicFlusher()
}()
ps.periodicFlusherWG.Go(ps.periodicFlusher)
return &ps
}

View File

@@ -9,14 +9,14 @@ import (
"sync"
"sync/atomic"
"github.com/VictoriaMetrics/metrics"
"gopkg.in/yaml.v2"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"go.yaml.in/yaml/v3"
"github.com/VictoriaMetrics/metrics"
)
var (
@@ -139,6 +139,7 @@ func loadRelabelConfigs() (*relabelConfigs, error) {
remoteWriteRelabelConfigData.Store(&rawCfg)
rcs.global = global
}
if len(*relabelConfigPaths) > len(*remoteWriteURLs) {
return nil, fmt.Errorf("too many -remoteWrite.urlRelabelConfig args: %d; it mustn't exceed the number of -remoteWrite.url args: %d",
len(*relabelConfigPaths), (len(*remoteWriteURLs)))
@@ -176,19 +177,9 @@ type relabelConfigs struct {
perURL []*promrelabel.ParsedConfigs
}
// isSet indicates whether (global or per-URL) command-line flags is set
func (rcs *relabelConfigs) isSet() bool {
if rcs == nil {
return false
}
if rcs.global.Len() > 0 {
return true
}
for _, pc := range rcs.perURL {
if pc.Len() > 0 {
return true
}
}
return false
return *relabelConfigPathGlobal != "" || len(*relabelConfigPaths) > 0
}
// initLabelsGlobal must be called after parsing command-line flags.

View File

@@ -59,7 +59,7 @@ var (
"See also -remoteWrite.maxDiskUsagePerURL and -remoteWrite.disableOnDiskQueue")
keepDanglingQueues = flag.Bool("remoteWrite.keepDanglingQueues", false, "Keep persistent queues contents at -remoteWrite.tmpDataPath in case there are no matching -remoteWrite.url. "+
"Useful when -remoteWrite.url is changed temporarily and persistent queue files will be needed later on.")
queues = flag.Int("remoteWrite.queues", cgroup.AvailableCPUs()*2, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
queues = flagutil.NewArrayInt("remoteWrite.queues", cgroup.AvailableCPUs()*2, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
"isn't enough for sending high volume of collected data to remote storage. "+
"Default value depends on the number of available CPU cores. It should work fine in most cases since it minimizes resource usage")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
@@ -176,13 +176,6 @@ func Init() {
})
}
if *queues > maxQueues {
*queues = maxQueues
}
if *queues <= 0 {
*queues = 1
}
if len(*shardByURLLabels) > 0 && len(*shardByURLIgnoreLabels) > 0 {
logger.Fatalf("-remoteWrite.shardByURL.labels and -remoteWrite.shardByURL.ignoreLabels cannot be set simultaneously; " +
"see https://docs.victoriametrics.com/victoriametrics/vmagent/#sharding-among-remote-storages")
@@ -215,9 +208,7 @@ func Init() {
dropDanglingQueues()
// Start config reloader.
configReloaderWG.Add(1)
go func() {
defer configReloaderWG.Done()
configReloaderWG.Go(func() {
for {
select {
case <-configReloaderStopCh:
@@ -227,7 +218,7 @@ func Init() {
reloadRelabelConfigs()
reloadStreamAggrConfigs()
}
}()
})
}
func dropDanglingQueues() {
@@ -267,17 +258,6 @@ func initRemoteWriteCtxs(urls []string) {
if len(urls) == 0 {
logger.Panicf("BUG: urls must be non-empty")
}
maxInmemoryBlocks := memory.Allowed() / len(urls) / *maxRowsPerBlock / 100
if maxInmemoryBlocks / *queues > 100 {
// There is no much sense in keeping higher number of blocks in memory,
// since this means that the producer outperforms consumer and the queue
// will continue growing. It is better storing the queue to file.
maxInmemoryBlocks = 100 * *queues
}
if maxInmemoryBlocks < 2 {
maxInmemoryBlocks = 2
}
rwctxs := make([]*remoteWriteCtx, len(urls))
rwctxIdx := make([]int, len(urls))
if retryMaxTime.String() != "" {
@@ -292,7 +272,7 @@ func initRemoteWriteCtxs(urls []string) {
if *showRemoteWriteURL {
sanitizedURL = fmt.Sprintf("%d:%s", i+1, remoteWriteURL)
}
rwctxs[i] = newRemoteWriteCtx(i, remoteWriteURL, maxInmemoryBlocks, sanitizedURL)
rwctxs[i] = newRemoteWriteCtx(i, remoteWriteURL, sanitizedURL)
rwctxIdx[i] = i
}
@@ -558,11 +538,9 @@ func tryPushMetadataToRemoteStorages(rwctxs []*remoteWriteCtx, mms []prompb.Metr
// Push metadata to remote storage systems in parallel to reduce
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
var anyPushFailed atomic.Bool
for _, rwctx := range rwctxs {
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
wg.Go(func() {
if !rwctx.tryPushMetadataInternal(mms) {
rwctx.pushFailures.Inc()
if forceDropSamplesOnFailure {
@@ -571,7 +549,7 @@ func tryPushMetadataToRemoteStorages(rwctxs []*remoteWriteCtx, mms []prompb.Metr
}
anyPushFailed.Store(true)
}
}(rwctx)
})
}
wg.Wait()
return !anyPushFailed.Load()
@@ -603,15 +581,13 @@ func tryPushTimeSeriesToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prom
// Push tssBlock to remote storage systems in parallel to reduce
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
var anyPushFailed atomic.Bool
for _, rwctx := range rwctxs {
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
wg.Go(func() {
if !rwctx.TryPushTimeSeries(tssBlock, forceDropSamplesOnFailure) {
anyPushFailed.Store(true)
}
}(rwctx)
})
}
wg.Wait()
return !anyPushFailed.Load()
@@ -633,13 +609,11 @@ func tryShardingTimeSeriesAmongRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock
if len(shard) == 0 {
continue
}
wg.Add(1)
go func(rwctx *remoteWriteCtx, tss []prompb.TimeSeries) {
defer wg.Done()
if !rwctx.TryPushTimeSeries(tss, forceDropSamplesOnFailure) {
wg.Go(func() {
if !rwctx.TryPushTimeSeries(shard, forceDropSamplesOnFailure) {
anyPushFailed.Store(true)
}
}(rwctx, shard)
})
}
wg.Wait()
return !anyPushFailed.Load()
@@ -848,7 +822,7 @@ type remoteWriteCtx struct {
rowsDroppedOnPushFailure *metrics.Counter
}
func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks int, sanitizedURL string) *remoteWriteCtx {
func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, sanitizedURL string) *remoteWriteCtx {
// strip query params, otherwise changing params resets pq
pqURL := *remoteWriteURL
pqURL.RawQuery = ""
@@ -863,6 +837,23 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
}
isPQDisabled := disableOnDiskQueue.GetOptionalArg(argIdx)
queuesSize := queues.GetOptionalArg(argIdx)
if queuesSize > maxQueues {
queuesSize = maxQueues
} else if queuesSize <= 0 {
queuesSize = 1
}
maxInmemoryBlocks := memory.Allowed() / len(*remoteWriteURLs) / *maxRowsPerBlock / 100
if maxInmemoryBlocks/queuesSize > 100 {
// There is no much sense in keeping higher number of blocks in memory,
// since this means that the producer outperforms consumer and the queue
// will continue growing. It is better storing the queue to file.
maxInmemoryBlocks = 100 * queuesSize
}
if maxInmemoryBlocks < 2 {
maxInmemoryBlocks = 2
}
fq := persistentqueue.MustOpenFastQueue(queuePath, sanitizedURL, maxInmemoryBlocks, maxPendingBytes, isPQDisabled)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
return float64(fq.GetPendingBytes())
@@ -880,16 +871,16 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
var c *client
switch remoteWriteURL.Scheme {
case "http", "https":
c = newHTTPClient(argIdx, remoteWriteURL.String(), sanitizedURL, fq, *queues)
c = newHTTPClient(argIdx, remoteWriteURL.String(), sanitizedURL, fq, queuesSize)
default:
logger.Fatalf("unsupported scheme: %s for remoteWriteURL: %s, want `http`, `https`", remoteWriteURL.Scheme, sanitizedURL)
}
c.init(argIdx, *queues, sanitizedURL)
c.init(argIdx, queuesSize, sanitizedURL)
// Initialize pss
sf := significantFigures.GetOptionalArg(argIdx)
rd := roundDigits.GetOptionalArg(argIdx)
pssLen := *queues
pssLen := queuesSize
if n := cgroup.AvailableCPUs(); pssLen > n {
// There is no sense in running more than availableCPUs concurrent pendingSeries,
// since every pendingSeries can saturate up to a single CPU.

View File

@@ -0,0 +1,80 @@
package zabbixconnector
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/zabbixconnector"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/zabbixconnector/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="zabbixconnector"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="zabbixconnector"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="zabbixconnector"}`)
)
// InsertHandlerForHTTP processes remote write for ZabbixConnector POST /zabbixconnector/v1/history request.
func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
extraLabels, err := protoparserutil.GetExtraLabels(req)
if err != nil {
return err
}
encoding := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, encoding, func(rows []zabbixconnector.Row) error {
return insertRows(at, rows, extraLabels)
})
}
func insertRows(at *auth.Token, rows []zabbixconnector.Row, extraLabels []prompb.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := len(rows)
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompb.Label{
Name: bytesutil.ToUnsafeString(tag.Key),
Value: bytesutil.ToUnsafeString(tag.Value),
})
}
labels = append(labels, extraLabels...)
samplesLen := len(samples)
samples = append(samples, prompb.Sample{
Value: r.Value,
Timestamp: r.Timestamp,
})
tssDst = append(tssDst, prompb.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
}
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

View File

@@ -116,7 +116,7 @@ func TestParse_Failure(t *testing.T) {
f([]string{"testdata/rules/rules_interval_bad.rules"}, "eval_offset should be smaller than interval")
f([]string{"testdata/rules/rules0-bad.rules"}, "unexpected token")
f([]string{"testdata/dir/rules0-bad.rules"}, "error parsing annotation")
f([]string{"testdata/dir/rules0-bad.rules"}, "invalid annotations")
f([]string{"testdata/dir/rules1-bad.rules"}, "duplicate in file")
f([]string{"testdata/dir/rules2-bad.rules"}, "function \"unknown\" not defined")
f([]string{"testdata/dir/rules3-bad.rules"}, "either `record` or `alert` must be set")
@@ -343,7 +343,6 @@ func TestGroupValidate_Failure(t *testing.T) {
},
},
}, true, "bad prometheus expr")
}
func TestGroupValidate_Success(t *testing.T) {

View File

@@ -76,11 +76,14 @@ func (t *Type) ValidateExpr(expr string) error {
if err != nil {
return fmt.Errorf("bad LogsQL expr: %q, err: %w", expr, err)
}
fields, _ := q.GetStatsByFields()
for i := range fields {
labels, err := q.GetStatsLabels()
if err != nil {
return fmt.Errorf("cannot obtain labels from LogsQL expr: %q, err: %w", expr, err)
}
for i := range labels {
// VictoriaLogs inserts `_time` field as a label in result when query with `stats by (_time:step)`,
// making the result meaningless and may lead to cardinality issues.
if fields[i] == "_time" {
if labels[i] == "_time" {
return fmt.Errorf("bad LogsQL expr: %q, err: cannot contain time buckets stats pipe `stats by (_time:step)`", expr)
}
}

View File

@@ -81,9 +81,7 @@ absolute path to all .tpl files in root.
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.")
)
var (
extURL *url.URL
)
var extURL *url.URL
func main() {
// Write flags and help message to stdout, since it is easier to grep or pipe.
@@ -161,7 +159,7 @@ func main() {
ctx, cancel := context.WithCancel(context.Background())
manager, err := newManager(ctx)
if err != nil {
logger.Fatalf("failed to init: %s", err)
logger.Fatalf("failed to create manager: %s", err)
}
logger.Infof("reading rules configuration file from %q", strings.Join(*rulePath, ";"))
groupsCfg, err := config.Parse(*rulePath, validateTplFn, *validateExpressions)

View File

@@ -3,6 +3,7 @@ package main
import (
"context"
"fmt"
"strconv"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
@@ -45,13 +46,15 @@ func (m *manager) ruleAPI(gID, rID uint64) (rule.ApiRule, error) {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
g, ok := m.groups[gID]
group, ok := m.groups[gID]
if !ok {
return rule.ApiRule{}, fmt.Errorf("can't find group with id %d", gID)
}
g := group.ToAPI()
ruleID := strconv.FormatUint(rID, 10)
for _, r := range g.Rules {
if r.ID() == rID {
return r.ToAPI(), nil
if r.ID == ruleID {
return r, nil
}
}
return rule.ApiRule{}, fmt.Errorf("can't find rule with id %d in group %q", rID, g.Name)
@@ -62,17 +65,20 @@ func (m *manager) alertAPI(gID, aID uint64) (*rule.ApiAlert, error) {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
g, ok := m.groups[gID]
group, ok := m.groups[gID]
if !ok {
return nil, fmt.Errorf("can't find group with id %d", gID)
}
g := group.ToAPI()
for _, r := range g.Rules {
ar, ok := r.(*rule.AlertingRule)
if !ok {
if r.Type != rule.TypeAlerting {
continue
}
if apiAlert := ar.AlertToAPI(aID); apiAlert != nil {
return apiAlert, nil
alertID := strconv.FormatUint(aID, 10)
for _, a := range r.Alerts {
if a.ID == alertID {
return a, nil
}
}
}
return nil, fmt.Errorf("can't find alert with id %d in group %q", aID, g.Name)

View File

@@ -65,11 +65,9 @@ func TestManagerUpdateConcurrent(t *testing.T) {
const workers = 500
const iterations = 10
wg := sync.WaitGroup{}
wg.Add(workers)
for i := 0; i < workers; i++ {
go func(n int) {
defer wg.Done()
var wg sync.WaitGroup
for n := range workers {
wg.Go(func() {
r := rand.New(rand.NewSource(int64(n)))
for i := 0; i < iterations; i++ {
rnd := r.Intn(len(paths))
@@ -79,7 +77,7 @@ func TestManagerUpdateConcurrent(t *testing.T) {
}
_ = m.update(context.Background(), cfg, false)
}
}(i)
})
}
wg.Wait()
}

View File

@@ -80,14 +80,15 @@ func (as AlertState) String() string {
// AlertTplData is used to execute templating
type AlertTplData struct {
Type string
Labels map[string]string
Value float64
Expr string
AlertID uint64
GroupID uint64
ActiveAt time.Time
For time.Duration
Type string
Labels map[string]string
Value float64
Expr string
AlertID uint64
GroupID uint64
ActiveAt time.Time
For time.Duration
IsPartial bool
}
var tplHeaders = []string{
@@ -101,6 +102,7 @@ var tplHeaders = []string{
"{{ $groupID := .GroupID }}",
"{{ $activeAt := .ActiveAt }}",
"{{ $for := .For }}",
"{{ $isPartial := .IsPartial }}",
}
// ExecTemplate executes the Alert template for given
@@ -166,8 +168,8 @@ func templateAnnotations(annotations map[string]string, data AlertTplData, tmpl
ctmpl, _ := tmpl.Clone()
ctmpl = ctmpl.Option("missingkey=zero")
if err := templateAnnotation(&buf, builder.String(), tData, ctmpl, execute); err != nil {
r[key] = text
eg.Add(fmt.Errorf("key %q, template %q: %w", key, text, err))
r[key] = err.Error()
eg.Add(fmt.Errorf("(key: %q, value: %q): %w", key, text, err))
continue
}
r[key] = buf.String()
@@ -184,13 +186,13 @@ type tplData struct {
func templateAnnotation(dst io.Writer, text string, data tplData, tpl *textTpl.Template, execute bool) error {
tpl, err := tpl.Parse(text)
if err != nil {
return fmt.Errorf("error parsing annotation template: %w", err)
return fmt.Errorf("error parsing template: %w", err)
}
if !execute {
return nil
}
if err = tpl.Execute(dst, data); err != nil {
return fmt.Errorf("error evaluating annotation template: %w", err)
return fmt.Errorf("error evaluating template: %w", err)
}
return nil
}

View File

@@ -3,6 +3,7 @@ package notifier
import (
"bytes"
"context"
"errors"
"fmt"
"io"
"net/http"
@@ -13,7 +14,6 @@ import (
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/vmalertutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
@@ -86,6 +86,11 @@ func (am *AlertManager) Send(ctx context.Context, alerts []Alert, alertLabels []
err := am.send(ctx, alerts, alertLabels, headers)
am.metrics.alertsSendDuration.UpdateDuration(startTime)
if err != nil {
// the context can be cancelled on graceful shutdown
// or on group update. So no need to handle the error as usual.
if errors.Is(err, context.Canceled) {
return nil
}
am.metrics.alertsSendErrors.Add(len(alerts))
am.lastError = err.Error()
} else {
@@ -166,11 +171,6 @@ const alertManagerPath = "/api/v2/alerts"
func NewAlertManager(alertManagerURL string, fn AlertURLGenerator, authCfg promauth.HTTPClientConfig,
relabelCfg *promrelabel.ParsedConfigs, timeout time.Duration,
) (*AlertManager, error) {
if err := httputil.CheckURL(alertManagerURL); err != nil {
return nil, fmt.Errorf("invalid alertmanager URL: %w", err)
}
tls := &promauth.TLSConfig{}
if authCfg.TLSConfig != nil {
tls = authCfg.TLSConfig

View File

@@ -212,18 +212,16 @@ consul_sd_configs:
const workers = 500
const iterations = 10
wg := sync.WaitGroup{}
wg.Add(workers)
for i := 0; i < workers; i++ {
go func(n int) {
defer wg.Done()
var wg sync.WaitGroup
for n := range workers {
wg.Go(func() {
r := rand.New(rand.NewSource(int64(n)))
for i := 0; i < iterations; i++ {
rnd := r.Intn(len(paths))
_ = cw.reload(paths[rnd]) // update can fail and this is expected
_ = cw.notifiers()
}
}(i)
})
}
wg.Wait()
}

View File

@@ -13,6 +13,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/vmalertutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
@@ -229,6 +230,9 @@ func notifiersFromFlags(gen AlertURLGenerator) ([]Notifier, error) {
Headers: []string{headers.GetOptionalArg(i)},
}
if err := httputil.CheckURL(addr); err != nil {
return nil, fmt.Errorf("invalid notifier.url %q: %w", addr, err)
}
addr = strings.TrimSuffix(addr, "/")
am, err := NewAlertManager(addr+alertManagerPath, gen, authCfg, nil, sendTimeout.GetOptionalArg(i))
if err != nil {
@@ -266,7 +270,7 @@ func GetTargets() map[TargetType][]Target {
if getActiveNotifiers == nil {
return nil
}
var targets = make(map[TargetType][]Target)
targets := make(map[TargetType][]Target)
// use cached targets from configWatcher instead of getActiveNotifiers for the extra target labels
if cw != nil {
cw.targetsMu.RLock()

View File

@@ -55,9 +55,9 @@ func TestInitNegative(t *testing.T) {
*blackHole = oldBlackHole
}()
f := func(path, addr string, bh bool) {
f := func(path string, addr []string, bh bool) {
*configPath = path
*addrs = flagutil.ArrayString{addr}
*addrs = flagutil.ArrayString(addr)
*blackHole = bh
if err := Init(nil, ""); err == nil {
t.Fatalf("expected to get error; got nil instead")
@@ -65,9 +65,12 @@ func TestInitNegative(t *testing.T) {
}
// *configPath, *addrs and *blackhole are mutually exclusive
f("/dummy/path", "127.0.0.1", false)
f("/dummy/path", "", true)
f("", "127.0.0.1", true)
f("/dummy/path", []string{"127.0.0.1"}, false)
f("/dummy/path", []string{}, true)
f("", []string{"127.0.0.1"}, true)
// addr cannot be ""
f("", []string{""}, false)
f("", []string{"127.0.0.1", ""}, false)
}
func TestBlackHole(t *testing.T) {

View File

@@ -2,6 +2,7 @@ package rule
import (
"context"
"errors"
"fmt"
"hash/fnv"
"math"
@@ -246,16 +247,6 @@ func (ar *AlertingRule) GetAlerts() []*notifier.Alert {
return alerts
}
// GetAlert returns alert if id exists
func (ar *AlertingRule) GetAlert(id uint64) *notifier.Alert {
ar.alertsMu.RLock()
defer ar.alertsMu.RUnlock()
if ar.alerts == nil {
return nil
}
return ar.alerts[id]
}
func (ar *AlertingRule) logDebugf(at time.Time, a *notifier.Alert, format string, args ...any) {
if !ar.Debug {
return
@@ -321,6 +312,11 @@ type labelSet struct {
// On k conflicts in origin set, the original value is preferred and copied
// to processed with `exported_%k` key. The copy happens only if passed v isn't equal to origin[k] value.
func (ls *labelSet) add(k, v string) {
// do not add label with empty value, since it has no meaning.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
if v == "" {
return
}
ls.processed[k] = v
ov, ok := ls.origin[k]
if !ok {
@@ -350,14 +346,13 @@ func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*l
ls.processed[l.Name] = l.Value
}
// labels only support limited templating variables,
// including `labels`, `value` and `expr`, to avoid breaking alert states or causing cardinality issue with results
extraLabels, err := notifier.ExecTemplate(qFn, ar.Labels, notifier.AlertTplData{
Labels: ls.origin,
Value: m.Values[0],
Expr: ar.Expr,
})
if err != nil {
return nil, fmt.Errorf("failed to expand labels: %w", err)
}
for k, v := range extraLabels {
ls.add(k, v)
}
@@ -368,7 +363,7 @@ func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*l
if !*disableAlertGroupLabel && ar.GroupName != "" {
ls.add(alertGroupNameLabel, ar.GroupName)
}
return ls, nil
return ls, err
}
// execRange executes alerting rule on the given time range similarly to exec.
@@ -394,11 +389,7 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
return nil, err
}
alertID := hash(ls.processed)
as, err := ar.expandAnnotationTemplates(s, qFn, time.Time{}, ls)
if err != nil {
return nil, err
}
a := ar.newAlert(s, time.Time{}, ls.processed, as) // initial alert
a := ar.newAlert(s, time.Time{}, ls.processed, nil) // initial alert
prevT := time.Time{}
for i := range s.Values {
@@ -414,8 +405,6 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
// reset to Pending if there are gaps > EvalInterval between DPs
a.State = notifier.StatePending
a.ActiveAt = at
// re-template the annotations as active timestamp is changed
a.Annotations, _ = ar.expandAnnotationTemplates(s, qFn, at, ls)
a.Start = time.Time{}
} else if at.Sub(a.ActiveAt) >= ar.For && a.State != notifier.StateFiring {
a.State = notifier.StateFiring
@@ -461,7 +450,7 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
defer func() {
ar.state.add(curState)
if curState.Err != nil {
if curState.Err != nil && !errors.Is(curState.Err, context.Canceled) {
ar.metrics.errors.Inc()
}
}()
@@ -470,7 +459,8 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
return nil, fmt.Errorf("failed to execute query %q: %w", ar.Expr, err)
}
ar.logDebugf(ts, nil, "query returned %d series (elapsed: %s, isPartial: %t)", curState.Samples, curState.Duration, isPartialResponse(res))
isPartial := isPartialResponse(res)
ar.logDebugf(ts, nil, "query returned %d series (elapsed: %s, isPartial: %t)", curState.Samples, curState.Duration, isPartial)
qFn := func(query string) ([]datasource.Metric, error) {
res, _, err := ar.q.Query(ctx, query, ts)
return res.Data, err
@@ -484,8 +474,9 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
for i, m := range res.Data {
ls, err := ar.expandLabelTemplates(m, qFn)
if err != nil {
// only set error in current state, but do not break alert processing
curState.Err = err
return nil, curState.Err
logger.Errorf("got templating error in rule %s: %q", ar.Name, err)
}
at := ts
alertID := hash(ls.processed)
@@ -495,10 +486,11 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
at = a.ActiveAt
}
}
as, err := ar.expandAnnotationTemplates(m, qFn, at, ls)
as, err := ar.expandAnnotationTemplates(m, qFn, at, ls, isPartial)
if err != nil {
// only set error in current state, but do not break alert processing
curState.Err = err
return nil, curState.Err
logger.Errorf("got templating error in rule %s: %q", ar.Name, err)
}
expandedLabels[i] = ls
expandedAnnotations[i] = as
@@ -607,25 +599,26 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
func (ar *AlertingRule) expandLabelTemplates(m datasource.Metric, qFn templates.QueryFn) (*labelSet, error) {
ls, err := ar.toLabels(m, qFn)
if err != nil {
return nil, fmt.Errorf("failed to expand label templates: %s", err)
return ls, fmt.Errorf("failed to expand label templates: %s", err)
}
return ls, nil
}
func (ar *AlertingRule) expandAnnotationTemplates(m datasource.Metric, qFn templates.QueryFn, activeAt time.Time, ls *labelSet) (map[string]string, error) {
func (ar *AlertingRule) expandAnnotationTemplates(m datasource.Metric, qFn templates.QueryFn, activeAt time.Time, ls *labelSet, isPartial bool) (map[string]string, error) {
tplData := notifier.AlertTplData{
Value: m.Values[0],
Type: ar.Type.String(),
Labels: ls.origin,
Expr: ar.Expr,
AlertID: hash(ls.processed),
GroupID: ar.GroupID,
ActiveAt: activeAt,
For: ar.For,
Value: m.Values[0],
Type: ar.Type.String(),
Labels: ls.origin,
Expr: ar.Expr,
AlertID: hash(ls.processed),
GroupID: ar.GroupID,
ActiveAt: activeAt,
For: ar.For,
IsPartial: isPartial,
}
as, err := notifier.ExecTemplate(qFn, ar.Annotations, tplData)
if err != nil {
return nil, fmt.Errorf("failed to expand annotation templates: %s", err)
return as, fmt.Errorf("failed to expand annotation templates: %s", err)
}
return as, nil
}

View File

@@ -664,7 +664,7 @@ func TestAlertingRuleExecRange(t *testing.T) {
Name: "for-pending",
Type: config.NewPrometheusType().String(),
Labels: map[string]string{"alertname": "for-pending"},
Annotations: map[string]string{"activeAt": "5000"},
Annotations: map[string]string{},
State: notifier.StatePending,
ActiveAt: time.Unix(5, 0),
Value: 1,
@@ -684,7 +684,7 @@ func TestAlertingRuleExecRange(t *testing.T) {
Name: "for-firing",
Type: config.NewPrometheusType().String(),
Labels: map[string]string{"alertname": "for-firing"},
Annotations: map[string]string{"activeAt": "1000"},
Annotations: map[string]string{},
State: notifier.StateFiring,
ActiveAt: time.Unix(1, 0),
Start: time.Unix(5, 0),
@@ -705,7 +705,7 @@ func TestAlertingRuleExecRange(t *testing.T) {
Name: "for-hold-pending",
Type: config.NewPrometheusType().String(),
Labels: map[string]string{"alertname": "for-hold-pending"},
Annotations: map[string]string{"activeAt": "5000"},
Annotations: map[string]string{},
State: notifier.StatePending,
ActiveAt: time.Unix(5, 0),
Value: 1,
@@ -1120,7 +1120,7 @@ func TestAlertingRuleLimit_Success(t *testing.T) {
}
func TestAlertingRule_Template(t *testing.T) {
f := func(rule *AlertingRule, metrics []datasource.Metric, alertsExpected map[uint64]*notifier.Alert) {
f := func(rule *AlertingRule, metrics []datasource.Metric, isResponsePartial bool, alertsExpected map[uint64]*notifier.Alert) {
t.Helper()
fakeGroup := Group{
@@ -1133,6 +1133,7 @@ func TestAlertingRule_Template(t *testing.T) {
entries: make([]StateEntry, 10),
}
fq.Add(metrics...)
fq.SetPartialResponse(isResponsePartial)
if _, err := rule.exec(context.TODO(), time.Now(), 0); err != nil {
t.Fatalf("unexpected error: %s", err)
@@ -1163,7 +1164,7 @@ func TestAlertingRule_Template(t *testing.T) {
}, []datasource.Metric{
metricWithValueAndLabels(t, 1, "instance", "foo"),
metricWithValueAndLabels(t, 1, "instance", "bar"),
}, map[uint64]*notifier.Alert{
}, false, map[uint64]*notifier.Alert{
hash(map[string]string{alertNameLabel: "common", "region": "east", "instance": "foo"}): {
Annotations: map[string]string{
"summary": `common: Too high connection number for "foo"`,
@@ -1192,14 +1193,14 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "{{ $labels.instance }}",
},
Annotations: map[string]string{
"summary": `{{ $labels.__name__ }}: Too high connection number for "{{ $labels.instance }}"`,
"summary": `{{ $labels.__name__ }}: Too high connection number for "{{ $labels.instance }}".{{ if $isPartial }} WARNING: Partial response detected - this alert may be incomplete. Please verify the results manually.{{ end }}`,
"description": `{{ $labels.alertname}}: It is {{ $value }} connections for "{{ $labels.instance }}"`,
},
alerts: make(map[uint64]*notifier.Alert),
}, []datasource.Metric{
metricWithValueAndLabels(t, 2, "__name__", "first", "instance", "foo", alertNameLabel, "override"),
metricWithValueAndLabels(t, 10, "__name__", "second", "instance", "bar", alertNameLabel, "override"),
}, map[uint64]*notifier.Alert{
}, false, map[uint64]*notifier.Alert{
hash(map[string]string{alertNameLabel: "override label", "exported_alertname": "override", "instance": "foo"}): {
Labels: map[string]string{
alertNameLabel: "override label",
@@ -1207,7 +1208,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "foo",
},
Annotations: map[string]string{
"summary": `first: Too high connection number for "foo"`,
"summary": `first: Too high connection number for "foo".`,
"description": `override: It is 2 connections for "foo"`,
},
},
@@ -1218,7 +1219,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "bar",
},
Annotations: map[string]string{
"summary": `second: Too high connection number for "bar"`,
"summary": `second: Too high connection number for "bar".`,
"description": `override: It is 10 connections for "bar"`,
},
},
@@ -1231,7 +1232,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "{{ $labels.instance }}",
},
Annotations: map[string]string{
"summary": `Alert "{{ $labels.alertname }}({{ $labels.alertgroup }})" for instance {{ $labels.instance }}`,
"summary": `Alert "{{ $labels.alertname }}({{ $labels.alertgroup }})" for instance {{ $labels.instance }}.{{ if $isPartial }} WARNING: Partial response detected - this alert may be incomplete. Please verify the results manually.{{ end }}`,
},
alerts: make(map[uint64]*notifier.Alert),
}, []datasource.Metric{
@@ -1239,7 +1240,7 @@ func TestAlertingRule_Template(t *testing.T) {
alertNameLabel, "originAlertname",
alertGroupNameLabel, "originGroupname",
"instance", "foo"),
}, map[uint64]*notifier.Alert{
}, true, map[uint64]*notifier.Alert{
hash(map[string]string{
alertNameLabel: "OriginLabels",
"exported_alertname": "originAlertname",
@@ -1255,7 +1256,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "foo",
},
Annotations: map[string]string{
"summary": `Alert "originAlertname(originGroupname)" for instance foo`,
"summary": `Alert "originAlertname(originGroupname)" for instance foo. WARNING: Partial response detected - this alert may be incomplete. Please verify the results manually.`,
},
},
})
@@ -1370,8 +1371,10 @@ func TestAlertingRule_ToLabels(t *testing.T) {
ar := &AlertingRule{
Labels: map[string]string{
"instance": "override", // this should override instance with new value
"group": "vmalert", // this shouldn't have effect since value in metric is equal
"instance": "override", // this should override instance with new value
"group": "vmalert", // this shouldn't have effect since value in metric is equal
"invalid_label": "{{ .Values.mustRuntimeFail }}",
"empty_label": "", // this should be dropped
},
Expr: "sum(vmalert_alerting_rules_error) by(instance, group, alertname) > 0",
Name: "AlertingRulesError",
@@ -1379,10 +1382,11 @@ func TestAlertingRule_ToLabels(t *testing.T) {
}
expectedOriginLabels := map[string]string{
"instance": "0.0.0.0:8800",
"group": "vmalert",
"alertname": "ConfigurationReloadFailure",
"alertgroup": "vmalert",
"instance": "0.0.0.0:8800",
"group": "vmalert",
"alertname": "ConfigurationReloadFailure",
"alertgroup": "vmalert",
"invalid_label": `error evaluating template: template: :1:298: executing "" at <.Values.mustRuntimeFail>: can't evaluate field Values in type notifier.tplData`,
}
expectedProcessedLabels := map[string]string{
@@ -1392,11 +1396,12 @@ func TestAlertingRule_ToLabels(t *testing.T) {
"exported_alertname": "ConfigurationReloadFailure",
"group": "vmalert",
"alertgroup": "vmalert",
"invalid_label": `error evaluating template: template: :1:298: executing "" at <.Values.mustRuntimeFail>: can't evaluate field Values in type notifier.tplData`,
}
ls, err := ar.toLabels(metric, nil)
if err != nil {
t.Fatalf("unexpected error: %s", err)
if err == nil || !strings.Contains(err.Error(), "error evaluating template") {
t.Fatalf("unexpected error %q", err.Error())
}
if !reflect.DeepEqual(ls.origin, expectedOriginLabels) {

View File

@@ -2,6 +2,7 @@ package rule
import (
"context"
"errors"
"fmt"
"strings"
"time"
@@ -197,7 +198,7 @@ func (rr *RecordingRule) exec(ctx context.Context, ts time.Time, limit int) ([]p
defer func() {
rr.state.add(curState)
if curState.Err != nil {
if curState.Err != nil && !errors.Is(curState.Err, context.Canceled) {
rr.metrics.errors.Inc()
}
}()
@@ -236,7 +237,8 @@ func (rr *RecordingRule) exec(ctx context.Context, ts time.Time, limit int) ([]p
Labels: stringToLabels(k),
Samples: []prompb.Sample{
{Value: decimal.StaleNaN, Timestamp: ts.UnixNano() / 1e6},
}})
},
})
}
rr.lastEvaluation = curEvaluation
return tss, nil
@@ -291,6 +293,11 @@ func (rr *RecordingRule) toTimeSeries(m datasource.Metric) prompb.TimeSeries {
}
// add extra labels configured by user
for k := range rr.Labels {
// do not add label with empty value, since it has no meaning.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
if rr.Labels[k] == "" {
continue
}
existingLabel := promrelabel.GetLabelByName(m.Labels, k)
if existingLabel != nil { // there is a conflict between extra and existing label
if existingLabel.Value == rr.Labels[k] {

View File

@@ -65,17 +65,15 @@ func TestRule_stateConcurrent(_ *testing.T) {
r := &AlertingRule{state: &ruleState{entries: make([]StateEntry, 20)}}
const workers = 50
const iterations = 100
wg := sync.WaitGroup{}
wg.Add(workers)
for i := 0; i < workers; i++ {
go func() {
defer wg.Done()
var wg sync.WaitGroup
for range workers {
wg.Go(func() {
for i := 0; i < iterations; i++ {
r.state.add(StateEntry{At: time.Now()})
r.state.getAll()
r.state.getLast()
}
}()
})
}
wg.Wait()
}

View File

@@ -209,15 +209,6 @@ func (ar *AlertingRule) AlertsToAPI() []*ApiAlert {
return alerts
}
// AlertToAPI generates apiAlert object from alert by its id(hash)
func (ar *AlertingRule) AlertToAPI(id uint64) *ApiAlert {
a := ar.GetAlert(id)
if a == nil {
return nil
}
return NewAlertAPI(ar, a)
}
// NewAlertAPI creates apiAlert for notifier.Alert
func NewAlertAPI(ar *AlertingRule, a *notifier.Alert) *ApiAlert {
aa := &ApiAlert{

View File

@@ -412,18 +412,18 @@ func (rh *requestHandler) groupAlerts() []rule.GroupAlerts {
defer rh.m.groupsMu.RUnlock()
var gAlerts []rule.GroupAlerts
for _, g := range rh.m.groups {
for _, group := range rh.m.groups {
var alerts []*rule.ApiAlert
g := group.ToAPI()
for _, r := range g.Rules {
a, ok := r.(*rule.AlertingRule)
if !ok {
if r.Type != rule.TypeAlerting {
continue
}
alerts = append(alerts, a.AlertsToAPI()...)
alerts = append(alerts, r.Alerts...)
}
if len(alerts) > 0 {
gAlerts = append(gAlerts, rule.GroupAlerts{
Group: g.ToAPI(),
Group: g,
Alerts: alerts,
})
}
@@ -444,12 +444,12 @@ func (rh *requestHandler) listAlerts(rf *rulesFilter) ([]byte, error) {
if !rf.matchesGroup(group) {
continue
}
for _, r := range group.Rules {
a, ok := r.(*rule.AlertingRule)
if !ok {
g := group.ToAPI()
for _, r := range g.Rules {
if r.Type != rule.TypeAlerting {
continue
}
lr.Data.Alerts = append(lr.Data.Alerts, a.AlertsToAPI()...)
lr.Data.Alerts = append(lr.Data.Alerts, r.Alerts...)
}
}

View File

@@ -9,6 +9,7 @@
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/vmalertutil"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/rule"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
) %}
{% func Controls(prefix, currentIcon, currentText string, icons, filters map[string]string, search bool) %}
@@ -78,6 +79,8 @@
{% func Welcome(r *http.Request) %}
{%= tpl.Header(r, navItems, "vmalert", getLastConfigError()) %}
<p>
Version {%s buildinfo.Version %} <br>
API:<br>
{% for _, p := range apiLinks %}
{%code p, doc := p[0], p[1] %}
@@ -602,11 +605,11 @@
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
<th scope="col" title="The time when event was created">Updated at</th>
<th scope="col" title="The time when the rule was executed">Updated at</th>
<th scope="col" class="w-10 text-center" title="How many series expression returns. Each series will represent an alert.">Series returned</th>
{% if seriesFetchedEnabled %}<th scope="col" class="w-10 text-center" title="How many series were scanned by datasource during the evaluation">Series fetched</th>{% endif %}
<th scope="col" class="w-10 text-center" title="How many seconds request took">Duration</th>
<th scope="col" class="text-center" title="Time used for rule execution">Executed at</th>
<th scope="col" class="text-center" title="The time used in execution query request">Execution timestamp</th>
<th scope="col" class="text-center" title="cURL command with request example">cURL</th>
</tr>
</thead>

File diff suppressed because it is too large Load Diff

View File

@@ -4,6 +4,7 @@ import (
"bytes"
"context"
"encoding/base64"
"errors"
"flag"
"fmt"
"math"
@@ -94,6 +95,8 @@ type UserInfo struct {
rt http.RoundTripper
requests *metrics.Counter
requestErrors *metrics.Counter
backendRequests *metrics.Counter
backendErrors *metrics.Counter
requestsDuration *metrics.Summary
}
@@ -105,13 +108,29 @@ type HeadersConf struct {
KeepOriginalHost *bool `yaml:"keep_original_host,omitempty"`
}
func (ui *UserInfo) beginConcurrencyLimit() error {
func (ui *UserInfo) beginConcurrencyLimit(ctx context.Context) error {
select {
case ui.concurrencyLimitCh <- struct{}{}:
return nil
default:
ui.concurrencyLimitReached.Inc()
return fmt.Errorf("cannot handle more than %d concurrent requests from user %s", ui.getMaxConcurrentRequests(), ui.name())
// The number of concurrently executed requests for the given user equals the limt.
// Wait until some of the currently executed requests are finished, so the current request could be executed.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
select {
case ui.concurrencyLimitCh <- struct{}{}:
return nil
case <-ctx.Done():
err := ctx.Err()
if errors.Is(err, context.DeadlineExceeded) {
// The current request couldn't be executed until the request timeout.
ui.concurrencyLimitReached.Inc()
return fmt.Errorf("cannot start executing the request during -maxQueueDuration=%s because %d concurrent requests from the user %s are executed",
*maxQueueDuration, ui.getMaxConcurrentRequests(), ui.name())
}
return fmt.Errorf("cannot start executing the request because %d concurrent requests from the user %s are executed: %w",
ui.getMaxConcurrentRequests(), ui.name(), err)
}
}
}
@@ -127,6 +146,28 @@ func (ui *UserInfo) getMaxConcurrentRequests() int {
return mcr
}
func (ui *UserInfo) stopHealthChecks() {
if ui == nil {
return
}
if ui.URLPrefix != nil {
bus := ui.URLPrefix.bus.Load()
bus.stopHealthChecks()
}
if ui.DefaultURL != nil {
bus := ui.DefaultURL.bus.Load()
bus.stopHealthChecks()
}
for i := range ui.URLMaps {
um := &ui.URLMaps[i]
if um.URLPrefix != nil {
bus := um.URLPrefix.bus.Load()
bus.stopHealthChecks()
}
}
}
// Header is `Name: Value` http header, which must be added to the proxied request.
type Header struct {
Name string
@@ -262,7 +303,7 @@ type URLPrefix struct {
// the list of backend urls
//
// the list can be dynamically updated if `discover_backend_ips` option is set.
bus atomic.Pointer[[]*backendURL]
bus atomic.Pointer[backendURLs]
// if this option is set, then backend ips for busOriginal are periodically re-discovered and put to bus.
discoverBackendIPs bool
@@ -286,21 +327,91 @@ func (up *URLPrefix) setLoadBalancingPolicy(loadBalancingPolicy string) error {
}
}
type backendURLs struct {
healthChecksContext context.Context
healthChecksCancel func()
healthChecksWG sync.WaitGroup
bus []*backendURL
}
func newBackendURLs() *backendURLs {
ctx, cancel := context.WithCancel(context.Background())
return &backendURLs{
healthChecksContext: ctx,
healthChecksCancel: cancel,
}
}
func (bus *backendURLs) add(u *url.URL) {
bus.bus = append(bus.bus, &backendURL{
url: u,
healthCheckContext: bus.healthChecksContext,
healthCheckWG: &bus.healthChecksWG,
})
}
func (bus *backendURLs) stopHealthChecks() {
bus.healthChecksCancel()
bus.healthChecksWG.Wait()
}
type backendURL struct {
brokenDeadline atomic.Uint64
broken atomic.Bool
healthCheckContext context.Context
healthCheckWG *sync.WaitGroup
concurrentRequests atomic.Int32
url *url.URL
}
func (bu *backendURL) isBroken() bool {
ct := fasttime.UnixTimestamp()
return ct < bu.brokenDeadline.Load()
return bu.broken.Load()
}
func (bu *backendURL) setBroken() {
deadline := fasttime.UnixTimestamp() + uint64((*failTimeout).Seconds())
bu.brokenDeadline.Store(deadline)
if bu.broken.CompareAndSwap(false, true) {
bu.healthCheckWG.Go(func() {
bu.runHealthCheck()
bu.broken.Store(false)
})
}
}
func (bu *backendURL) runHealthCheck() {
port := bu.url.Port()
if port == "" {
port = "80"
}
addr := net.JoinHostPort(bu.url.Hostname(), port)
t := time.NewTicker(*failTimeout)
defer t.Stop()
for {
select {
case <-t.C:
// Verify network connectivity via TCP dial before marking backend healthy.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9997
ctx, cancel := context.WithTimeout(bu.healthCheckContext, time.Second)
c, err := netutil.Dialer.DialContext(ctx, "tcp", addr)
cancel()
if err != nil {
if errors.Is(bu.healthCheckContext.Err(), context.Canceled) {
return
}
logger.Warnf("ignoring the backend at %s for %s because of dial error: %s", addr, *failTimeout, err)
continue
}
_ = c.Close()
return
case <-bu.healthCheckContext.Done():
return
}
}
}
func (bu *backendURL) get() {
@@ -312,8 +423,8 @@ func (bu *backendURL) put() {
}
func (up *URLPrefix) getBackendsCount() int {
pbus := up.bus.Load()
return len(*pbus)
bus := up.bus.Load()
return len(bus.bus)
}
// getBackendURL returns the backendURL depending on the load balance policy.
@@ -324,16 +435,15 @@ func (up *URLPrefix) getBackendsCount() int {
func (up *URLPrefix) getBackendURL() *backendURL {
up.discoverBackendAddrsIfNeeded()
pbus := up.bus.Load()
bus := *pbus
if len(bus) == 0 {
bus := up.bus.Load()
if len(bus.bus) == 0 {
return nil
}
if up.loadBalancingPolicy == "first_available" {
return getFirstAvailableBackendURL(bus)
return getFirstAvailableBackendURL(bus.bus)
}
return getLeastLoadedBackendURL(bus, &up.n)
return getLeastLoadedBackendURL(bus.bus, &up.n)
}
func (up *URLPrefix) discoverBackendAddrsIfNeeded() {
@@ -407,25 +517,24 @@ func (up *URLPrefix) discoverBackendAddrsIfNeeded() {
cancel()
// generate new backendURLs for the resolved IPs
var busNew []*backendURL
busNew := newBackendURLs()
for _, bu := range up.busOriginal {
host := bu.Hostname()
for _, addr := range hostToAddrs[host] {
buCopy := *bu
buCopy.Host = addr
busNew = append(busNew, &backendURL{
url: &buCopy,
})
busNew.add(&buCopy)
}
}
pbus := up.bus.Load()
if areEqualBackendURLs(*pbus, busNew) {
bus := up.bus.Load()
if areEqualBackendURLs(bus.bus, busNew.bus) {
return
}
// Store new backend urls
up.bus.Store(&busNew)
up.bus.Store(busNew)
bus.stopHealthChecks()
}
func areEqualBackendURLs(a, b []*backendURL) bool {
@@ -456,20 +565,23 @@ func getFirstAvailableBackendURL(bus []*backendURL) *backendURL {
for i := 1; i < len(bus); i++ {
if !bus[i].isBroken() {
bu = bus[i]
break
bu.get()
return bu
}
}
bu.get()
return bu
return nil
}
// getLeastLoadedBackendURL returns the backendURL with the minimum number of concurrent requests.
// getLeastLoadedBackendURL returns a non-broken backendURL with the lowest number of concurrent requests.
//
// backendURL.put() must be called on the returned backendURL after the request is complete.
func getLeastLoadedBackendURL(bus []*backendURL, atomicCounter *atomic.Uint32) *backendURL {
if len(bus) == 1 {
// Fast path - return the only backend url.
bu := bus[0]
if bu.isBroken() {
return nil
}
bu.get()
return bu
}
@@ -494,7 +606,7 @@ func getLeastLoadedBackendURL(bus []*backendURL, atomicCounter *atomic.Uint32) *
// Slow path - return the backend with the minimum number of concurrently executed requests.
buMinIdx := n % uint32(len(bus))
minRequests := bus[buMinIdx].concurrentRequests.Load()
for i := uint32(0); i < uint32(len(bus)); i++ {
for i := uint32(1); i < uint32(len(bus)); i++ {
idx := (n + i) % uint32(len(bus))
bu := bus[idx]
if bu.isBroken() {
@@ -508,6 +620,9 @@ func getLeastLoadedBackendURL(bus []*backendURL, atomicCounter *atomic.Uint32) *
}
}
buMin := bus[buMinIdx]
if buMin.isBroken() {
return nil
}
buMin.get()
atomicCounter.CompareAndSwap(n+1, buMinIdx+1)
return buMin
@@ -626,11 +741,9 @@ func initAuthConfig() {
configTimestamp.Set(fasttime.UnixTimestamp())
stopCh = make(chan struct{})
authConfigWG.Add(1)
go func() {
defer authConfigWG.Done()
authConfigWG.Go(func() {
authConfigReloader(sighupCh)
}()
})
}
func stopAuthConfig() {
@@ -702,7 +815,7 @@ func reloadAuthConfig() (bool, error) {
ok, err := reloadAuthConfigData(data)
if err != nil {
return false, fmt.Errorf("failed to pars -auth.config=%q: %w", *authConfigPath, err)
return false, fmt.Errorf("failed to parse -auth.config=%q: %w", *authConfigPath, err)
}
if !ok {
return false, nil
@@ -732,6 +845,11 @@ func reloadAuthConfigData(data []byte) (bool, error) {
acPrev := authConfig.Load()
if acPrev != nil {
acPrev.UnauthorizedUser.stopHealthChecks()
for i := range acPrev.Users {
acPrev.Users[i].stopHealthChecks()
}
metrics.UnregisterSet(acPrev.ms, true)
}
metrics.RegisterSet(ac.ms)
@@ -778,6 +896,8 @@ func parseAuthConfig(data []byte) (*AuthConfig, error) {
return nil, fmt.Errorf("cannot parse metric_labels for unauthorized_user: %w", err)
}
ui.requests = ac.ms.NewCounter(`vmauth_unauthorized_user_requests_total` + metricLabels)
ui.requestErrors = ac.ms.NewCounter(`vmauth_unauthorized_user_request_errors_total` + metricLabels)
ui.backendRequests = ac.ms.NewCounter(`vmauth_unauthorized_user_request_backend_requests_total` + metricLabels)
ui.backendErrors = ac.ms.NewCounter(`vmauth_unauthorized_user_request_backend_errors_total` + metricLabels)
ui.requestsDuration = ac.ms.NewSummary(`vmauth_unauthorized_user_request_duration_seconds` + metricLabels)
ui.concurrencyLimitCh = make(chan struct{}, ui.getMaxConcurrentRequests())
@@ -826,6 +946,8 @@ func parseAuthConfigUsers(ac *AuthConfig) (map[string]*UserInfo, error) {
return nil, fmt.Errorf("cannot parse metric_labels: %w", err)
}
ui.requests = ac.ms.GetOrCreateCounter(`vmauth_user_requests_total` + metricLabels)
ui.requestErrors = ac.ms.GetOrCreateCounter(`vmauth_user_request_errors_total` + metricLabels)
ui.backendRequests = ac.ms.GetOrCreateCounter(`vmauth_user_request_backend_requests_total` + metricLabels)
ui.backendErrors = ac.ms.GetOrCreateCounter(`vmauth_user_request_backend_errors_total` + metricLabels)
ui.requestsDuration = ac.ms.GetOrCreateSummary(`vmauth_user_request_duration_seconds` + metricLabels)
mcr := ui.getMaxConcurrentRequests()
@@ -1060,13 +1182,11 @@ func (up *URLPrefix) sanitizeAndInitialize() error {
}
// Initialize up.bus
bus := make([]*backendURL, len(up.busOriginal))
for i, bu := range up.busOriginal {
bus[i] = &backendURL{
url: bu,
}
bus := newBackendURLs()
for _, bu := range up.busOriginal {
bus.add(bu)
}
up.bus.Store(&bus)
up.bus.Store(bus)
return nil
}

View File

@@ -753,7 +753,7 @@ func TestGetLeastLoadedBackendURL(t *testing.T) {
up.loadBalancingPolicy = "least_loaded"
pbus := up.bus.Load()
bus := *pbus
bus := pbus.bus
fn := func(ns ...int) {
t.Helper()
@@ -825,7 +825,7 @@ func TestBrokenBackend(t *testing.T) {
})
up.loadBalancingPolicy = "least_loaded"
pbus := up.bus.Load()
bus := *pbus
bus := pbus.bus
// explicitly mark one of the backends as broken
bus[1].setBroken()
@@ -848,7 +848,7 @@ func TestDiscoverBackendIPsWithIPV6(t *testing.T) {
up.discoverBackendAddrsIfNeeded()
pbus := up.bus.Load()
bus := *pbus
bus := pbus.bus
if len(bus) != 1 {
t.Fatalf("expected url list to be of size 1; got %d instead", len(bus))
@@ -942,16 +942,14 @@ func mustParseURL(u string) *URLPrefix {
}
func mustParseURLs(us []string) *URLPrefix {
bus := make([]*backendURL, len(us))
bus := newBackendURLs()
urls := make([]*url.URL, len(us))
for i, u := range us {
pu, err := url.Parse(u)
if err != nil {
panic(fmt.Errorf("BUG: cannot parse %q: %w", u, err))
}
bus[i] = &backendURL{
url: pu,
}
bus.add(pu)
urls[i] = pu
}
up := &URLPrefix{}
@@ -960,7 +958,7 @@ func mustParseURLs(us []string) *URLPrefix {
} else {
up.vOriginal = us
}
up.bus.Store(&bus)
up.bus.Store(bus)
up.busOriginal = urls
return up
}

View File

@@ -24,6 +24,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/ioutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
@@ -40,22 +41,38 @@ var (
useProxyProtocol = flagutil.NewArrayBool("httpListenAddr.useProxyProtocol", "Whether to use proxy protocol for connections accepted at the corresponding -httpListenAddr . "+
"See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . "+
"With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing")
maxIdleConnsPerBackend = flag.Int("maxIdleConnsPerBackend", 100, "The maximum number of idle connections vmauth can open per each backend host. "+
"See also -maxConcurrentRequests")
idleConnTimeout = flag.Duration("idleConnTimeout", 50*time.Second, "The timeout for HTTP keep-alive connections to backend services. "+
maxIdleConnsPerBackend = flag.Int("maxIdleConnsPerBackend", 100, "The maximum number of idle connections vmauth can open per each backend host")
idleConnTimeout = flag.Duration("idleConnTimeout", 50*time.Second, "The timeout for HTTP keep-alive connections to backend services. "+
"It is recommended setting this value to values smaller than -http.idleConnTimeout set at backend services")
responseTimeout = flag.Duration("responseTimeout", 5*time.Minute, "The timeout for receiving a response from backend")
maxConcurrentRequests = flag.Int("maxConcurrentRequests", 1000, "The maximum number of concurrent requests vmauth can process. Other requests are rejected with "+
"'429 Too Many Requests' http status code. See also -maxConcurrentPerUserRequests and -maxIdleConnsPerBackend command-line options")
maxConcurrentPerUserRequests = flag.Int("maxConcurrentPerUserRequests", 300, "The maximum number of concurrent requests vmauth can process per each configured user. "+
"Other requests are rejected with '429 Too Many Requests' http status code. See also -maxConcurrentRequests command-line option and max_concurrent_requests option "+
"in per-user config")
responseTimeout = flag.Duration("responseTimeout", 5*time.Minute, "The timeout for receiving a response from backend")
requestBufferSize = flagutil.NewBytes("requestBufferSize", 32*1024, "The size of the buffer for reading the request body before proxying the request to backends. "+
"This allows reducing the comsumption of backend resources when processing requests from clients connected via slow networks. "+
"Set to 0 to disable request buffering. See https://docs.victoriametrics.com/victoriametrics/vmauth/#request-body-buffering")
maxRequestBodySizeToRetry = flagutil.NewBytes("maxRequestBodySizeToRetry", 16*1024, "The maximum request body size to buffer in memory for potential retries at other backends. "+
"Request bodies larger than this size cannot be retried if the backend fails. Zero or negative value disables request body buffering and retries. "+
"See also -requestBufferSize")
maxConcurrentRequests = flag.Int("maxConcurrentRequests", 1000, "The maximum number of concurrent requests vmauth can process simultaneously. "+
"Requests exceeding this limit are queued for up to -maxQueueDuration and then rejected with '429 Too Many Requests' http status code if the limit is still reached. "+
"This protects vmauth itself from overloading and out-of-memory (OOM) failures. See also -maxConcurrentPerUserRequests "+
"and https://docs.victoriametrics.com/victoriametrics/vmauth/#concurrency-limiting")
maxConcurrentPerUserRequests = flag.Int("maxConcurrentPerUserRequests", 100, "The maximum number of concurrent requests vmauth can process per each configured user. "+
"Requests exceeding this limit are queued for up to -maxQueueDuration and then rejected with '429 Too Many Requests' http status code if the limit is still reached. "+
"This provides fairness and isolation between users, preventing a single user from consuming all the available resources. "+
"It works in conjunction with -maxConcurrentRequests, which sets the global limit across all users. "+
"This default can be overridden for individual users via max_concurrent_requests option in per-user config. "+
"See https://docs.victoriametrics.com/victoriametrics/vmauth/#concurrency-limiting")
maxQueueDuration = flag.Duration("maxQueueDuration", 10*time.Second, "The maximum duration to wait before rejecting incoming requests if concurrency limit "+
"specified via -maxConcurrentRequests or -maxConcurrentPerUserRequests command-line flags is reached. "+
"Requests are rejected with '429 Too Many Requests' http status code if the limit is still reached after the -maxQueueDuration duration. "+
"This allows graceful handling of short spikes in concurrent requests. See https://docs.victoriametrics.com/victoriametrics/vmauth/#concurrency-limiting")
reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed via authKey query arg. It overrides -httpAuth.*")
logInvalidAuthTokens = flag.Bool("logInvalidAuthTokens", false, "Whether to log requests with invalid auth tokens. "+
`Such requests are always counted at vmauth_http_request_errors_total{reason="invalid_auth_token"} metric, which is exposed at /metrics page`)
failTimeout = flag.Duration("failTimeout", 3*time.Second, "Sets a delay period for load balancing to skip a malfunctioning backend")
maxRequestBodySizeToRetry = flagutil.NewBytes("maxRequestBodySizeToRetry", 16*1024, "The maximum request body size, which can be cached and re-tried at other backends. "+
"Bigger values may require more memory. Zero or negative value disables caching of request body. This may be useful when proxying data ingestion requests")
failTimeout = flag.Duration("failTimeout", 3*time.Second, "Sets a delay period for load balancing to skip a malfunctioning backend")
backendTLSInsecureSkipVerify = flag.Bool("backend.tlsInsecureSkipVerify", false, "Whether to skip TLS verification when connecting to backends over HTTPS. "+
"See https://docs.victoriametrics.com/victoriametrics/vmauth/#backend-tls-setup")
backendTLSCAFile = flag.String("backend.TLSCAFile", "", "Optional path to TLS root CA file, which is used for TLS verification when connecting to backends over HTTPS. "+
@@ -151,7 +168,6 @@ func requestHandlerWithInternalRoutes(w http.ResponseWriter, r *http.Request) bo
}
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
ats := getAuthTokensFromRequest(r)
if len(ats) == 0 {
// Process requests for unauthorized users
@@ -208,26 +224,124 @@ func processUserRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
ui.requests.Inc()
// Limit the concurrency of requests to backends
concurrencyLimitOnce.Do(concurrencyLimitInit)
select {
case concurrencyLimitCh <- struct{}{}:
if err := ui.beginConcurrencyLimit(); err != nil {
handleConcurrencyLimitError(w, r, err)
<-concurrencyLimitCh
return
}
default:
concurrentRequestsLimitReached.Inc()
err := fmt.Errorf("cannot serve more than -maxConcurrentRequests=%d concurrent requests", cap(concurrencyLimitCh))
ctx, cancel := context.WithTimeout(r.Context(), *maxQueueDuration)
defer cancel()
// Acquire global concurrency limit.
if err := beginConcurrencyLimit(ctx); err != nil {
handleConcurrencyLimitError(w, r, err)
return
}
defer endConcurrencyLimit()
// Set read deadline for reading the initial chunk for the request body.
rc := http.NewResponseController(w)
deadline, ok := ctx.Deadline()
if !ok {
logger.Panicf("BUG: expecting valid deadline for the context")
}
if err := rc.SetReadDeadline(deadline); err != nil {
logger.Panicf("BUG: cannot set read deadline: %s", err)
}
// Read the initial chunk for the request body.
userName := ui.name()
if userName == "" {
userName = "unauthorized"
}
bb, err := bufferRequestBody(ctx, r.Body, userName)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return
}
r.Body = bb
// Disable the read deadline for the rest of the request body.
if err := rc.SetReadDeadline(time.Time{}); err != nil {
logger.Panicf("BUG: cannot reset read deadline: %s", err)
}
// Acquire concurrency limit for the given user.
if err := ui.beginConcurrencyLimit(ctx); err != nil {
handleConcurrencyLimitError(w, r, err)
return
}
defer ui.endConcurrencyLimit()
// Process the request.
processRequest(w, r, ui)
ui.endConcurrencyLimit()
}
func beginConcurrencyLimit(ctx context.Context) error {
concurrencyLimitOnce.Do(concurrencyLimitInit)
select {
case concurrencyLimitCh <- struct{}{}:
return nil
default:
// The -maxConcurrentRequests are executed. Wait until some of the requests are finished,
// so the current request could be executed.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
select {
case concurrencyLimitCh <- struct{}{}:
return nil
case <-ctx.Done():
err := ctx.Err()
if errors.Is(err, context.DeadlineExceeded) {
// The current request couldn't be executed until the request timeout.
concurrentRequestsLimitReached.Inc()
return fmt.Errorf("cannot start executing the request during -maxQueueDuration=%s because -maxConcurrentRequests=%d concurrent requests are executed",
*maxQueueDuration, cap(concurrencyLimitCh))
}
return fmt.Errorf("cannot start executing the request because -maxConcurrentRequests=%d concurrent requests are executed: %w", cap(concurrencyLimitCh), err)
}
}
}
func endConcurrencyLimit() {
<-concurrencyLimitCh
}
func bufferRequestBody(ctx context.Context, r io.ReadCloser, userName string) (io.ReadCloser, error) {
if r == nil {
// This is a GET request with nil reader.
return nil, nil
}
maxBufSize := max(requestBufferSize.IntN(), maxRequestBodySizeToRetry.IntN())
if maxBufSize <= 0 {
return r, nil
}
lr := ioutil.GetLimitedReader(r, int64(maxBufSize))
defer ioutil.PutLimitedReader(lr)
start := time.Now()
buf, err := io.ReadAll(lr)
bufferRequestBodyDuration.UpdateDuration(start)
if err != nil {
if errors.Is(ctx.Err(), context.DeadlineExceeded) {
rejectSlowClientRequests.Inc()
d := time.Since(start)
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("reject request from the user %s because the request body couldn't be read in -maxQueueDuration=%s; read %d bytes in %s",
userName, *maxQueueDuration, len(buf), d.Truncate(time.Second)),
StatusCode: http.StatusBadRequest,
}
}
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot read request body: %w", err),
StatusCode: http.StatusBadRequest,
}
}
bb := newBufferedBody(r, buf, maxBufSize)
return bb, nil
}
func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
u := normalizeURL(r.URL)
up, hc := ui.getURLPrefixAndHeaders(u, r.Host, r.Header)
@@ -253,9 +367,6 @@ func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
isDefault = true
}
rtb := newReadTrackingBody(r.Body, maxRequestBodySizeToRetry.IntN())
r.Body = rtb
maxAttempts := up.getBackendsCount()
for i := 0; i < maxAttempts; i++ {
bu := up.getBackendURL()
@@ -263,18 +374,19 @@ func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
break
}
targetURL := bu.url
// Don't change path and add request_path query param for default route.
if isDefault {
// Don't change path and add request_path query param for default route.
query := targetURL.Query()
query.Set("request_path", u.String())
targetURL.RawQuery = query.Encode()
} else { // Update path for regular routes.
} else {
// Update path for regular routes.
targetURL = mergeURLs(targetURL, u, up.dropSrcPathPrefixParts, up.mergeQueryArgs)
}
wasLocalRetry := false
again:
ok, needLocalRetry := tryProcessingRequest(w, r, targetURL, hc, up.retryStatusCodes, ui)
ok, needLocalRetry := tryProcessingRequest(w, r, targetURL, hc, up.retryStatusCodes, ui, bu)
if needLocalRetry && !wasLocalRetry {
wasLocalRetry = true
goto again
@@ -284,17 +396,20 @@ func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
if ok {
return
}
bu.setBroken()
ui.backendErrors.Inc()
}
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("all the %d backends for the user %q are unavailable", up.getBackendsCount(), ui.name()),
StatusCode: http.StatusBadGateway,
}
httpserver.Errorf(w, r, "%s", err)
ui.backendErrors.Inc()
ui.requestErrors.Inc()
}
func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url.URL, hc HeadersConf, retryStatusCodes []int, ui *UserInfo) (bool, bool) {
func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url.URL, hc HeadersConf, retryStatusCodes []int, ui *UserInfo, bu *backendURL) (bool, bool) {
ui.backendRequests.Inc()
req := sanitizeRequestHeaders(r)
req.URL = targetURL
@@ -308,28 +423,19 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
}
}
rtb, rtbOK := req.Body.(*readTrackingBody)
bb, bbOK := req.Body.(*bufferedBody)
canRetry := !bbOK || bb.canRetry()
res, err := ui.rt.RoundTrip(req)
if ctxErr := r.Context().Err(); ctxErr != nil {
// Override the error returned by the RoundTrip with the context error if it isn't non-nil
// This makes sure the proper logging for canceled and timed out requests - log the real cause of the error
// instead of the random error, which could be returned from RoundTrip because of canceled or timed out request.
err = ctxErr
if errors.Is(r.Context().Err(), context.Canceled) {
// Do not retry canceled requests.
clientCanceledRequests.Inc()
return true, false
}
if err != nil {
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
// Do not retry canceled or timed out requests
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
requestURI := httpserver.GetRequestURI(r)
if errors.Is(err, context.DeadlineExceeded) {
// Timed out request must be counted as errors, since this usually means that the backend is slow.
logger.Warnf("remoteAddr: %s; requestURI: %s; timeout while proxying the response from %s: %s", remoteAddr, requestURI, targetURL, err)
ui.backendErrors.Inc()
}
return false, false
}
if !rtbOK || !rtb.canRetry() {
if !canRetry {
// Request body cannot be re-sent to another backend. Return the error to the client then.
err = &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot proxy the request to %s: %w", targetURL, err),
@@ -337,41 +443,51 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
}
httpserver.Errorf(w, r, "%s", err)
ui.backendErrors.Inc()
ui.requestErrors.Inc()
bu.setBroken()
return true, false
}
if netutil.IsTrivialNetworkError(err) {
// Retry request at the same backend on trivial network errors, such as proxy idle timeout misconfiguration or socket close by OS
if bbOK {
bb.resetReader()
}
return false, true
}
// Retry the request if its body wasn't read yet. This usually means that the backend isn't reachable.
// Retry the request at another backend
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
// NOTE: do not use httpserver.GetRequestURI
// it explicitly reads request body, which may fail retries.
logger.Warnf("remoteAddr: %s; requestURI: %s; retrying the request to %s because of response error: %s", remoteAddr, req.URL, targetURL, err)
requestURI := httpserver.GetRequestURI(r)
logger.Warnf("remoteAddr: %s; requestURI: %s; request to %s failed: %s, retrying the request at another backend", remoteAddr, requestURI, targetURL, err)
if bbOK {
bb.resetReader()
}
return false, false
}
if slices.Contains(retryStatusCodes, res.StatusCode) {
_ = res.Body.Close()
if !rtbOK || !rtb.canRetry() {
if !canRetry {
// If we get an error from the retry_status_codes list, but cannot execute retry,
// we consider such a request an error as well.
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("got response status code=%d from %s, but cannot retry the request on another backend, because the request has been already consumed",
Err: fmt.Errorf("got response status code=%d from %s, but cannot retry the request at another backend, because the request body has been already consumed",
res.StatusCode, targetURL),
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, r, "%s", err)
ui.backendErrors.Inc()
ui.requestErrors.Inc()
return true, false
}
// Retry requests at other backends if it matches retryStatusCodes.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4893
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
// NOTE: do not use httpserver.GetRequestURI
// it explicitly reads request body, which may fail retries.
logger.Warnf("remoteAddr: %s; requestURI: %s; retrying the request to %s because response status code=%d belongs to retry_status_codes=%d",
remoteAddr, req.URL, targetURL, res.StatusCode, retryStatusCodes)
requestURI := httpserver.GetRequestURI(r)
logger.Warnf("remoteAddr: %s; requestURI: %s; request to %s failed, retrying the request at another backend because response status code=%d belongs to retry_status_codes=%d",
remoteAddr, requestURI, targetURL, res.StatusCode, retryStatusCodes)
if bbOK {
bb.resetReader()
}
return false, false
}
removeHopHeaders(res.Header)
@@ -381,11 +497,18 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
err = copyStreamToClient(w, res.Body)
_ = res.Body.Close()
if err != nil && !netutil.IsTrivialNetworkError(err) && !errors.Is(err, context.Canceled) {
if errors.Is(r.Context().Err(), context.Canceled) {
// Do not retry canceled requests.
clientCanceledRequests.Inc()
return true, false
}
if err != nil && !netutil.IsTrivialNetworkError(err) {
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
requestURI := httpserver.GetRequestURI(r)
logger.Warnf("remoteAddr: %s; requestURI: %s; error when proxying response body from %s: %s", remoteAddr, requestURI, targetURL, err)
ui.requestErrors.Inc()
return true, false
}
return true, false
@@ -513,6 +636,10 @@ var (
configReloadRequests = metrics.NewCounter(`vmauth_http_requests_total{path="/-/reload"}`)
invalidAuthTokenRequests = metrics.NewCounter(`vmauth_http_request_errors_total{reason="invalid_auth_token"}`)
missingRouteRequests = metrics.NewCounter(`vmauth_http_request_errors_total{reason="missing_route"}`)
clientCanceledRequests = metrics.NewCounter(`vmauth_http_request_errors_total{reason="client_canceled"}`)
rejectSlowClientRequests = metrics.NewCounter(`vmauth_http_request_errors_total{reason="reject_slow_client"}`)
bufferRequestBodyDuration = metrics.NewSummary(`vmauth_buffer_request_body_duration_seconds`)
)
func newRoundTripper(caFileOpt, certFileOpt, keyFileOpt, serverNameOpt string, insecureSkipVerifyP *bool) (http.RoundTripper, error) {
@@ -596,6 +723,13 @@ func handleMissingAuthorizationError(w http.ResponseWriter) {
}
func handleConcurrencyLimitError(w http.ResponseWriter, r *http.Request, err error) {
if errors.Is(r.Context().Err(), context.Canceled) {
// Do not return any response for the request canceled by the client,
// since the connection to the client is already closed.
clientCanceledRequests.Inc()
return
}
w.Header().Add("Retry-After", "10")
err = &httpserver.ErrorWithStatusCode{
Err: err,
@@ -604,122 +738,78 @@ func handleConcurrencyLimitError(w http.ResponseWriter, r *http.Request, err err
httpserver.Errorf(w, r, "%s", err)
}
// readTrackingBody must be obtained via getReadTrackingBody()
type readTrackingBody struct {
// maxBodySize is the maximum body size to cache in buf.
// bufferedBody serves two purposes:
// 1. Enables request retries when the body size does not exceed maxBodySize
// by fully buffering the body in memory.
// 2. Prevents slow clients from reducing effective server capacity by
// buffering the request body before acquiring a per-user concurrency slot.
//
// See bufferRequestBody for details on how bufferedBody is used.
type bufferedBody struct {
// r contains reader for reading the data after buf is read.
//
// Bigger bodies cannot be retried.
maxBodySize int
// r contains reader for initial data reading
// r is nil if buf contains all the data.
r io.ReadCloser
// buf is a buffer for data read from r. Buf size is limited by maxBodySize.
// If more than maxBodySize is read from r, then cannotRetry is set to true.
// buf contains the initial buffer read from r.
buf []byte
// readBuf points to the cached data at buf, which must be read in the next call to Read().
readBuf []byte
// bufOffset is the offset at buf for already read bytes.
bufOffset int
// cannotRetry is set to true when more than maxBodySize bytes are read from r.
// In this case the read data cannot fit buf, so it cannot be re-read from buf.
// cannotRetry is set to true after Close() call on non-nil r.
cannotRetry bool
// bufComplete is set to true when buf contains complete request body read from r.
bufComplete bool
}
func newReadTrackingBody(r io.ReadCloser, maxBodySize int) *readTrackingBody {
// do not use sync.Pool there
// since http.RoundTrip may still use request body after return
// See this issue for details https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8051
rtb := &readTrackingBody{}
if maxBodySize < 0 {
maxBodySize = 0
func newBufferedBody(r io.ReadCloser, buf []byte, maxBufSize int) *bufferedBody {
// Do not use sync.Pool here, since http.RoundTrip may still use request body after return.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8051
if len(buf) < maxBufSize {
// Read the full request body into buf.
r = nil
}
rtb.maxBodySize = maxBodySize
if r == nil {
// This is GET request without request body
r = (*zeroReader)(nil)
return &bufferedBody{
r: r,
buf: buf,
}
rtb.r = r
return rtb
}
type zeroReader struct{}
func (r *zeroReader) Read(_ []byte) (int, error) {
return 0, io.EOF
}
func (r *zeroReader) Close() error {
return nil
}
// Read implements io.Reader interface.
func (rtb *readTrackingBody) Read(p []byte) (int, error) {
if len(rtb.readBuf) > 0 {
n := copy(p, rtb.readBuf)
rtb.readBuf = rtb.readBuf[n:]
func (bb *bufferedBody) Read(p []byte) (int, error) {
if bb.cannotRetry {
return 0, fmt.Errorf("cannot read already closed body")
}
if bb.bufOffset < len(bb.buf) {
n := copy(p, bb.buf[bb.bufOffset:])
bb.bufOffset += n
return n, nil
}
if rtb.r == nil {
if rtb.bufComplete {
return 0, io.EOF
}
return 0, fmt.Errorf("cannot read client request body after closing client reader")
if bb.r == nil {
return 0, io.EOF
}
n, err := rtb.r.Read(p)
if rtb.cannotRetry {
return n, err
}
if len(rtb.buf)+n > rtb.maxBodySize {
rtb.cannotRetry = true
return n, err
}
rtb.buf = append(rtb.buf, p[:n]...)
if err == io.EOF {
rtb.bufComplete = true
}
return n, err
return bb.r.Read(p)
}
func (rtb *readTrackingBody) canRetry() bool {
if rtb.cannotRetry {
return false
}
if rtb.bufComplete {
return true
}
return rtb.r != nil
func (bb *bufferedBody) canRetry() bool {
return bb.r == nil
}
// Close implements io.Closer interface.
func (rtb *readTrackingBody) Close() error {
if !rtb.cannotRetry {
rtb.readBuf = rtb.buf
} else {
rtb.readBuf = nil
func (bb *bufferedBody) Close() error {
bb.resetReader()
if bb.r != nil {
bb.cannotRetry = true
return bb.r.Close()
}
// Close rtb.r only if the request body is completely read or if it is too big.
// http.Roundtrip performs body.Close call even without any Read calls,
// so this hack allows us to reuse request body.
if rtb.bufComplete || rtb.cannotRetry {
if rtb.r == nil {
return nil
}
err := rtb.r.Close()
rtb.r = nil
return err
}
return nil
}
func (bb *bufferedBody) resetReader() {
bb.bufOffset = 0
}
func debugInfo(u *url.URL, r *http.Request) string {
s := &strings.Builder{}
fmt.Fprintf(s, " (host: %q; ", r.Host)

View File

@@ -2,6 +2,7 @@ package main
import (
"bytes"
"context"
"fmt"
"io"
"net"
@@ -10,6 +11,7 @@ import (
"strings"
"sync/atomic"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
)
@@ -546,28 +548,300 @@ func (w *fakeResponseWriter) WriteHeader(statusCode int) {
}
}
func TestReadTrackingBody_RetrySuccess(t *testing.T) {
// This is needed for net/http.ResponseController
func (w *fakeResponseWriter) SetReadDeadline(deadline time.Time) error {
return nil
}
func TestBufferRequestBody_Success(t *testing.T) {
defaultRequestBufferSize := requestBufferSize.String()
defer func() {
if err := requestBufferSize.Set(defaultRequestBufferSize); err != nil {
t.Fatalf("cannot reset requestBufferSize: %s", err)
}
}()
defaultMaxRequestBodySizeToRetry := maxRequestBodySizeToRetry.String()
defer func() {
if err := maxRequestBodySizeToRetry.Set(defaultMaxRequestBodySizeToRetry); err != nil {
t.Fatalf("cannot reset maxRequestBodySizeToRetry: %s", err)
}
}()
f := func(body *bytes.Buffer, requestBufferSizeFlag, maxRequestBodySizeToRetryFlag string) {
t.Helper()
expectedResponse := "statusCode=200"
if body.Len() > 0 {
expectedResponse += "\n" + body.String()
}
if err := requestBufferSize.Set(requestBufferSizeFlag); err != nil {
t.Fatalf("cannot set requestBufferSize: %s", err)
}
if err := maxRequestBodySizeToRetry.Set(maxRequestBodySizeToRetryFlag); err != nil {
t.Fatalf("cannot set maxRequestBodySizeToRetry: %s", err)
}
var backendCalled bool
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
backendCalled = true
b, err := io.ReadAll(r.Body)
if err != nil {
http.Error(w, fmt.Sprintf("cannot read body: %s", err), http.StatusBadRequest)
return
}
if _, err := w.Write(b); err != nil {
http.Error(w, fmt.Sprintf("cannot write body: %s", err), http.StatusInternalServerError)
return
}
}))
defer ts.Close()
// regular url_prefix
cfgStr := strings.ReplaceAll(`
unauthorized_user:
url_prefix: {BACKEND}/foo`, "{BACKEND}", ts.URL)
cfgOrigP := authConfigData.Load()
if _, err := reloadAuthConfigData([]byte(cfgStr)); err != nil {
t.Fatalf("cannot load config data: %s", err)
}
defer func() {
cfgOrig := []byte("unauthorized_user:\n url_prefix: http://foo/bar")
if cfgOrigP != nil {
cfgOrig = *cfgOrigP
}
_, err := reloadAuthConfigData(cfgOrig)
if err != nil {
t.Fatalf("cannot load the original config: %s", err)
}
}()
r, err := http.NewRequest(http.MethodPost, `http://some-host.com`, body)
if err != nil {
t.Fatalf("cannot initialize http request: %s", err)
}
w := &fakeResponseWriter{}
if !requestHandlerWithInternalRoutes(w, r) {
t.Fatalf("unexpected false is returned from requestHandler")
}
response := w.getResponse()
response = strings.ReplaceAll(response, "\r\n", "\n")
response = strings.TrimSpace(response)
if response != expectedResponse {
t.Fatalf("unexpected response\ngot\n%s\nwant\n%s", response, expectedResponse)
}
if !backendCalled {
t.Fatalf("backend is not called")
}
}
// no body, no buffering, no retry
f(bytes.NewBuffer(nil), "0", "0")
// no body, buffering on, no retry
f(bytes.NewBuffer(nil), "100", "0")
// no body, no buffering, retry on
f(bytes.NewBuffer(nil), "0", "100")
// no body, buffering on, retry on
f(bytes.NewBuffer(nil), "100", "100")
// body smaller than buffer, retry max on
f(bytes.NewBufferString(strings.Repeat("abcdf", 100)), "101", "101")
// body smaller than buffer
f(bytes.NewBufferString(strings.Repeat("abcdf", 100)), "501", "0")
// body same size as buffer
f(bytes.NewBufferString(strings.Repeat("abcdf", 100)), "500", "0")
// body bigger than a buffer
f(bytes.NewBufferString(strings.Repeat("abcdf", 100)), "499", "0")
// body bigger than tmpBuf 8KiB used in buffering
f(bytes.NewBufferString(strings.Repeat("a", 32*1024)), "16384", "")
f(bytes.NewBufferString(strings.Repeat("a", 32*1024)), "16385", "")
f(bytes.NewBufferString(strings.Repeat("a", 32*1024)), "16383", "")
}
func TestBufferRequestBody_Failure(t *testing.T) {
defaultRequestBufferSize := requestBufferSize.String()
defer func() {
if err := requestBufferSize.Set(defaultRequestBufferSize); err != nil {
t.Fatalf("cannot reset requestBufferSize: %s", err)
}
}()
defaultMaxRequestBodySizeToRetry := maxRequestBodySizeToRetry.String()
defer func() {
if err := maxRequestBodySizeToRetry.Set(defaultMaxRequestBodySizeToRetry); err != nil {
t.Fatalf("cannot reset maxRequestBodySizeToRetry: %s", err)
}
}()
defaultMaxQueueDuration := *maxQueueDuration
defer func() {
*maxQueueDuration = defaultMaxQueueDuration
}()
f := func(body *mockBody, expectedResponse string) {
t.Helper()
if err := maxRequestBodySizeToRetry.Set("0"); err != nil {
t.Fatalf("cannot set maxRequestBodySizeToRetry: %s", err)
}
if err := requestBufferSize.Set("2048"); err != nil {
t.Fatalf("cannot set requestBufferSize: %s", err)
}
*maxQueueDuration = 100 * time.Millisecond
var backendCalled bool
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
backendCalled = true
b, err := io.ReadAll(r.Body)
if err != nil {
http.Error(w, fmt.Sprintf("cannot read body: %s", err), http.StatusBadRequest)
return
}
if _, err := w.Write(b); err != nil {
http.Error(w, fmt.Sprintf("cannot write body: %s", err), http.StatusInternalServerError)
return
}
}))
defer ts.Close()
// regular url_prefix
cfgStr := strings.ReplaceAll(`
unauthorized_user:
url_prefix: {BACKEND}/foo`, "{BACKEND}", ts.URL)
cfgOrigP := authConfigData.Load()
if _, err := reloadAuthConfigData([]byte(cfgStr)); err != nil {
t.Fatalf("cannot load config data: %s", err)
}
defer func() {
cfgOrig := []byte("unauthorized_user:\n url_prefix: http://foo/bar")
if cfgOrigP != nil {
cfgOrig = *cfgOrigP
}
_, err := reloadAuthConfigData(cfgOrig)
if err != nil {
t.Fatalf("cannot load the original config: %s", err)
}
}()
r, err := http.NewRequest(http.MethodPost, `http://some-host.com`, body)
if err != nil {
t.Fatalf("cannot initialize http request: %s", err)
}
w := &fakeResponseWriter{}
if !requestHandlerWithInternalRoutes(w, r) {
t.Fatalf("unexpected false is returned from requestHandler")
}
response := w.getResponse()
response = strings.ReplaceAll(response, "\r\n", "\n")
response = strings.TrimSpace(response)
if response != expectedResponse {
t.Fatalf("unexpected response\ngot\n%s\nwant\n%s", response, expectedResponse)
}
if backendCalled {
t.Fatalf("backend is called")
}
}
// an error at the beginning of reading
f(&mockBody{err: fmt.Errorf("an error")}, `statusCode=400
cannot read request body: an error`)
// an error after reading 1024 bytes, buffer size is 2048 bytes
f(&mockBody{head: make([]byte, 1024), err: fmt.Errorf("an error")}, `statusCode=400
cannot read request body: an error`)
}
type mockBody struct {
head []byte
err error
tail []byte
}
func (r *mockBody) Read(p []byte) (n int, err error) {
if len(r.head) > 0 {
n = copy(p, r.head)
r.head = r.head[n:]
return n, nil
}
if r.err != nil {
return 0, r.err
}
if len(r.tail) > 0 {
n = copy(p, r.tail)
r.tail = r.tail[n:]
return n, nil
}
return 0, io.EOF
}
func TestBufferedBody_RetrySuccess(t *testing.T) {
f := func(s string, maxBodySize int) {
t.Helper()
rtb := newReadTrackingBody(io.NopCloser(bytes.NewBufferString(s)), maxBodySize)
defaultRequestBufferSize := requestBufferSize.String()
defer func() {
if err := requestBufferSize.Set(defaultRequestBufferSize); err != nil {
t.Fatalf("cannot reset requestBufferSize: %s", err)
}
}()
if err := requestBufferSize.Set(fmt.Sprintf("%d", maxBodySize)); err != nil {
t.Fatalf("cannot set requestBufferSize: %s", err)
}
if !rtb.canRetry() {
defaultMaxRequestBodySizeToRetry := maxRequestBodySizeToRetry.String()
defer func() {
if err := maxRequestBodySizeToRetry.Set(defaultMaxRequestBodySizeToRetry); err != nil {
t.Fatalf("cannot reset maxRequestBodySizeToRetry: %s", err)
}
}()
if err := maxRequestBodySizeToRetry.Set("0"); err != nil {
t.Fatalf("cannot set maxRequestBodySizeToRetry: %s", err)
}
ctx := context.Background()
rb, err := bufferRequestBody(ctx, io.NopCloser(bytes.NewBufferString(s)), "foo")
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
bb, ok := rb.(*bufferedBody)
canRetry := !ok || bb.canRetry()
if !canRetry {
t.Fatalf("canRetry() must return true before reading anything")
}
for i := 0; i < 5; i++ {
data, err := io.ReadAll(rtb)
data, err := io.ReadAll(rb)
if err != nil {
t.Fatalf("unexpected error when reading all the data at iteration %d: %s", i, err)
}
if string(data) != s {
t.Fatalf("unexpected data read at iteration %d\ngot\n%s\nwant\n%s", i, data, s)
}
if err := rtb.Close(); err != nil {
t.Fatalf("unexpected error when closing readTrackingBody at iteration %d: %s", i, err)
}
if !rtb.canRetry() {
t.Fatalf("canRetry() must return true at iteration %d", i)
if err := rb.Close(); err != nil {
t.Fatalf("unexpected error when closing bufferedBody at iteration %d: %s", i, err)
}
}
}
@@ -577,19 +851,48 @@ func TestReadTrackingBody_RetrySuccess(t *testing.T) {
f("", 100)
f("foo", 100)
f("foobar", 100)
f(newTestString(1000), 1000)
f(newTestString(1000), 1001)
}
func TestReadTrackingBody_RetrySuccessPartialRead(t *testing.T) {
func TestBufferedBody_RetrySuccessPartialRead(t *testing.T) {
f := func(s string, maxBodySize int) {
t.Helper()
// Check the case with partial read
rtb := newReadTrackingBody(io.NopCloser(bytes.NewBufferString(s)), maxBodySize)
defaultRequestBufferSize := requestBufferSize.String()
defer func() {
if err := requestBufferSize.Set(defaultRequestBufferSize); err != nil {
t.Fatalf("cannot reset requestBufferSize: %s", err)
}
}()
if err := requestBufferSize.Set(fmt.Sprintf("%d", maxBodySize)); err != nil {
t.Fatalf("cannot set requestBufferSize: %s", err)
}
defaultMaxRequestBodySizeToRetry := maxRequestBodySizeToRetry.String()
defer func() {
if err := maxRequestBodySizeToRetry.Set(defaultMaxRequestBodySizeToRetry); err != nil {
t.Fatalf("cannot reset maxRequestBodySizeToRetry: %s", err)
}
}()
if err := maxRequestBodySizeToRetry.Set("0"); err != nil {
t.Fatalf("cannot set maxRequestBodySizeToRetry: %s", err)
}
ctx := context.Background()
rb, err := bufferRequestBody(ctx, io.NopCloser(bytes.NewBufferString(s)), "foo")
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
bb, ok := rb.(*bufferedBody)
canRetry := !ok || bb.canRetry()
if !canRetry {
t.Fatalf("canRetry must return true")
}
for i := 0; i < len(s); i++ {
buf := make([]byte, i)
n, err := io.ReadFull(rtb, buf)
n, err := io.ReadFull(rb, buf)
if err != nil {
t.Fatalf("unexpected error when reading %d bytes: %s", i, err)
}
@@ -599,26 +902,20 @@ func TestReadTrackingBody_RetrySuccessPartialRead(t *testing.T) {
if string(buf) != s[:i] {
t.Fatalf("unexpected data read with the length %d\ngot\n%s\nwant\n%s", i, buf, s[:i])
}
if err := rtb.Close(); err != nil {
if err := rb.Close(); err != nil {
t.Fatalf("unexpected error when closing reader after reading %d bytes", i)
}
if !rtb.canRetry() {
t.Fatalf("canRetry() must return true after closing the reader after reading %d bytes", i)
}
}
data, err := io.ReadAll(rtb)
data, err := io.ReadAll(rb)
if err != nil {
t.Fatalf("unexpected error when reading all the data: %s", err)
}
if string(data) != s {
t.Fatalf("unexpected data read\ngot\n%s\nwant\n%s", data, s)
}
if err := rtb.Close(); err != nil {
t.Fatalf("unexpected error when closing readTrackingBody: %s", err)
}
if !rtb.canRetry() {
t.Fatalf("canRetry() must return true after closing the reader after reading all the input")
if err := rb.Close(); err != nil {
t.Fatalf("unexpected error when closing bufferedBody: %s", err)
}
}
@@ -627,30 +924,53 @@ func TestReadTrackingBody_RetrySuccessPartialRead(t *testing.T) {
f("", 100)
f("foo", 100)
f("foobar", 100)
f(newTestString(1000), 1000)
f(newTestString(1000), 1001)
}
func TestReadTrackingBody_RetryFailureTooBigBody(t *testing.T) {
func TestBufferedBody_RetryFailureTooBigBody(t *testing.T) {
f := func(s string, maxBodySize int) {
t.Helper()
rtb := newReadTrackingBody(io.NopCloser(bytes.NewBufferString(s)), maxBodySize)
defaultRequestBufferSize := requestBufferSize.String()
defer func() {
if err := requestBufferSize.Set(defaultRequestBufferSize); err != nil {
t.Fatalf("cannot reset requestBufferSize: %s", err)
}
}()
if err := requestBufferSize.Set("0"); err != nil {
t.Fatalf("cannot set requestBufferSize: %s", err)
}
if !rtb.canRetry() {
t.Fatalf("canRetry() must return true before reading anything")
defaultMaxRequestBodySizeToRetry := maxRequestBodySizeToRetry.String()
defer func() {
if err := maxRequestBodySizeToRetry.Set(defaultMaxRequestBodySizeToRetry); err != nil {
t.Fatalf("cannot reset maxRequestBodySizeToRetry: %s", err)
}
}()
if err := maxRequestBodySizeToRetry.Set(fmt.Sprintf("%d", maxBodySize)); err != nil {
t.Fatalf("cannot set maxRequestBodySizeToRetry: %s", err)
}
ctx := context.Background()
rb, err := bufferRequestBody(ctx, io.NopCloser(bytes.NewBufferString(s)), "foo")
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
bb, ok := rb.(*bufferedBody)
canRetry := !ok || bb.canRetry()
if canRetry {
t.Fatalf("canRetry() must return false because of too big request body")
}
buf := make([]byte, 1)
n, err := io.ReadFull(rtb, buf)
n, err := io.ReadFull(rb, buf)
if err != nil {
t.Fatalf("unexpected error when reading a single byte: %s", err)
}
if n != 1 {
t.Fatalf("unexpected number of bytes read; got %d; want 1", n)
}
if !rtb.canRetry() {
t.Fatalf("canRetry() must return true after reading one byte")
}
data, err := io.ReadAll(rtb)
data, err := io.ReadAll(rb)
if err != nil {
t.Fatalf("unexpected error when reading all the data: %s", err)
}
@@ -658,14 +978,11 @@ func TestReadTrackingBody_RetryFailureTooBigBody(t *testing.T) {
if dataRead != s {
t.Fatalf("unexpected data read\ngot\n%s\nwant\n%s", dataRead, s)
}
if err := rtb.Close(); err != nil {
t.Fatalf("unexpected error when closing readTrackingBody: %s", err)
}
if rtb.canRetry() {
t.Fatalf("canRetry() must return false after closing the reader")
if err := rb.Close(); err != nil {
t.Fatalf("unexpected error when closing bufferedBody: %s", err)
}
data, err = io.ReadAll(rtb)
data, err = io.ReadAll(rb)
if err == nil {
t.Fatalf("expecting non-nil error")
}
@@ -679,35 +996,48 @@ func TestReadTrackingBody_RetryFailureTooBigBody(t *testing.T) {
f(newTestString(2*maxBodySize), maxBodySize)
}
func TestReadTrackingBody_RetryFailureZeroOrNegativeMaxBodySize(t *testing.T) {
func TestBufferedBody_RetryFailureZeroOrNegativeMaxBodySize(t *testing.T) {
f := func(s string, maxBodySize int) {
t.Helper()
rtb := newReadTrackingBody(io.NopCloser(bytes.NewBufferString(s)), maxBodySize)
defaultRequestBufferSize := requestBufferSize.String()
defer func() {
if err := requestBufferSize.Set(defaultRequestBufferSize); err != nil {
t.Fatalf("cannot reset requestBufferSize: %s", err)
}
}()
if err := requestBufferSize.Set(fmt.Sprintf("%d", maxBodySize)); err != nil {
t.Fatalf("cannot set requestBufferSize: %s", err)
}
if !rtb.canRetry() {
ctx := context.Background()
rb, err := bufferRequestBody(ctx, io.NopCloser(bytes.NewBufferString(s)), "foo")
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
bb, ok := rb.(*bufferedBody)
canRetry := !ok || bb.canRetry()
if !canRetry {
t.Fatalf("canRetry() must return true before reading anything")
}
data, err := io.ReadAll(rtb)
data, err := io.ReadAll(rb)
if err != nil {
t.Fatalf("unexpected error when reading all the data: %s", err)
}
if string(data) != s {
t.Fatalf("unexpected data read\ngot\n%s\nwant\n%s", data, s)
}
if err := rtb.Close(); err != nil {
t.Fatalf("unexpected error when closing readTrackingBody: %s", err)
if err := rb.Close(); err != nil {
t.Fatalf("unexpected error when closing bufferedBody: %s", err)
}
if rtb.canRetry() {
t.Fatalf("canRetry() must return false after closing the reader")
data, err = io.ReadAll(rb)
if err != nil {
t.Fatalf("unexpected error in io.ReadAll: %s", err)
}
data, err = io.ReadAll(rtb)
if err == nil {
t.Fatalf("expecting non-nil error")
}
if len(data) != 0 {
t.Fatalf("unexpected non-empty data read: %q", data)
if string(data) != s {
t.Fatalf("unexpected data read\ngot\n%s\nwant\n%s", data, s)
}
}

View File

@@ -212,7 +212,7 @@ func newSrcFS() (*fslocal.FS, error) {
}
func newDstFS(ctx context.Context) (common.RemoteFS, error) {
fs, err := actions.NewRemoteFS(ctx, *dst)
fs, err := actions.NewRemoteFS(ctx, *dst, nil)
if err != nil {
return nil, fmt.Errorf("cannot parse `-dst`=%q: %w", *dst, err)
}
@@ -255,7 +255,7 @@ func newOriginFS(ctx context.Context) (common.OriginFS, error) {
if len(*origin) == 0 {
return &fsnil.FS{}, nil
}
fs, err := actions.NewRemoteFS(ctx, *origin)
fs, err := actions.NewRemoteFS(ctx, *origin, nil)
if err != nil {
return nil, fmt.Errorf("cannot parse `-origin`=%q: %w", *origin, err)
}
@@ -266,7 +266,7 @@ func newRemoteOriginFS(ctx context.Context) (common.RemoteFS, error) {
if len(*origin) == 0 {
return nil, fmt.Errorf("-origin cannot be empty when -snapshotName and -snapshot.createURL aren't set")
}
fs, err := actions.NewRemoteFS(ctx, *origin)
fs, err := actions.NewRemoteFS(ctx, *origin, nil)
if err != nil {
return nil, fmt.Errorf("cannot parse `-origin`=%q: %w", *origin, err)
}

View File

@@ -123,32 +123,32 @@ var (
Name: vmExtraLabel,
Value: nil,
Usage: "Extra labels, that will be added to imported timeseries. In case of collision, label value defined by flag" +
"will have priority. Flag can be set multiple times, to add few additional labels.",
" will have priority. Flag can be set multiple times, to add few additional labels.",
},
&cli.Int64Flag{
Name: vmRateLimit,
Usage: "Optional data transfer rate limit in bytes per second.\n" +
"By default, the rate limit is disabled. It can be useful for limiting load on configured via '--vmAddr' destination.",
"By default, the rate limit is disabled. It can be useful for limiting load on configured via '--vm-addr' destination.",
},
&cli.StringFlag{
Name: vmCertFile,
Usage: "Optional path to client-side TLS certificate file to use when connecting to '--vmAddr'",
Usage: "Optional path to client-side TLS certificate file to use when connecting to '--vm-addr'",
},
&cli.StringFlag{
Name: vmKeyFile,
Usage: "Optional path to client-side TLS key to use when connecting to '--vmAddr'",
Usage: "Optional path to client-side TLS key to use when connecting to '--vm-addr'",
},
&cli.StringFlag{
Name: vmCAFile,
Usage: "Optional path to TLS CA file to use for verifying connections to '--vmAddr'. By default, system CA is used",
Usage: "Optional path to TLS CA file to use for verifying connections to '--vm-addr'. By default, system CA is used",
},
&cli.StringFlag{
Name: vmServerName,
Usage: "Optional TLS server name to use for connections to '--vmAddr'. By default, the server name from '--vmAddr' is used",
Usage: "Optional TLS server name to use for connections to '--vm-addr'. By default, the server name from '--vm-addr' is used",
},
&cli.BoolFlag{
Name: vmInsecureSkipVerify,
Usage: "Whether to skip tls verification when connecting to '--vmAddr'",
Usage: "Whether to skip tls verification when connecting to '--vm-addr'",
Value: false,
},
&cli.IntFlag{
@@ -468,7 +468,7 @@ var (
Name: vmNativeFilterMatch,
Usage: "Time series selector to match series for export. For example, select {instance!=\"localhost\"} will " +
"match all series with \"instance\" label different to \"localhost\".\n" +
" See more details here https://github.com/VictoriaMetrics/VictoriaMetrics#how-to-export-data-in-native-format",
" See more details here https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-export-data-in-native-format",
Value: `{__name__!=""}`,
},
&cli.StringFlag{
@@ -598,7 +598,7 @@ var (
Name: vmExtraLabel,
Value: nil,
Usage: "Extra labels, that will be added to imported timeseries. In case of collision, label value defined by flag" +
"will have priority. Flag can be set multiple times, to add few additional labels.",
" will have priority. Flag can be set multiple times, to add few additional labels.",
},
&cli.Int64Flag{
Name: vmRateLimit,
@@ -625,8 +625,8 @@ var (
&cli.BoolFlag{
Name: vmNativeDisableBinaryProtocol,
Usage: "Whether to use https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-export-data-in-json-line-format " +
"instead of https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-export-data-in-native-format API." +
"Binary export/import API protocol implies less network and resource usage, as it transfers compressed binary data blocks." +
"instead of https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-export-data-in-native-format API. " +
"Binary export/import API protocol implies less network and resource usage, as it transfers compressed binary data blocks. " +
"Non-binary export/import API is less efficient, but supports deduplication if it is configured on vm-native-src-addr side.",
Value: false,
},

View File

@@ -1,6 +1,7 @@
package main
import (
"context"
"fmt"
"io"
"log"
@@ -37,7 +38,7 @@ func newInfluxProcessor(ic *influx.Client, im *vm.Importer, cc int, separator st
}
}
func (ip *influxProcessor) run() error {
func (ip *influxProcessor) run(ctx context.Context) error {
series, err := ip.ic.Explore()
if err != nil {
return fmt.Errorf("explore query failed: %s", err)
@@ -47,7 +48,7 @@ func (ip *influxProcessor) run() error {
}
question := fmt.Sprintf("Found %d timeseries to import. Continue?", len(series))
if !prompt(question) {
if !prompt(ctx, question) {
return nil
}
@@ -62,10 +63,8 @@ func (ip *influxProcessor) run() error {
ip.im.ResetStats()
var wg sync.WaitGroup
wg.Add(ip.cc)
for i := 0; i < ip.cc; i++ {
go func() {
defer wg.Done()
for range ip.cc {
wg.Go(func() {
for s := range seriesCh {
if err := ip.do(s); err != nil {
errCh <- fmt.Errorf("request failed for %q.%q: %s", s.Measurement, s.Field, err)
@@ -73,7 +72,7 @@ func (ip *influxProcessor) run() error {
}
bar.Increment()
}
}()
})
}
// any error breaks the import

View File

@@ -103,7 +103,7 @@ func main() {
}
otsdbProcessor := newOtsdbProcessor(otsdbClient, importer, c.Int(otsdbConcurrency), c.Bool(globalVerbose))
return otsdbProcessor.run()
return otsdbProcessor.run(ctx)
},
},
{
@@ -164,7 +164,7 @@ func main() {
c.Bool(influxSkipDatabaseLabel),
c.Bool(influxPrometheusMode),
c.Bool(globalVerbose))
return processor.run()
return processor.run(ctx)
},
},
{
@@ -279,7 +279,7 @@ func main() {
cc: c.Int(promConcurrency),
isVerbose: c.Bool(globalVerbose),
}
return pp.run()
return pp.run(ctx)
},
},
{

View File

@@ -1,6 +1,7 @@
package main
import (
"context"
"fmt"
"log"
"sync"
@@ -37,7 +38,7 @@ func newOtsdbProcessor(oc *opentsdb.Client, im *vm.Importer, otsdbcc int, verbos
}
}
func (op *otsdbProcessor) run() error {
func (op *otsdbProcessor) run(ctx context.Context) error {
log.Println("Loading all metrics from OpenTSDB for filters: ", op.oc.Filters)
var metrics []string
for _, filter := range op.oc.Filters {
@@ -53,7 +54,7 @@ func (op *otsdbProcessor) run() error {
}
question := fmt.Sprintf("Found %d metrics to import. Continue?", len(metrics))
if !prompt(question) {
if !prompt(ctx, question) {
return nil
}
op.im.ResetStats()
@@ -88,10 +89,8 @@ func (op *otsdbProcessor) run() error {
bar.Finish()
}(bar)
var wg sync.WaitGroup
wg.Add(op.otsdbcc)
for i := 0; i < op.otsdbcc; i++ {
go func() {
defer wg.Done()
for range op.otsdbcc {
wg.Go(func() {
for s := range seriesCh {
if err := op.do(s); err != nil {
errCh <- fmt.Errorf("couldn't retrieve series for %s : %s", metric, err)
@@ -99,7 +98,7 @@ func (op *otsdbProcessor) run() error {
}
bar.Increment()
}
}()
})
}
/*
Loop through all series for this metric, processing all retentions and time ranges

View File

@@ -1,10 +1,13 @@
package main
import (
"context"
"fmt"
"log"
"strings"
"sync"
"github.com/prometheus/prometheus/model/labels"
"github.com/prometheus/prometheus/tsdb"
"github.com/prometheus/prometheus/tsdb/chunkenc"
@@ -30,7 +33,7 @@ type prometheusProcessor struct {
isVerbose bool
}
func (pp *prometheusProcessor) run() error {
func (pp *prometheusProcessor) run(ctx context.Context) error {
blocks, err := pp.cl.Explore()
if err != nil {
return fmt.Errorf("explore failed: %s", err)
@@ -39,7 +42,7 @@ func (pp *prometheusProcessor) run() error {
return fmt.Errorf("found no blocks to import")
}
question := fmt.Sprintf("Found %d blocks to import. Continue?", len(blocks))
if !prompt(question) {
if !prompt(ctx, question) {
return nil
}
@@ -60,19 +63,19 @@ func (pp *prometheusProcessor) do(b tsdb.BlockReader) error {
var it chunkenc.Iterator
for ss.Next() {
var name string
var labels []vm.LabelPair
var labelPairs []vm.LabelPair
series := ss.At()
for _, label := range series.Labels() {
series.Labels().Range(func(label labels.Label) {
if label.Name == "__name__" {
name = label.Value
continue
return
}
labels = append(labels, vm.LabelPair{
Name: label.Name,
Value: label.Value,
labelPairs = append(labelPairs, vm.LabelPair{
Name: strings.Clone(label.Name),
Value: strings.Clone(label.Value),
})
}
})
if name == "" {
return fmt.Errorf("failed to find `__name__` label in labelset for block %v", b.Meta().ULID)
}
@@ -98,7 +101,7 @@ func (pp *prometheusProcessor) do(b tsdb.BlockReader) error {
}
ts := vm.TimeSeries{
Name: name,
LabelPairs: labels,
LabelPairs: labelPairs,
Timestamps: timestamps,
Values: values,
}
@@ -121,10 +124,8 @@ func (pp *prometheusProcessor) processBlocks(blocks []tsdb.BlockReader) error {
pp.im.ResetStats()
var wg sync.WaitGroup
wg.Add(pp.cc)
for i := 0; i < pp.cc; i++ {
go func() {
defer wg.Done()
for range pp.cc {
wg.Go(func() {
for br := range blockReadersCh {
if err := pp.do(br); err != nil {
errCh <- fmt.Errorf("read failed for block %q: %s", br.Meta().ULID, err)
@@ -132,7 +133,7 @@ func (pp *prometheusProcessor) processBlocks(blocks []tsdb.BlockReader) error {
}
bar.Increment()
}
}()
})
}
// any error breaks the import
for _, br := range blocks {

View File

@@ -47,7 +47,7 @@ func (rrp *remoteReadProcessor) run(ctx context.Context) error {
question := fmt.Sprintf("Selected time range %q - %q will be split into %d ranges according to %q step. Continue?",
rrp.filter.timeStart.String(), rrp.filter.timeEnd.String(), len(ranges), rrp.filter.chunk)
if !prompt(question) {
if !prompt(ctx, question) {
return nil
}
@@ -66,10 +66,8 @@ func (rrp *remoteReadProcessor) run(ctx context.Context) error {
errCh := make(chan error)
var wg sync.WaitGroup
wg.Add(rrp.cc)
for i := 0; i < rrp.cc; i++ {
go func() {
defer wg.Done()
for range rrp.cc {
wg.Go(func() {
for r := range rangeC {
if err := rrp.do(ctx, r); err != nil {
errCh <- fmt.Errorf("request failed for: %s", err)
@@ -77,7 +75,7 @@ func (rrp *remoteReadProcessor) run(ctx context.Context) error {
}
bar.Increment()
}
}()
})
}
for _, r := range ranges {

View File

@@ -2,6 +2,7 @@ package main
import (
"bufio"
"context"
"fmt"
"os"
"strings"
@@ -15,7 +16,7 @@ const barTpl = `{{ blue "%s:" }} {{ counters . }} {{ bar . "[" "█" (cycle . "
// isSilent should be inited in main
var isSilent bool
func prompt(question string) bool {
func prompt(ctx context.Context, question string) bool {
if isSilent {
return true
}
@@ -25,15 +26,32 @@ func prompt(question string) bool {
}
reader := bufio.NewReader(os.Stdin)
fmt.Print(question, " [Y/n] ")
answer, err := reader.ReadString('\n')
if err != nil {
answerCh := make(chan string, 1)
errCh := make(chan error, 1)
go func() {
answer, err := reader.ReadString('\n')
if err != nil {
errCh <- err
return
}
answerCh <- answer
}()
select {
case <-ctx.Done():
fmt.Println("\nCanceled.")
return false
case err := <-errCh:
panic(err)
case answer := <-answerCh:
answer = strings.TrimSpace(strings.ToLower(answer))
if answer == "" || answer == "yes" || answer == "y" {
return true
}
return false
}
answer = strings.TrimSpace(strings.ToLower(answer))
if answer == "" || answer == "yes" || answer == "y" {
return true
}
return false
}
func wrapErr(vmErr *vm.ImportError, verbose bool) error {

View File

@@ -156,15 +156,13 @@ func NewImporter(ctx context.Context, cfg Config) (*Importer, error) {
cfg.BatchSize = 1e5
}
im.wg.Add(int(cfg.Concurrency))
for i := 0; i < int(cfg.Concurrency); i++ {
for i := range int(cfg.Concurrency) {
pbPrefix := fmt.Sprintf(`{{ green "VM worker %d:" }}`, i)
bar := barpool.AddWithTemplate(pbPrefix+pbTpl, 0)
go func(bar barpool.Bar) {
defer im.wg.Done()
im.wg.Go(func() {
im.startWorker(ctx, bar, cfg.BatchSize, cfg.SignificantFigures, cfg.RoundDigits)
}(bar)
})
}
im.ResetStats()
return im, nil

View File

@@ -79,7 +79,7 @@ func (p *vmNativeProcessor) run(ctx context.Context) error {
return fmt.Errorf("failed to get tenants: %w", err)
}
question := fmt.Sprintf("The following tenants were discovered: %s.\n Continue?", tenants)
if !prompt(question) {
if !prompt(ctx, question) {
return nil
}
}
@@ -233,7 +233,7 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
// do not prompt for intercluster because there could be many tenants,
// and we don't want to interrupt the process when moving to the next tenant.
question := foundSeriesMsg + ". Continue?"
if !prompt(question) {
if !prompt(ctx, question) {
return nil
}
} else {
@@ -249,9 +249,7 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
var wg sync.WaitGroup
for i := 0; i < p.cc; i++ {
wg.Add(1)
go func() {
defer wg.Done()
wg.Go(func() {
for f := range filterCh {
if !p.disablePerMetricRequests {
if err := p.do(ctx, f, srcURL, dstURL, nil); err != nil {
@@ -266,7 +264,7 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
}
}
}
}()
})
}
// any error breaks the import

View File

@@ -11,9 +11,11 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/ratelimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/slicesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/metricsmetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeserieslimits"
)
@@ -50,8 +52,9 @@ var (
type InsertCtx struct {
Labels sortedLabels
mrs []storage.MetricRow
metricNamesBuf []byte
mrs []storage.MetricRow
mms []metricsmetadata.Row
metricNameBuf []byte
relabelCtx relabel.Ctx
streamAggrCtx streamAggrCtx
@@ -73,8 +76,13 @@ func (ctx *InsertCtx) Reset(rowsLen int) {
}
mrs = slicesutil.SetLength(mrs, rowsLen)
ctx.mrs = mrs[:0]
mms := ctx.mms
for i := range mms {
cleanMetricMetadata(&mms[i])
}
ctx.mms = mms[:0]
ctx.metricNamesBuf = ctx.metricNamesBuf[:0]
ctx.metricNameBuf = ctx.metricNameBuf[:0]
ctx.relabelCtx.Reset()
ctx.streamAggrCtx.Reset()
ctx.skipStreamAggr = false
@@ -84,11 +92,20 @@ func cleanMetricRow(mr *storage.MetricRow) {
mr.MetricNameRaw = nil
}
func cleanMetricMetadata(mm *metricsmetadata.Row) {
mm.MetricFamilyName = nil
mm.Unit = nil
mm.Help = nil
mm.Type = 0
mm.ProjectID = 0
mm.AccountID = 0
}
func (ctx *InsertCtx) marshalMetricNameRaw(prefix []byte, labels []prompb.Label) []byte {
start := len(ctx.metricNamesBuf)
ctx.metricNamesBuf = append(ctx.metricNamesBuf, prefix...)
ctx.metricNamesBuf = storage.MarshalMetricNameRaw(ctx.metricNamesBuf, labels)
metricNameRaw := ctx.metricNamesBuf[start:]
start := len(ctx.metricNameBuf)
ctx.metricNameBuf = append(ctx.metricNameBuf, prefix...)
ctx.metricNameBuf = storage.MarshalMetricNameRaw(ctx.metricNameBuf, labels)
metricNameRaw := ctx.metricNameBuf[start:]
return metricNameRaw[:len(metricNameRaw):len(metricNameRaw)]
}
@@ -143,7 +160,7 @@ func (ctx *InsertCtx) addRow(metricNameRaw []byte, timestamp int64, value float6
mr.MetricNameRaw = metricNameRaw
mr.Timestamp = timestamp
mr.Value = value
if len(ctx.metricNamesBuf) > 16*1024*1024 {
if len(ctx.metricNameBuf) > 16*1024*1024 {
if err := ctx.FlushBufs(); err != nil {
return err
}
@@ -151,6 +168,55 @@ func (ctx *InsertCtx) addRow(metricNameRaw []byte, timestamp int64, value float6
return nil
}
// WriteMetadata writes given prometheus protobuf metadata into the storage.
func (ctx *InsertCtx) WriteMetadata(mmpbs []prompb.MetricMetadata) error {
if len(mmpbs) == 0 {
return nil
}
mms := ctx.mms
mms = slicesutil.SetLength(mms, len(mmpbs))
for idx, mmpb := range mmpbs {
mm := &mms[idx]
mm.MetricFamilyName = bytesutil.ToUnsafeBytes(mmpb.MetricFamilyName)
mm.Help = bytesutil.ToUnsafeBytes(mmpb.Help)
mm.Type = mmpb.Type
mm.Unit = bytesutil.ToUnsafeBytes(mmpb.Unit)
}
err := vmstorage.AddMetadataRows(mms)
if err != nil {
return &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot store metrics metadata: %w", err),
StatusCode: http.StatusServiceUnavailable,
}
}
return nil
}
// WritePromMetadata writes given prometheus metric metadata into the storage
func (ctx *InsertCtx) WritePromMetadata(mmps []prometheus.Metadata) error {
if len(mmps) == 0 {
return nil
}
mms := ctx.mms
mms = slicesutil.SetLength(mms, len(mmps))
for idx, mmpb := range mmps {
mm := &mms[idx]
mm.MetricFamilyName = bytesutil.ToUnsafeBytes(mmpb.Metric)
mm.Help = bytesutil.ToUnsafeBytes(mmpb.Help)
mm.Type = mmpb.Type
}
err := vmstorage.AddMetadataRows(mms)
if err != nil {
return &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot store prometheus metrics metadata: %w", err),
StatusCode: http.StatusServiceUnavailable,
}
}
return nil
}
// AddLabelBytes adds (name, value) label to ctx.Labels.
//
// name and value must exist until ctx.Labels is used.

View File

@@ -111,9 +111,7 @@ func InitStreamAggr() {
saCfgTimestamp.Set(fasttime.UnixTimestamp())
// Start config reloader.
saCfgReloaderWG.Add(1)
go func() {
defer saCfgReloaderWG.Done()
saCfgReloaderWG.Go(func() {
for {
select {
case <-sighupCh:
@@ -122,7 +120,7 @@ func InitStreamAggr() {
}
reloadStreamAggrConfig()
}
}()
})
}
func reloadStreamAggrConfig() {

View File

@@ -27,6 +27,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/zabbixconnector"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
@@ -231,6 +232,17 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
firehose.WriteSuccessResponse(w, r)
return true
case "/zabbixconnector/api/v1/history":
zabbixconnectorHistoryRequests.Inc()
if err := zabbixconnector.InsertHandlerForHTTP(r); err != nil {
zabbixconnectorHistoryErrors.Inc()
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusBadRequest)
fmt.Fprintf(w, `{"error":%q}`, err.Error())
return true
}
w.WriteHeader(http.StatusOK)
return true
case "/newrelic":
newrelicCheckRequest.Inc()
w.Header().Set("Content-Type", "application/json")
@@ -423,6 +435,9 @@ var (
opentelemetryPushRequests = metrics.NewCounter(`vm_http_requests_total{path="/opentelemetry/v1/metrics", protocol="opentelemetry"}`)
opentelemetryPushErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/opentelemetry/v1/metrics", protocol="opentelemetry"}`)
zabbixconnectorHistoryRequests = metrics.NewCounter(`vm_http_requests_total{path="/zabbixconnector/api/v1/history", protocol="zabbixconnector"}`)
zabbixconnectorHistoryErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/zabbixconnector/api/v1/history", protocol="zabbixconnector"}`)
newrelicWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/newrelic/infra/v2/metrics/events/bulk", protocol="newrelic"}`)
newrelicWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/newrelic/infra/v2/metrics/events/bulk", protocol="newrelic"}`)

View File

@@ -6,6 +6,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/firehose"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/stream"
@@ -14,8 +15,9 @@ import (
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentelemetry"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="opentelemetry"}`)
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentelemetry"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="opentelemetry"}`)
metadataInserted = metrics.NewCounter(`vm_metadata_rows_inserted_total{type="opentelemetry"}`)
)
// InsertHandler processes opentelemetry metrics.
@@ -33,12 +35,12 @@ func InsertHandler(req *http.Request) error {
return fmt.Errorf("json encoding isn't supported for opentelemetry format. Use protobuf encoding")
}
}
return stream.ParseStream(req.Body, encoding, processBody, func(tss []prompb.TimeSeries, _ []prompb.MetricMetadata) error {
return insertRows(tss, extraLabels)
return stream.ParseStream(req.Body, encoding, processBody, func(tss []prompb.TimeSeries, mms []prompb.MetricMetadata) error {
return insertRows(tss, mms, extraLabels)
})
}
func insertRows(tss []prompb.TimeSeries, extraLabels []prompb.Label) error {
func insertRows(tss []prompb.TimeSeries, mms []prompb.MetricMetadata, extraLabels []prompb.Label) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
@@ -75,5 +77,14 @@ func insertRows(tss []prompb.TimeSeries, extraLabels []prompb.Label) error {
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ctx.FlushBufs()
if err := ctx.FlushBufs(); err != nil {
return fmt.Errorf("cannot flush metric bufs: %w", err)
}
if prommetadata.IsEnabled() {
if err := ctx.WriteMetadata(mms); err != nil {
return err
}
metadataInserted.Add(len(mms))
}
return nil
}

View File

@@ -1,6 +1,7 @@
package prometheusimport
import (
"fmt"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
@@ -15,8 +16,9 @@ import (
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="prometheus"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="prometheus"}`)
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="prometheus"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="prometheus"}`)
metadataInserted = metrics.NewCounter(`vm_metadata_rows_inserted_total{type="prometheus"}`)
)
// InsertHandler processes `/api/v1/import/prometheus` request.
@@ -30,14 +32,14 @@ func InsertHandler(req *http.Request) error {
return err
}
encoding := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, defaultTimestamp, encoding, true, prommetadata.IsEnabled(), func(rows []prometheus.Row, _ []prometheus.Metadata) error {
return insertRows(rows, extraLabels)
return stream.Parse(req.Body, defaultTimestamp, encoding, true, prommetadata.IsEnabled(), func(rows []prometheus.Row, mms []prometheus.Metadata) error {
return insertRows(rows, mms, extraLabels)
}, func(s string) {
httpserver.LogError(req, s)
})
}
func insertRows(rows []prometheus.Row, extraLabels []prompb.Label) error {
func insertRows(rows []prometheus.Row, mms []prometheus.Metadata, extraLabels []prompb.Label) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
@@ -64,5 +66,15 @@ func insertRows(rows []prometheus.Row, extraLabels []prompb.Label) error {
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ctx.FlushBufs()
if err := ctx.FlushBufs(); err != nil {
return fmt.Errorf("cannot flush metric bufs: %w", err)
}
if prommetadata.IsEnabled() {
if err := ctx.WritePromMetadata(mms); err != nil {
return err
}
metadataInserted.Add(len(mms))
}
return nil
}

View File

@@ -4,13 +4,15 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="promscrape"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="promscrape"}`)
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="promscrape"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="promscrape"}`)
metadataRowsInserted = metrics.NewCounter(`vm_metadata_rows_inserted_total{type="promscrape"}`)
)
const maxRowsPerBlock = 10000
@@ -41,6 +43,13 @@ func Push(wr *prompb.WriteRequest) {
}
push(ctx, tssBlock)
}
if prommetadata.IsEnabled() {
if err := ctx.WriteMetadata(wr.Metadata); err != nil {
logger.Errorf("cannot write promscrape metrics metadata to storage: %s", err)
} else {
metadataRowsInserted.Add(len(wr.Metadata))
}
}
}
func push(ctx *common.InsertCtx, tss []prompb.TimeSeries) {

View File

@@ -1,10 +1,12 @@
package promremotewrite
import (
"fmt"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
@@ -12,8 +14,9 @@ import (
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="promremotewrite"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="promremotewrite"}`)
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="promremotewrite"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="promremotewrite"}`)
metadataInserted = metrics.NewCounter(`vm_metadata_rows_inserted_total{type="promremotewrite"}`)
)
// InsertHandler processes remote write for prometheus.
@@ -23,12 +26,12 @@ func InsertHandler(req *http.Request) error {
return err
}
isVMRemoteWrite := req.Header.Get("Content-Encoding") == "zstd"
return stream.Parse(req.Body, isVMRemoteWrite, func(tss []prompb.TimeSeries, _ []prompb.MetricMetadata) error {
return insertRows(tss, extraLabels)
return stream.Parse(req.Body, isVMRemoteWrite, func(tss []prompb.TimeSeries, mms []prompb.MetricMetadata) error {
return insertRows(tss, mms, extraLabels)
})
}
func insertRows(timeseries []prompb.TimeSeries, extraLabels []prompb.Label) error {
func insertRows(timeseries []prompb.TimeSeries, mms []prompb.MetricMetadata, extraLabels []prompb.Label) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
@@ -68,5 +71,15 @@ func insertRows(timeseries []prompb.TimeSeries, extraLabels []prompb.Label) erro
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ctx.FlushBufs()
if err := ctx.FlushBufs(); err != nil {
return fmt.Errorf("cannot flush metric bufs: %w", err)
}
if prommetadata.IsEnabled() {
if err := ctx.WriteMetadata(mms); err != nil {
return err
}
metadataInserted.Add(len(mms))
}
return nil
}

View File

@@ -0,0 +1,67 @@
package zabbixconnector
import (
"net/http"
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/zabbixconnector"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/zabbixconnector/stream"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="zabbixconnector"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="zabbixconnector"}`)
)
// InsertHandlerForHTTP processes remote write for ZabbixConnector POST /zabbixconnector/v1/history request.
func InsertHandlerForHTTP(req *http.Request) error {
extraLabels, err := protoparserutil.GetExtraLabels(req)
if err != nil {
return err
}
encoding := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, encoding, func(rows []zabbixconnector.Row) error {
return insertRows(rows, extraLabels)
})
}
func insertRows(rows []zabbixconnector.Row, extraLabels []prompb.Label) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
rowsTotal := len(rows)
ctx.Reset(rowsTotal)
hasRelabeling := relabel.HasRelabeling()
for i := range rows {
r := &rows[i]
ctx.Labels = ctx.Labels[:0]
for k := range r.Tags {
t := &r.Tags[k]
ctx.AddLabelBytes(t.Key, t.Value)
}
for k := range extraLabels {
label := &extraLabels[k]
ctx.AddLabel(label.Name, label.Value)
}
if hasRelabeling {
ctx.ApplyRelabeling()
}
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
ctx.SortLabelsIfNeeded()
if err := ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value); err != nil {
return err
}
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ctx.FlushBufs()
}

View File

@@ -104,7 +104,7 @@ func newDstFS() (*fslocal.FS, error) {
}
func newSrcFS(ctx context.Context) (common.RemoteFS, error) {
fs, err := actions.NewRemoteFS(ctx, *src)
fs, err := actions.NewRemoteFS(ctx, *src, nil)
if err != nil {
return nil, fmt.Errorf("cannot parse `-src`=%q: %w", *src, err)
}

View File

@@ -3896,27 +3896,9 @@ func nextSeriesConcurrentWrapper(nextSeries nextSeriesFunc, f func(s *series) (*
seriesCh := make(chan *series, goroutines)
errCh := make(chan error, 1)
var wg sync.WaitGroup
wg.Add(goroutines)
go func() {
var err error
for {
s, e := nextSeries()
if e != nil || s == nil {
err = e
break
}
seriesCh <- s
}
close(seriesCh)
wg.Wait()
close(resultCh)
errCh <- err
close(errCh)
}()
var skipProcessing atomic.Bool
for i := 0; i < goroutines; i++ {
go func() {
defer wg.Done()
for range goroutines {
wg.Go(func() {
for s := range seriesCh {
if skipProcessing.Load() {
continue
@@ -3934,8 +3916,24 @@ func nextSeriesConcurrentWrapper(nextSeries nextSeriesFunc, f func(s *series) (*
}
}
}
}()
})
}
go func() {
var err error
for {
s, e := nextSeries()
if e != nil || s == nil {
err = e
break
}
seriesCh <- s
}
close(seriesCh)
wg.Wait()
close(resultCh)
errCh <- err
close(errCh)
}()
wrapper := func() (*series, error) {
r := <-resultCh
if r == nil {

View File

@@ -421,6 +421,16 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/metadata":
// Return dumb placeholder for https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata
metadataRequests.Inc()
if err := prometheus.MetadataHandler(qt, startTime, w, r); err != nil {
metadataErrors.Inc()
httpserver.SendPrometheusError(w, r, err)
return true
}
return true
default:
return false
}
@@ -510,7 +520,7 @@ func handleStaticAndSimpleRequests(w http.ResponseWriter, r *http.Request, path
fmt.Fprintf(w, "%s", `{"status":"error","msg":"for accessing vmalert flag '-vmalert.proxyURL' must be configured"}`)
return true
}
proxyVMAlertRequests(w, r)
proxyVMAlertRequests(w, r, path)
return true
}
@@ -548,7 +558,7 @@ func handleStaticAndSimpleRequests(w http.ResponseWriter, r *http.Request, path
case "/api/v1/rules", "/rules":
rulesRequests.Inc()
if len(*vmalertProxyURL) > 0 {
proxyVMAlertRequests(w, r)
proxyVMAlertRequests(w, r, path)
return true
}
// Return dumb placeholder for https://prometheus.io/docs/prometheus/latest/querying/api/#rules
@@ -558,7 +568,7 @@ func handleStaticAndSimpleRequests(w http.ResponseWriter, r *http.Request, path
case "/api/v1/alerts", "/alerts":
alertsRequests.Inc()
if len(*vmalertProxyURL) > 0 {
proxyVMAlertRequests(w, r)
proxyVMAlertRequests(w, r, path)
return true
}
// Return dumb placeholder for https://prometheus.io/docs/prometheus/latest/querying/api/#alerts
@@ -568,18 +578,12 @@ func handleStaticAndSimpleRequests(w http.ResponseWriter, r *http.Request, path
case "/api/v1/notifiers", "/notifiers":
notifiersRequests.Inc()
if len(*vmalertProxyURL) > 0 {
proxyVMAlertRequests(w, r)
proxyVMAlertRequests(w, r, path)
return true
}
w.Header().Set("Content-Type", "application/json")
fmt.Fprint(w, `{"status":"success","data":{"notifiers":[]}}`)
return true
case "/api/v1/metadata":
// Return dumb placeholder for https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata
metadataRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"success","data":{}}`)
return true
case "/api/v1/status/buildinfo":
buildInfoRequests.Inc()
w.Header().Set("Content-Type", "application/json")
@@ -708,7 +712,9 @@ var (
alertsRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/alerts"}`)
notifiersRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/notifiers"}`)
metadataRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/metadata"}`)
metadataRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/metadata"}`)
metadataErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/metadata"}`)
buildInfoRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/buildinfo"}`)
queryExemplarsRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/query_exemplars"}`)
@@ -719,7 +725,7 @@ var (
metricNamesStatsResetErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/admin/status/metric_names_stats/reset"}`)
)
func proxyVMAlertRequests(w http.ResponseWriter, r *http.Request) {
func proxyVMAlertRequests(w http.ResponseWriter, r *http.Request, path string) {
defer func() {
err := recover()
if err == nil || err == http.ErrAbortHandler {
@@ -730,8 +736,10 @@ func proxyVMAlertRequests(w http.ResponseWriter, r *http.Request) {
// Forward other panics to the caller.
panic(err)
}()
r.Host = vmalertProxyHost
vmalertProxy.ServeHTTP(w, r)
req := r.Clone(r.Context())
req.URL.Path = strings.TrimPrefix(path, "prometheus")
req.Host = vmalertProxyHost
vmalertProxy.ServeHTTP(w, req)
}
var (

View File

@@ -20,6 +20,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/querytracer"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/metricsmetadata"
)
var (
@@ -295,14 +296,12 @@ func (rss *Results) runParallel(qt *querytracer.Tracer, f func(rs *Result, worke
// Start workers and wait until they finish the work.
var wg sync.WaitGroup
for i := range workChs {
wg.Add(1)
qtChild := qt.NewChild("worker #%d", i)
go func(workerID uint) {
timeseriesWorker(qtChild, workChs, workerID)
for workerID := range workChs {
qtChild := qt.NewChild("worker #%d", workerID)
wg.Go(func() {
timeseriesWorker(qtChild, workChs, uint(workerID))
qtChild.Done()
wg.Done()
}(uint(i))
})
}
wg.Wait()
@@ -513,12 +512,10 @@ func (pts *packedTimeseries) unpackTo(dst []*sortBlock, tbf *tmpBlocksFile, tr s
// Start workers and wait until they finish the work.
var wg sync.WaitGroup
for i := 0; i < workers; i++ {
wg.Add(1)
go func(workerID uint) {
unpackWorker(workChs, workerID)
wg.Done()
}(uint(i))
for workerID := range workers {
wg.Go(func() {
unpackWorker(workChs, uint(workerID))
})
}
wg.Wait()
@@ -865,6 +862,23 @@ func LabelValues(qt *querytracer.Tracer, labelName string, sq *storage.SearchQue
return labelValues, nil
}
// GetMetricsMetadata returns time series metric names metadata for the given args
func GetMetricsMetadata(qt *querytracer.Tracer, limit int, metricName string) ([]*metricsmetadata.Row, error) {
qt = qt.NewChild("get metrics metadata: limit=%d, metric_name=%q", limit, metricName)
defer qt.Done()
metadata := vmstorage.Storage.GetMetadataRows(qt, limit, metricName)
sort.Slice(metadata, func(i, j int) bool {
return string(metadata[i].MetricFamilyName) < string(metadata[j].MetricFamilyName)
})
if limit > 0 && len(metadata) >= limit {
metadata = metadata[:limit]
}
return metadata, nil
}
// GraphiteTagValues returns tag values for the given tagName until the given deadline.
func GraphiteTagValues(qt *querytracer.Tracer, tagName, filter string, limit int, deadline searchutil.Deadline) ([]string, error) {
qt = qt.NewChild("get graphite tag values for tagName=%s, filter=%s, limit=%d", tagName, filter, limit)
@@ -1002,12 +1016,10 @@ func ExportBlocks(qt *querytracer.Tracer, sq *storage.SearchQuery, deadline sear
mustStop atomic.Bool
)
var wg sync.WaitGroup
wg.Add(gomaxprocs)
for i := 0; i < gomaxprocs; i++ {
go func(workerID uint) {
defer wg.Done()
for workerID := range gomaxprocs {
wg.Go(func() {
for xw := range workCh {
if err := f(&xw.mn, &xw.b, tr, workerID); err != nil {
if err := f(&xw.mn, &xw.b, tr, uint(workerID)); err != nil {
errGlobalLock.Lock()
if errGlobal == nil {
errGlobal = err
@@ -1018,7 +1030,7 @@ func ExportBlocks(qt *querytracer.Tracer, sq *storage.SearchQuery, deadline sear
xw.reset()
exportWorkPool.Put(xw)
}
}(uint(i))
})
}
// Feed workers with work

View File

@@ -0,0 +1,35 @@
{% import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/metricsmetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/querytracer"
) %}
{% stripspace %}
MetadataResponse generates response for /api/v1/metadata
See https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata
{% func MetadataResponse( result []*metricsmetadata.Row, qt *querytracer.Tracer) %}
{
"status":"success",
"data": {
{% code
mapItems := len(result)
currentItem := 0
%}
{% for _, row := range result %}
"{%s string(row.MetricFamilyName) %}": [
{
"type": {%q= row.Type.String() %},
{% if len(row.Unit) > 0 -%}
"unit": {%q= string(row.Unit) %},
{% endif -%}
"help": {%q= string(row.Help) %}
}
]
{% if currentItem != mapItems-1 %},{% endif %}
{% code currentItem++ %}
{% endfor %}
}
{%= dumpQueryTrace(qt) %}
}
{% endfunc %}
{% endstripspace %}

View File

@@ -0,0 +1,108 @@
// Code generated by qtc from "metadata_response.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vmselect/prometheus/metadata_response.qtpl:1
package prometheus
//line app/vmselect/prometheus/metadata_response.qtpl:1
import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/querytracer"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/metricsmetadata"
)
// MetadataResponse generates response for /api/v1/metadataSee https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata
//line app/vmselect/prometheus/metadata_response.qtpl:9
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmselect/prometheus/metadata_response.qtpl:9
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmselect/prometheus/metadata_response.qtpl:9
func StreamMetadataResponse(qw422016 *qt422016.Writer, result []*metricsmetadata.Row, qt *querytracer.Tracer) {
//line app/vmselect/prometheus/metadata_response.qtpl:9
qw422016.N().S(`{"status":"success","data": {`)
//line app/vmselect/prometheus/metadata_response.qtpl:14
mapItems := len(result)
currentItem := 0
//line app/vmselect/prometheus/metadata_response.qtpl:17
for _, row := range result {
//line app/vmselect/prometheus/metadata_response.qtpl:17
qw422016.N().S(`"`)
//line app/vmselect/prometheus/metadata_response.qtpl:18
qw422016.E().S(string(row.MetricFamilyName))
//line app/vmselect/prometheus/metadata_response.qtpl:18
qw422016.N().S(`": [{"type":`)
//line app/vmselect/prometheus/metadata_response.qtpl:20
qw422016.N().Q(row.Type.String())
//line app/vmselect/prometheus/metadata_response.qtpl:20
qw422016.N().S(`,`)
//line app/vmselect/prometheus/metadata_response.qtpl:21
if len(row.Unit) > 0 {
//line app/vmselect/prometheus/metadata_response.qtpl:21
qw422016.N().S(`"unit":`)
//line app/vmselect/prometheus/metadata_response.qtpl:22
qw422016.N().Q(string(row.Unit))
//line app/vmselect/prometheus/metadata_response.qtpl:22
qw422016.N().S(`,`)
//line app/vmselect/prometheus/metadata_response.qtpl:23
}
//line app/vmselect/prometheus/metadata_response.qtpl:23
qw422016.N().S(`"help":`)
//line app/vmselect/prometheus/metadata_response.qtpl:24
qw422016.N().Q(string(row.Help))
//line app/vmselect/prometheus/metadata_response.qtpl:24
qw422016.N().S(`}]`)
//line app/vmselect/prometheus/metadata_response.qtpl:27
if currentItem != mapItems-1 {
//line app/vmselect/prometheus/metadata_response.qtpl:27
qw422016.N().S(`,`)
//line app/vmselect/prometheus/metadata_response.qtpl:27
}
//line app/vmselect/prometheus/metadata_response.qtpl:28
currentItem++
//line app/vmselect/prometheus/metadata_response.qtpl:29
}
//line app/vmselect/prometheus/metadata_response.qtpl:29
qw422016.N().S(`}`)
//line app/vmselect/prometheus/metadata_response.qtpl:31
streamdumpQueryTrace(qw422016, qt)
//line app/vmselect/prometheus/metadata_response.qtpl:31
qw422016.N().S(`}`)
//line app/vmselect/prometheus/metadata_response.qtpl:33
}
//line app/vmselect/prometheus/metadata_response.qtpl:33
func WriteMetadataResponse(qq422016 qtio422016.Writer, result []*metricsmetadata.Row, qt *querytracer.Tracer) {
//line app/vmselect/prometheus/metadata_response.qtpl:33
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/metadata_response.qtpl:33
StreamMetadataResponse(qw422016, result, qt)
//line app/vmselect/prometheus/metadata_response.qtpl:33
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/metadata_response.qtpl:33
}
//line app/vmselect/prometheus/metadata_response.qtpl:33
func MetadataResponse(result []*metricsmetadata.Row, qt *querytracer.Tracer) string {
//line app/vmselect/prometheus/metadata_response.qtpl:33
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/metadata_response.qtpl:33
WriteMetadataResponse(qb422016, result, qt)
//line app/vmselect/prometheus/metadata_response.qtpl:33
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/metadata_response.qtpl:33
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/metadata_response.qtpl:33
return qs422016
//line app/vmselect/prometheus/metadata_response.qtpl:33
}

View File

@@ -639,6 +639,37 @@ func LabelsHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseW
return nil
}
// MetadataHandler processes /api/v1/metadata request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata
func MetadataHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseWriter, r *http.Request) error {
limit, err := httputil.GetInt(r, "limit")
if err != nil {
return err
}
if limit < 0 {
limit = 0
}
metricName := r.FormValue("metric")
metadata, err := netstorage.GetMetricsMetadata(qt, limit, metricName)
if err != nil {
return fmt.Errorf("cannot get metadata: %w", err)
}
qt.Done()
w.Header().Set("Content-Type", "application/json")
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
WriteMetadataResponse(bw, metadata, qt)
if err := bw.Flush(); err != nil {
return fmt.Errorf("cannot send metadata response to remote client: %w", err)
}
return nil
}
var labelsDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/labels"}`)
// SeriesCountHandler processes /api/v1/series/count request.

View File

@@ -103,15 +103,13 @@ func testIncrementalParallelAggr(iafc *incrementalAggrFuncContext, tssSrc, tssEx
workersCount := netstorage.MaxWorkers()
tsCh := make(chan *timeseries)
var wg sync.WaitGroup
wg.Add(workersCount)
for i := 0; i < workersCount; i++ {
go func(workerID uint) {
defer wg.Done()
for workerID := range workersCount {
wg.Go(func() {
for ts := range tsCh {
runtime.Gosched() // allow other goroutines performing the work
iafc.updateTimeseries(ts, workerID)
iafc.updateTimeseries(ts, uint(workerID))
}
}(uint(i))
})
}
for _, ts := range tssSrc {
tsCh <- ts

View File

@@ -477,22 +477,18 @@ func execBinaryOpArgs(qt *querytracer.Tracer, ec *EvalConfig, exprFirst, exprSec
var tssFirst []*timeseries
var errFirst error
qtFirst := qt.NewChild("expr1")
wg.Add(1)
go func() {
defer wg.Done()
wg.Go(func() {
tssFirst, errFirst = evalExpr(qtFirst, ec, exprFirst)
qtFirst.Done()
}()
})
var tssSecond []*timeseries
var errSecond error
qtSecond := qt.NewChild("expr2")
wg.Add(1)
go func() {
defer wg.Done()
wg.Go(func() {
tssSecond, errSecond = evalExpr(qtSecond, ec, exprSecond)
qtSecond.Done()
}()
})
wg.Wait()
if errFirst != nil {
@@ -710,17 +706,13 @@ func evalExprsInParallel(qt *querytracer.Tracer, ec *EvalConfig, es []metricsql.
qt.Printf("eval function args in parallel")
var wg sync.WaitGroup
for i, e := range es {
wg.Add(1)
qtChild := qt.NewChild("eval arg %d", i)
go func(e metricsql.Expr, i int) {
defer func() {
qtChild.Done()
wg.Done()
}()
wg.Go(func() {
defer qtChild.Done()
rv, err := evalExpr(qtChild, ec, e)
rvs[i] = rv
errs[i] = err
}(e, i)
})
}
wg.Wait()
for _, err := range errs {
@@ -785,7 +777,8 @@ func getRollupExprArg(arg metricsql.Expr) *metricsql.RollupExpr {
// - rollupFunc(m) if iafc is nil
// - aggrFunc(rollupFunc(m)) if iafc isn't nil
func evalRollupFunc(qt *querytracer.Tracer, ec *EvalConfig, funcName string, rf rollupFunc, expr metricsql.Expr,
re *metricsql.RollupExpr, iafc *incrementalAggrFuncContext) ([]*timeseries, error) {
re *metricsql.RollupExpr, iafc *incrementalAggrFuncContext,
) ([]*timeseries, error) {
if re.At == nil {
return evalRollupFuncWithoutAt(qt, ec, funcName, rf, expr, re, iafc)
}
@@ -835,7 +828,8 @@ func evalRollupFunc(qt *querytracer.Tracer, ec *EvalConfig, funcName string, rf
}
func evalRollupFuncWithoutAt(qt *querytracer.Tracer, ec *EvalConfig, funcName string, rf rollupFunc,
expr metricsql.Expr, re *metricsql.RollupExpr, iafc *incrementalAggrFuncContext) ([]*timeseries, error) {
expr metricsql.Expr, re *metricsql.RollupExpr, iafc *incrementalAggrFuncContext,
) ([]*timeseries, error) {
funcName = strings.ToLower(funcName)
ecNew := ec
var offset int64
@@ -1017,16 +1011,14 @@ func doParallel(tss []*timeseries, f func(ts *timeseries, values []float64, time
}
var wg sync.WaitGroup
wg.Add(workers)
for i := 0; i < workers; i++ {
go func(workerID uint) {
defer wg.Done()
for workerID := range workers {
wg.Go(func() {
var tmpValues []float64
var tmpTimestamps []int64
for ts := range workChs[workerID] {
tmpValues, tmpTimestamps = f(ts, tmpValues, tmpTimestamps, workerID)
tmpValues, tmpTimestamps = f(ts, tmpValues, tmpTimestamps, uint(workerID))
}
}(uint(i))
})
}
wg.Wait()
}
@@ -1058,7 +1050,8 @@ func removeNanValues(dstValues []float64, dstTimestamps []int64, values []float6
// evalInstantRollup evaluates instant rollup where ec.Start == ec.End.
func evalInstantRollup(qt *querytracer.Tracer, ec *EvalConfig, funcName string, rf rollupFunc,
expr metricsql.Expr, me *metricsql.MetricExpr, iafc *incrementalAggrFuncContext, window int64) ([]*timeseries, error) {
expr metricsql.Expr, me *metricsql.MetricExpr, iafc *incrementalAggrFuncContext, window int64,
) ([]*timeseries, error) {
if ec.Start != ec.End {
logger.Panicf("BUG: evalInstantRollup cannot be called on non-empty time range; got %s", ec.timeRangeString())
}
@@ -1083,10 +1076,12 @@ func evalInstantRollup(qt *querytracer.Tracer, ec *EvalConfig, funcName string,
rollupResultCacheV.DeleteInstantValues(qt, expr, window, ec.Step, ec.EnforcedTagFilterss)
}
getCachedSeries := func(qt *querytracer.Tracer) ([]*timeseries, int64, error) {
rollupResultCacheV.rollupResultCacheRequests.Inc()
again:
offset := int64(0)
tssCached := rollupResultCacheV.GetInstantValues(qt, expr, window, ec.Step, ec.EnforcedTagFilterss)
if len(tssCached) == 0 {
rollupResultCacheV.rollupResultCacheMisses.Inc()
// Cache miss. Re-populate the missing data.
start := int64(fasttime.UnixTimestamp()*1000) - cacheTimestampOffset.Milliseconds()
offset = timestamp - start
@@ -1129,6 +1124,7 @@ func evalInstantRollup(qt *querytracer.Tracer, ec *EvalConfig, funcName string,
deleteCachedSeries(qt)
goto again
}
rollupResultCacheV.rollupResultCachePartialHits.Inc()
ec.QueryStats.addSeriesFetched(len(tssCached))
return tssCached, offset, nil
}
@@ -1169,60 +1165,6 @@ func evalInstantRollup(qt *querytracer.Tracer, ec *EvalConfig, funcName string,
},
}
return evalExpr(qt, ec, be)
case "rate":
if iafc != nil {
if !strings.EqualFold(iafc.ae.Name, "sum") {
qt.Printf("do not apply instant rollup optimization for incremental aggregate %s()", iafc.ae.Name)
return evalAt(qt, timestamp, window)
}
qt.Printf("optimized calculation for sum(rate(m[d])) as (sum(increase(m[d])) / d)")
afe := expr.(*metricsql.AggrFuncExpr)
fe := afe.Args[0].(*metricsql.FuncExpr)
feIncrease := *fe
feIncrease.Name = "increase"
// copy RollupExpr to drop possible offset,
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9762
newArg := copyRollupExpr(fe.Args[0].(*metricsql.RollupExpr))
newArg.Offset = nil
feIncrease.Args = []metricsql.Expr{newArg}
d := newArg.Window.Duration(ec.Step)
if d == 0 {
d = ec.Step
}
afeIncrease := *afe
afeIncrease.Args = []metricsql.Expr{&feIncrease}
be := &metricsql.BinaryOpExpr{
Op: "/",
KeepMetricNames: true,
Left: &afeIncrease,
Right: &metricsql.NumberExpr{
N: float64(d) / 1000,
},
}
return evalExpr(qt, ec, be)
}
qt.Printf("optimized calculation for instant rollup rate(m[d]) as (increase(m[d]) / d)")
fe := expr.(*metricsql.FuncExpr)
feIncrease := *fe
feIncrease.Name = "increase"
// copy RollupExpr to drop possible offset,
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9762
newArg := copyRollupExpr(fe.Args[0].(*metricsql.RollupExpr))
newArg.Offset = nil
feIncrease.Args = []metricsql.Expr{newArg}
d := newArg.Window.Duration(ec.Step)
if d == 0 {
d = ec.Step
}
be := &metricsql.BinaryOpExpr{
Op: "/",
KeepMetricNames: fe.KeepMetricNames,
Left: &feIncrease,
Right: &metricsql.NumberExpr{
N: float64(d) / 1000,
},
}
return evalExpr(qt, ec, be)
case "max_over_time":
if iafc != nil {
if !strings.EqualFold(iafc.ae.Name, "max") {
@@ -1591,16 +1533,11 @@ func assertInstantValues(tss []*timeseries) {
}
}
var (
rollupResultCacheFullHits = metrics.NewCounter(`vm_rollup_result_cache_full_hits_total`)
rollupResultCachePartialHits = metrics.NewCounter(`vm_rollup_result_cache_partial_hits_total`)
rollupResultCacheMiss = metrics.NewCounter(`vm_rollup_result_cache_miss_total`)
memoryIntensiveQueries = metrics.NewCounter(`vm_memory_intensive_queries_total`)
)
var memoryIntensiveQueries = metrics.NewCounter(`vm_memory_intensive_queries_total`)
func evalRollupFuncWithMetricExpr(qt *querytracer.Tracer, ec *EvalConfig, funcName string, rf rollupFunc,
expr metricsql.Expr, me *metricsql.MetricExpr, iafc *incrementalAggrFuncContext, windowExpr *metricsql.DurationExpr) ([]*timeseries, error) {
expr metricsql.Expr, me *metricsql.MetricExpr, iafc *incrementalAggrFuncContext, windowExpr *metricsql.DurationExpr,
) ([]*timeseries, error) {
window, err := windowExpr.NonNegativeDuration(ec.Step)
if err != nil {
return nil, fmt.Errorf("cannot parse lookbehind window in square brackets at %s: %w", expr.AppendString(nil), err)
@@ -1636,19 +1573,20 @@ func evalRollupFuncWithMetricExpr(qt *querytracer.Tracer, ec *EvalConfig, funcNa
}
// Search for cached results.
rollupResultCacheV.rollupResultCacheRequests.Inc()
tssCached, start := rollupResultCacheV.GetSeries(qt, ec, expr, window)
ec.QueryStats.addSeriesFetched(len(tssCached))
if start > ec.End {
qt.Printf("the result is fully cached")
rollupResultCacheFullHits.Inc()
rollupResultCacheV.rollupResultCacheFullHits.Inc()
return tssCached, nil
}
if start > ec.Start {
qt.Printf("partial cache hit")
rollupResultCachePartialHits.Inc()
rollupResultCacheV.rollupResultCachePartialHits.Inc()
} else {
qt.Printf("cache miss")
rollupResultCacheMiss.Inc()
rollupResultCacheV.rollupResultCacheMisses.Inc()
}
// Fetch missing results, which aren't cached yet.
@@ -1684,7 +1622,8 @@ func evalRollupFuncWithMetricExpr(qt *querytracer.Tracer, ec *EvalConfig, funcNa
//
// pointsPerSeries is used only for estimating the needed memory for query processing
func evalRollupFuncNoCache(qt *querytracer.Tracer, ec *EvalConfig, funcName string, rf rollupFunc,
expr metricsql.Expr, me *metricsql.MetricExpr, iafc *incrementalAggrFuncContext, window, pointsPerSeries int64) ([]*timeseries, error) {
expr metricsql.Expr, me *metricsql.MetricExpr, iafc *incrementalAggrFuncContext, window, pointsPerSeries int64,
) ([]*timeseries, error) {
if qt.Enabled() {
qt = qt.NewChild("rollup %s: timeRange=%s, step=%d, window=%d", expr.AppendString(nil), ec.timeRangeString(), ec.Step, window)
defer qt.Done()
@@ -1807,7 +1746,8 @@ func maxSilenceInterval() int64 {
func evalRollupWithIncrementalAggregate(qt *querytracer.Tracer, funcName string, keepMetricNames bool,
iafc *incrementalAggrFuncContext, rss *netstorage.Results, rcs []*rollupConfig,
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64) ([]*timeseries, error) {
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64,
) ([]*timeseries, error) {
qt = qt.NewChild("rollup %s() with incremental aggregation %s() over %d series; rollupConfigs=%s", funcName, iafc.ae.Name, rss.Len(), rcs)
defer qt.Done()
var samplesScannedTotal atomic.Uint64
@@ -1846,7 +1786,8 @@ func evalRollupWithIncrementalAggregate(qt *querytracer.Tracer, funcName string,
}
func evalRollupNoIncrementalAggregate(qt *querytracer.Tracer, funcName string, keepMetricNames bool, rss *netstorage.Results, rcs []*rollupConfig,
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64) ([]*timeseries, error) {
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64,
) ([]*timeseries, error) {
qt = qt.NewChild("rollup %s() over %d series; rollupConfigs=%s", funcName, rss.Len(), rcs)
defer qt.Done()
@@ -1886,7 +1827,8 @@ func evalRollupNoIncrementalAggregate(qt *querytracer.Tracer, funcName string, k
}
func doRollupForTimeseries(funcName string, keepMetricNames bool, rc *rollupConfig, tsDst *timeseries, mnSrc *storage.MetricName,
valuesSrc []float64, timestampsSrc []int64, sharedTimestamps []int64) uint64 {
valuesSrc []float64, timestampsSrc []int64, sharedTimestamps []int64,
) uint64 {
tsDst.MetricName.CopyFrom(mnSrc)
if len(rc.TagValue) > 0 {
tsDst.MetricName.AddTag("rollup", rc.TagValue)

View File

@@ -534,7 +534,10 @@ type rollupFuncArg struct {
timestamps []int64
// Real value preceding values.
// Is populated if preceding value is within the rc.LookbackDelta.
// Is populated if the preceding sample falls within the rc.LookbackDelta range, or if rc.LookbackDelta is not set.
//
// It provides an additional check and value for rollup functions such as increase(), changes(),
// when the prevValue is NaN due to a gap or a small lookback window.
realPrevValue float64
// Real value which goes after values.
@@ -713,7 +716,11 @@ func (rc *rollupConfig) doInternal(dstValues []float64, tsm *timeseriesMap, valu
// Extend dstValues in order to remove mallocs below.
dstValues = decimal.ExtendFloat64sCapacity(dstValues, len(rc.Timestamps))
// Use step as the scrape interval for instant queries (when start == end).
// Set maxPrevInterval for subsequent rfa.prevValue calculations in rollupFunc:
// For instant queries, use rc.Step directly as maxPrevInterval.
// For range queries, rc.Step is typically too small to serve as the lookback window between two rollup points.
// Instead, estimate the scrape interval from raw sample timestamps (using the 0.6 quantile of the last 20 intervals)
// and slightly inflate the scrape interval to set maxPrevInterval, allowing for some tolerance to jitter.
maxPrevInterval := rc.Step
if rc.Start < rc.End {
scrapeInterval := getScrapeInterval(timestamps, rc.Step)
@@ -729,22 +736,21 @@ func (rc *rollupConfig) doInternal(dstValues []float64, tsm *timeseriesMap, valu
}
}
window := rc.Window
// Adjust lookbehind window only if it isn't set explicitly, e.g. rate(foo).
// In the case of missing lookbehind window it should be adjusted in order to return non-empty graph
// when the window doesn't cover at least two raw samples (this is what most users expect).
//
// If the user explicitly sets the lookbehind window to some fixed value, e.g. rate(foo[1s]),
// then it is expected he knows what he is doing. Do not adjust the lookbehind window then.
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3483
if window <= 0 {
window = rc.Step
if rc.MayAdjustWindow && window < maxPrevInterval {
// Adjust lookbehind window only if it isn't set explicitly, e.g. rate(foo).
// In the case of missing lookbehind window it should be adjusted in order to return non-empty graph
// when the window doesn't cover at least two raw samples (this is what most users expect).
//
// If the user explicitly sets the lookbehind window to some fixed value, e.g. rate(foo[1s]),
// then it is expected he knows what he is doing. Do not adjust the lookbehind window then.
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3483
window = maxPrevInterval
}
// Artificial window cannot exceed explicit rc.LookbackDelta, see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/784
if rc.isDefaultRollup && rc.LookbackDelta > 0 && window > rc.LookbackDelta {
// Implicit window exceeds -search.maxStalenessInterval, so limit it to -search.maxStalenessInterval
// according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/784
window = rc.LookbackDelta
}
}
@@ -869,17 +875,17 @@ func getScrapeInterval(timestamps []int64, defaultInterval int64) int64 {
return defaultInterval
}
// Estimate scrape interval as 0.6 quantile for the first 20 intervals.
tsPrev := timestamps[0]
timestamps = timestamps[1:]
// Estimate scrape interval as 0.6 quantile of the last 20 intervals.
tsPrev := timestamps[len(timestamps)-1]
timestamps = timestamps[:len(timestamps)-1]
if len(timestamps) > 20 {
timestamps = timestamps[:20]
timestamps = timestamps[len(timestamps)-20:]
}
a := getFloat64s()
intervals := a.A[:0]
for _, ts := range timestamps {
intervals = append(intervals, float64(ts-tsPrev))
tsPrev = ts
for i := len(timestamps) - 1; i >= 0; i-- {
intervals = append(intervals, float64(tsPrev-timestamps[i]))
tsPrev = timestamps[i]
}
scrapeInterval := int64(quantile(0.6, intervals))
a.A = intervals
@@ -2107,9 +2113,15 @@ func rollupChanges(rfa *rollupFuncArg) float64 {
if len(values) == 0 {
return nan
}
prevValue = values[0]
values = values[1:]
n++
// Assume that the value didn't change during the current gap
// if realPrevValue exists.
if !math.IsNaN(rfa.realPrevValue) {
prevValue = rfa.realPrevValue
} else {
n++
prevValue = values[0]
values = values[1:]
}
}
for _, v := range values {
if v != prevValue {

View File

@@ -83,9 +83,11 @@ func checkRollupResultCacheReset() {
const checkRollupResultCacheResetInterval = 5 * time.Second
var needRollupResultCacheReset atomic.Bool
var checkRollupResultCacheResetOnce sync.Once
var rollupResultResetMetricRowSample atomic.Pointer[storage.MetricRow]
var (
needRollupResultCacheReset atomic.Bool
checkRollupResultCacheResetOnce sync.Once
rollupResultResetMetricRowSample atomic.Pointer[storage.MetricRow]
)
var rollupResultCacheV = &rollupResultCache{
c: workingsetcache.New(1024 * 1024), // This is a cache for testing.
@@ -132,7 +134,7 @@ func InitRollupResultCache(cachePath string) {
c = workingsetcache.New(cacheSize)
rollupResultCacheKeyPrefix.Store(newRollupResultCacheKeyPrefix())
}
if *disableCache {
if *disableCache && len(rollupResultCachePath) > 0 && !*resetRollupResultCacheOnStartup {
c.Reset()
}
@@ -178,6 +180,12 @@ func InitRollupResultCache(cachePath string) {
rollupResultCacheV = &rollupResultCache{
c: c,
rollupResultCacheRequests: metrics.GetOrCreateCounter(`vm_rollup_result_cache_requests_total`),
rollupResultCacheFullHits: metrics.GetOrCreateCounter(`vm_rollup_result_cache_full_hits_total`),
rollupResultCachePartialHits: metrics.GetOrCreateCounter(`vm_rollup_result_cache_partial_hits_total`),
rollupResultCacheMisses: metrics.GetOrCreateCounter(`vm_rollup_result_cache_miss_total`),
rollupResultCacheResets: metrics.GetOrCreateCounter(`vm_rollup_result_cache_resets_total`),
}
}
@@ -193,13 +201,18 @@ func StopRollupResultCache() {
type rollupResultCache struct {
c *workingsetcache.Cache
}
var rollupResultCacheResets = metrics.NewCounter(`vm_cache_resets_total{type="promql/rollupResult"}`)
rollupResultCacheRequests *metrics.Counter
rollupResultCacheFullHits *metrics.Counter
rollupResultCachePartialHits *metrics.Counter
rollupResultCacheMisses *metrics.Counter
rollupResultCacheResets *metrics.Counter
}
// ResetRollupResultCache resets rollup result cache.
func ResetRollupResultCache() {
rollupResultCacheResets.Inc()
rollupResultCacheV.rollupResultCacheResets.Inc()
rollupResultCacheKeyPrefix.Add(1)
logger.Infof("rollupResult cache has been cleared")
}

View File

@@ -232,6 +232,7 @@ func testRollupFunc(t *testing.T, funcName string, args []any, vExpected float64
}
var rfa rollupFuncArg
rfa.prevValue = nan
rfa.realPrevValue = nan
rfa.prevTimestamp = 0
rfa.values = append(rfa.values, testValues...)
rfa.timestamps = append(rfa.timestamps, testTimestamps...)
@@ -1654,7 +1655,7 @@ func TestRollupDeltaWithStaleness(t *testing.T) {
rc.Timestamps = rc.getTimestamps()
gotValues, samplesScanned := rc.Do(nil, values, timestamps)
if samplesScanned != 7 {
t.Fatalf("expecting 8 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
t.Fatalf("expecting 7 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{1, 0}
timestampsExpected := []int64{0, 45e3}
@@ -1674,7 +1675,7 @@ func TestRollupDeltaWithStaleness(t *testing.T) {
rc.Timestamps = rc.getTimestamps()
gotValues, samplesScanned := rc.Do(nil, values, timestamps)
if samplesScanned != 7 {
t.Fatalf("expecting 8 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
t.Fatalf("expecting 7 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{1, 0}
timestampsExpected := []int64{0, 45e3}
@@ -1794,7 +1795,7 @@ func TestRollupIncreasePureWithStaleness(t *testing.T) {
rc.Timestamps = rc.getTimestamps()
gotValues, samplesScanned := rc.Do(nil, values, timestamps)
if samplesScanned != 7 {
t.Fatalf("expecting 8 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
t.Fatalf("expecting 7 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{1, 0}
timestampsExpected := []int64{0, 45e3}
@@ -1814,7 +1815,7 @@ func TestRollupIncreasePureWithStaleness(t *testing.T) {
rc.Timestamps = rc.getTimestamps()
gotValues, samplesScanned := rc.Do(nil, values, timestamps)
if samplesScanned != 7 {
t.Fatalf("expecting 8 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
t.Fatalf("expecting 7 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{1, 0}
timestampsExpected := []int64{0, 45e3}
@@ -1888,3 +1889,126 @@ func TestRollupIncreasePureWithStaleness(t *testing.T) {
testRowsEqual(t, gotValues, rc.Timestamps, valuesExpected, timestampsExpected)
})
}
func TestRollupChangesWithStaleness(t *testing.T) {
// there is a gap between samples in the dataset below
timestamps := []int64{0, 15000, 30000, 70000}
values := []float64{1, 1, 1, 1}
// if step > gap, then changes will always respect value before gap
t.Run("step>gap", func(t *testing.T) {
rc := rollupConfig{
Func: rollupChanges,
Start: 0,
End: 70000,
Step: 45000,
Window: 0,
MaxPointsPerSeries: 1e4,
}
rc.Timestamps = rc.getTimestamps()
gotValues, samplesScanned := rc.Do(nil, values, timestamps)
if samplesScanned != 7 {
t.Fatalf("expecting 7 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{1, 0}
timestampsExpected := []int64{0, 45e3}
testRowsEqual(t, gotValues, rc.Timestamps, valuesExpected, timestampsExpected)
})
// even if LookbackDelta < gap
t.Run("step>gap;LookbackDelta<gap", func(t *testing.T) {
rc := rollupConfig{
Func: rollupChanges,
Start: 0,
End: 70000,
Step: 45000,
LookbackDelta: 10e3,
Window: 0,
MaxPointsPerSeries: 1e4,
}
rc.Timestamps = rc.getTimestamps()
gotValues, samplesScanned := rc.Do(nil, values, timestamps)
if samplesScanned != 7 {
t.Fatalf("expecting 7 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{1, 0}
timestampsExpected := []int64{0, 45e3}
testRowsEqual(t, gotValues, rc.Timestamps, valuesExpected, timestampsExpected)
})
// if step < gap and LookbackDelta>0 then changes will respect value before gap
// only if it is not stale according to LookbackDelta
t.Run("step<gap;LookbackDelta>0", func(t *testing.T) {
rc := rollupConfig{
Func: rollupChanges,
Start: 0,
End: 70000,
Step: 10000,
Window: 0,
MaxPointsPerSeries: 1e4,
LookbackDelta: 30e3,
}
rc.Timestamps = rc.getTimestamps()
gotValues, samplesScanned := rc.Do(nil, values, timestamps)
if samplesScanned != 8 {
t.Fatalf("expecting 8 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{1, 0, 0, 0, 0, 0, 0, 1}
timestampsExpected := []int64{0, 10e3, 20e3, 30e3, 40e3, 50e3, 60e3, 70e3}
testRowsEqual(t, gotValues, rc.Timestamps, valuesExpected, timestampsExpected)
})
// there is a staleness marker between samples in the dataset below
timestamps = []int64{0, 10000, 20000, 30000, 40000}
values = []float64{1, 1, 1, decimal.StaleNaN, 1}
t.Run("staleness marker", func(t *testing.T) {
rc := rollupConfig{
Func: rollupChanges,
Start: 0,
End: 40000,
Step: 10000,
Window: 0,
MaxPointsPerSeries: 1e4,
}
rc.Timestamps = rc.getTimestamps()
gotValues, samplesScanned := rc.Do(nil, values, timestamps)
if samplesScanned != 10 {
t.Fatalf("expecting 10 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{1, 0, 0, 1, 1}
timestampsExpected := []int64{0, 10e3, 20e3, 30e3, 40e3}
testRowsEqual(t, gotValues, rc.Timestamps, valuesExpected, timestampsExpected)
})
// https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10280
//
// When there are gaps between samples that exceed maxPrevInterval,
// either due to changes in the scrape interval or missing scrapes.
// For example, if the scrape interval was initially 30s and later changed to 10s,
// the auto-calculated scrape interval is 10s, with maxPrevInterval inflated to 15s.
//
// At t=30s:
// prevValue is NaN, as the last sample at t=0s is considered stale for t=30s given the maxPrevInterval.
// realPrevValue is 1, taken from t=0s, since LookbackDelta=0 ignores staleness.
// the result should be `changes(1, 1) -> 0` instead of `changes(1, NaN)`.
// At t=100s:
// preValue is also NaN, as the last sample at t=70s is considered stale for t=100s.
// realPrevValue is 1, taken from t=70s,
// result should be `changes(2, 1) -> 1`.
timestamps = []int64{0, 30000, 40000, 50000, 60000, 70000, 100000}
values = []float64{1, 1, 1, 1, 1, 1, 2}
t.Run("issue-10280", func(t *testing.T) {
rc := rollupConfig{
Func: rollupChanges,
Start: 0,
End: 100e3,
Step: 10e3,
MaxPointsPerSeries: 1e4,
}
rc.Timestamps = rc.getTimestamps()
gotValues, _ := rc.Do(nil, values, timestamps)
valuesExpected := []float64{1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1}
timestampsExpected := []int64{0, 10e3, 20e3, 30e3, 40e3, 50e3, 60e3, 70e3, 80e3, 90e3, 100e3}
testRowsEqual(t, gotValues, rc.Timestamps, valuesExpected, timestampsExpected)
})
}

View File

@@ -2,6 +2,7 @@ package promql
import (
"fmt"
"math/rand"
"reflect"
"strconv"
"strings"
@@ -280,6 +281,87 @@ func timeseriesToPromMetrics(tss []*timeseries) string {
return strings.Join(a, "\n")
}
func TestTransformFuncSort(t *testing.T) {
f := func(isDesc bool, metrics, expectedMetrics string) {
t.Helper()
tss := promMetricsToTimeseries(metrics)
// Input tss order is not stable in VictoriaMetrics
// Shuffle tss to reflect that
// Commenting out the shuffle to make the test stable
rand.Shuffle(len(tss), func(i, j int) {
tss[i], tss[j] = tss[j], tss[i]
})
sortFunc := newTransformFuncSort(isDesc)
sorted, err := sortFunc(&transformFuncArg{
args: [][]*timeseries{tss},
})
if err != nil {
t.Fatalf("sort failed: %s", err)
}
result := timeseriesToPromMetrics(sorted)
if result != expectedMetrics {
t.Fatalf("unexpected sort result:\ngot\n%s\nwant\n%s", result, expectedMetrics)
}
}
// Test asc sort with different values
f(
false,
`foo{label="a"} 3 123
foo{label="b"} 2 123
foo{label="c"} 1 123`,
`foo{label="c"} 1 123
foo{label="b"} 2 123
foo{label="a"} 3 123`,
)
// Test desc sort with different values
f(
true,
`foo{label="a"} 3 123
foo{label="b"} 2 123
foo{label="c"} 1 123`,
`foo{label="a"} 3 123
foo{label="b"} 2 123
foo{label="c"} 1 123`,
)
// Test asc sort with mixed values
f(
false,
`foo{label="a"} 1 123
foo{label="b"} 1 123
foo{label="c"} 2 123
foo{label="d"} 2 123
foo{label="e"} 3 123
`,
`foo{label="a"} 1 123
foo{label="b"} 1 123
foo{label="c"} 2 123
foo{label="d"} 2 123
foo{label="e"} 3 123`,
)
// Test desc sort with mixed values
f(
true,
`foo{label="a"} 1 123
foo{label="b"} 1 123
foo{label="c"} 2 123
foo{label="d"} 2 123
foo{label="e"} 3 123`,
`foo{label="e"} 3 123
foo{label="c"} 2 123
foo{label="d"} 2 123
foo{label="a"} 1 123
foo{label="b"} 1 123`,
)
}
func TestGetNumPrefix(t *testing.T) {
f := func(s, prefixExpected string) {
t.Helper()

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -37,10 +37,10 @@
<meta property="og:title" content="UI for VictoriaMetrics">
<meta property="og:url" content="https://victoriametrics.com/">
<meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data">
<script type="module" crossorigin src="./assets/index-zpalCSif.js"></script>
<link rel="modulepreload" crossorigin href="./assets/vendor-DY9kCvzk.js">
<script type="module" crossorigin src="./assets/index-B6lol36n.js"></script>
<link rel="modulepreload" crossorigin href="./assets/vendor-EZef-S_8.js">
<link rel="stylesheet" crossorigin href="./assets/vendor-D1GxaB_c.css">
<link rel="stylesheet" crossorigin href="./assets/index-CBxdwuZH.css">
<link rel="stylesheet" crossorigin href="./assets/index-VQRcNK83.css">
</head>
<body>
<noscript>You need to enable JavaScript to run this app.</noscript>

View File

@@ -22,13 +22,15 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/querytracer"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/metricsmetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/stringsutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/syncwg"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeutil"
)
var (
retentionPeriod = flagutil.NewRetentionDuration("retentionPeriod", "1", "Data with timestamps outside the retentionPeriod is automatically deleted. The minimum retentionPeriod is 24h or 1d. See also -retentionFilter")
retentionPeriod = flagutil.NewRetentionDuration("retentionPeriod", "1M", "Data with timestamps outside the retentionPeriod is automatically deleted. The minimum retentionPeriod is 24h or 1d. "+
"See https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#retention. See also -retentionFilter")
snapshotAuthKey = flagutil.NewPassword("snapshotAuthKey", "authKey, which must be passed in query string to /snapshot* pages. It overrides -httpAuth.*")
forceMergeAuthKey = flagutil.NewPassword("forceMergeAuthKey", "authKey, which must be passed in query string to /internal/force_merge pages. It overrides -httpAuth.*")
forceFlushAuthKey = flagutil.NewPassword("forceFlushAuthKey", "authKey, which must be passed in query string to /internal/force_flush pages. It overrides -httpAuth.*")
@@ -90,6 +92,9 @@ var (
"In most cases, this value should not be changed. The maximum allowed value is 23h.")
logNewSeriesAuthKey = flagutil.NewPassword("logNewSeriesAuthKey", "authKey, which must be passed in query string to /internal/log_new_series. It overrides -httpAuth.*")
metadataStorageSize = flagutil.NewBytes("storage.maxMetadataStorageSize", 0, "Overrides max size for metrics metadata entries in-memory storage. "+
"If set to 0 or a negative value, defaults to 1% of allowed memory.")
)
// CheckTimeRange returns true if the given tr is denied for querying.
@@ -114,12 +119,13 @@ func Init(resetCacheIfNeeded func(mrs []storage.MetricRow)) {
}
resetResponseCacheIfNeeded = resetCacheIfNeeded
storage.SetRetentionTimezoneOffset(*retentionTimezoneOffset)
storage.LegacySetRetentionTimezoneOffset(*retentionTimezoneOffset)
storage.SetFreeDiskSpaceLimit(minFreeDiskSpaceBytes.N)
storage.SetTSIDCacheSize(cacheSizeStorageTSID.IntN())
storage.SetTagFiltersCacheSize(cacheSizeIndexDBTagFilters.IntN())
storage.SetMetricNamesStatsCacheSize(cacheSizeMetricNamesStats.IntN())
storage.SetMetricNameCacheSize(cacheSizeStorageMetricName.IntN())
storage.SetMetadataStorageSize(metadataStorageSize.IntN())
mergeset.SetIndexBlocksCacheSize(cacheSizeIndexDBIndexBlocks.IntN())
mergeset.SetDataBlocksCacheSize(cacheSizeIndexDBDataBlocks.IntN())
mergeset.SetDataBlocksSparseCacheSize(cacheSizeIndexDBDataBlocksSparse.IntN())
@@ -194,6 +200,19 @@ func AddRows(mrs []storage.MetricRow) error {
return nil
}
// AddMetadataRows adds mrs to the storage.
//
// The caller should limit the number of concurrent calls to AddMetadataRows() in order to limit memory usage.
func AddMetadataRows(mms []metricsmetadata.Row) error {
if Storage.IsReadOnly() {
return errReadOnly
}
WG.Add(1)
Storage.AddMetadataRows(mms)
WG.Done()
return nil
}
var errReadOnly = errors.New("the storage is in read-only mode; check -storage.minFreeDiskSpaceBytes command-line flag value")
// RegisterMetricNames registers all the metrics from mrs in the storage.
@@ -370,11 +389,23 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/create":
snapshotsCreateTotal.Inc()
w.Header().Set("Content-Type", "application/json")
snapshotPath := Storage.MustCreateSnapshot()
snapshotName := Storage.MustCreateSnapshot()
// Verify whether the client already closed the connection.
// In this case it is better to drop the created snapshot, since the client isn't interested in it.
if err := r.Context().Err(); err != nil {
logger.Infof("deleting already created snapshot at %s because the client canceled the request", snapshotName)
if err := deleteSnapshot(snapshotName); err != nil {
logger.Infof("cannot delete just created snapshot: %s", err)
return true
}
return true
}
if prometheusCompatibleResponse {
fmt.Fprintf(w, `{"status":"success","data":{"name":%s}}`, stringsutil.JSONString(snapshotPath))
fmt.Fprintf(w, `{"status":"success","data":{"name":%s}}`, stringsutil.JSONString(snapshotName))
} else {
fmt.Fprintf(w, `{"status":"ok","snapshot":%s}`, stringsutil.JSONString(snapshotPath))
fmt.Fprintf(w, `{"status":"ok","snapshot":%s}`, stringsutil.JSONString(snapshotName))
}
return true
case "/list":
@@ -394,23 +425,12 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
snapshotsDeleteTotal.Inc()
w.Header().Set("Content-Type", "application/json")
snapshotName := r.FormValue("snapshot")
snapshots := Storage.MustListSnapshots()
for _, snName := range snapshots {
if snName == snapshotName {
if err := Storage.DeleteSnapshot(snName); err != nil {
err = fmt.Errorf("cannot delete snapshot %q: %w", snName, err)
jsonResponseError(w, err)
snapshotsDeleteErrorsTotal.Inc()
return true
}
fmt.Fprintf(w, `{"status":"ok"}`)
return true
}
if err := deleteSnapshot(snapshotName); err != nil {
jsonResponseError(w, err)
snapshotsDeleteErrorsTotal.Inc()
return true
}
err := fmt.Errorf("cannot find snapshot %q", snapshotName)
jsonResponseError(w, err)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "/delete_all":
snapshotsDeleteAllTotal.Inc()
@@ -431,15 +451,26 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
}
func deleteSnapshot(snapshotName string) error {
snapshots := Storage.MustListSnapshots()
for _, snName := range snapshots {
if snName == snapshotName {
if err := Storage.DeleteSnapshot(snName); err != nil {
return fmt.Errorf("cannot delete snapshot %q: %w", snName, err)
}
return nil
}
}
return fmt.Errorf("cannot find snapshot %q", snapshotName)
}
func initStaleSnapshotsRemover(strg *storage.Storage) {
staleSnapshotsRemoverCh = make(chan struct{})
if snapshotsMaxAge.Duration() <= 0 {
return
}
snapshotsMaxAgeDur := snapshotsMaxAge.Duration()
staleSnapshotsRemoverWG.Add(1)
go func() {
defer staleSnapshotsRemoverWG.Done()
staleSnapshotsRemoverWG.Go(func() {
d := timeutil.AddJitterToDuration(time.Second * 11)
t := time.NewTicker(d)
defer t.Stop()
@@ -451,7 +482,7 @@ func initStaleSnapshotsRemover(strg *storage.Storage) {
}
strg.MustDeleteStaleSnapshots(snapshotsMaxAgeDur)
}
}()
})
}
func stopStaleSnapshotsRemover() {
@@ -482,7 +513,7 @@ func writeStorageMetrics(w io.Writer, strg *storage.Storage) {
var m storage.Metrics
strg.UpdateMetrics(&m)
tm := &m.TableMetrics
idbm := &m.IndexDBMetrics
idbm := &m.TableMetrics.IndexDBMetrics
metrics.WriteGaugeUint64(w, fmt.Sprintf(`vm_free_disk_space_bytes{path=%q}`, *DataPath), fs.MustGetFreeSpace(*DataPath))
metrics.WriteGaugeUint64(w, fmt.Sprintf(`vm_free_disk_space_limit_bytes{path=%q}`, *DataPath), uint64(minFreeDiskSpaceBytes.N))
@@ -610,75 +641,82 @@ func writeStorageMetrics(w io.Writer, strg *storage.Storage) {
metrics.WriteCounterUint64(w, `vm_missing_metric_names_for_metric_id_total`, idbm.MissingMetricNamesForMetricID)
metrics.WriteCounterUint64(w, `vm_date_metric_id_cache_syncs_total`, m.DateMetricIDCacheSyncsCount)
metrics.WriteCounterUint64(w, `vm_date_metric_id_cache_resets_total`, m.DateMetricIDCacheResetsCount)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/indexBlocks"}`, tm.IndexBlocksCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/tsid"}`, m.TSIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/metricIDs"}`, m.MetricIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/metricName"}`, m.MetricNameCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/date_metricID"}`, m.DateMetricIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/hour_metric_ids"}`, m.HourMetricIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/next_day_metric_ids"}`, m.NextDayMetricIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/indexBlocks"}`, tm.IndexBlocksCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/regexps"}`, uint64(storage.RegexpCacheSize()))
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/regexpPrefixes"}`, uint64(storage.RegexpPrefixesCacheSize()))
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="indexdb/dataBlocks"}`, idbm.DataBlocksCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="indexdb/dataBlocksSparse"}`, idbm.DataBlocksSparseCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="indexdb/indexBlocks"}`, idbm.IndexBlocksCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="indexdb/metricID"}`, idbm.MetricIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="indexdb/date_metricID"}`, idbm.DateMetricIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/regexps"}`, uint64(storage.RegexpCacheSize()))
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/regexpPrefixes"}`, uint64(storage.RegexpPrefixesCacheSize()))
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/indexBlocks"}`, tm.IndexBlocksCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/tsid"}`, m.TSIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/metricIDs"}`, m.MetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/metricName"}`, m.MetricNameCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/indexBlocks"}`, tm.IndexBlocksCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/hour_metric_ids"}`, m.HourMetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/next_day_metric_ids"}`, m.NextDayMetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/regexps"}`, storage.RegexpCacheSizeBytes())
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/regexpPrefixes"}`, storage.RegexpPrefixesCacheSizeBytes())
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="indexdb/metricID"}`, idbm.MetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="indexdb/date_metricID"}`, idbm.DateMetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="indexdb/dataBlocks"}`, idbm.DataBlocksCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="indexdb/dataBlocksSparse"}`, idbm.DataBlocksSparseCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="indexdb/indexBlocks"}`, idbm.IndexBlocksCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/date_metricID"}`, m.DateMetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/hour_metric_ids"}`, m.HourMetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/next_day_metric_ids"}`, m.NextDayMetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/regexps"}`, uint64(storage.RegexpCacheSizeBytes()))
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/regexpPrefixes"}`, uint64(storage.RegexpPrefixesCacheSizeBytes()))
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/indexBlocks"}`, tm.IndexBlocksCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/tsid"}`, m.TSIDCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/metricIDs"}`, m.MetricIDCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/metricName"}`, m.MetricNameCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/indexBlocks"}`, tm.IndexBlocksCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/regexps"}`, storage.RegexpCacheMaxSizeBytes())
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/regexpPrefixes"}`, storage.RegexpPrefixesCacheMaxSizeBytes())
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="indexdb/dataBlocks"}`, idbm.DataBlocksCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="indexdb/dataBlocksSparse"}`, idbm.DataBlocksSparseCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="indexdb/indexBlocks"}`, idbm.IndexBlocksCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/regexps"}`, uint64(storage.RegexpCacheMaxSizeBytes()))
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/regexpPrefixes"}`, uint64(storage.RegexpPrefixesCacheMaxSizeBytes()))
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/indexBlocks"}`, tm.IndexBlocksCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/tsid"}`, m.TSIDCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/metricIDs"}`, m.MetricIDCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/metricName"}`, m.MetricNameCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/indexBlocks"}`, tm.IndexBlocksCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/regexps"}`, storage.RegexpCacheRequests())
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/regexpPrefixes"}`, storage.RegexpPrefixesCacheRequests())
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="indexdb/dataBlocks"}`, idbm.DataBlocksCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="indexdb/dataBlocksSparse"}`, idbm.DataBlocksSparseCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="indexdb/indexBlocks"}`, idbm.IndexBlocksCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/regexps"}`, storage.RegexpCacheRequests())
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/regexpPrefixes"}`, storage.RegexpPrefixesCacheRequests())
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/indexBlocks"}`, tm.IndexBlocksCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/tsid"}`, m.TSIDCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/metricIDs"}`, m.MetricIDCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/metricName"}`, m.MetricNameCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/indexBlocks"}`, tm.IndexBlocksCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/regexps"}`, storage.RegexpCacheMisses())
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/regexpPrefixes"}`, storage.RegexpPrefixesCacheMisses())
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="indexdb/dataBlocks"}`, idbm.DataBlocksCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="indexdb/dataBlocksSparse"}`, idbm.DataBlocksSparseCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="indexdb/indexBlocks"}`, idbm.IndexBlocksCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/regexps"}`, storage.RegexpCacheMisses())
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/regexpPrefixes"}`, storage.RegexpPrefixesCacheMisses())
metrics.WriteCounterUint64(w, `vm_deleted_metrics_total{type="indexdb"}`, m.DeletedMetricsCount)
metrics.WriteCounterUint64(w, `vm_cache_resets_total{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheResets)
metrics.WriteCounterUint64(w, `vm_cache_collisions_total{type="storage/tsid"}`, m.TSIDCacheCollisions)
metrics.WriteCounterUint64(w, `vm_cache_collisions_total{type="storage/metricName"}`, m.MetricNameCacheCollisions)
metrics.WriteCounterUint64(w, `vm_cache_syncs_total{type="indexdb/metricID"}`, idbm.MetricIDCacheSyncsCount)
metrics.WriteCounterUint64(w, `vm_cache_syncs_total{type="indexdb/date_metricID"}`, idbm.DateMetricIDCacheSyncsCount)
metrics.WriteCounterUint64(w, `vm_cache_rotations_total{type="indexdb/metricID"}`, idbm.MetricIDCacheRotationsCount)
metrics.WriteCounterUint64(w, `vm_cache_rotations_total{type="indexdb/date_metricID"}`, idbm.DateMetricIDCacheRotationsCount)
metrics.WriteCounterUint64(w, `vm_deleted_metrics_total{type="indexdb"}`, m.DeletedMetricsCount)
metrics.WriteGaugeUint64(w, `vm_next_retention_seconds`, m.NextRetentionSeconds)
if *trackMetricNamesStats {
@@ -689,6 +727,11 @@ func writeStorageMetrics(w io.Writer, strg *storage.Storage) {
metrics.WriteGaugeUint64(w, `vm_downsampling_partitions_scheduled`, tm.ScheduledDownsamplingPartitions)
metrics.WriteGaugeUint64(w, `vm_downsampling_partitions_scheduled_size_bytes`, tm.ScheduledDownsamplingPartitionsSize)
metrics.WriteGaugeUint64(w, `vm_metrics_metadata_storage_items`, m.MetadataStorageItemsCurrent)
metrics.WriteCounterUint64(w, `vm_metrics_metadata_storage_size_bytes`, m.MetadataStorageCurrentSizeBytes)
metrics.WriteCounterUint64(w, `vm_metrics_metadata_storage_max_size_bytes`, m.MetadataStorageMaxSizeBytes)
}
func jsonResponseError(w http.ResponseWriter, err error) {

View File

@@ -1,4 +1,4 @@
FROM golang:1.25.4 AS build-web-stage
FROM golang:1.25.6 AS build-web-stage
COPY build /build
WORKDIR /build

View File

@@ -1,26 +1,26 @@
# All these commands must run from repository root.
copy-metricsql-docs:
cp docs/victoriametrics/MetricsQL.md app/vmui/packages/vmui/src/assets/MetricsQL.md
vmui-package-base-image:
docker build -t vmui-builder-image -f app/vmui/Dockerfile-build ./app/vmui
vmui-build: copy-metricsql-docs vmui-package-base-image
vmui-run-npm-command: vmui-package-base-image
docker run --rm \
--user $(shell id -u):$(shell id -g) \
--mount type=bind,src="$(shell pwd)/app/vmui",dst=/build \
-w /build/packages/vmui \
--entrypoint=/bin/bash \
vmui-builder-image -c "npm install && npm run build"
vmui-builder-image -c "[ \"$$VMUI_SKIP_INSTALL\" = \"true\" ] || npm ci; $(NPM_COMMAND)"
vmui-anomaly-build: vmui-package-base-image
docker run --rm \
--user $(shell id -u):$(shell id -g) \
--mount type=bind,src="$(shell pwd)/app/vmui",dst=/build \
-w /build/packages/vmui \
--entrypoint=/bin/bash \
vmui-builder-image -c "npm install && npm run build:anomaly"
vmui-install:
NPM_COMMAND="true" $(MAKE) vmui-run-npm-command
vmui-package-base-image:
docker build -t vmui-builder-image -f app/vmui/Dockerfile-build ./app/vmui
vmui-build: copy-metricsql-docs
NPM_COMMAND="npm run build" $(MAKE) vmui-run-npm-command
vmui-release: vmui-build
docker build -t ${DOCKER_NAMESPACE}/vmui:latest -f app/vmui/Dockerfile-web ./app/vmui/packages/vmui
@@ -38,11 +38,11 @@ vmui-update: vmui-build
vmui-install-dependencies:
cd app/vmui/packages/vmui && npm ci
vmui-lint: vmui-install-dependencies
cd app/vmui/packages/vmui && npm run lint
vmui-lint:
NPM_COMMAND="npm run lint" $(MAKE) vmui-run-npm-command
vmui-typecheck: vmui-install-dependencies
cd app/vmui/packages/vmui && npm run typecheck
vmui-typecheck:
NPM_COMMAND="npm run typecheck" $(MAKE) vmui-run-npm-command
vmui-test: vmui-install-dependencies
cd app/vmui/packages/vmui && npm run test
vmui-test:
NPM_COMMAND="npm run test" $(MAKE) vmui-run-npm-command

View File

@@ -1 +0,0 @@
VITE_APP_TYPE=vmanomaly

View File

@@ -1,23 +0,0 @@
import { readFile } from "fs/promises";
import { IndexHtmlTransform } from "vite";
/**
* Vite plugin to dynamically load index.html based on the current mode.
* If a specific mode-based index file (e.g., index.vmanomaly.html) exists, it is used.
* Otherwise, the default index.html is loaded.
*/
export default function dynamicIndexHtmlPlugin({ mode }) {
return {
name: "vm-dynamic-index-html",
transformIndexHtml: {
order: "pre",
handler: async () => {
try {
return await readFile(`./index.${mode}.html`, "utf8");
} catch (error) {
return await readFile("./index.html", "utf8");
}
}
} as IndexHtmlTransform
};
}

View File

@@ -46,7 +46,7 @@ export default [...compat.extends(
settings: {
react: {
pragma: "React",
version: "detect",
version: "19.0",
},
linkComponents: ["Hyperlink", {
@@ -69,10 +69,11 @@ export default [...compat.extends(
"varsIgnorePattern": "^_",
"ignoreRestSiblings": true
}],
"unused-imports/no-unused-imports": "error",
"react/jsx-closing-bracket-location": [1, "line-aligned"],
"object-curly-spacing": [2, "always"],
"react/jsx-max-props-per-line": [1, {
maximum: 1,
@@ -81,13 +82,23 @@ export default [...compat.extends(
"react/jsx-first-prop-new-line": [1, "multiline"],
// Disable core indent rule due to recursion issues in ESLint 9; use JSX-specific rules instead
indent: "off",
indent: ["error", 2, {
SwitchCase: 1,
ignoredNodes: [
"JSXElement",
"JSXElement *",
"JSXFragment",
"JSXFragment *",
],
}],
"react/jsx-indent": ["error", 2],
"react/jsx-indent-props": ["error", 2],
"linebreak-style": ["error", "unix"],
quotes: ["error", "double"],
semi: ["error", "always"],
// Formatting rules moved out of ESLint core; omit here to avoid deprecation noise
"react/prop-types": 0,
"react/react-in-jsx-scope": "off",
},
}];

View File

@@ -1,54 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<link rel="icon" href="/favicon.svg" />
<link rel="apple-touch-icon" href="/favicon.svg" />
<link rel="mask-icon" href="/favicon.svg" color="#000000">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=5"/>
<meta name="theme-color" content="#000000"/>
<meta name="description" content="Detect anomalies in your metrics with VictoriaMetrics Anomaly Detection UI"/>
<!--
manifest.json provides metadata used when your web app is installed on a
user's mobile device or desktop. See https://developers.google.com/web/fundamentals/web-app-manifest/
-->
<link rel="manifest" href="/manifest.json" crossorigin="use-credentials"/>
<!--
Notice the use of in the tags above.
It will be replaced with the URL of the `public` folder during the build.
Only files inside the `public` folder can be referenced from the HTML.
Unlike "/favicon.ico" or "favicon.ico", "/favicon.ico" will
work correctly both with client-side routing and a non-root public URL.
Learn how to configure a non-root public URL by running `npm run build`.
-->
<title>UI for VictoriaMetrics Anomaly Detection</title>
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="UI for VictoriaMetrics Anomaly Detection">
<meta name="twitter:site" content="@https://victoriametrics.com/products/enterprise/anomaly-detection/">
<meta name="twitter:description" content="Detect anomalies in your metrics with VictoriaMetrics Anomaly Detection UI">
<meta name="twitter:image" content="/preview.jpg">
<meta property="og:type" content="website">
<meta property="og:title" content="UI for VictoriaMetrics Anomaly Detection">
<meta property="og:url" content="https://victoriametrics.com/products/enterprise/anomaly-detection/">
<meta property="og:description" content="Detect anomalies in your metrics with VictoriaMetrics Anomaly Detection UI">
</head>
<body>
<noscript>You need to enable JavaScript to run this app.</noscript>
<div id="root"></div>
<!--
This HTML file is a template.
If you open it directly in the browser, you will see an empty page.
You can add webfonts, meta tags, or analytics to this file.
The build step will place the bundled scripts into the <body> tag.
To begin the development, run `npm start` or `yarn start`.
To create a production bundle, use `npm run build` or `yarn build`.
-->
<script type="module" src="/src/index.tsx"></script>
</body>
</html>

File diff suppressed because it is too large Load Diff

View File

@@ -7,10 +7,8 @@
"scripts": {
"prestart": "npm run copy-metricsql-docs",
"start": "vite",
"start:playground": "cross-env PLAYGROUND=METRICS npm run start",
"start:anomaly": "vite --mode vmanomaly",
"start:playground": "cross-env PLAYGROUND=true npm run start",
"build": "vite build",
"build:anomaly": "vite build --mode vmanomaly",
"lint": "eslint --output-file vmui-lint-report.json --format json 'src/**/*.{ts,tsx}'",
"lint:local": "eslint --ext .ts,.tsx -f stylish src",
"lint:fix": "eslint 'src/**/*.{ts,tsx}' --fix",
@@ -18,47 +16,48 @@
"preview": "vite preview",
"typecheck": "tsc --noEmit",
"test": "vitest run",
"test:dev": "vitest"
"test:dev": "vitest",
"precommit": "npm run lint:local && npm run typecheck && npm run test"
},
"dependencies": {
"classnames": "^2.5.1",
"dayjs": "^1.11.13",
"dayjs": "^1.11.19",
"lodash.debounce": "^4.0.8",
"marked": "^16.0.0",
"preact": "^10.26.9",
"qs": "^6.14.0",
"marked": "^17.0.1",
"preact": "^10.28.2",
"qs": "^6.14.1",
"react-input-mask": "^2.0.4",
"react-router-dom": "^7.6.3",
"react-router-dom": "^7.12.0",
"uplot": "^1.6.32",
"vite": "^7.1.11",
"web-vitals": "^5.0.3"
"vite": "^7.3.1",
"web-vitals": "^5.1.0"
},
"devDependencies": {
"@eslint/eslintrc": "^3.3.1",
"@eslint/js": "^9.30.1",
"@eslint/eslintrc": "^3.3.3",
"@eslint/js": "^9.39.2",
"@preact/preset-vite": "^2.10.2",
"@testing-library/jest-dom": "^6.6.3",
"@testing-library/jest-dom": "^6.9.1",
"@testing-library/preact": "^3.2.4",
"@types/lodash.debounce": "^4.0.9",
"@types/node": "^24.0.12",
"@types/node": "^25.0.8",
"@types/qs": "^6.14.0",
"@types/react": "^19.1.8",
"@types/react": "^19.2.8",
"@types/react-input-mask": "^3.0.6",
"@types/react-router-dom": "^5.3.3",
"@typescript-eslint/eslint-plugin": "^8.36.0",
"@typescript-eslint/parser": "^8.36.0",
"cross-env": "^7.0.3",
"eslint": "^9.30.1",
"@typescript-eslint/eslint-plugin": "^8.53.0",
"@typescript-eslint/parser": "^8.53.0",
"cross-env": "^10.1.0",
"eslint": "^9.39.2",
"eslint-plugin-react": "^7.37.5",
"eslint-plugin-unused-imports": "^4.1.4",
"globals": "^16.3.0",
"eslint-plugin-unused-imports": "^4.3.0",
"globals": "^17.0.0",
"http-proxy-middleware": "^3.0.5",
"jsdom": "^26.1.0",
"jsdom": "^27.4.0",
"postcss": "^8.5.6",
"rollup-plugin-visualizer": "^6.0.3",
"sass-embedded": "^1.89.2",
"typescript": "^5.8.3",
"vitest": "^3.2.4"
"rollup-plugin-visualizer": "^6.0.5",
"sass-embedded": "^1.97.2",
"typescript": "^5.9.3",
"vitest": "^4.0.17"
},
"browserslist": {
"production": [

View File

@@ -1,41 +0,0 @@
import { FC, useState } from "preact/compat";
import { HashRouter, Route, Routes } from "react-router-dom";
import AppContextProvider from "./contexts/AppContextProvider";
import ThemeProvider from "./components/Main/ThemeProvider/ThemeProvider";
import AnomalyLayout from "./layouts/AnomalyLayout/AnomalyLayout";
import ExploreAnomaly from "./pages/ExploreAnomaly/ExploreAnomaly";
import router from "./router";
import CustomPanel from "./pages/CustomPanel";
const AppAnomaly: FC = () => {
const [loadedTheme, setLoadedTheme] = useState(false);
return <>
<HashRouter>
<AppContextProvider>
<>
<ThemeProvider onLoaded={setLoadedTheme}/>
{loadedTheme && (
<Routes>
<Route
path={"/"}
element={<AnomalyLayout/>}
>
<Route
path={"/"}
element={<ExploreAnomaly/>}
/>
<Route
path={router.query}
element={<CustomPanel/>}
/>
</Route>
</Routes>
)}
</>
</AppContextProvider>
</HashRouter>
</>;
};
export default AppAnomaly;

View File

@@ -20,6 +20,7 @@ export interface ChartTooltipProps {
info?: ReactNode;
marker?: string;
show?: boolean;
duplicateCount?: number;
onClose?: (id: string) => void;
}
@@ -35,6 +36,7 @@ const ChartTooltip: FC<ChartTooltipProps> = ({
statsFormatted,
isSticky,
marker,
duplicateCount = 0,
onClose
}) => {
const tooltipRef = useRef<HTMLDivElement>(null);
@@ -156,6 +158,7 @@ const ChartTooltip: FC<ChartTooltipProps> = ({
<p className="vm-chart-tooltip-data__value">
<b>{value}</b>{unit}
</p>
{duplicateCount > 1 && <p>(overlapping points: {duplicateCount})</p>}
</div>
{statsFormatted && (
<table className="vm-chart-tooltip-stats">

View File

@@ -14,12 +14,11 @@ export type QueryGroup = {
interface LegendProps {
labels: LegendItemType[];
query: string[];
isAnomalyView?: boolean;
isPredefinedPanel?: boolean;
onChange: (item: LegendItemType, metaKey: boolean) => void;
}
const Legend: FC<LegendProps> = ({ labels, query, isAnomalyView, isPredefinedPanel, onChange }) => {
const Legend: FC<LegendProps> = ({ labels, query, isPredefinedPanel, onChange }) => {
const { groupByLabel } = useLegendGroup();
const groupSeries = useGroupSeries({ labels, query, groupByLabel });
@@ -33,7 +32,6 @@ const Legend: FC<LegendProps> = ({ labels, query, isAnomalyView, isPredefinedPan
key={group}
labels={items}
group={group}
isAnomalyView={isAnomalyView}
onChange={onChange}
/>
))}

View File

@@ -8,11 +8,11 @@ import { useHideDuplicateFields } from "./hooks/useHideDuplicateFields";
import Accordion from "../../../Main/Accordion/Accordion";
import { useLegendGroup } from "./hooks/useLegendGroup";
import useCopyToClipboard from "../../../../hooks/useCopyToClipboard";
import { DEFAULT_MAX_SERIES } from "../../../../constants/graph";
import { LEGEND_COLLAPSE_SERIES_LIMIT } from "../../../../constants/graph";
import { getFromStorage } from "../../../../utils/storage";
export type LegendProps = {
labels: LegendItemType[];
isAnomalyView?: boolean;
duplicateFields?: string[];
onChange: (item: LegendItemType, metaKey: boolean) => void;
}
@@ -21,7 +21,7 @@ interface LegendGroupProps extends LegendProps {
group: string | number;
}
const LegendGroup: FC<LegendGroupProps> = ({ labels, group, isAnomalyView, onChange }) => {
const LegendGroup: FC<LegendGroupProps> = ({ labels, group, onChange }) => {
const { isTableView } = useLegendView();
const { groupByLabel } = useLegendGroup();
const copyToClipboard = useCopyToClipboard();
@@ -38,17 +38,26 @@ const LegendGroup: FC<LegendGroupProps> = ({ labels, group, isAnomalyView, onCha
const Content = isTableView ? LegendTable : LegendLines;
const disableAutoCollapse = getFromStorage("LEGEND_AUTO_COLLAPSE") === "false";
const defaultExpanded = disableAutoCollapse ? true : sortedLabels.length <= LEGEND_COLLAPSE_SERIES_LIMIT;
const expandedWarning = (
<span className="vm-legend-group-header__warning">
Legend collapsed by default ({sortedLabels.length} series) click to expand.
</span>
);
return (
<div
className="vm-legend-group"
key={group}
>
<Accordion
defaultExpanded={sortedLabels.length < DEFAULT_MAX_SERIES.chart}
defaultExpanded={defaultExpanded}
title={(
<div className="vm-legend-group-header">
<div className="vm-legend-group-header-title">
Group by{groupByLabel ? "" : " query"}: <b>{group}</b>
Group by{groupByLabel ? "" : " query"}: <b>{group}</b> {!defaultExpanded && expandedWarning}
</div>
{!!duplicateFields.length && (
<div className="vm-legend-group-header-labels">
@@ -71,7 +80,6 @@ const LegendGroup: FC<LegendGroupProps> = ({ labels, group, isAnomalyView, onCha
>
<Content
labels={sortedLabels}
isAnomalyView={isAnomalyView}
duplicateFields={duplicateFields}
onChange={onChange}
/>

View File

@@ -13,11 +13,10 @@ import { getLabelAlias } from "../../../../../utils/metric";
interface LegendItemProps {
legend: LegendItemType;
onChange?: (item: LegendItemType, metaKey: boolean) => void;
isAnomalyView?: boolean;
duplicateFields?: string[];
}
const LegendItem: FC<LegendItemProps> = ({ legend, onChange, duplicateFields, isAnomalyView }) => {
const LegendItem: FC<LegendItemProps> = ({ legend, onChange, duplicateFields }) => {
const copyToClipboard = useCopyToClipboard();
const { hideStats } = useShowStats();
@@ -52,12 +51,10 @@ const LegendItem: FC<LegendItemProps> = ({ legend, onChange, duplicateFields, is
})}
onClick={createHandlerClick(legend)}
>
{!isAnomalyView && (
<div
className="vm-legend-item__marker"
style={{ backgroundColor: legend.color }}
/>
)}
<div
className="vm-legend-item__marker"
style={{ backgroundColor: legend.color }}
/>
<div className="vm-legend-item-info">
<span className="vm-legend-item-info__label">
{legend.hasAlias && legend.label}

View File

@@ -2,7 +2,7 @@ import { FC } from "preact/compat";
import LegendItem from "../LegendItem/LegendItem";
import { LegendProps } from "../LegendGroup";
const LegendLines: FC<LegendProps> = ({ labels, isAnomalyView, duplicateFields, onChange }) => {
const LegendLines: FC<LegendProps> = ({ labels, duplicateFields, onChange }) => {
return (
<div className="vm-legend-item-container">
@@ -10,7 +10,6 @@ const LegendLines: FC<LegendProps> = ({ labels, isAnomalyView, duplicateFields,
<LegendItem
key={legendItem.label}
legend={legendItem}
isAnomalyView={isAnomalyView}
duplicateFields={duplicateFields}
onChange={onChange}
/>

View File

@@ -32,6 +32,14 @@
}
}
&__warning {
flex-grow: 1;
text-align: right;
padding-right: calc($padding-large * 2);
font-size: $font-size-small;
color: $color-warning;
}
&-labels {
display: flex;
flex-wrap: wrap;

View File

@@ -1,82 +0,0 @@
import { FC, useMemo } from "preact/compat";
import { ForecastType, SeriesItem } from "../../../../types";
import { anomalyColors } from "../../../../utils/color";
import "./style.scss";
type Props = {
series: SeriesItem[];
};
const titles: Partial<Record<ForecastType, string>> = {
[ForecastType.yhat]: "yhat",
[ForecastType.yhatLower]: "yhat_upper - yhat_lower",
[ForecastType.yhatUpper]: "yhat_upper - yhat_lower",
[ForecastType.anomaly]: "anomalies",
[ForecastType.training]: "training data",
[ForecastType.actual]: "y"
};
const LegendAnomaly: FC<Props> = ({ series }) => {
const uniqSeriesStyles = useMemo(() => {
const uniqSeries = series.reduce((accumulator, currentSeries) => {
const hasForecast = Object.prototype.hasOwnProperty.call(currentSeries, "forecast");
const isNotUpper = currentSeries.forecast !== ForecastType.yhatUpper;
const isUniqForecast = !accumulator.find(s => s.forecast === currentSeries.forecast);
if (hasForecast && isUniqForecast && isNotUpper) {
accumulator.push(currentSeries);
}
return accumulator;
}, [] as SeriesItem[]);
const trainingSeries = {
...uniqSeries[0],
forecast: ForecastType.training,
color: anomalyColors[ForecastType.training],
};
uniqSeries.splice(1, 0, trainingSeries);
return uniqSeries.map(s => ({
...s,
color: typeof s.stroke === "string" ? s.stroke : anomalyColors[s.forecast || ForecastType.actual],
}));
}, [series]);
return <>
<div className="vm-legend-anomaly">
{/* TODO: remove .filter() after the correct training data has been added */}
{uniqSeriesStyles.filter(f => f.forecast !== ForecastType.training).map((s, i) => (
<div
key={`${i}_${s.forecast}`}
className="vm-legend-anomaly-item"
>
<svg>
{s.forecast === ForecastType.anomaly ? (
<circle
cx="15"
cy="7"
r="4"
fill={s.color}
stroke={s.color}
strokeWidth="1.4"
/>
) : (
<line
x1="0"
y1="7"
x2="30"
y2="7"
stroke={s.color}
strokeWidth={s.width || 1}
strokeDasharray={s.dash?.join(",")}
/>
)}
</svg>
<div className="vm-legend-anomaly-item__title">{titles[s.forecast || ForecastType.actual]}</div>
</div>
))}
</div>
</>;
};
export default LegendAnomaly;

View File

@@ -1,23 +0,0 @@
@use "src/styles/variables" as *;
.vm-legend-anomaly {
position: relative;
display: flex;
align-items: center;
justify-content: center;
flex-wrap: wrap;
gap: calc($padding-large * 2);
cursor: default;
&-item {
display: flex;
align-items: center;
justify-content: center;
gap: $padding-small;
svg {
width: 30px;
height: 14px;
}
}
}

View File

@@ -13,7 +13,6 @@ import {
getRangeY,
getScales,
handleDestroy,
setBand,
setSelect
} from "../../../../utils/uplot";
import { MetricResult } from "../../../../api/types";
@@ -40,7 +39,6 @@ export interface LineChartProps {
setPeriod: ({ from, to }: { from: Date, to: Date }) => void;
layoutSize: ElementSize;
height?: number;
isAnomalyView?: boolean;
spanGaps?: boolean;
showAllPoints?: boolean;
}
@@ -55,7 +53,6 @@ const LineChart: FC<LineChartProps> = ({
setPeriod,
layoutSize,
height,
isAnomalyView,
spanGaps = false,
showAllPoints = false,
}) => {
@@ -75,7 +72,7 @@ const LineChart: FC<LineChartProps> = ({
seriesFocus,
setCursor,
resetTooltips
} = useLineTooltip({ u: uPlotInst, metrics, series, unit, isAnomalyView });
} = useLineTooltip({ u: uPlotInst, metrics, series, unit });
const options: uPlotOptions = {
...getDefaultOptions({ width: layoutSize.width, height }),
@@ -111,7 +108,6 @@ const LineChart: FC<LineChartProps> = ({
if (!uPlotInst) return;
delSeries(uPlotInst);
addSeries(uPlotInst, series, spanGaps, showAllPoints);
setBand(uPlotInst, series);
uPlotInst.redraw();
}, [series, spanGaps, showAllPoints]);

Some files were not shown because too many files have changed in this diff Show More