Compare commits

..

835 Commits

Author SHA1 Message Date
func25
4e57f4dc48 fix 2026-04-14 15:10:04 +07:00
Artem Fetishev
8a20ccf21d apptest: sync code between branches and fix backup/restore range queries (#10799)
Fix app tests:

1. Sync code between vmsingle and vmcluster: it must be the same because
apptest does not differentiate between branches, it just runs pre-built
binaries
2. Simplify range queries in backup/restore test so that it does not
depend on the interval between samples to work correctly.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-04-14 07:18:09 +02:00
Max Kotliar
1a01dbbec7 docs/changelog: fix unwanted release tag change
The tag v1.138.0 was unintentinally changed to v1.139.0 due to bug in
release script.

Reverting the change. The bug will be addressed separate.
2026-04-13 14:52:21 +03:00
f41gh7
630e413812 docs: update flags with actual v1.140.0 binaries
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2026-04-13 11:34:04 +02:00
f41gh7
b639e7e641 docs: bump version to v1.140.0
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2026-04-13 11:31:40 +02:00
f41gh7
858c318e1f deplyoment/docker: bump version to v1.140.0 2026-04-13 11:31:11 +02:00
f41gh7
b8327ce09c docs: mention new LTS releases
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2026-04-13 11:16:30 +02:00
Aliaksandr Valialkin
7514511c68 app/vmauth/main.go: clarify comments for bufferedBody struct a bit
This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10677#discussion_r3064731250
2026-04-11 09:42:32 +02:00
f41gh7
33d524bf13 follow-up for d07c1c73d1
move bugifx into current release
2026-04-10 19:37:14 +02:00
Alexander Frolov
d07c1c73d1 lib/writeconcurrencylimiter: prevent deadlock at IncConcurrency
Previously (*writeconcurrencylimiter.Reader).Read() could permanently leak concurrency tokens from the -maxConcurrentInserts semaphore.
 
 Consider the following example:
* GetReader() acquires a token, then PutReader() unconditionally releases it.
* Read() calls DecConcurrency() before the underlying I/O and IncConcurrency() after it. If IncConcurrency() returns an error, Read() returns without holding a token.
* Each such failure permanently removes one slot from the concurrencyLimitCh semaphore. Slots leak one by one until the channel is fully drained, at which point DecConcurrency() blocks forever, deadlocking ingestion on vmstorage.

 This commit adds tracking for obtained tokens to the reader. Which prevents possible tokens leakage. 

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10784
2026-04-10 19:35:59 +02:00
f41gh7
a896673c42 CHANGELOG.md: cut v1.140.0 release 2026-04-10 17:02:32 +02:00
f41gh7
c60ab2d57a make docs-update-version 2026-04-10 16:54:18 +02:00
f41gh7
49e51611d7 make vmui-update 2026-04-10 16:51:13 +02:00
Hui Wang
902ca83177 app/vmalert: adopt additional rule states in the list rules API
In grafana, the alert list panel can use VictoriaMetrics as datasource
and call `/api/v1/rules` api with [specific
states](https://grafana.com/docs/grafana/latest/alerting/fundamentals/alert-rule-evaluation/nodata-and-error-states/#alert-instance-states).
See
https://play-grafana.victoriametrics.com/d/febljk0a32qyoa/3e68cf3?orgId=1&from=now-1h&to=now&timezone=browser&var-prometheus_datasource=P4169E866C3094E38&var-jaeger_datasource=P14D5514F5CCC0D1C&var-victorialogs_datasource=PD775F2863313E6C7&var-service_namespace=$__all&var-service_name=checkout&refresh=5m&editPanel=40.
Some states are already defined in vmalert, although with different
names. Others, such as "recovering", are currently undefined.
This pull request adopts all these states, rather than fail the request.

Above panel request also uses the `matcher` param to filter rules.
However,
[prometheus](https://prometheus.io/docs/prometheus/latest/querying/api/#rules)
also does not support this parameter and simply ignore it, so I don't
think vmalert needs to support it now.

JFYI, the grafana [Alerting
page](https://play-grafana.victoriametrics.com/alerting) does not
include any of the mentioned `state` or `matcher` parameters in rule
listing requests to the datasource. Filtering is handled by the Grafana
frontend, so most users are not affected by partial support for
filtering in backend products.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10778
2026-04-10 16:47:25 +02:00
Phuong Le
66e3f8736b ci: remove automatic Codecov reporting from test workflow (#10780)
This removes automatic Codecov reporting from VictoriaMetrics CI. This
change keeps local coverage generation available, but removes automatic
PR noise (such as
[this](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10625#issuecomment-4084390659))
and unnecessary CI overhead.
2026-04-10 16:45:11 +02:00
f41gh7
532fcc3dfe docs: remove reverted commit changelog
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2026-04-10 16:35:29 +02:00
Aliaksandr Valialkin
b003d6c6ae Revert "app/vmauth: align request body buffering flags"
This reverts commit b3c03c023c.

Reason for revert: the original logic was correct from the user's perspective:

- The -maxRequestBodySizeToRetry command-line flag controls the size of the request body,
  which could be retried on backend failure. The meaining of this flag wasn't changed after
  the introduction of the -requestBufferSize flag in the commit e31abfc25c
  (see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10309 )

- The -requestBufferSize flag controls the size of the buffer for reading request body
  before sending sending it to the backend and before applying concurrency limits.

These flags are independent from user's perspective. The fact that these flags share the implementation,
sholdn't be known to the user - this is an implementation detail, which allows avoiding double buffering.

Both flags enable request buffering. If the user wants disabling of all the request buffering,
then both flags must be set to 0. That's why these flags are cross-mentioned in their -help descriptions.

Also the reverted commit had the following issues:

- It reduced the default value for the -requestBufferSize flag from 32KiB to 16KiB.
  The 32KiB value has been calculated and justified at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10309 .
  It shouldn't increase vmagent memory usage too much for typical workloads.
  For example, if vmagent handles 10K concurrent requests, then the memory overhead for the request buffering
  will be 10K*32KiB=320MiB. This is a small price for being able to efficiently handling 10K concurrent requests.

- It added a dot to the end of the https://docs.victoriametrics.com/victoriametrics/vmauth/#request-body-buffering link
  in the description for the description of the -requestBufferSize flag. This breaks clicking the link in some environments,
  since the trailing dot is considered as a part of the url.

- It added a superflouous whitespace in front of the 'Disabling request buffering' text inside the description
  for the -requstBufferSize flag.

- It introduced an unnecessary complexity to the user by mentioning that the zero value
  at -maxBufferSize disables buffering for request reties (these things must be independent
  from the user's perspective).

- It changed the bufferedBody logic in non-trivial ways, which aren't related to the original issue.
  If these changes are needed, then they must be justified in a separate issue and must be prepared
  in a separate pull request / commit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10675
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10677
2026-04-10 15:55:47 +02:00
Aliaksandr Valialkin
8fa0fae05a lib/protoparser/protoparserutil: fix encoding -> contentType in the description of the ReadUncompressedData function
This is a follow-up for the commit bed7cbd0a4
2026-04-10 15:20:27 +02:00
Aliaksandr Valialkin
3fe2ec7bde docs/victoriametrics/Articles.md: add https://medium.com/airbnb-engineering/building-a-high-volume-metrics-pipeline-with-opentelemetry-and-vmagent-c714d6910b45 2026-04-10 13:28:49 +02:00
Max Kotliar
6389979bce docs/changelog: add thank you for bugfix contribution 2026-04-10 13:08:28 +03:00
Max Kotliar
210fd0ae15 docs/changelog: add thank you for the contribution 2026-04-10 13:06:08 +03:00
Noureldin
f95b483a13 lib/storage: fixes data race at startFreeDiskSpaceWatcher
Previously, Storage.table was initialized after startFreeDiskSpaceWatcher was called.
This created a potential data race condition: if openTable took a long time to complete
and freed disk space during that window, the free disk space watcher could read an
uninitialized (or partially initialized) Storage.table, leading to an invalid memory
address or nil pointer dereference panic.

This commit properly initializes s.isReadOnly state during storage start and
starts FreeDiskSpaceWatcher after openTable.

Bug was introduced in github.com/VictoriaMetrics/VictoriaMetrics/commit/27b958ba8bc66578206ddac26ccf47b2cc3e8101

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10747
2026-04-10 08:33:49 +02:00
Hui Wang
b71c37e20a app/vmalert: align group evaluation time with the eval_offset option
Align group evaluation time with the `eval_offset` option to allow users
to manage group execution more effectively by understanding the exact
time each group will be scheduled, particularly in cases of spreading
rule execution within a window, chaining groups, or debugging data delay
issue.

If the group evaluation takes less than the group interval, but the
initial evaluation combined with the additional restore operation
exceeds the group interval, the evaluation time will be gradually
corrected in subsequent evaluations, as the interval ticker schedule
remains unchanged.

For groups without `eval_offset`, this change also ensures that all
evaluations follow the interval. Previously, the gap between the first
and second evaluations was larger than the interval. And the
`eval_delay` continues to help prevent partial responses.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10772.
2026-04-10 08:31:36 +02:00
Aliaksandr Valialkin
c27b5f5dfe docs/victoriametrics/vmauth.md: fix link to concurrency limiting chapter
The correct link must be https://docs.victoriametrics.com/victoriametrics/vmauth/#concurrency-limiting
instead of https://docs.victoriametrics.com/victoriametrics/vmauth/#concurrency-limits

The incorrect link has been introduced in the commit e31abfc25c
2026-04-09 19:37:40 +02:00
Max Kotliar
0a31eacb3d lib/{osinfo,appmetrics}: Move vm_os_info metric code to lib/appmetrics package (#10776)
Follow-up commit for
211fb08028

Address @f41gh7 review comments:
- Move code from `lib/osinfo` to `lib/appmetrics`.
- Make the logic private.
- Use metrics.WriteGaugeUint64 func.
- Remove registration logic from `app/xxx/main.go`.
- Remove `lib/osinfo` package.
2026-04-09 18:32:47 +03:00
Artem Fetishev
70b0115ea6 lib/storage: reuse nextDayMetricIDs during the first hour of the day (#10704)
At 00:00 UTC the ingested samples start to have timestamps for the new
day (in the ingested samples are always recent). Even though there was a
next-day prefill of the per-day index during the last hour of the day,
some performance degradation is still possible.

For example, in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10698
it is manifested as `vminsert-to-vmstorage connection saturation` peaks
right after midnight.

Possible hypothesis why this is happening. At midnight,
currHourMetricIDs is empty and prevHourMetricIDs cannot be used because
it holds metricIDs for the previous day. So the ingestion logic hits
dateMetricIDsCache which may not have the metricID in its read-only
buffer and therefore should aquire lock to check its prev read-only
buffer or read-write buffer. Which creates lock contention and therefore
raises ingestion request latency.

A solution to this could be re-using the nextDayMetricIDs during the
first hour of the day. During this time, it is equivalent to
currHourMetricIDs.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Signed-off-by: Artem Fetishev <149964189+rtm0@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-04-09 16:33:42 +02:00
Max Kotliar
dfafd14767 apptest: Improve TestSingleVMAgentDropOnOverload stability (#10774)
Previosly the test could fail on resource constraint runners because
remoteWrite retry happens before the assertion in:

```
    waitFor(
        func() bool {
            return vmagent.RemoteWriteRequests(t, url1) == 1 &&
vmagent.RemoteWriteRequests(t, url2) == 1
        },
    )
```

Because of retry the metric jumps to two and assert never satisfied.

The commit explisitly postpones retries so there is no race condition.

Failed  CI job:

https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/24186679213/job/70593055140

PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10774

<img width="1157" height="879" alt="Screenshot 2026-04-09 at 15 30 33"
src="https://github.com/user-attachments/assets/e170ae12-cf79-4501-a57b-fbd3612d31a0"
/>
2026-04-09 16:57:39 +03:00
Max Kotliar
e3fdbc8341 docs/changelog: cleanup follow-up on e1a9901654
e1a9901654
2026-04-09 15:04:48 +03:00
Max Kotliar
1bf442537f docs/changelog: cleanup. follow-up on 211fb08028 commit
211fb08028
2026-04-09 15:01:44 +03:00
JAYICE
211fb08028 introduce os kernel version information metric (#10746)
The commit introduces the `vm_os_info` metric, which is exposed by all VM binaries by default. It provides visibility into the operating system version on which VictoriaMetrics is running, helping with troubleshooting environment-specific issues, like known kernel or fs bugs. 

FIxes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10481
PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10746

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-09 14:43:25 +03:00
Yury Moladau
846124e280 app/vmui: generate CSV format using /api/v1/labels (#10771)
`Export query` button on `Raw Query` tab now fetches labels of executed query and composes export `format` based on that list of labels. It ensures that all query response labels are preserved in the CSV export. 

Also, commit removes the addition of the CSV header in the frontend. Now the header is added by the backend (see https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10706).

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10667
PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10771
Duplicate of: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10737

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
Co-authored-by: lawrence3699 <lawrence3699@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-09 14:18:01 +03:00
andriibeee
e1a9901654 vmselect: add CSV header support for export/import (#10706)
Export (/api/v1/export/csv) now always writes a header row matching the requested format fields. Examples:

```
  # format=__timestamp__:unix_ms,__value__,job,instance
  __timestamp__:unix_ms,__value__,job,instance
  1704067200000,42.5,node,localhost:9090
```

Import (/api/v1/import/csv) gains auto-detection logic: the first row is skipped if any timestamp column fails timestamp parsing or any metric value column fails float parsing. If the first row is not detected as headers, it is parsed as data. This makes the import backward compatible. 

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10666
PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10706

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-09 14:00:39 +03:00
dependabot[bot]
5d0cf1d4a5 build(deps): bump vite from 8.0.2 to 8.0.7 in /app/vmui/packages/vmui (#10761)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 8.0.2 to 8.0.7.

https://github.com/vitejs/vite/blob/v8.0.7/packages/vite/CHANGELOG.md

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-09 13:16:01 +03:00
Pablo (Tomas) Fernandez
cd3d297a3d docs: udpate playground page (#10754)
This change reverts part of the changes in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10686

Motivation: docs added https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10686 in most cases are too verbose, ai-generated and bringing low practical sense.

The improvement goal: remove bloat from the docs and keep them practical and useful.

What it does:
- Completely removes items from the sidebar
- Moves the content of the most important playground pages to the
`/playground/` stub (README.md). Use H2s for each playground.
- Updates and cleans the text.
- Removes the individual children pages in the playground category (keep
only the `/playgrounds/` page/stub and remove the children).
- Removes items as these don't really need much introduction or aren't
playgrounds:
  - log to logsql: a conversion tool
  - sql to logsql: same
- adds Grafana playground section

Links of child pages will become invalid. We don't preserve them as this is pretty new doc (1w on prod) and is unlikely to have already persisted links somewhere.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2026-04-09 12:08:48 +02:00
f41gh7
52f4d0f055 follow-up for 72c9e9377c
Move changelog entry to the upcoming release section
2026-04-09 11:38:36 +02:00
Hui Wang
72c9e9377c app/vmalert: expose remotewrite queue_size metrics
This commit adds new metrics `vmalert_remotewrite_queue_capacity` and `vmalert_remotewrite_queue_size`, which is updated with each push and it's
frequency depends on `-remoteWrite.concurrency`,
`remoteWrite.flushInterval`

It doesn't account for the pending data within each pushers request, it
should provide a general indication of the queue usage.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10765
2026-04-09 11:22:38 +02:00
andriibeee
0aaa741b5b lib/awsapi: add support for named AWS profile to ec2_sd_config
Add support for named AWS profiles in ec2_sd_config, matching Prometheus behavior.

Example:

```text
~/.aws/config:
[profile account-one]
source_profile = root
role_arn = arn:aws:iam::000000000001:role/prometheus
```

```yaml
scrape config:
- job: ec2
  ec2_sd_configs:
    - profile: account-one
```

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1685
2026-04-09 11:17:17 +02:00
f41gh7
0e9870b7a9 vendor: run go get -u ./lib/...
go get -u ./app/...
go mod tidy -compat=1.26
go mod vendor
2026-04-09 09:34:03 +02:00
Artem Fetishev
accb06d131 lib/storage: refactor storage synctests
Exctract repeated code from nextDayMetricIDs synctests into separate
funcs to make the code more readable.

The change was originally introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10704 and was
extracted into a separate PR to keep the original change simple.
2026-04-09 09:07:37 +02:00
f41gh7
1787bce6cb app/vminsert: opitimise per insert request memory buffer size
Previously, vminsert did not account for the ingest concurrency limit in buffer size calculation.
This could lead to excessively large buffers and OOM errors when the concurrency limit was reached.

 This commit fixes buffer size calculation by separating `insertCtx` and `storageNode` buffer size limits.

`storageNode` buffer size is set to a larger value, as it is allocated per configured `-storageNode`
and is independent of the concurrency limit.

`insertCtx` buffer size now accounts for the configured concurrency limit
and calculates the maximum buffer size accordingly.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10725
2026-04-09 08:48:37 +02:00
JAYICE
141febd413 app/vmselect: disable partial responses for cluster native requests
Previously, vmselect in cluster-native mode could return partial responses to upstream vmselect.
Since upstream vmselect expects full responses (mimicking vmstorage behavior),
partial responses must be disabled in cluster-native mode.
This prevents incomplete responses from being cached at the upstream vmselect level.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10678
2026-04-09 08:48:13 +02:00
0e4ef622
256eff061d docs/victoriametrics/stream-aggregation: fix rate_sum link (#10756)
### Describe Your Changes

https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8349 updated the recommendation for histogram aggregation from `total` to `rate_sum`, but missed one of the links.

PR: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10756

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-04-08 13:09:58 +03:00
Zakhar Bessarab
fa1dd0ec0a app/vmagent/remotewrite: automatically set series limits to MaxInt32 when setting value to -1 (#9614)
Automatically set daily and hourly series limits to `MaxInt32` when `remoteWrite.maxHourlySeries` or `remoteWrite.maxDailySeries` is set to `-1`.

This change addresses a usability issue with the cardinality limiter. Users may want to enable the limiter to observe its metrics before deciding on an appropriate limit. However, the underlying bloom filter only supports `int32`, so setting large values can lead to overflow.

With this PR:
* Setting either flag to `-1` is treated as “no practical limit” and internally mapped to `math.MaxInt32`
* Values exceeding `int32` are safely clamped to `MaxInt32` to prevent overflow

This allows users to enable the limiter for estimation purposes without risking invalid configurations or runtime issues.

https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9614

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-04-08 12:54:27 +03:00
Max Kotliar
6337dfc472 deployment/docker: update Go builder from Go1.26.1 to Go1.26.2
See
https://github.com/golang/go/issues?q=milestone%3AGo1.26.2%20label%3ACherryPickApproved
2026-04-08 12:43:29 +03:00
JAYICE
0a256002e5 lib/promscape: update last scrape result only when current scrape is successful
Previously, last scrape result was unconditionally update, despite possible scrape error.

The commit updates last scrape result only at successful scrape. It properly accounts `scrape_series_added` metric and aligns it with the same metric in Prometheus.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10653
2026-04-06 17:14:47 +02:00
Nikolay
b3c03c023c app/vmauth: align request body buffering flags
Previously introduced flag `requestBufferSize` raised default value for
in-memory buffer from 16KB to 32KB. It could increase memory usage for
vmauth. Also it made unclean how to actually disable requests buffering.

 This commit aligns flags value to the 16KB. And disables requests
buffering if any of flags value are 0 as mentioned at flags description.
If any of flags have non-default value, those value are used as max size
for request buffer. If both flags are modified - bigger value wins.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10675
2026-04-06 09:51:44 +02:00
Hui Wang
80e2f29761 app/vmalert: add random jitter to concurrent periodical flushers targeting the remote write destination
I expect the change to help in two ways:
1. Spreading remote write flushes over the flush interval to avoid
congestion at the remote write destination;
2. Enhance queue data consumption. Currently, all flushers may always
flush data simultaneously, resulting in periods where no flushers are
consuming data from the queue, which increases the risk of reaching the
queue limit `remoteWrite.maxQueueSize` even when a increased
`remoteWrite.concurrency`. By making the flushers more dispersed, it is
more likely that some flushers are consistently consuming data from the
queue, which should make queue management easier.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10729/
2026-04-06 09:50:20 +02:00
Hui Wang
df34ba3ba2 app/vmalert: expose new histograms to provide better visibility into remote write request sizes
The new histograms should help with debugging whether remote write
pushes are efficient(pushes can be underutilized due to small flush
interval), like in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10693 and
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10536. This
enhanced visibility will allow related parameters such as
`-remoteWrite.maxBatchSize`, `-remoteWrite.maxQueueSize`,
`-remoteWrite.flushInterval` to be tuned accordingly.

Eventually, `vmalert_remotewrite_sent_rows_total`
and `vmalert_remotewrite_sent_bytes_total` could be deprecated, but it's also fine to leave
them as they are since they're small counters.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10727
2026-04-06 09:49:17 +02:00
sias32
10dd45c4fd dashboards: improvement alert statistics (#10571)
Changes:

- Added the number of `pending alerts` and `firing alerts`
- Improvement `transormations` for panel - FIRING over time by group and rules
- Added sort for panel - FIRING over time by rule

Signed-off-by: sias32 <sias.32@yandex.ru>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-03 21:27:19 +03:00
Max Kotliar
a3294b5aa2 docs/guide: fix free space calculation factor in capacity planning formula
Replace 1.2 multiplier with 1.25 in disk space estimation formula.

1.2 only provides ~16.7% free space, while the docs recommend keeping
20%. Using 1.25 correctly accounts for 20% free space.

Inspired by
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10394
2026-04-03 21:20:01 +03:00
Zhu Jiekun
4438454567 vendor: update metrics package with fix unsupported metric type for summary (#10745)
Fix `unsupported` metric type display in exposed metric metadata for
summaries and quantiles by bumping `metrics` SDK version.

This `unsupported` type exists when a summary is not updated within a
certain time window. See https://github.com/VictoriaMetrics/metrics/issues/120 and pull
request https://github.com/VictoriaMetrics/metrics/pull/121 for details.

Signed-off-by: Zhu Jiekun <jiekun@victoriametrics.com>
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-03 16:06:47 +03:00
Max Kotliar
71af1ee5f1 .github: Set 21-day cooldown to dependabot updates (#10740)
Recent supply chain attacks on GitHub Actions and npm packages show the
risk of pulling dependency updates too quickly:
-
https://socket.dev/blog/trivy-under-attack-again-github-actions-compromise
-
https://www.stepsecurity.io/blog/axios-compromised-on-npm-malicious-versions-drop-remote-access-trojan
2026-04-03 15:51:07 +03:00
Evgeny
e00fb7e605 app/vmagent: add per-URL -remoteWrite.disableMetadata
Add per-URL `-remoteWrite.disableMetadata` flag to control metadata
sending for each remote storage independently.

After v1.137.0 enabled `-enableMetadata` by default, metadata is sent to
ALL remote write targets, even those with relabeling filters that drop
most metrics. This causes unnecessary growth in
`vmagent_remotewrite_requests_total`. and significant increase in
network load for heavy filtered remote write destinations.
2026-04-03 10:32:34 +02:00
Roman Khavronenko
5e2ee00504 app/vmauth: mention that vmauth can be used with other components
A cosmetic change to highlight that vmauth can be used with other
compnents besides VM only
2026-04-03 10:27:43 +02:00
JAYICE
de2bc4237a lib/backup/s3: retry the requests that failed with unexpected EOF
When the network between client and s3 server is unstable, the client may encounter temporary io.EOF errors when reading the response from s3 server.
Currently, the s3 sdk in vmbackup uses the default retry policy. However, this default retry policy won't retry when s3 sdk meet unexpected EOF. This means that the temporary unexpected EOF error will cause the backup task to fail.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10699
2026-04-03 10:26:58 +02:00
Fred Navruzov
3e51f277bd docs/vmanomaly: v1.29.2 (#10741)
update docs to vmanomaly v1.29.2 release

Signed-off-by: Fred Navruzov <fred-navruzov@users.noreply.github.com>
2026-04-02 22:02:26 +03:00
Roman Khavronenko
5723339525 docs: mention https://victoriametrics.com/blog/victoriametrics-remote-write/ (#10726)
Add link to blogpost with detailed information about zstd+rw protocol.
This PR is based on question in community channel about implementation
details.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-04-02 16:30:10 +03:00
Max Kotliar
3b986ad326 docs/changelog: add thank for contribution 2026-04-02 15:58:23 +03:00
Max Kotliar
08dd38d4a0 vendor: update https://github.com/VictoriaMetrics/metricsql from v0.85.0 to v0.86.0
It contains https://github.com/VictoriaMetrics/metricsql/pull/63 that
reduce number of parentheses added.

It should improve prettify functinality in vmui
2026-04-02 15:42:01 +03:00
Aliaksandr Valialkin
815cc97952 vendor: update github.com/VictoriaMetrics/metrics from v1.42.0 to v1.43.0 2026-04-02 14:18:30 +02:00
Dmytro Kozlov
93d71e7106 vmctl: add thanos migration mode (#10659)
Implemented dedicated thanos migration mode for vmctl to migrate data from Thanos installations to VictoriaMetrics.

Key features:
1. Raw and downsampled blocks support: Reads both raw blocks
(resolution=0) and downsampled blocks (5m/1h resolution) directly from
Thanos snapshots
2. All aggregate types: Imports count, sum, min, max, and counter
aggregates from downsampled blocks as separate metrics with resolution
and type suffixes (e.g., metric_name:5m:count)
3. Dedicated flags: Uses `--thanos-*` prefixed flags (--thanos-snapshot,
--thanos-concurrency, --thanos-filter-time-start,
--thanos-filter-time-end, --thanos-filter-label,
--thanos-filter-label-value, --thanos-aggr-types)
4. Selective aggregate import: Use `--thanos-aggr-types` to import only
specific aggregates

Usage:
```
vmctl thanos --thanos-snapshot /path/to/thanos-data --vm-addr http://victoria-metrics:8428
```

Closes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9262

Signed-off-by: Dmytro Kozlov <d.kozlov@victoriametrics.com>
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
2026-04-02 14:51:01 +03:00
Aliaksandr Valialkin
577b161343 docs/victoriametrics/changelog/CHANGELOG.md: add a description for the change in the commit dd2d6807e4 2026-04-02 13:18:38 +02:00
Mehrdad Banikian
dd2d6807e4 Add split phase metrics for filestream fsync operations (#10493)
## Summary

This PR implements split phase metrics for filestream operations as
requested in #10432.

### Changes

- Added `vm_filestream_fsync_duration_seconds_total` metric to track
fsync syscall duration separately
- Added `vm_filestream_fsync_calls_total` metric to count fsync calls
- Added `vm_filestream_write_syscall_duration_seconds_total` metric to
track write syscall duration (previously mixed with flush time)
- Refactored `MustClose()` and `MustFlush()` to use new `flush()` and
`sync()` helper methods
- Kept `vm_filestream_write_duration_seconds_total` for backward
compatibility

### Problem Solved

Previously, `vm_filestream_write_duration_seconds_total` was being
incremented in two places:
1. `statWriter.Write()` - triggered by `bw.Flush()` and `bw.Write()`
2. `Writer.MustFlush()` - which included the above process, leading to
double-counting

This made it impossible to distinguish between write syscall time and
fsync time, which is critical for diagnosing storage latency issues.

### Solution

The new metrics allow users to:
- Distinguish "flush got slower" vs "fsync got slower" using metrics
only
- No file path labels (bounded cardinality)
- No double-counting between metrics

### Testing

- Code compiles successfully
- All existing metrics are preserved for backward compatibility

Closes #10432

---------

Signed-off-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Signed-off-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2026-04-02 13:14:33 +02:00
Aliaksandr Valialkin
e38e25b756 app/vmagent/remotewrite: improve the readability of the parseRetryAfterHeader() function a bit
- Use shorter name for its' arg: retryAfterString -> s. This is OK to do because the function is small enough,
so it is easier to read 's' instead of 'retryAfterString' in multiple places of the function.

- Remove the name for the returned value - retryAfterDuration, since it only confuses the reader.

This is a follow-up for the commit 5319acb8ed , which introduced this function.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6097
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6124
2026-04-02 12:52:59 +02:00
Vadim Alekseev
bc708c8568 lib/timeutil: introduce backoff timer struct (#10714)
### Describe Your Changes

I noticed that the backoff timer logic is repeated across multiple
packages. I've implemented a universal wrapper to avoid duplicating this
logic. This structure is already [actively
used](2aa0ea10bb/app/vlagent/kubernetescollector/backoff_timer.go (L11))
for the Kubernetes Collector in vlagent and can be reused in vlagent's
remotewrite. I've also included a usage example in this PR so you can
evaluate its utility.

### Checklist

The following checks are **mandatory**:

- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-04-02 12:31:28 +02:00
Aliaksandr Valialkin
cd73472a3e docs/victoriametrics/Articles.md: add https://mirastacklabs.ai/blog/chunk-split-caching/ 2026-04-01 22:42:15 +02:00
Aliaksandr Valialkin
527d09653a lib/storage: remove MetricNamesStatsResponse and MetricNamesStatsRecord types
These types hide public types from lib/storage/metricnamestats package.
These types do not resolve any practical issues. Instead, they add a level of indirection,
which complicates reading and understanding the code.

These types were introduced in the commit 795d3fe722
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6145
2026-04-01 22:25:50 +02:00
Aliaksandr Valialkin
28a87b90bb apptest: test apps with the enabled built-in race detector in order to be able to catch data races 2026-04-01 22:11:17 +02:00
Artem Fetishev
c445e7fcc0 docs: bump version to v1.139.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-04-01 15:22:06 +02:00
Artem Fetishev
9494ee103e deplyoment/docker: bump version to v1.139.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-04-01 15:19:03 +02:00
Artem Fetishev
94af588e92 docs: forward port LTS v1.122.18 changelog to upstream
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-04-01 14:24:20 +02:00
Artem Fetishev
e5c194cc10 docs: forward port LTS v1.136.3 changelog to upstream
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-04-01 14:22:51 +02:00
Artem Fetishev
7c65e3daca docs: fix changelog
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-04-01 13:17:16 +02:00
Pablo (Tomas) Fernandez
a6532c28b2 docs/guides: Add new guide "Set up Datasource-Managed Alerts with vmalert and Grafana" (#10691)
Create a guide to use datasource-managed alerts in Grafana

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10528

Signed-off-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: Mathias Palmersheim <mathias@victoriametrics.com>
2026-03-31 18:57:04 +03:00
Jose Gómez-Sellés
ec26ebb803 docs: raise cloud awareness in docs (#10716)
### Describe Your Changes

Some users may not know that VictoriaMetrics Cloud provides relevant
features to manage workloads. This change add notes in relevant places
in which users may find that a managed solution is what they need.

The intention is not to push users to Cloud, but giving the information.
That's why it's always phrased like: "If you don't want to do X, Cloud
can do it for you", instead of "Start for free, etc". This is an Open
Source first project, and shall remain as such.

After this gets proper review, VictoriaLogs and other repos may follow.

### Checklist

The following checks are **mandatory**:

- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Jose Gómez-Sellés <14234281+jgomezselles@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-03-31 18:51:06 +03:00
Max Kotliar
faba8b985b docs: bump lts tags 2026-03-28 15:49:56 +02:00
Artem Fetishev
d233a409d9 docs/changelog: cut release v1.139.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-27 11:28:24 +01:00
Artem Fetishev
96cbd6fff3 docs: update version to v1.139.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-27 11:27:29 +01:00
Artem Fetishev
b1d009b13a app/vmselect: run make vmui-update
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-27 11:18:18 +01:00
Nikolay
57ce00a5c6 lib/fs: restore async deletion of NFS folders
Commit 83da33d8cf
 removed NFS directory delete retries. It was made on assumption, that
 only directory rename could cause such issues. However, both rename and
unlink uses the same "silly rename" logic
https://linux-nfs.org/wiki/index.php/Server-side_silly_rename
 and linux kernel - `fs/nfs/dir.c` `nfs_unlink` and  `nfs_rename`.

 And NFS client may treat file still open, even if it
was properly closed by application. Most probably it could be triggered, because VictoriaMetrics may
open the same file multiple times ( data read and background merges).

There is no issue with VictoriaMetrics itself, it properly closes files. But NFS-client may have delays
or cache metadata information for the files. So it could trigger silly rename behavior.

 This commit restores original behavior with deletion retries and brings
 back metrics for unsuccessful delete operations.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9842
2026-03-27 09:51:27 +01:00
dependabot[bot]
cfb53cbfb9 build(deps): bump codecov/codecov-action from 5 to 6 (#10709)
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 5 to 6.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-27 07:43:28 +02:00
Benjamin Nichols-Farquhar
febafc1cf1 lib/backup: speed up restores on linuxsystems (#10661)
Related to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10680

We noticed that backup restores in our environment were much slower than
the hardware/bandwidth constraints would suggest and we traced this down
to a couple of bottlenecks. This PR attempts to address all of them.

#### Lack of pre-allocation of files, 

This was causing writes far into files to be quite slow as new blocks
needed to be continually allocated. This was particularly bad on ext4
for us, but will likely be applicable to most disks and filesystems,
you'll see the impl here is linux specific but this is mostly because I
don't have a test env for any other platform and didn't want to blindly
make changes without a validation env.

This comes with the downside of no longer being to to resume a restore
mid file, and requiring the re-downloading of parts already in the file
size the file will appear at full size from the very start. This is I
think _generally_ a good tradeoff for the restore speed gains, it is
definitely a tradeoff so I've included a flag to disable the
pre-allocation behavior and fall back to the existing part diffing
logic.

#### Fsync after each part

With many small parts in relatively few files, or in high concurrency
setups the the writerCloser fsync on each part(actually double fsync
since both `filestream.Writer.mustFlush` and
`filestream.Writer.mustClose` both fsync). Was causing slowdowns since
we would be continually queuing fsyncs.

With the pre-allocation pattern the file is only "ready" once re-named
so I moved to a per file fsync after rename.

#### Concurrent read/write 

The previous download pattern was to do a read from the remoteFs, with
whatever latency that entailed, then sequentially do a write, again with
whatever latency that entailed. This meant that throughput was limited
to `readLatency + writeLatency * blockSize`.

Similar to how `crossTypeCopy` is implemented in the backup process we
can instead use `io.pipe` to allow two goroutines to work in parallel
with a small buffer between them.

#### Pagecache avoidance 

`filestream.Writer` does quite a lot to avoid polluting the page cache,
but this is not relevent in a restore context and with large sequential
block writes its much more effecient to let the OS flush the pagecache
whenever it wants rather than doing a bunch of small buffer syscalls to
flush blocks.

Therefore this switches over to a much simplier directWriterCloser that
does direct file IO and lets the OS handle flushes while mid write.

### Performance 

Before the changes we were seeing writes speeds of only 100MBps, this
was a restore from EBS volumes, ext with 1GB/s throughput with
<img width="1613" height="586" alt="Screenshot 2026-03-16 at 1 29 46 PM"
src="https://github.com/user-attachments/assets/5d54dcb7-cb59-43e0-9247-fda8c70feb2f"
/>


After these changes in the same restore env we're seeing 600MBs flat
rates.
<img width="1611" height="471" alt="Screenshot 2026-03-16 at 1 31 33 PM"
src="https://github.com/user-attachments/assets/ea8e2eb7-533a-48fa-99e0-0b38286e5572"
/>

Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-03-27 07:35:44 +02:00
Fred Navruzov
f1f70e976e docs/vmanomaly: fix typos in min rel dev param (#10708)
Fix a typo in the changed example for minimal relative deviation
2026-03-27 07:24:24 +02:00
Fred Navruzov
5aa0a75ff8 docs/vmanomaly-release-v1.29.1 (#10707)
some of the docs not included in v1.29.1 docs' release

Signed-off-by: Fred Navruzov <fred-navruzov@users.noreply.github.com>
2026-03-26 22:32:00 +02:00
Max Kotliar
d83f142c63 .github: check commit signature for both GPG and SSH 2026-03-26 19:37:16 +02:00
Artem Fetishev
a07cae3279 lib/lrucache: remove shards (#10697)
Remove shards as they only complicate things when the number of requests
per second is in the range of thousands.

Related to #10532.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-26 16:28:01 +01:00
Phuong Le
8cda999238 README.md: fix wrong links for Docker and Slack badges (#10705)
### Describe Your Changes

Clicking the Docker and Slack badges redirects to an intermediate page
instead of taking users directly to the intended sites. This change
fixes those links.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-03-26 12:05:06 +02:00
Hui Wang
2d6cf8827d lib/protoparser/opentelemetry: support ExponentialHistogram negative buckets (#10669)
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9896#issuecomment-4037424586.
Histogram-related functions such as histogram_quantile() and the VMUI
heatmap also work with negative bucket values.

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-03-26 11:57:19 +02:00
Max Kotliar
c59ca79f2b docs/changelog: mention external contributor 2026-03-26 11:54:42 +02:00
andriibeee
be5ae9b95c lib/jwt: support array claim values in match_claims
This commit allows to perform JWT claim matching over 1 dimension arrays. It could
be useful from practical standpoint. Because permissions are usually assigned as a list of values.

  For example, the following config allows admin access over list of assigned roles for user:

```yaml
 match_claims:
   access.roles: "admin"
```

JWT token:
```json
 {
  "access": {
    "roles": [
      "read",
      "write",
      "admin"
   ]
 }
}
```

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10647
2026-03-26 10:23:43 +01:00
andriibeee
60aef0510f lib/promauth: make username optional in basic_auth section
RFC-7617 allows empty password/username. Moreover, from RFC standpoint both empty values are valid as well. It should be just encoded as `:`. So this commit relaxes non-empty username restriction.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6956
2026-03-26 02:18:37 -07:00
Yury Moladau
b3b555c09c app/vmui: update dependencies to latest compatible versions (#10696)
update dependencies to latest compatible versions

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2026-03-25 19:34:40 +02:00
Fred Navruzov
c57ea02564 docs/vmanomaly: release v1.29.1 (#10703)
### Describe Your Changes

vmanomaly docs upgrade to v1.29.1 (including AI assistance providers and
respective section rework on UI page)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-03-25 19:31:14 +02:00
Max Kotliar
5983d27b00 Revert "docs/anomaly-detection: remove vmanomaly docs from vm repo"
This reverts commit d36f7b6b49.
2026-03-25 19:30:21 +02:00
Max Kotliar
d36f7b6b49 docs/anomaly-detection: remove vmanomaly docs from vm repo
vmanomaly docs should be updated via PRs to vmdocs repo. There is no
need to merge them here and cherry-pick through all branches.
2026-03-25 19:17:24 +02:00
Ty Sarna
70ab2c1585 lib/protoparser/prometheus: add support for OpenMetrics-specific metric types (#10689)
- Adds `info`, `gaugehistogram`, `stateset`, and `unknown` as recognized
metric type names in the Prometheus/OpenMetrics text format parser.
- Previously these valid
[OpenMetrics](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md)
types hit the `default` case and emitted an `error`-level log on every
scrape, flooding logs and continuously triggering the `TooManyLogs`
alert.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10685

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-03-24 15:34:33 +02:00
Pablo (Tomas) Fernandez
c854816642 docs: add playgrounds category (#10686)
Add a new Playgrounds category to the sidebar. Each VictoriaMetrics playground is represented in a separate file.

Signed-off-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
2026-03-24 15:25:02 +02:00
Max Kotliar
285e3d2a63 app/vmselect: enforce datasource_type=prometheus when proxying alert requests (#10668)
Grafana currently supports only Prometheus-style alerts. If other alert types
(e.g. logs or traces) are returned, it may fail with "Error loading alerts".

Grafana queries the vmalert API directly, bypassing the VictoriaMetrics datasource, so query params (such as datasource_type) cannot be enforced on the Grafana side.

To ensure compatibility, we detect Grafana requests via the User-Agent and enforce `datasource_type=prometheus`.

See:
- https://github.com/VictoriaMetrics/victoriametrics-datasource/issues/329#issuecomment-3847585443
- https://github.com/VictoriaMetrics/victoriametrics-datasource/issues/59
2026-03-24 14:52:18 +02:00
Artem Fetishev
95175e00b4 lib/lrucache: sizeBytes should also include key length (#10679)
There are cases then the key sizeBytes is much greater than the value
sizeBytes. Therefore it is important to include the key sizeBytes into
the total.

Also fix some code comments.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-24 12:54:31 +01:00
Artem Fetishev
d21d9e8382 lib/storage: Improve indexDB error messages (#10684)
Fixes: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9499

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Signed-off-by: Nikolay <nik@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-03-24 12:19:04 +01:00
dependabot[bot]
235daa6208 build(deps-dev): bump flatted from 3.3.3 to 3.4.2 in /app/vmui/packages/vmui (#10688)
Bumps [flatted](https://github.com/WebReflection/flatted) from 3.3.3 to
3.4.2.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="3bf09091c3"><code>3bf0909</code></a>
3.4.2</li>
<li><a
href="885ddcc33c"><code>885ddcc</code></a>
fix CWE-1321</li>
<li><a
href="0bdba705d1"><code>0bdba70</code></a>
added flatted-view to the benchmark</li>
<li><a
href="2a02dce7c6"><code>2a02dce</code></a>
3.4.1</li>
<li><a
href="fba4e8f2e1"><code>fba4e8f</code></a>
Merge pull request <a
href="https://redirect.github.com/WebReflection/flatted/issues/89">#89</a>
from WebReflection/python-fix</li>
<li><a
href="5fe86485e6"><code>5fe8648</code></a>
added &quot;when in Rome&quot; also a test for PHP</li>
<li><a
href="53517adbef"><code>53517ad</code></a>
some minor improvement</li>
<li><a
href="b3e2a0c387"><code>b3e2a0c</code></a>
Fixing recursion issue in Python too</li>
<li><a
href="c4b46dbcbf"><code>c4b46db</code></a>
Add SECURITY.md for security policy and reporting</li>
<li><a
href="f86d071e0f"><code>f86d071</code></a>
Create dependabot.yml for version updates</li>
<li>Additional commits viewable in <a
href="https://github.com/WebReflection/flatted/compare/v3.3.3...v3.4.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=flatted&package-manager=npm_and_yarn&previous-version=3.3.3&new-version=3.4.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-24 12:34:05 +02:00
dependabot[bot]
10f4a86540 build(deps): bump google.golang.org/grpc from 1.79.1 to 1.79.3 (#10674)
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.79.1 to 1.79.3.

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10674

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-24 11:37:15 +02:00
dependabot[bot]
79cfffb984 build(deps): bump undici from 7.20.0 to 7.24.4 in /app/vmui/packages/vmui (#10673)
Bumps [undici](https://github.com/nodejs/undici) from 7.20.0 to 7.24.4.

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10673

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-24 11:35:18 +02:00
Aliaksandr Valialkin
23e2379c28 docs/victoriametrics/Articles.md: add https://www.infoq.com/news/2026/03/self-hosted-observability/ 2026-03-20 18:39:13 +01:00
andriibeee
e761f22049 lib/netutil: warn when IPv6 listen address is used without -enableTCP6 (#10640)
### Describe Your Changes

Fixes #6858

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: andriibeee <154226341+andriibeee@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-03-18 21:01:55 +02:00
andriibeee
fb579cf592 lib/jwt: fail on unsupported alg when use=sig, skip non-sig JWKS keys (#10664)
### Describe Your Changes

Fixes #10663

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-03-18 20:40:04 +02:00
hagen1778
fd0d764720 docs/articles: add https://setevoy.medium.com/freebsd-monitoring-with-victoriametrics-and-grafana-f789904f2628
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-03-18 16:04:30 +01:00
Roman Khavronenko
fe8aaa8885 docs: add unique identifier to FAQ page (#10671)
Due to a conflict with VL FAQ page identifier,
VM FAQ page stopped rendering.

This change adds unique identifier to VM FAQ page and fixes the issue.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-03-18 15:29:21 +01:00
hagen1778
b903fc29ec docs/data-ingestion: fix typo in OtelCollector
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-03-18 15:29:04 +01:00
hagen1778
a6833ffd08 dashboards/metrics-explorer: properly reference datasource variable
Before, by mistake, datasource was referenced by input name instead
of variable name. For an unknown reason, it worked well in local setup
and on playground.

This fix is confirmed by users and continues working at local setup
and playground.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-03-18 15:28:26 +01:00
Andrii Chubatiuk
4516a58df9 app/vmalert: add group_limit and page_num for pagination and search for search at /api/v1/rules (#10046)
### Describe Your Changes

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9580

inspired by https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9057

improve https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10005

added changes to support pagination in VMUI alerting tab:
- added pagination panel
<img width="1431" height="197" alt="image"
src="https://github.com/user-attachments/assets/17b2c4e1-06b7-4345-8ccc-008637edd4e0"
/>
- added navigation from group modal to rule and from child modals to
group as a replacement for anchors navigation, which became impossible
after introduction of pagination
<img width="1264" height="599" alt="image"
src="https://github.com/user-attachments/assets/a803347f-e44e-4325-9b59-8656bd6a5d9b"
/>
<img width="1253" height="523" alt="image"
src="https://github.com/user-attachments/assets/70db27bd-0027-4510-9cad-0354e016d2f2"
/>


PR is rebased against [this
change](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10068)

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com>
Co-authored-by: Haley Wang <haley@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-03-18 13:25:36 +02:00
Pablo (Tomas) Fernandez
5ad7b645e6 Docs: update guide "HA monitoring setup in Kubernetes via VictoriaMetrics Cluster" (#10580)
### Describe Your Changes

Updated the [HA monitoring setup in Kubernetes via VictoriaMetrics
Cluster](https://docs.victoriametrics.com/guides/k8s-ha-monitoring-via-vm-cluster/)
guide.

Changes:
- Added an introduction explaining how HA works in this guide
- Updated and verified commands used in the guide
- Replaced using Grafana UI usage in favor of using VMUI instead (it was
used to run queries, it's easier to just use the built-in VMUI instead
of installing Grafana just to use the Explore tab)
- Removed Grafana screenshots and replaced them with VMUI
- Tested on a modern version of GKE
- Added explanations for `replicationFactor`, de-duplication, and
`isPartial`
- Added next steps
- Added VMUI screenshots


### Checklist

The following checks are **mandatory**:

- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-03-18 10:55:30 +02:00
Yury Moladau
51a53014c8 app/vmui: fix autocomplete dropdown closing on Raw Query page (#10665)
### Describe Your Changes

Fixed an issue where the autocomplete dropdown did not close after
selecting an option on the Raw Query page.

**How to reproduce**

* Open the Raw Query page
* Trigger autocomplete
* Select any option from the dropdown
* Before the fix, the dropdown stayed open after selection


### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-03-18 10:45:40 +02:00
Aliaksandr Valialkin
e47abd6385 docs/victoriametrics/Articles.md: add https://clovisc.medium.com/monitoring-pipeline-with-blackbox-exporter-prometheus-victoriametrics-and-vmalert-0ab020c7202a 2026-03-18 02:42:49 +01:00
Aliaksandr Valialkin
c04a5a597d docs/victoriametrics/Articles.md: add https://apprecode.com/blog/a-complete-guide-to-victoriametrics-a-prometheus-comparison-and-kubernetes-monitoring-implementation 2026-03-18 02:41:34 +01:00
JAYICE
e695d5f425 app/vmselect: retry with new connection when previous rpc fail on a broken connection
This commit adds a rpc retry by dialing a new connection instead of
getting an old one from the connection pool when the previous rpc error
is `io.EOF`.

It helps prevent broken connections from remaining for too long and
causing failed requests and partial responses during `vmstorage` rolling
restart period

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10314
2026-03-17 11:01:16 +01:00
andriibeee
2bb03f6e34 lib/storage, lib/mergeset: properly account inmemoryPart refCount
Previously inmemoryPart refCount was not properly decremented.

Previous behavior:
* createInmemoryPart called newPartWrapperFromInmemoryPart and returns a partWrapper with refCount=1
* multiple parts are merged in mustMergeInmemoryPartsFinal, which creates a new merged part
* the source partWrappers are never decRef'd
* Since refCount never reaches 0, putInmemoryPart and (*part).MustClose are never called 

 This commit properly decrements refCount at mustMergeInmemoryPartsFinal. 

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10086
2026-03-17 10:54:08 +01:00
Br1an
92f03344eb lib/promscrape/discovery/yandexcloud: add folder_ids option
This commit adds a new `folder_ids` field in
`yandexcloud_sd_configs` that allows users to specify Yandex Cloud
folder IDs directly, bypassing the organization->cloud->folder hierarchy
traversal.

Previously, the Yandex Cloud service discovery required traversing the
entire resource hierarchy (organizations -> clouds -> folders ->
instances) to discover instances. This works when the Service Account
has permissions at all levels. However, some Service Accounts may only
have permissions at the folder level, causing discovery to fail when it
cannot access organization or cloud resources.

With this change, users can now configure folder IDs directly:

```yaml
yandexcloud_sd_configs:
  - service: compute
    folder_ids:
      - folder-id-1
      - folder-id-2
```

When `folder_ids` is specified, the discovery skips the hierarchy
traversal and directly queries instances from the specified folders.
This is a backward-compatible change - when `folder_ids` is not
specified, the existing behavior is preserved.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10587
2026-03-17 10:51:05 +01:00
Artem Fetishev
e3360b87ff docs: run make docs-update-flags
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-16 16:59:51 +01:00
Artem Fetishev
4c98b912fa docs: bump version to v1.138.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-16 16:54:04 +01:00
Artem Fetishev
225e2e870b deplyoment/docker: bump version to v1.138.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-16 16:48:43 +01:00
Artem Fetishev
2b078301c1 docs/CHANGELOG.md: update changelog with LTS release notes
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-16 15:23:05 +01:00
Arie Heinrich
14090c5a07 all: spelling fixes in code comments (#10650)
fixing spelling issues in comments and text strings

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-03-16 11:11:54 +01:00
Arie Heinrich
66d47f23e4 docs: spelling fixes (#10649)
fix spelling in docs (potential removal of empty spaces as default)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Arie Heinrich <arie.heinrich@outlook.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-03-16 11:11:25 +01:00
Roman Khavronenko
eacdb80ed7 docs: add AI tools section to the docs (#10642)
The new section is placed in root directory and is supposed to promote
information about the following tools:
* MCP servers for Logs, Traces and Metrics
* List of available agentic skills

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: Roman Khavronenko <hagen1778@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-03-16 11:10:52 +01:00
Roman Khavronenko
504cf31dab docs: minor wording updates in storage section (#10633)
The change suppose to make it more clear for understanding and stress
attention on important things.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: Roman Khavronenko <hagen1778@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-03-16 11:10:12 +01:00
Roman Khavronenko
34d190b32a dashboards: add dashboard for exploring stored metrics (#10617)
The new Grafana dashboard uses the following APIs:
- /api/v1/status/tsdb
- /api/v1/status/metric_names_stats

It shows the list of metric names, the request count and the last time
they were "used". Clicking on metric name allows exploring its
cardinality.

Based on https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9832

-----------

The PR contains a few unrelated changes: 
* rename of folder for prometheus datasource to remove the duplicated
word
* fix for vmalert's access to the datasource, as before it wasn't able
to write/read properly

-------------

The dashboard screen cast:


https://github.com/user-attachments/assets/01dda5d9-14e5-4f5a-b795-a838abec4f5e

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Haley Wang <haley@victoriametrics.com>
2026-03-16 11:09:06 +01:00
Roshan Banisetti
44fa216bb5 app/vmui: show seriesCountByMetricName when label is in focus in Cardinality Explorer (#10638)
### Describe Your Changes

When a label is set as focus label in the Cardinality Explorer, the
"Metric names with the highest number of series" table was hidden. This
change makes it visible alongside the focus label values table.

### How to reproduce

  1. Go to Explore → Cardinality Explorer
2. Enter a selector like `{namespace!=""}` and set Focus label to
`namespace`
  3. Click Execute Query

**Before:** Only "Values for 'namespace' label..." table is shown
**After:** "Metric names with the highest number of series" table is
also shown

<img width="1512" height="723"
alt="b2a8395a1577b31f58ae00f87e29eb87ca98eabfd0b3c0d9185be8f3a9789b5f"
src="https://github.com/user-attachments/assets/50c7f67a-1cfc-40d0-8e99-7750a933ee45"
/>

Fixes #10630 

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Roshan1299 <banisettirosh@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2026-03-16 11:04:50 +01:00
JAYICE
4589442345 dashboard: refine top10 instances by sample panel in vmagent (#10655)
### Describe Your Changes

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10654

<img width="1995" height="846" alt="image"
src="https://github.com/user-attachments/assets/673afd18-9d64-43d3-9ec2-38508847a851"
/>


### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-03-16 11:04:04 +01:00
Artem Fetishev
78ad4b974c docs: cut release v1.138.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-13 16:16:31 +00:00
Artem Fetishev
d12524749f make docs-update-version
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-13 17:03:48 +01:00
Artem Fetishev
1a5235a18f make vmui-update
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-13 15:48:15 +00:00
Max Kotliar
27847dbbb8 docs: chore vmauth jwt related documentation
fix tags
add available_from
add cross links
2026-03-13 15:40:39 +02:00
Andrii Chubatiuk
33fab3a2d6 lib/backup/s3remote: overwrite source tags, while syncing parts from one s3 location to another
in case of conflicting tags while syncing latest backup with other backup types by default s3 keeps original ones. Commit changes default behaviour, which enables replacing original tags

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/issues/1004
2026-03-13 13:09:51 +01:00
f41gh7
695b21ecfc docs/changelog: mention vmbackupmanager bugfix at changelog
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10639
2026-03-13 10:28:26 +01:00
Nikolay
4ba488f806 lib/jwt: support regex value claim matching
This commit adds regex value matching for JWT claims matching.

Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10584 Fixes
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10628
2026-03-13 10:14:05 +01:00
dependabot[bot]
1e046d35a8 build(deps): bump immutable from 5.1.4 to 5.1.5 in /app/vmui/packages/vmui (#10586)
Bumps [immutable](https://github.com/immutable-js/immutable-js) from
5.1.4 to 5.1.5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/immutable-js/immutable-js/releases">immutable's
releases</a>.</em></p>
<blockquote>
<h2>v5.1.5</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix Improperly Controlled Modification of Object Prototype
Attributes ('Prototype Pollution') in immutable</li>
<li>Upgrade devtools and use immutable version by <a
href="https://github.com/jdeniau"><code>@​jdeniau</code></a> in <a
href="https://redirect.github.com/immutable-js/immutable-js/pull/2158">immutable-js/immutable-js#2158</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/immutable-js/immutable-js/compare/v5.1.4...v5.1.5">https://github.com/immutable-js/immutable-js/compare/v5.1.4...v5.1.5</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/immutable-js/immutable-js/blob/main/CHANGELOG.md">immutable's
changelog</a>.</em></p>
<blockquote>
<h2>5.1.5</h2>
<ul>
<li>Fix Improperly Controlled Modification of Object Prototype
Attributes ('Prototype Pollution') in immutable</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b37b855686"><code>b37b855</code></a>
5.1.5</li>
<li><a
href="16b3313fdf"><code>16b3313</code></a>
Merge commit from fork</li>
<li><a
href="fd2ef4977e"><code>fd2ef49</code></a>
fix new proto key injection</li>
<li><a
href="6734b7b2af"><code>6734b7b</code></a>
fix Prototype Pollution in mergeDeep, toJS, etc.</li>
<li><a
href="6f772de1e4"><code>6f772de</code></a>
Merge pull request <a
href="https://redirect.github.com/immutable-js/immutable-js/issues/2175">#2175</a>
from immutable-js/dependabot/npm_and_yarn/rollup-4.59.0</li>
<li><a
href="5f3dc61fd0"><code>5f3dc61</code></a>
Bump rollup from 4.34.8 to 4.59.0</li>
<li><a
href="049a594410"><code>049a594</code></a>
Merge pull request <a
href="https://redirect.github.com/immutable-js/immutable-js/issues/2173">#2173</a>
from immutable-js/dependabot/npm_and_yarn/lodash-4.1...</li>
<li><a
href="2481a77331"><code>2481a77</code></a>
Merge pull request <a
href="https://redirect.github.com/immutable-js/immutable-js/issues/2172">#2172</a>
from mrazauskas/update-tstyche</li>
<li><a
href="eb047790b4"><code>eb04779</code></a>
Bump lodash from 4.17.21 to 4.17.23</li>
<li><a
href="b973bf3b62"><code>b973bf3</code></a>
format</li>
<li>Additional commits viewable in <a
href="https://github.com/immutable-js/immutable-js/compare/v5.1.4...v5.1.5">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=immutable&package-manager=npm_and_yarn&previous-version=5.1.4&new-version=5.1.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-12 18:08:38 +02:00
dependabot[bot]
8c9b202c94 build(deps): bump rollup from 4.52.5 to 4.59.0 in /app/vmui/packages/vmui (#10556)
Bumps [rollup](https://github.com/rollup/rollup) from 4.52.5 to 4.59.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/rollup/rollup/releases">rollup's
releases</a>.</em></p>
<blockquote>
<h2>v4.59.0</h2>
<h2>4.59.0</h2>
<p><em>2026-02-22</em></p>
<h3>Features</h3>
<ul>
<li>Throw when the generated bundle contains paths that would leave the
output directory (<a
href="https://redirect.github.com/rollup/rollup/issues/6276">#6276</a>)</li>
</ul>
<h3>Pull Requests</h3>
<ul>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6275">#6275</a>:
Validate bundle stays within output dir (<a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
</ul>
<h2>v4.58.0</h2>
<h2>4.58.0</h2>
<p><em>2026-02-20</em></p>
<h3>Features</h3>
<ul>
<li>Also support <code>__NO_SIDE_EFFECTS__</code> annotation before
variable declarations declaring function expressions (<a
href="https://redirect.github.com/rollup/rollup/issues/6272">#6272</a>)</li>
</ul>
<h3>Pull Requests</h3>
<ul>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6256">#6256</a>:
docs: document PreRenderedChunk properties including isDynamicEntry and
isImplicitEntry (<a
href="https://github.com/njg7194"><code>@​njg7194</code></a>, <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6259">#6259</a>:
docs: Correct typo and improve sentence structure in docs for
<code>output.experimentalMinChunkSize</code> (<a
href="https://github.com/millerick"><code>@​millerick</code></a>, <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6260">#6260</a>:
fix(deps): update rust crate swc_compiler_base to v47 (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot], <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6261">#6261</a>:
fix(deps): lock file maintenance minor/patch updates (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot], <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6262">#6262</a>:
Avoid unnecessary cloning of the code string (<a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6263">#6263</a>:
fix(deps): update minor/patch updates (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot], <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6265">#6265</a>:
chore(deps): lock file maintenance (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot])</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6267">#6267</a>:
fix(deps): update minor/patch updates (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot])</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6268">#6268</a>:
chore(deps): update dependency eslint-plugin-unicorn to v63 (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot], <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6269">#6269</a>:
chore(deps): update dependency lru-cache to v11 (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot])</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6270">#6270</a>:
chore(deps): lock file maintenance (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot])</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6272">#6272</a>:
forward NO_SIDE_EFFECTS annotations to function expressions in variable
declarations (<a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
</ul>
<h2>v4.57.1</h2>
<h2>4.57.1</h2>
<p><em>2026-01-30</em></p>
<h3>Bug Fixes</h3>
<ul>
<li>Fix heap corruption issue in Windows (<a
href="https://redirect.github.com/rollup/rollup/issues/6251">#6251</a>)</li>
<li>Ensure exports of a dynamic import are fully included when called
from a try...catch (<a
href="https://redirect.github.com/rollup/rollup/issues/6254">#6254</a>)</li>
</ul>
<h3>Pull Requests</h3>
<ul>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6251">#6251</a>:
fix: Isolate and cache <code>process.report.getReport()</code> calls in
a child process for robust environment detection (<a
href="https://github.com/alan-agius4"><code>@​alan-agius4</code></a>, <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/rollup/rollup/blob/master/CHANGELOG.md">rollup's
changelog</a>.</em></p>
<blockquote>
<h2>4.59.0</h2>
<p><em>2026-02-22</em></p>
<h3>Features</h3>
<ul>
<li>Throw when the generated bundle contains paths that would leave the
output directory (<a
href="https://redirect.github.com/rollup/rollup/issues/6276">#6276</a>)</li>
</ul>
<h3>Pull Requests</h3>
<ul>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6275">#6275</a>:
Validate bundle stays within output dir (<a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
</ul>
<h2>4.58.0</h2>
<p><em>2026-02-20</em></p>
<h3>Features</h3>
<ul>
<li>Also support <code>__NO_SIDE_EFFECTS__</code> annotation before
variable declarations declaring function expressions (<a
href="https://redirect.github.com/rollup/rollup/issues/6272">#6272</a>)</li>
</ul>
<h3>Pull Requests</h3>
<ul>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6256">#6256</a>:
docs: document PreRenderedChunk properties including isDynamicEntry and
isImplicitEntry (<a
href="https://github.com/njg7194"><code>@​njg7194</code></a>, <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6259">#6259</a>:
docs: Correct typo and improve sentence structure in docs for
<code>output.experimentalMinChunkSize</code> (<a
href="https://github.com/millerick"><code>@​millerick</code></a>, <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6260">#6260</a>:
fix(deps): update rust crate swc_compiler_base to v47 (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot], <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6261">#6261</a>:
fix(deps): lock file maintenance minor/patch updates (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot], <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6262">#6262</a>:
Avoid unnecessary cloning of the code string (<a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6263">#6263</a>:
fix(deps): update minor/patch updates (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot], <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6265">#6265</a>:
chore(deps): lock file maintenance (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot])</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6267">#6267</a>:
fix(deps): update minor/patch updates (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot])</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6268">#6268</a>:
chore(deps): update dependency eslint-plugin-unicorn to v63 (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot], <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6269">#6269</a>:
chore(deps): update dependency lru-cache to v11 (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot])</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6270">#6270</a>:
chore(deps): lock file maintenance (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot])</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6272">#6272</a>:
forward NO_SIDE_EFFECTS annotations to function expressions in variable
declarations (<a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
</ul>
<h2>4.57.1</h2>
<p><em>2026-01-30</em></p>
<h3>Bug Fixes</h3>
<ul>
<li>Fix heap corruption issue in Windows (<a
href="https://redirect.github.com/rollup/rollup/issues/6251">#6251</a>)</li>
<li>Ensure exports of a dynamic import are fully included when called
from a try...catch (<a
href="https://redirect.github.com/rollup/rollup/issues/6254">#6254</a>)</li>
</ul>
<h3>Pull Requests</h3>
<ul>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6251">#6251</a>:
fix: Isolate and cache <code>process.report.getReport()</code> calls in
a child process for robust environment detection (<a
href="https://github.com/alan-agius4"><code>@​alan-agius4</code></a>, <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6252">#6252</a>:
chore(deps): update dependency lru-cache to v11 (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot])</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6253">#6253</a>:
chore(deps): lock file maintenance minor/patch updates (<a
href="https://github.com/renovate"><code>@​renovate</code></a>[bot], <a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
<li><a
href="https://redirect.github.com/rollup/rollup/pull/6254">#6254</a>:
Fully include dynamic imports in a try-catch (<a
href="https://github.com/lukastaegert"><code>@​lukastaegert</code></a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ae846957f1"><code>ae84695</code></a>
4.59.0</li>
<li><a
href="b39616e917"><code>b39616e</code></a>
Update audit-resolve</li>
<li><a
href="c60770d7aa"><code>c60770d</code></a>
Validate bundle stays within output dir (<a
href="https://redirect.github.com/rollup/rollup/issues/6275">#6275</a>)</li>
<li><a
href="33f39c1f20"><code>33f39c1</code></a>
4.58.0</li>
<li><a
href="b61c40803b"><code>b61c408</code></a>
forward NO_SIDE_EFFECTS annotations to function expressions in variable
decla...</li>
<li><a
href="7f00689ec9"><code>7f00689</code></a>
Extend agent instructions</li>
<li><a
href="e7b2b85af0"><code>e7b2b85</code></a>
chore(deps): lock file maintenance (<a
href="https://redirect.github.com/rollup/rollup/issues/6270">#6270</a>)</li>
<li><a
href="2aa5da9baf"><code>2aa5da9</code></a>
fix(deps): update minor/patch updates (<a
href="https://redirect.github.com/rollup/rollup/issues/6267">#6267</a>)</li>
<li><a
href="4319837c54"><code>4319837</code></a>
chore(deps): update dependency lru-cache to v11 (<a
href="https://redirect.github.com/rollup/rollup/issues/6269">#6269</a>)</li>
<li><a
href="c3b6b4bdc4"><code>c3b6b4b</code></a>
chore(deps): update dependency eslint-plugin-unicorn to v63 (<a
href="https://redirect.github.com/rollup/rollup/issues/6268">#6268</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/rollup/rollup/compare/v4.52.5...v4.59.0">compare
view</a></li>
</ul>
</details>
<details>
<summary>Install script changes</summary>
<p>This version modifies <code>prepare</code> script that runs during
installation. Review the package contents before updating.</p>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=rollup&package-manager=npm_and_yarn&previous-version=4.52.5&new-version=4.59.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-12 18:08:25 +02:00
dependabot[bot]
060423141d build(deps): bump minimatch in /app/vmui/packages/vmui (#10555)
Bumps and [minimatch](https://github.com/isaacs/minimatch). These
dependencies needed to be updated together.
Updates `minimatch` from 3.1.2 to 3.1.5
<details>
<summary>Commits</summary>
<ul>
<li><a
href="7bba97888a"><code>7bba978</code></a>
3.1.5</li>
<li><a
href="bd259425b2"><code>bd25942</code></a>
docs: add warning about ReDoS</li>
<li><a
href="1a9c27c757"><code>1a9c27c</code></a>
fix partial matching of globstar patterns</li>
<li><a
href="1a2e084af5"><code>1a2e084</code></a>
3.1.4</li>
<li><a
href="ae24656237"><code>ae24656</code></a>
update lockfile</li>
<li><a
href="b100374922"><code>b100374</code></a>
limit recursion for **, improve perf considerably</li>
<li><a
href="26ffeaa091"><code>26ffeaa</code></a>
lockfile update</li>
<li><a
href="9eca892a4e"><code>9eca892</code></a>
lock node version to 14</li>
<li><a
href="00c323b188"><code>00c323b</code></a>
3.1.3</li>
<li><a
href="30486b2048"><code>30486b2</code></a>
update CI matrix and actions</li>
<li>Additional commits viewable in <a
href="https://github.com/isaacs/minimatch/compare/v3.1.2...v3.1.5">compare
view</a></li>
</ul>
</details>
<br />

Updates `minimatch` from 9.0.5 to 9.0.9
<details>
<summary>Commits</summary>
<ul>
<li><a
href="7bba97888a"><code>7bba978</code></a>
3.1.5</li>
<li><a
href="bd259425b2"><code>bd25942</code></a>
docs: add warning about ReDoS</li>
<li><a
href="1a9c27c757"><code>1a9c27c</code></a>
fix partial matching of globstar patterns</li>
<li><a
href="1a2e084af5"><code>1a2e084</code></a>
3.1.4</li>
<li><a
href="ae24656237"><code>ae24656</code></a>
update lockfile</li>
<li><a
href="b100374922"><code>b100374</code></a>
limit recursion for **, improve perf considerably</li>
<li><a
href="26ffeaa091"><code>26ffeaa</code></a>
lockfile update</li>
<li><a
href="9eca892a4e"><code>9eca892</code></a>
lock node version to 14</li>
<li><a
href="00c323b188"><code>00c323b</code></a>
3.1.3</li>
<li><a
href="30486b2048"><code>30486b2</code></a>
update CI matrix and actions</li>
<li>Additional commits viewable in <a
href="https://github.com/isaacs/minimatch/compare/v3.1.2...v3.1.5">compare
view</a></li>
</ul>
</details>
<br />


Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-12 18:08:16 +02:00
dependabot[bot]
0ee16ff2e5 build(deps): bump crazy-max/ghaction-import-gpg from 6 to 7 (#10572)
Bumps
[crazy-max/ghaction-import-gpg](https://github.com/crazy-max/ghaction-import-gpg)
from 6 to 7.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/crazy-max/ghaction-import-gpg/releases">crazy-max/ghaction-import-gpg's
releases</a>.</em></p>
<blockquote>
<h2>v7.0.0</h2>
<ul>
<li>Node 24 as default runtime (requires <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Actions
Runner v2.327.1</a> or later) by <a
href="https://github.com/crazy-max"><code>@​crazy-max</code></a> in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/241">crazy-max/ghaction-import-gpg#241</a></li>
<li>Switch to ESM and update config/test wiring by <a
href="https://github.com/crazy-max"><code>@​crazy-max</code></a> in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/239">crazy-max/ghaction-import-gpg#239</a></li>
<li>Bump <code>@​actions/core</code> from 1.11.1 to 3.0.0 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/232">crazy-max/ghaction-import-gpg#232</a></li>
<li>Bump <code>@​actions/exec</code> from 1.1.1 to 3.0.0 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/242">crazy-max/ghaction-import-gpg#242</a></li>
<li>Bump brace-expansion from 1.1.11 to 1.1.12 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/221">crazy-max/ghaction-import-gpg#221</a></li>
<li>Bump minimatch from 3.1.2 to 3.1.5 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/240">crazy-max/ghaction-import-gpg#240</a></li>
<li>Bump openpgp from 6.1.0 to 6.3.0 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/233">crazy-max/ghaction-import-gpg#233</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/crazy-max/ghaction-import-gpg/compare/v6.3.0...v7.0.0">https://github.com/crazy-max/ghaction-import-gpg/compare/v6.3.0...v7.0.0</a></p>
<h2>v6.3.0</h2>
<ul>
<li>Bump openpgp from 5.11.2 to 6.1.0 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/215">crazy-max/ghaction-import-gpg#215</a></li>
<li>Bump cross-spawn from 7.0.3 to 7.0.6 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/212">crazy-max/ghaction-import-gpg#212</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/crazy-max/ghaction-import-gpg/compare/v6.2.0...v6.3.0">https://github.com/crazy-max/ghaction-import-gpg/compare/v6.2.0...v6.3.0</a></p>
<h2>v6.2.0</h2>
<ul>
<li>Bump <code>@​actions/core</code> from 1.10.1 to 1.11.1 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/209">crazy-max/ghaction-import-gpg#209</a></li>
<li>Bump braces from 3.0.2 to 3.0.3 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/203">crazy-max/ghaction-import-gpg#203</a></li>
<li>Bump ip from 2.0.0 to 2.0.1 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/196">crazy-max/ghaction-import-gpg#196</a></li>
<li>Bump micromatch from 4.0.4 to 4.0.8 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/207">crazy-max/ghaction-import-gpg#207</a></li>
<li>Bump openpgp from 5.11.0 to 5.11.2 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/205">crazy-max/ghaction-import-gpg#205</a></li>
<li>Bump tar from 6.1.14 to 6.2.1 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/198">crazy-max/ghaction-import-gpg#198</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/crazy-max/ghaction-import-gpg/compare/v6.1.0...v6.2.0">https://github.com/crazy-max/ghaction-import-gpg/compare/v6.1.0...v6.2.0</a></p>
<h2>v6.1.0</h2>
<ul>
<li>Bump <code>@​actions/core</code> from 1.10.0 to 1.10.1 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/186">crazy-max/ghaction-import-gpg#186</a></li>
<li>Bump <code>@​babel/traverse</code> from 7.17.3 to 7.23.2 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/191">crazy-max/ghaction-import-gpg#191</a></li>
<li>Bump debug from 4.1.1 to 4.3.4 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/190">crazy-max/ghaction-import-gpg#190</a></li>
<li>Bump openpgp from 5.10.1 to 5.11.0 in <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/pull/192">crazy-max/ghaction-import-gpg#192</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/crazy-max/ghaction-import-gpg/compare/v6.0.0...v6.1.0">https://github.com/crazy-max/ghaction-import-gpg/compare/v6.0.0...v6.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2dc316deee"><code>2dc316d</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/issues/242">#242</a>
from crazy-max/dependabot/npm_and_yarn/actions/exec-3...</li>
<li><a
href="5812792d2b"><code>5812792</code></a>
chore: update generated content</li>
<li><a
href="ceb906ede8"><code>ceb906e</code></a>
build(deps): bump <code>@​actions/exec</code> from 1.1.1 to 3.0.0</li>
<li><a
href="a9dffd9307"><code>a9dffd9</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/issues/241">#241</a>
from crazy-max/node24</li>
<li><a
href="36d49fcb3c"><code>36d49fc</code></a>
node 24 as default runtime</li>
<li><a
href="50c4e4f047"><code>50c4e4f</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/issues/233">#233</a>
from crazy-max/dependabot/npm_and_yarn/openpgp-6.3.0</li>
<li><a
href="c78fe49862"><code>c78fe49</code></a>
chore: update generated content</li>
<li><a
href="8dbbb1e8e5"><code>8dbbb1e</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-import-gpg/issues/221">#221</a>
from crazy-max/dependabot/npm_and_yarn/brace-expansio...</li>
<li><a
href="fc715b05fd"><code>fc715b0</code></a>
build(deps): bump openpgp from 6.1.0 to 6.3.0</li>
<li><a
href="99469162d0"><code>9946916</code></a>
build(deps): bump brace-expansion from 1.1.11 to 1.1.12</li>
<li>Additional commits viewable in <a
href="https://github.com/crazy-max/ghaction-import-gpg/compare/v6...v7">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=crazy-max/ghaction-import-gpg&package-manager=github_actions&previous-version=6&new-version=7)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-12 17:21:59 +02:00
Max Kotliar
b22853b97f lib/jwt: mark deprecated properties needed only for vmgateway 2026-03-12 16:00:23 +02:00
Max Kotliar
b578fe9817 docs: add guides for vmauth jwt authentication (#10129)
### Describe Your Changes

Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9439

Commit adds two guides:
- One sets up keyclock, vmcluster, vmauth, grafana, and demo how to log
in to grafana using OIDC and use the jwt token to limit metrics fetched
by grafana datasource from vmcluster.
- Second demo on how to configure vmagent so it gets jwt token and uses
it during remote write requests.

To see guides locally run, checkout the branch, run `make docs-debug`,
open browser `http://localhost:1313`.

vmauth jwt related PRs should be merged into
[vmauth-jwt](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/vmauth-jwt)
brench, and when everything is ready, merged into master.

Debug notes for the guides:
https://github.com/VictoriaMetrics/debug-notes/tree/main/guides/vmauth-jwt

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: Pablo Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-03-12 15:46:46 +02:00
f41gh7
e9b7adc0e5 vendor: update metrics package
Related to https://github.com/VictoriaMetrics/metrics/issues/85
2026-03-12 09:41:47 +01:00
Max Kotliar
82eab5c5b7 lib/encoding: fix integer overflow in UnmarshalBytes (#10629)
Poison varint: MaxUint64 encoded as varint (0xFFFFFFFFFFFFFFFF). 
The bounds check uint64(nSize)+n overflows to 9, bypassing the guard. 
Then int(MaxUint64)=-1 makes src[10:9] which panics.
2026-03-11 11:42:49 +01:00
Max Kotliar
5b4ab4456e lib/jwt: Verifier support jwks kid (#10611)
### Describe Your Changes

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10606

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Nikolay <nik@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-03-11 00:20:02 +02:00
Nikolay
d3ccc8d7a7 app/vmauth: remove data-race at default_url proxy
Previously there was a data-race, when targetURL was concurrently
 updated in case of default url route.

 This commit fixes data-race and adds concurrency to the routing tests.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10626
2026-03-10 21:00:16 +01:00
Fred Navruzov
eb34bdd8d9 docs/vmanomaly - release v1.29.0 (#10620)
### Describe Your Changes

Documentation updates following `v1.29.0` release of `vmanomaly`

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-03-10 19:41:46 +02:00
Roman Khavronenko
3139fa1c9b docs/vmalert: add more clarification on config reload procedure 2026-03-10 12:50:18 +01:00
Roman Khavronenko
8f4cdb8a42 app/vmauth: add request duration to access log
Request duration could be useful for tracking access logs too. For
example, track referrers for all slow requests.

While there, added tests to track log structure changes.

Related to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5936
2026-03-10 12:49:07 +01:00
andriibeee
f236801fa4 lib/promauth: support headers in oauth2 token_url requests
OAuth2 token source lib doesn't allow to define request headers explicitly.
This commit  adds a custom transport to mitigate it. New transport modifies http.Request by making a shallow copy of it and setting additional headers.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8939
2026-03-10 10:10:56 +01:00
JAYICE
2c48133ad8 lib/filestream: properly account vm_filestream_write_duration_seconds_total metric
Previously vm_filestream_write_duration_seconds_total will be increased in two places:
*  statWriter.Write()
* Writer.MustFlush(). It will eventually call statWriter.Write(), hence double counting vm_filestream_write_duration_seconds_total

For reference, vm_filestream_read_duration_seconds_total will be increased only in statReader.Read to track read syscall.

 This commit removes latency tracking from MustFlush method.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10564
2026-03-10 10:06:09 +01:00
f41gh7
1cb634858e app/vmauth: add match_claims JWT routing
This commit adds claims matching for jwt token auth.

It allows to perform match for any jwt token json field with nested traversal.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10584
2026-03-10 10:48:18 +02:00
Max Kotliar
4b45f909b5 docs/changelog: port v1.136.1 changelog to master 2026-03-09 20:32:30 +02:00
Yury Moladau
4ae495bd1d app/vmui: rename debug tools buttons for clarity
Replace ambiguous button labels such as "Submit" and "Apply" with
clearer wording to indicate that these actions only preview results and
do not modify the deployment configuration.

Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10453
2026-03-09 14:27:36 +01:00
Max Kotliar
925b0ecdc9 app/vmauth: Implement OpenID Connect Discovery support
Add support for [OpenID Connect
Discovery](https://openid.net/specs/openid-connect-discovery-1_0.html#IANA)
as an alternative way to obtain verification keys and rotate them
automatically.

`jwt` configuration should allow **exactly one** of the following
verification modes: `public_keys`, `oidc`, `skip_verify`. These options
must be mutually exclusive.

Example: OIDC configuration

```yaml
users:
- jwt:
    oidc:
      issuer: http://identity-provider.com
```

When `oidc` is enabled:

1. On startup, `vmauth` fetches:

   ```
   {issuer}/.well-known/openid-configuration
   ```
2. Extracts `jwks_uri`.
3. Fetches [JWK
keys](https://openid.net/specs/draft-jones-json-web-key-03.html#ExampleJWK)
from `jwks_uri`.
4. Uses discovered keys to verify JWT tokens.

Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10585

Failure handling:
* If discovery fails at startup:
  * No keys are available.
  * The user is skipped.
* Discovery runs periodically in background (e.g., every 1 minute).
* If keys become available later, authentication should start working
automatically.
* If keys were previously fetched and the identity provider becomes
unavailable:
  * Cached keys must be preserved.
  * Authentication continues using cached keys.

#### JWT Requirements in OIDC Mode

When `oidc` is enabled:

* `iss` claim becomes
[mandatory](https://openid.net/specs/openid-connect-core-1_0.html#IDToken).
* `iss` [must
match](https://openid.net/specs/openid-connect-core-1_0.html#RotateEncKeys):
  * `oidc.issuer` from config.
  * `issuer` returned in the OpenID configuration document.
* JWT header must contain `kid`.
* `kid` must be used to select the appropriate key from JWKS.
* Tokens without `kid` must be rejected.
* Tokens without `iss` must be rejected.

Rationale
* Enables automatic key rotation.
* Eliminates manual public key configuration.
* Maintains compatibility with standard OIDC providers.

---------

Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-03-09 14:26:23 +01:00
Ihar Statkevich
1348b0e424 vmui: use increase_pure instead of rate for histogram heatmaps
- VMUI Explore Metrics uses `rate` for histogram bucket queries, which
skips the first observation
in each bucket because `rate` requires two data points to calculate a
per-second rate.
- Replace `rate` with `increase_pure`, which assumes counters start from
0 and correctly shows
the first observation when a new bucket appears.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10365
2026-03-09 11:45:56 +01:00
Artem Fetishev
83656e544d lib/storage: remove 1 cpu special case from storage tests
The test should not fail now on systems with 1 cpu because partition
indexDBs are not rotated. See #8948.

Also removed two TODOs from the test to keep it simple.
2026-03-09 11:44:35 +01:00
Nikolay
38a76eca7b app/vmauth: reduce memory allocations for JWT token parsing
This commit adds in-memory pool for jwt tokens. It reduces memory
 allocations and GC pressure.

 Benchmark results:
```
                                         ? before_optimisation.txt ?       after_optimisation.txt        ?
                                         ?         sec/op          ?   sec/op     vs base                ?
JWTRequestHandler/full_template-10                     65.82µ ± 2%   26.87µ ± 2%  -59.18% (p=0.000 n=10)
JWTRequestHandler/token_without_claim-10               734.4n ± 1%   543.9n ± 0%  -25.94% (p=0.000 n=10)
JWTRequestHandler/expired_token-10                    1560.0n ± 0%   681.2n ± 1%  -56.33% (p=0.000 n=10)
geomean                                                4.225µ        2.151µ       -49.08%

                                         ? before_optimisation.txt ?        after_optimisation.txt        ?
                                         ?          B/op           ?     B/op      vs base                ?
JWTRequestHandler/full_template-10                    33.60Ki ± 0%   16.52Ki ± 0%  -50.85% (p=0.000 n=10)
JWTRequestHandler/token_without_claim-10              1.605Ki ± 0%   1.105Ki ± 0%  -31.14% (p=0.000 n=10)
JWTRequestHandler/expired_token-10                    3.267Ki ± 0%   1.045Ki ± 0%  -68.01% (p=0.000 n=10)
geomean                                               5.606Ki        2.672Ki       -52.34%

                                         ? before_optimisation.txt ?       after_optimisation.txt       ?
                                         ?        allocs/op        ? allocs/op   vs base                ?
JWTRequestHandler/full_template-10                      224.0 ± 0%   172.0 ± 0%  -23.21% (p=0.000 n=10)
JWTRequestHandler/token_without_claim-10                17.00 ± 0%   13.00 ± 0%  -23.53% (p=0.000 n=10)
JWTRequestHandler/expired_token-10                      30.00 ± 0%   11.00 ± 0%  -63.33% (p=0.000 n=10)
geomean                                                 48.52        29.08       -40.06%
```

follow-up for f8a101e45e

related issue
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10492
2026-03-09 11:42:51 +01:00
f41gh7
dea915c10d deployment/docker: update Go builder from Go1.26.0 to Go1.26.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.26.1%20label%3ACherryPickApproved
2026-03-09 11:39:32 +01:00
f41gh7
b3f57c113b lib/httpserver: fixes tests after 686c9a21ff 2026-03-05 16:12:29 +01:00
andriibeee
686c9a21ff lib/httpserver: handle preflight HTTP requests properly
Previously OPTIONS HTTP requests for CORS preflight checks would trigger
the original request handler. This pull request fixes that behavior to
align with https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/OPTIONS

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5563
2026-03-05 15:57:13 +01:00
Hui Wang
8f215137e7 docs: polish opentelemetry integration doc 2026-03-05 15:53:06 +01:00
Artem Fetishev
ed5dc35876 app/vmselect: Disable Graphite Tag Series HTTP endpoints (#10579)
Disabling is done by making the the handlers for `/tags/tagSeries` and
`/tags/tagMultiSeries` to return `501 (Not Implemented)` status code
along with the error message saying that the API has been disabled and
will be removed in future.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10544.


Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-05 14:27:43 +01:00
Artem Fetishev
13ab8cfb78 docs: Update docs to reflect partition index changes (#10582)
Now that indexDB is per-partition, the indexDB-related docs need to be
updated. Specifically the how the indexDB is cleaned up when it becomes
outside the `-retentionPeriod`.

Follow-up for #8134.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Signed-off-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-03-04 18:46:16 +01:00
Nikolay
f8a101e45e lib/jwt: remove memory allocation from token parsing
This commit adds `Reset()` method to the Token struct.
It allows to re-use `Token` object, which reduces memory allocations
needed for parsing `Token` and CPU pressure on GarbageCollector.

 Additionally, it adds fastjson parser, which allows efficiently perform
 claims matching based on dynamic value input.

 Benchmark stats:

```
                                         │ profiles/jwt_parse_before.txt │    profiles/jwt_parse_after.txt     │
                                         │            sec/op             │   sec/op     vs base                │
TokenParse/simple-10                                       3375.0n ± 41%   335.6n ± 4%  -90.05% (p=0.000 n=10)
TokenParse/gateway_labels_and_filters-10                   4259.0n ±  6%   423.3n ± 5%  -90.06% (p=0.000 n=10)
TokenParse/scope_as_slice_string-10                        3781.5n ±  2%   374.7n ± 5%  -90.09% (p=0.000 n=10)
TokenParse/access_claim_string-10                          2974.5n ±  1%   290.9n ± 4%  -90.22% (p=0.000 n=10)
TokenParse/vmauth_related_fields-10                        4340.5n ±  2%   389.2n ± 2%  -91.03% (p=0.000 n=10)
geomean                                                     3.709µ         359.8n       -90.30%

                                         │ profiles/jwt_parse_before.txt │       profiles/jwt_parse_after.txt        │
                                         │             B/op              │     B/op      vs base                     │
TokenParse/simple-10                                        5.195Ki ± 0%   0.000Ki ± 0%  -100.00% (p=0.000 n=10)
TokenParse/gateway_labels_and_filters-10                    6312.00 ± 0%     16.00 ± 0%   -99.75% (p=0.000 n=10)
TokenParse/scope_as_slice_string-10                         6312.00 ± 0%     16.00 ± 0%   -99.75% (p=0.000 n=10)
TokenParse/access_claim_string-10                           4.789Ki ± 0%   0.000Ki ± 0%  -100.00% (p=0.000 n=10)
TokenParse/vmauth_related_fields-10                         6.327Ki ± 0%   0.000Ki ± 0%  -100.00% (p=0.000 n=10)
geomean                                                     5.693Ki                      ?                       ¹ ²
¬π summaries must be >0 to compute geomean
² ratios must be >0 to compute geomean

                                         │ profiles/jwt_parse_before.txt │      profiles/jwt_parse_after.txt       │
                                         │           allocs/op           │ allocs/op   vs base                     │
TokenParse/simple-10                                          39.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
TokenParse/gateway_labels_and_filters-10                     53.000 ± 0%   1.000 ± 0%   -98.11% (p=0.000 n=10)
TokenParse/scope_as_slice_string-10                          54.000 ± 0%   1.000 ± 0%   -98.15% (p=0.000 n=10)
TokenParse/access_claim_string-10                             41.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
TokenParse/vmauth_related_fields-10                           57.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
geomean                                                       48.23                    ?                       ¹ ²
```

Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10492
2026-03-04 17:31:30 +01:00
Max Kotliar
a1a35fd870 .github: remove copilot instruction since we use cubic AI for code review
Copilot results were far from good, so we switched to Cubic AI.
2026-03-04 14:37:01 +02:00
Artem Fetishev
0d5df2722d lib/storage: add an apptest for Graphite tag registration (#10558)
Add an apptest for `/graphite/tags/tagSeries` and `/graphite/tags/tagMultiSeries` URLs path to test the time series registration in the index. This PR is a preparation for disabling these paths (#10544). For now just testing that they actually work as described in https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-03-04 07:43:07 +01:00
Hui Wang
db3353c6e1 app/vmalert: support negative values for the group eval_offset option
There are following main use cases for `eval_offset`:
1. To ensure rules are evaluated at an exact offset, so the results have
the exact timestamp the user wants.
2. The source data for a certain rule is delivered at a specific time
point, so rules need to be executed after that time point to get correct
results. For example, [chaining
groups](https://docs.victoriametrics.com/victoriametrics/vmalert/#chaining-groups).
3. A group contains some heavy rules that can take a few minutes to
finish. To guarantee a single evaluation can complete in time and not
delay the next run, the user may want to schedule the group to be
executed within [intervalStart, intervalEnd-avgTotalEvaluationDuration].

Negative value can be convenient for case3, as users only need to set
group `eval_offset: -avgTotalEvaluationDuration(a bigger value than the
real duration to leave some buffer would be better)`.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10424
2026-03-03 12:06:56 +01:00
Hui Wang
cfbc5ae31d dashboard: fix expressions in vmauth memory usage panel (#10574)
vmauth doesn’t use fastcache or expose `vm_cache_size_bytes`, so having
`vm_cache_size_bytes` makes the expression evaluate to null.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10574/
2026-03-03 12:06:12 +01:00
hklhai
fdb3c96fc1 app/{vmagent,vminsert}: properly attach host label for datadog-sketches
Due to bug introduced at initial datadog-sketches API implementation, `host` label was incorrectly obtained from `Tags` structure. While actually it's present directly at root of protobuf message.

 This commit properly attaches `host` label in such case.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10557
2026-03-03 12:03:31 +01:00
Max Kotliar
486d923351 docs/changelog: sync lts changelogs 2026-03-02 20:20:31 +02:00
Max Kotliar
f8552bdc96 docs: bump version to v1.137.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-03-02 16:11:54 +02:00
Max Kotliar
893c981c57 deplyoment/docker: bump version to v1.137.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-03-02 16:04:23 +02:00
Hui Wang
3d7ff783b6 vmalert: prevent a subsequent small remote write requests if the previous one takes too long
If the data flush to the remote write destination takes longer than the
periodic flush interval (default 2s), the ticker channel will contain a
stale tick, causing the ticker case to be selected too early with an
empty or small amount of data inside `wr`, resulting in a wasted remote
write request with one or two time series(if `ts, ok := <-c.input` was
also randomly selected beforehand).

We could also consider resetting the ticker after drain the stale tick
to ensure `wr` always accumulates data for the full flush interval, but
that seems more trivial to me.
2026-03-02 11:28:10 +01:00
Zakhar Bessarab
78543b7f87 lib/backup/actions: do not set s3ACL by default
Disable ACL default configuration as ACL is not always supported by
S3-compatible storages (for example, linode does not support it in some
regions). So it requires users to disable it manually to make it work.
Moreover, it is not a recommended way of objects access configuration
anymore as ACLs for buckts is disabled by default. Currently, it is
recommended to use policies for access controls. See -
https://docs.aws.amazon.com/AmazonS3/latest/userguide/about-object-ownership.html

Fixes: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10539
2026-03-02 11:25:36 +01:00
Roman Khavronenko
f54d22562a docs: add availability mark for access_log feature in vmauth (#10567)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-03-02 11:23:55 +01:00
Roman Khavronenko
b672e05dce app/vmauth: support printing access logs per user
Add new option per-user to print access logs. Such logs
contain limited amount of information to prevent exposing
sensitive data.

Access logs can be enabled/disabled via hot-reload and could
help locating clients that incorrectly use or abuse vmauth.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5936
2026-03-02 10:51:40 +01:00
Artem Fetishev
847871b916 apptest: Fix flaky tests
Cluster apptests failed from time to time with the following error:

```
timed out while waiting for inserted rows to be sent to vmstorage
cluster
```

due to incorrect calculation of inserted row count before and after
insertion. This PR fixes it by putting the "before" count calculation
before the send() operation.
2026-03-02 10:41:35 +01:00
Max Kotliar
2aecca1163 docs/changelog: fix link 2026-02-27 20:01:51 +02:00
Max Kotliar
d1efb2dd37 docs: cut release v1.137.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-02-27 19:56:51 +02:00
Max Kotliar
6882c72075 docs: update version to v1.137.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-02-27 19:18:53 +02:00
Max Kotliar
60eb543dba app/vmselect: run make vmui-update
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-02-27 18:53:43 +02:00
Max Kotliar
7db42b0659 go.mod: fix govulncheck
govulncheck ./...
=== Symbol Results ===

Vulnerability #1: GO-2026-4559
    Sending certain HTTP/2 frames can cause a server to panic in
    golang.org/x/net
  More info: https://pkg.go.dev/vuln/GO-2026-4559
  Module: golang.org/x/net
    Found in: golang.org/x/net@v0.50.0
    Fixed in: golang.org/x/net@v0.51.0
2026-02-27 14:45:32 +02:00
Hui Wang
8d924f0631 vmselect: revert rollup result cache for instant queries that contain rate function (#10553)
See reason in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10098#issuecomment-3895011084
2026-02-27 14:37:48 +02:00
Nikolay
791679253d lib/promauth: check client certificate rotation during requests
Previously, the client certificate was only refreshed during the TLS
handshake, which occurs when establishing a new connection. This meant
the remote HTTP server had to close the existing connection for the
client to pick up an updated (e.g. expired) certificate. As a
workaround, connection keep-alive could be disabled, but that
significantly increased request latency.

This commit adds a certificate check during HTTP RoundTrip. If the
client certificate has changed, the RoundTripper recreates the transport
and its connection pool. This behavior is already implemented for CA
certificate changes.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10393
2026-02-27 13:19:50 +01:00
Max Kotliar
a745bb797a docs/changelog: add update note for multitenant api endpoint 2026-02-27 13:45:04 +02:00
Artem Fetishev
3607c53b7c lib/storage: rename cache methods to match unified format (#10534)
Per @valyala's request, rename storage cache methods to adhere the
following format:

```
get[Value]By[Key]FromCache
put[Value]By[Key]ToCache
```

Also move `s.metricIDCache` methods from `indexDB` to `Storage` because
this cache exists at the `Storage` level.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-02-27 10:33:52 +01:00
John Allberg
7969647553 publish SPDX SBOM attestations for container images (#10474)
Enable BuildKit-native SPDX SBOM and provenance attestations by setting
`--sbom=true --provenance=true` in `docker buildx build` within
`publish-via-docker`.

- Set `--provenance=true --sbom=true` in `publish-via-docker` for both
Alpine and scratch variants
- Add SBOM section to SECURITY.md with inspection and Trivy scan
instructions
- Update Release-Guide.md
- Add changelog entry

Verified end-to-end: pushed test image to GHCR, confirmed SBOM
attestation via `docker buildx imagetools inspect`, and Trivy scan via
`trivy image --sbom-sources oci` succeeded (with 0 vulnerabilities :-)).

Fixes #10473 

### Checklist

The following checks are **mandatory**:

- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: John Allberg <john@ayoy.se>
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-02-27 10:50:03 +02:00
Hui Wang
5f887b66c5 docs: add a note for vmctl remote read stream mode (#10548)
Samples in Mimir (or Prometheus) are stored in chunks, which are
compressed efficiently using algorithms rather than being stored as
independent samples, see details in [this
article](https://prometheus.io/blog/2019/10/10/remote-read-meets-streaming/)
and [this talk](https://www.youtube.com/watch?v=b_pEevMAC3I).
When using a small `--remote-read-step-interval`, particularly `minute`,
a single chunk may contain samples that exceed the requested time
window, and all the returned chunks contain overlapping samples.
Consequently, vmctl will read and migrate many duplicate samples into
VictoriaMetrics.

In tests, `--remote-read-step-interval=minute
--remote-read-use-stream=true` with raw sample `scrape_interval: 10s`
and remote read time range of 24h can write ~20x duplication.
But I assume the minute interval is rarely used with a large time range
and duplicates are fine in VictoriaMetrics due to deduplication, so we
don't need to disallow using it.
```
## --remote-read-step-interval=minute --remote-read-use-stream=false
## total samples: **15696611(the real number)**
2026/02/26 22:10:25 VictoriaMetrics importer stats:
  idle duration: 50.080851955s;
  time spent while importing: 32.108903417s;
  total samples: 15696611;
  samples/s: 488855.41;
  total bytes: 735.8 MB;
  bytes/s: 22.9 MB;
  import requests: 79;
  import requests retries: 0;
2026/02/26 22:10:25 Total time: 32.112912208s

## --remote-read-step-interval=day --remote-read-use-stream=true
## total samples: 15878869
2026/02/26 22:20:37 VictoriaMetrics importer stats:
  idle duration: 960.698874ms;
  time spent while importing: 6.338309625s;
  total samples: 15878869;
  samples/s: 2505221.41;
  total bytes: 278.6 MB;
  bytes/s: 44.0 MB;
  import requests: 80;
  import requests retries: 0;
2026/02/26 22:20:37 Total time: 6.340023167s

## --remote-read-step-interval=hour --remote-read-use-stream=true
## total samples: 21824000
2026/02/26 22:13:14 VictoriaMetrics importer stats:
  idle duration: 5.238827666s;
  time spent while importing: 7.274528s;
  total samples: 21824000;
  samples/s: 3000057.19;
  total bytes: 394.4 MB;
  bytes/s: 54.2 MB;
  import requests: 110;
  import requests retries: 0;
2026/02/26 22:13:14 Total time: 7.278895084s

## --remote-read-step-interval=minute --remote-read-use-stream=true
## total samples: **353800724(353800724/15696611~22.5)**
2026/02/26 22:18:41 VictoriaMetrics importer stats:
  idle duration: 1m45.09105431s;
  time spent while importing: 1m51.716730125s;
  total samples: 353800724;
  samples/s: 3166944.86;
  total bytes: 6.8 GB;
  bytes/s: 61.3 MB;
  import requests: 1769;
  import requests retries: 0;
2026/02/26 22:18:41 Total time: 1m51.721834958s
```
2026-02-27 10:45:52 +02:00
Roman Khavronenko
d3e2946791 dashboards: remove $instance from drilldown link (#10518)
For unknown reason, $instance variable can't be passed unescaped via
dashboard link. In result, clicking on the line on panel opens a new tab
where panel fails to render.

This happens when `$instance=$__all`. The rendered link becomes
`&var-instance=.*` which then gets double-escaped in the query and
yields no result. This behavior can be verified at
https://play-grafana.victoriametrics.com/.

I've tried to properly unescape the variable using
https://grafana.com/docs/grafana/latest/visualizations/dashboards/variables/variable-syntax
but found no solution.

Hence, proposing to remove this filter from drilldown.

------------



https://github.com/user-attachments/assets/faf76d63-7739-48d7-8ce6-3d567e77003c

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: Roman Khavronenko <hagen1778@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-02-27 10:41:26 +02:00
Max Kotliar
603dc03c7d dashboards: add job\instance filters to alerts statistics dashboard (#10549)
### Describe Your Changes

Add `job` and `instance` filters to the `VictoriaMetrics - Alert
statistics` dashboard. This allows users running multiple independent
[vmalert](https://docs.victoriametrics.com/victoriametrics/vmalert/)
instances to filter and analyze alerts statistics per specific instance,
making it easier to identify issues in a particular vmalert deployment.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-27 09:41:21 +02:00
Max Kotliar
1cc471a6c1 app/vmauth: userinfo returns jwt as name (#10546)
### Describe Your Changes

Previously it would return empty string if jwt auth method is
configured. The empty string complicates reading logs.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-26 16:57:18 +02:00
Max Kotliar
d40adb1e58 docs: reorganize OpenTelemetry documentation into integrations and data-ingestion (#10520)
### Describe Your Changes

Move OpenTelemetry-related documentation under docs/integrations and
docs/data-ingestion to establish a clear, scalable structure.

As OpenTelemetry support expands, we need a dedicated place to document
protocol details, implementation specifics, and known limitations, such
as:

- Delta temporality not working with downsampling. See
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10014#issuecomment-3697509266.
- Negative histogram buckets being discarded by VictoriaMetrics. See
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9896.

The new structure separates concerns:

- `docs/integrations/` — protocol overview, implementation details, and
limitations.
- `docs/data-ingestion/` — OpenTelemetry Collector configuration and
ingestion setup.

This aligns OpenTelemetry documentation with the existing structure used
across other integrations and ingestion methods.

New pages and links preserve backward compatiblity

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-26 16:55:53 +02:00
Max Kotliar
8056806d5f docs/changelog: chore changelog 2026-02-26 14:52:53 +02:00
Roman Khavronenko
3d67942a65 Docs: add integration with bindplace (#10543)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-26 14:40:37 +02:00
Max Kotliar
23bdd14cee docs: refine vmauth jwt documentation 2026-02-26 14:38:53 +02:00
Pablo (Tomas) Fernandez
18a2955553 Docs: Update guide "Kubernetes monitoring with VictoriaMetrics Cluster" (#10410)
### Describe Your Changes

- Updated GKE version to a more current 1.34+
- Updated guide to more modern Helm and Kubectl versions
- Tested updated instructions on GKE 1.34.1-gke.3971001 (and a local k3s
instance) successfully
- Removed revision from Grafana values for helm chart (confirmed it
pulls the latest revision)
- Split the helm chart values (`guide-vmcluster-vmagent-values.yaml`)
into more readable chunks and added explanations next to each chunk
- Added and updated expected outputs. Some were missing and others were
outdated
- Updated Grafana dashboards screenshots since they changed from the
last revision
- Updated Grafana repo to use community org (old grafana chart was
deprecated
on Jan 30th -
[source](https://community.grafana.com/t/helm-repository-migration-grafana-community-charts/160983))
- Minor corrections and typo fixes. Improved flow
- Added a section at the end pointing readers where they can go next.

### Checklist

The following checks are **mandatory**:

- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: Vadim Rutkovsky <vadim@vrutkovs.eu>
2026-02-26 14:08:15 +02:00
hagen1778
570a9ef627 docs: update best recommendations for swap
* simplify wording
* add link to Grafana dashboards where they're mentioned

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-26 11:43:01 +01:00
Maxime Grenu
40e27fc2c8 docs/vmctl: fix invalid MetricsQL numeric literal in monitoring example (#10494)
## Summary

Fix an invalid MetricsQL numeric literal in the vmctl monitoring
documentation.

## Problem

The PromQL/MetricsQL example query for monitoring vm-native migration
data transfer speed used `1Mb` as a divisor:

```promql
rate(vmctl_vm_native_migration_bytes_transferred_total[5m]) / 1Mb
```

However, `Mb` is **not** a valid MetricsQL numeric suffix. According to
the [MetricsQL
documentation](https://docs.victoriametrics.com/victoriametrics/metricsql/#numeric-values):

> Numeric values can have `K`, `Ki`, `M`, `Mi`, `G`, `Gi`, `T` and `Ti`
suffixes.

The suffix `Mb` does not exist — only `M` (mega, 10^6) and `Mi` (mebi,
2^20 = 1,048,576) are valid.

## Fix

Replace `1Mb` with `1Mi` (1 mebibyte = 1,048,576 bytes), which is the
standard binary unit for memory/storage transfer measurements in
computing, and update the comment to reflect `MiB/s` instead of `MB/s`.

## Files Changed

- `docs/victoriametrics/vmctl/vmctl.md`: fixed the invalid literal `1Mb`
→ `1Mi` and updated the comment

---------

Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: Vadim Alekseev <vadimaleksv@gmail.com>
Co-authored-by: Yury Moladau <yurymolodov@gmail.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2026-02-26 11:34:25 +01:00
hagen1778
befbf9afca deployment: include alert-statistics in default dashboards
Having this dashboard by default simplifies its maintainance.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-26 11:33:26 +01:00
hagen1778
65d0a8e129 dashboards: review alert-statistics dashboard
* add meaningful description, it is required for publishin on grafana.com
* remove dependency on `victoriametrics-metrics-datasource` as it is not used

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-26 11:33:26 +01:00
Hui Wang
c2841ca36c metricsql: add function histogram_fraction()
This commit improves compatibility with promql by introducing a missing function `histogram_fraction`.
 
 histogram_fraction is a shortcut for `histogram_share(upperLe, buckets) - histogram_share(lowerLe, buckets)`

histogram_count, histogram_sum or histogram_avg will not be added to metricsQL, as they only operate on Prometheus native histogram, which doesn't have _count and _sum series like the classic histogram or Victoriametrics histogram. For classic histogram, _count and _sum series can be used directly.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5346.
2026-02-26 09:35:13 +01:00
Aliaksandr Valialkin
cd2026e430 lib/httpserver: prefer gzip over zstd compression for http responses if the client indicates it supports both methods
This is needed because some clients and proxies improperly handle zstd-compressed responses.
See https://github.com/VictoriaMetrics/victoriametrics-datasource/issues/455 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10535
2026-02-25 21:54:18 +01:00
Nikita Koptelov
216821aa1c app/vmselect: prom handler: LabelValues: decode UTF8-encoded label name
This commit enhances UTF-8 decoding for `/label//values` API by making it compatible 
with Prometheus labelName encodoing. 

 If the label is encoded according to the Prometheus UTF8 encoding scheme
(https://github.com/prometheus/proposals/blob/main/proposals/0028-utf8.md),
decode it before doing the search.

Every label value that starts with "U__" is considered to be
UTF8-encoded, according to the spec.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10446
2026-02-25 19:50:45 +01:00
Nikolay
ef507d372b lib/promscrape: reduce CPU and memory usage for originalLabels
This commit optimizes the storage of originalLabels. Previously, they
were stored as a clone of the discovered labels, which required many
small allocations and added high pressure on the garbage collector.

Now originalLabels are stored as zstd-compressed JSON ([]byte). Since
they are rarely requested, the overhead of zstd decompression and
json.Unmarshal is negligible.

This optimization reduces memory usage for storing originalLabels by 3x
and CPU usage by 2x.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9952
2026-02-25 19:47:30 +01:00
Nikolay
e383b62f59 lib/timerpool: remove misleading panic
After golang 1.23 it's safe to ignore timer.Reset True value.

According to the spec:

 For a chan-based timer created with NewTimer, as of Go 1.23,
 any receive from t.C after Reset has returned is guaranteed not
 to receive a time value corresponding to the previous timer
settings;

 If the program has not received from t.C already and the timer is
 running, Reset is guaranteed to return true.
 Before Go 1.23, the only safe way to use Reset was to call [Timer.Stop]
and explicitly drain the timer first.

 Golang 1.23 changed timer implementation from sync and async. And it
made possible that chan send and timer.Stop could happen in the same
time.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9721
2026-02-25 19:45:12 +01:00
Max Kotliar
8f34284dd2 docs: add available_from for vmauth\jwt feature
Follow-up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10499
2026-02-25 15:24:39 +02:00
Max Kotliar
8f4eca39f7 app/vmauth: implement upstream request templating based on JWT vm_access claim
For proposal and implementation check out https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10492

address review comments

* simplify placeholder logic with pre-defined data structure
* add validation helper functions
* consolidate JWT placeholders parsing logic
* slightly reduce memory allocations for query templating
* do not allow templating for client request url params

Signed-off-by: f41gh7 <nik@victoriametrics.com>
2026-02-25 14:46:51 +02:00
hagen1778
d467faf739 docs: add change lines after 673b2ca7db
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-25 11:27:53 +01:00
sias32
673b2ca7db dashboards/deployment: add links for vmalert (#10509)
### Describe Your Changes

1. Dashboard: Adding a link to an alert for quick access to it
(alert-statisticl)
2. Rules: Replace localhost with $externalURL to take the address from
the --external.url flag

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: sias32 <sias.32@yandex.ru>
2026-02-25 11:26:44 +01:00
hagen1778
40ccf0c333 app/vmalert: fix typo Minium => Minimum
Follow-up after a6200cc83d

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-25 09:28:04 +01:00
hklhai
fe341a4204 Improve Influx parsing error message when raw newline (\n) appears inside quoted fieldvmagent: Improve Influx parsing error message when raw newline (\n)… (#10524)
# Investigation & Root Cause --- InfluxDB Line Protocol Parsing with Raw
Newline (`\n`)

This document describes the investigation process and root cause
analysis for Influx Line Protocol parsing errors in VictoriaMetrics when
a **raw newline (`\n`) byte appears inside a quoted field value**.

------------------------------------------------------------------------

## Background

According to the Influx Line Protocol specification:

-   Each point must be represented as a single line.
-   The newline character (`\n`) separates points.
-   Literal newline bytes are not allowed inside quoted field values.

Therefore, any raw newline byte (`0x0A`) inside a quoted string makes
the line invalid.

------------------------------------------------------------------------

## Related Issue

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10067

------------------------------------------------------------------------

## Expected Behavior

VictoriaMetrics should reject Influx Line Protocol lines that contain a
raw newline inside a quoted field value, since this violates the
protocol specification.

The parsing failure itself is correct.

------------------------------------------------------------------------

## Actual Behavior

VictoriaMetrics rejects the line with the following error:

cannot parse field value for "...": missing closing quote for quoted
field value

While technically correct, the error message does not clearly indicate
that the root cause is a raw newline inside the quoted field value.

------------------------------------------------------------------------

## Minimal Reproducer

The issue can be reproduced without Telegraf or Jolokia:

``` bash
printf 'test value="hello
world"\n' | curl -X POST http://localhost:8428/write --data-binary @-
```

This produces:

cannot parse field value for "value": missing closing quote for quoted
field value

The failure occurs because the value contains an actual newline byte
(0x0A), not the escaped sequence `\n`.

------------------------------------------------------------------------

## Environment Setup

The issue was reproduced using the following stack:

-   VictoriaMetrics v1.127.0
-   InfluxDB 1.8
-   Spring Boot + Jolokia
-   Telegraf 1.36.2

Telegraf collects JVM `SystemProperties`, including:

``` json
"line.separator": "\n"
```

After JSON unmarshalling, this becomes a real newline byte in memory.

Detailed reproduction steps can be found here:

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10067#issuecomment-3896175100

------------------------------------------------------------------------

## Observed Serialized Line

Using breakpoint debugging in:

    lib/bytesutil/bytebuffer.go:58

The `ReadFrom` function reads and assembles an Influx line containing:

    SystemProperties.line.separator="
    ",

The quoted field contains an actual newline byte before the closing
quote.

This breaks the single-line assumption of Influx Line Protocol.

VictoriaMetrics splits on `\n`, resulting in:

-   A truncated first line
-   A missing closing quote
-   Parsing failure

------------------------------------------------------------------------

## Important Clarification

This issue is **not** caused by the escaped sequence `"\\n"`.

The failure occurs only when the serialized Influx line contains an
actual newline byte (`0x0A`) inside the quoted value.

Escaped `\n` (two characters: `\` and `n`) is valid.

------------------------------------------------------------------------

## Root Cause

-   Telegraf serializes a field containing a real newline byte.
-   Influx Line Protocol forbids literal newline characters inside
    quoted fields.
-   VictoriaMetrics correctly treats `\n` as a line separator.
-   The parser then encounters an incomplete quoted field and reports
    "missing closing quote".

The parsing behavior is correct per specification.

------------------------------------------------------------------------

## Proposed Improvement

The parsing logic should remain unchanged.

However, the error message can be improved to better indicate the root
cause.

Suggested error message:

invalid Influx line protocol: missing closing quote for quoted field
value;
this may be caused by a raw newline (`\n`) inside the quoted field value

This makes the failure immediately actionable and easier to diagnose.

------------------------------------------------------------------------

## Summary

-   The failure is caused by a raw newline byte inside a quoted field
    value.
-   This violates the Influx Line Protocol specification.
-   VictoriaMetrics correctly rejects the line.
-   The error message should explicitly mention the possibility of a raw
    newline (`\n`) inside the quoted field.

Signed-off-by: hklhai <hkhai@outlook.com>
Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
2026-02-24 20:42:43 +02:00
Max Kotliar
83ebf00659 app/vmstorage: increase min free disk space from 10M to 100M (#10529)
### Describe Your Changes

The free disk space check is not continuous but occurs periodically. In
high-load environments with large ingestion rates, the system can exceed
the remaining 10MB between checks. This can lead to a situation where
disk space is exhausted before the next check occurs, causing panic.

Increase the default value 10x to cover the case.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9561

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-24 18:06:55 +02:00
Roman Khavronenko
5e602726f5 app/vmselect: properly apply extra filters for tenant tokens for /api/v1/label/../values (#10503)
Previosly, extra filters were ignored for
`/api/v1/label/vm_account_id/values` or
`/api/v1/label/vm_project_id/values` calls. In result, even if user's
visibility was limited by applying
`?extra_filters[]={vm_account_id="1"}` param they could get the list of
all available tenants in the system.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit d2a033453e)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-24 15:42:13 +01:00
hagen1778
a6200cc83d app/vmalert: rename MiniMum => Minimum
Follow-up after a5811d3c3b

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-24 15:37:02 +01:00
Fedor Kanin
a5811d3c3b docs/vmalert: fix a typo by replacing maxiMum with maximum (#10516)
### Describe Your Changes

Fix a typo by replacing `maxiMum` with `maximum` in Markdown docs and
CLI flags help.

Resolve #10515 

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-24 15:34:41 +01:00
JAYICE
5962b47c31 document: enrich the description of buckets_limit (#10465)
### Describe Your Changes

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10417

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-24 15:33:16 +01:00
Roman Khavronenko
9a4edc738a docs: re-visit Troubleshooting docs (#10512)
* remove ToC in the beginning, as it duplicates right-bar functionality
and is easier to make a mistake with. For example, it didn't have the
ZFS section in it
* simplify wording where it was possible
* reference new tools VM got in recent releases
* re-prioritize tips order based on personal experience

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: Roman Khavronenko <hagen1778@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
2026-02-24 15:30:31 +01:00
Roman Khavronenko
30d01e9cae dashboards: filter out zero value for Major page faults panel (#10517)
Components like vmselect and vminsert rarely touch disk, so most of the
time their values are 0. Filtering out 0 values makes the panel cleaner.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-24 15:30:05 +01:00
Artem Fetishev
6b46f3920c lib/uint64set: move set un/marshal methods from Storage to uint64set (#10521)
A refactoring that moves the uint64set.Set marshaling and unmarshaling from lib/storage/storage.go to lib/uint64set. Also added function docs and tests.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-02-24 11:15:49 +01:00
Zhu Jiekun
97b11146ee flaky test: disable GC during sync.Pool test (#10523)
Disable GC when testing sync.Pool `Get` and `Put` logic, so the items in pool won't be recycled too fast.

Follow-up for 785daff65d.
2026-02-24 10:19:04 +01:00
Fred Navruzov
2ef74bd6ea docs/vmanomaly - strip bad chars from filenames (#10525)
### Describe Your Changes

Strip spaces and `=` from filenames as suggested in #10522 

now
```shellhelp
find ./docs |egrep '[ =]'
```
returns no such files

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-24 10:05:48 +02:00
Max Kotliar
845161e377 .github: Run apptests on separate pool of runners
It should prvent apptest timeouts due to runners saturation. When
apptests are run with other tests and linters they do not have enough
CPU to complete in time and often times out.

If one re-runs the apptests shortly after they are likely to pass
because the same runner has enough resources available (other job
finished).

Remove GOGC=10 as the runner has enough memory (16Gb)  to run apptests.

I did some tests and obeserve drop in overal test duration from 4.5m to
3.30-3m.
2026-02-23 14:16:40 +02:00
Vadim Rutkovsky
f176a6624a dashboards: operator dashboard should extract version from metrics (#10502)
### Describe Your Changes

Use vm_app_version to determine operator version instead of static text

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Vadim Rutkovsky <vadim@vrutkovs.eu>
2026-02-23 13:32:14 +02:00
Roman Khavronenko
4d06e34b66 docs: add dedicated opentelemetry section to docs (#10491)
The new section is supposed to contain otel related information for all
products, like VT, VM, VL.

It also supposed to be visible for readers right away, without need to
dig for info in each product.

It contains basic information and is supposed to act as a router to more
detailed info in each product.

While there, also updated VM-related otel info.


---------

Depends on
https://github.com/VictoriaMetrics/victoriametrics-datasource/pull/458

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-23 10:24:01 +01:00
Aliaksandr Valialkin
6d8ddcb9ed vendor: update github.com/valyala/fastjson from v1.6.9 to v1.6.10
This fixes the issue mentioned at https://github.com/VictoriaMetrics/VictoriaLogs/issues/1042#issuecomment-3936084518
2026-02-21 13:20:45 +01:00
Pablo (Tomas) Fernandez
dd4167709a Docs: Update guide "Getting started with VM Operator" (#10429)
### Describe Your Changes

- Add an introduction with a brief explanation of the operator and its
benefits as an intro
- Make some steps more explicit, instead of just linking to the VM
cluster guide
- Separate config/chart values files from kubectl apply (instead of
using heredoc and in-line yaml)
- Update screenshots and add figcaptions where needed
- Update Kubernetes and tools versions to newer releases
- Remove revision numbers from the Grafana config to install the latest
revision
- Added a section to configure scraping of Kubernetes resources (nodes,
pods, etc.)
- Tested updated instructions on GKE 1.33 and 1.34 (and a local k3s
instance) successfully
- Added and updated expected outputs. Some were missing and others were
outdated
- Updated Grafana dashboards screenshots since they changed from the
last revision
- Minor corrections and typo fixes. Improved flow
- Added a section at the end pointing readers to where they can go next.

### Checklist

The following checks are **mandatory**:

- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-20 22:38:11 +02:00
Pablo (Tomas) Fernandez
71e253e1f0 Docs: update guide "Headlamp Kubernetes UI and VictoriaMetrics" (#10462)
### Describe Your Changes

- Updated introduction
- Added proper steps
- Tested intructions on headlamp desktop version and the in-cluster web
ui
- Added images to guide user
- Mentioned that the test connection button does not work (it probes a
`-healthy` endpoint that is not supported by VM). The plugin still
works, it's just the test button that fails
- Added links to the single and cluster installation guides

### Checklist

The following checks are **mandatory**:

- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-02-20 22:37:59 +02:00
Pablo (Tomas) Fernandez
9e155ffd9e Docs: Update Guide "How to delete or replace metrics in VictoriaMetrics" (#10500)
### Describe Your Changes

- Rewrote the introduction
- Added list of endpoints for single node, cluster, and cloud
- Added tips for working with VictoriaMetrics running on Kubernetes
- Flushed out explanations for each step
- Added reference links for all required endpoints
- Tested every command

### Checklist

The following checks are **mandatory**:

- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-02-20 22:37:51 +02:00
Max Kotliar
2e9e40dc75 docs/changelog: add regexp example to bugfix description 2026-02-20 16:27:59 +02:00
Max Kotliar
10d4294f9b docs: tiny corrections 2026-02-20 16:21:06 +02:00
Max Kotliar
5e77771668 docs/changelog: chore changelog 2026-02-20 13:23:09 +02:00
Nikolay
dda5545078 lib/storage: properly search tenants
Commit 610b328e5a introduced a bug in the
date range search logic. If the first searched date for a given tenant
did not match, the search could proceed incorrectly.

This commit fixes the SearchTenants API by correctly advancing the date
passed to table.Seek.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10422
2026-02-20 12:03:00 +01:00
Roman Khavronenko
087efbc451 docs: clarify details on dump_request_on_errors
* add example of the produced log, so users could understand the impact;
* stress once again about sensetive data exposure when
dump_request_on_errors is enabled.
2026-02-20 11:54:09 +01:00
Roman Khavronenko
68e64536b1 app/vmauth: clarify the error message for all failed backends
This change adds some context to the error when all backend failed. From
support cases it seems like without the context users might not know
what to do with this error message. Clarification advises them to check
the prev error messages.
2026-02-20 11:53:16 +01:00
Yury Moladau
6e3ce4d55c app/vmui: fix label escaping for cardinality and autocomplete (#10498)
This PR fixes handling of label names containing special characters
(e.g. `.`, `/`, `-`).

Changes:
- Fixed escaping logic for cardinality requests.
- Fixed autocomplete insertion to escape label names in query selectors.

Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10485
2026-02-20 11:52:30 +01:00
Vadim Alekseev
8d1b88f985 lib/regexutil: prevent panic error parsing regexp: expression nests too deeply
Previously regex simplify function made an attempt to parse string representation of simplified regex.
And it could produce runtime panic due to std lib specification:

```
// Simplify returns a regexp equivalent to re but without counted repetitions
// and with various other simplifications, such as rewriting /(?:a+)+/ to /a+/.
// The resulting regexp will execute correctly but its string representation
// will not produce the same parse tree, because capturing parentheses
// may have been duplicated or removed.
```
 
 This commit ignores simplified regex parsing error and returns back original regex. 
It results into possible missing simplification of some niche regex patterns. 
But it's extremely rare cases rarely seen in production. So the tradeoff is acceptable. 

Fixes victoriaMetrics/victoriaLogs/issues/1112
2026-02-20 11:51:42 +01:00
Max Kotliar
3d3c057d52 docs: make docs-update-flags should rely on git tag (#10490)
### Describe Your Changes

As requested by @valyala changing the behvior of `make
docs-update-flags` from relying on git worktree, specific git remotes to
the git tags. Same way as `make publish-release` works.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-19 18:51:43 +02:00
Max Kotliar
94622fef29 lib/prommetadata: enable metrics metadata ingestion and storing by default (#10489)
### Describe Your Changes

Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2974

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-19 18:45:44 +02:00
Aliaksandr Valialkin
804d77ffc5 all: run go fix -reflecttypefor 2026-02-19 14:05:06 +01:00
Aliaksandr Valialkin
79b18e9742 vendor: update github.com/valyala/fastjson from v1.6.8 to v1.6.9
This should help reducing memory usage at https://github.com/VictoriaMetrics/VictoriaLogs/issues/1042
2026-02-19 13:28:41 +01:00
Benjamin Nichols-Farquhar
3404a47a6d lib/backup implement cross-type backup copies
While server side copies when using the same backup origin and
destination are always most efficient there are times when moving
between backup locations is required.

Right now vmbackup throws an error in these cases. 

While its true that a user could always do a fresh backup from a
snapshot rather than copy an old backup, this requires access to storage
data locations and a running vmstorage instance, something that is not
_generally_ required for otherwise moving backups around in remote
locations using vmbackup.

This is a small change that makes the moving of backups from one
location to another transparent to users, without having to consider if
those locations are the same or different. This both simplifies backup
migrations and unlocks using vmbackup for more complex operations.

Specifically this came up in my use case because we want to orchestrate
the down-scaling of EBS volumes backing our vmstorage cluster, which
requires some complex backup operations, one of which being taking a
backup from s3 to a local filesystem.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10401
2026-02-18 21:45:28 +01:00
Aliaksandr Valialkin
0b8205ef46 lib/httpserver: escape the error string before sending it in the response to the client
See https://github.com/VictoriaMetrics/VictoriaMetrics/security/code-scanning/353
2026-02-18 20:39:52 +01:00
Aliaksandr Valialkin
53514febdc vendor: update github.com/VictoriaMetrics/VictoriaLogs from v0.0.0-20260125191521-bc89d84cd61d to v0.0.0-20260218111324-95b48d57d032 2026-02-18 20:39:19 +01:00
Aliaksandr Valialkin
8531d86da0 lib/timeutil: avoid losing the precision at decimalExp when converting it from int64 to int
This fixes https://github.com/VictoriaMetrics/VictoriaMetrics/security/code-scanning/354
2026-02-18 20:08:47 +01:00
Aliaksandr Valialkin
a47d32e129 vendor: run make vendor-update 2026-02-18 19:46:18 +01:00
Aliaksandr Valialkin
df96f4d3ab all: run go fix -omitzero 2026-02-18 19:37:07 +01:00
Aliaksandr Valialkin
84dc5453ad all: run go fix -minmax 2026-02-18 19:24:27 +01:00
Aliaksandr Valialkin
8093d98c0e all: run go fix -newexpr 2026-02-18 19:05:59 +01:00
Aliaksandr Valialkin
809f9471df all: run go fix -fmtappendf 2026-02-18 18:21:02 +01:00
Aliaksandr Valialkin
f9d6d2e428 all: run go fix -mapsloop 2026-02-18 18:17:20 +01:00
Aliaksandr Valialkin
32eac31416 all: run go fix -slicescontains 2026-02-18 18:17:20 +01:00
Artem Fetishev
4d4c1ff72e lib/storage: shard dateMetricIDCache (#10486)
Use the same sharded implementation as in metricIDCache. The change is
basically a copy-paste. The only difference is that the rotation period
remains `1h` instead `1m` in order not to break the fix for #10064.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-02-18 18:16:37 +01:00
Aliaksandr Valialkin
645ce2b6b3 all: run go fix -slicessort 2026-02-18 15:00:56 +01:00
Aliaksandr Valialkin
89600bd229 all: run go fix -any 2026-02-18 14:58:01 +01:00
Aliaksandr Valialkin
9b3a60efee lib/protoparser/protoparserutil: read request body to chunked buffer instead of contiguous byte slice
This should reduce memory reallocations and fragmentation when reading large request bodies from slow clients.
This also should reduce memory usage a bit because of the reduced memory fragmentation.

Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/1042
2026-02-18 14:50:31 +01:00
Aliaksandr Valialkin
a8c5934d1b vendor: update github.com/VictoriaMetrics/fastcache from v1.13.2 to v1.13.3 2026-02-18 14:28:34 +01:00
Aliaksandr Valialkin
43544fdb63 vendor: update github.com/valyala/fastjson from v1.6.7 to v1.6.8 2026-02-18 14:28:33 +01:00
Aliaksandr Valialkin
7a4df5755a go.mod: update github.com/VictoriaMetrics/metrics from v1.41.1 to v1.41.2, and github.com/VictoriaMetrics/metricsql from v0.84.10 to v0.85.0 2026-02-18 14:28:33 +01:00
Aliaksandr Valialkin
83bcbc43d1 app/vmauth: consistently use for i := range N instead of for i := 0; i < N; i++ 2026-02-18 14:28:32 +01:00
Aliaksandr Valialkin
79921cf434 app/vmctl: run go fix -rangeint 2026-02-18 14:28:32 +01:00
Aliaksandr Valialkin
40402fdac3 lib: run go fix -rangeint 2026-02-18 14:28:31 +01:00
Aliaksandr Valialkin
05943abc11 lib/persistentqueue: run go fix -rangeint 2026-02-18 14:28:31 +01:00
Aliaksandr Valialkin
e66e71c87e lib/streamaggr: run go fix -rangeint 2026-02-18 14:28:30 +01:00
Aliaksandr Valialkin
7f682c4c76 lib/promscrape: run go fix -rangeint 2026-02-18 14:28:30 +01:00
Aliaksandr Valialkin
4947cd7f14 lib/encoding: run go fix -rangeint 2026-02-18 14:28:30 +01:00
Aliaksandr Valialkin
5ea7314912 lib/mergeset: run go fix -rangeint 2026-02-18 14:28:29 +01:00
Aliaksandr Valialkin
655f0e9c1d lib/storage: run go fix -rangeint 2026-02-18 14:28:29 +01:00
Aliaksandr Valialkin
2ffd25a120 apptest: run go fix -rangeint 2026-02-18 14:28:28 +01:00
Aliaksandr Valialkin
175fcf6676 app/vmalert-tool: run go fix -rangeint 2026-02-18 14:28:28 +01:00
Aliaksandr Valialkin
c05516afbe app/vmselect: run go fix -rangeint 2026-02-18 14:28:27 +01:00
Aliaksandr Valialkin
6b12684e56 app/vmauth: run go fix -rangeint 2026-02-18 14:28:27 +01:00
Aliaksandr Valialkin
8f7c94f512 app/vmalert: run go fix -rangeint 2026-02-18 14:28:26 +01:00
Aliaksandr Valialkin
4a6259a9b2 app/vmagent: run go fix -rangeint 2026-02-18 14:28:26 +01:00
Max Kotliar
d5b9d3e641 dashboards/vmauth: Add Client request buffering latency panel (#10412)
### Describe Your Changes

In https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10310 ability
to [buffer request
body](https://docs.victoriametrics.com/victoriametrics/vmauth/#request-body-buffering)
was added to `vmauth`. This PR adds a new panel `Request body buffering
latency` to `vmauth` dashboard.

Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10309

<img width="1504" height="680" alt="Screenshot 2026-02-07 at 00 28 46"
src="https://github.com/user-attachments/assets/ba98b06f-de2c-4d4c-96bb-e5c20049cebc"
/>

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Hui Wang <haley@victoriametrics.com>
2026-02-18 15:25:38 +02:00
Max Kotliar
6863de2c0e package/release: Add github-verify-release job (#10476)
### Describe Your Changes

The job ensure that:
- the draft release with given `$(TAG)` exists
- the release has excpected `$(GITHUB_ASSETS_COUNT)` number of uploaded
assets
- All the assets were uploaded succesfully.

It also adds helper job `github-get-release` which finds a draft release
by `$(TAG)` and stores into file `/tmp/vm-github-release-$(TAG)` file.

The `github-delete-release1 job is decoupled from the file produced by
`github-create-release job`. So it could be run at any time from any
machine.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-18 15:05:50 +02:00
Artem Fetishev
51a3e4e27a lib/storage: metricIDCache cache follow-up for e5c8581bad (#10468) (#10479)
This is a follow-up PR for e5c8581bad (#10468):

- Extract the bucket size into a constant and document it
- Make benchmark constant metricIDCache-specific
- Add the same benchmark for dateMetricIDCache to compare it with metricIDCache.  See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10479 for benchmark results.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-02-17 16:10:32 +01:00
Max Kotliar
d7046d6e19 go.mod: update metrics module (#10470)
### Describe Your Changes

VictoriaMetrics binaries will now expose some process-level metrics when
run on macOS.

See:
- https://github.com/VictoriaMetrics/metrics/issues/75
- https://github.com/VictoriaMetrics/metrics/pull/107

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-16 19:52:14 +02:00
Max Kotliar
7e6c03e9c6 docs/changelog: correctly place feater into tip section 2026-02-16 19:43:01 +02:00
Max Kotliar
5267f35104 app/vmauth: authenticate by jwt token (#10435)
### Describe Your Changes

Adds JWT authentication support to vmauth with signature verification
and tenant-based access control. For now, public_keys have to set
explisitly in the config, OIDC discovery will be added in upcoming PRs.

Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10445

Key Features

- JWT Configuration: Added `jwt_token` field to user config supporting
RSA/ECDSA public keys or skip_verify mode (for testing purposes).
- Token Validation: Verifies JWT signatures, checks expiration, and
extracts vm_access claims
- Compatible with vmgateway: jwt tokens issued for vmgateway should work
with vmauth too.

Examples

```yaml
users:
- jwt_token:
    public_keys:
    - |
      -----BEGIN PUBLIC KEY-----
      MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA...
      -----END PUBLIC KEY-----
  url_prefix: "http://victoria-metrics:8428/"
```

```yaml
users:
- jwt_token:
    skip_verify: true
  url_prefix: "http://victoria-metrics:8428/"
```


Constraints

- JWT tokens cannot be mixed with other auth methods (bearer_token,
username, password)
- Requires at least one public key OR skip_verify=true
- Limited to single JWT user (multiple JWT users will be supported in
the future)

Next steps
- Multiple `jwt_token` support. 
- Claim matching
- Claim based routing
- OIDC\JWKS support

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
2026-02-16 19:40:54 +02:00
Max Kotliar
172ff84299 docs: start v1.136 lts line 2026-02-16 19:18:47 +02:00
Max Kotliar
a3f955dd84 docs: bump version to v1.136.0 2026-02-16 17:43:31 +02:00
Max Kotliar
19e7d986fe deplyoment/docker: bump version to v1.136.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-02-16 17:36:53 +02:00
Max Kotliar
db2ad6f900 docs/changelog: update changelog with LTS release notes 2026-02-16 17:31:02 +02:00
Max Kotliar
db1f3f4ab8 deployment/docker: Fix publish final fips images from rc 2026-02-16 14:18:55 +02:00
Max Kotliar
7386a35942 docs/changelog: cut v1.136.0 2026-02-13 19:58:15 +02:00
Max Kotliar
6be2d89008 app/vmselect: run make vmui-update 2026-02-13 19:44:54 +02:00
Artem Fetishev
e5c8581bad lib/storage: optimize metricIDCache sharding (#10468)
Exploit uint64set data structure peculiarities (adjacent elements are
stored in
64KiB buckets) to optimize metricIDCache memory footprint.

As the result the cache utilizes 87% less memory and is up to 90%
faster. See
[benchstat.txt](https://github.com/user-attachments/files/25294076/benchstat.txt).

Follow-up for #10388 and #10346.

Thanks to @valyala for the optimization idea.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-02-13 18:29:48 +02:00
Nikolay
14bc51554b lib/storage: properly report metrics for the last partition
Previously, on the last day of a month, storage could report empty
metrics for the last partition. This could happen if a new empty
partition was created in updateNextDayMetricIDs or if time series with
future timestamps were ingested.

This commit adds a check to ensure the last partition belongs to the
current month. Since this is typically the most actively used partition,
it should be treated as the last one.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10387
2026-02-13 11:21:20 +01:00
Max Kotliar
7db81d062c docs/changelog: chore tip before release 2026-02-13 10:32:42 +02:00
f41gh7
ad62fe88ed go.mod: update metricsql
It contains fix for https://github.com/VictoriaMetrics/metricsql/issues/60

Signed-off-by: f41gh7 <nik@victoriametrics.com>
2026-02-12 23:49:54 +01:00
Artem Fetishev
40b85eb211 Makefile: rename integration-test to apptest (#10461)
Follow-up for 73015bccb9

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-02-12 19:01:16 +01:00
Roman Khavronenko
88b2464fe8 docs: simplify wording in the top section (#10451)
The purpose of the change is to make better first impression for readers
by removing all unnecessary verbosity. As with status pages, try to
increase the density of useful information.

The initial idea was borrowed from @func25

---------------

<img width="961" height="649" alt="image"
src="https://github.com/user-attachments/assets/2a91ded5-17cf-49ad-a589-45b634af991a"
/>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: Roman Khavronenko <hagen1778@gmail.com>
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-02-12 19:27:46 +02:00
Max Kotliar
e4221f97a7 docs: mention top query by memory usage
Follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10391
2026-02-12 17:54:27 +02:00
Stephan Burns
d40696a2f2 Add restarts annotation to remaining dashboards (#10439)
### Describe Your Changes

Added annotation to show restarts.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Stephan Burns <34520077+Sleuth56@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-02-12 16:39:38 +02:00
Aliaksandr Valialkin
b2a74ec494 dashboards/vm/vmauth.json: run make dashboards-sync after the commit 9774fe8df1 according to dashboards/README.md
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10437
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10438
2026-02-12 14:24:09 +01:00
Mathias Palmersheim
9774fe8df1 Change user count query so it accounts for multiple replicas of vmauth (#10438)
### Describe Your Changes

Fixes issue where multiple replicas of vmauth cause the user count to be
inflated for vmauth see #10437

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-12 14:22:22 +01:00
Artem Fetishev
efd3b66609 Makefile: make vet and golangci-lint to also check synctests
Follow-up for 3d6f353430

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-02-12 13:16:55 +01:00
Zhu Jiekun
785daff65d vminsert: proper reset labelsBuf for OpenTelemetry ingestion to avoid high memory usage
Ensure proper expansion and reset of `buf` size for OpenTelemetry
ingestion. This pull request does:
1. Flush data in `wctx` when `buf` is over 4MiB.
2. Do not return `wctx` with `buf` larger than 4MiB while the actual
in-use length is less than 1MiB to the pool.

Previously, when a small number of requests carried a large volume of
time series or labels, `buf` was over-expanded and recycled to the pool,
resulting in an excessive memory usage issue.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10378
2026-02-12 12:49:05 +01:00
Roman Khavronenko
e3a57a3d80 docs: fix the broken image for single-node (#10460)
See
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10449#issuecomment-3890326179

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-12 12:47:26 +01:00
Pablo (Tomas) Fernandez
161633158c Docs: Update guide "How to use OpenTelemetry with VictoriaMetrics and VictoriaLogs" (#10396)
This is part of the effort to upgrate and validate the [Guides in the
docs](https://docs.victoriametrics.com/guides/).

Doc page:
https://docs.victoriametrics.com/guides/getting-started-with-opentelemetry/

Functionally, nothing should change. Aside from the fix that prevented
one of the example applications to run, the rest of the commands in the
guide should be equivalent to the original.

Header anchor links do not change with this update. I added a few
headers but the existing headers anchors should remain unchanged to
prevent breaking existing links.

- Tested on a more modern version of GKE to validate it still works OK
(1.34.1-gke.3971001)
- Changed wording of some sections to improve flow and readability
- Added some missing steps/troubleshooting
- Add tips annotations for cardinality explorer and setup references to
make them stand apart form the main content
- Use `kubectl port-forward svc/...` instead of `kubecl port-forward
pod` (service selectors vs pod names) in some test commands to make
instructions simpler
- Updated OpenTelemetry version to fix error that prevented
`app.go-collector.example` sample code from running
- Replaced the "Visit these links" part in the second program (with the
fast/slow endpoints) with curl commands
- Updated the first VMUI test link to show table instead of graph while
testing OpenTelemetry ingestion (default graph view can be confusing as
there metric value for `k8s_container_ready` doesn't really show any
values)
- Minor typos, grammar check, and consistency (Kubernetes vs kubernetes,
Helm vs Helm, Collector vs collector, etc)
2026-02-12 12:47:06 +01:00
Aliaksandr Valialkin
7b708a8947 .github/workflows/test.yml: use Go version in the cache key for golangci-lint
This should fix issues like in the https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/21943547755/job/63375204688 :

    package requires newer Go version go1.26 (application built with go1.25)
2026-02-12 12:20:48 +01:00
Roman Khavronenko
16d5f281fe lib/storage: use child trace during index searches
This change only affects query trace. It correctly uses the branched
query trace in callback function, so in trace it is placed in the right
actions branch.

Bug was introduced in
c705da74f6
2026-02-12 12:19:00 +01:00
JAYICE
6846ca09cb document: add description about time-based kafka commit
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10420
2026-02-12 12:08:53 +01:00
Roman Khavronenko
71997bc754 docs: mention Perses on integrations list (#10442)
While there, attempted to simplify wording in perses doc.
2026-02-12 12:08:27 +01:00
Roman Khavronenko
2ec6fafed0 docs: add diagrams for single and cluster components (#10449)
This PR adds diagram for single-node and updates diagram for cluster
version. Both diagram go with excalidraw source attached, so they can be
updated in future.

Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10398
2026-02-12 12:08:06 +01:00
Roman Khavronenko
a8ac5dfae5 docs: excalidraw vmagent diagram
Source vmagent diagram to excalidraw, so it can be easily updated in
future.

-----------------

<img width="936" height="671" alt="image"
src="https://github.com/user-attachments/assets/1dfc9cb5-0323-4e0d-881c-3c76ccda578f"
/>

<img width="922" height="706" alt="image"
src="https://github.com/user-attachments/assets/42297ede-5986-451c-83fc-c11dba9560e3"
/>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-02-12 12:07:36 +01:00
Phuong Le
6292d5fefa ci: scope Go artifact cache restore fallback by Go version
Fixes
https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/21921172620/job/63301435721
2026-02-12 12:07:14 +01:00
Zhu Jiekun
2a09f25f78 docs: mentioning VictoriaTraces in vmalert's doc (#10457) 2026-02-12 12:06:50 +01:00
Aliaksandr Valialkin
6824ade224 lib/promscrape: follow-up for the commit 22696f378c
- Return back the check that the size of the scraped response doesn't exceed the maxScrapeSize
  at the client.ReadData(). Without this check the scraped response may be truncated to maxScrapeSize+1
  bytes, which can result in decompression error. The decompression error in this case
  hides the original errror about too big response side. This complicates troubleshooting by users.

- Stop decompressing the scraped response as soon as the decompressed response size exceeds maxScrapeSize.
  This protects from excess memory usage needed for holding the decompressed response with sizes exceeding
  the maxScrapeSize.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10320
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9481
2026-02-12 11:39:51 +01:00
Aliaksandr Valialkin
3a3c2084d3 vendor: run make vendor-update 2026-02-11 17:52:42 +01:00
Aliaksandr Valialkin
3d6f353430 deployment/docker/Makefile: update Go builder from Go1.25.7 to Go1.26.0
See https://go.dev/doc/go1.26
2026-02-11 17:35:56 +01:00
Vadim Alekseev
b1f333093b .github/workflows: use Go version from go.mod (#1092) 2026-02-11 16:10:53 +01:00
Artem Fetishev
19403b9cd1 lib/storage: use workingsetcache for tfss loops cache again (#10427)
lrucache causes huge cpu usage in some caches. See #10297.

There was a hypothesis that this was due to too short ttl in lrucache.
Setting it to 1h (the default workingsetcache eviction period) but it did not
completely eliminate the problem. The CPU utilization was not huge but still high.
See #10416.

Thus reverting back fix such deployments. This solution is temporary
because the cache consumes at least 32MB. There is one instance per
indexDB which means that if the retention is 3y then the total memory
utilized by this cache will be over 1GB and most of it will be unused.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-02-11 15:18:34 +01:00
Max Kotliar
4edff7eae2 .github: pin go version to 1.25 to fix CI (#10448)
Go1.26 has been recently released and was picked up by CI actions.

The tests and linter actions start to fail with:

GOEXPERIMENT=synctest go vet ./lib/...
go: unknown GOEXPERIMENT synctest

This happens because Go 1.26 remove synctest experiment.

Changelog:
This package was first available in Go 1.24 under GOEXPERIMENT=synctest,
with a slightly different API. The experiment has now graduated to
general availability. The old API is still present if
GOEXPERIMENT=synctest is set, but will be removed in Go 1.26.

https://go.dev/doc/go1.25#library

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-11 15:43:10 +02:00
Max Kotliar
ce4b131816 docs/changelog: cleanup after merge 2026-02-11 15:01:27 +02:00
Yury Moladau
cf69c56bb7 app/vmui: add label autocomplete context-aware by applying existing label matchers (#10399)
### Describe Your Changes

* Add context-aware label autocomplete by applying existing label
matchers (e.g. namespace/job) when fetching labels and label values.
* Update `package.json` dependencies.
* Update `vite.config.ts` to ensure correct API requests in playground
mode (`start:playground`).

Related issue: #9269

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2026-02-11 15:00:13 +02:00
JAYICE
42ec981fe9 vmui: add Queries with most memory to execute section in Top Queries page (#10391)
### Describe Your Changes

fix  https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9330

<img width="5088" height="1674" alt="image"
src="https://github.com/user-attachments/assets/4364cfae-8c56-417d-9d1c-6a219fa8802c"
/>


### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: JAYICE <1185430411@qq.com>
2026-02-11 14:53:57 +02:00
Hui Wang
35e287d740 docs: remove incorrect description on -search.logSlowQueryStats (#10447)
>Query statistics logging is enabled by default {{% available_from
"v1.129.0" %}} with a threshold of 5s.
2026-02-11 14:49:22 +02:00
Fred Navruzov
9df9a77169 docs/vmanomaly: fix-non-canonical-url-reader-docs (#10444)
### Describe Your Changes

fix non-canonical link to MetricsQL

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-11 13:06:04 +02:00
Max Kotliar
17c514d2fa docs: use canonical link if life of sample diagram 2026-02-11 12:48:06 +02:00
Max Kotliar
c12512bdd7 lib/jwt: address code review comments (#10428)
### Describe Your Changes

Addressing code revoew comments from
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10426, kept them
separate to isolate copy-paste change from follow up changes

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-10 18:57:17 +02:00
Max Kotliar
a108da8215 lib/jwt: opensource jwt library (#10426)
### Describe Your Changes

It was
[decided](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9439#issuecomment-3612299461)
that OIDC authentication in vmauth will be part of open source repo.

That requires opensourcing lib/jwt. PR does not contain any changes in
logic, just copy-paste from enterprise repository.

Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9439

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-10 18:49:23 +02:00
Aliaksandr Valialkin
4e7606f669 lib/backup/actions: properly validate the size for the last part during the restoring from backup
This issue has been found by https://www.cubic.dev/codebase-scan/7b15eebd-abc2-4604-9523-7f9bec5f67f6?violationId=324521b6-50fb-502d-8981-980bd9fd44ab
2026-02-10 15:16:21 +01:00
Aliaksandr Valialkin
060d7f6ed1 lib/protoparser/protoparserutil: limit the maximum size of the snappy-encoded data block, which can be read from the remote client
This is a follow-up for the commit 51b44afd34

This issue has been found by https://www.cubic.dev/codebase-scan/7b15eebd-abc2-4604-9523-7f9bec5f67f6?violationId=5a8fb3b7-1086-5d11-bb06-1f0864bd56ff
2026-02-10 15:03:34 +01:00
Aliaksandr Valialkin
b3c1b00e4d lib/protoparser/protoparserutil: re-use byte buffers in readUncompressedData() with the capacity up to 1MiB
The expected size of the data ingestion request body accepted by VictoriaMetrics / VictoriaLogs / VictoriaTraces
exceeds 64KiB, and is close to 1MiB. That's why it is better to re-use byte buffers with capacities up to 1MiB,
even if less than 25% of their capacity was used the last time.

This should reduce the number of GC cycles at high data ingestion rate when the request body sizes
are distributed at both sided of the 16KiB ... 64KiB range.
This is a follow-up for 09d2ce36e8

Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/1042
2026-02-10 13:04:47 +01:00
Fred Navruzov
a65f693649 docs/vmanomaly: fix iframe params (#10421)
### Describe Your Changes

fix iframe params in embedded playgrounds on /anomaly-detection/ui/ ,
anomaly-detection/quickstart/ pages

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-10 12:41:38 +02:00
Hui Wang
6285bc4179 app/vmselect: properly count vm_deduplicated_samples_total{type="select"}metric
Previously `vm_deduplicated_samples_total{type="select"}` didn't take in account identical samples.

This commit takes it in account in the same way as `vm_deduplicated_samples_total{type="merge"}` metric.

Related to  https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10384.
2026-02-10 10:23:40 +01:00
Roman Khavronenko
e89f131e34 docs: update metadata API reference across the docs
* mention support of multitenancy in metadata
* add a basic alerting rule for tracking cache utilization
* clarify cleanup policy of metadata cache
2026-02-10 10:19:19 +01:00
Roman Khavronenko
493c1d410f app/vmagent: clarify global nature of remoteWrite.label cmd-line flag
Before, by mistake, -remoteWrite.label flag was referenced in one part
of the doc as per-remoteWrite-url flag. In fact, -remoteWrite.label is
global and applies labels to all remoteWrite URLs unconditionally.

This commit tries to clarify it in docs:
* update the life-of-a-sample diagram to change the labels applying
logic
* add hint how to add a label via `extra_label`
* removes duplicated description for -remoteWrite.label flag

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10373
2026-02-10 10:18:56 +01:00
Pablo (Tomas) Fernandez
b0029ee933 Update guide/k8s-monitoring-via-vm-single (#10372)
This is the first PR on a proposed series of updates to the guides.

I started with this one because:

It's on the top ten guides according to Google Analytics
It's a good starting point for me to get familiar with VM on Kubernetes
I plan to work through the rest of the guides in the following days
(coordinating the effort with JJ).

Changelog for this guide:

- Updated GKE version to a more current 1.34+
- Updated guide to more modern Helm and Kubectl versions
- Tested updated instructions on GKE 1.34.1-gke.3971001 (and a local k3s
instance) successfully
- Removed revision from Grafana values for helm chart (confirmed it
pulls the latest revision)
- Split the helm chart values into more readable chunks and added
explanations next to each chunk
- Added and updated expected outputs. Some were missing and others were
outdated
- Updated Grafana dashboards screenshots since they changed from the
last revision
- Updated Grafana repo to use community org (old grafana chart was
deprecated
on Jan 30th -
[source](https://community.grafana.com/t/helm-repository-migration-grafana-community-charts/160983))
- Minor corrections and typo fixes
- Added a section at the end pointing readers where they can go next.
2026-02-10 10:17:58 +01:00
Alexander Frolov
97e1308386 vmselect: handle NaN values when merging blocks
`vmselect` merges samples from multiple replicas using an optimistic
deduplication path.

c7f52992e7/app/vmselect/netstorage/netstorage.go (L593-L595)

This is useful when `replicationFactor > 1`. However, identical series
containing NaN values from different replicas are treated as different
(due to `NaN != NaN`), forcing the slower fallback path unnecessarily.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10384
2026-02-10 10:16:17 +01:00
Max Kotliar
a279517034 dashboards: add source code data link to logging rate panel (#10406)
### Describe Your Changes

Add Source Code data link (link to bar or line in graph to see) that
points directly to a source code file on Github. `VictoriaMetrics -
cluster`, `VictoriaMetrics - single-node`, and `VictoriaMetrics -
vmagent` dashboards were updated. I did not add it to other panels since
they do not have Drilldown section at all.

Also, fixed a misplaced Drilldown link in `VictoriaMetrics -
single-node` dashboard.

Proxy service code is here
https://github.com/VictoriaMetrics/location2source/

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-10 10:25:46 +02:00
Fred Navruzov
f7ba76a59d docs/vmanomaly: v1.28.6-1.28.7 (#10419)
### Describe Your Changes

- Updated docs to reflect v1.28.6-v1.28.7 changes
- Fixed typos and misaligned section content
- Embedded playgrounds into documentation (data querying, vmanomaly
experiment)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-10 10:20:43 +02:00
Max Kotliar
60dbd5a97e dashboards: Rename "Concurrent flushes on disk" panel to "Concurrent inserts" (#10409)
### Describe Your Changes

The new title better aligns with the code of
[writeconcurrencylimiter](d9dabea303/lib/writeconcurrencylimiter/concurrencylimiter.go (L140)),
the panel description and the metric used in the query.

Previously, the panel title suggested that it reflected only disk write
performance. During an incident investigation, this led to a wrong
assumption that the panel was unrelated to client-side performance.

In reality, the metric [includes the full write
path](98e320842c/lib/vminsertapi/server.go (L263)):
time spent reading data from the TCP connection, processing it, and
acknowledging the block. The updated title reflects this behavior more
accurately and reduces the risk of misinterpretation during incident
analysis.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-02-09 19:40:41 +02:00
Aliaksandr Valialkin
32ddfa973b docs/victoriametrics/Single-server-VictoriaMetrics.md: add https://docs.victoriametrics.com/VictoriaMetrics.html seen in the wild according to the 404 pages report in Google Analytics 2026-02-09 16:59:26 +01:00
Aliaksandr Valialkin
d9554a3a22 docs/victoriametrics/Cluster-VictoriaMetrics.md: add https://docs.victoriametrics.com/Cluster-VictoriaMetrics/ alias seen in wild according to the 404 pages report in Google Analytics 2026-02-09 16:58:05 +01:00
Aliaksandr Valialkin
fbab6403dc docs/victoriametrics/MetricsQL.md: add https://docs.victoriametrics.com/MetricsQL/ alias seen in wild according to the 404 pages report in Google Analytics 2026-02-09 16:56:49 +01:00
Jayice
07dd79608b app/vmselect: align graphite render API process timeout to query deadline
Previosly the error returned on timeout suggested a memory leak, which
could confuse a user. In reality timeout could happen if vmselect is
overloaded or the query takes a lot of time to process. The commit
aligns rss. RunParallel with query deadline set either via flag
`-search.maxQueryDuration` or the `timeout` query argument. The logged
warn message is adjusted to suggest resource increase or timeout
increase.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8484

Signed-off-by: JAYICE <jayice.zhou@qq.com>
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
2026-02-09 14:03:26 +02:00
Artem Fetishev
5915c57b46 docs/changelog: add known issue to v1.132.0 release notes
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-02-09 11:03:19 +01:00
JAYICE
f36e1857c0 app/vmagent: improve kafka consumer performance
Previously, the Kafka consumer in vmagent committed offsets per message
(manual commit). At high message rates, this could overload the commit
path (coordinator, __consumer_offsets topic, and network).

This commit introduces time-based manual commits with a controlled window:
* enable.auto.commit remains false by default.
* After a successful TryPush (data accepted into the buffer before the
  vmagent queue/backend), vmagent adds the message to pending offsets.
* Offsets are committed periodically (every second), as well as during
  shutdown and partition rebalance.

This keeps the commit point tied to TryPush (stronger guarantees than
auto-commit) while significantly reducing commit QPS.

Auto-commit is also time-based, but it advances offsets based on poll()
delivery rather than application-level processing. This means offsets
may be committed before data is actually accepted by the vmagent
pipeline, slightly increasing the risk of data loss on crash or restart.

This change does not make the Kafka consumer fully transactional
end-to-end. Buffers in vmagent/vminsert/vmstorage still imply possible
data loss on hard stops. However, it provides stronger guarantees than
auto-commit, since commits are based on TryPush rather than poll().

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10395
2026-02-06 13:17:14 +01:00
Aliaksandr Valialkin
04f4a28cf4 docs/victoriametrics/integrations/zabbixconnector.md: add an alias - https://docs.victoriametrics.com/victoriametrics/integrations/zabbix/ - seen in the Internet
Visits to this page are seen in Google Analytics reports.
2026-02-05 23:52:08 +01:00
Aliaksandr Valialkin
7f3d370244 deployment/docker: update base Alpine Docker image from 3.23.2 to 3.23.3
See https://www.alpinelinux.org/posts/Alpine-3.20.9-3.21.6-3.22.3-3.23.3-released.html
2026-02-05 19:48:54 +01:00
Aliaksandr Valialkin
c89b7f7ad5 deployment/docker: update Go builder from Go1.25.6 to Go1.25.7
See https://github.com/golang/go/issues?q=milestone%3AGo1.25.7%20label%3ACherryPickApproved
2026-02-05 19:46:56 +01:00
Aliaksandr Valialkin
d9dabea303 docs/victoriametrics: add links on how to tune VictoriaMetrics for IoT and industrial monitoring cases with low churn rate for time series
The link is https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#index-tuning-for-low-churn-rate
Put this link to the docs which mention IoT and industrial monitoring, so users could figure out
how to optimize VictoriaMetrics for these cases.
2026-02-05 17:23:10 +01:00
Aliaksandr Valialkin
09d2ce36e8 lib/protoparser/protoparserutil: do not store byte slices with more than 75% of unused space in the pool
Keeping such byte slices in the pool may increase memory usage when processing a small share of requests
with much bigger sizes than the average processed request.

This should help reducing memory usage at https://github.com/VictoriaMetrics/VictoriaLogs/issues/1042
2026-02-04 15:31:21 +01:00
Max Kotliar
08755c838b docs: update changelog with LTS release notes 2026-02-02 18:45:07 +02:00
Max Kotliar
d2e438ef41 docs: bump version to v1.135.0 2026-02-02 18:38:03 +02:00
Max Kotliar
e508fa5fe2 deplyoment/docker: bump version to v1.135.0 2026-02-02 18:27:28 +02:00
f41gh7
9a7deca207 follow-up for 60cadfbad1
Respect the default value of http.DefaultTransport.Proxy. Previously,
it could be unintentionally overridden with a nil value.

This commit aligns Proxy configuration across all created transports.
2026-02-02 16:39:52 +01:00
Zane DeGraffenried
60cadfbad1 lib/promauth: fix oauth http client overwriting default proxy with nil
Previously, default `Proxy` was unconditionally replaced with config value, which could be nil. 
It made impossible to use  default http client proxy env variables.

This commit adds check in oauth http client builder that only overwrites the
transport proxy if a custom proxy url function is defined.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10385
2026-02-02 15:39:46 +01:00
Vadim Alekseev
b36c8b1110 app/vminsert/common: reduce allocations when writing metadata
Bug was introduced at 5a587f2006, while porting change from cluster branch.

This commit properlyslice `mms
[]metricsmetadata.Row` slice . Previously, every WriteMetadata call triggered a
slice allocation.
This shouldn't significantly impact overall performance, so I haven't
included benchmarks.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10392
2026-02-02 15:36:33 +01:00
Nikolay
90f0405b11 lib/promscrape: properly expose kubernetes_sd dialer metrics (#10381)
Commit 35b31f904d introduced a bug, where
dialer metrics for Kubernetes discovery were overwritten.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10382
2026-02-02 14:47:50 +01:00
Nikolay
eac0a7ed86 docs: mention downsampling export API behavior
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10326
2026-02-02 14:44:40 +01:00
Artem Fetishev
a8a99105b1 lib/storage: reduce number of shards in metricIDCache (#10388)
This should reduce cpu utilization while still removing the storage
connection saturation.

Follow-up for 6bc809813b (#10346)

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-02-01 18:13:44 +01:00
Max Kotliar
c7f52992e7 docs/changelog: cut v1.135.0 2026-01-30 14:12:05 +02:00
Max Kotliar
5fe14e5479 docs: run make docs-update-flags 2026-01-30 14:09:42 +02:00
Max Kotliar
c7ef079eba docs: run make docs-update-flags 2026-01-30 14:04:39 +02:00
Max Kotliar
424d007a39 app/vmselect: run make vmui-update 2026-01-30 13:58:57 +02:00
Zakhar Bessarab
ad4562cd56 lib/pushmetrics: allow enabling push metrics via config
This is needed in order to allow using lib/pushmetrics for vmctl as it does not use go native flags.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

app/vmctl: add metrics for the migrations

- add flags to allow setting up metrics push
- add metrics to track progress of the migration for all modes
- add metrics for generic backoff and limiter packages

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2026-01-30 13:02:13 +02:00
f41gh7
f9895d7e5e follow-up for a2271284
Remove duplicate line at app/vmui/Makefile
2026-01-30 11:28:50 +01:00
Andrei Baidarov
6bc809813b lib/storage: shard metricIdCache
The current implementation has a bottleneck – a single mutex to access
`prev`/`next` metric sets. Each rotation results in storage utilization
spikes since lock-free `curr` is almost empty, and cache needs to
promote metrics from `prev` to `next`.

This is an attempt to reduce contention by spliting cache into separate
shards.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10367
2026-01-30 11:19:40 +01:00
Hui Wang
9b40fd00e0 app/vmalert: do not skip sending alert notifications to -notifier.url if remote write requests fail
Note: remote write request won't fail immediately if `-remoteWrite.url`
is unreachable, as vmalert maintains a remote write queue (with capacity
controlled by `-remoteWrite.maxQueueSize(default 1e5)`) and uses a
separate process to batch and push queued data.

vmalert uses error group to print error messages associated with a
single group together, which should assist the group owner in reviewing
relevant error messages.
With this pull request, the error message would be like:
```
2026-01-30T08:26:46.641Z	error	app/vmalert/rule/group.go:395	group "group2": errors(3): 
rule "rule1": remote write failure: failed to push timeseries - queue is full (1 entries). Queue size is controlled by -remoteWrite.maxQueueSize flag
rule "rule1": notifier failure: failed to send alerts to addr "http://non-existing-alertmanager-1/api/v2/alerts": invalid SC 502 from "http://non-existing-alertmanager-1/api/v2/alerts"; response body: 
rule "rule1": notifier failure: failed to send alerts to addr "http://non-existing-alertmanager-2/api/v2/alerts": invalid SC 502 from "http://non-existing-alertmanager-2/api/v2/alerts"; response body: 
2026-01-30T08:26:46.641Z	error	app/vmalert/rule/group.go:395	group "group2": errors(3): 
rule "rule2": remote write failure: failed to push timeseries - queue is full (1 entries). Queue size is controlled by -remoteWrite.maxQueueSize flag
rule "rule2": notifier failure: failed to send alerts to addr "http://non-existing-alertmanager-2/api/v2/alerts": invalid SC 502 from "http://non-existing-alertmanager-2/api/v2/alerts"; response body: 
rule "rule2": notifier failure: failed to send alerts to addr "http://non-existing-alertmanager-1/api/v2/alerts": invalid SC 502 from "http://non-existing-alertmanager-1/api/v2/alerts"; response body: 
2026-01-30T08:26:52.229Z	error	app/vmalert/rule/group.go:395	group "group1": errors(3): 
rule "rule1": remote write failure: failed to push timeseries - queue is full (1 entries). Queue size is controlled by -remoteWrite.maxQueueSize flag
rule "rule1": notifier failure: failed to send alerts to addr "http://non-existing-alertmanager-1/api/v2/alerts": invalid SC 502 from "http://non-existing-alertmanager-1/api/v2/alerts"; response body: 
rule "rule1": notifier failure: failed to send alerts to addr "http://non-existing-alertmanager-2/api/v2/alerts": invalid SC 502 from "http://non-existing-alertmanager-2/api/v2/alerts"; response body: 
2026-01-30T08:26:52.229Z	error	app/vmalert/rule/group.go:395	group "group1": errors(3): 
rule "rule2": remote write failure: failed to push timeseries - queue is full (1 entries). Queue size is controlled by -remoteWrite.maxQueueSize flag
rule "rule2": notifier failure: failed to send alerts to addr "http://non-existing-alertmanager-2/api/v2/alerts": invalid SC 502 from "http://non-existing-alertmanager-2/api/v2/alerts"; response body: 
rule "rule2": notifier failure: failed to send alerts to addr "http://non-existing-alertmanager-1/api/v2/alerts": invalid SC 502 from "http://non-existing-alertmanager-1/api/v2/alerts"; response body: 
```

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10376
2026-01-30 11:14:27 +01:00
JAYICE
9d59a31290 expose topN average memory bytes consumption queries in /api/v1/status/top_queries (#10350)
### Describe Your Changes

part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9330

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: JAYICE <1185430411@qq.com>
2026-01-30 10:47:31 +02:00
Zakhar Bessarab
8391be18be app/vmbackupmanager: allow disabling scheduled backups
This commit adds a new flag `disableScheduledBackups` for `vmbackupmanager. Which disables any scheduled backups. It could be useful to keep vmbackupmanager running and serving API calls only.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10364
2026-01-29 13:46:00 +01:00
Vadim Rutkovsky
8feb8c17aa docs: update examples and documentation after nodes/proxy permission removed
Updated helm-charts and operators no longer come with nodes/proxy
permissions for vmagent/vmsingle roles. In the examples using kubelet's
proxy endpoint we should explicitly create ClusterRoles /
ClusterRoleBinding to grant access.

See https://github.com/VictoriaMetrics/operator/pull/1754 and
https://github.com/VictoriaMetrics/helm-charts/pull/2676

Ref: https://github.com/VictoriaMetrics/operator/issues/1753
2026-01-29 13:20:12 +01:00
Hui Wang
634b4d035d app/vmalert: ensure alert restore retrieve the correct previous alert state if the group takes long time to evaluate
The new `ALERTS_FOR_STATE` may be retrieved during restore when:
1. a group contains multiple heavy rules, alerting rule A may have
already been executed and its state metrics successfully uploaded to the
datasource by the time all rules within the group have finished
executing;
2. the datasource makes data queryable very quickly, for instance, when
users configure a small value for `-search.latencyOffset`.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10335
2026-01-29 13:18:32 +01:00
Hui Wang
1db7597e45 vmalert: disallow setting the -notifier.url command-line flag to a null value
Previously, running a vmalert with an empty notifier.url does not produce an error and leads to vmalert which will never send a notification successfully.

 This commit properly validates notifier.url empty value.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10355
2026-01-28 14:09:18 +01:00
Artem Fetishev
23fe7db35c lib/storage: follow-up for making searchAndMerge profile-friendly
Follow-up for c705da74f6
2026-01-28 14:07:26 +01:00
Hui Wang
817f2dc9e7 app/vmselect/promql: fix gaps at changes() functions
After changing the scrape interval from a smaller value (e.g., 30s) to a larger value (e.g., 60s), the changes() function starts to yield non-zero values even when the underlying values have not changed.

 This commit keeps unchanged series values when a large gap occurs between samples or when the scrape interval decreases.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10280
2026-01-28 14:06:51 +01:00
Max Kotliar
731ba17962 docs: Update vmctl flags in docs with a command (#10357)
### Describe Your Changes

The commit extends make docs-update-flags command so it updates vmctl
flags as well. It creates one md file with global flags and several
files per supported mode.


### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-28 14:12:45 +02:00
Max Kotliar
bb163692ba docs: add avilable_from to request body buffering vmauth doc
Follow-up for
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10310 and
e31abfc25c
2026-01-28 12:50:25 +02:00
Nikolay
952ef51cd1 lib/fs: properly check for partially deleted directories (#10342)
Commit 83da33d8cf introduced a check to
detect directories partially removed via IsPartiallyRemovedDir.

However, the check was performed using the full path, while de.Name()
returns only the current entry name (without the path). As a result, the
check always succeeded and the function did not behave as intended.
2026-01-28 10:30:35 +01:00
Nikolay
1fc548b63a lib/fs: add fs.disableMincore flag
This flag allows disabling the mincore() syscall introduced in
50fc48ac47. On older ZFS filesystems,
mincore() may trigger a bug related to ZFSÕs own in-memory cache. Mixing
reads from mmap()ed files and direct disk reads can corrupt the ZFS ARC
cache and lead to data read corruption.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10327
2026-01-27 20:29:01 +01:00
Nikolay
aa5236877c lib/storage: properly aggregate per IndexedDB cache stats
Commit f62893c151 added an attempt to fix
stats for `tagFiltersCache`, `metricIDCache`, and `dateMetricIDCache`.
Instead of aggregated stats, it returned the largest cache stats by
cache size.

This resulted in possible counter decreases for counter metric types. It
made aggregated metrics less usable.

This commit changes cache stats aggregation by metric type:
* size-related gauge metrics are returned based on max cache size usage
* metric counters are reported as a sum of all counters

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10275
2026-01-27 20:27:41 +01:00
Artem Fetishev
c705da74f6 lib/storage: make pt and legacy idbs visible in golang profiles
Rewrite the searchAndMerge so that golang profiles could show exactly
how much resources is consumed by each idb type.
2026-01-27 20:26:39 +01:00
Aliaksandr Valialkin
2e9bda2bff lib/{mergeset,storage}: add a comment explaining why the strange construct with anonymous function is needed
This is a follow-up for the commit 2a0e382a99

Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/1020
2026-01-27 19:44:49 +01:00
Jiekun
e1413536fc chore: add build version information to the home page for consistency with other projects
The build version added to:
- victoria-metrics
- vmagent
- vmalert

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10249

Co-authored-by: Hui Wang <haley@victoriametrics.com>
Signed-off-by: Zhu Jiekun <jiekun@victoriametrics.com>
2026-01-27 18:28:15 +02:00
Jayice
1a438a04ba introduce new alert for vmagent persistenqueue capacity 2026-01-27 18:14:02 +02:00
Aliaksandr Valialkin
4ad47d6fe3 docs/victoriametrics/README.md: remove obsolete docs about staleness markers during deduplication after the commit 7bd5d19f62
Staleness markers are ignored on the deduplication interval if there are other numeric samples exist on that interval.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10196
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5587
2026-01-27 16:08:45 +01:00
Aliaksandr Valialkin
879443f915 lib/storage/dedup.go: remove obsolete comment from DeduplicateSamples - it doesnt keep stale NaNs on purpose after the commit 7bd5d19f62
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5587
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10196
2026-01-27 16:08:44 +01:00
Max Kotliar
bd6788cb8f docs/changelog: fix ordering after merging pr.
related pr https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10320
2026-01-27 16:37:49 +02:00
Jayice
22696f378c lib/promscrape: apply promscrape.maxScrapeSize to decompressed data
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9481
2026-01-27 16:30:38 +02:00
Artur Minchukou
7205f479aa app/vmui: fix build of vmui by handling playground env variable correctly (#10354)
### Describe Your Changes

Fixed build of vmui by handling playground env variable correctly.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-27 16:24:18 +02:00
Yury Moladau
5c7031c000 vmui: fix "Percentage from total" for multiple metrics in Cardinality Explorer (#10323)
### Describe Your Changes

In the Cardinality Explorer, when filtering, a "Percentage from total"
stat appears. This stat is documented as "the share of these series in
the total number of time series".

This works for pages for individual metrics. However, if using a filter
that returns *multiple* metrics, the value of "Percentage from total"
will only account for the size of the *first* metric. One can have a
filter that returns, say, 10k time series (out of, say, 100k in the VM
cluster), and if the first metric returned has 1k time series, then
"Percentage from total" will show 1%, not 10%.

This PR fixes that calculation.

Credits to @PleasingFungus for the original fix (PR #10288).

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: PleasingFungus <PleasingFungus@users.noreply.github.com>
2026-01-27 15:37:34 +02:00
Artur Minchukou
a227128467 app/vmui: move node from ci to docker and update build steps (#10299)
### Describe Your Changes

Moved node from CI to make command and update build steps.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-27 15:23:35 +02:00
Nikolay
777c8913b3 follow-up after e35a9a366c
Commit e35a9a366c changed the order of wg.Add calls in the Graphite transform package. Previously, all wg.Add calls were made upfront, but after that change it became possible for wg.Wait to exit earlier than expected.

This commit fixes the issue by spawning all background goroutines first and starting the goroutine that calls wg.Wait afterward.
2026-01-27 13:50:07 +01:00
Aliaksandr Valialkin
e35a9a366c all: consistently use sync.WaitGroup.Go() instead of sync.WaitGroup.Add(1) + sync.WaitGroup.Done()
This improves code readability a bit.
2026-01-27 00:29:47 +01:00
JAYICE
6bbc03ecf8 app/vmagent: support configuring different -remoteWrite-queues per url
Previously vmagent had remoteWrite.queues as a global setting that was be applied to every persistentqueue. However, it could be useful to specify remotewrite.queues per remotewrite.url.

Considering each rw might have different workload(latency, throughput, and availability), so it will be more flexible for tuning if we can set remoteWrite.queues separately for specific rw.

This commit, makes `-remoteWrite-queues` configurable per remoteWrite.url. 

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10270
2026-01-26 20:09:35 +01:00
Max Kotliar
ca34ae48b4 docs/changelog: chore changelog
- rename `these docs` link to a more explisit link
- Add thank you for contribution.
2026-01-26 18:45:04 +02:00
Max Kotliar
f18fd37433 docs: run make docs-update-flags 2026-01-26 18:43:35 +02:00
Zhu Jiekun
f191a052dc lib/promscrape: ceiling the last scrape size
ceiling the last scrape size as an integer in bytes or kilobytes to
avoid misleading dots.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10307
2026-01-26 12:46:24 +01:00
Max Kotliar
0fdd5cb435 app/vmauth: fix backend healthcheck for url prefixes defined inside url_map
Previously health checks for url prefixes defined inside `url_map` were
not properly stopped. See STR in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10334#issuecomment-3791401822

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10334
2026-01-26 11:47:43 +01:00
f41gh7
76dd8f4adb lib/storage: properly search searchTenantsOnDate
Initial implementation of searchTenantsOnDate used a index scan for the given prefix (index prefix + tenant + date).
It did not check whether the date prefix was actually outside the current date.

This commit adds the missing date check and makes the tenant search results accurate.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10295
2026-01-26 11:35:27 +01:00
Aliaksandr Valialkin
e31abfc25c app/vmauth: allow buffering request body before proxying it to the backend
This should help reducing load on backends when many concurrent clients
send requests over slow networks (for example, when many IoT devices send metrics
to vmauth over slow connections).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10309

This commit is based on top of https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10310
Thanks to @makasim for the initial idea.
2026-01-26 03:02:32 +01:00
Aliaksandr Valialkin
ac6d9d632f app/vmauth: properly increment vmauth_user_concurrent_requests_limit_reached_total and vmauth_unauthorized_user_concurrent_requests_limit_reached_total metrics when the request is rejected because of the concurrency limit
These metrics must be incremented when the request couldn't be processed because of the configured per-user concurrency limit.
The commit 76176ac1d3 moved the counter increase to the place when the current request
is put in the wait queue because of the concurrency limit is reached. This is incorrect, since such requests
can still be successfully processed during -maxQueueDuration . This also contradicts the docs at https://docs.victoriametrics.com/victoriametrics/vmauth/#concurrency-limiting

There is a small practical sense in counting the number of times the concurrency limit is reached,
while the request is successfully processed during the -maxQueueDuration after that.

Add missing alerting rule for rejected unauthorized requests because of the concurrency limit.

Add missing grouping by instance for per-user counter of rejected queries because of the concurrency limit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
2026-01-25 21:43:44 +01:00
Aliaksandr Valialkin
e43de2a2b3 app/vmauth: put comments into the correct places after the commit 5f67f04f6b 2026-01-25 21:19:01 +01:00
Aliaksandr Valialkin
efe4a3b2dd vendor: update github.com/VictoriaMetrics/VictoriaLogs from v1.36.2-0.20251008164716-21c0fb3de84d to v0.0.0-20260125191521-bc89d84cd61d 2026-01-25 20:24:04 +01:00
Aliaksandr Valialkin
3bf5f0297b LICENSE: update the end copyright year from 2025 to 2026 2026-01-25 20:14:07 +01:00
Aliaksandr Valialkin
5632ccc64a lib/logger: count both printed and suppressed logs at vm_log_messages_total metric
This simplifies troubleshooting by investigating the vm_log_messages_total metric
when logs are unavailable. The logs may be unavailable when the -loggerLevel command-line
flag is set to value other than INFO. The logs may be unavailable when clients
use Monitoring of Monitoring service ( https://victoriametrics.com/products/mom/ ),
which provides metrics, but doesn't provide logs from VictoriaMetrics components
running at the client side.

Add `is_printed` label to the `vm_log_messages_total` metric in order to detect whether
the given log has been suppressed or printed.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10304

While at it, make more readable the description for the TooManyLogs alert,
which is based on the vm_log_messages_total metric.
Also return back the `level!="info"` instead of `level="error"` filter
in the query for this alerting rule, in order to be consistent with queries
at the official dashboards for VictoriaMetrics components.
TODO: investigate too high warnings rate at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2760
and fix it at the source of these warnings instead of modifying the query
for the TooManyLogs alert.
2026-01-25 17:43:20 +01:00
Nikolay
446452857c lib/storage: tsdb stats fallback to legacy idb
Add fallback to legacy indexDB for stats search

After introducing the new partition index
(f97f627f79), storage stopped returning
stats for date ranges outside the partition index. This made the
migration backward incompatible, as there was no way to retrieve stats
for dates prior to the migration.

This change adds a fallback to the legacy indexDB search when the status
search on the current partition index returns zero series.

This is an imperfect solution: due to tag filters, the TSDB status
search may legitimately return empty results. However, the additional
overhead is small and acceptable.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10315
2026-01-23 16:43:13 +01:00
Max Kotliar
4ff409eb27 docs/changelog: mention already fixed bug fix in vmauth
The bug was fixed in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10233 before we
realized it was a bug, at that time we considered it as improvement - do
less retries. But later in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10318 we
realized that we actually fixed a bug.

Adding postfactum a record about bugfix to the changelog
2026-01-22 16:57:58 +02:00
Phuong Le
1b7f0172d2 fsutil: fix a typo related to default concurrent goroutines working with files
s/265/256
2026-01-20 21:46:38 +01:00
Andrii Chubatiuk
1c77ee9527 app/vmui: removed anomaly ui (#10316)
The vmanomaly has been moved to a separate repository. This means that the functionality related to vmanomaly is no longer needed in the app/vmui located in the VictoriaMetrics repository.

This commit removes all the functionality and unnecessary abstractions related to vmanomaly from the app/vmui repository. This should help improving long-term maintenance of the code.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9755
2026-01-20 21:46:09 +01:00
Phuong Le
2a0e382a99 lib/storage, lib/mergeset: avoid deadlock on panic while merging
Related to
https://github.com/VictoriaMetrics/VictoriaLogs/issues/1020#issuecomment-3763912067
2026-01-20 21:43:12 +01:00
Max Kotliar
02c8ea5a48 docs/changelog: fix typo in security upgrade 2026-01-20 21:53:52 +02:00
Aliaksandr Valialkin
34f242a6b8 vendor: run make vendor-update 2026-01-19 15:29:25 +01:00
f41gh7
bc8f6c5688 docs: point examples to the v1.134.0 release 2026-01-19 14:28:00 +01:00
f41gh7
c0fe67c2db docs: cut LTS releases v1.110.28 and v1.122.13
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2026-01-19 14:26:36 +01:00
Fred Navruzov
ede1c2cde9 docs/vmanomaly: release v1.28.5 (#10311)
### Describe Your Changes

- Adjusted vmanomaly docs for v1.28.5
- Added missing `server` page at /anomaly-detection/components/server

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-17 21:52:48 +02:00
Aliaksandr Valialkin
ad34a5eb53 lib/protoparser/protoparserutil: reduce memory usage in ReadUncompressedData() when processing big number of incoming connections
Wait for the first byte from the reader passed to ReadUncompressedData()
before obtaining concurrency token from -maxConcurrentInserts and before allocating
buffers needed for reading the request body in memory.
This should limit the amounts of memory needed for processing a big number of concurrent
HTTP requests via Prometheus remote_write protocol and via other HTTP-based data ingestion
protocols where every request contains a single block of data to process.
Now the maximum memory usage is limited by -maxConcurrentInserts, while the server
can process much more than -maxConcurrentInserts concurrent HTTP requests by pausing the excess requests.

Previously the memory usage wasn't limited by -maxConcurrentInserts, since buffers for reading the data
from concurrent connections were allocated before obtaining the concurrency token from -maxConcurrentInserts.

While at it, use protoparserutil.ReadUncompressedData() in lib/protoparser/promremotewrite/stream.Parse()
for the sake of consistency across parsers for protocols, which send the full block of data per every incoming HTTP request.

This is a follow-up for the commit d107dee9c7
2026-01-17 15:49:53 +01:00
f41gh7
eaf7a68c92 CHANGELOG.md: cut v1.134.0 release 2026-01-16 20:49:31 +01:00
Max Kotliar
c5e43e1c91 docs: use canonical link 2026-01-16 19:09:49 +02:00
f41gh7
b343f541f0 make vmui-update 2026-01-16 16:46:26 +01:00
f41gh7
a23a902953 deployment: update Go builder from v1.25.5 to v1.25.6
See https://github.com/golang/go/issues?q=milestone%3AGo1.25.6%20label%3ACherryPickApproved
2026-01-16 16:26:27 +01:00
Hui Wang
54c60706ca lib/streamaggr: prefer numerical values over stale markers when sample share the same timestamp during deduplication (#10300)
follow up
7bd5d19f62,
apply the same logic in stream aggregation.
2026-01-16 16:14:09 +01:00
Nikolay
cd2e11b7cf lib/storage: increase rotation time for daily metricID cache
This is follow-up for c5713a09d3

Originally, dateMetricID cache was fully rotate every 20 minutes. It
made daily-index pre-creation less efficient and caused CPU usage spikes
for index records lookup at midnight.

storage pre-fills index records for the next day in 1 hour before night.
But this rotation made only last 20 minutes before midnight visible in
the cache.

 This commit changes rotation period from 20 minutes to 2 hours ( 1 hour
 tick interval). While
it could slighlty increase cache memory usage ( in practice it shouldn't
be noticeable). It prevents from CPU usage spikes.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10064
2026-01-16 16:12:59 +01:00
Aliaksandr Valialkin
5423d5e93a docs/victoriametrics/relabeling.md: add an alias seen in wild - https://docs.victoriametrics.com/victoria-metrics/relabeling/
Google sends users to this alias according to the report on 404 pages.
2026-01-16 15:46:55 +01:00
Aliaksandr Valialkin
48819b6781 docs/victoriametrics/CaseStudies.md: added alias for this page seen on the Internet - https://docs.victoriametrics.com/casestudies.html
Google sends users to this url according to the report on 404 pages.
2026-01-16 15:32:40 +01:00
JAYICE
c4bff27f46 lib/storage: properly search for LabelNames and LabelValues
Issue was introduced at d6ef8a807b commit.

Due to variable shadowing, if filter matched more than 100_000 metricIDs, it's fallback to the indexDB scan.
But because of type, `filter` value was not properly updated. And it triggered incorrect results.

 This commit fixes this typo and adds test to verify this case.


fixes  https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10294
2026-01-16 13:53:29 +01:00
Max Kotliar
432b313a48 docs: cleanup changelog a bit before release 2026-01-16 12:51:08 +02:00
Haley Wang
7bd5d19f62 lib/storage: prefer numerical values over stale markers when samples share the same timestamp during deduplication
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10196

Prefer the non StaleNaN  value when both StaleNaN and non-StaleNaN samples share the timestamp during deduplication(downsampling). The scenario can occur when:
1. Multiple vmagent instances scrape the same target(without -promscrape.cluster.name flag), one instance fails to scrape due to issues such as network, while others succeed.
2. Multiple vmalert instances evaluate the same recording rule, with one instance receiving a partial response while others receive a complete response.

In both cases, since the samples share the same timestamp and represent the metric state at that moment,
the non-StaleNaN value is entirely valid, whereas the StaleNaN could be caused by other unknown issues.
Therefore, it is reasonable to prioritize the non-StaleNaN value.
2026-01-16 09:25:11 +02:00
Haley Wang
8d18bc288f vmselect: use the last 20 raw samples to auto-calculate the lookbehind during range query
Previously, the first 20 raw samples were used for calculation.
But compare to the first 20 samples, the last 20 samples represent the latest state of the metrics,
so the lookbehind window calculated from them should be more accurate when applied to the most recent samples,
resulting in better query results for recent time ranges.

For example,if the scrape interval changes at day4, and the query range is set to last 7 days.
Applying the window derived from the first 20 samples(the old scrape interval) to new samples could result in consistently incorrect results from day4 through day11.
Conversely, applying the window derived from the last 20 samples (the new scrape interval) could lead to incorrect results for [day0-day4),
which are old states and generally less important.

This pull request does not address any specific bug, but change the general behavior, so there is no changelog.

Inspired by https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10280, but not the fix for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10280.
2026-01-16 09:00:51 +02:00
Nikolay
ff6e5c2983 app/vmstorage: reduce default value for storage.vminsertConnsShutdownDuration
This commit reduces default value for
`storage.vminsertConnsShutdownDuration` flag from `25s` to `10s`
seconds.
This change should help to reduce probability of ungraceful storage
shutdown at Kubernetes based environments, which has 30 seconds default
graceful termination period value (terminationGracePeriodSeconds).

Related issue
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10063
2026-01-15 17:26:12 +01:00
Yury Moladau
23af0086d8 app/vmui: fix heatmap rendering for uniform or sparse histogram buckets (#10292)
* Fixed a heatmap crash that happened when all visible cells had the
same value (division by zero produced invalid color indices).
* Improved how histogram buckets are chosen for display when the data is
very sparse, so the heatmap doesn’t look empty or drop the only
meaningful bucket.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10240
2026-01-15 16:55:24 +01:00
Yury Moladau
8657470068 app/vmui: bump package versions (#10291)
### Describe Your Changes

Updated project dependencies to the latest versions.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2026-01-15 16:53:23 +02:00
Aliaksandr Valialkin
3f16bc7cb2 docs/victoriametrics/Articles.md: add https://www.keyvalue.systems/blog/kubernetes-observability-with-victoriametrics-loki-grafana/ 2026-01-15 13:53:26 +01:00
Aliaksandr Valialkin
655a0eb0c3 app/vmstorage/main.go: typo fix after the commit 7cbd2a8600: partition -> snapshot 2026-01-15 12:50:24 +01:00
Aliaksandr Valialkin
7cbd2a8600 app/vmstorage: delete just created snapshot if the client canceled the request for creating the snapshot
It is better to delete the snapshot, since the client is no longer interested in it.
This should prevent from creating many unused snapshots when clients cancel creating snapshots
because of timeouts. This is the real production case from one of VictoriaMetrics users:
the disk IO subsystem became very slow, so creating a snapshot took a lot of time, so vmbackup
was canceling creating the snapshot because of the timeout. But vmstorage was still continue
creating the snapshot. This resulted in the increasing number of created but unused snapshots.
2026-01-15 12:36:48 +01:00
Max Kotliar
5f67f04f6b app/vmauth: measure client cancelled requests
Without measuring this, we have a blind spot. Exposing it as a metric
improves visibility and should save time during future debugging
sessions.

Inspired by review commit
c9596a0364 (r173621968)
2026-01-15 12:13:35 +01:00
Nikolay
2056e5b46d lib/mergeset: do no cache inmemoryBlock with single item
indexDB mergeset has an edge for single item inmemoryBlock. It stores
such items blocks in-memory at blockheader firstItem. So there is no
need to perform on-disk read operations and storing copy of it at cache.

 It also may result in incorrect search results, inmemoryBlock with a
 single item has always zero index block offset. Which causes collisions
if it's cached with the next index block at part.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10239
Probably fixes
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10063
2026-01-15 12:12:08 +01:00
Hui Wang
4d1f262ec4 vmalert: add support for $isPartial variable in alerting rule annotation templating
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4531
2026-01-15 12:10:07 +01:00
Vadim Rutkovsky
afca599a46 app/vmauth: Ffx typo in auth config warning message 2026-01-15 12:09:36 +01:00
Yury Moladau
d667f694bc app/vmui: fix tenant ID handling via URL path (#10287)
**Problem**

* VMUI had two tenant ID sources:

  * URL path: `/select/<accountID>/vmui/`
  * Query param: `tenantID`
* These could differ, causing confusion and inconsistent behavior.

**Solution**

* Removed the legacy `tenantID` query parameter.
* Use the URL path as the single source of truth for tenant ID.
* Changing the tenant in the UI now updates the URL path.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10232
2026-01-15 12:05:12 +01:00
Aliaksandr Valialkin
fe2c60c79b dashboards: follow-up for the commit 36460f6297
Use $__range duration instead of 1h duration for the 'Retention errors' stats panel
in the similar way it was done in the commit 36460f6297
for the 'Backup errors' stats panel.

While at it, run `make dashboards-sync` in order to sync the dashboards
in the dasbhoards/vm/ folder. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards/README.md
for details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10279
2026-01-14 23:30:08 +01:00
Stephan Burns
36460f6297 Make stats panel use the range specified in grafana (#10279)
### Describe Your Changes

The Backups errors panel uses a hard coded rate, when looking over a
large period of time this number would likely stay low do to the hard
coded rate when in reality the amount of errors is much larger.

This change addresses this by using the __rate variable in Grafana so
the rate will align with the date/time range in Grafana.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-14 23:21:16 +01:00
Aliaksandr Valialkin
d107dee9c7 lib/writeconcurrencylimiter: remove Reader.DecConcurrency() method
Call decConcurrency() inside Reader.Read() before calling the Read() at the underlying reader.
This reduces chances of improper use of the writeconcurrencylimiter.Reader by callers.

While at it, move the creation of writeconcurrencylimiter.GetReader() to the top of stream parser functions
at lib/protoparser/* packages, and call incConcurrency() inside GetReader() call.
This reduces the frequency of decConcurrency() / incConcurrency() calls
for typical buffered reads when parsing the incoming data. This, in turn,
reduces the contention on the concurrencyLimitCh.
2026-01-14 22:55:17 +01:00
Max Kotliar
b33d7c3ef9 dashboards: remove timezone from vmagent dashboard
The bug introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10267 and breaks
helm charts customization, see discussion
415ff27c74 (r174600675)
2026-01-14 13:28:35 +02:00
JAYICE
d3848f6802 vmagent: fix calculation of vm_persistentqueue_free_disk_space_bytes (#10271)
### Describe Your Changes

follow up https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10242,
see discussion in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10267#issuecomment-3729577415
for more context

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-13 20:12:31 +02:00
Jayice
415ff27c74 dashboards: add Persistent queue Full ETA panel to the Drilldown section in vmagent dashboard 2026-01-13 20:03:43 +02:00
Max Kotliar
90f59383b2 docs: Add docs-update-flags step to release. (#10284)
### Describe Your Changes

Previously, we had to manually update flags in documentation whenever we
made flag-related changes in source code. Someone did it by hand, others
compiled enterprise binaries, executed them with `-help` flag, and
copied and pasted output to the documentation.

In https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9632 a
command `make docs-update-flags` was introduced. It automates the whole
process. It compiles binaries, runs `-help,` and syncs output to changes
automatically.

Now, we can **omit updating doc flags in the PR** and do it once before
releasing a new version.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-13 17:09:49 +02:00
Fred Navruzov
8fec7005d0 docs/vmanomaly-release-v1.28.4 (#10283)
### Describe Your Changes

Docs upgrades, including v1.28.4 adjustments, some diagrams refinement
and deprecations

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-13 15:32:03 +02:00
Max Kotliar
4d42b291e5 docs: run make docs-update-flags 2026-01-13 10:56:14 +02:00
Max Kotliar
50f4fbf28e lib/flagutil: Add explicit month duration unit (M) for -retentionPeriod.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10181
2026-01-13 10:37:02 +02:00
Max Kotliar
a5da6afb88 docs: run make docs-update-flags 2026-01-13 10:28:16 +02:00
Max Kotliar
71f9e7f2c4 app/vmctl: fix link to documentation
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10268

Co-authored-by: Danijel Tasov <data@consol.de>
Signed-off-by: Danijel Tasov <data@consol.de>
2026-01-12 20:53:35 +02:00
Max Kotliar
eb7c5df65e dashboards: run make dashboards-sync 2026-01-09 19:07:03 +02:00
Max Kotliar
5af493297a docs/changelog: move 2025 changes to CHANGELOG_2025.md, create CHANGELOG_2026.md 2026-01-09 13:24:47 +02:00
JAYICE
2f61fa867e Makefile: Move enterprise-only flags to a separate block in document (#10241)
### Describe Your Changes

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10218.

Improve `docs-update-flags` command to move enterprise-only flags to a
separate block.

<img width="936" height="964" alt="image"
src="https://github.com/user-attachments/assets/f96a3515-4acc-4a65-94b1-55e01fab6e25"
/>


### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-08 19:42:33 +02:00
Max Kotliar
729b1099d8 dashboards: Enhance VictoriaMetrics - single-node dashboard stats raw. (#10260)
### Describe Your Changes

Currently, the stats are small and hard to read (see screenshot in the
PR). In addition, the version and uptime panels work well for a single
vmsingle, but become inconvenient when multiple instances are present,
since only one is visible.

This PR changes the version and uptime panels from single stat to time
series, aligning them with the VictoriaMetrics – cluster dashboard. It
also enlarges the remaining stats so the values are easier to read,
consistent with the cluster dashboard (see screenshot in the PR).

Follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10187 and
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10132

Before:
<img width="1512" height="364" alt="Screenshot 2026-01-07 at 21 38 17"
src="https://github.com/user-attachments/assets/8d8baa86-b31b-4c58-ae22-cef94a1607e6"
/>

After:
<img width="1512" height="670" alt="Screenshot 2026-01-07 at 22 07 10"
src="https://github.com/user-attachments/assets/9e60596d-72ec-4060-af11-a69ce554d3b1"
/>

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Hui Wang <haley@victoriametrics.com>
2026-01-08 13:49:46 +02:00
Yury Moladau
945ca569b9 app/vmui: add localStorage availability checks
* Added browser `localStorage` availability checks with user-facing
error reporting.
* Introduced `VMUI:`-prefixed `localStorage` keys to avoid key
collisions.
* Added migration logic for existing unprefixed `localStorage` keys.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10085
2026-01-08 11:13:38 +01:00
Hui Wang
7fb8a8a0b2 vmalert: skip alert annotation templating in replay mode
In alerting rules, annotations are only attached to alert messages that
are sent to the notifier (such as Alertmanager). These annotations
typically contain human-readable information, such as instructions for
resolving the alert.

In [replay
mode](https://docs.victoriametrics.com/victoriametrics/vmalert/#rules-backfilling),
vmalert does not send alert messages to the notifier at all(no notifier
is configured), as these alerts are outdated. Therefore, it does not
need to template the annotations in this mode.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10262
2026-01-08 11:09:21 +01:00
JAYICE
89f95f74ed vmagent: add metric for persistentqueue capacity
This commit adds new metric `vm_persistentqueue_free_disk_space_bytes`, which helps
to track free space for persistent queue.

part of implementation for
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10193
2026-01-08 11:07:28 +01:00
Hui Wang
46e13fe0ca vmselect: expose vm_rollup_result_cache_requests_total metric
which tracks the number of requests to the query rollup cache


As described in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10117, when
retrieving cached data from the rollup result cache, there can be mixed
`get()` and `getBig()` calls to the underlaying fastcache. And it's
unpredictable how many times `getBig()` will call `get()`, so the
metrics from fastcache cannot be used to indicate query cache miss
ratio.
Exposing a new counter `vm_rollup_result_cache_requests_total` to track
the number of requests to the query rollup cache, together with the
existing `vm_rollup_result_cache_miss_total`, allows for monitoring the
rollup cache miss rate per query (or subquery), which is more
user-facing.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10117
related to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5056
2026-01-08 11:05:25 +01:00
Fred Navruzov
50d8ad6733 docs/vmanomaly-release-v1.28.3 (#10258)
### Describe Your Changes

Docs update for vmanomaly v1.28.3 release + `retention` doc section for
model artifacts

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-07 20:40:32 +02:00
JAYICE
3b8550adb1 dashboard: refine vmsingle dashboard and align it to vmcluster dashboard (#10187)
### Describe Your Changes

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10132

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-01-07 12:58:49 +02:00
Max Kotliar
1708b73312 lib/promscrape: show (N/A) instead of hiding target response link when original labels are dropped (#10244)
Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10237,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9901

When `-promscrape.dropOriginalLabels=true` is enabled, original target
labels are unavailable. These labels are required to compute the target
ID used by the /target_response endpoint, so the response link cannot be
generated. See

7a5003212e/lib/promscrape/targetstatus.qtpl (L236)

Previously, the link silently disappeared from the UI. Now the UI shows
(N/A) Not available, explicitly indicating that required data is
missing.
2026-01-06 12:31:18 +01:00
f41gh7
57defe7ab4 apptest: add zabbixconnector integration test
follow-up for 859435a8df
2026-01-06 12:27:01 +01:00
Sinotov Vladimir
d58cfb7f36 app/vminsert: properly route zabbixconnector requests
Previously VictoriaMetrics: Single-node version used

`http://<victoriametrics-addr>:8428/zabbixconnector/api/v1/history`
resulted in a missing path error. The issue was introduced during changes back-porting from vmagent.


Additionally, the http response was fixed. Zabbix expects a 200 status
code during normal operation.

 Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10214
2026-01-06 11:17:09 +01:00
Cancai Cai
a244750bc6 doc/table: fix typo (#10243)
### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: cancaicai <2356672992@qq.com>
2026-01-05 21:42:01 +02:00
Max Kotliar
f06e7f9a6e app/vmagent: replace go.yaml.in/yaml/v3 package with gopkg.in.yaml.v2
It address the comment:
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10213/files#r2662305818

The reasons:
- It was decidede to use v2 for now and do not upgrade to v3.
- The later package is used in more places so it is better to use it
here too.
2026-01-05 21:34:08 +02:00
Artem Fetishev
7a5003212e docs: bump VictoriaMetrics components version
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-05 19:22:53 +01:00
Artem Fetishev
846392405e deployment/docker: bump VictoriaMetrics component version
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-05 19:18:49 +01:00
Artem Fetishev
37c3d8c26b CHANGELOG.md: fix issue links
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-05 19:15:50 +01:00
Artem Fetishev
8bc0475ee7 docs: update LTS releases
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-05 18:42:21 +01:00
Zhu Jiekun
89414062bf bugfix: allow reloading when init with empty remote write relabeling flags (#10213)
### Describe Your Changes

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10211

This pull request adds `flagSet bool` field to `relabelConfigs` struct.
And use this flagSet value as the result of `isSet()` function.

The reloading should be available when at least one of the command-line
flags `-remoteWrite.relabelConfig` / `-remoteWrite.urlRelabelConfig` is
set.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Hui Wang <haley@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-01-05 12:53:52 +02:00
JAYICE
67c51b009d document: guide users to use --data-binary in curl when import multi lines influx data (#10198)
### Describe Your Changes

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10165.

Refer to [curl docs](https://curl.se/docs/manpage.html#--data).

> When --data is told to read from a file like that, carriage returns,
newlines and null bytes are stripped out

If users import multiple lines of data in file via `/api/v2/write`, he
may follow the example we gave to use `-d` to instruct curl, then
newlines will be stripped out, hence the parse error in VictoriaMetrics.

It's not VictoriaMetrics' bug, but it will be better to guide users to
use `--data-binary` just like how
[/api/v1/import](https://docs.victoriametrics.com/victoriametrics/url-examples/#apiv1import)
did.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Zhu Jiekun <jiekun@victoriametrics.com>
2026-01-05 10:54:47 +02:00
Artem Fetishev
e8160fc8fb CHANGELOG.md: cut v1.133.0 release
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-02 12:17:38 +00:00
Artem Fetishev
e3a4ceaef3 deployment/docker: upgrade base docker image (Alpine) from 3.22.2 to 3.23.2
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-01-02 10:22:41 +00:00
Andrii Chubatiuk
e9cedca8c8 docs: replace old grafana datasource page with links to a new one (#10231)
### Describe Your Changes

fixes https://github.com/VictoriaMetrics/vmdocs/issues/192

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2026-01-01 18:54:52 +02:00
Andrii Chubatiuk
b720e55c13 vmsingle: properly proxy requests to all supported vmalert paths (#10179)
### Describe Your Changes

modify initial request path before sending request to vmalert with a
proper value
sync vmalert proxy implementation with one in cluster branch
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10178

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Hui Wang <haley@victoriametrics.com>
2026-01-01 16:28:55 +02:00
Artem Fetishev
ab1429c896 lib/storage: fix tagFiltersCache stats collection (#10230)
Since the cache may be reset too often, using the sizeBytes as an
indicator that this is the first met indexDB to collect tfssCache stats
is unreliable because it often can be zero all indexDB instances. Use
Requests metric instead because it is never reset.

Follow-up for #10204.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-31 13:35:54 +01:00
JAYICE
74b03c93a6 makefile: support vmauth in docs-update-flags command (#10222)
### Describe Your Changes

implement
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10221

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-12-30 19:14:06 +02:00
Max Kotliar
0e9bb5a42d docs: sync flags in docs with acutal binaries 2025-12-30 18:59:33 +02:00
Max Kotliar
f1a88e57cf docs/changelog: fix link to PR
follow up on
1792b6bd9a
2025-12-30 17:38:48 +02:00
Max Kotliar
76176ac1d3 app/vmauth: increase concurency limit reached before waiting in queue
Follow up on
c9596a0364 (r173413964)

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
2025-12-30 17:23:10 +02:00
Max Kotliar
c08adb31bb docs: remove available from placeholder from code block
The {{% available_from "#" %}} placeholder does not work inside code
blocks. Replacing it with hard coded value.

Introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10168.

See comment
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10168/files#r2651440620
for more details.
2025-12-30 16:10:55 +02:00
Artem Fetishev
b49b0471ef lib/storage: move legacy code to legacy files (#10215)
Follow-up for f97f627 (#8134)

The code was moved as is, no changes were made to moved code.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-30 13:16:24 +01:00
Artem Fetishev
13102045a7 changelog: update v1.132.0 release notes with a note on ungraceful shutdown
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-30 10:29:24 +01:00
Artem Fetishev
d226e5b95f lib/ingestserver: Actually close the first vminsert connection (#10224)
Since the first connection is not closed, the vmstorage will never
terminate gracefully which will cause the reset of all caches on the
start-up.

Follow-up for 244769a00d (#10136)

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-29 15:13:30 +01:00
Hui Wang
30bbb5660b docs: clarify recording rule labels do not support templating (#10186)
fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10183
2025-12-29 15:29:45 +02:00
Max Kotliar
1792b6bd9a docs/changelog: Add PR\issue links, fix typo in tip section 2025-12-29 12:58:07 +02:00
Artem Fetishev
f97f627f79 lib/storage: implement partition index (#8134)
This should reduce disk space occupied by indexDBs as they get deleted along
with the corresponding partitions once those partitions become outside the
retention window.

- Motivation: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7599
- What to expect: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8134

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Andrei Baidarov <baidarov@nebius.com>
2025-12-24 18:53:49 +01:00
Phuong Le
785c1fd053 issues/question-template: fix typos (#947) 2025-12-24 11:37:34 +01:00
Aliaksandr Valialkin
697bfd5cee app/vmauth: properly verify whether the request has been canceled by the client in handleConcurrecnyLimitError()
The `err` may contain information about request cancelation performed by the server code.
In such cases the error must be logged. The error must be ignored only if the client canceled the request.

This is a follow-up for the commit c9596a0364

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
2025-12-24 11:31:36 +01:00
Artem Fetishev
f0ac6d9ac9 lib/storage: log the beginning and end of saving metric name usage stats to file (#10205)
This is to debug cases when metric name tracker resets the tsid cache
after restart. It could be due vmstorage not having enough time to stop
gracefully. Logs should provide this info.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-23 17:25:43 +01:00
Artem Fetishev
f0b251d967 lib/storage: fix per-idb cache stats (#10204)
This fixes the following corner case: if all instances of a cache have
zero size, the stats won't be set at all. This results in some weird
graphs if the cache is reset very often (such as tfssCache): the cache
sizeMaxBytes alternates between the actual value and zero.

Follow-up for f62893c151

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-23 17:06:10 +01:00
Nikolay
c3346ae8fd app/victoria-metrics: properly add prometheus metrics metadata (#10192)
Commit 5a587f2006 was not properly ported
to the single node branch. Since single node is able to perform both
promscrape and self-scrape, it's required to add metadata add methods to
those paths.

 This commit fixes missing metadata add to the storage.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10175
2025-12-23 13:57:19 +01:00
Jinlin
0ffb3fdfce lib/storage: fix log typo 2025-12-23 13:50:40 +01:00
Zakhar Bessarab
4e234ccbd1 docs/enterprise: add description of license key update (#10194)
Describe Your Changes:

- describe options of updating the enterprise license key
- fix a few typos
2025-12-23 13:37:36 +01:00
Alexander Frolov
943589ca31 lib/promscrape: fix isAutoMetric to recognize all auto-generated metrics
Previously, `scrape_labels_limit` was missing from the check.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10197
2025-12-23 13:36:50 +01:00
Aliaksandr Valialkin
c9596a0364 app/vmauth: add -maxQueueDuration command-line flag for graceful handling of short spikes in the number of concurrent requests
Previously a short spike in the number of concurrent requests immediately led to `429 Too Many Requests` errors
when the number of concurrent requests exceeds -maxConcurrentRequests or -maxConcurrentPerUserRequests.

This commit allows processing short spikes in the number of concurrent requests during the -maxQueueDuration timeout.
The requests are rejected only if they couldn't be served accroding to the concurrency limits during the -maxQueueDuration.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10112
2025-12-22 16:39:01 +01:00
Aliaksandr Valialkin
e7b0a00493 app/vmauth: follow-up for the commit 7f689df824
- Introduce backendURLs struct, which holds all the backend urls and allows stopping
  all the health checkers across all the backend urls with a single call to backendURLs.stopHealthChecks().

- Immediately cancel the pending Dial call to the backend when backendURLs.stopHealthChecks() is called.
  Use lib/netutil.Dialer.DialContext() for this.

- Replace a fragile closing of stopHealthCheckCh channel via stopHealthCheckOnce.Do()
  with easier to maintain call of cancel() func for the corresponding healthChecksContext.

- Wait until health checker goroutines are finished before return from UserInfo.stopHealthChecks().
  Previously the health checker goroutines could run for some time trying to dial the backend
  after the return from UserInfo.stopHealthChecks().

- Try dialing the broken backend for https urls. It is better if the broken backend logs the error
  instead of routing client requests to the broken backend.

- Log dial errors to the broken backend, so users could troubleshoot the backend connectivity issue with more details.

- Refer the correct issue - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9997 -
  in the comments explaining why periodic dialing of the broken backend is needed.
  Previously the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9890 was incorrectly referred.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9997
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10147
2025-12-22 15:20:51 +01:00
Hui Wang
be0fe546e5 vmauth: skip a redundant request if all backends are broken with least_loaded policy (#10202)
similar to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10170
2025-12-22 13:06:12 +01:00
Hui Wang
13911db316 vmauth: add new counters to track the number of user request errors
follow up https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10177

Add `vmauth_user_request_backend_requests_total` and
`vmauth_unauthorized_user_request_backend_requests_total` which track
the number of user request errors, and aligned with
`vmauth_user_requests_total`.

The existing `vmauth_http_request_errors_total` currently only counts
requests with `invalid_auth_token`. Once authorization has passed, any
subsequent request errors are tracked under
`xxx_user_request_backend_requests_total`.
2025-12-22 13:05:54 +01:00
Artem Fetishev
0cb90f91fc lib/storage: follow-up for d9c07dbc0b (#10169) - fix changelog
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-19 08:44:10 +01:00
Alexander Frolov
bdf65dde88 app/vmagent: make sure vmagent_rows_inserted_total counts samples (#10191)
As vminsert does

4d9b69b5a6/app/vminsert/newrelic/request_handler.go (L68)

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10191
2025-12-18 16:37:37 +01:00
Max Kotliar
4d9b69b5a6 docs/changelog: add known issue note related to memory leak on OpenTelemetry parsing code. 2025-12-18 12:39:12 +02:00
Nikolay
692a9be5fa lib/storage: check indexDB refCount at MustClose
In order to gracefully stop indexDB, refCount must be checked during
storage graceful shutdown.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10063
2025-12-17 18:48:53 +01:00
Kirill Kobylyanskiy
c8742ab120 lib/promscrape: add global sampleLimit support
This commit introduces the global `sampleLimit` setting to restrict the number
of samples accepted per scrape target, mirroring the behavior of
Prometheus.

Motivation:
1) The existing `-promscrape.seriesLimitPerTarget` flag currently takes
precedence over any `sample_limit` setting defined directly on the
scrape target. The new `sampleLimit` implementation ensures that the
target configuration is able to override the global setting, allowing
users to define specific limits per target.
2) The existing series limit flag uses memory-intensive Bloom filters,
resulting in high RAM consumption under high-cardinality scraping
scenarios. The `sampleLimit` provides a much simpler, low-overhead
alternative.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10145
2025-12-17 18:47:05 +01:00
Aliaksandr Valialkin
b6f8128273 Makefile: update golangci-lint from v2.4.0 to v2.7.2
See https://github.com/golangci/golangci-lint/releases/tag/v2.7.2
2025-12-17 16:59:02 +01:00
Aliaksandr Valialkin
bed7cbd0a4 all: consistently use encoding.DecompressZSTD* instead of zstd.Decompress* across the codebase
The encoding.DecompressZSTD* consistently updates the vm_zstd_block_decompress_calls_total metric.

Also make the follwing improvements after the commit 10f7cd2ffc:

- Add encoding.DecompressZSTDLimited() function and use it instead of zstd.DecompressLimited,
  so it properly updates vm_zstd_block_decompress_calls_total metric.

- Clarify description for the encoding.DecompressZSTD* and zstd.Decompress* functions.
2025-12-17 16:48:06 +01:00
Artem Fetishev
d9c07dbc0b lib/storage: rotate dateMetricIDCache instead of resetting (#10169)
Currently, `dateMetricIDCache` is reset when it is full and it is never
reset is not full but the data it stores is no longer needed. This leads
to the following problems:
- During regular data ingestion the cache sizeBytes may exceed max
allowed size and the cache gets reset which may potentially slow down
data ingestion (see #10064)
- The cache is per-indexDB. This means that in partition index (#8134)
there will be as many instances of this cache as the number of
partitions. If someone performs a backfill across all partitions, this
will fill all caches and they will never get reset even if no more
historical data is ingested.

So the solution is to periodically rotate the cache. After first
rotation the data is not deleted but moved to `prev` storage. After
second rotation `prev` gets deleted. This gives the cache an opportunity
to restore the `prev` data if it is still in use. Based on #10167.

This PR also removes the introduced recently introduced
`-storage.cacheSizeIndexDBDateMetricID` flag (see #10135). This should
be safe since it is new and its use case is very niche, i.e. no one
would really use it.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-17 15:43:05 +01:00
Artem Fetishev
20ad9cd395 lib/storage: introduce metricIDCache
The cache serves the same purpose as `dateMetricIDCache` but is used for
caching metricIDs from global index.
The cache was introduces in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10167 and it has been decided to add it in a separate commit to reduce diff.

Related  PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10167
2025-12-17 13:31:11 +01:00
Hui Wang
8b3fe9cdec app/vmauth: add new counters to track the number of requests sent to backends
We have `vmauth_user_requests_total` and
`vmauth_unauthorized_user_requests_total` to track requests from the
user side. However, in scenarios such as request timeouts or when the
response code matches `retry_status_code`, a single request may be
retried across multiple backends.

Exposing counters `vmauth_user_request_backend_requests_total` and
`vmauth_unauthorized_user_request_backend_requests_total` that track the
number of requests sent to backends provides insight into the routing
logic and can help identify if requests are being consistently retried,
which may contribute to increased request duration.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10171
2025-12-17 13:27:08 +01:00
Hui Wang
e1e367b3cb app/vmauth: properly increment metric xxx_user_request_backend_errors_total
Currently, backendErrors may be counted twice if a request to the
backend fails due to context.DeadlineExceeded.

9bc7a17d80/app/vmauth/main.go (L328)

9bc7a17d80/app/vmauth/main.go (L294)

And we increment this counter in a way that is somewhat inconsistent.
Given that the counter's name is `xx_request_backend_errors_total`, it
should only increase when a backend request returns an error. This value
can exceed the user request error count if multiple backend requests
fail for a single user request.
The `xxx_request_backend_errors_total` counter should be used in
conjunction with the `xxx_request_backend_requests_total` introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10171.
2025-12-17 13:24:26 +01:00
Hui Wang
f40c6fcad1 app/vmauth: skip a redundant request if all backends are broken with first_available policy
There is no reason to send a request to the first backend if all
backends are marked as broken.
Also, 
>// getFirstAvailableBackendURL returns the first available backendURL,
which isn't broken.


The fix only skips a redundant request when all backends are
unavailable, it doesn't introduce any changes from user's perspective,
so I skipped changelog.
2025-12-17 13:22:37 +01:00
Aliaksandr Valialkin
b6bc186013 docs/victoriametrics/Articles.md: add https://developer-friendly.blog/blog/2024/06/17/unlocking-the-power-of-victoriametrics-a-prometheus-alternative/ 2025-12-16 15:46:23 +01:00
Aliaksandr Valialkin
9bc7a17d80 lib/protoparser/opentelemetry: typo fix: wince -> since
This is a follow-up for the commit 293d80910c
2025-12-15 20:13:45 +01:00
f41gh7
9ce548dcb5 docs: update release version to latest 2025-12-15 10:37:35 +01:00
f41gh7
82e583338d docs: update LTS releases 2025-12-15 10:34:43 +01:00
Aliaksandr Valialkin
19009836c7 vendor: update github.com/valyala/fastjson from v1.6.5 to v1.6.7 2025-12-14 23:09:43 +01:00
Max Kotliar
c2362ab670 docs: review links in changelogs 2025-12-12 19:43:15 +02:00
f41gh7
d04a42e846 make vmui-update 2025-12-12 12:50:13 +01:00
f41gh7
0d930dda16 CHANGELOG.md: cut v1.132.0 release 2025-12-12 12:45:34 +01:00
Artem Fetishev
e026215701 lib/storage: Document post-delete cache resets (#10158)
When the time series deletion is performed some of the storage caches
need to be reset but some not. This PR reviews all storage caches and
documents why there are reset or not and also places all the resetting
logic (and comments) in one place.
2025-12-12 11:11:30 +01:00
JAYICE
34a542c324 lib/storage: include last sample when query at the last millisecond of the day
One millisecond shouldn't be subtracted from the `tr.MaxTimestamp`, and
related test cases will be added

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9804
2025-12-12 11:01:06 +01:00
Fred Navruzov
ff0aaa38b7 docs/vmanomaly: release v1.28.2 (#10160)
### Describe Your Changes

Update docs and assets (visualizations) for /anomaly-detection section
with `v1.28.2` release

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-12-11 20:56:59 +02:00
Max Kotliar
0e2f0ac95f lib/protoparser/opentelemetry: fix typo in code
#
github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/pb
lib/protoparser/opentelemetry/pb/pb.go:1683:19: undefined: lctx

Bug introduced in
1dc71212f8
2025-12-11 18:37:04 +02:00
Max Kotliar
7f689df824 app/vmauth: validate backend with a dial check before marking it healthy (#10147)
### Describe Your Changes

Previously, a backend was considered healthy as soon as its
'bu.brokenDeadline' deadline expired, even if it was still unavailable.
This caused avoidable request failures and retries.

Now vmauth performs a TCP dial (1s timeout) before restoring the backend
to the healthy
pool. This avoids routing traffic to backends that are still down.

The dial check also covers cases where a route to the backend cannot be
resolved. Without this check, user requests would hang until the
connection timeout, leading to long waits
or errors. The new check fails fast and doesn't impact real user
requests.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9997


### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-12-11 18:26:59 +02:00
Max Kotliar
bd725bdd69 dashboards: add usseful links to dashboards
Dashboards:

- Add a link to proper docs section
- Add a link to troubleshooting page
- Add links to community and enterprise support

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9904
2025-12-11 18:11:07 +02:00
Aliaksandr Valialkin
712b7cfeeb lib/promscrape: allow scraping targets with responses equal to c.maxScrapeSize
Return "too big response size" error only for responses bigger than c.maxScrapeSize
(this option can be set either via max_scrape_size option inside scrape config
or via -promscrape.maxScrapeSize command-line flag).

Previously responses with sizes equal to c.maxScrapeSize were incorrectly rejected.
2025-12-11 16:15:46 +01:00
Aliaksandr Valialkin
1dc71212f8 lib/protoparser/opentelemetry/pb: reset the decoderContext.ls.Labels length to zero after clearing all the references to the original byte slice
This is a follow-up for 25f49e6f54
2025-12-11 15:39:32 +01:00
Aliaksandr Valialkin
25f49e6f54 lib/protoparser/opentelemetry: explicitly clear all the references to the underlying byte slice at decoderContext.ls.Labels up to its capacity
This should prevent from the excess memory usage because of dangling source byte slices
referred by decoderContext.ls.Labels.

This is a change similar to 63a68edb05
2025-12-11 15:33:13 +01:00
Max Kotliar
dcf9f0eb7b lib/promscrape: Add a warning to active targets panel if -dropOriginalLabels=true (some debug info not available)
Previously the original labels were preserved (
-dropOriginalLabels=false). In
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9772 the default
behavior was changed. Now vmagent\vmsingle drops origianal labels. The
change
created some confusion related to UI. For example, debug relabling
column is completly hidden when the labels are not available. It created
a steram of questions.

This commit adds a warning similar to one we have at "Discovered
targets" tab, and also always show the "Debug relabeling" column. When
there is not info for it "N/A" printed.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9901
Follow-up https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9772
2025-12-11 16:24:50 +02:00
Aliaksandr Valialkin
606382178b lib/protoparser/protoparserutil: do not store too big buffers to the pool at ReadUncompressedData if only a small part of the buffer is used last time
This should prevent from excess memory usage because of inefficiently used buffers.

This should help the case at https://github.com/VictoriaMetrics/VictoriaLogs/issues/869
2025-12-11 15:11:44 +01:00
Artem Fetishev
220249f023 lib/storage: use lrucache to implement tagFilters loops cache
The tagFilters loop cache is per-indexDB which means that currently
there are two instances, one for idbCurr and one for idbPrev. When the
partition index (#8134) is released, there will be as many instances of
this cache as there will be partitions.

The cache is implemented using workingsetcache. Which occupies at least
30MB even when unused. Given that only the latest indexDB is used most
of the time, a lot of memory can be wasted.

Therefore the cache implementation is changed to lrucache because it
does not consume memory when it is unused and also has timeout-based
eviction.

This is a follow-up for 4cd727a511
(https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10072).
2025-12-11 08:42:58 +01:00
Max Kotliar
c6731f964c dashboards: add memory usage breakdown panels into Drilldown sections
Right now we have two separate panels: RSS memory % usage and RSS
anonymous memory % usage. This makes trend comparison difficult because
one have to visually correlate two independent panels. Another problem
is that these panels don't show Go runtime allocations at all. The same
applies to memory allocated in C. There are allocations in C (zstd) one
should account for but there is no even a metric to expose it.

The commit adds Memory usage breakdown panel into Drilldown section. It
provides insight into Go Stack, Go Heap, Go Heap Released, Go Other,
Mmap: VM Cache, File cache memory distribution

It should help spot trends changes in memory by type or invistigate
issues such as #10069 and #10028 easier.

Panel info:
This panel shows memory usage by category.

How to use:
- Start from the high-level RSS panel.
- Identify an instance with unexpected or abnormal memory growth.
- Filter to that instance to inspect the detailed breakdown here.

Interpretation
- A steadily rising Go Heap usually indicates a memory leak. Collect
pprof memory profile.
- A growing Go Stack commonly points to a goroutine leak.

<img width="1508" height="628" alt="Screenshot 2025-12-08 at 13 18 44"
src="https://github.com/user-attachments/assets/0e794324-e86d-468e-b926-8bb11f5a2043"
/>
<img width="1503" height="674" alt="Screenshot 2025-12-08 at 13 19 34"
src="https://github.com/user-attachments/assets/62fc3fff-33b3-4dfe-ad3f-ad0526a8a606"
/>

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10139
2025-12-11 08:39:00 +01:00
Sinotov Vladimir
859435a8df lib/protoparser: added push data with zabbix connector (#6087)
Support receiving data from the Zabbix connector with API `/zabbixconnector/api/v1/history`

Labels:
    - The metric name is added to the `__name__` label.
    - Host name to `host` label.
    - Visible name  to `hostname` label.

The returned response complies with the requirements of the Zabbix

 See the following doc for connector [protocol](https://www.zabbix.com/documentation/current/en/manual/config/export/streaming).

Useful links:
- Zabbix Streaming to external systems
(https://www.zabbix.com/documentation/current/en/manual/config/export/streaming)
- Zabbix Newline-delimited JSON expor
(https://www.zabbix.com/documentation/current/en/manual/appendix/protocols/real_time_export)

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6087
2025-12-10 17:00:27 +01:00
Max Kotliar
5b12fd35d7 app/vminsert: improve slowness-based rerouting logic
Adjust slowness-based rerouting logic.

Rerouting now occurs only from the slowest node, and only if the cluster
as a whole has enough available capacity to handle the additional load.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9890
2025-12-10 16:25:44 +01:00
Aliaksandr Valialkin
293d80910c lib/protoparser/opentelemetry: eliminate memory allocations during parsing of samples send via OpenTelemetry protocol
This increases the parser performance by 4x-6x.

This commit uses the technique similar to https://github.com/VictoriaMetrics/VictoriaLogs/pull/720

goos: linux
goarch: amd64
pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/stream
cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics
                                                    │   old.txt    │               new.txt               │
                                                    │    sec/op    │   sec/op     vs base                │
ParseStream/default-metrics-labels-formatting-16      15.565µ ± 1%   2.150µ ± 3%  -86.19% (p=0.000 n=10)
ParseStream/prometheus-metrics-labels-formatting-16   24.228µ ± 2%   4.355µ ± 1%  -82.02% (p=0.000 n=10)
ParseStream/prometheus-metrics-formatting-16          23.028µ ± 2%   3.395µ ± 1%  -85.26% (p=0.000 n=10)
geomean                                                20.55µ        3.168µ       -84.59%

                                                    │   old.txt    │                new.txt                 │
                                                    │     B/s      │      B/s       vs base                 │
ParseStream/default-metrics-labels-formatting-16      127.9Mi ± 1%    918.3Mi ± 3%  +617.82% (p=0.000 n=10)
ParseStream/prometheus-metrics-labels-formatting-16   82.19Mi ± 2%   453.32Mi ± 1%  +451.57% (p=0.000 n=10)
ParseStream/prometheus-metrics-formatting-16          86.47Mi ± 2%   581.56Mi ± 1%  +572.52% (p=0.000 n=10)
geomean                                               96.88Mi         623.3Mi       +543.34%

                                                    │   old.txt    │                 new.txt                  │
                                                    │     B/op     │    B/op      vs base                     │
ParseStream/default-metrics-labels-formatting-16      12.53Ki ± 0%   0.00Ki ± 0%  -100.00% (p=0.000 n=10)
ParseStream/prometheus-metrics-labels-formatting-16   21.15Ki ± 1%   0.00Ki ±  ?  -100.00% (p=0.000 n=10)
ParseStream/prometheus-metrics-formatting-16          20.74Ki ± 1%   0.00Ki ±  ?  -100.00% (p=0.000 n=10)
geomean                                               17.65Ki                     ?                       ¹ ²
¹ summaries must be >0 to compute geomean
² ratios must be >0 to compute geomean

                                                    │  old.txt   │                new.txt                 │
                                                    │ allocs/op  │ allocs/op  vs base                     │
ParseStream/default-metrics-labels-formatting-16      426.0 ± 0%    0.0 ± 0%  -100.00% (p=0.000 n=10)
ParseStream/prometheus-metrics-labels-formatting-16   514.0 ± 0%    0.0 ± 0%  -100.00% (p=0.000 n=10)
ParseStream/prometheus-metrics-formatting-16          514.0 ± 0%    0.0 ± 0%  -100.00% (p=0.000 n=10)
geomean                                               482.8                   ?                       ¹ ²
2025-12-10 16:11:59 +01:00
Artem Fetishev
bc4d98b358 app/vmstorage: properly name dateMetricIDCache metrics
The following dmc metrics were given standard names, i.e.:

- vm_date_metric_id_cache_resets_total became
vm_cache_resets_total{type="indexdb/date_metricID"}
- vm_date_metric_id_cache_syncs_total became
vm_cache_syncs_total{type="indexdb/date_metricID"}

This change should be safe since these metrics are currently not used in
VictoriaMetrics Gragana dashboards.

Additionally, other cache metrics were organized within the code so that
each metric has the same order.
2025-12-10 14:57:52 +01:00
Alexander Frolov
ad153f72ef lib/storage: utilize persisted hourMetricIDs cache to avoid redundant indexDB lookups after vmstorage restart
This commit optimizes the performance of the storage by improving the utilization of persisted hourMetricIDs cache to avoid redundant indexDB lookups after vmstorage restart. The change refactors the hour-based cache checking logic using a switch statement to handle multiple hour scenarios more efficiently.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10114
2025-12-10 14:56:05 +01:00
Vadim Rutkovsky
f2578a9764 docs/victoriametrics: update LTS-releases.md (#10153)
### Describe Your Changes

Doc update to mention fresh patch releases - 1.122.10 and 1.110.25

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-12-10 15:11:26 +02:00
Aliaksandr Valialkin
d5e19717b7 Makefile: use the correct -trim_path at pprof-cpu
It shouldn't end with @.

The `PPROF_FILE=/path/to/cpu.pprof make pprof-cpu` is good for investigating profiles received from production builds.
2025-12-10 13:41:10 +01:00
Max Kotliar
5c40328e5f docs: mention Grafana panel that can help with swap related issues 2025-12-10 14:08:43 +02:00
Yury Moladau
1117437456 app/vmui: improve legend auto-collapse threshold, warning and toggle (#10140)
### Describe Your Changes

This PR improves the legend auto-collapse behavior in vmui:
- Increase the legend auto-collapse threshold from `20` to `100` series.
- Add a warning message when the legend is collapsed by default, showing
the actual series count.
- Add a user setting to disable automatic legend collapsing (enabled by
default).

Related issue: #10075

<img width="352" alt="image"
src="https://github.com/user-attachments/assets/22ee2ef9-6369-47a8-87a1-c63a0e17fccd"
/>
<img width="1618" height="197" alt="image"
src="https://github.com/user-attachments/assets/791eb9b6-4397-476d-ad44-5152e50d1975"
/>


### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2025-12-10 13:59:16 +02:00
Aliaksandr Valialkin
094a7cf3f9 lib/protoparser/opentelemetry/stream: benchmark cases when prometheus-compatible naming for metrics and labels is enabled 2025-12-10 11:50:46 +01:00
Artem Fetishev
538e489497 docs: Update cache tuning section (#10149)
- Remove mentions of `Caches` section in Grafana dashobards since this section does not exist anymore.
- Rewrite a bit the description of cache panels in Troubleshooting section.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-12-10 11:42:33 +01:00
Aliaksandr Valialkin
744aa3fe9f lib/protoparser/opentelemetry/stream: make the BenchmarkParseStream closer to real production cases
- Add more metrics to the protobuf to parse.
- Measure scan speed of the original protobuf at bytes/sec. Previously the number of ParseStream() calls per second was measured.
2025-12-10 11:21:57 +01:00
Aliaksandr Valialkin
44a3885f97 lib/protoparser/opentelemetry/stream: avoid memory allocations for bytes.NewBuffer() on every iteration of BenchmarkParseStream
Re-use benchReader for reading the same data on every iteration of BenchmarkParseStream.
2025-12-10 11:10:39 +01:00
Aliaksandr Valialkin
f43264f9f2 lib/ioutil: add missing package after the commit 2da010495c 2025-12-10 11:07:15 +01:00
Aliaksandr Valialkin
e07bc7a74e lib/prompb: move all the code related to WriteRequestUnmarshaler to a separate file - write_request_unmarshaler.go
This should improve code maintenance a bit.

This is a follow-up for the commit b98e592752
2025-12-10 10:45:59 +01:00
Aliaksandr Valialkin
d1680063f5 lib/prompb: rename MetricMetadataType to MetricType
Also rename MetricMetadata* constants to MetricType* constants.

This makes the code a bit more readable.

This is a follow-up for the commit 25cd5637bc

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9306
2025-12-10 01:18:44 +01:00
Aliaksandr Valialkin
2da010495c all: pool io.LimitedReader in order to save a memory allocation and reduce CPU usage a bit 2025-12-10 01:18:43 +01:00
Artem Fetishev
7c78f95f2e docs: Update flags (#10148)
Follow-up for dc5d7aa4ce
(https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10135)

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-09 17:48:40 +01:00
Kirill Yurkov
5bd67c5f49 docs: recommend disabling swap (#10113)
add swap disable commands in install recommendations to prevent
performance issues

---------

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-12-09 15:50:13 +02:00
Max Kotliar
c618f471ca apptest: make results order stable in test special query regression
Sometimes test fails with error:

--- FAIL: TestClusterSpecialQueryRegression (15.57s)
special_query_regression_test.go:76: unexpected /api/v1/export
response (-want, +got):
          &apptest.PrometheusAPIV1QueryResponse{
          	... // 1 ignored field
          	Data: &apptest.QueryData{
          		... // 1 ignored field
          		Result: []*apptest.QueryResult{
          			&{
          				Metric: map[string]string{
          					"__name__":
"prometheus.sensitiveRegex",
- 					"label":
"SensitiveRegex",
+ 					"label":
"sensitiveRegex",
          				},
          				Sample:  nil,
          				Samples: {&{Timestamp:
1707123456700, Value: 10}},
          			},
          			&{
          				Metric: map[string]string{
          					"__name__":
"prometheus.sensitiveRegex",
- 					"label":
"sensitiveRegex",
+ 					"label":
"SensitiveRegex",
          				},
          				Sample:  nil,
          				Samples: {&{Timestamp:
1707123456700, Value: 10}},
          			},
          		},
          	},
          	ErrorType: "",
          	Error:     "",
          	IsPartial: false,
          }

FAIL
FAIL	github.com/VictoriaMetrics/VictoriaMetrics/apptest/tests
	18.676s
FAIL
2025-12-09 15:20:01 +02:00
Artem Fetishev
f62893c151 lib/storage: report per-idb cache stats only once
`tagFiltersCache` and `dateMeticIDCache` are now per-indexDB. Currently
we have 2 instance of indexDBs (prev and curr) and therefore 2 instances
of each cache.

When the storage stats is collected, the stats of individual caches is
added together. For example, is the `sizeMaxBytes` of each
tagFiltersCache is `100MB` and the `sizeBytes` of each instance is
`10MB` and `99MB`, then the resulting stats will be `sizeMaxBytes ==
200MB, sizeBytes == 109MB`.

While this is accurate, this stats hides a potential problem. It says
that the cache utilization is slightly above `50%` (109/200) and
everything seems to be okay. But in reality one of the caches is
utilized by 99% and soon will start evicting existing records to make
room for new ones, potentially slowing down the data retrieval. Ops
won't see it and will not take necessary action.

The solution is to report stats only for one instance of cache whose
utilization is the highest.

Alternatives considered:
- #10123. Might work, but breaks the encapsulation and can potentially
be slower
- Do not aggregate the stats and report is per-indexDB. This increases
the number of metrics and makes it dependent on the number of indexDB
instances (which can be many once #8134 is released).

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8134
2025-12-09 12:43:50 +01:00
JAYICE
76f5def301 dashboard: fix page fault panel (#10141)
add `[$__rate_interval]` to fix page fault panel introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9977
2025-12-09 12:41:28 +01:00
Artem Fetishev
3be5ed0e32 Revert "lib/storage: after deleting series, reset tsid only once" (#10143)
This reverts commit dbe71700b5.

tsidCache is persistent and must be reset before deletedMetricID records
are added to the index. THis is needed to handle ungraceful shutdowns
properly.
2025-12-09 10:57:08 +01:00
Aliaksandr Valialkin
4ac40d955b lib/prompb: use MetricMetadataType type for MetricMetadata.Type field
This eliminates the need of manual conversion between MetricMetadataType and uint32 / int32.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2974

This is a follow-up for the commit 5a587f2006
2025-12-08 20:31:29 +01:00
Artem Fetishev
dc5d7aa4ce lib/storage: properly report dateMetricIDCache stats
A number of changes to `dateMetricIDCache` stats and configuration:

1. Export `SizeMaxBytes` metric and make the size configurable via a
flag
2. Fix `EntriesCount` and `SizeBytes` stats. Previously the cache
reported this stats for its immutable part only. Whereas there are cases
when the number of entries in its mutable part is comparable with the
number in immutable part. The stats from the mutable part remains
invisible until it is sync'ed to the immutable part. It is also possible
that the cache gets reset after the sync because the cache size exceeds
the max allowed size. Reporting the stats for both mutable and immutable
parts should provide a clear picture of the cache utilization.

Together, SizeBytes and SizeMaxBytes should enable tracking the cache
utilization properly. And take appropriate actions if necessary (such as
adjusting the memory resources and/or cache size limit via a flag).

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10064
2025-12-08 14:17:58 +01:00
JAYICE
244769a00d vmstorage: skip last sleep when closing vminsertSrv connections
After closing last connection to vminsert, vmstorage will still wait for
an interval, causing actual shutdown time will be always longger than
configurations.

This commit just skip the last sleep

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10136
2025-12-08 14:10:17 +01:00
Max Kotliar
8e81d54851 Revert "dashboards: add memory usage breakdown panels into Drilldown sections"
This reverts commit 5117cde8bc.
2025-12-08 13:42:10 +02:00
Max Kotliar
5117cde8bc dashboards: add memory usage breakdown panels into Drilldown sections
Right now we have two separate panels: RSS memory % usage and RSS
anonymous memory % usage. This makes trend comparison difficult because
one have to visually correlate two independent panels. Another problem
is that these panels don't show Go runtime allocations at all. The same
applies to memory allocated in C. There are allocations in C (zstd) one
should account for but there is no even a metric to expose it.

The commit adds Memory usage breakdown panel into Drilldown section. It
provides insight into Go Stack, Go Heap, Go Heap Released, Go Other,
Mmap: VM Cache, File cache memory distribution

It should help spot trends changes in memory by type or invistigate
issues such as #10069 and #10028 easier.

Panel info:
This panel shows memory usage by category.

How to use:
- Start from the high-level RSS panel.
- Identify an instance with unexpected or abnormal memory growth.
- Filter to that instance to inspect the detailed breakdown here.

Interpretation
- A steadily rising Go Heap usually indicates a memory leak. Collect
pprof memory profile.
- A growing Go Stack commonly points to a goroutine leak.
2025-12-08 13:39:34 +02:00
Artem Fetishev
85367cae38 Idb blockcache metrics unittest (#10050)
indexDB has 3 block caches. These caches export metrics. Storage
collects these
metrics for each indexDB it has (currently prev and curr only).

There is a potential problem:
- These caches are shared by all indexDBs
- Each indexDB reports the block cache metrics.
- Storage collects the metrics of all indexDBs by adding them together.

I.e. it is possible to count block cache metrics several times.
It is not the case in current implementation because the addition of the
metrics
is not performed intentionally.

The added unit test 1) demonstrates that the resulting counts are
reported
correctly and 2) protects from future unintentional changes in this
behavior.

Additionally a code comment is added to explain why block cache metrics
are not summed up.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-06 18:14:52 +01:00
Aliaksandr Valialkin
159b71cabb lib/protoparser/influx: properly clean references to underlying byte slices from tagsPool and fieldsPool inside unmarshalContext
This should prevent from memory leaks when unmarshalContext fields point to unused byte slices.
2025-12-06 11:52:57 +01:00
Aliaksandr Valialkin
78b8c773ae docs/victoriametrics/: remove misleading statement about extending ext4 partition to 16TB+
It is enough to recommend the given format options for disks with 1TB+ sizes
2025-12-05 23:00:47 +01:00
Nikolay
aab92d3c0f protoparser/influx: reduce memory allocation (#10109)
Previously, influx parser allocated a new slice byte for
unescape of Row fields. It adds extra pressure at GC and increases CPU
usage.

 This commit changes escape to in-place updates for provided []byte.
Since request for parsing is actually a []byte converted into the
string, it's safe to update it in-place. To be able to interact with
[]byte directly, this commit changes parser API and accepts []byte
instead of string.

Benchstat:
```
                                 │   before    │                after                │
                                 │   sec/op    │   sec/op     vs base                │
RowsUnmarshalUnescape-10           74.68n ± 4%   54.23n ± 5%  -27.38% (p=0.000 n=10)
RowsUnmarshalUnescapeNoEscape-10   40.41n ± 2%   42.59n ± 1%   +5.39% (p=0.000 n=10)
geomean                            54.93n        48.06n       -12.51%

                                 │    before    │                after                 │
                                 │     B/s      │     B/s       vs base                │
RowsUnmarshalUnescape-10           1.035Gi ± 4%   1.425Gi ± 5%  +37.72% (p=0.000 n=10)
RowsUnmarshalUnescapeNoEscape-10   1.613Gi ± 2%   1.531Gi ± 1%   -5.11% (p=0.000 n=10)
geomean                            1.292Gi        1.477Gi       +14.32%

                                 │   before    │                after                 │
                                 │    B/op     │    B/op     vs base                  │
RowsUnmarshalUnescape-10           149.00 ± 0%   96.00 ± 0%  -35.57% (p=0.000 n=10)
RowsUnmarshalUnescapeNoEscape-10    80.00 ± 0%   80.00 ± 0%        ~ (p=1.000 n=10) ¹
geomean                             109.2        87.64       -19.73%
¹ all samples are equal

                                 │   before   │                after                 │
                                 │ allocs/op  │ allocs/op   vs base                  │
RowsUnmarshalUnescape-10           5.000 ± 0%   1.000 ± 0%  -80.00% (p=0.000 n=10)
RowsUnmarshalUnescapeNoEscape-10   1.000 ± 0%   1.000 ± 0%        ~ (p=1.000 n=10) ¹
geomean                            2.236        1.000       -55.28%
```

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10053

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2025-12-05 18:18:07 +01:00
Nikolay
2bef26288e lib/memory: add validation for remaining system memory
Previously, if user defined value for `memory.allowedBytes` flag
exceeded system memory limit, remaining memory could take negative
value. It results into incorrect memory auto-detect calculations for
various components. Such as vmstorage unique timeseries limit and parts
size.

 This commit adds negative value check. And also logs system memory
limit at start-up of vm components.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10083
2025-12-05 18:14:04 +01:00
Hui Wang
c14dbad33b vmselect: disable rollup result cache for instant queries that contain rate function
Previously, in order to cache results for `rate`, we consider
`rate(m[d])` as `(increase(m[d]) / d)` and cache the `increase` result.
However, in MetricsQL, `rate(d) = (lastValue - firstValue) /
(lastTimestamp - firstTimestamp)`, so it does not equal to
`increase(d)/d` if `d != (lastTimestamp - firstTimestamp)`.
Although the issue primarily arises when the time series samples are not
continuous, but the discrepancy is hard to debug and can be confusing to
users. Because the range query doesn't use this optimization, causing
recording rule results to
differ from raw query results in VMUI. 
Therefore, it is better to disable the usage and only enable it when we
can cache it correctly.

fixes https://github.com/VictoriaMetrics/victoriaMetrics/issues/10098
2025-12-05 17:38:32 +01:00
Artem Fetishev
dbe71700b5 lib/storage: after deleting series, reset tsid only once
As indexDBs became independent from each other, the tsidCache is now
reset more than once when the DeleteSeries() operation is performed. But
it needs to be performed only once. Thus, move the deletion from indexDB
to Storage.

Follow-up for 16d75ab0bd.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10119
2025-12-05 17:38:02 +01:00
Hui Wang
d4fa326659 vmselect: reset rollup result cache with -search.disableCache when necessary
There’s no need to call `c.Reset()` for rollup result cache if it’s not
persisted(`-cacheDataPath` not specified) or has already been cleared by
`-search.resetRollupResultCacheOnStartup`, as it is already newly
created.


Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10095
2025-12-05 17:37:30 +01:00
Andrei Baidarov
040ef931d1 vmalert: do not increment errors counter on cancel context errors
Follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10027

`vmalert_alerting_rules_errors_total` increments on any error


445f30a4a6/app/vmalert/rule/alerting.go (L455-L460)

while `vmalert_execution_errors_total` only on non-cancellation ones


445f30a4a6/app/vmalert/rule/group.go (L747-L756)

This commit ignores cancellation errors in
`vmalert_alerting_rules_errors_total` too
2025-12-05 17:36:42 +01:00
JAYICE
474009a7f1 dashboard: add page faults panel for vmsingle&vmcluster (#9977)
### Describe Your Changes
add page fault panel in `Troubleshooting`section for vmcluster and
vmsingle. fix
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9974

The query
```
sum(rate(process_minor_pagefaults_total{job=~"$job", instance=~"$instance"})) by (job,instance)

sum(rate(process_major_pagefaults_total{job=~"$job", instance=~"$instance"})) by (job,instance)
```

<img width="1088" height="306" alt="image"
src="https://github.com/user-attachments/assets/4b4ac884-5372-4141-a429-ac0b296dc926"
/>
2025-12-05 18:04:44 +02:00
Nikolay
1b1442d91b app/vmgateway: properly handle proxy request errors
Previously vmgateway didn't handle http.Abort error.
It could lead to the unexpected panic at webserver.

This commit adds panic recover and prevent app from crash.
2025-12-05 16:32:50 +01:00
Aliaksandr Valialkin
3e359dc920 lib/protoparser/influx: remove IgnoreErrors field from Rows and replace it with the explicit skipInvalidLines arg at Rows.Unmarshal()
This improves the maintainability of the code, since the caller of Rows.Unmarshal() always knows
whether invalid lines must be skipped.

While at it, add missing error checks returned from Rows.Unmarshal().

This is a follow-up for the commit daa7183749

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7090
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7165
2025-12-05 16:24:54 +01:00
Hui Wang
e41f642a59 add flag description for -selectNode (#10022) 2025-12-05 14:53:06 +02:00
Artem Fetishev
7a2cc7fbad lib/storage: use deadline instead is.deadline
This makes SearchTSIDs() consistent with SearchMetricNames().

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-12-05 02:08:14 +01:00
Aliaksandr Valialkin
a7b99dd164 vendor: update github.com/VictoriaMetrics/easyproto from v0.1.4 to v1.0.0 2025-12-04 21:47:20 +01:00
Andrii Chubatiuk
647f107576 vmui: always add /prometheus prefix while generating backend url 2025-12-04 18:09:47 +02:00
dependabot[bot]
04f8296c85 build(deps-dev): bump js-yaml from 4.1.0 to 4.1.1 in /app/vmui/packages/vmui (#10017)
Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 4.1.0 to 4.1.1.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md">js-yaml's
changelog</a>.</em></p>
<blockquote>
<h2>[4.1.1] - 2025-11-12</h2>
<h3>Security</h3>
<ul>
<li>Fix prototype pollution issue in yaml merge (&lt;&lt;)
operator.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="cc482e7759"><code>cc482e7</code></a>
4.1.1 released</li>
<li><a
href="50968b862e"><code>50968b8</code></a>
dist rebuild</li>
<li><a
href="d092d86603"><code>d092d86</code></a>
lint fix</li>
<li><a
href="383665ff42"><code>383665f</code></a>
fix prototype pollution in merge (&lt;&lt;)</li>
<li><a
href="0d3ca7a27b"><code>0d3ca7a</code></a>
README.md: HTTP =&gt; HTTPS (<a
href="https://redirect.github.com/nodeca/js-yaml/issues/678">#678</a>)</li>
<li><a
href="49baadd52a"><code>49baadd</code></a>
doc: 'empty' style option for !!null</li>
<li><a
href="ba3460eb9d"><code>ba3460e</code></a>
Fix demo link (<a
href="https://redirect.github.com/nodeca/js-yaml/issues/618">#618</a>)</li>
<li>See full diff in <a
href="https://github.com/nodeca/js-yaml/compare/4.1.0...4.1.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=js-yaml&package-manager=npm_and_yarn&previous-version=4.1.0&new-version=4.1.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-04 17:42:17 +02:00
dependabot[bot]
1c3e64e9ad build(deps): bump actions/checkout from 4 to 6 (#10082)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to
6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/releases">actions/checkout's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>v6-beta by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2298">actions/checkout#2298</a></li>
<li>update readme/changelog for v6 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2311">actions/checkout#2311</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5.0.0...v6.0.0">https://github.com/actions/checkout/compare/v5.0.0...v6.0.0</a></p>
<h2>v6-beta</h2>
<h2>What's Changed</h2>
<p>Updated persist-credentials to store the credentials under
<code>$RUNNER_TEMP</code> instead of directly in the local git
config.</p>
<p>This requires a minimum Actions Runner version of <a
href="https://github.com/actions/runner/releases/tag/v2.329.0">v2.329.0</a>
to access the persisted credentials for <a
href="https://docs.github.com/en/actions/tutorials/use-containerized-services/create-a-docker-container-action">Docker
container action</a> scenarios.</p>
<h2>v5.0.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5...v5.0.1">https://github.com/actions/checkout/compare/v5...v5.0.1</a></p>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
<li>Prepare v5.0.0 release by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2238">actions/checkout#2238</a></li>
</ul>
<h2>⚠️ Minimum Compatible Runner Version</h2>
<p><strong>v2.327.1</strong><br />
<a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Release
Notes</a></p>
<p>Make sure your runner is updated to this version or newer to use this
release.</p>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v4...v5.0.0">https://github.com/actions/checkout/compare/v4...v5.0.0</a></p>
<h2>v4.3.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Port v6 cleanup to v4 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v4...v4.3.1">https://github.com/actions/checkout/compare/v4...v4.3.1</a></p>
<h2>v4.3.0</h2>
<h2>What's Changed</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/blob/main/CHANGELOG.md">actions/checkout's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<h2>V6.0.0</h2>
<ul>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
</ul>
<h2>V5.0.1</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<h2>V5.0.0</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
</ul>
<h2>V4.3.1</h2>
<ul>
<li>Port v6 cleanup to v4 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li>
</ul>
<h2>V4.3.0</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
<li>Adjust positioning of user email note and permissions heading by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li>
<li>Update CODEOWNERS for actions by <a
href="https://github.com/TingluoHuang"><code>@​TingluoHuang</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li>
<li>Update package dependencies by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li>
</ul>
<h2>v4.2.2</h2>
<ul>
<li><code>url-helper.ts</code> now leverages well-known environment
variables by <a href="https://github.com/jww3"><code>@​jww3</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li>
<li>Expand unit test coverage for <code>isGhes</code> by <a
href="https://github.com/jww3"><code>@​jww3</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li>
</ul>
<h2>v4.2.1</h2>
<ul>
<li>Check out other refs/* by commit if provided, fall back to ref by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li>
</ul>
<h2>v4.2.0</h2>
<ul>
<li>Add Ref and Commit outputs by <a
href="https://github.com/lucacome"><code>@​lucacome</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1180">actions/checkout#1180</a></li>
<li>Dependency updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>- <a
href="https://redirect.github.com/actions/checkout/pull/1777">actions/checkout#1777</a>,
<a
href="https://redirect.github.com/actions/checkout/pull/1872">actions/checkout#1872</a></li>
</ul>
<h2>v4.1.7</h2>
<ul>
<li>Bump the minor-npm-dependencies group across 1 directory with 4
updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1739">actions/checkout#1739</a></li>
<li>Bump actions/checkout from 3 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1697">actions/checkout#1697</a></li>
<li>Check out other refs/* by commit by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1774">actions/checkout#1774</a></li>
<li>Pin actions/checkout's own workflows to a known, good, stable
version. by <a href="https://github.com/jww3"><code>@​jww3</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1776">actions/checkout#1776</a></li>
</ul>
<h2>v4.1.6</h2>
<ul>
<li>Check platform to set archive extension appropriately by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1732">actions/checkout#1732</a></li>
</ul>
<h2>v4.1.5</h2>
<ul>
<li>Update NPM dependencies by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1703">actions/checkout#1703</a></li>
<li>Bump github/codeql-action from 2 to 3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1694">actions/checkout#1694</a></li>
<li>Bump actions/setup-node from 1 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1696">actions/checkout#1696</a></li>
<li>Bump actions/upload-artifact from 2 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1695">actions/checkout#1695</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1af3b93b68"><code>1af3b93</code></a>
update readme/changelog for v6 (<a
href="https://redirect.github.com/actions/checkout/issues/2311">#2311</a>)</li>
<li><a
href="71cf2267d8"><code>71cf226</code></a>
v6-beta (<a
href="https://redirect.github.com/actions/checkout/issues/2298">#2298</a>)</li>
<li><a
href="069c695914"><code>069c695</code></a>
Persist creds to a separate file (<a
href="https://redirect.github.com/actions/checkout/issues/2286">#2286</a>)</li>
<li><a
href="ff7abcd0c3"><code>ff7abcd</code></a>
Update README to include Node.js 24 support details and requirements (<a
href="https://redirect.github.com/actions/checkout/issues/2248">#2248</a>)</li>
<li><a
href="08c6903cd8"><code>08c6903</code></a>
Prepare v5.0.0 release (<a
href="https://redirect.github.com/actions/checkout/issues/2238">#2238</a>)</li>
<li><a
href="9f265659d3"><code>9f26565</code></a>
Update actions checkout to use node 24 (<a
href="https://redirect.github.com/actions/checkout/issues/2226">#2226</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/checkout/compare/v4...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/checkout&package-manager=github_actions&previous-version=4&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-04 17:40:42 +02:00
Hui Wang
4212491031 vmalert: clarify templating in alerting rule labels (#10121)
follow up
38dd971f58.

Labels only support limited templating variables in
https://docs.victoriametrics.com/victoriametrics/vmalert/#templating,
including `$labels`, `$value` and `expr`, to avoid breaking alert states
or causing cardinality issue with results.
2025-12-04 17:35:27 +02:00
Zakhar Bessarab
f76bc956ca app/vmctl: respect context cancellation during user prompts
Previously, context cancellation was ignored when reading user response
for the prompt. That leads to ignoring of "Ctrl+C" and other termination
signals to vmctl until user finishes the input.

Fix that by properly propagating the context and respecting the
cancellation of the context.


Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-12-04 15:57:31 +04:00
Aliaksandr Valialkin
655074c3e0 lib/protoparser/opentelemetry/pb: remove code related to parsing logs in OTEL format
This code is no longer needed after the commit 4ffb74448d

See https://github.com/VictoriaMetrics/VictoriaLogs/pull/720
2025-12-04 00:49:54 +01:00
Aliaksandr Valialkin
5e95fdf23e docs/victoriametrics/FAQ.md: add a link to the guide on how to calculate the needed disk space at VictoriaLogs at why indexdb size is so large? chapter
This is a follow-up for 68f670cbc5
2025-12-03 15:51:40 +01:00
Aliaksandr Valialkin
ffcfb74b17 deployment: update Go builder from v1.25.4 to v1.25.5
See https://github.com/golang/go/issues?q=milestone%3AGo1.25.5%20label%3ACherryPickApproved
2025-12-03 15:20:11 +01:00
Max Kotliar
fe803bfc6e Capitalize titles in operator.json
Signed-off-by: d3spair <git@agrshv.dev>
2025-12-03 13:43:39 +02:00
Andrii Chubatiuk
8ee466ab06 dashboard: add panels for operator flags and global params 2025-12-03 13:28:59 +02:00
Sylvain Rabot
6ca48d5025 lib/vmbackup/s3backup: support custom SSE KMS key id and ACL
Add more S3 configurations.

- SSES3KeyID allows to push to a bucket that is another account as the
KMS key it uses to encrypt data server side.
- ACL allows configure which permissions are given to the object
uploaded on the bucket (usefull when bucket policy expect a given
permission such as `bucket-owner-full-control`).

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
2025-12-03 10:06:57 +04:00
Fred Navruzov
70eb9d39d5 docs/vmanomaly: release v1.28.1 (#10111)
### Describe Your Changes

Updates of docs and examples to `vmanomaly` v1.28.1

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-12-02 21:08:17 +02:00
Zakhar Bessarab
1985c79a4d deployment: update references to the latest release 2025-12-01 21:16:14 +04:00
Zakhar Bessarab
f0dafacfd3 docs: update references to the latest release 2025-12-01 21:15:13 +04:00
Zakhar Bessarab
6c01f5d50f docs/changelog: backport LTS changelogs 2025-12-01 20:44:43 +04:00
f41gh7
84658e77da docs/changelog: sort changelog entries
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-12-01 11:11:54 +01:00
Zakhar Bessarab
4dc32ff1d7 app/vminsert/netstorage: fix list of nodes used for SD
Previously, vminsert was using original list of addrs instead of
discovered addrs. Properly use discovered list of addrs.
2025-12-01 11:11:53 +01:00
Artem Fetishev
08a1b2e75c lib/lrucache: do not reset requests and misses after cache reset
Follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10072.

Do not reset requests and misses metrics since cache reset implies the
reset of the storage only.
2025-12-01 10:12:47 +01:00
Zakhar Bessarab
7e5b68fc1f docs/changelog: cut v1.131.0 2025-11-28 20:20:08 +04:00
Zakhar Bessarab
dcc130603c docs: update availble from tags 2025-11-28 20:13:42 +04:00
Zakhar Bessarab
9842ad2299 app/vmselect: run make vmui-update 2025-11-28 20:01:08 +04:00
Aliaksandr Valialkin
63c0cf673f Makefile: generate quicktemplate output files only at lib and app directories
Previously the output files were incorrectly generated inside unexpeted directories such as vendor
2025-11-28 16:07:22 +01:00
Nowa Ammerlaan
7f51bb4ce7 protoparser/influx: account for excess white spaces before timestamp
Some influx clients ( such as nimon monitoring client) adds excess white spaces in the influx line and does not set a
timestamp. Since Influx protocol requires whitespace before timestamp only when it set, it could present without timestamp. Whitespace before omitted timestamp confuses parser.

This commit adds check for the skipped timestamp and test case for it.

Fixes: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10049
2025-11-28 14:36:35 +01:00
Nikolay
38df52ea08 app/vmselect: improve performance for multi-level requests
Previously, proxy vmselect (aka 1st level vmselect) performed parsing
of MetricBlock received from vmstorage before forwarding it into top vmselect. It required an additional CPU and Memory, which greatly slowed down query requests.

This commit changes lib/vmselectapi iterator API, instead of MetricBlock, it returns encoded MetricBlock as a byte slice.
It allows to save CPU and memory at proxy vmselect by eliminating need of decoding MetricBlock received from storage.

In addition, it adds the following optimizations for proxy vmselect:
* reduces memory allocations by using iterator pool
 * add per storageNode workerItem for iterator

Also, it adds optimization for vmstorage, it no longer performs extra memory copy of MetricName for MetricBlock.

vmselect and vmstorage metrics vm_vmselect_metric_rows_read_total and vm_metric_rows_read_total were removed, it's not used at any dashboards and rules. New Iterator API doesn't support it.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9899
2025-11-28 13:04:55 +01:00
Max Kotliar
023a13435c dashboards: make dashboards-sync 2025-11-27 16:52:45 +02:00
Max Kotliar
1ddcbed6d7 dashboards: Show "Disk space usage % by type" as stacked graph in Cluster dashboard. (#10089)
### Describe Your Changes

VictoriaMetrics - cluster dashboard.

vmstorage -> Disk space usage % by type pane.

Switch panel to 100% stacked view to show space distribution.

The goal is to highlight how space is split between datapoints and
indexdb types; Simple time-series values made this hard to see. A 100%
stacked layout makes the distribution immediately visible.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9932

was: <img width="1201" height="609" alt="Image"
src="https://github.com/user-attachments/assets/1d199e65-5a20-4c63-a251-b7087020f42a"
/>


now: 
<img width="1208" height="608" alt="Screenshot 2025-11-27 at 13 14 51"
src="https://github.com/user-attachments/assets/96aa32f3-1243-486b-bac8-2d3c0f4bdb7a"
/>


### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-27 16:50:15 +02:00
Aliaksandr Valialkin
edd02cdb5b docs/victoriametrics/goals.md: clarify that bugs, which affect a small number of users at rare edge cases, can be fixed later 2025-11-27 14:29:17 +01:00
Artem Fetishev
4cd727a511 lib/storage: use lrucache for tfss cache (#10072)
The purpose of this PR is the same as #10000, except `lrucache` is used
for implementing tfss cache.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-27 14:18:03 +01:00
Andrii Chubatiuk
19c0477976 chore(app/vmui): conditionally render accordion children (#10068)
### Describe Your Changes

revert change, that was introduced in
483e00ffb9
since rendering of all nested children significantly impacts alerting
tab performance in case of multiple items
@Loori-R @arturminchukov , what do you think about using react-virtuoso
additionally for alerting tab to decrease dom size?

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-27 14:31:34 +02:00
Ben Randall
4fdd8f0906 lib/protoparser/opentelemetry: use separate loggers for unsupported delta temporality/metric type logs (#10021)
A throttled logger will continue to log messages occasionally with a
suffix indicating how many similar logs were throttled. Using the same
logger for multiple log messages can result in certain logs being
entirely suppressed and invisible in the logs. This updates most of the
loggers used in `appendFromScopeMetrics` to be their own logger so that
"unsupported delta temporality/metric type" logs will be visible for all
metric types. Additionally, `skippedSampleLogger` is only used by
`appendSamplesFromHistogram` so this was moved closer to that function.

Related to #9447
Related to #9498

- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
2025-11-27 14:19:43 +02:00
Andrii Chubatiuk
9897872ca9 lib/flagutil: clarify usage of quotes in array flag values 2025-11-27 14:17:07 +02:00
Hui Wang
b8bbb07431 dashboard: tidy vmauth panels (#10088)
before:
<img width="2498" height="1042" alt="image"
src="https://github.com/user-attachments/assets/0bbd7cc2-7062-494f-827b-96d86133537f"
/>
after:
<img width="2497" height="968" alt="image"
src="https://github.com/user-attachments/assets/6256ccc2-2f8f-40ea-a23b-a1a20e242b3c"
/>
which is more consistent with other dashabords.
2025-11-27 14:12:53 +02:00
Max Kotliar
eb1c8dd67d docs: add links to issues in changelog 2025-11-27 14:09:02 +02:00
Aliaksandr Valialkin
50fc48ac47 lib/fs: avoid Go runtime stalls on Linux when all the GOMAXPROCS threads are blocked in major pagefaults while reading the data from memory-mapped files
Go runtime executes all the goroutines on GOMAXPROCS operating system threads.
Go runtime cannot switch the OS thread to another goroutine if the current goroutine
is stuck in the major pagefault while reading the data from memory-mapped file,
because Go runtime doesn't distiguinsh between reading from regular memory and reading
from memory-mapped file. So the OS thread becomes stuck while waiting until the OS
reads the data from file at the requested memory address and returns back control to Go application.

In the worst case it is possible that all the GOMAXPROCS threads are stuck in major pagefaults,
so Go runtime pauses executing all the goroutines. This state is possible in environments
with small GOMAXPROCS and high-latency disks such as NFS or small HDD-based disks at AWS.

See https://valyala.medium.com/mmap-in-go-considered-harmful-d92a25cb161d for more details.

This commit protects from such stalls by verifying whether the given memory location from memory-mapped file
is already loaded in the OS page cache before reading from that memory.
If the location isn't in the OS page cache, then it falls back to pread() syscall for reading the data from file.
Go runtime allocates extra OS threads for long-running syscalls, so it can continue executing goroutines
across all the GOMAXPROCS threads while reading the data from slow storage via pread() syscall.

This commit uses mincore() syscall for detecting whether the given memory page is available in the OS page cache.
It also caches mincore() results for up to a minute in order to reduce the overhead for the mincore() syscall.

This commit reduces the increase rate for the process_major_pagefaults_total metric by multiple orders of magnitude
on systems with high-latency disks.
2025-11-26 20:52:27 +01:00
Artem Fetishev
3bd9c75acc lib/lrucache: use uint64 for SizeBytes() and SizeMaxBytes() (#10077)
Currently, `lrucache.Cache` `SizeBytes()` and `SizeMaxBytes()` return
type is `int`. The cache `Entry.SizeBytes()` also returns `int` value.
Changing the type to `uint64` will allow using `uint64set.Set` as the
cache entry type (see #10072).

Please note that using `uint64` regardless the cpu architecture is set
is not entirely correct, because in 32-bit systems the size won't ever
get bigger than `2^32`, so the `uint64` will too much. However current
type (`int`) is not correct either since it is signed and will only
allow to store values up to `2^31`. Alternatively, all `SizeBytes()`
methods should return `uint`.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-26 11:39:07 +01:00
Max Kotliar
9c0683f8d1 dashboards: run make dashboards-sync 2025-11-25 20:13:06 +02:00
Max Kotliar
bf4660912f .github: Add changelog tip linter 2025-11-25 13:41:48 +02:00
Yury Molodov
bb54b5e661 app/vmui: improve alert styles for better readability (#10012)
### Describe Your Changes

This PR improves vmui alert styles by adding borders between rows,
introducing a hover state for easier row identification, and aligning
badges to the left.

Related issue: #9856

| Before | After |
|--------|--------|
| <img width="1427" height="1310" alt="image"
src="https://github.com/user-attachments/assets/68f3469e-95df-449f-a85d-1c0285520e2d"
/> | <img width="1427" height="1310" alt="Image"
src="https://github.com/user-attachments/assets/89501efb-c66f-402a-9d14-01c86930a5e2"
/> |

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-11-25 13:38:24 +02:00
Andrii Chubatiuk
200a729565 app/vmui: fixed ability to select multiple metrics in explore metrics tab (#10008)
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9995

change in only `Select` component leads to infinite
ExploreMetricsGraphItem component refresh since each time array has a
new reference

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-25 13:29:45 +02:00
Yury Molodov
7303495ae1 app/vmui: fix rendering of multiple points at the same timestamp (#10010)
### Describe Your Changes

1. Removed the *step* control from the **Raw Query** page, as it didn’t
affect chart rendering and caused confusion.
2. Fixed rendering of multiple points with the same timestamp -
previously, the second point was hidden.
3. Added proper visualization for points with the same timestamp and
identical values: such points are now shown as a square, and the tooltip
displays the number of duplicates.

**Example:**

```json
{
  "values": [1, 22, 10, 10, 5, 6],
  "timestamps": [
    1761955247950,
    1761955247950,
    1761955248960,
    1761955248960,
    1761955251980,
    1761955252990
  ]
}
```

<img width="500" height="1120" alt="image"
src="https://github.com/user-attachments/assets/192aa43e-8008-4f03-8966-00f59e52ec40"
/>
<img width="300" height="676" alt="image"
src="https://github.com/user-attachments/assets/8e361cb3-1286-452a-a687-b6b40ba7807b"
/>

Related issues: #9667 and #9666

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2025-11-25 11:00:00 +02:00
Andrei Baidarov
98b5288e9c vmselect: do not immediately fail request if vmstorage returns search… (#10030)
….maxConcurrentRequests error

If `vmstorage` is currently overloaded it could return
maxConcurrentRequests error. Now `vmselect` immediately fails the whole
request even if `replicationFactor` is set up and other replicas could
respond without errors.

This PR treats them as regular errors, not fatal ones.

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-24 20:37:37 +02:00
Cancai Cai
d7f9cd971d docs/notes: fix syntax errors (#10019)
### Describe Your Changes

I'm not sure if this is a mistake.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: cancaicai <2356672992@qq.com>
2025-11-24 20:37:05 +02:00
cancaicai
2cb08095c6 docs/storage: fix typo
Signed-off-by: cancaicai <2356672992@qq.com>
2025-11-24 15:46:40 +02:00
dependabot[bot]
8bc41f4c79 build(deps): bump golang.org/x/crypto from 0.43.0 to 0.45.0 (#10052)
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from
0.43.0 to 0.45.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="4e0068c009"><code>4e0068c</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="e79546e28b"><code>e79546e</code></a>
ssh: curb GSSAPI DoS risk by limiting number of specified OIDs</li>
<li><a
href="f91f7a7c31"><code>f91f7a7</code></a>
ssh/agent: prevent panic on malformed constraint</li>
<li><a
href="2df4153a03"><code>2df4153</code></a>
acme/autocert: let automatic renewal work with short lifetime certs</li>
<li><a
href="bcf6a849ef"><code>bcf6a84</code></a>
acme: pass context to request</li>
<li><a
href="b4f2b62076"><code>b4f2b62</code></a>
ssh: fix error message on unsupported cipher</li>
<li><a
href="79ec3a51fc"><code>79ec3a5</code></a>
ssh: allow to bind to a hostname in remote forwarding</li>
<li><a
href="122a78f140"><code>122a78f</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="c0531f9c34"><code>c0531f9</code></a>
all: eliminate vet diagnostics</li>
<li><a
href="0997000b45"><code>0997000</code></a>
all: fix some comments</li>
<li>Additional commits viewable in <a
href="https://github.com/golang/crypto/compare/v0.43.0...v0.45.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=golang.org/x/crypto&package-manager=go_modules&previous-version=0.43.0&new-version=0.45.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 15:13:50 +02:00
Zhu Jiekun
bb1e0d8f3b opentsdb: Avoid blocking when a connection doesn't send anything (#10045)
### Describe Your Changes

fix #9987 

Avoid blocking when a connection to `-opentsdbListenAddr` doesn't send
any data. This issue blocked other connections from being handled.

> This bug can be tested with:
> 1. Start VictoriaMetrics Single-node with `-opentsdbListenAddr=:4242`.
> 2. Run: `telnet 127.0.0.1 4242` without typing any data after
connection established.
> 3. Run (in another terminal, after step 2): `curl -H 'Content-Type:
application/json' -d
'{"metric":"x.y.z","value":2222222.34,"tags":{"t1":"v1","t2":"v2"}}'
http://localhost:4242/api/put`
> 
> Before the change:
> - Step 3 was blocked infinitely.
> 
> Expect result after the change:
> - Step 3 was executed.
> - Connection established by step 2 will be closed after 5 seconds.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-11-24 14:31:19 +02:00
Mathias Palmersheim
70be2e7ea3 Remove threshold from available cpu panel (#10056)
### Describe Your Changes

fixes #9988 by removing the cpu threshold from the Available CPU panel

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-24 14:16:35 +02:00
Kirill Yurkov
61796e355a docs: link faq for large indexdb (#10061)
Clarified the index size note in
docs/guides/understand-your-setup-size/README.md to steer readers toward
the FAQ when indexdb feels oversized, noting typical ratios and
troubleshooting guidance.
2025-11-24 14:04:05 +02:00
dependabot[bot]
2ae3fd47eb build(deps): bump actions/checkout from 5 to 6 (#10060)
Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to
6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/releases">actions/checkout's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>v6-beta by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2298">actions/checkout#2298</a></li>
<li>update readme/changelog for v6 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2311">actions/checkout#2311</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5.0.0...v6.0.0">https://github.com/actions/checkout/compare/v5.0.0...v6.0.0</a></p>
<h2>v6-beta</h2>
<h2>What's Changed</h2>
<p>Updated persist-credentials to store the credentials under
<code>$RUNNER_TEMP</code> instead of directly in the local git
config.</p>
<p>This requires a minimum Actions Runner version of <a
href="https://github.com/actions/runner/releases/tag/v2.329.0">v2.329.0</a>
to access the persisted credentials for <a
href="https://docs.github.com/en/actions/tutorials/use-containerized-services/create-a-docker-container-action">Docker
container action</a> scenarios.</p>
<h2>v5.0.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5...v5.0.1">https://github.com/actions/checkout/compare/v5...v5.0.1</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/blob/main/CHANGELOG.md">actions/checkout's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<h2>V6.0.0</h2>
<ul>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
</ul>
<h2>V5.0.1</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<h2>V5.0.0</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
</ul>
<h2>V4.3.1</h2>
<ul>
<li>Port v6 cleanup to v4 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li>
</ul>
<h2>V4.3.0</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
<li>Adjust positioning of user email note and permissions heading by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li>
<li>Update CODEOWNERS for actions by <a
href="https://github.com/TingluoHuang"><code>@​TingluoHuang</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li>
<li>Update package dependencies by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li>
</ul>
<h2>v4.2.2</h2>
<ul>
<li><code>url-helper.ts</code> now leverages well-known environment
variables by <a href="https://github.com/jww3"><code>@​jww3</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li>
<li>Expand unit test coverage for <code>isGhes</code> by <a
href="https://github.com/jww3"><code>@​jww3</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li>
</ul>
<h2>v4.2.1</h2>
<ul>
<li>Check out other refs/* by commit if provided, fall back to ref by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li>
</ul>
<h2>v4.2.0</h2>
<ul>
<li>Add Ref and Commit outputs by <a
href="https://github.com/lucacome"><code>@​lucacome</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1180">actions/checkout#1180</a></li>
<li>Dependency updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>- <a
href="https://redirect.github.com/actions/checkout/pull/1777">actions/checkout#1777</a>,
<a
href="https://redirect.github.com/actions/checkout/pull/1872">actions/checkout#1872</a></li>
</ul>
<h2>v4.1.7</h2>
<ul>
<li>Bump the minor-npm-dependencies group across 1 directory with 4
updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1739">actions/checkout#1739</a></li>
<li>Bump actions/checkout from 3 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1697">actions/checkout#1697</a></li>
<li>Check out other refs/* by commit by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1774">actions/checkout#1774</a></li>
<li>Pin actions/checkout's own workflows to a known, good, stable
version. by <a href="https://github.com/jww3"><code>@​jww3</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1776">actions/checkout#1776</a></li>
</ul>
<h2>v4.1.6</h2>
<ul>
<li>Check platform to set archive extension appropriately by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1732">actions/checkout#1732</a></li>
</ul>
<h2>v4.1.5</h2>
<ul>
<li>Update NPM dependencies by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1703">actions/checkout#1703</a></li>
<li>Bump github/codeql-action from 2 to 3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1694">actions/checkout#1694</a></li>
<li>Bump actions/setup-node from 1 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1696">actions/checkout#1696</a></li>
<li>Bump actions/upload-artifact from 2 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1695">actions/checkout#1695</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1af3b93b68"><code>1af3b93</code></a>
update readme/changelog for v6 (<a
href="https://redirect.github.com/actions/checkout/issues/2311">#2311</a>)</li>
<li><a
href="71cf2267d8"><code>71cf226</code></a>
v6-beta (<a
href="https://redirect.github.com/actions/checkout/issues/2298">#2298</a>)</li>
<li><a
href="069c695914"><code>069c695</code></a>
Persist creds to a separate file (<a
href="https://redirect.github.com/actions/checkout/issues/2286">#2286</a>)</li>
<li><a
href="ff7abcd0c3"><code>ff7abcd</code></a>
Update README to include Node.js 24 support details and requirements (<a
href="https://redirect.github.com/actions/checkout/issues/2248">#2248</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/checkout/compare/v5...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/checkout&package-manager=github_actions&previous-version=5&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 14:00:22 +02:00
Max Kotliar
ebad7e5496 docs: Describe relation between slow inserts and unsorted labels. 2025-11-24 13:35:23 +02:00
Max Kotliar
e52de06ee5 docs: sync flags in docs with actual binaries 2025-11-24 13:16:22 +02:00
Aliaksandr Valialkin
38dd971f58 docs/victoriametrics/vmalert.md: clarify that templates can be used inside rule labels
Rule labels can contain templates in the same way as annotations.
See aad6ab009e/app/vmalert/rule/alerting_test.go (L1192)
and https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#templating

Document this, since users sometimes ask this question.
2025-11-24 10:50:55 +01:00
Artem Fetishev
aad6ab009e lib/storage: minor metricNameSearch fixes (#10065)
- Fix comment
- Re-use dst instead introducing a new variable.

This change has been requested to be in a separated PR during the
pt-index (#8134) code review.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 20:06:20 +01:00
Artem Fetishev
2c125e14c7 lib/storage: also create parts.json on parition creation (#10051)
Currently, when a partition is created its corresponding parts.json file
is not created right away (see createNewParition()). Its creation is
delayed until the first part files are created on disk (see
swapSrcWithDstParts()). However, the parts.json file is created for a
possibly empty partition when an existing partition is opened (see
mustOpenPartition()) and when a partition snapshot is create (see
MustCreateSnapshotAt()).

I.e. `parts.json` is an important part of a partition, since it is an
artifact that describes the partition contents. And it should be created
on pt creation even if its contents is empty.

To be honest, this change is mostly a no-op for the current storage
implementation. It only makes the code consistent, i.e. the parts.json
is created along with the partition.

However having it created when a partition is created becomes in
pt-index (#7599, #8134), because it allows having partitions with no
data and therefore without parts.json file. Still not a big deal but the
unit tests start failing.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 14:19:53 +01:00
Artem Fetishev
13dc60e257 lib/storage: refactoring - move dateMetricIDCache code to a separate file (#10055)
dateMetricIDCache does not belong to storage anymore since it has been
moved to indexDB. Instead moving the case to index_db.go, move it to a
separate file in order to navigate the code more easily.

No changes have been done to the code or tests.

Follow up for: #9983

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Alexander Frolov <9749087+fxrlv@users.noreply.github.com>
2025-11-21 13:52:33 +01:00
Artem Fetishev
ed64c90e7a lib/storage: fix comments related to nextDayMetricIDs
Follow-up for 49b0a4fb16

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 13:31:20 +01:00
Artem Fetishev
49b0a4fb16 lib/storage: refactoring - simplify nextDayMetricIDs data structure (#10058)
The data structure used for holding the nextDayMetricIDs is too complex
and can be simplified (flattened).

Follow up for: #9983

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 13:02:02 +01:00
Artem Fetishev
5141496c43 lib/storage: add overlapsWith() and contains() methods to TimeRange (#10059)
The change was introduced in pt-index PR (#8134) and is extracted into a
separate PR.

Currently used in partition_search and partition. If you see more places
like this, please let me know.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 12:24:40 +01:00
Andrii Chubatiuk
24fac64875 docs: add warning blockquote regarding latest backup lifecycle policy (#10054)
Update formatting for warning text.

<img width="732" height="432" alt="image"
src="https://github.com/user-attachments/assets/1549e69a-fc65-445f-b567-9b5e4e1a8617"
/>
2025-11-20 13:46:34 +04:00
Aliaksandr Valialkin
8250f469a7 docs/victoriametrics/Articles.md: add https://medium.com/@kanakaraju896/backing-up-victoriametrics-data-a-complete-guide-24473c74450f 2025-11-20 08:36:59 +01:00
Aliaksandr Valialkin
7fb0f0e015 docs/victoriametrics/Articles.md: add https://blackmetalz.github.io/why-i-switched-to-victoriametrics-scaling-from-small-business-to-enterprise.html 2025-11-20 08:33:19 +01:00
Andrii Chubatiuk
563dbeaea1 app/vmalert: do not increment errors counter on cancel context
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10027
2025-11-19 13:32:16 +01:00
Nikolay
7e6468c1e3 lib/storage: properly increment missing tsids metric
Bug was introduced at 2380e4829d

Due to typo vm_missing_tsids_for_metric_id_total metric was incremented instead of vm_missing_metric_names_for_metric_id_total for missing metricName for metricID search.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10041
2025-11-19 13:29:44 +01:00
Hui Wang
328f33202f chore: clarify vmalert -external.label usage (#10042)
To clarify that HA vmalert doesn't need to specify `-external.label`.
2025-11-19 13:28:36 +01:00
Fred Navruzov
951331db80 docs/vmanomaly: release v1.28.0 (#10031)
### Describe Your Changes

Upgraded vmanomaly docs & guides to release v1.28.0 (UI v1.2.0)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-18 21:47:03 +02:00
Andrii Chubatiuk
e6139be8ba docs/vmbackupmanager: mention version since which -backupTypeTagName flag is available (#10038)
Mention version since which `backupTypeTagName` flag is available
2025-11-18 18:56:19 +04:00
Andrii Chubatiuk
77e5920014 app/vmbackupmanager: set backup type tag on backup's items
* app/vmbackupmanager: set VMBackupType tag on backup's items

* address review comments
2025-11-18 16:30:13 +04:00
Zakhar Bessarab
78049e991b docs/cluster: remove mention of select for metadata (#10034)
vmselect does not have a flag to enable metadata querying, remove
invalid reference to it from the docs.
2025-11-18 15:32:20 +04:00
Artem Fetishev
c972d70f00 docs: update VictoriaMetrics components version to v1.130.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 22:03:17 +01:00
Artem Fetishev
b947562f2b deployment/docker: update VM components version to v1.130.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 21:56:42 +01:00
Artem Fetishev
344a81fa20 docs: bump last LTS versions
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 20:14:07 +00:00
Artem Fetishev
4b022ea8a8 docs/CHANGELOG.md: update changelog with LTS release notes
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 20:08:16 +00:00
Artem Fetishev
04c24fc831 lib/workingsetcache: Fix bytesSize metric calculation (#10025)
Follow-up for 3e6fc445a9

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 13:49:17 +01:00
Artem Fetishev
d2f78e4b2b docs/CHANGELOG.md: cut v1.130.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-14 17:45:20 +00:00
Max Kotliar
3995837c58 docs: update latest version in docs to v1.130.0 2025-11-14 19:37:14 +02:00
Artem Fetishev
1d53496f98 make vmui-update
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-14 17:21:49 +00:00
Artem Fetishev
73a1ce2dd6 lib/storage: Move dateMetricIDCache to indexDB (#9983)
Looks like the `dateMetricIDCache` must be per indexDB:

- the use of this cache and `is.hasDateMetricID()` often go in pairs. So
it makes
  sense to use this cache in that method.
- The same is true for `createPerDayIndexes()`: everytime the index
entry is
  created, a corresponding entry is added to the cache.
- As a result the generation field is also removed from the cache.

Related to #7599 and #8134.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-14 16:03:28 +01:00
Aliaksandr Valialkin
daa88f6a43 docs/victoriametrics: cross-link rebalancing section at VictoriaMetrics cluster docs and the corresponding question at the FAQ page 2025-11-14 15:36:57 +01:00
Aliaksandr Valialkin
7bff73b0f7 docs/victoriametrics/Cluster-VictoriaMetrics.md: add rebalancing chapter, which explains how to rebalance data among vmstorage nodes
This is very frequent question from new users of VcitoriaMetrcs who migrate from other solutions
with automatic data rebalancing among storage nodes, so it is a good idea to cover it in the docs.
2025-11-14 15:32:29 +01:00
Max Kotliar
bf3b1cf6b6 lib/storage/metricsmetadata: ensure deterministic sorting for identical metric names across tenants
Metrics metadata is loaded from a per-tenant storage map
(perTenantStorage map[uint64]map[string]*Row), so result rows order is
non-deterministic. The existing sortRows implementation only sorts by
metric name and ingestion time, which means rows that differ only by
tenant/account ID still sorted undeterministically.

This change updates `sortRows` to include account\project identifiers in
the comparison, ensuring stable and deterministic ordering for metadata
entries that share the same metric name and timestamp.

First discovered as flaky test:

--- FAIL: TestStorageRead (0.00s)
    storage_test.go:337: unexpected rows get result (-want, +got):
          []*metricsmetadata.Row{
          	&{
          		... // 2 ignored and 1 identical fields
          		Help:      "uselesshelp1",
          		Unit:      "seconds1",
        - 		AccountID: 1,
        + 		AccountID: 0,
        - 		ProjectID: 1,
        + 		ProjectID: 0,
          		Type:      1,
          	},
          	&{
          		... // 2 ignored and 1 identical fields
          		Help:      "uselesshelp1",
          		Unit:      "seconds1",
        - 		AccountID: 0,
        + 		AccountID: 1,
        - 		ProjectID: 0,
        + 		ProjectID: 1,
          		Type:      1,
          	},
          }
FAIL

https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/actions/runs/19361594138/job/55394642029#step:4:133
2025-11-14 15:22:27 +02:00
Max Kotliar
a10ff67354 docs/changelog: Add links to changelog 2025-11-14 13:41:59 +02:00
Haley Wang
9a8463df42 lib/storage: add a value check for retentionFilter to ensure it does not exceed retentionPeriod 2025-11-14 12:50:46 +02:00
Max Kotliar
7e22b169f1 docs: Add metrics metadata how to use in docs
follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9487
2025-11-14 10:37:15 +01:00
f41gh7
80c1af5af1 apptest: add metrics metadata test for vmsingle
related issue github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
2025-11-14 10:29:28 +01:00
f41gh7
5a587f2006 app/{vmstorage,vmselect,vminsert}: introduce metrics metadata storage
This commits adds storage part and cluster RPC methods for metrics metadata.

 Key concepts:
* vmstorage persists metadata in-memory only.
* vmstorage evicts metadata records older than 1 hour.
* vmstorage stores only the last value of metadata for time series
  metric name.
* vminsert opens an additional TCP connection to the vmstorage for
  metadata write requests.
* vmselect doesn't support `limit_per_metric_name`.

This feature is available optional and must be enabled via flag - `-enableMetadata` provided to vminsert/vmsingle.

Fixes github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
2025-11-14 10:24:38 +01:00
Aliaksandr Valialkin
847cd1e336 docs/guides/understand-your-setup-size/README.md: remove the misleading recommendation for having at least 2vCPU cores per each vmstorage node
vmstorage nodes work perfectly with one CPU core and even with 10% of a single CPU core
if the allocated CPU resources matches their workload.

It is better to recommend allocating the an interger number of CPU cores to vmstorage
in order to achieve an optimal performance, since vmstorage allocates internal resources
according to the available CPU cores. If there is a fractional number of CPU cores,
then the allocation of internal resources may be not so optimal.

Fractional number of CPU cores may also lead to increased latencies and stalls
because some P threads at Go runtime won't be able to run goroutines from their ready queues
in a timely manner becasue of the lack of CPU time. See https://victoriametrics.com/blog/kubernetes-cpu-go-gomaxprocs/
2025-11-14 09:48:30 +01:00
Aliaksandr Valialkin
c86857b269 docs/victoriametrics/vmagent.md: mention that it isn't recommended increasing the -maxConcurrentRequests command-line flag value in general case
Too big values for the -maxConcurrentRequests command-line flag increase memory usage
and increase CPU overhead for processing incoming requests in most cases.
The only valid reason for increasing the value for -maxConcurrentRequests command-line flag
is when many clients send data to vmagent over very slow network.
2025-11-14 09:40:31 +01:00
Hui Wang
c93937101c Improve vmalert UI tip (#9998) 2025-11-13 21:04:39 +01:00
Aliaksandr Valialkin
cca7380dd3 docs/victoriametrics: fix broken link to /api/v1/rules docs at Prometheus 2025-11-13 19:40:10 +01:00
Aliaksandr Valialkin
ca3b9b18b5 docs/victoriametrics/README.md: add context links to the FAQ entry describing why IndexDB size may be too large 2025-11-13 19:36:18 +01:00
Nikolay
10f7cd2ffc lib/encoding/zstd: properly apply size limits
Previously, zstd Decoder didn't take in account Request Size limits
applied by VictoriaMetrics components.  And in case of incorrectly formed zstd block, VictoriaMetrics
component may allocate extra memory. Which may lead to the OOM errors.

This commit makes ingest endpoints check frame content size and window size headers based on MaxRequest Limits.
2025-11-13 18:13:23 +01:00
Hui Wang
fa85726a82 vmalert: print the error message as value if templating fails in alerting rule
For users, if an alerting rule has a misconfigured annotation, it's more
important to deliver the alert when the rule triggers rather than skip
it with templating error logs.
Then users can see the faulty annotation in alert message and fix it.

Note: the previous behavior is retained in replay mode because errors
there should be noticed immediately; hiding them could waste time,
resources and require a re-replay after fixes.
Also the rule's status in the vmalert UI remains unhealthy if templating
failed.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9853
2025-11-13 17:34:55 +01:00
Hui Wang
567c084d6d vmalert: drop labels with empty values in generated alerts and time series
In prometheus ecosystem, a label with an empty value equals no label,
since a query like `test{something=""}` matches all the series without
label `something`.
So for vmalert, preserving empty-value labels in generated alerts or
time series is unnecessary and can cause alert hash mismatches during
[restore](https://docs.victoriametrics.com/victoriametrics/vmalert/#alerts-state-on-restarts).
The empty-value label shouldn't come from datasource response since they
follow the same rule(omit empty-value labels), it may come from
`-external.label` or rule labels, but the empty value could be caused by
occasionally templating failures, which is hard to check there.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
2025-11-13 17:24:27 +01:00
Hui Wang
12a1388fbc vmalert: fix a potential race condition in web api during rule hot reload
Group rules are not protected by
[m.groupsMu](03c784e3e3/app/vmalert/manager.go (L25)),
they could be updated(with config hot reload) during `/api/v1/rule`,
`/api/v1/alert` and `/api/v1/alerts` API calls. This fix takes a
snapshot by calling `group.ToAPI()` first, making all reads safe.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9551
2025-11-13 17:22:25 +01:00
JAYICE
62c19b386a lib/httputl: fix failing to access http2 sd service by the shadow copy of http.DefaultTransport
Clone `http.DefaultTransport` and disable HTTP2 without resetting
`TLSClientConfig.NextProtos` in the shadow copy of
`http.DefaultTransport` will cause the request to HTTP/2 server to fail.
See https://github.com/golang/go/issues/39302.

To reproduce it, use a scrape config like:
```
scrape_configs:
  - job_name: test
    yandexcloud_sd_configs:
      - service: compute
        api_endpoint: https://api.cloud.yandex.net
```
Before the fix, access to the SD service would fail.

A solution is to specify `http/1.1` in  `TLSClientConfig.NextProtos`.

Related golang issue: https://github.com/golang/go/issues/39302

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9981
2025-11-13 17:19:15 +01:00
Andrii Chubatiuk
a84586a246 docs: update grafana plugin links, move root file to plugins repo (#10001)
### Describe Your Changes

update victorialogs grafana plugin links, moved root plugin file to
plugin repo

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-13 15:00:14 +02:00
Zhu Jiekun
24867a042b docs: mention VictoriaTraces playground in doc (#9999)
### Describe Your Changes

Add VictoriaTraces playground in doc.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-13 12:50:54 +02:00
Andrii Chubatiuk
9baade2898 docs: added perses section, move grafana datasource to integrations (#9994)
### Describe Your Changes

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9888

additionally adds grafana datasource into integrations section and
excludes previous location from menu and search

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-13 12:49:16 +02:00
Zhu Jiekun
6117b2ead9 VMUI/relabel debug: Allow labels textarea input without curly braces (#9950)
### Describe Your Changes

Fixed #9900

relax the validation for the labels text area. It now accepts input
labels without being enclosed in curly braces.

The following input format should be supported now:


```
	metric_name
	metric_name{label1="value1"}
	{__name__="metric_name", label1="value1"}
	__name__="metric_name", label1="value1"
	label1="value1"
```

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-11-13 12:41:38 +02:00
Aliaksandr Valialkin
0df2993cf4 vendor: update github.com/valyala/gozstd from v1.23.2 to v1.24.0
This is needed for being able to use DecompressLimited() function for limiting
the size of descropressed data.

See https://github.com/valyala/gozstd/pull/75
Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/issues/958
2025-11-12 21:10:51 +01:00
Aliaksandr Valialkin
14f1bda8fc docs/victoriametrics/vmalert.md: remove "👉" char from the Common mistakes chapter, since this looks like AI-generated content
While at it, fix a typo `&step` -> `step`.

This is a follow-up for the commit 40ab285fb9

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9343
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9373
2025-11-12 18:51:21 +01:00
Max Kotliar
f96f4709f6 docs/changelog: move changelog line to tip. polish it a bit 2025-11-12 19:40:18 +02:00
Samarth Bagga
96c1392b45 Add log which will report dropped log count (#9752)
### Describe Your Changes

I have added a counter for the throttled logs which gets logged every 1
minute.
Fixes #9498

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Hui Wang <haley@victoriametrics.com>
2025-11-12 19:32:33 +02:00
Max Kotliar
8dd905c7a9 lib/envflag: apply -secret.flags inside envflag.Parse function (2nd attempt) (#9963)
### Describe Your Changes

The PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9942 was
reverted in
c90c7c3123
because of the import cycle in the enterprise VM. Needs more work.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-12 19:23:51 +02:00
Hui Wang
1c7abd3137 docs: fix flag type in descriptions (#9979)
Do not use backticks in command-line flag description, it pollutes the
flag type in descriptions.
2025-11-12 13:51:48 +02:00
Aliaksandr Valialkin
e8114806aa docs/victoriametrics: add a case study from Spotify based on the https://www.youtube.com/watch?v=87koDlpKDR4 2025-11-12 10:49:36 +01:00
Aliaksandr Valialkin
8f1c1cc7c9 docs/victoriametrics/FAQ.md: mention that disabling per-day index may reduce the growth rate of indexdb for static time series over time 2025-11-11 16:58:10 +01:00
Aliaksandr Valialkin
68f670cbc5 docs/victoriametrics/FAQ.md: add Why IndexDB is so large? chapter, since this is quite frequent question from VictoriaMetrics users 2025-11-11 16:48:03 +01:00
Aliaksandr Valialkin
dac7e8d554 docs/victoriametrics/FAQ.md: add trailing slashes to links to posts about VcitoriaMetrics components
Trailing slashes are needed to make the URLs canonical and avoid redirects.

This is a follow-up for d4aefcecc4
2025-11-11 15:59:20 +01:00
Aliaksandr Valialkin
3db6c40b70 deployment: update Go builder from v1.25.3 to v1.25.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.25.4%20label%3ACherryPickApproved
2025-11-11 12:15:12 +01:00
Aliaksandr Valialkin
90c69a07a9 docs/victoriametrics/stream-aggregation: fix broken links to https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config ( was https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-config )
This is a follow-up after the commit f385e36b96
2025-11-11 02:01:04 +01:00
Aliaksandr Valialkin
f5b1092e07 lib/workingsetcache: prevent from duplicate misleading log messages when reading the cache from file
While at it, improve logging when reading workingsetcache from file and saving it to file.
This should simplify troubleshooting various issues related to the workingsetcache.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9750
2025-11-11 00:02:09 +01:00
Aliaksandr Valialkin
3e6fc445a9 lib/workingsetcache: properly update cache stats
This is a follow-up for the commit 1130adebad .

The EntriesCount, BytesSize and MaxBytesSize metrics must take into account the data
stored in both prev and curr caches, since this data occupies memory and it is expected
that the exposed metrics - vm_cache_entries, vm_cache_size_bytes and vm_cache_size_max_bytes -
take into account all the memory occupied by the corresponding caches.

The GetCalls, SetCalls, Collisions and Corruptions metrics must take into account stats
from the curr cache only, since the corresponding stats for the prev cache is already taken
during the rotation (when moving curr to prev and resetting the previous prev).

The Misses metric must take into account only misses in the prev cache, since these misses
mean that the given entry is missing the both the curr and the prev cache.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9553
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9715
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9657

While at it, make sure that the cache mode and cache stats is always read and updated under c.mu lock.
This may help resolving races similar to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9921
2025-11-10 22:15:50 +01:00
Aliaksandr Valialkin
1130adebad Revert "lib/workingsetcache: properly count workingsetcache metrics "
This reverts commit 89fd27c922.

Reason for revert: this commit adds scalability bottleneck in the fast path - Cache.Get() -
in the form of c.getCalls.Add(). This call doesn't scale on systems with big number of CPU cores,
since it needs to update atomically a shared memory from big number of CPU cores.

The Cache.Get() is called per every ingested sample when obtaining TSID by MetricName from the cache
at lib/storage.Storage.get(), so this can be a major bottleneck on systems with many CPU cores.

The solution for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9553
is to properly track cache requests and misses: cache requests must be taken into account
only at the curr cache, while cache misses must be taken into account only at the prev cache.
This will be implemented in the follow-up commit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9657
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9715
2025-11-10 21:43:33 +01:00
Aliaksandr Valialkin
1fb3f105c3 Revert "lib/storage: Introduce vm_cache_eviction_bytes_total metric"
This reverts commit 994dadb4d5.

Reason for revert: the introduced metrics have zero practical applicability.

The lib/workingsetcache doesn't need manual tuning in most cases - its' size
is automatically adjusted to the given working set, if the working set is smaller
than the cache size limit set at the cache creation time. The limit just prevents
unbounded cache growth for large working sets.

If the working set exceeds the given limit, then the cache may become inefficient
because of the increased cache miss rate. The introduced metrics do not help determining
the needed cache size.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9293
2025-11-10 16:49:51 +01:00
Aliaksandr Valialkin
38d3033e66 go.mod: update github.com/VictoriaMetrics/fastcache from v1.13.1 to v1.13.2
This is needed for removing the EvictedBytes metric from the fastcache.

See the description of f6080737bb for details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9293
Updates https://github.com/VictoriaMetrics/fastcache/pull/93
2025-11-10 16:43:33 +01:00
Aliaksandr Valialkin
cfba80ed4d lib/workingsetcache: replace Cache.Save() with Cache.MustSave()
If the cache cannot be saved to the given file, this is a fatal error.
It is better to log this fatal error inside Cache.MustSave() and then exit
instead of returning it to the caller. This makes the code more clear at the caller side.
2025-11-10 16:00:59 +01:00
Aliaksandr Valialkin
ee0eff0ca2 lib/workingsetcache: improve log messages for various expected cases when reading the cache from files
The improved log messages must help users understanding the logged cases
without asking VictoriaMetrics developers on these cases.
2025-11-10 14:54:22 +01:00
Aliaksandr Valialkin
181f95aaf6 deployment/docker/rules/alerts-health.yml: clarify the description of the TooManyTSIDMisses alert after the commit 30641b201b
It is expected that the number of TSIDs misses over the last 5 minutes is zero in steady state.
If it is non-zero, then something wrong happens. That's why it is better to use increase() instead of rate() function
for this alert.
2025-11-10 14:35:36 +01:00
Aliaksandr Valialkin
58485df033 lib/storage: consistently rename searchMetricNameWithCache() to SearchTSIDs() across comments after the commit 90d23d7c9f
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9765
2025-11-10 14:30:55 +01:00
Aliaksandr Valialkin
30641b201b deployment/docker/rules/alerts-health.yml: clarify the description for the TooManyTSIDMisses alert
This alert is expected after unclean shutdown (OOM, power off, kill -9) of VictoriaMetrics.
It should go away in a few minutes after the restart while VictoriaMetrics deletes metricIDs
for the missing MetricID->TSID entries which were created for the newly registered time series
just before unclean shutdown. It is OK to delete such metricIDs, since the corresponding time series
will be re-registered again. See the commit 20812008a7 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502
2025-11-10 14:25:22 +01:00
Aliaksandr Valialkin
b2cd3bf1f2 lib/workingsetcache: properly initialize new cache when the stored cache has unexpected size
This is a follow-up for 9bc541587b
2025-11-10 12:49:04 +01:00
Artem Fetishev
5336091785 lib/storage: Fix data race in containsTimeRange() (#9965)
When one goroutine attemps to update the min timestamp under the lock it
could have been updated already by another goroutine with a smaller
timestamp. As a result the goroutine will update the timestamp with a
bigger value.

A simple unit test (included in this commit) demonstrates that.

Additionally, use a simple Mutex instead of RWMutex. RWMutexes only
introduce an unnecessary overhead for operations as simple as retrieving
a value from a map and regular Mutex should be preferred.

Thanks to @valyala for spotting a bug and the advice on RWMutexes.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-07 15:20:29 +01:00
Yury Molodov
5b11f6f384 app/vmui: improve chart performance and fix median calculation
This commit improves overall performance and stability of chart rendering,
refines time series generation, and fixes incorrect median calculation
in metric series.
JavaScript execution time improved by up to ×6 on large datasets.

**Changes:**

* Reworked `getTimeSeries` - one point per pixel.
* Added legend auto-collapse when >20 items.
* Switched median algorithm to Quickselect (Floyd–Rivest).
* Unified array stats functions (`min`, `max`, `avg`, `median`) into a
single pass.
* Removed unused `last` value from series.
* Renamed `roundToMilliseconds` to `roundToThousandths` and moved to
`utils/math`.
* Replaced `isSupportedDuration` with `parseSupportedDuration`, added
fractional duration support.

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9699
Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9926
2025-11-07 16:19:56 +03:00
Zhu Jiekun
e8975e560d lib/promscrape: prevent early exit when one of multiple service discovery configs fails
When multiple service discovery configs of the same type exist (e.g.,
`hetzner_sd_config`), vmagent currently behaves as follows:
1. Attempts to request each config.
2. Exits immediately if any config returns an error.
3. Skips the rest configs and falls back to the previous service
discovery result.

The correct behavior—more compatible with Prometheus—should be:
1. Attempt to request each config.
2. Collect all valid results.
3. Use the valid results if there's at least one. otherwise (all
failed), fall back to the previous SD result.

Scrape example:

```yaml
scrape_configs:
  - job_name: hetzner-default
    hetzner_sd_configs:
      - role: "hcloud"
        authorization:
          credentials: "some_valid_value"
      - role: "hcloud"
        authorization:
          credentials: "some_wrong_value"
```

Expected outcome: 
- At least targets from `credentials: "some_valid_value"` should appear
in the service discovery result.

current outcome:
- the error from `credentials: "some_wrong_value"` leads to an **empty**
result.

This issue should affect service discovery which using
`getScrapeWorkGeneric` function:

- `azure_sd_config`
- `consul_sd_config`
- `consulagent_sd_config`
- `digitalocean_sd_config`
- `dns_sd_config`
- `docker_sd_config`
- `dockerswarm_sd_config`
- `ec2_sd_config`
- `eureka_sd_config`
- `gce_sd_config`
- `hetzner_sd_config`
- `http_sd_config`
- `kuma_sd_config`
- `marathon_sd_config`
- `nomad_sd_config`
- `openstack_sd_config`
- `ovhcloud_sd_config`
- `puppetdb_sd_config`
- `vultr_sd_config`
- `yandexcloud_sd_config`

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9375
2025-11-07 16:18:11 +03:00
Yury Molodov
3b5822398c app/vmui: fix points display; add option to show all points
* Fix rendering of isolated points at gaps.
* Add toggle to always show all points (even when connected by a line).

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9666
2025-11-07 16:09:28 +03:00
Yury Molodov
742a3384dc app/vmui: fix incorrect median value calculation in series (#9926)
Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2025-11-07 13:42:22 +02:00
Clément Nussbaumer
8a1beef46d lib/promscrape/kubernetes: add namespace metadata discovery
permits attaching namespace metadata to pods, services, ingresses,
endpoints and endpointslices for kubernetes service-discovery.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7486
2025-11-07 11:49:08 +03:00
Fred Navruzov
22158b7272 docs/vmanomaly: patch release v1.27.1 (#9964)
### Describe Your Changes

Patch release doc updates (v1.27.1)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-06 14:17:18 +02:00
Aliaksandr Valialkin
4a18f4e49d docs/victoriametrics/Articles.md: add https://www.tigrisdata.com/blog/billing-prometheus/ 2025-11-06 11:44:16 +01:00
Aliaksandr Valialkin
85cbbfe0bc lib/logstorage: verify that nobody holds references to parts when closing the partition
This is needed in order to detect and prevent cases of improper usage of partitions
while they are closed.

This is a follow-up for the commit 9725ee50ec .
2025-11-06 11:37:44 +01:00
Max Kotliar
d5705a9647 docs/guides: use canonical link 2025-11-05 21:09:19 +02:00
Max Kotliar
c90c7c3123 Revert "lib/envflag: apply -secret.flags inside envflag.Parse function (#9942)"
This reverts commit 1b11031ec8.

There is an import cycle because of the change in enterprise version of VM
2025-11-05 21:01:48 +02:00
Max Kotliar
1b11031ec8 lib/envflag: apply -secret.flags inside envflag.Parse function (#9942)
### Describe Your Changes

Follow up on PR:
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9839, which
addresses review comment

https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9839#discussion_r2477729886

Alex: 
```
this design decision isn't good, since it will lead to potential security issues over time when we'll forget adding ApplySecretFlags() call after the flag.Parse() call or add it at the wrong place. BTW, we do not call flag.Parse() explicitly - instead envflag.Parse() is called. So it is natural to call ApplySecretFlags() inside this call. Are there restrictions which prevent from doing this? If there are no restrictions, then there is no need in making this function public - it will be called explicitly inside envflag.Parse().
```

There is no changelog entry as there is no change in user-visible
behavior.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-05 20:53:43 +02:00
Max Kotliar
e7b7015eb1 docs: clarify why we advise 50% free RAM. Add link to discussion (#9943)
### Describe Your Changes

Based on answer
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9895#issuecomment-3442491150

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-05 20:51:17 +02:00
Max Kotliar
6ee1edeb4d docs: fix links in VictoriaMetrics topologies guide 2025-11-05 20:45:32 +02:00
Aliaksandr Valialkin
9725ee50ec lib/mergeset: verify that Table parts are no longer used at Table.MustClose()
This should catch possible errors related to improper release of Table parts.
Fix such an error at TestTableCreateSnapshotAt by properly closing all the initialized
TableSearch instances.

Thanks to @rtm0 for pointing to this issue.
2025-11-05 13:25:16 +01:00
f41gh7
780cb1bf05 docs: mention latest v1.129.1 release 2025-11-04 18:07:57 +03:00
f41gh7
a34d0d6056 docs: mention latest LTS releases
v1.110.23 and v1.122.8
2025-11-04 18:07:57 +03:00
f41gh7
5e98e0cff5 CHANGELOG.md: cut v1.129.1 release 2025-11-04 13:15:37 +03:00
Nikolay
51b44afd34 lib: properly apply snappy Decode limits
Previously, snappy Decoder didn't take in account Request Size limits
applied by VictoriaMetrics components.  And in case of incorrectly formed snappy block, VictoriaMetrics
 component may allocate extra memory. Which may lead to the OOM errors.

This commit makes ingest endpoints check block size header based on MaxRequest Limits.
2025-11-04 13:04:27 +03:00
Fred Navruzov
5a8d7984ca docs/vmanomaly: release v1.27.0 (#9954)
### Describe Your Changes

Docs update to follow vmanomaly's release v1.27.0, including:
- UI page update (changelogs, auth, new screenshots)
- Migration guide addition
- Cross-references of the above and version bumps

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-04 10:22:20 +04:00
Zakhar Bessarab
57752ca2c0 docs: update VM version to v1.129.0
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-11-03 16:55:32 +04:00
Zakhar Bessarab
171cdf0614 deployment/docker: update VM version to v1.129.0
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-11-03 16:53:21 +04:00
Artem Fetishev
7d19ec2e4d lib/storage: extract storage file names into constants (#9944)
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-31 20:32:16 +01:00
Zakhar Bessarab
a9b5033d50 docs/changelog: backport LTS changelog
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 22:41:17 +04:00
Zakhar Bessarab
a783b2048f docs/changelog: cut v1.129.0
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 20:04:01 +04:00
Zakhar Bessarab
fd48c72a83 docs: update version tooltips
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 19:59:27 +04:00
Zakhar Bessarab
05e52fa05a app/vmselect: run make vmui-update
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 19:47:16 +04:00
Zakhar Bessarab
b49e8d032f make: add vmutils target for linux/s390x for CI builds
Currently, CI builds are ignoring linux/s390x due to missing target. See an example: https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/actions/runs/18971198439/job/54179142369

Follow-up for 255a0cf6, 73b10d76

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 15:45:33 +04:00
Zakhar Bessarab
73b10d7621 make: include s390x binaries into release artifacts (#9941)
Previously, it was possible to build binaries with make targets but
those builds were not included in the release artifact. Update release
targets to include s390x artifacts in release artifacts.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9697

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 15:16:07 +04:00
hagen1778
18268c3d13 docs: re-qualify load-balancing optimization to feature
Motivation: the change updates load-balancing logic, enhancing it rather than fixing
a critical bug. Such enhancement should not be ported to LTS versions.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9712

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-31 09:52:34 +01:00
hagen1778
bfb49c55af docs: add changelog for vmalert UI fix
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9892
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9909
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-31 09:34:29 +01:00
Kirill Yurkov
bd7fed9b41 docs: address review comments from PR #9919 (#9940)
- Standardize all sections to use 'Recommended for:' instead of mixed
'For whom:' and 'Target audience:'
- Fix wording: 'Query evaluation is always local' 

Addresses comments in #9919
2025-10-31 09:21:52 +01:00
Roman Khavronenko
a85c5830c1 app/vmalert: limit delayBeforeStart up to 5min (#9930)
vmalert tries to spread the moment group starts its evaluation
on `[0..group.interval]` duration. This approach allows to avoid
thundering herd problem when on vmalert start all groups execute their
rules simultaneously. It was introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/724

While for most configs it works great, for groups with big evaluation
intervals (30min, 60min) the first evaluation can be delayed
significantly.
This change introduces a start delay limit via new flag
`--group.maxStartDelay` (5m default).
It limits the `[0..group.interval]` start delay to
`[0..math.min(--group.maxStartDelay, group.interval)]`.
So all groups will start in first 5m or earlier.

The --group.maxStartDelay is ignored if user set `eval_offset`.

The 5m default limitation was picked high to not affect users with
relatively low evaluation intervals.

-----------

Based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9929

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-31 09:20:46 +01:00
Andrii Chubatiuk
009ddb9ce1 vmui: wrap annotations in alerting (#9909)
similar to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9892
but for alerting tab in vmui
2025-10-31 09:13:32 +01:00
Zakhar Bessarab
91bce8d3b4 app/vmbackupmanager: enforce newline at the end of CLI result (#956)
* app/vmbackupmanager: enforce newline at the end of CLI result

Previously, vmbackupmanager only printed a response from API which did not include a newline character. That leads to issues with the rendering of the next command when using a shell.

Always append a newline character to avoid breaking shell formatting when using CLI mode.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update docs/victoriametrics/changelog/CHANGELOG.md

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-10-31 11:52:53 +04:00
Andrii Chubatiuk
a7d69cc51e app/vmbackupmanager: create vm_backup_last_created_at metric for latest backup (#954) 2025-10-31 11:52:47 +04:00
Aliaksandr Valialkin
2e6f42bff8 lib/promauth/config.go: typo fix: It -> If
The typo is spotted in https://github.com/VictoriaMetrics/VictoriaLogs/pull/765/files#r2434359693
2025-10-30 18:20:00 +01:00
Aliaksandr Valialkin
7ae1fd9614 docs/victoriametrics/Articles.md: add a link to https://medium.com/@vijayrauniyar1818/how-we-eliminated-10k-year-in-aws-cross-zone-data-transfer-costs-with-zone-aware-kubernetes-09fff0c2435b 2025-10-30 17:24:17 +01:00
Aliaksandr Valialkin
51d1c16230 app/vmauth: make load distribution more even among backends which execute queries with varying durations
The load distribution could be uneven when short queries arrive to vmauth while a part of backends are busy
with long-running queries. In this case the major load goes to the backend after a row of busy backend.

Suppose we have four backends - b1, b2, b3 and b4. The first two backends are busy with bigger number
of long-running queries than b3 and b4. Then 75% of short queries will go to b3, while only 25%
of short queries will go to b4.

The new algorithm makes the distribution more even in these cases by storing the next backend
after the chosen backend as candidate for the next query (its' index is stored in the atomicCounter).
Avoid races when updating atomicCounter from concurrently executed queries by using CompareAndSwap() -
if the concurrent query updated it first then the current query won't overwrite it with the outdated value.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9712
2025-10-30 14:37:48 +01:00
Max Kotliar
874f8b31a3 docs/changelog: fix pr link in v1.122.5 tag 2025-10-30 15:07:39 +02:00
Artem Fetishev
d3ac2473c0 lib/storage: Add data ingestion benchmarks for various data patterns
Data patterns considered:

- Same series, same date
- Same series, different dates
- Different series, same date
- Different series, different dates

To make sure that the pattern condition holds, a new storage instance is
started every benchmark iteration.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9912
2025-10-30 14:39:28 +03:00
Zakhar Bessarab
3f45690342 app/vmctl/remote-read: allow providing multiple label filters
Previously, vmctl only accepted one label for filtering. Extend this to
allow providing multiple-filters at once. This is useful when migrating
large volumes of data as it allows narrowing down migration scope of
migration for one run so that the source side is not overwhelmed with
migration.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9917/
2025-10-30 14:38:30 +03:00
Roman Khavronenko
3abd442742 app/vmalert: properly show last evaluation with 0 value
Before, rules that didn't get evaluated yet were showing weird values in
vmalert's UI. It was happening because of
`time.Since(r.LastEvaluation).Seconds()` expression when
`r.LastEvaluation` had 0 value.

With this change, rules that weren't evaluated yet would show `Never` in
Updated column instead.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9924
2025-10-30 14:37:34 +03:00
Zhu Jiekun
5cec04d8e2 lib/httpserver: revert HTTP/2 support
This commit request revert the commit
d6bbfaf164 for the following reasons:

1. HTTP/2 carries security risks.
2. Most components in the VictoriaMetrics stack do not require HTTP/2
support.
3. While HTTP/2 support was available only as an option in previous
commit, there remains a potential risk of misusing this option and
enabling HTTP/2 inadvertently.

For components (e.g., VictoriaTraces) that require HTTP/2 support, they
should currently build an HTTP server manually with built-in packages,
instead of using `lib/httpserver` in VictoriaMetrics. If the mentioned
issue is resolved in the future and more components need HTTP/2, this
support can be reintroduced into `lib/httpserver`.
 
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9927
2025-10-30 14:36:48 +03:00
Aliaksandr Valialkin
60f777620f lib/fs/fsutil: set the default value for -fs.maxConcurrency depending on the number of available CPU cores
This should reduce the need to tune this flag on systems with different number of CPU cores.
16 concurrent file operations per CPU should give quite low Go scheduling latency (~10ms)
according to https://github.com/VictoriaMetrics/VictoriaLogs/issues/774#issuecomment-3456814064

This is a follow-up for the commit 8a9a40dbdd
2025-10-30 11:57:10 +01:00
Aliaksandr Valialkin
8a9a40dbdd lib/fs/fsutil: add -fs.maxConcurrency command-line flag for tuning concurrent operations with files
This flag can help tuning Go scheduling latency on systems with small number of CPU cores
vs data ingestion performance on systems with high-latency storage such as NFS or Ceph.

Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/774
Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/517
2025-10-30 11:46:27 +01:00
hagen1778
fde4b4013a docs: rm extra line
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-30 11:02:16 +01:00
hagen1778
bf69b0d686 docs: order recent changes by components
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-30 11:01:43 +01:00
hagen1778
a866474918 dashboards: run make dashboards-sync
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-30 10:58:58 +01:00
hagen1778
22f6cb6339 docs: mention PR author for dashboard change
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-30 10:56:08 +01:00
Samarth Bagga
0d7b7649bf dashboards: enable search for non default flags panel (#9928)
### Describe Your Changes

Added search for non default flags by editing the grafana configs.
Resolves #9910

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2025-10-30 10:53:36 +01:00
nemobis
74611ce6f2 docs: Update RELEX figures (#9931)
### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2025-10-30 10:20:27 +01:00
Roman Khavronenko
a5dd0324a9 app/vmalert: simplify delayBeforeStart func (#9929)
It is a cosmetic change: it simplifies function signature by making it a
method of the Group struct.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-30 10:19:05 +01:00
hagen1778
45c0d40127 docs: fix typo after 3e0aa46
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-30 10:18:28 +01:00
Andrii Chubatiuk
fc978c95af app/vmalert: use search expression to match group and file names (#9920)
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9886

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2025-10-30 10:01:25 +01:00
Kirill Yurkov
8e99efe0fa docs: add VictoriaMetrics architectures guide from startups to hypers… (#9919)
The new guide section about architecture from scratch to hyperscale!
2025-10-30 09:55:48 +01:00
hagen1778
3e0aa46fdb docs: reorder template functions alphabetically
follow-up after ea41fea453

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-30 09:54:40 +01:00
Anh-Dung Nguyen
ea41fea453 app/vmalert: add now template (#9913)
### Describe Your Changes

Related to #9864, add "now" as template in vmalert rules templating and
update the docs. I haven't been able to test the docs change as I can't
run make docs-debug locally so if anyone know how to do it locally,
please let me know!

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Hui Wang <haley@victoriametrics.com>
Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
2025-10-30 09:52:48 +01:00
hagen1778
0165108a8f docs: move change line to the right place
Follow-up after 2652a7c762

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-30 09:52:17 +01:00
Hui Wang
2652a7c762 vmalert: support alert_relabel_configs per each notifier in -notifier.config file (#9736)
fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5980
2025-10-30 09:47:40 +01:00
Stephan Burns
82aacc5b75 vmagent/docs: grammar changes (#9863)
### Describe Your Changes

Lots of small changes to grammar to make the docs flow nicer.

I'm sorry that this ended up being such a large PR. I will split these
up in the future.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Phuong Le <39565248+func25@users.noreply.github.com>
2025-10-28 16:56:03 +02:00
dependabot[bot]
0544bb12e0 build(deps): bump vite from 7.1.5 to 7.1.11 in /app/vmui/packages/vmui (#9885)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite)
from 7.1.5 to 7.1.11.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/vitejs/vite/releases">vite's
releases</a>.</em></p>
<blockquote>
<h2>v7.1.11</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.11/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.10</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.10/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.9</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.9/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.8</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.8/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.7</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.7/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.6</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.6/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md">vite's
changelog</a>.</em></p>
<blockquote>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.10...v7.1.11">7.1.11</a>
(2025-10-20)<!-- raw HTML omitted --></h2>
<h3>Bug Fixes</h3>
<ul>
<li><strong>dev:</strong> trim trailing slash before
<code>server.fs.deny</code> check (<a
href="https://redirect.github.com/vitejs/vite/issues/20968">#20968</a>)
(<a
href="f479cc57c4">f479cc5</a>)</li>
</ul>
<h3>Miscellaneous Chores</h3>
<ul>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20966">#20966</a>)
(<a
href="6fb41a260b">6fb41a2</a>)</li>
</ul>
<h3>Code Refactoring</h3>
<ul>
<li>use subpath imports for types module reference (<a
href="https://redirect.github.com/vitejs/vite/issues/20921">#20921</a>)
(<a
href="d0094af639">d0094af</a>)</li>
</ul>
<h3>Build System</h3>
<ul>
<li>remove cjs reference in files field (<a
href="https://redirect.github.com/vitejs/vite/issues/20945">#20945</a>)
(<a
href="ef411cee26">ef411ce</a>)</li>
<li>remove hash from built filenames (<a
href="https://redirect.github.com/vitejs/vite/issues/20946">#20946</a>)
(<a
href="a81730754d">a817307</a>)</li>
</ul>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.9...v7.1.10">7.1.10</a>
(2025-10-14)<!-- raw HTML omitted --></h2>
<h3>Bug Fixes</h3>
<ul>
<li><strong>css:</strong> avoid duplicate style for server rendered
stylesheet link and client inline style during dev (<a
href="https://redirect.github.com/vitejs/vite/issues/20767">#20767</a>)
(<a
href="3a92bc79b3">3a92bc7</a>)</li>
<li><strong>css:</strong> respect emitAssets when cssCodeSplit=false (<a
href="https://redirect.github.com/vitejs/vite/issues/20883">#20883</a>)
(<a
href="d3e7eeefa9">d3e7eee</a>)</li>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="879de86935">879de86</a>)</li>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20894">#20894</a>)
(<a
href="3213f90ff0">3213f90</a>)</li>
<li><strong>dev:</strong> allow aliases starting with <code>//</code>
(<a
href="https://redirect.github.com/vitejs/vite/issues/20760">#20760</a>)
(<a
href="b95fa2aa75">b95fa2a</a>)</li>
<li><strong>dev:</strong> remove timestamp query consistently (<a
href="https://redirect.github.com/vitejs/vite/issues/20887">#20887</a>)
(<a
href="6537d15591">6537d15</a>)</li>
<li><strong>esbuild:</strong> inject esbuild helpers correctly for
esbuild 0.25.9+ (<a
href="https://redirect.github.com/vitejs/vite/issues/20906">#20906</a>)
(<a
href="446eb38632">446eb38</a>)</li>
<li>normalize path before calling <code>fileToBuiltUrl</code> (<a
href="https://redirect.github.com/vitejs/vite/issues/20898">#20898</a>)
(<a
href="73b6d243e0">73b6d24</a>)</li>
<li>preserve original sourcemap file field when combining sourcemaps (<a
href="https://redirect.github.com/vitejs/vite/issues/20926">#20926</a>)
(<a
href="c714776aa1">c714776</a>)</li>
</ul>
<h3>Documentation</h3>
<ul>
<li>correct <code>WebSocket</code> spelling (<a
href="https://redirect.github.com/vitejs/vite/issues/20890">#20890</a>)
(<a
href="29e98dc3ef">29e98dc</a>)</li>
</ul>
<h3>Miscellaneous Chores</h3>
<ul>
<li><strong>deps:</strong> update rolldown-related dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20923">#20923</a>)
(<a
href="a5e3b064fa">a5e3b06</a>)</li>
</ul>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.8...v7.1.9">7.1.9</a>
(2025-10-03)<!-- raw HTML omitted --></h2>
<h3>Reverts</h3>
<ul>
<li><strong>server:</strong> drain stdin when not interactive (<a
href="https://redirect.github.com/vitejs/vite/issues/20885">#20885</a>)
(<a
href="12d72b0538">12d72b0</a>)</li>
</ul>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.7...v7.1.8">7.1.8</a>
(2025-10-02)<!-- raw HTML omitted --></h2>
<h3>Bug Fixes</h3>
<ul>
<li><strong>css:</strong> improve url escape characters handling (<a
href="https://redirect.github.com/vitejs/vite/issues/20847">#20847</a>)
(<a
href="24a61a3f54">24a61a3</a>)</li>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20855">#20855</a>)
(<a
href="788a183afc">788a183</a>)</li>
<li><strong>deps:</strong> update artichokie to 0.4.2 (<a
href="https://redirect.github.com/vitejs/vite/issues/20864">#20864</a>)
(<a
href="e670799e12">e670799</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8b69c9e32c"><code>8b69c9e</code></a>
release: v7.1.11</li>
<li><a
href="f479cc57c4"><code>f479cc5</code></a>
fix(dev): trim trailing slash before <code>server.fs.deny</code> check
(<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20968">#20968</a>)</li>
<li><a
href="6fb41a260b"><code>6fb41a2</code></a>
chore(deps): update all non-major dependencies (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20966">#20966</a>)</li>
<li><a
href="a81730754d"><code>a817307</code></a>
build: remove hash from built filenames (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20946">#20946</a>)</li>
<li><a
href="ef411cee26"><code>ef411ce</code></a>
build: remove cjs reference in files field (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20945">#20945</a>)</li>
<li><a
href="d0094af639"><code>d0094af</code></a>
refactor: use subpath imports for types module reference (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20921">#20921</a>)</li>
<li><a
href="ed4a0dc913"><code>ed4a0dc</code></a>
release: v7.1.10</li>
<li><a
href="c714776aa1"><code>c714776</code></a>
fix: preserve original sourcemap file field when combining sourcemaps
(<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20926">#20926</a>)</li>
<li><a
href="446eb38632"><code>446eb38</code></a>
fix(esbuild): inject esbuild helpers correctly for esbuild 0.25.9+ (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20906">#20906</a>)</li>
<li><a
href="879de86935"><code>879de86</code></a>
fix(deps): update all non-major dependencies</li>
<li>Additional commits viewable in <a
href="https://github.com/vitejs/vite/commits/v7.1.11/packages/vite">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=vite&package-manager=npm_and_yarn&previous-version=7.1.5&new-version=7.1.11)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-28 16:45:07 +02:00
Max Kotliar
70c293467a app/vmselect: Enable log slow query stats with 5s default
Rationale: Having query stats logging enabled by default can greatly
help in investigating incidents.

Currently, it is disabled by default, so many users don’t enable it, and
when issues occur there are no stats available.

After discussion with the team, a 5s threshold was agreed upon as a
reasonable default to capture meaningful slow query data without
excessive logging.
2025-10-28 16:43:04 +02:00
Max Kotliar
90b4c84ad5 docs/changelog: put misplaced changelog entries to tip 2025-10-28 16:37:15 +02:00
Hui Wang
0a194d067a stream aggregation: change the behavior when both `streamAggr.dropInp… (#9877)
stream aggregation: change the behavior when both `streamAggr.dropInput`
and `streamAggr.keepInput` are set to true

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9724,
making dropInput and keepInput work separately.

<img width="744" height="366" alt="image"
src="https://github.com/user-attachments/assets/7ebb3d1e-872f-4789-8dd1-c4e3f80a84de"
/>

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-10-28 16:19:33 +02:00
Hui Wang
9ffe965063 vmagent: add /remotewrite-relabel-config and `/remotewrite-url-rela… (#9722)
…bel-config` APIs to return `-promscrape.config` and
`-remoteWrite.relabelConfig` flag values

part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9504

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-10-27 13:52:46 +01:00
f41gh7
775ee71fad app/{vminsert,vmstorage}: implement RPC protocol for vmstorage-vminsert communication
This commit adds new RPC protocol for vminsert-vmstorage communication,
it acts in the same way as vmselect-vmstorage RPC.

  It's implemented with new handshake hello methods in a backward
compatible way. Server attempts to parse RPC only if client send new
Hello message, while client fallbacks to the old Hello message if server
closes connection.

This change is need for the new metrics metadata forwarded from vminsert
into vmstorage.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
Changes extracted from PR:
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9487
2025-10-27 09:56:07 +01:00
Artem Fetishev
75b597e727 lib/storage: fix loading nextDayMetricIDs cache from a file for different indexDB generation (#9911)
Previously, if a storage started with curr indexDB different from one
stored in nextDayMetricIDs cache file, the cache would still be loaded
into memory possibly affecting the next day prefill.

This is an unlikely case but it is still possible when:

- A programmer makes a mistake in the code and uses something else
instead of idbCurr.generation.
- Downgrading from pt-index to previous version

Related to #7599 and #8134.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-26 19:43:59 +01:00
Artem Fetishev
f3b0f4292d lib/storage: add a unit test for next day idb prefill (#9906)
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-25 18:16:32 +02:00
Roman Khavronenko
5fa87af6be app/vmalert/datasource: explicitly check response type during replay (#9868)
This change validates that QueryRange() method for prometheus datasource
receives response with `matrix` data type. It would throw an error
otherwise.

The change is needed to avoid confusions like in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9779.

The fix is not elegant, but it should be simple from code support
perspective. So each API has its own parsing function. Even if some
processing code is repeated.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com>
2025-10-24 16:53:01 +02:00
Roman Khavronenko
b9a3369254 app/vmalert: preserve html formatting in annotations (#9892)
The change is purely visual. It preserves html formatting in annotations
when rendering them on rule or alert details page. The css change is
clumsy, but demonstrates the point.

--------

Before:
<img width="1179" height="297" alt="image"
src="https://github.com/user-attachments/assets/c30e2222-7b0f-4f28-bf6e-c546cc5bb2fc"
/>


After:
<img width="1196" height="321" alt="image"
src="https://github.com/user-attachments/assets/2c6d9530-7ae9-47fd-b4ba-87fe6f44c625"
/>


-------

p.s. @AndrewChubatiuk I sure know that you could make this change in
more elegant or stylish way than I did. Please do so, if you want.
Please port this change to vmui too. Thanks!

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-24 16:39:09 +02:00
Max Kotliar
159f990c8e docs/changelog: fix typo in update note 2025-10-24 13:27:49 +03:00
Max Kotliar
b5c3e93f7e docs/changelog: fix typo in LTS link 2025-10-24 13:17:11 +03:00
Zakhar Bessarab
7a6139416e lib/backup/s3remote: properly extend http client if it is present
fb1344b5 replaced an HTTP client unconditionally which overrides
configurations which were loaded by AWS SDK. This leads to AWS env
variables to being overwritten.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9858

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9898
2025-10-24 10:30:59 +02:00
Zhu Jiekun
d6bbfaf164 httpserver: add http2 option
Currently, the `httpserver` disabled HTTP/2 support by design, because:
```
// Disable http/2, since it doesn't give any advantages for VictoriaMetrics services.
```

As VictoriaLogs and VictoriaTraces rely on `httpserver`, in order to
support gRPC over HTTP/2, an option to support HTTP/2 is required.


Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9881
2025-10-24 10:30:27 +02:00
Nikolay
11f488d8ff lib/streamaggr: concurrently push timeseries to aggregators
Previously all timeseries pushed into aggregators were added
sequentially. It could cause delays on data ingestion and it was not
possible to use all available.

 This commit adds concurrency based on available CPU cores.

Also, it adds new generic Buffer and BufferPool into slicesutil.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9878
2025-10-24 10:29:48 +02:00
Aliaksandr Valialkin
d0f8773f4b app/vmauth: log the real cause for timed out requests to vmauth
Previously a misleading random error could be logged for canceled and/or timed out requests to vmauth.
Consistently log the request timeout error for timed out requests.

While at it, do not log errors for requests canceled by the remote client, since such logs aren't actionable
and just pollute error logs generated by vmauth.
2025-10-21 15:59:05 +02:00
Max Kotliar
7ec6f28a7c docs/changelog: add links to related PRs. 2025-10-21 16:11:21 +03:00
Max Kotliar
46ef5460a9 docs: run make docs-update-flags
sync flags with actual values in binaries
2025-10-21 16:04:20 +03:00
Max Kotliar
1a68d4ac8a docs/CHANGELOG.md: update changelog with LTS release notes; bump LTS versions 2025-10-21 11:23:01 +03:00
Max Kotliar
be7039429d docs: bump latest version in docs 2025-10-21 10:58:47 +03:00
Max Kotliar
76e5cd2cd4 deployment/docker: bump version 2025-10-21 10:55:35 +03:00
Max Kotliar
f91789eebd docs/CHANGELOG.md: cut v1.128.0 2025-10-17 14:53:46 +03:00
Max Kotliar
9b7fee13d6 docs: update version tooltips 2025-10-17 14:50:54 +03:00
Max Kotliar
c89825f64d app/{vmselect,vlselect}: run make vmui-update 2025-10-17 14:44:39 +03:00
Stephan Burns
2d0d96855b docs: grammatical changes (#9862)
### Describe Your Changes

Some small grammatical changes.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-17 14:09:05 +03:00
Rishub
e32b1ba6a5 Fix docker pull command for vmanomaly images (#9870)
### Describe Your Changes

The docker-hub image url was also set to quay.io changed it back to
docker.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-17 14:06:49 +03:00
Nikolay
41267decd3 lib/protoparser: add flag opentelemetry.convertMetricNamesToPrometheus (#9875)
Introduce a new flag, which converts only metric names into Prometheus
compatible format. And keeps label names in original form.

 It's needed to keep labels in original form, which
is useful for correlation with other telemetry sources, such as logs or
traces.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9830

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-10-17 13:42:30 +03:00
Andrii Chubatiuk
75fef7a011 lib/streamaggr: disable impact of flush_on_shutdown on aggregated series flush time (#9852)
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9642

additionally renamed variable since meaning of variables
`!flushOnShutdown` and `skipIncompleteFlush` is not equal.

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2025-10-17 13:03:29 +03:00
Andrii Chubatiuk
b87c515d14 {dashboards,rules}: update storage ETA calculations in both dashboards and rules (#9848)
currently index file size is calculated as average across all storages
from all clusters, updated it to get more valid calculations. also PR
[fixes helm chart issue
](https://github.com/VictoriaMetrics/helm-charts/issues/2474).

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-17 12:45:10 +03:00
Zakhar Bessarab
fb1344b5bf lib/backup/s3remote: use http client from AWS instead of custom implementation (#9869)
AWS SDK does not modify custom http client configuration if it was provided. This leads to
additional configuration such as environment variables being ignored.

Use AWS http client builder instead of custom implementation and
override DialContext to preserve metrics exposed by custom transport.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9858

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-17 12:29:16 +04:00
Nikolay
168ee75a3c app/vmagent/kafka: add opentelemetry consumer format
This commit adds opentelemetry format for kafka consumer

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9734
2025-10-14 22:28:20 +02:00
dependabot[bot]
e3a5e29a02 build(deps): bump actions/setup-node from 4 to 6 (#9857)
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4
to 6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-node/releases">actions/setup-node's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<p><strong>Breaking Changes</strong></p>
<ul>
<li>Limit automatic caching to npm, update workflows and documentation
by <a
href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1374">actions/setup-node#1374</a></li>
</ul>
<p><strong>Dependency Upgrades</strong></p>
<ul>
<li>Upgrade ts-jest from 29.1.2 to 29.4.1 and document breaking changes
in v5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1336">#1336</a></li>
<li>Upgrade prettier from 2.8.8 to 3.6.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1334">#1334</a></li>
<li>Upgrade actions/publish-action from 0.3.0 to 0.4.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1362">#1362</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v5...v6.0.0">https://github.com/actions/setup-node/compare/v5...v6.0.0</a></p>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Enhance caching in setup-node with automatic package manager
detection by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1348">actions/setup-node#1348</a></li>
</ul>
<p>This update, introduces automatic caching when a valid
<code>packageManager</code> field is present in your
<code>package.json</code>. This aims to improve workflow performance and
make dependency management more seamless.
To disable this automatic caching, set <code>package-manager-cache:
false</code></p>
<pre lang="yaml"><code>steps:
- uses: actions/checkout@v5
- uses: actions/setup-node@v5
  with:
    package-manager-cache: false
</code></pre>
<ul>
<li>Upgrade action to use node24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/1325">actions/setup-node#1325</a></li>
</ul>
<p>Make sure your runner is on version v2.327.1 or later to ensure
compatibility with this release. <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">See
Release Notes</a></p>
<h3>Dependency Upgrades</h3>
<ul>
<li>Upgrade <code>@​octokit/request-error</code> and
<code>@​actions/github</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1227">actions/setup-node#1227</a></li>
<li>Upgrade uuid from 9.0.1 to 11.1.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1273">actions/setup-node#1273</a></li>
<li>Upgrade undici from 5.28.5 to 5.29.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1295">actions/setup-node#1295</a></li>
<li>Upgrade form-data to bring in fix for critical vulnerability by <a
href="https://github.com/gowridurgad"><code>@​gowridurgad</code></a> in
<a
href="https://redirect.github.com/actions/setup-node/pull/1332">actions/setup-node#1332</a></li>
<li>Upgrade actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1345">actions/setup-node#1345</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1348">actions/setup-node#1348</a></li>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1325">actions/setup-node#1325</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v4...v5.0.0">https://github.com/actions/setup-node/compare/v4...v5.0.0</a></p>
<h2>v4.4.0</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2028fbc5c2"><code>2028fbc</code></a>
Limit automatic caching to npm, update workflows and documentation (<a
href="https://redirect.github.com/actions/setup-node/issues/1374">#1374</a>)</li>
<li><a
href="13427813f7"><code>1342781</code></a>
Bump actions/publish-action from 0.3.0 to 0.4.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1362">#1362</a>)</li>
<li><a
href="89d709d423"><code>89d709d</code></a>
Bump prettier from 2.8.8 to 3.6.2 (<a
href="https://redirect.github.com/actions/setup-node/issues/1334">#1334</a>)</li>
<li><a
href="cd2651c462"><code>cd2651c</code></a>
Bump ts-jest from 29.1.2 to 29.4.1 (<a
href="https://redirect.github.com/actions/setup-node/issues/1336">#1336</a>)</li>
<li><a
href="a0853c2454"><code>a0853c2</code></a>
Bump actions/checkout from 4 to 5 (<a
href="https://redirect.github.com/actions/setup-node/issues/1345">#1345</a>)</li>
<li><a
href="b7234cc9fe"><code>b7234cc</code></a>
Upgrade action to use node24 (<a
href="https://redirect.github.com/actions/setup-node/issues/1325">#1325</a>)</li>
<li><a
href="d7a11313b5"><code>d7a1131</code></a>
Enhance caching in setup-node with automatic package manager detection
(<a
href="https://redirect.github.com/actions/setup-node/issues/1348">#1348</a>)</li>
<li><a
href="5e2628c959"><code>5e2628c</code></a>
Bumps form-data (<a
href="https://redirect.github.com/actions/setup-node/issues/1332">#1332</a>)</li>
<li><a
href="65beceff8e"><code>65becef</code></a>
Bump undici from 5.28.5 to 5.29.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1295">#1295</a>)</li>
<li><a
href="7e24a656e1"><code>7e24a65</code></a>
Bump uuid from 9.0.1 to 11.1.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1273">#1273</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/setup-node/compare/v4...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-node&package-manager=github_actions&previous-version=4&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-14 17:32:28 +03:00
Max Kotliar
7dbe569fe7 deployment/docker: update Go builder from Go1.25.2 to Go1.25.3 (#9859)
Release https://go.dev/doc/devel/release#go1.25.3

Changeshttps://github.com/golang/go/issues?q=milestone%3AGo1.25.3%20label%3ACherryPickApproved

Follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9843
2025-10-14 17:09:01 +03:00
Roman Khavronenko
8f3e96fa38 deployment: bump alpine image to 3.22.2 (#9855)
See
https://www.alpinelinux.org/posts/Alpine-3.19.9-3.20.8-3.21.5-3.22.2-released.html

Addresses CVE-2025-9230, CVE-2025-9231, CVE-2025-9232.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-13 18:56:51 +03:00
Nikolay
cd9d5619d9 lib/promscrape/config: change promscrape.dropOriginalLabels default value to false
Tracking original labels requires storing a copy of labels obtained
from service discovery. It adds extra Garbage Collection pressure and as
a result increased CPU usage.

While dropOriginalLabels has almost no impact at test and small
installations.
Impact grows with a scale. And especially is impactful at Kubernetes
based installations.

 In addition, this flag is disabled by default for `k8s-stack` helm
chart, which is our main Kubernetes monitoring solution.

 An also, we recommend at vmagent optimisation guide to disable original
 labels storing.

 This commit changes default value to true and disables tracking of
dropped targets by default. In case of debugging, it could be easily
enabled back by providing `false` value to the flag:
`promscrape.dropOriginalLabels`. It should improve resource usage out of
box by reducing user-experience for minority of users.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9665

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9772
2025-10-13 11:05:24 +02:00
Andrii Chubatiuk
fb60857b8f app/vmui: make TextField box height constant regardless error message presence
TextField component has ability to show error message and depending on
it's presence text field height changes, which may cause visibility
issues if this field is vertically aligned with some neighbour
components. This PR makes textfield height constant and its input box
horizontally symmetrical

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9693
2025-10-13 11:03:16 +02:00
Andrii Chubatiuk
93f72c487d app/vmui: fixed code color value in light mode
Fixed typo in code color variable value in light mode

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9851
2025-10-13 11:02:36 +02:00
Zakhar Bessarab
0eb465782a deps: unpin AWS dependencies and add workaround for S3 compatibility (#9844)
Updates:
- unpin AWS dependencies and run `make vendor-update`
- add config options to enable checksums only if required by storage in
order to preserve backwards compatibility

Related issues:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9748
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8622

Tested with: AWS S3, self-hosted MinIO, Linode object storage as it was
failing previously with multi-part uploads (reported here -
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8630#issuecomment-2772185033).
An updated library allows (PR with the
fix - https://github.com/aws/aws-sdk-go-v2/pull/3151) overriding
multi-part upload configurations so that compatibility can be preserved.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-10 18:35:37 +04:00
Zakhar Bessarab
b703768e9c lib/backup/actions: improve progress logging (#9836)
Currently, it is hard to make sense of progress based on logging as it
requires manual calculation of progress and ETA.
Solve this by:
- making data units humanly readable
- adding an estimation of completion for the operation

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-10 18:34:01 +04:00
dependabot[bot]
4e86ddfdaa build(deps): bump github/codeql-action from 3 to 4 (#9827)
Bumps [github/codeql-action](https://github.com/github/codeql-action)
from 3 to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/github/codeql-action/releases">github/codeql-action's
releases</a>.</em></p>
<blockquote>
<h2>v3.30.7</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.30.7 - 06 Oct 2025</h2>
<p>No user facing changes.</p>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.30.7/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.30.6</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.30.6 - 02 Oct 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.23.2. <a
href="https://redirect.github.com/github/codeql-action/pull/3168">#3168</a></li>
</ul>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.30.6/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.30.5</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.30.5 - 26 Sep 2025</h2>
<ul>
<li>We fixed a bug that was introduced in <code>3.30.4</code> with
<code>upload-sarif</code> which resulted in files without a
<code>.sarif</code> extension not getting uploaded. <a
href="https://redirect.github.com/github/codeql-action/pull/3160">#3160</a></li>
</ul>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.30.5/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.30.4</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.30.4 - 25 Sep 2025</h2>
<ul>
<li>We have improved the CodeQL Action's ability to validate that the
workflow it is used in does not use different versions of the CodeQL
Action for different workflow steps. Mixing different versions of the
CodeQL Action in the same workflow is unsupported and can lead to
unpredictable results. A warning will now be emitted from the
<code>codeql-action/init</code> step if different versions of the CodeQL
Action are detected in the workflow file. Additionally, an error will
now be thrown by the other CodeQL Action steps if they load a
configuration file that was generated by a different version of the
<code>codeql-action/init</code> step. <a
href="https://redirect.github.com/github/codeql-action/pull/3099">#3099</a>
and <a
href="https://redirect.github.com/github/codeql-action/pull/3100">#3100</a></li>
<li>We added support for reducing the size of dependency caches for Java
analyses, which will reduce cache usage and speed up workflows. This
will be enabled automatically at a later time. <a
href="https://redirect.github.com/github/codeql-action/pull/3107">#3107</a></li>
<li>You can now run the latest CodeQL nightly bundle by passing
<code>tools: nightly</code> to the <code>init</code> action. In general,
the nightly bundle is unstable and we only recommend running it when
directed by GitHub staff. <a
href="https://redirect.github.com/github/codeql-action/pull/3130">#3130</a></li>
<li>Update default CodeQL bundle version to 2.23.1. <a
href="https://redirect.github.com/github/codeql-action/pull/3118">#3118</a></li>
</ul>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.30.4/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.30.3</h2>
<h1>CodeQL Action Changelog</h1>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/github/codeql-action/blob/main/CHANGELOG.md">github/codeql-action's
changelog</a>.</em></p>
<blockquote>
<h2>3.29.4 - 23 Jul 2025</h2>
<p>No user facing changes.</p>
<h2>3.29.3 - 21 Jul 2025</h2>
<p>No user facing changes.</p>
<h2>3.29.2 - 30 Jun 2025</h2>
<ul>
<li>Experimental: When the <code>quality-queries</code> input for the
<code>init</code> action is provided with an argument, separate
<code>.quality.sarif</code> files are produced and uploaded for each
language with the results of the specified queries. Do not use this in
production as it is part of an internal experiment and subject to change
at any time. <a
href="https://redirect.github.com/github/codeql-action/pull/2935">#2935</a></li>
</ul>
<h2>3.29.1 - 27 Jun 2025</h2>
<ul>
<li>Fix bug in PR analysis where user-provided <code>include</code>
query filter fails to exclude non-included queries. <a
href="https://redirect.github.com/github/codeql-action/pull/2938">#2938</a></li>
<li>Update default CodeQL bundle version to 2.22.1. <a
href="https://redirect.github.com/github/codeql-action/pull/2950">#2950</a></li>
</ul>
<h2>3.29.0 - 11 Jun 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.22.0. <a
href="https://redirect.github.com/github/codeql-action/pull/2925">#2925</a></li>
<li>Bump minimum CodeQL bundle version to 2.16.6. <a
href="https://redirect.github.com/github/codeql-action/pull/2912">#2912</a></li>
</ul>
<h2>3.28.21 - 28 July 2025</h2>
<p>No user facing changes.</p>
<h2>3.28.20 - 21 July 2025</h2>
<ul>
<li>Remove support for combining SARIF files from a single upload for
GHES 3.18, see <a
href="https://github.blog/changelog/2024-05-06-code-scanning-will-stop-combining-runs-from-a-single-upload/">the
changelog post</a>. <a
href="https://redirect.github.com/github/codeql-action/pull/2959">#2959</a></li>
</ul>
<h2>3.28.19 - 03 Jun 2025</h2>
<ul>
<li>The CodeQL Action no longer includes its own copy of the extractor
for the <code>actions</code> language, which is currently in public
preview.
The <code>actions</code> extractor has been included in the CodeQL CLI
since v2.20.6. If your workflow has enabled the <code>actions</code>
language <em>and</em> you have pinned
your <code>tools:</code> property to a specific version of the CodeQL
CLI earlier than v2.20.6, you will need to update to at least CodeQL
v2.20.6 or disable
<code>actions</code> analysis.</li>
<li>Update default CodeQL bundle version to 2.21.4. <a
href="https://redirect.github.com/github/codeql-action/pull/2910">#2910</a></li>
</ul>
<h2>3.28.18 - 16 May 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.21.3. <a
href="https://redirect.github.com/github/codeql-action/pull/2893">#2893</a></li>
<li>Skip validating SARIF produced by CodeQL for improved performance.
<a
href="https://redirect.github.com/github/codeql-action/pull/2894">#2894</a></li>
<li>The number of threads and amount of RAM used by CodeQL can now be
set via the <code>CODEQL_THREADS</code> and <code>CODEQL_RAM</code>
runner environment variables. If set, these environment variables
override the <code>threads</code> and <code>ram</code> inputs
respectively. <a
href="https://redirect.github.com/github/codeql-action/pull/2891">#2891</a></li>
</ul>
<h2>3.28.17 - 02 May 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.21.2. <a
href="https://redirect.github.com/github/codeql-action/pull/2872">#2872</a></li>
</ul>
<h2>3.28.16 - 23 Apr 2025</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="aac66ec793"><code>aac66ec</code></a>
Remove <code>update-proxy-release</code> workflow</li>
<li><a
href="91a63dc72c"><code>91a63dc</code></a>
Remove <code>undefined</code> values from results of
<code>unsafeEntriesInvariant</code></li>
<li><a
href="d25fa60a90"><code>d25fa60</code></a>
ESLint: Disable <code>no-unused-vars</code> for parameters starting with
<code>_</code></li>
<li><a
href="3adb1ff7b8"><code>3adb1ff</code></a>
Reorder supported tags in descending order</li>
<li>See full diff in <a
href="https://github.com/github/codeql-action/compare/v3...v4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github/codeql-action&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-10 15:56:37 +03:00
Max Kotliar
7569d21123 docs: add CVE handling policy (#9847)
### Describe Your Changes

Add a CVE handling policy. 

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-10 15:53:41 +03:00
hagen1778
5d36616d02 Revert "docs: vmstorage 2 cpu requirement (#9846)"
This reverts commit 772ac8803e.

The reason for revert is that this recommendation should not be strict.
Installations with <= 1 vCPU will continue working efficiently. The load
from reads, writes and background merges will be evenly spread by Go runtime.

cc @Sleuth56 @tiny-pangolin
2025-10-10 14:16:08 +02:00
Roman Khavronenko
e31e0657c8 app/vmalert: uniformly populate error messages with URL context (#9845)
Before, `req.URL.Redacted` info was present in some error messages and
empty in others. This change uniformly adds it to the errors context.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-10 14:09:49 +02:00
Andrii Chubatiuk
50af991677 docs: mention ignore_first_sample_interval in stream aggregation docs (#9834)
### Describe Your Changes

related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9615

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-10 14:09:24 +02:00
Roman Khavronenko
5617b24cef docs: fix typo in queryRequestsCount field name (#9829)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-10 14:09:12 +02:00
Roman Khavronenko
ece3ee2427 docs: mention -metricNamesStatsResetAuthKey in Security (#9828)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-10 14:09:02 +02:00
Andrii Chubatiuk
4add451701 app/vmui: store query hidden state in query args (#9826)
### Describe Your Changes

save queries hidden state to
`expr.hide:<hidden_query_idx_1,..,hidden_query_idx_n>` query argument

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2025-10-10 14:08:37 +02:00
Andrii Chubatiuk
1c7d8d030f app/vmui: pass query arg to search input, reset dropdown state during update (#9825)
- fixed ignored `search` query argument in `Notifiers` and `Rules` tabs
- added dropdown state reset, if other filters were updated and selected
state is not a subset of available items
- proxy requests to config.json for a local setup

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2025-10-10 14:03:45 +02:00
Yury Molodov
80b4ca6367 app/vmui: rename "Reset" to "Reset filters" and disable when no filters are modified (#9821)
### Describe Your Changes

This PR updates the **Cardinality Explorer** page in `vmui`:

* Renames the `Reset` button to `Reset filters`.
* Disables the button when no filters are modified.

Related issue: #9609

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2025-10-10 13:57:42 +02:00
Yury Molodov
fafc754c11 app/vmui: prevent removing other query params when updating one (#9818)
### Describe Your Changes

Prevents removal of unrelated query parameters when updating a single
one in `vmui`.
Previously, changing one search parameter could unintentionally clear
others.

Related issue: #9816

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2025-10-10 13:52:54 +02:00
Vadim Rutkovsky
e1e2b69b2c docs/guides: add Headlamp setup guide (#9817)
### Describe Your Changes

This adds a guide on how to configure Headlamp k8s UI to display k8s
metrics from VictoriaMetrics

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-10 13:51:38 +02:00
hagen1778
bc2b822d59 docs: fix link in build badge on github readme page
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-10 13:50:25 +02:00
hagen1778
d9b0c5a268 docs: restore build badge on github readme page
Based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9809

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-10 13:49:19 +02:00
Stephan Burns
772ac8803e docs: vmstorage 2 cpu requirement (#9846)
### Describe Your Changes



### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Mathias Palmersheim <mathias@victoriametrics.com>
2025-10-09 16:24:34 -04:00
Zakhar Bessarab
4d35b359bd deployment/docker: update Go builder from Go1.25.1 to Go1.25.2 (#9843)
See
https://github.com/golang/go/issues?q=milestone%3AGo1.25.2%20label%3ACherryPickApproved

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-09 16:53:00 +04:00
Max Kotliar
8c4faba658 app/{vmagent,vmalert}: add -secret.flags to configure flag to be hidd… (#9839)
This is a refined version of
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6940, with all
work completed by @truepele.

---

### Describe Your Changes

Fixes #6938 

introduce -secret.flags to configure flag names to be hidden in logs and
on /metrics

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Co-authored-by: andrii <truepele@gmail.com>
2025-10-09 15:07:46 +03:00
Fred Navruzov
5a29174b87 docs/vmanomaly: release v1.26.2 (#9841)
### Describe Your Changes

update the docs to forthcoming v1.26.2 patch release
⚠️ please do not merge upon approval before respective image tags
are live

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-09 10:06:58 +02:00
Fred Navruzov
ab89bbba5d docs/vmanomaly: ui page formatting fixes (#9840)
### Describe Your Changes

ui page formatting fixes

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-08 20:15:05 +02:00
Fred Navruzov
748c49cc9d docs/vmanomaly: release v1.26.1 (#9833)
### Describe Your Changes

release v1.26.1 docs updates

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-08 12:55:41 +02:00
Artem Fetishev
a88851f4a0 docs: bump latest version in docs
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-07 17:56:32 +02:00
Artem Fetishev
e0a76e2552 deployment/docker: bump version
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-07 17:50:17 +02:00
Artem Fetishev
a569224dad docs: bump last LTS versions
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-07 17:47:26 +02:00
Artem Fetishev
ef21fd6e2c docs/CHANGELOG.md: update changelog with LTS release notes
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-07 17:41:03 +02:00
hagen1778
57b16f0976 docs: update formatting for stream aggregation header
While there, remove excessive relabeling info and point users to Routing section.
The Routing section should explain how to build flexible processing pipleines.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-07 17:02:38 +02:00
hagen1778
4fbf26df02 deployment: revert upgrade libcrypto3 and libssl3 to 3.5.4-r0
Reverts 17ca1ba8c4

Reason for reverts are following:
1. The fix relies on release candidates of specific libraries
2. The real fix would be to update Alpine version, which is not released yet
3. It makes the fix partially done, as it would require follow-up in future to
switch from release candidates to stable versions, or to update Alpine version.
4. The fix is not effective, as it doesn't update the base image cached by Docker.
The real fix will be to host&update the base image separately like in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9811.
5. VM binaries aren't vulnerable to mentioned vulnerabilites.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-07 16:55:05 +02:00
Zhu Jiekun
91f0738e67 docs: clarify how sharding in vmagent works when srv url is used (#9787)
### Describe Your Changes

clarify how sharding in vmagent works when srv url is used

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-07 16:45:15 +02:00
Roman Khavronenko
c70fc288f1 docs: fix small typos in changelog (#9798)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-07 16:43:56 +02:00
Vadim Rutkovsky
921e6df382 lib/promb: restore 1000 iterations for write request benchmark (#9812)
### Describe Your Changes

Initially, this benchmark was running 1000 iterations. Later in
c005245741 it was refactored and bumped to
10_000. This alerts continuous benchmark systems, so this commit brings
it back to 1000
### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-07 16:42:54 +02:00
Yury Molodov
e5dd794db5 app/vmui: refactor Header.tsx and fix styles (#9823)
### Describe Your Changes

- Centered the logo in the header (minor visual tweak; see screenshots
below)
- Simplified sidebar visibility conditions in `Header.tsx` for
readability/maintainability

_No changelog entry - minor visual tweak and internal refactor; no
functional impact._

### Before / After

| Before | After |
|--------|-------|
| <img width="404" height="81" alt="Before"
src="https://github.com/user-attachments/assets/21fad295-8c90-4c03-8837-d335923e645c"
/> | <img width="404" height="81" alt="After"
src="https://github.com/user-attachments/assets/8c3d7a34-fd9c-4326-aea8-ef82eade8b72"
/> |



### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2025-10-07 16:42:20 +02:00
Vadim Rutkovsky
cc87505f9d docs/victoriametrics/data-ingestion: add OpenShift guide (#9790)
### Describe Your Changes

This guide describes remote write configuration for OpenShift and
includes several useful tricks to make it efficient.

Relates to #8573

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2025-10-07 16:41:34 +02:00
Fred Navruzov
e198753f0d docs/vmanomaly: canonical link formatting (p2) (#9822)
### Describe Your Changes

canonical link formatting (p2)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-07 16:08:45 +02:00
Fred Navruzov
6b9fa21d30 docs/vmanomaly: fix links to canonical form (1) (#9819)
### Describe Your Changes

1st iteration of improving remaining non-canonical links

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-07 15:43:06 +02:00
Fred Navruzov
e7abdce0de docs/vmanomaly: update schemas to v1.26.0 (#9813)
### Describe Your Changes

- update component schemas to v1.26.0; 
- fix old ?highlight=... links format

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-07 08:48:45 +02:00
Artem Fetishev
31adbf9094 docs/CHANGELOG.md: cut v1.127.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-03 18:27:29 +02:00
Artem Fetishev
dd508b9542 make vmui-update
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-03 16:09:33 +00:00
Roman Khavronenko
f151668188 apptest: tolerate time drift (#9801)
Shift time selectors by 1s to prevent time drifting in tests. Because of
shifting, test was flapping based on when it is run. See
https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/18186734439/job/51824319927

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9770

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-03 16:54:16 +02:00
Roman Khavronenko
17ca1ba8c4 deployment: upgrade libcrypto3 and libssl3 to 3.5.4-r0 (#9805)
Addresses CVE-2025-9230, CVE-2025-9231, CVE-2025-9232.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-03 14:22:03 +02:00
Fred Navruzov
7654536aab docs/vmanomaly: typos and 404 after 1.26.0 release (#9800)
### Describe Your Changes

fixing broken links in UI/FAQ pages

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-02 21:04:19 +02:00
Fred Navruzov
901de22894 change page title from GUI to UI (#9799)
### Describe Your Changes

Fixing a typo in page name

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-02 17:40:42 +02:00
Fred Navruzov
96e514439d docs/vmanomaly: release v1.26.0 (#9793)
### Describe Your Changes

Docs update for v1.26.0 release of `vmanomaly`, including vmui-like GUI
docs and `VLogsReader`

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-02 17:34:03 +02:00
Aliaksandr Valialkin
43e189a56c docs/victoriametrics/enterprise.md: typo fix in the 'VictoriaLogs Enterprise features' chapter: VictoriaMetrics -> VictoriaLogs 2025-10-02 13:00:49 +02:00
Aliaksandr Valialkin
4dbf899d89 docs/victoriametrics/enterprise.md: properly state that Monitoring of Monitoring feature at VictoriaLogs enterprise helps preventing issues in VictoriaLogs setups, not VictoriaMetrics setups 2025-10-02 12:58:26 +02:00
4048 changed files with 378749 additions and 232930 deletions

View File

@@ -5,7 +5,7 @@ body:
- type: textarea
id: describe-the-component
attributes:
label: Is your question request related to a specific component?
label: Is your question related to a specific component?
placeholder: |
VictoriaMetrics, vmagent, vmalert, vmui, etc...
validations:

View File

@@ -1,23 +0,0 @@
# Project Overview
VictoriaMetrics is a fast, cost-saving, and scalable solution for monitoring and managing time series data. It delivers high performance and reliability, making it an ideal choice for businesses of all sizes.
## Folder Structure
- `/app`: Contains the compilable binaries.
- `/lib`: Contains the golang reusable libraries
- `/docs/victoriametrics`: Contains documentation for the project.
- `/apptest/tests`: Contains integration tests.
## Libraries and Frameworks
- Backend: Golang, no framework. Use third-party libraries sparingly.
- Frontend: React.
## Code review guidelines
Ensure the feature or bugfix includes a changelog entry in /docs/victoriametrics/changelog/CHANGELOG.md.
Verify the entry is under the ## tip section and matches the structure and style of existing entries.
Chore-only changes may be omitted from the changelog.

View File

@@ -4,6 +4,8 @@ updates:
directory: "/"
schedule:
interval: "daily"
cooldown:
default-days: 21
- package-ecosystem: "gomod"
directory: "/"
schedule:
@@ -23,6 +25,8 @@ updates:
directory: "/"
schedule:
interval: "daily"
cooldown:
default-days: 21
- package-ecosystem: "npm"
directory: "/app/vmui/packages/vmui"
schedule:

48
.github/scripts/lint-changelog-tip.sh vendored Executable file
View File

@@ -0,0 +1,48 @@
#!/usr/bin/env sh
set -e
CHANGELOG_FILE="docs/victoriametrics/changelog/CHANGELOG.md"
GITHUB_BASE_REF=${GITHUB_BASE_REF:-"master"}
GIT_REMOTE=${GIT_REMOTE:-"origin"}
git diff "${GIT_REMOTE}/${GITHUB_BASE_REF}"...HEAD -- $CHANGELOG_FILE > diff.txt
if ! grep -q "^+" diff.txt; then
echo "No additions in CHANGELOG.md"
exit 0
fi
ADDED_LINES=$(grep "^+\S" diff.txt | sed 's/^+//')
START_TIP=$(grep -n "^## tip" "$CHANGELOG_FILE" | head -1 | cut -d: -f1)
if [ -z "$START_TIP" ]; then
echo "ERROR: ${CHANGELOG_FILE} does not contain a ## tip section"
exit 1
fi
END_TIP=$(awk "NR>$START_TIP && /^## / {print NR; exit}" "${CHANGELOG_FILE}")
if [ -z "$END_TIP" ]; then
END_TIP=$(wc -l < "$CHANGELOG_FILE")
fi
BAD=0
while IFS= read -r line; do
# Grep exact line inside the file and get line numbers
MATCHES=$(grep -n -F "$line" "$CHANGELOG_FILE" | cut -d: -f1)
for m in $MATCHES; do
if [ "$m" -lt "$START_TIP" ] || [ "$m" -gt "$END_TIP" ]; then
echo "'$line' on line ${m} is outside ## tip section (lines ${START_TIP}-${END_TIP})"
BAD=1
fi
done
done << EOF
$ADDED_LINES
EOF
if [ "$BAD" -ne 0 ]; then
echo "CHANGELOG modifications must be placed inside the ## tip section."
exit 1
fi
echo "CHANGELOG modifications are valid."

View File

@@ -47,6 +47,8 @@ jobs:
arch: arm
- os: linux
arch: ppc64le
- os: linux
arch: s390x
- os: darwin
arch: amd64
- os: darwin
@@ -59,7 +61,7 @@ jobs:
arch: amd64
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go
@@ -69,7 +71,8 @@ jobs:
go.sum
Makefile
app/**/Makefile
go-version: stable
go-version-file: 'go.mod'
- run: go version
- name: Build victoria-metrics for ${{ matrix.os }}-${{ matrix.arch }}
run: make victoria-metrics-${{ matrix.os }}-${{ matrix.arch }}

19
.github/workflows/changelog-linter.yml vendored Normal file
View File

@@ -0,0 +1,19 @@
name: 'changelog-linter'
on:
pull_request:
paths:
- "docs/victoriametrics/changelog/CHANGELOG.md"
jobs:
tip-lint:
runs-on: 'ubuntu-latest'
steps:
- uses: 'actions/checkout@v6'
with:
# needed for proper diff
fetch-depth: 0
- name: 'Validate that changelog changes are under ## tip'
run: |
GITHUB_BASE_REF=${{ github.base_ref }} ./.github/scripts/lint-changelog-tip.sh

View File

@@ -8,7 +8,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v5
uses: actions/checkout@v6
with:
fetch-depth: 0 # we need full history for commit verification
@@ -27,11 +27,21 @@ jobs:
exit 0
fi
unsigned=$(git log --pretty="%H %G?" $RANGE | grep -vE " (G|E)$" || true)
# Check raw commit objects for a "gpgsig" header as a fast early signal for
# contributors. Both GPG and SSH signatures use this header.
# This avoids relying on %G? which returns N for SSH commits.
# This check is not a security enforcement — unsigned commits cannot be merged
# anyway due to the GitHub repository merge policy.
unsigned=""
for sha in $(git rev-list $RANGE); do
if ! git cat-file commit "$sha" | grep -q "^gpgsig"; then
unsigned="$unsigned $sha"
fi
done
if [ -n "$unsigned" ]; then
echo "Found unsigned commits:"
echo "$unsigned"
exit 1
fi
echo "All commits in PR are signed (G or E)"
echo "All commits in PR are signed (GPG or SSH)"

View File

@@ -21,9 +21,11 @@ jobs:
id: go
uses: actions/setup-go@v6
with:
go-version: stable
go-version-file: 'go.mod'
cache: false
- run: go version
- name: Cache Go artifacts
uses: actions/cache@v4
with:
@@ -32,7 +34,7 @@ jobs:
~/go/pkg/mod
~/go/bin
key: go-artifacts-${{ runner.os }}-check-licenses-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-check-licenses-
restore-keys: go-artifacts-${{ runner.os }}-check-licenses-${{ steps.go.outputs.go-version }}-
- name: Check License
run: make check-licenses

View File

@@ -29,14 +29,15 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Set up Go
id: go
uses: actions/setup-go@v6
with:
cache: false
go-version: stable
go-version-file: 'go.mod'
- run: go version
- name: Cache Go artifacts
uses: actions/cache@v4
@@ -46,17 +47,17 @@ jobs:
~/go/bin
~/go/pkg/mod
key: go-artifacts-${{ runner.os }}-codeql-analyze-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-codeql-analyze-
restore-keys: go-artifacts-${{ runner.os }}-codeql-analyze-${{ steps.go.outputs.go-version }}-
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
uses: github/codeql-action/init@v4
with:
languages: go
- name: Autobuild
uses: github/codeql-action/autobuild@v3
uses: github/codeql-action/autobuild@v4
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
uses: github/codeql-action/analyze@v4
with:
category: 'language:go'

View File

@@ -16,19 +16,19 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
with:
path: __vm
- name: Checkout private code
uses: actions/checkout@v5
uses: actions/checkout@v6
with:
repository: VictoriaMetrics/vmdocs
token: ${{ secrets.VM_BOT_GH_TOKEN }}
path: __vm-docs
- name: Import GPG key
uses: crazy-max/ghaction-import-gpg@v6
uses: crazy-max/ghaction-import-gpg@v7
id: import-gpg
with:
gpg_private_key: ${{ secrets.VM_BOT_GPG_PRIVATE_KEY }}

View File

@@ -32,7 +32,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go
@@ -42,8 +42,9 @@ jobs:
go.sum
Makefile
app/**/Makefile
go-version: stable
go-version-file: 'go.mod'
- run: go version
- name: Cache golangci-lint
uses: actions/cache@v4
@@ -51,7 +52,7 @@ jobs:
path: |
~/.cache/golangci-lint
~/go/bin
key: golangci-lint-${{ runner.os }}-${{ hashFiles('.golangci.yml') }}
key: golangci-lint-${{ runner.os }}-${{ steps.go.outputs.go-version }}-${{ hashFiles('.golangci.yml') }}
- name: Run check-all
run: |
@@ -65,13 +66,13 @@ jobs:
strategy:
matrix:
scenario:
- 'test-full'
- 'test-full-386'
- 'test'
- 'test-386'
- 'test-pure'
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go
@@ -81,23 +82,19 @@ jobs:
go.sum
Makefile
app/**/Makefile
go-version: stable
go-version-file: 'go.mod'
- run: go version
- name: Run tests
run: GOGC=10 make ${{ matrix.scenario}}
run: make ${{ matrix.scenario}}
- name: Publish coverage
uses: codecov/codecov-action@v5
with:
files: ./coverage.txt
integration:
name: integration
runs-on: ubuntu-latest
apptest:
name: apptest
runs-on: apptest
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go
@@ -107,7 +104,8 @@ jobs:
go.sum
Makefile
app/**/Makefile
go-version: stable
go-version-file: 'go.mod'
- run: go version
- name: Run integration tests
run: make integration-test
- name: Run app tests
run: make apptest

View File

@@ -32,35 +32,41 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Node
uses: actions/setup-node@v4
- name: Cache node_modules
id: cache
uses: actions/cache@v5
with:
node-version: '24.x'
path: app/vmui/packages/vmui/node_modules
key: vmui-deps-${{ runner.os }}-${{ hashFiles('app/vmui/packages/vmui/package-lock.json', 'app/vmui/Dockerfile-build') }}
restore-keys: |
vmui-deps-${{ runner.os }}-
- name: Cache node-modules
uses: actions/cache@v4
with:
path: |
app/vmui/packages/vmui/node_modules
key: vmui-artifacts-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
restore-keys: vmui-artifacts-${{ runner.os }}-
- name: Install dependencies
if: steps.cache.outputs.cache-hit != 'true'
run: make vmui-install
- name: Run lint
id: lint
run: make vmui-lint
continue-on-error: true
env:
VMUI_SKIP_INSTALL: true
- name: Run tests
id: test
run: make vmui-test
continue-on-error: true
env:
VMUI_SKIP_INSTALL: true
- name: Run typecheck
id: typecheck
run: make vmui-typecheck
continue-on-error: true
env:
VMUI_SKIP_INSTALL: true
- name: Annotate Code Linting Results
uses: ataylorme/eslint-annotate-action@v3

View File

@@ -175,7 +175,7 @@
END OF TERMS AND CONDITIONS
Copyright 2019-2025 VictoriaMetrics, Inc.
Copyright 2019-2026 VictoriaMetrics, Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

View File

@@ -17,7 +17,7 @@ EXTRA_GO_BUILD_TAGS ?=
GO_BUILDINFO = -X '$(PKG_PREFIX)/lib/buildinfo.Version=$(APP_NAME)-$(DATEINFO_TAG)-$(BUILDINFO_TAG)'
TAR_OWNERSHIP ?= --owner=1000 --group=1000
GOLANGCI_LINT_VERSION := 2.4.0
GOLANGCI_LINT_VERSION := 2.9.0
.PHONY: $(MAKECMDGOALS)
@@ -125,6 +125,15 @@ vmutils-linux-ppc64le: \
vmrestore-linux-ppc64le \
vmctl-linux-ppc64le
vmutils-linux-s390x: \
vmagent-linux-s390x \
vmalert-linux-s390x \
vmalert-tool-linux-s390x \
vmauth-linux-s390x \
vmbackup-linux-s390x \
vmrestore-linux-s390x \
vmctl-linux-s390x
vmutils-darwin-amd64: \
vmagent-darwin-amd64 \
vmalert-darwin-amd64 \
@@ -257,6 +266,7 @@ release-victoria-metrics: \
release-victoria-metrics-linux-amd64 \
release-victoria-metrics-linux-arm \
release-victoria-metrics-linux-arm64 \
release-victoria-metrics-linux-s390x \
release-victoria-metrics-darwin-amd64 \
release-victoria-metrics-darwin-arm64 \
release-victoria-metrics-freebsd-amd64 \
@@ -275,6 +285,9 @@ release-victoria-metrics-linux-arm:
release-victoria-metrics-linux-arm64:
GOOS=linux GOARCH=arm64 $(MAKE) release-victoria-metrics-goos-goarch
release-victoria-metrics-linux-s390x:
GOOS=linux GOARCH=s390x $(MAKE) release-victoria-metrics-goos-goarch
release-victoria-metrics-darwin-amd64:
GOOS=darwin GOARCH=amd64 $(MAKE) release-victoria-metrics-goos-goarch
@@ -314,6 +327,7 @@ release-vmutils: \
release-vmutils-linux-amd64 \
release-vmutils-linux-arm64 \
release-vmutils-linux-arm \
release-vmutils-linux-s390x \
release-vmutils-darwin-amd64 \
release-vmutils-darwin-arm64 \
release-vmutils-freebsd-amd64 \
@@ -332,6 +346,9 @@ release-vmutils-linux-arm64:
release-vmutils-linux-arm:
GOOS=linux GOARCH=arm $(MAKE) release-vmutils-goos-goarch
release-vmutils-linux-s390x:
GOOS=linux GOARCH=s390x $(MAKE) release-vmutils-goos-goarch
release-vmutils-darwin-amd64:
GOOS=darwin GOARCH=amd64 $(MAKE) release-vmutils-goos-goarch
@@ -418,7 +435,7 @@ release-vmutils-windows-goarch: \
vmctl-windows-$(GOARCH)-prod.exe
pprof-cpu:
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics@ $(PPROF_FILE)
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics $(PPROF_FILE)
fmt:
gofmt -l -w -s ./lib
@@ -426,7 +443,7 @@ fmt:
gofmt -l -w -s ./apptest
vet:
GOEXPERIMENT=synctest go vet ./lib/...
go vet -tags 'synctest' ./lib/...
go vet ./app/...
go vet ./apptest/...
@@ -435,39 +452,55 @@ check-all: fmt vet golangci-lint govulncheck
clean-checkers: remove-golangci-lint remove-govulncheck
test:
GOEXPERIMENT=synctest go test ./lib/... ./app/...
go test -tags 'synctest' ./lib/... ./app/...
test-race:
GOEXPERIMENT=synctest go test -race ./lib/... ./app/...
go test -tags 'synctest' -race ./lib/... ./app/...
test-386:
GOARCH=386 go test -tags 'synctest' ./lib/... ./app/...
test-pure:
GOEXPERIMENT=synctest CGO_ENABLED=0 go test ./lib/... ./app/...
CGO_ENABLED=0 go test -tags 'synctest' ./lib/... ./app/...
test-full:
GOEXPERIMENT=synctest go test -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
go test -tags 'synctest' -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
test-full-386:
GOEXPERIMENT=synctest GOARCH=386 go test -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
integration-test:
$(MAKE) apptest
GOARCH=386 go test -tags 'synctest' -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
apptest:
$(MAKE) victoria-metrics vmagent vmalert vmauth vmctl vmbackup vmrestore
go test ./apptest/... -skip="^TestCluster.*"
$(MAKE) victoria-metrics-race vmagent-race vmalert-race vmauth-race vmctl-race vmbackup-race vmrestore-race
go test ./apptest/... -skip="^Test(Cluster|Legacy).*"
apptest-legacy: victoria-metrics-race vmbackup-race vmrestore-race
OS=$$(uname | tr '[:upper:]' '[:lower:]'); \
ARCH=$$(uname -m | tr '[:upper:]' '[:lower:]' | sed 's/x86_64/amd64/'); \
VERSION=v1.132.0; \
VMSINGLE=victoria-metrics-$${OS}-$${ARCH}-$${VERSION}.tar.gz; \
VMCLUSTER=victoria-metrics-$${OS}-$${ARCH}-$${VERSION}-cluster.tar.gz; \
URL=https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/$${VERSION}; \
DIR=/tmp/$${VERSION}; \
test -d $${DIR} || (mkdir $${DIR} && \
curl --output-dir /tmp -LO $${URL}/$${VMSINGLE} && tar xzf /tmp/$${VMSINGLE} -C $${DIR} && \
curl --output-dir /tmp -LO $${URL}/$${VMCLUSTER} && tar xzf /tmp/$${VMCLUSTER} -C $${DIR} \
); \
VM_LEGACY_VMSINGLE_PATH=$${DIR}/victoria-metrics-prod \
VM_LEGACY_VMSTORAGE_PATH=$${DIR}/vmstorage-prod \
go test ./apptest/tests -run="^TestLegacySingle.*"
benchmark:
GOEXPERIMENT=synctest go test -bench=. ./lib/...
go test -bench=. ./app/...
go test -run=NO_TESTS -bench=. ./lib/...
go test -run=NO_TESTS -bench=. ./app/...
benchmark-pure:
GOEXPERIMENT=synctest CGO_ENABLED=0 go test -bench=. ./lib/...
CGO_ENABLED=0 go test -bench=. ./app/...
CGO_ENABLED=0 go test -run=NO_TESTS -bench=. ./lib/...
CGO_ENABLED=0 go test -run=NO_TESTS -bench=. ./app/...
vendor-update:
go get -u ./lib/...
go get -u ./app/...
go mod tidy -compat=1.24
go mod tidy -compat=1.26
go mod vendor
app-local:
@@ -483,14 +516,15 @@ app-local-windows-goarch:
CGO_ENABLED=0 GOOS=windows GOARCH=$(GOARCH) go build $(RACE) -ldflags "$(GO_BUILDINFO)" -tags "$(EXTRA_GO_BUILD_TAGS)" -o bin/$(APP_NAME)-windows-$(GOARCH)$(RACE).exe $(PKG_PREFIX)/app/$(APP_NAME)
quicktemplate-gen: install-qtc
qtc
qtc -dir=lib
qtc -dir=app
install-qtc:
which qtc || go install github.com/valyala/quicktemplate/qtc@latest
golangci-lint: install-golangci-lint
GOEXPERIMENT=synctest golangci-lint run
golangci-lint run --build-tags 'synctest'
install-golangci-lint:
which golangci-lint && (golangci-lint --version | grep -q $(GOLANGCI_LINT_VERSION)) || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v$(GOLANGCI_LINT_VERSION)

View File

@@ -1,12 +1,11 @@
# VictoriaMetrics
[![Latest Release](https://img.shields.io/github/v/release/VictoriaMetrics/VictoriaMetrics?sort=semver&label=&filter=!*-victorialogs&logo=github&labelColor=gray&color=gray&link=https%3A%2F%2Fgithub.com%2FVictoriaMetrics%2FVictoriaMetrics%2Freleases%2Flatest)](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
![Docker Pulls](https://img.shields.io/docker/pulls/victoriametrics/victoria-metrics?label=&logo=docker&logoColor=white&labelColor=2496ED&color=2496ED&link=https%3A%2F%2Fhub.docker.com%2Fr%2Fvictoriametrics%2Fvictoria-metrics)
[![Docker Pulls](https://img.shields.io/docker/pulls/victoriametrics/victoria-metrics?label=&logo=docker&logoColor=white&labelColor=2496ED&color=2496ED&link=https%3A%2F%2Fhub.docker.com%2Fr%2Fvictoriametrics%2Fvictoria-metrics)](https://hub.docker.com/u/victoriametrics)
[![Go Report](https://goreportcard.com/badge/github.com/VictoriaMetrics/VictoriaMetrics?link=https%3A%2F%2Fgoreportcard.com%2Freport%2Fgithub.com%2FVictoriaMetrics%2FVictoriaMetrics)](https://goreportcard.com/report/github.com/VictoriaMetrics/VictoriaMetrics)
[![Build Status](https://github.com/VictoriaMetrics/VictoriaMetrics/actions/workflows/main.yml/badge.svg?branch=master&link=https%3A%2F%2Fgithub.com%2FVictoriaMetrics%2FVictoriaMetrics%2Factions)](https://github.com/VictoriaMetrics/VictoriaMetrics/actions/workflows/main.yml)
[![codecov](https://codecov.io/gh/VictoriaMetrics/VictoriaMetrics/branch/master/graph/badge.svg?link=https%3A%2F%2Fcodecov.io%2Fgh%2FVictoriaMetrics%2FVictoriaMetrics)](https://app.codecov.io/gh/VictoriaMetrics/VictoriaMetrics)
[![Build Status](https://github.com/VictoriaMetrics/VictoriaMetrics/actions/workflows/build.yml/badge.svg?branch=master&link=https%3A%2F%2Fgithub.com%2FVictoriaMetrics%2FVictoriaMetrics%2Factions)](https://github.com/VictoriaMetrics/VictoriaMetrics/actions/workflows/build.yml)
[![License](https://img.shields.io/github/license/VictoriaMetrics/VictoriaMetrics?labelColor=green&label=&link=https%3A%2F%2Fgithub.com%2FVictoriaMetrics%2FVictoriaMetrics%2Fblob%2Fmaster%2FLICENSE)](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/LICENSE)
![Slack](https://img.shields.io/badge/Join-4A154B?logo=slack&link=https%3A%2F%2Fslack.victoriametrics.com)
[![Join Slack](https://img.shields.io/badge/Join%20Slack-4A154B?logo=slack)](https://slack.victoriametrics.com)
[![X](https://img.shields.io/twitter/follow/VictoriaMetrics?style=flat&label=Follow&color=black&logo=x&labelColor=black&link=https%3A%2F%2Fx.com%2FVictoriaMetrics)](https://x.com/VictoriaMetrics/)
[![Reddit](https://img.shields.io/reddit/subreddit-subscribers/VictoriaMetrics?style=flat&label=Join&labelColor=red&logoColor=white&logo=reddit&link=https%3A%2F%2Fwww.reddit.com%2Fr%2FVictoriaMetrics)](https://www.reddit.com/r/VictoriaMetrics/)
@@ -16,16 +15,21 @@
<img src="docs/victoriametrics/logo.webp" width="300" alt="VictoriaMetrics logo">
</picture>
VictoriaMetrics is a fast, cost-saving, and scalable solution for monitoring and managing time series data. It delivers high performance and reliability, making it an ideal choice for businesses of all sizes.
VictoriaMetrics is a fast, cost-effective, and scalable solution for monitoring and managing time series data. It delivers high performance and reliability, making it an ideal choice for businesses of all sizes.
Here are some resources and information about VictoriaMetrics:
- Documentation: [docs.victoriametrics.com](https://docs.victoriametrics.com)
- Case studies: [Grammarly, Roblox, Wix,...](https://docs.victoriametrics.com/victoriametrics/casestudies/).
- Available: [Binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest), docker images [Docker Hub](https://hub.docker.com/r/victoriametrics/victoria-metrics/) and [Quay](https://quay.io/repository/victoriametrics/victoria-metrics), [Source code](https://github.com/VictoriaMetrics/VictoriaMetrics)
- Deployment types: [Single-node version](https://docs.victoriametrics.com/), [Cluster version](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/), and [Enterprise version](https://docs.victoriametrics.com/victoriametrics/enterprise/)
- Changelog: [CHANGELOG](https://docs.victoriametrics.com/victoriametrics/changelog/), and [How to upgrade](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-upgrade-victoriametrics)
- Community: [Slack](https://slack.victoriametrics.com/), [X (Twitter)](https://x.com/VictoriaMetrics), [LinkedIn](https://www.linkedin.com/company/victoriametrics/), [YouTube](https://www.youtube.com/@VictoriaMetrics)
- **Case studies**: [Grammarly, Roblox, Wix, Spotify,...](https://docs.victoriametrics.com/victoriametrics/casestudies/).
- **Available**: [Binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest), Docker images on [Docker Hub](https://hub.docker.com/r/victoriametrics/victoria-metrics/) and [Quay](https://quay.io/repository/victoriametrics/victoria-metrics), [Source code](https://github.com/VictoriaMetrics/VictoriaMetrics).
- **Deployment types**: [Single-node version](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) and [Cluster version](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/) under [Apache License 2.0](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/LICENSE).
- **Getting started:** Read [key concepts](https://docs.victoriametrics.com/victoriametrics/keyconcepts/) and follow the
[quick start guide](https://docs.victoriametrics.com/victoriametrics/quick-start/).
- **Community**: [Slack](https://slack.victoriametrics.com/) (join via [Slack Inviter](https://slack.victoriametrics.com/)), [X (Twitter)](https://x.com/VictoriaMetrics), [YouTube](https://www.youtube.com/@VictoriaMetrics). See full list [here](https://docs.victoriametrics.com/victoriametrics/#community-and-contributions).
- **Changelog**: Project evolves fast - check the [CHANGELOG](https://docs.victoriametrics.com/victoriametrics/changelog/), and [How to upgrade](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-upgrade-victoriametrics).
- **Enterprise support:** [Contact us](mailto:info@victoriametrics.com) for commercial support with additional [enterprise features](https://docs.victoriametrics.com/victoriametrics/enterprise/).
- **Enterprise releases:** Enterprise and [long-term support releases (LTS)](https://docs.victoriametrics.com/victoriametrics/lts-releases/) are publicly available and can be evaluated for free
using a [free trial license](https://victoriametrics.com/products/enterprise/trial/).
- **Security:** we achieved [security certifications](https://victoriametrics.com/security/) for Database Software Development and Software-Based Monitoring Services.
Yes, we open-source both the single-node VictoriaMetrics and the cluster version.

View File

@@ -12,6 +12,31 @@ The following versions of VictoriaMetrics receive regular security fixes:
See [this page](https://victoriametrics.com/security/) for more details.
## Software Bill of Materials (SBOM)
Every VictoriaMetrics container{{% available_from "#" %}} image published to
[Docker Hub](https://hub.docker.com/u/victoriametrics)
and [Quay.io](https://quay.io/organization/victoriametrics)
includes an [SPDX](https://spdx.dev/) SBOM attestation
generated automatically by BuildKit during
`docker buildx build`.
To inspect the SBOM for an image:
```sh
docker buildx imagetools inspect \
docker.io/victoriametrics/victoria-metrics:latest \
--format "{{ json .SBOM }}"
```
To scan an image using its SBOM attestation with
[Trivy](https://github.com/aquasecurity/trivy):
```sh
trivy image --sbom-sources oci \
docker.io/victoriametrics/victoria-metrics:latest
```
## Reporting a Vulnerability
Please report any security issues to <security@victoriametrics.com>

View File

@@ -27,6 +27,9 @@ victoria-metrics-linux-ppc64le-prod:
victoria-metrics-linux-386-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-linux-386
victoria-metrics-linux-s390x-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-linux-s390x
victoria-metrics-darwin-amd64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-darwin-amd64

View File

@@ -134,6 +134,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, "<h2>Single-node VictoriaMetrics</h2></br>")
fmt.Fprintf(w, "Version %s<br>", buildinfo.Version)
fmt.Fprintf(w, "See docs at <a href='https://docs.victoriametrics.com/'>https://docs.victoriametrics.com/</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
httpserver.WriteAPIHelp(w, [][2]string{

View File

@@ -10,9 +10,11 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/metricsmetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeserieslimits"
)
@@ -27,11 +29,9 @@ var selfScraperWG sync.WaitGroup
func startSelfScraper() {
selfScraperStopCh = make(chan struct{})
selfScraperWG.Add(1)
go func() {
defer selfScraperWG.Done()
selfScraperWG.Go(func() {
selfScraper(*selfScrapeInterval)
}()
})
}
func stopSelfScraper() {
@@ -48,6 +48,7 @@ func selfScraper(scrapeInterval time.Duration) {
var bb bytesutil.ByteBuffer
var rows prometheus.Rows
var metadataRows prometheus.MetadataRows
var mrs []storage.MetricRow
var labels []prompb.Label
t := time.NewTicker(scrapeInterval)
@@ -57,8 +58,12 @@ func selfScraper(scrapeInterval time.Duration) {
appmetrics.WritePrometheusMetrics(&bb)
s := bytesutil.ToUnsafeString(bb.B)
rows.Reset()
// VictoriaMetrics components don't expose metadata yet, only need to parse samples
rows.UnmarshalWithErrLogger(s, nil)
// Parse metrics and optionally metadata when enabled
if prommetadata.IsEnabled() {
rows, metadataRows = prometheus.UnmarshalWithMetadata(rows, metadataRows, s, nil)
} else {
rows.UnmarshalWithErrLogger(s, nil)
}
mrs = mrs[:0]
for i := range rows.Rows {
r := &rows.Rows[i]
@@ -91,6 +96,19 @@ func selfScraper(scrapeInterval time.Duration) {
if err := vmstorage.AddRows(mrs); err != nil {
logger.Errorf("cannot store self-scraped metrics: %s", err)
}
if len(metadataRows.Rows) > 0 {
mms := make([]metricsmetadata.Row, 0, len(metadataRows.Rows))
for _, mm := range metadataRows.Rows {
mms = append(mms, metricsmetadata.Row{
MetricFamilyName: bytesutil.ToUnsafeBytes(mm.Metric),
Help: bytesutil.ToUnsafeBytes(mm.Help),
Type: mm.Type,
})
}
if err := vmstorage.AddMetadataRows(mms); err != nil {
logger.Errorf("cannot store self-scraped metrics metadata: %s", err)
}
}
}
for {
select {

View File

@@ -33,13 +33,13 @@ func PopulateTimeTpl(b []byte, tGlobal time.Time) []byte {
}
switch strings.TrimSpace(parts[0]) {
case `TIME_S`:
return []byte(fmt.Sprintf("%d", t.Unix()))
return fmt.Appendf(nil, "%d", t.Unix())
case `TIME_MSZ`:
return []byte(fmt.Sprintf("%d", t.Unix()*1e3))
return fmt.Appendf(nil, "%d", t.Unix()*1e3)
case `TIME_MS`:
return []byte(fmt.Sprintf("%d", timeToMillis(t)))
return fmt.Appendf(nil, "%d", timeToMillis(t))
case `TIME_NS`:
return []byte(fmt.Sprintf("%d", t.UnixNano()))
return fmt.Appendf(nil, "%d", t.UnixNano())
default:
log.Fatalf("unknown time pattern %s in %s", parts[0], repl)
}

View File

@@ -27,6 +27,9 @@ vmagent-linux-ppc64le-prod:
vmagent-linux-386-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-linux-386
vmagent-linux-s390x-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-linux-s390x
vmagent-darwin-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-darwin-amd64

View File

@@ -49,6 +49,11 @@ func insertRows(at *auth.Token, sketches []*datadogsketches.Sketch, extraLabels
Name: "__name__",
Value: m.Name,
})
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10557
labels = append(labels, prompb.Label{
Name: "host",
Value: sketch.Host,
})
for _, label := range m.Labels {
labels = append(labels, prompb.Label{
Name: label.Name,
@@ -57,9 +62,6 @@ func insertRows(at *auth.Token, sketches []*datadogsketches.Sketch, extraLabels
}
for _, tag := range sketch.Tags {
name, value := datadogutil.SplitTag(tag)
if name == "host" {
name = "exported_host"
}
labels = append(labels, prompb.Label{
Name: name,
Value: value,

View File

@@ -27,6 +27,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/zabbixconnector"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
@@ -74,7 +75,7 @@ var (
"See also -opentsdbHTTPListenAddr.useProxyProtocol")
opentsdbHTTPUseProxyProtocol = flag.Bool("opentsdbHTTPListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted "+
"at -opentsdbHTTPListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt")
configAuthKey = flagutil.NewPassword("configAuthKey", "Authorization key for accessing /config page. It must be passed via authKey query arg. It overrides -httpAuth.*")
configAuthKey = flagutil.NewPassword("configAuthKey", "Authorization key for accessing /config and /remotewrite-.*-config pages. It must be passed via authKey query arg. It overrides -httpAuth.*")
reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed via authKey query arg. It overrides -httpAuth.*")
dryRun = flag.Bool("dryRun", false, "Whether to check config files without running vmagent. The following files are checked: "+
"-promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig, -remoteWrite.streamAggr.config . "+
@@ -244,6 +245,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, "<h2>vmagent</h2>")
fmt.Fprintf(w, "Version %s<br>", buildinfo.Version)
fmt.Fprintf(w, "See docs at <a href='https://docs.victoriametrics.com/victoriametrics/vmagent/'>https://docs.victoriametrics.com/victoriametrics/vmagent/</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
httpserver.WriteAPIHelp(w, [][2]string{
@@ -252,6 +254,8 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
{"metric-relabel-debug", "debug metric relabeling"},
{"api/v1/targets", "advanced information about discovered targets in JSON format"},
{"config", "-promscrape.config contents"},
{"remotewrite-relabel-config", "-remoteWrite.relabelConfig contents"},
{"remotewrite-url-relabel-config", "-remoteWrite.urlRelabelConfig contents"},
{"metrics", "available service metrics"},
{"flags", "command-line flags"},
{"-/reload", "reload configuration"},
@@ -348,6 +352,17 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
firehose.WriteSuccessResponse(w, r)
return true
case "/zabbixconnector/api/v1/history":
zabbixconnectorHistoryRequests.Inc()
if err := zabbixconnector.InsertHandlerForHTTP(nil, r); err != nil {
zabbixconnectorHistoryErrors.Inc()
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusBadRequest)
fmt.Fprintf(w, `{"error":%q}`, err.Error())
return true
}
w.WriteHeader(http.StatusOK)
return true
case "/newrelic":
newrelicCheckRequest.Inc()
w.Header().Set("Content-Type", "application/json")
@@ -477,6 +492,42 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
promscrape.WriteConfigData(&bb)
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%s}}`, stringsutil.JSONString(string(bb.B)))
return true
case "/remotewrite-relabel-config":
if !httpserver.CheckAuthFlag(w, r, configAuthKey) {
return true
}
remoteWriteRelabelConfigRequests.Inc()
w.Header().Set("Content-Type", "text/plain; charset=utf-8")
remotewrite.WriteRelabelConfigData(w)
return true
case "/api/v1/status/remotewrite-relabel-config":
if !httpserver.CheckAuthFlag(w, r, configAuthKey) {
return true
}
remoteWriteStatusRelabelConfigRequests.Inc()
w.Header().Set("Content-Type", "application/json")
var bb bytesutil.ByteBuffer
remotewrite.WriteRelabelConfigData(&bb)
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%s}}`, stringsutil.JSONString(string(bb.B)))
return true
case "/remotewrite-url-relabel-config":
if !httpserver.CheckAuthFlag(w, r, configAuthKey) {
return true
}
remoteWriteURLRelabelConfigRequests.Inc()
w.Header().Set("Content-Type", "text/plain; charset=utf-8")
remotewrite.WriteURLRelabelConfigData(w)
return true
case "/api/v1/status/remotewrite-url-relabel-config":
if !httpserver.CheckAuthFlag(w, r, configAuthKey) {
return true
}
remoteWriteStatusURLRelabelConfigRequests.Inc()
w.Header().Set("Content-Type", "application/json")
var bb bytesutil.ByteBuffer
remotewrite.WriteURLRelabelConfigData(&bb)
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%s}}`, stringsutil.JSONString(string(bb.B)))
return true
case "/prometheus/-/reload", "/-/reload":
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey) {
return true
@@ -606,6 +657,17 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
}
firehose.WriteSuccessResponse(w, r)
return true
case "zabbixconnector/api/v1/history":
zabbixconnectorHistoryRequests.Inc()
if err := zabbixconnector.InsertHandlerForHTTP(at, r); err != nil {
zabbixconnectorHistoryErrors.Inc()
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusBadRequest)
fmt.Fprintf(w, `{"error":%q}`, err.Error())
return true
}
w.WriteHeader(http.StatusOK)
return true
case "newrelic":
newrelicCheckRequest.Inc()
w.Header().Set("Content-Type", "application/json")
@@ -727,6 +789,9 @@ var (
opentelemetryPushRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/opentelemetry/v1/metrics", protocol="opentelemetry"}`)
opentelemetryPushErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/opentelemetry/v1/metrics", protocol="opentelemetry"}`)
zabbixconnectorHistoryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/zabbixconnector/api/v1/history", protocol="zabbixconnector"}`)
zabbixconnectorHistoryErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/zabbixconnector/api/v1/history", protocol="zabbixconnector"}`)
newrelicWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/newrelic/infra/v2/metrics/events/bulk", protocol="newrelic"}`)
newrelicWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/newrelic/infra/v2/metrics/events/bulk", protocol="newrelic"}`)
@@ -747,6 +812,12 @@ var (
promscrapeConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/config"}`)
promscrapeStatusConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/status/config"}`)
remoteWriteRelabelConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/remotewrite-relabel-config"}`)
remoteWriteStatusRelabelConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/status/remotewrite-relabel-config"}`)
remoteWriteURLRelabelConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/remotewrite-url-relabel-config"}`)
remoteWriteStatusURLRelabelConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/status/remotewrite-url-relabel-config"}`)
promscrapeConfigReloadRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/-/reload"}`)
)

View File

@@ -78,7 +78,7 @@ func insertRows(at *auth.Token, rows []newrelic.Row, extraLabels []prompb.Label)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(len(rows))
rowsInserted.Add(samplesCount)
if at != nil {
rowsTenantInserted.Get(at).Add(samplesCount)
}

View File

@@ -2,6 +2,7 @@ package opentelemetry
import (
"fmt"
"io"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
@@ -24,6 +25,13 @@ var (
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="opentelemetry"}`)
)
// InsertHandlerForReader processes metrics from given reader.
func InsertHandlerForReader(at *auth.Token, r io.Reader, encoding string) error {
return stream.ParseStream(r, encoding, nil, func(tss []prompb.TimeSeries, mms []prompb.MetricMetadata) error {
return insertRows(at, tss, mms, nil)
})
}
// InsertHandler processes opentelemetry metrics.
func InsertHandler(at *auth.Token, req *http.Request) error {
extraLabels, err := protoparserutil.GetExtraLabels(req)

View File

@@ -13,19 +13,18 @@ import (
"sync/atomic"
"time"
"github.com/VictoriaMetrics/metrics"
"github.com/golang/snappy"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/awsapi"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding/zstd"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/ratelimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timerpool"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeutil"
"github.com/VictoriaMetrics/metrics"
"github.com/golang/snappy"
)
var (
@@ -203,14 +202,10 @@ func (c *client) init(argIdx, concurrency int, sanitizedURL string) {
c.retriesCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_retries_count_total{url=%q}`, c.sanitizedURL))
c.sendDuration = metrics.GetOrCreateFloatCounter(fmt.Sprintf(`vmagent_remotewrite_send_duration_seconds_total{url=%q}`, c.sanitizedURL))
metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_queues{url=%q}`, c.sanitizedURL), func() float64 {
return float64(*queues)
return float64(concurrency)
})
for i := 0; i < concurrency; i++ {
c.wg.Add(1)
go func() {
defer c.wg.Done()
c.runWorker()
}()
for range concurrency {
c.wg.Go(c.runWorker)
}
logger.Infof("initialized client for -remoteWrite.url=%q", c.sanitizedURL)
}
@@ -295,7 +290,7 @@ func getAWSAPIConfig(argIdx int) (*awsapi.Config, error) {
accessKey := awsAccessKey.GetOptionalArg(argIdx)
secretKey := awsSecretKey.GetOptionalArg(argIdx)
service := awsService.GetOptionalArg(argIdx)
cfg, err := awsapi.NewConfig(ec2Endpoint, stsEndpoint, region, roleARN, accessKey, secretKey, service)
cfg, err := awsapi.NewConfig(ec2Endpoint, stsEndpoint, region, roleARN, accessKey, secretKey, service, "")
if err != nil {
return nil, err
}
@@ -410,8 +405,7 @@ func (c *client) newRequest(url string, body []byte) (*http.Request, error) {
// Otherwise, it tries sending the block to remote storage indefinitely.
func (c *client) sendBlockHTTP(block []byte) bool {
c.rl.Register(len(block))
maxRetryDuration := timeutil.AddJitterToDuration(c.retryMaxInterval)
retryDuration := timeutil.AddJitterToDuration(c.retryMinInterval)
bt := timeutil.NewBackoffTimer(c.retryMinInterval, c.retryMaxInterval)
retriesCount := 0
again:
@@ -420,19 +414,10 @@ again:
c.requestDuration.UpdateDuration(startTime)
if err != nil {
c.errorsCount.Inc()
retryDuration *= 2
if retryDuration > maxRetryDuration {
retryDuration = maxRetryDuration
}
remoteWriteRetryLogger.Warnf("couldn't send a block with size %d bytes to %q: %s; re-sending the block in %.3f seconds",
len(block), c.sanitizedURL, err, retryDuration.Seconds())
t := timerpool.Get(retryDuration)
select {
case <-c.stopCh:
timerpool.Put(t)
remoteWriteRetryLogger.Warnf("couldn't send a block with size %d bytes to %q: %s; re-sending the block in %s",
len(block), c.sanitizedURL, err, bt.CurrentDelay())
if !bt.Wait(c.stopCh) {
return false
case <-t.C:
timerpool.Put(t)
}
c.retriesCount.Inc()
goto again
@@ -498,7 +483,10 @@ again:
// Unexpected status code returned
retriesCount++
retryAfterHeader := parseRetryAfterHeader(resp.Header.Get("Retry-After"))
retryDuration = getRetryDuration(retryAfterHeader, retryDuration, maxRetryDuration)
// retryAfterDuration has the highest priority duration
if retryAfterHeader > 0 {
bt.SetDelay(retryAfterHeader)
}
// Handle response
body, err := io.ReadAll(resp.Body)
@@ -507,15 +495,10 @@ again:
logger.Errorf("cannot read response body from %q during retry #%d: %s", c.sanitizedURL, retriesCount, err)
} else {
logger.Errorf("unexpected status code received after sending a block with size %d bytes to %q during retry #%d: %d; response body=%q; "+
"re-sending the block in %.3f seconds", len(block), c.sanitizedURL, retriesCount, statusCode, body, retryDuration.Seconds())
"re-sending the block in %s", len(block), c.sanitizedURL, retriesCount, statusCode, body, bt.CurrentDelay())
}
t := timerpool.Get(retryDuration)
select {
case <-c.stopCh:
timerpool.Put(t)
if !bt.Wait(c.stopCh) {
return false
case <-t.C:
timerpool.Put(t)
}
c.retriesCount.Inc()
goto again
@@ -524,27 +507,6 @@ again:
var remoteWriteRejectedLogger = logger.WithThrottler("remoteWriteRejected", 5*time.Second)
var remoteWriteRetryLogger = logger.WithThrottler("remoteWriteRetry", 5*time.Second)
// getRetryDuration returns retry duration.
// retryAfterDuration has the highest priority.
// If retryAfterDuration is not specified, retryDuration gets doubled.
// retryDuration can't exceed maxRetryDuration.
//
// Also see: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6097
func getRetryDuration(retryAfterDuration, retryDuration, maxRetryDuration time.Duration) time.Duration {
// retryAfterDuration has the highest priority duration
if retryAfterDuration > 0 {
return timeutil.AddJitterToDuration(retryAfterDuration)
}
// default backoff retry policy
retryDuration *= 2
if retryDuration > maxRetryDuration {
retryDuration = maxRetryDuration
}
return retryDuration
}
// repackBlockFromZstdToSnappy repacks the given zstd-compressed block to snappy-compressed block.
//
// The input block may be corrupted, for example, if vmagent was shut down ungracefully and
@@ -554,9 +516,9 @@ func getRetryDuration(retryAfterDuration, retryDuration, maxRetryDuration time.D
// For more details, see: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9417
func repackBlockFromZstdToSnappy(zstdBlock []byte) ([]byte, error) {
plainBlock := make([]byte, 0, len(zstdBlock)*2)
plainBlock, err := zstd.Decompress(plainBlock, zstdBlock)
plainBlock, err := encoding.DecompressZSTD(plainBlock, zstdBlock)
if err != nil {
return nil, fmt.Errorf("zstd: decompress: %s", err)
return nil, err
}
return snappy.Encode(nil, plainBlock), nil
@@ -575,24 +537,20 @@ func logBlockRejected(block []byte, sanitizedURL string, resp *http.Response) {
}
// parseRetryAfterHeader parses `Retry-After` value retrieved from HTTP response header.
// retryAfterString should be in either HTTP-date or a number of seconds.
// It will return time.Duration(0) if `retryAfterString` does not follow RFC 7231.
func parseRetryAfterHeader(retryAfterString string) (retryAfterDuration time.Duration) {
if retryAfterString == "" {
return retryAfterDuration
//
// s should be in either HTTP-date or a number of seconds.
// It returns time.Duration(0) if s does not follow RFC 7231.
func parseRetryAfterHeader(s string) time.Duration {
if s == "" {
return 0
}
defer func() {
v := retryAfterDuration.Seconds()
logger.Infof("'Retry-After: %s' parsed into %.2f second(s)", retryAfterString, v)
}()
// Retry-After could be in "Mon, 02 Jan 2006 15:04:05 GMT" format.
if parsedTime, err := time.Parse(http.TimeFormat, retryAfterString); err == nil {
if parsedTime, err := time.Parse(http.TimeFormat, s); err == nil {
return time.Duration(time.Until(parsedTime).Seconds()) * time.Second
}
// Retry-After could be in seconds.
if seconds, err := strconv.Atoi(retryAfterString); err == nil {
if seconds, err := strconv.Atoi(s); err == nil {
return time.Duration(seconds) * time.Second
}

View File

@@ -6,66 +6,12 @@ import (
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
"github.com/golang/snappy"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
)
func TestCalculateRetryDuration(t *testing.T) {
// `testFunc` call `calculateRetryDuration` for `n` times
// and evaluate if the result of `calculateRetryDuration` is
// 1. >= expectMinDuration
// 2. <= expectMinDuration + 10% (see timeutil.AddJitterToDuration)
f := func(retryAfterDuration, retryDuration time.Duration, n int, expectMinDuration time.Duration) {
t.Helper()
for i := 0; i < n; i++ {
retryDuration = getRetryDuration(retryAfterDuration, retryDuration, time.Minute)
}
expectMaxDuration := helper(expectMinDuration)
expectMinDuration = expectMinDuration - (1000 * time.Millisecond) // Avoid edge case when calculating time.Until(now)
if retryDuration < expectMinDuration || retryDuration > expectMaxDuration {
t.Fatalf(
"incorrect retry duration, want (ms): [%d, %d], got (ms): %d",
expectMinDuration.Milliseconds(), expectMaxDuration.Milliseconds(),
retryDuration.Milliseconds(),
)
}
}
// Call calculateRetryDuration for 1 time.
{
// default backoff policy
f(0, time.Second, 1, 2*time.Second)
// default backoff policy exceed max limit"
f(0, 10*time.Minute, 1, time.Minute)
// retry after > default backoff policy
f(10*time.Second, 1*time.Second, 1, 10*time.Second)
// retry after < default backoff policy
f(1*time.Second, 10*time.Second, 1, 1*time.Second)
// retry after invalid and < default backoff policy
f(0, time.Second, 1, 2*time.Second)
}
// Call calculateRetryDuration for multiple times.
{
// default backoff policy 2 times
f(0, time.Second, 2, 4*time.Second)
// default backoff policy 3 times
f(0, time.Second, 3, 8*time.Second)
// default backoff policy N times exceed max limit
f(0, time.Second, 10, time.Minute)
// retry after 120s 1 times
f(120*time.Second, time.Second, 1, 120*time.Second)
// retry after 120s 2 times
f(120*time.Second, time.Second, 2, 120*time.Second)
}
}
func TestParseRetryAfterHeader(t *testing.T) {
f := func(retryAfterString string, expectResult time.Duration) {
t.Helper()
@@ -91,11 +37,32 @@ func TestParseRetryAfterHeader(t *testing.T) {
f(time.Now().Add(10*time.Second).Format("Mon, 02 Jan 2006 15:04:05 FAKETZ"), 0)
}
// helper calculate the max possible time duration calculated by timeutil.AddJitterToDuration.
func helper(d time.Duration) time.Duration {
dv := min(d/10, 10*time.Second)
func TestInitSecretFlags(t *testing.T) {
showRemoteWriteURLOrig := *showRemoteWriteURL
defer func() {
*showRemoteWriteURL = showRemoteWriteURLOrig
flagutil.UnregisterAllSecretFlags()
}()
return d + dv
flagutil.UnregisterAllSecretFlags()
*showRemoteWriteURL = false
InitSecretFlags()
if !flagutil.IsSecretFlag("remotewrite.url") {
t.Fatalf("expecting remoteWrite.url to be secret")
}
if !flagutil.IsSecretFlag("remotewrite.headers") {
t.Fatalf("expecting remoteWrite.headers to be secret")
}
flagutil.UnregisterAllSecretFlags()
*showRemoteWriteURL = true
InitSecretFlags()
if flagutil.IsSecretFlag("remotewrite.url") {
t.Fatalf("remoteWrite.url must remain visible when -remoteWrite.showURL is set")
}
if !flagutil.IsSecretFlag("remotewrite.headers") {
t.Fatalf("expecting remoteWrite.headers to remain secret")
}
}
func TestRepackBlockFromZstdToSnappy(t *testing.T) {

View File

@@ -48,11 +48,7 @@ func newPendingSeries(fq *persistentqueue.FastQueue, isVMRemoteWrite *atomic.Boo
ps.wr.significantFigures = significantFigures
ps.wr.roundDigits = roundDigits
ps.stopCh = make(chan struct{})
ps.periodicFlusherWG.Add(1)
go func() {
defer ps.periodicFlusherWG.Done()
ps.periodicFlusher()
}()
ps.periodicFlusherWG.Go(ps.periodicFlusher)
return &ps
}

View File

@@ -51,9 +51,9 @@ func testPushWriteRequest(t *testing.T, rowsCount, expectedBlockLenProm, expecte
func newTestWriteRequest(seriesCount, labelsCount int) *prompb.WriteRequest {
var wr prompb.WriteRequest
for i := 0; i < seriesCount; i++ {
for i := range seriesCount {
var labels []prompb.Label
for j := 0; j < labelsCount; j++ {
for j := range labelsCount {
labels = append(labels, prompb.Label{
Name: fmt.Sprintf("label_%d_%d", i, j),
Value: fmt.Sprintf("value_%d_%d", i, j),

View File

@@ -3,22 +3,24 @@ package remotewrite
import (
"flag"
"fmt"
"io"
"strconv"
"strings"
"sync"
"sync/atomic"
"github.com/VictoriaMetrics/metrics"
"gopkg.in/yaml.v2"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/metrics"
)
var (
unparsedLabelsGlobal = flagutil.NewArrayString("remoteWrite.label", "Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. "+
"Pass multiple -remoteWrite.label flags in order to add multiple labels to metrics before sending them to remote storage")
unparsedLabelsGlobal = flagutil.NewArrayString("remoteWrite.label", "Optional label in the form 'name=value' to add to all the metrics before sending them to all -remoteWrite.url.")
relabelConfigPathGlobal = flag.String("remoteWrite.relabelConfig", "", "Optional path to file with relabeling configs, which are applied "+
"to all the metrics before sending them to -remoteWrite.url. See also -remoteWrite.urlRelabelConfig. "+
"The path can point either to local file or to http url. "+
@@ -32,9 +34,12 @@ var (
"See https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels")
)
var labelsGlobal []prompb.Label
var (
labelsGlobal []prompb.Label
remoteWriteRelabelConfigData atomic.Pointer[[]byte]
remoteWriteURLRelabelConfigData atomic.Pointer[[]any]
relabelConfigReloads *metrics.Counter
relabelConfigReloadErrors *metrics.Counter
relabelConfigSuccess *metrics.Gauge
@@ -67,6 +72,42 @@ func initRelabelConfigs() {
}
}
// WriteRelabelConfigData writes -remoteWrite.relabelConfig contents to w
func WriteRelabelConfigData(w io.Writer) {
p := remoteWriteRelabelConfigData.Load()
if p == nil {
// Nothing to write to w
return
}
_, _ = w.Write(*p)
}
// WriteURLRelabelConfigData writes -remoteWrite.urlRelabelConfig contents to w
func WriteURLRelabelConfigData(w io.Writer) {
p := remoteWriteURLRelabelConfigData.Load()
if p == nil {
// Nothing to write to w
return
}
type urlRelabelCfg struct {
Url string `yaml:"url"`
RelabelConfig any `yaml:"relabel_config"`
}
var cs []urlRelabelCfg
for i, url := range *remoteWriteURLs {
cfgData := (*p)[i]
if !*showRemoteWriteURL {
url = fmt.Sprintf("%d:secret-url", i+1)
}
cs = append(cs, urlRelabelCfg{
Url: url,
RelabelConfig: cfgData,
})
}
d, _ := yaml.Marshal(cs)
_, _ = w.Write(d)
}
func reloadRelabelConfigs() {
rcs := allRelabelConfigs.Load()
if !rcs.isSet() {
@@ -90,28 +131,43 @@ func reloadRelabelConfigs() {
func loadRelabelConfigs() (*relabelConfigs, error) {
var rcs relabelConfigs
if *relabelConfigPathGlobal != "" {
global, err := promrelabel.LoadRelabelConfigs(*relabelConfigPathGlobal)
global, rawCfg, err := promrelabel.LoadRelabelConfigs(*relabelConfigPathGlobal)
if err != nil {
return nil, fmt.Errorf("cannot load -remoteWrite.relabelConfig=%q: %w", *relabelConfigPathGlobal, err)
}
remoteWriteRelabelConfigData.Store(&rawCfg)
rcs.global = global
}
if len(*relabelConfigPaths) > len(*remoteWriteURLs) {
return nil, fmt.Errorf("too many -remoteWrite.urlRelabelConfig args: %d; it mustn't exceed the number of -remoteWrite.url args: %d",
len(*relabelConfigPaths), (len(*remoteWriteURLs)))
}
var urlRelabelCfgs []any
rcs.perURL = make([]*promrelabel.ParsedConfigs, len(*remoteWriteURLs))
for i, path := range *relabelConfigPaths {
if len(path) == 0 {
// Skip empty relabel config.
urlRelabelCfgs = append(urlRelabelCfgs, nil)
continue
}
prc, err := promrelabel.LoadRelabelConfigs(path)
prc, rawCfg, err := promrelabel.LoadRelabelConfigs(path)
if err != nil {
return nil, fmt.Errorf("cannot load relabel configs from -remoteWrite.urlRelabelConfig=%q: %w", path, err)
}
rcs.perURL[i] = prc
var parsedCfg any
_ = yaml.Unmarshal(rawCfg, &parsedCfg)
urlRelabelCfgs = append(urlRelabelCfgs, parsedCfg)
}
if len(*remoteWriteURLs) > len(*relabelConfigPaths) {
// fill the urlRelabelCfgs with empty relabel configs if not set
for i := len(*relabelConfigPaths); i < len(*remoteWriteURLs); i++ {
urlRelabelCfgs = append(urlRelabelCfgs, nil)
}
}
remoteWriteURLRelabelConfigData.Store(&urlRelabelCfgs)
return &rcs, nil
}
@@ -120,19 +176,9 @@ type relabelConfigs struct {
perURL []*promrelabel.ParsedConfigs
}
// isSet indicates whether (global or per-URL) command-line flags is set
func (rcs *relabelConfigs) isSet() bool {
if rcs == nil {
return false
}
if rcs.global.Len() > 0 {
return true
}
for _, pc := range rcs.perURL {
if pc.Len() > 0 {
return true
}
}
return false
return *relabelConfigPathGlobal != "" || len(*relabelConfigPaths) > 0
}
// initLabelsGlobal must be called after parsing command-line flags.

View File

@@ -3,6 +3,7 @@ package remotewrite
import (
"flag"
"fmt"
"math"
"net/http"
"net/url"
"path/filepath"
@@ -11,6 +12,10 @@ import (
"sync/atomic"
"time"
"github.com/cespare/xxhash/v2"
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bloomfilter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
@@ -23,14 +28,14 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/ratelimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/slicesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/streamaggr"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeserieslimits"
"github.com/VictoriaMetrics/metrics"
"github.com/cespare/xxhash/v2"
)
var (
@@ -58,7 +63,7 @@ var (
"See also -remoteWrite.maxDiskUsagePerURL and -remoteWrite.disableOnDiskQueue")
keepDanglingQueues = flag.Bool("remoteWrite.keepDanglingQueues", false, "Keep persistent queues contents at -remoteWrite.tmpDataPath in case there are no matching -remoteWrite.url. "+
"Useful when -remoteWrite.url is changed temporarily and persistent queue files will be needed later on.")
queues = flag.Int("remoteWrite.queues", cgroup.AvailableCPUs()*2, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
queues = flagutil.NewArrayInt("remoteWrite.queues", cgroup.AvailableCPUs()*2, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
"isn't enough for sending high volume of collected data to remote storage. "+
"Default value depends on the number of available CPU cores. It should work fine in most cases since it minimizes resource usage")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
@@ -79,10 +84,14 @@ var (
`This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. `+
`For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}`+
`Enabled sorting for labels can slow down ingestion performance a bit`)
maxHourlySeries = flag.Int("remoteWrite.maxHourlySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last hour. "+
"Excess series are logged and dropped. This can be useful for limiting series cardinality. See https://docs.victoriametrics.com/victoriametrics/vmagent/#cardinality-limiter")
maxDailySeries = flag.Int("remoteWrite.maxDailySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. "+
"Excess series are logged and dropped. This can be useful for limiting series churn rate. See https://docs.victoriametrics.com/victoriametrics/vmagent/#cardinality-limiter")
maxHourlySeries = flag.Int64("remoteWrite.maxHourlySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last hour. "+
"Excess series are logged and dropped. This can be useful for limiting series cardinality. "+
fmt.Sprintf("Setting this flag to '-1' sets limit to maximum possible value (%d) which is useful in order to enable series tracking without enforcing limits. ", math.MaxInt32)+
"See https://docs.victoriametrics.com/victoriametrics/vmagent/#cardinality-limiter")
maxDailySeries = flag.Int64("remoteWrite.maxDailySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. "+
"Excess series are logged and dropped. This can be useful for limiting series churn rate. "+
fmt.Sprintf("Setting this flag to '-1' sets limit to maximum possible value (%d) which is useful in order to enable series tracking without enforcing limits. ", math.MaxInt32)+
"See https://docs.victoriametrics.com/victoriametrics/vmagent/#cardinality-limiter")
maxIngestionRate = flag.Int("maxIngestionRate", 0, "The maximum number of samples vmagent can receive per second. Data ingestion is paused when the limit is exceeded. "+
"By default there are no limits on samples ingestion rate. See also -remoteWrite.rateLimit")
@@ -91,6 +100,8 @@ var (
"See https://docs.victoriametrics.com/victoriametrics/vmagent/#disabling-on-disk-persistence . See also -remoteWrite.dropSamplesOnOverload")
dropSamplesOnOverload = flag.Bool("remoteWrite.dropSamplesOnOverload", false, "Whether to drop samples when -remoteWrite.disableOnDiskQueue is set and if the samples "+
"cannot be pushed into the configured -remoteWrite.url systems in a timely manner. See https://docs.victoriametrics.com/victoriametrics/vmagent/#disabling-on-disk-persistence")
disableMetadataPerURL = flagutil.NewArrayBool("remoteWrite.disableMetadata", "Whether to disable sending metadata to the corresponding -remoteWrite.url. "+
"By default, metadata sending is controlled by the global -enableMetadata flag")
)
var (
@@ -140,6 +151,8 @@ func InitSecretFlags() {
// remoteWrite.url can contain authentication codes, so hide it at `/metrics` output.
flagutil.RegisterSecretFlag("remoteWrite.url")
}
// remoteWrite.headers can contain auth headers such as Authorization and API keys.
flagutil.RegisterSecretFlag("remoteWrite.headers")
}
var (
@@ -156,8 +169,8 @@ func Init() {
if len(*remoteWriteURLs) == 0 {
logger.Fatalf("at least one `-remoteWrite.url` command-line flag must be set")
}
if *maxHourlySeries > 0 {
hourlySeriesLimiter = bloomfilter.NewLimiter(*maxHourlySeries, time.Hour)
if limit := getMaxHourlySeries(); limit > 0 {
hourlySeriesLimiter = bloomfilter.NewLimiter(limit, time.Hour)
_ = metrics.NewGauge(`vmagent_hourly_series_limit_max_series`, func() float64 {
return float64(hourlySeriesLimiter.MaxItems())
})
@@ -165,8 +178,8 @@ func Init() {
return float64(hourlySeriesLimiter.CurrentItems())
})
}
if *maxDailySeries > 0 {
dailySeriesLimiter = bloomfilter.NewLimiter(*maxDailySeries, 24*time.Hour)
if limit := getMaxDailySeries(); limit > 0 {
dailySeriesLimiter = bloomfilter.NewLimiter(limit, 24*time.Hour)
_ = metrics.NewGauge(`vmagent_daily_series_limit_max_series`, func() float64 {
return float64(dailySeriesLimiter.MaxItems())
})
@@ -175,13 +188,6 @@ func Init() {
})
}
if *queues > maxQueues {
*queues = maxQueues
}
if *queues <= 0 {
*queues = 1
}
if len(*shardByURLLabels) > 0 && len(*shardByURLIgnoreLabels) > 0 {
logger.Fatalf("-remoteWrite.shardByURL.labels and -remoteWrite.shardByURL.ignoreLabels cannot be set simultaneously; " +
"see https://docs.victoriametrics.com/victoriametrics/vmagent/#sharding-among-remote-storages")
@@ -214,9 +220,7 @@ func Init() {
dropDanglingQueues()
// Start config reloader.
configReloaderWG.Add(1)
go func() {
defer configReloaderWG.Done()
configReloaderWG.Go(func() {
for {
select {
case <-configReloaderStopCh:
@@ -226,7 +230,7 @@ func Init() {
reloadRelabelConfigs()
reloadStreamAggrConfigs()
}
}()
})
}
func dropDanglingQueues() {
@@ -266,17 +270,6 @@ func initRemoteWriteCtxs(urls []string) {
if len(urls) == 0 {
logger.Panicf("BUG: urls must be non-empty")
}
maxInmemoryBlocks := memory.Allowed() / len(urls) / *maxRowsPerBlock / 100
if maxInmemoryBlocks / *queues > 100 {
// There is no much sense in keeping higher number of blocks in memory,
// since this means that the producer outperforms consumer and the queue
// will continue growing. It is better storing the queue to file.
maxInmemoryBlocks = 100 * *queues
}
if maxInmemoryBlocks < 2 {
maxInmemoryBlocks = 2
}
rwctxs := make([]*remoteWriteCtx, len(urls))
rwctxIdx := make([]int, len(urls))
if retryMaxTime.String() != "" {
@@ -291,7 +284,7 @@ func initRemoteWriteCtxs(urls []string) {
if *showRemoteWriteURL {
sanitizedURL = fmt.Sprintf("%d:%s", i+1, remoteWriteURL)
}
rwctxs[i] = newRemoteWriteCtx(i, remoteWriteURL, maxInmemoryBlocks, sanitizedURL)
rwctxs[i] = newRemoteWriteCtx(i, remoteWriteURL, sanitizedURL)
rwctxIdx[i] = i
}
@@ -485,6 +478,9 @@ func tryPush(at *auth.Token, wr *prompb.WriteRequest, forceDropSamplesOnFailure
matchIdxs.B = sas.Push(tssBlock, matchIdxs.B)
if !*streamAggrGlobalKeepInput {
tssBlock = dropAggregatedSeries(tssBlock, matchIdxs.B, *streamAggrGlobalDropInput)
} else if *streamAggrGlobalDropInput {
// if both keep_input and drop_input are true, we keep only the aggregated series
tssBlock = dropUnaggregatedSeries(tssBlock, matchIdxs.B)
}
matchIdxsPool.Put(matchIdxs)
}
@@ -554,11 +550,13 @@ func tryPushMetadataToRemoteStorages(rwctxs []*remoteWriteCtx, mms []prompb.Metr
// Push metadata to remote storage systems in parallel to reduce
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
var anyPushFailed atomic.Bool
for _, rwctx := range rwctxs {
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
if !rwctx.enableMetadata {
// Skip remote storage with disabled metadata
continue
}
wg.Go(func() {
if !rwctx.tryPushMetadataInternal(mms) {
rwctx.pushFailures.Inc()
if forceDropSamplesOnFailure {
@@ -567,7 +565,7 @@ func tryPushMetadataToRemoteStorages(rwctxs []*remoteWriteCtx, mms []prompb.Metr
}
anyPushFailed.Store(true)
}
}(rwctx)
})
}
wg.Wait()
return !anyPushFailed.Load()
@@ -599,15 +597,13 @@ func tryPushTimeSeriesToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prom
// Push tssBlock to remote storage systems in parallel to reduce
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
var anyPushFailed atomic.Bool
for _, rwctx := range rwctxs {
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
wg.Go(func() {
if !rwctx.TryPushTimeSeries(tssBlock, forceDropSamplesOnFailure) {
anyPushFailed.Store(true)
}
}(rwctx)
})
}
wg.Wait()
return !anyPushFailed.Load()
@@ -629,13 +625,11 @@ func tryShardingTimeSeriesAmongRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock
if len(shard) == 0 {
continue
}
wg.Add(1)
go func(rwctx *remoteWriteCtx, tss []prompb.TimeSeries) {
defer wg.Done()
if !rwctx.TryPushTimeSeries(tss, forceDropSamplesOnFailure) {
wg.Go(func() {
if !rwctx.TryPushTimeSeries(shard, forceDropSamplesOnFailure) {
anyPushFailed.Store(true)
}
}(rwctx, shard)
})
}
wg.Wait()
return !anyPushFailed.Load()
@@ -833,6 +827,11 @@ type remoteWriteCtx struct {
streamAggrKeepInput bool
streamAggrDropInput bool
// enableMetadata indicates whether metadata should be sent to this remote storage.
// It is determined by -remoteWrite.enableMetadata per-URL flag if set,
// otherwise by the global -enableMetadata flag.
enableMetadata bool
pss []*pendingSeries
pssNextIdx atomic.Uint64
@@ -844,7 +843,19 @@ type remoteWriteCtx struct {
rowsDroppedOnPushFailure *metrics.Counter
}
func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks int, sanitizedURL string) *remoteWriteCtx {
// isMetadataEnabledForURL returns true if metadata should be sent to the remote storage at argIdx.
// It checks the per-URL -remoteWrite.disableMetadata flag first.
// If not set, it falls back to the global -enableMetadata flag.
func isMetadataEnabledForURL(argIdx int) bool {
if disableMetadataPerURL.GetOptionalArg(argIdx) {
// Metadata is explicitly disabled for this URL
return false
}
// Use global -enableMetadata value
return prommetadata.IsEnabled()
}
func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, sanitizedURL string) *remoteWriteCtx {
// strip query params, otherwise changing params resets pq
pqURL := *remoteWriteURL
pqURL.RawQuery = ""
@@ -859,6 +870,23 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
}
isPQDisabled := disableOnDiskQueue.GetOptionalArg(argIdx)
queuesSize := queues.GetOptionalArg(argIdx)
if queuesSize > maxQueues {
queuesSize = maxQueues
} else if queuesSize <= 0 {
queuesSize = 1
}
maxInmemoryBlocks := memory.Allowed() / len(*remoteWriteURLs) / *maxRowsPerBlock / 100
if maxInmemoryBlocks/queuesSize > 100 {
// There is no much sense in keeping higher number of blocks in memory,
// since this means that the producer outperforms consumer and the queue
// will continue growing. It is better storing the queue to file.
maxInmemoryBlocks = 100 * queuesSize
}
if maxInmemoryBlocks < 2 {
maxInmemoryBlocks = 2
}
fq := persistentqueue.MustOpenFastQueue(queuePath, sanitizedURL, maxInmemoryBlocks, maxPendingBytes, isPQDisabled)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
return float64(fq.GetPendingBytes())
@@ -876,16 +904,16 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
var c *client
switch remoteWriteURL.Scheme {
case "http", "https":
c = newHTTPClient(argIdx, remoteWriteURL.String(), sanitizedURL, fq, *queues)
c = newHTTPClient(argIdx, remoteWriteURL.String(), sanitizedURL, fq, queuesSize)
default:
logger.Fatalf("unsupported scheme: %s for remoteWriteURL: %s, want `http`, `https`", remoteWriteURL.Scheme, sanitizedURL)
}
c.init(argIdx, *queues, sanitizedURL)
c.init(argIdx, queuesSize, sanitizedURL)
// Initialize pss
sf := significantFigures.GetOptionalArg(argIdx)
rd := roundDigits.GetOptionalArg(argIdx)
pssLen := *queues
pssLen := queuesSize
if n := cgroup.AvailableCPUs(); pssLen > n {
// There is no sense in running more than availableCPUs concurrent pendingSeries,
// since every pendingSeries can saturate up to a single CPU.
@@ -897,10 +925,11 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
}
rwctx := &remoteWriteCtx{
idx: argIdx,
fq: fq,
c: c,
pss: pss,
idx: argIdx,
fq: fq,
c: c,
pss: pss,
enableMetadata: isMetadataEnabledForURL(argIdx),
rowsPushedAfterRelabel: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_rows_pushed_after_relabel_total{path=%q,url=%q}`, queuePath, sanitizedURL)),
rowsDroppedByRelabel: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_relabel_metrics_dropped_total{path=%q,url=%q}`, queuePath, sanitizedURL)),
@@ -988,7 +1017,17 @@ func (rwctx *remoteWriteCtx) TryPushTimeSeries(tss []prompb.TimeSeries, forceDro
tss = append(*v, tss...)
}
tss = dropAggregatedSeries(tss, matchIdxs.B, rwctx.streamAggrDropInput)
} else if rwctx.streamAggrDropInput {
// if both keep_input and drop_input are true, we keep only the aggregated series
if rctx == nil {
rctx = getRelabelCtx()
// Make a copy of tss before dropping aggregated series
v = tssPool.Get().(*[]prompb.TimeSeries)
tss = append(*v, tss...)
}
tss = dropUnaggregatedSeries(tss, matchIdxs.B)
}
matchIdxsPool.Put(matchIdxs)
}
if rwctx.deduplicator != nil {
@@ -1011,9 +1050,10 @@ func (rwctx *remoteWriteCtx) TryPushTimeSeries(tss []prompb.TimeSeries, forceDro
return false
}
var matchIdxsPool bytesutil.ByteBufferPool
var matchIdxsPool slicesutil.BufferPool[uint32]
func dropAggregatedSeries(src []prompb.TimeSeries, matchIdxs []byte, dropInput bool) []prompb.TimeSeries {
// dropAggregatedSeries drops matched series, also the unmatched if dropInput is true.
func dropAggregatedSeries(src []prompb.TimeSeries, matchIdxs []uint32, dropInput bool) []prompb.TimeSeries {
dst := src[:0]
if !dropInput {
for i, match := range matchIdxs {
@@ -1028,6 +1068,20 @@ func dropAggregatedSeries(src []prompb.TimeSeries, matchIdxs []byte, dropInput b
return dst
}
// dropUnaggregatedSeries drops unmatched series.
func dropUnaggregatedSeries(src []prompb.TimeSeries, matchIdxs []uint32) []prompb.TimeSeries {
dst := src[:0]
for i, match := range matchIdxs {
if match == 0 {
continue
}
dst = append(dst, src[i])
}
tail := src[len(dst):]
clear(tail)
return dst
}
func (rwctx *remoteWriteCtx) pushInternalTrackDropped(tss []prompb.TimeSeries) {
if rwctx.tryPushTimeSeriesInternal(tss) {
return
@@ -1060,7 +1114,7 @@ func (rwctx *remoteWriteCtx) tryPushTimeSeriesInternal(tss []prompb.TimeSeries)
}()
if len(labelsGlobal) > 0 {
// Make a copy of tss before adding extra labels in order to prevent
// Make a copy of tss before adding extra labels to prevent
// from affecting time series for other remoteWrite.url configs.
rctx = getRelabelCtx()
v = tssPool.Get().(*[]prompb.TimeSeries)
@@ -1096,3 +1150,21 @@ func newMapFromStrings(a []string) map[string]struct{} {
}
return m
}
func getMaxHourlySeries() int {
limit := *maxHourlySeries
if limit == -1 || limit > math.MaxInt32 {
return math.MaxInt32
}
return int(limit)
}
func getMaxDailySeries() int {
limit := *maxDailySeries
if limit == -1 || limit > math.MaxInt32 {
return math.MaxInt32
}
return int(limit)
}

View File

@@ -10,6 +10,8 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/consistenthash"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
@@ -26,12 +28,12 @@ func TestGetLabelsHash_Distribution(t *testing.T) {
itemsCount := 1_000 * bucketsCount
m := make([]int, bucketsCount)
var labels []prompb.Label
for i := 0; i < itemsCount; i++ {
for i := range itemsCount {
labels = append(labels[:0], prompb.Label{
Name: "__name__",
Value: fmt.Sprintf("some_name_%d", i),
})
for j := 0; j < 10; j++ {
for j := range 10 {
labels = append(labels, prompb.Label{
Name: fmt.Sprintf("label_%d", j),
Value: fmt.Sprintf("value_%d_%d", i, j),
@@ -57,8 +59,8 @@ func TestGetLabelsHash_Distribution(t *testing.T) {
f(10)
}
func TestRemoteWriteContext_TryPush_ImmutableTimeseries(t *testing.T) {
f := func(streamAggrConfig, relabelConfig string, enableWindows bool, dedupInterval time.Duration, keepInput, dropInput bool, input string) {
func TestRemoteWriteContext_TryPushTimeSeries(t *testing.T) {
f := func(streamAggrConfig, relabelConfig string, enableWindows bool, dedupInterval time.Duration, keepInput, dropInput bool, input string, expectedRowsPushedAfterRelabel, expectedPushedSample int) {
t.Helper()
perURLRelabel, err := promrelabel.ParseRelabelConfigsData([]byte(relabelConfig))
if err != nil {
@@ -71,10 +73,16 @@ func TestRemoteWriteContext_TryPush_ImmutableTimeseries(t *testing.T) {
}
allRelabelConfigs.Store(rcs)
path := "fast-queue-write-test"
fs.MustRemoveDir(path)
fq := persistentqueue.MustOpenFastQueue(path, "test", 100, 0, false)
defer fs.MustRemoveDir(path)
defer fq.MustClose()
pss := make([]*pendingSeries, 1)
isVMProto := &atomic.Bool{}
isVMProto.Store(true)
pss[0] = newPendingSeries(nil, isVMProto, 0, 100)
pss[0] = newPendingSeries(fq, isVMProto, 0, 100)
rwctx := &remoteWriteCtx{
idx: 0,
streamAggrKeepInput: keepInput,
@@ -83,6 +91,8 @@ func TestRemoteWriteContext_TryPush_ImmutableTimeseries(t *testing.T) {
rowsPushedAfterRelabel: metrics.GetOrCreateCounter(`foo`),
rowsDroppedByRelabel: metrics.GetOrCreateCounter(`bar`),
}
defer metrics.UnregisterAllMetrics()
if dedupInterval > 0 {
rwctx.deduplicator = streamaggr.NewDeduplicator(nil, enableWindows, dedupInterval, nil, "dedup-global")
}
@@ -104,23 +114,27 @@ func TestRemoteWriteContext_TryPush_ImmutableTimeseries(t *testing.T) {
inputTss := prometheus.MustParsePromMetrics(input, offsetMsecs)
expectedTss := make([]prompb.TimeSeries, len(inputTss))
// copy inputTss to make sure it is not mutated during TryPush call
// check inputTss is not modified after TryPushTimeSeries
copy(expectedTss, inputTss)
if !rwctx.TryPushTimeSeries(inputTss, false) {
t.Fatalf("cannot push samples to rwctx")
}
if int(rwctx.rowsPushedAfterRelabel.Get()) != expectedRowsPushedAfterRelabel {
t.Fatalf("unexpected number of rows after relabel; got %d; want %d", rwctx.rowsPushedAfterRelabel.Get(), expectedRowsPushedAfterRelabel)
}
if len(pss[0].wr.tss) != expectedPushedSample {
t.Fatalf("unexpected number of pushed samples; got %d; want %d", len(pss[0].wr.tss), expectedPushedSample)
}
if !reflect.DeepEqual(expectedTss, inputTss) {
t.Fatalf("unexpected samples;\ngot\n%v\nwant\n%v", inputTss, expectedTss)
}
}
f(`
- interval: 1m
outputs: [sum_samples]
- interval: 2m
outputs: [count_series]
`, `
// relabeling
f(``, `
- action: keep
source_labels: [env]
regex: "dev"
@@ -129,53 +143,66 @@ metric{env="dev"} 10
metric{env="bar"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`)
`, 2, 2)
// relabeling + aggregation
f(`
- match: '{env="dev"}'
interval: 1m
outputs: [sum_samples]
`, `
- action: keep
source_labels: [env]
regex: ".*"
`, false, 0, false, false, `
metric{env="dev"} 10
metric{env="bar"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`, 4, 2)
// aggregation + keepInput
f(`
- match: '{env="dev"}'
interval: 1m
outputs: [sum_samples]
`, ``, false, 0, true, false, `
metric{env="dev"} 10
metric{env="bar"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`, 4, 4)
// aggregation + dropInput
f(`
- match: '{env="dev"}'
interval: 1m
outputs: [sum_samples]
`, ``, false, 0, false, true, `
metric{env="dev"} 10
metric{env="bar"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`, 4, 0)
// aggregation + keepInput + dropInput
f(`
- match: '{env="dev"}'
interval: 1m
outputs: [sum_samples]
`, ``, false, 0, true, true, `
metric{env="dev"} 10
metric{env="bar"} 20
metric{env="bar"} 25
`, 3, 1)
// aggregation + deduplication
f(``, ``, true, time.Hour, false, false, `
metric{env="dev"} 10
metric{env="foo"} 20
metric{env="dev"} 15
metric{env="foo"} 25
`)
f(``, `
- action: keep
source_labels: [env]
regex: "dev"
`, true, time.Hour, false, false, `
metric{env="dev"} 10
metric{env="bar"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`)
f(``, `
- action: keep
source_labels: [env]
regex: "dev"
`, true, time.Hour, true, false, `
metric{env="test"} 10
metric{env="dev"} 20
metric{env="foo"} 15
metric{env="dev"} 25
`)
f(``, `
- action: keep
source_labels: [env]
regex: "dev"
`, true, time.Hour, false, true, `
metric{env="foo"} 10
metric{env="dev"} 20
metric{env="foo"} 15
metric{env="dev"} 25
`)
f(``, `
- action: keep
source_labels: [env]
regex: "dev"
`, true, time.Hour, true, true, `
metric{env="dev"} 10
metric{env="test"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`)
`, 4, 0)
}
func TestShardAmountRemoteWriteCtx(t *testing.T) {
@@ -221,7 +248,7 @@ func TestShardAmountRemoteWriteCtx(t *testing.T) {
seriesCount := 100000
// build 1000000 series
tssBlock := make([]prompb.TimeSeries, 0, seriesCount)
for i := 0; i < seriesCount; i++ {
for i := range seriesCount {
tssBlock = append(tssBlock, prompb.TimeSeries{
Labels: []prompb.Label{
{
@@ -242,7 +269,7 @@ func TestShardAmountRemoteWriteCtx(t *testing.T) {
// build active time series set
nodes := make([]string, 0, remoteWriteCount)
activeTimeSeriesByNodes := make([]map[string]struct{}, remoteWriteCount)
for i := 0; i < remoteWriteCount; i++ {
for i := range remoteWriteCount {
nodes = append(nodes, fmt.Sprintf("node%d", i))
activeTimeSeriesByNodes[i] = make(map[string]struct{})
}

View File

@@ -18,12 +18,12 @@ var (
streamAggrGlobalConfig = flag.String("streamAggr.config", "", "Optional path to file with stream aggregation config. "+
"See https://docs.victoriametrics.com/victoriametrics/stream-aggregation/ . "+
"See also -streamAggr.keepInput, -streamAggr.dropInput and -streamAggr.dedupInterval")
streamAggrGlobalKeepInput = flag.Bool("streamAggr.keepInput", false, "Whether to keep all the input samples after the aggregation "+
"with -streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples "+
"are written to remote storages write. See also -streamAggr.dropInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrGlobalDropInput = flag.Bool("streamAggr.dropInput", false, "Whether to drop all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples "+
"are written to remote storages write. See also -streamAggr.keepInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrGlobalKeepInput = flag.Bool("streamAggr.keepInput", false, "Whether to keep input samples that match any rule in "+
"-streamAggr.config. By default, matched raw samples are aggregated and dropped, while unmatched samples "+
"are written to the remote storage. See also -streamAggr.dropInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrGlobalDropInput = flag.Bool("streamAggr.dropInput", false, "Whether to drop input samples that not matching any rule in "+
"-streamAggr.config. By default, only matched raw samples are dropped, while unmatched samples "+
"are written to the remote storage. See also -streamAggr.keepInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrGlobalDedupInterval = flag.Duration("streamAggr.dedupInterval", 0, "Input samples are de-duplicated with this interval on "+
"aggregator before optional aggregation with -streamAggr.config . "+
"See also -dedup.minScrapeInterval and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#deduplication")
@@ -43,11 +43,11 @@ var (
streamAggrConfig = flagutil.NewArrayString("remoteWrite.streamAggr.config", "Optional path to file with stream aggregation config for the corresponding -remoteWrite.url. "+
"See https://docs.victoriametrics.com/victoriametrics/stream-aggregation/ . "+
"See also -remoteWrite.streamAggr.keepInput, -remoteWrite.streamAggr.dropInput and -remoteWrite.streamAggr.dedupInterval")
streamAggrDropInput = flagutil.NewArrayBool("remoteWrite.streamAggr.dropInput", "Whether to drop all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config at the corresponding -remoteWrite.url. By default, only aggregates samples are dropped, while the remaining samples "+
streamAggrDropInput = flagutil.NewArrayBool("remoteWrite.streamAggr.dropInput", "Whether to drop input samples that not matching any rule in "+
"the corresponding -remoteWrite.streamAggr.config. By default, only matched raw samples are dropped, while unmatched samples "+
"are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.keepInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrKeepInput = flagutil.NewArrayBool("remoteWrite.streamAggr.keepInput", "Whether to keep all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config at the corresponding -remoteWrite.url. By default, only aggregates samples are dropped, while the remaining samples "+
streamAggrKeepInput = flagutil.NewArrayBool("remoteWrite.streamAggr.keepInput", "Whether to keep input samples that match any rule in "+
"the corresponding -remoteWrite.streamAggr.config. By default, matched raw samples are aggregated and dropped, while unmatched samples "+
"are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.dropInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrDedupInterval = flagutil.NewArrayDuration("remoteWrite.streamAggr.dedupInterval", 0, "Input samples are de-duplicated with this interval before optional aggregation "+
"with -remoteWrite.streamAggr.config at the corresponding -remoteWrite.url. See also -dedup.minScrapeInterval and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#deduplication")

View File

@@ -0,0 +1,80 @@
package zabbixconnector
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/zabbixconnector"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/zabbixconnector/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="zabbixconnector"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="zabbixconnector"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="zabbixconnector"}`)
)
// InsertHandlerForHTTP processes remote write for ZabbixConnector POST /zabbixconnector/v1/history request.
func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
extraLabels, err := protoparserutil.GetExtraLabels(req)
if err != nil {
return err
}
encoding := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, encoding, func(rows []zabbixconnector.Row) error {
return insertRows(at, rows, extraLabels)
})
}
func insertRows(at *auth.Token, rows []zabbixconnector.Row, extraLabels []prompb.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := len(rows)
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompb.Label{
Name: bytesutil.ToUnsafeString(tag.Key),
Value: bytesutil.ToUnsafeString(tag.Value),
})
}
labels = append(labels, extraLabels...)
samplesLen := len(samples)
samples = append(samples, prompb.Sample{
Value: r.Value,
Timestamp: r.Timestamp,
})
tssDst = append(tssDst, prompb.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
}
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

View File

@@ -27,6 +27,9 @@ vmalert-tool-linux-ppc64le-prod:
vmalert-tool-linux-386-prod:
APP_NAME=vmalert-tool $(MAKE) app-via-docker-linux-386
vmalert-tool-linux-s390x-prod:
APP_NAME=vmalert-tool $(MAKE) app-via-docker-linux-s390x
vmalert-tool-darwin-amd64-prod:
APP_NAME=vmalert-tool $(MAKE) app-via-docker-darwin-amd64

View File

@@ -41,7 +41,7 @@ func TestParseInputValue_Success(t *testing.T) {
if len(outputExpected) != len(output) {
t.Fatalf("unexpected output length; got %d; want %d", len(outputExpected), len(output))
}
for i := 0; i < len(outputExpected); i++ {
for i := range outputExpected {
if outputExpected[i].Omitted != output[i].Omitted {
t.Fatalf("unexpected Omitted field in the output\ngot\n%v\nwant\n%v", output, outputExpected)
}

View File

@@ -4,6 +4,7 @@ import (
"context"
"flag"
"fmt"
"maps"
"net"
"net/http"
"net/http/httptest"
@@ -12,6 +13,7 @@ import (
"os/signal"
"path/filepath"
"reflect"
"slices"
"sort"
"strings"
"syscall"
@@ -132,7 +134,7 @@ func UnitTest(files []string, disableGroupLabel bool, externalLabels []string, e
}
labels[s[:n]] = s[n+1:]
}
_, err = notifier.Init(labels, externalURL)
err = notifier.Init(labels, externalURL)
if err != nil {
logger.Fatalf("failed to init notifier: %v", err)
}
@@ -348,9 +350,7 @@ func (tg *testGroup) test(evalInterval time.Duration, groupOrderMap map[string]i
for k := range alertEvalTimesMap {
alertEvalTimes = append(alertEvalTimes, k)
}
sort.Slice(alertEvalTimes, func(i, j int) bool {
return alertEvalTimes[i] < alertEvalTimes[j]
})
slices.Sort(alertEvalTimes)
// sort group eval order according to the given "group_eval_order".
sort.Slice(testGroups, func(i, j int) bool {
@@ -361,12 +361,8 @@ func (tg *testGroup) test(evalInterval time.Duration, groupOrderMap map[string]i
var groups []*rule.Group
for _, group := range testGroups {
mergedExternalLabels := make(map[string]string)
for k, v := range tg.ExternalLabels {
mergedExternalLabels[k] = v
}
for k, v := range externalLabels {
mergedExternalLabels[k] = v
}
maps.Copy(mergedExternalLabels, tg.ExternalLabels)
maps.Copy(mergedExternalLabels, externalLabels)
ng := rule.NewGroup(group, q, time.Minute, mergedExternalLabels)
ng.Init()
groups = append(groups, ng)
@@ -379,7 +375,7 @@ func (tg *testGroup) test(evalInterval time.Duration, groupOrderMap map[string]i
if len(g.Rules) == 0 {
continue
}
errs := g.ExecOnce(context.Background(), func() []notifier.Notifier { return nil }, rw, ts)
errs := g.ExecOnce(context.Background(), rw, ts)
for err := range errs {
if err != nil {
checkErrs = append(checkErrs, fmt.Errorf("\nfailed to exec group: %q, time: %s, err: %w", g.Name,

View File

@@ -27,6 +27,9 @@ vmalert-linux-ppc64le-prod:
vmalert-linux-386-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-linux-386
vmalert-linux-s390x-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-linux-s390x
vmalert-darwin-amd64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-darwin-amd64

View File

@@ -81,12 +81,9 @@ func (g *Group) Validate(validateTplFn ValidateTplFn, validateExpressions bool)
if g.Interval.Duration() < 0 {
return fmt.Errorf("interval shouldn't be lower than 0")
}
if g.EvalOffset.Duration() < 0 {
return fmt.Errorf("eval_offset shouldn't be lower than 0")
}
// if `eval_offset` is set, interval won't use global evaluationInterval flag and must bigger than offset.
if g.EvalOffset.Duration() > g.Interval.Duration() {
return fmt.Errorf("eval_offset should be smaller than interval; now eval_offset: %v, interval: %v", g.EvalOffset.Duration(), g.Interval.Duration())
// if `eval_offset` is set, the group interval must be specified explicitly(instead of inherited from global evaluationInterval flag) and must bigger than offset.
if g.EvalOffset.Duration().Abs() > g.Interval.Duration() {
return fmt.Errorf("the abs value of eval_offset should be smaller than interval; now eval_offset: %v, interval: %v", g.EvalOffset.Duration(), g.Interval.Duration())
}
if g.EvalOffset != nil && g.EvalDelay != nil {
return fmt.Errorf("eval_offset cannot be used with eval_delay")

View File

@@ -116,7 +116,7 @@ func TestParse_Failure(t *testing.T) {
f([]string{"testdata/rules/rules_interval_bad.rules"}, "eval_offset should be smaller than interval")
f([]string{"testdata/rules/rules0-bad.rules"}, "unexpected token")
f([]string{"testdata/dir/rules0-bad.rules"}, "error parsing annotation")
f([]string{"testdata/dir/rules0-bad.rules"}, "invalid annotations")
f([]string{"testdata/dir/rules1-bad.rules"}, "duplicate in file")
f([]string{"testdata/dir/rules2-bad.rules"}, "function \"unknown\" not defined")
f([]string{"testdata/dir/rules3-bad.rules"}, "either `record` or `alert` must be set")
@@ -176,11 +176,17 @@ func TestGroupValidate_Failure(t *testing.T) {
}, false, "interval shouldn't be lower than 0")
f(&Group{
Name: "wrong eval_offset",
Name: "too big eval_offset",
Interval: promutil.NewDuration(time.Minute),
EvalOffset: promutil.NewDuration(2 * time.Minute),
}, false, "eval_offset should be smaller than interval")
f(&Group{
Name: "too big negative eval_offset",
Interval: promutil.NewDuration(time.Minute),
EvalOffset: promutil.NewDuration(-2 * time.Minute),
}, false, "eval_offset should be smaller than interval")
limit := -1
f(&Group{
Name: "wrong limit",
@@ -343,7 +349,6 @@ func TestGroupValidate_Failure(t *testing.T) {
},
},
}, true, "bad prometheus expr")
}
func TestGroupValidate_Success(t *testing.T) {

View File

@@ -2,6 +2,7 @@ package config
import (
"fmt"
"slices"
"strings"
"github.com/VictoriaMetrics/VictoriaLogs/lib/logstorage"
@@ -76,13 +77,12 @@ func (t *Type) ValidateExpr(expr string) error {
if err != nil {
return fmt.Errorf("bad LogsQL expr: %q, err: %w", expr, err)
}
fields, _ := q.GetStatsByFields()
for i := range fields {
// VictoriaLogs inserts `_time` field as a label in result when query with `stats by (_time:step)`,
// making the result meaningless and may lead to cardinality issues.
if fields[i] == "_time" {
return fmt.Errorf("bad LogsQL expr: %q, err: cannot contain time buckets stats pipe `stats by (_time:step)`", expr)
}
labels, err := q.GetStatsLabels()
if err != nil {
return fmt.Errorf("cannot obtain labels from LogsQL expr: %q, err: %w", expr, err)
}
if slices.Contains(labels, "_time") {
return fmt.Errorf("bad LogsQL expr: %q, err: cannot contain time buckets stats pipe `stats by (_time:step)`", expr)
}
default:
return fmt.Errorf("unknown datasource type=%q", t.Name)

View File

@@ -5,6 +5,7 @@ import (
"errors"
"fmt"
"io"
"maps"
"net/http"
"net/url"
"strings"
@@ -91,9 +92,7 @@ func (c *Client) Clone() *Client {
ns.extraHeaders = make([]keyValue, len(c.extraHeaders))
copy(ns.extraHeaders, c.extraHeaders)
}
for k, v := range c.extraParams {
ns.extraParams[k] = v
}
maps.Copy(ns.extraParams, c.extraParams)
return ns
}
@@ -173,22 +172,26 @@ func (c *Client) Query(ctx context.Context, query string, ts time.Time) (Result,
return Result{}, nil, fmt.Errorf("second attempt: %w", err)
}
}
defer func() { _ = resp.Body.Close() }()
// Process the received response.
var parseFn func(req *http.Request, resp *http.Response) (Result, error)
var parseFn func(resp *http.Response) (Result, error)
switch c.dataSourceType {
case datasourcePrometheus:
parseFn = parsePrometheusResponse
parseFn = parsePrometheusInstantResponse
case datasourceGraphite:
parseFn = parseGraphiteResponse
case datasourceVLogs:
parseFn = parseVLogsResponse
parseFn = parseVLogsInstantResponse
default:
logger.Panicf("BUG: unsupported datasource type %q to parse query response", c.dataSourceType)
}
result, err := parseFn(req, resp)
_ = resp.Body.Close()
return result, req, err
result, err := parseFn(resp)
if err != nil {
return Result{}, nil, fmt.Errorf("error parsing response from %q: %w", req.URL.Redacted(), err)
}
return result, req, nil
}
// QueryRange executes the given query on the given time range.
@@ -229,19 +232,23 @@ func (c *Client) QueryRange(ctx context.Context, query string, start, end time.T
return res, fmt.Errorf("second attempt: %w", err)
}
}
defer func() { _ = resp.Body.Close() }()
// Process the received response.
var parseFn func(req *http.Request, resp *http.Response) (Result, error)
var parseFn func(resp *http.Response) (Result, error)
switch c.dataSourceType {
case datasourcePrometheus:
parseFn = parsePrometheusResponse
parseFn = parsePrometheusRangeResponse
case datasourceVLogs:
parseFn = parseVLogsResponse
parseFn = parseVLogsRangeResponse
default:
logger.Panicf("BUG: unsupported datasource type %q to parse query range response", c.dataSourceType)
}
res, err = parseFn(req, resp)
_ = resp.Body.Close()
res, err = parseFn(resp)
if err != nil {
return Result{}, fmt.Errorf("error parsing response from %q: %w", req.URL.Redacted(), err)
}
return res, err
}

View File

@@ -33,10 +33,10 @@ func (r graphiteResponse) metrics() []Metric {
return ms
}
func parseGraphiteResponse(req *http.Request, resp *http.Response) (Result, error) {
func parseGraphiteResponse(resp *http.Response) (Result, error) {
r := &graphiteResponse{}
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return Result{}, fmt.Errorf("error parsing graphite metrics for %s: %w", req.URL.Redacted(), err)
return Result{}, fmt.Errorf("error parsing graphite metrics: %w", err)
}
return Result{Data: r.metrics()}, nil
}

View File

@@ -34,7 +34,7 @@ type promResponse struct {
// Stats supported by VictoriaMetrics since v1.90
Stats struct {
SeriesFetched *string `json:"seriesFetched,omitempty"`
} `json:"stats,omitempty"`
} `json:"stats"`
// IsPartial supported by VictoriaMetrics
IsPartial *bool `json:"isPartial,omitempty"`
}
@@ -172,17 +172,26 @@ const (
rtVector, rtMatrix, rScalar = "vector", "matrix", "scalar"
)
func parsePrometheusResponse(req *http.Request, resp *http.Response) (res Result, err error) {
func parsePromResponse(resp *http.Response) (*promResponse, error) {
r := &promResponse{}
if err = json.NewDecoder(resp.Body).Decode(r); err != nil {
return res, fmt.Errorf("error parsing response from %s: %w", req.URL.Redacted(), err)
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return nil, fmt.Errorf("failed to decode response: %w", err)
}
if r.Status == statusError {
return res, fmt.Errorf("response error, query: %s, errorType: %s, error: %s", req.URL.Redacted(), r.ErrorType, r.Error)
return nil, fmt.Errorf("response error %q: %s", r.ErrorType, r.Error)
}
if r.Status != statusSuccess {
return res, fmt.Errorf("unknown status: %s, Expected success or error", r.Status)
return nil, fmt.Errorf("unknown response status %q", r.Status)
}
return r, nil
}
func parsePrometheusInstantResponse(resp *http.Response) (res Result, err error) {
r, err := parsePromResponse(resp)
if err != nil {
return res, fmt.Errorf("failed to parse response: %w", err)
}
var parseFn func() ([]Metric, error)
switch r.Data.ResultType {
case rtVector:
@@ -191,12 +200,6 @@ func parsePrometheusResponse(req *http.Request, resp *http.Response) (res Result
return res, fmt.Errorf("unmarshal err %w; \n %#v", err, string(r.Data.Result))
}
parseFn = pi.metrics
case rtMatrix:
var pr promRange
if err := json.Unmarshal(r.Data.Result, &pr.Result); err != nil {
return res, err
}
parseFn = pr.metrics
case rScalar:
var ps promScalar
if err := json.Unmarshal(r.Data.Result, &ps); err != nil {
@@ -206,7 +209,6 @@ func parsePrometheusResponse(req *http.Request, resp *http.Response) (res Result
default:
return res, fmt.Errorf("unknown result type %q", r.Data.ResultType)
}
ms, err := parseFn()
if err != nil {
return res, err
@@ -222,6 +224,34 @@ func parsePrometheusResponse(req *http.Request, resp *http.Response) (res Result
return res, nil
}
func parsePrometheusRangeResponse(resp *http.Response) (res Result, err error) {
r, err := parsePromResponse(resp)
if err != nil {
return res, fmt.Errorf("failed to parse response: %w", err)
}
if r.Data.ResultType != rtMatrix {
return res, fmt.Errorf("unexpected result type %q; expected result type %q", r.Data.ResultType, rtMatrix)
}
var pr promRange
if err := json.Unmarshal(r.Data.Result, &pr.Result); err != nil {
return res, err
}
ms, err := pr.metrics()
if err != nil {
return res, err
}
res = Result{Data: ms, IsPartial: r.IsPartial}
if r.Stats.SeriesFetched != nil {
intV, err := strconv.Atoi(*r.Stats.SeriesFetched)
if err != nil {
return res, fmt.Errorf("failed to convert stats.seriesFetched to int: %w", err)
}
res.SeriesFetched = &intV
}
return res, nil
}
func (c *Client) setPrometheusInstantReqParams(r *http.Request, query string, timestamp time.Time) {
if c.appendTypePrefix {
r.URL.Path += "/prometheus"
@@ -249,12 +279,9 @@ func (c *Client) setPrometheusRangeReqParams(r *http.Request, query string, star
if c.appendTypePrefix {
r.URL.Path += "/prometheus"
}
// deliberately ignore *disablePathAppend
// if we don't append path, then newQueryRangeRequest and newQueryRequest will produce the same URL path and will become
// indistinguishable for remote datasource. This may lead to confusion as in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9779
r.URL.Path += "/api/v1/query_range"
if !*disablePathAppend {
r.URL.Path += "/api/v1/query_range"
}
q := r.URL.Query()
q.Add("start", start.Format(time.RFC3339))
q.Add("end", end.Format(time.RFC3339))

View File

@@ -65,21 +65,23 @@ func TestVMInstantQuery(t *testing.T) {
case 3:
w.Write([]byte(`{"status":"unknown"}`))
case 4:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix"}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"vector"}}`))
case 5:
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows","foo":"bar"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests","foo":"baz"},"value":[1583786140,"2000"]}]}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"vm_rows"},"values":[[1583786142,"13763"]]}]}}`))
case 6:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows","foo":"bar"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests","foo":"baz"},"value":[1583786140,"2000"]}]}}`))
case 7:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]},"stats":{"seriesFetched": "42"}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
case 8:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]},"stats":{"seriesFetched": "42"}}`))
case 9:
w.Write([]byte(`{"status":"success", "isPartial":true, "data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
}
})
mux.HandleFunc("/render", func(w http.ResponseWriter, _ *http.Request) {
c++
switch c {
case 9:
case 10:
w.Write([]byte(`[{"target":"constantLine(10)","tags":{"name":"constantLine(10)"},"datapoints":[[10,1611758343],[10,1611758373],[10,1611758403]]}]`))
}
})
@@ -102,9 +104,9 @@ func TestVMInstantQuery(t *testing.T) {
t.Fatalf("failed to parse 'time' query param %q: %s", timeParam, err)
}
switch c {
case 10:
w.Write([]byte("[]"))
case 11:
w.Write([]byte("[]"))
case 12:
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"total","foo":"bar"},"value":[1583786142,"13763"]},{"metric":{"__name__":"total","foo":"baz"},"value":[1583786140,"2000"]}]}}`))
}
})
@@ -123,6 +125,7 @@ func TestVMInstantQuery(t *testing.T) {
ts := time.Now()
expErr := func(query, err string) {
t.Helper()
_, _, gotErr := pq.Query(ctx, query, ts)
if gotErr == nil {
t.Fatalf("expected %q got nil", err)
@@ -135,10 +138,11 @@ func TestVMInstantQuery(t *testing.T) {
expErr(vmQuery, "500") // 0
expErr(vmQuery, "error parsing response") // 1
expErr(vmQuery, "response error") // 2
expErr(vmQuery, "unknown status") // 3
expErr(vmQuery, "unknown response status") // 3
expErr(vmQuery, "unexpected end of JSON input") // 4
expErr(vmQuery, "unknown result type") // 5
res, _, err := pq.Query(ctx, vmQuery, ts) // 5 - vector
res, _, err := pq.Query(ctx, vmQuery, ts) // 6 - vector
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -159,7 +163,7 @@ func TestVMInstantQuery(t *testing.T) {
}
metricsEqual(t, res.Data, expected)
res, req, err := pq.Query(ctx, vmQuery, ts) // 6 - scalar
res, req, err := pq.Query(ctx, vmQuery, ts) // 7 - scalar
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -184,7 +188,7 @@ func TestVMInstantQuery(t *testing.T) {
res.SeriesFetched)
}
res, _, err = pq.Query(ctx, vmQuery, ts) // 7 - scalar with stats
res, _, err = pq.Query(ctx, vmQuery, ts) // 8 - scalar with stats
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -205,7 +209,7 @@ func TestVMInstantQuery(t *testing.T) {
*res.SeriesFetched)
}
res, _, err = pq.Query(ctx, vmQuery, ts) // 8
res, _, err = pq.Query(ctx, vmQuery, ts) // 9
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -216,7 +220,7 @@ func TestVMInstantQuery(t *testing.T) {
// test graphite
gq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourceGraphite)})
res, _, err = gq.Query(ctx, queryRender, ts) // 9 - graphite
res, _, err = gq.Query(ctx, queryRender, ts) // 10 - graphite
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -236,9 +240,9 @@ func TestVMInstantQuery(t *testing.T) {
vlogs := datasourceVLogs
pq = s.BuildWithParams(QuerierParams{DataSourceType: string(vlogs), EvaluationInterval: 15 * time.Second})
expErr(vlogsQuery, "error parsing response") // 10
expErr(vlogsQuery, "error parsing response") // 11
res, _, err = pq.Query(ctx, vlogsQuery, ts) // 11
res, _, err = pq.Query(ctx, vlogsQuery, ts) // 12
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -390,6 +394,8 @@ func TestVMRangeQuery(t *testing.T) {
switch c {
case 0:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"vm_rows"},"values":[[1583786142,"13763"]]}]}}`))
case 1:
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[1583786142, "1"]}}`))
}
})
mux.HandleFunc("/select/logsql/stats_query_range", func(w http.ResponseWriter, r *http.Request) {
@@ -422,7 +428,7 @@ func TestVMRangeQuery(t *testing.T) {
t.Fatalf("expected 'step' query param to be 60s; got %q instead", step)
}
switch c {
case 1:
case 2:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"total"},"values":[[1583786142,"10"]]}]}}`))
}
})
@@ -446,13 +452,13 @@ func TestVMRangeQuery(t *testing.T) {
start, end := time.Now().Add(-time.Minute), time.Now()
res, err := pq.QueryRange(ctx, vmQuery, start, end)
res, err := pq.QueryRange(ctx, vmQuery, start, end) // case 0
if err != nil {
t.Fatalf("unexpected %s", err)
}
m := res.Data
if len(m) != 1 {
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
}
expected := Metric{
Labels: []prompb.Label{{Value: "vm_rows", Name: "__name__"}},
@@ -463,6 +469,9 @@ func TestVMRangeQuery(t *testing.T) {
t.Fatalf("unexpected metric %+v want %+v", m[0], expected)
}
_, err = pq.QueryRange(ctx, vmQuery, start, end) // case 1
expectError(t, err, "unexpected result type")
// test unsupported graphite
gq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourceGraphite)})
@@ -566,22 +575,6 @@ func TestRequestParams(t *testing.T) {
checkEqualString(t, "/prometheus/api/v1/query_range", r.URL.Path)
})
// disable path append
*disablePathAppend = true
f(false, &Client{
dataSourceType: datasourcePrometheus,
}, func(t *testing.T, r *http.Request) {
checkEqualString(t, "", r.URL.Path)
})
f(true, &Client{
dataSourceType: datasourcePrometheus,
}, func(t *testing.T, r *http.Request) {
// path expected to be present despite *disablePathAppend setting
checkEqualString(t, "/api/v1/query_range", r.URL.Path)
})
*disablePathAppend = false
// graphite path
f(false, &Client{
dataSourceType: datasourceGraphite,

View File

@@ -40,8 +40,28 @@ func (c *Client) setVLogsRangeReqParams(r *http.Request, query string, start, en
c.setReqParams(r, query)
}
func parseVLogsResponse(req *http.Request, resp *http.Response) (res Result, err error) {
res, err = parsePrometheusResponse(req, resp)
func parseVLogsInstantResponse(resp *http.Response) (res Result, err error) {
res, err = parsePrometheusInstantResponse(resp)
if err != nil {
return Result{}, err
}
for i := range res.Data {
m := &res.Data[i]
for j := range m.Labels {
// reserve the stats func result name with a new label `stats_result` instead of dropping it,
// since there could be multiple stats results in a single query, for instance:
// _time:5m | stats quantile(0.5, request_duration_seconds) p50, quantile(0.9, request_duration_seconds) p90
if m.Labels[j].Name == "__name__" {
m.Labels[j].Name = "stats_result"
break
}
}
}
return
}
func parseVLogsRangeResponse(resp *http.Response) (res Result, err error) {
res, err = parsePrometheusRangeResponse(resp)
if err != nil {
return Result{}, err
}

View File

@@ -134,7 +134,7 @@ func (ls Labels) String() string {
func LabelCompare(a, b Labels) int {
l := min(len(b), len(a))
for i := 0; i < l; i++ {
for i := range l {
if a[i].Name != b[i].Name {
if a[i].Name < b[i].Name {
return -1

View File

@@ -15,7 +15,7 @@ import (
)
var (
addr = flag.String("datasource.url", "", "Datasource compatible with Prometheus or VictoriaLogs HTTP API. It can be single node VictoriaMetrics, vmselect or VictoriaLogs endpoint. Required parameter. "+
addr = flag.String("datasource.url", "", "Datasource compatible with Prometheus HTTP API. It can be single node VictoriaMetrics or vmselect endpoint. Required parameter. "+
"Supports address in the form of IP address with a port (e.g., http://127.0.0.1:8428) or DNS SRV record. "+
"See also -remoteRead.disablePathAppend and -datasource.showURL")
appendTypePrefix = flag.Bool("datasource.appendTypePrefix", false, "Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.")

View File

@@ -13,7 +13,7 @@ func BenchmarkPromInstantUnmarshal(b *testing.B) {
// BenchmarkParsePrometheusResponse/Instant_std+fastjson-10 1760 668959 ns/op 280147 B/op 5781 allocs/op
b.Run("Instant std+fastjson", func(b *testing.B) {
for i := 0; i < b.N; i++ {
for range b.N {
var pi promInstant
err = pi.Unmarshal(data)
if err != nil {

View File

@@ -56,7 +56,7 @@ absolute path to all .tpl files in root.
-rule.templates="dir/**/*.tpl". Includes all the .tpl files in "dir" subfolders recursively.
`)
configCheckInterval = flag.Duration("configCheckInterval", 0, "Interval for checking for changes in '-rule' or '-notifier.config' files. "+
configCheckInterval = flag.Duration("configCheckInterval", 0, "Interval for checking for changes in '-rule', '-rule.templates' and '-notifier.config' files. "+
"By default, the checking is disabled. Send SIGHUP signal in order to force config check for changes.")
httpListenAddrs = flagutil.NewArrayString("httpListenAddr", "Address to listen for incoming http requests. See also -tls and -httpListenAddr.useProxyProtocol")
@@ -76,14 +76,12 @@ absolute path to all .tpl files in root.
`Link to VMUI: -external.alert.source='vmui/#/?g0.expr={{.Expr|queryEscape}}'. `+
`If empty 'vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}' is used.`)
externalLabels = flagutil.NewArrayString("external.label", "Optional label in the form 'Name=value' to add to all generated recording rules and alerts. "+
"In case of conflicts, original labels are kept with prefix `exported_`.")
"In case of conflicts, original labels are kept with prefix 'exported_'.")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.")
)
var (
extURL *url.URL
)
var extURL *url.URL
func main() {
// Write flags and help message to stdout, since it is easier to grep or pipe.
@@ -161,7 +159,7 @@ func main() {
ctx, cancel := context.WithCancel(context.Background())
manager, err := newManager(ctx)
if err != nil {
logger.Fatalf("failed to init: %s", err)
logger.Fatalf("failed to create manager: %s", err)
}
logger.Infof("reading rules configuration file from %q", strings.Join(*rulePath, ";"))
groupsCfg, err := config.Parse(*rulePath, validateTplFn, *validateExpressions)
@@ -226,14 +224,13 @@ func newManager(ctx context.Context) (*manager, error) {
labels[s[:n]] = s[n+1:]
}
nts, err := notifier.Init(labels, *externalURL)
err = notifier.Init(labels, *externalURL)
if err != nil {
return nil, fmt.Errorf("failed to init notifier: %w", err)
}
manager := &manager{
groups: make(map[uint64]*rule.Group),
querierBuilder: q,
notifiers: nts,
labels: labels,
}
rw, err := remotewrite.Init(ctx)

View File

@@ -96,9 +96,10 @@ groups:
querierBuilder: &datasource.FakeQuerier{},
groups: make(map[uint64]*rule.Group),
labels: map[string]string{},
notifiers: func() []notifier.Notifier { return []notifier.Notifier{&notifier.FakeNotifier{}} },
rw: &remotewrite.Client{},
}
_, cleanup := notifier.InitFakeNotifier()
defer cleanup()
syncCh := make(chan struct{})
sighupCh := procutil.NewSighupChan()

View File

@@ -3,6 +3,7 @@ package main
import (
"context"
"fmt"
"strconv"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
@@ -16,7 +17,6 @@ import (
// manager controls group states
type manager struct {
querierBuilder datasource.QuerierBuilder
notifiers func() []notifier.Notifier
rw remotewrite.RWClient
// remote read builder.
@@ -46,13 +46,15 @@ func (m *manager) ruleAPI(gID, rID uint64) (rule.ApiRule, error) {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
g, ok := m.groups[gID]
group, ok := m.groups[gID]
if !ok {
return rule.ApiRule{}, fmt.Errorf("can't find group with id %d", gID)
}
g := group.ToAPI()
ruleID := strconv.FormatUint(rID, 10)
for _, r := range g.Rules {
if r.ID() == rID {
return r.ToAPI(), nil
if r.ID == ruleID {
return r, nil
}
}
return rule.ApiRule{}, fmt.Errorf("can't find rule with id %d in group %q", rID, g.Name)
@@ -63,17 +65,20 @@ func (m *manager) alertAPI(gID, aID uint64) (*rule.ApiAlert, error) {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
g, ok := m.groups[gID]
group, ok := m.groups[gID]
if !ok {
return nil, fmt.Errorf("can't find group with id %d", gID)
}
g := group.ToAPI()
for _, r := range g.Rules {
ar, ok := r.(*rule.AlertingRule)
if !ok {
if r.Type != rule.TypeAlerting {
continue
}
if apiAlert := ar.AlertToAPI(aID); apiAlert != nil {
return apiAlert, nil
alertID := strconv.FormatUint(aID, 10)
for _, a := range r.Alerts {
if a.ID == alertID {
return a, nil
}
}
}
return nil, fmt.Errorf("can't find alert with id %d in group %q", aID, g.Name)
@@ -93,20 +98,18 @@ func (m *manager) close() {
m.wg.Wait()
}
func (m *manager) startGroup(ctx context.Context, g *rule.Group, restore bool) error {
m.wg.Add(1)
func (m *manager) startGroup(ctx context.Context, g *rule.Group, restore bool) {
id := g.GetID()
g.Init()
go func() {
defer m.wg.Done()
m.wg.Go(func() {
if restore {
g.Start(ctx, m.notifiers, m.rw, m.rr)
g.Start(ctx, m.rw, m.rr)
} else {
g.Start(ctx, m.notifiers, m.rw, nil)
g.Start(ctx, m.rw, nil)
}
}()
})
m.groups[id] = g
return nil
}
func (m *manager) update(ctx context.Context, groupsCfg []config.Group, restore bool) error {
@@ -115,7 +118,7 @@ func (m *manager) update(ctx context.Context, groupsCfg []config.Group, restore
for _, cfg := range groupsCfg {
for _, r := range cfg.Rules {
if rrPresent && arPresent {
continue
break
}
if r.Record != "" {
rrPresent = true
@@ -131,7 +134,7 @@ func (m *manager) update(ctx context.Context, groupsCfg []config.Group, restore
if rrPresent && m.rw == nil {
return fmt.Errorf("config contains recording rules but `-remoteWrite.url` isn't set")
}
if arPresent && m.notifiers == nil {
if arPresent && notifier.GetTargets() == nil {
return fmt.Errorf("config contains alerting rules but neither `-notifier.url` nor `-notifier.config` nor `-notifier.blackhole` aren't set")
}
@@ -158,25 +161,22 @@ func (m *manager) update(ctx context.Context, groupsCfg []config.Group, restore
}
}
for _, ng := range groupsRegistry {
if err := m.startGroup(ctx, ng, restore); err != nil {
m.groupsMu.Unlock()
return err
}
m.startGroup(ctx, ng, restore)
}
m.groupsMu.Unlock()
if len(toUpdate) > 0 {
var wg sync.WaitGroup
for _, item := range toUpdate {
wg.Add(1)
// cancel evaluation so the Update will be applied as fast as possible.
// it is important to call InterruptEval before the update, because cancel fn
// can be re-assigned during the update.
item.old.InterruptEval()
go func(oldGroup *rule.Group, newGroup *rule.Group) {
oldGroup.UpdateWith(newGroup)
wg.Done()
}(item.old, item.new)
oldG := item.old
newG := item.new
wg.Go(func() {
// cancel evaluation so the Update will be applied as fast as possible.
// it is important to call InterruptEval before the update, because cancel fn
// can be re-assigned during the update.
oldG.InterruptEval()
oldG.UpdateWith(newG)
})
}
wg.Wait()
}

View File

@@ -40,10 +40,11 @@ func TestManagerEmptyRulesDir(t *testing.T) {
// execution of configuration update.
// Should be executed with -race flag
func TestManagerUpdateConcurrent(t *testing.T) {
_, cleanup := notifier.InitFakeNotifier()
defer cleanup()
m := &manager{
groups: make(map[uint64]*rule.Group),
querierBuilder: &datasource.FakeQuerier{},
notifiers: func() []notifier.Notifier { return []notifier.Notifier{&notifier.FakeNotifier{}} },
}
paths := []string{
"config/testdata/dir/rules0-good.rules",
@@ -64,13 +65,11 @@ func TestManagerUpdateConcurrent(t *testing.T) {
const workers = 500
const iterations = 10
wg := sync.WaitGroup{}
wg.Add(workers)
for i := 0; i < workers; i++ {
go func(n int) {
defer wg.Done()
var wg sync.WaitGroup
for n := range workers {
wg.Go(func() {
r := rand.New(rand.NewSource(int64(n)))
for i := 0; i < iterations; i++ {
for range iterations {
rnd := r.Intn(len(paths))
cfg, err := config.Parse([]string{paths[rnd]}, notifier.ValidateTemplates, true)
if err != nil { // update can fail and this is expected
@@ -78,7 +77,7 @@ func TestManagerUpdateConcurrent(t *testing.T) {
}
_ = m.update(context.Background(), cfg, false)
}
}(i)
})
}
wg.Wait()
}
@@ -127,8 +126,9 @@ func TestManagerUpdate_Success(t *testing.T) {
m := &manager{
groups: make(map[uint64]*rule.Group),
querierBuilder: &datasource.FakeQuerier{},
notifiers: func() []notifier.Notifier { return []notifier.Notifier{&notifier.FakeNotifier{}} },
}
_, cleanup := notifier.InitFakeNotifier()
defer cleanup()
cfgInit := loadCfg(t, []string{initPath}, true, true)
if err := m.update(ctx, cfgInit, false); err != nil {
@@ -259,7 +259,7 @@ func compareGroups(t *testing.T, a, b *rule.Group) {
for i, r := range a.Rules {
got, want := r, b.Rules[i]
if a.CreateID() != b.CreateID() {
t.Fatalf("expected to have rule %q; got %q", want.ID(), got.ID())
t.Fatalf("expected to have rule %d; got %d", want.ID(), got.ID())
}
if err := rule.CompareRules(t, want, got); err != nil {
t.Fatalf("comparison error: %s", err)
@@ -277,7 +277,8 @@ func TestManagerUpdate_Failure(t *testing.T) {
rw: rw,
}
if notifiers != nil {
m.notifiers = func() []notifier.Notifier { return notifiers }
_, cleanup := notifier.InitFakeNotifier()
defer cleanup()
}
err := m.update(context.Background(), []config.Group{cfg}, false)
if err == nil {

View File

@@ -80,14 +80,15 @@ func (as AlertState) String() string {
// AlertTplData is used to execute templating
type AlertTplData struct {
Type string
Labels map[string]string
Value float64
Expr string
AlertID uint64
GroupID uint64
ActiveAt time.Time
For time.Duration
Type string
Labels map[string]string
Value float64
Expr string
AlertID uint64
GroupID uint64
ActiveAt time.Time
For time.Duration
IsPartial bool
}
var tplHeaders = []string{
@@ -101,6 +102,7 @@ var tplHeaders = []string{
"{{ $groupID := .GroupID }}",
"{{ $activeAt := .ActiveAt }}",
"{{ $for := .For }}",
"{{ $isPartial := .IsPartial }}",
}
// ExecTemplate executes the Alert template for given
@@ -166,8 +168,8 @@ func templateAnnotations(annotations map[string]string, data AlertTplData, tmpl
ctmpl, _ := tmpl.Clone()
ctmpl = ctmpl.Option("missingkey=zero")
if err := templateAnnotation(&buf, builder.String(), tData, ctmpl, execute); err != nil {
r[key] = text
eg.Add(fmt.Errorf("key %q, template %q: %w", key, text, err))
r[key] = err.Error()
eg.Add(fmt.Errorf("(key: %q, value: %q): %w", key, text, err))
continue
}
r[key] = buf.String()
@@ -184,13 +186,13 @@ type tplData struct {
func templateAnnotation(dst io.Writer, text string, data tplData, tpl *textTpl.Template, execute bool) error {
tpl, err := tpl.Parse(text)
if err != nil {
return fmt.Errorf("error parsing annotation template: %w", err)
return fmt.Errorf("error parsing template: %w", err)
}
if !execute {
return nil
}
if err = tpl.Execute(dst, data); err != nil {
return fmt.Errorf("error evaluating annotation template: %w", err)
return fmt.Errorf("error evaluating template: %w", err)
}
return nil
}

View File

@@ -20,7 +20,7 @@ func TestAlertExecTemplate(t *testing.T) {
)
extLabels["cluster"] = extCluster
extLabels["dc"] = extDC
_, err := Init(extLabels, extURL)
err := Init(extLabels, extURL)
checkErr(t, err)
f := func(alert *Alert, annotations map[string]string, tplExpected map[string]string) {

View File

@@ -3,6 +3,7 @@ package notifier
import (
"bytes"
"context"
"errors"
"fmt"
"io"
"net/http"
@@ -13,7 +14,6 @@ import (
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/vmalertutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
@@ -77,12 +77,20 @@ func (am *AlertManager) LastError() string {
}
// Send an alert or resolve message
func (am *AlertManager) Send(ctx context.Context, alerts []Alert, headers map[string]string) error {
func (am *AlertManager) Send(ctx context.Context, alerts []Alert, alertLabels [][]prompb.Label, headers map[string]string) error {
if len(alerts) != len(alertLabels) {
return fmt.Errorf("mismatched number of alerts and label sets after global alert relabeling")
}
am.metrics.alertsSent.Add(len(alerts))
startTime := time.Now()
err := am.send(ctx, alerts, headers)
err := am.send(ctx, alerts, alertLabels, headers)
am.metrics.alertsSendDuration.UpdateDuration(startTime)
if err != nil {
// the context can be cancelled on graceful shutdown
// or on group update. So no need to handle the error as usual.
if errors.Is(err, context.Canceled) {
return nil
}
am.metrics.alertsSendErrors.Add(len(alerts))
am.lastError = err.Error()
} else {
@@ -91,12 +99,15 @@ func (am *AlertManager) Send(ctx context.Context, alerts []Alert, headers map[st
return err
}
func (am *AlertManager) send(ctx context.Context, alerts []Alert, headers map[string]string) error {
func (am *AlertManager) send(ctx context.Context, alerts []Alert, alertLabels [][]prompb.Label, headers map[string]string) error {
b := &bytes.Buffer{}
alertsToSend := make([]Alert, 0, len(alerts))
lblss := make([][]prompb.Label, 0, len(alerts))
for _, a := range alerts {
lbls := a.applyRelabelingIfNeeded(am.relabelConfigs)
for i, a := range alerts {
lbls := alertLabels[i]
if am.relabelConfigs != nil {
lbls = am.relabelConfigs.Apply(lbls, 0)
}
if len(lbls) == 0 {
continue
}
@@ -160,11 +171,6 @@ const alertManagerPath = "/api/v2/alerts"
func NewAlertManager(alertManagerURL string, fn AlertURLGenerator, authCfg promauth.HTTPClientConfig,
relabelCfg *promrelabel.ParsedConfigs, timeout time.Duration,
) (*AlertManager, error) {
if err := httputil.CheckURL(alertManagerURL); err != nil {
return nil, fmt.Errorf("invalid alertmanager URL: %w", err)
}
tls := &promauth.TLSConfig{}
if authCfg.TLSConfig != nil {
tls = authCfg.TLSConfig

View File

@@ -11,6 +11,7 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
)
@@ -145,11 +146,11 @@ func TestAlertManager_Send(t *testing.T) {
t.Fatalf("unexpected error: %s", err)
}
if err := am.Send(context.Background(), []Alert{{Labels: map[string]string{"a": "b"}}}, nil); err == nil {
if err := am.Send(context.Background(), []Alert{{Labels: map[string]string{"a": "b"}}}, [][]prompb.Label{{{Name: "a", Value: "b"}}}, nil); err == nil {
t.Fatalf("expected connection error got nil")
}
if err := am.Send(context.Background(), []Alert{{Labels: map[string]string{"a": "b"}}}, nil); err == nil {
if err := am.Send(context.Background(), []Alert{{Labels: map[string]string{"a": "b"}}}, [][]prompb.Label{{{Name: "a", Value: "b"}}}, nil); err == nil {
t.Fatalf("expected wrong http code error got nil")
}
@@ -160,7 +161,7 @@ func TestAlertManager_Send(t *testing.T) {
End: time.Now().UTC(),
Labels: map[string]string{"alertname": "alert0"},
Annotations: map[string]string{"a": "b", "c": "d"},
}}, map[string]string{headerKey: "bar"}); err != nil {
}}, [][]prompb.Label{{{Name: "alertname", Value: "alert0"}}}, map[string]string{headerKey: "bar"}); err != nil {
t.Fatalf("unexpected error %s", err)
}
@@ -174,7 +175,7 @@ func TestAlertManager_Send(t *testing.T) {
Name: "alert2",
Labels: map[string]string{"rule": "test", "tenant": "1"},
},
}, map[string]string{headerKey: "bar"}); err != nil {
}, [][]prompb.Label{{{Name: "rule", Value: "test"}, {Name: "tenant", Value: "0"}}, {{Name: "rule", Value: "test"}, {Name: "tenant", Value: "1"}}}, map[string]string{headerKey: "bar"}); err != nil {
t.Fatalf("unexpected error %s", err)
}
@@ -187,7 +188,7 @@ func TestAlertManager_Send(t *testing.T) {
Name: "alert2",
Labels: map[string]string{},
},
}, map[string]string{}); err != nil {
}, [][]prompb.Label{{{Name: "rule", Value: "test"}}, {{}}}, map[string]string{}); err != nil {
t.Fatalf("unexpected error %s", err)
}

View File

@@ -27,15 +27,9 @@ type Config struct {
// PathPrefix is added to URL path before adding alertManagerPath value
PathPrefix string `yaml:"path_prefix,omitempty"`
// ConsulSDConfigs contains list of settings for service discovery via Consul
// see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
ConsulSDConfigs []consul.SDConfig `yaml:"consul_sd_configs,omitempty"`
// DNSSDConfigs contains list of settings for service discovery via DNS.
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config
DNSSDConfigs []dns.SDConfig `yaml:"dns_sd_configs,omitempty"`
// StaticConfigs contains list of static targets
StaticConfigs []StaticConfig `yaml:"static_configs,omitempty"`
ConsulSDConfigs []ConsulSDConfigs `yaml:"consul_sd_configs,omitempty"`
DNSSDConfigs []DNSSDConfigs `yaml:"dns_sd_configs,omitempty"`
StaticConfigs []StaticConfig `yaml:"static_configs,omitempty"`
// HTTPClientConfig contains HTTP configuration for Notifier clients
HTTPClientConfig promauth.HTTPClientConfig `yaml:",inline"`
@@ -62,14 +56,29 @@ type Config struct {
parsedAlertRelabelConfigs *promrelabel.ParsedConfigs
}
// StaticConfig contains list of static targets in the following form:
// staticConfig contains list of static targets in the following form:
//
// targets:
// [ - '<host>' ]
type StaticConfig struct {
Targets []string `yaml:"targets"`
// HTTPClientConfig contains HTTP configuration for the Targets
HTTPClientConfig promauth.HTTPClientConfig `yaml:",inline"`
HTTPClientConfig promauth.HTTPClientConfig `yaml:",inline"`
AlertRelabelConfigs []promrelabel.RelabelConfig `yaml:"alert_relabel_configs,omitempty"`
}
// ConsulSDConfigs contains list of settings for service discovery via Consul,
// see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
type ConsulSDConfigs struct {
consul.SDConfig `yaml:",inline"`
AlertRelabelConfigs []promrelabel.RelabelConfig `yaml:"alert_relabel_configs,omitempty"`
}
// DNSSDConfigs contains list of settings for service discovery via DNS,
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config
type DNSSDConfigs struct {
dns.SDConfig `yaml:",inline"`
AlertRelabelConfigs []promrelabel.RelabelConfig `yaml:"alert_relabel_configs,omitempty"`
}
// UnmarshalYAML implements the yaml.Unmarshaler interface.
@@ -95,6 +104,31 @@ func (cfg *Config) UnmarshalYAML(unmarshal func(any) error) error {
}
cfg.parsedAlertRelabelConfigs = arCfg
for _, s := range cfg.StaticConfigs {
if len(s.AlertRelabelConfigs) > 0 {
_, err := promrelabel.ParseRelabelConfigs(s.AlertRelabelConfigs)
if err != nil {
return fmt.Errorf("failed to parse alert_relabel_configs in static_config: %w", err)
}
}
}
for _, s := range cfg.ConsulSDConfigs {
if len(s.AlertRelabelConfigs) > 0 {
_, err := promrelabel.ParseRelabelConfigs(s.AlertRelabelConfigs)
if err != nil {
return fmt.Errorf("failed to parse alert_relabel_configs in consul_sd_config: %w", err)
}
}
}
for _, s := range cfg.DNSSDConfigs {
if len(s.AlertRelabelConfigs) > 0 {
_, err := promrelabel.ParseRelabelConfigs(s.AlertRelabelConfigs)
if err != nil {
return fmt.Errorf("failed to parse alert_relabel_configs in dns_sd_config: %w", err)
}
}
}
b, err := yaml.Marshal(cfg)
if err != nil {
return fmt.Errorf("failed to marshal configuration for checksum: %w", err)

View File

@@ -35,4 +35,6 @@ func TestParseConfig_Failure(t *testing.T) {
f("testdata/unknownFields.bad.yaml", "unknown field")
f("non-existing-file", "error reading")
f("testdata/consul.bad.yaml", "failed to parse alert_relabel_configs in consul_sd_config")
f("testdata/dns.bad.yaml", "failed to parse alert relabeling config")
}

View File

@@ -8,6 +8,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/consul"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/dns"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutil"
@@ -28,11 +29,7 @@ type configWatcher struct {
targets map[TargetType][]Target
}
func newWatcher(path string, gen AlertURLGenerator) (*configWatcher, error) {
cfg, err := parseConfig(path)
if err != nil {
return nil, err
}
func newWatcher(cfg *Config, gen AlertURLGenerator) (*configWatcher, error) {
cw := &configWatcher{
cfg: cfg,
wg: sync.WaitGroup{},
@@ -88,18 +85,15 @@ func (cw *configWatcher) reload(path string) error {
return cw.start()
}
func (cw *configWatcher) add(typeK TargetType, interval time.Duration, labelsFn getLabels) error {
targetMetadata, errors := getTargetMetadata(labelsFn, cw.cfg)
func (cw *configWatcher) add(typeK TargetType, interval time.Duration, targetsFn getTargets) error {
targetMetadata, errors := getTargetMetadata(targetsFn, cw.cfg)
for _, err := range errors {
return fmt.Errorf("failed to init notifier for %q: %w", typeK, err)
}
cw.updateTargets(typeK, targetMetadata, cw.cfg, cw.genFn)
cw.wg.Add(1)
go func() {
defer cw.wg.Done()
cw.wg.Go(func() {
ticker := time.NewTicker(interval)
defer ticker.Stop()
@@ -109,62 +103,77 @@ func (cw *configWatcher) add(typeK TargetType, interval time.Duration, labelsFn
return
case <-ticker.C:
}
targetMetadata, errors := getTargetMetadata(labelsFn, cw.cfg)
targetMetadata, errors := getTargetMetadata(targetsFn, cw.cfg)
for _, err := range errors {
logger.Errorf("failed to init notifier for %q: %w", typeK, err)
}
cw.updateTargets(typeK, targetMetadata, cw.cfg, cw.genFn)
}
}()
})
return nil
}
func getTargetMetadata(labelsFn getLabels, cfg *Config) (map[string]*promutil.Labels, []error) {
metaLabels, err := labelsFn()
type targetMetadata struct {
*promutil.Labels
alertRelabelConfigs *promrelabel.ParsedConfigs
}
func getTargetMetadata(targetsFn getTargets, cfg *Config) (map[string]targetMetadata, []error) {
metaLabelsList, alertRelabelCfgs, err := targetsFn()
if err != nil {
return nil, []error{fmt.Errorf("failed to get labels: %w", err)}
}
targetMetadata := make(map[string]*promutil.Labels, len(metaLabels))
targetMts := make(map[string]targetMetadata, len(metaLabelsList))
var errors []error
duplicates := make(map[string]struct{})
for _, labels := range metaLabels {
target := labels.Get("__address__")
u, processedLabels, err := parseLabels(target, labels, cfg)
if err != nil {
errors = append(errors, err)
continue
}
if len(u) == 0 {
continue
}
if _, ok := duplicates[u]; ok { // check for duplicates
if !*suppressDuplicateTargetErrors {
logger.Errorf("skipping duplicate target with identical address %q; "+
"make sure service discovery and relabeling is set up properly; "+
"original labels: %s; resulting labels: %s",
u, labels, processedLabels)
for i := range metaLabelsList {
metaLabels := metaLabelsList[i]
alertRelabelCfg := alertRelabelCfgs[i]
for _, labels := range metaLabels {
target := labels.Get("__address__")
u, processedLabels, err := parseLabels(target, labels, cfg)
if err != nil {
errors = append(errors, err)
continue
}
if len(u) == 0 {
continue
}
// check for duplicated targets
// targets with same address but different alert_relabel_configs are still considered duplicates since it's mostly due to misconfiguration and could cause duplicated notifications.
if _, ok := duplicates[u]; ok {
if !*suppressDuplicateTargetErrors {
logger.Errorf("skipping duplicate target with identical address %q; "+
"make sure service discovery and relabeling is set up properly; "+
"original labels: %s; resulting labels: %s",
u, labels, processedLabels)
}
continue
}
duplicates[u] = struct{}{}
targetMts[u] = targetMetadata{
Labels: processedLabels,
alertRelabelConfigs: alertRelabelCfg,
}
continue
}
duplicates[u] = struct{}{}
targetMetadata[u] = processedLabels
}
return targetMetadata, errors
return targetMts, errors
}
type getLabels func() ([]*promutil.Labels, error)
type getTargets func() ([][]*promutil.Labels, []*promrelabel.ParsedConfigs, error)
func (cw *configWatcher) start() error {
if len(cw.cfg.StaticConfigs) > 0 {
var targets []Target
for _, cfg := range cw.cfg.StaticConfigs {
for i, cfg := range cw.cfg.StaticConfigs {
alertRelabelConfig, _ := promrelabel.ParseRelabelConfigs(cw.cfg.StaticConfigs[i].AlertRelabelConfigs)
httpCfg := mergeHTTPClientConfigs(cw.cfg.HTTPClientConfig, cfg.HTTPClientConfig)
for _, target := range cfg.Targets {
address, labels, err := parseLabels(target, nil, cw.cfg)
if err != nil {
return fmt.Errorf("failed to parse labels for target %q: %w", target, err)
}
notifier, err := NewAlertManager(address, cw.genFn, httpCfg, cw.cfg.parsedAlertRelabelConfigs, cw.cfg.Timeout.Duration())
notifier, err := NewAlertManager(address, cw.genFn, httpCfg, alertRelabelConfig, cw.cfg.Timeout.Duration())
if err != nil {
return fmt.Errorf("failed to init alertmanager for addr %q: %w", address, err)
}
@@ -178,17 +187,20 @@ func (cw *configWatcher) start() error {
}
if len(cw.cfg.ConsulSDConfigs) > 0 {
err := cw.add(TargetConsul, *consul.SDCheckInterval, func() ([]*promutil.Labels, error) {
var labels []*promutil.Labels
err := cw.add(TargetConsul, *consul.SDCheckInterval, func() ([][]*promutil.Labels, []*promrelabel.ParsedConfigs, error) {
var labels [][]*promutil.Labels
var alertRelabelConfigs []*promrelabel.ParsedConfigs
for i := range cw.cfg.ConsulSDConfigs {
alertRelabelConfig, _ := promrelabel.ParseRelabelConfigs(cw.cfg.ConsulSDConfigs[i].AlertRelabelConfigs)
sdc := &cw.cfg.ConsulSDConfigs[i]
targetLabels, err := sdc.GetLabels(cw.cfg.baseDir)
if err != nil {
return nil, fmt.Errorf("got labels err: %w", err)
return nil, nil, fmt.Errorf("got labels err: %w", err)
}
labels = append(labels, targetLabels...)
labels = append(labels, targetLabels)
alertRelabelConfigs = append(alertRelabelConfigs, alertRelabelConfig)
}
return labels, nil
return labels, alertRelabelConfigs, nil
})
if err != nil {
return fmt.Errorf("failed to start consulSD discovery: %w", err)
@@ -196,17 +208,21 @@ func (cw *configWatcher) start() error {
}
if len(cw.cfg.DNSSDConfigs) > 0 {
err := cw.add(TargetDNS, *dns.SDCheckInterval, func() ([]*promutil.Labels, error) {
var labels []*promutil.Labels
err := cw.add(TargetDNS, *dns.SDCheckInterval, func() ([][]*promutil.Labels, []*promrelabel.ParsedConfigs, error) {
var labels [][]*promutil.Labels
var alertRelabelConfigs []*promrelabel.ParsedConfigs
for i := range cw.cfg.DNSSDConfigs {
alertRelabelConfig, _ := promrelabel.ParseRelabelConfigs(cw.cfg.DNSSDConfigs[i].AlertRelabelConfigs)
sdc := &cw.cfg.DNSSDConfigs[i]
targetLabels, err := sdc.GetLabels(cw.cfg.baseDir)
if err != nil {
return nil, fmt.Errorf("got labels err: %w", err)
return nil, nil, fmt.Errorf("got labels err: %w", err)
}
labels = append(labels, targetLabels...)
labels = append(labels, targetLabels)
alertRelabelConfigs = append(alertRelabelConfigs, alertRelabelConfig)
}
return labels, nil
return labels, alertRelabelConfigs, nil
})
if err != nil {
return fmt.Errorf("failed to start DNSSD discovery: %w", err)
@@ -240,30 +256,30 @@ func (cw *configWatcher) setTargets(key TargetType, targets []Target) {
cw.targetsMu.Unlock()
}
func (cw *configWatcher) updateTargets(key TargetType, targetMetadata map[string]*promutil.Labels, cfg *Config, genFn AlertURLGenerator) {
func (cw *configWatcher) updateTargets(key TargetType, targetMts map[string]targetMetadata, cfg *Config, genFn AlertURLGenerator) {
cw.targetsMu.Lock()
defer cw.targetsMu.Unlock()
oldTargets := cw.targets[key]
var updatedTargets []Target
for _, ot := range oldTargets {
if _, ok := targetMetadata[ot.Addr()]; !ok {
if _, ok := targetMts[ot.Addr()]; !ok {
// if target not exists in currentTargets, close it
ot.Close()
} else {
updatedTargets = append(updatedTargets, ot)
delete(targetMetadata, ot.Addr())
delete(targetMts, ot.Addr())
}
}
// create new resources for the new targets
for addr, labels := range targetMetadata {
am, err := NewAlertManager(addr, genFn, cfg.HTTPClientConfig, cfg.parsedAlertRelabelConfigs, cfg.Timeout.Duration())
for addr, metadata := range targetMts {
am, err := NewAlertManager(addr, genFn, cfg.HTTPClientConfig, metadata.alertRelabelConfigs, cfg.Timeout.Duration())
if err != nil {
logger.Errorf("failed to init %s notifier with addr %q: %w", key, addr, err)
continue
}
updatedTargets = append(updatedTargets, Target{
Notifier: am,
Labels: labels,
Labels: metadata.Labels,
})
}

View File

@@ -7,6 +7,7 @@ import (
"net/http/httptest"
"os"
"sync"
"sync/atomic"
"testing"
"time"
@@ -28,7 +29,11 @@ static_configs:
- localhost:9093
- localhost:9094
`)
cw, err := newWatcher(f.Name(), nil)
cfg, err := parseConfig(f.Name())
if err != nil {
t.Fatalf("failed to parse config: %s", err)
}
cw, err := newWatcher(cfg, nil)
if err != nil {
t.Fatalf("failed to start config watcher: %s", err)
}
@@ -83,33 +88,64 @@ consul_sd_configs:
- server: %s
services:
- alertmanager
`, consulSDServer.URL))
- server: %s
services:
- alertmanager
alert_relabel_configs:
- target_label: "foo"
replacement: "tar"
`, consulSDServer.URL, consulSDServer.URL))
cw, err := newWatcher(consulSDFile.Name(), nil)
cfg, err := parseConfig(consulSDFile.Name())
if err != nil {
t.Fatalf("failed to parse config: %s", err)
}
cw, err := newWatcher(cfg, nil)
if err != nil {
t.Fatalf("failed to start config watcher: %s", err)
}
defer cw.mustStop()
if len(cw.notifiers()) != 2 {
t.Fatalf("expected to get 2 notifiers; got %d", len(cw.notifiers()))
if len(cw.notifiers()) != 3 {
t.Fatalf("expected to get 3 notifiers; got %d", len(cw.notifiers()))
}
expAddr1 := fmt.Sprintf("https://%s/proxy/api/v2/alerts", fakeConsulService1)
expAddr2 := fmt.Sprintf("https://%s/proxy/api/v2/alerts", fakeConsulService2)
expAddr3 := fmt.Sprintf("https://%s/proxy/api/v2/alerts", fakeConsulService3)
n1, n2 := cw.notifiers()[0], cw.notifiers()[1]
n1, n2, n3 := cw.notifiers()[0], cw.notifiers()[1], cw.notifiers()[2]
if n1.Addr() != expAddr1 {
t.Fatalf("exp address %q; got %q", expAddr1, n1.Addr())
}
if n2.Addr() != expAddr2 {
t.Fatalf("exp address %q; got %q", expAddr2, n2.Addr())
}
if n3.Addr() != expAddr3 {
t.Fatalf("exp address %q; got %q", expAddr3, n3.Addr())
}
if n1.(*AlertManager).relabelConfigs.String() != "" {
t.Fatalf("unexpected relabel configs: %q", n1.(*AlertManager).relabelConfigs.String())
}
if n2.(*AlertManager).relabelConfigs.String() != "" {
t.Fatalf("unexpected relabel configs: %q", n2.(*AlertManager).relabelConfigs.String())
}
if n3.(*AlertManager).relabelConfigs.String() != "- target_label: foo\n replacement: tar\n" {
t.Fatalf("unexpected relabel configs: %q", n3.(*AlertManager).relabelConfigs.String())
}
f := func() bool { return len(cw.notifiers()) == 1 }
if !waitFor(f, time.Second) {
t.Fatalf("expected to get 1 notifiers; got %d", len(cw.notifiers()))
}
n3 = cw.notifiers()[0]
if n3.Addr() != expAddr3 {
t.Fatalf("exp address %q; got %q", expAddr3, n3.Addr())
}
if n3.(*AlertManager).relabelConfigs.String() != "- target_label: foo\n replacement: tar\n" {
t.Fatalf("unexpected relabel configs: %q", n3.(*AlertManager).relabelConfigs.String())
}
}
// TestConfigWatcherReloadConcurrent supposed to test concurrent
@@ -164,7 +200,11 @@ consul_sd_configs:
"unknownFields.bad.yaml",
}
cw, err := newWatcher(paths[0], nil)
cfg, err := parseConfig(paths[0])
if err != nil {
t.Fatalf("failed to parse config: %s", err)
}
cw, err := newWatcher(cfg, nil)
if err != nil {
t.Fatalf("failed to start config watcher: %s", err)
}
@@ -172,18 +212,16 @@ consul_sd_configs:
const workers = 500
const iterations = 10
wg := sync.WaitGroup{}
wg.Add(workers)
for i := 0; i < workers; i++ {
go func(n int) {
defer wg.Done()
var wg sync.WaitGroup
for n := range workers {
wg.Go(func() {
r := rand.New(rand.NewSource(int64(n)))
for i := 0; i < iterations; i++ {
for range iterations {
rnd := r.Intn(len(paths))
_ = cw.reload(paths[rnd]) // update can fail and this is expected
_ = cw.notifiers()
}
}(i)
})
}
wg.Wait()
}
@@ -202,10 +240,11 @@ func checkErr(t *testing.T, err error) {
const (
fakeConsulService1 = "127.0.0.1:9093"
fakeConsulService2 = "127.0.0.1:9095"
fakeConsulService3 = "127.0.0.1:9097"
)
func newFakeConsulServer() *httptest.Server {
requestCount := 0
var requestCount atomic.Int32
mux := http.NewServeMux()
mux.HandleFunc("/v1/agent/self", func(rw http.ResponseWriter, _ *http.Request) {
rw.Write([]byte(`{"Config": {"Datacenter": "dc1"}}`))
@@ -220,7 +259,7 @@ func newFakeConsulServer() *httptest.Server {
}`))
})
mux.HandleFunc("/v1/health/service/alertmanager", func(rw http.ResponseWriter, _ *http.Request) {
if requestCount == 0 {
if requestCount.Load() == 0 {
rw.Header().Set("X-Consul-Index", "1")
rw.Write([]byte(`
[
@@ -360,7 +399,7 @@ func newFakeConsulServer() *httptest.Server {
}
]`))
}
requestCount++
requestCount.Add(1)
})
return httptest.NewServer(mux)

View File

@@ -5,6 +5,8 @@ import (
"fmt"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
)
// FakeNotifier is a mock notifier
@@ -15,6 +17,19 @@ type FakeNotifier struct {
counter int
}
// InitFakeNotifier initializes global notifier to FakeNotifier,
// and returns a cleanup function to restore the original getActiveNotifiers.
func InitFakeNotifier() (*FakeNotifier, func()) {
originalGetActiveNotifiers := getActiveNotifiers
fn := &FakeNotifier{}
getActiveNotifiers = func() []Notifier {
return []Notifier{fn}
}
return fn, func() {
getActiveNotifiers = originalGetActiveNotifiers
}
}
// Close does nothing
func (*FakeNotifier) Close() {}
@@ -27,7 +42,7 @@ func (*FakeNotifier) LastError() string {
func (*FakeNotifier) Addr() string { return "" }
// Send sets alerts and increases counter
func (fn *FakeNotifier) Send(_ context.Context, alerts []Alert, _ map[string]string) error {
func (fn *FakeNotifier) Send(_ context.Context, alerts []Alert, _ [][]prompb.Label, _ map[string]string) error {
fn.Lock()
defer fn.Unlock()
fn.counter += len(alerts)

View File

@@ -1,17 +1,22 @@
package notifier
import (
"context"
"flag"
"fmt"
"net/url"
"strconv"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutil"
)
@@ -96,11 +101,25 @@ func InitAlertURLGeneratorFn(externalURL *url.URL, externalAlertSource string, v
return nil
}
// cw holds a configWatcher for configPath configuration file
// configWatcher provides a list of Notifier objects discovered
// from static config or via service discovery.
// cw is not nil only if configPath is provided.
var cw *configWatcher
var (
// getActiveNotifiers returns the current list of Notifier objects.
getActiveNotifiers func() []Notifier
// globalRelabelCfg stores the parsed alert relabeling config from the config file if there is
globalRelabelCfg *promrelabel.ParsedConfigs
// cw holds a configWatcher for configPath configuration file
// configWatcher provides a list of Notifier objects discovered
// from static config or via service discovery.
// cw is not nil only if configPath is provided.
cw *configWatcher
// externalLabels is a global variable for holding external labels configured via flags
// It is supposed to be inited via Init function only.
externalLabels map[string]string
// externalURL is a global variable for holding external URL value configured via flag
// It is supposed to be inited via Init function only.
externalURL string
)
// Reload checks the changes in configPath configuration file
// and applies changes if any.
@@ -111,66 +130,62 @@ func Reload() error {
return cw.reload(*configPath)
}
var staticNotifiersFn func() []Notifier
var (
// externalLabels is a global variable for holding external labels configured via flags
// It is supposed to be inited via Init function only.
externalLabels map[string]string
// externalURL is a global variable for holding external URL value configured via flag
// It is supposed to be inited via Init function only.
externalURL string
)
// Init returns a function for retrieving actual list of Notifier objects.
// Init works in two mods:
// - configuration via flags (for backward compatibility). Is always static
// and don't support live reloads.
// - configuration via file. Supports live reloads and service discovery.
//
// Init returns an error if both mods are used.
func Init(extLabels map[string]string, extURL string) (func() []Notifier, error) {
func Init(extLabels map[string]string, extURL string) error {
externalURL = extURL
externalLabels = extLabels
_, err := url.Parse(externalURL)
if err != nil {
return nil, fmt.Errorf("failed to parse external URL: %w", err)
return fmt.Errorf("failed to parse external URL: %w", err)
}
if *blackHole {
if len(*addrs) > 0 || *configPath != "" {
return nil, fmt.Errorf("only one of -notifier.blackhole, -notifier.url and -notifier.config flags must be specified")
return fmt.Errorf("only one of -notifier.blackhole, -notifier.url and -notifier.config flags must be specified")
}
notifier := newBlackHoleNotifier()
staticNotifiersFn = func() []Notifier {
getActiveNotifiers = func() []Notifier {
return []Notifier{notifier}
}
return staticNotifiersFn, nil
return nil
}
if *configPath == "" && len(*addrs) == 0 {
return nil, nil
return nil
}
if *configPath != "" && len(*addrs) > 0 {
return nil, fmt.Errorf("only one of -notifier.config or -notifier.url flags must be specified")
return fmt.Errorf("only one of -notifier.config or -notifier.url flags must be specified")
}
if len(*addrs) > 0 {
notifiers, err := notifiersFromFlags(AlertURLGeneratorFn)
if err != nil {
return nil, fmt.Errorf("failed to create notifier from flag values: %w", err)
return fmt.Errorf("failed to create notifier from flag values: %w", err)
}
staticNotifiersFn = func() []Notifier {
getActiveNotifiers = func() []Notifier {
return notifiers
}
return staticNotifiersFn, nil
return nil
}
cw, err = newWatcher(*configPath, AlertURLGeneratorFn)
cfg, err := parseConfig(*configPath)
if err != nil {
return nil, fmt.Errorf("failed to init config watcher: %w", err)
return err
}
return cw.notifiers, nil
if cfg.AlertRelabelConfigs != nil {
globalRelabelCfg = cfg.parsedAlertRelabelConfigs
}
cw, err = newWatcher(cfg, AlertURLGeneratorFn)
if err != nil {
return fmt.Errorf("failed to init config watcher: %w", err)
}
getActiveNotifiers = cw.notifiers
return nil
}
// InitSecretFlags must be called after flag.Parse and before any logging
@@ -214,6 +229,9 @@ func notifiersFromFlags(gen AlertURLGenerator) ([]Notifier, error) {
Headers: []string{headers.GetOptionalArg(i)},
}
if err := httputil.CheckURL(addr); err != nil {
return nil, fmt.Errorf("invalid notifier.url %q: %w", addr, err)
}
addr = strings.TrimSuffix(addr, "/")
am, err := NewAlertManager(addr+alertManagerPath, gen, authCfg, nil, sendTimeout.GetOptionalArg(i))
if err != nil {
@@ -245,23 +263,58 @@ const (
// GetTargets returns list of static or discovered targets
// via notifier configuration.
//
// Must be called after Init.
func GetTargets() map[TargetType][]Target {
var targets = make(map[TargetType][]Target)
if staticNotifiersFn != nil {
for _, ns := range staticNotifiersFn() {
targets[TargetStatic] = append(targets[TargetStatic], Target{
Notifier: ns,
})
}
if getActiveNotifiers == nil {
return nil
}
targets := make(map[TargetType][]Target)
// use cached targets from configWatcher instead of getActiveNotifiers for the extra target labels
if cw != nil {
cw.targetsMu.RLock()
for key, ns := range cw.targets {
targets[key] = append(targets[key], ns...)
}
cw.targetsMu.RUnlock()
return targets
}
// static notifiers don't have labels
for _, ns := range getActiveNotifiers() {
targets[TargetStatic] = append(targets[TargetStatic], Target{
Notifier: ns,
})
}
return targets
}
// Send sends alerts to all active notifiers
func Send(ctx context.Context, alerts []Alert, notifierHeaders map[string]string) chan error {
alertsToSend := make([]Alert, 0, len(alerts))
lblss := make([][]prompb.Label, 0, len(alerts))
// apply global relabel config first without modifying original alerts in alerts
for _, a := range alerts {
lbls := a.applyRelabelingIfNeeded(globalRelabelCfg)
if len(lbls) == 0 {
continue
}
alertsToSend = append(alertsToSend, a)
lblss = append(lblss, lbls)
}
wg := sync.WaitGroup{}
activeNotifiers := getActiveNotifiers()
errCh := make(chan error, len(activeNotifiers))
defer close(errCh)
for i := range activeNotifiers {
nt := activeNotifiers[i]
wg.Go(func() {
if err := nt.Send(ctx, alertsToSend, lblss, notifierHeaders); err != nil {
errCh <- fmt.Errorf("failed to send alerts to addr %q: %w", nt.Addr(), err)
}
})
}
wg.Wait()
return errCh
}

View File

@@ -1,11 +1,17 @@
package notifier
import (
"context"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"net/url"
"os"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
)
func TestInit(t *testing.T) {
@@ -14,14 +20,13 @@ func TestInit(t *testing.T) {
*addrs = flagutil.ArrayString{"127.0.0.1", "127.0.0.2"}
fn, err := Init(nil, "")
err := Init(nil, "")
if err != nil {
t.Fatalf("%s", err)
}
nfs := fn()
if len(nfs) != 2 {
t.Fatalf("expected to get 2 notifiers; got %d", len(nfs))
if len(getActiveNotifiers()) != 2 {
t.Fatalf("expected to get 2 notifiers; got %d", len(getActiveNotifiers()))
}
targets := GetTargets()
@@ -50,19 +55,22 @@ func TestInitNegative(t *testing.T) {
*blackHole = oldBlackHole
}()
f := func(path, addr string, bh bool) {
f := func(path string, addr []string, bh bool) {
*configPath = path
*addrs = flagutil.ArrayString{addr}
*addrs = flagutil.ArrayString(addr)
*blackHole = bh
if _, err := Init(nil, ""); err == nil {
if err := Init(nil, ""); err == nil {
t.Fatalf("expected to get error; got nil instead")
}
}
// *configPath, *addrs and *blackhole are mutually exclusive
f("/dummy/path", "127.0.0.1", false)
f("/dummy/path", "", true)
f("", "127.0.0.1", true)
f("/dummy/path", []string{"127.0.0.1"}, false)
f("/dummy/path", []string{}, true)
f("", []string{"127.0.0.1"}, true)
// addr cannot be ""
f("", []string{""}, false)
f("", []string{"127.0.0.1", ""}, false)
}
func TestBlackHole(t *testing.T) {
@@ -71,14 +79,13 @@ func TestBlackHole(t *testing.T) {
*blackHole = true
fn, err := Init(nil, "")
err := Init(nil, "")
if err != nil {
t.Fatalf("%s", err)
}
nfs := fn()
if len(nfs) != 1 {
t.Fatalf("expected to get 1 notifier; got %d", len(nfs))
if len(getActiveNotifiers()) != 1 {
t.Fatalf("expected to get 1 notifier; got %d", len(getActiveNotifiers()))
}
targets := GetTargets()
@@ -120,3 +127,87 @@ func TestGetAlertURLGenerator(t *testing.T) {
t.Fatalf("unexpected url want %s, got %s", exp, AlertURLGeneratorFn(testAlert))
}
}
func TestSendAlerts(t *testing.T) {
oldAlertURLGeneratorFn := AlertURLGeneratorFn
defer func() { AlertURLGeneratorFn = oldAlertURLGeneratorFn }()
AlertURLGeneratorFn = func(alert Alert) string {
return ""
}
mux := http.NewServeMux()
mux.HandleFunc("/", func(_ http.ResponseWriter, _ *http.Request) {
t.Fatalf("should not be called")
})
mux.HandleFunc(alertManagerPath, func(w http.ResponseWriter, r *http.Request) {
var a []struct {
Labels map[string]string `json:"labels"`
}
if err := json.NewDecoder(r.Body).Decode(&a); err != nil {
t.Fatalf("can not unmarshal data into alert %s", err)
}
if len(a) != 2 {
t.Fatalf("expected 2 alert in array got %d", len(a))
}
if len(a[0].Labels) != 4 {
t.Fatalf("expected 4 labels got %d", len(a[0].Labels))
}
if a[0].Labels["env"] != "prod" {
t.Fatalf("expected env label to be prod during relabeling, got %s", a[0].Labels["env"])
}
if a[0].Labels["c"] != "baz" {
t.Fatalf("expected c label to be baz during relabeling, got %s", a[0].Labels["c"])
}
if len(a[1].Labels) != 1 {
t.Fatalf("expected 1 labels got %d", len(a[1].Labels))
}
})
srv := httptest.NewServer(mux)
defer srv.Close()
f, err := os.CreateTemp("", "")
if err != nil {
t.Fatal(err)
}
defer fs.MustRemovePath(f.Name())
rawConfig := `
static_configs:
- targets:
- %s
alert_relabel_configs:
- source_labels: [b]
target_label: "c"
alert_relabel_configs:
- source_labels: [a]
target_label: "b"
- target_label: "env"
replacement: "prod"
`
config := fmt.Sprintf(rawConfig, srv.URL+alertManagerPath)
writeToFile(f.Name(), config)
oldConfigPath := configPath
defer func() { configPath = oldConfigPath }()
*configPath = f.Name()
err = Init(nil, "")
if err != nil {
t.Fatalf("unexpected error when parse notifier config: %s", err)
}
firingAlerts := []Alert{
{
Name: "alert1",
Labels: map[string]string{"a": "baz"},
},
{
Name: "alert2",
Labels: map[string]string{},
},
}
errG := Send(context.Background(), firingAlerts, nil)
for err := range errG {
if err != nil {
t.Errorf("unexpected error when sending alerts: %s", err)
}
}
}

View File

@@ -1,13 +1,17 @@
package notifier
import "context"
import (
"context"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
)
// Notifier is a common interface for alert manager provider
type Notifier interface {
// Send sends the given list of alerts.
// Returns an error if fails to send the alerts.
// Must unblock if the given ctx is cancelled.
Send(ctx context.Context, alerts []Alert, notifierHeaders map[string]string) error
Send(ctx context.Context, alerts []Alert, alertLabels [][]prompb.Label, notifierHeaders map[string]string) error
// Addr returns address where alerts are sent.
Addr() string
// LastError returns error, that occured during last attempt to send data

View File

@@ -1,6 +1,10 @@
package notifier
import "context"
import (
"context"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
)
// blackHoleNotifier is a Notifier stub, used when no notifications need
// to be sent.
@@ -10,7 +14,7 @@ type blackHoleNotifier struct {
}
// Send will send no notifications, but increase the metric.
func (bh *blackHoleNotifier) Send(_ context.Context, alerts []Alert, _ map[string]string) error { //nolint:revive
func (bh *blackHoleNotifier) Send(_ context.Context, alerts []Alert, _ [][]prompb.Label, _ map[string]string) error { //nolint:revive
bh.metrics.alertsSent.Add(len(alerts))
return nil
}

View File

@@ -5,6 +5,7 @@ import (
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
metricset "github.com/VictoriaMetrics/metrics"
)
@@ -16,7 +17,7 @@ func TestBlackHoleNotifier_Send(t *testing.T) {
Start: time.Now().UTC(),
End: time.Now().UTC(),
Annotations: map[string]string{"a": "b", "c": "d", "e": "f"},
}}, nil); err != nil {
}}, [][]prompb.Label{{}}, nil); err != nil {
t.Fatalf("unexpected error %s", err)
}
@@ -34,7 +35,7 @@ func TestBlackHoleNotifier_Close(t *testing.T) {
Start: time.Now().UTC(),
End: time.Now().UTC(),
Annotations: map[string]string{"a": "b", "c": "d", "e": "f"},
}}, nil); err != nil {
}}, [][]prompb.Label{{}}, nil); err != nil {
t.Fatalf("unexpected error %s", err)
}

View File

@@ -0,0 +1,19 @@
consul_sd_configs:
- server: localhost:8500
scheme: http
services:
- alertmanager
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "prod"
- server: localhost:8500
services:
- consul
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "(abc"
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"

View File

@@ -0,0 +1,13 @@
dns_sd_configs:
- names:
- cloudflare.com
type: 'A'
port: 9093
relabel_configs:
- source_labels: [__meta_dns_name]
replacement: '${1}'
target_label: dns_name
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "(abc"

View File

@@ -2,12 +2,19 @@ static_configs:
- targets:
- localhost:9093
- localhost:9095
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "static"
consul_sd_configs:
- server: localhost:8500
scheme: http
services:
- alertmanager
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "consul"
- server: localhost:8500
services:
- consul
@@ -17,6 +24,10 @@ dns_sd_configs:
- cloudflare.com
type: 'A'
port: 9093
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "dns"
relabel_configs:
- source_labels: [__meta_consul_tags]
@@ -25,4 +36,4 @@ relabel_configs:
target_label: __scheme__
- source_labels: [__meta_dns_name]
replacement: '${1}'
target_label: dns_name
target_label: dns_name

View File

@@ -1,22 +1,14 @@
headers:
- 'CustomHeader: foo'
static_configs:
- targets:
- localhost:9093
- localhost:9095
- https://localhost:9093/test/api/v2/alerts
basic_auth:
username: foo
password: bar
- http://192.168.0.101:9093
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"
- targets:
- localhost:9096
- localhost:9097
basic_auth:
username: foo
password: baz
- http://192.168.0.101:9093
alert_relabel_configs:
- target_label: "foo"
replacement: "ccc"
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"

View File

@@ -14,9 +14,9 @@ import (
)
var (
addr = flag.String("remoteRead.url", "", "Optional URL to datasource compatible with MetricsQL. It can be single node VictoriaMetrics or vmselect."+
"Remote read is used to restore alerts state."+
"This configuration makes sense only if `vmalert` was configured with `remoteWrite.url` before and has been successfully persisted its state. "+
addr = flag.String("remoteRead.url", "", "Optional URL to datasource compatible with MetricsQL. It can be single node VictoriaMetrics or vmselect. "+
"Remote read is used to restore alerts state. "+
"This configuration makes sense only if vmalert was configured with '-remoteWrite.url' before and has been successfully persisted its state. "+
"Supports address in the form of IP address with a port (e.g., http://127.0.0.1:8428) or DNS SRV record. "+
"See also '-remoteRead.disablePathAppend', '-remoteRead.showURL'.")

View File

@@ -13,14 +13,18 @@ import (
"sync"
"time"
"github.com/cespare/xxhash/v2"
"github.com/golang/snappy"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeutil"
"github.com/VictoriaMetrics/metrics"
)
@@ -114,7 +118,9 @@ func NewClient(ctx context.Context, cfg Config) (*Client, error) {
}
for i := 0; i < cc; i++ {
c.run(ctx)
c.wg.Go(func() {
c.run(ctx, i)
})
}
return c, nil
}
@@ -156,8 +162,7 @@ func (c *Client) Close() error {
return nil
}
func (c *Client) run(ctx context.Context) {
ticker := time.NewTicker(c.flushInterval)
func (c *Client) run(ctx context.Context, id int) {
wr := &prompb.WriteRequest{}
shutdown := func() {
lastCtx, cancel := context.WithTimeout(context.Background(), defaultWriteTimeout)
@@ -173,42 +178,73 @@ func (c *Client) run(ctx context.Context) {
cancel()
}
c.wg.Add(1)
go func() {
defer c.wg.Done()
defer ticker.Stop()
for {
// add jitter to spread remote write flushes over the flush interval to avoid congestion at the remote write destination
h := xxhash.Sum64(bytesutil.ToUnsafeBytes(fmt.Sprintf("%d", id)))
randJitter := uint64(float64(c.flushInterval) * (float64(h) / (1 << 64)))
timer := time.NewTimer(time.Duration(randJitter))
addJitter:
for {
select {
case <-c.doneCh:
timer.Stop()
shutdown()
return
case <-ctx.Done():
timer.Stop()
shutdown()
return
case <-timer.C:
break addJitter
}
}
ticker := time.NewTicker(c.flushInterval)
defer ticker.Stop()
for {
select {
case <-c.doneCh:
shutdown()
return
case <-ctx.Done():
shutdown()
return
case <-ticker.C:
c.flush(ctx, wr)
// drain the potential stale tick to avoid small or empty flushes after a slow flush.
select {
case <-c.doneCh:
shutdown()
return
case <-ctx.Done():
shutdown()
return
case <-ticker.C:
default:
}
case ts, ok := <-c.input:
if !ok {
continue
}
wr.Timeseries = append(wr.Timeseries, ts)
if len(wr.Timeseries) >= c.maxBatchSize {
c.flush(ctx, wr)
case ts, ok := <-c.input:
if !ok {
continue
}
wr.Timeseries = append(wr.Timeseries, ts)
if len(wr.Timeseries) >= c.maxBatchSize {
c.flush(ctx, wr)
}
}
}
}()
}
}
var (
rwErrors = metrics.NewCounter(`vmalert_remotewrite_errors_total`)
rwTotal = metrics.NewCounter(`vmalert_remotewrite_total`)
sentRows = metrics.NewCounter(`vmalert_remotewrite_sent_rows_total`)
sentBytes = metrics.NewCounter(`vmalert_remotewrite_sent_bytes_total`)
droppedRows = metrics.NewCounter(`vmalert_remotewrite_dropped_rows_total`)
sendDuration = metrics.NewFloatCounter(`vmalert_remotewrite_send_duration_seconds_total`)
bufferFlushDuration = metrics.NewHistogram(`vmalert_remotewrite_flush_duration_seconds`)
// sentRows and sentBytes are historical counters that can now be replaced by flushedRows and flushedBytes histograms. They may be deprecated in the future after the new histograms have been adopted for some time.
sentRows = metrics.NewCounter(`vmalert_remotewrite_sent_rows_total`)
sentBytes = metrics.NewCounter(`vmalert_remotewrite_sent_bytes_total`)
flushedRows = metrics.NewHistogram(`vmalert_remotewrite_sent_rows`)
flushedBytes = metrics.NewHistogram(`vmalert_remotewrite_sent_bytes`)
droppedRows = metrics.NewCounter(`vmalert_remotewrite_dropped_rows_total`)
sendDuration = metrics.NewFloatCounter(`vmalert_remotewrite_send_duration_seconds_total`)
bufferFlushDuration = metrics.NewHistogram(`vmalert_remotewrite_flush_duration_seconds`)
remoteWriteQueueSize = metrics.NewHistogram(`vmalert_remotewrite_queue_size`)
_ = metrics.NewGauge(`vmalert_remotewrite_queue_capacity`, func() float64 {
return float64(*maxQueueSize)
})
_ = metrics.NewGauge(`vmalert_remotewrite_concurrency`, func() float64 {
return float64(*concurrency)
@@ -222,6 +258,7 @@ func GetDroppedRows() int { return int(droppedRows.Get()) }
// it to remote-write endpoint. Flush performs limited amount of retries
// if request fails.
func (c *Client) flush(ctx context.Context, wr *prompb.WriteRequest) {
remoteWriteQueueSize.Update(float64(len(c.input)))
if len(wr.Timeseries) < 1 {
return
}
@@ -231,16 +268,16 @@ func (c *Client) flush(ctx context.Context, wr *prompb.WriteRequest) {
data := wr.MarshalProtobuf(nil)
b := snappy.Encode(nil, data)
retryInterval, maxRetryInterval := *retryMinInterval, *retryMaxTime
if retryInterval > maxRetryInterval {
retryInterval = maxRetryInterval
}
maxRetryInterval := *retryMaxTime
bt := timeutil.NewBackoffTimer(*retryMinInterval, maxRetryInterval)
timeStart := time.Now()
defer func() {
sendDuration.Add(time.Since(timeStart).Seconds())
}()
attempts := 0
L:
for attempts := 0; ; attempts++ {
for {
err := c.send(ctx, b)
if err != nil && (errors.Is(err, io.EOF) || netutil.IsTrivialNetworkError(err)) {
// Something in the middle between client and destination might be closing
@@ -250,6 +287,8 @@ L:
if err == nil {
sentRows.Add(len(wr.Timeseries))
sentBytes.Add(len(b))
flushedRows.Update(float64(len(wr.Timeseries)))
flushedBytes.Update(float64(len(b)))
return
}
@@ -275,13 +314,13 @@ L:
break
}
if retryInterval > timeLeftForRetries {
retryInterval = timeLeftForRetries
if bt.CurrentDelay() > timeLeftForRetries {
bt.SetDelay(timeLeftForRetries)
}
// sleeping to prevent remote db hammering
time.Sleep(retryInterval)
retryInterval *= 2
bt.Wait(ctx.Done())
attempts++
}
rwErrors.Inc()

View File

@@ -44,7 +44,7 @@ func TestClient_Push(t *testing.T) {
r := rand.New(rand.NewSource(1))
const rowsN = int(1e4)
for i := 0; i < rowsN; i++ {
for range rowsN {
s := prompb.TimeSeries{
Samples: []prompb.Sample{{
Value: r.Float64(),
@@ -102,7 +102,7 @@ func TestClient_run_maxBatchSizeDuringShutdown(t *testing.T) {
}
// push time series to the client.
for i := 0; i < pushCnt; i++ {
for range pushCnt {
if err = rwClient.Push(prompb.TimeSeries{}); err != nil {
t.Fatalf("cannot time series to the client: %s", err)
}

View File

@@ -22,7 +22,7 @@ func TestDebugClient_Push(t *testing.T) {
const rowsN = 100
var sent int
for i := 0; i < rowsN; i++ {
for i := range rowsN {
s := prompb.TimeSeries{
Samples: []prompb.Sample{{
Value: float64(i),

View File

@@ -2,6 +2,7 @@ package rule
import (
"context"
"errors"
"fmt"
"hash/fnv"
"math"
@@ -246,16 +247,6 @@ func (ar *AlertingRule) GetAlerts() []*notifier.Alert {
return alerts
}
// GetAlert returns alert if id exists
func (ar *AlertingRule) GetAlert(id uint64) *notifier.Alert {
ar.alertsMu.RLock()
defer ar.alertsMu.RUnlock()
if ar.alerts == nil {
return nil
}
return ar.alerts[id]
}
func (ar *AlertingRule) logDebugf(at time.Time, a *notifier.Alert, format string, args ...any) {
if !ar.Debug {
return
@@ -321,6 +312,11 @@ type labelSet struct {
// On k conflicts in origin set, the original value is preferred and copied
// to processed with `exported_%k` key. The copy happens only if passed v isn't equal to origin[k] value.
func (ls *labelSet) add(k, v string) {
// do not add label with empty value, since it has no meaning.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
if v == "" {
return
}
ls.processed[k] = v
ov, ok := ls.origin[k]
if !ok {
@@ -350,14 +346,13 @@ func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*l
ls.processed[l.Name] = l.Value
}
// labels only support limited templating variables,
// including `labels`, `value` and `expr`, to avoid breaking alert states or causing cardinality issue with results
extraLabels, err := notifier.ExecTemplate(qFn, ar.Labels, notifier.AlertTplData{
Labels: ls.origin,
Value: m.Values[0],
Expr: ar.Expr,
})
if err != nil {
return nil, fmt.Errorf("failed to expand labels: %w", err)
}
for k, v := range extraLabels {
ls.add(k, v)
}
@@ -368,7 +363,7 @@ func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*l
if !*disableAlertGroupLabel && ar.GroupName != "" {
ls.add(alertGroupNameLabel, ar.GroupName)
}
return ls, nil
return ls, err
}
// execRange executes alerting rule on the given time range similarly to exec.
@@ -394,11 +389,7 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
return nil, err
}
alertID := hash(ls.processed)
as, err := ar.expandAnnotationTemplates(s, qFn, time.Time{}, ls)
if err != nil {
return nil, err
}
a := ar.newAlert(s, time.Time{}, ls.processed, as) // initial alert
a := ar.newAlert(s, time.Time{}, ls.processed, nil) // initial alert
prevT := time.Time{}
for i := range s.Values {
@@ -414,8 +405,6 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
// reset to Pending if there are gaps > EvalInterval between DPs
a.State = notifier.StatePending
a.ActiveAt = at
// re-template the annotations as active timestamp is changed
a.Annotations, _ = ar.expandAnnotationTemplates(s, qFn, at, ls)
a.Start = time.Time{}
} else if at.Sub(a.ActiveAt) >= ar.For && a.State != notifier.StateFiring {
a.State = notifier.StateFiring
@@ -461,7 +450,7 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
defer func() {
ar.state.add(curState)
if curState.Err != nil {
if curState.Err != nil && !errors.Is(curState.Err, context.Canceled) {
ar.metrics.errors.Inc()
}
}()
@@ -470,7 +459,8 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
return nil, fmt.Errorf("failed to execute query %q: %w", ar.Expr, err)
}
ar.logDebugf(ts, nil, "query returned %d series (elapsed: %s, isPartial: %t)", curState.Samples, curState.Duration, isPartialResponse(res))
isPartial := isPartialResponse(res)
ar.logDebugf(ts, nil, "query returned %d series (elapsed: %s, isPartial: %t)", curState.Samples, curState.Duration, isPartial)
qFn := func(query string) ([]datasource.Metric, error) {
res, _, err := ar.q.Query(ctx, query, ts)
return res.Data, err
@@ -484,8 +474,9 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
for i, m := range res.Data {
ls, err := ar.expandLabelTemplates(m, qFn)
if err != nil {
// only set error in current state, but do not break alert processing
curState.Err = err
return nil, curState.Err
logger.Errorf("got templating error in rule %s: %q", ar.Name, err)
}
at := ts
alertID := hash(ls.processed)
@@ -495,10 +486,11 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
at = a.ActiveAt
}
}
as, err := ar.expandAnnotationTemplates(m, qFn, at, ls)
as, err := ar.expandAnnotationTemplates(m, qFn, at, ls, isPartial)
if err != nil {
// only set error in current state, but do not break alert processing
curState.Err = err
return nil, curState.Err
logger.Errorf("got templating error in rule %s: %q", ar.Name, err)
}
expandedLabels[i] = ls
expandedAnnotations[i] = as
@@ -607,25 +599,26 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
func (ar *AlertingRule) expandLabelTemplates(m datasource.Metric, qFn templates.QueryFn) (*labelSet, error) {
ls, err := ar.toLabels(m, qFn)
if err != nil {
return nil, fmt.Errorf("failed to expand label templates: %s", err)
return ls, fmt.Errorf("failed to expand label templates: %s", err)
}
return ls, nil
}
func (ar *AlertingRule) expandAnnotationTemplates(m datasource.Metric, qFn templates.QueryFn, activeAt time.Time, ls *labelSet) (map[string]string, error) {
func (ar *AlertingRule) expandAnnotationTemplates(m datasource.Metric, qFn templates.QueryFn, activeAt time.Time, ls *labelSet, isPartial bool) (map[string]string, error) {
tplData := notifier.AlertTplData{
Value: m.Values[0],
Type: ar.Type.String(),
Labels: ls.origin,
Expr: ar.Expr,
AlertID: hash(ls.processed),
GroupID: ar.GroupID,
ActiveAt: activeAt,
For: ar.For,
Value: m.Values[0],
Type: ar.Type.String(),
Labels: ls.origin,
Expr: ar.Expr,
AlertID: hash(ls.processed),
GroupID: ar.GroupID,
ActiveAt: activeAt,
For: ar.For,
IsPartial: isPartial,
}
as, err := notifier.ExecTemplate(qFn, ar.Annotations, tplData)
if err != nil {
return nil, fmt.Errorf("failed to expand annotation templates: %s", err)
return as, fmt.Errorf("failed to expand annotation templates: %s", err)
}
return as, nil
}
@@ -825,7 +818,9 @@ func (ar *AlertingRule) restore(ctx context.Context, q datasource.Querier, ts ti
expr := fmt.Sprintf("default_rollup(%s{%s%s}[%ds])",
alertForStateMetricName, nameStr, labelsFilter, int(lookback.Seconds()))
res, _, err := q.Query(ctx, expr, ts)
// query ALERTS_FOR_STATE at `ts-1s` instead `ts` to avoid retrieving data written in the current run,
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10335
res, _, err := q.Query(ctx, expr, ts.Add(-1*time.Second))
if err != nil {
return fmt.Errorf("failed to execute restore query %q: %w ", expr, err)
}

View File

@@ -0,0 +1,106 @@
//go:build synctest
package rule
import (
"context"
"strings"
"testing"
"testing/synctest"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
)
// TestAlertingRule_ActiveAtPreservedInAnnotations ensures that the fix for
// https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9543 is preserved
// while allowing query templates in labels (https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9783)
func TestAlertingRule_ActiveAtPreservedInAnnotations(t *testing.T) {
// wrap into synctest because of time manipulations
synctest.Test(t, func(t *testing.T) {
fq := &datasource.FakeQuerier{}
ar := &AlertingRule{
Name: "TestActiveAtPreservation",
Labels: map[string]string{
"test_query_in_label": `{{ "static_value" }}`,
},
Annotations: map[string]string{
"description": "Alert active since {{ $activeAt }}",
},
alerts: make(map[uint64]*notifier.Alert),
q: fq,
state: &ruleState{
entries: make([]StateEntry, 10),
},
}
// Mock query result - return empty result to make suppress_for_mass_alert = false
// (no need to add anything to fq for empty result)
// Add a metric that should trigger the alert
fq.Add(metricWithValueAndLabels(t, 1, "instance", "server1"))
// First execution - creates new alert
ts1 := time.Now()
_, err := ar.exec(context.TODO(), ts1, 0)
if err != nil {
t.Fatalf("unexpected error on first exec: %s", err)
}
if len(ar.alerts) != 1 {
t.Fatalf("expected 1 alert, got %d", len(ar.alerts))
}
firstAlert := ar.GetAlerts()[0]
// Verify first execution: activeAt should be ts1 and annotation should reflect it
if !firstAlert.ActiveAt.Equal(ts1) {
t.Fatalf("expected activeAt to be %v, got %v", ts1, firstAlert.ActiveAt)
}
// Extract time from annotation (format will be like "Alert active since 2025-09-30 08:55:13.638551611 -0400 EDT m=+0.002928464")
expectedTimeStr := ts1.Format("2006-01-02 15:04:05")
if !strings.Contains(firstAlert.Annotations["description"], expectedTimeStr) {
t.Fatalf("first exec annotation should contain time %s, got: %s", expectedTimeStr, firstAlert.Annotations["description"])
}
// Second execution - should preserve activeAt in annotation
// Ensure different timestamp with different seconds
// sleep is non-blocking thanks to synctest
time.Sleep(2 * time.Second)
ts2 := time.Now()
_, err = ar.exec(context.TODO(), ts2, 0)
if err != nil {
t.Fatalf("unexpected error on second exec: %s", err)
}
// Get the alert again (should be the same alert)
if len(ar.alerts) != 1 {
t.Fatalf("expected 1 alert, got %d", len(ar.alerts))
}
secondAlert := ar.GetAlerts()[0]
// Critical test: activeAt should still be ts1, not ts2
if !secondAlert.ActiveAt.Equal(ts1) {
t.Fatalf("activeAt should be preserved as %v, but got %v", ts1, secondAlert.ActiveAt)
}
// Critical test: annotation should still contain ts1 time, not ts2
if !strings.Contains(secondAlert.Annotations["description"], expectedTimeStr) {
t.Fatalf("second exec annotation should still contain original time %s, got: %s", expectedTimeStr, secondAlert.Annotations["description"])
}
// Additional verification: annotation should NOT contain ts2 time
ts2TimeStr := ts2.Format("2006-01-02 15:04:05")
if strings.Contains(secondAlert.Annotations["description"], ts2TimeStr) {
t.Fatalf("annotation should NOT contain new eval time %s, got: %s", ts2TimeStr, secondAlert.Annotations["description"])
}
// Verify query template in labels still works (this would fail if query templates were broken)
if firstAlert.Labels["test_query_in_label"] != "static_value" {
t.Fatalf("expected test_query_in_label=static_value, got %s", firstAlert.Labels["test_query_in_label"])
}
})
}

View File

@@ -10,7 +10,6 @@ import (
"strings"
"sync"
"testing"
"testing/synctest"
"time"
"github.com/VictoriaMetrics/metrics"
@@ -664,7 +663,7 @@ func TestAlertingRuleExecRange(t *testing.T) {
Name: "for-pending",
Type: config.NewPrometheusType().String(),
Labels: map[string]string{"alertname": "for-pending"},
Annotations: map[string]string{"activeAt": "5000"},
Annotations: map[string]string{},
State: notifier.StatePending,
ActiveAt: time.Unix(5, 0),
Value: 1,
@@ -684,7 +683,7 @@ func TestAlertingRuleExecRange(t *testing.T) {
Name: "for-firing",
Type: config.NewPrometheusType().String(),
Labels: map[string]string{"alertname": "for-firing"},
Annotations: map[string]string{"activeAt": "1000"},
Annotations: map[string]string{},
State: notifier.StateFiring,
ActiveAt: time.Unix(1, 0),
Start: time.Unix(5, 0),
@@ -705,7 +704,7 @@ func TestAlertingRuleExecRange(t *testing.T) {
Name: "for-hold-pending",
Type: config.NewPrometheusType().String(),
Labels: map[string]string{"alertname": "for-hold-pending"},
Annotations: map[string]string{"activeAt": "5000"},
Annotations: map[string]string{},
State: notifier.StatePending,
ActiveAt: time.Unix(5, 0),
Value: 1,
@@ -827,12 +826,9 @@ func TestGroup_Restore(t *testing.T) {
fg := NewGroup(config.Group{Name: "TestRestore", Rules: rules}, fqr, time.Second, nil)
fg.Init()
wg := sync.WaitGroup{}
wg.Add(1)
go func() {
nts := func() []notifier.Notifier { return []notifier.Notifier{&notifier.FakeNotifier{}} }
fg.Start(context.Background(), nts, nil, fqr)
wg.Done()
}()
wg.Go(func() {
fg.Start(context.Background(), nil, fqr)
})
fg.Close()
wg.Wait()
@@ -1123,7 +1119,7 @@ func TestAlertingRuleLimit_Success(t *testing.T) {
}
func TestAlertingRule_Template(t *testing.T) {
f := func(rule *AlertingRule, metrics []datasource.Metric, alertsExpected map[uint64]*notifier.Alert) {
f := func(rule *AlertingRule, metrics []datasource.Metric, isResponsePartial bool, alertsExpected map[uint64]*notifier.Alert) {
t.Helper()
fakeGroup := Group{
@@ -1136,6 +1132,7 @@ func TestAlertingRule_Template(t *testing.T) {
entries: make([]StateEntry, 10),
}
fq.Add(metrics...)
fq.SetPartialResponse(isResponsePartial)
if _, err := rule.exec(context.TODO(), time.Now(), 0); err != nil {
t.Fatalf("unexpected error: %s", err)
@@ -1166,7 +1163,7 @@ func TestAlertingRule_Template(t *testing.T) {
}, []datasource.Metric{
metricWithValueAndLabels(t, 1, "instance", "foo"),
metricWithValueAndLabels(t, 1, "instance", "bar"),
}, map[uint64]*notifier.Alert{
}, false, map[uint64]*notifier.Alert{
hash(map[string]string{alertNameLabel: "common", "region": "east", "instance": "foo"}): {
Annotations: map[string]string{
"summary": `common: Too high connection number for "foo"`,
@@ -1195,14 +1192,14 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "{{ $labels.instance }}",
},
Annotations: map[string]string{
"summary": `{{ $labels.__name__ }}: Too high connection number for "{{ $labels.instance }}"`,
"summary": `{{ $labels.__name__ }}: Too high connection number for "{{ $labels.instance }}".{{ if $isPartial }} WARNING: Partial response detected - this alert may be incomplete. Please verify the results manually.{{ end }}`,
"description": `{{ $labels.alertname}}: It is {{ $value }} connections for "{{ $labels.instance }}"`,
},
alerts: make(map[uint64]*notifier.Alert),
}, []datasource.Metric{
metricWithValueAndLabels(t, 2, "__name__", "first", "instance", "foo", alertNameLabel, "override"),
metricWithValueAndLabels(t, 10, "__name__", "second", "instance", "bar", alertNameLabel, "override"),
}, map[uint64]*notifier.Alert{
}, false, map[uint64]*notifier.Alert{
hash(map[string]string{alertNameLabel: "override label", "exported_alertname": "override", "instance": "foo"}): {
Labels: map[string]string{
alertNameLabel: "override label",
@@ -1210,7 +1207,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "foo",
},
Annotations: map[string]string{
"summary": `first: Too high connection number for "foo"`,
"summary": `first: Too high connection number for "foo".`,
"description": `override: It is 2 connections for "foo"`,
},
},
@@ -1221,7 +1218,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "bar",
},
Annotations: map[string]string{
"summary": `second: Too high connection number for "bar"`,
"summary": `second: Too high connection number for "bar".`,
"description": `override: It is 10 connections for "bar"`,
},
},
@@ -1234,7 +1231,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "{{ $labels.instance }}",
},
Annotations: map[string]string{
"summary": `Alert "{{ $labels.alertname }}({{ $labels.alertgroup }})" for instance {{ $labels.instance }}`,
"summary": `Alert "{{ $labels.alertname }}({{ $labels.alertgroup }})" for instance {{ $labels.instance }}.{{ if $isPartial }} WARNING: Partial response detected - this alert may be incomplete. Please verify the results manually.{{ end }}`,
},
alerts: make(map[uint64]*notifier.Alert),
}, []datasource.Metric{
@@ -1242,7 +1239,7 @@ func TestAlertingRule_Template(t *testing.T) {
alertNameLabel, "originAlertname",
alertGroupNameLabel, "originGroupname",
"instance", "foo"),
}, map[uint64]*notifier.Alert{
}, true, map[uint64]*notifier.Alert{
hash(map[string]string{
alertNameLabel: "OriginLabels",
"exported_alertname": "originAlertname",
@@ -1258,7 +1255,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "foo",
},
Annotations: map[string]string{
"summary": `Alert "originAlertname(originGroupname)" for instance foo`,
"summary": `Alert "originAlertname(originGroupname)" for instance foo. WARNING: Partial response detected - this alert may be incomplete. Please verify the results manually.`,
},
},
})
@@ -1373,8 +1370,10 @@ func TestAlertingRule_ToLabels(t *testing.T) {
ar := &AlertingRule{
Labels: map[string]string{
"instance": "override", // this should override instance with new value
"group": "vmalert", // this shouldn't have effect since value in metric is equal
"instance": "override", // this should override instance with new value
"group": "vmalert", // this shouldn't have effect since value in metric is equal
"invalid_label": "{{ .Values.mustRuntimeFail }}",
"empty_label": "", // this should be dropped
},
Expr: "sum(vmalert_alerting_rules_error) by(instance, group, alertname) > 0",
Name: "AlertingRulesError",
@@ -1382,10 +1381,11 @@ func TestAlertingRule_ToLabels(t *testing.T) {
}
expectedOriginLabels := map[string]string{
"instance": "0.0.0.0:8800",
"group": "vmalert",
"alertname": "ConfigurationReloadFailure",
"alertgroup": "vmalert",
"instance": "0.0.0.0:8800",
"group": "vmalert",
"alertname": "ConfigurationReloadFailure",
"alertgroup": "vmalert",
"invalid_label": `error evaluating template: template: :1:298: executing "" at <.Values.mustRuntimeFail>: can't evaluate field Values in type notifier.tplData`,
}
expectedProcessedLabels := map[string]string{
@@ -1395,11 +1395,12 @@ func TestAlertingRule_ToLabels(t *testing.T) {
"exported_alertname": "ConfigurationReloadFailure",
"group": "vmalert",
"alertgroup": "vmalert",
"invalid_label": `error evaluating template: template: :1:298: executing "" at <.Values.mustRuntimeFail>: can't evaluate field Values in type notifier.tplData`,
}
ls, err := ar.toLabels(metric, nil)
if err != nil {
t.Fatalf("unexpected error: %s", err)
if err == nil || !strings.Contains(err.Error(), "error evaluating template") {
t.Fatalf("unexpected error %q", err.Error())
}
if !reflect.DeepEqual(ls.origin, expectedOriginLabels) {
@@ -1477,95 +1478,3 @@ func TestAlertingRule_QueryTemplateInLabels(t *testing.T) {
t.Fatalf("expected 'suppress_for_mass_alert' label to be 'true' or 'false', got '%s'", suppressLabel)
}
}
// TestAlertingRule_ActiveAtPreservedInAnnotations ensures that the fix for
// https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9543 is preserved
// while allowing query templates in labels (https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9783)
func TestAlertingRule_ActiveAtPreservedInAnnotations(t *testing.T) {
// wrap into synctest because of time manipulations
synctest.Test(t, func(t *testing.T) {
fq := &datasource.FakeQuerier{}
ar := &AlertingRule{
Name: "TestActiveAtPreservation",
Labels: map[string]string{
"test_query_in_label": `{{ "static_value" }}`,
},
Annotations: map[string]string{
"description": "Alert active since {{ $activeAt }}",
},
alerts: make(map[uint64]*notifier.Alert),
q: fq,
state: &ruleState{
entries: make([]StateEntry, 10),
},
}
// Mock query result - return empty result to make suppress_for_mass_alert = false
// (no need to add anything to fq for empty result)
// Add a metric that should trigger the alert
fq.Add(metricWithValueAndLabels(t, 1, "instance", "server1"))
// First execution - creates new alert
ts1 := time.Now()
_, err := ar.exec(context.TODO(), ts1, 0)
if err != nil {
t.Fatalf("unexpected error on first exec: %s", err)
}
if len(ar.alerts) != 1 {
t.Fatalf("expected 1 alert, got %d", len(ar.alerts))
}
firstAlert := ar.GetAlerts()[0]
// Verify first execution: activeAt should be ts1 and annotation should reflect it
if !firstAlert.ActiveAt.Equal(ts1) {
t.Fatalf("expected activeAt to be %v, got %v", ts1, firstAlert.ActiveAt)
}
// Extract time from annotation (format will be like "Alert active since 2025-09-30 08:55:13.638551611 -0400 EDT m=+0.002928464")
expectedTimeStr := ts1.Format("2006-01-02 15:04:05")
if !strings.Contains(firstAlert.Annotations["description"], expectedTimeStr) {
t.Fatalf("first exec annotation should contain time %s, got: %s", expectedTimeStr, firstAlert.Annotations["description"])
}
// Second execution - should preserve activeAt in annotation
// Ensure different timestamp with different seconds
// sleep is non-blocking thanks to synctest
time.Sleep(2 * time.Second)
ts2 := time.Now()
_, err = ar.exec(context.TODO(), ts2, 0)
if err != nil {
t.Fatalf("unexpected error on second exec: %s", err)
}
// Get the alert again (should be the same alert)
if len(ar.alerts) != 1 {
t.Fatalf("expected 1 alert, got %d", len(ar.alerts))
}
secondAlert := ar.GetAlerts()[0]
// Critical test: activeAt should still be ts1, not ts2
if !secondAlert.ActiveAt.Equal(ts1) {
t.Fatalf("activeAt should be preserved as %v, but got %v", ts1, secondAlert.ActiveAt)
}
// Critical test: annotation should still contain ts1 time, not ts2
if !strings.Contains(secondAlert.Annotations["description"], expectedTimeStr) {
t.Fatalf("second exec annotation should still contain original time %s, got: %s", expectedTimeStr, secondAlert.Annotations["description"])
}
// Additional verification: annotation should NOT contain ts2 time
ts2TimeStr := ts2.Format("2006-01-02 15:04:05")
if strings.Contains(secondAlert.Annotations["description"], ts2TimeStr) {
t.Fatalf("annotation should NOT contain new eval time %s, got: %s", ts2TimeStr, secondAlert.Annotations["description"])
}
// Verify query template in labels still works (this would fail if query templates were broken)
if firstAlert.Labels["test_query_in_label"] != "static_value" {
t.Fatalf("expected test_query_in_label=static_value, got %s", firstAlert.Labels["test_query_in_label"])
}
})
}

View File

@@ -6,6 +6,7 @@ import (
"flag"
"fmt"
"hash/fnv"
"maps"
"net/url"
"sync"
"time"
@@ -30,8 +31,8 @@ var (
"0 means no limit.")
ruleUpdateEntriesLimit = flag.Int("rule.updateEntriesLimit", 20, "Defines the max number of rule's state updates stored in-memory. "+
"Rule's updates are available on rule's Details page and are used for debugging purposes. The number of stored updates can be overridden per rule via update_entries_limit param.")
resendDelay = flag.Duration("rule.resendDelay", 0, "MiniMum amount of time to wait before resending an alert to notifier.")
maxResolveDuration = flag.Duration("rule.maxResolveDuration", 0, "Limits the maxiMum duration for automatic alert expiration, "+
resendDelay = flag.Duration("rule.resendDelay", 0, "Minimum amount of time to wait before resending an alert to notifier.")
maxResolveDuration = flag.Duration("rule.maxResolveDuration", 0, "Limits the maximum duration for automatic alert expiration, "+
"which by default is 4 times evaluationInterval of the parent group")
evalDelay = flag.Duration("rule.evalDelay", 30*time.Second, "Adjustment of the 'time' parameter for rule evaluation requests to compensate intentional data delay from the datasource. "+
"Normally, should be equal to '-search.latencyOffset' (cmd-line flag configured for VictoriaMetrics single-node or vmselect). "+
@@ -39,6 +40,8 @@ var (
disableAlertGroupLabel = flag.Bool("disableAlertgroupLabel", false, "Whether to disable adding group's Name as label to generated alerts and time series.")
remoteReadLookBack = flag.Duration("remoteRead.lookback", time.Hour, "Lookback defines how far to look into past for alerts timeseries. "+
"For example, if lookback=1h then range from now() to now()-1h will be scanned.")
maxStartDelay = flag.Duration("group.maxStartDelay", 5*time.Minute, "Defines the max delay before starting the group evaluation. Group's start is artificially delayed for random duration on interval"+
" [0..min(--group.maxStartDelay, group.interval)]. This helps smoothing out the load on the configured datasource, so evaluations aren't executed too close to each other.")
)
// Group is an entity for grouping rules
@@ -95,9 +98,7 @@ type groupMetrics struct {
// set2 has priority over set1.
func mergeLabels(groupName, ruleName string, set1, set2 map[string]string) map[string]string {
r := map[string]string{}
for k, v := range set1 {
r[k] = v
}
maps.Copy(r, set1)
for k, v := range set2 {
if prevV, ok := r[k]; ok {
logger.Infof("label %q=%q for rule %q.%q overwritten with external label %q=%q",
@@ -330,13 +331,13 @@ func (g *Group) Init() {
}
// Start starts group's evaluation
func (g *Group) Start(ctx context.Context, nts func() []notifier.Notifier, rw remotewrite.RWClient, rr datasource.QuerierBuilder) {
func (g *Group) Start(ctx context.Context, rw remotewrite.RWClient, rr datasource.QuerierBuilder) {
defer func() { close(g.finishedCh) }()
evalTS := time.Now()
// sleep random duration to spread group rules evaluation
// over time to reduce the load on datasource.
// over maxStartDelay to reduce the load on datasource.
if !SkipRandSleepOnGroupStart {
sleepBeforeStart := delayBeforeStart(evalTS, g.GetID(), g.Interval, g.EvalOffset)
sleepBeforeStart := g.delayBeforeStart(evalTS, *maxStartDelay)
g.infof("will start in %v", sleepBeforeStart)
sleepTimer := time.NewTimer(sleepBeforeStart)
@@ -368,21 +369,22 @@ func (g *Group) Start(ctx context.Context, nts func() []notifier.Notifier, rw re
e := &executor{
Rw: rw,
Notifiers: nts,
notifierHeaders: g.NotifierHeaders,
}
g.infof("started")
eval := func(ctx context.Context, ts time.Time) {
eval := func(ctx context.Context, ts time.Time) time.Time {
g.metrics.iterationTotal.Inc()
start := time.Now()
if len(g.Rules) < 1 {
g.metrics.iterationDuration.UpdateDuration(start)
g.mu.Lock()
g.LastEvaluation = start
return
g.mu.Unlock()
return ts
}
resolveDuration := getResolveDuration(g.Interval, *resendDelay, *maxResolveDuration)
@@ -395,7 +397,10 @@ func (g *Group) Start(ctx context.Context, nts func() []notifier.Notifier, rw re
}
}
g.metrics.iterationDuration.UpdateDuration(start)
g.mu.Lock()
g.LastEvaluation = start
g.mu.Unlock()
return ts
}
evalCtx, cancel := context.WithCancel(ctx)
@@ -404,15 +409,15 @@ func (g *Group) Start(ctx context.Context, nts func() []notifier.Notifier, rw re
g.mu.Unlock()
defer g.evalCancel()
eval(evalCtx, evalTS)
t := time.NewTicker(g.Interval)
defer t.Stop()
realEvalTS := eval(evalCtx, evalTS)
// restore the rules state after the first evaluation
// so only active alerts can be restored.
if rr != nil {
err := g.restore(ctx, rr, evalTS, *remoteReadLookBack)
err := g.restore(ctx, rr, realEvalTS, *remoteReadLookBack)
if err != nil {
logger.Errorf("error while restoring ruleState for group %q: %s", g.Name, err)
}
@@ -475,20 +480,35 @@ func (g *Group) UpdateWith(newGroup *Group) {
g.updateCh <- newGroup
}
// if offset is specified, delayBeforeStart returns a duration to help aligning timestamp with offset;
// otherwise, it returns a random duration between [0..interval] based on group key.
func delayBeforeStart(ts time.Time, key uint64, interval time.Duration, offset *time.Duration) time.Duration {
if offset != nil {
currentOffsetPoint := ts.Truncate(interval).Add(*offset)
// delayBeforeStart returns duration for delaying the evaluation start
// based on given ts and Group settings. The delay can't exceed maxDelay.
// maxDelay is ignored if g.EvalOffset != nil.
//
// Delaying is important to smooth out the load on the datasource when all groups start at the same time.
// delayBeforeStart calculates delay based on Group ID, so all groups will start at different moments of time.
func (g *Group) delayBeforeStart(ts time.Time, maxDelay time.Duration) time.Duration {
if g.EvalOffset != nil {
offset := *g.EvalOffset
// adjust the offset for negative evalOffset, the rule is:
// `eval_offset: -x` is equivalent to `eval_offset: y` for `interval: x+y`.
// For example, `eval_offset: -6m` is equivalent to `eval_offset: 4m` for `interval: 10m`.
if offset < 0 {
offset += g.Interval
}
// if offset is specified, ignore the maxDelay and return a duration aligned with offset
currentOffsetPoint := ts.Truncate(g.Interval).Add(offset)
if currentOffsetPoint.Before(ts) {
// wait until the next offset point
return currentOffsetPoint.Add(interval).Sub(ts)
return currentOffsetPoint.Add(g.Interval).Sub(ts)
}
return currentOffsetPoint.Sub(ts)
}
// otherwise, return a random duration between [0..min(interval, maxDelay)] based on group ID
// artificially limit interval, so groups with big intervals could start sooner.
interval := min(g.Interval, maxDelay)
var randSleep time.Duration
randSleep = time.Duration(float64(interval) * (float64(key) / (1 << 64)))
randSleep = time.Duration(float64(interval) * (float64(g.GetID()) / (1 << 64)))
sleepOffset := time.Duration(ts.UnixNano() % interval.Nanoseconds())
if randSleep < sleepOffset {
randSleep += interval
@@ -550,15 +570,13 @@ func (g *Group) Replay(start, end time.Time, rw remotewrite.RWClient, maxDataPoi
if !disableProgressBar {
bar = pb.StartNew(iterations * len(g.Rules))
}
for _, r := range g.Rules {
for i := range g.Rules {
rule := g.Rules[i]
sem <- struct{}{}
wg.Add(1)
go func(r Rule, ri rangeIterator) {
// pass ri as a copy, so it can be modified within the replayRuleRange
res <- replayRuleRange(r, ri, bar, rw, replayRuleRetryAttempts, ruleEvaluationConcurrency)
wg.Go(func() {
res <- replayRuleRange(rule, ri, bar, rw, replayRuleRetryAttempts, ruleEvaluationConcurrency)
<-sem
wg.Done()
}(r, ri)
})
}
wg.Wait()
@@ -588,10 +606,10 @@ func replayRuleRange(r Rule, ri rangeIterator, bar *pb.ProgressBar, rw remotewri
res := make(chan int, int(ri.end.Sub(ri.start)/ri.step)+1)
for ri.next() {
sem <- struct{}{}
wg.Add(1)
go func(s, e time.Time) {
n, err := replayRule(r, s, e, rw, replayRuleRetryAttempts)
start := ri.s
end := ri.e
wg.Go(func() {
n, err := replayRule(r, start, end, rw, replayRuleRetryAttempts)
if err != nil {
logger.Fatalf("rule %q: %s", r, err)
}
@@ -600,8 +618,7 @@ func replayRuleRange(r Rule, ri rangeIterator, bar *pb.ProgressBar, rw remotewri
}
res <- n
<-sem
wg.Done()
}(ri.s, ri.e)
})
}
wg.Wait()
close(res)
@@ -615,10 +632,9 @@ func replayRuleRange(r Rule, ri rangeIterator, bar *pb.ProgressBar, rw remotewri
}
// ExecOnce evaluates all the rules under group for once with given timestamp.
func (g *Group) ExecOnce(ctx context.Context, nts func() []notifier.Notifier, rw remotewrite.RWClient, evalTS time.Time) chan error {
func (g *Group) ExecOnce(ctx context.Context, rw remotewrite.RWClient, evalTS time.Time) chan error {
e := &executor{
Rw: rw,
Notifiers: nts,
notifierHeaders: g.NotifierHeaders,
}
if len(g.Rules) < 1 {
@@ -693,7 +709,6 @@ func (g *Group) getEvalDelay() time.Duration {
// executor contains group's notify and rw configs
type executor struct {
Notifiers func() []notifier.Notifier
notifierHeaders map[string]string
Rw remotewrite.RWClient
@@ -714,14 +729,13 @@ func (e *executor) execConcurrently(ctx context.Context, rules []Rule, ts time.T
sem := make(chan struct{}, concurrency)
go func() {
wg := sync.WaitGroup{}
for _, r := range rules {
for i := range rules {
rule := rules[i]
sem <- struct{}{}
wg.Add(1)
go func(r Rule) {
res <- e.exec(ctx, r, ts, resolveDuration, limit)
wg.Go(func() {
res <- e.exec(ctx, rule, ts, resolveDuration, limit)
<-sem
wg.Done()
}(r)
})
}
wg.Wait()
close(res)
@@ -750,6 +764,7 @@ func (e *executor) exec(ctx context.Context, r Rule, ts time.Time, resolveDurati
return fmt.Errorf("rule %q: failed to execute: %w", r, err)
}
var errG vmalertutil.ErrGroup
if e.Rw != nil {
pushToRW := func(tss []prompb.TimeSeries) error {
var lastErr error
@@ -761,31 +776,26 @@ func (e *executor) exec(ctx context.Context, r Rule, ts time.Time, resolveDurati
return lastErr
}
if err := pushToRW(tss); err != nil {
return err
errG.Add(err)
}
}
ar, ok := r.(*AlertingRule)
if !ok {
return nil
return errG.Err()
}
alerts := ar.alertsToSend(resolveDuration, *resendDelay)
if len(alerts) < 1 {
return nil
return errG.Err()
}
wg := sync.WaitGroup{}
errGr := new(vmalertutil.ErrGroup)
for _, nt := range e.Notifiers() {
wg.Add(1)
go func(nt notifier.Notifier) {
if err := nt.Send(ctx, alerts, e.notifierHeaders); err != nil {
errGr.Add(fmt.Errorf("rule %q: failed to send alerts to addr %q: %w", r, nt.Addr(), err))
}
wg.Done()
}(nt)
notifierErr := notifier.Send(ctx, alerts, e.notifierHeaders)
for err := range notifierErr {
if err != nil {
errG.Add(fmt.Errorf("rule %q: notifier failure: %w", r, err))
}
}
wg.Wait()
return errGr.Err()
return errG.Err()
}

View File

@@ -262,7 +262,7 @@ func TestUpdateDuringRandSleep(t *testing.T) {
updateCh: make(chan *Group),
}
g.Init()
go g.Start(context.Background(), nil, nil, nil)
go g.Start(context.Background(), nil, nil)
rule1 := AlertingRule{
Name: "jobDown",
@@ -346,7 +346,8 @@ func TestGroupStart(t *testing.T) {
}
fs := &datasource.FakeQuerier{}
fn := &notifier.FakeNotifier{}
fn, cleanup := notifier.InitFakeNotifier()
defer cleanup()
const evalInterval = time.Millisecond
g := NewGroup(groups[0], fs, evalInterval, map[string]string{"cluster": "east-1"})
@@ -395,7 +396,7 @@ func TestGroupStart(t *testing.T) {
fs.Add(m2)
g.Init()
go func() {
g.Start(context.Background(), func() []notifier.Notifier { return []notifier.Notifier{fn} }, nil, fs)
g.Start(context.Background(), nil, fs)
close(finished)
}()
@@ -404,7 +405,8 @@ func TestGroupStart(t *testing.T) {
var cur uint64
prev := g.metrics.iterationTotal.Get()
for i := 0; ; i++ {
i := 0
for {
if i > 40 {
t.Fatalf("group wasn't able to perform %d evaluations during %d eval intervals", n, i)
}
@@ -413,6 +415,7 @@ func TestGroupStart(t *testing.T) {
return
}
time.Sleep(interval)
i++
}
}
@@ -472,15 +475,10 @@ func TestFaultyNotifier(t *testing.T) {
r := newTestAlertingRule("instant", 0)
r.q = fq
fn := &notifier.FakeNotifier{}
e := &executor{
Notifiers: func() []notifier.Notifier {
return []notifier.Notifier{
&notifier.FaultyNotifier{},
fn,
}
},
}
fn, cleanup := notifier.InitFakeNotifier()
defer cleanup()
e := &executor{}
delay := 5 * time.Second
ctx, cancel := context.WithTimeout(context.Background(), delay)
defer cancel()
@@ -553,7 +551,7 @@ func TestCloseWithEvalInterruption(t *testing.T) {
g := NewGroup(groups[0], fq, evalInterval, nil)
g.Init()
go g.Start(context.Background(), nil, nil, nil)
go g.Start(context.Background(), nil, nil)
time.Sleep(evalInterval * 20)
@@ -571,9 +569,10 @@ func TestCloseWithEvalInterruption(t *testing.T) {
func TestGroupStartDelay(t *testing.T) {
g := &Group{}
g.id = uint64(math.MaxUint64 / 10)
// interval of 5min and key generate a static delay of 30s
g.Interval = time.Minute * 5
key := uint64(math.MaxUint64 / 10)
maxDelay := time.Minute * 5
f := func(atS, expS string) {
t.Helper()
@@ -585,7 +584,7 @@ func TestGroupStartDelay(t *testing.T) {
if err != nil {
t.Fatal(err)
}
delay := delayBeforeStart(at, key, g.Interval, g.EvalOffset)
delay := g.delayBeforeStart(at, maxDelay)
gotStart := at.Add(delay)
if expTS != gotStart {
t.Fatalf("expected to get %v; got %v instead", expTS, gotStart)
@@ -606,6 +605,24 @@ func TestGroupStartDelay(t *testing.T) {
f("2023-01-01T00:01:00.000+00:00", "2023-01-01T00:03:00.000+00:00")
f("2023-01-01T00:03:30.000+00:00", "2023-01-01T00:08:00.000+00:00")
f("2023-01-01T00:08:00.000+00:00", "2023-01-01T00:08:00.000+00:00")
// test group with negative offset -2min, which is equivalent to 3min offset for 5min interval
offset = -2 * time.Minute
g.EvalOffset = &offset
f("2023-01-01T00:00:15.000+00:00", "2023-01-01T00:03:00.000+00:00")
f("2023-01-01T00:01:00.000+00:00", "2023-01-01T00:03:00.000+00:00")
f("2023-01-01T00:03:30.000+00:00", "2023-01-01T00:08:00.000+00:00")
f("2023-01-01T00:08:00.000+00:00", "2023-01-01T00:08:00.000+00:00")
maxDelay = time.Minute * 1
g.EvalOffset = nil
// test group with maxDelay, and offset disabled
f("2023-01-01T00:00:00.000+00:00", "2023-01-01T00:00:06.000+00:00")
f("2023-01-01T00:00:01.000+00:00", "2023-01-01T00:00:06.000+00:00")
f("2023-01-01T00:00:06.100+00:00", "2023-01-01T00:01:06.000+00:00")
f("2023-01-01T00:00:11.000+00:00", "2023-01-01T00:01:06.000+00:00")
}
func TestGetPrometheusReqTimestamp(t *testing.T) {

View File

@@ -2,6 +2,7 @@ package rule
import (
"context"
"errors"
"fmt"
"strings"
"time"
@@ -197,7 +198,7 @@ func (rr *RecordingRule) exec(ctx context.Context, ts time.Time, limit int) ([]p
defer func() {
rr.state.add(curState)
if curState.Err != nil {
if curState.Err != nil && !errors.Is(curState.Err, context.Canceled) {
rr.metrics.errors.Inc()
}
}()
@@ -236,7 +237,8 @@ func (rr *RecordingRule) exec(ctx context.Context, ts time.Time, limit int) ([]p
Labels: stringToLabels(k),
Samples: []prompb.Sample{
{Value: decimal.StaleNaN, Timestamp: ts.UnixNano() / 1e6},
}})
},
})
}
rr.lastEvaluation = curEvaluation
return tss, nil
@@ -291,6 +293,11 @@ func (rr *RecordingRule) toTimeSeries(m datasource.Metric) prompb.TimeSeries {
}
// add extra labels configured by user
for k := range rr.Labels {
// do not add label with empty value, since it has no meaning.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
if rr.Labels[k] == "" {
continue
}
existingLabel := promrelabel.GetLabelByName(m.Labels, k)
if existingLabel != nil { // there is a conflict between extra and existing label
if existingLabel.Value == rr.Labels[k] {

View File

@@ -121,7 +121,7 @@ func (s *ruleState) add(e StateEntry) {
func replayRule(r Rule, start, end time.Time, rw remotewrite.RWClient, replayRuleRetryAttempts int) (int, error) {
var err error
var tss []prompb.TimeSeries
for i := 0; i < replayRuleRetryAttempts; i++ {
for i := range replayRuleRetryAttempts {
tss, err = r.execRange(context.Background(), start, end)
if err == nil {
break

View File

@@ -40,7 +40,7 @@ func TestRule_state(t *testing.T) {
}
var last time.Time
for i := 0; i < stateEntriesN*2; i++ {
for range stateEntriesN * 2 {
last = time.Now()
r.state.add(StateEntry{At: last})
}
@@ -65,17 +65,15 @@ func TestRule_stateConcurrent(_ *testing.T) {
r := &AlertingRule{state: &ruleState{entries: make([]StateEntry, 20)}}
const workers = 50
const iterations = 100
wg := sync.WaitGroup{}
wg.Add(workers)
for i := 0; i < workers; i++ {
go func() {
defer wg.Done()
for i := 0; i < iterations; i++ {
var wg sync.WaitGroup
for range workers {
wg.Go(func() {
for range iterations {
r.state.add(StateEntry{At: time.Now()})
r.state.getAll()
r.state.getLast()
}
}()
})
}
wg.Wait()
}

View File

@@ -19,13 +19,13 @@ func CompareRules(t *testing.T, a, b Rule) error {
case *AlertingRule:
br, ok := b.(*AlertingRule)
if !ok {
return fmt.Errorf("rule %q supposed to be of type AlertingRule", b.ID())
return fmt.Errorf("rule %d supposed to be of type AlertingRule", b.ID())
}
return compareAlertingRules(t, v, br)
case *RecordingRule:
br, ok := b.(*RecordingRule)
if !ok {
return fmt.Errorf("rule %q supposed to be of type RecordingRule", b.ID())
return fmt.Errorf("rule %d supposed to be of type RecordingRule", b.ID())
}
return compareRecordingRules(t, v, br)
default:

View File

@@ -57,12 +57,8 @@ type ApiGroup struct {
EvalOffset float64 `json:"eval_offset,omitempty"`
// EvalDelay will adjust the `time` parameter of rule evaluation requests to compensate intentional query delay from datasource.
EvalDelay float64 `json:"eval_delay,omitempty"`
// Unhealthy unhealthy rules count
Unhealthy int
// Healthy passing rules count
Healthy int
// NoMatch not matching rules count
NoMatch int
// States represents counts per each rule state
States map[string]int `json:"states"`
}
// APILink returns a link to the group's JSON representation.
@@ -134,6 +130,11 @@ type ApiRule struct {
Updates []StateEntry `json:"-"`
}
// IsNoMatch returns true if rule is in nomatch state
func (r *ApiRule) IsNoMatch() bool {
return r.LastSamples == 0 && r.LastSeriesFetched != nil && *r.LastSeriesFetched == 0
}
// ApiAlert represents a notifier.AlertingRule state
// for WEB view
// https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#get-apiv1rules
@@ -209,15 +210,6 @@ func (ar *AlertingRule) AlertsToAPI() []*ApiAlert {
return alerts
}
// AlertToAPI generates apiAlert object from alert by its id(hash)
func (ar *AlertingRule) AlertToAPI(id uint64) *ApiAlert {
a := ar.GetAlert(id)
if a == nil {
return nil
}
return NewAlertAPI(ar, a)
}
// NewAlertAPI creates apiAlert for notifier.Alert
func NewAlertAPI(ar *AlertingRule, a *notifier.Alert) *ApiAlert {
aa := &ApiAlert{
@@ -244,6 +236,20 @@ func NewAlertAPI(ar *AlertingRule, a *notifier.Alert) *ApiAlert {
return aa
}
func (r *ApiRule) ExtendState() {
if len(r.Alerts) > 0 {
return
}
if r.State == "" {
r.State = "ok"
}
if r.Health != "ok" {
r.State = "unhealthy"
} else if r.IsNoMatch() {
r.State = "nomatch"
}
}
// ToAPI returns ApiGroup representation of g
func (g *Group) ToAPI() *ApiGroup {
g.mu.RLock()
@@ -261,6 +267,7 @@ func (g *Group) ToAPI() *ApiGroup {
Headers: headersToStrings(g.Headers),
NotifierHeaders: headersToStrings(g.NotifierHeaders),
Labels: g.Labels,
States: make(map[string]int),
}
if g.EvalOffset != nil {
ag.EvalOffset = g.EvalOffset.Seconds()
@@ -268,9 +275,10 @@ func (g *Group) ToAPI() *ApiGroup {
if g.EvalDelay != nil {
ag.EvalDelay = g.EvalDelay.Seconds()
}
ag.Rules = make([]ApiRule, 0)
ag.Rules = make([]ApiRule, 0, len(g.Rules))
for _, r := range g.Rules {
ag.Rules = append(ag.Rules, r.ToAPI())
ar := r.ToAPI()
ag.Rules = append(ag.Rules, ar)
}
return &ag
}

View File

@@ -34,11 +34,12 @@ body {
padding-top: 4.5rem;
}
.group-items {
.vm-group {
cursor: pointer;
padding: 5px;
margin-top: 5px;
position: relative;
display: none;
}
.btn svg, .dropdown-item svg {
@@ -55,14 +56,22 @@ body {
height: 38px;
}
.group-items:not(:has(.sub-item:not(.d-none))) {
display: none !important;
.vm-item:not(.vm-found) {
display: none;
}
.group-items:hover {
.vm-group:has(.vm-item:is(.vm-found)), .vm-group:is(.vm-found) {
display: flex;
}
.vm-group:hover {
background-color: #f8f9fa!important;
}
.vm-group:is(.vm-found) .vm-item {
display: table-row;
}
.table {
table-layout: fixed;
}
@@ -111,3 +120,9 @@ textarea.curl-area {
.w-60 {
width: 60%;
}
.annotations {
white-space: pre-wrap;
color: gray;
word-wrap: break-word;
}

View File

@@ -11,7 +11,7 @@
<path d="M224.163 175.27a1.9 1.9 0 0 0 2.8 0l6-5.9a2.1 2.1 0 0 0 .2-2.7 1.9 1.9 0 0 0-3-.2l-2.6 2.6v-5.2c0-1.54-1.667-2.502-3-1.732-.619.357-1 1.017-1 1.732v5.2l-2.6-2.6a1.9 1.9 0 0 0-3 .2 2.1 2.1 0 0 0 .2 2.7zm-16.459-23.297h36c1.54 0 2.502-1.667 1.732-3a2 2 0 0 0-1.732-1h-36c-1.54 0-2.502 1.667-1.732 3 .357.619 1.017 1 1.732 1m36 4h-36c-1.54 0-2.502 1.667-1.732 3 .357.619 1.017 1 1.732 1h36c1.54 0 2.502-1.667 1.732-3a2 2 0 0 0-1.732-1m-16.59-23.517a1.9 1.9 0 0 0-2.8 0l-6 5.9a2.1 2.1 0 0 0-.2 2.7 1.9 1.9 0 0 0 3 .2l2.6-2.6v5.2c0 1.54 1.667 2.502 3 1.732.619-.357 1-1.017 1-1.732v-5.2l2.6 2.6a1.9 1.9 0 0 0 3-.2 2.1 2.1 0 0 0-.2-2.7z"/>
</symbol>
<symbol id="filter" viewBox="-10 -10 320 310">
<symbol id="state" viewBox="-10 -10 320 310">
<path d="M288.953 0h-277c-5.522 0-10 4.478-10 10v49.531c0 5.522 4.478 10 10 10h12.372l91.378 107.397v113.978a10 10 0 0 0 15.547 8.32l49.5-33a10 10 0 0 0 4.453-8.32v-80.978l91.378-107.397h12.372c5.522 0 10-4.478 10-10V10c0-5.522-4.477-10-10-10M167.587 166.77a10 10 0 0 0-2.384 6.48v79.305l-29.5 19.666V173.25a10 10 0 0 0-2.384-6.48L50.585 69.531h199.736zM278.953 49.531h-257V20h257z"/>
</symbol>

Before

Width:  |  Height:  |  Size: 4.7 KiB

After

Width:  |  Height:  |  Size: 4.7 KiB

View File

@@ -8,9 +8,9 @@ function actionAll(isCollapse) {
});
}
function groupFilter(key) {
function groupForState(key) {
if (key) {
location.href = `?filter=${key}`;
location.href = `?state=${key}`;
} else {
window.location = window.location.pathname;
}
@@ -65,32 +65,34 @@ function getParamURL(key) {
return url.searchParams.get(key)
}
function matchText(search, item) {
const text = item.innerText.toLowerCase();
return text.indexOf(search) >= 0;
}
function filterRules(searchPhrase) {
document.querySelectorAll('.sub-items').forEach((rules) => {
let found = false;
rules.querySelectorAll('.sub-item').forEach((rule) => {
if (searchPhrase) {
const ruleName = rule.innerText.toLowerCase();
const matches = []
const hasValue = ruleName.indexOf(searchPhrase) >= 0;
rule.querySelectorAll('.label').forEach((label) => {
const text = label.innerText.toLowerCase();
if (text.indexOf(searchPhrase) >= 0) {
matches.push(text);
}
});
if (!matches.length && !hasValue) {
rule.classList.add('d-none');
return;
}
document.querySelectorAll('.vm-group').forEach((group) => {
if (!searchPhrase) {
group.classList.add('vm-found');
return;
}
for (const item of group.querySelectorAll('.vm-group-search')) {
if (matchText(searchPhrase, item)) {
group.classList.add('vm-found');
return;
}
rule.classList.remove('d-none');
found = true;
});
if (found && searchPhrase || !searchPhrase) {
rules.classList.remove('d-none');
} else {
rules.classList.add('d-none');
}
group.classList.remove('vm-found');
for (const item of group.querySelectorAll('.vm-item')) {
if (matchText(searchPhrase, item)) {
item.classList.add('vm-found');
continue;
}
if (Array.from(item.querySelectorAll('.label')).find(l => matchText(searchPhrase, l))) {
item.classList.add('vm-found');
continue;
}
item.classList.remove('vm-found');
}
});
}

View File

@@ -485,6 +485,12 @@ func templateFuncs() textTpl.FuncMap {
/* Helpers */
// now returns the Unix timestamp in seconds at the time of the template evaluation.
// For example: {{ (now | toTime).Sub $activeAt }} will return the duration the alert has been active.
"now": func() float64 {
return float64(time.Now().Unix())
},
// Converts a list of objects to a map with keys arg0, arg1 etc.
// This is intended to allow multiple arguments to be passed to templates.
"args": func(args ...any) map[string]any {

View File

@@ -45,7 +45,7 @@ func (eg *ErrGroup) Error() string {
return ""
}
var b strings.Builder
fmt.Fprintf(&b, "errors(%d): ", len(eg.errs))
fmt.Fprintf(&b, "errors(%d): \n", len(eg.errs))
for i, err := range eg.errs {
b.WriteString(err.Error())
if i != len(eg.errs)-1 {

View File

@@ -30,8 +30,8 @@ func TestErrGroup(t *testing.T) {
}
f(nil, "")
f([]error{errors.New("timeout")}, "errors(1): timeout")
f([]error{errors.New("timeout"), errors.New("deadline")}, "errors(2): timeout\ndeadline")
f([]error{errors.New("timeout")}, "errors(1): \ntimeout")
f([]error{errors.New("timeout"), errors.New("deadline")}, "errors(2): \ntimeout\ndeadline")
}
// TestErrGroupConcurrent supposed to test concurrent
@@ -42,7 +42,7 @@ func TestErrGroupConcurrent(_ *testing.T) {
const writersN = 4
payload := make(chan error, writersN)
for i := 0; i < writersN; i++ {
for range writersN {
go func() {
for err := range payload {
eg.Add(err)
@@ -51,7 +51,7 @@ func TestErrGroupConcurrent(_ *testing.T) {
}
const iterations = 500
for i := 0; i < iterations; i++ {
for i := range iterations {
payload <- fmt.Errorf("error %d", i)
if i%10 == 0 {
_ = eg.Err()

View File

@@ -1,9 +1,11 @@
package main
import (
"cmp"
"embed"
"encoding/json"
"fmt"
"math"
"net/http"
"slices"
"strconv"
@@ -50,6 +52,13 @@ var (
"alert": rule.TypeAlerting,
"record": rule.TypeRecording,
}
// The "recovering", "noData", "normal", "error" states are used by Grafana.
// Ignore "recovering" since it is not currently acknowledged by vmalert,
// treat "noData" as an alias for "nomatch",
// treat "normal" as an alias for "inactive",
// treat "error" as an alias for "unhealthy"
ruleStates = []string{"ok", "nomatch", "inactive", "firing", "pending", "unhealthy", "recovering", "noData", "normal", "error"}
)
type requestHandler struct {
@@ -63,6 +72,14 @@ var (
staticServer = http.StripPrefix("/vmalert", staticHandler)
)
func marshalJson(v any, kind string) ([]byte, *httpserver.ErrorWithStatusCode) {
data, err := json.Marshal(v)
if err != nil {
return nil, errResponse(fmt.Errorf("failed to marshal %s: %s", kind, err), http.StatusInternalServerError)
}
return data, nil
}
func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
if strings.HasPrefix(r.URL.Path, "/vmalert/static") {
staticServer.ServeHTTP(w, r)
@@ -94,40 +111,32 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
httpserver.Errorf(w, r, "%s", err)
return true
}
WriteRuleDetails(w, r, rule)
WriteRule(w, r, rule)
return true
case "/vmalert/groups":
// current used by old vmalert UI and Grafana Alerts
case "/vmalert/groups", "/rules":
rf, err := newRulesFilter(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
data := rh.groups(rf)
WriteListGroups(w, r, data, rf.filter)
// only support filtering by a single state
state := ""
if len(rf.states) > 0 {
state = rf.states[0]
rf.states = rf.states[:1]
}
lr := rh.groups(rf)
WriteListGroups(w, r, lr.Data.Groups, state)
return true
case "/vmalert/notifiers":
WriteListTargets(w, r, notifier.GetTargets())
return true
// special cases for Grafana requests,
// served without `vmalert` prefix:
case "/rules":
// Grafana makes an extra request to `/rules`
// handler in addition to `/api/v1/rules` calls in alerts UI
var data []*rule.ApiGroup
rf, err := newRulesFilter(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
data = rh.groups(rf)
WriteListGroups(w, r, data, rf.filter)
return true
case "/vmalert/api/v1/notifiers", "/api/v1/notifiers":
data, err := rh.listNotifiers()
if err != nil {
httpserver.Errorf(w, r, "%s", err)
errJson(w, r, err)
return true
}
w.Header().Set("Content-Type", "application/json")
@@ -135,15 +144,14 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
return true
case "/vmalert/api/v1/rules", "/api/v1/rules":
// path used by Grafana for ng alerting
var data []byte
rf, err := newRulesFilter(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
errJson(w, r, err)
return true
}
data, err = rh.listGroups(rf)
data, err := rh.listGroups(rf)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
errJson(w, r, err)
return true
}
w.Header().Set("Content-Type", "application/json")
@@ -152,14 +160,14 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
case "/vmalert/api/v1/alerts", "/api/v1/alerts":
// path used by Grafana for ng alerting
rf, err := newRulesFilter(r)
gf, err := newGroupsFilter(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
errJson(w, r, err)
return true
}
data, err := rh.listAlerts(rf)
data, err := rh.listAlerts(gf)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
errJson(w, r, err)
return true
}
w.Header().Set("Content-Type", "application/json")
@@ -168,12 +176,12 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
case "/vmalert/api/v1/alert", "/api/v1/alert":
alert, err := rh.getAlert(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
errJson(w, r, err)
return true
}
data, err := json.Marshal(alert)
data, err := marshalJson(alert, "alert")
if err != nil {
httpserver.Errorf(w, r, "failed to marshal alert: %s", err)
errJson(w, r, err)
return true
}
w.Header().Set("Content-Type", "application/json")
@@ -182,16 +190,16 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
case "/vmalert/api/v1/rule", "/api/v1/rule":
apiRule, err := rh.getRule(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
errJson(w, r, err)
return true
}
rwu := rule.ApiRuleWithUpdates{
ApiRule: apiRule,
StateUpdates: apiRule.Updates,
}
data, err := json.Marshal(rwu)
data, err := marshalJson(rwu, "rule")
if err != nil {
httpserver.Errorf(w, r, "failed to marshal rule: %s", err)
errJson(w, r, err)
return true
}
w.Header().Set("Content-Type", "application/json")
@@ -200,12 +208,12 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
case "/vmalert/api/v1/group", "/api/v1/group":
group, err := rh.getGroup(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
errJson(w, r, err)
return true
}
data, err := json.Marshal(group)
data, err := marshalJson(group, "group")
if err != nil {
httpserver.Errorf(w, r, "failed to marshal group: %s", err)
errJson(w, r, err)
return true
}
w.Header().Set("Content-Type", "application/json")
@@ -225,10 +233,10 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
}
}
func (rh *requestHandler) getGroup(r *http.Request) (*rule.ApiGroup, error) {
func (rh *requestHandler) getGroup(r *http.Request) (*rule.ApiGroup, *httpserver.ErrorWithStatusCode) {
groupID, err := strconv.ParseUint(r.FormValue(rule.ParamGroupID), 10, 64)
if err != nil {
return nil, fmt.Errorf("failed to read %q param: %w", rule.ParamGroupID, err)
return nil, errResponse(fmt.Errorf("failed to read %q param: %w", rule.ParamGroupID, err), http.StatusBadRequest)
}
obj, err := rh.m.groupAPI(groupID)
if err != nil {
@@ -237,14 +245,14 @@ func (rh *requestHandler) getGroup(r *http.Request) (*rule.ApiGroup, error) {
return obj, nil
}
func (rh *requestHandler) getRule(r *http.Request) (rule.ApiRule, error) {
func (rh *requestHandler) getRule(r *http.Request) (rule.ApiRule, *httpserver.ErrorWithStatusCode) {
groupID, err := strconv.ParseUint(r.FormValue(rule.ParamGroupID), 10, 64)
if err != nil {
return rule.ApiRule{}, fmt.Errorf("failed to read %q param: %w", rule.ParamGroupID, err)
return rule.ApiRule{}, errResponse(fmt.Errorf("failed to read %q param: %w", rule.ParamGroupID, err), http.StatusBadRequest)
}
ruleID, err := strconv.ParseUint(r.FormValue(rule.ParamRuleID), 10, 64)
if err != nil {
return rule.ApiRule{}, fmt.Errorf("failed to read %q param: %w", rule.ParamRuleID, err)
return rule.ApiRule{}, errResponse(fmt.Errorf("failed to read %q param: %w", rule.ParamRuleID, err), http.StatusBadRequest)
}
obj, err := rh.m.ruleAPI(groupID, ruleID)
if err != nil {
@@ -253,14 +261,14 @@ func (rh *requestHandler) getRule(r *http.Request) (rule.ApiRule, error) {
return obj, nil
}
func (rh *requestHandler) getAlert(r *http.Request) (*rule.ApiAlert, error) {
func (rh *requestHandler) getAlert(r *http.Request) (*rule.ApiAlert, *httpserver.ErrorWithStatusCode) {
groupID, err := strconv.ParseUint(r.FormValue(rule.ParamGroupID), 10, 64)
if err != nil {
return nil, fmt.Errorf("failed to read %q param: %w", rule.ParamGroupID, err)
return nil, errResponse(fmt.Errorf("failed to read %q param: %w", rule.ParamGroupID, err), http.StatusBadRequest)
}
alertID, err := strconv.ParseUint(r.FormValue(rule.ParamAlertID), 10, 64)
if err != nil {
return nil, fmt.Errorf("failed to read %q param: %w", rule.ParamAlertID, err)
return nil, errResponse(fmt.Errorf("failed to read %q param: %w", rule.ParamAlertID, err), http.StatusBadRequest)
}
a, err := rh.m.alertAPI(groupID, alertID)
if err != nil {
@@ -270,28 +278,76 @@ func (rh *requestHandler) getAlert(r *http.Request) (*rule.ApiAlert, error) {
}
type listGroupsResponse struct {
Status string `json:"status"`
Data struct {
Status string `json:"status"`
Page int `json:"page,omitempty"`
TotalPages int `json:"total_pages,omitempty"`
TotalGroups int `json:"total_groups,omitempty"`
TotalRules int `json:"total_rules,omitempty"`
Data struct {
Groups []*rule.ApiGroup `json:"groups"`
} `json:"data"`
}
// see https://prometheus.io/docs/prometheus/latest/querying/api/#rules
type rulesFilter struct {
files []string
groupNames []string
ruleNames []string
ruleType string
excludeAlerts bool
filter string
dsType config.Type
type groupsFilter struct {
groupNames []string
files []string
dsType config.Type
}
func newRulesFilter(r *http.Request) (*rulesFilter, error) {
rf := &rulesFilter{}
query := r.URL.Query()
func newGroupsFilter(r *http.Request) (*groupsFilter, *httpserver.ErrorWithStatusCode) {
_ = r.ParseForm()
vs := r.Form
gf := &groupsFilter{
groupNames: vs["rule_group[]"],
files: vs["file[]"],
}
dsType := vs.Get("datasource_type")
if len(dsType) > 0 {
if config.SupportedType(dsType) {
gf.dsType = config.NewRawType(dsType)
} else {
return nil, errResponse(fmt.Errorf(`invalid parameter "datasource_type": not supported value %q`, dsType), http.StatusBadRequest)
}
}
return gf, nil
}
ruleTypeParam := query.Get("type")
func (gf *groupsFilter) matches(group *rule.Group) bool {
if len(gf.groupNames) > 0 && !slices.Contains(gf.groupNames, group.Name) {
return false
}
if len(gf.files) > 0 && !slices.Contains(gf.files, group.File) {
return false
}
if len(gf.dsType.Name) > 0 && gf.dsType.String() != group.Type.String() {
return false
}
return true
}
// see https://prometheus.io/docs/prometheus/latest/querying/api/#rules
type rulesFilter struct {
gf *groupsFilter
ruleNames []string
ruleType string
excludeAlerts bool
states []string
maxGroups int
pageNum int
search string
extendedStates bool
}
func newRulesFilter(r *http.Request) (*rulesFilter, *httpserver.ErrorWithStatusCode) {
gf, err := newGroupsFilter(r)
if err != nil {
return nil, err
}
var rf rulesFilter
rf.gf = gf
vs := r.Form
ruleTypeParam := vs.Get("type")
if len(ruleTypeParam) > 0 {
if ruleType, ok := ruleTypeMap[ruleTypeParam]; ok {
rf.ruleType = ruleType
@@ -300,102 +356,155 @@ func newRulesFilter(r *http.Request) (*rulesFilter, error) {
}
}
dsType := query.Get("datasource_type")
if len(dsType) > 0 {
if config.SupportedType(dsType) {
rf.dsType = config.NewRawType(dsType)
} else {
return nil, errResponse(fmt.Errorf(`invalid parameter "datasource_type": not supported value %q`, dsType), http.StatusBadRequest)
}
states := vs["state"]
if len(states) == 0 {
states = vs["filter"]
}
filter := strings.ToLower(query.Get("filter"))
if len(filter) > 0 {
if filter == "nomatch" || filter == "unhealthy" {
rf.filter = filter
} else {
return nil, errResponse(fmt.Errorf(`invalid parameter "filter": not supported value %q`, filter), http.StatusBadRequest)
for _, s := range states {
values := strings.Split(s, ",")
for _, v := range values {
if len(v) == 0 {
continue
}
if !slices.Contains(ruleStates, v) {
return nil, errResponse(fmt.Errorf(`invalid parameter "state": contains not supported value %q`, v), http.StatusBadRequest)
}
// Replace grafana states with supported internal states
switch v {
case "noData":
v = "nomatch"
case "normal":
v = "inactive"
case "error":
v = "unhealthy"
}
rf.states = append(rf.states, v)
}
}
rf.excludeAlerts = httputil.GetBool(r, "exclude_alerts")
rf.ruleNames = append([]string{}, r.Form["rule_name[]"]...)
rf.groupNames = append([]string{}, r.Form["rule_group[]"]...)
rf.files = append([]string{}, r.Form["file[]"]...)
return rf, nil
rf.extendedStates = httputil.GetBool(r, "extended_states")
rf.ruleNames = append([]string{}, vs["rule_name[]"]...)
rf.search = strings.ToLower(vs.Get("search"))
pageNum := vs.Get("page_num")
maxGroups := vs.Get("group_limit")
if pageNum != "" {
if maxGroups == "" {
return nil, errResponse(fmt.Errorf(`"group_limit" needs to be present in order to paginate over the groups`), http.StatusBadRequest)
}
v, err := strconv.Atoi(pageNum)
if err != nil || v <= 0 {
return nil, errResponse(fmt.Errorf(`"page_num" is expected to be a positive number, found %q`, pageNum), http.StatusBadRequest)
}
rf.pageNum = v
}
if maxGroups != "" {
v, err := strconv.Atoi(maxGroups)
if err != nil || v <= 0 {
return nil, errResponse(fmt.Errorf(`"group_limit" is expected to be a positive number, found %q`, maxGroups), http.StatusBadRequest)
}
rf.maxGroups = v
}
return &rf, nil
}
func (rf *rulesFilter) matchesGroup(group *rule.Group) bool {
if len(rf.groupNames) > 0 && !slices.Contains(rf.groupNames, group.Name) {
func (rf *rulesFilter) matchesRule(r *rule.ApiRule) bool {
if rf.ruleType != "" && rf.ruleType != r.Type {
return false
}
if len(rf.files) > 0 && !slices.Contains(rf.files, group.File) {
if len(rf.ruleNames) > 0 && !slices.Contains(rf.ruleNames, r.Name) {
return false
}
if len(rf.dsType.Name) > 0 && rf.dsType.String() != group.Type.String() {
return false
if len(rf.states) == 0 {
return true
}
return true
return slices.Contains(rf.states, r.State)
}
func (rh *requestHandler) groups(rf *rulesFilter) []*rule.ApiGroup {
func (rh *requestHandler) groups(rf *rulesFilter) *listGroupsResponse {
rh.m.groupsMu.RLock()
defer rh.m.groupsMu.RUnlock()
groups := make([]*rule.ApiGroup, 0)
skipGroups := (rf.pageNum - 1) * rf.maxGroups
lr := &listGroupsResponse{
Status: "success",
}
lr.Data.Groups = make([]*rule.ApiGroup, 0)
if skipGroups >= len(rh.m.groups) {
return lr
}
// sort list of groups for deterministic output
groups := make([]*rule.Group, 0, len(rh.m.groups))
for _, group := range rh.m.groups {
if !rf.matchesGroup(group) {
groups = append(groups, group)
}
slices.SortFunc(groups, func(a, b *rule.Group) int {
nameCmp := cmp.Compare(a.Name, b.Name)
if nameCmp != 0 {
return nameCmp
}
return cmp.Compare(a.File, b.File)
})
for _, group := range groups {
if !rf.gf.matches(group) {
continue
}
groupFound := len(rf.search) == 0 || strings.Contains(strings.ToLower(group.Name), rf.search) || strings.Contains(strings.ToLower(group.File), rf.search)
g := group.ToAPI()
// the returned list should always be non-nil
// https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221
filteredRules := make([]rule.ApiRule, 0)
for _, rule := range g.Rules {
if rf.ruleType != "" && rf.ruleType != rule.Type {
if !groupFound && !strings.Contains(strings.ToLower(rule.Name), rf.search) {
continue
}
if len(rf.ruleNames) > 0 && !slices.Contains(rf.ruleNames, rule.Name) {
continue
if rf.extendedStates {
rule.ExtendState()
}
if (rule.LastError == "" && rf.filter == "unhealthy") || (!isNoMatch(rule) && rf.filter == "nomatch") {
if !rf.matchesRule(&rule) {
continue
}
if rf.excludeAlerts {
rule.Alerts = nil
}
if rule.LastError != "" {
g.Unhealthy++
} else {
g.Healthy++
}
if isNoMatch(rule) {
g.NoMatch++
}
g.States[rule.State]++
filteredRules = append(filteredRules, rule)
}
g.Rules = filteredRules
groups = append(groups, g)
}
// sort list of groups for deterministic output
slices.SortFunc(groups, func(a, b *rule.ApiGroup) int {
if a.Name != b.Name {
return strings.Compare(a.Name, b.Name)
if len(g.Rules) == 0 || len(filteredRules) > 0 {
if rf.maxGroups > 0 {
lr.TotalGroups++
lr.TotalRules += len(filteredRules)
}
if skipGroups > 0 {
skipGroups--
continue
}
if rf.maxGroups == 0 || len(lr.Data.Groups) < rf.maxGroups {
g.Rules = filteredRules
lr.Data.Groups = append(lr.Data.Groups, g)
}
}
return strings.Compare(a.File, b.File)
})
return groups
}
if rf.maxGroups > 0 {
lr.Page = rf.pageNum
lr.TotalPages = max(int(math.Ceil(float64(lr.TotalGroups)/float64(rf.maxGroups))), 1)
}
return lr
}
func (rh *requestHandler) listGroups(rf *rulesFilter) ([]byte, error) {
lr := listGroupsResponse{Status: "success"}
lr.Data.Groups = rh.groups(rf)
func (rh *requestHandler) listGroups(rf *rulesFilter) ([]byte, *httpserver.ErrorWithStatusCode) {
lr := rh.groups(rf)
if rf.pageNum > 1 && len(lr.Data.Groups) == 0 {
return nil, errResponse(fmt.Errorf(`page_num exceeds total amount of pages`), http.StatusBadRequest)
}
if lr.Page > lr.TotalPages {
return nil, errResponse(fmt.Errorf(`page_num=%d exceeds total amount of pages in result=%d`, lr.Page, lr.TotalPages), http.StatusBadRequest)
}
b, err := json.Marshal(lr)
if err != nil {
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf(`error encoding list of active alerts: %w`, err),
StatusCode: http.StatusInternalServerError,
}
return nil, errResponse(fmt.Errorf(`error encoding list of groups: %w`, err), http.StatusInternalServerError)
}
return b, nil
}
@@ -412,18 +521,18 @@ func (rh *requestHandler) groupAlerts() []rule.GroupAlerts {
defer rh.m.groupsMu.RUnlock()
var gAlerts []rule.GroupAlerts
for _, g := range rh.m.groups {
for _, group := range rh.m.groups {
var alerts []*rule.ApiAlert
g := group.ToAPI()
for _, r := range g.Rules {
a, ok := r.(*rule.AlertingRule)
if !ok {
if r.Type != rule.TypeAlerting {
continue
}
alerts = append(alerts, a.AlertsToAPI()...)
alerts = append(alerts, r.Alerts...)
}
if len(alerts) > 0 {
gAlerts = append(gAlerts, rule.GroupAlerts{
Group: g.ToAPI(),
Group: g,
Alerts: alerts,
})
}
@@ -434,22 +543,22 @@ func (rh *requestHandler) groupAlerts() []rule.GroupAlerts {
return gAlerts
}
func (rh *requestHandler) listAlerts(rf *rulesFilter) ([]byte, error) {
func (rh *requestHandler) listAlerts(gf *groupsFilter) ([]byte, *httpserver.ErrorWithStatusCode) {
rh.m.groupsMu.RLock()
defer rh.m.groupsMu.RUnlock()
lr := listAlertsResponse{Status: "success"}
lr.Data.Alerts = make([]*rule.ApiAlert, 0)
for _, group := range rh.m.groups {
if !rf.matchesGroup(group) {
if !gf.matches(group) {
continue
}
for _, r := range group.Rules {
a, ok := r.(*rule.AlertingRule)
if !ok {
g := group.ToAPI()
for _, r := range g.Rules {
if r.Type != rule.TypeAlerting {
continue
}
lr.Data.Alerts = append(lr.Data.Alerts, a.AlertsToAPI()...)
lr.Data.Alerts = append(lr.Data.Alerts, r.Alerts...)
}
}
@@ -460,10 +569,7 @@ func (rh *requestHandler) listAlerts(rf *rulesFilter) ([]byte, error) {
b, err := json.Marshal(lr)
if err != nil {
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf(`error encoding list of active alerts: %w`, err),
StatusCode: http.StatusInternalServerError,
}
return nil, errResponse(fmt.Errorf(`error encoding list of active alerts: %w`, err), http.StatusInternalServerError)
}
return b, nil
}
@@ -475,7 +581,7 @@ type listNotifiersResponse struct {
} `json:"data"`
}
func (rh *requestHandler) listNotifiers() ([]byte, error) {
func (rh *requestHandler) listNotifiers() ([]byte, *httpserver.ErrorWithStatusCode) {
targets := notifier.GetTargets()
lr := listNotifiersResponse{Status: "success"}
@@ -497,10 +603,7 @@ func (rh *requestHandler) listNotifiers() ([]byte, error) {
b, err := json.Marshal(lr)
if err != nil {
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf(`error encoding list of notifiers: %w`, err),
StatusCode: http.StatusInternalServerError,
}
return nil, errResponse(fmt.Errorf(`error encoding list of notifiers: %w`, err), http.StatusInternalServerError)
}
return b, nil
}
@@ -511,3 +614,8 @@ func errResponse(err error, sc int) *httpserver.ErrorWithStatusCode {
StatusCode: sc,
}
}
func errJson(w http.ResponseWriter, r *http.Request, err *httpserver.ErrorWithStatusCode) {
w.Header().Set("Content-Type", "application/json")
httpserver.Errorf(w, r, `{"error":%q,"errorType":%d}`, err, err.StatusCode)
}

View File

@@ -9,9 +9,10 @@
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/vmalertutil"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/rule"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
) %}
{% func Controls(prefix, currentIcon, currentText string, icons, filters map[string]string, search bool) %}
{% func Controls(prefix, currentIcon, currentText string, icons, states map[string]string, search bool) %}
<div class="btn-toolbar mb-3" role="toolbar">
<div class="d-flex gap-2 justify-content-between w-100">
<div class="d-flex gap-2 align-items-center">
@@ -27,10 +28,10 @@
<use href="{%s prefix %}static/icons/icons.svg#expand"/>
</svg>
</a>
{% if len(filters) > 0 %}
{% if len(states) > 0 %}
<span class="d-none d-md-inline-block">Filter by status:</span>
<svg class="d-md-none" width="20" height="20">
<use href="{%s prefix %}static/icons/icons.svg#filter">
<use href="{%s prefix %}static/icons/icons.svg#state">
</svg>
<div class="dropdown">
<button
@@ -45,10 +46,10 @@
</svg>
</button>
<ul class="dropdown-menu">
{% for key, title := range filters %}
{% for key, title := range states %}
{% if title != currentText %}
<li>
<a class="dropdown-item" onclick="groupFilter('{%s key %}')">
<a class="dropdown-item" onclick="groupForState('{%s key %}')">
<span class="d-none d-md-inline-block">{%s title %}</span>
<svg class="d-md-none" width="22" height="22">
<use href="{%s prefix %}static/icons/icons.svg#{%s icons[key] %}"/>
@@ -78,6 +79,8 @@
{% func Welcome(r *http.Request) %}
{%= tpl.Header(r, navItems, "vmalert", getLastConfigError()) %}
<p>
Version {%s buildinfo.Version %} <br>
API:<br>
{% for _, p := range apiLinks %}
{%code p, doc := p[0], p[1] %}
@@ -94,10 +97,10 @@
{%= tpl.Footer(r) %}
{% endfunc %}
{% func ListGroups(r *http.Request, groups []*rule.ApiGroup, filter string) %}
{% func ListGroups(r *http.Request, groups []*rule.ApiGroup, state string) %}
{%code
prefix := vmalertutil.Prefix(r.URL.Path)
filters := map[string]string{
states := map[string]string{
"": "All",
"unhealthy": "Unhealthy",
"nomatch": "No Match",
@@ -107,26 +110,29 @@
"unhealthy": "unhealthy",
"nomatch": "nomatch",
}
currentText := filters[filter]
currentIcon := icons[filter]
currentText := states[state]
currentIcon := icons[state]
%}
{%= tpl.Header(r, navItems, "Groups", getLastConfigError()) %}
{%= Controls(prefix, currentIcon, currentText, icons, filters, true) %}
{%= Controls(prefix, currentIcon, currentText, icons, states, true) %}
{% if len(groups) > 0 %}
{% for _, g := range groups %}
<div id="group-{%s g.ID %}" class="d-flex w-100 border-0 flex-column group-items{% if g.Unhealthy > 0 %} alert-danger{% endif %}">
<div id="group-{%s g.ID %}" class="w-100 border-0 flex-column vm-group{% if g.States["unhealthy"] > 0 %} alert-danger{% endif %}">
<span class="d-flex justify-content-between">
<a href="#group-{%s g.ID %}">{%s g.Name %}{% if g.Type != "prometheus" %} ({%s g.Type %}){% endif %} (every {%f.0 g.Interval %}s) #</a>
<a
class="vm-group-search"
href="#group-{%s g.ID %}"
>{%s g.Name %}{% if g.Type != "prometheus" %} ({%s g.Type %}){% endif %} (every {%f.0 g.Interval %}s) #</a>
<span
class="flex-grow-1 d-flex justify-content-end"
role="button"
data-bs-toggle="collapse"
data-bs-target="#sub-{%s g.ID %}"
data-bs-target="#item-{%s g.ID %}"
>
<span class="d-flex gap-2">
{% if g.Unhealthy > 0 %}<span class="badge bg-danger" title="Number of rules with status Error">{%d g.Unhealthy %}</span> {% endif %}
{% if g.NoMatch > 0 %}<span class="badge bg-warning" title="Number of rules with status NoMatch">{%d g.NoMatch %}</span> {% endif %}
<span class="badge bg-success" title="Number of rules with status Ok">{%d g.Healthy %}</span>
{% if g.States["unhealthy"] > 0 %}<span class="badge bg-danger" title="Number of rules with status Error">{%d g.States["unhealthy"] %}</span> {% endif %}
{% if g.States["nomatch"] > 0 %}<span class="badge bg-warning" title="Number of rules with status NoMatch">{%d g.States["nomatch"] %}</span> {% endif %}
<span class="badge bg-success" title="Number of rules with status Ok">{%d g.States["ok"] %}</span>
</span>
</span>
</span>
@@ -134,9 +140,9 @@
class="d-flex flex-column row-gap-2 mb-2"
role="button"
data-bs-toggle="collapse"
data-bs-target="#sub-{%s g.ID %}"
data-bs-target="#item-{%s g.ID %}"
>
<span class="fs-6 text-start w-100 fw-lighter">{%s g.File %}</span>
<span class="fs-6 text-start vm-group-search w-100 fw-lighter">{%s g.File %}</span>
{% if len(g.Params) > 0 %}
<span class="fs-6 text-start w-100 d-flex justify-content-between fw-lighter">
<span>Extra params</span>
@@ -158,7 +164,7 @@
</span>
{% endif %}
</span>
<div class="collapse sub-items" id="sub-{%s g.ID %}">
<div class="collapse" id="item-{%s g.ID %}">
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
@@ -169,7 +175,7 @@
</thead>
<tbody>
{% for _, r := range g.Rules %}
<tr class="sub-item{% if r.LastError != "" %} alert-danger{% endif %}">
<tr class="vm-item{% if r.LastError != "" %} alert-danger{% endif %}">
<td>
<div class="row">
<div class="col-12 mb-2">
@@ -183,7 +189,7 @@
<b>record:</b> {%s r.Name %}
{% endif %}
|
{%= seriesFetchedWarn(prefix, r) %}
{%= seriesFetchedWarn(prefix, &r) %}
<span><a target="_blank" href="{%s prefix+r.WebLink() %}">Details</a></span>
</div>
<div class="col-12">
@@ -206,7 +212,12 @@
</div>
</td>
<td class="text-center">{%d r.LastSamples %}</td>
<td class="text-center">{%f.3 time.Since(r.LastEvaluation).Seconds() %}s ago</td>
<td class="text-center">{% if r.LastEvaluation.IsZero() %}
Never
{% else %}
{%f.3 time.Since(r.LastEvaluation).Seconds() %}s ago
{% endif %}
</td>
</tr>
{% endfor %}
</tbody>
@@ -241,14 +252,14 @@
}
sort.Strings(keys)
%}
<div class="d-flex w-100 flex-column group-items alert-danger">
<div class="w-100 flex-column vm-group alert-danger">
<span id="group-{%s g.ID %}" class="d-flex justify-content-between">
<a href="#group-{%s g.ID %}">{%s g.Name %}{% if g.Type != "prometheus" %} ({%s g.Type %}){% endif %}</a>
<span
class="flex-grow-1 d-flex justify-content-end"
role="button"
data-bs-toggle="collapse"
data-bs-target="#sub-{%s g.ID %}"
data-bs-target="#item-{%s g.ID %}"
>
<span class="badge bg-danger" title="Number of active alerts">{%d len(ga.Alerts) %}</span>
</span>
@@ -258,10 +269,10 @@
class="fs-6 text-start w-100 fw-lighter"
role="button"
data-bs-toggle="collapse"
data-bs-target="#sub-{%s g.ID %}"
data-bs-target="#item-{%s g.ID %}"
>{%s g.File %}</span>
</span>
<div class="collapse sub-items" id="sub-{%s g.ID %}">
<div class="collapse" id="item-{%s g.ID %}">
{% for _, ruleID := range keys %}
{%code
defaultAR := alertsByRule[ruleID][0]
@@ -272,7 +283,7 @@
sort.Strings(labelKeys)
%}
<br>
<div class="sub-item">
<div class="vm-item">
<b>alert:</b> {%s defaultAR.Name %} ({%d len(alertsByRule[ruleID]) %})
| <span><a target="_blank" href="{%s defaultAR.SourceLink %}">Source</a></span>
<br>
@@ -337,20 +348,20 @@
typeK, ns := keys[i], targets[notifier.TargetType(keys[i])]
count := len(ns)
%}
<div class="d-flex w-100 flex-column group-items">
<div class="w-100 flex-column vm-group">
<span class="d-flex justify-content-between" id="group-{%s typeK %}">
<a href="#group-{%s typeK %}">{%s typeK %} ({%d count %})</a>
<span
class="flex-grow-1"
role="button"
data-bs-toggle="collapse"
data-bs-target="#sub-{%s typeK %}"
data-bs-target="#item-{%s typeK %}"
></span>
</span>
<div id="sub-{%s typeK %}" class="collapse show sub-items">
<div id="item-{%s typeK %}" class="collapse show">
<table class="table table-striped table-hover table-sm">
<thead>
<tr class="sub-item">
<tr class="vm-item">
<th scope="col">Labels</th>
<th scope="col">Address</th>
</tr>
@@ -435,7 +446,7 @@
<div class="col">
{% for _, k := range annotationKeys %}
<b>{%s k %}:</b><br>
<p>{%s alert.Annotations[k] %}</p>
<p class="annotations">{%s alert.Annotations[k] %}</p>
{% endfor %}
</div>
</div>
@@ -465,7 +476,7 @@
{% endfunc %}
{% func RuleDetails(r *http.Request, rule rule.ApiRule) %}
{% func Rule(r *http.Request, rule rule.ApiRule) %}
{%code prefix := vmalertutil.Prefix(r.URL.Path) %}
{%= tpl.Header(r, navItems, "", getLastConfigError()) %}
{%code
@@ -549,7 +560,7 @@
<div class="col">
{% for _, k := range annotationKeys %}
<b>{%s k %}:</b><br>
<p>{%s rule.Annotations[k] %}</p>
<p class="annotations">{%s rule.Annotations[k] %}</p>
{% endfor %}
</div>
</div>
@@ -594,11 +605,11 @@
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
<th scope="col" title="The time when event was created">Updated at</th>
<th scope="col" title="The time when the rule was executed">Updated at</th>
<th scope="col" class="w-10 text-center" title="How many series expression returns. Each series will represent an alert.">Series returned</th>
{% if seriesFetchedEnabled %}<th scope="col" class="w-10 text-center" title="How many series were scanned by datasource during the evaluation">Series fetched</th>{% endif %}
<th scope="col" class="w-10 text-center" title="How many seconds request took">Duration</th>
<th scope="col" class="text-center" title="Time used for rule execution">Executed at</th>
<th scope="col" class="text-center" title="The time used in execution query request">Execution timestamp</th>
<th scope="col" class="text-center" title="cURL command with request example">cURL</th>
</tr>
</thead>
@@ -650,8 +661,8 @@
<span class="badge bg-warning text-dark" title="This firing state is kept because of `keep_firing_for`">stabilizing</span>
{% endfunc %}
{% func seriesFetchedWarn(prefix string, r rule.ApiRule) %}
{% if isNoMatch(r) %}
{% func seriesFetchedWarn(prefix string, r *rule.ApiRule) %}
{% if r.IsNoMatch() %}
<svg
data-bs-toggle="tooltip"
title="No match! This rule's last evaluation hasn't selected any time series from the datasource.
@@ -662,9 +673,3 @@
</svg>
{% endif %}
{% endfunc %}
{%code
func isNoMatch (r rule.ApiRule) bool {
return r.LastSamples == 0 && r.LastSeriesFetched != nil && *r.LastSeriesFetched == 0
}
%}

File diff suppressed because it is too large Load Diff

View File

@@ -23,6 +23,9 @@ func TestHandler(t *testing.T) {
Timestamps: []int64{0},
})
m := &manager{groups: map[uint64]*rule.Group{}}
_, cleanup := notifier.InitFakeNotifier()
defer cleanup()
var ar *rule.AlertingRule
var rr *rule.RecordingRule
var groupIDs []uint64
@@ -45,7 +48,7 @@ func TestHandler(t *testing.T) {
}, fq, 1*time.Minute, nil)
ar = g.Rules[0].(*rule.AlertingRule)
rr = g.Rules[1].(*rule.RecordingRule)
g.ExecOnce(context.Background(), func() []notifier.Notifier { return nil }, nil, time.Time{})
g.ExecOnce(context.Background(), nil, time.Time{})
id := g.CreateID()
m.groups[id] = g
groupIDs = append(groupIDs, id)
@@ -207,7 +210,7 @@ func TestHandler(t *testing.T) {
}
})
t.Run("/api/v1/rules&filters", func(t *testing.T) {
t.Run("/api/v1/rules&states", func(t *testing.T) {
check := func(url string, statusCode, expGroups, expRules int) {
t.Helper()
lr := listGroupsResponse{}
@@ -249,9 +252,15 @@ func TestHandler(t *testing.T) {
check("/api/v1/rules?rule_group[]=group&file[]=foo", 200, 0, 0)
check("/api/v1/rules?rule_group[]=group&file[]=rules.yaml", 200, 3, 6)
check("/api/v1/rules?rule_group[]=group&file[]=rules.yaml&rule_name[]=foo", 200, 3, 0)
check("/api/v1/rules?rule_group[]=group&file[]=rules.yaml&rule_name[]=foo", 200, 0, 0)
check("/api/v1/rules?rule_group[]=group&file[]=rules.yaml&rule_name[]=alert", 200, 3, 3)
check("/api/v1/rules?rule_group[]=group&file[]=rules.yaml&rule_name[]=alert&rule_name[]=record", 200, 3, 6)
check("/api/v1/rules?group_limit=1", 200, 1, 2)
check("/api/v1/rules?group_limit=1&type=alert", 200, 1, 1)
check("/api/v1/rules?group_limit=1&type=record", 200, 1, 1)
check("/api/v1/rules?group_limit=2", 200, 2, 4)
check(fmt.Sprintf("/api/v1/rules?group_limit=1&page_num=%d", 1), 200, 1, 2)
})
t.Run("/api/v1/rules&exclude_alerts=true", func(t *testing.T) {
// check if response returns active alerts by default

View File

@@ -27,6 +27,9 @@ vmauth-linux-ppc64le-prod:
vmauth-linux-386-prod:
APP_NAME=vmauth $(MAKE) app-via-docker-linux-386
vmauth-linux-s390x-prod:
APP_NAME=vmauth $(MAKE) app-via-docker-linux-s390x
vmauth-darwin-amd64-prod:
APP_NAME=vmauth $(MAKE) app-via-docker-darwin-amd64

View File

@@ -4,6 +4,7 @@ import (
"bytes"
"context"
"encoding/base64"
"errors"
"flag"
"fmt"
"math"
@@ -12,6 +13,7 @@ import (
"net/url"
"os"
"regexp"
"slices"
"sort"
"strconv"
"strings"
@@ -27,6 +29,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs/fscore"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
@@ -64,10 +67,11 @@ type AuthConfig struct {
type UserInfo struct {
Name string `yaml:"name,omitempty"`
BearerToken string `yaml:"bearer_token,omitempty"`
AuthToken string `yaml:"auth_token,omitempty"`
Username string `yaml:"username,omitempty"`
Password string `yaml:"password,omitempty"`
BearerToken string `yaml:"bearer_token,omitempty"`
JWT *JWTConfig `yaml:"jwt,omitempty"`
AuthToken string `yaml:"auth_token,omitempty"`
Username string `yaml:"username,omitempty"`
Password string `yaml:"password,omitempty"`
URLPrefix *URLPrefix `yaml:"url_prefix,omitempty"`
DiscoverBackendIPs *bool `yaml:"discover_backend_ips,omitempty"`
@@ -88,30 +92,79 @@ type UserInfo struct {
MetricLabels map[string]string `yaml:"metric_labels,omitempty"`
AccessLog *AccessLog `yaml:"access_log,omitempty"`
concurrencyLimitCh chan struct{}
concurrencyLimitReached *metrics.Counter
rt http.RoundTripper
requests *metrics.Counter
requestErrors *metrics.Counter
backendRequests *metrics.Counter
backendErrors *metrics.Counter
requestsDuration *metrics.Summary
}
// HeadersConf represents config for request and response headers.
type HeadersConf struct {
RequestHeaders []*Header `yaml:"headers,omitempty"`
ResponseHeaders []*Header `yaml:"response_headers,omitempty"`
KeepOriginalHost *bool `yaml:"keep_original_host,omitempty"`
// AccessLog represents configuration for access log settings.
type AccessLog struct {
Filters *AccessLogFilters `yaml:"filters"`
}
func (ui *UserInfo) beginConcurrencyLimit() error {
// AccessLogFilters represents list of filters for access logs printing
type AccessLogFilters struct {
// SkipStatusCodes is a list of HTTP status codes for which access logs will be skipped
SkipStatusCodes []int `yaml:"skip_status_codes"`
}
func (ui *UserInfo) logRequest(r *http.Request, userName string, statusCode int, duration time.Duration) {
if ui.AccessLog == nil {
return
}
filters := ui.AccessLog.Filters
if filters != nil && len(filters.SkipStatusCodes) > 0 {
if slices.Contains(filters.SkipStatusCodes, statusCode) {
return
}
}
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
requestURI := httpserver.GetRequestURI(r)
logger.Infof("access_log request_host=%q request_uri=%q status_code=%d remote_addr=%s user_agent=%q referer=%q duration_ms=%d username=%q",
r.Host, requestURI, statusCode, remoteAddr, r.UserAgent(), r.Referer(), duration.Milliseconds(), userName)
}
// HeadersConf represents config for request and response headers.
type HeadersConf struct {
RequestHeaders []*Header `yaml:"headers,omitempty"`
ResponseHeaders []*Header `yaml:"response_headers,omitempty"`
KeepOriginalHost *bool `yaml:"keep_original_host,omitempty"`
hasAnyPlaceHolders bool
}
func (ui *UserInfo) beginConcurrencyLimit(ctx context.Context) error {
select {
case ui.concurrencyLimitCh <- struct{}{}:
return nil
default:
ui.concurrencyLimitReached.Inc()
return fmt.Errorf("cannot handle more than %d concurrent requests from user %s", ui.getMaxConcurrentRequests(), ui.name())
// The number of concurrently executed requests for the given user equals the limit.
// Wait until some of the currently executed requests are finished, so the current request could be executed.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10078
select {
case ui.concurrencyLimitCh <- struct{}{}:
return nil
case <-ctx.Done():
err := ctx.Err()
if errors.Is(err, context.DeadlineExceeded) {
// The current request couldn't be executed until the request timeout.
ui.concurrencyLimitReached.Inc()
return fmt.Errorf("cannot start executing the request during -maxQueueDuration=%s because %d concurrent requests from the user %s are executed",
*maxQueueDuration, ui.getMaxConcurrentRequests(), ui.name())
}
return fmt.Errorf("cannot start executing the request because %d concurrent requests from the user %s are executed: %w",
ui.getMaxConcurrentRequests(), ui.name(), err)
}
}
}
@@ -127,6 +180,28 @@ func (ui *UserInfo) getMaxConcurrentRequests() int {
return mcr
}
func (ui *UserInfo) stopHealthChecks() {
if ui == nil {
return
}
if ui.URLPrefix != nil {
bus := ui.URLPrefix.bus.Load()
bus.stopHealthChecks()
}
if ui.DefaultURL != nil {
bus := ui.DefaultURL.bus.Load()
bus.stopHealthChecks()
}
for i := range ui.URLMaps {
um := &ui.URLMaps[i]
if um.URLPrefix != nil {
bus := um.URLPrefix.bus.Load()
bus.stopHealthChecks()
}
}
}
// Header is `Name: Value` http header, which must be added to the proxied request.
type Header struct {
Name string
@@ -262,7 +337,7 @@ type URLPrefix struct {
// the list of backend urls
//
// the list can be dynamically updated if `discover_backend_ips` option is set.
bus atomic.Pointer[[]*backendURL]
bus atomic.Pointer[backendURLs]
// if this option is set, then backend ips for busOriginal are periodically re-discovered and put to bus.
discoverBackendIPs bool
@@ -286,21 +361,94 @@ func (up *URLPrefix) setLoadBalancingPolicy(loadBalancingPolicy string) error {
}
}
type backendURLs struct {
healthChecksContext context.Context
healthChecksCancel func()
healthChecksWG sync.WaitGroup
bus []*backendURL
}
func newBackendURLs() *backendURLs {
ctx, cancel := context.WithCancel(context.Background())
return &backendURLs{
healthChecksContext: ctx,
healthChecksCancel: cancel,
}
}
func (bus *backendURLs) add(u *url.URL) {
bus.bus = append(bus.bus, &backendURL{
url: u,
healthCheckContext: bus.healthChecksContext,
healthCheckWG: &bus.healthChecksWG,
hasPlaceHolders: hasAnyPlaceholders(u),
})
}
func (bus *backendURLs) stopHealthChecks() {
bus.healthChecksCancel()
bus.healthChecksWG.Wait()
}
type backendURL struct {
brokenDeadline atomic.Uint64
broken atomic.Bool
healthCheckContext context.Context
healthCheckWG *sync.WaitGroup
concurrentRequests atomic.Int32
url *url.URL
hasPlaceHolders bool
}
func (bu *backendURL) isBroken() bool {
ct := fasttime.UnixTimestamp()
return ct < bu.brokenDeadline.Load()
return bu.broken.Load()
}
func (bu *backendURL) setBroken() {
deadline := fasttime.UnixTimestamp() + uint64((*failTimeout).Seconds())
bu.brokenDeadline.Store(deadline)
if bu.broken.CompareAndSwap(false, true) {
bu.healthCheckWG.Go(func() {
bu.runHealthCheck()
bu.broken.Store(false)
})
}
}
func (bu *backendURL) runHealthCheck() {
port := bu.url.Port()
if port == "" {
port = "80"
}
addr := net.JoinHostPort(bu.url.Hostname(), port)
t := time.NewTicker(*failTimeout)
defer t.Stop()
for {
select {
case <-t.C:
// Verify network connectivity via TCP dial before marking backend healthy.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9997
ctx, cancel := context.WithTimeout(bu.healthCheckContext, time.Second)
c, err := netutil.Dialer.DialContext(ctx, "tcp", addr)
cancel()
if err != nil {
if errors.Is(bu.healthCheckContext.Err(), context.Canceled) {
return
}
logger.Warnf("ignoring the backend at %s for %s because of dial error: %s", addr, *failTimeout, err)
continue
}
_ = c.Close()
return
case <-bu.healthCheckContext.Done():
return
}
}
}
func (bu *backendURL) get() {
@@ -312,8 +460,8 @@ func (bu *backendURL) put() {
}
func (up *URLPrefix) getBackendsCount() int {
pbus := up.bus.Load()
return len(*pbus)
bus := up.bus.Load()
return len(bus.bus)
}
// getBackendURL returns the backendURL depending on the load balance policy.
@@ -324,16 +472,15 @@ func (up *URLPrefix) getBackendsCount() int {
func (up *URLPrefix) getBackendURL() *backendURL {
up.discoverBackendAddrsIfNeeded()
pbus := up.bus.Load()
bus := *pbus
if len(bus) == 0 {
bus := up.bus.Load()
if len(bus.bus) == 0 {
return nil
}
if up.loadBalancingPolicy == "first_available" {
return getFirstAvailableBackendURL(bus)
return getFirstAvailableBackendURL(bus.bus)
}
return getLeastLoadedBackendURL(bus, &up.n)
return getLeastLoadedBackendURL(bus.bus, &up.n)
}
func (up *URLPrefix) discoverBackendAddrsIfNeeded() {
@@ -407,25 +554,24 @@ func (up *URLPrefix) discoverBackendAddrsIfNeeded() {
cancel()
// generate new backendURLs for the resolved IPs
var busNew []*backendURL
busNew := newBackendURLs()
for _, bu := range up.busOriginal {
host := bu.Hostname()
for _, addr := range hostToAddrs[host] {
buCopy := *bu
buCopy.Host = addr
busNew = append(busNew, &backendURL{
url: &buCopy,
})
busNew.add(&buCopy)
}
}
pbus := up.bus.Load()
if areEqualBackendURLs(*pbus, busNew) {
bus := up.bus.Load()
if areEqualBackendURLs(bus.bus, busNew.bus) {
return
}
// Store new backend urls
up.bus.Store(&busNew)
up.bus.Store(busNew)
bus.stopHealthChecks()
}
func areEqualBackendURLs(a, b []*backendURL) bool {
@@ -456,53 +602,66 @@ func getFirstAvailableBackendURL(bus []*backendURL) *backendURL {
for i := 1; i < len(bus); i++ {
if !bus[i].isBroken() {
bu = bus[i]
break
bu.get()
return bu
}
}
bu.get()
return bu
return nil
}
// getLeastLoadedBackendURL returns the backendURL with the minimum number of concurrent requests.
// getLeastLoadedBackendURL returns a non-broken backendURL with the lowest number of concurrent requests.
//
// backendURL.put() must be called on the returned backendURL after the request is complete.
func getLeastLoadedBackendURL(bus []*backendURL, atomicCounter *atomic.Uint32) *backendURL {
if len(bus) == 1 {
// Fast path - return the only backend url.
bu := bus[0]
if bu.isBroken() {
return nil
}
bu.get()
return bu
}
// Slow path - select other backend urls.
n := atomicCounter.Add(1) - 1
for i := uint32(0); i < uint32(len(bus)); i++ {
for i := range uint32(len(bus)) {
idx := (n + i) % uint32(len(bus))
bu := bus[idx]
if bu.isBroken() {
continue
}
if bu.concurrentRequests.Load() == 0 {
// Fast path - return the backend with zero concurrently executed requests.
// Do not use CompareAndSwap() instead of Load(), since it is much slower on systems with many CPU cores.
bu.concurrentRequests.Add(1)
// The Load() in front of CompareAndSwap() avoids CAS overhead for items with values bigger than 0.
if bu.concurrentRequests.Load() == 0 && bu.concurrentRequests.CompareAndSwap(0, 1) {
atomicCounter.CompareAndSwap(n+1, idx+1)
// There is no need in the call bu.get(), because we already incremented bu.concurrentRequests above.
return bu
}
}
// Slow path - return the backend with the minimum number of concurrently executed requests.
buMin := bus[n%uint32(len(bus))]
minRequests := buMin.concurrentRequests.Load()
for _, bu := range bus {
buMinIdx := n % uint32(len(bus))
minRequests := bus[buMinIdx].concurrentRequests.Load()
for i := uint32(1); i < uint32(len(bus)); i++ {
idx := (n + i) % uint32(len(bus))
bu := bus[idx]
if bu.isBroken() {
continue
}
if n := bu.concurrentRequests.Load(); n < minRequests || buMin.isBroken() {
buMin = bu
minRequests = n
reqs := bu.concurrentRequests.Load()
if reqs < minRequests || bus[buMinIdx].isBroken() {
buMinIdx = idx
minRequests = reqs
}
}
buMin := bus[buMinIdx]
if buMin.isBroken() {
return nil
}
buMin.get()
atomicCounter.CompareAndSwap(n+1, buMinIdx+1)
return buMin
}
@@ -619,11 +778,9 @@ func initAuthConfig() {
configTimestamp.Set(fasttime.UnixTimestamp())
stopCh = make(chan struct{})
authConfigWG.Add(1)
go func() {
defer authConfigWG.Done()
authConfigWG.Go(func() {
authConfigReloader(sighupCh)
}()
})
}
func stopAuthConfig() {
@@ -679,6 +836,9 @@ var (
// authUsers contains the currently loaded auth users
authUsers atomic.Pointer[map[string]*UserInfo]
// jwt authentication cache
jwtAuthCache atomic.Pointer[jwtCache]
authConfigWG sync.WaitGroup
stopCh chan struct{}
)
@@ -695,7 +855,7 @@ func reloadAuthConfig() (bool, error) {
ok, err := reloadAuthConfigData(data)
if err != nil {
return false, fmt.Errorf("failed to pars -auth.config=%q: %w", *authConfigPath, err)
return false, fmt.Errorf("failed to parse -auth.config=%q: %w", *authConfigPath, err)
}
if !ok {
return false, nil
@@ -718,6 +878,16 @@ func reloadAuthConfigData(data []byte) (bool, error) {
return false, fmt.Errorf("failed to parse auth config: %w", err)
}
jui, oidcDP, err := parseJWTUsers(ac)
if err != nil {
return false, fmt.Errorf("failed to parse JWT users from auth config: %w", err)
}
oidcDP.startDiscovery()
jwtc := &jwtCache{
users: jui,
oidcDP: oidcDP,
}
m, err := parseAuthConfigUsers(ac)
if err != nil {
return false, fmt.Errorf("failed to parse users from auth config: %w", err)
@@ -725,13 +895,24 @@ func reloadAuthConfigData(data []byte) (bool, error) {
acPrev := authConfig.Load()
if acPrev != nil {
acPrev.UnauthorizedUser.stopHealthChecks()
for i := range acPrev.Users {
acPrev.Users[i].stopHealthChecks()
}
metrics.UnregisterSet(acPrev.ms, true)
}
metrics.RegisterSet(ac.ms)
jwtcPrev := jwtAuthCache.Load()
if jwtcPrev != nil {
jwtcPrev.oidcDP.stopDiscovery()
}
authConfig.Store(ac)
authConfigData.Store(&data)
authUsers.Store(&m)
jwtAuthCache.Store(jwtc)
return true, nil
}
@@ -756,12 +937,18 @@ func parseAuthConfig(data []byte) (*AuthConfig, error) {
if ui.BearerToken != "" {
return nil, fmt.Errorf("field bearer_token can't be specified for unauthorized_user section")
}
if ui.JWT != nil {
return nil, fmt.Errorf("field jwt can't be specified for unauthorized_user section")
}
if ui.AuthToken != "" {
return nil, fmt.Errorf("field auth_token can't be specified for unauthorized_user section")
}
if ui.Name != "" {
return nil, fmt.Errorf("field name can't be specified for unauthorized_user section")
}
if err := parseJWTPlaceholdersForUserInfo(ui, false); err != nil {
return nil, err
}
if err := ui.initURLs(); err != nil {
return nil, err
}
@@ -771,6 +958,8 @@ func parseAuthConfig(data []byte) (*AuthConfig, error) {
return nil, fmt.Errorf("cannot parse metric_labels for unauthorized_user: %w", err)
}
ui.requests = ac.ms.NewCounter(`vmauth_unauthorized_user_requests_total` + metricLabels)
ui.requestErrors = ac.ms.NewCounter(`vmauth_unauthorized_user_request_errors_total` + metricLabels)
ui.backendRequests = ac.ms.NewCounter(`vmauth_unauthorized_user_request_backend_requests_total` + metricLabels)
ui.backendErrors = ac.ms.NewCounter(`vmauth_unauthorized_user_request_backend_errors_total` + metricLabels)
ui.requestsDuration = ac.ms.NewSummary(`vmauth_unauthorized_user_request_duration_seconds` + metricLabels)
ui.concurrencyLimitCh = make(chan struct{}, ui.getMaxConcurrentRequests())
@@ -800,16 +989,27 @@ func parseAuthConfigUsers(ac *AuthConfig) (map[string]*UserInfo, error) {
}
for i := range uis {
ui := &uis[i]
// users with jwt tokens are parsed by parseJWTUsers function.
// the function also checks that users with jwt tokens do not have auth tokens, bearer tokens, usernames and passwords.
if ui.JWT != nil {
continue
}
ats, err := getAuthTokens(ui.AuthToken, ui.BearerToken, ui.Username, ui.Password)
if err != nil {
return nil, err
}
for _, at := range ats {
if uiOld := byAuthToken[at]; uiOld != nil {
return nil, fmt.Errorf("duplicate auth token=%q found for username=%q, name=%q; the previous one is set for username=%q, name=%q",
at, ui.Username, ui.Name, uiOld.Username, uiOld.Name)
}
}
if err := parseJWTPlaceholdersForUserInfo(ui, false); err != nil {
return nil, err
}
if err := ui.initURLs(); err != nil {
return nil, err
}
@@ -819,6 +1019,8 @@ func parseAuthConfigUsers(ac *AuthConfig) (map[string]*UserInfo, error) {
return nil, fmt.Errorf("cannot parse metric_labels: %w", err)
}
ui.requests = ac.ms.GetOrCreateCounter(`vmauth_user_requests_total` + metricLabels)
ui.requestErrors = ac.ms.GetOrCreateCounter(`vmauth_user_request_errors_total` + metricLabels)
ui.backendRequests = ac.ms.GetOrCreateCounter(`vmauth_user_request_backend_requests_total` + metricLabels)
ui.backendErrors = ac.ms.GetOrCreateCounter(`vmauth_user_request_backend_errors_total` + metricLabels)
ui.requestsDuration = ac.ms.GetOrCreateSummary(`vmauth_user_request_duration_seconds` + metricLabels)
mcr := ui.getMaxConcurrentRequests()
@@ -907,6 +1109,7 @@ func (ui *UserInfo) initURLs() error {
return err
}
}
for _, e := range ui.URLMaps {
if len(e.SrcPaths) == 0 && len(e.SrcHosts) == 0 && len(e.SrcQueryArgs) == 0 && len(e.SrcHeaders) == 0 {
return fmt.Errorf("missing `src_paths`, `src_hosts`, `src_query_args` and `src_headers` in `url_map`")
@@ -966,6 +1169,9 @@ func (ui *UserInfo) name() string {
h := xxhash.Sum64([]byte(ui.AuthToken))
return fmt.Sprintf("auth_token:hash:%016X", h)
}
if ui.JWT != nil {
return `jwt`
}
return ""
}
@@ -1053,13 +1259,11 @@ func (up *URLPrefix) sanitizeAndInitialize() error {
}
// Initialize up.bus
bus := make([]*backendURL, len(up.busOriginal))
for i, bu := range up.busOriginal {
bus[i] = &backendURL{
url: bu,
}
bus := newBackendURLs()
for _, bu := range up.busOriginal {
bus.add(bu)
}
up.bus.Store(&bus)
up.bus.Store(bus)
return nil
}

View File

@@ -4,8 +4,11 @@ import (
"bytes"
"fmt"
"net"
"net/http"
"net/url"
"strings"
"testing"
"time"
"gopkg.in/yaml.v2"
@@ -276,6 +279,50 @@ users:
url_prefix: http://foo.bar
metric_labels:
not-prometheus-compatible: value
`)
// placeholder in url_prefix
f(`
users:
- username: foo
password: bar
url_prefix: 'http://ahost/{{a_placeholder}}/foobar'
`)
// placeholder in a header
f(`
users:
- username: foo
password: bar
headers:
- 'X-Foo: {{a_placeholder}}'
url_prefix: 'http://ahost'
`)
// placeholder in url_prefix
f(`
users:
- username: foo
password: bar
url_prefix: 'http://ahost/{{a_placeholder}}/foobar'
`)
// placeholder in a header in url_map
f(`
users:
- username: foo
password: bar
url_map:
- src_paths: ["/select/.*"]
headers:
- 'X-Foo: {{a_placeholder}}'
url_prefix: 'http://ahost'
`)
// placeholder in a header in url_map
f(`
users:
- username: foo
password: bar
url_map:
- src_paths: ["/select/.*"]
url_prefix: 'http://ahost/{{a_placeholder}}/foobar'
`)
}
@@ -378,7 +425,7 @@ users:
RetryStatusCodes: []int{500, 501},
LoadBalancingPolicy: "first_available",
MergeQueryArgs: []string{"foo", "bar"},
DropSrcPathPrefixParts: intp(1),
DropSrcPathPrefixParts: new(1),
DiscoverBackendIPs: &discoverBackendIPsTrue,
},
}, nil)
@@ -621,6 +668,47 @@ unauthorized_user:
},
},
})
// skip user info with jwt, it is parsed by parseJWTUsers
f(`
users:
- username: foo
password: bar
url_prefix: http://aaa:343/bbb
- jwt: {skip_verify: true}
url_prefix: http://aaa:343/bbb
`, map[string]*UserInfo{
getHTTPAuthBasicToken("foo", "bar"): {
Username: "foo",
Password: "bar",
URLPrefix: mustParseURL("http://aaa:343/bbb"),
},
}, nil)
// Multiple users with access logs enabled
f(`
users:
- username: foo
url_prefix: http://foo
access_log: {}
- username: bar
url_prefix: https://bar/x/
access_log:
filters:
skip_status_codes: [404]
`, map[string]*UserInfo{
getHTTPAuthBasicToken("foo", ""): {
Username: "foo",
URLPrefix: mustParseURL("http://foo"),
AccessLog: &AccessLog{},
},
getHTTPAuthBasicToken("bar", ""): {
Username: "bar",
URLPrefix: mustParseURL("https://bar/x/"),
AccessLog: &AccessLog{Filters: &AccessLogFilters{SkipStatusCodes: []int{404}}},
},
}, nil)
}
func TestParseAuthConfigPassesTLSVerificationConfig(t *testing.T) {
@@ -752,10 +840,12 @@ func TestGetLeastLoadedBackendURL(t *testing.T) {
})
up.loadBalancingPolicy = "least_loaded"
pbus := up.bus.Load()
bus := pbus.bus
fn := func(ns ...int) {
t.Helper()
pbus := up.bus.Load()
bus := *pbus
for i, b := range bus {
got := int(b.concurrentRequests.Load())
exp := ns[i]
@@ -767,45 +857,52 @@ func TestGetLeastLoadedBackendURL(t *testing.T) {
up.getBackendURL()
fn(1, 0, 0)
up.getBackendURL()
fn(1, 1, 0)
up.getBackendURL()
fn(1, 1, 1)
up.getBackendURL()
up.getBackendURL()
fn(2, 2, 1)
bus := up.bus.Load()
pbus := *bus
pbus[0].concurrentRequests.Add(2)
pbus[2].concurrentRequests.Add(5)
fn(4, 2, 6)
bus[1].put()
bus[2].put()
fn(1, 0, 0)
up.getBackendURL()
fn(4, 3, 6)
fn(1, 1, 0)
bus[1].put()
up.getBackendURL()
fn(4, 4, 6)
up.getBackendURL()
fn(4, 5, 6)
up.getBackendURL()
fn(5, 5, 6)
up.getBackendURL()
fn(6, 5, 6)
up.getBackendURL()
fn(6, 6, 6)
up.getBackendURL()
fn(6, 6, 7)
fn(1, 0, 1)
up.getBackendURL()
up.getBackendURL()
fn(7, 7, 7)
fn(1, 1, 2)
bus[0].concurrentRequests.Add(2)
bus[2].concurrentRequests.Add(2)
fn(3, 1, 4)
up.getBackendURL()
fn(3, 2, 4)
up.getBackendURL()
fn(3, 3, 4)
up.getBackendURL()
fn(4, 3, 4)
up.getBackendURL()
fn(4, 4, 4)
bus[0].put()
bus[2].put()
up.getBackendURL()
fn(3, 4, 4)
up.getBackendURL()
fn(4, 4, 4)
}
func TestBrokenBackend(t *testing.T) {
@@ -816,13 +913,13 @@ func TestBrokenBackend(t *testing.T) {
})
up.loadBalancingPolicy = "least_loaded"
pbus := up.bus.Load()
bus := *pbus
bus := pbus.bus
// explicitly mark one of the backends as broken
bus[1].setBroken()
// broken backend should never return while there are healthy backends
for i := 0; i < 1e3; i++ {
for range int(1e3) {
b := up.getBackendURL()
if b.isBroken() {
t.Fatalf("unexpected broken backend %q", b.url)
@@ -839,7 +936,7 @@ func TestDiscoverBackendIPsWithIPV6(t *testing.T) {
up.discoverBackendAddrsIfNeeded()
pbus := up.bus.Load()
bus := *pbus
bus := pbus.bus
if len(bus) != 1 {
t.Fatalf("expected url list to be of size 1; got %d instead", len(bus))
@@ -899,6 +996,41 @@ func TestDiscoverBackendIPsWithIPV6(t *testing.T) {
}
func TestLogRequest(t *testing.T) {
ui := &UserInfo{AccessLog: &AccessLog{}}
testOutput := &bytes.Buffer{}
logger.SetOutputForTests(testOutput)
defer logger.ResetOutputForTest()
req, err := http.NewRequest("GET", "http://localhost:8080/select/0/prometheus", nil)
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
f := func(user string, status int, duration time.Duration, expectedLog string) {
t.Helper()
testOutput.Reset()
ui.logRequest(req, user, status, duration)
got := testOutput.String()
if expectedLog == "" && got != "" {
t.Fatalf("expected empty log, got %q", got)
}
if !strings.Contains(got, expectedLog) {
t.Fatalf("output \n%q \nshould contain \n%q", testOutput.String(), expectedLog)
}
}
f("foo", 200, 10*time.Millisecond, `access_log request_host="localhost:8080" request_uri="" status_code=200 remote_addr="" user_agent="" referer="" duration_ms=10 username="foo"`)
f("foo", 404, time.Second, `access_log request_host="localhost:8080" request_uri="" status_code=404 remote_addr="" user_agent="" referer="" duration_ms=1000 username="foo"`)
ui.AccessLog.Filters = &AccessLogFilters{SkipStatusCodes: []int{200}}
f("foo", 200, 10*time.Millisecond, ``)
f("foo", 404, 10*time.Millisecond, `access_log request_host="localhost:8080" request_uri="" status_code=404 remote_addr="" user_agent="" referer="" duration_ms=10 username="foo"`)
}
func getRegexs(paths []string) []*Regex {
var sps []*Regex
for _, path := range paths {
@@ -933,16 +1065,14 @@ func mustParseURL(u string) *URLPrefix {
}
func mustParseURLs(us []string) *URLPrefix {
bus := make([]*backendURL, len(us))
bus := newBackendURLs()
urls := make([]*url.URL, len(us))
for i, u := range us {
pu, err := url.Parse(u)
if err != nil {
panic(fmt.Errorf("BUG: cannot parse %q: %w", u, err))
}
bus[i] = &backendURL{
url: pu,
}
bus.add(pu)
urls[i] = pu
}
up := &URLPrefix{}
@@ -951,15 +1081,11 @@ func mustParseURLs(us []string) *URLPrefix {
} else {
up.vOriginal = us
}
up.bus.Store(&bus)
up.bus.Store(bus)
up.busOriginal = urls
return up
}
func intp(n int) *int {
return &n
}
func mustNewRegex(s string) *Regex {
var re Regex
if err := yaml.Unmarshal([]byte(s), &re); err != nil {

View File

@@ -116,6 +116,20 @@ users:
- "http://default1:8888/unsupported_url_handler"
- "http://default2:8888/unsupported_url_handler"
# A JWT token based routing:
# - Requests with JWT token that has the following structure:
# {"team": "ops", "security": {"read_access": "1"}, "vm_access": {"metrics_account_id": 1000,"metrics_project_id":5}}
# is routed to vmselect nodes and request url placeholder replaced with metrics tenant identificators
- name: jwt-opts-team
jwt:
match_claims:
team: ops
security.read_access: "1"
skip_verify: true
url_prefix:
- "http://vmselect1:8481/select/{{.MetricsTenant}}/prometheus"
- "http://vmselect2:8481/select/{{.MetricsTenant}}/prometheus"
# Requests without Authorization header are proxied according to `unauthorized_user` section.
# Requests are proxied in round-robin fashion between `url_prefix` backends.
# The deny_partial_response query arg is added to all the proxied requests.
@@ -125,3 +139,8 @@ unauthorized_user:
- http://vmselect-az1/?deny_partial_response=1
- http://vmselect-az2/?deny_partial_response=1
retry_status_codes: [503, 500]
# log access for requests routed to this user
access_log:
filters:
# except requests with Status Codes below
skip_status_codes: [200, 202]

486
app/vmauth/jwt.go Normal file
View File

@@ -0,0 +1,486 @@
package main
import (
"fmt"
"net/url"
"os"
"slices"
"sort"
"strings"
"sync"
"sync/atomic"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/jwt"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
const (
metricsTenantPlaceholder = `{{.MetricsTenant}}`
metricsExtraLabelsPlaceholder = `{{.MetricsExtraLabels}}`
metricsExtraFiltersPlaceholder = `{{.MetricsExtraFilters}}`
logsAccountIDPlaceholder = `{{.LogsAccountID}}`
logsProjectIDPlaceholder = `{{.LogsProjectID}}`
logsExtraFiltersPlaceholder = `{{.LogsExtraFilters}}`
logsExtraStreamFiltersPlaceholder = `{{.LogsExtraStreamFilters}}`
placeholderPrefix = `{{`
)
var allPlaceholders = []string{
metricsTenantPlaceholder,
metricsExtraLabelsPlaceholder,
metricsExtraFiltersPlaceholder,
logsAccountIDPlaceholder,
logsProjectIDPlaceholder,
logsExtraFiltersPlaceholder,
logsExtraStreamFiltersPlaceholder,
}
var urlPathPlaceHolders = []string{
metricsTenantPlaceholder,
logsAccountIDPlaceholder,
logsProjectIDPlaceholder,
}
type jwtCache struct {
// users contain UserInfo`s from AuthConfig with JWTConfig set
users []*UserInfo
oidcDP *oidcDiscovererPool
}
type JWTConfig struct {
PublicKeys []string `yaml:"public_keys,omitempty"`
PublicKeyFiles []string `yaml:"public_key_files,omitempty"`
SkipVerify bool `yaml:"skip_verify,omitempty"`
OIDC *oidcConfig `yaml:"oidc,omitempty"`
MatchClaims map[string]string `yaml:"match_claims,omitempty"`
parsedMatchClaims []*jwt.Claim
// verifierPool is used to verify JWT tokens.
// It is initialized from PublicKeys and/or PublicKeyFiles.
// In this case, it is initialized once at config reload and never updated until next reload
// In case of OIDC, it is initialized on config reload and periodically updated by discovery process.
verifierPool atomic.Pointer[jwt.VerifierPool]
}
func parseJWTUsers(ac *AuthConfig) ([]*UserInfo, *oidcDiscovererPool, error) {
jui := make([]*UserInfo, 0, len(ac.Users))
oidcDP := &oidcDiscovererPool{}
uniqClaims := make(map[string]*UserInfo)
var sortedClaims []string
for idx, ui := range ac.Users {
jwtToken := ui.JWT
if jwtToken == nil {
continue
}
if ui.AuthToken != "" || ui.BearerToken != "" || ui.Username != "" || ui.Password != "" {
return nil, nil, fmt.Errorf("auth_token, bearer_token, username and password cannot be specified if jwt is set")
}
if len(jwtToken.PublicKeys) == 0 && len(jwtToken.PublicKeyFiles) == 0 && !jwtToken.SkipVerify && jwtToken.OIDC == nil {
return nil, nil, fmt.Errorf("jwt must contain at least a single public key, public_key_files, oidc or have skip_verify=true")
}
var claimsString string
sortedClaims = sortedClaims[:0]
parsedClaims := make([]*jwt.Claim, 0, len(jwtToken.MatchClaims))
for ck, cv := range jwtToken.MatchClaims {
sortedClaims = append(sortedClaims, fmt.Sprintf("%s=%s", ck, cv))
pc, err := jwt.NewClaim(ck, cv)
if err != nil {
return nil, nil, fmt.Errorf("incorrect match claim, key=%q, value regex=%q: %w", ck, cv, err)
}
parsedClaims = append(parsedClaims, pc)
}
ui.JWT.parsedMatchClaims = parsedClaims
sort.Strings(sortedClaims)
claimsString = strings.Join(sortedClaims, ",")
if oldUI, ok := uniqClaims[claimsString]; ok {
return nil, nil, fmt.Errorf("duplicate match claims=%q found for name=%q at idx=%d; the previous one is set for name=%q", claimsString, ui.Name, idx, oldUI.Name)
}
uniqClaims[claimsString] = &ui
if len(jwtToken.PublicKeys) > 0 || len(jwtToken.PublicKeyFiles) > 0 {
keys := make([]any, 0, len(jwtToken.PublicKeys)+len(jwtToken.PublicKeyFiles))
for i := range jwtToken.PublicKeys {
k, err := jwt.ParseKey([]byte(jwtToken.PublicKeys[i]))
if err != nil {
return nil, nil, err
}
keys = append(keys, k)
}
for _, filePath := range jwtToken.PublicKeyFiles {
keyData, err := os.ReadFile(filePath)
if err != nil {
return nil, nil, fmt.Errorf("cannot read public key from file %q: %w", filePath, err)
}
k, err := jwt.ParseKey(keyData)
if err != nil {
return nil, nil, fmt.Errorf("cannot parse public key from file %q: %w", filePath, err)
}
keys = append(keys, k)
}
vp, err := jwt.NewVerifierPool(keys)
if err != nil {
return nil, nil, err
}
jwtToken.verifierPool.Store(vp)
}
if jwtToken.OIDC != nil {
if len(jwtToken.PublicKeys) > 0 || len(jwtToken.PublicKeyFiles) > 0 || jwtToken.SkipVerify {
return nil, nil, fmt.Errorf("jwt with oidc cannot contain public keys or have skip_verify=true")
}
if jwtToken.OIDC.Issuer == "" {
return nil, nil, fmt.Errorf("oidc issuer cannot be empty")
}
isserURL, err := url.Parse(jwtToken.OIDC.Issuer)
if err != nil {
return nil, nil, fmt.Errorf("oidc issuer %q must be a valid URL", jwtToken.OIDC.Issuer)
}
if isserURL.Scheme != "https" && isserURL.Scheme != "http" {
return nil, nil, fmt.Errorf("oidc issuer %q must have http or https scheme", jwtToken.OIDC.Issuer)
}
oidcDP.createOrAdd(ui.JWT.OIDC.Issuer, &ui.JWT.verifierPool)
}
if err := parseJWTPlaceholdersForUserInfo(&ui, true); err != nil {
return nil, nil, err
}
if err := ui.initURLs(); err != nil {
return nil, nil, err
}
metricLabels, err := ui.getMetricLabels()
if err != nil {
return nil, nil, fmt.Errorf("cannot parse metric_labels: %w", err)
}
ui.requests = ac.ms.GetOrCreateCounter(`vmauth_user_requests_total` + metricLabels)
ui.requestErrors = ac.ms.GetOrCreateCounter(`vmauth_user_request_errors_total` + metricLabels)
ui.backendRequests = ac.ms.GetOrCreateCounter(`vmauth_user_request_backend_requests_total` + metricLabels)
ui.backendErrors = ac.ms.GetOrCreateCounter(`vmauth_user_request_backend_errors_total` + metricLabels)
ui.requestsDuration = ac.ms.GetOrCreateSummary(`vmauth_user_request_duration_seconds` + metricLabels)
mcr := ui.getMaxConcurrentRequests()
ui.concurrencyLimitCh = make(chan struct{}, mcr)
ui.concurrencyLimitReached = ac.ms.GetOrCreateCounter(`vmauth_user_concurrent_requests_limit_reached_total` + metricLabels)
_ = ac.ms.GetOrCreateGauge(`vmauth_user_concurrent_requests_capacity`+metricLabels, func() float64 {
return float64(cap(ui.concurrencyLimitCh))
})
_ = ac.ms.GetOrCreateGauge(`vmauth_user_concurrent_requests_current`+metricLabels, func() float64 {
return float64(len(ui.concurrencyLimitCh))
})
rt, err := newRoundTripper(ui.TLSCAFile, ui.TLSCertFile, ui.TLSKeyFile, ui.TLSServerName, ui.TLSInsecureSkipVerify)
if err != nil {
return nil, nil, fmt.Errorf("cannot initialize HTTP RoundTripper: %w", err)
}
ui.rt = rt
jui = append(jui, &ui)
}
// sort by amount of matching claims
// it allows to more specific claim win in case of clash
sort.SliceStable(jui, func(i, j int) bool {
return len(jui[i].JWT.MatchClaims) > len(jui[j].JWT.MatchClaims)
})
return jui, oidcDP, nil
}
var tokenPool sync.Pool
func getToken() *jwt.Token {
tkn := tokenPool.Get()
if tkn == nil {
return &jwt.Token{}
}
return tkn.(*jwt.Token)
}
func putToken(tkn *jwt.Token) {
tkn.Reset()
tokenPool.Put(tkn)
}
func getJWTUserInfo(ats []string) (*UserInfo, *jwt.Token) {
js := *jwtAuthCache.Load()
if len(js.users) == 0 {
return nil, nil
}
tkn := getToken()
for _, at := range ats {
if strings.Count(at, ".") != 2 {
continue
}
at, _ = strings.CutPrefix(at, `http_auth:`)
tkn.Reset()
if err := tkn.Parse(at, true); err != nil {
if *logInvalidAuthTokens {
logger.Infof("cannot parse jwt token: %s", err)
}
continue
}
if tkn.IsExpired(time.Now()) {
if *logInvalidAuthTokens {
// TODO: add more context:
// token claims with issuer
logger.Infof("jwt token is expired")
}
continue
}
if ui := getUserInfoByJWTToken(tkn, js.users); ui != nil {
return ui, tkn
}
}
putToken(tkn)
return nil, nil
}
func getUserInfoByJWTToken(tkn *jwt.Token, users []*UserInfo) *UserInfo {
for _, ui := range users {
if !tkn.MatchClaims(ui.JWT.parsedMatchClaims) {
continue
}
if ui.JWT.SkipVerify {
return ui
}
if ui.JWT.OIDC != nil {
// OIDC requires iss claim.
// It must match the discovery issuer URL set in OIDC config.
// https://openid.net/specs/openid-connect-discovery-1_0.html#ProviderMetadata
if tkn.Issuer() == "" {
if *logInvalidAuthTokens {
logger.Infof("jwt token must have issuer filed")
}
return nil
}
if tkn.Issuer() != ui.JWT.OIDC.Issuer {
if *logInvalidAuthTokens {
logger.Infof("jwt token issuer: %q does not match oidc issuer: %q", tkn.Issuer(), ui.JWT.OIDC.Issuer)
}
return nil
}
}
vp := ui.JWT.verifierPool.Load()
if vp == nil {
if *logInvalidAuthTokens {
logger.Infof("jwt verifier not initialed")
}
return nil
}
if err := vp.Verify(tkn); err != nil {
if *logInvalidAuthTokens {
logger.Infof("cannot verify jwt token: %s", err)
}
return nil
}
return ui
}
if *logInvalidAuthTokens {
logger.Infof("no user match jwt token")
}
return nil
}
func replaceJWTPlaceholders(bu *backendURL, hc HeadersConf, vma *jwt.VMAccessClaim) (*url.URL, HeadersConf) {
if !bu.hasPlaceHolders && !hc.hasAnyPlaceHolders {
return bu.url, hc
}
targetURL := bu.url
data := jwtClaimsData(vma)
if bu.hasPlaceHolders {
// template url params and request path
// make a copy of url
uCopy := *bu.url
for _, uph := range urlPathPlaceHolders {
replacement := data[uph]
uCopy.Path = strings.ReplaceAll(uCopy.Path, uph, replacement[0])
}
query := uCopy.Query()
var foundAnyQueryPlaceholder bool
var templatedValues []string
for param, values := range query {
templatedValues = templatedValues[:0]
// filter in-place values with placeholders
// and accumulate replacements
// it will change the order of param values
// but it's not guaranteed
// and will be changed in any way with multiple arg templates
var cnt int
for _, value := range values {
if dv, ok := data[value]; ok {
foundAnyQueryPlaceholder = true
templatedValues = append(templatedValues, dv...)
continue
}
values[cnt] = value
cnt++
}
values = values[:cnt]
values = append(values, templatedValues...)
query[param] = values
}
if foundAnyQueryPlaceholder {
uCopy.RawQuery = query.Encode()
}
targetURL = &uCopy
}
if hc.hasAnyPlaceHolders {
// make a copy of headers and update only values with placeholder
rhs := make([]*Header, 0, len(hc.RequestHeaders))
for _, rh := range hc.RequestHeaders {
if dv, ok := data[rh.Value]; ok {
rh := &Header{
Name: rh.Name,
Value: strings.Join(dv, ","),
}
rhs = append(rhs, rh)
continue
}
rhs = append(rhs, rh)
}
hc.RequestHeaders = rhs
}
return targetURL, hc
}
func jwtClaimsData(vma *jwt.VMAccessClaim) map[string][]string {
data := map[string][]string{
// TODO: optimize at parsing stage
metricsTenantPlaceholder: {fmt.Sprintf("%d:%d", vma.MetricsAccountID, vma.MetricsProjectID)},
metricsExtraLabelsPlaceholder: vma.MetricsExtraLabels,
metricsExtraFiltersPlaceholder: vma.MetricsExtraFilters,
// TODO: optimize at parsing stage
logsAccountIDPlaceholder: {fmt.Sprintf("%d", vma.LogsAccountID)},
logsProjectIDPlaceholder: {fmt.Sprintf("%d", vma.LogsProjectID)},
logsExtraFiltersPlaceholder: vma.LogsExtraFilters,
logsExtraStreamFiltersPlaceholder: vma.LogsExtraStreamFilters,
}
return data
}
func parseJWTPlaceholdersForUserInfo(ui *UserInfo, isAllowed bool) error {
if ui.URLPrefix != nil {
if err := validateJWTPlaceholdersForURL(ui.URLPrefix, isAllowed); err != nil {
return err
}
}
if err := parsePlaceholdersForHC(&ui.HeadersConf, isAllowed); err != nil {
return err
}
if ui.DefaultURL != nil {
if err := validateJWTPlaceholdersForURL(ui.DefaultURL, isAllowed); err != nil {
return fmt.Errorf("invalid `default_url` placeholders: %w", err)
}
}
for i := range ui.URLMaps {
e := &ui.URLMaps[i]
if e.URLPrefix != nil {
if err := validateJWTPlaceholdersForURL(e.URLPrefix, isAllowed); err != nil {
return fmt.Errorf("invalid `url_map` `url_prefix` placeholders: %w", err)
}
}
if err := parsePlaceholdersForHC(&e.HeadersConf, isAllowed); err != nil {
return fmt.Errorf("invalid `url_map` headers placeholders: %w", err)
}
}
return nil
}
func validateJWTPlaceholdersForURL(up *URLPrefix, isAllowed bool) error {
for _, bu := range up.busOriginal {
ok := strings.Contains(bu.Path, placeholderPrefix)
if ok && !isAllowed {
return fmt.Errorf("placeholder: %q is only allowed at JWT token context", bu.Path)
}
if ok {
p := bu.Path
for _, ph := range allPlaceholders {
p = strings.ReplaceAll(p, ph, ``)
}
if strings.Contains(p, placeholderPrefix) {
return fmt.Errorf("invalid placeholder found in URL request path: %q, supported values are: %s", bu.Path, strings.Join(allPlaceholders, ", "))
}
}
for param, values := range bu.Query() {
for _, value := range values {
ok := strings.Contains(value, placeholderPrefix)
if ok && !isAllowed {
return fmt.Errorf("query param: %q with placeholder: %q is only allowed at JWT token context", param, value)
}
if ok {
// possible placeholder
if !slices.Contains(allPlaceholders, value) {
return fmt.Errorf("query param: %q has unsupported placeholder string: %q, supported values are: %s", param, value, strings.Join(allPlaceholders, ", "))
}
}
}
}
}
return nil
}
func parsePlaceholdersForHC(hc *HeadersConf, isAllowed bool) error {
for _, rhs := range hc.RequestHeaders {
ok := strings.Contains(rhs.Value, placeholderPrefix)
if ok && !isAllowed {
return fmt.Errorf("request header: %q placeholder: %q is only supported at JWT context", rhs.Name, rhs.Value)
}
if ok {
if !slices.Contains(allPlaceholders, rhs.Value) {
return fmt.Errorf("request header: %q has unsupported placeholder: %q, supported values are: %s", rhs.Name, rhs.Value, strings.Join(allPlaceholders, ", "))
}
hc.hasAnyPlaceHolders = true
}
}
for _, rhs := range hc.ResponseHeaders {
if strings.Contains(rhs.Value, placeholderPrefix) {
return fmt.Errorf("response header placeholders are not supported; found placeholder prefix at header: %q with value: %q", rhs.Name, rhs.Value)
}
}
return nil
}
func hasAnyPlaceholders(u *url.URL) bool {
if strings.Contains(u.Path, placeholderPrefix) {
return true
}
if len(u.Query()) == 0 {
return false
}
for _, values := range u.Query() {
for _, value := range values {
if strings.HasPrefix(value, placeholderPrefix) {
return true
}
}
}
return false
}

Some files were not shown because too many files have changed in this diff Show More