Compare commits


408 Commits

Author SHA1 Message Date
f41gh7
4109382d0f docs/changelog: sort changelog entries
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-12-01 11:12:06 +01:00
Zakhar Bessarab
9502c86fe3 app/vminsert/netstorage: fix list of nodes used for SD
Previously, vminsert used the original list of addrs instead of the
discovered addrs. Use the discovered list of addrs instead.
2025-12-01 11:12:06 +01:00
Artem Fetishev
a61b5e6da6 lib/lrucache: do not reset requests and misses after cache reset
Follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10072.

Do not reset requests and misses metrics since cache reset implies the
reset of the storage only.
2025-12-01 10:15:07 +01:00
Zakhar Bessarab
bdac7b52cb docs/changelog: cut v1.131.0 2025-11-28 20:20:26 +04:00
Zakhar Bessarab
d6d5e6e39d docs: update available from tags 2025-11-28 20:13:56 +04:00
Zakhar Bessarab
64d7122711 app/vmselect: run make vmui-update 2025-11-28 20:01:23 +04:00
Aliaksandr Valialkin
6da9334593 Makefile: generate quicktemplate output files only at lib and app directories
Previously the output files were incorrectly generated inside unexpected directories such as vendor
2025-11-28 16:07:46 +01:00
Nowa Ammerlaan
30be4c0859 protoparser/influx: account for excess white spaces before timestamp
Some Influx clients (such as the nimon monitoring client) add excess whitespace to the Influx line and do not set a
timestamp. The Influx protocol requires whitespace before the timestamp only when the timestamp is set, but these clients emit the whitespace even without a timestamp. Whitespace before an omitted timestamp confuses the parser.

This commit adds a check for the skipped timestamp and a test case for it.

Fixes: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10049
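
The edge case described above can be illustrated with a minimal Go sketch. It is not the code from lib/protoparser/influx: `splitTimestamp` is a hypothetical helper, and it deliberately ignores quoted string fields that may contain spaces.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitTimestamp is a hypothetical helper (not the real parser). It separates
// the optional trailing timestamp from the rest of an Influx line.
// Assumption: field values contain no spaces.
func splitTimestamp(line string) (rest string, ts int64, hasTS bool) {
	// Excess trailing whitespace must not be treated as a timestamp separator.
	line = strings.TrimRight(line, " \t")
	n := strings.LastIndexByte(line, ' ')
	if n < 0 {
		return line, 0, false
	}
	v, err := strconv.ParseInt(line[n+1:], 10, 64)
	if err != nil {
		// The last token is not an integer, so it belongs to the fields section.
		return line, 0, false
	}
	// Drop any extra spaces between the fields and the timestamp.
	return strings.TrimRight(line[:n], " \t"), v, true
}

func main() {
	for _, line := range []string{
		"cpu,host=a usage=1 1700000000000000000",
		"cpu,host=a usage=1   ", // whitespace, but no timestamp
	} {
		rest, ts, ok := splitTimestamp(line)
		fmt.Printf("%q -> rest=%q ts=%d hasTimestamp=%v\n", line, rest, ts, ok)
	}
}
```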
2025-11-28 14:37:24 +01:00
Nikolay
0c55cd7af8 app/vmselect: improve performance for multi-level requests
Previously, proxy vmselect (aka 1st level vmselect) parsed each
MetricBlock received from vmstorage before forwarding it to the top vmselect. This required additional CPU and memory, which greatly slowed down query requests.

This commit changes the lib/vmselectapi iterator API: instead of a MetricBlock, it returns the encoded MetricBlock as a byte slice.
This saves CPU and memory at proxy vmselect by eliminating the need to decode the MetricBlock received from storage.

In addition, it adds the following optimizations for proxy vmselect:
* reduces memory allocations by using an iterator pool
* adds a per-storageNode workerItem for the iterator

Also, it adds an optimization for vmstorage: it no longer performs an extra memory copy of MetricName for MetricBlock.

The vmselect and vmstorage metrics vm_vmselect_metric_rows_read_total and vm_metric_rows_read_total were removed, since they are not used in any dashboards or rules and the new Iterator API doesn't support them.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9899
2025-11-28 13:04:03 +01:00
Max Kotliar
1a0fe0f79f dashboards: make dashboards-sync 2025-11-27 16:53:00 +02:00
Max Kotliar
190573fb3d dashboards: Show "Disk space usage % by type" as stacked graph in Cluster dashboard. (#10089)
### Describe Your Changes

VictoriaMetrics - cluster dashboard.

vmstorage -> Disk space usage % by type pane.

Switch panel to 100% stacked view to show space distribution.

The goal is to highlight how space is split between datapoints and
indexdb types. Simple time-series values made this hard to see; a 100%
stacked layout makes the distribution immediately visible.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9932

was: <img width="1201" height="609" alt="Image"
src="https://github.com/user-attachments/assets/1d199e65-5a20-4c63-a251-b7087020f42a"
/>


now: 
<img width="1208" height="608" alt="Screenshot 2025-11-27 at 13 14 51"
src="https://github.com/user-attachments/assets/96aa32f3-1243-486b-bac8-2d3c0f4bdb7a"
/>


2025-11-27 16:51:06 +02:00
Aliaksandr Valialkin
4df724aed6 docs/victoriametrics/goals.md: clarify that bugs, which affect a small number of users at rare edge cases, can be fixed later 2025-11-27 14:29:06 +01:00
Artem Fetishev
5d7f53d92a lib/storage: use lrucache for tfss cache (#10072)
The purpose of this PR is the same as #10000, except `lrucache` is used
for implementing tfss cache.

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-27 14:25:43 +01:00
Andrii Chubatiuk
5c4688de2b chore(app/vmui): conditionally render accordion children (#10068)
### Describe Your Changes

Revert the change introduced in
483e00ffb9,
since rendering all nested children significantly impacts alerting
tab performance when there are many items.
@Loori-R @arturminchukov, what do you think about additionally using react-virtuoso
for the alerting tab to decrease DOM size?

2025-11-27 14:32:19 +02:00
Ben Randall
2029fe563a lib/protoparser/opentelemetry: use separate loggers for unsupported delta temporality/metric type logs (#10021)
A throttled logger will continue to log messages occasionally with a
suffix indicating how many similar logs were throttled. Using the same
logger for multiple log messages can result in certain logs being
entirely suppressed and invisible in the logs. This updates most of the
loggers used in `appendFromScopeMetrics` to be their own logger so that
"unsupported delta temporality/metric type" logs will be visible for all
metric types. Additionally, `skippedSampleLogger` is only used by
`appendSamplesFromHistogram` so this was moved closer to that function.

Related to #9447
Related to #9498


---------

Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
2025-11-27 14:20:05 +02:00
Andrii Chubatiuk
e5c530bb4c lib/flagutil: clarify usage of quotes in array flag values 2025-11-27 14:18:06 +02:00
Hui Wang
0a39ef03dc dashboard: tidy vmauth panels (#10088)
before:
<img width="2498" height="1042" alt="image"
src="https://github.com/user-attachments/assets/0bbd7cc2-7062-494f-827b-96d86133537f"
/>
after:
<img width="2497" height="968" alt="image"
src="https://github.com/user-attachments/assets/6256ccc2-2f8f-40ea-a23b-a1a20e242b3c"
/>
which is more consistent with other dashboards.
2025-11-27 14:14:15 +02:00
Max Kotliar
dc8543d5b5 docs: add links to issues in changelog 2025-11-27 14:10:01 +02:00
Aliaksandr Valialkin
a70421d457 lib/fs: avoid Go runtime stalls on Linux when all the GOMAXPROCS threads are blocked in major pagefaults while reading the data from memory-mapped files
Go runtime executes all the goroutines on GOMAXPROCS operating system threads.
Go runtime cannot switch the OS thread to another goroutine if the current goroutine
is stuck in the major pagefault while reading the data from memory-mapped file,
because Go runtime doesn't distinguish between reading from regular memory and reading
from memory-mapped file. So the OS thread becomes stuck while waiting until the OS
reads the data from file at the requested memory address and returns back control to Go application.

In the worst case it is possible that all the GOMAXPROCS threads are stuck in major pagefaults,
so Go runtime pauses executing all the goroutines. This state is possible in environments
with small GOMAXPROCS and high-latency disks such as NFS or small HDD-based disks at AWS.

See https://valyala.medium.com/mmap-in-go-considered-harmful-d92a25cb161d for more details.

This commit protects from such stalls by verifying whether the given memory location from memory-mapped file
is already loaded in the OS page cache before reading from that memory.
If the location isn't in the OS page cache, then it falls back to pread() syscall for reading the data from file.
Go runtime allocates extra OS threads for long-running syscalls, so it can continue executing goroutines
across all the GOMAXPROCS threads while reading the data from slow storage via pread() syscall.

This commit uses mincore() syscall for detecting whether the given memory page is available in the OS page cache.
It also caches mincore() results for up to a minute in order to reduce the overhead for the mincore() syscall.

This commit reduces the increase rate for the process_major_pagefaults_total metric by multiple orders of magnitude
on systems with high-latency disks.
2025-11-26 20:53:50 +01:00
Artem Fetishev
ffa5b26bd9 lib/lrucache: use uint64 for SizeBytes() and SizeMaxBytes() (#10077)
Currently, `lrucache.Cache` `SizeBytes()` and `SizeMaxBytes()` return
type is `int`. The cache `Entry.SizeBytes()` also returns `int` value.
Changing the type to `uint64` will allow using `uint64set.Set` as the
cache entry type (see #10072).

Please note that using `uint64` regardless of the CPU architecture
is not entirely correct, because on 32-bit systems the size won't ever
get bigger than `2^32`, so `uint64` is wider than necessary. However, the current
type (`int`) is not correct either, since it is signed and only
allows storing values up to `2^31`. Alternatively, all `SizeBytes()`
methods should return `uint`.
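
To make the ranges concrete, here is a small Go sketch. The `sizer` interface and `blob` type are hypothetical names used only for illustration; they are not the `lrucache` types.

```go
package main

import (
	"fmt"
	"math"
)

// sizer is a hypothetical stand-in for a cache entry interface whose size
// method returns uint64, as proposed in the commit message.
type sizer interface {
	SizeBytes() uint64
}

type blob []byte

func (b blob) SizeBytes() uint64 { return uint64(len(b)) }

// totalSize sums entry sizes without any signed intermediate value.
func totalSize(entries []sizer) uint64 {
	var n uint64
	for _, e := range entries {
		n += e.SizeBytes()
	}
	return n
}

func main() {
	// On 32-bit platforms int is 32 bits wide, so its maximum is 2^31-1,
	// while an unsigned 32-bit value reaches 2^32-1.
	fmt.Println("max int32: ", int64(math.MaxInt32))
	fmt.Println("max uint32:", uint64(math.MaxUint32))
	fmt.Println("total:", totalSize([]sizer{blob("ab"), blob("cde")}))
}
```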

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-26 11:40:24 +01:00
Max Kotliar
baed72173d dashboards: run make dashboards-sync 2025-11-25 20:13:45 +02:00
Max Kotliar
de3849995e .github: Add changelog tip linter 2025-11-25 13:42:39 +02:00
Yury Molodov
774ea9dade app/vmui: improve alert styles for better readability (#10012)
### Describe Your Changes

This PR improves vmui alert styles by adding borders between rows,
introducing a hover state for easier row identification, and aligning
badges to the left.

Related issue: #9856

| Before | After |
|--------|--------|
| <img width="1427" height="1310" alt="image"
src="https://github.com/user-attachments/assets/68f3469e-95df-449f-a85d-1c0285520e2d"
/> | <img width="1427" height="1310" alt="Image"
src="https://github.com/user-attachments/assets/89501efb-c66f-402a-9d14-01c86930a5e2"
/> |


---------

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-11-25 13:40:02 +02:00
Andrii Chubatiuk
1e802948ff app/vmui: fixed ability to select multiple metrics in explore metrics tab (#10008)
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9995

Changing only the `Select` component leads to an infinite refresh of the
ExploreMetricsGraphItem component, since the array gets a new reference
each time.

2025-11-25 13:30:56 +02:00
Yury Molodov
908514dcc9 app/vmui: fix rendering of multiple points at the same timestamp (#10010)
### Describe Your Changes

1. Removed the *step* control from the **Raw Query** page, as it didn’t
affect chart rendering and caused confusion.
2. Fixed rendering of multiple points with the same timestamp -
previously, the second point was hidden.
3. Added proper visualization for points with the same timestamp and
identical values: such points are now shown as a square, and the tooltip
displays the number of duplicates.

**Example:**

```json
{
  "values": [1, 22, 10, 10, 5, 6],
  "timestamps": [
    1761955247950,
    1761955247950,
    1761955248960,
    1761955248960,
    1761955251980,
    1761955252990
  ]
}
```
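
The grouping behind item 3 can be sketched in Go as follows. This is an illustration only, not the vmui code (vmui is a web application); `mergeDuplicates` and `point` are hypothetical names, and the sketch assumes both input slices have the same length.

```go
package main

import "fmt"

// point is a hypothetical render-ready sample.
type point struct {
	Timestamp int64
	Value     float64
	Count     int // how many identical samples were merged into this point
}

// mergeDuplicates keeps samples that share a timestamp but differ in value as
// separate points, and merges samples with the same timestamp and value into
// a single point with Count > 1.
func mergeDuplicates(timestamps []int64, values []float64) []point {
	type key struct {
		ts int64
		v  float64
	}
	index := map[key]int{} // key -> position in out
	var out []point
	for i, ts := range timestamps {
		k := key{ts, values[i]}
		if j, ok := index[k]; ok {
			out[j].Count++
			continue
		}
		index[k] = len(out)
		out = append(out, point{Timestamp: ts, Value: values[i], Count: 1})
	}
	return out
}

func main() {
	values := []float64{1, 22, 10, 10, 5, 6}
	timestamps := []int64{
		1761955247950, 1761955247950,
		1761955248960, 1761955248960,
		1761955251980, 1761955252990,
	}
	for _, p := range mergeDuplicates(timestamps, values) {
		fmt.Printf("ts=%d value=%v duplicates=%d\n", p.Timestamp, p.Value, p.Count)
	}
}
```

With the example data above, the two samples at 1761955247950 stay separate because their values differ, while the two samples at 1761955248960 collapse into one point with a count of 2.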

<img width="500" height="1120" alt="image"
src="https://github.com/user-attachments/assets/192aa43e-8008-4f03-8966-00f59e52ec40"
/>
<img width="300" height="676" alt="image"
src="https://github.com/user-attachments/assets/8e361cb3-1286-452a-a687-b6b40ba7807b"
/>

Related issues: #9667 and #9666


Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2025-11-25 11:00:33 +02:00
Cancai Cai
84046e5d3c docs/notes: fix syntax errors (#10019)
### Describe Your Changes

I'm not sure if this is a mistake.


Signed-off-by: cancaicai <2356672992@qq.com>
2025-11-24 20:38:01 +02:00
Andrei Baidarov
50a5b5dd8f vmselect: do not immediately fail request if vmstorage returns search… (#10030)
….maxConcurrentRequests error

### Describe Your Changes

If `vmstorage` is currently overloaded, it can return a
maxConcurrentRequests error. Currently `vmselect` immediately fails the whole
request, even if `replicationFactor` is set and other replicas could
respond without errors.

This PR treats such errors as regular errors, not fatal ones.

2025-11-24 20:11:02 +02:00
Hui Wang
27edd5057b add flag description for -selectNode (#10022) 2025-11-24 19:57:00 +02:00
cancaicai
090c5466b7 docs/storage: fix typo
Signed-off-by: cancaicai <2356672992@qq.com>
2025-11-24 15:48:25 +02:00
dependabot[bot]
4f44c9ed13 build(deps): bump golang.org/x/crypto from 0.43.0 to 0.45.0 (#10052)
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from
0.43.0 to 0.45.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="4e0068c009"><code>4e0068c</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="e79546e28b"><code>e79546e</code></a>
ssh: curb GSSAPI DoS risk by limiting number of specified OIDs</li>
<li><a
href="f91f7a7c31"><code>f91f7a7</code></a>
ssh/agent: prevent panic on malformed constraint</li>
<li><a
href="2df4153a03"><code>2df4153</code></a>
acme/autocert: let automatic renewal work with short lifetime certs</li>
<li><a
href="bcf6a849ef"><code>bcf6a84</code></a>
acme: pass context to request</li>
<li><a
href="b4f2b62076"><code>b4f2b62</code></a>
ssh: fix error message on unsupported cipher</li>
<li><a
href="79ec3a51fc"><code>79ec3a5</code></a>
ssh: allow to bind to a hostname in remote forwarding</li>
<li><a
href="122a78f140"><code>122a78f</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="c0531f9c34"><code>c0531f9</code></a>
all: eliminate vet diagnostics</li>
<li><a
href="0997000b45"><code>0997000</code></a>
all: fix some comments</li>
<li>Additional commits viewable in <a
href="https://github.com/golang/crypto/compare/v0.43.0...v0.45.0">compare
view</a></li>
</ul>
</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 15:15:55 +02:00
Zhu Jiekun
2d1ff1936c opentsdb: Avoid blocking when a connection doesn't send anything (#10045)
### Describe Your Changes

fix #9987 

Avoid blocking when a connection to `-opentsdbListenAddr` doesn't send
any data. This issue blocked other connections from being handled.

> This bug can be tested with:
> 1. Start VictoriaMetrics Single-node with `-opentsdbListenAddr=:4242`.
> 2. Run: `telnet 127.0.0.1 4242` without typing any data after the
connection is established.
> 3. Run (in another terminal, after step 2): `curl -H 'Content-Type:
application/json' -d
'{"metric":"x.y.z","value":2222222.34,"tags":{"t1":"v1","t2":"v2"}}'
http://localhost:4242/api/put`
> 
> Before the change:
> - Step 3 blocked indefinitely.
> 
> Expected result after the change:
> - Step 3 is executed.
> - The connection established in step 2 is closed after 5 seconds.


---------

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-11-24 14:31:58 +02:00
Mathias Palmersheim
1a5169adda Remove threshold from available cpu panel (#10056)
### Describe Your Changes

fixes #9988 by removing the cpu threshold from the Available CPU panel

2025-11-24 14:17:40 +02:00
Kirill Yurkov
4d3b511d06 docs: link faq for large indexdb (#10061)
Clarified the index size note in
docs/guides/understand-your-setup-size/README.md to steer readers toward
the FAQ when indexdb feels oversized, noting typical ratios and
troubleshooting guidance.
2025-11-24 14:04:35 +02:00
dependabot[bot]
0abeb5a094 build(deps): bump actions/checkout from 5 to 6 (#10060)
Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to
6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/releases">actions/checkout's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>v6-beta by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2298">actions/checkout#2298</a></li>
<li>update readme/changelog for v6 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2311">actions/checkout#2311</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5.0.0...v6.0.0">https://github.com/actions/checkout/compare/v5.0.0...v6.0.0</a></p>
<h2>v6-beta</h2>
<h2>What's Changed</h2>
<p>Updated persist-credentials to store the credentials under
<code>$RUNNER_TEMP</code> instead of directly in the local git
config.</p>
<p>This requires a minimum Actions Runner version of <a
href="https://github.com/actions/runner/releases/tag/v2.329.0">v2.329.0</a>
to access the persisted credentials for <a
href="https://docs.github.com/en/actions/tutorials/use-containerized-services/create-a-docker-container-action">Docker
container action</a> scenarios.</p>
<h2>v5.0.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v5...v5.0.1">https://github.com/actions/checkout/compare/v5...v5.0.1</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/blob/main/CHANGELOG.md">actions/checkout's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<h2>V6.0.0</h2>
<ul>
<li>Persist creds to a separate file by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li>
<li>Update README to include Node.js 24 support details and requirements
by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li>
</ul>
<h2>V5.0.1</h2>
<ul>
<li>Port v6 cleanup to v5 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li>
</ul>
<h2>V5.0.0</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
</ul>
<h2>V4.3.1</h2>
<ul>
<li>Port v6 cleanup to v4 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li>
</ul>
<h2>V4.3.0</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
<li>Adjust positioning of user email note and permissions heading by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li>
<li>Update CODEOWNERS for actions by <a
href="https://github.com/TingluoHuang"><code>@​TingluoHuang</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li>
<li>Update package dependencies by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li>
</ul>
<h2>v4.2.2</h2>
<ul>
<li><code>url-helper.ts</code> now leverages well-known environment
variables by <a href="https://github.com/jww3"><code>@​jww3</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li>
<li>Expand unit test coverage for <code>isGhes</code> by <a
href="https://github.com/jww3"><code>@​jww3</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li>
</ul>
<h2>v4.2.1</h2>
<ul>
<li>Check out other refs/* by commit if provided, fall back to ref by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li>
</ul>
<h2>v4.2.0</h2>
<ul>
<li>Add Ref and Commit outputs by <a
href="https://github.com/lucacome"><code>@​lucacome</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1180">actions/checkout#1180</a></li>
<li>Dependency updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>- <a
href="https://redirect.github.com/actions/checkout/pull/1777">actions/checkout#1777</a>,
<a
href="https://redirect.github.com/actions/checkout/pull/1872">actions/checkout#1872</a></li>
</ul>
<h2>v4.1.7</h2>
<ul>
<li>Bump the minor-npm-dependencies group across 1 directory with 4
updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1739">actions/checkout#1739</a></li>
<li>Bump actions/checkout from 3 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1697">actions/checkout#1697</a></li>
<li>Check out other refs/* by commit by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1774">actions/checkout#1774</a></li>
<li>Pin actions/checkout's own workflows to a known, good, stable
version. by <a href="https://github.com/jww3"><code>@​jww3</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1776">actions/checkout#1776</a></li>
</ul>
<h2>v4.1.6</h2>
<ul>
<li>Check platform to set archive extension appropriately by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1732">actions/checkout#1732</a></li>
</ul>
<h2>v4.1.5</h2>
<ul>
<li>Update NPM dependencies by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1703">actions/checkout#1703</a></li>
<li>Bump github/codeql-action from 2 to 3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1694">actions/checkout#1694</a></li>
<li>Bump actions/setup-node from 1 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1696">actions/checkout#1696</a></li>
<li>Bump actions/upload-artifact from 2 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1695">actions/checkout#1695</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1af3b93b68"><code>1af3b93</code></a>
update readme/changelog for v6 (<a
href="https://redirect.github.com/actions/checkout/issues/2311">#2311</a>)</li>
<li><a
href="71cf2267d8"><code>71cf226</code></a>
v6-beta (<a
href="https://redirect.github.com/actions/checkout/issues/2298">#2298</a>)</li>
<li><a
href="069c695914"><code>069c695</code></a>
Persist creds to a separate file (<a
href="https://redirect.github.com/actions/checkout/issues/2286">#2286</a>)</li>
<li><a
href="ff7abcd0c3"><code>ff7abcd</code></a>
Update README to include Node.js 24 support details and requirements (<a
href="https://redirect.github.com/actions/checkout/issues/2248">#2248</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/checkout/compare/v5...v6">compare
view</a></li>
</ul>
</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 14:00:45 +02:00
Max Kotliar
69205bb757 docs: Describe relation between slow inserts and unsorted labels. 2025-11-24 13:38:22 +02:00
Max Kotliar
7853661e02 docs: sync flags in docs with actual binaries 2025-11-24 13:17:59 +02:00
Aliaksandr Valialkin
13fface17e docs/victoriametrics/vmalert.md: clarify that templates can be used inside rule labels
Rule labels can contain templates in the same way as annotations.
See aad6ab009e/app/vmalert/rule/alerting_test.go (L1192)
and https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#templating

Document this, since users sometimes ask this question.
2025-11-24 10:54:35 +01:00
Artem Fetishev
55f32a06b9 lib/storage: minor metricNameSearch fixes (#10065)
- Fix comment
- Re-use dst instead of introducing a new variable.

This change was requested to be moved into a separate PR during the
pt-index (#8134) code review.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 20:07:56 +01:00
Artem Fetishev
5e63fef0a7 lib/storage: also create parts.json on partition creation (#10051)
Currently, when a partition is created its corresponding parts.json file
is not created right away (see createNewParition()). Its creation is
delayed until the first part files are created on disk (see
swapSrcWithDstParts()). However, the parts.json file is created for a
possibly empty partition when an existing partition is opened (see
mustOpenPartition()) and when a partition snapshot is created (see
MustCreateSnapshotAt()).

I.e. `parts.json` is an important part of a partition, since it is an
artifact that describes the partition contents. It should be created
on pt creation even if its contents are empty.

To be honest, this change is mostly a no-op for the current storage
implementation. It only makes the code consistent, i.e. the parts.json
is created along with the partition.

However, having it created when a partition is created becomes important in
pt-index (#7599, #8134), because pt-index allows having partitions with no
data and therefore without a parts.json file. Still not a big deal, but the
unit tests start failing.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 14:23:00 +01:00
Artem Fetishev
bb3b60b0b4 lib/storage: refactoring - move dateMetricIDCache code to a separate file (#10055)
dateMetricIDCache does not belong to storage anymore since it has been
moved to indexDB. Instead of moving the cache to index_db.go, move it to a
separate file in order to navigate the code more easily.

No changes have been done to the code or tests.

Follow up for: #9983

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Alexander Frolov <9749087+fxrlv@users.noreply.github.com>
2025-11-21 13:54:25 +01:00
Artem Fetishev
d6ac587547 lib/storage: fix comments related to nextDayMetricIDs
Follow-up for 49b0a4fb16

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 13:34:26 +01:00
Artem Fetishev
95c9404bbe lib/storage: refactoring - simplify nextDayMetricIDs data structure (#10058)
The data structure used for holding the nextDayMetricIDs is too complex
and can be simplified (flattened).

Follow up for: #9983

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 13:09:15 +01:00
Artem Fetishev
3028bfa425 lib/storage: add overlapsWith() and contains() methods to TimeRange (#10059)
The change was introduced in pt-index PR (#8134) and is extracted into a
separate PR.

Currently used in partition_search and partition. If you see more places
like this, please let me know.
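
For readers unfamiliar with such helpers, here is a minimal sketch of what interval checks of this kind typically look like. It is not the code from the PR: the field names, the inclusive bounds, and the choice to have contains() accept a single timestamp are all assumptions made for illustration.

```go
package main

import "fmt"

// TimeRange is a closed interval [MinTimestamp, MaxTimestamp].
// The field names are assumptions made for this sketch.
type TimeRange struct {
	MinTimestamp int64
	MaxTimestamp int64
}

// overlapsWith reports whether tr and other share at least one timestamp.
func (tr TimeRange) overlapsWith(other TimeRange) bool {
	return tr.MinTimestamp <= other.MaxTimestamp && other.MinTimestamp <= tr.MaxTimestamp
}

// contains reports whether ts lies inside tr (both bounds inclusive).
// Taking a single timestamp is an assumption; the real method may differ.
func (tr TimeRange) contains(ts int64) bool {
	return tr.MinTimestamp <= ts && ts <= tr.MaxTimestamp
}

func main() {
	a := TimeRange{MinTimestamp: 10, MaxTimestamp: 20}
	b := TimeRange{MinTimestamp: 20, MaxTimestamp: 30}
	c := TimeRange{MinTimestamp: 21, MaxTimestamp: 30}
	fmt.Println(a.overlapsWith(b), a.overlapsWith(c))
	fmt.Println(a.contains(15), a.contains(21))
}
```

With inclusive bounds, two ranges that touch at a single timestamp count as overlapping.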

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-21 12:30:19 +01:00
Andrii Chubatiuk
fdb6f73d70 docs: add warning blockquote regarding latest backup lifecycle policy (#10054)
Update formatting for warning text.

<img width="732" height="432" alt="image"
src="https://github.com/user-attachments/assets/1549e69a-fc65-445f-b567-9b5e4e1a8617"
/>
2025-11-20 13:46:48 +04:00
Aliaksandr Valialkin
043cd80adb docs/victoriametrics/Articles.md: add https://medium.com/@kanakaraju896/backing-up-victoriametrics-data-a-complete-guide-24473c74450f 2025-11-20 08:37:07 +01:00
Aliaksandr Valialkin
6dbcff5252 docs/victoriametrics/Articles.md: add https://blackmetalz.github.io/why-i-switched-to-victoriametrics-scaling-from-small-business-to-enterprise.html 2025-11-20 08:35:24 +01:00
Andrii Chubatiuk
2d1519e37c app/vmalert: do not increment errors counter on cancel context
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10027
2025-11-19 13:37:54 +01:00
Nikolay
053419c9d4 lib/storage: properly increment missing tsids metric
Bug was introduced at 2380e4829d

Due to typo vm_missing_tsids_for_metric_id_total metric was incremented instead of vm_missing_metric_names_for_metric_id_total for missing metricName for metricID search.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10041
2025-11-19 13:37:53 +01:00
Hui Wang
03f94978be chore: clarify vmalert -external.label usage (#10042)
To clarify that HA vmalert doesn't need to specify `-external.label`.
2025-11-19 13:37:53 +01:00
Fred Navruzov
b9393ce4c2 docs/vmanomaly: release v1.28.0 (#10031)
### Describe Your Changes

Upgraded vmanomaly docs & guides to release v1.28.0 (UI v1.2.0)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-18 21:48:01 +02:00
Andrii Chubatiuk
39d7b0c3c9 docs/vmbackupmanager: mention version since which -backupTypeTagName flag is available (#10038)
Mention version since which `backupTypeTagName` flag is available
2025-11-18 18:57:08 +04:00
Andrii Chubatiuk
667ff2d7c1 app/vmbackupmanager: set backup type tag on backup's items
* app/vmbackupmanager: set VMBackupType tag on backup's items

* address review comments
2025-11-18 16:30:33 +04:00
Zakhar Bessarab
4682b78005 docs/cluster: remove mention of select for metadata (#10034)
vmselect does not have a flag to enable metadata querying, remove
invalid reference to it from the docs.
2025-11-18 15:32:56 +04:00
Artem Fetishev
db956a65f5 docs: update VictoriaMetrics components version to v1.130.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 22:03:50 +01:00
Artem Fetishev
203440a026 deployment/docker: update VM components version to v1.130.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 21:57:42 +01:00
Artem Fetishev
7f297ec705 docs: bump last LTS versions
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 20:14:31 +00:00
Artem Fetishev
5d2cb8f7e6 docs/CHANGELOG.md: update changelog with LTS release notes
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 20:09:24 +00:00
Artem Fetishev
e7d9e1dbc3 lib/workingsetcache: Fix bytesSize metric calculation (#10025)
Follow-up for 3e6fc445a9

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-17 14:08:44 +01:00
Artem Fetishev
4824786122 docs/CHANGELOG.md: cut v1.130.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-14 17:46:02 +00:00
Max Kotliar
20b0e6ddd0 docs: update latest version in docs to v1.130.0 2025-11-14 17:40:59 +00:00
Artem Fetishev
9cc706bfa9 make vmui-update
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-14 17:21:53 +00:00
Artem Fetishev
c41ead2cc3 lib/storage: Move dateMetricIDCache to indexDB (#9983)
Looks like the `dateMetricIDCache` must be per indexDB:

- the use of this cache and `is.hasDateMetricID()` often go in pairs. So
it makes
  sense to use this cache in that method.
- The same is true for `createPerDayIndexes()`: everytime the index
entry is
  created, a corresponding entry is added to the cache.
- As a result the generation field is also removed from the cache.

Related to #7599 and #8134.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-14 16:18:45 +01:00
Aliaksandr Valialkin
bda6628894 docs/victoriametrics: cross-link rebalancing section at VictoriaMetrics cluster docs and the corresponding question at the FAQ page 2025-11-14 15:37:11 +01:00
Aliaksandr Valialkin
90994adcda docs/victoriametrics/Cluster-VictoriaMetrics.md: add rebalancing chapter, which explains how to rebalance data among vmstorage nodes
This is a very frequent question from new users of VictoriaMetrics who migrate from other solutions
with automatic data rebalancing among storage nodes, so it is a good idea to cover it in the docs.
2025-11-14 15:33:23 +01:00
Max Kotliar
511a07fdc2 lib/storage/metricsmetadata: ensure deterministic sorting for identical metric names across tenants
Metrics metadata is loaded from a per-tenant storage map
(perTenantStorage map[uint64]map[string]*Row), so result rows order is
non-deterministic. The existing sortRows implementation only sorts by
metric name and ingestion time, which means rows that differ only by
tenant/account ID are still sorted non-deterministically.

This change updates `sortRows` to include account/project identifiers in
the comparison, ensuring stable and deterministic ordering for metadata
entries that share the same metric name and timestamp.
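
The idea can be sketched as a comparator with tie-breakers. This is only an illustration under assumptions: the `Row` struct below is a reduced, hypothetical version of the real one, and the exact comparison order used in the commit is not shown in this message.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is a simplified, hypothetical stand-in for a metadata row.
type Row struct {
	MetricName string
	Timestamp  int64
	AccountID  uint32
	ProjectID  uint32
}

// sortRows orders rows by metric name, then timestamp, then tenant identifiers,
// so rows that differ only by tenant always end up in the same order.
func sortRows(rows []Row) {
	sort.Slice(rows, func(i, j int) bool {
		a, b := rows[i], rows[j]
		if a.MetricName != b.MetricName {
			return a.MetricName < b.MetricName
		}
		if a.Timestamp != b.Timestamp {
			return a.Timestamp < b.Timestamp
		}
		if a.AccountID != b.AccountID {
			return a.AccountID < b.AccountID
		}
		return a.ProjectID < b.ProjectID
	})
}

func main() {
	rows := []Row{
		{MetricName: "m", Timestamp: 1, AccountID: 1, ProjectID: 1},
		{MetricName: "m", Timestamp: 1, AccountID: 0, ProjectID: 0},
	}
	sortRows(rows)
	fmt.Println(rows)
}
```

Because every field that can differ takes part in the comparison, the result no longer depends on map iteration order.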

First discovered as flaky test:

--- FAIL: TestStorageRead (0.00s)
    storage_test.go:337: unexpected rows get result (-want, +got):
          []*metricsmetadata.Row{
          	&{
          		... // 2 ignored and 1 identical fields
          		Help:      "uselesshelp1",
          		Unit:      "seconds1",
        - 		AccountID: 1,
        + 		AccountID: 0,
        - 		ProjectID: 1,
        + 		ProjectID: 0,
          		Type:      1,
          	},
          	&{
          		... // 2 ignored and 1 identical fields
          		Help:      "uselesshelp1",
          		Unit:      "seconds1",
        - 		AccountID: 0,
        + 		AccountID: 1,
        - 		ProjectID: 0,
        + 		ProjectID: 1,
          		Type:      1,
          	},
          }
FAIL

https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/actions/runs/19361594138/job/55394642029#step:4:133
2025-11-14 15:41:16 +02:00
Max Kotliar
e5bd9c4286 docs/changelog: Add links to changelog 2025-11-14 13:42:17 +02:00
Haley Wang
c2aa8a7885 lib/storage: add a value check for retentionFilter to ensure it does not exceed retentionPeriod 2025-11-14 12:51:08 +02:00
Max Kotliar
a343c1ea25 docs: Add metrics metadata how to use in docs
follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9487
2025-11-14 10:42:22 +01:00
f41gh7
0ba59757c9 apptest: add metrics metadata test for vmsingle
related issue github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
2025-11-14 10:35:32 +01:00
Zakhar Bessarab
d52d8ed92b app/{vmstorage,vmselect,vminsert}: introduce metrics metadata storage
This commit adds the storage part and cluster RPC methods for metrics metadata.

Key concepts:

* vmstorage persists metadata in-memory only.
* vmstorage evicts metadata records older than 1 hour.
* vmstorage stores only the last value of metadata for time series metric name.
* vminsert opens an additional TCP connection to the vmstorage for metrics metadata write requests.
* vmselect implements prometheus compatible HTTP API for reading metrics metadata
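
The in-memory, last-value-wins, time-limited storage described in the list above can be sketched roughly as follows. This is a simplified illustration, not the vmstorage implementation: the type names, the use of Unix seconds, and the explicit `evict` call are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// metadataEntry holds the last seen metadata for one metric name.
type metadataEntry struct {
	help     string
	lastSeen int64 // Unix seconds
}

// metadataStore is a hypothetical in-memory store keyed by metric name.
type metadataStore struct {
	mu      sync.Mutex
	entries map[string]metadataEntry
}

func newMetadataStore() *metadataStore {
	return &metadataStore{entries: make(map[string]metadataEntry)}
}

// put overwrites any previous entry, so only the last value is kept.
func (s *metadataStore) put(name, help string, now int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.entries[name] = metadataEntry{help: help, lastSeen: now}
}

// get returns the stored help text for the given metric name.
func (s *metadataStore) get(name string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.entries[name]
	return e.help, ok
}

// evict drops entries not updated within maxAgeSeconds and returns how many were dropped.
func (s *metadataStore) evict(now, maxAgeSeconds int64) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	dropped := 0
	for name, e := range s.entries {
		if now-e.lastSeen > maxAgeSeconds {
			delete(s.entries, name)
			dropped++
		}
	}
	return dropped
}

func main() {
	s := newMetadataStore()
	s.put("http_requests_total", "old help", 0)
	s.put("http_requests_total", "new help", 1800)
	s.put("stale_metric", "help", 0)
	fmt.Println(s.evict(3700, 3600))
	fmt.Println(s.get("http_requests_total"))
}
```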

This feature is optional and must be enabled via the `-enableMetadata` flag provided to vminsert/vmsingle.

Fixes github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
2025-11-14 10:12:15 +01:00
Aliaksandr Valialkin
9ab51f7921 docs/guides/understand-your-setup-size/README.md: remove the misleading recommendation for having at least 2vCPU cores per each vmstorage node
vmstorage nodes work perfectly with one CPU core and even with 10% of a single CPU core
if the allocated CPU resources match their workload.

It is better to recommend allocating an integer number of CPU cores to vmstorage
in order to achieve optimal performance, since vmstorage allocates internal resources
according to the available CPU cores. If there is a fractional number of CPU cores,
then the allocation of internal resources may be suboptimal.

A fractional number of CPU cores may also lead to increased latencies and stalls,
because some P threads at Go runtime won't be able to run goroutines from their ready queues
in a timely manner because of the lack of CPU time. See https://victoriametrics.com/blog/kubernetes-cpu-go-gomaxprocs/
2025-11-14 09:54:38 +01:00
Aliaksandr Valialkin
8e2a5a2641 docs/victoriametrics/vmagent.md: mention that it isn't recommended increasing the -maxConcurrentRequests command-line flag value in general case
Too big values for the -maxConcurrentRequests command-line flag increase memory usage
and increase CPU overhead for processing incoming requests in most cases.
The only valid reason for increasing the value for -maxConcurrentRequests command-line flag
is when many clients send data to vmagent over very slow network.
2025-11-14 09:54:37 +01:00
Hui Wang
21aad0a171 Improve vmalert UI tip (#9998) 2025-11-13 21:06:04 +01:00
Aliaksandr Valialkin
34253a96fe docs/victoriametrics: fix broken link to /api/v1/rules docs at Prometheus 2025-11-13 19:40:21 +01:00
Aliaksandr Valialkin
34066ffb3a docs/victoriametrics/README.md: add context links to the FAQ entry describing why IndexDB size may be too large 2025-11-13 19:37:01 +01:00
Nikolay
5db9d8fe04 lib/encoding/zstd: properly apply size limits
Previously, the zstd decoder didn't take into account the request size limits
applied by VictoriaMetrics components. In case of an incorrectly formed zstd block, a VictoriaMetrics
component could allocate extra memory, which could lead to OOM errors.

This commit makes ingest endpoints check the frame content size and window size headers against the max request size limits.
2025-11-13 18:13:27 +01:00
Hui Wang
b5da6cb97d vmalert: print the error message as value if templating fails in alerting rule
For users, if an alerting rule has a misconfigured annotation, it's more
important to deliver the alert when the rule triggers rather than skip
it with templating error logs.
Then users can see the faulty annotation in alert message and fix it.

Note: the previous behavior is retained in replay mode because errors
there should be noticed immediately; hiding them could waste time,
resources and require a re-replay after fixes.
Also the rule's status in the vmalert UI remains unhealthy if templating
failed.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9853
2025-11-13 17:36:52 +01:00
Hui Wang
fe149b0178 vmalert: drop labels with empty values in generated alerts and time series
In prometheus ecosystem, a label with an empty value equals no label,
since a query like `test{something=""}` matches all the series without
label `something`.
So for vmalert, preserving empty-value labels in generated alerts or
time series is unnecessary and can cause alert hash mismatches during
[restore](https://docs.victoriametrics.com/victoriametrics/vmalert/#alerts-state-on-restarts).
Empty-value labels shouldn't come from the datasource response, since datasources
follow the same rule (omit empty-value labels). They may come from
`-external.label` or rule labels, where the empty value can be caused by
occasional templating failures, which are hard to check for there.
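
A minimal sketch of the filtering step, assuming labels are held in a plain `map[string]string` (vmalert's real label representation may differ, and `dropEmptyLabels` is a hypothetical name):

```go
package main

import "fmt"

// dropEmptyLabels returns a copy of labels without entries whose value is empty.
// The input map is left untouched.
func dropEmptyLabels(labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels))
	for name, value := range labels {
		if value == "" {
			continue
		}
		out[name] = value
	}
	return out
}

func main() {
	labels := map[string]string{
		"alertname": "HighLatency",
		"team":      "", // e.g. the result of a failed template
	}
	fmt.Println(dropEmptyLabels(labels))
}
```

Filtering before hashing means that an alert with `team=""` and an alert without `team` produce the same label set.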

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
2025-11-13 17:36:52 +01:00
Hui Wang
55ea30f7da vmalert: fix a potential race condition in web api during rule hot reload
Group rules are not protected by
[m.groupsMu](03c784e3e3/app/vmalert/manager.go (L25)),
so they could be updated (by a config hot reload) during `/api/v1/rule`,
`/api/v1/alert` and `/api/v1/alerts` API calls. This fix takes a
snapshot by calling `group.ToAPI()` first, making all reads safe.
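
The snapshot pattern can be sketched as follows. This is a generic illustration, not vmalert's code: the `Group` and `Rule` types and the `Reload` method are hypothetical, and only the name `ToAPI` is taken from the message above.

```go
package main

import (
	"fmt"
	"sync"
)

// Rule is a hypothetical, simplified rule.
type Rule struct{ Name string }

// Group guards its rules with its own lock.
type Group struct {
	mu    sync.RWMutex
	rules []Rule
}

// ToAPI returns a copy of the rules taken under the lock,
// so callers can read it without racing with a reload.
func (g *Group) ToAPI() []Rule {
	g.mu.RLock()
	defer g.mu.RUnlock()
	return append([]Rule(nil), g.rules...)
}

// Reload overwrites the rules in place, as a config hot reload might.
func (g *Group) Reload(rules []Rule) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.rules = append(g.rules[:0], rules...)
}

func main() {
	g := &Group{}
	g.Reload([]Rule{{Name: "before"}})
	snapshot := g.ToAPI()
	g.Reload([]Rule{{Name: "after"}})
	fmt.Println(snapshot[0].Name, g.ToAPI()[0].Name)
}
```

The API handler then works only on the snapshot, which a concurrent reload cannot modify.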

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9551
2025-11-13 17:36:51 +01:00
JAYICE
a6716beee0 lib/httputil: fix failing to access http2 sd service by the shadow copy of http.DefaultTransport
Cloning `http.DefaultTransport` and disabling HTTP/2 without resetting
`TLSClientConfig.NextProtos` in the shadow copy of
`http.DefaultTransport` causes requests to HTTP/2 servers to fail.
See https://github.com/golang/go/issues/39302.

To reproduce it, use a scrape config like:
```
scrape_configs:
  - job_name: test
    yandexcloud_sd_configs:
      - service: compute
        api_endpoint: https://api.cloud.yandex.net
```
Before the fix, access to the SD service would fail.

A solution is to specify `http/1.1` in `TLSClientConfig.NextProtos`.
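
A minimal sketch of that approach using only the standard library is shown below. The helper name `newHTTP1Transport` is hypothetical, and the actual fix in the repository may be structured differently.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newHTTP1Transport clones http.DefaultTransport and restricts it to HTTP/1.1.
func newHTTP1Transport() *http.Transport {
	tr := http.DefaultTransport.(*http.Transport).Clone()

	// Disable HTTP/2 on the clone.
	tr.ForceAttemptHTTP2 = false
	tr.TLSNextProto = make(map[string]func(authority string, c *tls.Conn) http.RoundTripper)

	// The clone may carry "h2" in NextProtos; advertise only http/1.1 via ALPN,
	// so the server does not negotiate a protocol the client cannot speak.
	if tr.TLSClientConfig == nil {
		tr.TLSClientConfig = &tls.Config{}
	}
	tr.TLSClientConfig.NextProtos = []string{"http/1.1"}
	return tr
}

func main() {
	tr := newHTTP1Transport()
	fmt.Println(tr.TLSClientConfig.NextProtos, tr.ForceAttemptHTTP2)
}
```

`Clone()` copies the TLS config, so the changes do not leak into `http.DefaultTransport`.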

Related golang issue: https://github.com/golang/go/issues/39302

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9981
2025-11-13 17:36:51 +01:00
Andrii Chubatiuk
94d1214cef docs: update grafana plugin links, move root file to plugins repo (#10001)
### Describe Your Changes

update victorialogs grafana plugin links, moved root plugin file to
plugin repo

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-13 15:00:53 +02:00
Zhu Jiekun
7c1e578f04 docs: mention VictoriaTraces playground in doc (#9999)
### Describe Your Changes

Add VictoriaTraces playground in doc.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-13 12:51:17 +02:00
Andrii Chubatiuk
1aef4c61d4 docs: added perses section, move grafana datasource to integrations (#9994)
### Describe Your Changes

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9888

additionally adds grafana datasource into integrations section and
excludes previous location from menu and search

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-13 12:49:36 +02:00
Zhu Jiekun
036bcb07d9 VMUI/relabel debug: Allow labels textarea input without curly braces (#9950)
### Describe Your Changes

Fixed #9900

Relax the validation for the labels text area. It now accepts input
labels that are not enclosed in curly braces.

The following input format should be supported now:


```
	metric_name
	metric_name{label1="value1"}
	{__name__="metric_name", label1="value1"}
	__name__="metric_name", label1="value1"
	label1="value1"
```

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-11-13 12:42:20 +02:00
Aliaksandr Valialkin
425eff9489 vendor: update github.com/valyala/gozstd from v1.23.2 to v1.24.0
This is needed for being able to use the DecompressLimited() function for limiting
the size of decompressed data.

See https://github.com/valyala/gozstd/pull/75
Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/issues/958
2025-11-12 21:13:17 +01:00
Aliaksandr Valialkin
45d9b4a378 docs/victoriametrics/vmalert.md: remove "👉" char from the Common mistakes chapter, since this looks like AI-generated content
While at it, fix a typo `&step` -> `step`.

This is a follow-up for the commit 40ab285fb9

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9343
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9373
2025-11-12 18:51:32 +01:00
Max Kotliar
9938d49659 docs/changelog: move changelog line to tip. polish it a bit 2025-11-12 19:40:36 +02:00
Samarth Bagga
0631058f3d Add log which will report dropped log count (#9752)
### Describe Your Changes

I have added a counter for the throttled logs which gets logged every 1
minute.
Fixes #9498

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Hui Wang <haley@victoriametrics.com>
2025-11-12 19:33:09 +02:00
Max Kotliar
150d02e277 lib/envflag: apply -secret.flags inside envflag.Parse function (2nd attempt) (#9963)
### Describe Your Changes

The PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9942 was
reverted in
c90c7c3123
because of the import cycle in the enterprise VM. Needs more work.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-12 19:29:53 +02:00
Hui Wang
8976fca736 docs: fix flag type in descriptions (#9979)
Do not use backticks in command-line flag description, it pollutes the
flag type in descriptions.
2025-11-12 13:52:19 +02:00
Aliaksandr Valialkin
dbfa287dce docs/victoriametrics: add a case study from Spotify based on the https://www.youtube.com/watch?v=87koDlpKDR4 2025-11-12 10:49:57 +01:00
Aliaksandr Valialkin
63cd72ae80 docs/victoriametrics/FAQ.md: mention that disabling per-day index may reduce the growth rate of indexdb for static time series over time 2025-11-11 16:58:20 +01:00
Aliaksandr Valialkin
3ffc53947c docs/victoriametrics/FAQ.md: add Why IndexDB is so large? chapter, since this is quite frequent question from VictoriaMetrics users 2025-11-11 16:48:22 +01:00
Aliaksandr Valialkin
bbf24651f2 docs/victoriametrics/FAQ.md: add trailing slashes to links to posts about VictoriaMetrics components
Trailing slashes are needed to make the URLs canonical and avoid redirects.

This is a follow-up for d4aefcecc4
2025-11-11 16:48:22 +01:00
Aliaksandr Valialkin
afea8a4380 deployment: update Go builder from v1.25.3 to v1.25.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.25.4%20label%3ACherryPickApproved
2025-11-11 12:15:46 +01:00
Aliaksandr Valialkin
3453acc783 docs/victoriametrics/stream-aggregation: fix broken links to https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config ( was https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-config )
This is a follow-up after the commit f385e36b96
2025-11-11 02:01:33 +01:00
Aliaksandr Valialkin
a529b5e74d lib/workingsetcache: prevent from duplicate misleading log messages when reading the cache from file
While at it, improve logging when reading workingsetcache from file and saving it to file.
This should simplify troubleshooting various issues related to the workingsetcache.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9750
2025-11-11 00:04:01 +01:00
Aliaksandr Valialkin
c3549e429f lib/workingsetcache: properly update cache stats
This is a follow-up for the commit 1130adebad .

The EntriesCount, BytesSize and MaxBytesSize metrics must take into account the data
stored in both prev and curr caches, since this data occupies memory and it is expected
that the exposed metrics - vm_cache_entries, vm_cache_size_bytes and vm_cache_size_max_bytes -
take into account all the memory occupied by the corresponding caches.

The GetCalls, SetCalls, Collisions and Corruptions metrics must take into account stats
from the curr cache only, since the corresponding stats for the prev cache is already taken
during the rotation (when moving curr to prev and resetting the previous prev).

The Misses metric must take into account only misses in the prev cache, since these misses
mean that the given entry is missing in both the curr and the prev cache.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9553
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9715
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9657

While at it, make sure that the cache mode and cache stats is always read and updated under c.mu lock.
This may help resolving races similar to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9921
2025-11-11 00:04:01 +01:00
Aliaksandr Valialkin
3998cc45e7 Revert "lib/workingsetcache: properly count workingsetcache metrics "
This reverts commit 89fd27c922.

Reason for revert: this commit adds scalability bottleneck in the fast path - Cache.Get() -
in the form of c.getCalls.Add(). This call doesn't scale on systems with big number of CPU cores,
since it needs to update atomically a shared memory from big number of CPU cores.

The Cache.Get() is called per every ingested sample when obtaining TSID by MetricName from the cache
at lib/storage.Storage.get(), so this can be a major bottleneck on systems with many CPU cores.

The solution for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9553
is to properly track cache requests and misses: cache requests must be taken into account
only at the curr cache, while cache misses must be taken into account only at the prev cache.
This will be implemented in the follow-up commit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9657
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9715
2025-11-11 00:04:00 +01:00
Aliaksandr Valialkin
a0b8d04713 Revert "lib/storage: Introduce vm_cache_eviction_bytes_total metric"
This reverts commit 994dadb4d5.

Reason for revert: the introduced metrics have zero practical applicability.

The lib/workingsetcache doesn't need manual tuning in most cases - its size
is automatically adjusted to the given working set, if the working set is smaller
than the cache size limit set at the cache creation time. The limit just prevents
unbounded cache growth for large working sets.

If the working set exceeds the given limit, then the cache may become inefficient
because of the increased cache miss rate. The introduced metrics do not help determining
the needed cache size.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9293
2025-11-10 16:55:09 +01:00
Aliaksandr Valialkin
3995494d58 go.mod: update github.com/VictoriaMetrics/fastcache from v1.13.1 to v1.13.2
This is needed for removing the EvictedBytes metric from the fastcache.

See the description of f6080737bb for details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9293
Updates https://github.com/VictoriaMetrics/fastcache/pull/93
2025-11-10 16:44:55 +01:00
Aliaksandr Valialkin
4aaeaf5943 lib/workingsetcache: replace Cache.Save() with Cache.MustSave()
If the cache cannot be saved to the given file, this is a fatal error.
It is better to log this fatal error inside Cache.MustSave() and then exit
instead of returning it to the caller. This makes the code more clear at the caller side.
2025-11-10 16:44:55 +01:00
Aliaksandr Valialkin
e4e22aefdd lib/workingsetcache: improve log messages for various expected cases when reading the cache from files
The improved log messages should help users understand the logged cases
without asking VictoriaMetrics developers about them.
2025-11-10 14:55:34 +01:00
Aliaksandr Valialkin
ebb5ccbfcf deployment/docker/rules/alerts-health.yml: clarify the description of the TooManyTSIDMisses alert after the commit 30641b201b
It is expected that the number of TSID misses over the last 5 minutes is zero in steady state.
If it is non-zero, then something is wrong. That's why it is better to use the increase() function
instead of rate() for this alert.
2025-11-10 14:37:36 +01:00
Aliaksandr Valialkin
a22a36dee2 lib/storage: consistently rename searchMetricNameWithCache() to SearchTSIDs() across comments after the commit 90d23d7c9f
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9765
2025-11-10 14:31:30 +01:00
Aliaksandr Valialkin
d7e4d8aa7a deployment/docker/rules/alerts-health.yml: clarify the description for the TooManyTSIDMisses alert
This alert is expected after unclean shutdown (OOM, power off, kill -9) of VictoriaMetrics.
It should go away in a few minutes after the restart while VictoriaMetrics deletes metricIDs
for the missing MetricID->TSID entries which were created for the newly registered time series
just before unclean shutdown. It is OK to delete such metricIDs, since the corresponding time series
will be re-registered again. See the commit 20812008a7 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502
2025-11-10 14:31:30 +01:00
Aliaksandr Valialkin
ee39766626 lib/workingsetcache: properly initialize new cache when the stored cache has unexpected size
This is a follow-up for 9bc541587b
2025-11-10 12:49:27 +01:00
Artem Fetishev
e14432e347 lib/storage: Fix data race in containsTimeRange() (#9965)
When one goroutine attempts to update the min timestamp under the lock, it
could have been updated already by another goroutine with a smaller
timestamp. As a result, the goroutine will overwrite the timestamp with a
bigger value.

A simple unit test (included in this commit) demonstrates that.

Additionally, use a simple Mutex instead of RWMutex. RWMutexes only
introduce an unnecessary overhead for operations as simple as retrieving
a value from a map and regular Mutex should be preferred.

Thanks to @valyala for spotting a bug and the advice on RWMutexes.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-11-07 15:33:01 +01:00
Yury Molodov
57b85de212 app/vmui: improve chart performance and fix median calculation
This commit improves overall performance and stability of chart rendering,
refines time series generation, and fixes incorrect median calculation
in metric series.
JavaScript execution time improved by up to ×6 on large datasets.

**Changes:**

* Reworked `getTimeSeries` - one point per pixel.
* Added legend auto-collapse when >20 items.
* Switched median algorithm to Quickselect (Floyd–Rivest).
* Unified array stats functions (`min`, `max`, `avg`, `median`) into a
single pass.
* Removed unused `last` value from series.
* Renamed `roundToMilliseconds` to `roundToThousandths` and moved to
`utils/math`.
* Replaced `isSupportedDuration` with `parseSupportedDuration`, added
fractional duration support.

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9699
Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9926
2025-11-07 16:28:25 +03:00
Zhu Jiekun
45b4d7c476 lib/promscrape: prevent early exit when one of multiple service discovery configs fails
When multiple service discovery configs of the same type exist (e.g.,
`hetzner_sd_config`), vmagent currently behaves as follows:
1. Attempts to request each config.
2. Exits immediately if any config returns an error.
3. Skips the remaining configs and falls back to the previous service
discovery result.

The correct behavior—more compatible with Prometheus—should be:
1. Attempt to request each config.
2. Collect all valid results.
3. Use the valid results if there's at least one; otherwise (all
failed), fall back to the previous SD result.
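
The corrected flow can be sketched as below. This is an illustration under assumptions, not the `getScrapeWorkGeneric` code: the `discoverFunc` type and the helper names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// discoverFunc is a hypothetical stand-in for querying one SD config.
type discoverFunc func() ([]string, error)

// staticConfig returns a config that always discovers the given targets.
func staticConfig(targets ...string) discoverFunc {
	return func() ([]string, error) { return targets, nil }
}

// failingConfig returns a config that always fails with the given message.
func failingConfig(msg string) discoverFunc {
	return func() ([]string, error) { return nil, errors.New(msg) }
}

// collectTargets queries every config, keeps the results of those that succeed,
// and falls back to prev only when all configs fail.
func collectTargets(configs []discoverFunc, prev []string) []string {
	var targets []string
	succeeded := 0
	for _, discover := range configs {
		ts, err := discover()
		if err != nil {
			fmt.Println("skipping failed config:", err)
			continue
		}
		succeeded++
		targets = append(targets, ts...)
	}
	if succeeded == 0 {
		return prev
	}
	return targets
}

func main() {
	configs := []discoverFunc{
		staticConfig("10.0.0.1:9100"),
		failingConfig("invalid credentials"),
	}
	fmt.Println(collectTargets(configs, []string{"previous:9100"}))
}
```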

Scrape example:

```yaml
scrape_configs:
  - job_name: hetzner-default
    hetzner_sd_configs:
      - role: "hcloud"
        authorization:
          credentials: "some_valid_value"
      - role: "hcloud"
        authorization:
          credentials: "some_wrong_value"
```

Expected outcome: 
- At least targets from `credentials: "some_valid_value"` should appear
in the service discovery result.

Current outcome:
- the error from `credentials: "some_wrong_value"` leads to an **empty**
result.

This issue affects all service discovery types that use the
`getScrapeWorkGeneric` function:

- `azure_sd_config`
- `consul_sd_config`
- `consulagent_sd_config`
- `digitalocean_sd_config`
- `dns_sd_config`
- `docker_sd_config`
- `dockerswarm_sd_config`
- `ec2_sd_config`
- `eureka_sd_config`
- `gce_sd_config`
- `hetzner_sd_config`
- `http_sd_config`
- `kuma_sd_config`
- `marathon_sd_config`
- `nomad_sd_config`
- `openstack_sd_config`
- `ovhcloud_sd_config`
- `puppetdb_sd_config`
- `vultr_sd_config`
- `yandexcloud_sd_config`

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9375
2025-11-07 16:28:25 +03:00
Yury Molodov
a58518d1cc app/vmui: fix points display; add option to show all points
* Fix rendering of isolated points at gaps.
* Add toggle to always show all points (even when connected by a line).

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9666
2025-11-07 16:28:24 +03:00
Zhu Jiekun
e54041af69 chore: unify the usage of consistenthash pkg
1. Initially, the consistenthash package/functions existed only in
`vminsert` in the cluster branch, where `vminsert` uses consistent hashing
to shard data across `vmstorage` nodes.
2. vmagent has used consistent hashing since
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8597, and the
related functions were ported to `lib/consistenthash` in the `master` branch.
3. After syncing commits from `master` to the `cluster` branch, there are 2
identical packages/functions in the `cluster` branch.

What's done in this pull request (to `cluster` branch only):
- remove `vminsert/netstorage/consistent_hash.go`, and use shared pkg
under `lib/consistenthash`.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9970
2025-11-07 16:05:02 +03:00
Yury Molodov
43f41062d5 app/vmui: fix incorrect median value calculation in series (#9926)
Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
2025-11-07 13:47:37 +02:00
Clément Nussbaumer
c0af0a41be lib/promscrape/kubernetes: add namespace metadata discovery
Permit attaching namespace metadata to pods, services, ingresses,
endpoints and endpointslices for Kubernetes service discovery.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7486
2025-11-07 11:50:25 +03:00
Fred Navruzov
55ced57ee3 docs/vmanomaly: patch release v1.27.1 (#9964)
### Describe Your Changes

Patch release doc updates (v1.27.1)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-06 14:17:45 +02:00
Aliaksandr Valialkin
1a8b71b6f2 docs/victoriametrics/Articles.md: add https://www.tigrisdata.com/blog/billing-prometheus/ 2025-11-06 11:44:26 +01:00
Aliaksandr Valialkin
10d35e232e lib/logstorage: verify that nobody holds references to parts when closing the partition
This is needed in order to detect and prevent cases of improper usage of partitions
while they are closed.

This is a follow-up for the commit 9725ee50ec .
2025-11-06 11:39:55 +01:00
Max Kotliar
7be6c54d92 docs/guides: use canonical link 2025-11-05 21:09:34 +02:00
Max Kotliar
c83e342a91 Revert "lib/envflag: apply -secret.flags inside envflag.Parse function (#9942)"
This reverts commit 1b11031ec8.

There is an import cycle because of the change in enterprise version of VM
2025-11-05 21:02:47 +02:00
Max Kotliar
768f0f484d lib/envflag: apply -secret.flags inside envflag.Parse function (#9942)
### Describe Your Changes

Follow up on PR:
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9839, which
addresses review comment

https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9839#discussion_r2477729886

Alex: 
```
this design decision isn't good, since it will lead to potential security issues over time when we'll forget adding ApplySecretFlags() call after the flag.Parse() call or add it at the wrong place. BTW, we do not call flag.Parse() explicitly - instead envflag.Parse() is called. So it is natural to call ApplySecretFlags() inside this call. Are there restrictions which prevent from doing this? If there are no restrictions, then there is no need in making this function public - it will be called explicitly inside envflag.Parse().
```

There is no changelog entry as there is no change in user-visible
behavior.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-05 20:54:13 +02:00
Max Kotliar
c377c6aa33 docs: clarify why we advise 50% free RAM. Add link to discussion (#9943)
### Describe Your Changes

Based on answer
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9895#issuecomment-3442491150

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-05 20:51:39 +02:00
Max Kotliar
69c8e78ba9 docs: fix links in VictoriaMetrics topologies guide 2025-11-05 20:49:40 +02:00
Aliaksandr Valialkin
57e9e9105f lib/mergeset: verify that Table parts are no longer used at Table.MustClose()
This should catch possible errors related to improper release of Table parts.
Fix such an error at TestTableCreateSnapshotAt by properly closing all the initialized
TableSearch instances.

Thanks to @rtm0 for pointing to this issue.
2025-11-05 13:26:30 +01:00
f41gh7
858adaed32 docs: mention latest v1.129.1 release 2025-11-04 18:07:47 +03:00
f41gh7
db4cbf128d docs: mention latest LTS releases
v1.110.23 and v1.122.8
2025-11-04 18:07:31 +03:00
f41gh7
6661154b41 CHANGELOG.md: cut v1.129.1 release 2025-11-04 13:15:46 +03:00
Nikolay
f521975cdd lib: properly apply snappy Decode limits
Previously, the snappy decoder didn't take into account the request size limits
applied by VictoriaMetrics components. In case of an incorrectly formed snappy block,
a VictoriaMetrics component could allocate extra memory, which could lead to OOM errors.

This commit makes ingest endpoints check the block size header against the MaxRequest limits.
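
A minimal sketch of such a check, assuming the raw snappy block format, which
starts with the uncompressed length encoded as a varint. The function name
`checkDecodedLen` and the limit values are hypothetical and are not taken from
the VictoriaMetrics code:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// checkDecodedLen reads the uncompressed-length header of a snappy block
// and rejects the block before any decode buffer is allocated.
func checkDecodedLen(block []byte, maxSize uint64) error {
	n, read := binary.Uvarint(block)
	if read <= 0 {
		return errors.New("malformed snappy block header")
	}
	if n > maxSize {
		return fmt.Errorf("decoded size %d exceeds the limit %d", n, maxSize)
	}
	return nil
}

func main() {
	// Build a header that claims a 1 GiB decoded size.
	var hdr [binary.MaxVarintLen64]byte
	k := binary.PutUvarint(hdr[:], 1<<30)
	fmt.Println(checkDecodedLen(hdr[:k], 32<<20)) // rejected
	fmt.Println(checkDecodedLen(hdr[:k], 2<<30))  // accepted
}
```

Checking the header first means a block that advertises a huge decoded size is
rejected without allocating memory for it.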
2025-11-04 13:04:30 +03:00
Fred Navruzov
4bdcb52a01 docs/vmanomaly: release v1.27.0 (#9954)
### Describe Your Changes

Docs update to follow vmanomaly's release v1.27.0, including:
- UI page update (changelogs, auth, new screenshots)
- Migration guide addition
- Cross-references of the above and version bumps

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-11-04 10:22:46 +04:00
Zakhar Bessarab
dc07b29c63 docs: update VM version to v1.129.0
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-11-03 16:56:33 +04:00
Zakhar Bessarab
e5dfb5d2bd deployment/docker: update VM version to v1.129.0
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-11-03 16:56:33 +04:00
f41gh7
5f21f7fd81 lib/clusternative: properly return io.EOF error
Follow-up for 8c2ea6eec7048dd7aed112d89d88fed89c4db9b3

Previously, the Parse function returned an empty error when the
connection was closed with io.EOF. It must return the error instead,
so that the server handles it and closes the connection.
2025-11-01 15:08:32 +03:00
Artem Fetishev
315c4861bd lib/storage: extract storage file names into constants (#9944)
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-31 20:34:03 +01:00
Zakhar Bessarab
06b2b5b706 docs/changelog: backport LTS changelog
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 22:42:58 +04:00
Zakhar Bessarab
24a8654181 docs/changelog: cut v1.129.0
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 20:04:18 +04:00
Zakhar Bessarab
5492173694 docs: update version tooltips
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 19:59:41 +04:00
Zakhar Bessarab
5dec778304 app/vmselect: run make vmui-update
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 19:47:27 +04:00
Aliaksandr Valialkin
8175319d4e lib/promauth/config.go: typo fix: It -> If
The typo is spotted in https://github.com/VictoriaMetrics/VictoriaLogs/pull/765/files#r2434359693
2025-10-31 19:31:37 +04:00
f41gh7
21907d0bc4 lib/vminsertapi: remote metrics metadata RPC
The metrics metadata feature must also include an RPC implementation. The new
RPC server added a stub for it, but the stub would work incorrectly if used
between different releases of vminsert and vmstorage.
2025-10-31 18:14:56 +03:00
Zakhar Bessarab
bcd15e6898 make: add missing cluster targets for CI
Follow-up for 16790389

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 15:27:20 +04:00
Zakhar Bessarab
1679038945 make: include s390x binaries into release artifacts (#9941)
Previously, it was possible to build binaries with make targets but
those builds were not included in the release artifact. Update release
targets to include s390x artifacts in release artifacts.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9697

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-31 15:25:19 +04:00
hagen1778
c696b40b1d docs: re-qualify load-balancing optimization to feature
Motivation: the change updates load-balancing logic, enhancing it rather than fixing
a critical bug. Such enhancement should not be ported to LTS versions.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9712

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 18268c3d13)
2025-10-31 09:52:44 +01:00
hagen1778
6ddaadc7b3 docs: add changelog for vmalert UI fix
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9892
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9909
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit bfb49c55af)
2025-10-31 09:36:32 +01:00
Kirill Yurkov
a663bd2b9d docs: address review comments from PR #9919 (#9940)
- Standardize all sections to use 'Recommended for:' instead of mixed
'For whom:' and 'Target audience:'
- Fix wording: 'Query evaluation is always local'

Addresses comments in #9919

(cherry picked from commit bd7fed9b41)
2025-10-31 09:36:32 +01:00
Roman Khavronenko
06627a7e6b app/vmalert: limit delayBeforeStart up to 5min (#9930)
vmalert tries to spread the moment each group starts its evaluation
over the `[0..group.interval]` duration. This approach avoids the
thundering herd problem, where all groups execute their rules
simultaneously on vmalert start. It was introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/724

While it works great for most configs, for groups with big evaluation
intervals (30min, 60min) the first evaluation can be delayed
significantly.
This change introduces a start delay limit via the new flag
`--group.maxStartDelay` (5m default).
It limits the `[0..group.interval]` start delay to
`[0..math.min(--group.maxStartDelay, group.interval)]`.
So all groups will start within the first 5m or earlier.

The `--group.maxStartDelay` flag is ignored if the user sets `eval_offset`.

The 5m default limitation was picked high to not affect users with
relatively low evaluation intervals.
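
The capping rule can be sketched as follows. The commit message does not say
how vmalert picks a point inside the range, so this sketch assumes a numeric
group ID reduced modulo the range. `startDelay` and its parameters are
hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// startDelay maps a group ID onto [0..min(maxStartDelay, interval)).
// Using the ID modulo the range is an assumption made for illustration.
func startDelay(groupID uint64, interval, maxStartDelay time.Duration) time.Duration {
	limit := interval
	if maxStartDelay < limit {
		limit = maxStartDelay
	}
	if limit <= 0 {
		return 0
	}
	return time.Duration(groupID % uint64(limit))
}

func main() {
	id := uint64(7 * time.Minute) // arbitrary sample ID
	// A 1h interval is capped by the 5m limit.
	fmt.Println(startDelay(id, time.Hour, 5*time.Minute))
	// A 30s interval is shorter than the limit, so the interval applies.
	fmt.Println(startDelay(id, 30*time.Second, 5*time.Minute))
}
```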

-----------

Based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9929

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit a85c5830c1)
2025-10-31 09:36:32 +01:00
Andrii Chubatiuk
0e94b3ee04 vmui: wrap annotations in alerting (#9909)
similar to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9892
but for alerting tab in vmui

(cherry picked from commit 009ddb9ce1)
2025-10-31 09:36:32 +01:00
Zakhar Bessarab
631501384b app/vmbackupmanager: enforce newline at the end of CLI result (#956)
* app/vmbackupmanager: enforce newline at the end of CLI result

Previously, vmbackupmanager only printed a response from API which did not include a newline character. That leads to issues with the rendering of the next command when using a shell.

Always append a newline character to avoid breaking shell formatting when using CLI mode.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update docs/victoriametrics/changelog/CHANGELOG.md

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-10-31 11:53:37 +04:00
Andrii Chubatiuk
aa2e679479 app/vmbackupmanager: create vm_backup_last_created_at metric for latest backup (#954) 2025-10-31 11:53:37 +04:00
Aliaksandr Valialkin
c76731bf88 docs/victoriametrics/Articles.md: add a link to https://medium.com/@vijayrauniyar1818/how-we-eliminated-10k-year-in-aws-cross-zone-data-transfer-costs-with-zone-aware-kubernetes-09fff0c2435b 2025-10-30 17:24:38 +01:00
Aliaksandr Valialkin
891915b9ff app/vmauth: make load distribution more even among backends which execute queries with varying durations
The load distribution could be uneven when short queries arrive at vmauth while some of the backends are busy
with long-running queries. In this case the major load goes to the backend that follows a run of busy backends.

Suppose we have four backends - b1, b2, b3 and b4. The first two backends are busy with a bigger number
of long-running queries than b3 and b4. Then 75% of short queries will go to b3, while only 25%
of short queries will go to b4.

The new algorithm makes the distribution more even in these cases by storing the next backend
after the chosen backend as the candidate for the next query (its index is stored in the atomicCounter).
Avoid races when updating atomicCounter from concurrently executed queries by using CompareAndSwap() -
if the concurrent query updated it first then the current query won't overwrite it with the outdated value.
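
A rough sketch of the idea, not the vmauth implementation. It assumes that a
backend is chosen by scanning for the lowest number of in-flight requests
starting from the stored candidate. The `balancer` and `backend` types and
their methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// backend tracks the number of requests currently executed by one backend.
type backend struct {
	name     string
	inflight atomic.Int32
}

// balancer stores the index of the candidate for the next query in next.
type balancer struct {
	backends []*backend
	next     atomic.Uint32
}

func newBalancer(names ...string) *balancer {
	lb := &balancer{}
	for _, name := range names {
		lb.backends = append(lb.backends, &backend{name: name})
	}
	return lb
}

// pick scans from the stored candidate, takes the least loaded backend
// and stores the backend after it as the candidate for the next query.
func (lb *balancer) pick() *backend {
	n := uint32(len(lb.backends))
	if n == 0 {
		return nil
	}
	start := lb.next.Load()
	bestIdx := start % n
	best := lb.backends[bestIdx]
	for i := uint32(1); i < n; i++ {
		idx := (start + i) % n
		if b := lb.backends[idx]; b.inflight.Load() < best.inflight.Load() {
			best, bestIdx = b, idx
		}
	}
	// If a concurrent query already moved the counter, keep its value
	// instead of overwriting it with an outdated one.
	lb.next.CompareAndSwap(start, (bestIdx+1)%n)
	best.inflight.Add(1)
	return best
}

// done marks a request to b as finished.
func (lb *balancer) done(b *backend) {
	b.inflight.Add(-1)
}

func main() {
	lb := newBalancer("b1", "b2", "b3", "b4")
	lb.backends[0].inflight.Store(5) // b1 and b2 are busy with long queries
	lb.backends[1].inflight.Store(5)
	for i := 0; i < 4; i++ {
		b := lb.pick()
		fmt.Println(b.name)
		lb.done(b) // the short query finishes immediately
	}
}
```

In this sketch short queries alternate between b3 and b4 instead of landing
mostly on b3.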

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9712
2025-10-30 14:38:37 +01:00
Max Kotliar
3f5fae33eb docs/changelog: fix pr link in v1.122.5 tag 2025-10-30 15:08:07 +02:00
Artem Fetishev
f59ecb24eb lib/storage: Add data ingestion benchmarks for various data patterns
Data patterns considered:

- Same series, same date
- Same series, different dates
- Different series, same date
- Different series, different dates

To make sure that the pattern condition holds, a new storage instance is
started every benchmark iteration.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9912
2025-10-30 14:40:37 +03:00
Zakhar Bessarab
84307ed678 app/vmctl/remote-read: allow providing multiple label filters
Previously, vmctl only accepted one label for filtering. Extend this to
allow providing multiple filters at once. This is useful when migrating
large volumes of data, as it allows narrowing down the scope of a single
migration run so that the source side is not overwhelmed.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9917/
2025-10-30 14:40:37 +03:00
Roman Khavronenko
c7d2568cb4 app/vmalert: properly show last evaluation with 0 value
Previously, rules that had not been evaluated yet showed weird values in
vmalert's UI. This happened because the
`time.Since(r.LastEvaluation).Seconds()` expression was evaluated when
`r.LastEvaluation` had a zero value.

With this change, rules that weren't evaluated yet would show `Never` in
Updated column instead.
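
A small sketch of the guard, assuming a hypothetical `rule` type with a
`LastEvaluation` field. The output format is illustrative and is not what the
vmalert UI prints. For a zero `time.Time`, `time.Since` yields a very large
duration, which is why the zero value is checked first.

```go
package main

import (
	"fmt"
	"time"
)

// rule is a hypothetical stand-in for a vmalert rule.
type rule struct {
	LastEvaluation time.Time
}

// markEvaluated records that the rule has just been evaluated.
func (r *rule) markEvaluated() {
	r.LastEvaluation = time.Now()
}

// updated returns the text for the Updated column.
func (r *rule) updated() string {
	if r.LastEvaluation.IsZero() {
		return "Never"
	}
	return fmt.Sprintf("%.3fs ago", time.Since(r.LastEvaluation).Seconds())
}

func main() {
	r := &rule{}
	fmt.Println(r.updated())
	r.markEvaluated()
	fmt.Println(r.updated())
}
```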

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9924
2025-10-30 14:40:37 +03:00
Zhu Jiekun
e4422e14eb lib/httpserver: revert HTTP/2 support
This commit reverts commit
d6bbfaf164 for the following reasons:

1. HTTP/2 carries security risks.
2. Most components in the VictoriaMetrics stack do not require HTTP/2
support.
3. While HTTP/2 support was available only as an option in the previous
commit, there remains a potential risk of misusing this option and
enabling HTTP/2 inadvertently.

For components (e.g., VictoriaTraces) that require HTTP/2 support, they
should currently build an HTTP server manually with built-in packages,
instead of using `lib/httpserver` in VictoriaMetrics. If the mentioned
issue is resolved in the future and more components need HTTP/2, this
support can be reintroduced into `lib/httpserver`.
 
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9927
2025-10-30 14:40:36 +03:00
Aliaksandr Valialkin
44d33c1570 lib/fs/fsutil: set the default value for -fs.maxConcurrency depending on the number of available CPU cores
This should reduce the need to tune this flag on systems with different numbers of CPU cores.
16 concurrent file operations per CPU should give quite low Go scheduling latency (~10ms)
according to https://github.com/VictoriaMetrics/VictoriaLogs/issues/774#issuecomment-3456814064
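
A sketch of a CPU-scaled concurrency limit, assuming the default is derived
as 16 operations per CPU core as the message suggests. The names
`defaultMaxConcurrency` and `limiter` are hypothetical and the real code in
lib/fs/fsutil may differ.

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultMaxConcurrency assumes 16 concurrent file operations per CPU core.
func defaultMaxConcurrency(cpus int) int {
	return 16 * cpus
}

// limiter bounds the number of concurrently running file operations.
type limiter struct {
	ch chan struct{}
}

func newLimiter(n int) *limiter {
	return &limiter{ch: make(chan struct{}, n)}
}

// do runs f once a slot is free and releases the slot afterwards.
func (l *limiter) do(f func()) {
	l.ch <- struct{}{}
	defer func() { <-l.ch }()
	f()
}

func main() {
	n := defaultMaxConcurrency(runtime.NumCPU())
	l := newLimiter(n)
	l.do(func() { fmt.Println("file operation under limit", n) })
}
```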

This is a follow-up for the commit 8a9a40dbdd
2025-10-30 12:02:10 +01:00
Fred Navruzov
59649fa260 docs/vmanomaly: release v1.26.2 (#9841)
### Describe Your Changes

update the docs to forthcoming v1.26.2 patch release
⚠️ please do not merge upon approval before respective image tags
are live

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-30 11:49:23 +01:00
Fred Navruzov
4a7537d1ae docs/vmanomaly: ui page formatting fixes (#9840)
### Describe Your Changes

ui page formatting fixes

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-30 11:48:26 +01:00
Fred Navruzov
4950c66488 docs/vmanomaly: release v1.26.1 (#9833)
### Describe Your Changes

release v1.26.1 docs updates

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-30 11:48:18 +01:00
Aliaksandr Valialkin
bcdbafdf96 lib/fs/fsutil: add -fs.maxConcurrency command-line flag for tuning concurrent operations with files
This flag helps balance Go scheduling latency on systems with a small number of CPU cores
against data ingestion performance on systems with high-latency storage such as NFS or Ceph.

Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/774
Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/517
2025-10-30 11:47:14 +01:00
hagen1778
cebc7e1b3b docs: rm extra line
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit fde4b4013a)
2025-10-30 11:05:23 +01:00
hagen1778
1820271305 docs: order recent changes by components
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit bf69b0d686)
2025-10-30 11:05:23 +01:00
hagen1778
f0d62ae548 dashboards: run make dashboards-sync
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit a866474918)
2025-10-30 11:05:23 +01:00
hagen1778
3f6897fd73 docs: mention PR author for dashboard change
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 22f6cb6339)
2025-10-30 11:05:22 +01:00
Samarth Bagga
9271739fec dashboards: enable search for non default flags panel (#9928)
### Describe Your Changes

Added search for non-default flags by editing the Grafana configs.
Resolves #9910

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 0d7b7649bf)
2025-10-30 11:05:22 +01:00
nemobis
1c2286bbef docs: Update RELEX figures (#9931)
### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit 74611ce6f2)
2025-10-30 11:05:22 +01:00
Roman Khavronenko
a0354c6c27 app/vmalert: simplify delayBeforeStart func (#9929)
It is a cosmetic change: it simplifies function signature by making it a
method of the Group struct.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit a5dd0324a9)
2025-10-30 11:05:22 +01:00
hagen1778
d0666876d2 docs: fix typo after 3e0aa46
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 45c0d40127)
2025-10-30 11:05:21 +01:00
Andrii Chubatiuk
32ed45b672 app/vmalert: use search expression to match group and file names (#9920)
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9886

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit fc978c95af)
2025-10-30 10:02:15 +01:00
Kirill Yurkov
84196c30c2 docs: add VictoriaMetrics architectures guide from startups to hypers… (#9919)
The new guide section about architecture from scratch to hyperscale!

(cherry picked from commit 8e99efe0fa)
2025-10-30 10:02:15 +01:00
hagen1778
79b62e3fc6 docs: reorder template functions alphabetically
follow-up after ea41fea453

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3e0aa46fdb)
2025-10-30 10:02:15 +01:00
Anh-Dung Nguyen
32e19ffaa4 app/vmalert: add now template (#9913)
### Describe Your Changes

Related to #9864, add "now" as a template in vmalert rules templating and
update the docs. I haven't been able to test the docs change as I can't
run make docs-debug locally, so if anyone knows how to do it locally,
please let me know!

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Hui Wang <haley@victoriametrics.com>
Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
(cherry picked from commit ea41fea453)
2025-10-30 10:02:15 +01:00
hagen1778
3df3e19d33 docs: move change line to the right place
Follow-up after 2652a7c762

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 0165108a8f)
2025-10-30 10:02:14 +01:00
Hui Wang
a20fd88572 vmalert: support alert_relabel_configs per each notifier in -notifier.config file (#9736)
fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5980

(cherry picked from commit 2652a7c762)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-30 10:02:14 +01:00
Stephan Burns
3124439874 vmagent/docs: grammar changes (#9863)
### Describe Your Changes

Lots of small changes to grammar to make the docs flow nicer.

I'm sorry that this ended up being such a large PR. I will split these
up in the future.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Phuong Le <39565248+func25@users.noreply.github.com>
2025-10-28 16:56:23 +02:00
dependabot[bot]
d5ebb18058 build(deps): bump vite from 7.1.5 to 7.1.11 in /app/vmui/packages/vmui (#9885)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite)
from 7.1.5 to 7.1.11.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/vitejs/vite/releases">vite's
releases</a>.</em></p>
<blockquote>
<h2>v7.1.11</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.11/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.10</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.10/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.9</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.9/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.8</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.8/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.7</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.7/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.6</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.6/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md">vite's
changelog</a>.</em></p>
<blockquote>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.10...v7.1.11">7.1.11</a>
(2025-10-20)<!-- raw HTML omitted --></h2>
<h3>Bug Fixes</h3>
<ul>
<li><strong>dev:</strong> trim trailing slash before
<code>server.fs.deny</code> check (<a
href="https://redirect.github.com/vitejs/vite/issues/20968">#20968</a>)
(<a
href="f479cc57c4">f479cc5</a>)</li>
</ul>
<h3>Miscellaneous Chores</h3>
<ul>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20966">#20966</a>)
(<a
href="6fb41a260b">6fb41a2</a>)</li>
</ul>
<h3>Code Refactoring</h3>
<ul>
<li>use subpath imports for types module reference (<a
href="https://redirect.github.com/vitejs/vite/issues/20921">#20921</a>)
(<a
href="d0094af639">d0094af</a>)</li>
</ul>
<h3>Build System</h3>
<ul>
<li>remove cjs reference in files field (<a
href="https://redirect.github.com/vitejs/vite/issues/20945">#20945</a>)
(<a
href="ef411cee26">ef411ce</a>)</li>
<li>remove hash from built filenames (<a
href="https://redirect.github.com/vitejs/vite/issues/20946">#20946</a>)
(<a
href="a81730754d">a817307</a>)</li>
</ul>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.9...v7.1.10">7.1.10</a>
(2025-10-14)<!-- raw HTML omitted --></h2>
<h3>Bug Fixes</h3>
<ul>
<li><strong>css:</strong> avoid duplicate style for server rendered
stylesheet link and client inline style during dev (<a
href="https://redirect.github.com/vitejs/vite/issues/20767">#20767</a>)
(<a
href="3a92bc79b3">3a92bc7</a>)</li>
<li><strong>css:</strong> respect emitAssets when cssCodeSplit=false (<a
href="https://redirect.github.com/vitejs/vite/issues/20883">#20883</a>)
(<a
href="d3e7eeefa9">d3e7eee</a>)</li>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="879de86935">879de86</a>)</li>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20894">#20894</a>)
(<a
href="3213f90ff0">3213f90</a>)</li>
<li><strong>dev:</strong> allow aliases starting with <code>//</code>
(<a
href="https://redirect.github.com/vitejs/vite/issues/20760">#20760</a>)
(<a
href="b95fa2aa75">b95fa2a</a>)</li>
<li><strong>dev:</strong> remove timestamp query consistently (<a
href="https://redirect.github.com/vitejs/vite/issues/20887">#20887</a>)
(<a
href="6537d15591">6537d15</a>)</li>
<li><strong>esbuild:</strong> inject esbuild helpers correctly for
esbuild 0.25.9+ (<a
href="https://redirect.github.com/vitejs/vite/issues/20906">#20906</a>)
(<a
href="446eb38632">446eb38</a>)</li>
<li>normalize path before calling <code>fileToBuiltUrl</code> (<a
href="https://redirect.github.com/vitejs/vite/issues/20898">#20898</a>)
(<a
href="73b6d243e0">73b6d24</a>)</li>
<li>preserve original sourcemap file field when combining sourcemaps (<a
href="https://redirect.github.com/vitejs/vite/issues/20926">#20926</a>)
(<a
href="c714776aa1">c714776</a>)</li>
</ul>
<h3>Documentation</h3>
<ul>
<li>correct <code>WebSocket</code> spelling (<a
href="https://redirect.github.com/vitejs/vite/issues/20890">#20890</a>)
(<a
href="29e98dc3ef">29e98dc</a>)</li>
</ul>
<h3>Miscellaneous Chores</h3>
<ul>
<li><strong>deps:</strong> update rolldown-related dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20923">#20923</a>)
(<a
href="a5e3b064fa">a5e3b06</a>)</li>
</ul>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.8...v7.1.9">7.1.9</a>
(2025-10-03)<!-- raw HTML omitted --></h2>
<h3>Reverts</h3>
<ul>
<li><strong>server:</strong> drain stdin when not interactive (<a
href="https://redirect.github.com/vitejs/vite/issues/20885">#20885</a>)
(<a
href="12d72b0538">12d72b0</a>)</li>
</ul>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.7...v7.1.8">7.1.8</a>
(2025-10-02)<!-- raw HTML omitted --></h2>
<h3>Bug Fixes</h3>
<ul>
<li><strong>css:</strong> improve url escape characters handling (<a
href="https://redirect.github.com/vitejs/vite/issues/20847">#20847</a>)
(<a
href="24a61a3f54">24a61a3</a>)</li>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20855">#20855</a>)
(<a
href="788a183afc">788a183</a>)</li>
<li><strong>deps:</strong> update artichokie to 0.4.2 (<a
href="https://redirect.github.com/vitejs/vite/issues/20864">#20864</a>)
(<a
href="e670799e12">e670799</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8b69c9e32c"><code>8b69c9e</code></a>
release: v7.1.11</li>
<li><a
href="f479cc57c4"><code>f479cc5</code></a>
fix(dev): trim trailing slash before <code>server.fs.deny</code> check
(<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20968">#20968</a>)</li>
<li><a
href="6fb41a260b"><code>6fb41a2</code></a>
chore(deps): update all non-major dependencies (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20966">#20966</a>)</li>
<li><a
href="a81730754d"><code>a817307</code></a>
build: remove hash from built filenames (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20946">#20946</a>)</li>
<li><a
href="ef411cee26"><code>ef411ce</code></a>
build: remove cjs reference in files field (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20945">#20945</a>)</li>
<li><a
href="d0094af639"><code>d0094af</code></a>
refactor: use subpath imports for types module reference (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20921">#20921</a>)</li>
<li><a
href="ed4a0dc913"><code>ed4a0dc</code></a>
release: v7.1.10</li>
<li><a
href="c714776aa1"><code>c714776</code></a>
fix: preserve original sourcemap file field when combining sourcemaps
(<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20926">#20926</a>)</li>
<li><a
href="446eb38632"><code>446eb38</code></a>
fix(esbuild): inject esbuild helpers correctly for esbuild 0.25.9+ (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20906">#20906</a>)</li>
<li><a
href="879de86935"><code>879de86</code></a>
fix(deps): update all non-major dependencies</li>
<li>Additional commits viewable in <a
href="https://github.com/vitejs/vite/commits/v7.1.11/packages/vite">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=vite&package-manager=npm_and_yarn&previous-version=7.1.5&new-version=7.1.11)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-28 16:45:38 +02:00
Max Kotliar
5e9d153f19 app/vmselect: Enable log slow query stats with 5s default
Rationale: Having query stats logging enabled by default can greatly
help in investigating incidents.

Currently, it is disabled by default, so many users don’t enable it, and
when issues occur there are no stats available.

After discussion with the team, a 5s threshold was agreed upon as a
reasonable default to capture meaningful slow query data without
excessive logging.
2025-10-28 16:43:24 +02:00
Max Kotliar
cb22a48429 docs/changelog: put misplaced changelog entries to tip 2025-10-28 16:37:52 +02:00
Hui Wang
003c271622 stream aggregation: change the behavior when both `streamAggr.dropInp… (#9877)
stream aggregation: change the behavior when both `streamAggr.dropInput`
and `streamAggr.keepInput` are set to true

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9724,
making dropInput and keepInput work separately.

<img width="744" height="366" alt="image"
src="https://github.com/user-attachments/assets/7ebb3d1e-872f-4789-8dd1-c4e3f80a84de"
/>

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-10-28 16:20:43 +02:00
Hui Wang
337412430b vmagent: add /remotewrite-relabel-config and `/remotewrite-url-rela… (#9722)
…bel-config` APIs to return `-promscrape.config` and
`-remoteWrite.relabelConfig` flag values

part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9504

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
(cherry picked from commit 9ffe965063)
2025-10-27 13:54:28 +01:00
Nikolay
24470a8cb8 app/{vminsert,vmstorage}: implement RPC protocol for vmstorage-vminsert communication
This commit adds a new RPC protocol for vminsert-vmstorage communication.
It works in the same way as the vmselect-vmstorage RPC.

It is implemented with new handshake Hello methods in a backward-compatible
way. The server attempts to parse RPC only if the client sends the new
Hello message, while the client falls back to the old Hello message if the
server closes the connection.

This change is needed for the new metrics metadata forwarded from vminsert
into vmstorage.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
Changes extracted from PR:
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9487
2025-10-27 09:51:41 +01:00
Artem Fetishev
41770a0edb lib/storage: fix loading nextDayMetricIDs cache from a file for different indexDB generation (#9911)
Previously, if a storage started with curr indexDB different from one
stored in nextDayMetricIDs cache file, the cache would still be loaded
into memory possibly affecting the next day prefill.

This is an unlikely case but it is still possible when:

- A programmer makes a mistake in the code and uses something else
instead of idbCurr.generation.
- Downgrading from pt-index to previous version

Related to #7599 and #8134.
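
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the actual lib/storage code: the `nextDayCache` type, its fields and `loadNextDayCache` are hypothetical names, and the real cache is read from a file rather than passed in as a value.

```go
package main

import "fmt"

// nextDayCache is a hypothetical stand-in for the persisted
// nextDayMetricIDs cache: the indexDB generation it was built for
// plus the metric IDs it holds.
type nextDayCache struct {
	Generation uint64
	MetricIDs  []uint64
}

// loadNextDayCache returns the stored cache only if it was built for the
// current indexDB generation. Otherwise it returns an empty cache bound to
// the current generation, so stale IDs cannot affect the next day prefill.
func loadNextDayCache(stored nextDayCache, currGeneration uint64) nextDayCache {
	if stored.Generation != currGeneration {
		return nextDayCache{Generation: currGeneration}
	}
	return stored
}

func main() {
	stored := nextDayCache{Generation: 1, MetricIDs: []uint64{10, 20}}
	fmt.Println(len(loadNextDayCache(stored, 1).MetricIDs)) // same generation: cache is kept
	fmt.Println(len(loadNextDayCache(stored, 2).MetricIDs)) // different generation: cache is dropped
}
```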

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-26 19:51:47 +01:00
Artem Fetishev
8baba9c2eb lib/storage: add a unit test for next day idb prefill (#9906)
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-25 18:27:44 +02:00
Roman Khavronenko
80853b6664 app/vmalert/datasource: explicitly check response type during replay (#9868)
This change validates that the QueryRange() method of the prometheus
datasource receives a response with the `matrix` data type. It returns an
error otherwise.

The change is needed to avoid confusion like in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9779.

The fix is not elegant, but it should be simple to support: each API has
its own parsing function, even if some processing code is repeated.
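
A minimal sketch of such a check, assuming the standard Prometheus HTTP API response layout (`data.resultType`). The `promResponse` type and `checkRangeResponse` function are illustrative names, not the actual vmalert code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// promResponse mirrors only the parts of a Prometheus query API response
// that are needed to verify the result type.
type promResponse struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string          `json:"resultType"`
		Result     json.RawMessage `json:"result"`
	} `json:"data"`
}

// checkRangeResponse returns an error unless body is a valid response
// whose result type is "matrix", as expected from a range query.
func checkRangeResponse(body []byte) error {
	var r promResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return fmt.Errorf("cannot parse response: %w", err)
	}
	if r.Data.ResultType != "matrix" {
		return fmt.Errorf("unexpected result type %q; want %q", r.Data.ResultType, "matrix")
	}
	return nil
}

func main() {
	matrix := []byte(`{"status":"success","data":{"resultType":"matrix","result":[]}}`)
	vector := []byte(`{"status":"success","data":{"resultType":"vector","result":[]}}`)
	fmt.Println(checkRangeResponse(matrix))
	fmt.Println(checkRangeResponse(vector))
}
```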

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com>
(cherry picked from commit 5fa87af6be)
2025-10-24 17:19:25 +02:00
Roman Khavronenko
6881646ed7 app/vmalert: preserve html formatting in annotations (#9892)
The change is purely visual. It preserves HTML formatting in annotations
when rendering them on the rule or alert details page. The CSS change is
clumsy, but demonstrates the point.

--------

Before:
<img width="1179" height="297" alt="image"
src="https://github.com/user-attachments/assets/c30e2222-7b0f-4f28-bf6e-c546cc5bb2fc"
/>

After:
<img width="1196" height="321" alt="image"
src="https://github.com/user-attachments/assets/2c6d9530-7ae9-47fd-b4ba-87fe6f44c625"
/>

-------

p.s. @AndrewChubatiuk I sure know that you could make this change in
more elegant or stylish way than I did. Please do so, if you want.
Please port this change to vmui too. Thanks!

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit b9a3369254)
2025-10-24 17:19:25 +02:00
Max Kotliar
5816a9945e docs/changelog: fix typo in update note 2025-10-24 13:28:05 +03:00
Max Kotliar
a6a1da1e4b docs/changelog: fix typo in LTS link 2025-10-24 13:18:27 +03:00
Zakhar Bessarab
13a9fd4e67 lib/backup/s3remote: properly extend http client if it is present
fb1344b5 replaced the HTTP client unconditionally, which overrides the
configuration loaded by the AWS SDK. This leads to AWS env
variables being overwritten.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9858

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9898
2025-10-24 11:05:50 +02:00
Zhu Jiekun
b26792e1dd httpserver: add http2 option
Currently, `httpserver` disables HTTP/2 support by design, because:
```
// Disable http/2, since it doesn't give any advantages for VictoriaMetrics services.
```

As VictoriaLogs and VictoriaTraces rely on `httpserver`, in order to
support gRPC over HTTP/2, an option to support HTTP/2 is required.


Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9881
2025-10-24 11:05:50 +02:00
f41gh7
4383d7fbdf lib/streamaggr: concurrently push timeseries to aggregators
Previously all timeseries pushed into aggregators were added
sequentially. It could cause delays on data ingestion and it was not
possible to use all available CPU cores.

This commit adds concurrency based on available CPU cores.

Also, it adds new generic Buffer and BufferPool into slicesutil.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9878
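
The general idea can be sketched as below. This is only an illustration under assumptions: `pushConcurrently` is a hypothetical helper, series are modelled as plain strings, and `runtime.GOMAXPROCS(0)` stands in for whatever CPU detection the real code uses. The actual lib/streamaggr change also relies on the new Buffer and BufferPool types, which are not shown here.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// pushConcurrently splits items into at most `workers` contiguous chunks
// and calls push for every chunk in its own goroutine. It returns after
// all chunks have been processed.
func pushConcurrently(items []string, workers int, push func(chunk []string)) {
	if len(items) == 0 {
		return
	}
	if workers < 1 {
		workers = 1
	}
	// Round up so that every item lands in some chunk.
	chunkSize := (len(items) + workers - 1) / workers
	var wg sync.WaitGroup
	for start := 0; start < len(items); start += chunkSize {
		end := start + chunkSize
		if end > len(items) {
			end = len(items)
		}
		wg.Add(1)
		go func(chunk []string) {
			defer wg.Done()
			push(chunk)
		}(items[start:end])
	}
	wg.Wait()
}

func main() {
	series := []string{"a", "b", "c", "d", "e"}
	var pushed atomic.Int64
	pushConcurrently(series, runtime.GOMAXPROCS(0), func(chunk []string) {
		// The callback runs concurrently, so shared state must be synchronized.
		pushed.Add(int64(len(chunk)))
	})
	fmt.Println("pushed series:", pushed.Load())
}
```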
2025-10-24 11:05:45 +02:00
Aliaksandr Valialkin
dc485acdc7 app/vmauth: log the real cause for timed out requests to vmauth
Previously a misleading random error could be logged for canceled and/or timed out requests to vmauth.
Consistently log the request timeout error for timed out requests.

While at it, do not log errors for requests canceled by the remote client, since such logs aren't actionable
and just pollute error logs generated by vmauth.
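
A minimal sketch of this kind of error classification, using only the standard `context` and `errors` packages. `classifyRequestError` and the two helper functions are hypothetical names for illustration and do not reflect the actual vmauth code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// classifyRequestError returns the message to log and whether the error
// should be logged at all. Timeouts are reported consistently, while
// cancellations by the remote client are skipped as non-actionable.
func classifyRequestError(err error) (msg string, shouldLog bool) {
	switch {
	case err == nil:
		return "", false
	case errors.Is(err, context.DeadlineExceeded):
		return "request timed out", true
	case errors.Is(err, context.Canceled):
		return "", false
	default:
		return err.Error(), true
	}
}

// expiredContextErr returns the error of a context whose deadline has passed.
func expiredContextErr() error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Nanosecond)
	defer cancel()
	<-ctx.Done()
	return ctx.Err()
}

// canceledContextErr returns the error of an explicitly canceled context.
func canceledContextErr() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx.Err()
}

func main() {
	fmt.Println(classifyRequestError(expiredContextErr()))
	fmt.Println(classifyRequestError(canceledContextErr()))
}
```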
2025-10-21 16:02:42 +02:00
Max Kotliar
5a4e5b2cb7 docs/changelog: add links to related PRs. 2025-10-21 16:11:38 +03:00
Max Kotliar
1dc0562576 docs: run make docs-update-flags
sync flags with actual values in binaries
2025-10-21 16:04:37 +03:00
Max Kotliar
f7d8c48b6c docs/CHANGELOG.md: update changelog with LTS release notes; bump LTS versions 2025-10-21 11:23:22 +03:00
Max Kotliar
81afbdf9b2 docs: bump latest version in docs 2025-10-21 10:59:03 +03:00
Max Kotliar
fd099b5dcc deployment/docker: bump version 2025-10-21 10:56:02 +03:00
Max Kotliar
9a009c491d docs/CHANGELOG.md: cut v1.128.0 2025-10-17 14:54:10 +03:00
Max Kotliar
d4048bff17 docs: update version tooltips 2025-10-17 14:51:13 +03:00
Max Kotliar
07eae0f742 app/{vmselect,vlselect}: run make vmui-update 2025-10-17 14:44:59 +03:00
Stephan Burns
c9108a97f7 docs: grammatical changes (#9862)
### Describe Your Changes

Some small grammatical changes.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-17 14:09:38 +03:00
Rishub
712e61e349 Fix docker pull command for vmanomaly images (#9870)
### Describe Your Changes

The Docker Hub image URL was also set to quay.io; changed it back to
Docker Hub.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-17 14:07:17 +03:00
Nikolay
c7377288f8 lib/protoparser: add flag opentelemetry.convertMetricNamesToPrometheus (#9875)
Introduce a new flag, which converts only metric names into
Prometheus-compatible format and keeps label names in their original form.

Keeping labels in their original form is useful for correlation with
other telemetry sources, such as logs or traces.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9830

---------

Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-10-17 13:43:52 +03:00
Andrii Chubatiuk
b3365be0b3 lib/streamaggr: disable impact of flush_on_shutdown on aggregated series flush time (#9852)
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9642

additionally renamed variable since meaning of variables
`!flushOnShutdown` and `skipIncompleteFlush` is not equal.

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2025-10-17 13:04:08 +03:00
Andrii Chubatiuk
ea172f7bc7 {dashboards,rules}: update storage ETA calculations in both dashboards and rules (#9848)
Currently the index file size is calculated as an average across all
storages from all clusters. Update it to get more valid calculations. This
PR also [fixes a helm chart
issue](https://github.com/VictoriaMetrics/helm-charts/issues/2474).

2025-10-17 12:50:56 +03:00
Zakhar Bessarab
3e8da5fd19 lib/backup/s3remote: use http client from AWS instead of custom implementation (#9869)
AWS SDK does not modify custom http client configuration if it was provided. This leads to
additional configuration such as environment variables being ignored.

Use AWS http client builder instead of custom implementation and
override DialContext to preserve metrics exposed by custom transport.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9858

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-17 12:31:04 +04:00
Nikolay
4eb0f11a72 app/vmagent/kafka: add opentelemetry consumer format
This commit adds opentelemetry format for kafka consumer

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9734
2025-10-14 22:32:46 +02:00
dependabot[bot]
84f452883e build(deps): bump actions/setup-node from 4 to 6 (#9857)
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4
to 6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-node/releases">actions/setup-node's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<p><strong>Breaking Changes</strong></p>
<ul>
<li>Limit automatic caching to npm, update workflows and documentation
by <a
href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1374">actions/setup-node#1374</a></li>
</ul>
<p><strong>Dependency Upgrades</strong></p>
<ul>
<li>Upgrade ts-jest from 29.1.2 to 29.4.1 and document breaking changes
in v5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1336">#1336</a></li>
<li>Upgrade prettier from 2.8.8 to 3.6.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1334">#1334</a></li>
<li>Upgrade actions/publish-action from 0.3.0 to 0.4.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1362">#1362</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v5...v6.0.0">https://github.com/actions/setup-node/compare/v5...v6.0.0</a></p>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Enhance caching in setup-node with automatic package manager
detection by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1348">actions/setup-node#1348</a></li>
</ul>
<p>This update, introduces automatic caching when a valid
<code>packageManager</code> field is present in your
<code>package.json</code>. This aims to improve workflow performance and
make dependency management more seamless.
To disable this automatic caching, set <code>package-manager-cache:
false</code></p>
<pre lang="yaml"><code>steps:
- uses: actions/checkout@v5
- uses: actions/setup-node@v5
  with:
    package-manager-cache: false
</code></pre>
<ul>
<li>Upgrade action to use node24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/1325">actions/setup-node#1325</a></li>
</ul>
<p>Make sure your runner is on version v2.327.1 or later to ensure
compatibility with this release. <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">See
Release Notes</a></p>
<h3>Dependency Upgrades</h3>
<ul>
<li>Upgrade <code>@​octokit/request-error</code> and
<code>@​actions/github</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1227">actions/setup-node#1227</a></li>
<li>Upgrade uuid from 9.0.1 to 11.1.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1273">actions/setup-node#1273</a></li>
<li>Upgrade undici from 5.28.5 to 5.29.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1295">actions/setup-node#1295</a></li>
<li>Upgrade form-data to bring in fix for critical vulnerability by <a
href="https://github.com/gowridurgad"><code>@​gowridurgad</code></a> in
<a
href="https://redirect.github.com/actions/setup-node/pull/1332">actions/setup-node#1332</a></li>
<li>Upgrade actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1345">actions/setup-node#1345</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1348">actions/setup-node#1348</a></li>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1325">actions/setup-node#1325</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v4...v5.0.0">https://github.com/actions/setup-node/compare/v4...v5.0.0</a></p>
<h2>v4.4.0</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2028fbc5c2"><code>2028fbc</code></a>
Limit automatic caching to npm, update workflows and documentation (<a
href="https://redirect.github.com/actions/setup-node/issues/1374">#1374</a>)</li>
<li><a
href="13427813f7"><code>1342781</code></a>
Bump actions/publish-action from 0.3.0 to 0.4.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1362">#1362</a>)</li>
<li><a
href="89d709d423"><code>89d709d</code></a>
Bump prettier from 2.8.8 to 3.6.2 (<a
href="https://redirect.github.com/actions/setup-node/issues/1334">#1334</a>)</li>
<li><a
href="cd2651c462"><code>cd2651c</code></a>
Bump ts-jest from 29.1.2 to 29.4.1 (<a
href="https://redirect.github.com/actions/setup-node/issues/1336">#1336</a>)</li>
<li><a
href="a0853c2454"><code>a0853c2</code></a>
Bump actions/checkout from 4 to 5 (<a
href="https://redirect.github.com/actions/setup-node/issues/1345">#1345</a>)</li>
<li><a
href="b7234cc9fe"><code>b7234cc</code></a>
Upgrade action to use node24 (<a
href="https://redirect.github.com/actions/setup-node/issues/1325">#1325</a>)</li>
<li><a
href="d7a11313b5"><code>d7a1131</code></a>
Enhance caching in setup-node with automatic package manager detection
(<a
href="https://redirect.github.com/actions/setup-node/issues/1348">#1348</a>)</li>
<li><a
href="5e2628c959"><code>5e2628c</code></a>
Bumps form-data (<a
href="https://redirect.github.com/actions/setup-node/issues/1332">#1332</a>)</li>
<li><a
href="65beceff8e"><code>65becef</code></a>
Bump undici from 5.28.5 to 5.29.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1295">#1295</a>)</li>
<li><a
href="7e24a656e1"><code>7e24a65</code></a>
Bump uuid from 9.0.1 to 11.1.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1273">#1273</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/setup-node/compare/v4...v6">compare
view</a></li>
</ul>
</details>
<br />



Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-14 17:32:54 +03:00
Max Kotliar
3d32673969 deployment/docker: update Go builder from Go1.25.2 to Go1.25.3 (#9859)
Release https://go.dev/doc/devel/release#go1.25.3

Changes https://github.com/golang/go/issues?q=milestone%3AGo1.25.3%20label%3ACherryPickApproved

Follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9843
2025-10-14 17:09:45 +03:00
Roman Khavronenko
0f302bf667 deployment: bump alpine image to 3.22.2 (#9855)
See
https://www.alpinelinux.org/posts/Alpine-3.19.9-3.20.8-3.21.5-3.22.2-released.html

Addresses CVE-2025-9230, CVE-2025-9231, CVE-2025-9232.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-13 18:57:59 +03:00
Roman Khavronenko
f2af269464 app/vmselect: add more context for bad URL error
Users have repeatedly been confused by the Single-node and Cluster URL
formats of VictoriaMetrics. This is an attempt to bring more context to
the error message if the request doesn't contain the expected "/select"
or "/delete" prefix, which is usually a sign of using the wrong API.

This commit is only an attempt to fix it, but it demonstrates the idea.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9741
2025-10-13 11:13:42 +02:00
Nikolay
7faecb1884 lib/promscrape/config: change promscrape.dropOriginalLabels default value to true
Tracking original labels requires storing a copy of the labels obtained
from service discovery. It adds extra garbage collection pressure and, as
a result, increases CPU usage.

While tracking original labels has almost no impact on test and small
installations, the impact grows with scale and is especially noticeable
on Kubernetes-based installations.

In addition, original labels tracking is disabled by default in the
`k8s-stack` helm chart, which is our main Kubernetes monitoring solution.

Also, the vmagent optimisation guide recommends disabling the storing of
original labels.

This commit changes the default value to true and disables tracking of
dropped targets by default. For debugging, it can easily be
enabled back by passing the `false` value to the
`promscrape.dropOriginalLabels` flag. It should improve resource usage out
of the box at the cost of a worse user experience for a minority of users.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9665

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9772
2025-10-13 11:05:43 +02:00
Andrii Chubatiuk
4b19f9aae3 app/vmui: make TextField box height constant regardless error message presence
The TextField component can show an error message, and the text field
height changes depending on its presence, which may cause visibility
issues if the field is vertically aligned with neighbouring
components. This PR makes the text field height constant and its input box
horizontally symmetrical.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9693
2025-10-13 11:05:43 +02:00
Andrii Chubatiuk
7265089a62 app/vmui: fixed code color value in light mode
Fixed typo in code color variable value in light mode

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9851
2025-10-13 11:05:43 +02:00
Zakhar Bessarab
08099b47fa deps: unpin AWS dependencies and add workaround for S3 compatibility (#9844)
Updates:
- unpin AWS dependencies and run `make vendor-update`
- add config options to enable checksums only if required by storage in
order to preserve backwards compatibility

Related issues:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9748
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8622

Tested with: AWS S3, self-hosted MinIO, Linode object storage as it was
failing previously with multi-part uploads (reported here -
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8630#issuecomment-2772185033).
An updated library allows (PR with the
fix - https://github.com/aws/aws-sdk-go-v2/pull/3151) overriding
multi-part upload configurations so that compatibility can be preserved.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-10 18:36:09 +04:00
Zakhar Bessarab
6ef2b02c1e lib/backup/actions: improve progress logging (#9836)
Currently, it is hard to make sense of progress based on logging, as it
requires manual calculation of progress and ETA.
Solve this by:
- making data units human-readable
- adding an estimated completion time for the operation
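
Both points can be illustrated with a small sketch. The helper names `humanBytes` and `eta`, the binary (KiB, MiB) units and the linear extrapolation are assumptions made for illustration; the actual lib/backup/actions code may format and estimate differently.

```go
package main

import (
	"fmt"
	"time"
)

// humanBytes formats n using binary units, e.g. 1536 -> "1.5 KiB".
func humanBytes(n uint64) string {
	const unit = 1024
	if n < unit {
		return fmt.Sprintf("%d B", n)
	}
	div, exp := uint64(unit), 0
	for v := n / unit; v >= unit; v /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %ciB", float64(n)/float64(div), "KMGTPE"[exp])
}

// eta linearly extrapolates the remaining time from the average speed so far.
// It returns 0 when nothing has been processed yet (no estimate is possible)
// or when the operation is already complete.
func eta(done, total uint64, elapsed time.Duration) time.Duration {
	if done == 0 || done >= total {
		return 0
	}
	rate := float64(done) / elapsed.Seconds() // bytes per second
	return time.Duration(float64(total-done) / rate * float64(time.Second))
}

func main() {
	done, total := uint64(256<<20), uint64(1<<30)
	elapsed := 10 * time.Second
	fmt.Printf("uploaded %s of %s, eta %s\n",
		humanBytes(done), humanBytes(total), eta(done, total, elapsed))
}
```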

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-10 18:34:21 +04:00
dependabot[bot]
67416d1e58 build(deps): bump github/codeql-action from 3 to 4 (#9827)
Bumps [github/codeql-action](https://github.com/github/codeql-action)
from 3 to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/github/codeql-action/releases">github/codeql-action's
releases</a>.</em></p>
<blockquote>
<h2>v3.30.7</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.30.7 - 06 Oct 2025</h2>
<p>No user facing changes.</p>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.30.7/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.30.6</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.30.6 - 02 Oct 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.23.2. <a
href="https://redirect.github.com/github/codeql-action/pull/3168">#3168</a></li>
</ul>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.30.6/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.30.5</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.30.5 - 26 Sep 2025</h2>
<ul>
<li>We fixed a bug that was introduced in <code>3.30.4</code> with
<code>upload-sarif</code> which resulted in files without a
<code>.sarif</code> extension not getting uploaded. <a
href="https://redirect.github.com/github/codeql-action/pull/3160">#3160</a></li>
</ul>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.30.5/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.30.4</h2>
<h1>CodeQL Action Changelog</h1>
<p>See the <a
href="https://github.com/github/codeql-action/releases">releases
page</a> for the relevant changes to the CodeQL CLI and language
packs.</p>
<h2>3.30.4 - 25 Sep 2025</h2>
<ul>
<li>We have improved the CodeQL Action's ability to validate that the
workflow it is used in does not use different versions of the CodeQL
Action for different workflow steps. Mixing different versions of the
CodeQL Action in the same workflow is unsupported and can lead to
unpredictable results. A warning will now be emitted from the
<code>codeql-action/init</code> step if different versions of the CodeQL
Action are detected in the workflow file. Additionally, an error will
now be thrown by the other CodeQL Action steps if they load a
configuration file that was generated by a different version of the
<code>codeql-action/init</code> step. <a
href="https://redirect.github.com/github/codeql-action/pull/3099">#3099</a>
and <a
href="https://redirect.github.com/github/codeql-action/pull/3100">#3100</a></li>
<li>We added support for reducing the size of dependency caches for Java
analyses, which will reduce cache usage and speed up workflows. This
will be enabled automatically at a later time. <a
href="https://redirect.github.com/github/codeql-action/pull/3107">#3107</a></li>
<li>You can now run the latest CodeQL nightly bundle by passing
<code>tools: nightly</code> to the <code>init</code> action. In general,
the nightly bundle is unstable and we only recommend running it when
directed by GitHub staff. <a
href="https://redirect.github.com/github/codeql-action/pull/3130">#3130</a></li>
<li>Update default CodeQL bundle version to 2.23.1. <a
href="https://redirect.github.com/github/codeql-action/pull/3118">#3118</a></li>
</ul>
<p>See the full <a
href="https://github.com/github/codeql-action/blob/v3.30.4/CHANGELOG.md">CHANGELOG.md</a>
for more information.</p>
<h2>v3.30.3</h2>
<h1>CodeQL Action Changelog</h1>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/github/codeql-action/blob/main/CHANGELOG.md">github/codeql-action's
changelog</a>.</em></p>
<blockquote>
<h2>3.29.4 - 23 Jul 2025</h2>
<p>No user facing changes.</p>
<h2>3.29.3 - 21 Jul 2025</h2>
<p>No user facing changes.</p>
<h2>3.29.2 - 30 Jun 2025</h2>
<ul>
<li>Experimental: When the <code>quality-queries</code> input for the
<code>init</code> action is provided with an argument, separate
<code>.quality.sarif</code> files are produced and uploaded for each
language with the results of the specified queries. Do not use this in
production as it is part of an internal experiment and subject to change
at any time. <a
href="https://redirect.github.com/github/codeql-action/pull/2935">#2935</a></li>
</ul>
<h2>3.29.1 - 27 Jun 2025</h2>
<ul>
<li>Fix bug in PR analysis where user-provided <code>include</code>
query filter fails to exclude non-included queries. <a
href="https://redirect.github.com/github/codeql-action/pull/2938">#2938</a></li>
<li>Update default CodeQL bundle version to 2.22.1. <a
href="https://redirect.github.com/github/codeql-action/pull/2950">#2950</a></li>
</ul>
<h2>3.29.0 - 11 Jun 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.22.0. <a
href="https://redirect.github.com/github/codeql-action/pull/2925">#2925</a></li>
<li>Bump minimum CodeQL bundle version to 2.16.6. <a
href="https://redirect.github.com/github/codeql-action/pull/2912">#2912</a></li>
</ul>
<h2>3.28.21 - 28 July 2025</h2>
<p>No user facing changes.</p>
<h2>3.28.20 - 21 July 2025</h2>
<ul>
<li>Remove support for combining SARIF files from a single upload for
GHES 3.18, see <a
href="https://github.blog/changelog/2024-05-06-code-scanning-will-stop-combining-runs-from-a-single-upload/">the
changelog post</a>. <a
href="https://redirect.github.com/github/codeql-action/pull/2959">#2959</a></li>
</ul>
<h2>3.28.19 - 03 Jun 2025</h2>
<ul>
<li>The CodeQL Action no longer includes its own copy of the extractor
for the <code>actions</code> language, which is currently in public
preview.
The <code>actions</code> extractor has been included in the CodeQL CLI
since v2.20.6. If your workflow has enabled the <code>actions</code>
language <em>and</em> you have pinned
your <code>tools:</code> property to a specific version of the CodeQL
CLI earlier than v2.20.6, you will need to update to at least CodeQL
v2.20.6 or disable
<code>actions</code> analysis.</li>
<li>Update default CodeQL bundle version to 2.21.4. <a
href="https://redirect.github.com/github/codeql-action/pull/2910">#2910</a></li>
</ul>
<h2>3.28.18 - 16 May 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.21.3. <a
href="https://redirect.github.com/github/codeql-action/pull/2893">#2893</a></li>
<li>Skip validating SARIF produced by CodeQL for improved performance.
<a
href="https://redirect.github.com/github/codeql-action/pull/2894">#2894</a></li>
<li>The number of threads and amount of RAM used by CodeQL can now be
set via the <code>CODEQL_THREADS</code> and <code>CODEQL_RAM</code>
runner environment variables. If set, these environment variables
override the <code>threads</code> and <code>ram</code> inputs
respectively. <a
href="https://redirect.github.com/github/codeql-action/pull/2891">#2891</a></li>
</ul>
<h2>3.28.17 - 02 May 2025</h2>
<ul>
<li>Update default CodeQL bundle version to 2.21.2. <a
href="https://redirect.github.com/github/codeql-action/pull/2872">#2872</a></li>
</ul>
<h2>3.28.16 - 23 Apr 2025</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="aac66ec793"><code>aac66ec</code></a>
Remove <code>update-proxy-release</code> workflow</li>
<li><a
href="91a63dc72c"><code>91a63dc</code></a>
Remove <code>undefined</code> values from results of
<code>unsafeEntriesInvariant</code></li>
<li><a
href="d25fa60a90"><code>d25fa60</code></a>
ESLint: Disable <code>no-unused-vars</code> for parameters starting with
<code>_</code></li>
<li><a
href="3adb1ff7b8"><code>3adb1ff</code></a>
Reorder supported tags in descending order</li>
<li>See full diff in <a
href="https://github.com/github/codeql-action/compare/v3...v4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github/codeql-action&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-10 15:57:00 +03:00
Max Kotliar
7988115a33 docs: add CVE handling policy (#9847)
### Describe Your Changes

Add a CVE handling policy. 

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-10 15:54:13 +03:00
hagen1778
dd9607de49 Revert "docs: vmstorage 2 cpu requirement (#9846)"
This reverts commit 772ac8803e.

The reason for revert is that this recommendation should not be strict.
Installations with <= 1 vCPU will continue working efficiently. The load
from reads, writes and background merges will be evenly spread by Go runtime.

cc @Sleuth56 @tiny-pangolin

(cherry picked from commit 5d36616d02)
2025-10-10 14:17:48 +02:00
Roman Khavronenko
b28e3b984d app/vmalert: uniformly populate error messages with URL context (#9845)
Previously, `req.URL.Redacted` info was present in some error messages and
missing in others. This change uniformly adds it to the error context.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit e31e0657c8)
2025-10-10 14:17:48 +02:00
Andrii Chubatiuk
223e5a24ea docs: mention ignore_first_sample_interval in stream aggregation docs (#9834)
### Describe Your Changes

related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9615

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 50af991677)
2025-10-10 14:17:48 +02:00
Roman Khavronenko
c2c5c6019a docs: fix typo in queryRequestsCount field name (#9829)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 5617b24cef)
2025-10-10 14:17:48 +02:00
Roman Khavronenko
9f24172d8c docs: mention -metricNamesStatsResetAuthKey in Security (#9828)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit ece3ee2427)
2025-10-10 14:17:47 +02:00
Andrii Chubatiuk
ab461eadfa app/vmui: store query hidden state in query args (#9826)
### Describe Your Changes

Save the hidden state of queries to the
`expr.hide:<hidden_query_idx_1,..,hidden_query_idx_n>` query argument.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 4add451701)
2025-10-10 14:17:47 +02:00
Andrii Chubatiuk
4bc4243b47 app/vmui: pass query arg to search input, reset dropdown state during update (#9825)
- fixed ignored `search` query argument in `Notifiers` and `Rules` tabs
- added dropdown state reset, if other filters were updated and selected
state is not a subset of available items
- proxy requests to config.json for a local setup

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 1c7d8d030f)
2025-10-10 14:17:47 +02:00
Yury Molodov
e1cb6305b2 app/vmui: rename "Reset" to "Reset filters" and disable when no filters are modified (#9821)
### Describe Your Changes

This PR updates the **Cardinality Explorer** page in `vmui`:

* Renames the `Reset` button to `Reset filters`.
* Disables the button when no filters are modified.

Related issue: #9609

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
(cherry picked from commit 80b4ca6367)
2025-10-10 14:17:46 +02:00
Yury Molodov
fa4bcbe692 app/vmui: prevent removing other query params when updating one (#9818)
### Describe Your Changes

Prevents removal of unrelated query parameters when updating a single
one in `vmui`.
Previously, changing one search parameter could unintentionally clear
others.
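
The fix amounts to merging the updated parameter into the existing query string instead of rebuilding the query from that parameter alone. vmui itself is written in TypeScript; the Go sketch below only illustrates the merge-instead-of-replace idea, and the helper name `setQueryParam` and the parameter names are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// setQueryParam is a hypothetical helper: it sets key to value in rawQuery
// while keeping all other parameters intact.
func setQueryParam(rawQuery, key, value string) (string, error) {
	params, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	// Set overwrites only the given key; unrelated keys are untouched.
	params.Set(key, value)
	// Encode returns the parameters sorted by key.
	return params.Encode(), nil
}

func main() {
	q, err := setQueryParam("search=cpu&state=firing", "search", "mem")
	if err != nil {
		panic(err)
	}
	fmt.Println(q)
}
```

Note that `Encode` sorts keys, so the parameter order in the result may differ from the input.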

Related issue: #9816

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
(cherry picked from commit fafc754c11)
2025-10-10 14:17:46 +02:00
Vadim Rutkovsky
ca78f53dc4 docs/guides: add Headlamp setup guide (#9817)
### Describe Your Changes

This adds a guide on how to configure Headlamp k8s UI to display k8s
metrics from VictoriaMetrics

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit e1e2b69b2c)
2025-10-10 14:17:46 +02:00
hagen1778
2ad94b2087 docs: fix link in build badge on github readme page
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit bc2b822d59)
2025-10-10 14:17:46 +02:00
hagen1778
2790d171d4 docs: restore build badge on github readme page
Based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9809

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d9b0c5a268)
2025-10-10 14:17:45 +02:00
Stephan Burns
59cb68696a docs: vmstorage 2 cpu requirement (#9846)
### Describe Your Changes

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Mathias Palmersheim <mathias@victoriametrics.com>
(cherry picked from commit 772ac8803e)
2025-10-10 14:17:45 +02:00
Zakhar Bessarab
c0b11e2064 deployment/docker: update Go builder from Go1.25.1 to Go1.25.2 (#9843)
See
https://github.com/golang/go/issues?q=milestone%3AGo1.25.2%20label%3ACherryPickApproved

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-10-09 16:53:22 +04:00
Max Kotliar
34a748aad1 app/{vmagent,vmalert}: add -secret.flags to configure flag to be hidd… (#9839)
This is a refined version of
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6940, with all
work completed by @truepele.

---

### Describe Your Changes

Fixes #6938 

introduce -secret.flags to configure flag names to be hidden in logs and
on /metrics

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Co-authored-by: andrii <truepele@gmail.com>
2025-10-09 15:08:17 +03:00
Artem Fetishev
258d2979ed docs: bump latest version in docs
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-07 17:57:09 +02:00
Artem Fetishev
3224545696 deployment/docker: bump version
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-07 17:50:57 +02:00
Artem Fetishev
1dc7a6c9ad docs: bump last LTS versions
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-07 17:47:55 +02:00
Artem Fetishev
2968075a28 docs/CHANGELOG.md: update changelog with LTS release notes
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-07 17:42:58 +02:00
hagen1778
1192059035 docs: update formatting for stream aggregation header
While there, remove excessive relabeling info and point users to Routing section.
The Routing section should explain how to build flexible processing pipelines.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 57b16f0976)
2025-10-07 17:05:25 +02:00
hagen1778
ad5a41abc6 deployment: revert upgrade libcrypto3 and libssl3 to 3.5.4-r0
Reverts 17ca1ba8c4

The reasons for the revert are the following:
1. The fix relies on release candidates of specific libraries.
2. The real fix would be to update the Alpine version, which is not released yet.
3. It leaves the fix partially done, as it would require a follow-up in the future to
switch from release candidates to stable versions, or to update the Alpine version.
4. The fix is not effective, as it doesn't update the base image cached by Docker.
The real fix will be to host and update the base image separately, like in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9811.
5. VM binaries aren't vulnerable to the mentioned vulnerabilities.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 4fbf26df02)
2025-10-07 17:05:24 +02:00
Zhu Jiekun
8dc1e6e117 docs: clarify how sharding in vmagent works when srv url is used (#9787)
### Describe Your Changes

clarify how sharding in vmagent works when srv url is used

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit 91f0738e67)
2025-10-07 17:05:24 +02:00
Roman Khavronenko
23e1d7f18d docs: fix small typos in changelog (#9798)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c70fc288f1)
2025-10-07 17:05:24 +02:00
Vadim Rutkovsky
b946a6f74b lib/promb: restore 1000 iterations for write request benchmark (#9812)
### Describe Your Changes

Initially, this benchmark was running 1000 iterations. Later, in
c005245741, it was refactored and bumped to
10_000. This triggers alerts in continuous benchmark systems, so this commit brings
it back to 1000.
### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 921e6df382)
2025-10-07 17:05:24 +02:00
Yury Molodov
3e72d21a85 app/vmui: refactor Header.tsx and fix styles (#9823)
### Describe Your Changes

- Centered the logo in the header (minor visual tweak; see screenshots
below)
- Simplified sidebar visibility conditions in `Header.tsx` for
readability/maintainability

_No changelog entry - minor visual tweak and internal refactor; no
functional impact._

### Before / After

| Before | After |
|--------|-------|
| <img width="404" height="81" alt="Before"
src="https://github.com/user-attachments/assets/21fad295-8c90-4c03-8837-d335923e645c"
/> | <img width="404" height="81" alt="After"
src="https://github.com/user-attachments/assets/8c3d7a34-fd9c-4326-aea8-ef82eade8b72"
/> |

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
(cherry picked from commit e5dd794db5)
2025-10-07 17:05:23 +02:00
Vadim Rutkovsky
4b80fc2723 docs/victoriametrics/data-ingestion: add OpenShift guide (#9790)
### Describe Your Changes

This guide describes remote write configuration for OpenShift and
includes several useful tricks to make it efficient.

Relates to #8573

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
(cherry picked from commit cc87505f9d)
2025-10-07 17:05:23 +02:00
Fred Navruzov
5b54a4f1be docs/vmanomaly: canonical link formatting (p2) (#9822)
### Describe Your Changes

canonical link formatting (p2)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit e198753f0d)
2025-10-07 17:05:23 +02:00
Fred Navruzov
d1e1ab5423 docs/vmanomaly: fix links to canonical form (1) (#9819)
### Describe Your Changes

1st iteration of improving remaining non-canonical links

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 6b9fa21d30)
2025-10-07 17:05:23 +02:00
Fred Navruzov
b102bec044 docs/vmanomaly: update schemas to v1.26.0 (#9813)
### Describe Your Changes

- update component schemas to v1.26.0;
- fix old ?highlight=... links format

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit e7abdce0de)
2025-10-07 17:05:22 +02:00
Artem Fetishev
c77d3f2898 docs/CHANGELOG.md: cut v1.127.0
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-03 18:28:12 +02:00
Artem Fetishev
4f031002ed make vmui-update
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-10-03 16:09:33 +00:00
Roman Khavronenko
9859ccefc6 apptest: tolerate time drift (#9801)
Shift time selectors by 1s to tolerate time drift in tests. Because of
the drift, the test was flapping depending on when it was run. See
https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/18186734439/job/51824319927

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9770

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-03 17:01:08 +02:00
Roman Khavronenko
cf14c7ee02 deployment: upgrade libcrypto3 and libssl3 to 3.5.4-r0 (#9805)
Addresses CVE-2025-9230, CVE-2025-9231, CVE-2025-9232.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-10-03 14:39:06 +02:00
Fred Navruzov
5ae06597de docs/vmanomaly: typos and 404 after 1.26.0 release (#9800)
### Describe Your Changes

fixing broken links in UI/FAQ pages

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-03 14:39:05 +02:00
Fred Navruzov
0d7a6f0967 change page title from GUI to UI (#9799)
### Describe Your Changes

Fixing a typo in page name

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-03 14:39:05 +02:00
Fred Navruzov
39e1b51564 docs/vmanomaly: release v1.26.0 (#9793)
### Describe Your Changes

Docs update for v1.26.0 release of `vmanomaly`, including vmui-like GUI
docs and `VLogsReader`

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-10-03 14:39:01 +02:00
Aliaksandr Valialkin
4486c79d39 docs/victoriametrics/enterprise.md: typo fix in the 'VictoriaLogs Enterprise features' chapter: VictoriaMetrics -> VictoriaLogs 2025-10-02 13:00:59 +02:00
Aliaksandr Valialkin
509e35471e docs/victoriametrics/enterprise.md: properly state that the Monitoring of Monitoring feature at VictoriaLogs enterprise helps prevent issues in VictoriaLogs setups, not VictoriaMetrics setups 2025-10-02 12:58:43 +02:00
Evgeny
e0fa0dc2e7 app/vmalert: restore usage of query template in labels
- Fixes regression from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9543
- Issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9783

(cherry picked from commit 2fcbf75539)
2025-10-02 09:59:47 +02:00
Aliaksandr Valialkin
44c9668682 Revert "integration test: prevent GetMetric from interrupting the test when metric not found"
This reverts commit ccf97a4143.

reason for revert: this change may break tests, which expect that ServesMetrics.GetMetric() fails
when the given metric doesn't exist in the output.

It is better to add 'TryGetMetric() (float64, bool)' function, which would return '(0, false)'
when the given metric doesn't exist, so the caller could decide what to do next.
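
A minimal sketch of what such a function could look like, assuming it works on the plain-text contents of the `/metrics` page. The standalone signature `tryGetMetric(metricsText, name string)` is an assumption made to keep the example self-contained; the real method would live on the test helper type and fetch the page itself.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// tryGetMetric is a hypothetical helper: it scans Prometheus text exposition
// output for a line that starts with the exact series name (including labels,
// if any) and returns its value. It returns (0, false) when the series is
// absent or its value cannot be parsed, so the caller decides what to do next.
func tryGetMetric(metricsText, name string) (float64, bool) {
	prefix := name + " "
	for _, line := range strings.Split(metricsText, "\n") {
		if !strings.HasPrefix(line, prefix) {
			continue
		}
		// The first field after the name is the value; an optional
		// timestamp may follow and is ignored.
		fields := strings.Fields(line[len(prefix):])
		if len(fields) == 0 {
			return 0, false
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	page := "vm_rows_total 42\nvm_cache_size{type=\"demo\"} 2.5\n"
	if v, ok := tryGetMetric(page, "vm_rows_total"); ok {
		fmt.Println("found:", v)
	}
	if _, ok := tryGetMetric(page, "vm_missing_total"); !ok {
		fmt.Println("metric is not exposed yet; the caller may retry")
	}
}
```

With this shape, `GetMetric` can keep failing the test on a missing metric, while retrying assertions call `tryGetMetric` and poll until the metric appears.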

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9773
2025-10-02 02:43:30 +02:00
Aliaksandr Valialkin
c85084d03a app/vmui/packages/vmui: deny indexing vmui page by Google and other web crawlers
The vmui page has zero interesting contents for indexing.
2025-10-01 13:54:28 +02:00
hagen1778
7031e02b1b docs: fix markdown formatting typo
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 09251f0a1e)
2025-10-01 13:36:01 +02:00
Hui Wang
d47c959ad3 vmselect: prevent duplicate offset modifier when instant query uses r… (#9770)
…ollup functions rate() and avg_over_time() with cache available

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9762

(cherry picked from commit 4ea5f8a84d)
2025-10-01 13:36:01 +02:00
Roman Khavronenko
f4a22712d1 vendor: update metrics package to v1.40.2 (#9780)
Restore sorting order of summary and quantile metrics exposed by
VictoriaMetrics components on `/metrics` page.

https://github.com/VictoriaMetrics/metrics/pull/105

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit cd52978096)
2025-10-01 13:36:00 +02:00
Roman Khavronenko
dd29733ad5 docs: add Life of a sample section to vmagent docs (#9719)
The routing section aims to describe the processing flow in the exact
order to the user. It substitutes previous incomplete and verbose
routing documentation in Stream Aggregation docs
https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#routing

The processing order is taken from picture in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9646#issue-3367074827

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: func25 <phuongle0205@gmail.com>
Co-authored-by: Phuong Le <39565248+func25@users.noreply.github.com>
(cherry picked from commit f65e24b2ab)
2025-10-01 13:36:00 +02:00
Andrii Chubatiuk
e54f38c96e dashboards: add adhoc filter to query stats and operator (#9774)
Add ad-hoc filters to query stats and operator dashboards.
These filters are useful for exploring non-uniform metrics sets
without distinct job/instance filters.

(cherry picked from commit 0579e68409)
2025-10-01 13:36:00 +02:00
Roman Khavronenko
11b44028c5 docs: clarify how vmagent addresses multi-level ingestion shortcomings (#9785)
The previous text didn't contain links to vmagent's capabilities.
Instead, it contained misleading multitenancy-mode link that doesn't
seem to be related to the subject.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit f2aea8532f)
2025-10-01 13:36:00 +02:00
hagen1778
b2102ff233 docs: rm unreachable link
https://www.vultr.com/docs/install-and-configure-victoriametrics-on-debian is not reachable anymore.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 94473ed262)
2025-10-01 13:35:59 +02:00
Artem Fetishev
9c1a99715e lib/storage: do not use the default 0:0 tenant in cluster tests
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-09-30 18:17:07 +02:00
Roman Khavronenko
2f05ae4537 app/{vmbackup/vmrestore}: push metrics on shutdown
Push metrics on shutdown if `-pushmetrics.url` is configured. Previously,
metrics reporting might have been skipped because of shutdown.

Obsoletes https://github.com/VictoriaMetrics/metrics/pull/103

--------------

To test:
1. Run local VictoriaMetrics instance
2. Build and run vmbackup or vmrestore:
```
make vmbackup && ./bin/vmbackup -storageDataPath=victoria-metrics-data -snapshot.createURL="http://user:pass@localhost:8428/snapshot/create?authKey=foobar" -dst=fs:////vmbackup/dir -pushmetrics.url=http://localhost:8428/api/v1/import/prometheus,http://127.0.0.1:8428/api/v1/import/prometheus
```
3. Try playing with `-pushmetrics.url` (good/bad/many addresses) and
observe logs

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9767
2025-09-30 09:30:25 +02:00
Zhu Jiekun
e8292f6088 integration test: prevent GetMetric from interrupting the test when metric not found
Previously, `GetMetric` called `t.Fatalf` immediately when the target metric
did not exist on the `/metrics` page.

However, some metrics may start to appear only after the process has been
running for a while. `t.Fatalf` defeats the retry mechanism of
assertions: if the metric is not found the first time, the test case
terminates.

This commit changes `t.Fatalf` to `t.Logf` (instead of `t.Errorf`,
because error output may be considered a test case failure in some
scenarios).

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9773
2025-09-30 09:30:25 +02:00
Nikolay
22698a007e lib/workingsetcache: add runtime finalizer to the cache
Follow-up for cea9505bab

fastcache.Cache allocates off-heap memory, which must be explicitly
returned to the pool with a Reset method call.

After the changes made in the commit above, during the cache transition from whole to
split mode, it's possible that the current cache is still referenced by the Cache.Get
or Cache.Call atomic pointers. It leads to potential memory leaks, since
we don't have any memory synchronization for atomic.Pointer.Store calls.

This commit adds a `Finalizer` to the `fastcache.Cache` instances.
It properly releases memory when the cache is no longer reachable.
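
The general pattern can be sketched as follows. This is not the code from the commit: `offHeapCache`, `newOffHeapCache` and `release` are hypothetical stand-ins for a cache that owns off-heap memory; only `runtime.SetFinalizer` and `sync.Once` are real standard library APIs.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// offHeapCache is a hypothetical stand-in for a cache whose memory
// must be returned explicitly. It is not fastcache.Cache.
type offHeapCache struct {
	releaseOnce sync.Once
	onRelease   func() // returns the memory to the pool; runs at most once
}

// release returns the memory. It is safe to call more than once.
func (c *offHeapCache) release() {
	c.releaseOnce.Do(c.onRelease)
}

// newOffHeapCache registers a finalizer as a safety net: if the last
// reference is dropped without an explicit release call, the garbage
// collector eventually runs the finalizer, which releases the memory.
func newOffHeapCache(onRelease func()) *offHeapCache {
	c := &offHeapCache{onRelease: onRelease}
	runtime.SetFinalizer(c, func(c *offHeapCache) { c.release() })
	return c
}

func main() {
	released := 0
	c := newOffHeapCache(func() { released++ })
	c.release()
	c.release() // no-op: the memory was already returned
	fmt.Println("released:", released)
}
```

Finalizers run at an unspecified time after the object becomes unreachable, so they work as a fallback rather than a replacement for explicit release calls.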

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9769
2025-09-30 09:30:24 +02:00
Nikolay
8f61d864ec lib/workingsetcache: properly transit cache state
Previously, the cache state transition from split into whole could leave
the cache in a broken state if the Reset cache method was called in switching
mode.

Also, cache Reset didn't start background workers and didn't change
the cache size.

This commit properly checks the mode during cache transition. In addition,
it no longer stops background workers after the whole mode transition and
always starts workers during start-up.

Access to the prev, curr and mode Cache fields is properly locked
in order to mitigate possible race conditions.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9769
2025-09-29 14:56:02 +02:00
Hui Wang
2954e75cdb vmalert: add -rule.resultLimit command-line flag to allow limiting … (#9737)
…the number of alerts or recording results a single rule can produce

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5792

(cherry picked from commit 30ac8cd3fa)
2025-09-29 13:13:29 +02:00
hagen1778
ed7fc48f9e apptest: remove vlogs related code
VictoriaLogs has a new home for integration tests
https://github.com/VictoriaMetrics/VictoriaLogs/tree/master/apptest

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit a1f0b792af)
2025-09-29 13:13:29 +02:00
hagen1778
4326dd0484 lib/streamaggr: prevent compiler from overoptimizing testing path
It seems that the Go compiler skipped computations and allocations for samples
because they weren't used afterwards. Sinking the results into a global variable
removes this optimization, and the benchmark starts showing allocations within the `pushSamples` fn.
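
A minimal sketch of the sink technique follows. It is not the `lib/streamaggr` benchmark; `scale`, `sink` and `measureAllocs` are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level variable; storing results here keeps the compiler
// from discarding the benchmarked work as dead code.
var sink []float64

// scale is a hypothetical function under benchmark.
func scale(in []float64, f float64) []float64 {
	out := make([]float64, len(in))
	for i, v := range in {
		out[i] = v * f
	}
	return out
}

func benchmarkScale(b *testing.B) {
	in := make([]float64, 1024)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink = scale(in, 2) // the result escapes into the global sink
	}
}

// measureAllocs runs the benchmark and returns allocations per operation.
func measureAllocs() int64 {
	return testing.Benchmark(benchmarkScale).AllocsPerOp()
}

func main() {
	fmt.Println("allocs/op:", measureAllocs())
}
```

Because every result is assigned to `sink`, the allocation inside `scale` cannot be elided and shows up in the reported allocations.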

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 50f75d751f)
2025-09-29 13:13:29 +02:00
hagen1778
46a227142f docs: fix links leading to legacy anchors
Change link to point to up-to-date documents instead
of pointing to legacy links.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 27f7bc81e0)
2025-09-29 13:13:28 +02:00
Artem Fetishev
768a13baf4 lib/storage: refactor tsid search (#9765)
- Make SearchTSIDs look similar to SearchMetricNames, i.e. search for metricIDs within the method
- Make the corresponding corrupted index test look similar to one for metric names search

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-09-26 16:09:48 +02:00
Artem Fetishev
ac394576e7 lib/storage: remove unused storage field from Search type
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-09-25 16:20:58 +02:00
hagen1778
5470db5642 deployment/docker: update Go builder from Go1.25.0 to Go1.25.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.25.1%20label%3ACherryPickApproved

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit f24bf391a4)
2025-09-25 11:48:23 +02:00
hagen1778
96f60c664d deployment: bump Grafana to v12.2.0
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit bc64ecfa3d)
2025-09-25 11:48:23 +02:00
hagen1778
d04feb8d0c deployment: bump node-exporter to v1.9.1
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit f0bbf6ec15)
2025-09-25 11:48:23 +02:00
hagen1778
ced7da736b deployment: bump alertmanager to v0.28.1
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit cff4bde4d6)
2025-09-25 11:48:23 +02:00
hagen1778
d4943c0b8b deployment: drop vlogs-example-alerts
It was moved to VictoriaLogs repo https://github.com/VictoriaMetrics/VictoriaLogs/blob/master/deployment/docker/vlogs-example-alerts.yml

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 1716f11677)
2025-09-25 11:48:23 +02:00
Phuong Le
b4aa8114c4 docs: update internal vendoring contribution guide (#9739)
### Describe Your Changes

Consistently use the `v0.0.0-YYYYMMDDHHMMSS-commit_hash` reference for
the internal deps such as `github.com/VictoriaMetrics/VictoriaMetrics`
dependency, since it allows referring any commit without waiting for the
release tag.
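
As an illustration of this reference format, the sketch below builds such a string from a commit time and hash. It assumes the usual Go pseudo-version layout (UTC timestamp plus a 12-character hash prefix); `pseudoVersion` and `examplePseudoVersion` are hypothetical helpers, and the commit time and hash are made up.

```go
package main

import (
	"fmt"
	"time"
)

// pseudoVersion builds a version string of the form
// v0.0.0-YYYYMMDDHHMMSS-<12-character commit hash prefix>.
// The timestamp is the commit time in UTC.
func pseudoVersion(commitTime time.Time, commitHash string) (string, error) {
	if len(commitHash) < 12 {
		return "", fmt.Errorf("commit hash %q is shorter than 12 characters", commitHash)
	}
	ts := commitTime.UTC().Format("20060102150405")
	return fmt.Sprintf("v0.0.0-%s-%s", ts, commitHash[:12]), nil
}

// examplePseudoVersion uses a made-up commit time and hash.
func examplePseudoVersion() string {
	t := time.Date(2025, time.September, 25, 9, 48, 22, 0, time.UTC)
	v, err := pseudoVersion(t, "0123456789abcdef0123456789abcdef01234567")
	if err != nil {
		panic(err)
	}
	return v
}

func main() {
	fmt.Println(examplePseudoVersion())
}
```

In practice the string rarely needs to be assembled by hand: running `go get` with a module path followed by `@<commit>` normally resolves the commit and writes the pseudo-version into `go.mod`.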

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit b4932ed2da)
2025-09-25 11:48:22 +02:00
hagen1778
c70c5f8977 fix a small typo
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 77f2ab139f)
2025-09-25 11:48:22 +02:00
Hui Wang
832b712c78 lib/protoparser: remove error log when marshaling an invalid comment or an empty HELP metadata line (#9732)
fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9710

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit 5537140074)
2025-09-25 11:48:22 +02:00
i2blind
8d99f964b3 docs/stream-aggregation: streamAggr.dedupInterval comments on old samples (#9731)
### Describe Your Changes

- Add comments to stream-aggregation README.md to clarify the effect
that the +flag will have on old samples
- Fix a spelling error with peridically to periodically in several files
that codespell-check caught.

Related to [#6775]

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
(cherry picked from commit 5d766bf7f1)
2025-09-25 11:48:22 +02:00
Andrii Chubatiuk
d60417d5ee app/vmui: reset select values, when 'ALL' selected (#9702)
### Describe Your Changes

Reset the selected items of the Select component when all items are selected.
This should speed up filtering on the alerting page in VMUI.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Artur Minchukou <aminchukov@victoriametrics.com>
(cherry picked from commit 5907239181)
2025-09-25 11:48:21 +02:00
Andrii Chubatiuk
5a6a72bc0e app/vmui: fix disabled state for select, textfield and datetimepicker components (#9698)
### Describe Your Changes

The select and textfield components look confusing while disabled: it's
impossible to tell whether they are disabled before interacting with them.
Updated the colors of the components in the disabled state.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 720c2bfa1d)
2025-09-25 11:48:21 +02:00
Arie Heinrich
f9ff91a82e docs: markdown, grammar and spelling (#9695)
### Describe Your Changes

This pull request consists of the following:

1. Markdown fixes
    following https://www.markdownguide.org/basic-syntax/
and https://github.com/markdownlint/markdownlint/blob/main/docs/RULES.md

- Add empty lines after headers or lists
- Remove extra lines between paragraphs
- Remove extra spaces at the end of a line
- Add language to code quote
- Consistent list (don't mix asterisks and dashes in the same file, choose one
and be consistent in the same file)
- Proper URL links
- Use meaningful context to URLs instead of "here".

2. Concise language

3. Grammar fixes

- removing extra spaces between words
- there are multiple ones but i picked the basic ones that triggered my
eye :)

4. Spelling fixes

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit e971e6102e)
2025-09-25 11:48:21 +02:00
Andrii Chubatiuk
575f279376 app/vmui: add minDate and maxDate parameters for DatePicker to allow limiting available dates to select (#9694)
Add the ability to limit the dates available in DatePicker using the `minDate`
and `maxDate` parameters. Dates before `minDate` and after `maxDate`
cannot be picked. The lower and upper bounds can be set independently.

The `minDate` and `maxDate` parameters aren't set by default in vmui.
The datepicker component with these params is re-used elsewhere.

(cherry picked from commit 5cd6d7cfba)
2025-09-25 11:48:21 +02:00
hagen1778
856908f4f1 docs: add question about old and out-of-order metrics to FAQ
The change also explicitly mentions the `out-of-order` phrase, as it is commonly
used in the Prometheus ecosystem.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 907aa1973a)
2025-09-25 11:48:20 +02:00
f41gh7
f6527ea8eb lib/storage: introduce metricNameSearch type
Searching metricName by metricID happens many times during a single API
call. This requires getting the current set of idbs before those calls
happen. Which is fine but requires propagating idbs across the code
base. This is also fine in case of OSS version as it is used in Search
only.

Propagating idbs across the code base becomes a problem in Enterprise
version as it is used in at least 3 places. As a result it becomes very
difficult to merge things from OSS to Ent.

Localizing all the dependencies in one searchMetricName type and
reusing this type everywhere should make things simpler.

Related enterprise changes:
https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/compare/search-metric-name-ent?expand=1

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9756
2025-09-24 15:39:00 +02:00
f41gh7
3226b1656b lib/storage: Move searchTSIDs to Storage
A small refactoring that reduces Search dependency on Storage:

- Move searchTSIDs() from Search to Storage because this method does not
depend on anything Search-specific but does depend on Storage.
- Use metricsTracker instead of storage.metricTracker.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9754
2025-09-24 15:35:54 +02:00
Artem Fetishev
06a15b352c lib/storage: rewrite search benchmarks to make it easy to add new cases (#9691)
Benchmarking the storage search API requires taking into account many
parameters, such as:
- data configuration: how many series, deleted series, search time range
- where the index data resides: prev and/or curr indexDB
- which search operation to measure

Adding a new benchmark use case currently involves a lot of boilerplate code.

This PR implements a framework for testing storage search ops that can
be extended relatively easily. This comes in especially handy when adding
new cases for the partition index.

The current set of params results in a lot of benchmarks being run,
which most probably does not make sense because:
- it will take a lot of time and
- the output data is hard to compare manually.

However, these benchmarks are very useful when only a small set of params
is of interest. For example, if I want to compare the search of 100k
metric names when the index data resides in prevOnly, currOnly or
prevAndCurr indexDBs. This would translate into the following cmd:

```shell
go test ./lib/storage --loggerLevel=ERROR -run=^$ -bench=^BenchmarkSearch/MetricNames/.*/VariableSeries/100000$
```

Why this change:
- I often need to run benchmarks with configs that I did not have
before, which requires either modifying an existing benchmark or writing a
new one. It is easy to get lost and make benchmarks non-comparable
- I need some way to make legacy and pt index benchmarks comparable

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-09-22 15:22:22 +02:00
Roman Khavronenko
d6d7616c2c docs: replace link to WITH templates playground (#9729)
The new link is shorter and has nice UI.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 95ca45d05a)
2025-09-22 13:55:29 +02:00
Nils K
13d9193efc docs: fix typo of dree -> free in formula (#9743)
(cherry picked from commit 828a2aaf17)
2025-09-22 13:55:29 +02:00
hagen1778
a7658e4263 docs: fix a few typos in the changelog
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 007ae5a3f0)
2025-09-22 13:55:29 +02:00
Zakhar Bessarab
e11448a8a7 docs/vmbackupmanager: add docs to clarify unsafe usage of lifecycle rules (#9728)
- state that it is unsafe to use lifecycle rules and describe the reason
- update formatting according to the latest changes in docs


---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-09-22 11:57:24 +04:00
Roman Khavronenko
0af6553878 docs: update vmagent diagram image (#9727)
The original image seems outdated by now.
Replacing it with the updated and more detailed version from
https://victoriametrics.com/blog/vmagent-key-features-explained/

Picture is created by @func25

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: func25 <phuongle0205@gmail.com>
2025-09-17 16:35:43 +03:00
Arie Heinrich
56c2abd54b Markdown, grammar and spelling (#9692)
### Describe Your Changes

This pull request consists of the following:

1. Markdown fixes
    following https://www.markdownguide.org/basic-syntax/
and https://github.com/markdownlint/markdownlint/blob/main/docs/RULES.md
- Add empty lines after headers or lists
- Remove extra lines between paragraphs
- Remove extra spaces at the end of a line
- Add language to code quote
- Consistent list (don't mix asterisks and dashes in the same file, choose one
and be consistent in the same file)
- Proper URL links
- Use meaningful context to URLs instead of "here".

2. Concise language

3. Grammar fixes

- removing extra spaces between words
- there are multiple ones but i picked the basic ones that triggered my
eye :)

4. Spelling fixes

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-09-17 13:53:29 +03:00
Aliaksandr Valialkin
8f042b800b app/vmauth: follow-up for 8ce4636bc0
- Rename copyStream to copyStreamToClient in order to make it more clear
  that the stream must be copied from backend to client.

- Make sure that the client implements net/http.Flusher interface.
  It is a programming error (BUG) if the client passed to copyStreamToClient
  doesn't implement net/http.Flusher interface.

- Do not write zero-length data to the backend.
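
The points above can be sketched roughly as follows. This is not the vmauth implementation: `copyAndFlush` and `demo` are hypothetical helpers, and the sketch returns an error where the commit treats a missing `net/http.Flusher` as a programming bug.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// copyAndFlush copies src to w chunk by chunk and flushes after every
// non-empty chunk, so the client sees data as soon as it arrives.
func copyAndFlush(w http.ResponseWriter, src io.Reader) error {
	f, ok := w.(http.Flusher)
	if !ok {
		return fmt.Errorf("response writer %T does not implement http.Flusher", w)
	}
	buf := make([]byte, 4096)
	for {
		n, err := src.Read(buf)
		if n > 0 { // skip zero-length writes
			if _, werr := w.Write(buf[:n]); werr != nil {
				return werr
			}
			f.Flush()
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// demo streams a fixed string into an in-memory recorder and reports
// the recorded body and whether Flush was called.
func demo() (string, bool) {
	rec := httptest.NewRecorder()
	if err := copyAndFlush(rec, strings.NewReader("line 1\nline 2\n")); err != nil {
		panic(err)
	}
	return rec.Body.String(), rec.Flushed
}

func main() {
	body, flushed := demo()
	fmt.Printf("body=%q flushed=%v\n", body, flushed)
}
```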

Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/667
2025-09-17 10:35:19 +02:00
Roman Khavronenko
357b6bd5ca deployment: drop logs-benchmark (#9726)
It has a new home now - see
https://github.com/VictoriaMetrics/VictoriaLogs/tree/master/deployment/logs-benchmark

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-09-16 17:15:19 +03:00
Dima Shur
7e7ae97486 Improvements for backup description and configuration for single node, cluster, quick start (#9459)
### Describe Your Changes

Updating backup-related documentation:
vmbackup, single node, cluster node, quick start to increase clarity and
improve doc structure

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2025-09-16 16:42:20 +03:00
dependabot[bot]
891941f910 build(deps): bump vite from 7.0.4 to 7.1.5 in /app/vmui/packages/vmui (#9706)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite)
from 7.0.4 to 7.1.5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/vitejs/vite/releases">vite's
releases</a>.</em></p>
<blockquote>
<h2>v7.1.5</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.5/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.4</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.4/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.3</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.3/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.2</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.2/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.1</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.1/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>create-vite@7.1.1</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/create-vite@7.1.1/packages/create-vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>plugin-legacy@7.1.0</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/plugin-legacy@7.1.0/packages/plugin-legacy/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>create-vite@7.1.0</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/create-vite@7.1.0/packages/create-vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.0</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.0/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.0-beta.1</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.0-beta.1/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.1.0-beta.0</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.1.0-beta.0/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.0.7</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.0.7/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.0.6</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.0.6/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
<h2>v7.0.5</h2>
<p>Please refer to <a
href="https://github.com/vitejs/vite/blob/v7.0.5/packages/vite/CHANGELOG.md">CHANGELOG.md</a>
for details.</p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md">vite's
changelog</a>.</em></p>
<blockquote>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.4...v7.1.5">7.1.5</a>
(2025-09-08)<!-- raw HTML omitted --></h2>
<h3>Bug Fixes</h3>
<ul>
<li>apply <code>fs.strict</code> check to HTML files (<a
href="https://redirect.github.com/vitejs/vite/issues/20736">#20736</a>)
(<a
href="14015d794f">14015d7</a>)</li>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20732">#20732</a>)
(<a
href="122bfbabeb">122bfba</a>)</li>
<li>upgrade sirv to 3.0.2 (<a
href="https://redirect.github.com/vitejs/vite/issues/20735">#20735</a>)
(<a
href="09f2b52e8d">09f2b52</a>)</li>
</ul>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.3...v7.1.4">7.1.4</a>
(2025-09-01)<!-- raw HTML omitted --></h2>
<h3>Bug Fixes</h3>
<ul>
<li>add missing awaits (<a
href="https://redirect.github.com/vitejs/vite/issues/20697">#20697</a>)
(<a
href="79d10ed634">79d10ed</a>)</li>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20676">#20676</a>)
(<a
href="5a274b29df">5a274b2</a>)</li>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20709">#20709</a>)
(<a
href="0401feba17">0401feb</a>)</li>
<li>pass rollup watch options when building in watch mode (<a
href="https://redirect.github.com/vitejs/vite/issues/20674">#20674</a>)
(<a
href="f367453ca2">f367453</a>)</li>
</ul>
<h3>Miscellaneous Chores</h3>
<ul>
<li>remove unused constants entry from rolldown.config.ts (<a
href="https://redirect.github.com/vitejs/vite/issues/20710">#20710</a>)
(<a
href="537fcf9186">537fcf9</a>)</li>
</ul>
<h3>Code Refactoring</h3>
<ul>
<li>remove unnecessary <code>minify</code> parameter from
<code>finalizeCss</code> (<a
href="https://redirect.github.com/vitejs/vite/issues/20701">#20701</a>)
(<a
href="8099582e53">8099582</a>)</li>
</ul>
<h2><!-- raw HTML omitted --><a
href="https://github.com/vitejs/vite/compare/v7.1.2...v7.1.3">7.1.3</a>
(2025-08-19)<!-- raw HTML omitted --></h2>
<h3>Features</h3>
<ul>
<li><strong>cli:</strong> add Node.js version warning for unsupported
versions (<a
href="https://redirect.github.com/vitejs/vite/issues/20638">#20638</a>)
(<a
href="a1be1bf090">a1be1bf</a>)</li>
<li>generate code frame for parse errors thrown by terser (<a
href="https://redirect.github.com/vitejs/vite/issues/20642">#20642</a>)
(<a
href="a9ba0174a5">a9ba017</a>)</li>
<li>support long lines in <code>generateCodeFrame</code> (<a
href="https://redirect.github.com/vitejs/vite/issues/20640">#20640</a>)
(<a
href="1559577317">1559577</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li><strong>deps:</strong> update all non-major dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20634">#20634</a>)
(<a
href="4851cab3ba">4851cab</a>)</li>
<li><strong>optimizer:</strong> incorrect incompatible error (<a
href="https://redirect.github.com/vitejs/vite/issues/20439">#20439</a>)
(<a
href="446fe83033">446fe83</a>)</li>
<li>support multiline new URL(..., import.meta.url) expressions (<a
href="https://redirect.github.com/vitejs/vite/issues/20644">#20644</a>)
(<a
href="9ccf142764">9ccf142</a>)</li>
</ul>
<h3>Performance Improvements</h3>
<ul>
<li><strong>cli:</strong> dynamically import <code>resolveConfig</code>
(<a
href="https://redirect.github.com/vitejs/vite/issues/20646">#20646</a>)
(<a
href="f691f57e46">f691f57</a>)</li>
</ul>
<h3>Miscellaneous Chores</h3>
<ul>
<li><strong>deps:</strong> update rolldown-related dependencies (<a
href="https://redirect.github.com/vitejs/vite/issues/20633">#20633</a>)
(<a
href="98b92e8c4b">98b92e8</a>)</li>
</ul>
<h3>Code Refactoring</h3>
<ul>
<li>replace startsWith with strict equality (<a
href="https://redirect.github.com/vitejs/vite/issues/20603">#20603</a>)
(<a
href="42816dee0e">42816de</a>)</li>
<li>use <code>import</code> in worker threads (<a
href="https://redirect.github.com/vitejs/vite/issues/20641">#20641</a>)
(<a
href="530687a344">530687a</a>)</li>
</ul>
<h3>Tests</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="564754061e"><code>5647540</code></a>
release: v7.1.5</li>
<li><a
href="09f2b52e8d"><code>09f2b52</code></a>
fix: upgrade sirv to 3.0.2 (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20735">#20735</a>)</li>
<li><a
href="14015d794f"><code>14015d7</code></a>
fix: apply <code>fs.strict</code> check to HTML files (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20736">#20736</a>)</li>
<li><a
href="122bfbabeb"><code>122bfba</code></a>
fix(deps): update all non-major dependencies (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20732">#20732</a>)</li>
<li><a
href="bcc31449c0"><code>bcc3144</code></a>
release: v7.1.4</li>
<li><a
href="0401feba17"><code>0401feb</code></a>
fix(deps): update all non-major dependencies (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20709">#20709</a>)</li>
<li><a
href="537fcf9186"><code>537fcf9</code></a>
chore: remove unused constants entry from rolldown.config.ts (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20710">#20710</a>)</li>
<li><a
href="79d10ed634"><code>79d10ed</code></a>
fix: add missing awaits (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20697">#20697</a>)</li>
<li><a
href="8099582e53"><code>8099582</code></a>
refactor: remove unnecessary <code>minify</code> parameter from
<code>finalizeCss</code> (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20701">#20701</a>)</li>
<li><a
href="f367453ca2"><code>f367453</code></a>
fix: pass rollup watch options when building in watch mode (<a
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/20674">#20674</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/vitejs/vite/commits/v7.1.5/packages/vite">compare
view</a></li>
</ul>
</details>
<details>
<summary>Maintainer changes</summary>
<p>This version was pushed to npm by [GitHub Actions](<a
href="https://www.npmjs.com/~GitHub">https://www.npmjs.com/~GitHub</a>
Actions), a new releaser for vite since your current version.</p>
</details>
<br />



Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-16 15:57:19 +03:00
Aliaksandr Valialkin
d314f04d25 app/vmauth: do not log requests canceled by the client, since this is an expected condition
See https://github.com/VictoriaMetrics/VictoriaLogs/issues/667#issuecomment-3297270128
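
One common way to recognize a client-side cancellation in Go is `errors.Is(err, context.Canceled)`. The sketch below assumes that approach; `shouldLog`, `canceledErr` and `errBackend` are hypothetical names and do not come from the vmauth code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errBackend is a made-up error standing in for a real backend failure.
var errBackend = errors.New("backend unavailable")

// shouldLog reports whether a proxying error is worth logging.
// Errors caused by the client canceling its own request are treated
// as expected and skipped.
func shouldLog(err error) bool {
	return err != nil && !errors.Is(err, context.Canceled)
}

// canceledErr simulates the error seen when the client goes away.
func canceledErr() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return fmt.Errorf("cannot proxy request: %w", ctx.Err())
}

func main() {
	fmt.Println(shouldLog(canceledErr()))
	fmt.Println(shouldLog(errBackend))
}
```

Wrapping with `%w` keeps the cancellation detectable through `errors.Is` even after context has been added to the error message.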
2025-09-16 11:59:23 +02:00
Max Kotliar
41b76706b3 docs: use canonical link 2025-09-16 12:43:21 +03:00
Aliaksandr Valialkin
fd47e7f798 app/vmauth: flush data chunks from backends to clients as soon as possible without buffering them at vmauth side
This allows the proper live tailing of responses from backends
such as VictoriaLogs live tailing - https://docs.victoriametrics.com/victorialogs/querying/#live-tailing

See https://github.com/VictoriaMetrics/VictoriaLogs/issues/667

Thanks to @func25 for the initial pull request at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9723
2025-09-16 11:11:01 +02:00
Andrii Chubatiuk
00e53b6c9e lib/timerpool: removed unneeded code, unified package usage (#9735)
### Describe Your Changes

Since Go 1.23, it is enough to stop the timer; there is no need to drain
its channel.
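
A simplified sketch of a timer pool without the drain step is shown below. It is not the `lib/timerpool` code; `getTimer`, `putTimer` and `firesAfterMillis` are hypothetical names, and the no-stale-value guarantee assumes the module is built with Go 1.23+ timer semantics.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

var pool sync.Pool

// getTimer returns a timer from the pool that fires after d.
func getTimer(d time.Duration) *time.Timer {
	if v := pool.Get(); v != nil {
		t := v.(*time.Timer)
		t.Reset(d)
		return t
	}
	return time.NewTimer(d)
}

// putTimer stops the timer and returns it to the pool.
// With Go 1.23+ timer semantics no stale value can be received from t.C
// after Stop returns, so the channel is not drained here.
func putTimer(t *time.Timer) {
	t.Stop()
	pool.Put(t)
}

// firesAfterMillis reports whether a pooled timer fires within one second.
func firesAfterMillis(ms int) bool {
	t := getTimer(time.Duration(ms) * time.Millisecond)
	defer putTimer(t)
	select {
	case <-t.C:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println(firesAfterMillis(5))
}
```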

related issue
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9721, but this
is not a fix for it

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-09-16 09:56:29 +02:00
Max Kotliar
3577a0ff8e docs: use canonical links 2025-09-16 10:23:28 +03:00
Max Kotliar
f290dfd7a3 docs: bump latest version in docs 2025-09-15 14:21:23 +03:00
Max Kotliar
43192c8384 deployment/docker: bump version 2025-09-15 14:08:43 +03:00
Max Kotliar
afb019d1a9 docs: bump last LTS versions 2025-09-15 12:29:16 +03:00
Max Kotliar
ff22e799d3 docs/CHANGELOG.md: update changelog with LTS release notes 2025-09-15 12:25:06 +03:00
Max Kotliar
c63e61141d docs: correct the available from version 2025-09-15 10:47:03 +03:00
Max Kotliar
026b2d612b docs/changelog: fix link; chore a bit 2025-09-12 20:23:12 +03:00
Max Kotliar
785086e5d6 docs/CHANGELOG.md: cut v1.126.0 2025-09-12 15:58:06 +03:00
Max Kotliar
ca1caaff8a docs: update version help tooltips 2025-09-12 15:55:10 +03:00
Max Kotliar
4b2861d04d security: do not mention exact lts versions
Provide link to LTS page where the version is updated.
2025-09-12 15:47:27 +03:00
Max Kotliar
a5acfb4886 app/vmselect: run make vmui-update 2025-09-12 15:33:54 +03:00
Zakhar Bessarab
0e2030ee12 app/vmbackupmanager: use full backup path for restore mark (#939)
1beb629b removed logic which was used in order to keep full backup
location path in the restore mark file. Because of this, backups created
with a shortname (e.g. `vmbackupmanager restore create
daily/2025-09-12`) will fail as backup location is not prepended.

Fix that by properly constructing full backup name from parsed canonical
values.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-09-12 14:51:45 +03:00
Max Kotliar
9548ccf0e5 vendor: update metrics package to v1.40.1 (#9725)
### Describe Your Changes

Includes fix https://github.com/VictoriaMetrics/metrics/pull/99

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-09-12 14:15:28 +03:00
Andrii Chubatiuk
b06b173fe1 app/vmui: fixed backend URL for multitenant endpoints (#9703)
### Describe Your Changes

vmui builds an incorrect endpoint when using the multitenant API. The bug was
introduced in PR #8989

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-09-12 13:50:42 +03:00
Hui Wang
253b3a6841 fix automatic issuing of TLS certificates (#935)
* fix automatic issuing of TLS certificates
2025-09-12 13:28:00 +03:00
Aliaksandr Valialkin
a36d195c51 lib/fs: sync the directory scheduled for removal after removing the deleteDirFilename file
This should help remove various metadata in the directory, which may be left
by some exotic filesystems such as OSSFS2.
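
A rough sketch of "remove a file, then sync its parent directory" is shown below. It is not the `lib/fs` code; `removeFileAndSyncDir`, `demo` and the `marker.tmp` file name are hypothetical, and the sketch assumes a Unix-like system where a directory can be opened and fsynced.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// removeFileAndSyncDir removes dir/name and then fsyncs dir, so the
// directory entry change is pushed to stable storage.
func removeFileAndSyncDir(dir, name string) error {
	if err := os.Remove(filepath.Join(dir, name)); err != nil {
		return err
	}
	d, err := os.Open(dir)
	if err != nil {
		return err
	}
	if err := d.Sync(); err != nil {
		_ = d.Close()
		return fmt.Errorf("cannot sync %q: %w", dir, err)
	}
	return d.Close()
}

// demo creates a temporary directory with a marker file, removes the
// marker via removeFileAndSyncDir and verifies that it is gone.
func demo() error {
	dir, err := os.MkdirTemp("", "syncdir")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	const marker = "marker.tmp" // hypothetical file name
	if err := os.WriteFile(filepath.Join(dir, marker), nil, 0o600); err != nil {
		return err
	}
	if err := removeFileAndSyncDir(dir, marker); err != nil {
		return err
	}
	if _, err := os.Stat(filepath.Join(dir, marker)); !os.IsNotExist(err) {
		return fmt.Errorf("marker still exists, stat error: %v", err)
	}
	return nil
}

func main() {
	fmt.Println(demo())
}
```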

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9709
and https://github.com/VictoriaMetrics/VictoriaLogs/issues/649 for details.
2025-09-11 17:45:06 +02:00
dependabot[bot]
82a6713e91 build(deps): bump actions/setup-go from 5 to 6 (#9688)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5 to
6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-go/releases">actions/setup-go's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Improve toolchain handling to ensure more reliable and consistent
toolchain selection and management by <a
href="https://github.com/matthewhughes934"><code>@​matthewhughes934</code></a>
in <a
href="https://redirect.github.com/actions/setup-go/pull/460">actions/setup-go#460</a></li>
<li>Upgrade Nodejs runtime from node20 to node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/setup-go/pull/624">actions/setup-go#624</a></li>
</ul>
<p>Make sure your runner is on version v2.327.1 or later to ensure
compatibility with this release. <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">See
Release Notes</a></p>
<h3>Dependency Upgrades</h3>
<ul>
<li>Upgrade <code>@​types/jest</code> from 29.5.12 to 29.5.14 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-go/pull/589">actions/setup-go#589</a></li>
<li>Upgrade <code>@​actions/tool-cache</code> from 2.0.1 to 2.0.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-go/pull/591">actions/setup-go#591</a></li>
<li>Upgrade <code>@​typescript-eslint/parser</code> from 8.31.1 to
8.35.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-go/pull/590">actions/setup-go#590</a></li>
<li>Upgrade undici from 5.28.5 to 5.29.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-go/pull/594">actions/setup-go#594</a></li>
<li>Upgrade typescript from 5.4.2 to 5.8.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-go/pull/538">actions/setup-go#538</a></li>
<li>Upgrade eslint-plugin-jest from 28.11.0 to 29.0.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-go/pull/603">actions/setup-go#603</a></li>
<li>Upgrade <code>form-data</code> to bring in fix for critical
vulnerability by <a
href="https://github.com/matthewhughes934"><code>@​matthewhughes934</code></a>
in <a
href="https://redirect.github.com/actions/setup-go/pull/618">actions/setup-go#618</a></li>
<li>Upgrade actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-go/pull/631">actions/setup-go#631</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/matthewhughes934"><code>@​matthewhughes934</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-go/pull/618">actions/setup-go#618</a></li>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-go/pull/624">actions/setup-go#624</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-go/compare/v5...v6.0.0">https://github.com/actions/setup-go/compare/v5...v6.0.0</a></p>
<h2>v5.5.0</h2>
<h2>What's Changed</h2>
<h3>Bug fixes:</h3>
<ul>
<li>Update self-hosted environment validation by <a
href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a>
in <a
href="https://redirect.github.com/actions/setup-go/pull/556">actions/setup-go#556</a></li>
<li>Add manifest validation and improve error handling by <a
href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a>
in <a
href="https://redirect.github.com/actions/setup-go/pull/586">actions/setup-go#586</a></li>
<li>Update template link by <a
href="https://github.com/jsoref"><code>@​jsoref</code></a> in <a
href="https://redirect.github.com/actions/setup-go/pull/527">actions/setup-go#527</a></li>
</ul>
<h3>Dependency  updates:</h3>
<ul>
<li>Upgrade <code>@​action/cache</code> from 4.0.2 to 4.0.3 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-go/pull/574">actions/setup-go#574</a></li>
<li>Upgrade <code>@​actions/glob</code> from 0.4.0 to 0.5.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-go/pull/573">actions/setup-go#573</a></li>
<li>Upgrade ts-jest from 29.1.2 to 29.3.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-go/pull/582">actions/setup-go#582</a></li>
<li>Upgrade eslint-plugin-jest from 27.9.0 to 28.11.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-go/pull/537">actions/setup-go#537</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/jsoref"><code>@​jsoref</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/setup-go/pull/527">actions/setup-go#527</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-go/compare/v5...v5.5.0">https://github.com/actions/setup-go/compare/v5...v5.5.0</a></p>
<h2>v5.4.0</h2>
<h2>What's Changed</h2>
<h3>Dependency updates :</h3>
<ul>
<li>Upgrade semver from 7.6.0 to 7.6.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-go/pull/535">actions/setup-go#535</a></li>
<li>Upgrade eslint-config-prettier from 8.10.0 to 10.0.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-go/pull/536">actions/setup-go#536</a></li>
<li>Upgrade <code>@​action/cache</code> from 4.0.0 to 4.0.2 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-go/pull/568">actions/setup-go#568</a></li>
<li>Upgrade undici from 5.28.4 to 5.28.5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-go/pull/541">actions/setup-go#541</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="4469467582"><code>4469467</code></a>
Bump actions/checkout from 4 to 5 (<a
href="https://redirect.github.com/actions/setup-go/issues/631">#631</a>)</li>
<li><a
href="e093d1e9bb"><code>e093d1e</code></a>
Node 24 upgrade (<a
href="https://redirect.github.com/actions/setup-go/issues/624">#624</a>)</li>
<li><a
href="1d76b952eb"><code>1d76b95</code></a>
Improve toolchain handling (<a
href="https://redirect.github.com/actions/setup-go/issues/460">#460</a>)</li>
<li><a
href="e75c3e80bc"><code>e75c3e8</code></a>
Bump <code>form-data</code> to bring in fix for critical vulnerability
(<a
href="https://redirect.github.com/actions/setup-go/issues/618">#618</a>)</li>
<li><a
href="8e57b58e57"><code>8e57b58</code></a>
Bump eslint-plugin-jest from 28.11.0 to 29.0.1 (<a
href="https://redirect.github.com/actions/setup-go/issues/603">#603</a>)</li>
<li><a
href="7c0b336c9a"><code>7c0b336</code></a>
Bump typescript from 5.4.2 to 5.8.3 (<a
href="https://redirect.github.com/actions/setup-go/issues/538">#538</a>)</li>
<li><a
href="6f26dcc668"><code>6f26dcc</code></a>
Bump undici from 5.28.5 to 5.29.0 (<a
href="https://redirect.github.com/actions/setup-go/issues/594">#594</a>)</li>
<li><a
href="8d4083a006"><code>8d4083a</code></a>
Bump <code>@​typescript-eslint/parser</code> from 5.62.0 to 8.32.0 (<a
href="https://redirect.github.com/actions/setup-go/issues/590">#590</a>)</li>
<li><a
href="fa96338abe"><code>fa96338</code></a>
Bump <code>@​actions/tool-cache</code> from 2.0.1 to 2.0.2 (<a
href="https://redirect.github.com/actions/setup-go/issues/591">#591</a>)</li>
<li><a
href="4de67c04ab"><code>4de67c0</code></a>
Bump <code>@​types/jest</code> from 29.5.12 to 29.5.14 (<a
href="https://redirect.github.com/actions/setup-go/issues/589">#589</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/setup-go/compare/v5...v6">compare
view</a></li>
</ul>
</details>
<br />


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-11 17:43:53 +02:00
Zhu Jiekun
216a6a42f4 docs: add contributing guide for vendor package
Some repos in the VictoriaMetrics organization vendor each other. To avoid
pull requests like
https://github.com/VictoriaMetrics/VictoriaLogs/pull/658, this pull
request adds a contributing guide for vendored packages.

Related: https://github.com/VictoriaMetrics/VictoriaLogs/issues/659
2025-09-11 17:42:56 +02:00
Nikolay
c37518cb7f app/vmagent: respect enable.auto.commit for kafka consumer
Previously, vmagent always set enable.auto.commit to false and committed
messages manually. This added pressure on the Kafka brokers and could slow down
data consumption.

This commit allows vmagent to skip the manual commit and use auto-commit
based on the provided configuration, which may improve message read throughput.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/931
2025-09-11 12:09:35 +02:00
Max Kotliar
a5b2f58bd5 docs: reduce redirects in docs (#9711)
### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-09-10 21:04:50 +03:00
Max Kotliar
8d06d82445 docs: reduce redirect in docs 2025-09-10 14:22:25 +03:00
Max Kotliar
7332dd44e0 docs: add slash at the end to avoid redirect (#9705)
### Describe Your Changes

Add a slash at the end of the link to avoid redirects. Remove `.html` in
links.

P.S. While working on this one, I found that anchors to guides are
broken. I'll address them in a separate PR.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-09-09 20:16:16 +03:00
Max Kotliar
52afcdbeb3 docs: use direct correct links (#9704)
Don't use legacy pages, use direct links to proper pages, avoid
redirects or alias (aka `.html`) pages.

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-09-09 17:27:33 +03:00
Max Kotliar
a1d3ee2462 docs: drop highlight query param from links
Text highlighting in the docs used to work, but it no longer does.
Removing it makes indexing the docs a bit more convenient.
2025-09-09 14:00:06 +03:00
Max Kotliar
4bb2244e33 Makefile: Add docs-update-flags command that syncs docs flags from the actual binaries (#9632)
### Describe Your Changes

This PR introduces a `make docs-update-flags` command that updates flags
in the documentation using the actual binaries compiled from the latest
`enterprise-single-node` and `enterprise-cluster` branches (hardcoded
for now). The command also normalizes the output format.

It can be run from any branch. All work happens inside temporary
directories under /tmp. The script checks out the required branch,
builds the binaries, and updates the documentation. The current Git
repository is not touched.

The command adjusts default values to more meaningful ones, such as
changing `-maxConcurrentInserts` (default 20) to (default
2*cgroup.AvailableCPUs()).

Currently the logic is implemented only for vminsert, vmstorage,
vmselect, vmagent, vmalert, and victoria-metrics (aka single).

The goal is to make it easy to keep documentation synchronized with real
binaries

_**Note:** Please ignore xxx_flags.md files for now. Review flags in
`README.md` and `Cluster-VictoriaMetrics.md`, and `vmagent.md`,
`vmalert.md` only. Once we agree on the changes in those files, I'll
replace the flags with the `{{% content "xxx_flags.md" %}}`._

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-09-09 13:32:31 +03:00
Aliaksandr Valialkin
4ab68f287d docs/victoriametrics/Articles.md: add new third-party articles about VictoriaMetrics 2025-09-08 17:56:24 +02:00
hagen1778
f61e51073c docs: clarify details of data migration
* stress the requirement to have an empty destination folder for copying;
* remove extra verbosity from docs;
* remove the list of vmctl migration options, as it became unsynced. Instead of syncing it,
  refer to the vmctl docs;
* fix typos.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit f8859574de)
2025-09-08 14:23:18 +02:00
Yury Molodov
68b5aaf5fb app/vmui: display vmselect version in footer (#9690)
### Describe Your Changes

* Updated `useFetchAppConfig` to respect the provided `serverUrl`.
* Added `vmselect` version display in the footer for easier debugging
and support.
<img width="1449" height="71" alt="image"
src="https://github.com/user-attachments/assets/228b4ed5-89c2-4e95-9436-ee464a7fd40b"
/>

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9d4a8ed799)
2025-09-05 17:13:12 +02:00
Roman Khavronenko
f8989105fb app/vmselect: encode application version into manifest (#9654)
The application version can then be displayed in vmui. Showing the
application version in vmui should make it easier to determine the VM
version currently in use (at least the vmselect version).

------------

@Loori-R it would be good to add the app version in vmui in a follow-up
PR or by pushing a commit to this branch.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit dfcfacd04f)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

# Conflicts:
#	app/vmselect/main.go
2025-09-05 17:11:37 +02:00
Roman Khavronenko
0e5c0f6fef vmalert: re-factoring follow-up (#9683)
Minor changes after the follow-up in
85f556f53e

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 28da18282b)
2025-09-05 17:10:09 +02:00
Arie Heinrich
13b3a854fe docs: markdown, grammar and spelling (#9686)
### Describe Your Changes

This pull request consists of the following:

1. Markdown fixes
following https://www.markdownguide.org/basic-syntax/
and https://github.com/markdownlint/markdownlint/blob/main/docs/RULES.md

* Add empty lines after headers or lists
* Remove extra lines between paragraphs
* Remove extra spaces at the end of a line
* Add language to code quote
* Consistent lists (don't mix asterisks and dashes in the same file; choose one
and be consistent)
* Proper URL links
* Use meaningful context to URLs instead of "here".

2. Concise language

3. Grammar fixes

* removing extra spaces between words
* there are many more, but I picked the basic ones that caught my
eye :)

4. Spelling fixes

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
Co-authored-by: Zhu Jiekun <jiekun@victoriametrics.com>
(cherry picked from commit db8e40f26c)
2025-09-05 17:10:08 +02:00
Andrii Chubatiuk
4d7e1e31df app/vmui: fixed alerting page on mobile devices (#9678)
### Describe Your Changes

fixed visualisation issues while opening alerting page on mobile devices
before:
<img width="334" height="68" alt="image"
src="https://github.com/user-attachments/assets/fb085c46-5e01-430e-b109-46971e377a48"
/>
<img width="337" height="452" alt="image"
src="https://github.com/user-attachments/assets/871affb8-c4dc-4d23-9958-fba9f77a5612"
/>
<img width="318" height="509" alt="image"
src="https://github.com/user-attachments/assets/a66c8634-3e3e-4bd7-abc8-ec1a7fa92318"
/>

after:
<img width="334" height="74" alt="image"
src="https://github.com/user-attachments/assets/8ad127f2-cc61-4297-97fa-d54910f31761"
/>
<img width="337" height="419" alt="image"
src="https://github.com/user-attachments/assets/15e9fb04-0873-4967-aa59-1370f2b0adaf"
/>
<img width="305" height="501" alt="image"
src="https://github.com/user-attachments/assets/8233a43a-70ce-4b15-afb2-d64a6b696038"
/>

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit e958d488b0)
2025-09-05 17:10:08 +02:00
Andrii Chubatiuk
735bad5d16 app/vmui: minor main components improvements (#9644)
`Table` component:
- add a `format` property for table columns, which applies custom
formatting depending on the column type
- add a `rowClasses` table property, which accepts a function that
customizes the row CSS class depending on the row value
- add a `rowAction` table property, which runs an action when a
table row is clicked

`Popper` component:
- add `classes` to specify additional CSS classes for popper to
differentiate from other poppers, since it's mounted to a DOM root

`Switch` component:
- use gap instead of left-margin

`DateTimeInput` component:
- add `dateOnly` property to allow accepting only date in the input

additional fixes:
- fix TopQuery header fields alignment

<img width="1279" height="125" alt="image"
src="https://github.com/user-attachments/assets/08ad4dbc-19e5-47f5-9ccd-a9fb222335a4"
/>

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 569f045728)
2025-09-05 17:10:08 +02:00
Max Kotliar
23651b7e0b go.mod: update metricsql lib to v0.84.8
https://github.com/VictoriaMetrics/metricsql/releases/tag/v0.84.8
2025-09-05 12:43:30 +03:00
Max Kotliar
2d70eeec3d docs/url-examples: align export example format with import example (#9681)
### Describe Your Changes

The export/import format can be confusing for users unfamiliar with its
syntax. To make matters worse, the format shown in [export
examples](https://docs.victoriametrics.com/victoriametrics/url-examples/#apiv1exportcsv)
was incompatible with the one used in [import
examples](https://docs.victoriametrics.com/victoriametrics/url-examples/#apiv1importcsv).

This PR updates the examples so they are compatible, allowing users to
follow the export and import steps to complete a full data cycle.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-09-04 14:27:40 +03:00
hagen1778
39e7633e5b docs: add change after 5854d9df72
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 3fff181e2c)
2025-09-04 12:34:36 +02:00
wbwren-eric
d06f78e631 app/vmui: set rateEnabled default value to false for probe_success (#9648)
### Describe Your Changes

Set rateEnabled to false for probe_success in VMUI

Fix issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9655

Problem:
probe_success is incorrectly initialized with rateEnabled = true because
the regex detecting counters (/_sum?|_total?|_count?/) matches partial
strings like _su. This causes probe_success (a gauge) to be treated as a
counter, producing slightly misleading graphs. For example, when
rateEnabled is set to true, probe_success often shows as 0 in VMUI when
the probe is actually succeeding.
It is not intuitive for users to have to disable rateEnabled manually
just to get the correct value for probe_success in VMUI.

Solution:
Update the regex to strictly match suffixes:
`/_sum$|_total$|_count$/`
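
The difference between the two patterns can be checked with a minimal sketch. The actual check lives in vmui (TypeScript); the Go rendering below is only an illustration, and the names `isCounterOld` and `isCounterNew` are hypothetical. The patterns are copied from the text above and mean the same thing in Go's regexp syntax.

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative Go equivalents of the two patterns quoted in the PR description.
// These names are hypothetical and do not exist in vmui.
var (
	// Old pattern: the trailing "?" makes the last letter optional,
	// so "_su", "_tota" and "_coun" match anywhere in the name.
	oldCounterRe = regexp.MustCompile(`_sum?|_total?|_count?`)
	// New pattern: only full suffixes at the end of the name match.
	newCounterRe = regexp.MustCompile(`_sum$|_total$|_count$`)
)

func isCounterOld(name string) bool { return oldCounterRe.MatchString(name) }

func isCounterNew(name string) bool { return newCounterRe.MatchString(name) }

func main() {
	names := []string{"probe_success", "http_requests_total", "request_duration_seconds_sum"}
	for _, name := range names {
		fmt.Printf("%-30s old=%v new=%v\n", name, isCounterOld(name), isCounterNew(name))
	}
}
```

With the old pattern `probe_success` is classified as a counter because it contains `_su`; with the new pattern it is not, while names ending in `_total` or `_sum` are still detected.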

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: William Wren <william.wren@ericsson.com>
(cherry picked from commit 5854d9df72)
2025-09-04 12:34:34 +02:00
Hui Wang
3700753060 vmalert: move the web types into sub-packages (#9560)
fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9551.

To avoid the group update blocking API calls for irrelevant resources
for too long, we don't lock the `m.groupsMu` during the [group
updates](fd928a0f5b/app/vmalert/manager.go (L100)).
And to avoid group changes during related API calls, a
[DeepCopy](61c5e8185c/app/vmalert/web_types.go (L341))
was used to copy needed group info, but it was not implemented correctly
and can't be implemented efficiently.
This pull request splits rule-related web types into sub-packages, which
should be clearer and easier to maintain.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 85f556f53e)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-09-04 12:34:32 +02:00
Aliaksandr Valialkin
07f9fb32f9 docs/victoriametrics/vmauth.md: update ./vmauth -help output with the newly added -mergeQueryArgs command-line flag at 272f6b2a46 2025-09-04 11:15:51 +02:00
Hui Wang
d5152c59b2 dashboards: add panel Storage full ETA in the vmstorage section (#9670)
(cherry picked from commit 08c835e79f)
2025-09-04 09:15:11 +02:00
Andrii Chubatiuk
cd68d7edef app/vmui: make sidebar scrollable and its items collapsible (#9662)
After the Alerting section was added, not all menu items fit in the
sidebar on mobile devices. This PR:

- makes the sidebar scrollable when its content overflows the screen
- makes sidebar items collapsible
- fixes menu layout on mobile devices with big screens

before:

<img width="1074" height="57" alt="image"
src="https://github.com/user-attachments/assets/6ae69487-d89a-4aaa-985b-de788be06cff"
/>

<img width="198" height="490" alt="image"
src="https://github.com/user-attachments/assets/0a494c52-6db7-4160-a04d-df69b88604dc"
/>

after:

<img width="1170" height="55" alt="image"
src="https://github.com/user-attachments/assets/57909536-0353-4be2-8d8f-4302b3bfe338"
/>

<img width="199" height="509" alt="image"
src="https://github.com/user-attachments/assets/43f33536-86eb-41b1-91d8-5b8ca95faeca"
/>

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 4c23f6913e)
2025-09-04 09:15:10 +02:00
Arie Heinrich
cf8633e956 docs: markdown, grammar and spelling (#9675)
### Describe Your Changes

This pull request consists of the following:

1. Markdown fixes
following https://www.markdownguide.org/basic-syntax/
and https://github.com/markdownlint/markdownlint/blob/main/docs/RULES.md

- Add empty lines after headers or lists
- Remove extra lines between paragraphs
- Remove extra spaces at the end of a line
- Add language to code quote
- Consistent lists (don't mix asterisks and dashes in the same file; choose one
and be consistent)
- Proper URL links
- Use meaningful context to URLs instead of "here".

2. Concise language

3. Grammar fixes

- removing extra spaces between words
- there are many more, but I picked the basic ones that caught my
eye :)

4. Spelling fixes

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 8411675d55)
2025-09-04 09:15:10 +02:00
hagen1778
8e37b54ba1 docs: change update note to known issues for consistency
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit ba5cacbe60)
2025-09-04 09:15:10 +02:00
hagen1778
82be3772fd docs: update changelog with fixes in recent releases
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 1d2d0c49cc)
2025-09-04 08:59:32 +02:00
f41gh7
d7f3eaba15 docs: add v1.125.1 and v1.110.18 releases
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-09-03 22:10:31 +02:00
Aliaksandr Valialkin
df7bd0a576 docs/victoriametrics/vmauth.md: typo fix: sepcified -> specified 2025-09-03 15:57:56 +02:00
andriibeee
cb6a36885e app/vmauth: fix unauthorized_user routing inconsistency
This commit makes vmauth respect the routing config for unauthorized
requests when a request has an Authorization header but fails to
authorize successfully.

It covers the following use cases:
- vmauth is used as a load balancer and must forward requests as is. There are no authorization configs.
- vmauth has an authorization config, but it must forward requests with invalid credential tokens to some other backend.

related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7543

---------
Signed-off-by: Andrii <andriibeee@gmail.com>
2025-09-03 15:54:01 +02:00
Aliaksandr Valialkin
6a9e84e796 app/vmauth: add an ability to merge the given client query args with the query args specified at the backend url
This is needed for VictoriaLogs, which allows limiting query results with the given set of extra filters
specified via extra_filters query arg. The request url can contain multiple extra_filters query args -
they are all applied with AND logic to the query. See https://docs.victoriametrics.com/victorialogs/querying/#extra-filters

The merge_query_args option at vmauth allows merging the extra_filters provided by the client
(such as Grafana plugin for VictoriaLogs or built-in web UI) with the extra_filters specified in the backend
url at vmauth config.
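
The merging idea can be sketched as follows. This is not vmauth's code: `mergeArg` is a hypothetical helper, the filter values are arbitrary placeholders, and the ordering (backend values first, then client values) is an assumption made for the example.

```go
package main

import (
	"fmt"
	"net/url"
)

// mergeArg is a hypothetical helper, not part of vmauth.
// It returns all values of the query arg `name` found in the backend URL query,
// followed by all values of the same arg supplied by the client.
func mergeArg(name, backendRawQuery, clientRawQuery string) ([]string, error) {
	backend, err := url.ParseQuery(backendRawQuery)
	if err != nil {
		return nil, fmt.Errorf("cannot parse backend query: %w", err)
	}
	client, err := url.ParseQuery(clientRawQuery)
	if err != nil {
		return nil, fmt.Errorf("cannot parse client query: %w", err)
	}
	// Copy the backend values so the parsed map is not modified.
	merged := append([]string{}, backend[name]...)
	merged = append(merged, client[name]...)
	return merged, nil
}

func main() {
	// Placeholder filter values; real extra_filters contents depend on the setup.
	merged, err := mergeArg("extra_filters", "extra_filters=env:prod", "extra_filters=app:web&query=error")
	if err != nil {
		panic(err)
	}
	fmt.Println(merged)
}
```

Since all `extra_filters` values are applied with AND logic, keeping both the backend-side and the client-side values narrows the result set rather than letting one side replace the other.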

This is needed for https://github.com/VictoriaMetrics/VictoriaLogs/issues/106
2025-09-03 15:51:36 +02:00
f41gh7
e0a2686fcd CHANGELOG.md: cut v1.125.1 release 2025-09-03 15:40:31 +02:00
f41gh7
da1e2ed1e9 make vmui-update 2025-09-03 15:32:31 +02:00
Artem Fetishev
c244c83eff lib/workingsetcache: properly count workingsetcache metrics
`workingsetcache` is built on top of two
[fastcache](https://github.com/VictoriaMetrics/fastcache) instances
(curr and prev) that are rotated periodically (configurable via
`-cacheExpireDuration` flag). During the rotation curr becomes prev,
the old prev is discarded, and the new curr is empty. If an entry is not found in
curr then the prev cache is checked, and if the entry is found there it
is copied to curr.

`workingsetcache` also exports metrics, such as `EntriesCount`,
`GetCalls`, `SetCalls`, and `Misses` counts. These metrics are currently
implemented as the sum of the same metrics in prev and curr `fastcache`
instances. Given the rotation logic, these counts can be incorrect:

1. `EntriesCount`. It is the sum of prev and curr entry counts. If an
entry is not found in curr and found in prev (and therefore is copied
from prev to curr) the resulting entry count will be incorrect, i.e. it
will count copied entries two times.
2. `GetCalls`. It is the sum of prev and curr get calls. If an entry is
not found in curr the logic will attempt to retrieve it from prev, which
will result in double counting. While it is actually one get call to
`workingsetcache`.
3. `SetCalls`. It is the sum of prev and curr set calls. If an entry is
not found in curr but found in prev it will be copied to curr resulting
in a set call to curr. While from the `workingsetcache` perspective
there hasn't been any set operation at all.
4. `Misses`. It is the sum of prev and curr misses. If an entry is not
found in curr, it is recorded as a miss. If it is then found in prev,
the entry is returned to the caller, but that cache miss remains. If it
is not found in prev, then there will be 2 misses for 1
`workingsetcache` get call.

This PR introduces `GetCalls`, `SetCalls`, and `Misses` counts at the
`workingsetcache` level in order to count the calls correctly. It also
excludes duplicates from `EntriesCount`.
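
The counting rules above can be illustrated with a minimal sketch. It is not the actual `workingsetcache` code: plain Go maps stand in for the two `fastcache` instances, and the type `twoGenCache`, its fields, and the way `EntriesCount` skips duplicates are all assumptions made for illustration.

```go
package main

import "fmt"

// twoGenCache is a hypothetical, simplified stand-in for workingsetcache.
// Plain maps replace the two fastcache instances.
type twoGenCache struct {
	curr, prev map[string]string

	// Counters are kept at the wrapper level instead of being summed
	// from the underlying generations.
	getCalls, setCalls, misses uint64
}

func newTwoGenCache() *twoGenCache {
	return &twoGenCache{curr: map[string]string{}, prev: map[string]string{}}
}

// Set stores the entry in curr and counts one set call.
func (c *twoGenCache) Set(k, v string) {
	c.setCalls++
	c.curr[k] = v
}

// Get counts exactly one get call and at most one miss per lookup.
func (c *twoGenCache) Get(k string) (string, bool) {
	c.getCalls++
	if v, ok := c.curr[k]; ok {
		return v, true
	}
	if v, ok := c.prev[k]; ok {
		// Promote the entry without touching setCalls:
		// the caller did not perform a set operation.
		c.curr[k] = v
		return v, true
	}
	c.misses++
	return "", false
}

// Rotate makes curr the new prev and starts with an empty curr.
func (c *twoGenCache) Rotate() {
	c.prev = c.curr
	c.curr = map[string]string{}
}

// EntriesCount counts every key once, even if it lives in both generations.
func (c *twoGenCache) EntriesCount() int {
	n := len(c.curr)
	for k := range c.prev {
		if _, ok := c.curr[k]; !ok {
			n++
		}
	}
	return n
}

func main() {
	c := newTwoGenCache()
	c.Set("a", "1")
	c.Rotate()
	c.Get("a") // found in prev, copied to curr
	c.Get("b") // absent from both generations
	fmt.Println(c.EntriesCount(), c.getCalls, c.setCalls, c.misses)
}
```

Summing per-generation counters for the same sequence would instead report two entries, extra get calls, an extra set call, and extra misses.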

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9553
2025-09-03 15:26:32 +02:00
Phuong Le
82ff49756b flag: introduce new flag ExtendedDuration
Related: https://github.com/VictoriaMetrics/VictoriaLogs/issues/50
2025-09-03 15:26:31 +02:00
Yury Molodov
5dab0c6250 vmui: fix useSearchParamsFromObject not updating searchParams
Fix bug in `useSearchParamsFromObject` hook that prevented filtering on
the *Explore Cardinality* page.

 Bug was introduced at 483e00ffb9

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9674
2025-09-03 15:26:31 +02:00
Felix Yan
b6bacdad8e docs: correct a typo in vmalert.md (#9668)
### Describe Your Changes

Correct a typo in vmalert.md

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-09-03 13:29:56 +03:00
Andrii Chubatiuk
249c48a3ee app/vmui: reuse codeexample component in alerts tab (#9649)
### Describe Your Changes

reuse codeexample component in vmui alerts page

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 63a6b9b863)
2025-09-02 14:44:45 +02:00
Dmytro Kozlov
c7d4ba82fe benchmark: add gnuplot to show write speed (#9490)
### Describe Your Changes

Implemented the script that generates graphs using `gnuplot`.
Those graphs show the write speed to the db.
How to use it:
1. From the root run `make tsbs`;
2. The file will be generated automatically
`/tmp/tsbs-load-100000-2025-07-22T00:00:00Z-2025-07-23T00:00:00Z-80s.csv`
3. From the root run `make tsbs-plot-load` and observe the result
4. If you have two files with the `tsbs_load_victoriametrics` output,
just define the second in the
`TSBS_LOAD_RESULT_CSV_FILE_COMPARE=/tmp/tsbs-load-100000-2025-07-22T01:00:00Z-2025-07-23T01:00:00Z-80s.csv`
To plot the measurements from some other benchmark, run
`make tsbs-plot-load TSBS_LOAD_RESULT_CSV_FILE=/path/to/file.csv`

To plot the measurements from two benchmarks, run
`make tsbs-plot-load TSBS_LOAD_RESULT_CSV_FILE=/path/to/file1.csv
TSBS_LOAD_RESULT_CSV_FILE_COMPARE=/path/to/file2.csv`

This command should generate a graph like described in the picture

<img width="638" height="578" alt="Screenshot 2025-07-25 at 15 35 42"
src="https://github.com/user-attachments/assets/900b05ab-0b98-4f7f-8f2c-18d28ad2eab6"
/>

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Artem Fetishev <rtm@victoriametrics.com>
Co-authored-by: Artem Fetishev <149964189+rtm0@users.noreply.github.com>
(cherry picked from commit fd23f6bfb3)
2025-09-02 14:44:44 +02:00
Max Kotliar
42955f6b06 docs/stream-aggregation: Add deduplication common mistake (#9659)
### Describe Your Changes

Fix a stream aggregation pitfall when deduplication intervals differ
between storage and vmagent.

Follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9581

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 1b8dc8a94c)
2025-09-02 14:44:44 +02:00
Phuong Le
0fea23374e docs: fix localhost link (#9661)
(cherry picked from commit 9109e2e7c3)
2025-09-02 14:44:44 +02:00
hagen1778
72aedd30f6 docs: re-organize changelog lines by priority and components
This helps to improve readability of changes, so users
can see more important changes first, and see changes related
to the same component one after another.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit bc75bbfbe7)
2025-09-02 14:44:44 +02:00
hagen1778
b5279502c6 dashboards: update descriptions for resource usage panel
The new description content is courtesy of @func25

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit dd19a17ef6)
2025-09-02 14:44:43 +02:00
f41gh7
1138fa853f app/vmselect: properly route requests for config.json
Bug was introduced during back-porting changes from single-node to the
cluster branch.

Follow-up after: 7f15e9f64c
2025-09-01 21:50:50 +02:00
Max Kotliar
ca907a239f .github/workflow: add check commit signed action (#9639)
### Describe Your Changes

    .github/workflow: add check commit signed action
    
    Add GitHub Action to verify commit signatures.
    
This action checks commit signatures, accepting G (good) and E (signed
    but key not available for full verification).
    
Note: This is not a 100% accurate check. The CI mainly targets unsigned
    commits from external contributors.
    
Reference:
https://git-scm.com/docs/pretty-formats#Documentation/pretty-formats.txt-G

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-09-01 18:49:59 +03:00
minxinyi
9049be2733 refactor: use the built-in max/min to simplify the code (#9525)
use the built-in max/min to simplify the code

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: minxinyi <minxinyi6@outlook.com>
2025-09-01 18:49:27 +03:00
hagen1778
a0edaa415e docs: update flag description for Kafka related flags
Follow-up after 0278bc5d9a

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 611e96d875)
2025-09-01 16:37:32 +02:00
Roman Khavronenko
a4e237fcd3 docs: move vmagent's Kafka integration to /integrations page (#9658)
This change requires a follow-up commit to update cmd-line flags in ENT
version.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 0278bc5d9a)
2025-09-01 16:28:49 +02:00
Andrii Chubatiuk
a7141bc025 docs: exclude updated files from rendering and from sitemap.xml (#9616)
### Describe Your Changes

fixes https://github.com/VictoriaMetrics/vmdocs/issues/164

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit a585d95365)
2025-09-01 16:28:48 +02:00
Arie Heinrich
e596940a7e docs: markdown, grammar and spelling (#9650)
### Describe Your Changes

As there are quite a few files, and each file might have multiple
changes, I limited the PR to 5 files at a time to make it easy to
review.

I suggest you take a look at markdownlint and add it as part of your CI,
similar to
https://github.com/MicrosoftDocs/PowerShell-Docs/blob/main/.markdownlint.yaml
And while at it, take a look at cspell and how it's used in their repo,
and replace the Python one you have in your current implementation
(I might open a PR with it after all the fix PRs).

This pull request consists of the following:

1. Markdown fixes
    following https://www.markdownguide.org/basic-syntax/
and https://github.com/markdownlint/markdownlint/blob/main/docs/RULES.md

   - Add empty lines after headers or lists
   - Remove extra lines between paragraphs
   - Remove extra spaces at the end of a line
   - Add language to code quote
- Consistent list (don't mix asterisks and dashes in the same file, choose one
and be consistent in the same file)
   - Proper URL links
   - Use meaningful context to URLs instead of "here".

2. Concise language

3. Grammar fixes
    - removing extra spaces between words
- there are multiple ones, but I picked the basic ones that caught my
eye :)

4. Spelling fixes

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit b5578fcac2)
2025-09-01 16:28:48 +02:00
Roman Khavronenko
9c5cd74ee0 docs: move vmagent's pubsub integration to /integrations page (#9656)
This change requires a follow-up commit to update cmd-line flags in ENT
version.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 86334534f6)
2025-09-01 16:28:48 +02:00
f41gh7
4d7d70029d docs: replace v1.124.0 with v1.125.0 release
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-09-01 12:58:09 +02:00
f41gh7
f835d93bfc docs: mention LTS releases
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-09-01 12:54:41 +02:00
Aliaksandr Valialkin
e8988eefb8 lib/fs: remove fsync for the parent directory from MustMkdirIfNotExist(), MustMkdirFailIfExist(), MustHardLinkFiles() and MustCopyDirectory()
This allows performing a single MustFsyncPath() for the parent directory after multiple calls to these functions.
This clarifies code paths, which call these functions, and makes them more maintainable.

This also removes a redundant fsync() call for the parent directory when creating a file-based part.
Previously the first fsync() was indirectly called when the directory was created via MustMkdirFailIfExist()
and the second fsync() was called via MustSyncPathAndParentDir() after all the data is written to the part.
2025-08-30 01:55:23 +02:00
Aliaksandr Valialkin
6abefaae30 lib/persistentqueue/persistentqueue.go: remove fs.MustSyncPath() call after fs.MustWriteSync()
The fs.MustWriteSync() already fsyncs the created file, so there is no need for an additional fsync() call.

While at it, add missing fsync for the parent directory after creating a directory for persistent queue.
2025-08-30 01:55:23 +02:00
Aliaksandr Valialkin
06c30c4ea5 lib/backup/fsremote/fsremote.go: remove unneeded fsync for the hard-linked file
The source file contents should be already fsynced to disk before creating a hard link,
so there is no sense in calling fsync() on the created hard link.
2025-08-30 01:55:22 +02:00
Aliaksandr Valialkin
7bb9c895fd docs/victoriametrics/sd_configs.md: fix internal links to different Kubernetes service discovery roles
This is a follow-up for the commit 51aebcd061
2025-08-29 16:28:37 +02:00
Aliaksandr Valialkin
4b190cb274 docs/victoriametrics/Articles.md: add https://amir-shams.medium.com/why-victoriametrics-a-practical-guide-to-scalable-and-faster-monitoring-than-prometheus-54ef21f10465 2025-08-29 16:28:36 +02:00
f41gh7
f0391334da CHANGELOG.md: cut v1.125.0 release 2025-08-29 14:36:47 +02:00
Hui Wang
73d712aeec fix vmcluster docker-compose example (#9643)
1. fix vmcluster docker-compose example: vminsert scrape job and vmagent
remote write authorization.
2. upgrade grafana to v12.1.1
2025-08-29 14:32:25 +02:00
f41gh7
47b4a60128 make vmui-update 2025-08-29 14:18:53 +02:00
Aliaksandr Valialkin
88d73cc142 docs/victoriametrics/sd_configs.md: add titles per every target role in service discovery configs
This allows referring to per-role docs via direct links to the corresponding sub-chapters with the given titles
2025-08-29 13:28:59 +02:00
Artem Fetishev
51fa3de6cb lib/storage: fix double counting in vm_deleted_metrics_total
The vm_deleted_metrics_total metric value represents the number of
metricIDs stored in deletedMetricIDs cache. This cache lives at the
storage level and stores the deleted metrics from both prev and curr
idbs. However, the metric is populated at the idb level. Since there are
always 2 idbs (prev and curr), the value is populated twice. Hence the
doubled value of the metric.

The fix is to populate the metric value at the storage level.

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9602
2025-08-29 11:50:35 +02:00
Andrii Chubatiuk
7f15e9f64c app/vmui: craft UI configuration on backend instead of using /flags endpoint and static config.json file
- load and parse static`/vmui/config.json`, modify it according to
runtime values and use it as a replacement for static config.json
- remove using `/flags` endpoint for checking features, that should be
enabled on VMUI

 Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9635
2025-08-29 10:59:39 +02:00
Andrii Chubatiuk
1be5ddb002 app/vmui: removed home page hack
`router.home` represents `/` path, which is the same for all UI apps,
but content and title for root path differs depending on application
type. added `getDefaultOptions` function, which returns proper home
route configuration depending on application type, which allows to
remove renamings in respective layouts

 Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9641
2025-08-29 10:45:36 +02:00
Hui Wang
afdaeb91eb lib/httpserver: properly issue automatic TLS certificates
Bug was introduced at commit 93ad502d6dcb4724e8ec40a4a0351b0316853af0

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/930
2025-08-29 10:09:08 +02:00
hagen1778
f7d5eaa700 deployment/docker: replace single-node image on transparent background version
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 87b291debe)
2025-08-28 20:18:32 +02:00
hagen1778
9663232659 deployment/docker: strip victorialogs images from excalidraw sources
VictoriaLogs excalidraw images should be stored in VictoriaLogs repo

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit cce1cdcb6d)
2025-08-28 20:18:32 +02:00
hagen1778
36802fc9c5 deployment/docker: use light and dark images for github markdown for cluster images
This is an attempt to adjust image styles to GitHub themes, because
existing images with transparent background become unreadable on dark theme.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 03e003c828)
2025-08-28 20:18:31 +02:00
hagen1778
a805e5ddb2 deployment/docker: use light and dark images for github markdown
This is an attempt to adjust image styles to GitHub themes, because
existing images with transparent background become unreadable on dark theme.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit ad9d11ba3f)
2025-08-28 20:18:31 +02:00
hagen1778
fbcffdf27e deployment/docker: rm victorialogs images
The vlogs images were moved to VictoriaLogs github repo
and aren't needed here anymore.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 5c2ed99dab)
2025-08-28 20:18:31 +02:00
Max Kotliar
6f9b74b1e6 lib/prommetadata: Extract -enableMetadata flag to separate package, avoid pulling in promscrape discovery flags into vminsert
The commit
25cd5637bc
introduced the `-enableMetadata` flag and the
`promscrape.IsMetadataEnabled()` function, which is now used in multiple
places, including the `app/vminsert/prometheusimport` [request
handler](b24b76ff08/app/vminsert/prometheusimport/request_handler.go (L36)).
    
Because of the use of `promscrape` package vminsert registered all
`-promscrape.*` service discovery flags, which were not relevant for
`vminsert`.
    
This change moves the metadata flag logic into a dedicated package,
preventing vminsert from unintentionally loading unrelated promscrape
flags.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9631
2025-08-28 17:33:24 +02:00
f41gh7
a1ed3fa888 follow-up after 76eb654e7e
mention change at changelog
2025-08-28 17:03:15 +02:00
Oron Sharabi
76eb654e7e lib/storage: improve searchLabels and searchLabelValues performance
When having a `match` of `__name__` key alone for labels api, it's going
to hit max series limit in case of high cardinality metric name.
Instead, we can skip looking by `metricIDs` and fallback to inverted
index scan with a `composite key` since we only have some `__name__` and
a label name.
 
 Common requests for optimisations are:
1) /api/v1/labels?match=up or /api/v1/labels?extra_filters=up
2) /api/v1/label/job/values?match=up or /api/v1/labels?extra_filters =up

 It's widely used by grafana variables.

 Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9489
2025-08-28 17:02:19 +02:00
Charles-Antoine Mathieu
f8d5b080e7 app/vmselect/graphite: enforce search.maxQueryLen for Graphite queries
This commit ensures that the -search.maxQueryLen flag applies to Graphite
queries, matching the behavior already present for Prometheus queries.
Previously, Graphite queries could bypass this limit, creating an
inconsistency and a potential vector for resource exhaustion.

Key changes:

Added getMaxQueryLen() to access the global query length limit.
Enforced query length validation in execExpr() for Graphite queries.
Added comprehensive tests for the new validation logic and edge cases.
Error messages are consistent with Prometheus query validation.
The default limit is 16KB (configurable via -search.maxQueryLen).
Setting the limit to 0 disables validation.
This change closes the gap where Graphite queries could exceed
configured length limits, providing consistent protection against
excessively long queries across both query APIs.

Follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9534
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9600
2025-08-28 16:33:34 +02:00
f41gh7
411874dd19 deployment/docker: switch from musl to glibc
This should remove vertical scalability limit for data ingestion at VictoriaMetrics running on machines with big number of CPU cores.

Related issue: https://github.com/VictoriaMetrics/VictoriaLogs/issues/517
2025-08-28 16:26:56 +02:00
Artem Fetishev
99a85f8288 lib/uint64set: Optimize subtract operation
a.Subtract(b) performance degrades as b becomes bigger than a. For
example if len(b2) == 10xlen(b1) then time(a.Subtract(b2)) == 10x
time(a.Subtract(b1)).

A quick fix is to iterate over a's elements if len(b) > len(a). Iterating
over a's elements and at the same time deleting should be safe since no
elements are actually deleted (i.e. memory freed, etc). Deletion here
means setting a corresponding bit from 1 to 0.
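
The idea can be sketched with a plain Go map instead of the bitmap-based `uint64set.Set`. The `set` type and `subtract` function below are hypothetical and only illustrate the choice of which side to iterate over.

```go
package main

import "fmt"

// set is a simplified stand-in for uint64set.Set (which is bitmap-based).
type set map[uint64]struct{}

// subtract removes from a every element that is also present in b.
func subtract(a, b set) {
	if len(b) > len(a) {
		// b is larger: walk a and probe b, so the cost depends on len(a).
		// Deleting map entries while ranging over the same map is safe in Go.
		for x := range a {
			if _, ok := b[x]; ok {
				delete(a, x)
			}
		}
		return
	}
	// b is not larger: walk b and delete its elements from a.
	for x := range b {
		delete(a, x)
	}
}

func main() {
	a := set{1: {}, 2: {}, 3: {}}
	b := set{}
	for i := uint64(2); i <= 1000; i++ {
		b[i] = struct{}{}
	}
	subtract(a, b)
	fmt.Println(len(a)) // only element 1 is left
}
```

With this branch the number of iterations is bounded by the smaller of the two sets rather than by len(b).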

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9602
2025-08-28 16:26:55 +02:00
Max Kotliar
7969b48ee4 app/vmselect/clusternative: sync -clusternative.maxConcurrentRequests flag description with one from docs.
It reduces the diff between flag descriptions in docs and actual
binaries.

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9632/files
2025-08-28 12:13:41 +03:00
Max Kotliar
52a27f41d8 lib/flagutil: fix flag description. 2025-08-27 20:08:52 +03:00
Max Kotliar
95ffea5d32 Revert "docs: sync documented flags with binaries"
This reverts commit 7c0c8cc702.
2025-08-27 19:10:57 +03:00
Aliaksandr Valialkin
b24b76ff08 go.mod: update github.com/valyala/gozstd from v1.22.0 to v1.23.2 2025-08-27 14:29:10 +02:00
Artem Fetishev
227ec25795 benchmarks: support for all query types in TSBS (#9630)
### Describe Your Changes

Add the support of all standard TSDB query types that can be executed
against VictoriaMetrics. `double-groupby-all` is commented out as it
attempts to retrieve all 1B samples and fails. While this can be fixed
by setting the `-search.maxSamplesPerQuery` this query is left disabled
anyway because it will consume way too much memory and cpu time.

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
(cherry picked from commit d0690ba15f)
2025-08-27 13:51:53 +02:00
Andrii Chubatiuk
e040cfdae3 vmui: replace VMAlert proxy with Alerting tab in VMUI (#8989)
### Describe Your Changes

Rules page header + content
<img width="1235" height="520" alt="image"
src="https://github.com/user-attachments/assets/bb0c5818-c44a-46e6-bc47-e6718be34016"
/>
Expanded rule without alert
<img width="1418" alt="image"
src="https://github.com/user-attachments/assets/ae0b265f-24fe-4549-8913-b1be8e7c2862"
/>
Expanded rule with alert
<img width="1418" alt="image"
src="https://github.com/user-attachments/assets/8a138403-0712-4de2-bfa5-467da3a979dd"
/>
Notifiers page
<img width="1419" alt="image"
src="https://github.com/user-attachments/assets/557c2831-e960-44ec-9b93-f1ebfeb1fbb0"
/>

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8330
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6091
fixes https://github.com/VictoriaMetrics/VictoriaLogs/issues/90

VMUI:
- Added `Alerting -> Rules` and `Alerting -> Notifiers` pages for
VictoriaMetrics
- Support includeAll option in Select component

VMAlert:
- added `/api/v1/group`, useful to get information about a certain group
- added `lastError` for `/api/v1/notifiers` for each target to see
information about failed notifiers

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 483e00ffb9)
2025-08-27 13:51:50 +02:00
Artem Fetishev
62bb5459d3 lib/storage: Follow-up for 9517f5cf1 - use 100k series in all benchmarks, fix benchmark names
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-08-27 12:09:53 +02:00
Artem Fetishev
d00320db4d lib/storage: new storage search benchmarks (#9620)
New benchmarks for storage search (data and index):
- Use the same dataset that accounts for prev and curr indexDBs and
deleted series
- The code is more structured
- Account for various numbers of series in response including higher
numbers (>10k) as this appears to be a quite common use case.

These benchmarks were used for investigating #9602 performance issue and
helped discover that prefetching metric names needed to be restored

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-08-27 12:09:34 +02:00
1959 changed files with 131445 additions and 89524 deletions

48
.github/scripts/lint-changelog-tip.sh vendored Executable file
View File

@@ -0,0 +1,48 @@
#!/usr/bin/env sh
set -e
CHANGELOG_FILE="docs/victoriametrics/changelog/CHANGELOG.md"
GITHUB_BASE_REF=${GITHUB_BASE_REF:-"master"}
GIT_REMOTE=${GIT_REMOTE:-"origin"}
git diff "${GIT_REMOTE}/${GITHUB_BASE_REF}"...HEAD -- $CHANGELOG_FILE > diff.txt
if ! grep -q "^+" diff.txt; then
echo "No additions in CHANGELOG.md"
exit 0
fi
ADDED_LINES=$(grep "^+\S" diff.txt | sed 's/^+//')
START_TIP=$(grep -n "^## tip" "$CHANGELOG_FILE" | head -1 | cut -d: -f1)
if [ -z "$START_TIP" ]; then
echo "ERROR: ${CHANGELOG_FILE} does not contain a ## tip section"
exit 1
fi
END_TIP=$(awk "NR>$START_TIP && /^## / {print NR; exit}" "${CHANGELOG_FILE}")
if [ -z "$END_TIP" ]; then
END_TIP=$(wc -l < "$CHANGELOG_FILE")
fi
BAD=0
while IFS= read -r line; do
# Grep exact line inside the file and get line numbers
MATCHES=$(grep -n -F "$line" "$CHANGELOG_FILE" | cut -d: -f1)
for m in $MATCHES; do
if [ "$m" -lt "$START_TIP" ] || [ "$m" -gt "$END_TIP" ]; then
echo "'$line' on line ${m} is outside ## tip section (lines ${START_TIP}-${END_TIP})"
BAD=1
fi
done
done << EOF
$ADDED_LINES
EOF
if [ "$BAD" -ne 0 ]; then
echo "CHANGELOG modifications must be placed inside the ## tip section."
exit 1
fi
echo "CHANGELOG modifications are valid."

View File

@@ -47,17 +47,19 @@ jobs:
arch: ppc64le
- os: linux
arch: 386
- os: linux
arch: s390x
- os: freebsd
arch: amd64
- os: openbsd
arch: amd64
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go
uses: actions/setup-go@v5
uses: actions/setup-go@v6
with:
cache-dependency-path: |
go.sum

19
.github/workflows/changelog-linter.yml vendored Normal file
View File

@@ -0,0 +1,19 @@
name: 'changelog-linter'
on:
pull_request:
paths:
- "docs/victoriametrics/changelog/CHANGELOG.md"
jobs:
tip-lint:
runs-on: 'ubuntu-latest'
steps:
- uses: 'actions/checkout@v4'
with:
# needed for proper diff
fetch-depth: 0
- name: 'Validate that changelog changes are under ## tip'
run: |
GITHUB_BASE_REF=${{ github.base_ref }} ./.github/scripts/lint-changelog-tip.sh

View File

@@ -0,0 +1,37 @@
name: check-commit-signed
on:
pull_request:
jobs:
check-commit-signed:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v6
with:
fetch-depth: 0 # we need full history for commit verification
- name: Check commit signatures
run: |
if [ "${{ github.event_name }}" != "pull_request" ]; then
echo "Not a PR event, skipping signature check"
exit 0
fi
RANGE="${{ github.event.pull_request.base.sha }}..${{ github.event.pull_request.head.sha }}"
echo "Checking commits in PR range: $RANGE"
if [ -z "$(git rev-list $RANGE)" ]; then
echo "No new commits in this PR, skipping signature check"
exit 0
fi
unsigned=$(git log --pretty="%H %G?" $RANGE | grep -vE " (G|E)$" || true)
if [ -n "$unsigned" ]; then
echo "Found unsigned commits:"
echo "$unsigned"
exit 1
fi
echo "All commits in PR are signed (G or E)"

View File

@@ -19,7 +19,7 @@ jobs:
- name: Setup Go
id: go
uses: actions/setup-go@v5
uses: actions/setup-go@v6
with:
go-version: stable
cache: false

View File

@@ -29,11 +29,11 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Set up Go
id: go
uses: actions/setup-go@v5
uses: actions/setup-go@v6
with:
cache: false
go-version: stable
@@ -49,14 +49,14 @@ jobs:
restore-keys: go-artifacts-${{ runner.os }}-codeql-analyze-
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
uses: github/codeql-action/init@v4
with:
languages: go
- name: Autobuild
uses: github/codeql-action/autobuild@v3
uses: github/codeql-action/autobuild@v4
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
uses: github/codeql-action/analyze@v4
with:
category: 'language:go'

View File

@@ -16,12 +16,12 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
with:
path: __vm
- name: Checkout private code
uses: actions/checkout@v5
uses: actions/checkout@v6
with:
repository: VictoriaMetrics/vmdocs
token: ${{ secrets.VM_BOT_GH_TOKEN }}

View File

@@ -32,11 +32,11 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go
uses: actions/setup-go@v5
uses: actions/setup-go@v6
with:
cache-dependency-path: |
go.sum
@@ -71,11 +71,11 @@ jobs:
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go
uses: actions/setup-go@v5
uses: actions/setup-go@v6
with:
cache-dependency-path: |
go.sum
@@ -97,11 +97,11 @@ jobs:
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Go
id: go
uses: actions/setup-go@v5
uses: actions/setup-go@v6
with:
cache-dependency-path: |
go.sum

View File

@@ -32,10 +32,10 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v5
uses: actions/checkout@v6
- name: Setup Node
uses: actions/setup-node@v4
uses: actions/setup-node@v6
with:
node-version: '24.x'

View File

@@ -67,6 +67,11 @@ vmcluster-linux-386: \
vmselect-linux-386 \
vmstorage-linux-386
vmcluster-linux-s390x: \
vminsert-linux-s390x \
vmselect-linux-s390x \
vmstorage-linux-s390x
vmcluster-freebsd-amd64: \
vminsert-freebsd-amd64 \
vmselect-freebsd-amd64 \
@@ -168,6 +173,7 @@ release:
release-vmcluster: \
release-vmcluster-linux-amd64 \
release-vmcluster-linux-arm64 \
release-vmcluster-linux-s390x \
release-vmcluster-freebsd-amd64 \
release-vmcluster-openbsd-amd64 \
release-vmcluster-windows-amd64 \
@@ -180,6 +186,9 @@ release-vmcluster-linux-amd64:
release-vmcluster-linux-arm64:
GOOS=linux GOARCH=arm64 $(MAKE) release-vmcluster-goos-goarch
release-vmcluster-linux-s390x:
GOOS=linux GOARCH=s390x $(MAKE) release-vmcluster-goos-goarch
release-vmcluster-freebsd-amd64:
GOOS=freebsd GOARCH=amd64 $(MAKE) release-vmcluster-goos-goarch
@@ -299,7 +308,8 @@ app-local-windows-goarch:
CGO_ENABLED=0 GOOS=windows GOARCH=$(GOARCH) go build $(RACE) -ldflags "$(GO_BUILDINFO)" -tags "$(EXTRA_GO_BUILD_TAGS)" -o bin/$(APP_NAME)-windows-$(GOARCH)$(RACE).exe $(PKG_PREFIX)/app/$(APP_NAME)
quicktemplate-gen: install-qtc
qtc
qtc -dir=lib
qtc -dir=app
install-qtc:
which qtc || go install github.com/valyala/quicktemplate/qtc@latest

View File

@@ -3,7 +3,7 @@
[![Latest Release](https://img.shields.io/github/v/release/VictoriaMetrics/VictoriaMetrics?sort=semver&label=&filter=!*-victorialogs&logo=github&labelColor=gray&color=gray&link=https%3A%2F%2Fgithub.com%2FVictoriaMetrics%2FVictoriaMetrics%2Freleases%2Flatest)](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
![Docker Pulls](https://img.shields.io/docker/pulls/victoriametrics/victoria-metrics?label=&logo=docker&logoColor=white&labelColor=2496ED&color=2496ED&link=https%3A%2F%2Fhub.docker.com%2Fr%2Fvictoriametrics%2Fvictoria-metrics)
[![Go Report](https://goreportcard.com/badge/github.com/VictoriaMetrics/VictoriaMetrics?link=https%3A%2F%2Fgoreportcard.com%2Freport%2Fgithub.com%2FVictoriaMetrics%2FVictoriaMetrics)](https://goreportcard.com/report/github.com/VictoriaMetrics/VictoriaMetrics)
[![Build Status](https://github.com/VictoriaMetrics/VictoriaMetrics/actions/workflows/main.yml/badge.svg?branch=master&link=https%3A%2F%2Fgithub.com%2FVictoriaMetrics%2FVictoriaMetrics%2Factions)](https://github.com/VictoriaMetrics/VictoriaMetrics/actions/workflows/main.yml)
[![Build Status](https://github.com/VictoriaMetrics/VictoriaMetrics/actions/workflows/build.yml/badge.svg?branch=master&link=https%3A%2F%2Fgithub.com%2FVictoriaMetrics%2FVictoriaMetrics%2Factions)](https://github.com/VictoriaMetrics/VictoriaMetrics/actions/workflows/build.yml)
[![codecov](https://codecov.io/gh/VictoriaMetrics/VictoriaMetrics/branch/master/graph/badge.svg?link=https%3A%2F%2Fcodecov.io%2Fgh%2FVictoriaMetrics%2FVictoriaMetrics)](https://app.codecov.io/gh/VictoriaMetrics/VictoriaMetrics)
[![License](https://img.shields.io/github/license/VictoriaMetrics/VictoriaMetrics?labelColor=green&label=&link=https%3A%2F%2Fgithub.com%2FVictoriaMetrics%2FVictoriaMetrics%2Fblob%2Fmaster%2FLICENSE)](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/LICENSE)
![Slack](https://img.shields.io/badge/Join-4A154B?logo=slack&link=https%3A%2F%2Fslack.victoriametrics.com)

View File

@@ -4,12 +4,11 @@
The following versions of VictoriaMetrics receive regular security fixes:
| Version | Supported |
|---------|--------------------|
| [latest release](https://docs.victoriametrics.com/victoriametrics/changelog/) | :white_check_mark: |
| v1.102.x [LTS line](https://docs.victoriametrics.com/victoriametrics/lts-releases/) | :white_check_mark: |
| v1.110.x [LTS line](https://docs.victoriametrics.com/victoriametrics/lts-releases/) | :white_check_mark: |
| other releases | :x: |
| Version | Supported |
|--------------------------------------------------------------------------------|--------------------|
| [Latest release](https://docs.victoriametrics.com/victoriametrics/changelog/) | :white_check_mark: |
| [LTS releases](https://docs.victoriametrics.com/victoriametrics/lts-releases/) | :white_check_mark: |
| other releases | :x: |
See [this page](https://victoriametrics.com/security/) for more details.


@@ -27,6 +27,9 @@ vmagent-linux-ppc64le-prod:
vmagent-linux-386-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-linux-386
vmagent-linux-s390x-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-linux-s390x
vmagent-darwin-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-darwin-amd64


@@ -74,7 +74,7 @@ var (
"See also -opentsdbHTTPListenAddr.useProxyProtocol")
opentsdbHTTPUseProxyProtocol = flag.Bool("opentsdbHTTPListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted "+
"at -opentsdbHTTPListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt")
configAuthKey = flagutil.NewPassword("configAuthKey", "Authorization key for accessing /config page. It must be passed via authKey query arg. It overrides -httpAuth.*")
configAuthKey = flagutil.NewPassword("configAuthKey", "Authorization key for accessing /config and /remotewrite-.*-config pages. It must be passed via authKey query arg. It overrides -httpAuth.*")
reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed via authKey query arg. It overrides -httpAuth.*")
dryRun = flag.Bool("dryRun", false, "Whether to check config files without running vmagent. The following files are checked: "+
"-promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig, -remoteWrite.streamAggr.config . "+
@@ -252,6 +252,8 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
{"metric-relabel-debug", "debug metric relabeling"},
{"api/v1/targets", "advanced information about discovered targets in JSON format"},
{"config", "-promscrape.config contents"},
{"remotewrite-relabel-config", "-remoteWrite.relabelConfig contents"},
{"remotewrite-url-relabel-config", "-remoteWrite.urlRelabelConfig contents"},
{"metrics", "available service metrics"},
{"flags", "command-line flags"},
{"-/reload", "reload configuration"},
@@ -477,6 +479,42 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
promscrape.WriteConfigData(&bb)
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%s}}`, stringsutil.JSONString(string(bb.B)))
return true
case "/remotewrite-relabel-config":
if !httpserver.CheckAuthFlag(w, r, configAuthKey) {
return true
}
remoteWriteRelabelConfigRequests.Inc()
w.Header().Set("Content-Type", "text/plain; charset=utf-8")
remotewrite.WriteRelabelConfigData(w)
return true
case "/api/v1/status/remotewrite-relabel-config":
if !httpserver.CheckAuthFlag(w, r, configAuthKey) {
return true
}
remoteWriteStatusRelabelConfigRequests.Inc()
w.Header().Set("Content-Type", "application/json")
var bb bytesutil.ByteBuffer
remotewrite.WriteRelabelConfigData(&bb)
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%s}}`, stringsutil.JSONString(string(bb.B)))
return true
case "/remotewrite-url-relabel-config":
if !httpserver.CheckAuthFlag(w, r, configAuthKey) {
return true
}
remoteWriteURLRelabelConfigRequests.Inc()
w.Header().Set("Content-Type", "text/plain; charset=utf-8")
remotewrite.WriteURLRelabelConfigData(w)
return true
case "/api/v1/status/remotewrite-url-relabel-config":
if !httpserver.CheckAuthFlag(w, r, configAuthKey) {
return true
}
remoteWriteStatusURLRelabelConfigRequests.Inc()
w.Header().Set("Content-Type", "application/json")
var bb bytesutil.ByteBuffer
remotewrite.WriteURLRelabelConfigData(&bb)
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%s}}`, stringsutil.JSONString(string(bb.B)))
return true
case "/prometheus/-/reload", "/-/reload":
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey) {
return true
@@ -747,6 +785,12 @@ var (
promscrapeConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/config"}`)
promscrapeStatusConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/status/config"}`)
remoteWriteRelabelConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/remotewrite-relabel-config"}`)
remoteWriteStatusRelabelConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/status/remotewrite-relabel-config"}`)
remoteWriteURLRelabelConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/remotewrite-url-relabel-config"}`)
remoteWriteStatusURLRelabelConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/status/remotewrite-url-relabel-config"}`)
promscrapeConfigReloadRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/-/reload"}`)
)


@@ -2,13 +2,14 @@ package opentelemetry
import (
"fmt"
"io"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/firehose"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
@@ -24,6 +25,13 @@ var (
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="opentelemetry"}`)
)
// InsertHandlerForReader processes opentelemetry metrics from the given reader.
func InsertHandlerForReader(at *auth.Token, r io.Reader, encoding string) error {
return stream.ParseStream(r, encoding, nil, func(tss []prompb.TimeSeries, mms []prompb.MetricMetadata) error {
return insertRows(at, tss, mms, nil)
})
}
// InsertHandler processes opentelemetry metrics.
func InsertHandler(at *auth.Token, req *http.Request) error {
extraLabels, err := protoparserutil.GetExtraLabels(req)
@@ -68,7 +76,7 @@ func insertRows(at *auth.Token, tss []prompb.TimeSeries, mms []prompb.MetricMeta
ctx.WriteRequest.Timeseries = tssDst
var metadataTotal int
if promscrape.IsMetadataEnabled() {
if prommetadata.IsEnabled() {
var accountID, projectID uint32
if at != nil {
accountID = at.AccountID


@@ -7,8 +7,8 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
@@ -36,7 +36,7 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return err
}
encoding := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, defaultTimestamp, encoding, true, promscrape.IsMetadataEnabled(), func(rows []prometheus.Row, mms []prometheus.Metadata) error {
return stream.Parse(req.Body, defaultTimestamp, encoding, true, prommetadata.IsEnabled(), func(rows []prometheus.Row, mms []prometheus.Metadata) error {
return insertRows(at, rows, mms, extraLabels)
}, func(s string) {
httpserver.LogError(req, s)


@@ -6,8 +6,8 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
@@ -71,7 +71,7 @@ func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, mms []prompb.Met
ctx.WriteRequest.Timeseries = tssDst
var metadataTotal int
if promscrape.IsMetadataEnabled() {
if prommetadata.IsEnabled() {
var accountID, projectID uint32
if at != nil {
accountID = at.AccountID


@@ -93,10 +93,7 @@ func TestParseRetryAfterHeader(t *testing.T) {
// helper calculate the max possible time duration calculated by timeutil.AddJitterToDuration.
func helper(d time.Duration) time.Duration {
dv := d / 10
if dv > 10*time.Second {
dv = 10 * time.Second
}
dv := min(d/10, 10*time.Second)
return d + dv
}


@@ -3,15 +3,18 @@ package remotewrite
import (
"flag"
"fmt"
"io"
"strconv"
"strings"
"sync"
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"go.yaml.in/yaml/v3"
"github.com/VictoriaMetrics/metrics"
)
@@ -32,9 +35,12 @@ var (
"See https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels")
)
var labelsGlobal []prompb.Label
var (
labelsGlobal []prompb.Label
remoteWriteRelabelConfigData atomic.Pointer[[]byte]
remoteWriteURLRelabelConfigData atomic.Pointer[[]interface{}]
relabelConfigReloads *metrics.Counter
relabelConfigReloadErrors *metrics.Counter
relabelConfigSuccess *metrics.Gauge
@@ -67,6 +73,42 @@ func initRelabelConfigs() {
}
}
// WriteRelabelConfigData writes -remoteWrite.relabelConfig contents to w
func WriteRelabelConfigData(w io.Writer) {
p := remoteWriteRelabelConfigData.Load()
if p == nil {
// Nothing to write to w
return
}
_, _ = w.Write(*p)
}
// WriteURLRelabelConfigData writes -remoteWrite.urlRelabelConfig contents to w
func WriteURLRelabelConfigData(w io.Writer) {
p := remoteWriteURLRelabelConfigData.Load()
if p == nil {
// Nothing to write to w
return
}
type urlRelabelCfg struct {
Url string `yaml:"url"`
RelabelConfig interface{} `yaml:"relabel_config"`
}
var cs []urlRelabelCfg
for i, url := range *remoteWriteURLs {
cfgData := (*p)[i]
if !*showRemoteWriteURL {
url = fmt.Sprintf("%d:secret-url", i+1)
}
cs = append(cs, urlRelabelCfg{
Url: url,
RelabelConfig: cfgData,
})
}
d, _ := yaml.Marshal(cs)
_, _ = w.Write(d)
}
func reloadRelabelConfigs() {
rcs := allRelabelConfigs.Load()
if !rcs.isSet() {
@@ -90,28 +132,42 @@ func reloadRelabelConfigs() {
func loadRelabelConfigs() (*relabelConfigs, error) {
var rcs relabelConfigs
if *relabelConfigPathGlobal != "" {
global, err := promrelabel.LoadRelabelConfigs(*relabelConfigPathGlobal)
global, rawCfg, err := promrelabel.LoadRelabelConfigs(*relabelConfigPathGlobal)
if err != nil {
return nil, fmt.Errorf("cannot load -remoteWrite.relabelConfig=%q: %w", *relabelConfigPathGlobal, err)
}
remoteWriteRelabelConfigData.Store(&rawCfg)
rcs.global = global
}
if len(*relabelConfigPaths) > len(*remoteWriteURLs) {
return nil, fmt.Errorf("too many -remoteWrite.urlRelabelConfig args: %d; it mustn't exceed the number of -remoteWrite.url args: %d",
len(*relabelConfigPaths), (len(*remoteWriteURLs)))
}
var urlRelabelCfgs []interface{}
rcs.perURL = make([]*promrelabel.ParsedConfigs, len(*remoteWriteURLs))
for i, path := range *relabelConfigPaths {
if len(path) == 0 {
// Skip empty relabel config.
urlRelabelCfgs = append(urlRelabelCfgs, nil)
continue
}
prc, err := promrelabel.LoadRelabelConfigs(path)
prc, rawCfg, err := promrelabel.LoadRelabelConfigs(path)
if err != nil {
return nil, fmt.Errorf("cannot load relabel configs from -remoteWrite.urlRelabelConfig=%q: %w", path, err)
}
rcs.perURL[i] = prc
var parsedCfg interface{}
_ = yaml.Unmarshal(rawCfg, &parsedCfg)
urlRelabelCfgs = append(urlRelabelCfgs, parsedCfg)
}
if len(*remoteWriteURLs) > len(*relabelConfigPaths) {
// fill the urlRelabelCfgs with empty relabel configs if not set
for i := len(*relabelConfigPaths); i < len(*remoteWriteURLs); i++ {
urlRelabelCfgs = append(urlRelabelCfgs, nil)
}
}
remoteWriteURLRelabelConfigData.Store(&urlRelabelCfgs)
return &rcs, nil
}


@@ -27,6 +27,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/ratelimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/slicesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/streamaggr"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeserieslimits"
"github.com/VictoriaMetrics/metrics"
@@ -485,6 +486,9 @@ func tryPush(at *auth.Token, wr *prompb.WriteRequest, forceDropSamplesOnFailure
matchIdxs.B = sas.Push(tssBlock, matchIdxs.B)
if !*streamAggrGlobalKeepInput {
tssBlock = dropAggregatedSeries(tssBlock, matchIdxs.B, *streamAggrGlobalDropInput)
} else if *streamAggrGlobalDropInput {
// if both keep_input and drop_input are true, we keep only the aggregated series
tssBlock = dropUnaggregatedSeries(tssBlock, matchIdxs.B)
}
matchIdxsPool.Put(matchIdxs)
}
@@ -988,7 +992,17 @@ func (rwctx *remoteWriteCtx) TryPushTimeSeries(tss []prompb.TimeSeries, forceDro
tss = append(*v, tss...)
}
tss = dropAggregatedSeries(tss, matchIdxs.B, rwctx.streamAggrDropInput)
} else if rwctx.streamAggrDropInput {
// if both keep_input and drop_input are true, we keep only the aggregated series
if rctx == nil {
rctx = getRelabelCtx()
// Make a copy of tss before dropping unaggregated series
v = tssPool.Get().(*[]prompb.TimeSeries)
tss = append(*v, tss...)
}
tss = dropUnaggregatedSeries(tss, matchIdxs.B)
}
matchIdxsPool.Put(matchIdxs)
}
if rwctx.deduplicator != nil {
@@ -1011,9 +1025,10 @@ func (rwctx *remoteWriteCtx) TryPushTimeSeries(tss []prompb.TimeSeries, forceDro
return false
}
var matchIdxsPool bytesutil.ByteBufferPool
var matchIdxsPool slicesutil.BufferPool[uint32]
func dropAggregatedSeries(src []prompb.TimeSeries, matchIdxs []byte, dropInput bool) []prompb.TimeSeries {
// dropAggregatedSeries drops the matched series; if dropInput is true, it drops the unmatched series as well.
func dropAggregatedSeries(src []prompb.TimeSeries, matchIdxs []uint32, dropInput bool) []prompb.TimeSeries {
dst := src[:0]
if !dropInput {
for i, match := range matchIdxs {
@@ -1028,6 +1043,20 @@ func dropAggregatedSeries(src []prompb.TimeSeries, matchIdxs []byte, dropInput b
return dst
}
// dropUnaggregatedSeries drops unmatched series.
func dropUnaggregatedSeries(src []prompb.TimeSeries, matchIdxs []uint32) []prompb.TimeSeries {
dst := src[:0]
for i, match := range matchIdxs {
if match == 0 {
continue
}
dst = append(dst, src[i])
}
tail := src[len(dst):]
clear(tail)
return dst
}
func (rwctx *remoteWriteCtx) pushInternalTrackDropped(tss []prompb.TimeSeries) {
if rwctx.tryPushTimeSeriesInternal(tss) {
return


@@ -10,6 +10,8 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/consistenthash"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
@@ -57,8 +59,8 @@ func TestGetLabelsHash_Distribution(t *testing.T) {
f(10)
}
func TestRemoteWriteContext_TryPush_ImmutableTimeseries(t *testing.T) {
f := func(streamAggrConfig, relabelConfig string, enableWindows bool, dedupInterval time.Duration, keepInput, dropInput bool, input string) {
func TestRemoteWriteContext_TryPushTimeSeries(t *testing.T) {
f := func(streamAggrConfig, relabelConfig string, enableWindows bool, dedupInterval time.Duration, keepInput, dropInput bool, input string, expectedRowsPushedAfterRelabel, expectedPushedSample int) {
t.Helper()
perURLRelabel, err := promrelabel.ParseRelabelConfigsData([]byte(relabelConfig))
if err != nil {
@@ -71,10 +73,16 @@ func TestRemoteWriteContext_TryPush_ImmutableTimeseries(t *testing.T) {
}
allRelabelConfigs.Store(rcs)
path := "fast-queue-write-test"
fs.MustRemoveDir(path)
fq := persistentqueue.MustOpenFastQueue(path, "test", 100, 0, false)
defer fs.MustRemoveDir(path)
defer fq.MustClose()
pss := make([]*pendingSeries, 1)
isVMProto := &atomic.Bool{}
isVMProto.Store(true)
pss[0] = newPendingSeries(nil, isVMProto, 0, 100)
pss[0] = newPendingSeries(fq, isVMProto, 0, 100)
rwctx := &remoteWriteCtx{
idx: 0,
streamAggrKeepInput: keepInput,
@@ -83,6 +91,8 @@ func TestRemoteWriteContext_TryPush_ImmutableTimeseries(t *testing.T) {
rowsPushedAfterRelabel: metrics.GetOrCreateCounter(`foo`),
rowsDroppedByRelabel: metrics.GetOrCreateCounter(`bar`),
}
defer metrics.UnregisterAllMetrics()
if dedupInterval > 0 {
rwctx.deduplicator = streamaggr.NewDeduplicator(nil, enableWindows, dedupInterval, nil, "dedup-global")
}
@@ -104,23 +114,27 @@ func TestRemoteWriteContext_TryPush_ImmutableTimeseries(t *testing.T) {
inputTss := prometheus.MustParsePromMetrics(input, offsetMsecs)
expectedTss := make([]prompb.TimeSeries, len(inputTss))
// copy inputTss to make sure it is not mutated during TryPush call
// check inputTss is not modified after TryPushTimeSeries
copy(expectedTss, inputTss)
if !rwctx.TryPushTimeSeries(inputTss, false) {
t.Fatalf("cannot push samples to rwctx")
}
if int(rwctx.rowsPushedAfterRelabel.Get()) != expectedRowsPushedAfterRelabel {
t.Fatalf("unexpected number of rows after relabel; got %d; want %d", rwctx.rowsPushedAfterRelabel.Get(), expectedRowsPushedAfterRelabel)
}
if len(pss[0].wr.tss) != expectedPushedSample {
t.Fatalf("unexpected number of pushed samples; got %d; want %d", len(pss[0].wr.tss), expectedPushedSample)
}
if !reflect.DeepEqual(expectedTss, inputTss) {
t.Fatalf("unexpected samples;\ngot\n%v\nwant\n%v", inputTss, expectedTss)
}
}
f(`
- interval: 1m
outputs: [sum_samples]
- interval: 2m
outputs: [count_series]
`, `
// relabeling
f(``, `
- action: keep
source_labels: [env]
regex: "dev"
@@ -129,53 +143,66 @@ metric{env="dev"} 10
metric{env="bar"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`)
`, 2, 2)
// relabeling + aggregation
f(`
- match: '{env="dev"}'
interval: 1m
outputs: [sum_samples]
`, `
- action: keep
source_labels: [env]
regex: ".*"
`, false, 0, false, false, `
metric{env="dev"} 10
metric{env="bar"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`, 4, 2)
// aggregation + keepInput
f(`
- match: '{env="dev"}'
interval: 1m
outputs: [sum_samples]
`, ``, false, 0, true, false, `
metric{env="dev"} 10
metric{env="bar"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`, 4, 4)
// aggregation + dropInput
f(`
- match: '{env="dev"}'
interval: 1m
outputs: [sum_samples]
`, ``, false, 0, false, true, `
metric{env="dev"} 10
metric{env="bar"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`, 4, 0)
// aggregation + keepInput + dropInput
f(`
- match: '{env="dev"}'
interval: 1m
outputs: [sum_samples]
`, ``, false, 0, true, true, `
metric{env="dev"} 10
metric{env="bar"} 20
metric{env="bar"} 25
`, 3, 1)
// aggregation + deduplication
f(``, ``, true, time.Hour, false, false, `
metric{env="dev"} 10
metric{env="foo"} 20
metric{env="dev"} 15
metric{env="foo"} 25
`)
f(``, `
- action: keep
source_labels: [env]
regex: "dev"
`, true, time.Hour, false, false, `
metric{env="dev"} 10
metric{env="bar"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`)
f(``, `
- action: keep
source_labels: [env]
regex: "dev"
`, true, time.Hour, true, false, `
metric{env="test"} 10
metric{env="dev"} 20
metric{env="foo"} 15
metric{env="dev"} 25
`)
f(``, `
- action: keep
source_labels: [env]
regex: "dev"
`, true, time.Hour, false, true, `
metric{env="foo"} 10
metric{env="dev"} 20
metric{env="foo"} 15
metric{env="dev"} 25
`)
f(``, `
- action: keep
source_labels: [env]
regex: "dev"
`, true, time.Hour, true, true, `
metric{env="dev"} 10
metric{env="test"} 20
metric{env="dev"} 15
metric{env="bar"} 25
`)
`, 4, 0)
}
func TestShardAmountRemoteWriteCtx(t *testing.T) {


@@ -18,12 +18,12 @@ var (
streamAggrGlobalConfig = flag.String("streamAggr.config", "", "Optional path to file with stream aggregation config. "+
"See https://docs.victoriametrics.com/victoriametrics/stream-aggregation/ . "+
"See also -streamAggr.keepInput, -streamAggr.dropInput and -streamAggr.dedupInterval")
streamAggrGlobalKeepInput = flag.Bool("streamAggr.keepInput", false, "Whether to keep all the input samples after the aggregation "+
"with -streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples "+
"are written to remote storages write. See also -streamAggr.dropInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrGlobalDropInput = flag.Bool("streamAggr.dropInput", false, "Whether to drop all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples "+
"are written to remote storages write. See also -streamAggr.keepInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrGlobalKeepInput = flag.Bool("streamAggr.keepInput", false, "Whether to keep input samples that match any rule in "+
"-streamAggr.config. By default, matched raw samples are aggregated and dropped, while unmatched samples "+
"are written to the remote storage. See also -streamAggr.dropInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrGlobalDropInput = flag.Bool("streamAggr.dropInput", false, "Whether to drop input samples that do not match any rule in "+
"-streamAggr.config. By default, only matched raw samples are dropped, while unmatched samples "+
"are written to the remote storage. See also -streamAggr.keepInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrGlobalDedupInterval = flag.Duration("streamAggr.dedupInterval", 0, "Input samples are de-duplicated with this interval on "+
"aggregator before optional aggregation with -streamAggr.config . "+
"See also -dedup.minScrapeInterval and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#deduplication")
@@ -43,11 +43,11 @@ var (
streamAggrConfig = flagutil.NewArrayString("remoteWrite.streamAggr.config", "Optional path to file with stream aggregation config for the corresponding -remoteWrite.url. "+
"See https://docs.victoriametrics.com/victoriametrics/stream-aggregation/ . "+
"See also -remoteWrite.streamAggr.keepInput, -remoteWrite.streamAggr.dropInput and -remoteWrite.streamAggr.dedupInterval")
streamAggrDropInput = flagutil.NewArrayBool("remoteWrite.streamAggr.dropInput", "Whether to drop all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config at the corresponding -remoteWrite.url. By default, only aggregates samples are dropped, while the remaining samples "+
streamAggrDropInput = flagutil.NewArrayBool("remoteWrite.streamAggr.dropInput", "Whether to drop input samples that do not match any rule in "+
"the corresponding -remoteWrite.streamAggr.config. By default, only matched raw samples are dropped, while unmatched samples "+
"are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.keepInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrKeepInput = flagutil.NewArrayBool("remoteWrite.streamAggr.keepInput", "Whether to keep all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config at the corresponding -remoteWrite.url. By default, only aggregates samples are dropped, while the remaining samples "+
streamAggrKeepInput = flagutil.NewArrayBool("remoteWrite.streamAggr.keepInput", "Whether to keep input samples that match any rule in "+
"the corresponding -remoteWrite.streamAggr.config. By default, matched raw samples are aggregated and dropped, while unmatched samples "+
"are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.dropInput and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/")
streamAggrDedupInterval = flagutil.NewArrayDuration("remoteWrite.streamAggr.dedupInterval", 0, "Input samples are de-duplicated with this interval before optional aggregation "+
"with -remoteWrite.streamAggr.config at the corresponding -remoteWrite.url. See also -dedup.minScrapeInterval and https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#deduplication")


@@ -27,6 +27,9 @@ vmalert-linux-ppc64le-prod:
vmalert-linux-386-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-linux-386
vmalert-linux-s390x-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-linux-s390x
vmalert-darwin-amd64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-darwin-amd64


@@ -31,7 +31,7 @@ type Group struct {
// EvalDelay will adjust the `time` parameter of rule evaluation requests to compensate intentional query delay from datasource.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5155
EvalDelay *promutil.Duration `yaml:"eval_delay,omitempty"`
Limit int `yaml:"limit,omitempty"`
Limit *int `yaml:"limit,omitempty"`
Rules []Rule `yaml:"rules"`
Concurrency int `yaml:"concurrency"`
// Labels is a set of label value pairs, that will be added to every rule.
@@ -91,8 +91,8 @@ func (g *Group) Validate(validateTplFn ValidateTplFn, validateExpressions bool)
if g.EvalOffset != nil && g.EvalDelay != nil {
return fmt.Errorf("eval_offset cannot be used with eval_delay")
}
if g.Limit < 0 {
return fmt.Errorf("invalid limit %d, shouldn't be less than 0", g.Limit)
if g.Limit != nil && *g.Limit < 0 {
return fmt.Errorf("invalid limit %d, shouldn't be less than 0", *g.Limit)
}
if g.Concurrency < 0 {
return fmt.Errorf("invalid concurrency %d, shouldn't be less than 0", g.Concurrency)
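
Switching `Limit` from `int` to `*int` lets the config distinguish an omitted `limit` from an explicit `limit: 0`. The sketch below illustrates the idea with `encoding/json` instead of YAML, purely to stay within the standard library. The `group` type and `parseGroup` function are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// group mirrors only the field that matters for this example.
type group struct {
	// Limit is nil when the field is omitted and non-nil when it is set, even to 0.
	Limit *int `json:"limit,omitempty"`
}

// validate rejects negative limits and accepts an unset limit.
func (g *group) validate() error {
	if g.Limit != nil && *g.Limit < 0 {
		return fmt.Errorf("invalid limit %d, shouldn't be less than 0", *g.Limit)
	}
	return nil
}

// parseGroup decodes and validates a group definition.
func parseGroup(data string) (*group, error) {
	var g group
	if err := json.Unmarshal([]byte(data), &g); err != nil {
		return nil, err
	}
	if err := g.validate(); err != nil {
		return nil, err
	}
	return &g, nil
}

func main() {
	for _, data := range []string{`{}`, `{"limit":0}`, `{"limit":-1}`} {
		g, err := parseGroup(data)
		if err != nil {
			fmt.Println(data, "->", err)
			continue
		}
		fmt.Println(data, "-> limit is set:", g.Limit != nil)
	}
}
```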


@@ -116,7 +116,7 @@ func TestParse_Failure(t *testing.T) {
f([]string{"testdata/rules/rules_interval_bad.rules"}, "eval_offset should be smaller than interval")
f([]string{"testdata/rules/rules0-bad.rules"}, "unexpected token")
f([]string{"testdata/dir/rules0-bad.rules"}, "error parsing annotation")
f([]string{"testdata/dir/rules0-bad.rules"}, "invalid annotations")
f([]string{"testdata/dir/rules1-bad.rules"}, "duplicate in file")
f([]string{"testdata/dir/rules2-bad.rules"}, "function \"unknown\" not defined")
f([]string{"testdata/dir/rules3-bad.rules"}, "either `record` or `alert` must be set")
@@ -181,9 +181,10 @@ func TestGroupValidate_Failure(t *testing.T) {
EvalOffset: promutil.NewDuration(2 * time.Minute),
}, false, "eval_offset should be smaller than interval")
limit := -1
f(&Group{
Name: "wrong limit",
Limit: -1,
Limit: &limit,
}, false, "invalid limit")
f(&Group{
@@ -342,7 +343,6 @@ func TestGroupValidate_Failure(t *testing.T) {
},
},
}, true, "bad prometheus expr")
}
func TestGroupValidate_Success(t *testing.T) {


@@ -173,22 +173,26 @@ func (c *Client) Query(ctx context.Context, query string, ts time.Time) (Result,
return Result{}, nil, fmt.Errorf("second attempt: %w", err)
}
}
defer func() { _ = resp.Body.Close() }()
// Process the received response.
var parseFn func(req *http.Request, resp *http.Response) (Result, error)
var parseFn func(resp *http.Response) (Result, error)
switch c.dataSourceType {
case datasourcePrometheus:
parseFn = parsePrometheusResponse
parseFn = parsePrometheusInstantResponse
case datasourceGraphite:
parseFn = parseGraphiteResponse
case datasourceVLogs:
parseFn = parseVLogsResponse
parseFn = parseVLogsInstantResponse
default:
logger.Panicf("BUG: unsupported datasource type %q to parse query response", c.dataSourceType)
}
result, err := parseFn(req, resp)
_ = resp.Body.Close()
return result, req, err
result, err := parseFn(resp)
if err != nil {
return Result{}, nil, fmt.Errorf("error parsing response from %q: %w", req.URL.Redacted(), err)
}
return result, req, nil
}
// QueryRange executes the given query on the given time range.
@@ -229,19 +233,23 @@ func (c *Client) QueryRange(ctx context.Context, query string, start, end time.T
return res, fmt.Errorf("second attempt: %w", err)
}
}
defer func() { _ = resp.Body.Close() }()
// Process the received response.
var parseFn func(req *http.Request, resp *http.Response) (Result, error)
var parseFn func(resp *http.Response) (Result, error)
switch c.dataSourceType {
case datasourcePrometheus:
parseFn = parsePrometheusResponse
parseFn = parsePrometheusRangeResponse
case datasourceVLogs:
parseFn = parseVLogsResponse
parseFn = parseVLogsRangeResponse
default:
logger.Panicf("BUG: unsupported datasource type %q to parse query range response", c.dataSourceType)
}
res, err = parseFn(req, resp)
_ = resp.Body.Close()
res, err = parseFn(resp)
if err != nil {
return Result{}, fmt.Errorf("error parsing response from %q: %w", req.URL.Redacted(), err)
}
return res, err
}


@@ -33,10 +33,10 @@ func (r graphiteResponse) metrics() []Metric {
return ms
}
func parseGraphiteResponse(req *http.Request, resp *http.Response) (Result, error) {
func parseGraphiteResponse(resp *http.Response) (Result, error) {
r := &graphiteResponse{}
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return Result{}, fmt.Errorf("error parsing graphite metrics for %s: %w", req.URL.Redacted(), err)
return Result{}, fmt.Errorf("error parsing graphite metrics: %w", err)
}
return Result{Data: r.metrics()}, nil
}


@@ -172,17 +172,26 @@ const (
rtVector, rtMatrix, rScalar = "vector", "matrix", "scalar"
)
func parsePrometheusResponse(req *http.Request, resp *http.Response) (res Result, err error) {
func parsePromResponse(resp *http.Response) (*promResponse, error) {
r := &promResponse{}
if err = json.NewDecoder(resp.Body).Decode(r); err != nil {
return res, fmt.Errorf("error parsing response from %s: %w", req.URL.Redacted(), err)
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return nil, fmt.Errorf("failed to decode response: %w", err)
}
if r.Status == statusError {
return res, fmt.Errorf("response error, query: %s, errorType: %s, error: %s", req.URL.Redacted(), r.ErrorType, r.Error)
return nil, fmt.Errorf("response error %q: %s", r.ErrorType, r.Error)
}
if r.Status != statusSuccess {
return res, fmt.Errorf("unknown status: %s, Expected success or error", r.Status)
return nil, fmt.Errorf("unknown response status %q", r.Status)
}
return r, nil
}
func parsePrometheusInstantResponse(resp *http.Response) (res Result, err error) {
r, err := parsePromResponse(resp)
if err != nil {
return res, fmt.Errorf("failed to parse response: %w", err)
}
var parseFn func() ([]Metric, error)
switch r.Data.ResultType {
case rtVector:
@@ -191,12 +200,6 @@ func parsePrometheusResponse(req *http.Request, resp *http.Response) (res Result
return res, fmt.Errorf("unmarshal err %w; \n %#v", err, string(r.Data.Result))
}
parseFn = pi.metrics
case rtMatrix:
var pr promRange
if err := json.Unmarshal(r.Data.Result, &pr.Result); err != nil {
return res, err
}
parseFn = pr.metrics
case rScalar:
var ps promScalar
if err := json.Unmarshal(r.Data.Result, &ps); err != nil {
@@ -206,7 +209,6 @@ func parsePrometheusResponse(req *http.Request, resp *http.Response) (res Result
default:
return res, fmt.Errorf("unknown result type %q", r.Data.ResultType)
}
ms, err := parseFn()
if err != nil {
return res, err
@@ -222,6 +224,34 @@ func parsePrometheusResponse(req *http.Request, resp *http.Response) (res Result
return res, nil
}
func parsePrometheusRangeResponse(resp *http.Response) (res Result, err error) {
r, err := parsePromResponse(resp)
if err != nil {
return res, fmt.Errorf("failed to parse response: %w", err)
}
if r.Data.ResultType != rtMatrix {
return res, fmt.Errorf("unexpected result type %q; expected result type %q", r.Data.ResultType, rtMatrix)
}
var pr promRange
if err := json.Unmarshal(r.Data.Result, &pr.Result); err != nil {
return res, err
}
ms, err := pr.metrics()
if err != nil {
return res, err
}
res = Result{Data: ms, IsPartial: r.IsPartial}
if r.Stats.SeriesFetched != nil {
intV, err := strconv.Atoi(*r.Stats.SeriesFetched)
if err != nil {
return res, fmt.Errorf("failed to convert stats.seriesFetched to int: %w", err)
}
res.SeriesFetched = &intV
}
return res, nil
}
func (c *Client) setPrometheusInstantReqParams(r *http.Request, query string, timestamp time.Time) {
if c.appendTypePrefix {
r.URL.Path += "/prometheus"
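
Reviewer note: the new `parsePrometheusRangeResponse` above rejects any `resultType` other than `matrix`, while the instant path still switches over vector and scalar. The check can be sketched in isolation as follows. This is a minimal illustration, not the code from the commit: `promResponse` and `checkRangeResultType` are hypothetical names, and only the fields needed for the check are modelled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// promResponse is a hypothetical, trimmed-down stand-in for the response
// type used in the diff; it models only the fields this sketch needs.
type promResponse struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string          `json:"resultType"`
		Result     json.RawMessage `json:"result"`
	} `json:"data"`
}

// checkRangeResultType decodes body and returns an error unless the
// response is successful and carries a matrix result.
func checkRangeResultType(body []byte) error {
	var r promResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return fmt.Errorf("error parsing response: %w", err)
	}
	if r.Status != "success" {
		return fmt.Errorf("unknown response status %q", r.Status)
	}
	if r.Data.ResultType != "matrix" {
		return fmt.Errorf("unexpected result type %q; expected result type %q", r.Data.ResultType, "matrix")
	}
	return nil
}

func main() {
	matrix := []byte(`{"status":"success","data":{"resultType":"matrix","result":[]}}`)
	vector := []byte(`{"status":"success","data":{"resultType":"vector","result":[]}}`)
	fmt.Println("matrix:", checkRangeResultType(matrix))
	fmt.Println("vector:", checkRangeResultType(vector))
}
```

Failing fast on the result type keeps a range query from silently decoding a vector or scalar payload into an empty matrix.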

View File

@@ -65,21 +65,23 @@ func TestVMInstantQuery(t *testing.T) {
case 3:
w.Write([]byte(`{"status":"unknown"}`))
case 4:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix"}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"vector"}}`))
case 5:
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows","foo":"bar"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests","foo":"baz"},"value":[1583786140,"2000"]}]}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"vm_rows"},"values":[[1583786142,"13763"]]}]}}`))
case 6:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows","foo":"bar"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests","foo":"baz"},"value":[1583786140,"2000"]}]}}`))
case 7:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]},"stats":{"seriesFetched": "42"}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
case 8:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]},"stats":{"seriesFetched": "42"}}`))
case 9:
w.Write([]byte(`{"status":"success", "isPartial":true, "data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
}
})
mux.HandleFunc("/render", func(w http.ResponseWriter, _ *http.Request) {
c++
switch c {
case 9:
case 10:
w.Write([]byte(`[{"target":"constantLine(10)","tags":{"name":"constantLine(10)"},"datapoints":[[10,1611758343],[10,1611758373],[10,1611758403]]}]`))
}
})
@@ -102,9 +104,9 @@ func TestVMInstantQuery(t *testing.T) {
t.Fatalf("failed to parse 'time' query param %q: %s", timeParam, err)
}
switch c {
case 10:
w.Write([]byte("[]"))
case 11:
w.Write([]byte("[]"))
case 12:
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"total","foo":"bar"},"value":[1583786142,"13763"]},{"metric":{"__name__":"total","foo":"baz"},"value":[1583786140,"2000"]}]}}`))
}
})
@@ -123,6 +125,7 @@ func TestVMInstantQuery(t *testing.T) {
ts := time.Now()
expErr := func(query, err string) {
t.Helper()
_, _, gotErr := pq.Query(ctx, query, ts)
if gotErr == nil {
t.Fatalf("expected %q got nil", err)
@@ -135,10 +138,11 @@ func TestVMInstantQuery(t *testing.T) {
expErr(vmQuery, "500") // 0
expErr(vmQuery, "error parsing response") // 1
expErr(vmQuery, "response error") // 2
expErr(vmQuery, "unknown status") // 3
expErr(vmQuery, "unknown response status") // 3
expErr(vmQuery, "unexpected end of JSON input") // 4
expErr(vmQuery, "unknown result type") // 5
res, _, err := pq.Query(ctx, vmQuery, ts) // 5 - vector
res, _, err := pq.Query(ctx, vmQuery, ts) // 6 - vector
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -159,7 +163,7 @@ func TestVMInstantQuery(t *testing.T) {
}
metricsEqual(t, res.Data, expected)
res, req, err := pq.Query(ctx, vmQuery, ts) // 6 - scalar
res, req, err := pq.Query(ctx, vmQuery, ts) // 7 - scalar
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -184,7 +188,7 @@ func TestVMInstantQuery(t *testing.T) {
res.SeriesFetched)
}
res, _, err = pq.Query(ctx, vmQuery, ts) // 7 - scalar with stats
res, _, err = pq.Query(ctx, vmQuery, ts) // 8 - scalar with stats
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -205,7 +209,7 @@ func TestVMInstantQuery(t *testing.T) {
*res.SeriesFetched)
}
res, _, err = pq.Query(ctx, vmQuery, ts) // 8
res, _, err = pq.Query(ctx, vmQuery, ts) // 9
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -216,7 +220,7 @@ func TestVMInstantQuery(t *testing.T) {
// test graphite
gq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourceGraphite)})
res, _, err = gq.Query(ctx, queryRender, ts) // 9 - graphite
res, _, err = gq.Query(ctx, queryRender, ts) // 10 - graphite
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -236,9 +240,9 @@ func TestVMInstantQuery(t *testing.T) {
vlogs := datasourceVLogs
pq = s.BuildWithParams(QuerierParams{DataSourceType: string(vlogs), EvaluationInterval: 15 * time.Second})
expErr(vlogsQuery, "error parsing response") // 10
expErr(vlogsQuery, "error parsing response") // 11
res, _, err = pq.Query(ctx, vlogsQuery, ts) // 11
res, _, err = pq.Query(ctx, vlogsQuery, ts) // 12
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -390,6 +394,8 @@ func TestVMRangeQuery(t *testing.T) {
switch c {
case 0:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"vm_rows"},"values":[[1583786142,"13763"]]}]}}`))
case 1:
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[1583786142, "1"]}}`))
}
})
mux.HandleFunc("/select/logsql/stats_query_range", func(w http.ResponseWriter, r *http.Request) {
@@ -422,7 +428,7 @@ func TestVMRangeQuery(t *testing.T) {
t.Fatalf("expected 'step' query param to be 60s; got %q instead", step)
}
switch c {
case 1:
case 2:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"total"},"values":[[1583786142,"10"]]}]}}`))
}
})
@@ -446,13 +452,13 @@ func TestVMRangeQuery(t *testing.T) {
start, end := time.Now().Add(-time.Minute), time.Now()
res, err := pq.QueryRange(ctx, vmQuery, start, end)
res, err := pq.QueryRange(ctx, vmQuery, start, end) // case 0
if err != nil {
t.Fatalf("unexpected %s", err)
}
m := res.Data
if len(m) != 1 {
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
}
expected := Metric{
Labels: []prompb.Label{{Value: "vm_rows", Name: "__name__"}},
@@ -463,6 +469,9 @@ func TestVMRangeQuery(t *testing.T) {
t.Fatalf("unexpected metric %+v want %+v", m[0], expected)
}
_, err = pq.QueryRange(ctx, vmQuery, start, end) // case 1
expectError(t, err, "unexpected result type")
// test unsupported graphite
gq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourceGraphite)})

View File

@@ -40,8 +40,28 @@ func (c *Client) setVLogsRangeReqParams(r *http.Request, query string, start, en
c.setReqParams(r, query)
}
func parseVLogsResponse(req *http.Request, resp *http.Response) (res Result, err error) {
res, err = parsePrometheusResponse(req, resp)
func parseVLogsInstantResponse(resp *http.Response) (res Result, err error) {
res, err = parsePrometheusInstantResponse(resp)
if err != nil {
return Result{}, err
}
for i := range res.Data {
m := &res.Data[i]
for j := range m.Labels {
// preserve the stats func result name with a new label `stats_result` instead of dropping it,
// since there could be multiple stats results in a single query, for instance:
// _time:5m | stats quantile(0.5, request_duration_seconds) p50, quantile(0.9, request_duration_seconds) p90
if m.Labels[j].Name == "__name__" {
m.Labels[j].Name = "stats_result"
break
}
}
}
return
}
func parseVLogsRangeResponse(resp *http.Response) (res Result, err error) {
res, err = parsePrometheusRangeResponse(resp)
if err != nil {
return Result{}, err
}
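
Reviewer note: the label rewrite in `parseVLogsInstantResponse` can be tried outside the datasource package with the sketch below. The `label` and `metric` types and the `renameNameLabel` helper are hypothetical stand-ins for the package's own types; only the loop mirrors the diff.

```go
package main

import "fmt"

// label and metric are hypothetical stand-ins for the datasource types.
type label struct{ Name, Value string }

type metric struct{ Labels []label }

// renameNameLabel renames the first `__name__` label of every metric to
// `stats_result`, keeping its value so that several stats results from a
// single query stay distinguishable.
func renameNameLabel(ms []metric) {
	for i := range ms {
		m := &ms[i]
		for j := range m.Labels {
			if m.Labels[j].Name == "__name__" {
				m.Labels[j].Name = "stats_result"
				break
			}
		}
	}
}

func main() {
	ms := []metric{
		{Labels: []label{{Name: "__name__", Value: "p50"}, {Name: "path", Value: "/api"}}},
		{Labels: []label{{Name: "__name__", Value: "p90"}, {Name: "path", Value: "/api"}}},
	}
	renameNameLabel(ms)
	fmt.Printf("%+v\n", ms)
}
```

Taking `&ms[i]` rather than ranging by value is what makes the rename visible to the caller.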

View File

@@ -132,10 +132,7 @@ func (ls Labels) String() string {
// a=[]Label{{Name: "a", Value: "2"}},b=[]Label{{Name: "a", Value: "1"}}, return 1
// a=[]Label{{Name: "a", Value: "1"}},b=[]Label{{Name: "a", Value: "1"}}, return 0
func LabelCompare(a, b Labels) int {
l := len(a)
if len(b) < l {
l = len(b)
}
l := min(len(b), len(a))
for i := 0; i < l; i++ {
if a[i].Name != b[i].Name {
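
Reviewer note: the hunk above swaps the manual length comparison for the built-in `min`, which is available since Go 1.21. A self-contained sketch of a prefix-wise label comparison built on it is shown below. The `label` type and the body of `labelCompare` are assumptions made for illustration; the part of `LabelCompare` outside this hunk is not shown in the diff and may differ.

```go
package main

import (
	"cmp"
	"fmt"
	"strings"
)

// label is a hypothetical stand-in for the Label type in the diff.
type label struct{ Name, Value string }

// labelCompare compares two label sets position by position over their
// common prefix, then falls back to comparing lengths.
// It returns -1, 0 or 1.
func labelCompare(a, b []label) int {
	l := min(len(a), len(b)) // built-in since Go 1.21
	for i := 0; i < l; i++ {
		if c := strings.Compare(a[i].Name, b[i].Name); c != 0 {
			return c
		}
		if c := strings.Compare(a[i].Value, b[i].Value); c != 0 {
			return c
		}
	}
	return cmp.Compare(len(a), len(b))
}

func main() {
	a := []label{{Name: "a", Value: "2"}}
	b := []label{{Name: "a", Value: "1"}}
	fmt.Println(labelCompare(a, b), labelCompare(b, a), labelCompare(a, a))
}
```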

View File

@@ -7,7 +7,6 @@ import (
"net/url"
"os"
"sort"
"strconv"
"strings"
"sync"
"time"
@@ -77,14 +76,13 @@ absolute path to all .tpl files in root.
`Link to VMUI: -external.alert.source='vmui/#/?g0.expr={{.Expr|queryEscape}}'. `+
`If empty 'vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}' is used.`)
externalLabels = flagutil.NewArrayString("external.label", "Optional label in the form 'Name=value' to add to all generated recording rules and alerts. "+
"In case of conflicts, original labels are kept with prefix `exported_`.")
"In case of conflicts, original labels are kept with prefix 'exported_'.")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rule files are validated. The -rule flag must be specified.")
)
var (
alertURLGeneratorFn notifier.AlertURLGenerator
extURL *url.URL
extURL *url.URL
)
func main() {
@@ -121,7 +119,7 @@ func main() {
return
}
alertURLGeneratorFn, err = getAlertURLGenerator(extURL, *externalAlertSource, *validateTemplates)
err = notifier.InitAlertURLGeneratorFn(extURL, *externalAlertSource, *validateTemplates)
if err != nil {
logger.Fatalf("failed to init `external.alert.source`: %s", err)
}
@@ -228,14 +226,13 @@ func newManager(ctx context.Context) (*manager, error) {
labels[s[:n]] = s[n+1:]
}
nts, err := notifier.Init(alertURLGeneratorFn, labels, *externalURL)
err = notifier.Init(labels, *externalURL)
if err != nil {
return nil, fmt.Errorf("failed to init notifier: %w", err)
}
manager := &manager{
groups: make(map[uint64]*rule.Group),
querierBuilder: q,
notifiers: nts,
labels: labels,
}
rw, err := remotewrite.Init(ctx)
@@ -292,35 +289,6 @@ func getHostnameAsExternalURL(addr string, isSecure bool) (*url.URL, error) {
return url.Parse(fmt.Sprintf("%s%s%s", schema, hname, port))
}
func getAlertURLGenerator(externalURL *url.URL, externalAlertSource string, validateTemplate bool) (notifier.AlertURLGenerator, error) {
if externalAlertSource == "" {
return func(a notifier.Alert) string {
gID, aID := strconv.FormatUint(a.GroupID, 10), strconv.FormatUint(a.ID, 10)
return fmt.Sprintf("%s/vmalert/alert?%s=%s&%s=%s", externalURL, paramGroupID, gID, paramAlertID, aID)
}, nil
}
if validateTemplate {
if err := notifier.ValidateTemplates(map[string]string{
"tpl": externalAlertSource,
}); err != nil {
return nil, fmt.Errorf("error validating source template %s: %w", externalAlertSource, err)
}
}
m := map[string]string{
"tpl": externalAlertSource,
}
return func(alert notifier.Alert) string {
qFn := func(_ string) ([]datasource.Metric, error) {
return nil, fmt.Errorf("`query` template isn't supported for alert source template")
}
templated, err := alert.ExecTemplate(qFn, alert.Labels, m)
if err != nil {
logger.Errorf("cannot template alert source: %s", err)
}
return fmt.Sprintf("%s/%s", externalURL, templated["tpl"])
}, nil
}
func usage() {
const s = `
vmalert processes alerts and recording rules.

View File

@@ -49,30 +49,6 @@ func TestGetExternalURL(t *testing.T) {
}
}
func TestGetAlertURLGenerator(t *testing.T) {
testAlert := notifier.Alert{GroupID: 42, ID: 2, Value: 4, Labels: map[string]string{"tenant": "baz"}}
u, _ := url.Parse("https://victoriametrics.com/path")
fn, err := getAlertURLGenerator(u, "", false)
if err != nil {
t.Fatalf("unexpected error %s", err)
}
exp := fmt.Sprintf("https://victoriametrics.com/path/vmalert/alert?%s=42&%s=2", paramGroupID, paramAlertID)
if exp != fn(testAlert) {
t.Fatalf("unexpected url want %s, got %s", exp, fn(testAlert))
}
_, err = getAlertURLGenerator(nil, "foo?{{invalid}}", true)
if err == nil {
t.Fatalf("expected template validation error got nil")
}
fn, err = getAlertURLGenerator(u, "foo?query={{$value}}&ds={{ $labels.tenant }}", true)
if err != nil {
t.Fatalf("unexpected error %s", err)
}
if exp := "https://victoriametrics.com/path/foo?query=4&ds=baz"; exp != fn(testAlert) {
t.Fatalf("unexpected url want %s, got %s", exp, fn(testAlert))
}
}
func TestConfigReload(t *testing.T) {
originalRulePath := *rulePath
originalExternalURL := extURL
@@ -120,9 +96,10 @@ groups:
querierBuilder: &datasource.FakeQuerier{},
groups: make(map[uint64]*rule.Group),
labels: map[string]string{},
notifiers: func() []notifier.Notifier { return []notifier.Notifier{&notifier.FakeNotifier{}} },
rw: &remotewrite.Client{},
}
_, cleanup := notifier.InitFakeNotifier()
defer cleanup()
syncCh := make(chan struct{})
sighupCh := procutil.NewSighupChan()

View File

@@ -3,6 +3,7 @@ package main
import (
"context"
"fmt"
"strconv"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
@@ -16,7 +17,6 @@ import (
// manager controls group states
type manager struct {
querierBuilder datasource.QuerierBuilder
notifiers func() []notifier.Notifier
rw remotewrite.RWClient
// remote read builder.
@@ -29,25 +29,8 @@ type manager struct {
groups map[uint64]*rule.Group
}
// ruleAPI generates apiRule object from alert by its ID(hash)
func (m *manager) ruleAPI(gID, rID uint64) (apiRule, error) {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
g, ok := m.groups[gID]
if !ok {
return apiRule{}, fmt.Errorf("can't find group with id %d", gID)
}
for _, rule := range g.Rules {
if rule.ID() == rID {
return ruleToAPI(rule), nil
}
}
return apiRule{}, fmt.Errorf("can't find rule with id %d in group %q", rID, g.Name)
}
// alertAPI generates apiAlert object from alert by its ID(hash)
func (m *manager) alertAPI(gID, aID uint64) (*apiAlert, error) {
// groupAPI generates apiGroup object from group by its ID(hash)
func (m *manager) groupAPI(gID uint64) (*rule.ApiGroup, error) {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
@@ -55,13 +38,47 @@ func (m *manager) alertAPI(gID, aID uint64) (*apiAlert, error) {
if !ok {
return nil, fmt.Errorf("can't find group with id %d", gID)
}
return g.ToAPI(), nil
}
// ruleAPI generates apiRule object from rule by its ID (hash)
func (m *manager) ruleAPI(gID, rID uint64) (rule.ApiRule, error) {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
group, ok := m.groups[gID]
if !ok {
return rule.ApiRule{}, fmt.Errorf("can't find group with id %d", gID)
}
g := group.ToAPI()
ruleID := strconv.FormatUint(rID, 10)
for _, r := range g.Rules {
ar, ok := r.(*rule.AlertingRule)
if !ok {
if r.ID == ruleID {
return r, nil
}
}
return rule.ApiRule{}, fmt.Errorf("can't find rule with id %d in group %q", rID, g.Name)
}
// alertAPI generates apiAlert object from alert by its ID(hash)
func (m *manager) alertAPI(gID, aID uint64) (*rule.ApiAlert, error) {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
group, ok := m.groups[gID]
if !ok {
return nil, fmt.Errorf("can't find group with id %d", gID)
}
g := group.ToAPI()
for _, r := range g.Rules {
if r.Type != rule.TypeAlerting {
continue
}
if apiAlert := alertToAPI(ar, aID); apiAlert != nil {
return apiAlert, nil
alertID := strconv.FormatUint(aID, 10)
for _, a := range r.Alerts {
if a.ID == alertID {
return a, nil
}
}
}
return nil, fmt.Errorf("can't find alert with id %d in group %q", aID, g.Name)
@@ -82,17 +99,16 @@ func (m *manager) close() {
}
func (m *manager) startGroup(ctx context.Context, g *rule.Group, restore bool) error {
m.wg.Add(1)
id := g.GetID()
g.Init()
go func() {
defer m.wg.Done()
m.wg.Go(func() {
if restore {
g.Start(ctx, m.notifiers, m.rw, m.rr)
g.Start(ctx, m.rw, m.rr)
} else {
g.Start(ctx, m.notifiers, m.rw, nil)
g.Start(ctx, m.rw, nil)
}
}()
})
m.groups[id] = g
return nil
}
@@ -119,7 +135,7 @@ func (m *manager) update(ctx context.Context, groupsCfg []config.Group, restore
if rrPresent && m.rw == nil {
return fmt.Errorf("config contains recording rules but `-remoteWrite.url` isn't set")
}
if arPresent && m.notifiers == nil {
if arPresent && notifier.GetTargets() == nil {
return fmt.Errorf("config contains alerting rules but none of `-notifier.url`, `-notifier.config` or `-notifier.blackhole` is set")
}
@@ -156,15 +172,15 @@ func (m *manager) update(ctx context.Context, groupsCfg []config.Group, restore
if len(toUpdate) > 0 {
var wg sync.WaitGroup
for _, item := range toUpdate {
wg.Add(1)
// cancel evaluation so the Update will be applied as fast as possible.
// it is important to call InterruptEval before the update, because cancel fn
// can be re-assigned during the update.
item.old.InterruptEval()
go func(oldGroup *rule.Group, newGroup *rule.Group) {
oldGroup.UpdateWith(newGroup)
wg.Done()
}(item.old, item.new)
oldG := item.old
newG := item.new
wg.Go(func() {
// cancel evaluation so the Update will be applied as fast as possible.
// it is important to call InterruptEval before the update, because cancel fn
// can be re-assigned during the update.
oldG.InterruptEval()
oldG.UpdateWith(newG)
})
}
wg.Wait()
}

View File

@@ -40,10 +40,11 @@ func TestManagerEmptyRulesDir(t *testing.T) {
// execution of configuration update.
// Should be executed with -race flag
func TestManagerUpdateConcurrent(t *testing.T) {
_, cleanup := notifier.InitFakeNotifier()
defer cleanup()
m := &manager{
groups: make(map[uint64]*rule.Group),
querierBuilder: &datasource.FakeQuerier{},
notifiers: func() []notifier.Notifier { return []notifier.Notifier{&notifier.FakeNotifier{}} },
}
paths := []string{
"config/testdata/dir/rules0-good.rules",
@@ -127,8 +128,9 @@ func TestManagerUpdate_Success(t *testing.T) {
m := &manager{
groups: make(map[uint64]*rule.Group),
querierBuilder: &datasource.FakeQuerier{},
notifiers: func() []notifier.Notifier { return []notifier.Notifier{&notifier.FakeNotifier{}} },
}
_, cleanup := notifier.InitFakeNotifier()
defer cleanup()
cfgInit := loadCfg(t, []string{initPath}, true, true)
if err := m.update(ctx, cfgInit, false); err != nil {
@@ -277,7 +279,8 @@ func TestManagerUpdate_Failure(t *testing.T) {
rw: rw,
}
if notifiers != nil {
m.notifiers = func() []notifier.Notifier { return notifiers }
_, cleanup := notifier.InitFakeNotifier()
defer cleanup()
}
err := m.update(context.Background(), []config.Group{cfg}, false)
if err == nil {

View File

@@ -166,8 +166,8 @@ func templateAnnotations(annotations map[string]string, data AlertTplData, tmpl
ctmpl, _ := tmpl.Clone()
ctmpl = ctmpl.Option("missingkey=zero")
if err := templateAnnotation(&buf, builder.String(), tData, ctmpl, execute); err != nil {
r[key] = text
eg.Add(fmt.Errorf("key %q, template %q: %w", key, text, err))
r[key] = err.Error()
eg.Add(fmt.Errorf("(key: %q, value: %q): %w", key, text, err))
continue
}
r[key] = buf.String()
@@ -184,13 +184,13 @@ type tplData struct {
func templateAnnotation(dst io.Writer, text string, data tplData, tpl *textTpl.Template, execute bool) error {
tpl, err := tpl.Parse(text)
if err != nil {
return fmt.Errorf("error parsing annotation template: %w", err)
return fmt.Errorf("error parsing template: %w", err)
}
if !execute {
return nil
}
if err = tpl.Execute(dst, data); err != nil {
return fmt.Errorf("error evaluating annotation template: %w", err)
return fmt.Errorf("error evaluating template: %w", err)
}
return nil
}

View File

@@ -20,7 +20,7 @@ func TestAlertExecTemplate(t *testing.T) {
)
extLabels["cluster"] = extCluster
extLabels["dc"] = extDC
_, err := Init(nil, extLabels, extURL)
err := Init(extLabels, extURL)
checkErr(t, err)
f := func(alert *Alert, annotations map[string]string, tplExpected map[string]string) {

View File

@@ -3,6 +3,7 @@ package notifier
import (
"bytes"
"context"
"errors"
"fmt"
"io"
"net/http"
@@ -22,10 +23,11 @@ import (
// AlertManager represents integration provider with Prometheus alert manager
// https://github.com/prometheus/alertmanager
type AlertManager struct {
addr *url.URL
argFunc AlertURLGenerator
client *http.Client
timeout time.Duration
addr *url.URL
argFunc AlertURLGenerator
client *http.Client
timeout time.Duration
lastError string
authCfg *promauth.Config
// stores already parsed RelabelConfigs object
@@ -71,24 +73,42 @@ func (am AlertManager) Addr() string {
return am.addr.Redacted()
}
func (am *AlertManager) LastError() string {
return am.lastError
}
// Send an alert or resolve message
func (am *AlertManager) Send(ctx context.Context, alerts []Alert, headers map[string]string) error {
func (am *AlertManager) Send(ctx context.Context, alerts []Alert, alertLabels [][]prompb.Label, headers map[string]string) error {
if len(alerts) != len(alertLabels) {
return fmt.Errorf("mismatched number of alerts and label sets after global alert relabeling")
}
am.metrics.alertsSent.Add(len(alerts))
startTime := time.Now()
err := am.send(ctx, alerts, headers)
err := am.send(ctx, alerts, alertLabels, headers)
am.metrics.alertsSendDuration.UpdateDuration(startTime)
if err != nil {
// the context can be cancelled on graceful shutdown
// or on group update. So no need to handle the error as usual.
if errors.Is(err, context.Canceled) {
return nil
}
am.metrics.alertsSendErrors.Add(len(alerts))
am.lastError = err.Error()
} else {
am.lastError = ""
}
return err
}
func (am *AlertManager) send(ctx context.Context, alerts []Alert, headers map[string]string) error {
func (am *AlertManager) send(ctx context.Context, alerts []Alert, alertLabels [][]prompb.Label, headers map[string]string) error {
b := &bytes.Buffer{}
alertsToSend := make([]Alert, 0, len(alerts))
lblss := make([][]prompb.Label, 0, len(alerts))
for _, a := range alerts {
lbls := a.applyRelabelingIfNeeded(am.relabelConfigs)
for i, a := range alerts {
lbls := alertLabels[i]
if am.relabelConfigs != nil {
lbls = am.relabelConfigs.Apply(lbls, 0)
}
if len(lbls) == 0 {
continue
}
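
Reviewer note: the new error handling in `Send` treats a cancelled context as a non-error and records any other failure in `lastError`. The sketch below isolates that behaviour. The `sender` type, the `demo` helper, and the three `...Send` functions are hypothetical and exist only for this illustration; metrics and label handling are left out.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// sender is a hypothetical, reduced stand-in for AlertManager.
type sender struct {
	lastError string
	send      func(ctx context.Context) error
}

// Send mirrors the pattern from the diff: context.Canceled is swallowed,
// any other error is stored in lastError, and success clears lastError.
func (s *sender) Send(ctx context.Context) error {
	err := s.send(ctx)
	if err != nil {
		// errors.Is also matches context.Canceled wrapped with %w.
		if errors.Is(err, context.Canceled) {
			return nil
		}
		s.lastError = err.Error()
	} else {
		s.lastError = ""
	}
	return err
}

func cancelledSend(context.Context) error {
	return fmt.Errorf("post alerts: %w", context.Canceled)
}

func failingSend(context.Context) error { return errors.New("connection refused") }

func okSend(context.Context) error { return nil }

// demo runs a single Send call and reports whether it returned nil,
// together with the recorded lastError.
func demo(send func(context.Context) error) (bool, string) {
	s := &sender{send: send}
	err := s.Send(context.Background())
	return err == nil, s.lastError
}

func main() {
	fmt.Println(demo(cancelledSend))
	fmt.Println(demo(failingSend))
	fmt.Println(demo(okSend))
}
```

If `LastError` can be read from another goroutine while `Send` is running, the field would presumably need a mutex or an atomic value; the diff as shown does not include one.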

View File

@@ -11,6 +11,7 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
)
@@ -145,11 +146,11 @@ func TestAlertManager_Send(t *testing.T) {
t.Fatalf("unexpected error: %s", err)
}
if err := am.Send(context.Background(), []Alert{{Labels: map[string]string{"a": "b"}}}, nil); err == nil {
if err := am.Send(context.Background(), []Alert{{Labels: map[string]string{"a": "b"}}}, [][]prompb.Label{{{Name: "a", Value: "b"}}}, nil); err == nil {
t.Fatalf("expected connection error got nil")
}
if err := am.Send(context.Background(), []Alert{{Labels: map[string]string{"a": "b"}}}, nil); err == nil {
if err := am.Send(context.Background(), []Alert{{Labels: map[string]string{"a": "b"}}}, [][]prompb.Label{{{Name: "a", Value: "b"}}}, nil); err == nil {
t.Fatalf("expected wrong http code error got nil")
}
@@ -160,7 +161,7 @@ func TestAlertManager_Send(t *testing.T) {
End: time.Now().UTC(),
Labels: map[string]string{"alertname": "alert0"},
Annotations: map[string]string{"a": "b", "c": "d"},
}}, map[string]string{headerKey: "bar"}); err != nil {
}}, [][]prompb.Label{{{Name: "alertname", Value: "alert0"}}}, map[string]string{headerKey: "bar"}); err != nil {
t.Fatalf("unexpected error %s", err)
}
@@ -174,7 +175,7 @@ func TestAlertManager_Send(t *testing.T) {
Name: "alert2",
Labels: map[string]string{"rule": "test", "tenant": "1"},
},
}, map[string]string{headerKey: "bar"}); err != nil {
}, [][]prompb.Label{{{Name: "rule", Value: "test"}, {Name: "tenant", Value: "0"}}, {{Name: "rule", Value: "test"}, {Name: "tenant", Value: "1"}}}, map[string]string{headerKey: "bar"}); err != nil {
t.Fatalf("unexpected error %s", err)
}
@@ -187,7 +188,7 @@ func TestAlertManager_Send(t *testing.T) {
Name: "alert2",
Labels: map[string]string{},
},
}, map[string]string{}); err != nil {
}, [][]prompb.Label{{{Name: "rule", Value: "test"}}, {{}}}, map[string]string{}); err != nil {
t.Fatalf("unexpected error %s", err)
}

View File

@@ -27,15 +27,9 @@ type Config struct {
// PathPrefix is added to URL path before adding alertManagerPath value
PathPrefix string `yaml:"path_prefix,omitempty"`
// ConsulSDConfigs contains list of settings for service discovery via Consul
// see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
ConsulSDConfigs []consul.SDConfig `yaml:"consul_sd_configs,omitempty"`
// DNSSDConfigs contains list of settings for service discovery via DNS.
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config
DNSSDConfigs []dns.SDConfig `yaml:"dns_sd_configs,omitempty"`
// StaticConfigs contains list of static targets
StaticConfigs []StaticConfig `yaml:"static_configs,omitempty"`
ConsulSDConfigs []ConsulSDConfigs `yaml:"consul_sd_configs,omitempty"`
DNSSDConfigs []DNSSDConfigs `yaml:"dns_sd_configs,omitempty"`
StaticConfigs []StaticConfig `yaml:"static_configs,omitempty"`
// HTTPClientConfig contains HTTP configuration for Notifier clients
HTTPClientConfig promauth.HTTPClientConfig `yaml:",inline"`
@@ -62,14 +56,29 @@ type Config struct {
parsedAlertRelabelConfigs *promrelabel.ParsedConfigs
}
// StaticConfig contains list of static targets in the following form:
// staticConfig contains list of static targets in the following form:
//
// targets:
// [ - '<host>' ]
type StaticConfig struct {
Targets []string `yaml:"targets"`
// HTTPClientConfig contains HTTP configuration for the Targets
HTTPClientConfig promauth.HTTPClientConfig `yaml:",inline"`
HTTPClientConfig promauth.HTTPClientConfig `yaml:",inline"`
AlertRelabelConfigs []promrelabel.RelabelConfig `yaml:"alert_relabel_configs,omitempty"`
}
// ConsulSDConfigs contains list of settings for service discovery via Consul,
// see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
type ConsulSDConfigs struct {
consul.SDConfig `yaml:",inline"`
AlertRelabelConfigs []promrelabel.RelabelConfig `yaml:"alert_relabel_configs,omitempty"`
}
// DNSSDConfigs contains list of settings for service discovery via DNS,
// see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config
type DNSSDConfigs struct {
dns.SDConfig `yaml:",inline"`
AlertRelabelConfigs []promrelabel.RelabelConfig `yaml:"alert_relabel_configs,omitempty"`
}
// UnmarshalYAML implements the yaml.Unmarshaler interface.
@@ -95,6 +104,31 @@ func (cfg *Config) UnmarshalYAML(unmarshal func(any) error) error {
}
cfg.parsedAlertRelabelConfigs = arCfg
for _, s := range cfg.StaticConfigs {
if len(s.AlertRelabelConfigs) > 0 {
_, err := promrelabel.ParseRelabelConfigs(s.AlertRelabelConfigs)
if err != nil {
return fmt.Errorf("failed to parse alert_relabel_configs in static_config: %w", err)
}
}
}
for _, s := range cfg.ConsulSDConfigs {
if len(s.AlertRelabelConfigs) > 0 {
_, err := promrelabel.ParseRelabelConfigs(s.AlertRelabelConfigs)
if err != nil {
return fmt.Errorf("failed to parse alert_relabel_configs in consul_sd_config: %w", err)
}
}
}
for _, s := range cfg.DNSSDConfigs {
if len(s.AlertRelabelConfigs) > 0 {
_, err := promrelabel.ParseRelabelConfigs(s.AlertRelabelConfigs)
if err != nil {
return fmt.Errorf("failed to parse alert_relabel_configs in dns_sd_config: %w", err)
}
}
}
b, err := yaml.Marshal(cfg)
if err != nil {
return fmt.Errorf("failed to marshal configuration for checksum: %w", err)

View File

@@ -35,4 +35,6 @@ func TestParseConfig_Failure(t *testing.T) {
f("testdata/unknownFields.bad.yaml", "unknown field")
f("non-existing-file", "error reading")
f("testdata/consul.bad.yaml", "failed to parse alert_relabel_configs in consul_sd_config")
f("testdata/dns.bad.yaml", "failed to parse alert relabeling config")
}

View File

@@ -8,6 +8,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/consul"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/dns"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutil"
@@ -28,11 +29,7 @@ type configWatcher struct {
targets map[TargetType][]Target
}
func newWatcher(path string, gen AlertURLGenerator) (*configWatcher, error) {
cfg, err := parseConfig(path)
if err != nil {
return nil, err
}
func newWatcher(cfg *Config, gen AlertURLGenerator) (*configWatcher, error) {
cw := &configWatcher{
cfg: cfg,
wg: sync.WaitGroup{},
@@ -88,18 +85,15 @@ func (cw *configWatcher) reload(path string) error {
return cw.start()
}
func (cw *configWatcher) add(typeK TargetType, interval time.Duration, labelsFn getLabels) error {
targetMetadata, errors := getTargetMetadata(labelsFn, cw.cfg)
func (cw *configWatcher) add(typeK TargetType, interval time.Duration, targetsFn getTargets) error {
targetMetadata, errors := getTargetMetadata(targetsFn, cw.cfg)
for _, err := range errors {
return fmt.Errorf("failed to init notifier for %q: %w", typeK, err)
}
cw.updateTargets(typeK, targetMetadata, cw.cfg, cw.genFn)
cw.wg.Add(1)
go func() {
defer cw.wg.Done()
cw.wg.Go(func() {
ticker := time.NewTicker(interval)
defer ticker.Stop()
@@ -109,62 +103,77 @@ func (cw *configWatcher) add(typeK TargetType, interval time.Duration, labelsFn
return
case <-ticker.C:
}
targetMetadata, errors := getTargetMetadata(labelsFn, cw.cfg)
targetMetadata, errors := getTargetMetadata(targetsFn, cw.cfg)
for _, err := range errors {
logger.Errorf("failed to init notifier for %q: %s", typeK, err)
}
cw.updateTargets(typeK, targetMetadata, cw.cfg, cw.genFn)
}
}()
})
return nil
}
func getTargetMetadata(labelsFn getLabels, cfg *Config) (map[string]*promutil.Labels, []error) {
metaLabels, err := labelsFn()
type targetMetadata struct {
*promutil.Labels
alertRelabelConfigs *promrelabel.ParsedConfigs
}
func getTargetMetadata(targetsFn getTargets, cfg *Config) (map[string]targetMetadata, []error) {
metaLabelsList, alertRelabelCfgs, err := targetsFn()
if err != nil {
return nil, []error{fmt.Errorf("failed to get labels: %w", err)}
}
targetMetadata := make(map[string]*promutil.Labels, len(metaLabels))
targetMts := make(map[string]targetMetadata, len(metaLabelsList))
var errors []error
duplicates := make(map[string]struct{})
for _, labels := range metaLabels {
target := labels.Get("__address__")
u, processedLabels, err := parseLabels(target, labels, cfg)
if err != nil {
errors = append(errors, err)
continue
}
if len(u) == 0 {
continue
}
if _, ok := duplicates[u]; ok { // check for duplicates
if !*suppressDuplicateTargetErrors {
logger.Errorf("skipping duplicate target with identical address %q; "+
"make sure service discovery and relabeling is set up properly; "+
"original labels: %s; resulting labels: %s",
u, labels, processedLabels)
for i := range metaLabelsList {
metaLabels := metaLabelsList[i]
alertRelabelCfg := alertRelabelCfgs[i]
for _, labels := range metaLabels {
target := labels.Get("__address__")
u, processedLabels, err := parseLabels(target, labels, cfg)
if err != nil {
errors = append(errors, err)
continue
}
if len(u) == 0 {
continue
}
// check for duplicated targets
// targets with same address but different alert_relabel_configs are still considered duplicates since it's mostly due to misconfiguration and could cause duplicated notifications.
if _, ok := duplicates[u]; ok {
if !*suppressDuplicateTargetErrors {
logger.Errorf("skipping duplicate target with identical address %q; "+
"make sure service discovery and relabeling is set up properly; "+
"original labels: %s; resulting labels: %s",
u, labels, processedLabels)
}
continue
}
duplicates[u] = struct{}{}
targetMts[u] = targetMetadata{
Labels: processedLabels,
alertRelabelConfigs: alertRelabelCfg,
}
continue
}
duplicates[u] = struct{}{}
targetMetadata[u] = processedLabels
}
return targetMetadata, errors
return targetMts, errors
}
type getLabels func() ([]*promutil.Labels, error)
type getTargets func() ([][]*promutil.Labels, []*promrelabel.ParsedConfigs, error)
func (cw *configWatcher) start() error {
if len(cw.cfg.StaticConfigs) > 0 {
var targets []Target
for _, cfg := range cw.cfg.StaticConfigs {
for i, cfg := range cw.cfg.StaticConfigs {
alertRelabelConfig, _ := promrelabel.ParseRelabelConfigs(cw.cfg.StaticConfigs[i].AlertRelabelConfigs)
httpCfg := mergeHTTPClientConfigs(cw.cfg.HTTPClientConfig, cfg.HTTPClientConfig)
for _, target := range cfg.Targets {
address, labels, err := parseLabels(target, nil, cw.cfg)
if err != nil {
return fmt.Errorf("failed to parse labels for target %q: %w", target, err)
}
notifier, err := NewAlertManager(address, cw.genFn, httpCfg, cw.cfg.parsedAlertRelabelConfigs, cw.cfg.Timeout.Duration())
notifier, err := NewAlertManager(address, cw.genFn, httpCfg, alertRelabelConfig, cw.cfg.Timeout.Duration())
if err != nil {
return fmt.Errorf("failed to init alertmanager for addr %q: %w", address, err)
}
@@ -178,17 +187,20 @@ func (cw *configWatcher) start() error {
}
if len(cw.cfg.ConsulSDConfigs) > 0 {
err := cw.add(TargetConsul, *consul.SDCheckInterval, func() ([]*promutil.Labels, error) {
var labels []*promutil.Labels
err := cw.add(TargetConsul, *consul.SDCheckInterval, func() ([][]*promutil.Labels, []*promrelabel.ParsedConfigs, error) {
var labels [][]*promutil.Labels
var alertRelabelConfigs []*promrelabel.ParsedConfigs
for i := range cw.cfg.ConsulSDConfigs {
alertRelabelConfig, _ := promrelabel.ParseRelabelConfigs(cw.cfg.ConsulSDConfigs[i].AlertRelabelConfigs)
sdc := &cw.cfg.ConsulSDConfigs[i]
targetLabels, err := sdc.GetLabels(cw.cfg.baseDir)
if err != nil {
return nil, fmt.Errorf("got labels err: %w", err)
return nil, nil, fmt.Errorf("got labels err: %w", err)
}
labels = append(labels, targetLabels...)
labels = append(labels, targetLabels)
alertRelabelConfigs = append(alertRelabelConfigs, alertRelabelConfig)
}
return labels, nil
return labels, alertRelabelConfigs, nil
})
if err != nil {
return fmt.Errorf("failed to start consulSD discovery: %w", err)
@@ -196,17 +208,21 @@ func (cw *configWatcher) start() error {
}
if len(cw.cfg.DNSSDConfigs) > 0 {
err := cw.add(TargetDNS, *dns.SDCheckInterval, func() ([]*promutil.Labels, error) {
var labels []*promutil.Labels
err := cw.add(TargetDNS, *dns.SDCheckInterval, func() ([][]*promutil.Labels, []*promrelabel.ParsedConfigs, error) {
var labels [][]*promutil.Labels
var alertRelabelConfigs []*promrelabel.ParsedConfigs
for i := range cw.cfg.DNSSDConfigs {
alertRelabelConfig, _ := promrelabel.ParseRelabelConfigs(cw.cfg.DNSSDConfigs[i].AlertRelabelConfigs)
sdc := &cw.cfg.DNSSDConfigs[i]
targetLabels, err := sdc.GetLabels(cw.cfg.baseDir)
if err != nil {
return nil, fmt.Errorf("got labels err: %w", err)
return nil, nil, fmt.Errorf("got labels err: %w", err)
}
labels = append(labels, targetLabels...)
labels = append(labels, targetLabels)
alertRelabelConfigs = append(alertRelabelConfigs, alertRelabelConfig)
}
return labels, nil
return labels, alertRelabelConfigs, nil
})
if err != nil {
return fmt.Errorf("failed to start DNSSD discovery: %w", err)
@@ -240,30 +256,30 @@ func (cw *configWatcher) setTargets(key TargetType, targets []Target) {
cw.targetsMu.Unlock()
}
func (cw *configWatcher) updateTargets(key TargetType, targetMetadata map[string]*promutil.Labels, cfg *Config, genFn AlertURLGenerator) {
func (cw *configWatcher) updateTargets(key TargetType, targetMts map[string]targetMetadata, cfg *Config, genFn AlertURLGenerator) {
cw.targetsMu.Lock()
defer cw.targetsMu.Unlock()
oldTargets := cw.targets[key]
var updatedTargets []Target
for _, ot := range oldTargets {
if _, ok := targetMetadata[ot.Addr()]; !ok {
if _, ok := targetMts[ot.Addr()]; !ok {
// if the target is missing from targetMts, close it
ot.Close()
} else {
updatedTargets = append(updatedTargets, ot)
delete(targetMetadata, ot.Addr())
delete(targetMts, ot.Addr())
}
}
// create new resources for the new targets
for addr, labels := range targetMetadata {
am, err := NewAlertManager(addr, genFn, cfg.HTTPClientConfig, cfg.parsedAlertRelabelConfigs, cfg.Timeout.Duration())
for addr, metadata := range targetMts {
am, err := NewAlertManager(addr, genFn, cfg.HTTPClientConfig, metadata.alertRelabelConfigs, cfg.Timeout.Duration())
if err != nil {
logger.Errorf("failed to init %s notifier with addr %q: %s", key, addr, err)
continue
}
updatedTargets = append(updatedTargets, Target{
Notifier: am,
Labels: labels,
Labels: metadata.Labels,
})
}
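The reconciliation done by `updateTargets` (keep what is still discovered, close what disappeared, create what is new) can be shown with a reduced sketch. The `target` type and `reconcile` function are hypothetical and ignore locking, labels and relabel configs.

```go
package main

import (
	"fmt"
	"sort"
)

// target is a hypothetical notifier handle that must be closed when dropped.
type target struct {
	addr   string
	closed bool
}

// Close marks the target as released.
func (t *target) Close() { t.closed = true }

// reconcile keeps targets whose address is still wanted, closes the rest,
// and creates targets for addresses that were not present before.
// The wanted map is consumed, as in the code above.
func reconcile(old []*target, wanted map[string]struct{}) []*target {
	var updated []*target
	for _, ot := range old {
		if _, ok := wanted[ot.addr]; !ok {
			ot.Close()
			continue
		}
		updated = append(updated, ot)
		delete(wanted, ot.addr)
	}
	// whatever is left in wanted has no existing target yet
	for addr := range wanted {
		updated = append(updated, &target{addr: addr})
	}
	return updated
}

func main() {
	a, b := &target{addr: "a"}, &target{addr: "b"}
	updated := reconcile([]*target{a, b}, map[string]struct{}{"b": {}, "c": {}})
	addrs := make([]string, 0, len(updated))
	for _, t := range updated {
		addrs = append(addrs, t.addr)
	}
	sort.Strings(addrs)
	fmt.Println(addrs, a.closed, b.closed)
}
```

One consequence of this shape, assuming the real code behaves the same way: a target that survives reconciliation keeps the notifier it was created with, so a changed relabel config for an unchanged address would not be picked up by this loop alone.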


@@ -7,6 +7,7 @@ import (
"net/http/httptest"
"os"
"sync"
"sync/atomic"
"testing"
"time"
@@ -28,7 +29,11 @@ static_configs:
- localhost:9093
- localhost:9094
`)
cw, err := newWatcher(f.Name(), nil)
cfg, err := parseConfig(f.Name())
if err != nil {
t.Fatalf("failed to parse config: %s", err)
}
cw, err := newWatcher(cfg, nil)
if err != nil {
t.Fatalf("failed to start config watcher: %s", err)
}
@@ -83,33 +88,64 @@ consul_sd_configs:
- server: %s
services:
- alertmanager
`, consulSDServer.URL))
- server: %s
services:
- alertmanager
alert_relabel_configs:
- target_label: "foo"
replacement: "tar"
`, consulSDServer.URL, consulSDServer.URL))
cw, err := newWatcher(consulSDFile.Name(), nil)
cfg, err := parseConfig(consulSDFile.Name())
if err != nil {
t.Fatalf("failed to parse config: %s", err)
}
cw, err := newWatcher(cfg, nil)
if err != nil {
t.Fatalf("failed to start config watcher: %s", err)
}
defer cw.mustStop()
if len(cw.notifiers()) != 2 {
t.Fatalf("expected to get 2 notifiers; got %d", len(cw.notifiers()))
if len(cw.notifiers()) != 3 {
t.Fatalf("expected to get 3 notifiers; got %d", len(cw.notifiers()))
}
expAddr1 := fmt.Sprintf("https://%s/proxy/api/v2/alerts", fakeConsulService1)
expAddr2 := fmt.Sprintf("https://%s/proxy/api/v2/alerts", fakeConsulService2)
expAddr3 := fmt.Sprintf("https://%s/proxy/api/v2/alerts", fakeConsulService3)
n1, n2 := cw.notifiers()[0], cw.notifiers()[1]
n1, n2, n3 := cw.notifiers()[0], cw.notifiers()[1], cw.notifiers()[2]
if n1.Addr() != expAddr1 {
t.Fatalf("exp address %q; got %q", expAddr1, n1.Addr())
}
if n2.Addr() != expAddr2 {
t.Fatalf("exp address %q; got %q", expAddr2, n2.Addr())
}
if n3.Addr() != expAddr3 {
t.Fatalf("exp address %q; got %q", expAddr3, n3.Addr())
}
if n1.(*AlertManager).relabelConfigs.String() != "" {
t.Fatalf("unexpected relabel configs: %q", n1.(*AlertManager).relabelConfigs.String())
}
if n2.(*AlertManager).relabelConfigs.String() != "" {
t.Fatalf("unexpected relabel configs: %q", n2.(*AlertManager).relabelConfigs.String())
}
if n3.(*AlertManager).relabelConfigs.String() != "- target_label: foo\n replacement: tar\n" {
t.Fatalf("unexpected relabel configs: %q", n3.(*AlertManager).relabelConfigs.String())
}
f := func() bool { return len(cw.notifiers()) == 1 }
if !waitFor(f, time.Second) {
t.Fatalf("expected to get 1 notifiers; got %d", len(cw.notifiers()))
}
n3 = cw.notifiers()[0]
if n3.Addr() != expAddr3 {
t.Fatalf("exp address %q; got %q", expAddr3, n3.Addr())
}
if n3.(*AlertManager).relabelConfigs.String() != "- target_label: foo\n replacement: tar\n" {
t.Fatalf("unexpected relabel configs: %q", n3.(*AlertManager).relabelConfigs.String())
}
}
// TestConfigWatcherReloadConcurrent is supposed to test concurrent
@@ -164,7 +200,11 @@ consul_sd_configs:
"unknownFields.bad.yaml",
}
cw, err := newWatcher(paths[0], nil)
cfg, err := parseConfig(paths[0])
if err != nil {
t.Fatalf("failed to parse config: %s", err)
}
cw, err := newWatcher(cfg, nil)
if err != nil {
t.Fatalf("failed to start config watcher: %s", err)
}
@@ -202,10 +242,11 @@ func checkErr(t *testing.T, err error) {
const (
fakeConsulService1 = "127.0.0.1:9093"
fakeConsulService2 = "127.0.0.1:9095"
fakeConsulService3 = "127.0.0.1:9097"
)
func newFakeConsulServer() *httptest.Server {
requestCount := 0
var requestCount atomic.Int32
mux := http.NewServeMux()
mux.HandleFunc("/v1/agent/self", func(rw http.ResponseWriter, _ *http.Request) {
rw.Write([]byte(`{"Config": {"Datacenter": "dc1"}}`))
@@ -220,7 +261,7 @@ func newFakeConsulServer() *httptest.Server {
}`))
})
mux.HandleFunc("/v1/health/service/alertmanager", func(rw http.ResponseWriter, _ *http.Request) {
if requestCount == 0 {
if requestCount.Load() == 0 {
rw.Header().Set("X-Consul-Index", "1")
rw.Write([]byte(`
[
@@ -360,7 +401,7 @@ func newFakeConsulServer() *httptest.Server {
}
]`))
}
requestCount++
requestCount.Add(1)
})
return httptest.NewServer(mux)
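The switch from a plain `int` to `atomic.Int32` matters because HTTP handlers may run on several goroutines at once. A minimal sketch of the same pattern, with hypothetical helper names `newCountingHandler` and `serve`, and an in-memory recorder instead of a real server:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// newCountingHandler returns a handler whose response depends on how many
// requests it has already served, plus the counter itself for inspection.
func newCountingHandler() (http.Handler, *atomic.Int32) {
	var count atomic.Int32
	h := http.HandlerFunc(func(rw http.ResponseWriter, _ *http.Request) {
		if count.Load() == 0 {
			rw.Write([]byte("first"))
		} else {
			rw.Write([]byte("later"))
		}
		count.Add(1)
	})
	return h, &count
}

// serve runs one in-memory GET request against h and returns the response body.
func serve(h http.Handler) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Body.String()
}

func main() {
	h, count := newCountingHandler()
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			serve(h)
		}()
	}
	wg.Wait()
	fmt.Println(count.Load())
}
```

The atomic type removes the data race on the counter. It does not make the `Load` followed by `Add` a single step, so two requests arriving at the same moment could both observe zero; for a test fixture that is usually acceptable.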


@@ -5,6 +5,8 @@ import (
"fmt"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
)
// FakeNotifier is a mock notifier
@@ -15,14 +17,32 @@ type FakeNotifier struct {
counter int
}
// InitFakeNotifier initializes global notifier to FakeNotifier,
// and returns a cleanup function to restore the original getActiveNotifiers.
func InitFakeNotifier() (*FakeNotifier, func()) {
originalGetActiveNotifiers := getActiveNotifiers
fn := &FakeNotifier{}
getActiveNotifiers = func() []Notifier {
return []Notifier{fn}
}
return fn, func() {
getActiveNotifiers = originalGetActiveNotifiers
}
}
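`InitFakeNotifier` follows a common test pattern: replace a package-level function variable and hand back a closure that restores it. A self-contained sketch of that pattern, where `activeNames` and `installFake` are hypothetical names that only illustrate the idea:

```go
package main

import "fmt"

// activeNames is a package-level hook, analogous in spirit to getActiveNotifiers.
var activeNames = func() []string { return []string{"real"} }

// installFake replaces activeNames and returns a function that restores it.
func installFake(names ...string) (restore func()) {
	original := activeNames
	activeNames = func() []string { return names }
	return func() { activeNames = original }
}

func main() {
	restore := installFake("fake")
	fmt.Println(activeNames())
	restore()
	fmt.Println(activeNames())
}
```

In a test the returned function would typically be deferred right after the call. Because the hook is global state, tests that rely on it should not run in parallel with each other.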
// Close does nothing
func (*FakeNotifier) Close() {}
// LastError returns last error message
func (*FakeNotifier) LastError() string {
return ""
}
// Addr returns ""
func (*FakeNotifier) Addr() string { return "" }
// Send sets alerts and increases counter
func (fn *FakeNotifier) Send(_ context.Context, alerts []Alert, _ map[string]string) error {
func (fn *FakeNotifier) Send(_ context.Context, alerts []Alert, _ [][]prompb.Label, _ map[string]string) error {
fn.Lock()
defer fn.Unlock()
fn.counter += len(alerts)


@@ -1,14 +1,22 @@
package notifier
import (
"context"
"flag"
"fmt"
"net/url"
"strconv"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/vmalertutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutil"
)
@@ -57,11 +65,61 @@ var (
sendTimeout = flagutil.NewArrayDuration("notifier.sendTimeout", 10*time.Second, "Timeout when sending alerts to the corresponding -notifier.url")
)
// cw holds a configWatcher for configPath configuration file
// configWatcher provides a list of Notifier objects discovered
// from static config or via service discovery.
// cw is not nil only if configPath is provided.
var cw *configWatcher
// AlertURLGeneratorFn returns a URL to the passed alert object.
// Call InitAlertURLGeneratorFn before using this function.
var AlertURLGeneratorFn AlertURLGenerator
// InitAlertURLGeneratorFn populates AlertURLGeneratorFn
func InitAlertURLGeneratorFn(externalURL *url.URL, externalAlertSource string, validateTemplate bool) error {
if externalAlertSource == "" {
AlertURLGeneratorFn = func(a Alert) string {
gID, aID := strconv.FormatUint(a.GroupID, 10), strconv.FormatUint(a.ID, 10)
return fmt.Sprintf("%s/vmalert/alert?%s=%s&%s=%s", externalURL, "group_id", gID, "alert_id", aID)
}
return nil
}
if validateTemplate {
if err := ValidateTemplates(map[string]string{
"tpl": externalAlertSource,
}); err != nil {
return fmt.Errorf("error validating source template %s: %w", externalAlertSource, err)
}
}
m := map[string]string{
"tpl": externalAlertSource,
}
AlertURLGeneratorFn = func(alert Alert) string {
qFn := func(_ string) ([]datasource.Metric, error) {
return nil, fmt.Errorf("`query` template isn't supported for alert source template")
}
templated, err := alert.ExecTemplate(qFn, alert.Labels, m)
if err != nil {
logger.Errorf("cannot template alert source: %s", err)
}
return fmt.Sprintf("%s/%s", externalURL, templated["tpl"])
}
return nil
}
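The default branch of `InitAlertURLGeneratorFn` (no external alert source configured) only formats two IDs into a query string. A standalone rewrite for illustration, with hypothetical helpers `defaultAlertURL` and `mustParse`; the expected URL shape is the one asserted in `TestGetAlertURLGenerator`:

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// mustParse is a small helper for the example; it panics on invalid input.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}

// defaultAlertURL builds the link used when no source template is configured.
func defaultAlertURL(externalURL *url.URL, groupID, alertID uint64) string {
	gID, aID := strconv.FormatUint(groupID, 10), strconv.FormatUint(alertID, 10)
	return fmt.Sprintf("%s/vmalert/alert?group_id=%s&alert_id=%s", externalURL, gID, aID)
}

func main() {
	u := mustParse("https://victoriametrics.com/path")
	fmt.Println(defaultAlertURL(u, 42, 2))
}
```

The external URL is concatenated as-is, so a trailing slash in the configured value would produce a double slash in the generated link.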
var (
// getActiveNotifiers returns the current list of Notifier objects.
getActiveNotifiers func() []Notifier
// globalRelabelCfg stores the parsed alert relabeling config from the config file, if there is one
globalRelabelCfg *promrelabel.ParsedConfigs
// cw holds a configWatcher for configPath configuration file
// configWatcher provides a list of Notifier objects discovered
// from static config or via service discovery.
// cw is not nil only if configPath is provided.
cw *configWatcher
// externalLabels is a global variable for holding external labels configured via flags
// It is supposed to be inited via Init function only.
externalLabels map[string]string
// externalURL is a global variable for holding external URL value configured via flag
// It is supposed to be inited via Init function only.
externalURL string
)
// Reload checks the changes in configPath configuration file
// and applies changes if any.
@@ -72,66 +130,62 @@ func Reload() error {
return cw.reload(*configPath)
}
var staticNotifiersFn func() []Notifier
var (
// externalLabels is a global variable for holding external labels configured via flags
// It is supposed to be inited via Init function only.
externalLabels map[string]string
// externalURL is a global variable for holding external URL value configured via flag
// It is supposed to be inited via Init function only.
externalURL string
)
// Init returns a function for retrieving actual list of Notifier objects.
// Init works in two modes:
// - configuration via flags (for backward compatibility). It is always static
// and doesn't support live reloads.
// - configuration via file. Supports live reloads and service discovery.
//
// Init returns an error if both modes are used.
func Init(gen AlertURLGenerator, extLabels map[string]string, extURL string) (func() []Notifier, error) {
func Init(extLabels map[string]string, extURL string) error {
externalURL = extURL
externalLabels = extLabels
_, err := url.Parse(externalURL)
if err != nil {
return nil, fmt.Errorf("failed to parse external URL: %w", err)
return fmt.Errorf("failed to parse external URL: %w", err)
}
if *blackHole {
if len(*addrs) > 0 || *configPath != "" {
return nil, fmt.Errorf("only one of -notifier.blackhole, -notifier.url and -notifier.config flags must be specified")
return fmt.Errorf("only one of -notifier.blackhole, -notifier.url and -notifier.config flags must be specified")
}
notifier := newBlackHoleNotifier()
staticNotifiersFn = func() []Notifier {
getActiveNotifiers = func() []Notifier {
return []Notifier{notifier}
}
return staticNotifiersFn, nil
return nil
}
if *configPath == "" && len(*addrs) == 0 {
return nil, nil
return nil
}
if *configPath != "" && len(*addrs) > 0 {
return nil, fmt.Errorf("only one of -notifier.config or -notifier.url flags must be specified")
return fmt.Errorf("only one of -notifier.config or -notifier.url flags must be specified")
}
if len(*addrs) > 0 {
notifiers, err := notifiersFromFlags(gen)
notifiers, err := notifiersFromFlags(AlertURLGeneratorFn)
if err != nil {
return nil, fmt.Errorf("failed to create notifier from flag values: %w", err)
return fmt.Errorf("failed to create notifier from flag values: %w", err)
}
staticNotifiersFn = func() []Notifier {
getActiveNotifiers = func() []Notifier {
return notifiers
}
return staticNotifiersFn, nil
return nil
}
cw, err = newWatcher(*configPath, gen)
cfg, err := parseConfig(*configPath)
if err != nil {
return nil, fmt.Errorf("failed to init config watcher: %w", err)
return err
}
return cw.notifiers, nil
if cfg.AlertRelabelConfigs != nil {
globalRelabelCfg = cfg.parsedAlertRelabelConfigs
}
cw, err = newWatcher(cfg, AlertURLGeneratorFn)
if err != nil {
return fmt.Errorf("failed to init config watcher: %w", err)
}
getActiveNotifiers = cw.notifiers
return nil
}
// InitSecretFlags must be called after flag.Parse and before any logging
@@ -206,23 +260,57 @@ const (
// GetTargets returns list of static or discovered targets
// via notifier configuration.
//
// Must be called after Init.
func GetTargets() map[TargetType][]Target {
var targets = make(map[TargetType][]Target)
if staticNotifiersFn != nil {
for _, ns := range staticNotifiersFn() {
targets[TargetStatic] = append(targets[TargetStatic], Target{
Notifier: ns,
})
}
if getActiveNotifiers == nil {
return nil
}
var targets = make(map[TargetType][]Target)
// use cached targets from configWatcher instead of getActiveNotifiers for the extra target labels
if cw != nil {
cw.targetsMu.RLock()
for key, ns := range cw.targets {
targets[key] = append(targets[key], ns...)
}
cw.targetsMu.RUnlock()
return targets
}
// static notifiers don't have labels
for _, ns := range getActiveNotifiers() {
targets[TargetStatic] = append(targets[TargetStatic], Target{
Notifier: ns,
})
}
return targets
}
// Send sends alerts to all active notifiers
func Send(ctx context.Context, alerts []Alert, notifierHeaders map[string]string) *vmalertutil.ErrGroup {
alertsToSend := make([]Alert, 0, len(alerts))
lblss := make([][]prompb.Label, 0, len(alerts))
// apply global relabel config first without modifying original alerts in alerts
for _, a := range alerts {
lbls := a.applyRelabelingIfNeeded(globalRelabelCfg)
if len(lbls) == 0 {
continue
}
alertsToSend = append(alertsToSend, a)
lblss = append(lblss, lbls)
}
errGr := new(vmalertutil.ErrGroup)
wg := sync.WaitGroup{}
activeNotifiers := getActiveNotifiers()
for i := range activeNotifiers {
nt := activeNotifiers[i]
wg.Go(func() {
if err := nt.Send(ctx, alertsToSend, lblss, notifierHeaders); err != nil {
errGr.Add(fmt.Errorf("failed to send alerts to addr %q: %w", nt.Addr(), err))
}
})
}
wg.Wait()
return errGr
}
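The fan-out in `Send` can be reduced to the following sketch. It is an illustration under assumptions: `sender`, `fakeSender`, and `sendAll` are hypothetical, the relabeling step is left out, and a mutex-guarded slice replaces `vmalertutil.ErrGroup`. The diff uses `wg.Go`, which needs a recent Go toolchain; the sketch uses `Add`/`Done` so it also builds with older ones.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// sender is a hypothetical, reduced version of the Notifier interface.
type sender interface {
	Addr() string
	Send(alerts []string) error
}

// fakeSender succeeds or fails depending on its fail flag.
type fakeSender struct {
	addr string
	fail bool
}

func (s fakeSender) Addr() string { return s.addr }

func (s fakeSender) Send(_ []string) error {
	if s.fail {
		return errors.New("unreachable")
	}
	return nil
}

// sendAll delivers alerts to every sender concurrently and returns all errors.
func sendAll(senders []sender, alerts []string) []error {
	var (
		mu   sync.Mutex
		errs []error
		wg   sync.WaitGroup
	)
	for _, s := range senders {
		wg.Add(1)
		go func(s sender) {
			defer wg.Done()
			if err := s.Send(alerts); err != nil {
				mu.Lock()
				errs = append(errs, fmt.Errorf("failed to send alerts to addr %q: %w", s.Addr(), err))
				mu.Unlock()
			}
		}(s)
	}
	wg.Wait()
	return errs
}

func main() {
	senders := []sender{fakeSender{addr: "a"}, fakeSender{addr: "b", fail: true}}
	for _, err := range sendAll(senders, []string{"alert1"}) {
		fmt.Println(err)
	}
}
```

A failing notifier does not prevent delivery to the others; the caller receives every error after all goroutines have finished.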


@@ -1,9 +1,17 @@
package notifier
import (
"context"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"net/url"
"os"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
)
func TestInit(t *testing.T) {
@@ -12,14 +20,13 @@ func TestInit(t *testing.T) {
*addrs = flagutil.ArrayString{"127.0.0.1", "127.0.0.2"}
fn, err := Init(nil, nil, "")
err := Init(nil, "")
if err != nil {
t.Fatalf("%s", err)
}
nfs := fn()
if len(nfs) != 2 {
t.Fatalf("expected to get 2 notifiers; got %d", len(nfs))
if len(getActiveNotifiers()) != 2 {
t.Fatalf("expected to get 2 notifiers; got %d", len(getActiveNotifiers()))
}
targets := GetTargets()
@@ -52,7 +59,7 @@ func TestInitNegative(t *testing.T) {
*configPath = path
*addrs = flagutil.ArrayString{addr}
*blackHole = bh
if _, err := Init(nil, nil, ""); err == nil {
if err := Init(nil, ""); err == nil {
t.Fatalf("expected to get error; got nil instead")
}
}
@@ -69,14 +76,13 @@ func TestBlackHole(t *testing.T) {
*blackHole = true
fn, err := Init(nil, nil, "")
err := Init(nil, "")
if err != nil {
t.Fatalf("%s", err)
}
nfs := fn()
if len(nfs) != 1 {
t.Fatalf("expected to get 1 notifier; got %d", len(nfs))
if len(getActiveNotifiers()) != 1 {
t.Fatalf("expected to get 1 notifier; got %d", len(getActiveNotifiers()))
}
targets := GetTargets()
@@ -91,3 +97,112 @@ func TestBlackHole(t *testing.T) {
t.Fatalf("expected to get \"blackhole\"; got %q instead", nf1.Addr())
}
}
func TestGetAlertURLGenerator(t *testing.T) {
oldAlertURLGeneratorFn := AlertURLGeneratorFn
defer func() { AlertURLGeneratorFn = oldAlertURLGeneratorFn }()
testAlert := Alert{GroupID: 42, ID: 2, Value: 4, Labels: map[string]string{"tenant": "baz"}}
u, _ := url.Parse("https://victoriametrics.com/path")
err := InitAlertURLGeneratorFn(u, "", false)
if err != nil {
t.Fatalf("unexpected error %s", err)
}
exp := fmt.Sprintf("https://victoriametrics.com/path/vmalert/alert?%s=42&%s=2", "group_id", "alert_id")
if exp != AlertURLGeneratorFn(testAlert) {
t.Fatalf("unexpected url want %s, got %s", exp, AlertURLGeneratorFn(testAlert))
}
err = InitAlertURLGeneratorFn(nil, "foo?{{invalid}}", true)
if err == nil {
t.Fatalf("expected template validation error got nil")
}
err = InitAlertURLGeneratorFn(u, "foo?query={{$value}}&ds={{ $labels.tenant }}", true)
if err != nil {
t.Fatalf("unexpected error %s", err)
}
if exp := "https://victoriametrics.com/path/foo?query=4&ds=baz"; exp != AlertURLGeneratorFn(testAlert) {
t.Fatalf("unexpected url want %s, got %s", exp, AlertURLGeneratorFn(testAlert))
}
}
func TestSendAlerts(t *testing.T) {
oldAlertURLGeneratorFn := AlertURLGeneratorFn
defer func() { AlertURLGeneratorFn = oldAlertURLGeneratorFn }()
AlertURLGeneratorFn = func(alert Alert) string {
return ""
}
mux := http.NewServeMux()
mux.HandleFunc("/", func(_ http.ResponseWriter, _ *http.Request) {
t.Fatalf("should not be called")
})
mux.HandleFunc(alertManagerPath, func(w http.ResponseWriter, r *http.Request) {
var a []struct {
Labels map[string]string `json:"labels"`
}
if err := json.NewDecoder(r.Body).Decode(&a); err != nil {
t.Fatalf("can not unmarshal data into alert %s", err)
}
if len(a) != 2 {
t.Fatalf("expected 2 alert in array got %d", len(a))
}
if len(a[0].Labels) != 4 {
t.Fatalf("expected 4 labels got %d", len(a[0].Labels))
}
if a[0].Labels["env"] != "prod" {
t.Fatalf("expected env label to be prod during relabeling, got %s", a[0].Labels["env"])
}
if a[0].Labels["c"] != "baz" {
t.Fatalf("expected c label to be baz during relabeling, got %s", a[0].Labels["c"])
}
if len(a[1].Labels) != 1 {
t.Fatalf("expected 1 labels got %d", len(a[1].Labels))
}
})
srv := httptest.NewServer(mux)
defer srv.Close()
f, err := os.CreateTemp("", "")
if err != nil {
t.Fatal(err)
}
defer fs.MustRemovePath(f.Name())
rawConfig := `
static_configs:
- targets:
- %s
alert_relabel_configs:
- source_labels: [b]
target_label: "c"
alert_relabel_configs:
- source_labels: [a]
target_label: "b"
- target_label: "env"
replacement: "prod"
`
config := fmt.Sprintf(rawConfig, srv.URL+alertManagerPath)
writeToFile(f.Name(), config)
oldConfigPath := *configPath
defer func() { *configPath = oldConfigPath }()
*configPath = f.Name()
err = Init(nil, "")
if err != nil {
t.Fatalf("unexpected error when parse notifier config: %s", err)
}
firingAlerts := []Alert{
{
Name: "alert1",
Labels: map[string]string{"a": "baz"},
},
{
Name: "alert2",
Labels: map[string]string{},
},
}
errG := Send(context.Background(), firingAlerts, nil)
if errG.Err() != nil {
t.Fatalf("unexpected error when sending alerts: %s", errG.Err())
}
}


@@ -1,15 +1,21 @@
package notifier
import "context"
import (
"context"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
)
// Notifier is a common interface for alert manager provider
type Notifier interface {
// Send sends the given list of alerts.
// Returns an error if fails to send the alerts.
// Must unblock if the given ctx is cancelled.
Send(ctx context.Context, alerts []Alert, notifierHeaders map[string]string) error
Send(ctx context.Context, alerts []Alert, alertLabels [][]prompb.Label, notifierHeaders map[string]string) error
// Addr returns address where alerts are sent.
Addr() string
// LastError returns the error that occurred during the last attempt to send data
LastError() string
// Close is a destructor for the Notifier
Close()
}


@@ -1,6 +1,10 @@
package notifier
import "context"
import (
"context"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
)
// blackHoleNotifier is a Notifier stub, used when no notifications need
// to be sent.
@@ -10,7 +14,7 @@ type blackHoleNotifier struct {
}
// Send will send no notifications, but increase the metric.
func (bh *blackHoleNotifier) Send(_ context.Context, alerts []Alert, _ map[string]string) error { //nolint:revive
func (bh *blackHoleNotifier) Send(_ context.Context, alerts []Alert, _ [][]prompb.Label, _ map[string]string) error { //nolint:revive
bh.metrics.alertsSent.Add(len(alerts))
return nil
}
@@ -25,6 +29,11 @@ func (bh *blackHoleNotifier) Close() {
bh.metrics.close()
}
// LastError returns the last notifier error
func (bh *blackHoleNotifier) LastError() string {
return ""
}
// newBlackHoleNotifier creates a new blackHoleNotifier
func newBlackHoleNotifier() *blackHoleNotifier {
address := "blackhole"


@@ -5,6 +5,7 @@ import (
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
metricset "github.com/VictoriaMetrics/metrics"
)
@@ -16,7 +17,7 @@ func TestBlackHoleNotifier_Send(t *testing.T) {
Start: time.Now().UTC(),
End: time.Now().UTC(),
Annotations: map[string]string{"a": "b", "c": "d", "e": "f"},
}}, nil); err != nil {
}}, [][]prompb.Label{{}}, nil); err != nil {
t.Fatalf("unexpected error %s", err)
}
@@ -34,7 +35,7 @@ func TestBlackHoleNotifier_Close(t *testing.T) {
Start: time.Now().UTC(),
End: time.Now().UTC(),
Annotations: map[string]string{"a": "b", "c": "d", "e": "f"},
}}, nil); err != nil {
}}, [][]prompb.Label{{}}, nil); err != nil {
t.Fatalf("unexpected error %s", err)
}


@@ -0,0 +1,19 @@
consul_sd_configs:
- server: localhost:8500
scheme: http
services:
- alertmanager
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "prod"
- server: localhost:8500
services:
- consul
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "(abc"
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"


@@ -0,0 +1,13 @@
dns_sd_configs:
- names:
- cloudflare.com
type: 'A'
port: 9093
relabel_configs:
- source_labels: [__meta_dns_name]
replacement: '${1}'
target_label: dns_name
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "(abc"


@@ -2,12 +2,19 @@ static_configs:
- targets:
- localhost:9093
- localhost:9095
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "static"
consul_sd_configs:
- server: localhost:8500
scheme: http
services:
- alertmanager
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "consul"
- server: localhost:8500
services:
- consul
@@ -17,6 +24,10 @@ dns_sd_configs:
- cloudflare.com
type: 'A'
port: 9093
alert_relabel_configs:
- action: keep
source_labels: [env]
regex: "dns"
relabel_configs:
- source_labels: [__meta_consul_tags]
@@ -25,4 +36,4 @@ relabel_configs:
target_label: __scheme__
- source_labels: [__meta_dns_name]
replacement: '${1}'
target_label: dns_name
target_label: dns_name


@@ -1,22 +1,14 @@
headers:
- 'CustomHeader: foo'
static_configs:
- targets:
- localhost:9093
- localhost:9095
- https://localhost:9093/test/api/v2/alerts
basic_auth:
username: foo
password: bar
- http://192.168.0.101:9093
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"
- targets:
- localhost:9096
- localhost:9097
basic_auth:
username: foo
password: baz
- http://192.168.0.101:9093
alert_relabel_configs:
- target_label: "foo"
replacement: "ccc"
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"


@@ -0,0 +1,19 @@
package notifier
// ApiNotifier represents a Notifier configuration for WEB view
type ApiNotifier struct {
// Kind is a Notifier type
Kind TargetType `json:"kind"`
// Targets is a list of Notifier targets
Targets []*ApiTarget `json:"targets"`
}
// ApiTarget represents a specific Notifier target for WEB view
type ApiTarget struct {
// Address is a URL for sending notifications
Address string `json:"address"`
// Labels is a list of labels to add to each sent notification
Labels map[string]string `json:"labels"`
// LastError contains the error faced while sending to notifier.
LastError string `json:"lastError"`
}


@@ -14,9 +14,9 @@ import (
)
var (
addr = flag.String("remoteRead.url", "", "Optional URL to datasource compatible with MetricsQL. It can be single node VictoriaMetrics or vmselect."+
"Remote read is used to restore alerts state."+
"This configuration makes sense only if `vmalert` was configured with `remoteWrite.url` before and has been successfully persisted its state. "+
addr = flag.String("remoteRead.url", "", "Optional URL to datasource compatible with MetricsQL. It can be single node VictoriaMetrics or vmselect. "+
"Remote read is used to restore alerts state. "+
"This configuration makes sense only if vmalert was configured with '-remoteWrite.url' before and has successfully persisted its state. "+
"Supports address in the form of IP address with a port (e.g., http://127.0.0.1:8428) or DNS SRV record. "+
"See also '-remoteRead.disablePathAppend', '-remoteRead.showURL'.")


@@ -173,9 +173,8 @@ func (c *Client) run(ctx context.Context) {
cancel()
}
c.wg.Add(1)
go func() {
defer c.wg.Done()
c.wg.Go(func() {
defer ticker.Stop()
for {
select {
@@ -197,7 +196,7 @@ func (c *Client) run(ctx context.Context) {
}
}
}
}()
})
}
var (


@@ -187,6 +187,54 @@ func (ar *AlertingRule) ID() uint64 {
return ar.RuleID
}
// ToAPI returns ApiRule representation of ar
func (ar *AlertingRule) ToAPI() ApiRule {
state := ar.state
lastState := state.getLast()
r := ApiRule{
Type: TypeAlerting,
DatasourceType: ar.Type.String(),
Name: ar.Name,
Query: ar.Expr,
Duration: ar.For.Seconds(),
KeepFiringFor: ar.KeepFiringFor.Seconds(),
Labels: ar.Labels,
Annotations: ar.Annotations,
LastEvaluation: lastState.Time,
EvaluationTime: lastState.Duration.Seconds(),
Health: "ok",
State: "inactive",
Alerts: ar.AlertsToAPI(),
LastSamples: lastState.Samples,
LastSeriesFetched: lastState.SeriesFetched,
MaxUpdates: state.size(),
Updates: state.getAll(),
Debug: ar.Debug,
// encode as strings to avoid rounding in JSON
ID: fmt.Sprintf("%d", ar.ID()),
GroupID: fmt.Sprintf("%d", ar.GroupID),
GroupName: ar.GroupName,
File: ar.File,
}
if lastState.Err != nil {
r.LastError = lastState.Err.Error()
r.Health = "err"
}
// satisfy apiRule.State logic
if len(r.Alerts) > 0 {
r.State = notifier.StatePending.String()
stateFiring := notifier.StateFiring.String()
for _, a := range r.Alerts {
if a.State == stateFiring {
r.State = stateFiring
break
}
}
}
return r
}
// GetAlerts returns active alerts of rule
func (ar *AlertingRule) GetAlerts() []*notifier.Alert {
ar.alertsMu.RLock()
@@ -198,16 +246,6 @@ func (ar *AlertingRule) GetAlerts() []*notifier.Alert {
return alerts
}
// GetAlert returns alert if id exists
func (ar *AlertingRule) GetAlert(id uint64) *notifier.Alert {
ar.alertsMu.RLock()
defer ar.alertsMu.RUnlock()
if ar.alerts == nil {
return nil
}
return ar.alerts[id]
}
func (ar *AlertingRule) logDebugf(at time.Time, a *notifier.Alert, format string, args ...any) {
if !ar.Debug {
return
@@ -273,6 +311,11 @@ type labelSet struct {
// On k conflicts in origin set, the original value is preferred and copied
// to processed with `exported_%k` key. The copy happens only if passed v isn't equal to origin[k] value.
func (ls *labelSet) add(k, v string) {
// do not add label with empty value, since it has no meaning.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
if v == "" {
return
}
ls.processed[k] = v
ov, ok := ls.origin[k]
if !ok {
@@ -307,9 +350,6 @@ func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*l
Value: m.Values[0],
Expr: ar.Expr,
})
if err != nil {
return nil, fmt.Errorf("failed to expand labels: %w", err)
}
for k, v := range extraLabels {
ls.add(k, v)
}
@@ -320,7 +360,7 @@ func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*l
if !*disableAlertGroupLabel && ar.GroupName != "" {
ls.add(alertGroupNameLabel, ar.GroupName)
}
return ls, nil
return ls, err
}
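The effect of the `labelSet.add` change (empty values are skipped) is easier to see in a runnable form. The sketch below is reconstructed from the doc comment and the lines visible in the hunk; everything after the `if !ok` check is an assumption about the unseen remainder of the function, not a copy of it.

```go
package main

import "fmt"

// labelSet holds the labels as received (origin) and as sent out (processed).
type labelSet struct {
	origin    map[string]string
	processed map[string]string
}

// add stores an extra label. Empty values are ignored. On a conflict with an
// origin label that has a different value, the origin value is preserved in
// processed under an "exported_" prefixed key (assumed from the doc comment).
func (ls *labelSet) add(k, v string) {
	if v == "" {
		return
	}
	ls.processed[k] = v
	ov, ok := ls.origin[k]
	if !ok {
		ls.origin[k] = v
		return
	}
	if ov != v {
		ls.processed["exported_"+k] = ov
	}
}

func main() {
	ls := &labelSet{
		origin:    map[string]string{"instance": "0.0.0.0:8800", "group": "vmalert"},
		processed: map[string]string{"instance": "0.0.0.0:8800", "group": "vmalert"},
	}
	ls.add("instance", "override") // conflicts with a different origin value
	ls.add("group", "vmalert")     // same value, nothing is exported
	ls.add("empty_label", "")      // dropped
	fmt.Println(ls.processed)
}
```

This lines up with the `empty_label` case added to `TestAlertingRule_ToLabels`, where the label is expected to be dropped.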
// execRange executes alerting rule on the given time range similarly to exec.
@@ -341,7 +381,7 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
return []datasource.Metric{{Timestamps: []int64{0}, Values: []float64{math.NaN()}}}, nil
}
for _, s := range res.Data {
ls, err := ar.expandLabelTemplates(s)
ls, err := ar.expandLabelTemplates(s, qFn)
if err != nil {
return nil, err
}
@@ -434,10 +474,11 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
expandedLabels := make([]*labelSet, len(res.Data))
expandedAnnotations := make([]map[string]string, len(res.Data))
for i, m := range res.Data {
ls, err := ar.expandLabelTemplates(m)
ls, err := ar.expandLabelTemplates(m, qFn)
if err != nil {
// only set error in current state, but do not break alert processing
curState.Err = err
return nil, curState.Err
logger.Errorf("got templating error in rule %s: %q", ar.Name, err)
}
at := ts
alertID := hash(ls.processed)
@@ -449,8 +490,9 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
}
as, err := ar.expandAnnotationTemplates(m, qFn, at, ls)
if err != nil {
// only set error in current state, but do not break alert processing
curState.Err = err
return nil, curState.Err
logger.Errorf("got templating error in rule %s: %q", ar.Name, err)
}
expandedLabels[i] = ls
expandedAnnotations[i] = as
@@ -556,13 +598,10 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
return append(tss, ar.toTimeSeries(ts.Unix())...), nil
}
func (ar *AlertingRule) expandLabelTemplates(m datasource.Metric) (*labelSet, error) {
qFn := func(_ string) ([]datasource.Metric, error) {
return nil, fmt.Errorf("`query` template isn't supported in rule label")
}
func (ar *AlertingRule) expandLabelTemplates(m datasource.Metric, qFn templates.QueryFn) (*labelSet, error) {
ls, err := ar.toLabels(m, qFn)
if err != nil {
return nil, fmt.Errorf("failed to expand label templates: %s", err)
return ls, fmt.Errorf("failed to expand label templates: %s", err)
}
return ls, nil
}
@@ -580,7 +619,7 @@ func (ar *AlertingRule) expandAnnotationTemplates(m datasource.Metric, qFn templ
}
as, err := notifier.ExecTemplate(qFn, ar.Annotations, tplData)
if err != nil {
return nil, fmt.Errorf("failed to expand annotation templates: %s", err)
return as, fmt.Errorf("failed to expand annotation templates: %s", err)
}
return as, nil
}

View File

@@ -10,6 +10,7 @@ import (
"strings"
"sync"
"testing"
"testing/synctest"
"time"
"github.com/VictoriaMetrics/metrics"
@@ -826,12 +827,9 @@ func TestGroup_Restore(t *testing.T) {
fg := NewGroup(config.Group{Name: "TestRestore", Rules: rules}, fqr, time.Second, nil)
fg.Init()
wg := sync.WaitGroup{}
wg.Add(1)
go func() {
nts := func() []notifier.Notifier { return []notifier.Notifier{&notifier.FakeNotifier{}} }
fg.Start(context.Background(), nts, nil, fqr)
wg.Done()
}()
wg.Go(func() {
fg.Start(context.Background(), nil, fqr)
})
fg.Close()
wg.Wait()
@@ -1372,8 +1370,10 @@ func TestAlertingRule_ToLabels(t *testing.T) {
ar := &AlertingRule{
Labels: map[string]string{
"instance": "override", // this should override instance with new value
"group": "vmalert", // this shouldn't have effect since value in metric is equal
"instance": "override", // this should override instance with new value
"group": "vmalert", // this shouldn't have effect since value in metric is equal
"invalid_label": "{{ .Values.mustRuntimeFail }}",
"empty_label": "", // this should be dropped
},
Expr: "sum(vmalert_alerting_rules_error) by(instance, group, alertname) > 0",
Name: "AlertingRulesError",
@@ -1381,10 +1381,11 @@ func TestAlertingRule_ToLabels(t *testing.T) {
}
expectedOriginLabels := map[string]string{
"instance": "0.0.0.0:8800",
"group": "vmalert",
"alertname": "ConfigurationReloadFailure",
"alertgroup": "vmalert",
"instance": "0.0.0.0:8800",
"group": "vmalert",
"alertname": "ConfigurationReloadFailure",
"alertgroup": "vmalert",
"invalid_label": `error evaluating template: template: :1:268: executing "" at <.Values.mustRuntimeFail>: can't evaluate field Values in type notifier.tplData`,
}
expectedProcessedLabels := map[string]string{
@@ -1394,11 +1395,12 @@ func TestAlertingRule_ToLabels(t *testing.T) {
"exported_alertname": "ConfigurationReloadFailure",
"group": "vmalert",
"alertgroup": "vmalert",
"invalid_label": `error evaluating template: template: :1:268: executing "" at <.Values.mustRuntimeFail>: can't evaluate field Values in type notifier.tplData`,
}
ls, err := ar.toLabels(metric, nil)
if err != nil {
t.Fatalf("unexpected error: %s", err)
if err == nil || !strings.Contains(err.Error(), "error evaluating template") {
t.Fatalf("expected templating error; got %v", err)
}
if !reflect.DeepEqual(ls.origin, expectedOriginLabels) {
@@ -1429,3 +1431,142 @@ func TestAlertingRuleExec_Partial(t *testing.T) {
t.Fatalf("unexpected error: %s", err)
}
}
func TestAlertingRule_QueryTemplateInLabels(t *testing.T) {
fq := &datasource.FakeQuerier{}
fakeGroup := Group{
Name: "TestQueryTemplateInLabels",
}
ar := &AlertingRule{
Name: "test_alert",
Labels: map[string]string{
"suppress_for_mass_alert": `{{ if (printf "ALERTS{alertname='SomeAlert', alertstate='firing', device='%s'} == 1" $labels.device | query) }}true{{ else }}false{{ end }}`,
},
Annotations: map[string]string{
"summary": "Test alert with query template in labels",
},
alerts: make(map[uint64]*notifier.Alert),
}
ar.GroupID = fakeGroup.GetID()
ar.q = fq
ar.state = &ruleState{
entries: make([]StateEntry, 10),
}
// Add a metric that should trigger the alert
fq.Add(metricWithValueAndLabels(t, 1, "device", "sda1"))
ts := time.Now()
_, err := ar.exec(context.TODO(), ts, 0)
if err != nil {
t.Fatalf("unexpected error with query template in labels: %s", err)
}
// Verify that the alert was created and the query template was executed
if len(ar.alerts) != 1 {
t.Fatalf("expected 1 alert, got %d", len(ar.alerts))
}
alert := ar.GetAlerts()[0]
suppressLabel, exists := alert.Labels["suppress_for_mass_alert"]
if !exists {
t.Fatalf("expected 'suppress_for_mass_alert' label to exist")
}
// The query template should have been executed (even if it returns false due to mock data)
if suppressLabel != "true" && suppressLabel != "false" {
t.Fatalf("expected 'suppress_for_mass_alert' label to be 'true' or 'false', got '%s'", suppressLabel)
}
}
// TestAlertingRule_ActiveAtPreservedInAnnotations ensures that the fix for
// https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9543 is preserved
// while allowing query templates in labels (https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9783)
func TestAlertingRule_ActiveAtPreservedInAnnotations(t *testing.T) {
// wrap into synctest because of time manipulations
synctest.Test(t, func(t *testing.T) {
fq := &datasource.FakeQuerier{}
ar := &AlertingRule{
Name: "TestActiveAtPreservation",
Labels: map[string]string{
"test_query_in_label": `{{ "static_value" }}`,
},
Annotations: map[string]string{
"description": "Alert active since {{ $activeAt }}",
},
alerts: make(map[uint64]*notifier.Alert),
q: fq,
state: &ruleState{
entries: make([]StateEntry, 10),
},
}
// Mock query result - return empty result to make suppress_for_mass_alert = false
// (no need to add anything to fq for empty result)
// Add a metric that should trigger the alert
fq.Add(metricWithValueAndLabels(t, 1, "instance", "server1"))
// First execution - creates new alert
ts1 := time.Now()
_, err := ar.exec(context.TODO(), ts1, 0)
if err != nil {
t.Fatalf("unexpected error on first exec: %s", err)
}
if len(ar.alerts) != 1 {
t.Fatalf("expected 1 alert, got %d", len(ar.alerts))
}
firstAlert := ar.GetAlerts()[0]
// Verify first execution: activeAt should be ts1 and annotation should reflect it
if !firstAlert.ActiveAt.Equal(ts1) {
t.Fatalf("expected activeAt to be %v, got %v", ts1, firstAlert.ActiveAt)
}
// Extract time from annotation (format will be like "Alert active since 2025-09-30 08:55:13.638551611 -0400 EDT m=+0.002928464")
expectedTimeStr := ts1.Format("2006-01-02 15:04:05")
if !strings.Contains(firstAlert.Annotations["description"], expectedTimeStr) {
t.Fatalf("first exec annotation should contain time %s, got: %s", expectedTimeStr, firstAlert.Annotations["description"])
}
// Second execution - should preserve activeAt in annotation
// Ensure different timestamp with different seconds
// sleep is non-blocking thanks to synctest
time.Sleep(2 * time.Second)
ts2 := time.Now()
_, err = ar.exec(context.TODO(), ts2, 0)
if err != nil {
t.Fatalf("unexpected error on second exec: %s", err)
}
// Get the alert again (should be the same alert)
if len(ar.alerts) != 1 {
t.Fatalf("expected 1 alert, got %d", len(ar.alerts))
}
secondAlert := ar.GetAlerts()[0]
// Critical test: activeAt should still be ts1, not ts2
if !secondAlert.ActiveAt.Equal(ts1) {
t.Fatalf("activeAt should be preserved as %v, but got %v", ts1, secondAlert.ActiveAt)
}
// Critical test: annotation should still contain ts1 time, not ts2
if !strings.Contains(secondAlert.Annotations["description"], expectedTimeStr) {
t.Fatalf("second exec annotation should still contain original time %s, got: %s", expectedTimeStr, secondAlert.Annotations["description"])
}
// Additional verification: annotation should NOT contain ts2 time
ts2TimeStr := ts2.Format("2006-01-02 15:04:05")
if strings.Contains(secondAlert.Annotations["description"], ts2TimeStr) {
t.Fatalf("annotation should NOT contain new eval time %s, got: %s", ts2TimeStr, secondAlert.Annotations["description"])
}
// Verify query template in labels still works (this would fail if query templates were broken)
if firstAlert.Labels["test_query_in_label"] != "static_value" {
t.Fatalf("expected test_query_in_label=static_value, got %s", firstAlert.Labels["test_query_in_label"])
}
})
}

View File

@@ -2,7 +2,6 @@ package rule
import (
"context"
"encoding/json"
"errors"
"flag"
"fmt"
@@ -19,12 +18,15 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/vmalertutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
)
var (
ruleResultsLimit = flag.Int("rule.resultsLimit", 0, "Limits the number of alerts or recording results a single rule can produce. "+
"Can be overridden by the limit option under group if specified. "+
"If exceeded, the rule will be marked with an error and all its results will be discarded. "+
"0 means no limit.")
ruleUpdateEntriesLimit = flag.Int("rule.updateEntriesLimit", 20, "Defines the max number of rule's state updates stored in-memory. "+
"Rule's updates are available on rule's Details page and are used for debugging purposes. The number of stored updates can be overridden per rule via update_entries_limit param.")
resendDelay = flag.Duration("rule.resendDelay", 0, "Minimum amount of time to wait before resending an alert to notifier.")
@@ -36,6 +38,8 @@ var (
disableAlertGroupLabel = flag.Bool("disableAlertgroupLabel", false, "Whether to disable adding group's Name as label to generated alerts and time series.")
remoteReadLookBack = flag.Duration("remoteRead.lookback", time.Hour, "Lookback defines how far to look into past for alerts timeseries. "+
"For example, if lookback=1h then range from now() to now()-1h will be scanned.")
maxStartDelay = flag.Duration("group.maxStartDelay", 5*time.Minute, "Defines the max delay before starting the group evaluation. Group's start is artificially delayed by a random duration in the interval"+
" [0..min(--group.maxStartDelay, group.interval)]. This helps smooth out the load on the configured datasource, so evaluations aren't executed too close to each other.")
)
// Group is an entity for grouping rules
@@ -112,7 +116,6 @@ func NewGroup(cfg config.Group, qb datasource.QuerierBuilder, defaultInterval ti
Name: cfg.Name,
File: cfg.File,
Interval: cfg.Interval.Duration(),
Limit: cfg.Limit,
Concurrency: cfg.Concurrency,
checksum: cfg.Checksum,
Params: cfg.Params,
@@ -129,6 +132,11 @@ func NewGroup(cfg config.Group, qb datasource.QuerierBuilder, defaultInterval ti
if g.Interval == 0 {
g.Interval = defaultInterval
}
if cfg.Limit != nil {
g.Limit = *cfg.Limit
} else {
g.Limit = *ruleResultsLimit
}
if g.Concurrency < 1 {
g.Concurrency = 1
}
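
Note on the hunk above: `cfg.Limit` is now a pointer, which lets the group distinguish "limit not set" (fall back to `-rule.resultsLimit`) from an explicit `0` ("no limit"). A minimal sketch of that fallback follows. `resolveLimit` is a hypothetical helper name and does not exist in the diff.

```go
package main

import "fmt"

// resolveLimit returns the group-level limit when it is set explicitly,
// otherwise the global default. A nil pointer means "not set".
func resolveLimit(groupLimit *int, globalLimit int) int {
	if groupLimit != nil {
		return *groupLimit
	}
	return globalLimit
}

func main() {
	zero := 0
	fmt.Println(resolveLimit(nil, 100))   // group did not set a limit
	fmt.Println(resolveLimit(&zero, 100)) // group explicitly disabled the limit
}
```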
@@ -289,7 +297,7 @@ func (g *Group) InterruptEval() {
}
}
// Close stops the group and it's rules, unregisters group metrics
// Close stops the group and its rules, unregisters group metrics
func (g *Group) Close() {
if g.doneCh == nil {
return
@@ -298,10 +306,6 @@ func (g *Group) Close() {
g.InterruptEval()
<-g.finishedCh
g.closeGroupMetrics()
}
func (g *Group) closeGroupMetrics() {
metrics.UnregisterSet(g.metrics.set, true)
}
@@ -327,13 +331,13 @@ func (g *Group) Init() {
}
// Start starts group's evaluation
func (g *Group) Start(ctx context.Context, nts func() []notifier.Notifier, rw remotewrite.RWClient, rr datasource.QuerierBuilder) {
func (g *Group) Start(ctx context.Context, rw remotewrite.RWClient, rr datasource.QuerierBuilder) {
defer func() { close(g.finishedCh) }()
evalTS := time.Now()
// sleep random duration to spread group rules evaluation
// over time in order to reduce load on datasource.
// over maxStartDelay to reduce the load on datasource.
if !SkipRandSleepOnGroupStart {
sleepBeforeStart := delayBeforeStart(evalTS, g.GetID(), g.Interval, g.EvalOffset)
sleepBeforeStart := g.delayBeforeStart(evalTS, *maxStartDelay)
g.infof("will start in %v", sleepBeforeStart)
sleepTimer := time.NewTimer(sleepBeforeStart)
@@ -365,7 +369,6 @@ func (g *Group) Start(ctx context.Context, nts func() []notifier.Notifier, rw re
e := &executor{
Rw: rw,
Notifiers: nts,
notifierHeaders: g.NotifierHeaders,
}
@@ -472,32 +475,31 @@ func (g *Group) UpdateWith(newGroup *Group) {
g.updateCh <- newGroup
}
// DeepCopy returns a deep copy of group
func (g *Group) DeepCopy() *Group {
g.mu.RLock()
data, _ := json.Marshal(g)
g.mu.RUnlock()
newG := Group{}
_ = json.Unmarshal(data, &newG)
newG.Rules = g.Rules
newG.id = g.id
return &newG
}
// if offset is specified, delayBeforeStart returns a duration to help aligning timestamp with offset;
// otherwise, it returns a random duration between [0..interval] based on group key.
func delayBeforeStart(ts time.Time, key uint64, interval time.Duration, offset *time.Duration) time.Duration {
if offset != nil {
currentOffsetPoint := ts.Truncate(interval).Add(*offset)
// delayBeforeStart returns duration for delaying the evaluation start
// based on given ts and Group settings. The delay can't exceed maxDelay.
// maxDelay is ignored if g.EvalOffset != nil.
//
// Delaying is important to smooth out the load on the datasource when all groups start at the same time.
// delayBeforeStart calculates delay based on Group ID, so all groups will start at different moments of time.
func (g *Group) delayBeforeStart(ts time.Time, maxDelay time.Duration) time.Duration {
if g.EvalOffset != nil {
// if offset is specified, ignore the maxDelay and return a duration aligned with offset
currentOffsetPoint := ts.Truncate(g.Interval).Add(*g.EvalOffset)
if currentOffsetPoint.Before(ts) {
// wait until the next offset point
return currentOffsetPoint.Add(interval).Sub(ts)
return currentOffsetPoint.Add(g.Interval).Sub(ts)
}
return currentOffsetPoint.Sub(ts)
}
// otherwise, return a random duration between [0..min(interval, maxDelay)] based on group ID
interval := g.Interval
if interval > maxDelay {
// artificially limit interval, so groups with big intervals could start sooner.
interval = maxDelay
}
var randSleep time.Duration
randSleep = time.Duration(float64(interval) * (float64(key) / (1 << 64)))
randSleep = time.Duration(float64(interval) * (float64(g.GetID()) / (1 << 64)))
sleepOffset := time.Duration(ts.UnixNano() % interval.Nanoseconds())
if randSleep < sleepOffset {
randSleep += interval
@@ -559,15 +561,13 @@ func (g *Group) Replay(start, end time.Time, rw remotewrite.RWClient, maxDataPoi
if !disableProgressBar {
bar = pb.StartNew(iterations * len(g.Rules))
}
for _, r := range g.Rules {
for i := range g.Rules {
rule := g.Rules[i]
sem <- struct{}{}
wg.Add(1)
go func(r Rule, ri rangeIterator) {
// pass ri as a copy, so it can be modified within the replayRuleRange
res <- replayRuleRange(r, ri, bar, rw, replayRuleRetryAttempts, ruleEvaluationConcurrency)
wg.Go(func() {
res <- replayRuleRange(rule, ri, bar, rw, replayRuleRetryAttempts, ruleEvaluationConcurrency)
<-sem
wg.Done()
}(r, ri)
})
}
wg.Wait()
@@ -597,10 +597,10 @@ func replayRuleRange(r Rule, ri rangeIterator, bar *pb.ProgressBar, rw remotewri
res := make(chan int, int(ri.end.Sub(ri.start)/ri.step)+1)
for ri.next() {
sem <- struct{}{}
wg.Add(1)
go func(s, e time.Time) {
n, err := replayRule(r, s, e, rw, replayRuleRetryAttempts)
start := ri.s
end := ri.e
wg.Go(func() {
n, err := replayRule(r, start, end, rw, replayRuleRetryAttempts)
if err != nil {
logger.Fatalf("rule %q: %s", r, err)
}
@@ -609,8 +609,7 @@ func replayRuleRange(r Rule, ri rangeIterator, bar *pb.ProgressBar, rw remotewri
}
res <- n
<-sem
wg.Done()
}(ri.s, ri.e)
})
}
wg.Wait()
close(res)
@@ -624,10 +623,9 @@ func replayRuleRange(r Rule, ri rangeIterator, bar *pb.ProgressBar, rw remotewri
}
// ExecOnce evaluates all the rules under group for once with given timestamp.
func (g *Group) ExecOnce(ctx context.Context, nts func() []notifier.Notifier, rw remotewrite.RWClient, evalTS time.Time) chan error {
func (g *Group) ExecOnce(ctx context.Context, rw remotewrite.RWClient, evalTS time.Time) chan error {
e := &executor{
Rw: rw,
Notifiers: nts,
notifierHeaders: g.NotifierHeaders,
}
if len(g.Rules) < 1 {
@@ -702,7 +700,6 @@ func (g *Group) getEvalDelay() time.Duration {
// executor contains group's notify and rw configs
type executor struct {
Notifiers func() []notifier.Notifier
notifierHeaders map[string]string
Rw remotewrite.RWClient
@@ -723,14 +720,13 @@ func (e *executor) execConcurrently(ctx context.Context, rules []Rule, ts time.T
sem := make(chan struct{}, concurrency)
go func() {
wg := sync.WaitGroup{}
for _, r := range rules {
for i := range rules {
rule := rules[i]
sem <- struct{}{}
wg.Add(1)
go func(r Rule) {
res <- e.exec(ctx, r, ts, resolveDuration, limit)
wg.Go(func() {
res <- e.exec(ctx, rule, ts, resolveDuration, limit)
<-sem
wg.Done()
}(r)
})
}
wg.Wait()
close(res)
@@ -784,17 +780,6 @@ func (e *executor) exec(ctx context.Context, r Rule, ts time.Time, resolveDurati
return nil
}
wg := sync.WaitGroup{}
errGr := new(vmalertutil.ErrGroup)
for _, nt := range e.Notifiers() {
wg.Add(1)
go func(nt notifier.Notifier) {
if err := nt.Send(ctx, alerts, e.notifierHeaders); err != nil {
errGr.Add(fmt.Errorf("rule %q: failed to send alerts to addr %q: %w", r, nt.Addr(), err))
}
wg.Done()
}(nt)
}
wg.Wait()
errGr := notifier.Send(ctx, alerts, e.notifierHeaders)
return errGr.Err()
}

View File

@@ -262,7 +262,7 @@ func TestUpdateDuringRandSleep(t *testing.T) {
updateCh: make(chan *Group),
}
g.Init()
go g.Start(context.Background(), nil, nil, nil)
go g.Start(context.Background(), nil, nil)
rule1 := AlertingRule{
Name: "jobDown",
@@ -346,7 +346,8 @@ func TestGroupStart(t *testing.T) {
}
fs := &datasource.FakeQuerier{}
fn := &notifier.FakeNotifier{}
fn, cleanup := notifier.InitFakeNotifier()
defer cleanup()
const evalInterval = time.Millisecond
g := NewGroup(groups[0], fs, evalInterval, map[string]string{"cluster": "east-1"})
@@ -395,7 +396,7 @@ func TestGroupStart(t *testing.T) {
fs.Add(m2)
g.Init()
go func() {
g.Start(context.Background(), func() []notifier.Notifier { return []notifier.Notifier{fn} }, nil, fs)
g.Start(context.Background(), nil, fs)
close(finished)
}()
@@ -472,15 +473,10 @@ func TestFaultyNotifier(t *testing.T) {
r := newTestAlertingRule("instant", 0)
r.q = fq
fn := &notifier.FakeNotifier{}
e := &executor{
Notifiers: func() []notifier.Notifier {
return []notifier.Notifier{
&notifier.FaultyNotifier{},
fn,
}
},
}
fn, cleanup := notifier.InitFakeNotifier()
defer cleanup()
e := &executor{}
delay := 5 * time.Second
ctx, cancel := context.WithTimeout(context.Background(), delay)
defer cancel()
@@ -553,7 +549,7 @@ func TestCloseWithEvalInterruption(t *testing.T) {
g := NewGroup(groups[0], fq, evalInterval, nil)
g.Init()
go g.Start(context.Background(), nil, nil, nil)
go g.Start(context.Background(), nil, nil)
time.Sleep(evalInterval * 20)
@@ -571,9 +567,10 @@ func TestCloseWithEvalInterruption(t *testing.T) {
func TestGroupStartDelay(t *testing.T) {
g := &Group{}
g.id = uint64(math.MaxUint64 / 10)
// interval of 5min and key generate a static delay of 30s
g.Interval = time.Minute * 5
key := uint64(math.MaxUint64 / 10)
maxDelay := time.Minute * 5
f := func(atS, expS string) {
t.Helper()
@@ -585,7 +582,7 @@ func TestGroupStartDelay(t *testing.T) {
if err != nil {
t.Fatal(err)
}
delay := delayBeforeStart(at, key, g.Interval, g.EvalOffset)
delay := g.delayBeforeStart(at, maxDelay)
gotStart := at.Add(delay)
if expTS != gotStart {
t.Fatalf("expected to get %v; got %v instead", expTS, gotStart)
@@ -606,6 +603,15 @@ func TestGroupStartDelay(t *testing.T) {
f("2023-01-01T00:01:00.000+00:00", "2023-01-01T00:03:00.000+00:00")
f("2023-01-01T00:03:30.000+00:00", "2023-01-01T00:08:00.000+00:00")
f("2023-01-01T00:08:00.000+00:00", "2023-01-01T00:08:00.000+00:00")
maxDelay = time.Minute * 1
g.EvalOffset = nil
// test group with maxDelay, and offset disabled
f("2023-01-01T00:00:00.000+00:00", "2023-01-01T00:00:06.000+00:00")
f("2023-01-01T00:00:01.000+00:00", "2023-01-01T00:00:06.000+00:00")
f("2023-01-01T00:00:06.100+00:00", "2023-01-01T00:01:06.000+00:00")
f("2023-01-01T00:00:11.000+00:00", "2023-01-01T00:01:06.000+00:00")
}
func TestGetPrometheusReqTimestamp(t *testing.T) {

View File

@@ -81,6 +81,37 @@ func (rr *RecordingRule) ID() uint64 {
return rr.RuleID
}
// ToAPI returns ApiRule representation of rr
func (rr *RecordingRule) ToAPI() ApiRule {
state := rr.state
lastState := state.getLast()
r := ApiRule{
Type: TypeRecording,
DatasourceType: rr.Type.String(),
Name: rr.Name,
Query: rr.Expr,
Labels: rr.Labels,
LastEvaluation: lastState.Time,
EvaluationTime: lastState.Duration.Seconds(),
Health: "ok",
LastSamples: lastState.Samples,
LastSeriesFetched: lastState.SeriesFetched,
MaxUpdates: state.size(),
Updates: state.getAll(),
// encode as strings to avoid rounding
ID: fmt.Sprintf("%d", rr.ID()),
GroupID: fmt.Sprintf("%d", rr.GroupID),
GroupName: rr.GroupName,
File: rr.File,
}
if lastState.Err != nil {
r.LastError = lastState.Err.Error()
r.Health = "err"
}
return r
}
// NewRecordingRule creates a new RecordingRule
func NewRecordingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule) *RecordingRule {
debug := group.Debug
@@ -205,7 +236,8 @@ func (rr *RecordingRule) exec(ctx context.Context, ts time.Time, limit int) ([]p
Labels: stringToLabels(k),
Samples: []prompb.Sample{
{Value: decimal.StaleNaN, Timestamp: ts.UnixNano() / 1e6},
}})
},
})
}
rr.lastEvaluation = curEvaluation
return tss, nil
@@ -260,6 +292,11 @@ func (rr *RecordingRule) toTimeSeries(m datasource.Metric) prompb.TimeSeries {
}
// add extra labels configured by user
for k := range rr.Labels {
// do not add label with empty value, since it has no meaning.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
if rr.Labels[k] == "" {
continue
}
existingLabel := promrelabel.GetLabelByName(m.Labels, k)
if existingLabel != nil { // there is a conflict between extra and existing label
if existingLabel.Value == rr.Labels[k] {

View File

@@ -21,6 +21,8 @@ type Rule interface {
// ID returns unique ID that may be used for
// identifying this Rule among others.
ID() uint64
// ToAPI returns ApiRule representation of Rule
ToAPI() ApiRule
// exec executes the rule with given context at the given timestamp and limit.
// returns an err if number of resulting time series exceeds the limit.
exec(ctx context.Context, ts time.Time, limit int) ([]prompb.TimeSeries, error)
@@ -68,39 +70,6 @@ type StateEntry struct {
Curl string `json:"curl"`
}
// GetLastEntry returns latest stateEntry of rule
func GetLastEntry(r Rule) StateEntry {
if rule, ok := r.(*AlertingRule); ok {
return rule.state.getLast()
}
if rule, ok := r.(*RecordingRule); ok {
return rule.state.getLast()
}
return StateEntry{}
}
// GetRuleStateSize returns size of rule stateEntry
func GetRuleStateSize(r Rule) int {
if rule, ok := r.(*AlertingRule); ok {
return rule.state.size()
}
if rule, ok := r.(*RecordingRule); ok {
return rule.state.size()
}
return 0
}
// GetAllRuleState returns rule entire stateEntries
func GetAllRuleState(r Rule) []StateEntry {
if rule, ok := r.(*AlertingRule); ok {
return rule.state.getAll()
}
if rule, ok := r.(*RecordingRule); ok {
return rule.state.getAll()
}
return []StateEntry{}
}
func (s *ruleState) size() int {
s.RLock()
defer s.RUnlock()

View File

@@ -1,4 +1,4 @@
package main
package rule
import (
"fmt"
@@ -8,79 +8,28 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/rule"
)
const (
// ParamGroupID is group id key in url parameter
paramGroupID = "group_id"
ParamGroupID = "group_id"
// ParamAlertID is alert id key in url parameter
paramAlertID = "alert_id"
ParamAlertID = "alert_id"
// ParamRuleID is rule id key in url parameter
paramRuleID = "rule_id"
ParamRuleID = "rule_id"
// TypeRecording is a RecordingRule type
TypeRecording = "recording"
// TypeAlerting is an AlertingRule type
TypeAlerting = "alerting"
)
type apiNotifier struct {
Kind string `json:"kind"`
Targets []*apiTarget `json:"targets"`
}
type apiTarget struct {
Address string `json:"address"`
Labels map[string]string `json:"labels"`
}
// apiAlert represents a notifier.AlertingRule state
// for WEB view
// https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#get-apiv1rules
type apiAlert struct {
State string `json:"state"`
Name string `json:"name"`
Value string `json:"value"`
Labels map[string]string `json:"labels,omitempty"`
Annotations map[string]string `json:"annotations"`
ActiveAt time.Time `json:"activeAt"`
// Additional fields
// ID is an unique Alert's ID within a group
ID string `json:"id"`
// RuleID is an unique Rule's ID within a group
RuleID string `json:"rule_id"`
// GroupID is an unique Group's ID
GroupID string `json:"group_id"`
// Expression contains the PromQL/MetricsQL expression
// for Rule's evaluation
Expression string `json:"expression"`
// SourceLink contains a link to a system which should show
// why Alert was generated
SourceLink string `json:"source"`
// Restored shows whether Alert's state was restored on restart
Restored bool `json:"restored"`
// Stabilizing shows when firing state is kept because of
// `keep_firing_for` instead of real alert
Stabilizing bool `json:"stabilizing"`
}
// WebLink returns a link to the alert which can be used in UI.
func (aa *apiAlert) WebLink() string {
return fmt.Sprintf("alert?%s=%s&%s=%s",
paramGroupID, aa.GroupID, paramAlertID, aa.ID)
}
// APILink returns a link to the alert's JSON representation.
func (aa *apiAlert) APILink() string {
return fmt.Sprintf("api/v1/alert?%s=%s&%s=%s",
paramGroupID, aa.GroupID, paramAlertID, aa.ID)
}
// apiGroup represents Group for web view
// https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#get-apiv1rules
type apiGroup struct {
// ApiGroup represents a Group for web view
type ApiGroup struct {
// Name is the group name as present in the config
Name string `json:"name"`
// Rules contains both recording and alerting rules
Rules []apiRule `json:"rules"`
Rules []ApiRule `json:"rules"`
// Interval is the Group's evaluation interval in float seconds as present in the file.
Interval float64 `json:"interval"`
// LastEvaluation is the timestamp of the last time the Group was executed
@@ -116,15 +65,20 @@ type apiGroup struct {
NoMatch int
}
// groupAlerts represents a group of alerts for WEB view
type groupAlerts struct {
Group *apiGroup
Alerts []*apiAlert
// APILink returns a link to the group's JSON representation.
func (ag *ApiGroup) APILink() string {
return fmt.Sprintf("api/v1/group?%s=%s", ParamGroupID, ag.ID)
}
// apiRule represents a Rule for web view
// GroupAlerts represents a Group with its Alerts for web view
type GroupAlerts struct {
Group *ApiGroup
Alerts []*ApiAlert
}
// ApiRule represents a Rule for web view
// see https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#get-apiv1rules
type apiRule struct {
type ApiRule struct {
// State must be one of these under following scenarios
// "pending": at least 1 alert in the rule in pending state and no other alert in firing ruleState.
// "firing": at least 1 alert in the rule in firing state.
@@ -146,7 +100,7 @@ type apiRule struct {
// LastEvaluation is the timestamp of the last time the rule was executed
LastEvaluation time.Time `json:"lastEvaluation"`
// Alerts is the list of all the alerts in this rule that are currently pending or firing
Alerts []*apiAlert `json:"alerts,omitempty"`
Alerts []*ApiAlert `json:"alerts,omitempty"`
// Health is the health of rule evaluation.
// It MUST be one of "ok", "err", "unknown"
Health string `json:"health"`
@@ -177,143 +131,87 @@ type apiRule struct {
// MaxUpdates is the max number of recorded ruleStateEntry objects
MaxUpdates int `json:"max_updates_entries"`
// Updates contains the ordered list of recorded ruleStateEntry objects
Updates []rule.StateEntry `json:"-"`
Updates []StateEntry `json:"-"`
}
// apiRuleWithUpdates represents apiRule but with extra fields for marshalling
type apiRuleWithUpdates struct {
apiRule
// Updates contains the ordered list of recorded ruleStateEntry objects
StateUpdates []rule.StateEntry `json:"updates,omitempty"`
}
// ApiAlert represents a notifier.AlertingRule state
// for WEB view
// https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#get-apiv1rules
type ApiAlert struct {
State string `json:"state"`
Name string `json:"name"`
Value string `json:"value"`
Labels map[string]string `json:"labels,omitempty"`
Annotations map[string]string `json:"annotations"`
ActiveAt time.Time `json:"activeAt"`
// APILink returns a link to the rule's JSON representation.
func (ar apiRule) APILink() string {
return fmt.Sprintf("api/v1/rule?%s=%s&%s=%s",
paramGroupID, ar.GroupID, paramRuleID, ar.ID)
// Additional fields
// ID is a unique Alert's ID within a group
ID string `json:"id"`
// RuleID is a unique Rule's ID within a group
RuleID string `json:"rule_id"`
// GroupID is a unique Group's ID
GroupID string `json:"group_id"`
// Expression contains the PromQL/MetricsQL expression
// for Rule's evaluation
Expression string `json:"expression"`
// SourceLink contains a link to a system which should show
// why Alert was generated
SourceLink string `json:"source"`
// Restored shows whether Alert's state was restored on restart
Restored bool `json:"restored"`
// Stabilizing shows when firing state is kept because of
// `keep_firing_for` instead of real alert
Stabilizing bool `json:"stabilizing"`
}
// WebLink returns a link to the alert which can be used in UI.
func (ar apiRule) WebLink() string {
func (aa *ApiAlert) WebLink() string {
return fmt.Sprintf("alert?%s=%s&%s=%s",
ParamGroupID, aa.GroupID, ParamAlertID, aa.ID)
}
// APILink returns a link to the alert's JSON representation.
func (aa *ApiAlert) APILink() string {
return fmt.Sprintf("api/v1/alert?%s=%s&%s=%s",
ParamGroupID, aa.GroupID, ParamAlertID, aa.ID)
}
// ApiRuleWithUpdates represents ApiRule but with extra fields for marshalling
type ApiRuleWithUpdates struct {
ApiRule
// Updates contains the ordered list of recorded ruleStateEntry objects
StateUpdates []StateEntry `json:"updates,omitempty"`
}
// APILink returns a link to the rule's JSON representation.
func (ar ApiRule) APILink() string {
return fmt.Sprintf("api/v1/rule?%s=%s&%s=%s",
ParamGroupID, ar.GroupID, ParamRuleID, ar.ID)
}
// WebLink returns a link to the rule which can be used in UI.
func (ar ApiRule) WebLink() string {
return fmt.Sprintf("rule?%s=%s&%s=%s",
paramGroupID, ar.GroupID, paramRuleID, ar.ID)
ParamGroupID, ar.GroupID, ParamRuleID, ar.ID)
}
func ruleToAPI(r any) apiRule {
if ar, ok := r.(*rule.AlertingRule); ok {
return alertingToAPI(ar)
}
if rr, ok := r.(*rule.RecordingRule); ok {
return recordingToAPI(rr)
}
return apiRule{}
}
const (
ruleTypeRecording = "recording"
ruleTypeAlerting = "alerting"
)
func recordingToAPI(rr *rule.RecordingRule) apiRule {
lastState := rule.GetLastEntry(rr)
r := apiRule{
Type: ruleTypeRecording,
DatasourceType: rr.Type.String(),
Name: rr.Name,
Query: rr.Expr,
Labels: rr.Labels,
LastEvaluation: lastState.Time,
EvaluationTime: lastState.Duration.Seconds(),
Health: "ok",
LastSamples: lastState.Samples,
LastSeriesFetched: lastState.SeriesFetched,
MaxUpdates: rule.GetRuleStateSize(rr),
Updates: rule.GetAllRuleState(rr),
// encode as strings to avoid rounding
ID: fmt.Sprintf("%d", rr.ID()),
GroupID: fmt.Sprintf("%d", rr.GroupID),
GroupName: rr.GroupName,
File: rr.File,
}
if lastState.Err != nil {
r.LastError = lastState.Err.Error()
r.Health = "err"
}
return r
}
// alertingToAPI returns Rule representation in form of apiRule
func alertingToAPI(ar *rule.AlertingRule) apiRule {
lastState := rule.GetLastEntry(ar)
r := apiRule{
Type: ruleTypeAlerting,
DatasourceType: ar.Type.String(),
Name: ar.Name,
Query: ar.Expr,
Duration: ar.For.Seconds(),
KeepFiringFor: ar.KeepFiringFor.Seconds(),
Labels: ar.Labels,
Annotations: ar.Annotations,
LastEvaluation: lastState.Time,
EvaluationTime: lastState.Duration.Seconds(),
Health: "ok",
State: "inactive",
Alerts: ruleToAPIAlert(ar),
LastSamples: lastState.Samples,
LastSeriesFetched: lastState.SeriesFetched,
MaxUpdates: rule.GetRuleStateSize(ar),
Updates: rule.GetAllRuleState(ar),
Debug: ar.Debug,
// encode as strings to avoid rounding in JSON
ID: fmt.Sprintf("%d", ar.ID()),
GroupID: fmt.Sprintf("%d", ar.GroupID),
GroupName: ar.GroupName,
File: ar.File,
}
if lastState.Err != nil {
r.LastError = lastState.Err.Error()
r.Health = "err"
}
// satisfy apiRule.State logic
if len(r.Alerts) > 0 {
r.State = notifier.StatePending.String()
stateFiring := notifier.StateFiring.String()
for _, a := range r.Alerts {
if a.State == stateFiring {
r.State = stateFiring
break
}
}
}
return r
}
// ruleToAPIAlert generates list of apiAlert objects from existing alerts
func ruleToAPIAlert(ar *rule.AlertingRule) []*apiAlert {
var alerts []*apiAlert
// AlertsToAPI returns list of ApiAlert objects from existing alerts
func (ar *AlertingRule) AlertsToAPI() []*ApiAlert {
var alerts []*ApiAlert
for _, a := range ar.GetAlerts() {
if a.State == notifier.StateInactive {
continue
}
alerts = append(alerts, newAlertAPI(ar, a))
alerts = append(alerts, NewAlertAPI(ar, a))
}
return alerts
}
// alertToAPI generates apiAlert object from alert by its id(hash)
func alertToAPI(ar *rule.AlertingRule, id uint64) *apiAlert {
a := ar.GetAlert(id)
if a == nil {
return nil
}
return newAlertAPI(ar, a)
}
// NewAlertAPI creates apiAlert for notifier.Alert
func newAlertAPI(ar *rule.AlertingRule, a *notifier.Alert) *apiAlert {
aa := &apiAlert{
func NewAlertAPI(ar *AlertingRule, a *notifier.Alert) *ApiAlert {
aa := &ApiAlert{
// encode as strings to avoid rounding
ID: fmt.Sprintf("%d", a.ID),
GroupID: fmt.Sprintf("%d", a.GroupID),
@@ -328,8 +226,8 @@ func newAlertAPI(ar *rule.AlertingRule, a *notifier.Alert) *apiAlert {
Restored: a.Restored,
Value: strconv.FormatFloat(a.Value, 'f', -1, 32),
}
if alertURLGeneratorFn != nil {
aa.SourceLink = alertURLGeneratorFn(*a)
if notifier.AlertURLGeneratorFn != nil {
aa.SourceLink = notifier.AlertURLGeneratorFn(*a)
}
if a.State == notifier.StateFiring && !a.KeepFiringSince.IsZero() {
aa.Stabilizing = true
@@ -337,9 +235,11 @@ func newAlertAPI(ar *rule.AlertingRule, a *notifier.Alert) *apiAlert {
return aa
}
func groupToAPI(g *rule.Group) *apiGroup {
g = g.DeepCopy()
ag := apiGroup{
// ToAPI returns ApiGroup representation of g
func (g *Group) ToAPI() *ApiGroup {
g.mu.RLock()
defer g.mu.RUnlock()
ag := ApiGroup{
// encode as string to avoid rounding
ID: strconv.FormatUint(g.GetID(), 10),
Name: g.Name,
@@ -359,9 +259,9 @@ func groupToAPI(g *rule.Group) *apiGroup {
if g.EvalDelay != nil {
ag.EvalDelay = g.EvalDelay.Seconds()
}
ag.Rules = make([]apiRule, 0)
ag.Rules = make([]ApiRule, 0)
for _, r := range g.Rules {
ag.Rules = append(ag.Rules, ruleToAPI(r))
ag.Rules = append(ag.Rules, r.ToAPI())
}
return &ag
}
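
The "encode as strings to avoid rounding" comments in the hunks above refer to 64-bit IDs losing precision when a JSON consumer parses numbers as IEEE-754 doubles. A minimal sketch of the effect follows; the `numericID` and `stringID` types and both helper functions are hypothetical and are not part of vmalert.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// numericID and stringID are hypothetical payloads used only for this sketch.
type numericID struct {
	ID uint64 `json:"id"`
}

type stringID struct {
	ID string `json:"id"`
}

// roundTripNumeric encodes id as a JSON number and decodes it the way a
// generic consumer does: into a float64.
func roundTripNumeric(id uint64) (uint64, error) {
	data, err := json.Marshal(numericID{ID: id})
	if err != nil {
		return 0, err
	}
	var generic map[string]any
	if err := json.Unmarshal(data, &generic); err != nil {
		return 0, err
	}
	f, ok := generic["id"].(float64)
	if !ok {
		return 0, fmt.Errorf("unexpected type %T for id", generic["id"])
	}
	return uint64(f), nil
}

// roundTripString encodes id as a decimal string, so no float conversion
// happens on the consumer side.
func roundTripString(id uint64) (uint64, error) {
	data, err := json.Marshal(stringID{ID: strconv.FormatUint(id, 10)})
	if err != nil {
		return 0, err
	}
	var decoded stringID
	if err := json.Unmarshal(data, &decoded); err != nil {
		return 0, err
	}
	return strconv.ParseUint(decoded.ID, 10, 64)
}

func main() {
	// 2^53+1 is the smallest positive integer a float64 cannot represent.
	id := uint64(1)<<53 + 1
	viaNumber, errN := roundTripNumeric(id)
	viaString, errS := roundTripString(id)
	fmt.Println(viaNumber == id, errN)
	fmt.Println(viaString == id, errS)
}
```

Hash-based IDs routinely exceed 2^53, which is presumably why both `fmt.Sprintf("%d", ...)` and `strconv.FormatUint` appear in the converters above.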


@@ -1,4 +1,4 @@
package main
package rule
import (
"fmt"
@@ -8,7 +8,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/rule"
)
func TestRecordingToApi(t *testing.T) {
@@ -17,7 +16,7 @@ func TestRecordingToApi(t *testing.T) {
Values: []float64{1}, Timestamps: []int64{0},
})
entriesLimit := 44
g := rule.NewGroup(config.Group{
g := NewGroup(config.Group{
Name: "group",
File: "rules.yaml",
Concurrency: 1,
@@ -31,24 +30,24 @@ func TestRecordingToApi(t *testing.T) {
},
},
}, fq, 1*time.Minute, nil)
rr := g.Rules[0].(*rule.RecordingRule)
rr := g.Rules[0].(*RecordingRule)
expectedRes := apiRule{
expectedRes := ApiRule{
Name: "record_name",
Query: "up",
Labels: map[string]string{"label": "value"},
Health: "ok",
Type: ruleTypeRecording,
Type: TypeRecording,
DatasourceType: "prometheus",
ID: "1248",
GroupID: fmt.Sprintf("%d", g.CreateID()),
GroupName: "group",
File: "rules.yaml",
MaxUpdates: 44,
Updates: make([]rule.StateEntry, 0),
Updates: make([]StateEntry, 0),
}
res := recordingToAPI(rr)
res := rr.ToAPI()
if !reflect.DeepEqual(res, expectedRes) {
t.Fatalf("expected to have: \n%v;\ngot: \n%v", expectedRes, res)


@@ -34,11 +34,12 @@ body {
padding-top: 4.5rem;
}
.group-items {
.vm-group {
cursor: pointer;
padding: 5px;
margin-top: 5px;
position: relative;
display: none;
}
.btn svg, .dropdown-item svg {
@@ -55,14 +56,22 @@ body {
height: 38px;
}
.group-items:not(:has(.sub-item:not(.d-none))) {
display: none !important;
.vm-item:not(.vm-found) {
display: none;
}
.group-items:hover {
.vm-group:has(.vm-item:is(.vm-found)), .vm-group:is(.vm-found) {
display: flex;
}
.vm-group:hover {
background-color: #f8f9fa!important;
}
.vm-group:is(.vm-found) .vm-item {
display: table-row;
}
.table {
table-layout: fixed;
}
@@ -111,3 +120,9 @@ textarea.curl-area {
.w-60 {
width: 60%;
}
.annotations {
white-space: pre-wrap;
color: gray;
word-wrap: break-word;
}


@@ -65,32 +65,34 @@ function getParamURL(key) {
return url.searchParams.get(key)
}
function matchText(search, item) {
const text = item.innerText.toLowerCase();
return text.indexOf(search) >= 0;
}
function filterRules(searchPhrase) {
document.querySelectorAll('.sub-items').forEach((rules) => {
let found = false;
rules.querySelectorAll('.sub-item').forEach((rule) => {
if (searchPhrase) {
const ruleName = rule.innerText.toLowerCase();
const matches = []
const hasValue = ruleName.indexOf(searchPhrase) >= 0;
rule.querySelectorAll('.label').forEach((label) => {
const text = label.innerText.toLowerCase();
if (text.indexOf(searchPhrase) >= 0) {
matches.push(text);
}
});
if (!matches.length && !hasValue) {
rule.classList.add('d-none');
return;
}
document.querySelectorAll('.vm-group').forEach((group) => {
if (!searchPhrase) {
group.classList.add('vm-found');
return;
}
for (const item of group.querySelectorAll('.vm-group-search')) {
if (matchText(searchPhrase, item)) {
group.classList.add('vm-found');
return;
}
rule.classList.remove('d-none');
found = true;
});
if (found && searchPhrase || !searchPhrase) {
rules.classList.remove('d-none');
} else {
rules.classList.add('d-none');
}
group.classList.remove('vm-found');
for (const item of group.querySelectorAll('.vm-item')) {
if (matchText(searchPhrase, item)) {
item.classList.add('vm-found');
continue;
}
if (Array.from(item.querySelectorAll('.label')).find(l => matchText(searchPhrase, l))) {
item.classList.add('vm-found');
continue;
}
item.classList.remove('vm-found');
}
});
}


@@ -485,6 +485,12 @@ func templateFuncs() textTpl.FuncMap {
/* Helpers */
// now returns the Unix timestamp in seconds at the time of the template evaluation.
// For example: {{ (now | toTime).Sub $activeAt }} will return the duration the alert has been active.
"now": func() float64 {
return float64(time.Now().Unix())
},
// Converts a list of objects to a map with keys arg0, arg1 etc.
// This is intended to allow multiple arguments to be passed to templates.
"args": func(args ...any) map[string]any {


@@ -29,7 +29,9 @@ var (
{"api/v1/rules", "list all loaded groups and rules"},
{"api/v1/alerts", "list all active alerts"},
{"api/v1/notifiers", "list all notifiers"},
{fmt.Sprintf("api/v1/alert?%s=<int>&%s=<int>", paramGroupID, paramAlertID), "get alert status by group and alert ID"},
{fmt.Sprintf("api/v1/alert?%s=<int>&%s=<int>", rule.ParamGroupID, rule.ParamAlertID), "get alert status by group and alert ID"},
{fmt.Sprintf("api/v1/rule?%s=<int>&%s=<int>", rule.ParamGroupID, rule.ParamRuleID), "get rule status by group and rule ID"},
{fmt.Sprintf("api/v1/group?%s=<int>", rule.ParamGroupID), "get group status by group ID"},
}
systemLinks = [][2]string{
{"vmalert/groups", "UI"},
@@ -45,8 +47,8 @@ var (
{Name: "Docs", URL: "https://docs.victoriametrics.com/victoriametrics/vmalert/"},
}
ruleTypeMap = map[string]string{
"alert": ruleTypeAlerting,
"record": ruleTypeRecording,
"alert": rule.TypeAlerting,
"record": rule.TypeRecording,
}
)
@@ -112,7 +114,7 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
case "/rules":
// Grafana makes an extra request to `/rules`
// handler in addition to `/api/v1/rules` calls in alerts UI
var data []*apiGroup
var data []*rule.ApiGroup
rf, err := newRulesFilter(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
@@ -178,14 +180,14 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
w.Write(data)
return true
case "/vmalert/api/v1/rule", "/api/v1/rule":
rule, err := rh.getRule(r)
apiRule, err := rh.getRule(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
rwu := apiRuleWithUpdates{
apiRule: rule,
StateUpdates: rule.Updates,
rwu := rule.ApiRuleWithUpdates{
ApiRule: apiRule,
StateUpdates: apiRule.Updates,
}
data, err := json.Marshal(rwu)
if err != nil {
@@ -195,6 +197,20 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
case "/vmalert/api/v1/group", "/api/v1/group":
group, err := rh.getGroup(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
data, err := json.Marshal(group)
if err != nil {
httpserver.Errorf(w, r, "failed to marshal group: %s", err)
return true
}
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
case "/-/reload":
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey) {
return true
@@ -209,30 +225,42 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
}
}
func (rh *requestHandler) getRule(r *http.Request) (apiRule, error) {
groupID, err := strconv.ParseUint(r.FormValue(paramGroupID), 10, 64)
func (rh *requestHandler) getGroup(r *http.Request) (*rule.ApiGroup, error) {
groupID, err := strconv.ParseUint(r.FormValue(rule.ParamGroupID), 10, 64)
if err != nil {
return apiRule{}, fmt.Errorf("failed to read %q param: %w", paramGroupID, err)
return nil, fmt.Errorf("failed to read %q param: %w", rule.ParamGroupID, err)
}
ruleID, err := strconv.ParseUint(r.FormValue(paramRuleID), 10, 64)
obj, err := rh.m.groupAPI(groupID)
if err != nil {
return apiRule{}, fmt.Errorf("failed to read %q param: %w", paramRuleID, err)
}
obj, err := rh.m.ruleAPI(groupID, ruleID)
if err != nil {
return apiRule{}, errResponse(err, http.StatusNotFound)
return nil, errResponse(err, http.StatusNotFound)
}
return obj, nil
}
func (rh *requestHandler) getAlert(r *http.Request) (*apiAlert, error) {
groupID, err := strconv.ParseUint(r.FormValue(paramGroupID), 10, 64)
func (rh *requestHandler) getRule(r *http.Request) (rule.ApiRule, error) {
groupID, err := strconv.ParseUint(r.FormValue(rule.ParamGroupID), 10, 64)
if err != nil {
return nil, fmt.Errorf("failed to read %q param: %w", paramGroupID, err)
return rule.ApiRule{}, fmt.Errorf("failed to read %q param: %w", rule.ParamGroupID, err)
}
alertID, err := strconv.ParseUint(r.FormValue(paramAlertID), 10, 64)
ruleID, err := strconv.ParseUint(r.FormValue(rule.ParamRuleID), 10, 64)
if err != nil {
return nil, fmt.Errorf("failed to read %q param: %w", paramAlertID, err)
return rule.ApiRule{}, fmt.Errorf("failed to read %q param: %w", rule.ParamRuleID, err)
}
obj, err := rh.m.ruleAPI(groupID, ruleID)
if err != nil {
return rule.ApiRule{}, errResponse(err, http.StatusNotFound)
}
return obj, nil
}
func (rh *requestHandler) getAlert(r *http.Request) (*rule.ApiAlert, error) {
groupID, err := strconv.ParseUint(r.FormValue(rule.ParamGroupID), 10, 64)
if err != nil {
return nil, fmt.Errorf("failed to read %q param: %w", rule.ParamGroupID, err)
}
alertID, err := strconv.ParseUint(r.FormValue(rule.ParamAlertID), 10, 64)
if err != nil {
return nil, fmt.Errorf("failed to read %q param: %w", rule.ParamAlertID, err)
}
a, err := rh.m.alertAPI(groupID, alertID)
if err != nil {
@@ -244,7 +272,7 @@ func (rh *requestHandler) getAlert(r *http.Request) (*apiAlert, error) {
type listGroupsResponse struct {
Status string `json:"status"`
Data struct {
Groups []*apiGroup `json:"groups"`
Groups []*rule.ApiGroup `json:"groups"`
} `json:"data"`
}
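
The `getGroup` helper in the preceding hunk separates two failure modes: a group ID that cannot be parsed, and a well-formed ID with no matching group (wrapped with `http.StatusNotFound`). A minimal, self-contained sketch of that split is below. The `group_id` parameter name, the `group` type and the handler itself are assumptions for illustration, as is mapping the parse failure to 400; the diff only shows `rule.ParamGroupID` and a plain error for that case.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// group is a hypothetical stand-in for the API representation of a group.
type group struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// groupHandler looks a group up by the assumed "group_id" query parameter.
func groupHandler(groups map[uint64]group) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		id, err := strconv.ParseUint(r.FormValue("group_id"), 10, 64)
		if err != nil {
			// Missing or malformed parameter: the request itself is bad.
			http.Error(w, fmt.Sprintf("failed to read %q param: %s", "group_id", err), http.StatusBadRequest)
			return
		}
		g, ok := groups[id]
		if !ok {
			// Well-formed request, but nothing to return.
			http.Error(w, fmt.Sprintf("can't find group with id %d", id), http.StatusNotFound)
			return
		}
		data, err := json.Marshal(g)
		if err != nil {
			http.Error(w, fmt.Sprintf("failed to marshal group: %s", err), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write(data)
	}
}

// statusFor runs a GET request against h in memory and returns the status code.
func statusFor(h http.Handler, target string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	h := groupHandler(map[uint64]group{7: {ID: "7", Name: "group"}})
	for _, target := range []string{
		"/api/v1/group?group_id=7",
		"/api/v1/group?group_id=8",
		"/api/v1/group",
	} {
		fmt.Println(target, statusFor(h, target))
	}
}
```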
@@ -310,19 +338,19 @@ func (rf *rulesFilter) matchesGroup(group *rule.Group) bool {
return true
}
func (rh *requestHandler) groups(rf *rulesFilter) []*apiGroup {
func (rh *requestHandler) groups(rf *rulesFilter) []*rule.ApiGroup {
rh.m.groupsMu.RLock()
defer rh.m.groupsMu.RUnlock()
groups := make([]*apiGroup, 0)
groups := make([]*rule.ApiGroup, 0)
for _, group := range rh.m.groups {
if !rf.matchesGroup(group) {
continue
}
g := groupToAPI(group)
g := group.ToAPI()
// the returned list should always be non-nil
// https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221
filteredRules := make([]apiRule, 0)
filteredRules := make([]rule.ApiRule, 0)
for _, rule := range g.Rules {
if rf.ruleType != "" && rf.ruleType != rule.Type {
continue
@@ -350,7 +378,7 @@ func (rh *requestHandler) groups(rf *rulesFilter) []*apiGroup {
groups = append(groups, g)
}
// sort list of groups for deterministic output
slices.SortFunc(groups, func(a, b *apiGroup) int {
slices.SortFunc(groups, func(a, b *rule.ApiGroup) int {
if a.Name != b.Name {
return strings.Compare(a.Name, b.Name)
}
@@ -375,32 +403,32 @@ func (rh *requestHandler) listGroups(rf *rulesFilter) ([]byte, error) {
type listAlertsResponse struct {
Status string `json:"status"`
Data struct {
Alerts []*apiAlert `json:"alerts"`
Alerts []*rule.ApiAlert `json:"alerts"`
} `json:"data"`
}
func (rh *requestHandler) groupAlerts() []groupAlerts {
func (rh *requestHandler) groupAlerts() []rule.GroupAlerts {
rh.m.groupsMu.RLock()
defer rh.m.groupsMu.RUnlock()
var gAlerts []groupAlerts
for _, g := range rh.m.groups {
var alerts []*apiAlert
var gAlerts []rule.GroupAlerts
for _, group := range rh.m.groups {
var alerts []*rule.ApiAlert
g := group.ToAPI()
for _, r := range g.Rules {
a, ok := r.(*rule.AlertingRule)
if !ok {
if r.Type != rule.TypeAlerting {
continue
}
alerts = append(alerts, ruleToAPIAlert(a)...)
alerts = append(alerts, r.Alerts...)
}
if len(alerts) > 0 {
gAlerts = append(gAlerts, groupAlerts{
Group: groupToAPI(g),
gAlerts = append(gAlerts, rule.GroupAlerts{
Group: g,
Alerts: alerts,
})
}
}
slices.SortFunc(gAlerts, func(a, b groupAlerts) int {
slices.SortFunc(gAlerts, func(a, b rule.GroupAlerts) int {
return strings.Compare(a.Group.Name, b.Group.Name)
})
return gAlerts
@@ -411,22 +439,22 @@ func (rh *requestHandler) listAlerts(rf *rulesFilter) ([]byte, error) {
defer rh.m.groupsMu.RUnlock()
lr := listAlertsResponse{Status: "success"}
lr.Data.Alerts = make([]*apiAlert, 0)
lr.Data.Alerts = make([]*rule.ApiAlert, 0)
for _, group := range rh.m.groups {
if !rf.matchesGroup(group) {
continue
}
for _, r := range group.Rules {
a, ok := r.(*rule.AlertingRule)
if !ok {
g := group.ToAPI()
for _, r := range g.Rules {
if r.Type != rule.TypeAlerting {
continue
}
lr.Data.Alerts = append(lr.Data.Alerts, ruleToAPIAlert(a)...)
lr.Data.Alerts = append(lr.Data.Alerts, r.Alerts...)
}
}
// sort list of alerts for deterministic output
slices.SortFunc(lr.Data.Alerts, func(a, b *apiAlert) int {
slices.SortFunc(lr.Data.Alerts, func(a, b *rule.ApiAlert) int {
return strings.Compare(a.ID, b.ID)
})
@@ -443,7 +471,7 @@ func (rh *requestHandler) listAlerts(rf *rulesFilter) ([]byte, error) {
type listNotifiersResponse struct {
Status string `json:"status"`
Data struct {
Notifiers []*apiNotifier `json:"notifiers"`
Notifiers []*notifier.ApiNotifier `json:"notifiers"`
} `json:"data"`
}
@@ -451,19 +479,20 @@ func (rh *requestHandler) listNotifiers() ([]byte, error) {
targets := notifier.GetTargets()
lr := listNotifiersResponse{Status: "success"}
lr.Data.Notifiers = make([]*apiNotifier, 0)
lr.Data.Notifiers = make([]*notifier.ApiNotifier, 0)
for protoName, protoTargets := range targets {
notifier := &apiNotifier{
Kind: string(protoName),
Targets: make([]*apiTarget, 0, len(protoTargets)),
nr := &notifier.ApiNotifier{
Kind: protoName,
Targets: make([]*notifier.ApiTarget, 0, len(protoTargets)),
}
for _, target := range protoTargets {
notifier.Targets = append(notifier.Targets, &apiTarget{
Address: target.Addr(),
Labels: target.Labels.ToMap(),
nr.Targets = append(nr.Targets, &notifier.ApiTarget{
Address: target.Addr(),
Labels: target.Labels.ToMap(),
LastError: target.LastError(),
})
}
lr.Data.Notifiers = append(lr.Data.Notifiers, notifier)
lr.Data.Notifiers = append(lr.Data.Notifiers, nr)
}
b, err := json.Marshal(lr)


@@ -8,6 +8,7 @@
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/tpl"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/vmalertutil"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/rule"
) %}
{% func Controls(prefix, currentIcon, currentText string, icons, filters map[string]string, search bool) %}
@@ -93,7 +94,7 @@
{%= tpl.Footer(r) %}
{% endfunc %}
{% func ListGroups(r *http.Request, groups []*apiGroup, filter string) %}
{% func ListGroups(r *http.Request, groups []*rule.ApiGroup, filter string) %}
{%code
prefix := vmalertutil.Prefix(r.URL.Path)
filters := map[string]string{
@@ -113,14 +114,17 @@
{%= Controls(prefix, currentIcon, currentText, icons, filters, true) %}
{% if len(groups) > 0 %}
{% for _, g := range groups %}
<div id="group-{%s g.ID %}" class="d-flex w-100 border-0 flex-column group-items{% if g.Unhealthy > 0 %} alert-danger{% endif %}">
<div id="group-{%s g.ID %}" class="w-100 border-0 flex-column vm-group{% if g.Unhealthy > 0 %} alert-danger{% endif %}">
<span class="d-flex justify-content-between">
<a href="#group-{%s g.ID %}">{%s g.Name %}{% if g.Type != "prometheus" %} ({%s g.Type %}){% endif %} (every {%f.0 g.Interval %}s) #</a>
<a
class="vm-group-search"
href="#group-{%s g.ID %}"
>{%s g.Name %}{% if g.Type != "prometheus" %} ({%s g.Type %}){% endif %} (every {%f.0 g.Interval %}s) #</a>
<span
class="flex-grow-1 d-flex justify-content-end"
role="button"
data-bs-toggle="collapse"
data-bs-target="#sub-{%s g.ID %}"
data-bs-target="#item-{%s g.ID %}"
>
<span class="d-flex gap-2">
{% if g.Unhealthy > 0 %}<span class="badge bg-danger" title="Number of rules with status Error">{%d g.Unhealthy %}</span> {% endif %}
@@ -133,9 +137,9 @@
class="d-flex flex-column row-gap-2 mb-2"
role="button"
data-bs-toggle="collapse"
data-bs-target="#sub-{%s g.ID %}"
data-bs-target="#item-{%s g.ID %}"
>
<span class="fs-6 text-start w-100 fw-lighter">{%s g.File %}</span>
<span class="fs-6 text-start vm-group-search w-100 fw-lighter">{%s g.File %}</span>
{% if len(g.Params) > 0 %}
<span class="fs-6 text-start w-100 d-flex justify-content-between fw-lighter">
<span>Extra params</span>
@@ -157,7 +161,7 @@
</span>
{% endif %}
</span>
<div class="collapse sub-items" id="sub-{%s g.ID %}">
<div class="collapse" id="item-{%s g.ID %}">
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
@@ -168,7 +172,7 @@
</thead>
<tbody>
{% for _, r := range g.Rules %}
<tr class="sub-item{% if r.LastError != "" %} alert-danger{% endif %}">
<tr class="vm-item{% if r.LastError != "" %} alert-danger{% endif %}">
<td>
<div class="row">
<div class="col-12 mb-2">
@@ -205,7 +209,12 @@
</div>
</td>
<td class="text-center">{%d r.LastSamples %}</td>
<td class="text-center">{%f.3 time.Since(r.LastEvaluation).Seconds() %}s ago</td>
<td class="text-center">{% if r.LastEvaluation.IsZero() %}
Never
{% else %}
{%f.3 time.Since(r.LastEvaluation).Seconds() %}s ago
{% endif %}
</td>
</tr>
{% endfor %}
</tbody>
@@ -222,7 +231,7 @@
{% endfunc %}
{% func ListAlerts(r *http.Request, groupAlerts []groupAlerts) %}
{% func ListAlerts(r *http.Request, groupAlerts []rule.GroupAlerts) %}
{%code prefix := vmalertutil.Prefix(r.URL.Path) %}
{%= tpl.Header(r, navItems, "Alerts", getLastConfigError()) %}
{%= Controls(prefix, "", "", nil, nil, true) %}
@@ -231,7 +240,7 @@
{%code
g := ga.Group
var keys []string
alertsByRule := make(map[string][]*apiAlert)
alertsByRule := make(map[string][]*rule.ApiAlert)
for _, alert := range ga.Alerts {
if len(alertsByRule[alert.RuleID]) < 1 {
keys = append(keys, alert.RuleID)
@@ -240,14 +249,14 @@
}
sort.Strings(keys)
%}
<div class="d-flex w-100 flex-column group-items alert-danger">
<div class="w-100 flex-column vm-group alert-danger">
<span id="group-{%s g.ID %}" class="d-flex justify-content-between">
<a href="#group-{%s g.ID %}">{%s g.Name %}{% if g.Type != "prometheus" %} ({%s g.Type %}){% endif %}</a>
<span
class="flex-grow-1 d-flex justify-content-end"
role="button"
data-bs-toggle="collapse"
data-bs-target="#sub-{%s g.ID %}"
data-bs-target="#item-{%s g.ID %}"
>
<span class="badge bg-danger" title="Number of active alerts">{%d len(ga.Alerts) %}</span>
</span>
@@ -257,10 +266,10 @@
class="fs-6 text-start w-100 fw-lighter"
role="button"
data-bs-toggle="collapse"
data-bs-target="#sub-{%s g.ID %}"
data-bs-target="#item-{%s g.ID %}"
>{%s g.File %}</span>
</span>
<div class="collapse sub-items" id="sub-{%s g.ID %}">
<div class="collapse" id="item-{%s g.ID %}">
{% for _, ruleID := range keys %}
{%code
defaultAR := alertsByRule[ruleID][0]
@@ -271,7 +280,7 @@
sort.Strings(labelKeys)
%}
<br>
<div class="sub-item">
<div class="vm-item">
<b>alert:</b> {%s defaultAR.Name %} ({%d len(alertsByRule[ruleID]) %})
| <span><a target="_blank" href="{%s defaultAR.SourceLink %}">Source</a></span>
<br>
@@ -336,20 +345,20 @@
typeK, ns := keys[i], targets[notifier.TargetType(keys[i])]
count := len(ns)
%}
<div class="d-flex w-100 flex-column group-items">
<div class="w-100 flex-column vm-group">
<span class="d-flex justify-content-between" id="group-{%s typeK %}">
<a href="#group-{%s typeK %}">{%s typeK %} ({%d count %})</a>
<span
class="flex-grow-1"
role="button"
data-bs-toggle="collapse"
data-bs-target="#sub-{%s typeK %}"
data-bs-target="#item-{%s typeK %}"
></span>
</span>
<div id="sub-{%s typeK %}" class="collapse show sub-items">
<div id="item-{%s typeK %}" class="collapse show">
<table class="table table-striped table-hover table-sm">
<thead>
<tr class="sub-item">
<tr class="vm-item">
<th scope="col">Labels</th>
<th scope="col">Address</th>
</tr>
@@ -378,7 +387,7 @@
{%= tpl.Footer(r) %}
{% endfunc %}
{% func Alert(r *http.Request, alert *apiAlert) %}
{% func Alert(r *http.Request, alert *rule.ApiAlert) %}
{%code prefix := vmalertutil.Prefix(r.URL.Path) %}
{%= tpl.Header(r, navItems, "", getLastConfigError()) %}
{%code
@@ -434,7 +443,7 @@
<div class="col">
{% for _, k := range annotationKeys %}
<b>{%s k %}:</b><br>
<p>{%s alert.Annotations[k] %}</p>
<p class="annotations">{%s alert.Annotations[k] %}</p>
{% endfor %}
</div>
</div>
@@ -464,7 +473,7 @@
{% endfunc %}
{% func RuleDetails(r *http.Request, rule apiRule) %}
{% func RuleDetails(r *http.Request, rule rule.ApiRule) %}
{%code prefix := vmalertutil.Prefix(r.URL.Path) %}
{%= tpl.Header(r, navItems, "", getLastConfigError()) %}
{%code
@@ -548,7 +557,7 @@
<div class="col">
{% for _, k := range annotationKeys %}
<b>{%s k %}:</b><br>
<p>{%s rule.Annotations[k] %}</p>
<p class="annotations">{%s rule.Annotations[k] %}</p>
{% endfor %}
</div>
</div>
@@ -593,11 +602,11 @@
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
<th scope="col" title="The time when event was created">Updated at</th>
<th scope="col" title="The time when the rule was executed">Updated at</th>
<th scope="col" class="w-10 text-center" title="How many series expression returns. Each series will represent an alert.">Series returned</th>
{% if seriesFetchedEnabled %}<th scope="col" class="w-10 text-center" title="How many series were scanned by datasource during the evaluation">Series fetched</th>{% endif %}
<th scope="col" class="w-10 text-center" title="How many seconds request took">Duration</th>
<th scope="col" class="text-center" title="Time used for rule execution">Executed at</th>
<th scope="col" class="text-center" title="The time used in execution query request">Execution timestamp</th>
<th scope="col" class="text-center" title="cURL command with request example">cURL</th>
</tr>
</thead>
@@ -649,7 +658,7 @@
<span class="badge bg-warning text-dark" title="This firing state is kept because of `keep_firing_for`">stabilizing</span>
{% endfunc %}
{% func seriesFetchedWarn(prefix string, r apiRule) %}
{% func seriesFetchedWarn(prefix string, r rule.ApiRule) %}
{% if isNoMatch(r) %}
<svg
data-bs-toggle="tooltip"
@@ -663,7 +672,7 @@
{% endfunc %}
{%code
func isNoMatch (r apiRule) bool {
func isNoMatch (r rule.ApiRule) bool {
return r.LastSamples == 0 && r.LastSeriesFetched != nil && *r.LastSeriesFetched == 0
}
%}

File diff suppressed because it is too large


@@ -23,8 +23,12 @@ func TestHandler(t *testing.T) {
Timestamps: []int64{0},
})
m := &manager{groups: map[uint64]*rule.Group{}}
_, cleanup := notifier.InitFakeNotifier()
defer cleanup()
var ar *rule.AlertingRule
var rr *rule.RecordingRule
var groupIDs []uint64
for _, dsType := range []string{"prometheus", "", "graphite"} {
g := rule.NewGroup(config.Group{
Name: "group",
@@ -44,8 +48,10 @@ func TestHandler(t *testing.T) {
}, fq, 1*time.Minute, nil)
ar = g.Rules[0].(*rule.AlertingRule)
rr = g.Rules[1].(*rule.RecordingRule)
g.ExecOnce(context.Background(), func() []notifier.Notifier { return nil }, nil, time.Time{})
m.groups[g.CreateID()] = g
g.ExecOnce(context.Background(), nil, time.Time{})
id := g.CreateID()
m.groups[id] = g
groupIDs = append(groupIDs, id)
}
rh := &requestHandler{m: m}
@@ -82,22 +88,22 @@ func TestHandler(t *testing.T) {
})
t.Run("/vmalert/rule", func(t *testing.T) {
a := ruleToAPI(ar)
a := ar.ToAPI()
getResp(t, ts.URL+"/vmalert/"+a.WebLink(), nil, 200)
r := ruleToAPI(rr)
r := rr.ToAPI()
getResp(t, ts.URL+"/vmalert/"+r.WebLink(), nil, 200)
})
t.Run("/vmalert/alert", func(t *testing.T) {
alerts := ruleToAPIAlert(ar)
alerts := ar.AlertsToAPI()
for _, a := range alerts {
getResp(t, ts.URL+"/vmalert/"+a.WebLink(), nil, 200)
}
})
t.Run("/vmalert/rule?badParam", func(t *testing.T) {
params := fmt.Sprintf("?%s=0&%s=1", paramGroupID, paramRuleID)
params := fmt.Sprintf("?%s=0&%s=1", rule.ParamGroupID, rule.ParamRuleID)
getResp(t, ts.URL+"/vmalert/rule"+params, nil, 404)
params = fmt.Sprintf("?%s=1&%s=0", paramGroupID, paramRuleID)
params = fmt.Sprintf("?%s=1&%s=0", rule.ParamGroupID, rule.ParamRuleID)
getResp(t, ts.URL+"/vmalert/rule"+params, nil, 404)
})
@@ -124,14 +130,14 @@ func TestHandler(t *testing.T) {
}
})
t.Run("/api/v1/alert?alertID&groupID", func(t *testing.T) {
expAlert := newAlertAPI(ar, ar.GetAlerts()[0])
alert := &apiAlert{}
expAlert := rule.NewAlertAPI(ar, ar.GetAlerts()[0])
alert := &rule.ApiAlert{}
getResp(t, ts.URL+"/"+expAlert.APILink(), alert, 200)
if !reflect.DeepEqual(alert, expAlert) {
t.Fatalf("expected %v is equal to %v", alert, expAlert)
}
alert = &apiAlert{}
alert = &rule.ApiAlert{}
getResp(t, ts.URL+"/vmalert/"+expAlert.APILink(), alert, 200)
if !reflect.DeepEqual(alert, expAlert) {
t.Fatalf("expected %v is equal to %v", alert, expAlert)
@@ -139,16 +145,16 @@ func TestHandler(t *testing.T) {
})
t.Run("/api/v1/alert?badParams", func(t *testing.T) {
params := fmt.Sprintf("?%s=0&%s=1", paramGroupID, paramAlertID)
params := fmt.Sprintf("?%s=0&%s=1", rule.ParamGroupID, rule.ParamAlertID)
getResp(t, ts.URL+"/api/v1/alert"+params, nil, 404)
getResp(t, ts.URL+"/vmalert/api/v1/alert"+params, nil, 404)
params = fmt.Sprintf("?%s=1&%s=0", paramGroupID, paramAlertID)
params = fmt.Sprintf("?%s=1&%s=0", rule.ParamGroupID, rule.ParamAlertID)
getResp(t, ts.URL+"/api/v1/alert"+params, nil, 404)
getResp(t, ts.URL+"/vmalert/api/v1/alert"+params, nil, 404)
// bad request, alertID is missing
params = fmt.Sprintf("?%s=1", paramGroupID)
params = fmt.Sprintf("?%s=1", rule.ParamGroupID)
getResp(t, ts.URL+"/api/v1/alert"+params, nil, 400)
getResp(t, ts.URL+"/vmalert/api/v1/alert"+params, nil, 400)
})
@@ -167,27 +173,42 @@ func TestHandler(t *testing.T) {
}
})
t.Run("/api/v1/rule?ruleID&groupID", func(t *testing.T) {
expRule := ruleToAPI(ar)
gotRule := apiRule{}
expRule := ar.ToAPI()
gotRule := rule.ApiRule{}
getResp(t, ts.URL+"/"+expRule.APILink(), &gotRule, 200)
if expRule.ID != gotRule.ID {
t.Fatalf("expected to get Rule %q; got %q instead", expRule.ID, gotRule.ID)
}
gotRule = apiRule{}
gotRule = rule.ApiRule{}
getResp(t, ts.URL+"/vmalert/"+expRule.APILink(), &gotRule, 200)
if expRule.ID != gotRule.ID {
t.Fatalf("expected to get Rule %q; got %q instead", expRule.ID, gotRule.ID)
}
gotRuleWithUpdates := apiRuleWithUpdates{}
gotRuleWithUpdates := rule.ApiRuleWithUpdates{}
getResp(t, ts.URL+"/"+expRule.APILink(), &gotRuleWithUpdates, 200)
if len(gotRuleWithUpdates.StateUpdates) < 1 {
t.Fatalf("expected %+v to have state updates field not empty", gotRuleWithUpdates.StateUpdates)
}
})
t.Run("/api/v1/group?groupID", func(t *testing.T) {
id := groupIDs[0]
g := m.groups[id]
expGroup := g.ToAPI()
gotGroup := rule.ApiGroup{}
getResp(t, ts.URL+"/"+expGroup.APILink(), &gotGroup, 200)
if expGroup.ID != gotGroup.ID {
t.Fatalf("expected to get Group %q; got %q instead", expGroup.ID, gotGroup.ID)
}
gotGroup = rule.ApiGroup{}
getResp(t, ts.URL+"/vmalert/"+expGroup.APILink(), &gotGroup, 200)
if expGroup.ID != gotGroup.ID {
t.Fatalf("expected to get Group %q; got %q instead", expGroup.ID, gotGroup.ID)
}
})
t.Run("/api/v1/rules&filters", func(t *testing.T) {
check := func(url string, statusCode, expGroups, expRules int) {


@@ -27,6 +27,9 @@ vmauth-linux-ppc64le-prod:
vmauth-linux-386-prod:
APP_NAME=vmauth $(MAKE) app-via-docker-linux-386
vmauth-linux-s390x-prod:
APP_NAME=vmauth $(MAKE) app-via-docker-linux-s390x
vmauth-darwin-amd64-prod:
APP_NAME=vmauth $(MAKE) app-via-docker-darwin-amd64


@@ -41,6 +41,9 @@ var (
"See https://docs.victoriametrics.com/victoriametrics/vmauth/#load-balancing for details")
defaultLoadBalancingPolicy = flag.String("loadBalancingPolicy", "least_loaded", "The default load balancing policy to use for backend urls specified inside url_prefix section. "+
"Supported policies: least_loaded, first_available. See https://docs.victoriametrics.com/victoriametrics/vmauth/#load-balancing")
defaultMergeQueryArgs = flagutil.NewArrayString("mergeQueryArgs", "An optional list of client query arg names, which must be merged with args at backend urls. "+
"The rest of client query args are replaced by the corresponding query args from backend urls for security reasons; "+
"see https://docs.victoriametrics.com/victoriametrics/vmauth/#query-args-handling")
discoverBackendIPsGlobal = flag.Bool("discoverBackendIPs", false, "Whether to discover backend IPs via periodic DNS queries to hostnames specified in url_prefix. "+
"This may be useful when url_prefix points to a hostname with dynamically scaled instances behind it. See https://docs.victoriametrics.com/victoriametrics/vmauth/#discovering-backend-ips")
discoverBackendIPsInterval = flag.Duration("discoverBackendIPsInterval", 10*time.Second, "The interval for re-discovering backend IPs if -discoverBackendIPs command-line flag is set. "+
@@ -75,6 +78,7 @@ type UserInfo struct {
DefaultURL *URLPrefix `yaml:"default_url,omitempty"`
RetryStatusCodes []int `yaml:"retry_status_codes,omitempty"`
LoadBalancingPolicy string `yaml:"load_balancing_policy,omitempty"`
MergeQueryArgs []string `yaml:"merge_query_args,omitempty"`
DropSrcPathPrefixParts *int `yaml:"drop_src_path_prefix_parts,omitempty"`
TLSCAFile string `yaml:"tls_ca_file,omitempty"`
TLSCertFile string `yaml:"tls_cert_file,omitempty"`
@@ -182,6 +186,11 @@ type URLMap struct {
// LoadBalancingPolicy is load balancing policy among UrlPrefix backends.
LoadBalancingPolicy string `yaml:"load_balancing_policy,omitempty"`
// MergeQueryArgs is a list of client query args, which must be merged with the existing backend query args.
//
// The rest of client query args are replaced with the corresponding backend query args for security reasons.
MergeQueryArgs []string `yaml:"merge_query_args,omitempty"`
// DropSrcPathPrefixParts is the number of `/`-delimited request path prefix parts to drop before proxying the request to backend.
DropSrcPathPrefixParts *int `yaml:"drop_src_path_prefix_parts,omitempty"`
}
@@ -228,7 +237,7 @@ func (qa *QueryArg) MarshalYAML() (any, error) {
return qa.sOriginal, nil
}
// URLPrefix represents passed `url_prefix`
// URLPrefix represents the `url_prefix` from auth config.
type URLPrefix struct {
// requests are re-tried on other backend urls for these http response status codes
retryStatusCodes []int
@@ -236,6 +245,11 @@ type URLPrefix struct {
// load balancing policy used
loadBalancingPolicy string
// the list of client query args, which must be merged with backend query args.
//
// By default backend query args replace all the client query args for security reasons.
mergeQueryArgs []string
// how many request path prefix parts to drop before routing the request to backendURL
dropSrcPathPrefixParts int
@@ -468,27 +482,34 @@ func getLeastLoadedBackendURL(bus []*backendURL, atomicCounter *atomic.Uint32) *
if bu.isBroken() {
continue
}
if bu.concurrentRequests.Load() == 0 {
// Fast path - return the backend with zero concurrently executed requests.
// Do not use CompareAndSwap() instead of Load(), since it is much slower on systems with many CPU cores.
bu.concurrentRequests.Add(1)
// The Load() in front of CompareAndSwap() avoids CAS overhead for items with values bigger than 0.
if bu.concurrentRequests.Load() == 0 && bu.concurrentRequests.CompareAndSwap(0, 1) {
atomicCounter.CompareAndSwap(n+1, idx+1)
// There is no need to call bu.get(), because bu.concurrentRequests has already been incremented above.
return bu
}
}
// Slow path - return the backend with the minimum number of concurrently executed requests.
buMin := bus[n%uint32(len(bus))]
minRequests := buMin.concurrentRequests.Load()
for _, bu := range bus {
buMinIdx := n % uint32(len(bus))
minRequests := bus[buMinIdx].concurrentRequests.Load()
for i := uint32(0); i < uint32(len(bus)); i++ {
idx := (n + i) % uint32(len(bus))
bu := bus[idx]
if bu.isBroken() {
continue
}
if n := bu.concurrentRequests.Load(); n < minRequests || buMin.isBroken() {
buMin = bu
minRequests = n
reqs := bu.concurrentRequests.Load()
if reqs < minRequests || bus[buMinIdx].isBroken() {
buMinIdx = idx
minRequests = reqs
}
}
buMin := bus[buMinIdx]
buMin.get()
atomicCounter.CompareAndSwap(n+1, buMinIdx+1)
return buMin
}
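The selection logic in this hunk can be illustrated in isolation. The following is a minimal sketch, not vmauth's actual code: the `backend` and `pool` types and the `newPool` and `pick` names are hypothetical stand-ins for `backendURL`, `URLPrefix`, and `getLeastLoadedBackendURL`. It shows the two ideas the hunk relies on: a cheap `Load()` in front of `CompareAndSwap()` on the fast path, and a slow-path scan that starts at the round-robin cursor and skips broken backends.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// backend is a hypothetical stand-in for vmauth's backendURL.
type backend struct {
	name     string
	inflight atomic.Int32
	broken   atomic.Bool
}

// pool is a hypothetical container for backends plus a round-robin cursor.
type pool struct {
	backends []*backend
	cursor   atomic.Uint32
}

func newPool(names ...string) *pool {
	p := &pool{}
	for _, name := range names {
		p.backends = append(p.backends, &backend{name: name})
	}
	return p
}

// pick returns a healthy backend with the fewest in-flight requests and
// increments its counter. It returns nil when no healthy backend exists.
func (p *pool) pick() *backend {
	total := uint32(len(p.backends))
	if total == 0 {
		return nil
	}
	start := p.cursor.Add(1) - 1

	// Fast path: claim an idle backend. The cheap Load() filters out busy
	// backends, so the costlier CompareAndSwap() runs only on likely winners.
	for i := uint32(0); i < total; i++ {
		b := p.backends[(start+i)%total]
		if b.broken.Load() {
			continue
		}
		if b.inflight.Load() == 0 && b.inflight.CompareAndSwap(0, 1) {
			return b
		}
	}

	// Slow path: scan from the cursor and keep the least loaded healthy backend.
	var best *backend
	var bestReqs int32
	for i := uint32(0); i < total; i++ {
		b := p.backends[(start+i)%total]
		if b.broken.Load() {
			continue
		}
		if reqs := b.inflight.Load(); best == nil || reqs < bestReqs {
			best, bestReqs = b, reqs
		}
	}
	if best != nil {
		best.inflight.Add(1)
	}
	return best
}

func main() {
	p := newPool("a", "b", "c")
	for i := 0; i < 4; i++ {
		fmt.Println(p.pick().name)
	}
}
```

Starting the slow-path scan at the cursor, instead of always at index 0, spreads ties across backends when several of them carry the same load.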
@@ -856,6 +877,7 @@ func (ui *UserInfo) getMetricLabels() (string, error) {
func (ui *UserInfo) initURLs() error {
retryStatusCodes := defaultRetryStatusCodes.Values()
loadBalancingPolicy := *defaultLoadBalancingPolicy
mergeQueryArgs := *defaultMergeQueryArgs
dropSrcPathPrefixParts := 0
discoverBackendIPs := *discoverBackendIPsGlobal
if ui.RetryStatusCodes != nil {
@@ -864,6 +886,9 @@ func (ui *UserInfo) initURLs() error {
if ui.LoadBalancingPolicy != "" {
loadBalancingPolicy = ui.LoadBalancingPolicy
}
if len(ui.MergeQueryArgs) != 0 {
mergeQueryArgs = ui.MergeQueryArgs
}
if ui.DropSrcPathPrefixParts != nil {
dropSrcPathPrefixParts = *ui.DropSrcPathPrefixParts
}
@@ -871,16 +896,18 @@ func (ui *UserInfo) initURLs() error {
discoverBackendIPs = *ui.DiscoverBackendIPs
}
if ui.URLPrefix != nil {
if err := ui.URLPrefix.sanitizeAndInitialize(); err != nil {
up := ui.URLPrefix
if up != nil {
if err := up.sanitizeAndInitialize(); err != nil {
return err
}
ui.URLPrefix.retryStatusCodes = retryStatusCodes
ui.URLPrefix.dropSrcPathPrefixParts = dropSrcPathPrefixParts
ui.URLPrefix.discoverBackendIPs = discoverBackendIPs
if err := ui.URLPrefix.setLoadBalancingPolicy(loadBalancingPolicy); err != nil {
up.retryStatusCodes = retryStatusCodes
up.dropSrcPathPrefixParts = dropSrcPathPrefixParts
up.discoverBackendIPs = discoverBackendIPs
if err := up.setLoadBalancingPolicy(loadBalancingPolicy); err != nil {
return err
}
up.mergeQueryArgs = mergeQueryArgs
}
if ui.DefaultURL != nil {
if err := ui.DefaultURL.sanitizeAndInitialize(); err != nil {
@@ -899,6 +926,7 @@ func (ui *UserInfo) initURLs() error {
}
rscs := retryStatusCodes
lbp := loadBalancingPolicy
mqa := mergeQueryArgs
dsp := dropSrcPathPrefixParts
dbd := discoverBackendIPs
if e.RetryStatusCodes != nil {
@@ -907,6 +935,9 @@ func (ui *UserInfo) initURLs() error {
if e.LoadBalancingPolicy != "" {
lbp = e.LoadBalancingPolicy
}
if len(e.MergeQueryArgs) != 0 {
mqa = e.MergeQueryArgs
}
if e.DropSrcPathPrefixParts != nil {
dsp = *e.DropSrcPathPrefixParts
}
@@ -917,6 +948,7 @@ func (ui *UserInfo) initURLs() error {
if err := e.URLPrefix.setLoadBalancingPolicy(lbp); err != nil {
return err
}
e.URLPrefix.mergeQueryArgs = mqa
e.URLPrefix.dropSrcPathPrefixParts = dsp
e.URLPrefix.discoverBackendIPs = dbd
}
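The override order applied to `merge_query_args` in `initURLs` (global default, then the user-level value, then the `url_map` entry) can be sketched as a small helper. This is only an illustration under assumed names: `resolveList` is hypothetical and does not exist in vmauth.

```go
package main

import "fmt"

// resolveList returns the most specific non-empty list, mirroring the
// "global default -> user -> url_map entry" precedence.
// The function name and signature are hypothetical.
func resolveList(global, user, urlMap []string) []string {
	result := global
	if len(user) != 0 {
		result = user
	}
	if len(urlMap) != 0 {
		result = urlMap
	}
	return result
}

func main() {
	fmt.Println(resolveList([]string{"g"}, nil, nil))
	fmt.Println(resolveList([]string{"g"}, []string{"u"}, nil))
	fmt.Println(resolveList([]string{"g"}, []string{"u"}, []string{"m"}))
}
```

Because the checks use `len(...) != 0`, an empty list at a more specific level does not clear an inherited value; it falls back to the less specific one.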


@@ -280,7 +280,7 @@ users:
}
func TestParseAuthConfigSuccess(t *testing.T) {
f := func(s string, expectedAuthConfig map[string]*UserInfo) {
f := func(s string, expectedAuthConfig map[string]*UserInfo, expectedUnauthorizedUserConfig *UserInfo) {
t.Helper()
ac, err := parseAuthConfig([]byte(s))
if err != nil {
@@ -294,15 +294,19 @@ func TestParseAuthConfigSuccess(t *testing.T) {
if err := areEqualConfigs(m, expectedAuthConfig); err != nil {
t.Fatal(err)
}
if err := areEqualConfigs(ac.UnauthorizedUser, expectedUnauthorizedUserConfig); err != nil {
t.Fatal(err)
}
}
insecureSkipVerifyTrue := true
// Empty config
f(``, map[string]*UserInfo{})
f(``, map[string]*UserInfo{}, nil)
// Empty users
f(`users: []`, map[string]*UserInfo{})
f(`users: []`, map[string]*UserInfo{}, nil)
// Single user
f(`
@@ -320,7 +324,7 @@ users:
MaxConcurrentRequests: 5,
TLSInsecureSkipVerify: &insecureSkipVerifyTrue,
},
})
}, nil)
// Single user with auth_token
f(`
@@ -344,7 +348,7 @@ users:
TLSCertFile: "foo/baz",
TLSKeyFile: "foo/foo",
},
})
}, nil)
// Multiple url_prefix entries
insecureSkipVerifyFalse := false
@@ -359,6 +363,7 @@ users:
tls_insecure_skip_verify: false
retry_status_codes: [500, 501]
load_balancing_policy: first_available
merge_query_args: [foo, bar]
drop_src_path_prefix_parts: 1
discover_backend_ips: true
`, map[string]*UserInfo{
@@ -372,10 +377,11 @@ users:
TLSInsecureSkipVerify: &insecureSkipVerifyFalse,
RetryStatusCodes: []int{500, 501},
LoadBalancingPolicy: "first_available",
MergeQueryArgs: []string{"foo", "bar"},
DropSrcPathPrefixParts: intp(1),
DiscoverBackendIPs: &discoverBackendIPsTrue,
},
})
}, nil)
// Multiple users
f(`
@@ -393,7 +399,7 @@ users:
Username: "bar",
URLPrefix: mustParseURL("https://bar/x/"),
},
})
}, nil)
// non-empty URLMap
sharedUserInfo := &UserInfo{
@@ -443,7 +449,7 @@ users:
`, map[string]*UserInfo{
getHTTPAuthBearerToken("foo"): sharedUserInfo,
getHTTPAuthBasicToken("foo", ""): sharedUserInfo,
})
}, nil)
// Multiple users with the same name - this should work, since these users have different passwords
f(`
@@ -465,7 +471,7 @@ users:
Password: "bar",
URLPrefix: mustParseURL("https://bar/x"),
},
})
}, nil)
// with default url
keepOriginalHost := true
@@ -481,6 +487,8 @@ users:
- "foo: bar"
- "xxx: y"
keep_original_host: true
load_balancing_policy: first_available
merge_query_args: [foo, bar]
default_url:
- http://default1/select/0/prometheus
- http://default2/select/0/prometheus
@@ -505,6 +513,8 @@ users:
},
KeepOriginalHost: &keepOriginalHost,
},
LoadBalancingPolicy: "first_available",
MergeQueryArgs: []string{"foo", "bar"},
},
},
DefaultURL: mustParseURLs([]string{
@@ -532,6 +542,8 @@ users:
},
KeepOriginalHost: &keepOriginalHost,
},
LoadBalancingPolicy: "first_available",
MergeQueryArgs: []string{"foo", "bar"},
},
},
DefaultURL: mustParseURLs([]string{
@@ -539,7 +551,7 @@ users:
"http://default2/select/0/prometheus",
}),
},
})
}, nil)
// With metric_labels
f(`
@@ -591,6 +603,23 @@ users:
},
},
},
}, nil)
// unauthorized_user
f(`
unauthorized_user:
merge_query_args: [extra_filters]
url_map:
- src_paths: ["/select/.+"]
url_prefix: 'http://victoria-logs:9428/?extra_filters={env="prod"}'
`, nil, &UserInfo{
MergeQueryArgs: []string{"extra_filters"},
URLMaps: []URLMap{
{
SrcPaths: getRegexs([]string{"/select/.+"}),
URLPrefix: mustParseURL(`http://victoria-logs:9428/?extra_filters={env="prod"}`),
},
},
})
}
@@ -723,10 +752,12 @@ func TestGetLeastLoadedBackendURL(t *testing.T) {
})
up.loadBalancingPolicy = "least_loaded"
pbus := up.bus.Load()
bus := *pbus
fn := func(ns ...int) {
t.Helper()
pbus := up.bus.Load()
bus := *pbus
for i, b := range bus {
got := int(b.concurrentRequests.Load())
exp := ns[i]
@@ -738,45 +769,52 @@ func TestGetLeastLoadedBackendURL(t *testing.T) {
up.getBackendURL()
fn(1, 0, 0)
up.getBackendURL()
fn(1, 1, 0)
up.getBackendURL()
fn(1, 1, 1)
up.getBackendURL()
up.getBackendURL()
fn(2, 2, 1)
bus := up.bus.Load()
pbus := *bus
pbus[0].concurrentRequests.Add(2)
pbus[2].concurrentRequests.Add(5)
fn(4, 2, 6)
bus[1].put()
bus[2].put()
fn(1, 0, 0)
up.getBackendURL()
fn(4, 3, 6)
fn(1, 1, 0)
bus[1].put()
up.getBackendURL()
fn(4, 4, 6)
up.getBackendURL()
fn(4, 5, 6)
up.getBackendURL()
fn(5, 5, 6)
up.getBackendURL()
fn(6, 5, 6)
up.getBackendURL()
fn(6, 6, 6)
up.getBackendURL()
fn(6, 6, 7)
fn(1, 0, 1)
up.getBackendURL()
up.getBackendURL()
fn(7, 7, 7)
fn(1, 1, 2)
bus[0].concurrentRequests.Add(2)
bus[2].concurrentRequests.Add(2)
fn(3, 1, 4)
up.getBackendURL()
fn(3, 2, 4)
up.getBackendURL()
fn(3, 3, 4)
up.getBackendURL()
fn(4, 3, 4)
up.getBackendURL()
fn(4, 4, 4)
bus[0].put()
bus[2].put()
up.getBackendURL()
fn(3, 4, 4)
up.getBackendURL()
fn(4, 4, 4)
}
func TestBrokenBackend(t *testing.T) {
@@ -884,7 +922,7 @@ func removeMetrics(m map[string]*UserInfo) {
}
}
func areEqualConfigs(a, b map[string]*UserInfo) error {
func areEqualConfigs(a, b any) error {
aData, err := yaml.Marshal(a)
if err != nil {
return fmt.Errorf("cannot marshal a: %w", err)


@@ -167,6 +167,12 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
ui := getUserInfoByAuthTokens(ats)
if ui == nil {
uu := authConfig.Load().UnauthorizedUser
if uu != nil {
processUserRequest(w, r, uu)
return true
}
invalidAuthTokenRequests.Inc()
if *logInvalidAuthTokens {
err := fmt.Errorf("cannot authorize request with auth tokens %q", ats)
@@ -263,7 +269,7 @@ func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
query.Set("request_path", u.String())
targetURL.RawQuery = query.Encode()
} else { // Update path for regular routes.
targetURL = mergeURLs(targetURL, u, up.dropSrcPathPrefixParts)
targetURL = mergeURLs(targetURL, u, up.dropSrcPathPrefixParts, up.mergeQueryArgs)
}
wasLocalRetry := false
@@ -304,14 +310,21 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
rtb, rtbOK := req.Body.(*readTrackingBody)
res, err := ui.rt.RoundTrip(req)
if ctxErr := r.Context().Err(); ctxErr != nil {
// Override the error returned by RoundTrip with the context error if the latter is non-nil.
// This ensures proper logging for canceled and timed-out requests: the real cause of the error is logged
// instead of an arbitrary error, which RoundTrip could return because of the canceled or timed-out request.
err = ctxErr
}
if err != nil {
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
// Do not retry canceled or timed out requests
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
requestURI := httpserver.GetRequestURI(r)
logger.Warnf("remoteAddr: %s; requestURI: %s; error when proxying response body from %s: %s", remoteAddr, requestURI, targetURL, err)
if errors.Is(err, context.DeadlineExceeded) {
// Timed-out requests must be counted as errors, since a timeout usually means that the backend is slow.
logger.Warnf("remoteAddr: %s; requestURI: %s; timeout while proxying the response from %s: %s", remoteAddr, requestURI, targetURL, err)
ui.backendErrors.Inc()
}
return false, false
@@ -366,20 +379,54 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
updateHeadersByConfig(w.Header(), hc.ResponseHeaders)
w.WriteHeader(res.StatusCode)
copyBuf := copyBufPool.Get()
copyBuf.B = bytesutil.ResizeNoCopyNoOverallocate(copyBuf.B, 16*1024)
_, err = io.CopyBuffer(w, res.Body, copyBuf.B)
copyBufPool.Put(copyBuf)
err = copyStreamToClient(w, res.Body)
_ = res.Body.Close()
if err != nil && !netutil.IsTrivialNetworkError(err) {
if err != nil && !netutil.IsTrivialNetworkError(err) && !errors.Is(err, context.Canceled) {
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
requestURI := httpserver.GetRequestURI(r)
logger.Warnf("remoteAddr: %s; requestURI: %s; error when proxying response body from %s: %s", remoteAddr, requestURI, targetURL, err)
return true, false
}
return true, false
}
func copyStreamToClient(client io.Writer, backend io.Reader) error {
copyBuf := copyBufPool.Get()
copyBuf.B = bytesutil.ResizeNoCopyNoOverallocate(copyBuf.B, 16*1024)
defer copyBufPool.Put(copyBuf)
buf := copyBuf.B
flusher, ok := client.(http.Flusher)
if !ok {
logger.Panicf("BUG: client must implement net/http.Flusher interface; got %T", client)
}
for {
n, backendErr := backend.Read(buf)
if n > 0 {
data := buf[:n]
n, clientErr := client.Write(data)
if clientErr != nil {
return fmt.Errorf("cannot write data to client: %w", clientErr)
}
if n != len(data) {
logger.Panicf("BUG: unexpected number of bytes written returned by client.Write; got %d; want %d", n, len(data))
}
// Flush the read data from the backend to the client as fast as possible
// in order to reduce delays for data propagation.
// See https://github.com/VictoriaMetrics/VictoriaLogs/issues/667
flusher.Flush()
}
if backendErr != nil {
if backendErr == io.EOF {
return nil
}
return fmt.Errorf("cannot read data from backend: %w", backendErr)
}
}
}
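The flush-after-every-read pattern used by `copyStreamToClient` can be tried outside vmauth with a small self-contained program. The sketch below is an approximation under stated assumptions: `copyAndFlush`, `recordingWriter`, and `chunkReader` are hypothetical names, the buffer is allocated per call instead of being pooled, and a missing `http.Flusher` is reported as an error instead of a panic.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
)

// recordingWriter is a hypothetical client that records writes and flushes.
type recordingWriter struct {
	data    []byte
	flushes int
}

func (w *recordingWriter) Write(p []byte) (int, error) {
	w.data = append(w.data, p...)
	return len(p), nil
}

// Flush implements net/http.Flusher.
func (w *recordingWriter) Flush() { w.flushes++ }

// chunkReader is a hypothetical backend that returns one chunk per Read call.
type chunkReader struct {
	chunks []string
}

func (r *chunkReader) Read(p []byte) (int, error) {
	if len(r.chunks) == 0 {
		return 0, io.EOF
	}
	n := copy(p, r.chunks[0])
	if n < len(r.chunks[0]) {
		r.chunks[0] = r.chunks[0][n:]
	} else {
		r.chunks = r.chunks[1:]
	}
	return n, nil
}

// copyAndFlush copies backend data to the client and flushes after every
// successful read, so partial responses reach the client without delay.
func copyAndFlush(client io.Writer, backend io.Reader) error {
	flusher, ok := client.(http.Flusher)
	if !ok {
		return errors.New("client must implement net/http.Flusher")
	}
	buf := make([]byte, 16*1024)
	for {
		n, readErr := backend.Read(buf)
		if n > 0 {
			if _, err := client.Write(buf[:n]); err != nil {
				return fmt.Errorf("cannot write data to client: %w", err)
			}
			flusher.Flush()
		}
		if readErr != nil {
			if errors.Is(readErr, io.EOF) {
				return nil
			}
			return fmt.Errorf("cannot read data from backend: %w", readErr)
		}
	}
}

func main() {
	w := &recordingWriter{}
	err := copyAndFlush(w, &chunkReader{chunks: []string{"line 1\n", "line 2\n"}})
	fmt.Printf("data=%q flushes=%d err=%v\n", w.data, w.flushes, err)
}
```

Compared with `io.CopyBuffer`, this trades some throughput for latency: each chunk read from the backend is pushed to the client immediately, which matters for streaming responses such as live log tailing.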
var copyBufPool bytesutil.ByteBufferPool
func copyHeader(dst, src http.Header) {


@@ -90,6 +90,20 @@ User-Agent: vmauth
X-Forwarded-For: 12.34.56.78, 42.2.3.84`
f(cfgStr, requestURL, backendHandler, responseExpected)
// routing of all failed to authorize requests to unauthorized_user (issue #7543)
cfgStr = `
unauthorized_user:
url_prefix: "{BACKEND}/foo"
keep_original_host: true`
requestURL = "http://foo:invalid-secret@some-host.com/abc/def"
backendHandler = func(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "requested_url=http://%s%s", r.Host, r.URL)
}
responseExpected = `
statusCode=200
requested_url=http://some-host.com/foo/abc/def`
f(cfgStr, requestURL, backendHandler, responseExpected)
// keep_original_host
cfgStr = `
unauthorized_user:
@@ -500,6 +514,11 @@ func (w *fakeResponseWriter) getResponse() string {
return w.bb.String()
}
// Flush implements net/http.Flusher
func (w *fakeResponseWriter) Flush() {
// Nothing to do.
}
func (w *fakeResponseWriter) Header() http.Header {
if w.h == nil {
w.h = http.Header{}


@@ -8,29 +8,42 @@ import (
"strings"
)
func mergeURLs(uiURL, requestURI *url.URL, dropSrcPathPrefixParts int) *url.URL {
func mergeURLs(uiURL, requestURI *url.URL, dropSrcPathPrefixParts int, mergeQueryArgs []string) *url.URL {
targetURL := *uiURL
srcPath := dropPrefixParts(requestURI.Path, dropSrcPathPrefixParts)
if strings.HasPrefix(srcPath, "/") {
targetURL.Path = strings.TrimSuffix(targetURL.Path, "/")
}
targetURL.Path += srcPath
requestParams := requestURI.Query()
// fast path
if len(requestParams) == 0 {
return &targetURL
}
// merge query parameters from requests.
uiParams := targetURL.Query()
// Merge client query args with backend query args
targetParams := targetURL.Query()
uiParams := url.Values{}
// Copy all the target query args
for k, v := range targetParams {
for i := range v {
uiParams.Add(k, v[i])
}
}
// Copy the client query args if they do not clash with target args.
for k, v := range requestParams {
// skip clashed query params from original request
if exist := uiParams.Get(k); len(exist) > 0 {
if targetParams.Has(k) && !slices.Contains(mergeQueryArgs, k) {
// Skip clashed client query params for security reasons
continue
}
for i := range v {
uiParams.Add(k, v[i])
}
}
targetURL.RawQuery = uiParams.Encode()
return &targetURL
}
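The query-arg merging rule introduced in `mergeURLs` can be exercised on raw query strings alone. The following is a minimal sketch assuming a hypothetical helper named `mergeQuery`; it reproduces only the query part of the logic (no path handling) and follows the same rule: backend args always win on a clash unless the arg name is listed in `mergeArgs`, in which case the client values are appended after the backend values.

```go
package main

import (
	"fmt"
	"net/url"
	"slices"
)

// mergeQuery merges client query args into backend query args.
// Client args that clash with backend args are dropped unless their
// name is listed in mergeArgs. The helper name is hypothetical.
func mergeQuery(backendQuery, clientQuery string, mergeArgs []string) (string, error) {
	backend, err := url.ParseQuery(backendQuery)
	if err != nil {
		return "", fmt.Errorf("cannot parse backend query: %w", err)
	}
	client, err := url.ParseQuery(clientQuery)
	if err != nil {
		return "", fmt.Errorf("cannot parse client query: %w", err)
	}

	merged := url.Values{}
	// Copy all the backend args first, so they precede client values.
	for k, vs := range backend {
		for _, v := range vs {
			merged.Add(k, v)
		}
	}
	for k, vs := range client {
		if backend.Has(k) && !slices.Contains(mergeArgs, k) {
			// Skip clashing client args for security reasons.
			continue
		}
		for _, v := range vs {
			merged.Add(k, v)
		}
	}
	// Encode() sorts the result by key.
	return merged.Encode(), nil
}

func main() {
	q, _ := mergeQuery("password=abc", "password=hack&qqq=www", nil)
	fmt.Println(q)
	q, _ = mergeQuery("baz=abc", "baz=xxx&qqq=www", []string{"baz"})
	fmt.Println(q)
}
```

The two calls in `main` correspond to the last two cases in `TestMergeURLs`: the first drops the client-supplied `password`, the second keeps both `baz` values because `baz` is explicitly allowed.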


@@ -101,7 +101,7 @@ func TestCreateTargetURLSuccess(t *testing.T) {
return
}
bu := up.getBackendURL()
target := mergeURLs(bu.url, u, up.dropSrcPathPrefixParts)
target := mergeURLs(bu.url, u, up.dropSrcPathPrefixParts, up.mergeQueryArgs)
bu.put()
gotTarget := target.String()
@@ -352,7 +352,7 @@ func TestUserInfoGetBackendURL_SRV(t *testing.T) {
return
}
bu := up.getBackendURL()
target := mergeURLs(bu.url, u, up.dropSrcPathPrefixParts)
target := mergeURLs(bu.url, u, up.dropSrcPathPrefixParts, up.mergeQueryArgs)
bu.put()
gotTarget := target.String()
@@ -528,3 +528,43 @@ func (r *fakeResolver) LookupIPAddr(_ context.Context, host string) ([]net.IPAdd
func (r *fakeResolver) LookupMX(_ context.Context, _ string) ([]*net.MX, error) {
return nil, nil
}
func TestMergeURLs(t *testing.T) {
f := func(clientURL, backendURL string, dropSrcPathPrefixParts int, mergeQueryArgs []string, resultURLExpected string) {
t.Helper()
cu, err := url.Parse(clientURL)
if err != nil {
t.Fatalf("cannot parse client url %q: %s", clientURL, err)
}
cu = normalizeURL(cu)
bu, err := url.Parse(backendURL)
if err != nil {
t.Fatalf("cannot parse backend url %q: %s", backendURL, err)
}
ru := mergeURLs(bu, cu, dropSrcPathPrefixParts, mergeQueryArgs)
resultURL := ru.String()
if resultURL != resultURLExpected {
t.Fatalf("unexpected resultURL\ngot\n%s\nwant\n%s", resultURL, resultURLExpected)
}
}
f("http://foo:1234", "https://backend/foo/bar?baz=abc&de", 0, nil, "https://backend/foo/bar?baz=abc&de")
f("http://foo:1234", "https://backend/foo/bar/?baz=abc&de", 0, nil, "https://backend/foo/bar/?baz=abc&de")
f("https://foo:1234/", "https://backend/foo/bar?baz=abc&de", 0, nil, "https://backend/foo/bar?baz=abc&de")
f("https://foo:1234/", "http://backend:8888/foo/bar/?baz=abc&de", 0, nil, "http://backend:8888/foo/bar/?baz=abc&de")
// merge paths
f("http://foo:1234/x/y?z=xxx", "https://backend/foo/bar?baz=abc&de", 0, nil, "https://backend/foo/bar/x/y?baz=abc&de=&z=xxx")
// "hacky" url
f("http://foo:1234/../../x/../y?z=xxx", "https://backend/foo/bar?baz=abc&de", 0, nil, "https://backend/foo/bar/y?baz=abc&de=&z=xxx")
// make sure that the client args are overridden by server args by default
f("http://foo:1234/x/y?password=hack&qqq=www", "https://backend/foo/bar?password=abc", 0, nil, "https://backend/foo/bar/x/y?password=abc&qqq=www")
// allow overriding the selected query args
f("http://foo:1234/x/y?baz=xxx&qqq=www", "https://backend/foo/bar?baz=abc", 0, []string{"baz"}, "https://backend/foo/bar/x/y?baz=abc&baz=xxx&qqq=www")
}


@@ -31,6 +31,9 @@ vmbackup-linux-ppc64le-prod:
vmbackup-linux-386-prod:
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-386
vmbackup-linux-s390x-prod:
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-s390x
vmbackup-darwin-amd64-prod:
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-darwin-amd64


@@ -115,7 +115,7 @@ func main() {
if err != nil {
logger.Fatalf("cannot create backup: %s", err)
}
pushmetrics.Stop()
pushmetrics.StopAndPush()
startTime := time.Now()
logger.Infof("gracefully shutting down http server for metrics at %q", listenAddrs)
@@ -212,7 +212,7 @@ func newSrcFS() (*fslocal.FS, error) {
}
func newDstFS(ctx context.Context) (common.RemoteFS, error) {
fs, err := actions.NewRemoteFS(ctx, *dst)
fs, err := actions.NewRemoteFS(ctx, *dst, nil)
if err != nil {
return nil, fmt.Errorf("cannot parse `-dst`=%q: %w", *dst, err)
}
@@ -255,7 +255,7 @@ func newOriginFS(ctx context.Context) (common.OriginFS, error) {
if len(*origin) == 0 {
return &fsnil.FS{}, nil
}
fs, err := actions.NewRemoteFS(ctx, *origin)
fs, err := actions.NewRemoteFS(ctx, *origin, nil)
if err != nil {
return nil, fmt.Errorf("cannot parse `-origin`=%q: %w", *origin, err)
}
@@ -266,7 +266,7 @@ func newRemoteOriginFS(ctx context.Context) (common.RemoteFS, error) {
if len(*origin) == 0 {
return nil, fmt.Errorf("-origin cannot be empty when -snapshotName and -snapshot.createURL aren't set")
}
fs, err := actions.NewRemoteFS(ctx, *origin)
fs, err := actions.NewRemoteFS(ctx, *origin, nil)
if err != nil {
return nil, fmt.Errorf("cannot parse `-origin`=%q: %w", *origin, err)
}


@@ -27,6 +27,9 @@ vmctl-linux-ppc64le-prod:
vmctl-linux-386-prod:
APP_NAME=vmctl $(MAKE) app-via-docker-linux-386
vmctl-linux-s390x-prod:
APP_NAME=vmctl $(MAKE) app-via-docker-linux-s390x
vmctl-darwin-amd64-prod:
APP_NAME=vmctl $(MAKE) app-via-docker-darwin-amd64


@@ -689,15 +689,15 @@ var (
Usage: "The time filter in RFC3339 format to select timeseries with timestamp equal or lower than provided value. E.g. '2020-01-01T20:07:00Z'",
Layout: time.RFC3339,
},
&cli.StringFlag{
Name: remoteReadFilterLabel,
Usage: "Prometheus label name to filter timeseries by. E.g. '__name__' will filter timeseries by name.",
Value: "__name__",
&cli.StringSliceFlag{
Name: remoteReadFilterLabel,
Usage: "Prometheus label name to filter timeseries by. E.g. '__name__' will filter timeseries by name.",
DefaultText: "__name__",
},
&cli.StringFlag{
Name: remoteReadFilterLabelValue,
Usage: fmt.Sprintf("Prometheus regular expression to filter label from %q flag.", remoteReadFilterLabelValue),
Value: ".*",
&cli.StringSliceFlag{
Name: remoteReadFilterLabelValue,
Usage: fmt.Sprintf("Prometheus regular expression to filter label from %q flag.", remoteReadFilterLabelValue),
DefaultText: ".*",
},
&cli.BoolFlag{
Name: remoteRead,


@@ -192,6 +192,14 @@ func main() {
return fmt.Errorf("failed to create transport for -%s=%q: %s", remoteReadSrcAddr, addr, err)
}
// Backwards compatible default values if none provided by user
rrLabelNames := c.StringSlice(remoteReadFilterLabel)
rrLabelValues := c.StringSlice(remoteReadFilterLabelValue)
if len(rrLabelNames) == 0 && len(rrLabelValues) == 0 {
rrLabelNames = []string{"__name__"}
rrLabelValues = []string{".*"}
}
rr, err := remoteread.NewClient(remoteread.Config{
Addr: addr,
Transport: tr,
@@ -200,8 +208,8 @@ func main() {
Timeout: c.Duration(remoteReadHTTPTimeout),
UseStream: c.Bool(remoteReadUseStream),
Headers: c.String(remoteReadHeaders),
LabelName: c.String(remoteReadFilterLabel),
LabelValue: c.String(remoteReadFilterLabelValue),
LabelNames: rrLabelNames,
LabelValues: rrLabelValues,
DisablePathAppend: c.Bool(remoteReadDisablePathAppend),
})
if err != nil {


@@ -11,14 +11,15 @@ import (
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/gogo/protobuf/proto"
"github.com/golang/snappy"
"github.com/prometheus/prometheus/config"
"github.com/prometheus/prometheus/prompb"
"github.com/prometheus/prometheus/storage/remote"
"github.com/prometheus/prometheus/tsdb/chunkenc"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
)
const (
@@ -63,9 +64,9 @@ type Config struct {
UseStream bool
// Headers optional HTTP headers to send with each request to the corresponding remote storage
Headers string
// LabelName, LabelValue stands for label=~value pair used for read requests.
// LabelNames and LabelValues are parallel lists of label=~value pairs used for read requests.
// Is optional.
LabelName, LabelValue string
LabelNames, LabelValues []string
}
// Filter defines a list of filters applied to requested data
@@ -94,12 +95,22 @@ func NewClient(cfg Config) (*Client, error) {
return nil, err
}
var m *prompb.LabelMatcher
if cfg.LabelName != "" && cfg.LabelValue != "" {
m = &prompb.LabelMatcher{
Type: prompb.LabelMatcher_RE,
Name: cfg.LabelName,
Value: cfg.LabelValue,
var matchers []*prompb.LabelMatcher
if len(cfg.LabelNames) > 0 || len(cfg.LabelValues) > 0 {
if len(cfg.LabelNames) != len(cfg.LabelValues) {
return nil, fmt.Errorf("the number of label names and label values must be the same")
}
for i := range cfg.LabelNames {
if cfg.LabelNames[i] == "" {
return nil, fmt.Errorf("label name cannot be empty")
}
matcher := &prompb.LabelMatcher{
Type: prompb.LabelMatcher_RE,
Name: cfg.LabelNames[i],
Value: cfg.LabelValues[i],
}
matchers = append(matchers, matcher)
}
}
@@ -116,7 +127,7 @@ func NewClient(cfg Config) (*Client, error) {
password: cfg.Password,
useStream: cfg.UseStream,
headers: headers,
matchers: []*prompb.LabelMatcher{m},
matchers: matchers,
}
return c, nil
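The pairing of label names with label values can be sketched without the Prometheus dependency. In the sketch below, `labelMatcher` is a hypothetical stand-in for `prompb.LabelMatcher` with regex matching, and `buildMatchers` is a hypothetical helper that folds two steps from this change into one function: the backwards-compatible default (`__name__` matched against `.*`) from vmctl's `main`, and the validation from `NewClient`.

```go
package main

import (
	"errors"
	"fmt"
)

// labelMatcher is a hypothetical stand-in for a regex prompb.LabelMatcher.
type labelMatcher struct {
	Name  string
	Regex string
}

// buildMatchers pairs names[i] with values[i].
// When both lists are empty it falls back to the old single-filter default.
func buildMatchers(names, values []string) ([]labelMatcher, error) {
	if len(names) == 0 && len(values) == 0 {
		names, values = []string{"__name__"}, []string{".*"}
	}
	if len(names) != len(values) {
		return nil, fmt.Errorf("got %d label names and %d label values; the counts must match",
			len(names), len(values))
	}
	matchers := make([]labelMatcher, 0, len(names))
	for i, name := range names {
		if name == "" {
			return nil, errors.New("label name cannot be empty")
		}
		matchers = append(matchers, labelMatcher{Name: name, Regex: values[i]})
	}
	return matchers, nil
}

func main() {
	ms, err := buildMatchers([]string{"__name__", "job"}, []string{"node_.*", "prod"})
	fmt.Println(ms, err)
	_, err = buildMatchers([]string{"job"}, nil)
	fmt.Println(err)
}
```

Since the flags are matched by position, the i-th label-name flag pairs with the i-th label-value flag, and supplying only one of the two flags is rejected by the length check.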


@@ -63,10 +63,7 @@ func (ts *TimeSeries) write(w io.Writer) (int, error) {
// Split long lines with more than 10K samples into multiple JSON lines.
// This should limit memory usage at VictoriaMetrics during data ingestion,
// since it allocates memory for the whole JSON line and processes it in one go.
batchSize := 10000
if batchSize > len(timestamps) {
batchSize = len(timestamps)
}
batchSize := min(10000, len(timestamps))
timestampsBatch := timestamps[:batchSize]
valuesBatch := values[:batchSize]
timestamps = timestamps[batchSize:]
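The batching step simplified here with the built-in `min` (available since Go 1.21) generalizes to any slice. The following sketch uses a hypothetical generic helper named `batches`; it is not part of vmctl and only illustrates the same "take at most N items per round" loop.

```go
package main

import "fmt"

// batches splits items into consecutive chunks of at most maxBatch elements.
// It returns nil for a non-positive maxBatch. The helper name is hypothetical.
func batches[T any](items []T, maxBatch int) [][]T {
	if maxBatch <= 0 {
		return nil
	}
	var result [][]T
	for len(items) > 0 {
		// The built-in min replaces the manual "if batchSize > len(...)" clamp.
		n := min(maxBatch, len(items))
		result = append(result, items[:n])
		items = items[n:]
	}
	return result
}

func main() {
	for _, b := range batches([]int{1, 2, 3, 4, 5}, 2) {
		fmt.Println(b)
	}
}
```

The returned chunks share the backing array of the input slice, so no samples are copied; this matches how the original code re-slices `timestamps` and `values` in place.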


@@ -30,6 +30,9 @@ vminsert-linux-ppc64le-prod:
vminsert-linux-386-prod:
APP_NAME=vminsert $(MAKE) app-via-docker-linux-386
vminsert-linux-s390x-prod:
APP_NAME=vminsert $(MAKE) app-via-docker-linux-s390x
vminsert-freebsd-amd64-prod:
APP_NAME=vminsert $(MAKE) app-via-docker-freebsd-amd64


@@ -1,87 +0,0 @@
package clusternative
import (
"fmt"
"net"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/handshake"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/clusternative/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="clusternative"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vm_tenant_inserted_rows_total{type="clusternative"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="clusternative"}`)
)
// InsertHandler processes data from vminsert nodes.
func InsertHandler(c net.Conn) error {
// There is no need in response compression, since
// lower-level vminsert sends only small packets to upper-level vminsert.
bc, err := handshake.VMInsertServer(c, 0)
if err != nil {
if handshake.IsTCPHealthcheck(err) {
return nil
}
if handshake.IsTimeoutNetworkError(err) {
logger.Warnf("cannot complete vminsert handshake due to network timeout error with client %q: %s. "+
"If errors are transient and infrequent increase -rpc.handshakeTimeout and -vmstorageDialTimeout on client and server side. Check vminsert logs for errors", c.RemoteAddr(), err)
return nil
}
if handshake.IsClientNetworkError(err) {
logger.Warnf("cannot complete vminsert handshake due to network error with client %q: %s. "+
"Check vminsert logs for errors", c.RemoteAddr(), err)
return nil
}
return fmt.Errorf("cannot perform vminsert handshake with client %q: %w", c.RemoteAddr(), err)
}
return stream.Parse(bc, func(rows []storage.MetricRow) error {
return insertRows(rows)
}, nil)
}
func insertRows(rows []storage.MetricRow) error {
ctx := netstorage.GetInsertCtx()
defer netstorage.PutInsertCtx(ctx)
ctx.Reset() // This line is required for initializing ctx internals.
hasRelabeling := relabel.HasRelabeling()
var at auth.Token
var rowsPerTenant *metrics.Counter
var mn storage.MetricName
for i := range rows {
mr := &rows[i]
if err := mn.UnmarshalRaw(mr.MetricNameRaw); err != nil {
return fmt.Errorf("cannot unmarshal MetricNameRaw: %w", err)
}
if rowsPerTenant == nil || mn.AccountID != at.AccountID || mn.ProjectID != at.ProjectID {
at.AccountID = mn.AccountID
at.ProjectID = mn.ProjectID
rowsPerTenant = rowsTenantInserted.Get(&at)
}
ctx.Labels = ctx.Labels[:0]
ctx.AddLabelBytes(nil, mn.MetricGroup)
for j := range mn.Tags {
tag := &mn.Tags[j]
ctx.AddLabelBytes(tag.Key, tag.Value)
}
if !ctx.TryPrepareLabels(hasRelabeling) {
continue
}
if err := ctx.WriteDataPoint(&at, ctx.Labels, mr.Timestamp, mr.Value); err != nil {
return err
}
rowsPerTenant.Inc()
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ctx.FlushBufs()
}


@@ -0,0 +1,104 @@
package clusternative
import (
"crypto/tls"
"flag"
"fmt"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/metricsmetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/vminsertapi"
"github.com/VictoriaMetrics/metrics"
)
var (
vminsertConnsShutdownDuration = flag.Duration("clusternative.vminsertConnsShutdownDuration", 25*time.Second, "The time needed for gradual closing of upstream "+
"vminsert connections during graceful shutdown. Bigger duration reduces spikes in CPU, RAM and disk IO load on the remaining lower-level clusters "+
"during rolling restart. Smaller duration reduces the time needed to close all the upstream vminsert connections, thus reducing the time for graceful shutdown. "+
"See https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#improving-re-routing-performance-during-restart")
)
// NewVMinsertServer creates and starts a vminsert server at the given addr.
func NewVMinsertServer(addr string, tc *tls.Config) (*vminsertapi.VMInsertServer, error) {
api := &vminsertAPI{}
return vminsertapi.NewVMInsertServer(addr, *vminsertConnsShutdownDuration, "clusternative", api, tc)
}
type vminsertAPI struct {
}
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="clusternative"}`)
metadataInserted = metrics.NewCounter(`vm_metadata_inserted_total{type="clusternative"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vm_tenant_inserted_rows_total{type="clusternative"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="clusternative"}`)
)
// WriteRows implements lib/vminsertapi/API interface
func (v *vminsertAPI) WriteRows(rows []storage.MetricRow) error {
ctx := netstorage.GetInsertCtx()
defer netstorage.PutInsertCtx(ctx)
ctx.Reset() // This line is required for initializing ctx internals.
hasRelabeling := relabel.HasRelabeling()
var at auth.Token
var rowsPerTenant *metrics.Counter
var mn storage.MetricName
for i := range rows {
mr := &rows[i]
if err := mn.UnmarshalRaw(mr.MetricNameRaw); err != nil {
return fmt.Errorf("cannot unmarshal MetricNameRaw: %w", err)
}
if rowsPerTenant == nil || mn.AccountID != at.AccountID || mn.ProjectID != at.ProjectID {
at.AccountID = mn.AccountID
at.ProjectID = mn.ProjectID
rowsPerTenant = rowsTenantInserted.Get(&at)
}
ctx.Labels = ctx.Labels[:0]
ctx.AddLabelBytes(nil, mn.MetricGroup)
for j := range mn.Tags {
tag := &mn.Tags[j]
ctx.AddLabelBytes(tag.Key, tag.Value)
}
if !ctx.TryPrepareLabels(hasRelabeling) {
continue
}
if err := ctx.WriteDataPoint(&at, ctx.Labels, mr.Timestamp, mr.Value); err != nil {
return err
}
rowsPerTenant.Inc()
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ctx.FlushBufs()
}
// WriteMetadata implements lib/vminsertapi/API interface
func (v *vminsertAPI) WriteMetadata(mrs []metricsmetadata.Row) error {
ctx := netstorage.GetInsertCtx()
defer netstorage.PutInsertCtx(ctx)
ctx.ResetForMetricsMetadata() // This line is required for initializing ctx internals.
for i := range mrs {
row := &mrs[i]
ctx.Buf = row.MarshalTo(ctx.Buf[:0])
storageNodeIdx := ctx.GetStorageNodeIdxForMeta(ctx.Buf)
if err := ctx.WriteMetadataExt(storageNodeIdx, ctx.Buf); err != nil {
return err
}
}
metadataInserted.Add(len(mrs))
return ctx.FlushBufs()
}
// IsReadOnly implements lib/vminsertapi/API interface
func (v *vminsertAPI) IsReadOnly() bool {
return false
}


@@ -65,10 +65,10 @@ func insertRows(at *auth.Token, sketches []*datadogsketches.Sketch, extraLabels
continue
}
atLocal := ctx.GetLocalAuthToken(at)
ctx.MetricNameBuf = storage.MarshalMetricNameRaw(ctx.MetricNameBuf[:0], atLocal.AccountID, atLocal.ProjectID, ctx.Labels)
ctx.Buf = storage.MarshalMetricNameRaw(ctx.Buf[:0], atLocal.AccountID, atLocal.ProjectID, ctx.Labels)
storageNodeIdx := ctx.GetStorageNodeIdx(atLocal, ctx.Labels)
for _, p := range m.Points {
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.MetricNameBuf, p.Timestamp, p.Value); err != nil {
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.Buf, p.Timestamp, p.Value); err != nil {
return err
}
}


@@ -68,12 +68,12 @@ func insertRows(at *auth.Token, series []datadogv1.Series, extraLabels []prompb.
continue
}
atLocal := ctx.GetLocalAuthToken(at)
ctx.MetricNameBuf = storage.MarshalMetricNameRaw(ctx.MetricNameBuf[:0], atLocal.AccountID, atLocal.ProjectID, ctx.Labels)
ctx.Buf = storage.MarshalMetricNameRaw(ctx.Buf[:0], atLocal.AccountID, atLocal.ProjectID, ctx.Labels)
storageNodeIdx := ctx.GetStorageNodeIdx(atLocal, ctx.Labels)
for _, pt := range ss.Points {
timestamp := pt.Timestamp()
value := pt.Value()
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.MetricNameBuf, timestamp, value); err != nil {
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.Buf, timestamp, value); err != nil {
return err
}
}


@@ -71,12 +71,12 @@ func insertRows(at *auth.Token, series []datadogv2.Series, extraLabels []prompb.
continue
}
atLocal := ctx.GetLocalAuthToken(at)
ctx.MetricNameBuf = storage.MarshalMetricNameRaw(ctx.MetricNameBuf[:0], atLocal.AccountID, atLocal.ProjectID, ctx.Labels)
ctx.Buf = storage.MarshalMetricNameRaw(ctx.Buf[:0], atLocal.AccountID, atLocal.ProjectID, ctx.Labels)
storageNodeIdx := ctx.GetStorageNodeIdx(atLocal, ctx.Labels)
for _, pt := range ss.Points {
timestamp := pt.Timestamp * 1000
value := pt.Value
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.MetricNameBuf, timestamp, value); err != nil {
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.Buf, timestamp, value); err != nil {
return err
}
}


@@ -114,12 +114,12 @@ func insertRows(at *auth.Token, db string, rows []influx.Row, extraLabels []prom
continue
}
atLocal := ic.GetLocalAuthToken(at)
ic.MetricNameBuf = storage.MarshalMetricNameRaw(ic.MetricNameBuf[:0], atLocal.AccountID, atLocal.ProjectID, nil)
ic.Buf = storage.MarshalMetricNameRaw(ic.Buf[:0], atLocal.AccountID, atLocal.ProjectID, nil)
for i := range ic.Labels {
ic.MetricNameBuf = storage.MarshalMetricLabelRaw(ic.MetricNameBuf, &ic.Labels[i])
ic.Buf = storage.MarshalMetricLabelRaw(ic.Buf, &ic.Labels[i])
}
storageNodeIdx := ic.GetStorageNodeIdx(atLocal, ic.Labels)
if err := ic.WriteDataPointExt(storageNodeIdx, ic.MetricNameBuf, r.Timestamp, f.Value); err != nil {
if err := ic.WriteDataPointExt(storageNodeIdx, ic.Buf, r.Timestamp, f.Value); err != nil {
return err
}
perTenantRows[*atLocal]++
@@ -135,8 +135,8 @@ func insertRows(at *auth.Token, db string, rows []influx.Row, extraLabels []prom
}
}
atLocal := ic.GetLocalAuthToken(at)
ic.MetricNameBuf = storage.MarshalMetricNameRaw(ic.MetricNameBuf[:0], atLocal.AccountID, atLocal.ProjectID, ic.Labels)
metricNameBufLen := len(ic.MetricNameBuf)
ic.Buf = storage.MarshalMetricNameRaw(ic.Buf[:0], atLocal.AccountID, atLocal.ProjectID, ic.Labels)
metricNameBufLen := len(ic.Buf)
labelsLen := len(ic.Labels)
for j := range r.Fields {
f := &r.Fields[j]
@@ -151,10 +151,10 @@ func insertRows(at *auth.Token, db string, rows []influx.Row, extraLabels []prom
continue
}
}
ic.MetricNameBuf = ic.MetricNameBuf[:metricNameBufLen]
ic.MetricNameBuf = storage.MarshalMetricLabelRaw(ic.MetricNameBuf, &ic.Labels[len(ic.Labels)-1])
ic.Buf = ic.Buf[:metricNameBufLen]
ic.Buf = storage.MarshalMetricLabelRaw(ic.Buf, &ic.Labels[len(ic.Labels)-1])
storageNodeIdx := ic.GetStorageNodeIdx(atLocal, ic.Labels)
if err := ic.WriteDataPointExt(storageNodeIdx, ic.MetricNameBuf, r.Timestamp, f.Value); err != nil {
if err := ic.WriteDataPointExt(storageNodeIdx, ic.Buf, r.Timestamp, f.Value); err != nil {
return err
}
perTenantRows[*atLocal]++


@@ -4,7 +4,6 @@ import (
"flag"
"fmt"
"io"
"net"
"net/http"
"os"
"strings"
@@ -35,7 +34,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/influxutil"
clusternativeserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/clusternative"
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
influxserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/influx"
opentsdbserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdb"
@@ -46,6 +44,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/pushmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeserieslimits"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/vminsertapi"
)
var (
@@ -83,7 +82,7 @@ var (
)
var (
clusternativeServer *clusternativeserver.Server
clusternativeServer *vminsertapi.VMInsertServer
graphiteServer *graphiteserver.Server
influxServer *influxserver.Server
opentsdbServer *opentsdbserver.Server
@@ -123,9 +122,11 @@ func main() {
timeserieslimits.Init(*maxLabelsPerTimeseries, *maxLabelNameLen, *maxLabelValueLen)
protoparserutil.StartUnmarshalWorkers()
if len(*clusternativeListenAddr) > 0 {
clusternativeServer = clusternativeserver.MustStart(*clusternativeListenAddr, func(c net.Conn) error {
return clusternative.InsertHandler(c)
})
s, err := clusternative.NewVMinsertServer(*clusternativeListenAddr, nil)
if err != nil {
logger.Fatalf("cannot initialize vminsertapi server: %s", err)
}
clusternativeServer = s
}
if len(*graphiteListenAddr) > 0 {
graphiteServer = graphiteserver.MustStart(*graphiteListenAddr, *graphiteUseProxyProtocol, func(r io.Reader) error {


@@ -64,7 +64,7 @@ func insertRows(at *auth.Token, block *stream.Block, extraLabels []prompb.Label)
}
// use tenant info from data if it's a multi-tenant import.
atLocal := ctx.GetLocalAuthToken(at)
ctx.MetricNameBuf = storage.MarshalMetricNameRaw(ctx.MetricNameBuf[:0], atLocal.AccountID, atLocal.ProjectID, ctx.Labels)
ctx.Buf = storage.MarshalMetricNameRaw(ctx.Buf[:0], atLocal.AccountID, atLocal.ProjectID, ctx.Labels)
storageNodeIdx := ctx.GetStorageNodeIdx(atLocal, ctx.Labels)
values := block.Values
timestamps := block.Timestamps
@@ -73,7 +73,7 @@ func insertRows(at *auth.Token, block *stream.Block, extraLabels []prompb.Label)
}
for j, value := range values {
timestamp := timestamps[j]
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.MetricNameBuf, timestamp, value); err != nil {
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.Buf, timestamp, value); err != nil {
return err
}
}


@@ -1,58 +0,0 @@
package netstorage
import (
"github.com/cespare/xxhash/v2"
)
// See the following docs:
// - https://www.eecs.umich.edu/techreports/cse/96/CSE-TR-316-96.pdf
// - https://github.com/dgryski/go-rendezvous
// - https://dgryski.medium.com/consistent-hashing-algorithmic-tradeoffs-ef6b8e2fcae8
type consistentHash struct {
hashSeed uint64
nodeHashes []uint64
}
func newConsistentHash(nodes []string, hashSeed uint64) *consistentHash {
nodeHashes := make([]uint64, len(nodes))
for i, node := range nodes {
nodeHashes[i] = xxhash.Sum64([]byte(node))
}
return &consistentHash{
hashSeed: hashSeed,
nodeHashes: nodeHashes,
}
}
func (rh *consistentHash) getNodeIdx(h uint64, excludeIdxs []int) int {
var mMax uint64
var idx int
h ^= rh.hashSeed
if len(excludeIdxs) == len(rh.nodeHashes) {
// All the nodes are excluded. Treat this case as no nodes are excluded.
// This is better from a load-balancing PoV than selecting some static node.
excludeIdxs = nil
}
next:
for i, nh := range rh.nodeHashes {
for _, j := range excludeIdxs {
if i == j {
continue next
}
}
if m := fastHashUint64(nh ^ h); m > mMax {
mMax = m
idx = i
}
}
return idx
}
func fastHashUint64(x uint64) uint64 {
x ^= x >> 12 // a
x ^= x << 25 // b
x ^= x >> 27 // c
return x * 2685821657736338717
}
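
The removed consistentHash type above is a rendezvous (highest-random-weight) hash: every node gets a score for the key, and the node with the highest score wins. The following is a minimal, self-contained sketch of that idea, not the code that now lives in lib/consistenthash. The names rendezvousHash, hashString, mix and nodeIdx are hypothetical, and FNV-1a from the standard library is swapped in for xxhash so the example has no external dependencies.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// rendezvousHash is a hypothetical stand-in for the consistentHash type in the diff.
type rendezvousHash struct {
	seed       uint64
	nodeHashes []uint64
}

// hashString uses FNV-1a from the standard library.
// This is an assumption made for portability; the diff uses xxhash.Sum64.
func hashString(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

func newRendezvousHash(nodes []string, seed uint64) *rendezvousHash {
	nodeHashes := make([]uint64, len(nodes))
	for i, n := range nodes {
		nodeHashes[i] = hashString(n)
	}
	return &rendezvousHash{seed: seed, nodeHashes: nodeHashes}
}

// mix applies the same xorshift-multiply step as fastHashUint64 in the diff.
func mix(x uint64) uint64 {
	x ^= x >> 12
	x ^= x << 25
	x ^= x >> 27
	return x * 2685821657736338717
}

func isExcluded(exclude []int, i int) bool {
	for _, j := range exclude {
		if i == j {
			return true
		}
	}
	return false
}

// nodeIdx returns the index of the node with the highest score for key,
// skipping excluded indexes. It returns -1 if there are no nodes at all.
func (rh *rendezvousHash) nodeIdx(key uint64, exclude []int) int {
	key ^= rh.seed
	if len(exclude) == len(rh.nodeHashes) {
		// Every node is excluded, so fall back to considering all of them.
		exclude = nil
	}
	best := -1
	var bestScore uint64
	for i, nh := range rh.nodeHashes {
		if isExcluded(exclude, i) {
			continue
		}
		if score := mix(nh ^ key); best < 0 || score > bestScore {
			best, bestScore = i, score
		}
	}
	return best
}

func main() {
	rh := newRendezvousHash([]string{"node1", "node2", "node3", "node4"}, 0)
	moved := 0
	for k := uint64(0); k < 1000; k++ {
		if rh.nodeIdx(k, nil) != rh.nodeIdx(k, []int{1}) {
			moved++
		}
	}
	fmt.Println("keys that moved after excluding node 1:", moved)
}
```

The property that matters for re-routing is that excluding a node only moves the keys that were assigned to that node; all other keys keep their original index.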


@@ -1,65 +0,0 @@
package netstorage
import (
"math"
"math/rand"
"testing"
)
func TestConsistentHash(t *testing.T) {
r := rand.New(rand.NewSource(1))
nodes := []string{
"node1",
"node2",
"node3",
"node4",
}
rh := newConsistentHash(nodes, 0)
keys := make([]uint64, 100000)
for i := 0; i < len(keys); i++ {
keys[i] = r.Uint64()
}
perIdxCounts := make([]int, len(nodes))
keyIndexes := make([]int, len(keys))
for i, k := range keys {
idx := rh.getNodeIdx(k, nil)
perIdxCounts[idx]++
keyIndexes[i] = idx
}
// verify that the number of selected node indexes per each node is roughly the same
expectedPerIdxCount := float64(len(keys)) / float64(len(nodes))
for _, perIdxCount := range perIdxCounts {
if p := math.Abs(float64(perIdxCount)-expectedPerIdxCount) / expectedPerIdxCount; p > 0.005 {
t.Fatalf("uneven number of per-index items %f: %d", p, perIdxCounts)
}
}
// Ignore a single node and verify that the selection for the remaining nodes is even
perIdxCounts = make([]int, len(nodes))
idxsExclude := []int{1}
indexMismatches := 0
for i, k := range keys {
idx := rh.getNodeIdx(k, idxsExclude)
perIdxCounts[idx]++
if keyIndexes[i] != idx {
indexMismatches++
}
}
maxIndexMismatches := float64(len(keys)) / float64(len(nodes))
if float64(indexMismatches) > maxIndexMismatches {
t.Fatalf("too many index mismatches after excluding a node; got %d; want no more than %f", indexMismatches, maxIndexMismatches)
}
expectedPerIdxCount = float64(len(keys)) / float64(len(nodes)-1)
for i, perIdxCount := range perIdxCounts {
if i == idxsExclude[0] {
if perIdxCount != 0 {
t.Fatalf("unexpected non-zero items for excluded index %d: %d items", idxsExclude[0], perIdxCount)
}
continue
}
if p := math.Abs(float64(perIdxCount)-expectedPerIdxCount) / expectedPerIdxCount; p > 0.005 {
t.Fatalf("uneven number of per-index items %f: %d", p, perIdxCounts)
}
}
}


@@ -1,40 +0,0 @@
package netstorage
import (
"math/rand"
"sync/atomic"
"testing"
)
func BenchmarkConsistentHash(b *testing.B) {
nodes := []string{
"node1",
"node2",
"node3",
"node4",
}
rh := newConsistentHash(nodes, 0)
b.ReportAllocs()
b.SetBytes(int64(len(benchKeys)))
b.RunParallel(func(pb *testing.PB) {
sum := 0
for pb.Next() {
for _, k := range benchKeys {
idx := rh.getNodeIdx(k, nil)
sum += idx
}
}
BenchSink.Add(uint64(sum))
})
}
var benchKeys = func() []uint64 {
r := rand.New(rand.NewSource(1))
keys := make([]uint64, 10000)
for i := 0; i < len(keys); i++ {
keys[i] = r.Uint64()
}
return keys
}()
var BenchSink atomic.Uint64


@@ -5,6 +5,8 @@ import (
"net/http"
"strconv"
"github.com/cespare/xxhash/v2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
@@ -12,21 +14,23 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/metricsmetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeserieslimits"
"github.com/cespare/xxhash/v2"
)
// InsertCtx is a generic context for inserting data.
//
// InsertCtx.Reset must be called before the first usage.
// InsertCtx.Reset or InsertCtx.ResetForMetricsMetadata must be called before the first usage.
type InsertCtx struct {
snb *storageNodesBucket
Labels sortedLabels
MetricNameBuf []byte
snb *storageNodesBucket
Labels sortedLabels
Buf []byte
bufRowss []bufRows
labelsBuf []byte
getRowHasher func() rowHasher
relabelCtx relabel.Ctx
at auth.Token
@@ -42,9 +46,9 @@ func (br *bufRows) reset() {
br.rows = 0
}
func (br *bufRows) pushTo(snb *storageNodesBucket, sn *storageNode) error {
func (br *bufRows) pushTo(snb *storageNodesBucket, sn *storageNode, getRowHasher func() rowHasher) error {
bufLen := len(br.buf)
err := sn.push(snb, br.buf, br.rows)
err := sn.push(snb, br.buf, br.rows, getRowHasher)
br.reset()
if err != nil {
return &httpserver.ErrorWithStatusCode{
@@ -58,14 +62,25 @@ func (br *bufRows) pushTo(snb *storageNodesBucket, sn *storageNode) error {
// Reset resets ctx.
func (ctx *InsertCtx) Reset() {
ctx.snb = getStorageNodesBucket()
ctx.getRowHasher = getMetricRowHasher
ctx.reset()
}
// ResetForMetricsMetadata resets ctx and prepares it for metrics metadata ingestion
func (ctx *InsertCtx) ResetForMetricsMetadata() {
ctx.snb = getMetadataStorageNodesBucket()
ctx.getRowHasher = getMetadataRowHasher
ctx.reset()
}
func (ctx *InsertCtx) reset() {
labels := ctx.Labels
for i := range labels {
labels[i] = prompb.Label{}
}
ctx.Labels = labels[:0]
ctx.MetricNameBuf = ctx.MetricNameBuf[:0]
ctx.Buf = ctx.Buf[:0]
if ctx.bufRowss == nil || len(ctx.bufRowss) != len(ctx.snb.sns) {
ctx.bufRowss = make([]bufRows, len(ctx.snb.sns))
@@ -73,6 +88,7 @@ func (ctx *InsertCtx) Reset() {
for i := range ctx.bufRowss {
ctx.bufRowss[i].reset()
}
ctx.labelsBuf = ctx.labelsBuf[:0]
ctx.relabelCtx.Reset()
ctx.at.Set(0, 0)
@@ -123,9 +139,9 @@ func (ctx *InsertCtx) applyRelabeling() {
//
// caller must invoke TryPrepareLabels before using this function
func (ctx *InsertCtx) WriteDataPoint(at *auth.Token, labels []prompb.Label, timestamp int64, value float64) error {
ctx.MetricNameBuf = storage.MarshalMetricNameRaw(ctx.MetricNameBuf[:0], at.AccountID, at.ProjectID, labels)
ctx.Buf = storage.MarshalMetricNameRaw(ctx.Buf[:0], at.AccountID, at.ProjectID, labels)
storageNodeIdx := ctx.GetStorageNodeIdx(at, labels)
return ctx.WriteDataPointExt(storageNodeIdx, ctx.MetricNameBuf, timestamp, value)
return ctx.WriteDataPointExt(storageNodeIdx, ctx.Buf, timestamp, value)
}
// WriteDataPointExt writes the given metricNameRaw with (timestamp, value) to ctx buffer with the given storageNodeIdx.
@@ -138,7 +154,7 @@ func (ctx *InsertCtx) WriteDataPointExt(storageNodeIdx int, metricNameRaw []byte
bufNew := storage.MarshalMetricRow(br.buf, metricNameRaw, timestamp, value)
if len(bufNew) >= maxBufSizePerStorageNode {
// Send buf to sn, since it is too big.
if err := br.pushTo(snb, sn); err != nil {
if err := br.pushTo(snb, sn, getMetricRowHasher); err != nil {
return err
}
br.buf = storage.MarshalMetricRow(bufNew[:0], metricNameRaw, timestamp, value)
@@ -149,6 +165,58 @@ func (ctx *InsertCtx) WriteDataPointExt(storageNodeIdx int, metricNameRaw []byte
return nil
}
// WriteMetadata writes the given MetricMetadata to the storage buffer
func (ctx *InsertCtx) WriteMetadata(at *auth.Token, m *prompb.MetricMetadata) error {
mdr := metricsmetadata.Row{
Type: m.Type,
MetricFamilyName: []byte(m.MetricFamilyName),
Unit: []byte(m.Unit),
Help: []byte(m.Help),
}
if at != nil {
mdr.AccountID = at.AccountID
mdr.ProjectID = at.ProjectID
}
ctx.Buf = mdr.MarshalTo(ctx.Buf[:0])
storageNodeIdx := ctx.GetStorageNodeIdxForMeta(ctx.Buf)
return ctx.WriteMetadataExt(storageNodeIdx, ctx.Buf)
}
// WriteMetadataExt writes the given buffer to the storage buffer by provided storageNodeIdx
func (ctx *InsertCtx) WriteMetadataExt(storageNodeIdx int, buf []byte) error {
br := &ctx.bufRowss[storageNodeIdx]
snb := ctx.snb
sn := snb.sns[storageNodeIdx]
bufNew := append(br.buf, buf...)
if len(bufNew) >= maxBufSizePerStorageNode {
// Send metaBuf to sn, since it is too big.
if err := br.pushTo(snb, sn, getMetadataRowHasher); err != nil {
return err
}
br.buf = append(bufNew[:0], buf...)
} else {
br.buf = bufNew
}
br.rows++
return nil
}
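
WriteMetadataExt and WriteDataPointExt share one buffering pattern: append the row to the per-node buffer, and if the result reaches the size limit, flush what was buffered before the row and start a new buffer with it. A minimal sketch of that pattern follows. The batcher type, its fields and the flush callback are hypothetical names for illustration; the real code pushes through bufRows.pushTo.

```go
package main

import "fmt"

// batcher is a hypothetical, simplified analogue of bufRows in the diff.
type batcher struct {
	buf   []byte
	rows  int
	limit int
	// flush receives the rows buffered so far. It must not retain buf,
	// because the backing array is reused after the call returns.
	flush func(buf []byte, rows int) error
}

// add appends row to the buffer, flushing the previously buffered rows
// first when the buffer would reach the limit.
func (b *batcher) add(row []byte) error {
	bufNew := append(b.buf, row...)
	if len(bufNew) >= b.limit {
		// b.buf still holds only the rows added before this one.
		if err := b.flush(b.buf, b.rows); err != nil {
			return err
		}
		b.rows = 0
		// Reuse the backing array and start the new batch with row.
		b.buf = append(bufNew[:0], row...)
	} else {
		b.buf = bufNew
	}
	b.rows++
	return nil
}

func main() {
	b := &batcher{
		limit: 8,
		flush: func(buf []byte, rows int) error {
			fmt.Printf("flushing %d rows: %q\n", rows, buf)
			return nil
		},
	}
	for _, r := range []string{"aaa", "bbb", "ccc"} {
		if err := b.add([]byte(r)); err != nil {
			fmt.Println("flush failed:", err)
			return
		}
	}
	fmt.Printf("still buffered: %d rows: %q\n", b.rows, b.buf)
}
```

As in the diff, a single row that is already at or above the limit triggers a flush of the rows buffered before it (possibly none) and then becomes the first row of the next batch.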
// GetStorageNodeIdxForMeta returns the storage node index for the given buffer
//
// The returned index must be passed to WriteMetadataExt.
func (ctx *InsertCtx) GetStorageNodeIdxForMeta(buf []byte) int {
if len(ctx.snb.sns) == 1 {
// Fast path - only a single storage node.
return 0
}
h := xxhash.Sum64(buf)
// Do not exclude unavailable storage nodes in order to properly account for rerouted rows in storageNode.push().
idx := ctx.snb.nodesHash.GetNodeIdx(h, nil)
return idx
}
// FlushBufs flushes ctx bufs to remote storage nodes.
func (ctx *InsertCtx) FlushBufs() error {
var firstErr error
@@ -159,10 +227,11 @@ func (ctx *InsertCtx) FlushBufs() error {
if len(br.buf) == 0 {
continue
}
if err := br.pushTo(snb, sns[i]); err != nil && firstErr == nil {
if err := br.pushTo(snb, sns[i], ctx.getRowHasher); err != nil && firstErr == nil {
firstErr = err
}
}
return firstErr
}
@@ -187,7 +256,7 @@ func (ctx *InsertCtx) GetStorageNodeIdx(at *auth.Token, labels []prompb.Label) i
ctx.labelsBuf = buf
// Do not exclude unavailable storage nodes in order to properly account for rerouted rows in storageNode.push().
idx := ctx.snb.nodesHash.getNodeIdx(h, nil)
idx := ctx.snb.nodesHash.GetNodeIdx(h, nil)
return idx
}
@@ -235,6 +304,18 @@ func (ctx *InsertCtx) GetLocalAuthToken(at *auth.Token) *auth.Token {
return &ctx.at
}
// GetLocalAuthTokenForMetadata obtains auth.Token from given metrics metadata if at is nil.
//
// At is returned as is if it isn't nil.
func (ctx *InsertCtx) GetLocalAuthTokenForMetadata(at *auth.Token, mm *prompb.MetricMetadata) *auth.Token {
if at != nil {
return at
}
ctx.at.Set(mm.AccountID, mm.ProjectID)
return &ctx.at
}
func parseUint32(s string) uint32 {
n, err := strconv.ParseUint(s, 10, 32)
if err != nil {
@@ -259,3 +340,27 @@ func (ctx *InsertCtx) TryPrepareLabels(hasRelabeling bool) bool {
ctx.SortLabelsIfNeeded()
return true
}
type rowHasher func(src []byte) (h uint64, tail []byte, err error)
func getMetricRowHasher() rowHasher {
var mr storage.MetricRow
return func(src []byte) (h uint64, tail []byte, err error) {
mr.ResetX()
tail, err = mr.UnmarshalX(src)
return xxhash.Sum64(mr.MetricNameRaw), tail, err
}
}
func getMetadataRowHasher() rowHasher {
var mm metricsmetadata.Row
return func(src []byte) (h uint64, tail []byte, err error) {
mm.Reset()
tail, err = mm.Unmarshal(src)
if err != nil {
return
}
return xxhash.Sum64(mm.MetricFamilyName), tail, err
}
}
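
The rowHasher type above has the shape of a parser step: it consumes one row from the front of src, returns the hash of its routing key plus the unparsed tail, and keeps its scratch state in the closure so repeated calls do not allocate a new row. The sketch below shows how such a hasher could be driven over a whole buffer. The 2-byte length-prefixed row format, the FNV-1a hash and the names appendRow, newKeyHasher and hashRows are assumptions made for this example; they are not the storage.MetricRow or metricsmetadata.Row encodings.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/fnv"
)

// rowHasher matches the signature introduced in the diff.
type rowHasher func(src []byte) (h uint64, tail []byte, err error)

// appendRow encodes key with a hypothetical 2-byte big-endian length prefix.
func appendRow(dst, key []byte) []byte {
	dst = append(dst, byte(len(key)>>8), byte(len(key)))
	return append(dst, key...)
}

// newKeyHasher returns a hasher that reuses one scratch buffer across calls,
// in the same spirit as getMetricRowHasher and getMetadataRowHasher.
func newKeyHasher() rowHasher {
	var key []byte
	return func(src []byte) (uint64, []byte, error) {
		if len(src) < 2 {
			return 0, src, errors.New("cannot read row length")
		}
		n := int(binary.BigEndian.Uint16(src))
		src = src[2:]
		if len(src) < n {
			return 0, src, errors.New("row is truncated")
		}
		key = append(key[:0], src[:n]...)
		h := fnv.New64a()
		h.Write(key)
		return h.Sum64(), src[n:], nil
	}
}

// hashRows walks buf row by row and collects the per-row hashes.
func hashRows(buf []byte, getRowHasher func() rowHasher) ([]uint64, error) {
	hasher := getRowHasher()
	var hashes []uint64
	for len(buf) > 0 {
		h, tail, err := hasher(buf)
		if err != nil {
			return hashes, err
		}
		hashes = append(hashes, h)
		buf = tail
	}
	return hashes, nil
}

func main() {
	var buf []byte
	for _, k := range []string{"cpu", "mem", "cpu"} {
		buf = appendRow(buf, []byte(k))
	}
	hashes, err := hashRows(buf, newKeyHasher)
	fmt.Println(len(hashes), "rows hashed, err:", err)
}
```

Passing a factory (func() rowHasher) rather than a hasher value gives each caller its own scratch state, which is presumably why pushTo and the re-routing functions in the diff take getRowHasher.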


@@ -4,18 +4,16 @@ import (
"errors"
"flag"
"fmt"
"io"
"net"
"sort"
"sync"
"sync/atomic"
"time"
"github.com/VictoriaMetrics/metrics"
"github.com/cespare/xxhash/v2"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/consistenthash"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/consts"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/handshake"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -24,11 +22,12 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timerpool"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/vminsertapi"
)
var (
disableRPCCompression = flag.Bool("rpc.disableCompression", false, "Whether to disable compression for the data sent from vminsert to vmstorage. This reduces CPU usage at the cost of higher network bandwidth usage")
replicationFactor = flag.Int("replicationFactor", 1, "Replication factor for the ingested data, i.e. how many copies to make among distinct -storageNode instances. "+
disableCompression = flag.Bool("rpc.disableCompression", false, "This flag is deprecated and is kept for backward compatibility. vminsert performs per-block compression instead of streaming compression on the RPC connection")
replicationFactor = flag.Int("replicationFactor", 1, "Replication factor for the ingested data, i.e. how many copies to make among distinct -storageNode instances. "+
"Note that vmselect must run with -dedup.minScrapeInterval=1ms for data de-duplication when replicationFactor is greater than 1. "+
"Higher values for -dedup.minScrapeInterval at vmselect is OK")
disableRerouting = flag.Bool("disableRerouting", true, "Whether to disable re-routing when some of vmstorage nodes accept incoming data at slower speed compared to other storage nodes. Disabled re-routing limits the ingestion rate by the slowest vmstorage node. On the other side, disabled re-routing minimizes the number of active time series in the cluster during rolling restarts and during spikes in series churn rate. See also -disableReroutingOnUnavailable and -dropSamplesOnOverload")
@@ -46,7 +45,7 @@ var (
"See also -disableRerouting")
)
var errStorageReadOnly = errors.New("storage node is read only")
const unsupportedRPCRetrySeconds = 120
func (sn *storageNode) isReady() bool {
return !sn.isBroken.Load() && !sn.isReadOnly.Load()
@@ -62,7 +61,7 @@ func (sn *storageNode) isReady() bool {
// if sn is currently unavailable or overloaded.
//
// rows must match the number of rows in the buf.
func (sn *storageNode) push(snb *storageNodesBucket, buf []byte, rows int) error {
func (sn *storageNode) push(snb *storageNodesBucket, buf []byte, rows int, getRowHasher func() rowHasher) error {
if len(buf) > maxBufSizePerStorageNode {
logger.Panicf("BUG: len(buf)=%d cannot exceed %d", len(buf), maxBufSizePerStorageNode)
}
@@ -71,14 +70,14 @@ func (sn *storageNode) push(snb *storageNodesBucket, buf []byte, rows int) error
// Fast path - the buffer is successfully sent to sn.
return nil
}
if *dropSamplesOnOverload && !sn.isReadOnly.Load() {
if sn.dropRowsOnOverload && !sn.isReadOnly.Load() {
sn.rowsDroppedOnOverload.Add(rows)
dropSamplesOnOverloadLogger.Warnf("some rows dropped, because -dropSamplesOnOverload is set and vmstorage %s cannot accept new rows now. "+
dropSamplesOnOverloadLogger.Warnf("some rows are dropped, because -dropSamplesOnOverload is set and vmstorage %s cannot accept new rows now. "+
"See vm_rpc_rows_dropped_on_overload_total metric at /metrics page", sn.dialer.Addr())
return nil
}
// Slow path - sn cannot accept buf now, so re-route it to other vmstorage nodes.
if err := sn.rerouteBufToOtherStorageNodes(snb, buf, rows); err != nil {
if err := sn.rerouteBufToOtherStorageNodes(snb, buf, rows, getRowHasher); err != nil {
return fmt.Errorf("error when re-routing rows from %s: %w", sn.dialer.Addr(), err)
}
return nil
@@ -86,7 +85,7 @@ func (sn *storageNode) push(snb *storageNodesBucket, buf []byte, rows int) error
var dropSamplesOnOverloadLogger = logger.WithThrottler("droppedSamplesOnOverload", 5*time.Second)
func (sn *storageNode) rerouteBufToOtherStorageNodes(snb *storageNodesBucket, buf []byte, rows int) error {
func (sn *storageNode) rerouteBufToOtherStorageNodes(snb *storageNodesBucket, buf []byte, rows int, getRowHasher func() rowHasher) error {
sns := snb.sns
sn.brLock.Lock()
again:
@@ -104,14 +103,14 @@ again:
goto again
}
if *disableReroutingOnUnavailable {
// We should not send timeseries from currently unavailable storage to alive storage nodes.
// We should not send rows from currently unavailable storage to alive storage nodes.
sn.brCond.Wait()
goto again
}
sn.brLock.Unlock()
// The vmstorage node isn't ready for data processing. Re-route buf to healthy vmstorage nodes even if disableRerouting is set.
rowsProcessed, err := rerouteRowsToReadyStorageNodes(snb, sn, buf)
rowsProcessed, err := rerouteRowsToReadyStorageNodes(snb, sn, buf, getRowHasher)
rows -= rowsProcessed
if err != nil {
return fmt.Errorf("%d rows dropped because the current vmstorage is unavailable and %w", rows, err)
@@ -132,7 +131,7 @@ again:
goto again
}
sn.brLock.Unlock()
rowsProcessed, err := rerouteRowsToFreeStorageNodes(snb, sn, buf)
rowsProcessed, err := rerouteRowsToFreeStorageNodes(snb, sn, buf, getRowHasher)
rows -= rowsProcessed
if err != nil {
return fmt.Errorf("%d rows dropped because the current vmstorage buf is full and %w", rows, err)
@@ -171,7 +170,7 @@ func (sn *storageNode) run(snb *storageNodesBucket, snIdx int) {
select {
case <-sn.stopCh:
mustStop = true
// Make sure the br.buf is flushed last time before returning
// Make sure the br bufs are flushed last time before returning
// in order to send the remaining bits of data.
case <-ticker.C:
}
@@ -268,6 +267,12 @@ func (sn *storageNode) checkHealth() {
// The sn looks healthy.
return
}
if deadline := sn.rpcIsNotSupportedDeadline.Load(); deadline > 0 {
if deadline > fasttime.UnixTimestamp() {
// do not attempt to re-connect
return
}
}
bc, err := sn.dial()
if err != nil {
sn.isBroken.Store(true)
@@ -300,16 +305,25 @@ func (sn *storageNode) sendBufRowsNonblocking(br *bufRows) bool {
// sn.dial() should be called by sn.checkHealth() on unsuccessful call to sendBufToReplicasNonblocking().
return false
}
startTime := time.Now()
err := sendToConn(sn.bc, br.buf)
var err error
if sn.bc.IsLegacy {
err = vminsertapi.SendToConn(sn.bc, br.buf)
} else {
err = vminsertapi.SendRPCRequestToConn(sn.bc, sn.rpcCall.VersionedName, br.buf)
}
duration := time.Since(startTime)
sn.sendDurationSeconds.Add(duration.Seconds())
if err == nil {
if deadline := sn.rpcIsNotSupportedDeadline.Load(); deadline > 0 {
sn.rpcIsNotSupportedDeadline.Store(0)
}
// Successfully sent buf to bc.
sn.rowsSent.Add(br.rows)
return true
}
if errors.Is(err, errStorageReadOnly) {
if errors.Is(err, storage.ErrReadOnly) {
// The vmstorage is transitioned to readonly mode.
sn.isReadOnly.Store(true)
sn.brCond.Broadcast()
@@ -317,6 +331,9 @@ func (sn *storageNode) sendBufRowsNonblocking(br *bufRows) bool {
// so it will be re-routed to the remaining vmstorage nodes.
return false
}
if errors.Is(err, vminsertapi.ErrRpcIsNotSupported) {
sn.rpcIsNotSupportedDeadline.Store(unsupportedRPCRetrySeconds + fasttime.UnixTimestamp())
}
// Couldn't flush buf to sn. Mark sn as broken.
cannotSendBufsLogger.Warnf("cannot send %d bytes with %d rows to -storageNode=%q: %s; closing the connection to storageNode and "+
"re-routing this data to healthy storage nodes", len(br.buf), br.rows, sn.dialer.Addr(), err)
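
The hunks above add a back-off for storage nodes that do not support the requested RPC: on ErrRpcIsNotSupported a deadline is stored, checkHealth skips re-dialing until the deadline passes, and a successful send clears it. A minimal sketch of that state machine follows. The rpcGate type and its method names are hypothetical, and the current time is passed in explicitly so the example stays deterministic (the diff reads it from fasttime.UnixTimestamp()).

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// retrySeconds mirrors unsupportedRPCRetrySeconds from the diff.
const retrySeconds = 120

// rpcGate is a hypothetical wrapper around the deadline stored in storageNode.
type rpcGate struct {
	// retryAfter holds a unix timestamp in seconds; 0 means no back-off is active.
	retryAfter atomic.Uint64
}

// allowed reports whether a re-connect attempt may be made at time now.
func (g *rpcGate) allowed(now uint64) bool {
	deadline := g.retryAfter.Load()
	return deadline == 0 || deadline <= now
}

// markUnsupported starts a back-off window beginning at now.
func (g *rpcGate) markUnsupported(now uint64) {
	g.retryAfter.Store(now + retrySeconds)
}

// markSuccess clears the back-off after a successful send.
func (g *rpcGate) markSuccess() {
	g.retryAfter.Store(0)
}

func main() {
	var g rpcGate
	g.markUnsupported(1000)
	fmt.Println("retry allowed at t=1060:", g.allowed(1060))
	fmt.Println("retry allowed at t=1120:", g.allowed(1120))
	g.markSuccess()
	fmt.Println("retry allowed after success:", g.allowed(1001))
}
```

Keeping the deadline in a single atomic value lets the health checker and the sender update it without taking the storage node lock.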
@@ -334,74 +351,26 @@ var cannotCloseStorageNodeConnLogger = logger.WithThrottler("cannotCloseStorageN
var cannotSendBufsLogger = logger.WithThrottler("cannotSendBufRows", 5*time.Second)
func sendToConn(bc *handshake.BufferedConn, buf []byte) error {
// if len(buf) == 0, it must be sent to the vmstorage too in order to check for vmstorage health
// See checkReadOnlyMode() and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4870
timeoutSeconds := len(buf) / 3e5
if timeoutSeconds < 60 {
timeoutSeconds = 60
}
timeout := time.Duration(timeoutSeconds) * time.Second
deadline := time.Now().Add(timeout)
if err := bc.SetWriteDeadline(deadline); err != nil {
return fmt.Errorf("cannot set write deadline to %s: %w", deadline, err)
}
// sizeBuf guarantees that the rows batch will be either fully
// read or fully discarded on the vmstorage side.
// sizeBuf is used for read optimization in vmstorage.
sizeBuf := sizeBufPool.Get()
defer sizeBufPool.Put(sizeBuf)
sizeBuf.B = encoding.MarshalUint64(sizeBuf.B[:0], uint64(len(buf)))
if _, err := bc.Write(sizeBuf.B); err != nil {
return fmt.Errorf("cannot write data size %d: %w", len(buf), err)
}
if _, err := bc.Write(buf); err != nil {
return fmt.Errorf("cannot write data with size %d: %w", len(buf), err)
}
if err := bc.Flush(); err != nil {
return fmt.Errorf("cannot flush data with size %d: %w", len(buf), err)
}
// Wait for `ack` from vmstorage.
// This guarantees that the message has been fully received by vmstorage.
deadline = time.Now().Add(timeout)
if err := bc.SetReadDeadline(deadline); err != nil {
return fmt.Errorf("cannot set read deadline for reading `ack` to vmstorage: %w", err)
}
if _, err := io.ReadFull(bc, sizeBuf.B[:1]); err != nil {
return fmt.Errorf("cannot read `ack` from vmstorage: %w", err)
}
ackResp := sizeBuf.B[0]
switch ackResp {
case 1:
// ok response, data successfully accepted by vmstorage
case 2:
// vmstorage is in readonly mode
return errStorageReadOnly
default:
return fmt.Errorf("unexpected `ack` received from vmstorage; got %d; want 1 or 2", sizeBuf.B[0])
}
return nil
}
var sizeBufPool bytesutil.ByteBufferPool
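
The removed sendToConn above implements a small framing protocol: an 8-byte size prefix, the payload, then a 1-byte ack from vmstorage (1 means accepted, 2 means read-only). The sketch below reproduces that exchange over an in-memory net.Pipe. It is an illustration only: sendFrame, serveOnce, roundTrip and errReadOnly are hypothetical names, big-endian encoding of the size is an assumption, and the read and write deadlines of the original are omitted.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"io"
	"net"
)

var errReadOnly = errors.New("storage node is read only")

// sendFrame writes a size-prefixed payload and waits for a 1-byte ack.
func sendFrame(c io.ReadWriter, buf []byte) error {
	var hdr [8]byte
	binary.BigEndian.PutUint64(hdr[:], uint64(len(buf)))
	if _, err := c.Write(hdr[:]); err != nil {
		return fmt.Errorf("cannot write data size %d: %w", len(buf), err)
	}
	if len(buf) > 0 {
		if _, err := c.Write(buf); err != nil {
			return fmt.Errorf("cannot write data with size %d: %w", len(buf), err)
		}
	}
	var ack [1]byte
	if _, err := io.ReadFull(c, ack[:]); err != nil {
		return fmt.Errorf("cannot read ack: %w", err)
	}
	switch ack[0] {
	case 1:
		return nil
	case 2:
		return errReadOnly
	default:
		return fmt.Errorf("unexpected ack %d; want 1 or 2", ack[0])
	}
}

// serveOnce plays the storage side: it reads one frame and replies with ack.
func serveOnce(c io.ReadWriter, ack byte) ([]byte, error) {
	var hdr [8]byte
	if _, err := io.ReadFull(c, hdr[:]); err != nil {
		return nil, err
	}
	payload := make([]byte, binary.BigEndian.Uint64(hdr[:]))
	if _, err := io.ReadFull(c, payload); err != nil {
		return nil, err
	}
	_, err := c.Write([]byte{ack})
	return payload, err
}

// roundTrip sends payload through an in-memory connection and returns
// what the server received together with the client-side error.
func roundTrip(payload []byte, ack byte) ([]byte, error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()
	got := make(chan []byte, 1)
	go func() {
		p, _ := serveOnce(server, ack)
		got <- p
	}()
	err := sendFrame(client, payload)
	return <-got, err
}

func main() {
	got, err := roundTrip([]byte("row1row2"), 1)
	fmt.Printf("server got %q, client err: %v\n", got, err)
	_, err = roundTrip([]byte("row3"), 2)
	fmt.Println("read-only ack maps to:", err)
}
```

The size prefix lets the receiver read or discard a whole batch at once, and the ack lets the sender tell a read-only node apart from a broken connection.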
func (sn *storageNode) dial() (*handshake.BufferedConn, error) {
c, err := sn.dialer.Dial()
if err != nil {
sn.dialErrors.Inc()
return nil, err
compression := 1
if *disableCompression {
compression = 0
}
compressionLevel := 1
if *disableRPCCompression {
compressionLevel = 0
}
bc, err := handshake.VMInsertClient(c, compressionLevel)
var dialError error
bc, err := handshake.VMInsertClientWithDialer(func() (net.Conn, error) {
c, err := sn.dialer.Dial()
if err != nil {
dialError = err
sn.dialErrors.Inc()
return nil, err
}
return c, nil
}, compression)
if err != nil {
_ = c.Close()
if dialError != nil {
return nil, dialError
}
sn.handshakeErrors.Inc()
return nil, fmt.Errorf("handshake error: %w", err)
}
@@ -410,10 +379,22 @@ func (sn *storageNode) dial() (*handshake.BufferedConn, error) {
// storageNode is a client sending data to vmstorage node.
type storageNode struct {
// rpc defines RPC method to push data from br
rpc vminsertapi.RPCCall
// rpcIsNotSupportedDeadline defines a timeout for the next storage rpc call
// if the given rpc version is not supported by storage server
rpcIsNotSupportedDeadline atomic.Uint64
// dropRowsOnOverload defines whether to drop rows from br due to storage overload
dropRowsOnOverload bool
// isBroken is set to true if the given vmstorage node is temporarily unhealthy.
// In this case the data is re-routed to the remaining healthy vmstorage nodes.
isBroken atomic.Bool
rpcCall vminsertapi.RPCCall
// isReadOnly is set to true if the given vmstorage node is read only
// In this case the data is re-routed to the remaining healthy vmstorage nodes.
isReadOnly atomic.Bool
@@ -480,7 +461,7 @@ type storageNodesBucket struct {
ms *metrics.Set
// nodesHash is used for consistently selecting a storage node by key.
nodesHash *consistentHash
nodesHash *consistenthash.ConsistentHash
// sns is a list of storage nodes.
sns []*storageNode
@@ -489,8 +470,19 @@ type storageNodesBucket struct {
wg *sync.WaitGroup
}
// storageNodes contains a list of vmstorage node clients.
var storageNodes atomic.Pointer[storageNodesBucket]
// storageNodes and metadataStorageNodes contain lists of vmstorage node clients.
var (
storageNodes atomic.Pointer[storageNodesBucket]
metadataStorageNodes atomic.Pointer[storageNodesBucket]
)
func getMetadataStorageNodesBucket() *storageNodesBucket {
return metadataStorageNodes.Load()
}
func setMetadataStorageNodesBucket(snb *storageNodesBucket) {
metadataStorageNodes.Store(snb)
}
func getStorageNodesBucket() *storageNodesBucket {
return storageNodes.Load()
@@ -506,17 +498,22 @@ func setStorageNodesBucket(snb *storageNodesBucket) {
//
// Call MustStop when the initialized vmstorage connections are no longer needed.
func Init(addrs []string, hashSeed uint64) {
snb := initStorageNodes(addrs, hashSeed)
snb := initStorageNodes(addrs, vminsertapi.MetricRowsRpcCall, hashSeed)
setStorageNodesBucket(snb)
metadataSnb := initStorageNodes(addrs, vminsertapi.MetricMetadataRpcCall, hashSeed)
setMetadataStorageNodesBucket(metadataSnb)
}
// MustStop stops netstorage.
func MustStop() {
snb := getStorageNodesBucket()
mustStopStorageNodes(snb)
metadataSnb := getMetadataStorageNodesBucket()
mustStopStorageNodes(metadataSnb)
}
func initStorageNodes(unsortedAddrs []string, hashSeed uint64) *storageNodesBucket {
func initStorageNodes(unsortedAddrs []string, rpcCall vminsertapi.RPCCall, hashSeed uint64) *storageNodesBucket {
if len(unsortedAddrs) == 0 {
logger.Panicf("BUG: addrs must be non-empty")
}
@@ -526,9 +523,15 @@ func initStorageNodes(unsortedAddrs []string, hashSeed uint64) *storageNodesBuck
sort.Strings(addrs)
ms := metrics.NewSet()
nodesHash := newConsistentHash(addrs, hashSeed)
nodesHash := consistenthash.NewConsistentHash(addrs, hashSeed)
sns := make([]*storageNode, 0, len(addrs))
var dropRowsOnOverload bool
if rpcCall.Name == vminsertapi.MetricRowsRpcCall.Name {
dropRowsOnOverload = *dropSamplesOnOverload
}
stopCh := make(chan struct{})
rpcName := rpcCall.Name
for _, addr := range addrs {
normalizedAddr, err := netutil.NormalizeAddr(addr, 8400)
if err != nil {
@@ -537,45 +540,53 @@ func initStorageNodes(unsortedAddrs []string, hashSeed uint64) *storageNodesBuck
addr = normalizedAddr
sn := &storageNode{
dialer: netutil.NewTCPDialer(ms, "vminsert", addr, *vmstorageDialTimeout, *vmstorageUserTimeout),
dialer: netutil.NewTCPDialer(ms, "vminsert_"+rpcName, addr, *vmstorageDialTimeout, *vmstorageUserTimeout),
rpc: rpcCall,
dropRowsOnOverload: dropRowsOnOverload,
stopCh: stopCh,
dialErrors: ms.NewCounter(fmt.Sprintf(`vm_rpc_dial_errors_total{name="vminsert", addr=%q}`, addr)),
handshakeErrors: ms.NewCounter(fmt.Sprintf(`vm_rpc_handshake_errors_total{name="vminsert", addr=%q}`, addr)),
connectionErrors: ms.NewCounter(fmt.Sprintf(`vm_rpc_connection_errors_total{name="vminsert", addr=%q}`, addr)),
rowsPushed: ms.NewCounter(fmt.Sprintf(`vm_rpc_rows_pushed_total{name="vminsert", addr=%q}`, addr)),
rowsSent: ms.NewCounter(fmt.Sprintf(`vm_rpc_rows_sent_total{name="vminsert", addr=%q}`, addr)),
rowsDroppedOnOverload: ms.NewCounter(fmt.Sprintf(`vm_rpc_rows_dropped_on_overload_total{name="vminsert", addr=%q}`, addr)),
rowsReroutedFromHere: ms.NewCounter(fmt.Sprintf(`vm_rpc_rows_rerouted_from_here_total{name="vminsert", addr=%q}`, addr)),
rowsReroutedToHere: ms.NewCounter(fmt.Sprintf(`vm_rpc_rows_rerouted_to_here_total{name="vminsert", addr=%q}`, addr)),
sendDurationSeconds: ms.NewFloatCounter(fmt.Sprintf(`vm_rpc_send_duration_seconds_total{name="vminsert", addr=%q}`, addr)),
rpcCall: rpcCall,
dialErrors: ms.NewCounter(fmt.Sprintf(`vm_rpc_dial_errors_total{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name)),
handshakeErrors: ms.NewCounter(fmt.Sprintf(`vm_rpc_handshake_errors_total{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name)),
connectionErrors: ms.NewCounter(fmt.Sprintf(`vm_rpc_connection_errors_total{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name)),
rowsPushed: ms.NewCounter(fmt.Sprintf(`vm_rpc_rows_pushed_total{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name)),
rowsSent: ms.NewCounter(fmt.Sprintf(`vm_rpc_rows_sent_total{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name)),
rowsDroppedOnOverload: ms.NewCounter(fmt.Sprintf(`vm_rpc_rows_dropped_on_overload_total{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name)),
rowsReroutedFromHere: ms.NewCounter(fmt.Sprintf(`vm_rpc_rows_rerouted_from_here_total{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name)),
rowsReroutedToHere: ms.NewCounter(fmt.Sprintf(`vm_rpc_rows_rerouted_to_here_total{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name)),
sendDurationSeconds: ms.NewFloatCounter(fmt.Sprintf(`vm_rpc_send_duration_seconds_total{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name)),
}
sn.brCond = sync.NewCond(&sn.brLock)
_ = ms.NewGauge(fmt.Sprintf(`vm_rpc_rows_pending{name="vminsert", addr=%q}`, addr), func() float64 {
_ = ms.NewGauge(fmt.Sprintf(`vm_rpc_rows_pending{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name), func() float64 {
sn.brLock.Lock()
n := sn.br.rows
sn.brLock.Unlock()
return float64(n)
})
_ = ms.NewGauge(fmt.Sprintf(`vm_rpc_buf_pending_bytes{name="vminsert", addr=%q}`, addr), func() float64 {
_ = ms.NewGauge(fmt.Sprintf(`vm_rpc_buf_pending_bytes{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name), func() float64 {
sn.brLock.Lock()
n := len(sn.br.buf)
sn.brLock.Unlock()
return float64(n)
})
_ = ms.NewGauge(fmt.Sprintf(`vm_rpc_vmstorage_is_reachable{name="vminsert", addr=%q}`, addr), func() float64 {
if sn.isBroken.Load() {
return 0
}
return 1
})
_ = ms.NewGauge(fmt.Sprintf(`vm_rpc_vmstorage_is_read_only{name="vminsert", addr=%q}`, addr), func() float64 {
if sn.isReadOnly.Load() {
// conditionally export health related metrics
if rpcCall == vminsertapi.MetricRowsRpcCall {
_ = ms.NewGauge(fmt.Sprintf(`vm_rpc_vmstorage_is_reachable{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name), func() float64 {
if sn.isBroken.Load() {
return 0
}
return 1
}
return 0
})
})
_ = ms.NewGauge(fmt.Sprintf(`vm_rpc_vmstorage_is_read_only{name="vminsert", addr=%q, rpc_call=%q}`, addr, rpcCall.Name), func() float64 {
if sn.isReadOnly.Load() {
return 1
}
return 0
})
}
sns = append(sns, sn)
}
@@ -585,6 +596,7 @@ func initStorageNodes(unsortedAddrs []string, hashSeed uint64) *storageNodesBuck
}
metrics.RegisterSet(ms)
var wg sync.WaitGroup
snb := &storageNodesBucket{
ms: ms,
@@ -617,27 +629,25 @@ func mustStopStorageNodes(snb *storageNodesBucket) {
// rerouteRowsToReadyStorageNodes reroutes src from not ready snSource to ready storage nodes.
//
// The function blocks until src is fully re-routed.
func rerouteRowsToReadyStorageNodes(snb *storageNodesBucket, snSource *storageNode, src []byte) (int, error) {
func rerouteRowsToReadyStorageNodes(snb *storageNodesBucket, snSource *storageNode, src []byte, getRowHasher func() rowHasher) (int, error) {
reroutesTotal.Inc()
rowsProcessed := 0
var idxsExclude, idxsExcludeNew []int
nodesHash := snb.nodesHash
sns := snb.sns
idxsExclude = getNotReadyStorageNodeIdxsBlocking(snb, idxsExclude[:0])
var mr storage.MetricRow
rowHasher := getRowHasher()
for len(src) > 0 {
tail, err := mr.UnmarshalX(src)
h, tail, err := rowHasher(src)
if err != nil {
logger.Panicf("BUG: cannot unmarshal MetricRow: %s", err)
}
rowBuf := src[:len(src)-len(tail)]
src = tail
reroutedRowsProcessed.Inc()
h := xxhash.Sum64(mr.MetricNameRaw)
mr.ResetX()
var sn *storageNode
for {
idx := nodesHash.getNodeIdx(h, idxsExclude)
idx := nodesHash.GetNodeIdx(h, idxsExclude)
sn = sns[idx]
if sn.isReady() {
break
@@ -673,7 +683,7 @@ func rerouteRowsToReadyStorageNodes(snb *storageNodesBucket, snSource *storageNo
}
// If the re-routing is enabled, then try sending the row to another storage node.
idxsExcludeNew = getNotReadyStorageNodeIdxs(snb, idxsExcludeNew[:0], sn)
idx := nodesHash.getNodeIdx(h, idxsExcludeNew)
idx := nodesHash.GetNodeIdx(h, idxsExcludeNew)
snNew := sns[idx]
if !snNew.trySendBuf(rowBuf, 1) {
// The row cannot be sent to both snSource, sn and snNew without blocking.
@@ -695,10 +705,11 @@ func rerouteRowsToReadyStorageNodes(snb *storageNodesBucket, snSource *storageNo
// It is expected that snSource does not have enough buffer space for sending src.
// It is expected that *disableRerouting isn't set when calling this function.
// It is expected that len(snb.sns) >= 2
func rerouteRowsToFreeStorageNodes(snb *storageNodesBucket, snSource *storageNode, src []byte) (int, error) {
func rerouteRowsToFreeStorageNodes(snb *storageNodesBucket, snSource *storageNode, src []byte, getRowHasher func() rowHasher) (int, error) {
if *disableRerouting {
logger.Panicf("BUG: disableRerouting must be disabled when calling rerouteRowsToFreeStorageNodes")
}
sns := snb.sns
if len(sns) < 2 {
logger.Panicf("BUG: the number of storage nodes is too small for calling rerouteRowsToFreeStorageNodes: %d", len(sns))
@@ -708,17 +719,15 @@ func rerouteRowsToFreeStorageNodes(snb *storageNodesBucket, snSource *storageNod
var idxsExclude []int
nodesHash := snb.nodesHash
idxsExclude = getNotReadyStorageNodeIdxs(snb, idxsExclude[:0], snSource)
var mr storage.MetricRow
rowHasher := getRowHasher()
for len(src) > 0 {
tail, err := mr.UnmarshalX(src)
h, tail, err := rowHasher(src)
if err != nil {
logger.Panicf("BUG: cannot unmarshal MetricRow: %s", err)
logger.Panicf("BUG: cannot unmarshal row: %s", err)
}
rowBuf := src[:len(src)-len(tail)]
src = tail
reroutedRowsProcessed.Inc()
h := xxhash.Sum64(mr.MetricNameRaw)
mr.ResetX()
again:
// Try sending the row to snSource in order to minimize re-routing.
@@ -727,12 +736,12 @@ func rerouteRowsToFreeStorageNodes(snb *storageNodesBucket, snSource *storageNod
continue
}
// The row couldn't be sent to snSource. Try re-routing it to another node.
idx := nodesHash.getNodeIdx(h, idxsExclude)
idx := nodesHash.GetNodeIdx(h, idxsExclude)
sn := sns[idx]
for !sn.isReady() && len(idxsExclude) < len(sns) {
// re-generate idxsExclude list, since sn and snSource must be put there.
idxsExclude = getNotReadyStorageNodeIdxs(snb, idxsExclude[:0], snSource)
idx := nodesHash.getNodeIdx(h, idxsExclude)
idx := nodesHash.GetNodeIdx(h, idxsExclude)
sn = sns[idx]
}
if !sn.trySendBuf(rowBuf, 1) {
@@ -846,14 +855,18 @@ func (sn *storageNode) checkReadOnlyMode() {
if sn.bc == nil {
return
}
// send nil buff to check ack response from storage
err := sendToConn(sn.bc, nil)
var err error
if sn.bc.IsLegacy {
err = vminsertapi.SendToConn(sn.bc, nil)
} else {
err = vminsertapi.SendRPCRequestToConn(sn.bc, sn.rpcCall.VersionedName, nil)
}
if err == nil {
// The storage switched from readonly to non-readonly mode
sn.isReadOnly.Store(false)
return
}
if errors.Is(err, errStorageReadOnly) {
if errors.Is(err, storage.ErrReadOnly) {
// The storage remains in read-only mode
return
}
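
Note on the rowHasher change above: both re-routing functions now take a getRowHasher callback instead of unmarshaling storage.MetricRow directly, so the same loop can split and hash rows of any RPC payload. The following is a minimal sketch of that pattern, not the actual VictoriaMetrics implementation. The row layout (2-byte length prefix followed by the key), the helper names appendRow, newLengthPrefixedHasher and splitRows, and the use of FNV-1a instead of xxhash are all assumptions made for illustration. Only the (hash, tail, error) return shape is taken from the diff.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/fnv"
)

// rowHasher mirrors the call shape seen in the diff: it consumes one row
// from src and returns the routing hash together with the unparsed tail.
type rowHasher func(src []byte) (h uint64, tail []byte, err error)

// appendRow encodes key in a hypothetical row format:
// a 2-byte big-endian length followed by the key bytes.
func appendRow(dst []byte, key string) []byte {
	n := len(key)
	dst = append(dst, byte(n>>8), byte(n))
	return append(dst, key...)
}

// newLengthPrefixedHasher returns a rowHasher for the hypothetical format above.
// FNV-1a from the standard library stands in for the hash used by the real code.
func newLengthPrefixedHasher() rowHasher {
	return func(src []byte) (uint64, []byte, error) {
		if len(src) < 2 {
			return 0, src, errors.New("row is too short for the length prefix")
		}
		n := int(binary.BigEndian.Uint16(src))
		src = src[2:]
		if len(src) < n {
			return 0, src, errors.New("row is truncated")
		}
		hf := fnv.New64a()
		hf.Write(src[:n])
		return hf.Sum64(), src[n:], nil
	}
}

// splitRows walks src the same way the re-routing loops do: hash one row,
// slice out its raw bytes by comparing lengths, then continue with the tail.
func splitRows(src []byte, getRowHasher func() rowHasher) ([][]byte, []uint64, error) {
	hasher := getRowHasher()
	var rows [][]byte
	var hashes []uint64
	for len(src) > 0 {
		h, tail, err := hasher(src)
		if err != nil {
			return nil, nil, fmt.Errorf("cannot unmarshal row: %w", err)
		}
		rows = append(rows, src[:len(src)-len(tail)])
		hashes = append(hashes, h)
		src = tail
	}
	return rows, hashes, nil
}

func main() {
	var buf []byte
	buf = appendRow(buf, "cpu_usage")
	buf = appendRow(buf, "mem_free")

	rows, hashes, err := splitRows(buf, newLengthPrefixedHasher)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, row := range rows {
		fmt.Printf("row=%q hash=%d\n", row[2:], hashes[i])
	}
}
```

Passing a factory (getRowHasher) rather than a hasher value lets each call obtain its own hasher, which matters if the hasher keeps reusable scratch state, as the removed mr.ResetX() call suggests the old code did.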


@@ -7,6 +7,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/firehose"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/stream"
@@ -20,6 +21,7 @@ var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentelemetry"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vm_tenant_inserted_rows_total{type="opentelemetry"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="opentelemetry"}`)
metadataInserted = metrics.NewCounter(`vm_metadata_rows_inserted_total{type="opentelemetry"}`)
)
// InsertHandler processes opentelemetry metrics.
@@ -37,12 +39,12 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return fmt.Errorf("json encoding isn't supported for opentelemetry format. Use protobuf encoding")
}
}
return stream.ParseStream(req.Body, encoding, processBody, func(tss []prompb.TimeSeries, _ []prompb.MetricMetadata) error {
return insertRows(at, tss, extraLabels)
return stream.ParseStream(req.Body, encoding, processBody, func(tss []prompb.TimeSeries, mms []prompb.MetricMetadata) error {
return insertRows(at, tss, mms, extraLabels)
})
}
func insertRows(at *auth.Token, tss []prompb.TimeSeries, extraLabels []prompb.Label) error {
func insertRows(at *auth.Token, tss []prompb.TimeSeries, mms []prompb.MetricMetadata, extraLabels []prompb.Label) error {
ctx := netstorage.GetInsertCtx()
defer netstorage.PutInsertCtx(ctx)
@@ -65,14 +67,14 @@ func insertRows(at *auth.Token, tss []prompb.TimeSeries, extraLabels []prompb.La
}
atLocal := ctx.GetLocalAuthToken(at)
storageNodeIdx := ctx.GetStorageNodeIdx(atLocal, ctx.Labels)
ctx.MetricNameBuf = ctx.MetricNameBuf[:0]
ctx.Buf = ctx.Buf[:0]
samples := ts.Samples
for i := range samples {
r := &samples[i]
if len(ctx.MetricNameBuf) == 0 {
ctx.MetricNameBuf = storage.MarshalMetricNameRaw(ctx.MetricNameBuf[:0], atLocal.AccountID, atLocal.ProjectID, ctx.Labels)
if len(ctx.Buf) == 0 {
ctx.Buf = storage.MarshalMetricNameRaw(ctx.Buf[:0], atLocal.AccountID, atLocal.ProjectID, ctx.Labels)
}
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.MetricNameBuf, r.Timestamp, r.Value); err != nil {
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.Buf, r.Timestamp, r.Value); err != nil {
return err
}
}
@@ -81,5 +83,20 @@ func insertRows(at *auth.Token, tss []prompb.TimeSeries, extraLabels []prompb.La
rowsInserted.Add(rowsTotal)
rowsTenantInserted.MultiAdd(perTenantRows)
rowsPerInsert.Update(float64(rowsTotal))
return ctx.FlushBufs()
if err := ctx.FlushBufs(); err != nil {
return fmt.Errorf("cannot flush metric bufs: %w", err)
}
if prommetadata.IsEnabled() {
ctx.ResetForMetricsMetadata()
for i := range mms {
m := &mms[i]
atLocal := ctx.GetLocalAuthTokenForMetadata(at, m)
if err := ctx.WriteMetadata(atLocal, m); err != nil {
return err
}
}
metadataInserted.Add(len(mms))
return ctx.FlushBufs()
}
return nil
}
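
Note on the insertRows change above: samples are flushed first, and metadata is written and flushed only afterwards and only when metadata ingestion is enabled. The sketch below models that ordering with a hypothetical in-memory sink. The sink type, its fields, and the insert helper are illustrative assumptions and do not correspond to netstorage.InsertCtx. Only the order of operations and the wrapped error message follow the diff.

```go
package main

import (
	"errors"
	"fmt"
)

// sink is a hypothetical stand-in for the insert context used in the diff.
type sink struct {
	pending   []string // items buffered since the last flush
	flushed   []string // items delivered by successful flushes
	failFlush bool     // simulates an unavailable storage node
}

// write buffers a single item until the next flush.
func (s *sink) write(item string) {
	s.pending = append(s.pending, item)
}

// flush delivers all pending items or fails without delivering anything.
func (s *sink) flush() error {
	if s.failFlush {
		return errors.New("storage is unavailable")
	}
	s.flushed = append(s.flushed, s.pending...)
	s.pending = s.pending[:0]
	return nil
}

// insert writes and flushes samples first. Metadata is written and flushed
// only after the samples were flushed, and only if metadataEnabled is set.
func insert(s *sink, samples, metadata []string, metadataEnabled bool) error {
	for _, v := range samples {
		s.write("sample:" + v)
	}
	if err := s.flush(); err != nil {
		return fmt.Errorf("cannot flush metric bufs: %w", err)
	}
	if !metadataEnabled {
		return nil
	}
	for _, m := range metadata {
		s.write("metadata:" + m)
	}
	return s.flush()
}

func main() {
	s := &sink{}
	err := insert(s, []string{"cpu_usage"}, []string{"cpu_usage is a gauge"}, true)
	fmt.Println(s.flushed, err)
}
```

A failed sample flush returns early, so metadata is never sent for a batch whose samples were not delivered.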

Some files were not shown because too many files have changed in this diff