Compare commits

...

88 Commits

Author SHA1 Message Date
func25
879e444058 clean 2025-08-27 13:58:13 +07:00
func25
4d501b20fd clean 2025-08-27 11:33:44 +07:00
func25
65fa35dfdf clean 2025-08-27 10:50:20 +07:00
func25
3768919413 clean 2025-08-27 10:32:45 +07:00
func25
c65dc7b15a clean 2025-08-27 10:10:45 +07:00
func25
a762889e45 remove maxDebugSamples flag and limit checking 2025-08-27 09:40:53 +07:00
func25
3220760480 add debug functionality to QueryHandler 2025-08-27 09:40:53 +07:00
func25
6ae8855e29 update 2025-08-27 09:40:53 +07:00
Alexander Frolov
92ee5a019d vmctl: inconsistent vm-native logs (#9607)
### Describe Your Changes

Some messages were written to `stdout` using `fmt.Printf` and
`fmt.Println`, while the other messages like import statistics were
written to `stderr` through the `log` package.

This led to ordering problems where the `Import finished!` and
`VictoriaMetrics importer stats` messages, which are expected to be the last
messages, appeared before the `Continue import process with filter`
messages, creating confusing output for users.

```
2025/08/20 13:07:26 Import finished!
2025/08/20 13:07:26 VictoriaMetrics importer stats:
  time spent while importing: 20h49m10.8497184s;
  total bytes: 277.1 GB;
  bytes/s: 3.7 MB;
  requests: 7978614;
  requests retries: 0;
2025/08/20 13:07:26 Total time: 20h49m10.851006088s
Continue import process with filter
        filter: match[]={__name__!=""}
        start: 2025-08-08T00:00:00Z
        end: 2025-08-15T00:00:00Z:
Continue import process with filter
        filter: match[]={__name__!=""}
        start: 2025-08-15T00:00:00Z
        end: 2025-08-19T16:18:15Z:
```
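
The ordering problem described above can be illustrated with a minimal sketch in which every progress and summary message goes through one logger, and therefore one stream. The helper names `newProgressLogger`, `runImport` and `render` are hypothetical and are not taken from vmctl.

```go
package main

import (
	"bytes"
	"io"
	"log"
	"os"
)

// newProgressLogger returns a logger bound to a single writer. Sending all
// messages through it keeps their relative order stable, because they no
// longer travel over two independent streams.
func newProgressLogger(w io.Writer) *log.Logger {
	return log.New(w, "", 0)
}

// runImport is a hypothetical stand-in for an import loop: it reports each
// filter window first and the final summary last, all via the same logger.
func runImport(l *log.Logger, windows []string) {
	for _, w := range windows {
		l.Printf("Continue import process with filter window %s", w)
	}
	l.Print("Import finished!")
}

// render captures the output in memory so the ordering can be inspected.
func render(windows []string) string {
	var buf bytes.Buffer
	runImport(newProgressLogger(&buf), windows)
	return buf.String()
}

func main() {
	// The log package writes to stderr by default; the sketch does the same.
	runImport(newProgressLogger(os.Stderr), []string{"2025-08-08..2025-08-15", "2025-08-15..2025-08-19"})
}
```

With a single writer, the summary line cannot overtake the per-window messages regardless of how the process output is redirected.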


### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-26 18:55:13 +03:00
Max Kotliar
3739fd29dd Revert "app/{vminsert,vmagent}: added flags for periodical relabel and stream aggregation configs check (#9598)"
This reverts commit 77997971fc and partly
d0aa1f0640.

The reasons explained in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9598#issuecomment-3223766551
2025-08-26 14:47:32 +03:00
Max Kotliar
d0aa1f0640 docs: sync documented flags with binaries 2025-08-26 10:54:44 +03:00
Andrii Chubatiuk
77997971fc app/{vminsert,vmagent}: added flags for periodical relabel and stream aggregation configs check (#9598)
related issue
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9590

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2025-08-26 10:06:02 +03:00
Alexander Frolov
16adae57e0 app/vmagent/remotewrite: restore protocol downgrade logic (#9621)
### Describe Your Changes

It seems db39f045e1 accidentally reverted
#9419 changes.
```patch
--- a/app/vmagent/remotewrite/client.go
+++ b/app/vmagent/remotewrite/client.go
@@ -448,7 +448,8 @@ again:
 	}
 
 	metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="%d"}`, c.sanitizedURL, statusCode)).Inc()
-	if statusCode == 409 {
+	switch statusCode {
+	case 409:
 		logBlockRejected(block, c.sanitizedURL, resp)
 
 		// Just drop block on 409 status code like Prometheus does.
@@ -461,7 +462,13 @@ again:
 		// - Remote Write v2 specification explicitly specifies a `415 Unsupported Media Type` for unsupported encodings.
 		// - Real-world implementations of v1 use both 400 and 415 status codes.
 		// See more in research: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8462#issuecomment-2786918054
-	} else if statusCode == 415 || statusCode == 400 {
+	case 415, 400:
+		if c.canDowngradeVMProto.Swap(false) {
+			logger.Infof("received unsupported media type or bad request from remote storage at %q. Downgrading protocol from VictoriaMetrics to Prometheus remote write for all future requests. "+
+				"See https://docs.victoriametrics.com/victoriametrics/vmagent/#victoriametrics-remote-write-protocol", c.sanitizedURL)
+			c.useVMProto.Store(false)
+		}
+
 		if encoding.IsZstd(block) {
 			logger.Infof("received unsupported media type or bad request from remote storage at %q. Re-packing the block to Prometheus remote write and retrying."+
 				"See https://docs.victoriametrics.com/victoriametrics/vmagent/#victoriametrics-remote-write-protocol", c.sanitizedURL)
```
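
The one-time downgrade in the patch above relies on `atomic.Bool.Swap` returning the previous value, so only the first caller that sees a 415 or 400 response performs the switch. The following is a reduced sketch of that pattern; the `client` type and `handleStatus` method here are simplified stand-ins, not the actual vmagent code.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// client is a simplified stand-in for a remote write client.
type client struct {
	canDowngrade atomic.Bool // true until the first downgrade happens
	useVMProto   atomic.Bool // true while the VictoriaMetrics protocol is used
}

func newClient() *client {
	c := &client{}
	c.canDowngrade.Store(true)
	c.useVMProto.Store(true)
	return c
}

// handleStatus reports whether this call performed the protocol downgrade.
func (c *client) handleStatus(statusCode int) bool {
	switch statusCode {
	case 415, 400:
		// Swap stores false and returns the old value, so exactly one
		// caller observes true even when many goroutines race here.
		if c.canDowngrade.Swap(false) {
			c.useVMProto.Store(false)
			return true
		}
	}
	return false
}

func main() {
	c := newClient()
	fmt.Println(c.handleStatus(415), c.handleStatus(400), c.useVMProto.Load())
}
```

The second call returns false because the flag was already consumed, which is what keeps the downgrade message from being logged repeatedly.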

cc @makasim

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-26 09:20:26 +03:00
Hui Wang
47b8256e54 lib/prompb: replace fields hardcoded hex values with their corresponding bitwise operations (#9617)

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9608
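
For context, a protobuf field tag is the field number shifted left by three bits, OR-ed with the wire type. The sketch below shows how a hardcoded hex value such as `0x0a` can be expressed as a bitwise operation; the constant and function names are illustrative assumptions, and the actual lib/prompb code is not shown here.

```go
package main

import "fmt"

// Wire types as defined by the protobuf encoding.
const (
	wireVarint  = 0
	wireFixed64 = 1
	wireBytes   = 2
)

// tag builds the protobuf field key: (field number << 3) | wire type.
func tag(fieldNum, wireType uint64) uint64 {
	return fieldNum<<3 | wireType
}

func main() {
	// Field 1 with length-delimited payload is the familiar 0x0a byte.
	fmt.Printf("0x%02x\n", tag(1, wireBytes))
	// Field 2 encoded as a varint.
	fmt.Printf("0x%02x\n", tag(2, wireVarint))
}
```

Writing the tag this way keeps the field number and wire type visible at the call site instead of hiding them inside a magic constant.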
2025-08-26 09:04:23 +03:00
f41gh7
1473bc7794 app/vmagent: pubsub properly handle ingestion error
Previously, if the pushBlockPubSub function returned an error, vmagent stopped
the remote write worker thread assigned to it. The expected behavior in this
scenario is to retry on error inside the pushBlockPubSub function. It must
return only on vmagent shutdown.

This commit properly handles this error and prevents ingestion from
stopping.
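
The retry-until-shutdown behavior described above could look roughly like the sketch below. It is a generic illustration, not the pushBlockPubSub implementation: `pushWithRetry`, `demo`, the stop channel and the fixed delay are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pushWithRetry keeps calling push until it succeeds or stopCh is closed.
// It returns true if the block was delivered and false on shutdown.
func pushWithRetry(push func() error, stopCh <-chan struct{}, delay time.Duration) bool {
	for {
		// Check for shutdown before every attempt.
		select {
		case <-stopCh:
			return false
		default:
		}
		if err := push(); err == nil {
			return true
		}
		// Wait before the next attempt, but wake up early on shutdown.
		select {
		case <-stopCh:
			return false
		case <-time.After(delay):
		}
	}
}

// demo runs pushWithRetry against a push function that fails the given
// number of times; stopped simulates a shutdown that already happened.
func demo(failures int, stopped bool) (delivered bool, attempts int) {
	stopCh := make(chan struct{})
	if stopped {
		close(stopCh)
	}
	push := func() error {
		attempts++
		if attempts <= failures {
			return errors.New("temporary ingestion error")
		}
		return nil
	}
	delivered = pushWithRetry(push, stopCh, time.Millisecond)
	return delivered, attempts
}

func main() {
	fmt.Println(demo(2, false))
}
```

The worker never exits on a transient error; the only way out without delivery is the shutdown signal.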
2025-08-24 21:37:35 +02:00
Aliaksandr Valialkin
1b69d2d766 lib/netutil: return tls.Conn from TCPListener.Accept for TLS connections
This is needed because servers that may use the TCPListener, such as net/http.Server,
expect to get tls.Conn for TLS connections in order to properly fill various fields such as net/http.Request.TLS.
If the listener returns some other net.Conn, then these fields aren't filled properly,
and this may prevent proper mTLS-based authorization and request routing,
such as https://docs.victoriametrics.com/victoriametrics/vmauth/#mtls-based-request-routing
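
As a rough illustration of the idea, the sketch below wraps a plain listener so that `Accept` returns a `*tls.Conn` whenever a TLS config is set. The `tlsAwareListener` type and the `acceptedIsTLS` helper are hypothetical and much simpler than lib/netutil's TCPListener; the empty `tls.Config` is only suitable for the type check shown here, since no handshake is performed.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
)

// tlsAwareListener is a hypothetical wrapper around a plain listener.
type tlsAwareListener struct {
	net.Listener
	cfg *tls.Config // nil means plain TCP
}

// Accept returns *tls.Conn for TLS listeners, so callers such as
// net/http.Server can detect TLS via a type assertion.
func (l *tlsAwareListener) Accept() (net.Conn, error) {
	c, err := l.Listener.Accept()
	if err != nil {
		return nil, err
	}
	if l.cfg == nil {
		return c, nil
	}
	return tls.Server(c, l.cfg), nil
}

// acceptedIsTLS accepts one loopback connection and reports whether the
// returned net.Conn is a *tls.Conn.
func acceptedIsTLS(useTLS bool) (bool, error) {
	inner, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer inner.Close()

	l := &tlsAwareListener{Listener: inner}
	if useTLS {
		l.cfg = &tls.Config{}
	}

	client, err := net.Dial("tcp", inner.Addr().String())
	if err != nil {
		return false, err
	}
	defer client.Close()

	conn, err := l.Accept()
	if err != nil {
		return false, err
	}
	defer conn.Close()

	_, ok := conn.(*tls.Conn)
	return ok, nil
}

func main() {
	fmt.Println(acceptedIsTLS(true))
}
```

If the wrapper returned its own connection type instead, the type assertion would fail and TLS-dependent fields would stay empty.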

Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/29
2025-08-22 20:26:03 +02:00
Aliaksandr Valialkin
4e656e2793 docs/victoriametrics/enterprise.md: mention VictoriaLogs enterprise
Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/120
2025-08-22 18:32:39 +02:00
hagen1778
7a40d24633 docs: reword -vmalert.proxyURL usage in vmalert
Make it clear that `-vmalert.proxyURL` needs to be applied to
VM single or vmselect.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit f85fd161e4)
2025-08-22 09:50:00 +02:00
Max Kotliar
532615c297 metricsql: improve timestamp function compatibility with Prometheus when used with sub-expressions (#9603)
### Describe Your Changes

Fixes
[#9527](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9527)
Related PR: https://github.com/VictoriaMetrics/metricsql/pull/55

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-21 17:39:00 +03:00
Max Kotliar
2a8451efb4 lib/appmetrics: revert accidental change 2025-08-21 17:34:50 +03:00
Max Kotliar
c80f77705b docs/changelog: add update note 2025-08-21 17:34:50 +03:00
Andrii Chubatiuk
d8ec4894b5 deployment/rules: set proper job filters for rules (#9587)
### Describe Your Changes

related issue https://github.com/VictoriaMetrics/helm-charts/issues/2350

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 7e05200c60)
2025-08-21 15:37:24 +02:00
hagen1778
6457daae4b docs: refresh vmui description
* add missing features
* re-organize text without breaking links to improve clarity

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit a2f033ce6c)
2025-08-21 15:37:24 +02:00
Artur Minchukou
f8be7c0d84 app/vmui: add export functionality for Query and RawQuery tabs with CSV/JSON support (#9463)
### Describe Your Changes

Related issue: #9332
- add export functionality for Query and RawQuery tabs with CSV/JSON
support;
 - replace unused icons and update `DebugIcon` usage in `DownloadReport`

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 78b217d70c)
2025-08-21 15:37:24 +02:00
Aliaksandr Valialkin
5d10823e61 lib/httpserver: add missing whitespace after the dot in the description for the -tlsAutocertEmail command-line flag
This is a follow-up for 1d80e8f860
2025-08-21 11:03:00 +02:00
Andrii Chubatiuk
cc3301c5d9 docs: exclude files from rendering by hugo (#9591)
required for https://github.com/VictoriaMetrics/vmdocs/issues/164

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-20 12:04:42 +03:00
Nikolay
828527c8af go.mod: unpin cloud.google.com/go/storage
Add build tag `disable_grpc_modules` for vmbackup, vmrestore and
vmbackupmanager. Binary size increases by only 3MB with it. It's an
acceptable trade-off for security and feature updates.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8008
2025-08-19 12:22:28 +02:00
Fred Navruzov
c7e1211851 docs/vmanomaly: release v1.25.3 (#9597)
### Describe Your Changes

Update docs to vmanomaly release v1.25.3

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-19 10:25:32 +04:00
Roman Khavronenko
53fb7d6f1b benchmarks: update makefile commands
* check if the built binary is present for `make tsbs-build`. Previously, if
the build failed, the command stopped working.
* make ENV variables configurable from command line, so `TSBS_STEP=15s
make tsbs-generate-data` would respect the configured step.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-08-18 22:55:53 +02:00
Arie Heinrich
fd02edcb4b Spelling and Markdown Standards
Another batch of documentation improvements

Fix Spelling in:
- Comments in code
- Displayed strings

One change was in a json file used for the anomaly dashboard in docker,
else no other code was changed.

Some Markdown changes, related to standards:
- URLs
- List numbering
- Empty spaces at the end of a line
2025-08-18 22:55:53 +02:00
Corporte Gadfly
6494508f00 fix typo in sentence 2025-08-18 22:54:35 +02:00
Zakhar Bessarab
ec8515e3c1 docs: update references to the latest releases
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-18 16:09:02 +04:00
Zakhar Bessarab
a9a6f5c67a docs/changelog: backport LTS release notes
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-18 15:37:52 +04:00
f41gh7
eaaa3e1fe2 synctest: replace deprecated Run call with Test
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-08-17 21:00:48 +02:00
f41gh7
155d7560c3 Makefile: upgrade golangci-lint from 2.2.1 to 2.4.0
Changelog https://golangci-lint.run/docs/product/changelog/#240
2025-08-17 20:36:20 +02:00
f41gh7
ef1399fcc0 deployment/docker: update Go builder from 1.24.6 to 1.25.0
Changes https://tip.golang.org/doc/go1.25
2025-08-17 20:31:57 +02:00
Zakhar Bessarab
dd31f47b41 docs/CHANGELOG.md: cut v1.124.0
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-15 15:00:58 +04:00
Zakhar Bessarab
31e324c6d2 docs: update version tooltips
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-15 14:51:54 +04:00
Zakhar Bessarab
75eaf8b771 app/vmselect: run make vmui-update
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-15 14:47:03 +04:00
Max Kotliar
1fde679dc2 .github: add copilot instruction (#9586)
### Describe Your Changes

Trying to teach Copilot correct changelog changes, such as a misplaced
entry
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9306#issuecomment-3185126897

I couldn’t test this properly because Copilot doesn’t pick up
instructions from the PR itself. They must be on the master branch. The
instruction needs to be merged first, then tested. Please review.

If it doesn’t work, I’ll remove it.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-14 19:51:30 +03:00
Andrei Baidarov
552be46699 app/vmagent: properly apply dropOnOverload condition
Previously, vmagent treated the following configurations differently:

1) ./bin/vmagent --remoteWrite.url=url-0 --remoteWrite.url=url-1 --remoteWrite.disableOndiskQueue

and

2) ./bin/vmagent --remoteWrite.url=url-0 --remoteWrite.url=url-1 --remoteWrite.disableOndiskQueue=true,true

In the first case, it could produce duplicates and block ingestion requests if one of the remote write targets was not accessible.
In the second case, it implicitly set --remoteWrite.dropSamplesOnOverload to true and silently dropped samples for the inaccessible target.

This commit treats both configurations the same and silently drops samples in both cases to mitigate possible duplicates.

It's expected that vmagent provides delivery guarantees only if it has a single remote write target when the flag remoteWrite.disableOndiskQueue=true is set.
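
One way to read the rule above is sketched below: a single flag value is expanded to all targets, and samples are dropped on overload for any target whose on-disk queue is disabled when there is more than one target. The functions `expandFlag` and `dropOnOverload` are hypothetical and only model the description in this commit message; the real flag handling in vmagent may differ.

```go
package main

import "fmt"

// expandFlag applies a single flag value to every target; otherwise it
// uses per-target values and defaults missing ones to false.
func expandFlag(values []bool, targets int) []bool {
	out := make([]bool, targets)
	for i := range out {
		switch {
		case len(values) == 1:
			out[i] = values[0]
		case i < len(values):
			out[i] = values[i]
		}
	}
	return out
}

// dropOnOverload models the described behavior: with several targets and
// a disabled on-disk queue, samples for a blocked target are dropped.
func dropOnOverload(disableOnDiskQueue []bool, targets int) []bool {
	queueDisabled := expandFlag(disableOnDiskQueue, targets)
	out := make([]bool, targets)
	for i := range out {
		out[i] = targets > 1 && queueDisabled[i]
	}
	return out
}

func main() {
	// A bare flag and an explicit per-target list give the same result.
	fmt.Println(dropOnOverload([]bool{true}, 2))
	fmt.Println(dropOnOverload([]bool{true, true}, 2))
}
```

With a single target the function returns false, which matches the statement that delivery guarantees hold only in the single-target case.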


Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9565
2025-08-14 16:12:02 +02:00
Andrii Chubatiuk
33aac9ceb5 lib/backup: added checksum algorithm for all S3 PutObject requests (#9549)
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9532
set checksum algorithm to SHA256, not sure if this property should be
configurable

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-14 17:50:41 +04:00
Artem Fetishev
43030f9ba3 lib/storage: fix searchMetricName() (#9582)
While working on #9431, two bugs related to
indexDB.searchMetricName() were introduced:

1. During the search, the index records are unconditionally placed in
the sparse index
2. If the search touches index records in both prev and curr indexDBs,
metricIDs can be unintentionally removed
by the `wasMetricIDMissingBefore()` logic

Additionally, the PR moves searchMetricName from indexDB and Search
to Storage, which simplifies the code and makes it possible to reuse the
function as-is in enterprise code.

Follow up for #9431.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2025-08-14 10:30:49 +02:00
Max Kotliar
911f0e0222 docs/changelog: move metadata changelog record to tip
Follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9306
2025-08-13 21:59:22 +03:00
dependabot[bot]
11bdcca641 build(deps): bump actions/checkout from 4 to 5 (#9574)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to
5.

Release notes, sourced from [actions/checkout's releases](https://github.com/actions/checkout/releases):

**v5.0.0**

What's Changed:

- Update actions checkout to use node 24 by [`@salmanmkc`](https://github.com/salmanmkc) in [actions/checkout#2226](https://redirect.github.com/actions/checkout/pull/2226)
- Prepare v5.0.0 release by [`@salmanmkc`](https://github.com/salmanmkc) in [actions/checkout#2238](https://redirect.github.com/actions/checkout/pull/2238)

⚠️ Minimum Compatible Runner Version: **v2.327.1** ([Release Notes](https://github.com/actions/runner/releases/tag/v2.327.1)). Make sure your runner is updated to this version or newer to use this release.

**Full Changelog**: https://github.com/actions/checkout/compare/v4...v5.0.0

**v4.3.0**

What's Changed:

- docs: update README.md by [`@motss`](https://github.com/motss) in [actions/checkout#1971](https://redirect.github.com/actions/checkout/pull/1971)
- Add internal repos for checking out multiple repositories by [`@mouismail`](https://github.com/mouismail) in [actions/checkout#1977](https://redirect.github.com/actions/checkout/pull/1977)
- Documentation update - add recommended permissions to Readme by [`@benwells`](https://github.com/benwells) in [actions/checkout#2043](https://redirect.github.com/actions/checkout/pull/2043)
- Adjust positioning of user email note and permissions heading by [`@joshmgross`](https://github.com/joshmgross) in [actions/checkout#2044](https://redirect.github.com/actions/checkout/pull/2044)
- Update README.md by [`@nebuk89`](https://github.com/nebuk89) in [actions/checkout#2194](https://redirect.github.com/actions/checkout/pull/2194)
- Update CODEOWNERS for actions by [`@TingluoHuang`](https://github.com/TingluoHuang) in [actions/checkout#2224](https://redirect.github.com/actions/checkout/pull/2224)
- Update package dependencies by [`@salmanmkc`](https://github.com/salmanmkc) in [actions/checkout#2236](https://redirect.github.com/actions/checkout/pull/2236)
- Prepare release v4.3.0 by [`@salmanmkc`](https://github.com/salmanmkc) in [actions/checkout#2237](https://redirect.github.com/actions/checkout/pull/2237)

New Contributors:

- [`@motss`](https://github.com/motss) made their first contribution in [actions/checkout#1971](https://redirect.github.com/actions/checkout/pull/1971)
- [`@mouismail`](https://github.com/mouismail) made their first contribution in [actions/checkout#1977](https://redirect.github.com/actions/checkout/pull/1977)
- [`@benwells`](https://github.com/benwells) made their first contribution in [actions/checkout#2043](https://redirect.github.com/actions/checkout/pull/2043)
- [`@nebuk89`](https://github.com/nebuk89) made their first contribution in [actions/checkout#2194](https://redirect.github.com/actions/checkout/pull/2194)
- [`@salmanmkc`](https://github.com/salmanmkc) made their first contribution in [actions/checkout#2236](https://redirect.github.com/actions/checkout/pull/2236)

**Full Changelog**: https://github.com/actions/checkout/compare/v4...v4.3.0

**v4.2.2**

What's Changed:

- `url-helper.ts` now leverages well-known environment variables by [`@jww3`](https://github.com/jww3) in [actions/checkout#1941](https://redirect.github.com/actions/checkout/pull/1941)
- Expand unit test coverage for `isGhes` by [`@jww3`](https://github.com/jww3) in [actions/checkout#1946](https://redirect.github.com/actions/checkout/pull/1946)

**Full Changelog**: https://github.com/actions/checkout/compare/v4.2.1...v4.2.2

**v4.2.1**

What's Changed:

- Check out other refs/* by commit if provided, fall back to ref by [`@orhantoy`](https://github.com/orhantoy) in [actions/checkout#1924](https://redirect.github.com/actions/checkout/pull/1924)

New Contributors:

- [`@Jcambass`](https://github.com/Jcambass) made their first contribution in [actions/checkout#1919](https://redirect.github.com/actions/checkout/pull/1919)

**Full Changelog**: https://github.com/actions/checkout/compare/v4.2.0...v4.2.1

... (truncated)

Commits:

- `08c6903` Prepare v5.0.0 release ([#2238](https://redirect.github.com/actions/checkout/issues/2238))
- `9f26565` Update actions checkout to use node 24 ([#2226](https://redirect.github.com/actions/checkout/issues/2226))
- See full diff in [compare view](https://github.com/actions/checkout/compare/v4...v5)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-13 19:02:44 +03:00
Zakhar Bessarab
dcf70141e7 docs: update examples to use proper license flags (#9579)
`-eula` was deprecated and made no-op in v1.123.0, so examples with
`-eula` will no longer work.
Replace those with proper license configuration.

While at it, remove license flags from vmbackupmanager CLI commands as
it is not required when using CLI.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-13 19:14:33 +04:00
hagen1778
d70cdf42a3 metricsql: return a proper error message for scalar arguments
Follow-up for 8b92af9d45

The initial PR contained the change for the getScalar function - see https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9548
But the change was dropped during an incorrect rebase.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 5869a39e7b)
2025-08-13 13:35:47 +02:00
Max Kotliar
5cb9469f0c apptest: Fix flaky TestSingleVMAuthRouterWithAuth (#9575)
### Describe Your Changes

Do not check vmauth_config_last_reload_success_timestamp_seconds since
it may contain the timestamp < time.Now() due to how lib/fasttime works.

Instead, compare the number of config reloads.

follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9369 and
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9572

Also, split the config update and reload into two separate functions.

master:
```
$gotest -race ./apptest/tests/ -run=TestSingleVMAuthRouterWithInternalAddr -count=40
ok  	github.com/VictoriaMetrics/VictoriaMetrics/apptest/tests	90.176s
```

pr:
```
$gotest -race ./apptest/tests/ -run=TestSingleVMAuthRouterWithInternalAddr -count=40
ok  	github.com/VictoriaMetrics/VictoriaMetrics/apptest/tests	46.130s
```

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit c3c802a61c)
2025-08-13 13:06:16 +02:00
Hui Wang
5116c5e56e metricsql: return a proper error message when the function argument i… (#9548)
…s expected to be a string

In MetricsQL, functions like
[count_values](https://docs.victoriametrics.com/victoriametrics/metricsql/#count_values),
[label_replace](https://docs.victoriametrics.com/victoriametrics/metricsql/#label_replace)
expect string arguments, and `getString()` checks whether the result comes from a
string expression.
Previously, the error messages were not intuitive. Now
`label_replace("","","","",up)` and `label_replace("","","","",1)`
return a clearer error message.

(cherry picked from commit 8b92af9d45)
2025-08-13 13:06:16 +02:00
Hui Wang
dc61936643 vmalert: fix the {{ $activeAt }} variable value in annotation templ… (#9576)
…ating when the alert has already triggered

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9543,
bug was introduced in
[v1.101.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.101.0)
with
a84491324d.

(cherry picked from commit e313874d01)
2025-08-13 13:06:16 +02:00
Hui Wang
85b464e0bc vmalert: fix potential data race and missing firing states when repla… (#9559)
…ying alerting rule with `-replay.ruleEvaluationConcurrency>1`

(cherry picked from commit 58a4e48901)
2025-08-13 13:06:16 +02:00
Artem Fetishev
2380e4829d lib/storage: remove extDB from indexDB, search indexDBs independently (#9431)
Removing extDB from indexDB makes prev, curr, and next indexDBs independent.
I.e. the search is performed independently in prev and curr, the results are
then merged.

Additionally, since no search is now performed in extDB:
- all indexDB search methods now return the original maps used for populating
  the result, without intermediate conversion to slices.
- `NoExtDB` suffix has been removed from method names
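
The independent search and merge described above could be sketched roughly as follows. This is only an illustration under assumptions: the `indexDB` type, its `metricIDsByTag` field, and the `searchMetricIDs` and `searchAll` names are hypothetical and are not taken from `lib/storage`.

```go
package main

import "fmt"

// indexDB is a hypothetical stand-in for a single index partition (prev or curr).
type indexDB struct {
	metricIDsByTag map[string][]uint64
}

// searchMetricIDs returns the matching metricIDs as a set,
// without an intermediate conversion to a slice.
func (db *indexDB) searchMetricIDs(tag string) map[uint64]struct{} {
	ids := make(map[uint64]struct{})
	for _, id := range db.metricIDsByTag[tag] {
		ids[id] = struct{}{}
	}
	return ids
}

// searchAll queries every indexDB independently and merges the resulting sets.
func searchAll(tag string, dbs ...*indexDB) map[uint64]struct{} {
	merged := make(map[uint64]struct{})
	for _, db := range dbs {
		for id := range db.searchMetricIDs(tag) {
			merged[id] = struct{}{}
		}
	}
	return merged
}

func main() {
	prev := &indexDB{metricIDsByTag: map[string][]uint64{"job=a": {1, 2}}}
	curr := &indexDB{metricIDsByTag: map[string][]uint64{"job=a": {2, 3}}}
	// metricID 2 is present in both indexDBs and is counted once after the merge.
	fmt.Println(len(searchAll("job=a", prev, curr)))
}
```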

This has been extracted from #8134.

Signed-off-by: Andrei Baidarov <baidarov@nebius.com>
Co-authored-by: Artem Fetishev <rtm@victoriametrics.com>
2025-08-13 07:42:57 +02:00
Dmytro Kozlov
975cc117e8 benchmark: update date calculation for the benchmark script (#9563)
### Describe Your Changes

Updated date calculation for the TSBS benchmark. Previously, running these
benchmarks on macOS required installing `coreutils`; now nothing extra
needs to be installed.
`make tsbs` should work correctly on both Linux and macOS.

Checked on both systems, it works correctly:
1. MacOS
<img width="1292" height="372" alt="Screenshot 2025-08-08 at 11 45 03"
src="https://github.com/user-attachments/assets/609a797d-c54a-40d3-abe2-270c173ff9c3"
/>

2. Linux
<img width="1440" height="283" alt="Screenshot 2025-08-08 at 11 46 33"
src="https://github.com/user-attachments/assets/e9f094a1-40cc-4cd2-afd5-55c5678c041f"
/>

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit fe0afc3fea)
2025-08-12 16:54:36 +02:00
Roman Khavronenko
61b51863fc dashboards/victoriametrics-cluster: show max 99th percentile on vmselect panels (#9555)
Before, we showed the summed 99th percentile of query complexity across
all available instances. This doesn't make much sense, as it doesn't
answer the following questions:
1. What complexity limits to set per vmselect
2. What are the most expensive queries

The change is to use `max` instead of `sum`, to show only outliers, the
heaviest served queries. The update should help answer the questions
above.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit f99e49c15d)
2025-08-12 16:54:36 +02:00
Andrii Chubatiuk
a7564a5f74 metricsql: fixed gaps in histogram_quantile calculation, when first bucket contains NaNs (#9547)
Fixes the case where the `histogram_quantile` result contains gaps in the
same time range where NaNs are present in the first bucket of a
histogram.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 1ba994970b)
2025-08-12 16:54:36 +02:00
Hui Wang
25f2155d3a app/vmagent: add time series metadata support
By default, `vmagent` doesn't parse
[metadata](https://github.com/prometheus/docs/blob/main/docs/instrumenting/exposition_formats.md)
when scraping targets, and drops metadata that is received via [Prometheus remote write v1](https://prometheus.io/docs/specs/prw/remote_write_spec/) or
[OpenTelemetry protocol](https://github.com/open-telemetry/opentelemetryproto/blob/v1.7.0/opentelemetry/proto/metrics/v1/metrics.proto).

To enable parsing metadata when scraping and sending metadata to the
configured `-remoteWrite.url`, set `-enableMetadata=true`.

Besides native metadata fields, vmagent also adds tenant info to
metadata when `-enableMultitenantHandlers` is enabled and data is sent
via the multitenant endpoints (/insert/<accountID>/<suffix>), allowing
storing metadata under different tenants in VictoriaMetrics cluster.
However, if the `vm_account_id` or `vm_project_id` labels are added directly
to metric labels and sent to the [vminsert multitenant endpoints](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy-via-labels),
tenant info won't be attached to the metadata, and it will be stored in
the default tenant of VictoriaMetrics cluster.

part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
2025-08-12 15:21:33 +02:00
Max Kotliar
4a38d6eacf apptest: fix flaky single vmauth router with auth test
Fix flaky integration test `TestSingleVMAuthRouterWithAuth`.
The flakiness is caused by the
`vmauth_config_last_reload_success_timestamp_seconds` metric, which
reports time with second-level precision.
Update the test to account for this when verifying that the config
reloads correctly.
2025-08-12 11:36:40 +02:00
Nikolay
f9ad1f9c63 lib/storage: cardinality limiter prevent performance degradation on limit hit
Previously, if the limit was reached for the cardinality limiter, vmstorage
started to perform index lookups for every series exceeding the limit. Since
storage must skip index creation for such series, it's not possible to
cache them. This had the opposite effect of what the cardinality limiter is
for: instead of reducing resource usage, it increased it.

This commit changes the cardinality limit calculation from metricID to the
hash of the raw metricName. It could slightly increase CPU usage if the
cardinality limiter is configured, since the hash must be calculated for
each metricName row. But it mitigates excessive CPU and memory usage on
limit hit.
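
A minimal sketch of a limiter keyed by the hash of the raw metric name, so that no index lookup is needed to decide whether a series fits the limit. The `seriesLimiter` type and its methods are hypothetical, FNV-1a is chosen only because it ships with the standard library, and the sketch is not safe for concurrent use; none of this is confirmed to match the actual `lib/storage` code.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// seriesLimiter is a hypothetical limiter that tracks unique series by hash.
type seriesLimiter struct {
	maxSeries int
	seen      map[uint64]struct{}
}

func newSeriesLimiter(maxSeries int) *seriesLimiter {
	return &seriesLimiter{maxSeries: maxSeries, seen: make(map[uint64]struct{})}
}

// Add reports whether the series with the given raw metric name fits the limit.
// The decision depends only on the hash, so no metricID lookup is required.
func (sl *seriesLimiter) Add(rawMetricName []byte) bool {
	h := fnv.New64a()
	h.Write(rawMetricName)
	key := h.Sum64()
	if _, ok := sl.seen[key]; ok {
		return true // already tracked series
	}
	if len(sl.seen) >= sl.maxSeries {
		return false // limit hit: reject without touching the index
	}
	sl.seen[key] = struct{}{}
	return true
}

func main() {
	sl := newSeriesLimiter(2)
	for _, name := range []string{"cpu{host=\"a\"}", "cpu{host=\"b\"}", "cpu{host=\"c\"}", "cpu{host=\"a\"}"} {
		fmt.Println(name, sl.Add([]byte(name)))
	}
}
```

The trade-off matches the one described above: every row pays for one hash computation, but rejected series cost no more than a map lookup.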

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9554
2025-08-12 11:36:40 +02:00
Nikolay
8454972d63 docs: add vmselect group and vmstorage node auto-discovery 2025-08-12 11:36:40 +02:00
Max Kotliar
a31431bdec docs: add available from hint for -rpc.handshakeTimeout flag
follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9541
2025-08-12 10:12:30 +03:00
Max Kotliar
5eae13fbe9 lib/handshake: set deadline for whole handshake; change deadline (1s per op to 3s whole process) (#9541)
### Describe Your Changes

The current one-second timeout for individual read or write operations
during the handshake phase has proven to be insufficient in some
scenarios
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9345. For
example, short-lived CPU spikes lasting a few seconds can cause
handshake failures due to the low timeout threshold.

While a small timeout may work well in environments with fast and
reliable networking, such as within a single datacenter, it becomes
problematic in more complex setups—particularly in a [multi-level
cluster
setup](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multi-level-cluster-setup)
where the top-level vmselect may reside in a different availability zone
and work on a less reliable network.

Another issue with the per-operation timeout approach is that it allows
the total time for a handshake to accumulate significantly in the
worst-case scenario. If each operation experiences a delay just under
the timeout threshold, the entire handshake process could take up to 6s,
which accounts for 60% of `-search.maxQueueDuration` and leaves only 4s
for the actual query.

Introducing a single timeout for the entire handshake process would
provide more predictable behavior and improve usability from a
configuration standpoint. The timeout for the whole handshake op is also
easier to understand from the operator's point of view. Increasing the
timeout value and providing a configuration option for it would make the
system more resilient to transient conditions like CPU contention and
better suited for use cases involving cross-AZ communication.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9345

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-11 19:30:03 +03:00
Max Kotliar
1dd4a48032 .github/workflows: Run cross builds and tests in parallel (#9443)
The commit changes CI behavior:
- Run build in parallel for different os\arch
- Run unit\integration\lint in parallel
- Remove the custom Go cache step in favor of the logic provided in
`actions/setup-go`. The custom cache was used to build key based on
go.sum and makefiles. This logic is preserved.
- Introduce cache for golangci-lint.

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-11 16:05:13 +03:00
Max Kotliar
43840436f0 apptest: Add vmauth use proxy protocol integration test (#9556)
### Describe Your Changes

Add an integration test that verifies that vmauth works with
`-httpListenAddr.useProxyProtocol=true` enabled and the x-forwarded-for
header is propagated correctly.

Related to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9546

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-11 15:50:29 +03:00
Aliaksandr Valialkin
cb16774bcb lib/envtemplate: allow referring non-existing environment variables in config files and in command-line flags
A few users reported unexpected errors at VictoriaMetrics startup when environment variables referred to other, non-existing environment variables.
This resulted in the following fatal error:

    cannot expand "..." env var value "...%{SOME_NON_EXISTING_ENV_VAR}..."

Fix this by leaving placeholders with non-existing env vars as is.
This improves the general usability of environment variables by VictoriaMetrics components
inside command-line flags and inside config files. Users can easily notice placeholders with non-existing
environment variables by looking at the corresponding command-line flag or at the corresponding config option value.
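
The "leave unknown placeholders as is" behavior can be sketched like this. The sketch assumes the `%{NAME}` placeholder syntax shown in the error message above; the `expandPlaceholders` helper and its `lookup` callback are hypothetical and are not the actual `lib/envtemplate` API.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholderRe matches %{NAME} placeholders.
var placeholderRe = regexp.MustCompile(`%\{([^}]+)\}`)

// expandPlaceholders replaces every %{NAME} with the value returned by lookup.
// Placeholders for unknown names are left untouched instead of causing an error.
func expandPlaceholders(s string, lookup func(name string) (string, bool)) string {
	return placeholderRe.ReplaceAllStringFunc(s, func(m string) string {
		name := m[2 : len(m)-1] // strip the leading "%{" and the trailing "}"
		if v, ok := lookup(name); ok {
			return v
		}
		return m
	})
}

func main() {
	env := map[string]string{"HOST": "localhost"}
	lookup := func(name string) (string, bool) {
		v, ok := env[name]
		return v, ok
	}
	// PORT is not defined, so its placeholder stays visible in the result.
	fmt.Println(expandPlaceholders("http://%{HOST}:%{PORT}/", lookup))
}
```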

While at it, replace duplicate docs about environment variables at the https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#environment-variables
with the link to the same docs at https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#environment-variables .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3999
2025-08-09 21:07:18 +02:00
Aliaksandr Valialkin
a83f9c0608 go.sum: run go mod tidy after 1f2c14260c 2025-08-08 20:24:07 +02:00
Aliaksandr Valialkin
35d254d994 deployment/docker: update Go builder from Go1.24.5 to Go1.24.6
See https://github.com/golang/go/issues?q=milestone%3AGo1.24.6+label%3ACherryPickApproved
2025-08-08 20:22:11 +02:00
Charles-Antoine Mathieu
051361183c app/vmselect: truncate graphite excessive pathExpression field
vmselect was experiencing memory exhaustion and OOM kills
when processing complex Graphite queries with nested functions and large
numbers of label selectors (30k+ values).

The root cause was unbounded growth of the pathExpression field.

This commit adds configurable truncation for Graphite pathExpression fields to
prevent memory exhaustion while preserving query functionality:

- New flag: -search.maxGraphitePathExpressionLen=1024 (default 1024
characters)
- Safe truncation: Long expressions are truncated with "..." suffix
- Zero disables: Set to 0 to disable truncation entirely
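
The truncation rules listed above could look roughly like the sketch below. The `truncatePathExpression` helper is hypothetical, and two details are assumptions not confirmed by the commit message: the limit is counted in bytes, and the "..." suffix is appended on top of the limit.

```go
package main

import "fmt"

// truncatePathExpression shortens expr to at most maxLen bytes and appends "...".
// A non-positive maxLen disables truncation.
// Note: byte-based slicing may split a multi-byte UTF-8 character.
func truncatePathExpression(expr string, maxLen int) string {
	if maxLen <= 0 || len(expr) <= maxLen {
		return expr
	}
	return expr[:maxLen] + "..."
}

func main() {
	expr := "sumSeries(host.{a,b,c,d,e,f}.cpu.user)"
	fmt.Println(truncatePathExpression(expr, 16))
	fmt.Println(truncatePathExpression(expr, 0))
}
```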

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9534/
2025-08-08 13:43:40 +02:00
Max Kotliar
766132b8f0 lib/netutil: fix linter issues in proxy protocol tests 2025-08-07 14:36:09 +03:00
Nikolay
4d41cda5bc lib/netutil: properly accept proxy protocol
Previously, the tcp listener performed a synchronous proxy protocol header
read during connection accept. It could significantly reduce vmauth
performance and lead to timeouts when serving http requests.

This commit changes this logic and performs proxy protocol header
parsing during the first Read request from the connection or the first
RemoteAddr method call. It significantly improves performance and reduces
a possible bottleneck in the connection accept method.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9546/
2025-08-07 12:26:19 +02:00
f41gh7
1705867173 go.mod: update fastcache to v1.13.0 2025-08-06 18:29:53 +02:00
Max Kotliar
cdc9a68545 lib/prompb: fix review comments after merging prompbmarshal into prompb
- Rename WriteRequestUnmarshaller to WriteRequestUnmarshaler
- Add a description to WriteRequestUnmarshaler struct

Review comments
b98e592752 (r163365472)

Follow up on
b98e592752
2025-08-06 19:24:23 +03:00
Alexander Frolov
53465350c7 vmselect: properly release tmp blocks for /federate
The `/federate` endpoint handler might return early before calling
`rss.RunParallel()`, which causes temporary block files to not be closed
properly.

Related PR: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9536
2025-08-06 18:19:41 +02:00
Andrii Chubatiuk
ab1aecf2d7 docs: override canonical url of pages, that have multiple copies (#9550)
### Describe Your Changes

Multiple pages that reference the same document via the `{% content %}`
shortcode have the same content but different canonical URLs. Added a
canonical parameter to override the default URL.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

(cherry picked from commit 5266bf1f3b)
2025-08-06 16:14:34 +02:00
Roman Khavronenko
59a2849304 docs: mention series of articles on VM internals in FAQ (#9528)
While there, mention https://victoriametrics.com/blog in the articles
section, as it does not seem to be mentioned anywhere.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Mathias Palmersheim <mathias@victoriametrics.com>
(cherry picked from commit d4aefcecc4)
2025-08-06 16:14:34 +02:00
Zakhar Bessarab
7982d2aad3 dashboards/vmagent: fix expression for samples rate (#9530)
If vmagent does not scrape any metrics, the left part evaluates to an
empty result, which causes the right part to be skipped.

Before:
<details>
<img width="1401" height="1080" alt="image"
src="https://github.com/user-attachments/assets/c242593f-8503-4bd2-b6a7-85c1dcc54d0f"
/>
</details>

After:
<details>
<img width="1416" height="1128" alt="image"
src="https://github.com/user-attachments/assets/45565c28-a731-4f5d-af54-1ab3daf75778"
/>
</details>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 93c373d55a)
2025-08-06 16:14:34 +02:00
Hui Wang
709d7a7780 vmalert-tool: fix panic when rule execution fails (#9540)
fix https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9526,
bug was introduced in **v1.114.0**.

Please note, the rule execution failure should only happen if there is a
bad template or a duplicated alert (rare case). Added a test case to cover
the template.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 58bc05ce56)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-08-06 16:14:33 +02:00
Roman Khavronenko
2a68529e85 docs: update monitoring section (#9538)
* remove duplicated content between single and cluster versions
* mention recommendation to group component types by jobs in scrape
config
* link the example of scrape configs
* update wording

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 516a454f0a)
2025-08-06 16:14:23 +02:00
Jamie Wiebe
0f19ee2cfa vmui: fix typo in "returned too many series" message (#9533)
A few simple grammar changes on messages presented to the user

(cherry picked from commit 9fd9de7ab4)
2025-08-06 16:14:23 +02:00
Max Kotliar
840d5fed90 docs/changelog: remove mention of latest Docker tag deprecation, clarify stable tag removal 2025-08-05 19:12:42 +03:00
f41gh7
b4ad56a858 docs/cluster: follow-up after 33392e1135
Mention new logNewSeriesAuthKey flag at docs
2025-08-04 17:10:11 +02:00
f41gh7
f628db3e0d docs/changelog: add v1.110.15 and v1.122.1 changes
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-08-04 17:10:10 +02:00
f41gh7
8adf8b051a docs: update LTS releases versions
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-08-04 17:10:10 +02:00
f41gh7
588bb6ab82 docs: mention v1.123.0 release at examples
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-08-04 17:10:10 +02:00
Aliaksandr Valialkin
f63d12a309 lib/fs/fs.go: added missing lock for the diskSpaceMapLock inside MustGetTotalSpace() function
This is a follow-up for 7da45924e2

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9523
Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/513
2025-08-04 10:12:33 +02:00
Aliaksandr Valialkin
7f72f4b819 app/vmstorage: expose vm_total_disk_space_bytes metric, which shows disk volume size for -storageDataPath directory
This metric can be used for building alerts and graphs for free disk space usage percentage by using the following MetricsQL query:

    100 * (vm_free_disk_space_bytes / vm_total_disk_space_bytes)
2025-08-04 10:07:57 +02:00
Phuong Le
0a66ad83a0 lib/fs: Add total disk space retrieval (#9523)
Extends the disk space monitoring functionality by adding support for
retrieving total disk capacity in addition to free space.

Related: https://github.com/VictoriaMetrics/VictoriaLogs/issues/513

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-04 09:58:56 +02:00
Aliaksandr Valialkin
4bd57a455b vendor: run make vendor-update 2025-08-03 22:11:09 +02:00
Aliaksandr Valialkin
14a48bb737 vendor: update github.com/VictoriaMetrics/metrics from v1.38.0 to v1.39.1 2025-08-03 22:11:09 +02:00
1285 changed files with 425173 additions and 12897 deletions

.github/copilot-instructions.md

@@ -0,0 +1,23 @@
# Project Overview
VictoriaMetrics is a fast, cost-saving, and scalable solution for monitoring and managing time series data. It delivers high performance and reliability, making it an ideal choice for businesses of all sizes.
## Folder Structure
- `/app`: Contains the compilable binaries.
- `/lib`: Contains the golang reusable libraries
- `/docs/victoriametrics`: Contains documentation for the project.
- `/apptest/tests`: Contains integration tests.
## Libraries and Frameworks
- Backend: Golang, no framework. Use third-party libraries sparingly.
- Frontend: React.
## Code review guidelines
Ensure the feature or bugfix includes a changelog entry in /docs/victoriametrics/changelog/CHANGELOG.md.
Verify the entry is under the ## tip section and matches the structure and style of existing entries.
Chore-only changes may be omitted from the changelog.

View File

@@ -31,43 +31,39 @@ concurrency:
jobs:
build:
name: Build
name: ${{ matrix.os }}-${{ matrix.arch }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
include:
- os: linux
arch: amd64
- os: linux
arch: arm64
- os: linux
arch: arm
- os: linux
arch: ppc64le
- os: linux
arch: 386
- os: freebsd
arch: amd64
- os: openbsd
arch: amd64
steps:
- name: Free space
run: |
# cleanup up space to free additional ~20GiB of memory
# which are lacking for multiplaform images build
formatByteCount() { echo $(numfmt --to=iec-i --suffix=B --padding=7 $1'000'); }
getAvailableSpace() { echo $(df -a $1 | awk 'NR > 1 {avail+=$4} END {print avail}'); }
BEFORE=$(getAvailableSpace)
sudo rm -rf /usr/local/lib/android || true
sudo rm -rf /usr/share/dotnet || true
sudo rm -rf /opt/ghc || true
sudo rm -rf /usr/local/.ghcup || true
AFTER=$(getAvailableSpace)
SAVED=$((AFTER-BEFORE))
echo "Saved $(formatByteCount $SAVED)"
- name: Code checkout
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: Setup Go
id: go
uses: actions/setup-go@v5
with:
cache-dependency-path: |
go.sum
Makefile
app/**/Makefile
go-version: stable
cache: false
- name: Cache Go artifacts
uses: actions/cache@v4
with:
path: |
~/.cache/go-build
~/go/bin
~/go/pkg/mod
key: go-artifacts-${{ runner.os }}-crossbuild-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-crossbuild-
- name: Run crossbuild
run: make crossbuild
- name: Build vmcluster for ${{ matrix.os }}-${{ matrix.arch }}
run: make vmcluster-${{ matrix.os }}-${{ matrix.arch }}

View File

@@ -29,7 +29,7 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: Set up Go
id: go

View File

@@ -16,12 +16,12 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v4
uses: actions/checkout@v5
with:
path: __vm
- name: Checkout private code
uses: actions/checkout@v4
uses: actions/checkout@v5
with:
repository: VictoriaMetrics/vmdocs
token: ${{ secrets.VM_BOT_GH_TOKEN }}

View File

@@ -1,4 +1,4 @@
name: main
name: test
on:
push:
@@ -25,39 +25,41 @@ concurrency:
cancel-in-progress: true
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
jobs:
lint:
name: lint
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: Setup Go
id: go
uses: actions/setup-go@v5
with:
cache: false
cache-dependency-path: |
go.sum
Makefile
app/**/Makefile
go-version: stable
- name: Cache Go artifacts
- name: Cache golangci-lint
uses: actions/cache@v4
with:
path: |
~/.cache/go-build
~/.cache/golangci-lint
~/go/bin
~/go/pkg/mod
key: go-artifacts-${{ runner.os }}-check-all-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-check-all-
key: golangci-lint-${{ runner.os }}-${{ hashFiles('.golangci.yml') }}
- name: Run check-all
run: |
make check-all
git diff --exit-code
test:
name: test
needs: lint
unit:
name: unit
runs-on: ubuntu-latest
strategy:
@@ -69,25 +71,18 @@ jobs:
steps:
- name: Code checkout
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: Setup Go
id: go
uses: actions/setup-go@v5
with:
cache: false
cache-dependency-path: |
go.sum
Makefile
app/**/Makefile
go-version: stable
- name: Cache Go artifacts
uses: actions/cache@v4
with:
path: |
~/.cache/go-build
~/go/bin
~/go/pkg/mod
key: go-artifacts-${{ runner.os }}-${{ matrix.scenario }}-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-${{ matrix.scenario }}-
- name: Run tests
run: GOGC=10 make ${{ matrix.scenario}}
@@ -96,31 +91,23 @@ jobs:
with:
files: ./coverage.txt
integration-test:
name: integration-test
needs: [lint, test]
integration:
name: integration
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: Setup Go
id: go
uses: actions/setup-go@v5
with:
cache: false
cache-dependency-path: |
go.sum
Makefile
app/**/Makefile
go-version: stable
- name: Cache Go artifacts
uses: actions/cache@v4
with:
path: |
~/.cache/go-build
~/go/bin
~/go/pkg/mod
key: go-artifacts-${{ runner.os }}-${{ matrix.scenario }}-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-${{ matrix.scenario }}-
- name: Run integration tests
run: make integration-test

View File

@@ -32,7 +32,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: Setup Node
uses: actions/setup-node@v4

View File

@@ -12,11 +12,12 @@ PKG_TAG := $(BUILDINFO_TAG)
endif
EXTRA_DOCKER_TAG_SUFFIX ?=
EXTRA_GO_BUILD_TAGS ?=
GO_BUILDINFO = -X '$(PKG_PREFIX)/lib/buildinfo.Version=$(APP_NAME)-$(DATEINFO_TAG)-$(BUILDINFO_TAG)'
TAR_OWNERSHIP ?= --owner=1000 --group=1000
GOLANGCI_LINT_VERSION := 2.2.1
GOLANGCI_LINT_VERSION := 2.4.0
.PHONY: $(MAKECMDGOALS)
@@ -91,8 +92,10 @@ vmcluster-darwin-arm64: \
vmselect-darwin-arm64 \
vmstorage-darwin-arm64
# When adding a new crossbuild target, please also add it to the .github/workflows/build.yml
crossbuild: vmcluster-crossbuild
# When adding a new crossbuild target, please also add it to the .github/workflows/build.yml
vmcluster-crossbuild:
$(MAKE_PARALLEL) vmcluster-linux-amd64 \
vmcluster-linux-arm64 \
@@ -284,16 +287,16 @@ vendor-update:
go mod vendor
app-local:
CGO_ENABLED=1 go build $(RACE) -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
CGO_ENABLED=1 go build $(RACE) -ldflags "$(GO_BUILDINFO)" -tags "$(EXTRA_GO_BUILD_TAGS)" -o bin/$(APP_NAME)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
app-local-pure:
CGO_ENABLED=0 go build $(RACE) -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-pure$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
CGO_ENABLED=0 go build $(RACE) -ldflags "$(GO_BUILDINFO)" -tags "$(EXTRA_GO_BUILD_TAGS)" -o bin/$(APP_NAME)-pure$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
app-local-goos-goarch:
CGO_ENABLED=$(CGO_ENABLED) GOOS=$(GOOS) GOARCH=$(GOARCH) go build $(RACE) -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-$(GOOS)-$(GOARCH)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
CGO_ENABLED=$(CGO_ENABLED) GOOS=$(GOOS) GOARCH=$(GOARCH) go build $(RACE) -ldflags "$(GO_BUILDINFO)" -tags "$(EXTRA_GO_BUILD_TAGS)" -o bin/$(APP_NAME)-$(GOOS)-$(GOARCH)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
app-local-windows-goarch:
CGO_ENABLED=0 GOOS=windows GOARCH=$(GOARCH) go build $(RACE) -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-windows-$(GOARCH)$(RACE).exe $(PKG_PREFIX)/app/$(APP_NAME)
CGO_ENABLED=0 GOOS=windows GOARCH=$(GOARCH) go build $(RACE) -ldflags "$(GO_BUILDINFO)" -tags "$(EXTRA_GO_BUILD_TAGS)" -o bin/$(APP_NAME)-windows-$(GOARCH)$(RACE).exe $(PKG_PREFIX)/app/$(APP_NAME)
quicktemplate-gen: install-qtc
qtc

View File

@@ -8,6 +8,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/firehose"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
@@ -16,9 +17,11 @@ import (
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="opentelemetry"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="opentelemetry"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="opentelemetry"}`)
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="opentelemetry"}`)
metadataInserted = metrics.NewCounter(`vmagent_metadata_inserted_total{type="opentelemetry"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="opentelemetry"}`)
metadataTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_metadata_total{type="opentelemetry"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="opentelemetry"}`)
)
// InsertHandler processes opentelemetry metrics.
@@ -36,12 +39,12 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return fmt.Errorf("json encoding isn't supported for opentelemetry format. Use protobuf encoding")
}
}
return stream.ParseStream(req.Body, encoding, processBody, func(tss []prompb.TimeSeries) error {
return insertRows(at, tss, extraLabels)
return stream.ParseStream(req.Body, encoding, processBody, func(tss []prompb.TimeSeries, mms []prompb.MetricMetadata) error {
return insertRows(at, tss, mms, extraLabels)
})
}
func insertRows(at *auth.Token, tss []prompb.TimeSeries, extraLabels []prompb.Label) error {
func insertRows(at *auth.Token, tss []prompb.TimeSeries, mms []prompb.MetricMetadata, extraLabels []prompb.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -63,14 +66,33 @@ func insertRows(at *auth.Token, tss []prompb.TimeSeries, extraLabels []prompb.La
})
}
ctx.WriteRequest.Timeseries = tssDst
var metadataTotal int
if promscrape.IsMetadataEnabled() {
var accountID, projectID uint32
if at != nil {
accountID = at.AccountID
projectID = at.ProjectID
for i := range mms {
mm := &mms[i]
mm.AccountID = accountID
mm.ProjectID = projectID
}
}
ctx.WriteRequest.Metadata = mms
metadataTotal = len(mms)
}
ctx.Labels = labels
ctx.Samples = samples
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(rowsTotal)
metadataInserted.Add(metadataTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
metadataTenantInserted.Get(at).Add(metadataTotal)
}
rowsPerInsert.Update(float64(rowsTotal))
return nil

View File

@@ -8,6 +8,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
@@ -16,9 +17,12 @@ import (
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="prometheus"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="prometheus"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="prometheus"}`)
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="prometheus"}`)
metadataInserted = metrics.NewCounter(`vmagent_metadata_inserted_total{type="prometheus"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="prometheus"}`)
metadataTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_metadata_total{type="prometheus"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="prometheus"}`)
)
// InsertHandler processes `/api/v1/import/prometheus` request.
@@ -32,18 +36,19 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return err
}
encoding := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, defaultTimestamp, encoding, true, func(rows []prometheus.Row) error {
return insertRows(at, rows, extraLabels)
return stream.Parse(req.Body, defaultTimestamp, encoding, true, promscrape.IsMetadataEnabled(), func(rows []prometheus.Row, mms []prometheus.Metadata) error {
return insertRows(at, rows, mms, extraLabels)
}, func(s string) {
httpserver.LogError(req, s)
})
}
func insertRows(at *auth.Token, rows []prometheus.Row, extraLabels []prompb.Label) error {
func insertRows(at *auth.Token, rows []prometheus.Row, mms []prometheus.Metadata, extraLabels []prompb.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
tssDst := ctx.WriteRequest.Timeseries[:0]
mmsDst := ctx.WriteRequest.Metadata[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
@@ -70,15 +75,35 @@ func insertRows(at *auth.Token, rows []prometheus.Row, extraLabels []prompb.Labe
Samples: samples[len(samples)-1:],
})
}
var accountID, projectID uint32
if at != nil {
accountID = at.AccountID
projectID = at.ProjectID
}
for i := range mms {
mm := &mms[i]
mmsDst = append(mmsDst, prompb.MetricMetadata{
MetricFamilyName: mm.Metric,
Help: mm.Help,
Type: mm.Type,
// there is no unit in Prometheus exposition formats
AccountID: accountID,
ProjectID: projectID,
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.WriteRequest.Metadata = mmsDst
ctx.Labels = labels
ctx.Samples = samples
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(len(rows))
metadataInserted.Add(len(mms))
if at != nil {
rowsTenantInserted.Get(at).Add(len(rows))
metadataTenantInserted.Get(at).Add(len(mms))
}
rowsPerInsert.Update(float64(len(rows)))
return nil

View File

@@ -7,6 +7,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
@@ -14,9 +15,11 @@ import (
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="promremotewrite"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="promremotewrite"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="promremotewrite"}`)
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="promremotewrite"}`)
metadataInserted = metrics.NewCounter(`vmagent_metadata_inserted_total{type="promremotewrite"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="promremotewrite"}`)
metadataTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_metadata_total{type="promremotewrite"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="promremotewrite"}`)
)
// InsertHandler processes remote write for prometheus.
@@ -26,17 +29,18 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return err
}
isVMRemoteWrite := req.Header.Get("Content-Encoding") == "zstd"
return stream.Parse(req.Body, isVMRemoteWrite, func(tss []prompb.TimeSeries) error {
return insertRows(at, tss, extraLabels)
return stream.Parse(req.Body, isVMRemoteWrite, func(tss []prompb.TimeSeries, mms []prompb.MetricMetadata) error {
return insertRows(at, tss, mms, extraLabels)
})
}
func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, extraLabels []prompb.Label) error {
func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, mms []prompb.MetricMetadata, extraLabels []prompb.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.WriteRequest.Timeseries[:0]
mmsDst := ctx.WriteRequest.Metadata[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range timeseries {
@@ -65,6 +69,30 @@ func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, extraLabels []pr
})
}
ctx.WriteRequest.Timeseries = tssDst
var metadataTotal int
if promscrape.IsMetadataEnabled() {
var accountID, projectID uint32
if at != nil {
accountID = at.AccountID
projectID = at.ProjectID
}
for i := range mms {
mm := &mms[i]
mmsDst = append(mmsDst, prompb.MetricMetadata{
MetricFamilyName: mm.MetricFamilyName,
Help: mm.Help,
Type: mm.Type,
Unit: mm.Unit,
AccountID: accountID,
ProjectID: projectID,
})
}
ctx.WriteRequest.Metadata = mmsDst
metadataTotal = len(mms)
}
ctx.Labels = labels
ctx.Samples = samples
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
@@ -73,7 +101,9 @@ func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, extraLabels []pr
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
metadataTenantInserted.Get(at).Add(metadataTotal)
}
metadataInserted.Add(metadataTotal)
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

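The metadata handling in the hunks above follows one pattern: reuse the destination slice via `[:0]`, copy each entry, and stamp it with the tenant IDs taken from the auth token. Below is a minimal sketch of that pattern, not the actual vmagent code; the `tenant`, `metadata`, and `pushCtx` types and the `fill` method are hypothetical stand-ins for `auth.Token`, `prompb.MetricMetadata`, and the push context.

```go
package main

import "fmt"

// tenant is a hypothetical stand-in for auth.Token.
type tenant struct {
	AccountID uint32
	ProjectID uint32
}

// metadata is a hypothetical, reduced stand-in for prompb.MetricMetadata.
type metadata struct {
	MetricFamilyName string
	Help             string
	AccountID        uint32
	ProjectID        uint32
}

// pushCtx is a hypothetical stand-in for the reusable push context.
type pushCtx struct {
	Metadata []metadata
}

// fill copies src into ctx.Metadata, reusing the already allocated capacity,
// and stamps every copied entry with the tenant IDs (zero when t is nil).
// It returns the number of copied entries. src itself is left untouched.
func (ctx *pushCtx) fill(src []metadata, t *tenant) int {
	var accountID, projectID uint32
	if t != nil {
		accountID = t.AccountID
		projectID = t.ProjectID
	}
	dst := ctx.Metadata[:0]
	for i := range src {
		m := &src[i]
		dst = append(dst, metadata{
			MetricFamilyName: m.MetricFamilyName,
			Help:             m.Help,
			AccountID:        accountID,
			ProjectID:        projectID,
		})
	}
	ctx.Metadata = dst
	return len(dst)
}

func main() {
	ctx := &pushCtx{}
	src := []metadata{{MetricFamilyName: "http_requests_total", Help: "Total requests."}}
	n := ctx.fill(src, &tenant{AccountID: 1, ProjectID: 2})
	fmt.Println(n, ctx.Metadata[0].AccountID, ctx.Metadata[0].ProjectID)
}
```

Copying into a separate destination keeps the parser-owned input unchanged, which matters when the same input slice may be reused by the caller.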
View File

@@ -463,12 +463,6 @@ again:
// - Real-world implementations of v1 use both 400 and 415 status codes.
// See more in research: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8462#issuecomment-2786918054
case 415, 400:
if c.canDowngradeVMProto.Swap(false) {
logger.Infof("received unsupported media type or bad request from remote storage at %q. Downgrading protocol from VictoriaMetrics to Prometheus remote write for all future requests. "+
"See https://docs.victoriametrics.com/victoriametrics/vmagent/#victoriametrics-remote-write-protocol", c.sanitizedURL)
c.useVMProto.Store(false)
}
if encoding.IsZstd(block) {
logger.Infof("received unsupported media type or bad request from remote storage at %q. Re-packing the block to Prometheus remote write and retrying. "+
"See https://docs.victoriametrics.com/victoriametrics/vmagent/#victoriametrics-remote-write-protocol", c.sanitizedURL)

View File

@@ -24,9 +24,10 @@ import (
var (
flushInterval = flag.Duration("remoteWrite.flushInterval", time.Second, "Interval for flushing the data to remote storage. "+
"This option takes effect only when less than 10K data points per second are pushed to -remoteWrite.url")
"This option takes effect only when less than -remoteWrite.maxRowsPerBlock data points per -remoteWrite.flushInterval are pushed to -remoteWrite.url")
maxUnpackedBlockSize = flagutil.NewBytes("remoteWrite.maxBlockSize", 8*1024*1024, "The maximum block size to send to remote storage. Bigger blocks may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxRowsPerBlock")
maxRowsPerBlock = flag.Int("remoteWrite.maxRowsPerBlock", 10000, "The maximum number of samples to send in each block to remote storage. Higher number may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxBlockSize")
maxMetadataPerBlock = flag.Int("remoteWrite.maxMetadataPerBlock", 5000, "The maximum number of metadata entries to send in each block to remote storage. A higher number may improve performance at the cost of increased memory usage. See also -remoteWrite.maxBlockSize")
vmProtoCompressLevel = flag.Int("remoteWrite.vmProtoCompressLevel", 0, "The compression level for VictoriaMetrics remote write protocol. "+
"Higher values reduce network traffic at the cost of higher CPU usage. Negative values reduce CPU usage at the cost of increased network traffic. "+
"See https://docs.victoriametrics.com/victoriametrics/vmagent/#victoriametrics-remote-write-protocol")
@@ -60,9 +61,16 @@ func (ps *pendingSeries) MustStop() {
ps.periodicFlusherWG.Wait()
}
func (ps *pendingSeries) TryPush(tss []prompb.TimeSeries) bool {
func (ps *pendingSeries) TryPushTimeSeries(tss []prompb.TimeSeries) bool {
ps.mu.Lock()
ok := ps.wr.tryPush(tss)
ok := ps.wr.tryPushTimeSeries(tss)
ps.mu.Unlock()
return ok
}
func (ps *pendingSeries) TryPushMetadata(mms []prompb.MetricMetadata) bool {
ps.mu.Lock()
ok := ps.wr.tryPushMetadata(mms)
ps.mu.Unlock()
return ok
}
@@ -111,26 +119,34 @@ type writeRequest struct {
wr prompb.WriteRequest
tss []prompb.TimeSeries
mms []prompb.MetricMetadata
labels []prompb.Label
samples []prompb.Sample
// buf holds labels data
buf []byte
// metadatabuf holds metadata data
metadatabuf []byte
}
func (wr *writeRequest) reset() {
// Do not reset lastFlushTime, fq, isVMRemoteWrite, significantFigures and roundDigits, since they are reused.
wr.wr.Timeseries = nil
wr.wr.Metadata = nil
clear(wr.tss)
wr.tss = wr.tss[:0]
clear(wr.mms)
wr.mms = wr.mms[:0]
promrelabel.CleanLabels(wr.labels)
wr.labels = wr.labels[:0]
wr.samples = wr.samples[:0]
wr.buf = wr.buf[:0]
wr.metadatabuf = wr.metadatabuf[:0]
}
// mustFlushOnStop force pushes wr data into wr.fq
@@ -138,6 +154,7 @@ func (wr *writeRequest) reset() {
// This is needed in order to properly save in-memory data to persistent queue on graceful shutdown.
func (wr *writeRequest) mustFlushOnStop() {
wr.wr.Timeseries = wr.tss
wr.wr.Metadata = wr.mms
if !tryPushWriteRequest(&wr.wr, wr.mustWriteBlock, wr.isVMRemoteWrite.Load()) {
logger.Panicf("BUG: final flush must always return true")
}
@@ -151,6 +168,7 @@ func (wr *writeRequest) mustWriteBlock(block []byte) bool {
func (wr *writeRequest) tryFlush() bool {
wr.wr.Timeseries = wr.tss
wr.wr.Metadata = wr.mms
wr.lastFlushTime.Store(fasttime.UnixTimestamp())
if !tryPushWriteRequest(&wr.wr, wr.fq.TryWriteBlock, wr.isVMRemoteWrite.Load()) {
return false
@@ -174,7 +192,49 @@ func adjustSampleValues(samples []prompb.Sample, significantFigures, roundDigits
}
}
func (wr *writeRequest) tryPush(src []prompb.TimeSeries) bool {
func (wr *writeRequest) tryPushMetadata(mms []prompb.MetricMetadata) bool {
mmdDst := wr.mms
maxMetadataPerBlock := *maxMetadataPerBlock
for i := range mms {
if len(wr.mms) >= maxMetadataPerBlock {
if !wr.tryFlush() {
return false
}
mmdDst = wr.mms
}
mmSrc := &mms[i]
mmdDst = append(mmdDst, prompb.MetricMetadata{})
wr.copyMetadata(&mmdDst[len(mmdDst)-1], mmSrc)
}
wr.mms = mmdDst
return true
}
func (wr *writeRequest) copyMetadata(dst, src *prompb.MetricMetadata) {
// Direct copy for non-string fields, which are safe by value.
dst.Type = src.Type
dst.Unit = src.Unit
// Pre-allocate memory for all string fields.
neededBufLen := len(src.MetricFamilyName) + len(src.Help)
bufLen := len(wr.metadatabuf)
wr.metadatabuf = slicesutil.SetLength(wr.metadatabuf, bufLen+neededBufLen)
buf := wr.metadatabuf[:bufLen]
// Copy MetricFamilyName
bufLen = len(buf)
buf = append(buf, src.MetricFamilyName...)
dst.MetricFamilyName = bytesutil.ToUnsafeString(buf[bufLen:])
// Copy Help
bufLen = len(buf)
buf = append(buf, src.Help...)
dst.Help = bytesutil.ToUnsafeString(buf[bufLen:])
wr.metadatabuf = buf
}
func (wr *writeRequest) tryPushTimeSeries(src []prompb.TimeSeries) bool {
tssDst := wr.tss
maxSamplesPerBlock := *maxRowsPerBlock
// Allow up to 10x of labels per each block on average.
@@ -241,7 +301,7 @@ func (wr *writeRequest) copyTimeSeries(dst, src *prompb.TimeSeries) {
var marshalConcurrencyCh = make(chan struct{}, cgroup.AvailableCPUs())
func tryPushWriteRequest(wr *prompb.WriteRequest, tryPushBlock func(block []byte) bool, isVMRemoteWrite bool) bool {
if len(wr.Timeseries) == 0 {
if wr.IsEmpty() {
// Nothing to push
return true
}
@@ -267,6 +327,7 @@ func tryPushWriteRequest(wr *prompb.WriteRequest, tryPushBlock func(block []byte
compressBufPool.Put(zb)
if ok {
blockSizeRows.Update(float64(len(wr.Timeseries)))
blockMetadataRows.Update(float64(len(wr.Metadata)))
blockSizeBytes.Update(float64(zbLen))
}
return ok
@@ -278,47 +339,86 @@ func tryPushWriteRequest(wr *prompb.WriteRequest, tryPushBlock func(block []byte
<-marshalConcurrencyCh
}
// Too big block. Recursively split it into smaller parts if possible.
if len(wr.Timeseries) == 1 {
// A single time series left. Recursively split its samples into smaller parts if possible.
// Split timeseries or metadata into two smaller blocks
switch len(wr.Timeseries) {
case 0:
if len(wr.Metadata) == 1 {
logger.Warnf("dropping a metadata exceeding -remoteWrite.maxBlockSize=%d bytes", maxUnpackedBlockSize.N)
return true
}
metadata := wr.Metadata
n := len(metadata) / 2
wr.Metadata = metadata[:n]
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Metadata = metadata
return false
}
wr.Metadata = metadata[n:]
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Metadata = metadata
return false
}
wr.Metadata = metadata
return true
case 1:
// A single time series left. Recursively split its samples and metadata into smaller parts if possible.
samples := wr.Timeseries[0].Samples
if len(samples) == 1 {
logger.Warnf("dropping a sample for metric with too long labels exceeding -remoteWrite.maxBlockSize=%d bytes", maxUnpackedBlockSize.N)
metaData := wr.Metadata
if len(samples) == 1 && len(metaData) <= 1 {
logger.Warnf("dropping a sample for a metric and %d metadata entries, which exceed -remoteWrite.maxBlockSize=%d bytes", len(metaData), maxUnpackedBlockSize.N)
return true
}
n := len(samples) / 2
m := len(metaData) / 2
wr.Timeseries[0].Samples = samples[:n]
wr.Metadata = metaData[:m]
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Timeseries[0].Samples = samples
wr.Metadata = metaData
return false
}
wr.Timeseries[0].Samples = samples[n:]
wr.Metadata = metaData[m:]
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Timeseries[0].Samples = samples
wr.Metadata = metaData
return false
}
wr.Timeseries[0].Samples = samples
wr.Metadata = metaData
return true
default:
// Split both timeseries and metadata.
timeseries := wr.Timeseries
metaData := wr.Metadata
n := len(timeseries) / 2
m := len(metaData) / 2
wr.Timeseries = timeseries[:n]
wr.Metadata = metaData[:m]
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Timeseries = timeseries
wr.Metadata = metaData
return false
}
wr.Timeseries = timeseries[n:]
wr.Metadata = metaData[m:]
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Timeseries = timeseries
wr.Metadata = metaData
return false
}
wr.Timeseries = timeseries
wr.Metadata = metaData
return true
}
timeseries := wr.Timeseries
n := len(timeseries) / 2
wr.Timeseries = timeseries[:n]
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Timeseries = timeseries
return false
}
wr.Timeseries = timeseries[n:]
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Timeseries = timeseries
return false
}
wr.Timeseries = timeseries
return true
}
var (
blockSizeBytes = metrics.NewHistogram(`vmagent_remotewrite_block_size_bytes`)
blockSizeRows = metrics.NewHistogram(`vmagent_remotewrite_block_size_rows`)
blockSizeBytes = metrics.NewHistogram(`vmagent_remotewrite_block_size_bytes`)
blockSizeRows = metrics.NewHistogram(`vmagent_remotewrite_block_size_rows`)
blockMetadataRows = metrics.NewHistogram(`vmagent_remotewrite_block_metadata_rows`)
)
var (

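The oversized-block handling in `tryPushWriteRequest` above is a recursive halving: try the whole block, and if it is too big, split it in two and retry each half until a single entry is left, which is then dropped. The following is a simplified sketch of that idea only. It works on plain strings instead of `prompb.WriteRequest`, and `pushSplit`, `blockSize`, and `maxBlockBytes` are hypothetical names, with `maxBlockBytes` standing in for `-remoteWrite.maxBlockSize`.

```go
package main

import "fmt"

// maxBlockBytes is a hypothetical limit standing in for -remoteWrite.maxBlockSize.
const maxBlockBytes = 16

// blockSize returns the total payload size of items in bytes.
func blockSize(items []string) int {
	size := 0
	for _, s := range items {
		size += len(s)
	}
	return size
}

// pushSplit tries to send items as a single block. If the block exceeds
// maxBlockBytes, it splits items in half and retries each half recursively.
// A single item that still exceeds the limit is dropped and counted in dropped.
// It returns false as soon as send reports a failure.
func pushSplit(items []string, send func(block []string) bool, dropped *int) bool {
	if len(items) == 0 {
		// Nothing to push.
		return true
	}
	if blockSize(items) <= maxBlockBytes {
		return send(items)
	}
	if len(items) == 1 {
		// Cannot split any further, so drop the oversized item.
		*dropped++
		return true
	}
	n := len(items) / 2
	if !pushSplit(items[:n], send, dropped) {
		return false
	}
	return pushSplit(items[n:], send, dropped)
}

func main() {
	var sent [][]string
	dropped := 0
	items := []string{"aaaa", "bbbbbbbb", "cccccccccccccccccccc", "dd", "eeeeee"}
	ok := pushSplit(items, func(block []string) bool {
		sent = append(sent, block)
		return true
	}, &dropped)
	fmt.Println(ok, len(sent), dropped)
}
```

Unlike the code in the diff, this sketch never mutates its input, so it has no need to restore the original slices on failure.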
View File

@@ -209,7 +209,7 @@ func Init() {
// In this case it is impossible to prevent from sending many duplicates of samples passed to TryPush() to all the configured -remoteWrite.url
// if these samples couldn't be sent to the -remoteWrite.url with the disabled persistent queue. So it is better sending samples
// to the remaining -remoteWrite.url and dropping them on the blocked queue.
dropSamplesOnFailureGlobal = *dropSamplesOnOverload || disableOnDiskQueueAny && len(disableOnDiskQueues) > 1
dropSamplesOnFailureGlobal = *dropSamplesOnOverload || disableOnDiskQueueAny && len(*remoteWriteURLs) > 1
dropDanglingQueues()
@@ -388,13 +388,7 @@ func TryPush(at *auth.Token, wr *prompb.WriteRequest) bool {
func tryPush(at *auth.Token, wr *prompb.WriteRequest, forceDropSamplesOnFailure bool) bool {
tss := wr.Timeseries
var tenantRctx *relabelCtx
if at != nil {
// Convert at to (vm_account_id, vm_project_id) labels.
tenantRctx = getRelabelCtx()
defer putRelabelCtx(tenantRctx)
}
mms := wr.Metadata
// Quick check whether writes to configured remote storage systems are blocked.
// This allows saving CPU time spent on relabeling and block compression
@@ -411,6 +405,23 @@ func tryPush(at *auth.Token, wr *prompb.WriteRequest, forceDropSamplesOnFailure
return true
}
// Push metadata separately from time series, since it doesn't need sharding,
// relabeling, stream aggregation, deduplication, etc.
if !tryPushMetadataToRemoteStorages(rwctxs, mms, forceDropSamplesOnFailure) {
return false
}
if len(tss) == 0 {
return true
}
var tenantRctx *relabelCtx
if at != nil {
// Convert at to (vm_account_id, vm_project_id) labels.
tenantRctx = getRelabelCtx()
defer putRelabelCtx(tenantRctx)
}
var rctx *relabelCtx
rcs := allRelabelConfigs.Load()
pcsGlobal := rcs.global
@@ -481,7 +492,7 @@ func tryPush(at *auth.Token, wr *prompb.WriteRequest, forceDropSamplesOnFailure
deduplicatorGlobal.Push(tssBlock)
tssBlock = tssBlock[:0]
}
if !tryPushBlockToRemoteStorages(rwctxs, tssBlock, forceDropSamplesOnFailure) {
if !tryPushTimeSeriesToRemoteStorages(rwctxs, tssBlock, forceDropSamplesOnFailure) {
return false
}
}
@@ -520,18 +531,49 @@ func getEligibleRemoteWriteCtxs(tss []prompb.TimeSeries, forceDropSamplesOnFailu
return rwctxs, true
}
func pushToRemoteStoragesTrackDropped(tss []prompb.TimeSeries) {
func pushTimeSeriesToRemoteStoragesTrackDropped(tss []prompb.TimeSeries) {
rwctxs, _ := getEligibleRemoteWriteCtxs(tss, true)
if len(rwctxs) == 0 {
return
}
if !tryPushBlockToRemoteStorages(rwctxs, tss, true) {
logger.Panicf("BUG: tryPushBlockToRemoteStorages() must return true when forceDropSamplesOnFailure=true")
if !tryPushTimeSeriesToRemoteStorages(rwctxs, tss, true) {
logger.Panicf("BUG: tryPushTimeSeriesToRemoteStorages() must return true when forceDropSamplesOnFailure=true")
}
}
func tryPushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompb.TimeSeries, forceDropSamplesOnFailure bool) bool {
func tryPushMetadataToRemoteStorages(rwctxs []*remoteWriteCtx, mms []prompb.MetricMetadata, forceDropSamplesOnFailure bool) bool {
if len(mms) == 0 {
// Nothing to push
return true
}
// Do not shard metadata even if -remoteWrite.shardByURL is set, just replicate it among rwctxs,
// since metadata is usually small and there is no guarantee that metadata can be sent to
// the same remote storage as the corresponding metrics.
//
// Push metadata to remote storage systems in parallel to reduce
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
var anyPushFailed atomic.Bool
for _, rwctx := range rwctxs {
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
if !rwctx.tryPushMetadataInternal(mms) {
rwctx.pushFailures.Inc()
if forceDropSamplesOnFailure {
rwctx.metadataDroppedOnPushFailure.Add(len(mms))
return
}
anyPushFailed.Store(true)
}
}(rwctx)
}
wg.Wait()
return !anyPushFailed.Load()
}
func tryPushTimeSeriesToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompb.TimeSeries, forceDropSamplesOnFailure bool) bool {
if len(tssBlock) == 0 {
// Nothing to push
return true
@@ -539,7 +581,7 @@ func tryPushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompb.Ti
if len(rwctxs) == 1 {
// Fast path - just push data to the configured single remote storage
return rwctxs[0].TryPush(tssBlock, forceDropSamplesOnFailure)
return rwctxs[0].TryPushTimeSeries(tssBlock, forceDropSamplesOnFailure)
}
// We need to push tssBlock to multiple remote storages.
@@ -550,11 +592,11 @@ func tryPushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompb.Ti
if replicas <= 0 {
replicas = 1
}
return tryShardingBlockAmongRemoteStorages(rwctxs, tssBlock, replicas, forceDropSamplesOnFailure)
return tryShardingTimeSeriesAmongRemoteStorages(rwctxs, tssBlock, replicas, forceDropSamplesOnFailure)
}
// Replicate tssBlock samples among rwctxs.
// Push tssBlock to remote storage systems in parallel in order to reduce
// Push tssBlock to remote storage systems in parallel to reduce
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
@@ -562,7 +604,7 @@ func tryPushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompb.Ti
for _, rwctx := range rwctxs {
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
if !rwctx.TryPush(tssBlock, forceDropSamplesOnFailure) {
if !rwctx.TryPushTimeSeries(tssBlock, forceDropSamplesOnFailure) {
anyPushFailed.Store(true)
}
}(rwctx)
@@ -571,7 +613,7 @@ func tryPushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompb.Ti
return !anyPushFailed.Load()
}
func tryShardingBlockAmongRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompb.TimeSeries, replicas int, forceDropSamplesOnFailure bool) bool {
func tryShardingTimeSeriesAmongRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompb.TimeSeries, replicas int, forceDropSamplesOnFailure bool) bool {
x := getTSSShards(len(rwctxs))
defer putTSSShards(x)
@@ -590,7 +632,7 @@ func tryShardingBlockAmongRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []pr
wg.Add(1)
go func(rwctx *remoteWriteCtx, tss []prompb.TimeSeries) {
defer wg.Done()
if !rwctx.TryPush(tss, forceDropSamplesOnFailure) {
if !rwctx.TryPushTimeSeries(tss, forceDropSamplesOnFailure) {
anyPushFailed.Store(true)
}
}(rwctx, shard)
@@ -797,8 +839,9 @@ type remoteWriteCtx struct {
rowsPushedAfterRelabel *metrics.Counter
rowsDroppedByRelabel *metrics.Counter
pushFailures *metrics.Counter
rowsDroppedOnPushFailure *metrics.Counter
pushFailures *metrics.Counter
metadataDroppedOnPushFailure *metrics.Counter
rowsDroppedOnPushFailure *metrics.Counter
}
func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks int, sanitizedURL string) *remoteWriteCtx {
@@ -862,8 +905,9 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
rowsPushedAfterRelabel: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_rows_pushed_after_relabel_total{path=%q,url=%q}`, queuePath, sanitizedURL)),
rowsDroppedByRelabel: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_relabel_metrics_dropped_total{path=%q,url=%q}`, queuePath, sanitizedURL)),
pushFailures: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_push_failures_total{path=%q,url=%q}`, queuePath, sanitizedURL)),
rowsDroppedOnPushFailure: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_samples_dropped_total{path=%q,url=%q}`, queuePath, sanitizedURL)),
pushFailures: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_push_failures_total{path=%q,url=%q}`, queuePath, sanitizedURL)),
metadataDroppedOnPushFailure: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_metadata_dropped_total{path=%q,url=%q}`, queuePath, sanitizedURL)),
rowsDroppedOnPushFailure: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_samples_dropped_total{path=%q,url=%q}`, queuePath, sanitizedURL)),
}
rwctx.initStreamAggrConfig()
@@ -897,10 +941,10 @@ func (rwctx *remoteWriteCtx) MustStop() {
rwctx.rowsDroppedByRelabel = nil
}
// TryPush sends tss series to the configured remote write endpoint
// TryPushTimeSeries sends tss series to the configured remote write endpoint
//
// TryPush doesn't modify tss, so tss can be passed concurrently to TryPush across distinct rwctx instances.
func (rwctx *remoteWriteCtx) TryPush(tss []prompb.TimeSeries, forceDropSamplesOnFailure bool) bool {
// TryPushTimeSeries doesn't modify tss, so tss can be passed concurrently to TryPushTimeSeries across distinct rwctx instances.
func (rwctx *remoteWriteCtx) TryPushTimeSeries(tss []prompb.TimeSeries, forceDropSamplesOnFailure bool) bool {
var rctx *relabelCtx
var v *[]prompb.TimeSeries
defer func() {
@@ -953,7 +997,7 @@ func (rwctx *remoteWriteCtx) TryPush(tss []prompb.TimeSeries, forceDropSamplesOn
}
// Try pushing tss to remote storage
if rwctx.tryPushInternal(tss) {
if rwctx.tryPushTimeSeriesInternal(tss) {
return true
}
@@ -985,7 +1029,7 @@ func dropAggregatedSeries(src []prompb.TimeSeries, matchIdxs []byte, dropInput b
}
func (rwctx *remoteWriteCtx) pushInternalTrackDropped(tss []prompb.TimeSeries) {
if rwctx.tryPushInternal(tss) {
if rwctx.tryPushTimeSeriesInternal(tss) {
return
}
if !rwctx.fq.IsPersistentQueueDisabled() {
@@ -996,7 +1040,14 @@ func (rwctx *remoteWriteCtx) pushInternalTrackDropped(tss []prompb.TimeSeries) {
rwctx.rowsDroppedOnPushFailure.Add(rowsCount)
}
func (rwctx *remoteWriteCtx) tryPushInternal(tss []prompb.TimeSeries) bool {
func (rwctx *remoteWriteCtx) tryPushMetadataInternal(mms []prompb.MetricMetadata) bool {
pss := rwctx.pss
idx := rwctx.pssNextIdx.Add(1) % uint64(len(pss))
return pss[idx].TryPushMetadata(mms)
}
func (rwctx *remoteWriteCtx) tryPushTimeSeriesInternal(tss []prompb.TimeSeries) bool {
var rctx *relabelCtx
var v *[]prompb.TimeSeries
defer func() {
@@ -1020,7 +1071,7 @@ func (rwctx *remoteWriteCtx) tryPushInternal(tss []prompb.TimeSeries) bool {
pss := rwctx.pss
idx := rwctx.pssNextIdx.Add(1) % uint64(len(pss))
return pss[idx].TryPush(tss)
return pss[idx].TryPushTimeSeries(tss)
}
var tssPool = &sync.Pool{

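`tryPushMetadataToRemoteStorages` above replicates the same metadata to every remote write context in parallel and reports failure only when a push fails and dropping is not allowed. A minimal sketch of that fan-out pattern, using `sync.WaitGroup` and `atomic.Bool`, could look like this. The `target` type and the `replicate` function are hypothetical and only approximate the behavior shown in the diff.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// target is a hypothetical stand-in for a remote write destination.
type target struct {
	name    string
	accept  bool         // whether pushes to this target succeed
	dropped atomic.Int64 // number of items dropped on push failure
}

// tryPush pretends to enqueue items and reports whether it succeeded.
func (t *target) tryPush(items []string) bool {
	return t.accept
}

// replicate pushes the same items to all targets in parallel.
// When dropOnFailure is set, a failed push is counted as dropped and
// does not fail the whole call. Otherwise any failure makes it return false.
func replicate(targets []*target, items []string, dropOnFailure bool) bool {
	if len(items) == 0 {
		// Nothing to push.
		return true
	}
	var wg sync.WaitGroup
	var anyFailed atomic.Bool
	for _, t := range targets {
		wg.Add(1)
		go func(t *target) {
			defer wg.Done()
			if t.tryPush(items) {
				return
			}
			if dropOnFailure {
				t.dropped.Add(int64(len(items)))
				return
			}
			anyFailed.Store(true)
		}(t)
	}
	wg.Wait()
	return !anyFailed.Load()
}

func main() {
	targets := []*target{
		{name: "a", accept: true},
		{name: "b", accept: false},
	}
	items := []string{"m1", "m2"}
	fmt.Println(replicate(targets, items, false))
	fmt.Println(replicate(targets, items, true), targets[1].dropped.Load())
}
```

The items are only read by the goroutines, so sharing the same slice across targets is safe here.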
View File

@@ -106,7 +106,7 @@ func TestRemoteWriteContext_TryPush_ImmutableTimeseries(t *testing.T) {
// copy inputTss to make sure it is not mutated during TryPush call
copy(expectedTss, inputTss)
if !rwctx.TryPush(inputTss, false) {
if !rwctx.TryPushTimeSeries(inputTss, false) {
t.Fatalf("cannot push samples to rwctx")
}

View File

@@ -141,7 +141,7 @@ func initStreamAggrConfigGlobal() {
}
dedupInterval := *streamAggrGlobalDedupInterval
if dedupInterval > 0 {
deduplicatorGlobal = streamaggr.NewDeduplicator(pushToRemoteStoragesTrackDropped, *streamAggrGlobalEnableWindows, dedupInterval, *streamAggrGlobalDropInputLabels, "dedup-global")
deduplicatorGlobal = streamaggr.NewDeduplicator(pushTimeSeriesToRemoteStoragesTrackDropped, *streamAggrGlobalEnableWindows, dedupInterval, *streamAggrGlobalDropInputLabels, "dedup-global")
}
}
@@ -216,7 +216,7 @@ func newStreamAggrConfigGlobal() (*streamaggr.Aggregators, error) {
EnableWindows: *streamAggrGlobalEnableWindows,
}
sas, err := streamaggr.LoadFromFile(path, pushToRemoteStoragesTrackDropped, opts, "global")
sas, err := streamaggr.LoadFromFile(path, pushTimeSeriesToRemoteStoragesTrackDropped, opts, "global")
if err != nil {
return nil, fmt.Errorf("cannot load -streamAggr.config=%q: %w", *streamAggrGlobalConfig, err)
}

View File

@@ -295,10 +295,7 @@ func parse(files map[string][]byte, validateTplFn ValidateTplFn, validateExpress
}
func parseConfig(data []byte) ([]Group, error) {
data, err := envtemplate.ReplaceBytes(data)
if err != nil {
return nil, fmt.Errorf("cannot expand environment vars: %w", err)
}
data = envtemplate.ReplaceBytes(data)
var result []Group
type cfgFile struct {
@@ -310,13 +307,13 @@ func parseConfig(data []byte) ([]Group, error) {
decoder := yaml.NewDecoder(bytes.NewReader(data))
for {
var cf cfgFile
if err = decoder.Decode(&cf); err != nil {
if err := decoder.Decode(&cf); err != nil {
if err == io.EOF { // EOF indicates no more documents to read
break
}
return nil, err
}
if err = checkOverflow(cf.XXX, "config"); err != nil {
if err := checkOverflow(cf.XXX, "config"); err != nil {
return nil, err
}
result = append(result, cf.Groups...)

View File

@@ -182,7 +182,7 @@ func (rw *rwServer) handler(w http.ResponseWriter, r *http.Request) {
rw.err(w, fmt.Errorf("decode err: %w", err))
return
}
wru := &prompb.WriteRequestUnmarshaller{}
wru := &prompb.WriteRequestUnmarshaler{}
wr, err := wru.UnmarshalProtobuf(b)
if err != nil {
rw.err(w, fmt.Errorf("unmarshal err: %w", err))

View File

@@ -28,8 +28,8 @@ var (
"Defines how many retries to make before giving up on rule if request for it returns an error.")
disableProgressBar = flag.Bool("replay.disableProgressBar", false, "Whether to disable rendering progress bars during the replay. "+
"Progress bar rendering might be verbose or break the logs parsing, so it is recommended to be disabled when not used in interactive mode.")
ruleEvaluationConcurrency = flag.Int("replay.ruleEvaluationConcurrency", 1, "The maximum number of concurrent `/query_range` requests for a single rule. "+
"Increasing this value when replaying for a long time and a single request range is limited by `-replay.maxDatapointsPerQuery`.")
ruleEvaluationConcurrency = flag.Int("replay.ruleEvaluationConcurrency", 1, "The maximum number of concurrent '/query_range' requests when replaying a recording rule or an alerting rule with for=0. "+
"Increase this value when replaying a long time range, since each request is limited by -replay.maxDatapointsPerQuery.")
)
func replay(groupsCfg []config.Group, qb datasource.QuerierBuilder, rw remotewrite.RWClient) (totalRows, droppedRows int, err error) {

View File

@@ -246,24 +246,33 @@ func TestReplay(t *testing.T) {
// multiple rules + rule concurrency + group concurrency
f("2021-01-01T12:00:00.000Z", "2021-01-01T12:02:30.000Z", 1, 3, 0, []config.Group{
{Rules: []config.Rule{{Alert: "foo-group-single-concurrent", Expr: "sum(up) > 1"}, {Alert: "bar-group-single-concurrent", Expr: "max(up) < 1"}}, Concurrency: 2}}, &fakeReplayQuerier{
{Rules: []config.Rule{{Alert: "foo-group-single-concurrent", For: promutil.NewDuration(30 * time.Second), Expr: "sum(up) > 1"}, {Alert: "bar-group-single-concurrent", Expr: "max(up) < 1"}}, Concurrency: 2}}, &fakeReplayQuerier{
registry: map[string]map[string][]datasource.Metric{
"sum(up) > 1": {
"12:00:00+12:01:00": {},
"12:01:00+12:02:00": {{
Timestamps: []int64{1},
"12:00:00+12:01:00": {{
Timestamps: []int64{1609502460},
Values: []float64{1},
}},
"12:01:00+12:02:00": {{
Timestamps: []int64{1609502520},
Values: []float64{1},
}},
"12:02:00+12:02:30": {{
Timestamps: []int64{1609502580},
Values: []float64{1},
}},
"12:02:00+12:02:30": {},
},
"max(up) < 1": {
"12:00:00+12:01:00": {},
"12:00:00+12:01:00": {{
Timestamps: []int64{1609502460},
Values: []float64{1},
}},
"12:01:00+12:02:00": {{
Timestamps: []int64{1},
Timestamps: []int64{1609502520},
Values: []float64{1},
}},
"12:02:00+12:02:30": {},
},
},
}, 4)
}, 10)
}

View File

@@ -341,11 +341,15 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
return []datasource.Metric{{Timestamps: []int64{0}, Values: []float64{math.NaN()}}}, nil
}
for _, s := range res.Data {
ls, as, err := ar.expandTemplates(s, qFn, time.Time{})
ls, err := ar.expandLabelTemplates(s)
if err != nil {
return nil, fmt.Errorf("failed to expand templates: %s", err)
return nil, err
}
alertID := hash(ls.processed)
as, err := ar.expandAnnotationTemplates(s, qFn, time.Time{}, ls)
if err != nil {
return nil, err
}
a := ar.newAlert(s, time.Time{}, ls.processed, as) // initial alert
prevT := time.Time{}
@@ -363,7 +367,7 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
a.State = notifier.StatePending
a.ActiveAt = at
// re-template the annotations as active timestamp is changed
_, a.Annotations, _ = ar.expandTemplates(s, qFn, at)
a.Annotations, _ = ar.expandAnnotationTemplates(s, qFn, at, ls)
a.Start = time.Time{}
} else if at.Sub(a.ActiveAt) >= ar.For && a.State != notifier.StateFiring {
a.State = notifier.StateFiring
@@ -376,13 +380,15 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
}
result = append(result, ar.alertToTimeSeries(a, s.Timestamps[i])...)
// save alert's state on last iteration, so it can be used on the next execRange call
if at.Equal(end) {
// if for>0, save alert's state on last iteration, so it can be used on the next execRange call
if ar.For > 0 && at.Equal(end) {
holdAlertState[alertID] = a
}
}
}
ar.alerts = holdAlertState
if len(holdAlertState) > 0 {
ar.alerts = holdAlertState
}
return result, nil
}
@@ -428,9 +434,22 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
expandedLabels := make([]*labelSet, len(res.Data))
expandedAnnotations := make([]map[string]string, len(res.Data))
for i, m := range res.Data {
ls, as, err := ar.expandTemplates(m, qFn, ts)
ls, err := ar.expandLabelTemplates(m)
if err != nil {
curState.Err = fmt.Errorf("failed to expand templates: %w", err)
curState.Err = err
return nil, curState.Err
}
at := ts
alertID := hash(ls.processed)
if a, ok := ar.alerts[alertID]; ok {
// use the stored activeAt for annotation templating if the alert has already triggered (state Pending or Firing)
if a.State != notifier.StateInactive {
at = a.ActiveAt
}
}
as, err := ar.expandAnnotationTemplates(m, qFn, at, ls)
if err != nil {
curState.Err = err
return nil, curState.Err
}
expandedLabels[i] = ls
@@ -473,6 +492,7 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
a.KeepFiringSince = time.Time{}
continue
}
a := ar.newAlert(m, ts, labels.processed, annotations)
a.ID = alertID
a.State = notifier.StatePending
@@ -536,12 +556,18 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
return append(tss, ar.toTimeSeries(ts.Unix())...), nil
}
func (ar *AlertingRule) expandTemplates(m datasource.Metric, qFn templates.QueryFn, ts time.Time) (*labelSet, map[string]string, error) {
func (ar *AlertingRule) expandLabelTemplates(m datasource.Metric) (*labelSet, error) {
qFn := func(_ string) ([]datasource.Metric, error) {
return nil, fmt.Errorf("`query` template isn't supported in rule label")
}
ls, err := ar.toLabels(m, qFn)
if err != nil {
return nil, nil, fmt.Errorf("failed to expand labels: %w", err)
return nil, fmt.Errorf("failed to expand label templates: %s", err)
}
return ls, nil
}
func (ar *AlertingRule) expandAnnotationTemplates(m datasource.Metric, qFn templates.QueryFn, activeAt time.Time, ls *labelSet) (map[string]string, error) {
tplData := notifier.AlertTplData{
Value: m.Values[0],
Type: ar.Type.String(),
@@ -549,14 +575,14 @@ func (ar *AlertingRule) expandTemplates(m datasource.Metric, qFn templates.Query
Expr: ar.Expr,
AlertID: hash(ls.processed),
GroupID: ar.GroupID,
ActiveAt: ts,
ActiveAt: activeAt,
For: ar.For,
}
as, err := notifier.ExecTemplate(qFn, ar.Annotations, tplData)
if err != nil {
return nil, nil, fmt.Errorf("failed to template annotations: %w", err)
return nil, fmt.Errorf("failed to expand annotation templates: %s", err)
}
return ls, as, nil
return as, nil
}
// toTimeSeries creates `ALERTS` and `ALERTS_FOR_STATE` for active alerts

View File

@@ -6,6 +6,7 @@ import (
"fmt"
"reflect"
"sort"
"strconv"
"strings"
"sync"
"testing"
@@ -267,8 +268,15 @@ func TestAlertingRule_Exec(t *testing.T) {
if got.State != exp.State {
t.Fatalf("evalIndex %d: expected state %d; got %d", i, exp.State, got.State)
}
if rule.Annotations != nil && exp.Annotations != nil {
if !reflect.DeepEqual(got.Annotations, exp.Annotations) {
t.Fatalf("evalIndex %d: expected annotations %v; got %v", i, exp.Annotations, got.Annotations)
}
}
}
}
// reset ts for next test
ts, _ = time.Parse(time.RFC3339, "2024-10-29T00:00:00Z")
}
f(newTestAlertingRule("empty", 0), [][]datasource.Metric{}, nil, nil)
@@ -522,7 +530,7 @@ func TestAlertingRule_Exec(t *testing.T) {
},
})
f(newTestAlertingRule("for-pending=>firing=>inactive=>pending=>firing", defaultStep), [][]datasource.Metric{
f(newTestAlertingRuleWithCustomFields("for-pending=>firing=>inactive=>pending=>firing", defaultStep, 0, 0, map[string]string{"activeAt": "{{ $activeAt.UnixMilli }}"}), [][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
// empty step to set alert inactive
@@ -530,11 +538,11 @@ func TestAlertingRule_Exec(t *testing.T) {
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
}, map[int][]testAlert{
0: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
1: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
2: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}}},
3: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
4: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
0: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending, Annotations: map[string]string{"activeAt": strconv.FormatInt(ts.UnixMilli(), 10)}}}},
1: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring, Annotations: map[string]string{"activeAt": strconv.FormatInt(ts.UnixMilli(), 10)}}}},
2: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive, Annotations: map[string]string{"activeAt": strconv.FormatInt(ts.UnixMilli(), 10)}}}},
3: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending, Annotations: map[string]string{"activeAt": strconv.FormatInt(ts.Add(defaultStep*3).UnixMilli(), 10)}}}},
4: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring, Annotations: map[string]string{"activeAt": strconv.FormatInt(ts.Add(defaultStep*3).UnixMilli(), 10)}}}},
}, nil)
f(newTestAlertingRuleWithCustomFields("for-pending=>firing=>keepfiring=>firing", defaultStep, 0, defaultStep, nil), [][]datasource.Metric{

View File

@@ -587,6 +587,11 @@ func (g *Group) Replay(start, end time.Time, rw remotewrite.RWClient, maxDataPoi
func replayRuleRange(r Rule, ri rangeIterator, bar *pb.ProgressBar, rw remotewrite.RWClient, replayRuleRetryAttempts, ruleEvaluationConcurrency int) int {
fmt.Printf("> Rule %q (ID: %d)\n", r, r.ID())
// alerting rule with for>0 can't be replayed concurrently, since the status change might depend on the previous evaluation
// see https://github.com/VictoriaMetrics/VictoriaMetrics/commit/abcb21aa5ee918ba9a4e9cde495dba06e1e9564c
if r, ok := r.(*AlertingRule); ok && r.For > 0 {
ruleEvaluationConcurrency = 1
}
sem := make(chan struct{}, ruleEvaluationConcurrency)
wg := sync.WaitGroup{}
res := make(chan int, int(ri.end.Sub(ri.start)/ri.step)+1)

View File

@@ -437,7 +437,7 @@ func TestRecordingRuleExec_Negative(t *testing.T) {
_, err = rr.exec(context.TODO(), time.Now(), 0)
if err != nil {
t.Fatalf("cannot execute recroding rule: %s", err)
t.Fatalf("cannot execute recording rule: %s", err)
}
}

View File

@@ -723,14 +723,11 @@ func reloadAuthConfigData(data []byte) (bool, error) {
}
func parseAuthConfig(data []byte) (*AuthConfig, error) {
data, err := envtemplate.ReplaceBytes(data)
if err != nil {
return nil, fmt.Errorf("cannot expand environment vars: %w", err)
}
data = envtemplate.ReplaceBytes(data)
ac := &AuthConfig{
ms: metrics.NewSet(),
}
if err = yaml.UnmarshalStrict(data, ac); err != nil {
if err := yaml.UnmarshalStrict(data, ac); err != nil {
return nil, fmt.Errorf("cannot unmarshal AuthConfig data: %w", err)
}

View File

@@ -1,106 +1,110 @@
# All these commands must run from repository root.
# special tag to reduce resulting binary size
# See this issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8008
VMBACKUP_GO_BUILD_TAGS=disable_grpc_modules
vmbackup:
APP_NAME=vmbackup $(MAKE) app-local
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-local
vmbackup-race:
APP_NAME=vmbackup RACE=-race $(MAKE) app-local
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) RACE=-race $(MAKE) app-local
vmbackup-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker
vmbackup-pure-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-pure
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-pure
vmbackup-linux-amd64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-linux-amd64
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-amd64
vmbackup-linux-arm-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-linux-arm
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-arm
vmbackup-linux-arm64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-linux-arm64
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-arm64
vmbackup-linux-ppc64le-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-linux-ppc64le
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-ppc64le
vmbackup-linux-386-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-linux-386
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-386
vmbackup-darwin-amd64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-darwin-amd64
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-darwin-amd64
vmbackup-darwin-arm64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-darwin-arm64
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-darwin-arm64
vmbackup-freebsd-amd64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-freebsd-amd64
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-freebsd-amd64
vmbackup-openbsd-amd64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-openbsd-amd64
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-openbsd-amd64
vmbackup-windows-amd64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-windows-amd64
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-via-docker-windows-amd64
package-vmbackup:
APP_NAME=vmbackup $(MAKE) package-via-docker
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) package-via-docker
package-vmbackup-pure:
APP_NAME=vmbackup $(MAKE) package-via-docker-pure
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) package-via-docker-pure
package-vmbackup-amd64:
APP_NAME=vmbackup $(MAKE) package-via-docker-amd64
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) package-via-docker-amd64
package-vmbackup-arm:
APP_NAME=vmbackup $(MAKE) package-via-docker-arm
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) package-via-docker-arm
package-vmbackup-arm64:
APP_NAME=vmbackup $(MAKE) package-via-docker-arm64
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) package-via-docker-arm64
package-vmbackup-ppc64le:
APP_NAME=vmbackup $(MAKE) package-via-docker-ppc64le
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) package-via-docker-ppc64le
package-vmbackup-386:
APP_NAME=vmbackup $(MAKE) package-via-docker-386
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) package-via-docker-386
publish-vmbackup:
APP_NAME=vmbackup $(MAKE) publish-via-docker
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) publish-via-docker
vmbackup-linux-amd64:
APP_NAME=vmbackup CGO_ENABLED=1 GOOS=linux GOARCH=amd64 $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=1 GOOS=linux GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmbackup-linux-arm:
APP_NAME=vmbackup CGO_ENABLED=0 GOOS=linux GOARCH=arm $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=arm $(MAKE) app-local-goos-goarch
vmbackup-linux-arm64:
APP_NAME=vmbackup CGO_ENABLED=0 GOOS=linux GOARCH=arm64 $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=arm64 $(MAKE) app-local-goos-goarch
vmbackup-linux-ppc64le:
APP_NAME=vmbackup CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
vmbackup-linux-s390x:
APP_NAME=vmbackup CGO_ENABLED=0 GOOS=linux GOARCH=s390x $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=s390x $(MAKE) app-local-goos-goarch
vmbackup-linux-loong64:
APP_NAME=vmbackup CGO_ENABLED=0 GOOS=linux GOARCH=loong64 $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=loong64 $(MAKE) app-local-goos-goarch
vmbackup-linux-386:
APP_NAME=vmbackup CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch
vmbackup-darwin-amd64:
APP_NAME=vmbackup CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmbackup-darwin-arm64:
APP_NAME=vmbackup CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 $(MAKE) app-local-goos-goarch
vmbackup-freebsd-amd64:
APP_NAME=vmbackup CGO_ENABLED=0 GOOS=freebsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=freebsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmbackup-openbsd-amd64:
APP_NAME=vmbackup CGO_ENABLED=0 GOOS=openbsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=openbsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmbackup-windows-amd64:
GOARCH=amd64 APP_NAME=vmbackup $(MAKE) app-local-windows-goarch
GOARCH=amd64 APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-local-windows-goarch
vmbackup-pure:
APP_NAME=vmbackup $(MAKE) app-local-pure
APP_NAME=vmbackup EXTRA_GO_BUILD_TAGS=$(VMBACKUP_GO_BUILD_TAGS) $(MAKE) app-local-pure

View File

@@ -121,7 +121,7 @@ func (p *vmNativeProcessor) runSingle(ctx context.Context, f native.Filter, srcU
pr := bar.NewProxyReader(reader)
if pr != nil {
reader = pr
fmt.Printf("Continue import process with filter %s:\n", f.String())
fmt.Fprintf(log.Writer(), "Continue import process with filter %s:\n", f.String())
}
}
@@ -191,7 +191,7 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
initParams = []any{srcURL, dstURL, p.filter.String(), tenantID}
}
fmt.Println("") // extra line for better output formatting
fmt.Fprintln(log.Writer(), "") // extra line for better output formatting
log.Printf(initMessage, initParams...)
if len(ranges) > 1 {
log.Printf("Selected time range will be split into %d ranges according to %q step", len(ranges), p.filter.Chunk)
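
The two hunks above route the remaining `fmt.Printf`/`fmt.Println` calls through `log.Writer()`, so these messages share a destination with `log.Printf` output and keep their relative order. A small sketch of the idea follows. The `emit` helper and the in-memory buffer are assumptions made for the demonstration, not vmctl code.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// emit redirects the standard logger to a buffer, writes one message through
// fmt.Fprintf(log.Writer(), ...) and one through log.Printf, and returns the
// captured output. The logger settings are restored before returning.
func emit() string {
	var buf bytes.Buffer
	prevOut, prevFlags := log.Writer(), log.Flags()
	defer func() {
		log.SetOutput(prevOut)
		log.SetFlags(prevFlags)
	}()
	log.SetOutput(&buf)
	log.SetFlags(0) // drop timestamps to keep the output deterministic

	fmt.Fprintf(log.Writer(), "Continue import process with filter %s:\n", "demo")
	log.Printf("Import finished!")
	return buf.String()
}

func main() {
	fmt.Print(emit())
}
```

Both lines land in the same writer in call order, while a plain `fmt.Printf` would go to `stdout` and could be interleaved differently with `stderr` output.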

View File

@@ -30,6 +30,11 @@ func InsertHandler(c net.Conn) error {
if handshake.IsTCPHealthcheck(err) {
return nil
}
if handshake.IsTimeoutNetworkError(err) {
logger.Warnf("cannot complete vminsert handshake due to network timeout error with client %q: %s. "+
"If errors are transient and infrequent, increase -rpc.handshakeTimeout and -vmstorageDialTimeout on both the client and server side. Check vminsert logs for errors", c.RemoteAddr(), err)
return nil
}
if handshake.IsClientNetworkError(err) {
logger.Warnf("cannot complete vminsert handshake due to network error with client %q: %s. "+
"Check vminsert logs for errors", c.RemoteAddr(), err)

View File

@@ -37,7 +37,7 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return fmt.Errorf("json encoding isn't supported for opentelemetry format. Use protobuf encoding")
}
}
return stream.ParseStream(req.Body, encoding, processBody, func(tss []prompb.TimeSeries) error {
return stream.ParseStream(req.Body, encoding, processBody, func(tss []prompb.TimeSeries, _ []prompb.MetricMetadata) error {
return insertRows(at, tss, extraLabels)
})
}

View File

@@ -8,6 +8,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/protoparserutil"
@@ -32,7 +33,7 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return err
}
encoding := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, defaultTimestamp, encoding, true, func(rows []prometheus.Row) error {
return stream.Parse(req.Body, defaultTimestamp, encoding, true, promscrape.IsMetadataEnabled(), func(rows []prometheus.Row, _ []prometheus.Metadata) error {
return insertRows(at, rows, extraLabels)
}, func(s string) {
httpserver.LogError(req, s)

View File

@@ -27,7 +27,7 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return err
}
isVMRemoteWrite := req.Header.Get("Content-Encoding") == "zstd"
return stream.Parse(req.Body, isVMRemoteWrite, func(tss []prompb.TimeSeries) error {
return stream.Parse(req.Body, isVMRemoteWrite, func(tss []prompb.TimeSeries, _ []prompb.MetricMetadata) error {
return insertRows(at, tss, extraLabels)
})
}

View File

@@ -1,106 +1,110 @@
# All these commands must run from repository root.
# special tag to reduce resulting binary size
# See this issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8008
VMRESTORE_GO_BUILD_TAGS=disable_grpc_modules
vmrestore:
APP_NAME=vmrestore $(MAKE) app-local
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-local
vmrestore-race:
APP_NAME=vmrestore RACE=-race $(MAKE) app-local
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) RACE=-race $(MAKE) app-local
vmrestore-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker
vmrestore-pure-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-pure
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-pure
vmrestore-linux-amd64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-linux-amd64
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-amd64
vmrestore-linux-arm-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-linux-arm
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-arm
vmrestore-linux-arm64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-linux-arm64
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-arm64
vmrestore-linux-ppc64le-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-linux-ppc64le
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-ppc64le
vmrestore-linux-386-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-linux-386
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-linux-386
vmrestore-darwin-amd64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-darwin-amd64
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-darwin-amd64
vmrestore-darwin-arm64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-darwin-arm64
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-darwin-arm64
vmrestore-freebsd-amd64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-freebsd-amd64
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-freebsd-amd64
vmrestore-openbsd-amd64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-openbsd-amd64
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-openbsd-amd64
vmrestore-windows-amd64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-windows-amd64
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-via-docker-windows-amd64
package-vmrestore:
APP_NAME=vmrestore $(MAKE) package-via-docker
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) package-via-docker
package-vmrestore-pure:
APP_NAME=vmrestore $(MAKE) package-via-docker-pure
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) package-via-docker-pure
package-vmrestore-amd64:
APP_NAME=vmrestore $(MAKE) package-via-docker-amd64
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) package-via-docker-amd64
package-vmrestore-arm:
APP_NAME=vmrestore $(MAKE) package-via-docker-arm
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) package-via-docker-arm
package-vmrestore-arm64:
APP_NAME=vmrestore $(MAKE) package-via-docker-arm64
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) package-via-docker-arm64
package-vmrestore-ppc64le:
APP_NAME=vmrestore $(MAKE) package-via-docker-ppc64le
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) package-via-docker-ppc64le
package-vmrestore-386:
APP_NAME=vmrestore $(MAKE) package-via-docker-386
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) package-via-docker-386
publish-vmrestore:
APP_NAME=vmrestore $(MAKE) publish-via-docker
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) publish-via-docker
vmrestore-linux-amd64:
APP_NAME=vmrestore CGO_ENABLED=1 GOOS=linux GOARCH=amd64 $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=1 GOOS=linux GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmrestore-linux-arm:
APP_NAME=vmrestore CGO_ENABLED=0 GOOS=linux GOARCH=arm $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=arm $(MAKE) app-local-goos-goarch
vmrestore-linux-arm64:
APP_NAME=vmrestore CGO_ENABLED=0 GOOS=linux GOARCH=arm64 $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=arm64 $(MAKE) app-local-goos-goarch
vmrestore-linux-ppc64le:
APP_NAME=vmrestore CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
vmrestore-linux-s390x:
APP_NAME=vmrestore CGO_ENABLED=0 GOOS=linux GOARCH=s390x $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=s390x $(MAKE) app-local-goos-goarch
vmrestore-linux-loong64:
APP_NAME=vmrestore CGO_ENABLED=0 GOOS=linux GOARCH=loong64 $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=loong64 $(MAKE) app-local-goos-goarch
vmrestore-linux-386:
APP_NAME=vmrestore CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch
vmrestore-darwin-amd64:
APP_NAME=vmrestore CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmrestore-darwin-arm64:
APP_NAME=vmrestore CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 $(MAKE) app-local-goos-goarch
vmrestore-freebsd-amd64:
APP_NAME=vmrestore CGO_ENABLED=0 GOOS=freebsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=freebsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmrestore-openbsd-amd64:
APP_NAME=vmrestore CGO_ENABLED=0 GOOS=openbsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) CGO_ENABLED=0 GOOS=openbsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmrestore-windows-amd64:
GOARCH=amd64 APP_NAME=vmrestore $(MAKE) app-local-windows-goarch
GOARCH=amd64 APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-local-windows-goarch
vmrestore-pure:
APP_NAME=vmrestore $(MAKE) app-local-pure
APP_NAME=vmrestore EXTRA_GO_BUILD_TAGS=$(VMRESTORE_GO_BUILD_TAGS) $(MAKE) app-local-pure

View File

@@ -18,6 +18,9 @@ import (
var maxGraphiteSeries = flag.Int("search.maxGraphiteSeries", 300e3, "The maximum number of time series, which can be scanned during queries to Graphite Render API. "+
"See https://docs.victoriametrics.com/victoriametrics/integrations/graphite/#render-api")
var maxGraphitePathExpressionLen = flag.Int("search.maxGraphitePathExpressionLen", 1024, "The maximum length of pathExpression field in Graphite series. "+
"Longer expressions are truncated to prevent memory exhaustion on complex nested queries. Set to 0 to disable truncation.")
type evalConfig struct {
at *auth.Token
startTime int64
@@ -56,6 +59,21 @@ func (ec *evalConfig) newTimestamps(step int64) []int64 {
return timestamps
}
// safePathExpression creates a pathExpression string from the given expression,
// truncating it if it exceeds the maximum allowed length to prevent memory exhaustion.
func safePathExpression(expr graphiteql.Expr) string {
if expr == nil {
return ""
}
pathExpr := string(expr.AppendString(nil))
maxLen := *maxGraphitePathExpressionLen
if maxLen > 0 && len(pathExpr) > maxLen {
return pathExpr[:maxLen] + "..."
}
return pathExpr
}
type series struct {
Name string
Tags map[string]string
@@ -172,7 +190,7 @@ func newNextSeriesForSearchQuery(ec *evalConfig, sq *storage.SearchQuery, expr g
Timestamps: append([]int64{}, rs.Timestamps...),
Values: append([]float64{}, rs.Values...),
expr: expr,
pathExpression: string(expr.AppendString(nil)),
pathExpression: safePathExpression(expr),
}
s.summarize(aggrAvg, ec.startTime, ec.endTime, ec.storageStep, 0)
t := timerpool.Get(30 * time.Second)

View File

@@ -4181,3 +4181,170 @@ func formatTimestamps(tss []int64) string {
fmt.Fprintf(&sb, " ]")
return sb.String()
}
func TestSafePathExpression(t *testing.T) {
// Save original value and restore after test
originalMaxLen := *maxGraphitePathExpressionLen
defer func() {
*maxGraphitePathExpressionLen = originalMaxLen
}()
t.Run("nil expression", func(t *testing.T) {
result := safePathExpression(nil)
if result != "" {
t.Fatalf("expected empty string for nil expression, got: %q", result)
}
})
t.Run("short expression - no truncation", func(t *testing.T) {
*maxGraphitePathExpressionLen = 50
expr := &graphiteql.MetricExpr{Query: "metric.cpu.usage"}
result := safePathExpression(expr)
expected := "metric.cpu.usage"
if result != expected {
t.Fatalf("expected %q, got %q", expected, result)
}
})
t.Run("long expression - with truncation", func(t *testing.T) {
*maxGraphitePathExpressionLen = 20
longQuery := "vertica.metrics.fr4.verticamultitenant-eon.request_resource_consumption.very_long_metric_name"
expr := &graphiteql.MetricExpr{Query: longQuery}
result := safePathExpression(expr)
expectedPrefix := longQuery[:20]
expectedSuffix := "..."
expected := expectedPrefix + expectedSuffix
if result != expected {
t.Fatalf("expected %q, got %q", expected, result)
}
if len(result) != 23 { // 20 + 3 for "..."
t.Fatalf("expected result length 23, got %d", len(result))
}
if !strings.HasSuffix(result, "...") {
t.Fatalf("expected result to end with '...', got %q", result)
}
})
t.Run("truncation disabled", func(t *testing.T) {
*maxGraphitePathExpressionLen = 0 // Disable truncation
longQuery := "very.long.metric.name.that.would.normally.be.truncated.but.should.not.be"
expr := &graphiteql.MetricExpr{Query: longQuery}
result := safePathExpression(expr)
if result != longQuery {
t.Fatalf("expected full string %q when truncation disabled, got %q", longQuery, result)
}
})
t.Run("function expression", func(t *testing.T) {
*maxGraphitePathExpressionLen = 30
// Create a function expression: sum(metric.cpu.usage)
args := []*graphiteql.ArgExpr{
{Expr: &graphiteql.MetricExpr{Query: "metric.cpu.usage"}},
}
funcExpr := &graphiteql.FuncExpr{
FuncName: "sum",
Args: args,
}
result := safePathExpression(funcExpr)
expected := "sum(metric.cpu.usage)"
if result != expected {
t.Fatalf("expected %q, got %q", expected, result)
}
})
t.Run("complex nested function - truncated", func(t *testing.T) {
*maxGraphitePathExpressionLen = 15
// Create nested functions: sum(avg(metric.cpu.usage))
innerArgs := []*graphiteql.ArgExpr{
{Expr: &graphiteql.MetricExpr{Query: "metric.cpu.usage"}},
}
innerFunc := &graphiteql.FuncExpr{
FuncName: "avg",
Args: innerArgs,
}
outerArgs := []*graphiteql.ArgExpr{
{Expr: innerFunc},
}
outerFunc := &graphiteql.FuncExpr{
FuncName: "sum",
Args: outerArgs,
}
result := safePathExpression(outerFunc)
if len(result) != 18 { // 15 + 3 for "..."
t.Fatalf("expected result length 18, got %d", len(result))
}
if !strings.HasSuffix(result, "...") {
t.Fatalf("expected result to end with '...', got %q", result)
}
if !strings.HasPrefix(result, "sum(avg(metric") {
t.Fatalf("expected result to start with 'sum(avg(metric', got %q", result)
}
})
t.Run("boundary case - exact length", func(t *testing.T) {
*maxGraphitePathExpressionLen = 10
expr := &graphiteql.MetricExpr{Query: "metric.cpu"} // Exactly 10 characters
result := safePathExpression(expr)
expected := "metric.cpu"
if result != expected {
t.Fatalf("expected %q, got %q", expected, result)
}
})
t.Run("boundary case - one character over", func(t *testing.T) {
*maxGraphitePathExpressionLen = 10
expr := &graphiteql.MetricExpr{Query: "metric.cpu.x"} // 12 characters, over the limit of 10
result := safePathExpression(expr)
expected := "metric.cpu..."
if result != expected {
t.Fatalf("expected %q, got %q", expected, result)
}
})
}
func TestSafePathExpressionFromString(t *testing.T) {
// Save original value and restore after test
originalMaxLen := *maxGraphitePathExpressionLen
defer func() {
*maxGraphitePathExpressionLen = originalMaxLen
}()
t.Run("short string - no truncation", func(t *testing.T) {
*maxGraphitePathExpressionLen = 50
input := "sumSeries(metric1,metric2)"
result := safePathExpressionFromString(input)
if result != input {
t.Fatalf("expected %q, got %q", input, result)
}
})
t.Run("long string - with truncation", func(t *testing.T) {
*maxGraphitePathExpressionLen = 20
input := "sumSeries(very.long.metric.name.that.exceeds.limit,another.metric)"
result := safePathExpressionFromString(input)
expected := "sumSeries(very.long...."
if result != expected {
t.Fatalf("expected %q, got %q", expected, result)
}
})
t.Run("truncation disabled", func(t *testing.T) {
*maxGraphitePathExpressionLen = 0
input := "very.long.string.that.would.normally.be.truncated"
result := safePathExpressionFromString(input)
if result != input {
t.Fatalf("expected full string when truncation disabled, got %q", result)
}
})
}

View File

@@ -10,7 +10,7 @@ func TestParseIntervalSuccess(t *testing.T) {
t.Helper()
interval, err := parseInterval(s)
if err != nil {
t.Fatalf("unexpected error in parseInterva(%q): %s", s, err)
t.Fatalf("unexpected error in parseInterval(%q): %s", s, err)
}
if interval != intervalExpected {
t.Fatalf("unexpected result for parseInterval(%q); got %d; want %d", s, interval, intervalExpected)

View File

@@ -17,6 +17,16 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
)
// safePathExpressionFromString truncates a pathExpression string if it exceeds
// the maximum allowed length to prevent memory exhaustion.
func safePathExpressionFromString(pathExpr string) string {
maxLen := *maxGraphitePathExpressionLen
if maxLen > 0 && len(pathExpr) > maxLen {
return pathExpr[:maxLen] + "..."
}
return pathExpr
}
// nextSeriesFunc must return the next series to process.
//
// nextSeriesFunc must release all the occupied resources before returning non-nil error.
@@ -319,7 +329,7 @@ func aggregateSeries(ec *evalConfig, expr graphiteql.Expr, nextSeries nextSeries
Tags: tags,
Timestamps: ec.newTimestamps(step),
Values: as.Finalize(xFilesFactor),
pathExpression: name,
pathExpression: safePathExpressionFromString(name),
expr: expr,
step: step,
}
@@ -1124,7 +1134,7 @@ func constantLine(ec *evalConfig, expr graphiteql.Expr, n float64) nextSeriesFun
Timestamps: []int64{ec.startTime, ec.startTime + step, ec.startTime + 2*step},
Values: []float64{n, n, n},
expr: expr,
pathExpression: string(expr.AppendString(nil)),
pathExpression: safePathExpression(expr),
step: step,
}
return singleSeriesFunc(s)
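Review note on `safePathExpressionFromString`: the slice `pathExpr[:maxLen]` counts bytes, so a multi-byte UTF-8 character sitting on the cut would be split in two. The following is only a minimal sketch of how the cut could be moved back to a rune boundary. `truncateAtRune` is a hypothetical helper that is not part of this change, and it takes the limit as a parameter instead of reading the `maxGraphitePathExpressionLen` flag.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateAtRune is a hypothetical helper, not part of the diff above.
// It keeps at most maxLen bytes of s, backs the cut up to a rune boundary,
// and appends "..." when something was cut off.
// maxLen <= 0 disables truncation, matching the behavior tested in the diff.
func truncateAtRune(s string, maxLen int) string {
	if maxLen <= 0 || len(s) <= maxLen {
		return s
	}
	cut := maxLen
	// Move the cut left until it no longer lands in the middle of a rune.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut] + "..."
}

func main() {
	fmt.Println(truncateAtRune("metric.cpu.x", 10))
	fmt.Println(truncateAtRune("metric.cpu", 10))
	fmt.Println(truncateAtRune("héllo.world", 2))
}
```

For ASCII-only path expressions this behaves the same as the byte slice in the diff; the difference only shows up when the cut lands inside a multi-byte character.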

View File

@@ -17,7 +17,7 @@ func TestScanStringSuccess(t *testing.T) {
t.Fatalf("unexpected string scanned from %s; got %s; want %s", s, result, sExpected)
}
if !strings.HasPrefix(s, result) {
t.Fatalf("invalid prefix for scanne string %s: %s", s, result)
t.Fatalf("invalid prefix for scanned string %s: %s", s, result)
}
}
f(`""`, `""`)

View File

@@ -210,7 +210,7 @@ func (p *parser) parseMetricExprOrFuncCall() (Expr, error) {
}
return fe, nil
default:
// Metric epxression or bool expression or None.
// Metric expression or bool expression or None.
if isBool(ident) {
be := &BoolExpr{
B: strings.EqualFold(ident, "true"),

View File

@@ -397,6 +397,13 @@ func selectHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseW
return true
}
return true
case "prometheus/api/v1/config":
httpserver.EnableCORS(w, r)
if err := prometheus.ConfigHandler(qt, startTime, w, r); err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
case "prometheus/api/v1/export":
exportRequests.Inc()
if err := prometheus.ExportHandler(startTime, at, w, r); err != nil {
@@ -731,6 +738,13 @@ func handleStaticAndSimpleRequests(w http.ResponseWriter, r *http.Request, path
expandWithExprsRequests.Inc()
prometheus.ExpandWithExprs(w, r)
return true
case "prometheus/extract-metric-exprs", "extract-metric-exprs":
startTime := time.Now()
if err := prometheus.ExtractMetricExprsHandler(startTime, w, r); err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
case "prometheus/prettify-query", "prettify-query":
prettifyQueryRequests.Inc()
prometheus.PrettifyQuery(w, r)

View File

@@ -88,10 +88,18 @@ type Results struct {
tbfs []*tmpBlocksFile
packedTimeseries []packedTimeseries
// the result is simulated
isSimulated bool
simulatedSeries []*storage.SimulatedSamples
}
// Len returns the number of results in rss.
func (rss *Results) Len() int {
if rss.isSimulated {
return len(rss.simulatedSeries)
}
return len(rss.packedTimeseries)
}
@@ -247,6 +255,10 @@ var defaultMaxWorkersPerQuery = func() int {
//
// rss becomes unusable after the call to RunParallel.
func (rss *Results) RunParallel(qt *querytracer.Tracer, f func(rs *Result, workerID uint) error) error {
if rss.isSimulated {
return rss.runParallelSimulated(qt, f)
}
qt = qt.NewChild("parallel process of fetched data")
defer rss.closeTmpBlockFiles()
@@ -262,6 +274,94 @@ func (rss *Results) RunParallel(qt *querytracer.Tracer, f func(rs *Result, worke
return err
}
func (rss *Results) runParallelSimulated(qt *querytracer.Tracer, f func(rs *Result, workerID uint) error) error {
qt = qt.NewChild("parallel process of fetched data")
cb := f
if rss.shouldConvertTenantToLabels {
cb = func(rs *Result, workerID uint) error {
metricNameTenantToTags(&rs.MetricName)
return f(rs, workerID)
}
}
tmpResult := getTmpResult()
defer putTmpResult(tmpResult)
// For simplicity, let's process serially first. Parallelization can be added if needed.
// If parallelization is desired, it would mirror the worker pool logic of the original runParallel,
// but iterating over rss.simulatedSeries entries.
workerID := uint(0)
var firstErr error
for _, metric := range rss.simulatedSeries {
r := &tmpResult.rs
r.reset()
r.MetricName.CopyFrom(&metric.Name)
for i, ts := range metric.Timestamps {
if ts >= rss.tr.MinTimestamp && ts <= rss.tr.MaxTimestamp {
r.Values = append(r.Values, metric.Value[i])
r.Timestamps = append(r.Timestamps, ts)
}
}
// Sort timestamps chronologically to match real storage behavior.
// Real storage ensures chronological order through:
// 1. Block-level sorting by MinTimestamp
// 2. Within-block timestamp ordering via encoding.EnsureNonDecreasingSequence()
if len(r.Timestamps) > 1 {
// Create pairs for sorting
type timestampValue struct {
timestamp int64
value float64
}
pairs := make([]timestampValue, len(r.Timestamps))
for i := range r.Timestamps {
pairs[i] = timestampValue{
timestamp: r.Timestamps[i],
value: r.Values[i],
}
}
// Sort by timestamp
sort.Slice(pairs, func(i, j int) bool {
return pairs[i].timestamp < pairs[j].timestamp
})
// Extract back to separate slices
for i := range pairs {
r.Timestamps[i] = pairs[i].timestamp
r.Values[i] = pairs[i].value
}
}
// The input from the client is most likely already deduplicated, since it's emitted by
// vmselect. However, the client may modify the input instead of using the returned one.
dedupInterval := storage.GetDedupInterval()
if dedupInterval > 0 && len(r.Timestamps) > 0 {
r.Timestamps, r.Values = storage.DeduplicateSamples(r.Timestamps, r.Values, dedupInterval)
}
rowProcessed := len(r.Timestamps)
if rowProcessed > 0 {
err := cb(r, workerID)
if err != nil {
firstErr = err
break
}
}
}
// Count total samples across all series
totalSamples := 0
for _, metric := range rss.simulatedSeries {
totalSamples += len(metric.Timestamps)
}
qt.Donef("series=%d, samples=%d", len(rss.simulatedSeries), totalSamples)
return firstErr
}
func (rss *Results) runParallel(qt *querytracer.Tracer, f func(rs *Result, workerID uint) error) (int, error) {
tswsLen := len(rss.packedTimeseries)
if tswsLen == 0 {
@@ -308,7 +408,7 @@ func (rss *Results) runParallel(qt *querytracer.Tracer, f func(rs *Result, worke
}
// Slow path - spin up multiple local workers for parallel data processing.
// Do not use global workers pool, since it increases inter-CPU memory ping-poing,
// Do not use global workers pool, since it increases inter-CPU memory ping-pong,
// which reduces the scalability on systems with many CPU cores.
// Prepare the work for workers.
@@ -526,7 +626,7 @@ func (pts *packedTimeseries) unpackTo(dst []*sortBlock, tbfs []*tmpBlocksFile, t
}
// Slow path - spin up multiple local workers for parallel data unpacking.
// Do not use global workers pool, since it increases inter-CPU memory ping-poing,
// Do not use global workers pool, since it increases inter-CPU memory ping-pong,
// which reduces the scalability on systems with many CPU cores.
// Prepare the work for workers.
@@ -1798,6 +1898,10 @@ func (e limitExceededErr) Error() string { return e.err.Error() }
//
// Results.RunParallel or Results.Cancel must be called on the returned Results.
func ProcessSearchQuery(qt *querytracer.Tracer, denyPartialResponse bool, sq *storage.SearchQuery, deadline searchutil.Deadline) (*Results, bool, error) {
if len(sq.SimulatedSeries) > 0 {
return processSearchSimulated(qt, sq, deadline)
}
qt = qt.NewChild("fetch matching series: %s", sq)
defer qt.Done()
if deadline.Exceeded() {
@@ -1862,6 +1966,41 @@ func ProcessSearchQuery(qt *querytracer.Tracer, denyPartialResponse bool, sq *st
return &rss, isPartial, nil
}
func processSearchSimulated(qt *querytracer.Tracer, sq *storage.SearchQuery, deadline searchutil.Deadline) (*Results, bool, error) {
qt = qt.NewChild("fetch matching series (simulated): %s", sq)
defer qt.Done()
if deadline.Exceeded() {
return nil, false, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
}
tr := storage.TimeRange{
MinTimestamp: sq.MinTimestamp,
MaxTimestamp: sq.MaxTimestamp,
}
// Process simulated samples.
matchedSamples, err := storage.MatchSimulatedSamples(sq.TenantTokens[0].AccountID, sq.TenantTokens[0].ProjectID, sq.SimulatedSeries, sq.TagFilterss)
if err != nil {
return nil, false, fmt.Errorf("cannot match simulated samples: %w", err)
}
// Create a result set similar to ProcessSearchQuery
rss := &Results{
tr: tr,
deadline: deadline,
isSimulated: true,
simulatedSeries: matchedSamples,
}
if len(matchedSamples) == 0 {
qt.Printf("no matching series found")
} else {
qt.Printf("found %d series", len(rss.simulatedSeries))
}
return rss, false, nil
}
// ProcessBlocks calls processBlock per each block matching the given sq.
func ProcessBlocks(qt *querytracer.Tracer, denyPartialResponse bool, sq *storage.SearchQuery,
processBlock func(mb *storage.MetricBlock, workerID uint) error, deadline searchutil.Deadline,
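Review note on the filter-and-sort step in `runParallelSimulated`: the diff copies timestamps and values into a temporary slice of pairs, sorts it with `sort.Slice`, and copies the result back. A sketch of one alternative is shown here, assuming a hypothetical `samples` adapter and `filterAndSort` helper (neither exists in the change). It reorders both slices in place through `sort.Interface` and uses `sort.Stable`, so samples with equal timestamps keep their input order.

```go
package main

import (
	"fmt"
	"sort"
)

// samples is a hypothetical adapter (not in the diff) that lets the sort
// package reorder timestamps and values together without a slice of pairs.
type samples struct {
	timestamps []int64
	values     []float64
}

func (s samples) Len() int           { return len(s.timestamps) }
func (s samples) Less(i, j int) bool { return s.timestamps[i] < s.timestamps[j] }
func (s samples) Swap(i, j int) {
	s.timestamps[i], s.timestamps[j] = s.timestamps[j], s.timestamps[i]
	s.values[i], s.values[j] = s.values[j], s.values[i]
}

// filterAndSort keeps the samples with minTs <= ts <= maxTs and returns them
// ordered by timestamp. Both input slices must have the same length.
func filterAndSort(timestamps []int64, values []float64, minTs, maxTs int64) ([]int64, []float64) {
	var outTs []int64
	var outVs []float64
	for i, ts := range timestamps {
		if ts >= minTs && ts <= maxTs {
			outTs = append(outTs, ts)
			outVs = append(outVs, values[i])
		}
	}
	// Stable sort: samples sharing a timestamp stay in input order.
	sort.Stable(samples{timestamps: outTs, values: outVs})
	return outTs, outVs
}

func main() {
	ts, vs := filterAndSort([]int64{3000, 1000, 9000, 2000}, []float64{3, 1, 9, 2}, 1000, 3000)
	fmt.Println(ts, vs)
}
```

Whether stability matters depends on how `storage.DeduplicateSamples` treats equal timestamps, which this diff does not show, so treat the choice of `sort.Stable` as a conservative default and not as a requirement.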

View File

@@ -137,7 +137,7 @@ func (tbf *tmpBlocksFile) WriteBlockData(b []byte, tbfIdx uint) (tmpBlockAddr, e
return addr, nil
}
// Len() returnt tbf size in bytes.
// Len() returns tbf size in bytes.
func (tbf *tmpBlocksFile) Len() uint64 {
return tbf.offset
}

View File

@@ -0,0 +1,20 @@
{% import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/querytracer"
) %}
{% stripspace %}
ConfigResponse generates response for /api/v1/config .
{% func ConfigResponse(config *ConfigData, qt *querytracer.Tracer) %}
{
"status":"success",
"data":{
"minStalenessInterval": {%q= config.MinStalenessInterval %},
"maxStalenessInterval": {%q= config.MaxStalenessInterval %}
}
{% code qt.Done() %}
{%= dumpQueryTrace(qt) %}
}
{% endfunc %}
{% endstripspace %}

View File

@@ -0,0 +1,73 @@
// Code generated by qtc from "config_response.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vmselect/prometheus/config_response.qtpl:1
package prometheus
//line app/vmselect/prometheus/config_response.qtpl:1
import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/querytracer"
)
// ConfigResponse generates response for /api/v1/config .
//line app/vmselect/prometheus/config_response.qtpl:8
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmselect/prometheus/config_response.qtpl:8
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmselect/prometheus/config_response.qtpl:8
func StreamConfigResponse(qw422016 *qt422016.Writer, config *ConfigData, qt *querytracer.Tracer) {
//line app/vmselect/prometheus/config_response.qtpl:8
qw422016.N().S(`{"status":"success","data":{"minStalenessInterval":`)
//line app/vmselect/prometheus/config_response.qtpl:12
qw422016.N().Q(config.MinStalenessInterval)
//line app/vmselect/prometheus/config_response.qtpl:12
qw422016.N().S(`,"maxStalenessInterval":`)
//line app/vmselect/prometheus/config_response.qtpl:13
qw422016.N().Q(config.MaxStalenessInterval)
//line app/vmselect/prometheus/config_response.qtpl:13
qw422016.N().S(`}`)
//line app/vmselect/prometheus/config_response.qtpl:15
qt.Done()
//line app/vmselect/prometheus/config_response.qtpl:16
streamdumpQueryTrace(qw422016, qt)
//line app/vmselect/prometheus/config_response.qtpl:16
qw422016.N().S(`}`)
//line app/vmselect/prometheus/config_response.qtpl:18
}
//line app/vmselect/prometheus/config_response.qtpl:18
func WriteConfigResponse(qq422016 qtio422016.Writer, config *ConfigData, qt *querytracer.Tracer) {
//line app/vmselect/prometheus/config_response.qtpl:18
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/config_response.qtpl:18
StreamConfigResponse(qw422016, config, qt)
//line app/vmselect/prometheus/config_response.qtpl:18
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/config_response.qtpl:18
}
//line app/vmselect/prometheus/config_response.qtpl:18
func ConfigResponse(config *ConfigData, qt *querytracer.Tracer) string {
//line app/vmselect/prometheus/config_response.qtpl:18
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/config_response.qtpl:18
WriteConfigResponse(qb422016, config, qt)
//line app/vmselect/prometheus/config_response.qtpl:18
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/config_response.qtpl:18
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/config_response.qtpl:18
return qs422016
//line app/vmselect/prometheus/config_response.qtpl:18
}

View File

@@ -0,0 +1,18 @@
{% stripspace %}
ExtractMetricExprsResponse generates response for /extract-metric-exprs .
{% func ExtractMetricExprsResponse(metrics []string) %}
{
"status":"success",
"data":[
{% if len(metrics) > 0 %}
{%q= metrics[0] %}
{% for i := 1; i < len(metrics); i++ %}
,{%q= metrics[i] %}
{% endfor %}
{% endif %}
]
}
{% endfunc %}
{% endstripspace %}

View File

@@ -0,0 +1,69 @@
// Code generated by qtc from "extract_metric_exprs_response.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
// ExtractMetricExprsResponse generates response for /extract-metric-exprs .
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:4
package prometheus
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:4
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:4
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:4
func StreamExtractMetricExprsResponse(qw422016 *qt422016.Writer, metrics []string) {
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:4
qw422016.N().S(`{"status":"success","data":[`)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:8
if len(metrics) > 0 {
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:9
qw422016.N().Q(metrics[0])
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:10
for i := 1; i < len(metrics); i++ {
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:10
qw422016.N().S(`,`)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:11
qw422016.N().Q(metrics[i])
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:12
}
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:13
}
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:13
qw422016.N().S(`]}`)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
}
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
func WriteExtractMetricExprsResponse(qq422016 qtio422016.Writer, metrics []string) {
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
StreamExtractMetricExprsResponse(qw422016, metrics)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
}
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
func ExtractMetricExprsResponse(metrics []string) string {
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
WriteExtractMetricExprsResponse(qb422016, metrics)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
return qs422016
//line app/vmselect/prometheus/extract_metric_exprs_response.qtpl:16
}

View File

@@ -1,8 +1,10 @@
package prometheus
import (
"encoding/json"
"flag"
"fmt"
"io"
"math"
"net/http"
"runtime"
@@ -43,6 +45,9 @@ var (
maxLookback = flag.Duration("search.maxLookback", 0, "Synonym to -query.lookback-delta from Prometheus. "+
"The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. "+
"See also '-search.maxStalenessInterval' flag, which has the same meaning due to historical reasons")
minStalenessInterval = flag.Duration("search.minStalenessInterval", 0, "The minimum interval for staleness calculations. "+
"This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. "+
"See also '-search.maxStalenessInterval'")
maxStalenessInterval = flag.Duration("search.maxStalenessInterval", 0, "The maximum interval for staleness calculations. "+
"By default, it is automatically calculated from the median interval between samples. This flag could be useful for tuning "+
"Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. "+
@@ -118,7 +123,7 @@ func FederateHandler(startTime time.Time, at *auth.Token, w http.ResponseWriter,
if err != nil {
return err
}
lookbackDelta, err := getMaxLookback(r)
lookbackDelta, err := getMaxLookback(r, *maxStalenessInterval)
if err != nil {
return err
}
@@ -138,6 +143,7 @@ func FederateHandler(startTime time.Time, at *auth.Token, w http.ResponseWriter,
return fmt.Errorf("cannot fetch data for %q: %w", sq, err)
}
if isPartial {
rss.Cancel()
return fmt.Errorf("cannot export federated metrics, because some of vmstorage nodes are unavailable")
}
@@ -722,6 +728,55 @@ func TSDBStatusHandler(qt *querytracer.Tracer, startTime time.Time, at *auth.Tok
var tsdbStatusDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/status/tsdb"}`)
// ConfigData holds the current configuration values for search-related flags
type ConfigData struct {
MinStalenessInterval string
MaxStalenessInterval string
}
// ConfigHandler processes /api/v1/config request.
//
// It returns the current configuration for search-related flags.
func ConfigHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseWriter, _ *http.Request) error {
config := &ConfigData{
MinStalenessInterval: (*minStalenessInterval).String(),
MaxStalenessInterval: (*maxStalenessInterval).String(),
}
w.Header().Set("Content-Type", "application/json")
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
WriteConfigResponse(bw, config, qt)
if err := bw.Flush(); err != nil {
return fmt.Errorf("cannot send config response to remote client: %w", err)
}
return nil
}
// ExtractMetricExprsHandler processes /extract-metric-exprs request.
//
// It extracts metric expressions from a given PromQL query.
func ExtractMetricExprsHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
query := r.FormValue("query")
if len(query) == 0 {
return fmt.Errorf("missing `query` arg")
}
metrics, err := promql.ExtractMetricsFromQuery(query)
if err != nil {
return fmt.Errorf("cannot extract metrics from query: %w", err)
}
w.Header().Set("Content-Type", "application/json")
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
WriteExtractMetricExprsResponse(bw, metrics)
if err := bw.Flush(); err != nil {
return fmt.Errorf("cannot send extract metric exprs response to remote client: %w", err)
}
return nil
}
// LabelsHandler processes /api/v1/labels request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names
@@ -846,7 +901,8 @@ func QueryHandler(qt *querytracer.Tracer, startTime time.Time, at *auth.Token, w
ct := startTime.UnixNano() / 1e6
deadline := searchutil.GetDeadlineForQuery(r, startTime)
noCache := httputil.GetBool(r, "nocache")
isDebug := httputil.GetBool(r, "debug")
noCache := httputil.GetBool(r, "nocache") || isDebug
query := r.FormValue("query")
if len(query) == 0 {
return fmt.Errorf("missing `query` arg")
@@ -855,7 +911,7 @@ func QueryHandler(qt *querytracer.Tracer, startTime time.Time, at *auth.Token, w
if err != nil {
return err
}
lookbackDelta, err := getMaxLookback(r)
lookbackDelta, err := getMaxLookback(r, *maxStalenessInterval)
if err != nil {
return err
}
@@ -941,29 +997,18 @@ func QueryHandler(qt *querytracer.Tracer, startTime time.Time, at *auth.Token, w
} else {
queryOffset = 0
}
ec := &promql.EvalConfig{
Start: start,
End: start,
Step: step,
MaxPointsPerSeries: *maxPointsPerTimeseries,
MaxSeries: *maxUniqueTimeseries,
QuotedRemoteAddr: httpserver.GetQuotedRemoteAddr(r),
Deadline: deadline,
NoCache: noCache,
LookbackDelta: lookbackDelta,
RoundDigits: getRoundDigits(r),
EnforcedTagFilterss: etfs,
CacheTagFilters: etfs,
GetRequestURI: func() string {
return httpserver.GetRequestURI(r)
},
DenyPartialResponse: httputil.GetDenyPartialResponse(r),
}
ec := newEvalConfig(r, start, start, step, deadline, noCache, lookbackDelta, isDebug, etfs)
err = populateAuthTokens(qt, ec, at, deadline)
if err != nil {
return fmt.Errorf("cannot populate auth tokens: %w", err)
}
if isDebug {
if err := populateSimulatedData(r, at, ec); err != nil {
_ = r.Body.Close()
return fmt.Errorf("cannot read simulated samples: %w", err)
}
}
_ = r.Body.Close()
qs := promql.NewQueryStats(query, at, ec)
ec.QueryStats = qs
@@ -1037,8 +1082,9 @@ func QueryRangeHandler(qt *querytracer.Tracer, startTime time.Time, at *auth.Tok
func queryRangeHandler(qt *querytracer.Tracer, startTime time.Time, at *auth.Token, w http.ResponseWriter, query string,
start, end, step int64, r *http.Request, ct int64, etfs [][]storage.TagFilter) error {
deadline := searchutil.GetDeadlineForQuery(r, startTime)
noCache := httputil.GetBool(r, "nocache")
lookbackDelta, err := getMaxLookback(r)
isDebug := httputil.GetBool(r, "debug")
noCache := httputil.GetBool(r, "nocache") || isDebug
lookbackDelta, err := getMaxLookback(r, *maxStalenessInterval)
if err != nil {
return err
}
@@ -1057,29 +1103,19 @@ func queryRangeHandler(qt *querytracer.Tracer, startTime time.Time, at *auth.Tok
start, end = promql.AdjustStartEnd(start, end, step)
}
ec := &promql.EvalConfig{
Start: start,
End: end,
Step: step,
MaxPointsPerSeries: *maxPointsPerTimeseries,
MaxSeries: *maxUniqueTimeseries,
QuotedRemoteAddr: httpserver.GetQuotedRemoteAddr(r),
Deadline: deadline,
NoCache: noCache,
LookbackDelta: lookbackDelta,
RoundDigits: getRoundDigits(r),
EnforcedTagFilterss: etfs,
CacheTagFilters: etfs,
GetRequestURI: func() string {
return httpserver.GetRequestURI(r)
},
DenyPartialResponse: httputil.GetDenyPartialResponse(r),
}
ec := newEvalConfig(r, start, end, step, deadline, noCache, lookbackDelta, isDebug, etfs)
err = populateAuthTokens(qt, ec, at, deadline)
if err != nil {
return fmt.Errorf("cannot populate auth tokens: %w", err)
}
if isDebug {
if err := populateSimulatedData(r, at, ec); err != nil {
_ = r.Body.Close()
return fmt.Errorf("cannot read simulated samples: %w", err)
}
}
_ = r.Body.Close()
qs := promql.NewQueryStats(query, at, ec)
ec.QueryStats = qs
@@ -1115,6 +1151,102 @@ func queryRangeHandler(qt *querytracer.Tracer, startTime time.Time, at *auth.Tok
return nil
}
func newEvalConfig(r *http.Request, start, end, step int64, deadline searchutil.Deadline, noCache bool, lookbackDelta int64, isDebug bool, etfs [][]storage.TagFilter) *promql.EvalConfig {
ec := &promql.EvalConfig{
Start: start,
End: end,
Step: step,
MaxPointsPerSeries: *maxPointsPerTimeseries,
MaxSeries: *maxUniqueTimeseries,
MinStalenessInterval: *minStalenessInterval,
QuotedRemoteAddr: httpserver.GetQuotedRemoteAddr(r),
Deadline: deadline,
NoCache: noCache,
LookbackDelta: lookbackDelta,
RoundDigits: getRoundDigits(r),
EnforcedTagFilterss: etfs,
GetRequestURI: func() string {
return httpserver.GetRequestURI(r)
},
DenyPartialResponse: !isDebug && httputil.GetDenyPartialResponse(r),
}
return ec
}
func populateSimulatedData(r *http.Request, at *auth.Token, evalConfig *promql.EvalConfig) error {
type jsonExportBlockInput struct {
Metric map[string]string `json:"metric"`
Values []float64 `json:"values"`
Timestamps []int64 `json:"timestamps"`
}
// --- Read and Parse Input Samples from r.Body ---
var simulatedSeries []*storage.SimulatedSamples
decoder := json.NewDecoder(r.Body)
lineNum := 0
accountID := uint32(0)
projectID := uint32(0)
if at != nil {
accountID = at.AccountID
projectID = at.ProjectID
}
for {
var jeb jsonExportBlockInput
if err := decoder.Decode(&jeb); err == io.EOF {
break
} else if err != nil {
return fmt.Errorf("error decoding input JSON on line %d: %w", lineNum, err)
}
// Validate that values and timestamps arrays have the same length
if len(jeb.Values) != len(jeb.Timestamps) {
return fmt.Errorf("mismatched values and timestamps arrays length in debug data on line %d: values=%d, timestamps=%d", lineNum, len(jeb.Values), len(jeb.Timestamps))
}
mn := storage.GetMetricName()
mn.AccountID = accountID
mn.ProjectID = projectID
for k, v := range jeb.Metric {
mn.AddTag(k, v)
}
ss := &storage.SimulatedSamples{
Value: jeb.Values,
Timestamps: jeb.Timestamps,
}
ss.Name.CopyFrom(mn)
// Return mn to the pool right after copying; a defer inside the loop
// would keep every mn alive until the function returns.
storage.PutMetricName(mn)
simulatedSeries = append(simulatedSeries, ss)
lineNum++
}
// It doesn't make sense to debug with empty samples
if len(simulatedSeries) == 0 {
return fmt.Errorf("no simulated samples found")
}
minStalenessInterval, err := httputil.GetDurationRaw(r, "min_staleness_interval", evalConfig.MinStalenessInterval)
if err != nil {
return fmt.Errorf("cannot parse `min_staleness_interval` arg: %w", err)
}
maxStalenessInterval, err := httputil.GetDurationRaw(r, "max_staleness_interval", *maxStalenessInterval)
if err != nil {
return fmt.Errorf("cannot parse `max_staleness_interval` arg: %w", err)
}
evalConfig.SimulatedSamples = simulatedSeries
evalConfig.MinStalenessInterval = minStalenessInterval
evalConfig.LookbackDelta, err = getMaxLookback(r, maxStalenessInterval)
if err != nil {
return err
}
return nil
}
func populateAuthTokens(qt *querytracer.Tracer, ec *promql.EvalConfig, at *auth.Token, deadline searchutil.Deadline) error {
if at != nil {
ec.AuthTokens = []*auth.Token{at}
@@ -1214,7 +1346,7 @@ func adjustLastPoints(tss []netstorage.Result, start, end int64) []netstorage.Re
return tss
}
func getMaxLookback(r *http.Request) (int64, error) {
func getMaxLookback(r *http.Request, maxStalenessInterval time.Duration) (int64, error) {
d := maxLookback.Milliseconds()
if d == 0 {
d = maxStalenessInterval.Milliseconds()

View File

@@ -188,7 +188,7 @@ func newBinaryOpFunc(bf func(left, right float64, isBool bool) float64) binaryOp
rightValues := right[i].Values
dstValues := dst[i].Values
if len(leftValues) != len(rightValues) || len(leftValues) != len(dstValues) {
logger.Panicf("BUG: len(leftVaues) must match len(rightValues) and len(dstValues); got %d vs %d vs %d",
logger.Panicf("BUG: len(leftValues) must match len(rightValues) and len(dstValues); got %d vs %d vs %d",
len(leftValues), len(rightValues), len(dstValues))
}
for j, a := range leftValues {

View File

@@ -138,6 +138,10 @@ type EvalConfig struct {
// LookbackDelta is analog to `-query.lookback-delta` from Prometheus.
LookbackDelta int64
// MinStalenessInterval corresponds to -search.minStalenessInterval,
// but customized per query request.
MinStalenessInterval time.Duration
// How many decimal digits after the point to leave in response.
RoundDigits int
@@ -168,6 +172,9 @@ type EvalConfig struct {
timestamps []int64
timestampsOnce sync.Once
// Simulated samples
SimulatedSamples []*storage.SimulatedSamples
}
// copyEvalConfig returns src copy.
@@ -190,6 +197,8 @@ func copyEvalConfig(src *EvalConfig) *EvalConfig {
ec.DenyPartialResponse = src.DenyPartialResponse
ec.IsPartialResponse.Store(src.IsPartialResponse.Load())
ec.QueryStats = src.QueryStats
ec.MinStalenessInterval = src.MinStalenessInterval
ec.SimulatedSamples = src.SimulatedSamples
// do not copy src.timestamps - they must be generated again.
return &ec
@@ -949,7 +958,7 @@ func evalRollupFuncWithSubquery(qt *querytracer.Tracer, ec *EvalConfig, funcName
}
ecSQ := copyEvalConfig(ec)
ecSQ.Start -= window + step + maxSilenceInterval()
ecSQ.Start -= window + step + maxSilenceInterval(ec.MinStalenessInterval)
ecSQ.End += step
ecSQ.Step = step
ecSQ.MaxPointsPerSeries = *maxPointsSubqueryPerTimeseries
@@ -967,7 +976,7 @@ func evalRollupFuncWithSubquery(qt *querytracer.Tracer, ec *EvalConfig, funcName
return nil, nil
}
sharedTimestamps := getTimestamps(ec.Start, ec.End, ec.Step, ec.MaxPointsPerSeries)
preFunc, rcs, err := getRollupConfigs(funcName, rf, expr, ec.Start, ec.End, ec.Step, ec.MaxPointsPerSeries, window, ec.LookbackDelta, sharedTimestamps, ec.IsMultiTenant)
preFunc, rcs, err := getRollupConfigs(funcName, rf, expr, ec.Start, ec.End, ec.Step, ec.MaxPointsPerSeries, window, ec.LookbackDelta, sharedTimestamps, ec.IsMultiTenant, ec.MinStalenessInterval)
if err != nil {
return nil, err
}
@@ -1720,7 +1729,7 @@ func evalRollupFuncNoCache(qt *querytracer.Tracer, ec *EvalConfig, funcName stri
}
// Obtain rollup configs before fetching data from db, so type errors could be caught earlier.
sharedTimestamps := getTimestamps(ec.Start, ec.End, ec.Step, ec.MaxPointsPerSeries)
preFunc, rcs, err := getRollupConfigs(funcName, rf, expr, ec.Start, ec.End, ec.Step, ec.MaxPointsPerSeries, window, ec.LookbackDelta, sharedTimestamps, ec.IsMultiTenant)
preFunc, rcs, err := getRollupConfigs(funcName, rf, expr, ec.Start, ec.End, ec.Step, ec.MaxPointsPerSeries, window, ec.LookbackDelta, sharedTimestamps, ec.IsMultiTenant, ec.MinStalenessInterval)
if err != nil {
return nil, err
}
@@ -1730,7 +1739,7 @@ func evalRollupFuncNoCache(qt *querytracer.Tracer, ec *EvalConfig, funcName stri
tfss = searchutil.JoinTagFilterss(tfss, ec.EnforcedTagFilterss)
minTimestamp := ec.Start
if needSilenceIntervalForRollupFunc[funcName] {
minTimestamp -= maxSilenceInterval()
minTimestamp -= maxSilenceInterval(ec.MinStalenessInterval)
}
if window > ec.Step {
minTimestamp -= window
@@ -1749,10 +1758,13 @@ func evalRollupFuncNoCache(qt *querytracer.Tracer, ec *EvalConfig, funcName stri
} else {
sq = storage.NewSearchQuery(ec.AuthTokens[0].AccountID, ec.AuthTokens[0].ProjectID, minTimestamp, ec.End, tfss, ec.MaxSeries)
}
sq.SimulatedSeries = ec.SimulatedSamples
rss, isPartial, err := netstorage.ProcessSearchQuery(qt, ec.DenyPartialResponse, sq, ec.Deadline)
if err != nil {
return nil, err
}
ec.updateIsPartialResponse(isPartial)
qs := ec.QueryStats
rssLen := rss.Len()
@@ -1835,7 +1847,7 @@ func getRollupMemoryLimiter() *memoryLimiter {
return &rollupMemoryLimiter
}
func maxSilenceInterval() int64 {
func maxSilenceInterval(minStalenessInterval time.Duration) int64 {
d := minStalenessInterval.Milliseconds()
if d <= 0 {
d = 5 * 60 * 1000

View File

@@ -56,7 +56,7 @@ func TestValidateMaxPointsPerSeriesFailure(t *testing.T) {
f := func(start, end, step int64, maxPoints int) {
t.Helper()
if err := ValidateMaxPointsPerSeries(start, end, step, maxPoints); err == nil {
t.Fatalf("expecint non-nil error for ValidateMaxPointsPerSeries(start=%d, end=%d, step=%d, maxPoints=%d)", start, end, step, maxPoints)
t.Fatalf("expecting non-nil error for ValidateMaxPointsPerSeries(start=%d, end=%d, step=%d, maxPoints=%d)", start, end, step, maxPoints)
}
}
// zero step


@@ -67,12 +67,15 @@ func Exec(qt *querytracer.Tracer, ec *EvalConfig, q string, isFirstPointOnly boo
}
}
var rv []*timeseries
qid := activeQueriesV.Add(ec, q)
rv, err := evalExpr(qt, ec, e)
rv, err = evalExpr(qt, ec, e)
activeQueriesV.Remove(qid)
if err != nil {
return nil, err
}
if isFirstPointOnly {
// Remove all the points except the first one from every time series.
for _, ts := range rv {
@@ -331,3 +334,23 @@ func escapeDots(s string) string {
}
return string(result)
}
// ExtractMetricsFromQuery visits all the expressions in query and returns all the metrics found in the query.
func ExtractMetricsFromQuery(query string) ([]string, error) {
expr, err := metricsql.Parse(query)
if err != nil {
return nil, fmt.Errorf("error parsing query: %w", err)
}
var metrics []string
metricsql.VisitAll(expr, func(e metricsql.Expr) {
if me, ok := e.(*metricsql.MetricExpr); ok {
metricStr := string(me.AppendString(nil))
if metricStr != "" {
metrics = append(metrics, metricStr)
}
}
})
return metrics, nil
}


@@ -0,0 +1,329 @@
package promql
import (
"math"
"slices"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
func TestSimulatedExec(t *testing.T) {
accountID := uint32(123)
projectID := uint32(567)
start := int64(1000e3)
end := int64(2000e3)
step := int64(200e3)
// Base EvalConfig that will be copied for each test
baseEC := EvalConfig{
AuthTokens: []*auth.Token{{
AccountID: accountID,
ProjectID: projectID,
}},
Start: start,
End: end,
Step: step,
MaxPointsPerSeries: 1e4,
MaxSeries: 1000,
Deadline: searchutil.NewDeadline(time.Now(), time.Hour, ""),
RoundDigits: 100,
NoCache: true,
}
t.Run(`simple_metric_exact_match`, func(t *testing.T) {
t.Skip()
ec := copyEvalConfig(&baseEC)
mn := newMetric(accountID, projectID,
"__name__", "test_metric",
"a", "b",
)
ec.SimulatedSamples = []*storage.SimulatedSamples{mn.build()}
q := `test_metric{a="b"}`
result, err := Exec(nil, ec, q, false)
if err != nil {
t.Fatalf(`unexpected error when executing %q: %s`, q, err)
}
// Expected result
expectedMN := storage.MetricName{
AccountID: accountID,
ProjectID: projectID,
MetricGroup: []byte("test_metric"),
Tags: []storage.Tag{
{
Key: []byte("a"),
Value: []byte("b"),
},
},
}
expectedResult := []netstorage.Result{
{
MetricName: expectedMN,
Values: mn.Value,
Timestamps: mn.Timestamps,
},
}
testResultsEqual(t, result, expectedResult, false)
})
t.Run(`filtered_by_tag_value`, func(t *testing.T) {
t.Skip()
// Create a copy of base EvalConfig
ec := copyEvalConfig(&baseEC)
mn := metricBuilders{
newMetric(accountID, projectID,
"__name__", "test_metric",
"a", "b",
"region", "us-west",
),
newMetric(accountID, projectID,
"__name__", "test_metric",
"a", "b",
"region", "us-east",
),
}
ec.SimulatedSamples = mn.build()
q := `test_metric{region="us-west"}`
result, err := Exec(nil, ec, q, false)
if err != nil {
t.Fatalf(`unexpected error when executing %q: %s`, q, err)
}
// Expected result
expectedMN := storage.MetricName{
AccountID: accountID,
ProjectID: projectID,
MetricGroup: []byte("test_metric"),
Tags: []storage.Tag{
{
Key: []byte("a"),
Value: []byte("b"),
},
{
Key: []byte("region"),
Value: []byte("us-west"),
},
},
}
expectedResult := []netstorage.Result{
{
MetricName: expectedMN,
Values: mn[0].Value,
Timestamps: mn[0].Timestamps,
},
}
testResultsEqual(t, result, expectedResult, false)
})
t.Run(`regex_match_on_tag`, func(t *testing.T) {
ec := copyEvalConfig(&baseEC)
mn := metricBuilders{
newMetric(accountID, projectID,
"__name__", "test_metric",
"env", "prod",
),
newMetric(accountID, projectID,
"__name__", "test_metric",
"env", "staging",
),
newMetric(accountID, projectID,
"__name__", "test_metric",
"env", "dev",
),
}
ec.SimulatedSamples = mn.build()
q := `test_metric{env=~"prod|staging"}`
result, err := Exec(nil, ec, q, false)
if err != nil {
t.Fatalf(`unexpected error when executing %q: %s`, q, err)
}
expectedResult := []netstorage.Result{mn[0].toResult(), mn[1].toResult()}
testResultsEqual(t, result, expectedResult, false)
})
}
func TestSumOverTime(t *testing.T) {
accountID := uint32(123)
projectID := uint32(567)
start := int64(1000e3)
end := int64(1300e3)
step := int64(30e3)
baseEC := EvalConfig{
AuthTokens: []*auth.Token{{
AccountID: accountID,
ProjectID: projectID,
}},
Start: start,
End: end,
Step: step,
MaxPointsPerSeries: 1e4,
MaxSeries: 1000,
Deadline: searchutil.NewDeadline(time.Now(), time.Hour, ""),
RoundDigits: 100,
NoCache: true,
}
t.Run(`basic_sum_over_time`, func(t *testing.T) {
ec := copyEvalConfig(&baseEC)
metric := newMetric(accountID, projectID,
"__name__", "test_metric",
"app", "api-server",
).withValues(1, 2, 3, 4, 5, 6).withUnix(1000, 1015, 1030, 1045, 1060, 1075)
ec.SimulatedSamples = []*storage.SimulatedSamples{metric.build()}
q := `sum_over_time(test_metric[30s])`
result, err := Exec(nil, ec, q, false)
if err != nil {
t.Fatalf(`unexpected error when executing %q: %s`, q, err)
}
expectedResult := []netstorage.Result{
newMetric(accountID, projectID,
"app", "api-server",
).withValues(1, 5, 9, 6).withUnix(1000, 1030, 1060, 1090).toResult(),
}
testSimulatedResultsEqual(t, result, expectedResult, false)
})
}
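The expected values in `basic_sum_over_time` (1, 5, 9, 6 at 1000, 1030, 1060, 1090) can be reproduced by hand. The sketch below is not the engine's rollup code; it assumes a lookbehind window that is open on the left and closed on the right, `(t-window, t]`, which is consistent with the expected values in the test. Timestamps are in seconds here, while the test stores milliseconds. `sumOverWindow` is a hypothetical helper.

```go
package main

import "fmt"

// sumOverWindow sums the values of samples whose timestamps fall into
// (t-window, t]. The second return value reports whether any sample matched.
func sumOverWindow(ts []int64, vs []float64, t, window int64) (float64, bool) {
	var sum float64
	found := false
	for i, sampleTs := range ts {
		if sampleTs > t-window && sampleTs <= t {
			sum += vs[i]
			found = true
		}
	}
	return sum, found
}

func main() {
	// Same samples as the test: values 1..6 at 15-second spacing.
	ts := []int64{1000, 1015, 1030, 1045, 1060, 1075}
	vs := []float64{1, 2, 3, 4, 5, 6}

	// Evaluate at start=1000, end=1300, step=30 with a 30-second window.
	for t := int64(1000); t <= 1300; t += 30 {
		if sum, ok := sumOverWindow(ts, vs, t, 30); ok {
			fmt.Printf("t=%d sum=%g\n", t, sum)
		}
	}
}
```

Steps after 1090 have no samples in their window, which corresponds to the NaN points that `removeEmptyValuesAndTimeseries` drops before comparison.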
type metricBuilder storage.SimulatedSamples
func newMetric(accountID uint32, projectID uint32, pairs ...string) *metricBuilder {
mn := storage.MetricName{
AccountID: accountID,
ProjectID: projectID,
}
for i := 0; i < len(pairs); i += 2 {
mn.AddTag(pairs[i], pairs[i+1])
}
return &metricBuilder{
Name: mn,
Value: []float64{10, 20, 30, 40, 50, 60},
Timestamps: []int64{1000e3, 1200e3, 1400e3, 1600e3, 1800e3, 2000e3},
}
}
func (b *metricBuilder) withUnix(unix ...int64) *metricBuilder {
b.Timestamps = make([]int64, len(unix))
for i := range unix {
b.Timestamps[i] = unix[i] * 1e3
}
return b
}
func (b *metricBuilder) withValues(values ...float64) *metricBuilder {
b.Value = values
return b
}
func (b *metricBuilder) build() *storage.SimulatedSamples {
return (*storage.SimulatedSamples)(b)
}
func (b *metricBuilder) toResult() netstorage.Result {
return netstorage.Result{
MetricName: b.Name,
Values: b.Value,
Timestamps: b.Timestamps,
}
}
type metricBuilders []*metricBuilder
func (b metricBuilders) build() []*storage.SimulatedSamples {
ss := make([]*storage.SimulatedSamples, len(b))
for i := range b {
ss[i] = b[i].build()
}
return ss
}
func testSimulatedResultsEqual(t *testing.T, result, resultExpected []netstorage.Result, verifyTenant bool) {
t.Helper()
result = removeEmptyValuesAndTimeseries(result)
if len(result) != len(resultExpected) {
t.Fatalf(`unexpected timeseries count; got %d; want %d`, len(result), len(resultExpected))
}
for i := range result {
r := &result[i]
rExpected := &resultExpected[i]
testMetricNamesEqual(t, &r.MetricName, &rExpected.MetricName, verifyTenant, i)
testRowsEqual(t, r.Values, r.Timestamps, rExpected.Values, rExpected.Timestamps)
}
}
func removeEmptyValuesAndTimeseries(tss []netstorage.Result) []netstorage.Result {
dst := tss[:0]
for i := range tss {
ts := &tss[i]
hasNaNs := slices.ContainsFunc(ts.Values, math.IsNaN)
if !hasNaNs {
// Fast path: nothing to remove.
if len(ts.Values) > 0 {
dst = append(dst, *ts)
}
continue
}
// Slow path: remove NaNs.
srcTimestamps := ts.Timestamps
dstValues := ts.Values[:0]
// Do not reuse ts.Timestamps for dstTimestamps, since ts.Timestamps
// may be shared among multiple time series.
dstTimestamps := make([]int64, 0, len(ts.Timestamps))
for j, v := range ts.Values {
if math.IsNaN(v) {
continue
}
dstValues = append(dstValues, v)
dstTimestamps = append(dstTimestamps, srcTimestamps[j])
}
ts.Values = dstValues
ts.Timestamps = dstTimestamps
if len(ts.Values) > 0 {
dst = append(dst, *ts)
}
}
return dst
}
func TestExtractMetricsFromQuery(t *testing.T) {
query := `(vm_free_disk_space_bytes{job=~"$job", instance=~"$instance"}-vm_free_disk_space_limit_bytes{job=~"$job", instance=~"$instance"})
/
ignoring(path) (
(rate(vm_rows_added_to_storage_total{job=~"$job", instance=~"$instance"}[1d]) -
sum(rate(vm_deduplicated_samples_total{job=~"$job", instance=~"$instance"}[1d])) without (type)) *
(
sum(vm_data_size_bytes{job=~"$job", instance=~"$instance", type!~"indexdb.*"}) without(type) /
sum(vm_rows{job=~"$job", instance=~"$instance", type!~"indexdb.*"}) without(type)
)
+
rate(vm_new_timeseries_created_total{job=~"$job", instance=~"$instance"}[1d]) *
(
sum(vm_data_size_bytes{job=~"$job", instance=~"$instance", type="indexdb/file"}) /
sum(vm_rows{job=~"$job", instance=~"$instance", type="indexdb/file"})
)
)`
metrics, err := ExtractMetricsFromQuery(query)
if err != nil {
t.Fatalf(`unexpected error when extracting metrics from query: %s`, err)
}
t.Logf(`metrics: %v`, metrics)
}


@@ -1,12 +1,12 @@
package promql
import (
"flag"
"fmt"
"math"
"strconv"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/metricsql"
@@ -17,10 +17,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
var minStalenessInterval = flag.Duration("search.minStalenessInterval", 0, "The minimum interval for staleness calculations. "+
"This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. "+
"See also '-search.maxStalenessInterval'")
var rollupFuncs = map[string]newRollupFunc{
"absent_over_time": newRollupFuncOneArg(rollupAbsent),
"aggr_over_time": newRollupFuncTwoArgs(rollupFake),
@@ -372,7 +368,7 @@ func getRollupTag(expr metricsql.Expr) (string, error) {
}
func getRollupConfigs(funcName string, rf rollupFunc, expr metricsql.Expr, start, end, step int64, maxPointsPerSeries int,
window, lookbackDelta int64, sharedTimestamps []int64, isMultiTenant bool) (
window, lookbackDelta int64, sharedTimestamps []int64, isMultiTenant bool, minStalenessInterval time.Duration) (
func(values []float64, timestamps []int64), []*rollupConfig, error) {
preFunc := func(_ []float64, _ []int64) {}
funcName = strings.ToLower(funcName)
@@ -409,6 +405,7 @@ func getRollupConfigs(funcName string, rf rollupFunc, expr metricsql.Expr, start
isDefaultRollup: funcName == "default_rollup",
samplesScannedPerCall: samplesScannedPerCall,
isMultiTenant: isMultiTenant,
minStalenessInterval: minStalenessInterval,
}
}
@@ -605,6 +602,9 @@ type rollupConfig struct {
// Whether the rollup is used in multi-tenant mode.
// This is used in order to populate labels with tenancy information.
isMultiTenant bool
// The minimum interval for staleness calculations.
minStalenessInterval time.Duration
}
func (rc *rollupConfig) getTimestamps() []int64 {
@@ -728,8 +728,8 @@ func (rc *rollupConfig) doInternal(dstValues []float64, tsm *timeseriesMap, valu
if rc.LookbackDelta > 0 && maxPrevInterval > rc.LookbackDelta {
maxPrevInterval = rc.LookbackDelta
}
if *minStalenessInterval > 0 {
if msi := minStalenessInterval.Milliseconds(); msi > 0 && maxPrevInterval < msi {
if rc.minStalenessInterval > 0 {
if msi := rc.minStalenessInterval.Milliseconds(); msi > 0 && maxPrevInterval < msi {
maxPrevInterval = msi
}
}
@@ -2448,13 +2448,14 @@ func rollupFake(_ *rollupFuncArg) float64 {
return 0
}
// getScalar expects result from a [scalar](https://prometheus.io/docs/prometheus/latest/querying/basics/#expression-language-data-types).
func getScalar(arg any, argNum int) ([]float64, error) {
ts, ok := arg.([]*timeseries)
if !ok {
return nil, fmt.Errorf(`unexpected type for arg #%d; got %T; want %T`, argNum+1, arg, ts)
return nil, fmt.Errorf(`arg #%d must be a scalar`, argNum+1)
}
if len(ts) != 1 {
return nil, fmt.Errorf(`arg #%d must contain a single timeseries; got %d timeseries`, argNum+1, len(ts))
return nil, fmt.Errorf(`arg #%d must be a scalar`, argNum+1)
}
return ts[0].Values, nil
}
@@ -2471,14 +2472,15 @@ func getIntNumber(arg any, argNum int) (int, error) {
return n, nil
}
// getString expects result from a string expression, which contains a single timeseries with only NaN values.
func getString(tss []*timeseries, argNum int) (string, error) {
if len(tss) != 1 {
return "", fmt.Errorf(`arg #%d must contain a single timeseries; got %d timeseries`, argNum+1, len(tss))
return "", fmt.Errorf(`arg #%d must be a string`, argNum+1)
}
ts := tss[0]
for _, v := range ts.Values {
if !math.IsNaN(v) {
return "", fmt.Errorf(`arg #%d contains non-string timeseries`, argNum+1)
return "", fmt.Errorf(`arg #%d must be a string`, argNum+1)
}
}
return string(ts.MetricName.MetricGroup), nil


@@ -903,7 +903,6 @@ func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
// Convert buckets with `vmrange` labels to buckets with `le` labels.
tss := vmrangeBucketsToLE(args[1])
// Parse boundsLabel. See https://github.com/prometheus/prometheus/issues/5706 for details.
var boundsLabel string
if len(args) > 2 {
@@ -1050,9 +1049,15 @@ func fixBrokenBuckets(i int, xss []leTimeseries) {
return
}
vNext := xss[0].ts.Values[i]
// Set the lowest bucket to 0 if its value is NaN, so it can be properly
// compared with upper buckets in the loop below.
if math.IsNaN(vNext) {
vNext = 0
xss[0].ts.Values[i] = vNext
}
// Substitute upper bucket values with lower bucket values if the upper values are NaN
// or are smaller than the lower bucket values.
vNext := xss[0].ts.Values[i]
for j := 1; j < len(xss); j++ {
v := xss[j].ts.Values[i]
if math.IsNaN(v) || vNext > v {


@@ -37,6 +37,9 @@ func TestFixBrokenBuckets(t *testing.T) {
f([]float64{5, 1, 2, 3, nan}, []float64{5, 5, 5, 5, 5})
f([]float64{1, 5, 2, nan, 6, 3}, []float64{1, 5, 5, 5, 6, 6})
f([]float64{5, 10, 4, 3}, []float64{5, 10, 10, 10})
f([]float64{nan, 2, nan, 5}, []float64{0, 2, 2, 5})
f([]float64{nan, nan, 4, 5}, []float64{0, 0, 4, 5})
f([]float64{nan, nan, nan, 4}, []float64{0, 0, 0, 4})
}
func TestFixBrokenBucketsMultipleValues(t *testing.T) {
@@ -44,12 +47,11 @@ func TestFixBrokenBucketsMultipleValues(t *testing.T) {
t.Helper()
xss := make([]leTimeseries, len(values))
for i, v := range values {
xss[i].ts = &timeseries{
Values: v,
}
}
for i := range len(values) - 1 {
for i := range len(values[0]) {
fixBrokenBuckets(i, xss)
}
result := make([][]float64, len(values))
@@ -61,6 +63,8 @@ func TestFixBrokenBucketsMultipleValues(t *testing.T) {
}
}
f([][]float64{{10, 1}, {11, 2}, {13, 3}}, [][]float64{{10, 1}, {11, 2}, {13, 3}})
f([][]float64{{nan, nan}, {11, 2}, {13, 3}}, [][]float64{{0, 0}, {11, 2}, {13, 3}})
f([][]float64{{nan, nan, nan}, {11, 2, 3}, {13, 3, 4}}, [][]float64{{0, 0, 0}, {11, 2, 3}, {13, 3, 4}})
}
func TestVmrangeBucketsToLE(t *testing.T) {


@@ -36,7 +36,7 @@
<meta property="og:title" content="UI for VictoriaMetrics">
<meta property="og:url" content="https://victoriametrics.com/">
<meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data">
<script type="module" crossorigin src="./assets/index-BT5pWGkz.js"></script>
<script type="module" crossorigin src="./assets/index-Ck5nH8JI.js"></script>
<link rel="modulepreload" crossorigin href="./assets/vendor-BVRvRxZ2.js">
<link rel="stylesheet" crossorigin href="./assets/vendor-D1GxaB_c.css">
<link rel="stylesheet" crossorigin href="./assets/index-BHg4iVVe.css">


@@ -416,6 +416,7 @@ func writeStorageMetrics(w io.Writer, strg *storage.Storage) {
metrics.WriteGaugeUint64(w, fmt.Sprintf(`vm_free_disk_space_bytes{path=%q}`, *storageDataPath), fs.MustGetFreeSpace(*storageDataPath))
metrics.WriteGaugeUint64(w, fmt.Sprintf(`vm_free_disk_space_limit_bytes{path=%q}`, *storageDataPath), uint64(minFreeDiskSpaceBytes.N))
metrics.WriteGaugeUint64(w, fmt.Sprintf(`vm_total_disk_space_bytes{path=%q}`, *storageDataPath), fs.MustGetTotalSpace(*storageDataPath))
isReadOnly := 0
if strg.IsReadOnly() {


@@ -106,7 +106,10 @@ func (s *VMInsertServer) run() {
// c is stopped inside VMInsertServer.MustStop
return
}
if handshake.IsClientNetworkError(err) {
if handshake.IsTimeoutNetworkError(err) {
logger.Warnf("cannot complete vminsert handshake due to network timeout error with client %q: %s. "+
"If errors are transient and infrequent, increase -rpc.handshakeTimeout and -vmstorageDialTimeout on client and server side. Check vminsert logs for errors", c.RemoteAddr(), err)
} else if handshake.IsClientNetworkError(err) {
logger.Warnf("cannot complete vminsert handshake due to network error with client %q: %s. "+
"Check vminsert logs for errors", c.RemoteAddr(), err)
} else if !handshake.IsTCPHealthcheck(err) {


@@ -1,4 +1,4 @@
FROM golang:1.24.5 AS build-web-stage
FROM golang:1.25.0 AS build-web-stage
COPY build /build
WORKDIR /build


@@ -15,3 +15,24 @@ export const getExportDataUrl = (server: string, query: string, period: TimePara
if (reduceMemUsage) params.set("reduce_mem_usage", "1");
return `${server}/api/v1/export?${params}`;
};
export const getExportCSVDataUrl = (server: string, query: string[], period: TimeParams, reduceMemUsage: boolean): string => {
const params = new URLSearchParams({
start: period.start.toString(),
end: period.end.toString(),
format: "__name__,__value__,__timestamp__:unix_ms",
});
query.forEach(q => params.append("match[]", q));
if (reduceMemUsage) params.set("reduce_mem_usage", "1");
return `${server}/api/v1/export/csv?${params}`;
};
export const getExportJSONDataUrl = (server: string, query: string[], period: TimeParams, reduceMemUsage: boolean): string => {
const params = new URLSearchParams({
start: period.start.toString(),
end: period.end.toString(),
});
query.forEach(q => params.append("match[]", q));
if (reduceMemUsage) params.set("reduce_mem_usage", "1");
return `${server}/api/v1/export?${params}`;
};


@@ -1,20 +1,18 @@
import { FC, useCallback } from "preact/compat";
import { useCallback, useRef } from "preact/compat";
import Tooltip from "../Main/Tooltip/Tooltip";
import Button from "../Main/Button/Button";
import { DownloadIcon } from "../Main/Icons";
import Popper from "../Main/Popper/Popper";
import { useRef } from "react";
import "./style.scss";
import useBoolean from "../../hooks/useBoolean";
interface DownloadButtonProps {
interface DownloadButtonProps<T extends string> {
title: string;
downloadFormatOptions?: string[];
onDownload: (format?: string) => void;
downloadFormatOptions?: T[];
onDownload: (format?: T) => void;
}
/** TODO: Currently unused, later will be added for the exporting metrics */
const DownloadButton: FC<DownloadButtonProps> = ({ title, downloadFormatOptions, onDownload }) => {
const DownloadButton = <T extends string>({ title, downloadFormatOptions, onDownload }: DownloadButtonProps<T>) => {
const {
value: isPopupOpen,
setTrue: onOpenPopup,
@@ -35,9 +33,19 @@ const DownloadButton: FC<DownloadButtonProps> = ({ title, downloadFormatOptions,
}
}, [onDownload, onClosePopup, isPopupOpen, onOpenPopup]);
const isDownloadFormat = useCallback((format: string): format is T => {
return (downloadFormatOptions as string[])?.includes(format);
}, [downloadFormatOptions]);
const onDownloadFormatClick = useCallback((event: Event) => {
const button = event.currentTarget as HTMLButtonElement;
onDownload(button.textContent ?? undefined);
const format = button.textContent;
if (format && isDownloadFormat(format)) {
onDownload(format);
} else {
onDownload();
}
onClosePopup();
}, [onDownload, onClosePopup, isDownloadFormat]);
return (


@@ -578,97 +578,13 @@ export const CommentIcon = () => (
</svg>
);
export const FilterIcon = () => (
export const DebugIcon = () => (
<svg
viewBox="0 0 24 24"
fill="currentColor"
>
<path
d="M4.25 5.61C6.27 8.2 10 13 10 13v6c0 .55.45 1 1 1h2c.55 0 1-.45 1-1v-6s3.72-4.8 5.74-7.39c.51-.66.04-1.61-.79-1.61H5.04c-.83 0-1.3.95-.79 1.61"
></path>
</svg>
);
export const FilterOffIcon = () => (
<svg
viewBox="0 0 24 24"
fill="currentColor"
>
<path
d="M19.79 5.61C20.3 4.95 19.83 4 19 4H6.83l7.97 7.97zM2.81 2.81 1.39 4.22 10 13v6c0 .55.45 1 1 1h2c.55 0 1-.45 1-1v-2.17l5.78 5.78 1.41-1.41z"
></path>
</svg>
);
export const OpenNewIcon = () => (
<svg
viewBox="0 0 24 24"
fill="currentColor"
>
<path
d="M19 19H5V5h7V3H5c-1.11 0-2 .9-2 2v14c0 1.1.89 2 2 2h14c1.1 0 2-.9 2-2v-7h-2zM14 3v2h3.59l-9.83 9.83 1.41 1.41L19 6.41V10h2V3z"
></path>
</svg>
);
export const ModalIcon = () => (
<svg
viewBox="0 0 24 24"
fill="currentColor"
>
<path d="M19 4H5c-1.11 0-2 .9-2 2v12c0 1.1.89 2 2 2h14c1.1 0 2-.9 2-2V6c0-1.1-.89-2-2-2m0 14H5V8h14z"></path>
</svg>
);
export const PauseIcon = () => (
<svg
viewBox="0 0 24 24"
fill="currentColor"
>
<path d="M6 19h4V5H6v14zm8-14v14h4V5h-4z" />
</svg>
);
export const ScrollToTopIcon = () => (
<svg
viewBox="0 0 24 24"
fill="currentColor"
>
<path
d="M8 12l4-4 4 4m-4-4v12"
strokeWidth="2"
stroke="currentColor"
fill="none"
d="M20 8h-2.81c-.45-.78-1.07-1.45-1.82-1.96L17 4.41 15.59 3l-2.17 2.17C12.96 5.06 12.49 5 12 5c-.49 0-.96.06-1.41.17L8.41 3 7 4.41l1.62 1.63C7.88 6.55 7.26 7.22 6.81 8H4v2h2.09c-.05.33-.09.66-.09 1v1H4v2h2v1c0 .34.04.67.09 1H4v2h2.81c1.04 1.79 2.97 3 5.19 3s4.15-1.21 5.19-3H20v-2h-2.09c.05-.33.09-.66.09-1v-1h2v-2h-2v-1c0-.34-.04-.67-.09-1H20V8zm-6 8h-4v-2h4v2zm0-4h-4v-2h4v2z"
/>
</svg>
);
export const SortIcon = () => (
<svg
viewBox="0 0 24 24"
fill="currentColor"
>
<path d="M4 3 L4 15 L1.5 15 L5.5 21 L9.5 15 L7 15 L7 3 Z"/>
<path d="M13 21 L13 9 L10.5 9 L14.5 3 L18.5 9 L16 9 L16 21 Z"/>
</svg>
);
export const SortArrowDownIcon = () => (
<svg
viewBox="0 0 24 24"
fill="currentColor"
>
<path d="M10.5 3 L10.5 15 L8 15 L12 21 L16 15 L13.5 15 L13.5 3 Z"/>
</svg>
);
export const SortArrowUpIcon = () => (
<svg
viewBox="0 0 24 24"
fill="currentColor"
>
<path d="M10.5 21 L10.5 9 L8 9 L12 3 L16 9 L13.5 9 L13.5 21 Z"/>
</svg>
);


@@ -152,7 +152,7 @@ export const useFetchQuery = ({
counter++;
}
const limitText = `Showing ${tempData.length} series out of ${totalLength} series due to performance reasons. Please narrow down the query, so it returns less series`;
const limitText = `Showing ${tempData.length} series out of ${totalLength} series due to performance reasons. Please narrow down the query, so it returns fewer series`;
setWarning(totalLength > seriesLimit ? limitText : "");
isDisplayChart ? setGraphData(tempData as MetricResult[]) : setLiveData(tempData as InstantMetricResult[]);
setTraces(tempTraces);


@@ -1,5 +1,5 @@
import { FC, useCallback, useEffect, useRef, useState } from "preact/compat";
import { DownloadIcon } from "../../../components/Main/Icons";
import { DebugIcon } from "../../../components/Main/Icons";
import Button from "../../../components/Main/Button/Button";
import Tooltip from "../../../components/Main/Tooltip/Tooltip";
import useBoolean from "../../../hooks/useBoolean";
@@ -217,17 +217,17 @@ const DownloadReport: FC<Props> = ({ fetchUrl, reportType = ReportType.QUERY_DAT
return (
<>
<Tooltip title={"Export query"}>
<Tooltip title={"Debug query"}>
<Button
variant="text"
startIcon={<DownloadIcon/>}
startIcon={<DebugIcon />}
onClick={toggleOpen}
ariaLabel="export query"
ariaLabel="Debug query"
/>
</Tooltip>
{openModal && (
<Modal
title={"Export query"}
title={"Debug query"}
onClose={handleClose}
isOpen={openModal}
>


@@ -1,4 +1,4 @@
import { FC, useEffect, useState } from "preact/compat";
import { FC, useEffect, useState, useMemo, useRef, useCallback } from "preact/compat";
import QueryConfigurator from "./QueryConfigurator/QueryConfigurator";
import { useFetchQuery } from "../../hooks/useFetchQuery";
import { DisplayTypeSwitch } from "./DisplayTypeSwitch";
@@ -12,13 +12,17 @@ import Alert from "../../components/Main/Alert/Alert";
import classNames from "classnames";
import useDeviceDetect from "../../hooks/useDeviceDetect";
import InstantQueryTip from "./InstantQueryTip/InstantQueryTip";
import { useRef } from "react";
import CustomPanelTraces from "./CustomPanelTraces/CustomPanelTraces";
import WarningLimitSeries from "./WarningLimitSeries/WarningLimitSeries";
import CustomPanelTabs from "./CustomPanelTabs";
import { DisplayType } from "../../types";
import DownloadReport from "./DownloadReport/DownloadReport";
import WarningHeatmapToLine from "./WarningHeatmapToLine/WarningHeatmapToLine";
import DownloadButton from "../../components/DownloadButton/DownloadButton";
import { downloadCSV, downloadJSON } from "../../utils/file";
import { convertMetricsDataToCSV } from "./utils";
type ExportFormats = "csv" | "json";
const CustomPanel: FC = () => {
useSetQueryParams();
@@ -55,6 +59,27 @@ const CustomPanel: FC = () => {
showAllSeries
});
const fileDownloaders = useMemo(() => {
const getFilename = (format: ExportFormats) => {
return `vmui_export_${query.join("_")}.${format}`;
};
return {
csv: async () => {
if (!liveData) return;
const csvData = convertMetricsDataToCSV(liveData);
downloadCSV(csvData, getFilename("csv"));
},
json: async () => {
downloadJSON(JSON.stringify(liveData), getFilename("json"));
},
};
}, [liveData, query]);
const onDownloadClick = useCallback((format?: ExportFormats) => {
format && fileDownloaders[format]();
}, [fileDownloaders]);
const showInstantQueryTip = !liveData?.length && (displayType !== DisplayType.chart);
const showError = !hideError && error;
@@ -110,7 +135,7 @@ const CustomPanel: FC = () => {
"vm-block_mobile": isMobile,
})}
>
{isLoading && <LineLoader />}
{isLoading && <LineLoader/>}
<div
className="vm-custom-panel-body-header"
ref={controlsRef}
@@ -118,7 +143,13 @@ const CustomPanel: FC = () => {
<div className="vm-custom-panel-body-header__tabs">
<DisplayTypeSwitch/>
</div>
{(graphData || liveData) && <DownloadReport fetchUrl={fetchUrl}/>}
{displayType === "table" && (
<DownloadButton
title={"Export query"}
onDownload={onDownloadClick}
downloadFormatOptions={["json", "csv"]}
/>)}
{(graphData || liveData) && displayType !== "code" && <DownloadReport fetchUrl={fetchUrl}/>}
</div>
<CustomPanelTabs
graphData={graphData}


@@ -0,0 +1,86 @@
import { describe, expect, it } from "vitest";
import { convertMetricsDataToCSV } from "./utils";
import { InstantMetricResult } from "../../api/types";
describe("convertMetricsDataToCSV", () => {
it("should return an empty string if headers are empty", () => {
const data: InstantMetricResult[] = [];
expect(convertMetricsDataToCSV(data)).toBe("");
});
it("should return a valid CSV string for single metric entry with value", () => {
const data: InstantMetricResult[] = [
{
value: [1623945600, "123"],
group: 0,
metric: {
header1: "123",
header2: "value2"
}
},
];
const result = convertMetricsDataToCSV(data);
expect(result).toBe("header1,header2\n123,value2");
});
it("should return a valid CSV string for multiple metric entries with values", () => {
const data: InstantMetricResult[] = [
{
value: [1623945600, "123"],
group: 0,
metric: {
header1: "123",
header2: "value2"
}
},
{
value: [1623949200, "456"],
group: 0,
metric: {
header1: "456",
header2: "value4"
}
},
];
const result = convertMetricsDataToCSV(data);
expect(result).toBe("header1,header2\n123,value2\n456,value4");
});
it("should handle metric entries with a values field", () => {
const data: InstantMetricResult[] = [
{
values: [[1623945600, "123"], [1623949200, "456"]],
group: 0,
metric: {
header1: "123-456",
header2: "values"
}
},
];
const result = convertMetricsDataToCSV(data);
expect(result).toBe("header1,header2\n123-456,values");
});
it("should handle a combination of metric entries with value and values", () => {
const data: InstantMetricResult[] = [
{
value: [1623945600, "123"],
group: 0,
metric: {
header1: "123",
header2: "first"
}
},
{
values: [[1623949200, "456"], [1623952800, "789"]],
group: 0,
metric: {
header1: "456-789",
header2: "second"
}
},
];
const result = convertMetricsDataToCSV(data);
expect(result).toBe("header1,header2\n123,first\n456-789,second");
});
});


@@ -0,0 +1,18 @@
import { InstantMetricResult } from "../../api/types";
import { getColumns, MetricCategory } from "../../hooks/useSortedCategories";
import { formatValueToCSV } from "../../utils/csv";
const getHeaders = (data: InstantMetricResult[]): string => {
return getColumns(data).map(({ key }) => key).join(",");
};
const getRows = (data: InstantMetricResult[], headers: MetricCategory[]) => {
return data?.map(d => headers.map(c => formatValueToCSV(d.metric[c.key] || "-")).join(","));
};
export const convertMetricsDataToCSV = (data: InstantMetricResult[]): string => {
const headers = getHeaders(data);
if (!headers.length) return "";
const rows = getRows(data, getColumns(data));
return [headers, ...rows].join("\n");
};


@@ -1,13 +1,15 @@
import { Dispatch, SetStateAction, useCallback, useEffect, useMemo, useRef, useState } from "preact/compat";
import { MetricBase, MetricResult, ExportMetricResult } from "../../../api/types";
import { ErrorTypes, SeriesLimits } from "../../../types";
import { ErrorTypes, SeriesLimits, TimeParams } from "../../../types";
import { useQueryState } from "../../../state/query/QueryStateContext";
import { useTimeState } from "../../../state/time/TimeStateContext";
import { useAppState } from "../../../state/common/StateContext";
import { useCustomPanelState } from "../../../state/customPanel/CustomPanelStateContext";
import { isValidHttpUrl } from "../../../utils/url";
import { getExportDataUrl } from "../../../api/query-range";
import { getExportCSVDataUrl, getExportDataUrl, getExportJSONDataUrl } from "../../../api/query-range";
import { parseLineToJSON } from "../../../utils/json";
import { downloadCSV, downloadJSON } from "../../../utils/file";
import { useSnack } from "../../../contexts/Snackbar";
interface FetchQueryParams {
hideQuery?: number[];
@@ -16,6 +18,7 @@ interface FetchQueryParams {
interface FetchQueryReturn {
fetchUrl?: string[],
exportData: (format: ExportFormats) => void,
isLoading: boolean,
data?: MetricResult[],
error?: ErrorTypes | string,
@@ -25,11 +28,16 @@ interface FetchQueryReturn {
abortFetch: () => void
}
type ExportFormats = "csv" | "json";
type FormatDownloader = (serverUrl: string, query: string[], period: TimeParams, reduceMemUsage: boolean) => void;
type DownloadFileFormats = Record<ExportFormats, FormatDownloader>
export const useFetchExport = ({ hideQuery, showAllSeries }: FetchQueryParams): FetchQueryReturn => {
const { query } = useQueryState();
const { period } = useTimeState();
const { displayType, reduceMemUsage, seriesLimits: stateSeriesLimits } = useCustomPanelState();
const { serverUrl } = useAppState();
const { showInfoMessage } = useSnack();
const [isLoading, setIsLoading] = useState(false);
const [data, setData] = useState<MetricResult[]>();
@@ -55,6 +63,35 @@ export const useFetchExport = ({ hideQuery, showAllSeries }: FetchQueryParams):
}
}, [serverUrl, period, hideQuery, reduceMemUsage]);
const fileDownloaders: DownloadFileFormats = useMemo(() => {
const getFilename = (format: ExportFormats) => `vmui_export_${query.join("_")}_${period.start}_${period.end}.${format}`;
return {
csv: async () => {
const url = getExportCSVDataUrl(serverUrl, query, period, reduceMemUsage);
try {
const response = await fetch(url);
let text = await response.text();
text = "name,value,timestamp\n" + text;
downloadCSV(text, getFilename("csv"));
} catch (e) {
console.error(e);
showInfoMessage({ text: "Couldn't fetch data for CSV export. Please try again", type: "error" });
}
},
json: async () => {
const url = getExportJSONDataUrl(serverUrl, query, period, reduceMemUsage);
try {
const response = await fetch(url);
const text = await response.text();
downloadJSON(text, getFilename("json"));
} catch (e) {
console.error(e);
showInfoMessage({ text: "Couldn't fetch data for JSON export. Please try again", type: "error" });
}
}
};
}, [query, period, serverUrl, reduceMemUsage]);
const fetchData = useCallback(async ({ fetchUrl, stateSeriesLimits, showAllSeries }: {
fetchUrl: string[];
stateSeriesLimits: SeriesLimits;
@@ -131,7 +168,7 @@ export const useFetchExport = ({ hideQuery, showAllSeries }: FetchQueryParams):
counter++;
}
const limitText = `Showing ${tempData.length} series out of ${totalLength} series due to performance reasons. Please narrow down the query, so it returns less series`;
const limitText = `Showing ${tempData.length} series out of ${totalLength} series due to performance reasons. Please narrow down the query, so it returns fewer series`;
setWarning(totalLength > seriesLimit ? limitText : "");
setData(tempData as MetricResult[]);
setIsLoading(false);
@@ -144,6 +181,12 @@ export const useFetchExport = ({ hideQuery, showAllSeries }: FetchQueryParams):
}
}, [displayType, hideQuery]);
const exportData = useCallback((format: ExportFormats) => {
if (error) return;
const updatedPeriod = { ...period };
fileDownloaders[format](serverUrl, query, updatedPeriod, reduceMemUsage);
}, [serverUrl, query, period, reduceMemUsage, error, fileDownloaders]);
const abortFetch = useCallback(() => {
abortControllerRef.current.abort();
setData([]);
@@ -167,5 +210,6 @@ export const useFetchExport = ({ hideQuery, showAllSeries }: FetchQueryParams):
setQueryErrors,
warning,
abortFetch,
exportData
};
};

@@ -1,4 +1,4 @@
import { FC, useState } from "preact/compat";
import { FC, useCallback, useState } from "preact/compat";
import LineLoader from "../../components/Main/LineLoader/LineLoader";
import { useCustomPanelState } from "../../state/customPanel/CustomPanelStateContext";
import { useQueryState } from "../../state/query/QueryStateContext";
@@ -17,7 +17,7 @@ import { DisplayType } from "../../types";
import Hyperlink from "../../components/Main/Hyperlink/Hyperlink";
import { CloseIcon } from "../../components/Main/Icons";
import Button from "../../components/Main/Button/Button";
import DownloadReport, { ReportType } from "../CustomPanel/DownloadReport/DownloadReport";
import DownloadButton from "../../components/DownloadButton/DownloadButton";
const RawSamplesLink = () => (
<Hyperlink
@@ -66,7 +66,7 @@ const RawQueryPage: FC = () => {
queryErrors,
setQueryErrors,
abortFetch,
fetchUrl,
exportData
} = useFetchExport({ hideQuery, showAllSeries });
const controlsRef = useRef<HTMLDivElement>(null);
@@ -85,6 +85,11 @@ const RawQueryPage: FC = () => {
setShowPageDescription(false);
};
const onExportClick = useCallback(async (format?: "csv" | "json") => {
if (!format) return;
exportData(format);
}, [exportData]);
return (
<div
className={classNames({
@@ -159,9 +164,10 @@ const RawQueryPage: FC = () => {
<DisplayTypeSwitch tabFilter={(tab) => (tab.value !== DisplayType.table)}/>
</div>
{data && (
<DownloadReport
fetchUrl={fetchUrl}
reportType={ReportType.RAW_DATA}
<DownloadButton
title={"Export query"}
downloadFormatOptions={["json", "csv"]}
onDownload={onExportClick}
/>
)}
</div>

@@ -0,0 +1,34 @@
import { describe, expect, it } from "vitest";
import { formatValueToCSV } from "./csv";
describe("formatValueToCSV", () => {
it("should wrap value in quotes if it contains a comma", () => {
const value = "hello,world";
const result = formatValueToCSV(value);
expect(result).toBe("\"hello,world\"");
});
it("should wrap value in quotes if it contains a newline", () => {
const value = "hello\nworld";
const result = formatValueToCSV(value);
expect(result).toBe("\"hello\nworld\"");
});
it("should escape quotes and wrap in quotes if value contains a double quote", () => {
const value = "hello \"world\"";
const result = formatValueToCSV(value);
expect(result).toBe("\"hello \"\"world\"\"\"");
});
it("should return the same value if it does not contain special characters", () => {
const value = "hello world";
const result = formatValueToCSV(value);
expect(result).toBe("hello world");
});
it("should handle empty strings correctly", () => {
const value = "";
const result = formatValueToCSV(value);
expect(result).toBe("");
});
});

@@ -0,0 +1,4 @@
export const formatValueToCSV = (value: string) =>
(value.includes(",") || value.includes("\n") || value.includes("\""))
? "\"" + value.replace(/"/g, "\"\"") + "\""
: value;

@@ -11,38 +11,12 @@ export const downloadFile = (data: Blob, filename: string) => {
URL.revokeObjectURL(url);
};
export const downloadCSV = (data: Record<string, string>[], filename: string) => {
const getHeader = (data: Record<string, string>[]) => {
const headersObj = data.reduce<Record<string, boolean>>((headers, row) => {
Object.keys(row).forEach((key) => {
if(key && !headers[key]){
headers[key] = true;
}
});
return headers;
}, {});
return Object.keys(headersObj);
};
const formatValueToCSV= (value: string) =>
(value.includes(",") || value.includes("\n") || value.includes("\""))
? "\"" + value.replace(/"/g, "\"\"") + "\""
: value;
const convertToCSV = (data: Record<string, string>[]): string => {
const header = getHeader(data);
const rows = data.map(item =>
header.map(fieldName => item[fieldName] ? formatValueToCSV(item[fieldName]): "").join(",")
);
return [header.map(formatValueToCSV).join(","), ...rows].join("\r\n");
};
const csvContent = convertToCSV(data);
const blob = new Blob([csvContent], { type: "text/csv;charset=utf-8;" });
export const downloadCSV = (data: string, filename: string) => {
const blob = new Blob([data], { type: "text/csv;charset=utf-8;" });
downloadFile(blob, filename);
};
export const downloadJSON = (data: string, filename: string) => {
const blob = new Blob([data], { type: "application/json" });
downloadFile(blob, filename);
};
};

@@ -6,6 +6,7 @@ import (
"net"
"net/http"
"net/url"
"regexp"
"strconv"
"strings"
"testing"
@@ -183,3 +184,32 @@ func (app *ServesMetrics) GetMetricsByPrefix(t *testing.T, prefix string) []floa
}
return values
}
func (app *ServesMetrics) GetMetricsByRegexp(t *testing.T, re *regexp.Regexp) []float64 {
t.Helper()
values := []float64{}
metrics, statusCode := app.cli.Get(t, app.metricsURL)
if statusCode != http.StatusOK {
t.Fatalf("unexpected status code: got %d, want %d", statusCode, http.StatusOK)
}
for _, metric := range strings.Split(metrics, "\n") {
if !re.MatchString(metric) {
continue
}
parts := strings.Split(metric, " ")
if len(parts) < 2 {
t.Fatalf("unexpected record format: got %q, want metric name and value separated by a space", metric)
}
value, err := strconv.ParseFloat(parts[len(parts)-1], 64)
if err != nil {
t.Fatalf("could not parse metric value %s: %v", metric, err)
}
values = append(values, value)
}
return values
}

@@ -173,7 +173,7 @@ func (tc *TestCase) MustStartVmagent(instance string, flags []string, promScrape
// vminsert, and one vmselect.
//
// Both Vmsingle and Vmcluster implement the PrometheusWriteQuerier used in
// business logic tests to abstract out the infrasture.
// business logic tests to abstract out the infrastructure.
//
// This type is not suitable for infrastructure tests where custom cluster
// setups are often required.

@@ -17,7 +17,7 @@ func TestClusterMultilevelSelect(t *testing.T) {
//
// vmselect (L2) -> vmselect (L1) -> vmstorage <- vminsert
//
// vmisert writes data into vmstorage.
// vminsert writes data into vmstorage.
// vmselect (L2) reads that data via vmselect (L1).
vmstorage := tc.MustStartVmstorage("vmstorage", []string{

@@ -6,6 +6,7 @@ import (
"net/http/httptest"
"sync"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/apptest"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
@@ -248,3 +249,88 @@ func TestSingleVMAgentDowngradeRemoteWriteProtocol(t *testing.T) {
t.Fatalf("unexpected number of dropped packets; got %d, want %d", actualPacketsDroppedCount, expectedPacketsDroppedTotal)
}
}
func TestSingleVMAgentDropOnOverload(t *testing.T) {
tc := apptest.NewTestCase(t)
defer tc.Stop()
remoteWriteSrv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusNoContent)
}))
defer remoteWriteSrv.Close()
remoteWriteSrv2 := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusServiceUnavailable)
}))
defer remoteWriteSrv2.Close()
vmagent := tc.MustStartVmagent("vmagent", []string{
`-remoteWrite.flushInterval=50ms`,
fmt.Sprintf(`-remoteWrite.url=%s/api/v1/write`, remoteWriteSrv.URL),
fmt.Sprintf(`-remoteWrite.url=%s/api/v1/write`, remoteWriteSrv2.URL),
"-remoteWrite.disableOnDiskQueue=true",
// use only 1 worker to get a full queue faster
"-remoteWrite.queues=1",
// fastqueue size is roughly memory.Allowed() / len(urls) / *maxRowsPerBlock / 100
// Use very large maxRowsPerBlock to get fastqueue of minimal length(2).
// See initRemoteWriteCtxs function in remotewrite.go for details.
"-remoteWrite.maxRowsPerBlock=1000000000",
"-remoteWrite.tmpDataPath=" + tc.Dir() + "/vmagent",
}, ``)
const (
retries = 20
period = 100 * time.Millisecond
)
waitFor := func(f func() bool) {
t.Helper()
for i := 0; i < retries; i++ {
if f() {
return
}
time.Sleep(period)
}
t.Fatalf("timed out waiting for retry #%d", retries)
}
// Real remote write URLs are hidden in metrics
url1 := "1:secret-url"
url2 := "2:secret-url"
// Wait until first request got flushed to remote write server
vmagent.APIV1ImportPrometheusNoWaitFlush(t, []string{
"foo_bar 1 1652169600000", // 2022-05-10T08:00:00Z
}, apptest.QueryOpts{})
waitFor(
func() bool {
return vmagent.RemoteWriteRequests(t, url1) == 1 && vmagent.RemoteWriteRequests(t, url2) == 1
},
)
// Send 2 more requests. The first RW endpoint should receive everything; the second should
// add them to the queue, since its only worker is still busy with the first request.
for i := 0; i < 2; i++ {
vmagent.APIV1ImportPrometheusNoWaitFlush(t, []string{
"foo_bar 1 1652169600000", // 2022-05-10T08:00:00Z
}, apptest.QueryOpts{})
waitFor(
func() bool {
return vmagent.RemoteWriteRequests(t, url1) == 2+i && vmagent.RemoteWritePendingInmemoryBlocks(t, url2) == 1+i
},
)
}
// Send one more request.
vmagent.APIV1ImportPrometheusNoWaitFlush(t, []string{
"foo_bar 1 1652169600000", // 2022-05-10T08:00:00Z
}, apptest.QueryOpts{})
waitFor(
func() bool {
return vmagent.RemoteWriteRequests(t, url1) == 4 && vmagent.RemoteWriteSamplesDropped(t, url2) > 0
},
)
}

@@ -1,8 +1,10 @@
package tests
import (
"context"
"fmt"
"io"
"net"
"net/http"
"net/http/httptest"
"net/url"
@@ -301,3 +303,82 @@ unauthorized_user:
assertBackendsRequestsCount(1)
}
func TestSingleVMAuthUseProxyProtocol(t *testing.T) {
tc := apptest.NewTestCase(t)
defer tc.Stop()
var requestsCount int
var actualForwardedForHeader string
backend := httptest.NewServer(http.HandlerFunc(func(_ http.ResponseWriter, r *http.Request) {
actualForwardedForHeader = r.Header.Get("X-Forwarded-For")
requestsCount++
}))
defer backend.Close()
authConfig := fmt.Sprintf(`
unauthorized_user:
url_prefix: %s
`, backend.URL)
vmauth := tc.MustStartVmauth("vmauth", []string{
"-httpListenAddr.useProxyProtocol=true",
}, authConfig)
req, err := http.NewRequest("GET", fmt.Sprintf("http://%s/backend", vmauth.GetHTTPListenAddr()), nil)
if err != nil {
t.Fatalf("cannot build http.Request: %s", err)
}
// make request using proxy protocol
c := &http.Client{
Transport: &http.Transport{
DialContext: func(_ context.Context, network, addr string) (net.Conn, error) {
conn, err := net.Dial(network, addr)
if err != nil {
return nil, err
}
// Write a proxy protocol header to the connection
if _, err := conn.Write([]byte{
0x0D, 0x0A, 0x0D, 0x0A, 0x00, 0x0D, 0x0A, 0x51, 0x55, 0x49, 0x54, 0x0A, // signature
0x21, // version 2, PROXY command
0x11, // address family IPv4 (AF_INET), transport TCP (STREAM)
0x00, 0x0C, // length: 12 bytes (IPv4 + ports)
192, 168, 1, 100, // source IP
10, 0, 0, 1, // destination IP
0x1F, 0x90, // source port 8080
0x00, 0x50, // destination port 80
}); err != nil {
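// DialContext may run outside the test goroutine, so return the error instead of calling t.Fatalf.
conn.Close()
return nil, fmt.Errorf("cannot send proxy protocol header: %w", err)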
}
return conn, nil
},
},
}
resp, err := c.Do(req)
if err != nil {
t.Fatalf("cannot make http.Get request for target=%q: %s", req.URL, err)
}
responseText, err := io.ReadAll(resp.Body)
if err != nil {
t.Fatalf("cannot read response body: %s", err)
}
resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Fatalf("unexpected http response code: %d, want: %d, response text: %s", resp.StatusCode, http.StatusOK, responseText)
}
// ensure that request was proxied
if requestsCount != 1 {
t.Fatalf("expected to have %d unauthorized proxied requests, got: %d", 1, requestsCount)
}
// ensure that X-Forwarded-For header is set to the source IP from proxy protocol
expectedForwardedForHeader := "192.168.1.100"
if actualForwardedForHeader != expectedForwardedForHeader {
t.Fatalf("expected X-Forwarded-For header to be equal to proxy source IP, got: %s, want: %s", actualForwardedForHeader, expectedForwardedForHeader)
}
}

@@ -104,6 +104,36 @@ func (app *Vmagent) RemoteWritePacketsDroppedTotal(t *testing.T) int {
return int(total)
}
// RemoteWriteSamplesDropped sums up the total number of dropped remote write samples for the given remote write URL.
func (app *Vmagent) RemoteWriteSamplesDropped(t *testing.T, url string) int {
re := regexp.MustCompile(fmt.Sprintf("vmagent_remotewrite_samples_dropped_total{.*url=%q.*}", url))
total := 0.0
for _, v := range app.GetMetricsByRegexp(t, re) {
total += v
}
return int(total)
}
// RemoteWritePendingInmemoryBlocks sums up the total number of pending in-memory blocks for the given remote write URL.
func (app *Vmagent) RemoteWritePendingInmemoryBlocks(t *testing.T, url string) int {
re := regexp.MustCompile(fmt.Sprintf("vmagent_remotewrite_pending_inmemory_blocks{.*url=%q.*}", url))
total := 0.0
for _, v := range app.GetMetricsByRegexp(t, re) {
total += v
}
return int(total)
}
// RemoteWriteRequests sums up the total number of requests sent to the given remote write URL.
func (app *Vmagent) RemoteWriteRequests(t *testing.T, url string) int {
re := regexp.MustCompile(fmt.Sprintf("vmagent_remotewrite_requests_total{.*url=%q.*}", url))
total := 0.0
for _, v := range app.GetMetricsByRegexp(t, re) {
total += v
}
return int(total)
}
// ReloadRelabelConfigs sends SIGHUP to trigger relabel config reload
// and waits until vmagent_relabel_config_reloads_total increases.
// Fails the test if no reload is detected within 3 seconds.
@@ -123,9 +153,7 @@ func (app *Vmagent) ReloadRelabelConfigs(t *testing.T) {
time.Sleep(100 * time.Millisecond)
}
if currTotal <= prevTotal {
t.Fatalf("relabel configs were not reloaded after SIGHUP signal; previous total: %f, current total: %f", prevTotal, currTotal)
}
t.Fatalf("relabel configs were not reloaded after SIGHUP signal; previous total: %f, current total: %f", prevTotal, currTotal)
}
// sendBlocking sends the data to vmstorage by executing `send` function and

@@ -56,25 +56,32 @@ func StartVmauth(instance string, flags []string, cli *Client, configFilePath st
}, nil
}
// UpdateConfiguration performs configuration file reload for app and waits for configuration apply
//
// Due to second prescision of config reload metric, config cannot be reloaded more than 1 time in a second
// UpdateConfiguration updates the vmauth configuration file with the provided YAML content,
// sends SIGHUP to trigger config reload
// and waits until vmauth_config_last_reload_total increases.
// Fails the test if no reload is detected within 2 seconds.
func (app *Vmauth) UpdateConfiguration(t *testing.T, configFileYAML string) {
t.Helper()
ct := int(time.Now().Unix())
fs.MustWriteSync(app.configFilePath, []byte(configFileYAML))
prevTotal := app.GetIntMetric(t, "vmauth_config_last_reload_total")
if err := app.process.Signal(syscall.SIGHUP); err != nil {
t.Fatalf("unexpected signal error: %s", err)
}
for range 10 {
ts := app.GetIntMetric(t, "vmauth_config_last_reload_success_timestamp_seconds")
if ts < ct {
time.Sleep(time.Millisecond * 100)
continue
var currTotal int
for range 20 {
currTotal = app.GetIntMetric(t, "vmauth_config_last_reload_total")
if currTotal > prevTotal {
return
}
return
time.Sleep(time.Millisecond * 100)
}
t.Fatalf("timeout waiting for config reload success")
t.Fatalf("config was not reloaded after SIGHUP signal; previous total: %d, current total: %d", prevTotal, currTotal)
}
// GetHTTPListenAddr returns listen http addr

@@ -49,7 +49,7 @@ func StartVminsert(instance string, flags []string, cli *Client, output io.Write
graphiteListenAddrRE,
openTSDBListenAddrRE,
}
// Add storateNode REs to block until vminsert establishes connections with
// Add storageNode REs to block until vminsert establishes connections with
// all storage nodes. The extracted values are unused.
for _, sn := range storageNodes(flags) {
logRecord := fmt.Sprintf("successfully dialed -storageNode=\"%s\"", sn)

@@ -34,26 +34,28 @@
# for details
tsbs: tsbs-build tsbs-generate-data tsbs-load-data tsbs-generate-queries tsbs-run-queries
TSBS_SCALE := 100000
# If GNU date is available, use it; otherwise, fall back to the standard date command
# User can install GNU date on macOS via `brew install coreutils`
DATE_CMD := $(shell which gdate 2>/dev/null || echo date)
TSBS_START := $(shell $(DATE_CMD) -u -d "1 day ago 00:00:00" +"%Y-%m-%dT%H:%M:%SZ")
TSBS_END := $(shell $(DATE_CMD) -u -d "00:00:00" +"%Y-%m-%dT%H:%M:%SZ")
TSBS_STEP := 80s
TSBS_QUERIES := 1000
TSBS_WORKERS := 4
TSBS_SCALE ?= 100000
TSBS_END ?= $(shell date -u +%Y-%m-%dT00:00:00Z)
TSBS_START ?= $(shell \
NOW=$$(date -u +%s); \
START=$$((NOW - 86400)); \
date -u -d "@$$START" +%Y-%m-%dT00:00:00Z 2>/dev/null || \
date -u -r $$START +%Y-%m-%dT00:00:00Z 2>/dev/null \
)
TSBS_STEP ?= 80s
TSBS_QUERIES ?= 1000
TSBS_WORKERS ?= 4
TSBS_DATA_FILE := /tmp/tsbs-data-$(TSBS_SCALE)-$(TSBS_START)-$(TSBS_END)-$(TSBS_STEP).gz
TSBS_QUERY_FILE := /tmp/tsbs-queries-$(TSBS_SCALE)-$(TSBS_START)-$(TSBS_END)-$(TSBS_QUERIES).gz
# For cluster setup use http://vminsert:8480/insert/0/influx/write
TSBS_WRITE_URLS := http://localhost:8428/write
TSBS_WRITE_URLS ?= http://localhost:8428/write
# For cluster setup use http://vmselect:8481/select/0/prometheus
TSBS_READ_URLS := http://localhost:8428
TSBS_METRICS_URL := http://localhost:8428/metrics
TSBS_READ_URLS ?= http://localhost:8428
TSBS_METRICS_URL ?= http://localhost:8428/metrics
# Build TSBS tools
tsbs-build:
test -d /tmp/tsbs || (git clone https://github.com/timescale/tsbs.git /tmp/tsbs && \
test -d /tmp/tsbs/cmd/tsbs_run_queries_victoriametrics || (git clone https://github.com/timescale/tsbs.git /tmp/tsbs && \
cd /tmp/tsbs/cmd/tsbs_generate_data && GOBIN=/tmp/tsbs/bin go install && \
cd /tmp/tsbs/cmd/tsbs_generate_queries && GOBIN=/tmp/tsbs/bin go install && \
cd /tmp/tsbs/cmd/tsbs_load_victoriametrics && GOBIN=/tmp/tsbs/bin go install && \

@@ -7962,7 +7962,7 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "99th percentile of the number of time series read per query.",
"description": "Shows the max 99th percentile of time series read per query across all vmselects.",
"fieldConfig": {
"defaults": {
"color": {
@@ -8051,7 +8051,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(histogram_quantile(0.99, sum(rate(vm_series_read_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"expr": "max(histogram_quantile(0.99, sum(rate(vm_series_read_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -8068,7 +8068,7 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) read per queried time series.",
"description": "Shows the max 99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) read per time series.",
"fieldConfig": {
"defaults": {
"color": {
@@ -8157,7 +8157,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(histogram_quantile(0.99, sum(rate(vm_rows_read_per_series_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"expr": "max(histogram_quantile(0.99, sum(rate(vm_rows_read_per_series_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -8174,7 +8174,7 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) read per query.",
"description": "Shows the max 99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) read per query.",
"fieldConfig": {
"defaults": {
"color": {
@@ -8263,7 +8263,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(histogram_quantile(0.99, sum(rate(vm_rows_read_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"expr": "max(histogram_quantile(0.99, sum(rate(vm_rows_read_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -8280,7 +8280,7 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) scanner per query.\n\nThis number can exceed number of DatapointsReadPerQuery if `step` query arg passed to [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries) is smaller than the lookbehind window set in square brackets of [rollup function](https://docs.victoriametrics.com/victoriametrics/metricsql/#rollup-functions). For example, if `increase(some_metric[1h])` is executed with the `step=5m`, then the same [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) on a hour time range are scanned `1h/5m=12` times. See [this article](https://valyala.medium.com/how-to-optimize-promql-and-metricsql-queries-85a1b75bf986) for details.",
"description": "Shows the max 99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) scanned per query.\n\nThis number can exceed number of DatapointsReadPerQuery if `step` query arg passed to [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries) is smaller than the lookbehind window set in square brackets of [rollup function](https://docs.victoriametrics.com/victoriametrics/metricsql/#rollup-functions). For example, if `increase(some_metric[1h])` is executed with the `step=5m`, then the same [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) on an hour time range are scanned `1h/5m=12` times. See [this article](https://valyala.medium.com/how-to-optimize-promql-and-metricsql-queries-85a1b75bf986) for details.",
"fieldConfig": {
"defaults": {
"color": {
@@ -8369,7 +8369,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(histogram_quantile(0.99, sum(rate(vm_rows_scanned_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"expr": "max(histogram_quantile(0.99, sum(rate(vm_rows_scanned_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,

@@ -7963,7 +7963,7 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "99th percentile of the number of time series read per query.",
"description": "Shows the max 99th percentile of time series read per query across all vmselects.",
"fieldConfig": {
"defaults": {
"color": {
@@ -8052,7 +8052,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(histogram_quantile(0.99, sum(rate(vm_series_read_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"expr": "max(histogram_quantile(0.99, sum(rate(vm_series_read_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -8069,7 +8069,7 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) read per queried time series.",
"description": "Shows the max 99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) read per time series.",
"fieldConfig": {
"defaults": {
"color": {
@@ -8158,7 +8158,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(histogram_quantile(0.99, sum(rate(vm_rows_read_per_series_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"expr": "max(histogram_quantile(0.99, sum(rate(vm_rows_read_per_series_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -8175,7 +8175,7 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) read per query.",
"description": "Shows the max 99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) read per query.",
"fieldConfig": {
"defaults": {
"color": {
@@ -8264,7 +8264,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(histogram_quantile(0.99, sum(rate(vm_rows_read_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"expr": "max(histogram_quantile(0.99, sum(rate(vm_rows_read_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -8281,7 +8281,7 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) scanner per query.\n\nThis number can exceed number of DatapointsReadPerQuery if `step` query arg passed to [/api/v1/query_range](https://victoriametrics-metrics-datasource.io/docs/victoriametrics-metrics-datasource/latest/querying/api/#range-queries) is smaller than the lookbehind window set in square brackets of [rollup function](https://docs.victoriametrics.com/victoriametrics/metricsql/#rollup-functions). For example, if `increase(some_metric[1h])` is executed with the `step=5m`, then the same [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) on a hour time range are scanned `1h/5m=12` times. See [this article](https://valyala.medium.com/how-to-optimize-promql-and-metricsql-queries-85a1b75bf986) for details.",
"description": "Shows the max 99th percentile of number of [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) scanned per query.\n\nThis number can exceed number of DatapointsReadPerQuery if `step` query arg passed to [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries) is smaller than the lookbehind window set in square brackets of [rollup function](https://docs.victoriametrics.com/victoriametrics/metricsql/#rollup-functions). For example, if `increase(some_metric[1h])` is executed with the `step=5m`, then the same [data samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) on an hour time range are scanned `1h/5m=12` times. See [this article](https://valyala.medium.com/how-to-optimize-promql-and-metricsql-queries-85a1b75bf986) for details.",
"fieldConfig": {
"defaults": {
"color": {
@@ -8370,7 +8370,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(histogram_quantile(0.99, sum(rate(vm_rows_scanned_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"expr": "max(histogram_quantile(0.99, sum(rate(vm_rows_scanned_per_query_bucket{job=~\"$job_select\", instance=~\"$instance\"}[$__rate_interval])) by (instance, vmrange)))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,

@@ -7966,7 +7966,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_promscrape_scraped_samples_sum{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, instance)\n+ sum(rate(vmagent_rows_inserted_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, instance)",
"expr": "sum(rate({__name__=~\"vm_promscrape_scraped_samples_sum|vmagent_rows_inserted_total\",job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, instance)",
"hide": false,
"interval": "",
"legendFormat": "in {{instance}} {{job}}",
@@ -8244,4 +8244,4 @@
"uid": "G7Z9GzMGz_vm",
"version": 1,
"weekStart": ""
}
}

@@ -7965,7 +7965,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_promscrape_scraped_samples_sum{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, instance)\n+ sum(rate(vmagent_rows_inserted_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, instance)",
"expr": "sum(rate({__name__=~\"vm_promscrape_scraped_samples_sum|vmagent_rows_inserted_total\",job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, instance)",
"hide": false,
"interval": "",
"legendFormat": "in {{instance}} {{job}}",
@@ -8243,4 +8243,4 @@
"uid": "G7Z9GzMGz",
"version": 1,
"weekStart": ""
}
}

@@ -7,7 +7,7 @@ ROOT_IMAGE ?= alpine:3.22.1
ROOT_IMAGE_SCRATCH ?= scratch
CERTS_IMAGE := alpine:3.22.1
GO_BUILDER_IMAGE := golang:1.24.5-alpine
GO_BUILDER_IMAGE := golang:1.25.0-alpine
BUILDER_IMAGE := local/builder:2.0.0-$(shell echo $(GO_BUILDER_IMAGE) | tr :/ __)-1
BASE_IMAGE := local/base:1.1.4-$(shell echo $(ROOT_IMAGE) | tr :/ __)-$(shell echo $(CERTS_IMAGE) | tr :/ __)
DOCKER ?= docker
@@ -43,7 +43,7 @@ app-via-docker: package-builder
$(BUILDER_IMAGE) \
go build $(RACE) -trimpath -buildvcs=false \
-ldflags "-extldflags '-static' $(GO_BUILDINFO)" \
-tags 'netgo osusergo musl' \
-tags 'netgo osusergo musl $(EXTRA_GO_BUILD_TAGS)' \
-o bin/$(APP_NAME)$(APP_SUFFIX)-prod $(PKG_PREFIX)/app/$(APP_NAME)
app-via-docker-windows: package-builder
@@ -58,7 +58,7 @@ app-via-docker-windows: package-builder
$(BUILDER_IMAGE) \
go build $(RACE) -trimpath -buildvcs=false \
-ldflags "-s -w -extldflags '-static' $(GO_BUILDINFO)" \
-tags 'netgo osusergo' \
-tags 'netgo osusergo $(EXTRA_GO_BUILD_TAGS)' \
-o bin/$(APP_NAME)-windows$(APP_SUFFIX)-prod.exe $(PKG_PREFIX)/app/$(APP_NAME)
package-via-docker: package-base

@@ -3,7 +3,7 @@ services:
# It scrapes targets defined in --promscrape.config
# and forwards them to --remoteWrite.url
vmagent:
image: victoriametrics/vmagent:v1.122.0
image: victoriametrics/vmagent:v1.123.0
depends_on:
- "vmauth"
ports:
@@ -35,14 +35,14 @@ services:
# vmstorage shards. Each shard receives 1/N of all metrics sent to vminserts,
# where N is number of vmstorages (2 in this case).
vmstorage-1:
image: victoriametrics/vmstorage:v1.122.0-cluster
image: victoriametrics/vmstorage:v1.123.0-cluster
volumes:
- strgdata-1:/storage
command:
- "--storageDataPath=/storage"
restart: always
vmstorage-2:
image: victoriametrics/vmstorage:v1.122.0-cluster
image: victoriametrics/vmstorage:v1.123.0-cluster
volumes:
- strgdata-2:/storage
command:
@@ -52,7 +52,7 @@ services:
# vminsert is ingestion frontend. It receives metrics pushed by vmagent,
# pre-processes them and distributes them across configured vmstorage shards.
vminsert-1:
image: victoriametrics/vminsert:v1.122.0-cluster
image: victoriametrics/vminsert:v1.123.0-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -61,7 +61,7 @@ services:
- "--storageNode=vmstorage-2:8400"
restart: always
vminsert-2:
image: victoriametrics/vminsert:v1.122.0-cluster
image: victoriametrics/vminsert:v1.123.0-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -73,7 +73,7 @@ services:
# vmselect is a query frontend. It serves read queries in MetricsQL or PromQL.
# vmselect collects results from configured `--storageNode` shards.
vmselect-1:
image: victoriametrics/vmselect:v1.122.0-cluster
image: victoriametrics/vmselect:v1.123.0-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -83,7 +83,7 @@ services:
- "--vmalert.proxyURL=http://vmalert:8880"
restart: always
vmselect-2:
image: victoriametrics/vmselect:v1.122.0-cluster
image: victoriametrics/vmselect:v1.123.0-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -98,7 +98,7 @@ services:
# read requests from Grafana, vmui, vmalert among vmselects.
# It can be used as an authentication proxy.
vmauth:
image: victoriametrics/vmauth:v1.122.0
image: victoriametrics/vmauth:v1.123.0
depends_on:
- "vmselect-1"
- "vmselect-2"
@@ -112,7 +112,7 @@ services:
# vmalert executes alerting and recording rules
vmalert:
image: victoriametrics/vmalert:v1.122.0
image: victoriametrics/vmalert:v1.123.0
depends_on:
- "vmauth"
ports:

View File

@@ -3,7 +3,7 @@ services:
# It scrapes targets defined in --promscrape.config
# And forward them to --remoteWrite.url
vmagent:
image: victoriametrics/vmagent:v1.122.0
image: victoriametrics/vmagent:v1.123.0
depends_on:
- "victoriametrics"
ports:
@@ -18,7 +18,7 @@ services:
# VictoriaMetrics instance, a single process responsible for
# storing metrics and serve read requests.
victoriametrics:
image: victoriametrics/victoria-metrics:v1.122.0
image: victoriametrics/victoria-metrics:v1.123.0
ports:
- 8428:8428
- 8089:8089
@@ -54,7 +54,7 @@ services:
# vmalert executes alerting and recording rules
vmalert:
image: victoriametrics/vmalert:v1.122.0
image: victoriametrics/vmalert:v1.123.0
depends_on:
- "victoriametrics"
- "alertmanager"

View File

@@ -7,7 +7,7 @@ groups:
# note the `job` filter and update accordingly to your setup
rules:
- alert: TooManyRestarts
expr: changes(process_start_time_seconds{job=~".*(victoriametrics|vmselect|vminsert|vmstorage|vmagent|vmalert|vmsingle|vmalertmanager|vmauth|victorialogs|vlstorage|vlselect|vlinsert).*"}[15m]) > 2
expr: changes(process_start_time_seconds{job=~".*(victoriametrics|vmselect|vminsert|vmstorage|vmagent|vmalert|vmsingle|vmalertmanager|vmauth).*"}[15m]) > 2
labels:
severity: critical
annotations:
@@ -17,7 +17,7 @@ groups:
It might be crashlooping.
- alert: ServiceDown
expr: up{job=~".*(victoriametrics|vmselect|vminsert|vmstorage|vmagent|vmalert|vmsingle|vmalertmanager|vmauth|victorialogs|vlstorage|vlselect|vlinsert).*"} == 0
expr: up{job=~".*(victoriametrics|vmselect|vminsert|vmstorage|vmagent|vmalert|vmsingle|vmalertmanager|vmauth).*"} == 0
for: 2m
labels:
severity: critical
@@ -59,7 +59,7 @@ groups:
Consider to either increase available CPU resources or decrease the load on the process.
- alert: TooHighGoroutineSchedulingLatency
expr: histogram_quantile(0.99, sum(rate(go_sched_latencies_seconds_bucket[5m])) by (le, job, instance)) > 0.1
expr: histogram_quantile(0.99, sum(rate(go_sched_latencies_seconds_bucket{job=~".*(victoriametrics|vmselect|vminsert|vmstorage|vmagent|vmalert|vmsingle|vmalertmanager|vmauth).*"}[5m])) by (le, job, instance)) > 0.1
for: 15m
labels:
severity: critical

View File

@@ -100,7 +100,7 @@ groups:
summary: "Churn rate is more than 10% on \"{{ $labels.instance }}\" for the last 15m"
description: "VM constantly creates new time series on \"{{ $labels.instance }}\".\n
This effect is known as Churn Rate.\n
High Churn Rate tightly connected with database performance and may
High Churn Rate is tightly connected with database performance and may
result in unexpected OOM's or slow queries."
- alert: TooHighChurnRate24h
@@ -117,7 +117,7 @@ groups:
description: "The number of created new time series over last 24h is 3x times higher than
current number of active series on \"{{ $labels.instance }}\".\n
This effect is known as Churn Rate.\n
High Churn Rate tightly connected with database performance and may
High Churn Rate is tightly connected with database performance and may
result in unexpected OOM's or slow queries."
- alert: TooHighSlowInsertsRate
@@ -135,4 +135,4 @@ groups:
summary: "Percentage of slow inserts is more than 5% on \"{{ $labels.instance }}\" for the last 15m"
description: "High rate of slow inserts on \"{{ $labels.instance }}\" may be a sign of resource exhaustion
for the current load. It is likely more RAM is needed for optimal handling of the current number of active time series.
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183"
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183"

View File

@@ -1,6 +1,6 @@
services:
vmagent:
image: victoriametrics/vmagent:v1.122.0
image: victoriametrics/vmagent:v1.123.0
depends_on:
- "victoriametrics"
ports:
@@ -14,7 +14,7 @@ services:
restart: always
victoriametrics:
image: victoriametrics/victoria-metrics:v1.122.0
image: victoriametrics/victoria-metrics:v1.123.0
ports:
- 8428:8428
volumes:
@@ -40,7 +40,7 @@ services:
restart: always
vmalert:
image: victoriametrics/vmalert:v1.122.0
image: victoriametrics/vmalert:v1.123.0
depends_on:
- "victoriametrics"
ports:
@@ -59,7 +59,7 @@ services:
- '--external.alert.source=explore?orgId=1&left=["now-1h","now","VictoriaMetrics",{"expr": },{"mode":"Metrics"},{"ui":[true,true,true,"none"]}]'
restart: always
vmanomaly:
image: victoriametrics/vmanomaly:v1.25.2
image: victoriametrics/vmanomaly:v1.25.3
depends_on:
- "victoriametrics"
ports:

View File

@@ -1005,7 +1005,7 @@
"refId": "A"
}
],
"title": "Anoamlies: Read Latency",
"title": "Anomalies: Read Latency",
"type": "state-timeline"
},
{

View File

@@ -18,7 +18,7 @@ services:
- vlogs
generator:
image: golang:1.24.5-alpine
image: golang:1.25.0-alpine
restart: always
working_dir: /go/src/app
volumes:

View File

@@ -2,7 +2,7 @@ version: "3"
services:
generator:
image: golang:1.24.5-alpine
image: golang:1.25.0-alpine
restart: always
working_dir: /go/src/app
volumes:

View File

@@ -4,12 +4,12 @@ Benchmark compares VictoriaLogs with ELK stack and Grafana Loki.
Benchmark is based on:
- Logs from this repository - https://github.com/logpai/loghub
- Logs from this repository - [https://github.com/logpai/loghub](https://github.com/logpai/loghub)
- [logs generator](./generator)
For ELK suite it uses:
- filebeat - https://www.elastic.co/beats/filebeat
- filebeat - [https://www.elastic.co/beats/filebeat](https://www.elastic.co/beats/filebeat)
- elastic + kibana
For Grafana Loki suite it uses:
@@ -24,7 +24,7 @@ For Grafana Loki suite it uses:
- VictoriaLogs instance
- vmsingle - port forwarded to `localhost:8428` to see UI
- exporters for system metris
- exporters for system metrics
ELK suite uses [docker-compose-elk.yml](./docker-compose-elk.yml) with the following services:
@@ -54,7 +54,7 @@ Each filebeat than writes logs to elastic and VictoriaLogs via elasticsearch-com
1. Download and unarchive logs by running:
```shell
cd source_logs
cd source_logs
bash download.sh
```
@@ -74,11 +74,11 @@ Unarchived logs size per file for reference:
13G hadoop-*.log
```
2. (optional) If needed, adjust amount of logs sent by generator by modifying `-outputRateLimitItems` and
1. (optional) If needed, adjust amount of logs sent by generator by modifying `-outputRateLimitItems` and
`outputRateLimitPeriod` parameters in [docker-compose.yml](./docker-compose.yml). By default, it is configured to
send 10000 logs per second.
3. (optional) Build victoria-logs image and adjust `image` parameter in [docker-compose.yml](./docker-compose.yml):
1. (optional) Build victoria-logs image and adjust `image` parameter in [docker-compose.yml](./docker-compose.yml):
```shell
make package-victoria-logs
@@ -95,26 +95,27 @@ output.elasticsearch:
hosts: [ "http://vlogs:9428/insert/elasticsearch/" ]
```
4. Choose a suite to run.
1. Choose a suite to run.
In order to run ELK suite use the following command:
```
```sh
make docker-up-elk
```
In order to run Loki suite use the following command:
```
```sh
make docker-up-loki
```
5. Navigate to `http://localhost:3000/` to see Grafana dashboards with resource usage
1. Navigate to `http://localhost:3000/` to see Grafana dashboards with resource usage
comparison.
Navigate to `http://localhost:3000/d/hkm6P6_4z/elastic-vs-vlogs` to see ELK suite results.
Navigate to `http://localhost:3000/d/hkm6P6_4y/loki-vs-vlogs` to see Loki suite results.
Example results vs ELK:
![elk-grafana-dashboard.png](results/elk-grafana-dashboard.png)

View File

@@ -14,6 +14,19 @@ aliases:
---
Please find the changelog for VictoriaMetrics Anomaly Detection below.
## v1.25.3
Released: 2025-08-19
- FEATURE: Added forecasting capabilities to the [`ProphetModel`](https://docs.victoriametrics.com/anomaly-detection/components/models/#prophet). This allows users to generate *future* (point-wise and interval) predictions with offsets defined by the `forecast_at` argument (e.g. `['1d', '1w']`) at the *current* timestamp and store them in the respective series, e.g. `yhat_1d`, `yhat_lower_1d`, `yhat_upper_1d`, etc. This feature is particularly useful for scenarios where future predictions are needed, such as capacity planning or trend analysis. See [FAQ](https://docs.victoriametrics.com/anomaly-detection/faq/#forecasting) for more details.
- IMPROVEMENT: Added `logger_levels` argument to `settings` [config section](https://docs.victoriametrics.com/anomaly-detection/components/settings/#logger-levels) to allow setting specific log levels for individual components. Useful for debugging specific components. For example, `logger_levels: { "reader.vm": "DEBUG" }` will set the log level for the `VmReader` component to `DEBUG`, while leaving other components at their default log levels. It is also supported in [hot reload](https://docs.victoriametrics.com/anomaly-detection/components/#hot-reload) mode, allowing for dynamic log level changes without service restarts.
- IMPROVEMENT: Added logging of URLs used for querying VictoriaMetrics TSDB in [`VmReader`](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader) to ease the debugging of incomplete data retrieval, incorrect endpoints, or misconfigured tenant IDs. The URLs are logged at the `DEBUG` level, so you can control their verbosity using the `--loggerLevelComponents` argument with `reader.vm=DEBUG` or `reader=DEBUG` to see the URLs in the logs.
- IMPROVEMENT: Added `offset` [argument](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader) to `VmReader` on reader and query levels to allow for flexible time offset adjustments in the reader. Useful for correcting for data collection delays. The `offset` can be specified as a string (e.g., "15s", "-20s") and will be applied to all queries processed by the reader. See [FAQ](https://docs.victoriametrics.com/anomaly-detection/faq/#using-offsets) for more details.
- BUGFIX: Resolved the issue where symlinked configuration files were not properly processed by the [hot reload](https://docs.victoriametrics.com/anomaly-detection/components/#hot-reload) mechanism, leading to the service not picking up changes made to the original files. Now it properly resolves symlinks and reloads the configuration when the original file is modified.
## v1.25.2
Released: 2025-07-30

View File

@@ -54,6 +54,25 @@ Respective config is defined in a [`reader`](https://docs.victoriametrics.com/an
## Handling noisy input data
`vmanomaly` operates on data fetched from VictoriaMetrics using [MetricsQL](https://docs.victoriametrics.com/victoriametrics/metricsql/) queries, so the initial data quality can be fine-tuned with aggregation, grouping, and filtering to reduce noise and improve anomaly detection accuracy.
## Using offsets
`vmanomaly` supports {{% available_from "v1.25.3" anomaly %}} the use of offsets in the [`reader`](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader) section to adjust the time range of the data being queried. This can be particularly useful for correcting for data collection delays or other timing issues. It can also be defined or overridden on a [per-query basis](https://docs.victoriametrics.com/anomaly-detection/components/reader/#per-query-parameters).
For example, if you want to query data with a 60-second delay (e.g. data collection happened 1 second ago, but the timestamps written to VictoriaMetrics are 60 seconds in the past), you can set the `offset` argument to `-60s` in the reader section:
```yaml
reader:
class: 'vm'
datasource_url: 'http://localhost:8428'
sampling_period: '10s'
offset: '-60s'
queries:
vmb:
expr: 'avg(vm_blocks)'
cpu_custom_offset:
expr: 'avg(rate(vm_cpu_usage[5m]))'
offset: '-30s' # this will override the global offset for this query only
```
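
The override rule in the config above (a reader-level `offset` that individual queries may replace) can be sketched in a few lines of Python. This is only an illustration of the precedence described here, not `vmanomaly`'s actual implementation; the helper name `effective_offsets` and the plain-dict representation of the config are assumptions made for the example.

```python
# Illustrative only: mirrors the YAML above as a plain dict.
reader = {
    "class": "vm",
    "offset": "-60s",  # reader-level (global) offset
    "queries": {
        "vmb": {"expr": "avg(vm_blocks)"},
        "cpu_custom_offset": {
            "expr": "avg(rate(vm_cpu_usage[5m]))",
            "offset": "-30s",  # per-query override
        },
    },
}


def effective_offsets(reader_cfg):
    """Return the offset each query ends up with.

    Hypothetical helper: a per-query `offset` wins; otherwise the
    reader-level value is used, falling back to '0s' when neither is set.
    """
    default = reader_cfg.get("offset", "0s")
    return {
        name: query.get("offset", default)
        for name, query in reader_cfg["queries"].items()
    }


for name, offset in effective_offsets(reader).items():
    print(f"{name}: {offset}")
```

Here `vmb` inherits the reader-level `-60s`, while `cpu_custom_offset` keeps its own `-30s`.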
## Handling timezones
`vmanomaly` supports timezone-aware anomaly detection {{% available_from "v1.18.0" anomaly %}} through a `tz` argument, available both at the [reader level](https://docs.victoriametrics.com/anomaly-detection/components/reader#vm-reader) and at the [query level](https://docs.victoriametrics.com/anomaly-detection/components/reader/#per-query-parameters).
@@ -179,6 +198,22 @@ While `vmanomaly` detects anomalies and produces scores, it *does not directly g
<img src="https://docs.victoriametrics.com/anomaly-detection/guides/guide-vmanomaly-vmalert/guide-vmanomaly-vmalert_overview.webp" alt="node_exporter_example_diagram" style="width:60%"/>
Once anomaly scores are written back to VictoriaMetrics, you can use [MetricsQL](https://docs.victoriametrics.com/victoriametrics/metricsql/) expressions in `vmalert` to define alerting rules based on these scores. A reasonable default is `anomaly_score > 1`:
```yaml
groups:
- name: vmanomaly_alerts
rules:
- alert: HighAnomalyScore
expr: anomaly_score > 1 # or similar expressions, like `min(anomaly_score{...}) by (...) > 1`
for: 5m
labels:
severity: warning
annotations:
summary: "Anomaly score > 1 for {{ $labels.for }} query"
description: "Anomaly score is {{ $value }} for query {{ $labels.for }}. Value: {{ $value }}."
```
## Preventing alert fatigue
Produced anomaly scores are designed in such a way that values from 0.0 to 1.0 indicate non-anomalous data, while a value greater than 1.0 is generally classified as an anomaly. However, there are no perfect models for anomaly detection, that's why reasonable defaults expressions like `anomaly_score > 1` may not work 100% of the time. However, anomaly scores, produced by `vmanomaly` are written back as metrics to VictoriaMetrics, where tools like [`vmalert`](https://docs.victoriametrics.com/victoriametrics/vmalert/) can use [MetricsQL](https://docs.victoriametrics.com/victoriametrics/metricsql/) expressions to fine-tune alerting thresholds and conditions, balancing between avoiding [false negatives](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-1/#false-negative) and reducing [false positives](https://victoriametrics.com/blog/victoriametrics-anomaly-detection-handbook-chapter-1/#false-positive).
@@ -228,6 +263,117 @@ writer:
Configuration above will produce N intervals of full length (`fit_window`=14d + `fit_every`=1h) until `to_iso` timestamp is reached to run N consecutive `fit` calls to train models; Then these models will be used to produce `M = [fit_every / sampling_frequency]` infer datapoints for `fit_every` range at the end of each such interval, imitating M consecutive calls of `infer_every` in `PeriodicScheduler` [config](https://docs.victoriametrics.com/anomaly-detection/components/scheduler#periodic-scheduler). These datapoints then will be written back to VictoriaMetrics TSDB, defined in `writer` [section](https://docs.victoriametrics.com/anomaly-detection/components/writer#vm-writer) for further visualization (i.e. in VMUI or Grafana)
## Forecasting
Although `vmanomaly` is not primarily designed for forecasting, it can still produce forecasts using [ProphetModel](https://docs.victoriametrics.com/anomaly-detection/components/models#prophet) {{% available_from "v1.25.3" anomaly %}}. This can be helpful in scenarios like capacity planning, resource allocation, or trend analysis when the underlying data is too complex for inline MetricsQL queries, including [predict_linear](https://docs.victoriametrics.com/victoriametrics/metricsql/#predict_linear).
> Use this mode with care: the model will produce `yhat_{h}` (and, optionally, `yhat_lower_{h}` and `yhat_upper_{h}`) time series **for each timeseries returned by input queries and for each forecasting horizon specified in `forecast_at` argument, which can lead to a significant increase in the number of active timeseries in VictoriaMetrics TSDB**.
Here's an example of how to produce forecasts using `vmanomaly` and combine it with the regular model, e.g. to estimate daily outcomes for a disk usage metric:
```yaml
# https://docs.victoriametrics.com/anomaly-detection/components/scheduler/#periodic-scheduler
schedulers:
periodic_5m: # this scheduler will be used to produce anomaly scores each 5 minutes using "regular" simple model
class: 'periodic'
fit_every: '30d'
fit_window: '3d'
infer_every: '5m'
periodic_forecast: # this scheduler will be used to produce forecasts each 24h using "daily" model
class: 'periodic'
fit_every: '7d'
fit_window: '730d' # to fit the model on 2 years of data to account for seasonality and holidays
infer_every: '24h'
# https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader
reader:
class: 'vm'
datasource_url: 'http://play.victoriametrics.com'
tenant_id: '0:0'
sampling_period: '5m'
# other reader params ...
queries:
disk_usage_perc_5m:
expr: |
max_over_time(
1 - (node_filesystem_avail_bytes{mountpoint="/",fstype!="rootfs"}
/
node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs"}),
1h
)
data_range: [0, 1]
# step: '1m' # default will be inherited from sampling_period
disk_usage_perc_1d:
expr: |
max_over_time(
1 - (node_filesystem_avail_bytes{mountpoint="/",fstype!="rootfs"}
/
node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs"}),
24h
)
step: '1d' # override default step to 1d, as we want to produce daily forecasts
data_range: [0, 1]
# https://docs.victoriametrics.com/anomaly-detection/components/models/
models:
quantile_5m:
class: 'quantile_online' # online model, which updates itself each infer call
queries: ['disk_usage_perc_5m']
schedulers: ['periodic_5m']
clip_predictions: True
detection_direction: 'above_expected' # as we are interested in spikes in capacity planning
quantiles: [0.25, 0.5, 0.75] # to produce median and upper quartiles
iqr_threshold: 2.0
prophet_1d:
class: 'prophet'
queries: ['disk_usage_perc_1d']
schedulers: ['periodic_forecast']
clip_predictions: True
detection_direction: 'above_expected' # as we are interested in spikes in capacity planning
forecast_at: ['3d', '7d'] # this will produce forecasts for 3 and 7 days ahead
provide_series: ['yhat', 'yhat_upper'] # to write forecasts back to VictoriaMetrics, omitting `yhat_lower` as it is not needed in this example
# other model params, yearly_seasonality may stay
# https://facebook.github.io/prophet/docs/quick_start#python-api
args:
interval_width: 0.98 # see https://facebook.github.io/prophet/docs/uncertainty_intervals
country_holidays: 'US'
# https://docs.victoriametrics.com/anomaly-detection/components/writer/#vm-writer
writer:
class: 'vm'
datasource_url: '{your_victoriametrics_url_for_writing}'
# tenant_id: '0:0' # or your tenant ID if using clustered VictoriaMetrics
# other writer params ...
# https://docs.victoriametrics.com/anomaly-detection/components/writer/#metrics-formatting
metric_format:
__name__: $VAR
for: $QUERY_KEY
```
Then, respective alerts can be configured in [`vmalert`](https://docs.victoriametrics.com/victoriametrics/vmalert/) to notify about disk exhaustion risks, e.g. if the forecasted disk usage exceeds 90% in the next 7 days or 95% in the next 3 days:
```yaml
groups:
- name: disk_usage_alerts
rules:
- alert: DiskUsageHigh
expr: |
yhat_7d{for="disk_usage_perc_1d"} > 0.9
for: 24h
labels:
severity: critical
annotations:
summary: "Disk usage is forecasted to exceed 90% in the next 7 days"
description: "Disk usage is forecasted to exceed 90% in the next 7 days for instance {{ $labels.instance }}. Forecasted value: {{ $value }}."
- alert: DiskUsageCritical
expr: |
yhat_3d{for="disk_usage_perc_1d"} > 0.95
for: 24h
labels:
severity: critical
annotations:
summary: "Disk usage is forecasted to exceed 95% in the next 3 days"
description: "Disk usage is forecasted to exceed 95% in the next 3 days for instance {{ $labels.instance }}. Forecasted value: {{ $value }}."
```
## Resource consumption of vmanomaly
`vmanomaly` itself is a lightweight service, resource usage is primarily dependent on [scheduling](https://docs.victoriametrics.com/anomaly-detection/components/scheduler) (how often and on what data to fit/infer your models), [# and size of timeseries returned by your queries](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader), and the complexity of the employed [models](https://docs.victoriametrics.com/anomaly-detection/components/models). Its resource usage is directly related to these factors, making it adaptable to various operational scales. Various optimizations are available to balance between RAM usage, processing speed, and model capacity. These options are described in the sections below.
@@ -243,7 +389,7 @@ services:
# ...
vmanomaly:
container_name: vmanomaly
image: victoriametrics/vmanomaly:v1.25.2
image: victoriametrics/vmanomaly:v1.25.3
# ...
ports:
- "8490:8490"
@@ -456,7 +602,7 @@ options:
Here's an example of using the config splitter to divide configurations based on the `extra_filters` argument from the reader section:
```sh
docker pull victoriametrics/vmanomaly:v1.25.2 && docker image tag victoriametrics/vmanomaly:v1.25.2 vmanomaly
docker pull victoriametrics/vmanomaly:v1.25.3 && docker image tag victoriametrics/vmanomaly:v1.25.3 vmanomaly
```
```sh

View File

@@ -121,13 +121,13 @@ Below are the steps to get `vmanomaly` up and running inside a Docker container:
1. Pull Docker image:
```sh
docker pull victoriametrics/vmanomaly:v1.25.2
docker pull victoriametrics/vmanomaly:v1.25.3
```
2. (Optional step) tag the `vmanomaly` Docker image:
```sh
docker image tag victoriametrics/vmanomaly:v1.25.2 vmanomaly
docker image tag victoriametrics/vmanomaly:v1.25.3 vmanomaly
```
3. Start the `vmanomaly` Docker container with a *license file*, use the command below.
@@ -163,7 +163,7 @@ docker run -it --user 1000:1000 \
services:
# ...
vmanomaly:
image: victoriametrics/vmanomaly:v1.25.2
image: victoriametrics/vmanomaly:v1.25.3
volumes:
$YOUR_LICENSE_FILE_PATH:/license
$YOUR_CONFIG_FILE_PATH:/config.yml
@@ -220,6 +220,14 @@ settings:
n_workers: 4 # number of workers to run workload in parallel, set to 0 or negative number to use all available CPU cores
anomaly_score_outside_data_range: 5.0 # default anomaly score for anomalies outside expected data range
restore_state: True # restore state from previous run, available since v1.24.0
# https://docs.victoriametrics.com/anomaly-detection/components/settings/#logger-levels
# to override service-global logger levels, use the `logger_levels` section
logger_levels:
# vmanomaly: info
# scheduler: info
# reader: info
# writer: info
model.prophet: warning
schedulers:
1d_1m:
@@ -299,6 +307,9 @@ For optimal service behavior, consider the following tweaks when configuring `vm
- Set up [anomaly score dashboard](https://docs.victoriametrics.com/anomaly-detection/presets/#grafana-dashboard) to visualize the results of anomaly detection.
- Set up [self-monitoring dashboard](https://docs.victoriametrics.com/anomaly-detection/self-monitoring/) to monitor the health of `vmanomaly` service and its components.
**Logging**:
- Tune logging levels in the `settings.logger_levels` [section](https://docs.victoriametrics.com/anomaly-detection/components/settings/#logger-levels) to control the verbosity of logs. This can help in debugging and monitoring the service behavior, as well as in disabling excessive logging for production environments.
## Check also
Please refer to the following links for a deeper understanding of Anomaly Detection and `vmanomaly`:

View File

@@ -652,7 +652,7 @@ models:
> `ProphetModel` is a [univariate](#univariate-models), [non-rolling](#non-rolling-models), [offline](#offline-models) model.
> {{% available_from "v1.18.2" anomaly %}} the format for `tz_seasonalities` has been updated to enhance flexibility. Previously, it accepted a list of strings (e.g., `['hod', 'minute']`). Now, it follows the same structure as custom seasonalities defined in the `seasonalities` argument (e.g., `{"name": "hod", "fourier_order": 5, "mode": "additive"}`). This change is backward-compatible, so older configurations will be automatically converted to the new format using default values.
> {{% available_from "v1.25.3" anomaly %}} Producing forecasts for future timestamps is now supported. To enable this, set the `forecast_at` argument to a list of relative future offsets (e.g., `['1h', '1d']`). The model will then generate forecasts for these future timestamps, which can be useful for planning and resource allocation. Output series are affected by the [provide_series](#provide-series) argument, which needs to include at least `yhat` for point-wise forecasts (and `yhat_lower` and/or `yhat_upper` for the respective confidence intervals). See the example below for more details.
*Parameters specific for vmanomaly*:
@@ -661,7 +661,11 @@ models:
- `scale`{{% available_from "v1.18.0" anomaly %}} (float): Is used to adjust the margins between `yhat` and [`yhat_lower`, `yhat_upper`]. New margin = `|yhat_* - yhat_lower| * scale`. Defaults to 1 (no scaling is applied). See `scale`[common arg](https://docs.victoriametrics.com/anomaly-detection/components/models/#scale) section for detailed instructions and 2-sided option.
- `tz_aware`{{% available_from "v1.18.0" anomaly %}} (bool): Enables handling of timezone-aware timestamps. Default is `False`. Should be used with `tz_seasonalities` and `tz_use_cyclical_encoding` parameters.
- `tz_seasonalities`{{% available_from "v1.18.0" anomaly %}} (list[dict]): Specifies timezone-aware seasonal components. Requires `tz_aware=True`. Supported options include `minute`, `hod` (hour of day), `dow` (day of week), and `month` (month of year). {{% available_from "v1.18.2" anomaly %}} users can configure additional parameters for each seasonality, such as `fourier_order`, `prior_scale`, and `mode`. For more details, please refer to the **Timezone-unaware** configuration example below.
> {{% available_from "v1.18.2" anomaly %}} the format for `tz_seasonalities` has been updated to enhance flexibility. Previously, it accepted a list of strings (e.g., `['hod', 'minute']`). Now, it follows the same structure as custom seasonalities defined in the `seasonalities` argument (e.g., `{"name": "hod", "fourier_order": 5, "mode": "additive"}`). This change is backward-compatible, so older configurations will be automatically converted to the new format using default values.
- `tz_use_cyclical_encoding`{{% available_from "v1.18.0" anomaly %}} (bool): If set to `True`, applies [cyclical encoding technique](https://www.kaggle.com/code/avanwyk/encoding-cyclical-features-for-deep-learning) to timezone-aware seasonalities. Should be used with `tz_aware=True` and `tz_seasonalities`.
- `forecast_at`{{% available_from "v1.25.3" anomaly %}} (list[str]): Specifies future relative offsets for which forecasts should be generated (e.g., `['1h', '1d']`). Works similarly to [predict_linear](https://docs.victoriametrics.com/victoriametrics/metricsql/#predict_linear) in MetricsQL, but with more flexibility and seasonality support: produced series will have *the same timestamp* as the other [output](#vmanomaly-output) series, but with the forecasted value for the *future timestamp*. Defaults to `[]` (empty list, meaning no future forecasts are produced). If set, `provide_series` must include at least `yhat` for point-wise forecasts (and `yhat_lower` and/or `yhat_upper` for the respective confidence intervals). For example, if `forecast_at` is set to `['1h', '1d']`, the model will produce forecasts for both the next hour and the next day, and these series can be accessed by `yhat_1h`, `yhat_lower_1h`, `yhat_upper_1h`, `yhat_1d`, `yhat_lower_1d`, and `yhat_upper_1d` in the output, respectively. See [FAQ](https://docs.victoriametrics.com/anomaly-detection/faq/#forecasting) for more details.
> The `forecast_at` parameter can lead to a **significant increase in active timeseries** if your queries return many time series, as it produces additional series for each of the future offsets specified in `forecast_at` (multiplied by 1 to 3, depending on whether interval forecasts are included). For example, if your query returns 1000 time series, `forecast_at` is set to `['1h', '1d', '1w']`, and `provide_series` includes `yhat`, `yhat_lower`, and `yhat_upper`, it will produce 1000 (series) * 3 (forecast offsets) * 3 (point forecast plus lower and upper bounds) = 9000 additional timeseries. Consider using it only on a small subset of metrics (e.g. grouped by `host` or `region`) to avoid this issue, as it also **increases the duration of inference calls proportionally to the number of `forecast_at` elements**.
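
The series-count arithmetic from the note above can be reproduced as a quick back-of-the-envelope estimate. This is a minimal sketch, not part of `vmanomaly`: the function name `estimate_forecast_series` is hypothetical, and it assumes exactly one extra series per input series, per `forecast_at` offset, per forecast-related entry (`yhat`, `yhat_lower`, `yhat_upper`) listed in `provide_series`.

```python
# Forecast-related outputs that get a per-offset copy (e.g. yhat_1d).
FORECAST_VARS = ("yhat", "yhat_lower", "yhat_upper")


def estimate_forecast_series(n_input_series, forecast_at, provide_series):
    """Rough count of additional series created by `forecast_at`.

    Hypothetical helper for capacity estimates only; entries of
    `provide_series` that are not forecast-related (such as
    `anomaly_score`) are ignored.
    """
    n_vars = sum(1 for var in FORECAST_VARS if var in provide_series)
    return n_input_series * len(forecast_at) * n_vars


extra = estimate_forecast_series(
    n_input_series=1000,
    forecast_at=["1h", "1d", "1w"],
    provide_series=["anomaly_score", "yhat", "yhat_lower", "yhat_upper"],
)
print(extra)  # 9000
```

Running the same estimate against the number of series your own queries return, before enabling `forecast_at`, gives a rough idea of the added cardinality.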
> Apart from standard [`vmanomaly` output](#vmanomaly-output), Prophet model can provide additional metrics.
@@ -1308,7 +1312,7 @@ monitoring:
Let's pull the docker image for `vmanomaly`:
```sh
docker pull victoriametrics/vmanomaly:v1.25.2
docker pull victoriametrics/vmanomaly:v1.25.3
```
Now we can run the docker container putting as volumes both config and model file:
@@ -1322,7 +1326,7 @@ docker run -it \
-v $(PWD)/license:/license \
-v $(PWD)/custom_model.py:/vmanomaly/model/custom.py \
-v $(PWD)/custom.yaml:/config.yaml \
victoriametrics/vmanomaly:v1.25.2 /config.yaml \
victoriametrics/vmanomaly:v1.25.3 /config.yaml \
--licenseFile=/license
```

View File

@@ -85,14 +85,18 @@ There is change{{% available_from "v1.13.0" anomaly %}} of [`queries`](https://d
> The recommended approach for using per-query `tenant_id`s is to set both `reader.tenant_id` and `writer.tenant_id` to `multitenant`. See [this section](https://docs.victoriametrics.com/anomaly-detection/components/writer/#multitenancy-support) for more details. Configurations where `reader.tenant_id` equals `writer.tenant_id` and is not `multitenant` are also considered safe, provided there is a single, DISTINCT `tenant_id` defined in the reader (either at the reader level or the query level, if set).
- `offset` {{% available_from "v1.25.3" anomaly %}} (string): this optional argument allows specifying a time offset for the query, which can be useful for adjusting the query time range to account for data collection delays or other timing issues. The offset is specified as a string (e.g., "15s", "-20s") and will be applied to the query time range. Valid resolutions are `ms`, `s`, `m`, `h`, `d` (milliseconds, seconds, minutes, hours, days). If not set, defaults to `0s` (0). See [FAQ](https://docs.victoriametrics.com/anomaly-detection/faq/#using-offsets) for more details.
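
To make the offset string format concrete, here is a minimal sketch of how values such as `"15s"` or `"-20s"` map to durations. It is not `vmanomaly`'s own parser: the helper name `parse_offset` is hypothetical, and the sketch assumes an optional leading minus sign, an integer magnitude, and exactly one of the units listed above.

```python
import re
from datetime import timedelta

# Unit suffixes listed above, mapped to timedelta keyword arguments.
_UNITS = {
    "ms": "milliseconds",
    "s": "seconds",
    "m": "minutes",
    "h": "hours",
    "d": "days",
}
_OFFSET_RE = re.compile(r"(-?)(\d+)(ms|s|m|h|d)")


def parse_offset(value):
    """Convert an offset string like '15s' or '-20s' to a timedelta.

    Hypothetical helper for illustration; raises ValueError on input
    that does not match the assumed format.
    """
    match = _OFFSET_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported offset: {value!r}")
    sign, magnitude, unit = match.groups()
    delta = timedelta(**{_UNITS[unit]: int(magnitude)})
    return -delta if sign else delta


print(parse_offset("15s").total_seconds())   # 15.0
print(parse_offset("-20s").total_seconds())  # -20.0
```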
### Per-query config example
```yaml
reader:
class: 'vm'
sampling_period: '1m'
datasource_url: 'https://play.victoriametrics.com/' # source victoriametrics/prometheus
max_points_per_query: 10000
data_range: [0, 'inf']
tenant_id: 'multitenant'
offset: '0s' # optional, defaults to 0s if not set
# other reader params ...
queries:
ingestion_rate_t1:
@@ -109,6 +113,7 @@ reader:
max_points_per_query: 5000 # overrides reader-level value of 10000 for `ingestion_rate` query
tz: 'America/New_York' # to override reader-wise `tz`
tenant_id: '2:0' # overriding tenant_id to isolate data
offset: '-15s' # to override reader-wise `offset` and query data 15 seconds earlier to account for data collection delays
```
### Config parameters
@@ -395,10 +400,24 @@ Optional argument{{% available_from "v1.18.0" anomaly %}} specifies the [IANA](h
Optional argument{{% available_from "v1.18.1" anomaly %}} allows defining **valid** data ranges for input of all the queries in `queries`. Defaults to `["-inf", "inf"]` if not set and can be overridden on a [per-query basis](#per-query-parameters).
</td>
</tr>
<tr>
<td>
<span style="white-space: nowrap;">`offset`</span>
</td>
<td>
`60s`
</td>
<td>
Optional argument{{% available_from "v1.25.3" anomaly %}} allows specifying a time offset for all queries in `queries`. Defaults to `0s` (0) if not set and can be overridden on a [per-query basis](#per-query-parameters).
</td>
</tr>
</tbody>
</table>
Config file example:
<br>
Config section example:
```yaml
reader:
@@ -407,6 +426,7 @@ reader:
tenant_id: '0:0'
tz: 'America/New_York'
data_range: [1, 'inf'] # reader-level
offset: '0s' # reader-level
queries:
ingestion_rate:
expr: 'sum(rate(vm_rows_inserted_total[5m])) by (type) > 0'
@@ -414,6 +434,7 @@ reader:
data_range: [0, 'inf'] # if set, overrides reader-level data_range
tz: 'Australia/Sydney' # if set, overrides reader-level tz
# tenant_id: '1:0' # if set, overrides reader-level tenant_id
# offset: '-15s' # if set, overrides reader-level offset
sampling_period: '1m'
query_from_last_seen_timestamp: True # false by default
latency_offset: '1ms'


@@ -305,3 +305,27 @@ reader: # can be partially reused, because its class and datasource URL are unc
This means that the service upon restart:
1. Won't restore the state of the `zscore_online` model, because its `z_threshold` argument **has changed**. Retraining from scratch is needed on the last `fit_window` = 24 hours of data for `q1`, `q2` and `q3` (the model's `queries` arg is not set, so it defaults to all queries found in the reader).
2. Will **partially** restore the state of `prophet` model, because its class and schedulers are unchanged, but **only instances trained on timeseries returned by `q1` query**. New fit/infer jobs will be set for new query `q3`. The old query `q2` artifacts will be dropped upon restart - all respective models and data for (`prophet`, `q2`) combination will be removed from the database file and from the disk.
### Logger Levels
{{% available_from "v1.25.3" anomaly %}} The `vmanomaly` service supports per-component logger levels, which let you control the verbosity of logs for each component independently. This is useful for debugging or monitoring specific components without flooding the logs with output from other components. Prefixes are also supported, so a single entry can set the logger level for all components that share a prefix.
The logger levels are set in the `settings` section of the config file under the `logger_levels` key, where each key is a component name or prefix and each value is the desired logger level. The available logger levels are: `debug`, `info`, `warning`, `error`, and `critical`.
> Best used in combination with [hot-reload](https://docs.victoriametrics.com/anomaly-detection/components/#hot-reload) to change the logger levels *on-the-fly* without restarting the service, through a short-circuit config check that doesn't even trigger the state restoration logic.
Here's an example configuration that sets the logger level to `debug` for the `reader.vm` component and to `warning` for all components with the `model` prefix, while the `--loggerLevel` [command line argument](https://docs.victoriametrics.com/anomaly-detection/quickstart/#command-line-arguments) sets the default logger level to `info` for all other components, unless overridden by the config:
> If an entry is commented out and the config is then hot-reloaded, the logger level for that component falls back to the value of the `--loggerLevel` command line argument, which defaults to `info` if not specified.
```yaml
settings:
n_workers: 4
restore_state: True # enables state restoration
logger_levels:
reader.vm: debug # affects only VmReader logs
model: warning # applies to all components with 'model' prefix, such as 'model.zscore_online', 'model.prophet', etc.
# once commented out in hot-reload mode, will use the default logger level set by --loggerLevel command line argument
# monitoring.push: critical
```
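As a rough illustration of how prefix-based levels like the ones above could be resolved, here is a minimal Python sketch. It is not the service's actual code: `resolve_level` is a hypothetical helper, and the rule that the longest matching dot-separated prefix wins is an assumption, since the text only states that prefixes are supported.

```python
from typing import Dict


def resolve_level(component: str, logger_levels: Dict[str, str], default: str = "info") -> str:
    """Hypothetical helper: pick the level for a component name.

    Assumption: the longest matching dot-separated prefix wins, and
    `default` stands in for the value of the --loggerLevel argument.
    """
    parts = component.split(".")
    # Try 'a.b.c', then 'a.b', then 'a'.
    for i in range(len(parts), 0, -1):
        key = ".".join(parts[:i])
        if key in logger_levels:
            return logger_levels[key]
    return default


# Mirrors the YAML example: 'monitoring.push' is commented out there.
levels = {"reader.vm": "debug", "model": "warning"}
reader_level = resolve_level("reader.vm", levels)
model_level = resolve_level("model.prophet", levels)
push_level = resolve_level("monitoring.push", levels)
```

With these inputs, `model.prophet` inherits the `model` entry, while `monitoring.push` has no entry and falls back to the default.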
