Compare commits

..

65 Commits

Author SHA1 Message Date
Max Kotliar
ebc9d49e50 docs: forward port LTS v1.136.8 changelog to upstream
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-30 20:50:48 +03:00
f41gh7
b2a6fba673 docs/changelog: mention vminsert enterprise bugfix
At v1.142.0 was introduced a bug, when changes from OSS version were
 back-ported into Enterprise branch. It changed the order of storage
 nodes discovery. And resulted into:
 * overwrite of discovered storage nodes
 * duplicate of per storage node metrics

  This bug only affects enterprise vminsert version.
2026-04-30 17:13:40 +02:00
Roman Khavronenko
6100b8ba10 docs/vmalert: mention -rule.stripFilePath in #security (#10902)
Mention -rule.stripFilePath cmd-limne flag in security recommendations,
so users can be aware of it.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Haley Wang <haley@victoriametrics.com>
2026-04-29 20:01:25 +02:00
Roman Khavronenko
403d32f57f docs: mention AI observability (#10903)
The change adds `AI observability` section to `AI tools` documentation.
It mentions excellent @Amper articles describing these integrations in
all details.

The doc change doesn't repeat the articles, but rather helps users to
discover them.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-04-29 20:00:49 +02:00
Mathias Palmersheim
ed8ebb8314 docs/vmalert: clarified urls for tenant option (#10898)
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10897 by clarifying what URLS should be used for `-datasource.url`, `-remoteRead.url`, and `-remoteWrite.url` when `-clusterMode` is specified.


PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10898

---------

Co-authored-by: Haley Wang <haley@victoriametrics.com>
2026-04-29 12:18:08 +03:00
Hui Wang
55c8bb26db docs: polish stream aggregation doc (#10896)
PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10896

Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-29 12:12:31 +03:00
Max Kotliar
129358f9ea docs: update release guidance doc (#10887)
Leave only generic details about the release process in public docs.

To maintainers: 
All internal details are described in
https://github.com/VictoriaMetrics/release/blob/main/README.md. The new
document contains up-to-date release process guidance. Please refer to
it instead while preparing a new release.

An archived version of this document is available at:
https://github.com/VictoriaMetrics/release/blob/main/legacy_docs/Release-Guide.md.
2026-04-29 12:04:59 +03:00
Hui Wang
5d5e5b3e44 app/vmalert: add -rule.stripFilePath flag
The flag already exists in the ENT version. We decided to expose it in
OSS and strip the path from all public places, including all
APIs(includes `/metrics`) and debug logs(it's minor info there).

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5625
2026-04-29 10:12:11 +02:00
andriibeee
88882227f7 app/vmalert: add formatTime template function
This commit adds `formatTime` template function to the vmalert. Which accepts format string and current timestamp.

{{ now | formatTime "2006-01-02T15:04:05Z07:00" }}


Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10624
2026-04-29 10:09:54 +02:00
Nikolay
64e43e59a7 lib/httpserver: suppress TCP health check for tls connections
Previously, if `-tls` flag was provided, victoria metrics components
produced the following log error entry at health checks:

 http: TLS handshake error from 10.244.0.1:46556: EOF

Such health checks are common for many orchestration systems, such as
consul
or kubernetes. And default http server already suppresses such EOF
health checks.

 This commit adds suppression to the tls server as well.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10538
2026-04-29 09:59:57 +02:00
Max Kotliar
200a764d32 docs: add links to telegram channels (#10894)
PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10894
2026-04-28 19:22:59 +03:00
Pablo (Tomas) Fernandez
b29ad9e6ce docs: update guide "Collecting OpenShift logs with Victoria Logs" (#10864)
# What Changed

- Updated the operator installation procedure
- Updated the commands to match the rest of the guides
- Updated screenshots
- Reordered steps to make more sense of the process
- Fixed issues in the YAML
- Tested on actual OpenShift trial instance running on AWS
- Added steps to confirm log ingestion using VMUI

PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10864
2026-04-28 16:59:30 +03:00
Pablo (Tomas) Fernandez
00c0c149da docs: fix links in docs; refine security page (#10874)
This PR fixes several broken links and anchors in the victoriametrics
docs.

Note about links changes in FAQ.md file. The links inside the paragraph
break navigation in the right-side menu. To fix this, an explicit anchor
definition has been added. The anchor is the same as before, setting it
explsitly fixes the siebar links.

See https://github.com/VictoriaMetrics/vmdocs/issues/221 for the
up-to-date list once this PR is merged.

PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10874
2026-04-28 16:58:09 +03:00
Max Kotliar
542ea4788e app/vmalert: fix typo in comment 2026-04-28 16:38:49 +03:00
Max Kotliar
124bdbd383 docs: Replace waiting_for_release with completed label in CONTRIBUTING.md 2026-04-28 16:37:23 +03:00
Max Kotliar
1b3e549833 docs/changelog: cleanup CHANGELOG_2025.md 2026-04-28 16:31:46 +03:00
Max Kotliar
c37b78f366 docs: bump version to v1.142.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-28 14:05:59 +03:00
Max Kotliar
017bfc094d deplyoment/docker: bump version to v1.142.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-28 14:05:02 +03:00
Max Kotliar
411ec81619 docs: forward port LTS v1.136.7 changelog to upstream
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-28 14:02:21 +03:00
Max Kotliar
64ccd2ed44 docs/changelog: cut release v1.142.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-28 12:55:51 +03:00
Nikolay
89c0b1c1aa lib/opentelemetry: properly reset metric metadata
Previously, metricMetadata was not properly reset during parsing of
metrics. It could result into `Unit` suffix to be added from previously
parsed metric into next metric without Unit field.

  For example, metric `http_request` with `Unit` `seconds` will be
converted into `http_request_seconds` and `Unit` field hold `seconds`.
Next parsed metric `cpu_usage_ratio` has no `Unit` and it will get
previous `seconds` `Unit` -> `cpu_usage_ratio_seconds`.

 This commit adds metricMetadata reset call before parsing of next
 metric.

 Bug was introduced at 293d80910c

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10889
2026-04-28 11:17:12 +02:00
Hui Wang
387a54d3c8 dashboards: polish vmauth dashboard (#10884)
See updated dashboard in
https://play-grafana.victoriametrics.com/d/nbuo5Mr4k/victoriametrics-vmauth?orgId=1&from=now-3h&to=now&timezone=browser&var-ds=P4169E866C3094E38&var-job=vmclusterlb-benchmark-vm-cluster-lts&var-instance=$__all&var-user=$__all&var-adhoc=&refresh=30s.

`Stats`:
1. `Users count`: set default value 0;
2. `Uptime`: count vmauth instances per job instead of showing instance
uptime, to be consistent with other dashboards. The actual uptime is not
very useful and is hard to read.

`Overview`:
1. Reorder panels;
2. `Requests rejected rate`: add a `>0` threshold in query.

`Troubleshooting`:
1. Remove unused `Restarts` panel;
2. `Logging rate`: add a `>0` threshold in query;
3. Add `Requests backend error rate` to show underlying backend errors
in addition to request errors.

I don’t see a specific change that needs to be mentioned in the
changelog.
2026-04-27 20:20:40 +03:00
Roman Khavronenko
20928171a8 docs/playgrounds: mention iximiuz playgrounds (#10878)
Iximiuz labs prepared a set of playgrounds for VictoriaMetrics. These
are interactive playgrounds backed by real Linux machines running
VictoriaMetrics software, allowing experimenting and investigating right
in the browser tab.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-04-27 19:58:04 +03:00
Zakhar Bessarab
ff79527c7f docs/playgrounds: add links to SSO playground (#10877)
Added info about Grafana SSO playground to playgrounds docs.

---------

Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-27 19:43:19 +03:00
Max Kotliar
492419c2e8 docs: update flags with actual v1.141.0 binaries
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-27 14:38:27 +03:00
Max Kotliar
f42c56fc48 docs: bump version to v1.141.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-27 14:36:07 +03:00
Max Kotliar
684f96759f deplyoment/docker: bump version to v1.141.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-27 14:29:48 +03:00
Max Kotliar
5c3dc0f429 docs: forward port LTS v1.122.21 changelog to upstream
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-27 13:57:54 +03:00
Max Kotliar
ca5bc3a4c4 docs: forward port LTS v1.136.6 changelog to upstream
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-27 13:57:03 +03:00
Max Kotliar
2336c7e72f docs/changelog: fix upgrade alpine version
follow-up for
49a8dd4da6
2026-04-24 21:38:04 +03:00
Max Kotliar
b803a46e7f docs/changelog: cut release v1.141.0
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-24 19:41:27 +03:00
Max Kotliar
0e845e234f app/vmselect: run make vmui-update
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-24 19:37:52 +03:00
Max Kotliar
49a8dd4da6 deployment/docker: update base Alpine Docker image from 3.23.2 to 3.23.3
See
https://www.alpinelinux.org/posts/Alpine-3.20.10-3.21.7-3.22.4-3.23.4-released.html
2026-04-24 18:22:01 +03:00
Max Kotliar
2609a53e41 docs/changelog: chore 2026-04-24 15:59:34 +03:00
Nikolay
1ca4b3ba3c app/vmagent: properly attach tenant information to metadata (#10865)
Previously, vmagent ignored tenant ID information obtained from
`__tenant_id__` label for metrics metadata. It made it impossible to route
metrics metadata to the `/multitenant` endpoints. This commit adds tenant ID to the metrics metadata.

It also fixes VMagent multitenant ingestion endpoints. Previously, the tenant info defined there was not properly set to metadata. 

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10828
PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10865

---------

Signed-off-by: Nikolay <nik@victoriametrics.com>
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-24 14:36:35 +03:00
Hui Wang
66b9890025 dashboards: add metadata ingestion row rate queries to vmagent&vmcluster dashboards (#10868)
Metadata is enabled by default since v1.137.0, and the metadata volume
can be a big contributor to resource usage and network traffic.

vmagent dahsboard:
1. `Troubleshooting` section: rename `Datapoints rate` panel to `Rows
rate` to include metadata rate;
2. `Ingestion` section: add metadata rate to existing `Rows rate` panel.
(The difference between this panel and the one above is that this panel
only contains data from write requests, while the above panel also
includes the scraping part.)


vmcluster dashboard:
1. `vminsert` section: add `Rows rate` panel

Didn’t see a good place for it in the vmsingle dashboard, since it
doesn’t have a dedicated insert section, and I don’t want to add it to
`overview` yet.

https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10868
2026-04-24 14:07:31 +03:00
Yury Moladau
2e7591d567 app/vmui: improve series color visibility (#10872)
### Describe Your Changes

Improve generated series colors to increase visibility and consistency
across light and dark themes.

Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10869
PR: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10872

| Before | After |
|---|---|
| <img width="758" height="469" alt="image"
src="https://github.com/user-attachments/assets/dfe879fc-c1ff-4128-923b-24dd0b829421"
/> | <img width="758" height="469" alt="image"
src="https://github.com/user-attachments/assets/7ea6f618-2d6d-43b6-b881-9525a2897ef6"
/> |
| <img width="758" height="469" alt="image"
src="https://github.com/user-attachments/assets/ab07e223-5ab5-43dc-8c3f-7ab28d4ab2b6"
/> | <img width="758" height="469" alt="image"
src="https://github.com/user-attachments/assets/988d19b6-ca16-4ca6-af8a-e043cfb066d3"
/> |

---------

Signed-off-by: Yury Molodov <yurymolodov@gmail.com>
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2026-04-24 13:36:26 +03:00
hagen1778
ca8d9d21a9 docs: mention accuracy issues for histogram aggregation
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-04-24 10:31:17 +02:00
hagen1778
0653b7c7b8 docs: mention histogram aggregation link
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-04-24 10:23:24 +02:00
Roman Khavronenko
569197d038 docs: update stream aggregation docs (#10871)
* add visual mermaid diagram to demonstrate aggregation concept;
* update Recording-rules-alternative:
* * recommend using rate_sum instead of total for better reliability
* * demonstrate how to calculate sliding window, typicall for recording
rules

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Pablo Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-24 10:20:29 +02:00
Max Kotliar
5f357e6a94 docs/changelog: chore update notes
force evey update note to be on a new line
2026-04-23 20:36:42 +03:00
Artem Fetishev
c317e95ab8 lib/storage: support samples with future timestamps (#10718)
Add the support of storage and retrieval of samples with future
timestamps as requested in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/827

What to expect:

- By default, the max future timestamp is still limited to `now+2d`. To
change it, set the `-futureRetention` flag in `vmstorage`. The max flag
value is currently limited to `100y`. It can be extended if we see a
demand for this, but it can't be more than `~ 290y` due to how the time
duration is implemented in Go. The flag value can't be less than `2d`.
- downsampling and retention filters (available in enterprise edition)
are currently not supported for future timestamps
- If `vmstorage` restarts with a smaller value of `-futureRetention`
flag, any future partitions that are outside the new future retention
will be automatically deleted.
- Data ingestion, data retrieval, backup/restore, timeseries (soft)
deletion, and other operations work with future timestamps the same way
as with the historical timestamps.
- In the cluster version, the affected binaries are `vmstorage` and
`vmselect`. This means that `vmselect` version must match `vmstorage`
version if you want to query future timestamps. `vminsert` was not
affected, so its version can be a lower one.
- If you downgrade the `vmstorage`, the data with future timestamps will
remain on disk and memory (per-partition caches) but won't be available
for querying.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Signed-off-by: Artem Fetishev <149964189+rtm0@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-04-23 18:12:33 +02:00
Artem Fetishev
a875597b09 lib/timeutil: ensure parsed time is in allowed range (#10870)
Update `timeutil.ParseTimeAt` to check the time limits for all date/time formats, not just year.

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
2026-04-23 17:37:15 +02:00
Max Kotliar
3062f4355d docs: forward port LTS v1.122.20 changelog to upstream
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-23 17:38:00 +03:00
Max Kotliar
aa206acd6f docs: forward port LTS v1.136.5 changelog to upstream
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
2026-04-23 17:32:25 +03:00
Nikolay
9a74f71a5f app/vmauth: properly start backend healths
Previously, backend url health check start could produce a data race
and a race condition.

 The following panic could be produced:
`panic: sync: WaitGroup is reused before previous Wait has returned`

 It happened because concurrent goroutine could process request, while
 configuration was reloaded and stopHealthChecks method was called.

 This commit adds a dedicated structure for backend health checks.
Which protects from data race with mutex guard. And prevents race
condition with a boolean flag.

Fixes: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10806
2026-04-23 11:05:06 +02:00
Roman Khavronenko
1dcf0f6826 github: update PR template
Visually outline that guideline message should be removed from
description before submitting the PR. This should prevent cases when PR
template was blending into the PRs description remaining unnoticed.
2026-04-23 11:04:29 +02:00
Max Kotliar
727abb0b57 go.mod: update metricsql to version that fixes bug in binary op evaluation ordering
The commit in metricsql
d0bc93816e
introduced a bug that changes an order of binary op evaluation. This
commit updates to metricsql version that fixes a bug by reverting to
previous behavior.

The bug was introduced in v1.140.0, v1.136.4, and v1.122.19 releases.

It was reported in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10856
2026-04-22 20:40:00 +03:00
cubic-dev-ai[bot]
2c262c5ef6 app/vmctl: return errors instead of silently skipping unexpected OpenTSDB responses
Previously 
- `GetData` in the OpenTSDB client was returning empty `Metric{}` with
`nil` error for several conditions (multiple series returned, aggregate
tags present, `modifyData` failures), causing `vmctl opentsdb` to
silently drop series during migration

 This commit changes these silent return paths to return proper errors with
descriptive messages including the query string, so operators can detect
and diagnose partial migrations.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10797
2026-04-22 11:28:55 +02:00
andriibeee
a3df0f890b lib/cgroup: support reading cpu/memory limits from systemd slices
cgroup v2 version supports slices ( aka path hierarchy) for resource limits. It's mostly supported by systemd
and container runtime build on top of it.

 This commit reads subpath for systemd slices and traverse it with reading minimal limit value.

Related docs:
https://docs.oracle.com/en/operating-systems/oracle-linux/9/systemd/SystemdMngCgroupsV2.html#SlicesServicesScopesHierarchy
https://www.freedesktop.org/software/systemd/man/latest/systemd.slice.html

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10635
2026-04-22 10:18:03 +02:00
Max Kotliar
0785d16711 docs/vmauth: add example for using TLS on public addr but keeping internal non-TLS (#10858)
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10793
2026-04-22 10:06:42 +02:00
Hui Wang
dc94aa9339 app/vmalert: properly remove empty labels value
Previously, if rule label value was set to empty string, vmalert ignored this label during labels merge with labels from data source response. In contrast, Prometheus removes data source label in this case as well. Which allows to perform label delete operation.

 This commit uses the same logic as Prometheus for resolving labels conflicts and allows to remove labels.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10766
2026-04-22 09:53:25 +02:00
Max Kotliar
032f70e262 docs/changelog: add update note about bug in metricsql
Follow up to
7029283f7d
for LTS releases
2026-04-21 20:19:56 +03:00
Max Kotliar
7029283f7d docs/changelog: add update note about bug in metricsql
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10856

Bug introduced in https://github.com/VictoriaMetrics/metricsql/pull/63
via commit
08dd38d4a0
2026-04-21 20:15:47 +03:00
Fred Navruzov
6c1534c7b1 docs/vmanomaly: update visual assets and formulations (#10859)
Update vmanomaly visual assets and improve clarification on allowed
datasources
2026-04-21 19:46:44 +03:00
Roman Khavronenko
0c05b0b15b apptest: restore helper for default tenant
Helper `getTenant` was removed in
e0e01e46f0 assuming that new change
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10782 will
tolerate missing tenantID in the path.

While that change is still not merged - restoring the helper for tests
to remain functional.
2026-04-21 11:03:06 +02:00
Alexander Frolov
a2b1d1eb62 app/vminsert: account storageNodesBucket count in per-node buffer size
Follow-up for ceda0407fb which added a regression, which could
double vminsert memory usage.

 This commit takes in account a second buffer per storageNode.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10725#issuecomment-4282256709
2026-04-20 21:28:14 +02:00
dependabot[bot]
e3cd3329d6 build(deps): bump github/codeql-action from 4 to 4.35.1 (#10844)
Bumps [github/codeql-action](https://github.com/github/codeql-action)
from 4 to 4.35.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/github/codeql-action/releases">github/codeql-action's
releases</a>.</em></p>
<blockquote>
<h2>v4.35.1</h2>
<ul>
<li>Fix incorrect minimum required Git version for <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a>: it should have been 2.36.0, not 2.11.0. <a
href="https://redirect.github.com/github/codeql-action/pull/3781">#3781</a></li>
</ul>
<h2>v4.35.0</h2>
<ul>
<li>Reduced the minimum Git version required for <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a> from 2.38.0 to 2.11.0. <a
href="https://redirect.github.com/github/codeql-action/pull/3767">#3767</a></li>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.25.1">2.25.1</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3773">#3773</a></li>
</ul>
<h2>v4.34.1</h2>
<ul>
<li>Downgrade default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.3">2.24.3</a>
due to issues with a small percentage of Actions and JavaScript
analyses. <a
href="https://redirect.github.com/github/codeql-action/pull/3762">#3762</a></li>
</ul>
<h2>v4.34.0</h2>
<ul>
<li>Added an experimental change which disables TRAP caching when <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a> is enabled, since improved incremental analysis
supersedes TRAP caching. This will improve performance and reduce
Actions cache usage. We expect to roll this change out to everyone in
March. <a
href="https://redirect.github.com/github/codeql-action/pull/3569">#3569</a></li>
<li>We are rolling out improved incremental analysis to C/C++ analyses
that use build mode <code>none</code>. We expect this rollout to be
complete by the end of April 2026. <a
href="https://redirect.github.com/github/codeql-action/pull/3584">#3584</a></li>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.25.0">2.25.0</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3585">#3585</a></li>
</ul>
<h2>v4.33.0</h2>
<ul>
<li>
<p>Upcoming change: Starting April 2026, the CodeQL Action will skip
collecting file coverage information on pull requests to improve
analysis performance. File coverage information will still be computed
on non-PR analyses. Pull request analyses will log a warning about this
upcoming change. <a
href="https://redirect.github.com/github/codeql-action/pull/3562">#3562</a></p>
<p>To opt out of this change:</p>
<ul>
<li><strong>Repositories owned by an organization:</strong> Create a
custom repository property with the name
<code>github-codeql-file-coverage-on-prs</code> and the type
&quot;True/false&quot;, then set this property to <code>true</code> in
the repository's settings. For more information, see <a
href="https://docs.github.com/en/organizations/managing-organization-settings/managing-custom-properties-for-repositories-in-your-organization">Managing
custom properties for repositories in your organization</a>.
Alternatively, if you are using an advanced setup workflow, you can set
the <code>CODEQL_ACTION_FILE_COVERAGE_ON_PRS</code> environment variable
to <code>true</code> in your workflow.</li>
<li><strong>User-owned repositories using default setup:</strong> Switch
to an advanced setup workflow and set the
<code>CODEQL_ACTION_FILE_COVERAGE_ON_PRS</code> environment variable to
<code>true</code> in your workflow.</li>
<li><strong>User-owned repositories using advanced setup:</strong> Set
the <code>CODEQL_ACTION_FILE_COVERAGE_ON_PRS</code> environment variable
to <code>true</code> in your workflow.</li>
</ul>
</li>
<li>
<p>Fixed <a
href="https://redirect.github.com/github/codeql-action/issues/3555">a
bug</a> which caused the CodeQL Action to fail loading repository
properties if a &quot;Multi select&quot; repository property was
configured for the repository. <a
href="https://redirect.github.com/github/codeql-action/pull/3557">#3557</a></p>
</li>
<li>
<p>The CodeQL Action now loads <a
href="https://docs.github.com/en/organizations/managing-organization-settings/managing-custom-properties-for-repositories-in-your-organization">custom
repository properties</a> on GitHub Enterprise Server, enabling the
customization of features such as
<code>github-codeql-disable-overlay</code> that was previously only
available on GitHub.com. <a
href="https://redirect.github.com/github/codeql-action/pull/3559">#3559</a></p>
</li>
<li>
<p>Once <a
href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private
package registries</a> can be configured with OIDC-based authentication
for organizations, the CodeQL Action will now be able to accept such
configurations. <a
href="https://redirect.github.com/github/codeql-action/pull/3563">#3563</a></p>
</li>
<li>
<p>Fixed the retry mechanism for database uploads. Previously this would
fail with the error &quot;Response body object should not be disturbed
or locked&quot;. <a
href="https://redirect.github.com/github/codeql-action/pull/3564">#3564</a></p>
</li>
<li>
<p>A warning is now emitted if the CodeQL Action detects a repository
property whose name suggests that it relates to the CodeQL Action, but
which is not one of the properties recognised by the current version of
the CodeQL Action. <a
href="https://redirect.github.com/github/codeql-action/pull/3570">#3570</a></p>
</li>
</ul>
<h2>v4.32.6</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.3">2.24.3</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3548">#3548</a></li>
</ul>
<h2>v4.32.5</h2>
<ul>
<li>Repositories owned by an organization can now set up the
<code>github-codeql-disable-overlay</code> custom repository property to
disable <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis for CodeQL</a>. First, create a custom repository
property with the name <code>github-codeql-disable-overlay</code> and
the type &quot;True/false&quot; in the organization's settings. Then in
the repository's settings, set this property to <code>true</code> to
disable improved incremental analysis. For more information, see <a
href="https://docs.github.com/en/organizations/managing-organization-settings/managing-custom-properties-for-repositories-in-your-organization">Managing
custom properties for repositories in your organization</a>. This
feature is not yet available on GitHub Enterprise Server. <a
href="https://redirect.github.com/github/codeql-action/pull/3507">#3507</a></li>
<li>Added an experimental change so that when <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a> fails on a runner — potentially due to
insufficient disk space — the failure is recorded in the Actions cache
so that subsequent runs will automatically skip improved incremental
analysis until something changes (e.g. a larger runner is provisioned or
a new CodeQL version is released). We expect to roll this change out to
everyone in March. <a
href="https://redirect.github.com/github/codeql-action/pull/3487">#3487</a></li>
<li>The minimum memory check for improved incremental analysis is now
skipped for CodeQL 2.24.3 and later, which has reduced peak RAM usage.
<a
href="https://redirect.github.com/github/codeql-action/pull/3515">#3515</a></li>
<li>Reduced log levels for best-effort private package registry
connection check failures to reduce noise from workflow annotations. <a
href="https://redirect.github.com/github/codeql-action/pull/3516">#3516</a></li>
<li>Added an experimental change which lowers the minimum disk space
requirement for <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a>, enabling it to run on standard GitHub Actions
runners. We expect to roll this change out to everyone in March. <a
href="https://redirect.github.com/github/codeql-action/pull/3498">#3498</a></li>
<li>Added an experimental change which allows the
<code>start-proxy</code> action to resolve the CodeQL CLI version from
feature flags instead of using the linked CLI bundle version. We expect
to roll this change out to everyone in March. <a
href="https://redirect.github.com/github/codeql-action/pull/3512">#3512</a></li>
<li>The previously experimental changes from versions 4.32.3, 4.32.4,
3.32.3 and 3.32.4 are now enabled by default. <a
href="https://redirect.github.com/github/codeql-action/pull/3503">#3503</a>,
<a
href="https://redirect.github.com/github/codeql-action/pull/3504">#3504</a></li>
</ul>
<h2>v4.32.4</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.2">2.24.2</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3493">#3493</a></li>
<li>Added an experimental change which improves how certificates are
generated for the authentication proxy that is used by the CodeQL Action
in Default Setup when <a
href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private
package registries are configured</a>. This is expected to generate more
widely compatible certificates and should have no impact on analyses
which are working correctly already. We expect to roll this change out
to everyone in February. <a
href="https://redirect.github.com/github/codeql-action/pull/3473">#3473</a></li>
<li>When the CodeQL Action is run <a
href="https://docs.github.com/en/code-security/how-tos/scan-code-for-vulnerabilities/troubleshooting/troubleshooting-analysis-errors/logs-not-detailed-enough#creating-codeql-debugging-artifacts-for-codeql-default-setup">with
debugging enabled in Default Setup</a> and <a
href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private
package registries are configured</a>, the &quot;Setup proxy for
registries&quot; step will output additional diagnostic information that
can be used for troubleshooting. <a
href="https://redirect.github.com/github/codeql-action/pull/3486">#3486</a></li>
<li>Added a setting which allows the CodeQL Action to enable network
debugging for Java programs. This will help GitHub staff support
customers with troubleshooting issues in GitHub-managed CodeQL
workflows, such as Default Setup. This setting can only be enabled by
GitHub staff. <a
href="https://redirect.github.com/github/codeql-action/pull/3485">#3485</a></li>
<li>Added a setting which enables GitHub-managed workflows, such as
Default Setup, to use a <a
href="https://github.com/dsp-testing/codeql-cli-nightlies">nightly
CodeQL CLI release</a> instead of the latest, stable release that is
used by default. This will help GitHub staff support customers whose
analyses for a given repository or organization require early access to
a change in an upcoming CodeQL CLI release. This setting can only be
enabled by GitHub staff. <a
href="https://redirect.github.com/github/codeql-action/pull/3484">#3484</a></li>
</ul>
<h2>v4.32.3</h2>
<ul>
<li>Added experimental support for testing connections to <a
href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private
package registries</a>. This feature is not currently enabled for any
analysis. In the future, it may be enabled by default for Default Setup.
<a
href="https://redirect.github.com/github/codeql-action/pull/3466">#3466</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/github/codeql-action/blob/main/CHANGELOG.md">github/codeql-action's
changelog</a>.</em></p>
<blockquote>
<h2>4.35.1 - 27 Mar 2026</h2>
<ul>
<li>Fix incorrect minimum required Git version for <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a>: it should have been 2.36.0, not 2.11.0. <a
href="https://redirect.github.com/github/codeql-action/pull/3781">#3781</a></li>
</ul>
<h2>4.35.0 - 27 Mar 2026</h2>
<ul>
<li>Reduced the minimum Git version required for <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a> from 2.38.0 to 2.11.0. <a
href="https://redirect.github.com/github/codeql-action/pull/3767">#3767</a></li>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.25.1">2.25.1</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3773">#3773</a></li>
</ul>
<h2>4.34.1 - 20 Mar 2026</h2>
<ul>
<li>Downgrade default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.3">2.24.3</a>
due to issues with a small percentage of Actions and JavaScript
analyses. <a
href="https://redirect.github.com/github/codeql-action/pull/3762">#3762</a></li>
</ul>
<h2>4.34.0 - 20 Mar 2026</h2>
<ul>
<li>Added an experimental change which disables TRAP caching when <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a> is enabled, since improved incremental analysis
supersedes TRAP caching. This will improve performance and reduce
Actions cache usage. We expect to roll this change out to everyone in
March. <a
href="https://redirect.github.com/github/codeql-action/pull/3569">#3569</a></li>
<li>We are rolling out improved incremental analysis to C/C++ analyses
that use build mode <code>none</code>. We expect this rollout to be
complete by the end of April 2026. <a
href="https://redirect.github.com/github/codeql-action/pull/3584">#3584</a></li>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.25.0">2.25.0</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3585">#3585</a></li>
</ul>
<h2>4.33.0 - 16 Mar 2026</h2>
<ul>
<li>
<p>Upcoming change: Starting April 2026, the CodeQL Action will skip
collecting file coverage information on pull requests to improve
analysis performance. File coverage information will still be computed
on non-PR analyses. Pull request analyses will log a warning about this
upcoming change. <a
href="https://redirect.github.com/github/codeql-action/pull/3562">#3562</a></p>
<p>To opt out of this change:</p>
<ul>
<li><strong>Repositories owned by an organization:</strong> Create a
custom repository property with the name
<code>github-codeql-file-coverage-on-prs</code> and the type
&quot;True/false&quot;, then set this property to <code>true</code> in
the repository's settings. For more information, see <a
href="https://docs.github.com/en/organizations/managing-organization-settings/managing-custom-properties-for-repositories-in-your-organization">Managing
custom properties for repositories in your organization</a>.
Alternatively, if you are using an advanced setup workflow, you can set
the <code>CODEQL_ACTION_FILE_COVERAGE_ON_PRS</code> environment variable
to <code>true</code> in your workflow.</li>
<li><strong>User-owned repositories using default setup:</strong> Switch
to an advanced setup workflow and set the
<code>CODEQL_ACTION_FILE_COVERAGE_ON_PRS</code> environment variable to
<code>true</code> in your workflow.</li>
<li><strong>User-owned repositories using advanced setup:</strong> Set
the <code>CODEQL_ACTION_FILE_COVERAGE_ON_PRS</code> environment variable
to <code>true</code> in your workflow.</li>
</ul>
</li>
<li>
<p>Fixed <a
href="https://redirect.github.com/github/codeql-action/issues/3555">a
bug</a> which caused the CodeQL Action to fail loading repository
properties if a &quot;Multi select&quot; repository property was
configured for the repository. <a
href="https://redirect.github.com/github/codeql-action/pull/3557">#3557</a></p>
</li>
<li>
<p>The CodeQL Action now loads <a
href="https://docs.github.com/en/organizations/managing-organization-settings/managing-custom-properties-for-repositories-in-your-organization">custom
repository properties</a> on GitHub Enterprise Server, enabling the
customization of features such as
<code>github-codeql-disable-overlay</code> that was previously only
available on GitHub.com. <a
href="https://redirect.github.com/github/codeql-action/pull/3559">#3559</a></p>
</li>
<li>
<p>Once <a
href="https://docs.github.com/en/code-security/how-tos/secure-at-scale/configure-organization-security/manage-usage-and-access/giving-org-access-private-registries">private
package registries</a> can be configured with OIDC-based authentication
for organizations, the CodeQL Action will now be able to accept such
configurations. <a
href="https://redirect.github.com/github/codeql-action/pull/3563">#3563</a></p>
</li>
<li>
<p>Fixed the retry mechanism for database uploads. Previously this would
fail with the error &quot;Response body object should not be disturbed
or locked&quot;. <a
href="https://redirect.github.com/github/codeql-action/pull/3564">#3564</a></p>
</li>
<li>
<p>A warning is now emitted if the CodeQL Action detects a repository
property whose name suggests that it relates to the CodeQL Action, but
which is not one of the properties recognised by the current version of
the CodeQL Action. <a
href="https://redirect.github.com/github/codeql-action/pull/3570">#3570</a></p>
</li>
</ul>
<h2>4.32.6 - 05 Mar 2026</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.3">2.24.3</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3548">#3548</a></li>
</ul>
<h2>4.32.5 - 02 Mar 2026</h2>
<ul>
<li>Repositories owned by an organization can now set up the
<code>github-codeql-disable-overlay</code> custom repository property to
disable <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis for CodeQL</a>. First, create a custom repository
property with the name <code>github-codeql-disable-overlay</code> and
the type &quot;True/false&quot; in the organization's settings. Then in
the repository's settings, set this property to <code>true</code> to
disable improved incremental analysis. For more information, see <a
href="https://docs.github.com/en/organizations/managing-organization-settings/managing-custom-properties-for-repositories-in-your-organization">Managing
custom properties for repositories in your organization</a>. This
feature is not yet available on GitHub Enterprise Server. <a
href="https://redirect.github.com/github/codeql-action/pull/3507">#3507</a></li>
<li>Added an experimental change so that when <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a> fails on a runner — potentially due to
insufficient disk space — the failure is recorded in the Actions cache
so that subsequent runs will automatically skip improved incremental
analysis until something changes (e.g. a larger runner is provisioned or
a new CodeQL version is released). We expect to roll this change out to
everyone in March. <a
href="https://redirect.github.com/github/codeql-action/pull/3487">#3487</a></li>
<li>The minimum memory check for improved incremental analysis is now
skipped for CodeQL 2.24.3 and later, which has reduced peak RAM usage.
<a
href="https://redirect.github.com/github/codeql-action/pull/3515">#3515</a></li>
<li>Reduced log levels for best-effort private package registry
connection check failures to reduce noise from workflow annotations. <a
href="https://redirect.github.com/github/codeql-action/pull/3516">#3516</a></li>
<li>Added an experimental change which lowers the minimum disk space
requirement for <a
href="https://redirect.github.com/github/roadmap/issues/1158">improved
incremental analysis</a>, enabling it to run on standard GitHub Actions
runners. We expect to roll this change out to everyone in March. <a
href="https://redirect.github.com/github/codeql-action/pull/3498">#3498</a></li>
<li>Added an experimental change which allows the
<code>start-proxy</code> action to resolve the CodeQL CLI version from
feature flags instead of using the linked CLI bundle version. We expect
to roll this change out to everyone in March. <a
href="https://redirect.github.com/github/codeql-action/pull/3512">#3512</a></li>
<li>The previously experimental changes from versions 4.32.3, 4.32.4,
3.32.3 and 3.32.4 are now enabled by default. <a
href="https://redirect.github.com/github/codeql-action/pull/3503">#3503</a>,
<a
href="https://redirect.github.com/github/codeql-action/pull/3504">#3504</a></li>
</ul>
<h2>4.32.4 - 20 Feb 2026</h2>
<ul>
<li>Update default CodeQL bundle version to <a
href="https://github.com/github/codeql-action/releases/tag/codeql-bundle-v2.24.2">2.24.2</a>.
<a
href="https://redirect.github.com/github/codeql-action/pull/3493">#3493</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="c10b8064de"><code>c10b806</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3782">#3782</a>
from github/update-v4.35.1-d6d1743b8</li>
<li><a
href="c5ffd06837"><code>c5ffd06</code></a>
Update changelog for v4.35.1</li>
<li><a
href="d6d1743b8e"><code>d6d1743</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3781">#3781</a>
from github/henrymercer/update-git-minimum-version</li>
<li><a
href="65d2efa733"><code>65d2efa</code></a>
Add changelog note</li>
<li><a
href="2437b20ab3"><code>2437b20</code></a>
Update minimum git version for overlay to 2.36.0</li>
<li><a
href="ea5f71947c"><code>ea5f719</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3775">#3775</a>
from github/dependabot/npm_and_yarn/node-forge-1.4.0</li>
<li><a
href="45ceeea896"><code>45ceeea</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3777">#3777</a>
from github/mergeback/v4.35.0-to-main-b8bb9f28</li>
<li><a
href="24448c9843"><code>24448c9</code></a>
Rebuild</li>
<li><a
href="7c51060631"><code>7c51060</code></a>
Update changelog and version after v4.35.0</li>
<li><a
href="b8bb9f28b8"><code>b8bb9f2</code></a>
Merge pull request <a
href="https://redirect.github.com/github/codeql-action/issues/3776">#3776</a>
from github/update-v4.35.0-0078ad667</li>
<li>Additional commits viewable in <a
href="https://github.com/github/codeql-action/compare/v4...v4.35.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github/codeql-action&package-manager=github_actions&previous-version=4&new-version=4.35.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-20 15:55:02 +03:00
dependabot[bot]
6c57246940 build(deps): bump actions/cache from 4 to 5.0.4 (#10802)
Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5.0.4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/cache/releases">actions/cache's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.4</h2>
<h2>What's Changed</h2>
<ul>
<li>Add release instructions and update maintainer docs by <a
href="https://github.com/Link"><code>@​Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1696">actions/cache#1696</a></li>
<li>Potential fix for code scanning alert no. 52: Workflow does not
contain permissions by <a
href="https://github.com/Link"><code>@​Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1697">actions/cache#1697</a></li>
<li>Fix workflow permissions and cleanup workflow names / formatting by
<a href="https://github.com/Link"><code>@​Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1699">actions/cache#1699</a></li>
<li>docs: Update examples to use the latest version by <a
href="https://github.com/XZTDean"><code>@​XZTDean</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1690">actions/cache#1690</a></li>
<li>Fix proxy integration tests by <a
href="https://github.com/Link"><code>@​Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1701">actions/cache#1701</a></li>
<li>Fix cache key in examples.md for bun.lock by <a
href="https://github.com/RyPeck"><code>@​RyPeck</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1722">actions/cache#1722</a></li>
<li>Update dependencies &amp; patch security vulnerabilities by <a
href="https://github.com/Link"><code>@​Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1738">actions/cache#1738</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/XZTDean"><code>@​XZTDean</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/cache/pull/1690">actions/cache#1690</a></li>
<li><a href="https://github.com/RyPeck"><code>@​RyPeck</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/cache/pull/1722">actions/cache#1722</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/cache/compare/v5...v5.0.4">https://github.com/actions/cache/compare/v5...v5.0.4</a></p>
<h2>v5.0.3</h2>
<h2>What's Changed</h2>
<ul>
<li>Bump <code>@actions/cache</code> to v5.0.5 (Resolves: <a
href="https://github.com/actions/cache/security/dependabot/33">https://github.com/actions/cache/security/dependabot/33</a>)</li>
<li>Bump <code>@actions/core</code> to v2.0.3</li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/cache/compare/v5...v5.0.3">https://github.com/actions/cache/compare/v5...v5.0.3</a></p>
<h2>v.5.0.2</h2>
<h1>v5.0.2</h1>
<h2>What's Changed</h2>
<p>When creating cache entries, 429s returned from the cache service
will not be retried.</p>
<h2>v5.0.1</h2>
<blockquote>
<p>[!IMPORTANT]
<strong><code>actions/cache@v5</code> runs on the Node.js 24 runtime and
requires a minimum Actions Runner version of
<code>2.327.1</code>.</strong></p>
<p>If you are using self-hosted runners, ensure they are updated before
upgrading.</p>
</blockquote>
<hr />
<h1>v5.0.1</h1>
<h2>What's Changed</h2>
<ul>
<li>fix: update <code>@​actions/cache</code> for Node.js 24 punycode
deprecation by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1685">actions/cache#1685</a></li>
<li>prepare release v5.0.1 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1686">actions/cache#1686</a></li>
</ul>
<h1>v5.0.0</h1>
<h2>What's Changed</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/cache/blob/main/RELEASES.md">actions/cache's
changelog</a>.</em></p>
<blockquote>
<h1>Releases</h1>
<h2>How to prepare a release</h2>
<blockquote>
<p>[!NOTE]<br />
Relevant for maintainers with write access only.</p>
</blockquote>
<ol>
<li>Switch to a new branch from <code>main</code>.</li>
<li>Run <code>npm test</code> to ensure all tests are passing.</li>
<li>Update the version in <a
href="https://github.com/actions/cache/blob/main/package.json"><code>https://github.com/actions/cache/blob/main/package.json</code></a>.</li>
<li>Run <code>npm run build</code> to update the compiled files.</li>
<li>Update this <a
href="https://github.com/actions/cache/blob/main/RELEASES.md"><code>https://github.com/actions/cache/blob/main/RELEASES.md</code></a>
with the new version and changes in the <code>## Changelog</code>
section.</li>
<li>Run <code>licensed cache</code> to update the license report.</li>
<li>Run <code>licensed status</code> and resolve any warnings by
updating the <a
href="https://github.com/actions/cache/blob/main/.licensed.yml"><code>https://github.com/actions/cache/blob/main/.licensed.yml</code></a>
file with the exceptions.</li>
<li>Commit your changes and push your branch upstream.</li>
<li>Open a pull request against <code>main</code> and get it reviewed
and merged.</li>
<li>Draft a new release <a
href="https://github.com/actions/cache/releases">https://github.com/actions/cache/releases</a>
use the same version number used in <code>package.json</code>
<ol>
<li>Create a new tag with the version number.</li>
<li>Auto generate release notes and update them to match the changes you
made in <code>RELEASES.md</code>.</li>
<li>Toggle the set as the latest release option.</li>
<li>Publish the release.</li>
</ol>
</li>
<li>Navigate to <a
href="https://github.com/actions/cache/actions/workflows/release-new-action-version.yml">https://github.com/actions/cache/actions/workflows/release-new-action-version.yml</a>
<ol>
<li>There should be a workflow run queued with the same version
number.</li>
<li>Approve the run to publish the new version and update the major tags
for this action.</li>
</ol>
</li>
</ol>
<h2>Changelog</h2>
<h3>5.0.4</h3>
<ul>
<li>Bump <code>minimatch</code> to v3.1.5 (fixes ReDoS via globstar
patterns)</li>
<li>Bump <code>undici</code> to v6.24.1 (WebSocket decompression bomb
protection, header validation fixes)</li>
<li>Bump <code>fast-xml-parser</code> to v5.5.6</li>
</ul>
<h3>5.0.3</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v5.0.5 (Resolves: <a
href="https://github.com/actions/cache/security/dependabot/33">https://github.com/actions/cache/security/dependabot/33</a>)</li>
<li>Bump <code>@actions/core</code> to v2.0.3</li>
</ul>
<h3>5.0.2</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v5.0.3 <a
href="https://redirect.github.com/actions/cache/pull/1692">#1692</a></li>
</ul>
<h3>5.0.1</h3>
<ul>
<li>Update <code>@azure/storage-blob</code> to <code>^12.29.1</code> via
<code>@actions/cache@5.0.1</code> <a
href="https://redirect.github.com/actions/cache/pull/1685">#1685</a></li>
</ul>
<h3>5.0.0</h3>
<blockquote>
<p>[!IMPORTANT]
<code>actions/cache@v5</code> runs on the Node.js 24 runtime and
requires a minimum Actions Runner version of <code>2.327.1</code>.</p>
</blockquote>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="27d5ce7f10"><code>27d5ce7</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1747">#1747</a>
from actions/yacaovsnc/update-dependency</li>
<li><a
href="f280785d7b"><code>f280785</code></a>
licensed changes</li>
<li><a
href="619aeb1606"><code>619aeb1</code></a>
npm run build generated dist files</li>
<li><a
href="bcf16c2893"><code>bcf16c2</code></a>
Update ts-http-runtime to 0.3.5</li>
<li><a
href="668228422a"><code>6682284</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1738">#1738</a>
from actions/prepare-v5.0.4</li>
<li><a
href="e34039626f"><code>e340396</code></a>
Update RELEASES</li>
<li><a
href="8a67110529"><code>8a67110</code></a>
Add licenses</li>
<li><a
href="1865903e1b"><code>1865903</code></a>
Update dependencies &amp; patch security vulnerabilities</li>
<li><a
href="5656298164"><code>5656298</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1722">#1722</a>
from RyPeck/patch-1</li>
<li><a
href="4e380d19e1"><code>4e380d1</code></a>
Fix cache key in examples.md for bun.lock</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/cache/compare/v4...v5">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/cache&package-manager=github_actions&previous-version=4&new-version=5.0.4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-20 15:53:09 +03:00
andriibeee
05112e54e2 lib/netutil: fix IPv6 address corruption in proxy protocol v2 parser
Proxy protocol parser kept sub-slice reference for pooled bytesBuffer at readProxyProto
```
 bb := bbPool.Get()
 defer bbPool.Put(bb)   // ← buffer returned to pool AFTER function returns
...
   IP:   bb.B[0:16],  // ← BUG: sub-slice of pooled buffer!
...
 ```

 This commit properly allocates new slice for ipv6 address and copies buffer content to it.

 Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10839
2026-04-20 12:11:04 +02:00
Andrii Chubatiuk
ce227fe7d9 lib/streamaggr: added vm_streamaggr_counter_resets_total counter (#10807)
### Describe Your Changes

Added `vm_streamaggr_counter_resets` metric for `rate*`, `total*`, and
`increase*` outputs, which is useful for unpredictable output behaviour
investigation.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: Roman Khavronenko <hagen1778@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2026-04-20 11:48:03 +02:00
hagen1778
e4524eb2fb deployment/alerts: move IndexDBRecordsDrop and TooManyTSIDMisses rules to storage-related files
`IndexDBRecordsDrop` and `TooManyTSIDMisses` were mistakenly placed to `alerts-health.yml`,
which was supposed to contain rules related to all VM components. But these two rules
are related to storage components only (vmstorage and vmsingle). Moving them to corresponding
files.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-04-20 11:43:21 +02:00
hagen1778
b9ba5dacc3 deployment/alerts: rename alerts.yml to alerts-single-node.yml
The change should reduce confusion for users where `alerts.yml`
belongs to. Before, developers could mistakenly assume that
`alerts.yml` was related to both single and cluster installations.
In result, rule `MetadataCacheUtilizationIsTooHigh` was added only
to `alerts.yml` and not copied to `alerts-cluster.yml`.

The rename change should bring more context into the file name
and reduce confusion in the future.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-04-20 11:37:30 +02:00
hagen1778
1a8fe4f2f8 deployment/alerts: add MetadataCacheUtilizationIsTooHigh to cluster rules
Before, this rule was only a part of single-node rule set.
But it is applicable for both: single and cluster installations.
Adding it to cluster as well.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-04-20 11:31:43 +02:00
Roman Khavronenko
2dcfbd8e19 deployment/rules: add MetricNameStatsCacheUtilizationIsTooHigh alert (#10840)
The new rule `MetricNameStatsCacheUtilizationIsTooHigh` will signalize
about overutilization of Metric names usage stats tracker. See
https://docs.victoriametrics.com/victoriametrics/#track-ingested-metrics-usage

This rule can fire for deployments with high churn rate of metric names.
In cases like this, it is better to disable metric name tracking
completely, as it brings no use.

It might fire for deployments that were tracking metric names for very
long periods and this alert might be a good sign to reset the cache.

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-04-20 11:29:53 +02:00
168 changed files with 4901 additions and 2480 deletions

View File

@@ -1 +1,3 @@
Before creating the PR, please read [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist) and remove this line after confirming you understand and follow them.
**PLEASE REMOVE LINE BELOW BEFORE SUBMITTING**
Before creating the PR, make sure you have read and followed the [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).

View File

@@ -27,7 +27,7 @@ jobs:
- run: go version
- name: Cache Go artifacts
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: |
~/.cache/go-build

View File

@@ -40,7 +40,7 @@ jobs:
- run: go version
- name: Cache Go artifacts
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: |
~/.cache/go-build
@@ -50,14 +50,14 @@ jobs:
restore-keys: go-artifacts-${{ runner.os }}-codeql-analyze-${{ steps.go.outputs.go-version }}-
- name: Initialize CodeQL
uses: github/codeql-action/init@v4
uses: github/codeql-action/init@v4.35.1
with:
languages: go
- name: Autobuild
uses: github/codeql-action/autobuild@v4
uses: github/codeql-action/autobuild@v4.35.1
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v4
uses: github/codeql-action/analyze@v4.35.1
with:
category: 'language:go'

View File

@@ -47,7 +47,7 @@ jobs:
- run: go version
- name: Cache golangci-lint
uses: actions/cache@v4
uses: actions/cache@v5
with:
path: |
~/.cache/golangci-lint

View File

@@ -1,42 +1,4 @@
# Security Policy
## Supported Versions
You can find out about our security policy and VictoriaMetrics version support on the [security page](https://docs.victoriametrics.com/victoriametrics/#security) in the documentation.
The following versions of VictoriaMetrics receive regular security fixes:
| Version | Supported |
|--------------------------------------------------------------------------------|--------------------|
| [Latest release](https://docs.victoriametrics.com/victoriametrics/changelog/) | :white_check_mark: |
| [LTS releases](https://docs.victoriametrics.com/victoriametrics/lts-releases/) | :white_check_mark: |
| other releases | :x: |
See [this page](https://victoriametrics.com/security/) for more details.
## Software Bill of Materials (SBOM)
Every VictoriaMetrics container{{% available_from "#" %}} image published to
[Docker Hub](https://hub.docker.com/u/victoriametrics)
and [Quay.io](https://quay.io/organization/victoriametrics)
includes an [SPDX](https://spdx.dev/) SBOM attestation
generated automatically by BuildKit during
`docker buildx build`.
To inspect the SBOM for an image:
```sh
docker buildx imagetools inspect \
docker.io/victoriametrics/victoria-metrics:latest \
--format "{{ json .SBOM }}"
```
To scan an image using its SBOM attestation with
[Trivy](https://github.com/aquasecurity/trivy):
```sh
trivy image --sbom-sources oci \
docker.io/victoriametrics/victoria-metrics:latest
```
## Reporting a Vulnerability
Please report any security issues to <security@victoriametrics.com>

View File

@@ -77,16 +77,6 @@ func insertRows(at *auth.Token, tss []prompb.TimeSeries, mms []prompb.MetricMeta
var metadataTotal int
if prommetadata.IsEnabled() {
var accountID, projectID uint32
if at != nil {
accountID = at.AccountID
projectID = at.ProjectID
for i := range mms {
mm := &mms[i]
mm.AccountID = accountID
mm.ProjectID = projectID
}
}
ctx.WriteRequest.Metadata = mms
metadataTotal = len(mms)
}

View File

@@ -75,11 +75,6 @@ func insertRows(at *auth.Token, rows []prometheus.Row, mms []prometheus.Metadata
Samples: samples[len(samples)-1:],
})
}
var accountID, projectID uint32
if at != nil {
accountID = at.AccountID
projectID = at.ProjectID
}
for i := range mms {
mm := &mms[i]
mmsDst = append(mmsDst, prompb.MetricMetadata{
@@ -88,8 +83,6 @@ func insertRows(at *auth.Token, rows []prometheus.Row, mms []prometheus.Metadata
Type: mm.Type,
// there is no unit in Prometheus exposition formats
AccountID: accountID,
ProjectID: projectID,
})
}
ctx.WriteRequest.Timeseries = tssDst

View File

@@ -72,11 +72,6 @@ func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, mms []prompb.Met
var metadataTotal int
if prommetadata.IsEnabled() {
var accountID, projectID uint32
if at != nil {
accountID = at.AccountID
projectID = at.ProjectID
}
for i := range mms {
mm := &mms[i]
mmsDst = append(mmsDst, prompb.MetricMetadata{
@@ -85,8 +80,8 @@ func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, mms []prompb.Met
Type: mm.Type,
Unit: mm.Unit,
AccountID: accountID,
ProjectID: projectID,
AccountID: mm.AccountID,
ProjectID: mm.ProjectID,
})
}
ctx.WriteRequest.Metadata = mmsDst

View File

@@ -211,6 +211,9 @@ func (wr *writeRequest) copyMetadata(dst, src *prompb.MetricMetadata) {
dst.Type = src.Type
dst.Unit = src.Unit
dst.AccountID = src.AccountID
dst.ProjectID = src.ProjectID
// Pre-allocate memory for all string fields.
neededBufLen := len(src.MetricFamilyName) + len(src.Help)
bufLen := len(wr.metadatabuf)

View File

@@ -398,7 +398,7 @@ func tryPush(at *auth.Token, wr *prompb.WriteRequest, forceDropSamplesOnFailure
// Push metadata separately from time series, since it doesn't need sharding,
// relabeling, stream aggregation, deduplication, etc.
if !tryPushMetadataToRemoteStorages(rwctxs, mms, forceDropSamplesOnFailure) {
if !tryPushMetadataToRemoteStorages(at, rwctxs, mms, forceDropSamplesOnFailure) {
return false
}
@@ -536,11 +536,18 @@ func pushTimeSeriesToRemoteStoragesTrackDropped(tss []prompb.TimeSeries) {
}
}
func tryPushMetadataToRemoteStorages(rwctxs []*remoteWriteCtx, mms []prompb.MetricMetadata, forceDropSamplesOnFailure bool) bool {
func tryPushMetadataToRemoteStorages(at *auth.Token, rwctxs []*remoteWriteCtx, mms []prompb.MetricMetadata, forceDropSamplesOnFailure bool) bool {
if len(mms) == 0 {
// Nothing to push
return true
}
if at != nil {
for idx := range mms {
mm := &mms[idx]
mm.AccountID = at.AccountID
mm.ProjectID = at.ProjectID
}
}
// Do not shard metadata even if -remoteWrite.shardByURL is set, just replicate it among rwctxs.
// Since metadata is usually small and there is no guarantee that metadata can be sent to
// the same remote storage with the corresponding metrics.

View File

@@ -222,6 +222,9 @@ func (r *Rule) Validate() error {
if r.Expr == "" {
return fmt.Errorf("expression can't be empty")
}
if _, ok := r.Labels["__name__"]; ok {
return fmt.Errorf("invalid rule label __name__")
}
return checkOverflow(r.XXX, "rule")
}

View File

@@ -136,6 +136,9 @@ func TestRuleValidate(t *testing.T) {
if err := (&Rule{Alert: "alert"}).Validate(); err == nil {
t.Fatalf("expected empty expr error")
}
if err := (&Rule{Record: "record", Expr: "sum(test)", Labels: map[string]string{"__name__": "test"}}).Validate(); err == nil {
t.Fatalf("invalid rule label; got %s", err)
}
if err := (&Rule{Alert: "alert", Expr: "test>0"}).Validate(); err != nil {
t.Fatalf("expected valid rule; got %s", err)
}

View File

@@ -87,6 +87,7 @@ func (m *Metric) DelLabel(key string) {
for i, l := range m.Labels {
if l.Name == key {
m.Labels = append(m.Labels[:i], m.Labels[i+1:]...)
break
}
}
}

View File

@@ -14,7 +14,7 @@ type Notifier interface {
Send(ctx context.Context, alerts []Alert, alertLabels [][]prompb.Label, notifierHeaders map[string]string) error
// Addr returns address where alerts are sent.
Addr() string
// LastError returns error, that occured during last attempt to send data
// LastError returns error, that occurred during last attempt to send data
LastError() string
// Close is a destructor for the Notifier
Close()

View File

@@ -312,9 +312,11 @@ type labelSet struct {
// On k conflicts in origin set, the original value is preferred and copied
// to processed with `exported_%k` key. The copy happens only if passed v isn't equal to origin[k] value.
func (ls *labelSet) add(k, v string) {
// do not add label with empty value, since it has no meaning.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
// do not add label with empty value to the result, as it has no meaning:
// if the label already exists in the original query result, remove it to preserve compatibility with relabeling, see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10766.
// otherwise, ignore the label, see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984.
if v == "" {
delete(ls.processed, k)
return
}
ls.processed[k] = v

View File

@@ -1363,6 +1363,7 @@ func TestAlertingRule_ToLabels(t *testing.T) {
{Name: "instance", Value: "0.0.0.0:8800"},
{Name: "group", Value: "vmalert"},
{Name: "alertname", Value: "ConfigurationReloadFailure"},
{Name: "pod", Value: "vmalert-0"},
},
Values: []float64{1},
Timestamps: []int64{time.Now().UnixNano()},
@@ -1374,6 +1375,7 @@ func TestAlertingRule_ToLabels(t *testing.T) {
"group": "vmalert", // this shouldn't have effect since value in metric is equal
"invalid_label": "{{ .Values.mustRuntimeFail }}",
"empty_label": "", // this should be dropped
"pod": "", // this should remove the pod label from query result
},
Expr: "sum(vmalert_alerting_rules_error) by(instance, group, alertname) > 0",
Name: "AlertingRulesError",
@@ -1385,6 +1387,7 @@ func TestAlertingRule_ToLabels(t *testing.T) {
"group": "vmalert",
"alertname": "ConfigurationReloadFailure",
"alertgroup": "vmalert",
"pod": "vmalert-0",
"invalid_label": `error evaluating template: template: :1:298: executing "" at <.Values.mustRuntimeFail>: can't evaluate field Values in type notifier.tplData`,
}

View File

@@ -8,6 +8,7 @@ import (
"hash/fnv"
"maps"
"net/url"
"path"
"sync"
"time"
@@ -42,6 +43,9 @@ var (
"For example, if lookback=1h then range from now() to now()-1h will be scanned.")
maxStartDelay = flag.Duration("group.maxStartDelay", 5*time.Minute, "Defines the max delay before starting the group evaluation. Group's start is artificially delayed for random duration on interval"+
" [0..min(--group.maxStartDelay, group.interval)]. This helps smoothing out the load on the configured datasource, so evaluations aren't executed too close to each other.")
ruleStripFilePath = flag.Bool("rule.stripFilePath", false, "Whether to strip rule file paths in logs and all API responses, including /metrics. "+
"For example, file path '/path/to/tenant_id/rules.yml' will be stripped to 'groupHashID/rules.yml'. "+
"This flag may be useful for hiding sensitive information in file paths, such as S3 bucket details.")
)
// Group is an entity for grouping rules
@@ -147,6 +151,12 @@ func NewGroup(cfg config.Group, qb datasource.QuerierBuilder, defaultInterval ti
g.EvalDelay = &cfg.EvalDelay.D
}
g.id = g.CreateID()
// strip file path from group.File after generated group ID when ruleStripFilePath is set,
// so it won't be exposed in logs and api responses
if *ruleStripFilePath {
_, filename := path.Split(g.File)
g.File = fmt.Sprintf("%d/%s", g.id, filename)
}
for _, h := range cfg.Headers {
g.Headers[h.Key] = h.Value
}
@@ -409,6 +419,9 @@ func (g *Group) Start(ctx context.Context, rw remotewrite.RWClient, rr datasourc
g.mu.Unlock()
defer g.evalCancel()
// start the interval ticker before the first evaluation,
// so that the evaluation timestamps of groups with the `eval_offset` option are also aligned,
// see https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10773
t := time.NewTicker(g.Interval)
defer t.Stop()

View File

@@ -742,3 +742,64 @@ func parseTime(t *testing.T, s string) time.Time {
}
return tt
}
func TestRuleStripFilePath(t *testing.T) {
configG := config.Group{
Name: "group",
File: "/var/local/test/rules.yaml",
Type: config.NewRawType("prometheus"),
Concurrency: 1,
Rules: []config.Rule{
{
ID: 0,
Alert: "alert",
},
{
ID: 1,
Record: "record",
},
}}
qb := &datasource.FakeQuerier{}
g := NewGroup(configG, qb, 1*time.Minute, nil)
gID := g.id
if g.File != "/var/local/test/rules.yaml" {
t.Fatalf("expected file path to be unchanged; got %q instead", g.File)
}
for _, r := range g.Rules {
if ar, ok := r.(*AlertingRule); ok {
if ar.File != "/var/local/test/rules.yaml" {
t.Fatalf("expected rule file path to be unchanged; got %q instead", ar.File)
}
}
if rr, ok := r.(*RecordingRule); ok {
if rr.File != "/var/local/test/rules.yaml" {
t.Fatalf("expected rule file path to be unchanged; got %q instead", rr.File)
}
}
}
oldRuleStripFilePath := *ruleStripFilePath
*ruleStripFilePath = true
defer func() {
*ruleStripFilePath = oldRuleStripFilePath
}()
g = NewGroup(configG, qb, 1*time.Minute, nil)
if g.File != fmt.Sprintf("%d/rules.yaml", gID) {
t.Fatalf("expected file path to be stripped to %q; got %q instead", fmt.Sprintf("%d/rules.yaml", gID), g.File)
}
for _, r := range g.Rules {
if ar, ok := r.(*AlertingRule); ok {
if ar.File != fmt.Sprintf("%d/rules.yaml", gID) {
t.Fatalf("expected rule file path to be unchanged; got %q instead", ar.File)
}
}
if rr, ok := r.(*RecordingRule); ok {
if rr.File != fmt.Sprintf("%d/rules.yaml", gID) {
t.Fatalf("expected rule file path to be unchanged; got %q instead", rr.File)
}
}
}
}

View File

@@ -293,9 +293,11 @@ func (rr *RecordingRule) toTimeSeries(m datasource.Metric) prompb.TimeSeries {
}
// add extra labels configured by user
for k := range rr.Labels {
// do not add label with empty value, since it has no meaning.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984
// do not add label with empty value to the result, as it has no meaning:
// if the label already exists in the original query result, remove it to preserve compatibility with relabeling, see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10766.
// otherwise, ignore the label, see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9984.
if rr.Labels[k] == "" {
m.DelLabel(k)
continue
}
existingLabel := promrelabel.GetLabelByName(m.Labels, k)

View File

@@ -163,11 +163,13 @@ func TestRecordingRule_Exec(t *testing.T) {
f(&RecordingRule{
Name: "job:foo",
Labels: map[string]string{
"source": "test",
"source": "test",
"empty_label": "", // this should be dropped
"pod": "", // this should remove the pod label from query result
},
}, [][]datasource.Metric{{
metricWithValueAndLabels(t, 2, "__name__", "foo", "job", "foo"),
metricWithValueAndLabels(t, 1, "__name__", "bar", "job", "bar", "source", "origin"),
metricWithValueAndLabels(t, 2, "__name__", "foo", "job", "foo", "pod", "vmalert-0"),
metricWithValueAndLabels(t, 1, "__name__", "bar", "job", "bar", "source", "origin", "pod", "vmalert-1"),
metricWithValueAndLabels(t, 1, "__name__", "baz", "job", "baz", "source", "test"),
}}, [][]prompb.TimeSeries{{
newTimeSeries([]float64{2}, []int64{ts.UnixNano()}, []prompb.Label{

View File

@@ -252,6 +252,9 @@ func (r *ApiRule) ExtendState() {
// ToAPI returns ApiGroup representation of g
func (g *Group) ToAPI() *ApiGroup {
if g == nil {
return &ApiGroup{}
}
g.mu.RLock()
defer g.mu.RUnlock()
ag := ApiGroup{

View File

@@ -402,6 +402,20 @@ func templateFuncs() textTpl.FuncMap {
return t, nil
},
// formatTime formats the given Unix timestamp with the provided layout.
// For example: {{ now | formatTime "2006-01-02T15:04:05Z07:00" }}
"formatTime": func(layout string, i any) (string, error) {
v, err := toFloat64(i)
if err != nil {
return "", fmt.Errorf("formatTime: %w", err)
}
if math.IsNaN(v) || math.IsInf(v, 0) {
return "", fmt.Errorf("formatTime: cannot convert %v to time", v)
}
t := timeFromUnixTimestamp(v).Time().UTC()
return t.Format(layout), nil
},
/* URLs */
// externalURL returns value of `external.url` flag

View File

@@ -6,6 +6,7 @@ import (
"strings"
"testing"
textTpl "text/template"
"time"
)
func TestTemplateFuncs_StringConversion(t *testing.T) {
@@ -103,6 +104,26 @@ func TestTemplateFuncs_Formatting(t *testing.T) {
f("humanizeTimestamp", 1679055557, "2023-03-17 12:19:17 +0000 UTC")
}
func TestTemplateFuncs_FormatTime(t *testing.T) {
funcs := templateFuncs()
formatTime := funcs["formatTime"].(func(layout string, i any) (string, error))
f := func(layout string, input any, expected string) {
t.Helper()
result, err := formatTime(layout, input)
if err != nil {
t.Fatalf("unexpected error for formatTime(%q, %v): %s", layout, input, err)
}
if result != expected {
t.Fatalf("unexpected result for formatTime(%q, %v); got\n%s\nwant\n%s", layout, input, result, expected)
}
}
f(time.RFC3339, float64(1679055557), "2023-03-17T12:19:17Z")
f("2006-01-02T15:04:05", int64(1679055557), "2023-03-17T12:19:17")
f(time.RFC822, int(1679055557), "17 Mar 23 12:19 UTC")
}
func mkTemplate(current, replacement any) textTemplate {
tmpl := textTemplate{}
if current != nil {

View File

@@ -362,40 +362,62 @@ func (up *URLPrefix) setLoadBalancingPolicy(loadBalancingPolicy string) error {
}
type backendURLs struct {
healthChecksContext context.Context
healthChecksCancel func()
healthChecksWG sync.WaitGroup
bhc backendHealthCheck
bus []*backendURL
}
type backendHealthCheck struct {
ctx context.Context
// mu protects fields below
cancel func()
mu sync.Mutex
isStopped bool
wg sync.WaitGroup
}
func (bhc *backendHealthCheck) run(hc func()) {
bhc.mu.Lock()
defer bhc.mu.Unlock()
if bhc.isStopped {
return
}
bhc.wg.Go(hc)
}
func (bhc *backendHealthCheck) stop() {
bhc.mu.Lock()
bhc.cancel()
bhc.isStopped = true
bhc.mu.Unlock()
bhc.wg.Wait()
}
func newBackendURLs() *backendURLs {
ctx, cancel := context.WithCancel(context.Background())
return &backendURLs{
healthChecksContext: ctx,
healthChecksCancel: cancel,
bhc: backendHealthCheck{
ctx: ctx,
cancel: cancel,
},
}
}
func (bus *backendURLs) add(u *url.URL) {
bus.bus = append(bus.bus, &backendURL{
url: u,
healthCheckContext: bus.healthChecksContext,
healthCheckWG: &bus.healthChecksWG,
hasPlaceHolders: hasAnyPlaceholders(u),
url: u,
bhc: &bus.bhc,
hasPlaceHolders: hasAnyPlaceholders(u),
})
}
func (bus *backendURLs) stopHealthChecks() {
bus.healthChecksCancel()
bus.healthChecksWG.Wait()
bus.bhc.stop()
}
type backendURL struct {
broken atomic.Bool
healthCheckContext context.Context
healthCheckWG *sync.WaitGroup
bhc *backendHealthCheck
concurrentRequests atomic.Int32
@@ -410,7 +432,7 @@ func (bu *backendURL) isBroken() bool {
func (bu *backendURL) setBroken() {
if bu.broken.CompareAndSwap(false, true) {
bu.healthCheckWG.Go(func() {
bu.bhc.run(func() {
bu.runHealthCheck()
bu.broken.Store(false)
})
@@ -432,11 +454,11 @@ func (bu *backendURL) runHealthCheck() {
case <-t.C:
// Verify network connectivity via TCP dial before marking backend healthy.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9997
ctx, cancel := context.WithTimeout(bu.healthCheckContext, time.Second)
ctx, cancel := context.WithTimeout(bu.bhc.ctx, time.Second)
c, err := netutil.Dialer.DialContext(ctx, "tcp", addr)
cancel()
if err != nil {
if errors.Is(bu.healthCheckContext.Err(), context.Canceled) {
if errors.Is(bu.bhc.ctx.Err(), context.Canceled) {
return
}
logger.Warnf("ignoring the backend at %s for %s because of dial error: %s", addr, *failTimeout, err)
@@ -445,7 +467,7 @@ func (bu *backendURL) runHealthCheck() {
_ = c.Close()
return
case <-bu.healthCheckContext.Done():
case <-bu.bhc.ctx.Done():
return
}
}
@@ -517,7 +539,6 @@ func (up *URLPrefix) discoverBackendAddrsIfNeeded() {
continue
}
logger.Infof("try to resolve backend IPs for %s", host)
var resolvedAddrs []string
if strings.HasPrefix(host, "srv+") {
// The host has the format 'srv+realhost'. Strip 'srv+' prefix before performing the lookup.
@@ -545,7 +566,6 @@ func (up *URLPrefix) discoverBackendAddrsIfNeeded() {
resolvedAddrs = make([]string, len(addrs))
for i, addr := range addrs {
resolvedAddrs[i] = net.JoinHostPort(addr.String(), port)
logger.Infof("discover backend IPs for %s into %d addresses, one is %s", bu, len(resolvedAddrs), resolvedAddrs[i])
}
}
}
@@ -568,7 +588,6 @@ func (up *URLPrefix) discoverBackendAddrsIfNeeded() {
bus := up.bus.Load()
if areEqualBackendURLs(bus.bus, busNew.bus) {
logger.Infof("resolved addr are the same as the original one")
return
}

View File

@@ -8,10 +8,10 @@ import (
"time"
vmetrics "github.com/VictoriaMetrics/metrics"
"github.com/cheggaaa/pb/v3"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm"
"github.com/cheggaaa/pb/v3"
)
type otsdbProcessor struct {
@@ -89,9 +89,6 @@ func (op *otsdbProcessor) run(ctx context.Context) error {
// we're going to make serieslist * queryRanges queries, so we should represent that in the progress bar
otsdbSeriesTotal.Add(len(serieslist) * queryRanges)
bar := pb.StartNew(len(serieslist) * queryRanges)
defer func(bar *pb.ProgressBar) {
bar.Finish()
}(bar)
var wg sync.WaitGroup
for range op.otsdbcc {
wg.Go(func() {
@@ -106,41 +103,22 @@ func (op *otsdbProcessor) run(ctx context.Context) error {
}
})
}
/*
Loop through all series for this metric, processing all retentions and time ranges
requested. This loop is our primary "collect data from OpenTSDB loop" and should
be async, sending data to VictoriaMetrics over time.
runErr := op.sendQueries(ctx, serieslist, seriesCh, errCh, startTime)
The idea with having the select at the inner-most loop is to ensure quick
short-circuiting on error.
*/
for _, series := range serieslist {
for _, rt := range op.oc.Retentions {
for _, tr := range rt.QueryRanges {
select {
case otsdbErr := <-errCh:
return fmt.Errorf("opentsdb error: %s", otsdbErr)
case vmErr := <-op.im.Errors():
otsdbErrorsTotal.Inc()
return fmt.Errorf("import process failed: %s", wrapErr(vmErr, op.isVerbose))
case seriesCh <- queryObj{
Tr: tr, StartTime: startTime,
Series: series, Rt: opentsdb.RetentionMeta{
FirstOrder: rt.FirstOrder, SecondOrder: rt.SecondOrder, AggTime: rt.AggTime}}:
}
}
}
}
// Drain channels per metric
// Always drain channels and wait for workers to prevent goroutine leaks
close(seriesCh)
wg.Wait()
close(errCh)
// check for any lingering errors on the query side
for otsdbErr := range errCh {
return fmt.Errorf("import process failed: \n%s", otsdbErr)
if runErr == nil {
runErr = fmt.Errorf("import process failed: \n%s", otsdbErr)
}
}
bar.Finish()
if runErr != nil {
return runErr
}
log.Print(op.im.Stats())
}
op.im.Close()
@@ -155,6 +133,34 @@ func (op *otsdbProcessor) run(ctx context.Context) error {
return nil
}
// sendQueries iterates over all series and retention ranges, sending queries to workers.
// It returns early if ctx is canceled or an error is received.
func (op *otsdbProcessor) sendQueries(ctx context.Context, serieslist []opentsdb.Meta, seriesCh chan<- queryObj, errCh <-chan error, startTime int64) error {
for _, series := range serieslist {
for _, rt := range op.oc.Retentions {
for _, tr := range rt.QueryRanges {
select {
case <-ctx.Done():
return fmt.Errorf("context canceled: %s", ctx.Err())
case otsdbErr := <-errCh:
otsdbErrorsTotal.Inc()
return fmt.Errorf("opentsdb error: %s", otsdbErr)
case vmErr := <-op.im.Errors():
return fmt.Errorf("import process failed: %s", wrapErr(vmErr, op.isVerbose))
case seriesCh <- queryObj{
Tr: tr, StartTime: startTime,
Series: series, Rt: opentsdb.RetentionMeta{
FirstOrder: rt.FirstOrder,
SecondOrder: rt.SecondOrder,
AggTime: rt.AggTime,
}}:
}
}
}
}
return nil
}
func (op *otsdbProcessor) do(s queryObj) error {
start := s.StartTime - s.Tr.Start
end := s.StartTime - s.Tr.End
@@ -163,6 +169,7 @@ func (op *otsdbProcessor) do(s queryObj) error {
return fmt.Errorf("failed to collect data for %v in %v:%v :: %v", s.Series, s.Rt, s.Tr, err)
}
if len(data.Timestamps) < 1 || len(data.Values) < 1 {
log.Printf("no data found for %v in %v:%v...skipping", s.Series, s.Rt, s.Tr)
return nil
}
labels := make([]vm.LabelPair, 0, len(data.Tags))

View File

@@ -108,10 +108,10 @@ func (c Client) FindMetrics(q string) ([]string, error) {
if err != nil {
return nil, fmt.Errorf("failed to send GET request to %q: %s", q, err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != 200 {
return nil, fmt.Errorf("bad return from OpenTSDB: %d: %v", resp.StatusCode, resp)
}
defer func() { _ = resp.Body.Close() }()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("could not retrieve metric data from %q: %s", q, err)
@@ -130,12 +130,12 @@ func (c Client) FindSeries(metric string) ([]Meta, error) {
q := fmt.Sprintf("%s/api/search/lookup?m=%s&limit=%d", c.Addr, metric, c.Limit)
resp, err := c.c.Get(q)
if err != nil {
return nil, fmt.Errorf("failed to set GET request to %q: %s", q, err)
return nil, fmt.Errorf("failed to send GET request to %q: %s", q, err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != 200 {
return nil, fmt.Errorf("bad return from OpenTSDB: %d: %v", resp.StatusCode, resp)
}
defer func() { _ = resp.Body.Close() }()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("could not retrieve series data from %q: %s", q, err)
@@ -185,6 +185,7 @@ func (c Client) GetData(series Meta, rt RetentionMeta, start int64, end int64, m
if err != nil {
return Metric{}, fmt.Errorf("failed to send GET request to %q: %s", q, err)
}
defer func() { _ = resp.Body.Close() }()
/*
There are three potential failures here, none of which should kill the entire
migration run:
@@ -196,7 +197,6 @@ func (c Client) GetData(series Meta, rt RetentionMeta, start int64, end int64, m
log.Printf("bad response code from OpenTSDB query %v for %q...skipping", resp.StatusCode, q)
return Metric{}, nil
}
defer func() { _ = resp.Body.Close() }()
body, err := io.ReadAll(resp.Body)
if err != nil {
log.Println("couldn't read response body from OpenTSDB query...skipping")
@@ -239,27 +239,20 @@ func (c Client) GetData(series Meta, rt RetentionMeta, start int64, end int64, m
In all "bad" cases, we don't end the migration, we just don't process that particular message
*/
if len(output) < 1 {
// no results returned...return an empty object without error
return Metric{}, nil
}
if len(output) > 1 {
// multiple series returned for a single query. We can't process this right, so...
return Metric{}, nil
return Metric{}, fmt.Errorf("unexpected number of series returned: %d for query %q; expected 1", len(output), q)
}
if len(output[0].AggregateTags) > 0 {
// This failure means we've suppressed potential series somehow...
return Metric{}, nil
return Metric{}, fmt.Errorf("aggregate tags %v present in response for query %q; series may be suppressed", output[0].AggregateTags, q)
}
data := Metric{}
data.Metric = output[0].Metric
data.Tags = output[0].Tags
/*
We evaluate data for correctness before formatting the actual values
to skip a little bit of time if the series has invalid formatting
*/
data, err = modifyData(data, c.Normalize)
if err != nil {
return Metric{}, nil
return Metric{}, fmt.Errorf("failed to convert metric data for query %q: %w", q, err)
}
/*

View File

@@ -32,7 +32,7 @@ func convertDuration(duration string) (time.Duration, error) {
var err error
var timeValue int
if strings.HasSuffix(duration, "y") {
timeValue, err = strconv.Atoi(strings.Trim(duration, "y"))
timeValue, err = strconv.Atoi(strings.TrimSuffix(duration, "y"))
if err != nil {
return 0, fmt.Errorf("invalid time range: %q", duration)
}
@@ -42,7 +42,7 @@ func convertDuration(duration string) (time.Duration, error) {
return 0, fmt.Errorf("invalid time range: %q", duration)
}
} else if strings.HasSuffix(duration, "w") {
timeValue, err = strconv.Atoi(strings.Trim(duration, "w"))
timeValue, err = strconv.Atoi(strings.TrimSuffix(duration, "w"))
if err != nil {
return 0, fmt.Errorf("invalid time range: %q", duration)
}
@@ -52,7 +52,7 @@ func convertDuration(duration string) (time.Duration, error) {
return 0, fmt.Errorf("invalid time range: %q", duration)
}
} else if strings.HasSuffix(duration, "d") {
timeValue, err = strconv.Atoi(strings.Trim(duration, "d"))
timeValue, err = strconv.Atoi(strings.TrimSuffix(duration, "d"))
if err != nil {
return 0, fmt.Errorf("invalid time range: %q", duration)
}
@@ -95,6 +95,9 @@ func convertRetention(retention string, offset int64, msecTime bool) (Retention,
if !msecTime {
queryLength = queryLength / 1000
}
if queryLength <= 0 {
return Retention{}, fmt.Errorf("ttl %q resolves to non-positive query range %d; use a larger duration", chunks[2], queryLength)
}
queryRange := queryLength
// bump by the offset so we don't look at empty ranges any time offset > ttl
queryLength += offset
@@ -138,16 +141,29 @@ func convertRetention(retention string, offset int64, msecTime bool) (Retention,
2. we discover the actual size of each "chunk"
This is second division step
*/
querySize = int64(queryRange / (queryRange / (rowLength * 4)))
divisor := queryRange / (rowLength * 4)
if divisor == 0 {
querySize = queryRange
} else {
querySize = queryRange / divisor
}
} else {
/*
Unless the aggTime (how long a range of data we're requesting per individual point)
is greater than the row size. Then we'll need to use that to determine
how big each individual query should be
*/
querySize = int64(queryRange / (queryRange / (aggTime * 4)))
divisor := queryRange / (aggTime * 4)
if divisor == 0 {
querySize = queryRange
} else {
querySize = queryRange / divisor
}
}
if querySize <= 0 {
return Retention{}, fmt.Errorf("computed non-positive querySize=%d for retention %q; check parameters", querySize, retention)
}
var timeChunks []TimeRange
var i int64
for i = offset; i <= queryLength; i = i + querySize {

View File

@@ -1223,11 +1223,7 @@ func getCommonParamsInternal(r *http.Request, startTime time.Time, requireNonEmp
if err != nil {
return nil, err
}
// Limit the `end` arg to the current time +2 days in the same way
// as it is limited during data ingestion.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/ea06d2fd3ccbbb6aa4480ab3b04f7b671408be2a/lib/storage/table.go#L378
// This should fix possible timestamp overflow - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2669
maxTS := startTime.UnixNano()/1e6 + 2*24*3600*1000
maxTS := int64(math.MaxInt64 / 1_000_000)
if end > maxTS {
end = maxTS
}

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -37,9 +37,9 @@
<meta property="og:title" content="UI for VictoriaMetrics">
<meta property="og:url" content="https://victoriametrics.com/">
<meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data">
<script type="module" crossorigin src="./assets/index-C24BPpD_.js"></script>
<script type="module" crossorigin src="./assets/index-fsxQMWD9.js"></script>
<link rel="modulepreload" crossorigin href="./assets/rolldown-runtime-COnpUsM8.js">
<link rel="modulepreload" crossorigin href="./assets/vendor-BWBgVCcr.js">
<link rel="modulepreload" crossorigin href="./assets/vendor-BF3F25aG.js">
<link rel="stylesheet" crossorigin href="./assets/vendor-CnsZ1jie.css">
<link rel="stylesheet" crossorigin href="./assets/index-D2OEy8Ra.css">
</head>

View File

@@ -33,6 +33,8 @@ import (
var (
retentionPeriod = flagutil.NewRetentionDuration("retentionPeriod", "1M", "Data with timestamps outside the retentionPeriod is automatically deleted. The minimum retentionPeriod is 24h or 1d. "+
"See https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#retention. See also -retentionFilter")
futureRetention = flagutil.NewRetentionDuration("futureRetention", "2d", "Data with timestamps bigger than now+futureRetention is automatically deleted. "+
"The minimum futureRetention is 2 days. See https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#retention")
snapshotAuthKey = flagutil.NewPassword("snapshotAuthKey", "authKey, which must be passed in query string to /snapshot* pages. It overrides -httpAuth.*")
forceMergeAuthKey = flagutil.NewPassword("forceMergeAuthKey", "authKey, which must be passed in query string to /internal/force_merge pages. It overrides -httpAuth.*")
forceFlushAuthKey = flagutil.NewPassword("forceFlushAuthKey", "authKey, which must be passed in query string to /internal/force_flush pages. It overrides -httpAuth.*")
@@ -135,7 +137,12 @@ func Init(resetCacheIfNeeded func(mrs []storage.MetricRow)) {
mergeset.SetDataBlocksSparseCacheSize(cacheSizeIndexDBDataBlocksSparse.IntN())
if retentionPeriod.Duration() < 24*time.Hour {
logger.Fatalf("-retentionPeriod cannot be smaller than a day; got %s", retentionPeriod)
logger.Fatalf("-retentionPeriod cannot be smaller than a day; got %s. "+
"See https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#retention", retentionPeriod)
}
if futureRetention.Duration() < 2*24*time.Hour {
logger.Fatalf("-futureRetention cannot be smaller than 2 days; got %s. "+
"See https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#retention", futureRetention)
}
if *idbPrefillStart > 23*time.Hour {
logger.Panicf("-storage.idbPrefillStart cannot exceed 23 hours; got %s", idbPrefillStart)
@@ -145,6 +152,7 @@ func Init(resetCacheIfNeeded func(mrs []storage.MetricRow)) {
WG = syncwg.WaitGroup{}
opts := storage.OpenOptions{
Retention: retentionPeriod.Duration(),
FutureRetention: futureRetention.Duration(),
MaxHourlySeries: getMaxHourlySeries(),
MaxDailySeries: getMaxDailySeries(),
DisablePerDayIndex: *disablePerDayIndex,

View File

@@ -6,7 +6,7 @@ COPY web/ /build/
RUN GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o web-amd64 github.com/VictoriMetrics/vmui/ && \
GOOS=windows GOARCH=amd64 CGO_ENABLED=0 go build -o web-windows github.com/VictoriMetrics/vmui/
FROM alpine:3.23.3
FROM alpine:3.23.4
USER root
COPY --from=build-web-stage /build/web-amd64 /app/web

View File

@@ -1,47 +1,18 @@
import { ArrayRGB } from "../types";
export const baseContrastColors = [
"#e54040",
"#32a9dc",
"#2ee329",
"#7126a1",
"#e38f0f",
"#3d811a",
"#ffea00",
"#2d2d2d",
"#da42a6",
"#a44e0c",
"#e6194b", // red
"#4363d8", // blue
"#3cb44b", // green
"#911eb4", // purple
"#f58231", // orange
"#f032e6", // magenta
"#c8a200", // dark yellow
"#a65628", // brown
"#42d4f4", // cyan
"#a9a9a9", // gray
];
export const hexToRGB = (hex: string): string => {
if (hex.length != 7) return "0, 0, 0";
const r = parseInt(hex.slice(1, 3), 16);
const g = parseInt(hex.slice(3, 5), 16);
const b = parseInt(hex.slice(5, 7), 16);
return `${r}, ${g}, ${b}`;
};
export const getColorFromString = (text: string): string => {
const SEED = 16777215;
const FACTOR = 49979693;
let b = 1;
let d = 0;
let f = 1;
if (text.length > 0) {
for (let i = 0; i < text.length; i++) {
text[i].charCodeAt(0) > d && (d = text[i].charCodeAt(0));
f = parseInt(String(SEED / d));
b = (b + text[i].charCodeAt(0) * f * FACTOR) % SEED;
}
}
let hex = ((b * text.length) % SEED).toString(16);
hex = hex.padEnd(6, hex);
return `#${hex}`;
};
export const getContrastColor = (value: string) => {
let hex = value.replace("#", "").trim();
@@ -70,3 +41,109 @@ export const generateGradient = (start: ArrayRGB, end: ArrayRGB, steps: number)
}
return gradient.map(c => `rgb(${c})`);
};
const clamp = (n: number, min: number, max: number) => Math.min(max, Math.max(min, n));
const hexToRgb = (hex: string) => {
let value = hex.replace("#", "").trim();
if (value.length === 3) {
value = value.split("").map((c) => c + c).join("");
}
if (!/^[0-9a-fA-F]{6}$/.test(value)) {
throw new Error("Invalid HEX color.");
}
return {
r: parseInt(value.slice(0, 2), 16),
g: parseInt(value.slice(2, 4), 16),
b: parseInt(value.slice(4, 6), 16),
};
};
const rgbToHex = (r: number, g: number, b: number) =>
`#${[r, g, b].map((v) => clamp(Math.round(v), 0, 255).toString(16).padStart(2, "0")).join("")}`;
const rgbToHsl = (r: number, g: number, b: number) => {
r /= 255; g /= 255; b /= 255;
const max = Math.max(r, g, b);
const min = Math.min(r, g, b);
const l = (max + min) / 2;
const d = max - min;
let h = 0;
let s = 0;
if (d !== 0) {
s = d / (1 - Math.abs(2 * l - 1));
switch (max) {
case r: h = ((g - b) / d) % 6; break;
case g: h = (b - r) / d + 2; break;
case b: h = (r - g) / d + 4; break;
}
h *= 60;
if (h < 0) h += 360;
}
return { h, s: s * 100, l: l * 100 };
};
const hslToRgb = (h: number, s: number, l: number) => {
s /= 100;
l /= 100;
const c = (1 - Math.abs(2 * l - 1)) * s;
const x = c * (1 - Math.abs((h / 60) % 2 - 1));
const m = l - c / 2;
let r: number;
let g: number;
let b: number;
if (h < 60) [r, g, b] = [c, x, 0];
else if (h < 120) [r, g, b] = [x, c, 0];
else if (h < 180) [r, g, b] = [0, c, x];
else if (h < 240) [r, g, b] = [0, x, c];
else if (h < 300) [r, g, b] = [x, 0, c];
else [r, g, b] = [c, 0, x];
return {
r: (r + m) * 255,
g: (g + m) * 255,
b: (b + m) * 255,
};
};
const varyColor = (hex: string, variant: number) => {
const { r, g, b } = hexToRgb(hex);
const { h, s, l } = rgbToHsl(r, g, b);
const variants = [
{ ds: 0, dl: 0 },
{ ds: -20, dl: -16 },
{ ds: -16, dl: +16 },
{ ds: +14, dl: -20 },
];
const v = variants[variant % variants.length];
const nextS = clamp(s + v.ds, 35, 85);
const nextL = clamp(l + v.dl, 35, 70);
const rgb = hslToRgb(h, nextS, nextL);
return rgbToHex(rgb.r, rgb.g, rgb.b);
};
export const getSeriesColor = (index: number) => {
const baseCount = baseContrastColors.length;
const baseIndex = index % baseCount;
const variantIndex = Math.floor(index / baseCount);
const base = baseContrastColors[(baseIndex + variantIndex) % baseCount];
return varyColor(base, variantIndex);
};

View File

@@ -2,7 +2,7 @@ import { MetricBase, MetricResult } from "../../api/types";
import uPlot, { Series as uPlotSeries } from "uplot";
import { getNameForMetric, promValueToNumber } from "../metric";
import { HideSeriesArgs, LegendItemType, SeriesItem } from "../../types";
import { baseContrastColors, getColorFromString } from "../color";
import { getSeriesColor } from "../color";
import { getMathStats } from "../math";
import { formatPrettyNumber } from "./helpers";
import { drawPoints } from "./scatter";
@@ -17,11 +17,10 @@ export const extractFields = (metric: MetricBase["metric"]): string => {
export const getSeriesItemContext = (data: MetricResult[], hideSeries: string[], alias: string[], showPoints?: boolean, isRawQuery?: boolean) => {
const colorState: {[key: string]: string} = {};
const maxColors = Math.min(data.length, baseContrastColors.length);
for (let i = 0; i < maxColors; i++) {
for (let i = 0; i < data.length; i++) {
const label = getNameForMetric(data[i], alias[data[i].group - 1]);
colorState[label] = baseContrastColors[i];
colorState[label] = getSeriesColor(i);
}
return (d: MetricResult): SeriesItem => {
@@ -32,7 +31,7 @@ export const getSeriesItemContext = (data: MetricResult[], hideSeries: string[],
label,
hasAlias: Boolean(aliasValue),
width: 1.4,
stroke: colorState[label] || getColorFromString(label),
stroke: colorState[label],
points: getPointsSeries(showPoints, isRawQuery),
spanGaps: false,
freeFormFields: d.metric,

View File

@@ -93,6 +93,14 @@ type QueryOpts struct {
Headers http.Header
}
// getTenant returns tenant with optional default value
func (qos *QueryOpts) getTenant() string {
if qos.Tenant == "" {
return "0"
}
return qos.Tenant
}
func (qos *QueryOpts) getHeaders() http.Header {
if qos.Headers == nil {
qos.Headers = make(http.Header)

View File

@@ -28,6 +28,7 @@ func TestSingleBackupRestore(t *testing.T) {
return tc.MustStartVmsingle("vmsingle", []string{
"-storageDataPath=" + storageDataPath,
"-retentionPeriod=100y",
"-futureRetention=2y",
})
},
stopSUT: func() {
@@ -60,11 +61,13 @@ func TestClusterBackupRestore(t *testing.T) {
Vmstorage1Flags: []string{
"-storageDataPath=" + storage1DataPath,
"-retentionPeriod=100y",
"-futureRetention=2y",
},
Vmstorage2Instance: "vmstorage2",
Vmstorage2Flags: []string{
"-storageDataPath=" + storage2DataPath,
"-retentionPeriod=100y",
"-futureRetention=2y",
},
VminsertInstance: "vminsert",
VminsertFlags: []string{},
@@ -97,10 +100,16 @@ func TestClusterBackupRestore(t *testing.T) {
func testBackupRestore(tc *apptest.TestCase, opts testBackupRestoreOpts) {
t := tc.T()
genData := func(count int, prefix string, start, step int64) (recs []string, wantSeries []map[string]string, wantQueryResults []*apptest.QueryResult) {
recs = make([]string, count)
wantSeries = make([]map[string]string, count)
wantQueryResults = make([]*apptest.QueryResult, count)
type data struct {
samples []string
wantSeries []map[string]string
wantQueryResults []*apptest.QueryResult
}
genData := func(count int, prefix string, start, step int64) data {
recs := make([]string, count)
wantSeries := make([]map[string]string, count)
wantQueryResults := make([]*apptest.QueryResult, count)
for i := range count {
name := fmt.Sprintf("%s_%03d", prefix, i)
value := float64(i)
@@ -113,7 +122,15 @@ func testBackupRestore(tc *apptest.TestCase, opts testBackupRestoreOpts) {
Samples: []*apptest.Sample{{Timestamp: timestamp, Value: value}},
}
}
return recs, wantSeries, wantQueryResults
return data{recs, wantSeries, wantQueryResults}
}
concatData := func(d1, d2 data) data {
var d data
d.samples = slices.Concat(d1.samples, d2.samples)
d.wantSeries = slices.Concat(d1.wantSeries, d2.wantSeries)
d.wantQueryResults = slices.Concat(d1.wantQueryResults, d2.wantQueryResults)
return d
}
backupBaseDir, err := filepath.Abs(filepath.Join(tc.Dir(), "backups"))
@@ -190,10 +207,20 @@ func testBackupRestore(tc *apptest.TestCase, opts testBackupRestoreOpts) {
// Use the same number of metrics and time range for all the data ingestions
// below.
const numMetrics = 1000
// With 1000 metrics (one per minute), the time range spans 2 months.
start := time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC).UnixMilli()
end := time.Date(2025, 3, 1, 0, 0, 0, 0, time.UTC).UnixMilli()
step := (end - start) / numMetrics
batch1 := genData(numMetrics, "batch1", start, step)
batch2 := genData(numMetrics, "batch2", start, step)
batches12 := concatData(batch1, batch2)
now := time.Now().UTC()
startFuture := time.Date(now.Year()+1, 1, 1, 0, 0, 0, 0, time.UTC).UnixMilli()
endFuture := time.Date(now.Year()+1, 3, 1, 0, 0, 0, 0, time.UTC).UnixMilli()
stepFuture := (endFuture - startFuture) / numMetrics
batch1Future := genData(numMetrics, "batch1", startFuture, stepFuture)
batch2Future := genData(numMetrics, "batch2", startFuture, stepFuture)
batches12Future := concatData(batch1Future, batch2Future)
// Verify backup/restore:
//
@@ -207,24 +234,25 @@ func testBackupRestore(tc *apptest.TestCase, opts testBackupRestoreOpts) {
// - Start vmsingle
// - Ensure that the queries return batch1 data only.
batch1Data, wantBatch1Series, wantBatch1QueryResults := genData(numMetrics, "batch1", start, step)
batch2Data, wantBatch2Series, wantBatch2QueryResults := genData(numMetrics, "batch2", start, step)
wantBatch12Series := slices.Concat(wantBatch1Series, wantBatch2Series)
wantBatch12QueryResults := slices.Concat(wantBatch1QueryResults, wantBatch2QueryResults)
sut := opts.startSUT()
sut.PrometheusAPIV1ImportPrometheus(t, batch1Data, apptest.QueryOpts{})
sut.PrometheusAPIV1ImportPrometheus(t, batch1.samples, apptest.QueryOpts{})
sut.PrometheusAPIV1ImportPrometheus(t, batch1Future.samples, apptest.QueryOpts{})
sut.ForceFlush(t)
assertSeries(sut, `{__name__=~"batch1.*"}`, start, end, wantBatch1Series)
assertQueryResults(sut, `{__name__=~"batch1.*"}`, start, end, step, wantBatch1QueryResults)
assertSeries(sut, `{__name__=~"batch1.*"}`, start, end, batch1.wantSeries)
assertSeries(sut, `{__name__=~"batch1.*"}`, startFuture, endFuture, batch1Future.wantSeries)
assertQueryResults(sut, `{__name__=~"batch1.*"}`, start, end, step, batch1.wantQueryResults)
assertQueryResults(sut, `{__name__=~"batch1.*"}`, startFuture, endFuture, stepFuture, batch1Future.wantQueryResults)
createBackup(sut, "batch1")
sut.PrometheusAPIV1ImportPrometheus(t, batch2Data, apptest.QueryOpts{})
sut.PrometheusAPIV1ImportPrometheus(t, batch2.samples, apptest.QueryOpts{})
sut.PrometheusAPIV1ImportPrometheus(t, batch2Future.samples, apptest.QueryOpts{})
sut.ForceFlush(t)
assertSeries(sut, `{__name__=~"batch(1|2).*"}`, start, end, wantBatch12Series)
assertQueryResults(sut, `{__name__=~"batch(1|2).*"}`, start, end, step, wantBatch12QueryResults)
assertSeries(sut, `{__name__=~"batch(1|2).*"}`, start, end, batches12.wantSeries)
assertSeries(sut, `{__name__=~"batch(1|2).*"}`, startFuture, endFuture, batches12Future.wantSeries)
assertQueryResults(sut, `{__name__=~"batch(1|2).*"}`, start, end, step, batches12.wantQueryResults)
assertQueryResults(sut, `{__name__=~"batch(1|2).*"}`, startFuture, endFuture, stepFuture, batches12Future.wantQueryResults)
createBackup(sut, "batch12")
opts.stopSUT()
@@ -233,6 +261,8 @@ func testBackupRestore(tc *apptest.TestCase, opts testBackupRestoreOpts) {
sut = opts.startSUT()
assertSeries(sut, `{__name__=~"batch1.*"}`, start, end, wantBatch1Series)
assertQueryResults(sut, `{__name__=~"batch1.*"}`, start, end, step, wantBatch1QueryResults)
assertSeries(sut, `{__name__=~"batch(1|2).*"}`, start, end, batch1.wantSeries)
assertSeries(sut, `{__name__=~"batch(1|2).*"}`, startFuture, endFuture, batch1Future.wantSeries)
assertQueryResults(sut, `{__name__=~"batch(1|2).*"}`, start, end, step, batch1.wantQueryResults)
assertQueryResults(sut, `{__name__=~"batch(1|2).*"}`, startFuture, endFuture, stepFuture, batch1Future.wantQueryResults)
}

View File

@@ -0,0 +1,211 @@
package tests
import (
"fmt"
"path/filepath"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/apptest"
)
func TestSingleFutureTimestamps(t *testing.T) {
tc := apptest.NewTestCase(t)
defer tc.Stop()
opts := testFutureTimestampsOpts{
start: func() apptest.PrometheusWriteQuerier {
return tc.MustStartVmsingle("vmsingle", []string{
"-storageDataPath=" + filepath.Join(tc.Dir(), "vmsingle"),
"-retentionPeriod=100y",
"-futureRetention=100y",
})
},
stop: func() {
tc.StopApp("vmsingle")
},
}
testFutureTimestamps(tc, opts)
}
func TestClusterFutureTimestamps(t *testing.T) {
tc := apptest.NewTestCase(t)
defer tc.Stop()
opts := testFutureTimestampsOpts{
start: func() apptest.PrometheusWriteQuerier {
return tc.MustStartCluster(&apptest.ClusterOptions{
Vmstorage1Instance: "vmstorage1",
Vmstorage1Flags: []string{
"-storageDataPath=" + filepath.Join(tc.Dir(), "vmstorage1"),
"-retentionPeriod=100y",
"-futureRetention=100y",
},
Vmstorage2Instance: "vmstorage2",
Vmstorage2Flags: []string{
"-storageDataPath=" + filepath.Join(tc.Dir(), "vmstorage2"),
"-retentionPeriod=100y",
"-futureRetention=100y",
},
VminsertInstance: "vminsert",
VminsertFlags: []string{},
VmselectInstance: "vmselect",
VmselectFlags: []string{},
})
},
stop: func() {
tc.StopApp("vminsert")
tc.StopApp("vmselect")
tc.StopApp("vmstorage1")
tc.StopApp("vmstorage2")
},
}
testFutureTimestamps(tc, opts)
}
type testFutureTimestampsOpts struct {
start func() apptest.PrometheusWriteQuerier
stop func()
}
func testFutureTimestamps(tc *apptest.TestCase, opts testFutureTimestampsOpts) {
t := tc.T()
// assertSeries retrieves set of all metric names from the storage and
// compares it with the expected set.
assertSeries := func(app apptest.PrometheusQuerier, prefix string, start, end int64, want []map[string]string) {
t.Helper()
query := fmt.Sprintf(`{__name__=~"metric_%s.*"}`, prefix)
tc.Assert(&apptest.AssertOptions{
Msg: "unexpected /api/v1/series response",
Got: func() any {
return app.PrometheusAPIV1Series(t, query, apptest.QueryOpts{
Start: fmt.Sprintf("%d", start),
End: fmt.Sprintf("%d", end),
}).Sort()
},
Want: &apptest.PrometheusAPIV1SeriesResponse{
Status: "success",
Data: want,
},
FailNow: true,
})
}
// assertSeries retrieves all data from the storage and compares it with the
// expected result.
assertQueryResults := func(app apptest.PrometheusQuerier, prefix string, start, end, step int64, want []*apptest.QueryResult) {
t.Helper()
query := fmt.Sprintf(`{__name__=~"metric_%s.*"}`, prefix)
tc.Assert(&apptest.AssertOptions{
Msg: "unexpected /api/v1/query_range response",
Got: func() any {
return app.PrometheusAPIV1QueryRange(t, query, apptest.QueryOpts{
Start: fmt.Sprintf("%d", start),
End: fmt.Sprintf("%d", end),
Step: fmt.Sprintf("%dms", step),
MaxLookback: fmt.Sprintf("%dms", step-1),
NoCache: "1",
})
},
Want: &apptest.PrometheusAPIV1QueryResponse{
Status: "success",
Data: &apptest.QueryData{
ResultType: "matrix",
Result: want,
},
},
FailNow: true,
})
}
f := func(prefix string, startTime, endTime time.Time, wantEmpty bool) {
const numMetrics = 1000
start := startTime.UnixMilli()
end := endTime.UnixMilli()
step := (end - start) / numMetrics
data := genFutureTimestampsData(prefix, numMetrics, start, step)
if wantEmpty {
data.wantSeries = []map[string]string{}
data.wantQueryResults = []*apptest.QueryResult{}
}
// Ingest data and check query results.
sut := opts.start()
sut.PrometheusAPIV1ImportPrometheus(t, data.samples, apptest.QueryOpts{})
sut.ForceFlush(t)
assertSeries(sut, prefix, start, end, data.wantSeries)
assertQueryResults(sut, prefix, start, end, step, data.wantQueryResults)
// Ensure the queries work after restrart.
opts.stop()
sut = opts.start()
assertSeries(sut, prefix, start, end, data.wantSeries)
assertQueryResults(sut, prefix, start, end, step, data.wantQueryResults)
opts.stop()
}
now := time.Now().UTC()
retentionLimit := 100 * 365 * 24 * time.Hour
var start, end time.Time
start = time.Date(now.Year(), now.Month(), now.Day()+1, 0, 0, 0, 0, time.UTC)
end = time.Date(now.Year(), now.Month(), now.Day()+2, 0, 0, 0, 0, time.UTC)
f("future_1d", start, end, false)
start = time.Date(now.Year(), now.Month()+1, 1, 0, 0, 0, 0, time.UTC)
end = time.Date(now.Year(), now.Month()+2, 1, 0, 0, 0, 0, time.UTC)
f("future_1m", start, end, false)
start = time.Date(now.Year()+1, 1, 1, 0, 0, 0, 0, time.UTC)
end = time.Date(now.Year()+2, 1, 1, 0, 0, 0, 0, time.UTC)
f("future_1y", start, end, false)
start = now.Add(retentionLimit - 24*time.Hour)
end = now.Add(retentionLimit)
f("future_1d_before_limit", start, end, false)
start = now.Add(retentionLimit + time.Minute)
end = now.Add(retentionLimit + 24*time.Hour)
f("future_1d_beyond_limit", start, end, true)
}
type futureTimestampsData struct {
samples []string
wantSeries []map[string]string
wantQueryResults []*apptest.QueryResult
}
func genFutureTimestampsData(prefix string, numMetrics, start, step int64) futureTimestampsData {
samples := make([]string, numMetrics)
wantSeries := make([]map[string]string, numMetrics)
wantQueryResults := make([]*apptest.QueryResult, numMetrics)
for i := range numMetrics {
metricName := fmt.Sprintf("metric_%s_%04d", prefix, i)
labelName := fmt.Sprintf("label_%s_%04d", prefix, i)
labelValue := fmt.Sprintf("value_%s_%04d", prefix, i)
value := i
timestamp := start + i*step
samples[i] = fmt.Sprintf(`%s{%s="value", label="%s"} %d %d`, metricName, labelName, labelValue, value, timestamp)
wantSeries[i] = map[string]string{
"__name__": metricName,
labelName: "value",
"label": labelValue,
}
wantQueryResults[i] = &apptest.QueryResult{
Metric: map[string]string{
"__name__": metricName,
labelName: "value",
"label": labelValue,
},
Samples: []*apptest.Sample{{Timestamp: timestamp, Value: float64(value)}},
}
}
return futureTimestampsData{samples, wantSeries, wantQueryResults}
}

View File

@@ -0,0 +1,186 @@
package tests
import (
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"sort"
"strconv"
"strings"
"testing"
)
// openTSDBPoint is a single data point served by the mock OpenTSDB server.
type openTSDBPoint struct {
Metric string
Tags map[string]string
Timestamp int64
Value float64
}
// openTSDBMockServer implements the minimal subset of the OpenTSDB HTTP API
// used by vmctl opentsdb: /api/suggest, /api/search/lookup, /api/query.
type openTSDBMockServer struct {
server *httptest.Server
points []openTSDBPoint
}
// newOpenTSDBMockServer starts an httptest server serving the given points.
func newOpenTSDBMockServer(t *testing.T, points []openTSDBPoint) *openTSDBMockServer {
t.Helper()
s := &openTSDBMockServer{points: points}
mux := http.NewServeMux()
mux.HandleFunc("/api/suggest", s.handleSuggest)
mux.HandleFunc("/api/search/lookup", s.handleLookup)
mux.HandleFunc("/api/query", s.handleQuery)
s.server = httptest.NewServer(mux)
return s
}
// close shuts down the server.
func (s *openTSDBMockServer) close() { s.server.Close() }
// httpAddr returns the server URL.
func (s *openTSDBMockServer) httpAddr() string { return s.server.URL }
// handleSuggest serves https://opentsdb.net/docs/build/html/api_http/suggest.html
func (s *openTSDBMockServer) handleSuggest(w http.ResponseWriter, r *http.Request) {
q := r.URL.Query().Get("q")
seen := make(map[string]bool, len(s.points))
var out []string
for _, p := range s.points {
if seen[p.Metric] {
continue
}
if q != "" && !strings.Contains(p.Metric, q) {
continue
}
seen[p.Metric] = true
out = append(out, p.Metric)
}
_ = json.NewEncoder(w).Encode(out)
}
// handleLookup serves https://opentsdb.net/docs/build/html/api_http/search/lookup.html
func (s *openTSDBMockServer) handleLookup(w http.ResponseWriter, r *http.Request) {
metric := r.URL.Query().Get("m")
type meta struct {
Metric string `json:"metric"`
Tags map[string]string `json:"tags"`
}
seen := make(map[string]bool, len(s.points))
var results []meta
for _, p := range s.points {
if p.Metric != metric {
continue
}
key := tagsKey(p.Tags)
if seen[key] {
continue
}
seen[key] = true
results = append(results, meta{p.Metric, p.Tags})
}
_ = json.NewEncoder(w).Encode(map[string]any{
"type": "LOOKUP",
"metric": metric,
"results": results,
})
}
// handleQuery serves https://opentsdb.net/docs/build/html/api_http/query/index.html
func (s *openTSDBMockServer) handleQuery(w http.ResponseWriter, r *http.Request) {
m := r.URL.Query().Get("m")
metric, tagFilter, ok := parseQuery(m)
if !ok {
http.Error(w, "bad query param", http.StatusBadRequest)
return
}
start, err := strconv.ParseInt(r.URL.Query().Get("start"), 10, 64)
if err != nil {
http.Error(w, "bad start param", http.StatusBadRequest)
return
}
end, err := strconv.ParseInt(r.URL.Query().Get("end"), 10, 64)
if err != nil {
http.Error(w, "bad end param", http.StatusBadRequest)
return
}
type resp struct {
Metric string `json:"metric"`
Tags map[string]string `json:"tags"`
AggregateTags []string `json:"aggregateTags"`
Dps map[string]float64 `json:"dps"`
}
grouped := make(map[string]*resp, len(s.points))
for _, p := range s.points {
if p.Metric != metric {
continue
}
if !matchTags(p.Tags, tagFilter) {
continue
}
if p.Timestamp < start || p.Timestamp > end {
continue
}
key := tagsKey(p.Tags)
if _, exists := grouped[key]; !exists {
grouped[key] = &resp{
Metric: p.Metric,
Tags: p.Tags,
AggregateTags: []string{},
Dps: map[string]float64{},
}
}
grouped[key].Dps[fmt.Sprintf("%d", p.Timestamp)] = p.Value
}
out := make([]*resp, 0, len(grouped))
for _, v := range grouped {
out = append(out, v)
}
_ = json.NewEncoder(w).Encode(out)
}
// parseQuery parses the OpenTSDB m= query parameter.
// Format: "<agg>:<bucket>-<agg>-none:<metric>{k=v,k=v}"
func parseQuery(m string) (string, map[string]string, bool) {
parts := strings.SplitN(m, ":", 3)
if len(parts) != 3 {
return "", nil, false
}
metric, tagStr, _ := strings.Cut(parts[2], "{")
tags := make(map[string]string, 4)
tagStr = strings.TrimSuffix(tagStr, "}")
for _, kv := range strings.Split(tagStr, ",") {
if k, v, ok := strings.Cut(kv, "="); ok {
tags[k] = v
}
}
return metric, tags, true
}
func matchTags(got, filter map[string]string) bool {
for k, v := range filter {
if v == "*" {
continue
}
if got[k] != v {
return false
}
}
return true
}
func tagsKey(tags map[string]string) string {
keys := make([]string, 0, len(tags))
for k := range tags {
keys = append(keys, k)
}
sort.Strings(keys)
parts := make([]string, 0, len(keys))
for _, k := range keys {
parts = append(parts, k+"="+tags[k])
}
return strings.Join(parts, ",")
}

View File

@@ -13,6 +13,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/apptest"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
)
// TestSingleVMAgentReloadConfigs verifies that vmagent reload new configurations on SIGHUP signal
@@ -512,3 +513,74 @@ func TestSingleVMAgentCardinalityLimiter(t *testing.T) {
t.Fatalf("unexpected vmagent_daily_series_limit_rows_dropped_total value: %d", v)
}
}
func TestClusterVMAgentForwardMetricsMetadata(t *testing.T) {
tc := apptest.NewTestCase(t)
defer tc.Stop()
sut := tc.MustStartDefaultCluster()
vmagent := tc.MustStartVmagent("vmagent", []string{
`-remoteWrite.flushInterval=50ms`,
`-remoteWrite.forcePromProto=true`,
`-enableMultitenantHandlers=true`,
"-remoteWrite.tmpDataPath=" + tc.Dir() + "/vmagent",
fmt.Sprintf(`-remoteWrite.url=http://%s/insert/multitenant/prometheus/api/v1/write`, sut.Vminsert.HTTPAddr()),
}, ``)
prometheusRemoteWriteDataSet := prompb.WriteRequest{
Metadata: []prompb.MetricMetadata{
{MetricFamilyName: "metric_name_4", Help: "some help message", Type: prompb.MetricTypeSummary, AccountID: 100},
},
}
vmagent.PrometheusAPIV1Write(t, prometheusRemoteWriteDataSet, apptest.QueryOpts{Tenant: "multitenant"})
tc.Assert(&apptest.AssertOptions{
Msg: "unexpected /api/v1/metadata response",
Got: func() any {
return sut.PrometheusAPIV1Metadata(t, ``, -1, apptest.QueryOpts{Tenant: "100:0"})
},
Want: &apptest.PrometheusAPIV1Metadata{
Status: "success",
Data: map[string][]apptest.MetadataEntry{
"metric_name_4": {{Help: "some help message", Type: "summary"}},
},
},
})
prometheusRemoteWriteDataSet = prompb.WriteRequest{
Metadata: []prompb.MetricMetadata{
{MetricFamilyName: "metric_name_6", Help: "some help message", Type: prompb.MetricTypeSummary, AccountID: 100},
},
}
// enforce tenant from request uri /insert/tenant_id/prometheus/api/v1/write
vmagent.PrometheusAPIV1Write(t, prometheusRemoteWriteDataSet, apptest.QueryOpts{Tenant: "500:500"})
tc.Assert(&apptest.AssertOptions{
Msg: "unexpected /api/v1/metadata response",
Got: func() any {
return sut.PrometheusAPIV1Metadata(t, ``, -1, apptest.QueryOpts{Tenant: "500:500"})
},
Want: &apptest.PrometheusAPIV1Metadata{
Status: "success",
Data: map[string][]apptest.MetadataEntry{
"metric_name_6": {{Help: "some help message", Type: "summary"}},
},
},
})
tc.Assert(&apptest.AssertOptions{
Msg: "unexpected /api/v1/metadata response",
Got: func() any {
return sut.PrometheusAPIV1Metadata(t, ``, -1, apptest.QueryOpts{Tenant: "multitenant"})
},
Want: &apptest.PrometheusAPIV1Metadata{
Status: "success",
Data: map[string][]apptest.MetadataEntry{
"metric_name_4": {{Help: "some help message", Type: "summary"}},
"metric_name_6": {{Help: "some help message", Type: "summary"}},
},
},
})
}

View File

@@ -0,0 +1,167 @@
package tests
import (
"fmt"
"testing"
"time"
"github.com/google/go-cmp/cmp"
"github.com/google/go-cmp/cmp/cmpopts"
"github.com/VictoriaMetrics/VictoriaMetrics/apptest"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
)
func TestSingleVmctlOpenTSDBProtocol(t *testing.T) {
fs.MustRemoveDir(t.Name())
tc := apptest.NewTestCase(t)
defer tc.Stop()
vmsingleDst := tc.MustStartDefaultVmsingle()
vmAddr := fmt.Sprintf("http://%s/", vmsingleDst.HTTPAddr())
// Generate 60 points at 1-minute intervals starting 2 hours ago.
// This ensures data falls within vmctl's default query window (now - retention).
baseTS := time.Now().Add(-2 * time.Hour).Truncate(time.Minute).Unix()
points := make([]openTSDBPoint, 0, 60)
for i := range 60 {
points = append(points, openTSDBPoint{
Metric: "test.cpu",
Tags: map[string]string{"host": "h1", "env": "prod"},
Timestamp: baseTS + int64(i*60),
Value: float64(i),
})
}
otsdb := newOpenTSDBMockServer(t, points)
defer otsdb.close()
vmctlFlags := []string{
`opentsdb`,
`--otsdb-addr=` + otsdb.httpAddr(),
`--vm-addr=` + vmAddr,
`--otsdb-retentions=ssum-1m-avg:1d:1d`,
`--otsdb-filters=test`,
`--otsdb-normalize`,
`--disable-progress-bar=true`,
`-s`,
}
testOpenTSDBProtocol(tc, vmsingleDst, vmctlFlags, points, "test_cpu", baseTS)
}
func TestClusterVmctlOpenTSDBProtocol(t *testing.T) {
fs.MustRemoveDir(t.Name())
tc := apptest.NewTestCase(t)
defer tc.Stop()
cluster := tc.MustStartDefaultCluster()
vmAddr := fmt.Sprintf("http://%s/", cluster.Vminsert.HTTPAddr())
// Generate 60 points at 1-minute intervals starting 2 hours ago.
baseTS := time.Now().Add(-2 * time.Hour).Truncate(time.Minute).Unix()
points := make([]openTSDBPoint, 0, 60)
for i := range 60 {
points = append(points, openTSDBPoint{
Metric: "test.mem",
Tags: map[string]string{"host": "h1"},
Timestamp: baseTS + int64(i*60),
Value: float64(i * 2),
})
}
otsdb := newOpenTSDBMockServer(t, points)
defer otsdb.close()
vmctlFlags := []string{
`opentsdb`,
`--otsdb-addr=` + otsdb.httpAddr(),
`--vm-addr=` + vmAddr,
`--otsdb-retentions=sum-1m-avg:1d:1d`,
`--otsdb-filters=test`,
`--otsdb-normalize`,
`--disable-progress-bar=true`,
`--vm-account-id=0`,
`-s`,
}
testOpenTSDBProtocol(tc, cluster, vmctlFlags, points, "test_mem", baseTS)
}
func testOpenTSDBProtocol(
tc *apptest.TestCase,
queries apptest.PrometheusWriteQuerier,
vmctlFlags []string,
points []openTSDBPoint,
vmMetricName string,
baseTS int64,
) {
t := tc.T()
t.Helper()
// Build dynamic time range covering all data points with 1-hour padding.
queryStart := time.Unix(baseTS-3600, 0).UTC().Format(time.RFC3339)
queryEnd := time.Unix(baseTS+7200, 0).UTC().Format(time.RFC3339)
cmpOpt := cmpopts.IgnoreFields(apptest.PrometheusAPIV1QueryResponse{}, "Status", "Data.ResultType")
got := queries.PrometheusAPIV1Query(t, `{__name__=~".*"}`, apptest.QueryOpts{
Step: "5m",
Time: queryStart,
})
want := apptest.NewPrometheusAPIV1QueryResponse(t, `{"data":{"result":[]}}`)
if diff := cmp.Diff(want, got, cmpOpt); diff != "" {
t.Errorf("unexpected response (-want, +got):\n%s", diff)
}
tc.MustStartVmctl("vmctl", vmctlFlags)
queries.ForceFlush(t)
expected := buildExpectedOpenTSDBResult(points, vmMetricName)
tc.Assert(&apptest.AssertOptions{
Retries: 300,
Msg: `unexpected metrics stored via opentsdb protocol`,
Got: func() any {
r := queries.PrometheusAPIV1Export(t, fmt.Sprintf(`{__name__=%q}`, vmMetricName), apptest.QueryOpts{
Start: queryStart,
End: queryEnd,
})
r.Sort()
return r.Data.Result
},
Want: expected,
CmpOpts: []cmp.Option{
cmpopts.IgnoreFields(apptest.PrometheusAPIV1QueryResponse{}, "Status", "Data.ResultType"),
},
})
}
func buildExpectedOpenTSDBResult(points []openTSDBPoint, vmMetricName string) []*apptest.QueryResult {
grouped := map[string]*apptest.QueryResult{}
for _, p := range points {
metric := map[string]string{"__name__": vmMetricName}
for k, v := range p.Tags {
metric[k] = v
}
key := tagsKey(metric)
if _, ok := grouped[key]; !ok {
grouped[key] = &apptest.QueryResult{Metric: metric}
}
grouped[key].Samples = append(grouped[key].Samples, &apptest.Sample{
Timestamp: p.Timestamp * 1000,
Value: p.Value,
})
}
out := make([]*apptest.QueryResult, 0, len(grouped))
for _, v := range grouped {
out = append(out, v)
}
resp := apptest.PrometheusAPIV1QueryResponse{
Data: &apptest.QueryData{Result: out},
}
resp.Sort()
return resp.Data.Result
}

View File

@@ -10,6 +10,10 @@ import (
"syscall"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prommetadata"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/golang/snappy"
)
// Vmagent holds the state of a vmagent app and provides vmagent-specific functions
@@ -158,6 +162,31 @@ func (app *Vmagent) ReloadRelabelConfigs(t *testing.T) {
t.Fatalf("relabel configs were not reloaded after SIGHUP signal; previous total: %f, current total: %f", prevTotal, currTotal)
}
// PrometheusAPIV1Write is a test helper function that inserts a
// collection of records in Prometheus remote-write format by sending a HTTP
// POST request to /prometheus/api/v1/write vmagent endpoint.
func (app *Vmagent) PrometheusAPIV1Write(t *testing.T, wr prompb.WriteRequest, opts QueryOpts) {
t.Helper()
url := fmt.Sprintf("http://%s/prometheus/api/v1/write", app.httpListenAddr)
if opts.Tenant != "" {
url = fmt.Sprintf("http://%s/insert/%s/prometheus/api/v1/write", app.httpListenAddr, opts.Tenant)
}
data := snappy.Encode(nil, wr.MarshalProtobuf(nil))
recordsCount := len(wr.Timeseries)
if prommetadata.IsEnabled() {
recordsCount += len(wr.Metadata)
}
headers := opts.getHeaders()
headers.Set("Content-Type", "application/x-protobuf")
app.sendBlocking(t, recordsCount, func() {
_, statusCode := app.cli.Post(t, url, data, headers)
if statusCode != http.StatusNoContent {
t.Fatalf("unexpected status code: got %d, want %d", statusCode, http.StatusNoContent)
}
})
}
// HTTPAddr returns the address at which the vmagent process is listening
// for http connections.
func (app *Vmagent) HTTPAddr() string {
@@ -176,16 +205,22 @@ func (app *Vmagent) HTTPAddr() string {
// If it is, then the data has been sent to vmstorage.
//
// Unreliable if the records are inserted concurrently.
func (app *Vmagent) sendBlocking(t *testing.T, numRecordsToSend int, send func()) {
func (app *Vmagent) sendBlocking(t *testing.T, _ int, send func()) {
t.Helper()
currRowsSentCount := app.remoteWriteRequestsTotal(t)
send()
const (
retries = 20
period = 100 * time.Millisecond
)
wantRowsSentCount := app.remoteWriteRequestsTotal(t) + numRecordsToSend
// TODO: properly account wantRowsSentCount
// currently vmagent doesn't expose per time-series write information
// so we can only account number of blocks sent via remote write protocol
// it should be suitable for tests purpose
wantRowsSentCount := currRowsSentCount + 1
for range retries {
if app.remoteWriteRequestsTotal(t) >= wantRowsSentCount {
return

View File

@@ -106,7 +106,7 @@ func (app *Vminsert) HTTPAddr() string {
func (app *Vminsert) InfluxWrite(t *testing.T, records []string, opts QueryOpts) {
t.Helper()
url := fmt.Sprintf("http://%s/insert/%s/influx/write", app.httpListenAddr, opts.Tenant)
url := fmt.Sprintf("http://%s/insert/%s/influx/write", app.httpListenAddr, opts.getTenant())
uv := opts.asURLValues()
uvs := uv.Encode()
if len(uvs) > 0 {
@@ -141,7 +141,7 @@ func (app *Vminsert) GraphiteWrite(t *testing.T, records []string, _ QueryOpts)
func (app *Vminsert) PrometheusAPIV1ImportCSV(t *testing.T, records []string, opts QueryOpts) {
t.Helper()
url := fmt.Sprintf("http://%s/insert/%s/prometheus/api/v1/import/csv", app.httpListenAddr, opts.Tenant)
url := fmt.Sprintf("http://%s/insert/%s/prometheus/api/v1/import/csv", app.httpListenAddr, opts.getTenant())
uv := opts.asURLValues()
uvs := uv.Encode()
if len(uvs) > 0 {
@@ -166,7 +166,7 @@ func (app *Vminsert) PrometheusAPIV1ImportCSV(t *testing.T, records []string, op
func (app *Vminsert) PrometheusAPIV1ImportNative(t *testing.T, data []byte, opts QueryOpts) {
t.Helper()
url := fmt.Sprintf("http://%s/insert/%s/prometheus/api/v1/import/native", app.httpListenAddr, opts.Tenant)
url := fmt.Sprintf("http://%s/insert/%s/prometheus/api/v1/import/native", app.httpListenAddr, opts.getTenant())
uv := opts.asURLValues()
uvs := uv.Encode()
if len(uvs) > 0 {
@@ -190,7 +190,7 @@ func (app *Vminsert) PrometheusAPIV1ImportNative(t *testing.T, data []byte, opts
func (app *Vminsert) OpenTSDBAPIPut(t *testing.T, records []string, opts QueryOpts) {
t.Helper()
url := fmt.Sprintf("http://%s/insert/%s/opentsdb/api/put", app.openTSDBListenAddr, opts.Tenant)
url := fmt.Sprintf("http://%s/insert/%s/opentsdb/api/put", app.openTSDBListenAddr, opts.getTenant())
uv := opts.asURLValues()
uvs := uv.Encode()
if len(uvs) > 0 {
@@ -213,7 +213,7 @@ func (app *Vminsert) OpenTSDBAPIPut(t *testing.T, records []string, opts QueryOp
func (app *Vminsert) PrometheusAPIV1Write(t *testing.T, wr prompb.WriteRequest, opts QueryOpts) {
t.Helper()
url := fmt.Sprintf("http://%s/insert/%s/prometheus/api/v1/write", app.httpListenAddr, opts.Tenant)
url := fmt.Sprintf("http://%s/insert/%s/prometheus/api/v1/write", app.httpListenAddr, opts.getTenant())
data := snappy.Encode(nil, wr.MarshalProtobuf(nil))
recordsCount := len(wr.Timeseries)
if prommetadata.IsEnabled() {
@@ -238,7 +238,7 @@ func (app *Vminsert) PrometheusAPIV1Write(t *testing.T, wr prompb.WriteRequest,
func (app *Vminsert) PrometheusAPIV1ImportPrometheus(t *testing.T, records []string, opts QueryOpts) {
t.Helper()
url := fmt.Sprintf("http://%s/insert/%s/prometheus/api/v1/import/prometheus", app.httpListenAddr, opts.Tenant)
url := fmt.Sprintf("http://%s/insert/%s/prometheus/api/v1/import/prometheus", app.httpListenAddr, opts.getTenant())
uv := opts.asURLValues()
uvs := uv.Encode()
if len(uvs) > 0 {
@@ -287,7 +287,7 @@ func (app *Vminsert) PrometheusAPIV1ImportPrometheus(t *testing.T, records []str
func (app *Vminsert) ZabbixConnectorHistory(t *testing.T, records []string, opts QueryOpts) {
t.Helper()
url := fmt.Sprintf("http://%s/insert/%s/zabbixconnector/api/v1/history", app.httpListenAddr, opts.Tenant)
url := fmt.Sprintf("http://%s/insert/%s/zabbixconnector/api/v1/history", app.httpListenAddr, opts.getTenant())
uv := opts.asURLValues()
uvs := uv.Encode()
if len(uvs) > 0 {

View File

@@ -72,7 +72,7 @@ func (app *Vmselect) HTTPAddr() string {
func (app *Vmselect) PrometheusAPIV1Export(t *testing.T, query string, opts QueryOpts) *PrometheusAPIV1QueryResponse {
t.Helper()
exportURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/export", app.httpListenAddr, opts.Tenant)
exportURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/export", app.httpListenAddr, opts.getTenant())
values := opts.asURLValues()
values.Add("match[]", query)
values.Add("format", "promapi")
@@ -88,7 +88,7 @@ func (app *Vmselect) PrometheusAPIV1Export(t *testing.T, query string, opts Quer
func (app *Vmselect) PrometheusAPIV1ExportNative(t *testing.T, query string, opts QueryOpts) []byte {
t.Helper()
exportURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/export/native", app.httpListenAddr, opts.Tenant)
exportURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/export/native", app.httpListenAddr, opts.getTenant())
values := opts.asURLValues()
values.Add("match[]", query)
values.Add("format", "promapi")
@@ -104,7 +104,7 @@ func (app *Vmselect) PrometheusAPIV1ExportNative(t *testing.T, query string, opt
func (app *Vmselect) PrometheusAPIV1Query(t *testing.T, query string, opts QueryOpts) *PrometheusAPIV1QueryResponse {
t.Helper()
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/query", app.httpListenAddr, opts.Tenant)
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/query", app.httpListenAddr, opts.getTenant())
values := opts.asURLValues()
values.Add("query", query)
@@ -120,7 +120,7 @@ func (app *Vmselect) PrometheusAPIV1Query(t *testing.T, query string, opts Query
func (app *Vmselect) PrometheusAPIV1QueryRange(t *testing.T, query string, opts QueryOpts) *PrometheusAPIV1QueryResponse {
t.Helper()
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/query_range", app.httpListenAddr, opts.Tenant)
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/query_range", app.httpListenAddr, opts.getTenant())
values := opts.asURLValues()
values.Add("query", query)
@@ -135,7 +135,7 @@ func (app *Vmselect) PrometheusAPIV1QueryRange(t *testing.T, query string, opts
func (app *Vmselect) PrometheusAPIV1Series(t *testing.T, matchQuery string, opts QueryOpts) *PrometheusAPIV1SeriesResponse {
t.Helper()
seriesURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/series", app.httpListenAddr, opts.Tenant)
seriesURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/series", app.httpListenAddr, opts.getTenant())
values := opts.asURLValues()
values.Add("match[]", matchQuery)
@@ -150,7 +150,7 @@ func (app *Vmselect) PrometheusAPIV1Series(t *testing.T, matchQuery string, opts
func (app *Vmselect) PrometheusAPIV1SeriesCount(t *testing.T, opts QueryOpts) *PrometheusAPIV1SeriesCountResponse {
t.Helper()
seriesURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/series/count", app.httpListenAddr, opts.Tenant)
seriesURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/series/count", app.httpListenAddr, opts.getTenant())
values := opts.asURLValues()
res, _ := app.cli.PostForm(t, seriesURL, values, opts.Headers)
@@ -167,7 +167,7 @@ func (app *Vmselect) PrometheusAPIV1Labels(t *testing.T, matchQuery string, opts
values := opts.asURLValues()
values.Add("match[]", matchQuery)
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/labels", app.httpListenAddr, opts.Tenant)
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/labels", app.httpListenAddr, opts.getTenant())
res, _ := app.cli.PostForm(t, queryURL, values, opts.Headers)
return NewPrometheusAPIV1LabelsResponse(t, res)
}
@@ -181,7 +181,7 @@ func (app *Vmselect) PrometheusAPIV1LabelValues(t *testing.T, labelName, matchQu
values := opts.asURLValues()
values.Add("match[]", matchQuery)
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/label/%s/values", app.httpListenAddr, opts.Tenant, labelName)
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/label/%s/values", app.httpListenAddr, opts.getTenant(), labelName)
res, _ := app.cli.PostForm(t, queryURL, values, opts.Headers)
return NewPrometheusAPIV1LabelValuesResponse(t, res)
@@ -195,7 +195,7 @@ func (app *Vmselect) PrometheusAPIV1Metadata(t *testing.T, metric string, limit
values := opts.asURLValues()
values.Add("metric", metric)
values.Add("limit", strconv.Itoa(limit))
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/metadata", app.httpListenAddr, opts.Tenant)
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/metadata", app.httpListenAddr, opts.getTenant())
res, _ := app.cli.PostForm(t, queryURL, values, opts.Headers)
return NewPrometheusAPIV1Metadata(t, res)
@@ -208,7 +208,7 @@ func (app *Vmselect) PrometheusAPIV1Metadata(t *testing.T, metric string, limit
func (app *Vmselect) APIV1AdminTSDBDeleteSeries(t *testing.T, matchQuery string, opts QueryOpts) {
t.Helper()
queryURL := fmt.Sprintf("http://%s/delete/%s/prometheus/api/v1/admin/tsdb/delete_series", app.httpListenAddr, opts.Tenant)
queryURL := fmt.Sprintf("http://%s/delete/%s/prometheus/api/v1/admin/tsdb/delete_series", app.httpListenAddr, opts.getTenant())
values := opts.asURLValues()
values.Add("match[]", matchQuery)
@@ -229,7 +229,7 @@ func (app *Vmselect) MetricNamesStats(t *testing.T, limit, le, matchPattern stri
values.Add("limit", limit)
values.Add("le", le)
values.Add("match_pattern", matchPattern)
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/status/metric_names_stats", app.httpListenAddr, opts.Tenant)
queryURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/status/metric_names_stats", app.httpListenAddr, opts.getTenant())
res, statusCode := app.cli.PostForm(t, queryURL, values, opts.Headers)
if statusCode != http.StatusOK {
@@ -263,7 +263,7 @@ func (app *Vmselect) MetricNamesStatsReset(t *testing.T, opts QueryOpts) {
func (app *Vmselect) APIV1StatusTSDB(t *testing.T, matchQuery string, date string, topN string, opts QueryOpts) TSDBStatusResponse {
t.Helper()
seriesURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/status/tsdb", app.httpListenAddr, opts.Tenant)
seriesURL := fmt.Sprintf("http://%s/select/%s/prometheus/api/v1/status/tsdb", app.httpListenAddr, opts.getTenant())
values := opts.asURLValues()
addNonEmpty := func(name, value string) {
if len(value) == 0 {
@@ -294,7 +294,7 @@ func (app *Vmselect) APIV1StatusTSDB(t *testing.T, matchQuery string, date strin
func (app *Vmselect) GraphiteMetricsIndex(t *testing.T, opts QueryOpts) GraphiteMetricsIndexResponse {
t.Helper()
seriesURL := fmt.Sprintf("http://%s/select/%s/graphite/metrics/index.json", app.httpListenAddr, opts.Tenant)
seriesURL := fmt.Sprintf("http://%s/select/%s/graphite/metrics/index.json", app.httpListenAddr, opts.getTenant())
res, statusCode := app.cli.Get(t, seriesURL, opts.Headers)
if statusCode != http.StatusOK {
t.Fatalf("unexpected status code: got %d, want %d, resp text=%q", statusCode, http.StatusOK, res)
@@ -313,7 +313,7 @@ func (app *Vmselect) GraphiteMetricsIndex(t *testing.T, opts QueryOpts) Graphite
func (app *Vmselect) GraphiteTagsTagSeries(t *testing.T, record string, opts QueryOpts) {
t.Helper()
url := fmt.Sprintf("http://%s/select/%s/graphite/tags/tagSeries", app.httpListenAddr, opts.Tenant)
url := fmt.Sprintf("http://%s/select/%s/graphite/tags/tagSeries", app.httpListenAddr, opts.getTenant())
values := opts.asURLValues()
values.Add("path", record)
@@ -326,7 +326,7 @@ func (app *Vmselect) GraphiteTagsTagSeries(t *testing.T, record string, opts Que
func (app *Vmselect) GraphiteTagsTagMultiSeries(t *testing.T, records []string, opts QueryOpts) {
t.Helper()
url := fmt.Sprintf("http://%s/select/%s/graphite/tags/tagMultiSeries", app.httpListenAddr, opts.Tenant)
url := fmt.Sprintf("http://%s/select/%s/graphite/tags/tagMultiSeries", app.httpListenAddr, opts.getTenant())
values := opts.asURLValues()
for _, rec := range records {
values.Add("path", rec)

View File

@@ -9093,18 +9093,20 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "* `*` - unsupported query path\n* `/write` - insert into VM\n* `/metrics` - query VM system metrics\n* `/query` - query instant values\n* `/query_range` - query over a range of time\n* `/series` - match a certain label set\n* `/label/{}/values` - query a list of label values (variables mostly)",
"description": "Shows the rate of data rows ingested from write requests. There are two kinds of rows:\n1. Raw sample: each sample consists of a value and a timestamp, see https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples. \n2. Metric metadata: refers to descriptive information about metrics. It can be disabled by setting `-enableMetadata=false`, see https://docs.victoriametrics.com/victoriametrics/vmagent/#metric-metadata.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9113,6 +9115,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9120,6 +9123,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9129,6 +9133,7 @@
"mode": "off"
}
},
"decimals": 2,
"links": [],
"mappings": [],
"min": 0,
@@ -9136,7 +9141,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9152,9 +9158,9 @@
"h": 8,
"w": 12,
"x": 0,
"y": 8385
"y": 43
},
"id": 97,
"id": 227,
"options": {
"legend": {
"calcs": [
@@ -9169,11 +9175,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9181,15 +9188,30 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_http_requests_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) by (path) > 0",
"expr": "sum(rate(vm_rows_inserted_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (type) > 0 ",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "__auto",
"legendFormat": "sample: {{type}}",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_metadata_rows_inserted_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (type) > 0 ",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "metadata: {{type}}",
"range": true,
"refId": "B"
}
],
"title": "Requests rate ($instance)",
"title": "Rows rate ($instance)",
"type": "timeseries"
},
{
@@ -9204,11 +9226,13 @@
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
@@ -9217,6 +9241,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9224,6 +9249,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9241,7 +9267,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9277,7 +9304,7 @@
"h": 8,
"w": 12,
"x": 12,
"y": 8385
"y": 43
},
"id": 99,
"options": {
@@ -9294,11 +9321,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9336,17 +9364,20 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "* `*` - unsupported query path\n* `/write` - insert into VM\n* `/metrics` - query VM system metrics\n* `/query` - query instant values\n* `/query_range` - query over a range of time\n* `/series` - match a certain label set\n* `/label/{}/values` - query a list of label values (variables mostly)",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9355,6 +9386,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9362,6 +9394,116 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"links": [],
"mappings": [],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 51
},
"id": 97,
"options": {
"legend": {
"calcs": [
"mean",
"lastNotNull",
"max"
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true,
"sortBy": "Last *",
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_http_requests_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) by (path) > 0",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "__auto",
"range": true,
"refId": "A"
}
],
"title": "Requests rate ($instance)",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9384,7 +9526,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9399,8 +9542,8 @@
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 8393
"x": 12,
"y": 51
},
"id": 185,
"options": {
@@ -9417,11 +9560,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "none"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9447,7 +9591,6 @@
"exemplar": true,
"expr": "min(\n rate(process_cpu_seconds_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])\n /\n process_cpu_cores_available{job=~\"$job_insert\", instance=~\"$instance\"}\n)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "min",
@@ -9463,7 +9606,6 @@
"exemplar": true,
"expr": "quantile(0.5,\n rate(process_cpu_seconds_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])\n /\n process_cpu_cores_available{job=~\"$job_insert\", instance=~\"$instance\"}\n)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "median",
@@ -9486,11 +9628,13 @@
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9499,6 +9643,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9506,6 +9651,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9528,7 +9674,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9543,8 +9690,8 @@
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8393
"x": 0,
"y": 59
},
"id": 187,
"options": {
@@ -9561,11 +9708,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "none"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9591,7 +9739,6 @@
"exemplar": true,
"expr": "min(\n max_over_time(process_resident_memory_anon_bytes{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])\n /\n vm_available_memory_bytes{job=~\"$job_insert\", instance=~\"$instance\"}\n)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "min",
@@ -9607,7 +9754,6 @@
"exemplar": true,
"expr": "quantile(0.5,\n max_over_time(process_resident_memory_anon_bytes{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])\n /\n vm_available_memory_bytes{job=~\"$job_insert\", instance=~\"$instance\"}\n)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "median",
@@ -9623,18 +9769,20 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows the saturation level of connection between vminsert and vmstorage components. \n\nIf the threshold of 0.9sec is reached, then the connection is saturated by more than 90% and vminsert won't be able to keep up. This usually means that either vminsert or vmstorage nodes are struggling with the load. Verify CPU/mem saturation of both components and network saturation between them.\nIf vminsert resources are saturated - consider adding more resources or scale vminserts horizontally.\n\nIf vminsert resources and network are fine, check vmstorage metrics for anomalies.",
"description": "Shows when vmstorage node is unreachable for vminsert.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9643,6 +9791,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9650,6 +9799,116 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"decimals": 0,
"links": [],
"mappings": [],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 59
},
"id": 114,
"options": {
"legend": {
"calcs": [
"mean",
"lastNotNull"
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "vm_rpc_vmstorage_is_reachable{job=~\"$job\", instance=~\"$instance\"} != 1",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "{{instance}} => {{addr}}",
"range": true,
"refId": "A"
}
],
"title": "Storage reachability ($instance)",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows the saturation level of connection between vminsert and vmstorage components. \n\nIf the threshold of 0.9sec is reached, then the connection is saturated by more than 90% and vminsert won't be able to keep up. This usually means that either vminsert or vmstorage nodes are struggling with the load. Verify CPU/mem saturation of both components and network saturation between them.\nIf vminsert resources are saturated - consider adding more resources or scale vminserts horizontally.\n\nIf vminsert resources and network are fine, check vmstorage metrics for anomalies.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9667,7 +9926,8 @@
"mode": "absolute",
"steps": [
{
"color": "transparent"
"color": "transparent",
"value": 0
},
{
"color": "red",
@@ -9683,7 +9943,7 @@
"h": 8,
"w": 12,
"x": 0,
"y": 8401
"y": 67
},
"id": 139,
"options": {
@@ -9699,11 +9959,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9729,18 +9990,20 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows when vmstorage node is unreachable for vminsert.",
"description": "Shows network usage between vminserts and vmstorages.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9749,6 +10012,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9756,6 +10020,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9765,15 +10030,14 @@
"mode": "off"
}
},
"decimals": 0,
"links": [],
"mappings": [],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9781,17 +10045,30 @@
}
]
},
"unit": "short"
"unit": "bps"
},
"overrides": []
"overrides": [
{
"matcher": {
"id": "byRegexp",
"options": "/read.*/"
},
"properties": [
{
"id": "custom.transform",
"value": "negative-Y"
}
]
}
]
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8401
"y": 67
},
"id": 114,
"id": 209,
"options": {
"legend": {
"calcs": [
@@ -9800,14 +10077,17 @@
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
"showLegend": true,
"sortBy": "Last *",
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9815,16 +10095,28 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "vm_rpc_vmstorage_is_reachable{job=~\"$job\", instance=~\"$instance\"} != 1",
"expr": "sum(rate(vm_tcpdialer_read_bytes_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) * 8 > 0",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "{{instance}} => {{addr}}",
"legendFormat": "read from vmstorage",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_tcpdialer_written_bytes_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) * 8 > 0",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "write to vmstorage",
"range": true,
"refId": "B"
}
],
"title": "Storage reachability ($instance)",
"title": "Network usage: vmstorage ($instance)",
"type": "timeseries"
},
{
@@ -9839,11 +10131,13 @@
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9852,6 +10146,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9859,6 +10154,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9874,7 +10170,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9903,7 +10200,7 @@
"h": 8,
"w": 12,
"x": 0,
"y": 8409
"y": 75
},
"id": 208,
"options": {
@@ -9919,11 +10216,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9946,7 +10244,6 @@
"editorMode": "code",
"expr": "sum(rate(vm_tcplistener_written_bytes_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) * 8 > 0",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
"legendFormat": "write to client",
"range": true,
@@ -9961,147 +10258,19 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows network usage between vminserts and vmstorages.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "bps"
},
"overrides": [
{
"matcher": {
"id": "byRegexp",
"options": "/read.*/"
},
"properties": [
{
"id": "custom.transform",
"value": "negative-Y"
}
]
}
]
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8409
},
"id": 209,
"options": {
"legend": {
"calcs": [
"mean",
"lastNotNull"
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true,
"sortBy": "Last *",
"sortDesc": true
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_tcpdialer_read_bytes_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) * 8 > 0",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "read from vmstorage",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_tcpdialer_written_bytes_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) * 8 > 0",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
"legendFormat": "write to vmstorage",
"range": true,
"refId": "B"
}
],
"title": "Network usage: vmstorage ($instance)",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"description": "",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -10110,6 +10279,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -10117,6 +10287,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -10134,7 +10305,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -10150,7 +10322,7 @@
"h": 8,
"w": 12,
"x": 12,
"y": 8417
"y": 75
},
"id": 88,
"options": {
@@ -10167,11 +10339,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {

View File

@@ -9094,18 +9094,20 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "* `*` - unsupported query path\n* `/write` - insert into VM\n* `/metrics` - query VM system metrics\n* `/query` - query instant values\n* `/query_range` - query over a range of time\n* `/series` - match a certain label set\n* `/label/{}/values` - query a list of label values (variables mostly)",
"description": "Shows the rate of data rows ingested from write requests. There are two kinds of rows:\n1. Raw sample: each sample consists of a value and a timestamp, see https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples. \n2. Metric metadata: refers to descriptive information about metrics. It can be disabled by setting `-enableMetadata=false`, see https://docs.victoriametrics.com/victoriametrics/vmagent/#metric-metadata.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9114,6 +9116,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9121,6 +9124,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9130,6 +9134,7 @@
"mode": "off"
}
},
"decimals": 2,
"links": [],
"mappings": [],
"min": 0,
@@ -9137,7 +9142,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9153,9 +9159,9 @@
"h": 8,
"w": 12,
"x": 0,
"y": 8385
"y": 43
},
"id": 97,
"id": 227,
"options": {
"legend": {
"calcs": [
@@ -9170,11 +9176,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9182,15 +9189,30 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_http_requests_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) by (path) > 0",
"expr": "sum(rate(vm_rows_inserted_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (type) > 0 ",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "__auto",
"legendFormat": "sample: {{type}}",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_metadata_rows_inserted_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (type) > 0 ",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "metadata: {{type}}",
"range": true,
"refId": "B"
}
],
"title": "Requests rate ($instance)",
"title": "Rows rate ($instance)",
"type": "timeseries"
},
{
@@ -9205,11 +9227,13 @@
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
@@ -9218,6 +9242,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9225,6 +9250,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9242,7 +9268,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9278,7 +9305,7 @@
"h": 8,
"w": 12,
"x": 12,
"y": 8385
"y": 43
},
"id": 99,
"options": {
@@ -9295,11 +9322,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9337,17 +9365,20 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "* `*` - unsupported query path\n* `/write` - insert into VM\n* `/metrics` - query VM system metrics\n* `/query` - query instant values\n* `/query_range` - query over a range of time\n* `/series` - match a certain label set\n* `/label/{}/values` - query a list of label values (variables mostly)",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9356,6 +9387,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9363,6 +9395,116 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"links": [],
"mappings": [],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 51
},
"id": 97,
"options": {
"legend": {
"calcs": [
"mean",
"lastNotNull",
"max"
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true,
"sortBy": "Last *",
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_http_requests_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) by (path) > 0",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "__auto",
"range": true,
"refId": "A"
}
],
"title": "Requests rate ($instance)",
"type": "timeseries"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9385,7 +9527,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9400,8 +9543,8 @@
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 8393
"x": 12,
"y": 51
},
"id": 185,
"options": {
@@ -9418,11 +9561,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "none"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9448,7 +9592,6 @@
"exemplar": true,
"expr": "min(\n rate(process_cpu_seconds_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])\n /\n process_cpu_cores_available{job=~\"$job_insert\", instance=~\"$instance\"}\n)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "min",
@@ -9464,7 +9607,6 @@
"exemplar": true,
"expr": "quantile(0.5,\n rate(process_cpu_seconds_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])\n /\n process_cpu_cores_available{job=~\"$job_insert\", instance=~\"$instance\"}\n)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "median",
@@ -9487,11 +9629,13 @@
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9500,6 +9644,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9507,6 +9652,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9529,7 +9675,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9544,8 +9691,8 @@
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8393
"x": 0,
"y": 59
},
"id": 187,
"options": {
@@ -9562,11 +9709,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "none"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9592,7 +9740,6 @@
"exemplar": true,
"expr": "min(\n max_over_time(process_resident_memory_anon_bytes{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])\n /\n vm_available_memory_bytes{job=~\"$job_insert\", instance=~\"$instance\"}\n)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "min",
@@ -9608,7 +9755,6 @@
"exemplar": true,
"expr": "quantile(0.5,\n max_over_time(process_resident_memory_anon_bytes{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])\n /\n vm_available_memory_bytes{job=~\"$job_insert\", instance=~\"$instance\"}\n)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "median",
@@ -9624,18 +9770,20 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows the saturation level of connection between vminsert and vmstorage components. \n\nIf the threshold of 0.9sec is reached, then the connection is saturated by more than 90% and vminsert won't be able to keep up. This usually means that either vminsert or vmstorage nodes are struggling with the load. Verify CPU/mem saturation of both components and network saturation between them.\nIf vminsert resources are saturated - consider adding more resources or scale vminserts horizontally.\n\nIf vminsert resources and network are fine, check vmstorage metrics for anomalies.",
"description": "Shows when vmstorage node is unreachable for vminsert.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9644,6 +9792,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9651,6 +9800,116 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"decimals": 0,
"links": [],
"mappings": [],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 59
},
"id": 114,
"options": {
"legend": {
"calcs": [
"mean",
"lastNotNull"
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "vm_rpc_vmstorage_is_reachable{job=~\"$job\", instance=~\"$instance\"} != 1",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "{{instance}} => {{addr}}",
"range": true,
"refId": "A"
}
],
"title": "Storage reachability ($instance)",
"type": "timeseries"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows the saturation level of connection between vminsert and vmstorage components. \n\nIf the threshold of 0.9sec is reached, then the connection is saturated by more than 90% and vminsert won't be able to keep up. This usually means that either vminsert or vmstorage nodes are struggling with the load. Verify CPU/mem saturation of both components and network saturation between them.\nIf vminsert resources are saturated - consider adding more resources or scale vminserts horizontally.\n\nIf vminsert resources and network are fine, check vmstorage metrics for anomalies.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9668,7 +9927,8 @@
"mode": "absolute",
"steps": [
{
"color": "transparent"
"color": "transparent",
"value": 0
},
{
"color": "red",
@@ -9684,7 +9944,7 @@
"h": 8,
"w": 12,
"x": 0,
"y": 8401
"y": 67
},
"id": 139,
"options": {
@@ -9700,11 +9960,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9730,18 +9991,20 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows when vmstorage node is unreachable for vminsert.",
"description": "Shows network usage between vminserts and vmstorages.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9750,6 +10013,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9757,6 +10021,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9766,15 +10031,14 @@
"mode": "off"
}
},
"decimals": 0,
"links": [],
"mappings": [],
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9782,17 +10046,30 @@
}
]
},
"unit": "short"
"unit": "bps"
},
"overrides": []
"overrides": [
{
"matcher": {
"id": "byRegexp",
"options": "/read.*/"
},
"properties": [
{
"id": "custom.transform",
"value": "negative-Y"
}
]
}
]
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8401
"y": 67
},
"id": 114,
"id": 209,
"options": {
"legend": {
"calcs": [
@@ -9801,14 +10078,17 @@
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
"showLegend": true,
"sortBy": "Last *",
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9816,16 +10096,28 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "vm_rpc_vmstorage_is_reachable{job=~\"$job\", instance=~\"$instance\"} != 1",
"expr": "sum(rate(vm_tcpdialer_read_bytes_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) * 8 > 0",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "{{instance}} => {{addr}}",
"legendFormat": "read from vmstorage",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_tcpdialer_written_bytes_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) * 8 > 0",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "write to vmstorage",
"range": true,
"refId": "B"
}
],
"title": "Storage reachability ($instance)",
"title": "Network usage: vmstorage ($instance)",
"type": "timeseries"
},
{
@@ -9840,11 +10132,13 @@
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -9853,6 +10147,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -9860,6 +10155,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -9875,7 +10171,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -9904,7 +10201,7 @@
"h": 8,
"w": 12,
"x": 0,
"y": 8409
"y": 75
},
"id": 208,
"options": {
@@ -9920,11 +10217,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -9947,7 +10245,6 @@
"editorMode": "code",
"expr": "sum(rate(vm_tcplistener_written_bytes_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) * 8 > 0",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
"legendFormat": "write to client",
"range": true,
@@ -9962,147 +10259,19 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows network usage between vminserts and vmstorages.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "bps"
},
"overrides": [
{
"matcher": {
"id": "byRegexp",
"options": "/read.*/"
},
"properties": [
{
"id": "custom.transform",
"value": "negative-Y"
}
]
}
]
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8409
},
"id": 209,
"options": {
"legend": {
"calcs": [
"mean",
"lastNotNull"
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true,
"sortBy": "Last *",
"sortDesc": true
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"targets": [
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_tcpdialer_read_bytes_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) * 8 > 0",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "read from vmstorage",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_tcpdialer_written_bytes_total{job=~\"$job_insert\", instance=~\"$instance\"}[$__rate_interval])) * 8 > 0",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
"legendFormat": "write to vmstorage",
"range": true,
"refId": "B"
}
],
"title": "Network usage: vmstorage ($instance)",
"type": "timeseries"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -10111,6 +10280,7 @@
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
@@ -10118,6 +10288,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -10135,7 +10306,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -10151,7 +10323,7 @@
"h": 8,
"w": 12,
"x": 12,
"y": 8417
"y": 75
},
"id": 88,
"options": {
@@ -10168,11 +10340,12 @@
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {

View File

@@ -4020,7 +4020,7 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows the rate of parsed datapoints from write or scrape requests.",
"description": "Shows the rate of data rows parsed from write requests and scraping. \nThere are two kinds of rows:\n1. Raw sample: each sample consists of a value and a timestamp, see https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples \n2. Metric metadata: refers to descriptive information about metrics. It can be disabled by setting `-enableMetadata=false`, see https://docs.victoriametrics.com/victoriametrics/vmagent/#metric-metadata.",
"fieldConfig": {
"defaults": {
"color": {
@@ -4050,6 +4050,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -4066,7 +4067,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -4102,7 +4104,7 @@
"sort": "desc"
}
},
"pluginVersion": "11.5.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -4113,12 +4115,24 @@
"exemplar": true,
"expr": "sum(rate(vm_protoparser_rows_read_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, type) > 0",
"interval": "",
"legendFormat": "{{ type }} ({{job}})",
"legendFormat": "sample: {{ type }} ({{job}})",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_protoparser_metadata_read_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, type) > 0",
"instant": false,
"legendFormat": "metadata: {{ type }} ({{job}})",
"range": true,
"refId": "B"
}
],
"title": "Datapoints rate ($instance)",
"title": "Rows rate ($instance)",
"type": "timeseries"
},
{
@@ -5722,7 +5736,7 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows the rate of rows ingested in vmagent via push protocols.",
"description": "Shows the rate of data rows ingested from write requests. There are two kinds of rows:\n1. Raw sample: each sample consists of a value and a timestamp, see https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples. \n2. Metric metadata: refers to descriptive information about metrics. It can be disabled by setting `-enableMetadata=false`, see https://docs.victoriametrics.com/victoriametrics/vmagent/#metric-metadata.",
"fieldConfig": {
"defaults": {
"color": {
@@ -5735,6 +5749,7 @@
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -5751,6 +5766,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -5767,7 +5783,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -5783,7 +5800,7 @@
"h": 8,
"w": 12,
"x": 12,
"y": 3931
"y": 41
},
"id": 131,
"options": {
@@ -5798,11 +5815,12 @@
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.2.6",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -5813,12 +5831,24 @@
"exemplar": true,
"expr": "sum(rate(vmagent_rows_inserted_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, type) > 0",
"interval": "",
"legendFormat": "{{ type }} ({{job}})",
"legendFormat": "sample: {{ type }} ({{job}})",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmagent_metadata_inserted_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, type) > 0",
"instant": false,
"legendFormat": "metadata:{{ type }} ({{job}})",
"range": true,
"refId": "B"
}
],
"title": "Rows rate ($instance)",
"title": "Rows rate ($instance)",
"type": "timeseries"
},
{

View File

@@ -202,6 +202,7 @@
"defaults": {
"mappings": [],
"min": 0,
"noValue": "0",
"thresholds": {
"mode": "absolute",
"steps": [
@@ -239,7 +240,7 @@
"textMode": "auto",
"wideLayout": true
},
"pluginVersion": "12.2.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -561,8 +562,7 @@
"value": 80
}
]
},
"unit": "s"
}
},
"overrides": []
},
@@ -599,7 +599,7 @@
},
"editorMode": "code",
"exemplar": false,
"expr": "sum(min_over_time(vm_app_uptime_seconds{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (job)",
"expr": "sum(min_over_time(up{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (job)",
"format": "time_series",
"instant": false,
"interval": "",
@@ -623,6 +623,342 @@
"title": "Overview",
"type": "row"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows the rate of requests per user.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 9
},
"id": 11,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_user_requests_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by(username) > 0",
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_requests_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"legendFormat": "unauthorized_user",
"range": true,
"refId": "B"
}
],
"title": "Requests rate",
"type": "timeseries"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows duration in seconds of user requests by quantile.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
},
"unit": "s"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 9
},
"id": 19,
"options": {
"legend": {
"calcs": [
"max",
"mean"
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true,
"sortBy": "Max",
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "max(vmauth_user_request_duration_seconds{job=~\"$job\", instance=~\"$instance\", username=~\"$user\", quantile=~\"(0.99|0.5)\"}) by (quantile, username) > 0",
"legendFormat": "user: {{username}} q: {{ quantile}}",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "max(vmauth_unauthorized_user_request_duration_seconds{job=~\"$job\", instance=~\"$instance\", quantile=~\"(0.99|0.5)\"}) by (quantile) > 0",
"legendFormat": "user: unauthorized q: {{ quantile}}",
"range": true,
"refId": "B"
}
],
"title": "Requests duration",
"type": "timeseries"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows the request error rate per user.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 18
},
"id": 37,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_user_request_errors_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by (username) > 0",
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_request_errors_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"legendFormat": "unauthorized_user",
"range": true,
"refId": "B"
}
],
"title": "Requests error rate",
"type": "timeseries"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
@@ -688,8 +1024,8 @@
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 9
"x": 12,
"y": 18
},
"id": 16,
"options": {
@@ -705,7 +1041,7 @@
"sort": "none"
}
},
"pluginVersion": "12.3.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -713,8 +1049,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_http_request_errors_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (reason)",
"hide": false,
"expr": "sum(rate(vmauth_http_request_errors_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (reason) > 0",
"legendFormat": "__auto",
"range": true,
"refId": "A"
@@ -728,353 +1063,7 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": " The number of concurrent connections processed by vmauth reached one of limits. Possible solutions:\n- increase global limit with flag -maxConcurrentRequests\n- increase limit with flag: -maxConcurrentPerUserRequests for all users or with config option `max_concurrent_requests` per user.\n- deploy additional vmauth replicas\n- check requests latency at backend service and allocate resources to it if needed",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "bars",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 9
},
"id": 10,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.3.0",
"targets": [
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_user_concurrent_requests_limit_reached_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by(username) > 0",
"interval": "1m",
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_concurrent_requests_limit_reached_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"hide": false,
"legendFormat": "unauthorized",
"range": true,
"refId": "B"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_concurrent_requests_limit_reached_total[$__rate_interval])) > 0",
"hide": false,
"legendFormat": "global at {{ $instance }}",
"range": true,
"refId": "C"
}
],
"title": "Concurrent limit reached",
"type": "timeseries"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 18
},
"id": 11,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.3.0",
"targets": [
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_user_requests_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by(username)",
"hide": false,
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_requests_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval]))",
"hide": false,
"legendFormat": "unauthorized_user",
"range": true,
"refId": "B"
}
],
"title": "User requests rate",
"type": "timeseries"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 18
},
"id": 37,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.3.0",
"targets": [
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_user_request_errors_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by (username) > 0",
"hide": false,
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_request_errors_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"hide": false,
"legendFormat": "unauthorized_user",
"range": true,
"refId": "B"
}
],
"title": "User requests error rate",
"type": "timeseries"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows percent utilization of per concurrent requests capacity.",
"description": "Shows the percentage utilization of concurrent request capacity per user.",
"fieldConfig": {
"defaults": {
"color": {
@@ -1158,7 +1147,7 @@
"sort": "desc"
}
},
"pluginVersion": "12.3.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -1167,14 +1156,13 @@
},
"editorMode": "code",
"expr": "max(\nmax_over_time(vmauth_user_concurrent_requests_current{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])\n/ \nvmauth_user_concurrent_requests_capacity{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}\n) by(username) > 0\n",
"hide": false,
"interval": "5m",
"legendFormat": "__auto",
"range": true,
"refId": "A"
}
],
"title": "User concurrent requests usage",
"title": "Concurrent requests usage",
"type": "timeseries"
},
{
@@ -1182,7 +1170,7 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows duration in seconds of user requests by quantile.",
"description": " The number of concurrent connections processed by vmauth reached one of limits. Possible solutions:\n- increase global limit with flag -maxConcurrentRequests\n- increase limit with flag: -maxConcurrentPerUserRequests for all users or with config option `max_concurrent_requests` per user.\n- deploy additional vmauth replicas\n- check requests latency at backend service and allocate resources to it if needed",
"fieldConfig": {
"defaults": {
"color": {
@@ -1196,7 +1184,7 @@
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"drawStyle": "bars",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
@@ -1235,8 +1223,7 @@
"value": 80
}
]
},
"unit": "s"
}
},
"overrides": []
},
@@ -1246,18 +1233,13 @@
"x": 12,
"y": 27
},
"id": 19,
"id": 10,
"options": {
"legend": {
"calcs": [
"max",
"mean"
],
"displayMode": "table",
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true,
"sortBy": "Max",
"sortDesc": true
"showLegend": true
},
"tooltip": {
"hideZeros": false,
@@ -1265,7 +1247,7 @@
"sort": "none"
}
},
"pluginVersion": "12.3.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -1273,9 +1255,9 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "max(vmauth_user_request_duration_seconds{job=~\"$job\", instance=~\"$instance\", username=~\"$user\", quantile=~\"(0.99|0.5)\"}) by (quantile, username) > 0",
"hide": false,
"legendFormat": "user: {{username}} q: {{ quantile}}",
"expr": "sum(rate(vmauth_user_concurrent_requests_limit_reached_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by(username) > 0",
"interval": "1m",
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
@@ -1285,14 +1267,24 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "max(vmauth_unauthorized_user_request_duration_seconds{job=~\"$job\", instance=~\"$instance\", quantile=~\"(0.99|0.5)\"}) by (quantile) > 0",
"hide": false,
"legendFormat": "user: unauthorized q: {{ quantile}}",
"expr": "sum(rate(vmauth_unauthorized_user_concurrent_requests_limit_reached_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"legendFormat": "unauthorized",
"range": true,
"refId": "B"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_concurrent_requests_limit_reached_total[$__rate_interval])) > 0",
"legendFormat": "global at {{ $instance }}",
"range": true,
"refId": "C"
}
],
"title": "User requests duration",
"title": "Concurrent limit reached",
"type": "timeseries"
},
{
@@ -2599,7 +2591,7 @@
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"description": "Shows the number of restarts per job. The chart can be useful to identify periodic process restarts and correlate them with potential issues or anomalies. Normally, processes shouldn't restart unless restart was inited by user. The reason of restarts should be figured out by checking the logs of each specific service. ",
"description": "Shows the backend request error rate per user.\nThe value can be higher than the request error rate if:\n1. A single request is retried multiple times due to transient network errors, such as proxy idle timeout misconfiguration or sockets being closed by the OS.\n2. The request fails and is retried on other backends because of configured retry_status_codes.",
"fieldConfig": {
"defaults": {
"color": {
@@ -2611,8 +2603,8 @@
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"axisSoftMin": 0,
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -2622,13 +2614,14 @@
"viz": false
},
"insertNulls": false,
"lineInterpolation": "stepAfter",
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -2638,23 +2631,20 @@
"mode": "off"
}
},
"decimals": 0,
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
"value": 0
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
}
},
"overrides": []
},
@@ -2662,26 +2652,23 @@
"h": 8,
"w": 12,
"x": 0,
"y": 47
"y": 46
},
"id": 37,
"id": 41,
"options": {
"legend": {
"calcs": [
"lastNotNull"
],
"displayMode": "table",
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true,
"sortBy": "Last *",
"sortDesc": true
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "desc"
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -2689,14 +2676,24 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(changes(vm_app_start_timestamp{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval]) > 0) by(job)",
"format": "time_series",
"instant": false,
"legendFormat": "{{job}}",
"expr": "sum(rate(vmauth_user_request_backend_errors_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by (username) > 0",
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "victoriametrics-metrics-datasource",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_request_backend_errors_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"legendFormat": "unauthorized_user",
"range": true,
"refId": "B"
}
],
"title": "Restarts ($job)",
"title": "Requests backend error rate",
"type": "timeseries"
},
{

View File

@@ -4019,7 +4019,7 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows the rate of parsed datapoints from write or scrape requests.",
"description": "Shows the rate of data rows parsed from write requests and scraping. \nThere are two kinds of rows:\n1. Raw sample: each sample consists of a value and a timestamp, see https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples \n2. Metric metadata: refers to descriptive information about metrics. It can be disabled by setting `-enableMetadata=false`, see https://docs.victoriametrics.com/victoriametrics/vmagent/#metric-metadata.",
"fieldConfig": {
"defaults": {
"color": {
@@ -4049,6 +4049,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -4065,7 +4066,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -4101,7 +4103,7 @@
"sort": "desc"
}
},
"pluginVersion": "11.5.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -4112,12 +4114,24 @@
"exemplar": true,
"expr": "sum(rate(vm_protoparser_rows_read_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, type) > 0",
"interval": "",
"legendFormat": "{{ type }} ({{job}})",
"legendFormat": "sample: {{ type }} ({{job}})",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vm_protoparser_metadata_read_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, type) > 0",
"instant": false,
"legendFormat": "metadata: {{ type }} ({{job}})",
"range": true,
"refId": "B"
}
],
"title": "Datapoints rate ($instance)",
"title": "Rows rate ($instance)",
"type": "timeseries"
},
{
@@ -5721,7 +5735,7 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows the rate of rows ingested in vmagent via push protocols.",
"description": "Shows the rate of data rows ingested from write requests. There are two kinds of rows:\n1. Raw sample: each sample consists of a value and a timestamp, see https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples. \n2. Metric metadata: refers to descriptive information about metrics. It can be disabled by setting `-enableMetadata=false`, see https://docs.victoriametrics.com/victoriametrics/vmagent/#metric-metadata.",
"fieldConfig": {
"defaults": {
"color": {
@@ -5734,6 +5748,7 @@
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -5750,6 +5765,7 @@
"type": "linear"
},
"showPoints": "never",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -5766,7 +5782,8 @@
"mode": "absolute",
"steps": [
{
"color": "green"
"color": "green",
"value": 0
},
{
"color": "red",
@@ -5782,7 +5799,7 @@
"h": 8,
"w": 12,
"x": 12,
"y": 3931
"y": 41
},
"id": 131,
"options": {
@@ -5797,11 +5814,12 @@
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "9.2.6",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -5812,12 +5830,24 @@
"exemplar": true,
"expr": "sum(rate(vmagent_rows_inserted_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, type) > 0",
"interval": "",
"legendFormat": "{{ type }} ({{job}})",
"legendFormat": "sample: {{ type }} ({{job}})",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmagent_metadata_inserted_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by(job, type) > 0",
"instant": false,
"legendFormat": "metadata:{{ type }} ({{job}})",
"range": true,
"refId": "B"
}
],
"title": "Rows rate ($instance)",
"title": "Rows rate ($instance)",
"type": "timeseries"
},
{

View File

@@ -201,6 +201,7 @@
"defaults": {
"mappings": [],
"min": 0,
"noValue": "0",
"thresholds": {
"mode": "absolute",
"steps": [
@@ -238,7 +239,7 @@
"textMode": "auto",
"wideLayout": true
},
"pluginVersion": "12.2.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -560,8 +561,7 @@
"value": 80
}
]
},
"unit": "s"
}
},
"overrides": []
},
@@ -598,7 +598,7 @@
},
"editorMode": "code",
"exemplar": false,
"expr": "sum(min_over_time(vm_app_uptime_seconds{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (job)",
"expr": "sum(min_over_time(up{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (job)",
"format": "time_series",
"instant": false,
"interval": "",
@@ -622,6 +622,342 @@
"title": "Overview",
"type": "row"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows the rate of requests per user.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 9
},
"id": 11,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_user_requests_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by(username) > 0",
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_requests_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"legendFormat": "unauthorized_user",
"range": true,
"refId": "B"
}
],
"title": "Requests rate",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows duration in seconds of user requests by quantile.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
},
"unit": "s"
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 9
},
"id": 19,
"options": {
"legend": {
"calcs": [
"max",
"mean"
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true,
"sortBy": "Max",
"sortDesc": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "max(vmauth_user_request_duration_seconds{job=~\"$job\", instance=~\"$instance\", username=~\"$user\", quantile=~\"(0.99|0.5)\"}) by (quantile, username) > 0",
"legendFormat": "user: {{username}} q: {{ quantile}}",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "max(vmauth_unauthorized_user_request_duration_seconds{job=~\"$job\", instance=~\"$instance\", quantile=~\"(0.99|0.5)\"}) by (quantile) > 0",
"legendFormat": "user: unauthorized q: {{ quantile}}",
"range": true,
"refId": "B"
}
],
"title": "Requests duration",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows the request error rate per user.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 18
},
"id": 37,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_user_request_errors_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by (username) > 0",
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_request_errors_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"legendFormat": "unauthorized_user",
"range": true,
"refId": "B"
}
],
"title": "Requests error rate",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
@@ -687,8 +1023,8 @@
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 9
"x": 12,
"y": 18
},
"id": 16,
"options": {
@@ -704,7 +1040,7 @@
"sort": "none"
}
},
"pluginVersion": "12.3.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -712,8 +1048,7 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_http_request_errors_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (reason)",
"hide": false,
"expr": "sum(rate(vmauth_http_request_errors_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) by (reason) > 0",
"legendFormat": "__auto",
"range": true,
"refId": "A"
@@ -727,353 +1062,7 @@
"type": "prometheus",
"uid": "$ds"
},
"description": " The number of concurrent connections processed by vmauth reached one of limits. Possible solutions:\n- increase global limit with flag -maxConcurrentRequests\n- increase limit with flag: -maxConcurrentPerUserRequests for all users or with config option `max_concurrent_requests` per user.\n- deploy additional vmauth replicas\n- check requests latency at backend service and allocate resources to it if needed",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "bars",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 9
},
"id": 10,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.3.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_user_concurrent_requests_limit_reached_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by(username) > 0",
"interval": "1m",
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_concurrent_requests_limit_reached_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"hide": false,
"legendFormat": "unauthorized",
"range": true,
"refId": "B"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_concurrent_requests_limit_reached_total[$__rate_interval])) > 0",
"hide": false,
"legendFormat": "global at {{ $instance }}",
"range": true,
"refId": "C"
}
],
"title": "Concurrent limit reached",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 18
},
"id": 11,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.3.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_user_requests_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by(username)",
"hide": false,
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_requests_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval]))",
"hide": false,
"legendFormat": "unauthorized_user",
"range": true,
"refId": "B"
}
],
"title": "User requests rate",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": 0
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 18
},
"id": 37,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.3.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_user_request_errors_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by (username) > 0",
"hide": false,
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_request_errors_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"hide": false,
"legendFormat": "unauthorized_user",
"range": true,
"refId": "B"
}
],
"title": "User requests error rate",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows percent utilization of per concurrent requests capacity.",
"description": "Shows the percentage utilization of concurrent request capacity per user.",
"fieldConfig": {
"defaults": {
"color": {
@@ -1157,7 +1146,7 @@
"sort": "desc"
}
},
"pluginVersion": "12.3.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -1166,14 +1155,13 @@
},
"editorMode": "code",
"expr": "max(\nmax_over_time(vmauth_user_concurrent_requests_current{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])\n/ \nvmauth_user_concurrent_requests_capacity{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}\n) by(username) > 0\n",
"hide": false,
"interval": "5m",
"legendFormat": "__auto",
"range": true,
"refId": "A"
}
],
"title": "User concurrent requests usage",
"title": "Concurrent requests usage",
"type": "timeseries"
},
{
@@ -1181,7 +1169,7 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows duration in seconds of user requests by quantile.",
"description": " The number of concurrent connections processed by vmauth reached one of limits. Possible solutions:\n- increase global limit with flag -maxConcurrentRequests\n- increase limit with flag: -maxConcurrentPerUserRequests for all users or with config option `max_concurrent_requests` per user.\n- deploy additional vmauth replicas\n- check requests latency at backend service and allocate resources to it if needed",
"fieldConfig": {
"defaults": {
"color": {
@@ -1195,7 +1183,7 @@
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"drawStyle": "bars",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
@@ -1234,8 +1222,7 @@
"value": 80
}
]
},
"unit": "s"
}
},
"overrides": []
},
@@ -1245,18 +1232,13 @@
"x": 12,
"y": 27
},
"id": 19,
"id": 10,
"options": {
"legend": {
"calcs": [
"max",
"mean"
],
"displayMode": "table",
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true,
"sortBy": "Max",
"sortDesc": true
"showLegend": true
},
"tooltip": {
"hideZeros": false,
@@ -1264,7 +1246,7 @@
"sort": "none"
}
},
"pluginVersion": "12.3.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -1272,9 +1254,9 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "max(vmauth_user_request_duration_seconds{job=~\"$job\", instance=~\"$instance\", username=~\"$user\", quantile=~\"(0.99|0.5)\"}) by (quantile, username) > 0",
"hide": false,
"legendFormat": "user: {{username}} q: {{ quantile}}",
"expr": "sum(rate(vmauth_user_concurrent_requests_limit_reached_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by(username) > 0",
"interval": "1m",
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
@@ -1284,14 +1266,24 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "max(vmauth_unauthorized_user_request_duration_seconds{job=~\"$job\", instance=~\"$instance\", quantile=~\"(0.99|0.5)\"}) by (quantile) > 0",
"hide": false,
"legendFormat": "user: unauthorized q: {{ quantile}}",
"expr": "sum(rate(vmauth_unauthorized_user_concurrent_requests_limit_reached_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"legendFormat": "unauthorized",
"range": true,
"refId": "B"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_concurrent_requests_limit_reached_total[$__rate_interval])) > 0",
"legendFormat": "global at {{ $instance }}",
"range": true,
"refId": "C"
}
],
"title": "User requests duration",
"title": "Concurrent limit reached",
"type": "timeseries"
},
{
@@ -2598,7 +2590,7 @@
"type": "prometheus",
"uid": "$ds"
},
"description": "Shows the number of restarts per job. The chart can be useful to identify periodic process restarts and correlate them with potential issues or anomalies. Normally, processes shouldn't restart unless restart was inited by user. The reason of restarts should be figured out by checking the logs of each specific service. ",
"description": "Shows the backend request error rate per user.\nThe value can be higher than the request error rate if:\n1. A single request is retried multiple times due to transient network errors, such as proxy idle timeout misconfiguration or sockets being closed by the OS.\n2. The request fails and is retried on other backends because of configured retry_status_codes.",
"fieldConfig": {
"defaults": {
"color": {
@@ -2610,8 +2602,8 @@
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"axisSoftMin": 0,
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"fillOpacity": 0,
"gradientMode": "none",
@@ -2621,13 +2613,14 @@
"viz": false
},
"insertNulls": false,
"lineInterpolation": "stepAfter",
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"showPoints": "auto",
"showValues": false,
"spanNulls": false,
"stacking": {
"group": "A",
@@ -2637,23 +2630,20 @@
"mode": "off"
}
},
"decimals": 0,
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
"value": 0
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
}
},
"overrides": []
},
@@ -2661,26 +2651,23 @@
"h": 8,
"w": 12,
"x": 0,
"y": 47
"y": 46
},
"id": 37,
"id": 41,
"options": {
"legend": {
"calcs": [
"lastNotNull"
],
"displayMode": "table",
"calcs": [],
"displayMode": "list",
"placement": "bottom",
"showLegend": true,
"sortBy": "Last *",
"sortDesc": true
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "desc"
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "9.1.0",
"pluginVersion": "12.4.3",
"targets": [
{
"datasource": {
@@ -2688,14 +2675,24 @@
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(changes(vm_app_start_timestamp{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval]) > 0) by(job)",
"format": "time_series",
"instant": false,
"legendFormat": "{{job}}",
"expr": "sum(rate(vmauth_user_request_backend_errors_total{job=~\"$job\", instance=~\"$instance\", username=~\"$user\"}[$__rate_interval])) by (username) > 0",
"legendFormat": "__auto",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "$ds"
},
"editorMode": "code",
"expr": "sum(rate(vmauth_unauthorized_user_request_backend_errors_total{job=~\"$job\", instance=~\"$instance\"}[$__rate_interval])) > 0",
"legendFormat": "unauthorized_user",
"range": true,
"refId": "B"
}
],
"title": "Restarts ($job)",
"title": "Requests backend error rate",
"type": "timeseries"
},
{

View File

@@ -3,9 +3,9 @@
DOCKER_REGISTRIES ?= docker.io quay.io
DOCKER_NAMESPACE ?= victoriametrics
ROOT_IMAGE ?= alpine:3.23.3
ROOT_IMAGE ?= alpine:3.23.4
ROOT_IMAGE_SCRATCH ?= scratch
CERTS_IMAGE := alpine:3.23.3
CERTS_IMAGE := alpine:3.23.4
GO_BUILDER_IMAGE := golang:1.26.2

View File

@@ -151,7 +151,7 @@ Some alerting rules thresholds are just recommendations and could require an adj
The list of alerting rules is the following:
* [alerts-health.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/rules/alerts-health.yml):
alerting rules related to all VictoriaMetrics components for tracking their "health" state;
* [alerts.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/rules/alerts.yml):
* [alerts-single-node.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/rules/alerts-single-node.yml):
alerting rules related to [single-server VictoriaMetrics](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) installation;
* [alerts-cluster.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/rules/alerts-cluster.yml):
alerting rules related to [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/);

View File

@@ -3,7 +3,7 @@ services:
# It scrapes targets defined in --promscrape.config
# And forward them to --remoteWrite.url
vmagent:
image: victoriametrics/vmagent:v1.140.0
image: victoriametrics/vmagent:v1.142.0
depends_on:
- "vmauth"
ports:
@@ -42,14 +42,14 @@ services:
# vmstorage shards. Each shard receives 1/N of all metrics sent to vminserts,
# where N is number of vmstorages (2 in this case).
vmstorage-1:
image: victoriametrics/vmstorage:v1.140.0-cluster
image: victoriametrics/vmstorage:v1.142.0-cluster
volumes:
- strgdata-1:/storage
command:
- "--storageDataPath=/storage"
restart: always
vmstorage-2:
image: victoriametrics/vmstorage:v1.140.0-cluster
image: victoriametrics/vmstorage:v1.142.0-cluster
volumes:
- strgdata-2:/storage
command:
@@ -59,7 +59,7 @@ services:
# vminsert is ingestion frontend. It receives metrics pushed by vmagent,
# pre-process them and distributes across configured vmstorage shards.
vminsert-1:
image: victoriametrics/vminsert:v1.140.0-cluster
image: victoriametrics/vminsert:v1.142.0-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -68,7 +68,7 @@ services:
- "--storageNode=vmstorage-2:8400"
restart: always
vminsert-2:
image: victoriametrics/vminsert:v1.140.0-cluster
image: victoriametrics/vminsert:v1.142.0-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -80,7 +80,7 @@ services:
# vmselect is a query fronted. It serves read queries in MetricsQL or PromQL.
# vmselect collects results from configured `--storageNode` shards.
vmselect-1:
image: victoriametrics/vmselect:v1.140.0-cluster
image: victoriametrics/vmselect:v1.142.0-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -90,7 +90,7 @@ services:
- "--vmalert.proxyURL=http://vmalert:8880"
restart: always
vmselect-2:
image: victoriametrics/vmselect:v1.140.0-cluster
image: victoriametrics/vmselect:v1.142.0-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -105,7 +105,7 @@ services:
# read requests from Grafana, vmui, vmalert among vmselects.
# It can be used as an authentication proxy.
vmauth:
image: victoriametrics/vmauth:v1.140.0
image: victoriametrics/vmauth:v1.142.0
depends_on:
- "vmselect-1"
- "vmselect-2"
@@ -119,13 +119,13 @@ services:
# vmalert executes alerting and recording rules
vmalert:
image: victoriametrics/vmalert:v1.140.0
image: victoriametrics/vmalert:v1.142.0
depends_on:
- "vmauth"
ports:
- 8880:8880
volumes:
- ./rules/alerts-cluster.yml:/etc/alerts/alerts.yml
- ./rules/alerts-cluster.yml:/etc/alerts/alerts-cluster.yml
- ./rules/alerts-health.yml:/etc/alerts/alerts-health.yml
- ./rules/alerts-vmagent.yml:/etc/alerts/alerts-vmagent.yml
- ./rules/alerts-vmalert.yml:/etc/alerts/alerts-vmalert.yml

View File

@@ -3,7 +3,7 @@ services:
# It scrapes targets defined in --promscrape.config
# And forward them to --remoteWrite.url
vmagent:
image: victoriametrics/vmagent:v1.140.0
image: victoriametrics/vmagent:v1.142.0
depends_on:
- "victoriametrics"
ports:
@@ -18,7 +18,7 @@ services:
# VictoriaMetrics instance, a single process responsible for
# storing metrics and serve read requests.
victoriametrics:
image: victoriametrics/victoria-metrics:v1.140.0
image: victoriametrics/victoria-metrics:v1.142.0
ports:
- 8428:8428
- 8089:8089
@@ -59,14 +59,14 @@ services:
# vmalert executes alerting and recording rules
vmalert:
image: victoriametrics/vmalert:v1.140.0
image: victoriametrics/vmalert:v1.142.0
depends_on:
- "victoriametrics"
- "alertmanager"
ports:
- 8880:8880
volumes:
- ./rules/alerts.yml:/etc/alerts/alerts.yml
- ./rules/alerts-single-node.yml:/etc/alerts/alerts-single-node.yml
- ./rules/alerts-health.yml:/etc/alerts/alerts-health.yml
- ./rules/alerts-vmagent.yml:/etc/alerts/alerts-vmagent.yml
- ./rules/alerts-vmalert.yml:/etc/alerts/alerts-vmalert.yml

View File

@@ -170,3 +170,57 @@ groups:
is saturated by more than 90% and vminsert won't be able to keep up.\n
This usually means that more vminsert or vmstorage nodes must be added to the cluster in order to increase
the total number of vminsert -> vmstorage links."
- alert: MetadataCacheUtilizationIsTooHigh
expr: |
vm_metrics_metadata_storage_size_bytes / vm_metrics_metadata_storage_max_size_bytes > 0.95
for: 15m
labels:
severity: warning
annotations:
summary: "Metadata cache capacity on {{ $labels.instance }} (job={{ $labels.job }}) is utilized for more than 95% for the last 15min"
description: "Metadata cache stores meta information about ingested time series - see https://docs.victoriametrics.com/victoriametrics/#metrics-metadata.
When cache is overutilized, the oldest entries will be dropped out automatically. It may result into incomplete
response for /api/v1/metadata API calls. It doesn't impact regular queries or alerts. Cache size is controlled
via -storage.maxMetadataStorageSize cmd-line flag."
- alert: MetricNameStatsCacheUtilizationIsTooHigh
expr: |
vm_cache_size_bytes{type="storage/metricNamesStatsTracker"} / vm_cache_size_max_bytes{type="storage/metricNamesStatsTracker"} > 0.95
for: 15m
labels:
severity: warning
annotations:
summary: "Cache capacity for tracking metric names usage on {{ $labels.instance }} (job={{ $labels.job }}) is utilized for more than 95% during the last 15min"
description: "Metric names usage cache stores information about unique metric names and how frequently they are queried - see https://docs.victoriametrics.com/victoriametrics/#track-ingested-metrics-usage.
When cache is overutilized, it will stop tracking the new metric names. It has no other negative impact.
Usually, the number of unique metric names is very limited (thousands). The cache can be overutilized only if metric names
are changing too frequently or if the cache size is too low. There are following ways to mitigate cache overutilization:
- disable cache via `--storage.trackMetricNamesStats=false` flag, so metric names usage will stop tracking
- increase the cache size via `--storage.cacheSizeMetricNamesStats` flag
- reset the cache (see docs for details)"
- alert: IndexDBRecordsDrop
expr: increase(vm_indexdb_items_dropped_total[5m]) > 0
labels:
severity: critical
annotations:
summary: "IndexDB skipped registering items during data ingestion with reason={{ $labels.reason }}."
description: |
VictoriaMetrics could skip registering new timeseries during ingestion if they fail the validation process.
For example, `reason=too_long_item` means that time series cannot exceed 64KB. Please, reduce the number
of labels or label values for such series. Or enforce these limits via `-maxLabelsPerTimeseries` and
`-maxLabelValueLen` command-line flags.
- alert: TooManyTSIDMisses
expr: increase(vm_missing_tsids_for_metric_id_total[5m]) > 0
for: 15m
labels:
severity: critical
annotations:
summary: "Unexpected TSID misses for job \"{{ $labels.job }}\" ({{ $labels.instance }}) for the last 15 minutes"
description: |
Unexpected TSID misses for \"{{ $labels.job }}\" ({{ $labels.instance }}) for the last 15 minutes.
If this happens after unclean shutdown of VictoriaMetrics process (via \"kill -9\", OOM or power off),
then this is OK - the alert must go away in a few minutes after the restart.
Otherwise this may point to the corruption of index data.

View File

@@ -82,19 +82,6 @@ groups:
Check the logs for the given target. Check also the \"location\" label at the vm_log_messages_total metric if -loggerLevel command-line flag is set to value other than INFO.
This label contains code locations responsible for generating log messages suppressed by -loggerLevel.
- alert: TooManyTSIDMisses
expr: increase(vm_missing_tsids_for_metric_id_total[5m]) > 0
for: 15m
labels:
severity: critical
annotations:
summary: "Unexpected TSID misses for job \"{{ $labels.job }}\" ({{ $labels.instance }}) for the last 15 minutes"
description: |
Unexpected TSID misses for \"{{ $labels.job }}\" ({{ $labels.instance }}) for the last 15 minutes.
If this happens after unclean shutdown of VictoriaMetrics process (via \"kill -9\", OOM or power off),
then this is OK - the alert must go away in a few minutes after the restart.
Otherwise this may point to the corruption of index data.
- alert: ConcurrentInsertsHitTheLimit
expr: avg_over_time(vm_concurrent_insert_current[1m]) >= vm_concurrent_insert_capacity
for: 15m
@@ -109,28 +96,6 @@ groups:
making write attempts. If vmagent's or vminsert's CPU usage and network saturation are at normal level, then
it might be worth adjusting `-maxConcurrentInserts` cmd-line flag.
- alert: IndexDBRecordsDrop
expr: increase(vm_indexdb_items_dropped_total[5m]) > 0
labels:
severity: critical
annotations:
summary: "IndexDB skipped registering items during data ingestion with reason={{ $labels.reason }}."
description: |
VictoriaMetrics could skip registering new timeseries during ingestion if they fail the validation process.
For example, `reason=too_long_item` means that time series cannot exceed 64KB. Please, reduce the number
of labels or label values for such series. Or enforce these limits via `-maxLabelsPerTimeseries` and
`-maxLabelValueLen` command-line flags.
- alert: RowsRejectedOnIngestion
expr: rate(vm_rows_ignored_total[5m]) > 0
for: 15m
labels:
severity: warning
annotations:
summary: "Some rows are rejected on \"{{ $labels.instance }}\" on ingestion attempt"
description: "Ingested rows on instance \"{{ $labels.instance }}\" are rejected due to the
following reason: \"{{ $labels.reason }}\""
- alert: TooHighQueryLoad
expr: increase(vm_concurrent_select_limit_timeout_total[5m]) > 0
for: 15m
@@ -148,3 +113,14 @@ groups:
* increase compute resources or number of replicas;
* adjust limits `-search.maxConcurrentRequests` and `-search.maxQueueDuration`.
See more at https://docs.victoriametrics.com/victoriametrics/troubleshooting/#slow-queries.
- alert: RowsRejectedOnIngestion
expr: rate(vm_rows_ignored_total[5m]) > 0
for: 15m
labels:
severity: warning
annotations:
summary: "Some rows are rejected on \"{{ $labels.instance }}\" on ingestion attempt"
description: "Ingested rows on instance \"{{ $labels.instance }}\" are rejected due to the
following reason: \"{{ $labels.reason }}\""

View File

@@ -148,4 +148,45 @@ groups:
description: "Metadata cache stores meta information about ingested time series - see https://docs.victoriametrics.com/victoriametrics/#metrics-metadata.
When cache is overutilized, the oldest entries will be dropped out automatically. It may result into incomplete
response for /api/v1/metadata API calls. It doesn't impact regular queries or alerts. Cache size is controlled
via -storage.maxMetadataStorageSize cmd-line flag."
via -storage.maxMetadataStorageSize cmd-line flag."
- alert: MetricNameStatsCacheUtilizationIsTooHigh
expr: |
vm_cache_size_bytes{type="storage/metricNamesStatsTracker"} / vm_cache_size_max_bytes{type="storage/metricNamesStatsTracker"} > 0.95
for: 15m
labels:
severity: warning
annotations:
summary: "Cache capacity for tracking metric names usage on {{ $labels.instance }} (job={{ $labels.job }}) is utilized for more than 95% during the last 15min"
description: "Metric names usage cache stores information about unique metric names and how frequently they are queried - see https://docs.victoriametrics.com/victoriametrics/#track-ingested-metrics-usage.
When cache is overutilized, it will stop tracking the new metric names. It has no other negative impact.
Usually, the number of unique metric names is very limited (thousands). The cache can be overutilized only if metric names
are changing too frequently or if the cache size is too low. There are following ways to mitigate cache overutilization:
- disable cache via `--storage.trackMetricNamesStats=false` flag, so metric names usage will stop tracking
- increase the cache size via `--storage.cacheSizeMetricNamesStats` flag
- reset the cache (see docs for details)"
- alert: IndexDBRecordsDrop
expr: increase(vm_indexdb_items_dropped_total[5m]) > 0
labels:
severity: critical
annotations:
summary: "IndexDB skipped registering items during data ingestion with reason={{ $labels.reason }}."
description: |
VictoriaMetrics could skip registering new timeseries during ingestion if they fail the validation process.
For example, `reason=too_long_item` means that time series cannot exceed 64KB. Please, reduce the number
of labels or label values for such series. Or enforce these limits via `-maxLabelsPerTimeseries` and
`-maxLabelValueLen` command-line flags.
- alert: TooManyTSIDMisses
expr: increase(vm_missing_tsids_for_metric_id_total[5m]) > 0
for: 15m
labels:
severity: critical
annotations:
summary: "Unexpected TSID misses for job \"{{ $labels.job }}\" ({{ $labels.instance }}) for the last 15 minutes"
description: |
Unexpected TSID misses for \"{{ $labels.job }}\" ({{ $labels.instance }}) for the last 15 minutes.
If this happens after unclean shutdown of VictoriaMetrics process (via \"kill -9\", OOM or power off),
then this is OK - the alert must go away in a few minutes after the restart.
Otherwise this may point to the corruption of index data.

View File

@@ -1,6 +1,6 @@
services:
vmagent:
image: victoriametrics/vmagent:v1.140.0
image: victoriametrics/vmagent:v1.142.0
depends_on:
- "victoriametrics"
ports:
@@ -14,7 +14,7 @@ services:
restart: always
victoriametrics:
image: victoriametrics/victoria-metrics:v1.140.0
image: victoriametrics/victoria-metrics:v1.142.0
ports:
- 8428:8428
volumes:
@@ -40,7 +40,7 @@ services:
restart: always
vmalert:
image: victoriametrics/vmalert:v1.140.0
image: victoriametrics/vmalert:v1.142.0
depends_on:
- "victoriametrics"
ports:

View File

@@ -1,6 +1,11 @@
VictoriaMetrics Observability Stack integrates with AI assistants through MCP servers and agent skills.
These integrations allow AI agents and automation tools to query metrics, logs, and traces, analyze telemetry data,
and assist engineers with debugging and observability tasks.
VictoriaMetrics Observability Stack integrates with AI assistants through [MCP servers](https://docs.victoriametrics.com/ai-tools/#mcp-servers)
and [agent skills](https://docs.victoriametrics.com/ai-tools/#agent-skills).
The integrations allow AI agents and automation tools to query Metrics, Logs, and Traces, analyze telemetry data,
and assist engineers with debugging, observability tasks, root cause analysis, anomaly detection, etc.
Support of [OpenTelemetry](https://docs.victoriametrics.com/opentelemetry/) for Metrics, Logs, and Traces
makes VictoriaMetrics Observability Stack optimal for [AI observability](https://docs.victoriametrics.com/ai-tools/#ai-observability).
Any SDK or AI assistant that can emit telemetry signals in OpenTelemetry format can be integrated with VictoriaMetrics.
# MCP Servers
@@ -72,7 +77,6 @@ Capabilities include:
See more details at [VictoriaMetrics/mcp-vmanomaly](https://github.com/VictoriaMetrics/mcp-vmanomaly).
# Agent Skills
[Agent skills](https://github.com/VictoriaMetrics/skills) help AI agents and automation tools understand, operate,
@@ -90,4 +94,17 @@ To install the available skills for AI agents, run:
npx skills add VictoriaMetrics/skills
```
See more details at [VictoriaMetrics/skills](https://github.com/VictoriaMetrics/skills).
See more details at [VictoriaMetrics/skills](https://github.com/VictoriaMetrics/skills).
# AI observability
VictoriaMetrics Observability Stack is optimal for monitoring AI agents using auto-instrumentation libraries
like [OpenLLMetry](https://github.com/traceloop/openllmetry), [OpenInference](https://github.com/Arize-ai/openinference),
[OpenLIT](https://victoriametrics.com/blog/ai-agents-observability/#using-openlit).
Please see more details in [AI Agents Observability with OpenTelemetry and the VictoriaMetrics Stack](https://victoriametrics.com/blog/ai-agents-observability).
AI code assistants like Claude Code, OpenAI Codex, Gemini CLI, Qwen Code, and OpenCode expose internal telemetry that
helps to monitor cost usage, analytics, performance, compliance and improves troubleshooting experience. All major
AI coding tools support OpenTelemetry and can be easily integrated into VictoriaMetrics Observability Stack.
Please see more details in [Vibe coding tools observability with VictoriaMetrics Stack and OpenTelemetry
](https://victoriametrics.com/blog/vibe-coding-observability/).

View File

@@ -48,13 +48,15 @@ Please see example graph illustrating this logic below:
## What data does vmanomaly operate on?
`vmanomaly` operates on timeseries (metrics) data, and supports both **VictoriaMetrics** and **VictoriaLogs** as data sources. Choose the source depending on the use case.
> [!NOTE]
> `vmanomaly` operates on timeseries (metrics) data, and supports both **VictoriaMetrics** and **VictoriaLogs/VictoriaTraces** as data sources to get metrics-compatible data. Choose the source depending on the use case. Single-node / Cluster and OpenSource / Enterprise datasources are supported as well, `vmanomaly` is compatible with both, yet itself requires an [Enterprise license](https://victoriametrics.com/products/enterprise/) to run.
**VictoriaMetrics (metrics):** use full [MetricsQL](https://docs.victoriametrics.com/victoriametrics/metricsql/) for selection, sampling, and processing; [global filters](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#prometheus-querying-api-enhancements) are also supported. See the [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader) for the details.
**VictoriaLogs (logs → metrics):** {{% available_from "v1.26.0" anomaly %}} use [LogsQL](https://docs.victoriametrics.com/victorialogs/logsql/) via the [`VLogsReader`](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vlogs-reader) to create log-derived metrics for anomaly detection (e.g., error rates, request latencies).
**VictoriaLogs (logs → metrics):** {{% available_from "v1.26.0" anomaly %}} use [LogsQL](https://docs.victoriametrics.com/victorialogs/logsql/) via the [`VLogsReader`](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vlogs-reader) to create log-derived or traces-derived metrics for anomaly detection (e.g., error rates, request latencies, error spans count).
> Please note that only LogsQL queries with [stats pipe](https://docs.victoriametrics.com/victorialogs/logsql/#stats-pipe) functions [subset](https://docs.victoriametrics.com/anomaly-detection/components/reader/#valid-stats-functions) are supported, as they produce **numeric** time series.
> [!NOTE]
> Please note that only LogsQL queries with [stats pipe](https://docs.victoriametrics.com/victorialogs/logsql/#stats-pipe) functions [subset](https://docs.victoriametrics.com/anomaly-detection/components/reader/#valid-stats-functions) are supported, as they produce **numeric** time series.
## Using offsets

View File

@@ -9,14 +9,17 @@ sitemap:
In today's fast-paced and complex landscape of system monitoring, [VictoriaMetrics Anomaly Detection](https://victoriametrics.com/products/enterprise/anomaly-detection/) (`vmanomaly`), a part of our [Enterprise offering](https://victoriametrics.com/products/enterprise/), serves as an **observability layer** for SREs and DevOps teams atop of collected data to **automate the detection of anomalies in time-series data**, reducing manual efforts required to identify abnormal system behavior.
Unlike traditional threshold-based alerting, which relies on **raw metric values** and requires constant tuning and maintenance of thresholds and alerting rules, `vmanomaly` introduces a **unified, interpretable [anomaly score](https://docs.victoriametrics.com/anomaly-detection/faq/#what-is-anomaly-score)** - a **de-trended, de-seasonalized metric** generated through machine learning. This approach eliminates the need for frequent manual adjustments by enabling **stable, long-term static thresholds (as simple as `anomaly_score > 1`)** that remain effective over time through continuous model retraining.
Unlike traditional threshold-based alerting, which relies on **raw metric values** and requires constant tuning and maintenance of thresholds and alerting rules, `vmanomaly` introduces a **unified, interpretable [anomaly score](https://docs.victoriametrics.com/anomaly-detection/faq/#what-is-anomaly-score)** - a **de-trended, de-seasonalized metric** generated through machine learning. This approach eliminates the need for frequent manual adjustments by enabling **stable, long-term static thresholds (as simple as `anomaly_score > 1`)** that remain effective over time through continuous model retraining and updates.
By shifting to anomaly-based detection, teams can **identify and respond to potential issues faster**, enhancing system reliability and operational efficiency while significantly **reducing the engineering effort spent on handcrafting and maintaining alerting rules**.
## What does it do?
`vmanomaly` is designed to **periodically analyze new data points** across selected metrics (either requested from [VictoriaMetrics TSDB](https://docs.victoriametrics.com/victoriametrics/) or produced by [VictoriaLogs](https://docs.victoriametrics.com/victorialogs/) metrics [endpoint](https://docs.victoriametrics.com/victorialogs/querying/#querying-log-range-stats)), generating a **unified metric** called [anomaly score](https://docs.victoriametrics.com/anomaly-detection/faq/#what-is-anomaly-score).
`vmanomaly` is designed to **periodically analyze new data points** across selected metrics - either requested from [VictoriaMetrics TSDB](https://docs.victoriametrics.com/victoriametrics/) or produced by [VictoriaLogs](https://docs.victoriametrics.com/victorialogs/) or [VictoriaTraces](https://docs.victoriametrics.com/victoriatraces/) metrics [endpoint](https://docs.victoriametrics.com/victorialogs/querying/#querying-log-range-stats) - to generate a **unified metric** called [anomaly score](https://docs.victoriametrics.com/anomaly-detection/faq/#what-is-anomaly-score).
> [!NOTE]
> `vmanomaly` can use both single-node and cluster versions of VictoriaMetrics/VictoriaLogs/VictoriaTraces as a data source, and is compatible with both OpenSource and Enterprise versions of it. However, `vmanomaly` itself requires an Enterprise license to run, and is part of our [Enterprise offering](https://victoriametrics.com/products/enterprise/).
Key functions:
- **Automated anomaly detection** - continuously scans time-series data to identify deviations from expected behavior.

Binary file not shown.

Before

Width:  |  Height:  |  Size: 51 KiB

After

Width:  |  Height:  |  Size: 357 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 54 KiB

After

Width:  |  Height:  |  Size: 467 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 22 KiB

After

Width:  |  Height:  |  Size: 188 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 25 KiB

After

Width:  |  Height:  |  Size: 234 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 106 KiB

After

Width:  |  Height:  |  Size: 282 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 181 KiB

After

Width:  |  Height:  |  Size: 1.0 MiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 164 KiB

After

Width:  |  Height:  |  Size: 929 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 104 KiB

After

Width:  |  Height:  |  Size: 563 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 331 KiB

After

Width:  |  Height:  |  Size: 871 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 93 KiB

After

Width:  |  Height:  |  Size: 310 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 122 KiB

After

Width:  |  Height:  |  Size: 944 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 2.6 KiB

After

Width:  |  Height:  |  Size: 19 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 44 KiB

After

Width:  |  Height:  |  Size: 303 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 140 KiB

After

Width:  |  Height:  |  Size: 681 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 109 KiB

After

Width:  |  Height:  |  Size: 805 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 17 KiB

After

Width:  |  Height:  |  Size: 111 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 160 KiB

After

Width:  |  Height:  |  Size: 1.2 MiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 236 KiB

After

Width:  |  Height:  |  Size: 189 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 26 KiB

After

Width:  |  Height:  |  Size: 206 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 26 KiB

After

Width:  |  Height:  |  Size: 206 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 1.5 KiB

After

Width:  |  Height:  |  Size: 12 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 53 KiB

After

Width:  |  Height:  |  Size: 206 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 42 KiB

After

Width:  |  Height:  |  Size: 428 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 9.7 KiB

After

Width:  |  Height:  |  Size: 37 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 34 KiB

After

Width:  |  Height:  |  Size: 740 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 2.4 KiB

After

Width:  |  Height:  |  Size: 11 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 35 KiB

After

Width:  |  Height:  |  Size: 225 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 98 KiB

After

Width:  |  Height:  |  Size: 189 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 11 KiB

After

Width:  |  Height:  |  Size: 26 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 25 KiB

After

Width:  |  Height:  |  Size: 88 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 24 KiB

After

Width:  |  Height:  |  Size: 160 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 19 KiB

After

Width:  |  Height:  |  Size: 93 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 80 KiB

After

Width:  |  Height:  |  Size: 425 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 53 KiB

After

Width:  |  Height:  |  Size: 514 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 32 KiB

After

Width:  |  Height:  |  Size: 193 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 19 KiB

After

Width:  |  Height:  |  Size: 64 KiB

View File

@@ -6,31 +6,133 @@ build:
sitemap:
disable: true
---
This guide covers OpenShift cluster configuration for collecting and storing logs in Victoria Logs.
This guide explains how to collect and store logs from an OpenShift cluster in VictoriaLogs.
## Pre-Requirements
* [OpenShift cluster](https://www.redhat.com/en/technologies/cloud-computing/openshift)
* Admin access to OpenShift cluster
* [kubectl installed](https://kubernetes.io/docs/tasks/tools/install-kubectl) and configured to access OpenShift cluster
* [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl) or [oc](https://github.com/openshift/oc) installed and configured to access OpenShift cluster
* [Helm installed](https://helm.sh/docs/intro/install)
## Install Red Hat OpenShift Logging operator
> [!NOTE] Note
> You can replace every `kubectl` command in this guide with `oc`. They are interchangeable in most cases on OpenShift clusters.
[Cluster logging operator](https://github.com/openshift/cluster-logging-operator) is a logging solution to support aggregated cluster logging. It is using [Vector](https://vector.dev/) for logs collection and shipping to remote storage.
## Overview
In order to install the operator, navigate to the OpenShift web console and select the `Operators` tab. Then, click on `OperatorHub` and search for `Red Hat OpenShift Logging`. Click on the `Red Hat OpenShift Logging` operator and then click on `Install`.
To collect OpenShift logs, we're going to:
![Install logging operator](install-oc-logging-operator.webp)
1. [Install VictoriaLogs](https://docs.victoriametrics.com/guides/collecting-openshift-logs-with-victoria-logs/#install-victoria-logs) in the OpenShift Cluster
2. [Configure a service account](https://docs.victoriametrics.com/guides/collecting-openshift-logs-with-victoria-logs/#rbac-configuration) to access the logs
3. [Install the OpenShift Logging operator](https://docs.victoriametrics.com/guides/collecting-openshift-logs-with-victoria-logs/#install-red-hat-openshift-logging-operator)
4. [Configure a Log Forwarder](https://docs.victoriametrics.com/guides/collecting-openshift-logs-with-victoria-logs/#configure-logs-forwarding)
5. [Test log ingestion](https://docs.victoriametrics.com/guides/collecting-openshift-logs-with-victoria-logs/#verify-logs-ingestion) in VictoriaLogs
## Install VictoriaLogs {#install-victoria-logs}
## RBAC configuration
Run the following command to add the [VictoriaMetrics Helm repository](https://github.com/VictoriaMetrics/helm-charts):
Create a service account and cluster role binding for the service account to access the logs.
OpenShift provides separate `ClusterRoles` for monitoring of different types of logs: `audit`, `infrastructure` and `application`.
```shell
helm repo add vm https://victoriametrics.github.io/helm-charts/
helm repo update
```
The following configuration will allow the service account to collect all types of logs:
```yaml
To verify that everything is set up correctly, you may run this command:
```shell
helm search repo vm/
```
You should get a list similar to this:
```text
NAME CHART VERSION APP VERSION DESCRIPTION
vm/victoria-logs-agent 0.1.1 v1.50.0 VictoriaLogs Agent - accepts logs from various ...
vm/victoria-logs-collector 0.3.1 v1.50.0 VictoriaLogs Collector - collects logs from Kub...
vm/victoria-logs-single 0.12.2 v1.50.0 The VictoriaLogs single Helm chart deploys Vict...
...
```
Create a minimal configuration file to run VictoriaLogs in OpenShift:
```shell
cat <<EOF >vl-values.yml
securityContext:
enabled: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
podSecurityContext:
enabled: true
runAsNonRoot: true
EOF
```
> Note that, depending on the OpenShift cluster configuration, additional security settings might be required.
Create a namespace for VictoriaLogs called `vl`:
```shell
kubectl create namespace vl
```
Install VictoriaLogs with the following command:
```shell
helm upgrade --namespace vl --install vl vm/victoria-logs-single -f vl-values.yml
```
You should see a message like this:
```text
Release "vl" does not exist. Installing it now.
NAME: vl
LAST DEPLOYED: Fri Apr 17 01:05:42 2026
NAMESPACE: vl
STATUS: deployed
REVISION: 1
DESCRIPTION: Install complete
TEST SUITE: None
NOTES:
The VictoriaLogs write api can be accessed via port 9428 on the following DNS name from within your cluster:
vl-victoria-logs-single-server-0.vl-victoria-logs-single-server.vl.svc.cluster.local.
Logs Ingestion:
Get the VictoriaLogs service URL by running these commands in the same shell:
kubectl --namespace vl port-forward svc/vl-victoria-logs-single-server 9428:9428
echo http://localhost:9428
Write URL inside the kubernetes cluster:
http://vl-victoria-logs-single-server.vl.svc.cluster.local.:9428<protocol-specific-write-endpoint>
See the documentation for log ingestion and supported write endpoints at https://docs.victoriametrics.com/victorialogs/data-ingestion/.
Read Data:
The following URL can be used to query data:
http://vl-victoria-logs-single-server.vl.svc.cluster.local.:9428
See the documentation for log querying UI at https://docs.victoriametrics.com/victorialogs/querying/#web-ui or HTTP API at https://docs.victoriametrics.com/victorialogs/querying/#http-api
```
Note the "Write URL" value as you'll need it later. In the example above, the value is:
```text
http://vl-victoria-logs-single-server.vl.svc.cluster.local.:9428
```
## RBAC Configuration
Create a service account and cluster role binding for the service account to collect and forward the logs.
OpenShift provides separate `ClusterRoles` for monitoring of different types of logs: `audit`, `infrastructure`, and `application`.
Create a file to configure the service account and cluster role bindings:
```shell
cat <<EOF >vl-rbac.yml
kind: ServiceAccount
apiVersion: v1
metadata:
@@ -77,52 +179,42 @@ roleRef:
name: collect-application-logs
```
Alternatively, you can use OpenShift console to create the service account and cluster role binding.
Navigate to `ServiceAccounts` and click on `Create Service Account`. Fill in the name `victorialogs` and namespace `openshift-logging` and click on `Create`.
Then, navigate to `RoleBindings` and create a binding for each `ClusterRole` for subject `victorialogs` in `openshift-logging` namespace.
Install the roles in OpenShift with:
## Install Victoria Logs
Add Victoria Metrics Helm [repository](https://github.com/VictoriaMetrics/helm-charts):
```bash
helm repo add vm https://victoriametrics.github.io/helm-charts/
```shell
kubectl apply -f vl-rbac.yml
```
Minimal configuration for VictoriaLogs running in OpenShift:
```yaml
securityContext:
enabled: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
Alternatively, you can use the OpenShift web console to create the service account and cluster role binding. To do this:
podSecurityContext:
enabled: true
runAsNonRoot: true
```
Save the configuration to `vl.yaml`.
1. Navigate to **ServiceAccounts**.
2. Click on **Create Service Account**.
3. Fill in the name `victorialogs` and namespace `openshift-logging`.
4. Click on **Create**.
5. Navigate to **RoleBindings**.
6. Create a binding for each `ClusterRole` for subject `victorialogs` in `openshift-logging` namespace.
> Note, that depending on the OpenShift cluster configuration, additional security settings might be required.
## Install Red Hat OpenShift Logging operator
Create namespace for VictoriaLogs:
```bash
kubectl create namespace vl
```
The [Cluster logging operator](https://github.com/openshift/cluster-logging-operator) is a logging solution to support aggregated cluster logging. It is using [Vector](https://vector.dev/) for log collection and shipping to remote storage.
Install VictoriaLogs with the following command:
```bash
helm upgrade --namespace vl --install vl vm/victoria-logs-single -f vl.yaml
```
To install the operator:
1. Navigate to **Ecosystem** > **Software Catalog**
2. Search for "OpenShift Logging" and select the operator.
![Screenshot of OpenShift web console](software-catalog-openshift-logging-1.webp)
3. Press **Install**.
4. Confirm the settings and press **Install** again.
![Screenshot of OpenShift web console](software-catalog-openshift-logging-options-3.webp)
## Configure logs forwarding
Cluster logging operator uses `ClusterLogForwarder` resource for logs forwarding configuration.
We need to create a `ClusterLogForwarder` resource to forward logs from the OpenShift Logging operator to VictoriaLogs.
Here is an example configuration to forward all cluster logs to VictoriaLogs:
```yaml
Run the following command to create the resource file:
```shell
cat <<EOF > vl-forwarder.yml
apiVersion: observability.openshift.io/v1
kind: ClusterLogForwarder
metadata:
@@ -160,7 +252,7 @@ spec:
- application
name: application
outputRefs:
- victorialogs-audit
- victorialogs
- inputRefs:
- infrastructure
name: infrastructure
@@ -170,21 +262,66 @@ spec:
- audit
name: audit
outputRefs:
- victorialogs
- victorialogs-audit
serviceAccount:
name: victorialogs
EOF
```
It is also possible to configure logs forwarding using OpenShift console. Navigate to `Operators` tab and click on `Installed Operators`. Then, click on `Red Hat OpenShift Logging` and navigate to `ClusterLogForwarders`. Click on `Create ClusterLogForwarder`.
Configure the forwarding to VictoriaLogs using the provided form and click on `Create`.
![Create logs forwarders menu](create-cluster-logs-forwarder.webp)
![Create forwarder config](create-cluster-logs-forwarder-2.webp)
Then install the resource with:
## Verify logs collection
```shell
kubectl apply -f vl-forwarder.yml
```
To verify that logs are collected and stored in VictoriaLogs, navigate to the VictoriaLogs web interface.
Alternatively, you can configure log forwarding in the OpenShift web console. To do this:
1. Navigate to **Operators** tab
2. Click on **Installed Operators**.
3. Find **Red Hat OpenShift Logging**
4. Navigate to **ClusterLogForwarders**.
5. Click on **Create ClusterLogForwarder**.
![Screenshot of OpenShift web console](software-catalog-openshift-logging-forwarder-5.webp)
6. Use the form to configure each type of forwarder.
![Screenshot of OpenShift web console](forwarder-form.webp)
7. Click **Create**
![VictoriaLogs](openshift-logs.webp)
## Verify logs ingestion
We can verify that logs are being collected using the VictoriaLogs VMUI.
First, find the service name for the VMUI with:
```shell
kubectl get svc -n vl -l app.kubernetes.io/instance=vl
```
You should get a result similar to this. Note the name of the service:
```text
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
vl-victoria-logs-single-server ClusterIP None <none> 9428/TCP 113s
```
Then, port-forward the VMUI service console with:
```shell
kubectl -n vl port-forward svc/vl-victoria-logs-single-server 9428:9428
```
Open your browser in `http://localhost:9428/select/vmui/#/overview` and verify that logs are being collected. This overview page shows the number of log entries being consumed in real time.
![Screenshot of VMUI for VictoriaLogs](vmui-overview.webp)
<figcaption style="text-align: center; font-style: italic;">Overview pane in VMUI</figcaption>
You can query your logs in the **Query** tab, found in `http://localhost:9428/select/vmui`.
You can filter streams on the left side pane to filter logs and use [LogsQL](https://docs.victoriametrics.com/victorialogs/logsql/) to search for entries. Note that logs will have `log_type` attached to them to distinguish between different types of logs.
![Screenshot of VMUI for VictoriaLogs](vmui-query-filters.webp)
<figcaption style="text-align: center; font-style: italic;">Query pane in VMUI</figcaption>
## See also
- [VictoriaLogs Quickstart](https://docs.victoriametrics.com/victorialogs/quickstart/)
- [Logs Reference](https://docs.victoriametrics.com/victorialogs/logsql/)
- [LogsQL Examples](https://docs.victoriametrics.com/victorialogs/logsql-examples/)
Logs will be available in the interface and can be queried using [LogsQL](https://docs.victoriametrics.com/victorialogs/logsql/).
Note that logs will have `log_type` attached to them to distinguish between different types of logs.

Some files were not shown because too many files have changed in this diff Show More