Commit Graph

6175 Commits

Author SHA1 Message Date
Andrii Chubatiuk
16a75129be docs: exclude files from rendering by hugo (#9591)
required for https://github.com/VictoriaMetrics/vmdocs/issues/164

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-20 12:04:06 +03:00
Fred Navruzov
4360d10962 docs/vmanomaly: release v1.25.3 (#9597)
### Describe Your Changes

Update docs to vmanomaly release v1.25.3

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-19 10:24:48 +04:00
Zakhar Bessarab
15ce9e5e49 docs: update references to the latest releases
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-18 16:08:12 +04:00
Zakhar Bessarab
2c1596ea84 docs/changelog: backport LTS release notes
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-18 15:36:43 +04:00
f41gh7
f35b9ed36d deployment/docker: update Go builder from 1.24.6 to 1.25.0
Changes https://tip.golang.org/doc/go1.25
2025-08-17 20:31:30 +02:00
Zakhar Bessarab
b4dc67cba6 docs/CHANGELOG.md: cut v1.124.0
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-15 15:00:46 +04:00
Zakhar Bessarab
70afdd0285 docs: update version tooltips
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-15 14:51:10 +04:00
Andrei Baidarov
e49027df8f app/vmagent: properly apply dropOnOverload condition
Previously, vmagent treated differently the following configuration:

1) ./bin/vmagent --remoteWrite.url=url-0 --remoteWrite.url=url-1 --remoteWrite.disableOndiskQueue

 and

2)./bin/vmagent --remoteWrite.url=url-0 --remoteWrite.url=url-1 --remoteWrite.disableOndiskQueue=true,true

In first case, it could produce duplicates and blocks ingestion requests if one of remote write targets were not accessible.
In second case, it implicitly added --remoteWrite.dropSamplesOnOverload as true and silently dropped samples for inaccessible target.

 This commit treat this configuration as the same and silently drop samples on both cases to mitigate possible duplicates. 

 It's expected, that vmagent provides delivery guarantees, only if it has a single remote write target, when flag remoteWrite.disableOndiskQueue=true is set.


Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9565
2025-08-14 16:11:08 +02:00
Andrii Chubatiuk
a518a4a904 lib/backup: added checksum algorithm for all S3 PutObject requests (#9549)
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9532
set checksum algorithm to SHA256, not sure if this property should be
configurable

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-14 17:50:03 +04:00
Max Kotliar
7cc13ee1cc docs/changelog: move metadata changelog record to tip
Follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9306
2025-08-13 21:58:49 +03:00
Zakhar Bessarab
59007cda51 docs: update examples to use proper license flags (#9579)
`-eula` was deprecated and made no-op in v1.123.0, so examples with
`-eula` will no longer work.
Replace those with proper license configuration.

While at it, remove license flags from vmbackupmanager CLI commands as
it is not required when using CLI.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-08-13 19:12:19 +04:00
hagen1778
5869a39e7b metricsql: return a proper error message for scalar arguments
Follow-up for 8b92af9d45

Initial PR contained the change for getScalar function - see https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9548
But change was dropped during incorrect rebase.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-08-13 13:22:13 +02:00
Hui Wang
8b92af9d45 metricsql: return a proper error message when the function argument i… (#9548)
…s expected to be a string

In MetricsQL, functions like
[count_values](https://docs.victoriametrics.com/victoriametrics/metricsql/#count_values),
[label_replace](https://docs.victoriametrics.com/victoriametrics/metricsql/#label_replace)
expect string arguments, and `getString()` checks if the result from a
string expr query.
Previously, error messages were not intuitive, now
`label_replace("","","","",up)` and `label_replace("","","","",1)`
should return clearer error message.
2025-08-13 13:05:00 +02:00
Hui Wang
e313874d01 vmalert: fix the {{ $activeAt }} variable value in annotation templ… (#9576)
…ating when the alert has already triggered

fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9543, 
bug was introduced in
[v1.101.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.101.0)
with
a84491324d.
2025-08-13 12:59:00 +02:00
Hui Wang
58a4e48901 vmalert: fix potential data race and missing firing states when repla… (#9559)
…ying alerting rule with `-replay.ruleEvaluationConcurrency>1`
2025-08-13 12:56:27 +02:00
Roman Khavronenko
f99e49c15d dashboards/victoriametrics-cluster: show max 99th percentile on vmselect panels (#9555)
Before, we showed summarized 99th percentile for query complexity across
all available instance. This doesn't make much sense, as it doesn't
answer on the following questions:
1. What complexity limits to set per vmselect
2. What are the most expensive queries

The change is to use `max` instead of `sum`, to show only outliers, the
heaviest served queries. The update should help answering on questions
above.

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-08-12 16:43:07 +02:00
Andrii Chubatiuk
1ba994970b metricsql: fixed gaps in histogram_quantile calculation, when first bucket contains NaNs (#9547)
fixes case, when `histogram_quantile` result contains gaps, that occur
in same time range, where NaNs are present in a first bucket of a
histogram

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2025-08-12 16:41:58 +02:00
Hui Wang
25cd5637bc app/vmagent: add time series metadata support
By default, `vmagent` doesn't parse
[metadata](https://github.com/prometheus/docs/blob/main/docs/instrumenting/exposition_formats.md)
when scraping targets, and drops metadata that received via [Prometheus remote write v1(https://prometheus.io/docs/specs/prw/remote_write_spec/) or
[OpenTelemetryprotocol](https://github.com/open-telemetry/opentelemetryproto/blob/v1.7.0/opentelemetry/proto/metrics/v1/metrics.proto).

To enable parsing metadata when scraping and sending metadata to the
configured `-remoteWrite.url`, set `-enableMetadata=true`.

Besides native metadata fields, vmagent also adds tenant info to
metadata when `-enableMultitenantHandlers` is enabled and data is sent
via the multitenant endpoints (/insert/<accountID>/<suffix>), allowing
storing metadata under different tenants in VictoriaMetrics cluster.
However, if `vm_account_id` or `vm_project_id labels` are added directly
in metrics labels and send to the [vminsert multitenantendpoints](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy-via-labels),
tenant info won't be attached in the metadata, and it will be stored in
the default tenant of VictoriaMetrics cluster.

part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2974
2025-08-12 15:19:50 +02:00
Nikolay
f668e5d9c6 lib/storage: cardinality limiter prevent performance degradation on limit hit
Previously, if limit was reached for cardinality limiter, vmstorage
started to perform index lookups for any series exceed limit. Since
storage must skip index creation for such series, it's not possible to
cache it. It resulted into opposite effect of cardinality limiter -
instead of reducing resource usage, it increased it instead.

 This commit changes cardinality limit calculation from metricID to the
hash from raw metricName. It could slightly increase CPU usage if
cardinality limiter is configured, since hash must be calculated for
each metricName row. But it mitigates excessive CPU and memory usage on
limit hit

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9554
2025-08-12 11:32:31 +02:00
Nikolay
f4548a46a7 docs: add vmselect group and vmstorage node auto-discovery 2025-08-12 11:31:48 +02:00
Max Kotliar
7048de8d20 docs: add available from hint for -rpc.handshakeTimeout flag
follow up on
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9541
2025-08-12 10:12:02 +03:00
Max Kotliar
cba4b2f0df lib/handshake: set deadline for whole handshake; change deadline (1s per op to 3s whole process) (#9541)
The current one-second timeout for individual read or write operations
during the handshake phase has proven to be insufficient in some
scenarios
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9345. For
example, short-lived CPU spikes lasting a few seconds can cause
handshake failures due to the low timeout threshold.

While a small timeout may work well in environments with fast and
reliable networking, such as within a single datacenter, it becomes
problematic in more complex setups—particularly in a [multi-level
cluster
setup](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multi-level-cluster-setup)
where the top-level vmselect may reside in a different availability zone
and work on a less reliable network.

Another issue with the per-operation timeout approach is that it allows
the total time for a handshake to accumulate significantly in the
worst-case scenario. If each operation experiences a delay just under
the timeout threshold, the entire handshake process could take up to 6s.
Which accounts for 60% of `-search.maxQueueDuration` and leaves only 4s
for the actual query.

Introducing a single timeout for the entire handshake process would
provide more predictable behavior and improve usability from a
configuration standpoint. The timeout for the whole handshake op is also
easier to understand from the operator's point of view. Increasing the
timeout value and providing a configuration option for it would make the
system more resilient to transient conditions like CPU contention and
better suited for use cases involving cross-AZ communication.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9345

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-11 19:35:55 +03:00
Aliaksandr Valialkin
06f590ee63 lib/envtemplate: allow referring non-existing environment variables in config files and in command-line flags
A few users reported unexpected errors when environment variables referred other environment variables
at VictoriaMetrics startup. This resulted in the following fatal error on startup:

    cannot expand "..." env var value "...%{SOME_NON_EXISTING_ENV_VAR}..."

Fix this by leaving placeholders with non-existing env vars as is.
This improves the general usability of environment variables by VictoriaMetrics components
inside command-line flags and inside config files. User can easily notice placeholders with non-existing
environment variables by looking at the corresponding command-line flag or at the corresponding config option value.

While at it, replace duplicate docs about environment variables at the https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#environment-variables
with the link to the same docs at https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#environment-variables .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3999
2025-08-09 21:05:13 +02:00
Aliaksandr Valialkin
5a572387cf deployment/docker: update Go builder from Go1.24.5 to Go1.24.6
See https://github.com/golang/go/issues?q=milestone%3AGo1.24.6+label%3ACherryPickApproved
2025-08-08 20:21:57 +02:00
Charles-Antoine Mathieu
c6b165ecba app/vmselect: truncate graphite excessive pathExpression field
vmselect is experiencing memory exhaustion and OOM kills
when processing complex Graphite queries with nested functions and large
numbers of label selectors (30k+ values).

The root cause was unbounded growth of the pathExpression field.

 This commit adds configurable truncation for Graphite pathExpression fields to
prevent memory exhaustion while preserving query functionality:

New flag: -search.maxGraphitePathExpressionLen=1024 (default 1024
characters)
Safe truncation: Long expressions are truncated with "..." suffix
Zero disables: Set to 0 to disable truncation entirely

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9534/
2025-08-08 13:44:29 +02:00
Nikolay
c207c32c44 lib/netutil: properly accept proxy protocol
Previously, tcp listener perform synchronous proxy protocol header
read during connection accept. It could significantly reduce vmauth
performance and lead to timeout at serving http requests.

 This commit changes this logic and performs proxy protocol header
parsing during first Read request from connection or RemoteAddr method
call. It significantly improves performance and reduce possible
bottleneck at connections accept method.

Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9546/
2025-08-07 12:25:35 +02:00
Alexander Frolov
1beb1f69d5 vmselect: properly release tmp blocks for /federate
The `/federate` endpoint handler might return early before calling
`rss.RunParallel()`, which causes temporary block files to not be closed
properly.

Related PR: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9536
2025-08-06 18:20:06 +02:00
Andrii Chubatiuk
5266bf1f3b docs: override canonical url of pages, that have multiple copies (#9550)
### Describe Your Changes

multiple pages, that reference same document in `{% content %}`
shortcode same content, but different canonical URLs, added canonical
parameter to override default url

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-08-06 16:13:20 +02:00
Roman Khavronenko
d4aefcecc4 docs: mention series of articles on VM internals in FAQ (#9528)
While there, mention https://victoriametrics.com/blog in the articles
section, as it seems not being mentioned anywhere.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Mathias Palmersheim <mathias@victoriametrics.com>
2025-08-06 16:11:10 +02:00
Zakhar Bessarab
93c373d55a dashboards/vmagent: fix expression for samples rate (#9530)
In case vmagent does not scrape any metrics left part will be evaluated
as empty resulting in right part being skipped.

Before:
<details>
<img width="1401" height="1080" alt="image"
src="https://github.com/user-attachments/assets/c242593f-8503-4bd2-b6a7-85c1dcc54d0f"
/>
</details> 

After:
<details>
<img width="1416" height="1128" alt="image"
src="https://github.com/user-attachments/assets/45565c28-a731-4f5d-af54-1ab3daf75778"
/>
</details>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2025-08-06 16:08:02 +02:00
Hui Wang
58bc05ce56 vmalert-tool: fix panic when rule execution fails (#9540)
fix https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9526, 
bug was introduced from **v1.114.0**.

Please note, the rule execution failure should only happen if there is a
bad template or duplicated alert(rare case), added a test case to cover
the template.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2025-08-06 16:02:22 +02:00
Roman Khavronenko
516a454f0a docs: update monitoring section (#9538)
* remove duplicated content between single and cluster versions
* mention recommendation to group component types by jobs in scrape
config
* link the example of scrape configs
* update wording

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-08-06 16:00:25 +02:00
Max Kotliar
5a75b93535 docs/changelog: remove mention of latest Docker tag deprecation, clarify stable tag removal 2025-08-05 18:48:18 +03:00
f41gh7
787bf8ffed docs/cluster: follow-up after 33392e1135
Mention new logNewSeriesAuthKey flag at docs
2025-08-04 17:09:11 +02:00
f41gh7
f4bbb83b6a docs/changelog: add v1.110.15 and v1.122.1 changes
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-08-04 17:07:28 +02:00
f41gh7
b0409910dc docs: update LTS releases versions
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-08-04 17:05:25 +02:00
f41gh7
b421f43532 docs: mention v1.123.0 release at examples
Signed-off-by: f41gh7 <nik@victoriametrics.com>
2025-08-04 17:04:21 +02:00
Aliaksandr Valialkin
53d8e99987 app/vmstorage: expose vm_total_disk_space_bytes metric, which shows disk volume size for -storageDataPath directory
This metric can be used for building alerts and graphs for free disk space usage percentage by using the following MetricsQL query:

    100 * (vm_free_disk_space_bytes / vm_total_disk_space_bytes)
2025-08-04 10:05:45 +02:00
f41gh7
fbe5ddcc2b CHANGELOG.md: cut v1.123.0 release 2025-08-01 14:54:19 +02:00
Aliaksandr Valialkin
c1cde61f20 docs/victoriametrics/changelog/CHANGELOG_2024.md: typo fix: steaming -> streaming
This is a follow-up for 8a7045e206
2025-08-01 13:33:46 +02:00
hagen1778
b8e82eef72 docs: fix copy&paste typo in stream aggregation
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-08-01 09:19:36 +02:00
Dima Shur
8a7045e206 Fixed typo (#9524)
### Describe Your Changes

Fixed typo (steaming aggregation -> streaming aggregation)
Updated vmalert doc - there were counterintuitive links, made it clearer

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
2025-07-31 17:52:00 +03:00
leiwingqueen
33392e1135 app/vmstorage: introduce /internal/log_new_series API
This commit introduces new storage API: `/internal/log_new_series`.
It helps to dynamically debug newly created series. Changing `-logNewSeries` value requires storage restart,
which may introduce downtime and is not recommended for production deployments.

 In addition, this commit adds flags: `-logNewSeriesAuthKey`, which protects newly added API.

Fixes: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8879
2025-07-31 13:07:44 +02:00
Aliaksandr Valialkin
656bcb23d3 docs/victoriametrics/Articles.md: add https://itnext.io/kubernetes-monitoring-a-complete-solution-part-1-architecture-eb5b998658d5 2025-07-31 11:56:03 +02:00
Aliaksandr Valialkin
c582b3a485 docs/victoriametrics/Articles.md: add https://medium.com/@isasamor/optimizing-datadog-costs-with-victoriametrics-a-practical-step-by-step-guide-c984d32c7423 2025-07-31 11:32:56 +02:00
Aliaksandr Valialkin
b2d8fc4b97 docs/victoriametrics/Articles.md: add https://medium.com/@heliodevhub/construindo-uma-stack-de-monitoramento-escal%C3%A1vel-e-econ%C3%B4mica-na-aws-com-victoria-metrics-532d535dcfb6 2025-07-31 11:31:39 +02:00
Aliaksandr Valialkin
7c3679075e docs/victoriametrics/Articles.md: add https://davidhernandez21.github.io/posts/Victoriametrics-k8s-stack-gotchas/ 2025-07-31 10:54:05 +02:00
dstevensson
7dd9407b94 lib/promscrape/discovery/gce: add support for ipv6 in metadata labels
This change will expose any IPv6 addresses assigned to an instance under
the meta labels:
* `__meta_gce_public_ipv6` -native IPv6 address, globally routed
* `__meta_gce_internal_ipv6` - unique local address (ULA).

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9370
2025-07-31 09:31:33 +02:00
Yury Molodov
59486f0bc1 vmui: always show tenant selector if tenant list is not empty
Previously, the tenant selector was hidden when only one tenant
was returned, making it impossible to run queries in multi-tenant mode.
Now, the selector is always shown as long as at least one tenant exists.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9396
2025-07-31 09:20:53 +02:00
Nikolay
98659633cc app/vmauth: properly set useProxyProtocol for httpInternalListenAddr
Commit e77df5d00b introduced
unintentional change, which prevents from using httpInternalListenAddr.
Which is designed to use with clients that do not support proxy
protocol.

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9515
2025-07-31 09:18:24 +02:00