The core `lib/promauth` already supports `usernameFile`
configs, but the CLI flags for vmagent remotewrite and vmalert
datasource/remotewrite/remoteread/notifier only expose
`basicAuth.username`.
This commit adds the corresponding `basicAuth.usernameFile` flags to match
the existing `basicAuth.passwordFile` pattern, closing the gap between
YAML and CLI configuration.
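For example, with vmagent (the flag names here mirror the existing passwordFile pattern and are assumptions about the final spelling):
```
# before: only the inline value was available via CLI
-remoteWrite.basicAuth.username=alice
# after: the username can be read from a file, like the password
-remoteWrite.basicAuth.usernameFile=/secrets/rw-username
-remoteWrite.basicAuth.passwordFile=/secrets/rw-password
```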
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9436
In most cases, vmalert is configured to write to VictoriaMetrics
components such as vminsert or vmagent, where using the VictoriaMetrics
remote write protocol can save network bandwidth.
The VictoriaMetrics remote write protocol is now used by default, and
the protocol is downgraded from VictoriaMetrics to Prometheus remote
write if a request fails with a protocol error.
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10929
Add '=' separator between label name and value when computing the hash
to prevent false collisions, like {a="bc"} and {ab="c"} hashing to the
same value.
getLabelsHashForShard is added to avoid sharding disruptions in vmagent
(-remoteWrite.shardByURL=true mode). The function preserves previous
behavior, without '=' between name and value.
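The collision can be illustrated with a short sketch (the hash function and names here are illustrative, not the actual vmagent implementation):
```go
package main

import (
	"fmt"
	"hash/fnv"
)

type label struct{ name, value string }

// hashLabels sketches the change: inserting a '=' separator between the
// label name and value prevents {a="bc"} and {ab="c"} from feeding the
// identical byte sequence "abc" into the hash.
func hashLabels(labels []label, withSeparator bool) uint64 {
	h := fnv.New64a()
	for _, l := range labels {
		h.Write([]byte(l.name))
		if withSeparator {
			h.Write([]byte("="))
		}
		h.Write([]byte(l.value))
	}
	return h.Sum64()
}

func main() {
	a := []label{{"a", "bc"}}
	b := []label{{"ab", "c"}}
	fmt.Println(hashLabels(a, false) == hashLabels(b, false)) // true: false collision
	fmt.Println(hashLabels(a, true) == hashLabels(b, true))   // false: hashes differ
}
```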
PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10937
When a request contains both URL path query params and POST form values
for extra_label and extra_filters[], URL query params now take
precedence. This resolves the conflict between the two sources and
simplifies security enforcement for extra_label/extra_filters policies
via vmauth or any other http proxy.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10908
This commit introduces a new metric that exposes the filesystem type for the provided path.
For example:
```
vm_fs_info{path="/vmstorage-data", fs_type="xfs"}
```
The path must be registered with the new `fs.RegisterPathFsMetrics` method.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10482
This commit adds the ability to omit the tenantID in the URL path. In this
case, the tenantID is fetched from the HTTP headers `AccountID` and `ProjectID`.
If the headers are missing as well, the default `0:0` tenantID is used.
This functionality can be enabled only if the -enableMultitenantHandlers
command-line flag is set on vminsert, vmselect or vmagent.
Motivation: this change makes the VM configuration for multitenancy
consistent with the VL configuration - see
https://docs.victoriametrics.com/victorialogs/#multitenancy - while
keeping backward compatibility at the same time.
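An illustrative request (header names follow the commit description; the exact URL paths depend on the deployment):
```
# tenantID omitted from the URL path; taken from headers instead:
curl -H 'AccountID: 42' -H 'ProjectID: 7' \
  'http://vmselect:8481/select/prometheus/api/v1/query?query=up'
# no path tenant and no headers -> the default tenant 0:0 is used
```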
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4241
The flag already exists in the ENT version. We decided to expose it in
OSS and strip the path from all public places, including all APIs
(including `/metrics`) and debug logs (where it is minor info).
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5625
Add support for storing and retrieving samples with future timestamps,
as requested in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/827
What to expect:
- By default, the max future timestamp is still limited to `now+2d`. To
change it, set the `-futureRetention` flag in `vmstorage`. The max flag
value is currently limited to `100y`. It can be extended if we see a
demand for this, but it can't be more than `~ 290y` due to how the time
duration is implemented in Go. The flag value can't be less than `2d`.
- Downsampling and retention filters (available in the enterprise
edition) are currently not supported for future timestamps.
- If `vmstorage` restarts with a smaller value of `-futureRetention`
flag, any future partitions that are outside the new future retention
will be automatically deleted.
- Data ingestion, data retrieval, backup/restore, timeseries (soft)
deletion, and other operations work with future timestamps the same way
as with the historical timestamps.
- In the cluster version, the affected binaries are `vmstorage` and
`vmselect`. This means that `vmselect` version must match `vmstorage`
version if you want to query future timestamps. `vminsert` is not
affected, so it can run an older version.
- If you downgrade `vmstorage`, the data with future timestamps will
remain on disk and in memory (per-partition caches) but won't be
available for querying.
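For example (on `vmstorage`):
```
# accept samples with timestamps up to 30 days in the future
# (default: 2d; allowed range: 2d .. 100y):
-futureRetention=30d
```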
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Signed-off-by: Artem Fetishev <149964189+rtm0@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Previously, starting the backend URL health checks could produce a data
race and a race condition.
The following panic could occur:
`panic: sync: WaitGroup is reused before previous Wait has returned`
It happened because a concurrent goroutine could be processing a request
while the configuration was reloaded and the stopHealthChecks method was
called.
This commit adds a dedicated structure for backend health checks, which
protects against the data race with a mutex guard and prevents the race
condition with a boolean flag.
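A minimal sketch of the approach, with illustrative names (not the actual vmauth code): the mutex guards the WaitGroup, and the boolean flag refuses new checks once Stop has begun, so `Add` can never race with `Wait`.
```go
package main

import (
	"fmt"
	"sync"
)

// healthChecker owns the WaitGroup and the stopped flag behind one mutex.
type healthChecker struct {
	mu      sync.Mutex
	wg      sync.WaitGroup
	stopped bool
}

// Start launches a health check unless the checker was already stopped
// (e.g. due to a configuration reload).
func (hc *healthChecker) Start(check func()) bool {
	hc.mu.Lock()
	defer hc.mu.Unlock()
	if hc.stopped {
		return false // config was reloaded; refuse to start a new check
	}
	hc.wg.Add(1)
	go func() {
		defer hc.wg.Done()
		check()
	}()
	return true
}

// Stop marks the checker stopped and waits for in-flight checks.
func (hc *healthChecker) Stop() {
	hc.mu.Lock()
	hc.stopped = true
	hc.mu.Unlock()
	hc.wg.Wait() // safe: no Add can race with Wait once stopped is set
}

func main() {
	hc := &healthChecker{}
	started := hc.Start(func() {})
	hc.Stop()
	fmt.Println(started, hc.Start(func() {})) // true false
}
```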
Fixes: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10806
Previously, `GetData` in the OpenTSDB client returned an empty `Metric{}`
with a `nil` error under several conditions (multiple series returned,
aggregate tags present, `modifyData` failures), causing `vmctl opentsdb`
to silently drop series during migration.
This commit changes these silent return paths to return proper errors
with descriptive messages that include the query string, so operators can
detect and diagnose partial migrations.
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10797
Previously, if a rule label value was set to an empty string, vmalert
ignored this label when merging labels with those from the data source
response. In contrast, Prometheus removes the data source label in this
case as well, which allows performing a label delete operation.
This commit uses the same logic as Prometheus for resolving label
conflicts and allows removing labels.
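A sketch of the Prometheus-style conflict resolution adopted here (illustrative, not the actual vmalert code):
```go
package main

import "fmt"

// mergeLabels merges rule labels into series labels. A rule label with an
// empty value deletes the corresponding label that came from the data
// source response.
func mergeLabels(series, rule map[string]string) map[string]string {
	result := make(map[string]string, len(series)+len(rule))
	for k, v := range series {
		result[k] = v
	}
	for k, v := range rule {
		if v == "" {
			delete(result, k) // empty rule label performs a delete
			continue
		}
		result[k] = v // non-empty rule labels override series labels
	}
	return result
}

func main() {
	merged := mergeLabels(
		map[string]string{"job": "node", "env": "prod"},
		map[string]string{"env": ""},
	)
	fmt.Println(merged) // map[job:node]
}
```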
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10766
Previously, after RoundTrip returned successfully (err == nil, res != nil), the code checked whether the original client request's context was canceled. If canceled, it returned immediately without closing res.Body.
There is a race window where:
1) RoundTrip completes successfully (res is non-nil)
2) The client cancels the request context (closes connection)
3) The context check at line 484 sees the cancellation
4) The function returns without closing res.Body
The response body holds a reference to the underlying TCP connection. Without closing it, the connection is permanently leaked along with the transport goroutines (readLoop + writeLoop or dialConnFor).
bug was introduced at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10233
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10833
**"Run query" link params**
Added correct params to "Run query" link on Alerting Rules page:
- `g0.step_input` - set to `group.interval` (in seconds)
- `g0.end_time` - set to `rule.lastEvaluation` / `alert.activeAt`
- `g0.relative_time=none` - to fix the time range
**Time display timezone**
Changed `t.format(...)` to `t.tz().format(...)` to display time in the
user-selected timezone.
Related issues:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10366
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10827
vmsingle shuts down vminsert before closing the ingestion rate limiter, even though the rate limiter API explicitly requires the opposite order to unblock callers. vminsert.Stop() waits for unmarshal workers, which can be blocked in ingestionRateLimiter.Register() when the limit is hit.
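The required shutdown order can be sketched as follows (illustrative names, not the actual vmsingle code): closing the limiter first releases workers blocked in Register, so waiting for them cannot deadlock.
```go
package main

import (
	"fmt"
	"sync"
)

// rateLimiter blocks callers in Register until MustStop releases them.
type rateLimiter struct {
	release chan struct{}
}

func (rl *rateLimiter) Register() {
	<-rl.release // blocks while the limit is hit, until MustStop
}

func (rl *rateLimiter) MustStop() {
	close(rl.release) // unblocks all pending Register callers
}

func main() {
	rl := &rateLimiter{release: make(chan struct{})}
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		rl.Register() // an unmarshal worker blocked on the limiter
	}()
	rl.MustStop() // correct order: close the limiter first...
	wg.Wait()     // ...then waiting for the workers cannot deadlock
	fmt.Println("shutdown completed")
}
```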
This reverts commit b3c03c023c.
Reason for revert: the original logic was correct from the user's perspective:
- The -maxRequestBodySizeToRetry command-line flag controls the size of the request body,
which could be retried on backend failure. The meaning of this flag wasn't changed after
the introduction of the -requestBufferSize flag in the commit e31abfc25c
(see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10309 )
- The -requestBufferSize flag controls the size of the buffer for reading the request body
before sending it to the backend and before applying concurrency limits.
These flags are independent from the user's perspective. The fact that these flags share the
implementation shouldn't be known to the user - this is an implementation detail, which allows
avoiding double buffering.
Both flags enable request buffering. If the user wants to disable all request buffering,
then both flags must be set to 0. That's why these flags are cross-mentioned in their -help descriptions.
Also the reverted commit had the following issues:
- It reduced the default value for the -requestBufferSize flag from 32KiB to 16KiB.
The 32KiB value has been calculated and justified at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10309 .
It shouldn't increase vmagent memory usage too much for typical workloads.
For example, if vmagent handles 10K concurrent requests, then the memory overhead for the request buffering
will be 10K*32KiB=320MiB. This is a small price for being able to efficiently handle 10K concurrent requests.
- It added a dot to the end of the https://docs.victoriametrics.com/victoriametrics/vmauth/#request-body-buffering link
in the description of the -requestBufferSize flag. This breaks clicking the link in some environments,
since the trailing dot is considered a part of the url.
- It added a superfluous whitespace in front of the 'Disabling request buffering' text inside the description
for the -requestBufferSize flag.
- It introduced an unnecessary complexity to the user by mentioning that the zero value
of -maxRequestBodySizeToRetry disables buffering for request retries (these things must be independent
from the user's perspective).
- It changed the bufferedBody logic in non-trivial ways, which aren't related to the original issue.
If these changes are needed, then they must be justified in a separate issue and must be prepared
in a separate pull request / commit.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10675
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10677
Align group evaluation time with the `eval_offset` option to allow users
to manage group execution more effectively by understanding the exact
time each group will be scheduled, particularly in cases of spreading
rule execution within a window, chaining groups, or debugging data delay
issues.
If the group evaluation takes less than the group interval, but the
initial evaluation combined with the additional restore operation
exceeds the group interval, the evaluation time will be gradually
corrected in subsequent evaluations, as the interval ticker schedule
remains unchanged.
For groups without `eval_offset`, this change also ensures that all
evaluations follow the interval. Previously, the gap between the first
and second evaluations was larger than the interval. The `eval_delay`
setting continues to help prevent partial responses.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10772.
Follow-up commit for
211fb08028
Address @f41gh7 review comments:
- Move code from `lib/osinfo` to `lib/appmetrics`.
- Make the logic private.
- Use metrics.WriteGaugeUint64 func.
- Remove registration logic from `app/xxx/main.go`.
- Remove `lib/osinfo` package.
This commit adds new metrics `vmalert_remotewrite_queue_capacity` and
`vmalert_remotewrite_queue_size`. The latter is updated with each push,
and its update frequency depends on `-remoteWrite.concurrency` and
`-remoteWrite.flushInterval`.
While it doesn't account for the pending data within each pusher's
request, it should provide a general indication of queue usage.
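Hypothetical `/metrics` output (values are illustrative):
```
vmalert_remotewrite_queue_capacity 100000
vmalert_remotewrite_queue_size 1230
```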
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10765
Automatically set daily and hourly series limits to `MaxInt32` when `remoteWrite.maxHourlySeries` or `remoteWrite.maxDailySeries` is set to `-1`.
This change addresses a usability issue with the cardinality limiter. Users may want to enable the limiter to observe its metrics before deciding on an appropriate limit. However, the underlying bloom filter only supports `int32`, so setting large values can lead to overflow.
With this PR:
* Setting either flag to `-1` is treated as “no practical limit” and internally mapped to `math.MaxInt32`
* Values exceeding `int32` are safely clamped to `MaxInt32` to prevent overflow
This allows users to enable the limiter for estimation purposes without risking invalid configurations or runtime issues.
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9614
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
The previously introduced flag `requestBufferSize` raised the default
value for the in-memory buffer from 16KB to 32KB. It could increase
memory usage for vmauth. It also made it unclear how to actually disable
request buffering.
This commit aligns the flag's value to 16KB and disables request
buffering if either flag's value is 0, as mentioned in the flags'
descriptions.
If either flag has a non-default value, that value is used as the max
size for the request buffer. If both flags are modified, the bigger
value wins.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10675
I expect the change to help in two ways:
1. Spreading remote write flushes over the flush interval to avoid
congestion at the remote write destination;
2. Enhancing queue data consumption. Currently, all flushers may always
flush data simultaneously, resulting in periods where no flushers are
consuming data from the queue, which increases the risk of reaching the
queue limit `remoteWrite.maxQueueSize` even with an increased
`remoteWrite.concurrency`. By making the flushers more dispersed, it is
more likely that some flushers are consistently consuming data from the
queue, which should make queue management easier.
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10729/
Add per-URL `-remoteWrite.disableMetadata` flag to control metadata
sending for each remote storage independently.
After v1.137.0 enabled `-enableMetadata` by default, metadata is sent to
ALL remote write targets, even those with relabeling filters that drop
most metrics. This causes unnecessary growth in
`vmagent_remotewrite_requests_total` and a significant increase in
network load for heavily filtered remote write destinations.
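A hypothetical vmagent invocation (per-URL flags in vmagent accept comma-separated values, one per `-remoteWrite.url`):
```
-remoteWrite.url=http://vm-main:8428/api/v1/write
-remoteWrite.url=http://vm-filtered:8428/api/v1/write
# keep metadata for the first target, drop it for the heavily filtered one:
-remoteWrite.disableMetadata=false,true
```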