The optimization includes the following improvements:
- Implementation of a function that processes 8 bytes per loop iteration to locate ASCII characters using bitwise manipulations.
- Implementation of the ToLowercaseFunc function that avoids copying the string if it is already lowercase.
- Use of a lookup table for converting ASCII characters to lowercase, with logic copied from the VictoriaLogs repository.
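The two techniques can be sketched as follows. This is a minimal illustration with assumed names, not the actual implementation (which avoids the per-iteration byte-slice conversion shown here): a word-at-a-time high-bit check finds non-ASCII bytes 8 at a time, and a 256-entry lookup table lowercases bytes while skipping the copy for already-lowercase input.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// asciiToLower maps each byte to its ASCII lowercase form.
var asciiToLower = func() (t [256]byte) {
	for i := range t {
		c := byte(i)
		if c >= 'A' && c <= 'Z' {
			c += 'a' - 'A'
		}
		t[i] = c
	}
	return
}()

// isASCII scans 8 bytes per loop iteration: a byte is non-ASCII iff its high
// bit is set, so AND-ing the word with 0x8080... detects any such byte at once.
func isASCII(s string) bool {
	i := 0
	for ; i+8 <= len(s); i += 8 {
		if binary.LittleEndian.Uint64([]byte(s[i:i+8]))&0x8080808080808080 != 0 {
			return false
		}
	}
	for ; i < len(s); i++ {
		if s[i] >= 0x80 {
			return false
		}
	}
	return true
}

// toLower returns s unchanged (no copy) when it is already lowercase.
func toLower(s string) string {
	needsCopy := false
	for i := 0; i < len(s); i++ {
		if asciiToLower[s[i]] != s[i] {
			needsCopy = true
			break
		}
	}
	if !needsCopy {
		return s
	}
	b := make([]byte, len(s))
	for i := 0; i < len(s); i++ {
		b[i] = asciiToLower[s[i]]
	}
	return string(b)
}

func main() {
	fmt.Println(isASCII("hello"), isASCII("héllo")) // true false
	fmt.Println(toLower("Hello"), toLower("ok"))    // hello ok
}
```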
Just run a simple bash command without the heavyweight Docker image.
While at it, rely on the TAG environment variable instead of the PKG_TAG env variable
for `make docs-update-version`, in order to be consistent with other Make commands.
The change is needed to group the splitting/sharding sections of the documentation,
so they go one after another. This should improve readability.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The previous description didn't mention that relabeling can be used
for filtering scrape targets. Adding this mention.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
These links were removed in 134501bf99
without adding a complete substitute for their content.
Restoring these links as they can be useful for readers to learn about relabeling.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The old links were removed in #10754
on the mistaken assumption that Google hadn't indexed them. However, it had, and users can get a 404
when searching Google for VM playgrounds.
Restoring the links via aliases. It means hugo will serve the `/playgrounds` page when
user requests `/playgrounds/victoriametrics/`.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Fix app tests:
1. Sync code between vmsingle and vmcluster: it must be the same because
apptest does not differentiate between branches, it just runs pre-built
binaries
2. Simplify range queries in backup/restore test so that it does not
depend on the interval between samples to work correctly.
---------
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Previously (*writeconcurrencylimiter.Reader).Read() could permanently leak concurrency tokens from the -maxConcurrentInserts semaphore.
Consider the following example:
* GetReader() acquires a token, then PutReader() unconditionally releases it.
* Read() calls DecConcurrency() before the underlying I/O and IncConcurrency() after it. If IncConcurrency() returns an error, Read() returns without holding a token.
* Each such failure permanently removes one slot from the concurrencyLimitCh semaphore. Slots leak one by one until the channel is fully drained, at which point DecConcurrency() blocks forever, deadlocking ingestion on vmstorage.
This commit adds tracking of obtained tokens to the reader, which prevents possible token leakage.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10784
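The fix can be sketched as follows. This is an illustrative model with assumed names, not the actual writeconcurrencylimiter code: the reader remembers whether it currently holds a concurrency token, so the token is released exactly once no matter which code path returns.

```go
package main

import "fmt"

// tokenTrackingReader models a reader that tracks token ownership so error
// paths cannot leak semaphore slots (illustrative, not the actual code).
type tokenTrackingReader struct {
	sem      chan struct{} // buffered channel acting as the concurrency semaphore
	hasToken bool
}

// acquire takes a slot only if the reader doesn't already hold one.
func (r *tokenTrackingReader) acquire() {
	if !r.hasToken {
		r.sem <- struct{}{}
		r.hasToken = true
	}
}

// release frees the slot only if the reader actually holds one,
// so repeated calls (e.g. on error paths) cannot drain the semaphore.
func (r *tokenTrackingReader) release() {
	if r.hasToken {
		<-r.sem
		r.hasToken = false
	}
}

func main() {
	r := &tokenTrackingReader{sem: make(chan struct{}, 2)}
	r.acquire()
	r.release()
	r.release() // no-op: no slot is leaked by the double release
	fmt.Println(len(r.sem)) // 0
}
```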
This reverts commit b3c03c023c.
Reason for revert: the original logic was correct from the user's perspective:
- The -maxRequestBodySizeToRetry command-line flag controls the size of the request body,
which could be retried on backend failure. The meaning of this flag wasn't changed after
the introduction of the -requestBufferSize flag in the commit e31abfc25c
(see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10309 )
- The -requestBufferSize flag controls the size of the buffer for reading request body
before sending it to the backend and before applying concurrency limits.
These flags are independent from the user's perspective. The fact that these flags share the implementation
shouldn't be known to the user - this is an implementation detail, which allows avoiding double buffering.
Both flags enable request buffering. If the user wants to disable all request buffering,
then both flags must be set to 0. That's why these flags are cross-mentioned in their -help descriptions.
Also the reverted commit had the following issues:
- It reduced the default value for the -requestBufferSize flag from 32KiB to 16KiB.
The 32KiB value has been calculated and justified at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10309 .
It shouldn't increase vmagent memory usage too much for typical workloads.
For example, if vmagent handles 10K concurrent requests, then the memory overhead for the request buffering
will be 10K*32KiB=320MiB. This is a small price for being able to efficiently handle 10K concurrent requests.
- It added a dot to the end of the https://docs.victoriametrics.com/victoriametrics/vmauth/#request-body-buffering link
in the description of the -requestBufferSize flag. This breaks clicking the link in some environments,
since the trailing dot is treated as part of the URL.
- It added a superfluous whitespace in front of the 'Disabling request buffering' text inside the description
for the -requestBufferSize flag.
- It introduced an unnecessary complexity to the user by mentioning that the zero value
at -maxBufferSize disables buffering for request retries (these things must be independent
from the user's perspective).
- It changed the bufferedBody logic in non-trivial ways, which aren't related to the original issue.
If these changes are needed, then they must be justified in a separate issue and must be prepared
in a separate pull request / commit.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10675
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10677
Previously, Storage.table was initialized after startFreeDiskSpaceWatcher was called.
This created a potential data race condition: if openTable took a long time to complete
and freed disk space during that window, the free disk space watcher could read an
uninitialized (or partially initialized) Storage.table, leading to an invalid memory
address or nil pointer dereference panic.
This commit properly initializes s.isReadOnly state during storage start and
starts FreeDiskSpaceWatcher after openTable.
Bug was introduced in github.com/VictoriaMetrics/VictoriaMetrics/commit/27b958ba8bc66578206ddac26ccf47b2cc3e8101
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10747
Align group evaluation time with the `eval_offset` option to allow users
to manage group execution more effectively by understanding the exact
time each group will be scheduled, particularly in cases of spreading
rule execution within a window, chaining groups, or debugging data delay
issue.
If the group evaluation takes less than the group interval, but the
initial evaluation combined with the additional restore operation
exceeds the group interval, the evaluation time will be gradually
corrected in subsequent evaluations, as the interval ticker schedule
remains unchanged.
For groups without `eval_offset`, this change also ensures that all
evaluations follow the interval. Previously, the gap between the first
and second evaluations was larger than the interval. And the
`eval_delay` continues to help prevent partial responses.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10772.
Follow-up commit for
211fb08028
Address @f41gh7 review comments:
- Move code from `lib/osinfo` to `lib/appmetrics`.
- Make the logic private.
- Use metrics.WriteGaugeUint64 func.
- Remove registration logic from `app/xxx/main.go`.
- Remove `lib/osinfo` package.
At 00:00 UTC the ingested samples start to have timestamps for the new
day (the ingested samples are always recent). Even though there was a
next-day prefill of the per-day index during the last hour of the day,
some performance degradation is still possible.
For example, in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10698
it is manifested as `vminsert-to-vmstorage connection saturation` peaks
right after midnight.
A possible hypothesis for why this is happening: at midnight,
currHourMetricIDs is empty and prevHourMetricIDs cannot be used because
it holds metricIDs for the previous day. So the ingestion logic hits
dateMetricIDsCache, which may not have the metricID in its read-only
buffer and therefore has to acquire a lock to check its prev read-only
buffer or read-write buffer. This creates lock contention and therefore
raises ingestion request latency.
A solution to this could be re-using the nextDayMetricIDs during the
first hour of the day. During this time, it is equivalent to
currHourMetricIDs.
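The proposed lookup order can be sketched as follows. This is an illustrative model, not the actual storage code; the function and variable names are assumptions. During hour 0 of a day, the "next day" set prefilled yesterday holds the same metricIDs as the current-hour set, so it can answer membership checks without touching the contended cache.

```go
package main

import "fmt"

// isKnownMetricID sketches the fallback described above: check the
// current-hour set, then (during the first hour only) reuse yesterday's
// next-day prefill, and only after that fall back to the slower
// dateMetricIDs cache (modeled here as a plain "false" return).
func isKnownMetricID(id uint64, hourOfDay int, currHour, nextDay map[uint64]bool) bool {
	if currHour[id] {
		return true
	}
	if hourOfDay == 0 && nextDay[id] {
		return true // reuse yesterday's next-day prefill right after midnight
	}
	return false // would fall through to the contended dateMetricIDsCache
}

func main() {
	nextDay := map[uint64]bool{42: true}
	// Right after midnight currHour is empty, but the prefill answers the check.
	fmt.Println(isKnownMetricID(42, 0, map[uint64]bool{}, nextDay)) // true
	fmt.Println(isKnownMetricID(42, 1, map[uint64]bool{}, nextDay)) // false
}
```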
---------
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Signed-off-by: Artem Fetishev <149964189+rtm0@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
This change reverts part of the changes in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10686
Motivation: the docs added in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10686 are in most cases too verbose, AI-generated, and of low practical value.
The improvement goal: remove bloat from the docs and keep them practical and useful.
What it does:
- Completely removes items from the sidebar
- Moves the content of the most important playground pages to the
`/playground/` stub (README.md). Use H2s for each playground.
- Updates and cleans the text.
- Removes the individual children pages in the playground category (keep
only the `/playgrounds/` page/stub and remove the children).
- Removes items as these don't really need much introduction or aren't
playgrounds:
- log to logsql: a conversion tool
- sql to logsql: same
- adds Grafana playground section
Links of child pages will become invalid. We don't preserve them, as this is a pretty new doc (one week in production) and links to it are unlikely to have persisted anywhere yet.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
This commit adds new metrics `vmalert_remotewrite_queue_capacity` and
`vmalert_remotewrite_queue_size`. The latter is updated with each push, so its
update frequency depends on `-remoteWrite.concurrency` and
`-remoteWrite.flushInterval`.
It doesn't account for the pending data within each pusher's request, but it
should provide a general indication of the queue usage.
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10765
Extract repeated code from nextDayMetricIDs synctests into separate
funcs to make the code more readable.
The change was originally introduced in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10704 and was
extracted into a separate PR to keep the original change simple.
Previously, vminsert did not account for the ingest concurrency limit in buffer size calculation.
This could lead to excessively large buffers and OOM errors when the concurrency limit was reached.
This commit fixes buffer size calculation by separating `insertCtx` and `storageNode` buffer size limits.
`storageNode` buffer size is set to a larger value, as it is allocated per configured `-storageNode`
and is independent of the concurrency limit.
`insertCtx` buffer size now accounts for the configured concurrency limit
and calculates the maximum buffer size accordingly.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10725
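The sizing idea can be illustrated with hypothetical numbers (this is not vminsert's actual formula): dividing a fixed memory budget by the concurrency limit bounds total `insertCtx` buffer memory even when the limit is fully utilized.

```go
package main

import "fmt"

// perCtxBufferSize sketches concurrency-aware buffer sizing: the total
// budget divided by the concurrency limit caps per-context buffers, so
// memory stays bounded at full load (illustrative helper, assumed numbers).
func perCtxBufferSize(memoryBudget, concurrencyLimit int) int {
	return memoryBudget / concurrencyLimit
}

func main() {
	size := perCtxBufferSize(1<<30, 64) // hypothetical 1GiB budget, 64 concurrent inserts
	fmt.Printf("%d MiB per insertCtx\n", size>>20) // 16 MiB per insertCtx
}
```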
Previously, vmselect in cluster-native mode could return partial responses to upstream vmselect.
Since upstream vmselect expects full responses (mimicking vmstorage behavior),
partial responses must be disabled in cluster-native mode.
This prevents incomplete responses from being cached at the upstream vmselect level.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10678
Automatically set daily and hourly series limits to `MaxInt32` when `remoteWrite.maxHourlySeries` or `remoteWrite.maxDailySeries` is set to `-1`.
This change addresses a usability issue with the cardinality limiter. Users may want to enable the limiter to observe its metrics before deciding on an appropriate limit. However, the underlying bloom filter only supports `int32`, so setting large values can lead to overflow.
With this PR:
* Setting either flag to `-1` is treated as “no practical limit” and internally mapped to `math.MaxInt32`
* Values exceeding `int32` are safely clamped to `MaxInt32` to prevent overflow
This allows users to enable the limiter for estimation purposes without risking invalid configurations or runtime issues.
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9614
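The mapping can be sketched as follows. The helper name is an assumption for illustration, not the exact vmagent code: `-1` means "no practical limit" and maps to `math.MaxInt32`, and values above the int32 range are clamped so the int32-based bloom filter cannot overflow.

```go
package main

import (
	"fmt"
	"math"
)

// normalizeSeriesLimit maps the user-facing flag value to a safe internal
// limit: -1 becomes MaxInt32 ("no practical limit"), and values exceeding
// int32 are clamped to MaxInt32 to prevent overflow (illustrative sketch).
func normalizeSeriesLimit(limit int64) int64 {
	if limit == -1 || limit > math.MaxInt32 {
		return math.MaxInt32
	}
	return limit
}

func main() {
	fmt.Println(normalizeSeriesLimit(-1))        // 2147483647
	fmt.Println(normalizeSeriesLimit(5_000_000)) // 5000000
	fmt.Println(normalizeSeriesLimit(1 << 40))   // 2147483647
}
```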
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Previously, the last scrape result was updated unconditionally, even when the scrape failed.
This commit updates the last scrape result only on a successful scrape. This properly accounts for the `scrape_series_added` metric and aligns it with the same metric in Prometheus.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10653
The previously introduced `requestBufferSize` flag raised the default
in-memory buffer size from 16KB to 32KB. This could increase memory usage for
vmauth. It also made it unclear how to actually disable request buffering.
This commit aligns the flag value with 16KB and disables request
buffering if either flag value is 0, as mentioned in the flag descriptions.
If either flag has a non-default value, that value is used as the max size
for the request buffer. If both flags are modified, the bigger value wins.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10675
I expect the change to help in two ways:
1. Spreading remote write flushes over the flush interval to avoid
congestion at the remote write destination;
2. Enhance queue data consumption. Currently, all flushers may always
flush data simultaneously, resulting in periods where no flushers are
consuming data from the queue, which increases the risk of reaching the
queue limit `remoteWrite.maxQueueSize` even with an increased
`remoteWrite.concurrency`. By making the flushers more dispersed, it is
more likely that some flushers are consistently consuming data from the
queue, which should make queue management easier.
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10729/
Changes:
- Added the number of `pending alerts` and `firing alerts`
- Improved `transformations` for the 'FIRING over time by group and rules' panel
- Added sorting for the 'FIRING over time by rule' panel
Signed-off-by: sias32 <sias.32@yandex.ru>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Replace 1.2 multiplier with 1.25 in disk space estimation formula.
1.2 only provides ~16.7% free space, while the docs recommend keeping
20%. Using 1.25 correctly accounts for 20% free space.
Inspired by
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10394
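The arithmetic behind the change: with multiplier m, data occupies 1/m of the provisioned space, leaving 1 - 1/m free. A quick check of both multipliers:

```go
package main

import "fmt"

// freeFraction returns the share of disk left free when disk space is
// provisioned as dataSize * m: data occupies 1/m, leaving 1 - 1/m free.
func freeFraction(m float64) float64 {
	return 1 - 1/m
}

func main() {
	fmt.Printf("m=1.20 -> %.1f%% free\n", freeFraction(1.20)*100) // ~16.7%
	fmt.Printf("m=1.25 -> %.1f%% free\n", freeFraction(1.25)*100) // 20.0%
}
```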
Add per-URL `-remoteWrite.disableMetadata` flag to control metadata
sending for each remote storage independently.
After v1.137.0 enabled `-enableMetadata` by default, metadata is sent to
ALL remote write targets, even those with relabeling filters that drop
most metrics. This causes unnecessary growth in
`vmagent_remotewrite_requests_total` and a significant increase in
network load for heavily filtered remote write destinations.
When the network between the client and the S3 server is unstable, the client may encounter temporary io.EOF errors when reading the response from the S3 server.
Currently, the S3 SDK in vmbackup uses the default retry policy. However, this default retry policy won't retry when the S3 SDK meets an unexpected EOF. This means that a temporary unexpected EOF error will cause the backup task to fail.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10699
Add link to blogpost with detailed information about zstd+rw protocol.
This PR is based on question in community channel about implementation
details.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Implemented dedicated thanos migration mode for vmctl to migrate data from Thanos installations to VictoriaMetrics.
Key features:
1. Raw and downsampled blocks support: Reads both raw blocks
(resolution=0) and downsampled blocks (5m/1h resolution) directly from
Thanos snapshots
2. All aggregate types: Imports count, sum, min, max, and counter
aggregates from downsampled blocks as separate metrics with resolution
and type suffixes (e.g., metric_name:5m:count)
3. Dedicated flags: Uses `--thanos-*` prefixed flags (--thanos-snapshot,
--thanos-concurrency, --thanos-filter-time-start,
--thanos-filter-time-end, --thanos-filter-label,
--thanos-filter-label-value, --thanos-aggr-types)
4. Selective aggregate import: Use `--thanos-aggr-types` to import only
specific aggregates
Usage:
```
vmctl thanos --thanos-snapshot /path/to/thanos-data --vm-addr http://victoria-metrics:8428
```
Closes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9262
Signed-off-by: Dmytro Kozlov <d.kozlov@victoriametrics.com>
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
## Summary
This PR implements split phase metrics for filestream operations as
requested in #10432.
### Changes
- Added `vm_filestream_fsync_duration_seconds_total` metric to track
fsync syscall duration separately
- Added `vm_filestream_fsync_calls_total` metric to count fsync calls
- Added `vm_filestream_write_syscall_duration_seconds_total` metric to
track write syscall duration (previously mixed with flush time)
- Refactored `MustClose()` and `MustFlush()` to use new `flush()` and
`sync()` helper methods
- Kept `vm_filestream_write_duration_seconds_total` for backward
compatibility
### Problem Solved
Previously, `vm_filestream_write_duration_seconds_total` was being
incremented in two places:
1. `statWriter.Write()` - triggered by `bw.Flush()` and `bw.Write()`
2. `Writer.MustFlush()` - which included the above process, leading to
double-counting
This made it impossible to distinguish between write syscall time and
fsync time, which is critical for diagnosing storage latency issues.
### Solution
The new metrics allow users to:
- Distinguish "flush got slower" vs "fsync got slower" using metrics
only
- No file path labels (bounded cardinality)
- No double-counting between metrics
### Testing
- Code compiles successfully
- All existing metrics are preserved for backward compatibility
Closes #10432
---------
Signed-off-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Signed-off-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
These types hide public types from lib/storage/metricnamestats package.
These types do not resolve any practical issues. Instead, they add a level of indirection,
which complicates reading and understanding the code.
These types were introduced in the commit 795d3fe722
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6145
### Describe Your Changes
Some users may not know that VictoriaMetrics Cloud provides relevant
features to manage workloads. This change adds notes in relevant places
where users may find that a managed solution is what they need.
The intention is not to push users to Cloud, but to give the information.
That's why it's always phrased like: "If you don't want to do X, Cloud
can do it for you", instead of "Start for free, etc". This is an Open
Source first project, and shall remain as such.
After this gets proper review, VictoriaLogs and other repos may follow.
### Checklist
The following checks are **mandatory**:
- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
---------
Signed-off-by: Jose Gómez-Sellés <14234281+jgomezselles@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Commit 83da33d8cf
removed NFS directory delete retries. It was made on the assumption that
only directory renames could cause such issues. However, both rename and
unlink use the same "silly rename" logic
https://linux-nfs.org/wiki/index.php/Server-side_silly_rename
and linux kernel - `fs/nfs/dir.c` `nfs_unlink` and `nfs_rename`.
The NFS client may treat a file as still open, even if it
was properly closed by the application. Most probably this is triggered because VictoriaMetrics may
open the same file multiple times (data reads and background merges).
There is no issue with VictoriaMetrics itself: it properly closes files. But the NFS client may have delays
or cache metadata for the files, which can trigger the silly rename behavior.
This commit restores original behavior with deletion retries and brings
back metrics for unsuccessful delete operations.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9842
Related to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10680
We noticed that backup restores in our environment were much slower than
the hardware/bandwidth constraints would suggest and we traced this down
to a couple of bottlenecks. This PR attempts to address all of them.
#### Lack of pre-allocation of files
This was causing writes far into files to be quite slow as new blocks
needed to be continually allocated. This was particularly bad on ext4
for us, but will likely be applicable to most disks and filesystems,
you'll see the impl here is linux specific but this is mostly because I
don't have a test env for any other platform and didn't want to blindly
make changes without a validation env.
This comes with the downside of no longer being able to resume a restore
mid-file, and requires re-downloading parts already in the file,
since the file will appear at full size from the very start. This is I
think _generally_ a good tradeoff for the restore speed gains, it is
definitely a tradeoff so I've included a flag to disable the
pre-allocation behavior and fall back to the existing part diffing
logic.
#### Fsync after each part
With many small parts in relatively few files, or in high-concurrency
setups, the writerCloser fsync on each part (actually a double fsync,
since both `filestream.Writer.mustFlush` and
`filestream.Writer.mustClose` fsync) was causing slowdowns, since
we would be continually queuing fsyncs.
With the pre-allocation pattern the file is only "ready" once re-named
so I moved to a per file fsync after rename.
#### Concurrent read/write
The previous download pattern was to do a read from the remoteFs, with
whatever latency that entailed, then sequentially do a write, again with
whatever latency that entailed. This meant that throughput was limited
to `blockSize / (readLatency + writeLatency)`.
Similar to how `crossTypeCopy` is implemented in the backup process we
can instead use `io.pipe` to allow two goroutines to work in parallel
with a small buffer between them.
#### Pagecache avoidance
`filestream.Writer` does quite a lot to avoid polluting the page cache,
but this is not relevant in a restore context, and with large sequential
block writes it's much more efficient to let the OS flush the page cache
whenever it wants rather than doing a bunch of small buffer syscalls to
flush blocks.
Therefore this switches over to a much simpler directWriterCloser that
does direct file IO and lets the OS handle flushes while mid write.
### Performance
Before the changes we were seeing write speeds of only 100MBps. This
was a restore from EBS volumes (ext4) with 1GB/s throughput:
<img width="1613" height="586" alt="Screenshot 2026-03-16 at 1 29 46 PM"
src="https://github.com/user-attachments/assets/5d54dcb7-cb59-43e0-9247-fda8c70feb2f"
/>
After these changes in the same restore env we're seeing 600MB/s flat
rates.
<img width="1611" height="471" alt="Screenshot 2026-03-16 at 1 31 33 PM"
src="https://github.com/user-attachments/assets/ea8e2eb7-533a-48fa-99e0-0b38286e5572"
/>
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Remove shards as they only complicate things when the number of requests
per second is in the range of thousands.
Related to #10532.
---------
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
This commit allows performing JWT claim matching over one-dimensional arrays. This is
useful in practice, because permissions are usually assigned as a list of values.
For example, the following config grants admin access based on the list of roles assigned to the user:
```yaml
match_claims:
access.roles: "admin"
```
JWT token:
```json
{
"access": {
"roles": [
"read",
"write",
"admin"
]
}
}
```
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10647
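The matching logic can be sketched as follows. This is an illustrative helper, not the actual vmauth code: a required claim value matches either a scalar claim or any element of a one-dimensional array claim.

```go
package main

import "fmt"

// claimMatches reports whether the required value matches a JWT claim that
// may be either a scalar string or a one-dimensional array (sketch).
func claimMatches(claim any, want string) bool {
	switch v := claim.(type) {
	case string:
		return v == want
	case []any:
		for _, item := range v {
			if s, ok := item.(string); ok && s == want {
				return true
			}
		}
	}
	return false
}

func main() {
	roles := []any{"read", "write", "admin"} // decoded "access.roles" claim
	fmt.Println(claimMatches(roles, "admin"))   // true
	fmt.Println(claimMatches(roles, "delete"))  // false
	fmt.Println(claimMatches("admin", "admin")) // true
}
```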
RFC 7617 allows an empty password/username. Moreover, from the RFC's standpoint, both values being empty is valid as well; the credentials should simply be encoded as `:`. So this commit relaxes the non-empty username restriction.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6956
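The encoding described by RFC 7617 is just base64 of `username:password`, so fully empty credentials encode the single byte `:`. A quick check (illustrative helper):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// basicAuthHeader builds an RFC 7617 Basic credentials header value.
// With both username and password empty, the encoded payload is just ":".
func basicAuthHeader(username, password string) string {
	creds := base64.StdEncoding.EncodeToString([]byte(username + ":" + password))
	return "Basic " + creds
}

func main() {
	fmt.Println(basicAuthHeader("", ""))         // Basic Og==
	fmt.Println(basicAuthHeader("user", "pass")) // Basic dXNlcjpwYXNz
}
```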
There are cases when the key sizeBytes is much greater than the value
sizeBytes. Therefore it is important to include the key sizeBytes in
the total.
Also fix some code comments.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Bumps [flatted](https://github.com/WebReflection/flatted) from 3.3.3 to
3.4.2.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="3bf09091c3"><code>3bf0909</code></a>
3.4.2</li>
<li><a
href="885ddcc33c"><code>885ddcc</code></a>
fix CWE-1321</li>
<li><a
href="0bdba705d1"><code>0bdba70</code></a>
added flatted-view to the benchmark</li>
<li><a
href="2a02dce7c6"><code>2a02dce</code></a>
3.4.1</li>
<li><a
href="fba4e8f2e1"><code>fba4e8f</code></a>
Merge pull request <a
href="https://redirect.github.com/WebReflection/flatted/issues/89">#89</a>
from WebReflection/python-fix</li>
<li><a
href="5fe86485e6"><code>5fe8648</code></a>
added "when in Rome" also a test for PHP</li>
<li><a
href="53517adbef"><code>53517ad</code></a>
some minor improvement</li>
<li><a
href="b3e2a0c387"><code>b3e2a0c</code></a>
Fixing recursion issue in Python too</li>
<li><a
href="c4b46dbcbf"><code>c4b46db</code></a>
Add SECURITY.md for security policy and reporting</li>
<li><a
href="f86d071e0f"><code>f86d071</code></a>
Create dependabot.yml for version updates</li>
<li>Additional commits viewable in <a
href="https://github.com/WebReflection/flatted/compare/v3.3.3...v3.4.2">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Due to a conflict with VL FAQ page identifier,
VM FAQ page stopped rendering.
This change adds unique identifier to VM FAQ page and fixes the issue.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Before, by mistake, the datasource was referenced by input name instead
of variable name. For an unknown reason, it worked well in local setups
and on the playground.
This fix is confirmed by users and continues working at local setup
and playground.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
### Describe Your Changes
Updated the [HA monitoring setup in Kubernetes via VictoriaMetrics
Cluster](https://docs.victoriametrics.com/guides/k8s-ha-monitoring-via-vm-cluster/)
guide.
Changes:
- Added an introduction explaining how HA works in this guide
- Updated and verified commands used in the guide
- Replaced Grafana UI usage with VMUI (Grafana was only used to run
queries; it's easier to use the built-in VMUI than to install Grafana
just for the Explore tab)
- Removed Grafana screenshots and replaced them with VMUI
- Tested on a modern version of GKE
- Added explanations for `replicationFactor`, de-duplication, and
`isPartial`
- Added next steps
- Added VMUI screenshots
### Checklist
The following checks are **mandatory**:
- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
This commit adds an RPC retry that dials a new connection instead of
getting an old one from the connection pool when the previous RPC error
is `io.EOF`.
It helps prevent broken connections from remaining for too long and
causing failed requests and partial responses during the `vmstorage` rolling
restart period.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10314
Previously inmemoryPart refCount was not properly decremented.
Previous behavior:
* createInmemoryPart called newPartWrapperFromInmemoryPart and returned a partWrapper with refCount=1
* multiple parts are merged in mustMergeInmemoryPartsFinal, which creates a new merged part
* the source partWrappers are never decRef'd
* Since refCount never reaches 0, putInmemoryPart and (*part).MustClose are never called
This commit properly decrements refCount at mustMergeInmemoryPartsFinal.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10086
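The leak and its fix can be modeled as follows. This is an illustrative sketch, not the actual storage code: merging must decRef every source wrapper, otherwise refCount never reaches zero and the part's release hooks (putInmemoryPart, MustClose) never run.

```go
package main

import "fmt"

// partWrapper models a reference-counted part (illustrative only).
type partWrapper struct {
	refCount int
	released bool
}

// decRef decrements the count and releases the part when it hits zero.
func (pw *partWrapper) decRef() {
	pw.refCount--
	if pw.refCount == 0 {
		pw.released = true // stands in for putInmemoryPart / (*part).MustClose
	}
}

// mergeParts merges sources into a new part and decRefs each source -
// the step that was missing before this commit.
func mergeParts(src []*partWrapper) *partWrapper {
	merged := &partWrapper{refCount: 1}
	for _, pw := range src {
		pw.decRef()
	}
	return merged
}

func main() {
	src := []*partWrapper{{refCount: 1}, {refCount: 1}}
	mergeParts(src)
	fmt.Println(src[0].released, src[1].released) // true true
}
```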
This commit adds a new `folder_ids` field in
`yandexcloud_sd_configs` that allows users to specify Yandex Cloud
folder IDs directly, bypassing the organization->cloud->folder hierarchy
traversal.
Previously, the Yandex Cloud service discovery required traversing the
entire resource hierarchy (organizations -> clouds -> folders ->
instances) to discover instances. This works when the Service Account
has permissions at all levels. However, some Service Accounts may only
have permissions at the folder level, causing discovery to fail when it
cannot access organization or cloud resources.
With this change, users can now configure folder IDs directly:
```yaml
yandexcloud_sd_configs:
- service: compute
folder_ids:
- folder-id-1
- folder-id-2
```
When `folder_ids` is specified, the discovery skips the hierarchy
traversal and directly queries instances from the specified folders.
This is a backward-compatible change - when `folder_ids` is not
specified, the existing behavior is preserved.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10587
The new section is placed in the root directory and is supposed to promote
information about the following tools:
* MCP servers for Logs, Traces and Metrics
* List of available agentic skills
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: Roman Khavronenko <hagen1778@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
The change is supposed to make the docs easier to understand and to
stress attention on the important points.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: Roman Khavronenko <hagen1778@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
The new Grafana dashboard uses the following APIs:
- /api/v1/status/tsdb
- /api/v1/status/metric_names_stats
It shows the list of metric names, the request count and the last time
they were "used". Clicking on metric name allows exploring its
cardinality.
Based on https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9832
-----------
The PR contains a few unrelated changes:
* rename of folder for prometheus datasource to remove the duplicated
word
* fix for vmalert's access to the datasource, as before it wasn't able
to write/read properly
-------------
The dashboard screen cast:
https://github.com/user-attachments/assets/01dda5d9-14e5-4f5a-b795-a838abec4f5e
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Haley Wang <haley@victoriametrics.com>
### Describe Your Changes
When a label is set as focus label in the Cardinality Explorer, the
"Metric names with the highest number of series" table was hidden. This
change makes it visible alongside the focus label values table.
### How to reproduce
1. Go to Explore → Cardinality Explorer
2. Enter a selector like `{namespace!=""}` and set Focus label to
`namespace`
3. Click Execute Query
**Before:** Only "Values for 'namespace' label..." table is shown
**After:** "Metric names with the highest number of series" table is
also shown
<img width="1512" height="723"
alt="b2a8395a1577b31f58ae00f87e29eb87ca98eabfd0b3c0d9185be8f3a9789b5f"
src="https://github.com/user-attachments/assets/50c7f67a-1cfc-40d0-8e99-7750a933ee45"
/>
Fixes #10630
### Checklist
The following checks are **mandatory**:
- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [x] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
---------
Signed-off-by: Roshan1299 <banisettirosh@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
Bumps [minimatch](https://github.com/isaacs/minimatch). These
dependencies needed to be updated together.
Updates `minimatch` from 3.1.2 to 3.1.5
<details>
<summary>Commits</summary>
<ul>
<li><a
href="7bba97888a"><code>7bba978</code></a>
3.1.5</li>
<li><a
href="bd259425b2"><code>bd25942</code></a>
docs: add warning about ReDoS</li>
<li><a
href="1a9c27c757"><code>1a9c27c</code></a>
fix partial matching of globstar patterns</li>
<li><a
href="1a2e084af5"><code>1a2e084</code></a>
3.1.4</li>
<li><a
href="ae24656237"><code>ae24656</code></a>
update lockfile</li>
<li><a
href="b100374922"><code>b100374</code></a>
limit recursion for **, improve perf considerably</li>
<li><a
href="26ffeaa091"><code>26ffeaa</code></a>
lockfile update</li>
<li><a
href="9eca892a4e"><code>9eca892</code></a>
lock node version to 14</li>
<li><a
href="00c323b188"><code>00c323b</code></a>
3.1.3</li>
<li><a
href="30486b2048"><code>30486b2</code></a>
update CI matrix and actions</li>
<li>Additional commits viewable in <a
href="https://github.com/isaacs/minimatch/compare/v3.1.2...v3.1.5">compare
view</a></li>
</ul>
</details>
<br />
Updates `minimatch` from 9.0.5 to 9.0.9
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/VictoriaMetrics/VictoriaMetrics/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Poison varint: MaxUint64 encoded as a varint (0xFFFFFFFFFFFFFFFF), which occupies 10 bytes.
The bounds check `uint64(nSize)+n` wraps around to 9, bypassing the guard.
Then `int(n)` = -1 produces the slice expression src[10:9], which panics.
Previously there was a data-race, when targetURL was concurrently
updated in case of default url route.
This commit fixes data-race and adds concurrency to the routing tests.
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10626
The OAuth2 token source library doesn't allow defining request headers explicitly.
This commit adds a custom transport to mitigate that. The new transport modifies the http.Request by making a shallow copy of it and setting the additional headers on the copy.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8939
Previously vm_filestream_write_duration_seconds_total was increased in two places:
* statWriter.Write()
* Writer.MustFlush(), which eventually calls statWriter.Write(), hence double-counting vm_filestream_write_duration_seconds_total
For reference, vm_filestream_read_duration_seconds_total is increased only in statReader.Read() to track the read syscall.
This commit removes latency tracking from the MustFlush method.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10564
Replace ambiguous button labels such as "Submit" and "Apply" with
clearer wording to indicate that these actions only preview results and
do not modify the deployment configuration.
Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10453
Add support for [OpenID Connect
Discovery](https://openid.net/specs/openid-connect-discovery-1_0.html#IANA)
as an alternative way to obtain verification keys and rotate them
automatically.
`jwt` configuration should allow **exactly one** of the following
verification modes: `public_keys`, `oidc`, `skip_verify`. These options
must be mutually exclusive.
Example: OIDC configuration
```yaml
users:
  - jwt:
      oidc:
        issuer: http://identity-provider.com
```
When `oidc` is enabled:
1. On startup, `vmauth` fetches:
```
{issuer}/.well-known/openid-configuration
```
2. Extracts `jwks_uri`.
3. Fetches [JWK
keys](https://openid.net/specs/draft-jones-json-web-key-03.html#ExampleJWK)
from `jwks_uri`.
4. Uses discovered keys to verify JWT tokens.
Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10585
Failure handling:
* If discovery fails at startup:
* No keys are available.
* The user is skipped.
* Discovery runs periodically in background (e.g., every 1 minute).
* If keys become available later, authentication should start working
automatically.
* If keys were previously fetched and the identity provider becomes
unavailable:
* Cached keys must be preserved.
* Authentication continues using cached keys.
#### JWT Requirements in OIDC Mode
When `oidc` is enabled:
* `iss` claim becomes
[mandatory](https://openid.net/specs/openid-connect-core-1_0.html#IDToken).
* `iss` [must
match](https://openid.net/specs/openid-connect-core-1_0.html#RotateEncKeys):
* `oidc.issuer` from config.
* `issuer` returned in the OpenID configuration document.
* JWT header must contain `kid`.
* `kid` must be used to select the appropriate key from JWKS.
* Tokens without `kid` must be rejected.
* Tokens without `iss` must be rejected.
Rationale
* Enables automatic key rotation.
* Eliminates manual public key configuration.
* Maintains compatibility with standard OIDC providers.
---------
Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
- VMUI Explore Metrics uses `rate` for histogram bucket queries, which
skips the first observation
in each bucket because `rate` requires two data points to calculate a
per-second rate.
- Replace `rate` with `increase_pure`, which assumes counters start from
0 and correctly shows
the first observation when a new bucket appears.
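The change boils down to swapping the rollup function in the bucket query (the metric name below is illustrative):

```promql
# before: rate() needs two data points, so the first observation in a new bucket is lost
sum(rate(request_duration_seconds_bucket[5m])) by (le)

# after: increase_pure() assumes counters start from 0, so the first observation counts
sum(increase_pure(request_duration_seconds_bucket[5m])) by (le)
```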
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10365
The test should not fail now on systems with 1 cpu because partition
indexDBs are not rotated. See #8948.
Also removed two TODOs from the test to keep it simple.
Disabling is done by making the handlers for `/tags/tagSeries` and
`/tags/tagMultiSeries` return a `501 (Not Implemented)` status code
along with the error message saying that the API has been disabled and
will be removed in future.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10544.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Now that indexDB is per-partition, the indexDB-related docs need to be
updated. Specifically, how the indexDB is cleaned up when it falls
outside the `-retentionPeriod`.
Follow-up for #8134.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Signed-off-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
There are following main use cases for `eval_offset`:
1. To ensure rules are evaluated at an exact offset, so the results have
the exact timestamp the user wants.
2. The source data for a certain rule is delivered at a specific time
point, so rules need to be executed after that time point to get correct
results. For example, [chaining
groups](https://docs.victoriametrics.com/victoriametrics/vmalert/#chaining-groups).
3. A group contains some heavy rules that can take a few minutes to
finish. To guarantee a single evaluation can complete in time and not
delay the next run, the user may want to schedule the group to be
executed within [intervalStart, intervalEnd-avgTotalEvaluationDuration].
A negative value can be convenient for case 3, as users only need to set
the group's `eval_offset: -avgTotalEvaluationDuration` (a value somewhat
bigger than the real duration is better, to leave some buffer).
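A sketch of case 3 as group config (the values and rule names are illustrative):

```yaml
groups:
  - name: heavy-recording-rules
    interval: 5m
    # negative offset shifts the evaluation to before the interval boundary,
    # leaving avgTotalEvaluationDuration (plus buffer) for the rules to finish
    eval_offset: -1m
    rules:
      - record: job:heavy:sum
        expr: sum(rate(some_heavy_metric_total[5m])) by (job)
```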
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10424
Due to a bug introduced in the initial datadog-sketches API implementation, the `host` label was incorrectly obtained from the `Tags` structure, while it is actually present directly at the root of the protobuf message.
This commit properly attaches the `host` label in this case.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10557
If the data flush to the remote write destination takes longer than the
periodic flush interval (default 2s), the ticker channel will contain a
stale tick, causing the ticker case to be selected too early with an
empty or small amount of data inside `wr`, resulting in a wasted remote
write request with one or two time series (if `ts, ok := <-c.input` was
also randomly selected beforehand).
We could also consider resetting the ticker after draining the stale tick
to ensure `wr` always accumulates data for the full flush interval, but
that seems like a minor improvement to me.
Add a new per-user option to print access logs. Such logs
contain a limited amount of information to prevent exposing
sensitive data.
Access logs can be enabled/disabled via hot-reload and can
help locate clients that incorrectly use or abuse vmauth.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5936
Cluster apptests failed from time to time with the following error:
```
timed out while waiting for inserted rows to be sent to vmstorage
cluster
```
due to incorrect calculation of inserted row count before and after
insertion. This PR fixes it by putting the "before" count calculation
before the send() operation.
Previously, the client certificate was only refreshed during the TLS
handshake, which occurs when establishing a new connection. This meant
the remote HTTP server had to close the existing connection for the
client to pick up an updated (e.g. expired) certificate. As a
workaround, connection keep-alive could be disabled, but that
significantly increased request latency.
This commit adds a certificate check during HTTP RoundTrip. If the
client certificate has changed, the RoundTripper recreates the transport
and its connection pool. This behavior is already implemented for CA
certificate changes.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10393
Per @valyala's request, rename storage cache methods to adhere to the
following format:
```
get[Value]By[Key]FromCache
put[Value]By[Key]ToCache
```
Also move `s.metricIDCache` methods from `indexDB` to `Storage` because
this cache exists at the `Storage` level.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Enable BuildKit-native SPDX SBOM and provenance attestations by setting
`--sbom=true --provenance=true` in `docker buildx build` within
`publish-via-docker`.
- Set `--provenance=true --sbom=true` in `publish-via-docker` for both
Alpine and scratch variants
- Add SBOM section to SECURITY.md with inspection and Trivy scan
instructions
- Update Release-Guide.md
- Add changelog entry
Verified end-to-end: pushed test image to GHCR, confirmed SBOM
attestation via `docker buildx imagetools inspect`, and Trivy scan via
`trivy image --sbom-sources oci` succeeded (with 0 vulnerabilities :-)).
Fixes #10473
### Checklist
The following checks are **mandatory**:
- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
---------
Signed-off-by: John Allberg <john@ayoy.se>
Signed-off-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Samples in Mimir (or Prometheus) are stored in chunks, which are
compressed efficiently using algorithms rather than being stored as
independent samples, see details in [this
article](https://prometheus.io/blog/2019/10/10/remote-read-meets-streaming/)
and [this talk](https://www.youtube.com/watch?v=b_pEevMAC3I).
When using a small `--remote-read-step-interval`, particularly `minute`,
a single chunk may contain samples that exceed the requested time
window, and all the returned chunks contain overlapping samples.
Consequently, vmctl will read and migrate many duplicate samples into
VictoriaMetrics.
In tests, `--remote-read-step-interval=minute
--remote-read-use-stream=true` with a raw sample `scrape_interval: 10s`
and a remote read time range of 24h wrote ~20x duplicated samples.
But I assume the minute interval is rarely used with a large time range
and duplicates are fine in VictoriaMetrics due to deduplication, so we
don't need to disallow using it.
```
## --remote-read-step-interval=minute --remote-read-use-stream=false
## total samples: **15696611(the real number)**
2026/02/26 22:10:25 VictoriaMetrics importer stats:
idle duration: 50.080851955s;
time spent while importing: 32.108903417s;
total samples: 15696611;
samples/s: 488855.41;
total bytes: 735.8 MB;
bytes/s: 22.9 MB;
import requests: 79;
import requests retries: 0;
2026/02/26 22:10:25 Total time: 32.112912208s
## --remote-read-step-interval=day --remote-read-use-stream=true
## total samples: 15878869
2026/02/26 22:20:37 VictoriaMetrics importer stats:
idle duration: 960.698874ms;
time spent while importing: 6.338309625s;
total samples: 15878869;
samples/s: 2505221.41;
total bytes: 278.6 MB;
bytes/s: 44.0 MB;
import requests: 80;
import requests retries: 0;
2026/02/26 22:20:37 Total time: 6.340023167s
## --remote-read-step-interval=hour --remote-read-use-stream=true
## total samples: 21824000
2026/02/26 22:13:14 VictoriaMetrics importer stats:
idle duration: 5.238827666s;
time spent while importing: 7.274528s;
total samples: 21824000;
samples/s: 3000057.19;
total bytes: 394.4 MB;
bytes/s: 54.2 MB;
import requests: 110;
import requests retries: 0;
2026/02/26 22:13:14 Total time: 7.278895084s
## --remote-read-step-interval=minute --remote-read-use-stream=true
## total samples: **353800724(353800724/15696611~22.5)**
2026/02/26 22:18:41 VictoriaMetrics importer stats:
idle duration: 1m45.09105431s;
time spent while importing: 1m51.716730125s;
total samples: 353800724;
samples/s: 3166944.86;
total bytes: 6.8 GB;
bytes/s: 61.3 MB;
import requests: 1769;
import requests retries: 0;
2026/02/26 22:18:41 Total time: 1m51.721834958s
```
### Describe Your Changes
Move OpenTelemetry-related documentation under docs/integrations and
docs/data-ingestion to establish a clear, scalable structure.
As OpenTelemetry support expands, we need a dedicated place to document
protocol details, implementation specifics, and known limitations, such
as:
- Delta temporality not working with downsampling. See
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10014#issuecomment-3697509266.
- Negative histogram buckets being discarded by VictoriaMetrics. See
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9896.
The new structure separates concerns:
- `docs/integrations/` — protocol overview, implementation details, and
limitations.
- `docs/data-ingestion/` — OpenTelemetry Collector configuration and
ingestion setup.
This aligns OpenTelemetry documentation with the existing structure used
across other integrations and ingestion methods.
New pages and links preserve backward compatibility.
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
### Describe Your Changes
- Updated GKE version to a more current 1.34+
- Updated guide to more modern Helm and Kubectl versions
- Tested updated instructions on GKE 1.34.1-gke.3971001 (and a local k3s
instance) successfully
- Removed revision from Grafana values for helm chart (confirmed it
pulls the latest revision)
- Split the helm chart values (`guide-vmcluster-vmagent-values.yaml`)
into more readable chunks and added explanations next to each chunk
- Added and updated expected outputs. Some were missing and others were
outdated
- Updated Grafana dashboards screenshots since they changed from the
last revision
- Updated Grafana repo to use community org (old grafana chart was
deprecated
on Jan 30th -
[source](https://community.grafana.com/t/helm-repository-migration-grafana-community-charts/160983))
- Minor corrections and typo fixes. Improved flow
- Added a section at the end pointing readers where they can go next.
### Checklist
The following checks are **mandatory**:
- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
---------
Signed-off-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: Vadim Rutkovsky <vadim@vrutkovs.eu>
## Summary
Fix an invalid MetricsQL numeric literal in the vmctl monitoring
documentation.
## Problem
The PromQL/MetricsQL example query for monitoring vm-native migration
data transfer speed used `1Mb` as a divisor:
```promql
rate(vmctl_vm_native_migration_bytes_transferred_total[5m]) / 1Mb
```
However, `Mb` is **not** a valid MetricsQL numeric suffix. According to
the [MetricsQL
documentation](https://docs.victoriametrics.com/victoriametrics/metricsql/#numeric-values):
> Numeric values can have `K`, `Ki`, `M`, `Mi`, `G`, `Gi`, `T` and `Ti`
suffixes.
The suffix `Mb` does not exist — only `M` (mega, 10^6) and `Mi` (mebi,
2^20 = 1,048,576) are valid.
## Fix
Replace `1Mb` with `1Mi` (1 mebibyte = 1,048,576 bytes), which is the
standard binary unit for memory/storage transfer measurements in
computing, and update the comment to reflect `MiB/s` instead of `MB/s`.
## Files Changed
- `docs/victoriametrics/vmctl/vmctl.md`: fixed the invalid literal `1Mb`
→ `1Mi` and updated the comment
---------
Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Co-authored-by: Vadim Alekseev <vadimaleksv@gmail.com>
Co-authored-by: Yury Moladau <yurymolodov@gmail.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
* add a meaningful description, as it is required for publishing on grafana.com
* remove the dependency on `victoriametrics-metrics-datasource` as it is not used
Signed-off-by: hagen1778 <roman@victoriametrics.com>
This commit improves compatibility with PromQL by introducing the missing function `histogram_fraction`.
histogram_fraction is a shortcut for `histogram_share(upperLe, buckets) - histogram_share(lowerLe, buckets)`.
histogram_count, histogram_sum and histogram_avg will not be added to MetricsQL, as they only operate on Prometheus native histograms, which don't have _count and _sum series like the classic histogram or the VictoriaMetrics histogram. For classic histograms, the _count and _sum series can be used directly.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5346.
This commit optimizes the storage of originalLabels. Previously, they
were stored as a clone of the discovered labels, which required many
small allocations and added high pressure on the garbage collector.
Now originalLabels are stored as zstd-compressed JSON ([]byte). Since
they are rarely requested, the overhead of zstd decompression and
json.Unmarshal is negligible.
This optimization reduces memory usage for storing originalLabels by 3x
and CPU usage by 2x.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9952
Since Go 1.23 it is safe to ignore the timer.Reset return value.
According to the spec:
For a chan-based timer created with NewTimer, as of Go 1.23,
any receive from t.C after Reset has returned is guaranteed not
to receive a time value corresponding to the previous timer
settings;
If the program has not received from t.C already and the timer is
running, Reset is guaranteed to return true.
Before Go 1.23, the only safe way to use Reset was to call [Timer.Stop]
and explicitly drain the timer first.
Go 1.23 changed the timer implementation from sync to async, which made
it possible for a channel send and timer.Stop to happen at the same
time.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9721
# Investigation & Root Cause --- InfluxDB Line Protocol Parsing with Raw
Newline (`\n`)
This document describes the investigation process and root cause
analysis for Influx Line Protocol parsing errors in VictoriaMetrics when
a **raw newline (`\n`) byte appears inside a quoted field value**.
------------------------------------------------------------------------
## Background
According to the Influx Line Protocol specification:
- Each point must be represented as a single line.
- The newline character (`\n`) separates points.
- Literal newline bytes are not allowed inside quoted field values.
Therefore, any raw newline byte (`0x0A`) inside a quoted string makes
the line invalid.
------------------------------------------------------------------------
## Related Issue
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10067
------------------------------------------------------------------------
## Expected Behavior
VictoriaMetrics should reject Influx Line Protocol lines that contain a
raw newline inside a quoted field value, since this violates the
protocol specification.
The parsing failure itself is correct.
------------------------------------------------------------------------
## Actual Behavior
VictoriaMetrics rejects the line with the following error:
cannot parse field value for "...": missing closing quote for quoted
field value
While technically correct, the error message does not clearly indicate
that the root cause is a raw newline inside the quoted field value.
------------------------------------------------------------------------
## Minimal Reproducer
The issue can be reproduced without Telegraf or Jolokia:
``` bash
printf 'test value="hello
world"\n' | curl -X POST http://localhost:8428/write --data-binary @-
```
This produces:
cannot parse field value for "value": missing closing quote for quoted
field value
The failure occurs because the value contains an actual newline byte
(0x0A), not the escaped sequence `\n`.
------------------------------------------------------------------------
## Environment Setup
The issue was reproduced using the following stack:
- VictoriaMetrics v1.127.0
- InfluxDB 1.8
- Spring Boot + Jolokia
- Telegraf 1.36.2
Telegraf collects JVM `SystemProperties`, including:
``` json
"line.separator": "\n"
```
After JSON unmarshalling, this becomes a real newline byte in memory.
Detailed reproduction steps can be found here:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10067#issuecomment-3896175100
------------------------------------------------------------------------
## Observed Serialized Line
Using breakpoint debugging in:
lib/bytesutil/bytebuffer.go:58
The `ReadFrom` function reads and assembles an Influx line containing:
SystemProperties.line.separator="
",
The quoted field contains an actual newline byte before the closing
quote.
This breaks the single-line assumption of Influx Line Protocol.
VictoriaMetrics splits on `\n`, resulting in:
- A truncated first line
- A missing closing quote
- Parsing failure
------------------------------------------------------------------------
## Important Clarification
This issue is **not** caused by the escaped sequence `"\\n"`.
The failure occurs only when the serialized Influx line contains an
actual newline byte (`0x0A`) inside the quoted value.
Escaped `\n` (two characters: `\` and `n`) is valid.
------------------------------------------------------------------------
## Root Cause
- Telegraf serializes a field containing a real newline byte.
- Influx Line Protocol forbids literal newline characters inside
quoted fields.
- VictoriaMetrics correctly treats `\n` as a line separator.
- The parser then encounters an incomplete quoted field and reports
"missing closing quote".
The parsing behavior is correct per specification.
------------------------------------------------------------------------
## Proposed Improvement
The parsing logic should remain unchanged.
However, the error message can be improved to better indicate the root
cause.
Suggested error message:
invalid Influx line protocol: missing closing quote for quoted field
value;
this may be caused by a raw newline (`\n`) inside the quoted field value
This makes the failure immediately actionable and easier to diagnose.
------------------------------------------------------------------------
## Summary
- The failure is caused by a raw newline byte inside a quoted field
value.
- This violates the Influx Line Protocol specification.
- VictoriaMetrics correctly rejects the line.
- The error message should explicitly mention the possibility of a raw
newline (`\n`) inside the quoted field.
Signed-off-by: hklhai <hkhai@outlook.com>
Co-authored-by: Max Kotliar <kotlyar.maksim@gmail.com>
Previously, extra filters were ignored for
`/api/v1/label/vm_account_id/values` or
`/api/v1/label/vm_project_id/values` calls. As a result, even if a user's
visibility was limited by applying the
`?extra_filters[]={vm_account_id="1"}` param, they could get the list of
all available tenants in the system.
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d2a033453e)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* remove the ToC at the beginning, as it duplicates the right-bar
functionality and is easy to get out of sync. For example, it didn't
have the ZFS section in it
* simplify wording where it was possible
* reference new tools VM got in recent releases
* re-prioritize tips order based on personal experience
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: Roman Khavronenko <hagen1778@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Components like vmselect and vminsert rarely touch disk, so most of the
time their values are 0. Filtering out 0 values makes the panel cleaner.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
A refactoring that moves the uint64set.Set marshaling and unmarshaling from lib/storage/storage.go to lib/uint64set. Also added function docs and tests.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
It should prevent apptest timeouts due to runner saturation. When
apptests are run together with other tests and linters, they do not have
enough CPU to complete in time and often time out.
If one re-runs the apptests shortly after, they are likely to pass
because the same runner has enough resources available (the other jobs
finished).
Remove GOGC=10 as the runner has enough memory (16GB) to run apptests.
I did some tests and observed a drop in overall test duration from 4.5m
to 3m-3.5m.
The new section is supposed to contain otel-related information for all
products, like VT, VM, VL.
It is also supposed to be visible to readers right away, without the
need to dig for info in each product.
It contains basic information and is supposed to act as a router to more
detailed info in each product.
While there, also updated VM-related otel info.
---------
Depends on
https://github.com/VictoriaMetrics/victoriametrics-datasource/pull/458
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
### Describe Your Changes
- Add an introduction with a brief explanation of the operator and its
benefits as an intro
- Make some steps more explicit, instead of just linking to the VM
cluster guide
- Separate config/chart values files from kubectl apply (instead of
using heredoc and in-line yaml)
- Update screenshots and add figcaptions where needed
- Update Kubernetes and tools versions to newer releases
- Remove revision numbers from the Grafana config to install the latest
revision
- Added a section to configure scraping of Kubernetes resources (nodes,
pods, etc.)
- Tested updated instructions on GKE 1.33 and 1.34 (and a local k3s
instance) successfully
- Added and updated expected outputs. Some were missing and others were
outdated
- Updated Grafana dashboards screenshots since they changed from the
last revision
- Minor corrections and typo fixes. Improved flow
- Added a section at the end pointing readers to where they can go next.
### Checklist
The following checks are **mandatory**:
- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
### Describe Your Changes
- Updated introduction
- Added proper steps
- Tested instructions on the Headlamp desktop version and the in-cluster
web UI
- Added images to guide user
- Mentioned that the test connection button does not work (it probes a
`-healthy` endpoint that is not supported by VM). The plugin still
works, it's just the test button that fails
- Added links to the single and cluster installation guides
### Checklist
The following checks are **mandatory**:
- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
---------
Signed-off-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
### Describe Your Changes
- Rewrote the introduction
- Added list of endpoints for single node, cluster, and cloud
- Added tips for working with VictoriaMetrics running on Kubernetes
- Fleshed out explanations for each step
- Added reference links for all required endpoints
- Tested every command
### Checklist
The following checks are **mandatory**:
- [X] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [X] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
---------
Signed-off-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Commit 610b328e5a introduced a bug in the
date range search logic. If the first searched date for a given tenant
did not match, the search could proceed incorrectly.
This commit fixes the SearchTenants API by correctly advancing the date
passed to table.Seek.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10422
* add an example of the produced log, so users can understand the impact;
* stress once again the risk of sensitive data exposure when
dump_request_on_errors is enabled.
This change adds some context to the error returned when all backends
have failed. From support cases it seems that without this context users
might not know what to do with the error message. The clarification
advises them to check the previous error messages.
Previously the regex simplify function attempted to parse the string representation of the simplified regex.
This could produce a runtime panic, per the std lib specification:
```
// Simplify returns a regexp equivalent to re but without counted repetitions
// and with various other simplifications, such as rewriting /(?:a+)+/ to /a+/.
// The resulting regexp will execute correctly but its string representation
// will not produce the same parse tree, because capturing parentheses
// may have been duplicated or removed.
```
This commit ignores the parsing error for the simplified regex and returns the original regex instead.
This may result in missing simplifications for some niche regex patterns.
But such cases are extremely rare in production, so the tradeoff is acceptable.
Fixes VictoriaMetrics/VictoriaLogs#1112
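A hedged sketch of the described fallback, built on the standard `regexp/syntax` package (the helper name is an assumption, not the actual code): if re-parsing the string form of the simplified regexp fails, the original expression is returned unchanged instead of panicking.

```go
package main

import "regexp/syntax"

// simplifyRegexExpr returns a simplified equivalent of expr when the
// simplified form round-trips through the parser; otherwise it falls
// back to the original expression, as described in the commit above.
func simplifyRegexExpr(expr string) string {
	re, err := syntax.Parse(expr, syntax.Perl)
	if err != nil {
		return expr // not a valid regexp; leave it to the caller
	}
	s := re.Simplify().String()
	// Per the stdlib docs, the string representation of a simplified
	// regexp may not produce the same parse tree, so this parse can fail.
	if _, err := syntax.Parse(s, syntax.Perl); err != nil {
		return expr // skip simplification for such niche patterns
	}
	return s
}
```

For example, `(?:a+)+` simplifies to `a+` (the case named in the stdlib docs), while an unparsable input is returned as-is.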
While server-side copies are always the most efficient when using the
same backup origin and destination, there are times when moving
between backup locations is required.
Right now vmbackup throws an error in these cases.
While it's true that a user could always take a fresh backup from a
snapshot rather than copy an old backup, this requires access to the
storage data locations and a running vmstorage instance, something that
is not _generally_ required for otherwise moving backups around between
remote locations using vmbackup.
This is a small change that makes the moving of backups from one
location to another transparent to users, without having to consider if
those locations are the same or different. This both simplifies backup
migrations and unlocks using vmbackup for more complex operations.
Specifically this came up in my use case because we want to orchestrate
the down-scaling of EBS volumes backing our vmstorage cluster, which
requires some complex backup operations, one of which is taking a
backup from S3 to a local filesystem.
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10401
Use the same sharded implementation as in metricIDCache. The change is
basically a copy-paste. The only difference is that the rotation period
remains `1h` instead of `1m` in order not to break the fix for #10064.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
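A minimal sketch of the sharded, two-generation cache pattern referenced above (type and method names are assumptions, not the actual lib/storage code): each key maps to one shard, and a periodic rotation drops entries that went unused for two rotation periods.

```go
package main

import "sync"

// cacheShard holds two generations of entries; prev is dropped on rotation.
type cacheShard struct {
	mu   sync.Mutex
	curr map[uint64]struct{}
	prev map[uint64]struct{}
}

type shardedCache struct {
	shards []cacheShard
}

func newShardedCache(shardsCount int) *shardedCache {
	c := &shardedCache{shards: make([]cacheShard, shardsCount)}
	for i := range c.shards {
		c.shards[i].curr = map[uint64]struct{}{}
		c.shards[i].prev = map[uint64]struct{}{}
	}
	return c
}

func (c *shardedCache) shard(k uint64) *cacheShard {
	return &c.shards[k%uint64(len(c.shards))]
}

func (c *shardedCache) Set(k uint64) {
	s := c.shard(k)
	s.mu.Lock()
	s.curr[k] = struct{}{}
	s.mu.Unlock()
}

func (c *shardedCache) Has(k uint64) bool {
	s := c.shard(k)
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.curr[k]; ok {
		return true
	}
	if _, ok := s.prev[k]; ok {
		s.curr[k] = struct{}{} // keep hot entries alive across rotations
		return true
	}
	return false
}

// Rotate is called periodically (e.g. every 1h): prev is dropped and
// curr becomes prev, so idle entries expire after two periods.
func (c *shardedCache) Rotate() {
	for i := range c.shards {
		s := &c.shards[i]
		s.mu.Lock()
		s.prev = s.curr
		s.curr = map[uint64]struct{}{}
		s.mu.Unlock()
	}
}
```

Sharding reduces lock contention because concurrent accesses to different keys usually hit different shards.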
This should reduce memory reallocations and fragmentation when reading large request bodies from slow clients.
This should also reduce memory usage a bit because of the reduced memory fragmentation.
Updates https://github.com/VictoriaMetrics/VictoriaLogs/issues/1042
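A hedged sketch of one way to achieve this, assuming the idea is a chunked buffer that accumulates the request body in fixed-size pieces instead of reallocating and copying one large contiguous slice as it grows (`chunkedBuffer` and `chunkSize` are illustrative names, not the actual implementation):

```go
package main

// chunkSize is the fixed allocation unit; growth never reallocates
// previously written data, which avoids large-slice copies and
// fragmentation when slow clients trickle in big bodies.
const chunkSize = 4 * 1024

type chunkedBuffer struct {
	chunks [][]byte
}

func (cb *chunkedBuffer) Write(p []byte) (int, error) {
	written := len(p)
	for len(p) > 0 {
		// Start a new chunk when there is none or the last one is full.
		if n := len(cb.chunks); n == 0 || len(cb.chunks[n-1]) == chunkSize {
			cb.chunks = append(cb.chunks, make([]byte, 0, chunkSize))
		}
		last := &cb.chunks[len(cb.chunks)-1]
		n := chunkSize - len(*last)
		if n > len(p) {
			n = len(p)
		}
		*last = append(*last, p[:n]...)
		p = p[n:]
	}
	return written, nil
}

func (cb *chunkedBuffer) Len() int {
	total := 0
	for _, c := range cb.chunks {
		total += len(c)
	}
	return total
}
```

The design choice is to trade contiguity for stable allocations: every chunk is the same size, so freed chunks are trivially reusable by the allocator.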
### Describe Your Changes
The job ensures that:
- the draft release with the given `$(TAG)` exists
- the release has the expected `$(GITHUB_ASSETS_COUNT)` number of uploaded
assets
- all the assets were uploaded successfully.
It also adds a helper job `github-get-release`, which finds a draft release
by `$(TAG)` and stores it into the `/tmp/vm-github-release-$(TAG)` file.
The `github-delete-release` job is decoupled from the file produced by the
`github-create-release` job, so it can be run at any time from any
machine.
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
### Describe Your Changes
Adds JWT authentication support to vmauth with signature verification
and tenant-based access control. For now, public keys have to be set
explicitly in the config; OIDC discovery will be added in upcoming PRs.
Related to
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10445
Key Features
- JWT Configuration: Added `jwt_token` field to user config supporting
RSA/ECDSA public keys or skip_verify mode (for testing purposes).
- Token Validation: Verifies JWT signatures, checks expiration, and
extracts vm_access claims
- Compatible with vmgateway: JWT tokens issued for vmgateway should work
with vmauth too.
Examples
```yaml
users:
- jwt_token:
public_keys:
- |
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA...
-----END PUBLIC KEY-----
url_prefix: "http://victoria-metrics:8428/"
```
```yaml
users:
- jwt_token:
skip_verify: true
url_prefix: "http://victoria-metrics:8428/"
```
Constraints
- JWT tokens cannot be mixed with other auth methods (bearer_token,
username, password)
- Requires at least one public key OR skip_verify=true
- Limited to single JWT user (multiple JWT users will be supported in
the future)
Next steps
- Multiple `jwt_token` support.
- Claim matching
- Claim based routing
- OIDC/JWKS support
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist).
- [ ] My change adheres to [VictoriaMetrics development
goals](https://docs.victoriametrics.com/victoriametrics/goals/).
---------
Co-authored-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Exploit uint64set data structure peculiarities (adjacent elements are
stored in 64KiB buckets) to optimize the metricIDCache memory footprint.
As a result, the cache utilizes 87% less memory and is up to 90%
faster. See
[benchstat.txt](https://github.com/user-attachments/files/25294076/benchstat.txt).
Follow-up for #10388 and #10346.
Thanks to @valyala for the optimization idea.
---------
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
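The exploited property can be illustrated with a toy set (this is not the actual lib/uint64set code): adjacent metricIDs share their high 48 bits, so a whole run of 64Ki adjacent values collapses into a single bitmap keyed by that shared prefix, instead of 8 bytes per element.

```go
package main

// bitmapSet stores uint64 values as per-prefix bitmaps: the high 48 bits
// select a bucket, the low 16 bits select a bit inside an 8KiB bitmap
// (1024 x uint64 = 65536 bits).
type bitmapSet struct {
	buckets map[uint64]*[1024]uint64
}

func newBitmapSet() *bitmapSet {
	return &bitmapSet{buckets: map[uint64]*[1024]uint64{}}
}

func (s *bitmapSet) Add(x uint64) {
	hi, lo := x>>16, uint16(x)
	b := s.buckets[hi]
	if b == nil {
		b = new([1024]uint64)
		s.buckets[hi] = b
	}
	b[lo>>6] |= 1 << (lo & 63)
}

func (s *bitmapSet) Has(x uint64) bool {
	hi, lo := x>>16, uint16(x)
	b := s.buckets[hi]
	return b != nil && b[lo>>6]&(1<<(lo&63)) != 0
}
```

Since metricIDs generated close in time tend to be numerically close, most IDs land in a handful of buckets, which is where the memory savings come from.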
Previously, on the last day of a month, storage could report empty
metrics for the last partition. This could happen if a new empty
partition was created in updateNextDayMetricIDs or if time series with
future timestamps were ingested.
This commit adds a check to ensure the last partition belongs to the
current month. Since this is typically the most actively used partition,
it should be treated as the last one.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10387
Ensure proper expansion and reset of `buf` size for OpenTelemetry
ingestion. This pull request does:
1. Flush data in `wctx` when `buf` is over 4MiB.
2. Do not return `wctx` to the pool when its `buf` is larger than 4MiB
but the actual in-use length is less than 1MiB.
Previously, when a small number of requests carried a large volume of
time series or labels, `buf` was over-expanded and recycled to the pool,
resulting in an excessive memory usage issue.
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10378
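The recycling rule described above can be sketched as a small predicate (the constants and helper name are assumptions, not the actual code): an over-expanded buffer is only worth keeping in the pool if a meaningful fraction of it was actually used.

```go
package main

const (
	maxRecycledBufCap = 4 * 1024 * 1024 // 4MiB cap on pooled capacity
	minUsedLen        = 1024 * 1024     // 1MiB minimum in-use length
)

// shouldRecycleBuf reports whether a buffer with the given capacity and
// in-use length may be returned to the pool. Buffers that grew past the
// cap while barely being used are dropped, so the GC can reclaim the
// over-expanded allocation instead of it lingering in the pool.
func shouldRecycleBuf(capacity, usedLen int) bool {
	if capacity <= maxRecycledBufCap {
		return true
	}
	return usedLen >= minUsedLen
}
```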
This is part of the effort to upgrade and validate the [Guides in the
docs](https://docs.victoriametrics.com/guides/).
Doc page:
https://docs.victoriametrics.com/guides/getting-started-with-opentelemetry/
Functionally, nothing should change. Aside from the fix that prevented
one of the example applications from running, the rest of the commands
in the guide should be equivalent to the original.
Header anchor links do not change with this update. I added a few
headers, but the existing header anchors remain unchanged to prevent
breaking existing links.
- Tested on a more modern version of GKE to validate it still works OK
(1.34.1-gke.3971001)
- Changed wording of some sections to improve flow and readability
- Added some missing steps/troubleshooting
- Add tip annotations for cardinality explorer and setup references to
make them stand apart from the main content
- Use `kubectl port-forward svc/...` instead of `kubectl port-forward
pod` (service selectors vs pod names) in some test commands to make
instructions simpler
- Updated OpenTelemetry version to fix error that prevented
`app.go-collector.example` sample code from running
- Replaced the "Visit these links" part in the second program (with the
fast/slow endpoints) with curl commands
- Updated the first VMUI test link to show a table instead of a graph
while testing OpenTelemetry ingestion (the default graph view can be
confusing, as the metric value for `k8s_container_ready` doesn't really
show any values)
- Minor typos, grammar check, and consistency (Kubernetes vs kubernetes,
Helm vs helm, Collector vs collector, etc.)
This change only affects the query trace. It correctly uses the branched
query trace in the callback function, so the trace is placed in the
right actions branch.
Bug was introduced in
c705da74f6
- Return back the check that the size of the scraped response doesn't exceed maxScrapeSize
at client.ReadData(). Without this check the scraped response may be truncated to maxScrapeSize+1
bytes, which can result in a decompression error. The decompression error in this case
hides the original error about the too-big response size. This complicates troubleshooting for users.
- Stop decompressing the scraped response as soon as the decompressed response size exceeds maxScrapeSize.
This protects from excess memory usage needed for holding the decompressed response with sizes exceeding
the maxScrapeSize.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10320
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9481
VictoriaMetrics is a fast, cost-saving, and scalable solution for monitoring and managing time series data. It delivers high performance and reliability, making it an ideal choice for businesses of all sizes.
## Folder Structure
- `/app`: Contains the compilable binaries.
- `/lib`: Contains the reusable Go libraries.
- `/docs/victoriametrics`: Contains documentation for the project.
- `/apptest/tests`: Contains integration tests.
## Libraries and Frameworks
- Backend: Golang, no framework. Use third-party libraries sparingly.
- Frontend: React.
## Code review guidelines
Ensure the feature or bugfix includes a changelog entry in /docs/victoriametrics/changelog/CHANGELOG.md.
Verify the entry is under the ## tip section and matches the structure and style of existing entries.
Chore-only changes may be omitted from the changelog.
VictoriaMetrics is a fast, cost-effective, and scalable solution for monitoring and managing time series data. It delivers high performance and reliability, making it an ideal choice for businesses of all sizes.
Here are some resources and information about VictoriaMetrics:
- Deployment types: [Single-node version](https://docs.victoriametrics.com/), [Cluster version](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/), and [Enterprise version](https://docs.victoriametrics.com/victoriametrics/enterprise/)
- Changelog: [CHANGELOG](https://docs.victoriametrics.com/victoriametrics/changelog/), and [How to upgrade](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-upgrade-victoriametrics)
- **Community**: [Slack](https://slack.victoriametrics.com/) (join via [Slack Inviter](https://slack.victoriametrics.com/)), [X (Twitter)](https://x.com/VictoriaMetrics), [YouTube](https://www.youtube.com/@VictoriaMetrics). See full list [here](https://docs.victoriametrics.com/victoriametrics/#community-and-contributions).
- **Changelog**: Project evolves fast - check the [CHANGELOG](https://docs.victoriametrics.com/victoriametrics/changelog/), and [How to upgrade](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-upgrade-victoriametrics).
- **Enterprise support:** [Contact us](mailto:info@victoriametrics.com) for commercial support with additional [enterprise features](https://docs.victoriametrics.com/victoriametrics/enterprise/).
- **Enterprise releases:** Enterprise and [long-term support releases (LTS)](https://docs.victoriametrics.com/victoriametrics/lts-releases/) are publicly available and can be evaluated for free
using a [free trial license](https://victoriametrics.com/products/enterprise/trial/).
- **Security:** we achieved [security certifications](https://victoriametrics.com/security/) for Database Software Development and Software-Based Monitoring Services.
Yes, we open-source both the single-node VictoriaMetrics and the cluster version.
`This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. `+
	`For example, m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. `+
	`Enabling sorting of labels can slow down ingestion performance a bit`)
maxHourlySeries = flag.Int("remoteWrite.maxHourlySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last hour. "+
	"Excess series are logged and dropped. This can be useful for limiting series cardinality. See https://docs.victoriametrics.com/victoriametrics/vmagent/#cardinality-limiter")
maxDailySeries = flag.Int("remoteWrite.maxDailySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. "+
	"Excess series are logged and dropped. This can be useful for limiting series churn rate. See https://docs.victoriametrics.com/victoriametrics/vmagent/#cardinality-limiter")
maxHourlySeries = flag.Int64("remoteWrite.maxHourlySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last hour. "+
	"Excess series are logged and dropped. This can be useful for limiting series cardinality. "+
	fmt.Sprintf("Setting this flag to '-1' sets limit to maximum possible value (%d) which is useful in order to enable series tracking without enforcing limits. ", math.MaxInt32)+
	"See https://docs.victoriametrics.com/victoriametrics/vmagent/#cardinality-limiter")
maxDailySeries = flag.Int64("remoteWrite.maxDailySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. "+
	"Excess series are logged and dropped. This can be useful for limiting series churn rate. "+
	fmt.Sprintf("Setting this flag to '-1' sets limit to maximum possible value (%d) which is useful in order to enable series tracking without enforcing limits. ", math.MaxInt32)+
	"See https://docs.victoriametrics.com/victoriametrics/vmagent/#cardinality-limiter")
maxIngestionRate = flag.Int("maxIngestionRate", 0, "The maximum number of samples vmagent can receive per second. Data ingestion is paused when the limit is exceeded. "+
	"By default there are no limits on samples ingestion rate. See also -remoteWrite.rateLimit")
@@ -92,6 +100,8 @@ var (
"See https://docs.victoriametrics.com/victoriametrics/vmagent/#disabling-on-disk-persistence . See also -remoteWrite.dropSamplesOnOverload")
dropSamplesOnOverload = flag.Bool("remoteWrite.dropSamplesOnOverload", false, "Whether to drop samples when -remoteWrite.disableOnDiskQueue is set and if the samples "+
	"cannot be pushed into the configured -remoteWrite.url systems in a timely manner. See https://docs.victoriametrics.com/victoriametrics/vmagent/#disabling-on-disk-persistence")
disableMetadataPerURL = flagutil.NewArrayBool("remoteWrite.disableMetadata", "Whether to disable sending metadata to the corresponding -remoteWrite.url. "+
	"By default, metadata sending is controlled by the global -enableMetadata flag")
)
var (
@@ -157,8 +167,8 @@ func Init() {
if len(*remoteWriteURLs) == 0 {
	logger.Fatalf("at least one `-remoteWrite.url` command-line flag must be set")
	return fmt.Errorf("interval shouldn't be lower than 0")
}
if g.EvalOffset.Duration() < 0 {
	return fmt.Errorf("eval_offset shouldn't be lower than 0")
}
// if `eval_offset` is set, interval won't use the global evaluationInterval flag and must be bigger than offset.
if g.EvalOffset.Duration() > g.Interval.Duration() {
	return fmt.Errorf("eval_offset should be smaller than interval; now eval_offset: %v, interval: %v", g.EvalOffset.Duration(), g.Interval.Duration())
// if `eval_offset` is set, the group interval must be specified explicitly (instead of inherited from the global evaluationInterval flag) and must be bigger than the offset.
return fmt.Errorf("the abs value of eval_offset should be smaller than interval; now eval_offset: %v, interval: %v", g.EvalOffset.Duration(), g.Interval.Duration())
}
if g.EvalOffset != nil && g.EvalDelay != nil {
	return fmt.Errorf("eval_offset cannot be used with eval_delay")
@@ -56,7 +56,7 @@ absolute path to all .tpl files in root.
-rule.templates="dir/**/*.tpl". Includes all the .tpl files in "dir" subfolders recursively.
`)
configCheckInterval = flag.Duration("configCheckInterval", 0, "Interval for checking for changes in '-rule' or '-notifier.config' files. "+
configCheckInterval = flag.Duration("configCheckInterval", 0, "Interval for checking for changes in '-rule', '-rule.templates' and '-notifier.config' files. "+
	"By default, the checking is disabled. Send SIGHUP signal in order to force config check for changes.")
httpListenAddrs = flagutil.NewArrayString("httpListenAddr", "Address to listen for incoming http requests. See also -tls and -httpListenAddr.useProxyProtocol")
// sentRows and sentBytes are historical counters that can now be replaced by flushedRows and flushedBytes histograms. They may be deprecated in the future after the new histograms have been adopted for some time.
ruleUpdateEntriesLimit = flag.Int("rule.updateEntriesLimit", 20, "Defines the max number of rule's state updates stored in-memory. "+
	"Rule's updates are available on rule's Details page and are used for debugging purposes. The number of stored updates can be overridden per rule via update_entries_limit param.")
resendDelay = flag.Duration("rule.resendDelay", 0, "MiniMum amount of time to wait before resending an alert to notifier.")
maxResolveDuration = flag.Duration("rule.maxResolveDuration", 0, "Limits the maxiMum duration for automatic alert expiration, "+
resendDelay = flag.Duration("rule.resendDelay", 0, "Minimum amount of time to wait before resending an alert to notifier.")
maxResolveDuration = flag.Duration("rule.maxResolveDuration", 0, "Limits the maximum duration for automatic alert expiration, "+
	"which by default is 4 times evaluationInterval of the parent group")
evalDelay = flag.Duration("rule.evalDelay", 30*time.Second, "Adjustment of the 'time' parameter for rule evaluation requests to compensate intentional data delay from the datasource. "+
"Normally, should be equal to '-search.latencyOffset' (cmd-line flag configured for VictoriaMetrics single-node or vmselect). "+
{% if g.Unhealthy > 0 %}<span class="badge bg-danger" title="Number of rules with status Error">{%d g.Unhealthy %}</span> {% endif %}
{% if g.NoMatch > 0 %}<span class="badge bg-warning" title="Number of rules with status NoMatch">{%d g.NoMatch %}</span> {% endif %}
<span class="badge bg-success" title="Number of rules with status Ok">{%d g.Healthy %}</span>
{% if g.States["unhealthy"] > 0 %}<span class="badge bg-danger" title="Number of rules with status Error">{%d g.States["unhealthy"] %}</span> {% endif %}
{% if g.States["nomatch"] > 0 %}<span class="badge bg-warning" title="Number of rules with status NoMatch">{%d g.States["nomatch"] %}</span> {% endif %}
<span class="badge bg-success" title="Number of rules with status Ok">{%d g.States["ok"] %}</span>
return nil, nil, fmt.Errorf("incorrect match claim, key=%q, value regex=%q: %w", ck, cv, err)
}
parsedClaims = append(parsedClaims, pc)
}
ui.JWT.parsedMatchClaims = parsedClaims
sort.Strings(sortedClaims)
claimsString = strings.Join(sortedClaims, ",")
if oldUI, ok := uniqClaims[claimsString]; ok {
	return nil, nil, fmt.Errorf("duplicate match claims=%q found for name=%q at idx=%d; the previous one is set for name=%q", claimsString, ui.Name, idx, oldUI.Name)
responseTimeout = flag.Duration("responseTimeout", 5*time.Minute, "The timeout for receiving a response from backend")
requestBufferSize = flagutil.NewBytes("requestBufferSize", 32*1024, "The size of the buffer for reading the request body before proxying the request to backends. "+
	"This allows reducing the comsumption of backend resources when processing requests from clients connected via slow networks. "+
	"This allows reducing the consumption of backend resources when processing requests from clients connected via slow networks. "+
	"Set to 0 to disable request buffering. See https://docs.victoriametrics.com/victoriametrics/vmauth/#request-body-buffering")
maxRequestBodySizeToRetry = flagutil.NewBytes("maxRequestBodySizeToRetry", 16*1024, "The maximum request body size to buffer in memory for potential retries at other backends. "+
	"Request bodies larger than this size cannot be retried if the backend fails. Zero or negative value disables request body buffering and retries. "+
Err: fmt.Errorf("all the %d backends for the user %q are unavailable", up.getBackendsCount(), ui.name()),
Err: fmt.Errorf("all the %d backends for the user %q are unavailable for proxying the request - check previous WARN logs to see the exact error for each failed backend", up.getBackendsCount(), ui.name()),
StatusCode: http.StatusBadGateway,
}
httpserver.Errorf(w, r, "%s", err)
@@ -710,7 +764,7 @@ var concurrentRequestsLimitReached = metrics.NewCounter("vmauth_concurrent_reque
func usage() {
	const s = `
vmauth authenticates and authorizes incoming requests and proxies them to VictoriaMetrics.
vmauth authenticates and authorizes incoming requests and proxies them to VictoriaMetrics components or any other HTTP backends.
See the docs at https://docs.victoriametrics.com/victoriametrics/vmauth/ .
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce restore duration")
maxBytesPerSecond = flagutil.NewBytes("maxBytesPerSecond", 0, "The maximum download speed. There is no limit if it is set to 0")
skipBackupCompleteCheck = flag.Bool("skipBackupCompleteCheck", false, "Whether to skip checking for 'backup complete' file in -src. This may be useful for restoring from old backups, which were created without 'backup complete' file")
SkipPreallocation = flag.Bool("skipFilePreallocation", false, "Whether to skip pre-allocated files. This will likely be slower in most cases, but allows restores to resume mid file on failure")