VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2026-05-17 00:26:36 +03:00

Author	SHA1	Message	Date
Zhu Jiekun	3d3cc4bceb	lib/memory: adds memory.allowedBytes warning message This commit adds a warning message, if `-memory.allowedBytes` has value less than 1MB. It should help to debug possible issues, if there is a problem with app start-up due to low memory limit. For example, fastcache could panic at `-memory.allowedBytes=` Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10935	2026-05-12 22:31:23 +02:00
Uğur Tafralı	475675b16c	lib/backup/fslocal: remove trailing slash in provided directory (#10825) Trailing slash in -storageDataPath was causing vmrestore to panic. The fix calls filepath.Clean() in Init() to normalize the path. Added a test to verify ListParts works correctly with a trailing slash. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10823 PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10825 --------- Signed-off-by: JAYICE <jayice.zhou@qq.com> Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>	2026-05-12 18:11:47 +03:00
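The path normalization described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: `normalizeDir` is a hypothetical helper name, and the only real API used is `filepath.Clean` from the Go standard library.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// normalizeDir is a hypothetical helper. filepath.Clean removes a trailing
// separator (among other normalizations), so both spellings of the same
// directory compare equal afterwards.
func normalizeDir(dir string) string {
	return filepath.Clean(dir)
}

func main() {
	withSlash := normalizeDir("/vmstorage-data/")
	withoutSlash := normalizeDir("/vmstorage-data")
	fmt.Println(withSlash == withoutSlash) // true
}
```

Normalizing once at initialization means later code that trims the directory prefix from file paths sees the same prefix length regardless of how the flag was spelled.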
Max Kotliar	243037823a	app/vmagent: fix rare hash collision in getLabelsHash (#10937 ) Add '=' separator between label name and value when computing the hash to prevent false collisions, like {a="bc"} and {ab="c"} hashing to the same value. getLabelsHashForShard is added to avoid sharding disruptions in vmagent (-remoteWrite.shardByURL=true mode). The function preserves previous behavior, without '=' between name and value. PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10937	2026-05-12 15:42:55 +03:00
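The collision described in the commit above can be reproduced with a short sketch. The hash function (`hash/fnv`), the `label` type and the `hashLabels` helper are assumptions made for illustration; they are not the types or hash used by vmagent.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// label is a hypothetical stand-in for a time series label.
type label struct {
	name, value string
}

// hashLabels feeds name, sep and value of every label into one hash.
// With sep == "" the byte stream for {a="bc"} and {ab="c"} is identical.
func hashLabels(labels []label, sep string) uint64 {
	h := fnv.New64a()
	for _, l := range labels {
		h.Write([]byte(l.name))
		h.Write([]byte(sep))
		h.Write([]byte(l.value))
	}
	return h.Sum64()
}

func main() {
	x := []label{{"a", "bc"}}
	y := []label{{"ab", "c"}}
	fmt.Println(hashLabels(x, "") == hashLabels(y, ""))   // true: both hash the bytes "abc"
	fmt.Println(hashLabels(x, "=") == hashLabels(y, "=")) // false: "a=bc" vs "ab=c"
}
```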
andriibeee	85e0253569	lib/protoparser: add flag to allow OpenTelemetry underscore labels to pass through without being prefixed (#10475 ) Add `-opentelemetry.labelNameUnderscoreSanitization` command-line flag to control whether to enable prepending of `key` to labels starting with `_` when `-opentelemetry.usePrometheusNaming` is enabled. The labels starting with `__` are not modified. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9663 Signed-off-by: andriibeee <154226341+andriibeee@users.noreply.github.com> Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>	2026-05-12 15:17:36 +03:00
Andrii Chubatiuk	e7c46a0f4c	lib/streamaggr: use max samples lag for flush delay calculation (#10835 ) ### Describe Your Changes fixes #10402 use max sample lag for flush delay calculation when aggregation windows enabled. before 95th percentile of samples lag was used, which led to dropped data ### Checklist The following checks are mandatory: - [ ] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist). - [ ] My change adheres to [VictoriaMetrics development goals](https://docs.victoriametrics.com/victoriametrics/goals/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2026-05-11 13:21:20 +02:00
Alexei Margasov	20d4314168	lib/streamaggr: fix stale quantiles output (#10918 ) ### Describe Your Changes Fix stale `quantiles(...)` stream aggregation output for series without samples in the current aggregation interval. Previously, `quantilesAggrConfig` reused the `quantiles` buffer across aggregation values. If `quantilesAggrValue.flush` was called for a series without samples after another series had already calculated quantiles, the stale quantile values could be emitted for the empty series. This could produce unrealistic `_quantiles` output values and make the same aggregated value appear across unrelated labelsets. The PR skips `quantiles(...)` output when there is no histogram for the current interval and adds a regression test for this case. ### Checklist The following checks are mandatory*: - [x] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist). - [x] My change adheres to [VictoriaMetrics development goals](https://docs.victoriametrics.com/victoriametrics/goals/). --------- Co-authored-by: hagen1778 <roman@victoriametrics.com>	2026-05-11 13:12:23 +02:00
Roman Khavronenko	b30c307bbb	lib/streamaggr: update sync tests (#10939) synctest runs the inner closure in a new goroutine, which makes the `t.Helper` instruction useless for `t.Fatalf` checks. So when a test fails, we observe the log line where `t.Fatalf` was called instead of where `f()` was called. Moving checks out of the synctest closure makes `t.Helper` useful again. -- In the synctest we were waiting to ingest a new batch of samples for the aggregation interval. Because of this, the new batch had a 50% chance of being ingested in either the previous or the current aggregation interval, depending on whether the Go runtime had initiated the flush() call or not. This change waits an additional 1ms for the flush to happen. Locally, it stopped producing flaky tests. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2026-05-11 13:06:36 +02:00
JAYICE	696c1aa3e8	lib/fs: introduce new metric for Filesystem type name This commit introduces a new metric to expose fs type for the provided path. For example: ``` vm_fs_info{path="/vmstorage-data", fs_type="xfs"} ``` Path must be registered with new method `fs.RegisterPathFsMetrics`. fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10482	2026-05-08 09:17:03 +02:00
Hui Wang	76e0bcdf45	lib/prompb: support prometheus native histogram during ingestion This commit adds support for Prometheus Native Histogram https://prometheus.io/docs/specs/native_histograms data ingestion via Prometheus RemoteWrite format. It converts Native Histograms into VictoriaMetrics histogram format. fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10743	2026-05-07 19:06:51 +02:00
f41gh7	8474f15359	lib/httpserver: support multitenancy via headers This commit adds the possibility to omit tenantID in the URL path. In this case, tenantID will be fetched from the HTTP headers `AccountID` and `ProjectID`. If the headers are missing too, then the default `0:0` tenantID is used. This functionality can be enabled only if the -enableMultitenantHandlers cmd-line flag is passed to vminsert, vmselect or vmagent. Motivation: this change makes VM configuration for multitenancy consistent with VL configuration - see https://docs.victoriametrics.com/victorialogs/#multitenancy. It also keeps backward compatibility at the same time. fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4241	2026-05-06 17:49:54 +02:00
Julius Rickert	099ec5c25a	lib/promscrape//etzner: update hetzner_sd_configs for Hetzner Cloud datacenter → location API change On 2025-12-16, Hetzner Cloud deprecated the `datacenter` field in their Servers API and introduced a top-level `location` field carrying the same data. The `datacenter` field will be removed after 2026-07-01. Without this change, `__meta_hetzner_hcloud_datacenter_location`, and `__meta_hetzner_hcloud_datacenter_location_network_zone` would silently become empty for the `hcloud` role after that date. This mirrors the change made in Prometheus v3.11.0 ([prometheus/prometheus#17850](https://github.com/prometheus/prometheus/pull/17850)). ## Changes `hcloud` role: - Add `HCloudLocation` struct and `Location` field on `HCloudServer`, mapped to the new top-level `location` API field - Emit two new canonical labels: `__meta_hetzner_hcloud_location` and `__meta_hetzner_hcloud_location_network_zone` - Keep the deprecated `__meta_hetzner_hcloud_datacenter_location` and `__meta_hetzner_hcloud_datacenter_location_network_zone` labels, now sourced from the new `location` field so they continue to work past 2026-07-01 - `__meta_hetzner_datacenter` (the datacenter name, e.g. `fsn1-dc14`) is unaffected for this role — the datacenter name is a distinct concept from location and is kept as-is (this will stop working starting 2026-07-01) `robot` role: - Add `__meta_hetzner_robot_datacenter` as the canonical replacement for `__meta_hetzner_datacenter`; the old label is kept for backward compatibility Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10909	2026-05-05 17:51:13 +02:00
Nikolay	64e43e59a7	lib/httpserver: suppress TCP health check for tls connections Previously, if `-tls` flag was provided, victoria metrics components produced the following log error entry at health checks: http: TLS handshake error from 10.244.0.1:46556: EOF Such health checks are common for many orchestration systems, such as consul or kubernetes. And default http server already suppresses such EOF health checks. This commit adds suppression to the tls server as well. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10538	2026-04-29 09:59:57 +02:00
Nikolay	89c0b1c1aa	lib/opentelemetry: properly reset metric metadata Previously, metricMetadata was not properly reset during parsing of metrics. It could result in the `Unit` suffix from a previously parsed metric being added to the next metric without a `Unit` field. For example, metric `http_request` with `Unit` `seconds` is converted into `http_request_seconds` and the `Unit` field holds `seconds`. The next parsed metric `cpu_usage_ratio` has no `Unit`, so it gets the previous `seconds` `Unit` -> `cpu_usage_ratio_seconds`. This commit adds a metricMetadata reset call before parsing the next metric. The bug was introduced in `293d80910c` Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10889	2026-04-28 11:17:12 +02:00
Artem Fetishev	c317e95ab8	lib/storage: support samples with future timestamps (#10718 ) Add the support of storage and retrieval of samples with future timestamps as requested in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/827 What to expect: - By default, the max future timestamp is still limited to `now+2d`. To change it, set the `-futureRetention` flag in `vmstorage`. The max flag value is currently limited to `100y`. It can be extended if we see a demand for this, but it can't be more than `~ 290y` due to how the time duration is implemented in Go. The flag value can't be less than `2d`. - downsampling and retention filters (available in enterprise edition) are currently not supported for future timestamps - If `vmstorage` restarts with a smaller value of `-futureRetention` flag, any future partitions that are outside the new future retention will be automatically deleted. - Data ingestion, data retrieval, backup/restore, timeseries (soft) deletion, and other operations work with future timestamps the same way as with the historical timestamps. - In the cluster version, the affected binaries are `vmstorage` and `vmselect`. This means that `vmselect` version must match `vmstorage` version if you want to query future timestamps. `vminsert` was not affected, so its version can be a lower one. - If you downgrade the `vmstorage`, the data with future timestamps will remain on disk and memory (per-partition caches) but won't be available for querying. Signed-off-by: Artem Fetishev <rtm@victoriametrics.com> Signed-off-by: Artem Fetishev <149964189+rtm0@users.noreply.github.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>	2026-04-23 18:12:33 +02:00
Artem Fetishev	a875597b09	lib/timeutil: ensure parsed time is in allowed range (#10870 ) Update `timeutil.ParseTimeAt` to check the time limits for all date/time formats, not just year. Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>	2026-04-23 17:37:15 +02:00
andriibeee	a3df0f890b	lib/cgroup: support reading cpu/memory limits from systemd slices cgroup v2 supports slices (aka path hierarchy) for resource limits. It's mostly supported by systemd and container runtimes built on top of it. This commit reads the subpath for systemd slices and traverses it, taking the minimal limit value. Related docs: https://docs.oracle.com/en/operating-systems/oracle-linux/9/systemd/SystemdMngCgroupsV2.html#SlicesServicesScopesHierarchy https://www.freedesktop.org/software/systemd/man/latest/systemd.slice.html Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10635	2026-04-22 10:18:03 +02:00
andriibeee	05112e54e2	lib/netutil: fix IPv6 address corruption in proxy protocol v2 parser Proxy protocol parser kept sub-slice reference for pooled bytesBuffer at readProxyProto ``` bb := bbPool.Get() defer bbPool.Put(bb) // ← buffer returned to pool AFTER function returns ... IP: bb.B[0:16], // ← BUG: sub-slice of pooled buffer! ... ``` This commit properly allocates new slice for ipv6 address and copies buffer content to it. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10839	2026-04-20 12:11:04 +02:00
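The aliasing problem described in the commit above can be shown without a real pool. In this sketch the buffer reuse is simulated by overwriting `buf`; `parseAliased` and `parseCopied` are hypothetical names and are not the functions from lib/netutil.

```go
package main

import (
	"fmt"
	"net"
)

// parseAliased returns a sub-slice of buf, so the result shares memory
// with the (pooled) buffer and changes when the buffer is reused.
func parseAliased(buf []byte) net.IP {
	return net.IP(buf[0:16])
}

// parseCopied allocates a fresh 16-byte slice and copies the address,
// so later reuse of buf cannot affect the result.
func parseCopied(buf []byte) net.IP {
	ip := make(net.IP, 16)
	copy(ip, buf[0:16])
	return ip
}

func main() {
	buf := make([]byte, 16)
	buf[15] = 1 // the IPv6 loopback address ::1

	aliased := parseAliased(buf)
	copied := parseCopied(buf)

	// Simulate another goroutine taking the buffer from the pool and
	// overwriting its contents.
	for i := range buf {
		buf[i] = 0xff
	}

	fmt.Println(aliased) // ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
	fmt.Println(copied)  // ::1
}
```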
Andrii Chubatiuk	ce227fe7d9	lib/streamaggr: added vm_streamaggr_counter_resets_total counter (#10807 ) ### Describe Your Changes Added `vm_streamaggr_counter_resets` metric for `rate`, `total`, and `increase` outputs, which is useful for unpredictable output behaviour investigation. ### Checklist The following checks are mandatory*: - [ ] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist). - [ ] My change adheres to [VictoriaMetrics development goals](https://docs.victoriametrics.com/victoriametrics/goals/). --------- Signed-off-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: Roman Khavronenko <hagen1778@gmail.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2026-04-20 11:48:03 +02:00
Nikolay	a29229a877	lib/promscrape: prevent unbounded scrape error body read Previously, on non-200 HTTP status codes, lib/promscrape performed an unbounded body read, which could potentially result in OOM. This commit adds a maxScrapeSize limit to error response body reads, protecting against malicious or misbehaving metrics endpoints.	2026-04-16 22:50:08 +02:00
cubic-dev-ai[bot]	3fe606770f	fix: prevent deadlock in vmrestore worker pool on context cancellation Workers in runParallelPerPathInternal check ctxLocal.Done() before processing each work item and exit early on cancellation — without sending a result to resultCh. However, the coordinator loop always waits for exactly len(perPath) results from resultCh. If cancellation occurs before all tasks report, the read blocks indefinitely.	2026-04-16 22:44:31 +02:00
andriibeee	a36395500b	lib/awsapi: pre-populate credentials only for static creds without roleARN `0aaa741b5b` introduced a regression in lib/awsapi/config.go that causes empty credentials to be returned on the very first call to getFreshAPICredentials() when using EKS Pod Identity (or any container credential mechanism with no static access key). These empty credentials are then used for SigV4 signing -> 403 Forbidden on every remote write request. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10815	2026-04-16 11:51:42 +02:00
Alexander Frolov	d07c1c73d1	lib/writeconcurrencylimiter: prevent deadlock at IncConcurrency Previously (writeconcurrencylimiter.Reader).Read() could permanently leak concurrency tokens from the -maxConcurrentInserts semaphore. Consider the following example: GetReader() acquires a token, then PutReader() unconditionally releases it. * Read() calls DecConcurrency() before the underlying I/O and IncConcurrency() after it. If IncConcurrency() returns an error, Read() returns without holding a token. * Each such failure permanently removes one slot from the concurrencyLimitCh semaphore. Slots leak one by one until the channel is fully drained, at which point DecConcurrency() blocks forever, deadlocking ingestion on vmstorage. This commit adds tracking for obtained tokens to the reader. Which prevents possible tokens leakage. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10784	2026-04-10 19:35:59 +02:00
Aliaksandr Valialkin	8fa0fae05a	lib/protoparser/protoparserutil: fix `encoding -> contentType` in the description of the ReadUncompressedData function This is a follow-up for the commit `bed7cbd0a4`	2026-04-10 15:20:27 +02:00
Noureldin	f95b483a13	lib/storage: fixes data race at startFreeDiskSpaceWatcher Previously, Storage.table was initialized after startFreeDiskSpaceWatcher was called. This created a potential data race condition: if openTable took a long time to complete and freed disk space during that window, the free disk space watcher could read an uninitialized (or partially initialized) Storage.table, leading to an invalid memory address or nil pointer dereference panic. This commit properly initializes s.isReadOnly state during storage start and starts FreeDiskSpaceWatcher after openTable. Bug was introduced in github.com/VictoriaMetrics/VictoriaMetrics/commit/27b958ba8bc66578206ddac26ccf47b2cc3e8101 Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10747	2026-04-10 08:33:49 +02:00
Max Kotliar	0a31eacb3d	lib/{osinfo,appmetrics}: Move vm_os_info metric code to lib/appmetrics package (#10776 ) Follow-up commit for `211fb08028` Address @f41gh7 review comments: - Move code from `lib/osinfo` to `lib/appmetrics`. - Make the logic private. - Use metrics.WriteGaugeUint64 func. - Remove registration logic from `app/xxx/main.go`. - Remove `lib/osinfo` package.	2026-04-09 18:32:47 +03:00
Artem Fetishev	70b0115ea6	lib/storage: reuse nextDayMetricIDs during the first hour of the day (#10704) At 00:00 UTC the ingested samples start to have timestamps for the new day (if the ingested samples are always recent). Even though there was a next-day prefill of the per-day index during the last hour of the day, some performance degradation is still possible. For example, in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10698 it manifests as `vminsert-to-vmstorage connection saturation` peaks right after midnight. A possible hypothesis for why this is happening: at midnight, currHourMetricIDs is empty and prevHourMetricIDs cannot be used because it holds metricIDs for the previous day. So the ingestion logic hits dateMetricIDsCache, which may not have the metricID in its read-only buffer and therefore must acquire a lock to check its prev read-only buffer or read-write buffer. This creates lock contention and therefore raises ingestion request latency. A solution to this could be re-using the nextDayMetricIDs during the first hour of the day. During this time, it is equivalent to currHourMetricIDs. --------- Signed-off-by: Artem Fetishev <rtm@victoriametrics.com> Signed-off-by: Artem Fetishev <149964189+rtm0@users.noreply.github.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>	2026-04-09 16:33:42 +02:00
JAYICE	211fb08028	introduce os kernel version information metric (#10746) The commit introduces the `vm_os_info` metric, which is exposed by all VM binaries by default. It provides visibility into the operating system version on which VictoriaMetrics is running, helping with troubleshooting environment-specific issues, like known kernel or fs bugs. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10481 PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10746 Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>	2026-04-09 14:43:25 +03:00
andriibeee	e1a9901654	vmselect: add CSV header support for export/import (#10706 ) Export (/api/v1/export/csv) now always writes a header row matching the requested format fields. Examples: ``` # format=__timestamp__:unix_ms,__value__,job,instance __timestamp__:unix_ms,__value__,job,instance 1704067200000,42.5,node,localhost:9090 ``` Import (/api/v1/import/csv) gains auto-detection logic: the first row is skipped if any timestamp column fails timestamp parsing or any metric value column fails float parsing. If the first row is not detected as headers, it is parsed as data. This makes the import backward compatible. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10666 PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10706 ### Checklist The following checks are mandatory: - [x] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist). - [x] My change adheres to [VictoriaMetrics development goals](https://docs.victoriametrics.com/victoriametrics/goals/). --------- Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>	2026-04-09 14:00:39 +03:00
andriibeee	0aaa741b5b	lib/awsapi: add support for named AWS profile to ec2_sd_config Add support for named AWS profiles in ec2_sd_config, matching Prometheus behavior. Example: ```text ~/.aws/config: [profile account-one] source_profile = root role_arn = arn:aws:iam::000000000001:role/prometheus ``` ```yaml scrape config: - job: ec2 ec2_sd_configs: - profile: account-one ``` Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1685	2026-04-09 11:17:17 +02:00
Artem Fetishev	accb06d131	lib/storage: refactor storage synctests Extract repeated code from nextDayMetricIDs synctests into separate funcs to make the code more readable. The change was originally introduced in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10704 and was extracted into a separate PR to keep the original change simple.	2026-04-09 09:07:37 +02:00
JAYICE	0a256002e5	lib/promscrape: update last scrape result only when current scrape is successful Previously, the last scrape result was updated unconditionally, despite a possible scrape error. The commit updates the last scrape result only on a successful scrape. It properly accounts for the `scrape_series_added` metric and aligns it with the same metric in Prometheus. fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10653	2026-04-06 17:14:47 +02:00
JAYICE	de2bc4237a	lib/backup/s3: retry the requests that failed with unexpected EOF When the network between the client and the s3 server is unstable, the client may encounter temporary io.EOF errors when reading the response from the s3 server. Currently, the s3 sdk in vmbackup uses the default retry policy. However, this default retry policy won't retry when the s3 sdk encounters an unexpected EOF. This means that a temporary unexpected EOF error will cause the backup task to fail. fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10699	2026-04-03 10:26:58 +02:00
Aliaksandr Valialkin	577b161343	docs/victoriametrics/changelog/CHANGELOG.md: add a description for the change in the commit `dd2d6807e4`	2026-04-02 13:18:38 +02:00
Mehrdad Banikian	dd2d6807e4	Add split phase metrics for filestream fsync operations (#10493 ) ## Summary This PR implements split phase metrics for filestream operations as requested in #10432. ### Changes - Added `vm_filestream_fsync_duration_seconds_total` metric to track fsync syscall duration separately - Added `vm_filestream_fsync_calls_total` metric to count fsync calls - Added `vm_filestream_write_syscall_duration_seconds_total` metric to track write syscall duration (previously mixed with flush time) - Refactored `MustClose()` and `MustFlush()` to use new `flush()` and `sync()` helper methods - Kept `vm_filestream_write_duration_seconds_total` for backward compatibility ### Problem Solved Previously, `vm_filestream_write_duration_seconds_total` was being incremented in two places: 1. `statWriter.Write()` - triggered by `bw.Flush()` and `bw.Write()` 2. `Writer.MustFlush()` - which included the above process, leading to double-counting This made it impossible to distinguish between write syscall time and fsync time, which is critical for diagnosing storage latency issues. ### Solution The new metrics allow users to: - Distinguish "flush got slower" vs "fsync got slower" using metrics only - No file path labels (bounded cardinality) - No double-counting between metrics ### Testing - Code compiles successfully - All existing metrics are preserved for backward compatibility Closes #10432 --------- Signed-off-by: Aliaksandr Valialkin <valyala@victoriametrics.com> Signed-off-by: Aliaksandr Valialkin <valyala@gmail.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>	2026-04-02 13:14:33 +02:00
Vadim Alekseev	bc708c8568	lib/timeutil: introduce backoff timer struct (#10714 ) ### Describe Your Changes I noticed that the backoff timer logic is repeated across multiple packages. I've implemented a universal wrapper to avoid duplicating this logic. This structure is already [actively used](`2aa0ea10bb/app/vlagent/kubernetescollector/backoff_timer.go (L11)`) for the Kubernetes Collector in vlagent and can be reused in vlagent's remotewrite. I've also included a usage example in this PR so you can evaluate its utility. ### Checklist The following checks are mandatory: - [X] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist). - [X] My change adheres to [VictoriaMetrics development goals](https://docs.victoriametrics.com/victoriametrics/goals/).	2026-04-02 12:31:28 +02:00
Aliaksandr Valialkin	527d09653a	lib/storage: remove MetricNamesStatsResponse and MetricNamesStatsRecord types These types hide public types from lib/storage/metricnamestats package. These types do not resolve any practical issues. Instead, they add a level of indirection, which complicates reading and understanding the code. These types were introduced in the commit `795d3fe722` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6145	2026-04-01 22:25:50 +02:00
Nikolay	57ce00a5c6	lib/fs: restore async deletion of NFS folders Commit `83da33d8cf` removed NFS directory delete retries. It was made on the assumption that only directory rename could cause such issues. However, both rename and unlink use the same "silly rename" logic https://linux-nfs.org/wiki/index.php/Server-side_silly_rename (see `nfs_unlink` and `nfs_rename` in `fs/nfs/dir.c` of the Linux kernel). The NFS client may treat a file as still open even if it was properly closed by the application. Most probably this is triggered because VictoriaMetrics may open the same file multiple times (data reads and background merges). There is no issue with VictoriaMetrics itself; it properly closes files. But the NFS client may have delays or cache metadata for the files, which could trigger the silly rename behavior. This commit restores the original behavior with deletion retries and brings back metrics for unsuccessful delete operations. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9842	2026-03-27 09:51:27 +01:00
Benjamin Nichols-Farquhar	febafc1cf1	lib/backup: speed up restores on linux systems (#10661) Related to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10680 We noticed that backup restores in our environment were much slower than the hardware/bandwidth constraints would suggest, and we traced this down to a couple of bottlenecks. This PR attempts to address all of them. #### Lack of pre-allocation of files This was causing writes far into files to be quite slow, as new blocks needed to be continually allocated. This was particularly bad on ext4 for us, but will likely be applicable to most disks and filesystems. You'll see the impl here is linux specific, but this is mostly because I don't have a test env for any other platform and didn't want to blindly make changes without a validation env. This comes with the downside of no longer being able to resume a restore mid file, and requiring the re-downloading of parts already in the file, since the file will appear at full size from the very start. This is, I think, _generally_ a good tradeoff for the restore speed gains. It is definitely a tradeoff, so I've included a flag to disable the pre-allocation behavior and fall back to the existing part diffing logic. #### Fsync after each part With many small parts in relatively few files, or in high concurrency setups, the writerCloser fsync on each part (actually a double fsync, since both `filestream.Writer.mustFlush` and `filestream.Writer.mustClose` fsync) was causing slowdowns, since we would be continually queuing fsyncs. With the pre-allocation pattern the file is only "ready" once renamed, so I moved to a per-file fsync after rename. #### Concurrent read/write The previous download pattern was to do a read from the remoteFs, with whatever latency that entailed, then sequentially do a write, again with whatever latency that entailed. This meant that throughput was limited to `readLatency + writeLatency * blockSize`. Similar to how `crossTypeCopy` is implemented in the backup process, we can instead use `io.Pipe` to allow two goroutines to work in parallel with a small buffer between them. #### Pagecache avoidance `filestream.Writer` does quite a lot to avoid polluting the page cache, but this is not relevant in a restore context, and with large sequential block writes it's much more efficient to let the OS flush the pagecache whenever it wants rather than doing a bunch of small buffer syscalls to flush blocks. Therefore this switches over to a much simpler directWriterCloser that does direct file IO and lets the OS handle flushes while mid write. ### Performance Before the changes we were seeing write speeds of only 100MB/s; this was a restore from EBS volumes, ext with 1GB/s throughput. After these changes in the same restore env we're seeing 600MB/s flat rates. Signed-off-by: Max Kotliar <kotlyar.maksim@gmail.com> Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>	2026-03-27 07:35:44 +02:00
Artem Fetishev	a07cae3279	lib/lrucache: remove shards (#10697 ) Remove shards as they only complicate things when the number of requests per second is in the range of thousands. Related to #10532. --------- Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>	2026-03-26 16:28:01 +01:00
Hui Wang	2d6cf8827d	lib/protoparser/opentelemetry: support ExponentialHistogram negative buckets (#10669 ) Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9896#issuecomment-4037424586. Histogram-related functions such as histogram_quantile() and the VMUI heatmap also work with negative bucket values. Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>	2026-03-26 11:57:19 +02:00
andriibeee	be5ae9b95c	lib/jwt: support array claim values in match_claims This commit allows to perform JWT claim matching over 1 dimension arrays. It could be useful from practical standpoint. Because permissions are usually assigned as a list of values. For example, the following config allows admin access over list of assigned roles for user: ```yaml match_claims: access.roles: "admin" ``` JWT token: ```json { "access": { "roles": [ "read", "write", "admin" ] } } ``` Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10647	2026-03-26 10:23:43 +01:00
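Matching a claim against either a scalar or a one-dimensional array, as described in the commit above, can be sketched as follows. The helper `claimMatches`, the dotted-path lookup and the exact string comparison are assumptions for illustration and are not the lib/jwt implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// claimMatches walks the dotted path through nested JSON objects and
// reports whether the claim equals want, or, for an array claim, whether
// any string element of the array equals want.
func claimMatches(claims map[string]any, path, want string) bool {
	var cur any = claims
	for _, key := range strings.Split(path, ".") {
		m, ok := cur.(map[string]any)
		if !ok {
			return false
		}
		cur, ok = m[key]
		if !ok {
			return false
		}
	}
	switch v := cur.(type) {
	case string:
		return v == want
	case []any:
		for _, e := range v {
			if s, ok := e.(string); ok && s == want {
				return true
			}
		}
	}
	return false
}

func main() {
	payload := `{"access": {"roles": ["read", "write", "admin"]}}`
	var claims map[string]any
	if err := json.Unmarshal([]byte(payload), &claims); err != nil {
		panic(err)
	}
	fmt.Println(claimMatches(claims, "access.roles", "admin")) // true
	fmt.Println(claimMatches(claims, "access.roles", "owner")) // false
}
```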
andriibeee	60aef0510f	lib/promauth: make username optional in basic_auth section RFC-7617 allows empty password/username. Moreover, from RFC standpoint both empty values are valid as well. It should be just encoded as `:`. So this commit relaxes non-empty username restriction. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6956	2026-03-26 02:18:37 -07:00
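The encoding mentioned in the commit above can be checked with a short sketch. `basicAuthHeader` is a hypothetical helper, not the lib/promauth code; it builds the `Authorization` header value in the RFC 7617 form `Basic base64(username ":" password)`.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// basicAuthHeader builds the value of the Authorization header.
// Empty username and/or password are encoded like any other value,
// so two empty strings produce base64(":").
func basicAuthHeader(username, password string) string {
	creds := username + ":" + password
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(creds))
}

func main() {
	fmt.Println(basicAuthHeader("", "")) // Basic Og==
}
```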
Ty Sarna	70ab2c1585	lib/protoparser/prometheus: add support for OpenMetrics-specific metric types (#10689 ) - Adds `info`, `gaugehistogram`, `stateset`, and `unknown` as recognized metric type names in the Prometheus/OpenMetrics text format parser. - Previously these valid [OpenMetrics](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md) types hit the `default` case and emitted an `error`-level log on every scrape, flooding logs and continuously triggering the `TooManyLogs` alert. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10685 Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>	2026-03-24 15:34:33 +02:00
Artem Fetishev	95175e00b4	lib/lrucache: sizeBytes should also include key length (#10679 ) There are cases when the key sizeBytes is much greater than the value sizeBytes. Therefore it is important to include the key sizeBytes in the total. Also fix some code comments. Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>	2026-03-24 12:54:31 +01:00
Artem Fetishev	d21d9e8382	lib/storage: Improve indexDB error messages (#10684 ) Fixes: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9499 --------- Signed-off-by: Artem Fetishev <rtm@victoriametrics.com> Signed-off-by: Nikolay <nik@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>	2026-03-24 12:19:04 +01:00
andriibeee	e761f22049	lib/netutil: warn when IPv6 listen address is used without -enableTCP6 (#10640 ) Fixes #6858 --------- Signed-off-by: andriibeee <154226341+andriibeee@users.noreply.github.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com> Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>	2026-03-18 21:01:55 +02:00
andriibeee	fb579cf592	lib/jwt: fail on unsupported alg when use=sig, skip non-sig JWKS keys (#10664 ) Fixes #10663 --------- Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>	2026-03-18 20:40:04 +02:00
andriibeee	2bb03f6e34	lib/storage, lib/mergeset: properly account inmemoryPart refCount Previously inmemoryPart refCount was not properly decremented. Previous behavior: * createInmemoryPart calls newPartWrapperFromInmemoryPart and returns a partWrapper with refCount=1 * multiple parts are merged in mustMergeInmemoryPartsFinal, which creates a new merged part * the source partWrappers are never decRef'd * since refCount never reaches 0, putInmemoryPart and (*part).MustClose are never called This commit properly decrements refCount in mustMergeInmemoryPartsFinal. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10086	2026-03-17 10:54:08 +01:00
Br1an	92f03344eb	lib/promscrape/discovery/yandexcloud: add folder_ids option This commit adds a new `folder_ids` field in `yandexcloud_sd_configs` that allows users to specify Yandex Cloud folder IDs directly, bypassing the organization->cloud->folder hierarchy traversal. Previously, the Yandex Cloud service discovery required traversing the entire resource hierarchy (organizations -> clouds -> folders -> instances) to discover instances. This works when the Service Account has permissions at all levels. However, some Service Accounts may only have permissions at the folder level, causing discovery to fail when it cannot access organization or cloud resources. With this change, users can now configure folder IDs directly: ```yaml yandexcloud_sd_configs: - service: compute folder_ids: - folder-id-1 - folder-id-2 ``` When `folder_ids` is specified, the discovery skips the hierarchy traversal and directly queries instances from the specified folders. This is a backward-compatible change - when `folder_ids` is not specified, the existing behavior is preserved. fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10587	2026-03-17 10:51:05 +01:00
Arie Heinrich	14090c5a07	all: spelling fixes in code comments (#10650 ) Fix spelling issues in comments and text strings.	2026-03-16 11:11:54 +01:00
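
The array matching described in commit be5ae9b95c (lib/jwt: support array claim values in match_claims) can be illustrated with a minimal Go sketch. This is not the lib/jwt implementation: `claimMatches` is a hypothetical helper, and the dotted-path lookup is an assumption made for illustration. It treats a claim as matching when the value at the path is either the expected string or a one-dimensional array containing it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// claimMatches is a hypothetical helper (not part of lib/jwt).
// It walks the dotted path through nested JSON objects and reports
// whether the final value equals want, or is an array containing want.
func claimMatches(claims map[string]any, path, want string) bool {
	var cur any = claims
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]any)
		if !ok {
			return false // path goes through a non-object value
		}
		cur, ok = obj[key]
		if !ok {
			return false // claim is missing
		}
	}
	switch v := cur.(type) {
	case string:
		return v == want
	case []any:
		// One-dimensional array: match if any string element equals want.
		for _, item := range v {
			if s, ok := item.(string); ok && s == want {
				return true
			}
		}
	}
	return false
}

func main() {
	// Token payload taken from the commit message example.
	payload := `{"access":{"roles":["read","write","admin"]}}`
	var claims map[string]any
	if err := json.Unmarshal([]byte(payload), &claims); err != nil {
		panic(err)
	}
	fmt.Println(claimMatches(claims, "access.roles", "admin"))  // prints "true"
	fmt.Println(claimMatches(claims, "access.roles", "delete")) // prints "false"
}
```

Nested arrays are deliberately not traversed here, which mirrors the "one-dimensional" restriction stated in the commit message.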

3480 Commits