Compare commits

..

206 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
c15b0769a6 docs/CHANGELOG.md: cut v1.87.13 2024-01-17 00:20:04 +02:00
Aliaksandr Valialkin
e749e2be57 deployment/docker: update Go builder from Go1.21.5 to Go1.21.6 2024-01-17 00:10:57 +02:00
Aliaksandr Valialkin
41bdf087a6 lib/pushmetrics: wait until the background goroutines, which push metrics, are stopped at pushmetrics.Stop()
Previously the was a race condition when the background goroutine still could try collecting metrics
from already stopped resources after returning from pushmetrics.Stop().
Now the pushmetrics.Stop() waits until the background goroutine is stopped before returning.

This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5549
and the commit fe2d9f6646 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
2024-01-16 23:40:35 +02:00
hagen1778
ce9cd9e3ea app/all: follow-up after 84d710beab
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-16 23:39:31 +02:00
zhdd99
faaf6e8fd2 lib/pushmetrics: fix a panic caused by pushing metrics during the graceful shutdown process of vmstorage nodes. (#5549)
Co-authored-by: zhangdongdong <zhangdongdong@kuaishou.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-16 23:37:55 +02:00
Aliaksandr Valialkin
c008bbb91f vendor: run make vendor-update 2024-01-16 23:00:02 +02:00
Aliaksandr Valialkin
a2366e3b75 lib/storage: follow-up for 4b8088e377
- Clarify the bugfix description at docs/CHANGELOG.md
- Simplify the code by accessing prefetchedMetricIDs struct under the lock
  instead of using lockless access to immutable struct.
  This shouldn't worsen code scalability too much on busy systems with many CPU cores,
  since the code executed under the lock is quite small and fast.
  This allows removing cloning of prefetchedMetricIDs struct every time
  new metric names are pre-fetched. This should reduce load on Go GC,
  since the cloning of uin64set.Set struct allocates many new objects.
2024-01-16 22:45:23 +02:00
Roman Khavronenko
30ff086f4d lib/storage: properly check for storage/prefetchedMetricIDs cache expiration deadline (#5607)
Before, this cache was limited only by size.
Cache invalidation by time happens with jitter to prevent thundering herd problem.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-16 22:41:12 +02:00
Aliaksandr Valialkin
e175368935 app/vmselect/promql: simplify the code after 388d020b7c
Add a test, which verifies the correct sorting of float64 slices with NaNs.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5506
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509
2024-01-16 22:37:03 +02:00
Aliaksandr Valialkin
90562a59a2 app/vmselect/promql: follow-up for ce4f26db02
- Document the bugfix at docs/CHANGELOG.md
- Filter out NaN values before sorting as suggested at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509#discussion_r1447369218
- Revert unrelated changes in lib/filestream and lib/fs
- Use simpler test at app/vmselect/promql/exec_test.go

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5506
2024-01-16 22:20:29 +02:00
Zongyang
b478c4f56f FIX bottomk doesn't return any data when there are no time range overlap between timeseries (#5509)
* FIX sort order in bottomk

* Add lessWithNaNsReversed for bottomk

* Add ut for TopK

* Move lt from loop

* FIX lint

* FIX lint

* FIX lint

* Mod log format

---------

Co-authored-by: xiaozongyang <xiaozngyang@kanyun.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-16 22:19:51 +02:00
Aliaksandr Valialkin
d342f09917 app/vmstorage: addd missing -inmemoryDataFlushInterval command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2023-12-14 21:12:51 +02:00
Aliaksandr Valialkin
ddcc0efca4 docs/CHANGELOG.md: document the bugfix at 66c76a4d4d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5414
2023-12-14 12:52:31 +02:00
Anton Tykhyy
9f67f62f96 Fix sum(aggr_over_time) 'got 1 args' error (#3028) (#5414)
app/vmselect/promql/eval.go:evalAggrFunc shunts evaluation
of AggrFuncExpr over rollupFunc over MetricsExpr to an optimized
path. tryGetArgRollupFuncWithMetricExpr() checks whether expression
can be shunted, but it mangles the AggrFuncExpr when the aggregation
function has more than one argument. This results in queries like
`sum(aggr_over_time("avg_over_time",m))` failing with error message
'expecting at least 2 args to "aggr_over_time"; got 1 args' while
the analogous query `sum(avg_over_time(m))` executes successfully.
This fix removes the unnecessary mangling.

Signed-off-by: Anton Tykhyy <atykhyy@gmail.com>
2023-12-14 12:51:31 +02:00
Aliaksandr Valialkin
1cfa94814b vendor: run make vendor-update 2023-12-10 13:54:09 +02:00
Aliaksandr Valialkin
c731b9fb27 docs/CHANGELOG.md: cut v1.87.12 release 2023-12-10 13:46:19 +02:00
Aliaksandr Valialkin
f5c1c55c44 deployment/docker: update base Docker image from alpine:3.18.5 to alpine:3.19.0
See https://www.alpinelinux.org/posts/Alpine-3.19.0-released.html
2023-12-10 02:30:04 +02:00
Aliaksandr Valialkin
2eff6c9a82 Makefile: update golangci-lint version from v1.54.2 to v1.55.1
See https://github.com/golangci/golangci-lint/releases/tag/v1.55.1
2023-12-08 18:30:13 +02:00
hagen1778
a39f8f4e00 make: remove build duplicates for crossbuild
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-08 18:24:27 +02:00
Roman Khavronenko
92336db671 app/vmalert: sanitize label names before sending to Alertmanager (#5442)
Before, vmalert would send notifications with labels containing characters
  not supported by Alertmanager validator, resulting into validation errors
  like `msg="Failed to validate alerts" err="invalid label set: invalid name "foo.bar"`

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-08 18:11:49 +02:00
Aliaksandr Valialkin
2e17fe6d6a app/vmselect/prometheus: properly encode Prometheus label values at /federate endpoint
Prometheus spec says that only \, \n and " must be escaped inside label values.
See 995743836e/content/docs/instrumenting/exposition_formats.md (L90)

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5431
2023-12-07 16:13:48 +02:00
Aliaksandr Valialkin
bf3215a82f deployment/docker: update Go builder from Go1.21.4 to Go1.21.5
See https://github.com/golang/go/issues?q=milestone%3AGo1.21.5+label%3ACherryPickApproved
2023-12-07 00:15:51 +02:00
Aliaksandr Valialkin
1e50bb8723 deployment/docker: update backe Docker image from alpine 3.18.4 to 3.18.5
See https://www.alpinelinux.org/posts/Alpine-3.15.11-3.16.8-3.17.6-3.18.5-released.html
2023-12-03 18:57:52 +02:00
Max Golionko
2034b62a23 CI: disable docker scan, enable auto release to sandbox (#4476)
* disable docker scan

* disable nightly, enable auto release to sandbox

* remove whitespace
2023-11-15 10:22:15 +01:00
Aliaksandr Valialkin
a6171f1b5b vendor: run make vendor-update 2023-11-14 22:08:11 +01:00
Aliaksandr Valialkin
e3a7408feb app/vmalert/datasource: test fix after 8a1b93a49d 2023-11-14 22:06:29 +01:00
Aliaksandr Valialkin
8bd1e2f459 docs/CHANGELOG.md: cut v1.87.11 LTS release 2023-11-14 21:55:49 +01:00
Nikolay
b1e08530fb lib/querytracer: makes package concurrent safe to use (#5322)
* lib/querytracer: makes package concurrent safe to use
it must fix various issues with concurrent code usage.
Especially, when it's not reasonable to wait for all goroutines to be finished

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-11-14 21:01:43 +01:00
Noah Labrecque
4c8f7a293e fix: apply correct bounds to sf and tf (#5274) 2023-11-14 01:52:18 +01:00
Aliaksandr Valialkin
73d56fd5a4 deployment: update Go builder from Go1.21.3 to Go1.21.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.21.4+label%3ACherryPickApproved
2023-11-13 18:31:33 +01:00
Aliaksandr Valialkin
44db8c9f46 lib/regexutil: properly handle alternate regexps surrounded by .+ or .*
Previously the following regexps were improperly handled:

  .+foo|bar.+
  .*foo|bar.*

This could lead to unexpected regexp match results.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5297

Thanks to @Haleygo for the initial attempt to fix the issue at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5308
2023-11-13 18:29:00 +01:00
Aliaksandr Valialkin
f5ef2283c3 go.mod: increase the minimum supported Go version from Go1.19 to Go1.20 2023-11-02 21:30:46 +01:00
Aliaksandr Valialkin
511d1e5847 vendor: run make vendor-update 2023-11-02 21:18:17 +01:00
Aliaksandr Valialkin
e42af059a1 lib/persistentqueue: use the proper function call after 31945fb38d 2023-11-02 21:18:00 +01:00
Aliaksandr Valialkin
dc8bc5d6cd go.mod: pin the latest working version of golang.org/x/exp and github.com/VictoriaMetrics/metricsql 2023-11-02 21:14:19 +01:00
Aliaksandr Valialkin
31945fb38d lib/persistentqueue: properly re-create flock.lock file inside directory if persistent queue is broken.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5249

Thanks to @Sniper91 for the bugreport and initial fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5233
2023-10-31 18:37:18 +01:00
Aliaksandr Valialkin
ed76b92c7d lib/storage: follow-up for 29cebd82fb
Use atomic.CompareAndSwapUint32() instead of atomic.LoadUint32() followed by atomic.StoreUint32().
This makes the code more clear.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159
2023-10-31 16:30:30 +01:00
Roman Khavronenko
bae069553e lib/storage: log warning about RO mode only on state change (#5191)
Before, vmstorage would log the same message each second producing excessive
amount of logs.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-31 16:30:24 +01:00
Aliaksandr Valialkin
3284c9af81 docs/CHANGELOG.md: cut v1.87.10 2023-10-16 22:23:01 +02:00
Aliaksandr Valialkin
e8d40d589d deployment/docker: update Go builder from Go1.21.1 to Go1.21.3
See https://github.com/golang/go/issues?q=milestone%3AGo1.21.2+label%3ACherryPickApproved
and https://github.com/golang/go/issues?q=milestone%3AGo1.21.3+label%3ACherryPickApproved
2023-10-15 18:57:36 +02:00
Aliaksandr Valialkin
ec37347fff docs/CHANGELOG.md: remove duplicate line after 14471496fe 2023-10-02 22:17:04 +02:00
Aliaksandr Valialkin
a70753d7f9 deployment/docker: update Alpine from 3.18.3 to 3.18.4
See https://alpinelinux.org/posts/Alpine-3.18.4-released.html
2023-10-02 22:14:38 +02:00
Dmytro Kozlov
f1fcb8f6ae app/vmagent: fix check of the DataDog agent path requests when requests have trailing slashes (#5106)
* app/vmagent: fix check of the DataDog agent path requests when requests have trailing slashes

* app/vmagent: fix CHANGELOG.md description

* wip

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-10-02 21:22:27 +02:00
Aliaksandr Valialkin
14471496fe app/vmagent: follow-up for cfef814750
- Properly handle /insert/multitenant/api/put url for opentsdb handler at vmagent
- Document that the bug has been introduced in v1.93.2 at docs/CHANGELOG.md
- Add a link to multitenant url docs in bugfix description

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5061
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4910
2023-10-01 21:07:51 +02:00
Alexander Marshalov
f9ca2d8fb3 fixed ingestion via multitenant url for opentsdbhttp (#5061) (#5063) 2023-10-01 21:07:45 +02:00
hagen1778
e3e9b17f6e build(deps): revert version change for codecov/codecov-action from 4 to 3
https://github.com/codecov/codecov-action/issues/1089
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-09-19 10:36:11 +02:00
Aliaksandr Valialkin
7616152360 docs/CHANGELOG.md: clarify the description of bugfixes at f7dda12b4d and b6ad581b45
This is a follow-up for 8b01bc4a5c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4999
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5009
2023-09-19 00:49:05 +02:00
Nikolay
5b9f6604e3 docs: reflect recent changes at change logs (#5015) 2023-09-19 00:34:15 +02:00
dependabot[bot]
e1186257c7 build(deps): bump codecov/codecov-action from 3 to 4 (#5011)
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 3 to 4.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codecov/codecov-action/compare/v3...v4)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-19 00:31:10 +02:00
faceair
b38654e0d1 lib/storage: remove ForceMergeAllParts internal loop (#4999)
Signed-off-by: faceair <git@faceair.me>
2023-09-18 16:37:50 +02:00
Aliaksandr Valialkin
624b508c3a docs/CHANGELOG.md: cut v1.87.9 2023-09-10 20:39:54 +02:00
Aliaksandr Valialkin
1559d00abe docs/CHANGELOG.md: fix the release where the issue has been introduced 2023-09-09 06:14:40 +02:00
Roman Khavronenko
8a1b93a49d vmalert: correctly add duplicated params to the query (#4955)
Fix the bug when Group's `params` fields with multiple values were
overriding each other instead of adding up.
The bug was introduced in this commit eccecdf177
 starting from v1.91.1 https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.91.1

 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4908

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 6351d07da8)
2023-09-08 09:37:27 +02:00
Aliaksandr Valialkin
30dc75352a app/vmselect: return 503 status code when partial responses are denied and some of vmstorage nodes are temporarily unavailable
This should help detecting this case and automatic retrying the query at healthy cluster replica
in another availability zone.

This commit is needed as a preparation for automatic query retry at another backend at vmauth on 5xx errors
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4792#issuecomment-1674338561
2023-09-07 16:18:02 +02:00
Aliaksandr Valialkin
086843ec42 .github/workflows: remove nightly build, since it isnt used, while wastes compute resources (CPU and storage space) 2023-09-07 11:42:13 +02:00
Aliaksandr Valialkin
4f5e76bc4c all: update Go builder from Go1.21.0 to Go1.21.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.21.1+label%3ACherryPickApproved
2023-09-07 11:40:55 +02:00
Aliaksandr Valialkin
6c3f235c4c docs/CHANGELOG.md: clarify the scope of recent bugfixes 2023-09-07 11:28:42 +02:00
Aliaksandr Valialkin
04afae2aa8 deployment/docker: properly build armv5 production builds for GOARCH=arm
Pass GOARM=5 when building GOARCH=arm production builds, since the default value for this env var
has been changed to GOARM=6 since Go1.21.0.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4965
and https://github.com/golang/go/issues/62475
2023-09-07 11:28:36 +02:00
Aliaksandr Valialkin
5f8de2945b docs/CHANGELOG.md: document the bugfix at 7db72dd7e6
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4947
2023-09-06 12:13:25 +02:00
Nikolay
23f0cc4a9b lib/vmselectapi: do not send empty label names for labelNames request (#4936)
* lib/vmselectapi: do not send empty label names for labelNames request
it breaks cluster communication, since vmselect incorrectly reads request buffer, leaving unread data on it
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4932

* typo fix

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-09-01 23:29:37 +02:00
Aliaksandr Valialkin
d487ff4ec5 docs/CHANGELOG.md: move the bugfix description for 1f66d5624d to proper place 2023-09-01 18:06:23 +02:00
Nikolay
1f66d5624d app/vminsert: properly close vmstorage connection (#4935)
* app/vminsert: properly close vmstorage connection
previously vmstorage may stuck in broken state until vminsert restarts
since vmstorage was marked as read-only and connection was broken to it.
checkReadonly function never marked connection as broken
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4870

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-09-01 18:04:59 +02:00
Aliaksandr Valialkin
17febd84ac docs/CHANGELOG.md: cut v1.87.8 2023-09-01 12:07:26 +02:00
Aliaksandr Valialkin
03728301e9 vendor: return back the working github.com/VictoriaMetrics/metricsql release after 4a338d32ce 2023-09-01 11:28:54 +02:00
Aliaksandr Valialkin
e2f6e351b3 lib/proxy: consistently use gopkg.in/yaml.v2 across all the code 2023-09-01 11:26:19 +02:00
Aliaksandr Valialkin
4a338d32ce vendor: run make vendor-update 2023-09-01 11:26:09 +02:00
Aliaksandr Valialkin
5be882cba1 Makefile: cleanup bin/ directory at the beginning of make publish-release command
This is needed in order to prevent from non-build artifacts to be uploaded to Github release page
2023-09-01 11:24:18 +02:00
Nikolay
8b636350f2 app/vminsert: fixes readonly check (#4892)
* app/vminsert: fixes readonly check
previously vminsert doesn't check readOnly state for vmstorage, since check was never performed for nil buffer
In this case every 30 second storage node loss readonly state and received some data.
It caused re-routing and possible slow down for ingestion
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4870

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-08-30 16:28:08 +02:00
Aliaksandr Valialkin
9d2fbc99ff lib/auth: add NewTokenPossibleMultitenant() for parsing auth token, which can be multitenant
Disallow parsing multitenant token at auth.NewToken().

Use auth.NewTokenPossibleMultitenant() at vminsert only. All the other callers should call auth.NewToken(),
since they do not support multitenant token.

This is a follow-up for f0c06b428e

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4910
2023-08-30 15:25:45 +02:00
Zakhar Bessarab
df856aec6f app/vmselect: fix panic when using /select/multitenant endpoint (#4912)
app/vmselect: fix panic when using `/select/multitenant` endpoint

Such requests must be rejected as not found since vmselect does not support multitenant endpoint.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4910

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-08-30 15:25:36 +02:00
Nikolay
d3ebce2c2f deployment/docker: disable provenance in buildx (#4911)
* deployment/docker: disable provenance in buildx
it must fix an issue with multi-platform manifest generation
at buildx >= 0.10 backward compatibility was broken and generated image cannot be used with docker systems that doesn't support oci.
disabling attestat temporary fixes it.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4907
https://docs.docker.com/build/attestations/slsa-provenance/

* Update docs/CHANGELOG.md

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-08-30 15:22:52 +02:00
Dmytro Kozlov
a3df6e8a36 lib/protoparser: handle unexpected EOF error when parsing lines in prometheus exposition format (#4851)
Previously only io.EOF was handled, and io.ErrUnexpectedEOF was ignored, but it may happen if the client interrupts the connection.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4817
2023-08-28 09:40:18 +02:00
Aliaksandr Valialkin
f7f77307c6 lib/envflag: do not allow unsupported form for boolean command-line flags in the form -boolFlag value
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4845
2023-08-17 14:16:53 +02:00
Aliaksandr Valialkin
b2854d145d lib/promrelabel: properly replace : char with _ in metric names when -usePromCompatibleNaming command-line flag is set
This addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113#issuecomment-1275077071 comment from @johnseekins
2023-08-17 13:54:50 +02:00
Roman Khavronenko
08737f11e3 vmbackup: correctly check if specified -dst belongs to specified -storageDataPath (#4841)
See this issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4837

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-17 13:54:09 +02:00
Aliaksandr Valialkin
ab077ab41f deployment/docker/Makefile: do not overwrite latest tag when pushing Docker images for LTS release
The `latest` tag is reserved for the latest release
2023-08-12 14:38:54 -07:00
Aliaksandr Valialkin
2c5b260b42 docs/CHANGELOG.md: cut v1.87.7 2023-08-12 13:52:12 -07:00
Nikolay
9c0119a194 lib/promscrape: adds validation for proxy_url scheme (#4823)
* lib/promscrape: adds validation for proxy_url scheme
adds tests
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4811

* Update lib/proxy/proxy.go

* Update lib/proxy/proxy.go

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-08-12 13:50:27 -07:00
Aliaksandr Valialkin
9c3f87fac2 lib/promscrape: use local scrape timestamp for scraped metrics unless honor_timestamps: true is set explicitly
This fixes the case with gaps for metrics collected from cadvisor,
which exports invalid timestamps, which break staleness detection at VictoriaMetrics side.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697 ,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697#issuecomment-1654614799
and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697#issuecomment-1656540535

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1773
2023-08-12 13:47:51 -07:00
Roman Khavronenko
eb8ee5aab6 vmctl: interrupt explore procedure in influx mode if no numeric fields were found (#4576)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-12 13:41:43 -07:00
Dmytro Kozlov
e58ef6e27d app/vmctl: fix panic --remote-read-filter-time-start flag not defined (#4605)
* app/vmctl: fix panic `--remote-read-filter-time-start` flag not defined

* app/vmctl: update CHANGELOG.md

---------

Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-08-12 13:39:30 -07:00
Roman Khavronenko
01daa5ddb9 vmalert: check for negative offset for missed rounds (#4628)
It could happen for low evaluation intervals and irregular
delays during execution that evaluation time would get
a negative offset. This could result into cumulative
discrepancy between the actual time and evaluation time for rules.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-12 13:36:05 -07:00
Haleygo
08b106822d vmalert: fix evalTS after modify group interval (#4629) 2023-08-12 13:34:25 -07:00
Haleygo
a0752b20c7 vmselect: fix result in Prometheus query when time is small (#4578)
vmselect: fix result in Prometheus query when time is small

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-08-12 13:32:05 -07:00
Aliaksandr Valialkin
b351f73984 lib/promscrape/discovery: close unused HTTP connections to service discovery servers
This should prevent from connection leaks

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4724
2023-08-12 13:30:25 -07:00
Aliaksandr Valialkin
4c211028af docs/CHANGELOG.md: add a link to Pushgateway protocol in the bugfix description for 74237ce5c0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4692
2023-08-12 13:24:43 -07:00
Alexander Marshalov
5dba522eaa fixed label values decoding for pushgateway compatibility (#4727)
Fixed decoding of label values with slash for pushgateway and prometheus golang client compatibility + added some tests. (#4962)
2023-08-12 13:23:14 -07:00
Aliaksandr Valialkin
63a793b506 docs/CHANGELOG.md: clarify the description of the bugfix at ce7141383d 2023-08-12 13:21:09 -07:00
Zakhar Bessarab
0ee30dc00e app/vmagent/remotewrite: fix vmagent panic on shutdown (#4407)
app/vmagent/remotewrite: fix vmagent panic on shutdown

Currently, when vmagent is stopping it first flushes pending series in remote write context and proceeds to stop streaming aggregation. This leads to streaming aggregation being unable to write results into pending timeseries (since it is already nil) and panic.
This can lead to losing some aggregation results being lost almost silently.

The fix is reordering flow to first stop streaming aggregation and flush all pending time series after that.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-08-12 13:20:22 -07:00
Aliaksandr Valialkin
8234741c51 app/vmselect/netstorage: follow-up after 11ac551d52
- Clarify the scope of the fix at docs/CHANGELOG.md
- Handle the case when -search.maxSamplesPerSeries limit is exceeded
  in the same way as the -search.maxSamplesPerQuery limit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4472
2023-08-12 13:11:24 -07:00
Roman Khavronenko
26e5cf4160 app/vmselect/netstorage: properly process -search.maxSamplesPerQuery limit (#4472)
Properly return the error to user when `-search.maxSamplesPerQuery` limit is exceeded.
Before, user could have received a partial response instead.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-12 13:09:11 -07:00
Aliaksandr Valialkin
2711770b10 vendor: run make vendor-update 2023-08-12 12:43:23 -07:00
Aliaksandr Valialkin
041a6369e7 Upgrade Go builder from 1.20.4 to 1.21.0 2023-08-12 12:35:12 -07:00
Aliaksandr Valialkin
e7b3a338fe deployment/docker/Makefile: upgrade base Docker image from alpine:3.18.2 to alpine:3.18.3
See https://alpinelinux.org/posts/Alpine-3.15.10-3.16.7-3.17.5-3.18.3-released.html
2023-08-12 12:33:03 -07:00
Zakhar Bessarab
7487a8de1c lib/promrelabel: fix relabeling if clause (#4816)
* lib/promrelabel: fix relabeling if clause being applied to labels outside of current context

Relabeling is applied to each metric row separately, but in order to lower amount of memory allocations it is reusing labels.

Functions which are working on current metric row labels are supposed to use only current metric labels by using provided offset, but if clause matcher was using the whole labels set instead of local metrics.

This leaded to invalid relabeling results such as one described here: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4806

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs/CHANGELOG.md: document the bugfix

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4806

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-08-12 12:31:36 -07:00
Roman Khavronenko
7dc31ab080 docs: mention fix for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221 (#4382)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 4b5faf7efb)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-05 11:43:53 +02:00
Roman Khavronenko
74a92110bd vmalert: do not return nil rules for /api/v1/rules (#4344)
The fix addresses a case when vmalert is configured with a group
which has `name`, but doesn't have `rules` configured. In this
case it still returns a `nil` instead of `[]` slice.

Fixing this via current commit.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 66ed6fe62f)
2023-06-05 11:43:09 +02:00
Roman Khavronenko
6e02885431 vmalert: properly form assets address if httpPrefix set (#4351)
Properly form path to static assets in WEB UI
 if `http.pathPrefix` set.

 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4349

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 51cea6cad4)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-05 11:41:48 +02:00
hagen1778
e8448fe8fd docs/changelog: mention 20dc3db71e
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-05 11:40:49 +02:00
Roman Khavronenko
a8835b443f vmalert: fix nil map assignment (#4392)
* vmalert: fix nil map assignment

The storage instance with nil map params was created for remote-read purposes.
And before change 7a9ae9de0d this map was ignored in ApplyParams.
Now, it started to be used and vmalert panics in runtime.

The fix properly inits map for at `NewVMStorage` and verifies it is not nil
on assignment in `ApplyParams`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: add to changelog

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly clone Storage params

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly clone Storage params

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly clone Storage params

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit de94812088)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-05 11:38:41 +02:00
Roman Khavronenko
e082c3dd27 app/vmalert: follow-up after 7a9ae9de0d (#4381)
7a9ae9de0d

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit eccecdf177)
2023-06-05 11:37:43 +02:00
gsakun
bab0808a7f app/vmalert: fix datasource.roundDigits Parameter (#4341)
app/vmalert: fix querybuild clone and extraParams merge logic

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4340

(cherry picked from commit 20dc3db71e)
2023-06-05 11:36:18 +02:00
Nikolay
278c580055 app/vmauth: properly handle LOCAL proxy protocol command (#4373)
It is required for handling health checks from load balancers

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335

(cherry picked from commit f263031fe9)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-05 11:34:41 +02:00
Aliaksandr Valialkin
4c1241d336 vendor: run make vendor-update 2023-05-18 15:48:55 -07:00
Aliaksandr Valialkin
b3a3260b1d docs/CHANGELOG.md: cut v1.87.6 2023-05-18 15:42:40 -07:00
Aliaksandr Valialkin
2783b99457 deployment/docker/Makefile: use alpine 3.17.3 instead of alpine 3.18.0 for certs image, since alpine 3.18.0 doesnt work for cross-platform builds 2023-05-18 14:12:55 -07:00
Nikolay
a0ece56649 app/vmauth: do not return invalid credentials (#4288)
at http response by default
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4188

based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4190
Thanks @raj-kumar-j  for init implementation
2023-05-17 00:11:40 -07:00
Aliaksandr Valialkin
1ae8912f0f lib/backup/actions: add missing import after 25b527f1e4 2023-05-16 15:42:46 -07:00
Aliaksandr Valialkin
26b9f41e08 lib/mergeset: close and open the table before making snapshots at TestTableCreateSnapshotAt()
This gives guarantees that all the in-memory data is written to disk at the snapshot time.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4316
2023-05-16 15:42:13 -07:00
Aliaksandr Valialkin
a0d84278ce deployment/docker: update base docker image from 3.17.3 to 3.18.0
See https://www.alpinelinux.org/posts/Alpine-3.18.0-released.html
2023-05-12 17:33:30 -07:00
Aliaksandr Valialkin
90ee6f174c lib/promutils: properly return error when incorrect Prometheus label names are passed to NewLabelsFromString()
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4284
See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4304
2023-05-12 17:04:07 -07:00
Aliaksandr Valialkin
78cd76d96a lib/protoparser/csvimport: properly parse the last empty column in CSV line
Do not ignore the last empty column in CSV line.
While at it, properly parse CSV columns in single quotes, e.g. `'foo,bar',baz` is parsed as two columns - `foo,bar` and `baz`

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048

See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4298
2023-05-12 17:00:48 -07:00
Alexander Marshalov
25b527f1e4 fixed error with double slash in vmbackupmanager (#557)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-05-11 13:38:59 -07:00
Aliaksandr Valialkin
8febf3c5e6 docs/CHANGELOG.md: document 8f4de6fa47 2023-05-09 23:09:06 -07:00
Roman Khavronenko
470c82919e vminsert: properly reset labels object on aggregation (#4278)
Without reset, labels duplicates could have been added during stream aggregation.
Since `ctx.Labels` is reused during processing of many series, each series will
add its labels to the context. Even if the same labels were already addeded on prev
iteration. Now, we reset `ctx.Labels` on each iteration to contain so labels from
different series didn't interfere.

This could have cause exceeding of the limit on number of labels per pushed time series.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4277

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-09 23:05:53 -07:00
Aliaksandr Valialkin
f4f6bb359e all: update Go builder from Go1.20.3 to Go1.20.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.20.4+label%3ACherryPickApproved
2023-05-09 22:34:21 -07:00
Alexander Marshalov
591fb53b58 fixed vm_promscrape_config_last_reload_successful metric value recovery after successful reloading with unchanged content (#4260) (#4268)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-05-09 22:26:35 -07:00
Roman Khavronenko
fff051b889 alerts: update TooHighMemoryUsage threshold (#4256)
It appears that 90% usage for anonymous mem usage
is already concerning. So we lowering the threshold to 80%.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-09 21:43:35 -07:00
Nikolay
e5e10988ed lib/storage: properly update link for entry at dateMetricID cache (#4258)
previously during sync for mutable and immutable cache parts, link for hotEntry with current date may be not properly updated
it corrupts cache for backfilling metrics and increased cpu load
2023-05-09 21:40:11 -07:00
Zakhar Bessarab
4c3e0d4411 lib/promscrape/discovery/kubernetes: follow-up for d5e94721db (#4255)
- add changelog reference to an author
- fix tests
- add metadata to match Prometheus behavior

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-05-09 21:32:02 -07:00
Vasilchenko Anton
e6d24e68f7 Add endpoint labels for pod targets discovered form endpoint but has different ports (#4253)
Signed-off-by: Vasilchenko Anton <vasilchenko-as@yandex.ru>
2023-05-09 21:30:09 -07:00
Zakhar Bessarab
a50d63c376 lib/storage: fix indexdb rotation infinite loop (#4249)
When using `retentionTimezoneOffset` and having local timezone being more than 4 hours different from UTC indexdb retention calculation could return negative value. This caused indexdb rotation to get in loop.
Fix calculation of offset to use `retentionTimezoneOffset` value properly and add test to cover all legit timezone configs.
See:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4207
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4206

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-05-09 21:28:20 -07:00
Roman Khavronenko
89ec85c5db vmselect: exit early from queue on context cancel (#4223)
* vmselect: exit early from queue on context cancel

When `-search.maxConcurrentRequests` is reached, vmselect puts
request in the queue. It is expected, that requests in the queue
will be processed as soon as it would be enough capacity to do so.

However, it could happen that while request was waiting its turn,
the client could have already cancel it (close the connection,
or just close the tab with UI). In this case, we should de-queue
such requests to avoid spending extra resources on them.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmselect: address review comments

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 22:58:36 -07:00
Zakhar Bessarab
7cc7b4436f lib/promscrape/discovery/kubernetes: add common labels to all ports discovered from endpoints (#4235)
* lib/promscrape/discovery/kubernetes: add common labels to all ports discovered from endpoints

Sets
`__meta_kubernetes_endpoints_name` and `__meta_kubernetes_namespace` labels to all ports of pod.
Prometheus sets those labels to all ports in pod (0ab9553611/discovery/kubernetes/endpoints.go (L267C15-L269)) even if port is not matching any service.

See: #4154

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape/discovery/kubernetes: fix test for updated discovery logic

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-05-08 22:16:36 -07:00
Roman Khavronenko
d09ba9f504 vmalert: fix API to return non-nil values (#4222)
Properly return empty slices instead of nil for `/api/v1/rules` and `/api/v1/alerts` API handlers.
This improves compatibility with Grafana.

 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-08 21:49:13 -07:00
Aliaksandr Valialkin
ab1d226485 docs/CHANGELOG.md: document 03150c8973
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4204
2023-05-08 16:32:07 -07:00
Haleygo
c7a407368a lib/opentsdbhttp: fix a typo preventing from using writeconcurrencylimiter (#4208) 2023-05-08 16:31:05 -07:00
Zakhar Bessarab
56350eb1ce lib/httpserver: add handler to serve /robots.txt and deny search indexing (#4143)
This handler will instruct search engines that indexing is not allowed for the content exposed to the internet. This should help to address issues like #4128 when instances are exposed to the internet without authentication.
2023-05-08 09:52:04 -07:00
Aliaksandr Valialkin
70d9a2e679 docs/CHANGELOG.md: document c77385e78f 2023-05-08 08:56:32 -07:00
Aliaksandr Valialkin
d324241707 vendor: update github.com/VictoriaMetrics/metricsql from v0.56.1 to v0.56.2
This fixes panic when the duration in the query contains `M` suffix.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4120
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3589
2023-04-14 14:09:21 -07:00
Aliaksandr Valialkin
112c9a8118 app/vmstorage: deprecate -bigMergeConcurrency command-line flag
Improperly configured -bigMergeConcurrency command-line flag usually leads to uncontrolled
growth of unmerged parts, which, in turn, increases CPU usage and query durations.

So it is better deprecating this flag. In rare cases -smallMergeConcurrency command-line flag
can be used instead for controlling the concurrency of background merges.
2023-04-13 23:40:40 -07:00
Aliaksandr Valialkin
03ad293ba2 docs/CHANGELOG.md: run at least 4 background mergers on systems with less than 4 CPU cores
This reduces the probability of sudden spike in the number of small parts when all the background mergers
are busy with big merges.
2023-04-13 23:38:40 -07:00
Haleygo
74c7c49335 fix sort pendingDateMetricsIDs (#4102) 2023-04-10 12:23:25 -07:00
Aliaksandr Valialkin
68146a7a89 lib/streamaggr: remove accidentally left logger.Errorf() call after d655d6b047 2023-04-06 14:57:39 -07:00
Aliaksandr Valialkin
141f793051 docs/CHANGELOG.md: cut v1.87.5 2023-04-05 23:13:48 -07:00
Aliaksandr Valialkin
17386dd7c0 vendor: make vendor-update 2023-04-05 22:46:25 -07:00
Aliaksandr Valialkin
5d8b7d1c2a lib/encoding: fix test after 4725549cb2 2023-04-05 21:39:23 -07:00
Aliaksandr Valialkin
8600cb21ed vendor: update github.com/klauspost/compress from v1.16.3 to v1.16.4 2023-04-05 21:31:12 -07:00
Aliaksandr Valialkin
befef7f2b8 vendor: update github.com/valyala/gozstd from v1.18.0 to v1.19.0 2023-04-05 20:57:04 -07:00
Aliaksandr Valialkin
37bc95fe2d all: update Go builder from Go1.20.2 to Go1.20.3
See https://github.com/golang/go/issues?q=milestone%3AGo1.20.3+label%3ACherryPickApproved
2023-04-05 13:41:26 -07:00
Aliaksandr Valialkin
2191328b2b lib/promscrape: do not re-use previously loaded scrape targets on failed attempt to load updated scrape targets at file_sd_configs
The logic employed for re-using the previously loaded scrape target was broken initially.
The commit cc0427897c tried to fix it, but the new logic
became too complex and fragile. So it is better to just remove this logic,
since the targets from temporarily broken file should be eventually loaded on next
attempts every -promscrape.fileSDCheckInterval

This also allows removing fragile hacks around __vm_filepath label.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3989
2023-04-02 21:12:50 -07:00
Dmytro Kozlov
886087c4de lib/promscrape: fix the problem with scrape work duplicates when file_sd_config can't be read (#4027)
* lib/promscrape: fix the problem with scrape work duplicates when file_sd_config can't be read

* lib/promscrape: clarified comment

* lib/promscrape: made better approach to handle a problem with growing []*ScrapeWork on each error when loading config

* lib/promscrape: added CHANGELOG.md

* Update docs/CHANGELOG.md

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-04-02 21:11:59 -07:00
Roman Khavronenko
973ebff145 lib/storage: check for free disk space before opening tables (#4035)
* lib/storage: check for free disk space before opening tables

We check for free disk space before call to `openTable`,
so `Storage` can be set to ReadOnly before mergeWorkers start.

Before the change, there was a chance that merges will start
even if Storage has to start in ReadOnly mode because of
`-storage.minFreeDiskSpaceBytes` limit.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4023
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/storage: chore

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update lib/storage/storage.go

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-04-01 00:31:09 -07:00
Aliaksandr Valialkin
17ff88edb5 deployment/docker: update base Docker image from Alpine 3.17.2 to Alpine 3.17.3
This fixes security issues from https://alpinelinux.org/posts/Alpine-3.17.3-released.html

This is a follow-up for 59c350d0d2
2023-03-31 22:49:28 -07:00
Max Golionko
5058a92bb1 fix: app/vmui/Dockerfile-web to reduce vulnerabilities (#4044)
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE317-OPENSSL-3368755
- https://snyk.io/vuln/SNYK-ALPINE317-OPENSSL-3368755
- https://snyk.io/vuln/SNYK-ALPINE317-OPENSSL-5291795
- https://snyk.io/vuln/SNYK-ALPINE317-OPENSSL-5291795

Co-authored-by: snyk-bot <snyk-bot@snyk.io>
2023-03-31 22:48:26 -07:00
Aliaksandr Valialkin
a31b871845 lib/fs: follow-up for ec45f1bc5f
Properly close response body before checking for the response code.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4034
2023-03-31 22:41:04 -07:00
Zakhar Bessarab
baa9f0e573 lib/fs: verify response code when reading configuration over HTTP (#4036)
Verifying status code helps to avoid misleading errors caused by attempt to parse unsuccessful response.

Related issue: #4034

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-03-31 22:34:07 -07:00
Aliaksandr Valialkin
9e3818ca27 lib/flagutil: ArrayString: support commas inside quoted strings and inside [], {} and () braces
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3915
2023-03-28 21:25:52 -07:00
Aliaksandr Valialkin
fd1c69e4b7 app/vmselect/promql: follow-up for 79e1c6a6fc
- Document the fix at docs/CHANGELOG.md
- Add tests with multiple adjancent zero buckets
- Simplify the fix a bit

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/296
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4021
2023-03-27 18:05:53 -07:00
Ze'ev Klapow
b3ee33eb8e fix le buckets when adjacent vmrange is empty (#4021)
There is a bug here where if you have a single bucket like:

foo{vmrange="4.084e+02...4.642e+02"} 2 123

The expected output is three le encoded buckets like:

foo{le="4.084e+02"} 0 123
foo{le="4.642e+02"} 2 123
foo{le="+Inf"} 2 123

This correctly encodes the start and end of the vmrange.
If however, the input contains the previous bucket, and that bucket is
empty then you only get the end le and +Inf out currently, i.e:

foo{vmrange="7.743e+05...8.799e+05"} 5 123
foo{vmrange="6.813e+05...7.743e+05"} 0 123

results in:

foo{le="8.799e+05"} 5 123
foo{le="+Inf"} 5 123

This causes issues when you go to compute a quantile because this means
that the assumed lower bound of the buckets is 0 and this we interpolate
between 0->end rather than the vmrange start->end as expected.
2023-03-27 18:05:03 -07:00
Aliaksandr Valialkin
87769b36d1 docs/CHANGELOG.md: cut v1.87.4 release 2023-03-25 17:02:16 -07:00
Aliaksandr Valialkin
81704549c4 app/vmselect/netstorage: reduce the contention at fs.ReaderAt stats collection on systems with big number of CPU cores
This optimization is based on the profile provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966#issuecomment-1483208419
2023-03-25 16:44:16 -07:00
Aliaksandr Valialkin
dfb61ad46c app/vmselect/netstorage: document why runtime.Gosched() is removed at 28f054bb00
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-25 16:44:16 -07:00
Zakhar Bessarab
0607800f05 vmselect/netstorage: remove direct calls to Gosched to reduce amount of locks for global scope
using `runtime.Gosched` requires acquiring global lock to check if there are any other goroutines to perform tasks. with the latest versions of runtime it can pause running goroutines automatically without requiring to call `Gosched` directly.

Updates #3966

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-03-25 16:44:16 -07:00
Aliaksandr Valialkin
575032bb68 app/vmselect/netstorage: reduce the number of calls to runtime.Gosched() at timeseriesWorker() and unpackWorker()
Call runtime.Gosched() only when there is a work to steal from other workers.
Simplify the timeseriesWorker() and unpackWroker() code a bit by inlining stealTimeseriesWork() and stealUnpackWork().

This should reduce CPU usage when processing queries on systems with big number of CPU cores.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-25 16:43:42 -07:00
Aliaksandr Valialkin
744517829d app/vmselect/promql: typo fix after e7f46a0aab 2023-03-25 01:25:43 -07:00
Aliaksandr Valialkin
a7079022ff app/vmselect/promql: follow-up for 7205c79c5a
- Allocate and initialize seriesByWorkerID slice in a single go instead
  of initializing every item in the list separately.
  This should reduce CPU usage a bit.
- Properly set anti-false sharing padding at timeseriesWithPadding structure
- Document the change at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-25 01:24:48 -07:00
Zakhar Bessarab
36da3faf73 app/vmselect/promql: use lock-less approach to gather results of parallel processing for evalRollup* funcs (#4004)
* vmselect/promql: refactor `evalRollupNoIncrementalAggregate` to use lock-less approach for parallel workers computation

Locking there is causing issues when running on highly multi-core system as it introduces lock contention during results merge.

New implementation uses lock less approach to store results per workerID and merges final result in the end, this is expected to significantly reduce lock contention and CPU usage for systems with high number of cores.

Related: #3966
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* vmselect/promql: add pooling for `timeseriesWithPadding` to reduce allocations

Related: #3966
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* vmselect/promql: refactor `evalRollupFuncWithSubquery` to avoid using locks

Uses same approach as `evalRollupNoIncrementalAggregate` to remove locking between workers and reduce lock contention.

Related: #3966
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-03-25 01:23:46 -07:00
Aliaksandr Valialkin
bc9bd614ee app/vmselect/promql: pass workerID to the callback inside doParallel()
This opens the possibility to remove tssLock from evalRollupFuncWithSubquery()
in the follow-up commit from @zekker6 in order to speed up the code
for systems with many CPU cores.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-25 01:22:06 -07:00
Aliaksandr Valialkin
75791bcb77 app/vmselect/promql: fix TestIncrementalAggr test on systems less than 3 CPU cores
This is a follow-up for 4856a4cf5a
2023-03-25 01:21:38 -07:00
Aliaksandr Valialkin
83a8f87131 app/vmselect: optimize incremental aggregates a bit
Substitute sync.Map with an ordinary slice indexed by workerID.
This should reduce the overhead when updating the incremental aggregate state
2023-03-24 23:49:26 -07:00
oliverpool
48ee15ac42 app/vmselect/promql: add test to ensure 8-byte alignment (#3948)
See 0af9e2b693
2023-03-24 23:48:55 -07:00
Aliaksandr Valialkin
95f5d4780d vendor: run make vendor-update 2023-03-24 22:28:55 -07:00
Alexander Marshalov
568b5a7711 allowed using dashes and dots in environment variables names (#4009)
* allowed using dashes and dots in environment variables names for templating config files with envtemplate (#3999)

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* Apply suggestions from code review

---------

Signed-off-by: Alexander Marshalov <_@marshalov.org>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-24 22:22:09 -07:00
Aliaksandr Valialkin
336c5947c8 app/vmbackup: simplify code a bit after 5ba347bd2c
Unconditionally call deleteSnapshot() func just after making the snapshot, either successful or unsuccessful

Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2055
2023-03-24 22:19:55 -07:00
Zakhar Bessarab
68b49a900c app/vmbackup: delete created snapshot in case of error during backup (#4008)
Related issue: #2055

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-24 22:16:59 -07:00
Nikolay
7da72b040b lib/netutil: log only parsing errors for proxy-protocol (#3985)
* lib/netutil: log only parsing errors for proxy-protocol

Previosly every error was logged. With configured TCP health checks at load-balancer or kubernetes, vmauth spams a lot of false positive error message into logs

* Update docs/CHANGELOG.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Update lib/netutil/tcplistener.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-03-24 22:14:18 -07:00
Aliaksandr Valialkin
d6b6cb56e5 lib/{mergeset,storage}: prevent from long wait time when creating a snapshot under high data ingestion rate
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3873
2023-03-24 22:11:58 -07:00
Dmytro Kozlov
a20c4804a0 lib/storage: fix collect downsampling metrics (#489)
* lib/storage: fix downsampling

* lib/storage: update logic

* lib/storage: fix comments, removed unneeded check
2023-03-19 23:34:10 -07:00
Aliaksandr Valialkin
16be82b959 docs/CHANGELOG.md: cut v1.87.3 2023-03-12 23:24:16 -07:00
Aliaksandr Valialkin
d390277509 app/vmselect/promql: prevent from cannot unmarshal timeseries from rollupResultCache panic after the upgrade to v1.89.0
The issue has been introduced in 0af9e2b693
2023-03-12 19:10:21 -07:00
Aliaksandr Valialkin
60ccaf670a Makefile: update golangci-lint from v1.51.1 to v1.51.2
See https://github.com/golangci/golangci-lint/releases/tag/v1.51.2
2023-03-12 17:11:47 -07:00
Aliaksandr Valialkin
88616346f1 app/vmselect: remove data race on updating EvalConfig.IsPartialResponse from concurrently running goroutines
This properly returns `is_partial: true` for partial responses.
2023-03-12 16:57:06 -07:00
Aliaksandr Valialkin
9ef38946fd app/vmselect/promql: prevent from SIGBUS crash on architecures, which deny unaligned access to 8-byte words (e.g. ARM)
Thanks to @oliverpool for nailing down the root cause of the issue and for the initial attempt to fix it
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3927
2023-03-12 16:55:46 -07:00
Aliaksandr Valialkin
9480d609d1 docs/CHANGELOG.md: document 113a89904d 2023-03-12 01:59:24 -08:00
Aliaksandr Valialkin
481c928011 vendor: make vendor-update 2023-03-12 01:49:12 -08:00
Roman Khavronenko
d42650d3a9 security: bump go version to 1.20.2 (#3935)
upgrade Go builder from Go1.20.1 to Go1.20.2
See the list of issues addressed in Go1.20.2 here (https://github.com/golang/go/issues?q=milestone%3AGo1.20.2+label%3ACherryPickApproved).

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-12 01:38:25 -08:00
Aliaksandr Valialkin
4e71914e3c app/vmselect/netstorage: do not intern string representation of MetricName for time series received from vmstorage
It has been appeared that this interning may lead to increased memory usage and increased CPU usage
when vmselect performs queries, which select big number of time series.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3863
2023-03-12 01:35:23 -08:00
Aliaksandr Valialkin
0d836a51d7 docs/CHANGELOG.md: document 927d9da270 2023-03-12 01:33:40 -08:00
Nikolay
d28ee6192d lib/storage: correctly handle io.EOF error for pre-fetched metrics (#3946)
io.EOF shouldn't be returned from this function. It breaks all search
API logic and may result in empty query results.
2023-03-12 01:28:45 -08:00
Aliaksandr Valialkin
e6fa18bfd2 docs/CHANGELOG.md: clarify the description for 6bfe9cc733
- Add the panic message to the description, so it is easier to google
- Add a link to the corresponding bugreport

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3897
2023-03-12 01:26:20 -08:00
Nikolay
439f53fd3e lib{mergset,storage}: prevent possible race condition with logging st… (#3900)
lib{mergset,storage}: prevent possible race condition with logging stats for merges

Previously partwrapper could be release by background process and reference for part may be invalid
during logging stats. It will lead to panic at vmstorage
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3897
2023-03-12 01:23:43 -08:00
Aliaksandr Valialkin
fe4dae150d all: follow-up for 7a3e16e774
- Sync the description for -httpListenAddr.useProxyProtocol command-line flag at vmagent and vmauth,
  so it is consistent with the description at vmauth and victoria-metrics
- Add a sample of panic text to docs/CHANGELOG.md, so it could be googled
- Mention the -httpListenAddr.useProxyProtocol command-line flag in the description for the bugfix

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
2023-03-12 01:16:58 -08:00
Nikolay
4c33716a60 lib/netutil: fixes panic at proxy protocol (#3905)
it may occur if non proxy protocol message received by tcp server.
Listener Accept method must return only non-recoverable errors.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
2023-03-12 01:09:56 -08:00
Zakhar Bessarab
47204d2f77 lib/promscrape: correctly register vm_promscrape_config_* metrics (#3876)
* lib/promscrape: set `vm_promscrape_config_last_reload_successful` to 1 if there was no promscrape config provided

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape: register `vm_promscrape_config_*` metrics only in case promscrape config is used

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-27 12:06:13 -08:00
Dmytro Kozlov
90319e69f9 app/vmctl: skip series if measurement not found (#3869)
app/vmctl: skip measurements with no fields for influxdb mode
2023-02-27 12:04:39 -08:00
Aliaksandr Valialkin
cbdaafe541 docs/CHANGELOG.md: mention that v1.87.2 is an LTS release 2023-02-24 16:12:36 -08:00
Aliaksandr Valialkin
d0ce874b13 docs/CHANGELOG.md: document v1.79.9 release 2023-02-24 15:11:27 -08:00
Aliaksandr Valialkin
619da7bfa5 docs/CHANGELOG.md: cut v1.87.2 release 2023-02-24 15:09:30 -08:00
Aliaksandr Valialkin
dd6e7089a8 docs/CHANGELOG.md: document d8eaa511b0 2023-02-24 12:44:24 -08:00
Zakhar Bessarab
1dbf7a204c lib/{fs,mergeset,storage}: skip .must-remove. dirs when creating snapshot (#3858) (#3867) 2023-02-24 12:44:22 -08:00
Aliaksandr Valialkin
4c82081f57 docs/CHANGELOG.md: typo fix: scrape scrape -> scrape 2023-02-24 12:34:01 -08:00
Aliaksandr Valialkin
69e7621ad5 lib/promscrape: follow-up for 43e104a83f
- Return immediately on context cancel during the backoff sleep.
  This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3747

- Add a comment describing why the second attempt to obtain the response from remote side
  is perfromed immediately after the first attempt.

- Remove fasthttp dependency from lib/promscrape/discoveryutils

- Set context deadline before calling doRequestWithPossibleRetry().
  This simplifies the doRequestWithPossibleRetry() a bit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3293
2023-02-24 12:26:13 -08:00
Zakhar Bessarab
00bc28626d fix: do not use exponential backoff for first retry of scrape request (#3824)
* fix: do not use exponential backoff for first retry of scrape request (#3293)

* lib/promscrape: refactor `doRequestWithPossibleRetry` backoff to simplify logic

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update lib/promscrape/client.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* lib/promscrape: refactor `doRequestWithPossibleRetry` to make it more straightforward

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-02-24 12:26:06 -08:00
Alexander Marshalov
b443b7e2ca fix interpolate function for filling only intermediate gaps (#3816) (#3857)
* fix interpolate function for filling only intermediate gaps (#3816)

* Update docs/CHANGELOG.md

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-23 19:29:54 -08:00
Aliaksandr Valialkin
e0a16874f6 docs/CHANGELOG.md: document 6d019a3c37
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3830
2023-02-23 19:28:56 -08:00
Mattias Ängehov
62f40cb33b Azure Service Discovery - Fix token fetch for Container Apps/App Services (#3832)
* Modify API version when running in Container App

* Handle expires on from token response

Response from IMDS does not always contain expires in value which is
currently used to get the token expiry time. An example resources that
doesn't provide it are Container Apps and App Service.

Signed-off-by: Mattias Ängehov <mattias.angehov@castoredc.com>

* Fix client id parameter for user assigned identity

* Apply suggestions from code review

---------

Signed-off-by: Mattias Ängehov <mattias.angehov@castoredc.com>
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2023-02-23 19:28:49 -08:00
Aliaksandr Valialkin
905f1839ef docs/CHANGELOG.md: document d2b92d3264
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3747
2023-02-22 18:29:20 -08:00
Zakhar Bessarab
05ca0c16c7 lib/promscrape: fix cancelling in-flight scrape requests during configuration reload (#3853)
* lib/promscrape: fix cancelling in-flight scrape requests during configuration reload (see #3747)

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape: fix order of params for `doRequestWithPossibleRetry` to follow codestyle

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape: accept deadline explicitly and extend passed context for local use

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-02-22 18:28:19 -08:00
Aliaksandr Valialkin
0476f2a7ca vendor: update github.com/VictoriaMetrics/fasthttp from v1.1.0 to v1.2.0
The v1.2.0 adds HostClient.DoCtx() function, which is needed by https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3747
for implementing fast canceling of pending requests to scrape targets on config update
2023-02-22 18:27:37 -08:00
Aliaksandr Valialkin
d26b9d89e0 vendor: make vendor-update 2023-02-18 23:16:22 -08:00
Aliaksandr Valialkin
29f0a33500 all: update Go builder from Go1.20.0 to Go1.20.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.20.1+label%3ACherryPickApproved
2023-02-14 23:13:01 -08:00
Aliaksandr Valialkin
af68892e68 docs/CHANGELOG.md: improve the docs for 8ea02eaa8e 2023-02-14 22:45:31 -08:00
Droxenator
0b91514a8f fixed opentsdbListenAddr timestamp conversion (#3810)
Co-authored-by: Andrei Ivanov <a.ivanov@corp.mail.ru>
2023-02-14 22:44:37 -08:00
Aliaksandr Valialkin
a96f0df64a lib/{mergeset,storage}: allow at least 3 concurrent flushes during background merges on systems with 1 or 2 CPU cores
This should prevent from data ingestion slowdown and query performance degradation
on systems with small number of CPU cores (1 or 2), when big merge is performed.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2023-02-11 12:17:06 -08:00
Aliaksandr Valialkin
af83ce33f0 all: update alpine base docker image from 1.17.1 to 1.17.2
See https://alpinelinux.org/posts/Alpine-3.17.2-released.html
2023-02-11 12:16:04 -08:00
2752 changed files with 143050 additions and 159827 deletions

View File

@@ -1,32 +0,0 @@
name: Question
description: Ask a question regarding VictoriaMetrics or its components
labels: [question]
body:
- type: textarea
id: describe-the-component
attributes:
label: Is your question request related to a specific component?
placeholder: |
VictoriaMetrics, vmagent, vmalert, vmui, etc...
validations:
required: false
- type: textarea
id: describe-the-question
attributes:
label: Describe the question in detail
description: |
A clear and concise description of the issue and the question.
validations:
required: true
- type: checkboxes
id: troubleshooting
attributes:
label: Troubleshooting docs
description: I am familiar with the following troubleshooting docs
options:
- label: General - https://docs.victoriametrics.com/Troubleshooting.html
required: false
- label: vmagent - https://docs.victoriametrics.com/vmagent.html#troubleshooting
required: false
- label: vmalert - https://docs.victoriametrics.com/vmalert.html#troubleshooting
required: false

View File

@@ -8,12 +8,10 @@ updates:
directory: "/"
schedule:
interval: "weekly"
open-pull-requests-limit: 0
- package-ecosystem: "bundler"
directory: "/docs"
schedule:
interval: "weekly"
open-pull-requests-limit: 0
interval: "daily"
- package-ecosystem: "gomod"
directory: "/app/vmui/packages/vmui/web"
schedule:

View File

@@ -17,7 +17,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@main
with:
go-version: 1.20.7
go-version: stable
id: go
- name: Code checkout
uses: actions/checkout@master

View File

@@ -55,9 +55,9 @@ jobs:
uses: actions/checkout@v3
- name: Set up Go
uses: actions/setup-go@v4
uses: actions/setup-go@v3
with:
go-version: 1.20.7
go-version: stable
check-latest: true
cache: true
if: ${{ matrix.language == 'go' }}

View File

@@ -30,9 +30,9 @@ jobs:
uses: actions/checkout@v3
- name: Setup Go
uses: actions/setup-go@v4
uses: actions/setup-go@v3
with:
go-version: 1.20.7
go-version: stable
check-latest: true
cache: true
@@ -54,9 +54,9 @@ jobs:
uses: actions/checkout@v3
- name: Setup Go
uses: actions/setup-go@v4
uses: actions/setup-go@v3
with:
go-version: 1.20.7
go-version: stable
check-latest: true
cache: true
@@ -79,9 +79,9 @@ jobs:
- name: Setup Go
id: go
uses: actions/setup-go@v4
uses: actions/setup-go@v3
with:
go-version: 1.20.7
go-version: stable
check-latest: true
cache: true

View File

@@ -1,53 +0,0 @@
name: publish-docs
on:
push:
branches:
- 'master'
paths:
- 'docs/**'
workflow_dispatch: {}
permissions:
contents: read # This is required for actions/checkout and to commit back image update
deployments: write
jobs:
build:
name: Build
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v3
with:
path: main
- name: Checkout private code
uses: actions/checkout@v3
with:
repository: VictoriaMetrics/vmdocs
token: ${{ secrets.VM_BOT_GH_TOKEN }}
path: docs
- name: Import GPG key
uses: crazy-max/ghaction-import-gpg@v5
with:
gpg_private_key: ${{ secrets.VM_BOT_GPG_PRIVATE_KEY }}
passphrase: ${{ secrets.VM_BOT_PASSPHRASE }}
git_user_signingkey: true
git_commit_gpgsign: true
workdir: docs
- name: Set short git commit SHA
id: vars
run: |
calculatedSha=$(git rev-parse --short ${{ github.sha }})
echo "short_sha=$calculatedSha" >> $GITHUB_OUTPUT
working-directory: main
- name: update code and commit
run: |
rm -rf content
cp -r ../main/docs content
make clean-after-copy
git config --global user.name "${{ steps.import-gpg.outputs.email }}"
git config --global user.email "${{ steps.import-gpg.outputs.email }}"
git add .
git commit -S -m "sync docs with VictoriaMetrics/VictoriaMetrics commit: ${{ steps.vars.outputs.short_sha }}"
git push
working-directory: docs

View File

@@ -1,80 +0,0 @@
name: sandbox-release
on:
release:
types: [published]
permissions:
contents: write
jobs:
deploy-sandbox:
runs-on: ubuntu-latest
steps:
- name: check inputs
if: github.event.release.tag_name == ''
run: exit 1
- name: Check out code
uses: actions/checkout@v3
with:
repository: VictoriaMetrics/ops
token: ${{ secrets.VM_BOT_GH_TOKEN }}
- name: Import GPG key
id: import-gpg
uses: crazy-max/ghaction-import-gpg@v5
with:
gpg_private_key: ${{ secrets.VM_BOT_GPG_PRIVATE_KEY }}
passphrase: ${{ secrets.VM_BOT_PASSPHRASE }}
git_user_signingkey: true
git_commit_gpgsign: true
- name: update image tag
uses: fjogeleit/yaml-update-action@main
with:
valueFile: 'gcp-test/sandbox/manifests/benchmark-vm/vmcluster.yaml'
commitChange: false
createPR: false
changes: |
{
"gcp-test/sandbox/manifests/benchmark-vm/vmcluster.yaml": {
"spec.vminsert.image.tag": "${{ github.event.release.tag_name }}-enterprise-cluster",
"spec.vmselect.image.tag": "${{ github.event.release.tag_name }}-enterprise-cluster",
"spec.vmstorage.image.tag": "${{ github.event.release.tag_name }}-enterprise-cluster"
},
"gcp-test/sandbox/manifests/benchmark-vm/vmsingle.yaml": {
"spec.image.tag": "${{ github.event.release.tag_name }}-enterprise"
},
"gcp-test/sandbox/manifests/monitoring/monitoring-vmagent.yaml": {
"spec.image.tag": "${{ github.event.release.tag_name }}"
},
"gcp-test/sandbox/manifests/monitoring/monitoring-vmcluster.yaml": {
"spec.vminsert.image.tag": "${{ github.event.release.tag_name }}-enterprise-cluster",
"spec.vmselect.image.tag": "${{ github.event.release.tag_name }}-enterprise-cluster",
"spec.vmstorage.image.tag": "${{ github.event.release.tag_name }}-enterprise-cluster"
},
"gcp-test/sandbox/manifests/monitoring/vmalert.yaml": {
"spec.image.tag": "${{ github.event.release.tag_name }}-enterprise"
}
}
- name: commit changes
run: |
git config --global user.name "${{ steps.import-gpg.outputs.email }}"
git config --global user.email "${{ steps.import-gpg.outputs.email }}"
git add .
git commit -S -m "Deploy image tag ${RELEASE_TAG} to sandbox"
env:
RELEASE_TAG: ${{ github.event.release.tag_name }}
- name: Create Pull Request
uses: peter-evans/create-pull-request@v5
with:
author: ${{ github.actor }} <${{ github.actor }}@users.noreply.github.com>
branch: release-automation
token: ${{ secrets.VM_BOT_GH_TOKEN }}
delete-branch: true
title: "release ${{ github.event.release.tag_name }}"
body: |
Release [${{ github.event.release.tag_name }}](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/${{ github.event.release.tag_name }}) to sandbox
> Auto-generated by `Github Actions Bot`

3
.gitignore vendored
View File

@@ -8,7 +8,6 @@
*.test
*.swp
/gocache-for-docker
/victoria-logs-data
/victoria-metrics-data
/vmagent-remotewrite-data
/vmstorage-data
@@ -21,4 +20,4 @@
Gemfile.lock
/_site
_site
*.tmp
*.tmp

View File

@@ -1,18 +1,14 @@
run:
timeout: 2m
linters:
enable:
enable:
- revive
issues:
exclude-rules:
- linters:
- staticcheck
text: "SA(4003|1019|5011):"
include:
- EXC0012
- EXC0014
- linters:
- staticcheck
text: "SA(4003|1019|5011):"
linters-settings:
errcheck:

138
Makefile
View File

@@ -21,7 +21,6 @@ include package/release/Makefile
all: \
victoria-metrics-prod \
victoria-logs-prod \
vmagent-prod \
vmalert-prod \
vmauth-prod \
@@ -34,7 +33,6 @@ clean:
publish: package-base \
publish-victoria-metrics \
publish-victoria-logs \
publish-vmagent \
publish-vmalert \
publish-vmauth \
@@ -44,7 +42,6 @@ publish: package-base \
package: \
package-victoria-metrics \
package-victoria-logs \
package-vmagent \
package-vmalert \
package-vmauth \
@@ -144,8 +141,6 @@ vmutils-windows-amd64: \
vmagent-windows-amd64 \
vmalert-windows-amd64 \
vmauth-windows-amd64 \
vmbackup-windows-amd64 \
vmrestore-windows-amd64 \
vmctl-windows-amd64
victoria-metrics-crossbuild: \
@@ -153,7 +148,6 @@ victoria-metrics-crossbuild: \
victoria-metrics-linux-amd64 \
victoria-metrics-linux-arm64 \
victoria-metrics-linux-arm \
victoria-metrics-linux-386 \
victoria-metrics-linux-ppc64le \
victoria-metrics-darwin-amd64 \
victoria-metrics-darwin-arm64 \
@@ -165,7 +159,6 @@ vmutils-crossbuild: \
vmutils-linux-amd64 \
vmutils-linux-arm64 \
vmutils-linux-arm \
vmutils-linux-386 \
vmutils-linux-ppc64le \
vmutils-darwin-amd64 \
vmutils-darwin-arm64 \
@@ -174,6 +167,7 @@ vmutils-crossbuild: \
vmutils-windows-amd64
publish-release:
rm -rf bin/*
git checkout $(TAG) && LATEST_TAG=stable $(MAKE) release publish && \
git checkout $(TAG)-cluster && LATEST_TAG=cluster-stable $(MAKE) release publish && \
git checkout $(TAG)-enterprise && LATEST_TAG=enterprise-stable $(MAKE) release publish && \
@@ -181,7 +175,6 @@ publish-release:
release: \
release-victoria-metrics \
release-victoria-logs \
release-vmutils
release-victoria-metrics: \
@@ -192,9 +185,9 @@ release-victoria-metrics: \
release-victoria-metrics-darwin-amd64 \
release-victoria-metrics-darwin-arm64 \
release-victoria-metrics-freebsd-amd64 \
release-victoria-metrics-openbsd-amd64 \
release-victoria-metrics-windows-amd64
release-victoria-metrics-openbsd-amd64
# adds i386 arch
release-victoria-metrics-linux-386:
GOOS=linux GOARCH=386 $(MAKE) release-victoria-metrics-goos-goarch
@@ -219,9 +212,6 @@ release-victoria-metrics-freebsd-amd64:
release-victoria-metrics-openbsd-amd64:
GOOS=openbsd GOARCH=amd64 $(MAKE) release-victoria-metrics-goos-goarch
release-victoria-metrics-windows-amd64:
GOARCH=amd64 $(MAKE) release-victoria-metrics-windows-goarch
release-victoria-metrics-goos-goarch: victoria-metrics-$(GOOS)-$(GOARCH)-prod
cd bin && \
tar --transform="flags=r;s|-$(GOOS)-$(GOARCH)||" -czf victoria-metrics-$(GOOS)-$(GOARCH)-$(PKG_TAG).tar.gz \
@@ -231,73 +221,6 @@ release-victoria-metrics-goos-goarch: victoria-metrics-$(GOOS)-$(GOARCH)-prod
| sed s/-$(GOOS)-$(GOARCH)-prod/-prod/ > victoria-metrics-$(GOOS)-$(GOARCH)-$(PKG_TAG)_checksums.txt
cd bin && rm -rf victoria-metrics-$(GOOS)-$(GOARCH)-prod
release-victoria-metrics-windows-goarch: victoria-metrics-windows-$(GOARCH)-prod
cd bin && \
zip victoria-metrics-windows-$(GOARCH)-$(PKG_TAG).zip \
victoria-metrics-windows-$(GOARCH)-prod.exe \
&& sha256sum victoria-metrics-windows-$(GOARCH)-$(PKG_TAG).zip \
victoria-metrics-windows-$(GOARCH)-prod.exe \
> victoria-metrics-windows-$(GOARCH)-$(PKG_TAG)_checksums.txt
cd bin && rm -rf \
victoria-metrics-windows-$(GOARCH)-prod.exe
release-victoria-logs: \
release-victoria-logs-linux-386 \
release-victoria-logs-linux-amd64 \
release-victoria-logs-linux-arm \
release-victoria-logs-linux-arm64 \
release-victoria-logs-darwin-amd64 \
release-victoria-logs-darwin-arm64 \
release-victoria-logs-freebsd-amd64 \
release-victoria-logs-openbsd-amd64 \
release-victoria-logs-windows-amd64
release-victoria-logs-linux-386:
GOOS=linux GOARCH=386 $(MAKE) release-victoria-logs-goos-goarch
release-victoria-logs-linux-amd64:
GOOS=linux GOARCH=amd64 $(MAKE) release-victoria-logs-goos-goarch
release-victoria-logs-linux-arm:
GOOS=linux GOARCH=arm $(MAKE) release-victoria-logs-goos-goarch
release-victoria-logs-linux-arm64:
GOOS=linux GOARCH=arm64 $(MAKE) release-victoria-logs-goos-goarch
release-victoria-logs-darwin-amd64:
GOOS=darwin GOARCH=amd64 $(MAKE) release-victoria-logs-goos-goarch
release-victoria-logs-darwin-arm64:
GOOS=darwin GOARCH=arm64 $(MAKE) release-victoria-logs-goos-goarch
release-victoria-logs-freebsd-amd64:
GOOS=freebsd GOARCH=amd64 $(MAKE) release-victoria-logs-goos-goarch
release-victoria-logs-openbsd-amd64:
GOOS=openbsd GOARCH=amd64 $(MAKE) release-victoria-logs-goos-goarch
release-victoria-logs-windows-amd64:
GOARCH=amd64 $(MAKE) release-victoria-logs-windows-goarch
release-victoria-logs-goos-goarch: victoria-logs-$(GOOS)-$(GOARCH)-prod
cd bin && \
tar --transform="flags=r;s|-$(GOOS)-$(GOARCH)||" -czf victoria-logs-$(GOOS)-$(GOARCH)-$(PKG_TAG).tar.gz \
victoria-logs-$(GOOS)-$(GOARCH)-prod \
&& sha256sum victoria-logs-$(GOOS)-$(GOARCH)-$(PKG_TAG).tar.gz \
victoria-logs-$(GOOS)-$(GOARCH)-prod \
| sed s/-$(GOOS)-$(GOARCH)-prod/-prod/ > victoria-logs-$(GOOS)-$(GOARCH)-$(PKG_TAG)_checksums.txt
cd bin && rm -rf victoria-logs-$(GOOS)-$(GOARCH)-prod
release-victoria-logs-windows-goarch: victoria-logs-windows-$(GOARCH)-prod
cd bin && \
zip victoria-logs-windows-$(GOARCH)-$(PKG_TAG).zip \
victoria-logs-windows-$(GOARCH)-prod.exe \
&& sha256sum victoria-logs-windows-$(GOARCH)-$(PKG_TAG).zip \
victoria-logs-windows-$(GOARCH)-prod.exe \
> victoria-logs-windows-$(GOARCH)-$(PKG_TAG)_checksums.txt
cd bin && rm -rf \
victoria-logs-windows-$(GOARCH)-prod.exe
release-vmutils: \
release-vmutils-linux-386 \
release-vmutils-linux-amd64 \
@@ -371,33 +294,26 @@ release-vmutils-windows-goarch: \
vmagent-windows-$(GOARCH)-prod \
vmalert-windows-$(GOARCH)-prod \
vmauth-windows-$(GOARCH)-prod \
vmbackup-windows-$(GOARCH)-prod \
vmrestore-windows-$(GOARCH)-prod \
vmctl-windows-$(GOARCH)-prod
cd bin && \
zip vmutils-windows-$(GOARCH)-$(PKG_TAG).zip \
vmagent-windows-$(GOARCH)-prod.exe \
vmalert-windows-$(GOARCH)-prod.exe \
vmauth-windows-$(GOARCH)-prod.exe \
vmbackup-windows-$(GOARCH)-prod.exe \
vmrestore-windows-$(GOARCH)-prod.exe \
vmctl-windows-$(GOARCH)-prod.exe \
&& sha256sum vmutils-windows-$(GOARCH)-$(PKG_TAG).zip \
vmagent-windows-$(GOARCH)-prod.exe \
vmalert-windows-$(GOARCH)-prod.exe \
vmauth-windows-$(GOARCH)-prod.exe \
vmbackup-windows-$(GOARCH)-prod.exe \
vmrestore-windows-$(GOARCH)-prod.exe \
vmctl-windows-$(GOARCH)-prod.exe \
> vmutils-windows-$(GOARCH)-$(PKG_TAG)_checksums.txt
cd bin && rm -rf \
vmagent-windows-$(GOARCH)-prod.exe \
vmalert-windows-$(GOARCH)-prod.exe \
vmauth-windows-$(GOARCH)-prod.exe \
vmbackup-windows-$(GOARCH)-prod.exe \
vmrestore-windows-$(GOARCH)-prod.exe \
vmctl-windows-$(GOARCH)-prod.exe
pprof-cpu:
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics@ $(PPROF_FILE)
@@ -437,7 +353,7 @@ benchmark-pure:
vendor-update:
go get -u -d ./lib/...
go get -u -d ./app/...
go mod tidy -compat=1.19
go mod tidy -compat=1.20
go mod vendor
app-local:
@@ -463,7 +379,7 @@ golangci-lint: install-golangci-lint
golangci-lint run
install-golangci-lint:
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.51.2
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.55.1
govulncheck: install-govulncheck
govulncheck ./...
@@ -478,39 +394,27 @@ check-licenses: install-wwhrd
wwhrd check -f .wwhrd.yml
copy-docs:
# The 'printf' function is used instead of 'echo' or 'echo -e' to handle line breaks (e.g. '\n') in the same way on different operating systems (MacOS/Ubuntu Linux/Arch Linux) and their shells (bash/sh/zsh/fish).
# For details, see https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4548#issue-1782796419 and https://stackoverflow.com/questions/8467424/echo-newline-in-bash-prints-literal-n
echo "---" > ${DST}
echo '' > ${DST}
@if [ ${ORDER} -ne 0 ]; then \
echo "sort: ${ORDER}" >> ${DST}; \
echo "weight: ${ORDER}" >> ${DST}; \
printf "menu:\n docs:\n parent: 'victoriametrics'\n weight: ${ORDER}\n" >> ${DST}; \
echo "---\nsort: ${ORDER}\n---\n" > ${DST}; \
fi
echo "title: ${TITLE}" >> ${DST}
@if [ ${OLD_URL} ]; then \
printf "aliases:\n - ${OLD_URL}\n" >> ${DST}; \
fi
echo "---" >> ${DST}
cat ${SRC} >> ${DST}
sed -i='.tmp' 's/<img src=\"docs\//<img src=\"/' ${DST}
rm -rf docs/*.tmp
# Copies docs for all components and adds the order/weight tag, title, menu position and alias with the backward compatible link for the old site.
# For ORDER=0 it adds no order tag/weight tag.
# FOR OLD_URL - relative link, used for backward compatibility with the link from documentation based on GitHub pages (old one)
# FOR OLD_URL='' it adds no alias, it should be empty for every new page, don't change it for already existing links.
# Copies docs for all components and adds the order tag.
# For ORDER=0 it adds no order tag.
# Images starting with <img src="docs/ are replaced with <img src="
# Cluster docs are supposed to be ordered as 2nd.
# Cluster docs are supposed to be ordered as 9th.
# The rest of docs is ordered manually.
docs-sync:
SRC=README.md DST=docs/README.md OLD_URL='' ORDER=0 TITLE=VictoriaMetrics $(MAKE) copy-docs
SRC=README.md DST=docs/Single-server-VictoriaMetrics.md OLD_URL='/Single-server-VictoriaMetrics.html' TITLE=VictoriaMetrics ORDER=1 $(MAKE) copy-docs
SRC=app/vmagent/README.md DST=docs/vmagent.md OLD_URL='/vmagent.html' ORDER=3 TITLE=vmagent $(MAKE) copy-docs
SRC=app/vmalert/README.md DST=docs/vmalert.md OLD_URL='/vmalert.html' ORDER=4 TITLE=vmalert $(MAKE) copy-docs
SRC=app/vmauth/README.md DST=docs/vmauth.md OLD_URL='/vmauth.html' ORDER=5 TITLE=vmauth $(MAKE) copy-docs
SRC=app/vmbackup/README.md DST=docs/vmbackup.md OLD_URL='/vmbackup.html' ORDER=6 TITLE=vmbackup $(MAKE) copy-docs
SRC=app/vmrestore/README.md DST=docs/vmrestore.md OLD_URL='/vmrestore.html' ORDER=7 TITLE=vmrestore $(MAKE) copy-docs
SRC=app/vmctl/README.md DST=docs/vmctl.md OLD_URL='/vmctl.html' ORDER=8 TITLE=vmctl $(MAKE) copy-docs
SRC=app/vmgateway/README.md DST=docs/vmgateway.md OLD_URL='/vmgateway.html' ORDER=9 TITLE=vmgateway $(MAKE) copy-docs
SRC=app/vmbackupmanager/README.md DST=docs/vmbackupmanager.md OLD_URL='/vmbackupmanager.html' ORDER=10 TITLE=vmbackupmanager $(MAKE) copy-docs
SRC=README.md DST=docs/README.md ORDER=0 $(MAKE) copy-docs
SRC=README.md DST=docs/Single-server-VictoriaMetrics.md ORDER=1 $(MAKE) copy-docs
SRC=app/vmagent/README.md DST=docs/vmagent.md ORDER=3 $(MAKE) copy-docs
SRC=app/vmalert/README.md DST=docs/vmalert.md ORDER=4 $(MAKE) copy-docs
SRC=app/vmauth/README.md DST=docs/vmauth.md ORDER=5 $(MAKE) copy-docs
SRC=app/vmbackup/README.md DST=docs/vmbackup.md ORDER=6 $(MAKE) copy-docs
SRC=app/vmrestore/README.md DST=docs/vmrestore.md ORDER=7 $(MAKE) copy-docs
SRC=app/vmctl/README.md DST=docs/vmctl.md ORDER=8 $(MAKE) copy-docs
SRC=app/vmgateway/README.md DST=docs/vmgateway.md ORDER=9 $(MAKE) copy-docs
SRC=app/vmbackupmanager/README.md DST=docs/vmbackupmanager.md ORDER=10 $(MAKE) copy-docs

444
README.md
View File

@@ -2,7 +2,6 @@
[![Latest Release](https://img.shields.io/github/release/VictoriaMetrics/VictoriaMetrics.svg?style=flat-square)](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest)
[![Docker Pulls](https://img.shields.io/docker/pulls/victoriametrics/victoria-metrics.svg?maxAge=604800)](https://hub.docker.com/r/victoriametrics/victoria-metrics)
[![victoriametrics](https://snapcraft.io/victoriametrics/badge.svg)](https://snapcraft.io/victoriametrics)
[![Slack](https://img.shields.io/badge/join%20slack-%23victoriametrics-brightgreen.svg)](https://slack.victoriametrics.com/)
[![GitHub license](https://img.shields.io/github/license/VictoriaMetrics/VictoriaMetrics.svg)](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/LICENSE)
[![Go Report](https://goreportcard.com/badge/github.com/VictoriaMetrics/VictoriaMetrics)](https://goreportcard.com/report/github.com/VictoriaMetrics/VictoriaMetrics)
@@ -16,16 +15,14 @@ VictoriaMetrics is a fast, cost-effective and scalable monitoring solution and t
VictoriaMetrics is available in [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases),
[Docker images](https://hub.docker.com/r/victoriametrics/victoria-metrics/), [Snap packages](https://snapcraft.io/victoriametrics)
and [source code](https://github.com/VictoriaMetrics/VictoriaMetrics).
Just download [the latest version of VictoriaMetrics](https://docs.victoriametrics.com/CHANGELOG.html)
and follow [these instructions](https://docs.victoriametrics.com/Quick-Start.html).
The cluster version of VictoriaMetrics is available [here](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html).
Learn more about [key concepts](https://docs.victoriametrics.com/keyConcepts.html) of VictoriaMetrics and follow the
[quick start guide](https://docs.victoriametrics.com/Quick-Start.html) for a better experience.
There is also user-friendly database for logs - [VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/).
If you have questions about VictoriaMetrics, then feel free asking them at [VictoriaMetrics community Slack chat](https://slack.victoriametrics.com/).
[Contact us](mailto:info@victoriametrics.com) if you need enterprise support for VictoriaMetrics.
See [features available in enterprise package](https://docs.victoriametrics.com/enterprise.html).
Enterprise binaries can be downloaded and evaluated for free
@@ -42,8 +39,7 @@ VictoriaMetrics has the following prominent features:
* It can be used as long-term storage for Prometheus. See [these docs](#prometheus-setup) for details.
* It can be used as a drop-in replacement for Prometheus in Grafana, because it supports [Prometheus querying API](#prometheus-querying-api-usage).
* It can be used as a drop-in replacement for Graphite in Grafana, because it supports [Graphite API](#graphite-api-usage).
VictoriaMetrics allows reducing infrastructure costs by more than 10x comparing to Graphite - see [this case study](https://docs.victoriametrics.com/CaseStudies.html#grammarly).
* It is easy to setup and operate:
* It features easy setup and operation:
* VictoriaMetrics consists of a single [small executable](https://medium.com/@valyala/stripping-dependency-bloat-in-victoriametrics-docker-image-983fb5912b0d)
without external dependencies.
* All the configuration is done via explicit command-line flags with reasonable defaults.
@@ -86,7 +82,6 @@ VictoriaMetrics has the following prominent features:
* [Arbitrary CSV data](#how-to-import-csv-data).
* [Native binary format](#how-to-import-data-in-native-format).
* [DataDog agent or DogStatsD](#how-to-send-data-from-datadog-agent).
* [OpenTelemetry metrics format](#sending-data-via-opentelemetry).
* It supports powerful [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html), which can be used as a [statsd](https://github.com/statsd/statsd) alternative.
* It supports metrics [relabeling](#relabeling).
* It can deal with [high cardinality issues](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality) and
@@ -110,17 +105,14 @@ Case studies:
* [Brandwatch](https://docs.victoriametrics.com/CaseStudies.html#brandwatch)
* [CERN](https://docs.victoriametrics.com/CaseStudies.html#cern)
* [COLOPL](https://docs.victoriametrics.com/CaseStudies.html#colopl)
* [Dig Security](https://docs.victoriametrics.com/CaseStudies.html#dig-security)
* [Fly.io](https://docs.victoriametrics.com/CaseStudies.html#flyio)
* [German Research Center for Artificial Intelligence](https://docs.victoriametrics.com/CaseStudies.html#german-research-center-for-artificial-intelligence)
* [Grammarly](https://docs.victoriametrics.com/CaseStudies.html#grammarly)
* [Groove X](https://docs.victoriametrics.com/CaseStudies.html#groove-x)
* [Idealo.de](https://docs.victoriametrics.com/CaseStudies.html#idealode)
* [MHI Vestas Offshore Wind](https://docs.victoriametrics.com/CaseStudies.html#mhi-vestas-offshore-wind)
* [Naver](https://docs.victoriametrics.com/CaseStudies.html#naver)
* [Razorpay](https://docs.victoriametrics.com/CaseStudies.html#razorpay)
* [Percona](https://docs.victoriametrics.com/CaseStudies.html#percona)
* [Roblox](https://docs.victoriametrics.com/CaseStudies.html#roblox)
* [Sensedia](https://docs.victoriametrics.com/CaseStudies.html#sensedia)
* [Smarkets](https://docs.victoriametrics.com/CaseStudies.html#smarkets)
* [Synthesio](https://docs.victoriametrics.com/CaseStudies.html#synthesio)
@@ -133,26 +125,15 @@ See also [articles and slides about VictoriaMetrics from our users](https://docs
## Operation
### Install
To quickly try VictoriaMetrics, just download [VictoriaMetrics executable](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) or [Docker image](https://hub.docker.com/r/victoriametrics/victoria-metrics/) and start it with the desired command-line flags.
See also [QuickStart guide](https://docs.victoriametrics.com/Quick-Start.html) for additional information.
VictoriaMetrics can also be installed via these installation methods:
* [Helm charts for single-node and cluster versions of VictoriaMetrics](https://github.com/VictoriaMetrics/helm-charts).
* [Kubernetes operator for VictoriaMetrics](https://github.com/VictoriaMetrics/operator).
* [Ansible role for installing cluster VictoriaMetrics (by VictoriaMetrics)](https://github.com/VictoriaMetrics/ansible-playbooks).
* [Ansible role for installing cluster VictoriaMetrics (by community)](https://github.com/Slapper/ansible-victoriametrics-cluster-role).
* [Ansible role for installing single-node VictoriaMetrics (by community)](https://github.com/dreamteam-gg/ansible-victoriametrics-role).
* [Snap package for VictoriaMetrics](https://snapcraft.io/victoriametrics).
### How to start VictoriaMetrics
Just download [VictoriaMetrics executable](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) or [Docker image](https://hub.docker.com/r/victoriametrics/victoria-metrics/) and start it with the desired command-line flags.
See also [QuickStart guide](https://docs.victoriametrics.com/Quick-Start.html) for additional information.
The following command-line flags are used the most:
* `-storageDataPath` - VictoriaMetrics stores all the data in this directory. Default path is `victoria-metrics-data` in the current working directory.
* `-retentionPeriod` - retention for stored data. Older data is automatically deleted. Default retention is 1 month. The minimum retention period is 24h or 1d. See [the Retention section](#retention) for more details.
* `-retentionPeriod` - retention for stored data. Older data is automatically deleted. Default retention is 1 month. See [the Retention section](#retention) for more details.
Other flags have good enough default values, so set them only if you really need this. Pass `-help` to see [all the available flags with description and default values](#list-of-command-line-flags).
@@ -207,7 +188,7 @@ Changing scrape configuration is possible with text editor:
vi $SNAP_DATA/var/snap/victoriametrics/current/etc/victoriametrics-scrape-config.yaml
```
After changes were made, trigger config re-read with the command `curl 127.0.0.1:8428/-/reload`.
After changes were made, trigger config re-read with the command `curl 127.0.0.1:8248/-/reload`.
## Prometheus setup
@@ -287,11 +268,7 @@ http://<victoriametrics-addr>:8428
Substitute `<victoriametrics-addr>` with the hostname or IP address of VictoriaMetrics.
Then build graphs and dashboards for the created datasource using [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/)
or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html).
Alternatively, use VictoriaMetrics [datasource plugin](https://github.com/VictoriaMetrics/grafana-datasource) with support of extra features.
See more in [description](https://github.com/VictoriaMetrics/grafana-datasource#victoriametrics-data-source-for-grafana).
Then build graphs and dashboards for the created datasource using [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html).
## How to upgrade VictoriaMetrics
@@ -312,22 +289,13 @@ Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](
## vmui
VictoriaMetrics provides UI for query troubleshooting and exploration. The UI is available at `http://victoriametrics:8428/vmui`.
The UI allows exploring query results via graphs and tables. It also provides the following features:
The UI allows exploring query results via graphs and tables.
It also provides the following features:
- Explore:
- [Metrics explorer](#metrics-explorer) - automatically builds graphs for selected metrics;
- [Cardinality explorer](#cardinality-explorer) - stats about existing metrics in TSDB;
- [Top queries](#top-queries) - shows most frequently executed queries;
- [Active queries](#active-queries) - shows currently executed queries;
- Tools:
- [Trace analyzer](#query-tracing) - playground for loading query traces in JSON format;
- [WITH expressions playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/expand-with-exprs) - test how WITH expressions work;
- [Metric relabel debugger](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/relabeling) - playground for [relabeling](#relabeling) configs.
VMUI automatically switches from graph view to heatmap view when the query returns [histogram](https://docs.victoriametrics.com/keyConcepts.html#histogram) buckets
(both [Prometheus histograms](https://prometheus.io/docs/concepts/metric_types/#histogram)
and [VictoriaMetrics histograms](https://valyala.medium.com/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350) are supported).
Try, for example, [this query](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/?g0.expr=sum%28rate%28vm_promscrape_scrape_duration_seconds_bucket%29%29+by+%28vmrange%29&g0.range_input=24h&g0.end_input=2023-04-10T17%3A46%3A12&g0.relative_time=last_24_hours&g0.step_input=31m).
- [metrics explorer](#metrics-explorer)
- [cardinality explorer](#cardinality-explorer)
- [query tracer](#query-tracing)
- [top queries explorer](#top-queries)
Graphs in `vmui` support scrolling and zooming:
@@ -339,12 +307,9 @@ Query history can be navigated by holding `Ctrl` (or `Cmd` on MacOS) and pressin
Multi-line queries can be entered by pressing `Shift-Enter` in query input field.
When querying the [backfilled data](https://docs.victoriametrics.com/#backfilling)
or during [query troubleshooting](https://docs.victoriametrics.com/Troubleshooting.html#unexpected-query-results),
it may be useful disabling response cache by clicking `Disable cache` checkbox.
When querying the [backfilled data](https://docs.victoriametrics.com/#backfilling) or during [query troubleshooting](https://docs.victoriametrics.com/Troubleshooting.html#unexpected-query-results), it may be useful disabling response cache by clicking `Disable cache` checkbox.
VMUI automatically adjusts the interval between datapoints on the graph depending on the horizontal resolution and on the selected time range.
The step value can be customized by changing `Step value` input.
VMUI automatically adjusts the interval between datapoints on the graph depending on the horizontal resolution and on the selected time range. The step value can be customized by changing `Step value` input.
VMUI allows investigating correlations between multiple queries on the same graph. Just click `Add Query` button,
enter an additional query in the newly appeared input field and press `Enter`.
@@ -363,24 +328,15 @@ See the [example VMUI at VictoriaMetrics playground](https://play.victoriametric
* queries with the biggest average execution duration;
* queries that took the most summary time for execution.
## Active queries
[VMUI](#vmui) provides `active queries` tab, which shows currently execute queries.
It provides the following information per each query:
- The query itself, together with the time range and step args passed to [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query).
- The duration of the query execution.
- The client address, who initiated the query execution.
## Metrics explorer
[VMUI](#vmui) provides an ability to explore metrics exported by a particular `job` / `instance` in the following way:
1. Open the `vmui` at `http://victoriametrics:8428/vmui/`.
1. Click the `Explore metrics` tab.
1. Select the `job` you want to explore.
1. Optionally select the `instance` for the selected job to explore.
1. Select metrics you want to explore and compare.
2. Click the `Explore metrics` tab.
3. Select the `job` you want to explore.
4. Optionally select the `instance` for the selected job to explore.
5. Select metrics you want to explore and compare.
It is possible to change the selected time range for the graphs in the top right corner.
@@ -397,17 +353,12 @@ VictoriaMetrics provides an ability to explore time series cardinality at `Explo
may show lower than expected number of unique label values for labels with small number of unique values.
This is because of [implementation limits](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/5a6e617b5e41c9170e7c562aecd15ee0c901d489/app/vmselect/netstorage/netstorage.go#L1039-L1045).
By default, cardinality explorer analyzes time series for the current date. It provides the ability to select different day at the top right corner.
By default, all the time series for the selected date are analyzed. It is possible to narrow down the analysis to series
By default cardinality explorer analyzes time series for the current date. It provides the ability to select different day at the top right corner.
By default all the time series for the selected date are analyzed. It is possible to narrow down the analysis to series
matching the specified [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors).
Cardinality explorer is built on top of [/api/v1/status/tsdb](#tsdb-stats).
In [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) each vmstorage tracks the stored time series individually.
vmselect requests stats via [/api/v1/status/tsdb](#tsdb-stats) API from each vmstorage node and merges the results by summing per-series stats.
This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes
due to [replication](#replication) or [rerouting](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html?highlight=re-routes#cluster-availability).
See [cardinality explorer playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/cardinality).
See the example of using the cardinality explorer [here](https://victoriametrics.com/blog/cardinality-explorer/).
@@ -423,7 +374,7 @@ Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](
## How to scrape Prometheus exporters such as [node-exporter](https://github.com/prometheus/node_exporter)
VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping targets configured in `prometheus.yml` config file according to [the specification](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file). Just set `-promscrape.config` command-line flag to the path to `prometheus.yml` config - and VictoriaMetrics should start scraping the configured targets. If the provided configuration file contains [unsupported options](https://docs.victoriametrics.com/vmagent.html#unsupported-prometheus-config-sections), then either delete them from the file or just pass `-promscrape.config.strictParse=false` command-line flag to VictoriaMetrics, so it will ignore unsupported options.
VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping targets configured in `prometheus.yml` config file according to [the specification](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file). Just set `-promscrape.config` command-line flag to the path to `prometheus.yml` config - and VictoriaMetrics should start scraping the configured targets.
The file pointed by `-promscrape.config` may contain `%{ENV_VAR}` placeholders, which are substituted by the corresponding `ENV_VAR` environment variable values.
@@ -460,7 +411,7 @@ DD_DD_URL=http://victoriametrics:8428/datadog
_Choose correct URL for VictoriaMetrics [here](https://docs.victoriametrics.com/url-examples.html#datadog)._
To configure DataDog agent via [configuration file](https://github.com/DataDog/datadog-agent/blob/878600ef7a55c5ef0efb41ed0915f020cf7e3bd0/pkg/config/config_template.yaml#L33)
To configure DataDog agent via [configuration file](https://docs.datadoghq.com/agent/guide/agent-configuration-files)
add the following line:
<div class="with-copy" markdown="1">
@@ -488,8 +439,7 @@ Run DataDog using the following ENV variable with VictoriaMetrics as additional
<div class="with-copy" markdown="1">
```
DD_ADDITIONAL_ENDPOINTS='{\"http://victoriametrics:8428/datadog\": [\"apikey\"]}'
DD_ADDITIONAL_ENDPOINTS='{\"http://victoriametrics:8428/datadog\"}'
```
</div>
@@ -503,9 +453,7 @@ add the following line:
<div class="with-copy" markdown="1">
```
additional_endpoints:
"http://victoriametrics:8428/datadog":
- apikey
additional_endpoints: http://victoriametrics:8428/datadog
```
</div>
@@ -551,10 +499,8 @@ and stream plain InfluxDB line protocol data to the configured TCP and/or UDP ad
VictoriaMetrics performs the following transformations to the ingested InfluxDB data:
* [db query arg](https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint) is mapped into `db`
[label](https://docs.victoriametrics.com/keyConcepts.html#labels) value unless `db` tag exists in the InfluxDB line.
The `db` label name can be overridden via `-influxDBLabel` command-line flag. If more strict data isolation is required,
read more about multi-tenancy [here](https://docs.victoriametrics.com/keyConcepts.html#multi-tenancy).
* [db query arg](https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint) is mapped into `db` label value
unless `db` tag exists in the InfluxDB line. The `db` label name can be overridden via `-influxDBLabel` command-line flag.
* Field names are mapped to time series names prefixed with `{measurement}{separator}` value, where `{separator}` equals to `_` by default. It can be changed with `-influxMeasurementFieldSeparator` command-line flag. See also `-influxSkipSingleField` command-line flag. If `{measurement}` is empty or if `-influxSkipMeasurement` command-line flag is set, then time series names correspond to field names.
* Field values are mapped to time series values.
* Tags are mapped to Prometheus labels as-is.
@@ -605,8 +551,7 @@ The `/api/v1/export` endpoint should return the following response:
```
Note that InfluxDB line protocol expects [timestamps in *nanoseconds* by default](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/#timestamp),
while VictoriaMetrics stores them with *milliseconds* precision. It is allowed to ingest timestamps with seconds,
microseconds or nanoseconds precision - VictoriaMetrics will automatically convert them to milliseconds.
while VictoriaMetrics stores them with *milliseconds* precision.
Extra labels may be added to all the written time series by passing `extra_label=name=value` query args.
For example, `/write?extra_label=foo=bar` would add `{foo="bar"}` label to all the ingested metrics.
@@ -669,6 +614,7 @@ The `__graphite__` pseudo-label supports e.g. alternate regexp filters such as `
VictoriaMetrics also supports Graphite query language - see [these docs](#graphite-render-api-usage).
## How to send data from OpenTSDB-compatible agents
VictoriaMetrics supports [telnet put protocol](http://opentsdb.net/docs/build/html/api_telnet/put.html)
@@ -783,45 +729,20 @@ All the Prometheus querying API handlers can be prepended with `/prometheus` pre
### Prometheus querying API enhancements
VictoriaMetrics accepts optional `extra_label=<label_name>=<label_value>` query arg, which can be used
for enforcing additional label filters for queries. For example, `/api/v1/query_range?extra_label=user_id=123&extra_label=group_id=456&query=<query>`
would automatically add `{user_id="123",group_id="456"}` label filters to the given `<query>`.
This functionality can be used for limiting the scope of time series visible to the given tenant.
It is expected that the `extra_label` query args are automatically set by auth proxy sitting in front of VictoriaMetrics.
See [vmauth](https://docs.victoriametrics.com/vmauth.html) and [vmgateway](https://docs.victoriametrics.com/vmgateway.html) as examples of such proxies.
VictoriaMetrics accepts optional `extra_label=<label_name>=<label_value>` query arg, which can be used for enforcing additional label filters for queries. For example,
`/api/v1/query_range?extra_label=user_id=123&extra_label=group_id=456&query=<query>` would automatically add `{user_id="123",group_id="456"}` label filters to the given `<query>`. This functionality can be used for limiting the scope of time series visible to the given tenant. It is expected that the `extra_label` query args are automatically set by auth proxy sitting in front of VictoriaMetrics. See [vmauth](https://docs.victoriametrics.com/vmauth.html) and [vmgateway](https://docs.victoriametrics.com/vmgateway.html) as examples of such proxies.
VictoriaMetrics accepts optional `extra_filters[]=series_selector` query arg, which can be used for enforcing arbitrary label filters for queries.
For example, `/api/v1/query_range?extra_filters[]={env=~"prod|staging",user="xyz"}&query=<query>` would automatically
add `{env=~"prod|staging",user="xyz"}` label filters to the given `<query>`. This functionality can be used for limiting
the scope of time series visible to the given tenant. It is expected that the `extra_filters[]` query args are automatically
set by auth proxy sitting in front of VictoriaMetrics.
See [vmauth](https://docs.victoriametrics.com/vmauth.html) and [vmgateway](https://docs.victoriametrics.com/vmgateway.html) as examples of such proxies.
VictoriaMetrics accepts optional `extra_filters[]=series_selector` query arg, which can be used for enforcing arbitrary label filters for queries. For example,
`/api/v1/query_range?extra_filters[]={env=~"prod|staging",user="xyz"}&query=<query>` would automatically add `{env=~"prod|staging",user="xyz"}` label filters to the given `<query>`. This functionality can be used for limiting the scope of time series visible to the given tenant. It is expected that the `extra_filters[]` query args are automatically set by auth proxy sitting in front of VictoriaMetrics. See [vmauth](https://docs.victoriametrics.com/vmauth.html) and [vmgateway](https://docs.victoriametrics.com/vmgateway.html) as examples of such proxies.
VictoriaMetrics accepts multiple formats for `time`, `start` and `end` query args - see [these docs](#timestamp-formats).
VictoriaMetrics accepts `round_digits` query arg for [/api/v1/query](https://docs.victoriametrics.com/keyConcepts.html#instant-query)
and [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query) handlers. It can be used for rounding response values
to the given number of digits after the decimal point.
For example, `/api/v1/query?query=avg_over_time(temperature[1h])&round_digits=2` would round response values to up to two digits after the decimal point.
VictoriaMetrics accepts `round_digits` query arg for `/api/v1/query` and `/api/v1/query_range` handlers. It can be used for rounding response values to the given number of digits after the decimal point. For example, `/api/v1/query?query=avg_over_time(temperature[1h])&round_digits=2` would round response values to up to two digits after the decimal point.
VictoriaMetrics accepts `limit` query arg for [/api/v1/labels](https://docs.victoriametrics.com/url-examples.html#apiv1labels)
and [`/api/v1/label/<labelName>/values`](https://docs.victoriametrics.com/url-examples.html#apiv1labelvalues) handlers for limiting the number of returned entries.
For example, the query to `/api/v1/labels?limit=5` returns a sample of up to 5 unique labels, while ignoring the rest of labels.
If the provided `limit` value exceeds the corresponding `-search.maxTagKeys` / `-search.maxTagValues` command-line flag values,
then limits specified in the command-line flags are used.
VictoriaMetrics accepts `limit` query arg for `/api/v1/labels` and `/api/v1/label/<labelName>/values` handlers for limiting the number of returned entries. For example, the query to `/api/v1/labels?limit=5` returns a sample of up to 5 unique labels, while ignoring the rest of labels. If the provided `limit` value exceeds the corresponding `-search.maxTagKeys` / `-search.maxTagValues` command-line flag values, then limits specified in the command-line flags are used.
By default, VictoriaMetrics returns time series for the last day starting at 00:00 UTC
from [/api/v1/series](https://docs.victoriametrics.com/url-examples.html#apiv1series),
[/api/v1/labels](https://docs.victoriametrics.com/url-examples.html#apiv1labels) and
[`/api/v1/label/<labelName>/values`](https://docs.victoriametrics.com/url-examples.html#apiv1labelvalues),
while the Prometheus API defaults to all time. Explicitly set `start` and `end` to select the desired time range.
VictoriaMetrics rounds the specified `start..end` time range to day granularity because of performance optimization concerns.
If you need the exact set of label names and label values on the given time range, then send queries
to [/api/v1/query](https://docs.victoriametrics.com/keyConcepts.html#instant-query) or to [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query).
VictoriaMetrics accepts `limit` query arg at [/api/v1/series](https://docs.victoriametrics.com/url-examples.html#apiv1series)
for limiting the number of returned entries. For example, the query to `/api/v1/series?limit=5` returns a sample of up to 5 series, while ignoring the rest of series.
If the provided `limit` value exceeds the corresponding `-search.maxSeries` command-line flag values, then limits specified in the command-line flags are used.
By default, VictoriaMetrics returns time series for the last 5 minutes from `/api/v1/series`, `/api/v1/labels` and `/api/v1/label/<labelName>/values` while the Prometheus API defaults to all time. Explicitly set `start` and `end` to select the desired time range.
VictoriaMetrics accepts `limit` query arg for `/api/v1/series` handlers for limiting the number of returned entries. For example, the query to `/api/v1/series?limit=5` returns a sample of up to 5 series, while ignoring the rest. If the provided `limit` value exceeds the corresponding `-search.maxSeries` command-line flag values, then limits specified in the command-line flags are used.
Additionally, VictoriaMetrics provides the following handlers:
@@ -846,12 +767,9 @@ in [query APIs](https://docs.victoriametrics.com/#prometheus-querying-api-usage)
in [export APIs](https://docs.victoriametrics.com/#how-to-export-time-series).
- Unix timestamps in seconds with optional milliseconds after the point. For example, `1562529662.678`.
- Unix timestamps in milliseconds. For example, `1562529662678`.
- [RFC3339](https://www.ietf.org/rfc/rfc3339.txt). For example, `2022-03-29T01:02:03Z` or `2022-03-29T01:02:03+02:30`.
- Partial RFC3339. Examples: `2022`, `2022-03`, `2022-03-29`, `2022-03-29T01`, `2022-03-29T01:02`, `2022-03-29T01:02:03`.
The partial RFC3339 time is in UTC timezone by default. It is possible to specify timezone there by adding `+hh:mm` or `-hh:mm` suffix to partial time.
For example, `2022-03-01+06:30` is `2022-03-01` at `06:30` timezone.
- Relative duration comparing to the current time. For example, `1h5m`, `-1h5m` or `now-1h5m` means `one hour and five minutes ago`, while `now` means `now`.
- [RFC3339](https://www.ietf.org/rfc/rfc3339.txt). For example, '2022-03-29T01:02:03Z`.
- Partial RFC3339. Examples: `2022`, `2022-03`, `2022-03-29`, `2022-03-29T01`, `2022-03-29T01:02`.
- Relative duration comparing to the current time. For example, `1h5m` means `one hour and five minutes ago`.
## Graphite API usage
@@ -873,10 +791,10 @@ VictoriaMetrics supports `__graphite__` pseudo-label for filtering time series w
### Graphite Render API usage
VictoriaMetrics supports [Graphite Render API](https://graphite.readthedocs.io/en/stable/render_api.html) subset
[VictoriaMetrics Enterprise](https://docs.victoriametrics.com/enterprise.html) supports [Graphite Render API](https://graphite.readthedocs.io/en/stable/render_api.html) subset
at `/render` endpoint, which is used by [Graphite datasource in Grafana](https://grafana.com/docs/grafana/latest/datasources/graphite/).
When configuring Graphite datasource in Grafana, the `Storage-Step` http request header must be set to a step between Graphite data points
stored in VictoriaMetrics. For example, `Storage-Step: 10s` would mean 10 seconds distance between Graphite datapoints stored in VictoriaMetrics.
When configuring Graphite datasource in Grafana, the `Storage-Step` http request header must be set to a step between Graphite data points stored in VictoriaMetrics. For example, `Storage-Step: 10s` would mean 10 seconds distance between Graphite datapoints stored in VictoriaMetrics.
Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases).
### Graphite Metrics API usage
@@ -888,7 +806,7 @@ VictoriaMetrics supports the following handlers from [Graphite Metrics API](http
VictoriaMetrics accepts the following additional query args at `/metrics/find` and `/metrics/expand`:
* `label` - for selecting arbitrary label values. By default, `label=__name__`, i.e. metric names are selected.
* `label` - for selecting arbitrary label values. By default `label=__name__`, i.e. metric names are selected.
* `delimiter` - for using different delimiters in metric name hierarchy. For example, `/metrics/find?delimiter=_&query=node_*` would return all the metric name prefixes
that start with `node_`. By default `delimiter=.`.
@@ -915,13 +833,13 @@ to your needs or when testing bugfixes.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
1. Run `make victoria-metrics` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make victoria-metrics` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
1. Run `make victoria-metrics-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make victoria-metrics-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics-prod` binary and puts it into the `bin` folder.
### ARM build
@@ -931,13 +849,13 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
1. Run `make victoria-metrics-linux-arm` or `make victoria-metrics-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make victoria-metrics-linux-arm` or `make victoria-metrics-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics-linux-arm` or `victoria-metrics-linux-arm64` binary respectively and puts it into the `bin` folder.
### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
1. Run `make victoria-metrics-linux-arm-prod` or `make victoria-metrics-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make victoria-metrics-linux-arm-prod` or `make victoria-metrics-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics-linux-arm-prod` or `victoria-metrics-linux-arm64-prod` binary respectively and puts it into the `bin` folder.
### Pure Go build (CGO_ENABLED=0)
@@ -945,7 +863,7 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
`Pure Go` mode builds only Go code without [cgo](https://golang.org/cmd/cgo/) dependencies.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
1. Run `make victoria-metrics-pure` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make victoria-metrics-pure` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics-pure` binary and puts it into the `bin` folder.
### Building docker images
@@ -998,9 +916,9 @@ Navigate to `http://<victoriametrics-addr>:8428/snapshot/delete_all` in order to
Steps for restoring from a snapshot:
1. Stop VictoriaMetrics with `kill -INT`.
1. Restore snapshot contents from backup with [vmrestore](https://docs.victoriametrics.com/vmrestore.html)
2. Restore snapshot contents from backup with [vmrestore](https://docs.victoriametrics.com/vmrestore.html)
to the directory pointed by `-storageDataPath`.
1. Start VictoriaMetrics.
3. Start VictoriaMetrics.
## How to delete time series
@@ -1014,7 +932,7 @@ Note that background merges may never occur for data from previous months, so st
In this case [forced merge](#forced-merge) may help freeing up storage space.
It is recommended verifying which metrics will be deleted with the call to `http://<victoria-metrics-addr>:8428/api/v1/series?match[]=<timeseries_selector_for_delete>`
before actually deleting the metrics. By default, this query will only scan series in the past 5 minutes, so you may need to
before actually deleting the metrics. By default this query will only scan series in the past 5 minutes, so you may need to
adjust `start` and `end` to a suitable range to achieve match hits.
The `/api/v1/admin/tsdb/delete_series` handler may be protected with `authKey` if `-deleteAuthKey` command-line flag is set.
@@ -1065,7 +983,7 @@ VictoriaMetrics provides the following handlers for exporting data:
Send a request to `http://<victoriametrics-addr>:8428/api/v1/export?match[]=<timeseries_selector_for_export>`,
where `<timeseries_selector_for_export>` may contain any [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors)
for metrics to export. Use `{__name__!=""}` selector for fetching all the time series.
The response would contain all the data for the selected time series in [JSON streaming format](http://ndjson.org/).
The response would contain all the data for the selected time series in [JSON streaming format](https://en.wikipedia.org/wiki/JSON_streaming#Line-delimited_JSON).
Each JSON line contains samples for a single time series. An example output:
```json
@@ -1079,7 +997,7 @@ See [allowed formats](#timestamp-formats) for these args.
For example:
```console
curl http://<victoriametrics-addr>:8428/api/v1/export -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/api/v1/export -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'
curl http://<victoriametrics-addr>:8428/api/v1/export -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48+00:00' -d 'end=2022-06-06T19:29:07+00:00'
```
Optional `max_rows_per_line` arg may be added to the request for limiting the maximum number of rows exported per each JSON line.
@@ -1128,7 +1046,7 @@ See [allowed formats](#timestamp-formats) for these args.
For example:
```console
curl http://<victoriametrics-addr>:8428/api/v1/export/csv -d 'format=<format>' -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/api/v1/export/csv -d 'format=<format>' -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'
curl http://<victoriametrics-addr>:8428/api/v1/export/csv -d 'format=<format>' -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48+00:00' -d 'end=2022-06-06T19:29:07+00:00'
```
The exported CSV data can be imported to VictoriaMetrics via [/api/v1/import/csv](#how-to-import-csv-data).
@@ -1156,7 +1074,7 @@ See [allowed formats](#timestamp-formats) for these args.
For example:
```console
curl http://<victoriametrics-addr>:8428/api/v1/export/native -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/api/v1/export/native -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'
curl http://<victoriametrics-addr>:8428/api/v1/export/native -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48+00:00' -d 'end=2022-06-06T19:29:07+00:00'
```
The exported data can be imported to VictoriaMetrics via [/api/v1/import/native](#how-to-import-data-in-native-format).
@@ -1175,7 +1093,6 @@ Additionally, VictoriaMetrics can accept metrics via the following popular data
* DataDog `submit metrics` API. See [these docs](#how-to-send-data-from-datadog-agent) for details.
* InfluxDB line protocol. See [these docs](#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) for details.
* Graphite plaintext protocol. See [these docs](#how-to-send-data-from-graphite-compatible-agents-such-as-statsd) for details.
* OpenTelemetry http API. See [these docs](#sending-data-via-opentelemetry) for details.
* OpenTSDB telnet put protocol. See [these docs](#sending-data-via-telnet-put-protocol) for details.
* OpenTSDB http `/api/put` protocol. See [these docs](#sending-opentsdb-data-via-http-apiput-requests) for details.
* `/api/v1/import` for importing data obtained from [/api/v1/export](#how-to-export-data-in-json-line-format).
@@ -1186,13 +1103,6 @@ Additionally, VictoriaMetrics can accept metrics via the following popular data
* `/api/v1/import/prometheus` for importing data in Prometheus exposition format and in [Pushgateway format](https://github.com/prometheus/pushgateway#url).
See [these docs](#how-to-import-data-in-prometheus-exposition-format) for details.
Please note, most of the ingestion APIs (except [Prometheus remote_write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write))
are optimized for performance and processes data in a streaming fashion.
It means that client can transfer unlimited amount of data through the open connection. Because of this, import APIs
may not return parsing errors to the client, as it is expected for data stream to be not interrupted.
Instead, look for parsing errors on the server side (VictoriaMetrics single-node or vminsert) or
check for changes in `vm_rows_invalid_total` (exported by server side) metric.
### How to import data in JSON line format
Example for importing data obtained via [/api/v1/export](#how-to-export-data-in-json-line-format):
@@ -1359,13 +1269,6 @@ Note that it could be required to flush response cache after importing historica
VictoriaMetrics also may scrape Prometheus targets - see [these docs](#how-to-scrape-prometheus-exporters-such-as-node-exporter).
## Sending data via OpenTelemetry
VictoriaMetrics supports data ingestion via [OpenTelemetry protocol for metrics](https://github.com/open-telemetry/opentelemetry-specification/blob/ffddc289462dfe0c2041e3ca42a7b1df805706de/specification/metrics/data-model.md) at `/opentelemetry/api/v1/push` path.
VictoriaMetrics expects `protobuf`-encoded requests at `/opentelemetry/api/v1/push`.
Set HTTP request header `Content-Encoding: gzip` when sending gzip-compressed data to `/opentelemetry/api/v1/push`.
## Relabeling
VictoriaMetrics supports Prometheus-compatible relabeling for all the ingested metrics if `-relabelConfig` command-line flag points
@@ -1395,8 +1298,7 @@ Example contents for `-relabelConfig` file:
VictoriaMetrics provides additional relabeling features such as Graphite-style relabeling.
See [these docs](https://docs.victoriametrics.com/vmagent.html#relabeling) for more details.
The relabeling can be debugged at `http://victoriametrics:8428/metric-relabel-debug` page
or at our [public playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/relabeling).
The relabeling can be debugged at `http://victoriametrics:8428/metric-relabel-debug` page.
See [these docs](https://docs.victoriametrics.com/vmagent.html#relabel-debug) for more details.
@@ -1411,7 +1313,7 @@ See [allowed formats](#timestamp-formats) for these args.
For example:
```console
curl http://<victoriametrics-addr>:8428/federate -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/federate -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'
curl http://<victoriametrics-addr>:8428/federate -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48+00:00' -d 'end=2022-06-06T19:29:07+00:00'
```
By default, the last point on the interval `[now - max_lookback ... now]` is scraped for each time series. The default value for `max_lookback` is `5m` (5 minutes), but it can be overridden with `max_lookback` query arg.
For instance, `/federate?match[]=up&max_lookback=1h` would return last points on the `[now - 1h ... now]` interval. This may be useful for time series federation
@@ -1443,7 +1345,7 @@ See also [resource usage limits docs](#resource-usage-limits).
## Resource usage limits
By default, VictoriaMetrics is tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
By default VictoriaMetrics is tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
- `-memory.allowedPercent` and `-memory.allowedBytes` limit the amounts of memory, which may be used for various internal caches at VictoriaMetrics. Note that VictoriaMetrics may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
- `-search.maxMemoryPerQuery` limits the amounts of memory, which can be used for processing a single query. Queries, which need more memory, are rejected. Heavy queries, which select big number of time series, may exceed the per-query memory limit by a small percent. The total memory limit for concurrently executed queries can be estimated as `-search.maxMemoryPerQuery` multiplied by `-search.maxConcurrentRequests`.
@@ -1454,7 +1356,6 @@ By default, VictoriaMetrics is tuned for an optimal resource usage under typical
- `-search.maxSamplesPerQuery` limits the number of raw samples a single query can process. This allows limiting CPU usage for heavy queries.
- `-search.maxPointsPerTimeseries` limits the number of calculated points, which can be returned per each matching time series from [range query](https://docs.victoriametrics.com/keyConcepts.html#range-query).
- `-search.maxPointsSubqueryPerTimeseries` limits the number of calculated points, which can be generated per each matching time series during [subquery](https://docs.victoriametrics.com/MetricsQL.html#subqueries) evaluation.
- `-search.maxSeriesPerAggrFunc` limits the number of time series, which can be generated by [MetricsQL aggregate functions](https://docs.victoriametrics.com/MetricsQL.html#aggregate-functions) in a single query.
- `-search.maxSeries` limits the number of time series, which may be returned from [/api/v1/series](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). This endpoint is used mostly by Grafana for auto-completion of metric names, label names and label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxSeries` to quite low value in order limit CPU and memory usage.
- `-search.maxTagKeys` limits the number of items, which may be returned from [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names). This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagKeys` to quite low value in order to limit CPU and memory usage.
- `-search.maxTagValues` limits the number of items, which may be returned from [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values). This endpoint is used mostly by Grafana for auto-completion of label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagValues` to quite low value in order to limit CPU and memory usage.
@@ -1505,37 +1406,22 @@ with the enabled de-duplication. See [this section](#deduplication) for details.
## Deduplication
VictoriaMetrics leaves a single [raw sample](https://docs.victoriametrics.com/keyConcepts.html#raw-samples)
with the biggest [timestamp](https://en.wikipedia.org/wiki/Unix_time) for each [time series](https://docs.victoriametrics.com/keyConcepts.html#time-series)
per each `-dedup.minScrapeInterval` discrete interval if `-dedup.minScrapeInterval` is set to positive duration.
For example, `-dedup.minScrapeInterval=60s` would leave a single raw sample with the biggest timestamp per each discrete
`60s` interval.
VictoriaMetrics leaves a single raw sample with the biggest timestamp per each `-dedup.minScrapeInterval` discrete interval
if `-dedup.minScrapeInterval` is set to positive duration. For example, `-dedup.minScrapeInterval=60s` would leave a single
raw sample with the biggest timestamp per each discrete 60s interval.
This aligns with the [staleness rules in Prometheus](https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness).
If multiple raw samples have **the same timestamp** on the given `-dedup.minScrapeInterval` discrete interval,
then the sample with **the biggest value** is kept.
If multiple raw samples have the same biggest timestamp on the given `-dedup.minScrapeInterval` discrete interval, then the sample with the biggest value is left.
Please note, [labels](https://docs.victoriametrics.com/keyConcepts.html#labels) of raw samples should be identical
in order to be deduplicated. For example, this is why [HA pair of vmagents](https://docs.victoriametrics.com/vmagent.html#high-availability)
needs to be identically configured.
The `-dedup.minScrapeInterval=D` is equivalent to `-downsampling.period=0s:D` if [downsampling](#downsampling) is enabled. So it is safe to use deduplication and downsampling simultaneously.
The `-dedup.minScrapeInterval=D` is equivalent to `-downsampling.period=0s:D` if [downsampling](#downsampling) is enabled.
So it is safe to use deduplication and downsampling simultaneously.
The recommended value for `-dedup.minScrapeInterval` must equal to `scrape_interval` config from Prometheus configs. It is recommended to have a single `scrape_interval` across all the scrape targets. See [this article](https://www.robustperception.io/keep-it-simple-scrape_interval-id) for details.
The recommended value for `-dedup.minScrapeInterval` must equal to `scrape_interval` config from Prometheus configs.
It is recommended to have a single `scrape_interval` across all the scrape targets.
See [this article](https://www.robustperception.io/keep-it-simple-scrape_interval-id) for details.
The de-duplication reduces disk space usage if multiple identically configured [vmagent](https://docs.victoriametrics.com/vmagent.html) or Prometheus instances in HA pair
write data to the same VictoriaMetrics instance. These vmagent or Prometheus instances must have identical
`external_labels` section in their configs, so they write data to the same time series. See also [how to set up multiple vmagent instances for scraping the same targets](https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets).
The de-duplication reduces disk space usage if multiple **identically configured** [vmagent](https://docs.victoriametrics.com/vmagent.html)
or Prometheus instances in HA pair write data to the same VictoriaMetrics instance.
These vmagent or Prometheus instances must have **identical** `external_labels` section in their configs,
so they write data to the same time series.
See also [how to set up multiple vmagent instances for scraping the same targets](https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets).
It is recommended passing different `-promscrape.cluster.name` values to each distinct HA pair of `vmagent` instances,
so the de-duplication consistently leaves samples for one `vmagent` instance and removes duplicate samples
from other `vmagent` instances.
See [these docs](https://docs.victoriametrics.com/vmagent.html#high-availability) for details.
It is recommended passing different `-promscrape.cluster.name` values to HA pairs of `vmagent` instances, so the de-duplication consistently leaves samples for one `vmagent` instance and removes duplicate samples from other `vmagent` instances. See [these docs](https://docs.victoriametrics.com/vmagent.html#high-availability) for details.
## Storage
@@ -1547,14 +1433,12 @@ can be configured with the `-inmemoryDataFlushInterval` command-line flag (note
In-memory parts are persisted to disk into `part` directories under the `<-storageDataPath>/data/small/YYYY_MM/` folder,
where `YYYY_MM` is the month partition for the stored data. For example, `2022_11` is the partition for `parts`
with [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) from `November 2022`.
Each partition directory contains `parts.json` file with the actual list of parts in the partition.
Every `part` directory contains `metadata.json` file with the following fields:
The `part` directory has the following name pattern: `rowsCount_blocksCount_minTimestamp_maxTimestamp`, where:
- `RowsCount` - the number of [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) stored in the part
- `BlocksCount` - the number of blocks stored in the part (see details about blocks below)
- `MinTimestamp` and `MaxTimestamp` - minimum and maximum timestamps across raw samples stored in the part
- `MinDedupInterval` - the [deduplication interval](#deduplication) applied to the given part.
- `rowsCount` - the number of [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples) stored in the part
- `blocksCount` - the number of blocks stored in the part (see details about blocks below)
- `minTimestamp` and `maxTimestamp` - minimum and maximum timestamps across raw samples stored in the part
Each `part` consists of `blocks` sorted by internal time series id (aka `TSID`).
Each `block` contains up to 8K [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples),
@@ -1576,20 +1460,15 @@ for fast block lookups, which belong to the given `TSID` and cover the given tim
and [freeing up disk space for the deleted time series](#how-to-delete-time-series) are performed during the merge
Newly added `parts` either successfully appear in the storage or fail to appear.
The newly added `part` is atomically registered in the `parts.json` file under the corresponding partition
after it is fully written and [fsynced](https://man7.org/linux/man-pages/man2/fsync.2.html) to the storage.
Thanks to this algorithm, storage never contains partially created parts, even if hardware power off
occurs in the middle of writing the `part` to disk - such incompletely written `parts`
The newly added `parts` are being created in a temporary directory under `<-storageDataPath>/data/{small,big}/YYYY_MM/tmp` folder.
When the newly added `part` is fully written and [fsynced](https://man7.org/linux/man-pages/man2/fsync.2.html)
to a temporary directory, then it is atomically moved to the storage directory.
Thanks to this alogrithm, storage never contains partially created parts, even if hardware power off
occurrs in the middle of writing the `part` to disk - such incompletely written `parts`
are automatically deleted on the next VictoriaMetrics start.
The same applies to merge process — `parts` are either fully merged into a new `part` or fail to merge,
leaving the source `parts` untouched. However, due to hardware issues data on disk may be corrupted regardless of
VictoriaMetrics process. VictoriaMetrics can detect corruption during decompressing, decoding or sanity checking
of the data blocks. But **it cannot fix the corrupted data**. Data parts that fail to load on startup need to be deleted
or restored from backups. This is why it is recommended performing
[regular backups](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#backups).
VictoriaMetrics doesn't use checksums for stored data blocks. See why [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011).
leaving the source `parts` untouched.
VictoriaMetrics doesn't merge parts if their summary size exceeds free disk space.
This prevents from potential out of disk space errors during merge.
@@ -1608,22 +1487,20 @@ See also [how to work with snapshots](#how-to-work-with-snapshots).
## Retention
Retention is configured with the `-retentionPeriod` command-line flag, which takes a number followed by a time unit
character - `h(ours)`, `d(ays)`, `w(eeks)`, `y(ears)`. If the time unit is not specified, a month is assumed.
For instance, `-retentionPeriod=3` means that the data will be stored for 3 months and then deleted.
The default retention period is one month. The **minimum retention** period is 24h or 1d.
Retention is configured with the `-retentionPeriod` command-line flag, which takes a number followed by a time unit character - `h(ours)`, `d(ays)`, `w(eeks)`, `y(ears)`. If the time unit is not specified, a month is assumed. For instance, `-retentionPeriod=3` means that the data will be stored for 3 months and then deleted. The default retention period is one month.
Data is split in per-month partitions inside `<-storageDataPath>/data/{small,big}` folders.
**Data partitions** outside the configured retention are deleted **on the first day of the new month**.
Each partition consists of one or more **data parts**. Data parts outside the configured retention
are **eventually deleted** during [background merge](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
The time range covered by data part is **not limited by retention period unit**. One data part can cover hours or days of
data. Hence, a data part can be deleted only **when fully outside the configured retention**.
See more about partitions and parts [here](#storage).
Data partitions outside the configured retention are deleted on the first day of the new month.
Each partition consists of one or more data parts with the following name pattern `rowsCount_blocksCount_minTimestamp_maxTimestamp`.
Data parts outside of the configured retention are eventually deleted during
[background merge](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
The maximum disk space usage for a given `-retentionPeriod` is going to be (`-retentionPeriod` + 1) months.
For example, if `-retentionPeriod` is set to 1, data for January is deleted on March 1st.
Please note, the time range covered by data part is not limited by retention period unit. Hence, data part may contain data
for multiple days and will be deleted only when fully outside of the configured retention.
It is safe to extend `-retentionPeriod` on existing data. If `-retentionPeriod` is set to a lower
value than before, then data outside the configured period will be eventually deleted.
@@ -1666,7 +1543,7 @@ For example, the following config sets 3 days retention for time series with `te
Important notes:
- The data outside the configured retention isn't deleted instantly - it is deleted eventually during [background merges](https://docs.victoriametrics.com/#storage).
- The data outside of the configured retention isn't deleted instantly - it is deleted eventually during [background merges](https://docs.victoriametrics.com/#storage).
- The `-retentionFilter` doesn't remove old data from `indexdb` (aka inverted index) until the configured [-retentionPeriod](#retention).
So the `indexdb` size can grow big under [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)
even for small retentions configured via `-retentionFilter`.
@@ -1686,12 +1563,7 @@ Retention filters can be evaluated for free by downloading and using enterprise
* `-downsampling.period=30d:5m,180d:1h` instructs VictoriaMetrics to deduplicate samples older than 30 days with 5 minutes interval and to deduplicate samples older than 180 days with 1 hour interval.
Downsampling is applied independently per each time series. It can reduce disk space usage and improve query performance if it is applied to time series with big number of samples per each series. The downsampling doesn't improve query performance if the database contains big number of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)), since downsampling doesn't reduce the number of time series. So the majority of time is spent on searching for the matching time series.
It is possible to use [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) in vmagent or recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to [reduce the number of time series](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
Downsampling happens during [background merges](https://docs.victoriametrics.com/#storage)
and can't be performed if there is not enough of free disk space or if vmstorage
is in [read-only mode](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#readonly-mode).
Downsampling is applied independently per each time series. It can reduce disk space usage and improve query performance if it is applied to time series with big number of samples per each series. The downsampling doesn't improve query performance if the database contains big number of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)), since downsampling doesn't reduce the number of time series. So the majority of time is spent on searching for the matching time series. It is possible to use recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to reduce the number of time series. See [these docs](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
The downsampling can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases).
@@ -1782,7 +1654,7 @@ created by community.
Graphs on the dashboards contain useful hints - hover the `i` icon in the top left corner of each graph to read it.
We recommend setting up [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts)
We recommend setting up [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml)
via [vmalert](https://docs.victoriametrics.com/vmalert.html) or via Prometheus.
VictoriaMetrics exposes currently running queries and their execution times at `/api/v1/status/active_queries` page.
@@ -1802,11 +1674,6 @@ VictoriaMetrics returns TSDB stats at `/api/v1/status/tsdb` page in the way simi
* `match[]=SELECTOR` where `SELECTOR` is an arbitrary [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) for series to take into account during stats calculation. By default all the series are taken into account.
* `extra_label=LABEL=VALUE`. See [these docs](#prometheus-querying-api-enhancements) for more details.
In [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) each vmstorage tracks the stored time series individually.
vmselect requests stats via [/api/v1/status/tsdb](#tsdb-stats) API from each vmstorage node and merges the results by summing per-series stats.
This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes
due to [replication](#replication) or [rerouting](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html?highlight=re-routes#cluster-availability).
VictoriaMetrics provides an UI on top of `/api/v1/status/tsdb` - see [cardinality explorer docs](#cardinality-explorer).
## Query tracing
@@ -2018,14 +1885,7 @@ are added to all the metrics before sending them to the remote storage:
## Cache removal
VictoriaMetrics uses various internal caches. These caches are stored to `<-storageDataPath>/cache` directory during graceful shutdown
(e.g. when VictoriaMetrics is stopped by sending `SIGINT` signal). The caches are read on the next VictoriaMetrics startup.
Sometimes it is needed to remove such caches on the next startup. This can be done in the following ways:
- By manually removing the `<-storageDataPath>/cache` directory when VictoriaMetrics is stopped.
- By placing `reset_cache_on_startup` file inside the `<-storageDataPath>/cache` directory before the restart of VictoriaMetrics.
In this case VictoriaMetrics will automatically remove all the caches on the next start.
See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447) for details.
VictoriaMetrics uses various internal caches. These caches are stored to `<-storageDataPath>/cache` directory during graceful shutdown (e.g. when VictoriaMetrics is stopped by sending `SIGINT` signal). The caches are read on the next VictoriaMetrics startup. Sometimes it is needed to remove such caches on the next startup. This can be performed by placing `reset_cache_on_startup` file inside the `<-storageDataPath>/cache` directory before the restart of VictoriaMetrics. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447) for details.
## Cache tuning
@@ -2060,16 +1920,16 @@ The simplest way to migrate data from one single-node (source) to another (desti
to another do the following:
1. Stop the VictoriaMetrics (source) with `kill -INT`;
1. Copy (via [rsync](https://en.wikipedia.org/wiki/Rsync) or any other tool) the entire folder specified
2. Copy (via [rsync](https://en.wikipedia.org/wiki/Rsync) or any other tool) the entire folder specified
via `-storageDataPath` from the source node to the empty folder at the destination node.
1. Once copy is done, stop the VictoriaMetrics (destination) with `kill -INT` and verify that
3. Once copy is done, stop the VictoriaMetrics (destination) with `kill -INT` and verify that
its `-storageDataPath` points to the copied folder from p.2;
1. Start the VictoriaMetrics (destination). The copied data should be now available.
4. Start the VictoriaMetrics (destination). The copied data should be now available.
Things to consider when copying data:
1. Data formats between single-node and vmstorage node aren't compatible and can't be copied.
1. Copying data folder means complete replacement of the previous data on destination VictoriaMetrics.
2. Copying data folder means complete replacement of the previous data on destination VictoriaMetrics.
For more complex scenarios like single-to-cluster, cluster-to-single, re-sharding or migrating only a fraction
of data - see [vmctl. Migrating data from VictoriaMetrics](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-victoriametrics).
@@ -2095,7 +1955,8 @@ It is recommended disabling query cache with `-search.disableCache` command-line
historical data with timestamps from the past, since the cache assumes that the data is written with
the current timestamps. Query cache can be enabled after the backfilling is complete.
An alternative solution is to query [/internal/resetRollupResultCache](https://docs.victoriametrics.com/url-examples.html#internalresetRollupResultCache) handler after the backfilling is complete. This will reset the query cache, which could contain incomplete data cached during the backfilling.
An alternative solution is to query `/internal/resetRollupResultCache` url after backfilling is complete. This will reset
the query cache, which could contain incomplete data cached during the backfilling.
Yet another solution is to increase `-search.cacheTimestampOffset` flag value in order to disable caching
for data with timestamps close to the current time. Single-node VictoriaMetrics automatically resets response
@@ -2129,9 +1990,9 @@ Enterprise binaries can be downloaded and evaluated for free from [the releases
A single-node VictoriaMetrics is capable of proxying requests to [vmalert](https://docs.victoriametrics.com/vmalert.html)
when `-vmalert.proxyURL` flag is set. Use this feature for the following cases:
* for proxying requests from [Grafana Alerting UI](https://grafana.com/docs/grafana/latest/alerting/);
* for accessing vmalerts UI through single-node VictoriaMetrics Web interface.
* for accessing vmalert's UI through single-node VictoriaMetrics Web interface.
For accessing vmalerts UI through single-node VictoriaMetrics configure `-vmalert.proxyURL` flag and visit
For accessing vmalert's UI through single-node VictoriaMetrics configure `-vmalert.proxyURL` flag and visit
`http://<victoriametrics-addr>:8428/vmalert/` link.
## Benchmarks
@@ -2176,10 +2037,17 @@ It is safe sharing the collected profiles from security point of view, since the
## Integrations
* [go-graphite/carbonapi](https://github.com/go-graphite/carbonapi) can use VictoriaMetrics as time series backend.
See [this example](https://github.com/go-graphite/carbonapi/blob/main/cmd/carbonapi/carbonapi.example.victoriametrics.yaml).
* [Helm charts for single-node and cluster versions of VictoriaMetrics](https://github.com/VictoriaMetrics/helm-charts).
* [Kubernetes operator for VictoriaMetrics](https://github.com/VictoriaMetrics/operator).
* [netdata](https://github.com/netdata/netdata) can push data into VictoriaMetrics via `Prometheus remote_write API`.
See [these docs](https://github.com/netdata/netdata#integrations).
* [go-graphite/carbonapi](https://github.com/go-graphite/carbonapi) can use VictoriaMetrics as time series backend.
See [this example](https://github.com/go-graphite/carbonapi/blob/main/cmd/carbonapi/carbonapi.example.victoriametrics.yaml).
* [Ansible role for installing cluster VictoriaMetrics (by VictoriaMetrics)](https://github.com/VictoriaMetrics/ansible-playbooks).
* [Ansible role for installing cluster VictoriaMetrics (by community)](https://github.com/Slapper/ansible-victoriametrics-cluster-role).
* [Ansible role for installing single-node VictoriaMetrics (by community)](https://github.com/dreamteam-gg/ansible-victoriametrics-role).
* [Snap package for VictoriaMetrics](https://snapcraft.io/victoriametrics).
* [vmalert-cli](https://github.com/aorfanos/vmalert-cli) - a CLI application for managing [vmalert](https://docs.victoriametrics.com/vmalert.html).
## Third-party contributions
@@ -2202,6 +2070,7 @@ Feel free asking any questions regarding VictoriaMetrics:
* [reddit](https://www.reddit.com/r/VictoriaMetrics/)
* [telegram-en](https://t.me/VictoriaMetrics_en)
* [telegram-ru](https://t.me/VictoriaMetrics_ru1)
* [articles and talks about VictoriaMetrics in Russian](https://github.com/denisgolius/victoriametrics-ru-links)
* [google groups](https://groups.google.com/forum/#!forum/victorametrics-users)
If you like VictoriaMetrics and want to contribute, then we need the following:
@@ -2278,18 +2147,18 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-deleteAuthKey string
authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries
-denyQueriesOutsideRetention
Whether to deny queries outside the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee
Whether to deny queries outside of the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee
-denyQueryTracing
Whether to disable the ability to trace queries. See https://docs.victoriametrics.com/#query-tracing
-downsampling.period array
Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. When setting multiple downsampling periods, it is necessary for the periods to be multiples of each other. See https://docs.victoriametrics.com/#downsampling for details. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-dryRun
Whether to check config files without running VictoriaMetrics. The following config files are checked: -promscrape.config, -relabelConfig and -streamAggr.config. Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag
Whether to check only -promscrape.config and then exit. Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@@ -2303,7 +2172,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-forceMergeAuthKey string
authKey, which must be passed in query string to /internal/force_merge pages
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default, mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-graphiteListenAddr string
TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty. See also -graphiteListenAddr.useProxyProtocol
-graphiteListenAddr.useProxyProtocol
@@ -2313,7 +2182,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default, compression is enabled to save network bandwidth
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
@@ -2323,13 +2192,13 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections. See also -httpListenAddr.useProxyProtocol (default ":8428")
-httpListenAddr.useProxyProtocol
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
-import.maxLineLen size
The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 104857600)
@@ -2350,19 +2219,15 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-influxSkipMeasurement
Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator'
-influxSkipSingleField
Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metric name if InfluxDB line contains only a single field
Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field
-influxTrimTimestamp duration
Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-inmemoryDataFlushInterval duration
The interval for guaranteed saving of in-memory data to disk. The saved data survives unclean shutdowns such as OOM crash, hardware reset, SIGKILL, etc. Bigger intervals may help increase the lifetime of flash storage with limited write cycles (e.g. Raspberry PI). Smaller intervals increase disk IO load. Minimum supported value is 1s (default 5s)
The interval for guaranteed saving of in-memory data to disk. The saved data survives unclean shutdown such as OOM crash, hardware reset, SIGKILL, etc. Bigger intervals may help increasing lifetime of flash storage with limited write cycles (e.g. Raspberry PI). Smaller intervals increase disk IO load. Minimum supported value is 1s (default 5s)
-insert.maxQueueDuration duration
The maximum duration to wait in the queue when -maxConcurrentInserts concurrent insert requests are executed (default 1m0s)
-internStringCacheExpireDuration duration
The expiry duration for caches for interned strings. See https://en.wikipedia.org/wiki/String_interning . See also -internStringMaxLen and -internStringDisableCache (default 6m0s)
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
The maximum length for strings to intern. Lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning (default 300)
-logNewSeries
Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics
-loggerDisableTimestamps
@@ -2382,7 +2247,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxConcurrentInserts int
The maximum number of concurrent insert requests. The default value should work for most cases, since it minimizes memory usage. The default value can be increased when clients send data over slow networks. See also -insert.maxQueueDuration (default 8)
The maximum number of concurrent insert requests. Default value should work for most cases, since it minimizes the memory usage. The default value can be increased when clients send data over slow networks. See also -insert.maxQueueDuration (default 8)
-maxInsertRequestSize size
The maximum size in bytes of a single Prometheus remote_write API request
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 33554432)
@@ -2391,18 +2256,18 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-maxLabelsPerTimeseries int
The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented (default 30)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-opentsdbHTTPListenAddr string
TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty. See also -opentsdbHTTPListenAddr.useProxyProtocol
TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty. See also -opentsdbHTTPListenAddr.useProxyProtocol
-opentsdbHTTPListenAddr.useProxyProtocol
Whether to use proxy protocol for connections accepted at -opentsdbHTTPListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
-opentsdbListenAddr string
TCP and UDP address to listen for OpenTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty. See also -opentsdbListenAddr.useProxyProtocol
TCP and UDP address to listen for OpentTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty. See also -opentsdbListenAddr.useProxyProtocol
-opentsdbListenAddr.useProxyProtocol
Whether to use proxy protocol for connections accepted at -opentsdbListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
-opentsdbTrimTimestamp duration
@@ -2421,9 +2286,9 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-promscrape.azureSDCheckInterval duration
Interval for checking for changes in Azure. This works only if azure_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#azure_sd_configs for details (default 1m0s)
-promscrape.cluster.memberNum string
The number of number in the cluster of scrapers. It must be a unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster. Can be specified as pod name of Kubernetes StatefulSet - pod-name-Num, where Num is a numeric part of pod name (default "0")
The number of number in the cluster of scrapers. It must be an unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster. Can be specified as pod name of Kubernetes StatefulSet - pod-name-Num, where Num is a numeric part of pod name (default "0")
-promscrape.cluster.membersCount int
The number of members in a cluster of scrapers. Each member must have a unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default, cluster scraping is disabled, i.e. a single scraper scrapes all the targets
The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
-promscrape.cluster.name string
Optional name of the cluster. If multiple vmagent clusters scrape the same targets, then each cluster must have unique name in order to properly de-duplicate samples received from these clusters. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679
-promscrape.cluster.replicationFactor int
@@ -2435,19 +2300,17 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-promscrape.config.strictParse
Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true)
-promscrape.configCheckInterval duration
Interval for checking for changes in '-promscrape.config' file. By default, the checking is disabled. Send SIGHUP signal in order to force config check for changes
Interval for checking for changes in '-promscrape.config' file. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes
-promscrape.consul.waitTime duration
Wait time used by Consul service discovery. Default value is used if not set
-promscrape.consulSDCheckInterval duration
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#consul_sd_configs for details (default 30s)
-promscrape.consulagentSDCheckInterval duration
Interval for checking for changes in Consul Agent. This works only if consulagent_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#consulagent_sd_configs for details (default 30s)
-promscrape.digitaloceanSDCheckInterval duration
Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#digitalocean_sd_configs for details (default 1m0s)
-promscrape.disableCompression
Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control
Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control
-promscrape.disableKeepAlive
Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets has no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets
Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets has no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets
-promscrape.discovery.concurrency int
The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100)
-promscrape.discovery.concurrentWaitTime duration
@@ -2474,8 +2337,6 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
How frequently to reload the full state from Kubernetes API server (default 30m0s)
-promscrape.kubernetesSDCheckInterval duration
Interval for checking for changes in Kubernetes API server. This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#kubernetes_sd_configs for details (default 30s)
-promscrape.kumaSDCheckInterval duration
Interval for checking for changes in kuma service discovery. This works only if kuma_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#kuma_sd_configs for details (default 30s)
-promscrape.maxDroppedTargets int
The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need investigating labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000)
-promscrape.maxResponseHeadersSize size
@@ -2498,7 +2359,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-promscrape.seriesLimitPerTarget int
Optional limit on the number of unique time series a single scrape target can expose. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info
-promscrape.streamParse
Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is possible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control
Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is posible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control
-promscrape.suppressDuplicateScrapeTargetErrors
Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details
-promscrape.suppressScrapeErrors
@@ -2513,7 +2374,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-pushmetrics.interval duration
Interval for pushing metrics to -pushmetrics.url (default 10s)
-pushmetrics.url array
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default, metrics exposed at /metrics page aren't pushed to any remote storage
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default metrics exposed at /metrics page aren't pushed to any remote storage
Supports an array of values separated by comma or specified via multiple flags.
-relabelConfig string
Optional path to a file with relabeling rules, which are applied to all the ingested metrics. The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal
@@ -2521,7 +2382,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Retention filter in the format 'filter:retention'. For example, '{env="dev"}:3d' configures the retention for time series with env="dev" label to 3 days. See https://docs.victoriametrics.com/#retention-filters for details. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-retentionPeriod value
Data with timestamps outside the retentionPeriod is automatically deleted. The minimum retentionPeriod is 24h or 1d. See also -retentionFilter
Data with timestamps outside the retentionPeriod is automatically deleted. See also -retentionFilter
The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1)
-retentionTimezoneOffset duration
The offset for performing indexdb rotation. If set to 0, then the indexdb rotation is performed at 4am UTC time per each -retentionPeriod. If set to 2h, then the indexdb rotation is performed at 4am EET time (the timezone with +2h offset)
@@ -2532,16 +2393,13 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-search.disableCache
Whether to disable response caching. This may be useful during data backfilling
-search.graphiteMaxPointsPerSeries int
The maximum number of points per series Graphite render API can return (default 1000000)
The maximum number of points per series Graphite render API can return. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html (default 1000000)
-search.graphiteStorageStep duration
The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overridden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API (default 10s)
The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overridden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html (default 10s)
-search.latencyOffset duration
The time when data points become visible in query results after the collection. It can be overridden on per-query basis via latency_offset arg. Too small value can result in incomplete last points for query results (default 30s)
-search.logQueryMemoryUsage size
Log queries, which require more memory than specified by this flag. This may help detecting and optimizing heavy queries. Query logging is disabled by default. See also -search.logSlowQueryDuration and -search.maxMemoryPerQuery
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-search.logSlowQueryDuration duration
Log queries with execution time exceeding this value. Zero disables slow query logging. See also -search.logQueryMemoryUsage (default 5s)
Log queries with execution time exceeding this value. Zero disables slow query logging (default 5s)
-search.maxConcurrentRequests int
The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores, while many concurrently executed requests may require high amounts of memory. See also -search.maxQueueDuration and -search.maxMemoryPerQuery (default 8)
-search.maxExportDuration duration
@@ -2551,15 +2409,11 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-search.maxFederateSeries int
The maximum number of time series, which can be returned from /federate. This option allows limiting memory usage (default 1000000)
-search.maxGraphiteSeries int
The maximum number of time series, which can be scanned during queries to Graphite Render API. See https://docs.victoriametrics.com/#graphite-render-api-usage (default 300000)
-search.maxGraphiteTagKeys int
The maximum number of tag keys returned from Graphite API, which returns tags. See https://docs.victoriametrics.com/#graphite-tags-api-usage (default 100000)
-search.maxGraphiteTagValues int
The maximum number of tag values returned from Graphite API, which returns tag values. See https://docs.victoriametrics.com/#graphite-tags-api-usage (default 100000)
The maximum number of time series, which can be scanned during queries to Graphite Render API. See https://docs.victoriametrics.com/#graphite-render-api-usage . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html (default 300000)
-search.maxLookback duration
Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaning due to historical reasons
Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaining due to historical reasons
-search.maxMemoryPerQuery size
The maximum amounts of memory a single query may consume. Queries requiring more memory are rejected. The total memory limit for concurrently executed queries can be estimated as -search.maxMemoryPerQuery multiplied by -search.maxConcurrentRequests . See also -search.logQueryMemoryUsage
The maximum amounts of memory a single query may consume. Queries requiring more memory are rejected. The total memory limit for concurrently executed queries can be estimated as -search.maxMemoryPerQuery multiplied by -search.maxConcurrentRequests
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-search.maxPointsPerTimeseries int
The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database. The main purpose of this option is to limit the number of per-series points returned to graphing UI such as VMUI or Grafana. There is no sense in setting this limit to values bigger than the horizontal resolution of the graph (default 30000)
@@ -2578,10 +2432,8 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
The maximum number of raw samples a single query can scan per each time series. This option allows limiting memory usage (default 30000000)
-search.maxSeries int
The maximum number of time series, which can be returned from /api/v1/series. This option allows limiting memory usage (default 30000)
-search.maxSeriesPerAggrFunc int
The maximum number of time series an aggregate MetricsQL function can generate (default 1000000)
-search.maxStalenessInterval duration
The maximum interval for staleness calculations. By default, it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.setLookbackToStep' flag
The maximum interval for staleness calculations. By default it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.setLookbackToStep' flag
-search.maxStatusRequestDuration duration
The maximum duration for /api/v1/status/* requests (default 5m0s)
-search.maxStepForPointsAdjustment duration
@@ -2620,8 +2472,6 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
The maximum number of workers for background merges. See https://docs.victoriametrics.com/#storage . It isn't recommended tuning this flag in general case, since this may lead to uncontrolled increase in the number of parts and increased CPU usage during queries
-snapshotAuthKey string
authKey, which must be passed in query string to /snapshot* pages
-snapshotCreateTimeout duration
The timeout for creating new snapshot. If set, make sure that timeout is lower than backup period
-snapshotsMaxAge value
Automatically delete snapshots older than -snapshotsMaxAge if it is set to non-zero duration. Make sure that backup process has enough time to finish the backup before the corresponding snapshot is automatically deleted
The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 0)
@@ -2649,13 +2499,11 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-storageDataPath string
Path to storage data (default "victoria-metrics-data")
-streamAggr.config string
Optional path to file with stream aggregation config. See https://docs.victoriametrics.com/stream-aggregation.html . See also -streamAggr.keepInput, -streamAggr.dropInput and -streamAggr.dedupInterval
Optional path to file with stream aggregation config. See https://docs.victoriametrics.com/stream-aggregation.html . See also -remoteWrite.streamAggr.keepInput and -streamAggr.dedupInterval
-streamAggr.dedupInterval duration
Input samples are de-duplicated with this interval before being aggregated. Only the last sample per each time series per each interval is aggregated if the interval is greater than zero
-streamAggr.dropInput
Whether to drop all the input samples after the aggregation with -streamAggr.config. By default, only aggregated samples are dropped, while the remaining samples are stored in the database. See also -streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html
-streamAggr.keepInput
Whether to keep all the input samples after the aggregation with -streamAggr.config. By default, only aggregated samples are dropped, while the remaining samples are stored in the database. See also -streamAggr.dropInput and https://docs.victoriametrics.com/stream-aggregation.html
Whether to keep input samples after the aggregation with -streamAggr.config. By default the input is dropped after the aggregation, so only the aggregate data is stored. See https://docs.victoriametrics.com/stream-aggregation.html
-tls
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string

View File

@@ -4,10 +4,10 @@
| Version | Supported |
|---------|--------------------|
| [latest release](https://docs.victoriametrics.com/CHANGELOG.html) | :white_check_mark: |
| v1.87.x LTS release | :white_check_mark: |
| v1.79.x LTS release | :white_check_mark: |
| other releases | :x: |
| 1.81.x | :white_check_mark: |
| 1.80.x | :x: |
| 1.79.x | :white_check_mark: |
| < 1.78 | :x: |
## Reporting a Vulnerability

View File

@@ -1,103 +0,0 @@
# All these commands must run from repository root.
victoria-logs:
APP_NAME=victoria-logs $(MAKE) app-local
victoria-logs-race:
APP_NAME=victoria-logs RACE=-race $(MAKE) app-local
victoria-logs-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker
victoria-logs-pure-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-pure
victoria-logs-linux-amd64-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-linux-amd64
victoria-logs-linux-arm-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-linux-arm
victoria-logs-linux-arm64-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-linux-arm64
victoria-logs-linux-ppc64le-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-linux-ppc64le
victoria-logs-linux-386-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-linux-386
victoria-logs-darwin-amd64-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-darwin-amd64
victoria-logs-darwin-arm64-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-darwin-arm64
victoria-logs-freebsd-amd64-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-freebsd-amd64
victoria-logs-openbsd-amd64-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-openbsd-amd64
victoria-logs-windows-amd64-prod:
APP_NAME=victoria-logs $(MAKE) app-via-docker-windows-amd64
package-victoria-logs:
APP_NAME=victoria-logs $(MAKE) package-via-docker
package-victoria-logs-pure:
APP_NAME=victoria-logs $(MAKE) package-via-docker-pure
package-victoria-logs-amd64:
APP_NAME=victoria-logs $(MAKE) package-via-docker-amd64
package-victoria-logs-arm:
APP_NAME=victoria-logs $(MAKE) package-via-docker-arm
package-victoria-logs-arm64:
APP_NAME=victoria-logs $(MAKE) package-via-docker-arm64
package-victoria-logs-ppc64le:
APP_NAME=victoria-logs $(MAKE) package-via-docker-ppc64le
package-victoria-logs-386:
APP_NAME=victoria-logs $(MAKE) package-via-docker-386
publish-victoria-logs:
APP_NAME=victoria-logs $(MAKE) publish-via-docker
victoria-logs-linux-amd64:
APP_NAME=victoria-logs CGO_ENABLED=1 GOOS=linux GOARCH=amd64 $(MAKE) app-local-goos-goarch
victoria-logs-linux-arm:
APP_NAME=victoria-logs CGO_ENABLED=0 GOOS=linux GOARCH=arm $(MAKE) app-local-goos-goarch
victoria-logs-linux-arm64:
APP_NAME=victoria-logs CGO_ENABLED=0 GOOS=linux GOARCH=arm64 $(MAKE) app-local-goos-goarch
victoria-logs-linux-ppc64le:
APP_NAME=victoria-logs CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
victoria-logs-linux-s390x:
APP_NAME=victoria-logs CGO_ENABLED=0 GOOS=linux GOARCH=s390x $(MAKE) app-local-goos-goarch
victoria-logs-linux-386:
APP_NAME=victoria-logs CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch
victoria-logs-darwin-amd64:
APP_NAME=victoria-logs CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 $(MAKE) app-local-goos-goarch
victoria-logs-darwin-arm64:
APP_NAME=victoria-logs CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 $(MAKE) app-local-goos-goarch
victoria-logs-freebsd-amd64:
APP_NAME=victoria-logs CGO_ENABLED=0 GOOS=freebsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
victoria-logs-openbsd-amd64:
APP_NAME=victoria-logs CGO_ENABLED=0 GOOS=openbsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
victoria-logs-windows-amd64:
GOARCH=amd64 APP_NAME=victoria-logs $(MAKE) app-local-windows-goarch
victoria-logs-pure:
APP_NAME=victoria-logs $(MAKE) app-local-pure

View File

@@ -1,8 +0,0 @@
ARG base_image
FROM $base_image
EXPOSE 8428
ENTRYPOINT ["/victoria-logs-prod"]
ARG src_binary
COPY $src_binary ./victoria-logs-prod

View File

@@ -1,103 +0,0 @@
package main
import (
"flag"
"fmt"
"net/http"
"os"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/pushmetrics"
)
var (
httpListenAddr = flag.String("httpListenAddr", ":9428", "TCP address to listen for http connections. See also -httpListenAddr.useProxyProtocol")
useProxyProtocol = flag.Bool("httpListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted at -httpListenAddr . "+
"See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . "+
"With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing")
gogc = flag.Int("gogc", 100, "GOGC to use. See https://tip.golang.org/doc/gc-guide")
)
func main() {
// Write flags and help message to stdout, since it is easier to grep or pipe.
flag.CommandLine.SetOutput(os.Stdout)
flag.Usage = usage
envflag.Parse()
cgroup.SetGOGC(*gogc)
buildinfo.Init()
logger.Init()
pushmetrics.Init()
logger.Infof("starting VictoriaLogs at %q...", *httpListenAddr)
startTime := time.Now()
vlstorage.Init()
vlselect.Init()
vlinsert.Init()
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, requestHandler)
logger.Infof("started VictoriaLogs in %.3f seconds; see https://docs.victoriametrics.com/VictoriaLogs/", time.Since(startTime).Seconds())
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
startTime = time.Now()
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
vlinsert.Stop()
vlselect.Stop()
vlstorage.Stop()
fs.MustStopDirRemover()
logger.Infof("the VictoriaLogs has been stopped in %.3f seconds", time.Since(startTime).Seconds())
}
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.URL.Path == "/" {
if r.Method != http.MethodGet {
return false
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, "<h2>Single-node VictoriaLogs</h2></br>")
fmt.Fprintf(w, "See docs at <a href='https://docs.victoriametrics.com/VictoriaLogs/'>https://docs.victoriametrics.com/VictoriaLogs/</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
httpserver.WriteAPIHelp(w, [][2]string{
{"select/vmui", "Web UI for VictoriaLogs"},
{"metrics", "available service metrics"},
{"flags", "command-line flags"},
})
return true
}
if vlinsert.RequestHandler(w, r) {
return true
}
if vlselect.RequestHandler(w, r) {
return true
}
return false
}
func usage() {
const s = `
victoria-logs is a log management and analytics service.
See the docs at https://docs.victoriametrics.com/VictoriaLogs/
`
flagutil.Usage(s)
}

View File

@@ -1,12 +0,0 @@
# See https://medium.com/on-docker/use-multi-stage-builds-to-inject-ca-certs-ad1e8f01de1b
ARG certs_image
ARG root_image
FROM $certs_image as certs
RUN apk update && apk upgrade && apk --update --no-cache add ca-certificates
FROM $root_image
COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
EXPOSE 8428
ENTRYPOINT ["/victoria-logs-prod"]
ARG TARGETARCH
COPY victoria-logs-linux-${TARGETARCH}-prod ./victoria-logs-prod

View File

@@ -39,9 +39,6 @@ victoria-metrics-freebsd-amd64-prod:
victoria-metrics-openbsd-amd64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-openbsd-amd64
victoria-metrics-windows-amd64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-windows-amd64
package-victoria-metrics:
APP_NAME=victoria-metrics $(MAKE) package-via-docker
@@ -85,9 +82,6 @@ victoria-metrics-linux-arm64:
victoria-metrics-linux-ppc64le:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
victoria-metrics-linux-s390x:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=linux GOARCH=s390x $(MAKE) app-local-goos-goarch
victoria-metrics-linux-386:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch
@@ -103,9 +97,6 @@ victoria-metrics-freebsd-amd64:
victoria-metrics-openbsd-amd64:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=openbsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
victoria-metrics-windows-amd64:
GOARCH=amd64 APP_NAME=victoria-metrics $(MAKE) app-local-windows-goarch
victoria-metrics-pure:
APP_NAME=victoria-metrics $(MAKE) app-local-pure

View File

@@ -8,8 +8,6 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert"
vminsertcommon "github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
vminsertrelabel "github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
@@ -32,12 +30,11 @@ var (
"With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing")
minScrapeInterval = flag.Duration("dedup.minScrapeInterval", 0, "Leave only the last sample in every time series per each discrete interval "+
"equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling")
dryRun = flag.Bool("dryRun", false, "Whether to check config files without running VictoriaMetrics. The following config files are checked: "+
"-promscrape.config, -relabelConfig and -streamAggr.config. Unknown config entries aren't allowed in -promscrape.config by default. "+
"This can be changed with -promscrape.config.strictParse=false command-line flag")
dryRun = flag.Bool("dryRun", false, "Whether to check only -promscrape.config and then exit. "+
"Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag")
inmemoryDataFlushInterval = flag.Duration("inmemoryDataFlushInterval", 5*time.Second, "The interval for guaranteed saving of in-memory data to disk. "+
"The saved data survives unclean shutdowns such as OOM crash, hardware reset, SIGKILL, etc. "+
"Bigger intervals may help increase the lifetime of flash storage with limited write cycles (e.g. Raspberry PI). "+
"The saved data survives unclean shutdown such as OOM crash, hardware reset, SIGKILL, etc. "+
"Bigger intervals may help increasing lifetime of flash storage with limited write cycles (e.g. Raspberry PI). "+
"Smaller intervals increase disk IO load. Minimum supported value is 1s")
)
@@ -48,7 +45,6 @@ func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
if promscrape.IsDryRun() {
*dryRun = true
@@ -57,13 +53,7 @@ func main() {
if err := promscrape.CheckConfig(); err != nil {
logger.Fatalf("error when checking -promscrape.config: %s", err)
}
if err := vminsertrelabel.CheckRelabelConfig(); err != nil {
logger.Fatalf("error when checking -relabelConfig: %s", err)
}
if err := vminsertcommon.CheckStreamAggrConfig(); err != nil {
logger.Fatalf("error when checking -streamAggr.config: %s", err)
}
logger.Infof("-promscrape.config is ok; exiting with 0 status code")
logger.Infof("-promscrape.config is ok; exitting with 0 status code")
return
}
@@ -74,13 +64,16 @@ func main() {
vmstorage.Init(promql.ResetRollupResultCacheIfNeeded)
vmselect.Init()
vminsert.Init()
startSelfScraper()
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, requestHandler)
logger.Infof("started VictoriaMetrics in %.3f seconds", time.Since(startTime).Seconds())
pushmetrics.Init()
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
pushmetrics.Stop()
stopSelfScraper()
@@ -89,8 +82,8 @@ func main() {
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
vminsert.Stop()
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
vminsert.Stop()
vmstorage.Stop()
vmselect.Stop()
@@ -102,7 +95,7 @@ func main() {
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.URL.Path == "/" {
if r.Method != http.MethodGet {
if r.Method != "GET" {
return false
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")

View File

@@ -1,20 +0,0 @@
{% stripspace %}
{% func BulkResponse(n int, tookMs int64) %}
{
"took":{%dl tookMs %},
"errors":false,
"items":[
{% for i := 0; i < n; i++ %}
{
"create":{
"status":201
}
}
{% if i+1 < n %},{% endif %}
{% endfor %}
]
}
{% endfunc %}
{% endstripspace %}

View File

@@ -1,69 +0,0 @@
// Code generated by qtc from "bulk_response.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vlinsert/elasticsearch/bulk_response.qtpl:3
package elasticsearch
//line app/vlinsert/elasticsearch/bulk_response.qtpl:3
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:3
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:3
func StreamBulkResponse(qw422016 *qt422016.Writer, n int, tookMs int64) {
//line app/vlinsert/elasticsearch/bulk_response.qtpl:3
qw422016.N().S(`{"took":`)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:5
qw422016.N().DL(tookMs)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:5
qw422016.N().S(`,"errors":false,"items":[`)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:8
for i := 0; i < n; i++ {
//line app/vlinsert/elasticsearch/bulk_response.qtpl:8
qw422016.N().S(`{"create":{"status":201}}`)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:14
if i+1 < n {
//line app/vlinsert/elasticsearch/bulk_response.qtpl:14
qw422016.N().S(`,`)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:14
}
//line app/vlinsert/elasticsearch/bulk_response.qtpl:15
}
//line app/vlinsert/elasticsearch/bulk_response.qtpl:15
qw422016.N().S(`]}`)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
}
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
func WriteBulkResponse(qq422016 qtio422016.Writer, n int, tookMs int64) {
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
StreamBulkResponse(qw422016, n, tookMs)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
qt422016.ReleaseWriter(qw422016)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
}
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
func BulkResponse(n int, tookMs int64) string {
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
qb422016 := qt422016.AcquireByteBuffer()
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
WriteBulkResponse(qb422016, n, tookMs)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
qs422016 := string(qb422016.B)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
qt422016.ReleaseByteBuffer(qb422016)
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
return qs422016
//line app/vlinsert/elasticsearch/bulk_response.qtpl:18
}

View File

@@ -1,264 +0,0 @@
package elasticsearch
import (
"bufio"
"errors"
"fmt"
"io"
"math"
"net/http"
"strconv"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/insertutils"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bufferedwriter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logjson"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
// RequestHandler processes Elasticsearch insert requests
func RequestHandler(path string, w http.ResponseWriter, r *http.Request) bool {
w.Header().Add("Content-Type", "application/json")
// This header is needed for Logstash
w.Header().Set("X-Elastic-Product", "Elasticsearch")
if strings.HasPrefix(path, "/_ilm/policy") {
// Return fake response for Elasticsearch ilm request.
fmt.Fprintf(w, `{}`)
return true
}
if strings.HasPrefix(path, "/_index_template") {
// Return fake response for Elasticsearch index template request.
fmt.Fprintf(w, `{}`)
return true
}
if strings.HasPrefix(path, "/_ingest") {
// Return fake response for Elasticsearch ingest pipeline request.
// See: https://www.elastic.co/guide/en/elasticsearch/reference/8.8/put-pipeline-api.html
fmt.Fprintf(w, `{}`)
return true
}
if strings.HasPrefix(path, "/_nodes") {
// Return fake response for Elasticsearch nodes discovery request.
// See: https://www.elastic.co/guide/en/elasticsearch/reference/8.8/cluster.html
fmt.Fprintf(w, `{}`)
return true
}
switch path {
case "/":
switch r.Method {
case http.MethodGet:
// Return fake response for Elasticsearch ping request.
// See the latest available version for Elasticsearch at https://github.com/elastic/elasticsearch/releases
fmt.Fprintf(w, `{
"version": {
"number": "8.8.0"
}
}`)
case http.MethodHead:
// Return empty response for Logstash ping request.
}
return true
case "/_license":
// Return fake response for Elasticsearch license request.
fmt.Fprintf(w, `{
"license": {
"uid": "cbff45e7-c553-41f7-ae4f-9205eabd80xx",
"type": "oss",
"status": "active",
"expiry_date_in_millis" : 4000000000000
}
}`)
return true
case "/_bulk":
startTime := time.Now()
bulkRequestsTotal.Inc()
cp, err := insertutils.GetCommonParams(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
lr := logstorage.GetLogRows(cp.StreamFields, cp.IgnoreFields)
processLogMessage := cp.GetProcessLogMessageFunc(lr)
isGzip := r.Header.Get("Content-Encoding") == "gzip"
n, err := readBulkRequest(r.Body, isGzip, cp.TimeField, cp.MsgField, processLogMessage)
if err != nil {
logger.Warnf("cannot decode log message #%d in /_bulk request: %s", n, err)
return true
}
vlstorage.MustAddRows(lr)
logstorage.PutLogRows(lr)
tookMs := time.Since(startTime).Milliseconds()
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
WriteBulkResponse(bw, n, tookMs)
_ = bw.Flush()
return true
default:
return false
}
}
var (
bulkRequestsTotal = metrics.NewCounter(`vl_http_requests_total{path="/insert/elasticsearch/_bulk"}`)
)
func readBulkRequest(r io.Reader, isGzip bool, timeField, msgField string,
processLogMessage func(timestamp int64, fields []logstorage.Field),
) (int, error) {
// See https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
if isGzip {
zr, err := common.GetGzipReader(r)
if err != nil {
return 0, fmt.Errorf("cannot read gzipped _bulk request: %w", err)
}
defer common.PutGzipReader(zr)
r = zr
}
wcr := writeconcurrencylimiter.GetReader(r)
defer writeconcurrencylimiter.PutReader(wcr)
lb := lineBufferPool.Get()
defer lineBufferPool.Put(lb)
lb.B = bytesutil.ResizeNoCopyNoOverallocate(lb.B, insertutils.MaxLineSizeBytes.IntN())
sc := bufio.NewScanner(wcr)
sc.Buffer(lb.B, len(lb.B))
n := 0
nCheckpoint := 0
for {
ok, err := readBulkLine(sc, timeField, msgField, processLogMessage)
wcr.DecConcurrency()
if err != nil || !ok {
rowsIngestedTotal.Add(n - nCheckpoint)
return n, err
}
n++
if batchSize := n - nCheckpoint; n >= 1000 {
rowsIngestedTotal.Add(batchSize)
nCheckpoint = n
}
}
}
var lineBufferPool bytesutil.ByteBufferPool
var rowsIngestedTotal = metrics.NewCounter(`vl_rows_ingested_total{type="elasticsearch_bulk"}`)
func readBulkLine(sc *bufio.Scanner, timeField, msgField string,
processLogMessage func(timestamp int64, fields []logstorage.Field),
) (bool, error) {
var line []byte
// Read the command, must be "create" or "index"
for len(line) == 0 {
if !sc.Scan() {
if err := sc.Err(); err != nil {
if errors.Is(err, bufio.ErrTooLong) {
return false, fmt.Errorf(`cannot read "create" or "index" command, since its size exceeds -insert.maxLineSizeBytes=%d`,
insertutils.MaxLineSizeBytes.IntN())
}
return false, err
}
return false, nil
}
line = sc.Bytes()
}
lineStr := bytesutil.ToUnsafeString(line)
if !strings.Contains(lineStr, `"create"`) && !strings.Contains(lineStr, `"index"`) {
return false, fmt.Errorf(`unexpected command %q; expecting "create" or "index"`, line)
}
// Decode log message
if !sc.Scan() {
if err := sc.Err(); err != nil {
if errors.Is(err, bufio.ErrTooLong) {
return false, fmt.Errorf("cannot read log message, since its size exceeds -insert.maxLineSizeBytes=%d", insertutils.MaxLineSizeBytes.IntN())
}
return false, err
}
return false, fmt.Errorf(`missing log message after the "create" or "index" command`)
}
line = sc.Bytes()
p := logjson.GetParser()
if err := p.ParseLogMessage(line); err != nil {
return false, fmt.Errorf("cannot parse json-encoded log entry: %w", err)
}
ts, err := extractTimestampFromFields(timeField, p.Fields)
if err != nil {
return false, fmt.Errorf("cannot parse timestamp: %w", err)
}
if ts == 0 {
ts = time.Now().UnixNano()
}
p.RenameField(msgField, "_msg")
processLogMessage(ts, p.Fields)
logjson.PutParser(p)
return true, nil
}
func extractTimestampFromFields(timeField string, fields []logstorage.Field) (int64, error) {
for i := range fields {
f := &fields[i]
if f.Name != timeField {
continue
}
timestamp, err := parseElasticsearchTimestamp(f.Value)
if err != nil {
return 0, err
}
f.Value = ""
return timestamp, nil
}
return 0, nil
}
func parseElasticsearchTimestamp(s string) (int64, error) {
if s == "0" || s == "" {
// Special case - zero or empty timestamp must be substituted
// with the current time by the caller.
return 0, nil
}
if len(s) < len("YYYY-MM-DD") || s[len("YYYY")] != '-' {
// Try parsing timestamp in milliseconds
n, err := strconv.ParseInt(s, 10, 64)
if err != nil {
return 0, fmt.Errorf("cannot parse timestamp in milliseconds from %q: %w", s, err)
}
if n > int64(math.MaxInt64)/1e6 {
return 0, fmt.Errorf("too big timestamp in milliseconds: %d; mustn't exceed %d", n, int64(math.MaxInt64)/1e6)
}
if n < int64(math.MinInt64)/1e6 {
return 0, fmt.Errorf("too small timestamp in milliseconds: %d; must be bigger than %d", n, int64(math.MinInt64)/1e6)
}
n *= 1e6
return n, nil
}
if len(s) == len("YYYY-MM-DD") {
t, err := time.Parse("2006-01-02", s)
if err != nil {
return 0, fmt.Errorf("cannot parse date %q: %w", s, err)
}
return t.UnixNano(), nil
}
t, err := time.Parse(time.RFC3339, s)
if err != nil {
return 0, fmt.Errorf("cannot parse timestamp %q: %w", s, err)
}
return t.UnixNano(), nil
}

View File

@@ -1,129 +0,0 @@
package elasticsearch
import (
"bytes"
"compress/gzip"
"fmt"
"reflect"
"strings"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
)
func TestReadBulkRequestFailure(t *testing.T) {
f := func(data string) {
t.Helper()
processLogMessage := func(timestamp int64, fields []logstorage.Field) {
t.Fatalf("unexpected call to processLogMessage with timestamp=%d, fields=%s", timestamp, fields)
}
r := bytes.NewBufferString(data)
rows, err := readBulkRequest(r, false, "_time", "_msg", processLogMessage)
if err == nil {
t.Fatalf("expecting non-empty error")
}
if rows != 0 {
t.Fatalf("unexpected non-zero rows=%d", rows)
}
}
f("foobar")
f(`{}`)
f(`{"create":{}}`)
f(`{"creat":{}}
{}`)
f(`{"create":{}}
foobar`)
}
func TestReadBulkRequestSuccess(t *testing.T) {
f := func(data, timeField, msgField string, rowsExpected int, timestampsExpected []int64, resultExpected string) {
t.Helper()
var timestamps []int64
var result string
processLogMessage := func(timestamp int64, fields []logstorage.Field) {
timestamps = append(timestamps, timestamp)
a := make([]string, len(fields))
for i, f := range fields {
a[i] = fmt.Sprintf("%q:%q", f.Name, f.Value)
}
s := "{" + strings.Join(a, ",") + "}\n"
result += s
}
// Read the request without compression
r := bytes.NewBufferString(data)
rows, err := readBulkRequest(r, false, timeField, msgField, processLogMessage)
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if rows != rowsExpected {
t.Fatalf("unexpected rows read; got %d; want %d", rows, rowsExpected)
}
if !reflect.DeepEqual(timestamps, timestampsExpected) {
t.Fatalf("unexpected timestamps;\ngot\n%d\nwant\n%d", timestamps, timestampsExpected)
}
if result != resultExpected {
t.Fatalf("unexpected result;\ngot\n%s\nwant\n%s", result, resultExpected)
}
// Read the request with compression
timestamps = nil
result = ""
compressedData := compressData(data)
r = bytes.NewBufferString(compressedData)
rows, err = readBulkRequest(r, true, timeField, msgField, processLogMessage)
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if rows != rowsExpected {
t.Fatalf("unexpected rows read; got %d; want %d", rows, rowsExpected)
}
if !reflect.DeepEqual(timestamps, timestampsExpected) {
t.Fatalf("unexpected timestamps;\ngot\n%d\nwant\n%d", timestamps, timestampsExpected)
}
if result != resultExpected {
t.Fatalf("unexpected result;\ngot\n%s\nwant\n%s", result, resultExpected)
}
}
// Verify an empty data
f("", "_time", "_msg", 0, nil, "")
f("\n", "_time", "_msg", 0, nil, "")
f("\n\n", "_time", "_msg", 0, nil, "")
// Verify non-empty data
data := `{"create":{"_index":"filebeat-8.8.0"}}
{"@timestamp":"2023-06-06T04:48:11.735Z","log":{"offset":71770,"file":{"path":"/var/log/auth.log"}},"message":"foobar"}
{"create":{"_index":"filebeat-8.8.0"}}
{"@timestamp":"2023-06-06T04:48:12.735Z","message":"baz"}
{"index":{"_index":"filebeat-8.8.0"}}
{"message":"xyz","@timestamp":"2023-06-06T04:48:13.735Z","x":"y"}
`
timeField := "@timestamp"
msgField := "message"
rowsExpected := 3
timestampsExpected := []int64{1686026891735000000, 1686026892735000000, 1686026893735000000}
resultExpected := `{"@timestamp":"","log.offset":"71770","log.file.path":"/var/log/auth.log","_msg":"foobar"}
{"@timestamp":"","_msg":"baz"}
{"_msg":"xyz","@timestamp":"","x":"y"}
`
f(data, timeField, msgField, rowsExpected, timestampsExpected, resultExpected)
}
func compressData(s string) string {
var bb bytes.Buffer
zw := gzip.NewWriter(&bb)
if _, err := zw.Write([]byte(s)); err != nil {
panic(fmt.Errorf("unexpected error when compressing data: %s", err))
}
if err := zw.Close(); err != nil {
panic(fmt.Errorf("unexpected error when closing gzip writer: %s", err))
}
return bb.String()
}

View File

@@ -1,50 +0,0 @@
package elasticsearch
import (
"bytes"
"fmt"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
)
func BenchmarkReadBulkRequest(b *testing.B) {
b.Run("gzip:off", func(b *testing.B) {
benchmarkReadBulkRequest(b, false)
})
b.Run("gzip:on", func(b *testing.B) {
benchmarkReadBulkRequest(b, true)
})
}
func benchmarkReadBulkRequest(b *testing.B, isGzip bool) {
data := `{"create":{"_index":"filebeat-8.8.0"}}
{"@timestamp":"2023-06-06T04:48:11.735Z","log":{"offset":71770,"file":{"path":"/var/log/auth.log"}},"message":"foobar"}
{"create":{"_index":"filebeat-8.8.0"}}
{"@timestamp":"2023-06-06T04:48:12.735Z","message":"baz"}
{"create":{"_index":"filebeat-8.8.0"}}
{"message":"xyz","@timestamp":"2023-06-06T04:48:13.735Z","x":"y"}
`
if isGzip {
data = compressData(data)
}
dataBytes := bytesutil.ToUnsafeBytes(data)
timeField := "@timestamp"
msgField := "message"
processLogMessage := func(timestmap int64, fields []logstorage.Field) {}
b.ReportAllocs()
b.SetBytes(int64(len(data)))
b.RunParallel(func(pb *testing.PB) {
r := &bytes.Reader{}
for pb.Next() {
r.Reset(dataBytes)
_, err := readBulkRequest(r, isGzip, timeField, msgField, processLogMessage)
if err != nil {
panic(fmt.Errorf("unexpected error: %s", err))
}
}
})
}

View File

@@ -1,91 +0,0 @@
package insertutils
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/VictoriaMetrics/metrics"
)
// CommonParams contains common HTTP parameters used by log ingestion APIs.
//
// See https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/#http-parameters
type CommonParams struct {
TenantID logstorage.TenantID
TimeField string
MsgField string
StreamFields []string
IgnoreFields []string
Debug bool
DebugRequestURI string
DebugRemoteAddr string
}
// GetCommonParams returns CommonParams from r.
func GetCommonParams(r *http.Request) (*CommonParams, error) {
// Extract tenantID
tenantID, err := logstorage.GetTenantIDFromRequest(r)
if err != nil {
return nil, err
}
// Extract time field name from _time_field query arg
var timeField = "_time"
if tf := r.FormValue("_time_field"); tf != "" {
timeField = tf
}
// Extract message field name from _msg_field query arg
var msgField = ""
if msgf := r.FormValue("_msg_field"); msgf != "" {
msgField = msgf
}
streamFields := httputils.GetArray(r, "_stream_fields")
ignoreFields := httputils.GetArray(r, "ignore_fields")
debug := httputils.GetBool(r, "debug")
debugRequestURI := ""
debugRemoteAddr := ""
if debug {
debugRequestURI = httpserver.GetRequestURI(r)
debugRemoteAddr = httpserver.GetQuotedRemoteAddr(r)
}
cp := &CommonParams{
TenantID: tenantID,
TimeField: timeField,
MsgField: msgField,
StreamFields: streamFields,
IgnoreFields: ignoreFields,
Debug: debug,
DebugRequestURI: debugRequestURI,
DebugRemoteAddr: debugRemoteAddr,
}
return cp, nil
}
// GetProcessLogMessageFunc returns a function, which adds parsed log messages to lr.
func (cp *CommonParams) GetProcessLogMessageFunc(lr *logstorage.LogRows) func(timestamp int64, fields []logstorage.Field) {
return func(timestamp int64, fields []logstorage.Field) {
lr.MustAdd(cp.TenantID, timestamp, fields)
if cp.Debug {
s := lr.GetRowString(0)
lr.ResetKeepSettings()
logger.Infof("remoteAddr=%s; requestURI=%s; ignoring log entry because of `debug` query arg: %s", cp.DebugRemoteAddr, cp.DebugRequestURI, s)
rowsDroppedTotal.Inc()
return
}
if lr.NeedFlush() {
vlstorage.MustAddRows(lr)
lr.ResetKeepSettings()
}
}
}
var rowsDroppedTotal = metrics.NewCounter(`vl_rows_dropped_total{reason="debug"}`)

View File

@@ -1,10 +0,0 @@
package insertutils
import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
)
var (
// MaxLineSizeBytes is the maximum length of a single line for /insert/* handlers
MaxLineSizeBytes = flagutil.NewBytes("insert.maxLineSizeBytes", 256*1024, "The maximum size of a single line, which can be read by /insert/* handlers")
)

View File

@@ -1,149 +0,0 @@
package jsonline
import (
"bufio"
"errors"
"fmt"
"net/http"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/insertutils"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logjson"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
// RequestHandler processes jsonline insert requests
func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
w.Header().Add("Content-Type", "application/json")
if r.Method != "POST" {
w.WriteHeader(http.StatusMethodNotAllowed)
return true
}
requestsTotal.Inc()
cp, err := insertutils.GetCommonParams(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
lr := logstorage.GetLogRows(cp.StreamFields, cp.IgnoreFields)
processLogMessage := cp.GetProcessLogMessageFunc(lr)
reader := r.Body
if r.Header.Get("Content-Encoding") == "gzip" {
zr, err := common.GetGzipReader(reader)
if err != nil {
logger.Errorf("cannot read gzipped _bulk request: %s", err)
return true
}
defer common.PutGzipReader(zr)
reader = zr
}
wcr := writeconcurrencylimiter.GetReader(reader)
defer writeconcurrencylimiter.PutReader(wcr)
lb := lineBufferPool.Get()
defer lineBufferPool.Put(lb)
lb.B = bytesutil.ResizeNoCopyNoOverallocate(lb.B, insertutils.MaxLineSizeBytes.IntN())
sc := bufio.NewScanner(wcr)
sc.Buffer(lb.B, len(lb.B))
n := 0
for {
ok, err := readLine(sc, cp.TimeField, cp.MsgField, processLogMessage)
wcr.DecConcurrency()
if err != nil {
logger.Errorf("cannot read line #%d in /jsonline request: %s", n, err)
break
}
if !ok {
break
}
n++
rowsIngestedTotal.Inc()
}
vlstorage.MustAddRows(lr)
logstorage.PutLogRows(lr)
return true
}
func readLine(sc *bufio.Scanner, timeField, msgField string, processLogMessage func(timestamp int64, fields []logstorage.Field)) (bool, error) {
var line []byte
for len(line) == 0 {
if !sc.Scan() {
if err := sc.Err(); err != nil {
if errors.Is(err, bufio.ErrTooLong) {
return false, fmt.Errorf(`cannot read json line, since its size exceeds -insert.maxLineSizeBytes=%d`, insertutils.MaxLineSizeBytes.IntN())
}
return false, err
}
return false, nil
}
line = sc.Bytes()
}
p := logjson.GetParser()
if err := p.ParseLogMessage(line); err != nil {
return false, fmt.Errorf("cannot parse json-encoded log entry: %w", err)
}
ts, err := extractTimestampFromFields(timeField, p.Fields)
if err != nil {
return false, fmt.Errorf("cannot parse timestamp: %w", err)
}
if ts == 0 {
ts = time.Now().UnixNano()
}
p.RenameField(msgField, "_msg")
processLogMessage(ts, p.Fields)
logjson.PutParser(p)
return true, nil
}
func extractTimestampFromFields(timeField string, fields []logstorage.Field) (int64, error) {
for i := range fields {
f := &fields[i]
if f.Name != timeField {
continue
}
timestamp, err := parseISO8601Timestamp(f.Value)
if err != nil {
return 0, err
}
f.Value = ""
return timestamp, nil
}
return 0, nil
}
func parseISO8601Timestamp(s string) (int64, error) {
if s == "0" || s == "" {
// Special case for returning the current timestamp.
// It must be automatically converted to the current timestamp by the caller.
return 0, nil
}
t, err := time.Parse(time.RFC3339, s)
if err != nil {
return 0, fmt.Errorf("cannot parse timestamp %q: %w", s, err)
}
return t.UnixNano(), nil
}
var lineBufferPool bytesutil.ByteBufferPool
var (
requestsTotal = metrics.NewCounter(`vl_http_requests_total{path="/insert/jsonline"}`)
rowsIngestedTotal = metrics.NewCounter(`vl_rows_ingested_total{type="jsonline"}`)
)

View File

@@ -1,70 +0,0 @@
package jsonline
import (
"bufio"
"bytes"
"fmt"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"reflect"
"strings"
"testing"
)
func TestReadBulkRequestSuccess(t *testing.T) {
f := func(data, timeField, msgField string, rowsExpected int, timestampsExpected []int64, resultExpected string) {
t.Helper()
var timestamps []int64
var result string
processLogMessage := func(timestamp int64, fields []logstorage.Field) {
timestamps = append(timestamps, timestamp)
a := make([]string, len(fields))
for i, f := range fields {
a[i] = fmt.Sprintf("%q:%q", f.Name, f.Value)
}
s := "{" + strings.Join(a, ",") + "}\n"
result += s
}
// Read the request without compression
r := bytes.NewBufferString(data)
sc := bufio.NewScanner(r)
rows := 0
for {
ok, err := readLine(sc, timeField, msgField, processLogMessage)
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if !ok {
break
}
rows++
}
if rows != rowsExpected {
t.Fatalf("unexpected rows read; got %d; want %d", rows, rowsExpected)
}
if !reflect.DeepEqual(timestamps, timestampsExpected) {
t.Fatalf("unexpected timestamps;\ngot\n%d\nwant\n%d", timestamps, timestampsExpected)
}
if result != resultExpected {
t.Fatalf("unexpected result;\ngot\n%s\nwant\n%s", result, resultExpected)
}
}
// Verify non-empty data
data := `{"@timestamp":"2023-06-06T04:48:11.735Z","log":{"offset":71770,"file":{"path":"/var/log/auth.log"}},"message":"foobar"}
{"@timestamp":"2023-06-06T04:48:12.735Z","message":"baz"}
{"message":"xyz","@timestamp":"2023-06-06T04:48:13.735Z","x":"y"}
`
timeField := "@timestamp"
msgField := "message"
rowsExpected := 3
timestampsExpected := []int64{1686026891735000000, 1686026892735000000, 1686026893735000000}
resultExpected := `{"@timestamp":"","log.offset":"71770","log.file.path":"/var/log/auth.log","_msg":"foobar"}
{"@timestamp":"","_msg":"baz"}
{"_msg":"xyz","@timestamp":"","x":"y"}
`
f(data, timeField, msgField, rowsExpected, timestampsExpected, resultExpected)
}

View File

@@ -1,56 +0,0 @@
package loki
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/insertutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/VictoriaMetrics/metrics"
)
var (
lokiRequestsJSONTotal = metrics.NewCounter(`vl_http_requests_total{path="/insert/loki/api/v1/push",format="json"}`)
lokiRequestsProtobufTotal = metrics.NewCounter(`vl_http_requests_total{path="/insert/loki/api/v1/push",format="protobuf"}`)
)
// RequestHandler processes Loki insert requests
//
// See https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki
func RequestHandler(path string, w http.ResponseWriter, r *http.Request) bool {
if path != "/api/v1/push" {
return false
}
contentType := r.Header.Get("Content-Type")
switch contentType {
case "application/json":
lokiRequestsJSONTotal.Inc()
return handleJSON(r, w)
default:
// Protobuf request body should be handled by default accoring to https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki
lokiRequestsProtobufTotal.Inc()
return handleProtobuf(r, w)
}
}
func getCommonParams(r *http.Request) (*insertutils.CommonParams, error) {
cp, err := insertutils.GetCommonParams(r)
if err != nil {
return nil, err
}
// If parsed tenant is (0,0) it is likely to be default tenant
// Try parsing tenant from Loki headers
if cp.TenantID.AccountID == 0 && cp.TenantID.ProjectID == 0 {
org := r.Header.Get("X-Scope-OrgID")
if org != "" {
tenantID, err := logstorage.GetTenantIDFromString(org)
if err != nil {
return nil, err
}
cp.TenantID = tenantID
}
}
return cp, nil
}

View File

@@ -1,190 +0,0 @@
package loki
import (
"fmt"
"io"
"math"
"net/http"
"strconv"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fastjson"
)
var (
rowsIngestedJSONTotal = metrics.NewCounter(`vl_rows_ingested_total{type="loki",format="json"}`)
parserPool fastjson.ParserPool
)
func handleJSON(r *http.Request, w http.ResponseWriter) bool {
reader := r.Body
if r.Header.Get("Content-Encoding") == "gzip" {
zr, err := common.GetGzipReader(reader)
if err != nil {
httpserver.Errorf(w, r, "cannot initialize gzip reader: %s", err)
return true
}
defer common.PutGzipReader(zr)
reader = zr
}
wcr := writeconcurrencylimiter.GetReader(reader)
data, err := io.ReadAll(wcr)
writeconcurrencylimiter.PutReader(wcr)
if err != nil {
httpserver.Errorf(w, r, "cannot read request body: %s", err)
return true
}
cp, err := getCommonParams(r)
if err != nil {
httpserver.Errorf(w, r, "cannot parse common params from request: %s", err)
return true
}
lr := logstorage.GetLogRows(cp.StreamFields, cp.IgnoreFields)
processLogMessage := cp.GetProcessLogMessageFunc(lr)
n, err := parseJSONRequest(data, processLogMessage)
vlstorage.MustAddRows(lr)
logstorage.PutLogRows(lr)
if err != nil {
httpserver.Errorf(w, r, "cannot parse Loki request: %s", err)
return true
}
rowsIngestedJSONTotal.Add(n)
return true
}
func parseJSONRequest(data []byte, processLogMessage func(timestamp int64, fields []logstorage.Field)) (int, error) {
p := parserPool.Get()
defer parserPool.Put(p)
v, err := p.ParseBytes(data)
if err != nil {
return 0, fmt.Errorf("cannot parse JSON request body: %w", err)
}
streamsV := v.Get("streams")
if streamsV == nil {
return 0, fmt.Errorf("missing `streams` item in the parsed JSON: %q", v)
}
streams, err := streamsV.Array()
if err != nil {
return 0, fmt.Errorf("`streams` item in the parsed JSON must contain an array; got %q", streamsV)
}
currentTimestamp := time.Now().UnixNano()
var commonFields []logstorage.Field
rowsIngested := 0
for _, stream := range streams {
// populate common labels from `stream` dict
commonFields = commonFields[:0]
labelsV := stream.Get("stream")
var labels *fastjson.Object
if labelsV != nil {
o, err := labelsV.Object()
if err != nil {
return rowsIngested, fmt.Errorf("`stream` item in the parsed JSON must contain an object; got %q", labelsV)
}
labels = o
}
labels.Visit(func(k []byte, v *fastjson.Value) {
if err != nil {
return
}
vStr, errLocal := v.StringBytes()
if errLocal != nil {
err = fmt.Errorf("unexpected label value type for %q:%q; want string", k, v)
return
}
commonFields = append(commonFields, logstorage.Field{
Name: bytesutil.ToUnsafeString(k),
Value: bytesutil.ToUnsafeString(vStr),
})
})
if err != nil {
return rowsIngested, fmt.Errorf("error when parsing `stream` object: %w", err)
}
// populate messages from `values` array
linesV := stream.Get("values")
if linesV == nil {
return rowsIngested, fmt.Errorf("missing `values` item in the parsed JSON %q", stream)
}
lines, err := linesV.Array()
if err != nil {
return rowsIngested, fmt.Errorf("`values` item in the parsed JSON must contain an array; got %q", linesV)
}
fields := commonFields
for _, line := range lines {
lineA, err := line.Array()
if err != nil {
return rowsIngested, fmt.Errorf("unexpected contents of `values` item; want array; got %q", line)
}
if len(lineA) != 2 {
return rowsIngested, fmt.Errorf("unexpected number of values in `values` item array %q; got %d want 2", line, len(lineA))
}
// parse timestamp
timestamp, err := lineA[0].StringBytes()
if err != nil {
return rowsIngested, fmt.Errorf("unexpected log timestamp type for %q; want string", lineA[0])
}
ts, err := parseLokiTimestamp(bytesutil.ToUnsafeString(timestamp))
if err != nil {
return rowsIngested, fmt.Errorf("cannot parse log timestamp %q: %w", timestamp, err)
}
if ts == 0 {
ts = currentTimestamp
}
// parse log message
msg, err := lineA[1].StringBytes()
if err != nil {
return rowsIngested, fmt.Errorf("unexpected log message type for %q; want string", lineA[1])
}
fields = append(fields[:len(commonFields)], logstorage.Field{
Name: "_msg",
Value: bytesutil.ToUnsafeString(msg),
})
processLogMessage(ts, fields)
}
rowsIngested += len(lines)
}
return rowsIngested, nil
}
func parseLokiTimestamp(s string) (int64, error) {
if s == "" {
// Special case - an empty timestamp must be substituted with the current time by the caller.
return 0, nil
}
n, err := strconv.ParseInt(s, 10, 64)
if err != nil {
// Fall back to parsing floating-point value
f, err := strconv.ParseFloat(s, 64)
if err != nil {
return 0, err
}
if f > math.MaxInt64 {
return 0, fmt.Errorf("too big timestamp in nanoseconds: %v; mustn't exceed %v", f, int64(math.MaxInt64))
}
if f < math.MinInt64 {
return 0, fmt.Errorf("too small timestamp in nanoseconds: %v; must be bigger or equal to %v", f, int64(math.MinInt64))
}
n = int64(f)
}
if n < 0 {
return 0, fmt.Errorf("too small timestamp in nanoseconds: %d; must be bigger than 0", n)
}
return n, nil
}

View File

@@ -1,130 +0,0 @@
package loki
import (
"fmt"
"strings"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
)
func TestParseJSONRequestFailure(t *testing.T) {
f := func(s string) {
t.Helper()
n, err := parseJSONRequest([]byte(s), func(timestamp int64, fields []logstorage.Field) {
t.Fatalf("unexpected call to parseJSONRequest callback!")
})
if err == nil {
t.Fatalf("expecting non-nil error")
}
if n != 0 {
t.Fatalf("unexpected number of parsed lines: %d; want 0", n)
}
}
f(``)
// Invalid json
f(`{}`)
f(`[]`)
f(`"foo"`)
f(`123`)
// invalid type for `streams` item
f(`{"streams":123}`)
// Missing `values` item
f(`{"streams":[{}]}`)
// Invalid type for `values` item
f(`{"streams":[{"values":"foobar"}]}`)
// Invalid type for `stream` item
f(`{"streams":[{"stream":[],"values":[]}]}`)
// Invalid type for `values` individual item
f(`{"streams":[{"values":[123]}]}`)
// Invalid length of `values` individual item
f(`{"streams":[{"values":[[]]}]}`)
f(`{"streams":[{"values":[["123"]]}]}`)
f(`{"streams":[{"values":[["123","456","789"]]}]}`)
// Invalid type for timestamp inside `values` individual item
f(`{"streams":[{"values":[[123,"456"]}]}`)
// Invalid type for log message
f(`{"streams":[{"values":[["123",1234]]}]}`)
}
func TestParseJSONRequestSuccess(t *testing.T) {
f := func(s string, resultExpected string) {
t.Helper()
var lines []string
n, err := parseJSONRequest([]byte(s), func(timestamp int64, fields []logstorage.Field) {
var a []string
for _, f := range fields {
a = append(a, f.String())
}
line := fmt.Sprintf("_time:%d %s", timestamp, strings.Join(a, " "))
lines = append(lines, line)
})
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if n != len(lines) {
t.Fatalf("unexpected number of lines parsed; got %d; want %d", n, len(lines))
}
result := strings.Join(lines, "\n")
if result != resultExpected {
t.Fatalf("unexpected result;\ngot\n%s\nwant\n%s", result, resultExpected)
}
}
// Empty streams
f(`{"streams":[]}`, ``)
f(`{"streams":[{"values":[]}]}`, ``)
f(`{"streams":[{"stream":{},"values":[]}]}`, ``)
f(`{"streams":[{"stream":{"foo":"bar"},"values":[]}]}`, ``)
// Empty stream labels
f(`{"streams":[{"values":[["1577836800000000001", "foo bar"]]}]}`, `_time:1577836800000000001 "_msg":"foo bar"`)
f(`{"streams":[{"stream":{},"values":[["1577836800000000001", "foo bar"]]}]}`, `_time:1577836800000000001 "_msg":"foo bar"`)
// Non-empty stream labels
f(`{"streams":[{"stream":{
"label1": "value1",
"label2": "value2"
},"values":[
["1577836800000000001", "foo bar"],
["1477836900005000002", "abc"],
["147.78369e9", "foobar"]
]}]}`, `_time:1577836800000000001 "label1":"value1" "label2":"value2" "_msg":"foo bar"
_time:1477836900005000002 "label1":"value1" "label2":"value2" "_msg":"abc"
_time:147783690000 "label1":"value1" "label2":"value2" "_msg":"foobar"`)
// Multiple streams
f(`{
"streams": [
{
"stream": {
"foo": "bar",
"a": "b"
},
"values": [
["1577836800000000001", "foo bar"],
["1577836900005000002", "abc"]
]
},
{
"stream": {
"x": "y"
},
"values": [
["1877836900005000002", "yx"]
]
}
]
}`, `_time:1577836800000000001 "foo":"bar" "a":"b" "_msg":"foo bar"
_time:1577836900005000002 "foo":"bar" "a":"b" "_msg":"abc"
_time:1877836900005000002 "x":"y" "_msg":"yx"`)
}

View File

@@ -1,78 +0,0 @@
package loki
import (
"fmt"
"strconv"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
)
func BenchmarkParseJSONRequest(b *testing.B) {
for _, streams := range []int{5, 10} {
for _, rows := range []int{100, 1000} {
for _, labels := range []int{10, 50} {
b.Run(fmt.Sprintf("streams_%d/rows_%d/labels_%d", streams, rows, labels), func(b *testing.B) {
benchmarkParseJSONRequest(b, streams, rows, labels)
})
}
}
}
}
func benchmarkParseJSONRequest(b *testing.B, streams, rows, labels int) {
b.ReportAllocs()
b.SetBytes(int64(streams * rows))
b.RunParallel(func(pb *testing.PB) {
data := getJSONBody(streams, rows, labels)
for pb.Next() {
_, err := parseJSONRequest(data, func(timestamp int64, fields []logstorage.Field) {})
if err != nil {
panic(fmt.Errorf("unexpected error: %s", err))
}
}
})
}
func getJSONBody(streams, rows, labels int) []byte {
body := append([]byte{}, `{"streams":[`...)
now := time.Now().UnixNano()
valuePrefix := fmt.Sprintf(`["%d","value_`, now)
for i := 0; i < streams; i++ {
body = append(body, `{"stream":{`...)
for j := 0; j < labels; j++ {
body = append(body, `"label_`...)
body = strconv.AppendInt(body, int64(j), 10)
body = append(body, `":"value_`...)
body = strconv.AppendInt(body, int64(j), 10)
body = append(body, '"')
if j < labels-1 {
body = append(body, ',')
}
}
body = append(body, `}, "values":[`...)
for j := 0; j < rows; j++ {
body = append(body, valuePrefix...)
body = strconv.AppendInt(body, int64(j), 10)
body = append(body, `"]`...)
if j < rows-1 {
body = append(body, ',')
}
}
body = append(body, `]}`...)
if i < streams-1 {
body = append(body, ',')
}
}
body = append(body, `]}`...)
return body
}

View File

@@ -1,171 +0,0 @@
package loki
import (
"fmt"
"io"
"net/http"
"strconv"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
"github.com/golang/snappy"
)
var (
rowsIngestedProtobufTotal = metrics.NewCounter(`vl_rows_ingested_total{type="loki",format="protobuf"}`)
bytesBufPool bytesutil.ByteBufferPool
pushReqsPool sync.Pool
)
func handleProtobuf(r *http.Request, w http.ResponseWriter) bool {
wcr := writeconcurrencylimiter.GetReader(r.Body)
data, err := io.ReadAll(wcr)
writeconcurrencylimiter.PutReader(wcr)
if err != nil {
httpserver.Errorf(w, r, "cannot read request body: %s", err)
return true
}
cp, err := getCommonParams(r)
if err != nil {
httpserver.Errorf(w, r, "cannot parse common params from request: %s", err)
return true
}
lr := logstorage.GetLogRows(cp.StreamFields, cp.IgnoreFields)
processLogMessage := cp.GetProcessLogMessageFunc(lr)
n, err := parseProtobufRequest(data, processLogMessage)
vlstorage.MustAddRows(lr)
logstorage.PutLogRows(lr)
if err != nil {
httpserver.Errorf(w, r, "cannot parse loki request: %s", err)
return true
}
rowsIngestedProtobufTotal.Add(n)
return true
}
func parseProtobufRequest(data []byte, processLogMessage func(timestamp int64, fields []logstorage.Field)) (int, error) {
bb := bytesBufPool.Get()
defer bytesBufPool.Put(bb)
buf, err := snappy.Decode(bb.B[:cap(bb.B)], data)
if err != nil {
return 0, fmt.Errorf("cannot decode snappy-encoded request body: %w", err)
}
bb.B = buf
req := getPushRequest()
defer putPushRequest(req)
err = req.Unmarshal(bb.B)
if err != nil {
return 0, fmt.Errorf("cannot parse request body: %s", err)
}
var commonFields []logstorage.Field
rowsIngested := 0
streams := req.Streams
currentTimestamp := time.Now().UnixNano()
for i := range streams {
stream := &streams[i]
// st.Labels contains labels for the stream.
// Labels are same for all entries in the stream.
commonFields, err = parsePromLabels(commonFields[:0], stream.Labels)
if err != nil {
return rowsIngested, fmt.Errorf("cannot parse stream labels %q: %s", stream.Labels, err)
}
fields := commonFields
entries := stream.Entries
for j := range entries {
entry := &entries[j]
fields = append(fields[:len(commonFields)], logstorage.Field{
Name: "_msg",
Value: entry.Line,
})
ts := entry.Timestamp.UnixNano()
if ts == 0 {
ts = currentTimestamp
}
processLogMessage(ts, fields)
}
rowsIngested += len(stream.Entries)
}
return rowsIngested, nil
}
// parsePromLabels parses log fields in Prometheus text exposition format from s, appends them to dst and returns the result.
//
// See test data of promtail for examples: https://github.com/grafana/loki/blob/a24ef7b206e0ca63ee74ca6ecb0a09b745cd2258/pkg/push/types_test.go
func parsePromLabels(dst []logstorage.Field, s string) ([]logstorage.Field, error) {
// Make sure s is wrapped into `{...}`
s = strings.TrimSpace(s)
if len(s) < 2 {
return nil, fmt.Errorf("too short string to parse: %q", s)
}
if s[0] != '{' {
return nil, fmt.Errorf("missing `{` at the beginning of %q", s)
}
if s[len(s)-1] != '}' {
return nil, fmt.Errorf("missing `}` at the end of %q", s)
}
s = s[1 : len(s)-1]
for len(s) > 0 {
// Parse label name
n := strings.IndexByte(s, '=')
if n < 0 {
return nil, fmt.Errorf("cannot find `=` char for label value at %s", s)
}
name := s[:n]
s = s[n+1:]
// Parse label value
qs, err := strconv.QuotedPrefix(s)
if err != nil {
return nil, fmt.Errorf("cannot parse value for label %q at %s: %w", name, s, err)
}
s = s[len(qs):]
value, err := strconv.Unquote(qs)
if err != nil {
return nil, fmt.Errorf("cannot unquote value %q for label %q: %w", qs, name, err)
}
// Append the found field to dst.
dst = append(dst, logstorage.Field{
Name: name,
Value: value,
})
// Check whether there are other labels remaining
if len(s) == 0 {
break
}
if !strings.HasPrefix(s, ",") {
return nil, fmt.Errorf("missing `,` char at %s", s)
}
s = s[1:]
s = strings.TrimPrefix(s, " ")
}
return dst, nil
}
func getPushRequest() *PushRequest {
v := pushReqsPool.Get()
if v == nil {
return &PushRequest{}
}
return v.(*PushRequest)
}
func putPushRequest(req *PushRequest) {
req.Reset()
pushReqsPool.Put(req)
}

View File

@@ -1,171 +0,0 @@
package loki
import (
"fmt"
"strings"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/golang/snappy"
)
func TestParseProtobufRequestSuccess(t *testing.T) {
f := func(s string, resultExpected string) {
t.Helper()
var pr PushRequest
n, err := parseJSONRequest([]byte(s), func(timestamp int64, fields []logstorage.Field) {
msg := ""
for _, f := range fields {
if f.Name == "_msg" {
msg = f.Value
}
}
var a []string
for _, f := range fields {
if f.Name == "_msg" {
continue
}
item := fmt.Sprintf("%s=%q", f.Name, f.Value)
a = append(a, item)
}
labels := "{" + strings.Join(a, ", ") + "}"
pr.Streams = append(pr.Streams, Stream{
Labels: labels,
Entries: []Entry{
{
Timestamp: time.Unix(0, timestamp),
Line: msg,
},
},
})
})
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if n != len(pr.Streams) {
t.Fatalf("unexpected number of streams; got %d; want %d", len(pr.Streams), n)
}
data, err := pr.Marshal()
if err != nil {
t.Fatalf("unexpected error when marshaling PushRequest: %s", err)
}
encodedData := snappy.Encode(nil, data)
var lines []string
n, err = parseProtobufRequest(encodedData, func(timestamp int64, fields []logstorage.Field) {
var a []string
for _, f := range fields {
a = append(a, f.String())
}
line := fmt.Sprintf("_time:%d %s", timestamp, strings.Join(a, " "))
lines = append(lines, line)
})
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if n != len(lines) {
t.Fatalf("unexpected number of lines parsed; got %d; want %d", n, len(lines))
}
result := strings.Join(lines, "\n")
if result != resultExpected {
t.Fatalf("unexpected result;\ngot\n%s\nwant\n%s", result, resultExpected)
}
}
// Empty streams
f(`{"streams":[]}`, ``)
f(`{"streams":[{"values":[]}]}`, ``)
f(`{"streams":[{"stream":{},"values":[]}]}`, ``)
f(`{"streams":[{"stream":{"foo":"bar"},"values":[]}]}`, ``)
// Empty stream labels
f(`{"streams":[{"values":[["1577836800000000001", "foo bar"]]}]}`, `_time:1577836800000000001 "_msg":"foo bar"`)
f(`{"streams":[{"stream":{},"values":[["1577836800000000001", "foo bar"]]}]}`, `_time:1577836800000000001 "_msg":"foo bar"`)
// Non-empty stream labels
f(`{"streams":[{"stream":{
"label1": "value1",
"label2": "value2"
},"values":[
["1577836800000000001", "foo bar"],
["1477836900005000002", "abc"],
["147.78369e9", "foobar"]
]}]}`, `_time:1577836800000000001 "label1":"value1" "label2":"value2" "_msg":"foo bar"
_time:1477836900005000002 "label1":"value1" "label2":"value2" "_msg":"abc"
_time:147783690000 "label1":"value1" "label2":"value2" "_msg":"foobar"`)
// Multiple streams
f(`{
"streams": [
{
"stream": {
"foo": "bar",
"a": "b"
},
"values": [
["1577836800000000001", "foo bar"],
["1577836900005000002", "abc"]
]
},
{
"stream": {
"x": "y"
},
"values": [
["1877836900005000002", "yx"]
]
}
]
}`, `_time:1577836800000000001 "foo":"bar" "a":"b" "_msg":"foo bar"
_time:1577836900005000002 "foo":"bar" "a":"b" "_msg":"abc"
_time:1877836900005000002 "x":"y" "_msg":"yx"`)
}
func TestParsePromLabelsSuccess(t *testing.T) {
f := func(s string) {
t.Helper()
fields, err := parsePromLabels(nil, s)
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
var a []string
for _, f := range fields {
a = append(a, fmt.Sprintf("%s=%q", f.Name, f.Value))
}
result := "{" + strings.Join(a, ", ") + "}"
if result != s {
t.Fatalf("unexpected result;\ngot\n%s\nwant\n%s", result, s)
}
}
f("{}")
f(`{foo="bar"}`)
f(`{foo="bar", baz="x", y="z"}`)
f(`{foo="ba\"r\\z\n", a="", b="\"\\"}`)
}
func TestParsePromLabelsFailure(t *testing.T) {
f := func(s string) {
t.Helper()
fields, err := parsePromLabels(nil, s)
if err == nil {
t.Fatalf("expecting non-nil error")
}
if len(fields) > 0 {
t.Fatalf("unexpected non-empty fields: %s", fields)
}
}
f("")
f("{")
f(`{foo}`)
f(`{foo=bar}`)
f(`{foo="bar}`)
f(`{foo="ba\",r}`)
f(`{foo="bar" baz="aa"}`)
f(`foobar`)
f(`foo{bar="baz"}`)
}

View File

@@ -1,65 +0,0 @@
package loki
import (
"fmt"
"strconv"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/golang/snappy"
)
func BenchmarkParseProtobufRequest(b *testing.B) {
for _, streams := range []int{5, 10} {
for _, rows := range []int{100, 1000} {
for _, labels := range []int{10, 50} {
b.Run(fmt.Sprintf("streams_%d/rows_%d/labels_%d", streams, rows, labels), func(b *testing.B) {
benchmarkParseProtobufRequest(b, streams, rows, labels)
})
}
}
}
}
func benchmarkParseProtobufRequest(b *testing.B, streams, rows, labels int) {
b.ReportAllocs()
b.SetBytes(int64(streams * rows))
b.RunParallel(func(pb *testing.PB) {
body := getProtobufBody(streams, rows, labels)
for pb.Next() {
_, err := parseProtobufRequest(body, func(timestamp int64, fields []logstorage.Field) {})
if err != nil {
panic(fmt.Errorf("unexpected error: %s", err))
}
}
})
}
func getProtobufBody(streams, rows, labels int) []byte {
var pr PushRequest
for i := 0; i < streams; i++ {
var st Stream
st.Labels = `{`
for j := 0; j < labels; j++ {
st.Labels += `label_` + strconv.Itoa(j) + `="value_` + strconv.Itoa(j) + `"`
if j < labels-1 {
st.Labels += `,`
}
}
st.Labels += `}`
for j := 0; j < rows; j++ {
st.Entries = append(st.Entries, Entry{Timestamp: time.Now(), Line: "value_" + strconv.Itoa(j)})
}
pr.Streams = append(pr.Streams, st)
}
body, _ := pr.Marshal()
encodedBody := snappy.Encode(nil, body)
return encodedBody
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,38 +0,0 @@
syntax = "proto3";
// source: https://raw.githubusercontent.com/grafana/loki/main/pkg/push/push.proto
// Licensed under the Apache License, Version 2.0 (the "License");
// https://github.com/grafana/loki/blob/main/pkg/push/LICENSE
package logproto;
import "gogoproto/gogo.proto";
import "google/protobuf/timestamp.proto";
option go_package = "github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/loki";
message PushRequest {
repeated StreamAdapter streams = 1 [
(gogoproto.jsontag) = "streams",
(gogoproto.customtype) = "Stream"
];
}
message StreamAdapter {
string labels = 1 [(gogoproto.jsontag) = "labels"];
repeated EntryAdapter entries = 2 [
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "entries"
];
// hash contains the original hash of the stream.
uint64 hash = 3 [(gogoproto.jsontag) = "-"];
}
message EntryAdapter {
google.protobuf.Timestamp timestamp = 1 [
(gogoproto.stdtime) = true,
(gogoproto.nullable) = false,
(gogoproto.jsontag) = "ts"
];
string line = 2 [(gogoproto.jsontag) = "line"];
}

View File

@@ -1,110 +0,0 @@
package loki
// source: https://raw.githubusercontent.com/grafana/loki/main/pkg/push/timestamp.go
// Licensed under the Apache License, Version 2.0 (the "License");
// https://github.com/grafana/loki/blob/main/pkg/push/LICENSE
import (
"errors"
"strconv"
"time"
"github.com/gogo/protobuf/types"
)
const (
// Seconds field of the earliest valid Timestamp.
// This is time.Date(1, 1, 1, 0, 0, 0, 0, time.UTC).Unix().
minValidSeconds = -62135596800
// Seconds field just after the latest valid Timestamp.
// This is time.Date(10000, 1, 1, 0, 0, 0, 0, time.UTC).Unix().
maxValidSeconds = 253402300800
)
// validateTimestamp determines whether a Timestamp is valid.
// A valid timestamp represents a time in the range
// [0001-01-01, 10000-01-01) and has a Nanos field
// in the range [0, 1e9).
//
// If the Timestamp is valid, validateTimestamp returns nil.
// Otherwise, it returns an error that describes
// the problem.
//
// Every valid Timestamp can be represented by a time.Time, but the converse is not true.
func validateTimestamp(ts *types.Timestamp) error {
if ts == nil {
return errors.New("timestamp: nil Timestamp")
}
if ts.Seconds < minValidSeconds {
return errors.New("timestamp: " + formatTimestamp(ts) + " before 0001-01-01")
}
if ts.Seconds >= maxValidSeconds {
return errors.New("timestamp: " + formatTimestamp(ts) + " after 10000-01-01")
}
if ts.Nanos < 0 || ts.Nanos >= 1e9 {
return errors.New("timestamp: " + formatTimestamp(ts) + ": nanos not in range [0, 1e9)")
}
return nil
}
// formatTimestamp is equivalent to fmt.Sprintf("%#v", ts)
// but avoids the escape incurred by using fmt.Sprintf, eliminating
// unnecessary heap allocations.
func formatTimestamp(ts *types.Timestamp) string {
if ts == nil {
return "nil"
}
seconds := strconv.FormatInt(ts.Seconds, 10)
nanos := strconv.FormatInt(int64(ts.Nanos), 10)
return "&types.Timestamp{Seconds: " + seconds + ",\nNanos: " + nanos + ",\n}"
}
func sizeOfStdTime(t time.Time) int {
ts, err := timestampProto(t)
if err != nil {
return 0
}
return ts.Size()
}
func stdTimeMarshalTo(t time.Time, data []byte) (int, error) {
ts, err := timestampProto(t)
if err != nil {
return 0, err
}
return ts.MarshalTo(data)
}
func stdTimeUnmarshal(t *time.Time, data []byte) error {
ts := &types.Timestamp{}
if err := ts.Unmarshal(data); err != nil {
return err
}
tt, err := timestampFromProto(ts)
if err != nil {
return err
}
*t = tt
return nil
}
func timestampFromProto(ts *types.Timestamp) (time.Time, error) {
// Don't return the zero value on error, because corresponds to a valid
// timestamp. Instead return whatever time.Unix gives us.
var t time.Time
if ts == nil {
t = time.Unix(0, 0).UTC() // treat nil like the empty Timestamp
} else {
t = time.Unix(ts.Seconds, int64(ts.Nanos)).UTC()
}
return t, validateTimestamp(ts)
}
func timestampProto(t time.Time) (types.Timestamp, error) {
ts := types.Timestamp{
Seconds: t.Unix(),
Nanos: int32(t.Nanosecond()),
}
return ts, validateTimestamp(&ts)
}

View File

@@ -1,481 +0,0 @@
package loki
// source: https://raw.githubusercontent.com/grafana/loki/main/pkg/push/types.go
// Licensed under the Apache License, Version 2.0 (the "License");
// https://github.com/grafana/loki/blob/main/pkg/push/LICENSE
import (
"fmt"
"io"
"time"
)
// Stream contains a unique labels set as a string and a set of entries for it.
// We are not using the proto generated version but this custom one so that we
// can improve serialization see benchmark.
type Stream struct {
Labels string `protobuf:"bytes,1,opt,name=labels,proto3" json:"labels"`
Entries []Entry `protobuf:"bytes,2,rep,name=entries,proto3,customtype=EntryAdapter" json:"entries"`
Hash uint64 `protobuf:"varint,3,opt,name=hash,proto3" json:"-"`
}
// Entry is a log entry with a timestamp.
type Entry struct {
Timestamp time.Time `protobuf:"bytes,1,opt,name=timestamp,proto3,stdtime" json:"ts"`
Line string `protobuf:"bytes,2,opt,name=line,proto3" json:"line"`
}
// Marshal implements the proto.Marshaler interface.
func (m *Stream) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
n, err := m.MarshalToSizedBuffer(dAtA[:size])
if err != nil {
return nil, err
}
return dAtA[:n], nil
}
// MarshalTo marshals m to dst.
func (m *Stream) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to the sized buffer.
func (m *Stream) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
_ = i
var l int
_ = l
if m.Hash != 0 {
i = encodeVarintPush(dAtA, i, m.Hash)
i--
dAtA[i] = 0x18
}
if len(m.Entries) > 0 {
for iNdEx := len(m.Entries) - 1; iNdEx >= 0; iNdEx-- {
{
size, err := m.Entries[iNdEx].MarshalToSizedBuffer(dAtA[:i])
if err != nil {
return 0, err
}
i -= size
i = encodeVarintPush(dAtA, i, uint64(size))
}
i--
dAtA[i] = 0x12
}
}
if len(m.Labels) > 0 {
i -= len(m.Labels)
copy(dAtA[i:], m.Labels)
i = encodeVarintPush(dAtA, i, uint64(len(m.Labels)))
i--
dAtA[i] = 0xa
}
return len(dAtA) - i, nil
}
// Marshal implements the proto.Marshaler interface.
func (m *Entry) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
n, err := m.MarshalToSizedBuffer(dAtA[:size])
if err != nil {
return nil, err
}
return dAtA[:n], nil
}
// MarshalTo marshals m to dst.
func (m *Entry) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to the sized buffer.
func (m *Entry) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
_ = i
var l int
_ = l
if len(m.Line) > 0 {
i -= len(m.Line)
copy(dAtA[i:], m.Line)
i = encodeVarintPush(dAtA, i, uint64(len(m.Line)))
i--
dAtA[i] = 0x12
}
n7, err7 := stdTimeMarshalTo(m.Timestamp, dAtA[i-sizeOfStdTime(m.Timestamp):])
if err7 != nil {
return 0, err7
}
i -= n7
i = encodeVarintPush(dAtA, i, uint64(n7))
i--
dAtA[i] = 0xa
return len(dAtA) - i, nil
}
// Unmarshal unmarshals the given data into m.
func (m *Stream) Unmarshal(dAtA []byte) error {
l := len(dAtA)
iNdEx := 0
for iNdEx < l {
preIndex := iNdEx
var wire uint64
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
wire |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
fieldNum := int32(wire >> 3)
wireType := int(wire & 0x7)
if wireType == 4 {
return fmt.Errorf("proto: StreamAdapter: wiretype end group for non-group")
}
if fieldNum <= 0 {
return fmt.Errorf("proto: StreamAdapter: illegal tag %d (wire type %d)", fieldNum, wire)
}
switch fieldNum {
case 1:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Labels", wireType)
}
var stringLen uint64
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
stringLen |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
intStringLen := int(stringLen)
if intStringLen < 0 {
return ErrInvalidLengthPush
}
postIndex := iNdEx + intStringLen
if postIndex < 0 {
return ErrInvalidLengthPush
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Labels = string(dAtA[iNdEx:postIndex])
iNdEx = postIndex
case 2:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Entries", wireType)
}
var msglen int
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
msglen |= int(b&0x7F) << shift
if b < 0x80 {
break
}
}
if msglen < 0 {
return ErrInvalidLengthPush
}
postIndex := iNdEx + msglen
if postIndex < 0 {
return ErrInvalidLengthPush
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Entries = append(m.Entries, Entry{})
if err := m.Entries[len(m.Entries)-1].Unmarshal(dAtA[iNdEx:postIndex]); err != nil {
return err
}
iNdEx = postIndex
case 3:
if wireType != 0 {
return fmt.Errorf("proto: wrong wireType = %d for field Hash", wireType)
}
m.Hash = 0
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
m.Hash |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
default:
iNdEx = preIndex
skippy, err := skipPush(dAtA[iNdEx:])
if err != nil {
return err
}
if skippy < 0 {
return ErrInvalidLengthPush
}
if (iNdEx + skippy) < 0 {
return ErrInvalidLengthPush
}
if (iNdEx + skippy) > l {
return io.ErrUnexpectedEOF
}
iNdEx += skippy
}
}
if iNdEx > l {
return io.ErrUnexpectedEOF
}
return nil
}
// Unmarshal unmarshals the given data into m.
func (m *Entry) Unmarshal(dAtA []byte) error {
l := len(dAtA)
iNdEx := 0
for iNdEx < l {
preIndex := iNdEx
var wire uint64
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
wire |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
fieldNum := int32(wire >> 3)
wireType := int(wire & 0x7)
if wireType == 4 {
return fmt.Errorf("proto: EntryAdapter: wiretype end group for non-group")
}
if fieldNum <= 0 {
return fmt.Errorf("proto: EntryAdapter: illegal tag %d (wire type %d)", fieldNum, wire)
}
switch fieldNum {
case 1:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Timestamp", wireType)
}
var msglen int
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
msglen |= int(b&0x7F) << shift
if b < 0x80 {
break
}
}
if msglen < 0 {
return ErrInvalidLengthPush
}
postIndex := iNdEx + msglen
if postIndex < 0 {
return ErrInvalidLengthPush
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
if err := stdTimeUnmarshal(&m.Timestamp, dAtA[iNdEx:postIndex]); err != nil {
return err
}
iNdEx = postIndex
case 2:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Line", wireType)
}
var stringLen uint64
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowPush
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
stringLen |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
intStringLen := int(stringLen)
if intStringLen < 0 {
return ErrInvalidLengthPush
}
postIndex := iNdEx + intStringLen
if postIndex < 0 {
return ErrInvalidLengthPush
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Line = string(dAtA[iNdEx:postIndex])
iNdEx = postIndex
default:
iNdEx = preIndex
skippy, err := skipPush(dAtA[iNdEx:])
if err != nil {
return err
}
if skippy < 0 {
return ErrInvalidLengthPush
}
if (iNdEx + skippy) < 0 {
return ErrInvalidLengthPush
}
if (iNdEx + skippy) > l {
return io.ErrUnexpectedEOF
}
iNdEx += skippy
}
}
if iNdEx > l {
return io.ErrUnexpectedEOF
}
return nil
}
// Size returns the size of the serialized Stream.
func (m *Stream) Size() (n int) {
if m == nil {
return 0
}
var l int
_ = l
l = len(m.Labels)
if l > 0 {
n += 1 + l + sovPush(uint64(l))
}
if len(m.Entries) > 0 {
for _, e := range m.Entries {
l = e.Size()
n += 1 + l + sovPush(uint64(l))
}
}
if m.Hash != 0 {
n += 1 + sovPush(m.Hash)
}
return n
}
// Size returns the size of the serialized Entry
func (m *Entry) Size() (n int) {
if m == nil {
return 0
}
var l int
_ = l
l = sizeOfStdTime(m.Timestamp)
n += 1 + l + sovPush(uint64(l))
l = len(m.Line)
if l > 0 {
n += 1 + l + sovPush(uint64(l))
}
return n
}
// Equal returns true if the two Streams are equal.
func (m *Stream) Equal(that interface{}) bool {
if that == nil {
return m == nil
}
that1, ok := that.(*Stream)
if !ok {
that2, ok := that.(Stream)
if ok {
that1 = &that2
} else {
return false
}
}
if that1 == nil {
return m == nil
} else if m == nil {
return false
}
if m.Labels != that1.Labels {
return false
}
if len(m.Entries) != len(that1.Entries) {
return false
}
for i := range m.Entries {
if !m.Entries[i].Equal(that1.Entries[i]) {
return false
}
}
return m.Hash == that1.Hash
}
// Equal returns true if the two Entries are equal.
func (m *Entry) Equal(that interface{}) bool {
if that == nil {
return m == nil
}
that1, ok := that.(*Entry)
if !ok {
that2, ok := that.(Entry)
if ok {
that1 = &that2
} else {
return false
}
}
if that1 == nil {
return m == nil
} else if m == nil {
return false
}
if !m.Timestamp.Equal(that1.Timestamp) {
return false
}
if m.Line != that1.Line {
return false
}
return true
}

View File

@@ -1,43 +0,0 @@
package vlinsert
import (
"net/http"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/elasticsearch"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/jsonline"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlinsert/loki"
)
// Init initializes vlinsert
func Init() {
}
// Stop stops vlinsert
func Stop() {
}
// RequestHandler handles insert requests for VictoriaLogs
func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
path := r.URL.Path
if !strings.HasPrefix(path, "/insert/") {
// Skip requests, which do not start with /insert/, since these aren't our requests.
return false
}
path = strings.TrimPrefix(path, "/insert")
path = strings.ReplaceAll(path, "//", "/")
if path == "/jsonline" {
return jsonline.RequestHandler(w, r)
}
switch {
case strings.HasPrefix(path, "/elasticsearch/"):
path = strings.TrimPrefix(path, "/elasticsearch")
return elasticsearch.RequestHandler(path, w, r)
case strings.HasPrefix(path, "/loki/"):
path = strings.TrimPrefix(path, "/loki")
return loki.RequestHandler(path, w, r)
default:
return false
}
}

View File

@@ -1,56 +0,0 @@
package logsql
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
)
var (
maxSortBufferSize = flagutil.NewBytes("select.maxSortBufferSize", 1024*1024, "Query results from /select/logsql/query are automatically sorted by _time "+
"if their summary size doesn't exceed this value; otherwise, query results are streamed in the response without sorting; "+
"too big value for this flag may result in high memory usage since the sorting is performed in memory")
)
// ProcessQueryRequest handles /select/logsql/query request
func ProcessQueryRequest(w http.ResponseWriter, r *http.Request, stopCh <-chan struct{}) {
// Extract tenantID
tenantID, err := logstorage.GetTenantIDFromRequest(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return
}
qStr := r.FormValue("query")
q, err := logstorage.ParseQuery(qStr)
if err != nil {
httpserver.Errorf(w, r, "cannot parse query [%s]: %s", qStr, err)
return
}
w.Header().Set("Content-Type", "application/stream+json; charset=utf-8")
sw := getSortWriter()
sw.Init(w, maxSortBufferSize.IntN())
tenantIDs := []logstorage.TenantID{tenantID}
vlstorage.RunQuery(tenantIDs, q, stopCh, func(columns []logstorage.BlockColumn) {
if len(columns) == 0 {
return
}
rowsCount := len(columns[0].Values)
bb := blockResultPool.Get()
for rowIdx := 0; rowIdx < rowsCount; rowIdx++ {
WriteJSONRow(bb, columns, rowIdx)
}
sw.MustWrite(bb.B)
blockResultPool.Put(bb)
})
sw.FinalFlush()
putSortWriter(sw)
}
var blockResultPool bytesutil.ByteBufferPool

View File

@@ -1,41 +0,0 @@
{% import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
) %}
{% stripspace %}
// JSONRow creates JSON row from the given fields.
{% func JSONRow(columns []logstorage.BlockColumn, rowIdx int) %}
{
{% code c := &columns[0] %}
{%q= c.Name %}:{%q= c.Values[rowIdx] %}
{% code columns = columns[1:] %}
{% for colIdx := range columns %}
{% code c := &columns[colIdx] %}
,{%q= c.Name %}:{%q= c.Values[rowIdx] %}
{% endfor %}
}{% newline %}
{% endfunc %}
// JSONRows prints formatted rows
{% func JSONRows(rows [][]logstorage.Field) %}
{% if len(rows) == 0 %}
{% return %}
{% endif %}
{% for _, fields := range rows %}
{
{% if len(fields) > 0 %}
{% code
f := fields[0]
fields = fields[1:]
%}
{%q= f.Name %}:{%q= f.Value %}
{% for _, f := range fields %}
,{%q= f.Name %}:{%q= f.Value %}
{% endfor %}
{% endif %}
}{% newline %}
{% endfor %}
{% endfunc %}
{% endstripspace %}

View File

@@ -1,166 +0,0 @@
// Code generated by qtc from "query_response.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vlselect/logsql/query_response.qtpl:1
package logsql
//line app/vlselect/logsql/query_response.qtpl:1
import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
)
// JSONRow creates JSON row from the given fields.
//line app/vlselect/logsql/query_response.qtpl:8
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vlselect/logsql/query_response.qtpl:8
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vlselect/logsql/query_response.qtpl:8
func StreamJSONRow(qw422016 *qt422016.Writer, columns []logstorage.BlockColumn, rowIdx int) {
//line app/vlselect/logsql/query_response.qtpl:8
qw422016.N().S(`{`)
//line app/vlselect/logsql/query_response.qtpl:10
c := &columns[0]
//line app/vlselect/logsql/query_response.qtpl:11
qw422016.N().Q(c.Name)
//line app/vlselect/logsql/query_response.qtpl:11
qw422016.N().S(`:`)
//line app/vlselect/logsql/query_response.qtpl:11
qw422016.N().Q(c.Values[rowIdx])
//line app/vlselect/logsql/query_response.qtpl:12
columns = columns[1:]
//line app/vlselect/logsql/query_response.qtpl:13
for colIdx := range columns {
//line app/vlselect/logsql/query_response.qtpl:14
c := &columns[colIdx]
//line app/vlselect/logsql/query_response.qtpl:14
qw422016.N().S(`,`)
//line app/vlselect/logsql/query_response.qtpl:15
qw422016.N().Q(c.Name)
//line app/vlselect/logsql/query_response.qtpl:15
qw422016.N().S(`:`)
//line app/vlselect/logsql/query_response.qtpl:15
qw422016.N().Q(c.Values[rowIdx])
//line app/vlselect/logsql/query_response.qtpl:16
}
//line app/vlselect/logsql/query_response.qtpl:16
qw422016.N().S(`}`)
//line app/vlselect/logsql/query_response.qtpl:17
qw422016.N().S(`
`)
//line app/vlselect/logsql/query_response.qtpl:18
}
//line app/vlselect/logsql/query_response.qtpl:18
func WriteJSONRow(qq422016 qtio422016.Writer, columns []logstorage.BlockColumn, rowIdx int) {
//line app/vlselect/logsql/query_response.qtpl:18
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vlselect/logsql/query_response.qtpl:18
StreamJSONRow(qw422016, columns, rowIdx)
//line app/vlselect/logsql/query_response.qtpl:18
qt422016.ReleaseWriter(qw422016)
//line app/vlselect/logsql/query_response.qtpl:18
}
//line app/vlselect/logsql/query_response.qtpl:18
func JSONRow(columns []logstorage.BlockColumn, rowIdx int) string {
//line app/vlselect/logsql/query_response.qtpl:18
qb422016 := qt422016.AcquireByteBuffer()
//line app/vlselect/logsql/query_response.qtpl:18
WriteJSONRow(qb422016, columns, rowIdx)
//line app/vlselect/logsql/query_response.qtpl:18
qs422016 := string(qb422016.B)
//line app/vlselect/logsql/query_response.qtpl:18
qt422016.ReleaseByteBuffer(qb422016)
//line app/vlselect/logsql/query_response.qtpl:18
return qs422016
//line app/vlselect/logsql/query_response.qtpl:18
}
// JSONRows prints formatted rows
//line app/vlselect/logsql/query_response.qtpl:21
func StreamJSONRows(qw422016 *qt422016.Writer, rows [][]logstorage.Field) {
//line app/vlselect/logsql/query_response.qtpl:22
if len(rows) == 0 {
//line app/vlselect/logsql/query_response.qtpl:23
return
//line app/vlselect/logsql/query_response.qtpl:24
}
//line app/vlselect/logsql/query_response.qtpl:25
for _, fields := range rows {
//line app/vlselect/logsql/query_response.qtpl:25
qw422016.N().S(`{`)
//line app/vlselect/logsql/query_response.qtpl:27
if len(fields) > 0 {
//line app/vlselect/logsql/query_response.qtpl:29
f := fields[0]
fields = fields[1:]
//line app/vlselect/logsql/query_response.qtpl:32
qw422016.N().Q(f.Name)
//line app/vlselect/logsql/query_response.qtpl:32
qw422016.N().S(`:`)
//line app/vlselect/logsql/query_response.qtpl:32
qw422016.N().Q(f.Value)
//line app/vlselect/logsql/query_response.qtpl:33
for _, f := range fields {
//line app/vlselect/logsql/query_response.qtpl:33
qw422016.N().S(`,`)
//line app/vlselect/logsql/query_response.qtpl:34
qw422016.N().Q(f.Name)
//line app/vlselect/logsql/query_response.qtpl:34
qw422016.N().S(`:`)
//line app/vlselect/logsql/query_response.qtpl:34
qw422016.N().Q(f.Value)
//line app/vlselect/logsql/query_response.qtpl:35
}
//line app/vlselect/logsql/query_response.qtpl:36
}
//line app/vlselect/logsql/query_response.qtpl:36
qw422016.N().S(`}`)
//line app/vlselect/logsql/query_response.qtpl:37
qw422016.N().S(`
`)
//line app/vlselect/logsql/query_response.qtpl:38
}
//line app/vlselect/logsql/query_response.qtpl:39
}
//line app/vlselect/logsql/query_response.qtpl:39
func WriteJSONRows(qq422016 qtio422016.Writer, rows [][]logstorage.Field) {
//line app/vlselect/logsql/query_response.qtpl:39
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vlselect/logsql/query_response.qtpl:39
StreamJSONRows(qw422016, rows)
//line app/vlselect/logsql/query_response.qtpl:39
qt422016.ReleaseWriter(qw422016)
//line app/vlselect/logsql/query_response.qtpl:39
}
//line app/vlselect/logsql/query_response.qtpl:39
func JSONRows(rows [][]logstorage.Field) string {
//line app/vlselect/logsql/query_response.qtpl:39
qb422016 := qt422016.AcquireByteBuffer()
//line app/vlselect/logsql/query_response.qtpl:39
WriteJSONRows(qb422016, rows)
//line app/vlselect/logsql/query_response.qtpl:39
qs422016 := string(qb422016.B)
//line app/vlselect/logsql/query_response.qtpl:39
qt422016.ReleaseByteBuffer(qb422016)
//line app/vlselect/logsql/query_response.qtpl:39
return qs422016
//line app/vlselect/logsql/query_response.qtpl:39
}

View File

@@ -1,225 +0,0 @@
package logsql
import (
"bytes"
"io"
"sort"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logjson"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
)
func getSortWriter() *sortWriter {
v := sortWriterPool.Get()
if v == nil {
return &sortWriter{}
}
return v.(*sortWriter)
}
func putSortWriter(sw *sortWriter) {
sw.reset()
sortWriterPool.Put(sw)
}
var sortWriterPool sync.Pool
// sortWriter expects JSON line stream to be written to it.
//
// It buffers the incoming data until its size reaches maxBufLen.
// Then it streams the buffered data and all the incoming data to w.
//
// The FinalFlush() must be called when all the data is written.
// If the buf isn't empty at FinalFlush() call, then the buffered data
// is sorted by _time field.
type sortWriter struct {
mu sync.Mutex
w io.Writer
maxBufLen int
buf []byte
bufFlushed bool
hasErr bool
}
func (sw *sortWriter) reset() {
sw.w = nil
sw.maxBufLen = 0
sw.buf = sw.buf[:0]
sw.bufFlushed = false
sw.hasErr = false
}
func (sw *sortWriter) Init(w io.Writer, maxBufLen int) {
sw.reset()
sw.w = w
sw.maxBufLen = maxBufLen
}
func (sw *sortWriter) MustWrite(p []byte) {
sw.mu.Lock()
defer sw.mu.Unlock()
if sw.hasErr {
return
}
if sw.bufFlushed {
if _, err := sw.w.Write(p); err != nil {
sw.hasErr = true
}
return
}
if len(sw.buf)+len(p) < sw.maxBufLen {
sw.buf = append(sw.buf, p...)
return
}
sw.bufFlushed = true
if len(sw.buf) > 0 {
if _, err := sw.w.Write(sw.buf); err != nil {
sw.hasErr = true
return
}
sw.buf = sw.buf[:0]
}
if _, err := sw.w.Write(p); err != nil {
sw.hasErr = true
}
}
func (sw *sortWriter) FinalFlush() {
if sw.hasErr || sw.bufFlushed {
return
}
rs := getRowsSorter()
rs.parseRows(sw.buf)
rs.sort()
WriteJSONRows(sw.w, rs.rows)
putRowsSorter(rs)
}
func getRowsSorter() *rowsSorter {
v := rowsSorterPool.Get()
if v == nil {
return &rowsSorter{}
}
return v.(*rowsSorter)
}
func putRowsSorter(rs *rowsSorter) {
rs.reset()
rowsSorterPool.Put(rs)
}
var rowsSorterPool sync.Pool
type rowsSorter struct {
buf []byte
fieldsBuf []logstorage.Field
rows [][]logstorage.Field
times []string
}
func (rs *rowsSorter) reset() {
rs.buf = rs.buf[:0]
fieldsBuf := rs.fieldsBuf
for i := range fieldsBuf {
fieldsBuf[i].Reset()
}
rs.fieldsBuf = fieldsBuf[:0]
rows := rs.rows
for i := range rows {
rows[i] = nil
}
rs.rows = rows[:0]
times := rs.times
for i := range times {
times[i] = ""
}
rs.times = times[:0]
}
func (rs *rowsSorter) parseRows(src []byte) {
rs.reset()
buf := rs.buf
fieldsBuf := rs.fieldsBuf
rows := rs.rows
times := rs.times
p := logjson.GetParser()
for len(src) > 0 {
var line []byte
n := bytes.IndexByte(src, '\n')
if n < 0 {
line = src
src = nil
} else {
line = src[:n]
src = src[n+1:]
}
if len(line) == 0 {
continue
}
if err := p.ParseLogMessage(line); err != nil {
logger.Panicf("BUG: unexpected invalid JSON line: %s", err)
}
timeValue := ""
fieldsBufLen := len(fieldsBuf)
for _, f := range p.Fields {
bufLen := len(buf)
buf = append(buf, f.Name...)
name := bytesutil.ToUnsafeString(buf[bufLen:])
bufLen = len(buf)
buf = append(buf, f.Value...)
value := bytesutil.ToUnsafeString(buf[bufLen:])
fieldsBuf = append(fieldsBuf, logstorage.Field{
Name: name,
Value: value,
})
if name == "_time" {
timeValue = value
}
}
rows = append(rows, fieldsBuf[fieldsBufLen:])
times = append(times, timeValue)
}
logjson.PutParser(p)
rs.buf = buf
rs.fieldsBuf = fieldsBuf
rs.rows = rows
rs.times = times
}
func (rs *rowsSorter) Len() int {
return len(rs.rows)
}
func (rs *rowsSorter) Less(i, j int) bool {
times := rs.times
return times[i] < times[j]
}
func (rs *rowsSorter) Swap(i, j int) {
times := rs.times
rows := rs.rows
times[i], times[j] = times[j], times[i]
rows[i], rows[j] = rows[j], rows[i]
}
func (rs *rowsSorter) sort() {
sort.Sort(rs)
}

View File

@@ -1,39 +0,0 @@
package logsql
import (
"bytes"
"strings"
"testing"
)
func TestSortWriter(t *testing.T) {
f := func(maxBufLen int, data string, expectedResult string) {
t.Helper()
var bb bytes.Buffer
sw := getSortWriter()
sw.Init(&bb, maxBufLen)
for _, s := range strings.Split(data, "\n") {
sw.MustWrite([]byte(s + "\n"))
}
sw.FinalFlush()
putSortWriter(sw)
result := bb.String()
if result != expectedResult {
t.Fatalf("unexpected result;\ngot\n%s\nwant\n%s", result, expectedResult)
}
}
f(100, "", "")
f(100, "{}", "{}\n")
data := `{"_time":"def","_msg":"xxx"}
{"_time":"abc","_msg":"foo"}`
resultExpected := `{"_time":"abc","_msg":"foo"}
{"_time":"def","_msg":"xxx"}
`
f(100, data, resultExpected)
f(10, data, data+"\n")
}

View File

@@ -1,162 +0,0 @@
package vlselect
import (
"embed"
"flag"
"fmt"
"net/http"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlselect/logsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timerpool"
"github.com/VictoriaMetrics/metrics"
)
var (
maxConcurrentRequests = flag.Int("search.maxConcurrentRequests", getDefaultMaxConcurrentRequests(), "The maximum number of concurrent search requests. "+
"It shouldn't be high, since a single request can saturate all the CPU cores, while many concurrently executed requests may require high amounts of memory. "+
"See also -search.maxQueueDuration")
maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the search request waits for execution when -search.maxConcurrentRequests "+
"limit is reached; see also -search.maxQueryDuration")
maxQueryDuration = flag.Duration("search.maxQueryDuration", time.Second*30, "The maximum duration for query execution")
)
func getDefaultMaxConcurrentRequests() int {
n := cgroup.AvailableCPUs()
if n <= 4 {
n *= 2
}
if n > 16 {
// A single request can saturate all the CPU cores, so there is no sense
// in allowing higher number of concurrent requests - they will just contend
// for unavailable CPU time.
n = 16
}
return n
}
// Init initializes vlselect
func Init() {
concurrencyLimitCh = make(chan struct{}, *maxConcurrentRequests)
}
// Stop stops vlselect
func Stop() {
}
var concurrencyLimitCh chan struct{}
var (
concurrencyLimitReached = metrics.NewCounter(`vl_concurrent_select_limit_reached_total`)
concurrencyLimitTimeout = metrics.NewCounter(`vl_concurrent_select_limit_timeout_total`)
_ = metrics.NewGauge(`vl_concurrent_select_capacity`, func() float64 {
return float64(cap(concurrencyLimitCh))
})
_ = metrics.NewGauge(`vl_concurrent_select_current`, func() float64 {
return float64(len(concurrencyLimitCh))
})
)
//go:embed vmui
var vmuiFiles embed.FS
var vmuiFileServer = http.FileServer(http.FS(vmuiFiles))
// RequestHandler handles select requests for VictoriaLogs
func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
path := r.URL.Path
if !strings.HasPrefix(path, "/select/") {
// Skip requests, which do not start with /select/, since these aren't our requests.
return false
}
path = strings.TrimPrefix(path, "/select")
path = strings.ReplaceAll(path, "//", "/")
if path == "/vmui" {
// VMUI access via incomplete url without `/` in the end. Redirect to complete url.
// Use relative redirect, since the hostname and path prefix may be incorrect if VictoriaMetrics
// is hidden behind vmauth or similar proxy.
_ = r.ParseForm()
newURL := "vmui/?" + r.Form.Encode()
httpserver.Redirect(w, newURL)
return true
}
if strings.HasPrefix(path, "/vmui/") {
r.URL.Path = path
vmuiFileServer.ServeHTTP(w, r)
return true
}
// Limit the number of concurrent queries, which can consume big amounts of CPU.
startTime := time.Now()
stopCh := r.Context().Done()
select {
case concurrencyLimitCh <- struct{}{}:
defer func() { <-concurrencyLimitCh }()
default:
// Sleep for a while until giving up. This should resolve short bursts in requests.
concurrencyLimitReached.Inc()
d := getMaxQueryDuration(r)
if d > *maxQueueDuration {
d = *maxQueueDuration
}
t := timerpool.Get(d)
select {
case concurrencyLimitCh <- struct{}{}:
timerpool.Put(t)
defer func() { <-concurrencyLimitCh }()
case <-stopCh:
timerpool.Put(t)
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
requestURI := httpserver.GetRequestURI(r)
logger.Infof("client has cancelled the request after %.3f seconds: remoteAddr=%s, requestURI: %q",
time.Since(startTime).Seconds(), remoteAddr, requestURI)
return true
case <-t.C:
timerpool.Put(t)
concurrencyLimitTimeout.Inc()
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("couldn't start executing the request in %.3f seconds, since -search.maxConcurrentRequests=%d concurrent requests "+
"are executed. Possible solutions: to reduce query load; to add more compute resources to the server; "+
"to increase -search.maxQueueDuration=%s; to increase -search.maxQueryDuration; to increase -search.maxConcurrentRequests",
d.Seconds(), *maxConcurrentRequests, maxQueueDuration),
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, r, "%s", err)
return true
}
}
switch {
case path == "/logsql/query":
logsqlQueryRequests.Inc()
httpserver.EnableCORS(w, r)
logsql.ProcessQueryRequest(w, r, stopCh)
return true
default:
return false
}
}
// getMaxQueryDuration returns the maximum duration for query from r.
func getMaxQueryDuration(r *http.Request) time.Duration {
dms, err := httputils.GetDuration(r, "timeout", 0)
if err != nil {
dms = 0
}
d := time.Duration(dms) * time.Millisecond
if d <= 0 || d > *maxQueryDuration {
d = *maxQueryDuration
}
return d
}
var (
logsqlQueryRequests = metrics.NewCounter(`vl_http_requests_total{path="/select/logsql/query"}`)
)

Binary file not shown.

Before

Width:  |  Height:  |  Size: 14 KiB

View File

@@ -1,14 +0,0 @@
{
"files": {
"main.css": "./static/css/main.5f461a27.css",
"main.js": "./static/js/main.7566144a.js",
"static/js/522.b5ae4365.chunk.js": "./static/js/522.b5ae4365.chunk.js",
"static/media/Lato-Regular.ttf": "./static/media/Lato-Regular.d714fec1633b69a9c2e9.ttf",
"static/media/Lato-Bold.ttf": "./static/media/Lato-Bold.32360ba4b57802daa4d6.ttf",
"index.html": "./index.html"
},
"entrypoints": [
"static/css/main.5f461a27.css",
"static/js/main.7566144a.js"
]
}

Binary file not shown.

Before

Width:  |  Height:  |  Size: 1.6 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 15 KiB

View File

@@ -1 +0,0 @@
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1,user-scalable=no"/><meta name="theme-color" content="#000000"/><meta name="description" content="UI for VictoriaMetrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><script src="./dashboards/index.js" type="module"></script><meta name="twitter:card" content="summary_large_image"><meta name="twitter:image" content="./preview.jpg"><meta name="twitter:title" content="UI for VictoriaMetrics"><meta name="twitter:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta name="twitter:site" content="@VictoriaMetrics"><meta property="og:title" content="Metric explorer for VictoriaMetrics"><meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta property="og:image" content="./preview.jpg"><meta property="og:type" content="website"><script defer="defer" src="./static/js/main.7566144a.js"></script><link href="./static/css/main.5f461a27.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>

View File

@@ -1,20 +0,0 @@
{
"short_name": "Victoria Metrics UI",
"name": "Victoria Metrics UI is a metric explorer for Victoria Metrics",
"icons": [
{
"src": "favicon-32x32.png",
"sizes": "32x32",
"type": "image/png"
},
{
"src": "apple-touch-icon.png",
"type": "image/png",
"sizes": "192x192"
}
],
"start_url": ".",
"display": "standalone",
"theme_color": "#000000",
"background_color": "#ffffff"
}

Binary file not shown.

Before

Width:  |  Height:  |  Size: 67 KiB

View File

@@ -1,3 +0,0 @@
# https://www.robotstxt.org/robotstxt.html
User-agent: *
Disallow:

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -1,40 +0,0 @@
/*!
Copyright (c) 2018 Jed Watson.
Licensed under the MIT License (MIT), see
http://jedwatson.github.io/classnames
*/
/*! regenerator-runtime -- Copyright (c) 2014-present, Facebook, Inc. -- license (MIT): https://github.com/facebook/regenerator/blob/main/LICENSE */
/**
* @remix-run/router v1.7.2
*
* Copyright (c) Remix Software Inc.
*
* This source code is licensed under the MIT license found in the
* LICENSE.md file in the root directory of this source tree.
*
* @license MIT
*/
/**
* React Router DOM v6.14.2
*
* Copyright (c) Remix Software Inc.
*
* This source code is licensed under the MIT license found in the
* LICENSE.md file in the root directory of this source tree.
*
* @license MIT
*/
/**
* React Router v6.14.2
*
* Copyright (c) Remix Software Inc.
*
* This source code is licensed under the MIT license found in the
* LICENSE.md file in the root directory of this source tree.
*
* @license MIT
*/

View File

@@ -1,189 +0,0 @@
package vlstorage
import (
"flag"
"fmt"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
"github.com/VictoriaMetrics/metrics"
)
var (
retentionPeriod = flagutil.NewDuration("retentionPeriod", "7d", "Log entries with timestamps older than now-retentionPeriod are automatically deleted; "+
"log entries with timestamps outside the retention are also rejected during data ingestion; the minimum supported retention is 1d (one day); "+
"see https://docs.victoriametrics.com/VictoriaLogs/#retention")
futureRetention = flagutil.NewDuration("futureRetention", "2d", "Log entries with timestamps bigger than now+futureRetention are rejected during data ingestion; "+
"see https://docs.victoriametrics.com/VictoriaLogs/#retention")
storageDataPath = flag.String("storageDataPath", "victoria-logs-data", "Path to directory with the VictoriaLogs data; "+
"see https://docs.victoriametrics.com/VictoriaLogs/#storage")
inmemoryDataFlushInterval = flag.Duration("inmemoryDataFlushInterval", 5*time.Second, "The interval for guaranteed saving of in-memory data to disk. "+
"The saved data survives unclean shutdowns such as OOM crash, hardware reset, SIGKILL, etc. "+
"Bigger intervals may help increase the lifetime of flash storage with limited write cycles (e.g. Raspberry PI). "+
"Smaller intervals increase disk IO load. Minimum supported value is 1s")
logNewStreams = flag.Bool("logNewStreams", false, "Whether to log creation of new streams; this can be useful for debugging of high cardinality issues with log streams; "+
"see https://docs.victoriametrics.com/VictoriaLogs/keyConcepts.html#stream-fields ; see also -logIngestedRows")
logIngestedRows = flag.Bool("logIngestedRows", false, "Whether to log all the ingested log entries; this can be useful for debugging of data ingestion; "+
"see https://docs.victoriametrics.com/VictoriaLogs/data-ingestion/ ; see also -logNewStreams")
)
// Init initializes vlstorage.
//
// Stop must be called when vlstorage is no longer needed
func Init() {
if strg != nil {
logger.Panicf("BUG: Init() has been already called")
}
if retentionPeriod.Msecs < 24*3600*1000 {
logger.Fatalf("-retentionPeriod cannot be smaller than a day; got %s", retentionPeriod)
}
cfg := &logstorage.StorageConfig{
Retention: time.Millisecond * time.Duration(retentionPeriod.Msecs),
FlushInterval: *inmemoryDataFlushInterval,
FutureRetention: time.Millisecond * time.Duration(futureRetention.Msecs),
LogNewStreams: *logNewStreams,
LogIngestedRows: *logIngestedRows,
}
logger.Infof("opening storage at -storageDataPath=%s", *storageDataPath)
startTime := time.Now()
strg = logstorage.MustOpenStorage(*storageDataPath, cfg)
var ss logstorage.StorageStats
strg.UpdateStats(&ss)
logger.Infof("successfully opened storage in %.3f seconds; partsCount: %d; blocksCount: %d; rowsCount: %d; sizeBytes: %d",
time.Since(startTime).Seconds(), ss.FileParts, ss.FileBlocks, ss.FileRowsCount, ss.CompressedFileSize)
storageMetrics = initStorageMetrics(strg)
metrics.RegisterSet(storageMetrics)
}
// Stop stops vlstorage.
func Stop() {
metrics.UnregisterSet(storageMetrics)
storageMetrics = nil
strg.MustClose()
strg = nil
}
var strg *logstorage.Storage
var storageMetrics *metrics.Set
// MustAddRows adds lr to vlstorage
func MustAddRows(lr *logstorage.LogRows) {
strg.MustAddRows(lr)
}
// RunQuery runs the given q and calls processBlock for the returned data blocks
func RunQuery(tenantIDs []logstorage.TenantID, q *logstorage.Query, stopCh <-chan struct{}, processBlock func(columns []logstorage.BlockColumn)) {
strg.RunQuery(tenantIDs, q, stopCh, processBlock)
}
func initStorageMetrics(strg *logstorage.Storage) *metrics.Set {
ssCache := &logstorage.StorageStats{}
var ssCacheLock sync.Mutex
var lastUpdateTime time.Time
m := func() *logstorage.StorageStats {
ssCacheLock.Lock()
defer ssCacheLock.Unlock()
if time.Since(lastUpdateTime) < time.Second {
return ssCache
}
var ss logstorage.StorageStats
strg.UpdateStats(&ss)
ssCache = &ss
lastUpdateTime = time.Now()
return ssCache
}
ms := metrics.NewSet()
ms.NewGauge(fmt.Sprintf(`vl_free_disk_space_bytes{path=%q}`, *storageDataPath), func() float64 {
return float64(fs.MustGetFreeSpace(*storageDataPath))
})
ms.NewGauge(`vl_active_merges{type="inmemory"}`, func() float64 {
return float64(m().InmemoryActiveMerges)
})
ms.NewGauge(`vl_merges_total{type="inmemory"}`, func() float64 {
return float64(m().InmemoryMergesTotal)
})
ms.NewGauge(`vl_active_merges{type="file"}`, func() float64 {
return float64(m().FileActiveMerges)
})
ms.NewGauge(`vl_merges_total{type="file"}`, func() float64 {
return float64(m().FileMergesTotal)
})
ms.NewGauge(`vl_storage_rows{type="inmemory"}`, func() float64 {
return float64(m().InmemoryRowsCount)
})
ms.NewGauge(`vl_storage_rows{type="file"}`, func() float64 {
return float64(m().FileRowsCount)
})
ms.NewGauge(`vl_storage_parts{type="inmemory"}`, func() float64 {
return float64(m().InmemoryParts)
})
ms.NewGauge(`vl_storage_parts{type="file"}`, func() float64 {
return float64(m().FileParts)
})
ms.NewGauge(`vl_storage_blocks{type="inmemory"}`, func() float64 {
return float64(m().InmemoryBlocks)
})
ms.NewGauge(`vl_storage_blocks{type="file"}`, func() float64 {
return float64(m().FileBlocks)
})
ms.NewGauge(`vl_partitions`, func() float64 {
return float64(m().PartitionsCount)
})
ms.NewGauge(`vl_streams_created_total`, func() float64 {
return float64(m().StreamsCreatedTotal)
})
ms.NewGauge(`vl_indexdb_rows`, func() float64 {
return float64(m().IndexdbItemsCount)
})
ms.NewGauge(`vl_indexdb_parts`, func() float64 {
return float64(m().IndexdbPartsCount)
})
ms.NewGauge(`vl_indexdb_blocks`, func() float64 {
return float64(m().IndexdbBlocksCount)
})
ms.NewGauge(`vl_data_size_bytes{type="indexdb"}`, func() float64 {
return float64(m().IndexdbSizeBytes)
})
ms.NewGauge(`vl_data_size_bytes{type="storage"}`, func() float64 {
dm := m()
return float64(dm.CompressedInmemorySize + dm.CompressedFileSize)
})
ms.NewGauge(`vl_compressed_data_size_bytes{type="inmemory"}`, func() float64 {
return float64(m().CompressedInmemorySize)
})
ms.NewGauge(`vl_compressed_data_size_bytes{type="file"}`, func() float64 {
return float64(m().CompressedFileSize)
})
ms.NewGauge(`vl_uncompressed_data_size_bytes{type="inmemory"}`, func() float64 {
return float64(m().UncompressedInmemorySize)
})
ms.NewGauge(`vl_uncompressed_data_size_bytes{type="file"}`, func() float64 {
return float64(m().UncompressedFileSize)
})
ms.NewGauge(`vl_rows_dropped_total{reason="too_big_timestamp"}`, func() float64 {
return float64(m().RowsDroppedTooBigTimestamp)
})
ms.NewGauge(`vl_rows_dropped_total{reason="too_small_timestamp"}`, func() float64 {
return float64(m().RowsDroppedTooSmallTimestamp)
})
return ms
}

View File

@@ -85,9 +85,6 @@ vmagent-linux-arm64:
vmagent-linux-ppc64le:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
vmagent-linux-s390x:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=linux GOARCH=s390x $(MAKE) app-local-goos-goarch
vmagent-linux-386:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch

View File

@@ -13,7 +13,7 @@ See [Quick Start](#quick-start) for details.
While VictoriaMetrics provides an efficient solution to store and observe metrics, our users needed something fast
and RAM friendly to scrape metrics from Prometheus-compatible exporters into VictoriaMetrics.
Also, we found that our user's infrastructure are like snowflakes in that no two are alike. Therefore, we decided to add more flexibility
Also, we found that our user's infrastructure are like snowflakes in that no two are alike. Therefore we decided to add more flexibility
to `vmagent` such as the ability to [accept metrics via popular push protocols](#how-to-push-data-to-vmagent)
additionally to [discovering Prometheus-compatible targets and scraping metrics from them](#how-to-collect-metrics-in-prometheus-format).
@@ -25,9 +25,7 @@ additionally to [discovering Prometheus-compatible targets and scraping metrics
* Can add, remove and modify labels (aka tags) via Prometheus relabeling. Can filter data before sending it to remote storage. See [these docs](#relabeling) for details.
* Can accept data via all the ingestion protocols supported by VictoriaMetrics - see [these docs](#how-to-push-data-to-vmagent).
* Can aggregate incoming samples by time and by labels before sending them to remote storage - see [these docs](https://docs.victoriametrics.com/stream-aggregation.html).
* Can replicate collected metrics simultaneously to multiple Prometheus-compatible remote storage systems - see [these docs](#replication-and-high-availability).
* Can save egress network bandwidth usage costs when [VictoriaMetrics remote write protocol](#victoriametrics-remote-write-protocol)
is used for sending the data to VictoriaMetrics.
* Can replicate collected metrics simultaneously to multiple remote storage systems - see [these docs](#replication-and-high-availability).
* Works smoothly in environments with unstable connections to remote storage. If the remote storage is unavailable, the collected metrics
are buffered at `-remoteWrite.tmpDataPath`. The buffered metrics are sent to remote storage as soon as the connection
to the remote storage is repaired. The maximum disk usage for the buffer can be limited with `-remoteWrite.maxDiskUsagePerURL`.
@@ -47,42 +45,38 @@ additionally to [discovering Prometheus-compatible targets and scraping metrics
Please download `vmutils-*` archive from [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) (
`vmagent` is also available in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags)),
unpack it and pass the following flags to the `vmagent` binary in order to start scraping Prometheus-compatible targets
and sending the data to the Prometheus-compatible remote storage:
unpack it and pass the following flags to the `vmagent` binary in order to start scraping Prometheus-compatible targets:
* `-promscrape.config` with the path to [Prometheus config file](https://docs.victoriametrics.com/sd_configs.html) (usually located at `/etc/prometheus/prometheus.yml`).
* `-promscrape.config` with the path to Prometheus config file (usually located at `/etc/prometheus/prometheus.yml`).
The path can point either to local file or to http url. `vmagent` doesn't support some sections of Prometheus config file,
so you may need either to delete these sections or to run `vmagent` with `-promscrape.config.strictParse=false` command-line flag.
In this case `vmagent` ignores unsupported sections. See [the list of unsupported sections](#unsupported-prometheus-config-sections).
* `-remoteWrite.url` with Prometheus-compatible remote storage endpoint such as VictoriaMetrics, where to send the data to.
* `-remoteWrite.url` with the remote storage endpoint such as VictoriaMetrics, the `-remoteWrite.url` argument can be specified
multiple times to replicate data concurrently to an arbitrary number of remote storage systems. See [various use cases](#use-cases).
Example command for writing the data received via [supported push-based protocols](#how-to-push-data-to-vmagent)
to [single-node VictoriaMetrics](https://docs.victoriametrics.com/) located at `victoria-metrics-host:8428`:
```console
/path/to/vmagent -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
```
See [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format) if you need writing
the data to [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html).
Example command for scraping Prometheus targets and writing the data to single-node VictoriaMetrics:
Example command line:
```console
/path/to/vmagent -promscrape.config=/path/to/prometheus.yml -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
```
Example of scrape configuration for `-promscrape.config` argument you can find [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/prometheus.yml)
See [how to scrape Prometheus-compatible targets](#how-to-collect-metrics-in-prometheus-format) for more details.
If you use single-node VictoriaMetrics, then you can discover and scrape Prometheus-compatible targets directly from VictoriaMetrics
without the need to use `vmagent` - see [these docs](https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter).
`vmagent` can save network bandwidth usage costs under high load when [VictoriaMetrics remote write protocol is used](#victoriametrics-remote-write-protocol).
If you don't need to scrape Prometheus-compatible targets, then the `-promscrape.config` option isn't needed.
For example, the following command is sufficient for accepting data via [supported push-based protocols](#how-to-push-data-to-vmagent)
and sending it to the provided `-remoteWrite.url`:
```console
/path/to/vmagent -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
```
See [troubleshooting docs](#troubleshooting) if you encounter common issues with `vmagent`.
See [various use cases](#use-cases) for vmagent.
Pass `-help` to `vmagent` in order to see [the full list of supported command-line flags with their descriptions](#advanced-usage).
## How to push data to vmagent
@@ -93,7 +87,6 @@ additionally to pull-based Prometheus-compatible targets' scraping:
* DataDog "submit metrics" API. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-datadog-agent).
* InfluxDB line protocol via `http://<vmagent>:8429/write`. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf).
* Graphite plaintext protocol if `-graphiteListenAddr` command-line flag is set. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-graphite-compatible-agents-such-as-statsd).
* OpenTelemetry http API. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#sending-data-via-opentelemetry).
* OpenTSDB telnet and http protocols if `-opentsdbListenAddr` command-line flag is set. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-opentsdb-compatible-agents).
* Prometheus remote write protocol via `http://<vmagent>:8429/api/v1/write`.
* JSON lines import protocol via `http://<vmagent>:8429/api/v1/import`. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-json-line-format).
@@ -105,7 +98,7 @@ additionally to pull-based Prometheus-compatible targets' scraping:
`vmagent` should be restarted in order to update config options set via command-line args.
`vmagent` supports multiple approaches for reloading configs from updated config files such as
`-promscrape.config`, `-remoteWrite.relabelConfig`, `-remoteWrite.urlRelabelConfig` and `-remoteWrite.streamAggr.config`:
`-promscrape.config`, `-remoteWrite.relabelConfig` and `-remoteWrite.urlRelabelConfig`:
* Sending `SIGHUP` signal to `vmagent` process:
@@ -127,8 +120,7 @@ data to the remote storage. It re-tries sending the data to remote storage until
The maximum on-disk size for the buffered metrics can be limited with `-remoteWrite.maxDiskUsagePerURL`.
`vmagent` works on various architectures from the IoT world - 32-bit arm, 64-bit arm, ppc64, 386, amd64.
The `vmagent` can save network bandwidth usage costs by using [VictoriaMetrics remote write protocol](#victoriametrics-remote-write-protocol).
See [the corresponding Makefile rules](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/Makefile) for details.
### Drop-in replacement for Prometheus
@@ -152,28 +144,9 @@ to other remote storage systems, which support Prometheus `remote_write` protoco
`vmagent` replicates the collected metrics among multiple remote storage instances configured via `-remoteWrite.url` args.
If a single remote storage instance temporarily is out of service, then the collected data remains available in another remote storage instance.
`vmagent` buffers the collected data in files at `-remoteWrite.tmpDataPath` until the remote storage becomes available again,
`vmagent` buffers the collected data in files at `-remoteWrite.tmpDataPath` until the remote storage becomes available again
and then it sends the buffered data to the remote storage in order to prevent data gaps.
[VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) already supports replication,
so there is no need in specifying multiple `-remoteWrite.url` flags when writing data to the same cluster.
See [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#replication-and-data-safety).
### Sharding among remote storages
By default `vmagent` replicates data among remote storage systems enumerated via `-remoteWrite.url` command-line flag.
If the `-remoteWrite.shardByURL` command-line flag is set, then `vmagent` spreads evenly
the outgoing [time series](https://docs.victoriametrics.com/keyConcepts.html#time-series)
among all the remote storage systems enumerated via `-remoteWrite.url`. Note that samples for the same
time series are routed to the same remote storage system if `-remoteWrite.shardByURL` flag is specified.
This allows building scalable data processing pipelines when a single remote storage cannot keep up with the data ingestion workload.
For example, this allows building horizontally scalable [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html)
by routing outgoing samples for the same time series of [counter](https://docs.victoriametrics.com/keyConcepts.html#counter)
and [histogram](https://docs.victoriametrics.com/keyConcepts.html#histogram) types from top-level `vmagent` instances
to the same second-level `vmagent` instance, so they are aggregated properly.
See also [how to scrape big number of targets](#scraping-big-number-of-targets).
### Relabeling and filtering
`vmagent` can add, remove or update labels on the collected data before sending it to the remote storage. Additionally,
@@ -185,7 +158,7 @@ Please see [these docs](#relabeling) for details.
`vmagent` supports splitting the collected data between multiple destinations with the help of `-remoteWrite.urlRelabelConfig`,
which is applied independently for each configured `-remoteWrite.url` destination. For example, it is possible to replicate or split
data among long-term remote storage, short-term remote storage and a real-time analytical system [built on top of Kafka](https://github.com/Telefonica/prometheus-kafka-adapter).
Note that each destination can receive its own subset of the collected data due to per-destination relabeling via `-remoteWrite.urlRelabelConfig`.
Note that each destination can receive it's own subset of the collected data due to per-destination relabeling via `-remoteWrite.urlRelabelConfig`.
### Prometheus remote_write proxy
@@ -197,37 +170,11 @@ Also, Basic Auth can be enabled for the incoming `remote_write` requests with `-
### remote_write for clustered version
While `vmagent` can accept data in several supported protocols (OpenTSDB, Influx, Prometheus, Graphite) and scrape data from various targets,
writes are always performed in Prometheus remote_write protocol. Therefore, for the [clustered version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html),
writes are always performed in Promethes remote_write protocol. Therefore for the [clustered version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html),
the `-remoteWrite.url` command-line flag should be configured as `<schema>://<vminsert-host>:8480/insert/<accountID>/prometheus/api/v1/write`
according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format).
There is also support for multitenant writes. See [these docs](#multitenancy).
## VictoriaMetrics remote write protocol
`vmagent` supports sending data to the configured `-remoteWrite.url` either via Prometheus remote write protocol
or via VictoriaMetrics remote write protocol.
VictoriaMetrics remote write protocol provides the following benefits comparing to Prometheus remote write protocol:
- Reduced network bandwidth usage by 2x-5x. This allows saving network bandwidth usage costs when `vmagent` and
the configured remote storage systems are located in different datacenters, availability zones or regions.
- Reduced disk read/write IO and disk space usage at `vmagent` when the remote storage is temporarily unavailable.
In this case `vmagent` buffers the incoming data to disk using the VictoriaMetrics remote write format.
This reduces disk read/write IO and disk space usage by 2x-5x comparing to Prometheus remote write format.
`vmagent` automatically switches to VictoriaMetrics remote write protocol when it sends data to VictoriaMetrics components such as other `vmagent` instances,
[single-node VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html)
or `vminsert` at [cluster version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html).
It is possible to force switch to VictoriaMetrics remote write protocol by specifying `-remoteWrite.forceVMProto`
command-line flag for the corresponding `-remoteWrite.url`.
It is possible to tune the compression level for VictoriaMetrics remote write protocol with `-remoteWrite.vmProtoCompressLevel` command-line flag.
Bigger values reduce network usage at the cost of higher CPU usage. Negative values reduce CPU usage at the cost of higher network usage.
`vmagent` automatically switches to Prometheus remote write protocol when it sends data to old versions of VictoriaMetrics components
or to other Prometheus-compatible remote storage systems. It is possible to force switch to Prometheus remote write protocol
by specifying `-remoteWrite.forcePromProto` command-line flag for the corresponding `-remoteWrite.url`.
## Multitenancy
By default `vmagent` collects the data without tenant identifiers and routes it to the configured `-remoteWrite.url`.
@@ -251,7 +198,7 @@ This allows using a single `vmagent` instance in front of multiple VictoriaMetri
If `-remoteWrite.multitenantURL` command-line flag is set and `vmagent` is configured to scrape Prometheus-compatible targets
(e.g. if `-promscrape.config` command-line flag is set) then `vmagent` reads tenantID from `__tenant_id__` label
for the discovered targets and routes all the metrics from this target to the given `__tenant_id__`,
e.g. to the url `<-remoteWrite.multitenantURL>/insert/<__tenant_id__>/prometheus/api/v1/write`.
e.g. to the url `<-remoteWrite.multitnenatURL>/insert/<__tenant_id__>/prometheus/api/v1/write`.
For example, the following relabeling rule instructs sending metrics to tenantID defined in the `prometheus.io/tenant` annotation of Kubernetes pod deployment:
@@ -297,14 +244,14 @@ scrape_configs:
- "My-Auth: TopSecret"
```
* `disable_compression: true` for disabling response compression on a per-job basis. By default, `vmagent` requests compressed responses
* `disable_compression: true` for disabling response compression on a per-job basis. By default `vmagent` requests compressed responses
from scrape targets for saving network bandwidth.
* `disable_keepalive: true` for disabling [HTTP keep-alive connections](https://en.wikipedia.org/wiki/HTTP_persistent_connection)
on a per-job basis. By default, `vmagent` uses keep-alive connections to scrape targets for reducing overhead on connection re-establishing.
on a per-job basis. By default `vmagent` uses keep-alive connections to scrape targets for reducing overhead on connection re-establishing.
* `series_limit: N` for limiting the number of unique time series a single scrape target can expose. See [these docs](#cardinality-limiter).
* `stream_parse: true` for scraping targets in a streaming manner. This may be useful when targets export big number of metrics. See [these docs](#stream-parsing-mode).
* `scrape_align_interval: duration` for aligning scrapes to the given interval instead of using random offset
in the range `[0 ... scrape_interval]` for scraping each target. The random offset helps to spread scrapes evenly in time.
in the range `[0 ... scrape_interval]` for scraping each target. The random offset helps spreading scrapes evenly in time.
* `scrape_offset: duration` for specifying the exact offset for scraping instead of using random offset in the range `[0 ... scrape_interval]`.
See [scrape_configs docs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs) for more details on all the supported options.
@@ -353,7 +300,7 @@ There is no need in specifying top-level `scrape_configs` section in these files
The list of supported service discovery types is available [here](#how-to-collect-metrics-in-prometheus-format).
Additionally, `vmagent` doesn't support `refresh_interval` option at service discovery sections.
Additionally `vmagent` doesn't support `refresh_interval` option at service discovery sections.
This option is substituted with `-promscrape.*CheckInterval` command-line options, which are specific per each service discovery type.
See [the full list of command-line flags for vmagent](#advanced-usage).
@@ -475,12 +422,11 @@ VictoriaMetrics components support [Prometheus-compatible relabeling](https://pr
with [additional enhancements](#relabeling-enhancements). The relabeling can be defined in the following places processed by `vmagent`:
* At the `scrape_config -> relabel_configs` section in `-promscrape.config` file.
This relabeling is used for modifying labels in discovered targets and for dropping unneeded targets.
This relabeling is used for modifying labels in discovered targets and for dropping unneded targets.
See [relabeling cookbook](https://docs.victoriametrics.com/relabeling.html) for details.
This relabeling can be debugged by clicking the `debug` link at the corresponding target on the `http://vmagent:8429/targets` page
or on the `http://vmagent:8429/service-discovery` page. See [these docs](#relabel-debug) for details.
The link is unavailable if `vmagent` runs with `-promscrape.dropOriginalLabels` command-line flag.
* At the `scrape_config -> metric_relabel_configs` section in `-promscrape.config` file.
This relabeling is used for modifying labels in scraped metrics and for dropping unneeded metrics.
@@ -528,9 +474,8 @@ The following articles contain useful information about Prometheus relabeling:
{% endraw %}
* An optional `if` filter can be used for conditional relabeling. The `if` filter may contain
arbitrary [time series selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
The `action` is performed only for [samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples), which match the provided `if` filter.
For example, the following relabeling rule keeps metrics matching `foo{bar="baz"}` series selector, while dropping the rest of metrics:
arbitrary [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors).
For example, the following relabeling rule drops metrics, which don't match `foo{bar="baz"}` series selector, while leaving the rest of metrics:
```yaml
- if: 'foo{bar="baz"}'
@@ -545,18 +490,6 @@ The following articles contain useful information about Prometheus relabeling:
regex: 'foo;baz'
```
The `if` option may contain more than one filter. In this case the `action` is performed if at least a single filter
matches the given [sample](https://docs.victoriametrics.com/keyConcepts.html#raw-samples).
For example, the following relabeling rule adds `foo="bar"` label to samples with `job="foo"` or `instance="bar"` labels:
```yaml
- target_label: foo
replacement: bar
if:
- '{job="foo"}'
- '{instance="bar"}'
```
* The `regex` value can be split into multiple lines for improved readability and maintainability.
These lines are automatically joined with `|` char when parsed. For example, the following configs are equivalent:
@@ -576,7 +509,7 @@ The following articles contain useful information about Prometheus relabeling:
* VictoriaMetrics provides the following additional relabeling actions on top of standard actions
from the [Prometheus relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config):
* `replace_all` replaces all the occurrences of `regex` in the values of `source_labels` with the `replacement`
* `replace_all` replaces all of the occurrences of `regex` in the values of `source_labels` with the `replacement`
and stores the results in the `target_label`. For example, the following relabeling config replaces all the occurrences
of `-` char in metric names with `_` char (e.g. `foo-bar-baz` metric name is transformed into `foo_bar_baz`):
@@ -588,7 +521,7 @@ The following articles contain useful information about Prometheus relabeling:
replacement: "_"
```
* `labelmap_all` replaces all the occurrences of `regex` in all the label names with the `replacement`.
* `labelmap_all` replaces all of the occurrences of `regex` in all the label names with the `replacement`.
For example, the following relabeling config replaces all the occurrences of `-` char in all the label names
with `_` char (e.g. `foo-bar-baz` label name is transformed into `foo_bar_baz`):
@@ -674,21 +607,18 @@ provide the following tools for debugging target-level and metric-level relabeli
- Target-level debugging (e.g. `relabel_configs` section at [scrape_configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs))
can be performed by navigating to `http://vmagent:8429/targets` page (`http://victoriametrics:8428/targets` page for single-node VictoriaMetrics)
and clicking the `debug target relabeling` link at the target, which must be debugged.
The link is unavailable if `vmagent` runs with `-promscrape.dropOriginalLabels` command-line flag.
The opened page shows step-by-step results for the actual target relabeling rules applied to the discovered target labels.
The page shows also the target URL generated after applying all the relabeling rules.
The opened page will show step-by-step results for the actual target relabeling rules applied to the discovered target labels.
The `http://vmagent:8429/targets` page shows only active targets. If you need to understand why some target
is dropped during the relabeling, then navigate to `http://vmagent:8428/service-discovery` page
(`http://victoriametrics:8428/service-discovery` for single-node VictoriaMetrics), find the dropped target
and click the `debug` link there. The link is unavailable if `vmagent` runs with `-promscrape.dropOriginalLabels` command-line flag.
The opened page shows step-by-step results for the actual relabeling rules, which result to target drop.
and click the `debug` link there. The opened page will show step-by-step results for the actual relabeling rules,
which result to target drop.
- Metric-level debugging (e.g. `metric_relabel_configs` section at [scrape_configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs)
can be performed by navigating to `http://vmagent:8429/targets` page (`http://victoriametrics:8428/targets` page for single-node VictoriaMetrics)
and clicking the `debug metrics relabeling` link at the target, which must be debugged.
The link is unavailable if `vmagent` runs with `-promscrape.dropOriginalLabels` command-line flag.
The opened page shows step-by-step results for the actual metric relabeling rules applied to the given target labels.
The opened page will show step-by-step results for the actual metric relabeling rules applied to the given target labels.
## Prometheus staleness markers
@@ -710,7 +640,7 @@ e.g. it sets `scrape_series_added` metric to zero. See [these docs](#automatical
## Stream parsing mode
By default, `vmagent` reads the full response body from scrape target into memory, then parses it, applies [relabeling](#relabeling)
By default `vmagent` reads the full response body from scrape target into memory, then parses it, applies [relabeling](#relabeling)
and then pushes the resulting metrics to the configured `-remoteWrite.url`. This mode works good for the majority of cases
when the scrape target exposes small number of metrics (e.g. less than 10 thousand). But this mode may take big amounts of memory
when the scrape target exposes big number of metrics. In this case it is recommended enabling stream parsing mode.
@@ -738,8 +668,8 @@ scrape_configs:
stream_parse: true
static_configs:
- targets:
- big-prometheus1
- big-prometheus2
- big-prometeus1
- big-prometeus2
honor_labels: true
metrics_path: /federate
params:
@@ -747,7 +677,7 @@ scrape_configs:
```
Note that `vmagent` in stream parsing mode stores up to `sample_limit` samples to the configured `-remoteStorage.url`
instead of dropping all the samples read from the target, because the parsed data is sent to the remote storage
instead of droping all the samples read from the target, because the parsed data is sent to the remote storage
as soon as it is parsed in stream parsing mode.
## Scraping big number of targets
@@ -767,7 +697,7 @@ spread scrape targets among a cluster of two `vmagent` instances:
The `-promscrape.cluster.memberNum` can be set to a StatefulSet pod name when `vmagent` runs in Kubernetes.
The pod name must end with a number in the range `0 ... promscrape.cluster.memberNum-1`. For example, `-promscrape.cluster.memberNum=vmagent-0`.
By default, each scrape target is scraped only by a single `vmagent` instance in the cluster. If there is a need for replicating scrape targets among multiple `vmagent` instances,
By default each scrape target is scraped only by a single `vmagent` instance in the cluster. If there is a need for replicating scrape targets among multiple `vmagent` instances,
then `-promscrape.cluster.replicationFactor` command-line flag must be set to the desired number of replicas. For example, the following commands
start a cluster of three `vmagent` instances, where each target is scraped by two `vmagent` instances:
@@ -781,23 +711,16 @@ If each target is scraped by multiple `vmagent` instances, then data deduplicati
The `-dedup.minScrapeInterval` must be set to the `scrape_interval` configured at `-promscrape.config`.
See [these docs](https://docs.victoriametrics.com/#deduplication) for details.
See also [how to shard data among multiple remote storage systems](#sharding-among-remote-storages).
## High availability
It is possible to run multiple **identically configured** `vmagent` instances or `vmagent`
[clusters](#scraping-big-number-of-targets), so they [scrape](#how-to-collect-metrics-in-prometheus-format)
the same set of targets and push the collected data to the same set of VictoriaMetrics remote storage systems.
Two **identically configured** vmagent instances or clusters is usually called an HA pair.
It is possible to run multiple identically configured `vmagent` instances or `vmagent` [clusters](#scraping-big-number-of-targets),
so they [scrape](#how-to-collect-metrics-in-prometheus-format) the same set of targets and push the collected data to the same set of VictoriaMetrics remote storage systems.
When running HA pairs, [deduplication](https://docs.victoriametrics.com/#deduplication) must be configured
at VictoriaMetrics side in order to de-duplicate received samples.
In this case the deduplication must be configured at VictoriaMetrics in order to de-duplicate samples received from multiple identically configured `vmagent` instances or clusters.
See [these docs](https://docs.victoriametrics.com/#deduplication) for details.
It is also recommended passing different values to `-promscrape.cluster.name` command-line flag per each `vmagent`
instance or per each `vmagent` cluster in HA setup. This is needed for proper data de-duplication.
See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679) for details.
It is also recommended passing different values to `-promscrape.cluster.name` command-line flag per each `vmagent` instance or per each `vmagent` cluster in HA setup.
This is needed for proper data de-duplication. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679) for details.
## Scraping targets via a proxy
@@ -840,7 +763,7 @@ scrape_configs:
## Cardinality limiter
By default, `vmagent` doesn't limit the number of time series each scrape target can expose.
By default `vmagent` doesn't limit the number of time series each scrape target can expose.
The limit can be enforced in the following places:
* Via `-promscrape.seriesLimitPerTarget` command-line option. This limit is applied individually
@@ -868,7 +791,7 @@ These metrics allow building the following alerting rules:
See also `sample_limit` option at [scrape_config section](https://docs.victoriametrics.com/sd_configs.html#scrape_configs).
By default, `vmagent` doesn't limit the number of time series written to remote storage systems specified at `-remoteWrite.url`.
By default `vmagent` doesn't limit the number of time series written to remote storage systems specified at `-remoteWrite.url`.
The limit can be enforced by setting the following command-line flags:
* `-remoteWrite.maxHourlySeries` - limits the number of unique time series `vmagent` can write to remote storage systems during the last hour.
@@ -911,7 +834,7 @@ If you have suggestions for improvements or have found a bug - please open an is
* `http://vmagent-host:8429/api/v1/targets`. This handler returns JSON response
compatible with [the corresponding page from Prometheus API](https://prometheus.io/docs/prometheus/latest/querying/api/#targets).
* `http://vmagent-host:8429/ready`. This handler returns http 200 status code when `vmagent` finishes
its initialization for all the [service_discovery configs](https://docs.victoriametrics.com/sd_configs.html).
it's initialization for all the [service_discovery configs](https://docs.victoriametrics.com/sd_configs.html).
It may be useful to perform `vmagent` rolling update without any scrape loss.
## Troubleshooting
@@ -939,9 +862,9 @@ If you have suggestions for improvements or have found a bug - please open an is
* The `/service-discovery` page could be useful for debugging relabeling process for scrape targets.
This page contains original labels for targets dropped during relabeling.
By default, the `-promscrape.maxDroppedTargets` targets are shown here. If your setup drops more targets during relabeling,
By default the `-promscrape.maxDroppedTargets` targets are shown here. If your setup drops more targets during relabeling,
then increase `-promscrape.maxDroppedTargets` command-line flag value to see all the dropped targets.
Note that tracking each dropped target requires up to 10Kb of RAM. Therefore, big values for `-promscrape.maxDroppedTargets`
Note that tracking each dropped target requires up to 10Kb of RAM. Therefore big values for `-promscrape.maxDroppedTargets`
may result in increased memory usage if a big number of scrape targets are dropped during relabeling.
* We recommend you increase `-remoteWrite.queues` if `vmagent_remotewrite_pending_data_bytes` metric exported
@@ -951,13 +874,13 @@ If you have suggestions for improvements or have found a bug - please open an is
* If you see gaps in the data pushed by `vmagent` to remote storage when `-remoteWrite.maxDiskUsagePerURL` is set,
try increasing `-remoteWrite.queues`. Such gaps may appear because `vmagent` cannot keep up with sending the collected data to remote storage.
Therefore, it starts dropping the buffered data if the on-disk buffer size exceeds `-remoteWrite.maxDiskUsagePerURL`.
Therefore it starts dropping the buffered data if the on-disk buffer size exceeds `-remoteWrite.maxDiskUsagePerURL`.
* `vmagent` drops data blocks if remote storage replies with `400 Bad Request` and `409 Conflict` HTTP responses.
The number of dropped blocks can be monitored via `vmagent_remotewrite_packets_dropped_total` metric exported at [/metrics page](#monitoring).
* Use `-remoteWrite.queues=1` when `-remoteWrite.url` points to remote storage, which doesn't accept out-of-order samples (aka data backfilling).
Such storage systems include Prometheus, Mimir, Cortex and Thanos, which typically emit `out of order sample` errors.
Such storage systems include Prometheus, Cortex and Thanos, which typically emit `out of order sample` errors.
The best solution is to use remote storage with [backfilling support](https://docs.victoriametrics.com/#backfilling) such as VictoriaMetrics.
* `vmagent` buffers scraped data at the `-remoteWrite.tmpDataPath` directory until it is sent to `-remoteWrite.url`.
@@ -1017,7 +940,7 @@ See also [troubleshooting docs](https://docs.victoriametrics.com/Troubleshooting
* [Writing metrics to Kafka](#writing-metrics-to-kafka)
The enterprise version of vmagent is available for evaluation at [releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) page
in `vmutils-...-enterprise.tar.gz` archives and in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix.
in `vmutils-...-enteprise.tar.gz` archives and in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix.
### Reading metrics from Kafka
@@ -1065,7 +988,7 @@ data_format = "influx"
These command-line flags are available only in [enterprise](https://docs.victoriametrics.com/enterprise.html) version of `vmagent`,
which can be downloaded for evaluation from [releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) page
(see `vmutils-...-enterprise.tar.gz` archives) and from [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix.
(see `vmutils-...-enteprise.tar.gz` archives) and from [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix.
```
-kafka.consumer.topic array
@@ -1131,13 +1054,13 @@ It may be needed to build `vmagent` from source code when developing or testing
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
1. Run `make vmagent` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make vmagent` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds the `vmagent` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
1. Run `make vmagent-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make vmagent-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmagent-prod` binary and puts it into the `bin` folder.
### Building docker images
@@ -1160,13 +1083,13 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
1. Run `make vmagent-linux-arm` or `make vmagent-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics)
2. Run `make vmagent-linux-arm` or `make vmagent-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics)
It builds `vmagent-linux-arm` or `vmagent-linux-arm64` binary respectively and puts it into the `bin` folder.
### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
1. Run `make vmagent-linux-arm-prod` or `make vmagent-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make vmagent-linux-arm-prod` or `make vmagent-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmagent-linux-arm-prod` or `vmagent-linux-arm64-prod` binary respectively and puts it into the `bin` folder.
## Profiling
@@ -1201,7 +1124,7 @@ It is safe sharing the collected profiles from security point of view, since the
## Advanced usage
`vmagent` can be fine-tuned with various command-line flags. Run `./vmagent -help` in order to see the full list of these flags with their descriptions and default values:
`vmagent` can be fine-tuned with various command-line flags. Run `./vmagent -help` in order to see the full list of these flags with their desciptions and default values:
```
./vmagent -help
@@ -1212,10 +1135,6 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-cacheExpireDuration duration
Items are removed from in-memory caches after they aren't accessed for this duration. Lower values may reduce memory usage at the cost of higher CPU usage. See also -prevCacheRemovalPercent (default 30m0s)
-clients.docker
Decides whether a docker container be brought up automatically
-clients.semaphore
Tells if the job is running on Semaphore
-configAuthKey string
Authorization key for accessing /config page. It must be passed via authKey query arg
-csvTrimTimestamp duration
@@ -1228,11 +1147,11 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-denyQueryTracing
Whether to disable the ability to trace queries. See https://docs.victoriametrics.com/#query-tracing
-dryRun
Whether to check config files without running vmagent. The following files are checked: -promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig, -remoteWrite.streamAggr.config . Unknown config entries aren't allowed in -promscrape.config by default. This can be changed by passing -promscrape.config.strictParse=false command-line flag
Whether to check only config files without running vmagent. The following files are checked: -promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig . Unknown config entries aren't allowed in -promscrape.config by default. This can be changed by passing -promscrape.config.strictParse=false command-line flag
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@@ -1240,7 +1159,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default, mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-graphiteListenAddr string
TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty. See also -graphiteListenAddr.useProxyProtocol
-graphiteListenAddr.useProxyProtocol
@@ -1250,7 +1169,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default, compression is enabled to save network bandwidth
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
@@ -1260,13 +1179,13 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections. Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. Note that /targets and /metrics pages aren't available if -httpListenAddr=''. See also -httpListenAddr.useProxyProtocol (default ":8429")
-httpListenAddr.useProxyProtocol
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
-import.maxLineLen size
The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 104857600)
@@ -1287,17 +1206,13 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-influxSkipMeasurement
Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator'
-influxSkipSingleField
Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metric name if InfluxDB line contains only a single field
Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field
-influxTrimTimestamp duration
Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-insert.maxQueueDuration duration
The maximum duration to wait in the queue when -maxConcurrentInserts concurrent insert requests are executed (default 1m0s)
-internStringCacheExpireDuration duration
The expiry duration for caches for interned strings. See https://en.wikipedia.org/wiki/String_interning . See also -internStringMaxLen and -internStringDisableCache (default 6m0s)
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
The maximum length for strings to intern. Lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning (default 300)
-kafka.consumer.topic array
Kafka topic names for data consumption. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
@@ -1310,9 +1225,6 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-kafka.consumer.topic.brokers array
List of brokers to connect for given topic, e.g. -kafka.consumer.topic.broker=host-1:9092;host-2:9092 . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.concurrency array
Configures consumer concurrency for topic specified via -kafka.consumer.topic flag.This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.defaultFormat string
Expected data format in the topic if -kafka.consumer.topic.format is skipped. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html (default "promremotewrite")
-kafka.consumer.topic.format array
@@ -1344,23 +1256,23 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxConcurrentInserts int
The maximum number of concurrent insert requests. The default value should work for most cases, since it minimizes memory usage. The default value can be increased when clients send data over slow networks. See also -insert.maxQueueDuration (default 8)
The maximum number of concurrent insert requests. Default value should work for most cases, since it minimizes the memory usage. The default value can be increased when clients send data over slow networks. See also -insert.maxQueueDuration (default 8)
-maxInsertRequestSize size
The maximum size in bytes of a single Prometheus remote_write API request
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 33554432)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-opentsdbHTTPListenAddr string
TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty. See also -opentsdbHTTPListenAddr.useProxyProtocol
TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty. See also -opentsdbHTTPListenAddr.useProxyProtocol
-opentsdbHTTPListenAddr.useProxyProtocol
Whether to use proxy protocol for connections accepted at -opentsdbHTTPListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
-opentsdbListenAddr string
TCP and UDP address to listen for OpenTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty. See also -opentsdbListenAddr.useProxyProtocol
TCP and UDP address to listen for OpentTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty. See also -opentsdbListenAddr.useProxyProtocol
-opentsdbListenAddr.useProxyProtocol
Whether to use proxy protocol for connections accepted at -opentsdbListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
-opentsdbTrimTimestamp duration
@@ -1377,9 +1289,9 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-promscrape.azureSDCheckInterval duration
Interval for checking for changes in Azure. This works only if azure_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#azure_sd_configs for details (default 1m0s)
-promscrape.cluster.memberNum string
The number of number in the cluster of scrapers. It must be a unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster. Can be specified as pod name of Kubernetes StatefulSet - pod-name-Num, where Num is a numeric part of pod name (default "0")
The number of number in the cluster of scrapers. It must be an unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster. Can be specified as pod name of Kubernetes StatefulSet - pod-name-Num, where Num is a numeric part of pod name (default "0")
-promscrape.cluster.membersCount int
The number of members in a cluster of scrapers. Each member must have a unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default, cluster scraping is disabled, i.e. a single scraper scrapes all the targets
The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
-promscrape.cluster.name string
Optional name of the cluster. If multiple vmagent clusters scrape the same targets, then each cluster must have unique name in order to properly de-duplicate samples received from these clusters. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679
-promscrape.cluster.replicationFactor int
@@ -1391,19 +1303,17 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-promscrape.config.strictParse
Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true)
-promscrape.configCheckInterval duration
Interval for checking for changes in '-promscrape.config' file. By default, the checking is disabled. Send SIGHUP signal in order to force config check for changes
Interval for checking for changes in '-promscrape.config' file. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes
-promscrape.consul.waitTime duration
Wait time used by Consul service discovery. Default value is used if not set
-promscrape.consulSDCheckInterval duration
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#consul_sd_configs for details (default 30s)
-promscrape.consulagentSDCheckInterval duration
Interval for checking for changes in Consul Agent. This works only if consulagent_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#consulagent_sd_configs for details (default 30s)
-promscrape.digitaloceanSDCheckInterval duration
Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#digitalocean_sd_configs for details (default 1m0s)
-promscrape.disableCompression
Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control
Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control
-promscrape.disableKeepAlive
Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets has no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets
Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets has no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets
-promscrape.discovery.concurrency int
The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100)
-promscrape.discovery.concurrentWaitTime duration
@@ -1430,8 +1340,6 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
How frequently to reload the full state from Kubernetes API server (default 30m0s)
-promscrape.kubernetesSDCheckInterval duration
Interval for checking for changes in Kubernetes API server. This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#kubernetes_sd_configs for details (default 30s)
-promscrape.kumaSDCheckInterval duration
Interval for checking for changes in kuma service discovery. This works only if kuma_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#kuma_sd_configs for details (default 30s)
-promscrape.maxDroppedTargets int
The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need investigating labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000)
-promscrape.maxResponseHeadersSize size
@@ -1454,7 +1362,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-promscrape.seriesLimitPerTarget int
Optional limit on the number of unique time series a single scrape target can expose. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info
-promscrape.streamParse
Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is possible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control
Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is posible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control
-promscrape.suppressDuplicateScrapeTargetErrors
Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details
-promscrape.suppressScrapeErrors
@@ -1469,7 +1377,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-pushmetrics.interval duration
Interval for pushing metrics to -pushmetrics.url (default 10s)
-pushmetrics.url array
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default, metrics exposed at /metrics page aren't pushed to any remote storage
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default metrics exposed at /metrics page aren't pushed to any remote storage
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.aws.accessKey array
Optional AWS AccessKey to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set
@@ -1512,17 +1420,9 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.flushInterval duration
Interval for flushing the data to remote storage. This option takes effect only when less than 10K data points per second are pushed to -remoteWrite.url (default 1s)
-remoteWrite.forcePromProto array
Whether to force Prometheus remote write protocol for sending data to the corresponding -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.forceVMProto array
Whether to force VictoriaMetrics remote write protocol for sending data to the corresponding -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.headers array
Optional HTTP headers to send with each request to the corresponding -remoteWrite.url. For example, -remoteWrite.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -remoteWrite.url. Multiple headers must be delimited by '^^': -remoteWrite.headers='header1:value1^^header2:value2'
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.keepDanglingQueues
Keep persistent queues contents at -remoteWrite.tmpDataPath in case there are no matching -remoteWrite.url. Useful when -remoteWrite.url is changed temporarily and persistent queue files will be needed later on.
-remoteWrite.label array
Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. Pass multiple -remoteWrite.label flags in order to add multiple labels to metrics before sending them to remote storage
Supports an array of values separated by comma or specified via multiple flags.
@@ -1532,7 +1432,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-remoteWrite.maxDailySeries int
The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter
-remoteWrite.maxDiskUsagePerURL array
The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. Buffered data is stored in ~500MB chunks. It is recommended to set the value for this flag to a multiple of the block size 500MB. Disk usage is unlimited if the value is set to 0
The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500MB. Disk usage is unlimited if the value is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB.
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.maxHourlySeries int
@@ -1563,37 +1463,32 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-remoteWrite.queues int
The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues isn't enough for sending high volume of collected data to remote storage. Default value is 2 * numberOfAvailableCPUs (default 8)
-remoteWrite.rateLimit array
Optional rate limit in bytes per second for data sent to the corresponding -remoteWrite.url. By default, the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data is sent after temporary unavailability of the remote storage
Optional rate limit in bytes per second for data sent to the corresponding -remoteWrite.url. By default the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data is sent after temporary unavailability of the remote storage
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.relabelConfig string
Optional path to file with relabeling configs, which are applied to all the metrics before sending them to -remoteWrite.url. See also -remoteWrite.urlRelabelConfig. The path can point either to local file or to http url. See https://docs.victoriametrics.com/vmagent.html#relabeling
-remoteWrite.roundDigits array
Round metric values to this number of decimal digits after the point before writing them to remote storage. Examples: -remoteWrite.roundDigits=2 would round 1.236 to 1.24, while -remoteWrite.roundDigits=-1 would round 126.78 to 130. By default, digits rounding is disabled. Set it to 100 for disabling it for a particular remote storage. This option may be used for improving data compression for the stored metrics
Round metric values to this number of decimal digits after the point before writing them to remote storage. Examples: -remoteWrite.roundDigits=2 would round 1.236 to 1.24, while -remoteWrite.roundDigits=-1 would round 126.78 to 130. By default digits rounding is disabled. Set it to 100 for disabling it for a particular remote storage. This option may be used for improving data compression for the stored metrics
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.sendTimeout array
Timeout for sending a single block of data to the corresponding -remoteWrite.url (default 1m)
Timeout for sending a single block of data to the corresponding -remoteWrite.url
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.shardByURL
Whether to shard outgoing series across all the remote storage systems enumerated via -remoteWrite.url . By default the data is replicated across all the -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#sharding-among-remote-storages
-remoteWrite.showURL
Whether to show -remoteWrite.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key
-remoteWrite.significantFigures array
The number of significant figures to leave in metric values before writing them to remote storage. See https://en.wikipedia.org/wiki/Significant_figures . Zero value saves all the significant figures. This option may be used for improving data compression for the stored metrics. See also -remoteWrite.roundDigits
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.streamAggr.config array
Optional path to file with stream aggregation config. See https://docs.victoriametrics.com/stream-aggregation.html . See also -remoteWrite.streamAggr.keepInput, -remoteWrite.streamAggr.dropInput and -remoteWrite.streamAggr.dedupInterval
Optional path to file with stream aggregation config. See https://docs.victoriametrics.com/stream-aggregation.html . See also -remoteWrite.streamAggr.keepInput and -remoteWrite.streamAggr.dedupInterval
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.streamAggr.dedupInterval array
Input samples are de-duplicated with this interval before being aggregated. Only the last sample per each time series per each interval is aggregated if the interval is greater than zero
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.streamAggr.dropInput array
Whether to drop all the input samples after the aggregation with -remoteWrite.streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.streamAggr.keepInput array
Whether to keep all the input samples after the aggregation with -remoteWrite.streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.dropInput and https://docs.victoriametrics.com/stream-aggregation.html
Whether to keep input samples after the aggregation with -remoteWrite.streamAggr.config. By default the input is dropped after the aggregation, so only the aggregate data is sent to the -remoteWrite.url. See https://docs.victoriametrics.com/stream-aggregation.html
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.tlsCAFile array
Optional path to TLS CA file to use for verifying connections to the corresponding -remoteWrite.url. By default, system CA is used
Optional path to TLS CA file to use for verifying connections to the corresponding -remoteWrite.url. By default system CA is used
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.tlsCertFile array
Optional path to client-side TLS certificate file to use when connecting to the corresponding -remoteWrite.url
@@ -1605,18 +1500,16 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
Optional path to client-side TLS certificate key to use when connecting to the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.tlsServerName array
Optional TLS server name to use for connections to the corresponding -remoteWrite.url. By default, the server name from -remoteWrite.url is used
Optional TLS server name to use for connections to the corresponding -remoteWrite.url. By default the server name from -remoteWrite.url is used
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.tmpDataPath string
Path to directory where temporary data for remote write component is stored. See also -remoteWrite.maxDiskUsagePerURL (default "vmagent-remotewrite-data")
-remoteWrite.url array
Remote storage URL to write data to. It must support either VictoriaMetrics remote write protocol or Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. The data can be sharded among the configured remote storage systems if -remoteWrite.shardByURL flag is set. See also -remoteWrite.multitenantURL
Remote storage URL to write data to. It must support Prometheus remote_write API. It is recommended using VictoriaMetrics as remote storage. Example url: http://<victoriametrics-host>:8428/api/v1/write . Pass multiple -remoteWrite.url flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.multitenantURL
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.urlRelabelConfig array
Optional path to relabel configs for the corresponding -remoteWrite.url. See also -remoteWrite.relabelConfig. The path can point either to local file or to http url. See https://docs.victoriametrics.com/vmagent.html#relabeling
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.vmProtoCompressLevel int
The compression level for VictoriaMetrics remote write protocol. Higher values reduce network traffic at the cost of higher CPU usage. Negative values reduce CPU usage at the cost of increased network traffic. See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol
-sortLabels
Whether to sort labels for incoming samples before writing them to all the configured remote storage systems. This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}Enabled sorting for labels can slow down ingestion performance a bit
-tls

View File

@@ -9,7 +9,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/csvimport/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
@@ -26,7 +25,7 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
if err != nil {
return err
}
return stream.Parse(req, func(rows []parser.Row) error {
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
})
}

View File

@@ -9,7 +9,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadog"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadog/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
@@ -29,7 +28,7 @@ func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
return err
}
ce := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, ce, func(series []parser.Series) error {
return parser.ParseStream(req.Body, ce, func(series []parser.Series) error {
return insertRows(at, series, extraLabels)
})
}

View File

@@ -7,7 +7,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite/stream"
"github.com/VictoriaMetrics/metrics"
)
@@ -20,7 +19,7 @@ var (
//
// See https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
func InsertHandler(r io.Reader) error {
return stream.Parse(r, insertRows)
return parser.ParseStream(r, insertRows)
}
func insertRows(rows []parser.Row) error {

View File

@@ -15,14 +15,13 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metric name if InfluxDB line contains only a single field")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field")
skipMeasurement = flag.Bool("influxSkipMeasurement", false, "Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator'")
dbLabel = flag.String("influxDBLabel", "db", "Default label for the DB name sent over '?db={db_name}' query parameter")
)
@@ -37,7 +36,7 @@ var (
//
// See https://github.com/influxdata/telegraf/tree/master/plugins/inputs/socket_listener/
func InsertHandlerForReader(r io.Reader, isGzipped bool) error {
return stream.Parse(r, isGzipped, "", "", func(db string, rows []parser.Row) error {
return parser.ParseStream(r, isGzipped, "", "", func(db string, rows []parser.Row) error {
return insertRows(nil, db, rows, nil)
})
}
@@ -55,7 +54,7 @@ func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
precision := q.Get("precision")
// Read db tag from https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint
db := q.Get("db")
return stream.Parse(req.Body, isGzipped, precision, db, func(db string, rows []parser.Row) error {
return parser.ParseStream(req.Body, isGzipped, precision, db, func(db string, rows []parser.Row) error {
return insertRows(at, db, rows, extraLabels)
})
}

View File

@@ -11,12 +11,13 @@ import (
"sync/atomic"
"time"
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/datadog"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/native"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/opentelemetry"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/prometheusimport"
@@ -39,7 +40,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/pushmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
@@ -58,18 +58,18 @@ var (
"See also -graphiteListenAddr.useProxyProtocol")
graphiteUseProxyProtocol = flag.Bool("graphiteListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted at -graphiteListenAddr . "+
"See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpenTSDB metrics. "+
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
"Usually :4242 must be set. Doesn't work if empty. See also -opentsdbListenAddr.useProxyProtocol")
opentsdbUseProxyProtocol = flag.Bool("opentsdbListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted at -opentsdbListenAddr . "+
"See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt")
opentsdbHTTPListenAddr = flag.String("opentsdbHTTPListenAddr", "", "TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty. "+
opentsdbHTTPListenAddr = flag.String("opentsdbHTTPListenAddr", "", "TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty. "+
"See also -opentsdbHTTPListenAddr.useProxyProtocol")
opentsdbHTTPUseProxyProtocol = flag.Bool("opentsdbHTTPListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted "+
"at -opentsdbHTTPListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt")
configAuthKey = flag.String("configAuthKey", "", "Authorization key for accessing /config page. It must be passed via authKey query arg")
dryRun = flag.Bool("dryRun", false, "Whether to check config files without running vmagent. The following files are checked: "+
"-promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig, -remoteWrite.streamAggr.config . "+
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmagent. The following files are checked: "+
"-promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig . "+
"Unknown config entries aren't allowed in -promscrape.config by default. This can be changed by passing -promscrape.config.strictParse=false command-line flag")
)
@@ -94,26 +94,22 @@ func main() {
remotewrite.InitSecretFlags()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
if promscrape.IsDryRun() {
if err := promscrape.CheckConfig(); err != nil {
logger.Fatalf("error when checking -promscrape.config: %s", err)
}
logger.Infof("-promscrape.config is ok; exiting with 0 status code")
logger.Infof("-promscrape.config is ok; exitting with 0 status code")
return
}
if *dryRun {
if err := promscrape.CheckConfig(); err != nil {
logger.Fatalf("error when checking -promscrape.config: %s", err)
}
if err := remotewrite.CheckRelabelConfigs(); err != nil {
logger.Fatalf("error when checking relabel configs: %s", err)
}
if err := remotewrite.CheckStreamAggrConfigs(); err != nil {
logger.Fatalf("error when checking -remoteWrite.streamAggr.config: %s", err)
if err := promscrape.CheckConfig(); err != nil {
logger.Fatalf("error when checking -promscrape.config: %s", err)
}
logger.Infof("all the configs are ok; exiting with 0 status code")
logger.Infof("all the configs are ok; exitting with 0 status code")
return
}
@@ -145,8 +141,10 @@ func main() {
}
logger.Infof("started vmagent in %.3f seconds", time.Since(startTime).Seconds())
pushmetrics.Init()
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
pushmetrics.Stop()
startTime = time.Now()
if len(*httpListenAddr) > 0 {
@@ -208,12 +206,12 @@ func getAuthTokenFromPath(path string) (*auth.Token, error) {
if p.Suffix != "opentsdb/api/put" {
return nil, fmt.Errorf("unsupported path requested: %q; expecting 'opentsdb/api/put'", p.Suffix)
}
return auth.NewToken(p.AuthToken)
return auth.NewTokenPossibleMultitenant(p.AuthToken)
}
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.URL.Path == "/" {
if r.Method != http.MethodGet {
if r.Method != "GET" {
return false
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
@@ -251,16 +249,13 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
w.WriteHeader(statusCode)
return true
}
if strings.HasPrefix(path, "datadog/") {
if strings.HasPrefix(path, "/datadog/") {
// Trim suffix from paths starting from /datadog/ in order to support legacy DataDog agent.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2670
path = strings.TrimSuffix(path, "/")
}
switch path {
case "/prometheus/api/v1/write", "/api/v1/write":
if common.HandleVMProtoServerHandshake(w, r) {
return true
}
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(nil, r); err != nil {
prometheusWriteErrors.Inc()
@@ -309,15 +304,6 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
influxQueryRequests.Inc()
influxutils.WriteDatabaseNames(w)
return true
case "/opentelemetry/api/v1/push":
opentelemetryPushRequests.Inc()
if err := opentelemetry.InsertHandler(nil, r); err != nil {
opentelemetryPushErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusOK)
return true
case "/datadog/api/v1/series":
datadogWriteRequests.Inc()
if err := datadog.InsertHandlerForHTTP(nil, r); err != nil {
@@ -509,15 +495,6 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
influxQueryRequests.Inc()
influxutils.WriteDatabaseNames(w)
return true
case "opentelemetry/api/v1/push":
opentelemetryPushRequests.Inc()
if err := opentelemetry.InsertHandler(at, r); err != nil {
opentelemetryPushErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusOK)
return true
case "datadog/api/v1/series":
datadogWriteRequests.Inc()
if err := datadog.InsertHandlerForHTTP(at, r); err != nil {
@@ -587,9 +564,6 @@ var (
datadogIntakeRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/intake", protocol="datadog"}`)
datadogMetadataRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/metadata", protocol="datadog"}`)
opentelemetryPushRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/opentelemetry/api/v1/push", protocol="opentelemetry"}`)
opentelemetryPushErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/opentelemetry/api/v1/push", protocol="opentelemetry"}`)
promscrapeTargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/targets"}`)
promscrapeServiceDiscoveryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/service-discovery"}`)

View File

@@ -10,7 +10,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/native/stream"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/native"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
@@ -30,12 +30,12 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return err
}
isGzip := req.Header.Get("Content-Encoding") == "gzip"
return stream.Parse(req.Body, isGzip, func(block *stream.Block) error {
return parser.ParseStream(req.Body, isGzip, func(block *parser.Block) error {
return insertRows(at, block, extraLabels)
})
}
func insertRows(at *auth.Token, block *stream.Block, extraLabels []prompbmarshal.Label) error {
func insertRows(at *auth.Token, block *parser.Block, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)

View File

@@ -1,65 +0,0 @@
package opentelemetry
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="opentelemetry"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="opentelemetry"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="opentelemetry"}`)
)
// InsertHandler processes opentelemetry metrics.
func InsertHandler(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return stream.ParseStream(req.Body, isGzipped, func(tss []prompbmarshal.TimeSeries) error {
return insertRows(at, tss, extraLabels)
})
}
func insertRows(at *auth.Token, tss []prompbmarshal.TimeSeries, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range tss {
ts := &tss[i]
rowsTotal += len(ts.Samples)
labelsLen := len(labels)
labels = append(labels, ts.Labels...)
labels = append(labels, extraLabels...)
samplesLen := len(samples)
samples = append(samples, ts.Samples...)
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(at, &ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
}
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

View File

@@ -7,7 +7,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdb/stream"
"github.com/VictoriaMetrics/metrics"
)
@@ -20,7 +19,7 @@ var (
//
// See http://opentsdb.net/docs/build/html/api_telnet/put.html
func InsertHandler(r io.Reader) error {
return stream.Parse(r, insertRows)
return parser.ParseStream(r, insertRows)
}
func insertRows(rows []parser.Row) error {

View File

@@ -9,7 +9,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdbhttp/stream"
"github.com/VictoriaMetrics/metrics"
)
@@ -25,7 +24,7 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
if err != nil {
return err
}
return stream.Parse(req, func(rows []parser.Row) error {
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
})
}

View File

@@ -10,7 +10,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
@@ -32,7 +31,7 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return err
}
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return stream.Parse(req.Body, defaultTimestamp, isGzipped, func(rows []parser.Row) error {
return parser.ParseStream(req.Body, defaultTimestamp, isGzipped, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
}, func(s string) {
httpserver.LogError(req, s)

View File

@@ -23,7 +23,7 @@ var (
func TestInsertHandler(t *testing.T) {
setUp()
defer tearDown()
req := httptest.NewRequest(http.MethodPost, "/insert/0/api/v1/import/prometheus", bytes.NewBufferString(`{"foo":"bar"}
req := httptest.NewRequest("POST", "/insert/0/api/v1/import/prometheus", bytes.NewBufferString(`{"foo":"bar"}
go_memstats_alloc_bytes_total 1`))
if err := InsertHandler(nil, req); err != nil {
t.Errorf("unxepected error %s", err)

View File

@@ -10,7 +10,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite/stream"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
@@ -27,8 +27,7 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
if err != nil {
return err
}
isVMRemoteWrite := req.Header.Get("Content-Encoding") == "zstd"
return stream.Parse(req.Body, isVMRemoteWrite, func(tss []prompb.TimeSeries) error {
return parser.ParseStream(req.Body, func(tss []prompb.TimeSeries) error {
return insertRows(at, tss, extraLabels)
})
}

View File

@@ -15,21 +15,15 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timerpool"
"github.com/VictoriaMetrics/metrics"
)
var (
forcePromProto = flagutil.NewArrayBool("remoteWrite.forcePromProto", "Whether to force Prometheus remote write protocol for sending data "+
"to the corresponding -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol")
forceVMProto = flagutil.NewArrayBool("remoteWrite.forceVMProto", "Whether to force VictoriaMetrics remote write protocol for sending data "+
"to the corresponding -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol")
rateLimit = flagutil.NewArrayInt("remoteWrite.rateLimit", "Optional rate limit in bytes per second for data sent to the corresponding -remoteWrite.url. "+
"By default, the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data "+
"By default the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data "+
"is sent after temporary unavailability of the remote storage")
sendTimeout = flagutil.NewArrayDuration("remoteWrite.sendTimeout", "Timeout for sending a single block of data to the corresponding -remoteWrite.url (default 1m)")
sendTimeout = flagutil.NewArrayDuration("remoteWrite.sendTimeout", "Timeout for sending a single block of data to the corresponding -remoteWrite.url")
proxyURL = flagutil.NewArrayString("remoteWrite.proxyURL", "Optional proxy URL for writing data to the corresponding -remoteWrite.url. "+
"Supported proxies: http, https, socks5. Example: -remoteWrite.proxyURL=socks5://proxy:1234")
@@ -38,9 +32,9 @@ var (
"to the corresponding -remoteWrite.url")
tlsKeyFile = flagutil.NewArrayString("remoteWrite.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to the corresponding -remoteWrite.url")
tlsCAFile = flagutil.NewArrayString("remoteWrite.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to the corresponding -remoteWrite.url. "+
"By default, system CA is used")
"By default system CA is used")
tlsServerName = flagutil.NewArrayString("remoteWrite.tlsServerName", "Optional TLS server name to use for connections to the corresponding -remoteWrite.url. "+
"By default, the server name from -remoteWrite.url is used")
"By default the server name from -remoteWrite.url is used")
headers = flagutil.NewArrayString("remoteWrite.headers", "Optional HTTP headers to send with each request to the corresponding -remoteWrite.url. "+
"For example, -remoteWrite.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -remoteWrite.url. "+
@@ -75,12 +69,8 @@ var (
type client struct {
sanitizedURL string
remoteWriteURL string
// Whether to use VictoriaMetrics remote write protocol for sending the data to remoteWriteURL
useVMProto bool
fq *persistentqueue.FastQueue
hc *http.Client
fq *persistentqueue.FastQueue
hc *http.Client
sendBlock func(block []byte) bool
authCfg *promauth.Config
@@ -132,39 +122,19 @@ func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persiste
}
tr.Proxy = http.ProxyURL(pu)
}
hc := &http.Client{
Transport: tr,
Timeout: sendTimeout.GetOptionalArgOrDefault(argIdx, time.Minute),
}
c := &client{
sanitizedURL: sanitizedURL,
remoteWriteURL: remoteWriteURL,
authCfg: authCfg,
awsCfg: awsCfg,
fq: fq,
hc: hc,
stopCh: make(chan struct{}),
hc: &http.Client{
Transport: tr,
Timeout: sendTimeout.GetOptionalArgOrDefault(argIdx, time.Minute),
},
stopCh: make(chan struct{}),
}
c.sendBlock = c.sendBlockHTTP
useVMProto := forceVMProto.GetOptionalArg(argIdx)
usePromProto := forcePromProto.GetOptionalArg(argIdx)
if useVMProto && usePromProto {
logger.Fatalf("-remoteWrite.useVMProto and -remoteWrite.usePromProto cannot be set simultaneously for -remoteWrite.url=%s", sanitizedURL)
}
if !useVMProto && !usePromProto {
// Auto-detect whether the remote storage supports VictoriaMetrics remote write protocol.
doRequest := func(url string) (*http.Response, error) {
return c.doRequest(url, nil)
}
useVMProto = common.HandleVMProtoClientHandshake(c.remoteWriteURL, doRequest)
if !useVMProto {
logger.Infof("the remote storage at %q doesn't support VictoriaMetrics remote write protocol. Switching to Prometheus remote write protocol. "+
"See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol", sanitizedURL)
}
}
c.useVMProto = useVMProto
return c
}
@@ -257,7 +227,7 @@ func getAuthConfig(argIdx int) (*promauth.Config, error) {
}
authCfg, err := opts.NewConfig()
if err != nil {
return nil, fmt.Errorf("cannot populate auth config for remoteWrite idx: %d, err: %w", argIdx, err)
return nil, fmt.Errorf("cannot populate OAuth2 config for remoteWrite idx: %d, err: %w", argIdx, err)
}
return authCfg, nil
}
@@ -321,45 +291,38 @@ func (c *client) runWorker() {
}
}
func (c *client) doRequest(url string, body []byte) (*http.Response, error) {
reqBody := bytes.NewBuffer(body)
req, err := http.NewRequest(http.MethodPost, url, reqBody)
if err != nil {
logger.Panicf("BUG: unexpected error from http.NewRequest(%q): %s", url, err)
}
c.authCfg.SetHeaders(req, true)
h := req.Header
h.Set("User-Agent", "vmagent")
h.Set("Content-Type", "application/x-protobuf")
if c.useVMProto {
h.Set("Content-Encoding", "zstd")
h.Set("X-VictoriaMetrics-Remote-Write-Version", "1")
} else {
h.Set("Content-Encoding", "snappy")
h.Set("X-Prometheus-Remote-Write-Version", "0.1.0")
}
if c.awsCfg != nil {
sigv4Hash := awsapi.HashHex(body)
if err := c.awsCfg.SignRequest(req, sigv4Hash); err != nil {
// there is no need in retry, request will be rejected by client.Do and retried by code below
logger.Warnf("cannot sign remoteWrite request with AWS sigv4: %s", err)
}
}
return c.hc.Do(req)
}
// sendBlockHTTP sends the given block to c.remoteWriteURL.
//
// The function returns false only if c.stopCh is closed.
// sendBlockHTTP returns false only if c.stopCh is closed.
// Otherwise it tries sending the block to remote storage indefinitely.
func (c *client) sendBlockHTTP(block []byte) bool {
c.rl.register(len(block), c.stopCh)
retryDuration := time.Second
retriesCount := 0
c.bytesSent.Add(len(block))
c.blocksSent.Inc()
sigv4Hash := ""
if c.awsCfg != nil {
sigv4Hash = awsapi.HashHex(block)
}
again:
req, err := http.NewRequest("POST", c.remoteWriteURL, bytes.NewBuffer(block))
if err != nil {
logger.Panicf("BUG: unexpected error from http.NewRequest(%q): %s", c.sanitizedURL, err)
}
c.authCfg.SetHeaders(req, true)
h := req.Header
h.Set("User-Agent", "vmagent")
h.Set("Content-Type", "application/x-protobuf")
h.Set("Content-Encoding", "snappy")
h.Set("X-Prometheus-Remote-Write-Version", "0.1.0")
if c.awsCfg != nil {
if err := c.awsCfg.SignRequest(req, sigv4Hash); err != nil {
// there is no need in retry, request will be rejected by client.Do and retried by code below
logger.Warnf("cannot sign remoteWrite request with AWS sigv4: %s", err)
}
}
startTime := time.Now()
resp, err := c.doRequest(c.remoteWriteURL, block)
resp, err := c.hc.Do(req)
c.requestDuration.UpdateDuration(startTime)
if err != nil {
c.errorsCount.Inc()
@@ -384,8 +347,6 @@ again:
if statusCode/100 == 2 {
_ = resp.Body.Close()
c.requestsOKCount.Inc()
c.bytesSent.Add(len(block))
c.blocksSent.Inc()
return true
}
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="%d"}`, c.sanitizedURL, statusCode)).Inc()

View File

@@ -8,7 +8,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding/zstd"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -24,9 +23,6 @@ var (
"This option takes effect only when less than 10K data points per second are pushed to -remoteWrite.url")
maxUnpackedBlockSize = flagutil.NewBytes("remoteWrite.maxBlockSize", 8*1024*1024, "The maximum block size to send to remote storage. Bigger blocks may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxRowsPerBlock")
maxRowsPerBlock = flag.Int("remoteWrite.maxRowsPerBlock", 10000, "The maximum number of samples to send in each block to remote storage. Higher number may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxBlockSize")
vmProtoCompressLevel = flag.Int("remoteWrite.vmProtoCompressLevel", 0, "The compression level for VictoriaMetrics remote write protocol. "+
"Higher values reduce network traffic at the cost of higher CPU usage. Negative values reduce CPU usage at the cost of increased network traffic. "+
"See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol")
)
type pendingSeries struct {
@@ -37,10 +33,9 @@ type pendingSeries struct {
periodicFlusherWG sync.WaitGroup
}
func newPendingSeries(pushBlock func(block []byte), isVMRemoteWrite bool, significantFigures, roundDigits int) *pendingSeries {
func newPendingSeries(pushBlock func(block []byte), significantFigures, roundDigits int) *pendingSeries {
var ps pendingSeries
ps.wr.pushBlock = pushBlock
ps.wr.isVMRemoteWrite = isVMRemoteWrite
ps.wr.significantFigures = significantFigures
ps.wr.roundDigits = roundDigits
ps.stopCh = make(chan struct{})
@@ -93,9 +88,6 @@ type writeRequest struct {
// pushBlock is called when whe write request is ready to be sent.
pushBlock func(block []byte)
// Whether to encode the write request with VictoriaMetrics remote write protocol.
isVMRemoteWrite bool
// How many significant figures must be left before sending the writeRequest to pushBlock.
significantFigures int
@@ -112,7 +104,7 @@ type writeRequest struct {
}
func (wr *writeRequest) reset() {
// Do not reset lastFlushTime, pushBlock, isVMRemoteWrite, significantFigures and roundDigits, since they are re-used.
// Do not reset pushBlock, significantFigures and roundDigits, since they are re-used.
wr.wr.Timeseries = nil
@@ -134,7 +126,7 @@ func (wr *writeRequest) flush() {
wr.wr.Timeseries = wr.tss
wr.adjustSampleValues()
atomic.StoreUint64(&wr.lastFlushTime, fasttime.UnixTimestamp())
pushWriteRequest(&wr.wr, wr.pushBlock, wr.isVMRemoteWrite)
pushWriteRequest(&wr.wr, wr.pushBlock)
wr.reset()
}
@@ -196,7 +188,7 @@ func (wr *writeRequest) copyTimeSeries(dst, src *prompbmarshal.TimeSeries) {
wr.buf = buf
}
func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byte), isVMRemoteWrite bool) {
func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byte)) {
if len(wr.Timeseries) == 0 {
// Nothing to push
return
@@ -205,11 +197,7 @@ func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byt
bb.B = prompbmarshal.MarshalWriteRequest(bb.B[:0], wr)
if len(bb.B) <= maxUnpackedBlockSize.IntN() {
zb := snappyBufPool.Get()
if isVMRemoteWrite {
zb.B = zstd.CompressLevel(zb.B[:0], bb.B, *vmProtoCompressLevel)
} else {
zb.B = snappy.Encode(zb.B[:cap(zb.B)], bb.B)
}
zb.B = snappy.Encode(zb.B[:cap(zb.B)], bb.B)
writeRequestBufPool.Put(bb)
if len(zb.B) <= persistentqueue.MaxBlockSize {
pushBlock(zb.B)
@@ -233,18 +221,18 @@ func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byt
}
n := len(samples) / 2
wr.Timeseries[0].Samples = samples[:n]
pushWriteRequest(wr, pushBlock, isVMRemoteWrite)
pushWriteRequest(wr, pushBlock)
wr.Timeseries[0].Samples = samples[n:]
pushWriteRequest(wr, pushBlock, isVMRemoteWrite)
pushWriteRequest(wr, pushBlock)
wr.Timeseries[0].Samples = samples
return
}
timeseries := wr.Timeseries
n := len(timeseries) / 2
wr.Timeseries = timeseries[:n]
pushWriteRequest(wr, pushBlock, isVMRemoteWrite)
pushWriteRequest(wr, pushBlock)
wr.Timeseries = timeseries[n:]
pushWriteRequest(wr, pushBlock, isVMRemoteWrite)
pushWriteRequest(wr, pushBlock)
wr.Timeseries = timeseries
}

View File

@@ -2,48 +2,40 @@ package remotewrite
import (
"fmt"
"math"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/golang/snappy"
)
func TestPushWriteRequest(t *testing.T) {
rowsCounts := []int{1, 10, 100, 1e3, 1e4}
expectedBlockLensProm := []int{216, 1848, 16424, 169882, 1757876}
expectedBlockLensVM := []int{138, 492, 3927, 34995, 288476}
for i, rowsCount := range rowsCounts {
expectedBlockLenProm := expectedBlockLensProm[i]
expectedBlockLenVM := expectedBlockLensVM[i]
for _, rowsCount := range []int{1, 10, 100, 1e3, 1e4} {
t.Run(fmt.Sprintf("%d", rowsCount), func(t *testing.T) {
testPushWriteRequest(t, rowsCount, expectedBlockLenProm, expectedBlockLenVM)
testPushWriteRequest(t, rowsCount)
})
}
}
func testPushWriteRequest(t *testing.T, rowsCount, expectedBlockLenProm, expectedBlockLenVM int) {
f := func(isVMRemoteWrite bool, expectedBlockLen int, tolerancePrc float64) {
t.Helper()
wr := newTestWriteRequest(rowsCount, 20)
pushBlockLen := 0
pushBlock := func(block []byte) {
if pushBlockLen > 0 {
panic(fmt.Errorf("BUG: pushBlock called multiple times; pushBlockLen=%d at first call, len(block)=%d at second call", pushBlockLen, len(block)))
}
pushBlockLen = len(block)
}
pushWriteRequest(wr, pushBlock, isVMRemoteWrite)
if math.Abs(float64(pushBlockLen-expectedBlockLen)/float64(expectedBlockLen)*100) > tolerancePrc {
t.Fatalf("unexpected block len for rowsCount=%d, isVMRemoteWrite=%v; got %d bytes; expecting %d bytes +- %.0f%%",
rowsCount, isVMRemoteWrite, pushBlockLen, expectedBlockLen, tolerancePrc)
func testPushWriteRequest(t *testing.T, rowsCount int) {
wr := newTestWriteRequest(rowsCount, 10)
pushBlockLen := 0
pushBlock := func(block []byte) {
if pushBlockLen > 0 {
panic(fmt.Errorf("BUG: pushBlock called multiple times; pushBlockLen=%d at first call, len(block)=%d at second call", pushBlockLen, len(block)))
}
pushBlockLen = len(block)
}
pushWriteRequest(wr, pushBlock)
b := prompbmarshal.MarshalWriteRequest(nil, wr)
zb := snappy.Encode(nil, b)
maxPushBlockLen := len(zb)
minPushBlockLen := maxPushBlockLen / 2
if pushBlockLen < minPushBlockLen {
t.Fatalf("unexpected block len after pushWriteRequest; got %d bytes; must be at least %d bytes", pushBlockLen, minPushBlockLen)
}
if pushBlockLen > maxPushBlockLen {
t.Fatalf("unexpected block len after pushWriteRequest; got %d bytes; must be smaller or equal to %d bytes", pushBlockLen, maxPushBlockLen)
}
// Check Prometheus remote write
f(false, expectedBlockLenProm, 0)
// Check VictoriaMetrics remote write
f(true, expectedBlockLenVM, 15)
}
func newTestWriteRequest(seriesCount, labelsCount int) *prompbmarshal.WriteRequest {

View File

@@ -114,9 +114,9 @@ func (rctx *relabelCtx) applyRelabeling(tss []prompbmarshal.TimeSeries, extraLab
for j := range tmpLabels {
label := &tmpLabels[j]
if label.Name == "__name__" {
label.Value = promrelabel.SanitizeName(label.Value)
label.Value = promrelabel.SanitizeMetricName(label.Value)
} else {
label.Name = promrelabel.SanitizeName(label.Name)
label.Name = promrelabel.SanitizeLabelName(label.Name)
}
}
}

View File

@@ -4,21 +4,17 @@ import (
"flag"
"fmt"
"net/url"
"path/filepath"
"strconv"
"sync"
"sync/atomic"
"time"
"github.com/cespare/xxhash/v2"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bloomfilter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
@@ -28,37 +24,32 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/streamaggr"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
"github.com/cespare/xxhash/v2"
)
var (
remoteWriteURLs = flagutil.NewArrayString("remoteWrite.url", "Remote storage URL to write data to. It must support either VictoriaMetrics remote write protocol "+
"or Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
"Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. "+
"The data can be sharded among the configured remote storage systems if -remoteWrite.shardByURL flag is set. "+
"See also -remoteWrite.multitenantURL")
remoteWriteURLs = flagutil.NewArrayString("remoteWrite.url", "Remote storage URL to write data to. It must support Prometheus remote_write API. "+
"It is recommended using VictoriaMetrics as remote storage. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
"Pass multiple -remoteWrite.url flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.multitenantURL")
remoteWriteMultitenantURLs = flagutil.NewArrayString("remoteWrite.multitenantURL", "Base path for multitenant remote storage URL to write data to. "+
"See https://docs.victoriametrics.com/vmagent.html#multitenancy for details. Example url: http://<vminsert>:8480 . "+
"Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.url")
shardByURL = flag.Bool("remoteWrite.shardByURL", false, "Whether to shard outgoing series across all the remote storage systems enumerated via -remoteWrite.url . "+
"By default the data is replicated across all the -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#sharding-among-remote-storages")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored. "+
"See also -remoteWrite.maxDiskUsagePerURL")
keepDanglingQueues = flag.Bool("remoteWrite.keepDanglingQueues", false, "Keep persistent queues contents at -remoteWrite.tmpDataPath in case there are no matching -remoteWrite.url. "+
"Useful when -remoteWrite.url is changed temporarily and persistent queue files will be needed later on.")
queues = flag.Int("remoteWrite.queues", cgroup.AvailableCPUs()*2, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
"isn't enough for sending high volume of collected data to remote storage. Default value is 2 * numberOfAvailableCPUs")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
"It is hidden by default, since it can contain sensitive info such as auth key")
maxPendingBytesPerURL = flagutil.NewArrayBytes("remoteWrite.maxDiskUsagePerURL", "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+
"for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. "+
"Buffered data is stored in ~500MB chunks. It is recommended to set the value for this flag to a multiple of the block size 500MB. "+
"Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500MB. "+
"Disk usage is unlimited if the value is set to 0")
significantFigures = flagutil.NewArrayInt("remoteWrite.significantFigures", "The number of significant figures to leave in metric values before writing them "+
"to remote storage. See https://en.wikipedia.org/wiki/Significant_figures . Zero value saves all the significant figures. "+
"This option may be used for improving data compression for the stored metrics. See also -remoteWrite.roundDigits")
roundDigits = flagutil.NewArrayInt("remoteWrite.roundDigits", "Round metric values to this number of decimal digits after the point before writing them to remote storage. "+
"Examples: -remoteWrite.roundDigits=2 would round 1.236 to 1.24, while -remoteWrite.roundDigits=-1 would round 126.78 to 130. "+
"By default, digits rounding is disabled. Set it to 100 for disabling it for a particular remote storage. "+
"By default digits rounding is disabled. Set it to 100 for disabling it for a particular remote storage. "+
"This option may be used for improving data compression for the stored metrics")
sortLabels = flag.Bool("sortLabels", false, `Whether to sort labels for incoming samples before writing them to all the configured remote storage systems. `+
`This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. `+
@@ -71,13 +62,10 @@ var (
streamAggrConfig = flagutil.NewArrayString("remoteWrite.streamAggr.config", "Optional path to file with stream aggregation config. "+
"See https://docs.victoriametrics.com/stream-aggregation.html . "+
"See also -remoteWrite.streamAggr.keepInput, -remoteWrite.streamAggr.dropInput and -remoteWrite.streamAggr.dedupInterval")
streamAggrKeepInput = flagutil.NewArrayBool("remoteWrite.streamAggr.keepInput", "Whether to keep all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples "+
"are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.dropInput and https://docs.victoriametrics.com/stream-aggregation.html")
streamAggrDropInput = flagutil.NewArrayBool("remoteWrite.streamAggr.dropInput", "Whether to drop all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples "+
"are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html")
"See also -remoteWrite.streamAggr.keepInput and -remoteWrite.streamAggr.dedupInterval")
streamAggrKeepInput = flagutil.NewArrayBool("remoteWrite.streamAggr.keepInput", "Whether to keep input samples after the aggregation with -remoteWrite.streamAggr.config. "+
"By default the input is dropped after the aggregation, so only the aggregate data is sent to the -remoteWrite.url. "+
"See https://docs.victoriametrics.com/stream-aggregation.html")
streamAggrDedupInterval = flagutil.NewArrayDuration("remoteWrite.streamAggr.dedupInterval", "Input samples are de-duplicated with this interval before being aggregated. "+
"Only the last sample per each time series per each interval is aggregated if the interval is greater than zero")
)
@@ -100,14 +88,12 @@ func MultitenancyEnabled() bool {
}
// Contains the current relabelConfigs.
var allRelabelConfigs atomic.Pointer[relabelConfigs]
var allRelabelConfigs atomic.Value
// maxQueues limits the maximum value for `-remoteWrite.queues`. There is no sense in setting too high value,
// since it may lead to high memory usage due to big number of buffers.
var maxQueues = cgroup.AvailableCPUs() * 16
const persistentQueueDirname = "persistent-queue"
// InitSecretFlags must be called after flag.Parse and before any logging.
func InitSecretFlags() {
if !*showRemoteWriteURL {
@@ -164,8 +150,9 @@ func Init() {
logger.Fatalf("cannot load relabel configs: %s", err)
}
allRelabelConfigs.Store(rcs)
relabelConfigSuccess.Set(1)
relabelConfigTimestamp.Set(fasttime.UnixTimestamp())
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
if len(*remoteWriteURLs) > 0 {
rwctxsDefault = newRemoteWriteCtxs(nil, *remoteWriteURLs)
@@ -178,56 +165,34 @@ func Init() {
for {
select {
case <-sighupCh:
case <-configReloaderStopCh:
case <-stopCh:
return
}
reloadRelabelConfigs()
reloadStreamAggrConfigs()
configReloads.Inc()
logger.Infof("SIGHUP received; reloading relabel configs pointed by -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig")
rcs, err := loadRelabelConfigs()
if err != nil {
configReloadErrors.Inc()
configSuccess.Set(0)
logger.Errorf("cannot reload relabel configs; preserving the previous configs; error: %s", err)
continue
}
allRelabelConfigs.Store(rcs)
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
logger.Infof("Successfully reloaded relabel configs")
}
}()
}
func reloadRelabelConfigs() {
relabelConfigReloads.Inc()
logger.Infof("reloading relabel configs pointed by -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig")
rcs, err := loadRelabelConfigs()
if err != nil {
relabelConfigReloadErrors.Inc()
relabelConfigSuccess.Set(0)
logger.Errorf("cannot reload relabel configs; preserving the previous configs; error: %s", err)
return
}
allRelabelConfigs.Store(rcs)
relabelConfigSuccess.Set(1)
relabelConfigTimestamp.Set(fasttime.UnixTimestamp())
logger.Infof("successfully reloaded relabel configs")
}
var (
relabelConfigReloads = metrics.NewCounter(`vmagent_relabel_config_reloads_total`)
relabelConfigReloadErrors = metrics.NewCounter(`vmagent_relabel_config_reloads_errors_total`)
relabelConfigSuccess = metrics.NewCounter(`vmagent_relabel_config_last_reload_successful`)
relabelConfigTimestamp = metrics.NewCounter(`vmagent_relabel_config_last_reload_success_timestamp_seconds`)
configReloads = metrics.NewCounter(`vmagent_relabel_config_reloads_total`)
configReloadErrors = metrics.NewCounter(`vmagent_relabel_config_reloads_errors_total`)
configSuccess = metrics.NewCounter(`vmagent_relabel_config_last_reload_successful`)
configTimestamp = metrics.NewCounter(`vmagent_relabel_config_last_reload_success_timestamp_seconds`)
)
func reloadStreamAggrConfigs() {
if len(*remoteWriteMultitenantURLs) > 0 {
rwctxsMapLock.Lock()
for _, rwctxs := range rwctxsMap {
reinitStreamAggr(rwctxs)
}
rwctxsMapLock.Unlock()
} else {
reinitStreamAggr(rwctxsDefault)
}
}
func reinitStreamAggr(rwctxs []*remoteWriteCtx) {
for _, rwctx := range rwctxs {
rwctx.reinitStreamAggr()
}
}
func newRemoteWriteCtxs(at *auth.Token, urls []string) []*remoteWriteCtx {
if len(urls) == 0 {
logger.Panicf("BUG: urls must be non-empty")
@@ -260,44 +225,17 @@ func newRemoteWriteCtxs(at *auth.Token, urls []string) []*remoteWriteCtx {
}
rwctxs[i] = newRemoteWriteCtx(i, at, remoteWriteURL, maxInmemoryBlocks, sanitizedURL)
}
if !*keepDanglingQueues {
// Remove dangling queues, if any.
// This is required for the case when the number of queues has been changed or URL have been changed.
// See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014
existingQueues := make(map[string]struct{}, len(rwctxs))
for _, rwctx := range rwctxs {
existingQueues[rwctx.fq.Dirname()] = struct{}{}
}
queuesDir := filepath.Join(*tmpDataPath, persistentQueueDirname)
files := fs.MustReadDir(queuesDir)
removed := 0
for _, f := range files {
dirname := f.Name()
if _, ok := existingQueues[dirname]; !ok {
logger.Infof("removing dangling queue %q", dirname)
fullPath := filepath.Join(queuesDir, dirname)
fs.MustRemoveAll(fullPath)
removed++
}
}
if removed > 0 {
logger.Infof("removed %d dangling queues from %q, active queues: %d", removed, *tmpDataPath, len(rwctxs))
}
}
return rwctxs
}
var configReloaderStopCh = make(chan struct{})
var stopCh = make(chan struct{})
var configReloaderWG sync.WaitGroup
// Stop stops remotewrite.
//
// It is expected that nobody calls Push during and after the call to this func.
func Stop() {
close(configReloaderStopCh)
close(stopCh)
configReloaderWG.Wait()
for _, rwctx := range rwctxsDefault {
@@ -326,7 +264,7 @@ func Stop() {
// If at is nil, then the data is pushed to the configured `-remoteWrite.url`.
// If at isn't nil, the data is pushed to the configured `-remoteWrite.multitenantURL`.
//
// Note that wr may be modified by Push because of relabeling and rounding.
// Note that wr may be modified by Push due to relabeling and rounding.
func Push(at *auth.Token, wr *prompbmarshal.WriteRequest) {
if at == nil && len(*remoteWriteMultitenantURLs) > 0 {
// Write data to default tenant if at isn't set while -remoteWrite.multitenantURL is set.
@@ -353,7 +291,7 @@ func Push(at *auth.Token, wr *prompbmarshal.WriteRequest) {
}
var rctx *relabelCtx
rcs := allRelabelConfigs.Load()
rcs := allRelabelConfigs.Load().(*relabelConfigs)
pcsGlobal := rcs.global
if pcsGlobal.Len() > 0 || len(labelsGlobal) > 0 {
rctx = getRelabelCtx()
@@ -407,46 +345,10 @@ func pushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmarsha
// Nothing to push
return
}
if len(rwctxs) == 1 {
// Fast path - just push data to the configured single remote storage
rwctxs[0].Push(tssBlock)
return
}
// We need to push tssBlock to multiple remote storages.
// This is either sharding or replication depending on -remoteWrite.shardByURL command-line flag value.
if *shardByURL {
// Shard the data among rwctxs
tssByURL := make([][]prompbmarshal.TimeSeries, len(rwctxs))
for _, ts := range tssBlock {
h := getLabelsHash(ts.Labels)
idx := h % uint64(len(tssByURL))
tssByURL[idx] = append(tssByURL[idx], ts)
}
// Push sharded data to remote storages in parallel in order to reduce
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
for i, rwctx := range rwctxs {
tssShard := tssByURL[i]
if len(tssShard) == 0 {
continue
}
go func(rwctx *remoteWriteCtx, tss []prompbmarshal.TimeSeries) {
defer wg.Done()
rwctx.Push(tss)
}(rwctx, tssShard)
}
wg.Wait()
return
}
// Replicate data among rwctxs.
// Push block to remote storages in parallel in order to reduce
// the time needed for sending the data to multiple remote storage systems.
// Push block to remote storages in parallel in order to reduce the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
for _, rwctx := range rwctxs {
wg.Add(1)
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
rwctx.Push(tssBlock)
@@ -548,9 +450,8 @@ type remoteWriteCtx struct {
fq *persistentqueue.FastQueue
c *client
sas atomic.Pointer[streamaggr.Aggregators]
sas *streamaggr.Aggregators
streamAggrKeepInput bool
streamAggrDropInput bool
pss []*pendingSeries
pssNextIdx uint64
@@ -565,13 +466,8 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
pqURL.RawQuery = ""
pqURL.Fragment = ""
h := xxhash.Sum64([]byte(pqURL.String()))
queuePath := filepath.Join(*tmpDataPath, persistentQueueDirname, fmt.Sprintf("%d_%016X", argIdx+1, h))
queuePath := fmt.Sprintf("%s/persistent-queue/%d_%016X", *tmpDataPath, argIdx+1, h)
maxPendingBytes := maxPendingBytesPerURL.GetOptionalArgOrDefault(argIdx, 0)
if maxPendingBytes != 0 && maxPendingBytes < persistentqueue.DefaultChunkFileSize {
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4195
logger.Warnf("rounding the -remoteWrite.maxDiskUsagePerURL=%d to the minimum supported value: %d", maxPendingBytes, persistentqueue.DefaultChunkFileSize)
maxPendingBytes = persistentqueue.DefaultChunkFileSize
}
fq := persistentqueue.MustOpenFastQueue(queuePath, sanitizedURL, maxInmemoryBlocks, maxPendingBytes)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
return float64(fq.GetPendingBytes())
@@ -579,7 +475,6 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_inmemory_blocks{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
return float64(fq.GetInmemoryQueueLen())
})
var c *client
switch remoteWriteURL.Scheme {
case "http", "https":
@@ -600,7 +495,7 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
}
pss := make([]*pendingSeries, pssLen)
for i := range pss {
pss[i] = newPendingSeries(fq.MustWriteBlock, c.useVMProto, sf, rd)
pss[i] = newPendingSeries(fq.MustWriteBlock, sf, rd)
}
rwctx := &remoteWriteCtx{
@@ -619,13 +514,10 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
dedupInterval := streamAggrDedupInterval.GetOptionalArgOrDefault(argIdx, 0)
sas, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternal, dedupInterval)
if err != nil {
logger.Fatalf("cannot initialize stream aggregators from -remoteWrite.streamAggr.config=%q: %s", sasFile, err)
logger.Fatalf("cannot initialize stream aggregators from -remoteWrite.streamAggrFile=%q: %s", sasFile, err)
}
rwctx.sas.Store(sas)
rwctx.sas = sas
rwctx.streamAggrKeepInput = streamAggrKeepInput.GetOptionalArg(argIdx)
rwctx.streamAggrDropInput = streamAggrDropInput.GetOptionalArg(argIdx)
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_successful{path=%q}`, sasFile)).Set(1)
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_success_timestamp_seconds{path=%q}`, sasFile)).Set(fasttime.UnixTimestamp())
}
return rwctx
@@ -634,8 +526,8 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
func (rwctx *remoteWriteCtx) MustStop() {
// sas must be stopped before rwctx is closed
// because sas can write pending series to rwctx.pss if there are any
sas := rwctx.sas.Swap(nil)
sas.MustStop()
rwctx.sas.MustStop()
rwctx.sas = nil
for _, ps := range rwctx.pss {
ps.MustStop()
@@ -657,7 +549,7 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
// Apply relabeling
var rctx *relabelCtx
var v *[]prompbmarshal.TimeSeries
rcs := allRelabelConfigs.Load()
rcs := allRelabelConfigs.Load().(*relabelConfigs)
pcs := rcs.perURL[rwctx.idx]
if pcs.Len() > 0 {
rctx = getRelabelCtx()
@@ -665,7 +557,7 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
// from affecting time series for other remoteWrite.url configs.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/467
// and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
v = tssPool.Get().(*[]prompbmarshal.TimeSeries)
v = tssRelabelPool.Get().(*[]prompbmarshal.TimeSeries)
tss = append(*v, tss...)
rowsCountBeforeRelabel := getRowsCount(tss)
tss = rctx.applyRelabeling(tss, nil, pcs)
@@ -676,83 +568,27 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
rwctx.rowsPushedAfterRelabel.Add(rowsCount)
// Apply stream aggregation if any
sas := rwctx.sas.Load()
if sas != nil {
matchIdxs := matchIdxsPool.Get()
matchIdxs.B = sas.Push(tss, matchIdxs.B)
if !rwctx.streamAggrKeepInput {
if rctx == nil {
rctx = getRelabelCtx()
// Make a copy of tss before dropping aggregated series
v = tssPool.Get().(*[]prompbmarshal.TimeSeries)
tss = append(*v, tss...)
}
tss = dropAggregatedSeries(tss, matchIdxs.B, rwctx.streamAggrDropInput)
}
matchIdxsPool.Put(matchIdxs)
rwctx.sas.Push(tss)
if rwctx.sas == nil || rwctx.streamAggrKeepInput {
// Push samples to the remote storage
rwctx.pushInternal(tss)
}
rwctx.pushInternal(tss)
// Return back relabeling contexts to the pool
if rctx != nil {
*v = prompbmarshal.ResetTimeSeries(tss)
tssPool.Put(v)
tssRelabelPool.Put(v)
putRelabelCtx(rctx)
}
}
var matchIdxsPool bytesutil.ByteBufferPool
func dropAggregatedSeries(src []prompbmarshal.TimeSeries, matchIdxs []byte, dropInput bool) []prompbmarshal.TimeSeries {
dst := src[:0]
for i, match := range matchIdxs {
if match == 0 {
continue
}
dst = append(dst, src[i])
}
tail := src[len(dst):]
_ = prompbmarshal.ResetTimeSeries(tail)
return dst
}
func (rwctx *remoteWriteCtx) pushInternal(tss []prompbmarshal.TimeSeries) {
pss := rwctx.pss
idx := atomic.AddUint64(&rwctx.pssNextIdx, 1) % uint64(len(pss))
pss[idx].Push(tss)
}
func (rwctx *remoteWriteCtx) reinitStreamAggr() {
sas := rwctx.sas.Load()
if sas == nil {
// There is no stream aggregation for rwctx
return
}
sasFile := streamAggrConfig.GetOptionalArg(rwctx.idx)
logger.Infof("reloading stream aggregation configs pointed by -remoteWrite.streamAggr.config=%q", sasFile)
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reloads_total{path=%q}`, sasFile)).Inc()
dedupInterval := streamAggrDedupInterval.GetOptionalArgOrDefault(rwctx.idx, 0)
sasNew, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternal, dedupInterval)
if err != nil {
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reloads_errors_total{path=%q}`, sasFile)).Inc()
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_successful{path=%q}`, sasFile)).Set(0)
logger.Errorf("cannot reload stream aggregation config from -remoteWrite.streamAggr.config=%q; continue using the previously loaded config; error: %s", sasFile, err)
return
}
if !sasNew.Equal(sas) {
sasOld := rwctx.sas.Swap(sasNew)
sasOld.MustStop()
logger.Infof("successfully reloaded stream aggregation configs at -remoteWrite.streamAggr.config=%q", sasFile)
} else {
sasNew.MustStop()
logger.Infof("the config at -remoteWrite.streamAggr.config=%q wasn't changed", sasFile)
}
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_successful{path=%q}`, sasFile)).Set(1)
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_success_timestamp_seconds{path=%q}`, sasFile)).Set(fasttime.UnixTimestamp())
}
var tssPool = &sync.Pool{
var tssRelabelPool = &sync.Pool{
New: func() interface{} {
a := []prompbmarshal.TimeSeries{}
return &a
@@ -766,20 +602,3 @@ func getRowsCount(tss []prompbmarshal.TimeSeries) int {
}
return rowsCount
}
// CheckStreamAggrConfigs checks configs pointed by -remoteWrite.streamAggr.config
func CheckStreamAggrConfigs() error {
pushNoop := func(tss []prompbmarshal.TimeSeries) {}
for idx, sasFile := range *streamAggrConfig {
if sasFile == "" {
continue
}
dedupInterval := streamAggrDedupInterval.GetOptionalArgOrDefault(idx, 0)
sas, err := streamaggr.LoadFromFile(sasFile, pushNoop, dedupInterval)
if err != nil {
return fmt.Errorf("cannot load -remoteWrite.streamAggr.config=%q: %w", sasFile, err)
}
sas.MustStop()
}
return nil
}

View File

@@ -11,7 +11,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
@@ -31,7 +30,7 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return err
}
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return stream.Parse(req.Body, isGzipped, func(rows []parser.Row) error {
return parser.ParseStream(req.Body, isGzipped, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
})
}

View File

@@ -115,9 +115,6 @@ vmalert-linux-arm64:
vmalert-linux-ppc64le:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
vmalert-linux-s390x:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=linux GOARCH=s390x $(MAKE) app-local-goos-goarch
vmalert-linux-386:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch

View File

@@ -14,7 +14,7 @@ or [cluster version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.ht
of VictoriaMetrics are capable of proxying requests to vmalert via `-vmalert.proxyURL` command-line flag.
Use this feature for the following cases:
* for proxying requests from [Grafana Alerting UI](https://grafana.com/docs/grafana/latest/alerting/);
* for accessing vmalerts UI through VictoriaMetrics Web interface.
* for accessing vmalert's UI through VictoriaMetrics Web interface.
## Features
@@ -23,14 +23,12 @@ Use this feature for the following cases:
support and expressions validation;
* Prometheus [alerting rules definition format](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#defining-alerting-rules)
support;
* Integration with [Alertmanager](https://github.com/prometheus/alertmanager) starting from [Alertmanager v0.16.0-alpha](https://github.com/prometheus/alertmanager/releases/tag/v0.16.0-alpha.0);
* Integration with [Alertmanager](https://github.com/prometheus/alertmanager) starting from [Alertmanager v0.16.0-aplha](https://github.com/prometheus/alertmanager/releases/tag/v0.16.0-alpha.0);
* Keeps the alerts [state on restarts](#alerts-state-on-restarts);
* Graphite datasource can be used for alerting and recording rules. See [these docs](#graphite);
* Recording and Alerting rules backfilling (aka `replay`). See [these docs](#rules-backfilling);
* Lightweight and without extra dependencies.
* Supports [reusable templates](#reusable-templates) for annotations;
* Load of recording and alerting rules from local filesystem, URL, GCS and S3;
* Detect alerting rules which [don't match any series](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4039).
* Supports [reusable templates](#reusable-templates) for annotations.
## Limitations
@@ -145,15 +143,6 @@ params:
headers:
[ <string>, ...]
# Optional list of HTTP headers in form `header-name: value`
# applied for all alert notifications sent to notifiers
# generated by rules of this group.
# For example:
# notifier_headers:
# - "TenantID: foo"
notifier_headers:
[ <string>, ...]
# Optional list of labels added to every rule within a group.
# It has priority over the external labels.
# Labels are commonly used for adding environment
@@ -203,22 +192,16 @@ expr: <string>
# as firing once they return.
[ for: <duration> | default = 0s ]
# Alert will continue firing for this long even when the alerting expression no longer has results.
# This allows you to delay alert resolution.
[ keep_firing_for: <duration> | default = 0s ]
# Whether to print debug information into logs.
# Information includes alerts state changes and requests sent to the datasource.
# Please note, that if rule's query params contain sensitive
# information - it will be printed to logs.
# Is applicable to alerting rules only.
# Available starting from https://docs.victoriametrics.com/CHANGELOG.html#v1820
[ debug: <bool> | default = false ]
# Defines the number of rule's updates entries stored in memory
# and available for view on rule's Details page.
# Overrides `rule.updateEntriesLimit` value for this specific rule.
# Available starting from https://docs.victoriametrics.com/CHANGELOG.html#v1860
[ update_entries_limit: <integer> | default 0 ]
# Labels to add or overwrite for each alert.
@@ -277,7 +260,7 @@ Additionally, `vmalert` provides some extra templating functions listed [here](#
query at `-datasource.url` and returns the first result.
- `queryEscape` - escapes the input string, so it can be safely put inside [query arg](https://en.wikipedia.org/wiki/Percent-encoding) part of URL.
- `quotesEscape` - escapes the input string, so it can be safely embedded into JSON string.
- `reReplaceAll regex repl` - replaces all the occurrences of the `regex` in input string with the `repl`.
- `reReplaceAll regex repl` - replaces all the occurences of the `regex` in input string with the `repl`.
- `safeHtml` - marks the input string as safe to use in HTML context without the need to html-escape it.
- `sortByLabel name` - sorts the input query results by the label with the given `name`.
- `stripDomain` - leaves the first part of the domain. For example, `foo.bar.baz` is converted to `foo`.
@@ -361,24 +344,19 @@ For recording rules to work `-remoteWrite.url` must be specified.
### Alerts state on restarts
`vmalert` holds alerts state in the memory. Restart of the `vmalert` process will reset the state of all active alerts
in the memory. To prevent `vmalert` from losing the state on restarts configure it to persist the state
to the remote database via the following flags:
`vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after restart of `vmalert`
the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
to the configured address in the form of [time series](https://docs.victoriametrics.com/keyConcepts.html#time-series)
`ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
These time series can be queried from VictoriaMetrics just as any other time series.
The state will be persisted to the configured address on each evaluation.
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
These are regular time series and maybe queried from VM just as any other time series.
The state is stored to the configured address on every rule evaluation.
* `-remoteRead.url` - URL to VictoriaMetrics (Single) or vmselect (Cluster). `vmalert` will try to restore alerts state
from the configured address by querying time series with name `ALERTS_FOR_STATE`. The restore happens only once when
`vmalert` process starts, and only for the configured rules. Config [hot reload](#hot-config-reload) doesn't trigger
state restore.
from configured address by querying time series with name `ALERTS_FOR_STATE`.
Both flags are required for proper state restoration. Restore process may fail if time series are missing
in configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`)
or received state doesn't match current `vmalert` rules configuration. `vmalert` marks successfully restored rules
with `restored` label in [web UI](#web).
or received state doesn't match current `vmalert` rules configuration.
### Multitenancy
@@ -426,25 +404,6 @@ The enterprise version of vmalert is available in `vmutils-*-enterprise.tar.gz`
at [release page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) and in `*-enterprise`
tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags).
### Reading rules from object storage
[Enterprise version](https://docs.victoriametrics.com/enterprise.html) of `vmalert` may read alerting and recording rules
from object storage:
- `./bin/vmalert -rule=s3://bucket/dir/alert.rules` would read rules from the given path at S3 bucket
- `./bin/vmalert -rule=gs://bucket/bir/alert.rules` would read rules from the given path at GCS bucket
S3 and GCS paths support only matching by prefix, e.g. `s3://bucket/dir/rule_` matches
all files with prefix `rule_` in the folder `dir`.
The following [command-line flags](#flags) can be used for fine-tuning access to S3 and GCS:
- `-s3.credsFilePath` - path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
- `-s3.configFilePath` - path to file with S3 configs. Configs are loaded from default location if not set.
- `-s3.configProfile` - profile name for S3 configs. If no set, the value of the environment variable will be loaded (`AWS_PROFILE` or `AWS_DEFAULT_PROFILE`).
- `-s3.customEndpoint` - custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set.
- `-s3.forcePathStyle` - prefixing endpoint with bucket name when set false, true by default.
### Topology examples
The following sections are showing how `vmalert` may be used and configured
@@ -494,7 +453,7 @@ Cluster mode could have multiple `vminsert` and `vmselect` components.
<img alt="vmalert cluster" src="vmalert_cluster.png">
In case when you want to spread the load on these components - add balancers before them and configure
`vmalert` with balancer addresses. Please, see more about VM's cluster architecture
`vmalert` with balancer's addresses. Please, see more about VM's cluster architecture
[here](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#architecture-overview).
#### HA vmalert
@@ -519,7 +478,7 @@ Alertmanagers.
To avoid recording rules results and alerts state duplication in VictoriaMetrics server
don't forget to configure [deduplication](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#deduplication).
The recommended value for `-dedup.minScrapeInterval` must be multiple of vmalert's `evaluation_interval`.
The recommended value for `-dedup.minScrapeInterval` must be greater or equal to vmalert's `evaluation_interval`.
If you observe inconsistent or "jumping" values in series produced by vmalert, try disabling `-datasource.queryTimeAlignment`
command line flag. Because of alignment, two or more vmalert HA pairs will produce results with the same timestamps.
But due of backfilling (data delivered to the datasource with some delay) values of such results may differ,
@@ -529,7 +488,7 @@ Alertmanager will automatically deduplicate alerts with identical labels, so ens
all `vmalert`s are having the same config.
Don't forget to configure [cluster mode](https://prometheus.io/docs/alerting/latest/alertmanager/)
for Alertmanagers for better reliability. List all Alertmanager URLs in vmalert `-notifier.url`
for Alertmanagers for better reliability. List all Alertmanager URLs in vmalert's `-notifier.url`
to ensure [high availability](https://github.com/prometheus/alertmanager#high-availability).
This example uses single-node VM server for the sake of simplicity.
@@ -537,9 +496,6 @@ Check how to replace it with [cluster VictoriaMetrics](#cluster-victoriametrics)
#### Downsampling and aggregation via vmalert
_Please note, [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) might be more efficient
for cases when downsampling or aggregation need to be applied **before data gets into the TSDB.**_
`vmalert` can't modify existing data. But it can run arbitrary PromQL/MetricsQL queries
via [recording rules](#recording-rules) and backfill results to the configured `-remoteWrite.url`.
This ability allows to aggregate data. For example, the following rule will calculate the average value for
@@ -624,7 +580,7 @@ or time series modification via [relabeling](https://docs.victoriametrics.com/vm
`vmalert` web UI can be accessed from [single-node version of VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html)
and from [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html).
This may be used for better integration with Grafana unified alerting system. See the following docs for details:
This may be used for better integraion with Grafana unified alerting system. See the following docs for details:
* [How to query vmalert from single-node VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmalert)
* [How to query vmalert from VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#vmalert)
@@ -645,12 +601,6 @@ can read the same rules configuration as normal, evaluate them on the given time
results via remote write to the configured storage. vmalert supports any PromQL/MetricsQL compatible
data source for backfilling.
Please note, that response caching may lead to unexpected results during and after backfilling process.
In order to avoid this you need to reset cache contents or disable caching when using backfilling
as described in [backfilling docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#backfilling).
See a blogpost about [Rules backfilling via vmalert](https://victoriametrics.com/blog/rules-replay/).
### How it works
In `replay` mode vmalert works as a cli-tool and exits immediately after work is done.
@@ -740,7 +690,6 @@ See full description for these flags in `./vmalert -help`.
* Graphite engine isn't supported yet;
* `query` template function is disabled for performance reasons (might be changed in future);
* `limit` group's param has no effect during replay (might be changed in future);
* `keep_firing_for` alerting rule param has no effect during replay (might be changed in future).
## Monitoring
@@ -756,42 +705,31 @@ a review to the dashboard.
## Troubleshooting
### Data delay
Data delay is one of the most common issues with rules execution.
vmalert executes configured rules within certain intervals at specifics timestamps.
It expects that the data is already present in configured `-datasource.url` at the moment of time when rule is executed:
vmalert executes configured rules within certain intervals. It is expected that at the moment when rule is executed,
the data is already present in configured `-datasource.url`:
<img alt="vmalert expected evaluation" src="vmalert_ts_normal.gif">
Usually, troubles start to appear when data in `-datasource.url` is delayed or absent. In such cases, evaluations
may get empty response from the datasource and produce empty recording rules or reset alerts state:
may get empty response from datasource and produce empty recording rules or reset alerts state:
<img alt="vmalert evaluation when data is delayed" src="vmalert_ts_data_delay.gif">
Try the following recommendations to reduce the chance of hitting the data delay issue:
By default, recently written samples to VictoriaMetrics aren't visible for queries for up to 30s.
This behavior is controlled by `-search.latencyOffset` command-line flag and the `latency_offset` query ag at `vmselect`.
Usually, this results into a 30s shift for recording rules results.
Note that too small value passed to `-search.latencyOffset` or to `latency_offest` query arg may lead to incomplete query results.
* Always configure group's `evaluationInterval` to be bigger or at least equal to
[time series resolution](https://docs.victoriametrics.com/keyConcepts.html#time-series-resolution);
* Ensure that `[duration]` value is at least twice bigger than
[time series resolution](https://docs.victoriametrics.com/keyConcepts.html#time-series-resolution). For example,
if expression is `rate(my_metric[2m]) > 0` then ensure that `my_metric` resolution is at least `1m` or better `30s`.
If you use VictoriaMetrics as datasource, `[duration]` can be omitted and VictoriaMetrics will adjust it automatically.
* If you know in advance, that data in datasource is delayed - try changing vmalerts `-datasource.lookback`
command-line flag to add a time shift for evaluations. Or extend `[duration]` to tolerate the delay.
For example, `max_over_time(errors_total[10m]) > 0` will be active even if there is no data in datasource for last `9m`.
* If [time series resolution](https://docs.victoriametrics.com/keyConcepts.html#time-series-resolution)
in datasource is inconsistent or `>=5min` - try changing vmalerts `-datasource.queryStep` command-line flag to specify
how far search query can lookback for the recent datapoint. The recommendation is to have the step
at least two times bigger than the resolution.
Try the following recommendations in such cases:
> Please note, data delay is inevitable in distributed systems. And it is better to account for it instead of ignoring.
By default, recently written samples to VictoriaMetrics aren't visible for queries for up to 30s
(see `-search.latencyOffset` command-line flag at vmselect). Such delay is needed to eliminate risk of incomplete
data on the moment of querying, since metrics collectors won't be able to deliver the data in time.
### Alerts state
* Always configure group's `evaluationInterval` to be bigger or equal to `scrape_interval` at which metrics
are delivered to the datasource;
* If you know in advance, that data in datasource is delayed - try changing vmalert's `-datasource.lookback`
command-line flag to add a time shift for evaluations;
* If time intervals between datapoints in datasource are irregular or `>=5min` - try changing vmalert's
`-datasource.queryStep` command-line flag to specify how far search query can lookback for the recent datapoint.
The recommendation is to have the step at least two times bigger than `scrape_interval`, since
there are no guarantees that scrape will not fail.
Sometimes, it is not clear why some specific alert fired or didn't fire. It is very important to remember, that
alerts with `for: 0` fire immediately when their expression becomes true. And alerts with `for > 0` will fire only
@@ -804,8 +742,7 @@ If `-remoteWrite.url` command-line flag is configured, vmalert will persist aler
changed in time.
vmalert stores last `-rule.updateEntriesLimit` (or `update_entries_limit` [per-rule config](https://docs.victoriametrics.com/vmalert.html#alerting-rules))
state updates for each rule starting from [v1.86](https://docs.victoriametrics.com/CHANGELOG.html#v1860).
To check updates, click on `Details` link next to rule's name on `/vmalert/groups` page
state updates for each rule. To check updates, click on `Details` link next to rule's name on `/vmalert/groups` page
and check the `Last updates` section:
<img alt="vmalert state" src="vmalert_state.png">
@@ -815,10 +752,8 @@ HTTP request sent by vmalert to the `-datasource.url` during evaluation. If spec
no samples returned and curl command returns data - then it is very likely there was no data in datasource on the
moment when rule was evaluated.
### Debug mode
vmalert allows configuring more detailed logging for specific alerting rule starting from [v1.82](https://docs.victoriametrics.com/CHANGELOG.html#v1820).
Just set `debug: true` in rule's configuration and vmalert will start printing additional log messages:
vmalert allows configuring more detailed logging for specific alerting rule. Just set `debug: true` in rule's configuration
and vmalert will start printing additional log messages:
```terminal
2022-09-15T13:35:41.155Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:41+02:00: query returned 0 samples (elapsed: 5.896041ms)
2022-09-15T13:35:56.149Z DEBUG datasource request: executing POST request with params "denyPartialResponse=true&query=sum%28vm_tcplistener_conns%7Binstance%3D%22localhost%3A8429%22%7D%29+by%28instance%29+%3E+0&step=15s&time=1663248945"
@@ -829,49 +764,6 @@ Just set `debug: true` in rule's configuration and vmalert will start printing a
2022-09-15T13:36:56.153Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:36:56+02:00: alert 10705778000901301787 {alertgroup="TestGroup",alertname="Conns",cluster="east-1",instance="localhost:8429",replica="a"} PENDING => FIRING: 1m0s since becoming active at 2022-09-15 15:35:56.126006 +0200 CEST m=+39.384575417
```
### Never-firing alerts
vmalert can detect if alert's expression doesn't match any time series in runtime
starting from [v1.91](https://docs.victoriametrics.com/CHANGELOG.html#v1910). This problem usually happens
when alerting expression selects time series which aren't present in the datasource (i.e. wrong `job` label)
or there is a typo in the series selector (i.e. `env=rpod`). Such alerting rules will be marked with special icon in
vmalerts UI and exposed via `vmalert_alerting_rules_last_evaluation_series_fetched` metric. The metric value will
show how many time series were matched before the filtering by rule's expression. If metric value is `-1`, then
this feature is not supported by the datasource (old versions of VictoriaMetrics). The following expression can be
used to detect rules matching no series:
```
max(vmalert_alerting_rules_last_evaluation_series_fetched) by(group, alertname) == 0
```
See more details [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4039).
This feature is available only if vmalert is using VictoriaMetrics v1.90 or higher as a datasource.
### Series with the same labelset
vmalert can produce the following error message during rules evaluation:
```
result contains metrics with the same labelset after applying rule labels
```
The error means there is a collision between [time series](https://docs.victoriametrics.com/keyConcepts.html#time-series)
after applying extra labels to result.
For example, a rule with `expr: foo > 0` returns two distinct time series in response:
```
foo{bar="baz"} 1
foo{bar="qux"} 2
```
If user configures `-external.label=bar=baz` cmd-line flag to enforce
adding `bar="baz"` label-value pair, then time series won't be distinct anymore:
```
foo{bar="baz"} 1
foo{bar="baz"} 2 # 'bar' label was overriden by `-external.label=bar=baz
```
The same issue can be caused by collision of configured `labels` on [Group](#groups) or [Rule](#rules) levels.
To fix it one should avoid collisions by carefully picking label overrides in configuration.
## Profiling
@@ -911,11 +803,11 @@ command-line flags with their descriptions.
The shortlist of configuration flags is the following:
{% raw %}
```console
```
-clusterMode
If clusterMode is enabled, then vmalert automatically adds the tenant specified in config groups to -datasource.url, -remoteWrite.url and -remoteRead.url. See https://docs.victoriametrics.com/vmalert.html#multitenancy . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-configCheckInterval duration
Interval for checking for changes in '-rule' or '-notifier.config' files. By default, the checking is disabled. Send SIGHUP signal in order to force config check for changes.
Interval for checking for changes in '-rule' or '-notifier.config' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes.
-datasource.appendTypePrefix
Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.
-datasource.basicAuth.password string
@@ -930,8 +822,6 @@ The shortlist of configuration flags is the following:
Optional path to bearer token file to use for -datasource.url.
-datasource.disableKeepAlive
Whether to disable long-lived connections to the datasource. If true, disables HTTP keep-alives and will only use the connection to the server for a single HTTP request.
-datasource.disableStepParam
Whether to disable adding 'step' param to the issued instant queries. This might be useful when using vmalert with datasources that do not support 'step' param for instant queries, like Google Managed Prometheus. It is not recommended to enable this flag if you use vmalert with VictoriaMetrics.
-datasource.headers string
Optional HTTP extraHeaders to send with each request to the corresponding -datasource.url. For example, -datasource.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -datasource.url. Multiple headers must be delimited by '^^': -datasource.headers='header1:value1^^header2:value2'
-datasource.lookback duration
@@ -951,7 +841,7 @@ The shortlist of configuration flags is the following:
-datasource.queryStep duration
How far a value can fallback to when evaluating queries. For example, if -datasource.queryStep=15s then param "step" with value "15s" will be added to every query. If set to 0, rule's evaluation interval will be used instead. (default 5m0s)
-datasource.queryTimeAlignment
Whether to align "time" parameter with evaluation interval.Alignment supposed to produce deterministic results despite number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257 (default true)
Whether to align "time" parameter with evaluation interval.Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257 (default true)
-datasource.roundDigits int
Adds "round_digits" GET param to datasource requests. In VM "round_digits" limits the number of digits after the decimal point in response values.
-datasource.showURL
@@ -977,9 +867,9 @@ The shortlist of configuration flags is the following:
-dryRun
Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
@@ -987,20 +877,20 @@ The shortlist of configuration flags is the following:
-evaluationInterval duration
How often to evaluate the rules (default 1m0s)
-external.alert.source string
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service. Supports templating - see https://docs.victoriametrics.com/vmalert.html#templating . For example, link to Grafana: -external.alert.source='explore?orgId=1&left={"datasource":"VictoriaMetrics","queries":[{"expr":{{$expr|jsonEscape|queryEscape}},"refId":"A"}],"range":{"from":"now-1h","to":"now"}}'. Link to VMUI: -external.alert.source='vmui/#/?g0.expr={{.Expr|queryEscape}}'. If empty 'vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}' is used.
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service. Supports templating - see https://docs.victoriametrics.com/vmalert.html#templating . For example, link to Grafana: -external.alert.source='explore?orgId=1&left=["now-1h","now","VictoriaMetrics",{"expr":{{$expr|jsonEscape|queryEscape}} },{"mode":"Metrics"},{"ui":[true,true,true,"none"]}]' . If empty 'vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}' is used.
-external.label array
Optional label in the form 'Name=value' to add to all generated recording rules and alerts. Pass multiple -label flags in order to add multiple label sets.
Supports an array of values separated by comma or specified via multiple flags.
-external.url string
External URL is used as alert's source for sent alerts to the notifier. By default, hostname is used as address.
External URL is used as alert's source for sent alerts to the notifier
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default, mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default, compression is enabled to save network bandwidth
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
@@ -1010,19 +900,13 @@ The shortlist of configuration flags is the following:
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
Address to listen for http connections. See also -httpListenAddr.useProxyProtocol (default ":8880")
-httpListenAddr.useProxyProtocol
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing
-internStringCacheExpireDuration duration
The expiry duration for caches for interned strings. See https://en.wikipedia.org/wiki/String_interning . See also -internStringMaxLen and -internStringDisableCache (default 6m0s)
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
@@ -1039,11 +923,6 @@ The shortlist of configuration flags is the following:
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-notifier.basicAuth.password array
@@ -1081,7 +960,7 @@ The shortlist of configuration flags is the following:
-notifier.suppressDuplicateTargetErrors
Whether to suppress 'duplicate target' errors during discovery
-notifier.tlsCAFile array
Optional path to TLS CA file to use for verifying connections to -notifier.url. By default, system CA is used
Optional path to TLS CA file to use for verifying connections to -notifier.url. By default system CA is used
Supports an array of values separated by comma or specified via multiple flags.
-notifier.tlsCertFile array
Optional path to client-side TLS certificate file to use when connecting to -notifier.url
@@ -1093,13 +972,11 @@ The shortlist of configuration flags is the following:
Optional path to client-side TLS certificate key to use when connecting to -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.tlsServerName array
Optional TLS server name to use for connections to -notifier.url. By default, the server name from -notifier.url is used
Optional TLS server name to use for connections to -notifier.url. By default the server name from -notifier.url is used
Supports an array of values separated by comma or specified via multiple flags.
-notifier.url array
Prometheus Alertmanager URL, e.g. http://127.0.0.1:9093. List all Alertmanager URLs if it runs in the cluster mode to ensure high availability.
Supports an array of values separated by comma or specified via multiple flags.
-notifier.blackhole bool
Whether to blackhole alerting notifications. Enable this flag if you want vmalert to evaluate alerting rules without sending any notifications to external receivers (eg. alertmanager). `-notifier.url`, `-notifier.config` and `-notifier.blackhole` are mutually exclusive.
-pprofAuthKey string
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
-promscrape.consul.waitTime duration
@@ -1118,7 +995,7 @@ The shortlist of configuration flags is the following:
-pushmetrics.interval duration
Interval for pushing metrics to -pushmetrics.url (default 10s)
-pushmetrics.url array
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default, metrics exposed at /metrics page aren't pushed to any remote storage
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default metrics exposed at /metrics page aren't pushed to any remote storage
Supports an array of values separated by comma or specified via multiple flags.
-remoteRead.basicAuth.password string
Optional basic auth password for -remoteRead.url
@@ -1151,7 +1028,7 @@ The shortlist of configuration flags is the following:
-remoteRead.showURL
Whether to show -remoteRead.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key
-remoteRead.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -remoteRead.url. By default, system CA is used
Optional path to TLS CA file to use for verifying connections to -remoteRead.url. By default system CA is used
-remoteRead.tlsCertFile string
Optional path to client-side TLS certificate file to use when connecting to -remoteRead.url
-remoteRead.tlsInsecureSkipVerify
@@ -1159,7 +1036,7 @@ The shortlist of configuration flags is the following:
-remoteRead.tlsKeyFile string
Optional path to client-side TLS certificate key to use when connecting to -remoteRead.url
-remoteRead.tlsServerName string
Optional TLS server name to use for connections to -remoteRead.url. By default, the server name from -remoteRead.url is used
Optional TLS server name to use for connections to -remoteRead.url. By default the server name from -remoteRead.url is used
-remoteRead.url vmalert
Optional URL to datasource compatible with Prometheus HTTP API. It can be single node VictoriaMetrics or vmselect.Remote read is used to restore alerts state.This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428. See also '-remoteRead.disablePathAppend', '-remoteRead.showURL'.
-remoteWrite.basicAuth.password string
@@ -1181,7 +1058,7 @@ The shortlist of configuration flags is the following:
-remoteWrite.headers string
Optional HTTP headers to send with each request to the corresponding -remoteWrite.url. For example, -remoteWrite.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -remoteWrite.url. Multiple headers must be delimited by '^^': -remoteWrite.headers='header1:value1^^header2:value2'
-remoteWrite.maxBatchSize int
Defines max number of timeseries to be flushed at once (default 1000)
Defines defines max number of timeseries to be flushed at once (default 1000)
-remoteWrite.maxQueueSize int
Defines the max number of pending datapoints to remote write endpoint (default 100000)
-remoteWrite.oauth2.clientID string
@@ -1194,16 +1071,12 @@ The shortlist of configuration flags is the following:
Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'.
-remoteWrite.oauth2.tokenUrl string
Optional OAuth2 tokenURL to use for -notifier.url.
-remoteWrite.retryMaxTime duration
The max time spent on retry attempts for the failed remote-write request. Change this value if it is expected for remoteWrite.url to be unreachable for more than -remoteWrite.retryMaxTime. See also -remoteWrite.retryMinInterval (default 30s)
-remoteWrite.retryMinInterval duration
The minimum delay between retry attempts. Every next retry attempt will double the delay to prevent hammering of remote database. See also -remoteWrite.retryMaxInterval (default 1s)
-remoteWrite.sendTimeout duration
Timeout for sending data to the configured -remoteWrite.url. (default 30s)
-remoteWrite.showURL
Whether to show -remoteWrite.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key
-remoteWrite.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. By default, system CA is used
Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. By default system CA is used
-remoteWrite.tlsCertFile string
Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url
-remoteWrite.tlsInsecureSkipVerify
@@ -1211,74 +1084,50 @@ The shortlist of configuration flags is the following:
-remoteWrite.tlsKeyFile string
Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url
-remoteWrite.tlsServerName string
Optional TLS server name to use for connections to -remoteWrite.url. By default, the server name from -remoteWrite.url is used
Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used
-remoteWrite.url string
Optional URL to VictoriaMetrics or vminsert where to persist alerts state and recording rules results in form of timeseries. For example, if -remoteWrite.url=http://127.0.0.1:8428 is specified, then the alerts state will be written to http://127.0.0.1:8428/api/v1/write . See also -remoteWrite.disablePathAppend, '-remoteWrite.showURL'.
-replay.disableProgressBar
Whether to disable rendering progress bars during the replay. Progress bar rendering might be verbose or break the logs parsing, so it is recommended to be disabled when not used in interactive mode.
-replay.maxDatapointsPerQuery /query_range
Max number of data points expected in one request. It affects the max time range for every /query_range request during the replay. The higher the value, the less requests will be made during replay. (default 1000)
-replay.maxDatapointsPerQuery int
Max number of data points expected in one request. It affects the max time range for every `/query_range` request during the replay. The higher the value, the less requests will be made during replay. (default 1000)
-replay.ruleRetryAttempts int
Defines how many retries to make before giving up on rule if request for it returns an error. (default 5)
-replay.rulesDelay duration
Delay between rules evaluation within the group. Could be important if there are chained rules inside the group and processing need to wait for previous rule results to be persisted by remote storage before evaluating the next rule.Keep it equal or bigger than -remoteWrite.flushInterval. (default 1s)
Delay between rules evaluation within the group. Could be important if there are chained rules inside of the groupand processing need to wait for previous rule results to be persisted by remote storage before evaluating the next rule.Keep it equal or bigger than -remoteWrite.flushInterval. (default 1s)
-replay.timeFrom string
The time filter in RFC3339 format to select time series with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z'
-replay.timeTo string
The time filter in RFC3339 format to select timeseries with timestamp equal or lower than provided value. E.g. '2020-01-01T20:07:00Z'
-rule array
Path to the files or http url with alerting and/or recording rules.
Supports hierarchical patterns and regexpes.
Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule="/path/to/file". Path to a single file with alerting rules.
-rule="http://<some-server-addr>/path/to/rules". HTTP URL to a page with alerting rules.
-rule="dir/*.yaml" -rule="/*.yaml" -rule="gcs://vmalert-rules/tenant_%{TENANT_ID}/prod".
-rule="dir/**/*.yaml". Includes all the .yaml files in "dir" subfolders recursively.
-rule="/path/to/file". Path to a single file with alerting rules
-rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.
Enterprise version of vmalert supports S3 and GCS paths to rules.
For example: gs://bucket/path/to/rules, s3://bucket/path/to/rules
S3 and GCS paths support only matching by prefix, e.g. s3://bucket/dir/rule_ matches
all files with prefix rule_ in folder dir.
See https://docs.victoriametrics.com/vmalert.html#reading-rules-from-object-storage
Supports an array of values separated by comma or specified via multiple flags.
-rule.configCheckInterval duration
Interval for checking for changes in '-rule' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead
-rule.maxResolveDuration duration
Limits the maximum duration for automatic alert expiration, which by default is 4 times evaluationInterval of the parent group.
Limits the maximum duration for automatic alert expiration, which is by default equal to 3 evaluation intervals of the parent group.
-rule.resendDelay duration
Minimum amount of time to wait before resending an alert to notifier
-rule.templates array
Path or glob pattern to location with go template definitions
for rules annotations templating. Flag can be specified multiple times.
for rules annotations templating. Flag can be specified multiple times.
Examples:
-rule.templates="/path/to/file". Path to a single file with go templates
-rule.templates="dir/*.tpl" -rule.templates="/*.tpl". Relative path to all .tpl files in "dir" folder,
absolute path to all .tpl files in root.
-rule.templates="dir/**/*.tpl". Includes all the .tpl files in "dir" subfolders recursively.
Supports an array of values separated by comma or specified via multiple flags.
-rule.updateEntriesLimit int
Defines the max number of rule's state updates stored in-memory. Rule's updates are available on rule's Details page and are used for debugging purposes. The number of stored updates can be overridden per rule via update_entries_limit param. (default 20)
Defines the max number of rule's state updates stored in-memory. Rule's updates are available on rule's Details page and are used for debugging purposes. The number of stored updates can be overriden per rule via update_entries_limit param. (default 20)
-rule.validateExpressions
Whether to validate rules expressions via MetricsQL engine (default true)
-rule.validateTemplates
Whether to validate annotation and label templates (default true)
-s2a_enable_appengine_dialer
If true, opportunistically use AppEngine-specific dialer to call S2A.
-s2a_timeout duration
Timeout enforced on the connection to the S2A service for handshake. (default 3s)
-s3.configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-s3.configProfile string
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-s3.credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-s3.customEndpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-s3.forcePathStyle
Prefixing endpoint with bucket name when set false, true by default. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html (default true)
-tls
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
@@ -1339,7 +1188,7 @@ The configuration file allows to configure static notifiers, discover notifiers
and [DNS](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config):
For example:
```yaml
```
static_configs:
- targets:
- localhost:9093
@@ -1357,14 +1206,14 @@ dns_sd_configs:
port: 9093
```
The list of configured or discovered Notifiers can be explored via [UI](#web).
The list of configured or discovered Notifiers can be explored via [UI](#Web).
If Alertmanager runs in cluster mode then all its URLs needs to be available during discovery
to ensure [high availability](https://github.com/prometheus/alertmanager#high-availability).
The configuration file [specification](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go)
is the following:
```yaml
```
# Per-target Notifier timeout when pushing alerts.
[ timeout: <duration> | default = 10s ]
@@ -1483,7 +1332,6 @@ docker push my-repo:my-version-name
```
To run the built image in `victoria-metrics-k8s-stack` or `VMAlert` CR object apply the following config change:
```yaml
kind: VMAlert
spec:
@@ -1495,13 +1343,13 @@ spec:
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
1. Run `make vmalert` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make vmalert` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
1. Run `make vmalert-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make vmalert-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-prod` binary and puts it into the `bin` folder.
### ARM build
@@ -1511,11 +1359,11 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
1. Run `make vmalert-linux-arm` or `make vmalert-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make vmalert-linux-arm` or `make vmalert-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-linux-arm` or `vmalert-linux-arm64` binary respectively and puts it into the `bin` folder.
### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
1. Run `make vmalert-linux-arm-prod` or `make vmalert-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
2. Run `make vmalert-linux-arm-prod` or `make vmalert-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-linux-arm-prod` or `vmalert-linux-arm64-prod` binary respectively and puts it into the `bin` folder.

View File

@@ -21,18 +21,17 @@ import (
// AlertingRule is basic alert entity
type AlertingRule struct {
Type config.Type
RuleID uint64
Name string
Expr string
For time.Duration
KeepFiringFor time.Duration
Labels map[string]string
Annotations map[string]string
GroupID uint64
GroupName string
EvalInterval time.Duration
Debug bool
Type config.Type
RuleID uint64
Name string
Expr string
For time.Duration
Labels map[string]string
Annotations map[string]string
GroupID uint64
GroupName string
EvalInterval time.Duration
Debug bool
q datasource.Querier
@@ -48,27 +47,25 @@ type AlertingRule struct {
}
type alertingRuleMetrics struct {
errors *utils.Gauge
pending *utils.Gauge
active *utils.Gauge
samples *utils.Gauge
seriesFetched *utils.Gauge
errors *utils.Gauge
pending *utils.Gauge
active *utils.Gauge
samples *utils.Gauge
}
func newAlertingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule) *AlertingRule {
ar := &AlertingRule{
Type: group.Type,
RuleID: cfg.ID,
Name: cfg.Alert,
Expr: cfg.Expr,
For: cfg.For.Duration(),
KeepFiringFor: cfg.KeepFiringFor.Duration(),
Labels: cfg.Labels,
Annotations: cfg.Annotations,
GroupID: group.ID(),
GroupName: group.Name,
EvalInterval: group.Interval,
Debug: cfg.Debug,
Type: group.Type,
RuleID: cfg.ID,
Name: cfg.Alert,
Expr: cfg.Expr,
For: cfg.For.Duration(),
Labels: cfg.Labels,
Annotations: cfg.Annotations,
GroupID: group.ID(),
GroupName: group.Name,
EvalInterval: group.Interval,
Debug: cfg.Debug,
q: qb.BuildWithParams(datasource.QuerierParams{
DataSourceType: group.Type.String(),
EvaluationInterval: group.Interval,
@@ -124,21 +121,6 @@ func newAlertingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule
e := ar.state.getLast()
return float64(e.samples)
})
ar.metrics.seriesFetched = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_alerting_rules_last_evaluation_series_fetched{%s}`, labels),
func() float64 {
e := ar.state.getLast()
if e.seriesFetched == nil {
// means seriesFetched is unsupported
return -1
}
seriesFetched := float64(*e.seriesFetched)
if seriesFetched == 0 && e.samples > 0 {
// `alert: 0.95` will fetch no series
// but will get one time series in response.
seriesFetched = float64(e.samples)
}
return seriesFetched
})
return ar
}
@@ -148,7 +130,6 @@ func (ar *AlertingRule) Close() {
ar.metrics.pending.Unregister()
ar.metrics.errors.Unregister()
ar.metrics.samples.Unregister()
ar.metrics.seriesFetched.Unregister()
}
// String implements Stringer interface
@@ -253,7 +234,7 @@ func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*l
// to get time series for backfilling.
// It returns ALERT and ALERT_FOR_STATE time series as result.
func (ar *AlertingRule) ExecRange(ctx context.Context, start, end time.Time) ([]prompbmarshal.TimeSeries, error) {
res, err := ar.q.QueryRange(ctx, ar.Expr, start, end)
series, err := ar.q.QueryRange(ctx, ar.Expr, start, end)
if err != nil {
return nil, err
}
@@ -261,7 +242,7 @@ func (ar *AlertingRule) ExecRange(ctx context.Context, start, end time.Time) ([]
qFn := func(query string) ([]datasource.Metric, error) {
return nil, fmt.Errorf("`query` template isn't supported in replay mode")
}
for _, s := range res.Data {
for _, s := range series {
a, err := ar.newAlert(s, nil, time.Time{}, qFn) // initial alert
if err != nil {
return nil, fmt.Errorf("failed to create alert: %s", err)
@@ -301,15 +282,14 @@ const resolvedRetention = 15 * time.Minute
// Based on the Querier results AlertingRule maintains notifier.Alerts
func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]prompbmarshal.TimeSeries, error) {
start := time.Now()
res, req, err := ar.q.Query(ctx, ar.Expr, ts)
qMetrics, req, err := ar.q.Query(ctx, ar.Expr, ts)
curState := ruleStateEntry{
time: start,
at: ts,
duration: time.Since(start),
samples: len(res.Data),
seriesFetched: res.SeriesFetched,
err: err,
curl: requestToCurl(req),
time: start,
at: ts,
duration: time.Since(start),
samples: len(qMetrics),
err: err,
curl: requestToCurl(req),
}
defer func() {
@@ -335,11 +315,11 @@ func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]pr
qFn := func(query string) ([]datasource.Metric, error) {
res, _, err := ar.q.Query(ctx, query, ts)
return res.Data, err
return res, err
}
updated := make(map[uint64]struct{})
// update list of active alerts
for _, m := range res.Data {
for _, m := range qMetrics {
ls, err := ar.toLabels(m, qFn)
if err != nil {
curState.err = fmt.Errorf("failed to expand labels: %s", err)
@@ -368,7 +348,6 @@ func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]pr
if err != nil {
return nil, err
}
a.KeepFiringSince = time.Time{}
continue
}
a, err := ar.newAlert(m, ls, start, qFn)
@@ -394,24 +373,12 @@ func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]pr
ar.logDebugf(ts, a, "PENDING => DELETED: is absent in current evaluation round")
continue
}
// check if alert should keep StateFiring if rule has
// `keep_firing_for` field
if a.State == notifier.StateFiring {
if ar.KeepFiringFor > 0 {
if a.KeepFiringSince.IsZero() {
a.KeepFiringSince = ts
}
}
// alerts with ar.KeepFiringFor>0 may remain FIRING
// even if their expression isn't true anymore
if ts.Sub(a.KeepFiringSince) > ar.KeepFiringFor {
a.State = notifier.StateInactive
a.ResolvedAt = ts
ar.logDebugf(ts, a, "FIRING => INACTIVE: is absent in current evaluation round")
continue
}
ar.logDebugf(ts, a, "KEEP_FIRING: will keep firing for %fs since %v", ar.KeepFiringFor.Seconds(), a.KeepFiringSince)
a.State = notifier.StateInactive
a.ResolvedAt = ts
ar.logDebugf(ts, a, "FIRING => INACTIVE: is absent in current evaluation round")
}
continue
}
numActivePending++
if a.State == notifier.StatePending && ts.Sub(a.ActiveAt) >= ar.For {
@@ -451,7 +418,6 @@ func (ar *AlertingRule) UpdateWith(r Rule) error {
}
ar.Expr = nr.Expr
ar.For = nr.For
ar.KeepFiringFor = nr.KeepFiringFor
ar.Labels = nr.Labels
ar.Annotations = nr.Annotations
ar.EvalInterval = nr.EvalInterval
@@ -519,24 +485,22 @@ func (ar *AlertingRule) AlertAPI(id uint64) *APIAlert {
func (ar *AlertingRule) ToAPI() APIRule {
lastState := ar.state.getLast()
r := APIRule{
Type: "alerting",
DatasourceType: ar.Type.String(),
Name: ar.Name,
Query: ar.Expr,
Duration: ar.For.Seconds(),
KeepFiringFor: ar.KeepFiringFor.Seconds(),
Labels: ar.Labels,
Annotations: ar.Annotations,
LastEvaluation: lastState.time,
EvaluationTime: lastState.duration.Seconds(),
Health: "ok",
State: "inactive",
Alerts: ar.AlertsToAPI(),
LastSamples: lastState.samples,
LastSeriesFetched: lastState.seriesFetched,
MaxUpdates: ar.state.size(),
Updates: ar.state.getAll(),
Debug: ar.Debug,
Type: "alerting",
DatasourceType: ar.Type.String(),
Name: ar.Name,
Query: ar.Expr,
Duration: ar.For.Seconds(),
Labels: ar.Labels,
Annotations: ar.Annotations,
LastEvaluation: lastState.time,
EvaluationTime: lastState.duration.Seconds(),
Health: "ok",
State: "inactive",
Alerts: ar.AlertsToAPI(),
LastSamples: lastState.samples,
MaxUpdates: ar.state.size(),
Updates: ar.state.getAll(),
Debug: ar.Debug,
// encode as strings to avoid rounding in JSON
ID: fmt.Sprintf("%d", ar.ID()),
@@ -593,9 +557,6 @@ func (ar *AlertingRule) newAlertAPI(a notifier.Alert) *APIAlert {
if alertURLGeneratorFn != nil {
aa.SourceLink = alertURLGeneratorFn(a)
}
if a.State == notifier.StateFiring && !a.KeepFiringSince.IsZero() {
aa.Stabilizing = true
}
return aa
}
@@ -676,12 +637,11 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, ts ti
ar.logDebugf(ts, nil, "restoring alert state via query %q", expr)
res, _, err := q.Query(ctx, expr, ts)
qMetrics, _, err := q.Query(ctx, expr, ts)
if err != nil {
return err
}
qMetrics := res.Data
if len(qMetrics) < 1 {
ar.logDebugf(ts, nil, "no response was received from restore query")
continue

View File

@@ -113,7 +113,7 @@ func TestAlertingRule_Exec(t *testing.T) {
testCases := []struct {
rule *AlertingRule
steps [][]datasource.Metric
expAlerts map[int][]testAlert
expAlerts []testAlert
}{
{
newTestAlertingRule("empty", 0),
@@ -125,8 +125,50 @@ func TestAlertingRule_Exec(t *testing.T) {
[][]datasource.Metric{
{datasource.Metric{Values: []float64{1}, Timestamps: []int64{1}}},
},
map[int][]testAlert{
0: {{alert: &notifier.Alert{State: notifier.StateFiring}}},
[]testAlert{
{alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
newTestAlertingRule("single-firing", 0),
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
newTestAlertingRule("single-firing=>inactive", 0),
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{},
},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
},
},
{
newTestAlertingRule("single-firing=>inactive=>firing", 0),
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{},
{metricWithLabels(t, "name", "foo")},
},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
newTestAlertingRule("single-firing=>inactive=>firing=>inactive", 0),
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{},
{metricWithLabels(t, "name", "foo")},
{},
},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
},
},
{
@@ -138,16 +180,12 @@ func TestAlertingRule_Exec(t *testing.T) {
{},
{},
},
map[int][]testAlert{
0: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
1: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}}},
2: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
3: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}}},
4: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}}},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
},
},
{
newTestAlertingRule("single-firing=>inactive=>firing=>inactive=>inactive=>firing", 0),
newTestAlertingRule("single-firing=>inactive=>firing=>inactive=>empty=>firing", 0),
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{},
@@ -156,13 +194,8 @@ func TestAlertingRule_Exec(t *testing.T) {
{},
{metricWithLabels(t, "name", "foo")},
},
map[int][]testAlert{
0: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
1: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}}},
2: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
3: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}}},
4: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}}},
5: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -174,12 +207,10 @@ func TestAlertingRule_Exec(t *testing.T) {
metricWithLabels(t, "name", "foo2"),
},
},
map[int][]testAlert{
0: {
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
{labels: []string{"name", "foo1"}, alert: &notifier.Alert{State: notifier.StateFiring}},
{labels: []string{"name", "foo2"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
{labels: []string{"name", "foo1"}, alert: &notifier.Alert{State: notifier.StateFiring}},
{labels: []string{"name", "foo2"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -192,19 +223,10 @@ func TestAlertingRule_Exec(t *testing.T) {
// 1: fire first alert
// 2: fire second alert, set first inactive
// 3: fire third alert, set second inactive
map[int][]testAlert{
0: {
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
1: {
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
{labels: []string{"name", "foo1"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
2: {
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
{labels: []string{"name", "foo1"}, alert: &notifier.Alert{State: notifier.StateInactive}},
{labels: []string{"name", "foo2"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
{labels: []string{"name", "foo1"}, alert: &notifier.Alert{State: notifier.StateInactive}},
{labels: []string{"name", "foo2"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -212,8 +234,8 @@ func TestAlertingRule_Exec(t *testing.T) {
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
},
map[int][]testAlert{
0: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}},
},
},
{
@@ -222,9 +244,8 @@ func TestAlertingRule_Exec(t *testing.T) {
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
},
map[int][]testAlert{
0: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
1: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -232,13 +253,34 @@ func TestAlertingRule_Exec(t *testing.T) {
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
// empty step to delete pending alerts
// empty step to reset and delete pending alerts
{},
},
map[int][]testAlert{
0: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
1: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
2: {},
nil,
},
{
newTestAlertingRule("for-pending=>firing=>inactive", defaultStep),
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
// empty step to reset pending alerts
{},
},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
},
},
{
newTestAlertingRule("for-pending=>firing=>inactive=>pending", defaultStep),
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
// empty step to reset pending alerts
{},
{metricWithLabels(t, "name", "foo")},
},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}},
},
},
{
@@ -246,57 +288,13 @@ func TestAlertingRule_Exec(t *testing.T) {
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
// empty step to set alert inactive
// empty step to reset pending alerts
{},
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
},
map[int][]testAlert{
0: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
1: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
2: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}}},
3: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
4: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
},
},
{
newTestAlertingRuleWithKeepFiring("for-pending=>firing=>keepfiring=>firing", defaultStep, defaultStep),
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
// empty step to keep firing
{},
{metricWithLabels(t, "name", "foo")},
},
map[int][]testAlert{
0: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
1: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
2: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
3: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
},
},
{
newTestAlertingRuleWithKeepFiring("for-pending=>firing=>keepfiring=>keepfiring=>inactive=>pending=>firing", defaultStep, 2*defaultStep),
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
// empty step to keep firing
{},
// another empty step to keep firing
{},
// empty step to set alert inactive
{},
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
},
map[int][]testAlert{
0: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
1: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
2: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
3: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
4: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}}},
5: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}}},
6: {{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}}},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
}
@@ -306,7 +304,7 @@ func TestAlertingRule_Exec(t *testing.T) {
fq := &fakeQuerier{}
tc.rule.q = fq
tc.rule.GroupID = fakeGroup.ID()
for i, step := range tc.steps {
for _, step := range tc.steps {
fq.reset()
fq.add(step...)
if _, err := tc.rule.Exec(context.TODO(), time.Now(), 0); err != nil {
@@ -314,31 +312,28 @@ func TestAlertingRule_Exec(t *testing.T) {
}
// artificial delay between applying steps
time.Sleep(defaultStep)
if _, ok := tc.expAlerts[i]; !ok {
continue
}
if len(tc.rule.alerts) != len(tc.expAlerts) {
t.Fatalf("expected %d alerts; got %d", len(tc.expAlerts), len(tc.rule.alerts))
}
expAlerts := make(map[uint64]*notifier.Alert)
for _, ta := range tc.expAlerts {
labels := make(map[string]string)
for i := 0; i < len(ta.labels); i += 2 {
k, v := ta.labels[i], ta.labels[i+1]
labels[k] = v
}
if len(tc.rule.alerts) != len(tc.expAlerts[i]) {
t.Fatalf("evalIndex %d: expected %d alerts; got %d", i, len(tc.expAlerts[i]), len(tc.rule.alerts))
labels[alertNameLabel] = tc.rule.Name
h := hash(labels)
expAlerts[h] = ta.alert
}
for key, exp := range expAlerts {
got, ok := tc.rule.alerts[key]
if !ok {
t.Fatalf("expected to have key %d", key)
}
expAlerts := make(map[uint64]*notifier.Alert)
for _, ta := range tc.expAlerts[i] {
labels := make(map[string]string)
for i := 0; i < len(ta.labels); i += 2 {
k, v := ta.labels[i], ta.labels[i+1]
labels[k] = v
}
labels[alertNameLabel] = tc.rule.Name
h := hash(labels)
expAlerts[h] = ta.alert
}
for key, exp := range expAlerts {
got, ok := tc.rule.alerts[key]
if !ok {
t.Fatalf("evalIndex %d: expected to have key %d", i, key)
}
if got.State != exp.State {
t.Fatalf("evalIndex %d: expected state %d; got %d", i, exp.State, got.State)
}
if got.State != exp.State {
t.Fatalf("expected state %d; got %d", exp.State, got.State)
}
}
})
@@ -871,8 +866,8 @@ func TestAlertingRule_Template(t *testing.T) {
for hash, expAlert := range tc.expAlerts {
gotAlert := tc.rule.alerts[hash]
if gotAlert == nil {
t.Fatalf("alert %d is missing; labels: %v; annotations: %v", hash, expAlert.Labels, expAlert.Annotations)
break
t.Fatalf("alert %d is missing; labels: %v; annotations: %v",
hash, expAlert.Labels, expAlert.Annotations)
}
if !reflect.DeepEqual(expAlert.Annotations, gotAlert.Annotations) {
t.Fatalf("expected to have annotations %#v; got %#v", expAlert.Annotations, gotAlert.Annotations)
@@ -975,18 +970,11 @@ func newTestRuleWithLabels(name string, labels ...string) *AlertingRule {
}
func newTestAlertingRule(name string, waitFor time.Duration) *AlertingRule {
rule := AlertingRule{
return &AlertingRule{
Name: name,
For: waitFor,
EvalInterval: waitFor,
alerts: make(map[uint64]*notifier.Alert),
state: newRuleState(10),
}
return &rule
}
func newTestAlertingRuleWithKeepFiring(name string, waitFor, keepFiringFor time.Duration) *AlertingRule {
rule := newTestAlertingRule(name, waitFor)
rule.KeepFiringFor = keepFiringFor
return rule
}

View File

@@ -5,14 +5,16 @@ import (
"fmt"
"hash/fnv"
"net/url"
"os"
"path/filepath"
"sort"
"strings"
"gopkg.in/yaml.v2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config/log"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envtemplate"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
)
@@ -36,8 +38,7 @@ type Group struct {
Params url.Values `yaml:"params"`
// Headers contains optional HTTP headers added to each rule request
Headers []Header `yaml:"headers,omitempty"`
// NotifierHeaders contains optional HTTP headers sent to notifiers for generated notifications
NotifierHeaders []Header `yaml:"notifier_headers,omitempty"`
// Catches all undefined fields and must be empty after parsing.
XXX map[string]interface{} `yaml:",inline"`
}
@@ -105,16 +106,14 @@ func (g *Group) Validate(validateTplFn ValidateTplFn, validateExpressions bool)
// Rule describes entity that represent either
// recording rule or alerting rule.
type Rule struct {
ID uint64
Record string `yaml:"record,omitempty"`
Alert string `yaml:"alert,omitempty"`
Expr string `yaml:"expr"`
For *promutils.Duration `yaml:"for,omitempty"`
// Alert will continue firing for this long even when the alerting expression no longer has results.
KeepFiringFor *promutils.Duration `yaml:"keep_firing_for,omitempty"`
Labels map[string]string `yaml:"labels,omitempty"`
Annotations map[string]string `yaml:"annotations,omitempty"`
Debug bool `yaml:"debug,omitempty"`
ID uint64
Record string `yaml:"record,omitempty"`
Alert string `yaml:"alert,omitempty"`
Expr string `yaml:"expr"`
For *promutils.Duration `yaml:"for,omitempty"`
Labels map[string]string `yaml:"labels,omitempty"`
Annotations map[string]string `yaml:"annotations,omitempty"`
Debug bool `yaml:"debug,omitempty"`
// UpdateEntriesLimit defines max number of rule's state updates stored in memory.
// Overrides `-rule.updateEntriesLimit`.
UpdateEntriesLimit *int `yaml:"update_entries_limit,omitempty"`
@@ -202,45 +201,21 @@ func (r *Rule) Validate() error {
// ValidateTplFn must validate the given annotations
type ValidateTplFn func(annotations map[string]string) error
// cLogger is a logger with support of logs suppressing.
// it is used when logs emitted by config package needs
// to be suppressed.
var cLogger = &log.Logger{}
// ParseSilent parses rule configs from given file patterns without emitting logs
func ParseSilent(pathPatterns []string, validateTplFn ValidateTplFn, validateExpressions bool) ([]Group, error) {
cLogger.Suppress(true)
defer cLogger.Suppress(false)
files, err := readFromFS(pathPatterns)
if err != nil {
return nil, fmt.Errorf("failed to read from the config: %s", err)
}
return parse(files, validateTplFn, validateExpressions)
}
// Parse parses rule configs from given file patterns
func Parse(pathPatterns []string, validateTplFn ValidateTplFn, validateExpressions bool) ([]Group, error) {
files, err := readFromFS(pathPatterns)
if err != nil {
return nil, fmt.Errorf("failed to read from the config: %s", err)
var fp []string
for _, pattern := range pathPatterns {
matches, err := filepath.Glob(pattern)
if err != nil {
return nil, fmt.Errorf("error reading file pattern %s: %w", pattern, err)
}
fp = append(fp, matches...)
}
groups, err := parse(files, validateTplFn, validateExpressions)
if err != nil {
return nil, fmt.Errorf("failed to parse %s: %s", pathPatterns, err)
}
if len(groups) < 1 {
cLogger.Warnf("no groups found in %s", strings.Join(pathPatterns, ";"))
}
return groups, nil
}
func parse(files map[string][]byte, validateTplFn ValidateTplFn, validateExpressions bool) ([]Group, error) {
errGroup := new(utils.ErrGroup)
var groups []Group
for file, data := range files {
for _, file := range fp {
uniqueGroups := map[string]struct{}{}
gr, err := parseConfig(data)
gr, err := parseFile(file)
if err != nil {
errGroup.Add(fmt.Errorf("failed to parse file %q: %w", file, err))
continue
@@ -262,19 +237,20 @@ func parse(files map[string][]byte, validateTplFn ValidateTplFn, validateExpress
if err := errGroup.Err(); err != nil {
return nil, err
}
sort.SliceStable(groups, func(i, j int) bool {
if groups[i].File != groups[j].File {
return groups[i].File < groups[j].File
}
return groups[i].Name < groups[j].Name
})
if len(groups) < 1 {
logger.Warnf("no groups found in %s", strings.Join(pathPatterns, ";"))
}
return groups, nil
}
func parseConfig(data []byte) ([]Group, error) {
data, err := envtemplate.ReplaceBytes(data)
func parseFile(path string) ([]Group, error) {
data, err := os.ReadFile(path)
if err != nil {
return nil, fmt.Errorf("cannot expand environment vars: %w", err)
return nil, fmt.Errorf("error reading alert rule file %q: %w", path, err)
}
data, err = envtemplate.ReplaceBytes(data)
if err != nil {
return nil, fmt.Errorf("cannot expand environment vars in %q: %w", path, err)
}
g := struct {
Groups []Group `yaml:"groups"`

View File

@@ -1,8 +1,6 @@
package config
import (
"net/http"
"net/http/httptest"
"net/url"
"os"
"strings"
@@ -29,40 +27,6 @@ func TestParseGood(t *testing.T) {
}
}
func TestParseFromURL(t *testing.T) {
mux := http.NewServeMux()
mux.HandleFunc("/bad", func(w http.ResponseWriter, _ *http.Request) {
w.Write([]byte("foo bar"))
})
mux.HandleFunc("/good-alert", func(w http.ResponseWriter, _ *http.Request) {
w.Write([]byte(`
groups:
- name: TestGroup
rules:
- alert: Conns
expr: vm_tcplistener_conns > 0`))
})
mux.HandleFunc("/good-rr", func(w http.ResponseWriter, _ *http.Request) {
w.Write([]byte(`
groups:
- name: TestGroup
rules:
- record: conns
expr: max(vm_tcplistener_conns)`))
})
srv := httptest.NewServer(mux)
defer srv.Close()
if _, err := Parse([]string{srv.URL + "/good-alert", srv.URL + "/good-rr"}, notifier.ValidateTemplates, true); err != nil {
t.Errorf("error parsing URLs %s", err)
}
if _, err := Parse([]string{srv.URL + "/bad"}, notifier.ValidateTemplates, true); err == nil {
t.Errorf("expected parsing error: %s", err)
}
}
func TestParseBad(t *testing.T) {
testCases := []struct {
path []string
@@ -100,10 +64,6 @@ func TestParseBad(t *testing.T) {
[]string{"testdata/dir/rules6-bad.rules"},
"missing ':' in header",
},
{
[]string{"http://unreachable-url"},
"failed to read",
},
}
for _, tc := range testCases {
_, err := Parse(tc.path, notifier.ValidateTemplates, true)
@@ -142,8 +102,7 @@ func TestGroup_Validate(t *testing.T) {
expErr: "group name must be set",
},
{
group: &Group{
Name: "test",
group: &Group{Name: "test",
Rules: []Rule{
{
Record: "record",
@@ -154,8 +113,7 @@ func TestGroup_Validate(t *testing.T) {
expErr: "",
},
{
group: &Group{
Name: "test",
group: &Group{Name: "test",
Rules: []Rule{
{
Record: "record",
@@ -167,8 +125,7 @@ func TestGroup_Validate(t *testing.T) {
validateExpressions: true,
},
{
group: &Group{
Name: "test",
group: &Group{Name: "test",
Rules: []Rule{
{
Alert: "alert",
@@ -182,8 +139,7 @@ func TestGroup_Validate(t *testing.T) {
expErr: "",
},
{
group: &Group{
Name: "test",
group: &Group{Name: "test",
Rules: []Rule{
{
Alert: "alert",
@@ -200,8 +156,7 @@ func TestGroup_Validate(t *testing.T) {
validateAnnotations: true,
},
{
group: &Group{
Name: "test",
group: &Group{Name: "test",
Rules: []Rule{
{
Alert: "alert",
@@ -216,8 +171,7 @@ func TestGroup_Validate(t *testing.T) {
expErr: "duplicate",
},
{
group: &Group{
Name: "test",
group: &Group{Name: "test",
Rules: []Rule{
{Alert: "alert", Expr: "up == 1", Labels: map[string]string{
"summary": "{{ value|query }}",
@@ -230,8 +184,7 @@ func TestGroup_Validate(t *testing.T) {
expErr: "duplicate",
},
{
group: &Group{
Name: "test",
group: &Group{Name: "test",
Rules: []Rule{
{Record: "record", Expr: "up == 1", Labels: map[string]string{
"summary": "{{ value|query }}",
@@ -244,8 +197,7 @@ func TestGroup_Validate(t *testing.T) {
expErr: "duplicate",
},
{
group: &Group{
Name: "test",
group: &Group{Name: "test",
Rules: []Rule{
{Alert: "alert", Expr: "up == 1", Labels: map[string]string{
"summary": "{{ value|query }}",
@@ -258,8 +210,7 @@ func TestGroup_Validate(t *testing.T) {
expErr: "",
},
{
group: &Group{
Name: "test",
group: &Group{Name: "test",
Rules: []Rule{
{Record: "alert", Expr: "up == 1", Labels: map[string]string{
"summary": "{{ value|query }}",
@@ -272,8 +223,7 @@ func TestGroup_Validate(t *testing.T) {
expErr: "",
},
{
group: &Group{
Name: "test thanos",
group: &Group{Name: "test thanos",
Type: NewRawType("thanos"),
Rules: []Rule{
{Alert: "alert", Expr: "up == 1", Labels: map[string]string{
@@ -285,8 +235,7 @@ func TestGroup_Validate(t *testing.T) {
expErr: "unknown datasource type",
},
{
group: &Group{
Name: "test graphite",
group: &Group{Name: "test graphite",
Type: NewGraphiteType(),
Rules: []Rule{
{Alert: "alert", Expr: "up == 1", Labels: map[string]string{
@@ -298,8 +247,7 @@ func TestGroup_Validate(t *testing.T) {
expErr: "",
},
{
group: &Group{
Name: "test prometheus",
group: &Group{Name: "test prometheus",
Type: NewPrometheusType(),
Rules: []Rule{
{Alert: "alert", Expr: "up == 1", Labels: map[string]string{
@@ -404,7 +352,7 @@ func TestHashRule(t *testing.T) {
true,
},
{
Rule{Alert: "alert", Expr: "up == 1", For: promutils.NewDuration(time.Minute), KeepFiringFor: promutils.NewDuration(time.Minute)},
Rule{Alert: "alert", Expr: "up == 1", For: promutils.NewDuration(time.Minute)},
Rule{Alert: "alert", Expr: "up == 1"},
true,
},
@@ -590,24 +538,6 @@ rules:
`)
})
t.Run("`notifier_headers` change", func(t *testing.T) {
f(t, `
name: TestGroup
notifier_headers:
- "TenantID: foo"
rules:
- alert: foo
expr: sum by(job) (up == 1)
`, `
name: TestGroup
notifier_headers:
- "TenantID: bar"
rules:
- alert: foo
expr: sum by(job) (up == 1)
`)
})
t.Run("`debug` change", func(t *testing.T) {
f(t, `
name: TestGroup

View File

@@ -1,111 +0,0 @@
package config
import (
"fmt"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config/fslocal"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config/fsurl"
)
// FS represent a file system abstract for reading files.
type FS interface {
// Init initializes FS.
Init() error
// String must return human-readable representation of FS.
String() string
// List returns the list of file names which will be read via Read fn
List() ([]string, error)
// Read returns a list of read files in form of a map
// where key is a file name and value is a content of read file.
// Read must be called only after the successful Init call.
Read(files []string) (map[string][]byte, error)
}
var (
fsRegistryMu sync.Mutex
fsRegistry = make(map[string]FS)
)
// readFromFS parses the given path list and inits FS for each item.
// Once initialed, readFromFS will try to read and return files from each FS.
// readFromFS returns an error if at least one FS failed to init.
// The function can be called multiple times but each unique path
// will be initialed only once.
//
// It is allowed to mix different FS types in path list.
func readFromFS(paths []string) (map[string][]byte, error) {
var err error
result := make(map[string][]byte)
for _, path := range paths {
fsRegistryMu.Lock()
fs, ok := fsRegistry[path]
if !ok {
fs, err = newFS(path)
if err != nil {
fsRegistryMu.Unlock()
return nil, fmt.Errorf("error while parsing path %q: %w", path, err)
}
if err := fs.Init(); err != nil {
fsRegistryMu.Unlock()
return nil, fmt.Errorf("error while initializing path %q: %w", path, err)
}
fsRegistry[path] = fs
}
fsRegistryMu.Unlock()
list, err := fs.List()
if err != nil {
return nil, fmt.Errorf("failed to list files from %q", fs)
}
cLogger.Infof("found %d files to read from %q", len(list), fs)
if len(list) < 1 {
continue
}
ts := time.Now()
files, err := fs.Read(list)
if err != nil {
return nil, fmt.Errorf("error while reading files from %q: %w", fs, err)
}
cLogger.Infof("finished reading %d files in %v from %q", len(list), time.Since(ts), fs)
for k, v := range files {
if _, ok := result[k]; ok {
return nil, fmt.Errorf("duplicate found for file name %q: file names must be unique", k)
}
result[k] = v
}
}
return result, nil
}
// newFS creates FS based on the give path.
// Supported file systems are: fs
func newFS(originPath string) (FS, error) {
scheme := "fs"
path := originPath
n := strings.Index(path, "://")
if n >= 0 {
scheme = path[:n]
path = path[n+len("://"):]
}
if len(path) == 0 {
return nil, fmt.Errorf("path cannot be empty")
}
switch scheme {
case "fs":
return &fslocal.FS{Pattern: path}, nil
case "http", "https":
return &fsurl.FS{Path: originPath}, nil
default:
return nil, fmt.Errorf("unsupported scheme %q", scheme)
}
}

View File

@@ -1,39 +0,0 @@
package config
import (
"strings"
"testing"
)
func TestNewFS(t *testing.T) {
f := func(path, expStr string) {
t.Helper()
fs, err := newFS(path)
if err != nil {
t.Fatalf("unexpected err: %s", err)
}
if fs.String() != expStr {
t.Fatalf("expected FS %q; got %q", expStr, fs.String())
}
}
f("/foo/bar", "Local FS{MatchPattern: \"/foo/bar\"}")
f("fs:///foo/bar", "Local FS{MatchPattern: \"/foo/bar\"}")
}
func TestNewFSNegative(t *testing.T) {
f := func(path, expErr string) {
t.Helper()
_, err := newFS(path)
if err == nil {
t.Fatalf("expected to have err: %s", expErr)
}
if !strings.Contains(err.Error(), expErr) {
t.Fatalf("expected to have err %q; got %q instead", expErr, err)
}
}
f("", "path cannot be empty")
f("fs://", "path cannot be empty")
f("foobar://baz", `unsupported scheme "foobar"`)
}

View File

@@ -1,50 +0,0 @@
package fslocal
import (
"fmt"
"os"
"github.com/bmatcuk/doublestar/v4"
)
// FS represents a local file system
type FS struct {
// Pattern is used for matching one or multiple files.
// The pattern may describe hierarchical names such as
// /usr/*/bin/ed (assuming the Separator is '/').
Pattern string
}
// Init verifies that configured Pattern is correct
func (fs *FS) Init() error {
_, err := doublestar.FilepathGlob(fs.Pattern)
return err
}
// String implements Stringer interface
func (fs *FS) String() string {
return fmt.Sprintf("Local FS{MatchPattern: %q}", fs.Pattern)
}
// List returns the list of file names which will be read via Read fn
func (fs *FS) List() ([]string, error) {
matches, err := doublestar.FilepathGlob(fs.Pattern)
if err != nil {
return nil, fmt.Errorf("error while matching files via pattern %s: %w", fs.Pattern, err)
}
return matches, nil
}
// Read returns a map of read files where
// key is the file name and value is file's content.
func (fs *FS) Read(files []string) (map[string][]byte, error) {
result := make(map[string][]byte)
for _, path := range files {
data, err := os.ReadFile(path)
if err != nil {
return nil, fmt.Errorf("error while reading file %q: %w", path, err)
}
result[path] = data
}
return result, nil
}

View File

@@ -1,57 +0,0 @@
package fsurl
import (
"fmt"
"io"
"net/http"
"net/url"
)
// FS represents a struct which can read content from URL Path
type FS struct {
// Path defines the URL to read the data from
Path string
}
// Init verifies that configured Path is correct
func (fs *FS) Init() error {
_, err := url.Parse(fs.Path)
return err
}
// String implements Stringer interface
func (fs *FS) String() string {
return fmt.Sprintf("URL {Path: %q}", fs.Path)
}
// List returns the list of file names which will be read via Read fn
// List isn't supported by FS and reads from Path only
func (fs *FS) List() ([]string, error) {
return []string{fs.Path}, nil
}
// Read returns a map of read files where
// key is the file name and value is file's content.
func (fs *FS) Read(files []string) (map[string][]byte, error) {
result := make(map[string][]byte)
for _, path := range files {
resp, err := http.Get(path)
if err != nil {
return nil, fmt.Errorf("failed to read from %q: %w", path, err)
}
data, err := io.ReadAll(resp.Body)
_ = resp.Body.Close()
if resp.StatusCode != http.StatusOK {
if len(data) > 4*1024 {
data = data[:4*1024]
}
return nil, fmt.Errorf("unexpected status code when fetching %q: %d, expecting %d; response: %q",
path, resp.StatusCode, http.StatusOK, data)
}
if err != nil {
return nil, fmt.Errorf("cannot read %q: %s", path, err)
}
result[path] = data
}
return result, nil
}

View File

@@ -1,59 +0,0 @@
package log
import (
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
// Logger is using lib/logger for logging
// but can be suppressed via Suppress method
type Logger struct {
mu sync.RWMutex
disabled bool
}
// Suppress whether to ignore message logging.
// Once suppressed, logging continues to be ignored
// until logger is un-suppressed.
func (l *Logger) Suppress(v bool) {
l.mu.Lock()
l.disabled = v
l.mu.Unlock()
}
func (l *Logger) isDisabled() bool {
l.mu.RLock()
defer l.mu.RUnlock()
return l.disabled
}
// Errorf logs error message.
func (l *Logger) Errorf(format string, args ...interface{}) {
if l.isDisabled() {
return
}
logger.Errorf(format, args...)
}
// Warnf logs warning message.
func (l *Logger) Warnf(format string, args ...interface{}) {
if l.isDisabled() {
return
}
logger.Warnf(format, args...)
}
// Infof logs info message.
func (l *Logger) Infof(format string, args ...interface{}) {
if l.isDisabled() {
return
}
logger.Infof(format, args...)
}
// Panicf logs panic message and panics.
// Panicf can't be suppressed
func (l *Logger) Panicf(format string, args ...interface{}) {
logger.Panicf(format, args...)
}

View File

@@ -1,54 +0,0 @@
package log
import (
"bytes"
"fmt"
"strings"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
func TestOutput(t *testing.T) {
testOutput := &bytes.Buffer{}
logger.SetOutputForTests(testOutput)
defer logger.ResetOutputForTest()
log := &Logger{}
mustMatch := func(exp string) {
t.Helper()
if exp == "" {
if testOutput.String() != "" {
t.Errorf("expected output to be empty; got %q", testOutput.String())
return
}
}
if !strings.Contains(testOutput.String(), exp) {
t.Errorf("output %q should contain %q", testOutput.String(), exp)
}
fmt.Println(testOutput.String())
testOutput.Reset()
}
log.Warnf("foo")
mustMatch("foo")
log.Infof("info %d", 2)
mustMatch("info 2")
log.Errorf("error %s %d", "baz", 5)
mustMatch("error baz 5")
log.Suppress(true)
log.Warnf("foo")
mustMatch("")
log.Infof("info %d", 2)
mustMatch("")
log.Errorf("error %q %d", "baz", 5)
mustMatch("")
}

View File

@@ -5,8 +5,6 @@ groups:
limit: 1000
headers:
- "MyHeader: foo"
notifier_headers:
- "MyHeader: foo"
params:
denyPartialResponse: ["true"]
rules:

View File

@@ -1,12 +1,9 @@
package datasource
import (
"bytes"
"context"
"net/http"
"net/url"
"sort"
"strconv"
"time"
)
@@ -16,22 +13,11 @@ type Querier interface {
// It returns list of Metric in response, the http.Request used for sending query
// and error if any. Returned http.Request can't be reused and its body is already read.
// Query should stop once ctx is cancelled.
Query(ctx context.Context, query string, ts time.Time) (Result, *http.Request, error)
Query(ctx context.Context, query string, ts time.Time) ([]Metric, *http.Request, error)
// QueryRange executes range request with the given query on the given time range.
// It returns list of Metric in response and error if any.
// QueryRange should stop once ctx is cancelled.
QueryRange(ctx context.Context, query string, from, to time.Time) (Result, error)
}
// Result represents expected response from the datasource
type Result struct {
// Data contains list of received Metric
Data []Metric
// SeriesFetched contains amount of time series processed by datasource
// during query evaluation.
// If nil, then this feature is not supported by the datasource.
// SeriesFetched is supported by VictoriaMetrics since v1.90.
SeriesFetched *int
QueryRange(ctx context.Context, query string, from, to time.Time) ([]Metric, error)
}
// QuerierBuilder builds Querier with given params.
@@ -111,69 +97,3 @@ type Label struct {
Name string
Value string
}
// Labels is collection of Label
type Labels []Label
func (ls Labels) Len() int { return len(ls) }
func (ls Labels) Swap(i, j int) { ls[i], ls[j] = ls[j], ls[i] }
func (ls Labels) Less(i, j int) bool { return ls[i].Name < ls[j].Name }
func (ls Labels) String() string {
var b bytes.Buffer
b.WriteByte('{')
for i, l := range ls {
if i > 0 {
b.WriteByte(',')
b.WriteByte(' ')
}
b.WriteString(l.Name)
b.WriteByte('=')
b.WriteString(strconv.Quote(l.Value))
}
b.WriteByte('}')
return b.String()
}
// LabelCompare return negative if a is less than b, return 0 if they are the same
// eg.
// a=[]Label{{Name: "a", Value: "1"}},b=[]Label{{Name: "b", Value: "1"}}, return -1
// a=[]Label{{Name: "a", Value: "2"}},b=[]Label{{Name: "a", Value: "1"}}, return 1
// a=[]Label{{Name: "a", Value: "1"}},b=[]Label{{Name: "a", Value: "1"}}, return 0
func LabelCompare(a, b Labels) int {
l := len(a)
if len(b) < l {
l = len(b)
}
for i := 0; i < l; i++ {
if a[i].Name != b[i].Name {
if a[i].Name < b[i].Name {
return -1
}
return 1
}
if a[i].Value != b[i].Value {
if a[i].Value < b[i].Value {
return -1
}
return 1
}
}
// if all labels so far were in common, the set with fewer labels comes first.
return len(a) - len(b)
}
// ConvertToLabels convert map to Labels
func ConvertToLabels(m map[string]string) (labelset Labels) {
for k, v := range m {
labelset = append(labelset, Label{
Name: k,
Value: v,
})
}
// sort label
sort.Slice(labelset, func(i, j int) bool { return labelset[i].Name < labelset[j].Name })
return
}

View File

@@ -47,7 +47,7 @@ var (
"For example, if -datasource.queryStep=15s then param \"step\" with value \"15s\" will be added to every query. "+
"If set to 0, rule's evaluation interval will be used instead.")
queryTimeAlignment = flag.Bool("datasource.queryTimeAlignment", true, `Whether to align "time" parameter with evaluation interval.`+
"Alignment supposed to produce deterministic results despite number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257")
"Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257")
maxIdleConnections = flag.Int("datasource.maxIdleConnections", 100, `Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state.`)
disableKeepAlive = flag.Bool("datasource.disableKeepAlive", false, `Whether to disable long-lived connections to the datasource. `+
`If true, disables HTTP keep-alives and will only use the connection to the server for a single HTTP request.`)

View File

@@ -2,7 +2,6 @@ package datasource
import (
"context"
"errors"
"fmt"
"io"
"net/http"
@@ -91,8 +90,15 @@ func (s *VMStorage) ApplyParams(params QuerierParams) *VMStorage {
s.extraParams = url.Values{}
}
for k, vl := range params.QueryParams {
for _, v := range vl { // custom query params are prior to default ones
s.extraParams.Set(k, v)
// custom query params are prior to default ones
if s.extraParams.Has(k) {
s.extraParams.Del(k)
}
for _, v := range vl {
// don't use .Set() instead of Del/Add since it is allowed
// for GET params to be duplicated
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4908
s.extraParams.Add(k, v)
}
}
}
@@ -126,10 +132,10 @@ func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Durati
}
// Query executes the given query and returns parsed response
func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) (Result, *http.Request, error) {
func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) ([]Metric, *http.Request, error) {
req, err := s.newRequestPOST()
if err != nil {
return Result{}, nil, err
return nil, nil, err
}
switch s.dataSourceType {
@@ -138,12 +144,12 @@ func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) (Resu
case datasourceGraphite:
s.setGraphiteReqParams(req, query, ts)
default:
return Result{}, nil, fmt.Errorf("engine not found: %q", s.dataSourceType)
return nil, nil, fmt.Errorf("engine not found: %q", s.dataSourceType)
}
resp, err := s.do(ctx, req)
if err != nil {
return Result{}, req, err
return nil, req, err
}
defer func() {
_ = resp.Body.Close()
@@ -160,24 +166,24 @@ func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) (Resu
// QueryRange executes the given query on the given time range.
// For Prometheus type see https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries
// Graphite type isn't supported.
func (s *VMStorage) QueryRange(ctx context.Context, query string, start, end time.Time) (res Result, err error) {
func (s *VMStorage) QueryRange(ctx context.Context, query string, start, end time.Time) ([]Metric, error) {
if s.dataSourceType != datasourcePrometheus {
return res, fmt.Errorf("%q is not supported for QueryRange", s.dataSourceType)
return nil, fmt.Errorf("%q is not supported for QueryRange", s.dataSourceType)
}
req, err := s.newRequestPOST()
if err != nil {
return res, err
return nil, err
}
if start.IsZero() {
return res, fmt.Errorf("start param is missing")
return nil, fmt.Errorf("start param is missing")
}
if end.IsZero() {
return res, fmt.Errorf("end param is missing")
return nil, fmt.Errorf("end param is missing")
}
s.setPrometheusRangeReqParams(req, query, start, end)
resp, err := s.do(ctx, req)
if err != nil {
return res, err
return nil, err
}
defer func() {
_ = resp.Body.Close()
@@ -190,11 +196,6 @@ func (s *VMStorage) do(ctx context.Context, req *http.Request) (*http.Response,
logger.Infof("DEBUG datasource request: executing %s request with params %q", req.Method, req.URL.RawQuery)
}
resp, err := s.c.Do(req.WithContext(ctx))
if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
// something in the middle between client and datasource might be closing
// the connection. So we do a one more attempt in hope request will succeed.
resp, err = s.c.Do(req.WithContext(ctx))
}
if err != nil {
return nil, fmt.Errorf("error getting response from %s: %w", req.URL.Redacted(), err)
}
@@ -207,7 +208,7 @@ func (s *VMStorage) do(ctx context.Context, req *http.Request) (*http.Response,
}
func (s *VMStorage) newRequestPOST() (*http.Request, error) {
req, err := http.NewRequest(http.MethodPost, s.datasourceURL, nil)
req, err := http.NewRequest("POST", s.datasourceURL, nil)
if err != nil {
return nil, err
}

View File

@@ -35,12 +35,12 @@ func (r graphiteResponse) metrics() []Metric {
return ms
}
func parseGraphiteResponse(req *http.Request, resp *http.Response) (Result, error) {
func parseGraphiteResponse(req *http.Request, resp *http.Response) ([]Metric, error) {
r := &graphiteResponse{}
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return Result{}, fmt.Errorf("error parsing graphite metrics for %s: %w", req.URL.Redacted(), err)
return nil, fmt.Errorf("error parsing graphite metrics for %s: %w", req.URL.Redacted(), err)
}
return Result{Data: r.metrics()}, nil
return r.metrics(), nil
}
const (
@@ -54,16 +54,6 @@ func (s *VMStorage) setGraphiteReqParams(r *http.Request, query string, timestam
}
r.URL.Path += graphitePath
q := r.URL.Query()
from := "-5min"
if s.lookBack > 0 {
lookBack := timestamp.Add(-s.lookBack)
from = strconv.FormatInt(lookBack.Unix(), 10)
}
q.Set("from", from)
q.Set("format", "json")
q.Set("target", query)
q.Set("until", "now")
for k, vs := range s.extraParams {
if q.Has(k) { // extraParams are prior to params in URL
q.Del(k)
@@ -72,6 +62,14 @@ func (s *VMStorage) setGraphiteReqParams(r *http.Request, query string, timestam
q.Add(k, v)
}
}
q.Set("format", "json")
q.Set("target", query)
from := "-5min"
if s.lookBack > 0 {
lookBack := timestamp.Add(-s.lookBack)
from = strconv.FormatInt(lookBack.Unix(), 10)
}
q.Set("from", from)
q.Set("until", "now")
r.URL.RawQuery = q.Encode()
}

View File

@@ -12,9 +12,6 @@ import (
var (
disablePathAppend = flag.Bool("remoteRead.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/query' path "+
"to the configured -datasource.url and -remoteRead.url")
disableStepParam = flag.Bool("datasource.disableStepParam", false, "Whether to disable adding 'step' param to the issued instant queries. "+
"This might be useful when using vmalert with datasources that do not support 'step' param for instant queries, like Google Managed Prometheus. "+
"It is not recommended to enable this flag if you use vmalert with VictoriaMetrics.")
)
type promResponse struct {
@@ -25,10 +22,6 @@ type promResponse struct {
ResultType string `json:"resultType"`
Result json.RawMessage `json:"result"`
} `json:"data"`
// Stats supported by VictoriaMetrics since v1.90
Stats struct {
SeriesFetched *string `json:"seriesFetched,omitempty"`
} `json:"stats,omitempty"`
}
type promInstant struct {
@@ -103,54 +96,39 @@ const (
rtVector, rtMatrix, rScalar = "vector", "matrix", "scalar"
)
func parsePrometheusResponse(req *http.Request, resp *http.Response) (res Result, err error) {
func parsePrometheusResponse(req *http.Request, resp *http.Response) ([]Metric, error) {
r := &promResponse{}
if err = json.NewDecoder(resp.Body).Decode(r); err != nil {
return res, fmt.Errorf("error parsing prometheus metrics for %s: %w", req.URL.Redacted(), err)
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return nil, fmt.Errorf("error parsing prometheus metrics for %s: %w", req.URL.Redacted(), err)
}
if r.Status == statusError {
return res, fmt.Errorf("response error, query: %s, errorType: %s, error: %s", req.URL.Redacted(), r.ErrorType, r.Error)
return nil, fmt.Errorf("response error, query: %s, errorType: %s, error: %s", req.URL.Redacted(), r.ErrorType, r.Error)
}
if r.Status != statusSuccess {
return res, fmt.Errorf("unknown status: %s, Expected success or error ", r.Status)
return nil, fmt.Errorf("unknown status: %s, Expected success or error ", r.Status)
}
var parseFn func() ([]Metric, error)
switch r.Data.ResultType {
case rtVector:
var pi promInstant
if err := json.Unmarshal(r.Data.Result, &pi.Result); err != nil {
return res, fmt.Errorf("umarshal err %s; \n %#v", err, string(r.Data.Result))
return nil, fmt.Errorf("umarshal err %s; \n %#v", err, string(r.Data.Result))
}
parseFn = pi.metrics
return pi.metrics()
case rtMatrix:
var pr promRange
if err := json.Unmarshal(r.Data.Result, &pr.Result); err != nil {
return res, err
return nil, err
}
parseFn = pr.metrics
return pr.metrics()
case rScalar:
var ps promScalar
if err := json.Unmarshal(r.Data.Result, &ps); err != nil {
return res, err
return nil, err
}
parseFn = ps.metrics
return ps.metrics()
default:
return res, fmt.Errorf("unknown result type %q", r.Data.ResultType)
return nil, fmt.Errorf("unknown result type %q", r.Data.ResultType)
}
ms, err := parseFn()
if err != nil {
return res, err
}
res = Result{Data: ms}
if r.Stats.SeriesFetched != nil {
intV, err := strconv.Atoi(*r.Stats.SeriesFetched)
if err != nil {
return res, fmt.Errorf("failed to convert stats.seriesFetched to int: %w", err)
}
res.SeriesFetched = &intV
}
return res, nil
}
func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string, timestamp time.Time) {
@@ -168,13 +146,13 @@ func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string,
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1232
timestamp = timestamp.Truncate(s.evaluationInterval)
}
q.Set("time", timestamp.Format(time.RFC3339))
if !*disableStepParam && s.evaluationInterval > 0 { // set step as evaluationInterval by default
q.Set("time", fmt.Sprintf("%d", timestamp.Unix()))
if s.evaluationInterval > 0 { // set step as evaluationInterval by default
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
q.Set("step", fmt.Sprintf("%ds", int(s.evaluationInterval.Seconds())))
}
if !*disableStepParam && s.queryStep > 0 { // override step with user-specified value
if s.queryStep > 0 { // override step with user-specified value
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
q.Set("step", fmt.Sprintf("%ds", int(s.queryStep.Seconds())))
@@ -191,8 +169,8 @@ func (s *VMStorage) setPrometheusRangeReqParams(r *http.Request, query string, s
r.URL.Path += "/api/v1/query_range"
}
q := r.URL.Query()
q.Add("start", start.Format(time.RFC3339))
q.Add("end", end.Format(time.RFC3339))
q.Add("start", fmt.Sprintf("%d", start.Unix()))
q.Add("end", fmt.Sprintf("%d", end.Unix()))
if s.evaluationInterval > 0 { // set step as evaluationInterval by default
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943

View File

@@ -8,6 +8,7 @@ import (
"net/url"
"reflect"
"sort"
"strconv"
"strings"
"testing"
"time"
@@ -34,6 +35,13 @@ func TestVMInstantQuery(t *testing.T) {
t.Errorf("should not be called")
})
c := -1
mux.HandleFunc("/render", func(w http.ResponseWriter, request *http.Request) {
c++
switch c {
case 8:
w.Write([]byte(`[{"target":"constantLine(10)","tags":{"name":"constantLine(10)"},"datapoints":[[10,1611758343],[10,1611758373],[10,1611758403]]}]`))
}
})
mux.HandleFunc("/api/v1/query", func(w http.ResponseWriter, r *http.Request) {
c++
if r.Method != http.MethodPost {
@@ -49,33 +57,27 @@ func TestVMInstantQuery(t *testing.T) {
if timeParam == "" {
t.Errorf("expected 'time' in query param, got nil instead")
}
if _, err := time.Parse(time.RFC3339, timeParam); err != nil {
t.Errorf("failed to parse 'time' query param %q: %s", timeParam, err)
if _, err := strconv.ParseInt(timeParam, 10, 64); err != nil {
t.Errorf("failed to parse 'time' query param: %s", err)
}
switch c {
case 0:
w.WriteHeader(500)
conn, _, _ := w.(http.Hijacker).Hijack()
_ = conn.Close()
case 1:
w.Write([]byte("[]"))
w.WriteHeader(500)
case 2:
w.Write([]byte(`{"status":"error", "errorType":"type:", "error":"some error msg"}`))
w.Write([]byte("[]"))
case 3:
w.Write([]byte(`{"status":"unknown"}`))
w.Write([]byte(`{"status":"error", "errorType":"type:", "error":"some error msg"}`))
case 4:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix"}}`))
w.Write([]byte(`{"status":"unknown"}`))
case 5:
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows","foo":"bar"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests","foo":"baz"},"value":[1583786140,"2000"]}]}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix"}}`))
case 6:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows","foo":"bar"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests","foo":"baz"},"value":[1583786140,"2000"]}]}}`))
case 7:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]},"stats":{"seriesFetched": "42"}}`))
}
})
mux.HandleFunc("/render", func(w http.ResponseWriter, request *http.Request) {
c++
switch c {
case 8:
w.Write([]byte(`[{"target":"constantLine(10)","tags":{"name":"constantLine(10)"},"datapoints":[[10,1611758343],[10,1611758373],[10,1611758403]]}]`))
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
}
})
@@ -93,27 +95,24 @@ func TestVMInstantQuery(t *testing.T) {
ts := time.Now()
expErr := func(err string) {
_, _, gotErr := pq.Query(ctx, query, ts)
if gotErr == nil {
if _, _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected %q got nil", err)
}
if !strings.Contains(gotErr.Error(), err) {
t.Fatalf("expected err %q; got %q", err, gotErr)
}
}
expErr("500") // 0
expErr("error parsing prometheus metrics") // 1
expErr("response error") // 2
expErr("unknown status") // 3
expErr("unexpected end of JSON input") // 4
expErr("connection error") // 0
expErr("invalid response status error") // 1
expErr("response body error") // 2
expErr("error status") // 3
expErr("unknown status") // 4
expErr("non-vector resultType error") // 5
res, _, err := pq.Query(ctx, query, ts) // 5 - vector
m, _, err := pq.Query(ctx, query, ts) // 6 - vector
if err != nil {
t.Fatalf("unexpected %s", err)
}
if len(res.Data) != 2 {
t.Fatalf("expected 2 metrics got %d in %+v", len(res.Data), res.Data)
if len(m) != 2 {
t.Fatalf("expected 2 metrics got %d in %+v", len(m), m)
}
expected := []Metric{
{
@@ -127,17 +126,17 @@ func TestVMInstantQuery(t *testing.T) {
Values: []float64{2000},
},
}
metricsEqual(t, res.Data, expected)
metricsEqual(t, m, expected)
res, req, err := pq.Query(ctx, query, ts) // 6 - scalar
m, req, err := pq.Query(ctx, query, ts) // 7 - scalar
if err != nil {
t.Fatalf("unexpected %s", err)
}
if req == nil {
t.Fatalf("expected request to be non-nil")
}
if len(res.Data) != 1 {
t.Fatalf("expected 1 metrics got %d in %+v", len(res.Data), res.Data)
if len(m) != 1 {
t.Fatalf("expected 1 metrics got %d in %+v", len(m), m)
}
expected = []Metric{
{
@@ -145,44 +144,18 @@ func TestVMInstantQuery(t *testing.T) {
Values: []float64{1},
},
}
if !reflect.DeepEqual(res.Data, expected) {
t.Fatalf("unexpected metric %+v want %+v", res.Data, expected)
}
if res.SeriesFetched != nil {
t.Fatalf("expected `seriesFetched` field to be nil when it is missing in datasource response; got %v instead",
res.SeriesFetched)
}
res, _, err = pq.Query(ctx, query, ts) // 7 - scalar with stats
if err != nil {
t.Fatalf("unexpected %s", err)
}
if len(res.Data) != 1 {
t.Fatalf("expected 1 metrics got %d in %+v", len(res.Data), res)
}
expected = []Metric{
{
Timestamps: []int64{1583786142},
Values: []float64{1},
},
}
if !reflect.DeepEqual(res.Data, expected) {
t.Fatalf("unexpected metric %+v want %+v", res.Data, expected)
}
if *res.SeriesFetched != 42 {
t.Fatalf("expected `seriesFetched` field to be 42; got %d instead",
*res.SeriesFetched)
if !reflect.DeepEqual(m, expected) {
t.Fatalf("unexpected metric %+v want %+v", m, expected)
}
gq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourceGraphite)})
res, _, err = gq.Query(ctx, queryRender, ts) // 8 - graphite
m, _, err = gq.Query(ctx, queryRender, ts) // 8 - graphite
if err != nil {
t.Fatalf("unexpected %s", err)
}
if len(res.Data) != 1 {
t.Fatalf("expected 1 metric got %d in %+v", len(res.Data), res.Data)
if len(m) != 1 {
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
}
exp := []Metric{
{
@@ -191,76 +164,7 @@ func TestVMInstantQuery(t *testing.T) {
Values: []float64{10},
},
}
metricsEqual(t, res.Data, exp)
}
func TestVMInstantQueryWithRetry(t *testing.T) {
mux := http.NewServeMux()
mux.HandleFunc("/", func(_ http.ResponseWriter, _ *http.Request) {
t.Errorf("should not be called")
})
c := -1
mux.HandleFunc("/api/v1/query", func(w http.ResponseWriter, r *http.Request) {
c++
if r.URL.Query().Get("query") != query {
t.Errorf("expected %s in query param, got %s", query, r.URL.Query().Get("query"))
}
switch c {
case 0:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
case 1:
conn, _, _ := w.(http.Hijacker).Hijack()
_ = conn.Close()
case 2:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "2"]}}`))
case 3:
conn, _, _ := w.(http.Hijacker).Hijack()
_ = conn.Close()
case 4:
conn, _, _ := w.(http.Hijacker).Hijack()
_ = conn.Close()
}
})
srv := httptest.NewServer(mux)
defer srv.Close()
s := NewVMStorage(srv.URL, nil, time.Minute, 0, false, srv.Client())
pq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourcePrometheus)})
expErr := func(err string) {
_, _, gotErr := pq.Query(ctx, query, time.Now())
if gotErr == nil {
t.Fatalf("expected %q got nil", err)
}
if !strings.Contains(gotErr.Error(), err) {
t.Fatalf("expected err %q; got %q", err, gotErr)
}
}
expValue := func(v float64) {
res, _, err := pq.Query(ctx, query, time.Now())
if err != nil {
t.Fatalf("unexpected %s", err)
}
m := res.Data
if len(m) != 1 {
t.Fatalf("expected 1 metrics got %d in %+v", len(m), m)
}
expected := []Metric{
{
Timestamps: []int64{1583786142},
Values: []float64{v},
},
}
if !reflect.DeepEqual(m, expected) {
t.Fatalf("unexpected metric %+v want %+v", m, expected)
}
}
expValue(1) // 0
expValue(2) // 1 - fail, 2 - retry
expErr("EOF") // 3, 4 - retries
metricsEqual(t, m, exp)
}
func metricsEqual(t *testing.T, gotM, expectedM []Metric) {
@@ -307,14 +211,14 @@ func TestVMRangeQuery(t *testing.T) {
if startTS == "" {
t.Errorf("expected 'start' in query param, got nil instead")
}
if _, err := time.Parse(time.RFC3339, startTS); err != nil {
if _, err := strconv.ParseInt(startTS, 10, 64); err != nil {
t.Errorf("failed to parse 'start' query param: %s", err)
}
endTS := r.URL.Query().Get("end")
if endTS == "" {
t.Errorf("expected 'end' in query param, got nil instead")
}
if _, err := time.Parse(time.RFC3339, endTS); err != nil {
if _, err := strconv.ParseInt(endTS, 10, 64); err != nil {
t.Errorf("failed to parse 'end' query param: %s", err)
}
step := r.URL.Query().Get("step")
@@ -346,11 +250,10 @@ func TestVMRangeQuery(t *testing.T) {
start, end := time.Now().Add(-time.Minute), time.Now()
res, err := pq.QueryRange(ctx, query, start, end)
m, err := pq.QueryRange(ctx, query, start, end)
if err != nil {
t.Fatalf("unexpected %s", err)
}
m := res.Data
if len(m) != 1 {
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
}
@@ -453,8 +356,8 @@ func TestRequestParams(t *testing.T) {
false,
&VMStorage{},
func(t *testing.T, r *http.Request) {
exp := url.Values{"query": {query}, "time": {timestamp.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
exp := fmt.Sprintf("query=%s&time=%d", query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
@@ -462,9 +365,8 @@ func TestRequestParams(t *testing.T) {
true,
&VMStorage{},
func(t *testing.T, r *http.Request) {
ts := timestamp.Format(time.RFC3339)
exp := url.Values{"query": {query}, "start": {ts}, "end": {ts}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
exp := fmt.Sprintf("end=%d&query=%s&start=%d", timestamp.Unix(), query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
@@ -494,8 +396,8 @@ func TestRequestParams(t *testing.T) {
lookBack: time.Minute,
},
func(t *testing.T, r *http.Request) {
exp := url.Values{"query": {query}, "time": {timestamp.Add(-time.Minute).Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
exp := fmt.Sprintf("query=%s&time=%d", query, timestamp.Add(-time.Minute).Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
@@ -507,8 +409,8 @@ func TestRequestParams(t *testing.T) {
func(t *testing.T, r *http.Request) {
evalInterval := 15 * time.Second
tt := timestamp.Truncate(evalInterval)
exp := url.Values{"query": {query}, "step": {evalInterval.String()}, "time": {tt.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
exp := fmt.Sprintf("query=%s&step=%v&time=%d", query, evalInterval, tt.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
@@ -522,8 +424,8 @@ func TestRequestParams(t *testing.T) {
evalInterval := 15 * time.Second
tt := timestamp.Add(-time.Minute)
tt = tt.Truncate(evalInterval)
exp := url.Values{"query": {query}, "step": {evalInterval.String()}, "time": {tt.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
exp := fmt.Sprintf("query=%s&step=%v&time=%d", query, evalInterval, tt.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
@@ -533,12 +435,8 @@ func TestRequestParams(t *testing.T) {
queryStep: time.Minute,
},
func(t *testing.T, r *http.Request) {
exp := url.Values{
"query": {query},
"step": {fmt.Sprintf("%ds", int(time.Minute.Seconds()))},
"time": {timestamp.Format(time.RFC3339)},
}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
exp := fmt.Sprintf("query=%s&step=%ds&time=%d", query, int(time.Minute.Seconds()), timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
@@ -550,8 +448,8 @@ func TestRequestParams(t *testing.T) {
func(t *testing.T, r *http.Request) {
evalInterval := 3 * time.Hour
tt := timestamp.Truncate(evalInterval)
exp := url.Values{"query": {query}, "step": {fmt.Sprintf("%ds", int(evalInterval.Seconds()))}, "time": {tt.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
exp := fmt.Sprintf("query=%s&step=%ds&time=%d", query, int(evalInterval.Seconds()), tt.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
@@ -561,8 +459,8 @@ func TestRequestParams(t *testing.T) {
extraParams: url.Values{"round_digits": {"10"}},
},
func(t *testing.T, r *http.Request) {
exp := url.Values{"query": {query}, "round_digits": {"10"}, "time": {timestamp.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
exp := fmt.Sprintf("query=%s&round_digits=10&time=%d", query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
@@ -575,14 +473,9 @@ func TestRequestParams(t *testing.T) {
},
},
func(t *testing.T, r *http.Request) {
exp := url.Values{
"query": {query},
"end": {timestamp.Format(time.RFC3339)},
"start": {timestamp.Format(time.RFC3339)},
"nocache": {"1"},
"max_lookback": {"1h"},
}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
exp := fmt.Sprintf("end=%d&max_lookback=1h&nocache=1&query=%s&start=%d",
timestamp.Unix(), query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
@@ -592,7 +485,18 @@ func TestRequestParams(t *testing.T) {
QueryParams: url.Values{"round_digits": {"2"}},
}),
func(t *testing.T, r *http.Request) {
exp := url.Values{"query": {query}, "round_digits": {"2"}, "time": {timestamp.Format(time.RFC3339)}}
exp := fmt.Sprintf("query=%s&round_digits=2&time=%d", query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"allow duplicates in query params",
false,
storage.Clone().ApplyParams(QuerierParams{
QueryParams: url.Values{"extra_labels": {"env=dev", "foo=bar"}},
}),
func(t *testing.T, r *http.Request) {
exp := url.Values{"query": {query}, "round_digits": {"10"}, "extra_labels": {"env=dev", "foo=bar"}, "time": {fmt.Sprintf("%d", timestamp.Unix())}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
@@ -611,20 +515,6 @@ func TestRequestParams(t *testing.T) {
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"graphite extra params allows to override from",
false,
&VMStorage{
dataSourceType: datasourceGraphite,
extraParams: url.Values{
"from": {"-10m"},
},
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("format=json&from=-10m&target=%s&until=now", query)
checkEqualString(t, exp, r.URL.RawQuery)
},
},
}
for _, tc := range testCases {
@@ -649,7 +539,7 @@ func TestRequestParams(t *testing.T) {
}
func TestHeaders(t *testing.T) {
testCases := []struct {
var testCases = []struct {
name string
vmFn func() *VMStorage
checkFn func(t *testing.T, r *http.Request)
@@ -714,8 +604,7 @@ func TestHeaders(t *testing.T) {
authCfg: cfg,
extraHeaders: []keyValue{
{key: "Authorization", value: "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="},
},
}
}}
},
checkFn: func(t *testing.T, r *http.Request) {
u, p, _ := r.BasicAuth()

View File

@@ -2,7 +2,6 @@ package main
import (
"context"
"errors"
"fmt"
"hash/fnv"
"net/url"
@@ -36,19 +35,15 @@ type Group struct {
Checksum string
LastEvaluation time.Time
Labels map[string]string
Params url.Values
Headers map[string]string
NotifierHeaders map[string]string
Labels map[string]string
Params url.Values
Headers map[string]string
doneCh chan struct{}
finishedCh chan struct{}
// channel accepts new Group obj
// which supposed to update current group
updateCh chan *Group
// evalCancel stores the cancel fn for interrupting
// rules evaluation. Used on groups update() and close().
evalCancel context.CancelFunc
metrics *groupMetrics
}
@@ -94,17 +89,16 @@ func mergeLabels(groupName, ruleName string, set1, set2 map[string]string) map[s
func newGroup(cfg config.Group, qb datasource.QuerierBuilder, defaultInterval time.Duration, labels map[string]string) *Group {
g := &Group{
Type: cfg.Type,
Name: cfg.Name,
File: cfg.File,
Interval: cfg.Interval.Duration(),
Limit: cfg.Limit,
Concurrency: cfg.Concurrency,
Checksum: cfg.Checksum,
Params: cfg.Params,
Headers: make(map[string]string),
NotifierHeaders: make(map[string]string),
Labels: cfg.Labels,
Type: cfg.Type,
Name: cfg.Name,
File: cfg.File,
Interval: cfg.Interval.Duration(),
Limit: cfg.Limit,
Concurrency: cfg.Concurrency,
Checksum: cfg.Checksum,
Params: cfg.Params,
Headers: make(map[string]string),
Labels: cfg.Labels,
doneCh: make(chan struct{}),
finishedCh: make(chan struct{}),
@@ -119,9 +113,6 @@ func newGroup(cfg config.Group, qb datasource.QuerierBuilder, defaultInterval ti
for _, h := range cfg.Headers {
g.Headers[h.Key] = h.Value
}
for _, h := range cfg.NotifierHeaders {
g.NotifierHeaders[h.Key] = h.Value
}
g.metrics = newGroupMetrics(g)
rules := make([]Rule, len(cfg.Rules))
for i, r := range cfg.Rules {
@@ -235,7 +226,6 @@ func (g *Group) updateWith(newGroup *Group) error {
g.Concurrency = newGroup.Concurrency
g.Params = newGroup.Params
g.Headers = newGroup.Headers
g.NotifierHeaders = newGroup.NotifierHeaders
g.Labels = newGroup.Labels
g.Limit = newGroup.Limit
g.Checksum = newGroup.Checksum
@@ -243,24 +233,11 @@ func (g *Group) updateWith(newGroup *Group) error {
return nil
}
// interruptEval interrupts in-flight rules evaluations
// within the group. It is expected that g.evalCancel
// will be repopulated after the call.
func (g *Group) interruptEval() {
g.mu.RLock()
defer g.mu.RUnlock()
if g.evalCancel != nil {
g.evalCancel()
}
}
func (g *Group) close() {
if g.doneCh == nil {
return
}
close(g.doneCh)
g.interruptEval()
<-g.finishedCh
g.metrics.iterationDuration.Unregister()
@@ -277,38 +254,16 @@ var skipRandSleepOnGroupStart bool
func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *remotewrite.Client, rr datasource.QuerierBuilder) {
defer func() { close(g.finishedCh) }()
// Spread group rules evaluation over time in order to reduce load on VictoriaMetrics.
if !skipRandSleepOnGroupStart {
randSleep := uint64(float64(g.Interval) * (float64(g.ID()) / (1 << 64)))
sleepOffset := uint64(time.Now().UnixNano()) % uint64(g.Interval)
if randSleep < sleepOffset {
randSleep += uint64(g.Interval)
}
randSleep -= sleepOffset
sleepTimer := time.NewTimer(time.Duration(randSleep))
select {
case <-ctx.Done():
sleepTimer.Stop()
return
case <-g.doneCh:
sleepTimer.Stop()
return
case <-sleepTimer.C:
}
}
e := &executor{
rw: rw,
notifiers: nts,
notifierHeaders: g.NotifierHeaders,
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label),
}
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label)}
evalTS := time.Now()
logger.Infof("group %q started; interval=%v; concurrency=%d", g.Name, g.Interval, g.Concurrency)
eval := func(ctx context.Context, ts time.Time) {
eval := func(ts time.Time) {
g.metrics.iterationTotal.Inc()
start := time.Now()
@@ -330,13 +285,7 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
g.LastEvaluation = start
}
evalCtx, cancel := context.WithCancel(ctx)
g.mu.Lock()
g.evalCancel = cancel
g.mu.Unlock()
defer g.evalCancel()
eval(evalCtx, evalTS)
eval(evalTS)
t := time.NewTicker(g.Interval)
defer t.Stop()
@@ -360,14 +309,6 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
return
case ng := <-g.updateCh:
g.mu.Lock()
// it is expected that g.evalCancel will be evoked
// somewhere else to unblock group from the rules evaluation.
// we recreate the evalCtx and g.evalCancel, so it can
// be called again.
evalCtx, cancel = context.WithCancel(ctx)
g.evalCancel = cancel
err := g.updateWith(ng)
if err != nil {
logger.Errorf("group %q: failed to update: %s", g.Name, err)
@@ -378,8 +319,6 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
// ensure that staleness is tracked or existing rules only
e.purgeStaleSeries(g.Rules)
e.notifierHeaders = g.NotifierHeaders
if g.Interval != ng.Interval {
g.Interval = ng.Interval
t.Stop()
@@ -400,7 +339,7 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
}
evalTS = evalTS.Add((missed + 1) * g.Interval)
eval(evalCtx, evalTS)
eval(evalTS)
}
}
}
@@ -419,10 +358,8 @@ func getResolveDuration(groupInterval, delta, maxDuration time.Duration) time.Du
}
type executor struct {
notifiers func() []notifier.Notifier
notifierHeaders map[string]string
rw *remotewrite.Client
notifiers func() []notifier.Notifier
rw *remotewrite.Client
previouslySentSeriesToRWMu sync.Mutex
// previouslySentSeriesToRW stores series sent to RW on previous iteration
@@ -476,11 +413,6 @@ func (e *executor) exec(ctx context.Context, rule Rule, ts time.Time, resolveDur
tss, err := rule.Exec(ctx, ts, limit)
if err != nil {
if errors.Is(err, context.Canceled) {
// the context can be cancelled on graceful shutdown
// or on group update. So no need to handle the error as usual.
return nil
}
execErrors.Inc()
return fmt.Errorf("rule %q: failed to execute: %w", rule, err)
}
@@ -522,7 +454,7 @@ func (e *executor) exec(ctx context.Context, rule Rule, ts time.Time, resolveDur
for _, nt := range e.notifiers() {
wg.Add(1)
go func(nt notifier.Notifier) {
if err := nt.Send(ctx, alerts, e.notifierHeaders); err != nil {
if err := nt.Send(ctx, alerts); err != nil {
errGr.Add(fmt.Errorf("rule %q: failed to send alerts to addr %q: %w", rule, nt.Addr(), err))
}
wg.Done()

Some files were not shown because too many files have changed in this diff Show More