Compare commits

..

102 Commits

Author SHA1 Message Date
f41gh7
a892b588ef Merge remote-tracking branch 'origin/cluster' into series-update-api 2023-12-05 18:47:15 +03:00
f41gh7
07cb5be348 app/vmselect: fixes and issue with slice reuse 2023-12-05 18:46:02 +03:00
Aliaksandr Valialkin
559e4db512 Revert "add datadog /api/v2/series and /api/beta/sketches support (#5094)"
This reverts commit d6b4c8e4ef.

Reason for revert: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094#issuecomment-1839789080
2023-12-05 02:30:40 +02:00
Aliaksandr Valialkin
61db92cdc7 Revert "lib/protoparser/datadog: follow-up after 543f218fe96574b9b2189c8350bb09afa349e3bb"
This reverts commit 73d18fbc7a.

Reason for revert: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094#issuecomment-1839789080
2023-12-05 02:29:00 +02:00
Aliaksandr Valialkin
bf187b2dc9 app/vmagent: add -enableMultitenantHandlers command-line flag
This flag allows converting tenant id to (vm_account_id, vm_project_id) labels.
this flag deprecates `-remoteWrite.multitenantURL` command-line flag,
because `-enableMultitenantHandlers` is easier to use and combine with multitenant url
at vminsert - https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels

See https://docs.victoriametrics.com/vmagent.html#multitenancy

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1505
2023-12-05 01:35:59 +02:00
Aliaksandr Valialkin
5388d7ba12 docs/vmagent.md: mention that it may be useful to disable on-disk data persistence when reading data from Kafka or Google PubSub
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110
2023-12-04 23:13:23 +02:00
Aliaksandr Valialkin
85fcefaa34 app/vmagent: code cleanup for Kafka and Google PubSub consumers / producers
- Add links to relevant docs into descriptions for every -kafka.* and -gcp.pubsub.* command-line flags.
- Wait until message processing goroutines are stopped before returning from gcppubsub.Stop().
- Prevent from multiple calls to Init() without Stop().
- Drop message if tenantID cannot be parsed properly.
- Take into account tenantID for all the supported message formats.
- Support gzip-compressed messages for graphite format.
- Use exponential backoff sleep when the message cannot be pushed to remote storage systems
  because of disabled on-disk persistence - https://docs.victoriametrics.com/vmagent.html#disabling-on-disk-persistence
- Unblock from sleep as soon as Stop() is called. Previously the sleep could take up to 2 seconds after Stop() is called.
- Remove unused globalCtx and initContext from app/vmagent/remotewrite/gcppubsub
- Mention Google PubSub support at docs/enterprise.md
- Make Google PubSub docs more clear at docs/vmagent.md

This is a follow-up for commits 115245924a5f096c5a3383d6cc8e8b6fbd421984
and e6eab781ce42285a6a1750dc01eba6801dd35516 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/717
Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/713
2023-12-04 22:51:04 +02:00
Dmytro Kozlov
6770bad207 app/vmalert: expose /vmalert/api/v1/rule and /api/v1/rule API which returns rule status in JSON format (#5397)
* app/vmalert: expose `/vmalert/api/v1/rule` and `/api/v1/rule` API which returns rule status in JSON format

* app/vmalert: hide updates if query param not set

* app/vmalert: fix panic (recursion call)

* app/vmalert: add needed group name and file name

* app/vmalert: fix comment, update behavior

* app/vmalert: fix description

* app/vmalert: simplify API for /api/v1/rule

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmalert: simplify API for /api/v1/rule

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmalert: simplify API for /api/v1/rule

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmalert: simplify API for /api/v1/rule

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmalert: simplify API for /api/v1/rule

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-12-04 22:49:39 +02:00
xzchaoo
758922e656 docs: fix the typo in vmctl.md (#5419)
fix the typo in vmctl.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-12-04 18:17:53 +02:00
Aliaksandr Valialkin
a3d0bbfcda deployment/docker: update backe Docker image from alpine 3.18.4 to 3.18.5
See https://www.alpinelinux.org/posts/Alpine-3.15.11-3.16.8-3.17.6-3.18.5-released.html
2023-12-04 18:17:07 +02:00
hagen1778
fd8731ed90 docs: fix formatting for datadog example
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-04 18:16:32 +02:00
Aliaksandr Valialkin
d868155751 app/vmselect: do not limit concurrency for static and fast queries
Previously concurrency for static and fast queries was limited with the -search.maxConcurrentRequests
command-line flag. This could complicate identifying heavy queries via `vmui` at `Top queries` and `Active queries` pages,
since `vmui` and these pages couldn't be opened on overloaded vmselect.

Thanks to @f41gh7 for the idea.
2023-12-04 18:14:29 +02:00
Aliaksandr Valialkin
b6d6a3a530 lib/promscrape: show dropped targets because of sharding at /service-discovery page
Previously the /service-discovery page didn't show targets dropped because of sharding
( https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets ).

Show also the reason why every target is dropped at /service-discovery page.
This should improve debuging why particular targets are dropped.

While at it, do not remove dropped targets from the list at /service-discovery page
until the total number of targets exceeds the limit passed to -promscrape.maxDroppedTargets .
Previously the list was cleaned up every 10 minutes from the entries, which weren't updated
for the last minute. This could complicate debugging of dropped targets.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389
2023-12-04 17:42:46 +02:00
Aliaksandr Valialkin
2f4dc2aff1 lib/backup: consistently use path.Join() when constructing paths for s3, gs and azblob
E.g. replace `fs.Dir + filePath` with `path.Join(fs.Dir, filePath)`

The fs.Dir is guaranteed to end with slash - see Init() functions.
The filePath may start with slash. If it starts with slash, then `fs.Dir + filePath` constructs
an incorrect path with double slashes.
path.Join() properly substitutes duplicate slashes with a single slash in this case.

While at it, also substitute incorrect usage of filepath.Join() with path.Join()
for constructing paths to object storage systems, which expect forward slashes in paths.
filepath.Join() substittues forward slashes with backslashes on Windows, so this may break
creating or managing backups from Windows.

This is a follow-up for 0399367be602b577baf6a872ca81bf0f99ba401b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/719
2023-12-04 17:25:41 +02:00
Zakhar Bessarab
2992682f6c lib/backup/s3remote: remove prev object versions for recursive delete (#719)
* lib/backup/s3remote: remove prev object versions for recursive delete

- fix error caused by sending empty objects list to be deleted. This was possible in case old versions of objects where deleted, but root-level entries where still available. This caused paginator to return an empty page which wasn't skipped.

- delete previous versions of objects recursively for S3 remote

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs/changelog: add vmbackupmanager fix entry

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/backup/s3remote: unify path construction for S3 objects

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-12-04 17:01:09 +02:00
Aliaksandr Valialkin
9f352f1b93 app/vminsert/newrelic: simplify the code a bit after 1fb8dc0092
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5416
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5421
2023-12-04 16:26:52 +02:00
Dmytro Kozlov
1fb8dc0092 app/vminsert: fix newrelic ingestion in cluster version (#5421)
Properly pass tenant ID to ingested data from newrelic.
Before tenant ID was mistakenly skipped.
2023-12-04 09:38:32 +01:00
hagen1778
c27968e79c docs: follow-up after 760a530305
760a530305
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit e1359c904c)
2023-12-01 14:02:00 +01:00
Artem Navoiev
fbd04e3437 docs: vmagent info about p queue disk size (#5399)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit 760a530305)
2023-12-01 14:01:40 +01:00
Hui Wang
3507e1e27b vmalert-tool: fix alert_rule_test case when eval_time is not multiple of evaluation_interval (#5387)
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 1911320c86)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-01 14:00:58 +01:00
Aliaksandr Valialkin
d1445bc0c8 all: expose additional metrics for simplifying debugging of VictoriaMetrics components
Updates https://github.com/VictoriaMetrics/metrics/issues/54

(cherry picked from commit 8eddccfbb4)
2023-12-01 14:00:28 +01:00
Aliaksandr Valialkin
e017176f45 docs/vmauth.md: add typical use cases
(cherry picked from commit 837f6f0975)
2023-12-01 14:00:23 +01:00
Aliaksandr Valialkin
f0215afee3 lib/promrelabel: add keep_if_contains and drop_if_contains relabeling actions
(cherry picked from commit ac65c6b178)
2023-12-01 14:00:20 +01:00
Nikolay
9505d48070 lib/streamaggr: properly reference slice with labels (#5406)
* lib/streamaggr: properly reference slice with labels
by limiting slice capacity. It must fix issues with slice modification, in case of append new slice will be allocated, instead of modifying refrenced slice
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5402

* Reduce memory allocations when output_relabel_configs adds new labels to output samples

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
(cherry picked from commit 41f7940f97)
2023-12-01 14:00:18 +01:00
Github Actions
2a0717c8db Automatic update operator docs from VictoriaMetrics/operator@f628bee (#5407)
(cherry picked from commit 7ca783dee9)
2023-12-01 14:00:11 +01:00
Andrii Chubatiuk
4ed7da8c58 docs: sync mistakenly deleted docs from 543f218fe9
543f218fe9
Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 48228031e4)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-01 13:59:51 +01:00
hagen1778
73d18fbc7a lib/protoparser/datadog: follow-up after 543f218fe9
* prevent /api/v1 from panic on parsing rows
* add tests for Extract function for v1 and v2 api's
* separate request types in different pools to prevent different objects mixing
* add changelog line

543f218fe9
Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 98d0f81f21)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-01 13:56:23 +01:00
Andrii Chubatiuk
d6b4c8e4ef add datadog /api/v2/series and /api/beta/sketches support (#5094)
Co-authored-by: Andrew Chubatiuk <andrew.chubatiuk@motional.com>
Co-authored-by: Nikolay <https://github.com/f41gh7>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

(cherry picked from commit 543f218fe9)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-01 13:55:32 +01:00
hagen1778
1c5120ce67 docs: add 3rd party article "Observe and record performance of Spark jobs with Victoria Metrics"
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit ec5b72c879)
2023-11-29 16:43:19 +01:00
Ivan Yatskevich
bfaca07774 docs/dns-srv-typo-fix: replace dns+src with dns+srv (#5396) 2023-11-28 17:46:00 +02:00
hagen1778
1e557b73a5 docs: mention contributor of PR 5368
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 5424632ba3)
2023-11-28 12:49:49 +01:00
luckyxiaoqiang
8ce82c5400 app/vmselect/promql: add day_of_year() function (#5368)
Co-authored-by: dingxiaoqiang <dingxiaoqiang@bytedance.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit d7897e0d70)
2023-11-28 12:49:48 +01:00
hagen1778
1ca672f3ac docs: mention loadbalancer in Monitoring chapter
Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 8a0bb4bf17)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-11-28 12:49:48 +01:00
hagen1778
e443c20e92 docs: fix indentation for /api/v1/labels
The indentation didn't change in b51d16e74c

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d024fcf37f)
2023-11-28 09:45:18 +01:00
hagen1778
46f5aeb7ab docs: clarify steps for rollup cache purge for vmselects
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 5e38dde18d)
2023-11-28 09:45:18 +01:00
hagen1778
562a2ddffc docs: fix link for cache reset on vmselects
Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit f42ec79958)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-11-28 09:45:18 +01:00
Github Actions
4038a30d64 Automatic update operator docs from VictoriaMetrics/operator@e737115 (#5394) 2023-11-27 12:02:33 +02:00
Aliaksandr Valialkin
f2346a79b6 docs/Cluster-VictoriaMetrics.md: document that multitenancy via labels is applied to data ingested via non-http protocols
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3009
2023-11-27 11:18:15 +02:00
Aliaksandr Valialkin
5ccc22d66d app/vmagent: properly increase vmagent_remotewrite_samples_dropped_total when scraped samples cannot be sent to the remote storage and -remoteWrite.dropSamplesOnOverload is set
This is a follow-up for 5034aa0773
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110
2023-11-25 14:44:42 +02:00
Aliaksandr Valialkin
e36f29080d docs/vmagent.md: remove duplicate chapter for Google PubSub integration
The previous chapter has been added in 752f89f13f
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5358
2023-11-25 14:44:41 +02:00
Aliaksandr Valialkin
2f14394335 app/vmagent: follow-up for 090cb2c9de
- Add Try* prefix to functions, which return bool result in order to improve readability and reduce the probability of missing check
  for the result returned from these functions.

- Call the adjustSampleValues() only once on input samples. Previously it was called on every attempt to flush data to peristent queue.

- Properly restore the initial state of WriteRequest passed to tryPushWriteRequest() before returning from this function
  after unsuccessful push to persistent queue. Previously a part of WriteRequest samples may be lost in such case.

- Add -remoteWrite.dropSamplesOnOverload command-line flag, which can be used for dropping incoming samples instead
  of returning 429 Too Many Requests error to the client when -remoteWrite.disableOnDiskQueue is set and the remote storage
  cannot keep up with the data ingestion rate.

- Add vmagent_remotewrite_samples_dropped_total metric, which counts the number of dropped samples.

- Add vmagent_remotewrite_push_failures_total metric, which counts the number of unsuccessful attempts to push
  data to persistent queue when -remoteWrite.disableOnDiskQueue is set.

- Remove vmagent_remotewrite_aggregation_metrics_dropped_total and vm_promscrape_push_samples_dropped_total metrics,
  because they are replaced with vmagent_remotewrite_samples_dropped_total metric.

- Update 'Disabling on-disk persistence' docs at docs/vmagent.md

- Update stale comments in the code

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5088
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110
2023-11-25 12:13:39 +02:00
Nikolay
25ac2aac31 app/vmagent: allow to disabled on-disk persistence (#5088)
* app/vmagent: allow to disabled on-disk queue
Previously, it wasn't possible to build data processing pipeline with a
chain of vmagents. In case when remoteWrite for the last vmagent in the
chain wasn't accessible, it persisted data only when it has enough disk
capacity. If disk queue is full, it started to silently drop ingested
metrics.

New flags allows to disable on-disk persistent and immediatly return an
error if remoteWrite is not accessible anymore. It blocks any writes and
notify client, that data ingestion isn't possible.

Main use case for this feature - use external queue such as kafka for
data persistence.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110

* adds test, updates readme

* apply review suggestions

* update docs for vmagent

* makes linter happy

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-11-25 12:12:29 +02:00
Aliaksandr Valialkin
3d22f98344 vendor: update github.com/VictoriaMetrics/fastcache from v1.12.1 to v1.12.2
This should help reducing GC overhead growth at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5379
2023-11-24 13:39:32 +02:00
Aliaksandr Valialkin
3674232128 docs: make more visible that the maximum JSON line length, which is accepted by /api/v1/import, is limited by -import.maxLineLen command-line flag value
This is a follow-up for 0cf55ded34

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5364
2023-11-24 13:14:40 +02:00
Roman Khavronenko
26242f526e lib/protoparser: decrease import.maxLineLen from 100MB to 10MB (#5364)
Tests showed that importing a single line with 70MB size takes 5.3GiB
RSS memory for VictoriaMetrics single-node.
In the scenario when user exports and imports data from one VM to another,
it could possibly lead to OOM exception for destination VM.

Importing a single line with 16MB size taks 1.3GiB RSS memory.
Hence, the limit for `import.maxLineLen` was decreased from 100MB to 10MB
to improve reliability of VictoriaMetrics during imports.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-11-24 13:13:33 +02:00
Aliaksandr Valialkin
01bc62eff9 docs/CHANGELOG.md: document Google PubSub support at vmagent (see 752f89f13f ) 2023-11-23 21:14:04 +02:00
Nikolay
892889823a apply review comments (#5358) 2023-11-23 21:13:57 +02:00
Github Actions
3a96830ed6 Automatic update operator docs from VictoriaMetrics/operator@fec3f9d (#5381) 2023-11-23 21:06:27 +02:00
Aliaksandr Valialkin
a906a7d85c app/vmagent/remotewrite: do not drop persistent queues when -remoteWrite.multitenantURL is set
It is unsafe to drop persistent queues when -remoteWrite.multitenantURL command-line flag is set,
since these queues are created on demand when a new sample for the given tenant is pushed
to the remote storage.

This addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5357
The issue has been appeared in the commit f3a51e8b1d
when implementing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014
2023-11-23 20:43:21 +02:00
Github Actions
28df725a37 Automatic update operator docs from VictoriaMetrics/operator@45bfa36 (#5373) 2023-11-22 20:25:46 +02:00
Aliaksandr Valialkin
10b4dfbbf9 app/vmalert/notifier: remove backticks from the description for -notifier.blackhole command-line flag
Backticks in flag description are automatically converted to flag type. See https://pkg.go.dev/flag#PrintDefaults

This is a follow-up for 20025d4fd6 and 25317b4e70
2023-11-22 20:17:45 +02:00
Aliaksandr Valialkin
db6dadf1f7 docs: convert png images to webp in all the docs except of docs/operator/*
This reduces the size of docs/* folder from 33MB to 18MB

Images inside docs/operator/* must be converted at the https://github.com/VictoriaMetrics/operator/tree/master/docs
and then the updated images must be automatically propagated to the docs/operator/*

This is a follow-up for d3f919df3e

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5206
2023-11-22 19:29:47 +02:00
Github Actions
bf934482c5 Automatic update operator docs from VictoriaMetrics/operator@b96131b (#5371) 2023-11-22 19:23:15 +02:00
Aliaksandr Valialkin
0ccc1aca0a deployment/docker: remove built binaries at bin folder after creating docker image from them at make publish-via-docker 2023-11-21 14:33:50 +02:00
Aliaksandr Valialkin
fc40ebba7a .github/workflows/codeql-analysis.yml: cache Go artifacts 2023-11-21 13:05:15 +02:00
Aliaksandr Valialkin
2ebd5cbb53 .github/workflows/main.yml: ignore changes inside dashboards and deployment/**.yml
The dashaboards/ and deployment/**.yml do not contain files, which may change main workflow results,
so it is better to ignore them.
2023-11-21 12:58:43 +02:00
hagen1778
ae6152be5f lib/storage: fix typo
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-11-21 12:22:49 +02:00
Aliaksandr Valialkin
d8aceda5fd .github/workflows: run build and test jobs in parallel in order to speed up the workflow run 2023-11-21 12:20:31 +02:00
hagen1778
91e365acb6 lib/storage: fix typo
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-11-21 12:10:34 +02:00
Aliaksandr Valialkin
4f876ada0e .github/workflows: add Go version to Go artifacts cache key
When Go version changes, artifacts for the previous Go version may becomes useless,
so there is a little sense in re-using them.
2023-11-21 12:02:33 +02:00
Aliaksandr Valialkin
9acd591c8e .github/workflows: take into account Makefile contents when generating cache key for Go build aftifacts
The cache key must change when the corresponding 'make ...' command changes inside Makefile.
2023-11-21 11:51:56 +02:00
Aliaksandr Valialkin
b71388c1a8 .github/workflows: use stable Go release - it should always point to the latest stable release
This eliminates the need to update .github/workflows/* files whenever new Go stable release is out,
like in the 2db1a664e1 .

See https://github.com/actions/setup-go#using-stableoldstable-aliases
2023-11-21 11:39:04 +02:00
Aliaksandr Valialkin
3cb33196e6 .github/workflows/main.yml: try improving caching for Go artifacts
The default caching for Go artifacts from actions/setup-go@v4 uses the hash of go.sum file
as a cache key - see https://github.com/actions/cache/blob/main/examples.md#go---modules .
This isn't enough for VictoriaMetrics case, since different makefile actions build different the Go artifacts,
which need to be cached. So embed the action name in the cache key.
2023-11-21 01:12:17 +02:00
Aliaksandr Valialkin
95b076ba99 .github/workflows/codeql-analysis.yml: remove check-latest and cache inputs for actions/setup-go
- The `cache: true` is no longer needed starting from actions/setup-go@v4 - see https://github.com/actions/setup-go#v4
- The `check-latest: true` may slow down the action, so it is better to disalbe it - see https://github.com/actions/setup-go#check-latest-version
2023-11-21 01:12:00 +02:00
Aliaksandr Valialkin
c285fca256 Makefile: allow specifying the needed concurrency for make via MAKE_CONCURRENCY env var 2023-11-21 01:11:36 +02:00
Aliaksandr Valialkin
6b75523468 Makefile: speedup release, publish and crossbuild rules by using parallel make 2023-11-20 23:07:11 +02:00
Aliaksandr Valialkin
46e58f3669 app/vmagent/README.md: sync with docs/vmagent.md after cbe4a5c251 , so make docs-sync properly works 2023-11-20 22:43:28 +02:00
Nikolay
c06044ef52 app/vmagent: adds google pubsub as remoteWrite dst and ingest consumer (#713)
it allows to push and receive metrics from google pubsub queue
Adds needed documentation and examples for it
2023-11-20 22:43:26 +02:00
hagen1778
34b7783461 docs: calrify version when vminsertConnsShutdownDuration was added
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-11-20 17:14:57 +01:00
hagen1778
0dbbffbdd5 docs: typo after 3f5a41e35e
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 20025d4fd6)
2023-11-20 17:06:21 +01:00
hagen1778
6575d646c0 docs: follow-up after d3f919df3e
d3f919df3e
Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 3ffa8975d4)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-11-20 11:54:28 +01:00
Dmytro Kozlov
2362af3b0c docs/managed-victoriametrics: use webp format to reduce image size (#5206)
(cherry picked from commit d3f919df3e)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-11-20 11:53:06 +01:00
Khanh Quoc Le
03e5ebaea9 Add _stream fields log (#5068) 2023-11-17 16:04:13 +01:00
Hui Wang
91379331eb lib/protoparser/promremotewrite: fall back to zstd decoding if Snappy-decoding fails (#5344)
This case is possible after the following steps:
1. vmagent successfully performed handshake with the -remoteWrite.url and the remote storage supports zstd-compressed data.
2. remote storage became unavailable or slow to ingest data, vmagent compressed the collected data into blocks with zstd and puts these blocks to persistent queue on disk.
3. vmagent restarts and the remote storage is unavailable during the handshake, then vmagent falls back to Snappy compression.
4. vmagent starts sending zstd-compressed data from persistent queue to the remote storage, while falsely advertizing it sends Snappy-compressed data.
5. The remote storage receives zstd-compressed data and fails unpacking it with Snappy.

The solution is the same as 12cd32fd75, just fall back to zstd decompression if Snappy decompression fails.
2023-11-17 15:53:18 +01:00
Aliaksandr Valialkin
39c56e8f65 docs/enterprise.md: update VictoriaMetrics version in examples from v1.95.0 to v1.95.1
See https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.95.1
2023-11-17 15:47:24 +01:00
Github Actions
2e6c404cdd Automatic update operator docs from VictoriaMetrics/operator@388745c (#5340) 2023-11-17 15:47:22 +01:00
Aliaksandr Valialkin
1149a98873 deployment: update VictoriaMetrics docker image tag from v1.95.0 to v1.95.1
See https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.95.1
2023-11-17 15:43:43 +01:00
Aliaksandr Valialkin
5492ccf0d5 app/vmselect/promql: reduce the number of memory allocations inside copyTimeseriesShallow()
Previously the number of memory allocations inside copyTimeseriesShallow() was equal to 1+len(tss)
Reduce this number to 2 by pre-allocating a slice of timeseries structs with len(tss) length.
2023-11-17 15:41:38 +01:00
luckyxiaoqiang
3419c14b35 docs/metricsql: remove duplicate sentence (#5349) 2023-11-17 15:41:21 +01:00
Artem Navoiev
acfaf5c352 gh action bump pagefind version to 1.0.4
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-11-17 15:40:54 +01:00
Aliaksandr Valialkin
f2eaf4d4aa deployment/docker: update Alpine from 3.18.3 to 3.18.4
See https://alpinelinux.org/posts/Alpine-3.18.4-released.html

(cherry picked from commit 5c28923c11)
2023-10-23 15:18:20 +02:00
f41gh7
f5e663c00c app/vmselect: adds traces for series update API 2023-10-03 14:57:03 +02:00
f41gh7
07394fb847 Merge remote-tracking branch 'origin/cluster' into series-update-api 2023-10-03 14:48:49 +02:00
Aliaksandr Valialkin
9e2d77ea62 app/vmselect/netstorage: prevent from deadlock at unpackUpdateAddrs() for time series with the updated data 2023-08-23 15:36:13 +02:00
f41gh7
23e53bdb80 app/vmselect: moves series update logic to vmselect
it should simplify migration and keep good performance for vmstorage component
2023-08-23 15:36:13 +02:00
f41gh7
1ab593f807 app/vminsert: fixes merge conflicts 2023-08-23 15:36:13 +02:00
f41gh7
fbfd7415da cluster: adds /api/v1/update/series API
It allows to modify exist series values.
User must write modified series into vminsert API
/insert/0/prometheus/api/v1/update/series

vminsert will generate id and add it to the series as __generation_id
label.

Modified series merged at vmselect side.
Only last series modify request at given time range will be applied.
Modification request could be exported with the following API request:
`curl localhost:8481/select/0/prometheus/api/v1/export -g -d
'reduce_mem_usage=true' -d 'match[]={__generation_id!=""}'`

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/844

adds guide

allow single datapoint modification

vmselectapi: prevent MetricBlockRef corruption

Modofying of MetricName byte slice may result into MetricBlockRef
corruption, since `ctx.mb.MetricName` is a pointer to
`MetricBlockRef.MetricName`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Revert "vmselectapi: prevent MetricBlockRef corruption"

This reverts commit cf36bfa1895885fcc7dc2673248ee56c78180ea0.

app/vmstorage/servers: properly copy MetricName into MetricBlock inside blockIterator.NextBlock

This should fix the issue at cf36bfa189

(cherry picked from commit 916f1ab86c)

app/vmselect: correctly update single datapoint at merge

app/vmselect: adds mutex for series update map
previously it was sync api, but function signature was changed for performance optimizations
2023-08-23 15:36:12 +02:00
Dmytro Kozlov
54a67d439c docs: cut 1.93.1-lts in changelog
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-23 14:14:15 +02:00
Nikolay
8bc42baf19 lib/storage: properly caclucate nextRotationTimestamp (#4874)
cause of typo unix millis was used instead of unix for current timestamp
calculation
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4873

(cherry picked from commit c5aac34b68)
2023-08-23 14:14:15 +02:00
Aliaksandr Valialkin
90f4581a0e docs/stream-aggregation.md: typo fix after 54f522ac25 2023-08-17 15:28:37 +02:00
Aliaksandr Valialkin
e69580fe97 docs/stream-aggregation.md: clarify the usage of -remoteWrite.label after the fix at a27c2f3773
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247
2023-08-17 15:19:52 +02:00
Aliaksandr Valialkin
be5673c39d app/vmagent/remotewrite: follow-up after a27c2f3773
- Fix Prometheus-compatible naming after applying the relabeling if -usePromCompatibleNaming command-line flag is set.
  This should prevent from possible Prometheus-incompatible metric names and label names generated by the relabeling.
- Do not return anything from relabelCtx.appendExtraLabels() function, since it cannot change the number of time series
  passed to it. Append labels for the passed time series in-place.
- Remove promrelabel.FinalizeLabels() call after adding extra labels to time series, since this call has been already
  made at relabelCtx.applyRelabeling(). It is user's responsibility if he passes labels with double underscore prefixes
  to -remoteWrite.label.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247
2023-08-17 14:47:28 +02:00
Alexander Marshalov
4d6875d81b vmagent: fixed premature release of the context (after #4247 / #4824) (#4849)
Follow-up after a27c2f3773

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247

Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-08-17 14:47:28 +02:00
Alexander Marshalov
e5e50504db fixed applying remoteWrite.label for pushed metrics (#4247) (#4824)
vmagent: properly add extra labels before sending data to remote storage

labels from `remoteWrite.label` are now added to sent metrics just before they
 are pushed to `remoteWrite.url` after all relabelings, including stream aggregation relabelings (#4247)

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247

Signed-off-by: Alexander Marshalov <_@marshalov.org>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-08-17 14:47:28 +02:00
Aliaksandr Valialkin
cdf2eaf688 lib/envflag: do not allow unsupported form for boolean command-line flags in the form -boolFlag value
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4845
2023-08-17 14:15:48 +02:00
Aliaksandr Valialkin
eca318cc65 docs/CHANGELOG.md: mention that this is v1.93.x LTS release line 2023-08-17 13:57:41 +02:00
Aliaksandr Valialkin
5768ac0607 lib/promrelabel: stop emitting DEBUG log lines when parsing if expressions
These lines were accidentally left in the commit 62651570bb

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4635
2023-08-17 13:57:41 +02:00
Aliaksandr Valialkin
91b1700194 lib/promrelabel: properly replace : char with _ in metric names when -usePromCompatibleNaming command-line flag is set
This addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113#issuecomment-1275077071 comment from @johnseekins
2023-08-17 13:52:53 +02:00
Roman Khavronenko
f71382332b vmbackup: correctly check if specified -dst belongs to specified -storageDataPath (#4841)
See this issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4837

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-17 13:50:51 +02:00
Dmytro Kozlov
215e9dd724 app/vmctl: fix migration process if tenant have no data (#4799)
app/vmctl: don't interrupt migration process if tenant has no data

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Alexander Marshalov <_@marshalov.org>
2023-08-17 13:49:07 +02:00
Aliaksandr Valialkin
14dada5da0 docs/CHANGELOG.md: document that v1.93.x is a new line of LTS releases 2023-08-12 15:30:23 -07:00
Aliaksandr Valialkin
7e9112da50 deployment/docker/Makefile: do not overwrite latest tag when pushing Docker images for LTS release
The `latest` tag is reserved for the latest release
2023-08-12 15:28:51 -07:00
541 changed files with 3878 additions and 7790 deletions

View File

@@ -14,13 +14,25 @@ jobs:
name: Build
runs-on: ubuntu-latest
steps:
- name: Setup Go
uses: actions/setup-go@main
with:
go-version: 1.21.4
id: go
- name: Code checkout
uses: actions/checkout@master
- name: Setup Go
id: go
uses: actions/setup-go@v4
with:
go-version: stable
cache: false
- name: Cache Go artifacts
uses: actions/cache@v3
with:
path: |
~/.cache/go-build
~/go/pkg/mod
~/go/bin
key: go-artifacts-${{ runner.os }}-check-licenses-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-check-licenses-
- name: Check License
run: |
make check-licenses
run: make check-licenses

View File

@@ -55,11 +55,22 @@ jobs:
uses: actions/checkout@v4
- name: Set up Go
id: go
uses: actions/setup-go@v4
with:
go-version: 1.21.4
check-latest: true
cache: true
go-version: stable
cache: false
if: ${{ matrix.language == 'go' }}
- name: Cache Go artifacts
uses: actions/cache@v3
with:
path: |
~/.cache/go-build
~/go/pkg/mod
~/go/bin
key: go-artifacts-${{ runner.os }}-codeql-analyze-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-codeql-analyze-
if: ${{ matrix.language == 'go' }}
# Initializes the CodeQL tools for scanning.

View File

@@ -7,6 +7,8 @@ on:
paths-ignore:
- "docs/**"
- "**.md"
- "dashboards/**"
- "deployment/**.yml"
pull_request:
branches:
- master
@@ -14,6 +16,8 @@ on:
paths-ignore:
- "docs/**"
- "**.md"
- "dashboards/**"
- "deployment/**.yml"
permissions:
contents: read
@@ -30,18 +34,55 @@ jobs:
uses: actions/checkout@v4
- name: Setup Go
id: go
uses: actions/setup-go@v4
with:
go-version: 1.21.4
check-latest: true
cache: true
go-version: stable
cache: false
- name: Dependencies
- name: Cache Go artifacts
uses: actions/cache@v3
with:
path: |
~/.cache/go-build
~/go/pkg/mod
~/go/bin
key: go-artifacts-${{ runner.os }}-check-all-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-check-all-
- name: Run check-all
run: |
make install-golangci-lint
make check-all
git diff --exit-code
build:
needs: lint
name: build
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v4
- name: Setup Go
id: go
uses: actions/setup-go@v4
with:
go-version: stable
cache: false
- name: Cache Go artifacts
uses: actions/cache@v3
with:
path: |
~/.cache/go-build
~/go/pkg/mod
~/go/bin
key: go-artifacts-${{ runner.os }}-crossbuild-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-crossbuild-
- name: Build
run: make crossbuild
test:
needs: lint
strategy:
@@ -54,42 +95,26 @@ jobs:
uses: actions/checkout@v4
- name: Setup Go
id: go
uses: actions/setup-go@v4
with:
go-version: 1.21.4
check-latest: true
cache: true
go-version: stable
cache: false
- name: Cache Go artifacts
uses: actions/cache@v3
with:
path: |
~/.cache/go-build
~/go/pkg/mod
~/go/bin
key: go-artifacts-${{ runner.os }}-${{ matrix.scenario }}-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.sum', 'Makefile', 'app/**/Makefile') }}
restore-keys: go-artifacts-${{ runner.os }}-${{ matrix.scenario }}-
- name: run tests
run: |
make ${{ matrix.scenario}}
run: make ${{ matrix.scenario}}
- name: Publish coverage
uses: codecov/codecov-action@v3
with:
file: ./coverage.txt
build:
needs: test
name: build
runs-on: ubuntu-latest
steps:
- name: Code checkout
uses: actions/checkout@v4
- name: Setup Go
id: go
uses: actions/setup-go@v4
with:
go-version: 1.21.4
check-latest: true
cache: true
- uses: actions/cache@v3
with:
path: gocache-for-docker
key: gocache-docker-${{ runner.os }}-${{ steps.go.outputs.go-version }}-${{ hashFiles('go.mod') }}
- name: Build
run: |
make vmcluster-crossbuild

View File

@@ -7,7 +7,7 @@ on:
- 'docs/**'
workflow_dispatch: {}
env:
PAGEFIND_VERSION: "1.0.3"
PAGEFIND_VERSION: "1.0.4"
HUGO_VERSION: "latest"
permissions:
contents: read # This is required for actions/checkout and to commit back image update

View File

@@ -1,5 +1,7 @@
PKG_PREFIX := github.com/VictoriaMetrics/VictoriaMetrics
MAKE_CONCURRENCY ?= $(shell cat /proc/cpuinfo | grep -c processor)
MAKE_PARALLEL := $(MAKE) -j $(MAKE_CONCURRENCY)
DATEINFO_TAG ?= $(shell date -u +'%Y%m%d-%H%M%S')
BUILDINFO_TAG ?= $(shell echo $$(git describe --long --all | tr '/' '-')$$( \
git diff-index --quiet HEAD -- || echo '-dirty-'$$(git diff-index -u HEAD | openssl sha1 | cut -d' ' -f2 | cut -c 1-8)))
@@ -15,6 +17,7 @@ GO_BUILDINFO = -X '$(PKG_PREFIX)/lib/buildinfo.Version=$(APP_NAME)-$(DATEINFO_TA
.PHONY: $(MAKECMDGOALS)
include app/*/Makefile
include docs/Makefile
include deployment/*/Makefile
include dashboards/Makefile
include package/release/Makefile
@@ -72,14 +75,16 @@ vmcluster-windows-amd64: \
vmselect-windows-amd64 \
vmstorage-windows-amd64
vmcluster-crossbuild: \
vmcluster-linux-amd64 \
vmcluster-linux-arm64 \
vmcluster-linux-arm \
vmcluster-linux-ppc64le \
vmcluster-linux-386 \
vmcluster-freebsd-amd64 \
vmcluster-openbsd-amd64
crossbuild: vmcluster-crossbuild
vmcluster-crossbuild:
$(MAKE_PARALLEL) vmcluster-linux-amd64 \
vmcluster-linux-arm64 \
vmcluster-linux-arm \
vmcluster-linux-ppc64le \
vmcluster-linux-386 \
vmcluster-freebsd-amd64 \
vmcluster-openbsd-amd64
publish: \
publish-vminsert \
@@ -93,13 +98,13 @@ package: \
publish-release:
rm -rf bin/*
git checkout $(TAG) && LATEST_TAG=stable $(MAKE) release publish && \
git checkout $(TAG)-cluster && LATEST_TAG=cluster-stable $(MAKE) release publish && \
git checkout $(TAG)-enterprise && LATEST_TAG=enterprise-stable $(MAKE) release publish && \
git checkout $(TAG)-enterprise-cluster && LATEST_TAG=enterprise-cluster-stable $(MAKE) release publish
git checkout $(TAG) && $(MAKE) release && LATEST_TAG=stable $(MAKE) publish && \
git checkout $(TAG)-cluster && $(MAKE) release && LATEST_TAG=cluster-stable $(MAKE) publish && \
git checkout $(TAG)-enterprise && $(MAKE) release && LATEST_TAG=enterprise-stable $(MAKE) publish && \
git checkout $(TAG)-enterprise-cluster && $(MAKE) release && LATEST_TAG=enterprise-cluster-stable $(MAKE) publish
release: \
release-vmcluster
release:
$(MAKE_PARALLEL) release-vmcluster
release-vmcluster: \
release-vmcluster-linux-amd64 \
@@ -268,12 +273,3 @@ copy-docs:
# The rest of docs is ordered manually.
docs-sync:
SRC=README.md DST=docs/Cluster-VictoriaMetrics.md OLD_URL='/Cluster-VictoriaMetrics.html' ORDER=2 TITLE='Cluster version' $(MAKE) copy-docs
SRC=app/vmagent/README.md DST=docs/vmagent.md OLD_URL='/vmagent.html' ORDER=3 TITLE=vmagent $(MAKE) copy-docs
SRC=app/vmalert/README.md DST=docs/vmalert.md OLD_URL='/vmalert.html' ORDER=4 TITLE=vmalert $(MAKE) copy-docs
SRC=app/vmauth/README.md DST=docs/vmauth.md OLD_URL='/vmauth.html' ORDER=5 TITLE=vmauth $(MAKE) copy-docs
SRC=app/vmbackup/README.md DST=docs/vmbackup.md OLD_URL='/vmbackup.html' ORDER=6 TITLE=vmbackup $(MAKE) copy-docs
SRC=app/vmrestore/README.md DST=docs/vmrestore.md OLD_URL='/vmrestore.html' ORDER=7 TITLE=vmrestore $(MAKE) copy-docs
SRC=app/vmctl/README.md DST=docs/vmctl.md OLD_URL='/vmctl.html' ORDER=8 TITLE=vmctl $(MAKE) copy-docs
SRC=app/vmgateway/README.md DST=docs/vmgateway.md OLD_URL='/vmgateway.html' ORDER=9 TITLE=vmgateway $(MAKE) copy-docs
SRC=app/vmbackupmanager/README.md DST=docs/vmbackupmanager.md OLD_URL='/vmbackupmanager.html' ORDER=10 TITLE=vmbackupmanager $(MAKE) copy-docs
SRC=app/vmalert-tool/README.md DST=docs/vmalert-tool.md OLD_URL='' ORDER=12 TITLE=vmalert-tool $(MAKE) copy-docs

View File

@@ -1,6 +1,6 @@
# Cluster version
<img alt="VictoriaMetrics" src="logo.png" width="300">
<img src="docs/logo.webp" width="300">
VictoriaMetrics is a fast, cost-effective and scalable time series database. It can be used as a long-term remote storage for Prometheus.
@@ -46,7 +46,7 @@ This is a [shared nothing architecture](https://en.wikipedia.org/wiki/Shared-not
It increases cluster availability, and simplifies cluster maintenance as well as cluster scaling.
<p align="center">
<img src="docs/Cluster-VictoriaMetrics_cluster-scheme.png" width="800">
<img src="docs/Cluster-VictoriaMetrics_cluster-scheme.webp" width="800">
</p>
## Multitenancy
@@ -99,6 +99,11 @@ while the `http_requests_total{path="/bar"} 34` would be stored in the tenant `a
The `vm_account_id` and `vm_project_id` labels are extracted after applying the [relabeling](https://docs.victoriametrics.com/relabeling.html)
set via `-relabelConfig` command-line flag, so these labels can be set at this stage.
The `vm_account_id` and `vm_project_id` labels are also taken into account when ingesting data via non-http-based protocols
such as [Graphite](https://docs.victoriametrics.com/#how-to-send-data-from-graphite-compatible-agents-such-as-statsd),
[InfluxDB line protocol via TCP and UDP](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) and
[OpenTSDB telnet put protocol](https://docs.victoriametrics.com/#sending-data-via-telnet-put-protocol).
**Security considerations:** it is recommended restricting access to `multitenant` endpoints only to trusted sources,
since untrusted source may break per-tenant data by writing unwanted samples to arbitrary tenants.
@@ -230,7 +235,7 @@ the following approaches for automatic discovery of `vmstorage` nodes:
The list of discovered `vmstorage` nodes is automatically updated when the file contents changes.
The update frequency can be controlled with `-storageNode.discoveryInterval` command-line flag.
- [dns+srv](https://en.wikipedia.org/wiki/SRV_record) - pass `dns+src:some-name` value to `-storageNode` command-line flag.
- [dns+srv](https://en.wikipedia.org/wiki/SRV_record) - pass `dns+srv:some-name` value to `-storageNode` command-line flag.
In this case the provided `dns+srv` names are resolved into tcp addresses of `vmstorage` nodes.
The list of discovered `vmstorage` nodes is automatically updated at `vminsert` and `vmselect`
when it changes behind the corresponding `dns+srv` names.
@@ -472,7 +477,8 @@ during the config update / version upgrade. In this case the following strategy
[rolling restarts](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#no-downtime-strategy),
since they need to process higher load when some of `vmstorage` nodes are temporarily unavailable in the cluster.
It is possible to reduce resource usage spikes by running more `vminsert` nodes and by passing bigger values
to `-storage.vminsertConnsShutdownDuration` command-line flag at `vmstorage` nodes.
to `-storage.vminsertConnsShutdownDuration` (available from [v1.95.0](https://docs.victoriametrics.com/CHANGELOG.html#v1950))
command-line flag at `vmstorage` nodes.
In this case `vmstorage` increases the interval between gradual closing of `vminsert` connections during graceful shutdown.
This reduces data ingestion slowdown during rollout restarts.
@@ -879,7 +885,7 @@ Below is the output for `/path/to/vminsert -help`:
-csvTrimTimestamp duration
Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-datadog.maxInsertRequestSize size
The maximum size in bytes of a single DataDog POST request to /api/v1/series
The maximum size in bytes of a single DataDog POST request to /api/v1/series, /api/v2/series, /api/beta/sketches
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 67108864)
-datadog.sanitizeMetricName
Sanitize metric names for the ingested DataDog data to comply with DataDog behaviour described at https://docs.datadoghq.com/metrics/custom_metrics/#naming-custom-metrics (default true)
@@ -937,7 +943,7 @@ Below is the output for `/path/to/vminsert -help`:
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing
-import.maxLineLen size
The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 104857600)
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 10485760)
-influx.databaseNames array
Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb
Supports an array of values separated by comma or specified via multiple flags.

View File

@@ -105,7 +105,7 @@ func RequestHandler(path string, w http.ResponseWriter, r *http.Request) bool {
vlstorage.MustAddRows(lr)
logstorage.PutLogRows(lr)
if err != nil {
logger.Warnf("cannot decode log message #%d in /_bulk request: %s", n, err)
logger.Warnf("cannot decode log message #%d in /_bulk request: %s, stream fields: %s", n, err, cp.StreamFields)
return true
}

File diff suppressed because it is too large Load Diff

View File

@@ -65,7 +65,9 @@ func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.L
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(at, &ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(len(rows))
if at != nil {
rowsTenantInserted.Get(at).Add(len(rows))

View File

@@ -88,7 +88,9 @@ func insertRows(at *auth.Token, series []datadog.Series, extraLabels []prompbmar
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(at, &ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)

View File

@@ -5,6 +5,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite/stream"
@@ -20,10 +21,21 @@ var (
//
// See https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
func InsertHandler(r io.Reader) error {
return stream.Parse(r, insertRows)
return stream.Parse(r, false, func(rows []parser.Row) error {
return insertRows(nil, rows)
})
}
func insertRows(rows []parser.Row) error {
// InsertHandlerForReader processes remote write for graphite plaintext protocol.
//
// See https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
func InsertHandlerForReader(at *auth.Token, r io.Reader, isGzipped bool) error {
return stream.Parse(r, isGzipped, func(rows []parser.Row) error {
return insertRows(at, rows)
})
}
func insertRows(at *auth.Token, rows []parser.Row) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -56,7 +68,9 @@ func insertRows(rows []parser.Row) error {
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(nil, &ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil

View File

@@ -36,9 +36,9 @@ var (
// InsertHandlerForReader processes remote write for influx line protocol.
//
// See https://github.com/influxdata/telegraf/tree/master/plugins/inputs/socket_listener/
func InsertHandlerForReader(r io.Reader, isGzipped bool) error {
func InsertHandlerForReader(at *auth.Token, r io.Reader, isGzipped bool) error {
return stream.Parse(r, isGzipped, "", "", func(db string, rows []parser.Row) error {
return insertRows(nil, db, rows, nil)
return insertRows(at, db, rows, nil)
})
}
@@ -130,7 +130,9 @@ func insertRows(at *auth.Token, db string, rows []parser.Row, extraLabels []prom
ctx.ctx.Labels = labels
ctx.ctx.Samples = samples
ctx.commonLabels = commonLabels
remotewrite.Push(at, &ctx.ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)

View File

@@ -11,8 +11,6 @@ import (
"sync/atomic"
"time"
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/datadog"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/graphite"
@@ -42,6 +40,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/pushmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
@@ -125,7 +124,7 @@ func main() {
common.StartUnmarshalWorkers()
if len(*influxListenAddr) > 0 {
influxServer = influxserver.MustStart(*influxListenAddr, *influxUseProxyProtocol, func(r io.Reader) error {
return influx.InsertHandlerForReader(r, false)
return influx.InsertHandlerForReader(nil, r, false)
})
}
if len(*graphiteListenAddr) > 0 {
@@ -140,7 +139,7 @@ func main() {
opentsdbhttpServer = opentsdbhttpserver.MustStart(*opentsdbHTTPListenAddr, *opentsdbHTTPUseProxyProtocol, httpInsertHandler)
}
promscrape.Init(remotewrite.Push)
promscrape.Init(remotewrite.PushDropSamplesOnFailure)
if len(*httpListenAddr) > 0 {
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, requestHandler)

View File

@@ -84,6 +84,8 @@ func insertRows(at *auth.Token, block *stream.Block, extraLabels []prompbmarshal
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(at, &ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
return nil
}

View File

@@ -76,7 +76,9 @@ func insertRows(at *auth.Token, rows []newrelic.Row, extraLabels []prompbmarshal
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(at, &ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(len(rows))
if at != nil {
rowsTenantInserted.Get(at).Add(samplesCount)

View File

@@ -59,7 +59,9 @@ func insertRows(at *auth.Token, tss []prompbmarshal.TimeSeries, extraLabels []pr
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(at, &ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)

View File

@@ -56,7 +56,9 @@ func insertRows(rows []parser.Row) error {
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(nil, &ctx.WriteRequest)
if !remotewrite.TryPush(nil, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil

View File

@@ -64,7 +64,9 @@ func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.L
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(at, &ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil

View File

@@ -73,7 +73,9 @@ func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.L
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(at, &ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(len(rows))
if at != nil {
rowsTenantInserted.Get(at).Add(len(rows))

View File

@@ -69,7 +69,9 @@ func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, extraLabels []pr
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(at, &ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)

View File

@@ -305,7 +305,7 @@ func (c *client) runWorker() {
continue
}
// Return unsent block to the queue.
c.fq.MustWriteBlock(block)
c.fq.MustWriteBlockIgnoreDisabledPQ(block)
return
case <-c.stopCh:
// c must be stopped. Wait for a while in the hope the block will be sent.
@@ -314,11 +314,11 @@ func (c *client) runWorker() {
case ok := <-ch:
if !ok {
// Return unsent block to the queue.
c.fq.MustWriteBlock(block)
c.fq.MustWriteBlockIgnoreDisabledPQ(block)
}
case <-time.After(graceDuration):
// Return unsent block to the queue.
c.fq.MustWriteBlock(block)
c.fq.MustWriteBlockIgnoreDisabledPQ(block)
}
return
}

View File

@@ -37,9 +37,9 @@ type pendingSeries struct {
periodicFlusherWG sync.WaitGroup
}
func newPendingSeries(pushBlock func(block []byte), isVMRemoteWrite bool, significantFigures, roundDigits int) *pendingSeries {
func newPendingSeries(fq *persistentqueue.FastQueue, isVMRemoteWrite bool, significantFigures, roundDigits int) *pendingSeries {
var ps pendingSeries
ps.wr.pushBlock = pushBlock
ps.wr.fq = fq
ps.wr.isVMRemoteWrite = isVMRemoteWrite
ps.wr.significantFigures = significantFigures
ps.wr.roundDigits = roundDigits
@@ -57,10 +57,11 @@ func (ps *pendingSeries) MustStop() {
ps.periodicFlusherWG.Wait()
}
func (ps *pendingSeries) Push(tss []prompbmarshal.TimeSeries) {
func (ps *pendingSeries) TryPush(tss []prompbmarshal.TimeSeries) bool {
ps.mu.Lock()
ps.wr.push(tss)
ok := ps.wr.tryPush(tss)
ps.mu.Unlock()
return ok
}
func (ps *pendingSeries) periodicFlusher() {
@@ -70,18 +71,20 @@ func (ps *pendingSeries) periodicFlusher() {
}
ticker := time.NewTicker(*flushInterval)
defer ticker.Stop()
mustStop := false
for !mustStop {
for {
select {
case <-ps.stopCh:
mustStop = true
ps.mu.Lock()
ps.wr.mustFlushOnStop()
ps.mu.Unlock()
return
case <-ticker.C:
if fasttime.UnixTimestamp()-atomic.LoadUint64(&ps.wr.lastFlushTime) < uint64(flushSeconds) {
continue
}
}
ps.mu.Lock()
ps.wr.flush()
_ = ps.wr.tryFlush()
ps.mu.Unlock()
}
}
@@ -90,16 +93,16 @@ type writeRequest struct {
// Move lastFlushTime to the top of the struct in order to guarantee atomic access on 32-bit architectures.
lastFlushTime uint64
// pushBlock is called when whe write request is ready to be sent.
pushBlock func(block []byte)
// The queue to send blocks to.
fq *persistentqueue.FastQueue
// Whether to encode the write request with VictoriaMetrics remote write protocol.
isVMRemoteWrite bool
// How many significant figures must be left before sending the writeRequest to pushBlock.
// How many significant figures must be left before sending the writeRequest to fq.
significantFigures int
// How many decimal digits after point must be left before sending the writeRequest to pushBlock.
// How many decimal digits after point must be left before sending the writeRequest to fq.
roundDigits int
wr prompbmarshal.WriteRequest
@@ -112,7 +115,7 @@ type writeRequest struct {
}
func (wr *writeRequest) reset() {
// Do not reset lastFlushTime, pushBlock, isVMRemoteWrite, significantFigures and roundDigits, since they are re-used.
// Do not reset lastFlushTime, fq, isVMRemoteWrite, significantFigures and roundDigits, since they are re-used.
wr.wr.Timeseries = nil
@@ -130,23 +133,40 @@ func (wr *writeRequest) reset() {
wr.buf = wr.buf[:0]
}
func (wr *writeRequest) flush() {
// mustFlushOnStop force pushes wr data into wr.fq
//
// This is needed in order to properly save in-memory data to persistent queue on graceful shutdown.
func (wr *writeRequest) mustFlushOnStop() {
wr.wr.Timeseries = wr.tss
wr.adjustSampleValues()
atomic.StoreUint64(&wr.lastFlushTime, fasttime.UnixTimestamp())
pushWriteRequest(&wr.wr, wr.pushBlock, wr.isVMRemoteWrite)
if !tryPushWriteRequest(&wr.wr, wr.mustWriteBlock, wr.isVMRemoteWrite) {
logger.Panicf("BUG: final flush must always return true")
}
wr.reset()
}
func (wr *writeRequest) adjustSampleValues() {
samples := wr.samples
if n := wr.significantFigures; n > 0 {
func (wr *writeRequest) mustWriteBlock(block []byte) bool {
wr.fq.MustWriteBlockIgnoreDisabledPQ(block)
return true
}
func (wr *writeRequest) tryFlush() bool {
wr.wr.Timeseries = wr.tss
atomic.StoreUint64(&wr.lastFlushTime, fasttime.UnixTimestamp())
if !tryPushWriteRequest(&wr.wr, wr.fq.TryWriteBlock, wr.isVMRemoteWrite) {
return false
}
wr.reset()
return true
}
func adjustSampleValues(samples []prompbmarshal.Sample, significantFigures, roundDigits int) {
if n := significantFigures; n > 0 {
for i := range samples {
s := &samples[i]
s.Value = decimal.RoundToSignificantFigures(s.Value, n)
}
}
if n := wr.roundDigits; n < 100 {
if n := roundDigits; n < 100 {
for i := range samples {
s := &samples[i]
s.Value = decimal.RoundToDecimalDigits(s.Value, n)
@@ -154,21 +174,27 @@ func (wr *writeRequest) adjustSampleValues() {
}
}
func (wr *writeRequest) push(src []prompbmarshal.TimeSeries) {
func (wr *writeRequest) tryPush(src []prompbmarshal.TimeSeries) bool {
tssDst := wr.tss
maxSamplesPerBlock := *maxRowsPerBlock
// Allow up to 10x of labels per each block on average.
maxLabelsPerBlock := 10 * maxSamplesPerBlock
for i := range src {
tssDst = append(tssDst, prompbmarshal.TimeSeries{})
wr.copyTimeSeries(&tssDst[len(tssDst)-1], &src[i])
if len(wr.samples) >= maxSamplesPerBlock || len(wr.labels) >= maxLabelsPerBlock {
wr.tss = tssDst
wr.flush()
if !wr.tryFlush() {
return false
}
tssDst = wr.tss
}
tsSrc := &src[i]
adjustSampleValues(tsSrc.Samples, wr.significantFigures, wr.roundDigits)
tssDst = append(tssDst, prompbmarshal.TimeSeries{})
wr.copyTimeSeries(&tssDst[len(tssDst)-1], tsSrc)
}
wr.tss = tssDst
return true
}
func (wr *writeRequest) copyTimeSeries(dst, src *prompbmarshal.TimeSeries) {
@@ -196,10 +222,10 @@ func (wr *writeRequest) copyTimeSeries(dst, src *prompbmarshal.TimeSeries) {
wr.buf = buf
}
func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byte), isVMRemoteWrite bool) {
func tryPushWriteRequest(wr *prompbmarshal.WriteRequest, tryPushBlock func(block []byte) bool, isVMRemoteWrite bool) bool {
if len(wr.Timeseries) == 0 {
// Nothing to push
return
return true
}
bb := writeRequestBufPool.Get()
bb.B = prompbmarshal.MarshalWriteRequest(bb.B[:0], wr)
@@ -212,11 +238,13 @@ func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byt
}
writeRequestBufPool.Put(bb)
if len(zb.B) <= persistentqueue.MaxBlockSize {
pushBlock(zb.B)
if !tryPushBlock(zb.B) {
return false
}
blockSizeRows.Update(float64(len(wr.Timeseries)))
blockSizeBytes.Update(float64(len(zb.B)))
snappyBufPool.Put(zb)
return
return true
}
snappyBufPool.Put(zb)
} else {
@@ -229,23 +257,36 @@ func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byt
samples := wr.Timeseries[0].Samples
if len(samples) == 1 {
logger.Warnf("dropping a sample for metric with too long labels exceeding -remoteWrite.maxBlockSize=%d bytes", maxUnpackedBlockSize.N)
return
return true
}
n := len(samples) / 2
wr.Timeseries[0].Samples = samples[:n]
pushWriteRequest(wr, pushBlock, isVMRemoteWrite)
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Timeseries[0].Samples = samples
return false
}
wr.Timeseries[0].Samples = samples[n:]
pushWriteRequest(wr, pushBlock, isVMRemoteWrite)
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Timeseries[0].Samples = samples
return false
}
wr.Timeseries[0].Samples = samples
return
return true
}
timeseries := wr.Timeseries
n := len(timeseries) / 2
wr.Timeseries = timeseries[:n]
pushWriteRequest(wr, pushBlock, isVMRemoteWrite)
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Timeseries = timeseries
return false
}
wr.Timeseries = timeseries[n:]
pushWriteRequest(wr, pushBlock, isVMRemoteWrite)
if !tryPushWriteRequest(wr, tryPushBlock, isVMRemoteWrite) {
wr.Timeseries = timeseries
return false
}
wr.Timeseries = timeseries
return true
}
var (

View File

@@ -26,13 +26,16 @@ func testPushWriteRequest(t *testing.T, rowsCount, expectedBlockLenProm, expecte
t.Helper()
wr := newTestWriteRequest(rowsCount, 20)
pushBlockLen := 0
pushBlock := func(block []byte) {
pushBlock := func(block []byte) bool {
if pushBlockLen > 0 {
panic(fmt.Errorf("BUG: pushBlock called multiple times; pushBlockLen=%d at first call, len(block)=%d at second call", pushBlockLen, len(block)))
}
pushBlockLen = len(block)
return true
}
if !tryPushWriteRequest(wr, pushBlock, isVMRemoteWrite) {
t.Fatalf("cannot push data to to remote storage")
}
pushWriteRequest(wr, pushBlock, isVMRemoteWrite)
if math.Abs(float64(pushBlockLen-expectedBlockLen)/float64(expectedBlockLen)*100) > tolerancePrc {
t.Fatalf("unexpected block len for rowsCount=%d, isVMRemoteWrite=%v; got %d bytes; expecting %d bytes +- %.0f%%",
rowsCount, isVMRemoteWrite, pushBlockLen, expectedBlockLen, tolerancePrc)

View File

@@ -3,6 +3,7 @@ package remotewrite
import (
"flag"
"fmt"
"strconv"
"strings"
"sync"
@@ -92,6 +93,7 @@ func (rctx *relabelCtx) applyRelabeling(tss []prompbmarshal.TimeSeries, pcs *pro
// Nothing to change.
return tss
}
rctx.reset()
tssDst := tss[:0]
labels := rctx.labels[:0]
for i := range tss {
@@ -120,6 +122,7 @@ func (rctx *relabelCtx) appendExtraLabels(tss []prompbmarshal.TimeSeries, extraL
if len(extraLabels) == 0 {
return
}
rctx.reset()
labels := rctx.labels[:0]
for i := range tss {
ts := &tss[i]
@@ -139,6 +142,34 @@ func (rctx *relabelCtx) appendExtraLabels(tss []prompbmarshal.TimeSeries, extraL
rctx.labels = labels
}
func (rctx *relabelCtx) tenantToLabels(tss []prompbmarshal.TimeSeries, accountID, projectID uint32) {
rctx.reset()
accountIDStr := strconv.FormatUint(uint64(accountID), 10)
projectIDStr := strconv.FormatUint(uint64(projectID), 10)
labels := rctx.labels[:0]
for i := range tss {
ts := &tss[i]
labelsLen := len(labels)
for _, label := range ts.Labels {
labelName := label.Name
if labelName == "vm_account_id" || labelName == "vm_project_id" {
continue
}
labels = append(labels, label)
}
labels = append(labels, prompbmarshal.Label{
Name: "vm_account_id",
Value: accountIDStr,
})
labels = append(labels, prompbmarshal.Label{
Name: "vm_project_id",
Value: projectIDStr,
})
ts.Labels = labels[labelsLen:]
}
rctx.labels = labels
}
type relabelCtx struct {
// pool for labels, which are used during the relabeling.
labels []prompbmarshal.Label
@@ -160,7 +191,7 @@ func getRelabelCtx() *relabelCtx {
}
func putRelabelCtx(rctx *relabelCtx) {
rctx.labels = rctx.labels[:0]
rctx.reset()
relabelCtxPool.Put(rctx)
}

View File

@@ -3,6 +3,7 @@ package remotewrite
import (
"flag"
"fmt"
"net/http"
"net/url"
"path/filepath"
"strconv"
@@ -10,6 +11,8 @@ import (
"sync/atomic"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bloomfilter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
@@ -34,18 +37,22 @@ var (
remoteWriteURLs = flagutil.NewArrayString("remoteWrite.url", "Remote storage URL to write data to. It must support either VictoriaMetrics remote write protocol "+
"or Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
"Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. "+
"The data can be sharded among the configured remote storage systems if -remoteWrite.shardByURL flag is set. "+
"See also -remoteWrite.multitenantURL")
"The data can be sharded among the configured remote storage systems if -remoteWrite.shardByURL flag is set")
remoteWriteMultitenantURLs = flagutil.NewArrayString("remoteWrite.multitenantURL", "Base path for multitenant remote storage URL to write data to. "+
"See https://docs.victoriametrics.com/vmagent.html#multitenancy for details. Example url: http://<vminsert>:8480 . "+
"Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.url")
"Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. "+
"This flag is deprecated in favor of -enableMultitenantHandlers . See https://docs.victoriametrics.com/vmagent.html#multitenancy")
enableMultitenantHandlers = flag.Bool("enableMultitenantHandlers", false, "Whether to process incoming data via multitenant insert handlers according to "+
"https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format . By default incoming data is processed via single-node insert handlers "+
"according to https://docs.victoriametrics.com/#how-to-import-time-series-data ."+
"See https://docs.victoriametrics.com/vmagent.html#multitenancy for details")
shardByURL = flag.Bool("remoteWrite.shardByURL", false, "Whether to shard outgoing series across all the remote storage systems enumerated via -remoteWrite.url . "+
"By default the data is replicated across all the -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#sharding-among-remote-storages")
shardByURLLabels = flagutil.NewArrayString("remoteWrite.shardByURL.labels", "Optional list of labels, which must be used for sharding outgoing samples "+
"among remote storage systems if -remoteWrite.shardByURL command-line flag is set. By default all the labels are used for sharding in order to gain "+
"even distribution of series over the specified -remoteWrite.url systems")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored. "+
"See also -remoteWrite.maxDiskUsagePerURL")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory for storing pending data, which isn't sent to the configured -remoteWrite.url . "+
"See also -remoteWrite.maxDiskUsagePerURL and -remoteWrite.disableOnDiskQueue")
keepDanglingQueues = flag.Bool("remoteWrite.keepDanglingQueues", false, "Keep persistent queues contents at -remoteWrite.tmpDataPath in case there are no matching -remoteWrite.url. "+
"Useful when -remoteWrite.url is changed temporarily and persistent queue files will be needed later on.")
queues = flag.Int("remoteWrite.queues", cgroup.AvailableCPUs()*2, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
@@ -84,6 +91,11 @@ var (
"are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html")
streamAggrDedupInterval = flagutil.NewArrayDuration("remoteWrite.streamAggr.dedupInterval", 0, "Input samples are de-duplicated with this interval before being aggregated. "+
"Only the last sample per each time series per each interval is aggregated if the interval is greater than zero")
disableOnDiskQueue = flag.Bool("remoteWrite.disableOnDiskQueue", false, "Whether to disable storing pending data to -remoteWrite.tmpDataPath "+
"when the configured remote storage systems cannot keep up with the data ingestion rate. See https://docs.victoriametrics.com/vmagent.html#disabling-on-disk-persistence ."+
"See also -remoteWrite.dropSamplesOnOverload")
dropSamplesOnOverload = flag.Bool("remoteWrite.dropSamplesOnOverload", false, "Whether to drop samples when -remoteWrite.disableOnDiskQueue is set and if the samples "+
"cannot be pushed into the configured remote storage systems in a timely manner. See https://docs.victoriametrics.com/vmagent.html#disabling-on-disk-persistence")
)
var (
@@ -96,11 +108,19 @@ var (
// Data without tenant id is written to defaultAuthToken if -remoteWrite.multitenantURL is specified.
defaultAuthToken = &auth.Token{}
// ErrQueueFullHTTPRetry must be returned when TryPush() returns false.
ErrQueueFullHTTPRetry = &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("remote storage systems cannot keep up with the data ingestion rate; retry the request later " +
"or remove -remoteWrite.disableOnDiskQueue from vmagent command-line flags, so it could save pending data to -remoteWrite.tmpDataPath; " +
"see https://docs.victoriametrics.com/vmagent.html#disabling-on-disk-persistence"),
StatusCode: http.StatusTooManyRequests,
}
)
// MultitenancyEnabled returns true if -remoteWrite.multitenantURL is specified.
// MultitenancyEnabled returns true if -enableMultitenantHandlers or -remoteWrite.multitenantURL is specified.
func MultitenancyEnabled() bool {
return len(*remoteWriteMultitenantURLs) > 0
return *enableMultitenantHandlers || len(*remoteWriteMultitenantURLs) > 0
}
// Contains the current relabelConfigs.
@@ -183,6 +203,7 @@ func Init() {
if len(*remoteWriteURLs) > 0 {
rwctxsDefault = newRemoteWriteCtxs(nil, *remoteWriteURLs)
}
dropDanglingQueues()
// Start config reloader.
configReloaderWG.Add(1)
@@ -200,6 +221,42 @@ func Init() {
}()
}
func dropDanglingQueues() {
if *keepDanglingQueues {
return
}
if len(*remoteWriteMultitenantURLs) > 0 {
// Do not drop dangling queues for *remoteWriteMultitenantURLs, since it is impossible to determine
// unused queues for multitenant urls - they are created on demand when new sample for the given
// tenant is pushed to remote storage.
return
}
// Remove dangling persistent queues, if any.
// This is required for the case when the number of queues has been changed or URL have been changed.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014
//
existingQueues := make(map[string]struct{}, len(rwctxsDefault))
for _, rwctx := range rwctxsDefault {
existingQueues[rwctx.fq.Dirname()] = struct{}{}
}
queuesDir := filepath.Join(*tmpDataPath, persistentQueueDirname)
files := fs.MustReadDir(queuesDir)
removed := 0
for _, f := range files {
dirname := f.Name()
if _, ok := existingQueues[dirname]; !ok {
logger.Infof("removing dangling queue %q", dirname)
fullPath := filepath.Join(queuesDir, dirname)
fs.MustRemoveAll(fullPath)
removed++
}
}
if removed > 0 {
logger.Infof("removed %d dangling queues from %q, active queues: %d", removed, *tmpDataPath, len(rwctxsDefault))
}
}
func reloadRelabelConfigs() {
relabelConfigReloads.Inc()
logger.Infof("reloading relabel configs pointed by -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig")
@@ -273,33 +330,6 @@ func newRemoteWriteCtxs(at *auth.Token, urls []string) []*remoteWriteCtx {
}
rwctxs[i] = newRemoteWriteCtx(i, remoteWriteURL, maxInmemoryBlocks, sanitizedURL)
}
if !*keepDanglingQueues {
// Remove dangling queues, if any.
// This is required for the case when the number of queues has been changed or URL have been changed.
// See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014
existingQueues := make(map[string]struct{}, len(rwctxs))
for _, rwctx := range rwctxs {
existingQueues[rwctx.fq.Dirname()] = struct{}{}
}
queuesDir := filepath.Join(*tmpDataPath, persistentQueueDirname)
files := fs.MustReadDir(queuesDir)
removed := 0
for _, f := range files {
dirname := f.Name()
if _, ok := existingQueues[dirname]; !ok {
logger.Infof("removing dangling queue %q", dirname)
fullPath := filepath.Join(queuesDir, dirname)
fs.MustRemoveAll(fullPath)
removed++
}
}
if removed > 0 {
logger.Infof("removed %d dangling queues from %q, active queues: %d", removed, *tmpDataPath, len(rwctxs))
}
}
return rwctxs
}
@@ -308,7 +338,7 @@ var configReloaderWG sync.WaitGroup
// Stop stops remotewrite.
//
// It is expected that nobody calls Push during and after the call to this func.
// It is expected that nobody calls TryPush during and after the call to this func.
func Stop() {
close(configReloaderStopCh)
configReloaderWG.Wait()
@@ -318,7 +348,7 @@ func Stop() {
}
rwctxsDefault = nil
// There is no need in locking rwctxsMapLock here, since nobody should call Push during the Stop call.
// There is no need in locking rwctxsMapLock here, since nobody should call TryPush during the Stop call.
for _, rwctxs := range rwctxsMap {
for _, rwctx := range rwctxs {
rwctx.MustStop()
@@ -334,24 +364,47 @@ func Stop() {
}
}
// Push sends wr to remote storage systems set via `-remoteWrite.url`.
// PushDropSamplesOnFailure pushes wr to the configured remote storage systems set via -remoteWrite.url and -remoteWrite.multitenantURL
//
// If at is nil, then the data is pushed to the configured `-remoteWrite.url`.
// If at isn't nil, the data is pushed to the configured `-remoteWrite.multitenantURL`.
// If at is nil, then the data is pushed to the configured -remoteWrite.url.
// If at isn't nil, the data is pushed to the configured -remoteWrite.multitenantURL.
//
// Note that wr may be modified by Push because of relabeling and rounding.
func Push(at *auth.Token, wr *prompbmarshal.WriteRequest) {
if at == nil && len(*remoteWriteMultitenantURLs) > 0 {
// Write data to default tenant if at isn't set while -remoteWrite.multitenantURL is set.
// PushDropSamplesOnFailure can modify wr contents.
func PushDropSamplesOnFailure(at *auth.Token, wr *prompbmarshal.WriteRequest) {
_ = tryPush(at, wr, true)
}
// TryPush tries sending wr to the configured remote storage systems set via -remoteWrite.url and -remoteWrite.multitenantURL
//
// If at is nil, then the data is pushed to the configured -remoteWrite.url.
// If at isn't nil, the data is pushed to the configured -remoteWrite.multitenantURL.
//
// TryPush can modify wr contents, so the caller must re-initialize wr before calling TryPush() after unsuccessful attempt.
// TryPush may send partial data from wr on unsuccessful attempt, so repeated call for the same wr may send the data multiple times.
//
// The caller must return ErrQueueFullHTTPRetry to the client, which sends wr, if TryPush returns false.
func TryPush(at *auth.Token, wr *prompbmarshal.WriteRequest) bool {
return tryPush(at, wr, *dropSamplesOnOverload)
}
func tryPush(at *auth.Token, wr *prompbmarshal.WriteRequest, dropSamplesOnFailure bool) bool {
tss := wr.Timeseries
if at == nil && MultitenancyEnabled() {
// Write data to default tenant if at isn't set when multitenancy is enabled.
at = defaultAuthToken
}
var tenantRctx *relabelCtx
var rwctxs []*remoteWriteCtx
if at == nil {
rwctxs = rwctxsDefault
} else if len(*remoteWriteMultitenantURLs) == 0 {
// Convert at to (vm_account_id, vm_project_id) labels.
tenantRctx = getRelabelCtx()
defer putRelabelCtx(tenantRctx)
rwctxs = rwctxsDefault
} else {
if len(*remoteWriteMultitenantURLs) == 0 {
logger.Panicf("BUG: -remoteWrite.multitenantURL command-line flag must be set when __tenant_id__=%q label is set", at)
}
rwctxsMapLock.Lock()
tenantID := tenantmetrics.TenantID{
AccountID: at.AccountID,
@@ -365,18 +418,37 @@ func Push(at *auth.Token, wr *prompbmarshal.WriteRequest) {
rwctxsMapLock.Unlock()
}
rowsCount := getRowsCount(tss)
if *disableOnDiskQueue {
// Quick check whether writes to configured remote storage systems are blocked.
// This allows saving CPU time spent on relabeling and block compression
// if some of remote storage systems cannot keep up with the data ingestion rate.
for _, rwctx := range rwctxs {
if rwctx.fq.IsWriteBlocked() {
pushFailures.Inc()
if dropSamplesOnFailure {
// Just drop samples
samplesDropped.Add(rowsCount)
return true
}
return false
}
}
}
var rctx *relabelCtx
rcs := allRelabelConfigs.Load()
pcsGlobal := rcs.global
if pcsGlobal.Len() > 0 {
rctx = getRelabelCtx()
defer putRelabelCtx(rctx)
}
tss := wr.Timeseries
rowsCount := getRowsCount(tss)
globalRowsPushedBeforeRelabel.Add(rowsCount)
maxSamplesPerBlock := *maxRowsPerBlock
// Allow up to 10x of labels per each block on average.
maxLabelsPerBlock := 10 * maxSamplesPerBlock
for len(tss) > 0 {
// Process big tss in smaller blocks in order to reduce the maximum memory usage
samplesCount := 0
@@ -384,7 +456,7 @@ func Push(at *auth.Token, wr *prompbmarshal.WriteRequest) {
i := 0
for i < len(tss) {
samplesCount += len(tss[i].Samples)
labelsCount += len(tss[i].Labels)
labelsCount += len(tss[i].Samples) * len(tss[i].Labels)
i++
if samplesCount >= maxSamplesPerBlock || labelsCount >= maxLabelsPerBlock {
break
@@ -397,6 +469,9 @@ func Push(at *auth.Token, wr *prompbmarshal.WriteRequest) {
} else {
tss = nil
}
if tenantRctx != nil {
tenantRctx.tenantToLabels(tssBlock, at.AccountID, at.ProjectID)
}
if rctx != nil {
rowsCountBeforeRelabel := getRowsCount(tssBlock)
tssBlock = rctx.applyRelabeling(tssBlock, pcsGlobal)
@@ -405,25 +480,35 @@ func Push(at *auth.Token, wr *prompbmarshal.WriteRequest) {
}
sortLabelsIfNeeded(tssBlock)
tssBlock = limitSeriesCardinality(tssBlock)
pushBlockToRemoteStorages(rwctxs, tssBlock)
if rctx != nil {
rctx.reset()
if !tryPushBlockToRemoteStorages(rwctxs, tssBlock) {
if !*disableOnDiskQueue {
logger.Panicf("BUG: tryPushBlockToRemoteStorages must return true if -remoteWrite.disableOnDiskQueue isn't set")
}
pushFailures.Inc()
if dropSamplesOnFailure {
samplesDropped.Add(rowsCount)
return true
}
return false
}
}
if rctx != nil {
putRelabelCtx(rctx)
}
return true
}
func pushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmarshal.TimeSeries) {
var (
samplesDropped = metrics.NewCounter(`vmagent_remotewrite_samples_dropped_total`)
pushFailures = metrics.NewCounter(`vmagent_remotewrite_push_failures_total`)
)
func tryPushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmarshal.TimeSeries) bool {
if len(tssBlock) == 0 {
// Nothing to push
return
return true
}
if len(rwctxs) == 1 {
// Fast path - just push data to the configured single remote storage
rwctxs[0].Push(tssBlock)
return
return rwctxs[0].TryPush(tssBlock)
}
// We need to push tssBlock to multiple remote storages.
@@ -452,6 +537,7 @@ func pushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmarsha
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
var anyPushFailed uint64
for i, rwctx := range rwctxs {
tssShard := tssByURL[i]
if len(tssShard) == 0 {
@@ -459,11 +545,13 @@ func pushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmarsha
}
go func(rwctx *remoteWriteCtx, tss []prompbmarshal.TimeSeries) {
defer wg.Done()
rwctx.Push(tss)
if !rwctx.TryPush(tss) {
atomic.StoreUint64(&anyPushFailed, 1)
}
}(rwctx, tssShard)
}
wg.Wait()
return
return atomic.LoadUint64(&anyPushFailed) == 0
}
// Replicate data among rwctxs.
@@ -471,13 +559,17 @@ func pushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmarsha
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
var anyPushFailed uint64
for _, rwctx := range rwctxs {
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
rwctx.Push(tssBlock)
if !rwctx.TryPush(tssBlock) {
atomic.StoreUint64(&anyPushFailed, 1)
}
}(rwctx)
}
wg.Wait()
return atomic.LoadUint64(&anyPushFailed) == 0
}
// sortLabelsIfNeeded sorts labels if -sortLabels command-line flag is set.
@@ -597,13 +689,19 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
logger.Warnf("rounding the -remoteWrite.maxDiskUsagePerURL=%d to the minimum supported value: %d", maxPendingBytes, persistentqueue.DefaultChunkFileSize)
maxPendingBytes = persistentqueue.DefaultChunkFileSize
}
fq := persistentqueue.MustOpenFastQueue(queuePath, sanitizedURL, maxInmemoryBlocks, maxPendingBytes)
fq := persistentqueue.MustOpenFastQueue(queuePath, sanitizedURL, maxInmemoryBlocks, maxPendingBytes, *disableOnDiskQueue)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
return float64(fq.GetPendingBytes())
})
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_inmemory_blocks{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
return float64(fq.GetInmemoryQueueLen())
})
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_queue_blocked{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
if fq.IsWriteBlocked() {
return 1
}
return 0
})
var c *client
switch remoteWriteURL.Scheme {
@@ -625,7 +723,7 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
}
pss := make([]*pendingSeries, pssLen)
for i := range pss {
pss[i] = newPendingSeries(fq.MustWriteBlock, c.useVMProto, sf, rd)
pss[i] = newPendingSeries(fq, c.useVMProto, sf, rd)
}
rwctx := &remoteWriteCtx{
@@ -642,7 +740,7 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
sasFile := streamAggrConfig.GetOptionalArg(argIdx)
if sasFile != "" {
dedupInterval := streamAggrDedupInterval.GetOptionalArg(argIdx)
sas, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternal, dedupInterval)
sas, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternalTrackDropped, dedupInterval)
if err != nil {
logger.Fatalf("cannot initialize stream aggregators from -remoteWrite.streamAggr.config=%q: %s", sasFile, err)
}
@@ -678,7 +776,7 @@ func (rwctx *remoteWriteCtx) MustStop() {
rwctx.rowsDroppedByRelabel = nil
}
func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
func (rwctx *remoteWriteCtx) TryPush(tss []prompbmarshal.TimeSeries) bool {
// Apply relabeling
var rctx *relabelCtx
var v *[]prompbmarshal.TimeSeries
@@ -716,7 +814,9 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
}
matchIdxsPool.Put(matchIdxs)
}
rwctx.pushInternal(tss)
// Try pushing the data to remote storage
ok := rwctx.tryPushInternal(tss)
// Return back relabeling contexts to the pool
if rctx != nil {
@@ -724,6 +824,8 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
tssPool.Put(v)
putRelabelCtx(rctx)
}
return ok
}
var matchIdxsPool bytesutil.ByteBufferPool
@@ -743,7 +845,21 @@ func dropAggregatedSeries(src []prompbmarshal.TimeSeries, matchIdxs []byte, drop
return dst
}
func (rwctx *remoteWriteCtx) pushInternal(tss []prompbmarshal.TimeSeries) {
func (rwctx *remoteWriteCtx) pushInternalTrackDropped(tss []prompbmarshal.TimeSeries) {
if rwctx.tryPushInternal(tss) {
return
}
if !*disableOnDiskQueue {
logger.Panicf("BUG: tryPushInternal must return true if -remoteWrite.disableOnDiskQueue isn't set")
}
pushFailures.Inc()
if *dropSamplesOnOverload {
rowsCount := getRowsCount(tss)
samplesDropped.Add(rowsCount)
}
}
func (rwctx *remoteWriteCtx) tryPushInternal(tss []prompbmarshal.TimeSeries) bool {
var rctx *relabelCtx
var v *[]prompbmarshal.TimeSeries
if len(labelsGlobal) > 0 {
@@ -757,13 +873,16 @@ func (rwctx *remoteWriteCtx) pushInternal(tss []prompbmarshal.TimeSeries) {
pss := rwctx.pss
idx := atomic.AddUint64(&rwctx.pssNextIdx, 1) % uint64(len(pss))
pss[idx].Push(tss)
ok := pss[idx].TryPush(tss)
if rctx != nil {
*v = prompbmarshal.ResetTimeSeries(tss)
tssPool.Put(v)
putRelabelCtx(rctx)
}
return ok
}
func (rwctx *remoteWriteCtx) reinitStreamAggr() {
@@ -776,7 +895,7 @@ func (rwctx *remoteWriteCtx) reinitStreamAggr() {
logger.Infof("reloading stream aggregation configs pointed by -remoteWrite.streamAggr.config=%q", sasFile)
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reloads_total{path=%q}`, sasFile)).Inc()
dedupInterval := streamAggrDedupInterval.GetOptionalArg(rwctx.idx)
sasNew, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternal, dedupInterval)
sasNew, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternalTrackDropped, dedupInterval)
if err != nil {
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reloads_errors_total{path=%q}`, sasFile)).Inc()
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_successful{path=%q}`, sasFile)).Set(0)

View File

@@ -76,7 +76,9 @@ func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.L
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(at, &ctx.WriteRequest)
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)

View File

@@ -1,246 +1,3 @@
See vmalert-tool docs [here](https://docs.victoriametrics.com/vmalert-tool.html).
# vmalert-tool
VMAlert command-line tool
## Unit testing for rules
You can use `vmalert-tool` to run unit tests for alerting and recording rules.
It will perform the following actions:
* sets up an isolated VictoriaMetrics instance;
* simulates the periodic ingestion of time series;
* queries the ingested data for recording and alerting rules evaluation like [vmalert](https://docs.victoriametrics.com/vmalert.html);
* checks whether the firing alerts or resulting recording rules match the expected results.
See how to run vmalert-tool for unit test below:
```
# Run vmalert-tool with one or multiple test files via --files cmd-line flag
./vmalert-tool unittest --files test1.yaml --files test2.yaml
```
vmalert-tool unittest is compatible with [Prometheus config format for tests](https://prometheus.io/docs/prometheus/latest/configuration/unit_testing_rules/#test-file-format)
except `promql_expr_test` field. Use `metricsql_expr_test` field name instead. The name is different because vmalert-tool
validates and executes [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expressions,
which aren't always backward compatible with [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/).
### Test file format
The configuration format for files specified in `--files` cmd-line flag is the following:
```yaml
# Path to the files or http url containing [rule groups](https://docs.victoriametrics.com/vmalert.html#groups) configuration.
# Enterprise version of vmalert-tool supports S3 and GCS paths to rules.
rule_files:
[ - <string> ]
# The evaluation interval for rules specified in `rule_files`
[ evaluation_interval: <duration> | default = 1m ]
# Groups listed below will be evaluated by order.
# Not All the groups need not be mentioned, if not, they will be evaluated by define order in rule_files.
group_eval_order:
[ - <string> ]
# The list of unit test files to be checked during evaluation.
tests:
[ - <test_group> ]
```
#### `<test_group>`
```yaml
# Interval between samples for input series
interval: <duration>
# Time series to persist into the database according to configured <interval> before running tests.
input_series:
[ - <series> ]
# Name of the test group, optional
[ name: <string> ]
# Unit tests for alerting rules
alert_rule_test:
[ - <alert_test_case> ]
# Unit tests for Metricsql expressions.
metricsql_expr_test:
[ - <metricsql_expr_test> ]
# External labels accessible for templating.
external_labels:
[ <labelname>: <string> ... ]
```
#### `<series>`
```yaml
# series in the following format '<metric name>{<label name>=<label value>, ...}'
# Examples:
# series_name{label1="value1", label2="value2"}
# go_goroutines{job="prometheus", instance="localhost:9090"}
series: <string>
# values support several special equations:
# 'a+bxc' becomes 'a a+b a+(2*b) a+(3*b) … a+(c*b)'
# Read this as series starts at a, then c further samples incrementing by b.
# 'a-bxc' becomes 'a a-b a-(2*b) a-(3*b) … a-(c*b)'
# Read this as series starts at a, then c further samples decrementing by b (or incrementing by negative b).
# '_' represents a missing sample from scrape
# 'stale' indicates a stale sample
# Examples:
# 1. '-2+4x3' becomes '-2 2 6 10' - series starts at -2, then 3 further samples incrementing by 4.
# 2. ' 1-2x4' becomes '1 -1 -3 -5 -7' - series starts at 1, then 4 further samples decrementing by 2.
# 3. ' 1x4' becomes '1 1 1 1 1' - shorthand for '1+0x4', series starts at 1, then 4 further samples incrementing by 0.
# 4. ' 1 _x3 stale' becomes '1 _ _ _ stale' - the missing sample cannot increment, so 3 missing samples are produced by the '_x3' expression.
values: <string>
```
#### `<alert_test_case>`
vmalert by default adds `alertgroup` and `alertname` to the generated alerts and time series.
So you will need to specify both `groupname` and `alertname` under a single `<alert_test_case>`,
but no need to add them under `exp_alerts`.
You can also pass `--disableAlertgroupLabel` to skip `alertgroup` check.
```yaml
# The time elapsed from time=0s when this alerting rule should be checked.
# Means this rule should be firing at this point, or shouldn't be firing if 'exp_alerts' is empty.
eval_time: <duration>
# Name of the group name to be tested.
groupname: <string>
# Name of the alert to be tested.
alertname: <string>
# List of the expected alerts that are firing under the given alertname at
# the given evaluation time. If you want to test if an alerting rule should
# not be firing, then you can mention only the fields above and leave 'exp_alerts' empty.
exp_alerts:
[ - <alert> ]
```
#### `<alert>`
```yaml
# These are the expanded labels and annotations of the expected alert.
# Note: labels also include the labels of the sample associated with the alert
exp_labels:
[ <labelname>: <string> ]
exp_annotations:
[ <labelname>: <string> ]
```
#### `<metricsql_expr_test>`
```yaml
# Expression to evaluate
expr: <string>
# The time elapsed from time=0s when this expression be evaluated.
eval_time: <duration>
# Expected samples at the given evaluation time.
exp_samples:
[ - <sample> ]
```
#### `<sample>`
```yaml
# Labels of the sample in usual series notation '<metric name>{<label name>=<label value>, ...}'
# Examples:
# series_name{label1="value1", label2="value2"}
# go_goroutines{job="prometheus", instance="localhost:9090"}
labels: <string>
# The expected value of the Metricsql expression.
value: <number>
```
### Example
This is an example input file for unit testing which will pass.
`test.yaml` is the test file which follows the syntax above and `alerts.yaml` contains the alerting rules.
With `rules.yaml` in the same directory, run `./vmalert-tool unittest --files=./unittest/testdata/test.yaml`.
#### `test.yaml`
```yaml
rule_files:
- rules.yaml
evaluation_interval: 1m
tests:
- interval: 1m
input_series:
- series: 'up{job="prometheus", instance="localhost:9090"}'
values: "0+0x1440"
metricsql_expr_test:
- expr: suquery_interval_test
eval_time: 4m
exp_samples:
- labels: '{__name__="suquery_interval_test", datacenter="dc-123", instance="localhost:9090", job="prometheus"}'
value: 1
alert_rule_test:
- eval_time: 2h
groupname: group1
alertname: InstanceDown
exp_alerts:
- exp_labels:
job: prometheus
severity: page
instance: localhost:9090
datacenter: dc-123
exp_annotations:
summary: "Instance localhost:9090 down"
description: "localhost:9090 of job prometheus has been down for more than 5 minutes."
- eval_time: 0
groupname: group1
alertname: AlwaysFiring
exp_alerts:
- exp_labels:
datacenter: dc-123
- eval_time: 0
groupname: group1
alertname: InstanceDown
exp_alerts: []
external_labels:
datacenter: dc-123
```
#### `alerts.yaml`
```yaml
# This is the rules file.
groups:
- name: group1
rules:
- alert: InstanceDown
expr: up == 0
for: 5m
labels:
severity: page
annotations:
summary: "Instance {{ $labels.instance }} down"
description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
- alert: AlwaysFiring
expr: 1
- name: group2
rules:
- record: job:test:count_over_time1m
expr: sum without(instance) (count_over_time(test[1m]))
- record: suquery_interval_test
expr: count_over_time(up[5m:])
```
vmalert-tool docs can be edited at [docs/vmalert-tool.md](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/vmalert-tool.md).

File diff suppressed because it is too large Load Diff

View File

@@ -23,7 +23,7 @@ var (
"It is hidden by default, since it can contain sensitive info such as auth key")
blackHole = flag.Bool("notifier.blackhole", false, "Whether to blackhole alerting notifications. "+
"Enable this flag if you want vmalert to evaluate alerting rules without sending any notifications to external receivers (eg. alertmanager). "+
"`-notifier.url`, `-notifier.config` and `-notifier.blackhole` are mutually exclusive.")
"-notifier.url, -notifier.config and -notifier.blackhole are mutually exclusive.")
basicAuthUsername = flagutil.NewArrayString("notifier.basicAuth.username", "Optional basic auth username for -notifier.url")
basicAuthPassword = flagutil.NewArrayString("notifier.basicAuth.password", "Optional basic auth password for -notifier.url")

View File

@@ -30,6 +30,7 @@ type AlertingRule struct {
Annotations map[string]string
GroupID uint64
GroupName string
File string
EvalInterval time.Duration
Debug bool
@@ -67,6 +68,7 @@ func NewAlertingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule
Annotations: cfg.Annotations,
GroupID: group.ID(),
GroupName: group.Name,
File: group.File,
EvalInterval: group.Interval,
Debug: cfg.Debug,
q: qb.BuildWithParams(datasource.QuerierParams{

View File

@@ -17,12 +17,14 @@ import (
// to evaluate configured Expression and
// return TimeSeries as result.
type RecordingRule struct {
Type config.Type
RuleID uint64
Name string
Expr string
Labels map[string]string
GroupID uint64
Type config.Type
RuleID uint64
Name string
Expr string
Labels map[string]string
GroupID uint64
GroupName string
File string
q datasource.Querier
@@ -52,13 +54,15 @@ func (rr *RecordingRule) ID() uint64 {
// NewRecordingRule creates a new RecordingRule
func NewRecordingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule) *RecordingRule {
rr := &RecordingRule{
Type: group.Type,
RuleID: cfg.ID,
Name: cfg.Record,
Expr: cfg.Expr,
Labels: cfg.Labels,
GroupID: group.ID(),
metrics: &recordingRuleMetrics{},
Type: group.Type,
RuleID: cfg.ID,
Name: cfg.Record,
Expr: cfg.Expr,
Labels: cfg.Labels,
GroupID: group.ID(),
GroupName: group.Name,
File: group.File,
metrics: &recordingRuleMetrics{},
q: qb.BuildWithParams(datasource.QuerierParams{
DataSourceType: group.Type.String(),
EvaluationInterval: group.Interval,

View File

@@ -43,26 +43,26 @@ type ruleState struct {
// StateEntry stores rule's execution states
type StateEntry struct {
// stores last moment of time rule.Exec was called
Time time.Time
Time time.Time `json:"time"`
// stores the timesteamp with which rule.Exec was called
At time.Time
At time.Time `json:"at"`
// stores the duration of the last rule.Exec call
Duration time.Duration
Duration time.Duration `json:"duration"`
// stores last error that happened in Exec func
// resets on every successful Exec
// may be used as Health ruleState
Err error
Err error `json:"error"`
// stores the number of samples returned during
// the last evaluation
Samples int
Samples int `json:"samples"`
// stores the number of time series fetched during
// the last evaluation.
// Is supported by VictoriaMetrics only, starting from v1.90.0
// If seriesFetched == nil, then this attribute was missing in
// datasource response (unsupported).
SeriesFetched *int
SeriesFetched *int `json:"series_fetched"`
// stores the curl command reflecting the HTTP request used during rule.Exec
Curl string
Curl string `json:"curl"`
}
// GetLastEntry returns latest stateEntry of rule

Binary file not shown.

Before

Width:  |  Height:  |  Size: 73 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 151 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 122 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 80 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 62 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 109 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 41 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 41 KiB

View File

@@ -132,6 +132,24 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
case "/vmalert/api/v1/rule", "/api/v1/rule":
rule, err := rh.getRule(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
rwu := apiRuleWithUpdates{
apiRule: rule,
StateUpdates: rule.Updates,
}
data, err := json.Marshal(rwu)
if err != nil {
httpserver.Errorf(w, r, "failed to marshal rule: %s", err)
return true
}
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
case "/-/reload":
logger.Infof("api config reload was called, sending sighup")
procutil.SelfSIGHUP()

View File

@@ -143,6 +143,28 @@ func TestHandler(t *testing.T) {
t.Errorf("expected 1 group got %d", length)
}
})
t.Run("/api/v1/rule?ruleID&groupID", func(t *testing.T) {
expRule := ruleToAPI(ar)
gotRule := apiRule{}
getResp(ts.URL+"/"+expRule.APILink(), &gotRule, 200)
if expRule.ID != gotRule.ID {
t.Errorf("expected to get Rule %q; got %q instead", expRule.ID, gotRule.ID)
}
gotRule = apiRule{}
getResp(ts.URL+"/vmalert/"+expRule.APILink(), &gotRule, 200)
if expRule.ID != gotRule.ID {
t.Errorf("expected to get Rule %q; got %q instead", expRule.ID, gotRule.ID)
}
gotRuleWithUpdates := apiRuleWithUpdates{}
getResp(ts.URL+"/"+expRule.APILink(), &gotRuleWithUpdates, 200)
if gotRuleWithUpdates.StateUpdates == nil || len(gotRuleWithUpdates.StateUpdates) < 1 {
t.Fatalf("expected %+v to have state updates field not empty", gotRuleWithUpdates.StateUpdates)
}
})
}
func TestEmptyResponse(t *testing.T) {

View File

@@ -151,6 +151,10 @@ type apiRule struct {
ID string `json:"id"`
// GroupID is an unique Group's ID
GroupID string `json:"group_id"`
// GroupName is Group name rule belong to
GroupName string `json:"group_name"`
// File is file name where rule is defined
File string `json:"file"`
// Debug shows whether debug mode is enabled
Debug bool `json:"debug"`
@@ -160,6 +164,19 @@ type apiRule struct {
Updates []rule.StateEntry `json:"-"`
}
// apiRuleWithUpdates represents apiRule but with extra fields for marshalling
type apiRuleWithUpdates struct {
apiRule
// Updates contains the ordered list of recorded ruleStateEntry objects
StateUpdates []rule.StateEntry `json:"updates,omitempty"`
}
// APILink returns a link to the rule's JSON representation.
func (ar apiRule) APILink() string {
return fmt.Sprintf("api/v1/rule?%s=%s&%s=%s",
paramGroupID, ar.GroupID, paramRuleID, ar.ID)
}
// WebLink returns a link to the alert which can be used in UI.
func (ar apiRule) WebLink() string {
return fmt.Sprintf("rule?%s=%s&%s=%s",
@@ -227,8 +244,10 @@ func alertingToAPI(ar *rule.AlertingRule) apiRule {
Debug: ar.Debug,
// encode as strings to avoid rounding in JSON
ID: fmt.Sprintf("%d", ar.ID()),
GroupID: fmt.Sprintf("%d", ar.GroupID),
ID: fmt.Sprintf("%d", ar.ID()),
GroupID: fmt.Sprintf("%d", ar.GroupID),
GroupName: ar.GroupName,
File: ar.File,
}
if lastState.Err != nil {
r.LastError = lastState.Err.Error()

View File

@@ -1,576 +1,3 @@
# vmauth
See vmauth docs [here](https://docs.victoriametrics.com/vmauth.html).
`vmauth` is a simple auth proxy, router and [load balancer](#load-balancing) for [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics).
It reads auth credentials from `Authorization` http header ([Basic Auth](https://en.wikipedia.org/wiki/Basic_access_authentication), `Bearer token` and [InfluxDB authorization](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1897) is supported),
matches them against configs pointed by [-auth.config](#auth-config) command-line flag and proxies incoming HTTP requests to the configured per-user `url_prefix` on successful match.
The `-auth.config` can point to either local file or to http url.
## Quick start
Just download `vmutils-*` archive from [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest), unpack it
and pass the following flag to `vmauth` binary in order to start authorizing and routing requests:
```console
/path/to/vmauth -auth.config=/path/to/auth/config.yml
```
After that `vmauth` starts accepting HTTP requests on port `8427` and routing them according to the provided [-auth.config](#auth-config).
The port can be modified via `-httpListenAddr` command-line flag.
The auth config can be reloaded via the following ways:
- By passing `SIGHUP` signal to `vmauth`.
- By querying `/-/reload` http endpoint. This endpoint can be protected with `-reloadAuthKey` command-line flag. See [security docs](#security) for more details.
- By specifying `-configCheckInterval` command-line flag to the interval between config re-reads. For example, `-configCheckInterval=5s` will re-read the config
and apply new changes every 5 seconds.
Docker images for `vmauth` are available [here](https://hub.docker.com/r/victoriametrics/vmauth/tags).
See how `vmauth` used in [docker-compose env](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/README.md#victoriametrics-cluster).
Pass `-help` to `vmauth` in order to see all the supported command-line flags with their descriptions.
Feel free [contacting us](mailto:info@victoriametrics.com) if you need customized auth proxy for VictoriaMetrics with the support of LDAP, SSO, RBAC, SAML,
accounting and rate limiting such as [vmgateway](https://docs.victoriametrics.com/vmgateway.html).
## Dropping request path prefix
By default `vmauth` doesn't drop the path prefix from the original request when proxying the request to the matching backend.
Sometimes it is needed to drop path prefix before routing the request to the backend. This can be done by specifying the number of `/`-delimited
prefix parts to drop from the request path via `drop_src_path_prefix_parts` option at `url_map` level or at `user` level.
For example, if you need to serve requests to [vmalert](https://docs.victoriametrics.com/vmalert.html) at `/vmalert/` path prefix,
while serving requests to [vmagent](https://docs.victoriametrics.com/vmagent.html) at `/vmagent/` path prefix for a particular user,
then the following [-auth.config](#auth-config) can be used:
```yml
users:
- username: foo
url_map:
# proxy all the requests, which start with `/vmagent/`, to vmagent backend
- src_paths:
- "/vmagent/.+"
# drop /vmagent/ path prefix from the original request before proxying it to url_prefix.
drop_src_path_prefix_parts: 1
url_prefix: "http://vmagent-backend:8429/"
# proxy all the requests, which start with `/vmalert`, to vmalert backend
- src_paths:
- "/vmalert/.+"
# drop /vmalert/ path prefix from the original request before proxying it to url_prefix.
drop_src_path_prefix_parts: 1
url_prefix: "http://vmalert-backend:8880/"
```
## Load balancing
Each `url_prefix` in the [-auth.config](#auth-config) may contain either a single url or a list of urls.
In the latter case `vmauth` balances load among the configured urls in least-loaded round-robin manner.
If the backend at the configured url isn't available, then `vmauth` tries sending the request to the remaining configured urls.
It is possible to configure automatic retry of requests if the backend responds with status code from optional `retry_status_codes` list.
Load balancing feature can be used in the following cases:
- Balancing the load among multiple `vmselect` and/or `vminsert` nodes in [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html).
The following `-auth.config` file can be used for spreading incoming requests among 3 vmselect nodes and re-trying failed requests
or requests with 500 and 502 response status codes:
```yml
unauthorized_user:
url_prefix:
- http://vmselect1:8481/
- http://vmselect2:8481/
- http://vmselect3:8481/
retry_status_codes: [500, 502]
```
- Spreading select queries among multiple availability zones (AZs) with identical data. For example, the following config spreads select queries
among 3 AZs. Requests are re-tried if some AZs are temporarily unavailable or if some `vmstorage` nodes in some AZs are temporarily unavailable.
`vmauth` adds `deny_partial_response=1` query arg to all the queries in order to guarantee to get full response from every AZ.
See [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#cluster-availability) for details.
```yml
unauthorized_user:
url_prefix:
- https://vmselect-az1/?deny_partial_response=1
- https://vmselect-az2/?deny_partial_response=1
- https://vmselect-az3/?deny_partial_response=1
retry_status_codes: [500, 502, 503]
```
Load balancig can also be configured independently per each user and per each `url_map` entry.
See [auth config docs](#auth-config) for more details.
## Concurrency limiting
`vmauth` limits the number of concurrent requests it can proxy according to the following command-line flags:
- `-maxConcurrentRequests` limits the global number of concurrent requests `vmauth` can serve across all the configured users.
- `-maxConcurrentPerUserRequests` limits the number of concurrent requests `vmauth` can serve per each configured user.
It is also possible to set individual limits on the number of concurrent requests per each user
with the `max_concurrent_requests` option - see [auth config example](#auth-config).
`vmauth` responds with `429 Too Many Requests` HTTP error when the number of concurrent requests exceeds the configured limits.
The following [metrics](#monitoring) related to concurrency limits are exposed by `vmauth`:
- `vmauth_concurrent_requests_capacity` - the global limit on the number of concurrent requests `vmauth` can serve.
It is set via `-maxConcurrentRequests` command-line flag.
- `vmauth_concurrent_requests_current` - the current number of concurrent requests `vmauth` processes.
- `vmauth_concurrent_requests_limit_reached_total` - the number of requests rejected with `429 Too Many Requests` error
because of the global concurrency limit has been reached.
- `vmauth_user_concurrent_requests_capacity{username="..."}` - the limit on the number of concurrent requests for the given `username`.
- `vmauth_user_concurrent_requests_current{username="..."}` - the current number of concurrent requests for the given `username`.
- `vmauth_user_concurrent_requests_limit_reached_total{username="foo"}` - the number of requests rejected with `429 Too Many Requests` error
because of the concurrency limit has been reached for the given `username`.
- `vmauth_unauthorized_user_concurrent_requests_capacity` - the limit on the number of concurrent requests for unauthorized users (if `unauthorized_user` section is used).
- `vmauth_unauthorized_user_concurrent_requests_current` - the current number of concurrent requests for unauthorized users (if `unauthorized_user` section is used).
- `vmauth_unauthorized_user_concurrent_requests_limit_reached_total` - the number of requests rejected with `429 Too Many Requests` error
because of the concurrency limit has been reached for unauthorized users (if `unauthorized_user` section is used).
## Backend TLS setup
By default `vmauth` uses system settings when performing requests to HTTPS backends specified via `url_prefix` option
in the [`-auth.config`](https://docs.victoriametrics.com/vmauth.html#auth-config). These settings can be overridden with the following command-line flags:
- `-backend.tlsInsecureSkipVerify` allows skipping TLS verification when connecting to HTTPS backends.
This global setting can be overridden at per-user level inside [`-auth.config`](https://docs.victoriametrics.com/vmauth.html#auth-config)
via `tls_insecure_skip_verify` option. For example:
```yml
- username: "foo"
url_prefix: "https://localhost"
tls_insecure_skip_verify: true
```
- `-backend.tlsCAFile` allows specifying the path to TLS Root CA, which will be used for TLS verification when connecting to HTTPS backends.
The `-backend.tlsCAFile` may point either to local file or to `http` / `https` url.
This global setting can be overridden at per-user level inside [`-auth.config`](https://docs.victoriametrics.com/vmauth.html#auth-config)
via `tls_ca_file` option. For example:
```yml
- username: "foo"
url_prefix: "https://localhost"
tls_ca_file: "/path/to/tls/root/ca"
```
## IP filters
[Enterprise version](https://docs.victoriametrics.com/enterprise.html) of `vmauth` can be configured to allow / deny incoming requests via global and per-user IP filters.
For example, the following config allows requests to `vmauth` from `10.0.0.0/24` network and from `1.2.3.4` IP address, while denying requests from `10.0.0.42` IP address:
```yml
users:
# User configs here
ip_filters:
allow_list:
- 10.0.0.0/24
- 1.2.3.4
deny_list: [10.0.0.42]
```
The following config allows requests for the user 'foobar' only from the IP `127.0.0.1`:
```yml
users:
- username: "foobar"
password: "***"
url_prefix: "http://localhost:8428"
ip_filters:
allow_list: [127.0.0.1]
```
See config example of using IP filters [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/example_config_ent.yml).
## Auth config
`-auth.config` is represented in the following simple `yml` format:
```yml
# Arbitrary number of usernames may be put here.
# It is possible to set multiple identical usernames with different passwords.
# Such usernames can be differentiated by `name` option.
users:
# Requests with the 'Authorization: Bearer XXXX' and 'Authorization: Token XXXX'
# header are proxied to http://localhost:8428 .
# For example, http://vmauth:8427/api/v1/query is proxied to http://localhost:8428/api/v1/query
# Requests with the Basic Auth username=XXXX are proxied to http://localhost:8428 as well.
- bearer_token: "XXXX"
url_prefix: "http://localhost:8428"
# Requests with the 'Authorization: Bearer YYY' header are proxied to http://localhost:8428 ,
# The `X-Scope-OrgID: foobar` http header is added to every proxied request.
# The `X-Server-Hostname` http header is removed from the proxied response.
# For example, http://vmauth:8427/api/v1/query is proxied to http://localhost:8428/api/v1/query
- bearer_token: "YYY"
url_prefix: "http://localhost:8428"
# extra headers to add to the request or remove from the request (if header value is empty)
headers:
- "X-Scope-OrgID: foobar"
# extra headers to add to the response or remove from the response (if header value is empty)
response_headers:
- "X-Server-Hostname:" # empty value means the header will be removed from the response
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
# are proxied to http://localhost:8428 .
# For example, http://vmauth:8427/api/v1/query is proxied to http://localhost:8428/api/v1/query
#
# The given user can send maximum 10 concurrent requests according to the provided max_concurrent_requests.
# Excess concurrent requests are rejected with 429 HTTP status code.
# See also -maxConcurrentPerUserRequests and -maxConcurrentRequests command-line flags.
- username: "local-single-node"
password: "***"
url_prefix: "http://localhost:8428"
max_concurrent_requests: 10
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
# are proxied to http://localhost:8428 with extra_label=team=dev query arg.
# For example, http://vmauth:8427/api/v1/query is routed to http://localhost:8428/api/v1/query?extra_label=team=dev
- username: "local-single-node2"
password: "***"
url_prefix: "http://localhost:8428?extra_label=team=dev"
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
# are proxied to https://localhost:8428.
# For example, http://vmauth:8427/api/v1/query is routed to https://localhost/api/v1/query
# TLS verification is skipped for https://localhost.
- username: "local-single-node-with-tls"
password: "***"
url_prefix: "https://localhost"
tls_insecure_skip_verify: true
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
# are load-balanced among http://vmselect1:8481/select/123/prometheus and http://vmselect2:8481/select/123/prometheus
# For example, http://vmauth:8427/api/v1/query is proxied to the following urls in a round-robin manner:
# - http://vmselect1:8481/select/123/prometheus/api/v1/select
# - http://vmselect2:8481/select/123/prometheus/api/v1/select
- username: "cluster-select-account-123"
password: "***"
url_prefix:
- "http://vmselect1:8481/select/123/prometheus"
- "http://vmselect2:8481/select/123/prometheus"
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
# are load-balanced between http://vminsert1:8480/insert/42/prometheus and http://vminsert2:8480/insert/42/prometheus
# For example, http://vmauth:8427/api/v1/write is proxied to the following urls in a round-robin manner:
# - http://vminsert1:8480/insert/42/prometheus/api/v1/write
# - http://vminsert2:8480/insert/42/prometheus/api/v1/write
- username: "cluster-insert-account-42"
password: "***"
url_prefix:
- "http://vminsert1:8480/insert/42/prometheus"
- "http://vminsert2:8480/insert/42/prometheus"
# A single user for querying and inserting data:
#
# - Requests to http://vmauth:8427/api/v1/query, http://vmauth:8427/api/v1/query_range
# and http://vmauth:8427/api/v1/label/<label_name>/values are proxied to the following urls in a round-robin manner:
# - http://vmselect1:8481/select/42/prometheus
# - http://vmselect2:8481/select/42/prometheus
# For example, http://vmauth:8427/api/v1/query is proxied to http://vmselect1:8480/select/42/prometheus/api/v1/query
# or to http://vmselect2:8480/select/42/prometheus/api/v1/query .
# Requests are re-tried at other url_prefix backends if response status codes match 500 or 502.
#
# - Requests to http://vmauth:8427/api/v1/write are proxied to http://vminsert:8480/insert/42/prometheus/api/v1/write .
# The "X-Scope-OrgID: abc" http header is added to these requests.
# The "X-Server-Hostname" http header is removed from the proxied response.
#
# Request which do not match `src_paths` from the `url_map` are proxied to the urls from `default_url`
# in a round-robin manner. The original request path is passed in `request_path` query arg.
# For example, request to http://vmauth:8427/non/existing/path are proxied:
# - to http://default1:8888/unsupported_url_handler?request_path=/non/existing/path
# - or http://default2:8888/unsupported_url_handler?request_path=/non/existing/path
#
# Regular expressions are allowed in `src_paths` entries.
- username: "foobar"
url_map:
- src_paths:
- "/api/v1/query"
- "/api/v1/query_range"
- "/api/v1/label/[^/]+/values"
url_prefix:
- "http://vmselect1:8481/select/42/prometheus"
- "http://vmselect2:8481/select/42/prometheus"
retry_status_codes: [500, 502]
- src_paths: ["/api/v1/write"]
url_prefix: "http://vminsert:8480/insert/42/prometheus"
headers:
- "X-Scope-OrgID: abc"
response_headers:
- "X-Server-Hostname:" # empty value means the header will be removed from the response
ip_filters:
deny_list: [127.0.0.1]
default_url:
- "http://default1:8888/unsupported_url_handler"
- "http://default2:8888/unsupported_url_handler"
# Requests without Authorization header are routed according to `unauthorized_user` section.
# Requests are routed in round-robin fashion between `url_prefix` backends.
# The deny_partial_response query arg is added to all the routed requests.
# The requests are re-tried if url_prefix backends send 500 or 503 response status codes.
# Note that the unauthorized_user section takes precedence when processing a route without credentials,
# even if such a route also exists in the users section (see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5236).
unauthorized_user:
url_prefix:
- http://vmselect-az1/?deny_partial_response=1
- http://vmselect-az2/?deny_partial_response=1
retry_status_codes: [503, 500]
ip_filters:
allow_list: ["1.2.3.0/24", "127.0.0.1"]
deny_list:
- 10.1.0.1
```
The config may contain `%{ENV_VAR}` placeholders, which are substituted by the corresponding `ENV_VAR` environment variable values.
This may be useful for passing secrets to the config.
Please note, vmauth doesn't follow redirects. If destination redirects request to a new location, make sure this
location is supported in vmauth `url_map` config.
## Security
It is expected that all the backend services protected by `vmauth` are located in an isolated private network, so they can be accessed by external users only via `vmauth`.
Do not transfer Basic Auth headers in plaintext over untrusted networks. Enable https. This can be done by passing the following `-tls*` command-line flags to `vmauth`:
```console
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs, since RSA certs are slow
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set
```
Alternatively, [https termination proxy](https://en.wikipedia.org/wiki/TLS_termination_proxy) may be put in front of `vmauth`.
It is recommended protecting the following endpoints with authKeys:
* `/-/reload` with `-reloadAuthKey` command-line flag, so external users couldn't trigger config reload.
* `/flags` with `-flagsAuthKey` command-line flag, so unauthorized users couldn't get application command-line flags.
* `/metrics` with `-metricsAuthKey` command-line flag, so unauthorized users couldn't get access to [vmauth metrics](#monitoring).
* `/debug/pprof` with `-pprofAuthKey` command-line flag, so unauthorized users couldn't get access to [profiling information](#profiling).
`vmauth` also supports the ability to restrict access by IP - see [these docs](#ip-filters). See also [concurrency limiting docs](#concurrency-limiting).
## Monitoring
`vmauth` exports various metrics in Prometheus exposition format at `http://vmauth-host:8427/metrics` page. It is recommended setting up regular scraping of this page
either via [vmagent](https://docs.victoriametrics.com/vmagent.html) or via Prometheus, so the exported metrics could be analyzed later.
`vmauth` exports `vmauth_user_requests_total` [counter](https://docs.victoriametrics.com/keyConcepts.html#counter) metric
and `vmauth_user_request_duration_seconds_*` [summary](https://docs.victoriametrics.com/keyConcepts.html#summary) metric
with `username` label. The `username` label value equals to `username` field value set in the `-auth.config` file.
It is possible to override or hide the value in the label by specifying `name` field.
For example, the following config will result in `vmauth_user_requests_total{username="foobar"}`
instead of `vmauth_user_requests_total{username="secret_user"}`:
```yml
users:
- username: "secret_user"
name: "foobar"
# other config options here
```
For unauthorized users `vmauth` exports `vmauth_unauthorized_user_requests_total`
[counter](https://docs.victoriametrics.com/keyConcepts.html#counter) metric and
`vmauth_unauthorized_user_request_duration_seconds_*` [summary](https://docs.victoriametrics.com/keyConcepts.html#summary)
metric without label (if `unauthorized_user` section of config is used).
## How to build from sources
It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest) - `vmauth` is located in `vmutils-*` archives there.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.20.
1. Run `make vmauth` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmauth` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
1. Run `make vmauth-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmauth-prod` binary and puts it into the `bin` folder.
### Building docker images
Run `make package-vmauth`. It builds `victoriametrics/vmauth:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmauth`.
The base docker image is [alpine](https://hub.docker.com/_/alpine) but it is possible to use any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of [scratch](https://hub.docker.com/_/scratch) image:
```console
ROOT_IMAGE=scratch make package-vmauth
```
## Profiling
`vmauth` provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs):
* Memory profile. It can be collected with the following command (replace `0.0.0.0` with hostname if needed):
<div class="with-copy" markdown="1">
```console
curl http://0.0.0.0:8427/debug/pprof/heap > mem.pprof
```
</div>
* CPU profile. It can be collected with the following command (replace `0.0.0.0` with hostname if needed):
<div class="with-copy" markdown="1">
```console
curl http://0.0.0.0:8427/debug/pprof/profile > cpu.pprof
```
</div>
The command for collecting CPU profile waits for 30 seconds before returning.
The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
It is safe sharing the collected profiles from security point of view, since they do not contain sensitive information.
## Advanced usage
Pass `-help` command-line arg to `vmauth` in order to see all the configuration options:
```console
./vmauth -help
vmauth authenticates and authorizes incoming requests and proxies them to VictoriaMetrics.
See the docs at https://docs.victoriametrics.com/vmauth.html .
-auth.config string
Path to auth config. It can point either to local file or to http url. See https://docs.victoriametrics.com/vmauth.html for details on the format of this auth config
-configCheckInterval duration
interval for config file re-read. Zero value disables config re-reading. By default, refreshing is disabled, send SIGHUP for config refresh.
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
Deprecated, please use -license or -licenseFile flags instead. By specifying this flag, you confirm that you have an enterprise license and accept the ESA https://victoriametrics.com/legal/esa/ . This flag is available only in Enterprise binaries. See https://docs.victoriametrics.com/enterprise.html
-failTimeout duration
Sets a delay period for load balancing to skip a malfunctioning backend (default 3s)
-filestream.disableFadvise
Whether to disable fadvise() syscall when reading large data files. The fadvise() syscall prevents from eviction of recently accessed data from OS page cache during background merges and backups. In some rare cases it is better to disable the syscall if it uses too much CPU
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default, mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default, compression is enabled to save network bandwidth
-http.header.csp string
Value for 'Content-Security-Policy' header
-http.header.frameOptions string
Value for 'X-Frame-Options' header
-http.header.hsts string
Value for 'Strict-Transport-Security' header
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections. See also -httpListenAddr.useProxyProtocol (default ":8427")
-httpListenAddr.useProxyProtocol
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing
-internStringCacheExpireDuration duration
The expiry duration for caches for interned strings. See https://en.wikipedia.org/wiki/String_interning . See also -internStringMaxLen and -internStringDisableCache (default 6m0s)
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-license string
Lisense key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/ . Trial Enterprise license can be obtained from https://victoriametrics.com/products/enterprise/trial/ . This flag is available only in Enterprise binaries. The license key can be also passed via file specified by -licenseFile command-line flag
-license.forceOffline
Whether to enable offline verification for VictoriaMetrics Enterprise license key, which has been passed either via -license or via -licenseFile command-line flag. The issued license key must support offline verification feature. Contact info@victoriametrics.com if you need offline license verification. This flag is avilable only in Enterprise binaries
-licenseFile string
Path to file with license key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/ . Trial Enterprise license can be obtained from https://victoriametrics.com/products/enterprise/trial/ . This flag is available only in Enterprise binaries. The license key can be also passed inline via -license command-line flag
-logInvalidAuthTokens
Whether to log requests with invalid auth tokens. Such requests are always counted at vmauth_http_request_errors_total{reason="invalid_auth_token"} metric, which is exposed at /metrics page
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerJSONFields string
Allows renaming fields in JSON formatted logs. Example: "ts:timestamp,msg:message" renames "ts" to "timestamp" and "msg" to "message". Supported fields: ts, level, caller, msg
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxConcurrentPerUserRequests int
The maximum number of concurrent requests vmauth can process per each configured user. Other requests are rejected with '429 Too Many Requests' http status code. See also -maxConcurrentRequests command-line option and max_concurrent_requests option in per-user config (default 300)
-maxConcurrentRequests int
The maximum number of concurrent requests vmauth can process. Other requests are rejected with '429 Too Many Requests' http status code. See also -maxConcurrentPerUserRequests and -maxIdleConnsPerBackend command-line options (default 1000)
-maxIdleConnsPerBackend int
The maximum number of idle connections vmauth can open per each backend host. See also -maxConcurrentRequests (default 100)
-maxRequestBodySizeToRetry size
The maximum request body size, which can be cached and re-tried at other backends. Bigger values may require more memory
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 16384)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pushmetrics.extraLabel array
Optional labels to add to metrics pushed to -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
-pushmetrics.interval duration
Interval for pushing metrics to -pushmetrics.url (default 10s)
-pushmetrics.url array
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default, metrics exposed at /metrics page aren't pushed to any remote storage
Supports an array of values separated by comma or specified via multiple flags.
-reloadAuthKey string
Auth key for /-/reload http endpoint. It must be passed as authKey=...
-responseTimeout duration
The timeout for receiving a response from backend (default 5m0s)
-tls
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
-tlsCipherSuites array
Optional list of TLS cipher suites for incoming requests over HTTPS if -tls is set. See the list of supported cipher suites at https://pkg.go.dev/crypto/tls#pkg-constants
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
```
vmauth docs can be edited at [docs/vmauth.md](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/vmauth.md).

View File

@@ -37,14 +37,14 @@ users:
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
# are proxied to http://localhost:8428 with extra_label=team=dev query arg.
# For example, http://vmauth:8427/api/v1/query is routed to http://localhost:8428/api/v1/query?extra_label=team=dev
# For example, http://vmauth:8427/api/v1/query is proxied to http://localhost:8428/api/v1/query?extra_label=team=dev
- username: "local-single-node2"
password: "***"
url_prefix: "http://localhost:8428?extra_label=team=dev"
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
# are proxied to https://localhost:8428
# For example, http://vmauth:8427/api/v1/query is routed to https://localhost/api/v1/query
# For example, http://vmauth:8427/api/v1/query is proxied to https://localhost/api/v1/query
# TLS verification is ignored for https://localhost.
- username: "local-single-node-with-tls"
password: "***"
@@ -111,9 +111,9 @@ users:
- "http://default1:8888/unsupported_url_handler"
- "http://default2:8888/unsupported_url_handler"
# Requests without Authorization header are routed according to `unauthorized_user` section.
# Requests are routed in round-robin fashion between `url_prefix` backends.
# The deny_partial_response query arg is added to all the routed requests.
# Requests without Authorization header are proxied according to `unauthorized_user` section.
# Requests are proxied in round-robin fashion between `url_prefix` backends.
# The deny_partial_response query arg is added to all the proxied requests.
# The requests are re-tried if url_prefix backends send 500 or 503 response status codes.
unauthorized_user:
url_prefix:

View File

@@ -195,7 +195,7 @@ func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
// Don't change path and add request_path query param for default route.
if isDefault {
query := targetURL.Query()
query.Set("request_path", u.Path)
query.Set("request_path", u.String())
targetURL.RawQuery = query.Encode()
} else { // Update path for regular routes.
targetURL = mergeURLs(targetURL, u, dropSrcPathPrefixParts)

View File

@@ -1,444 +1,3 @@
# vmbackup
See vmbackup docs [here](https://docs.victoriametrics.com/vmbackup.html).
`vmbackup` creates VictoriaMetrics data backups from [instant snapshots](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots).
`vmbackup` supports incremental and full backups. Incremental backups are created automatically if the destination path already contains data from the previous backup.
Full backups can be sped up with `-origin` pointing to an already existing backup on the same remote storage. In this case `vmbackup` makes server-side copy for the shared
data between the existing backup and new backup. It saves time and costs on data transfer.
Backup process can be interrupted at any time. It is automatically resumed from the interruption point when restarting `vmbackup` with the same args.
Backed up data can be restored with [vmrestore](https://docs.victoriametrics.com/vmrestore.html).
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
See also [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html) tool built on top of `vmbackup`. This tool simplifies
creation of hourly, daily, weekly and monthly backups.
## Supported storage types
`vmbackup` supports the following `-dst` storage types:
* [GCS](https://cloud.google.com/storage/). Example: `gs://<bucket>/<path/to/backup>`
* [S3](https://aws.amazon.com/s3/). Example: `s3://<bucket>/<path/to/backup>`
* [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs/). Example: `azblob://<container>/<path/to/backup>`
* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio), [Ceph](https://docs.ceph.com/en/pacific/radosgw/s3/) or [Swift](https://platform.swiftstack.com/docs/admin/middleware/s3_middleware.html). See [these docs](#advanced-usage) for details.
* Local filesystem. Example: `fs://</absolute/path/to/backup>`. Note that `vmbackup` prevents from storing the backup into the directory pointed by `-storageDataPath` command-line flag, since this directory should be managed solely by VictoriaMetrics or `vmstorage`.
## Use cases
### Regular backups
Regular backup can be performed with the following command:
```console
./vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/<path/to/new/backup>
```
* `</path/to/victoria-metrics-data>` - path to VictoriaMetrics data pointed by `-storageDataPath` command-line flag in single-node VictoriaMetrics or in cluster `vmstorage`.
There is no need to stop VictoriaMetrics for creating backups since they are performed from immutable [instant snapshots](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots).
* `http://victoriametrics:8428/snapshot/create` is the url for creating snapshots according to [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots). `vmbackup` creates a snapshot by querying the provided `-snapshot.createURL`, then performs the backup and then automatically removes the created snapshot.
* `<bucket>` is an already existing name for [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets).
* `<path/to/new/backup>` is the destination path where new backup will be placed.
### Regular backups with server-side copy from existing backup
If the destination GCS bucket already contains the previous backup at `-origin` path, then new backup can be sped up
with the following command:
```console
./vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/<path/to/new/backup> -origin=gs://<bucket>/<path/to/existing/backup>
```
It saves time and network bandwidth costs by performing server-side copy for the shared data from the `-origin` to `-dst`.
### Incremental backups
Incremental backups are performed if `-dst` points to an already existing backup. In this case only new data is uploaded to remote storage.
It saves time and network bandwidth costs when working with big backups:
```console
./vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/<path/to/existing/backup>
```
### Smart backups
Smart backups mean storing full daily backups into `YYYYMMDD` folders and creating incremental hourly backup into `latest` folder:
* Run the following command every hour:
```console
./vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/latest
```
Where `<latest-snapshot>` is the latest [snapshot](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots).
The command will upload only changed data to `gs://<bucket>/latest`.
* Run the following command once a day:
```console
./vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/<YYYYMMDD> -origin=gs://<bucket>/latest
```
Where `<daily-snapshot>` is the snapshot for the last day `<YYYYMMDD>`.
This approach saves network bandwidth costs on hourly backups (since they are incremental) and allows recovering data from either the last hour (`latest` backup)
or from any day (`YYYYMMDD` backups). Note that hourly backup shouldn't run when creating daily backup.
Do not forget to remove old backups when they are no longer needed in order to save storage costs.
See also [vmbackupmanager tool](https://docs.victoriametrics.com/vmbackupmanager.html) for automating smart backups.
### Server-side copy of the existing backup
Sometimes it is needed to make server-side copy of the existing backup. This can be done by specifying the source backup path via `-origin` command-line flag,
while the destination path for backup copy must be specified via `-dst` command-line flag. For example, the following command copies backup
from `gs://bucket/foo` to `gs://bucket/bar`:
```console
./vmbackup -origin=gs://bucket/foo -dst=gs://bucket/bar
```
The `-origin` and `-dst` must point to the same object storage bucket or to the same filesystem.
The server-side backup copy is usually performed at much faster speed comparing to the usual backup, since backup data isn't transferred
between the remote storage and locally running `vmbackup` tool.
If the `-dst` already contains some data, then its' contents is synced with the `-origin` data. This allows making incremental server-side copies of backups.
## How does it work?
The backup algorithm is the following:
1. Create a snapshot by querying the provided `-snapshot.createURL`
1. Collect information about files in the created snapshot, in the `-dst` and in the `-origin`.
1. Determine which files in `-dst` are missing in the created snapshot, and delete them. These are usually small files, which are already merged into bigger files in the snapshot.
1. Determine which files in the created snapshot are missing in `-dst`. These are usually small new files and bigger merged files.
1. Determine which files from step 3 exist in the `-origin`, and perform server-side copy of these files from `-origin` to `-dst`.
These are usually the biggest and the oldest files, which are shared between backups.
1. Upload the remaining files from step 3 from the created snapshot to `-dst`.
1. Delete the created snapshot.
The algorithm splits source files into 1 GiB chunks in the backup. Each chunk is stored as a separate file in the backup.
Such splitting balances between the number of files in the backup and the amounts of data that needs to be re-transferred after temporary errors.
`vmbackup` relies on [instant snapshot](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) properties:
* All the files in the snapshot are immutable.
* Old files are periodically merged into new files.
* Smaller files have higher probability to be merged.
* Consecutive snapshots share many identical files.
These properties allow performing fast and cheap incremental backups and server-side copying from `-origin` paths.
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
`vmbackup` can work improperly or slowly when these properties are violated.
## Troubleshooting
* If the backup is slow, then try setting higher value for `-concurrency` flag. This will increase the number of concurrent workers that upload data to backup storage.
* If `vmbackup` eats all the network bandwidth or CPU, then either decrease the `-concurrency` command-line flag value or set `-maxBytesPerSecond` command-line flag value to lower value.
* If `vmbackup` consumes all the CPU on systems with big number of CPU cores, then try running it with `-filestream.disableFadvise` command-line flag.
* If `vmbackup` has been interrupted due to temporary error, then just restart it with the same args. It will resume the backup process.
* Backups created from [single-node VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html) cannot be restored
at [cluster VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) and vice versa.
## Advanced usage
### Providing credentials as a file
Obtaining credentials from a file.
Add flag `-credsFilePath=/etc/credentials` with the following content:
- for S3 (AWS, MinIO or other S3 compatible storages):
```console
[default]
aws_access_key_id=theaccesskey
aws_secret_access_key=thesecretaccesskeyvalue
```
- for GCP cloud storage:
```json
{
"type": "service_account",
"project_id": "project-id",
"private_key_id": "key-id",
"private_key": "-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n",
"client_email": "service-account-email",
"client_id": "client-id",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-email"
}
```
### Providing credentials via env variables
Obtaining credentials from env variables.
- For AWS S3 compatible storages set env variable `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
Also you can set env variable `AWS_SHARED_CREDENTIALS_FILE` with path to credentials file.
- For GCE cloud storage set env variable `GOOGLE_APPLICATION_CREDENTIALS` with path to credentials file.
- For Azure storage either set env variables `AZURE_STORAGE_ACCOUNT_NAME` and `AZURE_STORAGE_ACCOUNT_KEY`, or `AZURE_STORAGE_ACCOUNT_CONNECTION_STRING`.
Please, note that `vmbackup` will use credentials provided by cloud providers metadata service [when applicable](https://docs.victoriametrics.com/vmbackup.html#using-cloud-providers-metadata-service).
### Using cloud providers metadata service
`vmbackup` and `vmbackupmanager` will automatically use cloud providers metadata service in order to obtain credentials if they are running in cloud environment
and credentials are not explicitly provided via flags or env variables.
### Providing credentials in Kubernetes
The simplest way to provide credentials in Kubernetes is to use [Secrets](https://kubernetes.io/docs/concepts/configuration/secret/)
and inject them into the pod as environment variables. For example, the following secret can be used for AWS S3 credentials:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: vmbackup-credentials
data:
access_key: key
secret_key: secret
```
And then it can be injected into the pod as environment variables:
```yaml
...
env:
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
key: access_key
name: vmbackup-credentials
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
key: secret_key
name: vmbackup-credentials
...
```
A more secure way is to use IAM roles to provide tokens for pods instead of managing credentials manually.
For AWS deployments it will be required to configure [IAM roles for service accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html).
In order to use IAM roles for service accounts with `vmbackup` or `vmbackupmanager` it is required to create ServiceAccount with IAM role mapping:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: monitoring-backups
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::{ACCOUNT_ID}:role/{ROLE_NAME}
```
And [configure pod to use service account](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/).
After this `vmbackup` and `vmbackupmanager` will automatically use IAM role for service account in order to obtain credentials.
For GCP deployments it will be required to configure [Workload Identity](https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity).
In order to use Workload Identity with `vmbackup` or `vmbackupmanager` it is required to create ServiceAccount with Workload Identity annotation:
```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: monitoring-backups
annotations:
iam.gke.io/gcp-service-account: {sa_name}@{project_name}.iam.gserviceaccount.com
```
And [configure pod to use service account](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/).
After this `vmbackup` and `vmbackupmanager` will automatically use Workload Identity for servicpe account in order to obtain credentials.
### Using custom S3 endpoint
Usage with s3 custom url endpoint. It is possible to use `vmbackup` with s3 compatible storages like minio, cloudian, etc.
You have to add a custom url endpoint via flag:
- for MinIO
```console
-customS3Endpoint=http://localhost:9000
```
- for aws gov region
```console
-customS3Endpoint=https://s3-fips.us-gov-west-1.amazonaws.com
```
### Permanent deletion of objects in S3-compatible storages
`vmbackup` and [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html) use standard delete operation
for S3-compatible object storage when pefrorming [incremental backups](#incremental-backups).
This operation removes only the current version of the object. This works OK in most cases.
Sometimes it is needed to remove all the versions of an object. In this case pass `-deleteAllObjectVersions` command-line flag to `vmbackup`.
Alternatively, it is possible to use object storage lifecycle rules to remove non-current versions of objects automatically.
Refer to the respective documentation for your object storage provider for more details.
### Command-line flags
Run `vmbackup -help` in order to see all the available options:
```console
-concurrency int
The number of concurrent workers. Higher concurrency may reduce backup duration (default 10)
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-deleteAllObjectVersions
Whether to prune previous object versions when deleting an object. By default, when object storage has versioning enabled deleting the file removes only current version. This option forces removal of all previous versions. See: https://docs.victoriametrics.com/vmbackup.html#permanent-deletion-of-objects-in-s3-compatible-storages
-dst string
Where to put the backup on the remote storage. Example: gs://bucket/path/to/backup, s3://bucket/path/to/backup, azblob://container/path/to/backup or fs:///path/to/local/backup/dir
-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
Deprecated, please use -license or -licenseFile flags instead. By specifying this flag, you confirm that you have an enterprise license and accept the ESA https://victoriametrics.com/legal/esa/ . This flag is available only in Enterprise binaries. See https://docs.victoriametrics.com/enterprise.html
-filestream.disableFadvise
Whether to disable fadvise() syscall when reading large data files. The fadvise() syscall prevents from eviction of recently accessed data from OS page cache during background merges and backups. In some rare cases it is better to disable the syscall if it uses too much CPU
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default, mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default, compression is enabled to save network bandwidth
-http.header.csp string
Value for 'Content-Security-Policy' header
-http.header.frameOptions string
Value for 'X-Frame-Options' header
-http.header.hsts string
Value for 'Strict-Transport-Security' header
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address for exporting metrics at /metrics page (default ":8420")
-internStringCacheExpireDuration duration
The expiry duration for caches for interned strings. See https://en.wikipedia.org/wiki/String_interning . See also -internStringMaxLen and -internStringDisableCache (default 6m0s)
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-license string
Lisense key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/ . Trial Enterprise license can be obtained from https://victoriametrics.com/products/enterprise/trial/ . This flag is available only in Enterprise binaries. The license key can be also passed via file specified by -licenseFile command-line flag
-license.forceOffline
Whether to enable offline verification for VictoriaMetrics Enterprise license key, which has been passed either via -license or via -licenseFile command-line flag. The issued license key must support offline verification feature. Contact info@victoriametrics.com if you need offline license verification. This flag is avilable only in Enterprise binaries
-licenseFile string
Path to file with license key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/ . Trial Enterprise license can be obtained from https://victoriametrics.com/products/enterprise/trial/ . This flag is available only in Enterprise binaries. The license key can be also passed inline via -license command-line flag
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerJSONFields string
Allows renaming fields in JSON formatted logs. Example: "ts:timestamp,msg:message" renames "ts" to "timestamp" and "msg" to "message". Supported fields: ts, level, caller, msg
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxBytesPerSecond size
The maximum upload speed. There is no limit if it is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-origin string
Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups
-pprofAuthKey string
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pushmetrics.extraLabel array
Optional labels to add to metrics pushed to -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
-pushmetrics.interval duration
Interval for pushing metrics to -pushmetrics.url (default 10s)
-pushmetrics.url array
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default, metrics exposed at /metrics page aren't pushed to any remote storage
Supports an array of values separated by comma or specified via multiple flags.
-s3ForcePathStyle
Prefixing endpoint with bucket name when set false, true by default. (default true)
-s3StorageClass string
The Storage Class applied to objects uploaded to AWS S3. Supported values are: GLACIER, DEEP_ARCHIVE, GLACIER_IR, INTELLIGENT_TIERING, ONEZONE_IA, OUTPOSTS, REDUCED_REDUNDANCY, STANDARD, STANDARD_IA.
See https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html
-snapshot.createURL string
VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup. Example: http://victoriametrics:8428/snapshot/create . There is no need in setting -snapshotName if -snapshot.createURL is set
-snapshot.deleteURL string
VictoriaMetrics delete snapshot url. Optional. Will be generated from -snapshot.createURL if not provided. All created snapshots will be automatically deleted. Example: http://victoriametrics:8428/snapshot/delete
-snapshotName string
Name for the snapshot to backup. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots. There is no need in setting -snapshotName if -snapshot.createURL is set
-storageDataPath string
Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data")
-tls
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
-tlsCipherSuites array
Optional list of TLS cipher suites for incoming requests over HTTPS if -tls is set. See the list of supported cipher suites at https://pkg.go.dev/crypto/tls#pkg-constants
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
```
## How to build from sources
It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest) - see `vmutils-*` archives there.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.20.
1. Run `make vmbackup` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmbackup` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
1. Run `make vmbackup-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmbackup-prod` binary and puts it into the `bin` folder.
### Building docker images
Run `make package-vmbackup`. It builds `victoriametrics/vmbackup:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmbackup`.
The base docker image is [alpine](https://hub.docker.com/_/alpine) but it is possible to use any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of [scratch](https://hub.docker.com/_/scratch) image:
```console
ROOT_IMAGE=scratch make package-vmbackup
```
vmbackup docs can be edited at [docs/vmbackup.md](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/vmbackup.md).

View File

@@ -1,554 +1,3 @@
# vmbackupmanager
See vmbackupmanager docs [here](https://docs.victoriametrics.com/vmbackupmanager.html).
***vmbackupmanager is a part of [enterprise package](https://docs.victoriametrics.com/enterprise.html).
It is available for download and evaluation at [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).***
The VictoriaMetrics backup manager automates regular backup procedures. It supports the following backup intervals: **hourly**, **daily**, **weekly** and **monthly**.
Multiple backup intervals may be configured simultaneously. I.e. the backup manager creates hourly backups every hour, while it creates daily backups every day, etc.
Backup manager must have read access to the storage data, so best practice is to install it on the same machine (or as a sidecar) where the storage node is installed.
The backup service makes a backup every hour and puts it to the latest folder and then copies data to the folders
which represent the backup intervals (hourly, daily, weekly and monthly)
The required flags for running the service are as follows:
* -eula - should be true and means that you have the legal right to run a backup manager. That can either be a signed contract or an email
with confirmation to run the service in a trial period.
* -storageDataPath - path to VictoriaMetrics or vmstorage data path to make backup from.
* -snapshot.createURL - VictoriaMetrics creates snapshot URL which will automatically be created during backup. Example: <http://victoriametrics:8428/snapshot/create>
* -dst - backup destination at [the supported storage types](https://docs.victoriametrics.com/vmbackup.html#supported-storage-types).
* -credsFilePath - path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See [https://cloud.google.com/iam/docs/creating-managing-service-account-keys](https://cloud.google.com/iam/docs/creating-managing-service-account-keys)
and [https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html](https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html).
Backup schedule is controlled by the following flags:
* -disableHourly - disable hourly run. Default false
* -disableDaily - disable daily run. Default false
* -disableWeekly - disable weekly run. Default false
* -disableMonthly - disable monthly run. Default false
By default, all flags are turned on and Backup Manager backups data every hour for every interval (hourly, daily, weekly and monthly).
The backup manager creates the following directory hierarchy at **-dst**:
* /latest/ - contains the latest backup
* /hourly/ - contains hourly backups. Each backup is named as *YYYY-MM-DD:HH*
* /daily/ - contains daily backups. Each backup is named as *YYYY-MM-DD*
* /weekly/ - contains weekly backups. Each backup is named as *YYYY-WW*
* /monthly/ - contains monthly backups. Each backup is named as *YYYY-MM*
To get the full list of supported flags please run the following command:
```console
./vmbackupmanager --help
```
The service creates a **full** backup each run. This means that the system can be restored fully
from any particular backup using [vmrestore](https://docs.victoriametrics.com/vmrestore.html).
Backup manager uploads only the data that has been changed or created since the most recent backup (incremental backup).
This reduces the consumed network traffic and the time needed for performing the backup.
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for details.
*Please take into account that the first backup upload could take a significant amount of time as it needs to upload all the data.*
There are two flags which could help with performance tuning:
* -maxBytesPerSecond - the maximum upload speed. There is no limit if it is set to 0
* -concurrency - The number of concurrent workers. Higher concurrency may improve upload speed (default 10)
## Example of Usage
GCS and cluster version. You need to have a credentials file in json format with following structure:
credentials.json
```json
{
"type": "service_account",
"project_id": "<project>",
"private_key_id": "",
"private_key": "-----BEGIN PRIVATE KEY-----\-----END PRIVATE KEY-----\n",
"client_email": "test@<project>.iam.gserviceaccount.com",
"client_id": "",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/test%40<project>.iam.gserviceaccount.com"
}
```
Backup manager launched with the following configuration:
```console
export NODE_IP=192.168.0.10
export VMSTORAGE_ENDPOINT=http://127.0.0.1:8428
./vmbackupmanager -dst=gs://vmstorage-data/$NODE_IP -credsFilePath=credentials.json -storageDataPath=/vmstorage-data -snapshot.createURL=$VMSTORAGE_ENDPOINT/snapshot/create -eula
```
Expected logs in vmbackupmanager:
```console
info lib/backup/actions/backup.go:131 server-side copied 81 out of 81 parts from GCS{bucket: "vmstorage-data", dir: "192.168.0.10//latest/"} to GCS{bucket: "vmstorage-data", dir: "192.168.0.10//weekly/2020-34/"} in 2.549833008s
info lib/backup/actions/backup.go:169 backed up 853315 bytes in 2.882 seconds; deleted 0 bytes; server-side copied 853315 bytes; uploaded 0 bytes
```
Expected logs in vmstorage:
```console
info VictoriaMetrics/lib/storage/table.go:146 creating table snapshot of "/vmstorage-data/data"...
info VictoriaMetrics/lib/storage/storage.go:311 deleting snapshot "/vmstorage-data/snapshots/20200818201959-162C760149895DDA"...
info VictoriaMetrics/lib/storage/storage.go:319 deleted snapshot "/vmstorage-data/snapshots/20200818201959-162C760149895DDA" in 0.169 seconds
```
The result on the GCS bucket
* The root folder
<img alt="root folder" src="vmbackupmanager_root_folder.png">
* The latest folder
<img alt="latest folder" src="vmbackupmanager_latest_folder.png">
Please, see [vmbackup docs](https://docs.victoriametrics.com/vmbackup.html#advanced-usage) for more examples of authentication with different
storage types.
## Backup Retention Policy
Backup retention policy is controlled by:
* -keepLastHourly - keep the last N hourly backups. Disabled by default
* -keepLastDaily - keep the last N daily backups. Disabled by default
* -keepLastWeekly - keep the last N weekly backups. Disabled by default
* -keepLastMonthly - keep the last N monthly backups. Disabled by default
> *Note*: 0 value in every keepLast flag results into deletion of ALL backups for particular type (hourly, daily, weekly and monthly)
> *Note*: retention policy does not enforce removing previous versions of objects in object storages such if versioning is enabled. See [these docs](https://docs.victoriametrics.com/vmbackup.html#permanent-deletion-of-objects-in-s3-compatible-storages) for more details.
Lets assume we have a backup manager collecting daily backups for the past 10 days.
<img alt="retention policy daily before retention cycle" src="vmbackupmanager_rp_daily_1.png">
We enable backup retention policy for backup manager by using following configuration:
```console
export NODE_IP=192.168.0.10
export VMSTORAGE_ENDPOINT=http://127.0.0.1:8428
./vmbackupmanager -dst=gs://vmstorage-data/$NODE_IP -credsFilePath=credentials.json -storageDataPath=/vmstorage-data -snapshot.createURL=$VMSTORAGE_ENDPOINT/snapshot/create
-keepLastDaily=3 -eula
```
Expected logs in backup manager on start:
```console
info lib/logger/flag.go:20 flag "keepLastDaily" = "3"
```
Expected logs in backup manager during retention cycle:
```console
info app/vmbackupmanager/retention.go:106 daily backups to delete [daily/2021-02-13 daily/2021-02-12 daily/2021-02-11 daily/2021-02-10 daily/2021-02-09 daily/2021-02-08 daily/2021-02-07]
```
The result on the GCS bucket. We see only 3 daily backups:
<img alt="retention policy daily after retention cycle" src="vmbackupmanager_rp_daily_2.png">
### Protection backups against deletion by retention policy
You can protect any backup against deletion by retention policy with the `vmbackupmanager backups lock` command.
For instance:
```console
./vmbackupmanager backup lock daily/2021-02-13 -dst=<DST_PATH> -storageDataPath=/vmstorage-data -eula
```
After that the backup won't be deleted by retention policy.
You can view the `locked` attribute in backup list:
```console
./vmbackupmanager backup list -dst=<DST_PATH> -storageDataPath=/vmstorage-data -eula
```
To remove protection, you can use the command `vmbackupmanager backups unlock`.
For example:
```console
./vmbackupmanager backup unlock daily/2021-02-13 -dst=<DST_PATH> -storageDataPath=/vmstorage-data -eula
```
## API methods
`vmbackupmanager` exposes the following API methods:
* GET `/api/v1/backups` - returns list of backups in remote storage.
Example output:
```json
[{"name":"daily/2023-04-07","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:07+00:00"},{"name":"hourly/2023-04-07:11","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:06+00:00"},{"name":"latest","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:04+00:00"},{"name":"monthly/2023-04","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:10+00:00"},{"name":"weekly/2023-14","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:09+00:00"}]
```
> Note: `created_at` field is in RFC3339 format.
* GET `/api/v1/backups/<BACKUP_NAME>` - returns backup info by name.
Example output:
```json
{"name":"daily/2023-04-07","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:07+00:00","locked":true}
```
* PUT `/api/v1/backups/<BACKUP_NAME>` - update "locked" attribute for backup by name.
Example request body:
```json
{"locked":true}
```
Example response:
```json
{"name":"daily/2023-04-07","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:07+00:00", "locked": true}
```
* POST `/api/v1/restore` - saves backup name to restore when [performing restore](#restore-commands).
Example request body:
```json
{"backup":"daily/2022-10-06"}
```
* GET `/api/v1/restore` - returns backup name from restore mark if it exists.
Example response:
```json
{"backup":"daily/2022-10-06"}
```
* DELETE `/api/v1/restore` - delete restore mark.
## CLI
`vmbackupmanager` exposes CLI commands to work with [API methods](#api-methods) without external dependencies.
Supported commands:
```console
vmbackupmanager backup
vmbackupmanager backup list
List backups in remote storage
vmbackupmanager backup lock
Locks backup in remote storage against deletion
vmbackupmanager backup unlock
Unlocks backup in remote storage for deletion
vmbackupmanager restore
Restore backup specified by restore mark if it exists
vmbackupmanager restore get
Get restore mark if it exists
vmbackupmanager restore delete
Delete restore mark if it exists
vmbackupmanager restore create [backup_name]
Create restore mark
```
By default, CLI commands are using `http://127.0.0.1:8300` endpoint to reach `vmbackupmanager` API.
It can be changed by using flag:
```
-apiURL string
vmbackupmanager address to perform API requests (default "http://127.0.0.1:8300")
```
### Backup commands
`vmbackupmanager backup list` lists backups in remote storage:
```console
$ ./vmbackupmanager backup list
[{"name":"daily/2023-04-07","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:07+00:00"},{"name":"hourly/2023-04-07:11","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:06+00:00"},{"name":"latest","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:04+00:00"},{"name":"monthly/2023-04","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:10+00:00"},{"name":"weekly/2023-14","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:09+00:00"}]
```
### Restore commands
Restore commands are used to create, get and delete restore mark.
Restore mark is used by `vmbackupmanager` to store backup name to restore when running restore.
Create restore mark:
```console
$ ./vmbackupmanager restore create daily/2022-10-06
```
Get restore mark if it exists:
```console
$ ./vmbackupmanager restore get
{"backup":"daily/2022-10-06"}
```
Delete restore mark if it exists:
```console
$ ./vmbackupmanager restore delete
```
Perform restore:
```console
$ /vmbackupmanager-prod restore -dst=gs://vmstorage-data/$NODE_IP -credsFilePath=credentials.json -storageDataPath=/vmstorage-data
```
Note that `vmsingle` or `vmstorage` should be stopped before performing restore.
If restore mark doesn't exist at `storageDataPath`(restore wasn't requested) `vmbackupmanager restore` will exit with successful status code.
### How to restore backup via CLI
1. Run `vmbackupmanager backup list` to get list of available backups:
```console
$ /vmbackupmanager-prod backup list
[{"name":"daily/2023-04-07","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:07+00:00"},{"name":"hourly/2023-04-07:11","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:06+00:00"},{"name":"latest","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:04+00:00"},{"name":"monthly/2023-04","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:10+00:00"},{"name":"weekly/2023-14","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:09+00:00"}]
```
1. Run `vmbackupmanager restore create` to create restore mark:
- Use relative path to backup to restore from currently used remote storage:
```console
$ /vmbackupmanager-prod restore create daily/2023-04-07
```
- Use full path to backup to restore from any remote storage:
```console
$ /vmbackupmanager-prod restore create azblob://test1/vmbackupmanager/daily/2023-04-07
```
1. Stop `vmstorage` or `vmsingle` node
1. Run `vmbackupmanager restore` to restore backup:
```console
$ /vmbackupmanager-prod restore -credsFilePath=credentials.json -storageDataPath=/vmstorage-data
```
1. Start `vmstorage` or `vmsingle` node
### How to restore in Kubernetes
1. Ensure there is an init container with `vmbackupmanager restore` in `vmstorage` or `vmsingle` pod.
For [VictoriaMetrics operator](https://docs.victoriametrics.com/operator/VictoriaMetrics-Operator.html) deployments it is required to add:
```yaml
vmbackup:
restore:
onStart:
enabled: "true"
```
See operator `VMStorage` schema [here](https://docs.victoriametrics.com/operator/api.html#vmstorage) and `VMSingle` [here](https://docs.victoriametrics.com/operator/api.html#vmsinglespec).
1. Enter container running `vmbackupmanager`
1. Use `vmbackupmanager backup list` to get list of available backups:
```console
$ /vmbackupmanager-prod backup list
[{"name":"daily/2023-04-07","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:07+00:00"},{"name":"hourly/2023-04-07:11","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:06+00:00"},{"name":"latest","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:04+00:00"},{"name":"monthly/2023-04","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:10+00:00"},{"name":"weekly/2023-14","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:09+00:00"}]
```
1. Use `vmbackupmanager restore create` to create restore mark:
- Use relative path to backup to restore from currently used remote storage:
```console
$ /vmbackupmanager-prod restore create daily/2023-04-07
```
- Use full path to backup to restore from any remote storage:
```console
$ /vmbackupmanager-prod restore create azblob://test1/vmbackupmanager/daily/2023-04-07
```
1. Restart pod
#### Restore cluster into another cluster
These steps are assuming that [VictoriaMetrics operator](https://docs.victoriametrics.com/operator/VictoriaMetrics-Operator.html) is used to manage `VMCluster`.
Clusters here are referred to as `source` and `destination`.
1. Create a new cluster with access to *source* cluster `vmbackupmanager` storage and same number of storage nodes.
Add the following section in order to enable restore on start (operator `VMStorage` schema can be found [here](https://docs.victoriametrics.com/operator/api.html#vmstorage):
```yaml
vmbackup:
restore:
onStart:
enabled: "true"
```
Note: it is safe to leave this section in the cluster configuration, since it will be ignored if restore mark doesn't exist.
> Important! Use different `-dst` for *destination* cluster to avoid overwriting backup data of the *source* cluster.
1. Enter container running `vmbackupmanager` in *source* cluster
1. Use `vmbackupmanager backup list` to get list of available backups:
```console
$ /vmbackupmanager-prod backup list
[{"name":"daily/2023-04-07","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:07+00:00"},{"name":"hourly/2023-04-07:11","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:06+00:00"},{"name":"latest","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:04+00:00"},{"name":"monthly/2023-04","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:10+00:00"},{"name":"weekly/2023-14","size_bytes":318837,"size":"311.4ki","created_at":"2023-04-07T16:15:09+00:00"}]
```
1. Use `vmbackupmanager restore create` to create restore mark at each pod of the *destination* cluster.
Each pod in *destination* cluster should be restored from backup of respective pod in *source* cluster.
For example: `vmstorage-source-0` in *source* cluster should be restored from `vmstorage-destination-0` in *destination* cluster.
```console
$ /vmbackupmanager-prod restore create s3://source_cluster/vmstorage-source-0/daily/2023-04-07
```
1. Restart `vmstorage` pods of *destination* cluster. On pod start `vmbackupmanager` will restore data from the specified backup.
## Monitoring
`vmbackupmanager` exports various metrics in Prometheus exposition format at `http://vmbackupmanager:8300/metrics` page. It is recommended setting up regular scraping of this page
either via [vmagent](https://docs.victoriametrics.com/vmagent.html) or via Prometheus, so the exported metrics could be analyzed later.
Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/17798-victoriametrics-backupmanager/) for `vmbackupmanager` overview.
Graphs on this dashboard contain useful hints - hover the `i` icon in the top left corner of each graph in order to read it.
If you have suggestions for improvements or have found a bug - please open an issue on github or add
a review to the dashboard.
## Configuration
### Flags
Pass `-help` to `vmbackupmanager` in order to see the full list of supported
command-line flags with their descriptions.
The shortlist of configuration flags is the following:
```
vmbackupmanager performs regular backups according to the provided configs.
subcommands:
backup: provides auxiliary backup-related commands
restore: restores backup specified by restore mark if it exists
command-line flags:
-apiURL string
vmbackupmanager address to perform API requests (default "http://127.0.0.1:8300")
-concurrency int
The number of concurrent workers. Higher concurrency may reduce backup duration (default 10)
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-deleteAllObjectVersions
Whether to prune previous object versions when deleting an object. By default, when object storage has versioning enabled deleting the file removes only current version. This option forces removal of all previous versions. See: https://docs.victoriametrics.com/vmbackup.html#permanent-deletion-of-objects-in-s3-compatible-storages
-disableDaily
Disable daily run. Default false
-disableHourly
Disable hourly run. Default false
-disableMonthly
Disable monthly run. Default false
-disableWeekly
Disable weekly run. Default false
-dst string
The root folder of Victoria Metrics backups. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
Deprecated, please use -license or -licenseFile flags instead. By specifying this flag, you confirm that you have an enterprise license and accept the ESA https://victoriametrics.com/legal/esa/ . This flag is available only in Enterprise binaries. See https://docs.victoriametrics.com/enterprise.html
-filestream.disableFadvise
Whether to disable fadvise() syscall when reading large data files. The fadvise() syscall prevents from eviction of recently accessed data from OS page cache during background merges and backups. In some rare cases it is better to disable the syscall if it uses too much CPU
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default, mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default, compression is enabled to save network bandwidth
-http.header.csp string
Value for 'Content-Security-Policy' header
-http.header.frameOptions string
Value for 'X-Frame-Options' header
-http.header.hsts string
Value for 'Strict-Transport-Security' header
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
Address to listen for http connections (default ":8300")
-internStringCacheExpireDuration duration
The expiry duration for caches for interned strings. See https://en.wikipedia.org/wiki/String_interning . See also -internStringMaxLen and -internStringDisableCache (default 6m0s)
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-keepLastDaily int
Keep last N daily backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
-keepLastHourly int
Keep last N hourly backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
-keepLastMonthly int
Keep last N monthly backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
-keepLastWeekly int
Keep last N weekly backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
-license string
Lisense key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/ . Trial Enterprise license can be obtained from https://victoriametrics.com/products/enterprise/trial/ . This flag is available only in Enterprise binaries. The license key can be also passed via file specified by -licenseFile command-line flag
-license.forceOffline
Whether to enable offline verification for VictoriaMetrics Enterprise license key, which has been passed either via -license or via -licenseFile command-line flag. The issued license key must support offline verification feature. Contact info@victoriametrics.com if you need offline license verification. This flag is avilable only in Enterprise binaries
-licenseFile string
Path to file with license key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/ . Trial Enterprise license can be obtained from https://victoriametrics.com/products/enterprise/trial/ . This flag is available only in Enterprise binaries. The license key can be also passed inline via -license command-line flag
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerJSONFields string
Allows renaming fields in JSON formatted logs. Example: "ts:timestamp,msg:message" renames "ts" to "timestamp" and "msg" to "message". Supported fields: ts, level, caller, msg
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxBytesPerSecond int
The maximum upload speed. There is no limit if it is set to 0
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pushmetrics.extraLabel array
Optional labels to add to metrics pushed to -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
-pushmetrics.interval duration
Interval for pushing metrics to -pushmetrics.url (default 10s)
-pushmetrics.url array
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default, metrics exposed at /metrics page aren't pushed to any remote storage
Supports an array of values separated by comma or specified via multiple flags.
-runOnStart
Upload backups immediately after start of the service. Otherwise the backup starts on new hour
-s3ForcePathStyle
Prefixing endpoint with bucket name when set false, true by default. (default true)
-s3StorageClass string
The Storage Class applied to objects uploaded to AWS S3. Supported values are: GLACIER, DEEP_ARCHIVE, GLACIER_IR, INTELLIGENT_TIERING, ONEZONE_IA, OUTPOSTS, REDUCED_REDUNDANCY, STANDARD, STANDARD_IA.
See https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html
-snapshot.createURL string
VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup.Example: http://victoriametrics:8428/snapshot/create
-snapshot.deleteURL string
VictoriaMetrics delete snapshot url. Optional. Will be generated from snapshot.createURL if not provided. All created snaphosts will be automatically deleted.Example: http://victoriametrics:8428/snapshot/delete
-storageDataPath string
Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data")
-tls
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
-tlsCipherSuites array
Optional list of TLS cipher suites for incoming requests over HTTPS if -tls is set. See the list of supported cipher suites at https://pkg.go.dev/crypto/tls#pkg-constants
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
```
vmbackupmanager docs can be edited at [docs/vmbackupmanager.md](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/vmbackupmanager.md).

Binary file not shown.

Before

Width:  |  Height:  |  Size: 29 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 25 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 99 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 64 KiB

File diff suppressed because it is too large Load Diff

View File

@@ -1,467 +1,3 @@
# vmgateway
See vmgateway docs [here](https://docs.victoriametrics.com/vmgateway.html).
***vmgateway is a part of [enterprise package](https://docs.victoriametrics.com/enterprise.html).
It is available for download and evaluation at [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).***
<img alt="vmgateway" src="vmgateway-overview.jpeg">
`vmgateway` is a proxy for the VictoriaMetrics Time Series Database (TSDB). It provides the following features:
* Rate Limiter
* Based on cluster tenant's utilization, it supports multiple time interval limits for both the ingestion and retrieval of metrics
* Token Access Control
* Supports additional per-label access control for both the Single and Cluster versions of the VictoriaMetrics TSDB
* Provides access by tenantID in the Cluster version
* Allows for separate write/read/admin access to data
`vmgateway` is included in our [enterprise packages](https://docs.victoriametrics.com/enterprise.html).
## Access Control
<img alt="vmgateway-ac" src="vmgateway-access-control.jpg">
`vmgateway` supports jwt based authentication. With jwt payload can be configured to give access to specific tenants and labels as well as to read/write.
jwt token must be in following format:
```json
{
"exp": 1617304574,
"vm_access": {
"tenant_id": {
"account_id": 1,
"project_id": 5
},
"extra_labels": {
"team": "dev",
"project": "mobile"
},
"extra_filters": ["{env=~\"prod|dev\",team!=\"test\"}"],
"mode": 1
}
}
```
Where:
* `exp` - required, expire time in unix_timestamp. If the token expires then `vmgateway` rejects the request.
* `vm_access` - required, dict with claim info, minimum form: `{"vm_access": {"tenant_id": {}}`
* `tenant_id` - optional, for cluster mode, routes requests to the corresponding tenant.
* `extra_labels` - optional, key-value pairs for label filters added to the ingested or selected metrics. Multiple filters are added with `and` operation. If defined, `extra_label` from original request removed.
* `extra_filters` - optional, [series selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) added to the select query requests. Multiple selectors are added with `or` operation. If defined, `extra_filter` from original request removed.
* `mode` - optional, access mode for api - read, write, or full. Supported values: 0 - full (default value), 1 - read, 2 - write.
## QuickStart
Start the single version of VictoriaMetrics
```console
# single
# start node
./bin/victoria-metrics --selfScrapeInterval=10s
```
Start vmgateway
```console
./bin/vmgateway -eula -enable.auth -read.url http://localhost:8428 --write.url http://localhost:8428
```
Retrieve data from the database
```console
curl 'http://localhost:8431/api/v1/series/count' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ2bV9hY2Nlc3MiOnsidGVuYW50X2lkIjp7fSwicm9sZSI6MX0sImV4cCI6MTkzOTM0NjIxMH0.5WUxEfdcV9hKo4CtQdtuZYOGpGXWwaqM9VuVivMMrVg'
```
A request with an incorrect token or without any token will be rejected:
```console
curl 'http://localhost:8431/api/v1/series/count'
curl 'http://localhost:8431/api/v1/series/count' -H 'Authorization: Bearer incorrect-token'
```
## Rate Limiter
<img alt="vmgateway-rl" src="vmgateway-rate-limiting.jpg">
Limits incoming requests by given, pre-configured limits. It supports read and write limiting by tenant.
`vmgateway` needs a datasource for rate limit queries. It can be either single-node or cluster version of `victoria-metrics`.
The metrics that you want to rate limit must be scraped from the cluster.
List of supported limit types:
* `queries` - count of api requests made at tenant to read the api, such as `/api/v1/query`, `/api/v1/series` and others.
* `active_series` - count of current active series at any given tenant.
* `new_series` - count of created series; aka churn rate
* `rows_inserted` - count of inserted rows per tenant.
List of supported time windows:
* `minute`
* `hour`
Limits can be specified per tenant or at a global level if you omit `project_id` and `account_id`.
Example of configuration file:
```yaml
limits:
- type: queries
value: 1000
resolution: minute
- type: queries
value: 10000
resolution: hour
- type: queries
value: 10
resolution: minute
project_id: 5
account_id: 1
```
## QuickStart
cluster version of VictoriaMetrics is required for rate limiting.
```console
# start datasource for cluster metrics
cat << EOF > cluster.yaml
scrape_configs:
- job_name: cluster
scrape_interval: 5s
static_configs:
- targets: ['127.0.0.1:8481','127.0.0.1:8482','127.0.0.1:8480']
EOF
./bin/victoria-metrics --promscrape.config cluster.yaml
# start cluster
# start vmstorage, vmselect and vminsert
./bin/vmstorage -eula
./bin/vmselect -eula -storageNode 127.0.0.1:8401
./bin/vminsert -eula -storageNode 127.0.0.1:8400
# create base rate limiting config:
cat << EOF > limit.yaml
limits:
- type: queries
value: 100
- type: rows_inserted
value: 100000
- type: new_series
value: 1000
- type: active_series
value: 100000
- type: queries
value: 1
account_id: 15
EOF
# start gateway with clusterMoe
./bin/vmgateway -eula -enable.rateLimit -ratelimit.config limit.yaml -datasource.url http://localhost:8428 -enable.auth -clusterMode -write.url=http://localhost:8480 --read.url=http://localhost:8481
# ingest simple metric to tenant 1:5
curl 'http://localhost:8431/api/v1/import/prometheus' -X POST -d 'foo{bar="baz1"} 123' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2MjAxNjIwMDAwMDAsInZtX2FjY2VzcyI6eyJ0ZW5hbnRfaWQiOnsiYWNjb3VudF9pZCI6MTV9fX0.PB1_KXDKPUp-40pxOGk6lt_jt9Yq80PIMpWVJqSForQ'
# read metric from tenant 1:5
curl 'http://localhost:8431/api/v1/labels' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2MjAxNjIwMDAwMDAsInZtX2FjY2VzcyI6eyJ0ZW5hbnRfaWQiOnsiYWNjb3VudF9pZCI6MTV9fX0.PB1_KXDKPUp-40pxOGk6lt_jt9Yq80PIMpWVJqSForQ'
# check rate limit
```
## JWT signature verification
`vmgateway` supports JWT signature verification.
Supported algorithms are `RS256`, `RS384`, `RS512`, `ES256`, `ES384`, `ES512`, `PS256`, `PS384`, `PS512`.
Tokens with unsupported algorithms will be rejected.
In order to enable JWT signature verification, you need to specify keys for signature verification.
The following flags are used to specify keys:
- `-auth.publicKeyFiles` - allows to pass file path to file with public key.
- `-auth.publicKeys` - allows to pass public key directly.
Note that both flags support passing multiple keys and also can be used together.
Example usage:
```console
./bin/vmgateway -eula \
-enable.auth \
-write.url=http://localhost:8480 \
-read.url=http://localhost:8481 \
-auth.publicKeyFiles=public_key.pem \
-auth.publicKeyFiles=public_key2.pem \
-auth.publicKeys=`-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAu1SU1LfVLPHCozMxH2Mo
4lgOEePzNm0tRgeLezV6ffAt0gunVTLw7onLRnrq0/IzW7yWR7QkrmBL7jTKEn5u
+qKhbwKfBstIs+bMY2Zkp18gnTxKLxoS2tFczGkPLPgizskuemMghRniWaoLcyeh
kd3qqGElvW/VDL5AaWTg0nLVkjRo9z+40RQzuVaE8AkAFmxZzow3x+VJYKdjykkJ
0iT9wCS0DRTXu269V264Vf/3jvredZiKRkgwlL9xNAwxXFg0x/XFw005UWVRIkdg
cKWTjpBP2dPwVZ4WWC+9aGVd+Gyn1o0CLelf4rEjGoXbAAEgAqeGUxrcIlbjXfbc
mwIDAQAB
-----END PUBLIC KEY-----
`
```
This command will result in 3 keys loaded: 2 keys from files and 1 from command line.
### Using OpenID discovery endpoint for JWT signature verification
`vmgateway` supports using OpenID discovery endpoint for JWKS keys discovery.
In order to enable [OpenID discovery](https://openid.net/specs/openid-connect-discovery-1_0.html) endpoint for JWT signature verification, you need to specify OpenID discovery endpoint URLs by using `auth.oidcDiscoveryEndpoints` flag.
When `auth.oidcDiscoveryEndpoints` is specified `vmageteway` will fetch JWKS keys from the specified endpoint and use them for JWT signature verification.
Example usage for tokens issued by Azure Active Directory:
```console
/bin/vmgateway -eula \
-enable.auth \
-write.url=http://localhost:8480 \
-read.url=http://localhost:8481 \
-auth.oidcDiscoveryEndpoints=https://login.microsoftonline.com/common/v2.0/.well-known/openid-configuration
```
Example usage for tokens issued by Google:
```console
/bin/vmgateway -eula \
-enable.auth \
-write.url=http://localhost:8480 \
-read.url=http://localhost:8481 \
-auth.oidcDiscoveryEndpoints=https://accounts.google.com/.well-known/openid-configuration
```
### Using JWKS endpoint for JWT signature verification
`vmgateway` supports using JWKS endpoint for JWT signature verification.
In order to enable JWKS endpoint for JWT signature verification, you need to specify JWKS endpoint URL by using `auth.jwksEndpoints` flag.
When `auth.jwksEndpoints` is specified `vmageteway` will fetch public keys from the specified endpoint and use them for JWT signature verification.
Example usage for tokens issued by Azure Active Directory:
```console
/bin/vmgateway -eula \
-enable.auth \
-write.url=http://localhost:8480 \
-read.url=http://localhost:8481 \
-auth.jwksEndpoints=https://login.microsoftonline.com/common/discovery/v2.0/keys
```
Example usage for tokens issued by Google:
```console
/bin/vmgateway -eula \
-enable.auth \
-write.url=http://localhost:8480 \
-read.url=http://localhost:8481 \
-auth.jwksEndpoints=https://www.googleapis.com/oauth2/v3/certs
```
## Configuration
The shortlist of configuration flags include the following:
```console
-auth.httpHeader string
HTTP header name to look for JWT authorization token (default "Authorization")
-auth.jwksEndpoints array
JWKS endpoints to fetch keys for JWT tokens signature verification
Supports an array of values separated by comma or specified via multiple flags.
-auth.oidcDiscoveryEndpoints array
OpenID Connect discovery endpoints to fetch keys for JWT tokens signature verification
Supports an array of values separated by comma or specified via multiple flags.
-auth.publicKeyFiles array
Path file with public key to verify JWT token signature
Supports an array of values separated by comma or specified via multiple flags.
-auth.publicKeys array
Public keys to verify JWT token signature
Supports an array of values separated by comma or specified via multiple flags.
-clusterMode
enable this for the cluster version
-datasource.appendTypePrefix
Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.
-datasource.basicAuth.password string
Optional basic auth password for -datasource.url
-datasource.basicAuth.passwordFile string
Optional path to basic auth password to use for -datasource.url
-datasource.basicAuth.username string
Optional basic auth username for -datasource.url
-datasource.bearerToken string
Optional bearer auth token to use for -datasource.url.
-datasource.bearerTokenFile string
Optional path to bearer token file to use for -datasource.url.
-datasource.disableKeepAlive
Whether to disable long-lived connections to the datasource. If true, disables HTTP keep-alives and will only use the connection to the server for a single HTTP request.
-datasource.disableStepParam
Whether to disable adding 'step' param to the issued instant queries. This might be useful when using vmalert with datasources that do not support 'step' param for instant queries, like Google Managed Prometheus. It is not recommended to enable this flag if you use vmalert with VictoriaMetrics.
-datasource.headers string
Optional HTTP extraHeaders to send with each request to the corresponding -datasource.url. For example, -datasource.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -datasource.url. Multiple headers must be delimited by '^^': -datasource.headers='header1:value1^^header2:value2'
-datasource.lookback duration
Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query.
-datasource.maxIdleConnections int
Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state. (default 100)
-datasource.oauth2.clientID string
Optional OAuth2 clientID to use for -datasource.url.
-datasource.oauth2.clientSecret string
Optional OAuth2 clientSecret to use for -datasource.url.
-datasource.oauth2.clientSecretFile string
Optional OAuth2 clientSecretFile to use for -datasource.url.
-datasource.oauth2.scopes string
Optional OAuth2 scopes to use for -datasource.url. Scopes must be delimited by ';'
-datasource.oauth2.tokenUrl string
Optional OAuth2 tokenURL to use for -datasource.url.
-datasource.queryStep duration
How far a value can fallback to when evaluating queries. For example, if -datasource.queryStep=15s then param "step" with value "15s" will be added to every query. If set to 0, rule's evaluation interval will be used instead. (default 5m0s)
-datasource.queryTimeAlignment
Deprecated: please use "eval_alignment" in rule group instead. Whether to align "time" parameter with evaluation interval. Alignment supposed to produce deterministic results despite number of vmalert replicas or time they were started. See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257 (default true)
-datasource.roundDigits int
Adds "round_digits" GET param to datasource requests. In VM "round_digits" limits the number of digits after the decimal point in response values.
-datasource.showURL
Whether to avoid stripping sensitive information such as auth headers or passwords from URLs in log messages or UI and exported metrics. It is hidden by default, since it can contain sensitive info such as auth key
-datasource.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -datasource.url. By default, system CA is used
-datasource.tlsCertFile string
Optional path to client-side TLS certificate file to use when connecting to -datasource.url
-datasource.tlsInsecureSkipVerify
Whether to skip tls verification when connecting to -datasource.url
-datasource.tlsKeyFile string
Optional path to client-side TLS certificate key to use when connecting to -datasource.url
-datasource.tlsServerName string
Optional TLS server name to use for connections to -datasource.url. By default, the server name from -datasource.url is used
-datasource.url string
Datasource compatible with Prometheus HTTP API. It can be single node VictoriaMetrics or vmselect URL. Required parameter. E.g. http://127.0.0.1:8428 . See also -remoteRead.disablePathAppend and -datasource.showURL
-enable.auth
enables auth with jwt token
-enable.rateLimit
enables rate limiter
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
Deprecated, please use -license or -licenseFile flags instead. By specifying this flag, you confirm that you have an enterprise license and accept the ESA https://victoriametrics.com/legal/esa/ . This flag is available only in Enterprise binaries. See https://docs.victoriametrics.com/enterprise.html
-filestream.disableFadvise
Whether to disable fadvise() syscall when reading large data files. The fadvise() syscall prevents from eviction of recently accessed data from OS page cache during background merges and backups. In some rare cases it is better to disable the syscall if it uses too much CPU
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default, mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default, compression is enabled to save network bandwidth
-http.header.csp string
Value for 'Content-Security-Policy' header
-http.header.frameOptions string
Value for 'X-Frame-Options' header
-http.header.hsts string
Value for 'Strict-Transport-Security' header
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections. See also -httpListenAddr.useProxyProtocol (default ":8431")
-httpListenAddr.useProxyProtocol
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
-internStringCacheExpireDuration duration
The expiry duration for caches for interned strings. See https://en.wikipedia.org/wiki/String_interning . See also -internStringMaxLen and -internStringDisableCache (default 6m0s)
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-license string
Lisense key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/ . Trial Enterprise license can be obtained from https://victoriametrics.com/products/enterprise/trial/ . This flag is available only in Enterprise binaries. The license key can be also passed via file specified by -licenseFile command-line flag
-license.forceOffline
Whether to enable offline verification for VictoriaMetrics Enterprise license key, which has been passed either via -license or via -licenseFile command-line flag. The issued license key must support offline verification feature. Contact info@victoriametrics.com if you need offline license verification. This flag is avilable only in Enterprise binaries
-licenseFile string
Path to file with license key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/ . Trial Enterprise license can be obtained from https://victoriametrics.com/products/enterprise/trial/ . This flag is available only in Enterprise binaries. The license key can be also passed inline via -license command-line flag
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerJSONFields string
Allows renaming fields in JSON formatted logs. Example: "ts:timestamp,msg:message" renames "ts" to "timestamp" and "msg" to "message". Supported fields: ts, level, caller, msg
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pushmetrics.extraLabel array
Optional labels to add to metrics pushed to -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
-pushmetrics.interval duration
Interval for pushing metrics to -pushmetrics.url (default 10s)
-pushmetrics.url array
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default, metrics exposed at /metrics page aren't pushed to any remote storage
Supports an array of values separated by comma or specified via multiple flags.
-ratelimit.config string
path for configuration file. Accepts url address
-ratelimit.configCheckInterval duration
interval for config file re-read. Zero value disables config re-reading. By default, refreshing is disabled, send SIGHUP for config refresh.
-ratelimit.extraLabels array
additional labels, that will be applied to fetchdata from datasource
Supports an array of values separated by comma or specified via multiple flags.
-ratelimit.refreshInterval duration
(default 5s)
-read.url string
read access url address, example: http://vmselect:8481
-remoteRead.disablePathAppend
Whether to disable automatic appending of '/api/v1/query' path to the configured -datasource.url and -remoteRead.url
-tls
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
-tlsCipherSuites array
Optional list of TLS cipher suites for incoming requests over HTTPS if -tls is set. See the list of supported cipher suites at https://pkg.go.dev/crypto/tls#pkg-constants
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
-write.url string
write access url address, example: http://vminsert:8480
```
## TroubleShooting
* Access control:
* incorrect `jwt` format, try <https://jwt.io/#debugger-io> with our tokens
* expired token, check `exp` field.
* Rate Limiting:
* `scrape_interval` at datasource, reduce it to apply limits faster.
## Limitations
* Access Control:
* `jwt` token signature verification for `HMAC` algorithms is not supported.
* RateLimiting:
* limits applied based on queries to `datasource.url`
* only cluster version can be rate-limited.
vmgateway docs can be edited at [docs/vmgateway.md](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/vmgateway.md).

Binary file not shown.

Before

Width:  |  Height:  |  Size: 48 KiB

View File

@@ -22,7 +22,7 @@ var (
//
// See https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
func InsertHandler(at *auth.Token, r io.Reader) error {
return stream.Parse(r, func(rows []parser.Row) error {
return stream.Parse(r, false, func(rows []parser.Row) error {
return insertRows(at, rows)
})
}

View File

@@ -11,6 +11,8 @@ import (
"sync/atomic"
"time"
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/clusternative"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/datadog"
@@ -25,6 +27,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/prometheusimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/seriesupdate"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
@@ -43,7 +46,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/pushmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
)
var (
@@ -249,6 +251,14 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.WriteHeader(http.StatusNoContent)
return true
case "prometheus/api/v1/update/series":
// todo logging and errors.
if err := seriesupdate.InsertHandler(at, r); err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "prometheus/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(at, r); err != nil {
@@ -314,7 +324,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
case "newrelic/infra/v2/metrics/events/bulk":
newrelicWriteRequests.Inc()
if err := newrelic.InsertHandlerForHTTP(r); err != nil {
if err := newrelic.InsertHandlerForHTTP(at, r); err != nil {
newrelicWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true

View File

@@ -7,6 +7,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/newrelic"
@@ -19,7 +20,7 @@ var (
)
// InsertHandlerForHTTP processes remote write for request to /newrelic/infra/v2/metrics/events/bulk request.
func InsertHandlerForHTTP(req *http.Request) error {
func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
@@ -27,11 +28,11 @@ func InsertHandlerForHTTP(req *http.Request) error {
ce := req.Header.Get("Content-Encoding")
isGzip := ce == "gzip"
return stream.Parse(req.Body, isGzip, func(rows []newrelic.Row) error {
return insertRows(rows, extraLabels)
return insertRows(at, rows, extraLabels)
})
}
func insertRows(rows []newrelic.Row, extraLabels []prompbmarshal.Label) error {
func insertRows(at *auth.Token, rows []newrelic.Row, extraLabels []prompbmarshal.Label) error {
ctx := netstorage.GetInsertCtx()
defer netstorage.PutInsertCtx(ctx)
@@ -62,7 +63,8 @@ func insertRows(rows []newrelic.Row, extraLabels []prompbmarshal.Label) error {
continue
}
ctx.SortLabelsIfNeeded()
if err := ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, s.Value); err != nil {
atLocal := ctx.GetLocalAuthToken(at)
if err := ctx.WriteDataPoint(atLocal, ctx.Labels, r.Timestamp, s.Value); err != nil {
return err
}
}

View File

@@ -0,0 +1,81 @@
package seriesupdate
import (
"net/http"
"strconv"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="update_series"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vm_tenant_inserted_rows_total{type="update_series"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="update_series"}`)
)
// Returns local unique generationID.
func generateUniqueGenerationID() []byte {
nextID := time.Now().UnixNano()
return []byte(strconv.FormatInt(nextID, 10))
}
// InsertHandler processes `/api/v1/update/series` request.
func InsertHandler(at *auth.Token, req *http.Request) error {
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return stream.Parse(req.Body, isGzipped, func(rows []vmimport.Row) error {
return insertRows(at, rows)
})
}
func insertRows(at *auth.Token, rows []vmimport.Row) error {
ctx := netstorage.GetInsertCtx()
defer netstorage.PutInsertCtx(ctx)
ctx.Reset() // This line is required for initializing ctx internals.
rowsTotal := 0
generationID := generateUniqueGenerationID()
for i := range rows {
r := &rows[i]
rowsTotal += len(r.Values)
ctx.Labels = ctx.Labels[:0]
for j := range r.Tags {
tag := &r.Tags[j]
ctx.AddLabelBytes(tag.Key, tag.Value)
}
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
// there is no need in relabeling and extra_label adding
// since modified series already passed this phase during ingestion,
// and it may lead to unexpected result for user.
ctx.AddLabelBytes([]byte(`__generation_id`), generationID)
ctx.MetricNameBuf = storage.MarshalMetricNameRaw(ctx.MetricNameBuf[:0], at.AccountID, at.ProjectID, ctx.Labels)
values := r.Values
timestamps := r.Timestamps
if len(timestamps) != len(values) {
logger.Panicf("BUG: len(timestamps)=%d must match len(values)=%d", len(timestamps), len(values))
}
atLocal := ctx.GetLocalAuthToken(at)
storageNodeIdx := ctx.GetStorageNodeIdx(atLocal, ctx.Labels)
for j, value := range values {
timestamp := timestamps[j]
if err := ctx.WriteDataPointExt(storageNodeIdx, ctx.MetricNameBuf, timestamp, value); err != nil {
return err
}
}
}
rowsInserted.Add(rowsTotal)
rowsTenantInserted.Get(at).Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ctx.FlushBufs()
}

View File

@@ -1,237 +1,3 @@
# vmrestore
See vmrestore docs [here](https://docs.victoriametrics.com/vmrestore.html).
`vmrestore` restores data from backups created by [vmbackup](https://docs.victoriametrics.com/vmbackup.html).
Restore process can be interrupted at any time. It is automatically resumed from the interruption point
when restarting `vmrestore` with the same args.
## Usage
VictoriaMetrics must be stopped during the restore process.
Run the following command to restore backup from the given `-src` into the given `-storageDataPath`:
```console
./vmrestore -src=<storageType>://<path/to/backup> -storageDataPath=<local/path/to/restore>
```
* `<storageType>://<path/to/backup>` is the path to backup made with [vmbackup](https://docs.victoriametrics.com/vmbackup.html).
`vmrestore` can restore backups from the following storage types:
* [GCS](https://cloud.google.com/storage/). Example: `-src=gs://<bucket>/<path/to/backup>`
* [S3](https://aws.amazon.com/s3/). Example: `-src=s3://<bucket>/<path/to/backup>`
* [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs/). Example: `-src=azblob://<container>/<path/to/backup>`
* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio), [Ceph](https://docs.ceph.com/en/pacific/radosgw/s3/)
or [Swift](https://platform.swiftstack.com/docs/admin/middleware/s3_middleware.html). See [these docs](#advanced-usage) for details.
* Local filesystem. Example: `-src=fs://</absolute/path/to/backup>`. Note that `vmbackup` prevents from storing the backup
into the directory pointed by `-storageDataPath` command-line flag, since this directory should be managed solely by VictoriaMetrics or `vmstorage`.
* `<local/path/to/restore>` is the path to folder where data will be restored. This folder must be passed
to VictoriaMetrics in `-storageDataPath` command-line flag after the restore process is complete.
The original `-storageDataPath` directory may contain old files. They will be substituted by the files from backup,
i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/questions/476041/how-do-i-make-rsync-delete-files-that-have-been-deleted-from-the-source-folder).
## Troubleshooting
* If `vmrestore` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value.
* If `vmrestore` has been interrupted due to temporary error, then just restart it with the same args. It will resume the restore process.
## Advanced usage
* Obtaining credentials from a file.
Add flag `-credsFilePath=/etc/credentials` with following content:
for s3 (aws, minio or other s3 compatible storages):
```console
[default]
aws_access_key_id=theaccesskey
aws_secret_access_key=thesecretaccesskeyvalue
```
for gce cloud storage:
```json
{
"type": "service_account",
"project_id": "project-id",
"private_key_id": "key-id",
"private_key": "-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n",
"client_email": "service-account-email",
"client_id": "client-id",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-email"
}
```
* Usage with s3 custom url endpoint. It is possible to use `vmrestore` with s3 api compatible storages, like minio, cloudian and other.
You have to add custom url endpoint with a flag:
```console
# for minio:
-customS3Endpoint=http://localhost:9000
# for aws gov region
-customS3Endpoint=https://s3-fips.us-gov-west-1.amazonaws.com
```
* Run `vmrestore -help` in order to see all the available options:
```console
-concurrency int
The number of concurrent workers. Higher concurrency may reduce restore duration (default 10)
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-deleteAllObjectVersions
Whether to prune previous object versions when deleting an object. By default, when object storage has versioning enabled deleting the file removes only current version. This option forces removal of all previous versions. See: https://docs.victoriametrics.com/vmbackup.html#permanent-deletion-of-objects-in-s3-compatible-storages
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default, only IPv4 TCP and UDP are used
-envflag.enable
Whether to enable reading flags from environment variables in addition to the command line. Command line flag values have priority over values from environment vars. Flags are read only from the command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
Deprecated, please use -license or -licenseFile flags instead. By specifying this flag, you confirm that you have an enterprise license and accept the ESA https://victoriametrics.com/legal/esa/ . This flag is available only in Enterprise binaries. See https://docs.victoriametrics.com/enterprise.html
-filestream.disableFadvise
Whether to disable fadvise() syscall when reading large data files. The fadvise() syscall prevents from eviction of recently accessed data from OS page cache during background merges and backups. In some rare cases it is better to disable the syscall if it uses too much CPU
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default, mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default, compression is enabled to save network bandwidth
-http.header.csp string
Value for 'Content-Security-Policy' header
-http.header.frameOptions string
Value for 'X-Frame-Options' header
-http.header.hsts string
Value for 'Strict-Transport-Security' header
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address for exporting metrics at /metrics page (default ":8421")
-internStringCacheExpireDuration duration
The expiry duration for caches for interned strings. See https://en.wikipedia.org/wiki/String_interning . See also -internStringMaxLen and -internStringDisableCache (default 6m0s)
-internStringDisableCache
Whether to disable caches for interned strings. This may reduce memory usage at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringCacheExpireDuration and -internStringMaxLen
-internStringMaxLen int
The maximum length for strings to intern. A lower limit may save memory at the cost of higher CPU usage. See https://en.wikipedia.org/wiki/String_interning . See also -internStringDisableCache and -internStringCacheExpireDuration (default 500)
-license string
Lisense key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/ . Trial Enterprise license can be obtained from https://victoriametrics.com/products/enterprise/trial/ . This flag is available only in Enterprise binaries. The license key can be also passed via file specified by -licenseFile command-line flag
-license.forceOffline
Whether to enable offline verification for VictoriaMetrics Enterprise license key, which has been passed either via -license or via -licenseFile command-line flag. The issued license key must support offline verification feature. Contact info@victoriametrics.com if you need offline license verification. This flag is avilable only in Enterprise binaries
-licenseFile string
Path to file with license key for VictoriaMetrics Enterprise. See https://victoriametrics.com/products/enterprise/ . Trial Enterprise license can be obtained from https://victoriametrics.com/products/enterprise/trial/ . This flag is available only in Enterprise binaries. The license key can be also passed inline via -license command-line flag
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerJSONFields string
Allows renaming fields in JSON formatted logs. Example: "ts:timestamp,msg:message" renames "ts" to "timestamp" and "msg" to "message". Supported fields: ts, level, caller, msg
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxBytesPerSecond size
The maximum download speed. There is no limit if it is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pushmetrics.extraLabel array
Optional labels to add to metrics pushed to -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
-pushmetrics.interval duration
Interval for pushing metrics to -pushmetrics.url (default 10s)
-pushmetrics.url array
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default, metrics exposed at /metrics page aren't pushed to any remote storage
Supports an array of values separated by comma or specified via multiple flags.
-s3ForcePathStyle
Prefixing endpoint with bucket name when set false, true by default. (default true)
-s3StorageClass string
The Storage Class applied to objects uploaded to AWS S3. Supported values are: GLACIER, DEEP_ARCHIVE, GLACIER_IR, INTELLIGENT_TIERING, ONEZONE_IA, OUTPOSTS, REDUCED_REDUNDANCY, STANDARD, STANDARD_IA.
See https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html
-skipBackupCompleteCheck
Whether to skip checking for 'backup complete' file in -src. This may be useful for restoring from old backups, which were created without 'backup complete' file
-src string
Source path with backup on the remote storage. Example: gs://bucket/path/to/backup, s3://bucket/path/to/backup, azblob://container/path/to/backup or fs:///path/to/local/backup
-storageDataPath string
Destination path where backup must be restored. VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case the contents of -storageDataPath dir is synchronized with -src contents, i.e. it works like 'rsync --delete' (default "victoria-metrics-data")
-tls
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
-tlsCipherSuites array
Optional list of TLS cipher suites for incoming requests over HTTPS if -tls is set. See the list of supported cipher suites at https://pkg.go.dev/crypto/tls#pkg-constants
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
```
## How to build from sources
It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest) - see `vmutils-*` archives there.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.20.
1. Run `make vmrestore` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmrestore` binary and puts it into the `bin` folder.
### Production build
1. [Install docker](https://docs.docker.com/install/).
1. Run `make vmrestore-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmrestore-prod` binary and puts it into the `bin` folder.
### Building docker images
Run `make package-vmrestore`. It builds `victoriametrics/vmrestore:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmrestore`.
The base docker image is [alpine](https://hub.docker.com/_/alpine) but it is possible to use any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of [scratch](https://hub.docker.com/_/scratch) image:
```console
ROOT_IMAGE=scratch make package-vmrestore
```
vmrestore docs can be edited at [docs/vmrestore.md](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/vmrestore.md).

View File

@@ -179,17 +179,13 @@ var (
)
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.URL.Path == "/" {
if r.Method != http.MethodGet {
return false
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, `vmselect - a component of VictoriaMetrics cluster<br/>
<a href="https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html">docs</a><br>
`)
path := strings.Replace(r.URL.Path, "//", "/", -1)
if handleStaticAndSimpleRequests(w, r, path) {
return true
}
// Handle non-trivial dynamic requests, which may take big amounts of time and resources.
startTime := time.Now()
defer requestDuration.UpdateDuration(startTime)
tracerEnabled := httputils.GetBool(r, "trace")
@@ -248,7 +244,6 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}()
}
path := strings.Replace(r.URL.Path, "//", "/", -1)
if path == "/internal/resetRollupResultCache" {
if !httpserver.CheckAuthFlag(w, r, *resetCacheAuthKey, "resetCacheAuthKey") {
return true
@@ -256,22 +251,6 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
promql.ResetRollupResultCache()
return true
}
if path == "/api/v1/status/top_queries" {
globalTopQueriesRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.QueryStatsHandler(startTime, nil, w, r); err != nil {
globalTopQueriesErrors.Inc()
sendPrometheusError(w, r, err)
return true
}
return true
}
if path == "/api/v1/status/active_queries" {
globalStatusActiveQueriesRequests.Inc()
httpserver.EnableCORS(w, r)
promql.ActiveQueriesHandler(nil, w, r)
return true
}
if path == "/admin/tenants" {
tenantsRequests.Inc()
httpserver.EnableCORS(w, r)
@@ -314,69 +293,6 @@ func selectHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseW
httpRequests.Get(at).Inc()
httpRequestsDuration.Get(at).Add(int(time.Since(startTime).Milliseconds()))
}()
if p.Suffix == "" {
if r.Method != http.MethodGet {
return false
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, "<h2>VictoriaMetrics cluster - vmselect</h2></br>")
fmt.Fprintf(w, "See <a href='https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format'>docs</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
fmt.Fprintf(w, `<a href="vmui">Web UI</a><br>`)
fmt.Fprintf(w, `<a href="metric-relabel-debug">metric-level relabel debugging</a></br>`)
fmt.Fprintf(w, `<a href="target-relabel-debug">target-level relabel debugging</a></br>`)
fmt.Fprintf(w, `<a href="expand-with-exprs">WITH expressions' tutorial</a></br>`)
fmt.Fprintf(w, `<a href="prometheus/api/v1/status/tsdb">tsdb status page</a><br>`)
fmt.Fprintf(w, `<a href="prometheus/api/v1/status/top_queries">top queries</a><br>`)
fmt.Fprintf(w, `<a href="prometheus/api/v1/status/active_queries">active queries</a><br>`)
return true
}
if strings.HasPrefix(p.Suffix, "static") {
prefix := strings.Join([]string{"", p.Prefix, p.AuthToken}, "/")
http.StripPrefix(prefix, staticServer).ServeHTTP(w, r)
return true
}
if strings.HasPrefix(p.Suffix, "prometheus/static") {
prefix := strings.Join([]string{"", p.Prefix, p.AuthToken}, "/")
r.URL.Path = strings.Replace(r.URL.Path, "/prometheus/static", "/static", 1)
http.StripPrefix(prefix, staticServer).ServeHTTP(w, r)
return true
}
if p.Suffix == "vmui" || p.Suffix == "graph" || p.Suffix == "prometheus/vmui" || p.Suffix == "prometheus/graph" {
// VMUI access via incomplete url without `/` in the end. Redirect to complete url.
// Use relative redirect, since the hostname and path prefix may be incorrect if VictoriaMetrics
// is hidden behind vmauth or similar proxy.
_ = r.ParseForm()
suffix := strings.Replace(p.Suffix, "prometheus/", "../prometheus/", 1)
newURL := suffix + "/?" + r.Form.Encode()
httpserver.Redirect(w, newURL)
return true
}
if strings.HasPrefix(p.Suffix, "graph/") || strings.HasPrefix(p.Suffix, "prometheus/graph/") {
// This is needed for serving /graph URLs from Prometheus datasource in Grafana.
p.Suffix = strings.Replace(p.Suffix, "graph/", "vmui/", 1)
r.URL.Path = strings.Replace(r.URL.Path, "/graph/", "/vmui/", 1)
}
if p.Suffix == "vmui/custom-dashboards" || p.Suffix == "prometheus/vmui/custom-dashboards" {
if err := handleVMUICustomDashboards(w); err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
}
if strings.HasPrefix(p.Suffix, "vmui/") || strings.HasPrefix(p.Suffix, "prometheus/vmui/") {
// vmui access.
if strings.HasPrefix(p.Suffix, "vmui/static/") || strings.HasPrefix(p.Suffix, "prometheus/vmui/static/") {
// Allow clients caching static contents for long period of time, since it shouldn't change over time.
// Path to static contents (such as js and css) must be changed whenever its contents is changed.
// See https://developer.chrome.com/docs/lighthouse/performance/uses-long-cache-ttl/
w.Header().Set("Cache-Control", "max-age=31536000")
}
prefix := strings.Join([]string{"", p.Prefix, p.AuthToken}, "/")
r.URL.Path = strings.Replace(r.URL.Path, "/prometheus/vmui/", "/vmui/", 1)
http.StripPrefix(prefix, vmuiFileServer).ServeHTTP(w, r)
return true
}
if strings.HasPrefix(p.Suffix, "prometheus/api/v1/label/") {
s := p.Suffix[len("prometheus/api/v1/label/"):]
if strings.HasSuffix(s, "/values") {
@@ -401,46 +317,6 @@ func selectHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseW
}
return true
}
if strings.HasPrefix(p.Suffix, "graphite/functions") {
funcName := p.Suffix[len("graphite/functions"):]
funcName = strings.TrimPrefix(funcName, "/")
if funcName == "" {
graphiteFunctionsRequests.Inc()
if err := graphite.FunctionsHandler(w, r); err != nil {
graphiteFunctionsErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
}
graphiteFunctionDetailsRequests.Inc()
if err := graphite.FunctionDetailsHandler(funcName, w, r); err != nil {
graphiteFunctionDetailsErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
}
if p.Suffix == "prometheus/vmalert" {
// vmalert access via incomplete url without `/` in the end. Redirect to complete url.
// Use relative redirect, since the hostname and path prefix may be incorrect if VictoriaMetrics
// is hidden behind vmauth or similar proxy.
path := "../" + p.Suffix + "/"
httpserver.Redirect(w, path)
return true
}
if strings.HasPrefix(p.Suffix, "prometheus/vmalert/") {
vmalertRequests.Inc()
if len(*vmalertProxyURL) == 0 {
w.WriteHeader(http.StatusBadRequest)
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"error","msg":"for accessing vmalert flag '-vmalert.proxyURL' must be configured"}`)
return true
}
proxyVMAlertRequests(w, r, p.Suffix)
return true
}
switch p.Suffix {
case "prometheus/api/v1/query":
@@ -497,20 +373,6 @@ func selectHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseW
return true
}
return true
case "prometheus/api/v1/status/active_queries":
statusActiveQueriesRequests.Inc()
httpserver.EnableCORS(w, r)
promql.ActiveQueriesHandler(at, w, r)
return true
case "prometheus/api/v1/status/top_queries":
topQueriesRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.QueryStatsHandler(startTime, at, w, r); err != nil {
topQueriesErrors.Inc()
sendPrometheusError(w, r, err)
return true
}
return true
case "prometheus/api/v1/export":
exportRequests.Inc()
if err := prometheus.ExportHandler(startTime, at, w, r); err != nil {
@@ -519,14 +381,6 @@ func selectHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseW
return true
}
return true
case "prometheus/api/v1/export/native":
exportNativeRequests.Inc()
if err := prometheus.ExportNativeHandler(startTime, at, w, r); err != nil {
exportNativeErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
case "prometheus/api/v1/export/csv":
exportCSVRequests.Inc()
if err := prometheus.ExportCSVHandler(startTime, at, w, r); err != nil {
@@ -535,6 +389,14 @@ func selectHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseW
return true
}
return true
case "prometheus/api/v1/export/native":
exportNativeRequests.Inc()
if err := prometheus.ExportNativeHandler(startTime, at, w, r); err != nil {
exportNativeErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
case "prometheus/federate":
federateRequests.Inc()
if err := prometheus.FederateHandler(startTime, at, w, r); err != nil {
@@ -636,6 +498,167 @@ func selectHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseW
return true
}
return true
default:
return false
}
}
func handleStaticAndSimpleRequests(w http.ResponseWriter, r *http.Request, path string) bool {
if path == "/" {
if r.Method != http.MethodGet {
return false
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, `vmselect - a component of VictoriaMetrics cluster<br/>
<a href="https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html">docs</a><br>
`)
return true
}
if path == "/api/v1/status/top_queries" {
globalTopQueriesRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.QueryStatsHandler(nil, w, r); err != nil {
globalTopQueriesErrors.Inc()
sendPrometheusError(w, r, err)
return true
}
return true
}
if path == "/api/v1/status/active_queries" {
globalStatusActiveQueriesRequests.Inc()
httpserver.EnableCORS(w, r)
promql.ActiveQueriesHandler(nil, w, r)
return true
}
p, err := httpserver.ParsePath(path)
if err != nil {
return false
}
if p.Suffix == "" {
if r.Method != http.MethodGet {
return false
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, "<h2>VictoriaMetrics cluster - vmselect</h2></br>")
fmt.Fprintf(w, "See <a href='https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format'>docs</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
fmt.Fprintf(w, `<a href="vmui">Web UI</a><br>`)
fmt.Fprintf(w, `<a href="metric-relabel-debug">metric-level relabel debugging</a></br>`)
fmt.Fprintf(w, `<a href="target-relabel-debug">target-level relabel debugging</a></br>`)
fmt.Fprintf(w, `<a href="expand-with-exprs">WITH expressions' tutorial</a></br>`)
fmt.Fprintf(w, `<a href="prometheus/api/v1/status/tsdb">tsdb status page</a><br>`)
fmt.Fprintf(w, `<a href="prometheus/api/v1/status/top_queries">top queries</a><br>`)
fmt.Fprintf(w, `<a href="prometheus/api/v1/status/active_queries">active queries</a><br>`)
return true
}
if strings.HasPrefix(p.Suffix, "static") {
prefix := strings.Join([]string{"", p.Prefix, p.AuthToken}, "/")
http.StripPrefix(prefix, staticServer).ServeHTTP(w, r)
return true
}
if strings.HasPrefix(p.Suffix, "prometheus/static") {
prefix := strings.Join([]string{"", p.Prefix, p.AuthToken}, "/")
r.URL.Path = strings.Replace(r.URL.Path, "/prometheus/static", "/static", 1)
http.StripPrefix(prefix, staticServer).ServeHTTP(w, r)
return true
}
if p.Suffix == "vmui" || p.Suffix == "graph" || p.Suffix == "prometheus/vmui" || p.Suffix == "prometheus/graph" {
// VMUI access via incomplete url without `/` in the end. Redirect to complete url.
// Use relative redirect, since the hostname and path prefix may be incorrect if VictoriaMetrics
// is hidden behind vmauth or similar proxy.
_ = r.ParseForm()
suffix := strings.Replace(p.Suffix, "prometheus/", "../prometheus/", 1)
newURL := suffix + "/?" + r.Form.Encode()
httpserver.Redirect(w, newURL)
return true
}
if strings.HasPrefix(p.Suffix, "graph/") || strings.HasPrefix(p.Suffix, "prometheus/graph/") {
// This is needed for serving /graph URLs from Prometheus datasource in Grafana.
p.Suffix = strings.Replace(p.Suffix, "graph/", "vmui/", 1)
r.URL.Path = strings.Replace(r.URL.Path, "/graph/", "/vmui/", 1)
}
if p.Suffix == "vmui/custom-dashboards" || p.Suffix == "prometheus/vmui/custom-dashboards" {
if err := handleVMUICustomDashboards(w); err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
}
if strings.HasPrefix(p.Suffix, "vmui/") || strings.HasPrefix(p.Suffix, "prometheus/vmui/") {
// vmui access.
if strings.HasPrefix(p.Suffix, "vmui/static/") || strings.HasPrefix(p.Suffix, "prometheus/vmui/static/") {
// Allow clients caching static contents for long period of time, since it shouldn't change over time.
// Path to static contents (such as js and css) must be changed whenever its contents is changed.
// See https://developer.chrome.com/docs/lighthouse/performance/uses-long-cache-ttl/
w.Header().Set("Cache-Control", "max-age=31536000")
}
prefix := strings.Join([]string{"", p.Prefix, p.AuthToken}, "/")
r.URL.Path = strings.Replace(r.URL.Path, "/prometheus/vmui/", "/vmui/", 1)
http.StripPrefix(prefix, vmuiFileServer).ServeHTTP(w, r)
return true
}
if strings.HasPrefix(p.Suffix, "graphite/functions") {
funcName := p.Suffix[len("graphite/functions"):]
funcName = strings.TrimPrefix(funcName, "/")
if funcName == "" {
graphiteFunctionsRequests.Inc()
if err := graphite.FunctionsHandler(w, r); err != nil {
graphiteFunctionsErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
}
graphiteFunctionDetailsRequests.Inc()
if err := graphite.FunctionDetailsHandler(funcName, w, r); err != nil {
graphiteFunctionDetailsErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
}
if p.Suffix == "prometheus/vmalert" {
// vmalert access via incomplete url without `/` in the end. Redirect to complete url.
// Use relative redirect, since the hostname and path prefix may be incorrect if VictoriaMetrics
// is hidden behind vmauth or similar proxy.
path := "../" + p.Suffix + "/"
httpserver.Redirect(w, path)
return true
}
if strings.HasPrefix(p.Suffix, "prometheus/vmalert/") {
vmalertRequests.Inc()
if len(*vmalertProxyURL) == 0 {
w.WriteHeader(http.StatusBadRequest)
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"error","msg":"for accessing vmalert flag '-vmalert.proxyURL' must be configured"}`)
return true
}
proxyVMAlertRequests(w, r, p.Suffix)
return true
}
switch p.Suffix {
case "prometheus/api/v1/status/active_queries":
at, err := auth.NewToken(p.AuthToken)
if err != nil {
return false
}
statusActiveQueriesRequests.Inc()
httpserver.EnableCORS(w, r)
promql.ActiveQueriesHandler(at, w, r)
return true
case "prometheus/api/v1/status/top_queries":
at, err := auth.NewToken(p.AuthToken)
if err != nil {
return false
}
topQueriesRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.QueryStatsHandler(at, w, r); err != nil {
topQueriesErrors.Inc()
sendPrometheusError(w, r, err)
return true
}
return true
case "prometheus/metric-relabel-debug", "metric-relabel-debug":
promrelabelMetricRelabelDebugRequests.Inc()
metric := r.FormValue("metric")

View File

@@ -6,10 +6,12 @@ import (
"flag"
"fmt"
"io"
"math/bits"
"net"
"net/http"
"os"
"sort"
"strconv"
"strings"
"sync"
"sync/atomic"
@@ -64,12 +66,19 @@ type Result struct {
// Values are sorted by Timestamps.
Values []float64
Timestamps []int64
valuesBuf []float64
timestampsBuf []int64
// statistic for series updates
updateRows int
}
func (r *Result) reset() {
r.MetricName.Reset()
r.Values = r.Values[:0]
r.Timestamps = r.Timestamps[:0]
r.updateRows = 0
}
// Results holds results returned from ProcessSearchQuery.
@@ -110,7 +119,8 @@ type timeseriesWork struct {
f func(rs *Result, workerID uint) error
err error
rowsProcessed int
rowsProcessed int
updateRowsProcessed int
}
func (tsw *timeseriesWork) reset() {
@@ -120,6 +130,7 @@ func (tsw *timeseriesWork) reset() {
tsw.f = nil
tsw.err = nil
tsw.rowsProcessed = 0
tsw.updateRowsProcessed = 0
}
func getTimeseriesWork() *timeseriesWork {
@@ -150,6 +161,7 @@ func (tsw *timeseriesWork) do(r *Result, workerID uint) error {
atomic.StoreUint32(tsw.mustStop, 1)
return fmt.Errorf("error during time series unpacking: %w", err)
}
tsw.updateRowsProcessed = r.updateRows
tsw.rowsProcessed = len(r.Timestamps)
if len(r.Timestamps) > 0 {
if err := tsw.f(r, workerID); err != nil {
@@ -166,17 +178,23 @@ func timeseriesWorker(qt *querytracer.Tracer, workChs []chan *timeseriesWork, wo
// Perform own work at first.
rowsProcessed := 0
seriesProcessed := 0
updateGenerationsProcessed := 0
updateRowsProcessed := 0
ch := workChs[workerID]
for tsw := range ch {
tsw.err = tsw.do(&tmpResult.rs, workerID)
rowsProcessed += tsw.rowsProcessed
updateGenerationsProcessed += len(tsw.pts.updateAddrs)
updateRowsProcessed += tsw.updateRowsProcessed
seriesProcessed++
}
qt.Printf("own work processed: series=%d, samples=%d", seriesProcessed, rowsProcessed)
qt.Printf("own work processed: series=%d, samples=%d, update_generations=%d, update_rows=%d", seriesProcessed, rowsProcessed, updateGenerationsProcessed, updateRowsProcessed)
// Then help others with the remaining work.
rowsProcessed = 0
seriesProcessed = 0
updateGenerationsProcessed = 0
updateRowsProcessed = 0
for i := uint(1); i < uint(len(workChs)); i++ {
idx := (i + workerID) % uint(len(workChs))
ch := workChs[idx]
@@ -194,10 +212,12 @@ func timeseriesWorker(qt *querytracer.Tracer, workChs []chan *timeseriesWork, wo
}
tsw.err = tsw.do(&tmpResult.rs, workerID)
rowsProcessed += tsw.rowsProcessed
updateGenerationsProcessed += len(tsw.pts.updateAddrs)
updateRowsProcessed += tsw.updateRowsProcessed
seriesProcessed++
}
}
qt.Printf("others work processed: series=%d, samples=%d", seriesProcessed, rowsProcessed)
qt.Printf("others work processed: series=%d, samples=%d, update_generations=%d, update_rows=%d", seriesProcessed, rowsProcessed, updateGenerationsProcessed, updateRowsProcessed)
putTmpResult(tmpResult)
}
@@ -383,8 +403,139 @@ var (
)
type packedTimeseries struct {
metricName string
addrs []tmpBlockAddr
samples int
metricName string
addrs []tmpBlockAddr
updateAddrs sortedTimeseriesUpdateAddrs
}
type sortedTimeseriesUpdateAddrs [][]tmpBlockAddr
// merges dst Result with update result
// may allocate memory
// dst and update must be sorted by Timestamps
// keeps order by Timestamps
func mergeResult(dst, update *Result) {
// ignore timestamps with not enough points
if len(dst.Timestamps) == 0 || len(update.Timestamps) == 0 {
return
}
firstDstTs := dst.Timestamps[0]
lastDstTs := dst.Timestamps[len(dst.Timestamps)-1]
firstUpdateTs := update.Timestamps[0]
lastUpdateTs := update.Timestamps[len(update.Timestamps)-1]
// check lower bound
// [5,6,7] [1,2,3]
if lastUpdateTs <= firstDstTs {
// fast path
if lastUpdateTs == firstDstTs {
dst.Timestamps = dst.Timestamps[1:]
dst.Values = dst.Values[1:]
}
newLen := len(dst.Timestamps) + len(update.Timestamps)
dst.timestampsBuf = reuseMayAllocateInt64(dst.timestampsBuf, newLen)
dst.timestampsBuf = append(dst.timestampsBuf, update.Timestamps...)
dst.timestampsBuf = append(dst.timestampsBuf, dst.Timestamps...)
dst.valuesBuf = reuseMayAllocateFloat64(dst.valuesBuf, newLen)
dst.valuesBuf = append(dst.valuesBuf, update.Values...)
dst.valuesBuf = append(dst.valuesBuf, dst.Values...)
// swap buffers
dst.timestampsBuf, dst.Timestamps = dst.Timestamps, dst.timestampsBuf
dst.valuesBuf, dst.Values = dst.Values, dst.valuesBuf
return
}
// check upper bound
// fast path, memory allocation possible
// [1,2,3] [5,6,7]
if firstUpdateTs >= lastDstTs {
if firstUpdateTs == lastDstTs {
dst.Timestamps = dst.Timestamps[:len(dst.Timestamps)-1]
dst.Values = dst.Values[:len(dst.Values)-1]
}
dst.Timestamps = append(dst.Timestamps, update.Timestamps...)
dst.Values = append(dst.Values, update.Values...)
return
}
// changes inside given range
// [1,5,7] [2,3,6]
firstPos := position(dst.Timestamps, firstUpdateTs)
lastPos := position(dst.Timestamps, lastUpdateTs)
// corner case last timestamp overlaps
if lastPos != len(dst.Timestamps) && lastUpdateTs == dst.Timestamps[lastPos] {
lastPos++
}
headTs := dst.Timestamps[:firstPos]
tailTs := dst.Timestamps[lastPos:]
headVs := dst.Values[:firstPos]
tailValues := dst.Values[lastPos:]
newLen := len(headTs) + len(update.Timestamps) + len(tailTs)
dst.timestampsBuf = reuseMayAllocateInt64(dst.timestampsBuf, newLen)
dst.timestampsBuf = append(dst.timestampsBuf, headTs...)
dst.timestampsBuf = append(dst.timestampsBuf, update.Timestamps...)
dst.timestampsBuf = append(dst.timestampsBuf, tailTs...)
dst.valuesBuf = reuseMayAllocateFloat64(dst.valuesBuf, newLen)
dst.valuesBuf = append(dst.valuesBuf, headVs...)
dst.valuesBuf = append(dst.valuesBuf, update.Values...)
dst.valuesBuf = append(dst.valuesBuf, tailValues...)
// swap buffers
dst.timestampsBuf, dst.Timestamps = dst.Timestamps, dst.timestampsBuf
dst.valuesBuf, dst.Values = dst.Values, dst.valuesBuf
}
func reuseMayAllocateInt64(src []int64, n int) []int64 {
if n <= cap(src) {
return src[:0]
}
nNew := roundToNearestPow2(n)
bNew := make([]int64, nNew)
return bNew[:0]
}
func reuseMayAllocateFloat64(src []float64, n int) []float64 {
if n <= cap(src) {
return src[:0]
}
nNew := roundToNearestPow2(n)
bNew := make([]float64, nNew)
return bNew[:0]
}
func roundToNearestPow2(n int) int {
pow2 := uint8(bits.Len(uint(n - 1)))
return 1 << pow2
}
// position searches element position at given src with binary search
// copied and modified from sort.SearchInts
func position(src []int64, value int64) int {
// fast path
if len(src) < 1 || src[0] > value {
return 0
}
// Define f(-1) == false and f(n) == true.
// Invariant: f(i-1) == false, f(j) == true.
i, j := int64(0), int64(len(src))
for i < j {
h := int64(uint(i+j) >> 1) // avoid overflow when computing h
// i ≤ h < j
// return a[i] >= x
if !(src[h] >= value) {
i = h + 1 // preserves f(i-1) == false
} else {
j = h // preserves f(j) == true
}
}
// i == j, f(i-1) == false, and f(j) (= f(i)) == true => answer is i.
return int(i)
}
type unpackWork struct {
@@ -490,10 +641,165 @@ func (pts *packedTimeseries) Unpack(dst *Result, tbfs []*tmpBlocksFile, tr stora
}
dedupInterval := storage.GetDedupInterval()
mergeSortBlocks(dst, sbh, dedupInterval)
seriesUpdateSbss, err := pts.unpackUpdateAddrs(tbfs, tr)
if err != nil {
return fmt.Errorf("cannot unpack series updates: %w", err)
}
// apply updates
if len(seriesUpdateSbss) > 0 {
var updateDst Result
updateRows := 0
for _, seriesUpdateSbs := range seriesUpdateSbss {
updateDst.reset()
mergeSortBlocks(&updateDst, seriesUpdateSbs, dedupInterval)
updateRows += len(updateDst.Timestamps)
mergeResult(dst, &updateDst)
putSortBlocksHeap(seriesUpdateSbs)
}
dst.updateRows = updateRows
}
putSortBlocksHeap(sbh)
return nil
}
func (pts *packedTimeseries) unpackUpdateAddrs(tbfs []*tmpBlocksFile, tr storage.TimeRange) ([]*sortBlocksHeap, error) {
var upwsLen int
for i := range pts.updateAddrs {
upwsLen += len(pts.updateAddrs[i])
}
if upwsLen == 0 {
// Nothing to do
return nil, nil
}
initUnpackWork := func(upw *unpackWork, addr tmpBlockAddr) {
upw.tbfs = tbfs
upw.addr = addr
upw.tr = tr
}
dsts := make([]*sortBlocksHeap, 0, upwsLen)
samples := pts.samples
if gomaxprocs == 1 || upwsLen < 1000 {
// It is faster to unpack all the data in the current goroutine.
upw := getUnpackWork()
tmpBlock := getTmpStorageBlock()
var err error
for _, addrBlock := range pts.updateAddrs {
dst := getSortBlocksHeap()
for _, addr := range addrBlock {
initUnpackWork(upw, addr)
upw.unpack(tmpBlock)
if upw.err != nil {
return dsts, upw.err
}
samples += len(upw.sb.Timestamps)
if *maxSamplesPerSeries > 0 && samples > *maxSamplesPerSeries {
putSortBlock(upw.sb)
err = fmt.Errorf("cannot process more than %d samples per series; either increase -search.maxSamplesPerSeries "+
"or reduce time range for the query", *maxSamplesPerSeries)
break
}
dst.sbs = append(dst.sbs, upw.sb)
upw.reset()
}
dsts = append(dsts, dst)
}
putTmpStorageBlock(tmpBlock)
putUnpackWork(upw)
return dsts, err
}
// Slow path - spin up multiple local workers for parallel data unpacking.
// Do not use global workers pool, since it increases inter-CPU memory ping-poing,
// which reduces the scalability on systems with many CPU cores.
// Prepare the work for workers.
upwss := make([][]*unpackWork, len(pts.updateAddrs))
var workItems int
for i, addrs := range pts.updateAddrs {
upws := make([]*unpackWork, len(addrs))
for j, addr := range addrs {
workItems++
upw := getUnpackWork()
initUnpackWork(upw, addr)
upws[j] = upw
}
upwss[i] = upws
}
// Prepare worker channels.
workers := upwsLen
if workers > gomaxprocs {
workers = gomaxprocs
}
if workers < 1 {
workers = 1
}
itemsPerWorker := (workItems + workers - 1) / workers
workChs := make([]chan *unpackWork, workers)
for i := range workChs {
workChs[i] = make(chan *unpackWork, itemsPerWorker)
}
// Spread work among worker channels.
i := 0
for _, upws := range upwss {
for _, upw := range upws {
idx := i % len(workChs)
i++
workChs[idx] <- upw
}
}
// Mark worker channels as closed.
for _, workCh := range workChs {
close(workCh)
}
// Start workers and wait until they finish the work.
var wg sync.WaitGroup
for i := 0; i < workers; i++ {
wg.Add(1)
go func(workerID uint) {
unpackWorker(workChs, workerID)
wg.Done()
}(uint(i))
}
wg.Wait()
// Collect results.
var firstErr error
for _, upws := range upwss {
sbh := getSortBlocksHeap()
for _, upw := range upws {
if upw.err != nil && firstErr == nil {
// Return the first error only, since other errors are likely the same.
firstErr = upw.err
}
if firstErr == nil {
sb := upw.sb
samples += len(sb.Timestamps)
if *maxSamplesPerSeries > 0 && samples > *maxSamplesPerSeries {
putSortBlock(sb)
firstErr = fmt.Errorf("cannot process more than %d samples per series; either increase -search.maxSamplesPerSeries "+
"or reduce time range for the query", *maxSamplesPerSeries)
} else {
sbh.sbs = append(sbh.sbs, sb)
}
} else {
putSortBlock(upw.sb)
}
putUnpackWork(upw)
}
dsts = append(dsts, sbh)
}
if firstErr != nil {
for _, sbh := range dsts {
putSortBlocksHeap(sbh)
}
}
return dsts, nil
}
func (pts *packedTimeseries) unpackTo(dst []*sortBlock, tbfs []*tmpBlocksFile, tr storage.TimeRange) ([]*sortBlock, error) {
upwsLen := len(pts.addrs)
if upwsLen == 0 {
@@ -605,6 +911,7 @@ func (pts *packedTimeseries) unpackTo(dst []*sortBlock, tbfs []*tmpBlocksFile, t
}
putUnpackWork(upw)
}
pts.samples = samples
return dst, firstErr
}
@@ -1349,6 +1656,12 @@ type tmpBlocksFileWrapper struct {
tbfs []*tmpBlocksFile
ms []map[string]*blockAddrs
orderedMetricNamess [][]string
// mu protects series updates
// it shouldn't cause cpu contention
// usually series updates are small
mu sync.Mutex
// updates grouped by metric name and generation ID
seriesUpdatesByMetricName map[string]map[int64][]tmpBlockAddr
}
type blockAddrs struct {
@@ -1373,9 +1686,10 @@ func newTmpBlocksFileWrapper(sns []*storageNode) *tmpBlocksFileWrapper {
ms[i] = make(map[string]*blockAddrs)
}
return &tmpBlocksFileWrapper{
tbfs: tbfs,
ms: ms,
orderedMetricNamess: make([][]string, n),
tbfs: tbfs,
ms: ms,
seriesUpdatesByMetricName: make(map[string]map[int64][]tmpBlockAddr),
orderedMetricNamess: make([][]string, n),
}
}
@@ -1390,6 +1704,38 @@ func (tbfw *tmpBlocksFileWrapper) RegisterAndWriteBlock(mb *storage.MetricBlock,
// Do not intern mb.MetricName, since it leads to increased memory usage.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692
metricName := mb.MetricName
var generationID int64
mn := storage.GetMetricName()
defer storage.PutMetricName(mn)
if err := mn.Unmarshal(metricName); err != nil {
return fmt.Errorf("cannot unmarshal metricName: %q %w", metricName, err)
}
generationIDTag := mn.RemoveTagWithResult(`__generation_id`)
if generationIDTag != nil {
generationID, err = strconv.ParseInt(string(generationIDTag.Value), 10, 64)
if err != nil {
return fmt.Errorf("cannot parse __generation_id label value: %s : %w", generationIDTag.Value, err)
}
metricName = mn.Marshal(metricName[:0])
}
// process data blocks with metric updates
// TODO profile it, probably it's better to replace mutex with per worker lock-free struct
if generationID > 0 {
tbfw.mu.Lock()
defer tbfw.mu.Unlock()
ups := tbfw.seriesUpdatesByMetricName[string(metricName)]
if ups == nil {
// fast path
tbfw.seriesUpdatesByMetricName[string(metricName)] = map[int64][]tmpBlockAddr{generationID: {addr}}
return nil
}
// todo memory optimization for metricNames, use interning?
addrs := tbfw.seriesUpdatesByMetricName[string(metricName)][generationID]
addrs = append(addrs, addr)
tbfw.seriesUpdatesByMetricName[string(metricName)][generationID] = addrs
return nil
}
m := tbfw.ms[workerID]
addrs := m[string(metricName)]
if addrs == nil {
@@ -1463,6 +1809,7 @@ func ExportBlocks(qt *querytracer.Tracer, sq *storage.SearchQuery, deadline sear
if err := mn.Unmarshal(mb.MetricName); err != nil {
return fmt.Errorf("cannot unmarshal metricName: %w", err)
}
if err := f(mn, &mb.Block, tr, workerID); err != nil {
return err
}
@@ -1589,7 +1936,7 @@ func ProcessSearchQuery(qt *querytracer.Tracer, denyPartialResponse bool, sq *st
if err != nil {
return nil, false, fmt.Errorf("cannot finalize temporary blocks files: %w", err)
}
qt.Printf("fetch unique series=%d, blocks=%d, samples=%d, bytes=%d", len(addrsByMetricName), blocksRead.GetTotal(), samples.GetTotal(), bytesTotal)
qt.Printf("fetch unique series=%d, blocks=%d, samples=%d, bytes=%d, unique_series_with_updates=%d", len(addrsByMetricName), blocksRead.GetTotal(), samples.GetTotal(), bytesTotal, len(tbfw.seriesUpdatesByMetricName))
var rss Results
rss.tr = tr
@@ -1597,9 +1944,25 @@ func ProcessSearchQuery(qt *querytracer.Tracer, denyPartialResponse bool, sq *st
rss.tbfs = tbfw.tbfs
pts := make([]packedTimeseries, len(orderedMetricNames))
for i, metricName := range orderedMetricNames {
seriesUpdateGenerations := tbfw.seriesUpdatesByMetricName[metricName]
var stua sortedTimeseriesUpdateAddrs
if len(seriesUpdateGenerations) > 0 {
stua = make(sortedTimeseriesUpdateAddrs, 0, len(seriesUpdateGenerations))
orderedGenerationIDs := make([]int64, 0, len(seriesUpdateGenerations))
for generationID := range seriesUpdateGenerations {
orderedGenerationIDs = append(orderedGenerationIDs, generationID)
}
sort.Slice(orderedGenerationIDs, func(i, j int) bool {
return orderedGenerationIDs[i] < orderedGenerationIDs[j]
})
for _, genID := range orderedGenerationIDs {
stua = append(stua, seriesUpdateGenerations[genID])
}
}
pts[i] = packedTimeseries{
metricName: metricName,
addrs: addrsByMetricName[metricName].addrs,
metricName: metricName,
addrs: addrsByMetricName[metricName].addrs,
updateAddrs: stua,
}
}
rss.packedTimeseries = pts

View File

@@ -5,6 +5,9 @@ import (
"reflect"
"runtime"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
func TestInitStopNodes(t *testing.T) {
@@ -47,7 +50,6 @@ func TestMergeSortBlocks(t *testing.T) {
t.Fatalf("unexpected timestamps;\ngot\n%v\nwant\n%v", result.Timestamps, expectedResult.Timestamps)
}
}
// Zero blocks
f(nil, 1, &Result{})
@@ -221,3 +223,265 @@ func TestMergeSortBlocks(t *testing.T) {
Values: []float64{7, 24, 26},
})
}
func TestMergeResult(t *testing.T) {
f := func(name string, dst, update, expect *Result) {
t.Helper()
t.Run(name, func(t *testing.T) {
mergeResult(dst, update)
if !reflect.DeepEqual(dst.Values, expect.Values) || !reflect.DeepEqual(dst.Timestamps, expect.Timestamps) {
t.Fatalf(" unexpected result \ngot: \n%v\nwant: \n%v", dst, expect)
}
})
}
f("append and replace",
&Result{Timestamps: []int64{1, 2}, Values: []float64{5.0, 6.0}},
&Result{Timestamps: []int64{2, 3}, Values: []float64{10.0, 30.0}},
&Result{Timestamps: []int64{1, 2, 3}, Values: []float64{5.0, 10.0, 30.0}})
f("extend and replace overlap",
&Result{Timestamps: []int64{2, 3}, Values: []float64{10.0, 30.0}},
&Result{Timestamps: []int64{1, 2}, Values: []float64{5.0, 6.0}},
&Result{Timestamps: []int64{1, 2, 3}, Values: []float64{5.0, 6.0, 30.0}})
f("extend and replace",
&Result{Timestamps: []int64{1, 2, 3}, Values: []float64{5.0, 6.0, 7.0}},
&Result{Timestamps: []int64{0, 1, 2}, Values: []float64{10.0, 15.0, 30.0}},
&Result{Timestamps: []int64{0, 1, 2, 3}, Values: []float64{10.0, 15.0, 30.0, 7.0}})
f("update single point",
&Result{Timestamps: []int64{1, 2, 3}, Values: []float64{5.0, 6.0, 7.0}},
&Result{Timestamps: []int64{15}, Values: []float64{35.0}},
&Result{Timestamps: []int64{1, 2, 3, 15}, Values: []float64{5.0, 6.0, 7.0, 35.0}})
f("append",
&Result{Timestamps: []int64{6, 7, 8}, Values: []float64{10.0, 15.0, 30.0}},
&Result{Timestamps: []int64{1, 2, 3}, Values: []float64{5.0, 6.0, 7.0}},
&Result{Timestamps: []int64{1, 2, 3, 6, 7, 8}, Values: []float64{5, 6, 7, 10, 15, 30}})
f("extend",
&Result{Timestamps: []int64{1, 2, 3}, Values: []float64{5.0, 6.0, 7.0}},
&Result{Timestamps: []int64{6, 7, 8}, Values: []float64{10.0, 15.0, 30.0}},
&Result{Timestamps: []int64{1, 2, 3, 6, 7, 8}, Values: []float64{5, 6, 7, 10, 15, 30}})
f("fast path",
&Result{},
&Result{Timestamps: []int64{1, 2, 3}},
&Result{})
f("merge at the middle",
&Result{Timestamps: []int64{1, 2, 5, 6, 10, 15}, Values: []float64{.1, .2, .3, .4, .5, .6}},
&Result{Timestamps: []int64{2, 6, 9, 10}, Values: []float64{1.1, 1.2, 1.3, 1.4}},
&Result{Timestamps: []int64{1, 2, 6, 9, 10, 15}, Values: []float64{.1, 1.1, 1.2, 1.3, 1.4, 0.6}})
f("merge and re-allocate",
&Result{
Timestamps: []int64{10, 20, 30, 50, 60, 90},
Values: []float64{1.1, 1.2, 1.3, 1.4, 1.5, 1.6},
},
&Result{
Timestamps: []int64{20, 30, 35, 45, 50, 55, 60},
Values: []float64{2.0, 2.3, 2.35, 2.45, 2.5, 2.55, 2.6},
},
&Result{
Timestamps: []int64{10, 20, 30, 35, 45, 50, 55, 60, 90},
Values: []float64{1.1, 2.0, 2.3, 2.35, 2.45, 2.50, 2.55, 2.6, 1.6},
})
f("update at the end",
&Result{
Timestamps: []int64{10, 20, 30, 40, 50, 60, 90},
Values: []float64{2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.9},
},
&Result{
Timestamps: []int64{50, 70, 100},
Values: []float64{0.0, 5.7, 5.1},
},
&Result{
Timestamps: []int64{10, 20, 30, 40, 50, 70, 100},
Values: []float64{2.1, 2.2, 2.3, 2.4, 0.0, 5.7, 5.1},
})
}
func TestPackedTimeseries_Unpack(t *testing.T) {
createBlock := func(ts []int64, vs []int64) *storage.Block {
tsid := &storage.TSID{
MetricID: 234211,
}
scale := int16(0)
precisionBits := uint8(8)
var b storage.Block
b.Init(tsid, ts, vs, scale, precisionBits)
_, _, _ = b.MarshalData(0, 0)
return &b
}
tr := storage.TimeRange{
MinTimestamp: 0,
MaxTimestamp: 1<<63 - 1,
}
var mn storage.MetricName
mn.MetricGroup = []byte("foobar")
metricName := string(mn.Marshal(nil))
type blockData struct {
timestamps []int64
values []int64
}
isValuesEqual := func(got, want []float64) bool {
equal := true
if len(got) != len(want) {
return false
}
for i, v := range want {
gotV := got[i]
if v == gotV {
continue
}
if decimal.IsStaleNaN(v) && decimal.IsStaleNaN(gotV) {
continue
}
equal = false
}
return equal
}
f := func(name string, dataBlocks []blockData, updateBlocks []blockData, wantResult *Result) {
t.Run(name, func(t *testing.T) {
pts := packedTimeseries{
metricName: metricName,
}
var dst Result
tbf := tmpBlocksFile{
buf: make([]byte, 0, 20*1024*1024),
}
for _, dataBlock := range dataBlocks {
bb := createBlock(dataBlock.timestamps, dataBlock.values)
addr, err := tbf.WriteBlockData(storage.MarshalBlock(nil, bb), 0)
if err != nil {
t.Fatalf("cannot write block: %s", err)
}
pts.addrs = append(pts.addrs, addr)
}
var updateAddrs []tmpBlockAddr
for _, updateBlock := range updateBlocks {
bb := createBlock(updateBlock.timestamps, updateBlock.values)
addr, err := tbf.WriteBlockData(storage.MarshalBlock(nil, bb), 0)
if err != nil {
t.Fatalf("cannot write update block: %s", err)
}
updateAddrs = append(updateAddrs, addr)
}
if len(updateAddrs) > 0 {
pts.updateAddrs = append(pts.updateAddrs, updateAddrs)
}
if err := pts.Unpack(&dst, []*tmpBlocksFile{&tbf}, tr); err != nil {
t.Fatalf("unexpected error at series unpack: %s", err)
}
if !reflect.DeepEqual(wantResult, &dst) && !isValuesEqual(wantResult.Values, dst.Values) {
t.Fatalf("unexpected result for unpack \nwant: \n%v\ngot: \n%v\n", wantResult, &dst)
}
})
}
f("2 blocks without updates",
[]blockData{
{
timestamps: []int64{10, 15, 30},
values: []int64{1, 2, 3},
},
{
timestamps: []int64{35, 40, 45},
values: []int64{4, 5, 6},
},
},
nil,
&Result{
MetricName: mn,
Values: []float64{1, 2, 3, 4, 5, 6},
Timestamps: []int64{10, 15, 30, 35, 40, 45},
})
f("2 blocks at the border of time range",
[]blockData{
{
timestamps: []int64{10, 15, 30},
values: []int64{1, 2, 3},
},
{
timestamps: []int64{35, 40, 45},
values: []int64{4, 5, 6},
},
},
[]blockData{
{
timestamps: []int64{10},
values: []int64{16},
},
},
&Result{
MetricName: mn,
Values: []float64{16, 2, 3, 4, 5, 6},
Timestamps: []int64{10, 15, 30, 35, 40, 45},
})
f("2 blocks with update",
[]blockData{
{
timestamps: []int64{10, 15, 30},
values: []int64{1, 2, 3},
},
{
timestamps: []int64{35, 40, 45},
values: []int64{4, 5, 6},
},
},
[]blockData{
{
timestamps: []int64{15, 30},
values: []int64{11, 12},
},
},
&Result{
MetricName: mn,
Values: []float64{1, 11, 12, 4, 5, 6},
Timestamps: []int64{10, 15, 30, 35, 40, 45},
})
f("2 blocks with 2 update blocks",
[]blockData{
{
timestamps: []int64{10, 15, 30},
values: []int64{1, 2, 3},
},
{
timestamps: []int64{35, 40, 65},
values: []int64{4, 5, 6},
},
},
[]blockData{
{
timestamps: []int64{15, 30},
values: []int64{11, 12},
},
{
timestamps: []int64{45, 55},
values: []int64{21, 22},
},
},
&Result{
MetricName: mn,
Values: []float64{1, 11, 12, 21, 22, 6},
Timestamps: []int64{10, 15, 30, 45, 55, 65},
})
}
func TestPosition(t *testing.T) {
f := func(src []int64, value, wantPosition int64) {
t.Helper()
gotPos := position(src, value)
if wantPosition != int64(gotPos) {
t.Fatalf("incorrect position: \ngot:\n%d\nwant: \n%d", gotPos, wantPosition)
}
if gotPos == len(src) {
_ = src[int64(gotPos)-1]
} else {
_ = src[int64(gotPos)]
}
}
f([]int64{1, 2, 3, 4}, 5, 4)
f([]int64{1, 2, 3, 4}, 0, 0)
f([]int64{1, 2, 3, 4}, 1, 0)
f([]int64{1, 2, 3, 4}, 4, 3)
f([]int64{1, 2, 3, 4}, 3, 2)
f([]int64{10, 20, 30, 40, 50, 60, 90}, 100, 7)
}

View File

@@ -2,6 +2,7 @@ package netstorage
import (
"fmt"
"reflect"
"testing"
)
@@ -105,3 +106,83 @@ func benchmarkMergeSortBlocks(b *testing.B, blocks []*sortBlock) {
}
})
}
func BenchmarkMergeResults(b *testing.B) {
b.ReportAllocs()
f := func(name string, dst, update, expect *Result) {
if len(dst.Timestamps) != len(dst.Values) {
b.Fatalf("bad input data, timestamps and values lens must match")
}
if len(update.Values) != len(update.Timestamps) {
b.Fatalf("bad input data, update timestamp and values must match")
}
toMerge := Result{
valuesBuf: make([]float64, 0, 256),
timestampsBuf: make([]int64, 0, 256),
}
b.Run(name, func(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
toMerge.reset()
toMerge.Values = append(toMerge.Values, dst.Values...)
toMerge.Timestamps = append(toMerge.Timestamps, dst.Timestamps...)
mergeResult(&toMerge, update)
if !reflect.DeepEqual(toMerge.Timestamps, expect.Timestamps) || !reflect.DeepEqual(toMerge.Values, expect.Values) {
b.Fatalf("unexpected result, got \ntimestamps: \n%v\nvalues: \n%v\nwant: timestamps: \n%v\n values: \n%v", toMerge.Timestamps, toMerge.Values, expect.Timestamps, expect.Values)
}
}
})
}
f("update at the start",
&Result{
Timestamps: []int64{10, 20, 30, 40, 50, 60, 90},
Values: []float64{2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.9},
},
&Result{
Timestamps: []int64{0, 20, 40},
Values: []float64{0.0, 5.2, 5.4},
},
&Result{
Timestamps: []int64{0, 20, 40, 50, 60, 90},
Values: []float64{0.0, 5.2, 5.4, 2.5, 2.6, 2.9},
})
f("update at the end",
&Result{
Timestamps: []int64{10, 20, 30, 40, 50, 60, 90},
Values: []float64{2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.9},
},
&Result{
Timestamps: []int64{50, 70, 100},
Values: []float64{0.0, 5.7, 5.1},
},
&Result{
Timestamps: []int64{10, 20, 30, 40, 50, 70, 100},
Values: []float64{2.1, 2.2, 2.3, 2.4, 0.0, 5.7, 5.1},
})
f("update at the middle",
&Result{
Timestamps: []int64{10, 20, 30, 40, 50, 60, 90},
Values: []float64{2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.9},
},
&Result{
Timestamps: []int64{30, 40, 50, 60},
Values: []float64{5.3, 5.4, 5.5, 5.6},
},
&Result{
Timestamps: []int64{10, 20, 30, 40, 50, 60, 90},
Values: []float64{2.1, 2.2, 5.3, 5.4, 5.5, 5.6, 2.9},
})
f("merge and re-allocate",
&Result{
Timestamps: []int64{10, 20, 30, 50, 60, 90},
Values: []float64{1.1, 1.2, 1.3, 1.4, 1.5, 1.6},
},
&Result{
Timestamps: []int64{20, 30, 35, 45, 50, 55, 60},
Values: []float64{2.0, 2.3, 2.35, 2.45, 2.5, 2.55, 2.6},
},
&Result{
Timestamps: []int64{10, 20, 30, 35, 45, 50, 55, 60, 90},
Values: []float64{1.1, 2.0, 2.3, 2.35, 2.45, 2.50, 2.55, 2.6, 1.6},
})
}

View File

@@ -0,0 +1 @@
{"metric":{"__name__":"promhttp_metric_handler_requests_total","job":"node_exporter-6","up":"true"},"values":[131],"timestamps":[1676050756785]}

View File

@@ -1162,9 +1162,7 @@ func getLatencyOffsetMilliseconds(r *http.Request) (int64, error) {
}
// QueryStatsHandler returns query stats at `/api/v1/status/top_queries`
func QueryStatsHandler(startTime time.Time, at *auth.Token, w http.ResponseWriter, r *http.Request) error {
defer queryStatsDuration.UpdateDuration(startTime)
func QueryStatsHandler(at *auth.Token, w http.ResponseWriter, r *http.Request) error {
topN := 20
topNStr := r.FormValue("topN")
if len(topNStr) > 0 {
@@ -1193,8 +1191,6 @@ func QueryStatsHandler(startTime time.Time, at *auth.Token, w http.ResponseWrite
return nil
}
var queryStatsDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/status/top_queries"}`)
// commonParams contains common parameters for all /api/v1/* handlers
//
// timeout, start, end, match[], extra_label, extra_filters[]

View File

@@ -863,6 +863,17 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run("day_of_year()", func(t *testing.T) {
t.Parallel()
q := `day_of_year(time()*1e4)`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{116, 139, 163, 186, 209, 232},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run("days_in_month()", func(t *testing.T) {
t.Parallel()
q := `days_in_month(time()*2e4)`

View File

@@ -42,6 +42,7 @@ var transformFuncs = map[string]transformFunc{
"cosh": newTransformFuncOneArg(transformCosh),
"day_of_month": newTransformFuncDateTime(transformDayOfMonth),
"day_of_week": newTransformFuncDateTime(transformDayOfWeek),
"day_of_year": newTransformFuncDateTime(transformDayOfYear),
"days_in_month": newTransformFuncDateTime(transformDaysInMonth),
"deg": newTransformFuncOneArg(transformDeg),
"drop_common_labels": transformDropCommonLabels,
@@ -353,6 +354,10 @@ func newTransformFuncDateTime(f func(t time.Time) int) transformFunc {
}
}
func transformDayOfYear(t time.Time) int {
return t.YearDay()
}
func transformDayOfMonth(t time.Time) int {
return t.Day()
}
@@ -2743,14 +2748,15 @@ func copyTimeseriesMetricNames(tss []*timeseries, makeCopy bool) []*timeseries {
return rvs
}
// copyTimeseriesShallow returns a copy of arg with shallow copies of MetricNames,
// Timestamps and Values.
func copyTimeseriesShallow(arg []*timeseries) []*timeseries {
rvs := make([]*timeseries, len(arg))
for i, src := range arg {
var dst timeseries
dst.CopyShallow(src)
rvs[i] = &dst
// copyTimeseriesShallow returns a copy of src with shallow copies of MetricNames, Timestamps and Values.
func copyTimeseriesShallow(src []*timeseries) []*timeseries {
tss := make([]timeseries, len(src))
for i, src := range src {
tss[i].CopyShallow(src)
}
rvs := make([]*timeseries, len(tss))
for i := range tss {
rvs[i] = &tss[i]
}
return rvs
}

View File

@@ -2,8 +2,8 @@
DOCKER_NAMESPACE ?= victoriametrics
ROOT_IMAGE ?= alpine:3.18.4
CERTS_IMAGE := alpine:3.18.4
ROOT_IMAGE ?= alpine:3.18.5
CERTS_IMAGE := alpine:3.18.5
GO_BUILDER_IMAGE := golang:1.21.4-alpine
BUILDER_IMAGE := local/builder:2.0.0-$(shell echo $(GO_BUILDER_IMAGE) | tr :/ __)-1
@@ -68,24 +68,29 @@ package-via-docker: package-base
--tag $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(APP_SUFFIX)$(RACE) \
-f app/$(APP_NAME)/deployment/Dockerfile bin)
publish-via-docker: \
app-via-docker-linux-amd64 \
app-via-docker-linux-arm \
app-via-docker-linux-arm64 \
app-via-docker-linux-ppc64le \
app-via-docker-linux-386
publish-via-docker:
$(MAKE_PARALLEL) app-via-docker-linux-amd64 \
app-via-docker-linux-arm \
app-via-docker-linux-arm64 \
app-via-docker-linux-ppc64le \
app-via-docker-linux-386
$(DOCKER) buildx build \
--platform=linux/amd64,linux/arm,linux/arm64,linux/ppc64le,linux/386 \
--build-arg certs_image=$(CERTS_IMAGE) \
--build-arg root_image=$(ROOT_IMAGE) \
--build-arg APP_NAME=$(APP_NAME) \
--tag $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE) \
--tag $(DOCKER_NAMESPACE)/$(APP_NAME):$(LATEST_TAG)$(RACE) \
-o type=image \
--provenance=false \
-f app/$(APP_NAME)/multiarch/Dockerfile \
--push \
bin
cd bin && rm -rf \
$(APP_NAME)-linux-amd64-prod \
$(APP_NAME)-linux-arm-prod \
$(APP_NAME)-linux-arm64-prod \
$(APP_NAME)-linux-ppc64le-prod \
$(APP_NAME)-linux-386-prod
run-via-docker: package-via-docker
$(DOCKER_RUN) -it --rm \

View File

@@ -2,7 +2,7 @@ version: '3.5'
services:
vmagent:
container_name: vmagent
image: victoriametrics/vmagent:v1.95.0
image: victoriametrics/vmagent:v1.95.1
depends_on:
- "vminsert"
ports:
@@ -32,7 +32,7 @@ services:
vmstorage-1:
container_name: vmstorage-1
image: victoriametrics/vmstorage:v1.95.0-cluster
image: victoriametrics/vmstorage:v1.95.1-cluster
ports:
- 8482
- 8400
@@ -44,7 +44,7 @@ services:
restart: always
vmstorage-2:
container_name: vmstorage-2
image: victoriametrics/vmstorage:v1.95.0-cluster
image: victoriametrics/vmstorage:v1.95.1-cluster
ports:
- 8482
- 8400
@@ -57,7 +57,7 @@ services:
vminsert:
container_name: vminsert
image: victoriametrics/vminsert:v1.95.0-cluster
image: victoriametrics/vminsert:v1.95.1-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -70,7 +70,7 @@ services:
vmselect-1:
container_name: vmselect-1
image: victoriametrics/vmselect:v1.95.0-cluster
image: victoriametrics/vmselect:v1.95.1-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -84,7 +84,7 @@ services:
vmselect-2:
container_name: vmselect-2
image: victoriametrics/vmselect:v1.95.0-cluster
image: victoriametrics/vmselect:v1.95.1-cluster
depends_on:
- "vmstorage-1"
- "vmstorage-2"
@@ -98,7 +98,7 @@ services:
vmauth:
container_name: vmauth
image: victoriametrics/vmauth:v1.95.0
image: victoriametrics/vmauth:v1.95.1
depends_on:
- "vmselect-1"
- "vmselect-2"
@@ -113,7 +113,7 @@ services:
vmalert:
container_name: vmalert
image: victoriametrics/vmalert:v1.95.0
image: victoriametrics/vmalert:v1.95.1
depends_on:
- "vmauth"
ports:

View File

@@ -2,7 +2,7 @@ version: "3.5"
services:
vmagent:
container_name: vmagent
image: victoriametrics/vmagent:v1.95.0
image: victoriametrics/vmagent:v1.95.1
depends_on:
- "victoriametrics"
ports:
@@ -18,7 +18,7 @@ services:
restart: always
victoriametrics:
container_name: victoriametrics
image: victoriametrics/victoria-metrics:v1.95.0
image: victoriametrics/victoria-metrics:v1.95.1
ports:
- 8428:8428
- 8089:8089
@@ -57,7 +57,7 @@ services:
restart: always
vmalert:
container_name: vmalert
image: victoriametrics/vmalert:v1.95.0
image: victoriametrics/vmalert:v1.95.1
depends_on:
- "victoriametrics"
- "alertmanager"

View File

@@ -46,7 +46,7 @@ services:
- '--config=/config.yml'
vmsingle:
image: victoriametrics/victoria-metrics:v1.95.0
image: victoriametrics/victoria-metrics:v1.95.1
ports:
- '8428:8428'
command:

View File

@@ -19,8 +19,8 @@ On the server:
* VictoriaMetrics is running on ports: 8428, 8089, 4242, 2003 and they are bound to the local interface.
********************************************************************************
# This image includes 1.95.0 version of VictoriaMetrics.
# See Release notes https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.95.0
# This image includes 1.95.1 version of VictoriaMetrics.
# See Release notes https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.95.1
# Welcome to VictoriaMetrics droplet!

View File

@@ -83,6 +83,7 @@ See also [case studies](https://docs.victoriametrics.com/CaseStudies.html).
* [Solving metrics at scale with VictoriaMetrics](https://sarthak-acoustic.medium.com/solving-metrics-at-scale-with-victoriametrics-ac9c306826c3)
* [VictoriaMetrics: a comprehensive guide](https://medium.com/@seifeddinerajhi/victoriametrics-a-comprehensive-guide-comparing-it-to-prometheus-and-implementing-kubernetes-03eb8feb0cc2)
* [Unleashing VM histograms for Ruby: Migrating from Prometheus to VictoriaMetrics with vm-client](https://hackernoon.com/unleashing-vm-histograms-for-ruby-migrating-from-prometheus-to-victoriametrics-with-vm-client)
* [Observe and record performance of Spark jobs with Victoria Metrics](https://medium.com/constructor-engineering/observe-and-record-performance-of-databricks-jobs-11ffe236555e)
## Our articles

View File

@@ -26,7 +26,44 @@ Metrics of the latest version of VictoriaMetrics cluster are available for viewi
The sandbox cluster installation is running under the constant load generated by
[prometheus-benchmark](https://github.com/VictoriaMetrics/prometheus-benchmark) and used for testing latest releases.
## tip
## v1.93.x long-time support release (LTS)
## v1.93.1 long-time support release (LTS)
* BUGFIX: [storage](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html): properly set next retention time for indexDB. Previously it may enter into endless retention loop. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4873) for details.
* BUGFIX: do not allow starting VictoriaMetrics components with improperly set boolean command-line flags in the form `-boolFlagName value`, since this leads to silent incomplete flags' parsing. This form should be replaced with `-boolFlagName=value`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4845).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly set labels from `-remoteWrite.label` command-line flag just before sending samples to the configured `-remoteWrite.url` according to [these docs](https://docs.victoriametrics.com/vmagent.html#adding-labels-to-metrics). Previously these labels were incorrectly set before [the relabeling](https://docs.victoriametrics.com/vmagent.html#relabeling) configured via `-remoteWrite.urlRelabelConfigs` and [the stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) configured via `-remoteWrite.streamAggr.config`, so these labels could be lost or incorrectly transformed before sending the samples to remote storage. The fix allows using `-remoteWrite.label` for identifying `vmagent` instances in [cluster mode](https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets). See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247) and [these docs](https://docs.victoriametrics.com/stream-aggregation.html#cluster-mode) for more details.
* BUGFIX: remove `DEBUG` logging when parsing `if` filters inside [relabeling rules](https://docs.victoriametrics.com/vmagent.html#relabeling-enhancements) and when parsing `match` filters inside [stream aggregation rules](https://docs.victoriametrics.com/stream-aggregation.html).
* BUGFIX: properly replace `:` chars in label names with `_` when `-usePromCompatibleNaming` command-line flag is passed to `vmagent`, `vminsert` or single-node VictoriaMetrics. This addresses [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113#issuecomment-1275077071).
* BUGFIX: [vmbackup](https://docs.victoriametrics.com/vmbackup.html): correctly check if specified `-dst` belongs to specified `-storageDataPath`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4837).
* BUGFIX: [vmctl](https://docs.victoriametrics.com/vmctl.html): don't interrupt the migration process if no metrics were found for a specific tenant. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4796).
* SECURITY: upgrade base docker image (Alpine) from 3.18.4 to 3.18.5. See [alpine 3.18.5 release notes](https://www.alpinelinux.org/posts/Alpine-3.15.11-3.16.8-3.17.6-3.18.5-released.html).
* FEATURE: `vmselect`: allow opening [vmui](https://docs.victoriametrics.com/#vmui) and investigating [Top queries](https://docs.victoriametrics.com/#top-queries) and [Active queries](https://docs.victoriametrics.com/#active-queries) when the `vmselect` is overloaded with concurrent queries (e.g. when more than `-search.maxConcurrentRequests` concurrent queries are executed). Previously an attempt to open `Top queries` or `Active queries` at `vmui` could result in `couldn't start executing the request in ... seconds, since -search.maxConcurrentRequests=... concurrent requests are executed` error, which could complicate debugging of overloaded `vmselect` or single-node VictoriaMetrics.
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add `-enableMultitenantHandlers` command-line flag, which allows receiving data via [VictoriaMetrics cluster urls](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format) at `vmagent` and converting [tenant ids](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy) to (`vm_account_id`, `vm_project_id`) labels before sending the data to the configured `-remoteWrite.url`. See [these docs](https://docs.victoriametrics.com/vmagent.html#multitenancy) for details.
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add `-remoteWrite.disableOnDiskQueue` command-line flag, which can be used for disabling data queueing to disk when the remote storage cannot keep up with the data ingestion rate. See [these docs](https://docs.victoriametrics.com/vmagent.html#disabling-on-disk-persistence) and [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add support for reading and writing samples via [Google PubSub](https://cloud.google.com/pubsub). See [these docs](https://docs.victoriametrics.com/vmagent.html#google-pubsub-integration).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): show all the dropped targets together with the reason why they are dropped at `http://vmagent:8429/service-discovery` page. Previously targets, which were dropped because of [target sharding](https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets) weren't displayed on this page. This could complicate service discovery debugging. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389).
* FEATURE: reduce the default value for `-import.maxLineLen` command-line flag from 100MB to 10MB in order to prevent excessive memory usage during data import via [/api/v1/import](https://docs.victoriametrics.com/#how-to-import-data-in-json-line-format).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add `keep_if_contains` and `drop_if_contains` relabeling actions. See [these docs](https://docs.victoriametrics.com/vmagent.html#relabeling-enhancements) for details.
* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): provide `/vmalert/api/v1/rule` and `/api/v1/rule` API endpoints to get the rule object in JSON format. See [these docs](https://docs.victoriametrics.com/vmalert.html#web) for details.
* FEATURE: [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html): add [day_of_year()](https://docs.victoriametrics.com/MetricsQL.html#day_of_year) function, which returns the day of the year for each of the given unix timestamps. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5345) for details. Thanks to @luckyxiaoqiang for the [pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5368/).
* FEATURE: all VictoriaMetrics binaries: expose additional metrics at `/metrics` page, which may simplify debugging of VictoriaMetrics components (see [this feature request](https://github.com/VictoriaMetrics/metrics/issues/54)):
* `go_sched_latencies_seconds` - the [histogram](https://docs.victoriametrics.com/keyConcepts.html#histogram), which shows the time goroutines have spent in runnable state before actually running. Big values point to the lack of CPU time for the current workload.
* `go_mutex_wait_seconds_total` - the [counter](https://docs.victoriametrics.com/keyConcepts.html#counter), which shows the total time spent by goroutines waiting for locked mutex. Big values point to mutex contention issues.
* `go_gc_cpu_seconds_total` - the [counter](https://docs.victoriametrics.com/keyConcepts.html#counter), which shows the total CPU time spent by Go garbage collector.
* `go_gc_mark_assist_cpu_seconds_total` - the [counter](https://docs.victoriametrics.com/keyConcepts.html#counter), which shows the total CPU time spent by goroutines in GC mark assist state.
* `go_gc_pauses_seconds` - the [histogram](https://docs.victoriametrics.com/keyConcepts.html#histogram), which shows the duration of GC pauses.
* `go_scavenge_cpu_seconds_total` - the [counter](https://docs.victoriametrics.com/keyConcepts.html#counter), which shows the total CPU time spent by Go runtime for returning memory to the Operating System.
* `go_memlimit_bytes` - the value of [GOMEMLIMIT](https://pkg.go.dev/runtime#hdr-Environment_Variables) environment variable.
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): prevent from `FATAL: cannot flush metainfo` panic when [`-remoteWrite.multitenantURL`](https://docs.victoriametrics.com/vmagent.html#multitenancy) command-line flag is set. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5357).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly decode zstd-encoded data blocks received via [VictoriaMetrics remote_write protocol](https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol). See [this issue comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5301#issuecomment-1815871992).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly add new labels at `output_relabel_configs` during [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html). Previously this could lead to corrupted labels in output samples. Thanks to @ChengChung for providing [detailed report for this bug](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5402).
* BUGFIX: [vmalert-tool](https://docs.victoriametrics.com/#vmalert-tool): allow using arbitrary `eval_time` in [alert_rule_test](https://docs.victoriametrics.com/vmalert-tool.html#alert_test_case) case. Previously, test cases with `eval_time` not being a multiple of `evaluation_interval` would fail.
* BUGFIX: [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html): fix `vmbackupmanager` not deleting previous object versions from S3 when applying retention policy with `-deleteAllObjectVersions` command-line flag.
* BUGFIX: [vminsert](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html): fix panic when ingesting data via [NewRelic protocol](https://docs.victoriametrics.com/#how-to-send-data-from-newrelic-agent) into VictoriaMetrics cluster. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5416).
## [v1.95.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.95.1)

View File

@@ -11,7 +11,7 @@ aliases:
---
# Cluster version
<img alt="VictoriaMetrics" src="logo.png" width="300">
<img src="logo.webp" width="300">
VictoriaMetrics is a fast, cost-effective and scalable time series database. It can be used as a long-term remote storage for Prometheus.
@@ -57,7 +57,7 @@ This is a [shared nothing architecture](https://en.wikipedia.org/wiki/Shared-not
It increases cluster availability, and simplifies cluster maintenance as well as cluster scaling.
<p align="center">
<img src="Cluster-VictoriaMetrics_cluster-scheme.png" width="800">
<img src="Cluster-VictoriaMetrics_cluster-scheme.webp" width="800">
</p>
## Multitenancy
@@ -110,6 +110,11 @@ while the `http_requests_total{path="/bar"} 34` would be stored in the tenant `a
The `vm_account_id` and `vm_project_id` labels are extracted after applying the [relabeling](https://docs.victoriametrics.com/relabeling.html)
set via `-relabelConfig` command-line flag, so these labels can be set at this stage.
The `vm_account_id` and `vm_project_id` labels are also taken into account when ingesting data via non-http-based protocols
such as [Graphite](https://docs.victoriametrics.com/#how-to-send-data-from-graphite-compatible-agents-such-as-statsd),
[InfluxDB line protocol via TCP and UDP](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) and
[OpenTSDB telnet put protocol](https://docs.victoriametrics.com/#sending-data-via-telnet-put-protocol).
**Security considerations:** it is recommended restricting access to `multitenant` endpoints only to trusted sources,
since untrusted source may break per-tenant data by writing unwanted samples to arbitrary tenants.
@@ -241,7 +246,7 @@ the following approaches for automatic discovery of `vmstorage` nodes:
The list of discovered `vmstorage` nodes is automatically updated when the file contents changes.
The update frequency can be controlled with `-storageNode.discoveryInterval` command-line flag.
- [dns+srv](https://en.wikipedia.org/wiki/SRV_record) - pass `dns+src:some-name` value to `-storageNode` command-line flag.
- [dns+srv](https://en.wikipedia.org/wiki/SRV_record) - pass `dns+srv:some-name` value to `-storageNode` command-line flag.
In this case the provided `dns+srv` names are resolved into tcp addresses of `vmstorage` nodes.
The list of discovered `vmstorage` nodes is automatically updated at `vminsert` and `vmselect`
when it changes behind the corresponding `dns+srv` names.
@@ -483,7 +488,8 @@ during the config update / version upgrade. In this case the following strategy
[rolling restarts](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#no-downtime-strategy),
since they need to process higher load when some of `vmstorage` nodes are temporarily unavailable in the cluster.
It is possible to reduce resource usage spikes by running more `vminsert` nodes and by passing bigger values
to `-storage.vminsertConnsShutdownDuration` command-line flag at `vmstorage` nodes.
to `-storage.vminsertConnsShutdownDuration` (available from [v1.95.0](https://docs.victoriametrics.com/CHANGELOG.html#v1950))
command-line flag at `vmstorage` nodes.
In this case `vmstorage` increases the interval between gradual closing of `vminsert` connections during graceful shutdown.
This reduces data ingestion slowdown during rollout restarts.
@@ -715,6 +721,48 @@ By default, `vminsert` tries to route all the samples for a single time series t
* when `vmstorage` nodes are temporarily unavailable (for instance, during their restart). Then new samples are re-routed to the remaining available `vmstorage` nodes;
* when `vmstorage` node has no enough capacity for processing incoming data stream. Then `vminsert` re-routes new samples to other `vmstorage` nodes.
## Alter/Update series
VictoriaMetrics supports data modification and update with following limitations:
- modified data cannot be changed with back-filling.
- modified data must be sent to `vminsert` component.
- only json-line format is supported.
How it works:
* Export series for modification with `/api/v1/export/prometheus` from `vmselect`.
* Modify values,timestamps with needed values. You can delete, add or modify timestamps and values.
* Send modified series to the `vminsert`'s API `/prometheus/api/v1/update/series` with POST request.
* `vminsert` adds unique monotonically increasing `__generation_id` label to each series during the update, so their samples could replace the original samples.
* at query requests `vmselect` merges original series with series updates sequentially according to their `__generation_id`.
How vmselect merges updates:
* example 1 - delete timestamps at time range
original series timestamps: [10,20,30,40,50] values: [1,2,3,4,5]
modified series timestamps: [20,50] values: [2,5]
query result series timestamps: [10,20,50] values: [1,2,5]
* example 2 - modify values
origin series timestamps: [10,20,30,40,50] values: [1,2,3,4,5]
modified series timestamps: [20,30,40] values: [22,33,44]
query result series timestamps: [10,20,30,40,50] values: [1,22,33,44,5]
* example 3 - modify timestamps
origin series timestamps: [10,20,30,40,50] values: [1,2,3,4,5]
modified series timestamps: [0,5,10,15,20,25,30] values: [0.1,0.5,1,1.5,2,2.5,30]
modified series timestamps: [0,5,10,15,20,25,30,40,50] values: [0.1,0.5,1,1.5,2,2.5,30,4,5]
How to check which series were modified:
* execute metrics export request with following params query params:
* `reduce_mem_usage=true`
* `extra_filter={__generation_id!=""}`
Example request
```bash
curl localhost:8481/select/0/prometheus/api/v1/export -g -d 'match[]={__name__="vmagent_rows_inserted_total",type="datadog"}' -d 'reduce_mem_usage=true' -d 'extra_filters={__generation_id!=""}'
```
example output:
```json
{"metric":{"__name__":"vmagent_rows_inserted_total","job":"vminsert","type":"datadog","instance":"localhost:8429","__generation_id":"1658153678907548000"},"values":[0,0,0,0,256,253,135,15],"timestamps":[1658153559703,1658153569703,1658153579703,1658153589703,1658153599703,1658153609703,1658153619703,1658153621703]}
```
There was single series modify request with `__generation_id="1658153678907548000"`
## Backups
@@ -948,7 +996,7 @@ Below is the output for `/path/to/vminsert -help`:
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing
-import.maxLineLen size
The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 104857600)
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 10485760)
-influx.databaseNames array
Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb
Supports an array of values separated by comma or specified via multiple flags.

Binary file not shown.

Before

Width:  |  Height:  |  Size: 125 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 53 KiB

View File

@@ -1,22 +1,24 @@
docs-install:
gem install jekyll bundler
bundle install --gemfile=Gemfile
# To run localy you need to use ruby version =< 2.7.6, but not >=3.x , see https://bbs.archlinux.org/viewtopic.php?pid=1976408#p1976408
#
# run local server for documentation website
# at http://127.0.0.1:4000/
# On first use, please run `make docs-install`
# run local server for documentation website at http://127.0.0.1:4000/
docs-up:
JEKYLL_GITHUB_TOKEN=blank PAGES_API_URL=http://0.0.0.0 bundle exec \
--gemfile=Gemfile \
jekyll server --livereload
docs-up-docker:
docker run --rm -it \
-e JEKYLL_GITHUB_TOKEN=blank \
-e PAGES_API_URL=http://0.0.0.0 \
-e PAGES_REPO_NWO=VictoriaMetrics/VictoriaMetrics \
-p 4000:4000 \
-v $(PWD):/srv/jekyll \
jekyll/jekyll:3.8 jekyll serve --livereload
-v $(shell pwd)/docs:/srv/jekyll \
jekyll/jekyll:3.8 jekyll serve --livereload --incremental
# Converts images at docs folder to webp format
# See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#images-in-documentation
docs-images-to-webp:
IMAGES_EXTENSION=jpg $(MAKE) docs-images-to-webp-by-extension
IMAGES_EXTENSION=jpeg $(MAKE) docs-images-to-webp-by-extension
IMAGES_EXTENSION=png $(MAKE) docs-images-to-webp-by-extension
docs-images-to-webp-by-extension:
docker run --rm -it \
-v $(shell pwd)/docs:/docs \
elswork/cwebp \
sh -c 'find /docs/ -type f ! -path "/docs/operator/*" ! -path "/docs/_site/*" -name "*.$(IMAGES_EXTENSION)" -print0 | \
xargs -0 -P $(MAKE_CONCURRENCY) -I {} sh -c '"'"'cwebp -preset drawing -m 6 -o "$${1%.*}.webp" $$1'"'"' _ {}'
find docs/ -type f ! -path 'docs/operator/*' ! -path 'docs/_site/*' -name '*.$(IMAGES_EXTENSION)' -print0 | xargs -0 rm -f

View File

@@ -1029,7 +1029,7 @@ for every point of every time series returned by `q`.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL. This function is supported by PromQL. See also [acosh](#acosh).
This function is supported by PromQL. See also [acosh](#acosh).
#### day_of_month
@@ -1049,6 +1049,15 @@ Metric names are stripped from the resulting series. Add [keep_metric_names](#ke
This function is supported by PromQL.
#### day_of_year
`day_of_year(q)` is a [transform function](#transform-functions), which returns the day of year for every point of every time series returned by `q`.
It is expected that `q` returns unix timestamps. The returned values are in the range `[1...365]` for non-leap years, and `[1 to 366]` in leap years.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
#### days_in_month
`days_in_month(q)` is a [transform function](#transform-functions), which returns the number of days in the month identified

Binary file not shown.

Before

Width:  |  Height:  |  Size: 83 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 39 KiB

View File

@@ -14,7 +14,7 @@ aliases:
***The per-tenant statistic is a part of [enterprise package](https://docs.victoriametrics.com/enterprise.html). It is available for download and evaluation at [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest). You need to request a [free trial license](https://victoriametrics.com/products/enterprise/trial/) for evaluation.***
<img alt="cluster-per-tenant-stat" src="PerTenantStatistic-stats.jpg">
<img alt="cluster-per-tenant-stat" src="PerTenantStatistic-stats.webp">
VictoriaMetrics cluster for enterprise provides various metrics and statistics usage per tenant:

View File

@@ -12,7 +12,7 @@ title: VictoriaMetrics
[![Build Status](https://github.com/VictoriaMetrics/VictoriaMetrics/workflows/main/badge.svg)](https://github.com/VictoriaMetrics/VictoriaMetrics/actions)
[![codecov](https://codecov.io/gh/VictoriaMetrics/VictoriaMetrics/branch/master/graph/badge.svg)](https://codecov.io/gh/VictoriaMetrics/VictoriaMetrics)
<img src="logo.png" width="300" alt="VictoriaMetrics logo">
<img src="logo.webp" width="300" alt="VictoriaMetrics logo">
VictoriaMetrics is a fast, cost-effective and scalable monitoring solution and time series database.
@@ -510,9 +510,9 @@ See also [vmagent](https://docs.victoriametrics.com/vmagent.html), which can be
## How to send data from DataDog agent
VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/)
or [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/)
via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics)
VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/)
or [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/)
via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics)
at `/datadog/api/v1/series` path.
### Sending metrics to VictoriaMetrics
@@ -521,7 +521,7 @@ DataDog agent allows configuring destinations for metrics sending via ENV variab
or via [configuration file](https://docs.datadoghq.com/agent/guide/agent-configuration-files/) in section `dd_url`.
<p align="center">
<img src="Single-server-VictoriaMetrics-sending_DD_metrics_to_VM.png" width="800">
<img src="Single-server-VictoriaMetrics-sending_DD_metrics_to_VM.webp" width="800">
</p>
To configure DataDog agent via ENV variable add the following prefix:
@@ -555,7 +555,7 @@ DataDog allows configuring [Dual Shipping](https://docs.datadoghq.com/agent/guid
sending via ENV variable `DD_ADDITIONAL_ENDPOINTS` or via configuration file `additional_endpoints`.
<p align="center">
<img src="Single-server-VictoriaMetrics-sending_DD_metrics_to_VM_and_DD.png" width="800">
<img src="Single-server-VictoriaMetrics-sending_DD_metrics_to_VM_and_DD.webp" width="800">
</p>
Run DataDog using the following ENV variable with VictoriaMetrics as additional metrics receiver:
@@ -1417,7 +1417,12 @@ For example, `/api/v1/import?extra_label=foo=bar` would add `"foo":"bar"` label
Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail.
VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage. This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines. The solution is to split too long JSON lines into smaller lines. It is OK if samples for a single time series are split among multiple JSON lines.
VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage.
This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines.
The solution is to split too long JSON lines into shorter lines. It is OK if samples for a single time series are split among multiple JSON lines.
JSON line length can be limited via `max_rows_per_line` query arg when exporting via [/api/v1/export](how-to-export-data-in-json-line-format).
The maximum JSON line length, which can be parsed by VictoriaMetrics, is limited by `-import.maxLineLen` command-line flag value.
### How to import data in native format
@@ -1594,15 +1599,18 @@ The format follows [JSON streaming concept](http://ndjson.org/), e.g. each line
```
Note that every JSON object must be written in a single line, e.g. all the newline chars must be removed from it.
Every line length is limited by the value passed to `-import.maxLineLen` command-line flag (by default this is 100MB).
[/api/v1/import](#how-to-import-data-in-json-line-format) handler doesn't accept JSON lines longer than the value
passed to `-import.maxLineLen` command-line flag (by default this is 10MB).
It is recommended passing 1K-10K samples per line for achieving the maximum data ingestion performance at [/api/v1/import](#how-to-import-data-in-json-line-format).
Too long JSON lines may increase RAM usage at VictoriaMetrics side.
[/api/v1/export](#how-to-export-data-in-json-line-format) handler accepts `max_rows_per_line` query arg, which allows limiting the number of samples per each exported line.
It is OK to split [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples)
for the same [time series](https://docs.victoriametrics.com/keyConcepts.html#time-series) across multiple lines.
The number of lines in JSON line document can be arbitrary.
The number of lines in the request to [/api/v1/import](#how-to-import-data-in-json-line-format) can be arbitrary - they are imported in streaming manner.
## Relabeling
@@ -2038,6 +2046,8 @@ Alternatively, single-node VictoriaMetrics can self-scrape the metrics when `-se
set to duration greater than 0. For example, `-selfScrapeInterval=10s` would enable self-scraping of `/metrics` page
with 10 seconds interval.
_Please note, never use loadbalancer address for scraping metrics. All monitored components should be scraped directly by their address._
Official Grafana dashboards available for [single-node](https://grafana.com/grafana/dashboards/10229-victoriametrics/)
and [clustered](https://grafana.com/grafana/dashboards/11176-victoriametrics-cluster/) VictoriaMetrics.
See an [alternative dashboard for clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11831)
@@ -2364,7 +2374,7 @@ It is recommended disabling query cache with `-search.disableCache` command-line
historical data with timestamps from the past, since the cache assumes that the data is written with
the current timestamps. Query cache can be enabled after the backfilling is complete.
An alternative solution is to query [/internal/resetRollupResultCache](https://docs.victoriametrics.com/url-examples.html#internalresetRollupResultCache) handler after the backfilling is complete. This will reset the query cache, which could contain incomplete data cached during the backfilling.
An alternative solution is to query [/internal/resetRollupResultCache](https://docs.victoriametrics.com/url-examples.html#internalresetrollupresultcache) handler after the backfilling is complete. This will reset the query cache, which could contain incomplete data cached during the backfilling.
Yet another solution is to increase `-search.cacheTimestampOffset` flag value in order to disable caching
for data with timestamps close to the current time. Single-node VictoriaMetrics automatically resets response
@@ -2497,6 +2507,20 @@ Adhering `KISS` principle simplifies the resulting code and architecture, so it
Report bugs and propose new features [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues).
## Images in documentation
Please, keep image size and number of images per single page low. Keep the docs page as lightweight as possible.
If the page needs to have many images, consider using WEB-optimized image format [webp](https://developers.google.com/speed/webp).
When adding a new doc with many images use `webp` format right away. Or use a Makefile command below to
convert already existing images at `docs` folder automatically to `web` format:
```console
make docs-images-to-webp
```
Once conversion is done, update the path to images in your docs and verify everything is correct.
## VictoriaMetrics Logo
[Zip](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/VM_logo.zip) contains three folders with different image orientations (main color and inverted version).
@@ -2614,7 +2638,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing
-import.maxLineLen size
The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 104857600)
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 10485760)
-influx.databaseNames array
Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb
Supports an array of values separated by comma or specified via multiple flags.

View File

@@ -173,7 +173,7 @@ Once updated, run the following commands:
1. Commit and push changes to `master`.
1. Run "Release" action on Github:
![image](Release-Guide_helm-release.png)
![image](Release-Guide_helm-release.webp)
1. Merge new PRs *"Automatic update CHANGELOGs and READMEs"* and *"Synchronize docs"* after pipelines are complete.
## Ansible Roles

Binary file not shown.

Before

Width:  |  Height:  |  Size: 433 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 81 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 60 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 16 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 80 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 22 KiB

View File

@@ -20,7 +20,7 @@ aliases:
[![Build Status](https://github.com/VictoriaMetrics/VictoriaMetrics/workflows/main/badge.svg)](https://github.com/VictoriaMetrics/VictoriaMetrics/actions)
[![codecov](https://codecov.io/gh/VictoriaMetrics/VictoriaMetrics/branch/master/graph/badge.svg)](https://codecov.io/gh/VictoriaMetrics/VictoriaMetrics)
<img src="logo.png" width="300" alt="VictoriaMetrics logo">
<img src="logo.webp" width="300" alt="VictoriaMetrics logo">
VictoriaMetrics is a fast, cost-effective and scalable monitoring solution and time series database.
@@ -518,9 +518,9 @@ See also [vmagent](https://docs.victoriametrics.com/vmagent.html), which can be
## How to send data from DataDog agent
VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/)
or [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/)
via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics)
VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/)
or [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/)
via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics)
at `/datadog/api/v1/series` path.
### Sending metrics to VictoriaMetrics
@@ -529,7 +529,7 @@ DataDog agent allows configuring destinations for metrics sending via ENV variab
or via [configuration file](https://docs.datadoghq.com/agent/guide/agent-configuration-files/) in section `dd_url`.
<p align="center">
<img src="Single-server-VictoriaMetrics-sending_DD_metrics_to_VM.png" width="800">
<img src="Single-server-VictoriaMetrics-sending_DD_metrics_to_VM.webp" width="800">
</p>
To configure DataDog agent via ENV variable add the following prefix:
@@ -563,7 +563,7 @@ DataDog allows configuring [Dual Shipping](https://docs.datadoghq.com/agent/guid
sending via ENV variable `DD_ADDITIONAL_ENDPOINTS` or via configuration file `additional_endpoints`.
<p align="center">
<img src="Single-server-VictoriaMetrics-sending_DD_metrics_to_VM_and_DD.png" width="800">
<img src="Single-server-VictoriaMetrics-sending_DD_metrics_to_VM_and_DD.webp" width="800">
</p>
Run DataDog using the following ENV variable with VictoriaMetrics as additional metrics receiver:
@@ -1425,7 +1425,12 @@ For example, `/api/v1/import?extra_label=foo=bar` would add `"foo":"bar"` label
Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail.
VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage. This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines. The solution is to split too long JSON lines into smaller lines. It is OK if samples for a single time series are split among multiple JSON lines.
VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage.
This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines.
The solution is to split too long JSON lines into shorter lines. It is OK if samples for a single time series are split among multiple JSON lines.
JSON line length can be limited via `max_rows_per_line` query arg when exporting via [/api/v1/export](how-to-export-data-in-json-line-format).
The maximum JSON line length, which can be parsed by VictoriaMetrics, is limited by `-import.maxLineLen` command-line flag value.
### How to import data in native format
@@ -1602,15 +1607,18 @@ The format follows [JSON streaming concept](http://ndjson.org/), e.g. each line
```
Note that every JSON object must be written in a single line, e.g. all the newline chars must be removed from it.
Every line length is limited by the value passed to `-import.maxLineLen` command-line flag (by default this is 100MB).
[/api/v1/import](#how-to-import-data-in-json-line-format) handler doesn't accept JSON lines longer than the value
passed to `-import.maxLineLen` command-line flag (by default this is 10MB).
It is recommended passing 1K-10K samples per line for achieving the maximum data ingestion performance at [/api/v1/import](#how-to-import-data-in-json-line-format).
Too long JSON lines may increase RAM usage at VictoriaMetrics side.
[/api/v1/export](#how-to-export-data-in-json-line-format) handler accepts `max_rows_per_line` query arg, which allows limiting the number of samples per each exported line.
It is OK to split [raw samples](https://docs.victoriametrics.com/keyConcepts.html#raw-samples)
for the same [time series](https://docs.victoriametrics.com/keyConcepts.html#time-series) across multiple lines.
The number of lines in JSON line document can be arbitrary.
The number of lines in the request to [/api/v1/import](#how-to-import-data-in-json-line-format) can be arbitrary - they are imported in streaming manner.
## Relabeling
@@ -2046,6 +2054,8 @@ Alternatively, single-node VictoriaMetrics can self-scrape the metrics when `-se
set to duration greater than 0. For example, `-selfScrapeInterval=10s` would enable self-scraping of `/metrics` page
with 10 seconds interval.
_Please note, never use loadbalancer address for scraping metrics. All monitored components should be scraped directly by their address._
Official Grafana dashboards available for [single-node](https://grafana.com/grafana/dashboards/10229-victoriametrics/)
and [clustered](https://grafana.com/grafana/dashboards/11176-victoriametrics-cluster/) VictoriaMetrics.
See an [alternative dashboard for clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11831)
@@ -2372,7 +2382,7 @@ It is recommended disabling query cache with `-search.disableCache` command-line
historical data with timestamps from the past, since the cache assumes that the data is written with
the current timestamps. Query cache can be enabled after the backfilling is complete.
An alternative solution is to query [/internal/resetRollupResultCache](https://docs.victoriametrics.com/url-examples.html#internalresetRollupResultCache) handler after the backfilling is complete. This will reset the query cache, which could contain incomplete data cached during the backfilling.
An alternative solution is to query [/internal/resetRollupResultCache](https://docs.victoriametrics.com/url-examples.html#internalresetrollupresultcache) handler after the backfilling is complete. This will reset the query cache, which could contain incomplete data cached during the backfilling.
Yet another solution is to increase `-search.cacheTimestampOffset` flag value in order to disable caching
for data with timestamps close to the current time. Single-node VictoriaMetrics automatically resets response
@@ -2505,6 +2515,20 @@ Adhering `KISS` principle simplifies the resulting code and architecture, so it
Report bugs and propose new features [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues).
## Images in documentation
Please, keep image size and number of images per single page low. Keep the docs page as lightweight as possible.
If the page needs to have many images, consider using WEB-optimized image format [webp](https://developers.google.com/speed/webp).
When adding a new doc with many images use `webp` format right away. Or use a Makefile command below to
convert already existing images at `docs` folder automatically to `web` format:
```console
make docs-images-to-webp
```
Once conversion is done, update the path to images in your docs and verify everything is correct.
## VictoriaMetrics Logo
[Zip](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/VM_logo.zip) contains three folders with different image orientations (main color and inverted version).
@@ -2622,7 +2646,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Whether to use proxy protocol for connections accepted at -httpListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing
-import.maxLineLen size
The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 104857600)
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 10485760)
-influx.databaseNames array
Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb
Supports an array of values separated by comma or specified via multiple flags.

View File

@@ -154,6 +154,8 @@ If you see unexpected or unreliable query results from VictoriaMetrics, then try
- By passing `nocache=1` query arg to every request to `/api/v1/query` and `/api/v1/query_range`.
If you use Grafana, then this query arg can be specified in `Custom Query Parameters` field
at Prometheus datasource settings - see [these docs](https://grafana.com/docs/grafana/latest/datasources/prometheus/) for details.
If the problem was in the cache, try resetting it via [resetRollupCache handler](https://docs.victoriametrics.com/url-examples.html#internalresetrollupresultcache).
1. If you use cluster version of VictoriaMetrics, then it may return partial responses by default
when some of `vmstorage` nodes are temporarily unavailable - see [cluster availability docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#cluster-availability)

View File

@@ -81,7 +81,7 @@ with `vl_http_requests_total{path="/select/logsql/query"}` metric.
VictoriaLogs provides a simple Web UI for logs [querying](https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html) and exploration
at `http://localhost:9428/select/vmui`. The UI allows exploring query results:
<img src="vmui.png" width="800" />
<img src="vmui.webp" width="800" />
There are three modes of displaying query results:

Binary file not shown.

Before

Width:  |  Height:  |  Size: 1.0 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 263 KiB

View File

@@ -51,6 +51,7 @@ plus the following additional features:
- [Advanced auth and rate limiter](https://docs.victoriametrics.com/vmgateway.html).
- [mTLS for cluster components](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection).
- [Kafka integration](https://docs.victoriametrics.com/vmagent.html#kafka-integration).
- [Google PubSub integration](https://docs.victoriametrics.com/vmagent.html#google-pubsub-integration).
- [Multitenant support in vmalert](https://docs.victoriametrics.com/vmalert.html#multitenancy).
- [Ability to read alerting and recording rules from Object Storage](https://docs.victoriametrics.com/vmalert.html#reading-rules-from-object-storage).
- [Ability to filter incoming requests by IP at vmauth](https://docs.victoriametrics.com/vmauth.html#ip-filters).
@@ -80,7 +81,7 @@ VictoriaMetrics Enterprise components are available in the following forms:
It is allowed to run VictoriaMetrics Enterprise components in [cases listed here](#valid-cases-for-victoriametrics-enterprise).
Binary releases of VictoriaMetrics Enterprise are available [at the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
Enterprise binaries and packages have `enterprise` suffix in their names. For example, `victoria-metrics-linux-amd64-v1.95.0-enterprise.tar.gz`.
Enterprise binaries and packages have `enterprise` suffix in their names. For example, `victoria-metrics-linux-amd64-v1.95.1-enterprise.tar.gz`.
In order to run binary release of VictoriaMetrics Enterprise component, please download the `*-enterprise.tar.gz` archive for your OS and architecture
from the [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest) and unpack it. Then run the unpacked binary.
@@ -100,8 +101,8 @@ For example, the following command runs VictoriaMetrics Enterprise binary with t
obtained at [this page](https://victoriametrics.com/products/enterprise/trial/):
```console
wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.95.0/victoria-metrics-linux-amd64-v1.95.0-enterprise.tar.gz
tar -xzf victoria-metrics-linux-amd64-v1.95.0-enterprise.tar.gz
wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.95.1/victoria-metrics-linux-amd64-v1.95.1-enterprise.tar.gz
tar -xzf victoria-metrics-linux-amd64-v1.95.1-enterprise.tar.gz
./victoria-metrics-prod -license=BASE64_ENCODED_LICENSE_KEY
```
@@ -116,7 +117,7 @@ Alternatively, VictoriaMetrics Enterprise license can be stored in the file and
It is allowed to run VictoriaMetrics Enterprise components in [cases listed here](#valid-cases-for-victoriametrics-enterprise).
Docker images for VictoriaMetrics Enterprise are available [at VictoriaMetrics DockerHub](https://hub.docker.com/u/victoriametrics).
Enterprise docker images have `enterprise` suffix in their names. For example, `victoriametrics/victoria-metrics:v1.95.0-enteprise`.
Enterprise docker images have `enterprise` suffix in their names. For example, `victoriametrics/victoria-metrics:v1.95.1-enteprise`.
In order to run Docker image of VictoriaMetrics Enterprise component, it is required to provide the license key via command-line
flag as described [here](#binary-releases).
@@ -126,13 +127,13 @@ Enterprise license key can be obtained at [this page](https://victoriametrics.co
For example, the following command runs VictoriaMetrics Enterprise Docker image with the specified license key:
```console
docker run --name=victoria-metrics victoriametrics/victoria-metrics:v1.95.0-enteprise -license=BASE64_ENCODED_LICENSE_KEY
docker run --name=victoria-metrics victoriametrics/victoria-metrics:v1.95.1-enteprise -license=BASE64_ENCODED_LICENSE_KEY
```
Alternatively, the license code can be stored in the file and then referred via `-licenseFile` command-line flag:
```console
docker run --name=victoria-metrics -v /vm-license:/vm-license victoriametrics/victoria-metrics:v1.95.0-enteprise -licenseFile=/path/to/vm-license
docker run --name=victoria-metrics -v /vm-license:/vm-license victoriametrics/victoria-metrics:v1.95.1-enteprise -licenseFile=/path/to/vm-license
```
Example docker-compose configuration:
@@ -141,7 +142,7 @@ version: "3.5"
services:
victoriametrics:
container_name: victoriametrics
image: victoriametrics/victoria-metrics:v1.95.0
image: victoriametrics/victoria-metrics:v1.95.1
ports:
- 8428:8428
volumes:
@@ -173,7 +174,7 @@ is used to provide key in plain-text:
```yaml
server:
image:
tag: v1.95.0-enterprise
tag: v1.95.1-enterprise
license:
key: {BASE64_ENCODED_LICENSE_KEY}
@@ -184,7 +185,7 @@ In order to provide key via existing secret, the following values file is used:
```yaml
server:
image:
tag: v1.95.0-enterprise
tag: v1.95.1-enterprise
license:
secret:
@@ -231,7 +232,7 @@ spec:
license:
key: {BASE64_ENCODED_LICENSE_KEY}
image:
tag: v1.95.0-enterprise
tag: v1.95.1-enterprise
```
In order to provide key via existing secret, the following custom resource is used:
@@ -248,7 +249,7 @@ spec:
name: vm-license
key: license
image:
tag: v1.95.0-enterprise
tag: v1.95.1-enterprise
```
Example secret with license key:

View File

@@ -109,7 +109,7 @@ Please find the example of provisioning Grafana instance with VictoriaMetrics da
1. Download the latest release:
``` bash
```bash
ver=$(curl -s https://api.github.com/repos/VictoriaMetrics/grafana-datasource/releases/latest | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+' | head -1)
curl -L https://github.com/VictoriaMetrics/grafana-datasource/releases/download/$ver/victoriametrics-datasource-$ver.tar.gz -o plugin.tar.gz
tar -xf plugin.tar.gz -C ./victoriametrics-datasource
@@ -143,7 +143,7 @@ docker-compose -f docker-compose.yaml up
When Grafana starts successfully datasources should be present on the datasources tab
<p>
<img src="provision_datasources.png" width="800" alt="Configuration">
<img src="grafana-datasource_provision_datasources.webp" width="800" alt="Configuration">
</p>
### Install in Kubernetes
@@ -248,7 +248,7 @@ This example uses init container to download and install plugin.
1. To download plugin build and move contents into Grafana plugins directory:
``` bash
```bash
ver=$(curl -s https://api.github.com/repos/VictoriaMetrics/grafana-datasource/releases/latest | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+' | head -1)
curl -L https://github.com/VictoriaMetrics/grafana-datasource/releases/download/$ver/victoriametrics-datasource-$ver.tar.gz -o /var/lib/grafana/plugins/plugin.tar.gz
tar -xf /var/lib/grafana/plugins/plugin.tar.gz -C /var/lib/grafana/plugins/
@@ -262,14 +262,19 @@ This example uses init container to download and install plugin.
### 1. Configure Grafana
Installing dev version of Grafana plugin requires to change `grafana.ini` config to allow loading unsigned plugins:
{% raw %}
``` ini
# Directory where Grafana will automatically scan and look for plugins
plugins = {{path to directory with plugin}}
```
{% endraw %}
``` ini
[plugins]
allow_loading_unsigned_plugins = victoriametrics-datasource
```
### 2. Run the plugin
In the project directory, you can run:
```

Binary file not shown.

After

Width:  |  Height:  |  Size: 36 KiB

View File

@@ -232,7 +232,7 @@ To check that `VMAgent` collects metrics from the k8s cluster open in the browse
You will see something like this:
<p align="center">
<img src="guide-vmcluster-k8s-via-vm-operator.png" width="800" alt="">
<img src="getting-started-with-vm-operator_vmcluster.webp" width="800" alt="">
</p>
`VMAgent` connects to [kubernetes service discovery](https://kubernetes.io/docs/concepts/services-networking/service/) and gets targets which needs to be scraped. This service discovery is controlled by [VictoriaMetrics Operator](https://github.com/VictoriaMetrics/operator)
@@ -312,13 +312,13 @@ EOF
To check that [VictoriaMetrics](https://victoriametrics.com) collecting metrics from the k8s cluster open in your browser [http://127.0.0.1:3000/dashboards](http://127.0.0.1:3000/dashboards) and choose the `VictoriaMetrics - cluster` dashboard. Use `admin` for login and the `password` that you previously got from kubectl.
<p align="center">
<img src="guide-vmcluster-k8s-via-vm-operator-grafana1.png" width="800" alt="grafana dashboards">
<img src="getting-started-with-vm-operator_vmcluster-grafana1.webp" width="800" alt="grafana dashboards">
</p>
The expected output is:
<p align="center">
<img src="guide-vmcluster-k8s-via-vm-operator-grafana2.png" width="800" alt="grafana dashboards">
<img src="getting-started-with-vm-operator_vmcluster-grafana2.webp" width="800" alt="grafana dashboards">
</p>
## 6. Summary

Some files were not shown because too many files have changed in this diff Show More