Compare commits

...

399 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
ad105147dd docs/CHANGELOG.md: cut v1.84.0 2022-11-25 19:53:29 -08:00
Aliaksandr Valialkin
e2a061b6a3 vendor: make vendor-update 2022-11-25 19:52:00 -08:00
Aliaksandr Valialkin
e014467f42 docs/README.md: make docs-sync after 58d459e8a8 2022-11-25 16:55:47 -08:00
Aliaksandr Valialkin
58d459e8a8 app/{vminsert,vmagent}: follow-up after 53a63c6c4c
Extend /api/v1/import/prometheus with the support for Pushgateway way of specifying additional labels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1415
2022-11-25 16:48:14 -08:00
Pedro Gonçalves
53a63c6c4c Adding pushgateway basic capabilities to vmagent (#3360)
* init pushgateway implementation

* Initial implementation of pushgateway in vmagent

* Initial implementation of pushgateway in vmagent
2022-11-25 16:35:01 -08:00
Zakhar Bessarab
8b6d528fbd {app/vmstorage,app/vmselect}: add API to get list of existing tenants (#3348)
* {app/vmstorage,app/vmselect}: add API to get list of existing tenants

* {app/vmstorage,app/vmselect}: add API to get list of existing tenants

* app/vmselect: fix error message

* {app/vmstorage,app/vmselect}: fix error messages

* app/vmselect: change log level for error handling

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-25 11:05:47 -08:00
Aliaksandr Valialkin
4ca44cfe9c app/vmselect/vmui: make vmui-update after 37cda9abd0 2022-11-25 07:33:09 -08:00
Yury Molodov
37cda9abd0 fix: change header settings (#3391) 2022-11-25 07:26:35 -08:00
Yury Molodov
1ab66186ca refactor: create Autocomplete component (#3390) 2022-11-25 07:25:35 -08:00
Roman Khavronenko
42e63fe0fd dashboards: cleanup & remove artifacts (#3387)
* some unexpected DS UIDs were removed;
* replace `$instance.*` filter with `$instance` since we respect
the instance port anyway;
* remove predefined datasource for `clusterbytenant`
in favour of datasource variable `ds`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-25 09:28:14 +01:00
Aliaksandr Valialkin
da13d36af9 app/vmselect/vmui: make vmui-update after eb772aa50e 2022-11-24 17:37:18 -08:00
Yury Molodov
eb772aa50e vmui: improve table view (#3377)
* vmui: add compact table view (#3365)

* feat: add compact table view

* fix: add overflow table

* fix: change table styles

* vmui: compact table view

* Update docs/CHANGELOG.md

Co-authored-by: Michal Kralik <michal.kralik@percona.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-24 17:33:07 -08:00
Aliaksandr Valialkin
399ed9a3b9 docs/MetricsQL.md: document that histogram_share() accepts optional boundsLabel arg 2022-11-24 17:27:30 -08:00
Aliaksandr Valialkin
045fec631b docs/vmagent.md: typo fix 2022-11-24 17:19:45 -08:00
Roman Khavronenko
3407006cdb dashboards: cluster dashboard update (#3380)
The purpose of the update is to make the dash more usable
for large installations with many instances. Panels which showed
metrics per-instance (Mem, CPU) now are showing metrics per-job or min/max/avg
aggregations in % instead. This supposed to help immediately to identify
resource shortage and remain usable for small and big installations.

For cases when detailed info is needed, to the bottom of the dashboard
a new row `Drilldown` was added. Panels like Mem or CPU now contain
a `data-link` named `Drilldown` (cis shown on line click) which takes
user to more detailed panel.

The change list is the following:
* bump Grafana version to 9.1.0;
* replace old "Graph" panel with "TimeSeries" panel;
* improve Uptime panel to show number of instances per job;
* show % usage of Mem and CPU instead of absolute values;
* `Caches` row was removed. All needed info for caches is now part of `Troubleshooting`;
* add `Drilldown` section for detailed resource usage;
* add Annotations for Alert triggers. Not all alerts are supposed to be displayed
on the dashboard, but only those with label `show_at: dashboard`.
See `alerts-cluster.yml` change.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-23 18:03:25 -08:00
Aliaksandr Valialkin
04bb2e14dd docs/Articles.md: add a link to "The cost of scale in Prometheus ecosystem" talk 2022-11-23 00:11:51 +02:00
Aliaksandr Valialkin
ccf9bb32ac app/vmselect/vmui: make vmui-update after 7dc2349913 2022-11-22 15:36:03 +02:00
Yury Molodov
7dc2349913 vmui: add set up series custom limits (#3368)
* feat: add set up series custom limits

* feat: add button for show series without limits

* fix: resolve merge conflicts
2022-11-22 15:31:17 +02:00
Aliaksandr Valialkin
633ad34eb7 vendor: make vendor-update 2022-11-22 11:26:16 +02:00
Aliaksandr Valialkin
b1622ad63e docs/Articles.md: add a link to https://www.youtube.com/watch?v=_zORxrgLtec (OSA Con 2022: Specifics of data analysis in Time Series Databases) 2022-11-22 01:08:21 +02:00
Aliaksandr Valialkin
9498f871e7 app/vminsert: add missing vm_relabel_config_* metrics after 03d88bc066
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3345
2022-11-22 00:47:49 +02:00
Roman Khavronenko
03d88bc066 vmagent: expose metrics for tracking config state (#3375)
Expose `vm_relabel_config_*` and `vm_promscrape_config_*` metrics
for tracking relabel and scrape configuration hot-reloads.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3345
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-22 00:38:43 +02:00
Aliaksandr Valialkin
2ddfde78c3 app/vmselect/vmui: make vmui-update after 7d1b3e7e14 2022-11-22 00:34:33 +02:00
Yury Molodov
7d1b3e7e14 vmui: add copy button to row on Table view (#3363)
* feat: add copy button to row on Table view

* vmui: add copy button to row on Table view

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-22 00:31:33 +02:00
Yury Molodov
f81072f9a7 vmui: minor fixes (#3361)
* fix: reset the value of the switches trace and cache

* fix: add cursor text for inputs

* fix: solve the Infinite loop of useFetchQuery.ts

* fix: change condition for show/hide autocomplete

* fix: add limit error length for input
2022-11-22 00:28:03 +02:00
Yury Molodov
82d254af08 vmui: sticky tooltip (#3376)
* feat: add ability to make tooltip "sticky"

* vmui: add ability to make tooltip "sticky"
2022-11-22 00:26:53 +02:00
Aliaksandr Valialkin
ee1479bac6 docs/CHANGELOG.md: link to the related issue for range_normalize() function 2022-11-21 23:27:00 +02:00
Aliaksandr Valialkin
d9c3a2b605 app/vmselect/promql: add range_normalize(q1, ..., qN) function for normalizing query results into [0..1] value range
This may be useful for analyzing correlation between time series with different value ranges
2022-11-21 23:25:00 +02:00
Aliaksandr Valialkin
95f0266558 lib/promscrape/discovery/gce: do not pass filter arg when discovering zones
The filter arg isn't supported by zones API in GCE.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3202
2022-11-21 22:32:05 +02:00
Aliaksandr Valialkin
05ed98c98b app/vmselect/promql: allow using SI and IEC suffixes in numeric values inside queries
For example, 10Ki is equivalent to 10*1024, while 5.3M is equivalent to 5.3*1000*1000
2022-11-21 21:27:55 +02:00
Aliaksandr Valialkin
2c9e403d5f app/vmselect/promql: properly return an empty result from limit_offset() if offset exceeds the number of inner time series
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3312
2022-11-21 16:47:37 +02:00
Roman Khavronenko
0b6f439b11 vmalert: bump alerting rules evaluation interval to reasonable 30s (#3374)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-21 15:23:23 +01:00
Aliaksandr Valialkin
b796a0dc3f app/vmselect/promql: optimize e1 op e2 when e1 returns an empty result
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3349
2022-11-21 16:09:10 +02:00
Roman Khavronenko
84742f229a vmalert: add default list of alerting rules (#3373)
The default list of alerting rules contains the basic
rules for checking vmalert's health state and is recommended
to use for monitoring vmalert deployments.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-21 14:45:45 +01:00
Aliaksandr Valialkin
20d758e3e4 all: add a link to https://docs.victoriametrics.com/enterprise.html into description for enterprise flags 2022-11-21 15:42:01 +02:00
Aliaksandr Valialkin
cb1a621d63 app/{vminsert,vmselect}: add -storageNode.filter command-line flag for filtering the discovered storage nodes
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3353
2022-11-21 15:20:14 +02:00
Aliaksandr Valialkin
65b4e96a80 docs/Cluster-VictoriaMetrics.md: sync with cluster branch 2022-11-18 14:06:23 +02:00
Aliaksandr Valialkin
a061d33400 app/vmselect: clarify that it isnt recommended setting -replicationFactor at vmselect nodes even if the replication is enabled at vminsert nodes 2022-11-18 14:04:53 +02:00
Aliaksandr Valialkin
cae0f37edd app/vmselect/netstorage: remove superflouos map lookup at ProcessSearchQuery
This should reduce CPU usage a bit during querying
2022-11-18 13:40:04 +02:00
Yury Molodov
519bd2af7b vmui: add trace analyzer (#3310)
* refactor: change structure project

* refactor: change structure project

* fix: add hooks for set query params

* refactor: add index for pages

* docs: add TESTCASES.md

* refactor: restructure components

* feat: add page with trace analyzer

* fix: change detect trace data

* Update app/vmui/packages/vmui/src/pages/TracePage/index.tsx

Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update app/vmui/packages/vmui/src/pages/TracePage/index.tsx

Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* fix: change descriptions on trace page

* Update app/vmui/packages/vmui/src/pages/TracePage/index.tsx

Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* feat: add base components

* feat: add reset styles

* docs: add description about trace analyzer

* feat: add styles for custom panel page

* feat: add styles for predefined panels

* feat: add style for TracingsView.tsx

* feat: add Alerts

* feat: add Tooltip.tsx

* fix: correct styles

* feat: add DatePicker.tsx

* feat: add tables

* feat: add theme provider

* fix: replace using callbacks as props to handlers

* fix: correct update time

* fix: change TimePicker.tsx

* fix: correct styles

* fix: update packages

* vmui: refactor code, remove material-ui

* feat: add paste json for trace analyzer

* vmui: update trace analyzer docs

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-17 22:22:01 +02:00
Aliaksandr Valialkin
e79bfdf4b8 docs/CHANGELOG.md: document the fix for CPU usage spikes
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-17 22:02:02 +02:00
Aliaksandr Valialkin
353396aa23 lib/workingsetcache: expose -cacheExpireDuration command-line flag for fine-tuning of the cache expiration
While at it, decrease -prevCacheRemovalPercent from 0.2 to 0.1 and increase -cacheExpireDuration from 20 minutes to 30 minutes.

This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-17 19:59:13 +02:00
Aliaksandr Valialkin
578bb58ea9 app/vmselect/vmui: make vmui-update after 51bfd1ab80 2022-11-17 18:54:55 +02:00
Yury Molodov
51bfd1ab80 vmui: add ability hide query (#3359)
* feat: add ability hide query

* fix: change logic hide query

* fix: remove console.log
2022-11-17 18:42:33 +02:00
dependabot[bot]
3ed238b75b build(deps): bump loader-utils in /app/vmui/packages/vmui (#3350)
Bumps [loader-utils](https://github.com/webpack/loader-utils) from 2.0.3 to 2.0.4.
- [Release notes](https://github.com/webpack/loader-utils/releases)
- [Changelog](https://github.com/webpack/loader-utils/blob/v2.0.4/CHANGELOG.md)
- [Commits](https://github.com/webpack/loader-utils/compare/v2.0.3...v2.0.4)

---
updated-dependencies:
- dependency-name: loader-utils
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-17 12:49:09 +01:00
Aliaksandr Valialkin
2c9017f6df vendor: make vendor-update 2022-11-17 01:38:29 +02:00
Dmytro Kozlov
fb65fb39d2 app/vmstorage: fix potential file inclusion via variable (#3339)
* app/vmstorage: fix potential file inclusion via variable

* app/vmstorage: cleanup
2022-11-17 01:29:43 +02:00
Aliaksandr Valialkin
a21c8e7b9a app/vmselect/vmui: make vmui-update after bc8a782f74 2022-11-17 01:15:45 +02:00
Yury Molodov
bc8a782f74 vmui/refactor (#3298)
* refactor: change structure project

* refactor: change structure project

* fix: add hooks for set query params

* refactor: add index for pages

* docs: add TESTCASES.md

* refactor: restructure components

* feat: add base components

* feat: add reset styles

* feat: add styles for custom panel page

* feat: add styles for predefined panels

* feat: add style for TracingsView.tsx

* feat: add Alerts

* feat: add Tooltip.tsx

* fix: correct styles

* feat: add DatePicker.tsx

* feat: add tables

* feat: add theme provider

* fix: replace using callbacks as props to handlers

* fix: correct update time

* fix: change TimePicker.tsx

* fix: correct styles

* fix: update packages

* vmui: refactor code, remove material-ui

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-17 01:09:14 +02:00
Aliaksandr Valialkin
a260e2659e app/vmselect/promql: add range_stdvar() and range_stddev() functions for calculating variance and deviation over time series on the selected time range 2022-11-17 01:03:40 +02:00
Aliaksandr Valialkin
c1a3192d8b app/vmselect/promql: add range_linear_regression(q) function for calculating simple linear regression for the selected time series on the selected time range 2022-11-17 00:38:48 +02:00
Aliaksandr Valialkin
5955d23232 lib/promscrape: add a benchmark for internLabelStrings() 2022-11-16 23:02:49 +02:00
Aliaksandr Valialkin
a75137c1c2 lib/mergeset: properly reset bsr.bhIdx after the call to blockStreamReader.readNextBHS()
The issue has been introduced in 58b40f514c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 21:23:35 +02:00
Aliaksandr Valialkin
c3362e3db4 lib/workingsetcache: add -prevCacheRemovalPercent command-line flag for tuning memory usage vs CPU usage ratio
Reduce the default value of this flag from 1% to 0.2% after 71335e6024

This flag should help determining the best ratio for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 12:39:39 +02:00
Aliaksandr Valialkin
4106f197f2 lib/mergeset: retain the buffer with the data used by indexBlock.bhs, inside indexBlock.buf
Previously indexBlock.bhs pointed to the buffer, which could be changed over time.
This could result in incorrect time series search over time.

This is a follow-up for 58b40f514c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 12:09:23 +02:00
Aliaksandr Valialkin
58b40f514c lib/mergeset: remove string allocation and copying when unmarshaling blockHeader
This should reduce CPU usage for the case from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-15 16:30:54 +02:00
Aliaksandr Valialkin
09b79d74a7 docs/CHANGELOG.md: document changes in v1.79.5 release 2022-11-11 01:27:45 +02:00
Aliaksandr Valialkin
99f187d9bc deployment/docker: update VictoriaMetrics version from v1.83.0 to v1.83.1 2022-11-11 01:24:40 +02:00
Aliaksandr Valialkin
bbe1a1472c docs/CHANGELOG.md: cut v1.83.1 2022-11-10 14:06:57 +02:00
Aliaksandr Valialkin
1b9dff133a docs/CHANGELOG.md: document the fix at 71335e6024 2022-11-10 13:46:51 +02:00
Aliaksandr Valialkin
2bcafbef25 vendor: make vendor-update 2022-11-10 13:46:33 +02:00
Aliaksandr Valialkin
71335e6024 lib/workingsetcache: tune cache miss threshold for resetting the previous cache from 5% to 1%
It has been appeared that some production workloads could suffer for some time
after every reset of the previous cache when it gets less than 5% of requests
after the needed item isn't found in the current cache. This could result
in reduced cache hit rates, which, in turn, could increase CPU, disk IO and RAM
usage needed for reading, unpacking and caching the missed data from disk.

This commit reduces the cache miss threshold for resetting the previous cache from 5% to 1%.
This should reduce the possible negative impact after each cache reset by at least 5x,
while reducing the total memory used by caches.

This is a follow-up for d906d8573e
2022-11-10 13:31:54 +02:00
Dmytro Kozlov
5ff6e0fb02 vmui: fix vmui vulnerability (#3336)
* vmui: fix vmui vulnerability

* vmui: code cleanup
2022-11-10 02:28:37 +01:00
Aliaksandr Valialkin
6c7361b1c5 app/vmselect/vmui: make vmui-update after 7130af7fd2 2022-11-09 16:43:18 +02:00
Aliaksandr Valialkin
86bce7f5f9 lib/promscrape: add more cases to TestAddRowToTimeseries
This is a follow-up for 16fdd2af8a
2022-11-09 16:13:56 +02:00
Jeremy PLANCKEEL
16fdd2af8a test(golang): add test to function addRowToTimeseries (#3282)
Co-authored-by: jplanckeel-externe <jplanckeel.externe@bedrockstreaming.com>
2022-11-09 15:41:26 +02:00
Aliaksandr Valialkin
b8839df32c lib/protoparser/opentsdb: follow-up after 04b0e4e7bf
- Simplify the parser code to be less error prone
- Document the change
- Add a test for OpenTSDB put line with trailing whitespace without tags

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3290
2022-11-09 15:35:05 +02:00
Roman Khavronenko
04b0e4e7bf protoparser/opentsdb: allow lines without tags (#3303)
According to http://opentsdb.net/docs/build/html/api_telnet/put.html
"At least one tag pair must be present".
However, in VictoriaMetrics datamodel tags aren't required.
This could be confusing for users. Allowing accept lines without
tags seems to do no harm.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3290
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-09 15:32:47 +02:00
Aliaksandr Valialkin
e17a1acf4a docs/CHANGELOG.md: document 7130af7fd2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2814
2022-11-09 12:15:49 +02:00
Michal Kralik
7130af7fd2 vmui: show tracing in json view (#3316)
* vmui: show tracing in json view

* vmui: refactor tracing view
2022-11-09 12:12:28 +02:00
dependabot[bot]
10791bf224 build(deps): bump loader-utils in /app/vmui/packages/vmui (#3328)
Bumps [loader-utils](https://github.com/webpack/loader-utils) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/webpack/loader-utils/releases)
- [Changelog](https://github.com/webpack/loader-utils/blob/v2.0.3/CHANGELOG.md)
- [Commits](https://github.com/webpack/loader-utils/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: loader-utils
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-09 12:11:13 +02:00
Denys Holius
aebe21e2c8 guides/README.md: fix link to guide-delete-or-replace-metrics.html (#3331) 2022-11-09 12:00:46 +02:00
Aliaksandr Valialkin
34aa3f6404 README.md: sync changes with 9f8bf524ad 2022-11-09 11:55:50 +02:00
Aliaksandr Valialkin
20046dab6e app/vmui/packages/vmui: return back accidental changes at 9f8bf524ad 2022-11-09 11:55:34 +02:00
Aliaksandr Valialkin
c973aca617 app/vminsert/netstorage: move nodesHash from global state to storageNodesBucket
This should prevent from panics when the list of discovered vmstorage nodes changes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3329
2022-11-09 11:51:10 +02:00
Roman Khavronenko
9f8bf524ad bump go version to 1.19.3 (#3327)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-08 16:43:59 +01:00
Michal Kralik
9b540bba6f vmui: change graph legend label format (#3315) 2022-11-08 15:16:24 +01:00
Zakhar Bessarab
91dd79f40f docs/operator: fix description for SA at VMClusterSpec (#3313) 2022-11-07 16:51:01 +01:00
Aliaksandr Valialkin
7fa5d043f5 lib/promscrape/discovery/consul: add __meta_consul_partition label in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/11482
2022-11-07 15:25:53 +02:00
Aliaksandr Valialkin
8332622037 vendor: update github.com/urfave/cli/v2 from v2.23.2 to v2.23.4 2022-11-07 14:58:35 +02:00
Aliaksandr Valialkin
daa70e6560 lib/storage: follow-up for 790768f20b
- Document the bugfix at docs/CHANGELOG.md
- Simplify the bugfix a bit
2022-11-07 14:04:08 +02:00
Aliaksandr Valialkin
f9dc3da9e2 lib/storage: typo fix after 32d48f8dfbb03174858c00bdfe6d9d22431dc8d8 2022-11-07 13:58:27 +02:00
Aliaksandr Valialkin
116811d761 lib/envtemplate: allow non-env var names inside "%{ ... }" 2022-11-07 13:58:27 +02:00
Aliaksandr Valialkin
dd88c628aa lib/storage: remove unused isFull field from hourMetricIDs struct 2022-11-07 13:58:26 +02:00
Łukasz Marszał
790768f20b Fix issue-3309 - currHourMetricIDs shouldn't contain metrics from prev hour (#3320)
* fix issue-3309 currHourMetricIDs shouldn't contain metrics from prev hour

* Update storage.go
2022-11-07 13:55:37 +02:00
Aliaksandr Valialkin
63d4cf661b vendor: make vendor-update 2022-11-05 10:34:35 +02:00
Aliaksandr Valialkin
d61691d5fa deployment/docker: update Go builder from v1.19.2 to v1.19.3
See https://github.com/golang/go/issues?q=milestone%3AGo1.19.3+label%3ACherryPickApproved
2022-11-05 10:19:54 +02:00
Aliaksandr Valialkin
23c79e2e49 docs/Single-server-VictoriaMetrics.md: mention about security certifications at Security chapter
This is a follow-up for 94bd49402e
2022-11-05 10:07:29 +02:00
Aliaksandr Valialkin
4ef5fe1317 docs/guides/guide-vmcluster-multiple-retention-setup.md: clarify docs after a75d85b11e 2022-11-05 10:02:48 +02:00
Artem Navoiev
94bd49402e docs: Add link to security page from Readme (#3286)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-11-04 09:49:12 +01:00
Dmytro Kozlov
99d8fcb332 docs/operator: change VMAgentRemoteWriteSettings.MaxDiskUsagePerURL type from int32 to int64 (#3307)
* docs/operator: change VMAgentRemoteWriteSettings.MaxDiskUsagePerURL type from int32 to int64

* docs: operator updates api description

Co-authored-by: f41gh7 <nik@victoriametrics.com>
2022-11-04 01:12:57 +01:00
Roman Khavronenko
ac4e23de39 vmctl: fix panic on start (#3300)
The change disables initing the `-version` flag in new
`urfave/cli/v2` update. The `-version` flag conflicts
with the identical flag from `lib/buildinfo` and causes panic.

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3299

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-01 19:50:22 +01:00
Artem Navoiev
a75d85b11e update multi-tenancy guide. Add infromation about Enterprise and Rete… (#3285)
docs: update multi-tenancy guide

Add information about Enterprise and Retention Filters

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-11-01 15:19:09 +01:00
Aliaksandr Valialkin
df88832c86 docs/Single-server-VictoriaMetrics.md: follow-up after a1a97b9321
- Remove trailing whitespace at the end of lines
- Remove redundant sentence stating that time series matching the given selector will be deleted.
  It should be clear from the surrounding context.
2022-11-01 10:49:10 +02:00
Aliaksandr Valialkin
a3dc324b19 vendor: update github.com/urfave/cli/v2 from 2.20.3 to 2.23.0 2022-11-01 10:46:26 +02:00
Dmytro Kozlov
a1a97b9321 docs: clarify information about usage of the api/v1/admin/tsdb/delete_series API (#3287)
docs: clarify information about usage of the `api/v1/admin/tsdb/delete_series` API
2022-11-01 09:36:28 +01:00
Aliaksandr Valialkin
a1011931ac docs/Release-Guide.md: instruct to update VictoriaMetrics version in deployment/docker/docker-compose*.yml files after creating new release
This is a follow-up for d1509f4559
2022-11-01 10:31:02 +02:00
Aliaksandr Valialkin
619b3c926d docs/enterprise.md: mention that feature requests from enterprise customers are prioritized 2022-11-01 10:28:12 +02:00
Denys Holius
d1509f4559 docker-compose: bump version of container tags for VictoriaMetrics components (#3294)
* deployment/docker/docker-compose-cluster.yml: bump VictoriaMetrics Cluster components to the latest v1.83.0 version

* deployment/docker/docker-compose.yml: bump VictoriaMetrics Single node and vmutils to the latest v1.83.0 version
2022-11-01 09:25:07 +01:00
Aliaksandr Valialkin
869e0f9f85 lib/promrelabel: go fmt after 5cec9706dc 2022-10-29 05:17:10 +03:00
Aliaksandr Valialkin
0f8f36de24 docs: typo fixes 2022-10-29 04:52:18 +03:00
Aliaksandr Valialkin
5cec9706dc lib/promrelabel: add a test from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3251
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3251
2022-10-29 04:33:38 +03:00
Aliaksandr Valialkin
740f7ac5e0 docs/CHANGELOG.md: cut v1.83.0 2022-10-29 02:54:54 +03:00
Aliaksandr Valialkin
6ac7b088b2 vendor: make vendor-update 2022-10-29 02:53:48 +03:00
Aliaksandr Valialkin
cdd3443806 app/vmbackupmanager: add functionality for automated restore from backup 2022-10-29 02:30:52 +03:00
Aliaksandr Valialkin
320ae1c60a lib/envflag: small refactoring after 518c340ae3 and 02096e06d0 2022-10-29 02:28:58 +03:00
Aliaksandr Valialkin
76e8888272 lib/promscrape: properly add exported_ prefix to labels, which clash with target labels if honor_labels: true option isn't set.
The issue was in the `labels := dst[offset:]` line in the beginning of appendExtraLabels() function.
The `dst` may be re-allocated when adding extra labels to it. In this case the addition of `exported_`
prefix to labels inside `labels` slice become invisible in the returned `dst` labels.

While at it, properly handle some corner cases:

- Add additional `exported_` prefix to clashing metric labels with already existing `exported_` prefix.
- Store scraped metric names in `exported___name__` label if scrape target contains `__name__` label.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3278

Thanks to @jplanckeel for the initial attempt to fix this issue
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3281
2022-10-28 22:14:26 +03:00
Aliaksandr Valialkin
454baf84d6 lib/promscrape/discovery/kubernetes: do not print an empty kubeconfig_file option in yaml at /config page 2022-10-28 22:14:25 +03:00
Aliaksandr Valialkin
9565dbed34 app/vmselect/vmui: make vmui-update after 54e1865d17 2022-10-28 14:51:23 +03:00
Yury Molodov
54e1865d17 vmui: minor fixes (#3276)
* feat: apply serverURL on down Enter

* fix: change method of set time range

* fix: remove prevent run fetch without changes

* fix: prevent reset timerange when autorefresh
2022-10-28 14:47:50 +03:00
Aliaksandr Valialkin
9aee303ca1 docs/Cluster-VictoriaMetrics.md: improve docs for dns+srv service discovery 2022-10-28 14:24:48 +03:00
Aliaksandr Valialkin
a72c5f76eb app/{vminsert,vmselect}: add support for automatic discovery and update of vmstorage nodes
Thanks to @dmitryk-dk for the initial implemenation at https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/446
2022-10-28 13:13:45 +03:00
Aliaksandr Valialkin
ad76a54c87 vendor: make vendor-update 2022-10-28 00:18:15 +03:00
Aliaksandr Valialkin
a018b1d75e app/vmalert/templates: properly escape all the special chars in quotesEscape function
Previously the `quotesEscape` function was escaping only double quotes.
This wasn't enough, since the input string could contain other special chars,
which must be escaped when put inside JSON string. For example, carriage return and line feed chars (\n\r),
backslash char, etc. This led to the following issues, which were improperly fixed:

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890 - this issue
  was "fixed" by introducing the `crlfEscape` function, which led to unnecessary
  complications in user templates, while not fixing various corner cases
  such as backslash chars in the input string.
  See 1de15ad490

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 - this issue
  was "fixed" by urlencoding the whole string passed to -external.alert.source
  command-line flag. This led to invalid urls, which couldn't be parsed by Grafana.
  See 00c838353d
  and 4bd0244599

This commit properly encodes the input string passed to `quotesEscape`, so it can be safely embedded inside JSON strings.

This commit deprecates crlfEscape template function and adds the following new template functions:

- strvalue and stripDomain - these functions are supported by Prometheus, so they were added
  for compatibility purposes.
- jsonEscape and htmlEscape for converting the input string to valid quoted JSON string
  and for html-escaping the input string, so it could be safely embedded as a plaintext
  into html.

This commit also documents all supported template functions at https://docs.victoriametrics.com/vmalert.html#template-functions
The deprecated crlfEscape function isn't documented on purpose, since its usefulness is negative in general case.
2022-10-28 00:01:16 +03:00
Aliaksandr Valialkin
4bd0244599 Revert "vmalert: escape query params if external alert source defined (#3267)"
This reverts commit 00c838353d.

Reason for revert: it incorrectly fixes the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 .
Now `-external.alert.source=explore?orgId=1&left=...` is converted to the following invalid url, which cannot be handled by Grafana:

https://grafana.example.com/explore%3ForgId%3D1%26left%3D...

The next commit will contain the correct fix of the issue - the `quotesEscape` function must
properly escape the string, so it could be embedded into JSON string. This function must
properly escape \n\r chars too. In this case the `crlfEscape` function becomes unnecessary.
Actually, the next commit makes the `crlfEscape` function deprecated.
2022-10-27 22:30:27 +03:00
Aliaksandr Valialkin
75e22ed3a4 vendor: make vendor-update 2022-10-27 20:21:13 +03:00
Denys Holius
b4e6460d2f .github/workflows/codeql-analysis.yml: specifically setting the Go version (#3277)
see https://github.com/github/codeql-action/issues/1059
2022-10-27 10:06:33 +02:00
Dmytro Kozlov
00c838353d vmalert: escape query params if external alert source defined (#3267)
vmalert: escape query args if external alert source defined
2022-10-26 10:00:14 -04:00
Aliaksandr Valialkin
518c340ae3 lib/envtemplate: allow referring env vars from other env vars via %{ENV_VAR} syntax
This is a follow-up for 02096e06d0
2022-10-26 14:49:33 +03:00
Aliaksandr Valialkin
3c66e45ef0 app/vmselect/vmui: make vmui-update after eae6063450 2022-10-26 02:50:46 +03:00
Yury Molodov
eae6063450 fix: change step setting field (#3270)
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-26 02:46:46 +03:00
Yury Molodov
794bd8b424 vmui: limit the number of decimal places to 10 characters (#3258)
* fix: limit the number of decimal places to 10 characters

* fix: add float numbers stabilizer
2022-10-26 02:43:13 +03:00
Yury Molodov
bc7456841f vmui: add responsive styles for small screens (#3256)
* fix: add responsive styles for small screens

* fix: correct additional settings margins

* docs/CHANGELOG.md: add responsive styles
2022-10-26 02:39:54 +03:00
Aliaksandr Valialkin
685433b8da docs/Cluster-VictoriaMetrics.md: update docs about env vars usage in command-line flags
This is a follow-up for 02096e06d0
2022-10-26 01:54:15 +03:00
Aliaksandr Valialkin
02096e06d0 lib/envflag: allow referring environment variables in command-line flags 2022-10-26 01:52:05 +03:00
Aliaksandr Valialkin
5d82e7d64a docs: run make docs-sync after 2ed3d49c26 2022-10-26 01:14:31 +03:00
omahs
2ed3d49c26 Fix: typos (#3269)
Fix: typos
2022-10-26 01:12:54 +03:00
Aliaksandr Valialkin
c4265322f4 lib/fs: add canOverwrite arg to WriteFileAtomically when it is allowed to overwrite the file atomically if it already exists 2022-10-26 01:07:34 +03:00
Aliaksandr Valialkin
db8abd000e docs/Makefile: fix docs-up Makefile command after 9ccd22c1f6 2022-10-25 17:53:24 +03:00
Aliaksandr Valialkin
d9bbf24183 app/{vminsert,vmselect}/netstorage: allow calling Init()+MustStop() in a loop
Previously netstorage.MustStop() call didn't free up all the resources,
so the subsequent call to nestorage.Init() would panic.

This allows writing tests, which call nestorage.Init() + nestorage.MustStop() in a loop.
2022-10-25 17:47:17 +03:00
Denys Holius
9ccd22c1f6 Docs: add guide "How to delete and replace metrics" (#2829)
docs: add guide how to delete and replace metrics
2022-10-25 09:02:14 -04:00
Aliaksandr Valialkin
b7882dc9af app/vmselect/vmui: make vmui-update after 274e235bf7 2022-10-24 21:29:13 +03:00
Aliaksandr Valialkin
15849cb571 docs/guides/migrate-from-influx.md: properly display images at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/guides/migrate-from-influx.md
This is a follow-up for 42375679db
2022-10-24 21:28:31 +03:00
Nguyen Van Duc
42375679db Fix images not display on key concepts document (#3266) 2022-10-24 21:22:41 +03:00
Aliaksandr Valialkin
b1324631b1 docs/CHANGELOG.md: document 274e235bf7
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3240
2022-10-24 21:02:17 +03:00
Yury Molodov
274e235bf7 vmui: optimize memory (#3255)
* fix: change series limit logic

* fix: remove spread operators
2022-10-24 20:51:24 +03:00
Yury Molodov
59199a98dd vmui: extend options for app mode (#3252)
* feat: add vmui customization for dbaas

* feat: extends vmui customization for dbaas

* fix: move input tenandId after query

* fix: change serverURL when changing tenantID

* fix: remove options

* docs: add options description
2022-10-24 20:45:41 +03:00
Aliaksandr Valialkin
c52c23c272 docs/enterprise.md: describe all the enteprise features in a short doc at https://docs.victoriametrics.com/enterprise.html 2022-10-24 18:02:03 +03:00
Aliaksandr Valialkin
cac28ae0ae docs/CHANGELOG.md: typo fixes 2022-10-24 16:59:55 +03:00
Aliaksandr Valialkin
8e998aa1a1 lib/storage: add support for retention filters (aka multiple retentions for distinct sets of time series)
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289
2022-10-24 16:40:20 +03:00
Aliaksandr Valialkin
d7e657e5f9 docs/CHANGELOG.md: document 69b27275d2b2bf1bdae0d8b887b44bde2a787649 2022-10-24 16:08:15 +03:00
Aliaksandr Valialkin
692af2d17c vendor: make vendor-update 2022-10-24 15:47:01 +03:00
Aliaksandr Valialkin
dba218a8ce lib/storage: skip blocks outside the configured retention during search
Blocks outside the configured retention are eventually deleted during background merge.
But such blocks may reside in the storage for long time until background merge.
Previously VictoriaMetrics could spend additional CPU time on processing such blocks
during search queries. Now these blocks are skipped.
2022-10-24 02:52:44 +03:00
Aliaksandr Valialkin
e2f0b76ebf lib/storage: do not pass retentionMsecs and isReadOnly args explicitly - access them via Storage arg
This makes code easier to read.

This is a follow-up after d2d30581a0
2022-10-24 01:31:04 +03:00
Aliaksandr Valialkin
89a1108b1a lib/storage: small code cleanups 2022-10-24 01:17:47 +03:00
Aliaksandr Valialkin
05512fdd74 lib/storage: re-use newTestStorage() instead of manually initializing Storage mock
This is a follow-up for d2d30581a0
2022-10-23 16:24:00 +03:00
Aliaksandr Valialkin
d2d30581a0 lib/storage: pass Storage to table and partition instead of getDeletedMetricIDs callback
This improves code readability a bit.
2022-10-23 16:10:04 +03:00
Aliaksandr Valialkin
54f35c175c lib/storage: small refactoring: move retentionDeadline to blockStreamMerger
This allows defining per-block retention in the future by updating the getRetentionDeadline function
2022-10-23 16:10:02 +03:00
Aliaksandr Valialkin
187e294a53 lib/storage: use a single reference to the currently merged block - bsm.Block during the block merge loop 2022-10-23 14:08:57 +03:00
Aliaksandr Valialkin
d0a9ca1bc2 lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms 2022-10-23 12:48:00 +03:00
Aliaksandr Valialkin
5e4dfe50c6 lib/storage: subsitute searchTSIDs functions with more lightweight searchMetricIDs function
The searchTSIDs function was searching for metricIDs matching the the given tag filters
and then was locating the corresponding TSID entries for the found metricIDs.

The TSID entries aren't needed when searching for time series names (aka MetricName),
so this commit removes the uneeded TSID search from the implementation of /api/v1/series API.
This improves perfromance of /api/v1/series calls.

This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls,
since now these calls cache small metricIDs instead of big TSID entries
in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs)
without the need to compress the saved entries in order to save cache space.

This commit also removes concurrency limiter during searching for matching time series,
which was introduced in 8f16388428, since the concurrency
for all the read queries is already limited with -search.maxConcurrentRequests command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2022-10-23 12:23:47 +03:00
Aliaksandr Valialkin
a10647e0bf app/vmselect/promql: expose missing metric vm_cache_size_max_bytes{type="promql/rollupResult"} 2022-10-23 12:13:47 +03:00
Aliaksandr Valialkin
4128ad71e2 lib/storage: move common code to newRawRowsBlock() function 2022-10-21 14:46:55 +03:00
Aliaksandr Valialkin
b5674164c6 lib/storage: simplify code a bit after 3f5959c053 2022-10-21 14:39:27 +03:00
Aliaksandr Valialkin
fd7c86ae25 lib/{mergeset,storage}: simplify the code a bit after ae55ad8749 2022-10-21 14:33:03 +03:00
Aliaksandr Valialkin
99d67ac8ad lib/storage: validate timestamps in the block only if they use encoding, which needs validation
This reduces CPU usage when there is no sense in validating timestamps.

This is a follow-up for 5fa9525498

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011
2022-10-21 00:52:32 +03:00
Aliaksandr Valialkin
3f5959c053 lib/storage: try generating initial parts from inmemory rows with identical sizes under high ingestion rate
This should improve background merge rate under high load a bit
2022-10-20 23:28:24 +03:00
Aliaksandr Valialkin
891ff6af2a lib/workingsetcache: increase default cache expiration from 10 minutes to 20 minutes
This increases the maximum time for cache population with new entries from 20 minutes to 40 minutes.
This

This change shouldn't increase memory usage for caches, since the prev cache cleaner
should free up memory by deleting unused prev cache as soon as possible.
See 08ca45d238 for details on prev cache cleaner.
2022-10-20 21:48:25 +03:00
Aliaksandr Valialkin
08ca45d238 lib/workingsetcache: move the cleaner for the prev cache into a separate goroutine
This makes the code more clear after d906d8573e
2022-10-20 21:45:29 +03:00
Aliaksandr Valialkin
4cd173bbaa lib/procutil: stop immediately after receiving the second SIGINT or SIGTERM signal
Previously VictoriaMetrics apps could stop responding to SIGINT and SIGTERM signals
if they hang for some reason in graceful shutdown procedure.
2022-10-20 21:40:20 +03:00
Aliaksandr Valialkin
150e99d403 lib/{mergeset,storage}: avoid unaligned 64-bit atomic operation panic on 32-bit platforms
The panic has been introduced in 68f3a02589

While at it, add padding to shard structs in order to avoid false sharing on mordern CPUs

This should improve scalability on systems with many CPU cores
2022-10-20 16:25:43 +03:00
Aliaksandr Valialkin
d906d8573e lib/workingsetcache: drop the previous cache whenever it recieves less than 5% of requests comparing to the current cache
This means that the majority of requests are successfully served from the current cache,
so the previous cache can be reset in order to free up memory.
2022-10-20 10:47:58 +03:00
Aliaksandr Valialkin
817aeafd69 lib/workingsetcache: use per-bucket stats counters instead of global stats counters for cache hits/misses
This should improve cache scalability on systems with many CPU cores.
2022-10-20 09:12:17 +03:00
Aliaksandr Valialkin
9c02c39487 lib/workingsetcache: randomize interval for swapping curr and prev caches
This should make CPU usage smoother over time, since different caches
will be swapped at different times.
2022-10-20 08:42:43 +03:00
Aliaksandr Valialkin
cba9696a14 docs/CHANGELOG.md: move the BUGFIX line for 1059c4d84a into correct place 2022-10-18 20:38:57 +03:00
Nikolay
1059c4d84a lib/promscrape/discovery/kubernetes: correctly wrap error (#3250)
* lib/promscrape/discovery/kubernetes: correctly wrap error
follow-up after 1304824201

* Update docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-18 20:37:42 +03:00
Roman Khavronenko
4e0ea95f26 vmalert: lower severity level for RW retries (#3237)
The message about dropped data still remains at `error` level.
The change supposed to make log message more clear about how
serious it is.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-18 14:27:20 +02:00
Aliaksandr Valialkin
b4f7243110 vendor: make vendor-update 2022-10-18 10:56:38 +03:00
Aliaksandr Valialkin
3bb3893b2d app/vmselect/vmui: make vmui-update after 080562030d 2022-10-18 10:50:35 +03:00
Yury Molodov
080562030d fix: remove rounding of axis limits (#3238) 2022-10-18 10:47:58 +03:00
Aliaksandr Valialkin
069401a304 all: log error when environment variables referred from -promscrape.config are missing
This should prevent from using incorrect config files
2022-10-18 10:47:16 +03:00
Aliaksandr Valialkin
fb50730ba7 lib/storage: double the number of rawRows shards on multi-core systems
This should increase data ingestion scalability on multi-core systems at the cost of slightly higher memory usage
2022-10-17 18:19:51 +03:00
Aliaksandr Valialkin
ae55ad8749 lib/{storage,mergeset}: do not hold per-shard lock in fast path when adding per-shard items to the flush list 2022-10-17 18:01:26 +03:00
Aliaksandr Valialkin
b6e8c1403a lib/promrelabel: add relabeling tests when the source label is missing 2022-10-17 14:47:52 +03:00
Aliaksandr Valialkin
2b2c58ecf8 vendor: make vendor-update 2022-10-14 15:16:03 +03:00
Aliaksandr Valialkin
646fb17237 docs/MetricsQL.md: document that input histograms must have the same set of buckets when calculating the quantile over them
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3231
2022-10-14 13:59:58 +03:00
Aliaksandr Valialkin
adf3419699 docs/CHANGELOG.md: add a note that it is recommended to use v1.82.1 instead of v1.82.0 2022-10-14 13:41:01 +03:00
Aliaksandr Valialkin
3a0c69651a docs/CHANGELOG.md: release v1.82.1 2022-10-14 11:33:25 +03:00
Aliaksandr Valialkin
2e3be68617 lib/bytesutil: make sure that the string passed to FastStringMather.Match() is copied before using it as a key in the internal cache map
This prevents from possible corruption of the internal cache map
when the underlying byte slice used by the string key is modified.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3227
2022-10-14 09:51:19 +03:00
Denys Holius
b1de74a6ba k8s-monitoring-via-vm-cluster.md: remove extra spaces (#3234) 2022-10-13 17:25:35 +02:00
Yury Molodov
ff6151fa49 vmui: limit number of plotted series (#3229)
* feat: add maximum display series by tabs

* feat: add warning on PredefinedPanels.tsx

* docs/CHANGELOG.md: vmui limit number of plotted series

* docs/CHANGELOG.md: vmui limit number of plotted series

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-13 12:13:47 +03:00
Aliaksandr Valialkin
92f7fe306e app/vmagent/remotewrite: typo fix after c914e4dace 2022-10-13 12:04:10 +03:00
Aliaksandr Valialkin
e6fd33044f app/vmselect/promql: follow-up for 930f1ee153
Document the change at docs/CHANGELOG.md
Apply it to histogram_quantile() in the same way as to histogram_share()

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3225
2022-10-13 12:03:08 +03:00
Siqi Liu
930f1ee153 BUGFIX: properly calculate histogram_quantile with the same value and different string le (#3225)
Co-authored-by: 647(siki.liu) <siki.liu@huolala.cn>
2022-10-13 11:57:16 +03:00
Aliaksandr Valialkin
76e275ddef docs/CHANGELOG.md: document b856581ad3 2022-10-13 10:35:48 +03:00
Nikolay
b856581ad3 lib/backup: set s3 default region to us-west-2 (#3224)
* lib/backup: set s3 default region to us-west-2
it should fix an error with region detection for bucket, if AWS_REGION env var is not set

* Update lib/backup/s3remote/s3.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-13 10:30:07 +03:00
Dan Fredell
42ce4364fc Multi retention standardization of docs (#3223)
* Multi retention standardization of docs

While reading through the docs the implementation details had different formatting for storageNode. This standardizes them and adds a link to the retention docs.

* Update docs/guides/guide-vmcluster-multiple-retention-setup.md

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2022-10-13 10:27:16 +03:00
Aliaksandr Valialkin
c914e4dace app/vmagent/remotewrite: typo fix after 50f5eae0e0 2022-10-13 10:19:02 +03:00
Roman Khavronenko
96a106eab2 vmalert: update troubleshooting docs (#3228)
The default value of `-datasource.queryStep` has changed, so we update
the troubleshooting docs accordingly.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-12 12:12:37 +02:00
Aliaksandr Valialkin
185cff307b lib/mergeset: mention in the error message the path to the part, which triggered the error
This should improve debuggability
2022-10-12 09:54:21 +03:00
Aliaksandr Valialkin
cfae887c75 docs/CHANGELOG.md: document a4975ace86
The original commit, which led to the issue - 877940a131
2022-10-12 09:30:36 +03:00
Aliaksandr Valialkin
b8da90b893 app/vmselect/promql: properly handle zero and negative values for -search.maxMemoryPerQuery
This is a follow-up for 04a05f161c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-12 09:25:17 +03:00
Aliaksandr Valialkin
41925a9500 docs/CHANGELOG.md: document 9544b5cacf7446203fac19eb7c575779dc9b280e 2022-10-12 09:25:17 +03:00
Roman Khavronenko
a4975ace86 vmalert: revert unexpected fileds rename during refactoring (#3222)
Due to auto-refactoring, the filed `state` was automatically
renamed to `ruleState` when the entity with the same name
was renamed in other file. Reverting the change.

https://github.com/VictoriaMetrics/helm-charts/issues/391
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-11 12:37:47 +02:00
Aliaksandr Valialkin
b7887c426b vendor: make vendor-update 2022-10-10 22:03:45 +03:00
Aliaksandr Valialkin
a79272db1f docs/vmbackup.md: run make docs-sync after 076e721c22 2022-10-10 21:57:46 +03:00
Zakhar Bessarab
076e721c22 doc: describe usage of env variables for obtaining credentials (#3219) 2022-10-10 21:56:46 +03:00
Aliaksandr Valialkin
875abf0ef4 docs/CHANGELOG.md: document e384d88abf 2022-10-10 21:52:06 +03:00
Aliaksandr Valialkin
04a05f161c app/vmselect: return back the logic for limits the amounts of memory occupied by concurrently executed queries if -search.maxMemoryPerQuery isn't set
This is needed for preserving backwards compatibility with the previous releases of VictoriaMetrics.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-10 21:45:13 +03:00
Howie
e384d88abf fix issue#3053 (#3182)
vmalert: prevent duplicating label `alertname` for notifications

The issue has no impact on alerting procedure. But still needs to be fixed
for clarity. 

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3053

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-10-10 09:44:58 +02:00
Aliaksandr Valialkin
921918cb49 docs/sd_configs.md: document __scrape_timeout__, __scrape_interval__ and __series_limit__ labels 2022-10-09 15:05:30 +03:00
Aliaksandr Valialkin
50f5eae0e0 lib/promrelabel: remove unconditional sorting of the labels in ParsedConfigs.Apply(), since the sorting isnt needed in many places
Sort labels explicitly after calling the ParsedConfigs.Apply() when needed.

This reduces CPU usage when performing metric-level relabeling, where labels' sorting isn't needed.
2022-10-09 14:51:16 +03:00
Aliaksandr Valialkin
8e1ccecd97 docs: mention -search.maxMemoryPerQuery in the description to -search.maxConcurrentQueries command-line flag
This is a follow-up for 5138eaeea0

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-09 13:56:59 +03:00
Aliaksandr Valialkin
4f4d591ccb docs/vmbackupmanager.md: update docs after adding the support to make backups to Azure blob storage
This is a follow-up for 262ce77e2d
2022-10-08 10:30:39 +03:00
Aliaksandr Valialkin
8f8ce5e238 docs: add description for -search.maxMemoryPerQuery command-line flag 2022-10-08 01:16:55 +03:00
Aliaksandr Valialkin
5138eaeea0 app/vmselect: allow limiting per-query memory usage via -search.maxMemoryPerQuery command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-08 01:08:05 +03:00
Aliaksandr Valialkin
3aafbc3624 docs/CHANGELOG.md: add a link to a feature request for the feature, which allows specifying full scrape urls in targets list and in the __address__ label
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3208
2022-10-07 23:45:57 +03:00
Aliaksandr Valialkin
5269b1ad77 lib/promscrape: allow controlling staleness tracking on a per-scrape_config basis
Add support for no_stale_markers option at scrape_config section.
See https://docs.victoriametrics.com/sd_configs.html#scrape_configs and
https://docs.victoriametrics.com/vmagent.html#prometheus-staleness-markers
2022-10-07 23:36:14 +03:00
Aliaksandr Valialkin
93811da76d docs/CHANGELOG.md: document the 27ed4b853e
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3169
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3196#issuecomment-1269765205
2022-10-07 23:06:32 +03:00
Yury Molodov
27ed4b853e vmui: auto-update chart after query field removed (#3210)
* feat: run query after query field removed

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-07 23:02:41 +03:00
Aliaksandr Valialkin
b47caa86db all: update the minimum required Go verson from 1.19.1 to 1.19.2
This is needed because of security vulnerabilities found in Go 1.19.1
See https://go.dev/doc/devel/release#go1.19.2
2022-10-07 22:43:37 +03:00
Aliaksandr Valialkin
f9df0cae16 lib/promscrape: allow specifying full target url in __address__ label
Previously the `__address__` label could contain only `host:port` part of the target url,
while the scheme and metrics path were obtained from `__scheme__` and `__metrics_path__`
labels. Now it is possible to set the full url in `__address__` label.

This makes valid the following scrape config, which is frequently used by novice users:

scrape_configs:
- job_name: foo
  static_configs:
  - targets:
    - http://host1/metrics1
    - https://host2/metrics2
2022-10-07 22:43:04 +03:00
Aliaksandr Valialkin
8322760647 app/vmselect/vmui: make vmui-update after a54987f671 2022-10-07 03:31:17 +03:00
Aliaksandr Valialkin
8bc840358f docs/CHANGELOG.md: cut v1.82.0 2022-10-07 03:15:23 +03:00
Aliaksandr Valialkin
7b6ce3f75e docs/CHANGELOG.md: typo fix 2022-10-07 03:11:22 +03:00
Aliaksandr Valialkin
ba4050ab1f docs/CHANGELOG.md: add a changelog for v1.79.4 LTS release (copied from lts-1.79 branch) 2022-10-07 02:50:59 +03:00
Aliaksandr Valialkin
897e9ef427 go.mod: go mod tidy 2022-10-07 01:23:52 +03:00
Aliaksandr Valialkin
711698b858 lib/backup/azremote: typo fixes after 03872025b747fcc4ee98710ad10fc98764328511 2022-10-07 01:02:06 +03:00
Zakhar Bessarab
176f10f5b2 app/vmbackup: fix compatibility with latest azure sdk (#461) 2022-10-07 01:02:03 +03:00
Aliaksandr Valialkin
0cea525456 vendor: make vendor-update 2022-10-07 01:01:21 +03:00
Aliaksandr Valialkin
285e92706d docs/MetricsQL.md: formatting improvements
Also put a link to the function type in docs for every function.
This should simplify understanding of MetricsQL functions for novice users.
2022-10-07 00:53:14 +03:00
Aliaksandr Valialkin
f452c84579 app/vmselect/promql: properly calculate vm_rows_scanned_per_query histogram for rollup functions, which take into account only a few samples on the provided lookbehind window 2022-10-06 23:22:24 +03:00
Aliaksandr Valialkin
40e899fd67 app/vmselect/promql: properly calculate quantiles_over_time() over a single raw sample 2022-10-06 22:37:21 +03:00
Aliaksandr Valialkin
8ae713253e docs/CHANGELOG.md: remove duplicate description of the bugfix, which has been included in v1.81.2 2022-10-06 15:53:05 +03:00
Aliaksandr Valialkin
ecd2f7451b Makefile: remove docs/*.tmp files after running sed command there 2022-10-06 15:24:56 +03:00
Aliaksandr Valialkin
9acf1845f4 Makefile: add missing "=" char between "-i" flag and its value for sed
This is a follow-up after 78af27f955
2022-10-06 15:06:42 +03:00
Roman Khavronenko
78af27f955 docs: when modifying docs in place allow storing backups (#3205)
The stored backups would help to identify docs corruption
but aren't needed for commiting. So `.tmp` backup files
are also git-ignored.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-06 15:04:25 +03:00
Roman Khavronenko
5b10fa87b2 vmalert: fix misleading line regarding multitenancy (#3206)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-06 15:03:45 +03:00
Aliaksandr Valialkin
8d7910a463 docs/CHANGELOG.md: add missing link to /api/v1/export docs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3161
2022-10-06 15:02:13 +03:00
Aliaksandr Valialkin
d9282027e6 app: follow-up after ec04fcac93
* Optimize fast path for /api/v1/import when importing numeric values
* Move the docs about the change from features to bugfixes at docs/CHANGELOG.md
* Update tests at lib/protoparser/vmimport

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3161
2022-10-06 14:52:02 +03:00
Dmytro Kozlov
ec04fcac93 Properly parse json when export import metric (#3180)
* app/vmselect: properly work when export import json from `api/v1/{export, import}` API

* app/vmselect: update convert function

* app/vmselect: export null if `math.IsNaN(v)`

* app/vmselect: get float from json

* lib/protoparser: add test

* docs: add change log

* lib/protoparser: make export import api compatible
2022-10-06 13:54:20 +03:00
Zakhar Bessarab
97239e05ce lib/backup/s3remote: fix error checking for alternative S3 providers (#3191) 2022-10-06 13:36:40 +03:00
Aliaksandr Valialkin
dba49943d3 docs/Single-server-VictoriaMetrics.md: update docs after the a54987f671 2022-10-06 13:29:38 +03:00
Aliaksandr Valialkin
1e93ad84e3 lib/backup/azremote: remove unused methods after the 262ce77e2d 2022-10-06 13:08:58 +03:00
Yury Molodov
a54987f671 vmui: maximum queries (#3196)
vmui: allow using up 4 queries at the same time

The change also introduces UI updates to make using 
multiple queries more conveniently.
2022-10-06 07:10:21 +02:00
Aliaksandr Valialkin
cdf385f9e4 deployment/docker: update Go builder from v1.19.1 to v1.19.2
See https://github.com/golang/go/issues?q=milestone%3AGo1.19.2+label%3ACherryPickApproved
2022-10-06 02:01:32 +03:00
Aliaksandr Valialkin
5307cf068f docs: follow-up after 262ce77e2d
* Document the addition of Azure blob storage support in vmbackup / vmrestore
* List the supported storage system types at docs/vmrestore.md
* Mention about azblob storage system support at -src and -dst command-line flags
  for vmbackup / vmrestore tools.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1029
2022-10-06 00:35:34 +03:00
Zakhar Bessarab
262ce77e2d lib/backup: add support of Azure Blob Storage (#460)
* lib/backup: add support of Azure Blob Storage

* lib/backup: add enterprise support of Azure Blob Storage
2022-10-06 00:32:46 +03:00
Aliaksandr Valialkin
f596e49881 docs/CHANGELOG.md: document f8ac55d70ada9ef8490b322abefb05f28f75e2e9 2022-10-06 00:05:37 +03:00
Aliaksandr Valialkin
c45c61cf93 app/vmalert: follow-up after f8ac55d70ada9ef8490b322abefb05f28f75e2e9
* Use vm_account_id and vm_project_id labels to be consistent with https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels
* Document the feature that vmalert now exposes vm_account_id and vm_project_id
  labels if -clusterMode is set.
* Use literal strings instead of string constants for vm_account_id and vm_project_id.
  This improves code readability.
2022-10-06 00:05:33 +03:00
Aliaksandr Valialkin
703094a37a app/vmbackupmanager: expose vm_backup_in_flight metrics (follow-up after c6bca4a0b47b4f5626e1913d3480e62d657ed4cf) 2022-10-05 23:30:21 +03:00
Aliaksandr Valialkin
f9fc838b7b app/vmalert: update -external.alert.source command-line flag description after 61544e13ad 2022-10-05 22:52:38 +03:00
Aliaksandr Valialkin
36584ef52c README.md: sync with docs/README.md after e0ea76db62 2022-10-05 22:52:36 +03:00
Aliaksandr Valialkin
440495df52 Makefile: fix sed command and if condition for Linux bash after 2d11896486 2022-10-05 22:40:04 +03:00
Roman Khavronenko
61544e13ad vmalert: allow using {{$labels}} for templating in -external.alert.source (#3194)
The change is supposed to provide additional flexibility for generating alert's
source link based on label values.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-05 19:25:03 +02:00
Zakhar Bessarab
6711eec109 docker-compose: move TooManyLogs into vm-health alerts set (#3199) 2022-10-05 19:23:36 +02:00
Dmytro Kozlov
e0ea76db62 docs: add cardinality explorer blog post (#3198)
docs: add cardinality explorer blog post
2022-10-05 13:19:41 +02:00
Roman Khavronenko
2d11896486 docs: udpate Datadog section (#3190)
The `docs-sync` command was updated to modify images path
for assets in `docs/` folder. The change allows to refer images from `docs`
without copying them to the root folder.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-03 13:13:07 +02:00
Yurii Kravets
272f00dbb6 docs: Update How to send data from DataDog agent (#3168)
docs: Update How to send data from DataDog agent
2022-10-03 11:45:05 +02:00
Aliaksandr Valialkin
c53b7e66ef app/vmselect: improve performance scalability on multi-CPU systems for /api/v1/export/... endpoints 2022-10-01 22:05:43 +03:00
Aliaksandr Valialkin
49311ae977 app/vmselect/prometheus: improve scalability of /federate endpoint on systems with many CPU cores
Minimize usage of global lock inside bufferedwriter.Write() when processing `/federate` data
on systems with many CPU cores
2022-10-01 20:13:24 +03:00
Aliaksandr Valialkin
fb1cc3cc94 app/vmselect/promql: increase scalability of incremental aggregate calculations on systems with many CPU cores
Use sync.Map instead of a global mutex there. This should lift scalability limits
on systems with many CPU cores.
2022-10-01 20:00:03 +03:00
Aliaksandr Valialkin
fcc7ab71b3 app/vmselect: do not export NaN values for stale metrics at /federate endpoint
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3185
2022-10-01 19:47:37 +03:00
Aliaksandr Valialkin
7812761ab4 docs/Cluster-VictoriaMetrics.md: add a security note for multitenant support via labels 2022-10-01 18:57:28 +03:00
Aliaksandr Valialkin
d7327d2f02 docs/vmagent.md: fix incorrect url for multitenant writes to VictoriaMetrics cluster 2022-10-01 18:51:37 +03:00
Aliaksandr Valialkin
0dc93cca7f app/vmagent/remotewrite: allow specifying per--remoteWrite.url disk limits for persistent queue with pending data
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3071

Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2970
2022-10-01 18:40:59 +03:00
Aliaksandr Valialkin
c1fa9828b3 lib/flagutil: rename Array to ArrayString
This makes the ArrayString more consistent with other Array* types.

While at it, add ArrayBytes type, which will be used for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3071
2022-10-01 18:26:36 +03:00
Aliaksandr Valialkin
366f04001b vendor: make vendor-update 2022-10-01 17:20:11 +03:00
Zakhar Bessarab
87c77727e4 vmbackup: update AWS SDK to v2 (#3174)
* lib/backup/s3remote: update AWS SDK to v2

* Update lib/backup/s3remote/s3.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>

* lib/backup/s3remote: refactor error handling

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-01 17:12:07 +03:00
Aliaksandr Valialkin
725dfb0ed6 lib/httpserver: use 302 redirects instead of 301 redirects
Incorrect 301 redirects can be cached by user agents such as web browsers.
This can complicate recovery procedure after the incorrect redirect is fixed,
e.g. web browser cache must be reset.

The related issue - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1752
2022-10-01 16:53:35 +03:00
Aliaksandr Valialkin
a296994fed app/vmauth: do not remove trailing slash from the proxied path
This should fix the issue with opening VMUI at /vmui/ page.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1752
2022-10-01 16:52:30 +03:00
Aliaksandr Valialkin
4998402004 lib/promscrape: add external_labels from global section of -promscrape.config after the relabeling is applied to the scraped metrics
This aligns with Prometheus behaviour.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3137
2022-10-01 16:13:19 +03:00
Aliaksandr Valialkin
3a98ef2f5f lib/promrelabel: export MustParseMetricWithLabels function, which can be used for simplifying tests 2022-10-01 16:05:51 +03:00
Aliaksandr Valialkin
f86070169d lib/promscrape/discovery/azure: remove unneeded conversion to string 2022-10-01 16:04:37 +03:00
Aliaksandr Valialkin
973ce4b561 app/vmagent: accept requests with /prometheus and /influx path prefixes in the same way as VictoriaMetrics does
This allows using vmagent as a drop-in replacement for VictoriaMetrics
for push protocols.
2022-10-01 14:10:48 +03:00
Aliaksandr Valialkin
63a94b1d54 docs/vmagent.md: formatting fixes 2022-10-01 13:36:24 +03:00
Aliaksandr Valialkin
5b83e6e14e app/vmagent/README.md: apply typo fix from 1fda517af9 2022-10-01 12:31:06 +03:00
lionelee
1fda517af9 docs/vmagent.md: fix typo (#3186) 2022-10-01 12:30:01 +03:00
Aliaksandr Valialkin
db16759c68 lib/storage: optimize matching speed for non-trivial regexp filters
Wrap re.Match into bytesutil.FastStringMatcher.

This increases performance for `{foo=~"complex_regex_here"}` filters
by up to 4x.
2022-10-01 12:06:06 +03:00
Aliaksandr Valialkin
9e8fbef27e docs/CHANGELOG.md: clarify the description of the improvement in relabeling performance 2022-10-01 11:54:48 +03:00
Aliaksandr Valialkin
e8a64f6e7a lib/promrelabel: remove redundant memory allocations by using interned strings 2022-10-01 11:50:21 +03:00
Aliaksandr Valialkin
73dc17ef64 lib/promrelabel: add a benchmark for realistic Kubernetes relabeling
The benchmark name is BenchmarkApplyRelabelConfigs/kubernetes

This benchmark has been copied from d521933053/model/relabel/relabel_test.go (L505)

See also https://github.com/prometheus/prometheus/pull/11147
2022-10-01 10:38:22 +03:00
Aliaksandr Valialkin
c54e14cdec lib/promscrape/discovery/ec2: expose __meta_ec2_region label in the same way as Prometheus 2.39 does
See https://github.com/prometheus/prometheus/pull/11326
2022-09-30 20:48:32 +03:00
Aliaksandr Valialkin
4d27fa41c8 Makefile: run errcheck for all the app/... subdirs 2022-09-30 18:35:53 +03:00
Aliaksandr Valialkin
d0b7172316 app/vminsert: remove support for undocumented VictoriaMetrics_AccountID and VictoriaMetrics_ProjectID labels in tcp-based data ingestion endpoints
These labels are substituted by documented vm_account_id and vm_project_id labels.

See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels

This is a follow up for 505d359b39
2022-09-30 18:35:53 +03:00
Nikolay
33f40f4a5f app/vminsert: allows parsing tenant id from labels (#3009)
* app/vminsert: allows parsing tenant id from labels
it should help mitigate issues with vmagent's multiTenant mode, which works incorrectly at heavy load
and it cannot handle more then 100 different tenants.
This functional hidden with flag and do not change vminsert default behaviour
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2970

* Update docs/Cluster-VictoriaMetrics.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* wip

* app/vminsert/netstorage: clean remaining labels in order to free up GC

* docs/Cluster-VictoriaMetrics.md: typo fix

* wip

* wip

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-30 18:35:53 +03:00
Roman Khavronenko
740bb2cc00 vmalert: support auth configs per static_target (#3188)
Allow configuring authorization params per list of targets
in vmalert's notifier config for `static_configs`.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2690

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-30 17:10:17 +02:00
Aliaksandr Valialkin
171dd14aa3 lib/promrelabel: go fmt 2022-09-30 12:28:55 +03:00
Aliaksandr Valialkin
a18d6d5ccc lib/promrelabel: optimize action: replace for non-trivial regex values
Cache `action: replace` results for non-trivial regexs and return them next time
instead of performing CPU-intensive regex replacement.

Optimize also `action: labelmap_all` and `action: replace_all` in the same way.
2022-09-30 12:25:05 +03:00
Aliaksandr Valialkin
146021a076 lib/promrelabel: there is no need in calling regex.HasPrefix() after the optimization at 17289ff481 2022-09-30 10:49:18 +03:00
Aliaksandr Valialkin
899d2c40fb lib/promrelabel: optimize action: labelmap for non-trivial regexs 2022-09-30 10:43:31 +03:00
Aliaksandr Valialkin
17289ff481 lib/regexutil: cache MatchString results for unoptimized regexps
This increases relabeling performance by 3x for unoptimized regexs
2022-09-30 10:41:29 +03:00
Dmytro Kozlov
e220bc3cd5 docs, app/vmgateway: add description about new auth.httpHeader flag (#3134)
* docs, app/vmgateway: add description about new `auth.httpHeader` flag

* Update docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-30 08:55:12 +03:00
Aliaksandr Valialkin
b70f815dc4 app/vmselect/promql: remove empty series before applying aggregate function
Previously empty series (e.g. series with all NaN samples) were passed to aggregate functions.
Such series must be ingored by all the aggregate functions.
So it is better from consistency PoV filtering out empty series before applying aggregate functions.
2022-09-30 08:39:54 +03:00
Roman Khavronenko
b64b9b9fec app/vmselect: ignore empty series for limit_offset (#3178)
* app/vmselect: ignore empty series for `limit_offset`

VictoriaMetrics doesn't return empty series (with all NaN values) to
the user. But such series are filtered after transform functions.
It means `limit_offset` will account for empty series as well.

For example, let's consider following data set:
```
time series:
foo{label="1"} NaN, NaN, NaN, NaN // empty series
foo{label="2"} 1, 2, 3, 4
foo{label="3"} 4, 3, 2, 1
```

When user requests all series for metric `foo` the empty series
will be filtered out:
```
/query=foo:
foo{label="v2"} 1, 2, 3, 4
foo{label="v3"} 4, 3, 2, 1
```

But `limit_offset(1, 1, foo)` is applied to original series, not filtered yet.
So it will return `foo{label="v2"}` (skips the first in list)
```
/query=limit_offset(1, 1, foo):
foo{label="v2"} 1, 2, 3, 4
```

Expected result would be to apply `limit_offset` to already filtered list,
so in result we receive `foo{label="v3"}`:
```
/query=limit_offset(1, 1, foo):
foo{label="v3"} 4, 3, 2, 1
```

The change does exactly that - filters empty series before applying `limit_offset`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmselect: ignore empty series for `limit_offset`

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-30 08:20:34 +03:00
Aliaksandr Valialkin
fda60b3d4d lib/promrelabel: properly parse regex with escaped $ at the end
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3131

Thanks to @dmitryk-dk for the initial fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3179
2022-09-30 08:15:43 +03:00
Aliaksandr Valialkin
bf2f14a3a6 docs/CHANGELOG.md: document 39f559d22b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3013
2022-09-30 07:48:25 +03:00
Aliaksandr Valialkin
593da3603e lib/bytesutil: move InternString() from lib/promscrape/discoverytutils to lib/bytesutil
lib/bytesutil is more appropriate place for InternString() function
2022-09-30 07:44:35 +03:00
Nikolay
f61b8cec69 lib/awsapi: fixes sign encoding (#3183)
* lib/awsapi: fixes sign encoding

previously white spaces at filter were incorrectly encoded
encoding tip was copied from aws signing lib
For example, the space character must be encoded as %20 (not using '+', as some encoding schemes do)
https://docs.aws.amazon.com/general/latest/gr/sigv4-create-canonical-request.html
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3171

* Update lib/awsapi/sign.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-30 07:43:44 +03:00
Roman Khavronenko
39f559d22b vmalert: allow using extra labels in annotations (#3181)
According to Ruler specification, only labels returned within time series
should be available for use in annotations.

For long time, vmalert didn't respect this rule. And in PR
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2403
this was fixed for the sake of compatibility. However, this resulted
into users confusion, as they expected all configured and extra labels
to be available - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3013

This fix allows to use extra labels in Annotations. But in the case of conflicts
the original labels (extracted from time series) are preferred.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-29 18:22:50 +02:00
Aliaksandr Valialkin
6a32a64073 lib/bytesutil: add FastStringTransformer and use it in the rest of the code where needed 2022-09-28 10:41:00 +03:00
Aliaksandr Valialkin
92b3622253 lib/protoparser/datadog: optimize sanitizeName() function by using result cache for input strings
This is a follow-up for 7c2474dac7

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3105
2022-09-28 10:40:59 +03:00
Aliaksandr Valialkin
ef435f8cc4 lib/promrelabel: add SanitizeName() function for sanitizing Prometheus metric names and label names
Optimize this function by using results cache for input strings.
Use this function all over the code.

This is a follow-up for fcffdba9dc

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113
2022-09-28 10:40:59 +03:00
nemobis
6a818dbddf docs: fix typo in Cluster-VictoriaMetrics.md (#3172) 2022-09-28 09:18:29 +02:00
panguicai
fbc85e654c docs: fix typo for vmalert docs (#3173)
Signed-off-by: panguicai008 <1121906548@qq.com>
2022-09-28 09:15:05 +02:00
Roman Khavronenko
4ad3b36630 docs: update img for key concepts (#3176)
Make dots bigger for range_query, so it will be easier
to read the graph.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-27 21:22:59 +02:00
Aliaksandr Valialkin
6411bbcce7 lib/netutil/tls.go: consistently use tlsMinVersion name across source code
This should simplify further code maintenance and refactoring

This is a follow-up after 6ab1cede62
2022-09-26 17:58:01 +03:00
Dmytro Kozlov
6ab1cede62 lib/{httpserver,netutil}: allow to define min and max TLS version of the http server (#3109)
* lib/{httpserver,netutil}: allow to define min and max TLS version of the http server

* lib/httpserver: added descriptions about tls supported versions

* lib/netutil: check minimal tls version, added supported tls versions to error

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-26 17:35:45 +03:00
Denys Holius
d63410bf6f docs: managed_victoriametrics/quickstart.md fix image link (#3165) 2022-09-26 16:35:18 +02:00
Roman Khavronenko
36a9a834b3 docs: clarify numeric label values for label_value (#3129)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3114
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-26 16:48:38 +03:00
Max Golionko
d67948b8e3 added security policy (#3140)
* added security policy

* Update SECURITY.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-26 16:47:02 +03:00
Denys Holius
32be84fc75 Adds packer build for server with VM Single node in vultr.com marketplace (#3142)
* adds packer build for server with VM Single node in vultr.com marketplace

* fix missed varibale
2022-09-26 16:44:36 +03:00
Roman Khavronenko
e96ccf3f71 lib/mergeset: follow-up after a0e7432e42 (#3145)
* lib/mergeset: follow-up after a0e7432e42

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-26 16:39:56 +03:00
Aliaksandr Valialkin
72c29d762e docs/CHANGELOG.md: document f022296d96
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3147
2022-09-26 16:32:19 +03:00
Zakhar Bessarab
f022296d96 vmbackup: configure retries for GCS remote FS (#3156) 2022-09-26 16:28:20 +03:00
Aliaksandr Valialkin
1bac96dfce vendor: make vendor-update 2022-09-26 15:44:55 +03:00
Aliaksandr Valialkin
a2431c2a88 docs/CHANGELOG.md: document 166d444159 2022-09-26 15:39:14 +03:00
Roman Khavronenko
166d444159 vmselect/rollup: rm workaround for slow-changing counters (#3163)
The workaround was introduced to fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/962.
However, it didn't prove itself useful. Instead, it is recommended using `increase_pure` function.

Removing the workaround makes VM to produce accurate results when calculating
`delta` or `increase` functions over slow-changing counters with vary intervals
between data points.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-26 15:33:25 +03:00
Aliaksandr Valialkin
41f8c2987d lib/protoparser/graphite: accept whitespace in metric names and tags according to the specification
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/99
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3102

See the specification https://graphite.readthedocs.io/en/latest/tags.html
2022-09-26 15:17:25 +03:00
Aliaksandr Valialkin
5983ecf4d1 docs/Single-server-VictoriaMetrics.md: mention that VictoriaMetrics sanitizes metric names in DataDog data by default
This is a follow-up for 7c2474dac7
2022-09-26 14:02:43 +03:00
Aliaksandr Valialkin
7c2474dac7 lib/protoparser/datadog: sanitize metric names by default in the same way as DataDog does
This commit is based on the pull request https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3105

Thanks to @PerGon for the idea and initial implementation.
2022-09-26 13:57:23 +03:00
Aliaksandr Valialkin
fcffdba9dc app/{vmagent,vminsert}: add -usePromCompatibleNaming command-line flag for normalizing metric names and label names in the ingested samples
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113

Thanks to @erkexzcx for the idea and the initial pull request at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3146
2022-09-26 13:11:39 +03:00
Aliaksandr Valialkin
819aa95552 docs/vmalert.md: follow-up for 0c95f928ae
- Clarify the description for -datasource.queryStep command-line flag
- Consistently use a single dash in front of -datasource.queryStep command-line flag
- Update -help output at docs/vmalert.md
2022-09-26 08:47:31 +03:00
Aliaksandr Valialkin
6b71f33c8b docs/vmalert.md: follow-up after 7748a9d629
- Consistently use single dash in front of command-line flags instead of double dashes.
- Add a warning that too small -search.latencyOffset may lead to incomplete query results.
2022-09-26 08:36:24 +03:00
Aliaksandr Valialkin
b68cd810fc docs: follow-up for 03d54ac890
- Typo fix: 'adn' -> 'and'
- Remove duplicate link to https://victoriametrics.com/blog/victoriametrics-monitoring/
  from docs/Articles.md . The initial link has been already added in 27254096b2
2022-09-26 08:26:38 +03:00
Roman Khavronenko
908fe6a623 dashboards: replace Index size panel with Active series (#3157)
Panel `Index size` showed itself impractical for users. So
replacing it with `Active series` panel.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/776#issuecomment-1255823734
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-25 21:49:18 +02:00
Roman Khavronenko
0c95f928ae vmalert: set default value for datasource.queryStep to 5m (#3149)
Change default value for command-line flag `datasource.queryStep` from `0s` to `5m`.
Param `step` is added by vmalert to every rule evaluation request sent to datasource.
Before this change, `step` was equal to group's evaluation interval by default.
Param `step` for instant queries defines how far VM can look back for the last written data point.
The change supposed to improve reliability of the rules evaluation when evaluation interval
is lower than scraping interval.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-23 10:22:53 +02:00
Roman Khavronenko
7748a9d629 vmalert: add info about search.latencyOffset to Troubleshooting (#3151)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-23 09:49:14 +02:00
Roman Khavronenko
fc9caa6738 deployment/docker: fix image versions for cluster components (#3150)
Cluster components always have `-cluster` suffix. The change fixes
incorrect image tag in docker-compose manifest.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-23 09:48:46 +02:00
Roman Khavronenko
03d54ac890 docs: mention VictoriaMetrics Monitoring blog post (#3152)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-23 07:12:15 +02:00
匠心零度
3d5509a720 lib/querytracer: fix comment (#3135) 2022-09-22 19:19:48 +03:00
Aliaksandr Valialkin
27254096b2 docs/Articles.md: add a link to https://victoriametrics.com/blog/victoriametrics-monitoring/ 2022-09-22 19:17:10 +03:00
Aliaksandr Valialkin
a3e536c0e7 docs/Articles.md: add a link to https://dataswamp.org/~solene/2022-09-11-exploring-monitoring-stacks.html 2022-09-22 19:17:10 +03:00
Denys Holius
e347dd7cc6 deployment/docker: add version tag for docker containers (#3141)
* deployment/docker/docker-compose.yml: adds version tags for VictoriaMetrics containers

* deployment/docker/docker-compose-cluster.yml: adds version tags for VictoriaMetrics containers
2022-09-21 14:38:09 +02:00
Aliaksandr Valialkin
dc4b87621f vendor: make vendor-update 2022-09-21 11:54:32 +03:00
Roman Khavronenko
5714a68ac6 deployment/docker: move cluster compose env to master branch (#3130)
* deployment/docker: move cluster compose env to master branch

The change supposed to simplify the process of maintaining for
single/cluster docker-compose envs, alerts, dashboards. It also
supposes to reduce confusion for users when looking for cluster
related alerts/configs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* deployment/docker: move cluster compose env to master branch

Review updates.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-21 11:48:38 +03:00
Dmytro Kozlov
6a794ec5d5 app/{vmctl,vmalert}: update progress bar library (make vendor-update) (#3138)
* app/{vmctl,vmalert}: update progress bar library (make vendor-update)

* app/{vmctl,vmalert}: make vendor-update
2022-09-21 11:08:33 +03:00
Roman Khavronenko
d61cce06fd vmalert: prodvide more details on duplicates (#3136)
Now vmalert will print the following messages on dupliсates:
```
"recording rule \"record\"; expr: \"up == 1\"; labels: summary={{ value|query }}" is a duplicate within the group "test"
"alerting rule \"alert\"; expr: \"up == 1\"; labels: description={{ value|query }}" is a duplicate within the group "test"
```

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3127
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-20 12:52:46 +02:00
Aliaksandr Valialkin
5e2fcd455d app/vmagent/remotewrite: go fmt after 2b55d167d7 2022-09-19 15:14:35 +03:00
Aliaksandr Valialkin
310d0caec2 vendor: make vendor-update 2022-09-19 15:12:22 +03:00
Aliaksandr Valialkin
fd98ec8ba3 Makefile: fix -compat value passed to go mod tidy 2022-09-19 15:12:03 +03:00
Aliaksandr Valialkin
1aef635de4 docs/CHANGELOG.md: clarify the change at 622bbedbe1
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3119
2022-09-19 15:02:05 +03:00
Aliaksandr Valialkin
584a5d6b1f docs/managed_victoriametrics/quickstart.md: small fixes after 606166ef68 2022-09-19 14:52:46 +03:00
Aliaksandr Valialkin
2b55d167d7 app/vmagent/remotewrite: add benchmarks for comparing the performance of standard Snappy encoder with github.com/klauspost/compress/s2 encoder
The standard Snappy encoder from github.com/golang/snappy shows quite good performance number
for compressing the Prometheus remote_write proto messages according to the added benchmarks,
so there is no need in switching to github.com/klauspost/compress/s2 yet.
2022-09-19 14:28:09 +03:00
Roman Khavronenko
b4410b1c63 Dashboards (#3120)
* dashboards/cluster: few updates

* apply consistent formatting across panels;
* make resource usage panels per component more detailed;
* add extra panels to vmselect for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/single: few updates

* apply consistent formatting across panels;
* add extra panels to Performance for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmagent: few updates

* apply consistent formatting across panels;
* add panels for showing number of samples ingested
or scraped;
* adapt resource usage panels for multiple selected jobs/instances;
* add adhoc variable;
* display vmagent's version in Stats.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmalert: few updates

* apply consistent formatting across panels;
* adapt resource usage panels for multiple selected jobs/instances;
* show vmalert version in Stats section.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-16 21:24:32 +02:00
Roman Khavronenko
622bbedbe1 vmalert: always re-evaluate Annotations (#3119)
* vmalert: always re-evaluate Annotations

Previously, Annotations were evaluated only:
1. On alert creating.
2. On alert's value change.

This is premature optimization. It was assumed that since annotations
could contain only text with alert's labels or value - there is no need
in spending resources to re-compile Annotations.

Later, template function `query` was added, which can execute
arbitrary queries and return different results on every evaluation.
So if it was used in annotations, it would be executed only on init
or value change.

Another case when optimization caused an issue - annotations hot reload.
In this case, annotations of the active alert won't change even if Rule's
annotations were changed.

This fix enables Annotations re-evaluation on each iteration to resolve
issues above. It would have some impact on performance, but it is unlikely
it will be noticeable.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: add tp Changelog

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-16 16:19:10 +02:00
Roman Khavronenko
3484673566 vmalert: add Troubleshooting section to docs (#3115)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-15 16:15:39 +02:00
Dmytro Kozlov
606166ef68 docs/managed_victoriametrics: add how to restore password section (#3116)
docs/managed_victoriametrics: add how to restore password section
2022-09-15 16:02:31 +02:00
Roman Khavronenko
9c95c81534 vmalert: print example of curl command for rule's state (#3112)
The change adds an example of `curl` command to the Rule's page.
The command is generated for each recorded state. It is supposed
user can just copy&execute the command to see what was returned
to vmalert.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-15 12:40:22 +02:00
Aliaksandr Valialkin
455002922e docs/Cluster-VictoriaMetrics.md: update -help output after explicit marking of enterprise flags 2022-09-15 13:22:58 +03:00
Aliaksandr Valialkin
e7995375b5 docs: update -help output after explicit mentioning of enterprise flags 2022-09-15 13:22:57 +03:00
Aliaksandr Valialkin
4193af4571 docs/vmauth.md: update -help output after explicit marking of enterprise flags 2022-09-15 13:22:57 +03:00
Aliaksandr Valialkin
3b2599c659 docs/vmalert.md: update -help output after explicit marking of enterprise flags 2022-09-15 13:22:57 +03:00
Aliaksandr Valialkin
bbecd27557 docs: update docs with explicitly marked enterprise command-line flags for VictoriaMetrics and vmagent 2022-09-15 13:22:56 +03:00
Dmytro Kozlov
a9629cc32d app/vmctl: add description about influx-skip-database-label flag (#3111) 2022-09-15 10:46:54 +02:00
Aliaksandr Valialkin
b869c757a9 docs/FAQ.md: add an answer for What is the difference between single-node and cluster versions of VictoriaMetrics? 2022-09-15 09:56:41 +03:00
Dmytro Kozlov
b75f1854c5 vmselect/promql: add alphanumeric sort by label (sort_by_label_numeric) (#2982)
* vmselect/promql: add alphanumeric sort by label (sort_by_label_numeric)

* vmselect/promql: fix tests, add documentation

* vmselect/promql: update test

* vmselect/promql: update for alphanumeric sorting, fix tests

* vmselect/promql: remove comments

* vmselect/promql: cleanup

* vmselect/promql: avoid memory allocations, update functions descriptions

* vmselect/promql: make linter happy (remove ineffectual assigment)

* vmselect/promql: add test case, fix behavior when strings are equal

* vendor: update github.com/VictoriaMetrics/metricsql from v0.44.1 to v0.45.0

this adds support for sort_by_label_numeric and sort_by_label_numeric_desc functions

* wip

* lib/promscrape: read response body into memory in stream parsing mode before parsing it

This reduces scrape duration for targets returning big responses.

The response body was already read into memory in stream parsing mode before this change,
so this commit shouldn't increase memory usage.

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-14 17:41:09 +03:00
Aliaksandr Valialkin
5306f79fd1 docs/Articles.md: add a link to https://www.groundcover.com/blog/prometheus-grafana-kubernetes 2022-09-14 15:11:19 +03:00
Aliaksandr Valialkin
56ce7ce85b lib/promscrape: typo fix after 74c00a8762 2022-09-14 15:06:50 +03:00
Roman Khavronenko
877940a131 vmalert: add experimental feature of storing Rule's evaluation state (#3106)
vmalert: add experimental feature of storing Rule's evaluation state

The new feature keeps last 20 state changes of each Rule
in memory. The state are available for view on the Rule's
view page. The page can be opened by clicking on `Details`
link next to Rule's name on the `/groups` page.

States change suppose to help in investigating cases when Rule
doesn't generate alerts or records.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-14 14:04:24 +02:00
Aliaksandr Valialkin
99bc18774c app/vmselect/vmui: make vmui-update after 1304824201 2022-09-14 13:33:06 +03:00
Roman Khavronenko
efea51a9ee bump Go version to 1.19.1 (#3108)
The reason is to cover vulnerability GO-2022-0969
Found in: net/http@go1.18.5
Fixed in: net/http@go1.19.1
More info: https://pkg.go.dev/vuln/GO-2022-0969

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-14 12:29:19 +02:00
Aliaksandr Valialkin
74c00a8762 lib/promscrape: read response body into memory in stream parsing mode before parsing it
This reduces scrape duration for targets returning big responses.

The response body was already read into memory in stream parsing mode before this change,
so this commit shouldn't increase memory usage.
2022-09-14 13:15:29 +03:00
Aliaksandr Valialkin
ccad651a61 lib/promscrape/discovery/kubernetes: add more context on WatchEvent parse error
This should improve debugging issues with Kubernetes API server
2022-09-13 19:36:55 +03:00
Yury Molodov
1304824201 vmui: fix query params saving in URL (#3104) 2022-09-13 17:05:26 +02:00
Aliaksandr Valialkin
3af24a5b7c docs: consistently use docs.victoriametrics.com instead of victoriametrics.github.io in all the links 2022-09-13 16:50:13 +03:00
Aliaksandr Valialkin
523ff25077 vendor: make vendor-update 2022-09-13 16:44:44 +03:00
Aliaksandr Valialkin
0ead64b6cf app/vmalert: follow-up after 8441375da2
- Rename logDebug() to logDebugf() and pass format string together
  with format args directly to logDebugf(). This eliminates fmt.Sprintf()
  overhead at logDebug() call site when debugging is disabled.

- Format labels in debug message in Prometheus format, e.g. {label1="value1",...labelN="valueN"}

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3025
2022-09-13 16:35:14 +03:00
Roman Khavronenko
8441375da2 vmalert: add debug mode for alerting rules (#3055)
* vmalert: add `debug` mode for alerting rules

Debug information includes alerts state changes and requests
sent to the datasource. Debug can be enabled only on rule's
level. It might be useful for debugging unexpected
behaviour of alerting rule.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3025

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: review fixes

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update app/vmalert/alerting.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>

* vmalert: go fmt

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-13 16:25:43 +03:00
Aliaksandr Valialkin
25c9a1604a docs/CHANGELOG.md: document atomic directory deletion
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:39 +03:00
Aliaksandr Valialkin
ce2c07c5a7 lib/mergeset: atomically remove part dirs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
042a532f70 lib/storage: substitute remaining calls to fs.MustRemoveAll with fs.MustRemoveDirAtomic
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
68e32b0764 lib/storage: atomically remove parts inside partitions
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
db4f0fe6fc docs/ExtendedPromQL.md: move the page to the bottom of the contents list at http://docs.victoriametrics.com 2022-09-13 16:17:37 +03:00
Aliaksandr Valialkin
e041c913bd Revert "docs/ExtendedPromQL.md: remove outdated doc"
This reverts commit 971e3d83f7.

Reason for revert: the ExtendedPromQL doc is still referred from third-party sites.
2022-09-13 16:17:37 +03:00
Aliaksandr Valialkin
340ada871d lib/storage: atomically remove partitions, which went outside the configured retention
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:37 +03:00
Aliaksandr Valialkin
978dcb4574 lib/storage: properly remove cache directory contents if reset_cache_on_startup file is located there
Previously the cache directory was removed. This could result in error when the cache directory
is mounted to a separate filesystem.
2022-09-13 16:17:36 +03:00
Aliaksandr Valialkin
5f28ca1f42 lib/storage: atomically remove snapshot directories
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:36 +03:00
Craig Rodrigues
57ea8dbb36 docs: Use headings in DataDog section to make things easier to read (#3089)
Signed-off-by: Craig Rodrigues <rodrigc@crodrigues.org>

Signed-off-by: Craig Rodrigues <rodrigc@crodrigues.org>
2022-09-12 10:58:25 +03:00
Yury Molodov
1b41169415 vmui: fix data processing (#3092)
* fix: change data processing

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-12 10:55:11 +03:00
Yury Molodov
b5f4060520 fix: change columns for Top Queries (#3093) 2022-09-12 10:44:35 +03:00
Aliaksandr Valialkin
c48ff746c6 docs/FAQ.md: add a link to https://victoriametrics.com/blog/mimir-benchmark/ to VictoriaMetrics vs Mimir chapter 2022-09-12 10:37:28 +03:00
Aliaksandr Valialkin
c4af0e833a docs/Articles.md: add a link to https://victoriametrics.com/blog/mimir-benchmark/ 2022-09-11 15:23:35 +03:00
Yury Molodov
9541ef2e9e vmui: add lists of top queries (#3065)
* feat: add lists of top queries

* fix: change the field label

* refactor: add handlers for readability

* app/vmselect: `make vmui-update`

* docs: document `top queries` tab

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-08 21:43:37 +03:00
John Belmonte
defced2599 MetricsQL doc spellcheck (#3080) 2022-09-08 21:28:03 +03:00
Aliaksandr Valialkin
53b0c2eee4 docs/CHANGELOG.md: document ec273eafef
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3076
2022-09-08 21:25:30 +03:00
Aliaksandr Valialkin
e7635e1c83 Makefile: remove github-create-release and github-upload-assets commands from publish-release
This is a follow-up for b9231c715a
2022-09-08 21:02:41 +03:00
Aliaksandr Valialkin
7c2fa1bc48 vendor: make vendor-update 2022-09-08 18:51:49 +03:00
Aliaksandr Valialkin
aa0c6ed27f Makefile: consistently use go install instead of go get for installing various binaries needed during build/test/check of the code
`go install` is the preferred way for installing go binaries starting
from the minimum supported Go version for VictoriaMetrics - Go1.18 -
see https://tip.golang.org/doc/go1.18#go-command
2022-09-08 18:43:05 +03:00
Aliaksandr Valialkin
28b6dec1f4 .github/workflows/main.yml: stop setting GO111MODULE=on env var, since it is unnecessary in Go1.18 and newer versions 2022-09-08 18:41:56 +03:00
Aliaksandr Valialkin
5dad557868 Makefile: check for vulnerabilities in used Go packages with govulncheck when running make check-all
See https://go.dev/blog/vuln
2022-09-08 18:35:34 +03:00
Aliaksandr Valialkin
f81dfaf20d deployment/docker: update Go builder for prod binaries from Go1.19.0 to Go1.19.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.19.1+label%3ACherryPickApproved
2022-09-08 18:27:10 +03:00
Aliaksandr Valialkin
b9231c715a docs/Release-Guide.md: require manual push of the created release tags to public Github
The automated push of release tags to Github require specifying the remote repository name
when doing `git push <remote-repo-name> v1.xx.y`.
The remote repository name can differ in different environments,
so it cannot be put into Makefile rule.

TODO: create a Makefile rule, which generates standard remote names for public
and private repositories in Git, so `git push` for release tags could be automated then.
2022-09-08 15:00:18 +03:00
Aliaksandr Valialkin
07441b1cee app/vmselect/netstorage: fix a typo, which leads to incorrect query results in VictoriaMetrics cluster
The typo has been introduced in the commit 1a254ea20c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3067
2022-09-08 13:48:20 +03:00
Aliaksandr Valialkin
9f20d01a81 Revert "docs: add Monitoring at scale with Victoria Metrics (#3078)"
This reverts commit f24572fa65,
because the article https://tech.bedrockstreaming.com/2022/09/06/monitoring-at-scale-with-victoriametrics.html
has been already added at 09ff3f1928
2022-09-08 11:05:25 +03:00
Roman Khavronenko
f24572fa65 docs: add Monitoring at scale with Victoria Metrics (#3078)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-06 16:29:48 +02:00
Max Golionko
7da9443686 moved cluster dashboard to master (#3074)
dashboards: move cluster dashboard to master branch

This change should simplify dashboards management.
2022-09-06 16:19:43 +02:00
Aliaksandr Valialkin
ef7fdbb63c docs/Single-server-VictoriaMetrics.md: add a note that cardinality explorer at cluster version of VictoriaMetrics may return lower than expected number of unique label values
See the corresponding comment in the code:

5a6e617b5e/app/vmselect/netstorage/netstorage.go (L1039-L1045)
2022-09-06 14:46:35 +03:00
Aliaksandr Valialkin
09ff3f1928 docs/Articles.md: add https://tech.bedrockstreaming.com/2022/09/06/monitoring-at-scale-with-victoriametrics.html 2022-09-06 13:32:41 +03:00
Dmytro Kozlov
4415c71a2b vmselect/{promql, prometheus}: show flag names which user can update in error message (#3049)
* vmselect/{promql, prometheus}: show flag names which user can update in error message

* vmselect/{promql, prometheus}: fix typo
2022-09-06 13:25:59 +03:00
Aliaksandr Valialkin
651ace6ce4 docs/vmctl.md: make docs-sync after c5261d5f56
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2733
2022-09-06 13:19:48 +03:00
Aliaksandr Valialkin
5fa9525498 lib/storage: verify that timestamps in block are in the range specified by blockHeader.{Min,Max}Timestamp when upacking the block
This should reduce chances of unnoticed on-disk data corruption.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011

This change modifies the format for data exported via /api/v1/export/native -
now this data contains MaxTimestamp and PrecisionBits fields from blockHeader.

This is OK, since the native export format is undocumented.
2022-09-06 13:08:09 +03:00
Zakhar Bessarab
c5261d5f56 vmctl: implement support of chunking data for vm-native export (#3044)
vmctl: implement support of chunking data for vm-native export process

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2733
2022-09-06 09:09:34 +02:00
Craig Rodrigues
462fc7b394 docs: Clarify DataDog examples for VictoriaMetrics cluster (#3048)
Signed-off-by: Craig Rodrigues <rodrigc@crodrigues.org>
2022-09-05 16:25:05 +02:00
Yurii Kravets
7b04112352 docs: Update keyConcepts (#3040)
docs: update keyConcepts

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-09-05 13:59:28 +02:00
Aliaksandr Valialkin
ae31b2363f app/vmselect/prometheus: follow-up after 50e2524bc2
- Add getCommonParamsWithDefaultDuration function and use it at /api/v1/series, /api/v1/labels and /api/v1/label/.../values
- Document the default behaviour for setting 5 minutes time range if start arg isn't passed to /api/v1/series, /api/v1/labels and /api/v1/label/.../values
- Document the change at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3052
2022-09-05 11:55:48 +03:00
匠心零度
50e2524bc2 api prometheus/api/v1/label/../values time not specified, (#3052)
modify default start values
2022-09-05 11:52:19 +03:00
Aliaksandr Valialkin
7dc632719d app/vmselect/promql: consistently calculate rate_over_sum(m[d]) as sum_over_time(m[d])/d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3045
2022-09-02 23:18:04 +03:00
Aliaksandr Valialkin
2d4619c9a0 Makefile: properly push public tags 2022-09-02 22:42:13 +03:00
1802 changed files with 252729 additions and 132195 deletions

View File

@@ -17,7 +17,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@main
with:
go-version: 1.18
go-version: 1.19.3
id: go
- name: Code checkout
uses: actions/checkout@master

View File

@@ -40,6 +40,12 @@ jobs:
- name: Checkout repository
uses: actions/checkout@v3
- name: Set up Go
uses: actions/setup-go@v2
with:
go-version: 1.19
if: ${{ matrix.language == 'go' }}
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v2

View File

@@ -19,7 +19,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@main
with:
go-version: 1.18
go-version: 1.19.3
id: go
- name: Code checkout
uses: actions/checkout@master
@@ -29,8 +29,6 @@ jobs:
make install-errcheck
make install-golangci-lint
- name: Build
env:
GO111MODULE: on
run: |
export PATH=$PATH:$(go env GOPATH)/bin # temporary fix. See https://github.com/actions/setup-go/issues/14
make check-all

3
.gitignore vendored
View File

@@ -19,4 +19,5 @@
.DS_store
Gemfile.lock
/_site
_site
_site
*.tmp

View File

@@ -169,10 +169,7 @@ publish-release:
git checkout $(TAG) && $(MAKE) release publish && \
git checkout $(TAG)-cluster && $(MAKE) release publish && \
git checkout $(TAG)-enterprise && $(MAKE) release publish && \
git checkout $(TAG)-enterprise-cluster && $(MAKE) release publish && \
git checkout $(TAG) && git push $(TAG) && git push $(TAG)-cluster && \
$(MAKE) github-create-release && \
$(MAKE) github-upload-assets
git checkout $(TAG)-enterprise-cluster && $(MAKE) release publish
release: \
release-victoria-metrics \
@@ -322,24 +319,16 @@ lint: install-golint
golint app/...
install-golint:
which golint || GO111MODULE=off go get golang.org/x/lint/golint
which golint || go install golang.org/x/lint/golint@latest
errcheck: install-errcheck
errcheck -exclude=errcheck_excludes.txt ./lib/...
errcheck -exclude=errcheck_excludes.txt ./app/vminsert/...
errcheck -exclude=errcheck_excludes.txt ./app/vmselect/...
errcheck -exclude=errcheck_excludes.txt ./app/vmstorage/...
errcheck -exclude=errcheck_excludes.txt ./app/vmagent/...
errcheck -exclude=errcheck_excludes.txt ./app/vmalert/...
errcheck -exclude=errcheck_excludes.txt ./app/vmauth/...
errcheck -exclude=errcheck_excludes.txt ./app/vmbackup/...
errcheck -exclude=errcheck_excludes.txt ./app/vmrestore/...
errcheck -exclude=errcheck_excludes.txt ./app/vmctl/...
errcheck -exclude=errcheck_excludes.txt ./app/...
install-errcheck:
which errcheck || GO111MODULE=off go get github.com/kisielk/errcheck
which errcheck || go install github.com/kisielk/errcheck@latest
check-all: fmt vet lint errcheck golangci-lint
check-all: fmt vet lint errcheck golangci-lint govulncheck
test:
go test ./lib/... ./app/...
@@ -367,7 +356,7 @@ benchmark-pure:
vendor-update:
go get -u -d ./lib/...
go get -u -d ./app/...
go mod tidy -compat=1.18
go mod tidy -compat=1.19
go mod vendor
app-local:
@@ -386,7 +375,7 @@ quicktemplate-gen: install-qtc
qtc
install-qtc:
which qtc || GO111MODULE=off go get github.com/valyala/quicktemplate/qtc
which qtc || go install github.com/valyala/quicktemplate/qtc@latest
golangci-lint: install-golangci-lint
@@ -395,21 +384,34 @@ golangci-lint: install-golangci-lint
install-golangci-lint:
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.48.0
govulncheck: install-govulncheck
govulncheck ./...
install-govulncheck:
which govulncheck || go install golang.org/x/vuln/cmd/govulncheck@latest
install-wwhrd:
which wwhrd || GO111MODULE=off go get github.com/frapposelli/wwhrd
which wwhrd || go install github.com/frapposelli/wwhrd@latest
check-licenses: install-wwhrd
wwhrd check -f .wwhrd.yml
copy-docs:
echo "---\nsort: ${ORDER}\n---\n" > ${DST}
echo '' > ${DST}
@if [ ${ORDER} -ne 0 ]; then \
echo "---\nsort: ${ORDER}\n---\n" > ${DST}; \
fi
cat ${SRC} >> ${DST}
sed -i='.tmp' 's/<img src=\"docs\//<img src=\"/' ${DST}
rm -rf docs/*.tmp
# Copies docs for all components and adds the order tag.
# For ORDER=0 it adds no order tag.
# Images starting with <img src="docs/ are replaced with <img src="
# Cluster docs are supposed to be ordered as 9th.
# For The rest of docs is ordered manually.t
# The rest of docs is ordered manually.
docs-sync:
cp README.md docs/README.md
SRC=README.md DST=docs/README.md ORDER=0 $(MAKE) copy-docs
SRC=README.md DST=docs/Single-server-VictoriaMetrics.md ORDER=1 $(MAKE) copy-docs
SRC=app/vmagent/README.md DST=docs/vmagent.md ORDER=3 $(MAKE) copy-docs
SRC=app/vmalert/README.md DST=docs/vmalert.md ORDER=4 $(MAKE) copy-docs

285
README.md
View File

@@ -24,12 +24,14 @@ Learn more about [key concepts](https://docs.victoriametrics.com/keyConcepts.htm
[quick start guide](https://docs.victoriametrics.com/Quick-Start.html) for a better experience.
[Contact us](mailto:info@victoriametrics.com) if you need enterprise support for VictoriaMetrics.
See [features available in enterprise package](https://victoriametrics.com/products/enterprise/).
See [features available in enterprise package](https://docs.victoriametrics.com/enterprise.html).
Enterprise binaries can be downloaded and evaluated for free
from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases).
VictoriaMetrics is developed at a fast pace, so it is recommended periodically checking the [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) and performing [regular upgrades](#how-to-upgrade-victoriametrics).
VictoriaMetrics has achieved security certifications for Database Software Development and Software-Based Monitoring Services. We apply strict security measures in everything we do. See our [Security page](https://victoriametrics.com/security/) for more details.
## Prominent features
VictoriaMetrics has the following prominent features:
@@ -65,7 +67,7 @@ VictoriaMetrics has the following prominent features:
* [DataDog agent or DogStatsD](#how-to-send-data-from-datadog-agent).
* It supports metrics [relabeling](#relabeling).
* It can deal with [high cardinality issues](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality) and [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate) issues via [series limiter](#cardinality-limiter).
* It ideally works with big amounts of time series data from APM, Kubernetes, IoT sensors, connected cars, industrial telemetry, financial data and various [Enterprise workloads](https://victoriametrics.com/products/enterprise/).
* It ideally works with big amounts of time series data from APM, Kubernetes, IoT sensors, connected cars, industrial telemetry, financial data and various [Enterprise workloads](https://docs.victoriametrics.com/enterprise.html).
* It has open source [cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
* It can store data on [NFS-based storages](https://en.wikipedia.org/wiki/Network_File_System) such as [Amazon EFS](https://aws.amazon.com/efs/) and [Google Filestore](https://cloud.google.com/filestore).
@@ -132,7 +134,15 @@ VictoriaMetrics is developed at a fast pace, so it is recommended periodically c
### Environment variables
Each flag value can be set via environment variables according to these rules:
All the VictoriaMetrics components allow referring environment variables in command-line flags via `%{ENV_VAR}` syntax.
For example, `-metricsAuthKey=%{METRICS_AUTH_KEY}` is automatically expanded to `-metricsAuthKey=top-secret`
if `METRICS_AUTH_KEY=top-secret` environment variable exists at VictoriaMetrics startup.
This expansion is performed by VictoriaMetrics itself.
VictoriaMetrics recursively expands `%{ENV_VAR}` references in environment variables on startup.
For example, `FOO=%{BAR}` environment variable is expanded to `FOO=abc` if `BAR=a%{BAZ}` and `BAZ=bc`.
Additionally, all the VictoriaMetrics components allow setting flag values via environment variables according to these rules:
* The `-envflag.enable` flag must be set.
* Each `.` char in flag name must be substituted with `_` (for example `-insert.maxQueueDuration <duration>` will translate to `insert_maxQueueDuration=<duration>`).
@@ -260,7 +270,10 @@ Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](
VictoriaMetrics provides UI for query troubleshooting and exploration. The UI is available at `http://victoriametrics:8428/vmui`.
The UI allows exploring query results via graphs and tables.
It also provides the ability to [explore cardinality](#cardinality-explorer) and to [investigate query traces](#query-tracing).
It also provides the following features:
- [cardinality explorer](#cardinality-explorer)
- [query tracer](#query-tracing)
- [top queries explorer](#top-queries)
Graphs in vmui support scrolling and zooming:
@@ -274,12 +287,22 @@ Multi-line queries can be entered by pressing `Shift-Enter` in query input field
When querying the [backfilled data](https://docs.victoriametrics.com/#backfilling) or during [query troubleshooting](https://docs.victoriametrics.com/Troubleshooting.html#unexpected-query-results), it may be useful disabling response cache by clicking `Disable cache` checkbox.
VMUI automatically adjusts the interval between datapoints on the graph depending on the horizontal resolution and on the selected time range. The step value can be customized by clickhing `Override step value` checkbox.
VMUI automatically adjusts the interval between datapoints on the graph depending on the horizontal resolution and on the selected time range. The step value can be customized by changing `Step value` input.
VMUI allows investigating correlations between two queries on the same graph. Just click `+` button, enter the second query in the newly appeared input field and press `Ctrl+Enter`. Results for both queries should be displayed simultaneously on the same graph. Every query has its own vertical scale, which is displayed on the left and the right side of the graph. Lines for the second query are dashed.
VMUI allows investigating correlations between multiple queries on the same graph. Just click `Add Query` button,
enter an additional query in the newly appeared input field and press `Enter`.
Results for all the queries are displayed simultaneously on the same graph.
Graphs for a particular query can be temporarily hidden by clicking the `eye` icon on the right side of the input field.
See the [example VMUI at VictoriaMetrics playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/?g0.expr=100%20*%20sum(rate(process_cpu_seconds_total))%20by%20(job)&g0.range_input=1d).
## Top queries
[VMUI](#vmui) provides `top queries` tab, which can help determining the following query types:
* the most frequently executed queries;
* queries with the biggest average execution duration;
* queries that took the most summary time for execution.
## Cardinality explorer
@@ -290,6 +313,9 @@ VictoriaMetrics provides an ability to explore time series cardinality at `cardi
- To identify values with the highest number of series for the selected label (aka `focusLabel`).
- To identify label=name pairs with the highest number of series.
- To identify labels with the highest number of unique values.
Note that [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html)
may show lower than expected number of unique label values for labels with small number of unique values.
This is because of [implementation limits](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/5a6e617b5e41c9170e7c562aecd15ee0c901d489/app/vmselect/netstorage/netstorage.go#L1039-L1045).
By default cardinality explorer analyzes time series for the current date. It provides the ability to select different day at the top right corner.
By default all the time series for the selected date are analyzed. It is possible to narrow down the analysis to series
@@ -298,7 +324,7 @@ matching the specified [series selector](https://prometheus.io/docs/prometheus/l
Cardinality explorer is built on top of [/api/v1/status/tsdb](#tsdb-stats).
See [cardinality explorer playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/cardinality).
See the example of using the cardinality explorer [here](https://victoriametrics.com/blog/cardinality-explorer/).
## How to apply new config to VictoriaMetrics
@@ -308,7 +334,7 @@ VictoriaMetrics is configured via command-line flags, so it must be restarted wh
* Wait until the process stops. This can take a few seconds.
* Start VictoriaMetrics with the new command-line flags.
Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details. The same applies alos to [vmagent](https://docs.victoriametrics.com/vmagent.html).
Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details. The same applies also to [vmagent](https://docs.victoriametrics.com/vmagent.html).
## How to scrape Prometheus exporters such as [node-exporter](https://github.com/prometheus/node_exporter)
@@ -324,51 +350,90 @@ See also [vmagent](https://docs.victoriametrics.com/vmagent.html), which can be
## How to send data from DataDog agent
VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/) or [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/) via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics) at `/datadog/api/v1/series` path.
VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/)
or [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/)
via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics)
at `/datadog/api/v1/series` path.
Run DataDog agent with `DD_DD_URL=http://victoriametrics-host:8428/datadog` environment variable in order to write data to VictoriaMetrics at `victoriametrics-host` host. Another option is to set `dd_url` param at [DataDog agent configuration file](https://docs.datadoghq.com/agent/guide/agent-configuration-files/) to `http://victoriametrics-host:8428/datadog`.
### Sending metrics to VictoriaMetrics
VictoriaMetrics doesn't check `DD_API_KEY` param, so it can be set to arbitrary value.
DataDog agent allows configuring destinations for metrics sending via ENV variable `DD_DD_URL`
or via [configuration file](https://docs.datadoghq.com/agent/guide/agent-configuration-files/) in section `dd_url`.
Example on how to send data to VictoriaMetrics via DataDog "submit metrics" API from command line:
```console
echo '
{
"series": [
{
"host": "test.example.com",
"interval": 20,
"metric": "system.load.1",
"points": [[
0,
0.5
]],
"tags": [
"environment:test"
],
"type": "rate"
}
]
}
' | curl -X POST --data-binary @- http://localhost:8428/datadog/api/v1/series
```
The imported data can be read via [export API](https://docs.victoriametrics.com/#how-to-export-data-in-json-line-format):
<p align="center">
<img src="Single-server-VictoriaMetrics-sending_DD_metrics_to_VM.png" width="800">
</p>
To configure DataDog agent via ENV variable add the following prefix:
<div class="with-copy" markdown="1">
```console
curl http://localhost:8428/api/v1/export -d 'match[]=system.load.1'
```
DD_DD_URL=http://victoriametrics:8428/datadog
```
</div>
This command should return the following output if everything is OK:
_Choose correct URL for VictoriaMetrics [here](https://docs.victoriametrics.com/url-examples.html#datadog)._
To configure DataDog agent via [configuration file](https://docs.datadoghq.com/agent/guide/agent-configuration-files)
add the following line:
<div class="with-copy" markdown="1">
```json
{"metric":{"__name__":"system.load.1","environment":"test","host":"test.example.com"},"values":[0.5],"timestamps":[1632833641000]}
```
dd_url: http://victoriametrics:8428/datadog
```
</div>
vmagent also can accept Datadog metrics format. Depending on where vmagent will forward data,
pick [single-node or cluster URL]((https://docs.victoriametrics.com/url-examples.html#datadog)) formats.
### Sending metrics to Datadog and VictoriaMetrics
DataDog allows configuring [Dual Shipping](https://docs.datadoghq.com/agent/guide/dual-shipping/) for metrics
sending via ENV variable `DD_ADDITIONAL_ENDPOINTS` or via configuration file `additional_endpoints`.
<p align="center">
<img src="Single-server-VictoriaMetrics-sending_DD_metrics_to_VM_and_DD.png" width="800">
</p>
Run DataDog using the following ENV variable with VictoriaMetrics as additional metrics receiver:
<div class="with-copy" markdown="1">
```
DD_ADDITIONAL_ENDPOINTS='{\"http://victoriametrics:8428/datadog\"}'
```
</div>
_Choose correct URL for VictoriaMetrics [here](https://docs.victoriametrics.com/url-examples.html#datadog)._
To configure DataDog Dual Shipping via [configuration file](https://docs.datadoghq.com/agent/guide/agent-configuration-files)
add the following line:
<div class="with-copy" markdown="1">
```
additional_endpoints: http://victoriametrics:8428/datadog
```
</div>
### Send via cURL
See how to send data to VictoriaMetrics via
[DataDog "submit metrics"](https://docs.victoriametrics.com/url-examples.html#datadogapiv1series) from command line.
The imported data can be read via [export API](https://docs.victoriametrics.com/url-examples.html#apiv1export).
### Additional details
VictoriaMetrics automatically sanitizes metric names for the data ingested via DataDog protocol
according to [DataDog metric naming recommendations](https://docs.datadoghq.com/metrics/custom_metrics/#naming-custom-metrics).
If you need accepting metric names as is without sanitizing, then pass `-datadog.sanitizeMetricName=false` command-line flag to VictoriaMetrics.
Extra labels may be added to all the written time series by passing `extra_label=name=value` query args.
For example, `/datadog/api/v1/series?extra_label=foo=bar` would add `{foo="bar"}` label to all the ingested metrics.
@@ -385,7 +450,7 @@ See [these docs](https://docs.victoriametrics.com/vmagent.html#adding-labels-to-
## How to send data from InfluxDB-compatible agents such as [Telegraf](https://www.influxdata.com/time-series-platform/telegraf/)
Use `http://<victoriametric-addr>:8428` url instead of InfluxDB url in agents' configs.
Use `http://<victoriametrics-addr>:8428` url instead of InfluxDB url in agents' configs.
For instance, put the following lines into `Telegraf` config, so it sends data to VictoriaMetrics instead of InfluxDB:
```toml
@@ -403,6 +468,9 @@ VictoriaMetrics performs the following transformations to the ingested InfluxDB
* Field names are mapped to time series names prefixed with `{measurement}{separator}` value, where `{separator}` equals to `_` by default. It can be changed with `-influxMeasurementFieldSeparator` command-line flag. See also `-influxSkipSingleField` command-line flag. If `{measurement}` is empty or if `-influxSkipMeasurement` command-line flag is set, then time series names correspond to field names.
* Field values are mapped to time series values.
* Tags are mapped to Prometheus labels as-is.
* If `-usePromCompatibleNaming` command-line flag is set, then all the metric names and label names
are normalized to [Prometheus-compatible naming](https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels) by replacing unsupported chars with `_`.
For example, `foo.bar-baz/1` metric name or label name is substituted with `foo_bar_baz_1`.
For example, the following InfluxDB line:
@@ -638,7 +706,7 @@ VictoriaMetrics accepts `round_digits` query arg for `/api/v1/query` and `/api/v
VictoriaMetrics accepts `limit` query arg for `/api/v1/labels` and `/api/v1/label/<labelName>/values` handlers for limiting the number of returned entries. For example, the query to `/api/v1/labels?limit=5` returns a sample of up to 5 unique labels, while ignoring the rest of labels. If the provided `limit` value exceeds the corresponding `-search.maxTagKeys` / `-search.maxTagValues` command-line flag values, then limits specified in the command-line flags are used.
By default, VictoriaMetrics returns time series for the last 5 minutes from `/api/v1/series`, while the Prometheus API defaults to all time. Use `start` and `end` to select a different time range.
By default, VictoriaMetrics returns time series for the last 5 minutes from `/api/v1/series`, `/api/v1/labels` and `/api/v1/label/<labelName>/values` while the Prometheus API defaults to all time. Explicitly set `start` and `end` to select the desired time range.
VictoriaMetrics accepts `limit` query arg for `/api/v1/series` handlers for limiting the number of returned entries. For example, the query to `/api/v1/series?limit=5` returns a sample of up to 5 series, while ignoring the rest. If the provided `limit` value exceeds the corresponding `-search.maxSeries` command-line flag values, then limits specified in the command-line flags are used.
Additionally, VictoriaMetrics provides the following handlers:
@@ -676,7 +744,7 @@ VictoriaMetrics supports `__graphite__` pseudo-label for filtering time series w
### Graphite Render API usage
[VictoriaMetrics Enterprise](https://victoriametrics.com/products/enterprise/) supports [Graphite Render API](https://graphite.readthedocs.io/en/stable/render_api.html) subset
[VictoriaMetrics Enterprise](https://docs.victoriametrics.com/enterprise.html) supports [Graphite Render API](https://graphite.readthedocs.io/en/stable/render_api.html) subset
at `/render` endpoint, which is used by [Graphite datasource in Grafana](https://grafana.com/docs/grafana/latest/datasources/graphite/).
When configuring Graphite datasource in Grafana, the `Storage-Step` http request header must be set to a step between Graphite data points stored in VictoriaMetrics. For example, `Storage-Step: 10s` would mean 10 seconds distance between Graphite datapoints stored in VictoriaMetrics.
Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases).
@@ -717,7 +785,7 @@ to your needs or when testing bugfixes.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make victoria-metrics` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics` binary and puts it into the `bin` folder.
@@ -733,7 +801,7 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make victoria-metrics-linux-arm` or `make victoria-metrics-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics-linux-arm` or `victoria-metrics-linux-arm64` binary respectively and puts it into the `bin` folder.
@@ -747,7 +815,7 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
`Pure Go` mode builds only Go code without [cgo](https://golang.org/cmd/cgo/) dependencies.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make victoria-metrics-pure` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `victoria-metrics-pure` binary and puts it into the `bin` folder.
@@ -809,8 +877,9 @@ Steps for restoring from a snapshot:
Send a request to `http://<victoriametrics-addr>:8428/api/v1/admin/tsdb/delete_series?match[]=<timeseries_selector_for_delete>`,
where `<timeseries_selector_for_delete>` may contain any [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors)
for metrics to delete. After that all the time series matching the given selector are deleted. Storage space for
the deleted time series isn't freed instantly - it is freed during subsequent [background merges of data files](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
for metrics to delete. Delete API doesn't support the deletion of specific time ranges, the series can only be deleted completely.
Storage space for the deleted time series isn't freed instantly - it is freed during subsequent
[background merges of data files](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
Note that background merges may never occur for data from previous months, so storage space won't be freed for historical data.
In this case [forced merge](#forced-merge) may help freeing up storage space.
@@ -979,7 +1048,8 @@ Time series data can be imported into VictoriaMetrics via any supported data ing
* `/api/v1/import/native` for importing data obtained from [/api/v1/export/native](#how-to-export-data-in-native-format).
See [these docs](#how-to-import-data-in-native-format) for details.
* `/api/v1/import/csv` for importing arbitrary CSV data. See [these docs](#how-to-import-csv-data) for details.
* `/api/v1/import/prometheus` for importing data in Prometheus exposition format. See [these docs](#how-to-import-data-in-prometheus-exposition-format) for details.
* `/api/v1/import/prometheus` for importing data in Prometheus exposition format and in [Pushgateway format](https://github.com/prometheus/pushgateway#url).
See [these docs](#how-to-import-data-in-prometheus-exposition-format) for details.
### How to import data in JSON line format
@@ -1084,9 +1154,11 @@ Note that it could be required to flush response cache after importing historica
### How to import data in Prometheus exposition format
VictoriaMetrics accepts data in [Prometheus exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format)
and in [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md)
via `/api/v1/import/prometheus` path. For example, the following line imports a single line in Prometheus exposition format into VictoriaMetrics:
VictoriaMetrics accepts data in [Prometheus exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format),
in [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md)
and in [Pushgateway format](https://github.com/prometheus/pushgateway#url) via `/api/v1/import/prometheus` path.
For example, the following command imports a single line in Prometheus exposition format into VictoriaMetrics:
<div class="with-copy" markdown="1">
@@ -1112,6 +1184,16 @@ It should return something like the following:
{"metric":{"__name__":"foo","bar":"baz"},"values":[123],"timestamps":[1594370496905]}
```
The following command imports a single metric via [Pushgateway format](https://github.com/prometheus/pushgateway#url) with `{job="my_app",instance="host123"}` labels:
<div class="with-copy" markdown="1">
```console
curl -d 'metric{label="abc"} 123' -X POST 'http://localhost:8428/api/v1/import/prometheus/metrics/job/my_app/instance/host123'
```
</div>
Pass `Content-Encoding: gzip` HTTP request header to `/api/v1/import/prometheus` for importing gzipped data:
<div class="with-copy" markdown="1">
@@ -1123,8 +1205,8 @@ curl -X POST -H 'Content-Encoding: gzip' http://destination-victoriametrics:8428
</div>
Extra labels may be added to all the imported metrics by passing `extra_label=name=value` query args.
For example, `/api/v1/import/prometheus?extra_label=foo=bar` would add `{foo="bar"}` label to all the imported metrics.
Extra labels may be added to all the imported metrics either via [Pushgateway format](https://github.com/prometheus/pushgateway#url)
or by passing `extra_label=name=value` query args. For example, `/api/v1/import/prometheus?extra_label=foo=bar` would add `{foo="bar"}` label to all the imported metrics.
If timestamp is missing in `<metric> <value> <timestamp>` Prometheus exposition format line, then the current timestamp is used during data ingestion.
It can be overridden by passing unix timestamp in *milliseconds* via `timestamp` query arg. For example, `/api/v1/import/prometheus?timestamp=1594370496905`.
@@ -1208,10 +1290,11 @@ See also [resource usage limits docs](#resource-usage-limits).
By default VictoriaMetrics is tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
- `-memory.allowedPercent` and `-search.allowedBytes` limit the amounts of memory, which may be used for various internal caches at VictoriaMetrics. Note that VictoriaMetrics may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
- `-memory.allowedPercent` and `-memory.allowedBytes` limit the amounts of memory, which may be used for various internal caches at VictoriaMetrics. Note that VictoriaMetrics may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
- `-search.maxMemoryPerQuery` limits the amounts of memory, which can be used for processing a single query. Queries, which need more memory, are rejected. Heavy queries, which select big number of time series, may exceed the per-query memory limit by a small percent. The total memory limit for concurrently executed queries can be estimated as `-search.maxMemoryPerQuery` multiplied by `-search.maxConcurrentRequests`.
- `-search.maxUniqueTimeseries` limits the number of unique time series a single query can find and process. VictoriaMetrics keeps in memory some metainformation about the time series located by each query and spends some CPU time for processing the found time series. This means that the maximum memory usage and CPU usage a single query can use is proportional to `-search.maxUniqueTimeseries`.
- `-search.maxQueryDuration` limits the duration of a single query. If the query takes longer than the given duration, then it is canceled. This allows saving CPU and RAM when executing unexpected heavy queries.
- `-search.maxConcurrentRequests` limits the number of concurrent requests VictoriaMetrics can process. Bigger number of concurrent requests usually means bigger memory usage. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need `100 * 100 MiB = 10 GiB` of additional memory. So it is better to limit the number of concurrent queries, while suspending additional incoming queries if the concurrency limit is reached. VictoriaMetrics provides `-search.maxQueueDuration` command-line flag for limiting the max wait time for suspended queries.
- `-search.maxConcurrentRequests` limits the number of concurrent requests VictoriaMetrics can process. Bigger number of concurrent requests usually means bigger memory usage. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need `100 * 100 MiB = 10 GiB` of additional memory. So it is better to limit the number of concurrent queries, while suspending additional incoming queries if the concurrency limit is reached. VictoriaMetrics provides `-search.maxQueueDuration` command-line flag for limiting the max wait time for suspended queries. See also `-search.maxMemoryPerQuery` command-line flag.
- `-search.maxSamplesPerSeries` limits the number of raw samples the query can process per each time series. VictoriaMetrics sequentially processes raw samples per each found time series during the query. It unpacks raw samples on the selected time range per each time series into memory and then applies the given [rollup function](https://docs.victoriametrics.com/MetricsQL.html#rollup-functions). The `-search.maxSamplesPerSeries` command-line flag allows limiting memory usage in the case when the query is executed on a time range, which contains hundreds of millions of raw samples per each located time series.
- `-search.maxSamplesPerQuery` limits the number of raw samples a single query can process. This allows limiting CPU usage for heavy queries.
- `-search.maxPointsPerTimeseries` limits the number of calculated points, which can be returned per each matching time series from [range query](https://docs.victoriametrics.com/keyConcepts.html#range-query).
@@ -1345,7 +1428,11 @@ VictoriaMetrics does not support indefinite retention, but you can specify an ar
## Multiple retentions
A single instance of VictoriaMetrics supports only a single retention, which can be configured via `-retentionPeriod` command-line flag. If you need multiple retentions, then you may start multiple VictoriaMetrics instances with distinct values for the following flags:
Distinct retentions for distinct time series can be configured via [retention filters](#retention-filters)
in [VictoriaMetrics enterprise](https://docs.victoriametrics.com/enterprise.html).
Community version of VictoriaMetrics supports only a single retention, which can be configured via [-retentionPeriod](#retention) command-line flag.
If you need multiple retentions in community version of VictoriaMetrics, then you may start multiple VictoriaMetrics instances with distinct values for the following flags:
* `-retentionPeriod`
* `-storageDataPath`, so the data for each retention period is saved in a separate directory
@@ -1353,12 +1440,44 @@ A single instance of VictoriaMetrics supports only a single retention, which can
Then set up [vmauth](https://docs.victoriametrics.com/vmauth.html) in front of VictoriaMetrics instances,
so it could route requests from particular user to VictoriaMetrics with the desired retention.
The same scheme could be implemented for multiple tenants in [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html).
Similar scheme can be applied for multiple tenants in [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html).
See [these docs](https://docs.victoriametrics.com/guides/guide-vmcluster-multiple-retention-setup.html) for multi-retention setup details.
## Retention filters
[Enterprise version of VictoriaMetrics](https://docs.victoriametrics.com/enterprise.html) supports e.g. `retention filters`,
which allow configuring multiple retentions for distinct sets of time series matching the configured [series filters](https://docs.victoriametrics.com/keyConcepts.html#filtering)
via `-retentionFilter` command-line flag. This flag accepts `filter:duration` options, where `filter` must be
a valid [series filter](https://docs.victoriametrics.com/keyConcepts.html#filtering), while the `duration`
must contain valid [retention](#retention) for time series matching the given `filter`. If series doesn't match
any configured `-retentionFilter`, then the retention configured via [-retentionPeriod](#retention) command-line flag is applied to it.
If series matches multiple configured retention filters, then the smallest retention is applied.
For example, the following config sets 3 days retention for time series with `team="juniors"` label,
30 days retention for time series with `env="dev"` or `env="staging"` label and 1 year retention for the remaining time series:
```
-retentionFilter='{team="juniors"}:3d' -retentionFilter='{env=~"dev|staging"}:30d' -retentionPeriod=1y
```
Important notes:
- The data outside of the configured retention isn't deleted instantly - it is deleted eventually during [background merges](https://docs.victoriametrics.com/#storage).
- The `-retentionFilter` doesn't remove old data from `indexdb` (aka inverted index) until the configured [-retentionPeriod](#retention).
So the `indexdb` size can grow big under [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)
even for small retentions configured via `-retentionFilter`.
It is safe updating `-retentionFilter` during VictoriaMetrics restarts - the updated retention filters are applied eventually
to historical data.
See [how to configure multiple retentions in VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#retention-filters).
Retention filters can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases).
## Downsampling
[VictoriaMetrics Enterprise](https://victoriametrics.com/products/enterprise/) supports multi-level downsampling with `-downsampling.period` command-line flag. For example:
[VictoriaMetrics Enterprise](https://docs.victoriametrics.com/enterprise.html) supports multi-level downsampling with `-downsampling.period` command-line flag. For example:
* `-downsampling.period=30d:5m` instructs VictoriaMetrics to [deduplicate](#deduplication) samples older than 30 days with 5 minutes interval.
@@ -1417,6 +1536,8 @@ VictoriaMetrics provides the following security-related command-line flags:
Explicitly set internal network interface for TCP and UDP ports for data ingestion with Graphite and OpenTSDB formats.
For example, substitute `-graphiteListenAddr=:2003` with `-graphiteListenAddr=<internal_iface_ip>:2003`. This protects from unexpected requests from untrusted network interfaces.
VictoriaMetrics has achieved security certifications for Database Software Development and Software-Based Monitoring Services. We apply strict security measures in everything we do. See our [Security page](https://victoriametrics.com/security/) for more details.
## Tuning
* No need in tuning for VictoriaMetrics - it uses reasonable defaults for command-line flags,
@@ -1458,7 +1579,8 @@ VictoriaMetrics exposes currently running queries and their execution times at `
VictoriaMetrics exposes queries, which take the most time to execute, at `/api/v1/status/top_queries` page.
See also [troubleshooting docs](https://docs.victoriametrics.com/Troubleshooting.html).
See also [VictoriaMetrics Monitoring](https://victoriametrics.com/blog/victoriametrics-monitoring/)
and [troubleshooting docs](https://docs.victoriametrics.com/Troubleshooting.html).
## TSDB stats
@@ -1535,7 +1657,9 @@ All the durations and timestamps in traces are in milliseconds.
Query tracing is allowed by default. It can be denied by passing `-denyQueryTracing` command-line flag to VictoriaMetrics.
[VMUI](#vmui) provides an UI for query tracing - just click `Trace query` checkbox and re-run the query in order to investigate its' trace.
[VMUI](#vmui) provides an UI:
- for query tracing - just click `Trace query` checkbox and re-run the query in order to investigate its' trace.
- for exploring custom trace - go to the tab `Trace analyzer` and upload or paste JSON with trace information.
## Cardinality limiter
@@ -1573,7 +1697,8 @@ The exceeded limits can be [monitored](#monitoring) with the following metrics:
These limits are approximate, so VictoriaMetrics can underflow/overflow the limit by a small percentage (usually less than 1%).
See also more advanced [cardinality limiter in vmagent](https://docs.victoriametrics.com/vmagent.html#cardinality-limiter).
See also more advanced [cardinality limiter in vmagent](https://docs.victoriametrics.com/vmagent.html#cardinality-limiter)
and [cardinality explorer docs](#cardinality-explorer).
## Troubleshooting
@@ -1923,6 +2048,8 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
```
-bigMergeConcurrency int
The maximum number of CPU cores to use for big merges. Default value is used if set to 0
-cacheExpireDuration duration
Items are removed from in-memory caches after they aren't accessed for this duration. Lower values may reduce memory usage at the cost of higher CPU usage. See also -prevCacheRemovalPercent (default 30m0s)
-configAuthKey string
Authorization key for accessing /config page. It must be passed via authKey query arg
-csvTrimTimestamp duration
@@ -1930,6 +2057,8 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-datadog.maxInsertRequestSize size
The maximum size in bytes of a single DataDog POST request to /api/v1/series
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864)
-datadog.sanitizeMetricName
Sanitize metric names for the ingested DataDog data to comply with DataDog behaviour described at https://docs.datadoghq.com/metrics/custom_metrics/#naming-custom-metrics (default true)
-dedup.minScrapeInterval duration
Leave only the last sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling
-deleteAuthKey string
@@ -1939,7 +2068,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-denyQueryTracing
Whether to disable the ability to trace queries. See https://docs.victoriametrics.com/#query-tracing
-downsampling.period array
Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details
Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-dryRun
Whether to check only -promscrape.config and then exit. Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag
@@ -1950,7 +2079,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-finalMergeDelay duration
The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge
-flagsAuthKey string
@@ -2053,6 +2182,8 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
-precisionBits int
The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64)
-prevCacheRemovalPercent float
Items in the previous caches are removed when the percent of requests it serves becomes lower than this value. Higher values reduce memory usage at the cost of higher CPU usage. See also -cacheExpireDuration (default 0.1)
-promscrape.azureSDCheckInterval duration
Interval for checking for changes in Azure. This works only if azure_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#azure_sd_configs for details (default 1m0s)
-promscrape.cluster.memberNum string
@@ -2133,7 +2264,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-promscrape.suppressScrapeErrorsDelay duration
The delay for suppressing repeated scrape errors logging per each scrape targets. This may be used for reducing the number of log lines related to scrape errors. See also -promscrape.suppressScrapeErrors
-promscrape.yandexcloudSDCheckInterval duration
Interval for checking for changes in Yandex Cloud API. This works only if yandexcloud_sd_configs is configured in '-promscrape.config' file. (default 30s)
Interval for checking for changes in Yandex Cloud API. This works only if yandexcloud_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#yandexcloud_sd_configs for details (default 30s)
-pushmetrics.extraLabel array
Optional labels to add to metrics pushed to -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
@@ -2146,8 +2277,11 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Optional path to a file with relabeling rules, which are applied to all the ingested metrics. The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal
-relabelDebug
Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs
-retentionFilter array
Retention filter in the format 'filter:retention'. For example, '{env="dev"}:3d' configures the retention for time series with env="dev" label to 3 days. See https://docs.victoriametrics.com/#retention-filters for details. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-retentionPeriod value
Data with timestamps outside the retentionPeriod is automatically deleted
Data with timestamps outside the retentionPeriod is automatically deleted. See also -retentionFilter
The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1)
-retentionTimezoneOffset duration
The offset for performing indexdb rotation. If set to 0, then the indexdb rotation is performed at 4am UTC time per each -retentionPeriod. If set to 2h, then the indexdb rotation is performed at 4am EET time (the timezone with +2h offset)
@@ -2158,15 +2292,15 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-search.disableCache
Whether to disable response caching. This may be useful during data backfilling
-search.graphiteMaxPointsPerSeries int
The maximum number of points per series Graphite render API can return (default 1000000)
The maximum number of points per series Graphite render API can return. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html (default 1000000)
-search.graphiteStorageStep duration
The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overridden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API (default 10s)
The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overridden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html (default 10s)
-search.latencyOffset duration
The time when data points become visible in query results after the collection. Too small value can result in incomplete last points for query results (default 30s)
-search.logSlowQueryDuration duration
Log queries with execution time exceeding this value. Zero disables slow query logging (default 5s)
-search.maxConcurrentRequests int
The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration (default 8)
The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores, while many concurrently executed requests may require high amounts of memory. See also -search.maxQueueDuration and -search.maxMemoryPerQuery (default 8)
-search.maxExportDuration duration
The maximum duration for /api/v1/export call (default 720h0m0s)
-search.maxExportSeries int
@@ -2174,9 +2308,12 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-search.maxFederateSeries int
The maximum number of time series, which can be returned from /federate. This option allows limiting memory usage (default 1000000)
-search.maxGraphiteSeries int
The maximum number of time series, which can be scanned during queries to Graphite Render API. See https://docs.victoriametrics.com/#graphite-render-api-usage (default 300000)
The maximum number of time series, which can be scanned during queries to Graphite Render API. See https://docs.victoriametrics.com/#graphite-render-api-usage . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html (default 300000)
-search.maxLookback duration
Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaining due to historical reasons
-search.maxMemoryPerQuery size
The maximum amounts of memory a single query may consume. Queries requiring more memory are rejected. The total memory limit for concurrently executed queries can be estimated as -search.maxMemoryPerQuery multiplied by -search.maxConcurrentRequests
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-search.maxPointsPerTimeseries int
The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database. The main purpose of this option is to limit the number of per-series points returned to graphing UI such as VMUI or Grafana. There is no sense in setting this limit to values bigger than the horizontal resolution of the graph (default 30000)
-search.maxPointsSubqueryPerTimeseries int
@@ -2246,7 +2383,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Overrides max size for indexdb/indexBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-storage.cacheSizeIndexDBTagFilters size
Overrides max size for indexdb/tagFilters cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning
Overrides max size for indexdb/tagFiltersToMetricIDs cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-storage.cacheSizeStorageTSID size
Overrides max size for storage/tsid cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning
@@ -2269,6 +2406,10 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-usePromCompatibleNaming
Whether to replace characters unsupported by Prometheus with underscores in the ingested metric names and label names. For example, foo.bar{a.b='c'} is transformed into foo_bar{a_b='c'} during data ingestion if this flag is set. See https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
-version
Show VictoriaMetrics version
-vmalert.proxyURL string

14
SECURITY.md Normal file
View File

@@ -0,0 +1,14 @@
# Security Policy
## Supported Versions
| Version | Supported |
|---------|--------------------|
| 1.81.x | :white_check_mark: |
| 1.80.x | :x: |
| 1.79.x | :white_check_mark: |
| < 1.78 | :x: |
## Reporting a Vulnerability
Please report any security issues to security@victoriametrics.com

View File

@@ -138,7 +138,7 @@ func setUp() {
if err != nil {
return false
}
resp.Body.Close()
_ = resp.Body.Close()
return resp.StatusCode == 200
}
if err := waitFor(testStorageInitTimeout, readyStorageCheckFunc); err != nil {
@@ -337,7 +337,9 @@ func tcpWrite(t *testing.T, address string, data string) {
s := newSuite(t)
conn, err := net.Dial("tcp", address)
s.noError(err)
defer conn.Close()
defer func() {
_ = conn.Close()
}()
n, err := conn.Write([]byte(data))
s.noError(err)
s.equalInt(n, len(data))
@@ -348,7 +350,9 @@ func httpReadMetrics(t *testing.T, address, query string) []Metric {
s := newSuite(t)
resp, err := http.Get(address + query)
s.noError(err)
defer resp.Body.Close()
defer func() {
_ = resp.Body.Close()
}()
s.equalInt(resp.StatusCode, 200)
var rows []Metric
for dec := json.NewDecoder(resp.Body); dec.More(); {
@@ -363,7 +367,9 @@ func httpReadStruct(t *testing.T, address, query string, dst interface{}) {
s := newSuite(t)
resp, err := http.Get(address + query)
s.noError(err)
defer resp.Body.Close()
defer func() {
_ = resp.Body.Close()
}()
s.equalInt(resp.StatusCode, 200)
s.noError(json.NewDecoder(resp.Body).Decode(dst))
}

View File

@@ -85,7 +85,9 @@ func selfScraper(scrapeInterval time.Duration) {
mr.Timestamp = currentTimestamp
mr.Value = r.Value
}
vmstorage.AddRows(mrs)
if err := vmstorage.AddRows(mrs); err != nil {
logger.Errorf("cannot store self-scraped metrics: %s", err)
}
}
}

View File

@@ -1,8 +1,11 @@
# vmagent
`vmagent` is a tiny but mighty agent which helps you collect metrics from various sources
`vmagent` is a tiny agent which helps you collect metrics from various sources,
[relabel and filter the collected metrics](#relabeling)
and store them in [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics)
or any other Prometheus-compatible storage systems with Prometheus `remote_write` protocol support.
or any other storage systems via Prometheus `remote_write` protocol.
See [Quick Start](#quick-start) for details.
<img alt="vmagent" src="vmagent.png">
@@ -16,27 +19,40 @@ additionally to [discovering Prometheus-compatible targets and scraping metrics
## Features
* Can be used as a drop-in replacement for Prometheus for scraping targets such as [node_exporter](https://github.com/prometheus/node_exporter). See [Quick Start](#quick-start) for details.
* Can read data from Kafka. See [these docs](#reading-metrics-from-kafka).
* Can write data to Kafka. See [these docs](#writing-metrics-to-kafka).
* Can be used as a drop-in replacement for Prometheus for discovering and scraping targets such as [node_exporter](https://github.com/prometheus/node_exporter).
Note that single-node VictoriaMetrics can also discover and scrape Prometheus-compatible targets in the same way as `vmagent` does -
see [these docs](https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter).
* Can add, remove and modify labels (aka tags) via Prometheus relabeling. Can filter data before sending it to remote storage. See [these docs](#relabeling) for details.
* Accepts data via all the ingestion protocols supported by VictoriaMetrics - see [these docs](#how-to-push-data-to-vmagent).
* Can replicate collected metrics simultaneously to multiple remote storage systems.
* Can accept data via all the ingestion protocols supported by VictoriaMetrics - see [these docs](#how-to-push-data-to-vmagent).
* Can replicate collected metrics simultaneously to multiple remote storage systems -
see [these docs](#replication-and-high-availability).
* Works smoothly in environments with unstable connections to remote storage. If the remote storage is unavailable, the collected metrics
are buffered at `-remoteWrite.tmpDataPath`. The buffered metrics are sent to remote storage as soon as the connection
to the remote storage is repaired. The maximum disk usage for the buffer can be limited with `-remoteWrite.maxDiskUsagePerURL`.
* Uses lower amounts of RAM, CPU, disk IO and network bandwidth compared with Prometheus.
* Uses lower amounts of RAM, CPU, disk IO and network bandwidth than Prometheus.
* Scrape targets can be spread among multiple `vmagent` instances when big number of targets must be scraped. See [these docs](#scraping-big-number-of-targets).
* Can efficiently scrape targets that expose millions of time series such as [/federate endpoint in Prometheus](https://prometheus.io/docs/prometheus/latest/federation/). See [these docs](#stream-parsing-mode).
* Can deal with [high cardinality](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality) and [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate) issues by limiting the number of unique time series at scrape time and before sending them to remote storage systems. See [these docs](#cardinality-limiter).
* Can load scrape configs from multiple files. See [these docs](#loading-scrape-configs-from-multiple-files).
* Can efficiently scrape targets that expose millions of time series such as [/federate endpoint in Prometheus](https://prometheus.io/docs/prometheus/latest/federation/).
See [these docs](#stream-parsing-mode).
* Can deal with [high cardinality](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality)
and [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate) issues by limiting the number of unique time series at scrape time
and before sending them to remote storage systems. See [these docs](#cardinality-limiter).
* Can write collected metrics to multiple tenants. See [these docs](#multitenancy).
* Can read data from Kafka. See [these docs](#reading-metrics-from-kafka).
* Can write data to Kafka. See [these docs](#writing-metrics-to-kafka).
## Quick Start
Please download `vmutils-*` archive from [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) (`vmagent` is also available in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags)), unpack it and pass the following flags to the `vmagent` binary in order to start scraping Prometheus-compatible targets:
Please download `vmutils-*` archive from [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) (
`vmagent` is also available in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags)),
unpack it and pass the following flags to the `vmagent` binary in order to start scraping Prometheus-compatible targets:
* `-promscrape.config` with the path to Prometheus config file (usually located at `/etc/prometheus/prometheus.yml`). The path can point either to local file or to http url. `vmagent` doesn't support some sections of Prometheus config file, so you may need either to delete these sections or to run `vmagent` with `-promscrape.config.strictParse=false` command-line flag, so `vmagent` ignores unsupported sections. See [the list of unsupported sections](#unsupported-prometheus-config-sections).
* `-remoteWrite.url` with the remote storage endpoint such as VictoriaMetrics, the `-remoteWrite.url` argument can be specified multiple times to replicate data concurrently to an arbitrary number of remote storage systems.
* `-promscrape.config` with the path to Prometheus config file (usually located at `/etc/prometheus/prometheus.yml`).
The path can point either to local file or to http url. `vmagent` doesn't support some sections of Prometheus config file,
so you may need either to delete these sections or to run `vmagent` with `-promscrape.config.strictParse=false` command-line flag.
In this case `vmagent` ignores unsupported sections. See [the list of unsupported sections](#unsupported-prometheus-config-sections).
* `-remoteWrite.url` with the remote storage endpoint such as VictoriaMetrics, the `-remoteWrite.url` argument can be specified
multiple times to replicate data concurrently to an arbitrary number of remote storage systems. See [various use cases](#use-cases).
Example command line:
@@ -46,7 +62,12 @@ Example command line:
See [how to scrape Prometheus-compatible targets](#how-to-collect-metrics-in-prometheus-format) for more details.
If you don't need to scrape Prometheus-compatible targets, then the `-promscrape.config` option isn't needed. For example, the following command is sufficient for accepting data via [supported "push"-based protocols](#how-to-push-data-to-vmagent) and sending it to the provided `-remoteWrite.url`:
If you use single-node VictoriaMetrics, then you can discover and scrape Prometheus-compatible targets directly from VictoriaMetrics
without the need to use `vmagent` - see [these docs](https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter).
If you don't need to scrape Prometheus-compatible targets, then the `-promscrape.config` option isn't needed.
For example, the following command is sufficient for accepting data via [supported push-based protocols](#how-to-push-data-to-vmagent)
and sending it to the provided `-remoteWrite.url`:
```console
/path/to/vmagent -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
@@ -58,7 +79,8 @@ Pass `-help` to `vmagent` in order to see [the full list of supported command-li
## How to push data to vmagent
`vmagent` supports [the same set of push-based data ingestion protocols as VictoriaMetrics does](https://docs.victoriametrics.com/#how-to-import-time-series-data) additionally to pull-based Prometheus-compatible targets' scraping:
`vmagent` supports [the same set of push-based data ingestion protocols as VictoriaMetrics does](https://docs.victoriametrics.com/#how-to-import-time-series-data)
additionally to pull-based Prometheus-compatible targets' scraping:
* DataDog "submit metrics" API. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-datadog-agent).
* InfluxDB line protocol via `http://<vmagent>:8429/write`. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf).
@@ -73,10 +95,10 @@ Pass `-help` to `vmagent` in order to see [the full list of supported command-li
## Configuration update
`vmagent` should be restarted in order to update config options set via command-line args.
`vmagent` supports multiple approaches for reloading configs from updated config files such as
`-promscrape.config`, `-remoteWrite.relabelConfig` and `-remoteWrite.urlRelabelConfig`:
`vmagent` supports multiple approaches for reloading configs from updated config files such as `-promscrape.config`, `-remoteWrite.relabelConfig` and `-remoteWrite.urlRelabelConfig`:
* Sending `SUGHUP` signal to `vmagent` process:
* Sending `SIGHUP` signal to `vmagent` process:
```console
kill -SIGHUP `pidof vmagent`
@@ -106,13 +128,16 @@ See [these docs](#how-to-collect-metrics-in-prometheus-format) for details.
### Flexible metrics relay
`vmagent` can accept metrics in [various popular data ingestion protocols](#how-to-push-data-to-vmagent), apply [relabeling](#relabeling) to the accepted metrics (for example, change metric names/labels or drop unneeded metrics) and then forward the relabeled metrics to other remote storage systems, which support Prometheus `remote_write` protocol (including other `vmagent` instances).
`vmagent` can accept metrics in [various popular data ingestion protocols](#how-to-push-data-to-vmagent), apply [relabeling](#relabeling)
to the accepted metrics (for example, change metric names/labels or drop unneeded metrics) and then forward the relabeled metrics
to other remote storage systems, which support Prometheus `remote_write` protocol (including other `vmagent` instances).
### Replication and high availability
`vmagent` replicates the collected metrics among multiple remote storage instances configured via `-remoteWrite.url` args.
If a single remote storage instance temporarily is out of service, then the collected data remains available in another remote storage instance.
`vmagent` buffers the collected data in files at `-remoteWrite.tmpDataPath` until the remote storage becomes available again and then it sends the buffered data to the remote storage in order to prevent data gaps.
`vmagent` buffers the collected data in files at `-remoteWrite.tmpDataPath` until the remote storage becomes available again
and then it sends the buffered data to the remote storage in order to prevent data gaps.
### Relabeling and filtering
@@ -136,16 +161,36 @@ Also, Basic Auth can be enabled for the incoming `remote_write` requests with `-
### remote_write for clustered version
While `vmagent` can accept data in several supported protocols (OpenTSDB, Influx, Prometheus, Graphite) and scrape data from various targets, writes are always performed in Promethes remote_write protocol. Therefore for the [clustered version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html), `-remoteWrite.url` the command-line flag should be configured as `<schema>://<vminsert-host>:8480/insert/<accountID>/prometheus/api/v1/write` according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format). There is also support for multitenant writes. See [these docs](#multitenancy).
While `vmagent` can accept data in several supported protocols (OpenTSDB, Influx, Prometheus, Graphite) and scrape data from various targets,
writes are always performed in Promethes remote_write protocol. Therefore for the [clustered version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html),
the `-remoteWrite.url` command-line flag should be configured as `<schema>://<vminsert-host>:8480/insert/<accountID>/prometheus/api/v1/write`
according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format).
There is also support for multitenant writes. See [these docs](#multitenancy).
## Multitenancy
By default `vmagent` collects the data without tenant identifiers and routes it to the configured `-remoteWrite.url`.
[Multitenancy](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy) support is enabled when `-remoteWrite.multitenantURL` command-line flag is set. In this case `vmagent` accepts multitenant data at `http://vmagent:8429/insert/<accountID>/...` in the same way as cluster version of VictoriaMetrics does according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format) and routes it to `<-remoteWrite.multitenantURL>/insert/<accountID>/prometheus/api/v1/write`. If multiple `-remoteWrite.multitenantURL` command-line options are set, then `vmagent` replicates the collected data across all the configured urls. This allows using a single `vmagent` instance in front of VictoriaMetrics clusters for processing the data from all the tenants.
[VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) supports writing data to multiple tenants
specified via special labels - see [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels).
This allows specifying tenant ids via [relabeling](#relabeling) and writing multitenant data
to a single `-remoteWrite.url=http://<vminsert-addr>/insert/multitenant/prometheus/api/v1/write`.
If `-remoteWrite.multitenantURL` command-line flag is set and `vmagent` is configured to scrape Prometheus-compatible targets (e.g. if `-promscrape.config` command-line flag is set)
then `vmagent` reads tenantID from `__tenant_id__` label for the discovered targets and routes all the metrics from this target to the given `__tenant_id__`, e.g. to the url `<-remoteWrite.multitnenatURL>/insert/<__tenant_id__>/prometheus/api/v1/write`.
`vmagent` can accept data from the same multitenant endpoints as `vminsert` from [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html)
does according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format) and route the accepted data
to the corresponding [tenants](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy) in VictoriaMetrics cluster
pointed by the `-remoteWrite.multitenantURL` command-line flag. For example, if `-remoteWrite.multitenantURL` is set to `http://vminsert-service`,
then `vmagent` would accept multitenant data at `http://vmagent:8429/insert/<accountID>/...` endpoints in the same way
as [VictoriaMetrics cluster does](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format) and route
it to `http://vminsert-service/insert/<accountID>/prometheus/api/v1/write`.
If multiple `-remoteWrite.multitenantURL` command-line options are set, then `vmagent` replicates the collected data across all the configured urls.
This allows using a single `vmagent` instance in front of multiple VictoriaMetrics clusters.
If `-remoteWrite.multitenantURL` command-line flag is set and `vmagent` is configured to scrape Prometheus-compatible targets
(e.g. if `-promscrape.config` command-line flag is set) then `vmagent` reads tenantID from `__tenant_id__` label
for the discovered targets and routes all the metrics from this target to the given `__tenant_id__`,
e.g. to the url `<-remoteWrite.multitnenatURL>/insert/<__tenant_id__>/prometheus/api/v1/write`.
For example, the following relabeling rule instructs sending metrics to tenantID defined in the `prometheus.io/tenant` annotation of Kubernetes pod deployment:
@@ -180,7 +225,8 @@ See [the list of supported service discovery types for Prometheus scrape targets
`vmagent` supports the following additional options in [scrape_configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs) section:
* `headers` - a list of HTTP headers to send to scrape target with each scrape request. This can be used when the scrape target needs custom authorization and authentication. For example:
* `headers` - a list of HTTP headers to send to scrape target with each scrape request. This can be used when the scrape target
needs custom authorization and authentication. For example:
```yaml
scrape_configs:
@@ -190,11 +236,14 @@ scrape_configs:
- "My-Auth: TopSecret"
```
* `disable_compression: true` for disabling response compression on a per-job basis. By default `vmagent` requests compressed responses from scrape targets for saving network bandwidth.
* `disable_keepalive: true` for disabling [HTTP keep-alive connections](https://en.wikipedia.org/wiki/HTTP_persistent_connection) on a per-job basis. By default `vmagent` uses keep-alive connections to scrape targets for reducing overhead on connection re-establishing.
* `disable_compression: true` for disabling response compression on a per-job basis. By default `vmagent` requests compressed responses
from scrape targets for saving network bandwidth.
* `disable_keepalive: true` for disabling [HTTP keep-alive connections](https://en.wikipedia.org/wiki/HTTP_persistent_connection)
on a per-job basis. By default `vmagent` uses keep-alive connections to scrape targets for reducing overhead on connection re-establishing.
* `series_limit: N` for limiting the number of unique time series a single scrape target can expose. See [these docs](#cardinality-limiter).
* `stream_parse: true` for scraping targets in a streaming manner. This may be useful when targets export big number of metrics. See [these docs](#stream-parsing-mode).
* `scrape_align_interval: duration` for aligning scrapes to the given interval instead of using random offset in the range `[0 ... scrape_interval]` for scraping each target. The random offset helps spreading scrapes evenly in time.
* `scrape_align_interval: duration` for aligning scrapes to the given interval instead of using random offset
in the range `[0 ... scrape_interval]` for scraping each target. The random offset helps spreading scrapes evenly in time.
* `scrape_offset: duration` for specifying the exact offset for scraping instead of using random offset in the range `[0 ... scrape_interval]`.
* `relabel_debug: true` for enabling debug logging during relabeling of the discovered targets. See [these docs](#relabeling).
* `metric_relabel_debug: true` for enabling debug logging during relabeling of the scraped metrics. See [these docs](#relabeling).
@@ -204,7 +253,10 @@ See [scrape_configs docs](https://docs.victoriametrics.com/sd_configs.html#scrap
## Loading scrape configs from multiple files
`vmagent` supports loading [scrape configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs) from multiple files specified in the `scrape_config_files` section of `-promscrape.config` file. For example, the following `-promscrape.config` instructs `vmagent` loading scrape configs from all the `*.yml` files under `configs` directory, from `single_scrape_config.yml` local file and from `https://config-server/scrape_config.yml` url:
`vmagent` supports loading [scrape configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs) from multiple files specified
in the `scrape_config_files` section of `-promscrape.config` file. For example, the following `-promscrape.config` instructs `vmagent`
loading scrape configs from all the `*.yml` files under `configs` directory, from `single_scrape_config.yml` local file
and from `https://config-server/scrape_config.yml` url:
```yml
scrape_config_files:
@@ -213,7 +265,8 @@ scrape_config_files:
- https://config-server/scrape_config.yml
```
Every referred file can contain arbitrary number of [supported scrape configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs). There is no need in specifying top-level `scrape_configs` section in these files. For example:
Every referred file can contain arbitrary number of [supported scrape configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs).
There is no need in specifying top-level `scrape_configs` section in these files. For example:
```yml
- job_name: foo
@@ -230,24 +283,34 @@ Every referred file can contain arbitrary number of [supported scrape configs](h
`vmagent` doesn't support the following sections in Prometheus config file passed to `-promscrape.config` command-line flag:
* [remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write). This section is substituted with various `-remoteWrite*` command-line flags. See [the full list of flags](#advanced-usage). The `remote_write` section isn't supported in order to reduce possible confusion when `vmagent` is used for accepting incoming metrics via [supported push protocols](#how-to-push-data-to-vmagent). In this case the `-promscrape.config` file isn't needed.
* `remote_read`. This section isn't supported at all, since `vmagent` doesn't provide Prometheus querying API. It is expected that the querying API is provided by the remote storage specified via `-remoteWrite.url` such as VictoriaMetrics. See [Prometheus querying API docs for VictoriaMetrics](https://docs.victoriametrics.com/#prometheus-querying-api-usage).
* [remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write). This section is substituted
with various `-remoteWrite*` command-line flags. See [the full list of flags](#advanced-usage). The `remote_write` section isn't supported
in order to reduce possible confusion when `vmagent` is used for accepting incoming metrics via [supported push protocols](#how-to-push-data-to-vmagent).
In this case the `-promscrape.config` file isn't needed.
* `remote_read`. This section isn't supported at all, since `vmagent` doesn't provide Prometheus querying API.
It is expected that the querying API is provided by the remote storage specified via `-remoteWrite.url` such as VictoriaMetrics.
See [Prometheus querying API docs for VictoriaMetrics](https://docs.victoriametrics.com/#prometheus-querying-api-usage).
* `rule_files` and `alerting`. These sections are supported by [vmalert](https://docs.victoriametrics.com/vmalert.html).
The list of supported service discovery types is available [here](#how-to-collect-metrics-in-prometheus-format).
Additionally `vmagent` doesn't support `refresh_interval` option at service discovery sections. This option is substituted with `-promscrape.*CheckInterval` command-line options, which are specific per each service discovery type. See [the full list of command-line flags for vmagent](#advanced-usage).
Additionally `vmagent` doesn't support `refresh_interval` option at service discovery sections.
This option is substituted with `-promscrape.*CheckInterval` command-line options, which are specific per each service discovery type.
See [the full list of command-line flags for vmagent](#advanced-usage).
## Adding labels to metrics
Extra labels can be added to metrics collected by `vmagent` via the following mechanisms:
* The `global -> external_labels` section in `-promscrape.config` file. These labels are added only to metrics scraped from targets configured in the `-promscrape.config` file. They aren't added to metrics collected via other [data ingestion protocols](#how-to-push-data-to-vmagent).
* The `-remoteWrite.label` command-line flag. These labels are added to all the collected metrics before sending them to `-remoteWrite.url`. For example, the following command starts `vmagent`, which adds `{datacenter="foobar"}` label to all the metrics pushed to all the configured remote storage systems (all the `-remoteWrite.url` flag values):
* The `global -> external_labels` section in `-promscrape.config` file. These labels are added only to metrics scraped from targets configured
in the `-promscrape.config` file. They aren't added to metrics collected via other [data ingestion protocols](#how-to-push-data-to-vmagent).
* The `-remoteWrite.label` command-line flag. These labels are added to all the collected metrics before sending them to `-remoteWrite.url`.
For example, the following command starts `vmagent`, which adds `{datacenter="foobar"}` label to all the metrics pushed
to all the configured remote storage systems (all the `-remoteWrite.url` flag values):
```
/path/to/vmagent -remoteWrite.label=datacenter=foobar ...
```
```
/path/to/vmagent -remoteWrite.label=datacenter=foobar ...
```
* Via relabeling. See [these docs](#relabeling).
@@ -256,59 +319,87 @@ Extra labels can be added to metrics collected by `vmagent` via the following me
`vmagent` automatically generates the following metrics per each scrape of every [Prometheus-compatible target](#how-to-collect-metrics-in-prometheus-format):
* `up` - this metric exposes `1` value on successful scrape and `0` value on unsuccessful scrape. This allows monitoring failing scrapes with the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html):
* `up` - this metric exposes `1` value on successful scrape and `0` value on unsuccessful scrape. This allows monitoring
failing scrapes with the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html):
```metricsql
up == 0
```
* `scrape_duration_seconds` - the duration of the scrape for the given target. This allows monitoring slow scrapes. For example, the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html) returns scrapes, which take more than 1.5 seconds to complete:
* `scrape_duration_seconds` - the duration of the scrape for the given target. This allows monitoring slow scrapes.
For example, the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html) returns scrapes,
which take more than 1.5 seconds to complete:
```metricsql
scrape_duration_seconds > 1.5
```
* `scrape_timeout_seconds` - the configured timeout for the current scrape target (aka `scrape_timeout`). This allows detecting targets with scrape durations close to the configured scrape timeout. For example, the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html) returns targets (identified by `instance` label), which take more than 80% of the configured `scrape_timeout` during scrapes:
* `scrape_timeout_seconds` - the configured timeout for the current scrape target (aka `scrape_timeout`).
This allows detecting targets with scrape durations close to the configured scrape timeout.
For example, the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html) returns targets (identified by `instance` label),
which take more than 80% of the configured `scrape_timeout` during scrapes:
```metricsql
scrape_duration_seconds / scrape_timeout_seconds > 0.8
```
* `scrape_samples_scraped` - the number of samples (aka metrics) parsed per each scrape. This allows detecting targets, which expose too many metrics. For example, the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html) returns targets, which expose more than 10000 metrics:
* `scrape_samples_scraped` - the number of samples (aka metrics) parsed per each scrape. This allows detecting targets,
which expose too many metrics. For example, the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html)
returns targets, which expose more than 10000 metrics:
```metricsql
scrape_samples_scraped > 10000
```
* `scrape_samples_limit` - the configured limit on the number of metrics the given target can expose. The limit can be set via `sample_limit` option at [scrape_configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs). This metric is exposed only if the `sample_limit` is set. This allows detecting targets, which expose too many metrics compared to the configured `sample_limit`. For example, the following query returns targets (identified by `instance` label), which expose more than 80% metrics compared to the configed `sample_limit`:
* `scrape_samples_limit` - the configured limit on the number of metrics the given target can expose.
The limit can be set via `sample_limit` option at [scrape_configs](https://docs.victoriametrics.com/sd_configs.html#scrape_configs).
This metric is exposed only if the `sample_limit` is set. This allows detecting targets,
which expose too many metrics compared to the configured `sample_limit`. For example, the following query
returns targets (identified by `instance` label), which expose more than 80% metrics compared to the configed `sample_limit`:
```metricsql
scrape_samples_scraped / scrape_samples_limit > 0.8
```
* `scrape_samples_post_metric_relabeling` - the number of samples (aka metrics) left after applying metric-level relabeling from `metric_relabel_configs` section (see [relabeling docs](#relabeling) for more details). This allows detecting targets with too many metrics after the relabeling. For example, the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html) returns targets with more than 10000 metrics after the relabeling:
* `scrape_samples_post_metric_relabeling` - the number of samples (aka metrics) left after applying metric-level relabeling
from `metric_relabel_configs` section (see [relabeling docs](#relabeling) for more details).
This allows detecting targets with too many metrics after the relabeling.
For example, the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html) returns targets
with more than 10000 metrics after the relabeling:
```metricsql
scrape_samples_post_metric_relabeling > 10000
```
* `scrape_series_added` - **an approximate** number of new series the given target generates during the current scrape. This metric allows detecting targets (identified by `instance` label), which lead to [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). For example, the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html) returns targets, which generate more than 1000 new series during the last hour:
* `scrape_series_added` - **an approximate** number of new series the given target generates during the current scrape.
This metric allows detecting targets (identified by `instance` label),
which lead to [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate).
For example, the following [MetricsQL query](https://docs.victoriametrics.com/MetricsQL.html) returns targets,
which generate more than 1000 new series during the last hour:
```metricsql
sum_over_time(scrape_series_added[1h]) > 1000
```
`vmagent` sets `scrape_series_added` to zero when it runs with `-promscrape.noStaleMarkers` command-line option (e.g. when [staleness markers](#prometheus-staleness-markers) are disabled).
`vmagent` sets `scrape_series_added` to zero when it runs with `-promscrape.noStaleMarkers` command-line option
or when it scrapes target with `no_stale_markers: true` option, e.g. when [staleness markers](#prometheus-staleness-markers) are disabled.
* `scrape_series_limit` - the limit on the number of unique time series the given target can expose according to [these docs](https://docs.victoriametrics.com/vmagent.html#cardinality-limiter). This metric is exposed only if the series limit is set.
* `scrape_series_limit` - the limit on the number of unique time series the given target can expose according to [these docs](#cardinality-limiter).
This metric is exposed only if the series limit is set.
* `scrape_series_current` - the number of unique series the given target exposed so far. This metric is exposed only if the series limit is set according to [these docs](https://docs.victoriametrics.com/vmagent.html#cardinality-limiter). This metric allows alerting when the number of exposed series by the given target reaches the limit. For example, the following query would alert when the target exposes more than 90% of unique series compared to the configured limit.
* `scrape_series_current` - the number of unique series the given target exposed so far.
This metric is exposed only if the series limit is set according to [these docs](#cardinality-limiter).
This metric allows alerting when the number of exposed series by the given target reaches the limit.
For example, the following query would alert when the target exposes more than 90% of unique series compared to the configured limit.
```metricsql
scrape_series_current / scrape_series_limit > 0.9
```
* `scrape_series_limit_samples_dropped` - exposes the number of dropped samples during the scrape because of the exceeded limit on the number of unique series. This metric is exposed only if the series limit is set according to [these docs](https://docs.victoriametrics.com/vmagent.html#cardinality-limiter). This metric allows alerting when scraped samples are dropped because of the exceeded limit. For example, the following query alerts when at least a single sample is dropped because of the exceeded limit during the last hour:
* `scrape_series_limit_samples_dropped` - exposes the number of dropped samples during the scrape because of the exceeded limit
on the number of unique series. This metric is exposed only if the series limit is set according to [these docs](#cardinality-limiter).
This metric allows alerting when scraped samples are dropped because of the exceeded limit.
For example, the following query alerts when at least a single sample is dropped because of the exceeded limit during the last hour:
```metricsql
sum_over_time(scrape_series_limit_samples_dropped[1h]) > 0
@@ -317,14 +408,36 @@ Extra labels can be added to metrics collected by `vmagent` via the following me
## Relabeling
VictoriaMetrics components support [Prometheus-compatible relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config) with [additional enhancements](#relabeling-enhancements) at various stages of data processing. The relabeling can be defined in the following places processed by `vmagent`:
VictoriaMetrics components support [Prometheus-compatible relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config)
with [additional enhancements](#relabeling-enhancements). The relabeling can be defined in the following places processed by `vmagent`:
* At the `scrape_config -> relabel_configs` section in `-promscrape.config` file. This relabeling is used for modifying labels in discovered targets and for dropping unneded targets. This relabeling can be debugged by passing `relabel_debug: true` option to the corresponding `scrape_config` section. In this case `vmagent` logs target labels before and after the relabeling and then drops the logged target.
* At the `scrape_config -> metric_relabel_configs` section in `-promscrape.config` file. This relabeling is used for modifying labels in scraped metrics and for dropping unneeded metrics. This relabeling can be debugged by passing `metric_relabel_debug: true` option to the corresponding `scrape_config` section. In this case `vmagent` logs metrics before and after the relabeling and then drops the logged metrics.
* At the `-remoteWrite.relabelConfig` file. This relabeling is used for modifying labels for all the collected metrics (inluding [metrics obtained via push-based protocols](#how-to-push-data-to-vmagent)) and for dropping unneeded metrics before sending them to all the configured `-remoteWrite.url` addresses. This relabeling can be debugged by passing `-remoteWrite.relabelDebug` command-line option to `vmagent`. In this case `vmagent` logs metrics before and after the relabeling and then drops all the logged metrics instead of sending them to remote storage.
* At the `-remoteWrite.urlRelabelConfig` files. This relabeling is used for modifying labels for metrics and for dropping unneeded metrics before sending them to a particular `-remoteWrite.url`. This relabeling can be debugged by passing `-remoteWrite.urlRelabelDebug` command-line options to `vmagent`. In this case `vmagent` logs metrics before and after the relabeling and then drops all the logged metrics instead of sending them to the corresponding `-remoteWrite.url`.
* At the `scrape_config -> relabel_configs` section in `-promscrape.config` file.
This relabeling is used for modifying labels in discovered targets and for dropping unneded targets.
See [relabeling cookbook](https://docs.victoriametrics.com/relabeling.html) for details.
All the files with relabeling configs can contain special placeholders in the form `%{ENV_VAR}`, which are replaced by the corresponding environment variable values.
This relabeling can be debugged by passing `relabel_debug: true` option to the corresponding `scrape_config` section.
In this case `vmagent` logs target labels before and after the relabeling and then drops the logged target.
* At the `scrape_config -> metric_relabel_configs` section in `-promscrape.config` file.
This relabeling is used for modifying labels in scraped metrics and for dropping unneeded metrics.
See [relabeling cookbook](https://docs.victoriametrics.com/relabeling.html) for details.
This relabeling can be debugged by passing `metric_relabel_debug: true` option to the corresponding `scrape_config` section.
In this case `vmagent` logs metrics before and after the relabeling and then drops the logged metrics.
* At the `-remoteWrite.relabelConfig` file. This relabeling is used for modifying labels for all the collected metrics
(inluding [metrics obtained via push-based protocols](#how-to-push-data-to-vmagent)) and for dropping unneeded metrics
before sending them to all the configured `-remoteWrite.url` addresses.
This relabeling can be debugged by passing `-remoteWrite.relabelDebug` command-line option to `vmagent`.
In this case `vmagent` logs metrics before and after the relabeling and then drops all the logged metrics instead of sending them to remote storage.
* At the `-remoteWrite.urlRelabelConfig` files. This relabeling is used for modifying labels for metrics
and for dropping unneeded metrics before sending them to a particular `-remoteWrite.url`.
This relabeling can be debugged by passing `-remoteWrite.urlRelabelDebug` command-line options to `vmagent`.
In this case `vmagent` logs metrics before and after the relabeling and then drops all the logged metrics instead of sending them to the corresponding `-remoteWrite.url`.
All the files with relabeling configs can contain special placeholders in the form `%{ENV_VAR}`,
which are replaced by the corresponding environment variable values.
The following articles contain useful information about Prometheus relabeling:
@@ -341,7 +454,11 @@ The following articles contain useful information about Prometheus relabeling:
## Relabeling enhancements
* The `replacement` option can refer arbitrary labels via {% raw %}`{{label_name}}`{% endraw %} placeholders. Such placeholders are substituted with the corresponding label value. For example, the following relabeling rule sets `instance-job` label value to `host123-foo` when applied to the metric with `{instance="host123",job="foo"}` labels:
`vmagent` provides the following enhancements on top of Prometheus-compatible relabeling:
* The `replacement` option can refer arbitrary labels via {% raw %}`{{label_name}}`{% endraw %} placeholders.
Such placeholders are substituted with the corresponding label value. For example, the following relabeling rule
sets `instance-job` label value to `host123-foo` when applied to the metric with `{instance="host123",job="foo"}` labels:
{% raw %}
```yaml
@@ -350,11 +467,13 @@ The following articles contain useful information about Prometheus relabeling:
```
{% endraw %}
* An optional `if` filter can be used for conditional relabeling. The `if` filter may contain arbitrary [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors). For example, the following relabeling rule drops metrics, which don't match `foo{bar="baz"}` series selector, while leaving the rest of metrics:
* An optional `if` filter can be used for conditional relabeling. The `if` filter may contain
arbitrary [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors).
For example, the following relabeling rule drops metrics, which don't match `foo{bar="baz"}` series selector, while leaving the rest of metrics:
```yaml
- action: keep
if: 'foo{bar="baz"}'
- if: 'foo{bar="baz"}'
action: keep
```
This is equivalent to less clear Prometheus-compatible relabeling rule:
@@ -365,7 +484,8 @@ The following articles contain useful information about Prometheus relabeling:
regex: 'foo;baz'
```
* The `regex` value can be split into multiple lines for improved readability and maintainability. These lines are automatically joined with `|` char when parsed. For example, the following configs are equivalent:
* The `regex` value can be split into multiple lines for improved readability and maintainability.
These lines are automatically joined with `|` char when parsed. For example, the following configs are equivalent:
```yaml
- action: keep_metrics
@@ -380,9 +500,12 @@ The following articles contain useful information about Prometheus relabeling:
- "foo_.+"
```
* VictoriaMetrics provides the following additional relabeling actions on top of standard actions from the [Prometheus relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config):
* VictoriaMetrics provides the following additional relabeling actions on top of standard actions
from the [Prometheus relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config):
* `replace_all` replaces all of the occurrences of `regex` in the values of `source_labels` with the `replacement` and stores the results in the `target_label`. For example, the following relabeling config replaces all the occurrences of `-` char in metric names with `_` char (e.g. `foo-bar-baz` metric name is transformed into `foo_bar_baz`):
* `replace_all` replaces all of the occurrences of `regex` in the values of `source_labels` with the `replacement`
and stores the results in the `target_label`. For example, the following relabeling config replaces all the occurrences
of `-` char in metric names with `_` char (e.g. `foo-bar-baz` metric name is transformed into `foo_bar_baz`):
```yaml
- action: replace_all
@@ -392,7 +515,9 @@ The following articles contain useful information about Prometheus relabeling:
replacement: "_"
```
* `labelmap_all` replaces all of the occurrences of `regex` in all the label names with the `replacement`. For example, the following relabeling config replaces all the occurrences of `-` char in all the label names with `_` char (e.g. `foo-bar-baz` label name is transformed into `foo_bar_baz`):
* `labelmap_all` replaces all of the occurrences of `regex` in all the label names with the `replacement`.
For example, the following relabeling config replaces all the occurrences of `-` char in all the label names
with `_` char (e.g. `foo-bar-baz` label name is transformed into `foo_bar_baz`):
```yaml
- action: labelmap_all
@@ -400,28 +525,35 @@ The following articles contain useful information about Prometheus relabeling:
replacement: "_"
```
* `keep_if_equal`: keeps the entry if all the label values from `source_labels` are equal, while dropping all the other entries. For example, the following relabeling config keeps targets if they contain equal values for `instance` and `host` labels, while dropping all the other targets:
* `keep_if_equal`: keeps the entry if all the label values from `source_labels` are equal,
while dropping all the other entries. For example, the following relabeling config keeps targets
if they contain equal values for `instance` and `host` labels, while dropping all the other targets:
```yaml
- action: keep_if_equal
source_labels: ["instance", "host"]
```
* `drop_if_equal`: drops the entry if all the label values from `source_labels` are equal, while keeping all the other entries. For example, the following relabeling config drops targets if they contain equal values for `instance` and `host` labels, while keeping all the other targets:
* `drop_if_equal`: drops the entry if all the label values from `source_labels` are equal,
while keeping all the other entries. For example, the following relabeling config drops targets
if they contain equal values for `instance` and `host` labels, while keeping all the other targets:
```yaml
- action: drop_if_equal
source_labels: ["instance", "host"]
```
* `keep_metrics`: keeps all the metrics with names matching the given `regex`, while dropping all the other metrics. For example, the following relabeling config keeps metrics with `fo` and `bar` names, while dropping all the other metrics:
* `keep_metrics`: keeps all the metrics with names matching the given `regex`,
while dropping all the other metrics. For example, the following relabeling config keeps metrics
with `foo` and `bar` names, while dropping all the other metrics:
```yaml
- action: keep_metrics
regex: "foo|bar"
```
* `drop_metrics`: drops all the metrics with names matching the given `regex`, while keeping all the other metrics. For example, the following relabeling config drops metrics with `foo` and `bar` names, while leaving all the other metrics:
* `drop_metrics`: drops all the metrics with names matching the given `regex`, while keeping all the other metrics.
For example, the following relabeling config drops metrics with `foo` and `bar` names, while leaving all the other metrics:
```yaml
- action: drop_metrics
@@ -471,17 +603,36 @@ Additionally, the `action: graphite` relabeling rules usually work much faster t
* If the scrape target becomes temporarily unavailable, then stale markers are sent for all the metrics scraped from this target.
* If the scrape target is removed from the list of targets, then stale markers are sent for all the metrics scraped from this target.
Prometheus staleness markers' tracking needs additional memory, since it must store the previous response body per each scrape target in order to compare it to the current response body. The memory usage may be reduced by passing `-promscrape.noStaleMarkers` command-line flag to `vmagent`. This disables staleness tracking. This also disables tracking the number of new time series per each scrape with the auto-generated `scrape_series_added` metric. See [these docs](https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series) for details.
Prometheus staleness markers' tracking needs additional memory, since it must store the previous response body per each scrape target
in order to compare it to the current response body. The memory usage may be reduced by disabling staleness tracking in the following ways:
* By passing `-promscrape.noStaleMarkers` command-line flag to `vmagent`. This disables staleness tracking across all the targets.
* By specifying `no_stale_markers: true` option in the [scrape_config](https://docs.victoriametrics.com/sd_configs.html#scrape_configs) for the corresponding target.
When staleness tracking is disabled, then `vmagent` doesn't track the number of new time series per each scrape,
e.g. it sets `scrape_series_added` metric to zero. See [these docs](#automatically-generated-metrics) for details.
## Stream parsing mode
By default `vmagent` reads the full response body from scrape target into memory, then parses it, applies [relabeling](#relabeling) and then pushes the resulting metrics to the configured `-remoteWrite.url`. This mode works good for the majority of cases when the scrape target exposes small number of metrics (e.g. less than 10 thousand). But this mode may take big amounts of memory when the scrape target exposes big number of metrics. In this case it is recommended enabling stream parsing mode. When this mode is enabled, then `vmagent` reads response from scrape target in chunks, then immediately processes every chunk and pushes the processed metrics to remote storage. This allows saving memory when scraping targets that expose millions of metrics.
By default `vmagent` reads the full response body from scrape target into memory, then parses it, applies [relabeling](#relabeling)
and then pushes the resulting metrics to the configured `-remoteWrite.url`. This mode works good for the majority of cases
when the scrape target exposes small number of metrics (e.g. less than 10 thousand). But this mode may take big amounts of memory
when the scrape target exposes big number of metrics. In this case it is recommended enabling stream parsing mode.
When this mode is enabled, then `vmagent` reads response from scrape target in chunks, then immediately processes every chunk
and pushes the processed metrics to remote storage. This allows saving memory when scraping targets that expose millions of metrics.
Stream parsing mode is automatically enabled for scrape targets returning response bodies with sizes bigger than the `-promscrape.minResponseSizeForStreamParse` command-line flag value. Additionally, the stream parsing mode can be explicitly enabled in the following places:
Stream parsing mode is automatically enabled for scrape targets returning response bodies with sizes bigger than
the `-promscrape.minResponseSizeForStreamParse` command-line flag value. Additionally,
stream parsing mode can be explicitly enabled in the following places:
* Via `-promscrape.streamParse` command-line flag. In this case all the scrape targets defined in the file pointed by `-promscrape.config` are scraped in stream parsing mode.
* Via `stream_parse: true` option at `scrape_configs` section. In this case all the scrape targets defined in this section are scraped in stream parsing mode.
* Via `__stream_parse__=true` label, which can be set via [relabeling](#relabeling) at `relabel_configs` section. In this case stream parsing mode is enabled for the corresponding scrape targets. Typical use case: to set the label via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets exposing big number of metrics.
* Via `-promscrape.streamParse` command-line flag. In this case all the scrape targets defined
in the file pointed by `-promscrape.config` are scraped in stream parsing mode.
* Via `stream_parse: true` option at `scrape_configs` section. In this case all the scrape targets defined
in this section are scraped in stream parsing mode.
* Via `__stream_parse__=true` label, which can be set via [relabeling](#relabeling) at `relabel_configs` section.
In this case stream parsing mode is enabled for the corresponding scrape targets.
Typical use case: to set the label via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/)
for targets exposing big number of metrics.
Examples:
@@ -499,7 +650,8 @@ scrape_configs:
'match[]': ['{__name__!=""}']
```
Note that `sample_limit` and `series_limit` options cannot be used in stream parsing mode because the parsed data is pushed to remote storage as soon as it is parsed.
Note that `sample_limit` and `series_limit` [scrape_config options](https://docs.victoriametrics.com/sd_configs.html#scrape_configs)
cannot be used in stream parsing mode because the parsed data is pushed to remote storage as soon as it is parsed.
## Scraping big number of targets
@@ -515,7 +667,8 @@ spread scrape targets among a cluster of two `vmagent` instances:
/path/to/vmagent -promscrape.cluster.membersCount=2 -promscrape.cluster.memberNum=1 -promscrape.config=/path/to/config.yml ...
```
The `-promscrape.cluster.memberNum` can be set to a StatefulSet pod name when `vmagent` runs in Kubernetes. The pod name must end with a number in the range `0 ... promscrape.cluster.memberNum-1`. For example, `-promscrape.cluster.memberNum=vmagent-0`.
The `-promscrape.cluster.memberNum` can be set to a StatefulSet pod name when `vmagent` runs in Kubernetes.
The pod name must end with a number in the range `0 ... promscrape.cluster.memberNum-1`. For example, `-promscrape.cluster.memberNum=vmagent-0`.
By default each scrape target is scraped only by a single `vmagent` instance in the cluster. If there is a need for replicating scrape targets among multiple `vmagent` instances,
then `-promscrape.cluster.replicationFactor` command-line flag must be set to the desired number of replicas. For example, the following commands
@@ -585,9 +738,14 @@ scrape_configs:
By default `vmagent` doesn't limit the number of time series each scrape target can expose. The limit can be enforced in the following places:
* Via `-promscrape.seriesLimitPerTarget` command-line option. This limit is applied individually to all the scrape targets defined in the file pointed by `-promscrape.config`.
* Via `series_limit` config option at `scrape_config` section. This limit is applied individually to all the scrape targets defined in the given `scrape_config`.
* Via `__series_limit__` label, which can be set with [relabeling](#relabeling) at `relabel_configs` section. This limit is applied to the corresponding scrape targets. Typical use case: to set the limit via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets, which may expose too high number of time series.
* Via `-promscrape.seriesLimitPerTarget` command-line option. This limit is applied individually
to all the scrape targets defined in the file pointed by `-promscrape.config`.
* Via `series_limit` config option at `scrape_config` section. This limit is applied individually
to all the scrape targets defined in the given `scrape_config`.
* Via `__series_limit__` label, which can be set with [relabeling](#relabeling) at `relabel_configs` section.
This limit is applied to the corresponding scrape targets. Typical use case: to set the limit
via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets,
which may expose too high number of time series.
See also `sample_limit` option at [scrape_config section](https://docs.victoriametrics.com/sd_configs.html#scrape_configs).
@@ -607,12 +765,16 @@ These metrics allow building the following alerting rules:
- `sum_over_time(scrape_series_limit_samples_dropped[1h]) > 0` - alerts when some samples are dropped because the series limit on a particular target is reached.
By default `vmagent` doesn't limit the number of time series written to remote storage systems specified at `-remoteWrite.url`. The limit can be enforced by setting the following command-line flags:
By default `vmagent` doesn't limit the number of time series written to remote storage systems specified at `-remoteWrite.url`.
The limit can be enforced by setting the following command-line flags:
* `-remoteWrite.maxHourlySeries` - limits the number of unique time series `vmagent` can write to remote storage systems during the last hour. Useful for limiting the number of active time series.
* `-remoteWrite.maxDailySeries` - limits the number of unique time series `vmagent` can write to remote storage systems during the last day. Useful for limiting daily churn rate.
* `-remoteWrite.maxHourlySeries` - limits the number of unique time series `vmagent` can write to remote storage systems during the last hour.
Useful for limiting the number of active time series.
* `-remoteWrite.maxDailySeries` - limits the number of unique time series `vmagent` can write to remote storage systems during the last day.
Useful for limiting daily churn rate.
Both limits can be set simultaneously. If any of these limits is reached, then samples for new time series are dropped instead of sending them to remote storage systems. A sample of dropped series is put in the log with `WARNING` level.
Both limits can be set simultaneously. If any of these limits is reached, then samples for new time series are dropped instead of sending
them to remote storage systems. A sample of dropped series is put in the log with `WARNING` level.
`vmagent` exposes the following metrics at `http://vmagent:8429/metrics` page (see [monitoring docs](#monitoring) for details):
@@ -625,23 +787,29 @@ Both limits can be set simultaneously. If any of these limits is reached, then s
These limits are approximate, so `vmagent` can underflow/overflow the limit by a small percentage (usually less than 1%).
See also [cardinality explorer docs](https://docs.victoriametrics.com/#cardinality-explorer).
## Monitoring
`vmagent` exports various metrics in Prometheus exposition format at `http://vmagent-host:8429/metrics` page. We recommend setting up regular scraping of this page
either through `vmagent` itself or by Prometheus so that the exported metrics may be analyzed later.
Use official [Grafana dashboard](https://grafana.com/grafana/dashboards/12683) for `vmagent` state overview. Graphs on this dashboard contain useful hints - hover the `i` icon at the top left corner of each graph in order to read it.
`vmagent` exports various metrics in Prometheus exposition format at `http://vmagent-host:8429/metrics` page.
We recommend setting up regular scraping of this page either through `vmagent` itself or by Prometheus
so that the exported metrics may be analyzed later.
Use official [Grafana dashboard](https://grafana.com/grafana/dashboards/12683) for `vmagent` state overview.
Graphs on this dashboard contain useful hints - hover the `i` icon at the top left corner of each graph in order to read it.
If you have suggestions for improvements or have found a bug - please open an issue on github or add a review to the dashboard.
`vmagent` also exports the status for various targets at the following handlers:
`vmagent` also exports the status for various targets at the following pages:
* `http://vmagent-host:8429/targets`. This handler returns human-readable status for every active target.
This page is easy to query from the command line with `wget`, `curl` or similar tools.
It accepts optional `show_original_labels=1` query arg which shows the original labels per each target before applying the relabeling.
This information may be useful for debugging target relabeling.
* `http://vmagent-host:8429/api/v1/targets`. This handler returns data compatible with [the corresponding page from Prometheus API](https://prometheus.io/docs/prometheus/latest/querying/api/#targets).
* `http://vmagent-host:8429/ready`. This handler returns http 200 status code when `vmagent` finishes it's initialization for all service_discovery configs.
It may be useful to perform `vmagent` rolling update without any scrape loss.
* `http://vmagent-host:8429/targets`. This pages shows the current status for every active target.
* `http://vmagent-host:8429/service-discovery`. This pages shows the list of discovered targets with the discovered `__meta_*` labels
according to [these docs](https://docs.victoriametrics.com/sd_configs.html).
This page may help debugging target [relabeling](#relabeling).
* `http://vmagent-host:8429/api/v1/targets`. This handler returns JSON response
compatible with [the corresponding page from Prometheus API](https://prometheus.io/docs/prometheus/latest/querying/api/#targets).
* `http://vmagent-host:8429/ready`. This handler returns http 200 status code when `vmagent` finishes
it's initialization for all the [service_discovery configs](https://docs.victoriametrics.com/sd_configs.html).
It may be useful to perform `vmagent` rolling update without any scrape loss.
## Troubleshooting
@@ -654,24 +822,40 @@ It may be useful to perform `vmagent` rolling update without any scrape loss.
* Disabling staleness tracking with `-promscrape.noStaleMarkers` option. See [these docs](#prometheus-staleness-markers).
* Enabling stream parsing mode if `vmagent` scrapes targets with millions of metrics per target. See [these docs](#stream-parsing-mode).
* Reducing the number of output queues with `-remoteWrite.queues` command-line option.
* Reducing the amounts of RAM vmagent can use for in-memory buffering with `-memory.allowedPercent` or `-memory.allowedBytes` command-line option. Another option is to reduce memory limits in Docker and/or Kubernetes if `vmagent` runs under these systems.
* Reducing the number of CPU cores vmagent can use by passing `GOMAXPROCS=N` environment variable to `vmagent`, where `N` is the desired limit on CPU cores. Another option is to reduce CPU limits in Docker or Kubernetes if `vmagent` runs under these systems.
* Passing `-promscrape.dropOriginalLabels` command-line option to `vmagent`, so it drops `"discoveredLabels"` and `"droppedTargets"` lists at `/api/v1/targets` page. This reduces memory usage when scraping big number of targets at the cost of reduced debuggability for improperly configured per-target relabeling.
* Reducing the amounts of RAM vmagent can use for in-memory buffering with `-memory.allowedPercent` or `-memory.allowedBytes` command-line option.
Another option is to reduce memory limits in Docker and/or Kubernetes if `vmagent` runs under these systems.
* Reducing the number of CPU cores vmagent can use by passing `GOMAXPROCS=N` environment variable to `vmagent`,
where `N` is the desired limit on CPU cores. Another option is to reduce CPU limits in Docker or Kubernetes if `vmagent` runs under these systems.
* Passing `-promscrape.dropOriginalLabels` command-line option to `vmagent`, so it drops `"discoveredLabels"` and `"droppedTargets"`
lists at `/api/v1/targets` page. This reduces memory usage when scraping big number of targets at the cost
of reduced debuggability for improperly configured per-target relabeling.
* When `vmagent` scrapes many unreliable targets, it can flood the error log with scrape errors. These errors can be suppressed
by passing `-promscrape.suppressScrapeErrors` command-line flag to `vmagent`. The most recent scrape error per each target can be observed at `http://vmagent-host:8429/targets`
and `http://vmagent-host:8429/api/v1/targets`.
* The `/api/v1/targets` page could be useful for debugging relabeling process for scrape targets.
This page contains original labels for targets dropped during relabeling (see "droppedTargets" section in the page output). By default the `-promscrape.maxDroppedTargets` targets are shown here. If your setup drops more targets during relabeling, then increase `-promscrape.maxDroppedTargets` command-line flag value to see all the dropped targets. Note that tracking each dropped target requires up to 10Kb of RAM. Therefore big values for `-promscrape.maxDroppedTargets` may result in increased memory usage if a big number of scrape targets are dropped during relabeling.
* The `/service-discovery` page could be useful for debugging relabeling process for scrape targets.
This page contains original labels for targets dropped during relabeling.
By default the `-promscrape.maxDroppedTargets` targets are shown here. If your setup drops more targets during relabeling,
then increase `-promscrape.maxDroppedTargets` command-line flag value to see all the dropped targets.
Note that tracking each dropped target requires up to 10Kb of RAM. Therefore big values for `-promscrape.maxDroppedTargets`
may result in increased memory usage if a big number of scrape targets are dropped during relabeling.
* We recommend you increase `-remoteWrite.queues` if `vmagent_remotewrite_pending_data_bytes` metric exported at `http://vmagent-host:8429/metrics` page grows constantly. It is also recommended increasing `-remoteWrite.maxBlockSize` and `-remoteWrite.maxRowsPerBlock` command-line options in this case. This can improve data ingestion performance to the configured remote storage systems at the cost of higher memory usage.
* We recommend you increase `-remoteWrite.queues` if `vmagent_remotewrite_pending_data_bytes` metric exported
at `http://vmagent-host:8429/metrics` page grows constantly. It is also recommended increasing `-remoteWrite.maxBlockSize`
and `-remoteWrite.maxRowsPerBlock` command-line options in this case. This can improve data ingestion performance
to the configured remote storage systems at the cost of higher memory usage.
* If you see gaps in the data pushed by `vmagent` to remote storage when `-remoteWrite.maxDiskUsagePerURL` is set, try increasing `-remoteWrite.queues`. Such gaps may appear because `vmagent` cannot keep up with sending the collected data to remote storage. Therefore it starts dropping the buffered data if the on-disk buffer size exceeds `-remoteWrite.maxDiskUsagePerURL`.
* If you see gaps in the data pushed by `vmagent` to remote storage when `-remoteWrite.maxDiskUsagePerURL` is set,
try increasing `-remoteWrite.queues`. Such gaps may appear because `vmagent` cannot keep up with sending the collected data to remote storage.
Therefore it starts dropping the buffered data if the on-disk buffer size exceeds `-remoteWrite.maxDiskUsagePerURL`.
* `vmagent` drops data blocks if remote storage replies with `400 Bad Request` and `409 Conflict` HTTP responses. The number of dropped blocks can be monitored via `vmagent_remotewrite_packets_dropped_total` metric exported at [/metrics page](#monitoring).
* `vmagent` drops data blocks if remote storage replies with `400 Bad Request` and `409 Conflict` HTTP responses.
The number of dropped blocks can be monitored via `vmagent_remotewrite_packets_dropped_total` metric exported at [/metrics page](#monitoring).
* Use `-remoteWrite.queues=1` when `-remoteWrite.url` points to remote storage, which doesn't accept out-of-order samples (aka data backfilling). Such storage systems include Prometheus, Cortex and Thanos, which typically emit `out of order sample` errors. The best solution is to use remote storage with [backfilling support](https://docs.victoriametrics.com/#backfilling).
* Use `-remoteWrite.queues=1` when `-remoteWrite.url` points to remote storage, which doesn't accept out-of-order samples (aka data backfilling).
Such storage systems include Prometheus, Cortex and Thanos, which typically emit `out of order sample` errors.
The best solution is to use remote storage with [backfilling support](https://docs.victoriametrics.com/#backfilling) such as VictoriaMetrics.
* `vmagent` buffers scraped data at the `-remoteWrite.tmpDataPath` directory until it is sent to `-remoteWrite.url`.
The directory can grow large when remote storage is unavailable for extended periods of time and if `-remoteWrite.maxDiskUsagePerURL` isn't set.
@@ -724,30 +908,38 @@ See also [troubleshooting docs](https://docs.victoriametrics.com/Troubleshooting
## Kafka integration
[Enterprise version](https://victoriametrics.com/products/enterprise/) of `vmagent` can read and write metrics from / to Kafka:
[Enterprise version](https://docs.victoriametrics.com/enterprise.html) of `vmagent` can read and write metrics from / to Kafka:
* [Reading metrics from Kafka](#reading-metrics-from-kafka)
* [Writing metrics to Kafka](#writing-metrics-to-kafka)
The enterprise version of vmagent is available for evaluation at [releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) page in `vmutils-*-enteprise.tar.gz` archives and in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix.
The enterprise version of vmagent is available for evaluation at [releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) page
in `vmutils-...-enteprise.tar.gz` archives and in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix.
### Reading metrics from Kafka
[Enterprise version](https://victoriametrics.com/products/enterprise/) of `vmagent` can read metrics in various formats from Kafka messages. These formats can be configured with `-kafka.consumer.topic.defaultFormat` or `-kafka.consumer.topic.format` command-line options. The following formats are supported:
[Enterprise version](https://docs.victoriametrics.com/enterprise.html) of `vmagent` can read metrics in various formats from Kafka messages.
These formats can be configured with `-kafka.consumer.topic.defaultFormat` or `-kafka.consumer.topic.format` command-line options. The following formats are supported:
* `promremotewrite` - [Prometheus remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write). Messages in this format can be sent by vmagent - see [these docs](#writing-metrics-to-kafka).
* `promremotewrite` - [Prometheus remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
Messages in this format can be sent by vmagent - see [these docs](#writing-metrics-to-kafka).
* `influx` - [InfluxDB line protocol format](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/).
* `prometheus` - [Prometheus text exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format) and [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md).
* `prometheus` - [Prometheus text exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format)
and [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md).
* `graphite` - [Graphite plaintext format](https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol).
* `jsonline` - [JSON line format](https://docs.victoriametrics.com/#how-to-import-data-in-json-line-format).
Every Kafka message may contain multiple lines in `influx`, `prometheus`, `graphite` and `jsonline` format delimited by `\n`.
`vmagent` consumes messages from Kafka topics specified by `-kafka.consumer.topic` command-line flag. Multiple topics can be specified by passing multiple `-kafka.consumer.topic` command-line flags to `vmagent`.
`vmagent` consumes messages from Kafka topics specified by `-kafka.consumer.topic` command-line flag. Multiple topics can be specified
by passing multiple `-kafka.consumer.topic` command-line flags to `vmagent`.
`vmagent` consumes messages from Kafka brokers specified by `-kafka.consumer.topic.brokers` command-line flag. Multiple brokers can be specified per each `-kafka.consumer.topic` by passing a list of brokers delimited by `;`. For example, `-kafka.consumer.topic.brokers=host1:9092;host2:9092`.
`vmagent` consumes messages from Kafka brokers specified by `-kafka.consumer.topic.brokers` command-line flag.
Multiple brokers can be specified per each `-kafka.consumer.topic` by passing a list of brokers delimited by `;`.
For example, `-kafka.consumer.topic.brokers=host1:9092;host2:9092`.
The following command starts `vmagent`, which reads metrics in InfluxDB line protocol format from Kafka broker at `localhost:9092` from the topic `metrics-by-telegraf` and sends them to remote storage at `http://localhost:8428/api/v1/write`:
The following command starts `vmagent`, which reads metrics in InfluxDB line protocol format from Kafka broker at `localhost:9092`
from the topic `metrics-by-telegraf` and sends them to remote storage at `http://localhost:8428/api/v1/write`:
```console
./bin/vmagent -remoteWrite.url=http://localhost:8428/api/v1/write \
@@ -768,7 +960,9 @@ data_format = "influx"
#### Command-line flags for Kafka consumer
These command-line flags are available only in [enterprise](https://victoriametrics.com/products/enterprise/) version of `vmagent`, which can be downloaded for evaluation from [releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) page (see `vmutils-*-enteprise.tar.gz` archives) and from [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix.
These command-line flags are available only in [enterprise](https://docs.victoriametrics.com/enterprise.html) version of `vmagent`,
which can be downloaded for evaluation from [releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) page
(see `vmutils-...-enteprise.tar.gz` archives) and from [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix.
```
-kafka.consumer.topic array
@@ -801,9 +995,13 @@ These command-line flags are available only in [enterprise](https://victoriametr
### Writing metrics to Kafka
[Enterprise version](https://victoriametrics.com/products/enterprise/) of `vmagent` writes data to Kafka with `at-least-once` semantics if `-remoteWrite.url` contains e.g. Kafka url. For example, if `vmagent` is started with `-remoteWrite.url=kafka://localhost:9092/?topic=prom-rw`, then it would send Prometheus remote_write messages to Kafka bootstrap server at `localhost:9092` with the topic `prom-rw`. These messages can be read later from Kafka by another `vmagent` - see [these docs](#reading-metrics-from-kafka) for details.
[Enterprise version](https://docs.victoriametrics.com/enterprise.html) of `vmagent` writes data to Kafka with `at-least-once`
semantics if `-remoteWrite.url` contains e.g. Kafka url. For example, if `vmagent` is started with `-remoteWrite.url=kafka://localhost:9092/?topic=prom-rw`,
then it would send Prometheus remote_write messages to Kafka bootstrap server at `localhost:9092` with the topic `prom-rw`.
These messages can be read later from Kafka by another `vmagent` - see [these docs](#reading-metrics-from-kafka) for details.
Additional Kafka options can be passed as query params to `-remoteWrite.url`. For instance, `kafka://localhost:9092/?topic=prom-rw&client.id=my-favorite-id` sets `client.id` Kafka option to `my-favorite-id`. The full list of Kafka options is available [here](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md).
Additional Kafka options can be passed as query params to `-remoteWrite.url`. For instance, `kafka://localhost:9092/?topic=prom-rw&client.id=my-favorite-id`
sets `client.id` Kafka option to `my-favorite-id`. The full list of Kafka options is available [here](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md).
#### Kafka broker authorization and authentication
@@ -823,11 +1021,13 @@ Two types of auth are supported:
## How to build from sources
We recommend using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - `vmagent` is located in the `vmutils-*` archives .
We recommend using [official binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - `vmagent` is located in the `vmutils-...` archives.
It may be needed to build `vmagent` from source code when developing or testing new feature or bugfix.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make vmagent` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds the `vmagent` binary and puts it into the `bin` folder.
@@ -856,7 +1056,7 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make vmagent-linux-arm` or `make vmagent-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics)
It builds `vmagent-linux-arm` or `vmagent-linux-arm64` binary respectively and puts it into the `bin` folder.
@@ -893,6 +1093,7 @@ curl http://0.0.0.0:8429/debug/pprof/profile > cpu.pprof
The command for collecting CPU profile waits for 30 seconds before returning.
The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
It is safe sharing the collected profiles from security point of view, since they do not contain sensitive information.
## Advanced usage
@@ -906,6 +1107,8 @@ vmagent collects metrics data via popular data ingestion protocols and routes th
See the docs at https://docs.victoriametrics.com/vmagent.html .
-cacheExpireDuration duration
Items are removed from in-memory caches after they aren't accessed for this duration. Lower values may reduce memory usage at the cost of higher CPU usage. See also -prevCacheRemovalPercent (default 30m0s)
-configAuthKey string
Authorization key for accessing /config page. It must be passed via authKey query arg
-csvTrimTimestamp duration
@@ -913,6 +1116,8 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-datadog.maxInsertRequestSize size
The maximum size in bytes of a single DataDog POST request to /api/v1/series
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864)
-datadog.sanitizeMetricName
Sanitize metric names for the ingested DataDog data to comply with DataDog behaviour described at https://docs.datadoghq.com/metrics/custom_metrics/#naming-custom-metrics (default true)
-denyQueryTracing
Whether to disable the ability to trace queries. See https://docs.victoriametrics.com/#query-tracing
-dryRun
@@ -924,7 +1129,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
@@ -975,30 +1180,30 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-insert.maxQueueDuration duration
The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s)
-kafka.consumer.topic array
Kafka topic names for data consumption.
Kafka topic names for data consumption. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.basicAuth.password array
Optional basic auth password for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN'
Optional basic auth password for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN' . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.basicAuth.username array
Optional basic auth username for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN'
Optional basic auth username for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN' . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.brokers array
List of brokers to connect for given topic, e.g. -kafka.consumer.topic.broker=host-1:9092;host-2:9092
List of brokers to connect for given topic, e.g. -kafka.consumer.topic.broker=host-1:9092;host-2:9092 . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.defaultFormat string
Expected data format in the topic if -kafka.consumer.topic.format is skipped. (default "promremotewrite")
Expected data format in the topic if -kafka.consumer.topic.format is skipped. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html (default "promremotewrite")
-kafka.consumer.topic.format array
data format for corresponding kafka topic. Valid formats: influx, prometheus, promremotewrite, graphite, jsonline
data format for corresponding kafka topic. Valid formats: influx, prometheus, promremotewrite, graphite, jsonline . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.groupID array
Defines group.id for topic
Defines group.id for topic. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.isGzipped array
Enables gzip setting for topic messages payload. Only prometheus, jsonline and influx formats accept gzipped messages.
Enables gzip setting for topic messages payload. Only prometheus, jsonline and influx formats accept gzipped messages.This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.options array
Optional key=value;key1=value2 settings for topic consumer. See full configuration options at https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md.
Optional key=value;key1=value2 settings for topic consumer. See full configuration options at https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
@@ -1039,6 +1244,8 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-pprofAuthKey string
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
-prevCacheRemovalPercent float
Items in the previous caches are removed when the percent of requests it serves becomes lower than this value. Higher values reduce memory usage at the cost of higher CPU usage. See also -cacheExpireDuration (default 0.1)
-promscrape.azureSDCheckInterval duration
Interval for checking for changes in Azure. This works only if azure_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#azure_sd_configs for details (default 1m0s)
-promscrape.cluster.memberNum string
@@ -1119,7 +1326,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
-promscrape.suppressScrapeErrorsDelay duration
The delay for suppressing repeated scrape errors logging per each scrape targets. This may be used for reducing the number of log lines related to scrape errors. See also -promscrape.suppressScrapeErrors
-promscrape.yandexcloudSDCheckInterval duration
Interval for checking for changes in Yandex Cloud API. This works only if yandexcloud_sd_configs is configured in '-promscrape.config' file. (default 30s)
Interval for checking for changes in Yandex Cloud API. This works only if yandexcloud_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#yandexcloud_sd_configs for details (default 30s)
-pushmetrics.extraLabel array
Optional labels to add to metrics pushed to -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
@@ -1180,9 +1387,10 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 8388608)
-remoteWrite.maxDailySeries int
The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter
-remoteWrite.maxDiskUsagePerURL size
-remoteWrite.maxDiskUsagePerURL array
The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500MB. Disk usage is unlimited if the value is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB.
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.maxHourlySeries int
The maximum number of unique series vmagent can send to remote storage systems during the last hour. Excess series are logged and dropped. This can be useful for limiting series cardinality. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter
-remoteWrite.maxRowsPerBlock int
@@ -1265,6 +1473,10 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-usePromCompatibleNaming
Whether to replace characters unsupported by Prometheus with underscores in the ingested metric names and label names. For example, foo.bar{a.b='c'} is transformed into foo_bar{a_b='c'} during data ingestion if this flag is set. See https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
-version
Show VictoriaMetrics version
```

View File

@@ -217,40 +217,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
path := strings.Replace(r.URL.Path, "//", "/", -1)
if strings.HasPrefix(path, "datadog/") {
// Trim suffix from paths starting from /datadog/ in order to support legacy DataDog agent.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2670
path = strings.TrimSuffix(path, "/")
}
switch path {
case "/api/v1/write":
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(nil, r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(nil, r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/csv":
csvimportRequests.Inc()
if err := csvimport.InsertHandler(nil, r); err != nil {
csvimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/prometheus":
if strings.HasPrefix(path, "/prometheus/api/v1/import/prometheus") || strings.HasPrefix(path, "/api/v1/import/prometheus") {
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(nil, r); err != nil {
prometheusimportErrors.Inc()
@@ -259,7 +226,41 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/native":
}
if strings.HasPrefix(path, "datadog/") {
// Trim suffix from paths starting from /datadog/ in order to support legacy DataDog agent.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2670
path = strings.TrimSuffix(path, "/")
}
switch path {
case "/prometheus/api/v1/write", "/api/v1/write":
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(nil, r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/prometheus/api/v1/import", "/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(nil, r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/prometheus/api/v1/import/csv", "/api/v1/import/csv":
csvimportRequests.Inc()
if err := csvimport.InsertHandler(nil, r); err != nil {
csvimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/prometheus/api/v1/import/native", "/api/v1/import/native":
nativeimportRequests.Inc()
if err := native.InsertHandler(nil, r); err != nil {
nativeimportErrors.Inc()
@@ -268,7 +269,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.WriteHeader(http.StatusNoContent)
return true
case "/write", "/api/v2/write":
case "/influx/write", "/influx/api/v2/write", "/write", "/api/v2/write":
influxWriteRequests.Inc()
if err := influx.InsertHandlerForHTTP(nil, r); err != nil {
influxWriteErrors.Inc()
@@ -277,7 +278,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.WriteHeader(http.StatusNoContent)
return true
case "/query":
case "/influx/query", "/query":
influxQueryRequests.Inc()
influxutils.WriteDatabaseNames(w)
return true
@@ -316,15 +317,21 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{}`)
return true
case "/targets":
case "/prometheus/targets", "/targets":
promscrapeTargetsRequests.Inc()
promscrape.WriteHumanReadableTargetsStatus(w, r)
return true
case "/service-discovery":
case "/prometheus/service-discovery", "/service-discovery":
promscrapeServiceDiscoveryRequests.Inc()
promscrape.WriteServiceDiscovery(w, r)
return true
case "/target_response":
case "/prometheus/api/v1/targets", "/api/v1/targets":
promscrapeAPIV1TargetsRequests.Inc()
w.Header().Set("Content-Type", "application/json")
state := r.FormValue("state")
promscrape.WriteAPIV1Targets(w, state)
return true
case "/prometheus/target_response", "/target_response":
promscrapeTargetResponseRequests.Inc()
if err := promscrape.WriteTargetResponse(w, r); err != nil {
promscrapeTargetResponseErrors.Inc()
@@ -332,7 +339,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
}
return true
case "/config":
case "/prometheus/config", "/config":
if *configAuthKey != "" && r.FormValue("authKey") != *configAuthKey {
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("The provided authKey doesn't match -configAuthKey"),
@@ -345,7 +352,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
w.Header().Set("Content-Type", "text/plain; charset=utf-8")
promscrape.WriteConfigData(w)
return true
case "/api/v1/status/config":
case "/prometheus/api/v1/status/config", "/api/v1/status/config":
// See https://prometheus.io/docs/prometheus/latest/querying/api/#config
if *configAuthKey != "" && r.FormValue("authKey") != *configAuthKey {
err := &httpserver.ErrorWithStatusCode{
@@ -361,13 +368,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
promscrape.WriteConfigData(&bb)
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%q}}`, bb.B)
return true
case "/api/v1/targets":
promscrapeAPIV1TargetsRequests.Inc()
w.Header().Set("Content-Type", "application/json")
state := r.FormValue("state")
promscrape.WriteAPIV1Targets(w, state)
return true
case "/-/reload":
case "/prometheus/-/reload", "/-/reload":
promscrapeConfigReloadRequests.Inc()
procutil.SelfSIGHUP()
w.WriteHeader(http.StatusOK)
@@ -409,6 +410,16 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
httpserver.Errorf(w, r, "cannot obtain auth token: %s", err)
return true
}
if strings.HasPrefix(p.Suffix, "prometheus/api/v1/import/prometheus") {
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(at, r); err != nil {
prometheusimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
}
if strings.HasPrefix(p.Suffix, "datadog/") {
// Trim suffix from paths starting from /datadog/ in order to support legacy DataDog agent.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2670
@@ -442,15 +453,6 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
}
w.WriteHeader(http.StatusNoContent)
return true
case "prometheus/api/v1/import/prometheus":
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(at, r); err != nil {
prometheusimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "prometheus/api/v1/import/native":
nativeimportRequests.Inc()
if err := native.InsertHandler(at, r); err != nil {

View File

@@ -24,46 +24,46 @@ var (
"By default the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data "+
"is sent after temporary unavailability of the remote storage")
sendTimeout = flagutil.NewArrayDuration("remoteWrite.sendTimeout", "Timeout for sending a single block of data to the corresponding -remoteWrite.url")
proxyURL = flagutil.NewArray("remoteWrite.proxyURL", "Optional proxy URL for writing data to the corresponding -remoteWrite.url. "+
proxyURL = flagutil.NewArrayString("remoteWrite.proxyURL", "Optional proxy URL for writing data to the corresponding -remoteWrite.url. "+
"Supported proxies: http, https, socks5. Example: -remoteWrite.proxyURL=socks5://proxy:1234")
tlsInsecureSkipVerify = flagutil.NewArrayBool("remoteWrite.tlsInsecureSkipVerify", "Whether to skip tls verification when connecting to the corresponding -remoteWrite.url")
tlsCertFile = flagutil.NewArray("remoteWrite.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting "+
tlsCertFile = flagutil.NewArrayString("remoteWrite.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting "+
"to the corresponding -remoteWrite.url")
tlsKeyFile = flagutil.NewArray("remoteWrite.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to the corresponding -remoteWrite.url")
tlsCAFile = flagutil.NewArray("remoteWrite.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to the corresponding -remoteWrite.url. "+
tlsKeyFile = flagutil.NewArrayString("remoteWrite.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to the corresponding -remoteWrite.url")
tlsCAFile = flagutil.NewArrayString("remoteWrite.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to the corresponding -remoteWrite.url. "+
"By default system CA is used")
tlsServerName = flagutil.NewArray("remoteWrite.tlsServerName", "Optional TLS server name to use for connections to the corresponding -remoteWrite.url. "+
tlsServerName = flagutil.NewArrayString("remoteWrite.tlsServerName", "Optional TLS server name to use for connections to the corresponding -remoteWrite.url. "+
"By default the server name from -remoteWrite.url is used")
headers = flagutil.NewArray("remoteWrite.headers", "Optional HTTP headers to send with each request to the corresponding -remoteWrite.url. "+
headers = flagutil.NewArrayString("remoteWrite.headers", "Optional HTTP headers to send with each request to the corresponding -remoteWrite.url. "+
"For example, -remoteWrite.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -remoteWrite.url. "+
"Multiple headers must be delimited by '^^': -remoteWrite.headers='header1:value1^^header2:value2'")
basicAuthUsername = flagutil.NewArray("remoteWrite.basicAuth.username", "Optional basic auth username to use for the corresponding -remoteWrite.url")
basicAuthPassword = flagutil.NewArray("remoteWrite.basicAuth.password", "Optional basic auth password to use for the corresponding -remoteWrite.url")
basicAuthPasswordFile = flagutil.NewArray("remoteWrite.basicAuth.passwordFile", "Optional path to basic auth password to use for the corresponding -remoteWrite.url. "+
basicAuthUsername = flagutil.NewArrayString("remoteWrite.basicAuth.username", "Optional basic auth username to use for the corresponding -remoteWrite.url")
basicAuthPassword = flagutil.NewArrayString("remoteWrite.basicAuth.password", "Optional basic auth password to use for the corresponding -remoteWrite.url")
basicAuthPasswordFile = flagutil.NewArrayString("remoteWrite.basicAuth.passwordFile", "Optional path to basic auth password to use for the corresponding -remoteWrite.url. "+
"The file is re-read every second")
bearerToken = flagutil.NewArray("remoteWrite.bearerToken", "Optional bearer auth token to use for the corresponding -remoteWrite.url")
bearerTokenFile = flagutil.NewArray("remoteWrite.bearerTokenFile", "Optional path to bearer token file to use for the corresponding -remoteWrite.url. "+
bearerToken = flagutil.NewArrayString("remoteWrite.bearerToken", "Optional bearer auth token to use for the corresponding -remoteWrite.url")
bearerTokenFile = flagutil.NewArrayString("remoteWrite.bearerTokenFile", "Optional path to bearer token file to use for the corresponding -remoteWrite.url. "+
"The token is re-read from the file every second")
oauth2ClientID = flagutil.NewArray("remoteWrite.oauth2.clientID", "Optional OAuth2 clientID to use for the corresponding -remoteWrite.url")
oauth2ClientSecret = flagutil.NewArray("remoteWrite.oauth2.clientSecret", "Optional OAuth2 clientSecret to use for the corresponding -remoteWrite.url")
oauth2ClientSecretFile = flagutil.NewArray("remoteWrite.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for the corresponding -remoteWrite.url")
oauth2TokenURL = flagutil.NewArray("remoteWrite.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for the corresponding -remoteWrite.url")
oauth2Scopes = flagutil.NewArray("remoteWrite.oauth2.scopes", "Optional OAuth2 scopes to use for the corresponding -remoteWrite.url. Scopes must be delimited by ';'")
oauth2ClientID = flagutil.NewArrayString("remoteWrite.oauth2.clientID", "Optional OAuth2 clientID to use for the corresponding -remoteWrite.url")
oauth2ClientSecret = flagutil.NewArrayString("remoteWrite.oauth2.clientSecret", "Optional OAuth2 clientSecret to use for the corresponding -remoteWrite.url")
oauth2ClientSecretFile = flagutil.NewArrayString("remoteWrite.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for the corresponding -remoteWrite.url")
oauth2TokenURL = flagutil.NewArrayString("remoteWrite.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for the corresponding -remoteWrite.url")
oauth2Scopes = flagutil.NewArrayString("remoteWrite.oauth2.scopes", "Optional OAuth2 scopes to use for the corresponding -remoteWrite.url. Scopes must be delimited by ';'")
awsUseSigv4 = flagutil.NewArrayBool("remoteWrite.aws.useSigv4", "Enables SigV4 request signing for the corresponding -remoteWrite.url. "+
"It is expected that other -remoteWrite.aws.* command-line flags are set if sigv4 request signing is enabled")
awsEC2Endpoint = flagutil.NewArray("remoteWrite.aws.ec2Endpoint", "Optional AWS EC2 API endpoint to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsSTSEndpoint = flagutil.NewArray("remoteWrite.aws.stsEndpoint", "Optional AWS STS API endpoint to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsRegion = flagutil.NewArray("remoteWrite.aws.region", "Optional AWS region to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsRoleARN = flagutil.NewArray("remoteWrite.aws.roleARN", "Optional AWS roleARN to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsAccessKey = flagutil.NewArray("remoteWrite.aws.accessKey", "Optional AWS AccessKey to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsService = flagutil.NewArray("remoteWrite.aws.service", "Optional AWS Service to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set. "+
awsEC2Endpoint = flagutil.NewArrayString("remoteWrite.aws.ec2Endpoint", "Optional AWS EC2 API endpoint to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsSTSEndpoint = flagutil.NewArrayString("remoteWrite.aws.stsEndpoint", "Optional AWS STS API endpoint to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsRegion = flagutil.NewArrayString("remoteWrite.aws.region", "Optional AWS region to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsRoleARN = flagutil.NewArrayString("remoteWrite.aws.roleARN", "Optional AWS roleARN to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsAccessKey = flagutil.NewArrayString("remoteWrite.aws.accessKey", "Optional AWS AccessKey to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsService = flagutil.NewArrayString("remoteWrite.aws.service", "Optional AWS Service to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set. "+
"Defaults to \"aps\"")
awsSecretKey = flagutil.NewArray("remoteWrite.aws.secretKey", "Optional AWS SecretKey to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsSecretKey = flagutil.NewArrayString("remoteWrite.aws.secretKey", "Optional AWS SecretKey to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
)
type client struct {

View File

@@ -0,0 +1,62 @@
package remotewrite
import (
"fmt"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/golang/snappy"
)
func TestPushWriteRequest(t *testing.T) {
for _, rowsCount := range []int{1, 10, 100, 1e3, 1e4} {
t.Run(fmt.Sprintf("%d", rowsCount), func(t *testing.T) {
testPushWriteRequest(t, rowsCount)
})
}
}
func testPushWriteRequest(t *testing.T, rowsCount int) {
wr := newTestWriteRequest(rowsCount, 10)
pushBlockLen := 0
pushBlock := func(block []byte) {
if pushBlockLen > 0 {
panic(fmt.Errorf("BUG: pushBlock called multiple times; pushBlockLen=%d at first call, len(block)=%d at second call", pushBlockLen, len(block)))
}
pushBlockLen = len(block)
}
pushWriteRequest(wr, pushBlock)
b := prompbmarshal.MarshalWriteRequest(nil, wr)
zb := snappy.Encode(nil, b)
maxPushBlockLen := len(zb)
minPushBlockLen := maxPushBlockLen / 2
if pushBlockLen < minPushBlockLen {
t.Fatalf("unexpected block len after pushWriteRequest; got %d bytes; must be at least %d bytes", pushBlockLen, minPushBlockLen)
}
if pushBlockLen > maxPushBlockLen {
t.Fatalf("unexpected block len after pushWriteRequest; got %d bytes; must be smaller or equal to %d bytes", pushBlockLen, maxPushBlockLen)
}
}
func newTestWriteRequest(seriesCount, labelsCount int) *prompbmarshal.WriteRequest {
var wr prompbmarshal.WriteRequest
for i := 0; i < seriesCount; i++ {
var labels []prompbmarshal.Label
for j := 0; j < labelsCount; j++ {
labels = append(labels, prompbmarshal.Label{
Name: fmt.Sprintf("label_%d_%d", i, j),
Value: fmt.Sprintf("value_%d_%d", i, j),
})
}
wr.Timeseries = append(wr.Timeseries, prompbmarshal.TimeSeries{
Labels: labels,
Samples: []prompbmarshal.Sample{
{
Value: float64(i),
Timestamp: 1000 * int64(i),
},
},
})
}
return &wr
}

View File

@@ -0,0 +1,36 @@
package remotewrite
import (
"fmt"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/golang/snappy"
"github.com/klauspost/compress/s2"
)
func BenchmarkCompressWriteRequestSnappy(b *testing.B) {
b.Run("snappy", func(b *testing.B) {
benchmarkCompressWriteRequest(b, snappy.Encode)
})
b.Run("s2", func(b *testing.B) {
benchmarkCompressWriteRequest(b, s2.EncodeSnappy)
})
}
func benchmarkCompressWriteRequest(b *testing.B, compressFunc func(dst, src []byte) []byte) {
for _, rowsCount := range []int{1, 10, 100, 1e3, 1e4} {
b.Run(fmt.Sprintf("rows_%d", rowsCount), func(b *testing.B) {
wr := newTestWriteRequest(rowsCount, 10)
data := prompbmarshal.MarshalWriteRequest(nil, wr)
b.ReportAllocs()
b.SetBytes(int64(rowsCount))
b.RunParallel(func(pb *testing.PB) {
var zb []byte
for pb.Next() {
zb = compressFunc(zb[:cap(zb)], data)
}
})
})
}
}

View File

@@ -13,18 +13,22 @@ import (
)
var (
unparsedLabelsGlobal = flagutil.NewArray("remoteWrite.label", "Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. "+
unparsedLabelsGlobal = flagutil.NewArrayString("remoteWrite.label", "Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. "+
"Pass multiple -remoteWrite.label flags in order to add multiple labels to metrics before sending them to remote storage")
relabelConfigPathGlobal = flag.String("remoteWrite.relabelConfig", "", "Optional path to file with relabel_config entries. "+
"The path can point either to local file or to http url. These entries are applied to all the metrics "+
"before sending them to -remoteWrite.url. See https://docs.victoriametrics.com/vmagent.html#relabeling for details")
relabelDebugGlobal = flag.Bool("remoteWrite.relabelDebug", false, "Whether to log metrics before and after relabeling with -remoteWrite.relabelConfig. "+
"If the -remoteWrite.relabelDebug is enabled, then the metrics aren't sent to remote storage. This is useful for debugging the relabeling configs")
relabelConfigPaths = flagutil.NewArray("remoteWrite.urlRelabelConfig", "Optional path to relabel config for the corresponding -remoteWrite.url. "+
relabelConfigPaths = flagutil.NewArrayString("remoteWrite.urlRelabelConfig", "Optional path to relabel config for the corresponding -remoteWrite.url. "+
"The path can point either to local file or to http url")
relabelDebug = flagutil.NewArrayBool("remoteWrite.urlRelabelDebug", "Whether to log metrics before and after relabeling with -remoteWrite.urlRelabelConfig. "+
"If the -remoteWrite.urlRelabelDebug is enabled, then the metrics aren't sent to the corresponding -remoteWrite.url. "+
"This is useful for debugging the relabeling configs")
usePromCompatibleNaming = flag.Bool("usePromCompatibleNaming", false, "Whether to replace characters unsupported by Prometheus with underscores "+
"in the ingested metric names and label names. For example, foo.bar{a.b='c'} is transformed into foo_bar{a_b='c'} during data ingestion if this flag is set. "+
"See https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels")
)
var labelsGlobal []prompbmarshal.Label
@@ -107,7 +111,20 @@ func (rctx *relabelCtx) applyRelabeling(tss []prompbmarshal.TimeSeries, extraLab
labels = append(labels, *extraLabel)
}
}
labels = pcs.Apply(labels, labelsLen, true)
if *usePromCompatibleNaming {
// Replace unsupported Prometheus chars in label names and metric names with underscores.
tmpLabels := labels[labelsLen:]
for j := range tmpLabels {
label := &tmpLabels[j]
if label.Name == "__name__" {
label.Value = promrelabel.SanitizeName(label.Value)
} else {
label.Name = promrelabel.SanitizeName(label.Name)
}
}
}
labels = pcs.Apply(labels, labelsLen)
labels = promrelabel.FinalizeLabels(labels[:labelsLen], labels[labelsLen:])
if len(labels) == labelsLen {
// Drop the current time series, since relabeling removed all the labels.
continue

View File

@@ -13,6 +13,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bloomfilter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
@@ -26,10 +27,10 @@ import (
)
var (
remoteWriteURLs = flagutil.NewArray("remoteWrite.url", "Remote storage URL to write data to. It must support Prometheus remote_write API. "+
remoteWriteURLs = flagutil.NewArrayString("remoteWrite.url", "Remote storage URL to write data to. It must support Prometheus remote_write API. "+
"It is recommended using VictoriaMetrics as remote storage. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
"Pass multiple -remoteWrite.url flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.multitenantURL")
remoteWriteMultitenantURLs = flagutil.NewArray("remoteWrite.multitenantURL", "Base path for multitenant remote storage URL to write data to. "+
remoteWriteMultitenantURLs = flagutil.NewArrayString("remoteWrite.multitenantURL", "Base path for multitenant remote storage URL to write data to. "+
"See https://docs.victoriametrics.com/vmagent.html#multitenancy for details. Example url: http://<vminsert>:8480 . "+
"Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.url")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored. "+
@@ -38,7 +39,7 @@ var (
"isn't enough for sending high volume of collected data to remote storage. Default value is 2 * numberOfAvailableCPUs")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
"It is hidden by default, since it can contain sensitive info such as auth key")
maxPendingBytesPerURL = flagutil.NewBytes("remoteWrite.maxDiskUsagePerURL", 0, "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+
maxPendingBytesPerURL = flagutil.NewArrayBytes("remoteWrite.maxDiskUsagePerURL", "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+
"for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. "+
"Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500MB. "+
"Disk usage is unlimited if the value is set to 0")
@@ -139,6 +140,8 @@ func Init() {
logger.Fatalf("cannot load relabel configs: %s", err)
}
allRelabelConfigs.Store(rcs)
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
if len(*remoteWriteURLs) > 0 {
rwctxsDefault = newRemoteWriteCtxs(nil, *remoteWriteURLs)
@@ -154,18 +157,31 @@ func Init() {
case <-stopCh:
return
}
configReloads.Inc()
logger.Infof("SIGHUP received; reloading relabel configs pointed by -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig")
rcs, err := loadRelabelConfigs()
if err != nil {
configReloadErrors.Inc()
configSuccess.Set(0)
logger.Errorf("cannot reload relabel configs; preserving the previous configs; error: %s", err)
continue
}
allRelabelConfigs.Store(rcs)
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
logger.Infof("Successfully reloaded relabel configs")
}
}()
}
var (
configReloads = metrics.NewCounter(`vmagent_relabel_config_reloads_total`)
configReloadErrors = metrics.NewCounter(`vmagent_relabel_config_reloads_errors_total`)
configSuccess = metrics.NewCounter(`vmagent_relabel_config_last_reload_successful`)
configTimestamp = metrics.NewCounter(`vmagent_relabel_config_last_reload_success_timestamp_seconds`)
)
func newRemoteWriteCtxs(at *auth.Token, urls []string) []*remoteWriteCtx {
if len(urls) == 0 {
logger.Panicf("BUG: urls must be non-empty")
@@ -436,7 +452,8 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
pqURL.Fragment = ""
h := xxhash.Sum64([]byte(pqURL.String()))
queuePath := fmt.Sprintf("%s/persistent-queue/%d_%016X", *tmpDataPath, argIdx+1, h)
fq := persistentqueue.MustOpenFastQueue(queuePath, sanitizedURL, maxInmemoryBlocks, maxPendingBytesPerURL.N)
maxPendingBytes := maxPendingBytesPerURL.GetOptionalArgOrDefault(argIdx, 0)
fq := persistentqueue.MustOpenFastQueue(queuePath, sanitizedURL, maxInmemoryBlocks, maxPendingBytes)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
return float64(fq.GetPendingBytes())
})

View File

@@ -120,7 +120,7 @@ name: <string>
# params:
# nocache: ["1"] # disable caching for vmselect
# denyPartialResponse: ["true"] # fail if one or more vmstorage nodes returned an error
# extra_label: ["env=dev"] # apply additional label filter "env=dev" for all requests
# extra_label: ["env=dev"] # apply additional label filter "env=dev" for all requests
# see more details at https://docs.victoriametrics.com#prometheus-querying-api-enhancements
params:
[ <string>: [<string>, ...]]
@@ -131,7 +131,7 @@ params:
# headers:
# - "CustomHeader: foo"
# - "CustomHeader2: bar"
# Headers set via this param have priority over headers set via `-datasource.headers` flag.
# Headers set via this param have priority over headers set via `-datasource.headers` flag.
headers:
[ <string>, ...]
@@ -184,6 +184,13 @@ expr: <string>
# as firing once they return.
[ for: <duration> | default = 0s ]
# Whether to print debug information into logs.
# Information includes alerts state changes and requests sent to the datasource.
# Please note, that if rule's query params contain sensitive
# information - it will be printed to logs.
# Is applicable to alerting rules only.
[ debug: <bool> | default = false ]
# Labels to add or overwrite for each alert.
labels:
[ <labelname>: <tmpl_string> ]
@@ -193,9 +200,9 @@ annotations:
[ <labelname>: <tmpl_string> ]
```
#### Templating
#### Templating
It is allowed to use [Go templating](https://golang.org/pkg/text/template/) in annotations to format data, iterate over
It is allowed to use [Go templating](https://golang.org/pkg/text/template/) in annotations to format data, iterate over
or execute expressions.
The following variables are available in templating:
@@ -206,18 +213,56 @@ The following variables are available in templating:
| $labels or .Labels | The list of labels of the current alert. Use as ".Labels.<label_name>". | {% raw %}Too high number of connections for {{ .Labels.instance }}{% endraw %} |
| $alertID or .AlertID | The current alert's ID generated by vmalert. | {% raw %}Link: vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}{% endraw %} |
| $groupID or .GroupID | The current alert's group ID generated by vmalert. | {% raw %}Link: vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}{% endraw %} |
| $expr or .Expr | Alert's expression. Can be used for generating links to Grafana or other systems. | {% raw %}/api/v1/query?query={{ $expr&#124;quotesEscape&#124;queryEscape }}{% endraw %} |
| $expr or .Expr | Alert's expression. Can be used for generating links to Grafana or other systems. | {% raw %}/api/v1/query?query={{ $expr&#124;queryEscape }}{% endraw %} |
| $externalLabels or .ExternalLabels | List of labels configured via `-external.label` command-line flag. | {% raw %}Issues with {{ $labels.instance }} (datacenter-{{ $externalLabels.dc }}){% endraw %} |
| $externalURL or .ExternalURL | URL configured via `-external.url` command-line flag. Used for cases when vmalert is hidden behind proxy. | {% raw %}Visit {{ $externalURL }} for more details{% endraw %} |
Additionally, `vmalert` provides some extra templating functions
listed [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/templates/template.go)
and [reusable templates](#reusable-templates).
Additionally, `vmalert` provides some extra templating functions listed [here](#template-functions) and [reusable templates](#reusable-templates).
#### Template functions
`vmalert` provides the following template functions, which can be used during [templating](#templating):
- `args arg0 ... argN` - converts the input args into a map with `arg0`, ..., `argN` keys.
- `externalURL` - returns the value of `-external.url` command-line flag.
- `first` - returns the first result from the input query results returned by `query` function.
- `htmlEscape` - escapes special chars in input string, so it can be safely embedded as a plaintext into HTML.
- `humanize` - converts the input number into human-readable format by adding [metric prefixes](https://en.wikipedia.org/wiki/Metric_prefix).
For example, `100000` is converted into `100K`.
- `humanize1024` - converts the input number into human-readable format with 1024 base.
For example, `1024` is converted into 1ki`.
- `humanizeDuration` - converts the input number in seconds into human-readable duration.
- `humanizePercentage` - converts the input number to percentage. For example, `0.123` is converted into `12.3%`.
- `humanizeTimestamp` - converts the input unix timestamp into human-readable time.
- `jsonEscape` - JSON-encodes the input string.
- `label name` - returns the value of the label with the given `name` from the input query result.
- `match regex` - matches the input string against the provided `regex`.
- `parseDuration` - parses the input string into duration in seconds. For example, `1h` is parsed into `3600`.
- `parseDurationTime` - parses the input string into [time.Duration](https://pkg.go.dev/time#Duration).
- `pathEscape` - escapes the input string, so it can be safely put inside path part of URL.
- `pathPrefix` - returns the path part of the `-external.url` command-line flag.
- `query` - executes the [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) query against `-datasource.url` and returns the query result.
For example, {% raw %}`{{ query "sort_desc(process_resident_memory_bytes)" | first | value }}`{% endraw %} executes the `sort_desc(process_resident_memory_bytes)`
query at `-datasource.url` and returns the first result.
- `queryEscape` - escapes the input string, so it can be safely put inside [query arg](https://en.wikipedia.org/wiki/Percent-encoding) part of URL.
- `quotesEscape` - escapes the input string, so it can be safely embedded into JSON string.
- `reReplaceAll regex repl` - replaces all the occurences of the `regex` in input string with the `repl`.
- `safeHtml` - marks the input string as safe to use in HTML context without the need to html-escape it.
- `sortByLabel name` - sorts the input query results by the label with the given `name`.
- `stripDomain` - leaves the first part of the domain. For example, `foo.bar.baz` is converted to `foo`.
The port part is left in the output string. E.g. `foo.bar:1234` is converted into `foo:1234`.
- `stripPort` - strips `port` part from `host:port` input string.
- `strvalue` - returns the metric name from the input query result.
- `title` - converts the first letters of every input word to uppercase.
- `toLower` - converts all the chars in the input string to lowercase.
- `toTime` - converts the input unix timestamp to [time.Time](https://pkg.go.dev/time#Time).
- `toUpper` - converts all the chars in the input string to uppercase.
- `value` - returns the numeric value from the input query result.
#### Reusable templates
Like in Alertmanager you can define [reusable templates](https://prometheus.io/docs/prometheus/latest/configuration/template_examples/#defining-reusable-templates)
to share same templates across annotations. Just define the templates in a file and
to share same templates across annotations. Just define the templates in a file and
set the path via `-rule.templates` flag.
For example, template `grafana.filter` can be defined as following:
@@ -308,7 +353,7 @@ There are the following approaches exist for alerting and recording rules across
rules to `AccountID=123`.
* To specify `tenant` parameter per each alerting and recording group if
[enterprise version of vmalert](https://victoriametrics.com/products/enterprise/) is used
[enterprise version of vmalert](https://docs.victoriametrics.com/enterprise.html) is used
with `-clusterMode` command-line flag. For example:
```yaml
@@ -324,6 +369,10 @@ groups:
# Rules for accountID=456, projectID=789
```
The results of alerting and recording rules contain `vm_account_id` and `vm_project_id` labels
if `-clusterMode` is enabled. These labels can be used during [templating](https://docs.victoriametrics.com/vmalert.html#templating),
and help to identify to which account or project the triggered alert or produced recording belongs.
If `-clusterMode` is enabled, then `-datasource.url`, `-remoteRead.url` and `-remoteWrite.url` must
contain only the hostname without tenant id. For example: `-datasource.url=http://vmselect:8481`.
`vmalert` automatically adds the specified tenant to urls per each recording rule in this case.
@@ -411,8 +460,8 @@ To avoid recording rules results and alerts state duplication in VictoriaMetrics
don't forget to configure [deduplication](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#deduplication).
The recommended value for `-dedup.minScrapeInterval` must be greater or equal to vmalert's `evaluation_interval`.
If you observe inconsistent or "jumping" values in series produced by vmalert, try disabling `-datasource.queryTimeAlignment`
command line flag. Because of alignment, two or more vmalert HA pairs will produce results with the same timestamps.
But due of backfilling (data delivered to the datasource with some delay) values of such results may differ,
command line flag. Because of alignment, two or more vmalert HA pairs will produce results with the same timestamps.
But due of backfilling (data delivered to the datasource with some delay) values of such results may differ,
which would affect deduplication logic and result into "jumping" datapoints.
Alertmanager will automatically deduplicate alerts with identical labels, so ensure that
@@ -432,8 +481,8 @@ This ability allows to aggregate data. For example, the following rule will calc
metric `http_requests` on the `5m` interval:
```yaml
- record: http_requests:avg5m
expr: avg_over_time(http_requests[5m])
- record: http_requests:avg5m
expr: avg_over_time(http_requests[5m])
```
Every time this rule will be evaluated, `vmalert` will backfill its results as a new time series `http_requests:avg5m`
@@ -447,9 +496,9 @@ This ability allows to downsample data. For example, the following config will e
groups:
- name: my_group
interval: 5m
rules:
rules:
- record: http_requests:avg5m
expr: avg_over_time(http_requests[5m])
expr: avg_over_time(http_requests[5m])
```
Ability of `vmalert` to be configured with different `datasource.url` and `remoteWrite.url` allows
@@ -467,7 +516,7 @@ or reducing resolution) and push results to "cold" cluster.
```
./bin/vmalert -rule=downsampling-rules.yml \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://raw-cluster-vmselect:8481/select/0/prometheus # vmselect addr for executing recordi ng rules expressions
-datasource.url=http://raw-cluster-vmselect:8481/select/0/prometheus # vmselect addr for executing recording rules expressions
-remoteWrite.url=http://aggregated-cluster-vminsert:8480/insert/0/prometheus # vminsert addr to persist recording rules results
```
@@ -489,7 +538,7 @@ we recommend using [vmagent](https://docs.victoriametrics.com/vmagent.html) as f
In this topology, `vmalert` is configured to persist rule results to `vmagent`. And `vmagent`
is configured to fan-out received data to two or more destinations.
Using `vmagent` as a proxy provides additional benefits such as
Using `vmagent` as a proxy provides additional benefits such as
[data persisting when storage is unreachable](https://docs.victoriametrics.com/vmagent.html#replication-and-high-availability),
or time series modification via [relabeling](https://docs.victoriametrics.com/vmagent.html#relabeling).
@@ -504,6 +553,7 @@ or time series modification via [relabeling](https://docs.victoriametrics.com/vm
* `http://<vmalert-addr>/vmalert/api/v1/alert?group_id=<group_id>&alert_id=<alert_id>` - get alert status in JSON format.
Used as alert source in AlertManager.
* `http://<vmalert-addr>/vmalert/alert?group_id=<group_id>&alert_id=<alert_id>` - get alert status in web UI.
* `http://<vmalert-addr>/vmalert/rule?group_id=<group_id>&rule_id=<rule_id>` - get rule status in web UI.
* `http://<vmalert-addr>/metrics` - application metrics.
* `http://<vmalert-addr>/-/reload` - hot configuration reload.
@@ -612,7 +662,7 @@ There are following non-required `replay` flags:
Progress bar may generate a lot of log records, which is not formatted as standard VictoriaMetrics logger.
It could break logs parsing by external system and generate additional load on it.
See full description for these flags in `./vmalert --help`.
See full description for these flags in `./vmalert -help`.
### Limitations
@@ -623,13 +673,76 @@ See full description for these flags in `./vmalert --help`.
## Monitoring
`vmalert` exports various metrics in Prometheus exposition format at `http://vmalert-host:8880/metrics` page.
The default list of alerting rules for these metric can be found [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker).
We recommend setting up regular scraping of this page either through `vmagent` or by Prometheus so that the exported
metrics may be analyzed later.
Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/14950) for `vmalert` overview. Graphs on this dashboard contain useful hints - hover the `i` icon at the top left corner of each graph in order to read it.
Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/14950) for `vmalert` overview.
Graphs on this dashboard contain useful hints - hover the `i` icon in the top left corner of each graph in order to read it.
If you have suggestions for improvements or have found a bug - please open an issue on github or add
a review to the dashboard.
## Troubleshooting
vmalert executes configured rules within certain intervals. It is expected that at the moment when rule is executed,
the data is already present in configured `-datasource.url`:
<img alt="vmalert expected evaluation" src="vmalert_ts_normal.gif">
Usually, troubles start to appear when data in `-datasource.url` is delayed or absent. In such cases, evaluations
may get empty response from datasource and produce empty recording rules or reset alerts state:
<img alt="vmalert evaluation when data is delayed" src="vmalert_ts_data_delay.gif">
_By default recently written samples to VictoriaMetrics aren't visible for queries for up to 30s.
This behavior is controlled by `-search.latencyOffset` command-line flag on vmselect. Usually, this results into
a 30s shift for recording rules results.
Note that too small value passed to `-search.latencyOffset` may lead to incomplete query results._
Try the following recommendations in such cases:
* Always configure group's `evaluationInterval` to be bigger or equal to `scrape_interval` at which metrics
are delivered to the datasource;
* If you know in advance, that data in datasource is delayed - try changing vmalert's `-datasource.lookback`
command-line flag to add a time shift for evaluations;
* If time intervals between datapoints in datasource are irregular or `>=5min` - try changing vmalert's
`-datasource.queryStep` command-line flag to specify how far search query can lookback for the recent datapoint.
The recommendation is to have the step at least two times bigger than `scrape_interval`, since
there are no guarantees that scrape will not fail.
Sometimes, it is not clear why some specific alert fired or didn't fire. It is very important to remember, that
alerts with `for: 0` fire immediately when their expression becomes true. And alerts with `for > 0` will fire only
after multiple consecutive evaluations, and at each evaluation their expression must be true. If at least one evaluation
becomes false, then alert's state resets to the initial state.
If `-remoteWrite.url` command-line flag is configured, vmalert will persist alert's state in form of time series
`ALERTS` and `ALERTS_FOR_STATE` to the specified destination. Such time series can be then queried via
[vmui](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmui) or Grafana to track how alerts state
changed in time.
vmalert also stores last N state updates for each rule. To check updates, click on `Details` link next to rule's name
on `/vmalert/groups` page and check the `Last updates` section:
<img alt="vmalert state" src="vmalert_state.png">
Rows in the section represent ordered rule evaluations and their results. The column `curl` contains an example of
HTTP request sent by vmalert to the `-datasource.url` during evaluation. If specific state shows that there were
no samples returned and curl command returns data - then it is very likely there was no data in datasource on the
moment when rule was evaluated.
vmalert also alows configuring more detailed logging for specific rule. Just set `debug: true` in rule's configuration
and vmalert will start printing additional log messages:
```terminal
2022-09-15T13:35:41.155Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:41+02:00: query returned 0 samples (elapsed: 5.896041ms)
2022-09-15T13:35:56.149Z DEBUG datasource request: executing POST request with params "denyPartialResponse=true&query=sum%28vm_tcplistener_conns%7Binstance%3D%22localhost%3A8429%22%7D%29+by%28instance%29+%3E+0&step=15s&time=1663248945"
2022-09-15T13:35:56.178Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:56+02:00: query returned 1 samples (elapsed: 28.368208ms)
2022-09-15T13:35:56.178Z DEBUG datasource request: executing POST request with params "denyPartialResponse=true&query=sum%28vm_tcplistener_conns%7Binstance%3D%22localhost%3A8429%22%7D%29&step=15s&time=1663248945"
2022-09-15T13:35:56.179Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:56+02:00: alert 10705778000901301787 {alertgroup="TestGroup",alertname="Conns",cluster="east-1",instance="localhost:8429",replica="a"} created in state PENDING
...
2022-09-15T13:36:56.153Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:36:56+02:00: alert 10705778000901301787 {alertgroup="TestGroup",alertname="Conns",cluster="east-1",instance="localhost:8429",replica="a"} PENDING => FIRING: 1m0s since becoming active at 2022-09-15 15:35:56.126006 +0200 CEST m=+39.384575417
```
## Profiling
`vmalert` provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs):
@@ -670,7 +783,7 @@ The shortlist of configuration flags is the following:
{% raw %}
```
-clusterMode
If clusterMode is enabled, then vmalert automatically adds the tenant specified in config groups to -datasource.url, -remoteWrite.url and -remoteRead.url. See https://docs.victoriametrics.com/vmalert.html#multitenancy
If clusterMode is enabled, then vmalert automatically adds the tenant specified in config groups to -datasource.url, -remoteWrite.url and -remoteRead.url. See https://docs.victoriametrics.com/vmalert.html#multitenancy . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-configCheckInterval duration
Interval for checking for changes in '-rule' or '-notifier.config' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes.
-datasource.appendTypePrefix
@@ -704,7 +817,7 @@ The shortlist of configuration flags is the following:
-datasource.oauth2.tokenUrl string
Optional OAuth2 tokenURL to use for -datasource.url.
-datasource.queryStep duration
queryStep defines how far a value can fallback to when evaluating queries. For example, if datasource.queryStep=15s then param "step" with value "15s" will be added to every query.If queryStep isn't specified, rule's evaluationInterval will be used instead.
How far a value can fallback to when evaluating queries. For example, if -datasource.queryStep=15s then param "step" with value "15s" will be added to every query. If set to 0, rule's evaluation interval will be used instead. (default 5m0s)
-datasource.queryTimeAlignment
Whether to align "time" parameter with evaluation interval.Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257 (default true)
-datasource.roundDigits int
@@ -724,12 +837,12 @@ The shortlist of configuration flags is the following:
-datasource.url string
Datasource compatible with Prometheus HTTP API. It can be single node VictoriaMetrics or vmselect URL. Required parameter. E.g. http://127.0.0.1:8428 . See also '-datasource.disablePathAppend', '-datasource.showURL'.
-defaultTenant.graphite string
Default tenant for Graphite alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy
Default tenant for Graphite alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy .This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-defaultTenant.prometheus string
Default tenant for Prometheus alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy
Default tenant for Prometheus alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-disableAlertgroupLabel
Whether to disable adding group's Name as label to generated alerts and time series.
-dryRun -rule
-dryRun
Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
@@ -738,12 +851,11 @@ The shortlist of configuration flags is the following:
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-evaluationInterval duration
How often to evaluate the rules (default 1m0s)
-external.alert.source string
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
Supports templating. For example, link to Grafana: 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|queryEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'. (default "{{.ExternalURL}}/vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}")
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service. Supports templating - see https://docs.victoriametrics.com/vmalert.html#templating . For example, link to Grafana: -external.alert.source='explore?orgId=1&left=["now-1h","now","VictoriaMetrics",{"expr":{{$expr|jsonEscape|queryEscape}} },{"mode":"Metrics"},{"ui":[true,true,true,"none"]}]' . If empty 'vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}' is used.
-external.label array
Optional label in the form 'Name=value' to add to all generated recording rules and alerts. Pass multiple -label flags in order to add multiple label sets.
Supports an array of values separated by comma or specified via multiple flags.
@@ -849,13 +961,13 @@ The shortlist of configuration flags is the following:
-promscrape.consul.waitTime duration
Wait time used by Consul service discovery. Default value is used if not set
-promscrape.consulSDCheckInterval duration
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#consul_sd_configs for details (default 30s)
-promscrape.discovery.concurrency int
The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100)
-promscrape.discovery.concurrentWaitTime duration
The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s)
-promscrape.dnsSDCheckInterval duration
Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s)
Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#dns_sd_configs for details (default 30s)
-pushmetrics.extraLabel array
Optional labels to add to metrics pushed to -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
@@ -1000,6 +1112,8 @@ The shortlist of configuration flags is the following:
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
```
@@ -1050,7 +1164,7 @@ and [DNS](https://prometheus.io/docs/prometheus/latest/configuration/configurati
For example:
```
static_configs:
static_configs:
- targets:
- localhost:9093
- localhost:9095
@@ -1059,7 +1173,7 @@ consul_sd_configs:
- server: localhost:8500
services:
- alertmanager
dns_sd_configs:
- names:
- my.domain.com
@@ -1087,7 +1201,7 @@ is the following:
# password and password_file are mutually exclusive.
basic_auth:
[ username: <string> ]
[ password: <secret> ]
[ password: <string> ]
[ password_file: <string> ]
# Optional `Authorization` header configuration.
@@ -1106,10 +1220,41 @@ authorization:
tls_config:
[ <tls_config> ]
# Configures Bearer authentication token via string
bearer_token: <string>
# or by passing path to the file with token.
bearer_token_file: <string>
# Configures OAuth 2.0 authentication
# see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#oauth2
oauth2:
[ <oauth2_config> ]
# Optional list of HTTP headers in form `header-name: value`
# applied for all requests to notifiers
# For example:
# headers:
# - "CustomHeader: foo"
# - "CustomHeader2: bar"
headers:
[ <string>, ...]
# List of labeled statically configured Notifiers.
#
# Each list of targets may be additionally instructed with
# authorization params. Target's authorization params will
# inherit params from global authorization params if there
# are no conflicts.
static_configs:
targets:
[ - '<host>' ]
[ - targets: ]
[ - '<host>' ]
[ oauth2 ]
[ basic_auth ]
[ authorization ]
[ tls_config ]
[ bearer_token ]
[ bearer_token_file ]
[ headers ]
# List of Consul service discovery configurations.
# See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
@@ -1170,7 +1315,7 @@ spec:
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make vmalert` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert` binary and puts it into the `bin` folder.
@@ -1186,7 +1331,7 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make vmalert-linux-arm` or `make vmalert-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-linux-arm` or `vmalert-linux-arm64` binary respectively and puts it into the `bin` folder.

View File

@@ -6,6 +6,7 @@ import (
"hash/fnv"
"sort"
"strconv"
"strings"
"sync"
"time"
@@ -30,24 +31,17 @@ type AlertingRule struct {
GroupID uint64
GroupName string
EvalInterval time.Duration
Debug bool
q datasource.Querier
// guard status fields
mu sync.RWMutex
alertsMu sync.RWMutex
// stores list of active alerts
alerts map[uint64]*notifier.Alert
// stores last moment of time Exec was called
lastExecTime time.Time
// stores the duration of the last Exec call
lastExecDuration time.Duration
// stores last error that happened in Exec func
// resets on every successful Exec
// may be used as Health state
lastExecError error
// stores the number of samples returned during
// the last evaluation
lastExecSamples int
// state stores recent state changes
// during evaluations
state *ruleState
metrics *alertingRuleMetrics
}
@@ -71,21 +65,24 @@ func newAlertingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule
GroupID: group.ID(),
GroupName: group.Name,
EvalInterval: group.Interval,
Debug: cfg.Debug,
q: qb.BuildWithParams(datasource.QuerierParams{
DataSourceType: group.Type.String(),
EvaluationInterval: group.Interval,
QueryParams: group.Params,
Headers: group.Headers,
Debug: cfg.Debug,
}),
alerts: make(map[uint64]*notifier.Alert),
state: newRuleState(),
metrics: &alertingRuleMetrics{},
}
labels := fmt.Sprintf(`alertname=%q, group=%q, id="%d"`, ar.Name, group.Name, ar.ID())
ar.metrics.pending = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_alerts_pending{%s}`, labels),
func() float64 {
ar.mu.RLock()
defer ar.mu.RUnlock()
ar.alertsMu.RLock()
defer ar.alertsMu.RUnlock()
var num int
for _, a := range ar.alerts {
if a.State == notifier.StatePending {
@@ -96,8 +93,8 @@ func newAlertingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule
})
ar.metrics.active = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_alerts_firing{%s}`, labels),
func() float64 {
ar.mu.RLock()
defer ar.mu.RUnlock()
ar.alertsMu.RLock()
defer ar.alertsMu.RUnlock()
var num int
for _, a := range ar.alerts {
if a.State == notifier.StateFiring {
@@ -108,18 +105,16 @@ func newAlertingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule
})
ar.metrics.errors = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_alerting_rules_error{%s}`, labels),
func() float64 {
ar.mu.RLock()
defer ar.mu.RUnlock()
if ar.lastExecError == nil {
e := ar.state.getLast()
if e.err == nil {
return 0
}
return 1
})
ar.metrics.samples = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_alerting_rules_last_evaluation_samples{%s}`, labels),
func() float64 {
ar.mu.RLock()
defer ar.mu.RUnlock()
return float64(ar.lastExecSamples)
e := ar.state.getLast()
return float64(e.samples)
})
return ar
}
@@ -143,12 +138,42 @@ func (ar *AlertingRule) ID() uint64 {
return ar.RuleID
}
func (ar *AlertingRule) logDebugf(at time.Time, a *notifier.Alert, format string, args ...interface{}) {
if !ar.Debug {
return
}
prefix := fmt.Sprintf("DEBUG rule %q:%q (%d) at %v: ",
ar.GroupName, ar.Name, ar.RuleID, at.Format(time.RFC3339))
if a != nil {
labelKeys := make([]string, len(a.Labels))
var i int
for k := range a.Labels {
labelKeys[i] = k
i++
}
sort.Strings(labelKeys)
labels := make([]string, len(labelKeys))
for i, l := range labelKeys {
labels[i] = fmt.Sprintf("%s=%q", l, a.Labels[l])
}
labelsStr := strings.Join(labels, ",")
prefix += fmt.Sprintf("alert %d {%s} ", a.ID, labelsStr)
}
msg := fmt.Sprintf(format, args...)
logger.Infof("%s", prefix+msg)
}
type labelSet struct {
// origin labels from series
// used for templating
// origin labels extracted from received time series
// plus extra labels (group labels, service labels like alertNameLabel).
// in case of conflicts, origin labels from time series preferred.
// used for templating annotations
origin map[string]string
// processed labels with additional data
// used as Alert labels
// processed labels includes origin labels
// plus extra labels (group labels, service labels like alertNameLabel).
// in case of conflicts, extra labels are preferred.
// used as labels attached to notifier.Alert and ALERTS series written to remote storage.
processed map[string]string
}
@@ -156,7 +181,7 @@ type labelSet struct {
// to labelSet which contains original and processed labels.
func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*labelSet, error) {
ls := &labelSet{
origin: make(map[string]string, len(m.Labels)),
origin: make(map[string]string),
processed: make(map[string]string),
}
for _, l := range m.Labels {
@@ -178,14 +203,23 @@ func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*l
}
for k, v := range extraLabels {
ls.processed[k] = v
if _, ok := ls.origin[k]; !ok {
ls.origin[k] = v
}
}
// set additional labels to identify group and rule name
if ar.Name != "" {
ls.processed[alertNameLabel] = ar.Name
if _, ok := ls.origin[alertNameLabel]; !ok {
ls.origin[alertNameLabel] = ar.Name
}
}
if !*disableAlertGroupLabel && ar.GroupName != "" {
ls.processed[alertGroupNameLabel] = ar.GroupName
if _, ok := ls.origin[alertGroupNameLabel]; !ok {
ls.origin[alertGroupNameLabel] = ar.GroupName
}
}
return ls, nil
}
@@ -243,39 +277,55 @@ const resolvedRetention = 15 * time.Minute
// Based on the Querier results AlertingRule maintains notifier.Alerts
func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]prompbmarshal.TimeSeries, error) {
start := time.Now()
qMetrics, err := ar.q.Query(ctx, ar.Expr, ts)
ar.mu.Lock()
defer ar.mu.Unlock()
qMetrics, req, err := ar.q.Query(ctx, ar.Expr, ts)
curState := ruleStateEntry{
time: start,
at: ts,
duration: time.Since(start),
samples: len(qMetrics),
err: err,
req: req,
}
defer func() {
ar.state.add(curState)
}()
ar.alertsMu.Lock()
defer ar.alertsMu.Unlock()
ar.lastExecTime = start
ar.lastExecDuration = time.Since(start)
ar.lastExecError = err
ar.lastExecSamples = len(qMetrics)
if err != nil {
return nil, fmt.Errorf("failed to execute query %q: %w", ar.Expr, err)
}
ar.logDebugf(ts, nil, "query returned %d samples (elapsed: %s)", curState.samples, curState.duration)
for h, a := range ar.alerts {
// cleanup inactive alerts from previous Exec
if a.State == notifier.StateInactive && ts.Sub(a.ResolvedAt) > resolvedRetention {
ar.logDebugf(ts, a, "deleted as inactive")
delete(ar.alerts, h)
}
}
qFn := func(query string) ([]datasource.Metric, error) { return ar.q.Query(ctx, query, ts) }
qFn := func(query string) ([]datasource.Metric, error) {
res, _, err := ar.q.Query(ctx, query, ts)
return res, err
}
updated := make(map[uint64]struct{})
// update list of active alerts
for _, m := range qMetrics {
ls, err := ar.toLabels(m, qFn)
if err != nil {
return nil, fmt.Errorf("failed to expand labels: %s", err)
curState.err = fmt.Errorf("failed to expand labels: %s", err)
return nil, curState.err
}
h := hash(ls.processed)
if _, ok := updated[h]; ok {
// duplicate may be caused by extra labels
// conflicting with the metric labels
ar.lastExecError = fmt.Errorf("labels %v: %w", ls.processed, errDuplicate)
return nil, ar.lastExecError
curState.err = fmt.Errorf("labels %v: %w", ls.processed, errDuplicate)
return nil, curState.err
}
updated[h] = struct{}{}
if a, ok := ar.alerts[h]; ok {
@@ -285,28 +335,26 @@ func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]pr
// back to notifier.StatePending
a.State = notifier.StatePending
a.ActiveAt = ts
ar.logDebugf(ts, a, "INACTIVE => PENDING")
}
if a.Value != m.Values[0] {
// update Value field with latest value
a.Value = m.Values[0]
// and re-exec template since Value can be used
// in annotations
a.Annotations, err = a.ExecTemplate(qFn, ls.origin, ar.Annotations)
if err != nil {
return nil, err
}
a.Value = m.Values[0]
// re-exec template since Value or query can be used in annotations
a.Annotations, err = a.ExecTemplate(qFn, ls.origin, ar.Annotations)
if err != nil {
return nil, err
}
continue
}
a, err := ar.newAlert(m, ls, ar.lastExecTime, qFn)
a, err := ar.newAlert(m, ls, start, qFn)
if err != nil {
ar.lastExecError = err
return nil, fmt.Errorf("failed to create alert: %w", err)
curState.err = fmt.Errorf("failed to create alert: %w", err)
return nil, curState.err
}
a.ID = h
a.State = notifier.StatePending
a.ActiveAt = ts
ar.alerts[h] = a
ar.logDebugf(ts, a, "created in state PENDING")
}
var numActivePending int
for h, a := range ar.alerts {
@@ -317,11 +365,13 @@ func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]pr
// alert was in Pending state - it is not
// active anymore
delete(ar.alerts, h)
ar.logDebugf(ts, a, "PENDING => DELETED: is absent in current evaluation round")
continue
}
if a.State == notifier.StateFiring {
a.State = notifier.StateInactive
a.ResolvedAt = ts
ar.logDebugf(ts, a, "FIRING => INACTIVE: is absent in current evaluation round")
}
continue
}
@@ -330,11 +380,13 @@ func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]pr
a.State = notifier.StateFiring
a.Start = ts
alertsFired.Inc()
ar.logDebugf(ts, a, "PENDING => FIRING: %s since becoming active at %v", ts.Sub(a.ActiveAt), a.ActiveAt)
}
}
if limit > 0 && numActivePending > limit {
ar.alerts = map[uint64]*notifier.Alert{}
return nil, fmt.Errorf("exec exceeded limit of %d with %d alerts", limit, numActivePending)
curState.err = fmt.Errorf("exec exceeded limit of %d with %d alerts", limit, numActivePending)
return nil, curState.err
}
return ar.toTimeSeries(ts.Unix()), nil
}
@@ -411,8 +463,8 @@ func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.T
// AlertAPI generates APIAlert object from alert by its id(hash)
func (ar *AlertingRule) AlertAPI(id uint64) *APIAlert {
ar.mu.RLock()
defer ar.mu.RUnlock()
ar.alertsMu.RLock()
defer ar.alertsMu.RUnlock()
a, ok := ar.alerts[id]
if !ok {
return nil
@@ -420,9 +472,10 @@ func (ar *AlertingRule) AlertAPI(id uint64) *APIAlert {
return ar.newAlertAPI(*a)
}
// ToAPI returns Rule representation in form
// of APIRule
// ToAPI returns Rule representation in form of APIRule
// Isn't thread-safe. Call must be protected by AlertingRule mutex.
func (ar *AlertingRule) ToAPI() APIRule {
lastState := ar.state.getLast()
r := APIRule{
Type: "alerting",
DatasourceType: ar.Type.String(),
@@ -431,19 +484,20 @@ func (ar *AlertingRule) ToAPI() APIRule {
Duration: ar.For.Seconds(),
Labels: ar.Labels,
Annotations: ar.Annotations,
LastEvaluation: ar.lastExecTime,
EvaluationTime: ar.lastExecDuration.Seconds(),
LastEvaluation: lastState.time,
EvaluationTime: lastState.duration.Seconds(),
Health: "ok",
State: "inactive",
Alerts: ar.AlertsToAPI(),
LastSamples: ar.lastExecSamples,
LastSamples: lastState.samples,
Updates: ar.state.getAll(),
// encode as strings to avoid rounding in JSON
ID: fmt.Sprintf("%d", ar.ID()),
GroupID: fmt.Sprintf("%d", ar.GroupID),
}
if ar.lastExecError != nil {
r.LastError = ar.lastExecError.Error()
if lastState.err != nil {
r.LastError = lastState.err.Error()
r.Health = "err"
}
// satisfy APIRule.State logic
@@ -463,14 +517,14 @@ func (ar *AlertingRule) ToAPI() APIRule {
// AlertsToAPI generates list of APIAlert objects from existing alerts
func (ar *AlertingRule) AlertsToAPI() []*APIAlert {
var alerts []*APIAlert
ar.mu.RLock()
ar.alertsMu.RLock()
for _, a := range ar.alerts {
if a.State == notifier.StateInactive {
continue
}
alerts = append(alerts, ar.newAlertAPI(*a))
}
ar.mu.RUnlock()
ar.alertsMu.RUnlock()
return alerts
}
@@ -553,7 +607,10 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
}
ts := time.Now()
qFn := func(query string) ([]datasource.Metric, error) { return ar.q.Query(ctx, query, ts) }
qFn := func(query string) ([]datasource.Metric, error) {
res, _, err := ar.q.Query(ctx, query, ts)
return res, err
}
// account for external labels in filter
var labelsFilter string
@@ -563,7 +620,7 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
expr := fmt.Sprintf("last_over_time(%s{alertname=%q%s}[%ds])",
alertForStateMetricName, ar.Name, labelsFilter, int(lookback.Seconds()))
qMetrics, err := q.Query(ctx, expr, ts)
qMetrics, _, err := q.Query(ctx, expr, ts)
if err != nil {
return err
}

View File

@@ -700,14 +700,26 @@ func TestAlertingRule_Template(t *testing.T) {
expAlerts map[uint64]*notifier.Alert
}{
{
newTestRuleWithLabels("common", "region", "east"),
&AlertingRule{
Name: "common",
Labels: map[string]string{
"region": "east",
},
Annotations: map[string]string{
"summary": `{{ $labels.alertname }}: Too high connection number for "{{ $labels.instance }}"`,
},
alerts: make(map[uint64]*notifier.Alert),
state: newRuleState(),
},
[]datasource.Metric{
metricWithValueAndLabels(t, 1, "instance", "foo"),
metricWithValueAndLabels(t, 1, "instance", "bar"),
},
map[uint64]*notifier.Alert{
hash(map[string]string{alertNameLabel: "common", "region": "east", "instance": "foo"}): {
Annotations: map[string]string{},
Annotations: map[string]string{
"summary": `common: Too high connection number for "foo"`,
},
Labels: map[string]string{
alertNameLabel: "common",
"region": "east",
@@ -715,7 +727,9 @@ func TestAlertingRule_Template(t *testing.T) {
},
},
hash(map[string]string{alertNameLabel: "common", "region": "east", "instance": "bar"}): {
Annotations: map[string]string{},
Annotations: map[string]string{
"summary": `common: Too high connection number for "bar"`,
},
Labels: map[string]string{
alertNameLabel: "common",
"region": "east",
@@ -735,6 +749,7 @@ func TestAlertingRule_Template(t *testing.T) {
"description": `{{ $labels.alertname}}: It is {{ $value }} connections for "{{ $labels.instance }}"`,
},
alerts: make(map[uint64]*notifier.Alert),
state: newRuleState(),
},
[]datasource.Metric{
metricWithValueAndLabels(t, 2, "__name__", "first", "instance", "foo", alertNameLabel, "override"),
@@ -774,6 +789,7 @@ func TestAlertingRule_Template(t *testing.T) {
"summary": `Alert "{{ $labels.alertname }}({{ $labels.alertgroup }})" for instance {{ $labels.instance }}`,
},
alerts: make(map[uint64]*notifier.Alert),
state: newRuleState(),
},
[]datasource.Metric{
metricWithValueAndLabels(t, 1,
@@ -915,5 +931,11 @@ func newTestRuleWithLabels(name string, labels ...string) *AlertingRule {
}
func newTestAlertingRule(name string, waitFor time.Duration) *AlertingRule {
return &AlertingRule{Name: name, alerts: make(map[uint64]*notifier.Alert), For: waitFor, EvalInterval: waitFor}
return &AlertingRule{
Name: name,
For: waitFor,
EvalInterval: waitFor,
alerts: make(map[uint64]*notifier.Alert),
state: newRuleState(),
}
}

View File

@@ -77,7 +77,7 @@ func (g *Group) Validate(validateTplFn ValidateTplFn, validateExpressions bool)
ruleName = r.Alert
}
if _, ok := uniqueRules[r.ID]; ok {
return fmt.Errorf("rule %q duplicate", ruleName)
return fmt.Errorf("%q is a duplicate within the group %q", r.String(), g.Name)
}
uniqueRules[r.ID] = struct{}{}
if err := r.Validate(); err != nil {
@@ -113,6 +113,7 @@ type Rule struct {
For *promutils.Duration `yaml:"for,omitempty"`
Labels map[string]string `yaml:"labels,omitempty"`
Annotations map[string]string `yaml:"annotations,omitempty"`
Debug bool `yaml:"debug,omitempty"`
// Catches all undefined fields and must be empty after parsing.
XXX map[string]interface{} `yaml:",inline"`
@@ -136,6 +137,32 @@ func (r *Rule) Name() string {
return r.Alert
}
// String implements Stringer interface
func (r *Rule) String() string {
ruleType := "recording"
if r.Alert != "" {
ruleType = "alerting"
}
b := strings.Builder{}
b.WriteString(fmt.Sprintf("%s rule %q", ruleType, r.Name()))
b.WriteString(fmt.Sprintf("; expr: %q", r.Expr))
kv := sortMap(r.Labels)
for i := range kv {
if i == 0 {
b.WriteString("; labels:")
}
b.WriteString(" ")
b.WriteString(kv[i].key)
b.WriteString("=")
b.WriteString(kv[i].value)
if i < len(kv)-1 {
b.WriteString(",")
}
}
return b.String()
}
// HashRule hashes significant Rule fields into
// unique hash that supposed to define Rule uniqueness
func HashRule(r Rule) uint64 {
@@ -216,9 +243,12 @@ func Parse(pathPatterns []string, validateTplFn ValidateTplFn, validateExpressio
func parseFile(path string) ([]Group, error) {
data, err := os.ReadFile(path)
if err != nil {
return nil, fmt.Errorf("error reading alert rule file: %w", err)
return nil, fmt.Errorf("error reading alert rule file %q: %w", path, err)
}
data, err = envtemplate.ReplaceBytes(data)
if err != nil {
return nil, fmt.Errorf("cannot expand environment vars in %q: %w", path, err)
}
data = envtemplate.Replace(data)
g := struct {
Groups []Group `yaml:"groups"`
// Catches all undefined fields and must be empty after parsing.

View File

@@ -535,6 +535,21 @@ headers:
rules:
- alert: foo
expr: sum by(job) (up == 1)
`)
})
t.Run("`debug` change", func(t *testing.T) {
f(t, `
name: TestGroup
rules:
- alert: foo
expr: sum by(job) (up == 1)
`, `
name: TestGroup
rules:
- alert: foo
expr: sum by(job) (up == 1)
debug: true
`)
})
}

View File

@@ -1,6 +1,6 @@
groups:
- name: TestGroup
interval: 2s
interval: 5s
concurrency: 2
limit: 1000
headers:
@@ -9,10 +9,12 @@ groups:
denyPartialResponse: ["true"]
rules:
- alert: Conns
expr: sum(vm_tcplistener_conns) by(instance) > 1
expr: vm_tcplistener_conns > 0
for: 3m
debug: true
annotations:
summary: Too high connection number for {{$labels.instance}}
labels: "Available labels: {{ $labels }}"
summary: Too high connection number for {{ $labels.instance }}
{{ with printf "sum(vm_tcplistener_conns{instance=%q})" .Labels.instance | query }}
{{ . | first | value }}
{{ end }}

View File

@@ -2,18 +2,27 @@ package datasource
import (
"context"
"net/http"
"net/url"
"time"
)
// Querier interface wraps Query and QueryRange methods
type Querier interface {
Query(ctx context.Context, query string, ts time.Time) ([]Metric, error)
// Query executes instant request with the given query at the given ts.
// It returns list of Metric in response, the http.Request used for sending query
// and error if any. Returned http.Request can't be reused and its body is already read.
// Query should stop once ctx is cancelled.
Query(ctx context.Context, query string, ts time.Time) ([]Metric, *http.Request, error)
// QueryRange executes range request with the given query on the given time range.
// It returns list of Metric in response and error if any.
// QueryRange should stop once ctx is cancelled.
QueryRange(ctx context.Context, query string, from, to time.Time) ([]Metric, error)
}
// QuerierBuilder builds Querier with given params.
type QuerierBuilder interface {
// BuildWithParams creates a new Querier object with the given params
BuildWithParams(params QuerierParams) Querier
}
@@ -23,6 +32,7 @@ type QuerierParams struct {
EvaluationInterval time.Duration
QueryParams url.Values
Headers map[string]string
Debug bool
}
// Metric is the basic entity which should be return by datasource

View File

@@ -6,6 +6,7 @@ import (
"net/http"
"net/url"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
@@ -42,9 +43,9 @@ var (
oauth2Scopes = flag.String("datasource.oauth2.scopes", "", "Optional OAuth2 scopes to use for -datasource.url. Scopes must be delimited by ';'")
lookBack = flag.Duration("datasource.lookback", 0, `Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query.`)
queryStep = flag.Duration("datasource.queryStep", 0, "queryStep defines how far a value can fallback to when evaluating queries. "+
"For example, if datasource.queryStep=15s then param \"step\" with value \"15s\" will be added to every query."+
"If queryStep isn't specified, rule's evaluationInterval will be used instead.")
queryStep = flag.Duration("datasource.queryStep", 5*time.Minute, "How far a value can fallback to when evaluating queries. "+
"For example, if -datasource.queryStep=15s then param \"step\" with value \"15s\" will be added to every query. "+
"If set to 0, rule's evaluation interval will be used instead.")
queryTimeAlignment = flag.Bool("datasource.queryTimeAlignment", true, `Whether to align "time" parameter with evaluation interval.`+
"Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257")
maxIdleConnections = flag.Int("datasource.maxIdleConnections", 100, `Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state.`)

View File

@@ -9,6 +9,7 @@ import (
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
)
@@ -39,6 +40,10 @@ type VMStorage struct {
evaluationInterval time.Duration
extraParams url.Values
extraHeaders []keyValue
// whether to print additional log messages
// for each sent request
debug bool
}
type keyValue struct {
@@ -64,6 +69,7 @@ func (s *VMStorage) ApplyParams(params QuerierParams) *VMStorage {
s.dataSourceType = toDatasourceType(params.DataSourceType)
s.evaluationInterval = params.EvaluationInterval
s.extraParams = params.QueryParams
s.debug = params.Debug
if params.Headers != nil {
for key, value := range params.Headers {
kv := keyValue{key: key, value: value}
@@ -92,10 +98,10 @@ func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Durati
}
// Query executes the given query and returns parsed response
func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) ([]Metric, error) {
func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) ([]Metric, *http.Request, error) {
req, err := s.newRequestPOST()
if err != nil {
return nil, err
return nil, nil, err
}
switch s.dataSourceType {
@@ -104,12 +110,12 @@ func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) ([]Me
case datasourceGraphite:
s.setGraphiteReqParams(req, query, ts)
default:
return nil, fmt.Errorf("engine not found: %q", s.dataSourceType)
return nil, nil, fmt.Errorf("engine not found: %q", s.dataSourceType)
}
resp, err := s.do(ctx, req)
if err != nil {
return nil, err
return nil, req, err
}
defer func() {
_ = resp.Body.Close()
@@ -119,7 +125,8 @@ func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) ([]Me
if s.dataSourceType != datasourcePrometheus {
parseFn = parseGraphiteResponse
}
return parseFn(req, resp)
result, err := parseFn(req, resp)
return result, req, err
}
// QueryRange executes the given query on the given time range.
@@ -151,6 +158,9 @@ func (s *VMStorage) QueryRange(ctx context.Context, query string, start, end tim
}
func (s *VMStorage) do(ctx context.Context, req *http.Request) (*http.Response, error) {
if s.debug {
logger.Infof("DEBUG datasource request: executing %s request with params %q", req.Method, req.URL.RawQuery)
}
resp, err := s.c.Do(req.WithContext(ctx))
if err != nil {
return nil, fmt.Errorf("error getting response from %s: %w", req.URL.Redacted(), err)

View File

@@ -94,7 +94,7 @@ func TestVMInstantQuery(t *testing.T) {
ts := time.Now()
expErr := func(err string) {
if _, err := pq.Query(ctx, query, ts); err == nil {
if _, _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected %q got nil", err)
}
}
@@ -106,7 +106,7 @@ func TestVMInstantQuery(t *testing.T) {
expErr("unknown status") // 4
expErr("non-vector resultType error") // 5
m, err := pq.Query(ctx, query, ts) // 6 - vector
m, _, err := pq.Query(ctx, query, ts) // 6 - vector
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -129,10 +129,13 @@ func TestVMInstantQuery(t *testing.T) {
t.Fatalf("unexpected metric %+v want %+v", m, expected)
}
m, err = pq.Query(ctx, query, ts) // 7 - scalar
m, req, err := pq.Query(ctx, query, ts) // 7 - scalar
if err != nil {
t.Fatalf("unexpected %s", err)
}
if req == nil {
t.Fatalf("expected request to be non-nil")
}
if len(m) != 1 {
t.Fatalf("expected 1 metrics got %d in %+v", len(m), m)
}
@@ -148,7 +151,7 @@ func TestVMInstantQuery(t *testing.T) {
gq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourceGraphite)})
m, err = gq.Query(ctx, queryRender, ts) // 8 - graphite
m, _, err = gq.Query(ctx, queryRender, ts) // 8 - graphite
if err != nil {
t.Fatalf("unexpected %s", err)
}

View File

@@ -3,6 +3,7 @@ package main
import (
"context"
"fmt"
"net/http"
"reflect"
"sort"
"sync"
@@ -44,18 +45,20 @@ func (fq *fakeQuerier) BuildWithParams(_ datasource.QuerierParams) datasource.Qu
}
func (fq *fakeQuerier) QueryRange(ctx context.Context, q string, _, _ time.Time) ([]datasource.Metric, error) {
return fq.Query(ctx, q, time.Now())
req, _, err := fq.Query(ctx, q, time.Now())
return req, err
}
func (fq *fakeQuerier) Query(_ context.Context, _ string, _ time.Time) ([]datasource.Metric, error) {
func (fq *fakeQuerier) Query(_ context.Context, _ string, _ time.Time) ([]datasource.Metric, *http.Request, error) {
fq.Lock()
defer fq.Unlock()
if fq.err != nil {
return nil, fq.err
return nil, nil, fq.err
}
cp := make([]datasource.Metric, len(fq.metrics))
copy(cp, fq.metrics)
return cp, nil
req, _ := http.NewRequest(http.MethodPost, "foo.com", nil)
return cp, req, nil
}
type fakeNotifier struct {

View File

@@ -28,7 +28,7 @@ import (
)
var (
rulePath = flagutil.NewArray("rule", `Path to the file with alert rules.
rulePath = flagutil.NewArrayString("rule", `Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule="/path/to/file". Path to a single file with alerting rules
@@ -36,7 +36,7 @@ Examples:
absolute path to all .yaml files in root.
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.`)
ruleTemplatesPath = flagutil.NewArray("rule.templates", `Path or glob pattern to location with go template definitions
ruleTemplatesPath = flagutil.NewArrayString("rule.templates", `Path or glob pattern to location with go template definitions
for rules annotations templating. Flag can be specified multiple times.
Examples:
-rule.templates="/path/to/file". Path to a single file with go templates
@@ -59,10 +59,12 @@ absolute path to all .tpl files in root.`)
resendDelay = flag.Duration("rule.resendDelay", 0, "Minimum amount of time to wait before resending an alert to notifier")
externalURL = flag.String("external.url", "", "External URL is used as alert's source for sent alerts to the notifier")
externalAlertSource = flag.String("external.alert.source", "", `External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
Supports templating. For example, link to Grafana: 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|queryEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.
If empty 'vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}' is used.`)
externalLabels = flagutil.NewArray("external.label", "Optional label in the form 'Name=value' to add to all generated recording rules and alerts. "+
externalAlertSource = flag.String("external.alert.source", "", `External Alert Source allows to override the Source link for alerts sent to AlertManager `+
`for cases where you want to build a custom link to Grafana, Prometheus or any other service. `+
`Supports templating - see https://docs.victoriametrics.com/vmalert.html#templating . `+
`For example, link to Grafana: -external.alert.source='explore?orgId=1&left=["now-1h","now","VictoriaMetrics",{"expr":{{$expr|jsonEscape|queryEscape}} },{"mode":"Metrics"},{"ui":[true,true,true,"none"]}]' . `+
`If empty 'vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}' is used.`)
externalLabels = flagutil.NewArrayString("external.label", "Optional label in the form 'Name=value' to add to all generated recording rules and alerts. "+
"Pass multiple -label flags in order to add multiple label sets.")
remoteReadLookBack = flag.Duration("remoteRead.lookback", time.Hour, "Lookback defines how far to look into past for alerts timeseries."+
@@ -71,7 +73,7 @@ If empty 'vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}' is used.`)
disableAlertGroupLabel = flag.Bool("disableAlertgroupLabel", false, "Whether to disable adding group's Name as label to generated alerts and time series.")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The `-rule` flag must be specified.")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.")
)
var alertURLGeneratorFn notifier.AlertURLGenerator
@@ -264,7 +266,7 @@ func getAlertURLGenerator(externalURL *url.URL, externalAlertSource string, vali
"tpl": externalAlertSource,
}
return func(alert notifier.Alert) string {
templated, err := alert.ExecTemplate(nil, nil, m)
templated, err := alert.ExecTemplate(nil, alert.Labels, m)
if err != nil {
logger.Errorf("can not exec source template %s", err)
}

View File

@@ -34,7 +34,7 @@ func TestGetExternalURL(t *testing.T) {
}
func TestGetAlertURLGenerator(t *testing.T) {
testAlert := notifier.Alert{GroupID: 42, ID: 2, Value: 4}
testAlert := notifier.Alert{GroupID: 42, ID: 2, Value: 4, Labels: map[string]string{"tenant": "baz"}}
u, _ := url.Parse("https://victoriametrics.com/path")
fn, err := getAlertURLGenerator(u, "", false)
if err != nil {
@@ -48,11 +48,11 @@ func TestGetAlertURLGenerator(t *testing.T) {
if err == nil {
t.Errorf("expected template validation error got nil")
}
fn, err = getAlertURLGenerator(u, "foo?query={{$value}}", true)
fn, err = getAlertURLGenerator(u, "foo?query={{$value}}&ds={{ $labels.tenant }}", true)
if err != nil {
t.Errorf("unexpected error %s", err)
}
if exp := "https://victoriametrics.com/path/foo?query=4"; exp != fn(testAlert) {
if exp := "https://victoriametrics.com/path/foo?query=4&ds=baz"; exp != fn(testAlert) {
t.Errorf("unexpected url want %s, got %s", exp, fn(testAlert))
}
}

View File

@@ -30,6 +30,23 @@ type manager struct {
groups map[uint64]*Group
}
// RuleAPI generates APIRule object from alert by its ID(hash)
func (m *manager) RuleAPI(gID, rID uint64) (APIRule, error) {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
g, ok := m.groups[gID]
if !ok {
return APIRule{}, fmt.Errorf("can't find group with id %d", gID)
}
for _, rule := range g.Rules {
if rule.ID() == rID {
return rule.ToAPI(), nil
}
}
return APIRule{}, fmt.Errorf("can't find rule with id %d in group %q", rID, g.Name)
}
// AlertAPI generates APIAlert object from alert by its ID(hash)
func (m *manager) AlertAPI(gID, aID uint64) (*APIAlert, error) {
m.groupsMu.RLock()
@@ -70,9 +87,9 @@ func (m *manager) startGroup(ctx context.Context, group *Group, restore bool) er
err := group.Restore(ctx, m.rr, *remoteReadLookBack, m.labels)
if err != nil {
if !*remoteReadIgnoreRestoreErrors {
return fmt.Errorf("failed to restore state for group %q: %w", group.Name, err)
return fmt.Errorf("failed to restore ruleState for group %q: %w", group.Name, err)
}
logger.Errorf("error while restoring state for group %q: %s", group.Name, err)
logger.Errorf("error while restoring ruleState for group %q: %s", group.Name, err)
}
}

View File

@@ -183,9 +183,9 @@ func (a Alert) toPromLabels(relabelCfg *promrelabel.ParsedConfigs) []prompbmarsh
Value: v,
})
}
promrelabel.SortLabels(labels)
if relabelCfg != nil {
return relabelCfg.Apply(labels, 0, false)
labels = relabelCfg.Apply(labels, 0)
}
promrelabel.SortLabels(labels)
return labels
}

View File

@@ -67,15 +67,21 @@ func TestAlert_ExecTemplate(t *testing.T) {
{
name: "expression-template",
alert: &Alert{
Expr: `vm_rows{"label"="bar"}>0`,
Expr: `vm_rows{"label"="bar"}<0`,
},
annotations: map[string]string{
"exprEscapedQuery": "{{ $expr|quotesEscape|queryEscape }}",
"exprEscapedPath": "{{ $expr|quotesEscape|pathEscape }}",
"exprEscapedQuery": "{{ $expr|queryEscape }}",
"exprEscapedPath": "{{ $expr|pathEscape }}",
"exprEscapedJSON": "{{ $expr|jsonEscape }}",
"exprEscapedQuotes": "{{ $expr|quotesEscape }}",
"exprEscapedHTML": "{{ $expr|htmlEscape }}",
},
expTpl: map[string]string{
"exprEscapedQuery": "vm_rows%7B%5C%22label%5C%22%3D%5C%22bar%5C%22%7D%3E0",
"exprEscapedPath": "vm_rows%7B%5C%22label%5C%22=%5C%22bar%5C%22%7D%3E0",
"exprEscapedQuery": "vm_rows%7B%22label%22%3D%22bar%22%7D%3C0",
"exprEscapedPath": "vm_rows%7B%22label%22=%22bar%22%7D%3C0",
"exprEscapedJSON": `"vm_rows{\"label\"=\"bar\"}\u003c0"`,
"exprEscapedQuotes": `vm_rows{\"label\"=\"bar\"}\u003c0`,
"exprEscapedHTML": "vm_rows{&quot;label&quot;=&quot;bar&quot;}&lt;0",
},
},
{

View File

@@ -15,10 +15,10 @@
"endsAt":{%q= alert.End.Format(time.RFC3339Nano) %},
{% endif %}
"labels": {
"alertname":{%q= alert.Name %}
{% code lbls := alert.toPromLabels(relabelCfg) %}
{% for _, l := range lbls %}
,{%q= l.Name %}:{%q= l.Value %}
{% code ll := len(lbls) %}
{% for idx, l := range lbls %}
{%q= l.Name %}:{%q= l.Value %}{% if idx != ll-1 %}, {% endif %}
{% endfor %}
},
"annotations": {

View File

@@ -51,22 +51,27 @@ func streamamRequest(qw422016 *qt422016.Writer, alerts []Alert, generatorURL fun
//line app/vmalert/notifier/alertmanager_request.qtpl:16
}
//line app/vmalert/notifier/alertmanager_request.qtpl:16
qw422016.N().S(`"labels": {"alertname":`)
qw422016.N().S(`"labels": {`)
//line app/vmalert/notifier/alertmanager_request.qtpl:18
qw422016.N().Q(alert.Name)
//line app/vmalert/notifier/alertmanager_request.qtpl:19
lbls := alert.toPromLabels(relabelCfg)
//line app/vmalert/notifier/alertmanager_request.qtpl:19
ll := len(lbls)
//line app/vmalert/notifier/alertmanager_request.qtpl:20
for _, l := range lbls {
//line app/vmalert/notifier/alertmanager_request.qtpl:20
qw422016.N().S(`,`)
for idx, l := range lbls {
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().Q(l.Name)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().S(`:`)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().Q(l.Value)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
if idx != ll-1 {
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
}
//line app/vmalert/notifier/alertmanager_request.qtpl:22
}
//line app/vmalert/notifier/alertmanager_request.qtpl:22

View File

@@ -67,9 +67,6 @@ func TestAlertManager_Send(t *testing.T) {
if a[0].GeneratorURL != "0/0" {
t.Errorf("expected 0/0 as generatorURL got %s", a[0].GeneratorURL)
}
if a[0].Labels["alertname"] != "alert0" {
t.Errorf("expected alert0 as alert name got %s", a[0].Labels["alertname"])
}
if a[0].StartsAt.IsZero() {
t.Errorf("expected non-zero start time")
}

View File

@@ -68,6 +68,8 @@ type Config struct {
// [ - '<host>' ]
type StaticConfig struct {
Targets []string `yaml:"targets"`
// HTTPClientConfig contains HTTP configuration for the Targets
HTTPClientConfig promauth.HTTPClientConfig `yaml:",inline"`
}
// UnmarshalYAML implements the yaml.Unmarshaler interface.
@@ -130,8 +132,9 @@ func parseConfig(path string) (*Config, error) {
func parseLabels(target string, metaLabels map[string]string, cfg *Config) (string, []prompbmarshal.Label, error) {
labels := mergeLabels(target, metaLabels, cfg)
labels = cfg.parsedRelabelConfigs.Apply(labels, 0, false)
labels = cfg.parsedRelabelConfigs.Apply(labels, 0)
labels = promrelabel.RemoveMetaLabels(labels[:0], labels)
promrelabel.SortLabels(labels)
// Remove references to already deleted labels, so GC could clean strings for label name and label value past len(labels).
// This should reduce memory usage when relabeling creates big number of temporary labels with long names and/or values.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825 for details.

View File

@@ -6,6 +6,7 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/consul"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/dns"
)
@@ -161,12 +162,13 @@ func (cw *configWatcher) start() error {
if len(cw.cfg.StaticConfigs) > 0 {
var targets []Target
for _, cfg := range cw.cfg.StaticConfigs {
httpCfg := mergeHTTPClientConfigs(cw.cfg.HTTPClientConfig, cfg.HTTPClientConfig)
for _, target := range cfg.Targets {
address, labels, err := parseLabels(target, nil, cw.cfg)
if err != nil {
return fmt.Errorf("failed to parse labels for target %q: %s", target, err)
}
notifier, err := NewAlertManager(address, cw.genFn, cw.cfg.HTTPClientConfig, cw.cfg.parsedAlertRelabelConfigs, cw.cfg.Timeout.Duration())
notifier, err := NewAlertManager(address, cw.genFn, httpCfg, cw.cfg.parsedAlertRelabelConfigs, cw.cfg.Timeout.Duration())
if err != nil {
return fmt.Errorf("failed to init alertmanager for addr %q: %s", address, err)
}
@@ -252,3 +254,30 @@ func (cw *configWatcher) setTargets(key TargetType, targets []Target) {
cw.targets[key] = targets
cw.targetsMu.Unlock()
}
// mergeHTTPClientConfigs merges fields between child and parent params
// by populating child from parent params if they're missing.
func mergeHTTPClientConfigs(parent, child promauth.HTTPClientConfig) promauth.HTTPClientConfig {
if child.Authorization == nil {
child.Authorization = parent.Authorization
}
if child.BasicAuth == nil {
child.BasicAuth = parent.BasicAuth
}
if child.BearerToken == nil {
child.BearerToken = parent.BearerToken
}
if child.BearerTokenFile == "" {
child.BearerTokenFile = parent.BearerTokenFile
}
if child.OAuth2 == nil {
child.OAuth2 = parent.OAuth2
}
if child.TLSConfig == nil {
child.TLSConfig = parent.TLSConfig
}
if child.Headers == nil {
child.Headers = parent.Headers
}
return child
}

View File

@@ -8,6 +8,8 @@ import (
"os"
"sync"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
)
func TestConfigWatcherReload(t *testing.T) {
@@ -298,3 +300,20 @@ func newFakeConsulServer() *httptest.Server {
return httptest.NewServer(mux)
}
func TestMergeHTTPClientConfigs(t *testing.T) {
cfg1 := promauth.HTTPClientConfig{Headers: []string{"Header:Foo"}}
cfg2 := promauth.HTTPClientConfig{BasicAuth: &promauth.BasicAuthConfig{
Username: "foo",
Password: promauth.NewSecret("bar"),
}}
result := mergeHTTPClientConfigs(cfg1, cfg2)
if result.Headers == nil {
t.Fatalf("expected Headers to be inherited")
}
if result.BasicAuth == nil {
t.Fatalf("expected BasicAuth tp be present")
}
}

View File

@@ -17,32 +17,32 @@ var (
configPath = flag.String("notifier.config", "", "Path to configuration file for notifiers")
suppressDuplicateTargetErrors = flag.Bool("notifier.suppressDuplicateTargetErrors", false, "Whether to suppress 'duplicate target' errors during discovery")
addrs = flagutil.NewArray("notifier.url", "Prometheus alertmanager URL, e.g. http://127.0.0.1:9093")
addrs = flagutil.NewArrayString("notifier.url", "Prometheus alertmanager URL, e.g. http://127.0.0.1:9093")
basicAuthUsername = flagutil.NewArray("notifier.basicAuth.username", "Optional basic auth username for -notifier.url")
basicAuthPassword = flagutil.NewArray("notifier.basicAuth.password", "Optional basic auth password for -notifier.url")
basicAuthPasswordFile = flagutil.NewArray("notifier.basicAuth.passwordFile", "Optional path to basic auth password file for -notifier.url")
basicAuthUsername = flagutil.NewArrayString("notifier.basicAuth.username", "Optional basic auth username for -notifier.url")
basicAuthPassword = flagutil.NewArrayString("notifier.basicAuth.password", "Optional basic auth password for -notifier.url")
basicAuthPasswordFile = flagutil.NewArrayString("notifier.basicAuth.passwordFile", "Optional path to basic auth password file for -notifier.url")
bearerToken = flagutil.NewArray("notifier.bearerToken", "Optional bearer token for -notifier.url")
bearerTokenFile = flagutil.NewArray("notifier.bearerTokenFile", "Optional path to bearer token file for -notifier.url")
bearerToken = flagutil.NewArrayString("notifier.bearerToken", "Optional bearer token for -notifier.url")
bearerTokenFile = flagutil.NewArrayString("notifier.bearerTokenFile", "Optional path to bearer token file for -notifier.url")
tlsInsecureSkipVerify = flagutil.NewArrayBool("notifier.tlsInsecureSkipVerify", "Whether to skip tls verification when connecting to -notifier.url")
tlsCertFile = flagutil.NewArray("notifier.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting to -notifier.url")
tlsKeyFile = flagutil.NewArray("notifier.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to -notifier.url")
tlsCAFile = flagutil.NewArray("notifier.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to -notifier.url. "+
tlsCertFile = flagutil.NewArrayString("notifier.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting to -notifier.url")
tlsKeyFile = flagutil.NewArrayString("notifier.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to -notifier.url")
tlsCAFile = flagutil.NewArrayString("notifier.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to -notifier.url. "+
"By default system CA is used")
tlsServerName = flagutil.NewArray("notifier.tlsServerName", "Optional TLS server name to use for connections to -notifier.url. "+
tlsServerName = flagutil.NewArrayString("notifier.tlsServerName", "Optional TLS server name to use for connections to -notifier.url. "+
"By default the server name from -notifier.url is used")
oauth2ClientID = flagutil.NewArray("notifier.oauth2.clientID", "Optional OAuth2 clientID to use for -notifier.url. "+
oauth2ClientID = flagutil.NewArrayString("notifier.oauth2.clientID", "Optional OAuth2 clientID to use for -notifier.url. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2ClientSecret = flagutil.NewArray("notifier.oauth2.clientSecret", "Optional OAuth2 clientSecret to use for -notifier.url. "+
oauth2ClientSecret = flagutil.NewArrayString("notifier.oauth2.clientSecret", "Optional OAuth2 clientSecret to use for -notifier.url. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2ClientSecretFile = flagutil.NewArray("notifier.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for -notifier.url. "+
oauth2ClientSecretFile = flagutil.NewArrayString("notifier.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for -notifier.url. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2TokenURL = flagutil.NewArray("notifier.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for -notifier.url. "+
oauth2TokenURL = flagutil.NewArrayString("notifier.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for -notifier.url. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2Scopes = flagutil.NewArray("notifier.oauth2.scopes", "Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. "+
oauth2Scopes = flagutil.NewArrayString("notifier.oauth2.scopes", "Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
)

View File

@@ -1,7 +1,21 @@
headers:
- 'CustomHeader: foo'
static_configs:
- targets:
- localhost:9093
- localhost:9095
basic_auth:
username: foo
password: bar
- targets:
- localhost:9096
- localhost:9097
basic_auth:
username: foo
password: baz
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"
replacement: "aaa"

View File

@@ -5,7 +5,6 @@ import (
"fmt"
"sort"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
@@ -27,19 +26,9 @@ type RecordingRule struct {
q datasource.Querier
// guard status fields
mu sync.RWMutex
// stores last moment of time Exec was called
lastExecTime time.Time
// stores the duration of the last Exec call
lastExecDuration time.Duration
// stores last error that happened in Exec func
// resets on every successful Exec
// may be used as Health state
lastExecError error
// stores the number of samples returned during
// the last evaluation
lastExecSamples int
// state stores recent state changes
// during evaluations
state *ruleState
metrics *recordingRuleMetrics
}
@@ -69,6 +58,7 @@ func newRecordingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rul
Labels: cfg.Labels,
GroupID: group.ID(),
metrics: &recordingRuleMetrics{},
state: newRuleState(),
q: qb.BuildWithParams(datasource.QuerierParams{
DataSourceType: group.Type.String(),
EvaluationInterval: group.Interval,
@@ -80,18 +70,16 @@ func newRecordingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rul
labels := fmt.Sprintf(`recording=%q, group=%q, id="%d"`, rr.Name, group.Name, rr.ID())
rr.metrics.errors = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_recording_rules_error{%s}`, labels),
func() float64 {
rr.mu.RLock()
defer rr.mu.RUnlock()
if rr.lastExecError == nil {
e := rr.state.getLast()
if e.err == nil {
return 0
}
return 1
})
rr.metrics.samples = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_recording_rules_last_evaluation_samples{%s}`, labels),
func() float64 {
rr.mu.RLock()
defer rr.mu.RUnlock()
return float64(rr.lastExecSamples)
e := rr.state.getLast()
return float64(e.samples)
})
return rr
}
@@ -126,21 +114,29 @@ func (rr *RecordingRule) ExecRange(ctx context.Context, start, end time.Time) ([
// Exec executes RecordingRule expression via the given Querier.
func (rr *RecordingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]prompbmarshal.TimeSeries, error) {
qMetrics, err := rr.q.Query(ctx, rr.Expr, ts)
rr.mu.Lock()
defer rr.mu.Unlock()
start := time.Now()
qMetrics, req, err := rr.q.Query(ctx, rr.Expr, ts)
curState := ruleStateEntry{
time: start,
at: ts,
duration: time.Since(start),
samples: len(qMetrics),
req: req,
}
defer func() {
rr.state.add(curState)
}()
rr.lastExecTime = ts
rr.lastExecDuration = time.Since(ts)
rr.lastExecError = err
rr.lastExecSamples = len(qMetrics)
if err != nil {
return nil, fmt.Errorf("failed to execute query %q: %w", rr.Expr, err)
curState.err = fmt.Errorf("failed to execute query %q: %w", rr.Expr, err)
return nil, curState.err
}
numSeries := len(qMetrics)
if limit > 0 && numSeries > limit {
return nil, fmt.Errorf("exec exceeded limit of %d with %d series", limit, numSeries)
curState.err = fmt.Errorf("exec exceeded limit of %d with %d series", limit, numSeries)
return nil, curState.err
}
duplicates := make(map[string]struct{}, len(qMetrics))
@@ -149,8 +145,8 @@ func (rr *RecordingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]p
ts := rr.toTimeSeries(r)
key := stringifyLabels(ts)
if _, ok := duplicates[key]; ok {
rr.lastExecError = errDuplicate
return nil, fmt.Errorf("original metric %v; resulting labels %q: %w", r, key, errDuplicate)
curState.err = fmt.Errorf("original metric %v; resulting labels %q: %w", r, key, errDuplicate)
return nil, curState.err
}
duplicates[key] = struct{}{}
tss = append(tss, ts)
@@ -205,23 +201,25 @@ func (rr *RecordingRule) UpdateWith(r Rule) error {
// ToAPI returns Rule's representation in form
// of APIRule
func (rr *RecordingRule) ToAPI() APIRule {
lastState := rr.state.getLast()
r := APIRule{
Type: "recording",
DatasourceType: rr.Type.String(),
Name: rr.Name,
Query: rr.Expr,
Labels: rr.Labels,
LastEvaluation: rr.lastExecTime,
EvaluationTime: rr.lastExecDuration.Seconds(),
LastEvaluation: lastState.time,
EvaluationTime: lastState.duration.Seconds(),
Health: "ok",
LastSamples: rr.lastExecSamples,
LastSamples: lastState.samples,
Updates: rr.state.getAll(),
// encode as strings to avoid rounding
ID: fmt.Sprintf("%d", rr.ID()),
GroupID: fmt.Sprintf("%d", rr.GroupID),
}
if rr.lastExecError != nil {
r.LastError = rr.lastExecError.Error()
if lastState.err != nil {
r.LastError = lastState.err.Error()
r.Health = "err"
}
return r

View File

@@ -19,7 +19,7 @@ func TestRecordingRule_Exec(t *testing.T) {
expTS []prompbmarshal.TimeSeries
}{
{
&RecordingRule{Name: "foo"},
&RecordingRule{Name: "foo", state: newRuleState()},
[]datasource.Metric{metricWithValueAndLabels(t, 10,
"__name__", "bar",
)},
@@ -30,7 +30,7 @@ func TestRecordingRule_Exec(t *testing.T) {
},
},
{
&RecordingRule{Name: "foobarbaz"},
&RecordingRule{Name: "foobarbaz", state: newRuleState()},
[]datasource.Metric{
metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "foo"),
metricWithValueAndLabels(t, 2, "__name__", "bar", "job", "bar"),
@@ -52,9 +52,12 @@ func TestRecordingRule_Exec(t *testing.T) {
},
},
{
&RecordingRule{Name: "job:foo", Labels: map[string]string{
"source": "test",
}},
&RecordingRule{
Name: "job:foo",
state: newRuleState(),
Labels: map[string]string{
"source": "test",
}},
[]datasource.Metric{
metricWithValueAndLabels(t, 2, "__name__", "foo", "job", "foo"),
metricWithValueAndLabels(t, 1, "__name__", "bar", "job", "bar")},
@@ -195,7 +198,7 @@ func TestRecordingRuleLimit(t *testing.T) {
metricWithValuesAndLabels(t, []float64{2, 3}, "__name__", "bar", "job", "bar"),
metricWithValuesAndLabels(t, []float64{4, 5, 6}, "__name__", "baz", "job", "baz"),
}
rule := &RecordingRule{Name: "job:foo", Labels: map[string]string{
rule := &RecordingRule{Name: "job:foo", state: newRuleState(), Labels: map[string]string{
"source": "test_limit",
}}
var err error
@@ -211,9 +214,13 @@ func TestRecordingRuleLimit(t *testing.T) {
}
func TestRecordingRule_ExecNegative(t *testing.T) {
rr := &RecordingRule{Name: "job:foo", Labels: map[string]string{
"job": "test",
}}
rr := &RecordingRule{
Name: "job:foo",
state: newRuleState(),
Labels: map[string]string{
"job": "test",
},
}
fq := &fakeQuerier{}
expErr := "connection reset by peer"

View File

@@ -218,7 +218,7 @@ func (c *Client) flush(ctx context.Context, wr *prompbmarshal.WriteRequest) {
return
}
logger.Errorf("attempt %d to send request failed: %s", i+1, err)
logger.Warnf("attempt %d to send request failed: %s", i+1, err)
// sleeping to avoid remote db hammering
time.Sleep(time.Second)
continue

View File

@@ -7,7 +7,7 @@ import (
"strings"
"time"
"github.com/dmitryk-dk/pb/v3"
"github.com/cheggaaa/pb/v3"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"

View File

@@ -3,6 +3,8 @@ package main
import (
"context"
"errors"
"net/http"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
@@ -31,3 +33,74 @@ type Rule interface {
}
var errDuplicate = errors.New("result contains metrics with the same labelset after applying rule labels")
type ruleState struct {
sync.RWMutex
entries []ruleStateEntry
cur int
}
type ruleStateEntry struct {
// stores last moment of time rule.Exec was called
time time.Time
// stores the timesteamp with which rule.Exec was called
at time.Time
// stores the duration of the last rule.Exec call
duration time.Duration
// stores last error that happened in Exec func
// resets on every successful Exec
// may be used as Health ruleState
err error
// stores the number of samples returned during
// the last evaluation
samples int
// stores the HTTP request used by datasource during rule.Exec
req *http.Request
}
const defaultStateEntriesLimit = 20
func newRuleState() *ruleState {
return &ruleState{
entries: make([]ruleStateEntry, defaultStateEntriesLimit),
}
}
func (s *ruleState) getLast() ruleStateEntry {
s.RLock()
defer s.RUnlock()
return s.entries[s.cur]
}
func (s *ruleState) getAll() []ruleStateEntry {
entries := make([]ruleStateEntry, 0)
s.RLock()
defer s.RUnlock()
cur := s.cur
for {
e := s.entries[cur]
if !e.time.IsZero() || !e.at.IsZero() {
entries = append(entries, e)
}
cur--
if cur < 0 {
cur = cap(s.entries) - 1
}
if cur == s.cur {
return entries
}
}
}
func (s *ruleState) add(e ruleStateEntry) {
s.Lock()
defer s.Unlock()
s.cur++
if s.cur > cap(s.entries)-1 {
s.cur = 0
}
s.entries[s.cur] = e
}

81
app/vmalert/rule_test.go Normal file
View File

@@ -0,0 +1,81 @@
package main
import (
"sync"
"testing"
"time"
)
func TestRule_state(t *testing.T) {
state := newRuleState()
e := state.getLast()
if !e.at.IsZero() {
t.Fatalf("expected entry to be zero")
}
now := time.Now()
state.add(ruleStateEntry{at: now})
e = state.getLast()
if e.at != now {
t.Fatalf("expected entry at %v to be equal to %v",
e.at, now)
}
time.Sleep(time.Millisecond)
now2 := time.Now()
state.add(ruleStateEntry{at: now2})
e = state.getLast()
if e.at != now2 {
t.Fatalf("expected entry at %v to be equal to %v",
e.at, now2)
}
if len(state.getAll()) != 2 {
t.Fatalf("expected for state to have 2 entries only; got %d",
len(state.getAll()),
)
}
var last time.Time
for i := 0; i < defaultStateEntriesLimit*2; i++ {
last = time.Now()
state.add(ruleStateEntry{at: last})
}
e = state.getLast()
if e.at != last {
t.Fatalf("expected entry at %v to be equal to %v",
e.at, last)
}
if len(state.getAll()) != defaultStateEntriesLimit {
t.Fatalf("expected for state to have %d entries only; got %d",
defaultStateEntriesLimit, len(state.getAll()),
)
}
}
// TestRule_stateConcurrent supposed to test concurrent
// execution of state updates.
// Should be executed with -race flag
func TestRule_stateConcurrent(t *testing.T) {
state := newRuleState()
const workers = 50
const iterations = 100
wg := sync.WaitGroup{}
wg.Add(workers)
for i := 0; i < workers; i++ {
go func() {
defer wg.Done()
for i := 0; i < iterations; i++ {
state.add(ruleStateEntry{at: time.Now()})
state.getAll()
state.getLast()
}
}()
}
wg.Wait()
}

View File

@@ -0,0 +1,15 @@
{% stripspace %}
{% func quotesEscape(s string) %}
{%j= s %}
{% endfunc %}
{% func jsonEscape(s string) %}
{%q= s %}
{% endfunc %}
{% func htmlEscape(s string) %}
{%s s %}
{% endfunc %}
{% endstripspace %}

View File

@@ -0,0 +1,117 @@
// Code generated by qtc from "funcs.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vmalert/templates/funcs.qtpl:3
package templates
//line app/vmalert/templates/funcs.qtpl:3
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmalert/templates/funcs.qtpl:3
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmalert/templates/funcs.qtpl:3
func streamquotesEscape(qw422016 *qt422016.Writer, s string) {
//line app/vmalert/templates/funcs.qtpl:4
qw422016.N().J(s)
//line app/vmalert/templates/funcs.qtpl:5
}
//line app/vmalert/templates/funcs.qtpl:5
func writequotesEscape(qq422016 qtio422016.Writer, s string) {
//line app/vmalert/templates/funcs.qtpl:5
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/templates/funcs.qtpl:5
streamquotesEscape(qw422016, s)
//line app/vmalert/templates/funcs.qtpl:5
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/templates/funcs.qtpl:5
}
//line app/vmalert/templates/funcs.qtpl:5
func quotesEscape(s string) string {
//line app/vmalert/templates/funcs.qtpl:5
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/templates/funcs.qtpl:5
writequotesEscape(qb422016, s)
//line app/vmalert/templates/funcs.qtpl:5
qs422016 := string(qb422016.B)
//line app/vmalert/templates/funcs.qtpl:5
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/templates/funcs.qtpl:5
return qs422016
//line app/vmalert/templates/funcs.qtpl:5
}
//line app/vmalert/templates/funcs.qtpl:7
func streamjsonEscape(qw422016 *qt422016.Writer, s string) {
//line app/vmalert/templates/funcs.qtpl:8
qw422016.N().Q(s)
//line app/vmalert/templates/funcs.qtpl:9
}
//line app/vmalert/templates/funcs.qtpl:9
func writejsonEscape(qq422016 qtio422016.Writer, s string) {
//line app/vmalert/templates/funcs.qtpl:9
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/templates/funcs.qtpl:9
streamjsonEscape(qw422016, s)
//line app/vmalert/templates/funcs.qtpl:9
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/templates/funcs.qtpl:9
}
//line app/vmalert/templates/funcs.qtpl:9
func jsonEscape(s string) string {
//line app/vmalert/templates/funcs.qtpl:9
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/templates/funcs.qtpl:9
writejsonEscape(qb422016, s)
//line app/vmalert/templates/funcs.qtpl:9
qs422016 := string(qb422016.B)
//line app/vmalert/templates/funcs.qtpl:9
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/templates/funcs.qtpl:9
return qs422016
//line app/vmalert/templates/funcs.qtpl:9
}
//line app/vmalert/templates/funcs.qtpl:11
func streamhtmlEscape(qw422016 *qt422016.Writer, s string) {
//line app/vmalert/templates/funcs.qtpl:12
qw422016.E().S(s)
//line app/vmalert/templates/funcs.qtpl:13
}
//line app/vmalert/templates/funcs.qtpl:13
func writehtmlEscape(qq422016 qtio422016.Writer, s string) {
//line app/vmalert/templates/funcs.qtpl:13
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/templates/funcs.qtpl:13
streamhtmlEscape(qw422016, s)
//line app/vmalert/templates/funcs.qtpl:13
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/templates/funcs.qtpl:13
}
//line app/vmalert/templates/funcs.qtpl:13
func htmlEscape(s string) string {
//line app/vmalert/templates/funcs.qtpl:13
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/templates/funcs.qtpl:13
writehtmlEscape(qb422016, s)
//line app/vmalert/templates/funcs.qtpl:13
qs422016 := string(qb422016.B)
//line app/vmalert/templates/funcs.qtpl:13
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/templates/funcs.qtpl:13
return qs422016
//line app/vmalert/templates/funcs.qtpl:13
}

View File

@@ -207,23 +207,10 @@ func FuncsWithExternalURL(externalURL *url.URL) textTpl.FuncMap {
// templateFuncs initiates template helper functions
func templateFuncs() textTpl.FuncMap {
// See https://prometheus.io/docs/prometheus/latest/configuration/template_reference/
// and https://github.com/prometheus/prometheus/blob/fa6e05903fd3ce52e374a6e1bf4eb98c9f1f45a7/template/template.go#L150
return textTpl.FuncMap{
/* Strings */
// reReplaceAll ReplaceAllString returns a copy of src, replacing matches of the Regexp with
// the replacement string repl. Inside repl, $ signs are interpreted as in Expand,
// so for instance $1 represents the text of the first submatch.
// alias for https://golang.org/pkg/regexp/#Regexp.ReplaceAllString
"reReplaceAll": func(pattern, repl, text string) string {
re := regexp.MustCompile(pattern)
return re.ReplaceAllString(text, repl)
},
// match reports whether the string s
// contains any match of the regular expression pattern.
// alias for https://golang.org/pkg/regexp/#MatchString
"match": regexp.MatchString,
// title returns a copy of the string s with all Unicode letters
// that begin words mapped to their Unicode title case.
// alias for https://golang.org/pkg/strings/#Title
@@ -237,6 +224,31 @@ func templateFuncs() textTpl.FuncMap {
// alias for https://golang.org/pkg/strings/#ToLower
"toLower": strings.ToLower,
// crlfEscape replaces '\n' and '\r' chars with `\\n` and `\\r`.
// This funcion is deprectated.
//
// It is better to use quotesEscape, jsonEscape, queryEscape or pathEscape instead -
// these functions properly escape `\n` and `\r` chars according to their purpose.
"crlfEscape": func(q string) string {
q = strings.Replace(q, "\n", `\n`, -1)
return strings.Replace(q, "\r", `\r`, -1)
},
// quotesEscape escapes the string, so it can be safely put inside JSON string.
//
// See also jsonEscape.
"quotesEscape": quotesEscape,
// jsonEscape converts the string to properly encoded JSON string.
//
// See also quotesEscape.
"jsonEscape": jsonEscape,
// htmlEscape applies html-escaping to q, so it can be safely embedded as plaintext into html.
//
// See also safeHtml.
"htmlEscape": htmlEscape,
// stripPort splits string into host and port, then returns only host.
"stripPort": func(hostPort string) string {
host, _, err := net.SplitHostPort(hostPort)
@@ -246,6 +258,37 @@ func templateFuncs() textTpl.FuncMap {
return host
},
// stripDomain removes the domain part of a FQDN. Leaves port untouched.
"stripDomain": func(hostPort string) string {
host, port, err := net.SplitHostPort(hostPort)
if err != nil {
host = hostPort
}
ip := net.ParseIP(host)
if ip != nil {
return hostPort
}
host = strings.Split(host, ".")[0]
if port != "" {
return net.JoinHostPort(host, port)
}
return host
},
// match reports whether the string s
// contains any match of the regular expression pattern.
// alias for https://golang.org/pkg/regexp/#MatchString
"match": regexp.MatchString,
// reReplaceAll ReplaceAllString returns a copy of src, replacing matches of the Regexp with
// the replacement string repl. Inside repl, $ signs are interpreted as in Expand,
// so for instance $1 represents the text of the first submatch.
// alias for https://golang.org/pkg/regexp/#Regexp.ReplaceAllString
"reReplaceAll": func(pattern, repl, text string) string {
re := regexp.MustCompile(pattern)
return re.ReplaceAllString(text, repl)
},
// parseDuration parses a duration string such as "1h" into the number of seconds it represents
"parseDuration": func(s string) (float64, error) {
d, err := promutils.ParseDuration(s)
@@ -421,31 +464,15 @@ func templateFuncs() textTpl.FuncMap {
return ""
},
// pathEscape escapes the string so it can be safely placed inside a URL path segment,
// replacing special characters (including /) with %XX sequences as needed.
// alias for https://golang.org/pkg/net/url/#PathEscape
"pathEscape": func(u string) string {
return url.PathEscape(u)
},
// pathEscape escapes the string so it can be safely placed inside a URL path segment.
//
// See also queryEscape.
"pathEscape": url.PathEscape,
// queryEscape escapes the string so it can be safely placed
// inside a URL query.
// alias for https://golang.org/pkg/net/url/#QueryEscape
"queryEscape": func(q string) string {
return url.QueryEscape(q)
},
// crlfEscape replaces new line chars to skip URL encoding.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890
"crlfEscape": func(q string) string {
q = strings.Replace(q, "\n", `\n`, -1)
return strings.Replace(q, "\r", `\r`, -1)
},
// quotesEscape escapes quote char
"quotesEscape": func(q string) string {
return strings.Replace(q, `"`, `\"`, -1)
},
// queryEscape escapes the string so it can be safely placed inside a query arg in URL.
//
// See also queryEscape.
"queryEscape": url.QueryEscape,
// query executes the MetricsQL/PromQL query against
// configured `datasource.url` address.
@@ -477,6 +504,17 @@ func templateFuncs() textTpl.FuncMap {
return m.Labels[label]
},
// value returns the value of the given metric.
// usually used alongside with `query` template function.
"value": func(m metric) float64 {
return m.Value
},
// strvalue returns metric name.
"strvalue": func(m metric) string {
return m.Labels["__name__"]
},
// sortByLabel sorts the given metrics by provided label key
"sortByLabel": func(label string, metrics []metric) []metric {
sort.SliceStable(metrics, func(i, j int) bool {
@@ -485,12 +523,6 @@ func templateFuncs() textTpl.FuncMap {
return metrics
},
// value returns the value of the given metric.
// usually used alongside with `query` template function.
"value": func(m metric) float64 {
return m.Value
},
/* Helpers */
// Converts a list of objects to a map with keys arg0, arg1 etc.
@@ -504,6 +536,8 @@ func templateFuncs() textTpl.FuncMap {
},
// safeHtml marks string as HTML not requiring auto-escaping.
//
// See also htmlEscape.
"safeHtml": func(text string) htmlTpl.HTML {
return htmlTpl.HTML(text)
},

View File

@@ -6,6 +6,52 @@ import (
textTpl "text/template"
)
func TestTemplateFuncs(t *testing.T) {
funcs := templateFuncs()
f := func(funcName, s, resultExpected string) {
t.Helper()
v := funcs[funcName]
fLocal := v.(func(s string) string)
result := fLocal(s)
if result != resultExpected {
t.Fatalf("unexpected result for %s(%q); got\n%s\nwant\n%s", funcName, s, result, resultExpected)
}
}
f("title", "foo bar", "Foo Bar")
f("toUpper", "foo", "FOO")
f("toLower", "FOO", "foo")
f("pathEscape", "foo/bar\n+baz", "foo%2Fbar%0A+baz")
f("queryEscape", "foo+bar\n+baz", "foo%2Bbar%0A%2Bbaz")
f("jsonEscape", `foo{bar="baz"}`+"\n + 1", `"foo{bar=\"baz\"}\n + 1"`)
f("quotesEscape", `foo{bar="baz"}`+"\n + 1", `foo{bar=\"baz\"}\n + 1`)
f("htmlEscape", "foo < 10\nabc", "foo &lt; 10\nabc")
f("crlfEscape", "foo\nbar\rx", `foo\nbar\rx`)
f("stripPort", "foo", "foo")
f("stripPort", "foo:1234", "foo")
f("stripDomain", "foo.bar.baz", "foo")
f("stripDomain", "foo.bar:123", "foo:123")
// check "match" func
matchFunc := funcs["match"].(func(pattern, s string) (bool, error))
if _, err := matchFunc("invalid[regexp", "abc"); err == nil {
t.Fatalf("expecting non-nil error on invalid regexp")
}
ok, err := matchFunc("abc", "def")
if err != nil {
t.Fatalf("unexpected error")
}
if ok {
t.Fatalf("unexpected match")
}
ok, err = matchFunc("a.+b", "acsdb")
if err != nil {
t.Fatalf("unexpected error")
}
if !ok {
t.Fatalf("unexpected mismatch")
}
}
func mkTemplate(current, replacement interface{}) textTemplate {
tmpl := textTemplate{}
if current != nil {

View File

@@ -59,6 +59,15 @@
background-color: rgba(0,0,0,.5);
-webkit-box-shadow: 0 0 1px rgba(255,255,255,.5);
}
textarea.curl-area{
width: 100%;
line-height: 1;
font-size: 12px;
border: none;
margin: 0;
padding: 0;
overflow: scroll;
}
</style>
</head>
<body>

View File

@@ -101,143 +101,152 @@ func StreamHeader(qw422016 *qt422016.Writer, r *http.Request, navItems []NavItem
background-color: rgba(0,0,0,.5);
-webkit-box-shadow: 0 0 1px rgba(255,255,255,.5);
}
textarea.curl-area{
width: 100%;
line-height: 1;
font-size: 12px;
border: none;
margin: 0;
padding: 0;
overflow: scroll;
}
</style>
</head>
<body>
`)
//line app/vmalert/tpl/header.qtpl:65
//line app/vmalert/tpl/header.qtpl:74
streamprintNavItems(qw422016, r, title, navItems)
//line app/vmalert/tpl/header.qtpl:65
//line app/vmalert/tpl/header.qtpl:74
qw422016.N().S(`
<main class="px-2">
`)
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
}
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
func WriteHeader(qq422016 qtio422016.Writer, r *http.Request, navItems []NavItem, title string) {
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
StreamHeader(qw422016, r, navItems, title)
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
}
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
func Header(r *http.Request, navItems []NavItem, title string) string {
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
WriteHeader(qb422016, r, navItems, title)
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
qs422016 := string(qb422016.B)
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
return qs422016
//line app/vmalert/tpl/header.qtpl:67
//line app/vmalert/tpl/header.qtpl:76
}
//line app/vmalert/tpl/header.qtpl:71
//line app/vmalert/tpl/header.qtpl:80
type NavItem struct {
Name string
Url string
}
//line app/vmalert/tpl/header.qtpl:77
//line app/vmalert/tpl/header.qtpl:86
func streamprintNavItems(qw422016 *qt422016.Writer, r *http.Request, current string, items []NavItem) {
//line app/vmalert/tpl/header.qtpl:77
//line app/vmalert/tpl/header.qtpl:86
qw422016.N().S(`
`)
//line app/vmalert/tpl/header.qtpl:79
//line app/vmalert/tpl/header.qtpl:88
prefix := "/vmalert/"
if strings.HasPrefix(r.URL.Path, prefix) {
prefix = ""
}
//line app/vmalert/tpl/header.qtpl:83
//line app/vmalert/tpl/header.qtpl:92
qw422016.N().S(`
<nav class="navbar navbar-expand-md navbar-dark fixed-top bg-dark">
<div class="container-fluid">
<div class="collapse navbar-collapse" id="navbarCollapse">
<ul class="navbar-nav me-auto mb-2 mb-md-0">
`)
//line app/vmalert/tpl/header.qtpl:88
//line app/vmalert/tpl/header.qtpl:97
for _, item := range items {
//line app/vmalert/tpl/header.qtpl:88
//line app/vmalert/tpl/header.qtpl:97
qw422016.N().S(`
<li class="nav-item">
`)
//line app/vmalert/tpl/header.qtpl:91
//line app/vmalert/tpl/header.qtpl:100
u, _ := url.Parse(item.Url)
//line app/vmalert/tpl/header.qtpl:92
//line app/vmalert/tpl/header.qtpl:101
qw422016.N().S(`
<a class="nav-link`)
//line app/vmalert/tpl/header.qtpl:93
//line app/vmalert/tpl/header.qtpl:102
if current == item.Name {
//line app/vmalert/tpl/header.qtpl:93
//line app/vmalert/tpl/header.qtpl:102
qw422016.N().S(` active`)
//line app/vmalert/tpl/header.qtpl:93
//line app/vmalert/tpl/header.qtpl:102
}
//line app/vmalert/tpl/header.qtpl:93
//line app/vmalert/tpl/header.qtpl:102
qw422016.N().S(`"
href="`)
//line app/vmalert/tpl/header.qtpl:94
//line app/vmalert/tpl/header.qtpl:103
if u.IsAbs() {
//line app/vmalert/tpl/header.qtpl:94
//line app/vmalert/tpl/header.qtpl:103
qw422016.E().S(item.Url)
//line app/vmalert/tpl/header.qtpl:94
//line app/vmalert/tpl/header.qtpl:103
} else {
//line app/vmalert/tpl/header.qtpl:94
//line app/vmalert/tpl/header.qtpl:103
qw422016.E().S(path.Join(prefix, item.Url))
//line app/vmalert/tpl/header.qtpl:94
//line app/vmalert/tpl/header.qtpl:103
}
//line app/vmalert/tpl/header.qtpl:94
//line app/vmalert/tpl/header.qtpl:103
qw422016.N().S(`">
`)
//line app/vmalert/tpl/header.qtpl:95
//line app/vmalert/tpl/header.qtpl:104
qw422016.E().S(item.Name)
//line app/vmalert/tpl/header.qtpl:95
//line app/vmalert/tpl/header.qtpl:104
qw422016.N().S(`
</a>
</li>
`)
//line app/vmalert/tpl/header.qtpl:98
//line app/vmalert/tpl/header.qtpl:107
}
//line app/vmalert/tpl/header.qtpl:98
//line app/vmalert/tpl/header.qtpl:107
qw422016.N().S(`
</ul>
</div>
</nav>
`)
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
}
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
func writeprintNavItems(qq422016 qtio422016.Writer, r *http.Request, current string, items []NavItem) {
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
streamprintNavItems(qw422016, r, current, items)
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
}
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
func printNavItems(r *http.Request, current string, items []NavItem) string {
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
writeprintNavItems(qb422016, r, current, items)
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
qs422016 := string(qb422016.B)
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
return qs422016
//line app/vmalert/tpl/header.qtpl:102
//line app/vmalert/tpl/header.qtpl:111
}

View File

@@ -1,7 +1,10 @@
package main
import (
"fmt"
"net/http"
"sort"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
@@ -47,3 +50,62 @@ func newTimeSeriesPB(values []float64, timestamps []int64, labels []prompbmarsha
ts.Labels = labels
return ts
}
type curlWriter struct {
b strings.Builder
}
func (cw *curlWriter) string() string {
res := "curl " + cw.b.String()
cw.b.Reset()
return strings.TrimSpace(res)
}
func (cw *curlWriter) addWithEsc(str string) {
escStr := `'` + strings.Replace(str, `'`, `'\''`, -1) + `'`
cw.add(escStr)
}
func (cw *curlWriter) add(str string) {
cw.b.WriteString(str)
cw.b.WriteString(" ")
}
func requestToCurl(req *http.Request) string {
if req.URL == nil {
return ""
}
cw := &curlWriter{}
schema := req.URL.Scheme
requestURL := req.URL.String()
if schema == "" {
schema = "http"
if req.TLS != nil {
schema = "https"
}
requestURL = schema + "://" + req.Host + requestURL
}
if schema == "https" {
cw.add("-k")
}
cw.add("-X")
cw.add(req.Method)
var keys []string
for k := range req.Header {
keys = append(keys, k)
}
sort.Strings(keys)
for _, k := range keys {
cw.add("-H")
cw.addWithEsc(fmt.Sprintf("%s: %s", k, strings.Join(req.Header[k], " ")))
}
cw.addWithEsc(requestURL)
return cw.string()
}

47
app/vmalert/utils_test.go Normal file
View File

@@ -0,0 +1,47 @@
package main
import (
"net/http"
"testing"
)
func TestRequestToCurl(t *testing.T) {
f := func(req *http.Request, exp string) {
got := requestToCurl(req)
if got != exp {
t.Fatalf("expected to have %q; got %q instead", exp, got)
}
}
req, _ := http.NewRequest(http.MethodPost, "foo.com", nil)
f(req, "curl -X POST 'http://foo.com'")
req, _ = http.NewRequest(http.MethodGet, "https://foo.com", nil)
f(req, "curl -k -X GET 'https://foo.com'")
req, _ = http.NewRequest(http.MethodPost, "foo.com", nil)
req.Header.Set("foo", "bar")
req.Header.Set("baz", "qux")
f(req, "curl -X POST -H 'Baz: qux' -H 'Foo: bar' 'http://foo.com'")
req, _ = http.NewRequest(http.MethodPost, "foo.com", nil)
params := req.URL.Query()
params.Add("query", "up")
params.Add("step", "10")
req.URL.RawQuery = params.Encode()
f(req, "curl -X POST 'http://foo.com?query=up&step=10'")
req, _ = http.NewRequest(http.MethodPost, "http://foo.com", nil)
params = req.URL.Query()
params.Add("query", "up")
params.Add("step", "10")
req.URL.RawQuery = params.Encode()
f(req, "curl -X POST 'http://foo.com?query=up&step=10'")
req, _ = http.NewRequest(http.MethodPost, "https://foo.com", nil)
params = req.URL.Query()
params.Add("query", "up")
params.Add("step", "10")
req.URL.RawQuery = params.Encode()
f(req, "curl -k -X POST 'https://foo.com?query=up&step=10'")
}

Binary file not shown.

After

Width:  |  Height:  |  Size: 109 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 41 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 41 KiB

View File

@@ -85,6 +85,14 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
}
WriteAlert(w, r, alert)
return true
case "/vmalert/rule":
rule, err := rh.getRule(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
WriteRuleDetails(w, r, rule)
return true
case "/vmalert/groups":
WriteListGroups(w, r, rh.groups())
return true
@@ -160,7 +168,7 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
if strings.HasPrefix(r.URL.Path, "/api/v1/") {
redirectURL = alert.APILink()
}
httpserver.RedirectPermanent(w, "/"+redirectURL)
httpserver.Redirect(w, "/"+redirectURL)
return true
}
}
@@ -168,8 +176,25 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
const (
paramGroupID = "group_id"
paramAlertID = "alert_id"
paramRuleID = "rule_id"
)
func (rh *requestHandler) getRule(r *http.Request) (APIRule, error) {
groupID, err := strconv.ParseUint(r.FormValue(paramGroupID), 10, 0)
if err != nil {
return APIRule{}, fmt.Errorf("failed to read %q param: %s", paramGroupID, err)
}
ruleID, err := strconv.ParseUint(r.FormValue(paramRuleID), 10, 0)
if err != nil {
return APIRule{}, fmt.Errorf("failed to read %q param: %s", paramRuleID, err)
}
rule, err := rh.m.RuleAPI(groupID, ruleID)
if err != nil {
return APIRule{}, errResponse(err, http.StatusNotFound)
}
return rule, nil
}
func (rh *requestHandler) getAlert(r *http.Request) (*APIAlert, error) {
groupID, err := strconv.ParseUint(r.FormValue(paramGroupID), 10, 0)
if err != nil {

View File

@@ -26,6 +26,7 @@
{% endfunc %}
{% func ListGroups(r *http.Request, groups []APIGroup) %}
{%code prefix := utils.Prefix(r.URL.Path) %}
{%= tpl.Header(r, navItems, "Groups") %}
{% if len(groups) > 0 %}
{%code
@@ -85,6 +86,7 @@
{% else %}
<b>record:</b> {%s r.Name %}
{% endif %}
| <span><a target="_blank" href="{%s prefix+r.WebLink() %}">Details</a></span>
</div>
<div class="col-12">
<code><pre>{%s r.Query %}</pre></code>
@@ -116,7 +118,7 @@
{% else %}
<div>
<p>No items...</p>
<p>No groups...</p>
</div>
{% endif %}
@@ -204,7 +206,7 @@
{% else %}
<div>
<p>No items...</p>
<p>No active alerts...</p>
</div>
{% endif %}
@@ -260,7 +262,7 @@
{% else %}
<div>
<p>No items...</p>
<p>No targets...</p>
</div>
{% endif %}
@@ -284,7 +286,7 @@
}
sort.Strings(annotationKeys)
%}
<div class="display-6 pb-3 mb-3">{%s alert.Name %}<span class="ms-2 badge {% if alert.State=="firing" %}bg-danger{% else %} bg-warning text-dark{% endif %}">{%s alert.State %}</span></div>
<div class="display-6 pb-3 mb-3">Alert: {%s alert.Name %}<span class="ms-2 badge {% if alert.State=="firing" %}bg-danger{% else %} bg-warning text-dark{% endif %}">{%s alert.State %}</span></div>
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
@@ -354,6 +356,125 @@
{% endfunc %}
{% func RuleDetails(r *http.Request, rule APIRule) %}
{%code prefix := utils.Prefix(r.URL.Path) %}
{%= tpl.Header(r, navItems, "") %}
{%code
var labelKeys []string
for k := range rule.Labels {
labelKeys = append(labelKeys, k)
}
sort.Strings(labelKeys)
var annotationKeys []string
for k := range rule.Annotations {
annotationKeys = append(annotationKeys, k)
}
sort.Strings(annotationKeys)
%}
<div class="display-6 pb-3 mb-3">Rule: {%s rule.Name %}<span class="ms-2 badge {% if rule.Health!="ok" %}bg-danger{% else %} bg-warning text-dark{% endif %}">{%s rule.Health %}</span></div>
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
Expr
</div>
<div class="col">
<code><pre>{%s rule.Query %}</pre></code>
</div>
</div>
</div>
{% if rule.Type == "alerting" %}
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
For
</div>
<div class="col">
{%v rule.Duration %} seconds
</div>
</div>
</div>
{% endif %}
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
Labels
</div>
<div class="col">
{% for _, k := range labelKeys %}
<span class="m-1 badge bg-primary">{%s k %}={%s rule.Labels[k] %}</span>
{% endfor %}
</div>
</div>
</div>
{% if rule.Type == "alerting" %}
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
Annotations
</div>
<div class="col">
{% for _, k := range annotationKeys %}
<b>{%s k %}:</b><br>
<p>{%s rule.Annotations[k] %}</p>
{% endfor %}
</div>
</div>
</div>
{% endif %}
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
Group
</div>
<div class="col">
<a target="_blank" href="{%s prefix %}groups#group-{%s rule.GroupID %}">{%s rule.GroupID %}</a>
</div>
</div>
</div>
<br>
<div class="display-6 pb-3">Last {%d len(rule.Updates) %} updates</span>:</div>
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
<th scope="col" title="The time when event was created">Updated at</th>
<th scope="col" style="width: 10%" class="text-center" title="How many samples were returned">Samples</th>
<th scope="col" style="width: 10%" class="text-center" title="How many seconds request took">Duration</th>
<th scope="col" class="text-center" title="Time used for rule execution">Executed at</th>
<th scope="col" class="text-center" title="cURL command with request example">cURL</th>
</tr>
</thead>
<tbody>
{% for _, u := range rule.Updates %}
<tr{% if u.err != nil %} class="alert-danger"{% endif %}>
<td>
<span class="badge bg-primary rounded-pill me-3" title="Updated at">{%s u.time.Format(time.RFC3339) %}</span>
</td>
<td class="text-center" wi>{%d u.samples %}</td>
<td class="text-center">{%f.3 u.duration.Seconds() %}s</td>
<td class="text-center">{%s u.at.Format(time.RFC3339) %}</td>
<td>
<textarea class="curl-area" rows="1" onclick="this.focus();this.select()">{%s requestToCurl(u.req) %}</textarea>
</td>
</tr>
</li>
{% if u.err != nil %}
<tr{% if u.err != nil %} class="alert-danger"{% endif %}>
<td colspan="5">
<span class="alert-danger">{%v u.err %}</span>
</td>
</tr>
{% endif %}
{% endfor %}
{%= tpl.Footer(r) %}
{% endfunc %}
{% func badgeState(state string) %}
{%code
badgeClass := "bg-warning text-dark"

File diff suppressed because it is too large Load Diff

View File

@@ -17,6 +17,7 @@ func TestHandler(t *testing.T) {
alerts: map[uint64]*notifier.Alert{
0: {State: notifier.StateFiring},
},
state: newRuleState(),
}
g := &Group{
Name: "group",
@@ -52,6 +53,22 @@ func TestHandler(t *testing.T) {
t.Run("/", func(t *testing.T) {
getResp(ts.URL, nil, 200)
getResp(ts.URL+"/vmalert", nil, 200)
getResp(ts.URL+"/vmalert/alerts", nil, 200)
getResp(ts.URL+"/vmalert/groups", nil, 200)
getResp(ts.URL+"/vmalert/notifiers", nil, 200)
getResp(ts.URL+"/rules", nil, 200)
})
t.Run("/vmalert/rule", func(t *testing.T) {
a := ar.ToAPI()
getResp(ts.URL+"/vmalert/"+a.WebLink(), nil, 200)
})
t.Run("/vmalert/rule?badParam", func(t *testing.T) {
params := fmt.Sprintf("?%s=0&%s=1", paramGroupID, paramRuleID)
getResp(ts.URL+"/vmalert/rule"+params, nil, 404)
params = fmt.Sprintf("?%s=1&%s=0", paramGroupID, paramRuleID)
getResp(ts.URL+"/vmalert/rule"+params, nil, 404)
})
t.Run("/api/v1/alerts", func(t *testing.T) {

View File

@@ -86,7 +86,7 @@ type GroupAlerts struct {
// see https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#get-apiv1rules
type APIRule struct {
// State must be one of these under following scenarios
// "pending": at least 1 alert in the rule in pending state and no other alert in firing state.
// "pending": at least 1 alert in the rule in pending state and no other alert in firing ruleState.
// "firing": at least 1 alert in the rule in firing state.
// "inactive": no alert in the rule in firing or pending state.
State string `json:"state"`
@@ -116,8 +116,17 @@ type APIRule struct {
// Type of the rule: recording or alerting
DatasourceType string `json:"datasourceType"`
LastSamples int `json:"lastSamples"`
// ID is an unique Alert's ID within a group
// ID is a unique Alert's ID within a group
ID string `json:"id"`
// GroupID is an unique Group's ID
GroupID string `json:"group_id"`
// Updates contains the ordered list of recorded ruleStateEntry objects
Updates []ruleStateEntry `json:"updates"`
}
// WebLink returns a link to the alert which can be used in UI.
func (ar APIRule) WebLink() string {
return fmt.Sprintf("rule?%s=%s&%s=%s",
paramGroupID, ar.GroupID, paramRuleID, ar.ID)
}

View File

@@ -167,7 +167,7 @@ It is recommended using [binary releases](https://github.com/VictoriaMetrics/Vic
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make vmauth` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmauth` binary and puts it into the `bin` folder.
@@ -239,7 +239,7 @@ See the docs at https://docs.victoriametrics.com/vmauth.html .
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
@@ -308,6 +308,8 @@ See the docs at https://docs.victoriametrics.com/vmauth.html .
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
```

View File

@@ -250,7 +250,11 @@ func readAuthConfig(path string) (map[string]*UserInfo, error) {
}
func parseAuthConfig(data []byte) (map[string]*UserInfo, error) {
data = envtemplate.Replace(data)
var err error
data, err = envtemplate.ReplaceBytes(data)
if err != nil {
return nil, fmt.Errorf("cannot expand environment vars: %w", err)
}
var ac AuthConfig
if err := yaml.UnmarshalStrict(data, &ac); err != nil {
return nil, fmt.Errorf("cannot unmarshal AuthConfig data: %w", err)

View File

@@ -39,10 +39,19 @@ func createTargetURL(ui *UserInfo, uOrig *url.URL) (*url.URL, []Header, error) {
u := *uOrig
// Prevent from attacks with using `..` in r.URL.Path
u.Path = path.Clean(u.Path)
if !strings.HasSuffix(u.Path, "/") && strings.HasSuffix(uOrig.Path, "/") {
// The path.Clean() removes traling slash.
// Return it back if needed.
// This should fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1752
u.Path += "/"
}
if !strings.HasPrefix(u.Path, "/") {
u.Path = "/" + u.Path
}
u.Path = strings.TrimSuffix(u.Path, "/")
if u.Path == "/" {
// See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1554
u.Path = ""
}
for _, e := range ui.URLMap {
for _, sp := range e.SrcPaths {
if sp.match(u.Path) {

View File

@@ -2,13 +2,6 @@
`vmbackup` creates VictoriaMetrics data backups from [instant snapshots](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots).
Supported storage systems for backups:
* [GCS](https://cloud.google.com/storage/). Example: `gs://<bucket>/<path/to/backup>`
* [S3](https://aws.amazon.com/s3/). Example: `s3://<bucket>/<path/to/backup>`
* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio), [Ceph](https://docs.ceph.com/en/pacific/radosgw/s3/) or [Swift](https://platform.swiftstack.com/docs/admin/middleware/s3_middleware.html). See [these docs](#advanced-usage) for details.
* Local filesystem. Example: `fs://</absolute/path/to/backup>`. Note that `vmbackup` prevents from storing the backup into the directory pointed by `-storageDataPath` command-line flag, since this directory should be managed solely by VictoriaMetrics or `vmstorage`.
`vmbackup` supports incremental and full backups. Incremental backups are created automatically if the destination path already contains data from the previous backup.
Full backups can be sped up with `-origin` pointing to an already existing backup on the same remote storage. In this case `vmbackup` makes server-side copy for the shared
data between the existing backup and new backup. It saves time and costs on data transfer.
@@ -22,6 +15,16 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
See also [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html) tool built on top of `vmbackup`. This tool simplifies
creation of hourly, daily, weekly and monthly backups.
## Supported storage types
`vmbackup` supports the following `-dst` storage types:
* [GCS](https://cloud.google.com/storage/). Example: `gs://<bucket>/<path/to/backup>`
* [S3](https://aws.amazon.com/s3/). Example: `s3://<bucket>/<path/to/backup>`
* [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs/). Example: `azblob://<bucket>/<path/to/backup>`
* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio), [Ceph](https://docs.ceph.com/en/pacific/radosgw/s3/) or [Swift](https://platform.swiftstack.com/docs/admin/middleware/s3_middleware.html). See [these docs](#advanced-usage) for details.
* Local filesystem. Example: `fs://</absolute/path/to/backup>`. Note that `vmbackup` prevents from storing the backup into the directory pointed by `-storageDataPath` command-line flag, since this directory should be managed solely by VictoriaMetrics or `vmstorage`.
## Use cases
### Regular backups
@@ -29,7 +32,7 @@ creation of hourly, daily, weekly and monthly backups.
Regular backup can be performed with the following command:
```console
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/<path/to/new/backup>
./vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/<path/to/new/backup>
```
* `</path/to/victoria-metrics-data>` - path to VictoriaMetrics data pointed by `-storageDataPath` command-line flag in single-node VictoriaMetrics or in cluster `vmstorage`.
@@ -74,7 +77,7 @@ The command will upload only changed data to `gs://<bucket>/latest`.
* Run the following command once a day:
```console
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/<YYYYMMDD> -origin=gs://<bucket>/latest
./vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/<YYYYMMDD> -origin=gs://<bucket>/latest
```
Where `<daily-snapshot>` is the snapshot for the last day `<YYYYMMDD>`.
@@ -151,6 +154,11 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-email"
}
```
* Obtaining credentials from env variables.
- For AWS S3 compatible storages set env variable `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
Also you can set env variable `AWS_SHARED_CREDENTIALS_FILE` with path to credentials file.
- For GCE cloud storage set env variable `GOOGLE_APPLICATION_CREDENTIALS` with path to credentials file.
- For Azure storage either set env variables `AZURE_STORAGE_ACCOUNT_NAME` and `AZURE_STORAGE_ACCOUNT_KEY`, or `AZURE_STORAGE_ACCOUNT_CONNECTION_STRING`.
* Usage with s3 custom url endpoint. It is possible to use `vmbackup` with s3 compatible storages like minio, cloudian, etc.
You have to add a custom url endpoint via flag:
@@ -179,7 +187,7 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-dst string
Where to put the backup on the remote storage. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
Where to put the backup on the remote storage. Example: gs://bucket/path/to/backup, s3://bucket/path/to/backup, azblob://bucket/path/to/backup or fs:///path/to/local/backup/dir
-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
@@ -188,7 +196,7 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
@@ -266,6 +274,8 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
```
@@ -276,7 +286,7 @@ It is recommended using [binary releases](https://github.com/VictoriaMetrics/Vic
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make vmbackup` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmbackup` binary and puts it into the `bin` folder.

View File

@@ -30,7 +30,7 @@ var (
snapshotDeleteURL = flag.String("snapshot.deleteURL", "", "VictoriaMetrics delete snapshot url. Optional. Will be generated from -snapshot.createURL if not provided. "+
"All created snapshots will be automatically deleted. Example: http://victoriametrics:8428/snapshot/delete")
dst = flag.String("dst", "", "Where to put the backup on the remote storage. "+
"Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir\n"+
"Example: gs://bucket/path/to/backup, s3://bucket/path/to/backup, azblob://bucket/path/to/backup or fs:///path/to/local/backup/dir\n"+
"-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded")
origin = flag.String("origin", "", "Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups")
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce backup duration")

View File

@@ -1,17 +1,23 @@
## vmbackupmanager
***vmbackupmanager is a part of [enterprise package](https://victoriametrics.com/products/enterprise/). It is available for download and evaluation at [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)***
***vmbackupmanager is a part of [enterprise package](https://docs.victoriametrics.com/enterprise.html). It is available for download and evaluation at [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)***
The VictoriaMetrics backup manager automates regular backup procedures. It supports the following backup intervals: **hourly**, **daily**, **weekly** and **monthly**. Multiple backup intervals may be configured simultaneously. I.e. the backup manager creates hourly backups every hour, while it creates daily backups every day, etc. Backup manager must have read access to the storage data, so best practice is to install it on the same machine (or as a sidecar) where the storage node is installed.
The backup service makes a backup every hour and puts it to the latest folder and then copies data to the folders which represent the backup intervals (hourly, daily, weekly and monthly)
The VictoriaMetrics backup manager automates regular backup procedures. It supports the following backup intervals: **hourly**, **daily**, **weekly** and **monthly**.
Multiple backup intervals may be configured simultaneously. I.e. the backup manager creates hourly backups every hour, while it creates daily backups every day, etc.
Backup manager must have read access to the storage data, so best practice is to install it on the same machine (or as a sidecar) where the storage node is installed.
The backup service makes a backup every hour and puts it to the latest folder and then copies data to the folders
which represent the backup intervals (hourly, daily, weekly and monthly)
The required flags for running the service are as follows:
* -eula - should be true and means that you have the legal right to run a backup manager. That can either be a signed contract or an email with confirmation to run the service in a trial period
* -storageDataPath - path to VictoriaMetrics or vmstorage data path to make backup from
* -eula - should be true and means that you have the legal right to run a backup manager. That can either be a signed contract or an email
with confirmation to run the service in a trial period.
* -storageDataPath - path to VictoriaMetrics or vmstorage data path to make backup from.
* -snapshot.createURL - VictoriaMetrics creates snapshot URL which will automatically be created during backup. Example: <http://victoriametrics:8428/snapshot/create>
* -dst - backup destination at s3, gcs or local filesystem
* -credsFilePath - path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set. See [https://cloud.google.com/iam/docs/creating-managing-service-account-keys](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) and [https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html](https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html)
* -dst - backup destination at [the supported storage types](https://docs.victoriametrics.com/vmbackup.html#supported-storage-types).
* -credsFilePath - path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See [https://cloud.google.com/iam/docs/creating-managing-service-account-keys](https://cloud.google.com/iam/docs/creating-managing-service-account-keys)
and [https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html](https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html).
Backup schedule is controlled by the following flags:
@@ -36,7 +42,11 @@ To get the full list of supported flags please run the following command:
./vmbackupmanager --help
```
The service creates a **full** backup each run. This means that the system can be restored fully from any particular backup using vmrestore. Backup manager uploads only the data that has been changed or created since the most recent backup (incremental backup).
The service creates a **full** backup each run. This means that the system can be restored fully
from any particular backup using [vmrestore](https://docs.victoriametrics.com/vmrestore.html).
Backup manager uploads only the data that has been changed or created since the most recent backup (incremental backup).
This reduces the consumed network traffic and the time needed for performing the backup.
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for details.
*Please take into account that the first backup upload could take a significant amount of time as it needs to upload all of the data.*
@@ -47,7 +57,7 @@ There are two flags which could help with performance tuning:
## Example of Usage
GCS and cluster version. You need to have a credentials file in json format with following structure
GCS and cluster version. You need to have a credentials file in json format with following structure:
credentials.json
@@ -72,14 +82,14 @@ Backup manager launched with the following configuration:
```console
export NODE_IP=192.168.0.10
export VMSTORAGE_ENDPOINT=http://127.0.0.1:8428
./vmbackupmanager -dst=gs://vmstorage-data/$NODE_IP -credsFilePath=credentials.json -storageDataPath=/vmstorage-data -snapshot.createURL=$VMSTORAGE_ENDPOINT/snapshot/create -eula
./vmbackupmanager -dst=gs://vmstorage-data/$NODE_IP -credsFilePath=credentials.json -storageDataPath=/vmstorage-data -snapshot.createURL=$VMSTORAGE_ENDPOINT/snapshot/create -eula
```
Expected logs in vmbackupmanager:
```console
info lib/backup/actions/backup.go:131 server-side copied 81 out of 81 parts from GCS{bucket: "vmstorage-data", dir: "192.168.0.10//latest/"} to GCS{bucket: "vmstorage-data", dir: "192.168.0.10//weekly/2020-34/"} in 2.549833008s
info lib/backup/actions/backup.go:169 backed up 853315 bytes in 2.882 seconds; deleted 0 bytes; server-side copied 853315 bytes; uploaded 0 bytes
info lib/backup/actions/backup.go:169 backed up 853315 bytes in 2.882 seconds; deleted 0 bytes; server-side copied 853315 bytes; uploaded 0 bytes
```
Expected logs in vmstorage:
@@ -93,7 +103,7 @@ info VictoriaMetrics/lib/storage/storage.go:319 deleted snapshot "/vmstora
The result on the GCS bucket
* The root folder
![root](vmbackupmanager_root_folder.png)
* The latest folder
@@ -141,6 +151,142 @@ The result on the GCS bucket. We see only 3 daily backups:
![daily](vmbackupmanager_rp_daily_2.png)
## API methods
`vmbackupmanager` exposes the following API methods:
* GET `/api/v1/backups` - returns list of backups in remote storage.
Example output:
```json
["daily/2022-10-06","daily/2022-10-10","hourly/2022-10-04:13","hourly/2022-10-06:12","hourly/2022-10-06:13","hourly/2022-10-10:14","hourly/2022-10-10:16","monthly/2022-10","weekly/2022-40","weekly/2022-41"]
```
* POST `/api/v1/restore` - saves backup name to restore when [performing restore](#restore-commands).
Example request body:
```json
{"backup":"daily/2022-10-06"}
```
* GET `/api/v1/restore` - returns backup name from restore mark if it exists.
Example response:
```json
{"backup":"daily/2022-10-06"}
```
* DELETE `/api/v1/restore` - delete restore mark.
## CLI
`vmbackupmanager` exposes CLI commands to work with [API methods](#api-methods) without external dependencies.
Supported commands:
```console
vmbackupmanager backup
vmbackupmanager backup list
List backups in remote storage
vmbackupmanager restore
Restore backup specified by restore mark if it exists
vmbackupmanager restore get
Get restore mark if it exists
vmbackupmanager restore delete
Delete restore mark if it exists
vmbackupmanager restore create [backup_name]
Create restore mark
```
By default, CLI commands are using `http://127.0.0.1:8300` endpoint to reach `vmbackupmanager` API.
It can be changed by using flag:
```
-apiURL string
vmbackupmanager address to perform API requests (default "http://127.0.0.1:8300")
```
### Backup commands
`vmbackupmanager backup list` lists backups in remote storage:
```console
$ ./vmbackupmanager backup list
["daily/2022-10-06","daily/2022-10-10","hourly/2022-10-04:13","hourly/2022-10-06:12","hourly/2022-10-06:13","hourly/2022-10-10:14","hourly/2022-10-10:16","monthly/2022-10","weekly/2022-40","weekly/2022-41"]
```
### Restore commands
Restore commands are used to create, get and delete restore mark.
Restore mark is used by `vmbackupmanager` to store backup name to restore when running restore.
Create restore mark:
```console
$ ./vmbackupmanager restore create daily/2022-10-06
```
Get restore mark if it exists:
```console
$ ./vmbackupmanager restore get
{"backup":"daily/2022-10-06"}
```
Delete restore mark if it exists:
```console
$ ./vmbackupmanager restore delete
```
Perform restore:
```console
$ /vmbackupmanager-prod restore -dst=gs://vmstorage-data/$NODE_IP -credsFilePath=credentials.json -storageDataPath=/vmstorage-data
```
Note that `vmsingle` or `vmstorage` should be stopped before performing restore.
If restore mark doesn't exist at `storageDataPath`(restore wasn't requested) `vmbackupmanager restore` will exit with successful status code.
### How to restore backup via CLI
1. Run `vmbackupmanager backup list` to get list of available backups:
```console
$ /vmbackupmanager-prod backup list
["daily/2022-10-06","daily/2022-10-10","hourly/2022-10-04:13","hourly/2022-10-06:12","hourly/2022-10-06:13","hourly/2022-10-10:14","hourly/2022-10-10:16","monthly/2022-10","weekly/2022-40","weekly/2022-41"]
```
2. Run `vmbackupmanager restore create` to create restore mark:
- Use relative path to backup to restore from currently used remote storage:
```console
$ /vmbackupmanager-prod restore create daily/2022-10-06
```
- Use full path to backup to restore from any remote storage:
```console
$ /vmbackupmanager-prod restore create azblob://test1/vmbackupmanager/daily/2022-10-06
```
3. Stop `vmstorage` or `vmsingle` node
4. Run `vmbackupmanager restore` to restore backup:
```console
$ /vmbackupmanager-prod restore -credsFilePath=credentials.json -storageDataPath=/vmstorage-data
```
5. Start `vmstorage` or `vmsingle` node
### How to restore in Kubernetes
1. Enter container running `vmbackupmanager`
2. Use `vmbackupmanager backup list` to get list of available backups:
```console
$ /vmbackupmanager-prod backup list
["daily/2022-10-06","daily/2022-10-10","hourly/2022-10-04:13","hourly/2022-10-06:12","hourly/2022-10-06:13","hourly/2022-10-10:14","hourly/2022-10-10:16","monthly/2022-10","weekly/2022-40","weekly/2022-41"]
```
3. Use `vmbackupmanager restore create` to create restore mark:
- Use relative path to backup to restore from currently used remote storage:
```console
$ /vmbackupmanager-prod restore create daily/2022-10-06
```
- Use full path to backup to restore from any remote storage:
```console
$ /vmbackupmanager-prod restore create azblob://test1/vmbackupmanager/daily/2022-10-06
```
4. Restart pod
## Configuration
### Flags
@@ -153,6 +299,13 @@ The shortlist of configuration flags is the following:
```
vmbackupmanager performs regular backups according to the provided configs.
subcommands:
backup: provides auxiliary backup-related commands
restore: restores backup specified by restore mark if it exists
command-line flags:
-apiURL string
vmbackupmanager address to perform API requests (default "http://127.0.0.1:8300")
-concurrency int
The number of concurrent workers. Higher concurrency may reduce backup duration (default 10)
-configFilePath string
@@ -182,7 +335,7 @@ vmbackupmanager performs regular backups according to the provided configs.
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
@@ -265,6 +418,8 @@ vmbackupmanager performs regular backups according to the provided configs.
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
```

View File

@@ -242,6 +242,7 @@ Found 40000 timeseries to import. Continue? [Y/n] y
Vmctl maps InfluxDB data the same way as VictoriaMetrics does by using the following rules:
- `influx-database` arg is mapped into `db` label value unless `db` tag exists in the InfluxDB line.
If you want to skip this mapping just enable flag `influx-skip-database-label`.
- Field names are mapped to time series names prefixed with {measurement}{separator} value,
where {separator} equals to _ by default.
It can be changed with `--influx-measurement-field-separator` command-line flag.
@@ -521,6 +522,74 @@ To avoid such situation try to filter out VM process metrics via `--vm-native-fi
Instead, use [relabeling in VictoriaMetrics](https://github.com/VictoriaMetrics/vmctl/issues/4#issuecomment-683424375).
5. When importing in or from cluster version remember to use correct [URL format](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format)
and specify `accountID` param.
6. When migrating large volumes of data it might be useful to use `--vm-native-step-interval` flag to split single process into smaller steps.
#### Using time-based chunking of migration
It is possible split migration process into set of smaller batches based on time. This is especially useful when migrating large volumes of data as this adds indication of progress and ability to restore process from certain point in case of failure.
To use this you need to specify `--vm-native-step-interval` flag. Supported values are: `month`, `day`, `hour`.
Note that in order to use this it is required `--vm-native-filter-time-start` to be set to calculate time ranges for export process.
Every range is being processed independently, which means that:
- after range processing is finished all data within range is migrated
- if process fails on one of stages it is guaranteed that data of prior stages is already written, so it is possible to restart process starting from failed range
It is recommended using the `month` step when migrating the data over multiple months, since the migration with `day` and `hour` steps may take longer time to complete
because of additional overhead.
Usage example:
```console
./vmctl vm-native
--vm-native-filter-time-start 2022-06-17T00:07:00Z \
--vm-native-filter-time-end 2022-10-03T00:07:00Z \
--vm-native-src-addr http://localhost:8428 \
--vm-native-dst-addr http://localhost:8528 \
--vm-native-step-interval=month
VictoriaMetrics Native import mode
2022/08/30 19:48:24 Processing range 1/5: 2022-06-17T00:07:00Z - 2022-06-30T23:59:59Z
2022/08/30 19:48:24 Initing export pipe from "http://localhost:8428" with filters:
filter: match[]={__name__!=""}
start: 2022-06-17T00:07:00Z
end: 2022-06-30T23:59:59Z
Initing import process to "http://localhost:8428":
2022/08/30 19:48:24 Import finished!
Total: 16 B ↗ Speed: 28.89 KiB p/s
2022/08/30 19:48:24 Processing range 2/5: 2022-07-01T00:00:00Z - 2022-07-31T23:59:59Z
2022/08/30 19:48:24 Initing export pipe from "http://localhost:8428" with filters:
filter: match[]={__name__!=""}
start: 2022-07-01T00:00:00Z
end: 2022-07-31T23:59:59Z
Initing import process to "http://localhost:8428":
2022/08/30 19:48:24 Import finished!
Total: 16 B ↗ Speed: 164.35 KiB p/s
2022/08/30 19:48:24 Processing range 3/5: 2022-08-01T00:00:00Z - 2022-08-31T23:59:59Z
2022/08/30 19:48:24 Initing export pipe from "http://localhost:8428" with filters:
filter: match[]={__name__!=""}
start: 2022-08-01T00:00:00Z
end: 2022-08-31T23:59:59Z
Initing import process to "http://localhost:8428":
2022/08/30 19:48:24 Import finished!
Total: 16 B ↗ Speed: 191.42 KiB p/s
2022/08/30 19:48:24 Processing range 4/5: 2022-09-01T00:00:00Z - 2022-09-30T23:59:59Z
2022/08/30 19:48:24 Initing export pipe from "http://localhost:8428" with filters:
filter: match[]={__name__!=""}
start: 2022-09-01T00:00:00Z
end: 2022-09-30T23:59:59Z
Initing import process to "http://localhost:8428":
2022/08/30 19:48:24 Import finished!
Total: 16 B ↗ Speed: 141.04 KiB p/s
2022/08/30 19:48:24 Processing range 5/5: 2022-10-01T00:00:00Z - 2022-10-03T00:07:00Z
2022/08/30 19:48:24 Initing export pipe from "http://localhost:8428" with filters:
filter: match[]={__name__!=""}
start: 2022-10-01T00:00:00Z
end: 2022-10-03T00:07:00Z
Initing import process to "http://localhost:8428":
2022/08/30 19:48:24 Import finished!
Total: 16 B ↗ Speed: 186.32 KiB p/s
2022/08/30 19:48:24 Total time: 12.680582ms
```
## Verifying exported blocks from VictoriaMetrics
@@ -631,7 +700,7 @@ It is recommended using [binary releases](https://github.com/VictoriaMetrics/Vic
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make vmctl` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmctl` binary and puts it into the `bin` folder.
@@ -660,7 +729,7 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
#### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make vmctl-linux-arm` or `make vmctl-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmctl-linux-arm` or `vmctl-linux-arm64` binary respectively and puts it into the `bin` folder.

View File

@@ -3,7 +3,7 @@
// altogether.
package barpool
import "github.com/dmitryk-dk/pb/v3"
import "github.com/cheggaaa/pb/v3"
var pool = pb.NewPool()

View File

@@ -4,6 +4,8 @@ import (
"fmt"
"github.com/urfave/cli/v2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/stepper"
)
const (
@@ -318,6 +320,7 @@ const (
vmNativeFilterMatch = "vm-native-filter-match"
vmNativeFilterTimeStart = "vm-native-filter-time-start"
vmNativeFilterTimeEnd = "vm-native-filter-time-end"
vmNativeStepInterval = "vm-native-step-interval"
vmNativeSrcAddr = "vm-native-src-addr"
vmNativeSrcUser = "vm-native-src-user"
@@ -345,6 +348,10 @@ var (
Name: vmNativeFilterTimeEnd,
Usage: "The time filter may contain either unix timestamp in seconds or RFC3339 values. E.g. '2020-01-01T20:07:00Z'",
},
&cli.StringFlag{
Name: vmNativeStepInterval,
Usage: fmt.Sprintf("Split export data into chunks. Requires setting --%s. Valid values are '%s','%s','%s'.", vmNativeFilterTimeStart, stepper.StepMonth, stepper.StepDay, stepper.StepHour),
},
&cli.StringFlag{
Name: vmNativeSrcAddr,
Usage: "VictoriaMetrics address to perform export from. \n" +

View File

@@ -11,6 +11,8 @@ import (
"syscall"
"time"
"github.com/urfave/cli/v2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/prometheus"
@@ -18,7 +20,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/native"
"github.com/urfave/cli/v2"
)
func main() {
@@ -33,6 +34,9 @@ func main() {
Name: "vmctl",
Usage: "VictoriaMetrics command-line tool",
Version: buildinfo.Version,
// Disable `-version` flag to avoid conflict with lib/buildinfo flags
// see https://github.com/urfave/cli/issues/1560
HideVersion: true,
Commands: []*cli.Command{
{
Name: "opentsdb",
@@ -161,6 +165,7 @@ func main() {
match: c.String(vmNativeFilterMatch),
timeStart: c.String(vmNativeFilterTimeStart),
timeEnd: c.String(vmNativeFilterTimeEnd),
chunk: c.String(vmNativeStepInterval),
},
src: &vmNativeClient{
addr: strings.Trim(c.String(vmNativeSrcAddr), "/"),

View File

@@ -8,7 +8,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm"
"github.com/dmitryk-dk/pb/v3"
"github.com/cheggaaa/pb/v3"
)
type otsdbProcessor struct {

View File

@@ -6,13 +6,12 @@ import (
"strconv"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
)
var (
allowedNames = regexp.MustCompile("^[a-zA-Z][a-zA-Z0-9_:]*$")
allowedFirstChar = regexp.MustCompile("^[a-zA-Z]")
replaceChars = regexp.MustCompile("[^a-zA-Z0-9_:]")
allowedTagKeys = regexp.MustCompile("^[a-zA-Z][a-zA-Z0-9_]*$")
)
func convertDuration(duration string) (time.Duration, error) {
@@ -180,13 +179,8 @@ func modifyData(msg Metric, normalize bool) (Metric, error) {
}
/*
replace bad characters in metric name with _ per the data model
only replace if needed to reduce string processing time
*/
if !allowedNames.MatchString(name) {
finalMsg.Metric = replaceChars.ReplaceAllString(name, "_")
} else {
finalMsg.Metric = name
}
finalMsg.Metric = promrelabel.SanitizeName(name)
// replace bad characters in tag keys with _ per the data model
for key, value := range msg.Tags {
// if normalization requested, lowercase the key and value
@@ -196,11 +190,8 @@ func modifyData(msg Metric, normalize bool) (Metric, error) {
}
/*
replace all explicitly bad characters with _
only replace if needed to reduce string processing time
*/
if !allowedTagKeys.MatchString(key) {
key = replaceChars.ReplaceAllString(key, "_")
}
key = promrelabel.SanitizeName(key)
// tags that start with __ are considered custom stats for internal prometheus stuff, we should drop them
if !strings.HasPrefix(key, "__") {
finalMsg.Tags[key] = value

View File

@@ -0,0 +1,63 @@
package stepper
import (
"fmt"
"time"
)
const (
// StepMonth represents a one month interval
StepMonth string = "month"
// StepDay represents a one day interval
StepDay string = "day"
// StepHour represents a one hour interval
StepHour string = "hour"
)
// SplitDateRange splits start-end range in a subset of ranges respecting the given step
// Ranges with granularity of StepMonth are aligned to 1st of each month in order to improve export efficiency at block transfer level
func SplitDateRange(start, end time.Time, step string) ([][]time.Time, error) {
if start.After(end) {
return nil, fmt.Errorf("start time %q should come before end time %q", start.Format(time.RFC3339), end.Format(time.RFC3339))
}
var nextStep func(time.Time) (time.Time, time.Time)
switch step {
case StepMonth:
nextStep = func(t time.Time) (time.Time, time.Time) {
endOfMonth := time.Date(t.Year(), t.Month()+1, 1, 0, 0, 0, 0, t.Location()).Add(-1 * time.Nanosecond)
if t == endOfMonth {
endOfMonth = time.Date(t.Year(), t.Month()+2, 1, 0, 0, 0, 0, t.Location()).Add(-1 * time.Nanosecond)
t = time.Date(t.Year(), t.Month()+1, 1, 0, 0, 0, 0, t.Location())
}
return t, endOfMonth
}
case StepDay:
nextStep = func(t time.Time) (time.Time, time.Time) {
return t, t.AddDate(0, 0, 1)
}
case StepHour:
nextStep = func(t time.Time) (time.Time, time.Time) {
return t, t.Add(time.Hour * 1)
}
default:
return nil, fmt.Errorf("failed to parse step value, valid values are: '%s', '%s', '%s'. provided: '%s'", StepMonth, StepDay, StepHour, step)
}
currentStep := start
ranges := make([][]time.Time, 0)
for end.After(currentStep) {
s, e := nextStep(currentStep)
if e.After(end) {
e = end
}
ranges = append(ranges, []time.Time{s, e})
currentStep = e
}
return ranges, nil
}

View File

@@ -0,0 +1,152 @@
package stepper
import (
"reflect"
"testing"
"time"
)
type testTimeRange []string
func mustParseDatetime(t string) time.Time {
result, err := time.Parse(time.RFC3339, t)
if err != nil {
panic(err)
}
return result
}
func Test_splitDateRange(t *testing.T) {
type args struct {
start string
end string
granularity string
}
tests := []struct {
name string
args args
want []testTimeRange
wantErr bool
}{
{
name: "validates start is before end",
args: args{
start: "2022-02-01T00:00:00Z",
end: "2022-01-01T00:00:00Z",
granularity: StepMonth,
},
want: nil,
wantErr: true,
},
{
name: "validates granularity value",
args: args{
start: "2022-01-01T00:00:00Z",
end: "2022-02-01T00:00:00Z",
granularity: "non-existent-format",
},
want: nil,
wantErr: true,
},
{
name: "month chunking",
args: args{
start: "2022-01-03T11:11:11Z",
end: "2022-03-03T12:12:12Z",
granularity: StepMonth,
},
want: []testTimeRange{
{
"2022-01-03T11:11:11Z",
"2022-01-31T23:59:59.999999999Z",
},
{
"2022-02-01T00:00:00Z",
"2022-02-28T23:59:59.999999999Z",
},
{
"2022-03-01T00:00:00Z",
"2022-03-03T12:12:12Z",
},
},
wantErr: false,
},
{
name: "daily chunking",
args: args{
start: "2022-01-03T11:11:11Z",
end: "2022-01-05T12:12:12Z",
granularity: StepDay,
},
want: []testTimeRange{
{
"2022-01-03T11:11:11Z",
"2022-01-04T11:11:11Z",
},
{
"2022-01-04T11:11:11Z",
"2022-01-05T11:11:11Z",
},
{
"2022-01-05T11:11:11Z",
"2022-01-05T12:12:12Z",
},
},
wantErr: false,
},
{
name: "hourly chunking",
args: args{
start: "2022-01-03T11:11:11Z",
end: "2022-01-03T14:14:14Z",
granularity: StepHour,
},
want: []testTimeRange{
{
"2022-01-03T11:11:11Z",
"2022-01-03T12:11:11Z",
},
{
"2022-01-03T12:11:11Z",
"2022-01-03T13:11:11Z",
},
{
"2022-01-03T13:11:11Z",
"2022-01-03T14:11:11Z",
},
{
"2022-01-03T14:11:11Z",
"2022-01-03T14:14:14Z",
},
},
wantErr: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
start := mustParseDatetime(tt.args.start)
end := mustParseDatetime(tt.args.end)
got, err := SplitDateRange(start, end, tt.args.granularity)
if (err != nil) != tt.wantErr {
t.Errorf("splitDateRange() error = %v, wantErr %v", err, tt.wantErr)
return
}
var testExpectedResults [][]time.Time
if tt.want != nil {
testExpectedResults = make([][]time.Time, 0)
for _, dr := range tt.want {
testExpectedResults = append(testExpectedResults, []time.Time{
mustParseDatetime(dr[0]),
mustParseDatetime(dr[1]),
})
}
}
if !reflect.DeepEqual(got, testExpectedResults) {
t.Errorf("splitDateRange() got = %v, want %v", got, testExpectedResults)
}
})
}
}

View File

@@ -15,7 +15,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/barpool"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/limiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/dmitryk-dk/pb/v3"
"github.com/cheggaaa/pb/v3"
)
// Config contains list of params to configure

View File

@@ -6,9 +6,12 @@ import (
"io"
"log"
"net/http"
"time"
"github.com/cheggaaa/pb/v3"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/barpool"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/limiter"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/stepper"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm"
)
@@ -31,6 +34,7 @@ type filter struct {
match string
timeStart string
timeEnd string
chunk string
}
func (f filter) String() string {
@@ -52,10 +56,54 @@ const (
)
func (p *vmNativeProcessor) run(ctx context.Context) error {
if p.filter.chunk == "" {
return p.runSingle(ctx, p.filter)
}
startOfRange, err := time.Parse(time.RFC3339, p.filter.timeStart)
if err != nil {
return fmt.Errorf("failed to parse %s, provided: %s, expected format: %s, error: %v", vmNativeFilterTimeStart, p.filter.timeStart, time.RFC3339, err)
}
var endOfRange time.Time
if p.filter.timeEnd != "" {
endOfRange, err = time.Parse(time.RFC3339, p.filter.timeEnd)
if err != nil {
return fmt.Errorf("failed to parse %s, provided: %s, expected format: %s, error: %v", vmNativeFilterTimeEnd, p.filter.timeEnd, time.RFC3339, err)
}
} else {
endOfRange = time.Now()
}
ranges, err := stepper.SplitDateRange(startOfRange, endOfRange, p.filter.chunk)
if err != nil {
return fmt.Errorf("failed to create date ranges for the given time filters: %v", err)
}
for rangeIdx, r := range ranges {
formattedStartTime := r[0].Format(time.RFC3339)
formattedEndTime := r[1].Format(time.RFC3339)
log.Printf("Processing range %d/%d: %s - %s \n", rangeIdx+1, len(ranges), formattedStartTime, formattedEndTime)
f := filter{
match: p.filter.match,
timeStart: formattedStartTime,
timeEnd: formattedEndTime,
}
err := p.runSingle(ctx, f)
if err != nil {
log.Printf("processing failed for range %d/%d: %s - %s \n", rangeIdx+1, len(ranges), formattedStartTime, formattedEndTime)
return err
}
}
return nil
}
func (p *vmNativeProcessor) runSingle(ctx context.Context, f filter) error {
pr, pw := io.Pipe()
fmt.Printf("Initing export pipe from %q with filters: %s\n", p.src.addr, p.filter)
exportReader, err := p.exportPipe(ctx)
log.Printf("Initing export pipe from %q with filters: %s\n", p.src.addr, f)
exportReader, err := p.exportPipe(ctx, f)
if err != nil {
return fmt.Errorf("failed to init export pipe: %s", err)
}
@@ -83,13 +131,20 @@ func (p *vmNativeProcessor) run(ctx context.Context) error {
}()
fmt.Printf("Initing import process to %q:\n", p.dst.addr)
bar := barpool.AddWithTemplate(nativeBarTpl, 0)
pool := pb.NewPool()
bar := pb.ProgressBarTemplate(nativeBarTpl).New(0)
pool.Add(bar)
barReader := bar.NewProxyReader(exportReader)
if err := barpool.Start(); err != nil {
if err := pool.Start(); err != nil {
log.Printf("error start process bars pool: %s", err)
return err
}
defer barpool.Stop()
defer func() {
bar.Finish()
if err := pool.Stop(); err != nil {
fmt.Printf("failed to stop barpool: %+v\n", err)
}
}()
w := io.Writer(pw)
if p.rateLimit > 0 {
@@ -111,7 +166,7 @@ func (p *vmNativeProcessor) run(ctx context.Context) error {
return nil
}
func (p *vmNativeProcessor) exportPipe(ctx context.Context) (io.ReadCloser, error) {
func (p *vmNativeProcessor) exportPipe(ctx context.Context, f filter) (io.ReadCloser, error) {
u := fmt.Sprintf("%s/%s", p.src.addr, nativeExportAddr)
req, err := http.NewRequestWithContext(ctx, "GET", u, nil)
if err != nil {
@@ -119,12 +174,12 @@ func (p *vmNativeProcessor) exportPipe(ctx context.Context) (io.ReadCloser, erro
}
params := req.URL.Query()
params.Set("match[]", p.filter.match)
if p.filter.timeStart != "" {
params.Set("start", p.filter.timeStart)
params.Set("match[]", f.match)
if f.timeStart != "" {
params.Set("start", f.timeStart)
}
if p.filter.timeEnd != "" {
params.Set("end", p.filter.timeEnd)
if f.timeEnd != "" {
params.Set("end", f.timeEnd)
}
req.URL.RawQuery = params.Encode()

View File

@@ -4,6 +4,8 @@ import (
"context"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/stepper"
)
// If you want to run this test:
@@ -16,6 +18,7 @@ import (
const (
matchFilter = `{job="avalanche"}`
timeStartFilter = "2020-01-01T20:07:00Z"
timeEndFilter = "2020-08-01T20:07:00Z"
srcAddr = "http://127.0.0.1:8428"
dstAddr = "http://127.0.0.1:8528"
)
@@ -74,6 +77,26 @@ func Test_vmNativeProcessor_run(t *testing.T) {
closer: func(cancelFunc context.CancelFunc) {},
wantErr: false,
},
{
name: "simulate correct work with chunking",
fields: fields{
filter: filter{
match: matchFilter,
timeStart: timeStartFilter,
timeEnd: timeEndFilter,
chunk: stepper.StepMonth,
},
rateLimit: 0,
dst: &vmNativeClient{
addr: dstAddr,
},
src: &vmNativeClient{
addr: srcAddr,
},
},
closer: func(cancelFunc context.CancelFunc) {},
wantErr: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {

View File

@@ -1,6 +1,6 @@
# vmgateway
***vmgateway is a part of [enterprise package](https://victoriametrics.com/products/enterprise/). It is available for download and evaluation at [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)***
***vmgateway is a part of [enterprise package](https://docs.victoriametrics.com/enterprise.html). It is available for download and evaluation at [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)***
<img alt="vmgateway" src="vmgateway-overview.jpeg">
@@ -13,7 +13,7 @@
* Provides access by tenantID in the Cluster version
* Allows for separate write/read/admin access to data
`vmgateway` is included in our [enterprise packages](https://victoriametrics.com/products/enterprise/).
`vmgateway` is included in our [enterprise packages](https://docs.victoriametrics.com/enterprise.html).
## Access Control
@@ -149,7 +149,7 @@ cat << EOF > limit.yaml
limits:
- type: queries
value: 100
- type: rows_inserted
- type: rows_inserted
value: 100000
- type: new_series
value: 1000
@@ -168,7 +168,7 @@ curl 'http://localhost:8431/api/v1/import/prometheus' -X POST -d 'foo{bar="baz1
# read metric from tenant 1:5
curl 'http://localhost:8431/api/v1/labels' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2MjAxNjIwMDAwMDAsInZtX2FjY2VzcyI6eyJ0ZW5hbnRfaWQiOnsiYWNjb3VudF9pZCI6MTV9fX0.PB1_KXDKPUp-40pxOGk6lt_jt9Yq80PIMpWVJqSForQ'
# check rate limit
# check rate limit
```
## Configuration
@@ -176,6 +176,8 @@ curl 'http://localhost:8431/api/v1/labels' -H 'Authorization: Bearer eyJhbGciOiJ
The shortlist of configuration flags include the following:
```console
-auth.httpHeader string
HTTP header name to look for JWT authorization token (default "Authorization")
-clusterMode
enable this for the cluster version
-datasource.appendTypePrefix
@@ -192,6 +194,8 @@ The shortlist of configuration flags include the following:
Optional path to bearer token file to use for -datasource.url.
-datasource.disableKeepAlive
Whether to disable long-lived connections to the datasource. If true, disables HTTP keep-alives and will only use the connection to the server for a single HTTP request.
-datasource.headers string
Optional HTTP extraHeaders to send with each request to the corresponding -datasource.url. For example, -datasource.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -datasource.url. Multiple headers must be delimited by '^^': -datasource.headers='header1:value1^^header2:value2'
-datasource.lookback duration
Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query.
-datasource.maxIdleConnections int
@@ -207,11 +211,13 @@ The shortlist of configuration flags include the following:
-datasource.oauth2.tokenUrl string
Optional OAuth2 tokenURL to use for -datasource.url.
-datasource.queryStep duration
queryStep defines how far a value can fallback to when evaluating queries. For example, if datasource.queryStep=15s then param "step" with value "15s" will be added to every query.If queryStep isn't specified, rule's evaluationInterval will be used instead.
How far a value can fallback to when evaluating queries. For example, if -datasource.queryStep=15s then param "step" with value "15s" will be added to every query. If set to 0, rule's evaluation interval will be used instead. (default 5m0s)
-datasource.queryTimeAlignment
Whether to align "time" parameter with evaluation interval.Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257 (default true)
-datasource.roundDigits int
Adds "round_digits" GET param to datasource requests. In VM "round_digits" limits the number of digits after the decimal point in response values.
-datasource.showURL
Whether to show -datasource.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key
-datasource.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -datasource.url. By default, system CA is used
-datasource.tlsCertFile string
@@ -223,7 +229,7 @@ The shortlist of configuration flags include the following:
-datasource.tlsServerName string
Optional TLS server name to use for connections to -datasource.url. By default, the server name from -datasource.url is used
-datasource.url string
VictoriaMetrics or vmselect url. Required parameter. E.g. http://127.0.0.1:8428 . See also -remoteRead.disablePathAppend
Datasource compatible with Prometheus HTTP API. It can be single node VictoriaMetrics or vmselect URL. Required parameter. E.g. http://127.0.0.1:8428 . See also '-datasource.disablePathAppend', '-datasource.showURL'.
-enable.auth
enables auth with jwt token
-enable.rateLimit
@@ -235,7 +241,7 @@ The shortlist of configuration flags include the following:
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
@@ -311,6 +317,8 @@ The shortlist of configuration flags include the following:
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
-write.url string

View File

@@ -120,6 +120,16 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
staticServer.ServeHTTP(w, r)
return true
}
if strings.HasPrefix(path, "/prometheus/api/v1/import/prometheus") || strings.HasPrefix(path, "/api/v1/import/prometheus") {
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(r); err != nil {
prometheusimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
}
if strings.HasPrefix(path, "/datadog/") {
// Trim suffix from paths starting from /datadog/ in order to support legacy DataDog agent.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2670
@@ -153,15 +163,6 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.WriteHeader(http.StatusNoContent)
return true
case "/prometheus/api/v1/import/prometheus", "/api/v1/import/prometheus":
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(r); err != nil {
prometheusimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/prometheus/api/v1/import/native", "/api/v1/import/native":
nativeimportRequests.Inc()
if err := native.InsertHandler(r); err != nil {

View File

@@ -6,6 +6,7 @@ import (
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
@@ -20,6 +21,10 @@ var (
"See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal")
relabelDebug = flag.Bool("relabelDebug", false, "Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, "+
"then the metrics aren't sent to storage. This is useful for debugging the relabeling configs")
usePromCompatibleNaming = flag.Bool("usePromCompatibleNaming", false, "Whether to replace characters unsupported by Prometheus with underscores "+
"in the ingested metric names and label names. For example, foo.bar{a.b='c'} is transformed into foo_bar{a_b='c'} during data ingestion if this flag is set. "+
"See https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels")
)
// Init must be called after flag.Parse and before using the relabel package.
@@ -34,23 +39,38 @@ func Init() {
logger.Fatalf("cannot load relabelConfig: %s", err)
}
pcsGlobal.Store(pcs)
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
if len(*relabelConfig) == 0 {
return
}
go func() {
for range sighupCh {
configReloads.Inc()
logger.Infof("received SIGHUP; reloading -relabelConfig=%q...", *relabelConfig)
pcs, err := loadRelabelConfig()
if err != nil {
configReloadErrors.Inc()
configSuccess.Set(0)
logger.Errorf("cannot load the updated relabelConfig: %s; preserving the previous config", err)
continue
}
pcsGlobal.Store(pcs)
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
logger.Infof("successfully reloaded -relabelConfig=%q", *relabelConfig)
}
}()
}
var (
configReloads = metrics.NewCounter(`vm_relabel_config_reloads_total`)
configReloadErrors = metrics.NewCounter(`vm_relabel_config_reloads_errors_total`)
configSuccess = metrics.NewCounter(`vm_relabel_config_last_reload_successful`)
configTimestamp = metrics.NewCounter(`vm_relabel_config_last_reload_success_timestamp_seconds`)
)
var pcsGlobal atomic.Value
func loadRelabelConfig() (*promrelabel.ParsedConfigs, error) {
@@ -67,7 +87,7 @@ func loadRelabelConfig() (*promrelabel.ParsedConfigs, error) {
// HasRelabeling returns true if there is global relabeling.
func HasRelabeling() bool {
pcs := pcsGlobal.Load().(*promrelabel.ParsedConfigs)
return pcs.Len() > 0
return pcs.Len() > 0 || *usePromCompatibleNaming
}
// Ctx holds relabeling context.
@@ -87,11 +107,11 @@ func (ctx *Ctx) Reset() {
// The returned labels are valid until the next call to ApplyRelabeling.
func (ctx *Ctx) ApplyRelabeling(labels []prompb.Label) []prompb.Label {
pcs := pcsGlobal.Load().(*promrelabel.ParsedConfigs)
if pcs.Len() == 0 {
if pcs.Len() == 0 && !*usePromCompatibleNaming {
// There are no relabeling rules.
return labels
}
// Convert src to prompbmarshal.Label format suitable for relabeling.
// Convert labels to prompbmarshal.Label format suitable for relabeling.
tmpLabels := ctx.tmpLabels[:0]
for _, label := range labels {
name := bytesutil.ToUnsafeString(label.Name)
@@ -105,13 +125,29 @@ func (ctx *Ctx) ApplyRelabeling(labels []prompb.Label) []prompb.Label {
})
}
// Apply relabeling
tmpLabels = pcs.Apply(tmpLabels, 0, true)
ctx.tmpLabels = tmpLabels
if len(tmpLabels) == 0 {
metricsDropped.Inc()
if *usePromCompatibleNaming {
// Replace unsupported Prometheus chars in label names and metric names with underscores.
for i := range tmpLabels {
label := &tmpLabels[i]
if label.Name == "__name__" {
label.Value = promrelabel.SanitizeName(label.Value)
} else {
label.Name = promrelabel.SanitizeName(label.Name)
}
}
}
if pcs.Len() > 0 {
// Apply relabeling
tmpLabels = pcs.Apply(tmpLabels, 0)
tmpLabels = promrelabel.FinalizeLabels(tmpLabels[:0], tmpLabels)
if len(tmpLabels) == 0 {
metricsDropped.Inc()
}
}
ctx.tmpLabels = tmpLabels
// Return back labels to the desired format.
dst := labels[:0]
for _, label := range tmpLabels {

View File

@@ -1,7 +1,6 @@
# vmrestore
`vmrestore` restores data from backups created by [vmbackup](https://docs.victoriametrics.com/vmbackup.html).
VictoriaMetrics `v1.29.0` and newer versions must be used for working with the restored data.
Restore process can be interrupted at any time. It is automatically resumed from the interruption point
when restarting `vmrestore` with the same args.
@@ -10,19 +9,28 @@ when restarting `vmrestore` with the same args.
VictoriaMetrics must be stopped during the restore process.
```console
vmrestore -src=gs://<bucket>/<path/to/backup> -storageDataPath=<local/path/to/restore>
Run the following command to restore backup from the given `-src` into the given `-storageDataPath`:
```console
./vmrestore -src=<storageType>://<path/to/backup> -storageDataPath=<local/path/to/restore>
```
* `<bucket>` is [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets) name.
* `<path/to/backup>` is the path to backup made with [vmbackup](https://docs.victoriametrics.com/vmbackup.html) on GCS bucket.
* `<storageType>://<path/to/backup>` is the path to backup made with [vmbackup](https://docs.victoriametrics.com/vmbackup.html).
`vmrestore` can restore backups from the following storage types:
* [GCS](https://cloud.google.com/storage/). Example: `-src=gs://<bucket>/<path/to/backup>`
* [S3](https://aws.amazon.com/s3/). Example: `-src=s3://<bucket>/<path/to/backup>`
* [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs/). Example: `-src=azblob://<bucket>/<path/to/backup>`
* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio), [Ceph](https://docs.ceph.com/en/pacific/radosgw/s3/)
or [Swift](https://platform.swiftstack.com/docs/admin/middleware/s3_middleware.html). See [these docs](#advanced-usage) for details.
* Local filesystem. Example: `-src=fs://</absolute/path/to/backup>`. Note that `vmbackup` prevents from storing the backup
into the directory pointed by `-storageDataPath` command-line flag, since this directory should be managed solely by VictoriaMetrics or `vmstorage`.
* `<local/path/to/restore>` is the path to folder where data will be restored. This folder must be passed
to VictoriaMetrics in `-storageDataPath` command-line flag after the restore process is complete.
The original `-storageDataPath` directory may contain old files. They will be substituted by the files from backup,
i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/questions/476041/how-do-i-make-rsync-delete-files-that-have-been-deleted-from-the-source-folder).
## Troubleshooting
* If `vmrestore` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value.
@@ -92,7 +100,7 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
@@ -154,7 +162,7 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
-skipBackupCompleteCheck
Whether to skip checking for 'backup complete' file in -src. This may be useful for restoring from old backups, which were created without 'backup complete' file
-src string
Source path with backup on the remote storage. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
Source path with backup on the remote storage. Example: gs://bucket/path/to/backup, s3://bucket/path/to/backup, azblob://bucket/path/to/backup or fs:///path/to/local/backup
-storageDataPath string
Destination path where backup must be restored. VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case the contents of -storageDataPath dir is synchronized with -src contents, i.e. it works like 'rsync --delete' (default "victoria-metrics-data")
-tls
@@ -166,6 +174,8 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
```
@@ -176,7 +186,7 @@ It is recommended using [binary releases](https://github.com/VictoriaMetrics/Vic
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.18.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.3.
2. Run `make vmrestore` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmrestore` binary and puts it into the `bin` folder.

View File

@@ -20,7 +20,7 @@ import (
var (
httpListenAddr = flag.String("httpListenAddr", ":8421", "TCP address for exporting metrics at /metrics page")
src = flag.String("src", "", "Source path with backup on the remote storage. "+
"Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir")
"Example: gs://bucket/path/to/backup, s3://bucket/path/to/backup, azblob://bucket/path/to/backup or fs:///path/to/local/backup")
storageDataPath = flag.String("storageDataPath", "victoria-metrics-data", "Destination path where backup must be restored. "+
"VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case the contents of -storageDataPath dir "+
"is synchronized with -src contents, i.e. it works like 'rsync --delete'")

View File

@@ -54,6 +54,9 @@ func (bw *Writer) reset() {
// Write writes p to bw.
func (bw *Writer) Write(p []byte) (int, error) {
if len(p) == 0 {
return 0, nil
}
bw.lock.Lock()
defer bw.lock.Unlock()
if bw.err != nil {

View File

@@ -29,7 +29,8 @@ import (
var (
deleteAuthKey = flag.String("deleteAuthKey", "", "authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries")
maxConcurrentRequests = flag.Int("search.maxConcurrentRequests", getDefaultMaxConcurrentRequests(), "The maximum number of concurrent search requests. "+
"It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration")
"It shouldn't be high, since a single request can saturate all the CPU cores, while many concurrently executed requests may require high amounts of memory. "+
"See also -search.maxQueueDuration and -search.maxMemoryPerQuery")
maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the request waits for execution when -search.maxConcurrentRequests "+
"limit is reached; see also -search.maxQueryDuration")
resetCacheAuthKey = flag.String("search.resetCacheAuthKey", "", "Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call")
@@ -168,7 +169,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
_ = r.ParseForm()
path = strings.TrimPrefix(path, "/")
newURL := path + "/?" + r.Form.Encode()
httpserver.RedirectPermanent(w, newURL)
httpserver.Redirect(w, newURL)
return true
}
if strings.HasPrefix(path, "/vmui/") {
@@ -217,7 +218,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
// vmalert access via incomplete url without `/` in the end. Redirecto to complete url.
// Use relative redirect, since, since the hostname and path prefix may be incorrect if VictoriaMetrics
// is hidden behind vmauth or similar proxy.
httpserver.RedirectPermanent(w, "vmalert/")
httpserver.Redirect(w, "vmalert/")
return true
}
if strings.HasPrefix(path, "/vmalert/") {

View File

@@ -846,7 +846,8 @@ var ssPool sync.Pool
// Data processing is immediately stopped if f returns non-nil error.
// It is the responsibility of f to call b.UnmarshalData before reading timestamps and values from the block.
// It is the responsibility of f to filter blocks according to the given tr.
func ExportBlocks(qt *querytracer.Tracer, sq *storage.SearchQuery, deadline searchutils.Deadline, f func(mn *storage.MetricName, b *storage.Block, tr storage.TimeRange) error) error {
func ExportBlocks(qt *querytracer.Tracer, sq *storage.SearchQuery, deadline searchutils.Deadline,
f func(mn *storage.MetricName, b *storage.Block, tr storage.TimeRange, workerID uint) error) error {
qt = qt.NewChild("export blocks: %s", sq)
defer qt.Done()
if deadline.Exceeded() {
@@ -881,10 +882,10 @@ func ExportBlocks(qt *querytracer.Tracer, sq *storage.SearchQuery, deadline sear
var wg sync.WaitGroup
wg.Add(gomaxprocs)
for i := 0; i < gomaxprocs; i++ {
go func() {
go func(workerID uint) {
defer wg.Done()
for xw := range workCh {
if err := f(&xw.mn, &xw.b, tr); err != nil {
if err := f(&xw.mn, &xw.b, tr, workerID); err != nil {
errGlobalLock.Lock()
if errGlobal != nil {
errGlobal = err
@@ -895,7 +896,7 @@ func ExportBlocks(qt *querytracer.Tracer, sq *storage.SearchQuery, deadline sear
xw.reset()
exportWorkPool.Put(xw)
}
}()
}(uint(i))
}
// Feed workers with work
@@ -1010,7 +1011,10 @@ func ProcessSearchQuery(qt *querytracer.Tracer, sq *storage.SearchQuery, deadlin
startTime := time.Now()
maxSeriesCount := sr.Init(qt, vmstorage.Storage, tfss, tr, sq.MaxMetrics, deadline.Deadline())
indexSearchDuration.UpdateDuration(startTime)
m := make(map[string][]blockRef, maxSeriesCount)
type blockRefs struct {
brs []blockRef
}
m := make(map[string]*blockRefs, maxSeriesCount)
orderedMetricNames := make([]string, 0, maxSeriesCount)
blocksRead := 0
samples := 0
@@ -1039,13 +1043,14 @@ func ProcessSearchQuery(qt *querytracer.Tracer, sq *storage.SearchQuery, deadlin
}
metricName := sr.MetricBlockRef.MetricName
brs := m[string(metricName)]
brs = append(brs, blockRef{
if brs == nil {
brs = &blockRefs{}
}
brs.brs = append(brs.brs, blockRef{
partRef: br.PartRef(),
addr: addr,
})
if len(brs) > 1 {
m[string(metricName)] = brs
} else {
if len(brs.brs) == 1 {
// An optimization for big number of time series with long metricName values:
// use only a single copy of metricName for both orderedMetricNames and m.
orderedMetricNames = append(orderedMetricNames, string(metricName))
@@ -1074,7 +1079,7 @@ func ProcessSearchQuery(qt *querytracer.Tracer, sq *storage.SearchQuery, deadlin
for i, metricName := range orderedMetricNames {
pts[i] = packedTimeseries{
metricName: metricName,
brs: m[metricName],
brs: m[metricName].brs,
}
}
rss.packedTimeseries = pts

View File

@@ -99,10 +99,10 @@
"values":[
{% if len(xb.values) > 0 %}
{% code values := xb.values %}
{%f= values[0] %}
{%= convertValueToSpecialJSON(values[0]) %}
{% code values = values[1:] %}
{% for _, v := range values %}
,{% if math.IsNaN(v) %}null{% else %}{%f= v %}{% endif %}
,{%= convertValueToSpecialJSON(v) %}
{% endfor %}
{% endif %}
],
@@ -126,49 +126,24 @@
}
{% endfunc %}
{% func ExportPromAPIResponse(resultsCh <-chan *quicktemplate.ByteBuffer, qt *querytracer.Tracer) %}
{% func ExportPromAPIHeader() %}
{
{% code
lines := 0
bytesTotal := 0
%}
"status":"success",
"data":{
"resultType":"matrix",
"result":[
{% code bb, ok := <-resultsCh %}
{% if ok %}
{%z= bb.B %}
{% code
lines++
bytesTotal += len(bb.B)
quicktemplate.ReleaseByteBuffer(bb)
%}
{% for bb := range resultsCh %}
,{%z= bb.B %}
{% code
lines++
bytesTotal += len(bb.B)
quicktemplate.ReleaseByteBuffer(bb)
%}
{% endfor %}
{% endif %}
{% endfunc %}
{% func ExportPromAPIFooter(qt *querytracer.Tracer) %}
]
}
{% code
qt.Donef("export format=promapi: lines=%d, bytes=%d", lines, bytesTotal)
qt.Donef("export format=promapi")
%}
{%= dumpQueryTrace(qt) %}
}
{% endfunc %}
{% func ExportStdResponse(resultsCh <-chan *quicktemplate.ByteBuffer, qt *querytracer.Tracer) %}
{% for bb := range resultsCh %}
{%z= bb.B %}
{% code quicktemplate.ReleaseByteBuffer(bb) %}
{% endfor %}
{% endfunc %}
{% func prometheusMetricName(mn *storage.MetricName) %}
{%z= mn.MetricGroup %}
{% if len(mn.Tags) > 0 %}
@@ -183,4 +158,19 @@
}
{% endif %}
{% endfunc %}
{% func convertValueToSpecialJSON(v float64) %}
{% if math.IsNaN(v) %}
null
{% elseif math.IsInf(v, 0) %}
{% if v > 0 %}
"Infinity"
{% else %}
"-Infinity"
{% endif %}
{% else %}
{%f= v %}
{% endif %}
{% endfunc %}
{% endstripspace %}

View File

@@ -295,7 +295,7 @@ func StreamExportJSONLine(qw422016 *qt422016.Writer, xb *exportBlock) {
values := xb.values
//line app/vmselect/prometheus/export.qtpl:102
qw422016.N().F(values[0])
streamconvertValueToSpecialJSON(qw422016, values[0])
//line app/vmselect/prometheus/export.qtpl:103
values = values[1:]
@@ -304,15 +304,7 @@ func StreamExportJSONLine(qw422016 *qt422016.Writer, xb *exportBlock) {
//line app/vmselect/prometheus/export.qtpl:104
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:105
if math.IsNaN(v) {
//line app/vmselect/prometheus/export.qtpl:105
qw422016.N().S(`null`)
//line app/vmselect/prometheus/export.qtpl:105
} else {
//line app/vmselect/prometheus/export.qtpl:105
qw422016.N().F(v)
//line app/vmselect/prometheus/export.qtpl:105
}
streamconvertValueToSpecialJSON(qw422016, v)
//line app/vmselect/prometheus/export.qtpl:106
}
//line app/vmselect/prometheus/export.qtpl:107
@@ -415,184 +407,195 @@ func ExportPromAPILine(xb *exportBlock) string {
}
//line app/vmselect/prometheus/export.qtpl:129
func StreamExportPromAPIResponse(qw422016 *qt422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer, qt *querytracer.Tracer) {
func StreamExportPromAPIHeader(qw422016 *qt422016.Writer) {
//line app/vmselect/prometheus/export.qtpl:129
qw422016.N().S(`{`)
//line app/vmselect/prometheus/export.qtpl:132
lines := 0
bytesTotal := 0
qw422016.N().S(`{"status":"success","data":{"resultType":"matrix","result":[`)
//line app/vmselect/prometheus/export.qtpl:135
}
//line app/vmselect/prometheus/export.qtpl:134
qw422016.N().S(`"status":"success","data":{"resultType":"matrix","result":[`)
//line app/vmselect/prometheus/export.qtpl:139
bb, ok := <-resultsCh
//line app/vmselect/prometheus/export.qtpl:135
func WriteExportPromAPIHeader(qq422016 qtio422016.Writer) {
//line app/vmselect/prometheus/export.qtpl:135
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:135
StreamExportPromAPIHeader(qw422016)
//line app/vmselect/prometheus/export.qtpl:135
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:135
}
//line app/vmselect/prometheus/export.qtpl:140
if ok {
//line app/vmselect/prometheus/export.qtpl:141
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:143
lines++
bytesTotal += len(bb.B)
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:135
func ExportPromAPIHeader() string {
//line app/vmselect/prometheus/export.qtpl:135
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:135
WriteExportPromAPIHeader(qb422016)
//line app/vmselect/prometheus/export.qtpl:135
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:135
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:135
return qs422016
//line app/vmselect/prometheus/export.qtpl:135
}
//line app/vmselect/prometheus/export.qtpl:147
for bb := range resultsCh {
//line app/vmselect/prometheus/export.qtpl:147
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:148
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:150
lines++
bytesTotal += len(bb.B)
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:154
}
//line app/vmselect/prometheus/export.qtpl:155
}
//line app/vmselect/prometheus/export.qtpl:155
//line app/vmselect/prometheus/export.qtpl:137
func StreamExportPromAPIFooter(qw422016 *qt422016.Writer, qt *querytracer.Tracer) {
//line app/vmselect/prometheus/export.qtpl:137
qw422016.N().S(`]}`)
//line app/vmselect/prometheus/export.qtpl:159
qt.Donef("export format=promapi: lines=%d, bytes=%d", lines, bytesTotal)
//line app/vmselect/prometheus/export.qtpl:141
qt.Donef("export format=promapi")
//line app/vmselect/prometheus/export.qtpl:161
//line app/vmselect/prometheus/export.qtpl:143
streamdumpQueryTrace(qw422016, qt)
//line app/vmselect/prometheus/export.qtpl:161
//line app/vmselect/prometheus/export.qtpl:143
qw422016.N().S(`}`)
//line app/vmselect/prometheus/export.qtpl:163
//line app/vmselect/prometheus/export.qtpl:145
}
//line app/vmselect/prometheus/export.qtpl:163
func WriteExportPromAPIResponse(qq422016 qtio422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer, qt *querytracer.Tracer) {
//line app/vmselect/prometheus/export.qtpl:163
//line app/vmselect/prometheus/export.qtpl:145
func WriteExportPromAPIFooter(qq422016 qtio422016.Writer, qt *querytracer.Tracer) {
//line app/vmselect/prometheus/export.qtpl:145
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:163
StreamExportPromAPIResponse(qw422016, resultsCh, qt)
//line app/vmselect/prometheus/export.qtpl:163
//line app/vmselect/prometheus/export.qtpl:145
StreamExportPromAPIFooter(qw422016, qt)
//line app/vmselect/prometheus/export.qtpl:145
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:163
//line app/vmselect/prometheus/export.qtpl:145
}
//line app/vmselect/prometheus/export.qtpl:163
func ExportPromAPIResponse(resultsCh <-chan *quicktemplate.ByteBuffer, qt *querytracer.Tracer) string {
//line app/vmselect/prometheus/export.qtpl:163
//line app/vmselect/prometheus/export.qtpl:145
func ExportPromAPIFooter(qt *querytracer.Tracer) string {
//line app/vmselect/prometheus/export.qtpl:145
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:163
WriteExportPromAPIResponse(qb422016, resultsCh, qt)
//line app/vmselect/prometheus/export.qtpl:163
//line app/vmselect/prometheus/export.qtpl:145
WriteExportPromAPIFooter(qb422016, qt)
//line app/vmselect/prometheus/export.qtpl:145
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:163
//line app/vmselect/prometheus/export.qtpl:145
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:163
//line app/vmselect/prometheus/export.qtpl:145
return qs422016
//line app/vmselect/prometheus/export.qtpl:163
//line app/vmselect/prometheus/export.qtpl:145
}
//line app/vmselect/prometheus/export.qtpl:165
func StreamExportStdResponse(qw422016 *qt422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer, qt *querytracer.Tracer) {
//line app/vmselect/prometheus/export.qtpl:166
for bb := range resultsCh {
//line app/vmselect/prometheus/export.qtpl:167
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:168
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:169
}
//line app/vmselect/prometheus/export.qtpl:170
}
//line app/vmselect/prometheus/export.qtpl:170
func WriteExportStdResponse(qq422016 qtio422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer, qt *querytracer.Tracer) {
//line app/vmselect/prometheus/export.qtpl:170
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:170
StreamExportStdResponse(qw422016, resultsCh, qt)
//line app/vmselect/prometheus/export.qtpl:170
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:170
}
//line app/vmselect/prometheus/export.qtpl:170
func ExportStdResponse(resultsCh <-chan *quicktemplate.ByteBuffer, qt *querytracer.Tracer) string {
//line app/vmselect/prometheus/export.qtpl:170
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:170
WriteExportStdResponse(qb422016, resultsCh, qt)
//line app/vmselect/prometheus/export.qtpl:170
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:170
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:170
return qs422016
//line app/vmselect/prometheus/export.qtpl:170
}
//line app/vmselect/prometheus/export.qtpl:172
//line app/vmselect/prometheus/export.qtpl:147
func streamprometheusMetricName(qw422016 *qt422016.Writer, mn *storage.MetricName) {
//line app/vmselect/prometheus/export.qtpl:173
//line app/vmselect/prometheus/export.qtpl:148
qw422016.N().Z(mn.MetricGroup)
//line app/vmselect/prometheus/export.qtpl:174
//line app/vmselect/prometheus/export.qtpl:149
if len(mn.Tags) > 0 {
//line app/vmselect/prometheus/export.qtpl:174
//line app/vmselect/prometheus/export.qtpl:149
qw422016.N().S(`{`)
//line app/vmselect/prometheus/export.qtpl:176
//line app/vmselect/prometheus/export.qtpl:151
tags := mn.Tags
//line app/vmselect/prometheus/export.qtpl:177
//line app/vmselect/prometheus/export.qtpl:152
qw422016.N().Z(tags[0].Key)
//line app/vmselect/prometheus/export.qtpl:177
//line app/vmselect/prometheus/export.qtpl:152
qw422016.N().S(`=`)
//line app/vmselect/prometheus/export.qtpl:177
//line app/vmselect/prometheus/export.qtpl:152
qw422016.N().QZ(tags[0].Value)
//line app/vmselect/prometheus/export.qtpl:178
//line app/vmselect/prometheus/export.qtpl:153
tags = tags[1:]
//line app/vmselect/prometheus/export.qtpl:179
//line app/vmselect/prometheus/export.qtpl:154
for i := range tags {
//line app/vmselect/prometheus/export.qtpl:180
//line app/vmselect/prometheus/export.qtpl:155
tag := &tags[i]
//line app/vmselect/prometheus/export.qtpl:180
//line app/vmselect/prometheus/export.qtpl:155
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:181
//line app/vmselect/prometheus/export.qtpl:156
qw422016.N().Z(tag.Key)
//line app/vmselect/prometheus/export.qtpl:181
//line app/vmselect/prometheus/export.qtpl:156
qw422016.N().S(`=`)
//line app/vmselect/prometheus/export.qtpl:181
//line app/vmselect/prometheus/export.qtpl:156
qw422016.N().QZ(tag.Value)
//line app/vmselect/prometheus/export.qtpl:182
//line app/vmselect/prometheus/export.qtpl:157
}
//line app/vmselect/prometheus/export.qtpl:182
//line app/vmselect/prometheus/export.qtpl:157
qw422016.N().S(`}`)
//line app/vmselect/prometheus/export.qtpl:184
//line app/vmselect/prometheus/export.qtpl:159
}
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
}
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
func writeprometheusMetricName(qq422016 qtio422016.Writer, mn *storage.MetricName) {
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
streamprometheusMetricName(qw422016, mn)
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
}
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
func prometheusMetricName(mn *storage.MetricName) string {
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
writeprometheusMetricName(qb422016, mn)
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
return qs422016
//line app/vmselect/prometheus/export.qtpl:185
//line app/vmselect/prometheus/export.qtpl:160
}
//line app/vmselect/prometheus/export.qtpl:162
func streamconvertValueToSpecialJSON(qw422016 *qt422016.Writer, v float64) {
//line app/vmselect/prometheus/export.qtpl:163
if math.IsNaN(v) {
//line app/vmselect/prometheus/export.qtpl:163
qw422016.N().S(`null`)
//line app/vmselect/prometheus/export.qtpl:165
} else if math.IsInf(v, 0) {
//line app/vmselect/prometheus/export.qtpl:166
if v > 0 {
//line app/vmselect/prometheus/export.qtpl:166
qw422016.N().S(`"Infinity"`)
//line app/vmselect/prometheus/export.qtpl:168
} else {
//line app/vmselect/prometheus/export.qtpl:168
qw422016.N().S(`"-Infinity"`)
//line app/vmselect/prometheus/export.qtpl:170
}
//line app/vmselect/prometheus/export.qtpl:171
} else {
//line app/vmselect/prometheus/export.qtpl:172
qw422016.N().F(v)
//line app/vmselect/prometheus/export.qtpl:173
}
//line app/vmselect/prometheus/export.qtpl:174
}
//line app/vmselect/prometheus/export.qtpl:174
func writeconvertValueToSpecialJSON(qq422016 qtio422016.Writer, v float64) {
//line app/vmselect/prometheus/export.qtpl:174
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:174
streamconvertValueToSpecialJSON(qw422016, v)
//line app/vmselect/prometheus/export.qtpl:174
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:174
}
//line app/vmselect/prometheus/export.qtpl:174
func convertValueToSpecialJSON(v float64) string {
//line app/vmselect/prometheus/export.qtpl:174
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:174
writeconvertValueToSpecialJSON(qb422016, v)
//line app/vmselect/prometheus/export.qtpl:174
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:174
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:174
return qs422016
//line app/vmselect/prometheus/export.qtpl:174
}

View File

@@ -1,4 +1,6 @@
{% import (
"math"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
) %}
@@ -7,10 +9,25 @@
// Federate writes rs in /federate format.
// See https://prometheus.io/docs/prometheus/latest/federation/
{% func Federate(rs *netstorage.Result) %}
{% if len(rs.Timestamps) == 0 || len(rs.Values) == 0 %}{% return %}{% endif %}
{% code
values := rs.Values
timestamps := rs.Timestamps
%}
{% if len(timestamps) == 0 || len(values) == 0 %}{% return %}{% endif %}
{% code
lastValue := values[len(values)-1]
%}
{% if math.IsNaN(lastValue) %}
{% comment %}
This is most likely a staleness marker.
Return nothing after the staleness marker.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3185
{% endcomment %}
{% return %}
{% endif %}
{%= prometheusMetricName(&rs.MetricName) %}{% space %}
{%f= rs.Values[len(rs.Values)-1] %}{% space %}
{%dl= rs.Timestamps[len(rs.Timestamps)-1] %}{% newline %}
{%f= lastValue %}{% space %}
{%dl= timestamps[len(timestamps)-1] %}{% newline %}
{% endfunc %}
{% endstripspace %}

View File

@@ -6,70 +6,85 @@ package prometheus
//line app/vmselect/prometheus/federate.qtpl:1
import (
"math"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
)
// Federate writes rs in /federate format.// See https://prometheus.io/docs/prometheus/latest/federation/
//line app/vmselect/prometheus/federate.qtpl:9
//line app/vmselect/prometheus/federate.qtpl:11
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmselect/prometheus/federate.qtpl:9
//line app/vmselect/prometheus/federate.qtpl:11
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmselect/prometheus/federate.qtpl:9
//line app/vmselect/prometheus/federate.qtpl:11
func StreamFederate(qw422016 *qt422016.Writer, rs *netstorage.Result) {
//line app/vmselect/prometheus/federate.qtpl:10
if len(rs.Timestamps) == 0 || len(rs.Values) == 0 {
//line app/vmselect/prometheus/federate.qtpl:10
//line app/vmselect/prometheus/federate.qtpl:13
values := rs.Values
timestamps := rs.Timestamps
//line app/vmselect/prometheus/federate.qtpl:16
if len(timestamps) == 0 || len(values) == 0 {
//line app/vmselect/prometheus/federate.qtpl:16
return
//line app/vmselect/prometheus/federate.qtpl:10
//line app/vmselect/prometheus/federate.qtpl:16
}
//line app/vmselect/prometheus/federate.qtpl:11
//line app/vmselect/prometheus/federate.qtpl:18
lastValue := values[len(values)-1]
//line app/vmselect/prometheus/federate.qtpl:20
if math.IsNaN(lastValue) {
//line app/vmselect/prometheus/federate.qtpl:26
return
//line app/vmselect/prometheus/federate.qtpl:27
}
//line app/vmselect/prometheus/federate.qtpl:28
streamprometheusMetricName(qw422016, &rs.MetricName)
//line app/vmselect/prometheus/federate.qtpl:11
//line app/vmselect/prometheus/federate.qtpl:28
qw422016.N().S(` `)
//line app/vmselect/prometheus/federate.qtpl:12
qw422016.N().F(rs.Values[len(rs.Values)-1])
//line app/vmselect/prometheus/federate.qtpl:12
//line app/vmselect/prometheus/federate.qtpl:29
qw422016.N().F(lastValue)
//line app/vmselect/prometheus/federate.qtpl:29
qw422016.N().S(` `)
//line app/vmselect/prometheus/federate.qtpl:13
qw422016.N().DL(rs.Timestamps[len(rs.Timestamps)-1])
//line app/vmselect/prometheus/federate.qtpl:13
//line app/vmselect/prometheus/federate.qtpl:30
qw422016.N().DL(timestamps[len(timestamps)-1])
//line app/vmselect/prometheus/federate.qtpl:30
qw422016.N().S(`
`)
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
}
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
func WriteFederate(qq422016 qtio422016.Writer, rs *netstorage.Result) {
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
StreamFederate(qw422016, rs)
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
}
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
func Federate(rs *netstorage.Result) string {
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
WriteFederate(qb422016, rs)
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
return qs422016
//line app/vmselect/prometheus/federate.qtpl:14
//line app/vmselect/prometheus/federate.qtpl:31
}

View File

@@ -5,9 +5,11 @@ import (
"fmt"
"math"
"net/http"
"runtime"
"strconv"
"strings"
"sync"
"sync/atomic"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/bufferedwriter"
@@ -16,7 +18,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/querystats"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
@@ -25,7 +26,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fastjson/fastfloat"
"github.com/valyala/quicktemplate"
)
var (
@@ -84,23 +84,19 @@ func FederateHandler(startTime time.Time, w http.ResponseWriter, r *http.Request
w.Header().Set("Content-Type", "text/plain; charset=utf-8")
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
sw := newScalableWriter(bw)
err = rss.RunParallel(nil, func(rs *netstorage.Result, workerID uint) error {
if err := bw.Error(); err != nil {
return err
}
bb := quicktemplate.AcquireByteBuffer()
bb := sw.getBuffer(workerID)
WriteFederate(bb, rs)
_, err := bw.Write(bb.B)
quicktemplate.ReleaseByteBuffer(bb)
return err
return sw.maybeFlushBuffer(bb)
})
if err != nil {
return fmt.Errorf("error during sending data to remote client: %w", err)
}
if err := bw.Flush(); err != nil {
return err
}
return nil
return sw.flush()
}
var federateDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/federate"}`)
@@ -125,15 +121,14 @@ func ExportCSVHandler(startTime time.Time, w http.ResponseWriter, r *http.Reques
w.Header().Set("Content-Type", "text/csv; charset=utf-8")
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
resultsCh := make(chan *quicktemplate.ByteBuffer, cgroup.AvailableCPUs())
writeCSVLine := func(xb *exportBlock) {
sw := newScalableWriter(bw)
writeCSVLine := func(xb *exportBlock, workerID uint) error {
if len(xb.timestamps) == 0 {
return
return nil
}
bb := quicktemplate.AcquireByteBuffer()
bb := sw.getBuffer(workerID)
WriteExportCSVLine(bb, xb, fieldNames)
resultsCh <- bb
return sw.maybeFlushBuffer(bb)
}
doneCh := make(chan error, 1)
if !reduceMemUsage {
@@ -150,17 +145,18 @@ func ExportCSVHandler(startTime time.Time, w http.ResponseWriter, r *http.Reques
xb.mn = &rs.MetricName
xb.timestamps = rs.Timestamps
xb.values = rs.Values
writeCSVLine(xb)
if err := writeCSVLine(xb, workerID); err != nil {
return err
}
xb.reset()
exportBlockPool.Put(xb)
return nil
})
close(resultsCh)
doneCh <- err
}()
} else {
go func() {
err := netstorage.ExportBlocks(nil, sq, cp.deadline, func(mn *storage.MetricName, b *storage.Block, tr storage.TimeRange) error {
err := netstorage.ExportBlocks(nil, sq, cp.deadline, func(mn *storage.MetricName, b *storage.Block, tr storage.TimeRange, workerID uint) error {
if err := bw.Error(); err != nil {
return err
}
@@ -170,29 +166,21 @@ func ExportCSVHandler(startTime time.Time, w http.ResponseWriter, r *http.Reques
xb := exportBlockPool.Get().(*exportBlock)
xb.mn = mn
xb.timestamps, xb.values = b.AppendRowsWithTimeRangeFilter(xb.timestamps[:0], xb.values[:0], tr)
writeCSVLine(xb)
if err := writeCSVLine(xb, workerID); err != nil {
return err
}
xb.reset()
exportBlockPool.Put(xb)
return nil
})
close(resultsCh)
doneCh <- err
}()
}
// Consume all the data from resultsCh.
for bb := range resultsCh {
// Do not check for error in bw.Write, since this error is checked inside netstorage.ExportBlocks above.
_, _ = bw.Write(bb.B)
quicktemplate.ReleaseByteBuffer(bb)
}
if err := bw.Flush(); err != nil {
return err
}
err = <-doneCh
if err != nil {
return fmt.Errorf("error during sending the exported csv data to remote client: %w", err)
}
return nil
return sw.flush()
}
var exportCSVDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/export/csv"}`)
@@ -210,6 +198,7 @@ func ExportNativeHandler(startTime time.Time, w http.ResponseWriter, r *http.Req
w.Header().Set("Content-Type", "VictoriaMetrics/native")
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
sw := newScalableWriter(bw)
// Marshal tr
trBuf := make([]byte, 0, 16)
@@ -218,13 +207,13 @@ func ExportNativeHandler(startTime time.Time, w http.ResponseWriter, r *http.Req
_, _ = bw.Write(trBuf)
// Marshal native blocks.
err = netstorage.ExportBlocks(nil, sq, cp.deadline, func(mn *storage.MetricName, b *storage.Block, tr storage.TimeRange) error {
err = netstorage.ExportBlocks(nil, sq, cp.deadline, func(mn *storage.MetricName, b *storage.Block, tr storage.TimeRange, workerID uint) error {
if err := bw.Error(); err != nil {
return err
}
dstBuf := bbPool.Get()
bb := sw.getBuffer(workerID)
dst := bb.B
tmpBuf := bbPool.Get()
dst := dstBuf.B
tmp := tmpBuf.B
// Marshal mn
@@ -240,19 +229,13 @@ func ExportNativeHandler(startTime time.Time, w http.ResponseWriter, r *http.Req
tmpBuf.B = tmp
bbPool.Put(tmpBuf)
_, err := bw.Write(dst)
dstBuf.B = dst
bbPool.Put(dstBuf)
return err
bb.B = dst
return sw.maybeFlushBuffer(bb)
})
if err != nil {
return fmt.Errorf("error during sending native data to remote client: %w", err)
}
if err := bw.Flush(); err != nil {
return fmt.Errorf("error during flushing native data to remote client: %w", err)
}
return nil
return sw.flush()
}
var exportNativeDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/export/native"}`)
@@ -279,31 +262,48 @@ func ExportHandler(startTime time.Time, w http.ResponseWriter, r *http.Request)
var exportDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/export"}`)
func exportHandler(qt *querytracer.Tracer, w http.ResponseWriter, cp *commonParams, format string, maxRowsPerLine int, reduceMemUsage bool) error {
writeResponseFunc := WriteExportStdResponse
writeLineFunc := func(xb *exportBlock, resultsCh chan<- *quicktemplate.ByteBuffer) {
bb := quicktemplate.AcquireByteBuffer()
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
sw := newScalableWriter(bw)
writeLineFunc := func(xb *exportBlock, workerID uint) error {
bb := sw.getBuffer(workerID)
WriteExportJSONLine(bb, xb)
resultsCh <- bb
return sw.maybeFlushBuffer(bb)
}
contentType := "application/stream+json; charset=utf-8"
if format == "prometheus" {
contentType = "text/plain; charset=utf-8"
writeLineFunc = func(xb *exportBlock, resultsCh chan<- *quicktemplate.ByteBuffer) {
bb := quicktemplate.AcquireByteBuffer()
writeLineFunc = func(xb *exportBlock, workerID uint) error {
bb := sw.getBuffer(workerID)
WriteExportPrometheusLine(bb, xb)
resultsCh <- bb
return sw.maybeFlushBuffer(bb)
}
} else if format == "promapi" {
writeResponseFunc = WriteExportPromAPIResponse
writeLineFunc = func(xb *exportBlock, resultsCh chan<- *quicktemplate.ByteBuffer) {
bb := quicktemplate.AcquireByteBuffer()
WriteExportPromAPIHeader(bw)
firstLineOnce := uint32(0)
firstLineSent := uint32(0)
writeLineFunc = func(xb *exportBlock, workerID uint) error {
bb := sw.getBuffer(workerID)
if atomic.CompareAndSwapUint32(&firstLineOnce, 0, 1) {
// Send the first line to sw.bw
WriteExportPromAPILine(bb, xb)
_, err := sw.bw.Write(bb.B)
bb.Reset()
atomic.StoreUint32(&firstLineSent, 1)
return err
}
for atomic.LoadUint32(&firstLineSent) == 0 {
// Busy wait until the first line is sent to sw.bw
runtime.Gosched()
}
bb.B = append(bb.B, ',')
WriteExportPromAPILine(bb, xb)
resultsCh <- bb
return sw.maybeFlushBuffer(bb)
}
}
if maxRowsPerLine > 0 {
writeLineFuncOrig := writeLineFunc
writeLineFunc = func(xb *exportBlock, resultsCh chan<- *quicktemplate.ByteBuffer) {
writeLineFunc = func(xb *exportBlock, workerID uint) error {
valuesOrig := xb.values
timestampsOrig := xb.timestamps
values := valuesOrig
@@ -324,19 +324,19 @@ func exportHandler(qt *querytracer.Tracer, w http.ResponseWriter, cp *commonPara
}
xb.values = valuesChunk
xb.timestamps = timestampsChunk
writeLineFuncOrig(xb, resultsCh)
if err := writeLineFuncOrig(xb, workerID); err != nil {
return err
}
}
xb.values = valuesOrig
xb.timestamps = timestampsOrig
return nil
}
}
sq := storage.NewSearchQuery(cp.start, cp.end, cp.filterss, *maxExportSeries)
w.Header().Set("Content-Type", contentType)
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
resultsCh := make(chan *quicktemplate.ByteBuffer, cgroup.AvailableCPUs())
doneCh := make(chan error, 1)
if !reduceMemUsage {
rss, err := netstorage.ProcessSearchQuery(qt, sq, cp.deadline)
@@ -353,19 +353,20 @@ func exportHandler(qt *querytracer.Tracer, w http.ResponseWriter, cp *commonPara
xb.mn = &rs.MetricName
xb.timestamps = rs.Timestamps
xb.values = rs.Values
writeLineFunc(xb, resultsCh)
if err := writeLineFunc(xb, workerID); err != nil {
return err
}
xb.reset()
exportBlockPool.Put(xb)
return nil
})
qtChild.Done()
close(resultsCh)
doneCh <- err
}()
} else {
qtChild := qt.NewChild("background export format=%s", format)
go func() {
err := netstorage.ExportBlocks(qtChild, sq, cp.deadline, func(mn *storage.MetricName, b *storage.Block, tr storage.TimeRange) error {
err := netstorage.ExportBlocks(qtChild, sq, cp.deadline, func(mn *storage.MetricName, b *storage.Block, tr storage.TimeRange, workerID uint) error {
if err := bw.Error(); err != nil {
return err
}
@@ -376,26 +377,30 @@ func exportHandler(qt *querytracer.Tracer, w http.ResponseWriter, cp *commonPara
xb.mn = mn
xb.timestamps, xb.values = b.AppendRowsWithTimeRangeFilter(xb.timestamps[:0], xb.values[:0], tr)
if len(xb.timestamps) > 0 {
writeLineFunc(xb, resultsCh)
if err := writeLineFunc(xb, workerID); err != nil {
return err
}
}
xb.reset()
exportBlockPool.Put(xb)
return nil
})
qtChild.Done()
close(resultsCh)
doneCh <- err
}()
}
// writeResponseFunc must consume all the data from resultsCh.
writeResponseFunc(bw, resultsCh, qt)
if err := bw.Flush(); err != nil {
return err
}
err := <-doneCh
if err != nil {
return fmt.Errorf("error during sending the data to remote client: %w", err)
return fmt.Errorf("cannot send data to remote client: %w", err)
}
if err := sw.flush(); err != nil {
return fmt.Errorf("cannot send data to remote client: %w", err)
}
if format == "promapi" {
WriteExportPromAPIFooter(bw, qt)
}
if err := bw.Flush(); err != nil {
return err
}
return nil
}
@@ -450,7 +455,7 @@ var deleteDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/
func LabelValuesHandler(qt *querytracer.Tracer, startTime time.Time, labelName string, w http.ResponseWriter, r *http.Request) error {
defer labelValuesDuration.UpdateDuration(startTime)
cp, err := getCommonParams(r, startTime, false)
cp, err := getCommonParamsWithDefaultDuration(r, startTime, false)
if err != nil {
return err
}
@@ -547,7 +552,7 @@ var tsdbStatusDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/
func LabelsHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseWriter, r *http.Request) error {
defer labelsDuration.UpdateDuration(startTime)
cp, err := getCommonParams(r, startTime, false)
cp, err := getCommonParamsWithDefaultDuration(r, startTime, false)
if err != nil {
return err
}
@@ -600,17 +605,14 @@ var seriesCountDuration = metrics.NewSummary(`vm_request_duration_seconds{path="
func SeriesHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseWriter, r *http.Request) error {
defer seriesDuration.UpdateDuration(startTime)
cp, err := getCommonParams(r, startTime, true)
if err != nil {
return err
}
// Do not set start to searchutils.minTimeMsecs by default as Prometheus does,
// since this leads to fetching and scanning all the data from the storage,
// which can take a lot of time for big storages.
// It is better setting start as end-defaultStep by default.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/91
if cp.start == 0 {
cp.start = cp.end - defaultStep
cp, err := getCommonParamsWithDefaultDuration(r, startTime, true)
if err != nil {
return err
}
limit, err := searchutils.GetInt(r, "limit")
if err != nil {
@@ -828,7 +830,7 @@ func queryRangeHandler(qt *querytracer.Tracer, startTime time.Time, w http.Respo
end = start + defaultStep
}
if err := promql.ValidateMaxPointsPerSeries(start, end, step, *maxPointsPerTimeseries); err != nil {
return err
return fmt.Errorf("%w; (see -search.maxPointsPerTimeseries command-line flag)", err)
}
if mayCache {
start, end = promql.AdjustStartEnd(start, end, step)
@@ -1062,6 +1064,17 @@ func getExportParams(r *http.Request, startTime time.Time) (*commonParams, error
return cp, nil
}
func getCommonParamsWithDefaultDuration(r *http.Request, startTime time.Time, requireNonEmptyMatch bool) (*commonParams, error) {
cp, err := getCommonParams(r, startTime, requireNonEmptyMatch)
if err != nil {
return nil, err
}
if cp.start == 0 {
cp.start = cp.end - defaultStep
}
return cp, nil
}
// getCommonParams obtains common params from r, which are used in /api/v1/* handlers:
//
// - timeout
@@ -1115,3 +1128,41 @@ func getCommonParams(r *http.Request, startTime time.Time, requireNonEmptyMatch
}
return cp, nil
}
type scalableWriter struct {
bw *bufferedwriter.Writer
m sync.Map
}
func newScalableWriter(bw *bufferedwriter.Writer) *scalableWriter {
return &scalableWriter{
bw: bw,
}
}
func (sw *scalableWriter) getBuffer(workerID uint) *bytesutil.ByteBuffer {
v, ok := sw.m.Load(workerID)
if !ok {
v = &bytesutil.ByteBuffer{}
sw.m.Store(workerID, v)
}
return v.(*bytesutil.ByteBuffer)
}
func (sw *scalableWriter) maybeFlushBuffer(bb *bytesutil.ByteBuffer) error {
if len(bb.B) < 1024*1024 {
return nil
}
_, err := sw.bw.Write(bb.B)
bb.Reset()
return err
}
func (sw *scalableWriter) flush() error {
sw.m.Range(func(k, v interface{}) bool {
bb := v.(*bytesutil.ByteBuffer)
_, err := sw.bw.Write(bb.B)
return err == nil
})
return sw.bw.Flush()
}

View File

@@ -104,6 +104,9 @@ func removeGroupTags(metricName *storage.MetricName, modifier *metricsql.Modifie
func aggrFuncExt(afe func(tss []*timeseries, modifier *metricsql.ModifierExpr) []*timeseries, argOrig []*timeseries,
modifier *metricsql.ModifierExpr, maxSeries int, keepOriginal bool) ([]*timeseries, error) {
// Remove empty time series, e.g. series with all NaN samples,
// since such series are ignored by aggregate functions.
argOrig = removeEmptySeries(argOrig)
arg := copyTimeseriesMetricNames(argOrig, keepOriginal)
// Perform grouping.

View File

@@ -65,8 +65,7 @@ var incrementalAggrFuncCallbacksMap = map[string]*incrementalAggrFuncCallbacks{
type incrementalAggrFuncContext struct {
ae *metricsql.AggrFuncExpr
mLock sync.Mutex
m map[uint]map[string]*incrementalAggrContext
m sync.Map
callbacks *incrementalAggrFuncCallbacks
}
@@ -74,19 +73,20 @@ type incrementalAggrFuncContext struct {
func newIncrementalAggrFuncContext(ae *metricsql.AggrFuncExpr, callbacks *incrementalAggrFuncCallbacks) *incrementalAggrFuncContext {
return &incrementalAggrFuncContext{
ae: ae,
m: make(map[uint]map[string]*incrementalAggrContext),
callbacks: callbacks,
}
}
func (iafc *incrementalAggrFuncContext) updateTimeseries(tsOrig *timeseries, workerID uint) {
iafc.mLock.Lock()
m := iafc.m[workerID]
if m == nil {
m = make(map[string]*incrementalAggrContext, 1)
iafc.m[workerID] = m
v, ok := iafc.m.Load(workerID)
if !ok {
// It is safe creating and storing m in iafc.m without locking,
// since it is guaranteed that only a single goroutine can execute
// code for the given workerID at a time.
v = make(map[string]*incrementalAggrContext, 1)
iafc.m.Store(workerID, v)
}
iafc.mLock.Unlock()
m := v.(map[string]*incrementalAggrContext)
ts := tsOrig
keepOriginal := iafc.callbacks.keepOriginal
@@ -124,11 +124,10 @@ func (iafc *incrementalAggrFuncContext) updateTimeseries(tsOrig *timeseries, wor
}
func (iafc *incrementalAggrFuncContext) finalizeTimeseries() []*timeseries {
// There is no need in iafc.mLock.Lock here, since finalizeTimeseries must be called
// without concurrent goroutines touching iafc.
mGlobal := make(map[string]*incrementalAggrContext)
mergeAggrFunc := iafc.callbacks.mergeAggrFunc
for _, m := range iafc.m {
iafc.m.Range(func(k, v interface{}) bool {
m := v.(map[string]*incrementalAggrContext)
for k, iac := range m {
iacGlobal := mGlobal[k]
if iacGlobal == nil {
@@ -141,7 +140,8 @@ func (iafc *incrementalAggrFuncContext) finalizeTimeseries() []*timeseries {
}
mergeAggrFunc(iacGlobal, iac)
}
}
return true
})
tss := make([]*timeseries, 0, len(mGlobal))
finalizeAggrFunc := iafc.callbacks.finalizeAggrFunc
for _, iac := range mGlobal {

View File

@@ -15,6 +15,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/querytracer"
@@ -27,7 +28,11 @@ var (
disableCache = flag.Bool("search.disableCache", false, "Whether to disable response caching. This may be useful during data backfilling")
maxPointsSubqueryPerTimeseries = flag.Int("search.maxPointsSubqueryPerTimeseries", 100e3, "The maximum number of points per series, which can be generated by subquery. "+
"See https://valyala.medium.com/prometheus-subqueries-in-victoriametrics-9b1492b720b3")
noStaleMarkers = flag.Bool("search.noStaleMarkers", false, "Set this flag to true if the database doesn't contain Prometheus stale markers, so there is no need in spending additional CPU time on its handling. Staleness markers may exist only in data obtained from Prometheus scrape targets")
maxMemoryPerQuery = flagutil.NewBytes("search.maxMemoryPerQuery", 0, "The maximum amounts of memory a single query may consume. "+
"Queries requiring more memory are rejected. The total memory limit for concurrently executed queries can be estimated "+
"as -search.maxMemoryPerQuery multiplied by -search.maxConcurrentRequests")
noStaleMarkers = flag.Bool("search.noStaleMarkers", false, "Set this flag to true if the database doesn't contain Prometheus stale markers, "+
"so there is no need in spending additional CPU time on its handling. Staleness markers may exist only in data obtained from Prometheus scrape targets")
)
// The minimum number of points per timeseries for enabling time rounding.
@@ -42,7 +47,7 @@ func ValidateMaxPointsPerSeries(start, end, step int64, maxPoints int) error {
}
points := (end-start)/step + 1
if points > int64(maxPoints) {
return fmt.Errorf("too many points for the given start=%d, end=%d and step=%d: %d; the maximum number of points is %d (see -search.maxPoints* command-line flags)",
return fmt.Errorf("too many points for the given start=%d, end=%d and step=%d: %d; the maximum number of points is %d",
start, end, step, points, maxPoints)
}
return nil
@@ -483,6 +488,12 @@ func execBinaryOpArgs(qt *querytracer.Tracer, ec *EvalConfig, exprFirst, exprSec
if err != nil {
return nil, nil, err
}
if len(tssFirst) == 0 && strings.ToLower(be.Op) != "or" {
// Fast path: there is no sense in executing the exprSecond when exprFirst returns an empty result,
// since the "exprFirst op exprSecond" would return an empty result in any case.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3349
return nil, nil, nil
}
lfs := getCommonLabelFilters(tssFirst)
lfs = metricsql.TrimFiltersByGroupModifier(lfs, be)
exprSecond = metricsql.PushdownBinaryOpFilters(exprSecond, lfs)
@@ -850,7 +861,7 @@ func evalRollupFuncWithSubquery(qt *querytracer.Tracer, ec *EvalConfig, funcName
ecSQ.Step = step
ecSQ.MaxPointsPerSeries = *maxPointsSubqueryPerTimeseries
if err := ValidateMaxPointsPerSeries(ecSQ.Start, ecSQ.End, ecSQ.Step, ecSQ.MaxPointsPerSeries); err != nil {
return nil, err
return nil, fmt.Errorf("%w; (see -search.maxPointsSubqueryPerTimeseries command-line flag)", err)
}
// unconditionally align start and end args to step for subquery as Prometheus does.
ecSQ.Start, ecSQ.End = alignStartEnd(ecSQ.Start, ecSQ.End, ecSQ.Step)
@@ -1051,7 +1062,17 @@ func evalRollupFuncWithMetricExpr(qt *querytracer.Tracer, ec *EvalConfig, funcNa
}
}
rollupPoints := mulNoOverflow(pointsPerTimeseries, int64(timeseriesLen*len(rcs)))
rollupMemorySize = mulNoOverflow(rollupPoints, 16)
rollupMemorySize = sumNoOverflow(mulNoOverflow(int64(rssLen), 1000), mulNoOverflow(rollupPoints, 16))
if maxMemory := int64(maxMemoryPerQuery.N); maxMemory > 0 && rollupMemorySize > maxMemory {
rss.Cancel()
return nil, &UserReadableError{
Err: fmt.Errorf("not enough memory for processing %d data points across %d time series with %d points in each time series "+
"according to -search.maxMemoryPerQuery=%d; requested memory: %d bytes; "+
"possible solutions are: reducing the number of matching time series; increasing `step` query arg (step=%gs); "+
"increasing -search.maxMemoryPerQuery",
rollupPoints, timeseriesLen*len(rcs), pointsPerTimeseries, maxMemory, rollupMemorySize, float64(ec.Step)/1e3),
}
}
rml := getRollupMemoryLimiter()
if !rml.Get(uint64(rollupMemorySize)) {
rss.Cancel()
@@ -1059,8 +1080,8 @@ func evalRollupFuncWithMetricExpr(qt *querytracer.Tracer, ec *EvalConfig, funcNa
Err: fmt.Errorf("not enough memory for processing %d data points across %d time series with %d points in each time series; "+
"total available memory for concurrent requests: %d bytes; "+
"requested memory: %d bytes; "+
"possible solutions are: reducing the number of matching time series; switching to node with more RAM; "+
"increasing -memory.allowedPercent; increasing `step` query arg (%gs)",
"possible solutions are: reducing the number of matching time series; increasing `step` query arg (step=%gs); "+
"switching to node with more RAM; increasing -memory.allowedPercent",
rollupPoints, timeseriesLen*len(rcs), pointsPerTimeseries, rml.MaxSize, uint64(rollupMemorySize), float64(ec.Step)/1e3),
}
}
@@ -1227,6 +1248,14 @@ func mulNoOverflow(a, b int64) int64 {
return a * b
}
func sumNoOverflow(a, b int64) int64 {
if math.MaxInt64-a < b {
// Overflow
return math.MaxInt64
}
return a + b
}
func dropStaleNaNs(funcName string, values []float64, timestamps []int64) ([]float64, []int64) {
if *noStaleMarkers || funcName == "default_rollup" || funcName == "stale_samples_over_time" {
// Do not drop Prometheus staleness marks (aka stale NaNs) for default_rollup() function,

View File

@@ -99,7 +99,8 @@ func maySortResults(e metricsql.Expr, tss []*timeseries) bool {
case *metricsql.FuncExpr:
switch strings.ToLower(v.Name) {
case "sort", "sort_desc",
"sort_by_label", "sort_by_label_desc":
"sort_by_label", "sort_by_label_desc",
"sort_by_label_numeric", "sort_by_label_numeric_desc":
return false
}
case *metricsql.AggrFuncExpr:

View File

@@ -2281,6 +2281,37 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`limit_offset(too-big-offset)`, func(t *testing.T) {
t.Parallel()
q := `limit_offset(1, 10, sort_by_label((
label_set(time()*1, "foo", "y"),
label_set(time()*2, "foo", "a"),
label_set(time()*3, "foo", "x"),
), "foo"))`
resultExpected := []netstorage.Result{}
f(q, resultExpected)
})
t.Run(`limit_offset NaN`, func(t *testing.T) {
t.Parallel()
// q returns 3 time series, where foo=3 contains only NaN values
// limit_offset suppose to apply offset for non-NaN series only
q := `limit_offset(1, 1, sort_by_label_desc((
label_set(time()*1, "foo", "1"),
label_set(time()*2, "foo", "2"),
label_set(time()*3, "foo", "3"),
) < 3000, "foo"))`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1000, 1200, 1400, 1600, 1800, 2000},
Timestamps: timestampsExpected,
}
r.MetricName.Tags = []storage.Tag{{
Key: []byte("foo"),
Value: []byte("1"),
}}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`sum(label_graphite_group)`, func(t *testing.T) {
t.Parallel()
q := `sort(sum by (__name__) (
@@ -3768,6 +3799,27 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`histogram_quantile(duplicate-le)`, func(t *testing.T) {
// See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3225
t.Parallel()
q := `round(sort(histogram_quantile(0.6,
label_set(90, "foo", "bar", "le", "5")
or label_set(100, "foo", "bar", "le", "5.0")
or label_set(200, "foo", "bar", "le", "6.0")
or label_set(300, "foo", "bar", "le", "+Inf")
)), 0.1)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{4.7, 4.7, 4.7, 4.7, 4.7, 4.7},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{{
Key: []byte("foo"),
Value: []byte("bar"),
}}
resultExpected := []netstorage.Result{r1}
f(q, resultExpected)
})
t.Run(`histogram_quantile(valid)`, func(t *testing.T) {
t.Parallel()
q := `sort(histogram_quantile(0.6,
@@ -5079,7 +5131,40 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`quantiles_over_time`, func(t *testing.T) {
t.Run(`quantiles_over_time(single_sample)`, func(t *testing.T) {
t.Parallel()
q := `sort_by_label(
quantiles_over_time("phi", 0.5, 0.9,
time()[100s:100s]
),
"phi",
)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1000, 1200, 1400, 1600, 1800, 2000},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("phi"),
Value: []byte("0.5"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1000, 1200, 1400, 1600, 1800, 2000},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("phi"),
Value: []byte("0.9"),
},
}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`quantiles_over_time(multiple_samples)`, func(t *testing.T) {
t.Parallel()
q := `sort_by_label(
quantiles_over_time("phi", 0.5, 0.9,
@@ -5480,6 +5565,12 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`any(empty-series)`, func(t *testing.T) {
t.Parallel()
q := `any(label_set(time()<0, "foo", "bar"))`
resultExpected := []netstorage.Result{}
f(q, resultExpected)
})
t.Run(`group() by (test)`, func(t *testing.T) {
t.Parallel()
q := `group((
@@ -6299,19 +6390,39 @@ func TestExecSuccess(t *testing.T) {
q := `range_quantile(0.5, time())`
r := netstorage.Result{
MetricName: metricNameExpected,
// time() results in [1000 1200 1400 1600 1800 2000]
Values: []float64{1500, 1500, 1500, 1500, 1500, 1500},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`range_stddev()`, func(t *testing.T) {
t.Parallel()
q := `round(range_stddev(time()),0.01)`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{341.57, 341.57, 341.57, 341.57, 341.57, 341.57},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`range_stdvar()`, func(t *testing.T) {
t.Parallel()
q := `round(range_stdvar(time()),0.01)`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{116666.67, 116666.67, 116666.67, 116666.67, 116666.67, 116666.67},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`range_median()`, func(t *testing.T) {
t.Parallel()
q := `range_median(time())`
r := netstorage.Result{
MetricName: metricNameExpected,
// time() results in [1000 1200 1400 1600 1800 2000]
Values: []float64{1500, 1500, 1500, 1500, 1500, 1500},
Timestamps: timestampsExpected,
}
@@ -6791,6 +6902,23 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`range_normalize(time(),alias(-time(),"negative"))`, func(t *testing.T) {
t.Parallel()
q := `range_normalize(time(),alias(-time(), "negative"))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{0, 0.2, 0.4, 0.6, 0.8, 1},
Timestamps: timestampsExpected,
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, 0.8, 0.6, 0.4, 0.2, 0},
Timestamps: timestampsExpected,
}
r2.MetricName.MetricGroup = []byte("negative")
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`range_first(time())`, func(t *testing.T) {
t.Parallel()
q := `range_first(time())`
@@ -6824,6 +6952,51 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`range_linear_regression(time())`, func(t *testing.T) {
t.Parallel()
q := `range_linear_regression(time())`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1000, 1200, 1400, 1600, 1800, 2000},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`range_linear_regression(-time())`, func(t *testing.T) {
t.Parallel()
q := `range_linear_regression(-time())`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{-1000, -1200, -1400, -1600, -1800, -2000},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`range_linear_regression(100/time())`, func(t *testing.T) {
t.Parallel()
q := `sort_desc(round((
alias(range_linear_regression(100/time()), "regress"),
alias(100/time(), "orig"),
),
0.001
))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{0.1, 0.083, 0.071, 0.062, 0.056, 0.05},
Timestamps: timestampsExpected,
}
r1.MetricName.MetricGroup = []byte("orig")
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{0.095, 0.085, 0.075, 0.066, 0.056, 0.046},
Timestamps: timestampsExpected,
}
r2.MetricName.MetricGroup = []byte("regress")
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`deriv(N)`, func(t *testing.T) {
t.Parallel()
q := `deriv(1000)`
@@ -7738,6 +7911,178 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r1, r2, r3, r4}
f(q, resultExpected)
})
t.Run(`sort_by_label_numeric(multiple_labels_only_string)`, func(t *testing.T) {
t.Parallel()
q := `sort_by_label_numeric((
label_set(1, "x", "b", "y", "aa"),
label_set(2, "x", "a", "y", "aa"),
), "y", "x")`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2, 2, 2, 2, 2, 2},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("x"),
Value: []byte("a"),
},
{
Key: []byte("y"),
Value: []byte("aa"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, 1, 1, 1, 1, 1},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("x"),
Value: []byte("b"),
},
{
Key: []byte("y"),
Value: []byte("aa"),
},
}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`sort_by_label_numeric(multiple_labels_numbers_special_chars)`, func(t *testing.T) {
t.Parallel()
q := `sort_by_label_numeric((
label_set(1, "x", "1:0:2", "y", "1:0:1"),
label_set(2, "x", "1:0:15", "y", "1:0:1"),
), "x", "y")`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, 1, 1, 1, 1, 1},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("x"),
Value: []byte("1:0:2"),
},
{
Key: []byte("y"),
Value: []byte("1:0:1"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2, 2, 2, 2, 2, 2},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("x"),
Value: []byte("1:0:15"),
},
{
Key: []byte("y"),
Value: []byte("1:0:1"),
},
}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`sort_by_label_numeric_desc(multiple_labels_numbers_special_chars)`, func(t *testing.T) {
t.Parallel()
q := `sort_by_label_numeric_desc((
label_set(1, "x", "1:0:2", "y", "1:0:1"),
label_set(2, "x", "1:0:15", "y", "1:0:1"),
), "x", "y")`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2, 2, 2, 2, 2, 2},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("x"),
Value: []byte("1:0:15"),
},
{
Key: []byte("y"),
Value: []byte("1:0:1"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, 1, 1, 1, 1, 1},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("x"),
Value: []byte("1:0:2"),
},
{
Key: []byte("y"),
Value: []byte("1:0:1"),
},
}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`sort_by_label_numeric(alias_numbers_with_special_chars)`, func(t *testing.T) {
t.Parallel()
q := `sort_by_label_numeric((
label_set(4, "a", "DS50:1/0/15"),
label_set(1, "a", "DS50:1/0/0"),
label_set(2, "a", "DS50:1/0/1"),
label_set(3, "a", "DS50:1/0/2"),
), "a")`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, 1, 1, 1, 1, 1},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("a"),
Value: []byte("DS50:1/0/0"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2, 2, 2, 2, 2, 2},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("a"),
Value: []byte("DS50:1/0/1"),
},
}
r3 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{3, 3, 3, 3, 3, 3},
Timestamps: timestampsExpected,
}
r3.MetricName.Tags = []storage.Tag{
{
Key: []byte("a"),
Value: []byte("DS50:1/0/2"),
},
}
r4 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{4, 4, 4, 4, 4, 4},
Timestamps: timestampsExpected,
}
r4.MetricName.Tags = []storage.Tag{
{
Key: []byte("a"),
Value: []byte("DS50:1/0/15"),
},
}
resultExpected := []netstorage.Result{r1, r2, r3, r4}
f(q, resultExpected)
})
}
func TestExecError(t *testing.T) {
@@ -7781,6 +8126,8 @@ func TestExecError(t *testing.T) {
f(`nonexisting()`)
// Invalid number of args
f(`range_stddev()`)
f(`range_stdvar()`)
f(`range_quantile()`)
f(`range_quantile(1, 2, 3)`)
f(`range_median()`)
@@ -7811,6 +8158,8 @@ func TestExecError(t *testing.T) {
f(`sort_desc()`)
f(`sort_by_label()`)
f(`sort_by_label_desc()`)
f(`sort_by_label_numeric()`)
f(`sort_by_label_numeric_desc()`)
f(`timestamp()`)
f(`timestamp_with_name()`)
f(`vector()`)
@@ -7842,6 +8191,7 @@ func TestExecError(t *testing.T) {
f(`range_sum(1, 2)`)
f(`range_first(1, 2)`)
f(`range_last(1, 2)`)
f(`range_linear_regression(1, 2)`)
f(`smooth_exponential()`)
f(`smooth_exponential(1)`)
f(`remove_resets()`)
@@ -7933,6 +8283,7 @@ func TestExecError(t *testing.T) {
f(`round(1, 1 or label_set(2, "xx", "foo"))`)
f(`histogram_quantile(1 or label_set(2, "xx", "foo"), 1)`)
f(`histogram_quantiles("foo", 1 or label_set(2, "xxx", "foo"), 2)`)
f(`sort_by_label_numeric(1, 2)`)
f(`label_set(1, 2, 3)`)
f(`label_set(1, "foo", (label_set(1, "foo", bar") or label_set(2, "xxx", "yy")))`)
f(`label_set(1, "foo", 3)`)

View File

@@ -167,6 +167,8 @@ var rollupFuncsCanAdjustWindow = map[string]bool{
"timestamp": true,
}
// rollupFuncsRemoveCounterResets contains functions, which need to call removeCounterResets
// over input samples before calling the corresponding rollup functions.
var rollupFuncsRemoveCounterResets = map[string]bool{
"increase": true,
"increase_prometheus": true,
@@ -177,6 +179,36 @@ var rollupFuncsRemoveCounterResets = map[string]bool{
"rollup_rate": true,
}
// rollupFuncsSamplesScannedPerCall contains functions, which scan lower number of samples
// than is passed to the rollup func.
//
// It is expected that the remaining rollupFuncs scan all the samples passed to them.
var rollupFuncsSamplesScannedPerCall = map[string]int{
"absent_over_time": 1,
"count_over_time": 1,
"default_rollup": 1,
"delta": 2,
"delta_prometheus": 2,
"deriv_fast": 2,
"first_over_time": 1,
"idelta": 2,
"ideriv": 2,
"increase": 2,
"increase_prometheus": 2,
"increase_pure": 2,
"irate": 2,
"lag": 1,
"last_over_time": 1,
"lifetime": 2,
"present_over_time": 1,
"rate": 2,
"scrape_interval": 2,
"tfirst_over_time": 1,
"timestamp": 1,
"timestamp_with_name": 1,
"tlast_over_time": 1,
}
// These functions don't change physical meaning of input time series,
// so they don't drop metric name
var rollupFuncsKeepMetricName = map[string]bool{
@@ -248,14 +280,17 @@ func getRollupAggrFuncNames(expr metricsql.Expr) ([]string, error) {
return aggrFuncNames, nil
}
func getRollupConfigs(name string, rf rollupFunc, expr metricsql.Expr, start, end, step int64, maxPointsPerSeries int, window, lookbackDelta int64, sharedTimestamps []int64) (
func getRollupConfigs(funcName string, rf rollupFunc, expr metricsql.Expr, start, end, step int64, maxPointsPerSeries int,
window, lookbackDelta int64, sharedTimestamps []int64) (
func(values []float64, timestamps []int64), []*rollupConfig, error) {
preFunc := func(values []float64, timestamps []int64) {}
if rollupFuncsRemoveCounterResets[name] {
funcName = strings.ToLower(funcName)
if rollupFuncsRemoveCounterResets[funcName] {
preFunc = func(values []float64, timestamps []int64) {
removeCounterResets(values)
}
}
samplesScannedPerCall := rollupFuncsSamplesScannedPerCall[funcName]
newRollupConfig := func(rf rollupFunc, tagValue string) *rollupConfig {
return &rollupConfig{
TagValue: tagValue,
@@ -267,10 +302,11 @@ func getRollupConfigs(name string, rf rollupFunc, expr metricsql.Expr, start, en
MaxPointsPerSeries: maxPointsPerSeries,
MayAdjustWindow: rollupFuncsCanAdjustWindow[name],
LookbackDelta: lookbackDelta,
Timestamps: sharedTimestamps,
isDefaultRollup: name == "default_rollup",
MayAdjustWindow: rollupFuncsCanAdjustWindow[funcName],
LookbackDelta: lookbackDelta,
Timestamps: sharedTimestamps,
isDefaultRollup: funcName == "default_rollup",
samplesScannedPerCall: samplesScannedPerCall,
}
}
appendRollupConfigs := func(dst []*rollupConfig) []*rollupConfig {
@@ -280,7 +316,7 @@ func getRollupConfigs(name string, rf rollupFunc, expr metricsql.Expr, start, en
return dst
}
var rcs []*rollupConfig
switch name {
switch funcName {
case "rollup":
rcs = appendRollupConfigs(rcs)
case "rollup_rate", "rollup_deriv":
@@ -420,6 +456,11 @@ type rollupConfig struct {
// Whether default_rollup is used.
isDefaultRollup bool
// The estimated number of samples scanned per Func call.
//
// If zero, then it is considered that Func scans all the samples passed to it.
samplesScannedPerCall int
}
func (rc *rollupConfig) getTimestamps() []int64 {
@@ -562,7 +603,8 @@ func (rc *rollupConfig) doInternal(dstValues []float64, tsm *timeseriesMap, valu
ni := 0
nj := 0
f := rc.Func
var samplesScanned uint64
samplesScanned := uint64(len(values))
samplesScannedPerCall := uint64(rc.samplesScannedPerCall)
for _, tEnd := range rc.Timestamps {
tStart := tEnd - window
ni = seekFirstTimestampIdxAfter(timestamps[i:], tStart, ni)
@@ -594,7 +636,11 @@ func (rc *rollupConfig) doInternal(dstValues []float64, tsm *timeseriesMap, valu
rfa.currTimestamp = tEnd
value := f(rfa)
rfa.idx++
samplesScanned += uint64(len(rfa.values))
if samplesScannedPerCall > 0 {
samplesScanned += samplesScannedPerCall
} else {
samplesScanned += uint64(len(rfa.values))
}
dstValues = append(dstValues, value)
}
putRollupFuncArg(rfa)
@@ -848,7 +894,7 @@ func newRollupPredictLinear(args []interface{}) (rollupFunc, error) {
return nil, err
}
rf := func(rfa *rollupFuncArg) float64 {
v, k := linearRegression(rfa)
v, k := linearRegression(rfa.values, rfa.timestamps, rfa.currTimestamp)
if math.IsNaN(v) {
return nan
}
@@ -858,13 +904,8 @@ func newRollupPredictLinear(args []interface{}) (rollupFunc, error) {
return rf, nil
}
func linearRegression(rfa *rollupFuncArg) (float64, float64) {
// There is no need in handling NaNs here, since they must be cleaned up
// before calling rollup funcs.
values := rfa.values
timestamps := rfa.timestamps
n := float64(len(values))
if n == 0 {
func linearRegression(values []float64, timestamps []int64, interceptTime int64) (float64, float64) {
if len(values) == 0 {
return nan, nan
}
if areConstValues(values) {
@@ -872,25 +913,32 @@ func linearRegression(rfa *rollupFuncArg) (float64, float64) {
}
// See https://en.wikipedia.org/wiki/Simple_linear_regression#Numerical_example
interceptTime := rfa.currTimestamp
vSum := float64(0)
tSum := float64(0)
tvSum := float64(0)
ttSum := float64(0)
n := 0
for i, v := range values {
if math.IsNaN(v) {
continue
}
dt := float64(timestamps[i]-interceptTime) / 1e3
vSum += v
tSum += dt
tvSum += dt * v
ttSum += dt * dt
n++
}
if n == 0 {
return nan, nan
}
k := float64(0)
tDiff := ttSum - tSum*tSum/n
tDiff := ttSum - tSum*tSum/float64(n)
if math.Abs(tDiff) >= 1e-6 {
// Prevent from incorrect division for too small tDiff values.
k = (tvSum - tSum*vSum/n) / tDiff
k = (tvSum - tSum*vSum/float64(n)) / tDiff
}
v := vSum/n - k*tSum/n
v := vSum/float64(n) - k*tSum/float64(n)
return v, k
}
@@ -1122,11 +1170,7 @@ func newRollupQuantiles(args []interface{}) (rollupFunc, error) {
// before calling rollup funcs.
values := rfa.values
if len(values) == 0 {
return rfa.prevValue
}
if len(values) == 1 {
// Fast path - only a single value.
return values[0]
return nan
}
qs := getFloat64s()
qs.A = quantiles(qs.A[:0], phis, values)
@@ -1349,15 +1393,11 @@ func rollupRateOverSum(rfa *rollupFuncArg) float64 {
// Assume that the value didn't change since rfa.prevValue.
return 0
}
dt := rfa.window
if !math.IsNaN(rfa.prevValue) {
dt = timestamps[len(timestamps)-1] - rfa.prevTimestamp
}
sum := float64(0)
for _, v := range rfa.values {
sum += v
}
return sum / (float64(dt) / 1e3)
return sum / (float64(rfa.window) / 1e3)
}
func rollupRange(rfa *rollupFuncArg) float64 {
@@ -1435,16 +1475,20 @@ func rollupStaleSamples(rfa *rollupFuncArg) float64 {
}
func rollupStddev(rfa *rollupFuncArg) float64 {
stdvar := rollupStdvar(rfa)
return math.Sqrt(stdvar)
return stddev(rfa.values)
}
func rollupStdvar(rfa *rollupFuncArg) float64 {
// See `Rapid calculation methods` at https://en.wikipedia.org/wiki/Standard_deviation
return stdvar(rfa.values)
}
// There is no need in handling NaNs here, since they must be cleaned up
// before calling rollup funcs.
values := rfa.values
func stddev(values []float64) float64 {
v := stdvar(values)
return math.Sqrt(v)
}
func stdvar(values []float64) float64 {
// See `Rapid calculation methods` at https://en.wikipedia.org/wiki/Standard_deviation
if len(values) == 0 {
return nan
}
@@ -1456,11 +1500,17 @@ func rollupStdvar(rfa *rollupFuncArg) float64 {
var count float64
var q float64
for _, v := range values {
if math.IsNaN(v) {
continue
}
count++
avgNew := avg + (v-avg)/count
q += (v - avg) * (v - avgNew)
avg = avgNew
}
if count == 0 {
return nan
}
return q / count
}
@@ -1499,11 +1549,8 @@ func rollupDelta(rfa *rollupFuncArg) float64 {
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/894
return values[len(values)-1] - rfa.realPrevValue
}
// Assume that the previous non-existing value was 0 only in the following cases:
//
// - If the delta with the next value equals to 0.
// This is the case for slow-changing counter - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/962
// - If the first value doesn't exceed too much the delta with the next value.
// Assume that the previous non-existing value was 0
// only if the first value doesn't exceed too much the delta with the next value.
//
// This should prevent from improper increase() results for os-level counters
// such as cpu time or bytes sent over the network interface.
@@ -1517,9 +1564,6 @@ func rollupDelta(rfa *rollupFuncArg) float64 {
} else if !math.IsNaN(rfa.realNextValue) {
d = rfa.realNextValue - values[0]
}
if d == 0 {
d = 10
}
if math.Abs(values[0]) < 10*(math.Abs(d)+1) {
prevValue = 0
} else {
@@ -1573,7 +1617,7 @@ func rollupIdelta(rfa *rollupFuncArg) float64 {
func rollupDerivSlow(rfa *rollupFuncArg) float64 {
// Use linear regression like Prometheus does.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/73
_, k := linearRegression(rfa)
_, k := linearRegression(rfa.values, rfa.timestamps, rfa.currTimestamp)
return k
}

View File

@@ -152,6 +152,9 @@ func InitRollupResultCache(cachePath string) {
metrics.GetOrCreateGauge(`vm_cache_size_bytes{type="promql/rollupResult"}`, func() float64 {
return float64(fcs().BytesSize)
})
metrics.GetOrCreateGauge(`vm_cache_size_max_bytes{type="promql/rollupResult"}`, func() float64 {
return float64(fcs().MaxBytesSize)
})
metrics.GetOrCreateGauge(`vm_cache_requests_total{type="promql/rollupResult"}`, func() float64 {
return float64(fcs().GetCalls)
})
@@ -430,8 +433,7 @@ func mustLoadRollupResultCacheKeyPrefix(path string) {
func mustSaveRollupResultCacheKeyPrefix(path string) {
path = path + ".key.prefix"
data := encoding.MarshalUint64(nil, rollupResultCacheKeyPrefix)
fs.MustRemoveAll(path)
if err := fs.WriteFileAtomically(path, data); err != nil {
if err := fs.WriteFileAtomically(path, data, true); err != nil {
logger.Fatalf("cannot store rollupResult cache key prefix to %q: %s", path, err)
}
}

View File

@@ -388,12 +388,7 @@ func TestRollupPredictLinear(t *testing.T) {
func TestLinearRegression(t *testing.T) {
f := func(values []float64, timestamps []int64, expV, expK float64) {
t.Helper()
rfa := &rollupFuncArg{
values: values,
timestamps: timestamps,
currTimestamp: timestamps[0] + 100,
}
v, k := linearRegression(rfa)
v, k := linearRegression(values, timestamps, timestamps[0]+100)
if err := compareValues([]float64{v}, []float64{expV}); err != nil {
t.Fatalf("unexpected v err: %s", err)
}
@@ -587,8 +582,8 @@ func TestRollupNoWindowNoPoints(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned != 0 {
t.Fatalf("expecting zero samplesScanned from rollupConfig.Do; got %d", samplesScanned)
if samplesScanned != 12 {
t.Fatalf("expecting 12 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, nan, nan, nan, nan}
timestampsExpected := []int64{0, 1, 2, 3, 4}
@@ -626,8 +621,8 @@ func TestRollupWindowNoPoints(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned != 0 {
t.Fatalf("expecting zero samplesScanned from rollupConfig.Do; got %d", samplesScanned)
if samplesScanned != 12 {
t.Fatalf("expecting 12 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, nan, nan, nan, nan}
timestampsExpected := []int64{0, 1, 2, 3, 4}
@@ -644,8 +639,8 @@ func TestRollupWindowNoPoints(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned != 0 {
t.Fatalf("expecting zero samplesScanned from rollupConfig.Do; got %d", samplesScanned)
if samplesScanned != 12 {
t.Fatalf("expecting 12 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, nan, nan, nan}
timestampsExpected := []int64{161, 171, 181, 191}
@@ -665,8 +660,8 @@ func TestRollupNoWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 15 {
t.Fatalf("expecting 15 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 123, nan, 34, nan, 44}
timestampsExpected := []int64{0, 5, 10, 15, 20, 25}
@@ -683,8 +678,8 @@ func TestRollupNoWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 16 {
t.Fatalf("expecting 16 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{44, 32, 34, nan}
timestampsExpected := []int64{100, 120, 140, 160}
@@ -701,8 +696,8 @@ func TestRollupNoWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, nan, 123, 34, 32}
timestampsExpected := []int64{-50, 0, 50, 100, 150}
@@ -722,8 +717,8 @@ func TestRollupWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 16 {
t.Fatalf("expecting 16 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 123, 123, 34, 34}
timestampsExpected := []int64{0, 5, 10, 15, 20}
@@ -740,8 +735,8 @@ func TestRollupWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 16 {
t.Fatalf("expecting 16 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{44, 34, 34, nan}
timestampsExpected := []int64{100, 120, 140, 160}
@@ -758,8 +753,8 @@ func TestRollupWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 15 {
t.Fatalf("expecting 15 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 54, 44, nan}
timestampsExpected := []int64{0, 50, 100, 150}
@@ -779,8 +774,8 @@ func TestRollupFuncsLookbackDelta(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 18 {
t.Fatalf("expecting 18 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{99, nan, 44, nan, 32, 34, nan}
timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
@@ -797,8 +792,8 @@ func TestRollupFuncsLookbackDelta(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 18 {
t.Fatalf("expecting 18 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{99, nan, 44, nan, 32, 34, nan}
timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
@@ -815,8 +810,8 @@ func TestRollupFuncsLookbackDelta(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 18 {
t.Fatalf("expecting 18 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{99, nan, 44, nan, 32, 34, nan}
timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
@@ -836,8 +831,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 123, 54, 44, 34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -854,8 +849,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 4, 4, 3, 1}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -872,8 +867,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 21, 12, 32, 34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -890,8 +885,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 123, 99, 44, 34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -908,8 +903,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 222, 199, 110, 34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -926,8 +921,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 21, -9, 22, 0}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -944,8 +939,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, -102, -42, -10, nan}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -962,8 +957,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{123, 33, -87, 0}
timestampsExpected := []int64{10, 50, 90, 130}
@@ -980,8 +975,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 0.004, 0, 0, 0.03}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -998,8 +993,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 0.031, 0.044, 0.04, 0.01}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1016,8 +1011,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 47 {
t.Fatalf("expecting 47 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 0.031, 0.075, 0.115, 0.125}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1034,8 +1029,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 0.010333333333333333, 0.011, 0.013333333333333334, 0.01}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1052,8 +1047,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 35 {
t.Fatalf("expecting 35 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 0.010333333333333333, 0.010714285714285714, 0.012, 0.0125}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1070,8 +1065,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 4, 4, 3, 0}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1088,8 +1083,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 3, 3, 2, 0}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1106,8 +1101,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 16 {
t.Fatalf("expecting 16 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 1, 1, 1, 1, 0}
timestampsExpected := []int64{0, 9, 18, 27, 36, 45}
@@ -1124,8 +1119,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 2, 2, 1, 0}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1142,8 +1137,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 55.5, 49.75, 36.666666666666664, 34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1160,8 +1155,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, -2879.310344827588, 127.87627310448904, -496.5831435079728, 0}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1178,8 +1173,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 14 {
t.Fatalf("expecting 14 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, nan, nan, 0, -8900, 0}
timestampsExpected := []int64{0, 4, 8, 12, 16, 20}
@@ -1196,8 +1191,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, -1916.6666666666665, -43500, 400, 0}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1214,8 +1209,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 39.81519810323691, 32.080952292598795, 5.2493385826745405, 0}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1232,8 +1227,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 2.148, 1.593, 1.156, 1.36}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1250,8 +1245,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 24 {
t.Fatalf("expecting 24 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 4, 4, 3, 1}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1268,8 +1263,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 35 {
t.Fatalf("expecting 35 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 4, 7, 6, 3}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1286,8 +1281,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 35 {
t.Fatalf("expecting 35 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 21, 34, 34, 34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1304,10 +1299,10 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 35 {
t.Fatalf("expecting 35 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, 2775, 5262.5, 3678.5714285714284, 2880}
valuesExpected := []float64{nan, 2775, 5262.5, 3862.5, 1800}
timestampsExpected := []int64{0, 40, 80, 120, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -1322,8 +1317,8 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = rc.getTimestamps()
values, samplesScanned := rc.Do(nil, testValues, testTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 35 {
t.Fatalf("expecting 35 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{nan, -0.86650328627136, -1.1200838283548589, -0.40035755084856683, nan}
timestampsExpected := []int64{0, 40, 80, 120, 160}
@@ -1348,8 +1343,8 @@ func TestRollupBigNumberOfValues(t *testing.T) {
srcTimestamps[i] = int64(i / 2)
}
values, samplesScanned := rc.Do(nil, srcValues, srcTimestamps)
if samplesScanned == 0 {
t.Fatalf("expecting non-zero samplesScanned from rollupConfig.Do")
if samplesScanned != 22002 {
t.Fatalf("expecting 22002 samplesScanned from rollupConfig.Do; got %d", samplesScanned)
}
valuesExpected := []float64{1, 4001, 8001, 9999, nan, nan}
timestampsExpected := []int64{0, 2000, 4000, 6000, 8000, 10000}
@@ -1425,19 +1420,16 @@ func TestRollupDelta(t *testing.T) {
// Small initial value
f(nan, nan, nan, []float64{1}, 1)
f(nan, nan, nan, []float64{10}, 10)
f(nan, nan, nan, []float64{100}, 100)
f(nan, nan, nan, []float64{10}, 0)
f(nan, nan, nan, []float64{100}, 0)
f(nan, nan, nan, []float64{1, 2, 3}, 3)
f(1, nan, nan, []float64{1, 2, 3}, 2)
f(nan, nan, nan, []float64{5, 6, 8}, 8)
f(2, nan, nan, []float64{5, 6, 8}, 6)
// Moderate initial value with zero delta after that.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/962
f(nan, nan, nan, []float64{100}, 100)
f(nan, nan, nan, []float64{100, 100}, 100)
f(nan, nan, nan, []float64{100, 100}, 0)
// Big initial value with with zero delta after that.
// Big initial value with zero delta after that.
f(nan, nan, nan, []float64{1000}, 0)
f(nan, nan, nan, []float64{1000, 1000}, 0)

Some files were not shown because too many files have changed in this diff Show More