mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2026-06-07 19:06:17 +03:00
Compare commits
382 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
7ce87ebcb2 | ||
|
|
1051d8aa2d | ||
|
|
689cf88eb2 | ||
|
|
bdd0a1cdb2 | ||
|
|
acf1a2c72b | ||
|
|
89315d719d | ||
|
|
dc9d7aedd5 | ||
|
|
7373986f9e | ||
|
|
7bf5d48315 | ||
|
|
3e451ccdda | ||
|
|
fe3444b124 | ||
|
|
77be066ee8 | ||
|
|
1837f2f7d3 | ||
|
|
f5d52b51f1 | ||
|
|
31ec79eaf6 | ||
|
|
c8ea697db8 | ||
|
|
2140ccbdcc | ||
|
|
7976c22797 | ||
|
|
2c44f9989a | ||
|
|
e61e3bf174 | ||
|
|
89611fa48c | ||
|
|
14f0f90507 | ||
|
|
24ffad74c1 | ||
|
|
6740294ebb | ||
|
|
2e2e4f7e21 | ||
|
|
9dcb18e03d | ||
|
|
0477991b4d | ||
|
|
b1f9b39c4b | ||
|
|
39b11b3ff4 | ||
|
|
7bd420cbfe | ||
|
|
85962b459f | ||
|
|
f6ca776c75 | ||
|
|
70df5f4975 | ||
|
|
c86286ec1d | ||
|
|
261535b32d | ||
|
|
4b7105a65b | ||
|
|
df0309eae0 | ||
|
|
ad4e6a9283 | ||
|
|
59183f66d0 | ||
|
|
fb338c50a3 | ||
|
|
86630350bf | ||
|
|
490c69c64e | ||
|
|
932e53522d | ||
|
|
1de15ad490 | ||
|
|
1f2944a9d0 | ||
|
|
cab7e936a3 | ||
|
|
0326638c90 | ||
|
|
4eb520a342 | ||
|
|
b21e16ad0c | ||
|
|
820669da69 | ||
|
|
8dd03ecf19 | ||
|
|
9e4ed5e591 | ||
|
|
9df60518bb | ||
|
|
c270f8f3e6 | ||
|
|
46dba00756 | ||
|
|
de89bcddae | ||
|
|
0f99c1afb1 | ||
|
|
750daa04d1 | ||
|
|
e4f856e900 | ||
|
|
e15b20dde3 | ||
|
|
13804bda8f | ||
|
|
404cbd1522 | ||
|
|
88ac4dfc07 | ||
|
|
17c2ce18fd | ||
|
|
d65c03c004 | ||
|
|
ebf8da3730 | ||
|
|
e6666da4e7 | ||
|
|
97686ddc65 | ||
|
|
43577a8237 | ||
|
|
8df25e12d8 | ||
|
|
d8197f4a55 | ||
|
|
8aa2f448a8 | ||
|
|
2dfa746c91 | ||
|
|
9abb2d6c74 | ||
|
|
27f0261257 | ||
|
|
2a1550f341 | ||
|
|
0d2c4f252f | ||
|
|
0e082b1c76 | ||
|
|
1b9992b42a | ||
|
|
795e32be4a | ||
|
|
4215182e61 | ||
|
|
e8f645bf52 | ||
|
|
a4c7fcb5e1 | ||
|
|
aa56b9217e | ||
|
|
b10ad44692 | ||
|
|
1eabbc0e27 | ||
|
|
a13a443bf7 | ||
|
|
b9913e151a | ||
|
|
b730fc2667 | ||
|
|
11fa458e39 | ||
|
|
149511f5e9 | ||
|
|
2813d0b1e0 | ||
|
|
95c9b630cc | ||
|
|
60fcac4878 | ||
|
|
5af2a9ca0e | ||
|
|
020917949b | ||
|
|
4e48067133 | ||
|
|
ae3675d3d0 | ||
|
|
6247884057 | ||
|
|
0b2726c3be | ||
|
|
5d426dfe0a | ||
|
|
d006b41eff | ||
|
|
ae972429c7 | ||
|
|
f8e7f433cf | ||
|
|
069c9ade52 | ||
|
|
ce8c2dd1f1 | ||
|
|
5ebfc275e6 | ||
|
|
f93247e82d | ||
|
|
c4c90ab2b1 | ||
|
|
ae10ff8ccd | ||
|
|
4862edfef3 | ||
|
|
9d42546a27 | ||
|
|
710f8ce5aa | ||
|
|
081aa4ad68 | ||
|
|
5f9d88a3cb | ||
|
|
ba8ac08739 | ||
|
|
e7d8d84396 | ||
|
|
30445ed5e9 | ||
|
|
82afcb6d0d | ||
|
|
3ca1ed0fde | ||
|
|
b13680a67e | ||
|
|
0066a02293 | ||
|
|
fd9fd191b9 | ||
|
|
4146fc4668 | ||
|
|
364f30a6e7 | ||
|
|
1906f841c9 | ||
|
|
26df320be5 | ||
|
|
b6b1b06d70 | ||
|
|
5454668709 | ||
|
|
c8133cbb16 | ||
|
|
30deb2b548 | ||
|
|
08b71d2067 | ||
|
|
0f1b969aa6 | ||
|
|
c7ac7c1807 | ||
|
|
05813259dc | ||
|
|
9c1c9d8e76 | ||
|
|
007dbf273d | ||
|
|
82972a8f2a | ||
|
|
83c0c241a7 | ||
|
|
299a35948c | ||
|
|
b0e4b234cb | ||
|
|
6f0038209c | ||
|
|
ae1db8fa08 | ||
|
|
0e46e8df8d | ||
|
|
d305cc2017 | ||
|
|
e2e8ef86d9 | ||
|
|
52915c8f7e | ||
|
|
eb27dbde13 | ||
|
|
9d787f9edd | ||
|
|
66379cc69f | ||
|
|
d0e1589ea9 | ||
|
|
de0643fab5 | ||
|
|
9cd8eb92f1 | ||
|
|
5009b25a03 | ||
|
|
c6dee6c52d | ||
|
|
a7fc84b390 | ||
|
|
2f777d996d | ||
|
|
44a34a0f5f | ||
|
|
4910bac46b | ||
|
|
1982505c2b | ||
|
|
9d87496b50 | ||
|
|
91a4c279cc | ||
|
|
7590b8477b | ||
|
|
b1fd390e16 | ||
|
|
5bf14991a3 | ||
|
|
700bda8e2e | ||
|
|
efdc3c71af | ||
|
|
ca091bade3 | ||
|
|
b35b3dc043 | ||
|
|
0463cb5550 | ||
|
|
357f886f97 | ||
|
|
ace969d595 | ||
|
|
32869e4c0f | ||
|
|
a906b3862f | ||
|
|
eedb79ead8 | ||
|
|
ae457828bc | ||
|
|
51652f638f | ||
|
|
3a32789352 | ||
|
|
2cea4d403f | ||
|
|
3dffc6099e | ||
|
|
b0a5c382ee | ||
|
|
1de1774de6 | ||
|
|
067188501f | ||
|
|
4cb6bcd2d7 | ||
|
|
6b1317b6a4 | ||
|
|
b7fcdb528d | ||
|
|
dabbf930d8 | ||
|
|
1c669a69a8 | ||
|
|
7119f294f3 | ||
|
|
8a057e705a | ||
|
|
b65236530c | ||
|
|
ae04378424 | ||
|
|
bf95fbfc1d | ||
|
|
78d2715d04 | ||
|
|
d0ffb49ee2 | ||
|
|
b7f4fc6e0d | ||
|
|
d48363534a | ||
|
|
0acdab3ab9 | ||
|
|
7e8dcf9ddc | ||
|
|
aa90b93778 | ||
|
|
de523c81b9 | ||
|
|
a724dde90a | ||
|
|
fb8e56d8a2 | ||
|
|
f0c207fae2 | ||
|
|
d3794eb994 | ||
|
|
f765985947 | ||
|
|
e614a14b21 | ||
|
|
9d160f9048 | ||
|
|
d7932775cc | ||
|
|
eec76718e9 | ||
|
|
093a891762 | ||
|
|
c03e4ef9d6 | ||
|
|
de7f315231 | ||
|
|
97a0c80904 | ||
|
|
09105ff49c | ||
|
|
2859a452d4 | ||
|
|
170e2f54ab | ||
|
|
8b116b619a | ||
|
|
6e6d62284c | ||
|
|
a02a12f639 | ||
|
|
f818ab497b | ||
|
|
b73802372a | ||
|
|
2f05f90888 | ||
|
|
7e4bcbd853 | ||
|
|
a11659013f | ||
|
|
a6b2b2c005 | ||
|
|
c2afa3fdd7 | ||
|
|
d4cc934c77 | ||
|
|
870270c75e | ||
|
|
7addbfc831 | ||
|
|
1c477bc2fc | ||
|
|
d57214244d | ||
|
|
84b986b2fc | ||
|
|
1052effb6d | ||
|
|
266788be14 | ||
|
|
cf18df367d | ||
|
|
72ab3f7230 | ||
|
|
30a922f383 | ||
|
|
2c67232565 | ||
|
|
86f99c6b55 | ||
|
|
3c1434118e | ||
|
|
27a417bcd3 | ||
|
|
6fa806f1ca | ||
|
|
f5500251d9 | ||
|
|
5d6d2ef3a6 | ||
|
|
0208d8c103 | ||
|
|
465923b181 | ||
|
|
a1f3795b78 | ||
|
|
414cd39659 | ||
|
|
d100341394 | ||
|
|
6251762787 | ||
|
|
48d033a198 | ||
|
|
4aaee33860 | ||
|
|
6c0d36e4a9 | ||
|
|
ef9a8989fd | ||
|
|
5d27642106 | ||
|
|
0deabbbb4a | ||
|
|
67b41c080d | ||
|
|
6fcbd17bdd | ||
|
|
9ce5c0c33f | ||
|
|
c5daf8a27b | ||
|
|
d9d01f976b | ||
|
|
1f19c167a4 | ||
|
|
cdf1e6684b | ||
|
|
28ea993872 | ||
|
|
149c0c4a6d | ||
|
|
4f8a3af061 | ||
|
|
57a4af98fa | ||
|
|
3fa9ab4a49 | ||
|
|
47a038401b | ||
|
|
077f8cbe1c | ||
|
|
4057305148 | ||
|
|
bb06b98202 | ||
|
|
4adb96161a | ||
|
|
4c8e01b312 | ||
|
|
51c529a2b6 | ||
|
|
1437d6db0c | ||
|
|
e60c0d0bae | ||
|
|
462913ed2f | ||
|
|
1e69c151eb | ||
|
|
348edd92fe | ||
|
|
352485b0de | ||
|
|
9e40eec7d8 | ||
|
|
e205975716 | ||
|
|
6e668fd480 | ||
|
|
47390d8947 | ||
|
|
ba4a2c8bca | ||
|
|
0d7a3f4eb3 | ||
|
|
fc499ab501 | ||
|
|
3adf8c5a6f | ||
|
|
0d1855f661 | ||
|
|
bcd139362b | ||
|
|
6c24c5caa3 | ||
|
|
ef6ab3d2c9 | ||
|
|
41813eb87a | ||
|
|
4e391a5e39 | ||
|
|
bb3b513bdd | ||
|
|
83df20b5b5 | ||
|
|
9e83335ca9 | ||
|
|
5407eed2f6 | ||
|
|
188325f0fc | ||
|
|
55e98e265e | ||
|
|
dbbc160a40 | ||
|
|
9c0e2d2a6e | ||
|
|
82ce930e59 | ||
|
|
dd6bfa50e9 | ||
|
|
43823addea | ||
|
|
5943f49f60 | ||
|
|
9deda5107b | ||
|
|
07f7245aeb | ||
|
|
944c5ea331 | ||
|
|
de81472724 | ||
|
|
f733b0ac9d | ||
|
|
368b69b4c4 | ||
|
|
1cb78ba1a0 | ||
|
|
b378cd6ed8 | ||
|
|
381ad564a2 | ||
|
|
4c808d58bf | ||
|
|
c4e8c34d0e | ||
|
|
b2042a1c30 | ||
|
|
caeb74f068 | ||
|
|
ae91a6883c | ||
|
|
e4182dd896 | ||
|
|
b9e5172aa2 | ||
|
|
600f225cff | ||
|
|
bd81f926a4 | ||
|
|
5a9743211f | ||
|
|
ca8b5745b5 | ||
|
|
f3f62ab04e | ||
|
|
e0a91ef163 | ||
|
|
c87fb9191e | ||
|
|
51e661ecfe | ||
|
|
cd071357d8 | ||
|
|
61579680bb | ||
|
|
fe289331dd | ||
|
|
d396c265a6 | ||
|
|
31918f60b2 | ||
|
|
d62ec1cb01 | ||
|
|
5e75c389e6 | ||
|
|
c0f3be824d | ||
|
|
ca566dce39 | ||
|
|
0b35da159c | ||
|
|
cb71af216a | ||
|
|
daacbc7e34 | ||
|
|
f477cbe861 | ||
|
|
50d44d5932 | ||
|
|
68d004bc05 | ||
|
|
e277c3d07b | ||
|
|
29e4e7f422 | ||
|
|
b7638f04a7 | ||
|
|
c539494b36 | ||
|
|
d12c4914f0 | ||
|
|
64e2d66014 | ||
|
|
4108e85efd | ||
|
|
f0bdc5716e | ||
|
|
67059caa12 | ||
|
|
de3fe22815 | ||
|
|
055f152246 | ||
|
|
20311f6065 | ||
|
|
a51a7b2a20 | ||
|
|
bca468bb55 | ||
|
|
0729cc36b2 | ||
|
|
5bfd4e6218 | ||
|
|
920300643a | ||
|
|
ef77120170 | ||
|
|
b3f3c078e5 | ||
|
|
84e3881c0b | ||
|
|
2ed069c3bc | ||
|
|
28353e48ca | ||
|
|
01987f8c77 | ||
|
|
d2960a20e0 | ||
|
|
d4f12e0fbb | ||
|
|
e6ab69dd88 | ||
|
|
ed5f05024b | ||
|
|
43aa737e23 | ||
|
|
46dccc1088 | ||
|
|
96cdfcba50 | ||
|
|
09d60d64a9 | ||
|
|
c37e5de66f | ||
|
|
3b847d32d9 | ||
|
|
590d8d537f | ||
|
|
bc42b5598f |
5
.github/ISSUE_TEMPLATE/bug_report.md
vendored
@@ -9,9 +9,12 @@ assignees: ''
|
||||
|
||||
**Describe the bug**
|
||||
A clear and concise description of what the bug is.
|
||||
It would be great to [upgrade](https://victoriametrics.github.io/#how-to-upgrade) to [the latest available release](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
and verify whether the bug is reproducible there.
It is also recommended to read the [troubleshooting docs](https://victoriametrics.github.io/#troubleshooting).
|
||||
|
||||
**To Reproduce**
|
||||
Steps to reproduce the behavior
|
||||
Steps to reproduce the behavior.
|
||||
|
||||
**Expected behavior**
|
||||
A clear and concise description of what you expected to happen.
|
||||
|
||||
4
.github/workflows/main.yml
vendored
@@ -19,12 +19,10 @@ jobs:
|
||||
go-version: 1.15
|
||||
id: go
|
||||
- name: Dependencies
|
||||
env:
|
||||
GO111MODULE: on
|
||||
run: |
|
||||
go get -u golang.org/x/lint/golint
|
||||
go get -u github.com/kisielk/errcheck
|
||||
curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.27.0
|
||||
curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.29.0
|
||||
- name: Code checkout
|
||||
uses: actions/checkout@master
|
||||
- name: Build
|
||||
|
||||
115
CHANGELOG.md
@@ -1,115 +0,0 @@
|
||||
# tip
|
||||
|
||||
|
||||
# [v1.44.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.44.0)
|
||||
|
||||
* FEATURE: automatically add missing label filters to binary operands as described at https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization .
|
||||
This should improve performance for queries with missing label filters in binary operands. For example, the following query should work faster now, because it shouldn't
|
||||
fetch and discard time series for `node_filesystem_files_free` metric without matching labels for the left side of the expression:
|
||||
```
|
||||
node_filesystem_files{ host="$host", mountpoint="/" } - node_filesystem_files_free
|
||||
```
|
||||
* FEATURE: vmagent: add Docker Swarm service discovery (aka [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config)).
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/656
|
||||
* FEATURE: add ability to export data in CSV format. See [these docs](https://victoriametrics.github.io/#how-to-export-csv-data) for details.
|
||||
* FEATURE: vmagent: add `-promscrape.suppressDuplicateScrapeTargetErrors` command-line flag for suppressing `duplicate scrape target` errors.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651 and https://victoriametrics.github.io/vmagent.html#troubleshooting .
|
||||
* FEATURE: vmagent: show original labels before relabeling is applied on `duplicate scrape target` errors. This should simplify debugging for incorrect relabeling.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651
|
||||
* FEATURE: vmagent: `/targets` page now accepts optional `show_original_labels=1` query arg for displaying original labels for each target before relabeling is applied.
|
||||
This should simplify debugging for target relabeling configs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651
|
||||
* FEATURE: add `-finalMergeDelay` command-line flag for configuring the delay before final merge for per-month partitions.
|
||||
The final merge is started after no new data is ingested into per-month partition during `-finalMergeDelay`.
|
||||
* FEATURE: add `vm_rows_added_to_storage_total` metric, which shows the total number of rows added to storage since app start.
|
||||
The `sum(rate(vm_rows_added_to_storage_total))` can be smaller than `sum(rate(vm_rows_inserted_total))` if certain metrics are dropped
|
||||
due to [relabeling](https://victoriametrics.github.io/#relabeling). The `sum(rate(vm_rows_added_to_storage_total))` can be bigger
|
||||
than `sum(rate(vm_rows_inserted_total))` if [replication](https://victoriametrics.github.io/Cluster-VictoriaMetrics.html#replication-and-data-safety) is enabled.
|
||||
* FEATURE: keep metric name after applying [MetricsQL](https://victoriametrics.github.io/MetricsQL.html) functions, which don't change time series meaning.
|
||||
The list of such functions:
|
||||
* `keep_last_value`
|
||||
* `keep_next_value`
|
||||
* `interpolate`
|
||||
* `running_min`
|
||||
* `running_max`
|
||||
* `running_avg`
|
||||
* `range_min`
|
||||
* `range_max`
|
||||
* `range_avg`
|
||||
* `range_first`
|
||||
* `range_last`
|
||||
* `range_quantile`
|
||||
* `smooth_exponential`
|
||||
* `ceil`
|
||||
* `floor`
|
||||
* `round`
|
||||
* `clamp_min`
|
||||
* `clamp_max`
|
||||
* `max_over_time`
|
||||
* `min_over_time`
|
||||
* `avg_over_time`
|
||||
* `quantile_over_time`
|
||||
* `mode_over_time`
|
||||
* `geomean_over_time`
|
||||
* `holt_winters`
|
||||
* `predict_linear`
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
|
||||
|
||||
* BUGFIX: properly handle stale time series after K8S deployment. Previously such time series could be double-counted.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
|
||||
* BUGFIX: return at most a single time series from the `absent()` function, like Prometheus does.
|
||||
* BUGFIX: vmalert: accept days, weeks and years in `for: ` part of config like Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817
|
||||
* BUGFIX: fix `mode_over_time(m[d])` calculations. Previously the function could return incorrect results.
|
||||
|
||||
|
||||
# [v1.43.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.43.0)
|
||||
|
||||
* FEATURE: reduce CPU usage for repeated queries over sliding time window when no new time series are added to the database.
|
||||
Typical use cases: repeated evaluation of alerting rules in [vmalert](https://victoriametrics.github.io/vmalert.html) or dashboard auto-refresh in Grafana.
|
||||
* FEATURE: vmagent: add OpenStack service discovery aka [openstack_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config).
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728 .
|
||||
* FEATURE: vmalert: make `-maxIdleConnections` configurable for datasource HTTP client. This option can be used for minimizing connection churn.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/795 .
|
||||
* FEATURE: add `-influx.maxLineSize` command-line flag for configuring the maximum size for a single Influx line during parsing.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/807
|
||||
|
||||
* BUGFIX: properly handle `inf` values during [background merge of LSM parts](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
|
||||
Previously `Inf` values could result in `NaN` values for adjacent samples in time series. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/805 .
|
||||
* BUGFIX: fill gaps on graphs for `range_*` and `running_*` functions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/806 .
|
||||
* BUGFIX: make a copy of label with new name during relabeling with `action: labelmap` in the same way as Prometheus does.
|
||||
Previously the original label name has been replaced. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/812 .
|
||||
* BUGFIX: support parsing floating-point timestamps like Graphite Carbon does. Such timestamps are truncated to seconds.
|
||||
|
||||
|
||||
# [v1.42.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.42.0)
|
||||
|
||||
* FEATURE: use all the available CPU cores when accepting data via a single TCP connection
|
||||
for [all the supported protocols](https://victoriametrics.github.io/#how-to-import-time-series-data).
|
||||
Previously data ingested via a single TCP connection could use only a single CPU core. This could limit data ingestion performance.
|
||||
The main benefit of this feature is that data can be imported at max speed via a single connection - there is no need to open multiple concurrent
|
||||
connections to VictoriaMetrics or [vmagent](https://victoriametrics.github.io/vmagent.html) in order to achieve the maximum data ingestion speed.
|
||||
* FEATURE: cluster: improve performance for data ingestion path from `vminsert` to `vmstorage` nodes. The maximum data ingestion performance
|
||||
for a single connection between `vminsert` and `vmstorage` node scales with the number of available CPU cores on `vmstorage` side.
|
||||
This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/791 .
|
||||
* FEATURE: add ability to export / import data in native format via `/api/v1/export/native` and `/api/v1/import/native`.
|
||||
This is the most optimized approach for data migration between VictoriaMetrics instances. Both single-node and cluster instances are supported.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/787#issuecomment-700632551 .
|
||||
* FEATURE: add `reduce_mem_usage` query option to `/api/v1/export` in order to reduce memory usage during data export / import.
|
||||
See [these docs](https://victoriametrics.github.io/#how-to-export-data-in-json-line-format) for details.
|
||||
* FEATURE: improve performance for `/api/v1/series` handler when it returns a big number of time series.
|
||||
* FEATURE: add `vm_merge_need_free_disk_space` metric, which can be used for estimating the number of deferred background data merges due to the lack of free disk space.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686 .
|
||||
* FEATURE: add OpenBSD support. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/785 .
|
||||
|
||||
* BUGFIX: properly apply `-search.maxStalenessInterval` command-line flag value. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/784 .
|
||||
* BUGFIX: fix displaying data in Grafana tables. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/720 .
|
||||
* BUGFIX: do not adjust the number of detected CPU cores found at `/sys/devices/system/cpu/online`.
|
||||
The adjustment increased the resulting GOMAXPROCS by 1, which confused users.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685#issuecomment-698595309 .
|
||||
* BUGFIX: vmagent: do not show `-remoteWrite.url` in initial logs if `-remoteWrite.showURL` isn't set. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/773 .
|
||||
* BUGFIX: properly handle case when [/metrics/find](https://victoriametrics.github.io/#graphite-metrics-api-usage) finds both a leaf and a node for the given `query=prefix.*`.
|
||||
In this case only the node must be returned, with the trailing dot stripped from its id, as carbonapi does.
|
||||
|
||||
|
||||
# Previous releases
|
||||
|
||||
See [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases).
|
||||
120
CODE_OF_CONDUCT_RU.md
Normal file
@@ -0,0 +1,120 @@
|
||||
|
||||
# Contributor Code of Conduct

## Our Pledge

We, as members, contributors, and leaders, pledge to make participation in our community
a harassment-free experience for everyone, regardless of age, body size,
visible or invisible disability, ethnicity,
sex characteristics, gender identity and expression, level of experience,
education, socio-economic status, nationality, personal appearance,
race, religion, or sexual identity and orientation.

We pledge to act and interact in ways that contribute to an open,
welcoming, diverse, inclusive, and healthy community.
|
||||
|
||||
## Our Standards

Examples of behavior that contributes to a positive environment include:

* Demonstrating kindness and empathy toward other project participants
* Respecting differing viewpoints and experiences
* Giving and accepting constructive criticism
* Accepting responsibility, apologizing to those affected by our mistakes,
  and learning from the experience
* Focusing on what is best for the community, not just for us as individuals

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery, and unwelcome sexual attention or advances of any kind
* Trolling, insulting or derogatory comments, personal attacks, or bringing up political beliefs
* Public or private harassment
* Publishing others' private information, such as a physical or e-mail address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting
|
||||
|
||||
## Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of acceptable
behavior and will take appropriate and fair corrective action
in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.

Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are not aligned
with this Code of Conduct, and will communicate the reasons for their decisions when appropriate.
|
||||
|
||||
## Scope

This Code of Conduct applies within all public physical and digital community spaces,
and also when an individual is officially representing the community in public spaces.
Examples of representing the project or community include using an official e-mail address,
posting via an official social media account,
or acting as an appointed representative at an online or offline event.
|
||||
|
||||
## Enforcement

Instances of harassment, abusive, or otherwise unacceptable
behavior may be reported to the responsible community leaders by e-mail at info@victoriametrics.com.
All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security
of the reporter.
|
||||
|
||||
## Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines
in determining the consequences for anyone they deem to be in violation of this Code of Conduct:

### 1. Correction

**Community Impact**: Use of inappropriate language or other behavior
deemed unprofessional or unwelcome in the community.

**Consequence**: A private, written warning from community leaders,
explaining the nature of the violation and why the behavior
was inappropriate. Community leaders may request a public apology.
|
||||
|
||||
### 2. Warning

**Community Impact**: A violation through a single incident or series of actions.

**Consequence**: A warning with consequences for continued inappropriate behavior.
No interaction with the people involved in the incident is allowed for a specified period of time,
including unsolicited interaction
with those enforcing the Code of Conduct. This includes avoiding interactions
in public spaces as well as in external channels
such as social media. Violating these terms leads to a temporary or permanent ban.
|
||||
|
||||
### 3. Temporary Ban

**Community Impact**: A serious violation of community standards,
including sustained inappropriate behavior.

**Consequence**: A temporary ban from any interaction
or public communication with the community for a specified period of time.
No public or private interaction with the people
involved in the incident is allowed during this period, including unsolicited interaction
with those enforcing the Code of Conduct.
Violating these terms leads to a permanent ban.
|
||||
|
||||
### 4. Permanent Ban

**Community Impact**: Demonstrating a pattern of violation of community standards,
including sustained inappropriate behavior, harassment of individuals,
or aggression toward or disparagement of classes of individuals.

**Consequence**: A permanent ban from any public interaction within the community.
|
||||
|
||||
## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.

The Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at
https://www.contributor-covenant.org/translations.
|
||||
35
Makefile
@@ -10,6 +10,8 @@ endif
|
||||
|
||||
GO_BUILDINFO = -X '$(PKG_PREFIX)/lib/buildinfo.Version=$(APP_NAME)-$(shell date -u +'%Y%m%d-%H%M%S')-$(BUILDINFO_TAG)'
|
||||
|
||||
.PHONY: $(MAKECMDGOALS)
|
||||
|
||||
all: \
|
||||
victoria-metrics-prod \
|
||||
vmagent-prod \
|
||||
@@ -47,6 +49,17 @@ vmutils: \
|
||||
vmbackup \
|
||||
vmrestore
|
||||
|
||||
vmutils-arm64: \
|
||||
vmagent-arm64 \
|
||||
vmalert-arm64 \
|
||||
vmauth-arm64 \
|
||||
vmbackup-arm64 \
|
||||
vmrestore-arm64
|
||||
|
||||
release-snap:
|
||||
snapcraft
|
||||
snapcraft upload "victoriametrics_$(PKG_TAG)_multi.snap" --release beta,edge,candidate
|
||||
|
||||
release: \
|
||||
release-victoria-metrics \
|
||||
release-vmutils
|
||||
@@ -64,6 +77,16 @@ release-vmutils: \
|
||||
cd bin && tar czf vmutils-$(PKG_TAG).tar.gz vmagent-prod vmalert-prod vmauth-prod vmbackup-prod vmrestore-prod && \
|
||||
sha256sum vmutils-$(PKG_TAG).tar.gz > vmutils-$(PKG_TAG)_checksums.txt
|
||||
|
||||
release-vmutils-arm64: \
|
||||
vmagent-arm64-prod \
|
||||
vmalert-arm64-prod \
|
||||
vmauth-arm64-prod \
|
||||
vmbackup-arm64-prod \
|
||||
vmrestore-arm64-prod
|
||||
cd bin && tar czf vmutils-arm64-$(PKG_TAG).tar.gz vmagent-arm64-prod vmalert-arm64-prod vmauth-arm64-prod vmbackup-arm64-prod vmrestore-arm64-prod && \
|
||||
sha256sum vmutils-arm64-$(PKG_TAG).tar.gz > vmutils-arm64-$(PKG_TAG)_checksums.txt
|
||||
|
||||
|
||||
pprof-cpu:
|
||||
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics@ $(PPROF_FILE)
|
||||
|
||||
@@ -80,7 +103,7 @@ lint: install-golint
|
||||
golint app/...
|
||||
|
||||
install-golint:
|
||||
which golint || GO111MODULE=off go get -u golang.org/x/lint/golint
|
||||
which golint || go install golang.org/x/lint/golint
|
||||
|
||||
errcheck: install-errcheck
|
||||
errcheck -exclude=errcheck_excludes.txt ./lib/...
|
||||
@@ -94,7 +117,7 @@ errcheck: install-errcheck
|
||||
errcheck -exclude=errcheck_excludes.txt ./app/vmrestore/...
|
||||
|
||||
install-errcheck:
|
||||
which errcheck || GO111MODULE=off go get -u github.com/kisielk/errcheck
|
||||
which errcheck || go install github.com/kisielk/errcheck
|
||||
|
||||
check-all: fmt vet lint errcheck golangci-lint
|
||||
|
||||
@@ -122,8 +145,8 @@ benchmark-pure:
|
||||
GO111MODULE=on CGO_ENABLED=0 go test -mod=vendor -bench=. ./app/...
|
||||
|
||||
vendor-update:
|
||||
GO111MODULE=on go get -u ./lib/...
|
||||
GO111MODULE=on go get -u ./app/...
|
||||
GO111MODULE=on go get -u -d ./lib/...
|
||||
GO111MODULE=on go get -u -d ./app/...
|
||||
GO111MODULE=on go mod tidy
|
||||
GO111MODULE=on go mod vendor
|
||||
|
||||
@@ -140,14 +163,14 @@ quicktemplate-gen: install-qtc
|
||||
qtc
|
||||
|
||||
install-qtc:
|
||||
which qtc || GO111MODULE=off go get -u github.com/valyala/quicktemplate/qtc
|
||||
which qtc || go install github.com/valyala/quicktemplate/qtc
|
||||
|
||||
|
||||
golangci-lint: install-golangci-lint
|
||||
golangci-lint run --exclude '(SA4003|SA1019|SA5011):' -D errcheck -D structcheck --timeout 2m
|
||||
|
||||
install-golangci-lint:
|
||||
which golangci-lint || GO111MODULE=off go get -u github.com/golangci/golangci-lint/cmd/golangci-lint
|
||||
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.29.0
|
||||
|
||||
docs-sync:
|
||||
cp app/vmagent/README.md docs/vmagent.md
|
||||
|
||||
@@ -108,3 +108,10 @@ victoria-metrics-package-deb-rpm-all: \
|
||||
victoria-metrics-package-deb-arm64 \
|
||||
victoria-metrics-package-rpm \
|
||||
victoria-metrics-package-rpm-arm64
|
||||
|
||||
### Packaging as snap
|
||||
victoria-metrics-package-snap:
|
||||
which snapcraft || snap install snapcraft
|
||||
which multipass || snap install multipass
|
||||
snapcraft
|
||||
|
||||
|
||||
@@ -3,20 +3,24 @@ package main
|
||||
import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"os"
|
||||
"path"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
|
||||
)
|
||||
|
||||
@@ -25,19 +29,33 @@ var (
|
||||
minScrapeInterval = flag.Duration("dedup.minScrapeInterval", 0, "Remove superfluous samples from time series if they are located closer to each other than this duration. "+
|
||||
"This may be useful for reducing overhead when multiple identically configured Prometheus instances write data to the same VictoriaMetrics. "+
|
||||
"Deduplication is disabled if the -dedup.minScrapeInterval is 0")
|
||||
dryRun = flag.Bool("dryRun", false, "Whether to check only -promscrape.config and then exit. "+
|
||||
"Unknown config entries are allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse")
|
||||
)
|
||||
|
||||
func main() {
|
||||
// Write flags and help message to stdout, since it is easier to grep or pipe.
|
||||
flag.CommandLine.SetOutput(os.Stdout)
|
||||
flag.Usage = usage
|
||||
envflag.Parse()
|
||||
buildinfo.Init()
|
||||
logger.Init()
|
||||
cgroup.UpdateGOMAXPROCSToCPUQuota()
|
||||
|
||||
if promscrape.IsDryRun() {
|
||||
*dryRun = true
|
||||
}
|
||||
if *dryRun {
|
||||
if err := promscrape.CheckConfig(); err != nil {
|
||||
logger.Fatalf("error when checking -promscrape.config: %s", err)
|
||||
}
|
||||
logger.Infof("-promscrape.config is ok; exiting with 0 status code")
|
||||
return
|
||||
}
|
||||
|
||||
logger.Infof("starting VictoriaMetrics at %q...", *httpListenAddr)
|
||||
startTime := time.Now()
|
||||
storage.SetMinScrapeIntervalForDeduplication(*minScrapeInterval)
|
||||
vmstorage.Init()
|
||||
vmstorage.Init(promql.ResetRollupResultCacheIfNeeded)
|
||||
vmselect.Init()
|
||||
vminsert.Init()
|
||||
startSelfScraper()
|
||||
@@ -67,8 +85,16 @@ func main() {
|
||||
}
|
||||
|
||||
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
if r.RequestURI == "/" {
|
||||
fmt.Fprintf(w, "Single-node VictoriaMetrics. See docs at https://victoriametrics.github.io/")
|
||||
if r.URL.Path == "/" {
|
||||
fmt.Fprintf(w, "<h2>Single-node VictoriaMetrics.</h2></br>")
|
||||
fmt.Fprintf(w, "See docs at <a href='https://victoriametrics.github.io/'>https://victoriametrics.github.io/</a></br>")
|
||||
fmt.Fprintf(w, "Useful endpoints: </br>")
|
||||
writeAPIHelp(w, [][]string{
|
||||
{"/targets", "discovered targets list"},
|
||||
{"/api/v1/targets", "advanced information about discovered targets in JSON format"},
|
||||
{"/metrics", "available service metrics"},
|
||||
{"/api/v1/status/tsdb", "tsdb status page"},
|
||||
})
|
||||
return true
|
||||
}
|
||||
if vminsert.RequestHandler(w, r) {
|
||||
@@ -82,3 +108,21 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func writeAPIHelp(w io.Writer, pathList [][]string) {
|
||||
pathPrefix := httpserver.GetPathPrefix()
|
||||
for _, p := range pathList {
|
||||
p, doc := p[0], p[1]
|
||||
p = path.Join(pathPrefix, p)
|
||||
fmt.Fprintf(w, "<a href='%s'>%q</a> - %s<br/>", p, p, doc)
|
||||
}
|
||||
}
|
||||
|
||||
func usage() {
|
||||
const s = `
|
||||
victoria-metrics is a time series database and monitoring solution.
|
||||
|
||||
See the docs at https://victoriametrics.github.io/
|
||||
`
|
||||
flagutil.Usage(s)
|
||||
}
|
||||
|
||||
@@ -20,6 +20,7 @@ import (
|
||||
testutil "github.com/VictoriaMetrics/VictoriaMetrics/app/victoria-metrics/test"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
|
||||
@@ -57,6 +58,7 @@ var (
|
||||
type test struct {
|
||||
Name string `json:"name"`
|
||||
Data []string `json:"data"`
|
||||
InsertQuery string `json:"insert_query"`
|
||||
Query []string `json:"query"`
|
||||
ResultMetrics []Metric `json:"result_metrics"`
|
||||
ResultSeries Series `json:"result_series"`
|
||||
@@ -129,7 +131,7 @@ func setUp() {
|
||||
storagePath = filepath.Join(os.TempDir(), testStorageSuffix)
|
||||
processFlags()
|
||||
logger.Init()
|
||||
vmstorage.InitWithoutMetrics()
|
||||
vmstorage.InitWithoutMetrics(promql.ResetRollupResultCacheIfNeeded)
|
||||
vmselect.Init()
|
||||
vminsert.Init()
|
||||
go httpserver.Serve(*httpListenAddr, requestHandler)
|
||||
@@ -192,7 +194,7 @@ func TestWriteRead(t *testing.T) {
|
||||
time.Sleep(1 * time.Second)
|
||||
vmstorage.Stop()
|
||||
// open storage after stop in write
|
||||
vmstorage.InitWithoutMetrics()
|
||||
vmstorage.InitWithoutMetrics(promql.ResetRollupResultCacheIfNeeded)
|
||||
t.Run("read", testRead)
|
||||
}
|
||||
|
||||
@@ -208,7 +210,7 @@ func testWrite(t *testing.T) {
|
||||
t.Errorf("error compressing %v %s", r, err)
|
||||
t.Fail()
|
||||
}
|
||||
httpWrite(t, testPromWriteHTTPPath, bytes.NewBuffer(data))
|
||||
httpWrite(t, testPromWriteHTTPPath, test.InsertQuery, bytes.NewBuffer(data))
|
||||
}
|
||||
})
|
||||
|
||||
@@ -217,7 +219,7 @@ func testWrite(t *testing.T) {
|
||||
test := x
|
||||
t.Run(test.Name, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
httpWrite(t, testWriteHTTPPath, bytes.NewBufferString(strings.Join(test.Data, "\n")))
|
||||
httpWrite(t, testWriteHTTPPath, test.InsertQuery, bytes.NewBufferString(strings.Join(test.Data, "\n")))
|
||||
})
|
||||
}
|
||||
})
|
||||
@@ -245,7 +247,7 @@ func testWrite(t *testing.T) {
|
||||
t.Run(test.Name, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
logger.Infof("writing %s", test.Data)
|
||||
httpWrite(t, testOpenTSDBWriteHTTPPath, bytes.NewBufferString(strings.Join(test.Data, "\n")))
|
||||
httpWrite(t, testOpenTSDBWriteHTTPPath, test.InsertQuery, bytes.NewBufferString(strings.Join(test.Data, "\n")))
|
||||
})
|
||||
}
|
||||
})
|
||||
@@ -323,10 +325,10 @@ func readIn(readFor string, t *testing.T, insertTime time.Time) []test {
|
||||
return tt
|
||||
}
|
||||
|
||||
func httpWrite(t *testing.T, address string, r io.Reader) {
|
||||
func httpWrite(t *testing.T, address, query string, r io.Reader) {
|
||||
t.Helper()
|
||||
s := newSuite(t)
|
||||
resp, err := http.Post(address, "", r)
|
||||
resp, err := http.Post(address+query, "", r)
|
||||
s.noError(err)
|
||||
s.noError(resp.Body.Close())
|
||||
s.equalInt(resp.StatusCode, 204)
|
||||
|
||||
10
app/victoria-metrics/testdata/influxdb/with_extra_labels.json
vendored
Normal file
@@ -0,0 +1,10 @@
|
||||
{
|
||||
"name": "insert_with_extra_labels",
|
||||
"data": ["measurement,tag1=value1,tag2=value2 field6=1.23,field5=123 {TIME_NS}"],
|
||||
"insert_query": "?extra_label=job=test&extra_label=tag2=value10",
|
||||
"query": ["/api/v1/export?match={__name__!=''}"],
|
||||
"result_metrics": [
|
||||
{"metric":{"__name__":"measurement_field5","tag1":"value1","job": "test","tag2":"value10"},"values":[123], "timestamps": ["{TIME_MS}"]},
|
||||
{"metric":{"__name__":"measurement_field6","tag1":"value1","job": "test","tag2":"value10"},"values":[1.23], "timestamps": ["{TIME_MS}"]}
|
||||
]
|
||||
}
|
||||
9
app/victoria-metrics/testdata/opentsdbhttp/with_extra_labels.json
vendored
Normal file
@@ -0,0 +1,9 @@
|
||||
{
|
||||
"name": "insert_with_extra_labels",
|
||||
"data": ["{\"metric\": \"opentsdbhttp.foobar\", \"value\": 1001, \"timestamp\": {TIME_S}, \"tags\": {\"bar\":\"baz\", \"x\": \"y\"}}"],
|
||||
"insert_query": "?extra_label=job=open-test&extra_label=x=z",
|
||||
"query": ["/api/v1/export?match={__name__!=''}"],
|
||||
"result_metrics": [
|
||||
{"metric":{"__name__":"opentsdbhttp.foobar","bar":"baz","x":"z","job": "open-test"},"values":[1001], "timestamps": ["{TIME_MSZ}"]}
|
||||
]
|
||||
}
|
||||
9
app/victoria-metrics/testdata/prometheus/with_extra_labels.json
vendored
Normal file
@@ -0,0 +1,9 @@
|
||||
{
|
||||
"name": "basic_insertion_with_extra_labels",
|
||||
"insert_query": "?extra_label=job=prom-test&extra_label=baz=bar",
|
||||
"data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"prometheus.foobar\"},{\"name\":\"baz\",\"value\":\"qux\"}],\"samples\":[{\"value\":100000,\"timestamp\":\"{TIME_MS}\"}]}]"],
|
||||
"query": ["/api/v1/export?match={__name__!=''}"],
|
||||
"result_metrics": [
|
||||
{"metric":{"__name__":"prometheus.foobar","baz":"bar","job": "prom-test"},"values":[100000], "timestamps": ["{TIME_MS}"]}
|
||||
]
|
||||
}
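The `insert_query` values in the three `with_extra_labels.json` files above all use the form `extra_label=name=value`, and the expected results show that an extra label replaces an existing label with the same name (for example `tag2=value2` becomes `tag2=value10`). The following Go sketch illustrates that behavior under stated assumptions: the `label` type and the `parseExtraLabels` and `applyExtraLabels` helpers are hypothetical names made up for this example, and they are not the implementation of `parserCommon.GetExtraLabels`, which is not shown in this diff.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// label is a hypothetical name/value pair used only in this sketch.
type label struct {
	Name  string
	Value string
}

// parseExtraLabels extracts every `extra_label=name=value` query arg from
// rawQuery. A leading "?" is accepted, as in the test data.
func parseExtraLabels(rawQuery string) ([]label, error) {
	q, err := url.ParseQuery(strings.TrimPrefix(rawQuery, "?"))
	if err != nil {
		return nil, fmt.Errorf("cannot parse query: %w", err)
	}
	var labels []label
	for _, v := range q["extra_label"] {
		n := strings.IndexByte(v, '=')
		if n < 0 {
			return nil, fmt.Errorf("missing `=` in extra_label %q", v)
		}
		labels = append(labels, label{Name: v[:n], Value: v[n+1:]})
	}
	return labels, nil
}

// applyExtraLabels returns a copy of labels where every extra label either
// overrides the value of a label with the same name or is appended.
// Assumption: this mirrors the result seen in the test data, not the
// internal mechanism used by VictoriaMetrics.
func applyExtraLabels(labels, extra []label) []label {
	out := append([]label{}, labels...)
	for _, e := range extra {
		replaced := false
		for i := range out {
			if out[i].Name == e.Name {
				out[i].Value = e.Value
				replaced = true
				break
			}
		}
		if !replaced {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	extra, err := parseExtraLabels("?extra_label=job=test&extra_label=tag2=value10")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	orig := []label{{"tag1", "value1"}, {"tag2", "value2"}}
	fmt.Println(applyExtraLabels(orig, extra))
}
```

Splitting on the first `=` only keeps any further `=` characters inside the label value, which seems the safer choice for a sketch like this.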
|
||||
@@ -21,14 +21,14 @@ to `vmagent` (like the ability to push metrics instead of pulling them). We did
|
||||
See [Quick Start](#quick-start) for details.
|
||||
* Can add, remove and modify labels (aka tags) via Prometheus relabeling. Can filter data before sending it to remote storage. See [these docs](#relabeling) for details.
|
||||
* Accepts data via all the ingestion protocols supported by VictoriaMetrics:
|
||||
* Influx line protocol via `http://<vmagent>:8429/write`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf).
|
||||
* Graphite plaintext protocol if `-graphiteListenAddr` command-line flag is set. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-graphite-compatible-agents-such-as-statsd).
|
||||
* OpenTSDB telnet and http protocols if `-opentsdbListenAddr` command-line flag is set. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-opentsdb-compatible-agents).
|
||||
* Influx line protocol via `http://<vmagent>:8429/write`. See [these docs](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf).
|
||||
* Graphite plaintext protocol if `-graphiteListenAddr` command-line flag is set. See [these docs](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-send-data-from-graphite-compatible-agents-such-as-statsd).
|
||||
* OpenTSDB telnet and http protocols if `-opentsdbListenAddr` command-line flag is set. See [these docs](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-send-data-from-opentsdb-compatible-agents).
|
||||
* Prometheus remote write protocol via `http://<vmagent>:8429/api/v1/write`.
|
||||
* JSON lines import protocol via `http://<vmagent>:8429/api/v1/import`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-data-in-json-line-format).
|
||||
* Native data import protocol via `http://<vmagent>:8429/api/v1/import/native`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-data-in-native-format).
|
||||
* Data in Prometheus exposition format. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-data-in-prometheus-exposition-format) for details.
|
||||
* Arbitrary CSV data via `http://<vmagent>:8429/api/v1/import/csv`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-csv-data).
|
||||
* JSON lines import protocol via `http://<vmagent>:8429/api/v1/import`. See [these docs](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-import-data-in-json-line-format).
|
||||
* Native data import protocol via `http://<vmagent>:8429/api/v1/import/native`. See [these docs](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-import-data-in-native-format).
|
||||
* Data in Prometheus exposition format. See [these docs](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-import-data-in-prometheus-exposition-format) for details.
|
||||
* Arbitrary CSV data via `http://<vmagent>:8429/api/v1/import/csv`. See [these docs](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-import-csv-data).
|
||||
* Can replicate collected metrics simultaneously to multiple remote storage systems.
|
||||
* Works in environments with unstable connections to remote storage. If the remote storage is unavailable, the collected metrics
|
||||
are buffered at `-remoteWrite.tmpDataPath`. The buffered metrics are sent to remote storage as soon as connection
|
||||
@@ -56,13 +56,29 @@ If you only need to collect Influx data, then the following is sufficient:
|
||||
/path/to/vmagent -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
|
||||
```
|
||||
|
||||
Then send Influx data to `http://vmagent-host:8429`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) for more details.
|
||||
Then send Influx data to `http://vmagent-host:8429`. See [these docs](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) for more details.
|
||||
|
||||
`vmagent` is also available in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags).
|
||||
|
||||
Pass `-help` to `vmagent` in order to see the full list of supported command-line flags with their descriptions.
|
||||
|
||||
|
||||
### Configuration update

`vmagent` should be restarted in order to update config options set via command-line args.

`vmagent` supports multiple approaches for reloading configs from updated config files such as `-promscrape.config`, `-remoteWrite.relabelConfig` and `-remoteWrite.urlRelabelConfig`:

* Sending `SIGHUP` signal to `vmagent` process:

  ```bash
  kill -SIGHUP `pidof vmagent`
  ```

* Sending HTTP request to `http://vmagent:8429/-/reload` endpoint.

There is also `-promscrape.configCheckInterval` command-line option, which can be used for automatic reloading configs from updated `-promscrape.config` file.
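
The HTTP-based reload can also be triggered from code. Below is a minimal Go sketch, assuming that a plain `GET` request is accepted by `/-/reload` and that a `200` response means the reload was requested. The helper names `reloadConfig` and `demoReload` are hypothetical, and the demo talks to a local stand-in server rather than to a real `vmagent`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// reloadConfig asks the server at baseURL to reload its configs via /-/reload.
func reloadConfig(client *http.Client, baseURL string) error {
	resp, err := client.Get(baseURL + "/-/reload")
	if err != nil {
		return fmt.Errorf("cannot send reload request: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status code %d from /-/reload", resp.StatusCode)
	}
	return nil
}

// demoReload runs reloadConfig against a local stand-in server that answers
// /-/reload with the given status code. It is not vmagent itself.
func demoReload(status int) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/-/reload" {
			http.NotFound(w, r)
			return
		}
		w.WriteHeader(status)
	}))
	defer srv.Close()
	return reloadConfig(srv.Client(), srv.URL)
}

func main() {
	if err := demoReload(http.StatusOK); err != nil {
		fmt.Println("reload failed:", err)
		return
	}
	fmt.Println("reload requested")
}
```

Against a real deployment, `reloadConfig(http.DefaultClient, "http://vmagent:8429")` would be the corresponding call, with the host name adjusted to the actual setup.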
|
||||
|
||||
|
||||
### Use cases
|
||||
|
||||
|
||||
@@ -153,6 +169,8 @@ The following scrape types in [scrape_config](https://prometheus.io/docs/prometh
|
||||
[OpenStack identity API v3](https://docs.openstack.org/api-ref/identity/v3/) is supported only.
|
||||
* `dockerswarm_sd_configs` - for scraping Docker Swarm targets.
|
||||
See [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config) for details.
|
||||
* `eureka_sd_configs` - for scraping targets registered in [Netflix Eureka](https://github.com/Netflix/eureka).
|
||||
See [eureka_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config) for details.
|
||||
|
||||
File feature requests at [our issue tracker](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need other service discovery mechanisms to be supported by `vmagent`.
|
||||
|
||||
@@ -197,6 +215,7 @@ The relabeling can be defined in the following places:
|
||||
|
||||
Read more about relabeling in the following articles:
|
||||
|
||||
* [How to use Relabeling in Prometheus and VictoriaMetrics](https://valyala.medium.com/how-to-use-relabeling-in-prometheus-and-victoriametrics-8b90fc22c4b2)
|
||||
* [Life of a label](https://www.robustperception.io/life-of-a-label)
|
||||
* [Discarding targets and timeseries with relabeling](https://www.robustperception.io/relabelling-can-discard-targets-timeseries-and-alerts)
|
||||
* [Dropping labels at scrape time](https://www.robustperception.io/dropping-metrics-at-scrape-time-with-prometheus)
|
||||
@@ -211,9 +230,16 @@ either via `vmagent` itself or via Prometheus, so the exported metrics could be
|
||||
Use official [Grafana dashboard](https://grafana.com/grafana/dashboards/12683) for `vmagent` state overview.
|
||||
If you have suggestions, improvements or found a bug - feel free to open an issue on github or add review to the dashboard.
|
||||
|
||||
`vmagent` also exports target statuses at `http://vmagent-host:8429/targets` page in plaintext format.
|
||||
`/targets` handler accepts optional `show_original_labels=1` query arg, which shows the original labels per each target
|
||||
before applying relabeling. This information may be useful for debugging target relabeling.
|
||||
`vmagent` also exports target statuses at the following handlers:

* `http://vmagent-host:8429/targets`. This handler returns human-readable plaintext status for every active target.
  This page is convenient to query from command line with `wget`, `curl` or similar tools.
  It accepts optional `show_original_labels=1` query arg, which shows the original labels for each target before relabeling is applied.
  This information may be useful for debugging target relabeling.
* `http://vmagent-host:8429/api/v1/targets`. This handler returns data compatible with [the corresponding page from Prometheus API](https://prometheus.io/docs/prometheus/latest/querying/api/#targets).

* `http://vmagent-host:8429/ready`. This handler returns http 200 status code when `vmagent` finishes initialization for all service_discovery configs.
  It may be useful for performing `vmagent` rolling update without scrape loss.
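
A rolling-update script could poll `/ready` before switching traffic to a new instance. The Go sketch below shows one way to do that, assuming a simple fixed-delay retry loop. The names `waitReady` and `demoReady` are hypothetical, and the demo uses a local stand-in server that answers with `425 Too Early` for the first few requests, similar to what the `/ready` handler in this change does while scrape configs are still pending.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// waitReady polls url until it returns http 200 or the attempts run out.
func waitReady(client *http.Client, url string, attempts int, delay time.Duration) error {
	for i := 0; i < attempts; i++ {
		resp, err := client.Get(url)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
		}
		time.Sleep(delay)
	}
	return fmt.Errorf("%s is not ready after %d attempts", url, attempts)
}

// demoReady starts a stand-in server that reports "not ready" for the first
// `pending` requests, then waits for it with waitReady.
// It returns the number of requests the server has received.
func demoReady(pending int32, attempts int) (int32, error) {
	var hits int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&hits, 1) <= pending {
			http.Error(w, "waiting for scrapes to init", http.StatusTooEarly)
			return
		}
		w.Write([]byte("OK"))
	}))
	defer srv.Close()
	err := waitReady(srv.Client(), srv.URL+"/ready", attempts, time.Millisecond)
	return atomic.LoadInt32(&hits), err
}

func main() {
	n, err := demoReady(2, 10)
	if err != nil {
		fmt.Println("not ready:", err)
		return
	}
	fmt.Printf("ready after %d requests\n", n)
}
```

A fixed delay keeps the sketch short. A production script would probably want a longer delay, an overall deadline, or both.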
|
||||
|
||||
|
||||
### Troubleshooting
|
||||
@@ -224,7 +250,35 @@ before applying relabeling. This information may be useful for debugging target
|
||||
since `vmagent` establishes at least a single TCP connection per each target.
|
||||
|
||||
* When `vmagent` scrapes many unreliable targets, it can flood error log with scrape errors. These errors can be suppressed
|
||||
by passing `-promscrape.suppressScrapeErrors` command-line flag to `vmagent`. The most recent scrape error per each target can be observed at `http://vmagent-host:8429/targets`.
|
||||
by passing `-promscrape.suppressScrapeErrors` command-line flag to `vmagent`. The most recent scrape error per each target can be observed at `http://vmagent-host:8429/targets`
|
||||
and `http://vmagent-host:8429/api/v1/targets`.
|
||||
|
||||
* The `/api/v1/targets` page could be useful for debugging relabeling process for scrape targets.
|
||||
This page contains original labels for targets dropped during relabeling (see "droppedTargets" section in the page output). By default up to `-promscrape.maxDroppedTargets` targets are shown here. If your setup drops more targets during relabeling, then increase `-promscrape.maxDroppedTargets` command-line flag value in order to see all the dropped targets. Note that tracking each dropped target requires up to 10KB of RAM, so big values for `-promscrape.maxDroppedTargets` may result in increased memory usage if a big number of scrape targets are dropped during relabeling.
|
||||
|
||||
* If `vmagent` scrapes a big number of targets, then `-promscrape.dropOriginalLabels` command-line option may be passed to `vmagent` in order to reduce memory usage.
|
||||
This option drops `"discoveredLabels"` and `"droppedTargets"` lists at `/api/v1/targets` page, which may result in reduced debuggability for improperly configured per-target relabeling.
|
||||
|
||||
* If `vmagent` scrapes targets with millions of metrics per each target (for instance, when scraping [federation endpoints](https://prometheus.io/docs/prometheus/latest/federation/)),
|
||||
then it is recommended to enable `stream parsing mode` in order to reduce memory usage during scraping. This mode may be enabled either globally for all the scrape targets
|
||||
by passing `-promscrape.streamParse` command-line flag or on a per-scrape target basis with `stream_parse: true` option. For example:
|
||||
|
||||
```yml
|
||||
scrape_configs:
|
||||
- job_name: 'big-federate'
|
||||
stream_parse: true
|
||||
static_configs:
|
||||
- targets:
|
||||
- big-prometeus1
|
||||
- big-prometeus2
|
||||
honor_labels: true
|
||||
metrics_path: /federate
|
||||
params:
|
||||
'match[]': ['{__name__!=""}']
|
||||
```
|
||||
|
||||
Note that `sample_limit` option doesn't work if stream parsing is enabled, since the parsed data is pushed to remote storage as soon as it is parsed. So `sample_limit` option
|
||||
makes no sense during stream parsing.
|
||||
|
||||
* It is recommended to increase `-remoteWrite.queues` if `vmagent_remotewrite_pending_data_bytes` metric exported at `http://vmagent-host:8429/metrics` page constantly grows.
|
||||
|
||||
|
||||
@@ -1,10 +1,11 @@
|
||||
package common
|
||||
|
||||
import (
|
||||
"runtime"
|
||||
"sync"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
|
||||
)
|
||||
|
||||
// PushCtx is a context used for populating WriteRequest.
|
||||
@@ -28,12 +29,7 @@ func (ctx *PushCtx) Reset() {
|
||||
}
|
||||
ctx.WriteRequest.Timeseries = ctx.WriteRequest.Timeseries[:0]
|
||||
|
||||
labels := ctx.Labels
|
||||
for i := range labels {
|
||||
label := &labels[i]
|
||||
label.Name = ""
|
||||
label.Value = ""
|
||||
}
|
||||
promrelabel.CleanLabels(ctx.Labels)
|
||||
ctx.Labels = ctx.Labels[:0]
|
||||
|
||||
ctx.Samples = ctx.Samples[:0]
|
||||
@@ -67,4 +63,4 @@ func PutPushCtx(ctx *PushCtx) {
|
||||
}
|
||||
|
||||
var pushCtxPool sync.Pool
|
||||
var pushCtxPoolCh = make(chan *PushCtx, runtime.GOMAXPROCS(-1))
|
||||
var pushCtxPoolCh = make(chan *PushCtx, cgroup.AvailableCPUs())
|
||||
|
||||
@@ -4,13 +4,15 @@ import (
|
||||
"flag"
|
||||
"io"
|
||||
"net/http"
|
||||
"runtime"
|
||||
"sync"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
|
||||
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
|
||||
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
@@ -32,7 +34,9 @@ var (
|
||||
// See https://github.com/influxdata/telegraf/tree/master/plugins/inputs/socket_listener/
|
||||
func InsertHandlerForReader(r io.Reader) error {
|
||||
return writeconcurrencylimiter.Do(func() error {
|
||||
return parser.ParseStream(r, false, "", "", insertRows)
|
||||
return parser.ParseStream(r, false, "", "", func(db string, rows []parser.Row) error {
|
||||
return insertRows(db, rows, nil)
|
||||
})
|
||||
})
|
||||
}
|
||||
|
||||
@@ -40,17 +44,23 @@ func InsertHandlerForReader(r io.Reader) error {
|
||||
//
|
||||
// See https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/tsdb/README.md
|
||||
func InsertHandlerForHTTP(req *http.Request) error {
|
||||
extraLabels, err := parserCommon.GetExtraLabels(req)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return writeconcurrencylimiter.Do(func() error {
|
||||
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
|
||||
q := req.URL.Query()
|
||||
precision := q.Get("precision")
|
||||
// Read db tag from https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint
|
||||
db := q.Get("db")
|
||||
return parser.ParseStream(req.Body, isGzipped, precision, db, insertRows)
|
||||
return parser.ParseStream(req.Body, isGzipped, precision, db, func(db string, rows []parser.Row) error {
|
||||
return insertRows(db, rows, extraLabels)
|
||||
})
|
||||
})
|
||||
}
|
||||
|
||||
func insertRows(db string, rows []parser.Row) error {
|
||||
func insertRows(db string, rows []parser.Row, extraLabels []prompbmarshal.Label) error {
|
||||
ctx := getPushCtx()
|
||||
defer putPushCtx(ctx)
|
||||
|
||||
@@ -81,6 +91,7 @@ func insertRows(db string, rows []parser.Row) error {
|
||||
Value: db,
|
||||
})
|
||||
}
|
||||
commonLabels = append(commonLabels, extraLabels...)
|
||||
ctx.metricGroupBuf = ctx.metricGroupBuf[:0]
|
||||
if !*skipMeasurement {
|
||||
ctx.metricGroupBuf = append(ctx.metricGroupBuf, r.Measurement...)
|
||||
@@ -135,12 +146,8 @@ type pushCtx struct {
|
||||
func (ctx *pushCtx) reset() {
|
||||
ctx.ctx.Reset()
|
||||
|
||||
commonLabels := ctx.commonLabels
|
||||
for i := range commonLabels {
|
||||
label := &commonLabels[i]
|
||||
label.Name = ""
|
||||
label.Value = ""
|
||||
}
|
||||
promrelabel.CleanLabels(ctx.commonLabels)
|
||||
ctx.commonLabels = ctx.commonLabels[:0]
|
||||
|
||||
ctx.metricGroupBuf = ctx.metricGroupBuf[:0]
|
||||
ctx.buf = ctx.buf[:0]
|
||||
@@ -168,4 +175,4 @@ func putPushCtx(ctx *pushCtx) {
|
||||
}
|
||||
|
||||
var pushCtxPool sync.Pool
|
||||
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))
|
||||
var pushCtxPoolCh = make(chan *pushCtx, cgroup.AvailableCPUs())
|
||||
|
||||
@@ -5,8 +5,8 @@ import (
|
||||
"fmt"
|
||||
"net/http"
|
||||
"os"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync/atomic"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/csvimport"
|
||||
@@ -20,8 +20,8 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/vmimport"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
|
||||
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
|
||||
influxserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/influx"
|
||||
@@ -47,7 +47,8 @@ var (
|
||||
"Usually :4242 must be set. Doesn't work if empty")
|
||||
opentsdbHTTPListenAddr = flag.String("opentsdbHTTPListenAddr", "", "TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty")
|
||||
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmagent. The following files are checked: "+
|
||||
"-promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig . See also -promscrape.config.dryRun")
|
||||
"-promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig . "+
|
||||
"Unknown config entries are allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse")
|
||||
)
|
||||
|
||||
var (
|
||||
@@ -65,17 +66,20 @@ func main() {
|
||||
remotewrite.InitSecretFlags()
|
||||
buildinfo.Init()
|
||||
logger.Init()
|
||||
cgroup.UpdateGOMAXPROCSToCPUQuota()
|
||||
|
||||
if *dryRun {
|
||||
if err := flag.Set("promscrape.config.strictParse", "true"); err != nil {
|
||||
logger.Panicf("BUG: cannot set promscrape.config.strictParse=true: %s", err)
|
||||
if promscrape.IsDryRun() {
|
||||
if err := promscrape.CheckConfig(); err != nil {
|
||||
logger.Fatalf("error when checking -promscrape.config: %s", err)
|
||||
}
|
||||
logger.Infof("-promscrape.config is ok; exiting with 0 status code")
|
||||
return
|
||||
}
|
||||
if *dryRun {
|
||||
if err := remotewrite.CheckRelabelConfigs(); err != nil {
|
||||
logger.Fatalf("error when checking relabel configs: %s", err)
|
||||
}
|
||||
if err := promscrape.CheckConfig(); err != nil {
|
||||
logger.Fatalf("error when checking Prometheus config: %s", err)
|
||||
logger.Fatalf("error when checking -promscrape.config: %s", err)
|
||||
}
|
||||
logger.Infof("all the configs are ok; exiting with 0 status code")
|
||||
return
|
||||
@@ -139,7 +143,7 @@ func main() {
|
||||
}
|
||||
|
||||
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
if r.RequestURI == "/" {
|
||||
if r.URL.Path == "/" {
|
||||
fmt.Fprintf(w, "vmagent - see docs at https://victoriametrics.github.io/vmagent.html")
|
||||
return true
|
||||
}
|
||||
@@ -207,15 +211,29 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
return true
|
||||
case "/targets":
|
||||
promscrapeTargetsRequests.Inc()
|
||||
w.Header().Set("Content-Type", "text/plain")
|
||||
showOriginalLabels, _ := strconv.ParseBool(r.FormValue("show_original_labels"))
|
||||
promscrape.WriteHumanReadableTargetsStatus(w, showOriginalLabels)
|
||||
promscrape.WriteHumanReadableTargetsStatus(w, r)
|
||||
return true
|
||||
case "/api/v1/targets":
|
||||
promscrapeAPIV1TargetsRequests.Inc()
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
state := r.FormValue("state")
|
||||
promscrape.WriteAPIV1Targets(w, state)
|
||||
return true
|
||||
case "/-/reload":
|
||||
promscrapeConfigReloadRequests.Inc()
|
||||
procutil.SelfSIGHUP()
|
||||
w.WriteHeader(http.StatusOK)
|
||||
return true
|
||||
case "/ready":
|
||||
if rdy := atomic.LoadInt32(&promscrape.PendingScrapeConfigs); rdy > 0 {
|
||||
errMsg := fmt.Sprintf("waiting for scrapes to init, left: %d", rdy)
|
||||
http.Error(w, errMsg, http.StatusTooEarly)
|
||||
} else {
|
||||
w.Header().Set("Content-Type", "text/plain; charset=utf-8")
|
||||
w.WriteHeader(http.StatusOK)
|
||||
w.Write([]byte("OK"))
|
||||
}
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
@@ -241,7 +259,8 @@ var (
|
||||
|
||||
influxQueryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/query", protocol="influx"}`)
|
||||
|
||||
promscrapeTargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/targets"}`)
|
||||
promscrapeTargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/targets"}`)
|
||||
promscrapeAPIV1TargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/targets"}`)
|
||||
|
||||
promscrapeConfigReloadRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/-/reload"}`)
|
||||
)
|
||||
@@ -250,10 +269,7 @@ func usage() {
|
||||
const s = `
|
||||
vmagent collects metrics data via popular data ingestion protocols and routes it to VictoriaMetrics.
|
||||
|
||||
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md .
|
||||
See the docs at https://victoriametrics.github.io/vmagent.html .
|
||||
`
|
||||
|
||||
f := flag.CommandLine.Output()
|
||||
fmt.Fprintf(f, "%s\n", s)
|
||||
flag.PrintDefaults()
|
||||
flagutil.Usage(s)
|
||||
}
|
||||
|
||||
@@ -6,6 +6,7 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
|
||||
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdbhttp"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
@@ -19,12 +20,18 @@ var (
|
||||
// InsertHandler processes HTTP OpenTSDB put requests.
|
||||
// See http://opentsdb.net/docs/build/html/api_http/put.html
|
||||
func InsertHandler(req *http.Request) error {
|
||||
extraLabels, err := parserCommon.GetExtraLabels(req)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return writeconcurrencylimiter.Do(func() error {
|
||||
return parser.ParseStream(req, insertRows)
|
||||
return parser.ParseStream(req, func(rows []parser.Row) error {
|
||||
return insertRows(rows, extraLabels)
|
||||
})
|
||||
})
|
||||
}
|
||||
|
||||
func insertRows(rows []parser.Row) error {
|
||||
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
|
||||
ctx := common.GetPushCtx()
|
||||
defer common.PutPushCtx(ctx)
|
||||
|
||||
@@ -45,6 +52,7 @@ func insertRows(rows []parser.Row) error {
|
||||
Value: tag.Value,
|
||||
})
|
||||
}
|
||||
labels = append(labels, extraLabels...)
|
||||
samples = append(samples, prompbmarshal.Sample{
|
||||
Value: r.Value,
|
||||
Timestamp: r.Timestamp,
|
||||
|
||||
@@ -31,7 +31,7 @@ func InsertHandler(req *http.Request) error {
|
||||
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
|
||||
return parser.ParseStream(req.Body, defaultTimestamp, isGzipped, func(rows []parser.Row) error {
|
||||
return insertRows(rows, extraLabels)
|
||||
})
|
||||
}, nil)
|
||||
})
|
||||
}
|
||||
|
||||
|
||||
@@ -8,6 +8,7 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
|
||||
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
@@ -20,12 +21,18 @@ var (
|
||||
|
||||
// InsertHandler processes remote write for prometheus.
|
||||
func InsertHandler(req *http.Request) error {
|
||||
extraLabels, err := parserCommon.GetExtraLabels(req)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return writeconcurrencylimiter.Do(func() error {
|
||||
return parser.ParseStream(req, insertRows)
|
||||
return parser.ParseStream(req, func(tss []prompb.TimeSeries) error {
|
||||
return insertRows(tss, extraLabels)
|
||||
})
|
||||
})
|
||||
}
|
||||
|
||||
func insertRows(timeseries []prompb.TimeSeries) error {
|
||||
func insertRows(timeseries []prompb.TimeSeries, extraLabels []prompbmarshal.Label) error {
|
||||
ctx := common.GetPushCtx()
|
||||
defer common.PutPushCtx(ctx)
|
||||
|
||||
@@ -44,6 +51,7 @@ func insertRows(timeseries []prompb.TimeSeries) error {
|
||||
Value: bytesutil.ToUnsafeString(label.Value),
|
||||
})
|
||||
}
|
||||
labels = append(labels, extraLabels...)
|
||||
samplesLen := len(samples)
|
||||
for i := range ts.Samples {
|
||||
sample := &ts.Samples[i]
|
||||
|
||||
@@ -4,7 +4,6 @@ import (
|
||||
"bytes"
|
||||
"crypto/tls"
|
||||
"encoding/base64"
|
||||
"flag"
|
||||
"fmt"
|
||||
"io/ioutil"
|
||||
"net/http"
|
||||
@@ -21,11 +20,11 @@ import (
|
||||
)
|
||||
|
||||
var (
|
||||
sendTimeout = flag.Duration("remoteWrite.sendTimeout", time.Minute, "Timeout for sending a single block of data to -remoteWrite.url")
|
||||
sendTimeout = flagutil.NewArrayDuration("remoteWrite.sendTimeout", "Timeout for sending a single block of data to -remoteWrite.url")
|
||||
proxyURL = flagutil.NewArray("remoteWrite.proxyURL", "Optional proxy URL for writing data to -remoteWrite.url. Supported proxies: http, https, socks5. "+
|
||||
"Example: -remoteWrite.proxyURL=socks5://proxy:1234")
|
||||
|
||||
tlsInsecureSkipVerify = flag.Bool("remoteWrite.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -remoteWrite.url")
|
||||
tlsInsecureSkipVerify = flagutil.NewArrayBool("remoteWrite.tlsInsecureSkipVerify", "Whether to skip tls verification when connecting to -remoteWrite.url")
|
||||
tlsCertFile = flagutil.NewArray("remoteWrite.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url. "+
|
||||
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
|
||||
tlsKeyFile = flagutil.NewArray("remoteWrite.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url. "+
|
||||
@@ -50,9 +49,12 @@ type client struct {
|
||||
fq *persistentqueue.FastQueue
|
||||
hc *http.Client
|
||||
|
||||
bytesSent *metrics.Counter
|
||||
blocksSent *metrics.Counter
|
||||
requestDuration *metrics.Histogram
|
||||
requestsOKCount *metrics.Counter
|
||||
errorsCount *metrics.Counter
|
||||
packetsDropped *metrics.Counter
|
||||
retriesCount *metrics.Counter
|
||||
|
||||
wg sync.WaitGroup
|
||||
@@ -107,13 +109,16 @@ func newClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persistentqu
|
||||
fq: fq,
|
||||
hc: &http.Client{
|
||||
Transport: tr,
|
||||
Timeout: *sendTimeout,
|
||||
Timeout: sendTimeout.GetOptionalArgOrDefault(argIdx, time.Minute),
|
||||
},
|
||||
stopCh: make(chan struct{}),
|
||||
}
|
||||
c.bytesSent = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_bytes_sent_total{url=%q}`, c.sanitizedURL))
|
||||
c.blocksSent = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_blocks_sent_total{url=%q}`, c.sanitizedURL))
|
||||
c.requestDuration = metrics.GetOrCreateHistogram(fmt.Sprintf(`vmagent_remotewrite_duration_seconds{url=%q}`, c.sanitizedURL))
|
||||
c.requestsOKCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="2XX"}`, c.sanitizedURL))
|
||||
c.errorsCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_errors_total{url=%q}`, c.sanitizedURL))
|
||||
c.packetsDropped = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_packets_dropped_total{url=%q}`, c.sanitizedURL))
|
||||
c.retriesCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_retries_count_total{url=%q}`, c.sanitizedURL))
|
||||
for i := 0; i < concurrency; i++ {
|
||||
c.wg.Add(1)
|
||||
@@ -138,7 +143,7 @@ func getTLSConfig(argIdx int) (*tls.Config, error) {
|
||||
CertFile: tlsCertFile.GetOptionalArg(argIdx),
|
||||
KeyFile: tlsKeyFile.GetOptionalArg(argIdx),
|
||||
ServerName: tlsServerName.GetOptionalArg(argIdx),
|
||||
InsecureSkipVerify: *tlsInsecureSkipVerify,
|
||||
InsecureSkipVerify: tlsInsecureSkipVerify.GetOptionalArg(argIdx),
|
||||
}
|
||||
if c.CAFile == "" && c.CertFile == "" && c.KeyFile == "" && c.ServerName == "" && !c.InsecureSkipVerify {
|
||||
return nil, nil
|
||||
@@ -186,6 +191,8 @@ func (c *client) runWorker() {
|
||||
func (c *client) sendBlock(block []byte) {
|
||||
retryDuration := time.Second
|
||||
retriesCount := 0
|
||||
c.bytesSent.Add(len(block))
|
||||
c.blocksSent.Inc()
|
||||
|
||||
again:
|
||||
req, err := http.NewRequest("POST", c.remoteWriteURL, bytes.NewBuffer(block))
|
||||
@@ -228,10 +235,20 @@ again:
|
||||
c.requestsOKCount.Inc()
|
||||
return
|
||||
}
|
||||
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="%d"}`, c.sanitizedURL, statusCode)).Inc()
|
||||
if statusCode == 409 {
|
||||
// Just drop block on 409 status code like Prometheus does.
|
||||
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/873
|
||||
body, _ := ioutil.ReadAll(resp.Body)
|
||||
_ = resp.Body.Close()
|
||||
logger.Errorf("unexpected status code received when sending a block with size %d bytes to %q: #%d; dropping the block like Prometheus does; "+
|
||||
"response body=%q", len(block), c.sanitizedURL, statusCode, body)
|
||||
c.packetsDropped.Inc()
|
||||
return
|
||||
}
|
||||
|
||||
// Unexpected status code returned
|
||||
retriesCount++
|
||||
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="%d"}`, c.sanitizedURL, statusCode)).Inc()
|
||||
retryDuration *= 2
|
||||
if retryDuration > time.Minute {
|
||||
retryDuration = time.Minute
|
||||
|
||||
@@ -11,6 +11,7 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
"github.com/golang/snappy"
|
||||
)
|
||||
@@ -104,11 +105,7 @@ func (wr *writeRequest) reset() {
|
||||
}
|
||||
wr.tss = wr.tss[:0]
|
||||
|
||||
for i := range wr.labels {
|
||||
label := &wr.labels[i]
|
||||
label.Name = ""
|
||||
label.Value = ""
|
||||
}
|
||||
promrelabel.CleanLabels(wr.labels)
|
||||
wr.labels = wr.labels[:0]
|
||||
|
||||
wr.samples = wr.samples[:0]
|
||||
|
||||
@@ -65,6 +65,9 @@ type relabelConfigs struct {
|
||||
func initLabelsGlobal() {
|
||||
labelsGlobal = nil
|
||||
for _, s := range *unparsedLabelsGlobal {
|
||||
if len(s) == 0 {
|
||||
continue
|
||||
}
|
||||
n := strings.IndexByte(s, '=')
|
||||
if n < 0 {
|
||||
logger.Fatalf("missing '=' in `-remoteWrite.label`. It must contain label in the form `name=value`; got %q", s)
|
||||
@@ -117,12 +120,7 @@ type relabelCtx struct {
|
||||
}
|
||||
|
||||
func (rctx *relabelCtx) reset() {
|
||||
labels := rctx.labels
|
||||
for i := range labels {
|
||||
label := &labels[i]
|
||||
label.Name = ""
|
||||
label.Value = ""
|
||||
}
|
||||
promrelabel.CleanLabels(rctx.labels)
|
||||
rctx.labels = rctx.labels[:0]
|
||||
}
|
||||
|
||||
|
||||
@@ -3,10 +3,10 @@ package remotewrite
|
||||
import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"runtime"
|
||||
"sync"
|
||||
"sync/atomic"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
@@ -43,7 +43,7 @@ var allRelabelConfigs atomic.Value
|
||||
|
||||
// maxQueues limits the maximum value for `-remoteWrite.queues`. There is no sense in setting too high value,
|
||||
// since it may lead to high memory usage due to big number of buffers.
|
||||
var maxQueues = runtime.GOMAXPROCS(-1) * 4
|
||||
var maxQueues = cgroup.AvailableCPUs() * 4
|
||||
|
||||
// InitSecretFlags must be called after flag.Parse and before any logging.
|
||||
func InitSecretFlags() {
|
||||
|
||||
@@ -5,9 +5,9 @@ import (
|
||||
"net"
|
||||
"strings"
|
||||
"sync/atomic"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
|
||||
"github.com/VictoriaMetrics/fasthttp"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
)
|
||||
|
||||
@@ -15,11 +15,10 @@ func statDial(network, addr string) (conn net.Conn, err error) {
|
||||
if !strings.HasPrefix(network, "tcp") {
|
||||
return nil, fmt.Errorf("unexpected network passed to statDial: %q; it must start from `tcp`", network)
|
||||
}
|
||||
if netutil.TCP6Enabled() {
|
||||
conn, err = fasthttp.DialDualStack(addr)
|
||||
} else {
|
||||
conn, err = fasthttp.Dial(addr)
|
||||
if !netutil.TCP6Enabled() {
|
||||
network = "tcp4"
|
||||
}
|
||||
conn, err = net.DialTimeout(network, addr, 5*time.Second)
|
||||
dialsTotal.Inc()
|
||||
if err != nil {
|
||||
dialErrors.Inc()
|
||||
|
||||
@@ -6,11 +6,12 @@ rules against configured address.
|
||||
|
||||
### Features:
|
||||
* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
|
||||
* VictoriaMetrics [MetricsQL](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/MetricsQL)
|
||||
* VictoriaMetrics [MetricsQL](https://victoriametrics.github.io/MetricsQL.html)
|
||||
support and expressions validation;
|
||||
* Prometheus [alerting rules definition format](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#defining-alerting-rules)
|
||||
support;
|
||||
* Integration with [Alertmanager](https://github.com/prometheus/alertmanager);
|
||||
* Keeps the alerts [state on restarts](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmalert#alerts-state-on-restarts);
|
||||
* Lightweight without extra dependencies.
|
||||
|
||||
### Limitations:
|
||||
@@ -20,7 +21,6 @@ may fail;
|
||||
* by default, rules execution is sequential within one group, but persisting of execution results to remote
|
||||
storage is asynchronous. Hence, user shouldn't rely on recording rules chaining when result of previous
|
||||
recording rule is reused in next one;
|
||||
* there is no `query` function support in templates yet;
|
||||
* `vmalert` has no UI, just an API for getting groups and rules statuses.
|
||||
|
||||
### QuickStart
|
||||
@@ -89,7 +89,7 @@ rules:
|
||||
|
||||
There are two types of Rules:
|
||||
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
|
||||
Alerting rules allows to define alert conditions via [MetricsQL](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/MetricsQL)
|
||||
Alerting rules allows to define alert conditions via [MetricsQL](https://victoriametrics.github.io/MetricsQL.html)
|
||||
and to send notifications about firing alerts to [Alertmanager](https://github.com/prometheus/alertmanager).
|
||||
* [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
|
||||
Recording rules allow you to precompute frequently needed or computationally expensive expressions
|
||||
@@ -121,14 +121,6 @@ annotations:
|
||||
[ <labelname>: <tmpl_string> ]
|
||||
```
|
||||
|
||||
`vmalert` has no local storage and alerts state is stored in process memory. Hence, after reloading of `vmalert` process
|
||||
alerts state will be lost. To avoid this situation, `vmalert` may be configured via following flags:
|
||||
* `-remoteWrite.url` - URL to Victoria Metrics or VMInsert. `vmalert` will persist alerts state into the configured
|
||||
address in form of timeseries with name `ALERTS` via remote-write protocol.
|
||||
* `-remoteRead.url` - URL to Victoria Metrics or VMSelect. `vmalert` will try to restore alerts state from configured
|
||||
address by querying `ALERTS` timeseries.
|
||||
|
||||
|
||||
##### Recording rules
|
||||
|
||||
The syntax for recording rules is following:
|
||||
@@ -147,6 +139,22 @@ labels:
|
||||
For recording rules to work `-remoteWrite.url` must be specified.
|
||||
|
||||
|
||||
#### Alerts state on restarts
|
||||
|
||||
`vmalert` has no local storage, so alerts state is kept in process memory. Hence, when the `vmalert` process
|
||||
restarts, the alerts state is lost. To avoid this, configure `vmalert` with the following flags:
|
||||
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or VMInsert (Cluster). `vmalert` will persist alerts state
|
||||
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
|
||||
These are regular time series and may be queried from VM just as any other time series.
|
||||
The state is stored to the configured address on every rule evaluation.
|
||||
* `-remoteRead.url` - URL to VictoriaMetrics (Single) or VMSelect (Cluster). `vmalert` will try to restore alerts state
|
||||
from configured address by querying time series with name `ALERTS_FOR_STATE`.
|
||||
|
||||
Both flags are required for proper state restoration. The restore may fail if the time series are missing
|
||||
at the configured `-remoteRead.url`, weren't updated in the last `1h`, or the received state doesn't match the current `vmalert`
|
||||
rules configuration.
|
||||
|
||||
|
||||
#### WEB
|
||||
|
||||
`vmalert` runs a web-server (`-httpListenAddr`) for serving metrics and alerts endpoints:
|
||||
@@ -167,9 +175,9 @@ The shortlist of configuration flags is the following:
|
||||
-datasource.basicAuth.username string
|
||||
Optional basic auth username for -datasource.url
|
||||
-datasource.lookback duration
|
||||
Lookback defines how far to look into past when evaluating queries. For example, if datasource.lookback=5m then param "time" with value now()-5m will be added to every query.
|
||||
Lookback defines how far to look into past when evaluating queries. For example, if datasource.lookback=5m then param "time" with value now()-5m will be added to every query.
|
||||
-datasource.maxIdleConnections int
|
||||
Defines the number of idle (keep-alive connections) to configured datasource.Consider to set this value equal to the value: groups_total * group.concurrency. Too low value may result into high number of sockets in TIME_WAIT state. (default 100)
|
||||
Defines the number of idle (keep-alive connections) to configured datasource.Consider to set this value equal to the value: groups_total * group.concurrency. Too low value may result into high number of sockets in TIME_WAIT state. (default 100)
|
||||
-datasource.tlsCAFile string
|
||||
Optional path to TLS CA file to use for verifying connections to -datasource.url. By default system CA is used
|
||||
-datasource.tlsCertFile string
|
||||
@@ -182,6 +190,8 @@ The shortlist of configuration flags is the following:
|
||||
Optional TLS server name to use for connections to -datasource.url. By default the server name from -datasource.url is used
|
||||
-datasource.url string
|
||||
Victoria Metrics or VMSelect url. Required parameter. E.g. http://127.0.0.1:8428
|
||||
-dryRun -rule
|
||||
Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.
|
||||
-enableTCP6
|
||||
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP is used
|
||||
-envflag.enable
|
||||
@@ -192,7 +202,7 @@ The shortlist of configuration flags is the following:
|
||||
How often to evaluate the rules (default 1m0s)
|
||||
-external.alert.source string
|
||||
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
|
||||
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|pathEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used
|
||||
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|pathEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used
|
||||
-external.label array
|
||||
Optional label in the form 'name=value' to add to all generated recording rules and alerts. Pass multiple -label flags in order to add multiple label sets.
|
||||
Supports array of values separated by comma or specified via multiple flags.
|
||||
@@ -216,14 +226,18 @@ The shortlist of configuration flags is the following:
|
||||
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
|
||||
-httpListenAddr string
|
||||
Address to listen for http connections (default ":8880")
|
||||
-loggerDisableTimestamps
|
||||
Whether to disable writing timestamps in logs
|
||||
-loggerErrorsPerSecondLimit int
|
||||
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, then the remaining errors are suppressed. Zero value disables the rate limit (default 10)
|
||||
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, then the remaining errors are suppressed. Zero value disables the rate limit
|
||||
-loggerFormat string
|
||||
Format for logs. Possible values: default, json (default "default")
|
||||
-loggerLevel string
|
||||
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
|
||||
-loggerOutput string
|
||||
Output for the logs. Supported values: stderr, stdout (default "stderr")
|
||||
-loggerWarnsPerSecondLimit int
|
||||
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero value disables the rate limit
|
||||
-memory.allowedBytes value
|
||||
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to non-zero value. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage
|
||||
Supports the following optional suffixes for values: KB, MB, GB, KiB, MiB, GiB (default 0)
|
||||
@@ -232,10 +246,10 @@ The shortlist of configuration flags is the following:
|
||||
-metricsAuthKey string
|
||||
Auth key for /metrics. It overrides httpAuth settings
|
||||
-notifier.basicAuth.password array
|
||||
Optional basic auth password for -datasource.url
|
||||
Optional basic auth password for -notifier.url
|
||||
Supports array of values separated by comma or specified via multiple flags.
|
||||
-notifier.basicAuth.username array
|
||||
Optional basic auth username for -datasource.url
|
||||
Optional basic auth username for -notifier.url
|
||||
Supports array of values separated by comma or specified via multiple flags.
|
||||
-notifier.tlsCAFile array
|
||||
Optional path to TLS CA file to use for verifying connections to -notifier.url. By default system CA is used
|
||||
@@ -243,8 +257,9 @@ The shortlist of configuration flags is the following:
|
||||
-notifier.tlsCertFile array
|
||||
Optional path to client-side TLS certificate file to use when connecting to -notifier.url
|
||||
Supports array of values separated by comma or specified via multiple flags.
|
||||
-notifier.tlsInsecureSkipVerify
|
||||
-notifier.tlsInsecureSkipVerify array
|
||||
Whether to skip tls verification when connecting to -notifier.url
|
||||
Supports array of values separated by comma or specified via multiple flags.
|
||||
-notifier.tlsKeyFile array
|
||||
Optional path to client-side TLS certificate key to use when connecting to -notifier.url
|
||||
Supports array of values separated by comma or specified via multiple flags.
|
||||
|
||||
@@ -137,25 +137,41 @@ func (ar *AlertingRule) Exec(ctx context.Context, q datasource.Querier, series b
|
||||
}
|
||||
}
|
||||
|
||||
qFn := func(query string) ([]datasource.Metric, error) { return q.Query(ctx, query) }
|
||||
updated := make(map[uint64]struct{})
|
||||
// update list of active alerts
|
||||
for _, m := range qMetrics {
|
||||
// extra labels could contain templates, so we expand them first
|
||||
labels, err := expandLabels(m, qFn, ar)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to expand labels: %s", err)
|
||||
}
|
||||
for k, v := range labels {
|
||||
// apply extra labels to datasource
|
||||
// so the hash key will be consistent on restore
|
||||
m.SetLabel(k, v)
|
||||
}
|
||||
h := hash(m)
|
||||
if _, ok := updated[h]; ok {
|
||||
// duplicate may be caused by extra labels
|
||||
// conflicting with the metric labels
|
||||
return nil, fmt.Errorf("labels %v: %w", m.Labels, errDuplicate)
|
||||
}
|
||||
updated[h] = struct{}{}
|
||||
if a, ok := ar.alerts[h]; ok {
|
||||
if a.Value != m.Value {
|
||||
// update Value field with latest value
|
||||
a.Value = m.Value
|
||||
// and re-exec template since Value can be used
|
||||
// in templates
|
||||
err = ar.template(a)
|
||||
// in annotations
|
||||
a.Annotations, err = a.ExecTemplate(qFn, ar.Annotations)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
}
|
||||
continue
|
||||
}
|
||||
a, err := ar.newAlert(m, ar.lastExecTime)
|
||||
a, err := ar.newAlert(m, ar.lastExecTime, qFn)
|
||||
if err != nil {
|
||||
ar.lastExecError = err
|
||||
return nil, fmt.Errorf("failed to create alert: %w", err)
|
||||
@@ -189,6 +205,19 @@ func (ar *AlertingRule) Exec(ctx context.Context, q datasource.Querier, series b
|
||||
return nil, nil
|
||||
}
|
||||
|
||||
func expandLabels(m datasource.Metric, q notifier.QueryFn, ar *AlertingRule) (map[string]string, error) {
|
||||
metricLabels := make(map[string]string)
|
||||
for _, l := range m.Labels {
|
||||
metricLabels[l.Name] = l.Value
|
||||
}
|
||||
tpl := notifier.AlertTplData{
|
||||
Labels: metricLabels,
|
||||
Value: m.Value,
|
||||
Expr: ar.Expr,
|
||||
}
|
||||
return notifier.ExecTemplate(q, ar.Labels, tpl)
|
||||
}
|
||||
|
||||
func (ar *AlertingRule) toTimeSeries(timestamp time.Time) []prompbmarshal.TimeSeries {
|
||||
var tss []prompbmarshal.TimeSeries
|
||||
for _, a := range ar.alerts {
|
||||
@@ -235,7 +264,7 @@ func hash(m datasource.Metric) uint64 {
|
||||
return hash.Sum64()
|
||||
}
|
||||
|
||||
func (ar *AlertingRule) newAlert(m datasource.Metric, start time.Time) (*notifier.Alert, error) {
|
||||
func (ar *AlertingRule) newAlert(m datasource.Metric, start time.Time, qFn notifier.QueryFn) (*notifier.Alert, error) {
|
||||
a := &notifier.Alert{
|
||||
GroupID: ar.GroupID,
|
||||
Name: ar.Name,
|
||||
@@ -254,31 +283,9 @@ func (ar *AlertingRule) newAlert(m datasource.Metric, start time.Time) (*notifie
|
||||
}
|
||||
a.Labels[l.Name] = l.Value
|
||||
}
|
||||
return a, ar.template(a)
|
||||
}
|
||||
|
||||
func (ar *AlertingRule) template(a *notifier.Alert) error {
|
||||
// 1. template rule labels with data labels
|
||||
rLabels, err := a.ExecTemplate(ar.Labels)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
// 2. merge data labels and rule labels
|
||||
// metric labels may be overridden by
|
||||
// rule labels
|
||||
for k, v := range rLabels {
|
||||
a.Labels[k] = v
|
||||
}
|
||||
|
||||
// 3. template merged labels
|
||||
a.Labels, err = a.ExecTemplate(a.Labels)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
a.Annotations, err = a.ExecTemplate(ar.Annotations)
|
||||
return err
|
||||
var err error
|
||||
a.Annotations, err = a.ExecTemplate(qFn, ar.Annotations)
|
||||
return a, err
|
||||
}
|
||||
|
||||
// AlertAPI generates APIAlert object from alert by its id(hash)
|
||||
@@ -397,13 +404,15 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
|
||||
return fmt.Errorf("querier is nil")
|
||||
}
|
||||
|
||||
qFn := func(query string) ([]datasource.Metric, error) { return q.Query(ctx, query) }
|
||||
|
||||
// account for external labels in filter
|
||||
var labelsFilter string
|
||||
for k, v := range labels {
|
||||
labelsFilter += fmt.Sprintf(",%s=%q", k, v)
|
||||
}
|
||||
|
||||
// Get the last datapoint in range via MetricsQL `last_over_time`.
|
||||
// Get the last data point in range via MetricsQL `last_over_time`.
|
||||
// We don't use plain PromQL since Prometheus doesn't support
|
||||
// remote write protocol which is used for state persistence in vmalert.
|
||||
expr := fmt.Sprintf("last_over_time(%s{alertname=%q%s}[%ds])",
|
||||
@@ -417,26 +426,22 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
|
||||
labels := m.Labels
|
||||
m.Labels = make([]datasource.Label, 0)
|
||||
// drop all extra labels, so hash key will
|
||||
// be identical to timeseries received in Exec
|
||||
// be identical to time series received in Exec
|
||||
for _, l := range labels {
|
||||
if l.Name == alertNameLabel {
|
||||
continue
|
||||
}
|
||||
// drop all overridden labels
|
||||
if _, ok := ar.Labels[l.Name]; ok {
|
||||
if l.Name == alertNameLabel || l.Name == alertGroupNameLabel {
|
||||
continue
|
||||
}
|
||||
m.Labels = append(m.Labels, l)
|
||||
}
|
||||
|
||||
a, err := ar.newAlert(m, time.Unix(int64(m.Value), 0))
|
||||
a, err := ar.newAlert(m, time.Unix(int64(m.Value), 0), qFn)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to create alert: %w", err)
|
||||
}
|
||||
a.ID = hash(m)
|
||||
a.State = notifier.StatePending
|
||||
ar.alerts[a.ID] = a
|
||||
logger.Infof("alert %q(%d) restored to state at %v", a.Name, a.ID, a.Start)
|
||||
logger.Infof("alert %q (%d) restored to state at %v", a.Name, a.ID, a.Start)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
@@ -2,6 +2,9 @@ package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"reflect"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
@@ -218,19 +221,6 @@ func TestAlertingRule_Exec(t *testing.T) {
|
||||
hash(metricWithLabels(t, "name", "foo2")): {State: notifier.StateFiring},
|
||||
},
|
||||
},
|
||||
{
|
||||
newTestAlertingRule("duplicate", 0),
|
||||
[][]datasource.Metric{
|
||||
{
|
||||
// metrics with the same labelset should result in one alert
|
||||
metricWithLabels(t, "name", "foo", "type", "bar"),
|
||||
metricWithLabels(t, "type", "bar", "name", "foo"),
|
||||
},
|
||||
},
|
||||
map[uint64]*notifier.Alert{
|
||||
hash(metricWithLabels(t, "name", "foo", "type", "bar")): {State: notifier.StateFiring},
|
||||
},
|
||||
},
|
||||
{
|
||||
newTestAlertingRule("for-pending", time.Minute),
|
||||
[][]datasource.Metric{
|
||||
@@ -355,6 +345,7 @@ func TestAlertingRule_Restore(t *testing.T) {
|
||||
metricWithValueAndLabels(t, float64(time.Now().Truncate(time.Hour).Unix()),
|
||||
"__name__", alertForStateMetricName,
|
||||
alertNameLabel, "",
|
||||
alertGroupNameLabel, "groupID",
|
||||
"foo", "bar",
|
||||
"namespace", "baz",
|
||||
),
|
||||
@@ -375,7 +366,7 @@ func TestAlertingRule_Restore(t *testing.T) {
|
||||
alertNameLabel, "",
|
||||
"foo", "bar",
|
||||
"namespace", "baz",
|
||||
// following pair supposed to be dropped
|
||||
// extra labels set by rule
|
||||
"source", "vm",
|
||||
),
|
||||
},
|
||||
@@ -383,6 +374,7 @@ func TestAlertingRule_Restore(t *testing.T) {
|
||||
hash(metricWithLabels(t,
|
||||
"foo", "bar",
|
||||
"namespace", "baz",
|
||||
"source", "vm",
|
||||
)): {State: notifier.StatePending,
|
||||
Start: time.Now().Truncate(time.Hour)},
|
||||
},
|
||||
@@ -441,6 +433,138 @@ func TestAlertingRule_Restore(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestAlertingRule_Exec_Negative(t *testing.T) {
|
||||
fq := &fakeQuerier{}
|
||||
ar := newTestAlertingRule("test", 0)
|
||||
ar.Labels = map[string]string{"job": "test"}
|
||||
|
||||
// successful attempt
|
||||
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "bar"))
|
||||
_, err := ar.Exec(context.TODO(), fq, false)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
|
||||
// label `job` will collide with rule extra label and will make both time series equal
|
||||
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "baz"))
|
||||
_, err = ar.Exec(context.TODO(), fq, false)
|
||||
if !errors.Is(err, errDuplicate) {
|
||||
t.Fatalf("expected to have %s error; got %s", errDuplicate, err)
|
||||
}
|
||||
|
||||
fq.reset()
|
||||
|
||||
expErr := "connection reset by peer"
|
||||
fq.setErr(errors.New(expErr))
|
||||
_, err = ar.Exec(context.TODO(), fq, false)
|
||||
if err == nil {
|
||||
t.Fatalf("expected to get err; got nil")
|
||||
}
|
||||
if !strings.Contains(err.Error(), expErr) {
|
||||
t.Fatalf("expected to get err %q; got %q instead", expErr, err)
|
||||
}
|
||||
}
|
||||
|
||||
func TestAlertingRule_Template(t *testing.T) {
|
||||
testCases := []struct {
|
||||
rule *AlertingRule
|
||||
metrics []datasource.Metric
|
||||
expAlerts map[uint64]*notifier.Alert
|
||||
}{
|
||||
{
|
||||
newTestRuleWithLabels("common", "region", "east"),
|
||||
[]datasource.Metric{
|
||||
metricWithValueAndLabels(t, 1, "instance", "foo"),
|
||||
metricWithValueAndLabels(t, 1, "instance", "bar"),
|
||||
},
|
||||
map[uint64]*notifier.Alert{
|
||||
hash(metricWithLabels(t, "region", "east", "instance", "foo")): {
|
||||
Annotations: map[string]string{},
|
||||
Labels: map[string]string{
|
||||
alertGroupNameLabel: "",
|
||||
"region": "east",
|
||||
"instance": "foo",
|
||||
},
|
||||
},
|
||||
hash(metricWithLabels(t, "region", "east", "instance", "bar")): {
|
||||
Annotations: map[string]string{},
|
||||
Labels: map[string]string{
|
||||
alertGroupNameLabel: "",
|
||||
"region": "east",
|
||||
"instance": "bar",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
&AlertingRule{
|
||||
Name: "override label",
|
||||
Labels: map[string]string{
|
||||
"instance": "{{ $labels.instance }}",
|
||||
"region": "east",
|
||||
},
|
||||
Annotations: map[string]string{
|
||||
"summary": `Too high connection number for "{{ $labels.instance }}" for region {{ $labels.region }}`,
|
||||
"description": `It is {{ $value }} connections for "{{ $labels.instance }}"`,
|
||||
},
|
||||
alerts: make(map[uint64]*notifier.Alert),
|
||||
},
|
||||
[]datasource.Metric{
|
||||
metricWithValueAndLabels(t, 2, "instance", "foo"),
|
||||
metricWithValueAndLabels(t, 10, "instance", "bar"),
|
||||
},
|
||||
map[uint64]*notifier.Alert{
|
||||
hash(metricWithLabels(t, "region", "east", "instance", "foo")): {
|
||||
Labels: map[string]string{
|
||||
alertGroupNameLabel: "",
|
||||
"instance": "foo",
|
||||
"region": "east",
|
||||
},
|
||||
Annotations: map[string]string{
|
||||
"summary": `Too high connection number for "foo" for region east`,
|
||||
"description": `It is 2 connections for "foo"`,
|
||||
},
|
||||
},
|
||||
hash(metricWithLabels(t, "region", "east", "instance", "bar")): {
|
||||
Labels: map[string]string{
|
||||
alertGroupNameLabel: "",
|
||||
"instance": "bar",
|
||||
"region": "east",
|
||||
},
|
||||
Annotations: map[string]string{
|
||||
"summary": `Too high connection number for "bar" for region east`,
|
||||
"description": `It is 10 connections for "bar"`,
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
fakeGroup := Group{Name: "TestRule_Exec"}
|
||||
for _, tc := range testCases {
|
||||
t.Run(tc.rule.Name, func(t *testing.T) {
|
||||
fq := &fakeQuerier{}
|
||||
tc.rule.GroupID = fakeGroup.ID()
|
||||
fq.add(tc.metrics...)
|
||||
if _, err := tc.rule.Exec(context.TODO(), fq, false); err != nil {
|
||||
t.Fatalf("unexpected err: %s", err)
|
||||
}
|
||||
for hash, expAlert := range tc.expAlerts {
|
||||
gotAlert := tc.rule.alerts[hash]
|
||||
if gotAlert == nil {
|
||||
t.Fatalf("alert %d is missing; labels: %v; annotations: %v",
|
||||
hash, expAlert.Labels, expAlert.Annotations)
|
||||
}
|
||||
if !reflect.DeepEqual(expAlert.Annotations, gotAlert.Annotations) {
|
||||
t.Fatalf("expected to have annotations %#v; got %#v", expAlert.Annotations, gotAlert.Annotations)
|
||||
}
|
||||
if !reflect.DeepEqual(expAlert.Labels, gotAlert.Labels) {
|
||||
t.Fatalf("expected to have labels %#v; got %#v", expAlert.Labels, gotAlert.Labels)
|
||||
}
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func newTestRuleWithLabels(name string, labels ...string) *AlertingRule {
|
||||
r := newTestAlertingRule(name, 0)
|
||||
r.Labels = make(map[string]string)
|
||||
|
||||
@@ -11,6 +11,7 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envtemplate"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/metricsql"
|
||||
@@ -94,7 +95,7 @@ type Rule struct {
|
||||
Record string `yaml:"record,omitempty"`
|
||||
Alert string `yaml:"alert,omitempty"`
|
||||
Expr string `yaml:"expr"`
|
||||
For PromDuration `yaml:"for,omitempty"`
|
||||
For PromDuration `yaml:"for"`
|
||||
Labels map[string]string `yaml:"labels,omitempty"`
|
||||
Annotations map[string]string `yaml:"annotations,omitempty"`
|
||||
|
||||
@@ -114,6 +115,11 @@ func NewPromDuration(d time.Duration) PromDuration {
|
||||
}
|
||||
}
|
||||
|
||||
// MarshalYAML implements yaml.Marshaler interface.
|
||||
func (pd PromDuration) MarshalYAML() (interface{}, error) {
|
||||
return pd.Duration().String(), nil
|
||||
}
|
||||
|
||||
// UnmarshalYAML implements yaml.Unmarshaler interface.
|
||||
func (pd *PromDuration) UnmarshalYAML(unmarshal func(interface{}) error) error {
|
||||
var s string
|
||||
@@ -193,25 +199,32 @@ func Parse(pathPatterns []string, validateAnnotations, validateExpressions bool)
|
||||
}
|
||||
fp = append(fp, matches...)
|
||||
}
|
||||
errGroup := new(utils.ErrGroup)
|
||||
var groups []Group
|
||||
for _, file := range fp {
|
||||
uniqueGroups := map[string]struct{}{}
|
||||
gr, err := parseFile(file)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to parse file %q: %w", file, err)
|
||||
errGroup.Add(fmt.Errorf("failed to parse file %q: %w", file, err))
|
||||
continue
|
||||
}
|
||||
for _, g := range gr {
|
||||
if err := g.Validate(validateAnnotations, validateExpressions); err != nil {
|
||||
return nil, fmt.Errorf("invalid group %q in file %q: %w", g.Name, file, err)
|
||||
errGroup.Add(fmt.Errorf("invalid group %q in file %q: %w", g.Name, file, err))
|
||||
continue
|
||||
}
|
||||
if _, ok := uniqueGroups[g.Name]; ok {
|
||||
return nil, fmt.Errorf("group name %q duplicate in file %q", g.Name, file)
|
||||
errGroup.Add(fmt.Errorf("group name %q duplicate in file %q", g.Name, file))
|
||||
continue
|
||||
}
|
||||
uniqueGroups[g.Name] = struct{}{}
|
||||
g.File = file
|
||||
groups = append(groups, g)
|
||||
}
|
||||
}
|
||||
if err := errGroup.Err(); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
if len(groups) < 1 {
|
||||
logger.Warnf("no groups found in %s", strings.Join(pathPatterns, ";"))
|
||||
}
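Review note: the hunk above switches `Parse` from returning on the first failure to collecting every failure and reporting them together. The sketch below illustrates that accumulate-then-report pattern in isolation. The `errGroup` type and the `validateAll` helper are hypothetical stand-ins written for this note; they are not the real `utils.ErrGroup`, whose exact behaviour (for example how it joins messages) is not shown in this diff.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errGroup is a hypothetical, simplified stand-in for utils.ErrGroup.
// It only collects errors and joins their messages.
type errGroup struct {
	errs []error
}

// Add records one more error.
func (eg *errGroup) Add(err error) {
	eg.errs = append(eg.errs, err)
}

// Err returns nil when nothing was recorded,
// otherwise a single error that joins all messages.
func (eg *errGroup) Err() error {
	if len(eg.errs) == 0 {
		return nil
	}
	msgs := make([]string, 0, len(eg.errs))
	for _, e := range eg.errs {
		msgs = append(msgs, e.Error())
	}
	return errors.New(strings.Join(msgs, "; "))
}

// validateAll is a hypothetical helper: it checks every name
// instead of stopping at the first invalid one.
func validateAll(names []string) error {
	eg := new(errGroup)
	seen := map[string]struct{}{}
	for _, n := range names {
		if n == "" {
			eg.Add(errors.New("group name must not be empty"))
			continue
		}
		if _, ok := seen[n]; ok {
			eg.Add(fmt.Errorf("group name %q duplicate", n))
			continue
		}
		seen[n] = struct{}{}
	}
	return eg.Err()
}

func main() {
	fmt.Println(validateAll([]string{"a", "", "a"}))
}
```

With this shape a caller sees all configuration problems in one run, which is presumably the motivation for the `-dryRun` mode added elsewhere in this change.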
|
||||
|
||||
@@ -7,8 +7,9 @@ import (
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
|
||||
"gopkg.in/yaml.v2"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
|
||||
)
|
||||
|
||||
func TestMain(m *testing.M) {
|
||||
@@ -42,7 +43,7 @@ func TestParseBad(t *testing.T) {
|
||||
},
|
||||
{
|
||||
[]string{"testdata/dir/rules2-bad.rules"},
|
||||
"function \"value\" not defined",
|
||||
"function \"unknown\" not defined",
|
||||
},
|
||||
{
|
||||
[]string{"testdata/dir/rules3-bad.rules"},
|
||||
@@ -137,12 +138,14 @@ func TestGroup_Validate(t *testing.T) {
|
||||
Alert: "alert",
|
||||
Expr: "up == 1",
|
||||
Labels: map[string]string{
|
||||
"summary": "{{ value|query }}",
|
||||
"summary": `
|
||||
{{ with printf "node_memory_MemTotal{job='node',instance='%s'}" "localhost" | query }}
|
||||
{{ . | first | value | humanize1024 }}B
|
||||
{{ end }}`,
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
expErr: "error parsing annotation",
|
||||
validateAnnotations: true,
|
||||
},
|
||||
{
|
||||
@@ -323,34 +326,55 @@ func TestHashRule(t *testing.T) {
|
||||
}
|
||||
|
||||
func TestGroupChecksum(t *testing.T) {
|
||||
data := `
|
||||
f := func(t *testing.T, data, newData string) {
|
||||
t.Helper()
|
||||
var g Group
|
||||
if err := yaml.Unmarshal([]byte(data), &g); err != nil {
|
||||
t.Fatalf("failed to unmarshal: %s", err)
|
||||
}
|
||||
if g.Checksum == "" {
|
||||
t.Fatalf("expected to get non-empty checksum")
|
||||
}
|
||||
|
||||
var ng Group
|
||||
if err := yaml.Unmarshal([]byte(newData), &ng); err != nil {
|
||||
t.Fatalf("failed to unmarshal: %s", err)
|
||||
}
|
||||
if g.Checksum == ng.Checksum {
|
||||
t.Fatalf("expected to get different checksums")
|
||||
}
|
||||
}
|
||||
t.Run("Ok", func(t *testing.T) {
|
||||
f(t, `
|
||||
name: TestGroup
|
||||
rules:
|
||||
- alert: ExampleAlertAlwaysFiring
|
||||
expr: sum by(job) (up == 1)
|
||||
- record: handler:requests:rate5m
|
||||
expr: sum(rate(prometheus_http_requests_total[5m])) by (handler)
|
||||
`
|
||||
var g Group
|
||||
if err := yaml.Unmarshal([]byte(data), &g); err != nil {
|
||||
t.Fatalf("failed to unmarshal: %s", err)
|
||||
}
|
||||
if g.Checksum == "" {
|
||||
t.Fatalf("expected to get non-empty checksum")
|
||||
}
|
||||
newData := `
|
||||
`, `
|
||||
name: TestGroup
|
||||
rules:
|
||||
- record: handler:requests:rate5m
|
||||
expr: sum(rate(prometheus_http_requests_total[5m])) by (handler)
|
||||
- alert: ExampleAlertAlwaysFiring
|
||||
expr: sum by(job) (up == 1)
|
||||
`
|
||||
var ng Group
|
||||
if err := yaml.Unmarshal([]byte(newData), &g); err != nil {
|
||||
t.Fatalf("failed to unmarshal: %s", err)
|
||||
}
|
||||
if g.Checksum == ng.Checksum {
|
||||
t.Fatalf("expected to get different checksums")
|
||||
}
|
||||
`)
|
||||
})
|
||||
|
||||
t.Run("Ok, `for` must change cs", func(t *testing.T) {
|
||||
f(t, `
|
||||
name: TestGroup
|
||||
rules:
|
||||
- alert: ExampleAlertWithFor
|
||||
expr: sum by(job) (up == 1)
|
||||
for: 5m
|
||||
`, `
|
||||
name: TestGroup
|
||||
rules:
|
||||
- alert: ExampleAlertWithFor
|
||||
expr: sum by(job) (up == 1)
|
||||
`)
|
||||
})
|
||||
|
||||
}
|
||||
|
||||
@@ -6,6 +6,6 @@ groups:
|
||||
expr: vm_rows > 0
|
||||
labels:
|
||||
label: bar
|
||||
summary: "{{ value|query }}"
|
||||
summary: "{{ unknown|query }}"
|
||||
annotations:
|
||||
description: "{{$labels}}"
|
||||
|
||||
13
app/vmalert/config/testdata/rules2-good.rules
vendored
@@ -7,11 +7,22 @@ groups:
|
||||
expr: sum(vm_tcplistener_conns) by(instance) > 1
|
||||
for: 3m
|
||||
annotations:
|
||||
summary: "Too high connection number for {{$labels.instance}}"
|
||||
summary: Too high connection number for {{$labels.instance}}
|
||||
{{ with printf "sum(vm_tcplistener_conns{instance=%q})" .Labels.instance | query }}
|
||||
{{ . | first | value }}
|
||||
{{ end }}
|
||||
description: "It is {{ $value }} connections for {{$labels.instance}}"
|
||||
- alert: ExampleAlertAlwaysFiring
|
||||
expr: sum by(job)
|
||||
(up == 1)
|
||||
labels:
|
||||
job: '{{ $labels.job }}'
|
||||
dynamic: '{{ $x := query "up" | first | value }}{{ if eq 1.0 $x }}one{{ else }}unknown{{ end }}'
|
||||
annotations:
|
||||
description: Job {{ $labels.job }} is up!
|
||||
summary: All instances up {{ range query "up" }}
|
||||
{{ . | label "instance" }}
|
||||
{{ end }}
|
||||
- record: handler:requests:rate5m
|
||||
expr: sum(rate(prometheus_http_requests_total[5m])) by (handler)
|
||||
labels:
|
||||
|
||||
@@ -17,6 +17,34 @@ type Metric struct {
|
||||
Value float64
|
||||
}
|
||||
|
||||
// SetLabel adds a new label or updates the existing one
|
||||
// with the given key and value
|
||||
func (m *Metric) SetLabel(key, value string) {
|
||||
for i, l := range m.Labels {
|
||||
if l.Name == key {
|
||||
m.Labels[i].Value = value
|
||||
return
|
||||
}
|
||||
}
|
||||
m.AddLabel(key, value)
|
||||
}
|
||||
|
||||
// AddLabel appends the given label to the label set
|
||||
func (m *Metric) AddLabel(key, value string) {
|
||||
m.Labels = append(m.Labels, Label{Name: key, Value: value})
|
||||
}
|
||||
|
||||
// Label returns the given label value.
|
||||
// If label is missing empty string will be returned
|
||||
func (m *Metric) Label(key string) string {
|
||||
for _, l := range m.Labels {
|
||||
if l.Name == key {
|
||||
return l.Value
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
// Label represents metric's label
|
||||
type Label struct {
|
||||
Name string
|
||||
|
||||
@@ -37,7 +37,7 @@ func (r response) metrics() ([]Metric, error) {
|
||||
}
|
||||
m.Labels = nil
|
||||
for k, v := range r.Data.Result[i].Labels {
|
||||
m.Labels = append(m.Labels, Label{Name: k, Value: v})
|
||||
m.AddLabel(k, v)
|
||||
}
|
||||
m.Timestamp = int64(res.TV[0].(float64))
|
||||
m.Value = f
|
||||
@@ -82,7 +82,7 @@ func (s *VMStorage) Query(ctx context.Context, query string) ([]Metric, error) {
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
req.Header.Set("Content-Type", "application/json; charset=utf-8")
|
||||
if s.basicAuthPass != "" {
|
||||
req.SetBasicAuth(s.basicAuthUser, s.basicAuthPass)
|
||||
}
|
||||
|
||||
@@ -167,18 +167,28 @@ func TestGroupStart(t *testing.T) {
|
||||
m2 := metricWithLabels(t, "instance", inst2, "job", job)
|
||||
|
||||
r := g.Rules[0].(*AlertingRule)
|
||||
alert1, err := r.newAlert(m1, time.Now())
|
||||
alert1, err := r.newAlert(m1, time.Now(), nil)
|
||||
if err != nil {
|
||||
t.Fatalf("failed to create alert: %s", err)
|
||||
}
|
||||
alert1.State = notifier.StateFiring
|
||||
// add external label
|
||||
alert1.Labels["cluster"] = "east-1"
|
||||
// add rule labels - see config/testdata/rules1-good.rules
|
||||
alert1.Labels["label"] = "bar"
|
||||
alert1.Labels["host"] = inst1
|
||||
alert1.ID = hash(m1)
|
||||
|
||||
alert2, err := r.newAlert(m2, time.Now())
|
||||
alert2, err := r.newAlert(m2, time.Now(), nil)
|
||||
if err != nil {
|
||||
t.Fatalf("failed to create alert: %s", err)
|
||||
}
|
||||
alert2.State = notifier.StateFiring
|
||||
// add external label
|
||||
alert2.Labels["cluster"] = "east-1"
|
||||
// add rule labels - see config/testdata/rules1-good.rules
|
||||
alert2.Labels["label"] = "bar"
|
||||
alert2.Labels["host"] = inst2
|
||||
alert2.ID = hash(m2)
|
||||
|
||||
finished := make(chan struct{})
|
||||
|
||||
@@ -10,12 +10,12 @@ import (
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remoteread"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
@@ -41,12 +41,14 @@ Rule files may contain %{ENV_VAR} placeholders, which are substituted by the cor
|
||||
validateExpressions = flag.Bool("rule.validateExpressions", true, "Whether to validate rules expressions via MetricsQL engine")
|
||||
externalURL = flag.String("external.url", "", "External URL is used as alert's source for sent alerts to the notifier")
|
||||
externalAlertSource = flag.String("external.alert.source", "", `External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
|
||||
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|pathEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used`)
|
||||
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|pathEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used`)
|
||||
externalLabels = flagutil.NewArray("external.label", "Optional label in the form 'name=value' to add to all generated recording rules and alerts. "+
|
||||
"Pass multiple -label flags in order to add multiple label sets.")
|
||||
|
||||
remoteReadLookBack = flag.Duration("remoteRead.lookback", time.Hour, "Lookback defines how far to look into past for alerts timeseries."+
|
||||
" For example, if lookback=1h then range from now() to now()-1h will be scanned.")
|
||||
|
||||
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The `-rule` flag must be specified.")
|
||||
)
|
||||
|
||||
func main() {
|
||||
@@ -56,8 +58,19 @@ func main() {
|
||||
envflag.Parse()
|
||||
buildinfo.Init()
|
||||
logger.Init()
|
||||
cgroup.UpdateGOMAXPROCSToCPUQuota()
|
||||
|
||||
if *dryRun {
|
||||
u, _ := url.Parse("https://victoriametrics.com/")
|
||||
notifier.InitTemplateFunc(u)
|
||||
groups, err := config.Parse(*rulePath, true, true)
|
||||
if err != nil {
|
||||
logger.Fatalf(err.Error())
|
||||
}
|
||||
if len(groups) == 0 {
|
||||
logger.Fatalf("No rules for validation. Please specify path to file(s) with alerting and/or recording rules using `-rule` flag")
|
||||
}
|
||||
return
|
||||
}
|
||||
ctx, cancel := context.WithCancel(context.Background())
|
||||
manager, err := newManager(ctx)
|
||||
if err != nil {
|
||||
@@ -145,6 +158,9 @@ func newManager(ctx context.Context) (*manager, error) {
|
||||
manager.rr = rr
|
||||
|
||||
for _, s := range *externalLabels {
|
||||
if len(s) == 0 {
|
||||
continue
|
||||
}
|
||||
n := strings.IndexByte(s, '=')
|
||||
if n < 0 {
|
||||
return nil, fmt.Errorf("missing '=' in `-label`. It must contain label in the form `name=value`; got %q", s)
|
||||
@@ -190,7 +206,7 @@ func getAlertURLGenerator(externalURL *url.URL, externalAlertSource string, vali
|
||||
"tpl": externalAlertSource,
|
||||
}
|
||||
return func(alert notifier.Alert) string {
|
||||
templated, err := alert.ExecTemplate(m)
|
||||
templated, err := alert.ExecTemplate(nil, m)
|
||||
if err != nil {
|
||||
logger.Errorf("can not exec source template %s", err)
|
||||
}
|
||||
@@ -202,10 +218,7 @@ func usage() {
|
||||
const s = `
|
||||
vmalert processes alerts and recording rules.
|
||||
|
||||
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/README.md .
|
||||
See the docs at https://victoriametrics.github.io/vmalert.html .
|
||||
`
|
||||
|
||||
f := flag.CommandLine.Output()
|
||||
fmt.Fprintf(f, "%s\n", s)
|
||||
flag.PrintDefaults()
|
||||
flagutil.Usage(s)
|
||||
}
|
||||
|
||||
@@ -52,7 +52,8 @@ func (as AlertState) String() string {
|
||||
return "inactive"
|
||||
}
|
||||
|
||||
type alertTplData struct {
|
||||
// AlertTplData is used to execute templating
|
||||
type AlertTplData struct {
|
||||
Labels map[string]string
|
||||
Value float64
|
||||
Expr string
|
||||
@@ -60,23 +61,30 @@ type alertTplData struct {
|
||||
|
||||
const tplHeader = `{{ $value := .Value }}{{ $labels := .Labels }}{{ $expr := .Expr }}`
|
||||
|
||||
// ExecTemplate executes the Alert template for give
|
||||
// ExecTemplate executes the Alert template for given
|
||||
// map of annotations.
|
||||
func (a *Alert) ExecTemplate(annotations map[string]string) (map[string]string, error) {
|
||||
tplData := alertTplData{Value: a.Value, Labels: a.Labels, Expr: a.Expr}
|
||||
return templateAnnotations(annotations, tplHeader, tplData)
|
||||
// Every alert could have a different datasource, so function
|
||||
// requires a queryFunction as an argument.
|
||||
func (a *Alert) ExecTemplate(q QueryFn, annotations map[string]string) (map[string]string, error) {
|
||||
tplData := AlertTplData{Value: a.Value, Labels: a.Labels, Expr: a.Expr}
|
||||
return templateAnnotations(annotations, tplData, funcsWithQuery(q))
|
||||
}
|
||||
|
||||
// ExecTemplate executes the given template for given annotations map.
|
||||
func ExecTemplate(q QueryFn, annotations map[string]string, tpl AlertTplData) (map[string]string, error) {
|
||||
return templateAnnotations(annotations, tpl, funcsWithQuery(q))
|
||||
}
|
||||
|
||||
// ValidateTemplates validate annotations for possible template error, uses empty data for template population
|
||||
func ValidateTemplates(annotations map[string]string) error {
|
||||
_, err := templateAnnotations(annotations, tplHeader, alertTplData{
|
||||
_, err := templateAnnotations(annotations, AlertTplData{
|
||||
Labels: map[string]string{},
|
||||
Value: 0,
|
||||
})
|
||||
}, tmplFunc)
|
||||
return err
|
||||
}
|
||||
|
||||
func templateAnnotations(annotations map[string]string, header string, data alertTplData) (map[string]string, error) {
|
||||
func templateAnnotations(annotations map[string]string, data AlertTplData, funcs template.FuncMap) (map[string]string, error) {
|
||||
var builder strings.Builder
|
||||
var buf bytes.Buffer
|
||||
eg := new(utils.ErrGroup)
|
||||
@@ -85,10 +93,10 @@ func templateAnnotations(annotations map[string]string, header string, data aler
|
||||
r[key] = text
|
||||
buf.Reset()
|
||||
builder.Reset()
|
||||
builder.Grow(len(header) + len(text))
|
||||
builder.WriteString(header)
|
||||
builder.Grow(len(tplHeader) + len(text))
|
||||
builder.WriteString(tplHeader)
|
||||
builder.WriteString(text)
|
||||
if err := templateAnnotation(&buf, builder.String(), data); err != nil {
|
||||
if err := templateAnnotation(&buf, builder.String(), data, funcs); err != nil {
|
||||
eg.Add(fmt.Errorf("key %q, template %q: %w", key, text, err))
|
||||
continue
|
||||
}
|
||||
@@ -97,8 +105,9 @@ func templateAnnotations(annotations map[string]string, header string, data aler
|
||||
return r, eg.Err()
|
||||
}
|
||||
|
||||
func templateAnnotation(dst io.Writer, text string, data alertTplData) error {
|
||||
tpl, err := template.New("").Funcs(tmplFunc).Option("missingkey=zero").Parse(text)
|
||||
func templateAnnotation(dst io.Writer, text string, data AlertTplData, funcs template.FuncMap) error {
|
||||
t := template.New("").Funcs(funcs).Option("missingkey=zero")
|
||||
tpl, err := t.Parse(text)
|
||||
if err != nil {
|
||||
return fmt.Errorf("error parsing annotation: %w", err)
|
||||
}
|
||||
|
||||
@@ -2,6 +2,8 @@ package notifier
|
||||
|
||||
import (
|
||||
"testing"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
|
||||
)
|
||||
|
||||
func TestAlert_ExecTemplate(t *testing.T) {
|
||||
@@ -60,11 +62,41 @@ func TestAlert_ExecTemplate(t *testing.T) {
|
||||
"exprEscapedPath": "vm_rows%7B%5C%22label%5C%22=%5C%22bar%5C%22%7D%3E0",
|
||||
},
|
||||
},
|
||||
{
|
||||
name: "query",
|
||||
alert: &Alert{Expr: `vm_rows{"label"="bar"}>0`},
|
||||
annotations: map[string]string{
|
||||
"summary": `{{ query "foo" | first | value }}`,
|
||||
"desc": `{{ range query "bar" }}{{ . | label "foo" }} {{ . | value }};{{ end }}`,
|
||||
},
|
||||
expTpl: map[string]string{
|
||||
"summary": "1",
|
||||
"desc": "bar 1;garply 2;",
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
qFn := func(q string) ([]datasource.Metric, error) {
|
||||
return []datasource.Metric{
|
||||
{
|
||||
Labels: []datasource.Label{
|
||||
{Name: "foo", Value: "bar"},
|
||||
{Name: "baz", Value: "qux"},
|
||||
},
|
||||
Value: 1,
|
||||
},
|
||||
{
|
||||
Labels: []datasource.Label{
|
||||
{Name: "foo", Value: "garply"},
|
||||
{Name: "baz", Value: "fred"},
|
||||
},
|
||||
Value: 2,
|
||||
},
|
||||
}, nil
|
||||
}
|
||||
for _, tc := range testCases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
tpl, err := tc.alert.ExecTemplate(tc.annotations)
|
||||
tpl, err := tc.alert.ExecTemplate(qFn, tc.annotations)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
|
||||
@@ -28,7 +28,7 @@ func (am *AlertManager) Send(ctx context.Context, alerts []Alert) error {
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
req.Header.Set("Content-Type", "application/json; charset=utf-8")
|
||||
req = req.WithContext(ctx)
|
||||
if am.basicAuthPass != "" {
|
||||
req.SetBasicAuth(am.basicAuthUser, am.basicAuthPass)
|
||||
|
||||
@@ -1,7 +1,6 @@
|
||||
package notifier
|
||||
|
||||
import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"net/http"
|
||||
|
||||
@@ -11,10 +10,10 @@ import (
|
||||
|
||||
var (
|
||||
addrs = flagutil.NewArray("notifier.url", "Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093")
|
||||
basicAuthUsername = flagutil.NewArray("notifier.basicAuth.username", "Optional basic auth username for -datasource.url")
|
||||
basicAuthPassword = flagutil.NewArray("notifier.basicAuth.password", "Optional basic auth password for -datasource.url")
|
||||
basicAuthUsername = flagutil.NewArray("notifier.basicAuth.username", "Optional basic auth username for -notifier.url")
|
||||
basicAuthPassword = flagutil.NewArray("notifier.basicAuth.password", "Optional basic auth password for -notifier.url")
|
||||
|
||||
tlsInsecureSkipVerify = flag.Bool("notifier.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -notifier.url")
|
||||
tlsInsecureSkipVerify = flagutil.NewArrayBool("notifier.tlsInsecureSkipVerify", "Whether to skip tls verification when connecting to -notifier.url")
|
||||
tlsCertFile = flagutil.NewArray("notifier.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting to -notifier.url")
|
||||
tlsKeyFile = flagutil.NewArray("notifier.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to -notifier.url")
|
||||
tlsCAFile = flagutil.NewArray("notifier.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to -notifier.url. "+
|
||||
@@ -33,7 +32,7 @@ func Init(gen AlertURLGenerator) ([]Notifier, error) {
|
||||
for i, addr := range *addrs {
|
||||
cert, key := tlsCertFile.GetOptionalArg(i), tlsKeyFile.GetOptionalArg(i)
|
||||
ca, serverName := tlsCAFile.GetOptionalArg(i), tlsServerName.GetOptionalArg(i)
|
||||
tr, err := utils.Transport(addr, cert, key, ca, serverName, *tlsInsecureSkipVerify)
|
||||
tr, err := utils.Transport(addr, cert, key, ca, serverName, tlsInsecureSkipVerify.GetOptionalArg(i))
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to create transport: %w", err)
|
||||
}
|
||||
|
||||
@@ -14,21 +14,40 @@
|
||||
package notifier
|
||||
|
||||
import (
|
||||
"errors"
|
||||
"fmt"
|
||||
html_template "html/template"
|
||||
"math"
|
||||
"net/url"
|
||||
"regexp"
|
||||
"strings"
|
||||
text_template "text/template"
|
||||
"time"
|
||||
|
||||
htmlTpl "html/template"
|
||||
textTpl "text/template"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
|
||||
)
|
||||
|
||||
var tmplFunc text_template.FuncMap
|
||||
// QueryFn is used to wrap a call to datasource into simple-to-use function
|
||||
// for templating functions.
|
||||
type QueryFn func(query string) ([]datasource.Metric, error)
|
||||
|
||||
// funcsWithQuery returns a copy of the template helper functions with "query" bound to the given QueryFn
|
||||
func funcsWithQuery(query QueryFn) textTpl.FuncMap {
|
||||
fm := make(textTpl.FuncMap)
|
||||
for k, fn := range tmplFunc {
|
||||
fm[k] = fn
|
||||
}
|
||||
fm["query"] = func(q string) ([]datasource.Metric, error) {
|
||||
return query(q)
|
||||
}
|
||||
return fm
|
||||
}
|
||||
|
||||
var tmplFunc textTpl.FuncMap
|
||||
|
||||
// InitTemplateFunc initiates template helper functions
|
||||
func InitTemplateFunc(externalURL *url.URL) {
|
||||
tmplFunc = text_template.FuncMap{
|
||||
tmplFunc = textTpl.FuncMap{
|
||||
"args": func(args ...interface{}) map[string]interface{} {
|
||||
result := make(map[string]interface{})
|
||||
for i, a := range args {
|
||||
@@ -40,8 +59,8 @@ func InitTemplateFunc(externalURL *url.URL) {
|
||||
re := regexp.MustCompile(pattern)
|
||||
return re.ReplaceAllString(text, repl)
|
||||
},
|
||||
"safeHtml": func(text string) html_template.HTML {
|
||||
return html_template.HTML(text)
|
||||
"safeHtml": func(text string) htmlTpl.HTML {
|
||||
return htmlTpl.HTML(text)
|
||||
},
|
||||
"match": regexp.MatchString,
|
||||
"title": strings.Title,
|
||||
@@ -148,9 +167,33 @@ func InitTemplateFunc(externalURL *url.URL) {
|
||||
"queryEscape": func(q string) string {
|
||||
return url.QueryEscape(q)
|
||||
},
|
||||
"crlfEscape": func(q string) string {
|
||||
q = strings.Replace(q, "\n", `\n`, -1)
|
||||
return strings.Replace(q, "\r", `\r`, -1)
|
||||
},
|
||||
"quotesEscape": func(q string) string {
|
||||
return strings.Replace(q, `"`, `\"`, -1)
|
||||
},
|
||||
// query function supposed to be substituted at funcsWithQuery().
|
||||
// it is present here only for validation purposes, when there is no
|
||||
// provided datasource.
|
||||
"query": func(q string) ([]datasource.Metric, error) {
|
||||
// return non-empty slice to pass validation with chained functions in template
|
||||
// see issue #989 for details
|
||||
return []datasource.Metric{{}}, nil
|
||||
},
|
||||
"first": func(metrics []datasource.Metric) (datasource.Metric, error) {
|
||||
if len(metrics) > 0 {
|
||||
return metrics[0], nil
|
||||
}
|
||||
return datasource.Metric{}, errors.New("first() called on vector with no elements")
|
||||
},
|
||||
"label": func(label string, m datasource.Metric) string {
|
||||
return m.Label(label)
|
||||
},
|
||||
"value": func(m datasource.Metric) float64 {
|
||||
return m.Value
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -2,7 +2,6 @@ package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"fmt"
|
||||
"hash/fnv"
|
||||
"sort"
|
||||
@@ -79,8 +78,6 @@ func (rr *RecordingRule) Close() {
|
||||
metrics.UnregisterMetric(rr.metrics.errors.name)
|
||||
}
|
||||
|
||||
var errDuplicate = errors.New("result contains metrics with the same labelset after applying rule labels")
|
||||
|
||||
// Exec executes RecordingRule expression via the given Querier.
|
||||
func (rr *RecordingRule) Exec(ctx context.Context, q datasource.Querier, series bool) ([]prompbmarshal.TimeSeries, error) {
|
||||
if !series {
|
||||
|
||||
@@ -2,6 +2,7 @@ package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
@@ -25,3 +26,5 @@ type Rule interface {
|
||||
// such as metrics unregister
|
||||
Close()
|
||||
}
|
||||
|
||||
var errDuplicate = errors.New("result contains metrics with the same labelset after applying rule labels")
|
||||
|
||||
@@ -40,7 +40,7 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
w.Write(data)
|
||||
return true
|
||||
case "/api/v1/alerts":
|
||||
@@ -49,7 +49,7 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
w.Write(data)
|
||||
return true
|
||||
case "/-/reload":
|
||||
@@ -67,7 +67,7 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
w.Write(data)
|
||||
return true
|
||||
}
|
||||
|
||||
@@ -46,7 +46,7 @@ users:
|
||||
url_prefix: "http://localhost:8428"
|
||||
|
||||
# The user for querying account 123 in VictoriaMetrics cluster
|
||||
# See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/README.md#url-format
|
||||
# See https://victoriametrics.github.io/Cluster-VictoriaMetrics.html#url-format
|
||||
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
|
||||
# will be routed to http://vmselect:8481/select/123/prometheus .
|
||||
# For example, http://vmauth:8427/api/v1/query is routed to http://vmselect:8481/select/123/prometheus/api/v1/select
|
||||
@@ -55,7 +55,7 @@ users:
|
||||
url_prefix: "http://vmselect:8481/select/123/prometheus"
|
||||
|
||||
# The user for inserting Prometheus data into VictoriaMetrics cluster under account 42
|
||||
# See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/README.md#url-format
|
||||
# See https://victoriametrics.github.io/Cluster-VictoriaMetrics.html#url-format
|
||||
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
|
||||
# will be routed to http://vminsert:8480/insert/42/prometheus .
|
||||
# For example, http://vmauth:8427/api/v1/write is routed to http://vminsert:8480/insert/42/prometheus/api/v1/write
|
||||
@@ -87,7 +87,7 @@ Alternatively, [https termination proxy](https://en.wikipedia.org/wiki/TLS_termi
|
||||
### Monitoring
|
||||
|
||||
`vmauth` exports various metrics in Prometheus exposition format at `http://vmauth-host:8427/metrics` page. It is recommended setting up regular scraping of this page
|
||||
either via [vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md) or via Prometheus, so the exported metrics could be analyzed later.
|
||||
either via [vmagent](https://victoriametrics.github.io/vmagent.html) or via Prometheus, so the exported metrics could be analyzed later.
|
||||
|
||||
|
||||
### How to build from sources
|
||||
@@ -151,10 +151,10 @@ Pass `-help` command-line arg to `vmauth` in order to see all the configuration
|
||||
|
||||
vmauth authenticates and authorizes incoming requests and proxies them to VictoriaMetrics.
|
||||
|
||||
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/README.md .
|
||||
See the docs at https://victoriametrics.github.io/vmauth.html .
|
||||
|
||||
-auth.config string
|
||||
Path to auth config. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/README.md for details on the format of this auth config
|
||||
Path to auth config. See https://victoriametrics.github.io/vmauth.html for details on the format of this auth config
|
||||
-enableTCP6
|
||||
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP is used
|
||||
-envflag.enable
|
||||
|
||||
@@ -17,7 +17,7 @@ import (
|
||||
)
|
||||
|
||||
var (
|
||||
authConfigPath = flag.String("auth.config", "", "Path to auth config. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/README.md "+
|
||||
authConfigPath = flag.String("auth.config", "", "Path to auth config. See https://victoriametrics.github.io/vmauth.html "+
|
||||
"for details on the format of this auth config")
|
||||
)
|
||||
|
||||
|
||||
@@ -12,7 +12,7 @@ users:
|
||||
url_prefix: "http://localhost:8428"
|
||||
|
||||
# The user for querying account 123 in VictoriaMetrics cluster
|
||||
# See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/README.md#url-format
|
||||
# See https://victoriametrics.github.io/Cluster-VictoriaMetrics.html#url-format
|
||||
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
|
||||
# will be routed to http://vmselect:8481/select/123/prometheus .
|
||||
# For example, http://vmauth:8427/api/v1/query is routed to http://vmselect:8481/select/123/prometheus/api/v1/select
|
||||
@@ -21,7 +21,7 @@ users:
|
||||
url_prefix: "http://vmselect:8481/select/123/prometheus"
|
||||
|
||||
# The user for inserting Prometheus data into VictoriaMetrics cluster under account 42
|
||||
# See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/README.md#url-format
|
||||
# See https://victoriametrics.github.io/Cluster-VictoriaMetrics.html#url-format
|
||||
# All the requests to http://vmauth:8427 with the given Basic Auth (username:password)
|
||||
# will be routed to http://vminsert:8480/insert/42/prometheus .
|
||||
# For example, http://vmauth:8427/api/v1/write is routed to http://vminsert:8480/insert/42/prometheus/api/v1/write
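The routing described in these comments amounts to appending the incoming request path to the user's `url_prefix`. A minimal Go sketch of that mapping follows; `targetURL` is a hypothetical helper written for illustration, and trimming a trailing slash from the prefix is an assumption, not something this diff shows vmauth doing.

```go
package main

import (
	"fmt"
	"strings"
)

// targetURL is a hypothetical helper (not part of vmauth): it appends the
// incoming request path to the url_prefix configured for the user.
func targetURL(urlPrefix, requestPath string) string {
	// Assumption: drop a trailing "/" from the prefix to avoid "//" in the result.
	return strings.TrimSuffix(urlPrefix, "/") + requestPath
}

func main() {
	// Prefixes and paths are taken from the config comments above.
	fmt.Println(targetURL("http://vmselect:8481/select/123/prometheus", "/api/v1/query"))
	fmt.Println(targetURL("http://vminsert:8480/insert/42/prometheus", "/api/v1/write"))
}
```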
|
||||
|
||||
@@ -2,7 +2,6 @@ package main
|
||||
|
||||
import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"net/http/httputil"
|
||||
"net/url"
|
||||
@@ -10,8 +9,8 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
|
||||
@@ -28,7 +27,6 @@ func main() {
|
||||
envflag.Parse()
|
||||
buildinfo.Init()
|
||||
logger.Init()
|
||||
cgroup.UpdateGOMAXPROCSToCPUQuota()
|
||||
logger.Infof("starting vmauth at %q...", *httpListenAddr)
|
||||
startTime := time.Now()
|
||||
initAuthConfig()
|
||||
@@ -98,10 +96,7 @@ func usage() {
|
||||
const s = `
|
||||
vmauth authenticates and authorizes incoming requests and proxies them to VictoriaMetrics.
|
||||
|
||||
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/README.md .
|
||||
See the docs at https://victoriametrics.github.io/vmauth.html .
|
||||
`
|
||||
|
||||
f := flag.CommandLine.Output()
|
||||
fmt.Fprintf(f, "%s\n", s)
|
||||
flag.PrintDefaults()
|
||||
flagutil.Usage(s)
|
||||
}
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
## vmbackup
|
||||
|
||||
`vmbackup` creates VictoriaMetrics data backups from [instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
|
||||
`vmbackup` creates VictoriaMetrics data backups from [instant snapshots](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots).
|
||||
|
||||
Supported storage systems for backups:
|
||||
|
||||
@@ -15,7 +15,7 @@ data between the existing backup and new backup. It saves time and costs on data
|
||||
|
||||
Backup process can be interrupted at any time. It is automatically resumed from the interruption point when restarting `vmbackup` with the same args.
|
||||
|
||||
Backed up data can be restored with [vmrestore](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md).
|
||||
Backed up data can be restored with [vmrestore](https://victoriametrics.github.io/vmrestore.html).
|
||||
|
||||
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
|
||||
|
||||
@@ -34,8 +34,8 @@ vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-
|
||||
```
|
||||
|
||||
* `</path/to/victoria-metrics-data>` - path to VictoriaMetrics data pointed to by the `-storageDataPath` command-line flag in single-node VictoriaMetrics or in cluster `vmstorage`.
|
||||
There is no need to stop VictoriaMetrics for creating backups, since they are performed from immutable [instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
|
||||
* `<local-snapshot>` is the snapshot to back up. See [how to create instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
|
||||
There is no need to stop VictoriaMetrics for creating backups, since they are performed from immutable [instant snapshots](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots).
|
||||
* `<local-snapshot>` is the snapshot to back up. See [how to create instant snapshots](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots).
|
||||
* `<bucket>` is the name of an already existing [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets).
|
||||
* `<path/to/new/backup>` is the destination path where new backup will be placed.
|
||||
|
||||
@@ -72,7 +72,7 @@ Smart backups mean storing full daily backups into `YYYYMMDD` folders and creati
|
||||
vmbackup -snapshotName=<latest-snapshot> -dst=gcs://<bucket>/latest
|
||||
```
|
||||
|
||||
Where `<latest-snapshot>` is the latest [snapshot](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
|
||||
Where `<latest-snapshot>` is the latest [snapshot](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots).
|
||||
The command will upload only changed data to `gcs://<bucket>/latest`.
|
||||
|
||||
* Run the following command once a day:
|
||||
@@ -123,8 +123,8 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
|
||||
* If the backup is slow, then try setting a higher value for the `-concurrency` flag. This will increase the number of concurrent workers that upload data to backup storage.
|
||||
* If `vmbackup` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value.
|
||||
* If `vmbackup` has been interrupted due to a temporary error, then just restart it with the same args. It will resume the backup process.
|
||||
* Backups created from [single-node VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md) cannot be restored
|
||||
at [cluster VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/README.md) and vice versa.
|
||||
* Backups created from [single-node VictoriaMetrics](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html) cannot be restored
|
||||
at [cluster VictoriaMetrics](https://victoriametrics.github.io/Cluster-VictoriaMetrics.html) and vice versa.
|
||||
|
||||
|
||||
### Advanced usage
|
||||
@@ -214,7 +214,7 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
|
||||
-snapshot.deleteURL string
|
||||
VictoriaMetrics delete snapshot url. Optional. Will be generated from -snapshot.createURL if not provided. All created snapshots will be automatically deleted. Example: http://victoriametrics:8428/snapshot/delete
|
||||
-snapshotName string
|
||||
Name for the snapshot to backup. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots
|
||||
Name for the snapshot to backup. See https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots
|
||||
-storageDataPath string
|
||||
Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data")
|
||||
-version
|
||||
|
||||
@@ -10,8 +10,8 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/actions"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/common"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fslocal"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fsnil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
@@ -19,7 +19,7 @@ import (
|
||||
|
||||
var (
|
||||
storageDataPath = flag.String("storageDataPath", "victoria-metrics-data", "Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage")
|
||||
snapshotName = flag.String("snapshotName", "", "Name for the snapshot to backup. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots")
|
||||
snapshotName = flag.String("snapshotName", "", "Name for the snapshot to backup. See https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots")
|
||||
snapshotCreateURL = flag.String("snapshot.createURL", "", "VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup. "+
|
||||
"Example: http://victoriametrics:8428/snapshot/create")
|
||||
snapshotDeleteURL = flag.String("snapshot.deleteURL", "", "VictoriaMetrics delete snapshot url. Optional. Will be generated from -snapshot.createURL if not provided. "+
|
||||
@@ -39,10 +39,9 @@ func main() {
|
||||
envflag.Parse()
|
||||
buildinfo.Init()
|
||||
logger.Init()
|
||||
cgroup.UpdateGOMAXPROCSToCPUQuota()
|
||||
|
||||
if len(*snapshotCreateURL) > 0 {
|
||||
logger.Infof("%s", "Snapshots enabled")
|
||||
logger.Infof("Snapshots enabled")
|
||||
logger.Infof("Snapshot create url %s", *snapshotCreateURL)
|
||||
if len(*snapshotDeleteURL) <= 0 {
|
||||
err := flag.Set("snapshot.deleteURL", strings.Replace(*snapshotCreateURL, "/create", "/delete", 1))
|
||||
@@ -54,17 +53,17 @@ func main() {
|
||||
|
||||
name, err := snapshot.Create(*snapshotCreateURL)
|
||||
if err != nil {
|
||||
logger.Fatalf("%s", err)
|
||||
logger.Fatalf("cannot create snapshot: %s", err)
|
||||
}
|
||||
err = flag.Set("snapshotName", name)
|
||||
if err != nil {
|
||||
logger.Fatalf("Failed to set snapshotName flag: %v", err)
|
||||
logger.Fatalf("cannot set snapshotName flag: %v", err)
|
||||
}
|
||||
|
||||
defer func() {
|
||||
err := snapshot.Delete(*snapshotDeleteURL, name)
|
||||
if err != nil {
|
||||
logger.Fatalf("%s", err)
|
||||
logger.Fatalf("cannot delete snapshot: %s", err)
|
||||
}
|
||||
}()
|
||||
}
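The fallback for an empty `-snapshot.deleteURL` shown above is a single string substitution on the create URL. A small self-contained sketch of that derivation follows; `deriveDeleteURL` is a hypothetical name used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// deriveDeleteURL is a hypothetical wrapper around the substitution used in
// main(): it swaps the first "/create" in the create URL for "/delete".
func deriveDeleteURL(createURL string) string {
	return strings.Replace(createURL, "/create", "/delete", 1)
}

func main() {
	fmt.Println(deriveDeleteURL("http://victoriametrics:8428/snapshot/create"))
}
```

Because only the first occurrence is replaced, a create URL whose host or path prefix itself contains `/create` would be rewritten in the wrong place; passing `-snapshot.deleteURL` explicitly avoids that.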
|
||||
@@ -100,12 +99,9 @@ func usage() {
|
||||
vmbackup performs backups for VictoriaMetrics data from instant snapshots to gcs, s3
|
||||
or local filesystem. Backed up data can be restored with vmrestore.
|
||||
|
||||
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md .
|
||||
See the docs at https://victoriametrics.github.io/vmbackup.html .
|
||||
`
|
||||
|
||||
f := flag.CommandLine.Output()
|
||||
fmt.Fprintf(f, "%s\n", s)
|
||||
flag.PrintDefaults()
|
||||
flagutil.Usage(s)
|
||||
}
|
||||
|
||||
func newSrcFS() (*fslocal.FS, error) {
|
||||
@@ -146,9 +142,9 @@ func newDstFS() (common.RemoteFS, error) {
|
||||
return fs, nil
|
||||
}
|
||||
|
||||
func newOriginFS() (common.RemoteFS, error) {
|
||||
func newOriginFS() (common.OriginFS, error) {
|
||||
if len(*origin) == 0 {
|
||||
return nil, nil
|
||||
return &fsnil.FS{}, nil
|
||||
}
|
||||
fs, err := actions.NewRemoteFS(*origin)
|
||||
if err != nil {
|
||||
|
||||
@@ -20,26 +20,27 @@ type snapshot struct {
|
||||
// Create creates a snapshot at the provided api endpoint and returns
|
||||
// the snapshot name
|
||||
func Create(createSnapshotURL string) (string, error) {
|
||||
logger.Infof("%s", "Creating snapshot")
|
||||
logger.Infof("Creating snapshot")
|
||||
u, err := url.Parse(createSnapshotURL)
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
|
||||
resp, err := http.Get(u.String())
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
|
||||
body, err := ioutil.ReadAll(resp.Body)
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
return "", fmt.Errorf("unexpected status code returned from %q; expecting %d; got %d; response body: %q", createSnapshotURL, http.StatusOK, resp.StatusCode, body)
|
||||
}
|
||||
|
||||
snap := snapshot{}
|
||||
err = json.Unmarshal(body, &snap)
|
||||
if err != nil {
|
||||
return "", err
|
||||
return "", fmt.Errorf("cannot parse JSON response from %q: %w; response body: %q", createSnapshotURL, err, body)
|
||||
}
|
||||
|
||||
if snap.Status == "ok" {
|
||||
@@ -58,26 +59,26 @@ func Delete(deleteSnapshotURL string, snapshotName string) error {
|
||||
formData := url.Values{
|
||||
"snapshot": {snapshotName},
|
||||
}
|
||||
|
||||
u, err := url.Parse(deleteSnapshotURL)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
resp, err := http.PostForm(u.String(), formData)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
body, err := ioutil.ReadAll(resp.Body)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
return fmt.Errorf("unexpected status code returned from %q; expecting %d; got %d; response body: %q", deleteSnapshotURL, http.StatusOK, resp.StatusCode, body)
|
||||
}
|
||||
|
||||
snap := snapshot{}
|
||||
err = json.Unmarshal(body, &snap)
|
||||
if err != nil {
|
||||
return err
|
||||
return fmt.Errorf("cannot parse JSON response from %q: %w; response body: %q", deleteSnapshotURL, err, body)
|
||||
}
|
||||
|
||||
if snap.Status == "ok" {
|
||||
|
||||
@@ -1,8 +1,9 @@
|
||||
package common
|
||||
|
||||
import (
|
||||
"runtime"
|
||||
"sync"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
)
|
||||
|
||||
// GetInsertCtx returns InsertCtx from the pool.
|
||||
@@ -33,4 +34,4 @@ func PutInsertCtx(ctx *InsertCtx) {
|
||||
}
|
||||
|
||||
var insertCtxPool sync.Pool
|
||||
var insertCtxPoolCh = make(chan *InsertCtx, runtime.GOMAXPROCS(-1))
|
||||
var insertCtxPoolCh = make(chan *InsertCtx, cgroup.AvailableCPUs())
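The two variables above suggest a two-level pool: a bounded channel sized to the CPU count in front of a `sync.Pool`. The bodies of `GetInsertCtx` and `PutInsertCtx` are not part of this hunk, so the following is only a sketch of how such a pool can work. The names `insertCtx`, `getCtx`, and `putCtx` are hypothetical, and `runtime.GOMAXPROCS(-1)` stands in for `cgroup.AvailableCPUs()` to keep the example stdlib-only.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// insertCtx is a hypothetical stand-in for the pooled context type.
type insertCtx struct{ buf []byte }

var (
	ctxPool sync.Pool
	// Assumption: one channel slot per available CPU.
	ctxPoolCh = make(chan *insertCtx, runtime.GOMAXPROCS(-1))
)

// getCtx tries the channel first, then sync.Pool, then allocates.
func getCtx() *insertCtx {
	select {
	case ctx := <-ctxPoolCh:
		return ctx
	default:
		if v := ctxPool.Get(); v != nil {
			return v.(*insertCtx)
		}
		return &insertCtx{}
	}
}

// putCtx resets the context and returns it to the channel,
// falling back to sync.Pool when the channel is full.
func putCtx(ctx *insertCtx) {
	ctx.buf = ctx.buf[:0]
	select {
	case ctxPoolCh <- ctx:
	default:
		ctxPool.Put(ctx)
	}
}

func main() {
	ctx := getCtx()
	ctx.buf = append(ctx.buf, "sample"...)
	putCtx(ctx)
	fmt.Println(getCtx() == ctx)
}
```

Objects parked in the channel are not dropped by the garbage collector the way `sync.Pool` entries can be, which is one plausible reason to size the channel by CPU count.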
|
||||
|
||||
@@ -4,13 +4,15 @@ import (
|
||||
"flag"
|
||||
"io"
|
||||
"net/http"
|
||||
"runtime"
|
||||
"sync"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
|
||||
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
|
||||
@@ -33,7 +35,9 @@ var (
|
||||
// See https://github.com/influxdata/telegraf/tree/master/plugins/inputs/socket_listener/
|
||||
func InsertHandlerForReader(r io.Reader) error {
|
||||
return writeconcurrencylimiter.Do(func() error {
|
||||
return parser.ParseStream(r, false, "", "", insertRows)
|
||||
return parser.ParseStream(r, false, "", "", func(db string, rows []parser.Row) error {
|
||||
return insertRows(db, rows, nil)
|
||||
})
|
||||
})
|
||||
}
|
||||
|
||||
@@ -41,17 +45,23 @@ func InsertHandlerForReader(r io.Reader) error {
|
||||
//
|
||||
// See https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/tsdb/README.md
|
||||
func InsertHandlerForHTTP(req *http.Request) error {
|
||||
extraLabels, err := parserCommon.GetExtraLabels(req)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return writeconcurrencylimiter.Do(func() error {
|
||||
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
|
||||
q := req.URL.Query()
|
||||
precision := q.Get("precision")
|
||||
// Read db tag from https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint
|
||||
db := q.Get("db")
|
||||
return parser.ParseStream(req.Body, isGzipped, precision, db, insertRows)
|
||||
return parser.ParseStream(req.Body, isGzipped, precision, db, func(db string, rows []parser.Row) error {
|
||||
return insertRows(db, rows, extraLabels)
|
||||
})
|
||||
})
|
||||
}
|
||||
|
||||
func insertRows(db string, rows []parser.Row) error {
|
||||
func insertRows(db string, rows []parser.Row, extraLabels []prompbmarshal.Label) error {
|
||||
ctx := getPushCtx()
|
||||
defer putPushCtx(ctx)
|
||||
|
||||
@@ -78,6 +88,10 @@ func insertRows(db string, rows []parser.Row) error {
|
||||
if !hasDBKey {
|
||||
ic.AddLabel("db", db)
|
||||
}
|
||||
for j := range extraLabels {
|
||||
label := &extraLabels[j]
|
||||
ic.AddLabel(label.Name, label.Value)
|
||||
}
|
||||
ctx.metricGroupBuf = ctx.metricGroupBuf[:0]
|
||||
if !*skipMeasurement {
|
||||
ctx.metricGroupBuf = append(ctx.metricGroupBuf, r.Measurement...)
|
||||
@@ -175,4 +189,4 @@ func putPushCtx(ctx *pushCtx) {
|
||||
}
|
||||
|
||||
var pushCtxPool sync.Pool
|
||||
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))
|
||||
var pushCtxPoolCh = make(chan *pushCtx, cgroup.AvailableCPUs())
|
||||
|
||||
@@ -4,7 +4,6 @@ import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync/atomic"
|
||||
|
||||
@@ -40,7 +39,7 @@ var (
|
||||
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
|
||||
"Usually :4242 must be set. Doesn't work if empty")
|
||||
opentsdbHTTPListenAddr = flag.String("opentsdbHTTPListenAddr", "", "TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty")
|
||||
maxLabelsPerTimeseries = flag.Int("maxLabelsPerTimeseries", 30, "The maximum number of labels accepted per time series. Superflouos labels are dropped")
|
||||
maxLabelsPerTimeseries = flag.Int("maxLabelsPerTimeseries", 30, "The maximum number of labels accepted per time series. Superfluous labels are dropped")
|
||||
)
|
||||
|
||||
var (
|
||||
@@ -155,15 +154,29 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
return true
|
||||
case "/targets":
|
||||
promscrapeTargetsRequests.Inc()
|
||||
w.Header().Set("Content-Type", "text/plain")
|
||||
showOriginalLabels, _ := strconv.ParseBool(r.FormValue("show_original_labels"))
|
||||
promscrape.WriteHumanReadableTargetsStatus(w, showOriginalLabels)
|
||||
promscrape.WriteHumanReadableTargetsStatus(w, r)
|
||||
return true
|
||||
case "/api/v1/targets":
|
||||
promscrapeAPIV1TargetsRequests.Inc()
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
state := r.FormValue("state")
|
||||
promscrape.WriteAPIV1Targets(w, state)
|
||||
return true
|
||||
case "/-/reload":
|
||||
promscrapeConfigReloadRequests.Inc()
|
||||
procutil.SelfSIGHUP()
|
||||
w.WriteHeader(http.StatusNoContent)
|
||||
return true
|
||||
case "/ready":
|
||||
if rdy := atomic.LoadInt32(&promscrape.PendingScrapeConfigs); rdy > 0 {
|
||||
errMsg := fmt.Sprintf("waiting for scrape config to init targets, configs left: %d", rdy)
|
||||
http.Error(w, errMsg, http.StatusTooEarly)
|
||||
} else {
|
||||
w.Header().Set("Content-Type", "text/plain; charset=utf-8")
|
||||
w.WriteHeader(http.StatusOK)
|
||||
w.Write([]byte("OK"))
|
||||
}
|
||||
return true
|
||||
default:
|
||||
// This is not our link
|
||||
return false
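The new `/ready` case reports readiness from an atomic counter of pending scrape configs. A standalone sketch of the same idea follows, with the decision logic split into a hypothetical `readiness` function so it can be exercised without an HTTP server; `pendingScrapeConfigs` here is a local stand-in for `promscrape.PendingScrapeConfigs`.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// pendingScrapeConfigs is a local stand-in for promscrape.PendingScrapeConfigs.
var pendingScrapeConfigs int32

// readiness returns the HTTP status and body that the /ready endpoint would send.
func readiness() (int, string) {
	if n := atomic.LoadInt32(&pendingScrapeConfigs); n > 0 {
		return http.StatusTooEarly, fmt.Sprintf("waiting for scrape config to init targets, configs left: %d", n)
	}
	return http.StatusOK, "OK"
}

// readyHandler is a hypothetical net/http adapter around readiness.
func readyHandler(w http.ResponseWriter, r *http.Request) {
	status, body := readiness()
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	w.WriteHeader(status)
	fmt.Fprint(w, body)
}

func main() {
	atomic.StoreInt32(&pendingScrapeConfigs, 2)
	fmt.Println(readiness())
	atomic.StoreInt32(&pendingScrapeConfigs, 0)
	fmt.Println(readiness())
}
```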
|
||||
@@ -191,7 +204,8 @@ var (
|
||||
|
||||
influxQueryRequests = metrics.NewCounter(`vm_http_requests_total{path="/query", protocol="influx"}`)
|
||||
|
||||
promscrapeTargetsRequests = metrics.NewCounter(`vm_http_requests_total{path="/targets"}`)
|
||||
promscrapeTargetsRequests = metrics.NewCounter(`vm_http_requests_total{path="/targets"}`)
|
||||
promscrapeAPIV1TargetsRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/targets"}`)
|
||||
|
||||
promscrapeConfigReloadRequests = metrics.NewCounter(`vm_http_requests_total{path="/-/reload"}`)
|
||||
|
||||
|
||||
@@ -2,11 +2,11 @@ package native
|
||||
|
||||
import (
|
||||
"net/http"
|
||||
"runtime"
|
||||
"sync"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
|
||||
@@ -112,4 +112,4 @@ func putPushCtx(ctx *pushCtx) {
|
||||
}
|
||||
|
||||
var pushCtxPool sync.Pool
|
||||
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))
|
||||
var pushCtxPoolCh = make(chan *pushCtx, cgroup.AvailableCPUs())
|
||||
|
||||
@@ -6,6 +6,8 @@ import (
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
|
||||
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdbhttp"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
@@ -22,15 +24,21 @@ func InsertHandler(req *http.Request) error {
|
||||
path := req.URL.Path
|
||||
switch path {
|
||||
case "/api/put":
|
||||
extraLabels, err := parserCommon.GetExtraLabels(req)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return writeconcurrencylimiter.Do(func() error {
|
||||
return parser.ParseStream(req, insertRows)
|
||||
return parser.ParseStream(req, func(rows []parser.Row) error {
|
||||
return insertRows(rows, extraLabels)
|
||||
})
|
||||
})
|
||||
default:
|
||||
return fmt.Errorf("unexpected path requested on HTTP OpenTSDB server: %q", path)
|
||||
}
|
||||
}
|
||||
|
||||
func insertRows(rows []parser.Row) error {
|
||||
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
|
||||
ctx := common.GetInsertCtx()
|
||||
defer common.PutInsertCtx(ctx)
|
||||
|
||||
@@ -44,6 +52,10 @@ func insertRows(rows []parser.Row) error {
|
||||
tag := &r.Tags[j]
|
||||
ctx.AddLabel(tag.Key, tag.Value)
|
||||
}
|
||||
for j := range extraLabels {
|
||||
label := &extraLabels[j]
|
||||
ctx.AddLabel(label.Name, label.Value)
|
||||
}
|
||||
if hasRelabeling {
|
||||
ctx.ApplyRelabeling()
|
||||
}
|
||||
|
||||
@@ -31,7 +31,7 @@ func InsertHandler(req *http.Request) error {
|
||||
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
|
||||
return parser.ParseStream(req.Body, defaultTimestamp, isGzipped, func(rows []parser.Row) error {
|
||||
return insertRows(rows, extraLabels)
|
||||
})
|
||||
}, nil)
|
||||
})
|
||||
}
|
||||
|
||||
|
||||
@@ -6,6 +6,8 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
|
||||
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
@@ -18,12 +20,18 @@ var (
|
||||
|
||||
// InsertHandler processes remote write for prometheus.
|
||||
func InsertHandler(req *http.Request) error {
|
||||
extraLabels, err := parserCommon.GetExtraLabels(req)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return writeconcurrencylimiter.Do(func() error {
|
||||
return parser.ParseStream(req, insertRows)
|
||||
return parser.ParseStream(req, func(tss []prompb.TimeSeries) error {
|
||||
return insertRows(tss, extraLabels)
|
||||
})
|
||||
})
|
||||
}
|
||||
|
||||
func insertRows(timeseries []prompb.TimeSeries) error {
|
||||
func insertRows(timeseries []prompb.TimeSeries, extraLabels []prompbmarshal.Label) error {
|
||||
ctx := common.GetInsertCtx()
|
||||
defer common.PutInsertCtx(ctx)
|
||||
|
||||
@@ -42,6 +50,10 @@ func insertRows(timeseries []prompb.TimeSeries) error {
|
||||
for _, srcLabel := range srcLabels {
|
||||
ctx.AddLabelBytes(srcLabel.Name, srcLabel.Value)
|
||||
}
|
||||
for j := range extraLabels {
|
||||
label := &extraLabels[j]
|
||||
ctx.AddLabel(label.Name, label.Value)
|
||||
}
|
||||
if hasRelabeling {
|
||||
ctx.ApplyRelabeling()
|
||||
}
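Across these handlers the parser still expects a callback that takes only the parsed rows, so `extraLabels` is passed through a closure rather than through the parser API. A reduced sketch of that adapter pattern follows; `label`, `row`, `parseStream`, and `handle` are hypothetical names, and the parser is replaced by a stub that simply invokes the callback.

```go
package main

import "fmt"

// label and row are hypothetical stand-ins for the real label and row types.
type label struct{ Name, Value string }

type row struct {
	Metric string
	Labels []label
}

// parseStream stands in for a protocol parser that only knows
// the callback signature func([]row) error.
func parseStream(input []row, callback func(rows []row) error) error {
	return callback(input)
}

// insertRows appends extraLabels to every row and stores the result in dst.
func insertRows(dst *[]row, rows []row, extraLabels []label) error {
	for _, r := range rows {
		labels := append([]label{}, r.Labels...)
		labels = append(labels, extraLabels...)
		*dst = append(*dst, row{Metric: r.Metric, Labels: labels})
	}
	return nil
}

// handle captures extraLabels in a closure so that the parser callback
// signature stays unchanged.
func handle(input []row, extraLabels []label) ([]row, error) {
	var stored []row
	err := parseStream(input, func(rows []row) error {
		return insertRows(&stored, rows, extraLabels)
	})
	return stored, err
}

func main() {
	in := []row{{Metric: "up", Labels: []label{{"job", "node"}}}}
	out, err := handle(in, []label{{"env", "prod"}})
	fmt.Println(out, err)
}
```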
|
||||
|
||||
@@ -69,12 +69,7 @@ type Ctx struct {
|
||||
|
||||
// Reset resets ctx.
|
||||
func (ctx *Ctx) Reset() {
|
||||
labels := ctx.tmpLabels
|
||||
for i := range labels {
|
||||
label := &labels[i]
|
||||
label.Name = ""
|
||||
label.Value = ""
|
||||
}
|
||||
promrelabel.CleanLabels(ctx.tmpLabels)
|
||||
ctx.tmpLabels = ctx.tmpLabels[:0]
|
||||
}
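The removed loop and its replacement both clear the label strings before truncating the slice, so the backing array stops pinning old strings in memory. `promrelabel.CleanLabels` itself is not shown in this diff; the sketch below uses a hypothetical `cleanLabels` with the behavior of the removed loop.

```go
package main

import "fmt"

// label and ctx are hypothetical stand-ins for the real types.
type label struct{ Name, Value string }

type ctx struct{ tmpLabels []label }

// cleanLabels zeroes every label so the strings it referenced can be collected.
func cleanLabels(labels []label) {
	for i := range labels {
		labels[i] = label{}
	}
}

// reset clears the labels and keeps the backing array for reuse.
func (c *ctx) reset() {
	cleanLabels(c.tmpLabels)
	c.tmpLabels = c.tmpLabels[:0]
}

func main() {
	c := &ctx{}
	c.tmpLabels = append(c.tmpLabels, label{"job", "node"})
	c.reset()
	fmt.Println(len(c.tmpLabels), cap(c.tmpLabels) > 0)
}
```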
|
||||
|
||||
|
||||
@@ -2,11 +2,11 @@ package vmimport
|
||||
|
||||
import (
|
||||
"net/http"
|
||||
"runtime"
|
||||
"sync"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
|
||||
@@ -117,4 +117,4 @@ func putPushCtx(ctx *pushCtx) {
|
||||
}
|
||||
|
||||
var pushCtxPool sync.Pool
|
||||
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))
|
||||
var pushCtxPoolCh = make(chan *pushCtx, cgroup.AvailableCPUs())
|
||||
|
||||
@@ -1,9 +1,9 @@
|
||||
## vmrestore
|
||||
|
||||
`vmrestore` restores data from backups created by [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md).
|
||||
`vmrestore` restores data from backups created by [vmbackup](https://victoriametrics.github.io/vmbackup.html).
|
||||
VictoriaMetrics `v1.29.0` and newer versions must be used for working with the restored data.
|
||||
|
||||
Restore process can be interrupted at any time. It is automatically resumed from the inerruption point
|
||||
Restore process can be interrupted at any time. It is automatically resumed from the interruption point
|
||||
when restarting `vmrestore` with the same args.
|
||||
|
||||
|
||||
@@ -17,7 +17,7 @@ vmrestore -src=gcs://<bucket>/<path/to/backup> -storageDataPath=<local/path/to/r
|
||||
```
|
||||
|
||||
* `<bucket>` is [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets) name.
|
||||
* `<path/to/backup>` is the path to backup made with [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md) on GCS bucket.
|
||||
* `<path/to/backup>` is the path to backup made with [vmbackup](https://victoriametrics.github.io/vmbackup.html) on GCS bucket.
|
||||
* `<local/path/to/restore>` is the path to folder where data will be restored. This folder must be passed
|
||||
to VictoriaMetrics in `-storageDataPath` command-line flag after the restore process is complete.
|
||||
|
||||
|
||||
@@ -9,7 +9,6 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/common"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fslocal"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
@@ -33,7 +32,6 @@ func main() {
|
||||
envflag.Parse()
|
||||
buildinfo.Init()
|
||||
logger.Init()
|
||||
cgroup.UpdateGOMAXPROCSToCPUQuota()
|
||||
|
||||
srcFS, err := newSrcFS()
|
||||
if err != nil {
|
||||
@@ -60,12 +58,9 @@ func usage() {
|
||||
const s = `
|
||||
vmrestore restores VictoriaMetrics data from backups made by vmbackup.
|
||||
|
||||
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md .
|
||||
See the docs at https://victoriametrics.github.io/vmrestore.html .
|
||||
`
|
||||
|
||||
f := flag.CommandLine.Output()
|
||||
fmt.Fprintf(f, "%s\n", s)
|
||||
flag.PrintDefaults()
|
||||
flagutil.Usage(s)
|
||||
}
|
||||
|
||||
func newDstFS() (*fslocal.FS, error) {
|
||||
|
||||
@@ -84,10 +84,7 @@ func MetricsFindHandler(startTime time.Time, w http.ResponseWriter, r *http.Requ
|
||||
}
|
||||
paths = deduplicatePaths(paths, delimiter)
|
||||
sortPaths(paths, delimiter)
|
||||
contentType := "application/json"
|
||||
if jsonp != "" {
|
||||
contentType = "text/javascript"
|
||||
}
|
||||
contentType := getContentType(jsonp)
|
||||
w.Header().Set("Content-Type", contentType)
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
@@ -166,10 +163,7 @@ func MetricsExpandHandler(startTime time.Time, w http.ResponseWriter, r *http.Re
|
||||
}
|
||||
m[query] = paths
|
||||
}
|
||||
contentType := "application/json"
|
||||
if jsonp != "" {
|
||||
contentType = "text/javascript"
|
||||
}
|
||||
contentType := getContentType(jsonp)
|
||||
w.Header().Set("Content-Type", contentType)
|
||||
if groupByExpr {
|
||||
for _, paths := range m {
|
||||
@@ -215,10 +209,7 @@ func MetricsIndexHandler(startTime time.Time, w http.ResponseWriter, r *http.Req
|
||||
if err != nil {
|
||||
return fmt.Errorf(`cannot obtain metric names: %w`, err)
|
||||
}
|
||||
contentType := "application/json"
|
||||
if jsonp != "" {
|
||||
contentType = "text/javascript"
|
||||
}
|
||||
contentType := getContentType(jsonp)
|
||||
w.Header().Set("Content-Type", contentType)
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
@@ -347,47 +338,8 @@ func getRegexpForQuery(query string, delimiter byte) (*regexp.Regexp, error) {
|
||||
if re := regexpCache[k]; re != nil {
|
||||
return re.re, re.err
|
||||
}
|
||||
a := make([]string, 0, len(query))
|
||||
quotedDelimiter := regexp.QuoteMeta(string([]byte{delimiter}))
|
||||
tillNextDelimiter := "[^" + quotedDelimiter + "]*"
|
||||
for i := 0; i < len(query); i++ {
|
||||
switch query[i] {
|
||||
case '*':
|
||||
a = append(a, tillNextDelimiter)
|
||||
case '{':
|
||||
tmp := query[i+1:]
|
||||
if n := strings.IndexByte(tmp, '}'); n < 0 {
|
||||
a = append(a, regexp.QuoteMeta(query[i:]))
|
||||
i = len(query)
|
||||
} else {
|
||||
a = append(a, "(?:")
|
||||
opts := strings.Split(tmp[:n], ",")
|
||||
for j, opt := range opts {
|
||||
opts[j] = regexp.QuoteMeta(opt)
|
||||
}
|
||||
a = append(a, strings.Join(opts, "|"))
|
||||
a = append(a, ")")
|
||||
i += n + 1
|
||||
}
|
||||
case '[':
|
||||
tmp := query[i:]
|
||||
if n := strings.IndexByte(tmp, ']'); n < 0 {
|
||||
a = append(a, regexp.QuoteMeta(query[i:]))
|
||||
i = len(query)
|
||||
} else {
|
||||
a = append(a, tmp[:n+1])
|
||||
i += n
|
||||
}
|
||||
default:
|
||||
a = append(a, regexp.QuoteMeta(query[i:i+1]))
|
||||
}
|
||||
}
|
||||
s := strings.Join(a, "")
|
||||
if !strings.HasSuffix(s, quotedDelimiter) {
|
||||
s += quotedDelimiter + "?"
|
||||
}
|
||||
s = "^(?:" + s + ")$"
|
||||
re, err := regexp.Compile(s)
|
||||
rs := getRegexpStringForQuery(query, delimiter, false)
|
||||
re, err := regexp.Compile(rs)
|
||||
regexpCache[k] = &regexpCacheEntry{
|
||||
re: re,
|
||||
err: err,
|
||||
@@ -403,6 +355,63 @@ func getRegexpForQuery(query string, delimiter byte) (*regexp.Regexp, error) {
|
||||
return re, err
|
||||
}
|
||||
|
||||
func getRegexpStringForQuery(query string, delimiter byte, isSubquery bool) string {
	var a []string
	quotedDelimiter := regexp.QuoteMeta(string([]byte{delimiter}))
	tillNextDelimiter := "[^" + quotedDelimiter + "]*"
	j := 0
	for i := 0; i < len(query); i++ {
		switch query[i] {
		case '*':
			a = append(a, regexp.QuoteMeta(query[j:i]))
			a = append(a, tillNextDelimiter)
			j = i + 1
		case '{':
			if isSubquery {
				break
			}
			a = append(a, regexp.QuoteMeta(query[j:i]))
			tmp := query[i+1:]
			if n := strings.IndexByte(tmp, '}'); n < 0 {
				rs := getRegexpStringForQuery(query[i:], delimiter, true)
				a = append(a, rs)
				i = len(query)
			} else {
				a = append(a, "(?:")
				opts := strings.Split(tmp[:n], ",")
				for j, opt := range opts {
					opts[j] = getRegexpStringForQuery(opt, delimiter, true)
				}
				a = append(a, strings.Join(opts, "|"))
				a = append(a, ")")
				i += n + 1
			}
			j = i + 1
		case '[':
			a = append(a, regexp.QuoteMeta(query[j:i]))
			tmp := query[i:]
			if n := strings.IndexByte(tmp, ']'); n < 0 {
				a = append(a, regexp.QuoteMeta(query[i:]))
				i = len(query)
			} else {
				a = append(a, tmp[:n+1])
				i += n
			}
			j = i + 1
		}
	}
	if j < len(query) {
		// j equals len(query)+1 after an unclosed '{' or '[' has consumed the rest
		// of the query, so slicing query[j:] unguarded would panic in that case.
		a = append(a, regexp.QuoteMeta(query[j:]))
	}
	s := strings.Join(a, "")
	if isSubquery {
		return s
	}
	if !strings.HasSuffix(s, quotedDelimiter) {
		s += quotedDelimiter + "?"
	}
	s = "^(?:" + s + ")$"
	return s
}
|
||||
|
||||
type regexpCacheEntry struct {
|
||||
re *regexp.Regexp
|
||||
err error
|
||||
@@ -417,3 +426,10 @@ var regexpCache = make(map[regexpCacheKey]*regexpCacheEntry)
|
||||
var regexpCacheLock sync.Mutex
|
||||
|
||||
const maxRegexpCacheSize = 10000
|
||||
|
||||
func getContentType(jsonp string) string {
	if jsonp == "" {
		return "application/json; charset=utf-8"
	}
	return "text/javascript; charset=utf-8"
}
|
||||
@@ -28,6 +28,9 @@ func TestGetRegexpForQuery(t *testing.T) {
|
||||
f("foo_[ab]*", '_', `^(?:foo_[ab][^_]*_?)$`)
|
||||
f("foo_[ab]_", '_', `^(?:foo_[ab]_)$`)
|
||||
f("foo.[ab].", '.', `^(?:foo\.[ab]\.)$`)
|
||||
f("foo{b{ar*,ba*z[1-9]}", '.', `^(?:foo(?:b\{ar[^\.]*|ba[^\.]*z[1-9])\.?)$`)
|
||||
f("{foo*}", '.', `^(?:(?:foo[^\.]*)\.?)$`)
|
||||
f("{foo*,}", '.', `^(?:(?:foo[^\.]*|)\.?)$`)
|
||||
}
|
||||
|
||||
func TestSortPaths(t *testing.T) {
|
||||
@@ -72,4 +75,5 @@ func TestAddAutomaticVariants(t *testing.T) {
|
||||
f("foo,bar.baz", "_", "{foo,bar.baz}")
|
||||
f("foo,bar_baz*", "_", "{foo,bar}_baz*")
|
||||
f("foo.bar,baz,aa.bb,cc", ".", "foo.{bar,baz,aa}.{bb,cc}")
|
||||
f("foo.b*r,b[a-xz]z,aa.bb,cc", ".", "foo.{b*r,b[a-xz]z,aa}.{bb,cc}")
|
||||
}
|
||||
20
app/vmselect/graphite/tag_values_response.qtpl
Normal file
@@ -0,0 +1,20 @@
|
||||
{% stripspace %}

TagValuesResponse generates response for /tags/<tag_name> handler
See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
{% func TagValuesResponse(tagName string, tagValues []string) %}
{
	"tag":{%q= tagName %},
	"values":[
		{% for i, value := range tagValues %}
			{
				"count":1,
				"value":{%q= value %}
			}
			{% if i+1 < len(tagValues) %},{% endif %}
		{% endfor %}
	]
}
{% endfunc %}

{% endstripspace %}
|
||||
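For readers unfamiliar with quicktemplate, the JSON shape emitted by the template above can be reproduced with the standard library. This is only a sketch for comparison: the types `tagValueEntry` and `tagValuesResponse` and the function `buildTagValuesJSON` are hypothetical names that are not part of this change, and `encoding/json` may escape some characters differently from quicktemplate's `{%q= %}`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tagValueEntry and tagValuesResponse are hypothetical types mirroring the
// JSON layout written by the TagValuesResponse template.
type tagValueEntry struct {
	Count int    `json:"count"`
	Value string `json:"value"`
}

type tagValuesResponse struct {
	Tag    string          `json:"tag"`
	Values []tagValueEntry `json:"values"`
}

// buildTagValuesJSON returns the response body for the given tag values.
// The template hard-codes "count":1 for every value, so this sketch does too.
func buildTagValuesJSON(tagName string, tagValues []string) (string, error) {
	resp := tagValuesResponse{
		Tag: tagName,
		// A non-nil slice makes an empty result marshal as [] instead of null.
		Values: make([]tagValueEntry, 0, len(tagValues)),
	}
	for _, v := range tagValues {
		resp.Values = append(resp.Values, tagValueEntry{Count: 1, Value: v})
	}
	b, err := json.Marshal(resp)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := buildTagValuesJSON("env", []string{"dev", "prod"})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

The generated quicktemplate code streams the same structure directly to the writer without building intermediate structs, which is presumably why the template approach is used here.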
75
app/vmselect/graphite/tag_values_response.qtpl.go
Normal file
@@ -0,0 +1,75 @@
|
||||
// Code generated by qtc from "tag_values_response.qtpl". DO NOT EDIT.
|
||||
// See https://github.com/valyala/quicktemplate for details.
|
||||
|
||||
// Tags generates response for /tags/<tag_name> handlerSee https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
|
||||
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:5
|
||||
package graphite
|
||||
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:5
|
||||
import (
|
||||
qtio422016 "io"
|
||||
|
||||
qt422016 "github.com/valyala/quicktemplate"
|
||||
)
|
||||
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:5
|
||||
var (
|
||||
_ = qtio422016.Copy
|
||||
_ = qt422016.AcquireByteBuffer
|
||||
)
|
||||
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:5
|
||||
func StreamTagValuesResponse(qw422016 *qt422016.Writer, tagName string, tagValues []string) {
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:5
|
||||
qw422016.N().S(`{"tag":`)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:7
|
||||
qw422016.N().Q(tagName)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:7
|
||||
qw422016.N().S(`,"values":[`)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:9
|
||||
for i, value := range tagValues {
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:9
|
||||
qw422016.N().S(`{"count":1,"value":`)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:12
|
||||
qw422016.N().Q(value)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:12
|
||||
qw422016.N().S(`}`)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:14
|
||||
if i+1 < len(tagValues) {
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:14
|
||||
qw422016.N().S(`,`)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:14
|
||||
}
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:15
|
||||
}
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:15
|
||||
qw422016.N().S(`]}`)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
}
|
||||
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
func WriteTagValuesResponse(qq422016 qtio422016.Writer, tagName string, tagValues []string) {
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
qw422016 := qt422016.AcquireWriter(qq422016)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
StreamTagValuesResponse(qw422016, tagName, tagValues)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
qt422016.ReleaseWriter(qw422016)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
}
|
||||
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
func TagValuesResponse(tagName string, tagValues []string) string {
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
qb422016 := qt422016.AcquireByteBuffer()
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
WriteTagValuesResponse(qb422016, tagName, tagValues)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
qs422016 := string(qb422016.B)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
qt422016.ReleaseByteBuffer(qb422016)
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
return qs422016
|
||||
//line app/vmselect/graphite/tag_values_response.qtpl:18
|
||||
}
|
||||
504
app/vmselect/graphite/tags_api.go
Normal file
@@ -0,0 +1,504 @@
|
||||
package graphite
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"net/http"
|
||||
"regexp"
|
||||
"sort"
|
||||
"strconv"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/bufferedwriter"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
|
||||
graphiteparser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
)
|
||||
|
||||
// TagsDelSeriesHandler implements /tags/delSeries handler.
//
// See https://graphite.readthedocs.io/en/stable/tags.html#removing-series-from-the-tagdb
func TagsDelSeriesHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
	if err := r.ParseForm(); err != nil {
		return fmt.Errorf("cannot parse form values: %w", err)
	}
	paths := r.Form["path"]
	totalDeleted := 0
	var row graphiteparser.Row
	var tagsPool []graphiteparser.Tag
	ct := time.Now().UnixNano() / 1e6
	for _, path := range paths {
		var err error
		tagsPool, err = row.UnmarshalMetricAndTags(path, tagsPool[:0])
		if err != nil {
			return fmt.Errorf("cannot parse path=%q: %w", path, err)
		}
		tfs := make([]storage.TagFilter, 0, 1+len(row.Tags))
		tfs = append(tfs, storage.TagFilter{
			Key:   nil,
			Value: []byte(row.Metric),
		})
		for _, tag := range row.Tags {
			tfs = append(tfs, storage.TagFilter{
				Key:   []byte(tag.Key),
				Value: []byte(tag.Value),
			})
		}
		sq := storage.NewSearchQuery(0, ct, [][]storage.TagFilter{tfs})
		n, err := netstorage.DeleteSeries(sq)
		if err != nil {
			return fmt.Errorf("cannot delete series for %q: %w", sq, err)
		}
		totalDeleted += n
	}

	w.Header().Set("Content-Type", "application/json; charset=utf-8")
	if totalDeleted > 0 {
		fmt.Fprintf(w, "true")
	} else {
		fmt.Fprintf(w, "false")
	}
	return nil
}
|
||||
|
||||
// TagsTagSeriesHandler implements /tags/tagSeries handler.
|
||||
//
|
||||
// See https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb
|
||||
func TagsTagSeriesHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
|
||||
return registerMetrics(startTime, w, r, false)
|
||||
}
|
||||
|
||||
// TagsTagMultiSeriesHandler implements /tags/tagMultiSeries handler.
|
||||
//
|
||||
// See https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb
|
||||
func TagsTagMultiSeriesHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
|
||||
return registerMetrics(startTime, w, r, true)
|
||||
}
|
||||
|
||||
func registerMetrics(startTime time.Time, w http.ResponseWriter, r *http.Request, isJSONResponse bool) error {
|
||||
if err := r.ParseForm(); err != nil {
|
||||
return fmt.Errorf("cannot parse form values: %w", err)
|
||||
}
|
||||
paths := r.Form["path"]
|
||||
var row graphiteparser.Row
|
||||
var labels []prompb.Label
|
||||
var b []byte
|
||||
var tagsPool []graphiteparser.Tag
|
||||
mrs := make([]storage.MetricRow, len(paths))
|
||||
ct := time.Now().UnixNano() / 1e6
|
||||
canonicalPaths := make([]string, len(paths))
|
||||
for i, path := range paths {
|
||||
var err error
|
||||
tagsPool, err = row.UnmarshalMetricAndTags(path, tagsPool[:0])
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot parse path=%q: %w", path, err)
|
||||
}
|
||||
|
||||
// Construct canonical path according to https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb
|
||||
sort.Slice(row.Tags, func(i, j int) bool {
|
||||
return row.Tags[i].Key < row.Tags[j].Key
|
||||
})
|
||||
b = append(b[:0], row.Metric...)
|
||||
for _, tag := range row.Tags {
|
||||
b = append(b, ';')
|
||||
b = append(b, tag.Key...)
|
||||
b = append(b, '=')
|
||||
b = append(b, tag.Value...)
|
||||
}
|
||||
canonicalPaths[i] = string(b)
|
||||
|
||||
// Convert parsed metric and tags to labels.
|
||||
labels = append(labels[:0], prompb.Label{
|
||||
Name: []byte("__name__"),
|
||||
Value: []byte(row.Metric),
|
||||
})
|
||||
for _, tag := range row.Tags {
|
||||
labels = append(labels, prompb.Label{
|
||||
Name: []byte(tag.Key),
|
||||
Value: []byte(tag.Value),
|
||||
})
|
||||
}
|
||||
|
||||
// Put labels with the current timestamp to MetricRow
|
||||
mr := &mrs[i]
|
||||
mr.MetricNameRaw = storage.MarshalMetricNameRaw(mr.MetricNameRaw[:0], labels)
|
||||
mr.Timestamp = ct
|
||||
}
|
||||
if err := vmstorage.RegisterMetricNames(mrs); err != nil {
|
||||
return fmt.Errorf("cannot register paths: %w", err)
|
||||
}
|
||||
|
||||
// Return response
|
||||
contentType := "text/plain; charset=utf-8"
|
||||
if isJSONResponse {
|
||||
contentType = "application/json; charset=utf-8"
|
||||
}
|
||||
w.Header().Set("Content-Type", contentType)
|
||||
WriteTagsTagMultiSeriesResponse(w, canonicalPaths, isJSONResponse)
|
||||
if isJSONResponse {
|
||||
tagsTagMultiSeriesDuration.UpdateDuration(startTime)
|
||||
} else {
|
||||
tagsTagSeriesDuration.UpdateDuration(startTime)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
var (
|
||||
tagsTagSeriesDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/tags/tagSeries"}`)
|
||||
tagsTagMultiSeriesDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/tags/tagMultiSeries"}`)
|
||||
)
|
||||
|
||||
// TagsAutoCompleteValuesHandler implements /tags/autoComplete/values endpoint from Graphite Tags API.
|
||||
//
|
||||
// See https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support
|
||||
func TagsAutoCompleteValuesHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
|
||||
deadline := searchutils.GetDeadlineForQuery(r, startTime)
|
||||
if err := r.ParseForm(); err != nil {
|
||||
return fmt.Errorf("cannot parse form values: %w", err)
|
||||
}
|
||||
limit, err := getInt(r, "limit")
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if limit <= 0 {
|
||||
// Use limit=100 by default. See https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support
|
||||
limit = 100
|
||||
}
|
||||
tag := r.FormValue("tag")
|
||||
if len(tag) == 0 {
|
||||
return fmt.Errorf("missing `tag` query arg")
|
||||
}
|
||||
valuePrefix := r.FormValue("valuePrefix")
|
||||
exprs := r.Form["expr"]
|
||||
var tagValues []string
|
||||
if len(exprs) == 0 {
|
||||
// Fast path: there are no `expr` filters, so use netstorage.GetGraphiteTagValues.
|
||||
// Escape special chars in valuePrefix as Graphite does.
|
||||
// See https://github.com/graphite-project/graphite-web/blob/3ad279df5cb90b211953e39161df416e54a84948/webapp/graphite/tags/base.py#L228
|
||||
filter := regexp.QuoteMeta(valuePrefix)
|
||||
tagValues, err = netstorage.GetGraphiteTagValues(tag, filter, limit, deadline)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
} else {
|
||||
// Slow path: use netstorage.SearchMetricNames for applying `expr` filters.
|
||||
sq, err := getSearchQueryForExprs(exprs)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
mns, err := netstorage.SearchMetricNames(sq, deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot fetch metric names for %q: %w", sq, err)
|
||||
}
|
||||
m := make(map[string]struct{})
|
||||
if tag == "name" {
|
||||
tag = "__name__"
|
||||
}
|
||||
for _, mn := range mns {
|
||||
tagValue := mn.GetTagValue(tag)
|
||||
if len(tagValue) == 0 {
|
||||
continue
|
||||
}
|
||||
m[string(tagValue)] = struct{}{}
|
||||
}
|
||||
if len(valuePrefix) > 0 {
|
||||
for tagValue := range m {
|
||||
if !strings.HasPrefix(tagValue, valuePrefix) {
|
||||
delete(m, tagValue)
|
||||
}
|
||||
}
|
||||
}
|
||||
tagValues = make([]string, 0, len(m))
|
||||
for tagValue := range m {
|
||||
tagValues = append(tagValues, tagValue)
|
||||
}
|
||||
sort.Strings(tagValues)
|
||||
if limit > 0 && limit < len(tagValues) {
|
||||
tagValues = tagValues[:limit]
|
||||
}
|
||||
}
|
||||
|
||||
jsonp := r.FormValue("jsonp")
|
||||
contentType := getContentType(jsonp)
|
||||
w.Header().Set("Content-Type", contentType)
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteTagsAutoCompleteResponse(bw, tagValues, jsonp)
|
||||
if err := bw.Flush(); err != nil {
|
||||
return err
|
||||
}
|
||||
tagsAutoCompleteValuesDuration.UpdateDuration(startTime)
|
||||
return nil
|
||||
}
|
||||
|
||||
var tagsAutoCompleteValuesDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/tags/autoComplete/values"}`)
|
||||
|
||||
// TagsAutoCompleteTagsHandler implements /tags/autoComplete/tags endpoint from Graphite Tags API.
|
||||
//
|
||||
// See https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support
|
||||
func TagsAutoCompleteTagsHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
|
||||
deadline := searchutils.GetDeadlineForQuery(r, startTime)
|
||||
if err := r.ParseForm(); err != nil {
|
||||
return fmt.Errorf("cannot parse form values: %w", err)
|
||||
}
|
||||
limit, err := getInt(r, "limit")
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if limit <= 0 {
|
||||
// Use limit=100 by default. See https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support
|
||||
limit = 100
|
||||
}
|
||||
tagPrefix := r.FormValue("tagPrefix")
|
||||
exprs := r.Form["expr"]
|
||||
var labels []string
|
||||
if len(exprs) == 0 {
|
||||
// Fast path: there are no `expr` filters, so use netstorage.GetGraphiteTags.
|
||||
|
||||
// Escape special chars in tagPrefix as Graphite does.
|
||||
// See https://github.com/graphite-project/graphite-web/blob/3ad279df5cb90b211953e39161df416e54a84948/webapp/graphite/tags/base.py#L181
|
||||
filter := regexp.QuoteMeta(tagPrefix)
|
||||
labels, err = netstorage.GetGraphiteTags(filter, limit, deadline)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
} else {
|
||||
// Slow path: use netstorage.SearchMetricNames for applying `expr` filters.
|
||||
sq, err := getSearchQueryForExprs(exprs)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
mns, err := netstorage.SearchMetricNames(sq, deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot fetch metric names for %q: %w", sq, err)
|
||||
}
|
||||
m := make(map[string]struct{})
|
||||
for _, mn := range mns {
|
||||
m["name"] = struct{}{}
|
||||
for _, tag := range mn.Tags {
|
||||
m[string(tag.Key)] = struct{}{}
|
||||
}
|
||||
}
|
||||
if len(tagPrefix) > 0 {
|
||||
for label := range m {
|
||||
if !strings.HasPrefix(label, tagPrefix) {
|
||||
delete(m, label)
|
||||
}
|
||||
}
|
||||
}
|
||||
labels = make([]string, 0, len(m))
|
||||
for label := range m {
|
||||
labels = append(labels, label)
|
||||
}
|
||||
sort.Strings(labels)
|
||||
if limit > 0 && limit < len(labels) {
|
||||
labels = labels[:limit]
|
||||
}
|
||||
}
|
||||
|
||||
jsonp := r.FormValue("jsonp")
|
||||
contentType := getContentType(jsonp)
|
||||
w.Header().Set("Content-Type", contentType)
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteTagsAutoCompleteResponse(bw, labels, jsonp)
|
||||
if err := bw.Flush(); err != nil {
|
||||
return err
|
||||
}
|
||||
tagsAutoCompleteTagsDuration.UpdateDuration(startTime)
|
||||
return nil
|
||||
}
|
||||
|
||||
var tagsAutoCompleteTagsDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/tags/autoComplete/tags"}`)
|
||||
|
||||
// TagsFindSeriesHandler implements /tags/findSeries endpoint from Graphite Tags API.
|
||||
//
|
||||
// See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
|
||||
func TagsFindSeriesHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
|
||||
deadline := searchutils.GetDeadlineForQuery(r, startTime)
|
||||
if err := r.ParseForm(); err != nil {
|
||||
return fmt.Errorf("cannot parse form values: %w", err)
|
||||
}
|
||||
limit, err := getInt(r, "limit")
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
exprs := r.Form["expr"]
|
||||
if len(exprs) == 0 {
|
||||
return fmt.Errorf("expecting at least one `expr` query arg")
|
||||
}
|
||||
sq, err := getSearchQueryForExprs(exprs)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
mns, err := netstorage.SearchMetricNames(sq, deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot fetch metric names for %q: %w", sq, err)
|
||||
}
|
||||
paths := getCanonicalPaths(mns)
|
||||
if limit > 0 && limit < len(paths) {
|
||||
paths = paths[:limit]
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteTagsFindSeriesResponse(bw, paths)
|
||||
if err := bw.Flush(); err != nil {
|
||||
return err
|
||||
}
|
||||
tagsFindSeriesDuration.UpdateDuration(startTime)
|
||||
return nil
|
||||
}
|
||||
|
||||
func getCanonicalPaths(mns []storage.MetricName) []string {
	paths := make([]string, 0, len(mns))
	var b []byte
	var tags []storage.Tag
	for _, mn := range mns {
		b = append(b[:0], mn.MetricGroup...)
		tags = append(tags[:0], mn.Tags...)
		sort.Slice(tags, func(i, j int) bool {
			return string(tags[i].Key) < string(tags[j].Key)
		})
		for _, tag := range tags {
			b = append(b, ';')
			b = append(b, tag.Key...)
			b = append(b, '=')
			b = append(b, tag.Value...)
		}
		paths = append(paths, string(b))
	}
	sort.Strings(paths)
	return paths
}
|
||||
|
||||
var tagsFindSeriesDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/tags/findSeries"}`)
|
||||
|
||||
// TagValuesHandler implements /tags/<tag_name> endpoint from Graphite Tags API.
|
||||
//
|
||||
// See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
|
||||
func TagValuesHandler(startTime time.Time, tagName string, w http.ResponseWriter, r *http.Request) error {
|
||||
deadline := searchutils.GetDeadlineForQuery(r, startTime)
|
||||
if err := r.ParseForm(); err != nil {
|
||||
return fmt.Errorf("cannot parse form values: %w", err)
|
||||
}
|
||||
limit, err := getInt(r, "limit")
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
filter := r.FormValue("filter")
|
||||
tagValues, err := netstorage.GetGraphiteTagValues(tagName, filter, limit, deadline)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteTagValuesResponse(bw, tagName, tagValues)
|
||||
if err := bw.Flush(); err != nil {
|
||||
return err
|
||||
}
|
||||
tagValuesDuration.UpdateDuration(startTime)
|
||||
return nil
|
||||
}
|
||||
|
||||
var tagValuesDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/tags/<tag_name>"}`)
|
||||
|
||||
// TagsHandler implements /tags endpoint from Graphite Tags API.
|
||||
//
|
||||
// See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
|
||||
func TagsHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
|
||||
deadline := searchutils.GetDeadlineForQuery(r, startTime)
|
||||
if err := r.ParseForm(); err != nil {
|
||||
return fmt.Errorf("cannot parse form values: %w", err)
|
||||
}
|
||||
limit, err := getInt(r, "limit")
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
filter := r.FormValue("filter")
|
||||
labels, err := netstorage.GetGraphiteTags(filter, limit, deadline)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteTagsResponse(bw, labels)
|
||||
if err := bw.Flush(); err != nil {
|
||||
return err
|
||||
}
|
||||
tagsDuration.UpdateDuration(startTime)
|
||||
return nil
|
||||
}
|
||||
|
||||
var tagsDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/tags"}`)
|
||||
|
||||
func getInt(r *http.Request, argName string) (int, error) {
	argValue := r.FormValue(argName)
	if len(argValue) == 0 {
		return 0, nil
	}
	n, err := strconv.Atoi(argValue)
	if err != nil {
		return 0, fmt.Errorf("cannot parse %q=%q: %w", argName, argValue, err)
	}
	return n, nil
}

func getSearchQueryForExprs(exprs []string) (*storage.SearchQuery, error) {
	tfs, err := exprsToTagFilters(exprs)
	if err != nil {
		return nil, err
	}
	ct := time.Now().UnixNano() / 1e6
	sq := storage.NewSearchQuery(0, ct, [][]storage.TagFilter{tfs})
	return sq, nil
}

func exprsToTagFilters(exprs []string) ([]storage.TagFilter, error) {
	tfs := make([]storage.TagFilter, 0, len(exprs))
	for _, expr := range exprs {
		tf, err := parseFilterExpr(expr)
		if err != nil {
			return nil, fmt.Errorf("cannot parse `expr` query arg: %w", err)
		}
		tfs = append(tfs, *tf)
	}
	return tfs, nil
}

func parseFilterExpr(s string) (*storage.TagFilter, error) {
	n := strings.Index(s, "=")
	if n < 0 {
		return nil, fmt.Errorf("missing tag value in filter expression %q", s)
	}
	tagName := s[:n]
	tagValue := s[n+1:]
	isNegative := false
	if strings.HasSuffix(tagName, "!") {
		isNegative = true
		tagName = tagName[:len(tagName)-1]
	}
	if tagName == "name" {
		tagName = ""
	}
	isRegexp := false
	if strings.HasPrefix(tagValue, "~") {
		isRegexp = true
		tagValue = "^(?:" + tagValue[1:] + ").*"
	}
	return &storage.TagFilter{
		Key:        []byte(tagName),
		Value:      []byte(tagValue),
		IsNegative: isNegative,
		IsRegexp:   isRegexp,
	}, nil
}
|
||||
16
app/vmselect/graphite/tags_autocomplete_response.qtpl
Normal file
@@ -0,0 +1,16 @@
|
||||
{% stripspace %}

TagsAutoCompleteResponse generates responses for /tags/autoComplete/{tags,values} handlers in Graphite Tags API.
See https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support
{% func TagsAutoCompleteResponse(ss []string, jsonp string) %}
	{% if jsonp != "" %}{%s= jsonp %}({% endif %}
	[
		{% for i, s := range ss %}
			{%q= s %}
			{% if i+1 < len(ss) %},{% endif %}
		{% endfor %}
	]
	{% if jsonp != "" %}){% endif %}
{% endfunc %}

{% endstripspace %}
|
||||
81
app/vmselect/graphite/tags_autocomplete_response.qtpl.go
Normal file
@@ -0,0 +1,81 @@
|
||||
// Code generated by qtc from "tags_autocomplete_response.qtpl". DO NOT EDIT.
|
||||
// See https://github.com/valyala/quicktemplate for details.
|
||||
|
||||
// TagsAutoCompleteResponse generates responses for /tags/autoComplete/{tags,values} handlers in Graphite Tags API.See https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support
|
||||
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:5
|
||||
package graphite
|
||||
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:5
|
||||
import (
|
||||
qtio422016 "io"
|
||||
|
||||
qt422016 "github.com/valyala/quicktemplate"
|
||||
)
|
||||
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:5
|
||||
var (
|
||||
_ = qtio422016.Copy
|
||||
_ = qt422016.AcquireByteBuffer
|
||||
)
|
||||
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:5
|
||||
func StreamTagsAutoCompleteResponse(qw422016 *qt422016.Writer, ss []string, jsonp string) {
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:6
|
||||
if jsonp != "" {
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:6
|
||||
qw422016.N().S(jsonp)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:6
|
||||
qw422016.N().S(`(`)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:6
|
||||
}
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:6
|
||||
qw422016.N().S(`[`)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:8
|
||||
for i, s := range ss {
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:9
|
||||
qw422016.N().Q(s)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:10
|
||||
if i+1 < len(ss) {
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:10
|
||||
qw422016.N().S(`,`)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:10
|
||||
}
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:11
|
||||
}
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:11
|
||||
qw422016.N().S(`]`)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:13
|
||||
if jsonp != "" {
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:13
|
||||
qw422016.N().S(`)`)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:13
|
||||
}
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
}
|
||||
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
func WriteTagsAutoCompleteResponse(qq422016 qtio422016.Writer, ss []string, jsonp string) {
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
qw422016 := qt422016.AcquireWriter(qq422016)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
StreamTagsAutoCompleteResponse(qw422016, ss, jsonp)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
qt422016.ReleaseWriter(qw422016)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
}
|
||||
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
func TagsAutoCompleteResponse(ss []string, jsonp string) string {
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
qb422016 := qt422016.AcquireByteBuffer()
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
WriteTagsAutoCompleteResponse(qb422016, ss, jsonp)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
qs422016 := string(qb422016.B)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
qt422016.ReleaseByteBuffer(qb422016)
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
return qs422016
|
||||
//line app/vmselect/graphite/tags_autocomplete_response.qtpl:14
|
||||
}
|
||||
12
app/vmselect/graphite/tags_find_series_response.qtpl
Normal file
@@ -0,0 +1,12 @@
|
||||
{% stripspace %}

{% func TagsFindSeriesResponse(paths []string) %}
[
	{% for i, path := range paths %}
		{%q= path %}
		{% if i+1 < len(paths) %},{% endif %}
	{% endfor %}
]
{% endfunc %}

{% endstripspace %}
|
||||
65
app/vmselect/graphite/tags_find_series_response.qtpl.go
Normal file
@@ -0,0 +1,65 @@
|
||||
// Code generated by qtc from "tags_find_series_response.qtpl". DO NOT EDIT.
|
||||
// See https://github.com/valyala/quicktemplate for details.
|
||||
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:3
|
||||
package graphite
|
||||
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:3
|
||||
import (
|
||||
qtio422016 "io"
|
||||
|
||||
qt422016 "github.com/valyala/quicktemplate"
|
||||
)
|
||||
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:3
|
||||
var (
|
||||
_ = qtio422016.Copy
|
||||
_ = qt422016.AcquireByteBuffer
|
||||
)
|
||||
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:3
|
||||
func StreamTagsFindSeriesResponse(qw422016 *qt422016.Writer, paths []string) {
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:3
|
||||
qw422016.N().S(`[`)
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:5
|
||||
for i, path := range paths {
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:6
|
||||
qw422016.N().Q(path)
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:7
|
||||
if i+1 < len(paths) {
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:7
|
||||
qw422016.N().S(`,`)
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:7
|
||||
}
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:8
|
||||
}
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:8
|
||||
qw422016.N().S(`]`)
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
}
|
||||
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
func WriteTagsFindSeriesResponse(qq422016 qtio422016.Writer, paths []string) {
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
qw422016 := qt422016.AcquireWriter(qq422016)
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
StreamTagsFindSeriesResponse(qw422016, paths)
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
qt422016.ReleaseWriter(qw422016)
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
}
|
||||
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
func TagsFindSeriesResponse(paths []string) string {
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
qb422016 := qt422016.AcquireByteBuffer()
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
WriteTagsFindSeriesResponse(qb422016, paths)
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
qs422016 := string(qb422016.B)
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
qt422016.ReleaseByteBuffer(qb422016)
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
return qs422016
|
||||
//line app/vmselect/graphite/tags_find_series_response.qtpl:10
|
||||
}
|
||||
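The generated file above exposes one template in three forms: `StreamTagsFindSeriesResponse` writes to a quicktemplate writer, `WriteTagsFindSeriesResponse` wraps an `io.Writer`, and `TagsFindSeriesResponse` returns a string. The sketch below mirrors the last two forms with the standard library only. The names `writeFindSeries` and `findSeriesString` are hypothetical, and `encoding/json` stands in for quicktemplate's `Q` quoting, which may escape some characters differently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// writeFindSeries writes paths to w as a JSON array of quoted strings,
// placing a comma between elements but not after the last one.
func writeFindSeries(w io.Writer, paths []string) error {
	if _, err := io.WriteString(w, "["); err != nil {
		return err
	}
	for i, path := range paths {
		quoted, err := json.Marshal(path)
		if err != nil {
			return err
		}
		if _, err := w.Write(quoted); err != nil {
			return err
		}
		if i+1 < len(paths) {
			if _, err := io.WriteString(w, ","); err != nil {
				return err
			}
		}
	}
	_, err := io.WriteString(w, "]")
	return err
}

// findSeriesString renders the same output into a string.
func findSeriesString(paths []string) string {
	var buf bytes.Buffer
	// Writes to a bytes.Buffer do not fail, so the error is ignored here.
	_ = writeFindSeries(&buf, paths)
	return buf.String()
}

func main() {
	fmt.Println(findSeriesString([]string{"disk.used;host=a", "disk.used;host=b"}))
}
```

The writer form avoids building an intermediate string when the response goes straight to an HTTP response body; the string form is convenient in tests.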
16
app/vmselect/graphite/tags_response.qtpl
Normal file
@@ -0,0 +1,16 @@
{% stripspace %}

Tags generates response for /tags handler
See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
{% func TagsResponse(tags []string) %}
[
	{% for i, tag := range tags %}
		{
			"tag":{%q= tag %}
		}
		{% if i+1 < len(tags) %},{% endif %}
	{% endfor %}
]
{% endfunc %}

{% endstripspace %}
71
app/vmselect/graphite/tags_response.qtpl.go
Normal file
@@ -0,0 +1,71 @@
|
||||
// Code generated by qtc from "tags_response.qtpl". DO NOT EDIT.
|
||||
// See https://github.com/valyala/quicktemplate for details.
|
||||
|
||||
// Tags generates response for /tags handlerSee https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
|
||||
|
||||
//line app/vmselect/graphite/tags_response.qtpl:5
|
||||
package graphite
|
||||
|
||||
//line app/vmselect/graphite/tags_response.qtpl:5
|
||||
import (
|
||||
qtio422016 "io"
|
||||
|
||||
qt422016 "github.com/valyala/quicktemplate"
|
||||
)
|
||||
|
||||
//line app/vmselect/graphite/tags_response.qtpl:5
|
||||
var (
|
||||
_ = qtio422016.Copy
|
||||
_ = qt422016.AcquireByteBuffer
|
||||
)
|
||||
|
||||
//line app/vmselect/graphite/tags_response.qtpl:5
|
||||
func StreamTagsResponse(qw422016 *qt422016.Writer, tags []string) {
|
||||
//line app/vmselect/graphite/tags_response.qtpl:5
|
||||
qw422016.N().S(`[`)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:7
|
||||
for i, tag := range tags {
|
||||
//line app/vmselect/graphite/tags_response.qtpl:7
|
||||
qw422016.N().S(`{"tag":`)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:9
|
||||
qw422016.N().Q(tag)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:9
|
||||
qw422016.N().S(`}`)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:11
|
||||
if i+1 < len(tags) {
|
||||
//line app/vmselect/graphite/tags_response.qtpl:11
|
||||
qw422016.N().S(`,`)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:11
|
||||
}
|
||||
//line app/vmselect/graphite/tags_response.qtpl:12
|
||||
}
|
||||
//line app/vmselect/graphite/tags_response.qtpl:12
|
||||
qw422016.N().S(`]`)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
}
|
||||
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
func WriteTagsResponse(qq422016 qtio422016.Writer, tags []string) {
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
qw422016 := qt422016.AcquireWriter(qq422016)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
StreamTagsResponse(qw422016, tags)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
qt422016.ReleaseWriter(qw422016)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
}
|
||||
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
func TagsResponse(tags []string) string {
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
qb422016 := qt422016.AcquireByteBuffer()
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
WriteTagsResponse(qb422016, tags)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
qs422016 := string(qb422016.B)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
qt422016.ReleaseByteBuffer(qb422016)
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
return qs422016
|
||||
//line app/vmselect/graphite/tags_response.qtpl:14
|
||||
}
|
||||
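Judging by the generated code above, the `/tags` response is a JSON array of objects with a single `tag` key. The following sketch shows that shape and how a client might decode it. `tagsJSON`, `decodeTags` and `tagEntry` are hypothetical names, and `encoding/json` is used as an approximation of quicktemplate's quoting.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// tagEntry mirrors one element of the array emitted by the template.
type tagEntry struct {
	Tag string `json:"tag"`
}

// tagsJSON approximates the template output, e.g. [{"tag":"a"},{"tag":"b"}].
func tagsJSON(tags []string) string {
	var sb strings.Builder
	sb.WriteString("[")
	for i, tag := range tags {
		quoted, _ := json.Marshal(tag) // marshaling a plain string does not fail
		sb.WriteString(`{"tag":`)
		sb.Write(quoted)
		sb.WriteString("}")
		if i+1 < len(tags) {
			sb.WriteString(",")
		}
	}
	sb.WriteString("]")
	return sb.String()
}

// decodeTags parses a response body of that shape back into tag names.
func decodeTags(body string) ([]string, error) {
	var entries []tagEntry
	if err := json.Unmarshal([]byte(body), &entries); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Tag)
	}
	return names, nil
}

func main() {
	body := tagsJSON([]string{"host", "name"})
	fmt.Println(body)
	names, err := decodeTags(body)
	fmt.Println(names, err)
}
```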
14
app/vmselect/graphite/tags_tag_multi_series_response.qtpl
Normal file
@@ -0,0 +1,14 @@
{% stripspace %}

TagsTagMultiSeriesResponse generates response for /tags/tagMultiSeries .
See https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb
{% func TagsTagMultiSeriesResponse(canonicalPaths []string, isJSONResponse bool) %}
{% if isJSONResponse %}[{% endif %}
{% for i, path := range canonicalPaths %}
	{%q= path %}
	{% if i+1 < len(canonicalPaths) %},{% endif %}
{% endfor %}
{% if isJSONResponse %}]{% endif %}
{% endfunc %}

{% endstripspace %}
75
app/vmselect/graphite/tags_tag_multi_series_response.qtpl.go
Normal file
@@ -0,0 +1,75 @@
|
||||
// Code generated by qtc from "tags_tag_multi_series_response.qtpl". DO NOT EDIT.
|
||||
// See https://github.com/valyala/quicktemplate for details.
|
||||
|
||||
// TagsTagMultiSeriesResponse generates response for /tags/tagMultiSeries .See https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb
|
||||
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:5
|
||||
package graphite
|
||||
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:5
|
||||
import (
|
||||
qtio422016 "io"
|
||||
|
||||
qt422016 "github.com/valyala/quicktemplate"
|
||||
)
|
||||
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:5
|
||||
var (
|
||||
_ = qtio422016.Copy
|
||||
_ = qt422016.AcquireByteBuffer
|
||||
)
|
||||
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:5
|
||||
func StreamTagsTagMultiSeriesResponse(qw422016 *qt422016.Writer, canonicalPaths []string, isJSONResponse bool) {
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:6
|
||||
if isJSONResponse {
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:6
|
||||
qw422016.N().S(`[`)
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:6
|
||||
}
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:7
|
||||
for i, path := range canonicalPaths {
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:8
|
||||
qw422016.N().Q(path)
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:9
|
||||
if i+1 < len(canonicalPaths) {
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:9
|
||||
qw422016.N().S(`,`)
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:9
|
||||
}
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:10
|
||||
}
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:11
|
||||
if isJSONResponse {
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:11
|
||||
qw422016.N().S(`]`)
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:11
|
||||
}
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
}
|
||||
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
func WriteTagsTagMultiSeriesResponse(qq422016 qtio422016.Writer, canonicalPaths []string, isJSONResponse bool) {
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
qw422016 := qt422016.AcquireWriter(qq422016)
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
StreamTagsTagMultiSeriesResponse(qw422016, canonicalPaths, isJSONResponse)
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
qt422016.ReleaseWriter(qw422016)
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
}
|
||||
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
func TagsTagMultiSeriesResponse(canonicalPaths []string, isJSONResponse bool) string {
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
qb422016 := qt422016.AcquireByteBuffer()
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
WriteTagsTagMultiSeriesResponse(qb422016, canonicalPaths, isJSONResponse)
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
qs422016 := string(qb422016.B)
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
qt422016.ReleaseByteBuffer(qb422016)
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
return qs422016
|
||||
//line app/vmselect/graphite/tags_tag_multi_series_response.qtpl:12
|
||||
}
|
||||
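The `isJSONResponse` flag in the template above only controls whether the quoted paths are wrapped in square brackets. A small stdlib sketch of that behavior follows; `multiSeriesBody` is a hypothetical name and `encoding/json` approximates quicktemplate's quoting.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// multiSeriesBody joins the quoted canonical paths with commas and wraps
// them in [ ] only when isJSONResponse is set.
func multiSeriesBody(canonicalPaths []string, isJSONResponse bool) string {
	var sb strings.Builder
	if isJSONResponse {
		sb.WriteString("[")
	}
	for i, path := range canonicalPaths {
		quoted, _ := json.Marshal(path) // marshaling a plain string does not fail
		sb.Write(quoted)
		if i+1 < len(canonicalPaths) {
			sb.WriteString(",")
		}
	}
	if isJSONResponse {
		sb.WriteString("]")
	}
	return sb.String()
}

func main() {
	paths := []string{"cpu.load;host=a"}
	fmt.Println(multiSeriesBody(paths, true))
	fmt.Println(multiSeriesBody(paths, false))
}
```

With a single path and `isJSONResponse` set to false, the body is a bare JSON string, which is still valid JSON. With several paths and no brackets it is not, so callers presumably pass false only for single-path responses.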
@@ -5,15 +5,17 @@ import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"runtime"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/graphite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/prometheus"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timerpool"
|
||||
@@ -21,7 +23,7 @@ import (
|
||||
)
|
||||
|
||||
var (
|
||||
deleteAuthKey = flag.String("deleteAuthKey", "", "authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series")
|
||||
deleteAuthKey = flag.String("deleteAuthKey", "", "authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries")
|
||||
maxConcurrentRequests = flag.Int("search.maxConcurrentRequests", getDefaultMaxConcurrentRequests(), "The maximum number of concurrent search requests. "+
|
||||
"It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration")
|
||||
maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the request waits for execution when -search.maxConcurrentRequests "+
|
||||
@@ -30,7 +32,7 @@ var (
|
||||
)
|
||||
|
||||
func getDefaultMaxConcurrentRequests() int {
|
||||
n := runtime.GOMAXPROCS(-1)
|
||||
n := cgroup.AvailableCPUs()
|
||||
if n <= 4 {
|
||||
n *= 2
|
||||
}
|
||||
@@ -45,6 +47,9 @@ func getDefaultMaxConcurrentRequests() int {
|
||||
|
||||
// Init initializes vmselect
|
||||
func Init() {
|
||||
tmpDirPath := *vmstorage.DataPath + "/tmp"
|
||||
fs.RemoveDirContents(tmpDirPath)
|
||||
netstorage.InitTmpBlocksDir(tmpDirPath)
|
||||
promql.InitRollupResultCache(*vmstorage.DataPath + "/cache/rollupResult")
|
||||
|
||||
concurrencyCh = make(chan struct{}, *maxConcurrentRequests)
|
||||
@@ -127,6 +132,16 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
return true
|
||||
}
|
||||
}
|
||||
if strings.HasPrefix(path, "/tags/") && !isGraphiteTagsPath(path) {
|
||||
tagName := r.URL.Path[len("/tags/"):]
|
||||
graphiteTagValuesRequests.Inc()
|
||||
if err := graphite.TagValuesHandler(startTime, tagName, w, r); err != nil {
|
||||
graphiteTagValuesErrors.Inc()
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
switch path {
|
||||
case "/api/v1/query":
|
||||
@@ -195,6 +210,14 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
statusActiveQueriesRequests.Inc()
|
||||
promql.WriteActiveQueries(w)
|
||||
return true
|
||||
case "/api/v1/status/top_queries":
|
||||
topQueriesRequests.Inc()
|
||||
if err := prometheus.QueryStatsHandler(startTime, w, r); err != nil {
|
||||
topQueriesErrors.Inc()
|
||||
sendPrometheusError(w, r, fmt.Errorf("cannot query status endpoint: %w", err))
|
||||
return true
|
||||
}
|
||||
return true
|
||||
case "/api/v1/export":
|
||||
exportRequests.Inc()
|
||||
if err := prometheus.ExportHandler(startTime, w, r); err != nil {
|
||||
@@ -254,22 +277,85 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
return true
|
||||
}
|
||||
return true
|
||||
case "/tags/tagSeries":
|
||||
graphiteTagsTagSeriesRequests.Inc()
|
||||
if err := graphite.TagsTagSeriesHandler(startTime, w, r); err != nil {
|
||||
graphiteTagsTagSeriesErrors.Inc()
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
return true
|
||||
case "/tags/tagMultiSeries":
|
||||
graphiteTagsTagMultiSeriesRequests.Inc()
|
||||
if err := graphite.TagsTagMultiSeriesHandler(startTime, w, r); err != nil {
|
||||
graphiteTagsTagMultiSeriesErrors.Inc()
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
return true
|
||||
case "/tags":
|
||||
graphiteTagsRequests.Inc()
|
||||
if err := graphite.TagsHandler(startTime, w, r); err != nil {
|
||||
graphiteTagsErrors.Inc()
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
return true
|
||||
case "/tags/findSeries":
|
||||
graphiteTagsFindSeriesRequests.Inc()
|
||||
if err := graphite.TagsFindSeriesHandler(startTime, w, r); err != nil {
|
||||
graphiteTagsFindSeriesErrors.Inc()
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
return true
|
||||
case "/tags/autoComplete/tags":
|
||||
graphiteTagsAutoCompleteTagsRequests.Inc()
|
||||
httpserver.EnableCORS(w, r)
|
||||
if err := graphite.TagsAutoCompleteTagsHandler(startTime, w, r); err != nil {
|
||||
graphiteTagsAutoCompleteTagsErrors.Inc()
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
return true
|
||||
case "/tags/autoComplete/values":
|
||||
graphiteTagsAutoCompleteValuesRequests.Inc()
|
||||
httpserver.EnableCORS(w, r)
|
||||
if err := graphite.TagsAutoCompleteValuesHandler(startTime, w, r); err != nil {
|
||||
graphiteTagsAutoCompleteValuesErrors.Inc()
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
return true
|
||||
case "/tags/delSeries":
|
||||
graphiteTagsDelSeriesRequests.Inc()
|
||||
authKey := r.FormValue("authKey")
|
||||
if authKey != *deleteAuthKey {
|
||||
httpserver.Errorf(w, r, "invalid authKey %q. It must match the value from -deleteAuthKey command line flag", authKey)
|
||||
return true
|
||||
}
|
||||
if err := graphite.TagsDelSeriesHandler(startTime, w, r); err != nil {
|
||||
graphiteTagsDelSeriesErrors.Inc()
|
||||
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
|
||||
return true
|
||||
}
|
||||
return true
|
||||
case "/api/v1/rules":
|
||||
// Return dumb placeholder
|
||||
rulesRequests.Inc()
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
fmt.Fprintf(w, "%s", `{"status":"success","data":{"groups":[]}}`)
|
||||
return true
|
||||
case "/api/v1/alerts":
|
||||
// Return dumb placeholder
|
||||
alertsRequests.Inc()
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
fmt.Fprintf(w, "%s", `{"status":"success","data":{"alerts":[]}}`)
|
||||
return true
|
||||
case "/api/v1/metadata":
|
||||
// Return dumb placeholder
|
||||
metadataRequests.Inc()
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
fmt.Fprintf(w, "%s", `{"status":"success","data":{}}`)
|
||||
return true
|
||||
case "/api/v1/admin/tsdb/delete_series":
|
||||
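The `/tags/delSeries` case in the hunk above compares the `authKey` form value with the `-deleteAuthKey` flag before calling the handler. The sketch below reproduces only that comparison with `net/http/httptest`. `delSeriesHandler` and `statusFor` are hypothetical, the status code and response body are chosen for illustration, and the real `httpserver.Errorf` behavior is not shown in this diff.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// deleteAuthKey stands in for the value of the -deleteAuthKey flag.
var deleteAuthKey = "secret"

// delSeriesHandler is a hypothetical stand-in for the /tags/delSeries case:
// it rejects requests whose authKey form value does not match deleteAuthKey.
func delSeriesHandler(w http.ResponseWriter, r *http.Request) {
	if authKey := r.FormValue("authKey"); authKey != deleteAuthKey {
		// The status code is an assumption made for this sketch.
		http.Error(w, fmt.Sprintf("invalid authKey %q", authKey), http.StatusUnauthorized)
		return
	}
	// The real handler's response body is not part of this sketch.
	fmt.Fprint(w, "ok")
}

// statusFor sends one request to the handler and returns the status code.
func statusFor(url string) int {
	rec := httptest.NewRecorder()
	delSeriesHandler(rec, httptest.NewRequest(http.MethodPost, url, nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/tags/delSeries?authKey=wrong"))
	fmt.Println(statusFor("/tags/delSeries?authKey=secret"))
}
```

Because the flag defaults to an empty string in the diff, a request without `authKey` passes the comparison unless the flag is set.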
@@ -291,10 +377,22 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
}
|
||||
}
|
||||
|
||||
func isGraphiteTagsPath(path string) bool {
|
||||
switch path {
|
||||
// See https://graphite.readthedocs.io/en/stable/tags.html for a list of Graphite Tags API paths.
|
||||
// Do not include `/tags/<tag_name>` here, since this will fool the caller.
|
||||
case "/tags/tagSeries", "/tags/tagMultiSeries", "/tags/findSeries",
|
||||
"/tags/autoComplete/tags", "/tags/autoComplete/values", "/tags/delSeries":
|
||||
return true
|
||||
default:
|
||||
return false
|
||||
}
|
||||
}
|
||||
|
||||
func sendPrometheusError(w http.ResponseWriter, r *http.Request, err error) {
|
||||
logger.Warnf("error in %q: %s", r.RequestURI, err)
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
statusCode := http.StatusUnprocessableEntity
|
||||
var esc *httpserver.ErrorWithStatusCode
|
||||
if errors.As(err, &esc) {
|
||||
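The `isGraphiteTagsPath` helper in the hunk above lets `RequestHandler` treat any other `/tags/<tag_name>` path as a tag-values lookup, while the fixed Graphite Tags API paths fall through to the exact-match `switch`. A self-contained sketch of that decision follows; `classify` and its return values are hypothetical, and `isGraphiteTagsPath` is copied from the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// isGraphiteTagsPath is copied from the hunk above.
func isGraphiteTagsPath(path string) bool {
	switch path {
	case "/tags/tagSeries", "/tags/tagMultiSeries", "/tags/findSeries",
		"/tags/autoComplete/tags", "/tags/autoComplete/values", "/tags/delSeries":
		return true
	default:
		return false
	}
}

// classify is a hypothetical helper that reports how a path would be routed:
// as a tag-values lookup (with the extracted tag name), or as a path left
// for the exact-match switch.
func classify(path string) (kind, tagName string) {
	if strings.HasPrefix(path, "/tags/") && !isGraphiteTagsPath(path) {
		return "tagValues", path[len("/tags/"):]
	}
	return "switch", ""
}

func main() {
	for _, p := range []string{"/tags", "/tags/host", "/tags/findSeries", "/tags/autoComplete/tags"} {
		kind, tag := classify(p)
		fmt.Printf("%-26s kind=%s tag=%q\n", p, kind, tag)
	}
}
```

Note that `/tags` itself has no trailing slash, so it never matches the prefix check and is handled by its own `case`.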
@@ -331,6 +429,9 @@ var (
|
||||
|
||||
statusActiveQueriesRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/status/active_queries"}`)
|
||||
|
||||
topQueriesRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/status/top_queries"}`)
|
||||
topQueriesErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/status/top_queries"}`)
|
||||
|
||||
deleteRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/admin/tsdb/delete_series"}`)
|
||||
deleteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/admin/tsdb/delete_series"}`)
|
||||
|
||||
@@ -355,6 +456,30 @@ var (
|
||||
graphiteMetricsIndexRequests = metrics.NewCounter(`vm_http_requests_total{path="/metrics/index.json"}`)
|
||||
graphiteMetricsIndexErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/metrics/index.json"}`)
|
||||
|
||||
graphiteTagsTagSeriesRequests = metrics.NewCounter(`vm_http_requests_total{path="/tags/tagSeries"}`)
|
||||
graphiteTagsTagSeriesErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/tags/tagSeries"}`)
|
||||
|
||||
graphiteTagsTagMultiSeriesRequests = metrics.NewCounter(`vm_http_requests_total{path="/tags/tagMultiSeries"}`)
|
||||
graphiteTagsTagMultiSeriesErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/tags/tagMultiSeries"}`)
|
||||
|
||||
graphiteTagsRequests = metrics.NewCounter(`vm_http_requests_total{path="/tags"}`)
|
||||
graphiteTagsErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/tags"}`)
|
||||
|
||||
graphiteTagValuesRequests = metrics.NewCounter(`vm_http_requests_total{path="/tags/<tag_name>"}`)
|
||||
graphiteTagValuesErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/tags/<tag_name>"}`)
|
||||
|
||||
graphiteTagsFindSeriesRequests = metrics.NewCounter(`vm_http_requests_total{path="/tags/findSeries"}`)
|
||||
graphiteTagsFindSeriesErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/tags/findSeries"}`)
|
||||
|
||||
graphiteTagsAutoCompleteTagsRequests = metrics.NewCounter(`vm_http_requests_total{path="/tags/autoComplete/tags"}`)
|
||||
graphiteTagsAutoCompleteTagsErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/tags/autoComplete/tags"}`)
|
||||
|
||||
graphiteTagsAutoCompleteValuesRequests = metrics.NewCounter(`vm_http_requests_total{path="/tags/autoComplete/values"}`)
|
||||
graphiteTagsAutoCompleteValuesErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/tags/autoComplete/values"}`)
|
||||
|
||||
graphiteTagsDelSeriesRequests = metrics.NewCounter(`vm_http_requests_total{path="/tags/delSeries"}`)
|
||||
graphiteTagsDelSeriesErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/tags/delSeries"}`)
|
||||
|
||||
rulesRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/rules"}`)
|
||||
alertsRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/alerts"}`)
|
||||
metadataRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/metadata"}`)
|
||||
|
||||
@@ -5,7 +5,7 @@ import (
|
||||
"errors"
|
||||
"flag"
|
||||
"fmt"
|
||||
"runtime"
|
||||
"regexp"
|
||||
"sort"
|
||||
"sync"
|
||||
"sync/atomic"
|
||||
@@ -13,6 +13,7 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
|
||||
@@ -57,6 +58,7 @@ type Results struct {
|
||||
|
||||
packedTimeseries []packedTimeseries
|
||||
sr *storage.Search
|
||||
tbf *tmpBlocksFile
|
||||
}
|
||||
|
||||
// Len returns the number of results in rss.
|
||||
@@ -72,6 +74,8 @@ func (rss *Results) Cancel() {
|
||||
func (rss *Results) mustClose() {
|
||||
putStorageSearch(rss.sr)
|
||||
rss.sr = nil
|
||||
putTmpBlocksFile(rss.tbf)
|
||||
rss.tbf = nil
|
||||
}
|
||||
|
||||
var timeseriesWorkCh = make(chan *timeseriesWork, gomaxprocs*16)
|
||||
@@ -105,7 +109,7 @@ func timeseriesWorker(workerID uint) {
|
||||
tsw.doneCh <- nil
|
||||
continue
|
||||
}
|
||||
if err := tsw.pts.Unpack(&rs, rss.tr, rss.fetchData); err != nil {
|
||||
if err := tsw.pts.Unpack(&rs, rss.tbf, rss.tr, rss.fetchData); err != nil {
|
||||
tsw.doneCh <- fmt.Errorf("error during time series unpacking: %w", err)
|
||||
continue
|
||||
}
|
||||
@@ -175,31 +179,33 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint) error) error {
|
||||
var perQueryRowsProcessed = metrics.NewHistogram(`vm_per_query_rows_processed_count`)
|
||||
var perQuerySeriesProcessed = metrics.NewHistogram(`vm_per_query_series_processed_count`)
|
||||
|
||||
var gomaxprocs = runtime.GOMAXPROCS(-1)
|
||||
var gomaxprocs = cgroup.AvailableCPUs()
|
||||
|
||||
type packedTimeseries struct {
|
||||
metricName string
|
||||
brs []storage.BlockRef
|
||||
brs []blockRef
|
||||
}
|
||||
|
||||
var unpackWorkCh = make(chan *unpackWork, gomaxprocs*128)
|
||||
|
||||
type unpackWorkItem struct {
|
||||
br storage.BlockRef
|
||||
br blockRef
|
||||
tr storage.TimeRange
|
||||
}
|
||||
|
||||
type unpackWork struct {
|
||||
tbf *tmpBlocksFile
|
||||
ws []unpackWorkItem
|
||||
sbs []*sortBlock
|
||||
doneCh chan error
|
||||
}
|
||||
|
||||
func (upw *unpackWork) reset() {
|
||||
upw.tbf = nil
|
||||
ws := upw.ws
|
||||
for i := range ws {
|
||||
w := &ws[i]
|
||||
w.br = storage.BlockRef{}
|
||||
w.br = blockRef{}
|
||||
w.tr = storage.TimeRange{}
|
||||
}
|
||||
upw.ws = upw.ws[:0]
|
||||
@@ -216,7 +222,7 @@ func (upw *unpackWork) reset() {
|
||||
func (upw *unpackWork) unpack(tmpBlock *storage.Block) {
|
||||
for _, w := range upw.ws {
|
||||
sb := getSortBlock()
|
||||
if err := sb.unpackFrom(tmpBlock, w.br, w.tr); err != nil {
|
||||
if err := sb.unpackFrom(tmpBlock, upw.tbf, w.br, w.tr); err != nil {
|
||||
putSortBlock(sb)
|
||||
upw.doneCh <- fmt.Errorf("cannot unpack block: %w", err)
|
||||
return
|
||||
@@ -259,10 +265,10 @@ func unpackWorker() {
|
||||
// unpackBatchSize is the maximum number of blocks that may be unpacked at once by a single goroutine.
|
||||
//
|
||||
// This batch is needed in order to reduce contention for unpackWorkCh on multi-CPU systems.
|
||||
var unpackBatchSize = 8 * runtime.GOMAXPROCS(-1)
|
||||
var unpackBatchSize = 8 * cgroup.AvailableCPUs()
|
||||
|
||||
// Unpack unpacks pts to dst.
|
||||
func (pts *packedTimeseries) Unpack(dst *Result, tr storage.TimeRange, fetchData bool) error {
|
||||
func (pts *packedTimeseries) Unpack(dst *Result, tbf *tmpBlocksFile, tr storage.TimeRange, fetchData bool) error {
|
||||
dst.reset()
|
||||
if err := dst.MetricName.Unmarshal(bytesutil.ToUnsafeBytes(pts.metricName)); err != nil {
|
||||
return fmt.Errorf("cannot unmarshal metricName %q: %w", pts.metricName, err)
|
||||
@@ -276,11 +282,13 @@ func (pts *packedTimeseries) Unpack(dst *Result, tr storage.TimeRange, fetchData
|
||||
brsLen := len(pts.brs)
|
||||
upws := make([]*unpackWork, 0, 1+brsLen/unpackBatchSize)
|
||||
upw := getUnpackWork()
|
||||
upw.tbf = tbf
|
||||
for _, br := range pts.brs {
|
||||
if len(upw.ws) >= unpackBatchSize {
|
||||
unpackWorkCh <- upw
|
||||
upws = append(upws, upw)
|
||||
upw = getUnpackWork()
|
||||
upw.tbf = tbf
|
||||
}
|
||||
upw.ws = append(upw.ws, unpackWorkItem{
|
||||
br: br,
|
||||
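The loop in the hunk above groups block references into work items of at most `unpackBatchSize` entries, so that each channel send carries a batch rather than a single block. The grouping itself can be sketched as below. `batches` is a hypothetical helper over plain integers, and the final flush after the loop is an assumption, since the rest of `Unpack` is not visible in this hunk.

```go
package main

import "fmt"

// batches splits items into consecutive groups of at most batchSize elements.
// A full group is flushed before the next item is appended, mirroring the
// `len(upw.ws) >= unpackBatchSize` check in the diff. batchSize must be > 0.
func batches(items []int, batchSize int) [][]int {
	out := make([][]int, 0, 1+len(items)/batchSize)
	cur := make([]int, 0, batchSize)
	for _, it := range items {
		if len(cur) >= batchSize {
			out = append(out, cur)
			cur = make([]int, 0, batchSize)
		}
		cur = append(cur, it)
	}
	// Assumption: the trailing, possibly partial, batch is flushed as well.
	if len(cur) > 0 {
		out = append(out, cur)
	}
	return out
}

func main() {
	for _, b := range batches([]int{1, 2, 3, 4, 5}, 2) {
		fmt.Println(b)
	}
}
```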
@@ -397,9 +405,10 @@ func (sb *sortBlock) reset() {
|
||||
sb.NextIdx = 0
|
||||
}
|
||||
|
||||
func (sb *sortBlock) unpackFrom(tmpBlock *storage.Block, br storage.BlockRef, tr storage.TimeRange) error {
|
||||
func (sb *sortBlock) unpackFrom(tmpBlock *storage.Block, tbf *tmpBlocksFile, br blockRef, tr storage.TimeRange) error {
|
||||
tmpBlock.Reset()
|
||||
br.MustReadBlock(tmpBlock, true)
|
||||
brReal := tbf.MustReadBlockRefAt(br.partRef, br.addr)
|
||||
brReal.MustReadBlock(tmpBlock, true)
|
||||
if err := tmpBlock.UnmarshalData(); err != nil {
|
||||
return fmt.Errorf("cannot unmarshal block: %w", err)
|
||||
}
|
||||
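In the hunks above, `unpackFrom` no longer holds a `storage.BlockRef` directly. It holds a small `blockRef` and resolves it through `tbf.MustReadBlockRefAt(br.partRef, br.addr)`. The `tmpBlocksFile` implementation is not part of this section, so the following is only an in-memory sketch of the general write-then-read-by-address idea. `memBlocksStore`, `addr` and their fields are hypothetical.

```go
package main

import "fmt"

// addr locates one record inside the store. The field names are hypothetical.
type addr struct {
	offset int
	size   int
}

// memBlocksStore is an in-memory stand-in for a temporary blocks file.
type memBlocksStore struct {
	data []byte
}

// write appends rec to the store and returns the address it was stored at.
func (s *memBlocksStore) write(rec []byte) addr {
	a := addr{offset: len(s.data), size: len(rec)}
	s.data = append(s.data, rec...)
	return a
}

// readAt returns a copy of the record stored at a.
func (s *memBlocksStore) readAt(a addr) []byte {
	return append([]byte(nil), s.data[a.offset:a.offset+a.size]...)
}

func main() {
	var store memBlocksStore
	// Per-series lists hold only compact addresses, not the records themselves.
	m := make(map[string][]addr)
	m["cpu"] = append(m["cpu"], store.write([]byte("block-1")))
	m["mem"] = append(m["mem"], store.write([]byte("block-2")))
	m["cpu"] = append(m["cpu"], store.write([]byte("block-3")))

	for _, a := range m["cpu"] {
		fmt.Printf("%s\n", store.readAt(a))
	}
}
```

One plausible motivation for such an indirection is to keep per-series bookkeeping small while the bulk data lives elsewhere, but the diff itself does not state the reason.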
@@ -445,6 +454,71 @@ func DeleteSeries(sq *storage.SearchQuery) (int, error) {
|
||||
return vmstorage.DeleteMetrics(tfss)
|
||||
}
|
||||
|
||||
// GetLabelsOnTimeRange returns labels for the given tr until the given deadline.
|
||||
func GetLabelsOnTimeRange(tr storage.TimeRange, deadline searchutils.Deadline) ([]string, error) {
|
||||
if deadline.Exceeded() {
|
||||
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
|
||||
}
|
||||
labels, err := vmstorage.SearchTagKeysOnTimeRange(tr, *maxTagKeysPerSearch, deadline.Deadline())
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error during labels search on time range: %w", err)
|
||||
}
|
||||
// Substitute "" with "__name__"
|
||||
for i := range labels {
|
||||
if labels[i] == "" {
|
||||
labels[i] = "__name__"
|
||||
}
|
||||
}
|
||||
// Sort labels like Prometheus does
|
||||
sort.Strings(labels)
|
||||
return labels, nil
|
||||
}
|
||||
|
||||
// GetGraphiteTags returns Graphite tags until the given deadline.
|
||||
func GetGraphiteTags(filter string, limit int, deadline searchutils.Deadline) ([]string, error) {
|
||||
if deadline.Exceeded() {
|
||||
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
|
||||
}
|
||||
labels, err := GetLabels(deadline)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
// Substitute "__name__" with "name" for Graphite compatibility
|
||||
for i := range labels {
|
||||
if labels[i] != "__name__" {
|
||||
continue
|
||||
}
|
||||
// Prevent from duplicate `name` tag.
|
||||
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/942
|
||||
if hasString(labels, "name") {
|
||||
labels = append(labels[:i], labels[i+1:]...)
|
||||
} else {
|
||||
labels[i] = "name"
|
||||
sort.Strings(labels)
|
||||
}
|
||||
break
|
||||
}
|
||||
if len(filter) > 0 {
|
||||
labels, err = applyGraphiteRegexpFilter(filter, labels)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
}
|
||||
if limit > 0 && limit < len(labels) {
|
||||
labels = labels[:limit]
|
||||
}
|
||||
return labels, nil
|
||||
}
|
||||
|
||||
func hasString(a []string, s string) bool {
|
||||
for _, x := range a {
|
||||
if x == s {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
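The renaming step in `GetGraphiteTags` above maps the Prometheus-style `__name__` label to Graphite's `name` tag, and drops it instead when a `name` label already exists. A sketch of that step in isolation follows. `graphiteTags` is a hypothetical helper that copies its input rather than modifying it in place, and it omits the regexp filter because `applyGraphiteRegexpFilter` is not shown in this section.

```go
package main

import (
	"fmt"
	"sort"
)

// hasString is copied from the diff above.
func hasString(a []string, s string) bool {
	for _, x := range a {
		if x == s {
			return true
		}
	}
	return false
}

// graphiteTags renames "__name__" to "name", or removes it when "name" is
// already present, and then applies the limit. labels is expected to be sorted.
func graphiteTags(labels []string, limit int) []string {
	out := append([]string(nil), labels...) // work on a copy
	for i := range out {
		if out[i] != "__name__" {
			continue
		}
		if hasString(out, "name") {
			// Avoid a duplicate "name" tag.
			out = append(out[:i], out[i+1:]...)
		} else {
			out[i] = "name"
			sort.Strings(out) // the rename can break the sort order
		}
		break
	}
	if limit > 0 && limit < len(out) {
		out = out[:limit]
	}
	return out
}

func main() {
	fmt.Println(graphiteTags([]string{"__name__", "host", "region"}, 0))
	fmt.Println(graphiteTags([]string{"__name__", "host", "name"}, 0))
	fmt.Println(graphiteTags([]string{"__name__", "host", "region"}, 1))
}
```

The re-sort after renaming matters because `__name__` sorts before lowercase letters while `name` does not.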
|
||||
|
||||
// GetLabels returns labels until the given deadline.
|
||||
func GetLabels(deadline searchutils.Deadline) ([]string, error) {
|
||||
if deadline.Exceeded() {
|
||||
@@ -454,20 +528,60 @@ func GetLabels(deadline searchutils.Deadline) ([]string, error) {
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error during labels search: %w", err)
|
||||
}
|
||||
|
||||
// Substitute "" with "__name__"
|
||||
for i := range labels {
|
||||
if labels[i] == "" {
|
||||
labels[i] = "__name__"
|
||||
}
|
||||
}
|
||||
|
||||
// Sort labels like Prometheus does
|
||||
sort.Strings(labels)
|
||||
|
||||
return labels, nil
|
||||
}
|
||||
|
||||
// GetLabelValuesOnTimeRange returns label values for the given labelName on the given tr
|
||||
// until the given deadline.
|
||||
func GetLabelValuesOnTimeRange(labelName string, tr storage.TimeRange, deadline searchutils.Deadline) ([]string, error) {
|
||||
if deadline.Exceeded() {
|
||||
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
|
||||
}
|
||||
if labelName == "__name__" {
|
||||
labelName = ""
|
||||
}
|
||||
// Search for tag values
|
||||
labelValues, err := vmstorage.SearchTagValuesOnTimeRange([]byte(labelName), tr, *maxTagValuesPerSearch, deadline.Deadline())
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error during label values search on time range for labelName=%q: %w", labelName, err)
|
||||
}
|
||||
// Sort labelValues like Prometheus does
|
||||
sort.Strings(labelValues)
|
||||
return labelValues, nil
|
||||
}
|
||||
|
||||
// GetGraphiteTagValues returns tag values for the given tagName until the given deadline.
|
||||
func GetGraphiteTagValues(tagName, filter string, limit int, deadline searchutils.Deadline) ([]string, error) {
|
||||
if deadline.Exceeded() {
|
||||
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
|
||||
}
|
||||
if tagName == "name" {
|
||||
tagName = ""
|
||||
}
|
||||
tagValues, err := GetLabelValues(tagName, deadline)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
if len(filter) > 0 {
|
||||
tagValues, err = applyGraphiteRegexpFilter(filter, tagValues)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
}
|
||||
if limit > 0 && limit < len(tagValues) {
|
||||
tagValues = tagValues[:limit]
|
||||
}
|
||||
return tagValues, nil
|
||||
}
|
||||
|
||||
// GetLabelValues returns label values for the given labelName
|
||||
// until the given deadline.
|
||||
func GetLabelValues(labelName string, deadline searchutils.Deadline) ([]string, error) {
|
||||
@@ -477,16 +591,13 @@ func GetLabelValues(labelName string, deadline searchutils.Deadline) ([]string,
|
||||
if labelName == "__name__" {
|
||||
labelName = ""
|
||||
}
|
||||
|
||||
// Search for tag values
|
||||
labelValues, err := vmstorage.SearchTagValues([]byte(labelName), *maxTagValuesPerSearch, deadline.Deadline())
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error during label values search for labelName=%q: %w", labelName, err)
|
||||
}
|
||||
|
||||
// Sort labelValues like Prometheus does
|
||||
sort.Strings(labelValues)
|
||||
|
||||
return labelValues, nil
|
||||
}
|
||||
|
||||
@@ -604,7 +715,7 @@ func ExportBlocks(sq *storage.SearchQuery, deadline searchutils.Deadline, f func
|
||||
sr.Init(vmstorage.Storage, tfss, tr, *maxMetricsPerSearch, deadline.Deadline())
|
||||
|
||||
// Start workers that call f in parallel on available CPU cores.
|
||||
gomaxprocs := runtime.GOMAXPROCS(-1)
|
||||
gomaxprocs := cgroup.AvailableCPUs()
|
||||
workCh := make(chan *exportWork, gomaxprocs*8)
|
||||
var (
|
||||
errGlobal error
|
||||
@@ -683,6 +794,32 @@ var exportWorkPool = &sync.Pool{
|
||||
},
|
||||
}
|
||||
|
||||
// SearchMetricNames returns all the metric names matching sq until the given deadline.
|
||||
func SearchMetricNames(sq *storage.SearchQuery, deadline searchutils.Deadline) ([]storage.MetricName, error) {
|
||||
if deadline.Exceeded() {
|
||||
return nil, fmt.Errorf("timeout exceeded before starting to search metric names: %s", deadline.String())
|
||||
}
|
||||
|
||||
// Setup search.
|
||||
tfss, err := setupTfss(sq.TagFilterss)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
tr := storage.TimeRange{
|
||||
MinTimestamp: sq.MinTimestamp,
|
||||
MaxTimestamp: sq.MaxTimestamp,
|
||||
}
|
||||
if err := vmstorage.CheckTimeRange(tr); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
mns, err := vmstorage.SearchMetricNames(tfss, tr, *maxMetricsPerSearch, deadline.Deadline())
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot find metric names: %w", err)
|
||||
}
|
||||
return mns, nil
|
||||
}
|
||||
|
||||
// ProcessSearchQuery performs sq until the given deadline.
|
||||
//
|
||||
// Results.RunParallel or Results.Cancel must be called on the returned Results.
|
||||
@@ -709,19 +846,31 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline search
|
||||
|
||||
sr := getStorageSearch()
|
||||
maxSeriesCount := sr.Init(vmstorage.Storage, tfss, tr, *maxMetricsPerSearch, deadline.Deadline())
|
||||
|
||||
m := make(map[string][]storage.BlockRef, maxSeriesCount)
|
||||
m := make(map[string][]blockRef, maxSeriesCount)
|
||||
orderedMetricNames := make([]string, 0, maxSeriesCount)
|
||||
blocksRead := 0
|
||||
tbf := getTmpBlocksFile()
|
||||
var buf []byte
|
||||
for sr.NextMetricBlock() {
|
||||
blocksRead++
|
||||
if deadline.Exceeded() {
|
||||
putTmpBlocksFile(tbf)
|
||||
putStorageSearch(sr)
|
||||
return nil, fmt.Errorf("timeout exceeded while fetching data block #%d from storage: %s", blocksRead, deadline.String())
|
||||
}
|
||||
buf = sr.MetricBlockRef.BlockRef.Marshal(buf[:0])
|
||||
addr, err := tbf.WriteBlockRefData(buf)
|
||||
if err != nil {
|
||||
putTmpBlocksFile(tbf)
|
||||
putStorageSearch(sr)
|
||||
return nil, fmt.Errorf("cannot write %d bytes to temporary file: %w", len(buf), err)
|
||||
}
|
||||
metricName := sr.MetricBlockRef.MetricName
|
||||
brs := m[string(metricName)]
|
||||
brs = append(brs, *sr.MetricBlockRef.BlockRef)
|
||||
brs = append(brs, blockRef{
|
||||
partRef: sr.MetricBlockRef.BlockRef.PartRef(),
|
||||
addr: addr,
|
||||
})
|
||||
if len(brs) > 1 {
|
||||
// An optimization: do not allocate a string for already existing metricName key in m
|
||||
m[string(metricName)] = brs
|
||||
@@ -733,12 +882,18 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline search
|
||||
}
|
||||
}
|
||||
if err := sr.Error(); err != nil {
|
||||
putTmpBlocksFile(tbf)
|
||||
putStorageSearch(sr)
|
||||
if errors.Is(err, storage.ErrDeadlineExceeded) {
|
||||
return nil, fmt.Errorf("timeout exceeded during the query: %s", deadline.String())
|
||||
}
|
||||
return nil, fmt.Errorf("search error after reading %d data blocks: %w", blocksRead, err)
|
||||
}
|
||||
if err := tbf.Finalize(); err != nil {
|
||||
putTmpBlocksFile(tbf)
|
||||
putStorageSearch(sr)
|
||||
return nil, fmt.Errorf("cannot finalize temporary file: %w", err)
|
||||
}
|
||||
|
||||
var rss Results
|
||||
rss.tr = tr
|
||||
@@ -753,9 +908,15 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline search
|
||||
}
|
||||
rss.packedTimeseries = pts
|
||||
rss.sr = sr
|
||||
rss.tbf = tbf
|
||||
return &rss, nil
|
||||
}
|
||||
|
||||
type blockRef struct {
|
||||
partRef storage.PartRef
|
||||
addr tmpBlockAddr
|
||||
}
|
||||
|
||||
func setupTfss(tagFilterss [][]storage.TagFilter) ([]*storage.TagFilters, error) {
|
||||
tfss := make([]*storage.TagFilters, 0, len(tagFilterss))
|
||||
for _, tagFilters := range tagFilterss {
|
||||
@@ -771,3 +932,20 @@ func setupTfss(tagFilterss [][]storage.TagFilter) ([]*storage.TagFilters, error)
|
||||
}
|
||||
return tfss, nil
|
||||
}
|
||||
|
||||
func applyGraphiteRegexpFilter(filter string, ss []string) ([]string, error) {
|
||||
// Anchor filter regexp to the beginning of the string as Graphite does.
|
||||
// See https://github.com/graphite-project/graphite-web/blob/3ad279df5cb90b211953e39161df416e54a84948/webapp/graphite/tags/localdatabase.py#L157
|
||||
filter = "^(?:" + filter + ")"
|
||||
re, err := regexp.Compile(filter)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot parse regexp filter=%q: %w", filter, err)
|
||||
}
|
||||
dst := ss[:0]
|
||||
for _, s := range ss {
|
||||
if re.MatchString(s) {
|
||||
dst = append(dst, s)
|
||||
}
|
||||
}
|
||||
return dst, nil
|
||||
}
|
||||
|
||||
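The anchored filter in `applyGraphiteRegexpFilter` can be tried in isolation. The following is a minimal sketch, assuming only the standard library; `filterByPrefixRegexp` is a hypothetical name used for illustration. The non-capturing group matters: without it, `^prod|stag` would anchor only the first alternative.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterByPrefixRegexp keeps the strings in ss that match filter at their start.
// It reuses the backing array of ss, so the caller must not rely on ss afterwards.
func filterByPrefixRegexp(filter string, ss []string) ([]string, error) {
	// Wrap the filter in a non-capturing group so the anchor applies
	// to every alternative, not only to the first one.
	re, err := regexp.Compile("^(?:" + filter + ")")
	if err != nil {
		return nil, fmt.Errorf("cannot parse regexp filter=%q: %w", filter, err)
	}
	dst := ss[:0]
	for _, s := range ss {
		if re.MatchString(s) {
			dst = append(dst, s)
		}
	}
	return dst, nil
}

func main() {
	values := []string{"prod", "preprod", "staging", "production"}
	kept, err := filterByPrefixRegexp("prod|stag", values)
	if err != nil {
		panic(err)
	}
	fmt.Println(kept)
}
```

Filtering in place avoids a second allocation, at the cost of overwriting the input slice.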
185
app/vmselect/netstorage/tmp_blocks_file.go
Normal file
@@ -0,0 +1,185 @@
|
||||
package netstorage
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"io/ioutil"
|
||||
"os"
|
||||
"sync"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
)
|
||||
|
||||
// InitTmpBlocksDir initializes directory to store temporary search results.
|
||||
//
|
||||
// It stores data in system-defined temporary directory if tmpDirPath is empty.
|
||||
func InitTmpBlocksDir(tmpDirPath string) {
|
||||
if len(tmpDirPath) == 0 {
|
||||
tmpDirPath = os.TempDir()
|
||||
}
|
||||
tmpBlocksDir = tmpDirPath + "/searchResults"
|
||||
fs.MustRemoveAll(tmpBlocksDir)
|
||||
if err := fs.MkdirAllIfNotExist(tmpBlocksDir); err != nil {
|
||||
logger.Panicf("FATAL: cannot create %q: %s", tmpBlocksDir, err)
|
||||
}
|
||||
}
|
||||
|
||||
var tmpBlocksDir string
|
||||
|
||||
func maxInmemoryTmpBlocksFile() int {
|
||||
mem := memory.Allowed()
|
||||
maxLen := mem / 1024
|
||||
if maxLen < 64*1024 {
|
||||
return 64 * 1024
|
||||
}
|
||||
if maxLen > 4*1024*1024 {
|
||||
return 4 * 1024 * 1024
|
||||
}
|
||||
return maxLen
|
||||
}
|
||||
|
||||
var _ = metrics.NewGauge(`vm_tmp_blocks_max_inmemory_file_size_bytes`, func() float64 {
|
||||
return float64(maxInmemoryTmpBlocksFile())
|
||||
})
|
||||
|
||||
type tmpBlocksFile struct {
|
||||
buf []byte
|
||||
|
||||
f *os.File
|
||||
r *fs.ReaderAt
|
||||
|
||||
offset uint64
|
||||
}
|
||||
|
||||
func getTmpBlocksFile() *tmpBlocksFile {
|
||||
v := tmpBlocksFilePool.Get()
|
||||
if v == nil {
|
||||
return &tmpBlocksFile{
|
||||
buf: make([]byte, 0, maxInmemoryTmpBlocksFile()),
|
||||
}
|
||||
}
|
||||
return v.(*tmpBlocksFile)
|
||||
}
|
||||
|
||||
func putTmpBlocksFile(tbf *tmpBlocksFile) {
|
||||
tbf.MustClose()
|
||||
tbf.buf = tbf.buf[:0]
|
||||
tbf.f = nil
|
||||
tbf.r = nil
|
||||
tbf.offset = 0
|
||||
tmpBlocksFilePool.Put(tbf)
|
||||
}
|
||||
|
||||
var tmpBlocksFilePool sync.Pool
|
||||
|
||||
type tmpBlockAddr struct {
|
||||
offset uint64
|
||||
size int
|
||||
}
|
||||
|
||||
func (addr tmpBlockAddr) String() string {
|
||||
return fmt.Sprintf("offset %d, size %d", addr.offset, addr.size)
|
||||
}
|
||||
|
||||
var (
|
||||
tmpBlocksFilesCreated = metrics.NewCounter(`vm_tmp_blocks_files_created_total`)
|
||||
_ = metrics.NewGauge(`vm_tmp_blocks_files_directory_free_bytes`, func() float64 {
|
||||
return float64(fs.MustGetFreeSpace(tmpBlocksDir))
|
||||
})
|
||||
)
|
||||
|
||||
// WriteBlockRefData writes br to tbf.
|
||||
//
|
||||
// It returns errors since the operation may fail on space shortage
|
||||
// and this must be handled.
|
||||
func (tbf *tmpBlocksFile) WriteBlockRefData(b []byte) (tmpBlockAddr, error) {
|
||||
var addr tmpBlockAddr
|
||||
addr.offset = tbf.offset
|
||||
addr.size = len(b)
|
||||
tbf.offset += uint64(addr.size)
|
||||
if len(tbf.buf)+len(b) <= cap(tbf.buf) {
|
||||
// Fast path - the data fits tbf.buf
|
||||
tbf.buf = append(tbf.buf, b...)
|
||||
return addr, nil
|
||||
}
|
||||
|
||||
// Slow path: flush the data from tbf.buf to file.
|
||||
if tbf.f == nil {
|
||||
f, err := ioutil.TempFile(tmpBlocksDir, "")
|
||||
if err != nil {
|
||||
return addr, err
|
||||
}
|
||||
tbf.f = f
|
||||
tmpBlocksFilesCreated.Inc()
|
||||
}
|
||||
_, err := tbf.f.Write(tbf.buf)
|
||||
tbf.buf = append(tbf.buf[:0], b...)
|
||||
if err != nil {
|
||||
return addr, fmt.Errorf("cannot write block to %q: %w", tbf.f.Name(), err)
|
||||
}
|
||||
return addr, nil
|
||||
}
|
||||
|
||||
func (tbf *tmpBlocksFile) Finalize() error {
|
||||
if tbf.f == nil {
|
||||
return nil
|
||||
}
|
||||
fname := tbf.f.Name()
|
||||
if _, err := tbf.f.Write(tbf.buf); err != nil {
|
||||
return fmt.Errorf("cannot write the remaining %d bytes to %q: %w", len(tbf.buf), fname, err)
|
||||
}
|
||||
tbf.buf = tbf.buf[:0]
|
||||
r := fs.MustOpenReaderAt(fname)
|
||||
// Hint the OS that the file is read almost sequentially.
|
||||
// This should reduce the number of disk seeks, which is important
|
||||
// for HDDs.
|
||||
r.MustFadviseSequentialRead(true)
|
||||
tbf.r = r
|
||||
return nil
|
||||
}
|
||||
|
||||
func (tbf *tmpBlocksFile) MustReadBlockRefAt(partRef storage.PartRef, addr tmpBlockAddr) storage.BlockRef {
|
||||
var buf []byte
|
||||
if tbf.f == nil {
|
||||
buf = tbf.buf[addr.offset : addr.offset+uint64(addr.size)]
|
||||
} else {
|
||||
bb := tmpBufPool.Get()
|
||||
defer tmpBufPool.Put(bb)
|
||||
bb.B = bytesutil.Resize(bb.B, addr.size)
|
||||
tbf.r.MustReadAt(bb.B, int64(addr.offset))
|
||||
buf = bb.B
|
||||
}
|
||||
var br storage.BlockRef
|
||||
if err := br.Init(partRef, buf); err != nil {
|
||||
logger.Panicf("FATAL: cannot initialize BlockRef: %s", err)
|
||||
}
|
||||
return br
|
||||
}
|
||||
|
||||
var tmpBufPool bytesutil.ByteBufferPool
|
||||
|
||||
func (tbf *tmpBlocksFile) MustClose() {
|
||||
if tbf.f == nil {
|
||||
return
|
||||
}
|
||||
if tbf.r != nil {
|
||||
// tbf.r could be nil if Finalize wasn't called.
|
||||
tbf.r.MustClose()
|
||||
}
|
||||
fname := tbf.f.Name()
|
||||
|
||||
// Remove the file at first, then close it.
|
||||
// This way the OS shouldn't try to flush file contents to storage
|
||||
// on close.
|
||||
if err := os.Remove(fname); err != nil {
|
||||
logger.Panicf("FATAL: cannot remove %q: %s", fname, err)
|
||||
}
|
||||
if err := tbf.f.Close(); err != nil {
|
||||
logger.Panicf("FATAL: cannot close %q: %s", fname, err)
|
||||
}
|
||||
tbf.f = nil
|
||||
}
|
||||
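The buffer-then-spill behaviour of `tmpBlocksFile` can be sketched outside the repository as follows. This is a simplified illustration, not the code from the diff: `spillBuffer`, `blockAddr` and their methods are hypothetical names, `os.CreateTemp` stands in for `ioutil.TempFile`, errors are returned instead of triggering a panic, and the file is closed before it is removed (the diff removes it first).

```go
package main

import (
	"fmt"
	"os"
)

type blockAddr struct { // locates one record in the logical byte stream
	offset uint64
	size   int
}

// spillBuffer is a hypothetical, simplified stand-in for tmpBlocksFile.
type spillBuffer struct {
	buf    []byte   // records not yet written to the file
	f      *os.File // nil until the buffer overflows for the first time
	offset uint64   // total number of bytes accepted so far
}

func newSpillBuffer(capacity int) *spillBuffer {
	return &spillBuffer{buf: make([]byte, 0, capacity)}
}

// Write stores b and returns its address in the logical stream.
func (sb *spillBuffer) Write(b []byte) (blockAddr, error) {
	addr := blockAddr{offset: sb.offset, size: len(b)}
	sb.offset += uint64(len(b))
	if len(sb.buf)+len(b) <= cap(sb.buf) {
		sb.buf = append(sb.buf, b...) // fast path: the record fits in memory
		return addr, nil
	}
	if sb.f == nil {
		f, err := os.CreateTemp("", "spill-*")
		if err != nil {
			return addr, err
		}
		sb.f = f
	}
	// Slow path: flush the buffered bytes, then restart the buffer with b.
	_, err := sb.f.Write(sb.buf)
	sb.buf = append(sb.buf[:0], b...)
	return addr, err
}

// Finalize flushes the buffered tail so every record can be read from the file.
func (sb *spillBuffer) Finalize() error {
	if sb.f == nil {
		return nil
	}
	_, err := sb.f.Write(sb.buf)
	sb.buf = sb.buf[:0]
	return err
}

// ReadAt returns a copy of the record at addr. Call it only after Finalize.
func (sb *spillBuffer) ReadAt(addr blockAddr) ([]byte, error) {
	p := make([]byte, addr.size)
	if sb.f == nil {
		copy(p, sb.buf[addr.offset:addr.offset+uint64(addr.size)])
		return p, nil
	}
	_, err := sb.f.ReadAt(p, int64(addr.offset))
	return p, err
}

// Close closes and removes the temporary file, if one was created.
func (sb *spillBuffer) Close() {
	if sb.f == nil {
		return
	}
	name := sb.f.Name()
	_ = sb.f.Close()
	_ = os.Remove(name)
	sb.f = nil
}

func main() {
	sb := newSpillBuffer(8) // tiny capacity to force a spill to disk
	defer sb.Close()
	var addrs []blockAddr
	for _, rec := range []string{"alpha", "beta", "gamma"} {
		addr, err := sb.Write([]byte(rec))
		if err != nil {
			panic(err)
		}
		addrs = append(addrs, addr)
	}
	if err := sb.Finalize(); err != nil {
		panic(err)
	}
	for _, addr := range addrs {
		rec, _ := sb.ReadAt(addr)
		fmt.Printf("offset=%d size=%d data=%s\n", addr.offset, addr.size, rec)
	}
}
```

Because addresses are offsets into the logical stream, callers do not need to know whether a record ended up in memory or on disk.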
@@ -5,7 +5,6 @@ import (
|
||||
"fmt"
|
||||
"math"
|
||||
"net/http"
-"runtime"
|
||||
"sort"
|
||||
"strconv"
|
||||
"strings"
|
||||
@@ -15,8 +14,10 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/bufferedwriter"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
+"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/querystats"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
+"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
@@ -78,17 +79,13 @@ func FederateHandler(startTime time.Time, w http.ResponseWriter, r *http.Request
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
-sq := &storage.SearchQuery{
-MinTimestamp: start,
-MaxTimestamp: end,
-TagFilterss: tagFilterss,
-}
+sq := storage.NewSearchQuery(start, end, tagFilterss)
|
||||
rss, err := netstorage.ProcessSearchQuery(sq, true, deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot fetch data for %q: %w", sq, err)
|
||||
}
|
||||
|
||||
-w.Header().Set("Content-Type", "text/plain")
+w.Header().Set("Content-Type", "text/plain; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
err = rss.RunParallel(func(rs *netstorage.Result, workerID uint) error {
|
||||
@@ -146,16 +143,12 @@ func ExportCSVHandler(startTime time.Time, w http.ResponseWriter, r *http.Reques
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
-sq := &storage.SearchQuery{
-MinTimestamp: start,
-MaxTimestamp: end,
-TagFilterss: tagFilterss,
-}
-w.Header().Set("Content-Type", "text/csv")
+sq := storage.NewSearchQuery(start, end, tagFilterss)
+w.Header().Set("Content-Type", "text/csv; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
|
||||
-resultsCh := make(chan *quicktemplate.ByteBuffer, runtime.GOMAXPROCS(-1))
+resultsCh := make(chan *quicktemplate.ByteBuffer, cgroup.AvailableCPUs())
|
||||
doneCh := make(chan error)
|
||||
go func() {
|
||||
err := netstorage.ExportBlocks(sq, deadline, func(mn *storage.MetricName, b *storage.Block, tr storage.TimeRange) error {
|
||||
@@ -227,11 +220,7 @@ func ExportNativeHandler(startTime time.Time, w http.ResponseWriter, r *http.Req
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
-sq := &storage.SearchQuery{
-MinTimestamp: start,
-MaxTimestamp: end,
-TagFilterss: tagFilterss,
-}
+sq := storage.NewSearchQuery(start, end, tagFilterss)
|
||||
w.Header().Set("Content-Type", "VictoriaMetrics/native")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
@@ -331,9 +320,9 @@ func exportHandler(w http.ResponseWriter, matches []string, start, end int64, fo
|
||||
WriteExportJSONLine(bb, xb)
|
||||
resultsCh <- bb
|
||||
}
|
||||
-contentType := "application/stream+json"
+contentType := "application/stream+json; charset=utf-8"
|
||||
if format == "prometheus" {
|
||||
-contentType = "text/plain"
+contentType = "text/plain; charset=utf-8"
|
||||
writeLineFunc = func(xb *exportBlock, resultsCh chan<- *quicktemplate.ByteBuffer) {
|
||||
bb := quicktemplate.AcquireByteBuffer()
|
||||
WriteExportPrometheusLine(bb, xb)
|
||||
@@ -381,16 +370,12 @@ func exportHandler(w http.ResponseWriter, matches []string, start, end int64, fo
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
-sq := &storage.SearchQuery{
-MinTimestamp: start,
-MaxTimestamp: end,
-TagFilterss: tagFilterss,
-}
+sq := storage.NewSearchQuery(start, end, tagFilterss)
|
||||
w.Header().Set("Content-Type", contentType)
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
|
||||
-resultsCh := make(chan *quicktemplate.ByteBuffer, runtime.GOMAXPROCS(-1))
+resultsCh := make(chan *quicktemplate.ByteBuffer, cgroup.AvailableCPUs())
|
||||
doneCh := make(chan error)
|
||||
if !reduceMemUsage {
|
||||
rss, err := netstorage.ProcessSearchQuery(sq, true, deadline)
|
||||
@@ -486,9 +471,7 @@ func DeleteHandler(startTime time.Time, r *http.Request) error {
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
-sq := &storage.SearchQuery{
-TagFilterss: tagFilterss,
-}
+sq := storage.NewSearchQuery(0, 0, tagFilterss)
|
||||
deletedCount, err := netstorage.DeleteSeries(sq)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot delete time series matching %q: %w", matches, err)
|
||||
@@ -511,11 +494,31 @@ func LabelValuesHandler(startTime time.Time, labelName string, w http.ResponseWr
|
||||
return fmt.Errorf("cannot parse form values: %w", err)
|
||||
}
|
||||
var labelValues []string
|
||||
if len(r.Form["match[]"]) == 0 && len(r.Form["start"]) == 0 && len(r.Form["end"]) == 0 {
|
||||
var err error
|
||||
labelValues, err = netstorage.GetLabelValues(labelName, deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf(`cannot obtain label values for %q: %w`, labelName, err)
|
||||
if len(r.Form["match[]"]) == 0 {
|
||||
if len(r.Form["start"]) == 0 && len(r.Form["end"]) == 0 {
|
||||
var err error
|
||||
labelValues, err = netstorage.GetLabelValues(labelName, deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf(`cannot obtain label values for %q: %w`, labelName, err)
|
||||
}
|
||||
} else {
|
||||
ct := startTime.UnixNano() / 1e6
|
||||
end, err := searchutils.GetTime(r, "end", ct)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
start, err := searchutils.GetTime(r, "start", end-defaultStep)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
tr := storage.TimeRange{
|
||||
MinTimestamp: start,
|
||||
MaxTimestamp: end,
|
||||
}
|
||||
labelValues, err = netstorage.GetLabelValuesOnTimeRange(labelName, tr, deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf(`cannot obtain label values on time range for %q: %w`, labelName, err)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
// Extended functionality that allows filtering by label filters and time range
|
||||
@@ -541,7 +544,7 @@ func LabelValuesHandler(startTime time.Time, labelName string, w http.ResponseWr
|
||||
}
|
||||
}
|
||||
|
||||
-w.Header().Set("Content-Type", "application/json")
+w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteLabelValuesResponse(bw, labelValues)
|
||||
@@ -576,32 +579,41 @@ func labelValuesWithMatches(labelName string, matches []string, start, end int64
|
||||
if start >= end {
|
||||
end = start + defaultStep
|
||||
}
|
||||
sq := &storage.SearchQuery{
|
||||
MinTimestamp: start,
|
||||
MaxTimestamp: end,
|
||||
TagFilterss: tagFilterss,
|
||||
}
|
||||
rss, err := netstorage.ProcessSearchQuery(sq, false, deadline)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot fetch data for %q: %w", sq, err)
|
||||
}
|
||||
|
||||
sq := storage.NewSearchQuery(start, end, tagFilterss)
|
||||
m := make(map[string]struct{})
|
||||
var mLock sync.Mutex
|
||||
err = rss.RunParallel(func(rs *netstorage.Result, workerID uint) error {
|
||||
labelValue := rs.MetricName.GetTagValue(labelName)
|
||||
if len(labelValue) == 0 {
|
||||
return nil
|
||||
if end-start > 24*3600*1000 {
|
||||
// It is cheaper to call SearchMetricNames on time ranges exceeding a day.
|
||||
mns, err := netstorage.SearchMetricNames(sq, deadline)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot fetch time series for %q: %w", sq, err)
|
||||
}
|
||||
for _, mn := range mns {
|
||||
labelValue := mn.GetTagValue(labelName)
|
||||
if len(labelValue) == 0 {
|
||||
continue
|
||||
}
|
||||
m[string(labelValue)] = struct{}{}
|
||||
}
|
||||
} else {
|
||||
rss, err := netstorage.ProcessSearchQuery(sq, false, deadline)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot fetch data for %q: %w", sq, err)
|
||||
}
|
||||
var mLock sync.Mutex
|
||||
err = rss.RunParallel(func(rs *netstorage.Result, workerID uint) error {
|
||||
labelValue := rs.MetricName.GetTagValue(labelName)
|
||||
if len(labelValue) == 0 {
|
||||
return nil
|
||||
}
|
||||
mLock.Lock()
|
||||
m[string(labelValue)] = struct{}{}
|
||||
mLock.Unlock()
|
||||
return nil
|
||||
})
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error when data fetching: %w", err)
|
||||
}
|
||||
mLock.Lock()
|
||||
m[string(labelValue)] = struct{}{}
|
||||
mLock.Unlock()
|
||||
return nil
|
||||
})
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error when data fetching: %w", err)
|
||||
}
|
||||
|
||||
labelValues := make([]string, 0, len(m))
|
||||
for labelValue := range m {
|
||||
labelValues = append(labelValues, labelValue)
|
||||
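The hunk above collects label values from parallel workers into a shared map guarded by a mutex. A minimal standalone sketch of that pattern follows; `uniqueValues` and its `shards` input are hypothetical, plain goroutines stand in for `RunParallel`, and the final sort is an assumption added to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// uniqueValues starts one goroutine per shard and merges all non-empty
// values into a set that is protected by a mutex.
func uniqueValues(shards [][]string) []string {
	m := make(map[string]struct{})
	var mLock sync.Mutex
	var wg sync.WaitGroup
	for _, shard := range shards {
		wg.Add(1)
		go func(values []string) {
			defer wg.Done()
			for _, v := range values {
				if len(v) == 0 {
					continue // the series has no value for the label
				}
				mLock.Lock()
				m[v] = struct{}{}
				mLock.Unlock()
			}
		}(shard)
	}
	wg.Wait()

	out := make([]string, 0, len(m))
	for v := range m {
		out = append(out, v)
	}
	sort.Strings(out) // assumption: sorted output, for determinism
	return out
}

func main() {
	shards := [][]string{{"a", "b", ""}, {"b", "c"}}
	fmt.Println(uniqueValues(shards))
}
```

The single-goroutine `SearchMetricNames` branch in the hunk needs no lock, which is why the mutex moves inside the `else` branch.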
@@ -619,7 +631,7 @@ func LabelsCountHandler(startTime time.Time, w http.ResponseWriter, r *http.Requ
|
||||
if err != nil {
|
||||
return fmt.Errorf(`cannot obtain label entries: %w`, err)
|
||||
}
|
||||
-w.Header().Set("Content-Type", "application/json")
+w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteLabelsCountResponse(bw, labelEntries)
|
||||
@@ -670,7 +682,7 @@ func TSDBStatusHandler(startTime time.Time, w http.ResponseWriter, r *http.Reque
|
||||
if err != nil {
|
||||
return fmt.Errorf(`cannot obtain tsdb status for date=%d, topN=%d: %w`, date, topN, err)
|
||||
}
|
||||
-w.Header().Set("Content-Type", "application/json")
+w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteTSDBStatusResponse(bw, status)
|
||||
@@ -692,11 +704,31 @@ func LabelsHandler(startTime time.Time, w http.ResponseWriter, r *http.Request)
|
||||
return fmt.Errorf("cannot parse form values: %w", err)
|
||||
}
|
||||
var labels []string
|
||||
if len(r.Form["match[]"]) == 0 && len(r.Form["start"]) == 0 && len(r.Form["end"]) == 0 {
|
||||
var err error
|
||||
labels, err = netstorage.GetLabels(deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot obtain labels: %w", err)
|
||||
if len(r.Form["match[]"]) == 0 {
|
||||
if len(r.Form["start"]) == 0 && len(r.Form["end"]) == 0 {
|
||||
var err error
|
||||
labels, err = netstorage.GetLabels(deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot obtain labels: %w", err)
|
||||
}
|
||||
} else {
|
||||
ct := startTime.UnixNano() / 1e6
|
||||
end, err := searchutils.GetTime(r, "end", ct)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
start, err := searchutils.GetTime(r, "start", end-defaultStep)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
tr := storage.TimeRange{
|
||||
MinTimestamp: start,
|
||||
MaxTimestamp: end,
|
||||
}
|
||||
labels, err = netstorage.GetLabelsOnTimeRange(tr, deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot obtain labels on time range: %w", err)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
// Extended functionality that allows filtering by label filters and time range
|
||||
@@ -720,7 +752,7 @@ func LabelsHandler(startTime time.Time, w http.ResponseWriter, r *http.Request)
|
||||
}
|
||||
}
|
||||
|
||||
-w.Header().Set("Content-Type", "application/json")
+w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteLabelsResponse(bw, labels)
|
||||
@@ -742,33 +774,41 @@ func labelsWithMatches(matches []string, start, end int64, deadline searchutils.
|
||||
if start >= end {
|
||||
end = start + defaultStep
|
||||
}
|
||||
sq := &storage.SearchQuery{
|
||||
MinTimestamp: start,
|
||||
MaxTimestamp: end,
|
||||
TagFilterss: tagFilterss,
|
||||
}
|
||||
rss, err := netstorage.ProcessSearchQuery(sq, false, deadline)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot fetch data for %q: %w", sq, err)
|
||||
}
|
||||
|
||||
sq := storage.NewSearchQuery(start, end, tagFilterss)
|
||||
m := make(map[string]struct{})
|
||||
var mLock sync.Mutex
|
||||
err = rss.RunParallel(func(rs *netstorage.Result, workerID uint) error {
|
||||
mLock.Lock()
|
||||
tags := rs.MetricName.Tags
|
||||
for i := range tags {
|
||||
t := &tags[i]
|
||||
m[string(t.Key)] = struct{}{}
|
||||
if end-start > 24*3600*1000 {
|
||||
// It is cheaper to call SearchMetricNames on time ranges exceeding a day.
|
||||
mns, err := netstorage.SearchMetricNames(sq, deadline)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot fetch time series for %q: %w", sq, err)
|
||||
}
|
||||
for _, mn := range mns {
|
||||
for _, tag := range mn.Tags {
|
||||
m[string(tag.Key)] = struct{}{}
|
||||
}
|
||||
}
|
||||
if len(mns) > 0 {
|
||||
m["__name__"] = struct{}{}
|
||||
}
|
||||
} else {
|
||||
rss, err := netstorage.ProcessSearchQuery(sq, false, deadline)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot fetch data for %q: %w", sq, err)
|
||||
}
|
||||
var mLock sync.Mutex
|
||||
err = rss.RunParallel(func(rs *netstorage.Result, workerID uint) error {
|
||||
mLock.Lock()
|
||||
for _, tag := range rs.MetricName.Tags {
|
||||
m[string(tag.Key)] = struct{}{}
|
||||
}
|
||||
m["__name__"] = struct{}{}
|
||||
mLock.Unlock()
|
||||
return nil
|
||||
})
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error when data fetching: %w", err)
|
||||
}
|
||||
m["__name__"] = struct{}{}
|
||||
mLock.Unlock()
|
||||
return nil
|
||||
})
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error when data fetching: %w", err)
|
||||
}
|
||||
|
||||
labels := make([]string, 0, len(m))
|
||||
for label := range m {
|
||||
labels = append(labels, label)
|
||||
@@ -786,7 +826,7 @@ func SeriesCountHandler(startTime time.Time, w http.ResponseWriter, r *http.Requ
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot obtain series count: %w", err)
|
||||
}
|
||||
-w.Header().Set("Content-Type", "application/json")
+w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteSeriesCountResponse(bw, n)
|
||||
@@ -833,17 +873,39 @@ func SeriesHandler(startTime time.Time, w http.ResponseWriter, r *http.Request)
|
||||
if start >= end {
|
||||
end = start + defaultStep
|
||||
}
|
||||
sq := &storage.SearchQuery{
|
||||
MinTimestamp: start,
|
||||
MaxTimestamp: end,
|
||||
TagFilterss: tagFilterss,
|
||||
sq := storage.NewSearchQuery(start, end, tagFilterss)
|
||||
if end-start > 24*3600*1000 {
|
||||
// It is cheaper to call SearchMetricNames on time ranges exceeding a day.
|
||||
mns, err := netstorage.SearchMetricNames(sq, deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot fetch time series for %q: %w", sq, err)
|
||||
}
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
resultsCh := make(chan *quicktemplate.ByteBuffer)
|
||||
go func() {
|
||||
for i := range mns {
|
||||
bb := quicktemplate.AcquireByteBuffer()
|
||||
writemetricNameObject(bb, &mns[i])
|
||||
resultsCh <- bb
|
||||
}
|
||||
close(resultsCh)
|
||||
}()
|
||||
// WriteSeriesResponse must consume all the data from resultsCh.
|
||||
WriteSeriesResponse(bw, resultsCh)
|
||||
if err := bw.Flush(); err != nil {
|
||||
return err
|
||||
}
|
||||
seriesDuration.UpdateDuration(startTime)
|
||||
return nil
|
||||
}
|
||||
rss, err := netstorage.ProcessSearchQuery(sq, false, deadline)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot fetch data for %q: %w", sq, err)
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
resultsCh := make(chan *quicktemplate.ByteBuffer)
|
||||
@@ -980,7 +1042,7 @@ func QueryHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) e
|
||||
}
|
||||
}
|
||||
|
||||
-w.Header().Set("Content-Type", "application/json")
+w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteQueryResponse(bw, result)
|
||||
@@ -1079,7 +1141,7 @@ func queryRangeHandler(startTime time.Time, w http.ResponseWriter, query string,
|
||||
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/153
|
||||
result = removeEmptyValuesAndTimeseries(result)
|
||||
|
||||
-w.Header().Set("Content-Type", "application/json")
+w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
WriteQueryRangeResponse(bw, result)
|
||||
@@ -1189,3 +1251,35 @@ func getLatencyOffsetMilliseconds() int64 {
|
||||
}
|
||||
return d
|
||||
}
|
||||
|
||||
// QueryStatsHandler returns query stats at `/api/v1/status/top_queries`
|
||||
func QueryStatsHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
|
||||
if err := r.ParseForm(); err != nil {
|
||||
return fmt.Errorf("cannot parse form values: %w", err)
|
||||
}
|
||||
topN := 20
|
||||
topNStr := r.FormValue("topN")
|
||||
if len(topNStr) > 0 {
|
||||
n, err := strconv.Atoi(topNStr)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot parse `topN` arg %q: %w", topNStr, err)
|
||||
}
|
||||
topN = n
|
||||
}
|
||||
maxLifetimeMsecs, err := searchutils.GetDuration(r, "maxLifetime", 10*60*1000)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cannot parse `maxLifetime` arg: %w", err)
|
||||
}
|
||||
maxLifetime := time.Duration(maxLifetimeMsecs) * time.Millisecond
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
bw := bufferedwriter.Get(w)
|
||||
defer bufferedwriter.Put(bw)
|
||||
querystats.WriteJSONQueryStats(bw, topN, maxLifetime)
|
||||
if err := bw.Flush(); err != nil {
|
||||
return err
|
||||
}
|
||||
queryStatsDuration.UpdateDuration(startTime)
|
||||
return nil
|
||||
}
|
||||
|
||||
var queryStatsDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/status/top_queries"}`)
|
||||
|
||||
@@ -69,7 +69,9 @@ func newAggrFunc(afe func(tss []*timeseries) []*timeseries) aggrFunc {
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
-return aggrFuncExt(afe, tss, &afa.ae.Modifier, afa.ae.Limit, false)
+return aggrFuncExt(func(tss []*timeseries, modififer *metricsql.ModifierExpr) []*timeseries {
+return afe(tss)
+}, tss, &afa.ae.Modifier, afa.ae.Limit, false)
|
||||
}
|
||||
}
|
||||
|
||||
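The change to `newAggrFunc` wraps an existing one-argument aggregate in a closure that accepts, and ignores, the new modifier argument. A minimal sketch of that adapter, assuming hypothetical `series` and `modifier` types in place of `timeseries` and `metricsql.ModifierExpr`:

```go
package main

import "fmt"

// series and modifier are hypothetical stand-ins for the package's
// timeseries type and for metricsql.ModifierExpr.
type series struct{ values []float64 }

type modifier struct {
	op   string
	args []string
}

// withModifier adapts a one-argument aggregate to the two-argument shape.
// The modifier is accepted and ignored.
func withModifier(f func([]*series) []*series) func([]*series, *modifier) []*series {
	return func(tss []*series, _ *modifier) []*series {
		return f(tss)
	}
}

// firstOnly keeps the first series of a group.
func firstOnly(tss []*series) []*series { return tss[:1] }

func main() {
	group := []*series{{values: []float64{1, 2}}, {values: []float64{3, 4}}}
	afe := withModifier(firstOnly)
	out := afe(group, &modifier{op: "by", args: []string{"job"}})
	fmt.Println(len(out), out[0].values)
}
```

Adapting at the call site keeps the existing aggregates unchanged while letting callbacks that need the modifier receive it.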
@@ -98,7 +100,8 @@ func removeGroupTags(metricName *storage.MetricName, modifier *metricsql.Modifie
|
||||
}
|
||||
}
|
||||
|
||||
-func aggrFuncExt(afe func(tss []*timeseries) []*timeseries, argOrig []*timeseries, modifier *metricsql.ModifierExpr, maxSeries int, keepOriginal bool) ([]*timeseries, error) {
+func aggrFuncExt(afe func(tss []*timeseries, modifier *metricsql.ModifierExpr) []*timeseries, argOrig []*timeseries,
+modifier *metricsql.ModifierExpr, maxSeries int, keepOriginal bool) ([]*timeseries, error) {
|
||||
arg := copyTimeseriesMetricNames(argOrig, keepOriginal)
|
||||
|
||||
// Perform grouping.
|
||||
@@ -124,7 +127,7 @@ func aggrFuncExt(afe func(tss []*timeseries) []*timeseries, argOrig []*timeserie
|
||||
dstTssCount := 0
|
||||
rvs := make([]*timeseries, 0, len(m))
|
||||
for _, tss := range m {
|
||||
-rv := afe(tss)
+rv := afe(tss, modifier)
|
||||
rvs = append(rvs, rv...)
|
||||
srcTssCount += len(tss)
|
||||
dstTssCount += len(rv)
|
||||
@@ -141,7 +144,7 @@ func aggrFuncAny(afa *aggrFuncArg) ([]*timeseries, error) {
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
-afe := func(tss []*timeseries) []*timeseries {
+afe := func(tss []*timeseries, modifier *metricsql.ModifierExpr) []*timeseries {
|
||||
return tss[:1]
|
||||
}
|
||||
limit := afa.ae.Limit
|
||||
@@ -178,10 +181,11 @@ func aggrFuncSum(tss []*timeseries) []*timeseries {
|
||||
sum := float64(0)
|
||||
count := 0
|
||||
for _, ts := range tss {
|
||||
-if math.IsNaN(ts.Values[i]) {
+v := ts.Values[i]
+if math.IsNaN(v) {
continue
}
-sum += ts.Values[i]
+sum += v
|
||||
count++
|
||||
}
|
||||
if count == 0 {
|
||||
@@ -449,7 +453,7 @@ func aggrFuncZScore(afa *aggrFuncArg) ([]*timeseries, error) {
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
-afe := func(tss []*timeseries) []*timeseries {
+afe := func(tss []*timeseries, modifier *metricsql.ModifierExpr) []*timeseries {
|
||||
for i := range tss[0].Values {
|
||||
// Calculate avg and stddev for tss points at position i.
|
||||
// See `Rapid calculation methods` at https://en.wikipedia.org/wiki/Standard_deviation
|
||||
@@ -550,7 +554,7 @@ func aggrFuncCountValues(afa *aggrFuncArg) ([]*timeseries, error) {
|
||||
// Do nothing
|
||||
}
|
||||
|
||||
-afe := func(tss []*timeseries) []*timeseries {
+afe := func(tss []*timeseries, modififer *metricsql.ModifierExpr) []*timeseries {
|
||||
m := make(map[float64]bool)
|
||||
for _, ts := range tss {
|
||||
for _, v := range ts.Values {
|
||||
@@ -602,7 +606,7 @@ func newAggrFuncTopK(isReverse bool) aggrFunc {
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
-afe := func(tss []*timeseries) []*timeseries {
+afe := func(tss []*timeseries, modififer *metricsql.ModifierExpr) []*timeseries {
|
||||
for n := range tss[0].Values {
|
||||
sort.Slice(tss, func(i, j int) bool {
|
||||
a := tss[i].Values[n]
|
||||
@@ -623,21 +627,32 @@ func newAggrFuncTopK(isReverse bool) aggrFunc {
|
||||
func newAggrFuncRangeTopK(f func(values []float64) float64, isReverse bool) aggrFunc {
|
||||
return func(afa *aggrFuncArg) ([]*timeseries, error) {
|
||||
args := afa.args
|
||||
if err := expectTransformArgsNum(args, 2); err != nil {
|
||||
return nil, err
|
||||
if len(args) < 2 {
|
||||
return nil, fmt.Errorf(`unexpected number of args; got %d; want at least %d`, len(args), 2)
|
||||
}
|
||||
if len(args) > 3 {
|
||||
return nil, fmt.Errorf(`unexpected number of args; got %d; want no more than %d`, len(args), 3)
|
||||
}
|
||||
ks, err := getScalar(args[0], 0)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
afe := func(tss []*timeseries) []*timeseries {
|
||||
return getRangeTopKTimeseries(tss, ks, f, isReverse)
|
||||
remainingSumTagName := ""
|
||||
if len(args) == 3 {
|
||||
remainingSumTagName, err = getString(args[2], 2)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
}
|
||||
afe := func(tss []*timeseries, modifier *metricsql.ModifierExpr) []*timeseries {
|
||||
return getRangeTopKTimeseries(tss, modifier, ks, remainingSumTagName, f, isReverse)
|
||||
}
|
||||
return aggrFuncExt(afe, args[1], &afa.ae.Modifier, afa.ae.Limit, true)
|
||||
}
|
||||
}
|
||||
|
||||
func getRangeTopKTimeseries(tss []*timeseries, ks []float64, f func(values []float64) float64, isReverse bool) []*timeseries {
|
||||
func getRangeTopKTimeseries(tss []*timeseries, modifier *metricsql.ModifierExpr, ks []float64, remainingSumTagName string,
|
||||
f func(values []float64) float64, isReverse bool) []*timeseries {
|
||||
type tsWithValue struct {
|
||||
ts *timeseries
|
||||
value float64
|
||||
@@ -661,35 +676,74 @@ func getRangeTopKTimeseries(tss []*timeseries, ks []float64, f func(values []flo
|
||||
for i := range maxs {
|
||||
tss[i] = maxs[i].ts
|
||||
}
|
||||
remainingSumTS := getRemainingSumTimeseries(tss, modifier, ks, remainingSumTagName)
|
||||
for i, k := range ks {
|
||||
fillNaNsAtIdx(i, k, tss)
|
||||
}
|
||||
if remainingSumTS != nil {
|
||||
tss = append(tss, remainingSumTS)
|
||||
}
|
||||
return removeNaNs(tss)
|
||||
}
|
||||
|
||||
func getRemainingSumTimeseries(tss []*timeseries, modifier *metricsql.ModifierExpr, ks []float64, remainingSumTagName string) *timeseries {
|
||||
if len(remainingSumTagName) == 0 || len(tss) == 0 {
|
||||
return nil
|
||||
}
|
||||
var dst timeseries
|
||||
dst.CopyFromShallowTimestamps(tss[0])
|
||||
removeGroupTags(&dst.MetricName, modifier)
|
||||
dst.MetricName.RemoveTag(remainingSumTagName)
|
||||
dst.MetricName.AddTag(remainingSumTagName, remainingSumTagName)
|
||||
for i, k := range ks {
|
||||
kn := getIntK(k, len(tss))
|
||||
var sum float64
|
||||
count := 0
|
||||
for _, ts := range tss[:len(tss)-kn] {
|
||||
v := ts.Values[i]
|
||||
if math.IsNaN(v) {
|
||||
continue
|
||||
}
|
||||
sum += v
|
||||
count++
|
||||
}
|
||||
if count == 0 {
|
||||
sum = nan
|
||||
}
|
||||
dst.Values[i] = sum
|
||||
}
|
||||
return &dst
|
||||
}
|
||||
|
||||
func fillNaNsAtIdx(idx int, k float64, tss []*timeseries) {
|
||||
if math.IsNaN(k) {
|
||||
k = 0
|
||||
}
|
||||
kn := int(k)
|
||||
if kn < 0 {
|
||||
kn = 0
|
||||
}
|
||||
if kn > len(tss) {
|
||||
kn = len(tss)
|
||||
}
|
||||
kn := getIntK(k, len(tss))
|
||||
for _, ts := range tss[:len(tss)-kn] {
|
||||
ts.Values[idx] = nan
|
||||
}
|
||||
}
|
||||
|
||||
func minValue(values []float64) float64 {
|
||||
if len(values) == 0 {
|
||||
return nan
|
||||
func getIntK(k float64, kMax int) int {
|
||||
if math.IsNaN(k) {
|
||||
return 0
|
||||
}
|
||||
min := values[0]
|
||||
for _, v := range values[1:] {
|
||||
if v < min {
|
||||
kn := int(k)
|
||||
if kn < 0 {
|
||||
return 0
|
||||
}
|
||||
if kn > kMax {
|
||||
return kMax
|
||||
}
|
||||
return kn
|
||||
}
|
||||
|
||||
func minValue(values []float64) float64 {
|
||||
min := nan
|
||||
for len(values) > 0 && math.IsNaN(min) {
|
||||
min = values[0]
|
||||
values = values[1:]
|
||||
}
|
||||
for _, v := range values {
|
||||
if !math.IsNaN(v) && v < min {
|
||||
min = v
|
||||
}
|
||||
}
|
||||
@@ -697,12 +751,13 @@ func minValue(values []float64) float64 {
|
||||
}
|
||||
|
||||
func maxValue(values []float64) float64 {
|
||||
if len(values) == 0 {
|
||||
return nan
|
||||
max := nan
|
||||
for len(values) > 0 && math.IsNaN(max) {
|
||||
max = values[0]
|
||||
values = values[1:]
|
||||
}
|
||||
max := values[0]
|
||||
for _, v := range values[1:] {
|
||||
if v > max {
|
||||
for _, v := range values {
|
||||
if !math.IsNaN(v) && v > max {
|
||||
max = v
|
||||
}
|
||||
}
|
||||
@@ -746,7 +801,7 @@ func aggrFuncOutliersK(afa *aggrFuncArg) ([]*timeseries, error) {
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
afe := func(tss []*timeseries) []*timeseries {
|
||||
afe := func(tss []*timeseries, modifier *metricsql.ModifierExpr) []*timeseries {
|
||||
// Calculate medians for each point across tss.
|
||||
medians := make([]float64, len(ks))
|
||||
h := histogram.GetFast()
|
||||
@@ -771,7 +826,7 @@ func aggrFuncOutliersK(afa *aggrFuncArg) ([]*timeseries, error) {
|
||||
}
|
||||
return sum2
|
||||
}
|
||||
return getRangeTopKTimeseries(tss, ks, f, false)
|
||||
return getRangeTopKTimeseries(tss, &afa.ae.Modifier, ks, "", f, false)
|
||||
}
|
||||
return aggrFuncExt(afe, args[1], &afa.ae.Modifier, afa.ae.Limit, true)
|
||||
}
|
||||
@@ -792,7 +847,7 @@ func aggrFuncLimitK(afa *aggrFuncArg) ([]*timeseries, error) {
|
||||
maxK = k
|
||||
}
|
||||
}
|
||||
afe := func(tss []*timeseries) []*timeseries {
|
||||
afe := func(tss []*timeseries, modifier *metricsql.ModifierExpr) []*timeseries {
|
||||
if len(tss) > maxK {
|
||||
tss = tss[:maxK]
|
||||
}
|
||||
@@ -833,8 +888,8 @@ func aggrFuncMedian(afa *aggrFuncArg) ([]*timeseries, error) {
|
||||
return aggrFuncExt(afe, tss, &afa.ae.Modifier, afa.ae.Limit, false)
|
||||
}
|
||||
|
||||
func newAggrQuantileFunc(phis []float64) func(tss []*timeseries) []*timeseries {
|
||||
return func(tss []*timeseries) []*timeseries {
|
||||
func newAggrQuantileFunc(phis []float64) func(tss []*timeseries, modifier *metricsql.ModifierExpr) []*timeseries {
|
||||
return func(tss []*timeseries, modifier *metricsql.ModifierExpr) []*timeseries {
|
||||
dst := tss[0]
|
||||
h := histogram.GetFast()
|
||||
defer histogram.PutFast(h)
|
||||
|
||||
@@ -62,6 +62,9 @@ func newBinaryOpCmpFunc(cf func(left, right float64) bool) binaryOpFunc {
|
||||
if cf(left, right) {
|
||||
return 1
|
||||
}
|
||||
if math.IsNaN(left) {
|
||||
return nan
|
||||
}
|
||||
return 0
|
||||
}
|
||||
return newBinaryOpFunc(cfe)
|
||||
|
||||
@@ -4,12 +4,12 @@ import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"math"
|
||||
"runtime"
|
||||
"sync"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
|
||||
@@ -439,12 +439,10 @@ func evalRollupFunc(ec *EvalConfig, name string, rf rollupFunc, expr metricsql.E
|
||||
ecNew = newEvalConfig(ecNew)
|
||||
ecNew.Start -= offset
|
||||
ecNew.End -= offset
|
||||
if ecNew.MayCache {
|
||||
start, end := AdjustStartEnd(ecNew.Start, ecNew.End, ecNew.Step)
|
||||
offset += ecNew.Start - start
|
||||
ecNew.Start = start
|
||||
ecNew.End = end
|
||||
}
|
||||
// There is no need to call AdjustStartEnd() on ecNew if ecNew.MayCache is set to true,
|
||||
// since the time range alignment has already been performed by the caller,
|
||||
// so cache hit rate should be quite good.
|
||||
// See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/976
|
||||
}
|
||||
if name == "rollup_candlestick" {
|
||||
// Automatically apply `offset -step` to `rollup_candlestick` function
|
||||
@@ -555,7 +553,7 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, expr
|
||||
}
|
||||
|
||||
func doParallel(tss []*timeseries, f func(ts *timeseries, values []float64, timestamps []int64) ([]float64, []int64)) {
|
||||
concurrency := runtime.GOMAXPROCS(-1)
|
||||
concurrency := cgroup.AvailableCPUs()
|
||||
if concurrency > len(tss) {
|
||||
concurrency = len(tss)
|
||||
}
|
||||
@@ -653,11 +651,7 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc,
|
||||
} else {
|
||||
minTimestamp -= ec.Step
|
||||
}
|
||||
sq := &storage.SearchQuery{
|
||||
MinTimestamp: minTimestamp,
|
||||
MaxTimestamp: ec.End,
|
||||
TagFilterss: [][]storage.TagFilter{tfs},
|
||||
}
|
||||
sq := storage.NewSearchQuery(minTimestamp, ec.End, [][]storage.TagFilter{tfs})
|
||||
rss, err := netstorage.ProcessSearchQuery(sq, true, ec.Deadline)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
@@ -682,7 +676,7 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc,
|
||||
timeseriesLen := rssLen
|
||||
if iafc != nil {
|
||||
// Incremental aggregates require holding only as many timeseries in memory as there are available CPUs.
|
||||
timeseriesLen = runtime.GOMAXPROCS(-1)
|
||||
timeseriesLen = cgroup.AvailableCPUs()
|
||||
if iafc.ae.Modifier.Op != "" {
|
||||
if iafc.ae.Limit > 0 {
|
||||
// There is an explicit limit on the number of output time series.
|
||||
|
||||
@@ -5,17 +5,25 @@ import (
|
||||
"fmt"
|
||||
"math"
|
||||
"sort"
|
||||
"strings"
|
||||
"sync"
|
||||
"sync/atomic"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/querystats"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
"github.com/VictoriaMetrics/metricsql"
|
||||
)
|
||||
|
||||
var logSlowQueryDuration = flag.Duration("search.logSlowQueryDuration", 5*time.Second, "Log queries with execution time exceeding this value. Zero disables slow query logging")
|
||||
var (
|
||||
logSlowQueryDuration = flag.Duration("search.logSlowQueryDuration", 5*time.Second, "Log queries with execution time exceeding this value. Zero disables slow query logging")
|
||||
treatDotsAsIsInRegexps = flag.Bool("search.treatDotsAsIsInRegexps", false, "Whether to treat dots as is in regexp label filters used in queries. "+
|
||||
`For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped `+
|
||||
`in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. `+
|
||||
`Such escaping can be useful when querying Graphite data`)
|
||||
)
|
||||
|
||||
var slowQueries = metrics.NewCounter(`vm_slow_queries_total`)
|
||||
|
||||
@@ -26,12 +34,16 @@ func Exec(ec *EvalConfig, q string, isFirstPointOnly bool) ([]netstorage.Result,
|
||||
defer func() {
|
||||
d := time.Since(startTime)
|
||||
if d >= *logSlowQueryDuration {
|
||||
logger.Warnf("slow query according to -search.logSlowQueryDuration=%s: duration=%.3f seconds, start=%d, end=%d, step=%d, query=%q",
|
||||
*logSlowQueryDuration, d.Seconds(), ec.Start/1000, ec.End/1000, ec.Step/1000, q)
|
||||
logger.Warnf("slow query according to -search.logSlowQueryDuration=%s: remoteAddr=%s, duration=%.3f seconds, start=%d, end=%d, step=%d, query=%q",
|
||||
*logSlowQueryDuration, ec.QuotedRemoteAddr, d.Seconds(), ec.Start/1000, ec.End/1000, ec.Step/1000, q)
|
||||
slowQueries.Inc()
|
||||
}
|
||||
}()
|
||||
}
|
||||
if querystats.Enabled() {
|
||||
startTime := time.Now()
|
||||
defer querystats.RegisterQuery(q, ec.End-ec.Start, startTime)
|
||||
}
|
||||
|
||||
ec.validate()
|
||||
|
||||
@@ -142,11 +154,11 @@ func adjustCmpOps(e metricsql.Expr) metricsql.Expr {
|
||||
if !metricsql.IsBinaryOpCmp(be.Op) {
|
||||
return
|
||||
}
|
||||
if _, ok := be.Left.(*metricsql.NumberExpr); !ok {
|
||||
if isNumberExpr(be.Right) || !isScalarExpr(be.Left) {
|
||||
return
|
||||
}
|
||||
// Convert 'num cmpOp query' expression to `query reverseCmpOp num` expression
|
||||
// like Prometheus does. For isntance, `0.5 < foo` must be converted to `foo > 0.5`
|
||||
// like Prometheus does. For instance, `0.5 < foo` must be converted to `foo > 0.5`
|
||||
// in order to return valid values for `foo` that are bigger than 0.5.
|
||||
be.Right, be.Left = be.Left, be.Right
|
||||
be.Op = getReverseCmpOp(be.Op)
|
||||
@@ -154,6 +166,22 @@ func adjustCmpOps(e metricsql.Expr) metricsql.Expr {
|
||||
return e
|
||||
}
|
||||
|
||||
func isNumberExpr(e metricsql.Expr) bool {
|
||||
_, ok := e.(*metricsql.NumberExpr)
|
||||
return ok
|
||||
}
|
||||
|
||||
func isScalarExpr(e metricsql.Expr) bool {
|
||||
if isNumberExpr(e) {
|
||||
return true
|
||||
}
|
||||
if fe, ok := e.(*metricsql.FuncExpr); ok {
|
||||
// time() returns scalar in PromQL - see https://prometheus.io/docs/prometheus/latest/querying/functions/#time
|
||||
return strings.ToLower(fe.Name) == "time"
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func getReverseCmpOp(op string) string {
|
||||
switch op {
|
||||
case ">":
|
||||
@@ -177,6 +205,9 @@ func parsePromQLWithCache(q string) (metricsql.Expr, error) {
|
||||
if err == nil {
|
||||
e = metricsql.Optimize(e)
|
||||
e = adjustCmpOps(e)
|
||||
if *treatDotsAsIsInRegexps {
|
||||
e = escapeDotsInRegexpLabelFilters(e)
|
||||
}
|
||||
}
|
||||
pcv = &parseCacheValue{
|
||||
e: e,
|
||||
@@ -190,6 +221,41 @@ func parsePromQLWithCache(q string) (metricsql.Expr, error) {
|
||||
return pcv.e, nil
|
||||
}
|
||||
|
||||
func escapeDotsInRegexpLabelFilters(e metricsql.Expr) metricsql.Expr {
|
||||
metricsql.VisitAll(e, func(expr metricsql.Expr) {
|
||||
me, ok := expr.(*metricsql.MetricExpr)
|
||||
if !ok {
|
||||
return
|
||||
}
|
||||
for i := range me.LabelFilters {
|
||||
f := &me.LabelFilters[i]
|
||||
if f.IsRegexp {
|
||||
f.Value = escapeDots(f.Value)
|
||||
}
|
||||
}
|
||||
})
|
||||
return e
|
||||
}
|
||||
|
||||
func escapeDots(s string) string {
|
||||
dotsCount := strings.Count(s, ".")
|
||||
if dotsCount <= 0 {
|
||||
return s
|
||||
}
|
||||
result := make([]byte, 0, len(s)+2*dotsCount)
|
||||
for i := 0; i < len(s); i++ {
|
||||
if s[i] == '.' && (i == 0 || s[i-1] != '\\') && (i+1 == len(s) || i+1 < len(s) && s[i+1] != '*' && s[i+1] != '+' && s[i+1] != '{') {
|
||||
// Escape a dot if the following conditions are met:
|
||||
// - if it isn't escaped already, i.e. if there is no `\` char before the dot.
|
||||
// - if there is no regexp modifiers such as '+', '*' or '{' after the dot.
|
||||
result = append(result, '\\', '.')
|
||||
} else {
|
||||
result = append(result, s[i])
|
||||
}
|
||||
}
|
||||
return string(result)
|
||||
}
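To see why this escaping matters for Graphite-style names, the function can be run against Go's `regexp` package. In the sketch below `escapeDots` is copied from the diff above, while `fullMatch` is a hypothetical helper that anchors the pattern on both sides. That anchoring is an assumption made for the demonstration, not something this diff shows.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// escapeDots is copied from the diff above: it escapes every dot that is not
// already escaped and is not followed by '*', '+' or '{'.
func escapeDots(s string) string {
	dotsCount := strings.Count(s, ".")
	if dotsCount <= 0 {
		return s
	}
	result := make([]byte, 0, len(s)+2*dotsCount)
	for i := 0; i < len(s); i++ {
		if s[i] == '.' && (i == 0 || s[i-1] != '\\') && (i+1 == len(s) || i+1 < len(s) && s[i+1] != '*' && s[i+1] != '+' && s[i+1] != '{') {
			result = append(result, '\\', '.')
		} else {
			result = append(result, s[i])
		}
	}
	return string(result)
}

// fullMatch (hypothetical helper) reports whether pattern matches the whole string s.
func fullMatch(pattern, s string) bool {
	return regexp.MustCompile("^(?:" + pattern + ")$").MatchString(s)
}

func main() {
	raw := "a.b.c"
	escaped := escapeDots(raw)
	fmt.Println(escaped)
	// The unescaped pattern also matches names where the dots are other characters.
	fmt.Println(fullMatch(raw, "axbxc"), fullMatch(escaped, "axbxc"), fullMatch(escaped, "a.b.c"))
}
```

Without escaping, `a.b.c` also matches `axbxc`; after escaping, only the literal dotted name matches, which is the behavior the `-search.treatDotsAsIsInRegexps` flag description promises.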
|
||||
|
||||
var parseCacheV = func() *parseCache {
|
||||
pc := &parseCache{
|
||||
m: make(map[string]*parseCacheValue),
|
||||
|
||||
@@ -7,8 +7,46 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
|
||||
"github.com/VictoriaMetrics/metricsql"
|
||||
)
|
||||
|
||||
func TestEscapeDots(t *testing.T) {
|
||||
f := func(s, resultExpected string) {
|
||||
t.Helper()
|
||||
result := escapeDots(s)
|
||||
if result != resultExpected {
|
||||
t.Fatalf("unexpected result for escapeDots(%q); got\n%s\nwant\n%s", s, result, resultExpected)
|
||||
}
|
||||
}
|
||||
f("", "")
|
||||
f("a", "a")
|
||||
f("foobar", "foobar")
|
||||
f(".", `\.`)
|
||||
f(".*", `.*`)
|
||||
f(".+", `.+`)
|
||||
f("..", `\.\.`)
|
||||
f("foo.b.{2}ar..+baz.*", `foo\.b.{2}ar\..+baz.*`)
|
||||
}
|
||||
|
||||
func TestEscapeDotsInRegexpLabelFilters(t *testing.T) {
|
||||
f := func(s, resultExpected string) {
|
||||
t.Helper()
|
||||
e, err := metricsql.Parse(s)
|
||||
if err != nil {
|
||||
t.Fatalf("unexpected error in metricsql.Parse(%q): %s", s, err)
|
||||
}
|
||||
e = escapeDotsInRegexpLabelFilters(e)
|
||||
result := e.AppendString(nil)
|
||||
if string(result) != resultExpected {
|
||||
t.Fatalf("unexpected result for escapeDotsInRegexpLabelFilters(%q);\ngot\n%s\nwant\n%s", s, result, resultExpected)
|
||||
}
|
||||
}
|
||||
f("2", "2")
|
||||
f(`foo.bar + 123`, `foo.bar + 123`)
|
||||
f(`foo{bar=~"baz.xx.yyy"}`, `foo{bar=~"baz\\.xx\\.yyy"}`)
|
||||
f(`foo(a.b{c="d.e",x=~"a.b.+[.a]",y!~"aaa.bb|cc.dd"}) + x.y(1,sum({x=~"aa.bb"}))`, `foo(a.b{c="d.e", x=~"a\\.b.+[\\.a]", y!~"aaa\\.bb|cc\\.dd"}) + x.y(1, sum({x=~"aa\\.bb"}))`)
|
||||
}
|
||||
|
||||
func TestExecSuccess(t *testing.T) {
|
||||
start := int64(1000e3)
|
||||
end := int64(2000e3)
|
||||
@@ -150,12 +188,23 @@ func TestExecSuccess(t *testing.T) {
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run("time() offset 1m40s0ms", func(t *testing.T) {
|
||||
t.Run("time() offset 1h40s0ms", func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `time() offset 100s`
|
||||
q := `time() offset 1h40s0ms`
|
||||
r := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{800, 1000, 1200, 1400, 1600, 1800},
|
||||
Values: []float64{-2800, -2600, -2400, -2200, -2000, -1800},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run("time() offset -1h40s0ms", func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `time() offset -1h40s0ms`
|
||||
r := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{4600, 4800, 5000, 5200, 5400, 5600},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
resultExpected := []netstorage.Result{r}
|
||||
@@ -488,6 +537,17 @@ func TestExecSuccess(t *testing.T) {
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`minute(series_with_NaNs)`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `minute(time() <= 1200 or time() > 1600)`
|
||||
r := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{16, 20, nan, nan, 30, 33},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run("rate({})", func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `rate({})`
|
||||
@@ -1041,6 +1101,62 @@ func TestExecSuccess(t *testing.T) {
|
||||
resultExpected := []netstorage.Result{r1, r2, r3, r4, r5}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`label_uppercase`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `label_uppercase(
|
||||
label_set(time(), "foo", "bAr", "XXx", "yyy", "zzz", "abc"),
|
||||
"foo", "XXx", "aaa"
|
||||
)`
|
||||
r := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{1000, 1200, 1400, 1600, 1800, 2000},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r.MetricName.Tags = []storage.Tag{
|
||||
{
|
||||
Key: []byte("XXx"),
|
||||
Value: []byte("YYY"),
|
||||
},
|
||||
{
|
||||
Key: []byte("foo"),
|
||||
Value: []byte("BAR"),
|
||||
},
|
||||
{
|
||||
Key: []byte("zzz"),
|
||||
Value: []byte("abc"),
|
||||
},
|
||||
}
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`label_lowercase`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `label_lowercase(
|
||||
label_set(time(), "foo", "bAr", "XXx", "yyy", "zzz", "aBc"),
|
||||
"foo", "XXx", "aaa"
|
||||
)`
|
||||
r := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{1000, 1200, 1400, 1600, 1800, 2000},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r.MetricName.Tags = []storage.Tag{
|
||||
{
|
||||
Key: []byte("XXx"),
|
||||
Value: []byte("yyy"),
|
||||
},
|
||||
{
|
||||
Key: []byte("foo"),
|
||||
Value: []byte("bar"),
|
||||
},
|
||||
{
|
||||
Key: []byte("zzz"),
|
||||
Value: []byte("aBc"),
|
||||
},
|
||||
}
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`label_copy(new_tag)`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `label_copy(
|
||||
@@ -1721,6 +1837,45 @@ func TestExecSuccess(t *testing.T) {
|
||||
resultExpected := []netstorage.Result{r1, r2}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`sort_by_label(multiple_labels)`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `sort_by_label((
|
||||
label_set(1, "x", "b", "y", "aa"),
|
||||
label_set(2, "x", "a", "y", "aa"),
|
||||
), "y", "x")`
|
||||
r1 := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{2, 2, 2, 2, 2, 2},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r1.MetricName.Tags = []storage.Tag{
|
||||
{
|
||||
Key: []byte("x"),
|
||||
Value: []byte("a"),
|
||||
},
|
||||
{
|
||||
Key: []byte("y"),
|
||||
Value: []byte("aa"),
|
||||
},
|
||||
}
|
||||
r2 := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{1, 1, 1, 1, 1, 1},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r2.MetricName.Tags = []storage.Tag{
|
||||
{
|
||||
Key: []byte("x"),
|
||||
Value: []byte("b"),
|
||||
},
|
||||
{
|
||||
Key: []byte("y"),
|
||||
Value: []byte("aa"),
|
||||
},
|
||||
}
|
||||
resultExpected := []netstorage.Result{r1, r2}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`scalar < time()`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `123 < time()`
|
||||
@@ -1734,10 +1889,32 @@ func TestExecSuccess(t *testing.T) {
|
||||
})
|
||||
t.Run(`time() > scalar`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `time() > 123`
|
||||
q := `time() > 1234`
|
||||
r := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{1000, 1200, 1400, 1600, 1800, 2000},
|
||||
Values: []float64{nan, nan, 1400, 1600, 1800, 2000},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`time() >bool scalar`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `time() >bool 1234`
|
||||
r := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{0, 0, 1, 1, 1, 1},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`nan >bool scalar1`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `(time() > 1234) >bool 1450`
|
||||
r := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{nan, nan, 0, 1, 1, 1},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
resultExpected := []netstorage.Result{r}
|
||||
@@ -4008,6 +4185,28 @@ func TestExecSuccess(t *testing.T) {
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`count_eq_over_time`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `count_eq_over_time(round(5*rand(0))[200s:10s], 1)`
|
||||
r := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{2, 4, 5, 2, 6, 6},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`count_ne_over_time`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `count_ne_over_time(round(5*rand(0))[200s:10s], 1)`
|
||||
r := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{18, 16, 15, 18, 14, 14},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
resultExpected := []netstorage.Result{r}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`increases_over_time`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `increases_over_time(rand(0)[200s:10s])`
|
||||
@@ -4193,7 +4392,7 @@ func TestExecSuccess(t *testing.T) {
|
||||
})
|
||||
t.Run(`topk_max(1)`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `sort(topk_max(1, label_set(10, "foo", "bar") or label_set(time()/150, "baz", "sss")))`
|
||||
q := `topk_max(1, label_set(10, "foo", "bar") or label_set(time()/150, "baz", "sss"))`
|
||||
r1 := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{nan, nan, nan, 10.666666666666666, 12, 13.333333333333334},
|
||||
@@ -4206,6 +4405,84 @@ func TestExecSuccess(t *testing.T) {
|
||||
resultExpected := []netstorage.Result{r1}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`topk_max(1, remaining_sum)`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `sort_desc(topk_max(1, label_set(10, "foo", "bar") or label_set(time()/150, "baz", "sss"), "remaining_sum"))`
|
||||
r1 := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{nan, nan, nan, 10.666666666666666, 12, 13.333333333333334},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r1.MetricName.Tags = []storage.Tag{{
|
||||
Key: []byte("baz"),
|
||||
Value: []byte("sss"),
|
||||
}}
|
||||
r2 := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{10, 10, 10, 10, 10, 10},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r2.MetricName.Tags = []storage.Tag{
|
||||
{
|
||||
Key: []byte("remaining_sum"),
|
||||
Value: []byte("remaining_sum"),
|
||||
},
|
||||
}
|
||||
resultExpected := []netstorage.Result{r1, r2}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`topk_max(2, remaining_sum)`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `sort_desc(topk_max(2, label_set(10, "foo", "bar") or label_set(time()/150, "baz", "sss"), "remaining_sum"))`
|
||||
r1 := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{nan, nan, nan, 10.666666666666666, 12, 13.333333333333334},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r1.MetricName.Tags = []storage.Tag{{
|
||||
Key: []byte("baz"),
|
||||
Value: []byte("sss"),
|
||||
}}
|
||||
r2 := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{10, 10, 10, 10, 10, 10},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r2.MetricName.Tags = []storage.Tag{
|
||||
{
|
||||
Key: []byte("foo"),
|
||||
Value: []byte("bar"),
|
||||
},
|
||||
}
|
||||
resultExpected := []netstorage.Result{r1, r2}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`topk_max(3, remaining_sum)`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `sort_desc(topk_max(3, label_set(10, "foo", "bar") or label_set(time()/150, "baz", "sss"), "remaining_sum"))`
|
||||
r1 := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{nan, nan, nan, 10.666666666666666, 12, 13.333333333333334},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r1.MetricName.Tags = []storage.Tag{{
|
||||
Key: []byte("baz"),
|
||||
Value: []byte("sss"),
|
||||
}}
|
||||
r2 := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{10, 10, 10, 10, 10, 10},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r2.MetricName.Tags = []storage.Tag{
|
||||
{
|
||||
Key: []byte("foo"),
|
||||
Value: []byte("bar"),
|
||||
},
|
||||
}
|
||||
resultExpected := []netstorage.Result{r1, r2}
|
||||
f(q, resultExpected)
|
||||
})
|
||||
t.Run(`bottomk_max(1)`, func(t *testing.T) {
|
||||
t.Parallel()
|
||||
q := `sort(bottomk_max(1, label_set(10, "foo", "bar") or label_set(time()/150, "baz", "sss")))`
|
||||
@@ -6032,6 +6309,8 @@ func TestExecError(t *testing.T) {
|
||||
f(`share_gt_over_time()`)
|
||||
f(`count_le_over_time()`)
|
||||
f(`count_gt_over_time()`)
|
||||
f(`count_eq_over_time()`)
|
||||
f(`count_ne_over_time()`)
|
||||
|
||||
// Invalid argument type
|
||||
f(`median_over_time({}, 2)`)
|
||||
@@ -6068,6 +6347,8 @@ func TestExecError(t *testing.T) {
|
||||
f(`label_transform(1, "foo", "invalid(regexp", "baz`)
|
||||
f(`label_match(1, 2, 3)`)
|
||||
f(`label_mismatch(1, 2, 3)`)
|
||||
f(`label_uppercase()`)
|
||||
f(`label_lowercase()`)
|
||||
f(`alias(1, 2)`)
|
||||
f(`aggr_over_time(1, 2)`)
|
||||
f(`aggr_over_time(("foo", "bar"), 3)`)
|
||||
|
||||
@@ -24,6 +24,7 @@ func TestParseMetricSelectorSuccess(t *testing.T) {
|
||||
f(`foo {bar != "baz"}`)
|
||||
f(` foo { bar !~ "^ddd(x+)$", a="ss", __name__="sffd"} `)
|
||||
f(`(foo)`)
|
||||
f(`\п\р\и\в\е\т{\ы="111"}`)
|
||||
}
|
||||
|
||||
func TestParseMetricSelectorError(t *testing.T) {
|
||||
|
||||
@@ -15,7 +15,7 @@ import (
|
||||
"github.com/valyala/histogram"
|
||||
)
|
||||
|
||||
var minStalenessInterval = flag.Duration("search.minStalenessInterval", 0, "The mimimum interval for staleness calculations. "+
|
||||
var minStalenessInterval = flag.Duration("search.minStalenessInterval", 0, "The minimum interval for staleness calculations. "+
|
||||
"This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. "+
|
||||
"See also '-search.maxStalenessInterval'")
|
||||
|
||||
@@ -60,10 +60,14 @@ var rollupFuncs = map[string]newRollupFunc{
|
||||
"scrape_interval": newRollupFuncOneArg(rollupScrapeInterval),
|
||||
"tmin_over_time": newRollupFuncOneArg(rollupTmin),
|
||||
"tmax_over_time": newRollupFuncOneArg(rollupTmax),
|
||||
"tfirst_over_time": newRollupFuncOneArg(rollupTfirst),
|
||||
"tlast_over_time": newRollupFuncOneArg(rollupTlast),
|
||||
"share_le_over_time": newRollupShareLE,
|
||||
"share_gt_over_time": newRollupShareGT,
|
||||
"count_le_over_time": newRollupCountLE,
|
||||
"count_gt_over_time": newRollupCountGT,
|
||||
"count_eq_over_time": newRollupCountEQ,
|
||||
"count_ne_over_time": newRollupCountNE,
|
||||
"histogram_over_time": newRollupFuncOneArg(rollupHistogram),
|
||||
"rollup": newRollupFuncOneArg(rollupFake),
|
||||
"rollup_rate": newRollupFuncOneArg(rollupFake), // + rollupFuncsRemoveCounterResets
|
||||
@@ -81,7 +85,7 @@ var rollupFuncs = map[string]newRollupFunc{
|
||||
// `timestamp` function must return timestamp for the last datapoint on the current window
|
||||
// in order to properly handle offset and timestamps unaligned to the current step.
|
||||
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/415 for details.
|
||||
"timestamp": newRollupFuncOneArg(rollupTimestamp),
|
||||
"timestamp": newRollupFuncOneArg(rollupTlast),
|
||||
|
||||
// See https://en.wikipedia.org/wiki/Mode_(statistics)
|
||||
"mode_over_time": newRollupFuncOneArg(rollupModeOverTime),
|
||||
@@ -126,10 +130,12 @@ var rollupAggrFuncs = map[string]rollupFunc{
|
||||
"scrape_interval": rollupScrapeInterval,
|
||||
"tmin_over_time": rollupTmin,
|
||||
"tmax_over_time": rollupTmax,
|
||||
"tfirst_over_time": rollupTfirst,
|
||||
"tlast_over_time": rollupTlast,
|
||||
"ascent_over_time": rollupAscentOverTime,
|
||||
"descent_over_time": rollupDescentOverTime,
|
||||
"zscore_over_time": rollupZScoreOverTime,
|
||||
"timestamp": rollupTimestamp,
|
||||
"timestamp": rollupTlast,
|
||||
"mode_over_time": rollupModeOverTime,
|
||||
"rate_over_sum": rollupRateOverSum,
|
||||
}
|
||||
@@ -258,15 +264,16 @@ func getRollupConfigs(name string, rf rollupFunc, expr metricsql.Expr, start, en
|
||||
}
|
||||
newRollupConfig := func(rf rollupFunc, tagValue string) *rollupConfig {
|
||||
return &rollupConfig{
|
||||
TagValue: tagValue,
|
||||
Func: rf,
|
||||
Start: start,
|
||||
End: end,
|
||||
Step: step,
|
||||
Window: window,
|
||||
MayAdjustWindow: !rollupFuncsCannotAdjustWindow[name],
|
||||
LookbackDelta: lookbackDelta,
|
||||
Timestamps: sharedTimestamps,
|
||||
TagValue: tagValue,
|
||||
Func: rf,
|
||||
Start: start,
|
||||
End: end,
|
||||
Step: step,
|
||||
Window: window,
|
||||
MayAdjustWindow: !rollupFuncsCannotAdjustWindow[name],
|
||||
CanDropLastSample: name == "default_rollup",
|
||||
LookbackDelta: lookbackDelta,
|
||||
Timestamps: sharedTimestamps,
|
||||
}
|
||||
}
|
||||
appendRollupConfigs := func(dst []*rollupConfig) []*rollupConfig {
|
||||
@@ -325,14 +332,32 @@ func getRollupFunc(funcName string) newRollupFunc {
|
||||
}
|
||||
|
||||
type rollupFuncArg struct {
|
||||
prevValue float64
|
||||
prevTimestamp int64
|
||||
values []float64
|
||||
timestamps []int64
|
||||
// The value preceding values if it fits staleness interval.
|
||||
prevValue float64
|
||||
|
||||
// The timestamp for prevValue.
|
||||
prevTimestamp int64
|
||||
|
||||
// Values that fit window ending at currTimestamp.
|
||||
values []float64
|
||||
|
||||
// Timestamps for values.
|
||||
timestamps []int64
|
||||
|
||||
// Real value preceding values without restrictions on staleness interval.
|
||||
realPrevValue float64
|
||||
|
||||
// Real value which goes after values.
|
||||
realNextValue float64
|
||||
|
||||
// Current timestamp for rollup evaluation.
|
||||
currTimestamp int64
|
||||
idx int
|
||||
window int64
|
||||
|
||||
// Index for the currently evaluated point relative to time range for query evaluation.
|
||||
idx int
|
||||
|
||||
// Time window for rollup calculations.
|
||||
window int64
|
||||
|
||||
tsm *timeseriesMap
|
||||
}
|
||||
@@ -370,6 +395,11 @@ type rollupConfig struct {
|
||||
// when using window smaller than 2 x scrape_interval.
|
||||
MayAdjustWindow bool
|
||||
|
||||
// Whether the last sample can be dropped during rollup calculations.
|
||||
// The last sample can be dropped for `default_rollup()` function only.
|
||||
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748 .
|
||||
CanDropLastSample bool
|
||||
|
||||
Timestamps []int64
|
||||
|
||||
// LookbackDelta is the analog to `-query.lookback-delta` from Prometheus world.
|
||||
@@ -482,8 +512,8 @@ func (rc *rollupConfig) doInternal(dstValues []float64, tsm *timeseriesMap, valu
|
||||
window := rc.Window
|
||||
if window <= 0 {
|
||||
window = rc.Step
|
||||
if rc.LookbackDelta > 0 && window > rc.LookbackDelta {
|
||||
// Implicitly set window exceeds -search.maxStalenessInterval, so limit it to -search.maxStalenessInterval
|
||||
if rc.CanDropLastSample && rc.LookbackDelta > 0 && window > rc.LookbackDelta {
|
||||
// Implicitly window exceeds -search.maxStalenessInterval, so limit it to -search.maxStalenessInterval
|
||||
// according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/784
|
||||
window = rc.LookbackDelta
|
||||
}
|
||||
@@ -501,6 +531,9 @@ func (rc *rollupConfig) doInternal(dstValues []float64, tsm *timeseriesMap, valu
|
||||
ni := 0
|
||||
nj := 0
|
||||
stalenessInterval := int64(float64(scrapeInterval) * 0.9)
|
||||
// Do not drop trailing data points for queries, which return 2 or 1 point (aka instant queries).
|
||||
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
|
||||
canDropLastSample := rc.CanDropLastSample && len(rc.Timestamps) > 2
|
||||
for _, tEnd := range rc.Timestamps {
|
||||
tStart := tEnd - window
|
||||
ni = seekFirstTimestampIdxAfter(timestamps[i:], tStart, ni)
|
||||
@@ -519,15 +552,26 @@ func (rc *rollupConfig) doInternal(dstValues []float64, tsm *timeseriesMap, valu
|
||||
}
|
||||
rfa.values = values[i:j]
|
||||
rfa.timestamps = timestamps[i:j]
|
||||
if j == len(timestamps) && i < j && tEnd-timestamps[j-1] > stalenessInterval {
|
||||
// Do not take into account the last data point in time series if the distance between this data point
|
||||
// and tEnd exceeds stalenessInterval.
|
||||
if canDropLastSample && j == len(timestamps) && j > 0 && (tEnd-timestamps[j-1] > stalenessInterval || i == j && len(timestamps) == 1) {
|
||||
// Drop trailing data points in the following cases:
|
||||
// - if the distance between the last raw sample and tEnd exceeds stalenessInterval
|
||||
// - if time series contains only a single raw sample
|
||||
// This should prevent from double counting when a label changes in time series (for instance,
|
||||
// during new deployment in K8S). See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
|
||||
rfa.prevValue = nan
|
||||
rfa.values = nil
|
||||
rfa.timestamps = nil
|
||||
}
|
||||
if i > 0 {
|
||||
rfa.realPrevValue = values[i-1]
|
||||
} else {
|
||||
rfa.realPrevValue = nan
|
||||
}
|
||||
if j < len(values) {
|
||||
rfa.realNextValue = values[j]
|
||||
} else {
|
||||
rfa.realNextValue = nan
|
||||
}
|
||||
rfa.currTimestamp = tEnd
|
||||
value := rc.Func(rfa)
|
||||
rfa.idx++
|
||||
@@ -857,6 +901,26 @@ func countFilterGT(values []float64, gt float64) int {
|
||||
return n
|
||||
}
|
||||
|
||||
func countFilterEQ(values []float64, eq float64) int {
|
||||
n := 0
|
||||
for _, v := range values {
|
||||
if v == eq {
|
||||
n++
|
||||
}
|
||||
}
|
||||
return n
|
||||
}
|
||||
|
||||
func countFilterNE(values []float64, ne float64) int {
|
||||
n := 0
|
||||
for _, v := range values {
|
||||
if v != ne {
|
||||
n++
|
||||
}
|
||||
}
|
||||
return n
|
||||
}
|
||||
|
||||
func newRollupShareFilter(args []interface{}, countFilter func(values []float64, limit float64) int) (rollupFunc, error) {
|
||||
rf, err := newRollupCountFilter(args, countFilter)
|
||||
if err != nil {
|
||||
@@ -876,6 +940,14 @@ func newRollupCountGT(args []interface{}) (rollupFunc, error) {
|
||||
return newRollupCountFilter(args, countFilterGT)
|
||||
}
|
||||
|
||||
func newRollupCountEQ(args []interface{}) (rollupFunc, error) {
|
||||
return newRollupCountFilter(args, countFilterEQ)
|
||||
}
|
||||
|
||||
func newRollupCountNE(args []interface{}) (rollupFunc, error) {
|
||||
return newRollupCountFilter(args, countFilterNE)
|
||||
}
|
||||
|
||||
func newRollupCountFilter(args []interface{}, countFilter func(values []float64, limit float64) int) (rollupFunc, error) {
|
||||
if err := expectRollupArgsNum(args, 2); err != nil {
|
||||
return nil, err
|
||||
@@ -1099,15 +1171,41 @@ func rollupTmax(rfa *rollupFuncArg) float64 {
|
||||
return float64(maxTimestamp) / 1e3
|
||||
}
|
||||
|
||||
func rollupTfirst(rfa *rollupFuncArg) float64 {
|
||||
// There is no need in handling NaNs here, since they must be cleaned up
|
||||
// before calling rollup funcs.
|
||||
timestamps := rfa.timestamps
|
||||
if len(timestamps) == 0 {
|
||||
// Do not take into account rfa.prevTimestamp, since it may lead
|
||||
// to inconsistent results comparing to Prometheus on broken time series
|
||||
// with irregular data points.
|
||||
return nan
|
||||
}
|
||||
return float64(timestamps[0]) / 1e3
|
||||
}
|
||||
|
||||
func rollupTlast(rfa *rollupFuncArg) float64 {
|
||||
// There is no need in handling NaNs here, since they must be cleaned up
|
||||
// before calling rollup funcs.
|
||||
timestamps := rfa.timestamps
|
||||
if len(timestamps) == 0 {
|
||||
// Do not take into account rfa.prevTimestamp, since it may lead
|
||||
// to inconsistent results comparing to Prometheus on broken time series
|
||||
// with irregular data points.
|
||||
return nan
|
||||
}
|
||||
return float64(timestamps[len(timestamps)-1]) / 1e3
|
||||
}
|
||||
|
||||
func rollupSum(rfa *rollupFuncArg) float64 {
|
||||
// There is no need in handling NaNs here, since they must be cleaned up
|
||||
// before calling rollup funcs.
|
||||
values := rfa.values
|
||||
if len(values) == 0 {
|
||||
if math.IsNaN(rfa.prevValue) {
|
||||
return nan
|
||||
}
|
||||
return 0
|
||||
// Do not take into account rfa.prevValue, since it may lead
|
||||
// to inconsistent results comparing to Prometheus on broken time series
|
||||
// with irregular data points.
|
||||
return nan
|
||||
}
|
||||
var sum float64
|
||||
for _, v := range values {
|
||||
@@ -1234,8 +1332,17 @@ func rollupDelta(rfa *rollupFuncArg) float64 {
|
||||
if len(values) == 0 {
|
||||
return nan
|
||||
}
|
||||
// Assume that the previous non-existing value was 0
|
||||
// only if the first value doesn't exceed too much the delta with the next value.
|
||||
if !math.IsNaN(rfa.realPrevValue) {
|
||||
// Assume that the value didn't change during the current gap.
|
||||
// This should fix high delta() and increase() values at the end of gaps.
|
||||
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/894
|
||||
return values[len(values)-1] - rfa.realPrevValue
|
||||
}
|
||||
// Assume that the previous non-existing value was 0 only in the following cases:
|
||||
//
|
||||
// - If the delta with the next value equals to 0.
|
||||
// This is the case for slow-changing counter - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/962
|
||||
// - If the first value doesn't exceed too much the delta with the next value.
|
||||
//
|
||||
// This should prevent from improper increase() results for os-level counters
|
||||
// such as cpu time or bytes sent over the network interface.
|
||||
@@ -1243,9 +1350,14 @@ func rollupDelta(rfa *rollupFuncArg) float64 {
|
||||
//
|
||||
// This also should prevent from improper increase() results when a part of label values are changed
|
||||
// without counter reset.
|
||||
d := float64(10)
|
||||
var d float64
|
||||
if len(values) > 1 {
|
||||
d = values[1] - values[0]
|
||||
} else if !math.IsNaN(rfa.realNextValue) {
|
||||
d = rfa.realNextValue - values[0]
|
||||
}
|
||||
if d == 0 {
|
||||
d = 10
|
||||
}
|
||||
if math.Abs(values[0]) < 10*(math.Abs(d)+1) {
|
||||
prevValue = 0
|
||||
@@ -1580,19 +1692,6 @@ func rollupLow(rfa *rollupFuncArg) float64 {
|
||||
return min
|
||||
}
|
||||
|
||||
func rollupTimestamp(rfa *rollupFuncArg) float64 {
|
||||
// There is no need in handling NaNs here, since they must be cleaned up
|
||||
// before calling rollup funcs.
|
||||
timestamps := rfa.timestamps
|
||||
if len(timestamps) == 0 {
|
||||
// Do not take into account rfa.prevTimestamp, since it may lead
|
||||
// to inconsistent results comparing to Prometheus on broken time series
|
||||
// with irregular data points.
|
||||
return nan
|
||||
}
|
||||
return float64(timestamps[len(timestamps)-1]) / 1e3
|
||||
}
|
||||
|
||||
func rollupModeOverTime(rfa *rollupFuncArg) float64 {
|
||||
// There is no need in handling NaNs here, since they must be cleaned up
|
||||
// before calling rollup funcs.
|
||||
|
||||
@@ -13,6 +13,7 @@ import (
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/workingsetcache"
|
||||
"github.com/VictoriaMetrics/fastcache"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
@@ -25,6 +26,39 @@ var (
|
||||
"due to time synchronization issues between VictoriaMetrics and data sources")
|
||||
)
|
||||
|
||||
// ResetRollupResultCacheIfNeeded resets rollup result cache if mrs contains timestamps outside `now - search.cacheTimestampOffset`.
|
||||
func ResetRollupResultCacheIfNeeded(mrs []storage.MetricRow) {
|
||||
checkRollupResultCacheResetOnce.Do(func() {
|
||||
go checkRollupResultCacheReset()
|
||||
})
|
||||
minTimestamp := int64(fasttime.UnixTimestamp()*1000) - cacheTimestampOffset.Milliseconds() + checkRollupResultCacheResetInterval.Milliseconds()
|
||||
needCacheReset := false
|
||||
for i := range mrs {
|
||||
if mrs[i].Timestamp < minTimestamp {
|
||||
needCacheReset = true
|
||||
break
|
||||
}
|
||||
}
|
||||
if needCacheReset {
|
||||
// Do not call ResetRollupResultCache() here, since it may be heavy when frequently called.
|
||||
atomic.StoreUint32(&needRollupResultCacheReset, 1)
|
||||
}
|
||||
}
|
||||
|
||||
func checkRollupResultCacheReset() {
|
||||
for {
|
||||
time.Sleep(checkRollupResultCacheResetInterval)
|
||||
if atomic.SwapUint32(&needRollupResultCacheReset, 0) > 0 {
|
||||
ResetRollupResultCache()
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const checkRollupResultCacheResetInterval = 5 * time.Second
|
||||
|
||||
var needRollupResultCacheReset uint32
|
||||
var checkRollupResultCacheResetOnce sync.Once
|
||||
|
||||
var rollupResultCacheV = &rollupResultCache{
|
||||
c: workingsetcache.New(1024*1024, time.Hour), // This is a cache for testing.
|
||||
}
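The hunk above avoids calling the expensive reset on every insert: writers only raise a flag, and a background goroutine consumes it at a fixed interval. A minimal sketch of that pattern follows, with the timer replaced by an explicit `poll` call so the behaviour is easy to observe. The type `deferredReset` and its methods are hypothetical names, not the package's API.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// deferredReset coalesces many reset requests into at most one
// reset per polling period. Hypothetical type for illustration.
type deferredReset struct {
	flag   uint32 // 1 when a reset has been requested since the last poll
	resets int    // how many times the heavy reset actually ran
}

// request is cheap and safe to call from hot paths.
func (d *deferredReset) request() {
	atomic.StoreUint32(&d.flag, 1)
}

// poll runs the heavy reset only if at least one request arrived
// since the previous poll. In the diff, the equivalent check runs
// in a goroutine that sleeps between iterations.
func (d *deferredReset) poll() bool {
	if atomic.SwapUint32(&d.flag, 0) > 0 {
		d.resets++ // stands in for the real cache reset
		return true
	}
	return false
}

func main() {
	var d deferredReset
	for i := 0; i < 1000; i++ {
		d.request()
	}
	fmt.Println(d.poll(), d.poll(), d.resets)
}
```

However many requests arrive between two polls, the reset runs once. The cost is that cached results may stay stale for up to one polling period.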
|
||||
|
||||
@@ -285,6 +285,44 @@ func TestRollupCountGTOverTime(t *testing.T) {
|
||||
f(1000, 0)
|
||||
}
|
||||
|
||||
func TestRollupCountEQOverTime(t *testing.T) {
|
||||
f := func(eq, vExpected float64) {
|
||||
t.Helper()
|
||||
eqs := []*timeseries{{
|
||||
Values: []float64{eq},
|
||||
Timestamps: []int64{123},
|
||||
}}
|
||||
var me metricsql.MetricExpr
|
||||
args := []interface{}{&metricsql.RollupExpr{Expr: &me}, eqs}
|
||||
testRollupFunc(t, "count_eq_over_time", args, &me, vExpected)
|
||||
}
|
||||
|
||||
f(-123, 0)
|
||||
f(0, 0)
|
||||
f(34, 4)
|
||||
f(123, 1)
|
||||
f(12, 1)
|
||||
}
|
||||
|
||||
func TestRollupCountNEOverTime(t *testing.T) {
|
||||
f := func(ne, vExpected float64) {
|
||||
t.Helper()
|
||||
nes := []*timeseries{{
|
||||
Values: []float64{ne},
|
||||
Timestamps: []int64{123},
|
||||
}}
|
||||
var me metricsql.MetricExpr
|
||||
args := []interface{}{&metricsql.RollupExpr{Expr: &me}, nes}
|
||||
testRollupFunc(t, "count_ne_over_time", args, &me, vExpected)
|
||||
}
|
||||
|
||||
f(-123, 12)
|
||||
f(0, 12)
|
||||
f(34, 8)
|
||||
f(123, 11)
|
||||
f(12, 11)
|
||||
}
|
||||
|
||||
func TestRollupQuantileOverTime(t *testing.T) {
|
||||
f := func(phi, vExpected float64) {
|
||||
t.Helper()
|
||||
@@ -423,6 +461,8 @@ func TestRollupNewRollupFuncSuccess(t *testing.T) {
|
||||
f("max_over_time", 123)
|
||||
f("tmin_over_time", 0.08)
|
||||
f("tmax_over_time", 0.005)
|
||||
f("tfirst_over_time", 0.005)
|
||||
f("tlast_over_time", 0.13)
|
||||
f("sum_over_time", 565)
|
||||
f("sum2_over_time", 37951)
|
||||
f("geomean_over_time", 39.33466603189148)
|
||||
@@ -645,7 +685,7 @@ func TestRollupFuncsLookbackDelta(t *testing.T) {
|
||||
}
|
||||
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
|
||||
values := rc.Do(nil, testValues, testTimestamps)
|
||||
valuesExpected := []float64{12, nan, nan, nan, 34, 34, nan}
|
||||
valuesExpected := []float64{99, nan, 44, nan, 32, 34, nan}
|
||||
timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
|
||||
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
|
||||
})
|
||||
@@ -1103,11 +1143,13 @@ func testRowsEqual(t *testing.T, values []float64, timestamps []int64, valuesExp
|
||||
}
|
||||
|
||||
func TestRollupDelta(t *testing.T) {
|
||||
f := func(prevValue float64, values []float64, resultExpected float64) {
|
||||
f := func(prevValue, realPrevValue, realNextValue float64, values []float64, resultExpected float64) {
|
||||
t.Helper()
|
||||
rfa := &rollupFuncArg{
|
||||
prevValue: prevValue,
|
||||
values: values,
|
||||
prevValue: prevValue,
|
||||
values: values,
|
||||
realPrevValue: realPrevValue,
|
||||
realNextValue: realNextValue,
|
||||
}
|
||||
result := rollupDelta(rfa)
|
||||
if math.IsNaN(result) {
|
||||
@@ -1120,22 +1162,44 @@ func TestRollupDelta(t *testing.T) {
|
||||
t.Fatalf("unexpected result; got %v; want %v", result, resultExpected)
|
||||
}
|
||||
}
|
||||
f(nan, nil, nan)
|
||||
f(nan, nan, nan, nil, nan)
|
||||
|
||||
// Small initial value
|
||||
f(nan, []float64{1}, 1)
|
||||
f(nan, []float64{10}, 10)
|
||||
f(nan, []float64{100}, 100)
|
||||
f(nan, []float64{1, 2, 3}, 3)
|
||||
f(1, []float64{1, 2, 3}, 2)
|
||||
f(nan, []float64{5, 6, 8}, 8)
|
||||
f(2, []float64{5, 6, 8}, 6)
|
||||
f(nan, nan, nan, []float64{1}, 1)
|
||||
f(nan, nan, nan, []float64{10}, 10)
|
||||
f(nan, nan, nan, []float64{100}, 100)
|
||||
f(nan, nan, nan, []float64{1, 2, 3}, 3)
|
||||
f(1, nan, nan, []float64{1, 2, 3}, 2)
|
||||
f(nan, nan, nan, []float64{5, 6, 8}, 8)
|
||||
f(2, nan, nan, []float64{5, 6, 8}, 6)
|
||||
|
||||
// Too big initial value must be skipped.
|
||||
f(nan, []float64{1000}, 0)
|
||||
f(nan, []float64{1000, 1001, 1002}, 2)
|
||||
// Moderate initial value with zero delta after that.
|
||||
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/962
|
||||
f(nan, nan, nan, []float64{100}, 100)
|
||||
f(nan, nan, nan, []float64{100, 100}, 100)
|
||||
|
||||
// Big initial value with zero delta after that.
|
||||
f(nan, nan, nan, []float64{1000}, 0)
|
||||
f(nan, nan, nan, []float64{1000, 1000}, 0)
|
||||
|
||||
// Big initial value with small delta after that.
|
||||
f(nan, nan, nan, []float64{1000, 1001, 1002}, 2)
|
||||
|
||||
// Non-nan realPrevValue
|
||||
f(nan, 900, nan, []float64{1000}, 100)
|
||||
f(nan, 1000, nan, []float64{1000}, 0)
|
||||
f(nan, 1100, nan, []float64{1000}, -100)
|
||||
f(nan, 900, nan, []float64{1000, 1001, 1002}, 102)
|
||||
|
||||
// Small delta between realNextValue and values
|
||||
f(nan, nan, 990, []float64{1000}, 0)
|
||||
f(nan, nan, 1005, []float64{1000}, 0)
|
||||
|
||||
// Big delta between realNextValue and values
|
||||
f(nan, nan, 800, []float64{1000}, 1000)
|
||||
f(nan, nan, 1300, []float64{1000}, 1000)
|
||||
|
||||
// Empty values
|
||||
f(1, nil, 0)
|
||||
f(100, nil, 0)
|
||||
f(1, nan, nan, nil, 0)
|
||||
f(100, nan, nan, nil, 0)
|
||||
}
|
||||
|
||||
@@ -73,6 +73,8 @@ var transformFuncs = map[string]transformFunc{
|
||||
// New funcs
|
||||
"label_set": transformLabelSet,
|
||||
"label_map": transformLabelMap,
|
||||
"label_uppercase": transformLabelUppercase,
|
||||
"label_lowercase": transformLabelLowercase,
|
||||
"label_del": transformLabelDel,
|
||||
"label_keep": transformLabelKeep,
|
||||
"label_copy": transformLabelCopy,
|
||||
@@ -265,6 +267,9 @@ func newTransformFuncDateTime(f func(t time.Time) int) transformFunc {
|
||||
}
|
||||
tf := func(values []float64) {
|
||||
for i, v := range values {
|
||||
if math.IsNaN(v) {
|
||||
continue
|
||||
}
|
||||
t := time.Unix(int64(v), 0).UTC()
|
||||
values[i] = float64(f(t))
|
||||
}
|
||||
@@ -1193,6 +1198,42 @@ func transformLabelSet(tfa *transformFuncArg) ([]*timeseries, error) {
|
||||
return rvs, nil
|
||||
}
|
||||
|
||||
func transformLabelUppercase(tfa *transformFuncArg) ([]*timeseries, error) {
|
||||
return transformLabelValueFunc(tfa, strings.ToUpper)
|
||||
}
|
||||
|
||||
func transformLabelLowercase(tfa *transformFuncArg) ([]*timeseries, error) {
|
||||
return transformLabelValueFunc(tfa, strings.ToLower)
|
||||
}
|
||||
|
||||
func transformLabelValueFunc(tfa *transformFuncArg, f func(string) string) ([]*timeseries, error) {
|
||||
args := tfa.args
|
||||
if len(args) < 2 {
|
||||
return nil, fmt.Errorf(`not enough args; got %d; want at least %d`, len(args), 2)
|
||||
}
|
||||
labels := make([]string, 0, len(args)-1)
|
||||
for i := 1; i < len(args); i++ {
|
||||
label, err := getString(args[i], i)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
labels = append(labels, label)
|
||||
}
|
||||
|
||||
rvs := args[0]
|
||||
for _, ts := range rvs {
|
||||
mn := &ts.MetricName
|
||||
for _, label := range labels {
|
||||
dstValue := getDstValue(mn, label)
|
||||
*dstValue = append((*dstValue)[:0], f(string(*dstValue))...)
|
||||
if len(*dstValue) == 0 {
|
||||
mn.RemoveTag(label)
|
||||
}
|
||||
}
|
||||
}
|
||||
return rvs, nil
|
||||
}
|
||||
|
||||
func transformLabelMap(tfa *transformFuncArg) ([]*timeseries, error) {
|
||||
args := tfa.args
|
||||
if len(args) < 2 {
|
||||
@@ -1562,21 +1603,31 @@ func transformScalar(tfa *transformFuncArg) ([]*timeseries, error) {
|
||||
func newTransformFuncSortByLabel(isDesc bool) transformFunc {
|
||||
return func(tfa *transformFuncArg) ([]*timeseries, error) {
|
||||
args := tfa.args
|
||||
if err := expectTransformArgsNum(args, 2); err != nil {
|
||||
return nil, err
|
||||
if len(args) < 2 {
|
||||
return nil, fmt.Errorf("expecting at least 2 args; got %d args", len(args))
|
||||
}
|
||||
label, err := getString(args[1], 1)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot parse label name for sorting: %w", err)
|
||||
var labels []string
|
||||
for i, arg := range args[1:] {
|
||||
label, err := getString(arg, 1)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot parse label #%d for sorting: %w", i+1, err)
|
||||
}
|
||||
labels = append(labels, label)
|
||||
}
|
||||
rvs := args[0]
|
||||
sort.SliceStable(rvs, func(i, j int) bool {
|
||||
a := rvs[i].MetricName.GetTagValue(label)
|
||||
b := rvs[j].MetricName.GetTagValue(label)
|
||||
if isDesc {
|
||||
return string(b) < string(a)
|
||||
for _, label := range labels {
|
||||
a := rvs[i].MetricName.GetTagValue(label)
|
||||
b := rvs[j].MetricName.GetTagValue(label)
|
||||
if string(a) == string(b) {
|
||||
continue
|
||||
}
|
||||
if isDesc {
|
||||
return string(b) < string(a)
|
||||
}
|
||||
return string(a) < string(b)
|
||||
}
|
||||
return string(a) < string(b)
|
||||
return false
|
||||
})
|
||||
return rvs, nil
|
||||
}
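The comparator in the hunk above walks the labels in order and decides on the first one whose values differ. A self-contained sketch of the same idea follows, using a plain map per series instead of the package's metric name type. The `series` type and `sortByLabels` function are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// series is a hypothetical stand-in for a time series' label set.
type series map[string]string

// sortByLabels orders rows by the given labels, most significant first.
// Rows that agree on every label keep their relative order (stable sort).
func sortByLabels(rows []series, labels []string, desc bool) {
	sort.SliceStable(rows, func(i, j int) bool {
		for _, label := range labels {
			a, b := rows[i][label], rows[j][label]
			if a == b {
				continue // tie on this label, look at the next one
			}
			if desc {
				return b < a
			}
			return a < b
		}
		return false // equal on all labels
	})
}

func main() {
	rows := []series{
		{"job": "b", "instance": "1"},
		{"job": "a", "instance": "2"},
		{"job": "a", "instance": "1"},
	}
	sortByLabels(rows, []string{"job", "instance"}, false)
	for _, r := range rows {
		fmt.Println(r["job"], r["instance"])
	}
}
```

Returning `false` when all labels match is what keeps the comparator a valid strict ordering for `sort.SliceStable`.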
|
||||
|
||||
249
app/vmselect/querystats/querystats.go
Normal file
@@ -0,0 +1,249 @@
|
||||
package querystats
|
||||
|
||||
import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"io"
|
||||
"sort"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
)
|
||||
|
||||
var (
|
||||
lastQueriesCount = flag.Int("search.queryStats.lastQueriesCount", 20000, "Query stats for `/api/v1/status/top_queries` is tracked on this number of last queries. "+
|
||||
"Zero value disables query stats tracking")
|
||||
minQueryDuration = flag.Duration("search.queryStats.minQueryDuration", 0, "The minimum duration for queries to track in query stats at `/api/v1/status/top_queries`. "+
|
||||
"Queries with lower duration are ignored in query stats")
|
||||
)
|
||||
|
||||
var (
|
||||
qsTracker *queryStatsTracker
|
||||
initOnce sync.Once
|
||||
)
|
||||
|
||||
// Enabled returns true if query stats tracking is enabled.
|
||||
func Enabled() bool {
|
||||
return *lastQueriesCount > 0
|
||||
}
|
||||
|
||||
// RegisterQuery registers the query on the given timeRangeMsecs, which has been started at startTime.
|
||||
//
|
||||
// RegisterQuery must be called when the query is finished.
|
||||
func RegisterQuery(query string, timeRangeMsecs int64, startTime time.Time) {
|
||||
initOnce.Do(initQueryStats)
|
||||
qsTracker.registerQuery(query, timeRangeMsecs, startTime)
|
||||
}
|
||||
|
||||
// WriteJSONQueryStats writes query stats to given writer in json format.
|
||||
func WriteJSONQueryStats(w io.Writer, topN int, maxLifetime time.Duration) {
|
||||
initOnce.Do(initQueryStats)
|
||||
qsTracker.writeJSONQueryStats(w, topN, maxLifetime)
|
||||
}
|
||||
|
||||
// queryStatsTracker holds statistics for queries
|
||||
type queryStatsTracker struct {
|
||||
mu sync.Mutex
|
||||
a []queryStatRecord
|
||||
nextIdx uint
|
||||
}
|
||||
|
||||
type queryStatRecord struct {
|
||||
query string
|
||||
timeRangeSecs int64
|
||||
registerTime time.Time
|
||||
duration time.Duration
|
||||
}
|
||||
|
||||
type queryStatKey struct {
|
||||
query string
|
||||
timeRangeSecs int64
|
||||
}
|
||||
|
||||
func initQueryStats() {
|
||||
recordsCount := *lastQueriesCount
|
||||
if recordsCount <= 0 {
|
||||
recordsCount = 1
|
||||
} else {
|
||||
logger.Infof("enabled query stats tracking at `/api/v1/status/top_queries` with -search.queryStats.lastQueriesCount=%d, -search.queryStats.minQueryDuration=%s",
|
||||
*lastQueriesCount, *minQueryDuration)
|
||||
}
|
||||
qsTracker = &queryStatsTracker{
|
||||
a: make([]queryStatRecord, recordsCount),
|
||||
}
|
||||
}
|
||||
|
||||
func (qst *queryStatsTracker) writeJSONQueryStats(w io.Writer, topN int, maxLifetime time.Duration) {
|
||||
fmt.Fprintf(w, `{"topN":"%d","maxLifetime":%q,`, topN, maxLifetime)
|
||||
fmt.Fprintf(w, `"search.queryStats.lastQueriesCount":%d,`, *lastQueriesCount)
|
||||
fmt.Fprintf(w, `"search.queryStats.minQueryDuration":%q,`, *minQueryDuration)
|
||||
fmt.Fprintf(w, `"topByCount":[`)
|
||||
topByCount := qst.getTopByCount(topN, maxLifetime)
|
||||
for i, r := range topByCount {
|
||||
fmt.Fprintf(w, `{"query":%q,"timeRangeSeconds":%d,"count":%d}`, r.query, r.timeRangeSecs, r.count)
|
||||
if i+1 < len(topByCount) {
|
||||
fmt.Fprintf(w, `,`)
|
||||
}
|
||||
}
|
||||
fmt.Fprintf(w, `],"topByAvgDuration":[`)
|
||||
topByAvgDuration := qst.getTopByAvgDuration(topN, maxLifetime)
|
||||
for i, r := range topByAvgDuration {
|
||||
fmt.Fprintf(w, `{"query":%q,"timeRangeSeconds":%d,"avgDurationSeconds":%.3f}`, r.query, r.timeRangeSecs, r.duration.Seconds())
|
||||
if i+1 < len(topByAvgDuration) {
|
||||
fmt.Fprintf(w, `,`)
|
||||
}
|
||||
}
|
||||
fmt.Fprintf(w, `],"topBySumDuration":[`)
|
||||
topBySumDuration := qst.getTopBySumDuration(topN, maxLifetime)
|
||||
for i, r := range topBySumDuration {
|
||||
fmt.Fprintf(w, `{"query":%q,"timeRangeSeconds":%d,"sumDurationSeconds":%.3f}`, r.query, r.timeRangeSecs, r.duration.Seconds())
|
||||
if i+1 < len(topBySumDuration) {
|
||||
fmt.Fprintf(w, `,`)
|
||||
}
|
||||
}
|
||||
fmt.Fprintf(w, `]}`)
|
||||
}
|
||||
|
||||
func (qst *queryStatsTracker) registerQuery(query string, timeRangeMsecs int64, startTime time.Time) {
|
||||
registerTime := time.Now()
|
||||
duration := registerTime.Sub(startTime)
|
||||
if duration < *minQueryDuration {
|
||||
return
|
||||
}
|
||||
|
||||
qst.mu.Lock()
|
||||
defer qst.mu.Unlock()
|
||||
|
||||
a := qst.a
|
||||
idx := qst.nextIdx
|
||||
if idx >= uint(len(a)) {
|
||||
idx = 0
|
||||
}
|
||||
qst.nextIdx = idx + 1
|
||||
r := &a[idx]
|
||||
r.query = query
|
||||
r.timeRangeSecs = timeRangeMsecs / 1000
|
||||
r.registerTime = registerTime
|
||||
r.duration = duration
|
||||
}
|
||||
|
||||
func (r *queryStatRecord) matches(currentTime time.Time, maxLifetime time.Duration) bool {
|
||||
if r.query == "" || currentTime.Sub(r.registerTime) > maxLifetime {
|
||||
return false
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
func (r *queryStatRecord) key() queryStatKey {
|
||||
return queryStatKey{
|
||||
query: r.query,
|
||||
timeRangeSecs: r.timeRangeSecs,
|
||||
}
|
||||
}
|
||||
|
||||
func (qst *queryStatsTracker) getTopByCount(topN int, maxLifetime time.Duration) []queryStatByCount {
|
||||
currentTime := time.Now()
|
||||
qst.mu.Lock()
|
||||
m := make(map[queryStatKey]int)
|
||||
for _, r := range qst.a {
|
||||
if r.matches(currentTime, maxLifetime) {
|
||||
k := r.key()
|
||||
m[k] = m[k] + 1
|
||||
}
|
||||
}
|
||||
qst.mu.Unlock()
|
||||
|
||||
var a []queryStatByCount
|
||||
for k, count := range m {
|
||||
a = append(a, queryStatByCount{
|
||||
query: k.query,
|
||||
timeRangeSecs: k.timeRangeSecs,
|
||||
count: count,
|
||||
})
|
||||
}
|
||||
sort.Slice(a, func(i, j int) bool {
|
||||
return a[i].count > a[j].count
|
||||
})
|
||||
if len(a) > topN {
|
||||
a = a[:topN]
|
||||
}
|
||||
return a
|
||||
}
|
||||
|
||||
type queryStatByCount struct {
|
||||
query string
|
||||
timeRangeSecs int64
|
||||
count int
|
||||
}
|
||||
|
||||
func (qst *queryStatsTracker) getTopByAvgDuration(topN int, maxLifetime time.Duration) []queryStatByDuration {
|
||||
currentTime := time.Now()
|
||||
qst.mu.Lock()
|
||||
type countSum struct {
|
||||
count int
|
||||
sum time.Duration
|
||||
}
|
||||
m := make(map[queryStatKey]countSum)
|
||||
for _, r := range qst.a {
|
||||
if r.matches(currentTime, maxLifetime) {
|
||||
k := r.key()
|
||||
ks := m[k]
|
||||
ks.count++
|
||||
ks.sum += r.duration
|
||||
m[k] = ks
|
||||
}
|
||||
}
|
||||
qst.mu.Unlock()
|
||||
|
||||
var a []queryStatByDuration
|
||||
for k, ks := range m {
|
||||
a = append(a, queryStatByDuration{
|
||||
query: k.query,
|
||||
timeRangeSecs: k.timeRangeSecs,
|
||||
duration: ks.sum / time.Duration(ks.count),
|
||||
})
|
||||
}
|
||||
sort.Slice(a, func(i, j int) bool {
|
||||
return a[i].duration > a[j].duration
|
||||
})
|
||||
if len(a) > topN {
|
||||
a = a[:topN]
|
||||
}
|
||||
return a
|
||||
}
|
||||
|
||||
type queryStatByDuration struct {
|
||||
query string
|
||||
timeRangeSecs int64
|
||||
duration time.Duration
|
||||
}
|
||||
|
||||
func (qst *queryStatsTracker) getTopBySumDuration(topN int, maxLifetime time.Duration) []queryStatByDuration {
|
||||
currentTime := time.Now()
|
||||
qst.mu.Lock()
|
||||
m := make(map[queryStatKey]time.Duration)
|
||||
for _, r := range qst.a {
|
||||
if r.matches(currentTime, maxLifetime) {
|
||||
k := r.key()
|
||||
m[k] = m[k] + r.duration
|
||||
}
|
||||
}
|
||||
qst.mu.Unlock()
|
||||
|
||||
var a []queryStatByDuration
|
||||
for k, d := range m {
|
||||
a = append(a, queryStatByDuration{
|
||||
query: k.query,
|
||||
timeRangeSecs: k.timeRangeSecs,
|
||||
duration: d,
|
||||
})
|
||||
}
|
||||
sort.Slice(a, func(i, j int) bool {
|
||||
return a[i].duration > a[j].duration
|
||||
})
|
||||
if len(a) > topN {
|
||||
a = a[:topN]
|
||||
}
|
||||
return a
|
||||
}
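The tracker above keeps the last N queries in a fixed-size slice that wraps around, then aggregates on demand. The sketch below shows just that core, assuming a simplified record (query text only) and omitting the mutex, the lifetime filter, and the duration statistics. The names `ring`, `add`, and `topByCount` are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// ring stores the most recent len(a) queries, overwriting the oldest.
type ring struct {
	a       []string
	nextIdx int
}

func newRing(n int) *ring {
	return &ring{a: make([]string, n)}
}

// add records a query, wrapping to the start once the slice is full.
func (r *ring) add(query string) {
	if r.nextIdx >= len(r.a) {
		r.nextIdx = 0
	}
	r.a[r.nextIdx] = query
	r.nextIdx++
}

type queryCount struct {
	query string
	count int
}

// topByCount returns up to topN queries ordered by how often they
// occur among the retained records. Empty slots are skipped.
func (r *ring) topByCount(topN int) []queryCount {
	m := make(map[string]int)
	for _, q := range r.a {
		if q != "" {
			m[q]++
		}
	}
	out := make([]queryCount, 0, len(m))
	for q, n := range m {
		out = append(out, queryCount{query: q, count: n})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].count != out[j].count {
			return out[i].count > out[j].count
		}
		// Tie-break by query text for deterministic output (not in the diff).
		return out[i].query < out[j].query
	})
	if len(out) > topN {
		out = out[:topN]
	}
	return out
}

func main() {
	r := newRing(3)
	for _, q := range []string{"a", "b", "b", "c"} {
		r.add(q)
	}
	for _, qc := range r.topByCount(2) {
		fmt.Println(qc.query, qc.count)
	}
}
```

With a capacity of three, the fourth insert overwrites the oldest record, so `a` no longer appears in the result. Memory stays bounded by the configured count regardless of query volume.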
|
||||
@@ -184,5 +184,7 @@ func (d *Deadline) Deadline() uint64 {
|
||||
|
||||
// String returns human-readable string representation for d.
|
||||
func (d *Deadline) String() string {
|
||||
return fmt.Sprintf("%.3f seconds; the timeout can be adjusted with `%s` command-line flag", d.timeout.Seconds(), d.flagHint)
|
||||
startTime := time.Unix(int64(d.deadline), 0).Add(-d.timeout)
|
||||
elapsed := time.Since(startTime)
|
||||
return fmt.Sprintf("%.3f seconds (elapsed %.3f seconds); the timeout can be adjusted with `%s` command-line flag", d.timeout.Seconds(), elapsed.Seconds(), d.flagHint)
|
||||
}
|
||||
|
||||
@@ -10,6 +10,7 @@ import (
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
@@ -19,18 +20,19 @@ import (
|
||||
)
|
||||
|
||||
var (
|
||||
retentionPeriod = flag.Int("retentionPeriod", 1, "Retention period in months")
|
||||
retentionPeriod = flagutil.NewDuration("retentionPeriod", 1, "Data with timestamps outside the retentionPeriod is automatically deleted")
|
||||
snapshotAuthKey = flag.String("snapshotAuthKey", "", "authKey, which must be passed in query string to /snapshot* pages")
|
||||
forceMergeAuthKey = flag.String("forceMergeAuthKey", "", "authKey, which must be passed in query string to /internal/force_merge pages")
|
||||
forceFlushAuthKey = flag.String("forceFlushAuthKey", "", "authKey, which must be passed in query string to /internal/force_flush pages")
|
||||
|
||||
precisionBits = flag.Int("precisionBits", 64, "The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss")
|
||||
|
||||
// DataPath is a path to storage data.
|
||||
DataPath = flag.String("storageDataPath", "victoria-metrics-data", "Path to storage data")
|
||||
|
||||
finalMergeDelay = flag.Duration("finalMergeDelay", 30*time.Second, "The delay before starting final merge for per-month partition after no new data is ingested into it. "+
|
||||
"Query speed and disk space usage is usually reduced after the final merge is complete. Too low delay for final merge may result in increased "+
|
||||
"disk IO usage and CPU usage")
|
||||
finalMergeDelay = flag.Duration("finalMergeDelay", 0, "The delay before starting final merge for per-month partition after no new data is ingested into it. "+
|
||||
"Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. "+
|
||||
"Zero value disables final merge")
|
||||
bigMergeConcurrency = flag.Int("bigMergeConcurrency", 0, "The maximum number of CPU cores to use for big merges. Default value is used if set to 0")
|
||||
smallMergeConcurrency = flag.Int("smallMergeConcurrency", 0, "The maximum number of CPU cores to use for small merges. Default value is used if set to 0")
|
||||
|
||||
@@ -44,40 +46,41 @@ func CheckTimeRange(tr storage.TimeRange) error {
|
||||
if !*denyQueriesOutsideRetention {
|
||||
return nil
|
||||
}
|
||||
minAllowedTimestamp := (int64(fasttime.UnixTimestamp()) - int64(*retentionPeriod)*3600*24*30) * 1000
|
||||
minAllowedTimestamp := int64(fasttime.UnixTimestamp()*1000) - retentionPeriod.Msecs
|
||||
if tr.MinTimestamp > minAllowedTimestamp {
|
||||
return nil
|
||||
}
|
||||
return &httpserver.ErrorWithStatusCode{
|
||||
Err: fmt.Errorf("the given time range %s is outside the allowed retention of %d months according to -denyQueriesOutsideRetention", &tr, *retentionPeriod),
|
||||
Err: fmt.Errorf("the given time range %s is outside the allowed -retentionPeriod=%s according to -denyQueriesOutsideRetention", &tr, retentionPeriod),
|
||||
StatusCode: http.StatusServiceUnavailable,
|
||||
}
|
||||
}
|
||||
|
||||
// Init initializes vmstorage.
|
||||
func Init() {
|
||||
InitWithoutMetrics()
|
||||
func Init(resetCacheIfNeeded func(mrs []storage.MetricRow)) {
|
||||
InitWithoutMetrics(resetCacheIfNeeded)
|
||||
registerStorageMetrics()
|
||||
}
|
||||
|
||||
// InitWithoutMetrics must be called instead of Init inside tests.
|
||||
//
|
||||
// This allows multiple Init / Stop cycles.
|
||||
func InitWithoutMetrics() {
|
||||
func InitWithoutMetrics(resetCacheIfNeeded func(mrs []storage.MetricRow)) {
|
||||
if err := encoding.CheckPrecisionBits(uint8(*precisionBits)); err != nil {
|
||||
logger.Fatalf("invalid `-precisionBits`: %s", err)
|
||||
}
|
||||
|
||||
resetResponseCacheIfNeeded = resetCacheIfNeeded
|
||||
storage.SetFinalMergeDelay(*finalMergeDelay)
|
||||
storage.SetBigMergeWorkersCount(*bigMergeConcurrency)
|
||||
storage.SetSmallMergeWorkersCount(*smallMergeConcurrency)
|
||||
|
||||
logger.Infof("opening storage at %q with retention period %d months", *DataPath, *retentionPeriod)
|
||||
logger.Infof("opening storage at %q with -retentionPeriod=%s", *DataPath, retentionPeriod)
|
||||
startTime := time.Now()
|
||||
WG = syncwg.WaitGroup{}
|
||||
strg, err := storage.OpenStorage(*DataPath, *retentionPeriod)
|
||||
strg, err := storage.OpenStorage(*DataPath, retentionPeriod.Msecs)
|
||||
if err != nil {
|
||||
logger.Fatalf("cannot open a storage at %s with retention period %d months: %s", *DataPath, *retentionPeriod, err)
|
||||
logger.Fatalf("cannot open a storage at %s with -retentionPeriod=%s: %s", *DataPath, retentionPeriod, err)
|
||||
}
|
||||
Storage = strg
|
||||
|
||||
@@ -103,14 +106,26 @@ var Storage *storage.Storage
|
||||
// Use syncwg instead of sync, since Add is called from concurrent goroutines.
|
||||
var WG syncwg.WaitGroup
|
||||
|
||||
// resetResponseCacheIfNeeded is a callback for automatic resetting of response cache if needed.
|
||||
var resetResponseCacheIfNeeded func(mrs []storage.MetricRow)
|
||||
|
||||
// AddRows adds mrs to the storage.
|
||||
func AddRows(mrs []storage.MetricRow) error {
|
||||
resetResponseCacheIfNeeded(mrs)
|
||||
WG.Add(1)
|
||||
err := Storage.AddRows(mrs, uint8(*precisionBits))
|
||||
WG.Done()
|
||||
return err
|
||||
}
|
||||
|
||||
// RegisterMetricNames registers all the metrics from mrs in the storage.
|
||||
func RegisterMetricNames(mrs []storage.MetricRow) error {
|
||||
WG.Add(1)
|
||||
err := Storage.RegisterMetricNames(mrs)
|
||||
WG.Done()
|
||||
return err
|
||||
}
|
||||
|
||||
// DeleteMetrics deletes metrics matching tfss.
|
||||
//
|
||||
// Returns the number of deleted metrics.
|
||||
@@ -121,6 +136,22 @@ func DeleteMetrics(tfss []*storage.TagFilters) (int, error) {
|
||||
return n, err
|
||||
}
|
||||
|
||||
// SearchMetricNames returns metric names for the given tfss on the given tr.
|
||||
func SearchMetricNames(tfss []*storage.TagFilters, tr storage.TimeRange, maxMetrics int, deadline uint64) ([]storage.MetricName, error) {
|
||||
WG.Add(1)
|
||||
mns, err := Storage.SearchMetricNames(tfss, tr, maxMetrics, deadline)
|
||||
WG.Done()
|
||||
return mns, err
|
||||
}
|
||||
|
||||
// SearchTagKeysOnTimeRange searches for tag keys on tr.
|
||||
func SearchTagKeysOnTimeRange(tr storage.TimeRange, maxTagKeys int, deadline uint64) ([]string, error) {
|
||||
WG.Add(1)
|
||||
keys, err := Storage.SearchTagKeysOnTimeRange(tr, maxTagKeys, deadline)
|
||||
WG.Done()
|
||||
return keys, err
|
||||
}
|
||||
|
||||
// SearchTagKeys searches for tag keys
|
||||
func SearchTagKeys(maxTagKeys int, deadline uint64) ([]string, error) {
|
||||
WG.Add(1)
|
||||
@@ -129,6 +160,14 @@ func SearchTagKeys(maxTagKeys int, deadline uint64) ([]string, error) {
|
||||
return keys, err
|
||||
}
|
||||
|
||||
// SearchTagValuesOnTimeRange searches for tag values for the given tagKey on tr.
|
||||
func SearchTagValuesOnTimeRange(tagKey []byte, tr storage.TimeRange, maxTagValues int, deadline uint64) ([]string, error) {
|
||||
WG.Add(1)
|
||||
values, err := Storage.SearchTagValuesOnTimeRange(tagKey, tr, maxTagValues, deadline)
|
||||
WG.Done()
|
||||
return values, err
|
||||
}
|
||||
|
||||
// SearchTagValues searches for tag values for the given tagKey
|
||||
func SearchTagValues(tagKey []byte, maxTagValues int, deadline uint64) ([]string, error) {
|
||||
WG.Add(1)
|
||||
@@ -205,6 +244,16 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
}()
|
||||
return true
|
||||
}
|
||||
if path == "/internal/force_flush" {
|
||||
authKey := r.FormValue("authKey")
|
||||
if authKey != *forceFlushAuthKey {
|
||||
httpserver.Errorf(w, r, "invalid authKey %q. It must match the value from -forceFlushAuthKey command line flag", authKey)
|
||||
return true
|
||||
}
|
||||
logger.Infof("flushing storage to make pending data available for reading")
|
||||
Storage.DebugFlush()
|
||||
return true
|
||||
}
|
||||
prometheusCompatibleResponse := false
|
||||
if path == "/api/v1/admin/tsdb/snapshot" {
|
||||
// Handle Prometheus API - https://prometheus.io/docs/prometheus/latest/querying/api/#snapshot .
|
||||
@@ -223,7 +272,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
|
||||
switch path {
|
||||
case "/create":
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
snapshotPath, err := Storage.CreateSnapshot()
|
||||
if err != nil {
|
||||
err = fmt.Errorf("cannot create snapshot: %w", err)
|
||||
@@ -237,7 +286,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
}
|
||||
return true
|
||||
case "/list":
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
snapshots, err := Storage.ListSnapshots()
|
||||
if err != nil {
|
||||
err = fmt.Errorf("cannot list snapshots: %w", err)
|
||||
@@ -254,7 +303,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
fmt.Fprintf(w, `]}`)
|
||||
return true
|
||||
case "/delete":
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
snapshotName := r.FormValue("snapshot")
|
||||
if err := Storage.DeleteSnapshot(snapshotName); err != nil {
|
||||
err = fmt.Errorf("cannot delete snapshot %q: %w", snapshotName, err)
|
||||
@@ -264,7 +313,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
|
||||
fmt.Fprintf(w, `{"status":"ok"}`)
|
||||
return true
|
||||
case "/delete_all":
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Header().Set("Content-Type", "application/json; charset=utf-8")
|
||||
snapshots, err := Storage.ListSnapshots()
|
||||
if err != nil {
|
||||
err = fmt.Errorf("cannot list snapshots: %w", err)
|
||||
|
||||
@@ -51,12 +51,12 @@
|
||||
}
|
||||
]
|
||||
},
|
||||
"description": "Overview for single node VictoriaMetrics v1.40.0 or higher",
|
||||
"description": "Overview for single node VictoriaMetrics v1.48.0 or higher",
|
||||
"editable": true,
|
||||
"gnetId": 10229,
|
||||
"graphTooltip": 0,
|
||||
"id": null,
|
||||
"iteration": 1599034965731,
|
||||
"iteration": 1603307754894,
|
||||
"links": [
|
||||
{
|
||||
"icon": "doc",
|
||||
@@ -925,7 +925,7 @@
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$ds",
|
||||
"description": "Shows how many ongoing insertions are taking place.\n* `max` - equal to number of CPU * 2\n* `current` - current number of goroutines busy with inserting rows into storage\n\nWhen `current` hits `max` constantly, it means storage is overloaded and require more CPU.",
|
||||
"description": "Shows how many ongoing insertions (not API /write calls) on disk are taking place, where:\n* `max` - equal to number of CPUs;\n* `current` - current number of goroutines busy with inserting rows into underlying storage.\n\nEvery successful API /write call results in a flush to disk. However, these two actions are separated and controlled via different concurrency limiters. The `max` on this panel can't be changed and is always equal to the number of CPUs. \n\nWhen `current` hits `max` constantly, it means storage is overloaded and requires more CPU.\n\n",
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"custom": {},
|
||||
@@ -979,6 +979,7 @@
|
||||
{
|
||||
"expr": "sum(vm_concurrent_addrows_capacity{job=\"$job\", instance=\"$instance\"})",
|
||||
"format": "time_series",
|
||||
"interval": "",
|
||||
"intervalFactor": 1,
|
||||
"legendFormat": "max",
|
||||
"refId": "A"
|
||||
@@ -995,7 +996,7 @@
|
||||
"timeFrom": null,
|
||||
"timeRegions": [],
|
||||
"timeShift": null,
|
||||
"title": "Concurrent inserts ($instance)",
|
||||
"title": "Concurrent flushes on disk ($instance)",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 2,
|
||||
@@ -1164,7 +1165,7 @@
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 36
|
||||
"y": 3
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 10,
|
||||
@@ -1250,7 +1251,7 @@
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$ds",
|
||||
"description": "How many datapoints are in RAM queue waiting to be written into storage. The number of pending data points should be in the range from 0 to `2*<ingestion_rate>`, since VictoriaMetrics pushes pending data to persistent storage every second.",
|
||||
"description": "Shows the time needed to reach the 100% of disk capacity based on the following params:\n* free disk space;\n* row ingestion rate;\n* dedup rate;\n* compression.\n\nUse this panel for capacity planning in order to estimate the time remaining for running out of the disk space.\n\n",
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"custom": {},
|
||||
@@ -1264,63 +1265,53 @@
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 36
|
||||
"y": 3
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 34,
|
||||
"id": 73,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"alignAsTable": true,
|
||||
"avg": true,
|
||||
"current": true,
|
||||
"hideZero": true,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": false,
|
||||
"total": false,
|
||||
"values": false
|
||||
"values": true
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [],
|
||||
"nullPointMode": "null",
|
||||
"nullPointMode": "null as zero",
|
||||
"percentage": false,
|
||||
"pluginVersion": "7.1.1",
|
||||
"pointradius": 2,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [
|
||||
{
|
||||
"alias": "pending index entries",
|
||||
"yaxis": 2
|
||||
}
|
||||
],
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "vm_pending_rows{job=\"$job\", instance=~\"$instance\", type=\"storage\"}",
|
||||
"expr": "vm_free_disk_space_bytes{job=\"$job\", instance=\"$instance\"} / ignoring(path) ((rate(vm_rows_added_to_storage_total{job=\"$job\", instance=\"$instance\"}[1d]) - ignoring(type) rate(vm_deduplicated_samples_total{job=\"$job\", instance=\"$instance\", type=\"merge\"}[1d])) * scalar(sum(vm_data_size_bytes{job=\"$job\", instance=\"$instance\", type!=\"indexdb\"}) / sum(vm_rows{job=\"$job\", instance=\"$instance\", type!=\"indexdb\"})))",
|
||||
"format": "time_series",
|
||||
"hide": false,
|
||||
"interval": "",
|
||||
"intervalFactor": 1,
|
||||
"legendFormat": "pending datapoints",
|
||||
"legendFormat": "",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "vm_pending_rows{job=\"$job\", instance=~\"$instance\", type=\"indexdb\"}",
|
||||
"format": "time_series",
|
||||
"hide": false,
|
||||
"intervalFactor": 1,
|
||||
"legendFormat": "pending index entries",
|
||||
"refId": "B"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeFrom": null,
|
||||
"timeRegions": [],
|
||||
"timeShift": null,
|
||||
"title": "Pending datapoints ($instance)",
|
||||
"title": "Storage full ETA ($instance)",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"sort": 2,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
@@ -1333,7 +1324,8 @@
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"decimals": null,
|
||||
"format": "s",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
@@ -1341,8 +1333,7 @@
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"decimals": 3,
|
||||
"format": "none",
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
@@ -1375,7 +1366,7 @@
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 44
|
||||
"y": 11
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 30,
|
||||
@@ -1472,7 +1463,7 @@
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$ds",
|
||||
"description": "Data parts of LSM tree.\nHigh number of parts could be an evidence of slow merge performance - check the resource utilization.\n* `indexdb` - inverted index\n* `storage/small` - recently added parts of data ingested into storage(hot data)\n* `storage/big` - small parts gradually merged into big parts (cold data)",
|
||||
"description": "How many datapoints are in RAM queue waiting to be written into storage. The number of pending data points should be in the range from 0 to `2*<ingestion_rate>`, since VictoriaMetrics pushes pending data to persistent storage every second.",
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"custom": {},
|
||||
@@ -1486,16 +1477,16 @@
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 44
|
||||
"y": 11
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 36,
|
||||
"id": 34,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"show": false,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
@@ -1508,27 +1499,41 @@
|
||||
"pointradius": 2,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"seriesOverrides": [
|
||||
{
|
||||
"alias": "pending index entries",
|
||||
"yaxis": 2
|
||||
}
|
||||
],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(vm_parts{job=\"$job\", instance=\"$instance\"}) by (type)",
|
||||
"expr": "vm_pending_rows{job=\"$job\", instance=~\"$instance\", type=\"storage\"}",
|
||||
"format": "time_series",
|
||||
"hide": false,
|
||||
"intervalFactor": 1,
|
||||
"legendFormat": "{{type}}",
|
||||
"legendFormat": "pending datapoints",
|
||||
"refId": "A"
|
||||
},
|
||||
{
|
||||
"expr": "vm_pending_rows{job=\"$job\", instance=~\"$instance\", type=\"indexdb\"}",
|
||||
"format": "time_series",
|
||||
"hide": false,
|
||||
"intervalFactor": 1,
|
||||
"legendFormat": "pending index entries",
|
||||
"refId": "B"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeFrom": null,
|
||||
"timeRegions": [],
|
||||
"timeShift": null,
|
||||
"title": "LSM parts ($instance)",
|
||||
"title": "Pending datapoints ($instance)",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 2,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
@@ -1549,7 +1554,8 @@
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"decimals": 3,
|
||||
"format": "none",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
@@ -1582,7 +1588,7 @@
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 52
|
||||
"y": 19
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 53,
|
||||
@@ -1669,6 +1675,196 @@
|
||||
"alignLevel": null
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$ds",
|
||||
"description": "Data parts of LSM tree.\nHigh number of parts could be an evidence of slow merge performance - check the resource utilization.\n* `indexdb` - inverted index\n* `storage/small` - recently added parts of data ingested into storage(hot data)\n* `storage/big` - small parts gradually merged into big parts (cold data)",
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"custom": {},
|
||||
"links": []
|
||||
},
|
||||
"overrides": []
|
||||
},
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 19
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 36,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"links": [],
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pluginVersion": "7.1.1",
|
||||
"pointradius": 2,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(vm_parts{job=\"$job\", instance=\"$instance\"}) by (type)",
|
||||
"format": "time_series",
|
||||
"intervalFactor": 1,
|
||||
"legendFormat": "{{type}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeFrom": null,
|
||||
"timeRegions": [],
|
||||
"timeShift": null,
|
||||
"title": "LSM parts ($instance)",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 2,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": "0",
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": "0",
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false,
|
||||
"alignLevel": null
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$ds",
|
||||
"description": "The number of on-going merges in storage nodes. It is expected to have high numbers for `storage/small` metric.",
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"custom": {},
|
||||
"links": []
|
||||
},
|
||||
"overrides": []
|
||||
},
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 27
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 62,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pluginVersion": "7.1.1",
|
||||
"pointradius": 2,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(vm_active_merges{job=\"$job\", instance=\"$instance\"}) by(type)",
|
||||
"legendFormat": "{{type}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeFrom": null,
|
||||
"timeRegions": [],
|
||||
"timeShift": null,
|
||||
"title": "Active merges ($instance)",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"decimals": 0,
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": "0",
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": "0",
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false,
|
||||
"alignLevel": null
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
@@ -1689,7 +1885,7 @@
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 52
|
||||
"y": 27
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 55,
|
||||
@@ -1764,194 +1960,6 @@
|
||||
"alignLevel": null
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$ds",
|
||||
"description": "The number of on-going merges in storage nodes. It is expected to have high numbers for `storage/small` metric.",
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"custom": {},
|
||||
"links": []
|
||||
},
|
||||
"overrides": []
|
||||
},
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 60
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 62,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pluginVersion": "7.1.1",
|
||||
"pointradius": 2,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(vm_active_merges{job=\"$job\", instance=\"$instance\"}) by(type)",
|
||||
"legendFormat": "{{type}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeFrom": null,
|
||||
"timeRegions": [],
|
||||
"timeShift": null,
|
||||
"title": "Active merges ($instance)",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"decimals": 0,
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": "0",
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": "0",
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false,
|
||||
"alignLevel": null
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$ds",
|
||||
"description": "The number of rows merged per second by storage nodes.",
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"custom": {},
|
||||
"links": []
|
||||
},
|
||||
"overrides": []
|
||||
},
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 60
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 64,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pluginVersion": "7.1.1",
|
||||
"pointradius": 2,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(vm_rows_merged_total{job=\"$job\", instance=\"$instance\"}[5m])) by(type)",
|
||||
"legendFormat": "{{type}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeFrom": null,
|
||||
"timeRegions": [],
|
||||
"timeShift": null,
|
||||
"title": "Merge speed ($instance)",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"decimals": 0,
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": "0",
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": "0",
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false,
|
||||
"alignLevel": null
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
@@ -1972,7 +1980,7 @@
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 0,
|
||||
"y": 68
|
||||
"y": 35
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 58,
|
||||
@@ -2050,6 +2058,100 @@
|
||||
"alignLevel": null
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
"dashLength": 10,
|
||||
"dashes": false,
|
||||
"datasource": "$ds",
|
||||
"description": "The number of rows merged per second by storage nodes.",
|
||||
"fieldConfig": {
|
||||
"defaults": {
|
||||
"custom": {},
|
||||
"links": []
|
||||
},
|
||||
"overrides": []
|
||||
},
|
||||
"fill": 1,
|
||||
"fillGradient": 0,
|
||||
"gridPos": {
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 35
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 64,
|
||||
"legend": {
|
||||
"avg": false,
|
||||
"current": false,
|
||||
"max": false,
|
||||
"min": false,
|
||||
"show": true,
|
||||
"total": false,
|
||||
"values": false
|
||||
},
|
||||
"lines": true,
|
||||
"linewidth": 1,
|
||||
"nullPointMode": "null",
|
||||
"percentage": false,
|
||||
"pluginVersion": "7.1.1",
|
||||
"pointradius": 2,
|
||||
"points": false,
|
||||
"renderer": "flot",
|
||||
"seriesOverrides": [],
|
||||
"spaceLength": 10,
|
||||
"stack": false,
|
||||
"steppedLine": false,
|
||||
"targets": [
|
||||
{
|
||||
"expr": "sum(rate(vm_rows_merged_total{job=\"$job\", instance=\"$instance\"}[5m])) by(type)",
|
||||
"legendFormat": "{{type}}",
|
||||
"refId": "A"
|
||||
}
|
||||
],
|
||||
"thresholds": [],
|
||||
"timeFrom": null,
|
||||
"timeRegions": [],
|
||||
"timeShift": null,
|
||||
"title": "Merge speed ($instance)",
|
||||
"tooltip": {
|
||||
"shared": true,
|
||||
"sort": 0,
|
||||
"value_type": "individual"
|
||||
},
|
||||
"type": "graph",
|
||||
"xaxis": {
|
||||
"buckets": null,
|
||||
"mode": "time",
|
||||
"name": null,
|
||||
"show": true,
|
||||
"values": []
|
||||
},
|
||||
"yaxes": [
|
||||
{
|
||||
"decimals": 0,
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": "0",
|
||||
"show": true
|
||||
},
|
||||
{
|
||||
"format": "short",
|
||||
"label": null,
|
||||
"logBase": 1,
|
||||
"max": null,
|
||||
"min": "0",
|
||||
"show": true
|
||||
}
|
||||
],
|
||||
"yaxis": {
|
||||
"align": false,
|
||||
"alignLevel": null
|
||||
}
|
||||
},
|
||||
{
|
||||
"aliasColors": {},
|
||||
"bars": false,
|
||||
@@ -2070,7 +2172,7 @@
|
||||
"h": 8,
|
||||
"w": 12,
|
||||
"x": 12,
|
||||
"y": 68
|
||||
"y": 43
|
||||
},
|
||||
"hiddenSeries": false,
|
||||
"id": 67,
|
||||
|
||||
@@ -2,9 +2,9 @@

DOCKER_NAMESPACE := victoriametrics

ROOT_IMAGE ?= alpine:3.12
CERTS_IMAGE := alpine:3.12
GO_BUILDER_IMAGE := golang:1.15.2
ROOT_IMAGE ?= alpine:3.12.3
CERTS_IMAGE := alpine:3.12.3
GO_BUILDER_IMAGE := golang:1.15.6
BUILDER_IMAGE := local/builder:2.0.0-$(shell echo $(GO_BUILDER_IMAGE) | tr : _)
BASE_IMAGE := local/base:1.1.1-$(shell echo $(ROOT_IMAGE) | tr : _)-$(shell echo $(CERTS_IMAGE) | tr : _)

5
deployment/docker/alertmanager.yml
Normal file
@@ -0,0 +1,5 @@
route:
  receiver: blackhole

receivers:
  - name: blackhole
174
deployment/docker/alerts.yml
Normal file
@@ -0,0 +1,174 @@
|
||||
# File contains default list of alerts for vm-single and vmagent services.
|
||||
# The alerts below are just recommendations and may require some updates
|
||||
# and threshold calibration according to every specific setup.
|
||||
groups:
|
||||
- name: serviceHealth
|
||||
rules:
|
||||
# note the `job` filter and update accordingly to your setup
|
||||
- alert: TooManyRestarts
|
||||
expr: changes(process_start_time_seconds{job=~"victoriametrics|vmagent|vmalert"}[15m]) > 2
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
summary: "{{ $labels.job }} too many restarts (instance {{ $labels.instance }})"
|
||||
description: "Job {{ $labels.job }} has restarted more than twice in the last 15 minutes.
|
||||
It might be crashlooping."
|
||||
|
||||
# Alerts group for VM single assumes that Grafana dashboard
|
||||
# https://grafana.com/grafana/dashboards/10229 is installed.
|
||||
# Please update the `dashboard` annotation according to your setup.
|
||||
- name: vmsingle
|
||||
interval: 30s
|
||||
concurrency: 2
|
||||
rules:
|
||||
- alert: DiskRunsOutOfSpaceIn3Days
|
||||
expr: |
|
||||
vm_free_disk_space_bytes / ignoring(path) (
|
||||
(
|
||||
sum(rate(vm_rows_added_to_storage_total[1d])) -
|
||||
sum(rate(vm_deduplicated_samples_total[1d])) without(type)
|
||||
)
|
||||
*
|
||||
(
|
||||
sum(vm_data_size_bytes{type!="indexdb"}) /
|
||||
sum(vm_rows{type!="indexdb"})
|
||||
)
|
||||
) < 3 * 24 * 3600
|
||||
for: 30m
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=73&var-instance={{ $labels.instance }}"
|
||||
summary: "Instance {{ $labels.instance }} will run out of disk space soon"
|
||||
description: "Taking into account current ingestion rate, free disk space will be enough only
|
||||
for {{ $value | humanizeDuration }} on instance {{ $labels.instance }}.\n
|
||||
Consider limiting the ingestion rate, decreasing retention or scaling the disk space if possible."
|
||||
|
||||
- alert: RequestErrorsToAPI
|
||||
expr: increase(vm_http_request_errors_total[5m]) > 0
|
||||
for: 15m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=35&var-instance={{ $labels.instance }}"
|
||||
summary: "Too many errors served for path {{ $labels.path }} (instance {{ $labels.instance }})"
|
||||
description: "Requests to path {{ $labels.path }} are receiving errors.
|
||||
Please verify if clients are sending correct requests."
|
||||
|
||||
- alert: ConcurrentFlushesHitTheLimit
|
||||
expr: vm_concurrent_addrows_current >= vm_concurrent_addrows_capacity
|
||||
for: 15m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=59&var-instance={{ $labels.instance }}"
|
||||
summary: "VictoriaMetrics on instance {{ $labels.instance }} is constantly hitting concurrent flushes limit"
|
||||
description: "The limit of concurrent flushes on instance {{ $labels.instance }} is equal to number of CPUs.\n
|
||||
When VictoriaMetrics constantly hits the limit it means that storage is overloaded and requires more CPU."
|
||||
|
||||
- alert: TooManyLogs
|
||||
expr: sum(increase(vm_log_messages_total{level!="info"}[5m])) by (job, instance) > 0
|
||||
for: 15m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=67&var-instance={{ $labels.instance }}"
|
||||
summary: "Too many logs printed for job \"{{ $labels.job }}\" ({{ $labels.instance }})"
|
||||
description: "Logging rate for job \"{{ $labels.job }}\" ({{ $labels.instance }}) is {{ $value }} for last 15m.\n
|
||||
It is worth checking the logs for specific error messages."
|
||||
|
||||
- alert: RowsRejectedOnIngestion
|
||||
expr: sum(rate(vm_rows_ignored_total[5m])) by (instance, reason) > 0
|
||||
for: 15m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=58&var-instance={{ $labels.instance }}"
|
||||
summary: "Some rows are rejected on \"{{ $labels.instance }}\" on ingestion attempt"
|
||||
description: "VM is refusing to ingest rows on \"{{ $labels.instance }}\" due to the
|
||||
following reason: \"{{ $labels.reason }}\""
|
||||
|
||||
- alert: TooHighChurnRate
|
||||
expr: |
|
||||
(
|
||||
sum(rate(vm_new_timeseries_created_total[5m])) by(instance)
|
||||
/
|
||||
sum(rate(vm_rows_inserted_total[5m])) by (instance)
|
||||
) > 0.1
|
||||
for: 15m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=66&var-instance={{ $labels.instance }}"
|
||||
summary: "Churn rate is more than 10% on \"{{ $labels.instance }}\" for the last 15m"
|
||||
description: "VM constantly creates new time series on \"{{ $labels.instance }}\".\n
|
||||
This effect is known as Churn Rate.\n
|
||||
High Churn Rate is tightly connected with database performance and may
|
||||
result in unexpected OOMs or slow queries."
|
||||
|
||||
- alert: TooHighSlowInsertsRate
|
||||
expr: |
|
||||
(
|
||||
sum(rate(vm_slow_row_inserts_total[5m])) by(instance)
|
||||
/
|
||||
sum(rate(vm_rows_inserted_total[5m])) by (instance)
|
||||
) > 0.5
|
||||
for: 15m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=68&var-instance={{ $labels.instance }}"
|
||||
summary: "Percentage of slow inserts is more than 50% on \"{{ $labels.instance }}\" for the last 15m"
|
||||
description: "High rate of slow inserts on \"{{ $labels.instance }}\" may be a sign of resource exhaustion
|
||||
for the current load. It is likely more RAM is needed for optimal handling of the current number of active time series."
|
||||
|
||||
# Alerts group for vmagent assumes that Grafana dashboard
|
||||
# https://grafana.com/grafana/dashboards/12683 is installed.
|
||||
# Please update the `dashboard` annotation according to your setup.
|
||||
- name: vmagent
|
||||
interval: 30s
|
||||
concurrency: 2
|
||||
rules:
|
||||
- alert: PersistentQueueIsDroppingData
|
||||
expr: sum(increase(vm_persistentqueue_bytes_dropped_total[5m])) by (job, instance) > 0
|
||||
for: 10m
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/G7Z9GzMGz?viewPanel=49&var-instance={{ $labels.instance }}"
|
||||
summary: "Instance {{ $labels.instance }} is dropping data from persistent queue"
|
||||
description: "Vmagent dropped {{ $value | humanize1024 }} from persistent queue
|
||||
on instance {{ $labels.instance }} for the last 10m."
|
||||
|
||||
- alert: TooManyScrapeErrors
|
||||
expr: sum(increase(vm_promscrape_scrapes_failed_total[5m])) by (job, instance) > 0
|
||||
for: 15m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/G7Z9GzMGz?viewPanel=31&var-instance={{ $labels.instance }}"
|
||||
summary: "Job \"{{ $labels.job }}\" on instance {{ $labels.instance }} fails to scrape targets for last 15m"
|
||||
|
||||
- alert: TooManyWriteErrors
|
||||
expr: |
|
||||
(sum(increase(vm_ingestserver_request_errors_total[5m])) by (job, instance)
|
||||
+
|
||||
sum(increase(vmagent_http_request_errors_total[5m])) by (job, instance)) > 0
|
||||
for: 15m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/G7Z9GzMGz?viewPanel=77&var-instance={{ $labels.instance }}"
|
||||
summary: "Job \"{{ $labels.job }}\" on instance {{ $labels.instance }} responds with errors to write requests for last 15m."
|
||||
|
||||
- alert: TooManyRemoteWriteErrors
|
||||
expr: sum(rate(vmagent_remotewrite_retries_count_total[5m])) by(job, instance, url) > 0
|
||||
for: 15m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
dashboard: "http://localhost:3000/d/G7Z9GzMGz?viewPanel=61&var-instance={{ $labels.instance }}"
|
||||
summary: "Job \"{{ $labels.job }}\" on instance {{ $labels.instance }} fails to push to remote storage"
|
||||
description: "Vmagent fails to push data via remote write protocol to destination \"{{ $labels.url }}\"\n
|
||||
Ensure that destination is up and reachable."
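The `TooHighChurnRate` and `TooHighSlowInsertsRate` rules above both compare a ratio of two per-instance rates against a fixed threshold (0.1 and 0.5 respectively). The sketch below restates that check in Python for a single instance. The function name and the sample rates are hypothetical, and the zero-denominator handling assumes ordinary floating-point division semantics (`x/0` is `+Inf`, `0/0` is `NaN`, and `NaN` never satisfies a comparison).

```python
CHURN_THRESHOLD = 0.1         # from the TooHighChurnRate expression
SLOW_INSERTS_THRESHOLD = 0.5  # from the TooHighSlowInsertsRate expression


def ratio_exceeds(numerator_rate: float, denominator_rate: float, threshold: float) -> bool:
    """Return True when numerator_rate / denominator_rate is strictly above threshold.

    Assumption: a zero denominator behaves like float division in a query
    engine, i.e. a positive numerator yields +Inf (condition holds) and a
    zero numerator yields NaN (condition does not hold).
    """
    if denominator_rate == 0:
        return numerator_rate > 0
    return numerator_rate / denominator_rate > threshold


# Hypothetical sample: 150 new series/s against 1000 inserted rows/s is a 15% churn rate.
print(ratio_exceeds(150, 1000, CHURN_THRESHOLD))         # above the 10% threshold
print(ratio_exceeds(400, 1000, SLOW_INSERTS_THRESHOLD))  # below the 50% threshold
```

Note that the rules additionally require the condition to hold for 15 minutes (`for: 15m`) before the alert fires, which this single-sample check does not model.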
|
||||
|
||||
@@ -21,7 +21,10 @@ services:
|
||||
image: victoriametrics/victoria-metrics
|
||||
ports:
|
||||
- 8428:8428
|
||||
- 8089:8089
|
||||
- 8089:8089/udp
|
||||
- 2003:2003
|
||||
- 2003:2003/udp
|
||||
- 4242:4242
|
||||
volumes:
|
||||
- vmdata:/storage
|
||||
@@ -30,6 +33,7 @@ services:
|
||||
- '--graphiteListenAddr=:2003'
|
||||
- '--opentsdbListenAddr=:4242'
|
||||
- '--httpListenAddr=:8428'
|
||||
- '--influxListenAddr=:8089'
|
||||
networks:
|
||||
- vm_net
|
||||
restart: always
|
||||
@@ -48,6 +52,40 @@ services:
|
||||
networks:
|
||||
- vm_net
|
||||
restart: always
|
||||
vmalert:
|
||||
container_name: vmalert
|
||||
image: victoriametrics/vmalert
|
||||
depends_on:
|
||||
- "victoriametrics"
|
||||
- "alertmanager"
|
||||
ports:
|
||||
- 8880:8880
|
||||
volumes:
|
||||
- ./alerts.yml:/etc/alerts/alerts.yml
|
||||
command:
|
||||
- '--datasource.url=http://victoriametrics:8428/'
|
||||
- '--remoteRead.url=http://victoriametrics:8428/'
|
||||
- '--remoteWrite.url=http://victoriametrics:8428/'
|
||||
- '--notifier.url=http://alertmanager:9093/'
|
||||
- '--rule=/etc/alerts/*.yml'
|
||||
# display source of alerts in grafana
|
||||
- '--external.url=http://127.0.0.1:3000' # Grafana address outside the container
|
||||
- '--external.alert.source=explore?orgId=1&left=["now-1h","now","VictoriaMetrics",{"expr":"{{$$expr|quotesEscape|pathEscape}}"},{"mode":"Metrics"},{"ui":[true,true,true,"none"]}]' ## when copy-pasting this line, note that '$$' escapes '$' in '$expr'
|
||||
networks:
|
||||
- vm_net
|
||||
restart: always
|
||||
alertmanager:
|
||||
container_name: alertmanager
|
||||
image: prom/alertmanager
|
||||
volumes:
|
||||
- ./alertmanager.yml:/config/alertmanager.yml
|
||||
command:
|
||||
- '--config.file=/config/alertmanager.yml'
|
||||
ports:
|
||||
- 9093:9093
|
||||
networks:
|
||||
- vm_net
|
||||
restart: always
|
||||
volumes:
|
||||
vmagentdata: {}
|
||||
vmdata: {}
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
global:
|
||||
scrape_interval: 10s
|
||||
scrape_interval: 10s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: 'vmagent'
|
||||
|
||||
@@ -1,41 +1,16 @@
|
||||
# Articles
|
||||
|
||||
## Third-party articles and slides about VictoriaMetrics
|
||||
|
||||
## Our articles
|
||||
|
||||
* [Open-sourcing VictoriaMetrics](https://medium.com/@valyala/open-sourcing-victoriametrics-f31e34485c2b)
|
||||
* [How we created VictoriaMetrics](https://medium.com/devopslinks/victoriametrics-creating-the-best-remote-storage-for-prometheus-5d92d66787ac)
|
||||
* [VictoriaMetrics vs TimescaleDB vs InfluxDB benchmarks on 40K unique time series](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
|
||||
* [VictoriaMetrics vs TimescaleDB vs InfluxDB benchmarks on 400K, 4M and 40M unique time series](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
|
||||
* [Insert benchmarks for VictoriaMetrics vs InfluxDB on high-cardinality data](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893)
|
||||
* [Measuring vertical scalability for time series databases in Google Cloud](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
|
||||
* [How VictoriaMetrics creates instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
|
||||
* [Prometheus Subqueries in VictoriaMetrics](https://medium.com/@valyala/prometheus-subqueries-in-victoriametrics-9b1492b720b3)
|
||||
* [Why irate from Prometheus doesn't capture spikes](https://medium.com/@valyala/why-irate-from-prometheus-doesnt-capture-spikes-45f9896d7832)
|
||||
* [Why mmap'ed files in Go may hurt performance](https://medium.com/@valyala/mmap-in-go-considered-harmful-d92a25cb161d)
|
||||
* [WAL Usage Looks Broken in Modern TSDBs](https://medium.com/@valyala/wal-usage-looks-broken-in-modern-time-series-databases-b62a627ab704)
|
||||
* [Analyzing Prometheus data with external tools](https://medium.com/@valyala/analyzing-prometheus-data-with-external-tools-5f3e5e147639)
|
||||
* [Stripping dependency bloat in VictoriaMetrics Docker image](https://medium.com/@valyala/stripping-dependency-bloat-in-victoriametrics-docker-image-983fb5912b0d)
|
||||
* [PromQL tutorial for beginners](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085)
|
||||
* [Achieving better compression for time series data than Gorilla](https://medium.com/@valyala/victoriametrics-achieving-better-compression-for-time-series-data-than-gorilla-317bc1f95932)
|
||||
* [Comparing Thanos to VictoriaMetrics cluster](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683)
|
||||
* [Speeding up backups for big time series databases](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883)
|
||||
* [Evaluating performance and correctness: VictoriaMetrics response](https://medium.com/@valyala/evaluating-performance-and-correctness-victoriametrics-response-e27315627e87)
|
||||
* [Improving histogram usability for Prometheus and Grafana](https://medium.com/@valyala/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350)
|
||||
* [Prometheus storage: tech terms for humans](https://medium.com/@valyala/prometheus-storage-technical-terms-for-humans-4ab4de6c3d48)
|
||||
* [Billy: how VictoriaMetrics deals with more than 500 billion rows](https://medium.com/@valyala/billy-how-victoriametrics-deals-with-more-than-500-billion-rows-e82ff8f725da)
|
||||
* [How to migrate data from Prometheus to VictoriaMetrics](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-d44a6728f043)
|
||||
* [Filtering and modifying time series during import to VictoriaMetrics](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-filtering-and-modifying-time-series-6d40cea4bf21)
|
||||
* [Anomaly Detection in VictoriaMetrics](https://medium.com/@VictoriaMetrics/anomaly-detection-in-victoriametrics-9528538786a7)
|
||||
|
||||
|
||||
## Third-party articles and slides
|
||||
|
||||
* [Foiled by the Firewall: A Tale of Transition From Prometheus to VictoriaMetrics](https://www.percona.com/blog/2020/12/01/foiled-by-the-firewall-a-tale-of-transition-from-prometheus-to-victoriametrics/)
|
||||
* [Observations on Better Resource Usage with Percona Monitoring and Management v2.12.0](https://www.percona.com/blog/2020/12/23/observations-on-better-resource-usage-with-percona-monitoring-and-management-v2-12-0/)
|
||||
* [Better Prometheus rate() function with VictoriaMetrics](https://www.percona.com/blog/2020/02/28/better-prometheus-rate-function-with-victoriametrics/)
|
||||
* [Percona monitoring and management migration from Prometheus to VictoriaMetrics FAQ](https://www.percona.com/blog/2020/12/16/percona-monitoring-and-management-migration-from-prometheus-to-victoriametrics-faq/)
|
||||
* [Making peace with Prometheus rate()](https://blog.doit-intl.com/making-peace-with-prometheus-rate-43a3ea75c4cf)
|
||||
* [Infrastructure monitoring with Prometheus at Zerodha](https://zerodha.tech/blog/infra-monitoring-at-zerodha/)
|
||||
* [Sismology: Iguana Solutions’ Monitoring System](https://medium.com/@IG1.com/sismology-iguana-solutions-monitoring-system-f46e4170447f)
|
||||
* [Prometheus High Availability and Fault Tolerance strategy, long term storage with VictoriaMetrics](https://medium.com/miro-engineering/prometheus-high-availability-and-fault-tolerance-strategy-long-term-storage-with-victoriametrics-82f6f3f0409e)
|
||||
* [How we improved our Kubernetes monitoring at Smarkets, and how you could too](https://smarketshq.com/monitoring-kubernetes-clusters-41a4b24c19e3)
|
||||
* [Monitoring K8S with VictoriaMetrics](https://docs.google.com/presentation/d/1g7yUyVEaAp4tPuRy-MZbPXKqJ1z78_5VKuV841aQfsg/edit)
|
||||
* [CMS monitoring R&D: Real-time monitoring and alerts](https://indico.cern.ch/event/877333/contributions/3696707/attachments/1972189/3281133/CMS_mon_RD_for_opInt.pdf)
|
||||
* [The CMS monitoring infrastructure and applications](https://arxiv.org/pdf/2007.03630.pdf)
|
||||
@@ -47,3 +22,56 @@
|
||||
* [Calculating the Error of Quantile Estimation with Histograms](https://linuxczar.net/blog/2020/08/13/histogram-error/)
|
||||
* [Monitoring private clouds with VictoriaMetrics at LeroyMerlin](https://www.youtube.com/watch?v=74swsWqf0Uc)
|
||||
* [Monitoring Kubernetes with VictoriaMetrics+Prometheus](https://speakerdeck.com/bo0km4n/victoriametrics-plus-prometheusdegou-zhu-surufu-shu-kubernetesfalsejian-shi-ji-pan)
|
||||
* [High-performance Graphite storage solution on top of VictoriaMetrics](https://golangexample.com/a-high-performance-graphite-storage-solution/)
|
||||
* [Cloud Native Model Driven Telemetry Stack on OpenShift](https://cer6erus.medium.com/cloud-native-model-driven-telemetry-stack-on-openshift-80712621f5bc)
|
||||
|
||||
|
||||
## Our articles
|
||||
|
||||
### Announcements
|
||||
|
||||
* [Open-sourcing VictoriaMetrics](https://medium.com/@valyala/open-sourcing-victoriametrics-f31e34485c2b)
|
||||
* [How we created VictoriaMetrics](https://medium.com/devopslinks/victoriametrics-creating-the-best-remote-storage-for-prometheus-5d92d66787ac)
|
||||
* [Anomaly Detection in VictoriaMetrics](https://medium.com/@VictoriaMetrics/anomaly-detection-in-victoriametrics-9528538786a7)
|
||||
|
||||
|
||||
### Benchmarks
|
||||
|
||||
* [VictoriaMetrics vs TimescaleDB vs InfluxDB benchmarks on 40K unique time series](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
|
||||
* [VictoriaMetrics vs TimescaleDB vs InfluxDB benchmarks on 400K, 4M and 40M unique time series](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
|
||||
* [Insert benchmarks for VictoriaMetrics vs InfluxDB on high-cardinality data](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893)
|
||||
* [Measuring vertical scalability for time series databases in Google Cloud](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
|
||||
* [Billy: how VictoriaMetrics deals with more than 500 billion rows](https://medium.com/@valyala/billy-how-victoriametrics-deals-with-more-than-500-billion-rows-e82ff8f725da)
|
||||
* [First look at performance comparison between InfluxDB IOx and VictoriaMetrics](https://medium.com/@VictoriaMetrics/first-look-at-perfomance-comparassion-between-influxdb-iox-and-victoriametrics-e590f847935b)
|
||||
* [Prometheus vs VictoriaMetrics benchmark on node-exporter metrics](https://valyala.medium.com/prometheus-vs-victoriametrics-benchmark-on-node-exporter-metrics-4ca29c75590f)
|
||||
* [Promscale vs VictoriaMetrics: resource usage on production workload](https://valyala.medium.com/promscale-vs-victoriametrics-resource-usage-on-production-workload-91c8e3786c03)
|
||||
|
||||
|
||||
### Technical articles
|
||||
|
||||
* [How VictoriaMetrics creates instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
|
||||
* [WAL Usage Looks Broken in Modern TSDBs](https://medium.com/@valyala/wal-usage-looks-broken-in-modern-time-series-databases-b62a627ab704)
|
||||
* [Why mmap'ed files in Go may hurt performance](https://medium.com/@valyala/mmap-in-go-considered-harmful-d92a25cb161d)
|
||||
* [Achieving better compression for time series data than Gorilla](https://medium.com/@valyala/victoriametrics-achieving-better-compression-for-time-series-data-than-gorilla-317bc1f95932)
|
||||
* [Stripping dependency bloat in VictoriaMetrics Docker image](https://medium.com/@valyala/stripping-dependency-bloat-in-victoriametrics-docker-image-983fb5912b0d)
|
||||
* [Speeding up backups for big time series databases](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883)
|
||||
* [Improving histogram usability for Prometheus and Grafana](https://medium.com/@valyala/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350)
|
||||
* [Why irate from Prometheus doesn't capture spikes](https://medium.com/@valyala/why-irate-from-prometheus-doesnt-capture-spikes-45f9896d7832)
|
||||
|
||||
|
||||
### Tutorials, guides and how-to articles
|
||||
|
||||
* [PromQL tutorial for beginners](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085)
|
||||
* [Analyzing Prometheus data with external tools](https://medium.com/@valyala/analyzing-prometheus-data-with-external-tools-5f3e5e147639)
|
||||
* [Prometheus Subqueries in VictoriaMetrics](https://medium.com/@valyala/prometheus-subqueries-in-victoriametrics-9b1492b720b3)
|
||||
* [How to migrate data from Prometheus to VictoriaMetrics](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-d44a6728f043)
|
||||
* [Filtering and modifying time series during import to VictoriaMetrics](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-filtering-and-modifying-time-series-6d40cea4bf21)
|
||||
* [How to use relabeling in Prometheus and VictoriaMetrics](https://valyala.medium.com/how-to-use-relabeling-in-prometheus-and-victoriametrics-8b90fc22c4b2)
|
||||
* [How to monitor Go applications with VictoriaMetrics](https://victoriametrics.medium.com/how-to-monitor-go-applications-with-victoriametrics-c04703110870)
|
||||
* [Prometheus storage: tech terms for humans](https://medium.com/@valyala/prometheus-storage-technical-terms-for-humans-4ab4de6c3d48)
|
||||
|
||||
|
||||
### Other articles
|
||||
|
||||
* [Comparing Thanos to VictoriaMetrics cluster](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683)
|
||||
* [Evaluating performance and correctness: VictoriaMetrics response](https://medium.com/@valyala/evaluating-performance-and-correctness-victoriametrics-response-e27315627e87)
|
||||
|
||||
318
docs/CHANGELOG.md
Normal file
@@ -0,0 +1,318 @@
|
||||
# CHANGELOG
|
||||
|
||||
# tip
|
||||
|
||||
|
||||
# [v1.52.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.52.0)
|
||||
|
||||
* FEATURE: provide a sample list of alerting rules for VictoriaMetrics components. It is available [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml).
|
||||
* FEATURE: disable final merge for data for the previous month at the beginning of new month, since it may result in high disk IO and CPU usage. Final merge can be enabled by setting `-finalMergeDelay` command-line flag to positive duration.
|
||||
* FEATURE: add `tfirst_over_time(m[d])` and `tlast_over_time(m[d])` functions to [MetricsQL](https://victoriametrics.github.io/MetricsQL.html) for returning timestamps for the first and the last data point in `m` over `d` duration.
|
||||
* FEATURE: add ability to pass multiple labels to `sort_by_label()` and `sort_by_label_desc()` functions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/992 .
|
||||
* FEATURE: enforce at least TLS v1.2 when accepting HTTPS requests if `-tls`, `-tlsCertFile` and `-tlsKeyFile` command-line flags are set, because older TLS protocols such as v1.0 and v1.1 have been deprecated due to security vulnerabilities.
|
||||
* FEATURE: support `extra_label` query arg for all HTTP-based [data ingestion protocols](https://victoriametrics.github.io/#how-to-import-time-series-data). This query arg can be used for specifying extra labels which should be added for the ingested data.
|
||||
* FEATURE: vmbackup: increase backup chunk size from 128MB to 1GB. This should reduce the number of Object storage API calls during backups by 8x. This may also reduce costs, since object storage API calls usually have non-zero costs. See https://aws.amazon.com/s3/pricing/ and https://cloud.google.com/storage/pricing#operations-pricing .
|
||||
|
||||
* BUGFIX: properly parse escaped unicode chars in MetricsQL metric names, label names and function names. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/990
|
||||
* BUGFIX: override user-provided labels with labels set in `extra_label` query args during data ingestion over HTTP-based protocols.
|
||||
* BUGFIX: vmagent: prevent `dialing to the given TCP address time out` errors when scraping a big number of unavailable targets. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/987
|
||||
* BUGFIX: vmagent: properly show scrape duration on `/targets` page. Previously it was incorrectly shown as 0.000s.
|
||||
* BUGFIX: vmagent: properly log errors when `-promscrape.streamParse` command-line flag is set. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1009
|
||||
* BUGFIX: vmagent: properly suppress errors when both `-promscrape.suppressScrapeErrors` and `-promscrape.streamParse` command-line flags are set. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1009 .
|
||||
* BUGFIX: vmalert: return non-empty result in template func `query` stub to pass validation. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/989 .
|
||||
* BUGFIX: upgrade base image for Docker packages from Alpine 3.12.1 to Alpine 3.12.3 in order to fix potential security issues. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1010
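The 8x figure in the vmbackup entry above follows from the ratio of the two chunk sizes. A minimal sketch of that arithmetic, assuming binary units (1GB = 1024MB), one object storage API call per uploaded chunk, and a hypothetical 100GB backup:

```python
MB = 1024 * 1024  # assumption: binary megabytes
GB = 1024 * MB


def chunk_count(total_bytes: int, chunk_bytes: int) -> int:
    """Number of chunks needed to cover total_bytes (integer ceiling division)."""
    return -(-total_bytes // chunk_bytes)


backup_size = 100 * GB  # hypothetical backup size
old_calls = chunk_count(backup_size, 128 * MB)
new_calls = chunk_count(backup_size, 1 * GB)
print(old_calls, new_calls, old_calls // new_calls)
```

For backups smaller than one chunk the saving disappears, since at least one upload is needed either way.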
|
||||
|
||||
|
||||
# [v1.51.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.51.0)
|
||||
|
||||
* FEATURE: add `/api/v1/status/top_queries` handler, which returns the most frequently executed queries and queries that took the most time for execution. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/907
|
||||
* FEATURE: vmagent: add support for `proxy_url` config option in Prometheus scrape configs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/503
|
||||
* FEATURE: remove parts with stale data as soon as they go outside the configured `-retentionPeriod`. Previously such parts could remain active for long periods of time. This should help reduce disk usage for `-retentionPeriod` smaller than one month.
|
||||
* FEATURE: vmalert: allow setting multiple values for `-notifier.tlsInsecureSkipVerify` command-line flag per each `-notifier.url`.
|
||||
|
||||
* BUGFIX: vmalert: properly escape multiline queries when passing them to Grafana. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890
|
||||
* BUGFIX: vmagent: set missing `__meta_kubernetes_service_*` labels in `kubernetes_sd_config` for `endpoints` and `endpointslices` roles. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/982
|
||||
* BUGFIX: do not adjust `offset` value provided in MetricsQL query. Previously it could be modified in order to improve response cache hit ratio. This is unneeded, since cache hit ratio should remain good because the query time range should be already aligned to multiple of `step` values. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/976
|
||||
|
||||
|
||||
# [v1.50.2](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.50.2)
|
||||
|
||||
* FEATURE: do not publish duplicate Docker images with `-cluster` tag suffix for [vmagent](https://victoriametrics.github.io/vmagent.html), [vmalert](https://victoriametrics.github.io/vmalert.html), [vmauth](https://victoriametrics.github.io/vmauth.html), [vmbackup](https://victoriametrics.github.io/vmbackup.html) and [vmrestore](https://victoriametrics.github.io/vmrestore.html), since they are identical to images without `-cluster` tag suffix.
|
||||
|
||||
* BUGFIX: vmalert: properly populate template variables. This has been broken in v1.50.0. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/974
|
||||
* BUGFIX: properly parse negative combined duration in MetricsQL such as `-1h3m4s`. It must be parsed as `-(1h + 3m + 4s)`. Previously it was parsed as `-1h + 3m + 4s`.
|
||||
* BUGFIX: properly parse lines in [Prometheus exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md) and in [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md) with whitespace after the timestamp. For example, `foo 123 456 # some comment here`. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/970
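The negative combined duration fix above can be illustrated with a small parser that applies the leading sign to the sum of all parts rather than to the first part only. This is a sketch and not the MetricsQL parser; the function name and the unit table are illustrative assumptions.

```python
import re

# Illustrative unit table (seconds per unit); not taken from the MetricsQL source.
_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
# "ms" is listed before "m" and "s" so that it is matched as a single unit.
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h|d|w)")


def parse_duration_seconds(text: str) -> float:
    """Parse a combined duration such as '-1h3m4s' into seconds.

    The sign applies to the whole sum: '-1h3m4s' means -(1h + 3m + 4s).
    """
    if not text:
        raise ValueError("empty duration")
    sign = -1 if text[0] == "-" else 1
    body = text[1:] if text[0] in "+-" else text
    if not body:
        raise ValueError("duration has no parts")
    total = 0.0
    pos = 0
    while pos < len(body):
        match = _PART.match(body, pos)
        if match is None:
            raise ValueError(f"cannot parse duration {text!r} at offset {pos}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    return sign * total


print(parse_duration_seconds("-1h3m4s"))
```

The earlier behavior described in the entry would correspond to negating only the `1h` part, giving `-3600 + 180 + 4` seconds instead of `-(3600 + 180 + 4)`.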
|
||||
|
||||
|
||||
# [v1.50.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.50.1)
|
||||
|
||||
* FEATURE: vmagent: export `vmagent_remotewrite_bytes_sent_total` and `vmagent_remotewrite_blocks_sent_total` metrics for each `-remoteWrite.url`.
|
||||
|
||||
* BUGFIX: vmagent: properly delete unregistered scrape targets from `/targets` and `/api/v1/targets` pages. They weren't deleted due to the bug in `v1.50.0`.
|
||||
|
||||
|
||||
# [v1.50.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.50.0)
|
||||
|
||||
* FEATURE: automatically reset response cache when samples with timestamps older than `now - search.cacheTimestampOffset` are ingested to VictoriaMetrics. This makes it unnecessary to disable the response cache during data backfilling or to reset it after backfilling is complete, as described [in these docs](https://victoriametrics.github.io/#backfilling). This feature applies only to single-node VictoriaMetrics. It doesn't apply to cluster version of VictoriaMetrics because `vminsert` nodes don't know about `vmselect` nodes where the response cache must be reset.
|
||||
* FEATURE: vmalert: add `query`, `first` and `value` functions to alert templates. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/539
|
||||
* FEATURE: vmagent: return user-friendly HTML page when requesting `/targets` page from web browser. The page is returned in the old plaintext format when requesting via curl or similar tool.
|
||||
* FEATURE: allow multiple whitespace chars between measurements, fields and timestamp when parsing InfluxDB line protocol.
|
||||
Though [InfluxDB line protocol](https://docs.influxdata.com/influxdb/v1.8/write_protocols/line_protocol_tutorial/) disallows multiple whitespace chars between these entities,
|
||||
some apps improperly put multiple whitespace chars. This workaround allows accepting data from such apps.
|
||||
* FEATURE: export `vm_promscrape_active_scrapers{type="<sd_type>"}` metric for tracking the number of active scrapers per each service discovery type.
|
||||
* FEATURE: export `vm_promscrape_scrapers_started_total{type="<sd_type>"}` and `vm_promscrape_scrapers_stopped_total{type="<sd_type>"}` metrics for tracking churn rate for scrapers
|
||||
per each service discovery type.
|
||||
* FEATURE: vmagent: allow setting per-`-remoteWrite.url` command-line flags for `-remoteWrite.sendTimeout` and `-remoteWrite.tlsInsecureSkipVerify`.
|
||||
|
||||
* BUGFIX: properly handle `*` and `[...]` inside curly braces in query passed to Graphite Metrics API. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/952
|
||||
* BUGFIX: vmagent: fix memory leak when big number of targets is discovered via service discovery.
|
||||
* BUGFIX: vmagent: properly pass `datacenter` filter to Consul API server. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574#issuecomment-740454170
|
||||
* BUGFIX: properly handle CPU limits set on the host system or host container. The bugfix may result in lower memory usage on systems with CPU limits. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946
|
||||
* BUGFIX: prevent duplicate `name` tag from being returned by `/tags/autoComplete/tags` handler. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/942
|
||||
* BUGFIX: do not enable strict parsing for `-promscrape.config` if `-promscrape.config.dryRun` command-line flag is set. Strict parsing can be enabled with `-promscrape.config.strictParse` command-line flag. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/944
|
||||
* BUGFIX: vminsert: properly update `vm_rpc_rerouted_rows_processed_total` metric. Previously it wasn't updated. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/955
|
||||
* BUGFIX: vmagent: properly recover when opening incorrectly stored persistent queue. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/964
|
||||
* BUGFIX: vmagent: properly handle scrape errors when stream parsing is enabled with `-promscrape.streamParse` command-line flag or with `stream_parse: true` per-target config option. Previously such errors weren't reported at `/targets` page. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/967
|
||||
* BUGFIX: assume the previous value is 0 when calculating `increase()` for the first point on the graph if its value doesn't exceed 100 and the delta between two first points equals to 0. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/962
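The InfluxDB line protocol entry above (accepting multiple whitespace chars between measurement, fields and timestamp) can be sketched as follows. This is a deliberately simplified illustration with a hypothetical function name: it ignores escaped spaces and quoted string field values, which a real parser has to handle.

```python
from typing import Optional, Tuple


def split_influx_line(line: str) -> Tuple[str, str, Optional[int]]:
    """Split a line into (measurement_and_tags, fields, timestamp).

    str.split() without arguments treats any run of whitespace as a single
    separator, so repeated spaces between the entities are tolerated.
    Simplification: escaped spaces and quoted strings are not supported.
    """
    parts = line.split()
    if len(parts) == 2:
        measurement_and_tags, fields = parts
        return measurement_and_tags, fields, None
    if len(parts) == 3:
        measurement_and_tags, fields, timestamp = parts
        return measurement_and_tags, fields, int(timestamp)
    raise ValueError(f"unexpected number of entities in line: {line!r}")


# A strict single-space split would produce empty tokens for this input.
print(split_influx_line("cpu,host=a   usage=0.5    1609459200000000000"))
```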
|
||||
|
||||
|
||||
# [v1.49.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.49.0)
|
||||
|
||||
* FEATURE: optimize Consul service discovery speed when discovering big number of services. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
|
||||
* FEATURE: add `label_uppercase(q, label1, ... labelN)` and `label_lowercase(q, label1, ... labelN)` functions to [MetricsQL](https://victoriametrics.github.io/MetricsQL.html)
|
||||
for uppercasing and lowercasing values for the given labels. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/936
|
||||
* FEATURE: add `count_eq_over_time(m[d], N)` and `count_ne_over_time(m[d], N)` for counting the number of samples for `m` over `d` that are equal / not equal to `N`.
|
||||
* FEATURE: do not print usage info for all the command-line flags when incorrect command-line flag is passed. Previously it could be hard to read the error message
|
||||
about incorrect command-line flag because of too big usage info for all the flags.
|
||||
* FEATURE: upgrade Go builder from v1.15.5 to v1.15.6 . This fixes [issues found in Go since v1.15.5](https://github.com/golang/go/issues?q=milestone%3AGo1.15.6+label%3ACherryPickApproved).
|
||||
|
||||
* BUGFIX: properly parse timestamps in OpenMetrics format - they are exposed as floating-point number in seconds instead of integer milliseconds
|
||||
unlike in Prometheus exposition format. See [the docs](https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md#timestamps).
|
||||
* BUGFIX: return `nan` for `a >bool b` query when `a` equals to `nan` like Prometheus does. Previously `0` was returned in this case. This applies to any comparison operation
|
||||
with `bool` modifier. See [these docs](https://prometheus.io/docs/prometheus/latest/querying/operators/#comparison-binary-operators) for details.
|
||||
* BUGFIX: properly parse hex numbers in MetricsQL. Previously hex numbers with non-decimal digits such as `0x3b` couldn't be parsed.
|
||||
* BUGFIX: handle `time() cmp_op metric` like Prometheus does - i.e. return `metric` value if `cmp_op` comparison is true. Previously `time()` value was returned.
|
||||
* BUGFIX: return `nan` for `minute(m)` query when `m` equals to `nan` like Prometheus does. This applies to all the time-related functions such as `day_of_month`, `day_of_week`,
|
||||
`days_in_month`, `hour`, `month` and `year`.
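The OpenMetrics timestamp fix above comes down to a unit difference: OpenMetrics exposes timestamps as floating-point seconds, while the Prometheus exposition format uses integer milliseconds. A minimal sketch of normalizing both to integer milliseconds; the function names are hypothetical and the sample values are made up for illustration.

```python
def openmetrics_timestamp_to_ms(text: str) -> int:
    """Convert an OpenMetrics timestamp (floating-point seconds) to integer milliseconds."""
    return int(round(float(text) * 1000))


def prometheus_timestamp_to_ms(text: str) -> int:
    """Prometheus exposition format timestamps are already integer milliseconds."""
    return int(text)


# The same instant written in both formats.
print(openmetrics_timestamp_to_ms("1609459200.5"))
print(prometheus_timestamp_to_ms("1609459200500"))
```

Reading an OpenMetrics value with the Prometheus rule would treat seconds as milliseconds and place the sample roughly a thousand times closer to the Unix epoch than intended.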
|
||||
|
||||
|
||||
# [v1.48.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.48.0)
|
||||
|
||||
* FEATURE: added [Snap package for single-node VictoriaMetrics](https://snapcraft.io/victoriametrics). This simplifies installation under Ubuntu to a single command:
|
||||
```bash
|
||||
snap install victoriametrics
|
||||
```
|
||||
* FEATURE: vmselect: add `-replicationFactor` command-line flag for reducing query duration when replication is enabled and a part of vmstorage nodes
|
||||
are temporarily slow and/or temporarily unavailable. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
|
||||
* FEATURE: vminsert: export `vm_rpc_vmstorage_is_reachable` metric, which can be used for monitoring reachability of vmstorage nodes from vminsert nodes.
|
||||
* FEATURE: vmagent: add [Netflix Eureka](https://github.com/Netflix/eureka) service discovery (aka [eureka_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config)). See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/851
|
||||
* FEATURE: add `filters` option to `dockerswarm_sd_config` like Prometheus did in v2.23.0 - see https://github.com/prometheus/prometheus/pull/8074
|
||||
* FEATURE: expose `__meta_ec2_ipv6_addresses` label for `ec2_sd_config` like Prometheus will do in the next release.
|
||||
* FEATURE: add `-loggerWarnsPerSecondLimit` command-line flag for rate limiting of WARN messages in logs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/905
|
||||
* FEATURE: apply `-loggerErrorsPerSecondLimit` and `-loggerWarnsPerSecondLimit` rate limit per caller. I.e. log messages are suppressed if the same caller logs the same message
|
||||
at the rate exceeding the given limit. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/905#issuecomment-729395855
|
||||
* FEATURE: add remoteAddr to slow query log in order to simplify identifying the client that sends slow queries to VictoriaMetrics.
|
||||
Slow query logging is controlled with `-search.logSlowQueryDuration` command-line flag.
|
||||
* FEATURE: add `/tags/delSeries` handler from Graphite Tags API. See https://victoriametrics.github.io/#graphite-tags-api-usage
|
||||
* FEATURE: log metric name plus all its labels when the metric timestamp is out of the configured retention. This should simplify detecting the source of metrics with unexpected timestamps.
|
||||
* FEATURE: add `-dryRun` command-line flag to single-node VictoriaMetrics in order to check config file pointed by `-promscrape.config`.
|
||||
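Per-caller rate limiting of log messages, as described for `-loggerErrorsPerSecondLimit` and `-loggerWarnsPerSecondLimit`, can be sketched as a counter per call site that resets every second. This is only an illustration of the idea: the class name, the `caller` key format and the rule that a limit of `0` disables limiting are assumptions, and the real implementation may differ.

```python
from typing import Dict, Tuple


class PerCallerLimiter:
    """Allow at most `limit_per_second` messages per caller per second."""

    def __init__(self, limit_per_second: int) -> None:
        self.limit = limit_per_second
        # caller -> (second the counter belongs to, messages seen in it)
        self._windows: Dict[str, Tuple[int, int]] = {}

    def allow(self, caller: str, now: float) -> bool:
        second = int(now)
        start, count = self._windows.get(caller, (second, 0))
        if start != second:
            # A new one-second window starts: reset the counter.
            start, count = second, 0
        if self.limit > 0 and count >= self.limit:
            # Assumption: limit <= 0 means "no limit".
            self._windows[caller] = (start, count)
            return False
        self._windows[caller] = (start, count + 1)
        return True


limiter = PerCallerLimiter(limit_per_second=2)
decisions = [limiter.allow("storage.go:42", now=10.1 + i / 10) for i in range(3)]
```

Because the counters are keyed by caller, a noisy call site does not suppress messages from other call sites.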
|
||||
* BUGFIX: properly parse Prometheus metrics with [exemplars](https://github.com/OpenObservability/OpenMetrics/blob/master/OpenMetrics.md#exemplars-1) such as `foo 123 # {bar="baz"} 1`.
|
||||
* BUGFIX: properly parse "infinity" values in [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/OpenMetrics.md#abnf).
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/924
|
||||
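To make the two parsing fixes concrete, here is a simplified sketch of reading a sample line that may carry an exemplar or an "infinity" value. It is not the VictoriaMetrics parser: `strip_exemplar` and `parse_sample` are hypothetical helpers, comment lines starting with `#` are not handled, and timestamps after the value are ignored.

```python
from typing import Tuple


def strip_exemplar(line: str) -> str:
    """Drop everything from the first `#` that sits outside a quoted label value."""
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and in_quotes:
            i += 2  # skip the escaped character inside a quoted value
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "#" and not in_quotes:
            return line[:i].rstrip()
        i += 1
    return line.rstrip()


def parse_sample(line: str) -> Tuple[str, float]:
    """Return (series, value) for one sample line."""
    body = strip_exemplar(line)
    end = body.rfind("}") + 1 if "{" in body else body.index(" ")
    fields = body[end:].split()
    # float() accepts "Infinity", "+Inf" and "-Inf" case-insensitively.
    return body[:end], float(fields[0])


series, value = parse_sample('foo 123 # {bar="baz"} 1')
```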
|
||||
|
||||
# [v1.47.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.47.0)
|
||||
|
||||
* FEATURE: vmselect: return the original error from `vmstorage` node in query response if `-search.denyPartialResponse` is set.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/891
|
||||
* FEATURE: vmselect: add `"isPartial":{true|false}` field in JSON output for `/api/v1/*` functions
|
||||
from [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/). `"isPartial":true` is set if the response contains partial data
|
||||
because a part of `vmstorage` nodes was unavailable during query processing.
|
||||
* FEATURE: improve performance for `/api/v1/series`, `/api/v1/labels` and `/api/v1/label/<labelName>/values` on time ranges exceeding one day.
|
||||
* FEATURE: vmagent: reduce memory usage when service discovery detects big number of scrape targets and the set of discovered targets changes over time.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
|
||||
* FEATURE: vmagent: add `-promscrape.dropOriginalLabels` command-line option, which can be used for reducing memory usage when scraping big number of targets.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825#issuecomment-724308361 for details.
|
||||
* FEATURE: vmalert: explicitly set extra labels to alert entities. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/870
|
||||
* FEATURE: add `-search.treatDotsAsIsInRegexps` command-line flag, which can be used for automatic escaping of dots in regexp label filters used in queries.
|
||||
For example, if `-search.treatDotsAsIsInRegexps` is set, then the query `foo{bar=~"aaa.bb.cc|dd.eee"}` is automatically converted to `foo{bar=~"aaa\\.bb\\.cc|dd\\.eee"}`.
|
||||
This may be useful for querying Graphite data.
|
||||
* FEATURE: consistently return text-based HTTP responses such as `text/plain` and `application/json` with `charset=utf-8`.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897
|
||||
* FEATURE: update Go builder from v1.15.4 to v1.15.5. This should fix [these issues in Go](https://github.com/golang/go/issues?q=milestone%3AGo1.15.5+label%3ACherryPickApproved).
|
||||
* FEATURE: added `/internal/force_flush` http handler for flushing recently ingested data from in-memory buffers to persistent storage.
|
||||
See [troubleshooting docs](https://victoriametrics.github.io/#troubleshooting) for more details.
|
||||
* FEATURE: added [Graphite Tags API](https://graphite.readthedocs.io/en/stable/tags.html) support.
|
||||
See [these docs](https://victoriametrics.github.io/#graphite-tags-api-usage) for details.
|
||||
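The dot escaping performed under `-search.treatDotsAsIsInRegexps` can be approximated with a single regular-expression substitution. The sketch below is an assumption about reasonable behaviour rather than the actual implementation: it leaves a dot alone when it is already escaped or when it is followed by a quantifier (`*`, `+`, `?`, `{`), since such a dot is most likely meant as a wildcard. The doubled backslashes in the changelog example are the string-literal spelling of a single backslash in the regexp.

```python
import re


def escape_dots(regexp: str) -> str:
    """Escape literal-looking dots in a regexp label filter value."""
    # Skip dots preceded by a backslash or followed by a quantifier.
    return re.sub(r"(?<!\\)\.(?![*+?{])", r"\\.", regexp)


escaped = escape_dots("aaa.bb.cc|dd.eee")
```

With this rule a Graphite-style name such as `aaa.bb.cc` matches only itself, while patterns like `foo.*` keep their wildcard meaning.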
|
||||
* BUGFIX: do not return data points in the end of the selected time range for time series ending in the middle of the selected time range.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/887 and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
|
||||
* BUGFIX: remove spikes at the end of time series gaps for `increase()` or `delta()` functions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/894
|
||||
* BUGFIX: vminsert: properly return HTTP 503 status code when all the vmstorage nodes are unavailable. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/896
|
||||
|
||||
|
||||
# [v1.46.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.46.0)
|
||||
|
||||
* FEATURE: optimize requests to `/api/v1/labels` and `/api/v1/label/<name>/values` when `start` and `end` args are set.
|
||||
* FEATURE: reduce memory usage when query touches big number of time series.
|
||||
* FEATURE: vmagent: reduce memory usage when `kubernetes_sd_config` discovers big number of scrape targets (e.g. hundreds of thousands) and the majority of these targets (99%)
|
||||
are dropped during relabeling. Previously labels for all the dropped targets were displayed at `/api/v1/targets` page. Now only up to `-promscrape.maxDroppedTargets` such
|
||||
targets are displayed. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878 for details.
|
||||
* FEATURE: vmagent: reduce memory usage when scraping big number of targets with big number of temporary labels starting with `__`.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
|
||||
* FEATURE: vmagent: add `/ready` HTTP endpoint, which returns 200 OK status code when all the service discovery has been initialized.
|
||||
This may be useful during rolling upgrades. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/875
|
||||
|
||||
* BUGFIX: vmagent: eliminate data race when `-promscrape.streamParse` command-line flag is set. Previously this mode could result in scraped metrics with garbage labels.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825#issuecomment-723198247 for details.
|
||||
* BUGFIX: properly calculate `topk_*` and `bottomk_*` functions from [MetricsQL](https://victoriametrics.github.io/MetricsQL.html) for time series with gaps.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/883
|
||||
|
||||
|
||||
# [v1.45.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.45.0)
|
||||
|
||||
* FEATURE: allow setting `-retentionPeriod` smaller than one month. I.e. `-retentionPeriod=3d`, `-retentionPeriod=2w`, etc. is supported now.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173
|
||||
* FEATURE: optimize more cases according to https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization . Now the following cases are optimized too:
|
||||
* `rollup_func(foo{filters}[d]) op bar` -> `rollup_func(foo{filters}[d]) op bar{filters}`
|
||||
* `transform_func(foo{filters}) op bar` -> `transform_func(foo{filters}) op bar{filters}`
|
||||
* `num_or_scalar op foo{filters} op bar` -> `num_or_scalar op foo{filters} op bar{filters}`
|
||||
* FEATURE: improve time series search for queries with multiple label filters. I.e. `foo{label1="value", label2=~"regexp"}`.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781
|
||||
* FEATURE: vmagent: add `stream parse` mode. This mode allows reducing memory usage when individual scrape targets expose tens of millions of metrics.
|
||||
For example, during scraping Prometheus in [federation](https://prometheus.io/docs/prometheus/latest/federation/) mode.
|
||||
See `-promscrape.streamParse` command-line option and `stream_parse: true` config option for `scrape_config` section in `-promscrape.config`.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825 and [troubleshooting docs for vmagent](https://victoriametrics.github.io/vmagent.html#troubleshooting).
|
||||
* FEATURE: vmalert: add `-dryRun` command-line option for validating the provided config files without the need to start `vmalert` service.
|
||||
* FEATURE: accept optional third argument of string type at `topk_*` and `bottomk_*` functions. This is label name for additional time series to return with the sum of time series outside top/bottom K. See [MetricsQL docs](https://victoriametrics.github.io/MetricsQL.html) for more details.
|
||||
* FEATURE: vmagent: expose `/api/v1/targets` page according to [the corresponding Prometheus API](https://prometheus.io/docs/prometheus/latest/querying/api/#targets).
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/643
|
||||
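The new `-retentionPeriod` syntax can be sketched as a small parser. The `d` and `w` suffixes come from the examples above, and a bare number is read as months because the retention was previously expressed in months. The 31-day month length and the function name are assumptions made for illustration only.

```python
DAYS_PER_UNIT = {"d": 1, "w": 7}
DAYS_PER_MONTH = 31  # assumption: a month is rounded up to 31 days


def retention_days(value: str) -> float:
    """Convert a -retentionPeriod value such as `3d`, `2w` or `1` to days."""
    value = value.strip()
    unit = value[-1:]
    if unit in DAYS_PER_UNIT:
        return float(value[:-1]) * DAYS_PER_UNIT[unit]
    # No suffix: the value is a number of months.
    return float(value) * DAYS_PER_MONTH


three_days = retention_days("3d")
two_weeks = retention_days("2w")
```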
|
||||
* BUGFIX: vmagent: properly handle OpenStack endpoint ending with `v3.0` such as `https://ostack.example.com:5000/v3.0`
|
||||
in the same way as Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728#issuecomment-709914803
|
||||
* BUGFIX: drop trailing data points for time series with a single raw sample. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
|
||||
* BUGFIX: do not drop trailing data points for instant queries to `/api/v1/query`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
|
||||
* BUGFIX: vmbackup: fix panic when `-origin` isn't specified. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/856
|
||||
* BUGFIX: vmalert: skip automatically added labels on alerts restore. Label `alertgroup` was introduced in [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/611)
|
||||
and automatically added to generated time series. By mistake, this new label wasn't correctly purged on restore event and affected alert's ID uniqueness.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/870
|
||||
* BUGFIX: vmagent: fix panic at scrape error body formatting. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/864
|
||||
* BUGFIX: vmagent: add leading missing slash to metrics path like Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/835
|
||||
* BUGFIX: vmagent: drop packet if remote storage returns 4xx status code. This makes the behaviour consistent with Prometheus.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/873
|
||||
* BUGFIX: vmagent: properly handle 301 redirects. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/869
|
||||
|
||||
|
||||
# [v1.44.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.44.0)
|
||||
|
||||
* FEATURE: automatically add missing label filters to binary operands as described at https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization .
|
||||
This should improve performance for queries with missing label filters in binary operands. For example, the following query should work faster now, because it shouldn't
|
||||
fetch and discard time series for `node_filesystem_files_free` metric without matching labels for the left side of the expression:
|
||||
```
|
||||
node_filesystem_files{ host="$host", mountpoint="/" } - node_filesystem_files_free
|
||||
```
|
||||
* FEATURE: vmagent: add Docker Swarm service discovery (aka [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config)).
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/656
|
||||
* FEATURE: add ability to export data in CSV format. See [these docs](https://victoriametrics.github.io/#how-to-export-csv-data) for details.
|
||||
* FEATURE: vmagent: add `-promscrape.suppressDuplicateScrapeTargetErrors` command-line flag for suppressing `duplicate scrape target` errors.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651 and https://victoriametrics.github.io/vmagent.html#troubleshooting .
|
||||
* FEATURE: vmagent: show original labels before relabeling is applied on `duplicate scrape target` errors. This should simplify debugging for incorrect relabeling.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651
|
||||
* FEATURE: vmagent: `/targets` page now accepts optional `show_original_labels=1` query arg for displaying original labels for each target before relabeling is applied.
|
||||
This should simplify debugging for target relabeling configs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651
|
||||
* FEATURE: add `-finalMergeDelay` command-line flag for configuring the delay before final merge for per-month partitions.
|
||||
The final merge is started after no new data is ingested into per-month partition during `-finalMergeDelay`.
|
||||
* FEATURE: add `vm_rows_added_to_storage_total` metric, which shows the total number of rows added to storage since app start.
|
||||
The `sum(rate(vm_rows_added_to_storage_total))` can be smaller than `sum(rate(vm_rows_inserted_total))` if certain metrics are dropped
|
||||
due to [relabeling](https://victoriametrics.github.io/#relabeling). The `sum(rate(vm_rows_added_to_storage_total))` can be bigger
|
||||
than `sum(rate(vm_rows_inserted_total))` if [replication](https://victoriametrics.github.io/Cluster-VictoriaMetrics.html#replication-and-data-safety) is enabled.
|
||||
* FEATURE: keep metric name after applying [MetricsQL](https://victoriametrics.github.io/MetricsQL.html) functions, which don't change time series meaning.
|
||||
The list of such functions:
|
||||
* `keep_last_value`
|
||||
* `keep_next_value`
|
||||
* `interpolate`
|
||||
* `running_min`
|
||||
* `running_max`
|
||||
* `running_avg`
|
||||
* `range_min`
|
||||
* `range_max`
|
||||
* `range_avg`
|
||||
* `range_first`
|
||||
* `range_last`
|
||||
* `range_quantile`
|
||||
* `smooth_exponential`
|
||||
* `ceil`
|
||||
* `floor`
|
||||
* `round`
|
||||
* `clamp_min`
|
||||
* `clamp_max`
|
||||
* `max_over_time`
|
||||
* `min_over_time`
|
||||
* `avg_over_time`
|
||||
* `quantile_over_time`
|
||||
* `mode_over_time`
|
||||
* `geomean_over_time`
|
||||
* `holt_winters`
|
||||
* `predict_linear`
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
|
||||
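The label-filter propagation introduced at the top of this release can be modelled with plain dictionaries. The sketch below covers only equality filters and ignores `on()` / `ignoring()` modifiers, which a real optimizer has to respect; `push_down_filters` and `render` are hypothetical helper names.

```python
from typing import Dict, Tuple

Filters = Dict[str, str]


def push_down_filters(left: Filters, right: Filters) -> Tuple[Filters, Filters]:
    """Copy label filters missing on one operand from the other operand."""
    # Each side keeps its own value when both sides filter on the same label.
    return {**right, **left}, {**left, **right}


def render(name: str, filters: Filters) -> str:
    """Render a selector such as foo{a="b"} from a name and its filters."""
    inner = ", ".join(f'{k}="{v}"' for k, v in sorted(filters.items()))
    return f"{name}{{{inner}}}" if filters else name


left, right = push_down_filters({"host": "h1", "mountpoint": "/"}, {})
optimized = (
    render("node_filesystem_files", left)
    + " - "
    + render("node_filesystem_files_free", right)
)
```

After the rewrite both operands select only series with matching labels, so the right-hand side no longer fetches series that the subtraction would discard anyway.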
|
||||
* BUGFIX: properly handle stale time series after K8S deployment. Previously such time series could be double-counted.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
|
||||
* BUGFIX: return a single time series at max from `absent()` function like Prometheus does.
|
||||
* BUGFIX: vmalert: accept days, weeks and years in `for: ` part of config like Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817
|
||||
* BUGFIX: fix `mode_over_time(m[d])` calculations. Previously the function could return incorrect results.
|
||||
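Accepting days, weeks and years in `for:` means handling the larger units of Prometheus-style durations. A minimal parser sketch follows; it uses the conventional fixed unit lengths (a day is 24 hours, a week is 7 days, a year is 365 days) and is not the vmalert implementation.

```python
import re

UNIT_SECONDS = {
    "ms": 0.001,
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 24 * 3600,
    "w": 7 * 24 * 3600,
    "y": 365 * 24 * 3600,
}
# `ms` is listed before `m` and `s` so that it is matched as one unit.
_PART = re.compile(r"(\d+)(ms|s|m|h|d|w|y)")


def parse_duration(text: str) -> float:
    """Return the duration in seconds for values such as `2d` or `1h30m`."""
    if not text:
        raise ValueError("empty duration")
    pos, total = 0, 0.0
    while pos < len(text):
        match = _PART.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse duration {text!r}")
        total += int(match.group(1)) * UNIT_SECONDS[match.group(2)]
        pos = match.end()
    return total


two_days = parse_duration("2d")
```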
|
||||
|
||||
# [v1.43.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.43.0)
|
||||
|
||||
* FEATURE: reduce CPU usage for repeated queries over sliding time window when no new time series are added to the database.
|
||||
Typical use cases: repeated evaluation of alerting rules in [vmalert](https://victoriametrics.github.io/vmalert.html) or dashboard auto-refresh in Grafana.
|
||||
* FEATURE: vmagent: add OpenStack service discovery aka [openstack_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config).
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728 .
|
||||
* FEATURE: vmalert: make `-maxIdleConnections` configurable for datasource HTTP client. This option can be used for minimizing connection churn.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/795 .
|
||||
* FEATURE: add `-influx.maxLineSize` command-line flag for configuring the maximum size for a single Influx line during parsing.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/807
|
||||
|
||||
* BUGFIX: properly handle `inf` values during [background merge of LSM parts](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
|
||||
Previously `Inf` values could result in `NaN` values for adjacent samples in time series. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/805 .
|
||||
* BUGFIX: fill gaps on graphs for `range_*` and `running_*` functions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/806 .
|
||||
* BUGFIX: make a copy of label with new name during relabeling with `action: labelmap` in the same way as Prometheus does.
|
||||
Previously the original label name was replaced. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/812 .
|
||||
* BUGFIX: support parsing floating-point timestamps like Graphite Carbon does. Such timestamps are truncated to seconds.
|
||||
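The corrected `action: labelmap` behaviour (copy the value to the new label name and keep the original label) can be sketched as follows. The helper assumes Prometheus-style semantics, where the regex must match the whole label name and the replacement may reference capture groups as `$1` or `${1}`. The function itself is illustrative only.

```python
import re
from typing import Dict


def labelmap(labels: Dict[str, str], regex: str, replacement: str = "$1") -> Dict[str, str]:
    """Copy every label whose name matches `regex` to a renamed label."""
    pattern = re.compile(regex)
    # Translate `$1` / `${1}` into Python's `\g<1>` group references.
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    result = dict(labels)  # original labels are kept, not replaced
    for name, value in labels.items():
        match = pattern.fullmatch(name)
        if match:
            result[match.expand(template)] = value
    return result


mapped = labelmap(
    {"__meta_kubernetes_pod_label_app": "web", "job": "k8s"},
    r"__meta_kubernetes_pod_label_(.+)",
)
```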
|
||||
|
||||
# [v1.42.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.42.0)
|
||||
|
||||
* FEATURE: use all the available CPU cores when accepting data via a single TCP connection
|
||||
for [all the supported protocols](https://victoriametrics.github.io/#how-to-import-time-series-data).
|
||||
Previously data ingested via a single TCP connection could use only a single CPU core. This could limit data ingestion performance.
|
||||
The main benefit of this feature is that data can be imported at max speed via a single connection - there is no need to open multiple concurrent
|
||||
connections to VictoriaMetrics or [vmagent](https://victoriametrics.github.io/vmagent.html) in order to achieve the maximum data ingestion speed.
|
||||
* FEATURE: cluster: improve performance for data ingestion path from `vminsert` to `vmstorage` nodes. The maximum data ingestion performance
|
||||
for a single connection between `vminsert` and `vmstorage` node scales with the number of available CPU cores on `vmstorage` side.
|
||||
This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/791 .
|
||||
* FEATURE: add ability to export / import data in native format via `/api/v1/export/native` and `/api/v1/import/native`.
|
||||
This is the most optimized approach for data migration between VictoriaMetrics instances. Both single-node and cluster instances are supported.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/787#issuecomment-700632551 .
|
||||
* FEATURE: add `reduce_mem_usage` query option to `/api/v1/export` in order to reduce memory usage during data export / import.
|
||||
See [these docs](https://victoriametrics.github.io/#how-to-export-data-in-json-line-format) for details.
|
||||
* FEATURE: improve performance for `/api/v1/series` handler when it returns big number of time series.
|
||||
* FEATURE: add `vm_merge_need_free_disk_space` metric, which can be used for estimating the number of deferred background data merges due to the lack of free disk space.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686 .
|
||||
* FEATURE: add OpenBSD support. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/785 .
|
||||
|
||||
* BUGFIX: properly apply `-search.maxStalenessInterval` command-line flag value. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/784 .
|
||||
* BUGFIX: fix displaying data in Grafana tables. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/720 .
|
||||
* BUGFIX: do not adjust the number of detected CPU cores found at `/sys/devices/system/cpu/online`.
|
||||
The adjustment was increasing the resulting GOMAXPROCS by 1, which looked confusing to users.
|
||||
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685#issuecomment-698595309 .
|
||||
* BUGFIX: vmagent: do not show `-remoteWrite.url` in initial logs if `-remoteWrite.showURL` isn't set. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/773 .
|
||||
* BUGFIX: properly handle case when [/metrics/find](https://victoriametrics.github.io/#graphite-metrics-api-usage) finds both a leaf and a node for the given `query=prefix.*`.
|
||||
In this case only the node must be returned with stripped dot in the end of id as carbonapi does.
|
||||
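For the CPU detection fix above, the contents of `/sys/devices/system/cpu/online` are a comma-separated list of CPU ids and inclusive id ranges, for example `0-3` or `0,2-5`. The sketch below counts the CPUs in such a string without applying any extra adjustment; the function name is hypothetical and the code is not taken from VictoriaMetrics.

```python
def count_online_cpus(data: str) -> int:
    """Count CPUs listed in a string such as `0-3` or `0,2-5`."""
    total = 0
    for part in data.strip().split(","):
        if not part:
            continue
        low, sep, high = part.partition("-")
        # A range `a-b` is inclusive on both ends; a bare id counts as one CPU.
        total += int(high) - int(low) + 1 if sep else 1
    return total


four_cores = count_online_cpus("0-3\n")
```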
|
||||
|
||||
# Previous releases
|
||||
|
||||
See [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases).
|
||||
@@ -3,10 +3,42 @@
|
||||
Below are approved public case studies and talks from VictoriaMetrics users. Join our [community Slack channel](http://slack.victoriametrics.com/)
|
||||
and feel free to ask for references, reviews and additional case studies from real VictoriaMetrics users there.
|
||||
|
||||
See also [articles about VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/Articles).
|
||||
See also [articles about VictoriaMetrics from our users](https://victoriametrics.github.io/Articles.html#third-party-articles-and-slides).
|
||||
|
||||
Alphabetically sorted links to case studies:
|
||||
|
||||
* [adidas](#adidas)
|
||||
* [Adsterra](#adsterra)
|
||||
* [ARNES](#arnes)
|
||||
* [Brandwatch](#brandwatch)
|
||||
* [CERN](#cern)
|
||||
* [COLOPL](#colopl)
|
||||
* [Dreamteam](#dreamteam)
|
||||
* [Idealo.de](#idealode)
|
||||
* [MHI Vestas Offshore Wind](#mhi-vestas-offshore-wind)
|
||||
* [Synthesio](#synthesio)
|
||||
* [Wedos.com](#wedoscom)
|
||||
* [Wix.com](#wixcom)
|
||||
* [Zerodha](#zerodha)
|
||||
* [zhihu](#zhihu)
|
||||
|
||||
|
||||
## Adidas
|
||||
## zhihu
|
||||
|
||||
[zhihu](https://www.zhihu.com) is the largest Chinese question-and-answer website. We use VictoriaMetrics to store and use Graphite metrics, and we shared the [promate](https://github.com/zhihu/promate) solution in our [单机 20 亿指标,知乎 Graphite 极致优化!](https://qcon.infoq.cn/2020/shenzhen/presentation/2881)([slides](https://static001.geekbang.org/con/76/pdf/828698018/file/%E5%8D%95%E6%9C%BA%2020%20%E4%BA%BF%E6%8C%87%E6%A0%87%EF%BC%8C%E7%9F%A5%E4%B9%8E%20Graphite%20%E6%9E%81%E8%87%B4%E4%BC%98%E5%8C%96%EF%BC%81-%E7%86%8A%E8%B1%B9.pdf)) talk at [QCon 2020](https://qcon.infoq.cn/2020/shenzhen/).
|
||||
|
||||
Numbers:
|
||||
|
||||
- Active time series: ~2500 Million
|
||||
- Datapoints: ~20 Trillion
|
||||
- Ingestion rate: ~1800k/s
|
||||
- Disk usage: ~20 TiB
|
||||
- Index size: ~600 GiB
|
||||
- The average query rate is ~3k per second (mostly alert queries).
|
||||
- Query duration: median is ~40ms, 99th percentile is ~100ms.
|
||||
|
||||
|
||||
## adidas
|
||||
|
||||
See [slides](https://promcon.io/2019-munich/slides/remote-write-storage-wars.pdf) and [video](https://youtu.be/OsH6gPdxR4s)
|
||||
from [Remote Write Storage Wars](https://promcon.io/2019-munich/talks/remote-write-storage-wars/) talk at [PromCon 2019](https://promcon.io/2019-munich/).
|
||||
@@ -57,7 +89,7 @@ Thanos, Cortex and VictoriaMetrics were evaluated as a long-term storage for Pro
|
||||
* Blazing fast benchmarks for a single node setup.
|
||||
* Single binary mode. Easy to scale vertically, very little operational headache.
|
||||
* Considerable [improvements on creating Histograms](https://medium.com/@valyala/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350).
|
||||
* [MetricsQL](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/MetricsQL) gives us the ability to extend PromQL with more aggregation operators.
|
||||
* [MetricsQL](https://victoriametrics.github.io/MetricsQL.html) gives us the ability to extend PromQL with more aggregation operators.
|
||||
* API is compatible with Prometheus, almost all standard PromQL queries just work out of the box.
|
||||
* Handles storage well, with periodic compaction. Makes it easy to take snapshots.
|
||||
|
||||
@@ -69,17 +101,17 @@ See [Monitoring K8S with VictoriaMetrics](https://docs.google.com/presentation/d
|
||||
|
||||
[Wix.com](https://en.wikipedia.org/wiki/Wix.com) is the leading web development platform.
|
||||
|
||||
> We needed to redesign metric infrastructure from the ground up after the move to Kubernetes. A few approaches/designs have been tried before the one that works great has been chosen: Prometheus instance in every datacenter with 2 hours retention for local storage and remote write into [HA pair of single-node VictoriaMetrics instances](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#high-availability).
|
||||
> We needed to redesign metric infrastructure from the ground up after the move to Kubernetes. A few approaches/designs have been tried before the one that works great has been chosen: Prometheus instance in every datacenter with 2 hours retention for local storage and remote write into [HA pair of single-node VictoriaMetrics instances](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#high-availability).
|
||||
|
||||
Numbers:
|
||||
|
||||
* The number of active time series per VictoriaMetrics instance is 40M.
|
||||
* The total number of time series per VictoriaMetrics instance is 400M+.
|
||||
* The total number of time series per VictoriaMetrics instance is 5000M+.
|
||||
* Ingestion rate per VictoriaMetrics instance is 1M data points per second.
|
||||
* The total number of datapoints per VictoriaMetrics instance is 8 trillions.
|
||||
* The average time series churn rate is ~3M per day.
|
||||
* The total number of datapoints per VictoriaMetrics instance is 8.5 trillions.
|
||||
* The average time series churn rate is ~80M per day.
|
||||
* The average query rate is ~100 per second (mostly alert queries).
|
||||
* Query duration: median is ~70ms, 99th percentile is ~1.5sec.
|
||||
* Query duration: median is ~20ms, 99th percentile is ~1.5sec.
|
||||
* Retention: 3 months.
|
||||
|
||||
> Alternatives that we’ve played with before choosing VictoriaMetrics are: federated Prometheus, Cortex, IronDB and Thanos.
|
||||
@@ -92,14 +124,14 @@ Numbers:
|
||||
* Enough head room/scaling capacity for future growth, up to 100M active time series.
|
||||
* Ability to split DB replicas per workload. Alert queries go to one replica, user queries go to another (speed for users, effective cache).
|
||||
|
||||
> Optimizing for those points and our specific workload VictoriaMetrics proved to be the best option. As an icing on a cake we’ve got [PromQL extensions](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/MetricsQL) - `default 0` and `histogram` are my favorite ones, for example. What we specially like is having a lot of tsdb params easily available via config options, that makes tsdb easy to tune for specific use case. Also worth noting is a great community in [Slack channel](http://slack.victoriametrics.com/) and of course maintainer support.
|
||||
> Optimizing for those points and our specific workload VictoriaMetrics proved to be the best option. As an icing on a cake we’ve got [PromQL extensions](https://victoriametrics.github.io/MetricsQL.html) - `default 0` and `histogram` are my favorite ones, for example. What we specially like is having a lot of tsdb params easily available via config options, that makes tsdb easy to tune for specific use case. Also worth noting is a great community in [Slack channel](http://slack.victoriametrics.com/) and of course maintainer support.
|
||||
|
||||
Alex Ulstein, Head of Monitoring, Wix.com
|
||||
|
||||
|
||||
## Wedos.com
|
||||
|
||||
> [Wedos](https://www.wedos.com/) is the Biggest Czech Hosting. We have our own private data center, that holds only our servers and technologies. The second data center, where the servers will be cooled in an oil bath, is being built. We started using [cluster VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/README.md) to store Prometheus metrics from all our infrastructure after receiving positive references from our friends who successfully use VictoriaMetrics.
|
||||
> [Wedos](https://www.wedos.com/) is the Biggest Czech Hosting. We have our own private data center, that holds only our servers and technologies. The second data center, where the servers will be cooled in an oil bath, is being built. We started using [cluster VictoriaMetrics](https://victoriametrics.github.io/Cluster-VictoriaMetrics.html) to store Prometheus metrics from all our infrastructure after receiving positive references from our friends who successfully use VictoriaMetrics.
|
||||
|
||||
Numbers:
|
||||
|
||||
@@ -220,12 +252,12 @@ We end up with the following configuration:
|
||||
|
||||
Turns out that remote write protocol generates too much traffic and connections. So after 8 months we started to look for alternatives.
|
||||
|
||||
Around the same time VictoriaMetrics released [vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md).
|
||||
Around the same time VictoriaMetrics released [vmagent](https://victoriametrics.github.io/vmagent.html).
|
||||
We tried to scrape all the metrics via a single instance of vmagent. But that didn't work - vmagent wasn't able to catch up with writes
|
||||
into VictoriaMetrics. We tested different options and ended up with the following scheme:
|
||||
|
||||
- We removed Prometheus from our setup.
|
||||
- VictoriaMetrics [can scrape targets](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-scrape-prometheus-exporters-such-as-node-exporter) as well,
|
||||
- VictoriaMetrics [can scrape targets](https://victoriametrics.github.io/Single-server-VictoriaMetrics.html#how-to-scrape-prometheus-exporters-such-as-node-exporter) as well,
|
||||
so we removed vmagent. Now VictoriaMetrics scrapes all the metrics from 110 jobs and 5531 targets.
|
||||
- We use [Promxy](https://github.com/jacksontj/promxy) for alerting.
|
||||
|
||||
@@ -236,7 +268,7 @@ Such a scheme has the following benefits comparing to Prometheus:
|
||||
|
||||
Cons are the following:
|
||||
|
||||
- VictoriaMetrics didn't support replication (it [supports replication now](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/README.md#replication-and-data-safety)) - we run extra instance of VictoriaMetrics and Promxy in front of VictoriaMetrics pair for high availability.
|
||||
- VictoriaMetrics didn't support replication (it [supports replication now](https://victoriametrics.github.io/Cluster-VictoriaMetrics.html#replication-and-data-safety)) - we run extra instance of VictoriaMetrics and Promxy in front of VictoriaMetrics pair for high availability.
|
||||
- VictoriaMetrics stores 1 extra month for defined retention (if retention is set to N months, then VM stores N+1 months of data), but this is still better than other solutions.
|
||||
|
||||
Some numbers from our single-node VictoriaMetrics setup:
|
||||
@@ -304,3 +336,21 @@ Grafana has a LB infront, so if one DC has problems, we can still view all metri
|
||||
|
||||
We are still in the process of migration, but we are really happy with the whole stack. It has proven as an essential piece
|
||||
for insight into our services during COVID-19 and has enabled us to provide better service and spot problems faster.
|
||||
|
||||
|
||||
## Idealo.de
|
||||
|
||||
[idealo.de](https://www.idealo.de/) is the leading price comparison website in Germany. We use Prometheus for metrics on our container platform.
|
||||
When we introduced Prometheus at idealo we started with m3db as a longterm storage. In our setup m3db was quite unstable and consumed a lot of resources.
|
||||
|
||||
VictoriaMetrics runs very stable for us and uses only a fraction of the resources. Although we also increased our retention time from 1 month to 13 months.
|
||||
|
||||
Numbers:
|
||||
|
||||
- The number of active time series per VictoriaMetrics instance is 21M.
|
||||
- Total ingestion rate 120k metrics per second.
|
||||
- The total number of datapoints 3.1 trillion.
|
||||
- The average time series churn rate is ~9M per day.
|
||||
- The average query rate is ~20 per second. Response time for 99th quantile is 120ms.
|
||||
- Retention: 13 months.
|
||||
- Size of all datapoints: 3.5 TB
|
||||
|
||||
Some files were not shown because too many files have changed in this diff.