Compare commits

..

525 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
d396c265a6 CHANGELOG.md: cut v1.45.0 2020-11-02 02:43:12 +02:00
Aliaksandr Valialkin
31918f60b2 vendor: make vendor-update 2020-11-02 02:41:02 +02:00
Aliaksandr Valialkin
d62ec1cb01 CHANGELOG.md: add a link to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825 2020-11-02 02:28:55 +02:00
Aliaksandr Valialkin
5e75c389e6 app/vmselect/promql: allow dropping trailing sample only for default_rollup function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/850
2020-11-02 02:10:59 +02:00
Aliaksandr Valialkin
c0f3be824d lib/promscrape: properly handle response body after 301 redirect
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/869
2020-11-02 01:09:52 +02:00
Aliaksandr Valialkin
ca566dce39 CHANGELOG.md: mention about packets drop in vmagent like Prometheus does 2020-11-02 00:46:49 +02:00
Aliaksandr Valialkin
0b35da159c app/vmagent/remotewrite: drop packets if remote storage returns 4xx status code
This makes consistent the behaviour with Prometheus.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/873
2020-11-02 00:45:09 +02:00
Aliaksandr Valialkin
cb71af216a app/vmselect/promql: go fmt 2020-11-02 00:15:29 +02:00
Aliaksandr Valialkin
daacbc7e34 app/vmselect/promql: do not drop trailing datapoints for instant queries
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
2020-11-02 00:12:37 +02:00
S.F
f477cbe861 OpenBSD packaging files (#853) 2020-11-01 23:39:25 +02:00
Roman Khavronenko
50d44d5932 dashboard: add Storage full ETA panel (#858)
* dashboard: add `Storage full ETA` panel

The new panel suppose to help to estimate the time needed to run out of free
disk space.
Thx to @belm0 @hekmon

* disable legend for `Storage full ETA` panel
2020-11-01 23:37:31 +02:00
Aliaksandr Valialkin
68d004bc05 CHANGELOG.md: mention about recently added bugfixes 2020-11-01 23:35:06 +02:00
Aliaksandr Valialkin
e277c3d07b lib/promscrape: add stream parse mode for efficient scraping of targets that expose millions of metrics 2020-11-01 23:35:06 +02:00
Aliaksandr Valialkin
29e4e7f422 lib/storage: drop more samples outside the given retention during background merge
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17
2020-11-01 23:35:06 +02:00
Aliaksandr Valialkin
b7638f04a7 app/vmagent: expose /api/v1/targets page according to https://prometheus.io/docs/prometheus/latest/querying/api/#targets
This page is exposed by vmagent and by a single-node VictoriaMetrics

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/643
2020-11-01 23:35:06 +02:00
Aliaksandr Valialkin
c539494b36 app/vmselect/promql: allow passing optional third argument to topk_* and bottomk_* functions in order to obtain sum of time series outside top/bottom K 2020-11-01 23:35:06 +02:00
Aliaksandr Valialkin
d12c4914f0 lib/storage: properly handle the case when key="__name__" is passed to MetricName.AddTag* 2020-11-01 23:35:06 +02:00
Aliaksandr Valialkin
64e2d66014 lib/storage: code cleanup after 5bfd4e6218 2020-11-01 23:35:06 +02:00
Sergey Klyuykov
4108e85efd Fix InfluxDB support on docker-compose deployment. (#872)
* Added UDP protocol support for Graphite/Influx in docker-compose deployment.

This is necessary for Proxmox VE External Metric Server support.
https://pve.proxmox.com/wiki/External_Metric_Server

* Added `influxListenAddr` in docker-compose deployment.

This is necessary for Proxmox VE External Metric Server support.
https://pve.proxmox.com/wiki/External_Metric_Server

Additionally created Grafana Dashboard for monitoring Proxmox VE hosts.
https://grafana.com/grafana/dashboards/13307
2020-11-01 23:34:39 +02:00
Roman Khavronenko
f0bdc5716e vmalert: skip automatically added labels on alerts restore (#871)
Label `alertgroup` was introduced in #611 and automatically added to generated
time series. By mistake, this new label wasn't correctly purged on restore event
and affected alert's ID uniqueness. This commit removes `alertgroup` label
in restore function.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/870
2020-10-30 08:18:20 +00:00
Nikolay
67059caa12 fixes panic at scrape error body formating, (#868)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/864
regression after body reuse improvements
2020-10-29 17:17:52 +03:00
Nikolay
de3fe22815 adds leading forward slash check for scrapeURL path (#855)
* fixes in-consistency with prometheus behaviour for scrape targets url path.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/835
2020-10-29 08:39:42 +03:00
Sergey Kulukov
055f152246 Added UDP protocol support for Graphite/Influx in docker-compose deployment.
This is necessary for Proxmox VE External Metric Server support.
https://pve.proxmox.com/wiki/External_Metric_Server
2020-10-28 20:26:55 +02:00
Roman Khavronenko
20311f6065 dashboard: clarify the purpose of Concurrent flushes on disk panel (#849)
Current description led to confusion at https://victoriametrics.slack.com/archives/CGZF1H6L9/p1603270014273800
2020-10-28 18:10:46 +00:00
kreedom
a51a7b2a20 vmbackup fix panic when no origin fs given (#859)
* use fsnil when no origin fs
2020-10-28 20:09:10 +02:00
Aliaksandr Valialkin
bca468bb55 CHANGELOG.md: mention about recently added changes 2020-10-20 14:32:14 +03:00
Aliaksandr Valialkin
0729cc36b2 lib/memory: do not print trailing zeroes in logs for -memory.allowedPercent command-line flag 2020-10-20 14:32:07 +03:00
Aliaksandr Valialkin
5bfd4e6218 app/vmstorage: support for -retentionPeriod smaller than one month
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17
2020-10-20 14:31:44 +03:00
Aliaksandr Valialkin
920300643a docs/vmrestore.md: make docs-sync 2020-10-20 10:48:19 +03:00
kreedom
ef77120170 vmalert - add dryRun (#842)
vmalert: add `dryRun` flag for rules validation without running the service
2020-10-20 08:15:21 +01:00
Seva Poliakov
b3f3c078e5 Fix typo in vnrestore readme 2020-10-18 15:41:39 +03:00
faceair
84e3881c0b disable response compression on websocket (#841) 2020-10-17 13:32:34 +03:00
Aliaksandr Valialkin
2ed069c3bc docs/MetricsQL.md: small clarifications 2020-10-17 12:01:43 +03:00
Aliaksandr Valialkin
28353e48ca app/vmselect/promql: an attempt to improve heuristics for dropping trailing data points in time series
Now trailing data points are additionally dropped for time series with a single raw sample

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
2020-10-17 10:44:34 +03:00
Aliaksandr Valialkin
01987f8c77 lib/storage: small code adjustements after d2960a20e0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781
2020-10-17 01:16:54 +03:00
faceair
d2960a20e0 evaluate the execution cost of all tag filters (#824)
* evaluate the execution cost of all tag filters

* fix suffixes typo
2020-10-17 00:46:55 +03:00
Aliaksandr Valialkin
d4f12e0fbb CHANGELOG.md: mention about improved openstack endpoint handling
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728
2020-10-16 23:06:33 +03:00
Nikolay Khramchikhin
e6ab69dd88 fixes openstack api endpoint with suffix trim adds openstack (#840)
api v2.0 check
2020-10-16 21:20:57 +03:00
Aliaksandr Valialkin
ed5f05024b deployment/docker: update Go builder from Go1.15.2 to Go1.15.3
This should fix potential issues related to Go runtime - see https://github.com/golang/go/issues?q=milestone%3AGo1.15.3+label%3ACherryPickApproved
2020-10-16 15:08:52 +03:00
Aliaksandr Valialkin
43aa737e23 vendor: make vendor-update 2020-10-16 15:06:27 +03:00
Aliaksandr Valialkin
46dccc1088 CHANGELOG.md: describe added optimization cases from 96cdfcba50 2020-10-16 12:59:42 +03:00
Aliaksandr Valialkin
96cdfcba50 vendor: update github.com/VictoriaMetrics/metricsql from v0.7.1 to v0.7.2
The new release of github.com/VictoriaMetrics/metricsql adds more optimizations for `foo{filters1} op bar{filters2}`:

* rollup_func(foo[d]) op bar{filters}
* transform_func(foo) op bar{filters}
* num_or_scalar op bar op baz{filters}
2020-10-16 12:53:36 +03:00
Aliaksandr Valialkin
09d60d64a9 docs: add a link to https://smarketshq.com/monitoring-kubernetes-clusters-41a4b24c19e3 article about VictoriaMetrics 2020-10-16 09:07:41 +03:00
Aliaksandr Valialkin
c37e5de66f docs/Single-server-VictoriaMetrics.md: update docs 2020-10-14 13:26:31 +03:00
Aliaksandr Valialkin
3b847d32d9 docs/CaseStudies.md: actualize numbers for Wix.com 2020-10-14 13:07:33 +03:00
Aliaksandr Valialkin
590d8d537f docs/vmalert.md: make docs-sync 2020-10-13 18:34:32 +03:00
Roman Khavronenko
bc42b5598f vmalert: update docs to highlight the state restore requirements; (#833)
Address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/830
2020-10-13 18:32:43 +03:00
Aliaksandr Valialkin
94978af9bc CHANGELOG.md: cut v1.44.0 release 2020-10-13 16:59:33 +03:00
Aliaksandr Valialkin
8e20bc7b53 docs/Cluster-VictoriaMetrics.md: clarify RAM requirements for vmstorage nodes 2020-10-13 16:47:51 +03:00
Aliaksandr Valialkin
a2b9476897 app/vmselect/promql: return a single time series at max from absent() function like Prometheus does 2020-10-13 15:56:04 +03:00
Aliaksandr Valialkin
9aa3b65766 app/vmselect/promql: improve time series staleness detection
This should prevent from double counting for time series at the time when it changes label.
The most common case is in K8S, which changes pod uid label with each new deployment.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
2020-10-13 12:19:57 +03:00
Aliaksandr Valialkin
d8af290947 app/vmselect/promql: fix mode_over_time calculations
Previously `mode_over_time` could return garbage due to improper shuffling of input data points.
2020-10-13 11:58:25 +03:00
Aliaksandr Valialkin
1e27420243 app/vmselect/prometheus: fix golangci-lint warning 2020-10-13 09:36:11 +03:00
Aliaksandr Valialkin
4f16a964e3 app/vmselect: add ability to export data in CSV format via /api/v1/export/csv 2020-10-12 20:08:17 +03:00
Aliaksandr Valialkin
4cc6574cea CHANGELOG.md: mention about added Docker Swarm service discovery 2020-10-12 16:17:58 +03:00
Aliaksandr Valialkin
63c4999e06 lib/promscrape: code prettifying after 9bd9f67718 2020-10-12 16:12:36 +03:00
Nikolay Khramchikhin
9bd9f67718 Adds dockerswarm sd (#818)
* adds dockerswarm service discovery

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/656

 Following roles supported: services, tasks and nodes.
 Basic, token and tls auth supported.
 Added tests for labels generation.

* added unix socket support to discovery utils

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-10-12 13:38:21 +03:00
Aliaksandr Valialkin
7f983d461a docs/MetricsQL.md: mention that VictoriaMetrics keeps metric names after applying functions which dont change time series meaning 2020-10-12 13:25:25 +03:00
Aliaksandr Valialkin
3bba6a2199 CHANGELOG.md: mention that VictoriaMetrics keeps metric names when applying functions which don't change time series meaning 2020-10-12 12:55:09 +03:00
Aliaksandr Valialkin
762c967855 app/vmselect/promql: keep metric name after applying more functions, which dont change time series meaning
Functions are:

* keep_last_value
* keep_next_value
* interpolate
* running_min
* running_max
* running_avg
* range_min
* range_max
* range_avg
* range_first
* range_last
* range_quantile
* smooth_exponential

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
2020-10-12 11:47:06 +03:00
Aliaksandr Valialkin
45f7cdc532 Revert "app/vmselect/promql: remove metric name after applying ceil, floor and round functions in order to be more consistent with Prometheus"
This reverts commit ac45082216.

Reason for revert: the previous behavior for VictoriaMetrics is easier to understand and use by users -
functions, which don't change the meaning of the time series shouldn't drop metric name.

Now the following functions do not drop metric names:

* ceil
* floor
* round

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
2020-10-12 11:40:34 +03:00
Aliaksandr Valialkin
a94825b169 Revert "app/vmselect/promql: remove metric name after applying clamp_min and clamp_max functions in order to be consistent with Prometheus"
This reverts commit bb61a4769b.

Reason for revert: the previous behavior for VictoriaMetrics is easier to understand and use by users -
functions, which don't change the meaning of the time series shouldn't drop metric name.

Now the following functions do not drop metric name:

* clamp_min
* clamp_max

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
2020-10-12 11:38:27 +03:00
Aliaksandr Valialkin
f7d28bddbf Revert "app/vmselect/promql: remove metric name from results of certain rollup functions in order to be consistent with Prometheus"
This reverts commit e5202a4eae.

Reason for revert: the previous behavior for VictoriaMetrics is easier to understand and use by users -
functions, which don't change the meaning of the time series shouldn't drop metric name.

Now the following functions do not drop metric name:

* max_over_time
* min_over_time
* avg_over_time
* quantile_over_time
* geomean_over_time
* mode_over_time
* holt_winters
* predict_linear

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
2020-10-12 11:35:18 +03:00
Aliaksandr Valialkin
2749a3c827 docs/Single-server-VictoriaMetrics.md: add missing whitespace 2020-10-09 20:56:26 +03:00
Aliaksandr Valialkin
b449607181 lib/backup: add MustStop() method for all remote filesystems 2020-10-09 15:32:19 +03:00
Aliaksandr Valialkin
cf5f2874cd lib/backup/fslocal: add FS.MustStop() method for stopping bandwidth limiter 2020-10-09 15:12:03 +03:00
Aliaksandr Valialkin
272d6976b3 CHANGELOG.md: update with recent changes 2020-10-09 14:22:05 +03:00
Aliaksandr Valialkin
68f0e00761 app/vmstorage: add vm_rows_added_to_storage_total metric, which shows the total number of rows added to storage since app start 2020-10-09 13:35:48 +03:00
Aliaksandr Valialkin
84227ea2fc app/{vminsert,vmagent}: take into account all the inserted rows before relabeling in vm_rows_inserted_total and vmagent_rows_inserted_total metrics 2020-10-09 13:29:51 +03:00
Aliaksandr Valialkin
f4e8687c88 app/vmalert: accept days, weeks and years in for: part of config like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817
2020-10-08 20:13:15 +03:00
Aliaksandr Valialkin
561a7619a5 lib/promscrape: fix tests after 71ea4935de 2020-10-08 19:32:36 +03:00
Aliaksandr Valialkin
6105d61d11 docs/vmagent.md: clarify -promscrape.suppressDuplicateScrapeTargetErrors command-line flag usage 2020-10-08 19:24:31 +03:00
Aliaksandr Valialkin
12d2cf3a7a CHANGELOG.md: mention features from 71ea4935de 2020-10-08 19:13:54 +03:00
Aliaksandr Valialkin
71ea4935de lib/promscrape: add -promscrape.suppressDuplicateScrapeTargetErrors command-line flag in order to suppress duplicate scrape target errors
Show also original labels for duplicate targets in error message in order to simplify debugging the issue.

Now `/targets` endpoint accepts optional `show_original_labels=1` query arg, which shows original labels for each target.
This may simplify debugging for target relabeling.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651
2020-10-08 18:58:30 +03:00
Aliaksandr Valialkin
9b0a5c1028 lib/backup/actions: improve logging to be more clear to humans 2020-10-08 14:23:07 +03:00
Aliaksandr Valialkin
d423d73251 app/vmalert: do not pring description for all the flags on config errors
The description is too big to consume by human and it just distracts humans.
2020-10-08 13:35:57 +03:00
Aliaksandr Valialkin
d8546e972a vendor: make vendor-update 2020-10-08 11:52:01 +03:00
Aliaksandr Valialkin
c9fb217e4e vendor: update github.com/VictoriaMetrics/metricsql from v0.7.0 to v0.7.1 2020-10-08 11:46:51 +03:00
Aliaksandr Valialkin
bec85d5135 CHANGELOG.md: mentioned about the added optimization that adds missing filters to binary operands 2020-10-07 21:23:02 +03:00
Aliaksandr Valialkin
e9f2e2cbc9 app/vmselect/promql: add missing label filters to binary operands before query execution
This implements the optimization described at https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization

See also https://github.com/cortexproject/cortex/issues/3253
2020-10-07 21:15:09 +03:00
Aliaksandr Valialkin
5ef71974fe CHANGELOG.md: mention about -finalMergeDelay comand-line flag 2020-10-07 18:52:41 +03:00
Dmitry Shihovtsev
92e5d89fc9 Fix typos in the vmalert datasource (#814)
* Fix typos in the vmalert datasource

* Fix typo in the vmalert datasource test
2020-10-07 17:59:50 +03:00
Artem Navoiev
8e6eb2cd6b update go action 2020-10-07 17:48:42 +03:00
Aliaksandr Valialkin
af90b3121c app/vmstorage: add -finalMergeDelay command-line flag for configuring the delay before final merge for per-month partitions after no new data is ingested to it 2020-10-07 17:35:44 +03:00
Aliaksandr Valialkin
e9d99021b0 docs/CaseStudies.md: actualize Wix numbers 2020-10-06 16:09:35 +03:00
Aliaksandr Valialkin
5aa269def6 CHANGELOG.md: add missing link to an issue about OpenStack service discovery - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728 2020-10-06 15:37:36 +03:00
Aliaksandr Valialkin
d16dbfd639 app/vmagent: add a link to https://victoriametrics.github.io/vmagent.html from main page 2020-10-06 15:29:49 +03:00
Aliaksandr Valialkin
cfd720e772 app/victoria-metrics: add a link to https://victoriametrics.github.io/ from main page 2020-10-06 15:29:49 +03:00
Aliaksandr Valialkin
e10c484a8e docs/Articles.md: add https://medium.com/@VictoriaMetrics/anomaly-detection-in-victoriametrics-9528538786a7 2020-10-06 15:29:49 +03:00
Aliaksandr Valialkin
2a6fa53957 CHANGELOG.md: cut v1.43.0 release 2020-10-06 14:28:50 +03:00
Aliaksandr Valialkin
5a8553bfd2 CHANGELOG.md: add missing entries for upcoming release 2020-10-06 12:04:38 +03:00
Aliaksandr Valialkin
e19d400230 lib/protoparser/graphite: support parsing floating-point timestamp like Graphite does
Such timestamps are rounded to seconds like Carbon does.
See b0ba62a62d/lib/carbon/protocols.py (L197)
2020-10-06 11:38:29 +03:00
Aliaksandr Valialkin
90aa2a8ffd lib/promscrape/discovery/openstack: show expiration time for refreshed OpenStack token in seconds - this is easier to interpret by human 2020-10-06 11:34:09 +03:00
Aliaksandr Valialkin
cc08648699 vendor: make vendor-update 2020-10-05 23:21:41 +03:00
Aliaksandr Valialkin
129b07113e .github/workflows: switch Go version from v1.14 to v1.15 2020-10-05 22:00:51 +03:00
Aliaksandr Valialkin
aba899c298 lib/promscrape/discovery/openstack: code prettifying after cbe3cf683b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728
2020-10-05 18:11:55 +03:00
Aliaksandr Valialkin
991fad7855 docs: make docs-sync after cbe3cf683b 2020-10-05 16:47:57 +03:00
Nikolay Khramchikhin
cbe3cf683b Adds openstack sd (#811)
* adds openstack service discovery

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728

 implemented hypervisors and instance discovery with openstack v3 api.
 Added tests for labeling and data parsing.
 Added token refresh.

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-10-05 16:45:33 +03:00
Aliaksandr Valialkin
f42194d817 lib/promrelabel: make a copy of label with new name for action: labelmap in the same way as Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/812
2020-10-05 16:19:19 +03:00
Aliaksandr Valialkin
bbeac0ba46 lib/protoparser/influx: add -influx.maxLineSize command-line flag for configuring the maximum size for a single Influx line during parsing
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/807
2020-10-05 15:19:05 +03:00
Aliaksandr Valialkin
47db9bb24a lib/decimal: add tests for negative values passed to maxUpExponent 2020-10-05 14:56:45 +03:00
Aliaksandr Valialkin
bc7d67cee2 lib/decimal: properly calibrate scale for blocks with Inf values
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/805
2020-10-05 14:52:44 +03:00
Aliaksandr Valialkin
59c26feefa app/vmselect/promql: fill gaps on graphs for range_* and running_* functions
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/806
2020-10-02 13:59:45 +03:00
Aliaksandr Valialkin
764dc2499f lib/storage: code cleanup after 10f2eedee0
Remove the code that uses metricIDs caches for the current and the previous hour during metricIDs search,
since this code became unused after implementing per-day inverted index almost a year ago.

While at it, fix a bug, which could prevent from finding time series with names containing dots (aka Graphite-like names
such as `foo.bar.baz`).
2020-10-01 19:06:23 +03:00
Aliaksandr Valialkin
10f2eedee0 lib/storage: imrpove cache effectiveness for time series ids matching the given filters
Previously the maximum cache lifetime has been limited by 10 seconds. Now it is extended up to a day.
This should reduce CPU usage in the following cases:

* when querying recently added data with small churn rate for time series
* when querying historical data
2020-10-01 14:38:25 +03:00
Aliaksandr Valialkin
d25dd7fdb6 docs: make docs-sync 2020-09-30 09:50:29 +03:00
Roman Khavronenko
daa2d1c065 vmalert: make maxIdleConnections configurable for datasource HTTP client (#797)
Address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/795
2020-09-30 09:49:45 +03:00
Aliaksandr Valialkin
a44e0c6153 vendor: make vendor-update 2020-09-30 08:59:20 +03:00
Aliaksandr Valialkin
a897cf2ec3 docs/Release-Guide.md: mention that CHANGELOG.md must be updated before release 2020-09-30 08:53:17 +03:00
Aliaksandr Valialkin
58465bb29b CHANGELOG.md: release v1.42.0 2020-09-30 08:45:31 +03:00
Aliaksandr Valialkin
e59de98384 CHANGELOG.md: initial commit
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/788
2020-09-30 00:12:32 +03:00
Aliaksandr Valialkin
bec9b31b81 lib/storage: allow set values higher than 1 for vm_merge_need_free_disk_space if there are multiple partitions with deferred merges due to disk space shortage 2020-09-29 22:51:43 +03:00
Aliaksandr Valialkin
44bcda81ab app/vmstorage: rename vm_{big|small}_merge_need_free_disk_space to vm_merge_need_free_disk_space
This simplifies alerting.
2020-09-29 22:44:19 +03:00
Aliaksandr Valialkin
a9db81c4ab app/vmstorage: add metrics for determining whether background merges need additional disk space to complete
These metrics are:

* vm_small_merge_need_free_disk_space
* vm_big_merge_need_free_disk_space

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-29 21:48:33 +03:00
Aliaksandr Valialkin
dbf9402329 docs/Single-server-VictoriaMetrics.md: typo fix 2020-09-29 20:29:46 +03:00
Aliaksandr Valialkin
1137bdec66 docs/Single-server-VictoriaMetrics.md: typo fix: compations -> compactions 2020-09-29 20:27:05 +03:00
Aliaksandr Valialkin
127537d631 app/vmagent/remotewrite: do not show -remoteWrite.url in logs if -remoteWrite.showURL isn't set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/773
2020-09-29 19:49:12 +03:00
Aliaksandr Valialkin
f7636b0342 app/vmselect/graphite: do not substitute path and path. with path.. in /metrics/find/?format=completer output 2020-09-29 18:03:26 +03:00
Aliaksandr Valialkin
76b244cfcf lib/cgroup: do not adjust the number of detected CPU cores via /sys/devices/system/cpu/online
The adjustement increases the resulting GOMAXPROC by 1, which looks confusing to users
as outlined at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685#issuecomment-698595309
2020-09-29 13:55:26 +03:00
Aliaksandr Valialkin
7dc67cd883 docs/{vmbackup,vmrestore}: formatting fixes 2020-09-29 13:19:07 +03:00
Aliaksandr Valialkin
efdefbc1cb docs/vmbackup.md: make docs about minio config more prominent 2020-09-29 13:16:04 +03:00
Aliaksandr Valialkin
1659135752 lib/storage: fix tests for 32-bit arches such as GOARCH=386 and GOARCH=arm 2020-09-29 13:10:22 +03:00
Aliaksandr Valialkin
9945b8c98d docs: improve readability a bit 2020-09-29 13:03:38 +03:00
Nikolay Khramchikhin
1e679f3e0d update vmbackup/vmrestore README usage (#794)
* update vmbackup/vmrestore README usage

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/381

with minio and configuration file examples.

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* added backup/restore docs changes

* added example for relabelConfig flag

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2020-09-29 12:53:10 +03:00
Aliaksandr Valialkin
38789e4aa0 lib/storage: fix 32-bit builds for GOARH=386 or GOARCH=arm 2020-09-29 12:40:35 +03:00
Aliaksandr Valialkin
19c0b6f3ef lib/protoparser/prometheus: sort rows before comparing them in TestParseStream, since the order for callback calls is non-deterministic 2020-09-29 12:30:04 +03:00
Aliaksandr Valialkin
7cde336b33 lib/protoparser/prometheus: fix TestParseStream after 124f78857b 2020-09-29 12:11:17 +03:00
Aliaksandr Valialkin
96ee276e6e app/vmselect/prometheus: check for errors returned from bufferedwriter.Write
This makes `make errcheck` happy
2020-09-29 11:37:01 +03:00
Aliaksandr Valialkin
6fdfc67620 app/vmselect/graphite: properly handle case when /metrics/find finds both leaf and node for the given query=prefix.*
In this case only node must be returned with stripped dot in the end of id as carbonapi does
2020-09-29 11:01:59 +03:00
Aliaksandr Valialkin
165c9c6371 .github/workflows: verify builds for vmagent, vmalert, vmbackup and vmrestore 2020-09-29 00:49:20 +03:00
Aliaksandr Valialkin
41f24cdb64 .github/workflows: verify that VictoriaMetrics can be built for GOOS=openbsd 2020-09-29 00:44:44 +03:00
Aliaksandr Valialkin
7673839228 lib/{fs,filestream}: small consistency-related updates after cc90a548b1 2020-09-29 00:42:43 +03:00
Nikolay Khramchikhin
cc90a548b1 added openbsd implementations (#790)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/785

removed fadvise for openbsd, added freespace implemenation for openbsd
2020-09-29 00:29:04 +03:00
Aliaksandr Valialkin
8d5df13c7c vendor: make vendor-update 2020-09-28 21:59:58 +03:00
Aliaksandr Valialkin
7500146321 lib/protoparser: avoid copying of buffer read from the network to unmarshal buffer 2020-09-28 17:19:16 +03:00
Aliaksandr Valialkin
124f78857b app/{vminsert,vmagent}: improve data ingestion speed over a single connection
Process data obtianed from a single connection on all the available CPU cores.
2020-09-28 04:13:08 +03:00
Aliaksandr Valialkin
978c6b4ba9 docs/Cluster-VictoriaMetrics.md: sync with cluster branch 2020-09-28 02:07:55 +03:00
Aliaksandr Valialkin
5cdad60a6f lib/protoparser: use 64KB read buffer instead of default 4KB buffer provided by net/http.Server
This should reduce syscall overhead when reading big amounts of data
2020-09-28 02:07:10 +03:00
Aliaksandr Valialkin
1b3efccb24 app/vmselect: stop /api/v1/export/* execution if client disconnects 2020-09-27 23:53:13 +03:00
Aliaksandr Valialkin
95688cbfc5 all: add native format for data export and import
The data can be exported via [/api/v1/export/native](https://victoriametrics.github.io/#how-to-export-data-in-native-format) handler
and imported via [/api/v1/import/native](https://victoriametrics.github.io/#how-to-import-data-in-native-format) handler.
2020-09-27 19:54:07 +03:00
Aliaksandr Valialkin
b4bf722d8f lib/protoparser: use all the available CPU cores for processing ingested data from a single /api/v1/import stream
Previously a single data ingestion stream to /api/v1/import could load only a single CPU core.
2020-09-26 04:21:32 +03:00
Aliaksandr Valialkin
c00627c103 app/vminsert: code prettifying 2020-09-26 04:13:18 +03:00
Aliaksandr Valialkin
b6a976b98d app/vmagent: reduce memory usage when importing data via /api/v1/import
Previously vmagent could use big amounts of RAM when each ingested JSON line
contained many samples.
2020-09-26 04:10:24 +03:00
Aliaksandr Valialkin
82973f8ae7 Revert "lib/storage: remove unused fetchData arg from BlockRef.MustReadBlock"
This reverts commit bab6a15ae0.

Reason for revert: the `fetchData` arg is used in cluster branch.
Leaving this arg in master branch makes smaller the diff with cluster branch.
2020-09-24 22:44:23 +03:00
Aliaksandr Valialkin
bab6a15ae0 lib/storage: remove unused fetchData arg from BlockRef.MustReadBlock
This arg became unused after 23bdc1f107
2020-09-24 20:48:40 +03:00
Aliaksandr Valialkin
23bdc1f107 app/vmselect/netstorage: do not spend CPU time on unpacking empty blocks during /api/v1/series calls 2020-09-24 20:18:10 +03:00
Aliaksandr Valialkin
24ca30bf66 lib/storage: correctly use maxBlockSize in various checks
Previously `maxBlockSize` has been multiplied by 8 in certain checks. This is unnecessary.
2020-09-24 18:12:56 +03:00
Aliaksandr Valialkin
c584aece38 app/vmselect/promql: properly limit implicitly set rollup window to -search.maxStalenessInterval
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/784
2020-09-23 23:23:59 +03:00
Aliaksandr Valialkin
2985077c35 all: consistently use "%w" formatting in fmt.Errorf for wrapped errors 2020-09-23 22:46:34 +03:00
Aliaksandr Valialkin
30c7269814 vendor: make vendor-update 2020-09-23 14:23:39 +03:00
Aliaksandr Valialkin
27500d7d4c app/vmselect/prometheus: code cleanup after 3ba507000c 2020-09-23 13:04:17 +03:00
Aliaksandr Valialkin
3ba507000c app/vmselect/prometheus: return timestamps from /api/v1/query, which match the time query arg
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/720
2020-09-23 12:58:48 +03:00
Aliaksandr Valialkin
c5ef0e6327 lib/persistentqueue: protect from multiple concurrent opening for the same persistent queue 2020-09-23 02:17:47 +03:00
Aliaksandr Valialkin
bed25e3c24 app/vmselect/netstorage: properly pre-allocate space for sbs 2020-09-22 23:49:55 +03:00
Aliaksandr Valialkin
5c42965853 lib/cgroup: attempt to obtain available CPU cores via /sys/devices/system/cpu/online
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685#issuecomment-674423728
2020-09-22 23:27:19 +03:00
Aliaksandr Valialkin
09b0f7c202 app/vmselect/netstorage: release search resources on timeout errors
Previously these resources weren't released, which could lead to resource leaks.
2020-09-22 22:57:38 +03:00
Aliaksandr Valialkin
36eb5427eb vendor: make vendor-update 2020-09-22 17:07:37 +03:00
Aliaksandr Valialkin
31ce0e29cd docs/Single-server-VictoriaMetrics.md: VictoriaMetrics properly stores Inf values after 26115891db 2020-09-22 17:02:39 +03:00
Aliaksandr Valialkin
3b1e3a03e0 app/vmselect: make sure the request doesnt wait in pending queue more than the configured timeout
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2020-09-22 01:23:19 +03:00
Aliaksandr Valialkin
a69234ed18 lib/storage: code prettifying after be5e1222f3
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781
2020-09-22 00:36:45 +03:00
faceair
be5e1222f3 add filter to getMetricIDs (#783)
* add getMetricIDs filter

* check nil filter before use
2020-09-22 00:33:43 +03:00
Aliaksandr Valialkin
94f7d00537 docs/vmagent.md: typo fix 2020-09-21 21:49:22 +03:00
Aliaksandr Valialkin
f6f5c4118c docs: make docs-sync 2020-09-21 21:47:47 +03:00
Aliaksandr Valialkin
00b5145c69 app/vmselect/searchutils: fixed tests after 2eb72e09ab 2020-09-21 21:31:38 +03:00
Aliaksandr Valialkin
2eb72e09ab app/vmselect: use time value rounded to seconds if it isnt passed to /api/v1/query
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/720
2020-09-21 21:24:40 +03:00
Aliaksandr Valialkin
29108cc53e lib/logger: add -loggerDisableTimestamps command-line flag for disabling timestamps in logs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/778
2020-09-21 19:28:04 +03:00
Aliaksandr Valialkin
964bc7595c lib/promscrape/discovery/ec2: code prettifying after 312fead9a2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/771
2020-09-21 18:43:34 +03:00
Nikolay Khramchikhin
312fead9a2 Add improvements to ec2_sd_discovery (#775)
* Add improvements to ec2 discovery

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/771

 role_arn support with aws sts
 instance iam_role support
 refreshing temporary tokens

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* changed implementation, removed tests, clean up code

* moved endpoint builder into getEC2APIResponse

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2020-09-21 16:04:15 +03:00
Aliaksandr Valialkin
1e1a27d803 app/vmalert: remove unneeded UTC() call
UTC() doesn't change the underlying timestamp, so the call isn't needed here
2020-09-21 15:55:59 +03:00
Aliaksandr Valialkin
9739283dad lib/storage: reduce CPU load for idle VictoriaMetrics by reducing the frequency for the need for background merges 2020-09-21 15:54:11 +03:00
Roman Khavronenko
5dffc7a553 vmalert: add support for datasource.lookback flag (#779)
New datasource flag `datasource.lookback` defines how far to look into
past when evaluating queries.

Address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/668
2020-09-21 15:53:49 +03:00
Roman Khavronenko
82c3bbce34 vmalert: fix the typo in error message (#782)
The error will be always nil so no sense in printing it.
2020-09-21 11:34:23 +03:00
Aliaksandr Valialkin
3e8569f456 lib/decimal: optimize maxUpExponent() by eliminating division from hot path 2020-09-19 13:50:09 +03:00
Aliaksandr Valialkin
f00e0e0103 lib/persistentqueue: sync data to file inside filestream.Writer.MustFlush 2020-09-19 12:51:41 +03:00
Aliaksandr Valialkin
26115891db lib/decimal: properly store Inf values
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752
2020-09-18 19:07:07 +03:00
Aliaksandr Valialkin
d50165ad59 app/vmagent: increase default value for -remoteWrite.queues from 1 to 4, since it has been appeared that many users hit this limit 2020-09-18 14:21:54 +03:00
Aliaksandr Valialkin
63d3c88c3b vendor: update github.com/valyala/quicktemplate from v1.6.2 to v1.6.3 2020-09-18 13:10:48 +03:00
Aliaksandr Valialkin
1a9ee39b0e lib/promscrape: avoid copying response body when scraping targets.
This should reduce memory usage when scraping targets with millions of metrics.
2020-09-18 13:05:43 +03:00
Aliaksandr Valialkin
70c721c01b lib/persistentqueue: flush data to disk every second
Previously small amounts of data may be left unflushed for extended periods of time if vmagent collects small amounts of data.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/687
2020-09-18 13:05:40 +03:00
Aliaksandr Valialkin
74e3198281 vendor: udpate github.com/VictoriaMetrics/fasthttp from v1.0.5 to v1.0.7 2020-09-18 12:20:29 +03:00
Aliaksandr Valialkin
98d1cd0971 app/vmselect/graphite: return proper results /metrics/find?query=foo.*.bar according to Graphite Metrics API 2020-09-18 11:00:00 +03:00
Aliaksandr Valialkin
7a134b0fd7 app/vmstorage: added -forceMergeAuthKey command-line flag for protecting /internal/force_merge endpoint 2020-09-17 14:21:53 +03:00
Aliaksandr Valialkin
1f33dd717f lib/storage: add /internal/force_merge handler for running forced compactions on historical per-month partitions
This may be useful for freeing up storage space after time series deletion.

See https://victoriametrics.github.io/#force-merge for more details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-17 12:20:40 +03:00
Aliaksandr Valialkin
8beb0da6ad lib/{mergeset,storage}: compare errors with errors.Is() 2020-09-17 03:03:02 +03:00
Aliaksandr Valialkin
067d7c1ea1 lib/{mergeset,storage}: code prettifying 2020-09-17 02:06:31 +03:00
Aliaksandr Valialkin
020bd8685e lib/storage: removed duplicate checks for empty parts during merge - another check is in the beginning of mergeParts functions 2020-09-17 01:49:03 +03:00
Aliaksandr Valialkin
f2a449983d vendor: make vendor-update 2020-09-17 01:43:19 +03:00
Aliaksandr Valialkin
8674963f6a docs/Single-server-VictoriaMetrics.md: document that /api/v1/series/count may count delete time series
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/770
2020-09-17 01:38:17 +03:00
Aliaksandr Valialkin
ab53cb6f7b app/vmagent: substitute -remoteWrite.url with secret-url value in logs, since it may contain sensitive info such as passwords or auth tokens
Pass `-remoteWrite.showURL` command-line flag in order to see real `-remoteWrite.url` values in logs and at `/metrics` page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/773
2020-09-16 22:36:25 +03:00
Aliaksandr Valialkin
9f79bcf64a app/vmselect: improve description for -search.maxQueryDuration 2020-09-16 21:15:41 +03:00
Aliaksandr Valialkin
39dee12ed7 lib/persistentqueue: code simplification after d455764a6f 2020-09-16 21:14:19 +03:00
Aliaksandr Valialkin
d455764a6f lib/persistentqueue: make the persistent queue more durable against unclean shutdown (kill -9, OOM, hard reset)
The strategy is:

- Periodical flushing of inmemory blocks to files, so they aren't lost on unclean shutdown.
- Periodical syncing of metadata for persisted queues, so the metadata remains in sync with the persisted data.
- Automatic adjusting of too big chunk size when opening the queue. The chunk size may be bigger than the writer offset after unclean shutdown.
- Skipping of broken chunk file if it cannot be read.
- Fsyncing finalized chunk files.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/687
2020-09-16 18:13:44 +03:00
Aliaksandr Valialkin
ffadf035fa lib/protoparser/vmimport: add more testcases for invalid timestamps and values
Updates https://github.com/VictoriaMetrics/vmctl/issues/25
2020-09-16 02:22:06 +03:00
Aliaksandr Valialkin
d8183c3124 lib/protoparser: report more errors for incorrect timestamps and/or values
Previously certain errors in timestamps and/or values could be silently skipped,
which could lead to samples with zero values stored in the database.

Updates https://github.com/VictoriaMetrics/vmctl/issues/25
2020-09-16 02:14:18 +03:00
Aliaksandr Valialkin
9bc8484ab6 lib/protoparser/graphite: return error when value or timestamp cannot be properly parsed
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/99
2020-09-16 01:35:12 +03:00
Aliaksandr Valialkin
26fa94ba8d vendor: update github.com/valyala/fastjson from v1.5.4 to v1.5.5
This should properly parse `+Inf` values when importing JSON lines via `/api/v1/import`

Updates https://github.com/VictoriaMetrics/vmctl/issues/25
2020-09-16 00:07:56 +03:00
Aliaksandr Valialkin
0bccb58e80 docs/Single-server-VictoriaMetrics.md: mention that VictoriaMetrics ignores NaN and Inf values during data ingestion
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752
2020-09-15 23:40:28 +03:00
Aliaksandr Valialkin
1fec47a289 app/vmselect/netstorage: reduce memory usage when the time range from query touches big number of samples per each time series 2020-09-15 21:08:28 +03:00
Aliaksandr Valialkin
8c3d7c1a59 app/vmselect: typo fix in -search.maxStalenessInterval description 2020-09-15 14:24:27 +03:00
Aliaksandr Valialkin
fa01169c3d lib/promscrape: add a link to troubleshooting docs to error message when duplicate scrape target with identical labels is skipped 2020-09-15 14:16:05 +03:00
Aliaksandr Valialkin
51598bd718 docs/Articles.md: add a link to https://medium.com/miro-engineering/prometheus-high-availability-and-fault-tolerance-strategy-long-term-storage-with-victoriametrics-82f6f3f0409e 2020-09-15 12:29:10 +03:00
Aliaksandr Valialkin
ba74d0c14c lib/promscrape: typo fix 2020-09-12 00:14:21 +03:00
Aliaksandr Valialkin
7d893a234c lib/promscrape: do not reset the remaining rows when pushing a part of data to remote storage during big scrapes
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/753

Thanks to @PerGon and @clmssz for help with debugging.
2020-09-11 23:39:13 +03:00
Aliaksandr Valialkin
0e533d1a9c app/vmselect/promql: support composite durations like Prometheus 2.21 does
The following durations are supported now: `1h5m35s` or `1s543ms`

See https://github.com/prometheus/prometheus/releases/tag/v2.21.0
and https://github.com/prometheus/prometheus/pull/7713
2020-09-11 23:39:13 +03:00
Aliaksandr Valialkin
0e19f35af5 lib/promscrape/discovery/dns: add __meta_dns_srv_record_target and __meta_dns_srv_record_port labels
This syncs dns service discovery with Prometheus 2.21 - see https://github.com/prometheus/prometheus/releases
and https://github.com/prometheus/prometheus/pull/7678 .
2020-09-11 23:39:13 +03:00
Roman Khavronenko
6ad6480400 vmalert: add Group name as label to generated alerts and timeseries (#761)
Solves #611
2020-09-11 20:52:56 +01:00
Roman Khavronenko
4cdffb04a4 vmalert: update groups on config reload only if changes detected (#759)
On config reload event `vmalert` reloads configuration for every group. While
it works for simple configurations, the more complex and heavy installations may
suffer from frequent config reloads.
The change introduces the `checksum` field for every group and is set to md5 hash
of yaml configuration. The checksum will change if on any change to group
definition like rules order or annotation change. Comparing the `checksum` field
on config reload event helps to detect if group should be updated.
The groups update is now done concurrently, so reload duration will be limited by
the slowest group now.

Partially solves #691 by improving config reload speed.
2020-09-11 20:14:30 +01:00
Aliaksandr Valialkin
ca856284e4 app/vmagent: allow setting multiple identical -remoteWrite.url values
This may be useful when each url is authenticated via different `-remoteWrite.basicAuth.username`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/755
2020-09-11 15:17:22 +03:00
Aliaksandr Valialkin
62fde80490 lib/protoparser/common: do not read request body when parsing timestamp query arg
This was preventing from reading data via /api/v1/prometheus/import .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/750
2020-09-11 14:44:58 +03:00
Aliaksandr Valialkin
5a90a92378 lib/storage: do not store inf values, since they may lead to significant precision loss for previously stored values
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752
2020-09-11 14:44:53 +03:00
Aliaksandr Valialkin
a2f647d142 app/vmselect/prometheus: typo fix in the description for -search.latencyOffset command-line flag 2020-09-11 14:16:46 +03:00
Aliaksandr Valialkin
f95eea60d1 lib/protoparser: accept timestamp in milliseconds instead of seconds at /api/v1/import/prometheus
This improves consistency with timestamps in Prometheus text exposition format
2020-09-11 14:04:46 +03:00
Aliaksandr Valialkin
2380e9b017 app/{vminsert,vmagent}: allow passing timestamp via timestamp query arg when ingesting data to /api/v1/import/prometheus
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/750
2020-09-11 13:27:14 +03:00
Aliaksandr Valialkin
f0005c3007 app/vmselect: move Deadline from netstorage to searchutils
This removes dependency on netstorage from searchutils.
2020-09-11 13:27:13 +03:00
Aliaksandr Valialkin
2114179e19 app/vmselect: substitute inf values at smooth_exponential with the previous values
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/757
2020-09-11 12:24:14 +03:00
Nikolay Khramchikhin
6c80ae0da8 Added endpointslices discovery to k8s api (#760)
This is similar to https://github.com/prometheus/prometheus/pull/6838 , which will be added in Prometheus v2.21.
See https://github.com/prometheus/prometheus/releases/tag/v2.21.0-rc.1

* Added endpointslices discovery to k8s api

Started from 1.17 k8s version endpointslices is beta,
it allows to query k8s api for endpoints more efficient.
It presents at scrape_config.yaml as separate role for kubernetes_sd_config.
kubernetes_sd_config:
- role: endpointslices

* fixed typos, changed EndpointConditions signature - with values instead of pointers
2020-09-11 12:16:45 +03:00
Aliaksandr Valialkin
204ec415b4 app/vmselect: skip infinite values when calculating smooth_exponential
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/757
2020-09-11 11:29:58 +03:00
Aliaksandr Valialkin
8a8b5a73d3 app/vmselect/graphite: typo fix in label name for vm_request_duration_seconds metric 2020-09-11 01:58:28 +03:00
John Belmonte
c9d0905b17 fix typo on outliersk() doc (#758) 2020-09-11 00:55:53 +03:00
Aliaksandr Valialkin
f6bc608e86 app/vmselect: initial implementation of Graphite Metrics API
See https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api
2020-09-11 00:30:01 +03:00
Aliaksandr Valialkin
3eccecd5fd vendor: make vendor-update 2020-09-10 09:49:13 +03:00
Aliaksandr Valialkin
b3dcaf0cd7 deployment/docker: update Go builder from v1.15.1 to v1.15.2
This fixes the following issues in Go runtime - see https://github.com/golang/go/issues?q=milestone%3AGo1.15.2+label%3ACherryPickApproved
2020-09-10 09:36:43 +03:00
Aliaksandr Valialkin
9d8fdff6c5 lib/storage: reuse timestamp blocks for adjancent metric blocks with identical timestamps
This should reduce disk space usage when scraping targets containing metrics with identical names
such as `node_cpu_seconds_total`, histograms, quantiles, etc.

Expose `vm_timestamps_blocks_merged_total` and `vm_timestamps_bytes_saved_total` metrics for monitoring
the effectiveness of timestamp blocks merging.
2020-09-09 23:59:32 +03:00
Aliaksandr Valialkin
d7c04db1fc docs: sync docs for vmalert, vmauth, vmbackup and vmrestore 2020-09-09 21:10:34 +03:00
Aliaksandr Valialkin
e5ed8c8d75 docs/Articles.md: add links to recently published third-party articles and talks about VictoriaMetrics 2020-09-09 20:15:27 +03:00
Aliaksandr Valialkin
9d431a4b45 docs/Single-server-VictoriaMetrics.md: typo fix 2020-09-09 01:21:45 +03:00
Aliaksandr Valialkin
4739dff6f0 docs/Single-server-VictoriaMetrics.md: typo fix 2020-09-09 00:59:37 +03:00
Aliaksandr Valialkin
11eaa37111 docs/vmagent.md: clarified the case when -remoteWrite.queues must be tuned
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/745
2020-09-08 20:15:27 +03:00
Aliaksandr Valialkin
df169b1ebd lib/httpserver: add a jitter to connection timeouts in order to protect from Thundering herd problem 2020-09-08 19:55:09 +03:00
Aliaksandr Valialkin
9d61d24142 vendor: make vendor-update 2020-09-08 15:20:01 +03:00
Aliaksandr Valialkin
62919eaf7e app/vmselect/promql: go fmt 2020-09-08 15:19:59 +03:00
Aliaksandr Valialkin
e6da63dffe app/vmselect/promql: adjust integrate() calculations to be more similar to calculations from InfluxDB: attempt #2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/701
2020-09-08 14:35:50 +03:00
Aliaksandr Valialkin
8e85b56737 app/vmselect/promql: adjust integrate() calculations to be more similar to calculations from InfluxDB
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/701
2020-09-08 14:23:39 +03:00
Aliaksandr Valialkin
c0343a661b app/vmselect/promql: increase floating point calculations accuracy by dividing by 1e3 instead of multiplying by 1e-3 2020-09-08 14:00:47 +03:00
Aliaksandr Valialkin
1bca6160a3 docs/Single-server-VictoriaMetrics.md: make docs-sync 2020-09-07 21:58:06 +03:00
John Belmonte
ccfb7c5e29 revise /api/v1/series docs (#746)
* revise /api/v1/series docs

Further clarification for #735

  * clarify how default range differers from Prometheus API
  * avoid `start=0` suggestion when confirming delete, because it will cause a timeout in most deployments

* Update README.md
2020-09-07 21:57:34 +03:00
Nikolay Khramchikhin
8d71a60a76 Changed s3 configProfile flag default, (#749)
aws sdk has complicated logic for chosing profile name and we shouldn't set
it to `default` value. It leads to bugs and improper configuration.
Set it to empty value by default is safe. It will be automatically set to `default` by sdk.
2020-09-07 21:53:24 +03:00
Aliaksandr Valialkin
eb33a48b9b docs/Single-server-VictoriaMetrics.md: sync with README.md 2020-09-04 03:30:05 +03:00
John Belmonte
cd7426be6e document minScrapeInterval semantics (#744)
* document `minScrapeInterval` semantics

Fixes #714.

* Update README.md

revise wording
2020-09-04 03:29:26 +03:00
Aliaksandr Valialkin
a5621b9c46 docs/Single-server-VictoriaMetrics.md: updates according to review comments at fe98ba5a60 2020-09-04 02:57:02 +03:00
Aliaksandr Valialkin
be6ae4b5e7 lib/memory: fall back to reading hierarchical memory limit in cgroups when the default limit isn't set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/699
2020-09-04 00:05:05 +03:00
Aliaksandr Valialkin
d387da142e lib/httpserver: add -http.connTimeout command-line flag for limiting the lifetime for incoming http connections
This can be useful for balancing incoming connections among multiple services.
2020-09-03 22:23:29 +03:00
Aliaksandr Valialkin
e1c2757f70 vendor: update github.com/VictoriaMetrics/metricsql from v0.4.3 to v0.5.1
The new version of the package supports binary operations on string literals:

    * "foo" + "bar"     => "foobar"
    * "foo" == "bar"    => NaN
    * "foo" == "foo"    => 1
    * "foo" >bool "bar" => 1
    * "foo" < "bar"     => NaN

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/717
2020-09-03 16:33:31 +03:00
Aliaksandr Valialkin
f4e7e5fb90 app/vmselect/promql: add count_le_over_time(m[d], le) and count_gt_over_time(m[d], gt) functions
These functions returns the number of raw samples that don't exceed `le` or are bigger than `gt`.
These functions are complement to already existing `share_le_over_time(m[d], le)` and `share_gt_over_time(m[d], gt)`.
2020-09-03 15:29:10 +03:00
Aliaksandr Valialkin
d5b985f086 vendor: update github.com/VictoriaMetrics/metricsql from v0.4.1 to v0.4.2
The new version of this package properly supports escaped identifiers.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/743
2020-09-03 15:01:42 +03:00
Aliaksandr Valialkin
e706e59d49 app/vmselect: unconditionally align time range boundaries to step for subqueries as Prometheus does 2020-09-03 13:29:50 +03:00
Aliaksandr Valialkin
fe98ba5a60 docs/Single-server-VictoriaMetrics.md: mention that /api/v1/series returns series for the last 5 minutes if start query arg is missing
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/735
2020-09-03 12:38:29 +03:00
Aliaksandr Valialkin
ddabc13796 app/vmagent: properly flush big blocks of data
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/741

Thanks to @IceRain00 for the investigation and initial attempt to fix the issue
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/742
2020-09-03 12:12:39 +03:00
Aliaksandr Valialkin
7a839b461f app/vmagent: fix data race when accessing writeRequest.lastFlushTime 2020-09-03 12:12:37 +03:00
Nikolay Khramchikhin
764b3d4fda changed vmalert behaviour (#738)
* VMAlert start with empty rules dir

There are some applications (operator for instance), that generates alerts configuration at runtime
and vmalert must start correctly without rules to support this behaviour.
Later application will add rules files and send SIGHUP to vmalert,
which will trigger reading rules files and start rules exectuion.

Removing rules files with SIGHUP signal must stop rules execution and
vmalert will wait for new rules.

* imports sorted

* added test cases for empty rules, removed blank line

* fixed imports conflict

* updated tests
2020-09-03 11:04:42 +03:00
Aliaksandr Valialkin
b4afc6ee2f docs/Single-server-VictoriaMetrics.md: add missing link to Prometheus text exposition format 2020-09-03 01:10:11 +03:00
Aliaksandr Valialkin
5f16ceb294 app/vmalert: imrovements over 3f932c2db1 2020-09-03 01:00:55 +03:00
DexterZhang
3f932c2db1 feat: spread load of rule evaluation by group when starting new groups (#724)
* feat: spread load of rule evaluation by group when starting new groups

* review: reduce the resulting diff.

* Update app/vmalert/group.go

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2020-09-03 00:58:54 +03:00
Aliaksandr Valialkin
f41b36bb9a app/{vminsert,vmagent}: allow adding extra labels when importing data via Prometheus, CSV and JSON line formats
Extra labels may be added to the imported data by passing `extra_label=name=value` query args.
Multiple query args may be passed in order to add multiple extra labels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/719
2020-09-02 19:43:21 +03:00
Aliaksandr Valialkin
038358b777 lib/promscrape: use the number of parsed rows as a basis for writeRequestCtxPool leveling
The previous basis on `cap(sw.labels)` doesn't work anymore after 7785869ccc ,
because `sw.labels` may be reset multiple times when processing big number of rows.
2020-09-02 18:46:01 +03:00
Roman Khavronenko
ed899ca9e8 Single dashboards update (#736)
* dashboard: rename var `datasource` to `ds` for consistency reason

Dasbhoards for cluster version or vmagent operate with datasource variable
named `ds`. For consistency sake we rename this variable in single node version
as well.

* dashboard: add instance variable picker

See dashboard reviews here https://grafana.com/grafana/dashboards/10229/reviews

* dashboard: limit number of buckets in histogram to 12 for vmagent dashboard

* dashboard: bump version requirement in description for single version

* dashboard: drop extra series override for single version

* dashboard: set Y-min to zero for most of panels in vmagent dashboard
2020-09-02 15:16:40 +03:00
Aliaksandr Valialkin
e9196655dd deployment/docker: update Go builder from v1.15.0 to v1.15.1 2020-09-02 15:10:15 +03:00
Aliaksandr Valialkin
821df709d3 vendor: make vendor-update 2020-09-02 15:05:16 +03:00
John Belmonte
67277abecf use Y-min 0 on Grafana dashboard graphs (#732) 2020-09-01 19:56:56 +01:00
Aliaksandr Valialkin
c2ff8de456 lib/httpserver: add -http.idleConnTimeout command-line flag for tuning the timeout for incoming idle http connections 2020-09-01 15:33:24 +03:00
Aliaksandr Valialkin
b059f194e4 lib/promscrape: fix applying sample_limit when scraping targets with big number of metrics
This has been broken at 7785869ccc
2020-09-01 11:08:13 +03:00
Aliaksandr Valialkin
7785869ccc lib/promscrape: reduce memory usage when scraping targets with millions of metrics
This should help when scraping /federate endpoints from Prometheus instances,
which scrape millions of metrics. See https://prometheus.io/docs/prometheus/latest/federation/
2020-09-01 10:57:07 +03:00
Aliaksandr Valialkin
5af777469a app/vmagent: log unsuccessful attempt number when sending data to -remoteWrite.url 2020-08-30 21:40:22 +03:00
Aliaksandr Valialkin
2149733bd2 app/vmagent: apply sane limits to -remoteWrite.queues
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/707
2020-08-30 21:25:37 +03:00
Aliaksandr Valialkin
dd20784d06 docs/Single-server-VictoriaMetrics.md: mention that VictoriaMetrics accepts relative times at time, start and end query args 2020-08-28 10:13:16 +03:00
Aliaksandr Valialkin
de6970e828 docs/vmalert.md: sync with app/vmalert/README.md via make docs-update 2020-08-28 09:51:48 +03:00
Aliaksandr Valialkin
4a415620d3 docs/Articles.md: add a link to https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-filtering-and-modifying-time-series-6d40cea4bf21 2020-08-28 09:51:26 +03:00
Aliaksandr Valialkin
acbcad1ece lib/{promscrape,leveledbytebufferpool}: rename getPoolIdAndCapacity to getPoolIDAndCapacity in order to make golint happy 2020-08-28 09:49:32 +03:00
Aliaksandr Valialkin
f4c4ab811b lib/cgroup: limit the maximum GOMAXPROCS value to the number of available CPU cores
There is no sense in setting GOMAXPROCS to value higher than the number of available CPU cores.
2020-08-28 09:49:32 +03:00
Roman Khavronenko
10601bc652 vmalert: update -rule flag description to enforce quotes using (#709)
Description for `-rule` flag uses as example specific chars like asterisks
which could be interpreted wrong by different shells. To avoid this, description
now contains quoted flag values.

See also #708
2020-08-20 22:36:38 +01:00
Roman Khavronenko
f2c004d1ae lib/flagutil: avoid int overflow for arch 386 (#710)
Arch 386 is a 32-bit architecture and interprets int type for numbers as an explicit int32,
whereas on most modern CPUs int is implicitly an int64. This makes tests to fail with
`int overflow` error.
2020-08-20 22:27:37 +01:00
Aliaksandr Valialkin
efc730863b lib/promscrape: reduce memory usage when scraping targets with big number of metrics alongside targets with small number of labels
Previously targets with big number of metrics and/or labels could generated too big buffers,
which then could be re-used when scraping targets with small number of metrics.
This resulted in memory waste.

Now big buffers are used only for targets with big number of metrics / labels,
while small buffers are used for targets with small number of metrics / labels.
2020-08-16 22:29:51 +03:00
Aliaksandr Valialkin
d6967319b6 lib/leveledbytebufferpool: allocate byte buffers with capacity rounded to the upper boundary for the given bucket
This should reduce the number of resizings for the returned byte buffers.
2020-08-16 22:13:30 +03:00
Roman Khavronenko
f5f59896ec lib/decimal: rename significant decimal digits to significant figures (#698)
The previous notion was inconsistent with what `decimal.Round` does.
According to [wiki](https://en.wikipedia.org/wiki/Significant_figures) rounding
applied to all significant figures, not just decimal ones.
2020-08-16 17:21:35 +03:00
Aliaksandr Valialkin
147c35ebd4 all: allow using KB, MB, GB, KiB, MiB and GiB suffixes in command-line flag values related to byte sizes or byte rates 2020-08-16 17:05:52 +03:00
Aliaksandr Valialkin
7c0d6a8b88 lib/memory: improve log message about the memory allowed to use by VictoriaMetrics 2020-08-16 16:04:11 +03:00
Aliaksandr Valialkin
ed00eb3f33 lib/protoparser: removed unnecessary call to SetReadDeadline when reading a stream of data
The OS should return any buffered data in the stream without the need to set the read timeout.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-15 15:38:08 +03:00
Aliaksandr Valialkin
7615a3ab8d vendor: upgrade github.com/valyala/gozstd from v1.7.1 to v1.8.3 2020-08-15 15:11:56 +03:00
Aliaksandr Valialkin
7be9bedaf9 vendor: downgrade github.com/valyala/gozstd from v1.8.1 to v1.7.1 until https://github.com/facebook/zstd/issues/2222 is fixed 2020-08-15 14:46:32 +03:00
Aliaksandr Valialkin
00b1659dde lib: dump compressed block contents on error during decompression
This should improve detecting root cause for https://github.com/facebook/zstd/issues/2222
2020-08-15 14:44:33 +03:00
Aliaksandr Valialkin
528e25bdde vendor: update github.com/valyala/gozstd from v1.7.0 to v1.8.1 2020-08-15 13:46:43 +03:00
Aliaksandr Valialkin
b3849a90fd lib/leveledbytebufferpool: pre-allocate byte slice with the given capacity if the pool is empty
This should reduce memory allocations and copying when the byte slice is growing.
2020-08-15 01:40:54 +03:00
Aliaksandr Valialkin
7d89fafe1a app/vmselect/promql: allow passing multiple args to aggregate functions such as avg(q1, q2, q3) 2020-08-15 01:15:09 +03:00
Aliaksandr Valialkin
cd96248480 docs/vmagent.md: mention that gaps in remote storage may appear if vmagent cannot keep up with data ingestion 2020-08-14 20:47:57 +03:00
Aliaksandr Valialkin
7554be172d lib/protoparser: move common code for detecting timeouts to ReadLinesBlockExt
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-14 20:40:15 +03:00
Aliaksandr Valialkin
4beab7ad39 lib/protoparser: prevent from busy loop on repeated timeout errors when reading streams of ingested data
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-14 20:14:11 +03:00
Aliaksandr Valialkin
41d23f84ed docs/Cluster-VictoriaMetrics.md: sync with upstream 2020-08-14 19:15:29 +03:00
Aliaksandr Valialkin
184670fb9b docs: update docs 2020-08-14 19:13:42 +03:00
Aliaksandr Valialkin
52791fd1c0 lib/memory: add -memory.allowedBytes command-line flag for setting absolute memory limit for VictoriaMetrics caches 2020-08-14 19:13:38 +03:00
Aliaksandr Valialkin
576da0fe46 app/{vminsert,vmagent}: improve documentation for -influxListenAddr command-line flag 2020-08-14 18:04:44 +03:00
Aliaksandr Valialkin
215967437d lib/protoparser/prometheus: typo fix in error message 2020-08-14 11:04:23 +03:00
Aliaksandr Valialkin
d1ad3adcbe vendor: make vendor-update 2020-08-14 02:29:02 +03:00
Aliaksandr Valialkin
42960feff4 vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.4 to v1.0.5 2020-08-14 02:19:36 +03:00
Aliaksandr Valialkin
07246bc31c vendor: update github.com/klauspost/compress from v1.10.10 to v1.10.11 2020-08-14 02:17:07 +03:00
Aliaksandr Valialkin
e646674b23 lib/promscrape: use a hint on body length instead of body capacity
This should reduce memory usage for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/689
2020-08-14 01:17:52 +03:00
Aliaksandr Valialkin
4628deecd1 lib/promscrape: reduce memory usage when scraping big number of targets
Thanks to @dxtrzhang for the original idea at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/688

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/689
2020-08-14 01:04:53 +03:00
Aliaksandr Valialkin
eead3ee8ec lib/promscrape: properly retry requests on the server closed connection before returning the first response byte error during service discover API calls and target scrapes 2020-08-13 22:31:52 +03:00
Aliaksandr Valialkin
c402265e88 all: support %{ENV_VAR} placeholders in yaml configs in all the vm* components
Such placeholders are substituted by the corresponding environment variable values.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/583
2020-08-13 17:15:25 +03:00
Aliaksandr Valialkin
ff495a74f6 deployment/docker: update Go builder from Go1.14.7 to Go1.15.0 2020-08-13 15:53:32 +03:00
Aliaksandr Valialkin
45962fb8c2 docs/Cluster-VictoriaMetrics.md: mention about Kubernetes operator 2020-08-12 21:15:34 +03:00
Aliaksandr Valialkin
fd6c690276 docs/Single-server-VictoriaMetrics.md: mention helm charts, k8s operator and vmctl tool in Integrations chapter 2020-08-12 21:12:23 +03:00
Aliaksandr Valialkin
e730788477 docs/Articles.md: added https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-d44a6728f043 2020-08-12 21:03:05 +03:00
Aliaksandr Valialkin
ef7e2af8f5 app: respect CPU limits set via cgroups
Update GOMAXPROCS to limits set via cgroups. This should reduce CPU trashing and reduce memory usage
for cases when VictoriaMetrics components run in containers with CPU limits.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-11 22:59:19 +03:00
Aliaksandr Valialkin
15aa6142ef lib/protoparser: clarify that the string passed to Unmarshal() function must remain available when the parsed rows are in use 2020-08-11 17:04:39 +03:00
Aliaksandr Valialkin
5492edcc6c docs/Single-server-VictoriaMetrics.md: mention that it is safe to skip multiple versions during the upgrade 2020-08-11 14:21:37 +03:00
Aliaksandr Valialkin
e969ef2639 app/vmselect: reduce memory usage when exporting time series with big number of samples via /api/v1/export if max_rows_per_line is set to non-zero value
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-10 20:57:36 +03:00
Aliaksandr Valialkin
c098988a18 lib/protoparser/influx: accept precision=us and precision=µ according to https://docs.influxdata.com/influxdb/v1.8/tools/api/#write-http-endpoint 2020-08-10 20:23:26 +03:00
Aliaksandr Valialkin
1bdfa29ef7 lib/promscrape: optimize per-metric hash calculations
This increases vmagent performance by up to 10% when scraping big number of metrics
2020-08-10 19:49:03 +03:00
Aliaksandr Valialkin
8adba82c02 app/vmselect/netstorage: vary batch size for data unpacking depending on the available CPU cores
This should reduce contention on the channel with unpack work for systems with high number of CPU cores
2020-08-10 15:16:42 +03:00
Aliaksandr Valialkin
8d9eb5f808 lib/storage: mention time range used in the query that led to error message
This should improve detecting slow queries with too big time ranges
2020-08-10 13:46:36 +03:00
Aliaksandr Valialkin
582c74cd93 lib/storage: mention tag filters used in the query that led to error message
This should improve detecting invalid or heavy queries that lead to errors.
2020-08-10 13:36:49 +03:00
Aliaksandr Valialkin
f3d33e23c9 app/vmstorage: improve error logging when the request times out 2020-08-10 13:23:26 +03:00
Aliaksandr Valialkin
455bf50a91 lib/promscrape: show real timestamp and real duration for the scape on /targets page
Previously the scrape duration may be negative when calculated scrape timestamp drifts away from the real scrape timestamp
2020-08-10 12:40:25 +03:00
Aliaksandr Valialkin
2791008e19 vendor: make vendor-update 2020-08-09 15:13:55 +03:00
Aliaksandr Valialkin
a499de45cc lib/promscrape: make errcheck happy 2020-08-09 13:17:18 +03:00
Aliaksandr Valialkin
23c9e6b727 lib/promscrape: export scrape_samples_added per-target metric like Prometheus does
This metric may be useful for detecting targets with high churn rate for the exported metrics.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/683
2020-08-09 12:45:39 +03:00
Aliaksandr Valialkin
9d32fb1d9e lib/fs: use WARN instead of ERROR log level for the message when NFS diretory removal temporarily fails
this is expected condition, so it is better to use WARN log level for it
2020-08-09 12:07:30 +03:00
Aliaksandr Valialkin
d4b6d22987 lib/promscrape: add a test for scrape config for blackbox exporter
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/684
2020-08-09 12:02:48 +03:00
Roman Khavronenko
0be5b09fb4 app/vmalert: extend metrics set exported by vmalert #573 (#654)
* app/vmalert: extend metrics set exported by `vmalert` #573

New metrics were added to improve observability:
+ vmalert_alerts_pending{alertname, group} - number of pending alerts per group
per alert;
+ vmalert_alerts_acitve{alertname, group} - number of active alerts per group
per alert;
+ vmalert_alerts_error{alertname, group} - is 1 if alertname ended up with error
during prev execution, is 0 if no errors happened;
+ vmalert_recording_rules_error{recording, group} - is 1 if recording rule
 ended up with error during prev execution, is 0 if no errors happened;
* vmalert_iteration_total{group, file} - now contains group and file name labels.
This should improve control over specific groups;
* vmalert_iteration_duration_seconds{group, file} - now contains group and file name labels. This should improve control over specific groups;

Some collisions for alerts and recording rules are possible, because neither
group name nor alert/recording rule name are unique for compatibility reasons.

Commit contains list of TODOs for Unregistering metrics since groups and rules
are ephemeral and could be removed without application restart. In order to
unlock Unregistering feature corresponding PR was filed - https://github.com/VictoriaMetrics/metrics/pull/13

* app/vmalert: extend metrics set exported by `vmalert` #573

The changes are following:
* add an ID label to rules metrics, since `name` collisions within one group is
a common case - see the k8s example alerts;
* supports metrics unregistering on rule updates. Consider the case when one rule
was added or removed from the group, or the whole group was added or removed.

The change depends on https://github.com/VictoriaMetrics/metrics/pull/16
where race condition for Unregister method was fixed.
2020-08-09 09:41:29 +03:00
ofen
81746d14b9 401 Unauthorize HTTP error added (#681)
401 Unauthorize HTTP error added to trigger browser credentials pop-up promt [RFC 7235 https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication]
2020-08-09 09:38:41 +03:00
Aliaksandr Valialkin
807c2b076c vendor: update github.com/VictoriaMetrics/metrics from v1.12.2 to v1.12.3 2020-08-07 13:02:51 +03:00
Aliaksandr Valialkin
84fd8af6d3 lib/storage: slow down concurrent searches when the number of concurrent inserts reaches the limit
This should improve data ingestion performance when heavy searches are executed

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-08-07 08:49:40 +03:00
Aliaksandr Valialkin
9043a509a3 lib/storage: properly check timeouts and pace limits
Previously they were checked on every iteration for small number of iterations
2020-08-07 08:40:37 +03:00
Aliaksandr Valialkin
1ad3de5c54 deployment/docker: update Go builder from v1.14.6 to v1.14.7 2020-08-07 08:29:06 +03:00
Aliaksandr Valialkin
d60908bba4 docs/MetricsQL.md: mention that MetricsQL removes all the NaN values from results 2020-08-07 07:51:45 +03:00
Aliaksandr Valialkin
716754fae6 app/vmselect/promql: properly handle -n^m like Prometheus does
`-n^m` must be handled as `-(n^m)` instead of `(-n)^m`.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/675
2020-08-07 07:42:18 +03:00
Aliaksandr Valialkin
bb61a4769b app/vmselect/promql: remove metric name after applying clamp_min and clamp_max functions in order to be consistent with Prometheus
This improves VictoriaMetrics score at https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:42:37 +03:00
Aliaksandr Valialkin
ac45082216 app/vmselect/promql: remove metric name after applying ceil, floor and round functions in order to be more consistent with Prometheus
This improves VictoriaMetrics score at https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:34:37 +03:00
Aliaksandr Valialkin
e5202a4eae app/vmselect/promql: remove metric name from results of certain rollup functions in order to be consistent with Prometheus
Rollup functions:

  - avg_over_time
  - min_over_time
  - max_over_time
  - quantile_over_time

This improves VictoriaMetrics results at https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:29:13 +03:00
Aliaksandr Valialkin
68e4f40a72 app/vmselect: properly handle PromQL queries like scalar1 < metric < scalar2 like Prometheus does
This fixes some cases from https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:21:03 +03:00
Aliaksandr Valialkin
ada2ae69ec vendor: update github.com/VictoriaMetrics/metricsql from v0.2.10 to v0.3.0
This adds support for special integers in MetricsQL that start from 0x, 0b, 0o.
This improves compatibility with PromQL - see https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 21:45:21 +03:00
Aliaksandr Valialkin
bc8381613d app/vmselect: reduce memory allocations by pre-allocatin memory for time series map and for a list of time series names 2020-08-06 19:17:58 +03:00
Aliaksandr Valialkin
8e44fba76d lib/storage: reduce the frequency (and overhead) for timeout and pace limiter checks by 4x 2020-08-06 18:45:55 +03:00
Aliaksandr Valialkin
7dbe335426 lib/pacelimiter: increase scalability for multi-CPU system 2020-08-06 18:32:59 +03:00
Aliaksandr Valialkin
3f85c06b65 app/vmselect/netstorage: reduce CPU contention when upacking time series blocks by unpacking batches of such blocks instead of a single block
This should improve query performance on systems with big number of CPU cores (16 and more)
2020-08-06 17:50:17 +03:00
Aliaksandr Valialkin
d20c2156e4 app/vmselect/netstorage: reduce contention on unpackworkCh and timeseriesWorkCh for multi-CPU system by providing more capacity for these chans 2020-08-06 17:22:48 +03:00
Aliaksandr Valialkin
ad730d8a17 lib/storage: optimize prefetching metric names for the given metricIDs 2020-08-06 16:53:10 +03:00
Aliaksandr Valialkin
dbbdfbe7ee app/vmstorage: rename vm_cache_size_entries{type="storage/prefetchedMetricIDs"} to vm_cache_entries{type="storage/prefetchedMetricIDs"} to be consistent with other vm_cache_entries metrics 2020-08-06 16:34:24 +03:00
Aliaksandr Valialkin
639b26b40c lib/fs: export vm_nfs_pending_dirs_to_remove metric for monitoring the number of pending directories that couldn't be removed due to NFS lock 2020-08-06 15:31:34 +03:00
Aliaksandr Valialkin
8f16388428 lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2
This should limit the maximum memory usage and reduce CPU trashing on vmstorage
when multiple heavy queries are executed.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-08-05 18:30:07 +03:00
Aliaksandr Valialkin
aaa497ff0b Perform conversion from string to []byte according to rule #6 at https://golang.org/pkg/unsafe/#Pointer 2020-08-05 11:55:58 +03:00
Aliaksandr Valialkin
ef94333808 vendor: make vendor-update 2020-08-05 11:10:10 +03:00
Aliaksandr Valialkin
c25b0c2cd5 app/vmagent: tune http client for sending data to remote storage in order to disable closing keep-alive connections
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/663
2020-08-04 21:00:29 +03:00
Aliaksandr Valialkin
5d0c37bec0 app/vmselect: use warning level instead of info level for logging slow queries that take longer than -search.logSlowQueryDuration 2020-08-04 20:25:35 +03:00
Antonin Kral
bba1442649 Add option to build 32b ARM Debian package (armhf) (#665) 2020-08-04 18:12:59 +03:00
Aliaksandr Valialkin
a9ffd233df docs/Single-server-VictoriaMetrics.md: add a chapter about data updates 2020-08-04 13:53:59 +03:00
Aliaksandr Valialkin
a034f02fb2 lib/backup: allow using ~/.aws/config without region
Use us-west-2 for determining bucket region.
2020-08-04 13:07:59 +03:00
Aliaksandr Valialkin
e6eee2bebf app/vmselect/promql: add zscore-related functions: zscore_over_time(m[d]) and zscore(q) by (...) 2020-08-03 21:52:18 +03:00
Aliaksandr Valialkin
509d12643b app/vmselect: show X-Forwarded-For contents on /api/v1/status/active_queries page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/659
2020-07-31 20:05:18 +03:00
Aliaksandr Valialkin
5e71fab8a6 lib/storage: reduce the maximum number of concurrent merge workers to GOMAXPROCS/2
Previously the limit has been raised to GOMAXPROCS, but it has been appeared that this
increases query latencies since more CPUs are busy with merges.

While at it, substitute `*MergeConcurrencyLimitCh` channels with simple integer limits.
2020-07-31 17:46:56 +03:00
Aliaksandr Valialkin
d01f3c1943 all: add mssing APP_NAME to vm*-GOARCH builds 2020-07-31 13:42:18 +03:00
Aliaksandr Valialkin
3f498cf2dc docs/{vmagent,vmalert}: add instruction on how to build for ARM 2020-07-31 09:25:22 +03:00
Aliaksandr Valialkin
8c8c14c127 docs/Single-server-VictoriaMetrics.md: mention that downgrade is also safe to perform 2020-07-31 09:20:40 +03:00
Aliaksandr Valialkin
44a86e1be3 vendor: update github.com/valyala/quicktemplate from v1.5.2 to v1.6.0 2020-07-30 23:39:40 +03:00
Aliaksandr Valialkin
f0c678c41b app/vmselect: do not adjust start and end query args passed to /api/v1/query_range when -search.disableCache command-line flag is set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/563
2020-07-30 23:14:37 +03:00
Aliaksandr Valialkin
e255c066cc docs/vmalert.md: sync with app/vmalert/README.md 2020-07-30 21:56:48 +03:00
Aliaksandr Valialkin
e7959094f6 lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected
The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts.

Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon,
which should return CPU and disk IO usage to normal levels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-07-30 19:57:27 +03:00
Aliaksandr Valialkin
922d9aadf2 lib/storage: properly update vm_slow_row_inserts_total metric when importing multiple data points per time series at once
Previously the `vm_slow_row_inserts_total` metric may be incremented multiple times for different data points per a single time series,
while only a single increment is needed when inserting the first data point for this time series.
2020-07-30 16:17:24 +03:00
Aliaksandr Valialkin
68716488db vendor: update github.com/valyala/quicktemplate from v1.5.1 to v1.5.2 2020-07-29 18:20:11 +03:00
Aliaksandr Valialkin
67a64c142d lib/httpserver: emit X-Forwarded-For additionally to remoteAddr in error logs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/659
2020-07-29 13:12:42 +03:00
Aliaksandr Valialkin
328b52e5ff app/vmselect/promql: return non-empty value from rate_over_sum(m[d]) even if a single data point is located in the given [d] window
Just divide the data point value by the window duration in this case.
2020-07-29 12:37:58 +03:00
Aliaksandr Valialkin
700737c181 app/vmselect/promql: remove rollupFuncArg.realPrevValue handling, since the corner case in increase() is handled in another way now
See e00cfc854d for the approach used now.
2020-07-29 12:37:58 +03:00
Aliaksandr Valialkin
2f735f112d app/vmselect/promql: fill gaps with 0 in rate_over_sum response when the last value before the selected time window isnt empty 2020-07-29 12:37:58 +03:00
Aliaksandr Valialkin
1ca0c8a29b vendor: make vendor-update 2020-07-29 09:36:08 +03:00
Aliaksandr Valialkin
d81d586b86 vendor: update github.com/VictoriaMetrics/metrics from v1.12.1 to v1.12.2 2020-07-28 22:02:29 +03:00
Aliaksandr Valialkin
0f63da3698 app/{vmagent,vminsert}: properly preserve db tag from query string passed to Influx line protocol query
Previously `db` tag from the query string wasn't added to metrics after encountering `db` tag in the Influx line

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/653
2020-07-28 21:25:19 +03:00
Aliaksandr Valialkin
62ed38c6f0 app/vmagent/remotewrite: add missing resp.Body.Close() after pushing data to remote storage
Missing body close could disable HTTP keep-alive connections.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/653
2020-07-28 21:00:15 +03:00
Aliaksandr Valialkin
79c30cf4cb app/vmselect: show query origin (aka remote_addr or client address) on the /api/v1/status/active_queries page for every query 2020-07-28 15:13:08 +03:00
Roman Khavronenko
2f1e7298ce app/vmalert: support external.label to specify global labelset for all rules #622 (#652)
`external.label` flag supposed to help to distinguish alert or recording rules
source in situations when more than one `vmalert` runs for the same datasource
or AlertManager.
2020-07-28 14:20:31 +03:00
Aliaksandr Valialkin
0da202023b app/vmselect/promql: return empty values from group() if all the time series have no values at the given timestamp
This aligns `group()` behaviour to Prometheus
2020-07-28 13:40:11 +03:00
Aliaksandr Valialkin
48d0ec1363 docs/MetricsQL.md: small fixes in the docs 2020-07-28 13:27:37 +03:00
Aliaksandr Valialkin
a1a065a47e docs/Single-server-VictoriaMetrics.md: mention that OpenTSDB data ingestion protocol is used by KairosDB 2020-07-28 13:11:07 +03:00
Aliaksandr Valialkin
0516e3f330 vendor: update github.com/VictoriaMetrics/metrics from v1.12.0 to v1.12.1 2020-07-28 00:20:43 +03:00
Sasasu
5b81bdde39 lib/storage: metaindexRow use memroy more efficiently (#655)
due to memory align the metaindexRow structure use 64-byte pre object.
this commit changes the order of field, make metaindexRow use 56-byte pre
object.

Signed-off-by: Sasasu <su@sasasu.me>
2020-07-27 19:02:53 +03:00
Aliaksandr Valialkin
865610a7c8 lib/protoparser/prometheus: add a test for cassandra-exporter
Thanks to Seva
2020-07-27 18:37:11 +03:00
Aliaksandr Valialkin
cb8c6908dc app/vmagent/remotewrite: create new request on failure to send a block of data to remote storage
Previously the request body was already consumed before the retry, so this led to the following error:

    http: ContentLength=... with Body length 0
2020-07-27 17:32:46 +03:00
Aliaksandr Valialkin
894dcb7c1c app/vmselect/promql: improve further the accuracy of buckets_limit() function
The accuracy is increased by mergin the smallest bucket with the smallest adjacent bucket.
2020-07-26 12:10:13 +03:00
Aliaksandr Valialkin
215eba0b82 app/vminsert: flush bufs if needed after the current row is added
Previously the data for the added row could be overwritten by the flush
before the row addition is complete.
2020-07-26 12:10:11 +03:00
Aliaksandr Valialkin
edb1eca6f1 app/vmselect/promql: avoid dropping inf bucket in buckets_limit
The `le="inf"` bucket must be preserved in order to maintain the maximum level of accuracy.
2020-07-25 17:00:36 +03:00
Aliaksandr Valialkin
97b6f5d223 app/vmselect/promql: optimize buckets_limit(k, buckets) for big number of buckets 2020-07-25 13:24:03 +03:00
Aliaksandr Valialkin
a090627059 app/vminsert: limit memory usage when ingesting data in big packets 2020-07-24 23:32:40 +03:00
Aliaksandr Valialkin
53c87ba341 deployment/docker/docker-compose.yml: update Grafana version from 7.0.3 to 7.1.1 2020-07-24 18:43:37 +03:00
Aliaksandr Valialkin
bb161497cf app/vmselect/promql: improve the accuracy of buckets_limit(k, buckets) function
Now it properly merges the bucket with the previous bucket after deletion.
2020-07-24 17:07:49 +03:00
Aliaksandr Valialkin
994fa2f3bf app/vmselect/promql: add buckets_limit(k, buckets) function, which limits the number of buckets per time series to k
This function works with both Prometheus-style and VictoriaMetrics-style buckets.
The function removes buckets with the lowest values in order to reserve the highest precision.
The function is useful for building heatmaps in Grafana from too big number of buckets.
2020-07-24 16:13:53 +03:00
Aliaksandr Valialkin
e151c5c644 app/vmselect: fix tests for rate_over_sum 2020-07-24 02:35:28 +03:00
Aliaksandr Valialkin
3107c633e3 app/vmselect/promql: typo fix after 3e557c9861 2020-07-24 02:15:58 +03:00
Aliaksandr Valialkin
3e557c9861 app/vmselect/promql: add rate_over_sum(m[d]) function to MetricsQL, which returns rate over sum of m values over d duration
Something like `sum_over_time(m[d]) / d`, but more accurate.
2020-07-24 01:17:42 +03:00
Aliaksandr Valialkin
54ef2d8112 lib/storage: slightly reduce code difference between single-node and cluster versions 2020-07-24 00:31:16 +03:00
Aliaksandr Valialkin
b1f6843bd0 app/vmselect/promql: allow setting [d] window smaller than the interval between raw points for avg_over_time
This makes `avg_over_time` behavior consistent with `sum_over_time` and `count_over_time` behaviors.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/636
2020-07-23 22:25:43 +03:00
Aliaksandr Valialkin
039c9d2441 lib/storage: respect -search.maxQueryDuration when searching for time series in inverted index
Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`.
This commit stops searching in inverted index on query timeout.
2020-07-23 21:21:42 +03:00
Aliaksandr Valialkin
2a45871823 lib/storage: add more fine-grained pace limiting for search 2020-07-23 19:26:08 +03:00
Aliaksandr Valialkin
461481fbdf app/vmselect/netstorage: protect from too smart compiler, which may break memory usage optimization in ProcessSearchQuery 2020-07-23 17:54:01 +03:00
Aliaksandr Valialkin
4c8b49b193 app/vminsert: export vm_relabel_metrics_dropped_total metric that shows the number of metrics dropped due to relabeling 2020-07-23 14:57:53 +03:00
Aliaksandr Valialkin
e79de9774b app/vmselect: typo fix after 34563916f7 2020-07-23 14:12:28 +03:00
Aliaksandr Valialkin
34563916f7 app/vmselect: reduce memory usage when querying big number of time series with long labels
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/646
2020-07-23 13:53:52 +03:00
Aliaksandr Valialkin
9257eee982 app/vminsert: do not call ApplyRelabeling function if relabeling is disabled
This should reduce CPU usage a bit when `-relabelConfig` isn't set
2020-07-23 13:39:44 +03:00
Aliaksandr Valialkin
6f05c4d351 lib/storage: improve prioritizing of data ingestion over querying
Prioritize also small merges over big merges.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-07-23 13:23:36 +03:00
Aliaksandr Valialkin
2f612e0c67 app/vminsert: fix relabeling for metrics ingested via Influx line protocol
Previously the enabled relabeling with `-relabelConfig` command-line flag could result in missing labels
if a single Influx line protocol message contains multiple field values.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/638
2020-07-23 13:23:14 +03:00
Aliaksandr Valialkin
61c611f5ad lib/storage: properly calculate global metrics in UpdateStats() 2020-07-23 00:35:15 +03:00
Aliaksandr Valialkin
9224ede54f lib/mergeset: properly calculate global metrics in UpdateStats()
Previously these metrics could be calculated multiple times for multiple mergeset.Table instances.
2020-07-23 00:35:13 +03:00
Aliaksandr Valialkin
228d137936 lib/storage: reorder mergeBlockStreams() args in order to make them more consistent 2020-07-22 21:58:10 +03:00
Aliaksandr Valialkin
e4303d3d21 lib/storage: prevent possible race condition when all the goroutines exit Storage.AddRows, before goroutines other goroutines are blocked on searchTSIDsCond inside Storage.searchTSIDs
This condition may occur after the following sequence of events:

1) A goroutine enters the loop body when len(addRowsConcurrencyCh) == cap(addRowsConcurrencyCh) inside Storage.searchTSIDs.
2) All the goroutines return from Storage.AddRows.
3) The goroutine from step 1 blocks on searchTSIDsCond.Wait() inside the loop body.

The goroutine remains blocked until the next call to Storage.AddRows, which calls searchTSIDsCond.Signal().
This may take indefinite time.
2020-07-22 21:52:34 +03:00
Aliaksandr Valialkin
ad8d3b387d docs/Single-server-VictoriaMetrics.md: mention that it is recommended inspecting logs during troubleshooting 2020-07-22 18:21:29 +03:00
Aliaksandr Valialkin
62e76ca805 vendor: make vendor-update 2020-07-22 16:54:44 +03:00
Aliaksandr Valialkin
4f526cc816 app/vmselect/prometheus: support d, w and y suffixes for durations passed to step in /api/v1/query_range like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/641
2020-07-22 16:26:18 +03:00
Aliaksandr Valialkin
dfb113f175 app/vmselect/netstorage: reduce memory allocations when unpacking time series data by using a pool for unpackWork entries
This should slightly reduce load on GC when processing queries that touch big number of time series.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/646 according to the provided memory profile there.
2020-07-22 15:03:57 +03:00
Aliaksandr Valialkin
31ae5911a8 app/vmagent: add -remoteWrite.decimalPlaces command-line flag, which may be used for reducing disk space usage on the remote storage 2020-07-21 21:55:32 +03:00
Aliaksandr Valialkin
d3442b40b2 lib/uint64set: optimize adding items to the set via Set.AddMulti 2020-07-21 20:56:59 +03:00
Aliaksandr Valialkin
caa2952aa6 app/vmselect: take into account the time spent in wait queue before query execution as time spent on the query 2020-07-21 19:00:09 +03:00
Aliaksandr Valialkin
e00cfc854d app/vmselect/promql: skip the first value in time series passed to increase() if it exceeds by more than 10x the delta between the next value and the first value
This should prvent from inflated `increase()` results for time series that start from big initial values.
Such cases may occur when a label value changes in a metric without counter reset.
2020-07-21 17:24:10 +03:00
Aliaksandr Valialkin
b9c8f6bf34 app/vmselect: log the total available memory for concurrent requests on not enough memory errors
This should simplify root cause analysis
2020-07-20 19:51:40 +03:00
Aliaksandr Valialkin
ad6290953c app/vmagent: add -remoteWrite.proxyURL command-line option
This option allows writing data to `-remoteWrite.url` via http, https or socks5 proxy.
This is similar to `proxy_url` option in `remote_write` section of Prometheus.
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write
2020-07-20 19:28:49 +03:00
Aliaksandr Valialkin
efcbb51968 docs/vmagent.md: sync with app/vmagent/README.md 2020-07-20 17:08:34 +03:00
Roman Khavronenko
ed0df37ee7 app/vmagent: mention grafana dashboard in README (#639) 2020-07-20 17:07:27 +03:00
Aliaksandr Valialkin
004d2924e2 vendor: update github.com/VictoriaMetrics/metrics from v1.11.3 to v1.12.0 2020-07-20 16:56:22 +03:00
Aliaksandr Valialkin
11be704109 app/vmagent/remotewrite: allow passing empty -remoteWrite.urlRelabelConfig entries 2020-07-20 15:49:27 +03:00
Aliaksandr Valialkin
5a4675c528 app/vmselect/prometheus: do not return time series with empty list of datapoints from /api/v1/query_range
This matches Prometheus behaviour.

This should fix https://github.com/jacksontj/promxy/issues/329
2020-07-20 15:31:21 +03:00
Aliaksandr Valialkin
ecb1b2564a app/vmselect/promql: add mode() aggregate function 2020-07-20 15:31:20 +03:00
Aliaksandr Valialkin
b35cb293f5 lib/httpserver: log remote address in error message from httpserver.Errorf
This should improve detection of the root cause of errors.
Thanks to Anant for the idea.
2020-07-20 14:11:22 +03:00
Aliaksandr Valialkin
1c641037e8 app/vmselect/promql: add mode_over_time(m[d]) function
See https://en.wikipedia.org/wiki/Mode_(statistics) and https://stackoverflow.com/questions/61134078/promql-query-to-return-the-value-from-a-range-vector-which-occurs-maximum-no-of
2020-07-17 18:28:45 +03:00
Aliaksandr Valialkin
6b5ad535ae app/vmselect/promql: optimize group(rollup(m)) calculations 2020-07-17 16:47:16 +03:00
Aliaksandr Valialkin
8949d65ad1 app/vmselect/promql: check that any() doesn't touch metric name 2020-07-17 16:23:21 +03:00
Aliaksandr Valialkin
3198fd31fa deployment/docker: update Go builder from v1.14.5 to v1.14.6
This fixes runtime issues found in Go since v1.14.5. See https://github.com/golang/go/issues?q=milestone%3AGo1.14.6+label%3ACherryPickApproved
2020-07-17 15:21:38 +03:00
Aliaksandr Valialkin
aa5d88055d app/vmselect/promql: add group() aggregate function to MetricsQL
This function has been added in Prometheus 2.20. See https://github.com/prometheus/prometheus/pull/7480
2020-07-17 15:17:55 +03:00
Aliaksandr Valialkin
df01836818 app/vmselect/promql: keep all labels for time series from any() call 2020-07-17 15:17:54 +03:00
Roman Khavronenko
dfa156e6aa vmagent: update grafana dashboard (#634)
* reference datasource variable instead of datasource name;
* change unit from `bytes` to `bits/s` for Network panel.
2020-07-17 02:11:20 +03:00
Aliaksandr Valialkin
8c14ca93fa app/vminsert/influx: properly handle the case when certain labels with empty values are removed by ApplyRelabeling() call
Previously this could lead to `out of range` panic
2020-07-17 00:07:06 +03:00
Aliaksandr Valialkin
e4e1cd1de2 app/vmselect: fix nil pointer dereference panic when unsuccessfully querying vmstorage 2020-07-16 19:15:43 +03:00
Aliaksandr Valialkin
ef6ee72108 deployment/docker: update Go builder from v1.14.4 to v1.14.5
This should fix the following issues in Go - https://github.com/golang/go/issues?q=milestone%3AGo1.14.5+label%3ACherryPickApproved
2020-07-16 18:55:09 +03:00
Aliaksandr Valialkin
ed7580ad22 app/vmalert: consistently use "%w" instead of "%s" in fmt.Errorf when wrapping errors 2020-07-15 13:56:47 +03:00
Roman Khavronenko
9eb71dda3d vmagent: add grafana dashboard (#629)
`vmagent` Grafana dashboard suppose to provide basic observability over multiple
`vmagent` instances. Dashboard is saved in Grafana export format so it can be easily
imported. It was also integrated into docker-compose environment.
2020-07-15 13:56:06 +03:00
Aliaksandr Valialkin
328814ee60 docs/vmagent.md: make filtering rules for init container pods less confusing 2020-07-14 20:32:47 +03:00
Aliaksandr Valialkin
7398e5701b vendor: make vendor-update 2020-07-14 20:31:42 +03:00
Aliaksandr Valialkin
4e770e9120 docs/Single-server-VictoriaMetrics.md: remove Roadmap chapter, since it became outdated 2020-07-14 19:06:33 +03:00
Aliaksandr Valialkin
b442a42d8e app/vmagent/remotewrite: return proper value from tssRelabelPool.New
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
2020-07-14 14:29:20 +03:00
Aliaksandr Valialkin
6d77bfae4f docs/Single-server-VictoriaMetrics.md: sync with README.md 2020-07-14 14:19:14 +03:00
Aliaksandr Valialkin
4081e2295e app/{vminsert,vmagent}: add -influxSkipMeasurement command-line flag for using field name as metric name
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/626
2020-07-14 14:17:24 +03:00
Aliaksandr Valialkin
e1107fec10 lib/storage: reset MetricName->TSID cache after marking metricIDs as deleted
This is a follow-up commit after 12b16077c4 ,
which didn't reset the `tsidCache` in all the required places.
This could result in indefinite errors like:

    missing metricName by metricID ...; this could be the case after unclean shutdown; deleting the metricID, so it could be re-created next time

Fix this by resetting the cache inside deleteMetricIDs function.
2020-07-14 14:06:32 +03:00
Aliaksandr Valialkin
25f80d320b app/vmselect/prometheus: do not adjust last points in time series with timestamps exceeding the current time
Such timestamps usually mean that the query contains `offset`.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/625
2020-07-14 12:52:16 +03:00
Aliaksandr Valialkin
cde18d1f43 lib/protoparser: properly update vm_protoparser_rows_read_total{type="promscrape"} metric 2020-07-14 12:16:35 +03:00
Seva Poliakov
457e61900d add vm_protoparser_rows_read_total metrics to promscrape (#624)
* add vm_protoparser_rows_read_total metrics to promscrape

move vm_protoparser_rows_read_total for promscrape to better place

move vm_protoparser_rows_read_total for promscrape to better place

* remove possibility of infinity loop at prometheus parser
2020-07-14 12:16:34 +03:00
Roman Khavronenko
7e347972c4 lib/flagutil: specify additional description for all Array type flags (#620)
Array type flag is now defined as `value` type in flag description when printed.
This change adds additional description to every Array type flag so it would be
clear what exact type is used:
```
  -remoteWrite.urlRelabelConfig array
        Optional path to relabel config for the corresponding -remoteWrite.url
        Supports array of values separated by comma or specified via multiple flags.
```
2020-07-13 21:56:37 +03:00
Roman Khavronenko
19dd121968 lib/persistentqueue: add vm_persistentqueue_bytes_pending metric (#619)
Metric `vm_persistentqueue_bytes_pending` is a gauge that shows current amount
of bytes in persistentqueue flushed on disk as a difference between write and read
offsets. This metric is very similar to `vmagent_remotewrite_pending_data_bytes`
except of accounting for bytes in-memory.
2020-07-13 21:54:09 +03:00
Roman Khavronenko
829ec4f9cf Extend metric vm_promscrape_targets with status label (#615)
The change to `vm_promscrape_targets` metric suppose to improve observability
for `vmagent` so it will be possible to track how many targets are up or down
for every specific scrape group:
```
vm_promscrape_targets{type="static_configs", status="down"} 1
vm_promscrape_targets{type="static_configs", status="up"} 2
```
2020-07-13 21:52:03 +03:00
Aliaksandr Valialkin
55d83e777d app/vmselect/prometheus: minimize the diff for the change 1033dc7e2a over 619b0a25c9 2020-07-13 21:40:38 +03:00
faceair
1033dc7e2a fix empty response template (#617) 2020-07-13 21:31:19 +03:00
Aliaksandr Valialkin
619b0a25c9 docs/vmagent.md: sync with app/vmagent/README.md 2020-07-13 21:25:11 +03:00
ofen
666c795b98 Update README.md (#621)
Troubleshooting section updated to help out with duplicate targets detection
2020-07-13 21:18:54 +03:00
Aliaksandr Valialkin
a730b3f6a1 app/vmagent: fix data race when multiple -remoteWrite.urlRelabelConfig options are set
Previously multiple goroutines could access remoteWriteCtx.tss concurrently, which could lead to data race
and improper relabeling. Now each goroutine has its own copy of tss during relabeling.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
2020-07-10 15:16:59 +03:00
Aliaksandr Valialkin
508ad46e0e app/vmagent/remotewrite: typo fix in -remoteWrite.showURL help message 2020-07-10 14:07:08 +03:00
Aliaksandr Valialkin
e5b9f47623 vendor: update github.com/valyala/quicktemplate from v1.5.0 to v1.5.1
This should fix incorrect encoding for json strings with char codes below 0x20

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/613
2020-07-10 12:59:15 +03:00
Aliaksandr Valialkin
ca74b80f10 docs/Cluster-VictoriaMetrics.md: sync with the original README.md 2020-07-10 12:15:31 +03:00
Aliaksandr Valialkin
cba820e390 app/{vminsert,vmagent}: add ability to import data in Prometheus exposition format via /api/v1/import/prometheus 2020-07-10 12:14:07 +03:00
Aliaksandr Valialkin
6fe3c48a6e properly calculate readCalls 2020-07-10 12:00:58 +03:00
Aliaksandr Valialkin
9c350bc20d app/vmselect/promql: add missing tests for ifnot binary operation 2020-07-09 13:24:06 +03:00
Aliaksandr Valialkin
256fd9a87e app/vmselect/promql: refactor implementations for and and unless binary operations, so they are closer to or implementation 2020-07-09 13:05:55 +03:00
Aliaksandr Valialkin
2d9b3ad5b3 app/vmselect/promql/active_queries.go: simplify code a bit by inlining getNextActiveQueryID function 2020-07-09 11:18:30 +03:00
Aliaksandr Valialkin
b66c7c13ac docs: add a link to the The CMS monitoring infrastructure and applications publication from CERN 2020-07-08 20:16:43 +03:00
Aliaksandr Valialkin
3e1d7d8489 lib/promscrape: send Accept header similar to Prometheus when scraping targets
This should fix scraping Spring Boot servers, which return incorrect response
unless `Accept: text/plain` request header is set.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/608
2020-07-08 19:48:22 +03:00
Aliaksandr Valialkin
47c7ea5c60 vendor: make vendor-update 2020-07-08 19:25:38 +03:00
Aliaksandr Valialkin
4f737d1cbd docs/Cluster-VictoriaMetrics.md: mention about api/v1/status/active_queries page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/528
2020-07-08 19:18:26 +03:00
Aliaksandr Valialkin
742da690f4 app/vmselect: add /api/v1/status/active_queries page with the list of currently running queries
This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/598

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/575
2020-07-08 18:55:38 +03:00
DexterZhang
99f54e44ff feat(vmselect): add current running query list, add ability for getting the running query info and killing running query for master branch (#598) 2020-07-08 18:52:55 +03:00
Aliaksandr Valialkin
cb92113632 lib/storage: limit the maximum concurrency for data ingestion to GOMAXPROCS
Previously the concurrency has been limited to GOMAXPROCS*2. This had little sense,
since every call to Storage.AddRows is bound to CPU, so the maximum ingestion bandwidth
is achieved when the number of concurrent calls to Storage.AddRows is limited to the number of CPUs,
i.e. to GOMAXPROCS.
2020-07-08 17:32:18 +03:00
Roman Khavronenko
e7557e0252 lib/protoparser: fix metric name of unmarshal errors in promremotewrite (#607)
The change fixes the typo in metric name `vm_protoparser_unmarshal_errors` to
respect the naming standard.
2020-07-08 14:18:41 +03:00
Aliaksandr Valialkin
e59b9916aa lib/protoparser/graphite: go fmt 2020-07-08 14:12:10 +03:00
Aliaksandr Valialkin
d0b694c5c8 lib/protoparser/graphite: add more tests after eb45185eef
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/610
2020-07-08 14:10:35 +03:00
Seva Poliakov
eb45185eef Fix graphite minus one timestamp (#609)
* fix graphite -1 timestamp

* format the graphite fix -1 timestamp
2020-07-08 13:59:19 +03:00
Aliaksandr Valialkin
32b9fb58b8 lib/storage: clarify out of retention period error message by mentioning -retentionPeriod command-line flag 2020-07-08 13:54:26 +03:00
Aliaksandr Valialkin
12b16077c4 lib/storage: reset MetricName->TSID cache after deleting time series
This should prevent from adding new data points to deleted time series
without the need to check for the deleted time series.

This improves ingestion performance a bit when the `deleted time series ids` aka `dmis` set
contains big number of time series.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/596

Based on the idea from @n4mine at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/604
2020-07-06 22:01:08 +03:00
Aliaksandr Valialkin
a23806f486 lib/fs: clarify description for -fs.disableMmap command-line flag 2020-07-06 14:28:34 +03:00
Aliaksandr Valialkin
6daa5f7500 lib/storage: prioritize data ingestion over heavy queries
Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying queries' execution until free resources are available for data ingestion.

Expose `vm_search_delays_total` metric, which may be used in for alerting when there is no enough CPU resources
for data ingestion and/or for executing heavy queries.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2020-07-05 19:42:05 +03:00
Roman Khavronenko
703def4b2e app/vmalert: add retries to remotewrite (#605)
* app/vmalert: add retries to remotewrite

Remotewrite pkg now does limited number of retries if write request failed.
This suppose to make vmalert state persisting more reliable.

New metrics were added to remotewrite in order to track rows/bytes sent/dropped.

defaultFlushInterval was increased from 1s to 5s for sanity reasons.

* fix

* wip

* wip

* wip

* fix bits alignment bug for 32-bit systems

* fix mistakenly dropped field
2020-07-05 18:46:52 +03:00
Aliaksandr Valialkin
de137aef98 app/victoria-metrics: fix tests after the commit acf828a759 2020-07-05 18:24:41 +03:00
Aliaksandr Valialkin
acf828a759 app/vmselect/prometheus: small fixes on top of 8bb762124a 2020-07-05 18:17:06 +03:00
faceair
8bb762124a fix adjust last points avoid influence earlier value (#606) 2020-07-05 17:56:54 +03:00
Aliaksandr Valialkin
ff6a0955eb lib/promscrape: use HostClient.DoDeadline instead of HostClient.Do in order to guarantee strict deadline across multiple scrape attempts 2020-07-03 21:33:22 +03:00
Aliaksandr Valialkin
8b133e40d5 lib/promscrape: prevent from too big deadline misses on scrape retries
The maximum deadline miss duration is reduced to 2x scrape_interval in the worst case.
By default it is limited to scrape_interval configured for the given scrape target.
2020-07-03 20:41:36 +03:00
Aliaksandr Valialkin
44a54b8b3d lib/promscrape: check for nil error before checking for the returned status code when scraping targets 2020-07-03 18:37:14 +03:00
Ween
d59cdbe90c [VMAlert] Fix error log when remoteWrite queue size is full (#602)
* Fix Auto metrics relabeled errors

* Finalize auto-genenated  Labels

* Fix Test Errors

* fix error logs when queue is full

Co-authored-by: xinyulong <xinyulong@kuaishou.com>
2020-07-03 16:49:37 +03:00
Aliaksandr Valialkin
0b2086b7a5 app/vminsert: prevent from adding and/or selecting labels with empty values
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
2020-07-02 23:14:11 +03:00
Aliaksandr Valialkin
8f628cd805 app/victoria-metrics: removed debug log message when -selfScrapeInterval is set 2020-07-02 20:39:41 +03:00
Aliaksandr Valialkin
91b3482894 app/vminsert: add ability to apply relabeling to all the incoming metrics if -relabelConfig command-line arg points to a file with a list of relabel_config entries
See https://victoriametrics.github.io/#relabeling
2020-07-02 20:39:28 +03:00
Aliaksandr Valialkin
e5500bfcf2 all: typo fix: exptected -> expected 2020-07-02 18:05:52 +03:00
Aliaksandr Valialkin
5d3db3ff7c app/vmselect: add interpolate function for filling gaps with linearly interpolated values
See https://stackoverflow.com/q/62565021/274937 for details
2020-07-02 14:54:21 +03:00
Aliaksandr Valialkin
4dd3de9286 lib/promscrape: add ability to set disable_compression and disable_keepalive options in scrape_config section of the config passed to -promscrape.config
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-02 14:19:14 +03:00
Aliaksandr Valialkin
8da3f773ae lib/promscrape: add -promscrape.disableKeepAlive command-line flag for disabling http keep-alive connections when scraping targets
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-01 02:20:20 +03:00
BigFish
9d5f5b6878 fix: spelling mistakes (#594)
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-07-01 01:35:26 +03:00
Aliaksandr Valialkin
9a2ba5b6d1 vendor: make vendor-update 2020-07-01 01:04:58 +03:00
Aliaksandr Valialkin
b277ba8121 lib/httpserver: add Unwrap method to ErrorWithStatusCode, so As and Is functions in standard errors package may properly unwrap the error inside ErrorWithStatusCode 2020-07-01 00:54:01 +03:00
Aliaksandr Valialkin
84a37098ed app/vmstorage: add -denyQueriesOutsideRetention command-line flag for denying queries outside the configured retention
VictoriaMetrics returns `503 Service Unavailable` http error for requests with time ranges outside the configured retention
if `-denyQueriesOutsideRetention` command-line flag is set.
2020-07-01 00:21:44 +03:00
Aliaksandr Valialkin
56ccfa5218 all: use errors.As instead of type assertion for detecting net.Error 2020-07-01 00:15:34 +03:00
Aliaksandr Valialkin
7c2c8b2981 all: use errors.As for inspecting errors that implement httpserver.ErrorWithStatusCode 2020-07-01 00:04:34 +03:00
Aliaksandr Valialkin
d5dddb0953 all: use %w instead of %s for wrapping errors in fmt.Errorf
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode .
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:05:11 +03:00
Aliaksandr Valialkin
586c5be404 lib/promscrape: add missing label sorting for autogenerated metrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/592
2020-06-29 22:36:12 +03:00
Ween
1cd01b5359 Fix Auto metrics relabeled errors (#593)
* Fix Auto metrics relabeled errors

* Finalize auto-genenated  Labels

* Fix Test Errors

Co-authored-by: xinyulong <xinyulong@kuaishou.com>
2020-06-29 22:29:29 +03:00
Roman Khavronenko
88538df267 app/vmalert: support multiple notifier urls (#584) (#590)
* app/vmalert: support multiple notifier urls (#584)

User now can set multiple notifier URLs in the same fashion
as for other vmutils (e.g. vmagent). The same is correct for
TLS setting for every configured URL. Alerts sending is done
in sequential way for respecting the specified URLs order.

* app/vmalert: add basicAuth support for notifier client (#585)

The change adds possibility to set basicAuth creds for notifier
client in the same fasion as for remote write/read and datasource.
2020-06-29 22:21:03 +03:00
Aliaksandr Valialkin
63e5ee0d29 docs: sync with upstream 2020-06-29 22:09:03 +03:00
Roman Khavronenko
eba4e92994 deployment/docker: replace Prometheus with vmagent (#589)
vmagent replaces Prometheus to perform scrapes and writes
into VictoriaMetrics installation. Prometheus datasource was
dropped, but its config was reused to feed vmagent.

Change also contains simplification in dashboard propagation
to Grafana container by removing excessive json manipulation
steps.
2020-06-29 22:05:34 +03:00
Roman Khavronenko
82ecfa3b32 app/vmalert: move flags description and initialization into subpackages
The change adds no new functionality and aims to move flags definitions
to subpackages that are using them. This should improve readability
of the main function.
2020-06-28 12:26:22 +01:00
kreedom
dc4e3f0e0b app/vmalert: properly set transport for HTTP clients
Fixes issue #586
2020-06-27 08:31:54 +01:00
Aliaksandr Valialkin
8f2e88234f docs: update the info that docker images are built on top of alpine image now
A follow-up after the commit ff624c9125
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/522
2020-06-26 13:54:10 +03:00
Aliaksandr Valialkin
423825695f vendor: make vendor-update 2020-06-25 23:45:14 +03:00
Aliaksandr Valialkin
5dc0bf6d3d vendor: update github.com/valyala/fastjson from v1.5.1 to v1.5.2 2020-06-25 23:35:03 +03:00
Aliaksandr Valialkin
7eb171182b lib/promrelabel: properly apply ^ and $ anchors to regex value in Prometheus relabeling rules 2020-06-25 17:19:19 +03:00
Aliaksandr Valialkin
05d754d7bb app/vmselect/netstorage: reset big result values every 10 seconds instead of after processing every time series
This should reduce GC pressure when processing time series with big number of rows
2020-06-24 19:38:39 +03:00
Aliaksandr Valialkin
8dec17470d deployment/docker/docker-compose.yml: update Prometheus from v1.18.1 to v1.19.1 and Grafana from v7.0.2 to v7.0.3 2020-06-24 18:09:33 +03:00
Aliaksandr Valialkin
5e35b87c3d docs/Cluster-VictoriaMetrics.md: move VictoriaMetrics logo below "Cluster version" heading, since it is heeded for proper navigation at https://victoriametrics.github.io 2020-06-24 12:06:27 +03:00
Aliaksandr Valialkin
c85d926569 docs/SampleSizeCalculations.md: updates 2020-06-24 12:06:25 +03:00
Aliaksandr Valialkin
f0cef4761b docs/SampleSizeCalculations.md: add a doc with calculations for the "Lowest sample size" graph at https://victoriametrics.com/ 2020-06-24 12:00:22 +03:00
nicbaz
774f7ca1c1 vmselect: fix label_replace when mismatch (#579)
As per documentation on `label_replace` function: "If the regular
expression doesn't match then the timeseries is returned unchanged".

Currently this behavior is not enforced, if a regexp on an existing
tag doesn't match then the tag value is copied as-is in the destination
tag. This fix first checks that the regular expression matches the
source tag before applying anything.

Given the current implementation, this fix also changes the behavior
of the **MetricsQL** `label_transform` function which does not
document this behavior at the moment.
2020-06-23 23:50:33 +03:00
Aliaksandr Valialkin
a560b4788e lib/fs: go fmt 2020-06-23 23:02:39 +03:00
Aliaksandr Valialkin
8141541e61 lib/fs: fall back to cgo copy for copying the last 4KB of mmaped data
This probably should fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/581
2020-06-23 22:55:22 +03:00
Aliaksandr Valialkin
e65b4cb6b1 docs/vmalert.md: sync with app/vmalert/README.md 2020-06-23 22:49:38 +03:00
Aliaksandr Valialkin
7209d58fbd app/vmselect/netstorage: increase concurrency when processing small number of time series with big number of data points per each time series
Previously VictoriaMetrics was processing up to 32 time series in a single goroutine.
This could be slow if each time series contains big number of data points (10M+ or more), since only a single CPU core could be loaded with work,
while other CPU cores were idle. Fix this by launching GOMAXPROCS workers for time series processing.

This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/572
2020-06-23 22:46:15 +03:00
nicbaz
72c90bfd8b vmalert: add support for TLS configuration (#578)
app/vmalert: add support for TLS configuration

Add support for TLS optional configuration in a similar fashion to what
is currently supported in other vmutils such as vmagent. TLS
configuration options are distinct for datasource, remoteRead,
remoteWrite as well as notifier.
2020-06-23 20:45:45 +01:00
Aliaksandr Valialkin
2a39ba639d lib/promrelabel: add support for keep_if_equal and drop_if_equal actions to relabel configs
These actions may be useful for filtering out unneeded targets and/or metrics if they contain equal label values.
For example, the following rule would leave the target only if __meta_kubernetes_annotation_prometheus_io_port
equals __meta_kubernetes_pod_container_port_number:

  - action: keep_if_equal
    source_labels: [__meta_kubernetes_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]
2020-06-23 17:29:03 +03:00
Aliaksandr Valialkin
8f0bcec6cc lib/promscrape: preserve the previously discovered targets on discovery errors per each job_name
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/582
2020-06-23 15:40:40 +03:00
Aliaksandr Valialkin
a13cd60c6f vendor: update github.com/klauspost/compress from v1.10.9 to v1.10.10 2020-06-23 13:48:51 +03:00
Aliaksandr Valialkin
c970cb912c lib/fs: an attempt to fix SIGBUS error by rounding mmap`ed region to multiple of 4KB pages
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/581
2020-06-23 13:39:49 +03:00
Aliaksandr Valialkin
b5206ce33f lib/logger: add -loggerErrorsPerSecondLimit for limiting the rate of ERROR messages 2020-06-23 12:41:36 +03:00
Aliaksandr Valialkin
4c7f216dfe lib/promscrape: retry performing the request to the server for up to 3 times before giving up when it closes keep-alive connections
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-06-23 12:33:54 +03:00
Aliaksandr Valialkin
530f7a21e8 docs/Single-server-VictoriaMetrics.md: remove -httpListenAddr command-line flag from setting up VictoriaMetrics chapter
This flag is optional and it has good default value - `:8428`, so there is no need in mentioning it at this chapter
2020-06-22 12:45:20 +03:00
Aliaksandr Valialkin
7532dbcdf5 app/vmselect/promql: properly override label values from group_left and group_right lists like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/577
2020-06-21 16:33:01 +03:00
kreedom
7ec6711f06 Support of custom URL path for alert (#560)
app/vmalert: Support custom URL for alerts source

Add flag `external.alert.source` for configuring custom URL
for alert's source. This may be handy to re-point default source
URL to other systems like Grafana.
Updates #517
2020-06-21 11:32:46 +01:00
963 changed files with 63504 additions and 62383 deletions

View File

@@ -14,9 +14,9 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Setup Go
uses: actions/setup-go@master
uses: actions/setup-go@main
with:
go-version: 1.14
go-version: 1.15
id: go
- name: Dependencies
env:
@@ -43,7 +43,20 @@ jobs:
make victoria-metrics-arm64
make vmutils
GOOS=freebsd go build -mod=vendor ./app/victoria-metrics
GOOS=freebsd go build -mod=vendor ./app/vmagent
GOOS=freebsd go build -mod=vendor ./app/vmalert
GOOS=freebsd go build -mod=vendor ./app/vmbackup
GOOS=freebsd go build -mod=vendor ./app/vmrestore
GOOS=openbsd go build -mod=vendor ./app/victoria-metrics
GOOS=openbsd go build -mod=vendor ./app/vmagent
GOOS=openbsd go build -mod=vendor ./app/vmalert
GOOS=openbsd go build -mod=vendor ./app/vmbackup
GOOS=openbsd go build -mod=vendor ./app/vmrestore
GOOS=darwin go build -mod=vendor ./app/victoria-metrics
GOOS=darwin go build -mod=vendor ./app/vmagent
GOOS=darwin go build -mod=vendor ./app/vmalert
GOOS=darwin go build -mod=vendor ./app/vmbackup
GOOS=darwin go build -mod=vendor ./app/vmrestore
- name: Publish coverage
uses: codecov/codecov-action@v1.0.6
with:

149
CHANGELOG.md Normal file
View File

@@ -0,0 +1,149 @@
# tip
# [v1.45.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.45.0)
* FEATURE: allow setting `-retentionPeriod` smaller than one month. I.e. `-retentionPeriod=3d`, `-retentionPeriod=2w`, etc. is supported now.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173
* FEATURE: optimize more cases according to https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization . Now the following cases are optimized too:
* `rollup_func(foo{filters}[d]) op bar` -> `rollup_func(foo{filters}[d]) op bar{filters}`
* `transform_func(foo{filters}) op bar` -> `transform_func(foo{filters}) op bar{filters}`
* `num_or_scalar op foo{filters} op bar` -> `num_or_scalar op foo{filters} op bar{filters}`
* FEATURE: improve time series search for queries with multiple label filters. I.e. `foo{label1="value", label2=~"regexp"}`.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781
* FEATURE: vmagent: add `stream parse` mode. This mode allows reducing memory usage when individual scrape targets expose tens of millions of metrics.
For example, during scraping Prometheus in [federation](https://prometheus.io/docs/prometheus/latest/federation/) mode.
See `-promscrape.streamParse` command-line option and `stream_parse: true` config option for `scrape_config` section in `-promscrape.config`.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825 and [troubleshooting docs for vmagent](https://victoriametrics.github.io/vmagent.html#troubleshooting).
* FEATURE: vmalert: add `-dryRun` command-line option for validating the provided config files without the need to start `vmalert` service.
* FEATURE: accept optional third argument of string type at `topk_*` and `bottomk_*` functions. This is label name for additional time series to return with the sum of time series outside top/bottom K. See [MetricsQL docs](https://victoriametrics.github.io/MetricsQL.html) for more details.
* FEATURE: vmagent: expose `/api/v1/targets` page according to [the corresponding Prometheus API](https://prometheus.io/docs/prometheus/latest/querying/api/#targets).
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/643
* BUGFIX: vmagent: properly handle OpenStack endpoint ending with `v3.0` such as `https://ostack.example.com:5000/v3.0`
in the same way as Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728#issuecomment-709914803
* BUGFIX: drop trailing data points for time series with a single raw sample. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
* BUGFIX: do not drop trailing data points for instant queries to `/api/v1/query`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
* BUGFIX: vmbackup: fix panic when `-origin` isn't specified. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/856
* BUGFIX: vmalert: skip automatically added labels on alerts restore. Label `alertgroup` was introduced in [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/611)
and automatically added to generated time series. By mistake, this new label wasn't correctly purged on restore event and affected alert's ID uniqueness.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/870
* BUGFIX: vmagent: fix panic at scrape error body formating. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/864
* BUGFIX: vmagent: add leading missing slash to metrics path like Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/835
* BUGFIX: vmagent: drop packet if remote storage returns 4xx status code. This make the behaviour consistent with Prometheus.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/873
* BUGFIX: vmagent: properly handle 301 redirects. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/869
# [v1.44.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.44.0)
* FEATURE: automatically add missing label filters to binary operands as described at https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization .
This should improve performance for queries with missing label filters in binary operands. For example, the following query should work faster now, because it shouldn't
fetch and discard time series for `node_filesystem_files_free` metric without matching labels for the left side of the expression:
```
node_filesystem_files{ host="$host", mountpoint="/" } - node_filesystem_files_free
```
* FEATURE: vmagent: add Docker Swarm service discovery (aka [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config)).
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/656
* FEATURE: add ability to export data in CSV format. See [these docs](https://victoriametrics.github.io/#how-to-export-csv-data) for details.
* FEATURE: vmagent: add `-promscrape.suppressDuplicateScrapeTargetErrors` command-line flag for suppressing `duplicate scrape target` errors.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651 and https://victoriametrics.github.io/vmagent.html#troubleshooting .
* FEATURE: vmagent: show original labels before relabeling is applied on `duplicate scrape target` errors. This should simplify debugging for incorrect relabeling.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651
* FEATURE: vmagent: `/targets` page now accepts optional `show_original_labels=1` query arg for displaying original labels for each target before relabeling is applied.
This should simplify debugging for target relabeling configs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651
* FEATURE: add `-finalMergeDelay` command-line flag for configuring the delay before final merge for per-month partitions.
The final merge is started after no new data is ingested into per-month partition during `-finalMergeDelay`.
* FEATURE: add `vm_rows_added_to_storage_total` metric, which shows the total number of rows added to storage since app start.
The `sum(rate(vm_rows_added_to_storage_total))` can be smaller than `sum(rate(vm_rows_inserted_total))` if certain metrics are dropped
due to [relabeling](https://victoriametrics.github.io/#relabeling). The `sum(rate(vm_rows_added_to_storage_total))` can be bigger
than `sum(rate(vm_rows_inserted_total))` if [replication](https://victoriametrics.github.io/Cluster-VictoriaMetrics.html#replication-and-data-safety) is enabled.
* FEATURE: keep metric name after applying [MetricsQL](https://victoriametrics.github.io/MetricsQL.html) functions, which don't change time series meaning.
The list of such functions:
* `keep_last_value`
* `keep_next_value`
* `interpolate`
* `running_min`
* `running_max`
* `running_avg`
* `range_min`
* `range_max`
* `range_avg`
* `range_first`
* `range_last`
* `range_quantile`
* `smooth_exponential`
* `ceil`
* `floor`
* `round`
* `clamp_min`
* `clamp_max`
* `max_over_time`
* `min_over_time`
* `avg_over_time`
* `quantile_over_time`
* `mode_over_time`
* `geomean_over_time`
* `holt_winters`
* `predict_linear`
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
* BUGFIX: properly handle stale time series after K8S deployment. Previously such time series could be double-counted.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
* BUGFIX: return a single time series at max from `absent()` function like Prometheus does.
* BUGFIX: vmalert: accept days, weeks and years in `for: ` part of config like Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817
* BUGFIX: fix `mode_over_time(m[d])` calculations. Previously the function could return incorrect results.
# [v1.43.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.43.0)
* FEATURE: reduce CPU usage for repeated queries over sliding time window when no new time series are added to the database.
Typical use cases: repeated evaluation of alerting rules in [vmalert](https://victoriametrics.github.io/vmalert.html) or dashboard auto-refresh in Grafana.
* FEATURE: vmagent: add OpenStack service discovery aka [openstack_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config).
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728 .
* FEATURE: vmalert: make `-maxIdleConnections` configurable for datasource HTTP client. This option can be used for minimizing connection churn.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/795 .
* FEATURE: add `-influx.maxLineSize` command-line flag for configuring the maximum size for a single Influx line during parsing.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/807
* BUGFIX: properly handle `inf` values during [background merge of LSM parts](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
Previously `Inf` values could result in `NaN` values for adjancent samples in time series. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/805 .
* BUGFIX: fill gaps on graphs for `range_*` and `running_*` functions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/806 .
* BUGFIX: make a copy of label with new name during relabeling with `action: labelmap` in the same way as Prometheus does.
Previously the original label name has been replaced. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/812 .
* BUGFIX: support parsing floating-point timestamp like Graphite Carbon does. Such timestmaps are truncated to seconds.
# [v1.42.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.42.0)
* FEATURE: use all the available CPU cores when accepting data via a single TCP connection
for [all the supported protocols](https://victoriametrics.github.io/#how-to-import-time-series-data).
Previously data ingested via a single TCP connection could use only a single CPU core. This could limit data ingestion performance.
The main benefit of this feature is that data can be imported at max speed via a single connection - there is no need to open multiple concurrent
connections to VictoriaMetrics or [vmagent](https://victoriametrics.github.io/vmagent.html) in order to achieve the maximum data ingestion speed.
* FEATURE: cluster: improve performance for data ingestion path from `vminsert` to `vmstorage` nodes. The maximum data ingestion performance
for a single connection between `vminsert` and `vmstorage` node scales with the number of available CPU cores on `vmstorage` side.
This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/791 .
* FEATURE: add ability to export / import data in native format via `/api/v1/export/native` and `/api/v1/import/native`.
This is the most optimized approach for data migration between VictoriaMetrics instances. Both single-node and cluster instances are supported.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/787#issuecomment-700632551 .
* FEATURE: add `reduce_mem_usage` query option to `/api/v1/export` in order to reduce memory usage during data export / import.
See [these docs](https://victoriametrics.github.io/#how-to-export-data-in-json-line-format) for details.
* FEATURE: improve performance for `/api/v1/series` handler when it returns big number of time series.
* FEATURE: add `vm_merge_need_free_disk_space` metric, which can be used for estimating the number of deferred background data merges due to the lack of free disk space.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686 .
* FEATURE: add OpenBSD support. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/785 .
* BUGFIX: properly apply `-search.maxStalenessInterval` command-line flag value. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/784 .
* BUGFIX: fix displaying data in Grafana tables. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/720 .
* BUGFIX: do not adjust the number of detected CPU cores found at `/sys/devices/system/cpu/online`.
The adjustement was increasing the resulting GOMAXPROC by 1, which looked confusing to users.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685#issuecomment-698595309 .
* BUGFIX: vmagent: do not show `-remoteWrite.url` in initial logs if `-remoteWrite.showURL` isn't set. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/773 .
* BUGFIX: properly handle case when [/metrics/find](https://victoriametrics.github.io/#graphite-metrics-api-usage) finds both a leaf and a node for the given `query=prefix.*`.
In this case only the node must be returned with stripped dot in the end of id as carbonapi does.
# Previous releases
See [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases).

View File

@@ -133,6 +133,9 @@ app-local:
app-local-pure:
CGO_ENABLED=0 GO111MODULE=on go build $(RACE) -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-pure$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
app-local-with-goarch:
GO111MODULE=on go build $(RACE) -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-$(GOARCH)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
quicktemplate-gen: install-qtc
qtc

542
README.md
View File

@@ -77,8 +77,11 @@ See [features available for enterprise customers](https://github.com/VictoriaMet
if `-graphiteListenAddr` is set.
* [OpenTSDB put message](#sending-data-via-telnet-put-protocol) if `-opentsdbListenAddr` is set.
* [HTTP OpenTSDB /api/put requests](#sending-opentsdb-data-via-http-apiput-requests) if `-opentsdbHTTPListenAddr` is set.
* [/api/v1/import](#how-to-import-time-series-data).
* [JSON line format](#how-to-import-data-in-json-line-format).
* [Native binary format](#how-to-import-data-in-native-format).
* [Prometheus exposition format](#how-to-import-data-in-prometheus-exposition-format).
* [Arbitrary CSV data](#how-to-import-csv-data).
* Supports metrics' relabeling. See [these docs](#relabeling) for details.
* Ideally works with big amounts of time series data from Kubernetes, IoT sensors, connected cars, industrial telemetry, financial data and various Enterprise workloads.
* Has open source [cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
* See also technical [Articles about VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/Articles).
@@ -99,6 +102,8 @@ See [features available for enterprise customers](https://github.com/VictoriaMet
* [Querying Graphite data](#querying-graphite-data)
* [How to send data from OpenTSDB-compatible agents](#how-to-send-data-from-opentsdb-compatible-agents)
* [Prometheus querying API usage](#prometheus-querying-api-usage)
* [Prometheus querying API enhancements](#prometheus-querying-api-enhancements)
* [Graphite Metrics API usage](#graphite-metrics-api-usage)
* [How to build from sources](#how-to-build-from-sources)
* [Development build](#development-build)
* [Production build](#production-build)
@@ -109,8 +114,17 @@ See [features available for enterprise customers](https://github.com/VictoriaMet
* [Setting up service](#setting-up-service)
* [How to work with snapshots](#how-to-work-with-snapshots)
* [How to delete time series](#how-to-delete-time-series)
* [Forced merge](#forced-merge)
* [How to export time series](#how-to-export-time-series)
* [How to export data in native format](#how-to-export-data-in-native-format)
* [How to export data in JSON line format](#how-to-export-data-in-json-line-format)
* [How to export CSV data](#how-to-export-csv-data)
* [How to import time series data](#how-to-import-time-series-data)
* [How to import data in native format](#how-to-import-data-in-native-format)
* [How to import data in json line format](#how-to-import-data-in-json-line-format)
* [How to import CSV data](#how-to-import-csv-data)
* [How to import data in Prometheus exposition format](#how-to-import-data-in-prometheus-exposition-format)
* [Relabeling](#relabeling)
* [Federation](#federation)
* [Capacity planning](#capacity-planning)
* [High availability](#high-availability)
@@ -126,6 +140,7 @@ See [features available for enterprise customers](https://github.com/VictoriaMet
* [Monitoring](#monitoring)
* [Troubleshooting](#troubleshooting)
* [Backfilling](#backfilling)
* [Data updates](#data-updates)
* [Replication](#replication)
* [Backups](#backups)
* [Profiling](#profiling)
@@ -134,39 +149,40 @@ See [features available for enterprise customers](https://github.com/VictoriaMet
* [Contacts](#contacts)
* [Community and contributions](#community-and-contributions)
* [Reporting bugs](#reporting-bugs)
* [Roadmap](#roadmap)
* [Victoria Metrics Logo](#victoria-metrics-logo)
* [Logo Usage Guidelines](#logo-usage-guidelines)
* [Font used](#font-used)
* [Color Palette](#color-palette)
* [We kindly ask](#we-kindly-ask)
### How to start VictoriaMetrics
Just start VictoriaMetrics [executable](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
Start VictoriaMetrics [executable](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
or [docker image](https://hub.docker.com/r/victoriametrics/victoria-metrics/) with the desired command-line flags.
The following command-line flags are used the most:
* `-storageDataPath` - path to data directory. VictoriaMetrics stores all the data in this directory. Default path is `victoria-metrics-data` in current working directory.
* `-retentionPeriod` - retention period in months for the data. Older data is automatically deleted. Default period is 1 month.
* `-httpListenAddr` - TCP address to listen to for http requests. By default, it listens port `8428` on all the network interfaces.
* `-storageDataPath` - path to data directory. VictoriaMetrics stores all the data in this directory. Default path is `victoria-metrics-data` in the current working directory.
* `-retentionPeriod` - retention for stored data. Older data is automatically deleted. Default retention is 1 month. See [these docs](#retention) for more details.
Other flags have good enough default values, so set them only if you really need this.
Other flags have good enough default values, so set them only if you really need this. Pass `-help` to see all the available flags with description and default values.
Pass `-help` to see all the available flags with description and default values.
See how to [ingest data to VictoriaMetrics](#how-to-import-time-series-data) and how to [query VictoriaMetrics](#grafana-setup).
VictoriaMetrics accepts [Prometheus querying API requests](#prometheus-querying-api-usage) on port `8428` by default.
It is recommended setting up [monitoring](#monitoring) for VictoriaMetrics.
#### Environment variables
Each flag values can be set thru environment variables by following these rules:
Each flag value can be set via environment variables according to these rules:
* The `-envflag.enable` flag must be set
* Each `.` in flag names must be substituted by `_` (for example `-insert.maxQueueDuration <duration>` will translate to `insert_maxQueueDuration=<duration>`)
* For repeating flags, an alternative syntax can be used by joining the different values into one using `,` as separator (for example `-storageNode <nodeA> -storageNode <nodeB>` will translate to `storageNode=<nodeA>,<nodeB>`)
* Each `.` char in flag name must be substituted by `_` (for example `-insert.maxQueueDuration <duration>` will translate to `insert_maxQueueDuration=<duration>`)
* For repeating flags an alternative syntax can be used by joining the different values into one using `,` char as separator (for example `-storageNode <nodeA> -storageNode <nodeB>` will translate to `storageNode=<nodeA>,<nodeB>`)
* It is possible setting prefix for environment vars with `-envflag.prefix`. For instance, if `-envflag.prefix=VM_`, then env vars must be prepended with `VM_`
### Prometheus setup
Prometheus must be configured with [remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
@@ -178,15 +194,15 @@ remote_write:
- url: http://<victoriametrics-addr>:8428/api/v1/write
```
Substitute `<victoriametrics-addr>` with the hostname or IP address of VictoriaMetrics.
Then apply the new config via the following command:
Substitute `<victoriametrics-addr>` with hostname or IP address of VictoriaMetrics.
Then apply new config via the following command:
```bash
kill -HUP `pidof prometheus`
```
Prometheus writes incoming data to local storage and replicates it to remote storage in parallel.
This means the data remains available in local storage for `--storage.tsdb.retention.time` duration
This means that data remains available in local storage for `--storage.tsdb.retention.time` duration
even if remote storage is unavailable.
If you plan to send data to VictoriaMetrics from multiple Prometheus instances, then add the following lines into `global` section
@@ -199,11 +215,10 @@ global:
```
This instructs Prometheus to add `datacenter=dc-123` label to each time series sent to remote storage.
The label name may be arbitrary - `datacenter` is just an example. The label value must be unique
The label name can be arbitrary - `datacenter` is just an example. The label value must be unique
across Prometheus instances, so those time series may be filtered and grouped by this label.
For highly loaded Prometheus instances (400k+ samples per second)
the following tuning may be applied:
For highly loaded Prometheus instances (400k+ samples per second) the following tuning may be applied:
```yaml
remote_write:
@@ -214,14 +229,11 @@ remote_write:
max_shards: 30
```
Using remote write increases memory usage for Prometheus up to ~25%
and depends on the shape of data. If you are experiencing issues with
too high memory consumption try to lower `max_samples_per_send`
and `capacity` params (keep in mind that these two params are tightly connected).
Using remote write increases memory usage for Prometheus up to ~25% and depends on the shape of data. If you are experiencing issues with
too high memory consumption try to lower `max_samples_per_send` and `capacity` params (keep in mind that these two params are tightly connected).
Read more about tuning remote write for Prometheus [here](https://prometheus.io/docs/practices/remote_write).
It is recommended upgrading Prometheus to [v2.12.0](https://github.com/prometheus/prometheus/releases) or newer,
since the previous versions may have issues with `remote_write`.
It is recommended upgrading Prometheus to [v2.12.0](https://github.com/prometheus/prometheus/releases) or newer, since previous versions may have issues with `remote_write`.
Take a look also at [vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md),
which can be used as faster and less resource-hungry alternative to Prometheus in certain cases.
@@ -229,7 +241,7 @@ which can be used as faster and less resource-hungry alternative to Prometheus i
### Grafana setup
Create [Prometheus datasource](http://docs.grafana.org/features/datasources/prometheus/) in Grafana with the following Url:
Create [Prometheus datasource](http://docs.grafana.org/features/datasources/prometheus/) in Grafana with the following url:
```url
http://<victoriametrics-addr>:8428
@@ -237,35 +249,41 @@ http://<victoriametrics-addr>:8428
Substitute `<victoriametrics-addr>` with the hostname or IP address of VictoriaMetrics.
Then build graphs with the created datasource using [Prometheus query language](https://prometheus.io/docs/prometheus/latest/querying/basics/).
VictoriaMetrics supports native PromQL and [extends it with useful features](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/MetricsQL).
Then build graphs with the created datasource using [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/)
or [MetricsQL](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/MetricsQL). VictoriaMetrics supports [Prometheus querying API](#prometheus-querying-api-usage),
which is used by Grafana.
### How to upgrade VictoriaMetrics
It is safe upgrading VictoriaMetrics to new versions unless [release notes](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
say otherwise. It is recommended performing regular upgrades to the latest version,
since it may contain important bug fixes, performance optimizations or new features.
say otherwise. It is safe skipping multiple versions during the upgrade unless [release notes](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) say otherwise.
It is recommended performing regular upgrades to the latest version, since it may contain important bug fixes, performance optimizations or new features.
Follow the following steps during the upgrade:
It is also safe downgrading to the previous version unless [release notes](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) say otherwise.
1) Send `SIGINT` signal to VictoriaMetrics process in order to gracefully stop it.
2) Wait until the process stops. This can take a few seconds.
3) Start the upgraded VictoriaMetrics.
The following steps must be performed during the upgrade / downgrade:
* Send `SIGINT` signal to VictoriaMetrics process in order to gracefully stop it.
* Wait until the process stops. This can take a few seconds.
* Start the upgraded VictoriaMetrics.
Prometheus doesn't drop data during VictoriaMetrics restart.
See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details.
### How to apply new config to VictoriaMetrics
VictoriaMetrics must be restarted for applying new config:
VictoriaMetrics is configured via command-line flags, so it must be restarted when new command-line flags should be applied:
1) Send `SIGINT` signal to VictoriaMetrics process in order to gracefully stop it.
2) Wait until the process stops. This can take a few seconds.
3) Start VictoriaMetrics with the new config.
* Send `SIGINT` signal to VictoriaMetrics process in order to gracefully stop it.
* Wait until the process stops. This can take a few seconds.
* Start VictoriaMetrics with the new command-line flags.
Prometheus doesn't drop data during VictoriaMetrics restart.
See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details.
### How to scrape Prometheus exporters such as [node-exporter](https://github.com/prometheus/node_exporter)
VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping targets configured in `prometheus.yml` config file according to [the specification](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file).
@@ -279,14 +297,21 @@ Currently the following [scrape_config](https://prometheus.io/docs/prometheus/la
* [gce_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config)
* [consul_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config)
* [dns_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config)
* [openstack_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config)
* [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config)
In the future other `*_sd_config` types will be supported.
Other `*_sd_config` types will be supported in the future.
The file pointed by `-promscrape.config` may contain `%{ENV_VAR}` placeholders, which are substituted by the corresponding `ENV_VAR` environment variable values.
VictoriaMetrics also supports [importing data in Prometheus exposition format](#how-to-import-data-in-prometheus-exposition-format).
See also [vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md), which can be used as drop-in replacement for Prometheus.
### How to send data from InfluxDB-compatible agents such as [Telegraf](https://www.influxdata.com/time-series-platform/telegraf/)
Just use `http://<victoriametric-addr>:8428` url instead of InfluxDB url in agents' configs.
Use `http://<victoriametric-addr>:8428` url instead of InfluxDB url in agents' configs.
For instance, put the following lines into `Telegraf` config, so it sends data to VictoriaMetrics instead of InfluxDB:
```toml
@@ -294,8 +319,6 @@ For instance, put the following lines into `Telegraf` config, so it sends data t
urls = ["http://<victoriametrics-addr>:8428"]
```
Do not forget substituting `<victoriametrics-addr>` with the real address where VictoriaMetrics runs.
Another option is to enable TCP and UDP receiver for Influx line protocol via `-influxListenAddr` command-line flag
and stream plain Influx line protocol data to the configured TCP and/or UDP addresses.
@@ -305,7 +328,8 @@ VictoriaMetrics maps Influx data using the following rules:
unless `db` tag exists in the Influx line.
* Field names are mapped to time series names prefixed with `{measurement}{separator}` value,
where `{separator}` equals to `_` by default. It can be changed with `-influxMeasurementFieldSeparator` command-line flag.
See also `-influxSkipSingleField` command-line flag. If `{measurement}` is empty, then time series names correspond to field names.
See also `-influxSkipSingleField` command-line flag.
If `{measurement}` is empty or `-influxSkipMeasurement` command-line flag is set, then time series names correspond to field names.
* Field values are mapped to time series values.
* Tags are mapped to Prometheus labels as-is.
@@ -329,8 +353,8 @@ to local VictoriaMetrics using `curl`:
curl -d 'measurement,tag1=value1,tag2=value2 field1=123,field2=1.23' -X POST 'http://localhost:8428/write'
```
An arbitrary number of lines delimited by '\n' (aka newline char) may be sent in a single request.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
An arbitrary number of lines delimited by '\n' (aka newline char) can be sent in a single request.
After that the data may be read via [/api/v1/export](#how-to-export-data-in-json-line-format) endpoint:
```bash
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"measurement_.*"}'
@@ -346,16 +370,17 @@ The `/api/v1/export` endpoint should return the following response:
Note that Influx line protocol expects [timestamps in *nanoseconds* by default](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/#timestamp),
while VictoriaMetrics stores them with *milliseconds* precision.
### How to send data from Graphite-compatible agents such as [StatsD](https://github.com/etsy/statsd)
1) Enable Graphite receiver in VictoriaMetrics by setting `-graphiteListenAddr` command line flag. For instance,
Enable Graphite receiver in VictoriaMetrics by setting `-graphiteListenAddr` command line flag. For instance,
the following command will enable Graphite receiver in VictoriaMetrics on TCP and UDP port `2003`:
```bash
/path/to/victoria-metrics-prod -graphiteListenAddr=:2003
```
2) Use the configured address in Graphite-compatible agents. For instance, set `graphiteHost`
Use the configured address in Graphite-compatible agents. For instance, set `graphiteHost`
to the VictoriaMetrics host in `StatsD` configs.
Example for writing data with Graphite plaintext protocol to local VictoriaMetrics using `nc`:
@@ -365,8 +390,8 @@ echo "foo.bar.baz;tag1=value1;tag2=value2 123 `date +%s`" | nc -N localhost 2003
```
VictoriaMetrics sets the current time if the timestamp is omitted.
An arbitrary number of lines delimited by `\n` (aka newline char) may be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
An arbitrary number of lines delimited by `\n` (aka newline char) can be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-data-in-json-line-format) endpoint:
```bash
curl -G 'http://localhost:8428/api/v1/export' -d 'match=foo.bar.baz'
@@ -380,25 +405,28 @@ The `/api/v1/export` endpoint should return the following response:
### Querying Graphite data
Data sent to VictoriaMetrics via `Graphite plaintext protocol` may be read either via
[Prometheus querying API](#prometheus-querying-api-usage)
or via [go-graphite/carbonapi](https://github.com/go-graphite/carbonapi/blob/master/cmd/carbonapi/carbonapi.example.prometheus.yaml).
Data sent to VictoriaMetrics via `Graphite plaintext protocol` may be read via the following APIs:
* [Prometheus querying API](#prometheus-querying-api-usage)
* Metric names can be explored via [Graphite metrics API](#graphite-metrics-api-usage)
* [go-graphite/carbonapi](https://github.com/go-graphite/carbonapi/blob/master/cmd/carbonapi/carbonapi.example.prometheus.yaml)
### How to send data from OpenTSDB-compatible agents
VictoriaMetrics supports [telnet put protocol](http://opentsdb.net/docs/build/html/api_telnet/put.html)
and [HTTP /api/put requests](http://opentsdb.net/docs/build/html/api_http/put.html) for ingesting OpenTSDB data.
The same protocol is used for [ingesting data in KairosDB](https://kairosdb.github.io/docs/build/html/PushingData.html).
#### Sending data via `telnet put` protocol
1) Enable OpenTSDB receiver in VictoriaMetrics by setting `-opentsdbListenAddr` command line flag. For instance,
Enable OpenTSDB receiver in VictoriaMetrics by setting `-opentsdbListenAddr` command line flag. For instance,
the following command enables OpenTSDB receiver in VictoriaMetrics on TCP and UDP port `4242`:
```bash
/path/to/victoria-metrics-prod -opentsdbListenAddr=:4242
```
2) Send data to the given address from OpenTSDB-compatible agents.
Send data to the given address from OpenTSDB-compatible agents.
Example for writing data with OpenTSDB protocol to local VictoriaMetrics using `nc`:
@@ -406,8 +434,8 @@ Example for writing data with OpenTSDB protocol to local VictoriaMetrics using `
echo "put foo.bar.baz `date +%s` 123 tag1=value1 tag2=value2" | nc -N localhost 4242
```
An arbitrary number of lines delimited by `\n` (aka newline char) may be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
An arbitrary number of lines delimited by `\n` (aka newline char) can be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-data-in-json-line-format) endpoint:
```bash
curl -G 'http://localhost:8428/api/v1/export' -d 'match=foo.bar.baz'
@@ -421,14 +449,14 @@ The `/api/v1/export` endpoint should return the following response:
#### Sending OpenTSDB data via HTTP `/api/put` requests
1) Enable HTTP server for OpenTSDB `/api/put` requests by setting `-opentsdbHTTPListenAddr` command line flag. For instance,
Enable HTTP server for OpenTSDB `/api/put` requests by setting `-opentsdbHTTPListenAddr` command line flag. For instance,
the following command enables OpenTSDB HTTP server on port `4242`:
```bash
/path/to/victoria-metrics-prod -opentsdbHTTPListenAddr=:4242
```
2) Send data to the given address from OpenTSDB-compatible agents.
Send data to the given address from OpenTSDB-compatible agents.
Example for writing a single data point:
@@ -442,7 +470,7 @@ Example for writing multiple data points in a single request:
curl -H 'Content-Type: application/json' -d '[{"metric":"foo","value":45.34},{"metric":"bar","value":43}]' http://localhost:4242/api/put
```
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
After that the data may be read via [/api/v1/export](#how-to-export-data-in-json-line-format) endpoint:
```bash
curl -G 'http://localhost:8428/api/v1/export' -d 'match[]=x.y.z' -d 'match[]=foo' -d 'match[]=bar'
@@ -457,56 +485,6 @@ The `/api/v1/export` endpoint should return the following response:
```
### How to import CSV data
Arbitrary CSV data can be imported via `/api/v1/import/csv`. The CSV data is imported according to the provided `format` query arg.
The `format` query arg must contain comma-separated list of parsing rules for CSV fields. Each rule consists of three parts delimited by a colon:
```
<column_pos>:<type>:<context>
```
* `<column_pos>` is the position of the CSV column (field). Column numbering starts from 1. The order of parsing rules may be arbitrary.
* `<type>` describes the column type. Supported types are:
* `metric` - the corresponding CSV column at `<column_pos>` contains metric value, which must be integer or floating-point number.
The metric name is read from the `<context>`. CSV line must have at least a single metric field. Multiple metric fields per CSV line is OK.
* `label` - the corresponding CSV column at `<column_pos>` contains label value. The label name is read from the `<context>`.
CSV line may have arbitrary number of label fields. All these labels are attached to all the configured metrics.
* `time` - the corresponding CSV column at `<column_pos>` contains metric time. CSV line may contain either one or zero columns with time.
If CSV line has no time, then the current time is used. The time is applied to all the configured metrics.
The format of the time is configured via `<context>`. Supported time formats are:
* `unix_s` - unix timestamp in seconds.
* `unix_ms` - unix timestamp in milliseconds.
* `unix_ns` - unix timestamp in nanoseconds. Note that VictoriaMetrics rounds the timestamp to milliseconds.
* `rfc3339` - timestamp in [RFC3339](https://tools.ietf.org/html/rfc3339) format, i.e. `2006-01-02T15:04:05Z`.
* `custom:<layout>` - custom layout for the timestamp. The `<layout>` may contain arbitrary time layout according to [time.Parse rules in Go](https://golang.org/pkg/time/#Parse).
Each request to `/api/v1/import/csv` may contain arbitrary number of CSV lines.
Example for importing CSV data via `/api/v1/import/csv`:
```bash
curl -d "GOOG,1.23,4.56,NYSE" 'http://localhost:8428/api/v1/import/csv?format=2:metric:ask,3:metric:bid,1:label:ticker,4:label:market'
curl -d "MSFT,3.21,1.67,NASDAQ" 'http://localhost:8428/api/v1/import/csv?format=2:metric:ask,3:metric:bid,1:label:ticker,4:label:market'
```
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
```bash
curl -G 'http://localhost:8428/api/v1/export' -d 'match[]={ticker!=""}'
```
The following response should be returned:
```bash
{"metric":{"__name__":"bid","market":"NASDAQ","ticker":"MSFT"},"values":[1.67],"timestamps":[1583865146520]}
{"metric":{"__name__":"bid","market":"NYSE","ticker":"GOOG"},"values":[4.56],"timestamps":[1583865146495]}
{"metric":{"__name__":"ask","market":"NASDAQ","ticker":"MSFT"},"values":[3.21],"timestamps":[1583865146520]}
{"metric":{"__name__":"ask","market":"NYSE","ticker":"GOOG"},"values":[1.23],"timestamps":[1583865146495]}
```
Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail.
### Prometheus querying API usage
VictoriaMetrics supports the following handlers from [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/):
@@ -517,9 +495,17 @@ VictoriaMetrics supports the following handlers from [Prometheus querying API](h
* [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names)
* [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values)
* [/api/v1/status/tsdb](https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats)
* [/api/v1/targets](https://prometheus.io/docs/prometheus/latest/querying/api/#targets) - see [these docs](#how-to-scrape-prometheus-exporters-such-as-node-exporter) for more details.
These handlers can be queried from Prometheus-compatible clients such as Grafana or curl.
#### Prometheus querying API enhancements
Additionally to unix timestamps and [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) VictoriaMetrics accepts relative times in `time`, `start` and `end` query args.
For example, the following query would return data for the last 30 minutes: `/api/v1/query_range?start=-30m&query=...`.
By default, VictoriaMetrics returns time series for the last 5 minutes from /api/v1/series, while the Prometheus API defaults to all time. Use `start` and `end` to select a different time range.
VictoriaMetrics accepts additional args for `/api/v1/labels` and `/api/v1/label/.../values` handlers.
See [this feature request](https://github.com/prometheus/prometheus/issues/6178) for details:
@@ -528,9 +514,26 @@ See [this feature request](https://github.com/prometheus/prometheus/issues/6178)
Additionally VictoriaMetrics provides the following handlers:
* `/api/v1/series/count` - it returns the total number of time series in the database. Note that this handler scans all the inverted index,
so it can be slow if the database contains tens of millions of time series.
* `/api/v1/series/count` - it returns the total number of time series in the database. Some notes:
* the handler scans all the inverted index, so it can be slow if the database contains tens of millions of time series;
* the handler may count [deleted time series](#how-to-delete-time-series) additionally to normal time series due to internal implementation restrictions;
* `/api/v1/labels/count` - it returns a list of `label: values_count` entries. It can be used for determining labels with the maximum number of values.
* `/api/v1/status/active_queries` - it returns a list of currently running queries.
### Graphite Metrics API usage
VictoriaMetrics supports the following handlers from [Graphite Metrics API](https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api):
* [/metrics/find](https://graphite-api.readthedocs.io/en/latest/api.html#metrics-find)
* [/metrics/expand](https://graphite-api.readthedocs.io/en/latest/api.html#metrics-expand)
* [/metrics/index.json](https://graphite-api.readthedocs.io/en/latest/api.html#metrics-index-json)
VictoriaMetrics accepts the following additional query args at `/metrics/find` and `/metrics/expand`:
* `label` - for selecting arbitrary label values. By default `label=__name__`, i.e. metric names are selected.
* `delimiter` - for using different delimiters in metric name hierachy. For example, `/metrics/find?delimiter=_&query=node_*` would return all the metric name prefixes
that start with `node_`. By default `delimiter=.`.
### How to build from sources
@@ -583,8 +586,9 @@ Run `make package-victoria-metrics`. It builds `victoriametrics/victoria-metrics
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-victoria-metrics`.
By default the image is built on top of `alpine` image for improved debuggability. It is possible to build the package on top of any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of `scratch` image:
The base docker image is [alpine](https://hub.docker.com/_/alpine) but it is possible to use any other base image
by setting it via `<ROOT_IMAGE>` environment variable.
For example, the following command builds the image on top of [scratch](https://hub.docker.com/_/scratch) image:
```bash
ROOT_IMAGE=scratch make package-victoria-metrics
@@ -593,7 +597,7 @@ ROOT_IMAGE=scratch make package-victoria-metrics
### Start with docker-compose
[Docker-compose](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/docker-compose.yml)
helps to spin up VictoriaMetrics, Prometheus and Grafana with one command.
helps to spin up VictoriaMetrics, [vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md) and Grafana with one command.
More details may be found [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#folder-contains-basic-images-and-tools-for-building-and-running-victoria-metrics-in-docker).
### Setting up service
@@ -635,9 +639,12 @@ Send a request to `http://<victoriametrics-addr>:8428/api/v1/admin/tsdb/delete_s
where `<timeseries_selector_for_delete>` may contain any [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors)
for metrics to delete. After that all the time series matching the given selector are deleted. Storage space for
the deleted time series isn't freed instantly - it is freed during subsequent [background merges of data files](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
Note that background merges may never occur for data from previous months, so storage space won't be freed for historical data.
In this case [forced merge](#forced-merge) may help freeing up storage space.
It is recommended verifying which metrics will be deleted with the call to `http://<victoria-metrics-addr>:8428/api/v1/series?match[]=<timeseries_selector_for_delete>`
before actually deleting the metrics.
before actually deleting the metrics. By default this query will only scan active series in the past 5 minutes, so you may need to
adjust `start` and `end` to a suitable range to achieve match hits.
The `/api/v1/admin/tsdb/delete_series` handler may be protected with `authKey` if `-deleteAuthKey` command-line flag is set.
@@ -649,20 +656,62 @@ The delete API is intended mainly for the following cases:
It isn't recommended using delete API for the following cases, since it brings non-zero overhead:
* Regular cleanups for unneeded data. Just prevent writing unneeded data into VictoriaMetrics.
This can be done with relabeling in [vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md).
This can be done with [relabeling](#relabeling).
See [this article](https://www.robustperception.io/relabelling-can-discard-targets-timeseries-and-alerts) for details.
* Reducing disk space usage by deleting unneeded time series. This doesn't work as expected, since the deleted
time series occupy disk space until the next merge operation, which can never occur when deleting too old data.
[Forced merge](#forced-merge) may be used for freeing up disk space occupied by old data.
It is better using `-retentionPeriod` command-line flag for efficient pruning of old data.
### Forced merge
VictoriaMetrics performs [data compactions in background](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
in order to keep good performance characteristics when accepting new data. These compactions (merges) are performed independently on per-month partitions.
This means that compactions are stopped for per-month partitions if no new data is ingested into these partitions.
Sometimes it is necessary to trigger compactions for old partitions. For instance, in order to free up disk space occupied by [deleted time series](#how-to-delete-time-series).
In this case forced compaction may be initiated on the specified per-month partition by sending request to `/internal/force_merge?partition_prefix=YYYY_MM`,
where `YYYY_MM` is per-month partition name. For example, `http://victoriametrics:8428/internal/force_merge?partition_prefix=2020_08` would initiate forced
merge for August 2020 partition. The call to `/internal/force_merge` returns immediately, while the corresponding forced merge continues running in background.
Forced merges may require additional CPU, disk IO and storage space resources. It is unnecessary to run forced merge under normal conditions,
since VictoriaMetrics automatically performs [optimal merges in background](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
when new data is ingested into it.
### How to export time series
Send a request to `http://<victoriametrics-addr>:8428/api/v1/export?match[]=<timeseries_selector_for_export>`,
VictoriaMetrics provides the following handlers for exporting data:
* `/api/v1/export/native` for exporting data in native binary format. This is the most efficient format for data export.
See [these docs](#how-to-export-data-in-native-format) for details.
* `/api/v1/export` for exporing data in JSON line format. See [these docs](#how-to-export-data-in-json-line-format) for details.
* `/api/v1/export/csv` for exporting data in CSV. See [these docs](#how-to-export-csv-data) for details.
#### How to export data in native format
Send a request to `http://<victoriametrics-addr>:8428/api/v1/export/native?match[]=<timeseries_selector_for_export>`,
where `<timeseries_selector_for_export>` may contain any [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors)
for metrics to export. Use `{__name__!=""}` selector for fetching all the time series.
Optional `start` and `end` args may be added to the request in order to limit the time frame for the exported data. These args may contain either
unix timestamp in seconds or [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) values.
The exported data can be imported to VictoriaMetrics via [/api/v1/import/native](#how-to-import-data-in-native-format).
#### How to export data in JSON line format
Consider [exporting data in native format](#how-to-export-data-in-native-format) if big amounts of data must be migrated between VictoriaMetrics instances,
since exporting in native format usually consumes lower amounts of CPU and memory resources, while the resulting exported data occupies lower amounts of disk space.
In order to export data in JSON line format, send a request to `http://<victoriametrics-addr>:8428/api/v1/export?match[]=<timeseries_selector_for_export>`,
where `<timeseries_selector_for_export>` may contain any [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors)
for metrics to export. Use `{__name__!=""}` selector for fetching all the time series.
The response would contain all the data for the selected time series in [JSON streaming format](https://en.wikipedia.org/wiki/JSON_streaming#Line-delimited_JSON).
Each JSON line would contain data for a single time series. An example output:
Each JSON line contains samples for a single time series. An example output:
```jsonl
{"metric":{"__name__":"up","job":"node_exporter","instance":"localhost:9100"},"values":[0,0,0],"timestamps":[1549891472010,1549891487724,1549891503438]}
@@ -672,8 +721,9 @@ Each JSON line would contain data for a single time series. An example output:
Optional `start` and `end` args may be added to the request in order to limit the time frame for the exported data. These args may contain either
unix timestamp in seconds or [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) values.
Optional `max_rows_per_line` arg may be added to the request in order to limit the maximum number of rows exported per each JSON line.
By default each JSON line contains all the rows for a single time series.
Optional `max_rows_per_line` arg may be added to the request for limiting the maximum number of rows exported per each JSON line.
Optional `reduce_mem_usage=1` arg may be added to the request for reducing memory usage when exporting big number of time series.
In this case the output may contain multiple lines with distinct samples for the same time series.
Pass `Accept-Encoding: gzip` HTTP header in the request to `/api/v1/export` in order to reduce network bandwidth during exporing big amounts
of time series data. This enables gzip compression for the exported data. Example for exporting gzipped data:
@@ -684,21 +734,82 @@ curl -H 'Accept-Encoding: gzip' http://localhost:8428/api/v1/export -d 'match[]=
The maximum duration for each request to `/api/v1/export` is limited by `-search.maxExportDuration` command-line flag.
Exported data can be imported via POST'ing it to [/api/v1/import](#how-to-import-time-series-data).
Exported data can be imported via POST'ing it to [/api/v1/import](#how-to-import-data-in-json-line-format).
#### How to export CSV data
Send a request to `http://<victoriametrics-addr>:8428/api/v1/export/csv?format=<format>&match=<timeseries_selector_for_export>`,
where:
* `<format>` must contain comma-delimited label names for the exported CSV. The following special label names are supported:
* `__name__` - metric name
* `__value__` - sample value
* `__timestamp__:<ts_format>` - sample timestamp. `<ts_format>` can have the following values:
* `unix_s` - unix seconds
* `unix_ms` - unix milliseconds
* `unix_ns` - unix nanoseconds
* `rfc3339` - [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) time
* `custom:<layout>` - custom layout for time that is supported by [time.Format](https://golang.org/pkg/time/#Time.Format) function from Go.
* `<timeseries_selector_for_export>` may contain any [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors)
for metrics to export.
Optional `start` and `end` args may be added to the request in order to limit the time frame for the exported data. These args may contain either
unix timestamp in seconds or [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) values.
The exported CSV data can be imported to VictoriaMetrics via [/api/v1/import/csv](#how-to-import-csv-data).
### How to import time series data
Time series data can be imported via any supported ingestion protocol:
* [Prometheus remote_write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
* [Influx line protocol](#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
* [Graphite plaintext protocol](#how-to-send-data-from-graphite-compatible-agents-such-as-statsd)
* [OpenTSDB telnet put protocol](#sending-data-via-telnet-put-protocol)
* [OpenTSDB http /api/put](#sending-opentsdb-data-via-http-apiput-requests)
* `/api/v1/import` http POST handler, which accepts data from [/api/v1/export](#how-to-export-time-series).
* `/api/v1/import/csv` http POST handler, which accepts CSV data. See [these docs](#how-to-import-csv-data) for details.
* [Prometheus remote_write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write). See [these docs](#prometheus-setup) for details.
* Influx line protocol. See [these docs](#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) for details.
* Graphite plaintext protocol. See [these docs](#how-to-send-data-from-graphite-compatible-agents-such-as-statsd) for details.
* OpenTSDB telnet put protocol. See [these docs](#sending-data-via-telnet-put-protocol) for details.
* OpenTSDB http `/api/put` protocol. See [these docs](#sending-opentsdb-data-via-http-apiput-requests) for details.
* `/api/v1/import` for importing data obtained from [/api/v1/export](#how-to-export-data-in-json-line-format).
See [these docs](##how-to-import-data-in-json-line-format) for details.
* `/api/v1/import/native` for importing data obtained from [/api/v1/export/native](#how-to-export-data-in-native-format).
See [these docs](#how-to-import-data-in-native-format) for details.
* `/api/v1/import/csv` for importing arbitrary CSV data. See [these docs](#how-to-import-csv-data) for details.
* `/api/v1/import/prometheus` for importing data in Prometheus exposition format. See [these docs](#how-to-import-data-in-prometheus-exposition-format) for details.
The most efficient protocol for importing data into VictoriaMetrics is `/api/v1/import`. Example for importing data obtained via `/api/v1/export`:
#### How to import data in native format
The most efficient protocol for importing data into VictoriaMetrics is `/api/v1/import/native`.
Example for importing data obtained via [/api/v1/export/native](#how-to-export-data-in-native-format):
```bash
# Export the data from <source-victoriametrics>:
curl http://source-victoriametrics:8428/api/v1/export/native -d 'match={__name__!=""}' > exported_data.bin
# Import the data to <destination-victoriametrics>:
curl -X POST http://destination-victoriametrics:8428/api/v1/import/native -T exported_data.bin
```
Pass `Content-Encoding: gzip` HTTP request header to `/api/v1/import/native` for importing gzipped data:
```bash
# Export gzipped data from <source-victoriametrics>:
curl -H 'Accept-Encoding: gzip' http://source-victoriametrics:8428/api/v1/export/native -d 'match={__name__!=""}' > exported_data.bin.gz
# Import gzipped data to <destination-victoriametrics>:
curl -X POST -H 'Content-Encoding: gzip' http://destination-victoriametrics:8428/api/v1/import/native -T exported_data.bin.gz
```
Extra labels may be added to all the imported time series by passing `extra_label=name=value` query args.
For example, `/api/v1/import/native?extra_label=foo=bar` would add `"foo":"bar"` label to all the imported time series.
Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail.
#### How to import data in JSON line format
Example for importing data obtained via [/api/v1/export](#how-to-export-data-in-json-line-format):
```bash
# Export the data from <source-victoriametrics>:
@@ -718,10 +829,123 @@ curl -H 'Accept-Encoding: gzip' http://source-victoriametrics:8428/api/v1/export
curl -X POST -H 'Content-Encoding: gzip' http://destination-victoriametrics:8428/api/v1/import -T exported_data.jsonl.gz
```
Extra labels may be added to all the imported time series by passing `extra_label=name=value` query args.
For example, `/api/v1/import?extra_label=foo=bar` would add `"foo":"bar"` label to all the imported time series.
Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail.
Each request to `/api/v1/import` can load up to a single vCPU core on VictoriaMetrics. Import speed can be improved by splitting the original file into smaller parts
and importing them concurrently. Note that the original file must be split on newlines.
#### How to import CSV data
Arbitrary CSV data can be imported via `/api/v1/import/csv`. The CSV data is imported according to the provided `format` query arg.
The `format` query arg must contain comma-separated list of parsing rules for CSV fields. Each rule consists of three parts delimited by a colon:
```
<column_pos>:<type>:<context>
```
* `<column_pos>` is the position of the CSV column (field). Column numbering starts from 1. The order of parsing rules may be arbitrary.
* `<type>` describes the column type. Supported types are:
* `metric` - the corresponding CSV column at `<column_pos>` contains metric value, which must be integer or floating-point number.
The metric name is read from the `<context>`. CSV line must have at least a single metric field. Multiple metric fields per CSV line is OK.
* `label` - the corresponding CSV column at `<column_pos>` contains label value. The label name is read from the `<context>`.
CSV line may have arbitrary number of label fields. All these labels are attached to all the configured metrics.
* `time` - the corresponding CSV column at `<column_pos>` contains metric time. CSV line may contain either one or zero columns with time.
If CSV line has no time, then the current time is used. The time is applied to all the configured metrics.
The format of the time is configured via `<context>`. Supported time formats are:
* `unix_s` - unix timestamp in seconds.
* `unix_ms` - unix timestamp in milliseconds.
* `unix_ns` - unix timestamp in nanoseconds. Note that VictoriaMetrics rounds the timestamp to milliseconds.
* `rfc3339` - timestamp in [RFC3339](https://tools.ietf.org/html/rfc3339) format, i.e. `2006-01-02T15:04:05Z`.
* `custom:<layout>` - custom layout for the timestamp. The `<layout>` may contain arbitrary time layout according to [time.Parse rules in Go](https://golang.org/pkg/time/#Parse).
Each request to `/api/v1/import/csv` may contain arbitrary number of CSV lines.
Example for importing CSV data via `/api/v1/import/csv`:
```bash
curl -d "GOOG,1.23,4.56,NYSE" 'http://localhost:8428/api/v1/import/csv?format=2:metric:ask,3:metric:bid,1:label:ticker,4:label:market'
curl -d "MSFT,3.21,1.67,NASDAQ" 'http://localhost:8428/api/v1/import/csv?format=2:metric:ask,3:metric:bid,1:label:ticker,4:label:market'
```
After that the data may be read via [/api/v1/export](#how-to-export-data-in-json-line-format) endpoint:
```bash
curl -G 'http://localhost:8428/api/v1/export' -d 'match[]={ticker!=""}'
```
The following response should be returned:
```bash
{"metric":{"__name__":"bid","market":"NASDAQ","ticker":"MSFT"},"values":[1.67],"timestamps":[1583865146520]}
{"metric":{"__name__":"bid","market":"NYSE","ticker":"GOOG"},"values":[4.56],"timestamps":[1583865146495]}
{"metric":{"__name__":"ask","market":"NASDAQ","ticker":"MSFT"},"values":[3.21],"timestamps":[1583865146520]}
{"metric":{"__name__":"ask","market":"NYSE","ticker":"GOOG"},"values":[1.23],"timestamps":[1583865146495]}
```
Extra labels may be added to all the imported lines by passing `extra_label=name=value` query args.
For example, `/api/v1/import/csv?extra_label=foo=bar` would add `"foo":"bar"` label to all the imported lines.
Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail.
#### How to import data in Prometheus exposition format
VictoriaMetrics accepts data in [Prometheus exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format)
via `/api/v1/import/prometheus` path. For example, the following line imports a single line in Prometheus exposition format into VictoriaMetrics:
```bash
curl -d 'foo{bar="baz"} 123' -X POST 'http://localhost:8428/api/v1/import/prometheus'
```
The following command may be used for verifying the imported data:
```bash
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"foo"}'
```
It should return something like the following:
```
{"metric":{"__name__":"foo","bar":"baz"},"values":[123],"timestamps":[1594370496905]}
```
Extra labels may be added to all the imported metrics by passing `extra_label=name=value` query args.
For example, `/api/v1/import/prometheus?extra_label=foo=bar` would add `{foo="bar"}` label to all the imported metrics.
If timestamp is missing in `<metric> <value> <timestamp>` Prometheus exposition format line, then the current timestamp is used during data ingestion.
It can be overriden by passing unix timestamp in *milliseconds* via `timestamp` query arg. For example, `/api/v1/import/prometheus?timestamp=1594370496905`.
VictoriaMetrics accepts arbitrary number of lines in a single request to `/api/v1/import/prometheus`, i.e. it supports data streaming.
Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail.
VictoriaMetrics also may scrape Prometheus targets - see [these docs](#how-to-scrape-prometheus-exporters-such-as-node-exporter).
### Relabeling
VictoriaMetrics supports Prometheus-compatible relabeling for all the ingested metrics if `-relabelConfig` command-line flag points
to a file containing a list of [relabel_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config) entries.
Example contents for `-relabelConfig` file:
```yml
# relabel_config.yml
- target_label: cluster
replacement: dev
- action: drop
source_labels: [__meta_kubernetes_pod_container_init]
regex: true
```
VictoriaMetrics provides the following extra actions for relabeling rules:
* `replace_all`: replaces all the occurences of `regex` in the values of `source_labels` with the `replacement` and stores the result in the `target_label`.
* `labelmap_all`: replaces all the occurences of `regex` in all the label names with the `replacement`.
* `keep_if_equal`: keeps the entry if all label values from `source_labels` are equal.
* `drop_if_equal`: drops the entry if all the label values from `source_labels` are equal.
See also [relabeling in vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md#relabeling).
### Federation
@@ -742,7 +966,7 @@ A rough estimation of the required resources for ingestion path:
Time series is considered active if new data points have been added to it recently or if it has been recently queried.
The number of active time series may be obtained from `vm_cache_entries{type="storage/hour_metric_ids"}` metric
exported on the `/metrics` page.
VictoriaMetrics stores various caches in RAM. Memory size for these caches may be limited by `-memory.allowedPercent` flag.
VictoriaMetrics stores various caches in RAM. Memory size for these caches may be limited with `-memory.allowedPercent` or `-memory.allowedBytes` flags.
* CPU cores: a CPU core per 300K inserted data points per second. So, ~4 CPU cores are required for processing
the insert stream of 1M data points per second. The ingestion rate may be lower for high cardinality data or for time series with high number of labels.
@@ -768,14 +992,16 @@ The required resources for query path:
The higher number of scanned time series and lower `step` argument results in the higher RAM usage.
* CPU cores: a CPU core per 30 millions of scanned data points per second.
This means that heavy queries that touch big number of time series (over 10K) and/or big number data points (over 100M)
usually require more CPU resources than tiny queries that touch a few time series with small number of data points.
* Network usage: depends on the frequency and the type of incoming requests. Typical Grafana dashboards usually
require negligible network bandwidth.
### High availability
1) Install multiple VictoriaMetrics instances in distinct datacenters (availability zones).
2) Pass addresses of these instances to [vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md) via `-remoteWrite.url` command-line flag:
* Install multiple VictoriaMetrics instances in distinct datacenters (availability zones).
* Pass addresses of these instances to [vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md) via `-remoteWrite.url` command-line flag:
```bash
/path/to/vmagent -remoteWrite.url=http://<victoriametrics-addr-1>:8428/api/v1/write -remoteWrite.url=http://<victoriametrics-addr-2>:8428/api/v1/write
@@ -794,7 +1020,7 @@ remote_write:
max_samples_per_send: 10000
```
3) Apply the updated config:
* Apply the updated config:
```bash
kill -HUP `pidof prometheus`
@@ -802,9 +1028,9 @@ kill -HUP `pidof prometheus`
It is recommended to use [vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md) instead of Prometheus for highly loaded setups.
4) Now Prometheus should write data into all the configured `remote_write` urls in parallel.
5) Set up [Promxy](https://github.com/jacksontj/promxy) in front of all the VictoriaMetrics replicas.
6) Set up Prometheus datasource in Grafana that points to Promxy.
* Now Prometheus should write data into all the configured `remote_write` urls in parallel.
* Set up [Promxy](https://github.com/jacksontj/promxy) in front of all the VictoriaMetrics replicas.
* Set up Prometheus datasource in Grafana that points to Promxy.
If you have Prometheus HA pairs with replicas `r1` and `r2` in each pair, then configure each `r1`
to write data to `victoriametrics-addr-1`, while each `r2` should write data to `victoriametrics-addr-2`.
@@ -817,11 +1043,13 @@ with the enabled de-duplication. See [this section](#deduplication) for details.
VictoriaMetrics de-duplicates data points if `-dedup.minScrapeInterval` command-line flag
is set to positive duration. For example, `-dedup.minScrapeInterval=60s` would de-duplicate data points
on the same time series if they are located closer than 60s to each other.
on the same time series if they fall within the same discrete 60s bucket. The earliest data point will be kept. In the case of equal timestamps, an arbitrary data point will be kept.
The de-duplication reduces disk space usage if multiple identically configured Prometheus instances in HA pair
write data to the same VictoriaMetrics instance. Note that these Prometheus instances must have identical
`external_labels` section in their configs, so they write data to the same time series.
### Retention
Retention is configured with `-retentionPeriod` command-line flag. For instance, `-retentionPeriod=3` means
@@ -833,6 +1061,10 @@ For example if `-retentionPeriod` is set to 1, data for January is deleted on Ma
It is safe to extend `-retentionPeriod` on existing data. If `-retentionPeriod` is set to lower
value than before then data outside the configured period will be eventually deleted.
VictoriaMetrics supports retention smaller than 1 month. For example, `-retentionPeriod=5d` would set data retention for 5 days.
Older data is eventually deleted during [background merge](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
### Multiple retentions
Just start multiple VictoriaMetrics instances with distinct values for the following flags:
@@ -897,6 +1129,7 @@ Consider setting the following command-line flags:
with [HTTP Basic Authentication](https://en.wikipedia.org/wiki/Basic_access_authentication).
* `-deleteAuthKey` for protecting `/api/v1/admin/tsdb/delete_series` endpoint. See [how to delete time series](#how-to-delete-time-series).
* `-snapshotAuthKey` for protecting `/snapshot*` endpoints. See [how to work with snapshots](#how-to-work-with-snapshots).
* `-forceMergeAuthKey` for protecting `/internal/force_merge` endpoint. See [force merge docs](#forced-merge).
* `-search.resetCacheAuthKey` for protecting `/internal/resetRollupResultCache` endpoint. See [backfilling](#backfilling) for more details.
Explicitly set internal network interface for TCP and UDP ports for data ingestion with Graphite and OpenTSDB formats.
@@ -949,6 +1182,8 @@ The most interesting metrics are:
If this number remains high during extended periods of time, then it is likely more RAM is needed for optimal handling
of the current number of active time series.
VictoriaMetrics also exposes currently running queries with their execution times at `/api/v1/status/active_queries` page.
### Troubleshooting
@@ -956,7 +1191,9 @@ The most interesting metrics are:
of tweaking these flag values arises.
* It is recommended upgrading to the latest available release from [this page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases),
since the issue could be already fixed there.
since the encountered issue could be already fixed there.
* It is recommended inspecting logs during troubleshooting, since they may contain useful information.
* If VictoriaMetrics works slowly and eats more than a CPU core per 100K ingested data points per second,
then it is likely you have too many active time series for the current amount of RAM.
@@ -966,11 +1203,17 @@ The most interesting metrics are:
Another option is to increase `-memory.allowedPercent` command-line flag value. Be careful with this
option, since too big value for `-memory.allowedPercent` may result in high I/O usage.
* VictoriaMetrics prioritizes data ingestion over data querying. So if it has no enough resources for data ingestion,
then data querying may slow down significantly.
* VictoriaMetrics requires free disk space for [merging data files to bigger ones](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
It may slow down when there is no enough free space left. So make sure `-storageDataPath` directory
has at least 20% of free space comparing to disk size. The remaining amount of free space
can be [monitored](#monitoring) via `vm_free_disk_space_bytes` metric. The total size of data
stored on the disk can be monitored via sum of `vm_data_size_bytes` metrics.
See also `vm_merge_need_free_disk_space` metrics, which are set to values higher than 0
if background merge cannot be initiated due to free disk space shortage. The value shows the number of per-month partitions,
which would start background merge if they had more free disk space.
* If VictoriaMetrics doesn't work because of certain parts are corrupted due to disk errors,
then just remove directories with broken parts. This will recover VictoriaMetrics at the cost
@@ -997,6 +1240,8 @@ The most interesting metrics are:
This prevents from ingesting metrics with too many labels. It is recommended [monitoring](#monitoring) `vm_metrics_with_dropped_labels_total`
metric in order to determine whether `-maxLabelsPerTimeseries` must be adjusted for your workload.
* VictoriaMetrics ignores `NaN` values during data ingestion.
### Backfilling
@@ -1014,6 +1259,14 @@ Yet another solution is to increase `-search.cacheTimestampOffset` flag value in
for data with timestamps close to the current time.
### Data updates
VictoriaMetrics doesn't support updating already existing sample values to new ones. It stores all the ingested data points
for the same time series with identical timestamps. While is possible substituting old time series with new time series via
[removal of old time series](#how-to-delete-timeseries) and then [writing new time series](#backfilling), this approach
should be used only for one-off updates. It shouldn't be used for frequent updates because of non-zero overhead related to data removal.
### Replication
Single-node VictoriaMetrics doesn't support application-level replication. Use cluster version instead.
@@ -1053,6 +1306,9 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g
## Integrations
* [Helm charts for single-node and cluster versions of VictoriaMetrics](https://github.com/VictoriaMetrics/helm-charts).
* [Kubernetes operator for VictoriaMetrics](https://github.com/VictoriaMetrics/operator).
* [vmctl tool for data migration to VictoriaMetrics](https://github.com/VictoriaMetrics/vmctl).
* [netdata](https://github.com/netdata/netdata) can push data into VictoriaMetrics via `Prometheus remote_write API`.
See [these docs](https://github.com/netdata/netdata#integrations).
* [go-graphite/carbonapi](https://github.com/go-graphite/carbonapi) can use VictoriaMetrics as time series backend.
@@ -1102,16 +1358,6 @@ Adhering `KISS` principle simplifies the resulting code and architecture, so it
Report bugs and propose new features [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues).
## Roadmap
* [ ] Replication [#118](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/118)
* [ ] Support of Object Storages (GCS, S3, Azure Storage) [#38](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/38)
* [ ] Data downsampling [#36](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/36)
* [ ] Alert Manager Integration [#119](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/119)
* [ ] CLI tool for data migration, re-balancing and adding/removing nodes [#103](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/103)
The discussion happens [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/129). Feel free to comment on any item or add you own one.
## Victoria Metrics Logo

View File

@@ -2,6 +2,7 @@ package main
import (
"flag"
"fmt"
"net/http"
"os"
"time"
@@ -10,6 +11,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
@@ -31,6 +33,7 @@ func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
cgroup.UpdateGOMAXPROCSToCPUQuota()
logger.Infof("starting VictoriaMetrics at %q...", *httpListenAddr)
startTime := time.Now()
storage.SetMinScrapeIntervalForDeduplication(*minScrapeInterval)
@@ -64,6 +67,10 @@ func main() {
}
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.RequestURI == "/" {
fmt.Fprintf(w, "Single-node VictoriaMetrics. See docs at https://victoriametrics.github.io/")
return true
}
if vminsert.RequestHandler(w, r) {
return true
}

View File

@@ -373,7 +373,7 @@ func checkMetricsResult(got, want []Metric) error {
want = removeIfFoundMetrics(r, want)
}
if len(want) > 0 {
return fmt.Errorf("exptected metrics %+v not found in %+v", want, got)
return fmt.Errorf("expected metrics %+v not found in %+v", want, got)
}
return nil
}

View File

@@ -85,7 +85,6 @@ func selfScraper(scrapeInterval time.Duration) {
mr.Timestamp = currentTimestamp
mr.Value = r.Value
}
logger.Infof("writing %d rows at timestamp %d", len(mrs), currentTimestamp)
vmstorage.AddRows(mrs)
}
}

View File

@@ -5,12 +5,12 @@
"empty_label_match 1 {TIME_S-1m}",
"empty_label_match;foo=bar 2 {TIME_S-1m}",
"empty_label_match;foo=baz 3 {TIME_S-1m}"],
"query": ["/api/v1/query_range?query=empty_label_match{foo=~'bar|'}&start={TIME_S}&end={TIME_S}&step=60"],
"query": ["/api/v1/query_range?query=empty_label_match{foo=~'bar|'}&start={TIME_S-1m}&end={TIME_S}&step=60"],
"result_query_range": {
"status":"success",
"data":{"resultType":"matrix",
"result":[
{"metric":{"__name__":"empty_label_match"},"values":[["{TIME_S}","1"]]},
{"metric":{"__name__":"empty_label_match","foo":"bar"},"values":[["{TIME_S}","2"]]}
{"metric":{"__name__":"empty_label_match"},"values":[["{TIME_S-1m}","1"],["{TIME_S}","1"]]},
{"metric":{"__name__":"empty_label_match","foo":"bar"},"values":[["{TIME_S-1m}","2"],["{TIME_S}","2"]]}
]}}
}

View File

@@ -16,6 +16,8 @@
["{TIME_S-120s}","3"],
["{TIME_S-60s}","2"],
["{TIME_S-30s}","1"],
["{TIME_S-20s}","1"]
["{TIME_S-20s}","1"],
["{TIME_S-10s}","1"],
["{TIME_S-0s}","1"]
]}]}}
}

View File

@@ -9,6 +9,6 @@
"query": ["/api/v1/query?query=min%20by%20(item)%20(min_over_time(forms_daily_count[10m:1m]))&time={TIME_S-1m}"],
"result_query": {
"status":"success",
"data":{"resultType":"vector","result":[{"metric":{"item":"x"},"value":["{TIME_S-1m}","1"]},{"metric":{"item":"y"},"value":["{TIME_S-1m}","3"]}]}
"data":{"resultType":"vector","result":[{"metric":{"item":"x"},"value":["{TIME_S-1m}","2"]},{"metric":{"item":"y"},"value":["{TIME_S-1m}","4"]}]}
}
}

View File

@@ -59,19 +59,22 @@ run-vmagent:
$(MAKE) run-via-docker
vmagent-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-amd64 ./app/vmagent
CGO_ENABLED=1 GOARCH=amd64 $(MAKE) vmagent-local-with-goarch
vmagent-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-arm ./app/vmagent
CGO_ENABLED=0 GOARCH=arm $(MAKE) vmagent-local-with-goarch
vmagent-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-arm64 ./app/vmagent
CGO_ENABLED=0 GOARCH=arm64 $(MAKE) vmagent-local-with-goarch
vmagent-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-ppc64le ./app/vmagent
CGO_ENABLED=0 GOARCH=ppc64le $(MAKE) vmagent-local-with-goarch
vmagent-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-386 ./app/vmagent
CGO_ENABLED=0 GOARCH=386 $(MAKE) vmagent-local-with-goarch
vmagent-local-with-goarch:
APP_NAME=vmagent $(MAKE) app-local-with-goarch
vmagent-pure:
APP_NAME=vmagent $(MAKE) app-local-pure

View File

@@ -25,7 +25,9 @@ to `vmagent` (like the ability to push metrics instead of pulling them). We did
* Graphite plaintext protocol if `-graphiteListenAddr` command-line flag is set. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-graphite-compatible-agents-such-as-statsd).
* OpenTSDB telnet and http protocols if `-opentsdbListenAddr` command-line flag is set. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-opentsdb-compatible-agents).
* Prometheus remote write protocol via `http://<vmagent>:8429/api/v1/write`.
* JSON lines import protocol via `http://<vmagent>:8429/api/v1/import`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-time-series-data).
* JSON lines import protocol via `http://<vmagent>:8429/api/v1/import`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-data-in-json-line-format).
* Native data import protocol via `http://<vmagent>:8429/api/v1/import/native`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-data-in-native-format).
* Data in Prometheus exposition format. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-data-in-prometheus-exposition-format) for details.
* Arbitrary CSV data via `http://<vmagent>:8429/api/v1/import/csv`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-csv-data).
* Can replicate collected metrics simultaneously to multiple remote storage systems.
* Works in environments with unstable connections to remote storage. If the remote storage is unavailable, the collected metrics
@@ -40,7 +42,7 @@ Just download `vmutils-*` archive from [releases page](https://github.com/Victor
and pass the following flags to `vmagent` binary in order to start scraping Prometheus targets:
* `-promscrape.config` with the path to Prometheus config file (it is usually located at `/etc/prometheus/prometheus.yml`)
* `-remoteWrite.url` with the remote storage endpoint such as VictoriaMetrics. The `-remoteWrite.url` argument can be specified multiple times in order to replicate data concurrently to an arbitrary amount of remote storage systems.
* `-remoteWrite.url` with the remote storage endpoint such as VictoriaMetrics. The `-remoteWrite.url` argument can be specified multiple times in order to replicate data concurrently to an arbitrary number of remote storage systems.
Example command line:
@@ -134,7 +136,7 @@ The following scrape types in [scrape_config](https://prometheus.io/docs/prometh
See [kubernetes_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config) for details.
* `ec2_sd_configs` - for scraping targets in Amazon EC2.
See [ec2_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config) for details.
`vmagent` doesn't support `role_arn` config param yet.
`vmagent` doesn't support `profile` config param and aws credentials file yet.
* `gce_sd_configs` - for scraping targets in Google Compute Engine (GCE).
See [gce_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config) for details.
`vmagent` provides the following additional functionality for `gce_sd_config`:
@@ -146,13 +148,26 @@ The following scrape types in [scrape_config](https://prometheus.io/docs/prometh
See [consul_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config) for details.
* `dns_sd_configs` - for scraping targets discovered from DNS records (SRV, A and AAAA).
See [dns_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config) for details.
* `openstack_sd_configs` - for scraping OpenStack targets.
See [openstack_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config) for details.
[OpenStack identity API v3](https://docs.openstack.org/api-ref/identity/v3/) is supported only.
* `dockerswarm_sd_configs` - for scraping Docker Swarm targets.
See [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config) for details.
File feature requests at [our issue tracker](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need other service discovery mechanisms to be supported by `vmagent`.
`vmagent` also support the following additional options in `scrape_config` section:
* `disable_compression: true` - for disabling response compression on a per-job basis. By default `vmagent` requests compressed responses from scrape targets
in order to save network bandwidth.
* `disable_keepalive: true` - for disabling [HTTP keep-alive connections](https://en.wikipedia.org/wiki/HTTP_persistent_connection) on a per-job basis.
By default `vmagent` uses keep-alive connections to scrape targets in order to reduce overhead on connection re-establishing.
Note that `vmagent` doesn't support `refresh_interval` option these scrape configs. Use the corresponding `-promscrape.*CheckInterval`
command-line flag instead. For example, `-promscrape.consulSDCheckInterval=60s` sets `refresh_interval` for all the `consul_sd_configs`
entries to 60s. Run `vmagent -help` in order to see default values for `-promscrape.*CheckInterval` flags.
File feature requests at [our issue tracker](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need other service discovery mechanisms to be supported by `vmagent`.
The file pointed by `-promscrape.config` may contain `%{ENV_VAR}` placeholders, which are substituted by the corresponding `ENV_VAR` environment variable values.
### Adding labels to metrics
@@ -170,6 +185,8 @@ Additionally it provides the following extra actions:
* `replace_all`: replaces all the occurences of `regex` in the values of `source_labels` with the `replacement` and stores the result in the `target_label`.
* `labelmap_all`: replaces all the occurences of `regex` in all the label names with the `replacement`.
* `keep_if_equal`: keeps the entry if all label values from `source_labels` are equal.
* `drop_if_equal`: drops the entry if all the label values from `source_labels` are equal.
The relabeling can be defined in the following places:
@@ -191,25 +208,84 @@ Read more about relabeling in the following articles:
`vmagent` exports various metrics in Prometheus exposition format at `http://vmagent-host:8429/metrics` page. It is recommended setting up regular scraping of this page
either via `vmagent` itself or via Prometheus, so the exported metrics could be analyzed later.
Use official [Grafana dashboard](https://grafana.com/grafana/dashboards/12683) for `vmagent` state overview.
If you have suggestions, improvements or found a bug - feel free to open an issue on github or add review to the dashboard.
`vmagent` also exports target statuses at `http://vmagent-host:8429/targets` page in plaintext format.
`vmagent` also exports target statuses at the following handlers:
* `http://vmagent-host:8429/targets`. This handler returns human-readable plaintext status for every active target.
This page is convenient to query from command line with `wget`, `curl` or similar tools.
It accepts optional `show_original_labels=1` query arg, which shows the original labels per each target before applying relabeling.
This information may be useful for debugging target relabeling.
* `http://vmagent-host:8429/api/v1/targets`. This handler returns data compatible with [the corresponding page from Prometheus API](https://prometheus.io/docs/prometheus/latest/querying/api/#targets).
### Troubleshooting
* It is recommended [setting up the official Grafana dashboard](#monitoring) in order to monitor `vmagent` state.
* It is recommended increasing the maximum number of open files in the system (`ulimit -n`) when scraping big number of targets,
since `vmagent` establishes at least a single TCP connection per each target.
* When `vmagent` scrapes many unreliable targets, it can flood error log with scrape errors. These errors can be suppressed
by passing `-promscrape.suppressScrapeErrors` command-line flag to `vmagent`. The most recent scrape error per each target can be observed at `http://vmagent-host:8429/targets`.
by passing `-promscrape.suppressScrapeErrors` command-line flag to `vmagent`. The most recent scrape error per each target can be observed at `http://vmagent-host:8429/targets`
and `http://vmagent-host:8429/api/v1/targets`.
* It is recommended to increase `-remoteWrite.queues` if `vmagent` collects more than 100K samples per second
and `vmagent_remotewrite_pending_data_bytes` metric exported at `http://vmagent-host:8429/metrics` page constantly grows.
* If `vmagent` scrapes targets with millions of metrics per each target (for instance, when scraping [federation endpoints](https://prometheus.io/docs/prometheus/latest/federation/)),
then it is recommended enabling `stream parsing mode` in order to reduce memory usage during scraping. This mode may be enabled either globally for all the scrape targets
by passing `-promscrape.streamParse` command-line flag or on a per-scrape target basis with `stream_parse: true` option. For example:
```yml
scrape_configs:
- job_name: 'big-federate'
stream_parse: true
static_configs:
- targets:
- big-prometeus1
- big-prometeus2
honor_labels: true
metrics_path: /federate
params:
'match[]': ['{__name__!=""}']
```
* It is recommended to increase `-remoteWrite.queues` if `vmagent_remotewrite_pending_data_bytes` metric exported at `http://vmagent-host:8429/metrics` page constantly grows.
* If you see gaps on the data pushed by `vmagent` to remote storage when `-remoteWrite.maxDiskUsagePerURL` is set, then try increasing `-remoteWrite.queues`.
Such gaps may appear because `vmagent` cannot keep up with sending the collected data to remote storage, so it starts dropping the buffered data
if the on-disk buffer size exceeds `-remoteWrite.maxDiskUsagePerURL`.
* `vmagent` buffers scraped data at `-remoteWrite.tmpDataPath` directory until it is sent to `-remoteWrite.url`.
The directory can grow large when remote storage is unavailable for extended periods of time and if `-remoteWrite.maxDiskUsagePerURL` isn't set.
If you don't want to send all the data from the directory to remote storage, simply stop `vmagent` and delete the directory.
* By default `vmagent` masks `-remoteWrite.url` with `secret-url` values in logs and at `/metrics` page because
the url may contain sensitive information such as auth tokens or passwords.
Pass `-remoteWrite.showURL` command-line flag when starting `vmagent` in order to see all the valid urls.
* If you see `skipping duplicate scrape target with identical labels` errors when scraping Kubernetes pods, then it is likely these pods listen multiple ports
or they use init container. These errors can be either fixed or suppressed with `-promscrape.suppressDuplicateScrapeTargetErrors` command-line flag.
See available options below if you prefer fixing the root cause of the error:
The following `relabel_configs` section may help determining `__meta_*` labels resulting in duplicate targets:
```yml
- action: labelmap
regex: __meta_(.*)
```
The following relabeling rule may be added to `relabel_configs` section in order to filter out pods with unneeded ports:
```yml
- action: keep_if_equal
source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]
```
The following relabeling rule may be added to `relabel_configs` section in order to filter out init container pods:
```yml
- action: drop
source_labels: [__meta_kubernetes_pod_container_init]
regex: true
```
### How to build from sources
@@ -234,13 +310,29 @@ Run `make package-vmagent`. It builds `victoriametrics/vmagent:<PKG_TAG>` docker
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmagent`.
By default the image is built on top of `scratch` image. It is possible to build the package on top of any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of `alpine:3.11` image:
The base docker image is [alpine](https://hub.docker.com/_/alpine) but it is possible to use any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of [scratch](https://hub.docker.com/_/scratch) image:
```bash
ROOT_IMAGE=alpine:3.11 make package-vmagent
ROOT_IMAGE=scratch make package-vmagent
```
#### ARM build
ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://blog.cloudflare.com/arm-takes-wing/).
#### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.13.
2. Run `make vmagent-arm` or `make vmagent-arm64` from the root folder of the repository.
It builds `vmagent-arm` or `vmagent-arm64` binary respectively and puts it into the `bin` folder.
#### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmagent-arm-prod` or `make vmagent-arm64-prod` from the root folder of the repository.
It builds `vmagent-arm-prod` or `vmagent-arm64-prod` binary respectively and puts it into the `bin` folder.
### Profiling

View File

@@ -6,6 +6,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
@@ -18,12 +19,18 @@ var (
// InsertHandler processes csv data from req.
func InsertHandler(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(rows, extraLabels)
})
})
}
func insertRows(rows []parser.Row) error {
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -44,6 +51,7 @@ func insertRows(rows []parser.Row) error {
Value: tag.Value,
})
}
labels = append(labels, extraLabels...)
samples = append(samples, prompbmarshal.Sample{
Value: r.Value,
Timestamp: r.Timestamp,

View File

@@ -19,6 +19,7 @@ import (
var (
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for '{measurement}{separator}{field_name}' metric name when inserted via Influx line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if Influx line contains only a single field")
skipMeasurement = flag.Bool("influxSkipMeasurement", false, "Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator'")
)
var (
@@ -61,25 +62,29 @@ func insertRows(db string, rows []parser.Row) error {
buf := ctx.buf[:0]
for i := range rows {
r := &rows[i]
rowsTotal += len(r.Fields)
commonLabels = commonLabels[:0]
hasDBLabel := false
hasDBKey := false
for j := range r.Tags {
tag := &r.Tags[j]
if tag.Key == "db" {
hasDBLabel = true
hasDBKey = true
}
commonLabels = append(commonLabels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
if len(db) > 0 && !hasDBLabel {
if len(db) > 0 && !hasDBKey {
commonLabels = append(commonLabels, prompbmarshal.Label{
Name: "db",
Value: db,
})
}
ctx.metricGroupBuf = append(ctx.metricGroupBuf[:0], r.Measurement...)
ctx.metricGroupBuf = ctx.metricGroupBuf[:0]
if !*skipMeasurement {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, r.Measurement...)
}
skipFieldKey := len(r.Fields) == 1 && *skipSingleField
if len(ctx.metricGroupBuf) > 0 && !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, *measurementFieldSeparator...)
@@ -107,7 +112,6 @@ func insertRows(db string, rows []parser.Row) error {
Samples: samples[len(samples)-1:],
})
}
rowsTotal += len(r.Fields)
}
ctx.buf = buf
ctx.ctx.WriteRequest.Timeseries = tssDst

View File

@@ -5,18 +5,22 @@ import (
"fmt"
"net/http"
"os"
"strconv"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/native"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/prometheusimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
@@ -26,6 +30,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -34,7 +39,8 @@ var (
httpListenAddr = flag.String("httpListenAddr", ":8429", "TCP address to listen for http connections. "+
"Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. "+
"Note that /targets and /metrics pages aren't available if -httpListenAddr=''")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for Influx line protocol data. Usually :8189 must be set. Doesn't work if empty")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for Influx line protocol data. Usually :8189 must be set. Doesn't work if empty. "+
"This flag isn't needed when ingesting data over HTTP - just send it to `http://<vmagent>:8429/write`")
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
@@ -56,8 +62,10 @@ func main() {
flag.CommandLine.SetOutput(os.Stdout)
flag.Usage = usage
envflag.Parse()
remotewrite.InitSecretFlags()
buildinfo.Init()
logger.Init()
cgroup.UpdateGOMAXPROCSToCPUQuota()
if *dryRun {
if err := flag.Set("promscrape.config.strictParse", "true"); err != nil {
@@ -76,6 +84,7 @@ func main() {
logger.Infof("starting vmagent at %q...", *httpListenAddr)
startTime := time.Now()
remotewrite.Init()
common.StartUnmarshalWorkers()
writeconcurrencylimiter.Init()
if len(*influxListenAddr) > 0 {
influxServer = influxserver.MustStart(*influxListenAddr, influx.InsertHandlerForReader)
@@ -123,19 +132,24 @@ func main() {
if len(*opentsdbHTTPListenAddr) > 0 {
opentsdbhttpServer.MustStop()
}
common.StopUnmarshalWorkers()
remotewrite.Stop()
logger.Infof("successfully stopped vmagent in %.3f seconds", time.Since(startTime).Seconds())
}
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.RequestURI == "/" {
fmt.Fprintf(w, "vmagent - see docs at https://victoriametrics.github.io/vmagent.html")
return true
}
path := strings.Replace(r.URL.Path, "//", "/", -1)
switch path {
case "/api/v1/write":
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
@@ -144,7 +158,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
vmimportRequests.Inc()
if err := vmimport.InsertHandler(r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
@@ -153,7 +167,25 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
csvimportRequests.Inc()
if err := csvimport.InsertHandler(r); err != nil {
csvimportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/prometheus":
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(r); err != nil {
prometheusimportErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/native":
nativeimportRequests.Inc()
if err := native.InsertHandler(r); err != nil {
nativeimportErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
@@ -162,7 +194,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
influxWriteRequests.Inc()
if err := influx.InsertHandlerForHTTP(r); err != nil {
influxWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
@@ -176,7 +208,14 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/targets":
promscrapeTargetsRequests.Inc()
w.Header().Set("Content-Type", "text/plain")
promscrape.WriteHumanReadableTargetsStatus(w)
showOriginalLabels, _ := strconv.ParseBool(r.FormValue("show_original_labels"))
promscrape.WriteHumanReadableTargetsStatus(w, showOriginalLabels)
return true
case "/api/v1/targets":
promscrapeAPIV1TargetsRequests.Inc()
w.Header().Set("Content-Type", "application/json")
state := r.FormValue("state")
promscrape.WriteAPIV1Targets(w, state)
return true
case "/-/reload":
promscrapeConfigReloadRequests.Inc()
@@ -197,12 +236,19 @@ var (
csvimportRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/import/csv", protocol="csvimport"}`)
csvimportErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/api/v1/import/csv", protocol="csvimport"}`)
prometheusimportRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/import/prometheus", protocol="prometheusimport"}`)
prometheusimportErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/api/v1/import/prometheus", protocol="prometheusimport"}`)
nativeimportRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/import/native", protocol="nativeimport"}`)
nativeimportErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/api/v1/import/native", protocol="nativeimport"}`)
influxWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/write", protocol="influx"}`)
influxWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/write", protocol="influx"}`)
influxQueryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/query", protocol="influx"}`)
promscrapeTargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/targets"}`)
promscrapeTargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/targets"}`)
promscrapeAPIV1TargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/targets"}`)
promscrapeConfigReloadRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/-/reload"}`)
)

View File

@@ -0,0 +1,85 @@
package native
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/native"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="native"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="native"}`)
)
// InsertHandler processes `/api/v1/import` request.
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
func InsertHandler(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(block *parser.Block) error {
return insertRows(block, extraLabels)
})
})
}
func insertRows(block *parser.Block, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
// Update rowsInserted and rowsPerInsert before actual inserting,
// since relabeling can prevent from inserting the rows.
rowsLen := len(block.Values)
rowsInserted.Add(rowsLen)
rowsPerInsert.Update(float64(rowsLen))
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
mn := &block.MetricName
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: bytesutil.ToUnsafeString(mn.MetricGroup),
})
for j := range mn.Tags {
tag := &mn.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: bytesutil.ToUnsafeString(tag.Key),
Value: bytesutil.ToUnsafeString(tag.Value),
})
}
labels = append(labels, extraLabels...)
values := block.Values
timestamps := block.Timestamps
if len(timestamps) != len(values) {
logger.Panicf("BUG: len(timestamps)=%d must match len(values)=%d", len(timestamps), len(values))
}
samplesLen := len(samples)
for j, value := range values {
samples = append(samples, prompbmarshal.Sample{
Value: value,
Timestamp: timestamps[j],
})
}
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
return nil
}

View File

@@ -0,0 +1,76 @@
package prometheusimport
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="prometheus"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="prometheus"}`)
)
// InsertHandler processes `/api/v1/import/prometheus` request.
func InsertHandler(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
defaultTimestamp, err := parserCommon.GetTimestamp(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return parser.ParseStream(req.Body, defaultTimestamp, isGzipped, func(rows []parser.Row) error {
return insertRows(rows, extraLabels)
})
})
}
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: r.Metric,
})
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
labels = append(labels, extraLabels...)
samples = append(samples, prompbmarshal.Sample{
Value: r.Value,
Timestamp: r.Timestamp,
})
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[len(samples)-1:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil
}

View File

@@ -35,6 +35,7 @@ func insertRows(timeseries []prompb.TimeSeries) error {
samples := ctx.Samples[:0]
for i := range timeseries {
ts := &timeseries[i]
rowsTotal += len(ts.Samples)
labelsLen := len(labels)
for i := range ts.Labels {
label := &ts.Labels[i]
@@ -55,7 +56,6 @@ func insertRows(timeseries []prompb.TimeSeries) error {
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
rowsTotal += len(ts.Samples)
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels

View File

@@ -1,10 +1,14 @@
package remotewrite
import (
"bytes"
"crypto/tls"
"encoding/base64"
"flag"
"fmt"
"io/ioutil"
"net/http"
"net/url"
"strings"
"sync"
"time"
@@ -13,12 +17,13 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/fasthttp"
"github.com/VictoriaMetrics/metrics"
)
var (
sendTimeout = flag.Duration("remoteWrite.sendTimeout", time.Minute, "Timeout for sending a single block of data to -remoteWrite.url")
proxyURL = flagutil.NewArray("remoteWrite.proxyURL", "Optional proxy URL for writing data to -remoteWrite.url. Supported proxies: http, https, socks5. "+
"Example: -remoteWrite.proxyURL=socks5://proxy:1234")
tlsInsecureSkipVerify = flag.Bool("remoteWrite.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -remoteWrite.url")
tlsCertFile = flagutil.NewArray("remoteWrite.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url. "+
@@ -39,24 +44,47 @@ var (
)
type client struct {
urlLabelValue string
sanitizedURL string
remoteWriteURL string
host string
requestURI string
authHeader string
fq *persistentqueue.FastQueue
hc *fasthttp.HostClient
hc *http.Client
requestDuration *metrics.Histogram
requestsOKCount *metrics.Counter
errorsCount *metrics.Counter
packetsDropped *metrics.Counter
retriesCount *metrics.Counter
wg sync.WaitGroup
stopCh chan struct{}
}
func newClient(argIdx int, remoteWriteURL, urlLabelValue string, fq *persistentqueue.FastQueue, concurrency int) *client {
func newClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persistentqueue.FastQueue, concurrency int) *client {
tlsCfg, err := getTLSConfig(argIdx)
if err != nil {
logger.Panicf("FATAL: cannot initialize TLS config: %s", err)
}
tr := &http.Transport{
Dial: statDial,
TLSClientConfig: tlsCfg,
TLSHandshakeTimeout: 5 * time.Second,
MaxConnsPerHost: 2 * concurrency,
MaxIdleConnsPerHost: 2 * concurrency,
IdleConnTimeout: time.Minute,
WriteBufferSize: 64 * 1024,
}
pURL := proxyURL.GetOptionalArg(argIdx)
if len(pURL) > 0 {
if !strings.Contains(pURL, "://") {
logger.Fatalf("cannot parse -remoteWrite.proxyURL=%q: it must start with `http://`, `https://` or `socks5://`", pURL)
}
urlProxy, err := url.Parse(pURL)
if err != nil {
logger.Fatalf("cannot parse -remoteWrite.proxyURL=%q: %s", pURL, err)
}
tr.Proxy = http.ProxyURL(urlProxy)
}
authHeader := ""
username := basicAuthUsername.GetOptionalArg(argIdx)
password := basicAuthPassword.GetOptionalArg(argIdx)
@@ -73,68 +101,22 @@ func newClient(argIdx int, remoteWriteURL, urlLabelValue string, fq *persistentq
}
authHeader = "Bearer " + token
}
readTimeout := *sendTimeout
if readTimeout <= 0 {
readTimeout = time.Minute
}
writeTimeout := readTimeout
var u fasthttp.URI
u.Update(remoteWriteURL)
scheme := string(u.Scheme())
switch scheme {
case "http", "https":
default:
logger.Fatalf("unsupported scheme in -remoteWrite.url=%q: %q. It must be http or https", remoteWriteURL, scheme)
}
host := string(u.Host())
if len(host) == 0 {
logger.Fatalf("invalid -remoteWrite.url=%q: host cannot be empty. Make sure the url looks like `http://host:port/path`", remoteWriteURL)
}
requestURI := string(u.RequestURI())
isTLS := scheme == "https"
var tlsCfg *tls.Config
if isTLS {
var err error
tlsCfg, err = getTLSConfig(argIdx)
if err != nil {
logger.Panicf("FATAL: cannot initialize TLS config: %s", err)
}
}
if !strings.Contains(host, ":") {
if isTLS {
host += ":443"
} else {
host += ":80"
}
}
maxConns := 2 * concurrency
hc := &fasthttp.HostClient{
Addr: host,
Name: "vmagent",
Dial: statDial,
IsTLS: isTLS,
TLSConfig: tlsCfg,
MaxConns: maxConns,
MaxIdleConnDuration: 10 * readTimeout,
ReadTimeout: readTimeout,
WriteTimeout: writeTimeout,
MaxResponseBodySize: 1024 * 1024,
}
c := &client{
urlLabelValue: urlLabelValue,
sanitizedURL: sanitizedURL,
remoteWriteURL: remoteWriteURL,
host: host,
requestURI: requestURI,
authHeader: authHeader,
fq: fq,
hc: hc,
stopCh: make(chan struct{}),
hc: &http.Client{
Transport: tr,
Timeout: *sendTimeout,
},
stopCh: make(chan struct{}),
}
c.requestDuration = metrics.GetOrCreateHistogram(fmt.Sprintf(`vmagent_remotewrite_duration_seconds{url=%q}`, c.urlLabelValue))
c.requestsOKCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="2XX"}`, c.urlLabelValue))
c.errorsCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_errors_total{url=%q}`, c.urlLabelValue))
c.retriesCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_retries_count_total{url=%q}`, c.urlLabelValue))
c.requestDuration = metrics.GetOrCreateHistogram(fmt.Sprintf(`vmagent_remotewrite_duration_seconds{url=%q}`, c.sanitizedURL))
c.requestsOKCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="2XX"}`, c.sanitizedURL))
c.errorsCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_errors_total{url=%q}`, c.sanitizedURL))
c.packetsDropped = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_packets_dropped_total{url=%q}`, c.sanitizedURL))
c.retriesCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_retries_count_total{url=%q}`, c.sanitizedURL))
for i := 0; i < concurrency; i++ {
c.wg.Add(1)
go func() {
@@ -142,27 +124,30 @@ func newClient(argIdx int, remoteWriteURL, urlLabelValue string, fq *persistentq
c.runWorker()
}()
}
logger.Infof("initialized client for -remoteWrite.url=%q", c.remoteWriteURL)
logger.Infof("initialized client for -remoteWrite.url=%q", c.sanitizedURL)
return c
}
func (c *client) MustStop() {
close(c.stopCh)
c.wg.Wait()
logger.Infof("stopped client for -remoteWrite.url=%q", c.remoteWriteURL)
logger.Infof("stopped client for -remoteWrite.url=%q", c.sanitizedURL)
}
func getTLSConfig(argIdx int) (*tls.Config, error) {
tlsConfig := &promauth.TLSConfig{
c := &promauth.TLSConfig{
CAFile: tlsCAFile.GetOptionalArg(argIdx),
CertFile: tlsCertFile.GetOptionalArg(argIdx),
KeyFile: tlsKeyFile.GetOptionalArg(argIdx),
ServerName: tlsServerName.GetOptionalArg(argIdx),
InsecureSkipVerify: *tlsInsecureSkipVerify,
}
cfg, err := promauth.NewConfig(".", nil, "", "", tlsConfig)
if c.CAFile == "" && c.CertFile == "" && c.KeyFile == "" && c.ServerName == "" && !c.InsecureSkipVerify {
return nil, nil
}
cfg, err := promauth.NewConfig(".", nil, "", "", c)
if err != nil {
return nil, fmt.Errorf("cannot populate TLS config: %s", err)
return nil, fmt.Errorf("cannot populate TLS config: %w", err)
}
tlsCfg := cfg.NewTLSConfig()
return tlsCfg, nil
@@ -193,7 +178,7 @@ func (c *client) runWorker() {
// The block has been sent successfully.
case <-time.After(graceDuration):
logger.Errorf("couldn't sent block with size %d bytes to %q in %.3f seconds during shutdown; dropping it",
len(block), c.remoteWriteURL, graceDuration.Seconds())
len(block), c.sanitizedURL, graceDuration.Seconds())
}
return
}
@@ -201,32 +186,25 @@ func (c *client) runWorker() {
}
func (c *client) sendBlock(block []byte) {
req := fasthttp.AcquireRequest()
req.SetRequestURI(c.requestURI)
req.SetHost(c.host)
req.Header.SetMethod("POST")
req.Header.Add("Content-Type", "application/x-protobuf")
req.Header.Add("Content-Encoding", "snappy")
req.Header.Add("X-Prometheus-Remote-Write-Version", "0.1.0")
retryDuration := time.Second
retriesCount := 0
again:
req, err := http.NewRequest("POST", c.remoteWriteURL, bytes.NewBuffer(block))
if err != nil {
logger.Panicf("BUG: unexected error from http.NewRequest(%q): %s", c.sanitizedURL, err)
}
h := req.Header
h.Set("User-Agent", "vmagent")
h.Set("Content-Type", "application/x-protobuf")
h.Set("Content-Encoding", "snappy")
h.Set("X-Prometheus-Remote-Write-Version", "0.1.0")
if c.authHeader != "" {
req.Header.Set("Authorization", c.authHeader)
}
req.SetBody(block)
retryDuration := time.Second
resp := fasthttp.AcquireResponse()
again:
select {
case <-c.stopCh:
fasthttp.ReleaseRequest(req)
fasthttp.ReleaseResponse(resp)
return
default:
}
startTime := time.Now()
err := doRequestWithPossibleRetry(c.hc, req, resp)
resp, err := c.hc.Do(req)
c.requestDuration.UpdateDuration(startTime)
if err != nil {
c.errorsCount.Inc()
@@ -235,40 +213,56 @@ again:
retryDuration = time.Minute
}
logger.Errorf("couldn't send a block with size %d bytes to %q: %s; re-sending the block in %.3f seconds",
len(block), c.remoteWriteURL, err, retryDuration.Seconds())
time.Sleep(retryDuration)
c.retriesCount.Inc()
goto again
}
statusCode := resp.StatusCode()
if statusCode/100 != 2 {
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="%d"}`, c.urlLabelValue, statusCode)).Inc()
retryDuration *= 2
if retryDuration > time.Minute {
retryDuration = time.Minute
len(block), c.sanitizedURL, err, retryDuration.Seconds())
t := time.NewTimer(retryDuration)
select {
case <-c.stopCh:
t.Stop()
return
case <-t.C:
}
logger.Errorf("unexpected status code received after sending a block with size %d bytes to %q: %d; response body=%q; re-sending the block in %.3f seconds",
len(block), c.remoteWriteURL, statusCode, resp.Body(), retryDuration.Seconds())
time.Sleep(retryDuration)
c.retriesCount.Inc()
goto again
}
c.requestsOKCount.Inc()
// The block has been successfully sent to the remote storage.
fasthttp.ReleaseResponse(resp)
fasthttp.ReleaseRequest(req)
}
func doRequestWithPossibleRetry(hc *fasthttp.HostClient, req *fasthttp.Request, resp *fasthttp.Response) error {
// There is no need in calling DoTimeout, since the timeout must be already set in hc.ReadTimeout.
err := hc.Do(req, resp)
if err == nil {
return nil
statusCode := resp.StatusCode
if statusCode/100 == 2 {
_ = resp.Body.Close()
c.requestsOKCount.Inc()
return
}
if err != fasthttp.ErrConnectionClosed {
return err
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="%d"}`, c.sanitizedURL, statusCode)).Inc()
if statusCode/100 == 4 {
// Just drop block on 4xx status code like Prometheus does.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/873
body, _ := ioutil.ReadAll(resp.Body)
_ = resp.Body.Close()
logger.Errorf("unexpected status code received when sending a block with size %d bytes to %q: #%d; dropping the block for 4XX status code like Prometheus does; "+
"response body=%q", len(block), c.sanitizedURL, statusCode, body)
c.packetsDropped.Inc()
return
}
// Retry request if the server closed the keep-alive connection during the first attempt.
return hc.Do(req, resp)
// Unexpected status code returned
retriesCount++
retryDuration *= 2
if retryDuration > time.Minute {
retryDuration = time.Minute
}
body, err := ioutil.ReadAll(resp.Body)
_ = resp.Body.Close()
if err != nil {
logger.Errorf("cannot read response body from %q during retry #%d: %s", c.sanitizedURL, retriesCount, err)
} else {
logger.Errorf("unexpected status code received after sending a block with size %d bytes to %q during retry #%d: %d; response body=%q; "+
"re-sending the block in %.3f seconds", len(block), c.sanitizedURL, retriesCount, statusCode, body, retryDuration.Seconds())
}
t := time.NewTimer(retryDuration)
select {
case <-c.stopCh:
t.Stop()
return
case <-t.C:
}
c.retriesCount.Inc()
goto again
}

View File

@@ -3,10 +3,12 @@ package remotewrite
import (
"flag"
"sync"
"sync/atomic"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
@@ -17,7 +19,7 @@ var (
flushInterval = flag.Duration("remoteWrite.flushInterval", time.Second, "Interval for flushing the data to remote storage. "+
"Higher value reduces network bandwidth usage at the cost of delayed push of scraped data to remote storage. "+
"Minimum supported interval is 1 second")
maxUnpackedBlockSize = flag.Int("remoteWrite.maxBlockSize", 32*1024*1024, "The maximum size in bytes of unpacked request to send to remote storage. "+
maxUnpackedBlockSize = flagutil.NewBytes("remoteWrite.maxBlockSize", 8*1024*1024, "The maximum size in bytes of unpacked request to send to remote storage. "+
"It shouldn't exceed -maxInsertRequestSize from VictoriaMetrics")
)
@@ -68,7 +70,7 @@ func (ps *pendingSeries) periodicFlusher() {
case <-ps.stopCh:
mustStop = true
case <-ticker.C:
if fasttime.UnixTimestamp()-ps.wr.lastFlushTime < uint64(flushSeconds) {
if fasttime.UnixTimestamp()-atomic.LoadUint64(&ps.wr.lastFlushTime) < uint64(flushSeconds) {
continue
}
}
@@ -79,10 +81,12 @@ func (ps *pendingSeries) periodicFlusher() {
}
type writeRequest struct {
wr prompbmarshal.WriteRequest
pushBlock func(block []byte)
// Move lastFlushTime to the top of the struct in order to guarantee atomic access on 32-bit architectures.
lastFlushTime uint64
wr prompbmarshal.WriteRequest
pushBlock func(block []byte)
tss []prompbmarshal.TimeSeries
labels []prompbmarshal.Label
@@ -113,7 +117,7 @@ func (wr *writeRequest) reset() {
func (wr *writeRequest) flush() {
wr.wr.Timeseries = wr.tss
wr.lastFlushTime = fasttime.UnixTimestamp()
atomic.StoreUint64(&wr.lastFlushTime, fasttime.UnixTimestamp())
pushWriteRequest(&wr.wr, wr.pushBlock)
wr.reset()
}
@@ -122,9 +126,9 @@ func (wr *writeRequest) push(src []prompbmarshal.TimeSeries) {
tssDst := wr.tss
for i := range src {
tssDst = append(tssDst, prompbmarshal.TimeSeries{})
dst := &tssDst[len(tssDst)-1]
wr.copyTimeSeries(dst, &src[i])
if len(wr.tss) >= maxRowsPerBlock {
wr.copyTimeSeries(&tssDst[len(tssDst)-1], &src[i])
if len(wr.samples) >= maxRowsPerBlock {
wr.tss = tssDst
wr.flush()
tssDst = wr.tss
}
@@ -164,7 +168,7 @@ func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byt
}
bb := writeRequestBufPool.Get()
bb.B = prompbmarshal.MarshalWriteRequest(bb.B[:0], wr)
if len(bb.B) <= *maxUnpackedBlockSize {
if len(bb.B) <= maxUnpackedBlockSize.N {
zb := snappyBufPool.Get()
zb.B = snappy.Encode(zb.B[:cap(zb.B)], bb.B)
writeRequestBufPool.Put(bb)

View File

@@ -16,7 +16,7 @@ var (
unparsedLabelsGlobal = flagutil.NewArray("remoteWrite.label", "Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. "+
"Pass multiple -remoteWrite.label flags in order to add multiple flags to metrics before sending them to remote storage")
relabelConfigPathGlobal = flag.String("remoteWrite.relabelConfig", "", "Optional path to file with relabel_config entries. These entries are applied to all the metrics "+
"before sending them to -remoteWrite.url. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config for details")
"before sending them to -remoteWrite.url. See https://victoriametrics.github.io/vmagent.html#relabeling for details")
relabelConfigPaths = flagutil.NewArray("remoteWrite.urlRelabelConfig", "Optional path to relabel config for the corresponding -remoteWrite.url")
)
@@ -33,7 +33,7 @@ func loadRelabelConfigs() (*relabelConfigs, error) {
if *relabelConfigPathGlobal != "" {
global, err := promrelabel.LoadRelabelConfigs(*relabelConfigPathGlobal)
if err != nil {
return nil, fmt.Errorf("cannot load -remoteWrite.relabelConfig=%q: %s", *relabelConfigPathGlobal, err)
return nil, fmt.Errorf("cannot load -remoteWrite.relabelConfig=%q: %w", *relabelConfigPathGlobal, err)
}
rcs.global = global
}
@@ -43,9 +43,13 @@ func loadRelabelConfigs() (*relabelConfigs, error) {
}
rcs.perURL = make([][]promrelabel.ParsedRelabelConfig, len(*remoteWriteURLs))
for i, path := range *relabelConfigPaths {
if len(path) == 0 {
// Skip empty relabel config.
continue
}
prc, err := promrelabel.LoadRelabelConfigs(path)
if err != nil {
return nil, fmt.Errorf("cannot load relabel configs from -remoteWrite.urlRelabelConfig=%q: %s", path, err)
return nil, fmt.Errorf("cannot load relabel configs from -remoteWrite.urlRelabelConfig=%q: %w", path, err)
}
rcs.perURL[i] = prc
}
@@ -59,7 +63,6 @@ type relabelConfigs struct {
// initLabelsGlobal must be called after parsing command-line flags.
func initLabelsGlobal() {
// Init labelsGlobal
labelsGlobal = nil
for _, s := range *unparsedLabelsGlobal {
n := strings.IndexByte(s, '=')

View File

@@ -3,11 +3,12 @@ package remotewrite
import (
"flag"
"fmt"
"runtime"
"sync"
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
@@ -22,14 +23,17 @@ var (
"It is recommended using VictoriaMetrics as remote storage. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
"Pass multiple -remoteWrite.url flags in order to write data concurrently to multiple remote storage systems")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored")
queues = flag.Int("remoteWrite.queues", 1, "The number of concurrent queues to each -remoteWrite.url. Set more queues if a single queue "+
queues = flag.Int("remoteWrite.queues", 4, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
"isn't enough for sending high volume of collected data to remote storage")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
"It is hidden by default, since it can contain sensistive auth info")
maxPendingBytesPerURL = flag.Int("remoteWrite.maxDiskUsagePerURL", 0, "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+
"It is hidden by default, since it can contain sensitive info such as auth key")
maxPendingBytesPerURL = flagutil.NewBytes("remoteWrite.maxDiskUsagePerURL", 0, "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+
"for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. "+
"Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500000000. "+
"Disk usage is unlimited if the value is set to 0")
significantFigures = flag.Int("remoteWrite.significantFigures", 0, "The number of significant figures to leave in metric values before writing them to remote storage. "+
"See https://en.wikipedia.org/wiki/Significant_figures . Zero value saves all the significant figures. "+
"This option may be used for increasing on-disk compression level for the stored metrics")
)
var rwctxs []*remoteWriteCtx
@@ -37,6 +41,18 @@ var rwctxs []*remoteWriteCtx
// Contains the current relabelConfigs.
var allRelabelConfigs atomic.Value
// maxQueues limits the maximum value for `-remoteWrite.queues`. There is no sense in setting too high value,
// since it may lead to high memory usage due to big number of buffers.
var maxQueues = runtime.GOMAXPROCS(-1) * 4
// InitSecretFlags must be called after flag.Parse and before any logging.
func InitSecretFlags() {
if !*showRemoteWriteURL {
// remoteWrite.url can contain authentication codes, so hide it at `/metrics` output.
flagutil.RegisterSecretFlag("remoteWrite.url")
}
}
// Init initializes remotewrite.
//
// It must be called after flag.Parse().
@@ -44,12 +60,13 @@ var allRelabelConfigs atomic.Value
// Stop must be called for graceful shutdown.
func Init() {
if len(*remoteWriteURLs) == 0 {
logger.Fatalf("at least one `-remoteWrite.url` must be set")
logger.Fatalf("at least one `-remoteWrite.url` command-line flag must be set")
}
if !*showRemoteWriteURL {
// remoteWrite.url can contain authentication codes, so hide it at `/metrics` output.
httpserver.RegisterSecretFlag("remoteWrite.url")
if *queues > maxQueues {
*queues = maxQueues
}
if *queues <= 0 {
*queues = 1
}
initLabelsGlobal()
rcs, err := loadRelabelConfigs()
@@ -69,11 +86,11 @@ func Init() {
maxInmemoryBlocks = 2
}
for i, remoteWriteURL := range *remoteWriteURLs {
urlLabelValue := fmt.Sprintf("secret-url-%d", i+1)
sanitizedURL := fmt.Sprintf("%d:secret-url", i+1)
if *showRemoteWriteURL {
urlLabelValue = remoteWriteURL
sanitizedURL = fmt.Sprintf("%d:%s", i+1, remoteWriteURL)
}
rwctx := newRemoteWriteCtx(i, remoteWriteURL, maxInmemoryBlocks, urlLabelValue)
rwctx := newRemoteWriteCtx(i, remoteWriteURL, maxInmemoryBlocks, sanitizedURL)
rwctxs = append(rwctxs, rwctx)
}
@@ -118,8 +135,19 @@ func Stop() {
// Push sends wr to remote storage systems set via `-remoteWrite.url`.
//
// Note that wr may be modified by Push due to relabeling.
// Note that wr may be modified by Push due to relabeling and rounding.
func Push(wr *prompbmarshal.WriteRequest) {
if *significantFigures > 0 {
// Round values according to significantFigures
for i := range wr.Timeseries {
samples := wr.Timeseries[i].Samples
for j := range samples {
s := &samples[j]
s.Value = decimal.Round(s.Value, *significantFigures)
}
}
}
var rctx *relabelCtx
rcs := allRelabelConfigs.Load().(*relabelConfigs)
prcsGlobal := rcs.global
@@ -128,11 +156,20 @@ func Push(wr *prompbmarshal.WriteRequest) {
}
tss := wr.Timeseries
for len(tss) > 0 {
// Process big tss in smaller blocks in order to reduce maxmimum memory usage
// Process big tss in smaller blocks in order to reduce the maximum memory usage
samplesCount := 0
i := 0
for i < len(tss) {
samplesCount += len(tss[i].Samples)
i++
if samplesCount > maxRowsPerBlock {
break
}
}
tssBlock := tss
if len(tssBlock) > maxRowsPerBlock {
tssBlock = tss[:maxRowsPerBlock]
tss = tss[maxRowsPerBlock:]
if i < len(tss) {
tssBlock = tss[:i]
tss = tss[i:]
} else {
tss = nil
}
@@ -162,22 +199,20 @@ type remoteWriteCtx struct {
pss []*pendingSeries
pssNextIdx uint64
tss []prompbmarshal.TimeSeries
relabelMetricsDropped *metrics.Counter
}
func newRemoteWriteCtx(argIdx int, remoteWriteURL string, maxInmemoryBlocks int, urlLabelValue string) *remoteWriteCtx {
func newRemoteWriteCtx(argIdx int, remoteWriteURL string, maxInmemoryBlocks int, sanitizedURL string) *remoteWriteCtx {
h := xxhash.Sum64([]byte(remoteWriteURL))
path := fmt.Sprintf("%s/persistent-queue/%016X", *tmpDataPath, h)
fq := persistentqueue.MustOpenFastQueue(path, remoteWriteURL, maxInmemoryBlocks, *maxPendingBytesPerURL)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, path, urlLabelValue), func() float64 {
path := fmt.Sprintf("%s/persistent-queue/%d_%016X", *tmpDataPath, argIdx+1, h)
fq := persistentqueue.MustOpenFastQueue(path, sanitizedURL, maxInmemoryBlocks, maxPendingBytesPerURL.N)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, path, sanitizedURL), func() float64 {
return float64(fq.GetPendingBytes())
})
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_inmemory_blocks{path=%q, url=%q}`, path, urlLabelValue), func() float64 {
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_inmemory_blocks{path=%q, url=%q}`, path, sanitizedURL), func() float64 {
return float64(fq.GetInmemoryQueueLen())
})
c := newClient(argIdx, remoteWriteURL, urlLabelValue, fq, *queues)
c := newClient(argIdx, remoteWriteURL, sanitizedURL, fq, *queues)
pss := make([]*pendingSeries, *queues)
for i := range pss {
pss[i] = newPendingSeries(fq.MustWriteBlock)
@@ -188,7 +223,7 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL string, maxInmemoryBlocks int,
c: c,
pss: pss,
relabelMetricsDropped: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_relabel_metrics_dropped_total{path=%q, url=%q}`, path, urlLabelValue)),
relabelMetricsDropped: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_relabel_metrics_dropped_total{path=%q, url=%q}`, path, sanitizedURL)),
}
}
@@ -208,15 +243,17 @@ func (rwctx *remoteWriteCtx) MustStop() {
func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
var rctx *relabelCtx
var v *[]prompbmarshal.TimeSeries
rcs := allRelabelConfigs.Load().(*relabelConfigs)
prcs := rcs.perURL[rwctx.idx]
if len(prcs) > 0 {
rctx = getRelabelCtx()
// Make a copy of tss before applying relabeling in order to prevent
// from affecting time series for other remoteWrite.url configs.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/467 for details.
rwctx.tss = append(rwctx.tss[:0], tss...)
tss = rwctx.tss
rctx = getRelabelCtx()
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/467
// and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
v = tssRelabelPool.Get().(*[]prompbmarshal.TimeSeries)
tss = append(*v, tss...)
tssLen := len(tss)
tss = rctx.applyRelabeling(tss, nil, prcs)
rwctx.relabelMetricsDropped.Add(tssLen - len(tss))
@@ -225,8 +262,15 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
idx := atomic.AddUint64(&rwctx.pssNextIdx, 1) % uint64(len(pss))
pss[idx].Push(tss)
if rctx != nil {
*v = prompbmarshal.ResetTimeSeries(tss)
tssRelabelPool.Put(v)
putRelabelCtx(rctx)
// Zero rwctx.tss in order to free up GC references.
rwctx.tss = prompbmarshal.ResetTimeSeries(rwctx.tss)
}
}
var tssRelabelPool = &sync.Pool{
New: func() interface{} {
a := []prompbmarshal.TimeSeries{}
return &a
},
}

View File

@@ -1,7 +1,9 @@
package remotewrite
import (
"fmt"
"net"
"strings"
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
@@ -9,7 +11,10 @@ import (
"github.com/VictoriaMetrics/metrics"
)
func statDial(addr string) (conn net.Conn, err error) {
func statDial(network, addr string) (conn net.Conn, err error) {
if !strings.HasPrefix(network, "tcp") {
return nil, fmt.Errorf("unexpected network passed to statDial: %q; it must start from `tcp`", network)
}
if netutil.TCP6Enabled() {
conn, err = fasthttp.DialDualStack(addr)
} else {

View File

@@ -6,7 +6,9 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
@@ -21,12 +23,18 @@ var (
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
func InsertHandler(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(rows, extraLabels)
})
})
}
func insertRows(rows []parser.Row) error {
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -36,6 +44,7 @@ func insertRows(rows []parser.Row) error {
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
rowsTotal += len(r.Values)
labelsLen := len(labels)
for j := range r.Tags {
tag := &r.Tags[j]
@@ -44,9 +53,12 @@ func insertRows(rows []parser.Row) error {
Value: bytesutil.ToUnsafeString(tag.Value),
})
}
labels = append(labels, extraLabels...)
values := r.Values
timestamps := r.Timestamps
_ = timestamps[len(values)-1]
if len(timestamps) != len(values) {
logger.Panicf("BUG: len(timestamps)=%d must match len(values)=%d", len(timestamps), len(values))
}
samplesLen := len(samples)
for j, value := range values {
samples = append(samples, prompbmarshal.Sample{
@@ -58,7 +70,6 @@ func insertRows(rows []parser.Row) error {
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
rowsTotal += len(values)
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels

View File

@@ -61,24 +61,30 @@ run-vmalert: vmalert
./bin/vmalert -rule=app/vmalert/config/testdata/rules2-good.rules \
-datasource.url=http://localhost:8428 \
-notifier.url=http://localhost:9093 \
-notifier.url=http://127.0.0.1:9093 \
-remoteWrite.url=http://localhost:8428 \
-remoteRead.url=http://localhost:8428 \
-external.label=cluster=east-1 \
-external.label=replica=a \
-evaluationInterval=3s
vmalert-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmalert-amd64 ./app/vmalert
CGO_ENABLED=1 GOARCH=amd64 $(MAKE) vmalert-local-with-goarch
vmalert-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmalert-arm ./app/vmalert
CGO_ENABLED=0 GOARCH=arm $(MAKE) vmalert-local-with-goarch
vmalert-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmalert-arm64 ./app/vmalert
CGO_ENABLED=0 GOARCH=arm64 $(MAKE) vmalert-local-with-goarch
vmalert-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmalert-ppc64le ./app/vmalert
CGO_ENABLED=0 GOARCH=ppc64le $(MAKE) vmalert-local-with-goarch
vmalert-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmalert-386 ./app/vmalert
CGO_ENABLED=0 GOARCH=386 $(MAKE) vmalert-local-with-goarch
vmalert-local-with-goarch:
APP_NAME=vmalert $(MAKE) app-local-with-goarch
vmalert-pure:
APP_NAME=vmalert $(MAKE) app-local-pure

View File

@@ -11,6 +11,7 @@ rules against configured address.
* Prometheus [alerting rules definition format](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#defining-alerting-rules)
support;
* Integration with [Alertmanager](https://github.com/prometheus/alertmanager);
* Keeps the alerts [state on restarts](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmalert#alerts-state-on-restarts);
* Lightweight without extra dependencies.
### Limitations:
@@ -44,10 +45,19 @@ compatible storage address for storing recording rules results and alerts state
Then configure `vmalert` accordingly:
```
./bin/vmalert -rule=alert.rules \
-datasource.url=http://localhost:8428 \
-notifier.url=http://localhost:9093
-datasource.url=http://localhost:8428 \ # PromQL compatible datasource
-notifier.url=http://localhost:9093 \ # AlertManager URL
-notifier.url=http://127.0.0.1:9093 \ # AlertManager replica URL
-remoteWrite.url=http://localhost:8428 \ # remote write compatible storage to persist rules
-remoteRead.url=http://localhost:8428 \ # PromQL compatible datasource to restore alerts state from
-external.label=cluster=east-1 \ # External label to be applied for each rule
-external.label=replica=a \ # Multiple external labels may be set
-evaluationInterval=3s # Default evaluation interval if not specified in rules group
```
If you run multiple `vmalert` services for the same datastore or AlertManager - do not forget
to specify different `external.label` flags in order to define which `vmalert` generated rules or alerts.
Configuration for [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very
similar to Prometheus rules and configured using YAML. Configuration examples may be found
@@ -112,14 +122,6 @@ annotations:
[ <labelname>: <tmpl_string> ]
```
`vmalert` has no local storage and alerts state is stored in process memory. Hence, after reloading of `vmalert` process
alerts state will be lost. To avoid this situation, `vmalert` may be configured via following flags:
* `-remoteWrite.url` - URL to Victoria Metrics or VMInsert. `vmalert` will persist alerts state into the configured
address in form of timeseries with name `ALERTS` via remote-write protocol.
* `-remoteRead.url` - URL to Victoria Metrics or VMSelect. `vmalert` will try to restore alerts state from configured
address by querying `ALERTS` timeseries.
##### Recording rules
The syntax for recording rules is following:
@@ -138,6 +140,22 @@ labels:
For recording rules to work `-remoteWrite.url` must specified.
#### Alerts state on restarts
`vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after reloading of `vmalert`
the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or VMInsert (Cluster). `vmalert` will persist alerts state
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
These are regular time series and may be queried from VM just as any other time series.
The state stored to the configured address on every rule evaluation.
* `-remoteRead.url` - URL to VictoriaMetrics (Single) or VMSelect (Cluster). `vmalert` will try to restore alerts state
from configured address by querying time series with name `ALERTS_FOR_STATE`.
Both flags are required for the proper state restoring. Restore process may fail if time series are missing
in configured `-remoteRead.url`, weren't updated in the last `1h` or received state doesn't match current `vmalert`
rules configuration.
#### WEB
`vmalert` runs a web-server (`-httpListenAddr`) for serving metrics and alerts endpoints:
@@ -153,54 +171,163 @@ Used as alert source in AlertManager.
The shortlist of configuration flags is the following:
```
Usage of vmalert:
-datasource.basicAuth.password string
Optional basic auth password for -datasource.url
Optional basic auth password for -datasource.url
-datasource.basicAuth.username string
Optional basic auth username for -datasource.url
Optional basic auth username for -datasource.url
-datasource.lookback duration
Lookback defines how far to look into past when evaluating queries. For example, if datasource.lookback=5m then param "time" with value now()-5m will be added to every query.
-datasource.maxIdleConnections int
Defines the number of idle (keep-alive connections) to configured datasource.Consider to set this value equal to the value: groups_total * group.concurrency. Too low value may result into high number of sockets in TIME_WAIT state. (default 100)
-datasource.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -datasource.url. By default system CA is used
-datasource.tlsCertFile string
Optional path to client-side TLS certificate file to use when connecting to -datasource.url
-datasource.tlsInsecureSkipVerify
Whether to skip tls verification when connecting to -datasource.url
-datasource.tlsKeyFile string
Optional path to client-side TLS certificate key to use when connecting to -datasource.url
-datasource.tlsServerName string
Optional TLS server name to use for connections to -datasource.url. By default the server name from -datasource.url is used
-datasource.url string
Victoria Metrics or VMSelect url. Required parameter. E.g. http://127.0.0.1:8428
Victoria Metrics or VMSelect url. Required parameter. E.g. http://127.0.0.1:8428
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-evaluationInterval duration
How often to evaluate the rules (default 1m0s)
How often to evaluate the rules (default 1m0s)
-external.alert.source string
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|pathEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used
-external.label array
Optional label in the form 'name=value' to add to all generated recording rules and alerts. Pass multiple -label flags in order to add multiple label sets.
Supports array of values separated by comma or specified via multiple flags.
-external.url string
External URL is used as alert's source for sent alerts to the notifier
External URL is used as alert's source for sent alerts to the notifier
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help spreading incoming load among a cluster of services behind load balancer. Note that the real timeout may be bigger by up to 10% as a protection from Thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses for saving CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for graceful shutdown of HTTP server. Highly loaded server may require increased value for graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this dealy the servier returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
Address to listen for http connections (default ":8880")
Address to listen for http connections (default ":8880")
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, then the remaining errors are suppressed. Zero value disables the rate limit (default 10)
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-memory.allowedBytes value
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to non-zero value. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage
Supports the following optional suffixes for values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics. It overrides httpAuth settings
-notifier.url string
Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093
Auth key for /metrics. It overrides httpAuth settings
-notifier.basicAuth.password array
Optional basic auth password for -datasource.url
Supports array of values separated by comma or specified via multiple flags.
-notifier.basicAuth.username array
Optional basic auth username for -datasource.url
Supports array of values separated by comma or specified via multiple flags.
-notifier.tlsCAFile array
Optional path to TLS CA file to use for verifying connections to -notifier.url. By default system CA is used
Supports array of values separated by comma or specified via multiple flags.
-notifier.tlsCertFile array
Optional path to client-side TLS certificate file to use when connecting to -notifier.url
Supports array of values separated by comma or specified via multiple flags.
-notifier.tlsInsecureSkipVerify
Whether to skip tls verification when connecting to -notifier.url
-notifier.tlsKeyFile array
Optional path to client-side TLS certificate key to use when connecting to -notifier.url
Supports array of values separated by comma or specified via multiple flags.
-notifier.tlsServerName array
Optional TLS server name to use for connections to -notifier.url. By default the server name from -notifier.url is used
Supports array of values separated by comma or specified via multiple flags.
-notifier.url array
Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093
Supports array of values separated by comma or specified via multiple flags.
-pprofAuthKey string
Auth key for /debug/pprof. It overrides httpAuth settings
-remoteRead.basicAuth.password string
Optional basic auth password for -remoteRead.url
Optional basic auth password for -remoteRead.url
-remoteRead.basicAuth.username string
Optional basic auth username for -remoteRead.url
Optional basic auth username for -remoteRead.url
-remoteRead.lookback duration
Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. (default 1h0m0s)
Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. (default 1h0m0s)
-remoteRead.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -remoteRead.url. By default system CA is used
-remoteRead.tlsCertFile string
Optional path to client-side TLS certificate file to use when connecting to -remoteRead.url
-remoteRead.tlsInsecureSkipVerify
Whether to skip tls verification when connecting to -remoteRead.url
-remoteRead.tlsKeyFile string
Optional path to client-side TLS certificate key to use when connecting to -remoteRead.url
-remoteRead.tlsServerName string
Optional TLS server name to use for connections to -remoteRead.url. By default the server name from -remoteRead.url is used
-remoteRead.url vmalert
Optional URL to Victoria Metrics or VMSelect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428
Optional URL to Victoria Metrics or VMSelect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428
-remoteWrite.basicAuth.password string
Optional basic auth password for -remoteWrite.url
Optional basic auth password for -remoteWrite.url
-remoteWrite.basicAuth.username string
Optional basic auth username for -remoteWrite.url
Optional basic auth username for -remoteWrite.url
-remoteWrite.concurrency int
Defines number of readers that concurrently write into remote storage (default 1)
Defines number of writers for concurrent writing into remote querier (default 1)
-remoteWrite.flushInterval duration
Defines interval of flushes to remote write endpoint (default 5s)
-remoteWrite.maxBatchSize int
Defines defines max number of timeseries to be flushed at once (default 1000)
Defines defines max number of timeseries to be flushed at once (default 1000)
-remoteWrite.maxQueueSize int
Defines the max number of pending datapoints to remote write endpoint (default 100000)
Defines the max number of pending datapoints to remote write endpoint (default 100000)
-remoteWrite.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. By default system CA is used
-remoteWrite.tlsCertFile string
Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url
-remoteWrite.tlsInsecureSkipVerify
Whether to skip tls verification when connecting to -remoteWrite.url
-remoteWrite.tlsKeyFile string
Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url
-remoteWrite.tlsServerName string
Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used
-remoteWrite.url string
Optional URL to Victoria Metrics or VMInsert where to persist alerts state in form of timeseries. E.g. http://127.0.0.1:8428
-rule value
Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule /path/to/file. Path to a single file with alerting rules
-rule dir/*.yaml -rule /*.yaml. Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.
Optional URL to Victoria Metrics or VMInsert where to persist alerts state and recording rules results in form of timeseries. E.g. http://127.0.0.1:8428
-rule array
Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule="/path/to/file". Path to a single file with alerting rules
-rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.
Supports array of values separated by comma or specified via multiple flags.
-rule.validateExpressions
Whether to validate rules expressions via MetricsQL engine (default true)
Whether to validate rules expressions via MetricsQL engine (default true)
-rule.validateTemplates
Whether to validate annotation and label templates (default true)
Whether to validate annotation and label templates (default true)
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs, since RSA certs are slow
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set
-version
Show VictoriaMetrics version
```
Pass `-help` to `vmalert` in order to see the full list of supported
@@ -233,3 +360,20 @@ It is recommended using
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmalert-prod` from the root folder of the repository.
It builds `vmalert-prod` binary and puts it into the `bin` folder.
#### ARM build
ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://blog.cloudflare.com/arm-takes-wing/).
#### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.13.
2. Run `make vmalert-arm` or `make vmalert-arm64` from the root folder of the repository.
It builds `vmalert-arm` or `vmalert-arm64` binary respectively and puts it into the `bin` folder.
#### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmalert-arm-prod` or `make vmalert-arm64-prod` from the root folder of the repository.
It builds `vmalert-arm-prod` or `vmalert-arm64-prod` binary respectively and puts it into the `bin` folder.

View File

@@ -14,6 +14,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
)
// AlertingRule is basic alert entity
@@ -25,6 +26,7 @@ type AlertingRule struct {
Labels map[string]string
Annotations map[string]string
GroupID uint64
GroupName string
// guard status fields
mu sync.RWMutex
@@ -36,19 +38,72 @@ type AlertingRule struct {
// resets on every successful Exec
// may be used as Health state
lastExecError error
metrics *alertingRuleMetrics
}
func newAlertingRule(gID uint64, cfg config.Rule) *AlertingRule {
return &AlertingRule{
type alertingRuleMetrics struct {
errors *gauge
pending *gauge
active *gauge
}
func newAlertingRule(group *Group, cfg config.Rule) *AlertingRule {
ar := &AlertingRule{
RuleID: cfg.ID,
Name: cfg.Alert,
Expr: cfg.Expr,
For: cfg.For,
For: cfg.For.Duration(),
Labels: cfg.Labels,
Annotations: cfg.Annotations,
GroupID: gID,
GroupID: group.ID(),
GroupName: group.Name,
alerts: make(map[uint64]*notifier.Alert),
metrics: &alertingRuleMetrics{},
}
labels := fmt.Sprintf(`alertname=%q, group=%q, id="%d"`, ar.Name, group.Name, ar.ID())
ar.metrics.pending = getOrCreateGauge(fmt.Sprintf(`vmalert_alerts_pending{%s}`, labels),
func() float64 {
ar.mu.Lock()
defer ar.mu.Unlock()
var num int
for _, a := range ar.alerts {
if a.State == notifier.StatePending {
num++
}
}
return float64(num)
})
ar.metrics.active = getOrCreateGauge(fmt.Sprintf(`vmalert_alerts_firing{%s}`, labels),
func() float64 {
ar.mu.Lock()
defer ar.mu.Unlock()
var num int
for _, a := range ar.alerts {
if a.State == notifier.StateFiring {
num++
}
}
return float64(num)
})
ar.metrics.errors = getOrCreateGauge(fmt.Sprintf(`vmalert_alerts_error{%s}`, labels),
func() float64 {
ar.mu.Lock()
defer ar.mu.Unlock()
if ar.lastExecError == nil {
return 0
}
return 1
})
return ar
}
// Close unregisters rule metrics
func (ar *AlertingRule) Close() {
metrics.UnregisterMetric(ar.metrics.active.name)
metrics.UnregisterMetric(ar.metrics.pending.name)
metrics.UnregisterMetric(ar.metrics.errors.name)
}
// String implements Stringer interface
@@ -72,7 +127,7 @@ func (ar *AlertingRule) Exec(ctx context.Context, q datasource.Querier, series b
ar.lastExecError = err
ar.lastExecTime = time.Now()
if err != nil {
return nil, fmt.Errorf("failed to execute query %q: %s", ar.Expr, err)
return nil, fmt.Errorf("failed to execute query %q: %w", ar.Expr, err)
}
for h, a := range ar.alerts {
@@ -103,7 +158,7 @@ func (ar *AlertingRule) Exec(ctx context.Context, q datasource.Querier, series b
a, err := ar.newAlert(m, ar.lastExecTime)
if err != nil {
ar.lastExecError = err
return nil, fmt.Errorf("failed to create alert: %s", err)
return nil, fmt.Errorf("failed to create alert: %w", err)
}
a.ID = h
a.State = notifier.StatePending
@@ -189,6 +244,9 @@ func (ar *AlertingRule) newAlert(m datasource.Metric, start time.Time) (*notifie
Start: start,
Expr: ar.Expr,
}
// label defined here to make override possible by
// time series labels.
a.Labels[alertGroupNameLabel] = ar.GroupName
for _, l := range m.Labels {
// drop __name__ to be consistent with Prometheus alerting
if l.Name == "__name__" {
@@ -283,15 +341,18 @@ func (ar *AlertingRule) newAlertAPI(a notifier.Alert) *APIAlert {
}
const (
// AlertMetricName is the metric name for synthetic alert timeseries.
// alertMetricName is the metric name for synthetic alert timeseries.
alertMetricName = "ALERTS"
// AlertForStateMetricName is the metric name for 'for' state of alert.
// alertForStateMetricName is the metric name for 'for' state of alert.
alertForStateMetricName = "ALERTS_FOR_STATE"
// AlertNameLabel is the label name indicating the name of an alert.
// alertNameLabel is the label name indicating the name of an alert.
alertNameLabel = "alertname"
// AlertStateLabel is the label name indicating the state of an alert.
// alertStateLabel is the label name indicating the state of an alert.
alertStateLabel = "alertstate"
// alertGroupNameLabel defines the label name attached for generated time series.
alertGroupNameLabel = "alertgroup"
)
// alertToTimeSeries converts the given alert with the given timestamp to timeseries
@@ -331,15 +392,22 @@ func alertForToTimeSeries(name string, a *notifier.Alert, timestamp time.Time) p
// Restore restores only Start field. Field State will be always Pending and supposed
// to be updated on next Exec, as well as Value field.
// Only rules with For > 0 will be restored.
func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookback time.Duration) error {
func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookback time.Duration, labels map[string]string) error {
if q == nil {
return fmt.Errorf("querier is nil")
}
// Get the last datapoint in range via MetricsQL `last_over_time`.
// account for external labels in filter
var labelsFilter string
for k, v := range labels {
labelsFilter += fmt.Sprintf(",%s=%q", k, v)
}
// Get the last data point in range via MetricsQL `last_over_time`.
// We don't use plain PromQL since Prometheus doesn't support
// remote write protocol which is used for state persistence in vmalert.
expr := fmt.Sprintf("last_over_time(%s{alertname=%q}[%ds])",
alertForStateMetricName, ar.Name, int(lookback.Seconds()))
expr := fmt.Sprintf("last_over_time(%s{alertname=%q%s}[%ds])",
alertForStateMetricName, ar.Name, labelsFilter, int(lookback.Seconds()))
qMetrics, err := q.Query(ctx, expr)
if err != nil {
return err
@@ -349,11 +417,14 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
labels := m.Labels
m.Labels = make([]datasource.Label, 0)
// drop all extra labels, so hash key will
// be identical to timeseries received in Exec
// be identical to time series received in Exec
for _, l := range labels {
if l.Name == alertNameLabel {
continue
}
if l.Name == alertGroupNameLabel {
continue
}
// drop all overridden labels
if _, ok := ar.Labels[l.Name]; ok {
continue
@@ -363,12 +434,12 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
a, err := ar.newAlert(m, time.Unix(int64(m.Value), 0))
if err != nil {
return fmt.Errorf("failed to create alert: %s", err)
return fmt.Errorf("failed to create alert: %w", err)
}
a.ID = hash(m)
a.State = notifier.StatePending
ar.alerts[a.ID] = a
logger.Infof("alert %q(%d) restored to state at %v", a.Name, a.ID, a.Start)
logger.Infof("alert %q (%d) restored to state at %v", a.Name, a.ID, a.Start)
}
return nil
}

View File

@@ -355,6 +355,7 @@ func TestAlertingRule_Restore(t *testing.T) {
metricWithValueAndLabels(t, float64(time.Now().Truncate(time.Hour).Unix()),
"__name__", alertForStateMetricName,
alertNameLabel, "",
alertGroupNameLabel, "groupID",
"foo", "bar",
"namespace", "baz",
),
@@ -419,7 +420,7 @@ func TestAlertingRule_Restore(t *testing.T) {
fq := &fakeQuerier{}
tc.rule.GroupID = fakeGroup.ID()
fq.add(tc.metrics...)
if err := tc.rule.Restore(context.TODO(), fq, time.Hour); err != nil {
if err := tc.rule.Restore(context.TODO(), fq, time.Hour, nil); err != nil {
t.Fatalf("unexpected err: %s", err)
}
if len(tc.rule.alerts) != len(tc.expAlerts) {

View File

@@ -1,6 +1,7 @@
package config
import (
"crypto/md5"
"fmt"
"hash/fnv"
"io/ioutil"
@@ -10,6 +11,9 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envtemplate"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/metricsql"
"gopkg.in/yaml.v2"
)
@@ -22,11 +26,30 @@ type Group struct {
Interval time.Duration `yaml:"interval,omitempty"`
Rules []Rule `yaml:"rules"`
Concurrency int `yaml:"concurrency"`
// Checksum stores the hash of yaml definition for this group.
// May be used to detect any changes like rules re-ordering etc.
Checksum string
// Catches all undefined fields and must be empty after parsing.
XXX map[string]interface{} `yaml:",inline"`
}
// UnmarshalYAML implements the yaml.Unmarshaler interface.
func (g *Group) UnmarshalYAML(unmarshal func(interface{}) error) error {
type group Group
if err := unmarshal((*group)(g)); err != nil {
return err
}
b, err := yaml.Marshal(g)
if err != nil {
return fmt.Errorf("failed to marshal group configuration for checksum: %w", err)
}
h := md5.New()
h.Write(b)
g.Checksum = fmt.Sprintf("%x", h.Sum(nil))
return nil
}
// Validate check for internal Group or Rule configuration errors
func (g *Group) Validate(validateAnnotations, validateExpressions bool) error {
if g.Name == "" {
@@ -46,19 +69,19 @@ func (g *Group) Validate(validateAnnotations, validateExpressions bool) error {
}
uniqueRules[r.ID] = struct{}{}
if err := r.Validate(); err != nil {
return fmt.Errorf("invalid rule %q.%q: %s", g.Name, ruleName, err)
return fmt.Errorf("invalid rule %q.%q: %w", g.Name, ruleName, err)
}
if validateExpressions {
if _, err := metricsql.Parse(r.Expr); err != nil {
return fmt.Errorf("invalid expression for rule %q.%q: %s", g.Name, ruleName, err)
return fmt.Errorf("invalid expression for rule %q.%q: %w", g.Name, ruleName, err)
}
}
if validateAnnotations {
if err := notifier.ValidateTemplates(r.Annotations); err != nil {
return fmt.Errorf("invalid annotations for rule %q.%q: %s", g.Name, ruleName, err)
return fmt.Errorf("invalid annotations for rule %q.%q: %w", g.Name, ruleName, err)
}
if err := notifier.ValidateTemplates(r.Labels); err != nil {
return fmt.Errorf("invalid labels for rule %q.%q: %s", g.Name, ruleName, err)
return fmt.Errorf("invalid labels for rule %q.%q: %w", g.Name, ruleName, err)
}
}
}
@@ -72,7 +95,7 @@ type Rule struct {
Record string `yaml:"record,omitempty"`
Alert string `yaml:"alert,omitempty"`
Expr string `yaml:"expr"`
For time.Duration `yaml:"for,omitempty"`
For PromDuration `yaml:"for,omitempty"`
Labels map[string]string `yaml:"labels,omitempty"`
Annotations map[string]string `yaml:"annotations,omitempty"`
@@ -80,6 +103,37 @@ type Rule struct {
XXX map[string]interface{} `yaml:",inline"`
}
// PromDuration is Prometheus duration.
type PromDuration struct {
milliseconds int64
}
// NewPromDuration returns PromDuration for given d.
func NewPromDuration(d time.Duration) PromDuration {
return PromDuration{
milliseconds: d.Milliseconds(),
}
}
// UnmarshalYAML implements yaml.Unmarshaler interface.
func (pd *PromDuration) UnmarshalYAML(unmarshal func(interface{}) error) error {
var s string
if err := unmarshal(&s); err != nil {
return err
}
ms, err := metricsql.DurationValue(s, 0)
if err != nil {
return err
}
pd.milliseconds = ms
return nil
}
// Duration returns duration for pd.
func (pd *PromDuration) Duration() time.Duration {
return time.Duration(pd.milliseconds) * time.Millisecond
}
// UnmarshalYAML implements the yaml.Unmarshaler interface.
func (r *Rule) UnmarshalYAML(unmarshal func(interface{}) error) error {
type rule Rule
@@ -90,8 +144,16 @@ func (r *Rule) UnmarshalYAML(unmarshal func(interface{}) error) error {
return nil
}
// Name returns Rule name according to its type
func (r *Rule) Name() string {
if r.Record != "" {
return r.Record
}
return r.Alert
}
// HashRule hashes significant Rule fields into
// unique hash value
// unique hash that supposed to define Rule uniqueness
func HashRule(r Rule) uint64 {
h := fnv.New64a()
h.Write([]byte(r.Expr))
@@ -102,16 +164,7 @@ func HashRule(r Rule) uint64 {
h.Write([]byte("alerting"))
h.Write([]byte(r.Alert))
}
type item struct {
key, value string
}
var kv []item
for k, v := range r.Labels {
kv = append(kv, item{key: k, value: v})
}
sort.Slice(kv, func(i, j int) bool {
return kv[i].key < kv[j].key
})
kv := sortMap(r.Labels)
for _, i := range kv {
h.Write([]byte(i.key))
h.Write([]byte(i.value))
@@ -137,31 +190,38 @@ func Parse(pathPatterns []string, validateAnnotations, validateExpressions bool)
for _, pattern := range pathPatterns {
matches, err := filepath.Glob(pattern)
if err != nil {
return nil, fmt.Errorf("error reading file pattern %s: %v", pattern, err)
return nil, fmt.Errorf("error reading file pattern %s: %w", pattern, err)
}
fp = append(fp, matches...)
}
errGroup := new(utils.ErrGroup)
var groups []Group
for _, file := range fp {
uniqueGroups := map[string]struct{}{}
gr, err := parseFile(file)
if err != nil {
return nil, fmt.Errorf("failed to parse file %q: %w", file, err)
errGroup.Add(fmt.Errorf("failed to parse file %q: %w", file, err))
continue
}
for _, g := range gr {
if err := g.Validate(validateAnnotations, validateExpressions); err != nil {
return nil, fmt.Errorf("invalid group %q in file %q: %s", g.Name, file, err)
errGroup.Add(fmt.Errorf("invalid group %q in file %q: %w", g.Name, file, err))
continue
}
if _, ok := uniqueGroups[g.Name]; ok {
return nil, fmt.Errorf("group name %q duplicate in file %q", g.Name, file)
errGroup.Add(fmt.Errorf("group name %q duplicate in file %q", g.Name, file))
continue
}
uniqueGroups[g.Name] = struct{}{}
g.File = file
groups = append(groups, g)
}
}
if err := errGroup.Err(); err != nil {
return nil, err
}
if len(groups) < 1 {
return nil, fmt.Errorf("no groups found in %s", strings.Join(pathPatterns, ";"))
logger.Warnf("no groups found in %s", strings.Join(pathPatterns, ";"))
}
return groups, nil
}
@@ -171,6 +231,7 @@ func parseFile(path string) ([]Group, error) {
if err != nil {
return nil, fmt.Errorf("error reading alert rule file: %w", err)
}
data = envtemplate.Replace(data)
g := struct {
Groups []Group `yaml:"groups"`
// Catches all undefined fields and must be empty after parsing.
@@ -193,3 +254,18 @@ func checkOverflow(m map[string]interface{}, ctx string) error {
}
return nil
}
type item struct {
key, value string
}
func sortMap(m map[string]string) []item {
var kv []item
for k, v := range m {
kv = append(kv, item{key: k, value: v})
}
sort.Slice(kv, func(i, j int) bool {
return kv[i].key < kv[j].key
})
return kv
}

View File

@@ -8,6 +8,7 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"gopkg.in/yaml.v2"
)
func TestMain(m *testing.M) {
@@ -51,10 +52,6 @@ func TestParseBad(t *testing.T) {
[]string{"testdata/dir/rules4-bad.rules"},
"either `record` or `alert` must be set",
},
{
[]string{"testdata/*.yaml"},
"no groups found",
},
}
for _, tc := range testCases {
_, err := Parse(tc.path, true, true)
@@ -70,13 +67,13 @@ func TestParseBad(t *testing.T) {
func TestRule_Validate(t *testing.T) {
if err := (&Rule{}).Validate(); err == nil {
t.Errorf("exptected empty name error")
t.Errorf("expected empty name error")
}
if err := (&Rule{Alert: "alert"}).Validate(); err == nil {
t.Errorf("exptected empty expr error")
t.Errorf("expected empty expr error")
}
if err := (&Rule{Alert: "alert", Expr: "test>0"}).Validate(); err != nil {
t.Errorf("exptected valid rule; got %s", err)
t.Errorf("expected valid rule; got %s", err)
}
}
@@ -273,7 +270,7 @@ func TestHashRule(t *testing.T) {
true,
},
{
Rule{Alert: "alert", Expr: "up == 1", For: time.Minute},
Rule{Alert: "alert", Expr: "up == 1", For: NewPromDuration(time.Minute)},
Rule{Alert: "alert", Expr: "up == 1"},
true,
},
@@ -324,3 +321,36 @@ func TestHashRule(t *testing.T) {
}
}
}
func TestGroupChecksum(t *testing.T) {
data := `
name: TestGroup
rules:
- alert: ExampleAlertAlwaysFiring
expr: sum by(job) (up == 1)
- record: handler:requests:rate5m
expr: sum(rate(prometheus_http_requests_total[5m])) by (handler)
`
var g Group
if err := yaml.Unmarshal([]byte(data), &g); err != nil {
t.Fatalf("failed to unmarshal: %s", err)
}
if g.Checksum == "" {
t.Fatalf("expected to get non-empty checksum")
}
newData := `
name: TestGroup
rules:
- record: handler:requests:rate5m
expr: sum(rate(prometheus_http_requests_total[5m])) by (handler)
- alert: ExampleAlertAlwaysFiring
expr: sum by(job) (up == 1)
`
var ng Group
if err := yaml.Unmarshal([]byte(newData), &g); err != nil {
t.Fatalf("failed to unmarshal: %s", err)
}
if g.Checksum == ng.Checksum {
t.Fatalf("expected to get different checksums")
}
}

View File

@@ -665,7 +665,7 @@
/
sum(rate(kube_state_metrics_list_total{job="kube-state-metrics"}[5m])))
> 0.01
for: 15m
for: 1d
labels:
severity: critical
- alert: KubeStateMetricsWatchErrors
@@ -1724,4 +1724,4 @@
rate(prometheus_operator_node_address_lookup_errors_total{job="prometheus-operator",namespace="monitoring"}[5m]) > 0.1
for: 10m
labels:
severity: warning
severity: warning

View File

@@ -0,0 +1,43 @@
package datasource
import (
"flag"
"fmt"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
)
var (
addr = flag.String("datasource.url", "", "Victoria Metrics or VMSelect url. Required parameter."+
" E.g. http://127.0.0.1:8428")
basicAuthUsername = flag.String("datasource.basicAuth.username", "", "Optional basic auth username for -datasource.url")
basicAuthPassword = flag.String("datasource.basicAuth.password", "", "Optional basic auth password for -datasource.url")
tlsInsecureSkipVerify = flag.Bool("datasource.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -datasource.url")
tlsCertFile = flag.String("datasource.tlsCertFile", "", "Optional path to client-side TLS certificate file to use when connecting to -datasource.url")
tlsKeyFile = flag.String("datasource.tlsKeyFile", "", "Optional path to client-side TLS certificate key to use when connecting to -datasource.url")
tlsCAFile = flag.String("datasource.tlsCAFile", "", "Optional path to TLS CA file to use for verifying connections to -datasource.url. "+
"By default system CA is used")
tlsServerName = flag.String("datasource.tlsServerName", "", "Optional TLS server name to use for connections to -datasource.url. "+
"By default the server name from -datasource.url is used")
lookBack = flag.Duration("datasource.lookback", 0, "Lookback defines how far to look into past when evaluating queries. "+
"For example, if datasource.lookback=5m then param \"time\" with value now()-5m will be added to every query.")
maxIdleConnections = flag.Int("datasource.maxIdleConnections", 100, "Defines the number of idle (keep-alive connections) to configured datasource."+
"Consider to set this value equal to the value: groups_total * group.concurrency. Too low value may result into high number of sockets in TIME_WAIT state.")
)
// Init creates a Querier from provided flag values.
func Init() (Querier, error) {
if *addr == "" {
return nil, fmt.Errorf("datasource.url is empty")
}
tr, err := utils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
tr.MaxIdleConns = *maxIdleConnections
c := &http.Client{Transport: tr}
return NewVMStorage(*addr, *basicAuthUsername, *basicAuthPassword, *lookBack, c), nil
}

View File

@@ -9,6 +9,7 @@ import (
"net/url"
"strconv"
"strings"
"time"
)
type response struct {
@@ -32,7 +33,7 @@ func (r response) metrics() ([]Metric, error) {
for i, res := range r.Data.Result {
f, err = strconv.ParseFloat(res.TV[1].(string), 64)
if err != nil {
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %s", res, res.TV[1], err)
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %w", res, res.TV[1], err)
}
m.Labels = nil
for k, v := range r.Data.Result[i].Labels {
@@ -45,22 +46,25 @@ func (r response) metrics() ([]Metric, error) {
return ms, nil
}
const queryPath = "/api/v1/query?query="
// VMStorage represents vmstorage entity with ability to read and write metrics
type VMStorage struct {
c *http.Client
queryURL string
basicAuthUser, basicAuthPass string
c *http.Client
queryURL string
basicAuthUser string
basicAuthPass string
lookBack time.Duration
}
const queryPath = "/api/v1/query?query="
// NewVMStorage is a constructor for VMStorage
func NewVMStorage(baseURL, basicAuthUser, basicAuthPass string, c *http.Client) *VMStorage {
func NewVMStorage(baseURL, basicAuthUser, basicAuthPass string, lookBack time.Duration, c *http.Client) *VMStorage {
return &VMStorage{
c: c,
basicAuthUser: basicAuthUser,
basicAuthPass: basicAuthPass,
queryURL: strings.TrimSuffix(baseURL, "/") + queryPath,
lookBack: lookBack,
}
}
@@ -69,7 +73,12 @@ func (s *VMStorage) Query(ctx context.Context, query string) ([]Metric, error) {
const (
statusSuccess, statusError, rtVector = "success", "error", "vector"
)
req, err := http.NewRequest("POST", s.queryURL+url.QueryEscape(query), nil)
q := s.queryURL + url.QueryEscape(query)
if s.lookBack > 0 {
lookBack := time.Now().Add(-s.lookBack)
q += fmt.Sprintf("&time=%d", lookBack.Unix())
}
req, err := http.NewRequest("POST", q, nil)
if err != nil {
return nil, err
}
@@ -79,25 +88,25 @@ func (s *VMStorage) Query(ctx context.Context, query string) ([]Metric, error) {
}
resp, err := s.c.Do(req.WithContext(ctx))
if err != nil {
return nil, fmt.Errorf("error getting response from %s:%s", req.URL, err)
return nil, fmt.Errorf("error getting response from %s: %w", req.URL, err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, _ := ioutil.ReadAll(resp.Body)
return nil, fmt.Errorf("datasource returns unxeprected response code %d for %s with err %s. Reponse body %s", resp.StatusCode, req.URL, err, body)
return nil, fmt.Errorf("datasource returns unexpected response code %d for %s. Response body %s", resp.StatusCode, req.URL, body)
}
r := &response{}
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return nil, fmt.Errorf("error parsing metrics for %s:%s", req.URL, err)
return nil, fmt.Errorf("error parsing metrics for %s: %w", req.URL, err)
}
if r.Status == statusError {
return nil, fmt.Errorf("response error, query: %s, errorType: %s, error: %s", req.URL, r.ErrorType, r.Error)
}
if r.Status != statusSuccess {
return nil, fmt.Errorf("unkown status:%s, Expected success or error ", r.Status)
return nil, fmt.Errorf("unknown status: %s, Expected success or error ", r.Status)
}
if r.Data.ResultType != rtVector {
return nil, fmt.Errorf("unkown restul type:%s. Expected vector", r.Data.ResultType)
return nil, fmt.Errorf("unknown result type:%s. Expected vector", r.Data.ResultType)
}
return r.metrics()
}

View File

@@ -4,7 +4,9 @@ import (
"context"
"net/http"
"net/http/httptest"
"strconv"
"testing"
"time"
)
var (
@@ -29,7 +31,14 @@ func TestVMSelectQuery(t *testing.T) {
t.Errorf("expected %s:%s as basic auth got %s:%s", basicAuthName, basicAuthPass, name, pass)
}
if r.URL.Query().Get("query") != query {
t.Errorf("exptected %s in query param, got %s", query, r.URL.Query().Get("query"))
t.Errorf("expected %s in query param, got %s", query, r.URL.Query().Get("query"))
}
timeParam := r.URL.Query().Get("time")
if timeParam == "" {
t.Errorf("expected 'time' in query param, got nil instead")
}
if _, err := strconv.ParseInt(timeParam, 10, 64); err != nil {
t.Errorf("failed to parse 'time' query param: %s", err)
}
switch c {
case 0:
@@ -52,7 +61,7 @@ func TestVMSelectQuery(t *testing.T) {
srv := httptest.NewServer(mux)
defer srv.Close()
am := NewVMStorage(srv.URL, basicAuthName, basicAuthPass, srv.Client())
am := NewVMStorage(srv.URL, basicAuthName, basicAuthPass, time.Minute, srv.Client())
if _, err := am.Query(ctx, query); err == nil {
t.Fatalf("expected connection error got nil")
}
@@ -66,7 +75,7 @@ func TestVMSelectQuery(t *testing.T) {
t.Fatalf("expected error status got nil")
}
if _, err := am.Query(ctx, query); err == nil {
t.Fatalf("expected unkown status got nil")
t.Fatalf("expected unknown status got nil")
}
if _, err := am.Query(ctx, query); err == nil {
t.Fatalf("expected non-vector resultType error got nil")
@@ -76,7 +85,7 @@ func TestVMSelectQuery(t *testing.T) {
t.Fatalf("unexpected %s", err)
}
if len(m) != 1 {
t.Fatalf("exptected 1 metric got %d in %+v", len(m), m)
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
}
expected := Metric{
Labels: []Label{{Value: "vm_rows", Name: "__name__"}},
@@ -89,5 +98,4 @@ func TestVMSelectQuery(t *testing.T) {
m[0].Labels[0].Name != expected.Labels[0].Name {
t.Fatalf("unexpected metric %+v want %+v", m[0], expected)
}
}

View File

@@ -11,6 +11,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/metrics"
)
@@ -23,24 +24,42 @@ type Group struct {
Rules []Rule
Interval time.Duration
Concurrency int
Checksum string
doneCh chan struct{}
finishedCh chan struct{}
// channel accepts new Group obj
// which supposed to update current group
updateCh chan *Group
metrics *groupMetrics
}
func newGroup(cfg config.Group, defaultInterval time.Duration) *Group {
type groupMetrics struct {
iterationTotal *counter
iterationDuration *summary
}
func newGroupMetrics(name, file string) *groupMetrics {
m := &groupMetrics{}
labels := fmt.Sprintf(`group=%q, file=%q`, name, file)
m.iterationTotal = getOrCreateCounter(fmt.Sprintf(`vmalert_iteration_total{%s}`, labels))
m.iterationDuration = getOrCreateSummary(fmt.Sprintf(`vmalert_iteration_duration_seconds{%s}`, labels))
return m
}
func newGroup(cfg config.Group, defaultInterval time.Duration, labels map[string]string) *Group {
g := &Group{
Name: cfg.Name,
File: cfg.File,
Interval: cfg.Interval,
Concurrency: cfg.Concurrency,
Checksum: cfg.Checksum,
doneCh: make(chan struct{}),
finishedCh: make(chan struct{}),
updateCh: make(chan *Group),
}
g.metrics = newGroupMetrics(g.Name, g.File)
if g.Interval == 0 {
g.Interval = defaultInterval
}
@@ -49,6 +68,17 @@ func newGroup(cfg config.Group, defaultInterval time.Duration) *Group {
}
rules := make([]Rule, len(cfg.Rules))
for i, r := range cfg.Rules {
// override rule labels with external labels
for k, v := range labels {
if prevV, ok := r.Labels[k]; ok {
logger.Infof("label %q=%q for rule %q.%q overwritten with external label %q=%q",
k, prevV, g.Name, r.Name(), k, v)
}
if r.Labels == nil {
r.Labels = map[string]string{}
}
r.Labels[k] = v
}
rules[i] = g.newRule(r)
}
g.Rules = rules
@@ -57,9 +87,9 @@ func newGroup(cfg config.Group, defaultInterval time.Duration) *Group {
func (g *Group) newRule(rule config.Rule) Rule {
if rule.Alert != "" {
return newAlertingRule(g.ID(), rule)
return newAlertingRule(g, rule)
}
return newRecordingRule(g.ID(), rule)
return newRecordingRule(g, rule)
}
// ID return unique group ID that consists of
@@ -73,7 +103,7 @@ func (g *Group) ID() uint64 {
}
// Restore restores alerts state for group rules
func (g *Group) Restore(ctx context.Context, q datasource.Querier, lookback time.Duration) error {
func (g *Group) Restore(ctx context.Context, q datasource.Querier, lookback time.Duration, labels map[string]string) error {
for _, rule := range g.Rules {
rr, ok := rule.(*AlertingRule)
if !ok {
@@ -82,8 +112,8 @@ func (g *Group) Restore(ctx context.Context, q datasource.Querier, lookback time
if rr.For < 1 {
continue
}
if err := rr.Restore(ctx, q, lookback); err != nil {
return fmt.Errorf("error while restoring rule %q: %s", rule, err)
if err := rr.Restore(ctx, q, lookback, labels); err != nil {
return fmt.Errorf("error while restoring rule %q: %w", rule, err)
}
}
return nil
@@ -105,6 +135,7 @@ func (g *Group) updateWith(newGroup *Group) error {
if !ok {
// old rule is not present in the new list
// so we mark it for removing
g.Rules[i].Close()
g.Rules[i] = nil
continue
}
@@ -127,24 +158,15 @@ func (g *Group) updateWith(newGroup *Group) error {
newRules = append(newRules, nr)
}
g.Concurrency = newGroup.Concurrency
g.Checksum = newGroup.Checksum
g.Rules = newRules
return nil
}
var (
iterationTotal = metrics.NewCounter(`vmalert_iteration_total`)
iterationDuration = metrics.NewSummary(`vmalert_iteration_duration_seconds`)
execTotal = metrics.NewCounter(`vmalert_execution_total`)
execErrors = metrics.NewCounter(`vmalert_execution_errors_total`)
execDuration = metrics.NewSummary(`vmalert_execution_duration_seconds`)
alertsFired = metrics.NewCounter(`vmalert_alerts_fired_total`)
alertsSent = metrics.NewCounter(`vmalert_alerts_sent_total`)
alertsSendErrors = metrics.NewCounter(`vmalert_alerts_send_errors_total`)
remoteWriteSent = metrics.NewCounter(`vmalert_remotewrite_sent_total`)
remoteWriteErrors = metrics.NewCounter(`vmalert_remotewrite_errors_total`)
)
func (g *Group) close() {
@@ -153,12 +175,41 @@ func (g *Group) close() {
}
close(g.doneCh)
<-g.finishedCh
metrics.UnregisterMetric(g.metrics.iterationDuration.name)
metrics.UnregisterMetric(g.metrics.iterationTotal.name)
for _, rule := range g.Rules {
rule.Close()
}
}
func (g *Group) start(ctx context.Context, querier datasource.Querier, nr notifier.Notifier, rw *remotewrite.Client) {
var skipRandSleepOnGroupStart bool
func (g *Group) start(ctx context.Context, querier datasource.Querier, nts []notifier.Notifier, rw *remotewrite.Client) {
defer func() { close(g.finishedCh) }()
// Spread group rules evaluation over time in order to reduce load on VictoriaMetrics.
if !skipRandSleepOnGroupStart {
randSleep := uint64(float64(g.Interval) * (float64(uint32(g.ID())) / (1 << 32)))
sleepOffset := uint64(time.Now().UnixNano()) % uint64(g.Interval)
if randSleep < sleepOffset {
randSleep += uint64(g.Interval)
}
randSleep -= sleepOffset
sleepTimer := time.NewTimer(time.Duration(randSleep))
select {
case <-ctx.Done():
sleepTimer.Stop()
return
case <-g.doneCh:
sleepTimer.Stop()
return
case <-sleepTimer.C:
}
}
logger.Infof("group %q started; interval=%v; concurrency=%d", g.Name, g.Interval, g.Concurrency)
e := &executor{querier, nr, rw}
e := &executor{querier, nts, rw}
t := time.NewTicker(g.Interval)
defer t.Stop()
for {
@@ -185,7 +236,7 @@ func (g *Group) start(ctx context.Context, querier datasource.Querier, nr notifi
g.mu.Unlock()
logger.Infof("group %q re-started; interval=%v; concurrency=%d", g.Name, g.Interval, g.Concurrency)
case <-t.C:
iterationTotal.Inc()
g.metrics.iterationTotal.Inc()
iterationStart := time.Now()
errs := e.execConcurrently(ctx, g.Rules, g.Concurrency, g.Interval)
@@ -195,15 +246,15 @@ func (g *Group) start(ctx context.Context, querier datasource.Querier, nr notifi
}
}
iterationDuration.UpdateDuration(iterationStart)
g.metrics.iterationDuration.UpdateDuration(iterationStart)
}
}
}
type executor struct {
querier datasource.Querier
notifier notifier.Notifier
rw *remotewrite.Client
querier datasource.Querier
notifiers []notifier.Notifier
rw *remotewrite.Client
}
func (e *executor) execConcurrently(ctx context.Context, rules []Rule, concurrency int, interval time.Duration) chan error {
@@ -240,6 +291,14 @@ func (e *executor) execConcurrently(ctx context.Context, rules []Rule, concurren
return res
}
var (
execTotal = metrics.NewCounter(`vmalert_execution_total`)
execErrors = metrics.NewCounter(`vmalert_execution_errors_total`)
execDuration = metrics.NewSummary(`vmalert_execution_duration_seconds`)
remoteWriteErrors = metrics.NewCounter(`vmalert_remotewrite_errors_total`)
)
func (e *executor) exec(ctx context.Context, rule Rule, returnSeries bool, interval time.Duration) error {
execTotal.Inc()
execStart := time.Now()
@@ -250,15 +309,14 @@ func (e *executor) exec(ctx context.Context, rule Rule, returnSeries bool, inter
tss, err := rule.Exec(ctx, e.querier, returnSeries)
if err != nil {
execErrors.Inc()
return fmt.Errorf("rule %q: failed to execute: %s", rule, err)
return fmt.Errorf("rule %q: failed to execute: %w", rule, err)
}
if len(tss) > 0 && e.rw != nil {
remoteWriteSent.Add(len(tss))
for _, ts := range tss {
if err := e.rw.Push(ts); err != nil {
remoteWriteErrors.Inc()
return fmt.Errorf("rule %q: remote write failure: %s", rule, err)
return fmt.Errorf("rule %q: remote write failure: %w", rule, err)
}
}
}
@@ -286,10 +344,14 @@ func (e *executor) exec(ctx context.Context, rule Rule, returnSeries bool, inter
if len(alerts) < 1 {
return nil
}
alertsSent.Add(len(alerts))
if err := e.notifier.Send(ctx, alerts); err != nil {
alertsSendErrors.Inc()
return fmt.Errorf("rule %q: failed to send alerts: %s", rule, err)
errGr := new(utils.ErrGroup)
for _, nt := range e.notifiers {
if err := nt.Send(ctx, alerts); err != nil {
alertsSendErrors.Inc()
errGr.Add(fmt.Errorf("rule %q: failed to send alerts: %w", rule, err))
}
}
return nil
return errGr.Err()
}

View File

@@ -10,6 +10,12 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
)
func init() {
// Disable rand sleep on group start during tests in order to speed up test execution.
// Rand sleep is needed only in prod code.
skipRandSleepOnGroupStart = true
}
func TestUpdateWith(t *testing.T) {
testCases := []struct {
name string
@@ -26,7 +32,7 @@ func TestUpdateWith(t *testing.T) {
[]config.Rule{{
Alert: "foo",
Expr: "up > 0",
For: time.Second,
For: config.NewPromDuration(time.Second),
Labels: map[string]string{
"bar": "baz",
},
@@ -38,7 +44,7 @@ func TestUpdateWith(t *testing.T) {
[]config.Rule{{
Alert: "foo",
Expr: "up > 10",
For: time.Second,
For: config.NewPromDuration(time.Second),
Labels: map[string]string{
"baz": "bar",
},
@@ -150,7 +156,7 @@ func TestGroupStart(t *testing.T) {
t.Fatalf("failed to parse rules: %s", err)
}
const evalInterval = time.Millisecond
g := newGroup(groups[0], evalInterval)
g := newGroup(groups[0], evalInterval, map[string]string{"cluster": "east-1"})
g.Concurrency = 2
fn := &fakeNotifier{}
@@ -179,7 +185,7 @@ func TestGroupStart(t *testing.T) {
fs.add(m1)
fs.add(m2)
go func() {
g.start(context.Background(), fs, fn, nil)
g.start(context.Background(), fs, []notifier.Notifier{fn}, nil)
close(finished)
}()

View File

@@ -4,16 +4,19 @@ import (
"context"
"flag"
"fmt"
"net/http"
"net/url"
"os"
"strconv"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remoteread"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
@@ -27,39 +30,26 @@ var (
rulePath = flagutil.NewArray("rule", `Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule /path/to/file. Path to a single file with alerting rules
-rule dir/*.yaml -rule /*.yaml. Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.`)
-rule="/path/to/file". Path to a single file with alerting rules
-rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.`)
httpListenAddr = flag.String("httpListenAddr", ":8880", "Address to listen for http connections")
evaluationInterval = flag.Duration("evaluationInterval", time.Minute, "How often to evaluate the rules")
validateTemplates = flag.Bool("rule.validateTemplates", true, "Whether to validate annotation and label templates")
validateExpressions = flag.Bool("rule.validateExpressions", true, "Whether to validate rules expressions via MetricsQL engine")
externalURL = flag.String("external.url", "", "External URL is used as alert's source for sent alerts to the notifier")
externalAlertSource = flag.String("external.alert.source", "", `External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|pathEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used`)
externalLabels = flagutil.NewArray("external.label", "Optional label in the form 'name=value' to add to all generated recording rules and alerts. "+
"Pass multiple -label flags in order to add multiple label sets.")
httpListenAddr = flag.String("httpListenAddr", ":8880", "Address to listen for http connections")
datasourceURL = flag.String("datasource.url", "", "Victoria Metrics or VMSelect url. Required parameter."+
" E.g. http://127.0.0.1:8428")
basicAuthUsername = flag.String("datasource.basicAuth.username", "", "Optional basic auth username for -datasource.url")
basicAuthPassword = flag.String("datasource.basicAuth.password", "", "Optional basic auth password for -datasource.url")
remoteWriteURL = flag.String("remoteWrite.url", "", "Optional URL to Victoria Metrics or VMInsert where to persist alerts state"+
" and recording rules results in form of timeseries. E.g. http://127.0.0.1:8428")
remoteWriteUsername = flag.String("remoteWrite.basicAuth.username", "", "Optional basic auth username for -remoteWrite.url")
remoteWritePassword = flag.String("remoteWrite.basicAuth.password", "", "Optional basic auth password for -remoteWrite.url")
remoteWriteMaxQueueSize = flag.Int("remoteWrite.maxQueueSize", 1e5, "Defines the max number of pending datapoints to remote write endpoint")
remoteWriteMaxBatchSize = flag.Int("remoteWrite.maxBatchSize", 1e3, "Defines defines max number of timeseries to be flushed at once")
remoteWriteConcurrency = flag.Int("remoteWrite.concurrency", 1, "Defines number of writers for concurrent writing into remote storage")
remoteReadURL = flag.String("remoteRead.url", "", "Optional URL to Victoria Metrics or VMSelect that will be used to restore alerts"+
" state. This configuration makes sense only if `vmalert` was configured with `remoteWrite.url` before and has been successfully persisted its state."+
" E.g. http://127.0.0.1:8428")
remoteReadUsername = flag.String("remoteRead.basicAuth.username", "", "Optional basic auth username for -remoteRead.url")
remoteReadPassword = flag.String("remoteRead.basicAuth.password", "", "Optional basic auth password for -remoteRead.url")
remoteReadLookBack = flag.Duration("remoteRead.lookback", time.Hour, "Lookback defines how far to look into past for alerts timeseries."+
" For example, if lookback=1h then range from now() to now()-1h will be scanned.")
evaluationInterval = flag.Duration("evaluationInterval", time.Minute, "How often to evaluate the rules")
notifierURL = flag.String("notifier.url", "", "Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093")
externalURL = flag.String("external.url", "", "External URL is used as alert's source for sent alerts to the notifier")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The `-rule` flag must be specified.")
)
func main() {
@@ -69,40 +59,25 @@ func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
checkFlags()
ctx, cancel := context.WithCancel(context.Background())
eu, err := getExternalURL(*externalURL, *httpListenAddr, httpserver.IsTLS())
if err != nil {
logger.Fatalf("can not get external url: %s ", err)
}
notifier.InitTemplateFunc(eu)
cgroup.UpdateGOMAXPROCSToCPUQuota()
manager := &manager{
groups: make(map[uint64]*Group),
storage: datasource.NewVMStorage(*datasourceURL, *basicAuthUsername, *basicAuthPassword, &http.Client{}),
notifier: notifier.NewAlertManager(*notifierURL, func(group, alert string) string {
return fmt.Sprintf("%s/api/v1/%s/%s/status", eu, group, alert)
}, &http.Client{}),
}
if *remoteWriteURL != "" {
c, err := remotewrite.NewClient(ctx, remotewrite.Config{
Addr: *remoteWriteURL,
Concurrency: *remoteWriteConcurrency,
MaxQueueSize: *remoteWriteMaxQueueSize,
MaxBatchSize: *remoteWriteMaxBatchSize,
FlushInterval: *evaluationInterval,
BasicAuthUser: *remoteWriteUsername,
BasicAuthPass: *remoteWritePassword,
})
if *dryRun {
u, _ := url.Parse("https://victoriametrics.com/")
notifier.InitTemplateFunc(u)
groups, err := config.Parse(*rulePath, true, true)
if err != nil {
logger.Fatalf("failed to init remotewrite client: %s", err)
logger.Fatalf(err.Error())
}
manager.rw = c
if len(groups) == 0 {
logger.Fatalf("No rules for validation. Please specify path to file(s) with alerting and/or recording rules using `-rule` flag")
}
return
}
if *remoteReadURL != "" {
manager.rr = datasource.NewVMStorage(*remoteReadURL, *remoteReadUsername, *remoteReadPassword, &http.Client{})
ctx, cancel := context.WithCancel(context.Background())
manager, err := newManager(ctx)
if err != nil {
logger.Fatalf("failed to init: %s", err)
}
if err := manager.start(ctx, *rulePath, *validateTemplates, *validateExpressions); err != nil {
logger.Fatalf("failed to start: %s", err)
}
@@ -147,6 +122,53 @@ var (
configTimestamp = metrics.NewCounter(`vmalert_config_last_reload_success_timestamp_seconds`)
)
func newManager(ctx context.Context) (*manager, error) {
q, err := datasource.Init()
if err != nil {
return nil, fmt.Errorf("failed to init datasource: %w", err)
}
eu, err := getExternalURL(*externalURL, *httpListenAddr, httpserver.IsTLS())
if err != nil {
return nil, fmt.Errorf("failed to init `external.url`: %w", err)
}
notifier.InitTemplateFunc(eu)
aug, err := getAlertURLGenerator(eu, *externalAlertSource, *validateTemplates)
if err != nil {
return nil, fmt.Errorf("failed to init `external.alert.source`: %w", err)
}
nts, err := notifier.Init(aug)
if err != nil {
return nil, fmt.Errorf("failed to init notifier: %w", err)
}
manager := &manager{
groups: make(map[uint64]*Group),
querier: q,
notifiers: nts,
labels: map[string]string{},
}
rw, err := remotewrite.Init(ctx)
if err != nil {
return nil, fmt.Errorf("failed to init remoteWrite: %w", err)
}
manager.rw = rw
rr, err := remoteread.Init()
if err != nil {
return nil, fmt.Errorf("failed to init remoteRead: %w", err)
}
manager.rr = rr
for _, s := range *externalLabels {
n := strings.IndexByte(s, '=')
if n < 0 {
return nil, fmt.Errorf("missing '=' in `-label`. It must contain label in the form `name=value`; got %q", s)
}
manager.labels[s[:n]] = s[n+1:]
}
return manager, nil
}
func getExternalURL(externalURL, httpListenAddr string, isSecure bool) (*url.URL, error) {
if externalURL != "" {
return url.Parse(externalURL)
@@ -166,15 +188,29 @@ func getExternalURL(externalURL, httpListenAddr string, isSecure bool) (*url.URL
return url.Parse(fmt.Sprintf("%s%s%s", schema, hname, port))
}
func checkFlags() {
if *notifierURL == "" {
flag.PrintDefaults()
logger.Fatalf("notifier.url is empty")
func getAlertURLGenerator(externalURL *url.URL, externalAlertSource string, validateTemplate bool) (notifier.AlertURLGenerator, error) {
if externalAlertSource == "" {
return func(alert notifier.Alert) string {
return fmt.Sprintf("%s/api/v1/%s/%s/status", externalURL, strconv.FormatUint(alert.GroupID, 10), strconv.FormatUint(alert.ID, 10))
}, nil
}
if *datasourceURL == "" {
flag.PrintDefaults()
logger.Fatalf("datasource.url is empty")
if validateTemplate {
if err := notifier.ValidateTemplates(map[string]string{
"tpl": externalAlertSource,
}); err != nil {
return nil, fmt.Errorf("error validating source template %s: %w", externalAlertSource, err)
}
}
m := map[string]string{
"tpl": externalAlertSource,
}
return func(alert notifier.Alert) string {
templated, err := alert.ExecTemplate(m)
if err != nil {
logger.Errorf("can not exec source template %s", err)
}
return fmt.Sprintf("%s/%s", externalURL, templated["tpl"])
}, nil
}
func usage() {

53
app/vmalert/main_test.go Normal file
View File

@@ -0,0 +1,53 @@
package main
import (
"fmt"
"net/url"
"os"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
)
func TestGetExternalURL(t *testing.T) {
expURL := "https://vicotriametrics.com/path"
u, err := getExternalURL(expURL, "", false)
if err != nil {
t.Errorf("unexpected error %s", err)
}
if u.String() != expURL {
t.Errorf("unexpected url want %s, got %s", expURL, u.String())
}
h, _ := os.Hostname()
expURL = fmt.Sprintf("https://%s:4242", h)
u, err = getExternalURL("", "0.0.0.0:4242", true)
if err != nil {
t.Errorf("unexpected error %s", err)
}
if u.String() != expURL {
t.Errorf("unexpected url want %s, got %s", expURL, u.String())
}
}
func TestGetAlertURLGenerator(t *testing.T) {
testAlert := notifier.Alert{GroupID: 42, ID: 2, Value: 4}
u, _ := url.Parse("https://victoriametrics.com/path")
fn, err := getAlertURLGenerator(u, "", false)
if err != nil {
t.Errorf("unexpected error %s", err)
}
if exp := "https://victoriametrics.com/path/api/v1/42/2/status"; exp != fn(testAlert) {
t.Errorf("unexpected url want %s, got %s", exp, fn(testAlert))
}
_, err = getAlertURLGenerator(nil, "foo?{{invalid}}", true)
if err == nil {
t.Errorf("expected tempalte validation error got nil")
}
fn, err = getAlertURLGenerator(u, "foo?query={{$value}}", true)
if err != nil {
t.Errorf("unexpected error %s", err)
}
if exp := "https://victoriametrics.com/path/foo?query=4"; exp != fn(testAlert) {
t.Errorf("unexpected url want %s, got %s", exp, fn(testAlert))
}
}

View File

@@ -15,13 +15,14 @@ import (
// manager controls group states
type manager struct {
storage datasource.Querier
notifier notifier.Notifier
querier datasource.Querier
notifiers []notifier.Notifier
rw *remotewrite.Client
rr datasource.Querier
wg sync.WaitGroup
wg sync.WaitGroup
labels map[string]string
groupsMu sync.RWMutex
groups map[uint64]*Group
@@ -64,7 +65,7 @@ func (m *manager) close() {
func (m *manager) startGroup(ctx context.Context, group *Group, restore bool) {
if restore && m.rr != nil {
err := group.Restore(ctx, m.rr, *remoteReadLookBack)
err := group.Restore(ctx, m.rr, *remoteReadLookBack, m.labels)
if err != nil {
logger.Errorf("error while restoring state for group %q: %s", group.Name, err)
}
@@ -73,7 +74,7 @@ func (m *manager) startGroup(ctx context.Context, group *Group, restore bool) {
m.wg.Add(1)
id := group.ID()
go func() {
group.start(ctx, m.storage, m.notifier, m.rw)
group.start(ctx, m.querier, m.notifiers, m.rw)
m.wg.Done()
}()
m.groups[id] = group
@@ -83,40 +84,62 @@ func (m *manager) update(ctx context.Context, path []string, validateTpl, valida
logger.Infof("reading rules configuration file from %q", strings.Join(path, ";"))
groupsCfg, err := config.Parse(path, validateTpl, validateExpr)
if err != nil {
return fmt.Errorf("cannot parse configuration file: %s", err)
return fmt.Errorf("cannot parse configuration file: %w", err)
}
groupsRegistry := make(map[uint64]*Group)
for _, cfg := range groupsCfg {
ng := newGroup(cfg, *evaluationInterval)
ng := newGroup(cfg, *evaluationInterval, m.labels)
groupsRegistry[ng.ID()] = ng
}
type updateItem struct {
old *Group
new *Group
}
var toUpdate []updateItem
m.groupsMu.Lock()
for _, og := range m.groups {
ng, ok := groupsRegistry[og.ID()]
if !ok {
// old group is not present in new list
// and must be stopped and deleted
// old group is not present in new list,
// so must be stopped and deleted
og.close()
delete(m.groups, og.ID())
og = nil
continue
}
og.updateCh <- ng
delete(groupsRegistry, ng.ID())
if og.Checksum != ng.Checksum {
toUpdate = append(toUpdate, updateItem{old: og, new: ng})
}
}
for _, ng := range groupsRegistry {
m.startGroup(ctx, ng, restore)
}
m.groupsMu.Unlock()
if len(toUpdate) > 0 {
var wg sync.WaitGroup
for _, item := range toUpdate {
wg.Add(1)
go func(old *Group, new *Group) {
old.updateCh <- new
wg.Done()
}(item.old, item.new)
}
wg.Wait()
}
return nil
}
func (g *Group) toAPI() APIGroup {
g.mu.RLock()
defer g.mu.RUnlock()
ag := APIGroup{
// encode as strings to avoid rounding
// encode as string to avoid rounding
ID: fmt.Sprintf("%d", g.ID()),
Name: g.Name,
File: g.File,

View File

@@ -5,7 +5,6 @@ import (
"math/rand"
"net/url"
"os"
"strings"
"sync"
"testing"
"time"
@@ -19,16 +18,15 @@ func TestMain(m *testing.M) {
os.Exit(m.Run())
}
func TestManagerUpdateError(t *testing.T) {
// TestManagerEmptyRulesDir tests
// successful cases of
// starting with empty rules folder
func TestManagerEmptyRulesDir(t *testing.T) {
m := &manager{groups: make(map[uint64]*Group)}
path := []string{"foo/bar"}
err := m.update(context.Background(), path, true, true, false)
if err == nil {
t.Fatalf("expected to have err; got nil instead")
}
expErr := "no groups found"
if !strings.Contains(err.Error(), expErr) {
t.Fatalf("expected to got err %s; got %s", expErr, err)
if err != nil {
t.Fatalf("expected to load succesfully with empty rules dir; got err instead: %v", err)
}
}
@@ -37,9 +35,9 @@ func TestManagerUpdateError(t *testing.T) {
// Should be executed with -race flag
func TestManagerUpdateConcurrent(t *testing.T) {
m := &manager{
groups: make(map[uint64]*Group),
storage: &fakeQuerier{},
notifier: &fakeNotifier{},
groups: make(map[uint64]*Group),
querier: &fakeQuerier{},
notifiers: []notifier.Notifier{&fakeNotifier{}},
}
paths := []string{
"config/testdata/dir/rules0-good.rules",
@@ -180,11 +178,32 @@ func TestManagerUpdate(t *testing.T) {
}},
},
},
{
name: "update empty dir rules from 0 to 2 groups",
initPath: "config/testdata/empty/*",
updatePath: "config/testdata/rules0-good.rules",
want: []*Group{
{
File: "config/testdata/rules0-good.rules",
Name: "groupGorSingleAlert",
Interval: defaultEvalInterval,
Rules: []Rule{VMRows},
},
{
File: "config/testdata/rules0-good.rules",
Interval: defaultEvalInterval,
Name: "TestGroup", Rules: []Rule{
Conns,
ExampleAlertAlwaysFiring,
},
},
},
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
ctx, cancel := context.WithCancel(context.TODO())
m := &manager{groups: make(map[uint64]*Group), storage: &fakeQuerier{}}
m := &manager{groups: make(map[uint64]*Group), querier: &fakeQuerier{}}
path := []string{tc.initPath}
if err := m.update(ctx, path, true, true, false); err != nil {
t.Fatalf("failed to complete initial rules update: %s", err)

39
app/vmalert/metrics.go Normal file
View File

@@ -0,0 +1,39 @@
package main
import "github.com/VictoriaMetrics/metrics"
type gauge struct {
name string
*metrics.Gauge
}
func getOrCreateGauge(name string, f func() float64) *gauge {
return &gauge{
name: name,
Gauge: metrics.GetOrCreateGauge(name, f),
}
}
type counter struct {
name string
*metrics.Counter
}
func getOrCreateCounter(name string) *counter {
return &counter{
name: name,
Counter: metrics.GetOrCreateCounter(name),
}
}
type summary struct {
name string
*metrics.Summary
}
func getOrCreateSummary(name string) *summary {
return &summary{
name: name,
Summary: metrics.GetOrCreateSummary(name),
}
}

View File

@@ -7,6 +7,8 @@ import (
"strings"
"text/template"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
)
// Alert the triggered alert
@@ -77,7 +79,7 @@ func ValidateTemplates(annotations map[string]string) error {
func templateAnnotations(annotations map[string]string, header string, data alertTplData) (map[string]string, error) {
var builder strings.Builder
var buf bytes.Buffer
eg := errGroup{}
eg := new(utils.ErrGroup)
r := make(map[string]string, len(annotations))
for key, text := range annotations {
r[key] = text
@@ -87,12 +89,12 @@ func templateAnnotations(annotations map[string]string, header string, data aler
builder.WriteString(header)
builder.WriteString(text)
if err := templateAnnotation(&buf, builder.String(), data); err != nil {
eg.errs = append(eg.errs, fmt.Sprintf("key %q, template %q: %s", key, text, err))
eg.Add(fmt.Errorf("key %q, template %q: %w", key, text, err))
continue
}
r[key] = buf.String()
}
return r, eg.err()
return r, eg.Err()
}
func templateAnnotation(dst io.Writer, text string, data alertTplData) error {

View File

@@ -1,13 +1,10 @@
package notifier
import (
"net/url"
"testing"
)
func TestAlert_ExecTemplate(t *testing.T) {
u, _ := url.Parse("https://victoriametrics.com/path")
InitTemplateFunc(u)
testCases := []struct {
name string
alert *Alert

View File

@@ -12,9 +12,11 @@ import (
// AlertManager represents integration provider with Prometheus alert manager
// https://github.com/prometheus/alertmanager
type AlertManager struct {
alertURL string
argFunc AlertURLGenerator
client *http.Client
alertURL string
basicAuthUser string
basicAuthPass string
argFunc AlertURLGenerator
client *http.Client
}
// Send an alert or resolve message
@@ -28,6 +30,9 @@ func (am *AlertManager) Send(ctx context.Context, alerts []Alert) error {
}
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(ctx)
if am.basicAuthPass != "" {
req.SetBasicAuth(am.basicAuthUser, am.basicAuthPass)
}
resp, err := am.client.Do(req)
if err != nil {
return err
@@ -38,7 +43,7 @@ func (am *AlertManager) Send(ctx context.Context, alerts []Alert) error {
if resp.StatusCode != http.StatusOK {
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("failed to read response from %q: %s", am.alertURL, err)
return fmt.Errorf("failed to read response from %q: %w", am.alertURL, err)
}
return fmt.Errorf("invalid SC %d from %q; response body: %s", resp.StatusCode, am.alertURL, string(body))
}
@@ -46,15 +51,18 @@ func (am *AlertManager) Send(ctx context.Context, alerts []Alert) error {
}
// AlertURLGenerator returns URL to single alert by given name
type AlertURLGenerator func(group, alert string) string
type AlertURLGenerator func(Alert) string
const alertManagerPath = "/api/v2/alerts"
// NewAlertManager is a constructor for AlertManager
func NewAlertManager(alertManagerURL string, fn AlertURLGenerator, c *http.Client) *AlertManager {
func NewAlertManager(alertManagerURL, user, pass string, fn AlertURLGenerator, c *http.Client) *AlertManager {
addr := strings.TrimSuffix(alertManagerURL, "/") + alertManagerPath
return &AlertManager{
alertURL: strings.TrimSuffix(alertManagerURL, "/") + alertManagerPath,
argFunc: fn,
client: c,
alertURL: addr,
argFunc: fn,
client: c,
basicAuthUser: user,
basicAuthPass: pass,
}
}

View File

@@ -1,15 +1,14 @@
{% import (
"strconv"
"time"
) %}
{% stripspace %}
{% func amRequest(alerts []Alert, generatorURL func(string, string) string) %}
{% func amRequest(alerts []Alert, generatorURL func(Alert) string) %}
[
{% for i, alert := range alerts %}
{
"startsAt":{%q= alert.Start.Format(time.RFC3339Nano) %},
"generatorURL": {%q= generatorURL(strconv.FormatUint(alert.GroupID, 10), strconv.FormatUint(alert.ID, 10)) %},
"startsAt":{%q= alert.Start.Format(time.RFC3339Nano) %},
"generatorURL": {%q= generatorURL(alert) %},
{% if !alert.End.IsZero() %}
"endsAt":{%q= alert.End.Format(time.RFC3339Nano) %},
{% endif %}

View File

@@ -1,131 +1,130 @@
// Code generated by qtc from "alertmanager_request.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line notifier/alertmanager_request.qtpl:1
//line app/vmalert/notifier/alertmanager_request.qtpl:1
package notifier
//line notifier/alertmanager_request.qtpl:1
//line app/vmalert/notifier/alertmanager_request.qtpl:1
import (
"strconv"
"time"
)
//line notifier/alertmanager_request.qtpl:7
//line app/vmalert/notifier/alertmanager_request.qtpl:6
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line notifier/alertmanager_request.qtpl:7
//line app/vmalert/notifier/alertmanager_request.qtpl:6
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line notifier/alertmanager_request.qtpl:7
func streamamRequest(qw422016 *qt422016.Writer, alerts []Alert, generatorURL func(string, string) string) {
//line notifier/alertmanager_request.qtpl:7
//line app/vmalert/notifier/alertmanager_request.qtpl:6
func streamamRequest(qw422016 *qt422016.Writer, alerts []Alert, generatorURL func(Alert) string) {
//line app/vmalert/notifier/alertmanager_request.qtpl:6
qw422016.N().S(`[`)
//line notifier/alertmanager_request.qtpl:9
//line app/vmalert/notifier/alertmanager_request.qtpl:8
for i, alert := range alerts {
//line notifier/alertmanager_request.qtpl:9
//line app/vmalert/notifier/alertmanager_request.qtpl:8
qw422016.N().S(`{"startsAt":`)
//line notifier/alertmanager_request.qtpl:11
//line app/vmalert/notifier/alertmanager_request.qtpl:10
qw422016.N().Q(alert.Start.Format(time.RFC3339Nano))
//line notifier/alertmanager_request.qtpl:11
//line app/vmalert/notifier/alertmanager_request.qtpl:10
qw422016.N().S(`,"generatorURL":`)
//line notifier/alertmanager_request.qtpl:12
qw422016.N().Q(generatorURL(strconv.FormatUint(alert.GroupID, 10), strconv.FormatUint(alert.ID, 10)))
//line notifier/alertmanager_request.qtpl:12
//line app/vmalert/notifier/alertmanager_request.qtpl:11
qw422016.N().Q(generatorURL(alert))
//line app/vmalert/notifier/alertmanager_request.qtpl:11
qw422016.N().S(`,`)
//line notifier/alertmanager_request.qtpl:13
//line app/vmalert/notifier/alertmanager_request.qtpl:12
if !alert.End.IsZero() {
//line notifier/alertmanager_request.qtpl:13
//line app/vmalert/notifier/alertmanager_request.qtpl:12
qw422016.N().S(`"endsAt":`)
//line notifier/alertmanager_request.qtpl:14
//line app/vmalert/notifier/alertmanager_request.qtpl:13
qw422016.N().Q(alert.End.Format(time.RFC3339Nano))
//line notifier/alertmanager_request.qtpl:14
//line app/vmalert/notifier/alertmanager_request.qtpl:13
qw422016.N().S(`,`)
//line notifier/alertmanager_request.qtpl:15
//line app/vmalert/notifier/alertmanager_request.qtpl:14
}
//line notifier/alertmanager_request.qtpl:15
//line app/vmalert/notifier/alertmanager_request.qtpl:14
qw422016.N().S(`"labels": {"alertname":`)
//line notifier/alertmanager_request.qtpl:17
//line app/vmalert/notifier/alertmanager_request.qtpl:16
qw422016.N().Q(alert.Name)
//line notifier/alertmanager_request.qtpl:18
//line app/vmalert/notifier/alertmanager_request.qtpl:17
for k, v := range alert.Labels {
//line notifier/alertmanager_request.qtpl:18
//line app/vmalert/notifier/alertmanager_request.qtpl:17
qw422016.N().S(`,`)
//line notifier/alertmanager_request.qtpl:19
//line app/vmalert/notifier/alertmanager_request.qtpl:18
qw422016.N().Q(k)
//line notifier/alertmanager_request.qtpl:19
//line app/vmalert/notifier/alertmanager_request.qtpl:18
qw422016.N().S(`:`)
//line notifier/alertmanager_request.qtpl:19
//line app/vmalert/notifier/alertmanager_request.qtpl:18
qw422016.N().Q(v)
//line notifier/alertmanager_request.qtpl:20
//line app/vmalert/notifier/alertmanager_request.qtpl:19
}
//line notifier/alertmanager_request.qtpl:20
//line app/vmalert/notifier/alertmanager_request.qtpl:19
qw422016.N().S(`},"annotations": {`)
//line notifier/alertmanager_request.qtpl:23
//line app/vmalert/notifier/alertmanager_request.qtpl:22
c := len(alert.Annotations)
//line notifier/alertmanager_request.qtpl:24
//line app/vmalert/notifier/alertmanager_request.qtpl:23
for k, v := range alert.Annotations {
//line notifier/alertmanager_request.qtpl:25
//line app/vmalert/notifier/alertmanager_request.qtpl:24
c = c - 1
//line notifier/alertmanager_request.qtpl:26
//line app/vmalert/notifier/alertmanager_request.qtpl:25
qw422016.N().Q(k)
//line notifier/alertmanager_request.qtpl:26
//line app/vmalert/notifier/alertmanager_request.qtpl:25
qw422016.N().S(`:`)
//line notifier/alertmanager_request.qtpl:26
//line app/vmalert/notifier/alertmanager_request.qtpl:25
qw422016.N().Q(v)
//line notifier/alertmanager_request.qtpl:26
//line app/vmalert/notifier/alertmanager_request.qtpl:25
if c > 0 {
//line notifier/alertmanager_request.qtpl:26
//line app/vmalert/notifier/alertmanager_request.qtpl:25
qw422016.N().S(`,`)
//line notifier/alertmanager_request.qtpl:26
//line app/vmalert/notifier/alertmanager_request.qtpl:25
}
//line notifier/alertmanager_request.qtpl:27
//line app/vmalert/notifier/alertmanager_request.qtpl:26
}
//line notifier/alertmanager_request.qtpl:27
//line app/vmalert/notifier/alertmanager_request.qtpl:26
qw422016.N().S(`}}`)
//line notifier/alertmanager_request.qtpl:30
//line app/vmalert/notifier/alertmanager_request.qtpl:29
if i != len(alerts)-1 {
//line notifier/alertmanager_request.qtpl:30
//line app/vmalert/notifier/alertmanager_request.qtpl:29
qw422016.N().S(`,`)
//line notifier/alertmanager_request.qtpl:30
//line app/vmalert/notifier/alertmanager_request.qtpl:29
}
//line notifier/alertmanager_request.qtpl:31
//line app/vmalert/notifier/alertmanager_request.qtpl:30
}
//line notifier/alertmanager_request.qtpl:31
//line app/vmalert/notifier/alertmanager_request.qtpl:30
qw422016.N().S(`]`)
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
}
//line notifier/alertmanager_request.qtpl:33
func writeamRequest(qq422016 qtio422016.Writer, alerts []Alert, generatorURL func(string, string) string) {
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
func writeamRequest(qq422016 qtio422016.Writer, alerts []Alert, generatorURL func(Alert) string) {
//line app/vmalert/notifier/alertmanager_request.qtpl:32
qw422016 := qt422016.AcquireWriter(qq422016)
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
streamamRequest(qw422016, alerts, generatorURL)
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
qt422016.ReleaseWriter(qw422016)
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
}
//line notifier/alertmanager_request.qtpl:33
func amRequest(alerts []Alert, generatorURL func(string, string) string) string {
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
func amRequest(alerts []Alert, generatorURL func(Alert) string) string {
//line app/vmalert/notifier/alertmanager_request.qtpl:32
qb422016 := qt422016.AcquireByteBuffer()
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
writeamRequest(qb422016, alerts, generatorURL)
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
qs422016 := string(qb422016.B)
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
qt422016.ReleaseByteBuffer(qb422016)
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
return qs422016
//line notifier/alertmanager_request.qtpl:33
//line app/vmalert/notifier/alertmanager_request.qtpl:32
}

View File

@@ -5,17 +5,27 @@ import (
"encoding/json"
"net/http"
"net/http/httptest"
"strconv"
"testing"
"time"
)
func TestAlertManager_Send(t *testing.T) {
const baUser, baPass = "foo", "bar"
mux := http.NewServeMux()
mux.HandleFunc("/", func(_ http.ResponseWriter, _ *http.Request) {
t.Errorf("should not be called")
})
c := -1
mux.HandleFunc(alertManagerPath, func(w http.ResponseWriter, r *http.Request) {
user, pass, ok := r.BasicAuth()
if !ok {
t.Errorf("unauthorized request")
}
if user != baUser || pass != baPass {
t.Errorf("wrong creds %q:%q; expected %q:%q",
user, pass, baUser, baPass)
}
c++
if r.Method != http.MethodPost {
t.Errorf("expected POST method got %s", r.Method)
@@ -42,23 +52,23 @@ func TestAlertManager_Send(t *testing.T) {
t.Errorf("expected 1 alert in array got %d", len(a))
}
if a[0].GeneratorURL != "0/0" {
t.Errorf("exptected 0/0 as generatorURL got %s", a[0].GeneratorURL)
t.Errorf("expected 0/0 as generatorURL got %s", a[0].GeneratorURL)
}
if a[0].Labels["alertname"] != "alert0" {
t.Errorf("exptected alert0 as alert name got %s", a[0].Labels["alertname"])
t.Errorf("expected alert0 as alert name got %s", a[0].Labels["alertname"])
}
if a[0].StartsAt.IsZero() {
t.Errorf("exptected non-zero start time")
t.Errorf("expected non-zero start time")
}
if a[0].EndAt.IsZero() {
t.Errorf("exptected non-zero end time")
t.Errorf("expected non-zero end time")
}
}
})
srv := httptest.NewServer(mux)
defer srv.Close()
am := NewAlertManager(srv.URL, func(group, name string) string {
return group + "/" + name
am := NewAlertManager(srv.URL, baUser, baPass, func(alert Alert) string {
return strconv.FormatUint(alert.GroupID, 10) + "/" + strconv.FormatUint(alert.ID, 10)
}, srv.Client())
if err := am.Send(context.Background(), []Alert{{}, {}}); err == nil {
t.Error("expected connection error got nil")

View File

@@ -0,0 +1,46 @@
package notifier
import (
"flag"
"fmt"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
)
var (
addrs = flagutil.NewArray("notifier.url", "Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093")
basicAuthUsername = flagutil.NewArray("notifier.basicAuth.username", "Optional basic auth username for -datasource.url")
basicAuthPassword = flagutil.NewArray("notifier.basicAuth.password", "Optional basic auth password for -datasource.url")
tlsInsecureSkipVerify = flag.Bool("notifier.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -notifier.url")
tlsCertFile = flagutil.NewArray("notifier.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting to -notifier.url")
tlsKeyFile = flagutil.NewArray("notifier.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to -notifier.url")
tlsCAFile = flagutil.NewArray("notifier.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to -notifier.url. "+
"By default system CA is used")
tlsServerName = flagutil.NewArray("notifier.tlsServerName", "Optional TLS server name to use for connections to -notifier.url. "+
"By default the server name from -notifier.url is used")
)
// Init creates a Notifier object based on provided flags.
func Init(gen AlertURLGenerator) ([]Notifier, error) {
if len(*addrs) == 0 {
return nil, fmt.Errorf("at least one `-notifier.url` must be set")
}
var notifiers []Notifier
for i, addr := range *addrs {
cert, key := tlsCertFile.GetOptionalArg(i), tlsKeyFile.GetOptionalArg(i)
ca, serverName := tlsCAFile.GetOptionalArg(i), tlsServerName.GetOptionalArg(i)
tr, err := utils.Transport(addr, cert, key, ca, serverName, *tlsInsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
user, pass := basicAuthUsername.GetOptionalArg(i), basicAuthPassword.GetOptionalArg(i)
am := NewAlertManager(addr, user, pass, gen, &http.Client{Transport: tr})
notifiers = append(notifiers, am)
}
return notifiers, nil
}

View File

@@ -0,0 +1,13 @@
package notifier
import (
"net/url"
"os"
"testing"
)
func TestMain(m *testing.M) {
u, _ := url.Parse("https://victoriametrics.com/path")
InitTemplateFunc(u)
os.Exit(m.Run())
}

View File

@@ -1,21 +0,0 @@
package notifier
import (
"fmt"
"strings"
)
type errGroup struct {
errs []string
}
func (eg *errGroup) err() error {
if eg == nil || len(eg.errs) == 0 {
return nil
}
return eg
}
func (eg *errGroup) Error() string {
return fmt.Sprintf("errors: %s", strings.Join(eg.errs, "\n"))
}

View File

@@ -12,6 +12,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
)
// RecordingRule is a Rule that supposed
@@ -32,6 +33,12 @@ type RecordingRule struct {
// resets on every successful Exec
// may be used as Health state
lastExecError error
metrics *recordingRuleMetrics
}
type recordingRuleMetrics struct {
errors *gauge
}
// String implements Stringer interface
@@ -45,14 +52,31 @@ func (rr *RecordingRule) ID() uint64 {
return rr.RuleID
}
func newRecordingRule(gID uint64, cfg config.Rule) *RecordingRule {
return &RecordingRule{
func newRecordingRule(group *Group, cfg config.Rule) *RecordingRule {
rr := &RecordingRule{
RuleID: cfg.ID,
Name: cfg.Record,
Expr: cfg.Expr,
Labels: cfg.Labels,
GroupID: gID,
GroupID: group.ID(),
metrics: &recordingRuleMetrics{},
}
labels := fmt.Sprintf(`recording=%q, group=%q, id="%d"`, rr.Name, group.Name, rr.ID())
rr.metrics.errors = getOrCreateGauge(fmt.Sprintf(`vmalert_recording_rules_error{%s}`, labels),
func() float64 {
rr.mu.Lock()
defer rr.mu.Unlock()
if rr.lastExecError == nil {
return 0
}
return 1
})
return rr
}
// Close unregisters rule metrics
func (rr *RecordingRule) Close() {
metrics.UnregisterMetric(rr.metrics.errors.name)
}
var errDuplicate = errors.New("result contains metrics with the same labelset after applying rule labels")
@@ -71,7 +95,7 @@ func (rr *RecordingRule) Exec(ctx context.Context, q datasource.Querier, series
rr.lastExecTime = time.Now()
rr.lastExecError = err
if err != nil {
return nil, fmt.Errorf("failed to execute query %q: %s", rr.Expr, err)
return nil, fmt.Errorf("failed to execute query %q: %w", rr.Expr, err)
}
duplicates := make(map[uint64]prompbmarshal.TimeSeries, len(qMetrics))

View File

@@ -0,0 +1,39 @@
package remoteread
import (
"flag"
"fmt"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
)
var (
addr = flag.String("remoteRead.url", "", "Optional URL to Victoria Metrics or VMSelect that will be used to restore alerts"+
" state. This configuration makes sense only if `vmalert` was configured with `remoteWrite.url` before and has been successfully persisted its state."+
" E.g. http://127.0.0.1:8428")
basicAuthUsername = flag.String("remoteRead.basicAuth.username", "", "Optional basic auth username for -remoteRead.url")
basicAuthPassword = flag.String("remoteRead.basicAuth.password", "", "Optional basic auth password for -remoteRead.url")
tlsInsecureSkipVerify = flag.Bool("remoteRead.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -remoteRead.url")
tlsCertFile = flag.String("remoteRead.tlsCertFile", "", "Optional path to client-side TLS certificate file to use when connecting to -remoteRead.url")
tlsKeyFile = flag.String("remoteRead.tlsKeyFile", "", "Optional path to client-side TLS certificate key to use when connecting to -remoteRead.url")
tlsCAFile = flag.String("remoteRead.tlsCAFile", "", "Optional path to TLS CA file to use for verifying connections to -remoteRead.url. "+
"By default system CA is used")
tlsServerName = flag.String("remoteRead.tlsServerName", "", "Optional TLS server name to use for connections to -remoteRead.url. "+
"By default the server name from -remoteRead.url is used")
)
// Init creates a Querier from provided flag values.
// Returns nil if addr flag wasn't set.
func Init() (datasource.Querier, error) {
if *addr == "" {
return nil, nil
}
tr, err := utils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
c := &http.Client{Transport: tr}
return datasource.NewVMStorage(*addr, *basicAuthUsername, *basicAuthPassword, 0, c), nil
}

View File

@@ -0,0 +1,54 @@
package remotewrite
import (
"context"
"flag"
"fmt"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
)
var (
addr = flag.String("remoteWrite.url", "", "Optional URL to Victoria Metrics or VMInsert where to persist alerts state"+
" and recording rules results in form of timeseries. E.g. http://127.0.0.1:8428")
basicAuthUsername = flag.String("remoteWrite.basicAuth.username", "", "Optional basic auth username for -remoteWrite.url")
basicAuthPassword = flag.String("remoteWrite.basicAuth.password", "", "Optional basic auth password for -remoteWrite.url")
maxQueueSize = flag.Int("remoteWrite.maxQueueSize", 1e5, "Defines the max number of pending datapoints to remote write endpoint")
maxBatchSize = flag.Int("remoteWrite.maxBatchSize", 1e3, "Defines defines max number of timeseries to be flushed at once")
concurrency = flag.Int("remoteWrite.concurrency", 1, "Defines number of writers for concurrent writing into remote querier")
flushInterval = flag.Duration("remoteWrite.flushInterval", 5*time.Second, "Defines interval of flushes to remote write endpoint")
tlsInsecureSkipVerify = flag.Bool("remoteWrite.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -remoteWrite.url")
tlsCertFile = flag.String("remoteWrite.tlsCertFile", "", "Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url")
tlsKeyFile = flag.String("remoteWrite.tlsKeyFile", "", "Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url")
tlsCAFile = flag.String("remoteWrite.tlsCAFile", "", "Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. "+
"By default system CA is used")
tlsServerName = flag.String("remoteWrite.tlsServerName", "", "Optional TLS server name to use for connections to -remoteWrite.url. "+
"By default the server name from -remoteWrite.url is used")
)
// Init creates Client object from given flags.
// Returns nil if addr flag wasn't set.
func Init(ctx context.Context) (*Client, error) {
if *addr == "" {
return nil, nil
}
t, err := utils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
return NewClient(ctx, Config{
Addr: *addr,
Concurrency: *concurrency,
MaxQueueSize: *maxQueueSize,
MaxBatchSize: *maxBatchSize,
FlushInterval: *flushInterval,
BasicAuthUser: *basicAuthUsername,
BasicAuthPass: *basicAuthPassword,
Transport: t,
})
}

View File

@@ -12,6 +12,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
"github.com/golang/snappy"
)
@@ -53,13 +54,15 @@ type Config struct {
// WriteTimeout defines timeout for HTTP write request
// to remote storage
WriteTimeout time.Duration
// Transport will be used by the underlying http.Client
Transport *http.Transport
}
const (
defaultConcurrency = 4
defaultMaxBatchSize = 1e3
defaultMaxQueueSize = 1e5
defaultFlushInterval = time.Second
defaultFlushInterval = 5 * time.Second
defaultWriteTimeout = 30 * time.Second
)
@@ -83,15 +86,20 @@ func NewClient(ctx context.Context, cfg Config) (*Client, error) {
if cfg.WriteTimeout == 0 {
cfg.WriteTimeout = defaultWriteTimeout
}
if cfg.Transport == nil {
cfg.Transport = http.DefaultTransport.(*http.Transport).Clone()
}
c := &Client{
c: &http.Client{
Timeout: cfg.WriteTimeout,
Timeout: cfg.WriteTimeout,
Transport: cfg.Transport,
},
addr: strings.TrimSuffix(cfg.Addr, "/") + writePath,
baUser: cfg.BasicAuthUser,
baPass: cfg.BasicAuthPass,
flushInterval: cfg.FlushInterval,
maxBatchSize: cfg.MaxBatchSize,
maxQueueSize: cfg.MaxQueueSize,
doneCh: make(chan struct{}),
input: make(chan prompbmarshal.TimeSeries, cfg.MaxQueueSize),
}
@@ -134,14 +142,11 @@ func (c *Client) Close() error {
func (c *Client) run(ctx context.Context) {
ticker := time.NewTicker(c.flushInterval)
wr := prompbmarshal.WriteRequest{}
wr := &prompbmarshal.WriteRequest{}
shutdown := func() {
for ts := range c.input {
wr.Timeseries = append(wr.Timeseries, ts)
}
if len(wr.Timeseries) < 1 {
return
}
lastCtx, cancel := context.WithTimeout(context.Background(), defaultWriteTimeout)
c.flush(lastCtx, wr)
cancel()
@@ -160,44 +165,82 @@ func (c *Client) run(ctx context.Context) {
return
case <-ticker.C:
c.flush(ctx, wr)
wr = prompbmarshal.WriteRequest{}
case ts := <-c.input:
case ts, ok := <-c.input:
if !ok {
continue
}
wr.Timeseries = append(wr.Timeseries, ts)
if len(wr.Timeseries) >= c.maxBatchSize {
c.flush(ctx, wr)
wr = prompbmarshal.WriteRequest{}
}
}
}
}()
}
func (c *Client) flush(ctx context.Context, wr prompbmarshal.WriteRequest) {
var (
sentRows = metrics.NewCounter(`vmalert_remotewrite_sent_rows_total`)
sentBytes = metrics.NewCounter(`vmalert_remotewrite_sent_bytes_total`)
droppedRows = metrics.NewCounter(`vmalert_remotewrite_dropped_rows_total`)
droppedBytes = metrics.NewCounter(`vmalert_remotewrite_dropped_bytes_total`)
)
// flush is a blocking function that marshals WriteRequest and sends
// it to remote write endpoint. Flush performs limited amount of retries
// if request fails.
func (c *Client) flush(ctx context.Context, wr *prompbmarshal.WriteRequest) {
if len(wr.Timeseries) < 1 {
return
}
defer prompbmarshal.ResetWriteRequest(wr)
data, err := wr.Marshal()
if err != nil {
logger.Errorf("failed to marshal WriteRequest: %s", err)
return
}
req, err := http.NewRequest("POST", c.addr, bytes.NewReader(snappy.Encode(nil, data)))
const attempts = 5
b := snappy.Encode(nil, data)
for i := 0; i < attempts; i++ {
err := c.send(ctx, b)
if err == nil {
sentRows.Add(len(wr.Timeseries))
sentBytes.Add(len(b))
return
}
logger.Errorf("attempt %d to send request failed: %s", i+1, err)
// sleeping to avoid remote db hammering
time.Sleep(time.Second)
continue
}
droppedRows.Add(len(wr.Timeseries))
droppedBytes.Add(len(b))
logger.Errorf("all %d attempts to send request failed - dropping %d timeseries",
attempts, len(wr.Timeseries))
}
func (c *Client) send(ctx context.Context, data []byte) error {
r := bytes.NewReader(data)
req, err := http.NewRequest("POST", c.addr, r)
if err != nil {
logger.Errorf("failed to create new HTTP request: %s", err)
return
return fmt.Errorf("failed to create new HTTP request: %w", err)
}
if c.baPass != "" {
req.SetBasicAuth(c.baUser, c.baPass)
}
resp, err := c.c.Do(req.WithContext(ctx))
if err != nil {
logger.Errorf("error getting response from %s:%s", req.URL, err)
return
return fmt.Errorf("error while sending request to %s: %w; Data len %d(%d)",
req.URL, err, len(data), r.Size())
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusNoContent {
body, _ := ioutil.ReadAll(resp.Body)
logger.Errorf("unexpected response code %d for %s. Response body %s", resp.StatusCode, req.URL, body)
return
return fmt.Errorf("unexpected response code %d for %s. Response body %q",
resp.StatusCode, req.URL, body)
}
return nil
}

View File

@@ -0,0 +1,102 @@
package remotewrite
import (
"context"
"fmt"
"io/ioutil"
"math/rand"
"net/http"
"net/http/httptest"
"sync/atomic"
"testing"
"time"
"github.com/golang/snappy"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
func TestClient_Push(t *testing.T) {
testSrv := newRWServer()
cfg := Config{
Addr: testSrv.URL,
MaxBatchSize: 100,
}
client, err := NewClient(context.Background(), cfg)
if err != nil {
t.Fatalf("failed to create client: %s", err)
}
const rowsN = 1e4
var sent int
for i := 0; i < rowsN; i++ {
s := prompbmarshal.TimeSeries{
Samples: []prompbmarshal.Sample{{
Value: rand.Float64(),
Timestamp: time.Now().Unix(),
}},
}
err := client.Push(s)
if err == nil {
sent++
}
}
if sent == 0 {
t.Fatalf("0 series sent")
}
if err := client.Close(); err != nil {
t.Fatalf("failed to close client: %s", err)
}
got := testSrv.accepted()
if got != sent {
t.Fatalf("expected to have %d series; got %d", sent, got)
}
}
func newRWServer() *rwServer {
rw := &rwServer{}
rw.Server = httptest.NewServer(http.HandlerFunc(rw.handler))
return rw
}
type rwServer struct {
// WARN: ordering of fields is important for alignment!
// see https://golang.org/pkg/sync/atomic/#pkg-note-BUG
acceptedRows uint64
*httptest.Server
}
func (rw *rwServer) accepted() int {
return int(atomic.LoadUint64(&rw.acceptedRows))
}
func (rw *rwServer) err(w http.ResponseWriter, err error) {
w.WriteHeader(http.StatusBadRequest)
w.Write([]byte(err.Error()))
}
func (rw *rwServer) handler(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
rw.err(w, fmt.Errorf("bad method %q", r.Method))
return
}
data, err := ioutil.ReadAll(r.Body)
if err != nil {
rw.err(w, fmt.Errorf("body read err: %w", err))
return
}
defer func() { _ = r.Body.Close() }()
b, err := snappy.Decode(nil, data)
if err != nil {
rw.err(w, fmt.Errorf("decode err: %w", err))
return
}
wr := &prompb.WriteRequest{}
if err := wr.Unmarshal(b); err != nil {
rw.err(w, fmt.Errorf("unmarhsal err: %w", err))
return
}
atomic.AddUint64(&rw.acceptedRows, uint64(len(wr.Timeseries)))
w.WriteHeader(http.StatusNoContent)
}

View File

@@ -21,4 +21,7 @@ type Rule interface {
// UpdateWith performs modification of current Rule
// with fields of the given Rule.
UpdateWith(Rule) error
// Close performs the shutdown procedures for rule
// such as metrics unregister
Close()
}

View File

@@ -1,9 +1,10 @@
package main
import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"sort"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
func newTimeSeries(value float64, labels map[string]string, timestamp time.Time) prompbmarshal.TimeSeries {

View File

@@ -0,0 +1,43 @@
package utils
import (
"fmt"
"strings"
)
// ErrGroup accumulates multiple errors
// and produces single error message.
type ErrGroup struct {
errs []error
}
// Add adds a new error to group.
// Isn't thread-safe.
func (eg *ErrGroup) Add(err error) {
eg.errs = append(eg.errs, err)
}
// Err checks if group contains at least
// one error.
func (eg *ErrGroup) Err() error {
if eg == nil || len(eg.errs) == 0 {
return nil
}
return eg
}
// Error satisfies Error interface
func (eg *ErrGroup) Error() string {
if len(eg.errs) == 0 {
return ""
}
var b strings.Builder
fmt.Fprintf(&b, "errors(%d): ", len(eg.errs))
for i, err := range eg.errs {
b.WriteString(err.Error())
if i != len(eg.errs)-1 {
b.WriteString("\n")
}
}
return b.String()
}

View File

@@ -0,0 +1,38 @@
package utils
import (
"errors"
"testing"
)
func TestErrGroup(t *testing.T) {
testCases := []struct {
errs []error
exp string
}{
{nil, ""},
{[]error{errors.New("timeout")}, "errors(1): timeout"},
{
[]error{errors.New("timeout"), errors.New("deadline")},
"errors(2): timeout\ndeadline",
},
}
for _, tc := range testCases {
eg := new(ErrGroup)
for _, err := range tc.errs {
eg.Add(err)
}
if len(tc.errs) == 0 {
if eg.Err() != nil {
t.Fatalf("expected to get nil error")
}
continue
}
if eg.Err() == nil {
t.Fatalf("expected to get non-nil error")
}
if eg.Error() != tc.exp {
t.Fatalf("expected to have: \n%q\ngot:\n%q", tc.exp, eg.Error())
}
}
}

58
app/vmalert/utils/tls.go Normal file
View File

@@ -0,0 +1,58 @@
package utils
import (
"crypto/tls"
"crypto/x509"
"fmt"
"io/ioutil"
"net/http"
"strings"
)
// Transport creates http.Transport object based on provided URL.
// Returns Transport with TLS configuration if URL contains `https` prefix
func Transport(URL, certFile, keyFile, CAFile, serverName string, insecureSkipVerify bool) (*http.Transport, error) {
t := http.DefaultTransport.(*http.Transport).Clone()
if !strings.HasPrefix(URL, "https") {
return t, nil
}
tlsCfg, err := TLSConfig(certFile, keyFile, CAFile, serverName, insecureSkipVerify)
if err != nil {
return nil, err
}
t.TLSClientConfig = tlsCfg
return t, nil
}
// TLSConfig creates tls.Config object from provided arguments
func TLSConfig(certFile, keyFile, CAFile, serverName string, insecureSkipVerify bool) (*tls.Config, error) {
var certs []tls.Certificate
if certFile != "" {
cert, err := tls.LoadX509KeyPair(certFile, keyFile)
if err != nil {
return nil, fmt.Errorf("cannot load TLS certificate from `cert_file`=%q, `key_file`=%q: %w", certFile, keyFile, err)
}
certs = []tls.Certificate{cert}
}
var rootCAs *x509.CertPool
if CAFile != "" {
pem, err := ioutil.ReadFile(CAFile)
if err != nil {
return nil, fmt.Errorf("cannot read `ca_file` %q: %w", CAFile, err)
}
rootCAs = x509.NewCertPool()
if !rootCAs.AppendCertsFromPEM(pem) {
return nil, fmt.Errorf("cannot parse data from `ca_file` %q", CAFile)
}
}
return &tls.Config{
Certificates: certs,
InsecureSkipVerify: insecureSkipVerify,
RootCAs: rootCAs,
ServerName: serverName,
}, nil
}

View File

@@ -0,0 +1,52 @@
package utils
import "testing"
func TestTLSConfig(t *testing.T) {
var certFile, keyFile, CAFile, serverName string
var insecureSkipVerify bool
serverName = "test"
insecureSkipVerify = true
tlsCfg, err := TLSConfig(certFile, keyFile, CAFile, serverName, insecureSkipVerify)
if err != nil {
t.Errorf("unexpected error %s", err)
}
if tlsCfg == nil {
t.Errorf("expected tlsConfig to be set, got nil")
}
if tlsCfg.ServerName != serverName {
t.Errorf("unexpected ServerName, want %s, got %s", serverName, tlsCfg.ServerName)
}
if tlsCfg.InsecureSkipVerify != insecureSkipVerify {
t.Errorf("unexpected InsecureSkipVerify, want %v, got %v", insecureSkipVerify, tlsCfg.InsecureSkipVerify)
}
certFile = "/path/to/nonexisting/cert/file"
_, err = TLSConfig(certFile, keyFile, CAFile, serverName, insecureSkipVerify)
if err == nil {
t.Errorf("expected keypair error, got nil")
}
certFile = ""
CAFile = "/path/to/nonexisting/cert/file"
_, err = TLSConfig(certFile, keyFile, CAFile, serverName, insecureSkipVerify)
if err == nil {
t.Errorf("expected read error, got nil")
}
}
func TestTransport(t *testing.T) {
var certFile, keyFile, CAFile, serverName string
var insecureSkipVerify bool
URL := "http://victoriametrics.com"
_, err := Transport(URL, certFile, keyFile, CAFile, serverName, insecureSkipVerify)
if err != nil {
t.Errorf("unexpected error %s", err)
}
URL = "https://victoriametrics.com"
tr, err := Transport(URL, certFile, keyFile, CAFile, serverName, insecureSkipVerify)
if err != nil {
t.Errorf("unexpected error %s", err)
}
if tr.TLSClientConfig == nil {
t.Errorf("expected TLSClientConfig to be set, got nil")
}
}

View File

@@ -27,7 +27,6 @@ var pathList = [][]string{
}
func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
resph := responseHandler{w}
switch r.URL.Path {
case "/":
for _, path := range pathList {
@@ -36,10 +35,22 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
}
return true
case "/api/v1/groups":
resph.handle(rh.listGroups())
data, err := rh.listGroups()
if err != nil {
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
case "/api/v1/alerts":
resph.handle(rh.listAlerts())
data, err := rh.listAlerts()
if err != nil {
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
case "/-/reload":
logger.Infof("api config reload was called, sending sighup")
@@ -47,12 +58,18 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
w.WriteHeader(http.StatusOK)
return true
default:
if !strings.HasSuffix(r.URL.Path, "/status") {
return false
}
// /api/v1/<groupName>/<alertID>/status
if strings.HasSuffix(r.URL.Path, "/status") {
resph.handle(rh.alert(r.URL.Path))
data, err := rh.alert(r.URL.Path)
if err != nil {
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
return false
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
}
}
@@ -80,7 +97,7 @@ func (rh *requestHandler) listGroups() ([]byte, error) {
b, err := json.Marshal(lr)
if err != nil {
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf(`error encoding list of active alerts: %s`, err),
Err: fmt.Errorf(`error encoding list of active alerts: %w`, err),
StatusCode: http.StatusInternalServerError,
}
}
@@ -117,7 +134,7 @@ func (rh *requestHandler) listAlerts() ([]byte, error) {
b, err := json.Marshal(lr)
if err != nil {
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf(`error encoding list of active alerts: %s`, err),
Err: fmt.Errorf(`error encoding list of active alerts: %w`, err),
StatusCode: http.StatusInternalServerError,
}
}
@@ -138,11 +155,11 @@ func (rh *requestHandler) alert(path string) ([]byte, error) {
groupID, err := uint64FromPath(parts[0])
if err != nil {
return nil, badRequest(fmt.Errorf(`cannot parse groupID: %s`, err))
return nil, badRequest(fmt.Errorf(`cannot parse groupID: %w`, err))
}
alertID, err := uint64FromPath(parts[1])
if err != nil {
return nil, badRequest(fmt.Errorf(`cannot parse alertID: %s`, err))
return nil, badRequest(fmt.Errorf(`cannot parse alertID: %w`, err))
}
resp, err := rh.m.AlertAPI(groupID, alertID)
if err != nil {
@@ -151,18 +168,6 @@ func (rh *requestHandler) alert(path string) ([]byte, error) {
return json.Marshal(resp)
}
// responseHandler wrapper on http.ResponseWriter with sugar
type responseHandler struct{ http.ResponseWriter }
func (w responseHandler) handle(b []byte, err error) {
if err != nil {
httpserver.Errorf(w, "%s", err)
return
}
w.Header().Set("Content-Type", "application/json")
w.Write(b)
}
func uint64FromPath(path string) (uint64, error) {
s := strings.TrimRight(path, "/")
return strconv.ParseUint(s, 10, 0)

View File

@@ -58,19 +58,22 @@ run-vmauth:
$(MAKE) run-via-docker
vmauth-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmauth-amd64 ./app/vmauth
CGO_ENABLED=1 GOARCH=amd64 $(MAKE) vmauth-local-with-goarch
vmauth-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmauth-arm ./app/vmauth
CGO_ENABLED=0 GOARCH=arm $(MAKE) vmauth-local-with-goarch
vmauth-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmauth-arm64 ./app/vmauth
CGO_ENABLED=0 GOARCH=arm64 $(MAKE) vmauth-local-with-goarch
vmauth-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmauth-ppc64le ./app/vmauth
CGO_ENABLED=0 GOARCH=ppc64le $(MAKE) vmauth-local-with-goarch
vmauth-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmauth-386 ./app/vmauth
CGO_ENABLED=0 GOARCH=386 $(MAKE) vmauth-local-with-goarch
vmauth-local-with-goarch:
APP_NAME=vmauth $(MAKE) app-local-with-goarch
vmauth-pure:
APP_NAME=vmauth $(MAKE) app-local-pure

View File

@@ -64,6 +64,9 @@ users:
url_prefix: "http://vminsert:8480/insert/42/prometheus"
```
The config may contain `%{ENV_VAR}` placeholders, which are substituted by the corresponding `ENV_VAR` environment variable values.
This may be useful for passing secrets to the config.
### Security
@@ -110,11 +113,11 @@ Run `make package-vmauth`. It builds `victoriametrics/vmauth:<PKG_TAG>` docker i
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmauth`.
By default the image is built on top of `scratch` image. It is possible to build the package on top of any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of `alpine:3.11` image:
The base docker image is [alpine](https://hub.docker.com/_/alpine) but it is possible to use any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of [scratch](https://hub.docker.com/_/scratch) image:
```bash
ROOT_IMAGE=alpine:3.11 make package-vmauth
ROOT_IMAGE=scratch make package-vmauth
```
@@ -137,3 +140,68 @@ curl -s http://<vmauth-host>:8427/debug/pprof/profile > cpu.pprof
The command for collecting CPU profile waits for 30 seconds before returning.
The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
### Advanced usage
Pass `-help` command-line arg to `vmauth` in order to see all the configuration options:
```
./vmauth -help
vmauth authenticates and authorizes incoming requests and proxies them to VictoriaMetrics.
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/README.md .
-auth.config string
Path to auth config. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/README.md for details on the format of this auth config
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help spreading incoming load among a cluster of services behind load balancer. Note that the real timeout may be bigger by up to 10% as a protection from Thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses for saving CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for graceful shutdown of HTTP server. Highly loaded server may require increased value for graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this dealy the servier returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections (default ":8427")
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, then the remaining errors are suppressed. Zero value disables the rate limit (default 10)
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-memory.allowedBytes value
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to non-zero value. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage
Supports the following optional suffixes for values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics. It overrides httpAuth settings
-pprofAuthKey string
Auth key for /debug/pprof. It overrides httpAuth settings
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs, since RSA certs are slow
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set
-version
Show VictoriaMetrics version
```

View File

@@ -9,6 +9,7 @@ import (
"sync"
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envtemplate"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/metrics"
@@ -82,20 +83,21 @@ var stopCh chan struct{}
func readAuthConfig(path string) (map[string]*UserInfo, error) {
data, err := ioutil.ReadFile(path)
if err != nil {
return nil, fmt.Errorf("cannot read %q: %s", path, err)
return nil, fmt.Errorf("cannot read %q: %w", path, err)
}
m, err := parseAuthConfig(data)
if err != nil {
return nil, fmt.Errorf("cannot parse %q: %s", path, err)
return nil, fmt.Errorf("cannot parse %q: %w", path, err)
}
logger.Infof("Loaded information about %d users from %q", len(m), path)
return m, nil
}
func parseAuthConfig(data []byte) (map[string]*UserInfo, error) {
data = envtemplate.Replace(data)
var ac AuthConfig
if err := yaml.UnmarshalStrict(data, &ac); err != nil {
return nil, fmt.Errorf("cannot unmarshal AuthConfig data: %s", err)
return nil, fmt.Errorf("cannot unmarshal AuthConfig data: %w", err)
}
uis := ac.Users
if len(uis) == 0 {
@@ -115,7 +117,7 @@ func parseAuthConfig(data []byte) (map[string]*UserInfo, error) {
// Validate urlPrefix
target, err := url.Parse(urlPrefix)
if err != nil {
return nil, fmt.Errorf("invalid `url_prefix: %q`: %s", urlPrefix, err)
return nil, fmt.Errorf("invalid `url_prefix: %q`: %w", urlPrefix, err)
}
if target.Scheme != "http" && target.Scheme != "https" {
return nil, fmt.Errorf("unsupported scheme for `url_prefix: %q`: %q; must be `http` or `https`", urlPrefix, target.Scheme)

View File

@@ -10,6 +10,7 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -27,6 +28,7 @@ func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
cgroup.UpdateGOMAXPROCSToCPUQuota()
logger.Infof("starting vmauth at %q...", *httpListenAddr)
startTime := time.Now()
initAuthConfig()
@@ -49,20 +51,21 @@ func main() {
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
username, password, ok := r.BasicAuth()
if !ok {
httpserver.Errorf(w, "Missing `Authorization: Basic *` header")
w.Header().Set("WWW-Authenticate", `Basic realm="Restricted"`)
http.Error(w, "missing `Authorization: Basic *` header", http.StatusUnauthorized)
return true
}
ac := authConfig.Load().(map[string]*UserInfo)
info := ac[username]
if info == nil || info.Password != password {
httpserver.Errorf(w, "Cannot find the provided username %q or password in config", username)
httpserver.Errorf(w, r, "cannot find the provided username %q or password in config", username)
return true
}
info.requests.Inc()
targetURL := createTargetURL(info.URLPrefix, r.URL)
if _, err := url.Parse(targetURL); err != nil {
httpserver.Errorf(w, "Invalid targetURL=%q: %s", targetURL, err)
httpserver.Errorf(w, r, "invalid targetURL=%q: %s", targetURL, err)
return true
}
r.Header.Set("vm-target-url", targetURL)

View File

@@ -51,20 +51,23 @@ package-vmbackup-386:
publish-vmbackup:
APP_NAME=vmbackup $(MAKE) publish-via-docker
vmbackup-pure:
APP_NAME=vmbackup $(MAKE) app-local-pure
vmbackup-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-amd64 ./app/vmbackup
CGO_ENABLED=1 GOARCH=amd64 $(MAKE) vmbackup-local-with-goarch
vmbackup-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-arm ./app/vmbackup
CGO_ENABLED=0 GOARCH=arm $(MAKE) vmbackup-local-with-goarch
vmbackup-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-arm64 ./app/vmbackup
CGO_ENABLED=0 GOARCH=arm64 $(MAKE) vmbackup-local-with-goarch
vmbackup-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-ppc64le ./app/vmbackup
CGO_ENABLED=0 GOARCH=ppc64le $(MAKE) vmbackup-local-with-goarch
vmbackup-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-386 ./app/vmbackup
CGO_ENABLED=0 GOARCH=386 $(MAKE) vmbackup-local-with-goarch
vmbackup-local-with-goarch:
APP_NAME=vmbackup $(MAKE) app-local-with-goarch
vmbackup-pure:
APP_NAME=vmbackup $(MAKE) app-local-pure

View File

@@ -6,12 +6,12 @@ Supported storage systems for backups:
* [GCS](https://cloud.google.com/storage/). Example: `gcs://<bucket>/<path/to/backup>`
* [S3](https://aws.amazon.com/s3/). Example: `s3://<bucket>/<path/to/backup>`
* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio), [Ceph](https://docs.ceph.com/docs/mimic/radosgw/s3/) or [Swift](https://www.swiftstack.com/docs/admin/middleware/s3_middleware.html). See `-customS3Endpoint` command-line flag.
* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio), [Ceph](https://docs.ceph.com/docs/mimic/radosgw/s3/) or [Swift](https://www.swiftstack.com/docs/admin/middleware/s3_middleware.html). See [these docs](#advanced-usage) for details.
* Local filesystem. Example: `fs://</absolute/path/to/backup>`
Incremental backups and full backups are supported. Incremental backups are created automatically if the destination path already contains data from the previous backup.
`vmbackup` supports incremental and full backups. Incremental backups created automatically if the destination path already contains data from the previous backup.
Full backups can be sped up with `-origin` pointing to already existing backup on the same remote storage. In this case `vmbackup` makes server-side copy for the shared
data between the existing backup and new backup. This saves time and costs on data transfer.
data between the existing backup and new backup. It saves time and costs on data transfer.
Backup process can be interrupted at any time. It is automatically resumed from the interruption point when restarting `vmbackup` with the same args.
@@ -35,8 +35,8 @@ vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-
* `</path/to/victoria-metrics-data>` - path to VictoriaMetrics data pointed by `-storageDataPath` command-line flag in single-node VictoriaMetrics or in cluster `vmstorage`.
There is no need to stop VictoriaMetrics for creating backups, since they are performed from immutable [instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
* `<local-snapshot>` is the snapshot to backup. See [how to create instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
* `<bucket>` is already existing name for [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets).
* `<local-snapshot>` is the snapshot to back up. See [how to create instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
* `<bucket>` is an already existing name for [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets).
* `<path/to/new/backup>` is the destination path where new backup will be placed.
@@ -49,13 +49,13 @@ with the following command:
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gcs://<bucket>/<path/to/new/backup> -origin=gcs://<bucket>/<path/to/existing/backup>
```
This saves time and network bandwidth costs by performing server-side copy for the shared data from the `-origin` to `-dst`.
It saves time and network bandwidth costs by performing server-side copy for the shared data from the `-origin` to `-dst`.
#### Incremental backups
Incremental backups are performed if `-dst` points to already existing backup. In this case only new data is uploaded to remote storage.
This saves time and network bandwidth costs when working with big backups:
Incremental backups performed if `-dst` points to an already existing backup. In this case only new data uploaded to remote storage.
It saves time and network bandwidth costs when working with big backups:
```
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gcs://<bucket>/<path/to/existing/backup>
@@ -100,16 +100,16 @@ The backup algorithm is the following:
2. Determine files in `-dst`, which are missing in `-snapshotName`, and delete them. These are usually small files, which are already merged into bigger files in the snapshot.
3. Determine files from `-snapshotName`, which are missing in `-dst`. These are usually small new files and bigger merged files.
4. Determine files from step 3, which exist in the `-origin`, and perform server-side copy of these files from `-origin` to `-dst`.
This are usually the biggest and the oldest files, which are shared between backups.
5. Upload the remaining files from setp 3 from `-snapshotName` to `-dst`.
These are usually the biggest and the oldest files, which are shared between backups.
5. Upload the remaining files from step 3 from `-snapshotName` to `-dst`.
The algorithm splits source files into 100MB chunks in the backup. Each chunk is stored as a separate file in the backup.
The algorithm splits source files into 100 MB chunks in the backup. Each chunk stored as a separate file in the backup.
Such splitting minimizes the amounts of data to re-transfer after temporary errors.
`vmbackup` relies on [instant snapshot](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) properties:
- All the files in the snapshot are immutable.
- Old files are periodically merged into new files.
- Old files periodically merged into new files.
- Smaller files have higher probability to be merged.
- Consecutive snapshots share many identical files.
@@ -129,7 +129,45 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
### Advanced usage
Run `vmbackup -help` in order to see all the available options:
* Obtaining credentials from a file.
Add flag `-credsFilePath=/etc/credentials` with the following content:
for s3 (aws, minio or other s3 compatible storages):
```bash
[default]
aws_access_key_id=theaccesskey
aws_secret_access_key=thesecretaccesskeyvalue
```
for gce cloud storage:
```json
{
"type": "service_account",
"project_id": "project-id",
"private_key_id": "key-id",
"private_key": "-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n",
"client_email": "service-account-email",
"client_id": "client-id",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-email"
}
```
* Usage with s3 custom url endpoint. It is possible to use `vmbackup` with s3 compatible storages like minio, cloudian, etc.
You have to add a custom url endpoint via flag:
```
# for minio
-customS3Endpoint=http://localhost:9000
# for aws gov region
-customS3Endpoint=https://s3-fips.us-gov-west-1.amazonaws.com
```
* Run `vmbackup -help` in order to see all the available options:
```
-concurrency int
@@ -138,7 +176,7 @@ Run `vmbackup -help` in order to see all the available options:
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs (default "default")
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
@@ -152,23 +190,29 @@ Run `vmbackup -help` in order to see all the available options:
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, then the remaining errors are suppressed. Zero value disables the rate limit (default 10)
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-maxBytesPerSecond int
-maxBytesPerSecond value
The maximum upload speed. There is no limit if it is set to 0
Supports the following optional suffixes for values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedBytes value
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to non-zero value. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage
Supports the following optional suffixes for values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
-origin string
Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups
-snapshot.createURL string
VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup.Example: http://victoriametrics:8428/snaphsot/create
VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup. Example: http://victoriametrics:8428/snaphsot/create
-snapshot.deleteURL string
VictoriaMetrics delete snapshot url. Optional. Will be generated from snapshotCreateURL if not provided. All created snaphosts will be automatically deleted.Example: http://victoriametrics:8428/snaphsot/delete
VictoriaMetrics delete snapshot url. Optional. Will be generated from -snapshot.createURL if not provided. All created snaphosts will be automatically deleted. Example: http://victoriametrics:8428/snaphsot/delete
-snapshotName string
Name for the snapshot to backup. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots
-storageDataPath string
@@ -201,9 +245,9 @@ Run `make package-vmbackup`. It builds `victoriametrics/vmbackup:<PKG_TAG>` dock
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmbackup`.
By default the image is built on top of `scratch` image. It is possible to build the package on top of any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of `alpine:3.11` image:
The base docker image is [alpine](https://hub.docker.com/_/alpine) but it is possible to use any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of [scratch](https://hub.docker.com/_/scratch) image:
```bash
ROOT_IMAGE=alpine:3.11 make package-vmbackup
ROOT_IMAGE=scratch make package-vmbackup
```

View File

@@ -10,24 +10,27 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/actions"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fslocal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fsnil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
var (
storageDataPath = flag.String("storageDataPath", "victoria-metrics-data", "Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage")
snapshotName = flag.String("snapshotName", "", "Name for the snapshot to backup. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots")
snapshotCreateURL = flag.String("snapshot.createURL", "", "VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup."+
snapshotCreateURL = flag.String("snapshot.createURL", "", "VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup. "+
"Example: http://victoriametrics:8428/snaphsot/create")
snapshotDeleteURL = flag.String("snapshot.deleteURL", "", "VictoriaMetrics delete snapshot url. Optional. Will be generated from snapshotCreateURL if not provided. All created snaphosts will be automatically deleted."+
"Example: http://victoriametrics:8428/snaphsot/delete")
snapshotDeleteURL = flag.String("snapshot.deleteURL", "", "VictoriaMetrics delete snapshot url. Optional. Will be generated from -snapshot.createURL if not provided. "+
"All created snaphosts will be automatically deleted. Example: http://victoriametrics:8428/snaphsot/delete")
dst = flag.String("dst", "", "Where to put the backup on the remote storage. "+
"Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir\n"+
"-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded")
origin = flag.String("origin", "", "Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups")
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce backup duration")
maxBytesPerSecond = flag.Int("maxBytesPerSecond", 0, "The maximum upload speed. There is no limit if it is set to 0")
maxBytesPerSecond = flagutil.NewBytes("maxBytesPerSecond", 0, "The maximum upload speed. There is no limit if it is set to 0")
)
func main() {
@@ -36,6 +39,8 @@ func main() {
flag.Usage = usage
envflag.Parse()
buildinfo.Init()
logger.Init()
cgroup.UpdateGOMAXPROCSToCPUQuota()
if len(*snapshotCreateURL) > 0 {
logger.Infof("%s", "Snapshots enabled")
@@ -86,6 +91,9 @@ func main() {
if err := a.Run(); err != nil {
logger.Fatalf("cannot create backup: %s", err)
}
srcFS.MustStop()
dstFS.MustStop()
originFS.MustStop()
}
func usage() {
@@ -110,12 +118,12 @@ func newSrcFS() (*fslocal.FS, error) {
// Verify the snapshot exists.
f, err := os.Open(snapshotPath)
if err != nil {
return nil, fmt.Errorf("cannot open snapshot at %q: %s", snapshotPath, err)
return nil, fmt.Errorf("cannot open snapshot at %q: %w", snapshotPath, err)
}
fi, err := f.Stat()
_ = f.Close()
if err != nil {
return nil, fmt.Errorf("cannot stat %q: %s", snapshotPath, err)
return nil, fmt.Errorf("cannot stat %q: %w", snapshotPath, err)
}
if !fi.IsDir() {
return nil, fmt.Errorf("snapshot %q must be a directory", snapshotPath)
@@ -123,10 +131,10 @@ func newSrcFS() (*fslocal.FS, error) {
fs := &fslocal.FS{
Dir: snapshotPath,
MaxBytesPerSecond: *maxBytesPerSecond,
MaxBytesPerSecond: maxBytesPerSecond.N,
}
if err := fs.Init(); err != nil {
return nil, fmt.Errorf("cannot initialize fs: %s", err)
return nil, fmt.Errorf("cannot initialize fs: %w", err)
}
return fs, nil
}
@@ -134,18 +142,18 @@ func newSrcFS() (*fslocal.FS, error) {
func newDstFS() (common.RemoteFS, error) {
fs, err := actions.NewRemoteFS(*dst)
if err != nil {
return nil, fmt.Errorf("cannot parse `-dst`=%q: %s", *dst, err)
return nil, fmt.Errorf("cannot parse `-dst`=%q: %w", *dst, err)
}
return fs, nil
}
func newOriginFS() (common.RemoteFS, error) {
func newOriginFS() (common.OriginFS, error) {
if len(*origin) == 0 {
return nil, nil
return &fsnil.FS{}, nil
}
fs, err := actions.NewRemoteFS(*origin)
if err != nil {
return nil, fmt.Errorf("cannot parse `-origin`=%q: %s", *origin, err)
return nil, fmt.Errorf("cannot parse `-origin`=%q: %w", *origin, err)
}
return fs, nil
}

View File

@@ -4,6 +4,7 @@ import (
"fmt"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
@@ -17,11 +18,14 @@ type InsertCtx struct {
mrs []storage.MetricRow
metricNamesBuf []byte
relabelCtx relabel.Ctx
}
// Reset resets ctx for future fill with rowsLen rows.
func (ctx *InsertCtx) Reset(rowsLen int) {
for _, label := range ctx.Labels {
for i := range ctx.Labels {
label := &ctx.Labels[i]
label.Name = nil
label.Value = nil
}
@@ -37,6 +41,7 @@ func (ctx *InsertCtx) Reset(rowsLen int) {
}
ctx.mrs = ctx.mrs[:0]
ctx.metricNamesBuf = ctx.metricNamesBuf[:0]
ctx.relabelCtx.Reset()
}
func (ctx *InsertCtx) marshalMetricNameRaw(prefix []byte, labels []prompb.Label) []byte {
@@ -48,23 +53,23 @@ func (ctx *InsertCtx) marshalMetricNameRaw(prefix []byte, labels []prompb.Label)
}
// WriteDataPoint writes (timestamp, value) with the given prefix and labels into ctx buffer.
func (ctx *InsertCtx) WriteDataPoint(prefix []byte, labels []prompb.Label, timestamp int64, value float64) {
func (ctx *InsertCtx) WriteDataPoint(prefix []byte, labels []prompb.Label, timestamp int64, value float64) error {
metricNameRaw := ctx.marshalMetricNameRaw(prefix, labels)
ctx.addRow(metricNameRaw, timestamp, value)
return ctx.addRow(metricNameRaw, timestamp, value)
}
// WriteDataPointExt writes (timestamp, value) with the given metricNameRaw and labels into ctx buffer.
//
// It returns metricNameRaw for the given labels if len(metricNameRaw) == 0.
func (ctx *InsertCtx) WriteDataPointExt(metricNameRaw []byte, labels []prompb.Label, timestamp int64, value float64) []byte {
func (ctx *InsertCtx) WriteDataPointExt(metricNameRaw []byte, labels []prompb.Label, timestamp int64, value float64) ([]byte, error) {
if len(metricNameRaw) == 0 {
metricNameRaw = ctx.marshalMetricNameRaw(nil, labels)
}
ctx.addRow(metricNameRaw, timestamp, value)
return metricNameRaw
err := ctx.addRow(metricNameRaw, timestamp, value)
return metricNameRaw, err
}
func (ctx *InsertCtx) addRow(metricNameRaw []byte, timestamp int64, value float64) {
func (ctx *InsertCtx) addRow(metricNameRaw []byte, timestamp int64, value float64) error {
mrs := ctx.mrs
if cap(mrs) > len(mrs) {
mrs = mrs[:len(mrs)+1]
@@ -76,55 +81,64 @@ func (ctx *InsertCtx) addRow(metricNameRaw []byte, timestamp int64, value float6
mr.MetricNameRaw = metricNameRaw
mr.Timestamp = timestamp
mr.Value = value
if len(ctx.metricNamesBuf) > 16*1024*1024 {
if err := ctx.FlushBufs(); err != nil {
return err
}
}
return nil
}
// AddLabelBytes adds (name, value) label to ctx.Labels.
//
// name and value must exist until ctx.Labels is used.
func (ctx *InsertCtx) AddLabelBytes(name, value []byte) {
labels := ctx.Labels
if cap(labels) > len(labels) {
labels = labels[:len(labels)+1]
} else {
labels = append(labels, prompb.Label{})
if len(value) == 0 {
// Skip labels without values, since they have no sense.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
// Do not skip labels with empty name, since they are equal to __name__.
return
}
label := &labels[len(labels)-1]
// Do not copy name and value contents for performance reasons.
// This reduces GC overhead on the number of objects and allocations.
label.Name = name
label.Value = value
ctx.Labels = labels
ctx.Labels = append(ctx.Labels, prompb.Label{
// Do not copy name and value contents for performance reasons.
// This reduces GC overhead on the number of objects and allocations.
Name: name,
Value: value,
})
}
// AddLabel adds (name, value) label to ctx.Labels.
//
// name and value must exist until ctx.Labels is used.
func (ctx *InsertCtx) AddLabel(name, value string) {
labels := ctx.Labels
if cap(labels) > len(labels) {
labels = labels[:len(labels)+1]
} else {
labels = append(labels, prompb.Label{})
if len(value) == 0 {
// Skip labels without values, since they have no sense.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
// Do not skip labels with empty name, since they are equal to __name__.
return
}
label := &labels[len(labels)-1]
ctx.Labels = append(ctx.Labels, prompb.Label{
// Do not copy name and value contents for performance reasons.
// This reduces GC overhead on the number of objects and allocations.
Name: bytesutil.ToUnsafeBytes(name),
Value: bytesutil.ToUnsafeBytes(value),
})
}
// Do not copy name and value contents for performance reasons.
// This reduces GC overhead on the number of objects and allocations.
label.Name = bytesutil.ToUnsafeBytes(name)
label.Value = bytesutil.ToUnsafeBytes(value)
ctx.Labels = labels
// ApplyRelabeling applies relabeling to ic.Labels.
func (ctx *InsertCtx) ApplyRelabeling() {
ctx.Labels = ctx.relabelCtx.ApplyRelabeling(ctx.Labels)
}
// FlushBufs flushes buffered rows to the underlying storage.
func (ctx *InsertCtx) FlushBufs() error {
if err := vmstorage.AddRows(ctx.mrs); err != nil {
return &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot store metrics: %s", err),
StatusCode: http.StatusServiceUnavailable,
}
err := vmstorage.AddRows(ctx.mrs)
ctx.Reset(0)
if err == nil {
return nil
}
return &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot store metrics: %w", err),
StatusCode: http.StatusServiceUnavailable,
}
return nil
}

View File

@@ -4,6 +4,9 @@ import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
@@ -16,18 +19,23 @@ var (
// InsertHandler processes /api/v1/import/csv requests.
func InsertHandler(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(rows)
return insertRows(rows, extraLabels)
})
})
}
func insertRows(rows []parser.Row) error {
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
ctx.Reset(len(rows))
hasRelabeling := relabel.HasRelabeling()
for i := range rows {
r := &rows[i]
ctx.Labels = ctx.Labels[:0]
@@ -36,7 +44,20 @@ func insertRows(rows []parser.Row) error {
tag := &r.Tags[j]
ctx.AddLabel(tag.Key, tag.Value)
}
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
for j := range extraLabels {
label := &extraLabels[j]
ctx.AddLabel(label.Name, label.Value)
}
if hasRelabeling {
ctx.ApplyRelabeling()
}
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
if err := ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value); err != nil {
return err
}
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))

View File

@@ -4,6 +4,7 @@ import (
"io"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
@@ -28,6 +29,7 @@ func insertRows(rows []parser.Row) error {
defer common.PutInsertCtx(ctx)
ctx.Reset(len(rows))
hasRelabeling := relabel.HasRelabeling()
for i := range rows {
r := &rows[i]
ctx.Labels = ctx.Labels[:0]
@@ -36,7 +38,16 @@ func insertRows(rows []parser.Row) error {
tag := &r.Tags[j]
ctx.AddLabel(tag.Key, tag.Value)
}
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
if hasRelabeling {
ctx.ApplyRelabeling()
}
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
if err := ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value); err != nil {
return err
}
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))

View File

@@ -8,7 +8,9 @@ import (
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
@@ -18,6 +20,7 @@ import (
var (
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for '{measurement}{separator}{field_name}' metric name when inserted via Influx line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if Influx line contains only a single field")
skipMeasurement = flag.Bool("influxSkipMeasurement", false, "Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator'")
)
var (
@@ -59,38 +62,70 @@ func insertRows(db string, rows []parser.Row) error {
ic := &ctx.Common
ic.Reset(rowsLen)
rowsTotal := 0
hasRelabeling := relabel.HasRelabeling()
for i := range rows {
r := &rows[i]
rowsTotal += len(r.Fields)
ic.Labels = ic.Labels[:0]
hasDBLabel := false
hasDBKey := false
for j := range r.Tags {
tag := &r.Tags[j]
if tag.Key == "db" {
hasDBLabel = true
hasDBKey = true
}
ic.AddLabel(tag.Key, tag.Value)
}
if len(db) > 0 && !hasDBLabel {
if !hasDBKey {
ic.AddLabel("db", db)
}
ctx.metricNameBuf = storage.MarshalMetricNameRaw(ctx.metricNameBuf[:0], ic.Labels)
ctx.metricGroupBuf = append(ctx.metricGroupBuf[:0], r.Measurement...)
ctx.metricGroupBuf = ctx.metricGroupBuf[:0]
if !*skipMeasurement {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, r.Measurement...)
}
skipFieldKey := len(r.Fields) == 1 && *skipSingleField
if len(ctx.metricGroupBuf) > 0 && !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, *measurementFieldSeparator...)
}
metricGroupPrefixLen := len(ctx.metricGroupBuf)
for j := range r.Fields {
f := &r.Fields[j]
if !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf[:metricGroupPrefixLen], f.Key...)
if hasRelabeling {
ctx.originLabels = append(ctx.originLabels[:0], ic.Labels...)
for j := range r.Fields {
f := &r.Fields[j]
if !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf[:metricGroupPrefixLen], f.Key...)
}
metricGroup := bytesutil.ToUnsafeString(ctx.metricGroupBuf)
ic.Labels = append(ic.Labels[:0], ctx.originLabels...)
ic.AddLabel("", metricGroup)
ic.ApplyRelabeling()
if len(ic.Labels) == 0 {
// Skip metric without labels.
continue
}
if err := ic.WriteDataPoint(nil, ic.Labels, r.Timestamp, f.Value); err != nil {
return err
}
}
} else {
ctx.metricNameBuf = storage.MarshalMetricNameRaw(ctx.metricNameBuf[:0], ic.Labels)
labelsLen := len(ic.Labels)
for j := range r.Fields {
f := &r.Fields[j]
if !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf[:metricGroupPrefixLen], f.Key...)
}
metricGroup := bytesutil.ToUnsafeString(ctx.metricGroupBuf)
ic.Labels = ic.Labels[:labelsLen]
ic.AddLabel("", metricGroup)
if len(ic.Labels) == 0 {
// Skip metric without labels.
continue
}
if err := ic.WriteDataPoint(ctx.metricNameBuf, ic.Labels[len(ic.Labels)-1:], r.Timestamp, f.Value); err != nil {
return err
}
}
metricGroup := bytesutil.ToUnsafeString(ctx.metricGroupBuf)
ic.Labels = ic.Labels[:0]
ic.AddLabel("", metricGroup)
ic.WriteDataPoint(ctx.metricNameBuf, ic.Labels[:1], r.Timestamp, f.Value)
}
rowsTotal += len(r.Fields)
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
@@ -101,12 +136,21 @@ type pushCtx struct {
Common common.InsertCtx
metricNameBuf []byte
metricGroupBuf []byte
originLabels []prompb.Label
}
func (ctx *pushCtx) reset() {
ctx.Common.Reset(0)
ctx.metricNameBuf = ctx.metricNameBuf[:0]
ctx.metricGroupBuf = ctx.metricGroupBuf[:0]
originLabels := ctx.originLabels
for i := range originLabels {
label := &originLabels[i]
label.Name = nil
label.Value = nil
}
ctx.originLabels = ctx.originLabels[:0]
}
func getPushCtx() *pushCtx {

View File

@@ -4,16 +4,20 @@ import (
"flag"
"fmt"
"net/http"
"strconv"
"strings"
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/native"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/prometheusimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/prompush"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
@@ -22,6 +26,7 @@ import (
opentsdbhttpserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
@@ -29,7 +34,8 @@ import (
var (
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for Influx line protocol data. Usually :8189 must be set. Doesn't work if empty")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for Influx line protocol data. Usually :8189 must be set. Doesn't work if empty. "+
"This flag isn't needed when ingesting data over HTTP - just send it to `http://<victoriametrics>:8428/write`")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
"Usually :4242 must be set. Doesn't work if empty")
@@ -46,8 +52,9 @@ var (
// Init initializes vminsert.
func Init() {
relabel.Init()
storage.SetMaxLabelsPerTimeseries(*maxLabelsPerTimeseries)
common.StartUnmarshalWorkers()
writeconcurrencylimiter.Init()
if len(*influxListenAddr) > 0 {
influxServer = influxserver.MustStart(*influxListenAddr, influx.InsertHandlerForReader)
@@ -79,6 +86,7 @@ func Stop() {
if len(*opentsdbHTTPListenAddr) > 0 {
opentsdbhttpServer.MustStop()
}
common.StopUnmarshalWorkers()
}
// RequestHandler is a handler for Prometheus remote storage write API
@@ -89,7 +97,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
@@ -98,7 +106,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
vmimportRequests.Inc()
if err := vmimport.InsertHandler(r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
@@ -107,7 +115,25 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
csvimportRequests.Inc()
if err := csvimport.InsertHandler(r); err != nil {
csvimportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/prometheus":
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(r); err != nil {
prometheusimportErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/native":
nativeimportRequests.Inc()
if err := native.InsertHandler(r); err != nil {
nativeimportErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
@@ -116,7 +142,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
influxWriteRequests.Inc()
if err := influx.InsertHandlerForHTTP(r); err != nil {
influxWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
@@ -130,7 +156,14 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/targets":
promscrapeTargetsRequests.Inc()
w.Header().Set("Content-Type", "text/plain")
promscrape.WriteHumanReadableTargetsStatus(w)
showOriginalLabels, _ := strconv.ParseBool(r.FormValue("show_original_labels"))
promscrape.WriteHumanReadableTargetsStatus(w, showOriginalLabels)
return true
case "/api/v1/targets":
promscrapeAPIV1TargetsRequests.Inc()
w.Header().Set("Content-Type", "application/json")
state := r.FormValue("state")
promscrape.WriteAPIV1Targets(w, state)
return true
case "/-/reload":
promscrapeConfigReloadRequests.Inc()
@@ -153,12 +186,19 @@ var (
csvimportRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/import/csv", protocol="csvimport"}`)
csvimportErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/import/csv", protocol="csvimport"}`)
prometheusimportRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/import/prometheus", protocol="prometheusimport"}`)
prometheusimportErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/import/prometheus", protocol="prometheusimport"}`)
nativeimportRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/import/native", protocol="nativeimport"}`)
nativeimportErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/import/native", protocol="nativeimport"}`)
influxWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/write", protocol="influx"}`)
influxWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/write", protocol="influx"}`)
influxQueryRequests = metrics.NewCounter(`vm_http_requests_total{path="/query", protocol="influx"}`)
promscrapeTargetsRequests = metrics.NewCounter(`vm_http_requests_total{path="/targets"}`)
promscrapeTargetsRequests = metrics.NewCounter(`vm_http_requests_total{path="/targets"}`)
promscrapeAPIV1TargetsRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/targets"}`)
promscrapeConfigReloadRequests = metrics.NewCounter(`vm_http_requests_total{path="/-/reload"}`)

View File

@@ -0,0 +1,115 @@
package native
import (
"net/http"
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/native"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="native"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="native"}`)
)
// InsertHandler processes `/api/v1/import/native` request.
func InsertHandler(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(block *parser.Block) error {
return insertRows(block, extraLabels)
})
})
}
func insertRows(block *parser.Block, extraLabels []prompbmarshal.Label) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
// Update rowsInserted and rowsPerInsert before actual inserting,
// since relabeling can prevent from inserting the rows.
rowsLen := len(block.Values)
rowsInserted.Add(rowsLen)
rowsPerInsert.Update(float64(rowsLen))
ic := &ctx.Common
ic.Reset(rowsLen)
hasRelabeling := relabel.HasRelabeling()
mn := &block.MetricName
ic.Labels = ic.Labels[:0]
ic.AddLabelBytes(nil, mn.MetricGroup)
for j := range mn.Tags {
tag := &mn.Tags[j]
ic.AddLabelBytes(tag.Key, tag.Value)
}
for j := range extraLabels {
label := &extraLabels[j]
ic.AddLabel(label.Name, label.Value)
}
if hasRelabeling {
ic.ApplyRelabeling()
}
if len(ic.Labels) == 0 {
// Skip metric without labels.
return nil
}
ctx.metricNameBuf = storage.MarshalMetricNameRaw(ctx.metricNameBuf[:0], ic.Labels)
values := block.Values
timestamps := block.Timestamps
if len(timestamps) != len(values) {
logger.Panicf("BUG: len(timestamps)=%d must match len(values)=%d", len(timestamps), len(values))
}
for j, value := range values {
timestamp := timestamps[j]
if err := ic.WriteDataPoint(ctx.metricNameBuf, nil, timestamp, value); err != nil {
return err
}
}
return ic.FlushBufs()
}
type pushCtx struct {
Common common.InsertCtx
metricNameBuf []byte
}
func (ctx *pushCtx) reset() {
ctx.Common.Reset(0)
ctx.metricNameBuf = ctx.metricNameBuf[:0]
}
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

View File

@@ -4,6 +4,7 @@ import (
"io"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
@@ -28,6 +29,7 @@ func insertRows(rows []parser.Row) error {
defer common.PutInsertCtx(ctx)
ctx.Reset(len(rows))
hasRelabeling := relabel.HasRelabeling()
for i := range rows {
r := &rows[i]
ctx.Labels = ctx.Labels[:0]
@@ -36,7 +38,16 @@ func insertRows(rows []parser.Row) error {
tag := &r.Tags[j]
ctx.AddLabel(tag.Key, tag.Value)
}
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
if hasRelabeling {
ctx.ApplyRelabeling()
}
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
if err := ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value); err != nil {
return err
}
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))

View File

@@ -5,6 +5,7 @@ import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
@@ -34,6 +35,7 @@ func insertRows(rows []parser.Row) error {
defer common.PutInsertCtx(ctx)
ctx.Reset(len(rows))
hasRelabeling := relabel.HasRelabeling()
for i := range rows {
r := &rows[i]
ctx.Labels = ctx.Labels[:0]
@@ -42,7 +44,16 @@ func insertRows(rows []parser.Row) error {
tag := &r.Tags[j]
ctx.AddLabel(tag.Key, tag.Value)
}
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
if hasRelabeling {
ctx.ApplyRelabeling()
}
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
if err := ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value); err != nil {
return err
}
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))

View File

@@ -0,0 +1,70 @@
package prometheusimport
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="prometheus"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="prometheus"}`)
)
// InsertHandler processes `/api/v1/import/prometheus` request.
func InsertHandler(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
defaultTimestamp, err := parserCommon.GetTimestamp(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return parser.ParseStream(req.Body, defaultTimestamp, isGzipped, func(rows []parser.Row) error {
return insertRows(rows, extraLabels)
})
})
}
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
ctx.Reset(len(rows))
hasRelabeling := relabel.HasRelabeling()
for i := range rows {
r := &rows[i]
ctx.Labels = ctx.Labels[:0]
ctx.AddLabel("", r.Metric)
for j := range r.Tags {
tag := &r.Tags[j]
ctx.AddLabel(tag.Key, tag.Value)
}
for j := range extraLabels {
label := &extraLabels[j]
ctx.AddLabel(label.Name, label.Value)
}
if hasRelabeling {
ctx.ApplyRelabeling()
}
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
if err := ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value); err != nil {
return err
}
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ctx.FlushBufs()
}

View File

@@ -1,13 +1,8 @@
package prompush
import (
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
)
@@ -21,93 +16,66 @@ const maxRowsPerBlock = 10000
// Push pushes wr to storage.
func Push(wr *prompbmarshal.WriteRequest) {
ctx := getPushCtx()
defer putPushCtx(ctx)
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
tss := wr.Timeseries
for len(tss) > 0 {
// Process big tss in smaller blocks in order to reduce maxmimum memory usage
samplesCount := 0
i := 0
for i < len(tss) {
samplesCount += len(tss[i].Samples)
i++
if samplesCount > maxRowsPerBlock {
break
}
}
tssBlock := tss
if len(tssBlock) > maxRowsPerBlock {
tssBlock = tss[:maxRowsPerBlock]
tss = tss[maxRowsPerBlock:]
if i < len(tss) {
tssBlock = tss[:i]
tss = tss[i:]
} else {
tss = nil
}
ctx.push(tssBlock)
push(ctx, tssBlock)
}
}
func (ctx *pushCtx) push(tss []prompbmarshal.TimeSeries) {
func push(ctx *common.InsertCtx, tss []prompbmarshal.TimeSeries) {
rowsLen := 0
for i := range tss {
rowsLen += len(tss[i].Samples)
}
ic := &ctx.Common
ic.Reset(rowsLen)
ctx.Reset(rowsLen)
rowsTotal := 0
labels := ctx.labels[:0]
for i := range tss {
ts := &tss[i]
labels = labels[:0]
rowsTotal += len(ts.Samples)
ctx.Labels = ctx.Labels[:0]
for j := range ts.Labels {
label := &ts.Labels[j]
labels = append(labels, prompb.Label{
Name: bytesutil.ToUnsafeBytes(label.Name),
Value: bytesutil.ToUnsafeBytes(label.Value),
})
ctx.AddLabel(label.Name, label.Value)
}
ctx.ApplyRelabeling()
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
var metricNameRaw []byte
var err error
for i := range ts.Samples {
r := &ts.Samples[i]
metricNameRaw = ic.WriteDataPointExt(metricNameRaw, labels, r.Timestamp, r.Value)
metricNameRaw, err = ctx.WriteDataPointExt(metricNameRaw, ctx.Labels, r.Timestamp, r.Value)
if err != nil {
logger.Errorf("cannot write promscape data to storage: %s", err)
return
}
}
rowsTotal += len(ts.Samples)
}
ctx.labels = labels
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
if err := ic.FlushBufs(); err != nil {
if err := ctx.FlushBufs(); err != nil {
logger.Errorf("cannot flush promscrape data to storage: %s", err)
}
}
type pushCtx struct {
Common common.InsertCtx
labels []prompb.Label
}
func (ctx *pushCtx) reset() {
ctx.Common.Reset(0)
for i := range ctx.labels {
label := &ctx.labels[i]
label.Name = nil
label.Value = nil
}
ctx.labels = ctx.labels[:0]
}
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

View File

@@ -4,6 +4,7 @@ import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
@@ -32,14 +33,32 @@ func insertRows(timeseries []prompb.TimeSeries) error {
}
ctx.Reset(rowsLen)
rowsTotal := 0
hasRelabeling := relabel.HasRelabeling()
for i := range timeseries {
ts := &timeseries[i]
var metricNameRaw []byte
for i := range ts.Samples {
r := &ts.Samples[i]
metricNameRaw = ctx.WriteDataPointExt(metricNameRaw, ts.Labels, r.Timestamp, r.Value)
}
rowsTotal += len(ts.Samples)
ctx.Labels = ctx.Labels[:0]
srcLabels := ts.Labels
for _, srcLabel := range srcLabels {
ctx.AddLabelBytes(srcLabel.Name, srcLabel.Value)
}
if hasRelabeling {
ctx.ApplyRelabeling()
}
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
var metricNameRaw []byte
var err error
samples := ts.Samples
for i := range samples {
r := &samples[i]
metricNameRaw, err = ctx.WriteDataPointExt(metricNameRaw, ctx.Labels, r.Timestamp, r.Value)
if err != nil {
return err
}
}
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))

View File

@@ -0,0 +1,127 @@
package relabel
import (
"flag"
"fmt"
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/metrics"
)
var relabelConfig = flag.String("relabelConfig", "", "Optional path to a file with relabeling rules, which are applied to all the ingested metrics. "+
"See https://victoriametrics.github.io/#relabeling for details")
// Init must be called after flag.Parse and before using the relabel package.
func Init() {
prcs, err := loadRelabelConfig()
if err != nil {
logger.Fatalf("cannot load relabelConfig: %s", err)
}
prcsGlobal.Store(&prcs)
if len(*relabelConfig) == 0 {
return
}
sighupCh := procutil.NewSighupChan()
go func() {
for range sighupCh {
logger.Infof("received SIGHUP; reloading -relabelConfig=%q...", *relabelConfig)
prcs, err := loadRelabelConfig()
if err != nil {
logger.Errorf("cannot load the updated relabelConfig: %s; preserving the previous config", err)
continue
}
prcsGlobal.Store(&prcs)
logger.Infof("successfully reloaded -relabelConfig=%q", *relabelConfig)
}
}()
}
var prcsGlobal atomic.Value
func loadRelabelConfig() ([]promrelabel.ParsedRelabelConfig, error) {
if len(*relabelConfig) == 0 {
return nil, nil
}
prcs, err := promrelabel.LoadRelabelConfigs(*relabelConfig)
if err != nil {
return nil, fmt.Errorf("error when reading -relabelConfig=%q: %w", *relabelConfig, err)
}
return prcs, nil
}
// HasRelabeling returns true if there is global relabeling.
func HasRelabeling() bool {
prcs := prcsGlobal.Load().(*[]promrelabel.ParsedRelabelConfig)
return len(*prcs) > 0
}
// Ctx holds relabeling context.
type Ctx struct {
// tmpLabels is used during ApplyRelabeling call.
tmpLabels []prompbmarshal.Label
}
// Reset resets ctx.
func (ctx *Ctx) Reset() {
labels := ctx.tmpLabels
for i := range labels {
label := &labels[i]
label.Name = ""
label.Value = ""
}
ctx.tmpLabels = ctx.tmpLabels[:0]
}
// ApplyRelabeling applies relabeling to the given labels and returns the result.
//
// The returned labels are valid until the next call to ApplyRelabeling.
func (ctx *Ctx) ApplyRelabeling(labels []prompb.Label) []prompb.Label {
prcs := prcsGlobal.Load().(*[]promrelabel.ParsedRelabelConfig)
if len(*prcs) == 0 {
// There are no relabeling rules.
return labels
}
// Convert src to prompbmarshal.Label format suitable for relabeling.
tmpLabels := ctx.tmpLabels[:0]
for _, label := range labels {
name := bytesutil.ToUnsafeString(label.Name)
if len(name) == 0 {
name = "__name__"
}
value := bytesutil.ToUnsafeString(label.Value)
tmpLabels = append(tmpLabels, prompbmarshal.Label{
Name: name,
Value: value,
})
}
// Apply relabeling
tmpLabels = promrelabel.ApplyRelabelConfigs(tmpLabels, 0, *prcs, true)
ctx.tmpLabels = tmpLabels
if len(tmpLabels) == 0 {
metricsDropped.Inc()
}
// Return back labels to the desired format.
dst := labels[:0]
for _, label := range tmpLabels {
name := bytesutil.ToUnsafeBytes(label.Name)
if label.Name == "__name__" {
name = nil
}
value := bytesutil.ToUnsafeBytes(label.Value)
dst = append(dst, prompb.Label{
Name: name,
Value: value,
})
}
return dst
}
var metricsDropped = metrics.NewCounter(`vm_relabel_metrics_dropped_total`)

View File

@@ -6,6 +6,10 @@ import (
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
@@ -21,12 +25,18 @@ var (
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
func InsertHandler(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(rows, extraLabels)
})
})
}
func insertRows(rows []parser.Row) error {
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
@@ -37,22 +47,38 @@ func insertRows(rows []parser.Row) error {
ic := &ctx.Common
ic.Reset(rowsLen)
rowsTotal := 0
hasRelabeling := relabel.HasRelabeling()
for i := range rows {
r := &rows[i]
rowsTotal += len(r.Values)
ic.Labels = ic.Labels[:0]
for j := range r.Tags {
tag := &r.Tags[j]
ic.AddLabelBytes(tag.Key, tag.Value)
}
for j := range extraLabels {
label := &extraLabels[j]
ic.AddLabel(label.Name, label.Value)
}
if hasRelabeling {
ic.ApplyRelabeling()
}
if len(ic.Labels) == 0 {
// Skip metric without labels.
continue
}
ctx.metricNameBuf = storage.MarshalMetricNameRaw(ctx.metricNameBuf[:0], ic.Labels)
values := r.Values
timestamps := r.Timestamps
_ = timestamps[len(values)-1]
if len(timestamps) != len(values) {
logger.Panicf("BUG: len(timestamps)=%d must match len(values)=%d", len(timestamps), len(values))
}
for j, value := range values {
timestamp := timestamps[j]
ic.WriteDataPoint(ctx.metricNameBuf, nil, timestamp, value)
if err := ic.WriteDataPoint(ctx.metricNameBuf, nil, timestamp, value); err != nil {
return err
}
}
rowsTotal += len(values)
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))

View File

@@ -51,20 +51,23 @@ package-vmrestore-386:
publish-vmrestore:
APP_NAME=vmrestore $(MAKE) publish-via-docker
vmrestore-pure:
APP_NAME=vmrestore $(MAKE) app-local-pure
vmrestore-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-amd64 ./app/vmrestore
CGO_ENABLED=1 GOARCH=amd64 $(MAKE) vmrestore-local-with-goarch
vmrestore-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-arm ./app/vmrestore
CGO_ENABLED=0 GOARCH=arm $(MAKE) vmrestore-local-with-goarch
vmrestore-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-arm64 ./app/vmrestore
CGO_ENABLED=0 GOARCH=arm64 $(MAKE) vmrestore-local-with-goarch
vmrestore-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-ppc64le ./app/vmrestore
CGO_ENABLED=0 GOARCH=ppc64le $(MAKE) vmrestore-local-with-goarch
vmrestore-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-386 ./app/vmrestore
CGO_ENABLED=0 GOARCH=386 $(MAKE) vmrestore-local-with-goarch
vmrestore-local-with-goarch:
APP_NAME=vmrestore $(MAKE) app-local-with-goarch
vmrestore-pure:
APP_NAME=vmrestore $(MAKE) app-local-pure

View File

@@ -3,7 +3,7 @@
`vmrestore` restores data from backups created by [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md).
VictoriaMetrics `v1.29.0` and newer versions must be used for working with the restored data.
Restore process can be interrupted at any time. It is automatically resumed from the inerruption point
Restore process can be interrupted at any time. It is automatically resumed from the interruption point
when restarting `vmrestore` with the same args.
@@ -21,7 +21,7 @@ vmrestore -src=gcs://<bucket>/<path/to/backup> -storageDataPath=<local/path/to/r
* `<local/path/to/restore>` is the path to folder where data will be restored. This folder must be passed
to VictoriaMetrics in `-storageDataPath` command-line flag after the restore process is complete.
The original `-storageDataPath` directory may contain old files. They will be susbstituted by the files from backup,
The original `-storageDataPath` directory may contain old files. They will be substituted by the files from backup,
i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/questions/476041/how-do-i-make-rsync-delete-files-that-have-been-deleted-from-the-source-folder).
@@ -33,7 +33,44 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
### Advanced usage
Run `vmrestore -help` in order to see all the available options:
* Obtaining credentials from a file.
Add flag `-credsFilePath=/etc/credentials` with following content:
for s3 (aws, minio or other s3 compatible storages):
```bash
[default]
aws_access_key_id=theaccesskey
aws_secret_access_key=thesecretaccesskeyvalue
```
for gce cloud storage:
```json
{
"type": "service_account",
"project_id": "project-id",
"private_key_id": "key-id",
"private_key": "-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n",
"client_email": "service-account-email",
"client_id": "client-id",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-account-email"
}
```
* Usage with s3 custom url endpoint. It is possible to use `vmrestore` with s3 api compatible storages, like minio, cloudian and other.
You have to add custom url endpoint with a flag:
```
# for minio:
-customS3Endpoint=http://localhost:9000
# for aws gov region
-customS3Endpoint=https://s3-fips.us-gov-west-1.amazonaws.com
```
* Run `vmrestore -help` in order to see all the available options:
```
-concurrency int
@@ -42,7 +79,7 @@ Run `vmrestore -help` in order to see all the available options:
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs (default "default")
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
@@ -53,24 +90,30 @@ Run `vmrestore -help` in order to see all the available options:
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot data files bigger than 2^32 bytes in memory
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, then the remaining errors are suppressed. Zero value disables the rate limit (default 10)
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-maxBytesPerSecond int
-maxBytesPerSecond value
The maximum download speed. There is no limit if it is set to 0
Supports the following optional suffixes for values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedBytes value
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to non-zero value. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage
Supports the following optional suffixes for values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
-skipBackupCompleteCheck
Whether to skip checking for 'backup complete' file in -src. This may be useful for restoring from old backups, which were created without 'backup complete' file
-src string
Source path with backup on the remote storage. Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-storageDataPath string
Destination path where backup must be restored. VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case the contents of -storageDataPath dir is synchronized with -src contents, i.e. it works like 'rsync --delete' (default "victoria-metrics-data")
-version
-version
Show VictoriaMetrics version
```
@@ -98,9 +141,9 @@ Run `make package-vmrestore`. It builds `victoriametrics/vmrestore:<PKG_TAG>` do
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmrestore`.
By default the image is built on top of `scratch` image. It is possible to build the package on top of any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of `alpine:3.11` image:
The base docker image is [alpine](https://hub.docker.com/_/alpine) but it is possible to use any other base image
by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of [scratch](https://hub.docker.com/_/scratch) image:
```bash
ROOT_IMAGE=alpine:3.11 make package-vmrestore
ROOT_IMAGE=scratch make package-vmrestore
```

View File

@@ -9,7 +9,9 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fslocal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
@@ -20,7 +22,7 @@ var (
"VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case the contents of -storageDataPath dir "+
"is synchronized with -src contents, i.e. it works like 'rsync --delete'")
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce restore duration")
maxBytesPerSecond = flag.Int("maxBytesPerSecond", 0, "The maximum download speed. There is no limit if it is set to 0")
maxBytesPerSecond = flagutil.NewBytes("maxBytesPerSecond", 0, "The maximum download speed. There is no limit if it is set to 0")
skipBackupCompleteCheck = flag.Bool("skipBackupCompleteCheck", false, "Whether to skip checking for 'backup complete' file in -src. This may be useful for restoring from old backups, which were created without 'backup complete' file")
)
@@ -30,6 +32,8 @@ func main() {
flag.Usage = usage
envflag.Parse()
buildinfo.Init()
logger.Init()
cgroup.UpdateGOMAXPROCSToCPUQuota()
srcFS, err := newSrcFS()
if err != nil {
@@ -48,6 +52,8 @@ func main() {
if err := a.Run(); err != nil {
logger.Fatalf("cannot restore from backup: %s", err)
}
srcFS.MustStop()
dstFS.MustStop()
}
func usage() {
@@ -68,10 +74,10 @@ func newDstFS() (*fslocal.FS, error) {
}
fs := &fslocal.FS{
Dir: *storageDataPath,
MaxBytesPerSecond: *maxBytesPerSecond,
MaxBytesPerSecond: maxBytesPerSecond.N,
}
if err := fs.Init(); err != nil {
return nil, fmt.Errorf("cannot initialize local fs: %s", err)
return nil, fmt.Errorf("cannot initialize local fs: %w", err)
}
return fs, nil
}
@@ -79,7 +85,7 @@ func newDstFS() (*fslocal.FS, error) {
func newSrcFS() (common.RemoteFS, error) {
fs, err := actions.NewRemoteFS(*src)
if err != nil {
return nil, fmt.Errorf("cannot parse `-src`=%q: %s", *src, err)
return nil, fmt.Errorf("cannot parse `-src`=%q: %w", *src, err)
}
return fs, nil
}

View File

@@ -0,0 +1,86 @@
package bufferedwriter
import (
"bufio"
"fmt"
"io"
"sync"
)
// Get returns buffered writer for the given w.
//
// The writer must be returned to the pool after use by calling Put().
func Get(w io.Writer) *Writer {
v := writerPool.Get()
if v == nil {
v = &Writer{
// By default net/http.Server uses 4KB buffers, which are flushed to client with chunked responses.
// These buffers may result in visible overhead for responses exceeding a few megabytes.
// So allocate 64Kb buffers.
bw: bufio.NewWriterSize(w, 64*1024),
}
}
bw := v.(*Writer)
bw.bw.Reset(w)
return bw
}
// Put returns back bw to the pool.
//
// bw cannot be used after returning to the pool.
func Put(bw *Writer) {
bw.reset()
writerPool.Put(bw)
}
var writerPool sync.Pool
// Writer is buffered writer, which may be used in order to reduce overhead
// when sending moderately big responses to http server.
//
// Writer methods can be called from concurrently running goroutines.
// The writer remembers the first occurred error, which can be inspected with Error method.
type Writer struct {
lock sync.Mutex
bw *bufio.Writer
err error
}
func (bw *Writer) reset() {
bw.bw.Reset(nil)
bw.err = nil
}
// Write writes p to bw.
func (bw *Writer) Write(p []byte) (int, error) {
bw.lock.Lock()
defer bw.lock.Unlock()
if bw.err != nil {
return 0, bw.err
}
n, err := bw.bw.Write(p)
if err != nil {
bw.err = fmt.Errorf("cannot send %d bytes to client: %w", len(p), err)
}
return n, bw.err
}
// Flush flushes bw to the underlying writer.
func (bw *Writer) Flush() error {
bw.lock.Lock()
defer bw.lock.Unlock()
if bw.err != nil {
return bw.err
}
if err := bw.bw.Flush(); err != nil {
bw.err = fmt.Errorf("cannot flush data to client: %w", err)
}
return bw.err
}
// Error returns the first occurred error in bw.
func (bw *Writer) Error() error {
bw.lock.Lock()
defer bw.lock.Unlock()
return bw.err
}

View File

@@ -0,0 +1,419 @@
package graphite
import (
"fmt"
"net/http"
"regexp"
"sort"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/bufferedwriter"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
)
// MetricsFindHandler implements /metrics/find handler.
//
// See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-find
func MetricsFindHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
deadline := searchutils.GetDeadlineForQuery(r, startTime)
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse form values: %w", err)
}
format := r.FormValue("format")
if format == "" {
format = "treejson"
}
switch format {
case "treejson", "completer":
default:
return fmt.Errorf(`unexpected "format" query arg: %q; expecting "treejson" or "completer"`, format)
}
query := r.FormValue("query")
if len(query) == 0 {
return fmt.Errorf("expecting non-empty `query` arg")
}
delimiter := r.FormValue("delimiter")
if delimiter == "" {
delimiter = "."
}
if len(delimiter) > 1 {
return fmt.Errorf("`delimiter` query arg must contain only a single char")
}
if searchutils.GetBool(r, "automatic_variants") {
// See https://github.com/graphite-project/graphite-web/blob/bb9feb0e6815faa73f538af6ed35adea0fb273fd/webapp/graphite/metrics/views.py#L152
query = addAutomaticVariants(query, delimiter)
}
if format == "completer" {
// See https://github.com/graphite-project/graphite-web/blob/bb9feb0e6815faa73f538af6ed35adea0fb273fd/webapp/graphite/metrics/views.py#L148
query = strings.ReplaceAll(query, "..", ".*")
if !strings.HasSuffix(query, "*") {
query += "*"
}
}
leavesOnly := searchutils.GetBool(r, "leavesOnly")
wildcards := searchutils.GetBool(r, "wildcards")
label := r.FormValue("label")
if label == "__name__" {
label = ""
}
jsonp := r.FormValue("jsonp")
from, err := searchutils.GetTime(r, "from", 0)
if err != nil {
return err
}
ct := startTime.UnixNano() / 1e6
until, err := searchutils.GetTime(r, "until", ct)
if err != nil {
return err
}
tr := storage.TimeRange{
MinTimestamp: from,
MaxTimestamp: until,
}
paths, err := metricsFind(tr, label, query, delimiter[0], false, deadline)
if err != nil {
return err
}
if leavesOnly {
paths = filterLeaves(paths, delimiter)
}
paths = deduplicatePaths(paths, delimiter)
sortPaths(paths, delimiter)
contentType := "application/json"
if jsonp != "" {
contentType = "text/javascript"
}
w.Header().Set("Content-Type", contentType)
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
WriteMetricsFindResponse(bw, paths, delimiter, format, wildcards, jsonp)
if err := bw.Flush(); err != nil {
return err
}
metricsFindDuration.UpdateDuration(startTime)
return nil
}
func deduplicatePaths(paths []string, delimiter string) []string {
if len(paths) == 0 {
return nil
}
sort.Strings(paths)
dst := paths[:1]
for _, path := range paths[1:] {
prevPath := dst[len(dst)-1]
if path == prevPath {
// Skip duplicate path.
continue
}
dst = append(dst, path)
}
return dst
}
// MetricsExpandHandler implements /metrics/expand handler.
//
// See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-expand
func MetricsExpandHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
deadline := searchutils.GetDeadlineForQuery(r, startTime)
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse form values: %w", err)
}
queries := r.Form["query"]
if len(queries) == 0 {
return fmt.Errorf("missing `query` arg")
}
groupByExpr := searchutils.GetBool(r, "groupByExpr")
leavesOnly := searchutils.GetBool(r, "leavesOnly")
label := r.FormValue("label")
if label == "__name__" {
label = ""
}
delimiter := r.FormValue("delimiter")
if delimiter == "" {
delimiter = "."
}
if len(delimiter) > 1 {
return fmt.Errorf("`delimiter` query arg must contain only a single char")
}
jsonp := r.FormValue("jsonp")
from, err := searchutils.GetTime(r, "from", 0)
if err != nil {
return err
}
ct := startTime.UnixNano() / 1e6
until, err := searchutils.GetTime(r, "until", ct)
if err != nil {
return err
}
tr := storage.TimeRange{
MinTimestamp: from,
MaxTimestamp: until,
}
m := make(map[string][]string, len(queries))
for _, query := range queries {
paths, err := metricsFind(tr, label, query, delimiter[0], true, deadline)
if err != nil {
return err
}
if leavesOnly {
paths = filterLeaves(paths, delimiter)
}
m[query] = paths
}
contentType := "application/json"
if jsonp != "" {
contentType = "text/javascript"
}
w.Header().Set("Content-Type", contentType)
if groupByExpr {
for _, paths := range m {
sortPaths(paths, delimiter)
}
WriteMetricsExpandResponseByQuery(w, m, jsonp)
return nil
}
paths := m[queries[0]]
if len(m) > 1 {
pathsSet := make(map[string]struct{})
for _, paths := range m {
for _, path := range paths {
pathsSet[path] = struct{}{}
}
}
paths = make([]string, 0, len(pathsSet))
for path := range pathsSet {
paths = append(paths, path)
}
}
sortPaths(paths, delimiter)
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
WriteMetricsExpandResponseFlat(bw, paths, jsonp)
if err := bw.Flush(); err != nil {
return err
}
metricsExpandDuration.UpdateDuration(startTime)
return nil
}
// MetricsIndexHandler implements /metrics/index.json handler.
//
// See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-index-json
func MetricsIndexHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
deadline := searchutils.GetDeadlineForQuery(r, startTime)
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse form values: %w", err)
}
jsonp := r.FormValue("jsonp")
metricNames, err := netstorage.GetLabelValues("__name__", deadline)
if err != nil {
return fmt.Errorf(`cannot obtain metric names: %w`, err)
}
contentType := "application/json"
if jsonp != "" {
contentType = "text/javascript"
}
w.Header().Set("Content-Type", contentType)
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
WriteMetricsIndexResponse(bw, metricNames, jsonp)
if err := bw.Flush(); err != nil {
return err
}
metricsIndexDuration.UpdateDuration(startTime)
return nil
}
// metricsFind searches for label values that match the given query.
func metricsFind(tr storage.TimeRange, label, query string, delimiter byte, isExpand bool, deadline searchutils.Deadline) ([]string, error) {
n := strings.IndexAny(query, "*{[")
if n < 0 || n == len(query)-1 && strings.HasSuffix(query, "*") {
expandTail := n >= 0
if expandTail {
query = query[:len(query)-1]
}
suffixes, err := netstorage.GetTagValueSuffixes(tr, label, query, delimiter, deadline)
if err != nil {
return nil, err
}
if len(suffixes) == 0 {
return nil, nil
}
if !expandTail && len(query) > 0 && query[len(query)-1] == delimiter {
return []string{query}, nil
}
results := make([]string, 0, len(suffixes))
for _, suffix := range suffixes {
if expandTail || len(suffix) == 0 || len(suffix) == 1 && suffix[0] == delimiter {
results = append(results, query+suffix)
}
}
return results, nil
}
subquery := query[:n] + "*"
paths, err := metricsFind(tr, label, subquery, delimiter, isExpand, deadline)
if err != nil {
return nil, err
}
tail := ""
suffix := query[n:]
if m := strings.IndexByte(suffix, delimiter); m >= 0 {
tail = suffix[m+1:]
suffix = suffix[:m+1]
}
qPrefix := query[:n] + suffix
rePrefix, err := getRegexpForQuery(qPrefix, delimiter)
if err != nil {
return nil, fmt.Errorf("cannot convert query %q to regexp: %w", qPrefix, err)
}
results := make([]string, 0, len(paths))
for _, path := range paths {
if !rePrefix.MatchString(path) {
continue
}
if tail == "" {
results = append(results, path)
continue
}
subquery := path + tail
fullPaths, err := metricsFind(tr, label, subquery, delimiter, isExpand, deadline)
if err != nil {
return nil, err
}
if isExpand {
results = append(results, fullPaths...)
} else {
for _, fullPath := range fullPaths {
results = append(results, qPrefix+fullPath[len(path):])
}
}
}
return results, nil
}
var (
metricsFindDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/metrics/find"}`)
metricsExpandDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/metrics/expand"}`)
metricsIndexDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/metrics/index.json"}`)
)
func addAutomaticVariants(query, delimiter string) string {
// See https://github.com/graphite-project/graphite-web/blob/bb9feb0e6815faa73f538af6ed35adea0fb273fd/webapp/graphite/metrics/views.py#L152
parts := strings.Split(query, delimiter)
for i, part := range parts {
if strings.Contains(part, ",") && !strings.Contains(part, "{") {
parts[i] = "{" + part + "}"
}
}
return strings.Join(parts, delimiter)
}
func filterLeaves(paths []string, delimiter string) []string {
leaves := paths[:0]
for _, path := range paths {
if !strings.HasSuffix(path, delimiter) {
leaves = append(leaves, path)
}
}
return leaves
}
func sortPaths(paths []string, delimiter string) {
sort.Slice(paths, func(i, j int) bool {
a, b := paths[i], paths[j]
isNodeA := strings.HasSuffix(a, delimiter)
isNodeB := strings.HasSuffix(b, delimiter)
if isNodeA == isNodeB {
return a < b
}
return isNodeA
})
}
func getRegexpForQuery(query string, delimiter byte) (*regexp.Regexp, error) {
regexpCacheLock.Lock()
defer regexpCacheLock.Unlock()
k := regexpCacheKey{
query: query,
delimiter: delimiter,
}
if re := regexpCache[k]; re != nil {
return re.re, re.err
}
a := make([]string, 0, len(query))
quotedDelimiter := regexp.QuoteMeta(string([]byte{delimiter}))
tillNextDelimiter := "[^" + quotedDelimiter + "]*"
for i := 0; i < len(query); i++ {
switch query[i] {
case '*':
a = append(a, tillNextDelimiter)
case '{':
tmp := query[i+1:]
if n := strings.IndexByte(tmp, '}'); n < 0 {
a = append(a, regexp.QuoteMeta(query[i:]))
i = len(query)
} else {
a = append(a, "(?:")
opts := strings.Split(tmp[:n], ",")
for j, opt := range opts {
opts[j] = regexp.QuoteMeta(opt)
}
a = append(a, strings.Join(opts, "|"))
a = append(a, ")")
i += n + 1
}
case '[':
tmp := query[i:]
if n := strings.IndexByte(tmp, ']'); n < 0 {
a = append(a, regexp.QuoteMeta(query[i:]))
i = len(query)
} else {
a = append(a, tmp[:n+1])
i += n
}
default:
a = append(a, regexp.QuoteMeta(query[i:i+1]))
}
}
s := strings.Join(a, "")
if !strings.HasSuffix(s, quotedDelimiter) {
s += quotedDelimiter + "?"
}
s = "^(?:" + s + ")$"
re, err := regexp.Compile(s)
regexpCache[k] = &regexpCacheEntry{
re: re,
err: err,
}
if len(regexpCache) >= maxRegexpCacheSize {
for k := range regexpCache {
if len(regexpCache) < maxRegexpCacheSize {
break
}
delete(regexpCache, k)
}
}
return re, err
}
type regexpCacheEntry struct {
re *regexp.Regexp
err error
}
type regexpCacheKey struct {
query string
delimiter byte
}
var regexpCache = make(map[regexpCacheKey]*regexpCacheEntry)
var regexpCacheLock sync.Mutex
const maxRegexpCacheSize = 10000

View File

@@ -0,0 +1,75 @@
package graphite
import (
"reflect"
"testing"
)
func TestGetRegexpForQuery(t *testing.T) {
f := func(query string, delimiter byte, reExpected string) {
t.Helper()
re, err := getRegexpForQuery(query, delimiter)
if err != nil {
t.Fatalf("unexpected error in getRegexpForQuery(%q): %s", query, err)
}
reStr := re.String()
if reStr != reExpected {
t.Fatalf("unexpected regexp for query=%q, delimiter=%c; got %s; want %s", query, delimiter, reStr, reExpected)
}
}
f("", '.', `^(?:\.?)$`)
f("foobar", '.', `^(?:foobar\.?)$`)
f("*", '.', `^(?:[^\.]*\.?)$`)
f("*", '_', `^(?:[^_]*_?)$`)
f("foo.*.bar", '.', `^(?:foo\.[^\.]*\.bar\.?)$`)
f("fo*b{ar,aaa}[a-z]xx*.d", '.', `^(?:fo[^\.]*b(?:ar|aaa)[a-z]xx[^\.]*\.d\.?)$`)
f("fo*b{ar,aaa}[a-z]xx*_d", '_', `^(?:fo[^_]*b(?:ar|aaa)[a-z]xx[^_]*_d_?)$`)
f("foo.[ab]*z", '.', `^(?:foo\.[ab][^\.]*z\.?)$`)
f("foo_[ab]*", '_', `^(?:foo_[ab][^_]*_?)$`)
f("foo_[ab]_", '_', `^(?:foo_[ab]_)$`)
f("foo.[ab].", '.', `^(?:foo\.[ab]\.)$`)
}
func TestSortPaths(t *testing.T) {
f := func(paths []string, delimiter string, pathsSortedExpected []string) {
t.Helper()
sortPaths(paths, delimiter)
if !reflect.DeepEqual(paths, pathsSortedExpected) {
t.Fatalf("unexpected sortPaths result;\ngot\n%q\nwant\n%q", paths, pathsSortedExpected)
}
}
f([]string{"foo", "bar"}, ".", []string{"bar", "foo"})
f([]string{"foo.", "bar", "aa", "ab."}, ".", []string{"ab.", "foo.", "aa", "bar"})
f([]string{"foo.", "bar", "aa", "ab."}, "_", []string{"aa", "ab.", "bar", "foo."})
}
func TestFilterLeaves(t *testing.T) {
f := func(paths []string, delimiter string, leavesExpected []string) {
t.Helper()
leaves := filterLeaves(paths, delimiter)
if !reflect.DeepEqual(leaves, leavesExpected) {
t.Fatalf("unexpected leaves; got\n%q\nwant\n%q", leaves, leavesExpected)
}
}
f([]string{"foo", "bar"}, ".", []string{"foo", "bar"})
f([]string{"a.", ".", "bc"}, ".", []string{"bc"})
f([]string{"a.", ".", "bc"}, "_", []string{"a.", ".", "bc"})
f([]string{"a_", "_", "bc"}, "_", []string{"bc"})
f([]string{"foo.", "bar."}, ".", []string{})
}
func TestAddAutomaticVariants(t *testing.T) {
f := func(query, delimiter, resultExpected string) {
t.Helper()
result := addAutomaticVariants(query, delimiter)
if result != resultExpected {
t.Fatalf("unexpected result for addAutomaticVariants(%q, delimiter=%q); got %q; want %q", query, delimiter, result, resultExpected)
}
}
f("", ".", "")
f("foobar", ".", "foobar")
f("foo,bar.baz", ".", "{foo,bar}.baz")
f("foo,bar.baz", "_", "{foo,bar.baz}")
f("foo,bar_baz*", "_", "{foo,bar}_baz*")
f("foo.bar,baz,aa.bb,cc", ".", "foo.{bar,baz,aa}.{bb,cc}")
}

View File

@@ -0,0 +1,38 @@
{% stripspace %}
MetricsExpandResponseByQuery generates response for /metrics/expand?groupByExpr=1 .
See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-expand
{% func MetricsExpandResponseByQuery(m map[string][]string, jsonp string) %}
{% if jsonp != "" %}{%s= jsonp %}({% endif %}
{
"results":{
{% code i := 0 %}
{% for query, paths := range m %}
{%q= query %}:{%= metricPaths(paths) %}
{% code i++ %}
{% if i < len(m) %},{% endif %}
{% endfor %}
}
}
{% if jsonp != "" %}){% endif %}
{% endfunc %}
MetricsExpandResponseFlat generates response for /metrics/expand?groupByExpr=0 .
See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-expand
{% func MetricsExpandResponseFlat(paths []string, jsonp string) %}
{% if jsonp != "" %}{%s= jsonp %}({% endif %}
{%= metricPaths(paths) %}
{% if jsonp != "" %}){% endif %}
{% endfunc %}
{% func metricPaths(paths []string) %}
[
{% for i, path := range paths %}
{%q= path %}
{% if i+1 < len(paths) %},{% endif %}
{% endfor %}
]
{% endfunc %}
{% endstripspace %}

View File

@@ -0,0 +1,187 @@
// Code generated by qtc from "metrics_expand_response.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
// MetricsExpandResponseByQuery generates response for /metrics/expand?groupByExpr=1 .See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-expand
//line app/vmselect/graphite/metrics_expand_response.qtpl:5
package graphite
//line app/vmselect/graphite/metrics_expand_response.qtpl:5
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmselect/graphite/metrics_expand_response.qtpl:5
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmselect/graphite/metrics_expand_response.qtpl:5
func StreamMetricsExpandResponseByQuery(qw422016 *qt422016.Writer, m map[string][]string, jsonp string) {
//line app/vmselect/graphite/metrics_expand_response.qtpl:6
if jsonp != "" {
//line app/vmselect/graphite/metrics_expand_response.qtpl:6
qw422016.N().S(jsonp)
//line app/vmselect/graphite/metrics_expand_response.qtpl:6
qw422016.N().S(`(`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:6
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:6
qw422016.N().S(`{"results":{`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:9
i := 0
//line app/vmselect/graphite/metrics_expand_response.qtpl:10
for query, paths := range m {
//line app/vmselect/graphite/metrics_expand_response.qtpl:11
qw422016.N().Q(query)
//line app/vmselect/graphite/metrics_expand_response.qtpl:11
qw422016.N().S(`:`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:11
streammetricPaths(qw422016, paths)
//line app/vmselect/graphite/metrics_expand_response.qtpl:12
i++
//line app/vmselect/graphite/metrics_expand_response.qtpl:13
if i < len(m) {
//line app/vmselect/graphite/metrics_expand_response.qtpl:13
qw422016.N().S(`,`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:13
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:14
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:14
qw422016.N().S(`}}`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:17
if jsonp != "" {
//line app/vmselect/graphite/metrics_expand_response.qtpl:17
qw422016.N().S(`)`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:17
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
func WriteMetricsExpandResponseByQuery(qq422016 qtio422016.Writer, m map[string][]string, jsonp string) {
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
StreamMetricsExpandResponseByQuery(qw422016, m, jsonp)
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
func MetricsExpandResponseByQuery(m map[string][]string, jsonp string) string {
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
WriteMetricsExpandResponseByQuery(qb422016, m, jsonp)
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
qs422016 := string(qb422016.B)
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
return qs422016
//line app/vmselect/graphite/metrics_expand_response.qtpl:18
}
// MetricsExpandResponseFlat generates response for /metrics/expand?groupByExpr=0 .See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-expand
//line app/vmselect/graphite/metrics_expand_response.qtpl:23
func StreamMetricsExpandResponseFlat(qw422016 *qt422016.Writer, paths []string, jsonp string) {
//line app/vmselect/graphite/metrics_expand_response.qtpl:24
if jsonp != "" {
//line app/vmselect/graphite/metrics_expand_response.qtpl:24
qw422016.N().S(jsonp)
//line app/vmselect/graphite/metrics_expand_response.qtpl:24
qw422016.N().S(`(`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:24
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:25
streammetricPaths(qw422016, paths)
//line app/vmselect/graphite/metrics_expand_response.qtpl:26
if jsonp != "" {
//line app/vmselect/graphite/metrics_expand_response.qtpl:26
qw422016.N().S(`)`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:26
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
func WriteMetricsExpandResponseFlat(qq422016 qtio422016.Writer, paths []string, jsonp string) {
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
StreamMetricsExpandResponseFlat(qw422016, paths, jsonp)
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
func MetricsExpandResponseFlat(paths []string, jsonp string) string {
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
WriteMetricsExpandResponseFlat(qb422016, paths, jsonp)
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
qs422016 := string(qb422016.B)
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
return qs422016
//line app/vmselect/graphite/metrics_expand_response.qtpl:27
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:29
func streammetricPaths(qw422016 *qt422016.Writer, paths []string) {
//line app/vmselect/graphite/metrics_expand_response.qtpl:29
qw422016.N().S(`[`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:31
for i, path := range paths {
//line app/vmselect/graphite/metrics_expand_response.qtpl:32
qw422016.N().Q(path)
//line app/vmselect/graphite/metrics_expand_response.qtpl:33
if i+1 < len(paths) {
//line app/vmselect/graphite/metrics_expand_response.qtpl:33
qw422016.N().S(`,`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:33
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:34
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:34
qw422016.N().S(`]`)
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
func writemetricPaths(qq422016 qtio422016.Writer, paths []string) {
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
streammetricPaths(qw422016, paths)
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
}
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
func metricPaths(paths []string) string {
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
writemetricPaths(qb422016, paths)
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
qs422016 := string(qb422016.B)
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
return qs422016
//line app/vmselect/graphite/metrics_expand_response.qtpl:36
}

View File

@@ -0,0 +1,138 @@
{% import (
"sort"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
) %}
{% stripspace %}
MetricsFindResponse generates response for /metrics/find .
See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-find
{% func MetricsFindResponse(paths []string, delimiter, format string, addWildcards bool, jsonp string) %}
{% if jsonp != "" %}{%s= jsonp %}({% endif %}
{% switch format %}
{% case "completer" %}
{%= metricsFindResponseCompleter(paths, delimiter, addWildcards) %}
{% case "treejson" %}
{%= metricsFindResponseTreeJSON(paths, delimiter, addWildcards) %}
{% default %}
{% code logger.Panicf("BUG: unexpected format=%q", format) %}
{% endswitch %}
{% if jsonp != "" %}){% endif %}
{% endfunc %}
{% func metricsFindResponseCompleter(paths []string, delimiter string, addWildcards bool) %}
{
"metrics":[
{% for i, path := range paths %}
{
"path": {%q= path %},
"name": {%= metricPathName(path, delimiter) %},
"is_leaf": {% if strings.HasSuffix(path, delimiter) %}0{% else %}1{% endif %}
}
{% if i+1 < len(paths) %},{% endif %}
{% endfor %}
{% if addWildcards && len(paths) > 1 %}
,{
"name": "*"
}
{% endif %}
]
}
{% endfunc %}
{% func metricsFindResponseTreeJSON(paths []string, delimiter string, addWildcards bool) %}
[
{% code
if len(paths) > 1 {
sort.Strings(paths)
// Substitute `path` and `path<delimiter>` with `path<delimiter><delimiter>`.
// Such path is treated specially during rendering - see code below for details.
dst := paths[:1]
for _, path := range paths[1:] {
prevPath := dst[len(dst)-1]
if len(path) == len(prevPath)+1 && strings.HasSuffix(path, delimiter) && strings.HasPrefix(path, prevPath) {
// The path is equivalent to <prevPath> + <delimiter>
// Overwrite the prevPath with <path> + <delimiter> as carbonapi does.
// I.e. the resulting path ends with double delimiter.
// Such path is treated specially during rendering - see metrics_find_response.qtpl for details.
dst[len(dst)-1] = path + delimiter
continue
}
dst = append(dst, path)
}
paths = dst
}
%}
{% for i, path := range paths %}
{
{% code
id := path
allowChildren := "0"
isLeaf := "1"
if strings.HasSuffix(id, delimiter) {
if strings.HasSuffix(id[:len(id)-1], delimiter) {
// Special case when id ends with double delimiter.
// See the code above for details.
id = id[:len(id)-2]
}
allowChildren = "1"
isLeaf = "0"
}
%}
"id": {%q= id %},
"text": {%= metricPathName(path, delimiter) %},
"allowChildren": {%s= allowChildren %},
"expandable": {%s= allowChildren %},
"leaf": {%s= isLeaf %}
}
{% if i+1 < len(paths) %},{% endif %}
{% endfor %}
{% if addWildcards && len(paths) > 1 %}
,{
{% code
path := paths[0]
for strings.HasSuffix(path, delimiter) {
path = path[:len(path)-1]
}
id := ""
if n := strings.LastIndexByte(path, delimiter[0]); n >= 0 {
id = path[:n+1]
}
id += "*"
allowChildren := "0"
isLeaf := "1"
for _, path := range paths {
if strings.HasSuffix(path, delimiter) {
allowChildren = "1"
isLeaf = "0"
break
}
}
%}
"id": {%q= id %},
"text": "*",
"allowChildren": {%s= allowChildren %},
"expandable": {%s= allowChildren %},
"leaf": {%s= isLeaf %}
}
{% endif %}
]
{% endfunc %}
{% func metricPathName(path, delimiter string) %}
{% code
name := path
for strings.HasSuffix(name, delimiter) {
name = name[:len(name)-1]
}
if n := strings.LastIndexByte(name, delimiter[0]); n >= 0 {
name = name[n+1:]
}
%}
{%q= name %}
{% endfunc %}
{% endstripspace %}

View File

@@ -0,0 +1,354 @@
// Code generated by qtc from "metrics_find_response.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vmselect/graphite/metrics_find_response.qtpl:1
package graphite
//line app/vmselect/graphite/metrics_find_response.qtpl:1
import (
"sort"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
// MetricsFindResponse generates response for /metrics/find .See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-find
//line app/vmselect/graphite/metrics_find_response.qtpl:12
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmselect/graphite/metrics_find_response.qtpl:12
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmselect/graphite/metrics_find_response.qtpl:12
func StreamMetricsFindResponse(qw422016 *qt422016.Writer, paths []string, delimiter, format string, addWildcards bool, jsonp string) {
//line app/vmselect/graphite/metrics_find_response.qtpl:13
if jsonp != "" {
//line app/vmselect/graphite/metrics_find_response.qtpl:13
qw422016.N().S(jsonp)
//line app/vmselect/graphite/metrics_find_response.qtpl:13
qw422016.N().S(`(`)
//line app/vmselect/graphite/metrics_find_response.qtpl:13
}
//line app/vmselect/graphite/metrics_find_response.qtpl:14
switch format {
//line app/vmselect/graphite/metrics_find_response.qtpl:15
case "completer":
//line app/vmselect/graphite/metrics_find_response.qtpl:16
streammetricsFindResponseCompleter(qw422016, paths, delimiter, addWildcards)
//line app/vmselect/graphite/metrics_find_response.qtpl:17
case "treejson":
//line app/vmselect/graphite/metrics_find_response.qtpl:18
streammetricsFindResponseTreeJSON(qw422016, paths, delimiter, addWildcards)
//line app/vmselect/graphite/metrics_find_response.qtpl:19
default:
//line app/vmselect/graphite/metrics_find_response.qtpl:20
logger.Panicf("BUG: unexpected format=%q", format)
//line app/vmselect/graphite/metrics_find_response.qtpl:21
}
//line app/vmselect/graphite/metrics_find_response.qtpl:22
if jsonp != "" {
//line app/vmselect/graphite/metrics_find_response.qtpl:22
qw422016.N().S(`)`)
//line app/vmselect/graphite/metrics_find_response.qtpl:22
}
//line app/vmselect/graphite/metrics_find_response.qtpl:23
}
//line app/vmselect/graphite/metrics_find_response.qtpl:23
func WriteMetricsFindResponse(qq422016 qtio422016.Writer, paths []string, delimiter, format string, addWildcards bool, jsonp string) {
//line app/vmselect/graphite/metrics_find_response.qtpl:23
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:23
StreamMetricsFindResponse(qw422016, paths, delimiter, format, addWildcards, jsonp)
//line app/vmselect/graphite/metrics_find_response.qtpl:23
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:23
}
//line app/vmselect/graphite/metrics_find_response.qtpl:23
func MetricsFindResponse(paths []string, delimiter, format string, addWildcards bool, jsonp string) string {
//line app/vmselect/graphite/metrics_find_response.qtpl:23
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/graphite/metrics_find_response.qtpl:23
WriteMetricsFindResponse(qb422016, paths, delimiter, format, addWildcards, jsonp)
//line app/vmselect/graphite/metrics_find_response.qtpl:23
qs422016 := string(qb422016.B)
//line app/vmselect/graphite/metrics_find_response.qtpl:23
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:23
return qs422016
//line app/vmselect/graphite/metrics_find_response.qtpl:23
}
//line app/vmselect/graphite/metrics_find_response.qtpl:25
func streammetricsFindResponseCompleter(qw422016 *qt422016.Writer, paths []string, delimiter string, addWildcards bool) {
//line app/vmselect/graphite/metrics_find_response.qtpl:25
qw422016.N().S(`{"metrics":[`)
//line app/vmselect/graphite/metrics_find_response.qtpl:28
for i, path := range paths {
//line app/vmselect/graphite/metrics_find_response.qtpl:28
qw422016.N().S(`{"path":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:30
qw422016.N().Q(path)
//line app/vmselect/graphite/metrics_find_response.qtpl:30
qw422016.N().S(`,"name":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:31
streammetricPathName(qw422016, path, delimiter)
//line app/vmselect/graphite/metrics_find_response.qtpl:31
qw422016.N().S(`,"is_leaf":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:32
if strings.HasSuffix(path, delimiter) {
//line app/vmselect/graphite/metrics_find_response.qtpl:32
qw422016.N().S(`0`)
//line app/vmselect/graphite/metrics_find_response.qtpl:32
} else {
//line app/vmselect/graphite/metrics_find_response.qtpl:32
qw422016.N().S(`1`)
//line app/vmselect/graphite/metrics_find_response.qtpl:32
}
//line app/vmselect/graphite/metrics_find_response.qtpl:32
qw422016.N().S(`}`)
//line app/vmselect/graphite/metrics_find_response.qtpl:34
if i+1 < len(paths) {
//line app/vmselect/graphite/metrics_find_response.qtpl:34
qw422016.N().S(`,`)
//line app/vmselect/graphite/metrics_find_response.qtpl:34
}
//line app/vmselect/graphite/metrics_find_response.qtpl:35
}
//line app/vmselect/graphite/metrics_find_response.qtpl:36
if addWildcards && len(paths) > 1 {
//line app/vmselect/graphite/metrics_find_response.qtpl:36
qw422016.N().S(`,{"name": "*"}`)
//line app/vmselect/graphite/metrics_find_response.qtpl:40
}
//line app/vmselect/graphite/metrics_find_response.qtpl:40
qw422016.N().S(`]}`)
//line app/vmselect/graphite/metrics_find_response.qtpl:43
}
//line app/vmselect/graphite/metrics_find_response.qtpl:43
func writemetricsFindResponseCompleter(qq422016 qtio422016.Writer, paths []string, delimiter string, addWildcards bool) {
//line app/vmselect/graphite/metrics_find_response.qtpl:43
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:43
streammetricsFindResponseCompleter(qw422016, paths, delimiter, addWildcards)
//line app/vmselect/graphite/metrics_find_response.qtpl:43
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:43
}
//line app/vmselect/graphite/metrics_find_response.qtpl:43
func metricsFindResponseCompleter(paths []string, delimiter string, addWildcards bool) string {
//line app/vmselect/graphite/metrics_find_response.qtpl:43
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/graphite/metrics_find_response.qtpl:43
writemetricsFindResponseCompleter(qb422016, paths, delimiter, addWildcards)
//line app/vmselect/graphite/metrics_find_response.qtpl:43
qs422016 := string(qb422016.B)
//line app/vmselect/graphite/metrics_find_response.qtpl:43
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:43
return qs422016
//line app/vmselect/graphite/metrics_find_response.qtpl:43
}
//line app/vmselect/graphite/metrics_find_response.qtpl:45
func streammetricsFindResponseTreeJSON(qw422016 *qt422016.Writer, paths []string, delimiter string, addWildcards bool) {
//line app/vmselect/graphite/metrics_find_response.qtpl:45
qw422016.N().S(`[`)
//line app/vmselect/graphite/metrics_find_response.qtpl:48
if len(paths) > 1 {
sort.Strings(paths)
// Substitute `path` and `path<delimiter>` with `path<delimiter><delimiter>`.
// Such path is treated specially during rendering - see code below for details.
dst := paths[:1]
for _, path := range paths[1:] {
prevPath := dst[len(dst)-1]
if len(path) == len(prevPath)+1 && strings.HasSuffix(path, delimiter) && strings.HasPrefix(path, prevPath) {
// The path is equivalent to <prevPath> + <delimiter>
// Overwrite the prevPath with <path> + <delimiter> as carbonapi does.
// I.e. the resulting path ends with double delimiter.
// Such path is treated specially during rendering - see metrics_find_response.qtpl for details.
dst[len(dst)-1] = path + delimiter
continue
}
dst = append(dst, path)
}
paths = dst
}
//line app/vmselect/graphite/metrics_find_response.qtpl:68
for i, path := range paths {
//line app/vmselect/graphite/metrics_find_response.qtpl:68
qw422016.N().S(`{`)
//line app/vmselect/graphite/metrics_find_response.qtpl:71
id := path
allowChildren := "0"
isLeaf := "1"
if strings.HasSuffix(id, delimiter) {
if strings.HasSuffix(id[:len(id)-1], delimiter) {
// Special case when id ends with double delimiter.
// See the code above for details.
id = id[:len(id)-2]
}
allowChildren = "1"
isLeaf = "0"
}
//line app/vmselect/graphite/metrics_find_response.qtpl:83
qw422016.N().S(`"id":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:84
qw422016.N().Q(id)
//line app/vmselect/graphite/metrics_find_response.qtpl:84
qw422016.N().S(`,"text":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:85
streammetricPathName(qw422016, path, delimiter)
//line app/vmselect/graphite/metrics_find_response.qtpl:85
qw422016.N().S(`,"allowChildren":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:86
qw422016.N().S(allowChildren)
//line app/vmselect/graphite/metrics_find_response.qtpl:86
qw422016.N().S(`,"expandable":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:87
qw422016.N().S(allowChildren)
//line app/vmselect/graphite/metrics_find_response.qtpl:87
qw422016.N().S(`,"leaf":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:88
qw422016.N().S(isLeaf)
//line app/vmselect/graphite/metrics_find_response.qtpl:88
qw422016.N().S(`}`)
//line app/vmselect/graphite/metrics_find_response.qtpl:90
if i+1 < len(paths) {
//line app/vmselect/graphite/metrics_find_response.qtpl:90
qw422016.N().S(`,`)
//line app/vmselect/graphite/metrics_find_response.qtpl:90
}
//line app/vmselect/graphite/metrics_find_response.qtpl:91
}
//line app/vmselect/graphite/metrics_find_response.qtpl:92
if addWildcards && len(paths) > 1 {
//line app/vmselect/graphite/metrics_find_response.qtpl:92
qw422016.N().S(`,{`)
//line app/vmselect/graphite/metrics_find_response.qtpl:95
path := paths[0]
for strings.HasSuffix(path, delimiter) {
path = path[:len(path)-1]
}
id := ""
if n := strings.LastIndexByte(path, delimiter[0]); n >= 0 {
id = path[:n+1]
}
id += "*"
allowChildren := "0"
isLeaf := "1"
for _, path := range paths {
if strings.HasSuffix(path, delimiter) {
allowChildren = "1"
isLeaf = "0"
break
}
}
//line app/vmselect/graphite/metrics_find_response.qtpl:114
qw422016.N().S(`"id":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:115
qw422016.N().Q(id)
//line app/vmselect/graphite/metrics_find_response.qtpl:115
qw422016.N().S(`,"text": "*","allowChildren":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:117
qw422016.N().S(allowChildren)
//line app/vmselect/graphite/metrics_find_response.qtpl:117
qw422016.N().S(`,"expandable":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:118
qw422016.N().S(allowChildren)
//line app/vmselect/graphite/metrics_find_response.qtpl:118
qw422016.N().S(`,"leaf":`)
//line app/vmselect/graphite/metrics_find_response.qtpl:119
qw422016.N().S(isLeaf)
//line app/vmselect/graphite/metrics_find_response.qtpl:119
qw422016.N().S(`}`)
//line app/vmselect/graphite/metrics_find_response.qtpl:121
}
//line app/vmselect/graphite/metrics_find_response.qtpl:121
qw422016.N().S(`]`)
//line app/vmselect/graphite/metrics_find_response.qtpl:123
}
//line app/vmselect/graphite/metrics_find_response.qtpl:123
func writemetricsFindResponseTreeJSON(qq422016 qtio422016.Writer, paths []string, delimiter string, addWildcards bool) {
//line app/vmselect/graphite/metrics_find_response.qtpl:123
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:123
streammetricsFindResponseTreeJSON(qw422016, paths, delimiter, addWildcards)
//line app/vmselect/graphite/metrics_find_response.qtpl:123
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:123
}
//line app/vmselect/graphite/metrics_find_response.qtpl:123
func metricsFindResponseTreeJSON(paths []string, delimiter string, addWildcards bool) string {
//line app/vmselect/graphite/metrics_find_response.qtpl:123
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/graphite/metrics_find_response.qtpl:123
writemetricsFindResponseTreeJSON(qb422016, paths, delimiter, addWildcards)
//line app/vmselect/graphite/metrics_find_response.qtpl:123
qs422016 := string(qb422016.B)
//line app/vmselect/graphite/metrics_find_response.qtpl:123
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:123
return qs422016
//line app/vmselect/graphite/metrics_find_response.qtpl:123
}
//line app/vmselect/graphite/metrics_find_response.qtpl:125
func streammetricPathName(qw422016 *qt422016.Writer, path, delimiter string) {
//line app/vmselect/graphite/metrics_find_response.qtpl:127
name := path
for strings.HasSuffix(name, delimiter) {
name = name[:len(name)-1]
}
if n := strings.LastIndexByte(name, delimiter[0]); n >= 0 {
name = name[n+1:]
}
//line app/vmselect/graphite/metrics_find_response.qtpl:135
qw422016.N().Q(name)
//line app/vmselect/graphite/metrics_find_response.qtpl:136
}
//line app/vmselect/graphite/metrics_find_response.qtpl:136
func writemetricPathName(qq422016 qtio422016.Writer, path, delimiter string) {
//line app/vmselect/graphite/metrics_find_response.qtpl:136
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:136
streammetricPathName(qw422016, path, delimiter)
//line app/vmselect/graphite/metrics_find_response.qtpl:136
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:136
}
//line app/vmselect/graphite/metrics_find_response.qtpl:136
func metricPathName(path, delimiter string) string {
//line app/vmselect/graphite/metrics_find_response.qtpl:136
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/graphite/metrics_find_response.qtpl:136
writemetricPathName(qb422016, path, delimiter)
//line app/vmselect/graphite/metrics_find_response.qtpl:136
qs422016 := string(qb422016.B)
//line app/vmselect/graphite/metrics_find_response.qtpl:136
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/graphite/metrics_find_response.qtpl:136
return qs422016
//line app/vmselect/graphite/metrics_find_response.qtpl:136
}

View File

@@ -0,0 +1,11 @@
{% stripspace %}
MetricsIndexResponse generates response for /metrics/index.json .
See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-index-json
{% func MetricsIndexResponse(metricNames []string, jsonp string) %}
{% if jsonp != "" %}{%s= jsonp %}({% endif %}
{%= metricPaths(metricNames) %}
{% if jsonp != "" %}){% endif %}
{% endfunc %}
{% endstripspace %}

View File

@@ -0,0 +1,67 @@
// Code generated by qtc from "metrics_index_response.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
// MetricsIndexResponse generates response for /metrics/index.json .See https://graphite-api.readthedocs.io/en/latest/api.html#metrics-index-json
//line app/vmselect/graphite/metrics_index_response.qtpl:5
package graphite
//line app/vmselect/graphite/metrics_index_response.qtpl:5
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmselect/graphite/metrics_index_response.qtpl:5
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmselect/graphite/metrics_index_response.qtpl:5
func StreamMetricsIndexResponse(qw422016 *qt422016.Writer, metricNames []string, jsonp string) {
//line app/vmselect/graphite/metrics_index_response.qtpl:6
if jsonp != "" {
//line app/vmselect/graphite/metrics_index_response.qtpl:6
qw422016.N().S(jsonp)
//line app/vmselect/graphite/metrics_index_response.qtpl:6
qw422016.N().S(`(`)
//line app/vmselect/graphite/metrics_index_response.qtpl:6
}
//line app/vmselect/graphite/metrics_index_response.qtpl:7
streammetricPaths(qw422016, metricNames)
//line app/vmselect/graphite/metrics_index_response.qtpl:8
if jsonp != "" {
//line app/vmselect/graphite/metrics_index_response.qtpl:8
qw422016.N().S(`)`)
//line app/vmselect/graphite/metrics_index_response.qtpl:8
}
//line app/vmselect/graphite/metrics_index_response.qtpl:9
}
//line app/vmselect/graphite/metrics_index_response.qtpl:9
func WriteMetricsIndexResponse(qq422016 qtio422016.Writer, metricNames []string, jsonp string) {
//line app/vmselect/graphite/metrics_index_response.qtpl:9
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/graphite/metrics_index_response.qtpl:9
StreamMetricsIndexResponse(qw422016, metricNames, jsonp)
//line app/vmselect/graphite/metrics_index_response.qtpl:9
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/graphite/metrics_index_response.qtpl:9
}
//line app/vmselect/graphite/metrics_index_response.qtpl:9
func MetricsIndexResponse(metricNames []string, jsonp string) string {
//line app/vmselect/graphite/metrics_index_response.qtpl:9
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/graphite/metrics_index_response.qtpl:9
WriteMetricsIndexResponse(qb422016, metricNames, jsonp)
//line app/vmselect/graphite/metrics_index_response.qtpl:9
qs422016 := string(qb422016.B)
//line app/vmselect/graphite/metrics_index_response.qtpl:9
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/graphite/metrics_index_response.qtpl:9
return qs422016
//line app/vmselect/graphite/metrics_index_response.qtpl:9
}

View File

@@ -1,6 +1,7 @@
package vmselect
import (
"errors"
"flag"
"fmt"
"net/http"
@@ -8,8 +9,10 @@ import (
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -21,7 +24,8 @@ var (
deleteAuthKey = flag.String("deleteAuthKey", "", "authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series")
maxConcurrentRequests = flag.Int("search.maxConcurrentRequests", getDefaultMaxConcurrentRequests(), "The maximum number of concurrent search requests. "+
"It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration")
maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached")
maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the request waits for execution when -search.maxConcurrentRequests "+
"limit is reached; see also -search.maxQueryDuration")
resetCacheAuthKey = flag.String("search.resetCacheAuthKey", "", "Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call")
)
@@ -75,7 +79,11 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
default:
// Sleep for a while until giving up. This should resolve short bursts in requests.
concurrencyLimitReached.Inc()
t := timerpool.Get(*maxQueueDuration)
d := searchutils.GetMaxQueryDuration(r)
if d > *maxQueueDuration {
d = *maxQueueDuration
}
t := timerpool.Get(d)
select {
case concurrencyCh <- struct{}{}:
timerpool.Put(t)
@@ -85,11 +93,12 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
concurrencyLimitTimeout.Inc()
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot handle more than %d concurrent search requests during %s; possible solutions: "+
"increase `-search.maxQueueDuration`, increase `-search.maxConcurrentRequests`, increase server capacity",
*maxConcurrentRequests, *maxQueueDuration),
"increase `-search.maxQueueDuration`; increase `-search.maxQueryDuration`; increase `-search.maxConcurrentRequests`; "+
"increase server capacity",
*maxConcurrentRequests, d),
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, "%s", err)
httpserver.Errorf(w, r, "%s", err)
return true
}
}
@@ -175,18 +184,38 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
return true
case "/api/v1/status/tsdb":
tsdbStatusRequests.Inc()
statusTSDBRequests.Inc()
if err := prometheus.TSDBStatusHandler(startTime, w, r); err != nil {
tsdbStatusErrors.Inc()
statusTSDBErrors.Inc()
sendPrometheusError(w, r, err)
return true
}
return true
case "/api/v1/status/active_queries":
statusActiveQueriesRequests.Inc()
promql.WriteActiveQueries(w)
return true
case "/api/v1/export":
exportRequests.Inc()
if err := prometheus.ExportHandler(startTime, w, r); err != nil {
exportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
return true
case "/api/v1/export/csv":
exportCSVRequests.Inc()
if err := prometheus.ExportCSVHandler(startTime, w, r); err != nil {
exportCSVErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
return true
case "/api/v1/export/native":
exportNativeRequests.Inc()
if err := prometheus.ExportNativeHandler(startTime, w, r); err != nil {
exportNativeErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
return true
@@ -194,7 +223,34 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
federateRequests.Inc()
if err := prometheus.FederateHandler(startTime, w, r); err != nil {
federateErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
return true
case "/metrics/find", "/metrics/find/":
graphiteMetricsFindRequests.Inc()
httpserver.EnableCORS(w, r)
if err := graphite.MetricsFindHandler(startTime, w, r); err != nil {
graphiteMetricsFindErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
return true
case "/metrics/expand", "/metrics/expand/":
graphiteMetricsExpandRequests.Inc()
httpserver.EnableCORS(w, r)
if err := graphite.MetricsExpandHandler(startTime, w, r); err != nil {
graphiteMetricsExpandErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
return true
case "/metrics/index.json", "/metrics/index.json/":
graphiteMetricsIndexRequests.Inc()
httpserver.EnableCORS(w, r)
if err := graphite.MetricsIndexHandler(startTime, w, r); err != nil {
graphiteMetricsIndexErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
return true
@@ -220,12 +276,12 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
deleteRequests.Inc()
authKey := r.FormValue("authKey")
if authKey != *deleteAuthKey {
httpserver.Errorf(w, "invalid authKey %q. It must match the value from -deleteAuthKey command line flag", authKey)
httpserver.Errorf(w, r, "invalid authKey %q. It must match the value from -deleteAuthKey command line flag", authKey)
return true
}
if err := prometheus.DeleteHandler(startTime, r); err != nil {
deleteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
@@ -240,7 +296,8 @@ func sendPrometheusError(w http.ResponseWriter, r *http.Request, err error) {
w.Header().Set("Content-Type", "application/json")
statusCode := http.StatusUnprocessableEntity
if esc, ok := err.(*httpserver.ErrorWithStatusCode); ok {
var esc *httpserver.ErrorWithStatusCode
if errors.As(err, &esc) {
statusCode = esc.StatusCode
}
w.WriteHeader(statusCode)
@@ -269,8 +326,10 @@ var (
labelsCountRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/labels/count"}`)
labelsCountErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/labels/count"}`)
tsdbStatusRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/status/tsdb"}`)
tsdbStatusErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/status/tsdb"}`)
statusTSDBRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/status/tsdb"}`)
statusTSDBErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/status/tsdb"}`)
statusActiveQueriesRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/status/active_queries"}`)
deleteRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/admin/tsdb/delete_series"}`)
deleteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/admin/tsdb/delete_series"}`)
@@ -278,9 +337,24 @@ var (
exportRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/export"}`)
exportErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/export"}`)
exportCSVRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/export/csv"}`)
exportCSVErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/export/csv"}`)
exportNativeRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/export/native"}`)
exportNativeErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/export/native"}`)
federateRequests = metrics.NewCounter(`vm_http_requests_total{path="/federate"}`)
federateErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/federate"}`)
graphiteMetricsFindRequests = metrics.NewCounter(`vm_http_requests_total{path="/metrics/find"}`)
graphiteMetricsFindErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/metrics/find"}`)
graphiteMetricsExpandRequests = metrics.NewCounter(`vm_http_requests_total{path="/metrics/expand"}`)
graphiteMetricsExpandErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/metrics/expand"}`)
graphiteMetricsIndexRequests = metrics.NewCounter(`vm_http_requests_total{path="/metrics/index.json"}`)
graphiteMetricsIndexErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/metrics/index.json"}`)
rulesRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/rules"}`)
alertsRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/alerts"}`)
metadataRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/metadata"}`)

View File

@@ -2,26 +2,28 @@ package netstorage
import (
"container/heap"
"errors"
"flag"
"fmt"
"runtime"
"sort"
"sync"
"sync/atomic"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
)
var (
maxTagKeysPerSearch = flag.Int("search.maxTagKeys", 100e3, "The maximum number of tag keys returned per search")
maxTagValuesPerSearch = flag.Int("search.maxTagValues", 100e3, "The maximum number of tag values returned per search")
maxMetricsPerSearch = flag.Int("search.maxUniqueTimeseries", 300e3, "The maximum number of unique time series each search can scan")
maxTagKeysPerSearch = flag.Int("search.maxTagKeys", 100e3, "The maximum number of tag keys returned from /api/v1/labels")
maxTagValuesPerSearch = flag.Int("search.maxTagValues", 100e3, "The maximum number of tag values returned from /api/v1/label/<label_name>/values")
maxTagValueSuffixesPerSearch = flag.Int("search.maxTagValueSuffixesPerSearch", 100e3, "The maximum number of tag value suffixes returned from /metrics/find")
maxMetricsPerSearch = flag.Int("search.maxUniqueTimeseries", 300e3, "The maximum number of unique time series each search can scan")
)
// Result is a single timeseries result.
@@ -51,7 +53,7 @@ func (r *Result) reset() {
type Results struct {
tr storage.TimeRange
fetchData bool
deadline Deadline
deadline searchutils.Deadline
packedTimeseries []packedTimeseries
sr *storage.Search
@@ -72,81 +74,102 @@ func (rss *Results) mustClose() {
rss.sr = nil
}
// RunParallel runs in parallel f for all the results from rss.
var timeseriesWorkCh = make(chan *timeseriesWork, gomaxprocs*16)
type timeseriesWork struct {
mustStop uint64
rss *Results
pts *packedTimeseries
f func(rs *Result, workerID uint) error
doneCh chan error
rowsProcessed int
}
func init() {
for i := 0; i < gomaxprocs; i++ {
go timeseriesWorker(uint(i))
}
}
func timeseriesWorker(workerID uint) {
var rs Result
var rsLastResetTime uint64
for tsw := range timeseriesWorkCh {
rss := tsw.rss
if rss.deadline.Exceeded() {
tsw.doneCh <- fmt.Errorf("timeout exceeded during query execution: %s", rss.deadline.String())
continue
}
if atomic.LoadUint64(&tsw.mustStop) != 0 {
tsw.doneCh <- nil
continue
}
if err := tsw.pts.Unpack(&rs, rss.tr, rss.fetchData); err != nil {
tsw.doneCh <- fmt.Errorf("error during time series unpacking: %w", err)
continue
}
if len(rs.Timestamps) > 0 || !rss.fetchData {
if err := tsw.f(&rs, workerID); err != nil {
tsw.doneCh <- err
continue
}
}
tsw.rowsProcessed = len(rs.Values)
tsw.doneCh <- nil
currentTime := fasttime.UnixTimestamp()
if cap(rs.Values) > 1024*1024 && 4*len(rs.Values) < cap(rs.Values) && currentTime-rsLastResetTime > 10 {
// Reset rs in order to preseve memory usage after processing big time series with millions of rows.
rs = Result{}
rsLastResetTime = currentTime
}
}
}
// RunParallel runs f in parallel for all the results from rss.
//
// f shouldn't hold references to rs after returning.
// workerID is the id of the worker goroutine that calls f.
// Data processing is immediately stopped if f returns non-nil error.
//
// rss becomes unusable after the call to RunParallel.
func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
func (rss *Results) RunParallel(f func(rs *Result, workerID uint) error) error {
defer rss.mustClose()
workersCount := 1 + len(rss.packedTimeseries)/32
if workersCount > gomaxprocs {
workersCount = gomaxprocs
}
if workersCount == 0 {
logger.Panicf("BUG: workersCount cannot be zero")
}
workCh := make(chan *packedTimeseries, workersCount)
doneCh := make(chan error)
// Start workers.
rowsProcessedTotal := uint64(0)
for i := 0; i < workersCount; i++ {
go func(workerID uint) {
rs := getResult()
defer putResult(rs)
maxWorkersCount := gomaxprocs / workersCount
var err error
rowsProcessed := 0
for pts := range workCh {
if time.Until(rss.deadline.Deadline) < 0 {
err = fmt.Errorf("timeout exceeded during query execution: %s", rss.deadline.String())
break
}
if err = pts.Unpack(rs, rss.tr, rss.fetchData, maxWorkersCount); err != nil {
break
}
if len(rs.Timestamps) == 0 && rss.fetchData {
// Skip empty blocks.
continue
}
rowsProcessed += len(rs.Values)
f(rs, workerID)
}
atomic.AddUint64(&rowsProcessedTotal, uint64(rowsProcessed))
// Drain the remaining work
for range workCh {
}
doneCh <- err
}(uint(i))
}
// Feed workers with work.
tsws := make([]*timeseriesWork, len(rss.packedTimeseries))
for i := range rss.packedTimeseries {
workCh <- &rss.packedTimeseries[i]
tsw := &timeseriesWork{
rss: rss,
pts: &rss.packedTimeseries[i],
f: f,
doneCh: make(chan error, 1),
}
timeseriesWorkCh <- tsw
tsws[i] = tsw
}
seriesProcessedTotal := len(rss.packedTimeseries)
rss.packedTimeseries = rss.packedTimeseries[:0]
close(workCh)
// Wait until workers finish.
var errors []error
for i := 0; i < workersCount; i++ {
if err := <-doneCh; err != nil {
errors = append(errors, err)
// Wait until work is complete.
var firstErr error
rowsProcessedTotal := 0
for _, tsw := range tsws {
if err := <-tsw.doneCh; err != nil && firstErr == nil {
// Return just the first error, since other errors
// are likely duplicate the first error.
firstErr = err
// Notify all the the tsws that they shouldn't be executed.
for _, tsw := range tsws {
atomic.StoreUint64(&tsw.mustStop, 1)
}
}
rowsProcessedTotal += tsw.rowsProcessed
}
perQueryRowsProcessed.Update(float64(rowsProcessedTotal))
perQuerySeriesProcessed.Update(float64(seriesProcessedTotal))
if len(errors) > 0 {
// Return just the first error, since other errors
// is likely duplicate the first error.
return errors[0]
}
return nil
return firstErr
}
var perQueryRowsProcessed = metrics.NewHistogram(`vm_per_query_rows_processed_count`)
@@ -159,70 +182,135 @@ type packedTimeseries struct {
brs []storage.BlockRef
}
var unpackWorkCh = make(chan *unpackWork, gomaxprocs*128)
type unpackWorkItem struct {
br storage.BlockRef
tr storage.TimeRange
}
type unpackWork struct {
ws []unpackWorkItem
sbs []*sortBlock
doneCh chan error
}
func (upw *unpackWork) reset() {
ws := upw.ws
for i := range ws {
w := &ws[i]
w.br = storage.BlockRef{}
w.tr = storage.TimeRange{}
}
upw.ws = upw.ws[:0]
sbs := upw.sbs
for i := range sbs {
sbs[i] = nil
}
upw.sbs = upw.sbs[:0]
if n := len(upw.doneCh); n > 0 {
logger.Panicf("BUG: upw.doneCh must be empty; it contains %d items now", n)
}
}
func (upw *unpackWork) unpack(tmpBlock *storage.Block) {
for _, w := range upw.ws {
sb := getSortBlock()
if err := sb.unpackFrom(tmpBlock, w.br, w.tr); err != nil {
putSortBlock(sb)
upw.doneCh <- fmt.Errorf("cannot unpack block: %w", err)
return
}
upw.sbs = append(upw.sbs, sb)
}
upw.doneCh <- nil
}
func getUnpackWork() *unpackWork {
v := unpackWorkPool.Get()
if v != nil {
return v.(*unpackWork)
}
return &unpackWork{
doneCh: make(chan error, 1),
}
}
func putUnpackWork(upw *unpackWork) {
upw.reset()
unpackWorkPool.Put(upw)
}
var unpackWorkPool sync.Pool
func init() {
for i := 0; i < gomaxprocs; i++ {
go unpackWorker()
}
}
func unpackWorker() {
var tmpBlock storage.Block
for upw := range unpackWorkCh {
upw.unpack(&tmpBlock)
}
}
// unpackBatchSize is the maximum number of blocks that may be unpacked at once by a single goroutine.
//
// This batch is needed in order to reduce contention for upackWorkCh in multi-CPU system.
var unpackBatchSize = 8 * runtime.GOMAXPROCS(-1)
// Unpack unpacks pts to dst.
func (pts *packedTimeseries) Unpack(dst *Result, tr storage.TimeRange, fetchData bool, maxWorkersCount int) error {
func (pts *packedTimeseries) Unpack(dst *Result, tr storage.TimeRange, fetchData bool) error {
dst.reset()
if err := dst.MetricName.Unmarshal(bytesutil.ToUnsafeBytes(pts.metricName)); err != nil {
return fmt.Errorf("cannot unmarshal metricName %q: %s", pts.metricName, err)
return fmt.Errorf("cannot unmarshal metricName %q: %w", pts.metricName, err)
}
workersCount := 1 + len(pts.brs)/32
if workersCount > maxWorkersCount {
workersCount = maxWorkersCount
}
if workersCount == 0 {
logger.Panicf("BUG: workersCount cannot be zero")
}
sbs := make([]*sortBlock, 0, len(pts.brs))
var sbsLock sync.Mutex
workCh := make(chan storage.BlockRef, workersCount)
doneCh := make(chan error)
// Start workers
for i := 0; i < workersCount; i++ {
go func() {
var err error
for br := range workCh {
sb := getSortBlock()
if err = sb.unpackFrom(br, tr, fetchData); err != nil {
break
}
sbsLock.Lock()
sbs = append(sbs, sb)
sbsLock.Unlock()
}
// Drain the remaining work
for range workCh {
}
doneCh <- err
}()
if !fetchData {
// Do not spend resources on data reading and unpacking.
return nil
}
// Feed workers with work
brsLen := len(pts.brs)
upws := make([]*unpackWork, 0, 1+brsLen/unpackBatchSize)
upw := getUnpackWork()
for _, br := range pts.brs {
workCh <- br
}
pts.brs = pts.brs[:0]
close(workCh)
// Wait until workers finish
var errors []error
for i := 0; i < workersCount; i++ {
if err := <-doneCh; err != nil {
errors = append(errors, err)
if len(upw.ws) >= unpackBatchSize {
unpackWorkCh <- upw
upws = append(upws, upw)
upw = getUnpackWork()
}
upw.ws = append(upw.ws, unpackWorkItem{
br: br,
tr: tr,
})
}
if len(errors) > 0 {
// Return the first error only, since other errors are likely the same.
return errors[0]
}
unpackWorkCh <- upw
upws = append(upws, upw)
pts.brs = pts.brs[:0]
// Merge blocks
// Wait until work is complete
sbs := make([]*sortBlock, 0, brsLen)
var firstErr error
for _, upw := range upws {
if err := <-upw.doneCh; err != nil && firstErr == nil {
// Return the first error only, since other errors are likely the same.
firstErr = err
}
if firstErr == nil {
sbs = append(sbs, upw.sbs...)
} else {
for _, sb := range upw.sbs {
putSortBlock(sb)
}
}
putUnpackWork(upw)
}
if firstErr != nil {
return firstErr
}
mergeSortBlocks(dst, sbs)
return nil
}
@@ -298,52 +386,26 @@ func mergeSortBlocks(dst *Result, sbh sortBlocksHeap) {
var dedupsDuringSelect = metrics.NewCounter(`vm_deduplicated_samples_total{type="select"}`)
type sortBlock struct {
// b is used as a temporary storage for unpacked rows before they
// go to Timestamps and Values.
b storage.Block
Timestamps []int64
Values []float64
NextIdx int
}
func (sb *sortBlock) reset() {
sb.b.Reset()
sb.Timestamps = sb.Timestamps[:0]
sb.Values = sb.Values[:0]
sb.NextIdx = 0
}
func (sb *sortBlock) unpackFrom(br storage.BlockRef, tr storage.TimeRange, fetchData bool) error {
br.MustReadBlock(&sb.b, fetchData)
if fetchData {
if err := sb.b.UnmarshalData(); err != nil {
return fmt.Errorf("cannot unmarshal block: %s", err)
}
func (sb *sortBlock) unpackFrom(tmpBlock *storage.Block, br storage.BlockRef, tr storage.TimeRange) error {
tmpBlock.Reset()
br.MustReadBlock(tmpBlock, true)
if err := tmpBlock.UnmarshalData(); err != nil {
return fmt.Errorf("cannot unmarshal block: %w", err)
}
timestamps := sb.b.Timestamps()
// Skip timestamps smaller than tr.MinTimestamp.
i := 0
for i < len(timestamps) && timestamps[i] < tr.MinTimestamp {
i++
}
// Skip timestamps bigger than tr.MaxTimestamp.
j := len(timestamps)
for j > i && timestamps[j-1] > tr.MaxTimestamp {
j--
}
skippedRows := sb.b.RowsCount() - (j - i)
sb.Timestamps, sb.Values = tmpBlock.AppendRowsWithTimeRangeFilter(sb.Timestamps[:0], sb.Values[:0], tr)
skippedRows := tmpBlock.RowsCount() - len(sb.Timestamps)
metricRowsSkipped.Add(skippedRows)
// Copy the remaining values.
if i == j {
return nil
}
values := sb.b.Values()
sb.Timestamps = append(sb.Timestamps, timestamps[i:j]...)
sb.Values = decimal.AppendDecimalToFloat(sb.Values, values[i:j], sb.b.Scale())
return nil
}
@@ -384,10 +446,13 @@ func DeleteSeries(sq *storage.SearchQuery) (int, error) {
}
// GetLabels returns labels until the given deadline.
func GetLabels(deadline Deadline) ([]string, error) {
labels, err := vmstorage.SearchTagKeys(*maxTagKeysPerSearch)
func GetLabels(deadline searchutils.Deadline) ([]string, error) {
if deadline.Exceeded() {
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
}
labels, err := vmstorage.SearchTagKeys(*maxTagKeysPerSearch, deadline.Deadline())
if err != nil {
return nil, fmt.Errorf("error during labels search: %s", err)
return nil, fmt.Errorf("error during labels search: %w", err)
}
// Substitute "" with "__name__"
@@ -405,15 +470,18 @@ func GetLabels(deadline Deadline) ([]string, error) {
// GetLabelValues returns label values for the given labelName
// until the given deadline.
func GetLabelValues(labelName string, deadline Deadline) ([]string, error) {
func GetLabelValues(labelName string, deadline searchutils.Deadline) ([]string, error) {
if deadline.Exceeded() {
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
}
if labelName == "__name__" {
labelName = ""
}
// Search for tag values
labelValues, err := vmstorage.SearchTagValues([]byte(labelName), *maxTagValuesPerSearch)
labelValues, err := vmstorage.SearchTagValues([]byte(labelName), *maxTagValuesPerSearch, deadline.Deadline())
if err != nil {
return nil, fmt.Errorf("error during label values search for labelName=%q: %s", labelName, err)
return nil, fmt.Errorf("error during label values search for labelName=%q: %w", labelName, err)
}
// Sort labelValues like Prometheus does
@@ -422,11 +490,29 @@ func GetLabelValues(labelName string, deadline Deadline) ([]string, error) {
return labelValues, nil
}
// GetLabelEntries returns all the label entries until the given deadline.
func GetLabelEntries(deadline Deadline) ([]storage.TagEntry, error) {
labelEntries, err := vmstorage.SearchTagEntries(*maxTagKeysPerSearch, *maxTagValuesPerSearch)
// GetTagValueSuffixes returns tag value suffixes for the given tagKey and the given tagValuePrefix.
//
// It can be used for implementing https://graphite-api.readthedocs.io/en/latest/api.html#metrics-find
func GetTagValueSuffixes(tr storage.TimeRange, tagKey, tagValuePrefix string, delimiter byte, deadline searchutils.Deadline) ([]string, error) {
if deadline.Exceeded() {
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
}
suffixes, err := vmstorage.SearchTagValueSuffixes(tr, []byte(tagKey), []byte(tagValuePrefix), delimiter, *maxTagValueSuffixesPerSearch, deadline.Deadline())
if err != nil {
return nil, fmt.Errorf("error during label entries request: %s", err)
return nil, fmt.Errorf("error during search for suffixes for tagKey=%q, tagValuePrefix=%q, delimiter=%c on time range %s: %w",
tagKey, tagValuePrefix, delimiter, tr.String(), err)
}
return suffixes, nil
}
// GetLabelEntries returns all the label entries until the given deadline.
func GetLabelEntries(deadline searchutils.Deadline) ([]storage.TagEntry, error) {
if deadline.Exceeded() {
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
}
labelEntries, err := vmstorage.SearchTagEntries(*maxTagKeysPerSearch, *maxTagValuesPerSearch, deadline.Deadline())
if err != nil {
return nil, fmt.Errorf("error during label entries request: %w", err)
}
// Substitute "" with "__name__"
@@ -450,19 +536,25 @@ func GetLabelEntries(deadline Deadline) ([]storage.TagEntry, error) {
}
// GetTSDBStatusForDate returns tsdb status according to https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats
func GetTSDBStatusForDate(deadline Deadline, date uint64, topN int) (*storage.TSDBStatus, error) {
status, err := vmstorage.GetTSDBStatusForDate(date, topN)
func GetTSDBStatusForDate(deadline searchutils.Deadline, date uint64, topN int) (*storage.TSDBStatus, error) {
if deadline.Exceeded() {
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
}
status, err := vmstorage.GetTSDBStatusForDate(date, topN, deadline.Deadline())
if err != nil {
return nil, fmt.Errorf("error during tsdb status request: %s", err)
return nil, fmt.Errorf("error during tsdb status request: %w", err)
}
return status, nil
}
// GetSeriesCount returns the number of unique series.
func GetSeriesCount(deadline Deadline) (uint64, error) {
n, err := vmstorage.GetSeriesCount()
func GetSeriesCount(deadline searchutils.Deadline) (uint64, error) {
if deadline.Exceeded() {
return 0, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
}
n, err := vmstorage.GetSeriesCount(deadline.Deadline())
if err != nil {
return 0, fmt.Errorf("error during series count request: %s", err)
return 0, fmt.Errorf("error during series count request: %w", err)
}
return n, nil
}
@@ -482,10 +574,123 @@ func putStorageSearch(sr *storage.Search) {
var ssPool sync.Pool
// ProcessSearchQuery performs sq on storage nodes until the given deadline.
// ExportBlocks searches for time series matching sq and calls f for each found block.
//
// f is called in parallel from multiple goroutines.
// Data processing is immediately stopped if f returns non-nil error.
// It is the responsibility of f to call b.UnmarshalData before reading timestamps and values from the block.
// It is the responsibility of f to filter blocks according to the given tr.
func ExportBlocks(sq *storage.SearchQuery, deadline searchutils.Deadline, f func(mn *storage.MetricName, b *storage.Block, tr storage.TimeRange) error) error {
if deadline.Exceeded() {
return fmt.Errorf("timeout exceeded before starting data export: %s", deadline.String())
}
tfss, err := setupTfss(sq.TagFilterss)
if err != nil {
return err
}
tr := storage.TimeRange{
MinTimestamp: sq.MinTimestamp,
MaxTimestamp: sq.MaxTimestamp,
}
if err := vmstorage.CheckTimeRange(tr); err != nil {
return err
}
vmstorage.WG.Add(1)
defer vmstorage.WG.Done()
sr := getStorageSearch()
defer putStorageSearch(sr)
sr.Init(vmstorage.Storage, tfss, tr, *maxMetricsPerSearch, deadline.Deadline())
// Start workers that call f in parallel on available CPU cores.
gomaxprocs := runtime.GOMAXPROCS(-1)
workCh := make(chan *exportWork, gomaxprocs*8)
var (
errGlobal error
errGlobalLock sync.Mutex
mustStop uint32
)
var wg sync.WaitGroup
wg.Add(gomaxprocs)
for i := 0; i < gomaxprocs; i++ {
go func() {
defer wg.Done()
for xw := range workCh {
if err := f(&xw.mn, &xw.b, tr); err != nil {
errGlobalLock.Lock()
if errGlobal != nil {
errGlobal = err
atomic.StoreUint32(&mustStop, 1)
}
errGlobalLock.Unlock()
}
xw.reset()
exportWorkPool.Put(xw)
}
}()
}
// Feed workers with work
blocksRead := 0
for sr.NextMetricBlock() {
blocksRead++
if deadline.Exceeded() {
return fmt.Errorf("timeout exceeded while fetching data block #%d from storage: %s", blocksRead, deadline.String())
}
if atomic.LoadUint32(&mustStop) != 0 {
break
}
xw := exportWorkPool.Get().(*exportWork)
if err := xw.mn.Unmarshal(sr.MetricBlockRef.MetricName); err != nil {
return fmt.Errorf("cannot unmarshal metricName for block #%d: %w", blocksRead, err)
}
sr.MetricBlockRef.BlockRef.MustReadBlock(&xw.b, true)
workCh <- xw
}
close(workCh)
// Wait for workers to finish.
wg.Wait()
// Check errors.
err = sr.Error()
if err == nil {
err = errGlobal
}
if err != nil {
if errors.Is(err, storage.ErrDeadlineExceeded) {
return fmt.Errorf("timeout exceeded during the query: %s", deadline.String())
}
return fmt.Errorf("search error after reading %d data blocks: %w", blocksRead, err)
}
return nil
}
type exportWork struct {
mn storage.MetricName
b storage.Block
}
func (xw *exportWork) reset() {
xw.mn.Reset()
xw.b.Reset()
}
var exportWorkPool = &sync.Pool{
New: func() interface{} {
return &exportWork{}
},
}
// ProcessSearchQuery performs sq until the given deadline.
//
// Results.RunParallel or Results.Cancel must be called on the returned Results.
func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadline) (*Results, error) {
func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline searchutils.Deadline) (*Results, error) {
if deadline.Exceeded() {
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
}
// Setup search.
tfss, err := setupTfss(sq.TagFilterss)
if err != nil {
@@ -495,30 +700,44 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadli
MinTimestamp: sq.MinTimestamp,
MaxTimestamp: sq.MaxTimestamp,
}
if err := vmstorage.CheckTimeRange(tr); err != nil {
return nil, err
}
vmstorage.WG.Add(1)
defer vmstorage.WG.Done()
sr := getStorageSearch()
sr.Init(vmstorage.Storage, tfss, tr, *maxMetricsPerSearch)
maxSeriesCount := sr.Init(vmstorage.Storage, tfss, tr, *maxMetricsPerSearch, deadline.Deadline())
m := make(map[string][]storage.BlockRef)
var orderedMetricNames []string
m := make(map[string][]storage.BlockRef, maxSeriesCount)
orderedMetricNames := make([]string, 0, maxSeriesCount)
blocksRead := 0
for sr.NextMetricBlock() {
blocksRead++
if time.Until(deadline.Deadline) < 0 {
if deadline.Exceeded() {
putStorageSearch(sr)
return nil, fmt.Errorf("timeout exceeded while fetching data block #%d from storage: %s", blocksRead, deadline.String())
}
metricName := sr.MetricBlockRef.MetricName
brs := m[string(metricName)]
if len(brs) == 0 {
brs = append(brs, *sr.MetricBlockRef.BlockRef)
if len(brs) > 1 {
// An optimization: do not allocate a string for already existing metricName key in m
m[string(metricName)] = brs
} else {
// An optimization for big number of time series with long metricName values:
// use only a single copy of metricName for both orderedMetricNames and m.
orderedMetricNames = append(orderedMetricNames, string(metricName))
m[orderedMetricNames[len(orderedMetricNames)-1]] = brs
}
m[string(metricName)] = append(brs, *sr.MetricBlockRef.BlockRef)
}
if err := sr.Error(); err != nil {
return nil, fmt.Errorf("search error after reading %d data blocks: %s", blocksRead, err)
putStorageSearch(sr)
if errors.Is(err, storage.ErrDeadlineExceeded) {
return nil, fmt.Errorf("timeout exceeded during the query: %s", deadline.String())
}
return nil, fmt.Errorf("search error after reading %d data blocks: %w", blocksRead, err)
}
var rss Results
@@ -537,25 +756,6 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadli
return &rss, nil
}
func getResult() *Result {
v := rsPool.Get()
if v == nil {
return &Result{}
}
return v.(*Result)
}
func putResult(rs *Result) {
if len(rs.Values) > 8192 {
// Do not pool big results, since they may occupy too much memory.
return
}
rs.reset()
rsPool.Put(rs)
}
var rsPool sync.Pool
func setupTfss(tagFilterss [][]storage.TagFilter) ([]*storage.TagFilters, error) {
tfss := make([]*storage.TagFilters, 0, len(tagFilterss))
for _, tagFilters := range tagFilterss {
@@ -563,7 +763,7 @@ func setupTfss(tagFilterss [][]storage.TagFilter) ([]*storage.TagFilters, error)
for i := range tagFilters {
tf := &tagFilters[i]
if err := tfs.Add(tf.Key, tf.Value, tf.IsNegative, tf.IsRegexp); err != nil {
return nil, fmt.Errorf("cannot parse tag filter %s: %s", tf, err)
return nil, fmt.Errorf("cannot parse tag filter %s: %w", tf, err)
}
}
tfss = append(tfss, tfs)
@@ -571,28 +771,3 @@ func setupTfss(tagFilterss [][]storage.TagFilter) ([]*storage.TagFilters, error)
}
return tfss, nil
}
// Deadline contains deadline with the corresponding timeout for pretty error messages.
type Deadline struct {
Deadline time.Time
timeout time.Duration
flagHint string
}
// NewDeadline returns deadline for the given timeout.
//
// flagHint must contain a hit for command-line flag, which could be used
// in order to increase timeout.
func NewDeadline(timeout time.Duration, flagHint string) Deadline {
return Deadline{
Deadline: time.Now().Add(timeout),
timeout: timeout,
flagHint: flagHint,
}
}
// String returns human-readable string representation for d.
func (d *Deadline) String() string {
return fmt.Sprintf("%.3f seconds; the timeout can be adjusted with `%s` command-line flag", d.timeout.Seconds(), d.flagHint)
}

View File

@@ -1,30 +1,102 @@
{% import (
"bytes"
"strings"
"time"
"github.com/valyala/quicktemplate"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
) %}
{% stripspace %}
{% func ExportPrometheusLine(rs *netstorage.Result) %}
{% if len(rs.Timestamps) == 0 %}{% return %}{% endif %}
{% func ExportCSVLine(xb *exportBlock, fieldNames []string) %}
{% if len(xb.timestamps) == 0 || len(fieldNames) == 0 %}{% return %}{% endif %}
{% for i, timestamp := range xb.timestamps %}
{% code value := xb.values[i] %}
{%= exportCSVField(xb.mn, fieldNames[0], timestamp, value) %}
{% for _, fieldName := range fieldNames[1:] %}
,
{%= exportCSVField(xb.mn, fieldName, timestamp, value) %}
{% endfor %}
{% newline %}
{% endfor %}
{% endfunc %}
{% func exportCSVField(mn *storage.MetricName, fieldName string, timestamp int64, value float64) %}
{% if fieldName == "__value__" %}
{%f= value %}
{% return %}
{% endif %}
{% if fieldName == "__timestamp__" %}
{%dl timestamp %}
{% return %}
{% endif %}
{% if strings.HasPrefix(fieldName, "__timestamp__:") %}
{% code timeFormat := fieldName[len("__timestamp__:"):] %}
{% switch timeFormat %}
{% case "unix_s" %}
{%dl= timestamp/1000 %}
{% case "unix_ms" %}
{%dl= timestamp %}
{% case "unix_ns" %}
{%dl= timestamp*1e6 %}
{% case "rfc3339" %}
{% code
bb := quicktemplate.AcquireByteBuffer()
bb.B = time.Unix(timestamp/1000, (timestamp%1000)*1e6).AppendFormat(bb.B[:0], time.RFC3339)
%}
{%z= bb.B %}
{% code
quicktemplate.ReleaseByteBuffer(bb)
%}
{% default %}
{% if strings.HasPrefix(timeFormat, "custom:") %}
{% code
layout := timeFormat[len("custom:"):]
bb := quicktemplate.AcquireByteBuffer()
bb.B = time.Unix(timestamp/1000, (timestamp%1000)*1e6).AppendFormat(bb.B[:0], layout)
%}
{% if bytes.ContainsAny(bb.B, `"`+",\n") %}
{%qz bb.B %}
{% else %}
{%z= bb.B %}
{% endif %}
{% code
quicktemplate.ReleaseByteBuffer(bb)
%}
{% else %}
Unsupported timeFormat={%s= timeFormat %}
{% endif %}
{% endswitch %}
{% return %}
{% endif %}
{% code v := mn.GetTagValue(fieldName) %}
{% if bytes.ContainsAny(v, `"`+",\n") %}
{%qz= v %}
{% else %}
{%z= v %}
{% endif %}
{% endfunc %}
{% func ExportPrometheusLine(xb *exportBlock) %}
{% if len(xb.timestamps) == 0 %}{% return %}{% endif %}
{% code bb := quicktemplate.AcquireByteBuffer() %}
{% code writeprometheusMetricName(bb, &rs.MetricName) %}
{% for i, ts := range rs.Timestamps %}
{% code writeprometheusMetricName(bb, xb.mn) %}
{% for i, ts := range xb.timestamps %}
{%z= bb.B %}{% space %}
{%f= rs.Values[i] %}{% space %}
{%f= xb.values[i] %}{% space %}
{%dl= ts %}{% newline %}
{% endfor %}
{% code quicktemplate.ReleaseByteBuffer(bb) %}
{% endfunc %}
{% func ExportJSONLine(rs *netstorage.Result) %}
{% if len(rs.Timestamps) == 0 %}{% return %}{% endif %}
{% func ExportJSONLine(xb *exportBlock) %}
{% if len(xb.timestamps) == 0 %}{% return %}{% endif %}
{
"metric":{%= metricNameObject(&rs.MetricName) %},
"metric":{%= metricNameObject(xb.mn) %},
"values":[
{% if len(rs.Values) > 0 %}
{% code values := rs.Values %}
{% if len(xb.values) > 0 %}
{% code values := xb.values %}
{%f= values[0] %}
{% code values = values[1:] %}
{% for _, v := range values %}
@@ -33,8 +105,8 @@
{% endif %}
],
"timestamps":[
{% if len(rs.Timestamps) > 0 %}
{% code timestamps := rs.Timestamps %}
{% if len(xb.timestamps) > 0 %}
{% code timestamps := xb.timestamps %}
{%dl= timestamps[0] %}
{% code timestamps = timestamps[1:] %}
{% for _, ts := range timestamps %}
@@ -45,10 +117,10 @@
}{% newline %}
{% endfunc %}
{% func ExportPromAPILine(rs *netstorage.Result) %}
{% func ExportPromAPILine(xb *exportBlock) %}
{
"metric": {%= metricNameObject(&rs.MetricName) %},
"values": {%= valuesWithTimestamps(rs.Values, rs.Timestamps) %}
"metric": {%= metricNameObject(xb.mn) %},
"values": {%= valuesWithTimestamps(xb.values, xb.timestamps) %}
}
{% endfunc %}

View File

@@ -6,380 +6,566 @@ package prometheus
//line app/vmselect/prometheus/export.qtpl:1
import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
"bytes"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/valyala/quicktemplate"
)
//line app/vmselect/prometheus/export.qtpl:9
//line app/vmselect/prometheus/export.qtpl:12
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmselect/prometheus/export.qtpl:9
//line app/vmselect/prometheus/export.qtpl:12
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmselect/prometheus/export.qtpl:9
func StreamExportPrometheusLine(qw422016 *qt422016.Writer, rs *netstorage.Result) {
//line app/vmselect/prometheus/export.qtpl:10
if len(rs.Timestamps) == 0 {
//line app/vmselect/prometheus/export.qtpl:10
return
//line app/vmselect/prometheus/export.qtpl:10
}
//line app/vmselect/prometheus/export.qtpl:11
bb := quicktemplate.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:12
writeprometheusMetricName(bb, &rs.MetricName)
func StreamExportCSVLine(qw422016 *qt422016.Writer, xb *exportBlock, fieldNames []string) {
//line app/vmselect/prometheus/export.qtpl:13
for i, ts := range rs.Timestamps {
if len(xb.timestamps) == 0 || len(fieldNames) == 0 {
//line app/vmselect/prometheus/export.qtpl:13
return
//line app/vmselect/prometheus/export.qtpl:13
}
//line app/vmselect/prometheus/export.qtpl:14
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:14
qw422016.N().S(` `)
for i, timestamp := range xb.timestamps {
//line app/vmselect/prometheus/export.qtpl:15
qw422016.N().F(rs.Values[i])
//line app/vmselect/prometheus/export.qtpl:15
qw422016.N().S(` `)
//line app/vmselect/prometheus/export.qtpl:16
qw422016.N().DL(ts)
value := xb.values[i]
//line app/vmselect/prometheus/export.qtpl:16
streamexportCSVField(qw422016, xb.mn, fieldNames[0], timestamp, value)
//line app/vmselect/prometheus/export.qtpl:17
for _, fieldName := range fieldNames[1:] {
//line app/vmselect/prometheus/export.qtpl:17
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:19
streamexportCSVField(qw422016, xb.mn, fieldName, timestamp, value)
//line app/vmselect/prometheus/export.qtpl:20
}
//line app/vmselect/prometheus/export.qtpl:21
qw422016.N().S(`
`)
//line app/vmselect/prometheus/export.qtpl:17
//line app/vmselect/prometheus/export.qtpl:22
}
//line app/vmselect/prometheus/export.qtpl:18
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:19
//line app/vmselect/prometheus/export.qtpl:23
}
//line app/vmselect/prometheus/export.qtpl:19
func WriteExportPrometheusLine(qq422016 qtio422016.Writer, rs *netstorage.Result) {
//line app/vmselect/prometheus/export.qtpl:19
//line app/vmselect/prometheus/export.qtpl:23
func WriteExportCSVLine(qq422016 qtio422016.Writer, xb *exportBlock, fieldNames []string) {
//line app/vmselect/prometheus/export.qtpl:23
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:19
StreamExportPrometheusLine(qw422016, rs)
//line app/vmselect/prometheus/export.qtpl:19
//line app/vmselect/prometheus/export.qtpl:23
StreamExportCSVLine(qw422016, xb, fieldNames)
//line app/vmselect/prometheus/export.qtpl:23
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:19
//line app/vmselect/prometheus/export.qtpl:23
}
//line app/vmselect/prometheus/export.qtpl:19
func ExportPrometheusLine(rs *netstorage.Result) string {
//line app/vmselect/prometheus/export.qtpl:19
//line app/vmselect/prometheus/export.qtpl:23
func ExportCSVLine(xb *exportBlock, fieldNames []string) string {
//line app/vmselect/prometheus/export.qtpl:23
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:19
WriteExportPrometheusLine(qb422016, rs)
//line app/vmselect/prometheus/export.qtpl:19
//line app/vmselect/prometheus/export.qtpl:23
WriteExportCSVLine(qb422016, xb, fieldNames)
//line app/vmselect/prometheus/export.qtpl:23
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:19
//line app/vmselect/prometheus/export.qtpl:23
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:19
//line app/vmselect/prometheus/export.qtpl:23
return qs422016
//line app/vmselect/prometheus/export.qtpl:19
//line app/vmselect/prometheus/export.qtpl:23
}
//line app/vmselect/prometheus/export.qtpl:21
func StreamExportJSONLine(qw422016 *qt422016.Writer, rs *netstorage.Result) {
//line app/vmselect/prometheus/export.qtpl:22
if len(rs.Timestamps) == 0 {
//line app/vmselect/prometheus/export.qtpl:22
return
//line app/vmselect/prometheus/export.qtpl:22
}
//line app/vmselect/prometheus/export.qtpl:22
qw422016.N().S(`{"metric":`)
//line app/vmselect/prometheus/export.qtpl:24
streammetricNameObject(qw422016, &rs.MetricName)
//line app/vmselect/prometheus/export.qtpl:24
qw422016.N().S(`,"values":[`)
//line app/vmselect/prometheus/export.qtpl:25
func streamexportCSVField(qw422016 *qt422016.Writer, mn *storage.MetricName, fieldName string, timestamp int64, value float64) {
//line app/vmselect/prometheus/export.qtpl:26
if len(rs.Values) > 0 {
if fieldName == "__value__" {
//line app/vmselect/prometheus/export.qtpl:27
values := rs.Values
qw422016.N().F(value)
//line app/vmselect/prometheus/export.qtpl:28
qw422016.N().F(values[0])
return
//line app/vmselect/prometheus/export.qtpl:29
values = values[1:]
}
//line app/vmselect/prometheus/export.qtpl:30
for _, v := range values {
//line app/vmselect/prometheus/export.qtpl:30
qw422016.N().S(`,`)
if fieldName == "__timestamp__" {
//line app/vmselect/prometheus/export.qtpl:31
qw422016.N().F(v)
qw422016.N().DL(timestamp)
//line app/vmselect/prometheus/export.qtpl:32
}
return
//line app/vmselect/prometheus/export.qtpl:33
}
//line app/vmselect/prometheus/export.qtpl:33
qw422016.N().S(`],"timestamps":[`)
//line app/vmselect/prometheus/export.qtpl:34
if strings.HasPrefix(fieldName, "__timestamp__:") {
//line app/vmselect/prometheus/export.qtpl:35
timeFormat := fieldName[len("__timestamp__:"):]
//line app/vmselect/prometheus/export.qtpl:36
if len(rs.Timestamps) > 0 {
switch timeFormat {
//line app/vmselect/prometheus/export.qtpl:37
timestamps := rs.Timestamps
case "unix_s":
//line app/vmselect/prometheus/export.qtpl:38
qw422016.N().DL(timestamps[0])
qw422016.N().DL(timestamp / 1000)
//line app/vmselect/prometheus/export.qtpl:39
timestamps = timestamps[1:]
case "unix_ms":
//line app/vmselect/prometheus/export.qtpl:40
for _, ts := range timestamps {
//line app/vmselect/prometheus/export.qtpl:40
qw422016.N().S(`,`)
qw422016.N().DL(timestamp)
//line app/vmselect/prometheus/export.qtpl:41
qw422016.N().DL(ts)
case "unix_ns":
//line app/vmselect/prometheus/export.qtpl:42
}
qw422016.N().DL(timestamp * 1e6)
//line app/vmselect/prometheus/export.qtpl:43
}
//line app/vmselect/prometheus/export.qtpl:43
qw422016.N().S(`]}`)
case "rfc3339":
//line app/vmselect/prometheus/export.qtpl:45
qw422016.N().S(`
`)
//line app/vmselect/prometheus/export.qtpl:46
}
//line app/vmselect/prometheus/export.qtpl:46
func WriteExportJSONLine(qq422016 qtio422016.Writer, rs *netstorage.Result) {
//line app/vmselect/prometheus/export.qtpl:46
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:46
StreamExportJSONLine(qw422016, rs)
//line app/vmselect/prometheus/export.qtpl:46
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:46
}
//line app/vmselect/prometheus/export.qtpl:46
func ExportJSONLine(rs *netstorage.Result) string {
//line app/vmselect/prometheus/export.qtpl:46
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:46
WriteExportJSONLine(qb422016, rs)
//line app/vmselect/prometheus/export.qtpl:46
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:46
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:46
return qs422016
//line app/vmselect/prometheus/export.qtpl:46
}
bb := quicktemplate.AcquireByteBuffer()
bb.B = time.Unix(timestamp/1000, (timestamp%1000)*1e6).AppendFormat(bb.B[:0], time.RFC3339)
//line app/vmselect/prometheus/export.qtpl:48
func StreamExportPromAPILine(qw422016 *qt422016.Writer, rs *netstorage.Result) {
//line app/vmselect/prometheus/export.qtpl:48
qw422016.N().S(`{"metric":`)
//line app/vmselect/prometheus/export.qtpl:50
streammetricNameObject(qw422016, &rs.MetricName)
//line app/vmselect/prometheus/export.qtpl:50
qw422016.N().S(`,"values":`)
//line app/vmselect/prometheus/export.qtpl:51
streamvaluesWithTimestamps(qw422016, rs.Values, rs.Timestamps)
//line app/vmselect/prometheus/export.qtpl:51
qw422016.N().S(`}`)
//line app/vmselect/prometheus/export.qtpl:53
}
//line app/vmselect/prometheus/export.qtpl:53
func WriteExportPromAPILine(qq422016 qtio422016.Writer, rs *netstorage.Result) {
//line app/vmselect/prometheus/export.qtpl:53
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:53
StreamExportPromAPILine(qw422016, rs)
//line app/vmselect/prometheus/export.qtpl:53
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:53
}
//line app/vmselect/prometheus/export.qtpl:53
func ExportPromAPILine(rs *netstorage.Result) string {
//line app/vmselect/prometheus/export.qtpl:53
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:53
WriteExportPromAPILine(qb422016, rs)
//line app/vmselect/prometheus/export.qtpl:53
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:53
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:53
return qs422016
//line app/vmselect/prometheus/export.qtpl:53
}
//line app/vmselect/prometheus/export.qtpl:55
func StreamExportPromAPIResponse(qw422016 *qt422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer) {
//line app/vmselect/prometheus/export.qtpl:55
qw422016.N().S(`{"status":"success","data":{"resultType":"matrix","result":[`)
//line app/vmselect/prometheus/export.qtpl:61
bb, ok := <-resultsCh
//line app/vmselect/prometheus/export.qtpl:62
if ok {
//line app/vmselect/prometheus/export.qtpl:63
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:64
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:65
for bb := range resultsCh {
//line app/vmselect/prometheus/export.qtpl:65
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:66
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:67
//line app/vmselect/prometheus/export.qtpl:50
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:52
default:
//line app/vmselect/prometheus/export.qtpl:53
if strings.HasPrefix(timeFormat, "custom:") {
//line app/vmselect/prometheus/export.qtpl:55
layout := timeFormat[len("custom:"):]
bb := quicktemplate.AcquireByteBuffer()
bb.B = time.Unix(timestamp/1000, (timestamp%1000)*1e6).AppendFormat(bb.B[:0], layout)
//line app/vmselect/prometheus/export.qtpl:59
if bytes.ContainsAny(bb.B, `"`+",\n") {
//line app/vmselect/prometheus/export.qtpl:60
qw422016.E().QZ(bb.B)
//line app/vmselect/prometheus/export.qtpl:61
} else {
//line app/vmselect/prometheus/export.qtpl:62
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:63
}
//line app/vmselect/prometheus/export.qtpl:65
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:67
} else {
//line app/vmselect/prometheus/export.qtpl:67
qw422016.N().S(`Unsupported timeFormat=`)
//line app/vmselect/prometheus/export.qtpl:68
qw422016.N().S(timeFormat)
//line app/vmselect/prometheus/export.qtpl:69
}
//line app/vmselect/prometheus/export.qtpl:70
}
//line app/vmselect/prometheus/export.qtpl:69
//line app/vmselect/prometheus/export.qtpl:71
return
//line app/vmselect/prometheus/export.qtpl:72
}
//line app/vmselect/prometheus/export.qtpl:69
qw422016.N().S(`]}}`)
//line app/vmselect/prometheus/export.qtpl:73
}
//line app/vmselect/prometheus/export.qtpl:73
func WriteExportPromAPIResponse(qq422016 qtio422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer) {
//line app/vmselect/prometheus/export.qtpl:73
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:73
StreamExportPromAPIResponse(qw422016, resultsCh)
//line app/vmselect/prometheus/export.qtpl:73
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:73
}
//line app/vmselect/prometheus/export.qtpl:73
func ExportPromAPIResponse(resultsCh <-chan *quicktemplate.ByteBuffer) string {
//line app/vmselect/prometheus/export.qtpl:73
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:73
WriteExportPromAPIResponse(qb422016, resultsCh)
//line app/vmselect/prometheus/export.qtpl:73
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:73
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:73
return qs422016
//line app/vmselect/prometheus/export.qtpl:73
}
v := mn.GetTagValue(fieldName)
//line app/vmselect/prometheus/export.qtpl:74
if bytes.ContainsAny(v, `"`+",\n") {
//line app/vmselect/prometheus/export.qtpl:75
func StreamExportStdResponse(qw422016 *qt422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer) {
qw422016.N().QZ(v)
//line app/vmselect/prometheus/export.qtpl:76
for bb := range resultsCh {
} else {
//line app/vmselect/prometheus/export.qtpl:77
qw422016.N().Z(bb.B)
qw422016.N().Z(v)
//line app/vmselect/prometheus/export.qtpl:78
quicktemplate.ReleaseByteBuffer(bb)
}
//line app/vmselect/prometheus/export.qtpl:79
}
//line app/vmselect/prometheus/export.qtpl:79
}
//line app/vmselect/prometheus/export.qtpl:80
}
//line app/vmselect/prometheus/export.qtpl:80
func WriteExportStdResponse(qq422016 qtio422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer) {
//line app/vmselect/prometheus/export.qtpl:80
func writeexportCSVField(qq422016 qtio422016.Writer, mn *storage.MetricName, fieldName string, timestamp int64, value float64) {
//line app/vmselect/prometheus/export.qtpl:79
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:80
StreamExportStdResponse(qw422016, resultsCh)
//line app/vmselect/prometheus/export.qtpl:80
//line app/vmselect/prometheus/export.qtpl:79
streamexportCSVField(qw422016, mn, fieldName, timestamp, value)
//line app/vmselect/prometheus/export.qtpl:79
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:80
//line app/vmselect/prometheus/export.qtpl:79
}
//line app/vmselect/prometheus/export.qtpl:80
func ExportStdResponse(resultsCh <-chan *quicktemplate.ByteBuffer) string {
//line app/vmselect/prometheus/export.qtpl:80
//line app/vmselect/prometheus/export.qtpl:79
func exportCSVField(mn *storage.MetricName, fieldName string, timestamp int64, value float64) string {
//line app/vmselect/prometheus/export.qtpl:79
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:80
WriteExportStdResponse(qb422016, resultsCh)
//line app/vmselect/prometheus/export.qtpl:80
//line app/vmselect/prometheus/export.qtpl:79
writeexportCSVField(qb422016, mn, fieldName, timestamp, value)
//line app/vmselect/prometheus/export.qtpl:79
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:80
//line app/vmselect/prometheus/export.qtpl:79
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:80
//line app/vmselect/prometheus/export.qtpl:79
return qs422016
//line app/vmselect/prometheus/export.qtpl:80
//line app/vmselect/prometheus/export.qtpl:79
}
//line app/vmselect/prometheus/export.qtpl:81
func StreamExportPrometheusLine(qw422016 *qt422016.Writer, xb *exportBlock) {
//line app/vmselect/prometheus/export.qtpl:82
func streamprometheusMetricName(qw422016 *qt422016.Writer, mn *storage.MetricName) {
if len(xb.timestamps) == 0 {
//line app/vmselect/prometheus/export.qtpl:82
return
//line app/vmselect/prometheus/export.qtpl:82
}
//line app/vmselect/prometheus/export.qtpl:83
qw422016.N().Z(mn.MetricGroup)
bb := quicktemplate.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:84
if len(mn.Tags) > 0 {
//line app/vmselect/prometheus/export.qtpl:84
qw422016.N().S(`{`)
writeprometheusMetricName(bb, xb.mn)
//line app/vmselect/prometheus/export.qtpl:85
for i, ts := range xb.timestamps {
//line app/vmselect/prometheus/export.qtpl:86
tags := mn.Tags
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:86
qw422016.N().S(` `)
//line app/vmselect/prometheus/export.qtpl:87
qw422016.N().Z(tags[0].Key)
qw422016.N().F(xb.values[i])
//line app/vmselect/prometheus/export.qtpl:87
qw422016.N().S(`=`)
//line app/vmselect/prometheus/export.qtpl:87
qw422016.N().QZ(tags[0].Value)
qw422016.N().S(` `)
//line app/vmselect/prometheus/export.qtpl:88
tags = tags[1:]
qw422016.N().DL(ts)
//line app/vmselect/prometheus/export.qtpl:88
qw422016.N().S(`
`)
//line app/vmselect/prometheus/export.qtpl:89
for i := range tags {
}
//line app/vmselect/prometheus/export.qtpl:90
tag := &tags[i]
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:90
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:91
qw422016.N().Z(tag.Key)
}
//line app/vmselect/prometheus/export.qtpl:91
qw422016.N().S(`=`)
func WriteExportPrometheusLine(qq422016 qtio422016.Writer, xb *exportBlock) {
//line app/vmselect/prometheus/export.qtpl:91
qw422016.N().QZ(tag.Value)
//line app/vmselect/prometheus/export.qtpl:92
}
//line app/vmselect/prometheus/export.qtpl:92
qw422016.N().S(`}`)
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:91
StreamExportPrometheusLine(qw422016, xb)
//line app/vmselect/prometheus/export.qtpl:91
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:91
}
//line app/vmselect/prometheus/export.qtpl:91
func ExportPrometheusLine(xb *exportBlock) string {
//line app/vmselect/prometheus/export.qtpl:91
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:91
WriteExportPrometheusLine(qb422016, xb)
//line app/vmselect/prometheus/export.qtpl:91
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:91
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:91
return qs422016
//line app/vmselect/prometheus/export.qtpl:91
}
//line app/vmselect/prometheus/export.qtpl:93
func StreamExportJSONLine(qw422016 *qt422016.Writer, xb *exportBlock) {
//line app/vmselect/prometheus/export.qtpl:94
if len(xb.timestamps) == 0 {
//line app/vmselect/prometheus/export.qtpl:94
return
//line app/vmselect/prometheus/export.qtpl:94
}
//line app/vmselect/prometheus/export.qtpl:95
//line app/vmselect/prometheus/export.qtpl:94
qw422016.N().S(`{"metric":`)
//line app/vmselect/prometheus/export.qtpl:96
streammetricNameObject(qw422016, xb.mn)
//line app/vmselect/prometheus/export.qtpl:96
qw422016.N().S(`,"values":[`)
//line app/vmselect/prometheus/export.qtpl:98
if len(xb.values) > 0 {
//line app/vmselect/prometheus/export.qtpl:99
values := xb.values
//line app/vmselect/prometheus/export.qtpl:100
qw422016.N().F(values[0])
//line app/vmselect/prometheus/export.qtpl:101
values = values[1:]
//line app/vmselect/prometheus/export.qtpl:102
for _, v := range values {
//line app/vmselect/prometheus/export.qtpl:102
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:103
qw422016.N().F(v)
//line app/vmselect/prometheus/export.qtpl:104
}
//line app/vmselect/prometheus/export.qtpl:105
}
//line app/vmselect/prometheus/export.qtpl:105
qw422016.N().S(`],"timestamps":[`)
//line app/vmselect/prometheus/export.qtpl:108
if len(xb.timestamps) > 0 {
//line app/vmselect/prometheus/export.qtpl:109
timestamps := xb.timestamps
//line app/vmselect/prometheus/export.qtpl:110
qw422016.N().DL(timestamps[0])
//line app/vmselect/prometheus/export.qtpl:111
timestamps = timestamps[1:]
//line app/vmselect/prometheus/export.qtpl:112
for _, ts := range timestamps {
//line app/vmselect/prometheus/export.qtpl:112
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:113
qw422016.N().DL(ts)
//line app/vmselect/prometheus/export.qtpl:114
}
//line app/vmselect/prometheus/export.qtpl:115
}
//line app/vmselect/prometheus/export.qtpl:115
qw422016.N().S(`]}`)
//line app/vmselect/prometheus/export.qtpl:117
qw422016.N().S(`
`)
//line app/vmselect/prometheus/export.qtpl:118
}
//line app/vmselect/prometheus/export.qtpl:95
func writeprometheusMetricName(qq422016 qtio422016.Writer, mn *storage.MetricName) {
//line app/vmselect/prometheus/export.qtpl:95
//line app/vmselect/prometheus/export.qtpl:118
func WriteExportJSONLine(qq422016 qtio422016.Writer, xb *exportBlock) {
//line app/vmselect/prometheus/export.qtpl:118
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:95
streamprometheusMetricName(qw422016, mn)
//line app/vmselect/prometheus/export.qtpl:95
//line app/vmselect/prometheus/export.qtpl:118
StreamExportJSONLine(qw422016, xb)
//line app/vmselect/prometheus/export.qtpl:118
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:95
//line app/vmselect/prometheus/export.qtpl:118
}
//line app/vmselect/prometheus/export.qtpl:95
func prometheusMetricName(mn *storage.MetricName) string {
//line app/vmselect/prometheus/export.qtpl:95
//line app/vmselect/prometheus/export.qtpl:118
func ExportJSONLine(xb *exportBlock) string {
//line app/vmselect/prometheus/export.qtpl:118
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:95
writeprometheusMetricName(qb422016, mn)
//line app/vmselect/prometheus/export.qtpl:95
//line app/vmselect/prometheus/export.qtpl:118
WriteExportJSONLine(qb422016, xb)
//line app/vmselect/prometheus/export.qtpl:118
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:95
//line app/vmselect/prometheus/export.qtpl:118
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:95
//line app/vmselect/prometheus/export.qtpl:118
return qs422016
//line app/vmselect/prometheus/export.qtpl:95
//line app/vmselect/prometheus/export.qtpl:118
}
//line app/vmselect/prometheus/export.qtpl:120
func StreamExportPromAPILine(qw422016 *qt422016.Writer, xb *exportBlock) {
//line app/vmselect/prometheus/export.qtpl:120
qw422016.N().S(`{"metric":`)
//line app/vmselect/prometheus/export.qtpl:122
streammetricNameObject(qw422016, xb.mn)
//line app/vmselect/prometheus/export.qtpl:122
qw422016.N().S(`,"values":`)
//line app/vmselect/prometheus/export.qtpl:123
streamvaluesWithTimestamps(qw422016, xb.values, xb.timestamps)
//line app/vmselect/prometheus/export.qtpl:123
qw422016.N().S(`}`)
//line app/vmselect/prometheus/export.qtpl:125
}
//line app/vmselect/prometheus/export.qtpl:125
func WriteExportPromAPILine(qq422016 qtio422016.Writer, xb *exportBlock) {
//line app/vmselect/prometheus/export.qtpl:125
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:125
StreamExportPromAPILine(qw422016, xb)
//line app/vmselect/prometheus/export.qtpl:125
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:125
}
//line app/vmselect/prometheus/export.qtpl:125
func ExportPromAPILine(xb *exportBlock) string {
//line app/vmselect/prometheus/export.qtpl:125
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:125
WriteExportPromAPILine(qb422016, xb)
//line app/vmselect/prometheus/export.qtpl:125
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:125
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:125
return qs422016
//line app/vmselect/prometheus/export.qtpl:125
}
//line app/vmselect/prometheus/export.qtpl:127
func StreamExportPromAPIResponse(qw422016 *qt422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer) {
//line app/vmselect/prometheus/export.qtpl:127
qw422016.N().S(`{"status":"success","data":{"resultType":"matrix","result":[`)
//line app/vmselect/prometheus/export.qtpl:133
bb, ok := <-resultsCh
//line app/vmselect/prometheus/export.qtpl:134
if ok {
//line app/vmselect/prometheus/export.qtpl:135
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:136
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:137
for bb := range resultsCh {
//line app/vmselect/prometheus/export.qtpl:137
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:138
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:139
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:140
}
//line app/vmselect/prometheus/export.qtpl:141
}
//line app/vmselect/prometheus/export.qtpl:141
qw422016.N().S(`]}}`)
//line app/vmselect/prometheus/export.qtpl:145
}
//line app/vmselect/prometheus/export.qtpl:145
func WriteExportPromAPIResponse(qq422016 qtio422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer) {
//line app/vmselect/prometheus/export.qtpl:145
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:145
StreamExportPromAPIResponse(qw422016, resultsCh)
//line app/vmselect/prometheus/export.qtpl:145
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:145
}
//line app/vmselect/prometheus/export.qtpl:145
func ExportPromAPIResponse(resultsCh <-chan *quicktemplate.ByteBuffer) string {
//line app/vmselect/prometheus/export.qtpl:145
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:145
WriteExportPromAPIResponse(qb422016, resultsCh)
//line app/vmselect/prometheus/export.qtpl:145
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:145
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:145
return qs422016
//line app/vmselect/prometheus/export.qtpl:145
}
//line app/vmselect/prometheus/export.qtpl:147
func StreamExportStdResponse(qw422016 *qt422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer) {
//line app/vmselect/prometheus/export.qtpl:148
for bb := range resultsCh {
//line app/vmselect/prometheus/export.qtpl:149
qw422016.N().Z(bb.B)
//line app/vmselect/prometheus/export.qtpl:150
quicktemplate.ReleaseByteBuffer(bb)
//line app/vmselect/prometheus/export.qtpl:151
}
//line app/vmselect/prometheus/export.qtpl:152
}
//line app/vmselect/prometheus/export.qtpl:152
func WriteExportStdResponse(qq422016 qtio422016.Writer, resultsCh <-chan *quicktemplate.ByteBuffer) {
//line app/vmselect/prometheus/export.qtpl:152
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:152
StreamExportStdResponse(qw422016, resultsCh)
//line app/vmselect/prometheus/export.qtpl:152
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:152
}
//line app/vmselect/prometheus/export.qtpl:152
func ExportStdResponse(resultsCh <-chan *quicktemplate.ByteBuffer) string {
//line app/vmselect/prometheus/export.qtpl:152
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:152
WriteExportStdResponse(qb422016, resultsCh)
//line app/vmselect/prometheus/export.qtpl:152
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:152
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:152
return qs422016
//line app/vmselect/prometheus/export.qtpl:152
}
//line app/vmselect/prometheus/export.qtpl:154
func streamprometheusMetricName(qw422016 *qt422016.Writer, mn *storage.MetricName) {
//line app/vmselect/prometheus/export.qtpl:155
qw422016.N().Z(mn.MetricGroup)
//line app/vmselect/prometheus/export.qtpl:156
if len(mn.Tags) > 0 {
//line app/vmselect/prometheus/export.qtpl:156
qw422016.N().S(`{`)
//line app/vmselect/prometheus/export.qtpl:158
tags := mn.Tags
//line app/vmselect/prometheus/export.qtpl:159
qw422016.N().Z(tags[0].Key)
//line app/vmselect/prometheus/export.qtpl:159
qw422016.N().S(`=`)
//line app/vmselect/prometheus/export.qtpl:159
qw422016.N().QZ(tags[0].Value)
//line app/vmselect/prometheus/export.qtpl:160
tags = tags[1:]
//line app/vmselect/prometheus/export.qtpl:161
for i := range tags {
//line app/vmselect/prometheus/export.qtpl:162
tag := &tags[i]
//line app/vmselect/prometheus/export.qtpl:162
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:163
qw422016.N().Z(tag.Key)
//line app/vmselect/prometheus/export.qtpl:163
qw422016.N().S(`=`)
//line app/vmselect/prometheus/export.qtpl:163
qw422016.N().QZ(tag.Value)
//line app/vmselect/prometheus/export.qtpl:164
}
//line app/vmselect/prometheus/export.qtpl:164
qw422016.N().S(`}`)
//line app/vmselect/prometheus/export.qtpl:166
}
//line app/vmselect/prometheus/export.qtpl:167
}
//line app/vmselect/prometheus/export.qtpl:167
func writeprometheusMetricName(qq422016 qtio422016.Writer, mn *storage.MetricName) {
//line app/vmselect/prometheus/export.qtpl:167
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmselect/prometheus/export.qtpl:167
streamprometheusMetricName(qw422016, mn)
//line app/vmselect/prometheus/export.qtpl:167
qt422016.ReleaseWriter(qw422016)
//line app/vmselect/prometheus/export.qtpl:167
}
//line app/vmselect/prometheus/export.qtpl:167
func prometheusMetricName(mn *storage.MetricName) string {
//line app/vmselect/prometheus/export.qtpl:167
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmselect/prometheus/export.qtpl:167
writeprometheusMetricName(qb422016, mn)
//line app/vmselect/prometheus/export.qtpl:167
qs422016 := string(qb422016.B)
//line app/vmselect/prometheus/export.qtpl:167
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmselect/prometheus/export.qtpl:167
return qs422016
//line app/vmselect/prometheus/export.qtpl:167
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,27 +1,22 @@
package prometheus
import (
"fmt"
"math"
"net/http"
"net/url"
"reflect"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
)
func TestRemoveNaNValuesInplace(t *testing.T) {
func TestRemoveEmptyValuesAndTimeseries(t *testing.T) {
f := func(tss []netstorage.Result, tssExpected []netstorage.Result) {
t.Helper()
removeNaNValuesInplace(tss)
tss = removeEmptyValuesAndTimeseries(tss)
if !reflect.DeepEqual(tss, tssExpected) {
t.Fatalf("unexpected result; got %v; want %v", tss, tssExpected)
}
}
nan := math.NaN()
f(nil, nil)
f([]netstorage.Result{
{
@@ -32,6 +27,14 @@ func TestRemoveNaNValuesInplace(t *testing.T) {
Timestamps: []int64{100, 200, 300, 400},
Values: []float64{nan, nan, 3, nan},
},
{
Timestamps: []int64{1, 2},
Values: []float64{nan, nan},
},
{
Timestamps: nil,
Values: nil,
},
}, []netstorage.Result{
{
Timestamps: []int64{100, 200, 300},
@@ -44,72 +47,151 @@ func TestRemoveNaNValuesInplace(t *testing.T) {
})
}
func TestGetTimeSuccess(t *testing.T) {
f := func(s string, timestampExpected int64) {
func TestAdjustLastPoints(t *testing.T) {
f := func(tss []netstorage.Result, start, end int64, tssExpected []netstorage.Result) {
t.Helper()
urlStr := fmt.Sprintf("http://foo.bar/baz?s=%s", url.QueryEscape(s))
r, err := http.NewRequest("GET", urlStr, nil)
if err != nil {
t.Fatalf("unexpected error in NewRequest: %s", err)
tss = adjustLastPoints(tss, start, end)
for i, ts := range tss {
for j, value := range ts.Values {
expectedValue := tssExpected[i].Values[j]
if math.IsNaN(expectedValue) {
if !math.IsNaN(value) {
t.Fatalf("unexpected value for time series #%d at position %d; got %v; want nan", i, j, value)
}
} else if expectedValue != value {
t.Fatalf("unexpected value for time series #%d at position %d; got %v; want %v", i, j, value, expectedValue)
}
}
if !reflect.DeepEqual(ts.Timestamps, tssExpected[i].Timestamps) {
t.Fatalf("unexpected timestamps for time series #%d; got %v; want %v", i, tss, tssExpected)
}
}
// Verify defaultValue
ts, err := getTime(r, "foo", 123)
if err != nil {
t.Fatalf("unexpected error when obtaining default time from getTime(%q): %s", s, err)
}
if ts != 123 {
t.Fatalf("unexpected default value for getTime(%q); got %d; want %d", s, ts, 123)
}
// Verify timestampExpected
ts, err = getTime(r, "s", 123)
if err != nil {
t.Fatalf("unexpected error in getTime(%q): %s", s, err)
}
if ts != timestampExpected {
t.Fatalf("unexpected timestamp for getTime(%q); got %d; want %d", s, ts, timestampExpected)
}
}
f("2019-07-07T20:01:02Z", 1562529662000)
f("2019-07-07T20:47:40+03:00", 1562521660000)
f("-292273086-05-16T16:47:06Z", minTimeMsecs)
f("292277025-08-18T07:12:54.999999999Z", maxTimeMsecs)
f("1562529662.324", 1562529662324)
f("-9223372036.854", minTimeMsecs)
f("-9223372036.855", minTimeMsecs)
f("9223372036.855", maxTimeMsecs)
}
func TestGetTimeError(t *testing.T) {
f := func(s string) {
t.Helper()
urlStr := fmt.Sprintf("http://foo.bar/baz?s=%s", url.QueryEscape(s))
r, err := http.NewRequest("GET", urlStr, nil)
if err != nil {
t.Fatalf("unexpected error in NewRequest: %s", err)
}
// Verify defaultValue
ts, err := getTime(r, "foo", 123)
if err != nil {
t.Fatalf("unexpected error when obtaining default time from getTime(%q): %s", s, err)
}
if ts != 123 {
t.Fatalf("unexpected default value for getTime(%q); got %d; want %d", s, ts, 123)
}
// Verify timestampExpected
_, err = getTime(r, "s", 123)
if err == nil {
t.Fatalf("expecting non-nil error in getTime(%q)", s)
}
}
f("foo")
f("2019-07-07T20:01:02Zisdf")
f("2019-07-07T20:47:40+03:00123")
f("-292273086-05-16T16:47:07Z")
f("292277025-08-18T07:12:54.999999998Z")
nan := math.NaN()
f(nil, 300, 500, nil)
f([]netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, 4, nan},
},
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, nan, nan},
},
}, 400, 500, []netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, 4, 4},
},
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, nan, nan},
},
})
f([]netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, nan, nan},
},
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, nan, nan, nan},
},
}, 300, 500, []netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, 3, 3},
},
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, nan, nan, nan},
},
})
f([]netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, nan, nan, nan},
},
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, nan, nan, nan, nan},
},
}, 500, 300, []netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, nan, nan, nan},
},
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, nan, nan, nan, nan},
},
})
f([]netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, 4, nan},
},
{
Timestamps: []int64{100, 200, 300, 400},
Values: []float64{1, 2, 3, 4},
},
}, 400, 500, []netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, 4, 4},
},
{
Timestamps: []int64{100, 200, 300, 400},
Values: []float64{1, 2, 3, 4},
},
})
f([]netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, nan, nan},
},
{
Timestamps: []int64{100, 200, 300},
Values: []float64{1, 2, nan},
},
}, 300, 600, []netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, 3, 3},
},
{
Timestamps: []int64{100, 200, 300},
Values: []float64{1, 2, nan},
},
})
// Check for timestamps outside the configured time range.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/625
f([]netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, nan, nan},
},
{
Timestamps: []int64{100, 200, 300},
Values: []float64{1, 2, 45},
},
}, 250, 400, []netstorage.Result{
{
Timestamps: []int64{100, 200, 300, 400, 500},
Values: []float64{1, 2, 3, nan, nan},
},
{
Timestamps: []int64{100, 200, 300},
Values: []float64{1, 2, 2},
},
})
}

Some files were not shown because too many files have changed in this diff Show More