Compare commits

503 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
43375df923 lib/promscrape/discovery/kubernetes: update stale comments 2020-04-17 14:06:20 +03:00
Aliaksandr Valialkin
43bbffebb3 vendor: make vendor-update 2020-04-17 13:24:08 +03:00
Aliaksandr Valialkin
79fb595732 docs/vmagent.md: typo fix: unvailable -> unavailable 2020-04-17 13:11:31 +03:00
Aliaksandr Valialkin
546d26523c app/vmagent/README.md: mention -promscrape.suppressScrapeErrors 2020-04-17 13:08:21 +03:00
Aliaksandr Valialkin
f41e6a7bd9 app/vmselect: properly apply -search.maxLookback to queries sent to /api/v1/query 2020-04-17 12:30:11 +03:00
Dmitry Shihovtsev
830538e290 Fix misspelled Cortex name in the FAQ (#421) 2020-04-17 08:36:12 +01:00
Aliaksandr Valialkin
5d1537a395 lib/promscrape: suppress scrape errors if -promscrape.suppressScrapeErrors flag is set 2020-04-16 23:41:30 +03:00
Aliaksandr Valialkin
600490131f lib/promscrape: print all the labels for the target on error message for failed scrape
This should improve debuggability.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/420
2020-04-16 23:35:05 +03:00
Aliaksandr Valialkin
bd4c6d21dd lib/promscrape: retry target scraping when the target closes previously established keep-alive connection to it
This should fix the following error:

the server closed connection before returning the first response byte. Make sure the server returns 'Connection: close' response header before closing the connection
2020-04-16 23:25:29 +03:00
Aliaksandr Valialkin
95da8d410c docs/Single-server-VictoriaMetrics.md: mention that VictoriaMetrics supports Kubernetes service discovery 2020-04-16 18:40:11 +03:00
Aliaksandr Valialkin
bcec5c5429 docs/Single-server-VictoriaMetrics.md: typo fix: unneded -> unneeded 2020-04-16 17:35:08 +03:00
Aliaksandr Valialkin
467279acd2 docs/Single-server-VictoriaMetrics.md: improve docs about metrics deletion 2020-04-16 17:32:09 +03:00
Aliaksandr Valialkin
e0d213f82b docs/Single-server-VictoriaMetrics.md: mention that the delete API can be protected by authKey 2020-04-16 17:19:10 +03:00
Aliaksandr Valialkin
2fd2dec5eb lib/logger: typo fix 2020-04-16 00:19:10 +03:00
Aliaksandr Valialkin
071fdf5518 lib/logger: add WARN level for logging expected errors such as invalid user queries 2020-04-15 20:50:26 +03:00
Aliaksandr Valialkin
30b401ebbf docs/Single-server-VictoriaMetrics.md: typo fix 2020-04-15 15:21:58 +03:00
Aliaksandr Valialkin
a59a7bcc5e vendor: make vendor-update 2020-04-15 14:52:24 +03:00
Aliaksandr Valialkin
ccb887c0f6 docs/Single-server-VictoriaMetrics.md: clarify how to use -influxListenAddr command-line option 2020-04-15 12:33:42 +03:00
Aliaksandr Valialkin
6f7f64f757 app/vmselect: handle timestamp(metric offset X) the same way as Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/415
2020-04-15 12:01:00 +03:00
Aliaksandr Valialkin
426a0567c4 lib/promscrape: code cleanup in runScraper func 2020-04-15 11:36:24 +03:00
Aliaksandr Valialkin
6e2f6574b8 docs/Single-server-VictoriaMetrics.md: mention that backfilling can be done via any supported ingestion method 2020-04-15 10:56:53 +03:00
Aliaksandr Valialkin
c1de3f67b4 lib/storage: skip metricID if the corresponding metricID->metricName is missing in inverted index during search
This case is possible when the corresponding metricID->metricName entry didn't propagate to inverted index yet.

This should fix the following error:

error when searching tsids for tfss [...]: cannot find metricName by metricID 1582417212213420669: EOF
2020-04-15 00:06:43 +03:00
Aliaksandr Valialkin
8a25c1ed71 docs/Single-server-VictoriaMetrics.md: add https://github.com/Slapper/ansible-victoriametrics-cluster-role to integrations chapter 2020-04-14 16:27:20 +03:00
Aliaksandr Valialkin
067c7afebc lib/promscrape: show information on improperly configured scrape targets at the bottom of /targets page
This is a common error with improperly configured target autodiscovery and/or relabeling.
This error leads to duplicate scraping of the same targets with the same set of labels, which leads
to duplicate samples in time series.
2020-04-14 14:55:05 +03:00
Aliaksandr Valialkin
ac35635b71 lib/promscrape/discovery/kubernetes: remove only unused client for API server during cleaning 2020-04-14 14:19:21 +03:00
Aliaksandr Valialkin
78863d7066 lib/promscrape: add promrelabel.GetLabelValueByName helper function 2020-04-14 14:12:01 +03:00
Aliaksandr Valialkin
c64f003cfb lib/promscrape: mention job name in error messages when target cannot be scraped
This should improve debuggability
2020-04-14 13:33:13 +03:00
Aliaksandr Valialkin
4718a5d951 lib/promscrape: reset ScrapeWork.ID in tests 2020-04-14 13:31:31 +03:00
Aliaksandr Valialkin
257521a634 lib/promscrape: properly expose statuses for targets with duplicate scrape urls at /targets page
Previously targets with duplicate scrape urls were merged into a single line on the page.
Now each target with duplicate scrape url is displayed on a separate line.
2020-04-14 13:10:01 +03:00
Aliaksandr Valialkin
6a75c95194 lib/promscrape: remove labels starting with __meta_ after applying relabel_configs as Prometheus does
This should reduce CPU load during scraping when target discovery generates
a large number of `__meta_*` labels (for instance, k8s discovery).

See https://www.robustperception.io/life-of-a-label for details.
2020-04-14 12:23:22 +03:00
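The label cleanup described in commit 6a75c95194 can be illustrated with a minimal Go sketch. The function name `dropMetaLabels` and the map-based label representation are assumptions made for illustration; they are not taken from lib/promscrape.

```go
package main

import (
	"fmt"
	"strings"
)

// dropMetaLabels returns a copy of labels without the names starting with "__meta_".
// It is meant to run after relabel_configs have been applied to the target.
// Hypothetical helper: the real code does not necessarily use maps.
func dropMetaLabels(labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels))
	for name, value := range labels {
		if strings.HasPrefix(name, "__meta_") {
			continue
		}
		out[name] = value
	}
	return out
}

func main() {
	labels := map[string]string{
		"job":                         "kubelet",
		"__meta_kubernetes_node_name": "node-1",
	}
	fmt.Println(dropMetaLabels(labels)) // map[job:kubelet]
}
```

Dropping these labels once per target, rather than carrying them through every scrape, is what would save CPU when discovery emits many of them.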
Aliaksandr Valialkin
01d7d799dc lib/promscrape: rename 'scrape_config->scrape_limit' to 'scrape_config->sample_limit'
`scrape_config` block from Prometheus config contains `sample_limit` field,
while in `vmagent` this field was mistakenly named as `scrape_limit`.
2020-04-14 11:59:57 +03:00
Aliaksandr Valialkin
0b76c27fa1 docs/vmagent.md: mention that vmagent supports kubernetes_sd_configs now 2020-04-13 21:06:36 +03:00
Aliaksandr Valialkin
2e4e202c2b lib/promscrape: add initial support for kubernetes_sd_config
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/334
2020-04-13 21:03:28 +03:00
Aliaksandr Valialkin
2814b1490f lib/promscrape: add -promscrape.config.strictParse flag for detecting errors in -promscrape.config file 2020-04-13 13:15:44 +03:00
Aliaksandr Valialkin
90b4a6dd12 lib/promscrape: extract common auth code to lib/promauth 2020-04-13 12:59:10 +03:00
hagen1778
2eed6c393f vmalert: prepare package for external usage
* update README according to changes
* add Makefile with basic commands
2020-04-12 15:32:42 +03:00
kreedom
948f8b6b5f [vmalert] fix linter issues 2020-04-12 15:08:11 +03:00
kreedom
8fca5f2819 [vmalert] add tests to webserver (#413) 2020-04-12 14:51:03 +03:00
Roman Khavronenko
7c9405f53d Vmalert metrics (#412)
vmalert: add basic list of metrics
2020-04-11 20:42:01 +01:00
Roman Khavronenko
9f8cc8ae1b Extend web responses for alerts: (#411)
vmalert: Extend web responses for alerts

* populate apiAlert object with additional fields
* return all active alerts, not only firing
* sort list of API alerts for deterministic output
* add helper for available path list
2020-04-11 16:49:23 +01:00
kreedom
90de3086b3 [vmalert] add webserver (#410)
* [vmalert] add webserver
2020-04-11 12:40:24 +03:00
Aliaksandr Valialkin
830d5fb1e0 vendor: make vendor-update 2020-04-10 18:40:21 +03:00
Aliaksandr Valialkin
66d8086a5e vendor: update github.com/klauspost/compress from v1.10.3 to v1.10.4 2020-04-10 18:39:19 +03:00
Aliaksandr Valialkin
a30c98c0bc deployment/docker: update Go builder image from go1.14.1 to go1.14.2 2020-04-10 18:19:34 +03:00
Aliaksandr Valialkin
4de6c6bbf0 lib/storage: disable deduplication after dedup tests are complete
The rest of tests expect that the de-duplication is disabled.
2020-04-10 17:28:31 +03:00
Aliaksandr Valialkin
ded0c0d3c7 lib/storage: correctly handle -dedup.minScrapeInterval values smaller than 8ms
Such small values may be used for removing samples with duplicate timestamps.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/409 for details.
2020-04-10 16:36:41 +03:00
Aliaksandr Valialkin
7d73623c69 lib/{storage,mergeset}: make sure that requests and misses cache counters never go down 2020-04-10 14:45:01 +03:00
Aliaksandr Valialkin
e62afc7366 lib/protoparser: add -*TrimTimestamp command-line flags for Influx, Graphite, OpenTSDB and CSV data
These flags can be used for reducing disk space usage for timestamps data ingested over the given protocols
2020-04-10 12:44:39 +03:00
Aliaksandr Valialkin
0681b4c27a lib/workingsetcache: accumulate stat counters on cache rotation
This should prevent cache stats counters from going down after cache rotation,
which may corrupt the `cache hit ratio` graph on the official Grafana dashboards
when using the following query:

    1 - (sum(rate(vm_cache_misses_total[5m])) by (type) / sum(rate(vm_cache_requests_total[5m])) by (type))
2020-04-10 11:51:40 +03:00
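A rough sketch of the idea behind commit 0681b4c27a: when a cache generation is dropped, its counters are added to carried totals, so the exposed counters never decrease. The `cacheStats` type and its method names are hypothetical and do not mirror lib/workingsetcache.

```go
package main

import "fmt"

// cacheStats is a hypothetical counter set: totals carried over from dropped
// cache generations plus the counters of the current generation.
type cacheStats struct {
	carriedRequests, carriedMisses uint64
	currRequests, currMisses       uint64
}

// record registers one cache lookup in the current generation.
func (s *cacheStats) record(hit bool) {
	s.currRequests++
	if !hit {
		s.currMisses++
	}
}

// rotate drops the current generation but keeps its counters in the carried totals.
func (s *cacheStats) rotate() {
	s.carriedRequests += s.currRequests
	s.carriedMisses += s.currMisses
	s.currRequests, s.currMisses = 0, 0
}

// totals returns requests and misses counters that never go down across rotations.
func (s *cacheStats) totals() (requests, misses uint64) {
	return s.carriedRequests + s.currRequests, s.carriedMisses + s.currMisses
}

func main() {
	var s cacheStats
	s.record(true)
	s.record(false)
	s.rotate()
	s.record(true)
	requests, misses := s.totals()
	fmt.Println(requests, misses) // 3 1
}
```

With monotonic counters, `rate()` in the hit-ratio query above does not see a false counter reset at rotation time.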
Aliaksandr Valialkin
f86947d55c lib/memory: add more details to -memory.allowedPercent help message 2020-04-09 15:28:53 +03:00
Aliaksandr Valialkin
f94a090020 docs: update minimum supported Go version from 1.12 to 1.13 2020-04-07 13:38:37 +03:00
Aliaksandr Valialkin
8064775c02 docs/CaseStudies.md: updated ARNES numbers 2020-04-06 16:20:11 +03:00
Aliaksandr Valialkin
520a704606 docs/CaseStudies.md: prettifying of the formatting 2020-04-06 15:24:37 +03:00
Aliaksandr Valialkin
105f0c78d9 docs/CaseStudies.md: add ARNES case study 2020-04-06 15:17:33 +03:00
Roman Khavronenko
b099d84271 Vmalert/rules eval (#400)
* Initial rules evaluation support.

Rules now store alerts state in private field `alerts`. Every evaluation updates
the alerts and state. Every unique metric received from datastore represents a unique alert,
uniqueness is guaranteed by hashing ordered labelset.

* merge with master

* cleanup

* support endAt parameter as 3*evaluationInterval for active alerts

* make golint happy
2020-04-06 14:44:03 +03:00
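Commit b099d84271 identifies each alert by hashing its ordered label set. A minimal sketch of that technique follows; the choice of FNV-1a, the separator byte and the name `hashLabels` are assumptions and may differ from what vmalert actually uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// hashLabels returns a hash of the label set that does not depend on map
// iteration order, because label names are sorted before hashing.
// Hypothetical helper: hash function and separator are illustrative choices.
func hashLabels(labels map[string]string) uint64 {
	names := make([]string, 0, len(labels))
	for name := range labels {
		names = append(names, name)
	}
	sort.Strings(names)
	h := fnv.New64a()
	for _, name := range names {
		h.Write([]byte(name))
		h.Write([]byte{0xff}) // separator, so "ab"+"c" differs from "a"+"bc"
		h.Write([]byte(labels[name]))
		h.Write([]byte{0xff})
	}
	return h.Sum64()
}

func main() {
	a := map[string]string{"alertname": "HighLoad", "instance": "host-1"}
	b := map[string]string{"instance": "host-1", "alertname": "HighLoad"}
	fmt.Println(hashLabels(a) == hashLabels(b)) // true
}
```

Sorting first is the step that makes the hash deterministic: two metrics with the same labels map to the same alert regardless of the order in which the datasource returned the labels.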
Aliaksandr Valialkin
407bdbf2b9 docs/Single-server-VictoriaMetrics.md: cosmetic fixes in Importing CSV data chapter 2020-04-06 12:29:28 +03:00
Aliaksandr Valialkin
69962a7001 docs/FAQ.md: small fixes 2020-04-05 13:53:08 +03:00
Aliaksandr Valialkin
9f03548e55 docs/FAQ.md: add more articles about VictoriaMetrics performance 2020-04-05 13:48:03 +03:00
Aliaksandr Valialkin
022310f35b docs/Articles.md: added a link to https://www.iunera.com/kraken/fabric/time-series-database/ 2020-04-04 16:40:00 +03:00
Aliaksandr Valialkin
895cadfae7 app/vmagent/remotewrite: add "X-Prometheus-Remote-Write-Version: 0.1.0" http header to remote_write request
This header is required by Cortex (and, probably, other remote storage systems).
See 9c1f44d090/docs/apis.md (remote-api) .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/399
2020-04-04 16:24:56 +03:00
Aliaksandr Valialkin
57704aa584 app/victoria-metrics: add -selfScrapeInstance and -selfScrapeJob flags for tuning labels for self-scraped metrics 2020-04-04 14:57:22 +03:00
Aliaksandr Valialkin
f9b24d4899 app/vmselect/promql: keep metric name after applying first_over_time and last_over_time functions 2020-04-04 14:54:13 +03:00
Aliaksandr Valialkin
fa0554b771 docs/Articles.md: move Percona article to third-party 2020-04-02 15:43:02 +03:00
Aliaksandr Valialkin
35b133bff4 docs/Articles.md: add a link to https://blog.cloudera.com/benchmarking-time-series-workloads-on-apache-kudu-using-tsbs/ 2020-04-02 15:41:09 +03:00
Aliaksandr Valialkin
a884803377 docs/CaseStudies.md: add Adsterra case 2020-04-02 00:49:16 +03:00
Aliaksandr Valialkin
b38d048dd9 app/vmstorage: add vm_free_disk_space_bytes metric for monitoring the remaining disk space at -storageDataPath 2020-04-01 23:08:58 +03:00
Aliaksandr Valialkin
de2cd4231b docs/Single-server-VictoriaMetrics.md: re-organize chapters 2020-04-01 22:38:56 +03:00
kreedom
298eb0a0f8 [vmalert] improve external url handling 2020-04-01 22:29:11 +03:00
kreedom
12fe915b48 [vmalert] add prometheus template function (#396)
* [vmalert] add prometheus template function

* make linter be happy

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-04-01 18:17:53 +03:00
Aliaksandr Valialkin
cdf0a4cf8f lib/httpserver: remove unnecessary http.HandlerFunc wrapper in gzipHandler 2020-04-01 18:14:17 +03:00
Aliaksandr Valialkin
1c9c57db1c docs/Cluster-VictoriaMetrics.md: small fixes and updates 2020-04-01 18:10:12 +03:00
Aliaksandr Valialkin
8edc72201d docs/Single-server-VictoriaMetrics.md: small fixes and updates 2020-04-01 18:09:07 +03:00
Aliaksandr Valialkin
b024ecd10c docs/Cluster-VictoriaMetrics.md: swap production build and development build chapters 2020-04-01 17:49:51 +03:00
Aliaksandr Valialkin
e0d0348f36 lib/storage: add missing reset for tagFilter.matchesEmptyValue on tagFilter.Init 2020-04-01 17:42:44 +03:00
Aliaksandr Valialkin
3e55c7e069 lib/promscrape: reduce timestamp jitter when scraping targets
This should improve compression for timestamps
2020-04-01 16:11:35 +03:00
Aliaksandr Valialkin
c4acd20d2a lib/storage: remove duplicate data points on 7/8*minScrapeInterval interval instead of 1/2*minScrapeInterval
This should reduce storage usage and should improve deduplication accuracy
2020-04-01 15:48:48 +03:00
Aliaksandr Valialkin
8661dc4624 docs/Single-server-VictoriaMetrics.md: mention that environment vars may be prefixed with -envflag.prefix 2020-03-31 22:37:44 +03:00
Aliaksandr Valialkin
16572c8722 README.md: mention that response cache must be reset after importing historical data 2020-03-31 19:33:20 +03:00
Aliaksandr Valialkin
b699c46046 lib/storage: handle errors returned from TagFilters.Add when cloning TagFilters with negative filter 2020-03-31 16:18:02 +03:00
Aliaksandr Valialkin
e71519b8b2 app/victoria-metrics/testdata: add a test for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395 2020-03-31 12:51:25 +03:00
Aliaksandr Valialkin
972713bd79 lib/storage: add fast path for the previous indexdb search if it doesn't contain per-day inverted index yet 2020-03-31 12:51:21 +03:00
Aliaksandr Valialkin
5d99ca6cfc lib/storage: optimize per-day inverted index search for tag filters matching big number of time series
- Sort tag filters in the ascending number of matching time series
  in order to apply the most specific filters first.
- Fall back to metricName search for filters matching big number of time series
  (usually these are negative filters or regexp filters).
2020-03-31 00:48:35 +03:00
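The two steps listed in commit 5d99ca6cfc can be sketched as follows. The `tagFilter` struct, the `matchingSeries` estimate and the `maxSeries` threshold are all hypothetical; lib/storage has its own types and its own way of estimating filter cost.

```go
package main

import (
	"fmt"
	"sort"
)

// tagFilter is a hypothetical filter description with an estimated number
// of time series it matches.
type tagFilter struct {
	expr           string
	matchingSeries int
}

// planFilters sorts filters so that the most specific ones come first and
// separates the filters that match more than maxSeries series. The latter
// would be applied later via the metricName fallback instead of the index.
func planFilters(tfs []tagFilter, maxSeries int) (indexFilters, fallbackFilters []tagFilter) {
	sorted := append([]tagFilter(nil), tfs...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].matchingSeries < sorted[j].matchingSeries
	})
	for _, tf := range sorted {
		if tf.matchingSeries > maxSeries {
			fallbackFilters = append(fallbackFilters, tf)
			continue
		}
		indexFilters = append(indexFilters, tf)
	}
	return indexFilters, fallbackFilters
}

func main() {
	tfs := []tagFilter{
		{`env!="dev"`, 900000},
		{`job="node"`, 5000},
		{`instance="host-1"`, 40},
	}
	index, fallback := planFilters(tfs, 100000)
	fmt.Println(len(index), len(fallback), index[0].expr)
}
```

Applying the most specific filter first keeps the intermediate set of candidate series small, so the broad filters only need to be checked against those candidates.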
Aliaksandr Valialkin
318326c309 lib/storage: properly handle {label=~"foo|"} filters as Prometheus does
Such filters must match all the time series with `label="foo"` plus all the time series without `label`

Previously only time series with `label="foo"` were matched.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395
2020-03-31 00:48:18 +03:00
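The behavior fixed in commit 318326c309 follows from two Prometheus rules: regexp matchers are fully anchored, and a missing label is treated as a label with an empty value. A minimal Go sketch, with the hypothetical helper `matchLabelRegexp` and a map-based series representation:

```go
package main

import (
	"fmt"
	"regexp"
)

// matchLabelRegexp reports whether the series matches {name=~"expr"}.
// The expression is anchored on both sides, and a missing label is looked up
// as an empty string, which is what a Go map returns for an absent key.
func matchLabelRegexp(labels map[string]string, name, expr string) (bool, error) {
	re, err := regexp.Compile("^(?:" + expr + ")$")
	if err != nil {
		return false, err
	}
	return re.MatchString(labels[name]), nil
}

func main() {
	series := []map[string]string{
		{"label": "foo"}, // matches: value is "foo"
		{"job": "node"},  // matches: label is missing, so its value is ""
		{"label": "bar"}, // does not match
	}
	for _, s := range series {
		ok, err := matchLabelRegexp(s, "label", "foo|")
		fmt.Println(ok, err)
	}
}
```

The empty alternative in `foo|` matches the empty string, which is why series without `label` must be included in the result.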
Aliaksandr Valialkin
a1e4c6a2be .github/workflows/wiki.yml: fix copying files from docs to wiki 2020-03-30 15:59:12 +03:00
Aliaksandr Valialkin
ac3ee44fa7 docs/robots.txt: trigger github actions 2020-03-30 15:54:39 +03:00
Aliaksandr Valialkin
b98ca56d94 lib/envflag: add -envflag.prefix for setting optional prefix for environment vars 2020-03-30 15:51:19 +03:00
Aliaksandr Valialkin
b41ee5f27d vendor: make vendor-update 2020-03-30 15:06:35 +03:00
Aliaksandr Valialkin
8d35af6fdb .github/workflows: copy all the files from docs folder to wiki and github pages 2020-03-30 15:05:37 +03:00
Aliaksandr Valialkin
0f2dd77a76 go.mod: update the minimum required Go version from go1.12 to go1.13 2020-03-30 14:56:57 +03:00
Aliaksandr Valialkin
0c485f14d1 app/vmselect/prometheus: allow passing relative time to start, end and time args of /api/v1/* queries 2020-03-29 21:57:14 +03:00
Aliaksandr Valialkin
2ebf7d86ff app/vmselect/prometheus: code simplification: (d.Seconds()/1e3) -> d.Milliseconds() 2020-03-29 21:50:28 +03:00
kreedom
bf6c24d0f4 [vmalert] config parser (#393)
* [vmalert] config parser

* make linter be happy

* fix test

* fix sprintf add test for rule validation
2020-03-29 01:48:30 +02:00
Aliaksandr Valialkin
1f7292675a docs: add robots.txt 2020-03-28 23:22:46 +02:00
Aliaksandr Valialkin
bd156cd088 docs/vmagent.md: add prometheus remote_write proxy use case 2020-03-28 23:16:38 +02:00
Aliaksandr Valialkin
b695087119 docs/CaseStudies.md: add Brandwatch case study 2020-03-28 20:57:54 +02:00
Aliaksandr Valialkin
80f53e5396 deployment/docker: run docker apps under default user (0, root) in order to preserve backwards compatibility
If docker app is upgraded from root to non-root, then the data pointed to by `-storageDataPath` or similar flags
becomes inaccessible to the non-root user after the upgrade. This breaks upgrade path. So revert back to default root user
for docker apps.

Users may explicitly execute `docker run --user <non_root_user>` for running docker apps under non-root user.
2020-03-28 19:23:26 +02:00
Roman Khavronenko
7acb797595 Update dashboard according to new Grafana version. (#390)
The way the regex for column style in the Table panel is applied has changed in Grafana 6.7. The change is supposed to fix Flags panel column styles accordingly.
2020-03-28 01:24:39 +02:00
Roman Khavronenko
3a8bbfd6b9 bump Prometheus and Grafana images (#389) 2020-03-28 01:15:07 +02:00
Dmitry Naumov
27373807c1 Rootless docker images by default (#358)
* Rootless docker images by default

* Migrate to rootless base image

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-03-27 21:23:50 +02:00
Aliaksandr Valialkin
8d7f0aa632 vendor: make vendor-update 2020-03-27 21:23:30 +02:00
Aliaksandr Valialkin
149f365f74 lib/httpserver: add -http.maxGracefulShutdownDuration command-line flag for tuning the maximum duration required for graceful shutdown of http server 2020-03-27 21:23:30 +02:00
kreedom
b22da547a2 [vmalert] - parse template annotations (#387)
* [vmalert] - parse template annotations
2020-03-27 18:31:16 +02:00
Aliaksandr Valialkin
047849e855 lib/uint64set: remove zero buckets after Set.Intersect 2020-03-27 01:15:58 +02:00
Aliaksandr Valialkin
f3ec424e7d lib/uint64set: small code cleanup and perf tuning
* Remember the last accessed bucket on Has() call.
* Inline fast paths inside Add() and Has() calls.
* Remove fragile code with maxUnsortedBuckets inside bucket32.
2020-03-25 15:30:25 +02:00
Aliaksandr Valialkin
ef8aee8a2d deployment/docker: update Go builder from Go1.14.0 to Go1.14.1 2020-03-24 22:35:26 +02:00
Aliaksandr Valialkin
dde4a97534 lib/uint64set: go fmt 2020-03-24 22:28:43 +02:00
Aliaksandr Valialkin
f3e0c55ea1 lib/storage: serialize snapshot creation process with mutex
This guarantees that the snapshot contains all the recently added data
from inmemory buffers when multiple concurrent calls to Storage.CreateSnapshot are performed.
2020-03-24 22:27:05 +02:00
Aliaksandr Valialkin
97fb0edd07 lib/uint64set: added more tests 2020-03-24 22:27:04 +02:00
Aliaksandr Valialkin
25f585ecf2 docs/CaseStudies.md: added a case study from MHI Vestas Offshore Wind 2020-03-14 13:22:12 +02:00
Aliaksandr Valialkin
df91d2d91f lib/storage: remove obsolete code 2020-03-13 22:48:17 +02:00
Aliaksandr Valialkin
3c7c71a49c app/vmselect: adjust label_map() handling for corner cases
The following corner cases are now supported:
* label_map(q, "label", "", "foo") - adds `label="foo"` to series with missing `label`
* label_map(q, "label", "foo", "") - removes `label="foo"` from series

All the unmatched labels are kept unchanged.
2020-03-13 18:45:03 +02:00
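The `label_map()` corner cases from commit 3c7c71a49c can be sketched for a single series as below. The function `labelMap` and the map-based labels are illustrative assumptions; the real implementation lives in app/vmselect/promql and operates on query results.

```go
package main

import "fmt"

// labelMap mimics label_map(q, label, src1, dst1, ..., srcN, dstN) for one
// series. A missing label has the value "", and an empty dst removes the label.
// Series whose label value matches no src are returned unchanged.
func labelMap(labels map[string]string, label string, pairs ...string) map[string]string {
	out := make(map[string]string, len(labels)+1)
	for name, value := range labels {
		out[name] = value
	}
	current := out[label] // "" when the label is missing
	for i := 0; i+1 < len(pairs); i += 2 {
		if pairs[i] != current {
			continue
		}
		if pairs[i+1] == "" {
			delete(out, label)
		} else {
			out[label] = pairs[i+1]
		}
		break
	}
	return out
}

func main() {
	fmt.Println(labelMap(map[string]string{"job": "a"}, "label", "", "foo"))
	fmt.Println(labelMap(map[string]string{"job": "a", "label": "foo"}, "label", "foo", ""))
	fmt.Println(labelMap(map[string]string{"job": "a", "label": "bar"}, "label", "foo", "baz"))
}
```

The first call adds `label="foo"`, the second removes it, and the third leaves the series as it was because `bar` is not among the source values.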
Aliaksandr Valialkin
69f1470692 vendor: update github.com/VictoriaMetrics/metrics from v1.11.0 to v1.11.2
This fixes data race in Histogram
2020-03-13 12:39:57 +02:00
Aliaksandr Valialkin
4fc4912f0c app/vmalert/datasource: typo fix in docs: Labels -> Label 2020-03-13 12:22:33 +02:00
kreedom
a746cb62b6 vmalert add vm datasource, change alertmanager (#364)
* vmalert add vm datasource, change alertmanager

* make linter be happy

* make linter be happy.2

* PR comments

* PR comments.1
2020-03-13 12:19:31 +02:00
Aliaksandr Valialkin
499594f421 lib/promscrape: allow overriding external_labels as Prometheus does
Prometheus docs at https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config say:

> In communication with external systems, they are always applied only
> when a time series does not have a given label yet and are ignored otherwise.

Though this may result in consistency chaos when scrape targets override `external_labels`,
let's stick with Prometheus behavior for the sake of backwards compatibility.

There is a last resort in vmagent with `-remoteWrite.label`, which consistently
sets the configured labels to all the metrics before sending them to remote storage.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/366
2020-03-12 20:24:42 +02:00
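The Prometheus rule quoted in commit 499594f421 (external labels apply only when the series does not already have the label) can be sketched like this. `applyExternalLabels` is a hypothetical name and the map-based representation is an assumption.

```go
package main

import "fmt"

// applyExternalLabels adds external labels to a copy of the series labels.
// A label already present on the series wins over the external one.
func applyExternalLabels(series, external map[string]string) map[string]string {
	out := make(map[string]string, len(series)+len(external))
	for name, value := range series {
		out[name] = value
	}
	for name, value := range external {
		if _, ok := out[name]; !ok {
			out[name] = value
		}
	}
	return out
}

func main() {
	series := map[string]string{"job": "node", "datacenter": "eu-1"}
	external := map[string]string{"datacenter": "us-1", "replica": "a"}
	fmt.Println(applyExternalLabels(series, external))
	// map[datacenter:eu-1 job:node replica:a]
}
```

By contrast, the commit message describes `-remoteWrite.label` as setting its labels on all metrics, so a target cannot override them.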
Aliaksandr Valialkin
fdc2a9d1d7 app/vmselect: add label_map(q, label, srcValue1, dstValue1, ... srcValueN, dstValueN) function to MetricsQL
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/369
2020-03-12 19:13:47 +02:00
Aliaksandr Valialkin
92d67e2592 vendor: update google.golang.org/genproto from fc8f55426688 to da6875a35672 2020-03-12 18:11:33 +02:00
Aliaksandr Valialkin
8a853778d7 vendor: update golang.org/x/tools from 26f6a1b6802d to 5e2df02acb1e 2020-03-12 18:07:52 +02:00
Aliaksandr Valialkin
8d75a5dbd0 vendor: update github.com/aws/aws-sdk-go from v1.29.10 to v1.29.22 2020-03-12 17:54:58 +02:00
Aliaksandr Valialkin
cdd6171af1 vendor: update google.golang.org/api from v0.19.0 to v0.20.0 2020-03-12 17:51:49 +02:00
Aliaksandr Valialkin
cc183bc899 vendor: update golang.org/x/sys from d5e6a3e2c0ae to 5c8b2ff67527 2020-03-12 17:46:24 +02:00
Aliaksandr Valialkin
3935038e20 vendor: update github.com/klauspost/compress from v1.10.1 to v1.10.3 2020-03-12 17:32:24 +02:00
Aliaksandr Valialkin
c8dc1cd218 lib/protoparser/csvimport: add missing metric vm_rows_invalid_total{type="csvimport"} 2020-03-12 15:27:45 +02:00
Aliaksandr Valialkin
c1551a3269 README.md: mention alternative dashboard for cluster version - https://grafana.com/grafana/dashboards/11831 2020-03-12 15:10:14 +02:00
Aliaksandr Valialkin
8023ad7dbd app/vmselect: add -search.maxStalenessInterval for tuning Prometheus data model closer to Influx-style data model 2020-03-11 16:43:34 +02:00
Aliaksandr Valialkin
d4beb17ebe lib/promscrape: remove possible races when registering and de-registering scrape workers for /targets page 2020-03-11 16:30:21 +02:00
Aliaksandr Valialkin
fcd91795d5 app/vmagent: mention that vmagent can filter data 2020-03-11 16:22:39 +02:00
Aliaksandr Valialkin
650830db79 docs/Articles.md: add a link to https://stas.starikevich.com/posts/disk-usage-for-vm-versus-prometheus/ 2020-03-11 04:56:16 +02:00
Aliaksandr Valialkin
cdf70b7944 lib/promscrape: consistently update /targets page after SIGHUP 2020-03-11 03:20:03 +02:00
Aliaksandr Valialkin
301c2acd61 app/vmstorage: return 500 status code instead of 200 status code on internal errors inside /snapshot/* handlers 2020-03-10 23:51:55 +02:00
Aliaksandr Valialkin
61d0ee857c docs/vmagent.md: sync with app/vmagent/README.md 2020-03-10 21:54:04 +02:00
Aliaksandr Valialkin
e17702fada app/vmselect: add optional max_rows_per_line query arg to /api/v1/export
This arg allows limiting the number of data points that may be exported on a single line.
2020-03-10 21:45:56 +02:00
Aliaksandr Valialkin
1fe66fb3cc app/{vmagent,vminsert}: add support for importing csv data via /api/v1/import/csv 2020-03-10 21:15:35 +02:00
Aliaksandr Valialkin
49d7cb1a3f all: fix golangci-lint issues 2020-03-10 19:41:46 +02:00
Aliaksandr Valialkin
8d3869cd99 docs/FAQ.md: actualize answer about deduplication 2020-03-09 13:37:12 +02:00
Aliaksandr Valialkin
9d89b08cb5 docs: add missing vmagent.png, which is used in vmagent.md 2020-03-09 13:35:49 +02:00
Aliaksandr Valialkin
5fe38a84eb app/vmagent: properly apply -remoteWrite.sendTimeout to fasthttp.HostClient 2020-03-09 13:31:55 +02:00
Aliaksandr Valialkin
7c432da788 lib/promscrape: do not retry idempotent requests when scraping targets
This should prevent the following unexpected side-effects of idempotent request retries:
- increased actual timeout when scraping the target compared to the configured scrape_timeout
- increased load on the target

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/357
2020-03-09 13:31:52 +02:00
Aliaksandr Valialkin
986dba5ab3 app/vmagent: do not allow non-supported fields in -remoteWrite.relabelConfig and file_sd_configs
This should reduce possible confusion like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/363
2020-03-06 20:19:13 +02:00
Aliaksandr Valialkin
c386c5de57 app/vmagent: properly add labels set via -remoteWrite.label to metrics before sending them to -remoteWrite.url 2020-03-06 19:26:58 +02:00
Artem Navoiev
58a3e59d59 bump version of codecov-action to v1.0.6 2020-03-05 23:25:13 +02:00
Aliaksandr Valialkin
c5f894b361 Makefile: add build and test rules with enabled race detector. These rules have -race suffix
Fix also `unsafe pointer conversion` errors detected by Go1.14. See https://golang.org/doc/go1.14#compiler .
2020-03-05 12:03:38 +02:00
Aliaksandr Valialkin
9be64e34b4 docs/Articles.md: add a link to https://www.percona.com/blog/2020/02/28/better-prometheus-rate-function-with-victoriametrics/ 2020-03-04 20:05:26 +02:00
Aliaksandr Valialkin
e51a0a56f4 README.md: add a link to https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/Articles 2020-03-04 20:05:18 +02:00
Aliaksandr Valialkin
754db0d22e app/vmagent/README.md: small fixes 2020-03-04 18:14:47 +02:00
Aliaksandr Valialkin
772312bf7b app/vmagent/README.md: typo fix 2020-03-04 18:05:09 +02:00
Aliaksandr Valialkin
871abfab7a app/vmagent/README.md: clarification 2020-03-04 18:03:48 +02:00
Aliaksandr Valialkin
007c591de8 app/vmagent/README.md: add iot and edge monitoring use case 2020-03-04 18:01:34 +02:00
Aliaksandr Valialkin
474a09c0f1 app/vmagent/README.md: add use cases section 2020-03-04 17:42:27 +02:00
Aliaksandr Valialkin
d58aa80e9b README.md: add a link to Synthesio case study 2020-03-04 14:18:19 +02:00
Aliaksandr Valialkin
ad927575b7 docs/CaseStudies: add Synthesio 2020-03-04 14:14:39 +02:00
Aliaksandr Valialkin
0b1e877a7d docs/Single-server-VictoriaMetrics.md: sync with README.md 2020-03-03 21:39:05 +02:00
Aliaksandr Valialkin
0ba8ee6022 README.md: mention -search.cacheTimestampOffset in Backfilling section 2020-03-03 21:38:39 +02:00
Aliaksandr Valialkin
9a944fd169 lib/promscrape: consistency renaming: stopCh -> globalStopCh 2020-03-03 20:08:08 +02:00
Aliaksandr Valialkin
032c88561b app/vminsert/prompush: limit memory usage by pushing promscrape data in smaller blocks 2020-03-03 19:58:54 +02:00
Aliaksandr Valialkin
76036c1897 app/vmagent: add -remoteWrite.maxDiskUsagePerURL for limiting the maximum disk usage for each -remoteWrite.url buffer
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/352
2020-03-03 19:49:07 +02:00
Aliaksandr Valialkin
c31d640eb9 app/vmagent/remotewrite: do not reset empty relabelCtx 2020-03-03 15:01:03 +02:00
Aliaksandr Valialkin
02b55c72dc app/vmagent: add -remoteWrite.urlRelabelConfig for applying individual relabeling for each -remoteWrite.url
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/320
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/308
2020-03-03 13:12:16 +02:00
Aliaksandr Valialkin
1d7ab78b55 lib/protoparser/prometheus: allow trailing comma in tags list
The trailing comma is generated by cloudwatch exporter.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/350
2020-03-02 22:22:09 +02:00
Aliaksandr Valialkin
7d178a40bd app/vmselect/prometheus: do not add __name__!= filter when searching for all the matching metric names via /api/v1/label/__name__/values with non-empty label filter
This should reduce query time.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/343
2020-02-28 23:35:55 +02:00
Aliaksandr Valialkin
43754ff420 README.md: put https://gitlab.com/optima_public/prometheus_oauth_proxy in third-party contributions section 2020-02-28 21:23:34 +02:00
Aliaksandr Valialkin
b785429ddb lib/protoparser: metrics renaming: vm_protoparser_<type>_* -> vm_protoparser_*{type="<type>"}
This should improve composability of these metrics in PromQL queries
2020-02-28 20:20:10 +02:00
Aliaksandr Valialkin
f9a584b5c1 app/vmagent/remotewrite: yet another typo fix 2020-02-28 20:05:55 +02:00
Aliaksandr Valialkin
e22fdc1073 lib/persistentqueue: reset chunk file when the persistent queue is empty 2020-02-28 20:05:53 +02:00
Aliaksandr Valialkin
b9b46cb8dc app/vmagent/remotewrite: typo fix 2020-02-28 19:03:16 +02:00
Aliaksandr Valialkin
db6f4e4af1 app/vmagent/remotewrite: limit memory usage when big scrape blocks are pushed to remote storage 2020-02-28 18:58:01 +02:00
Aliaksandr Valialkin
8cc88db38d docs/Single-server-VictoriaMetrics.md: sync with README.md 2020-02-28 12:58:32 +02:00
Aliaksandr Valialkin
f3c28d2ae4 README.md: typo fix 2020-02-28 12:58:31 +02:00
Aliaksandr Valialkin
57528ca31c docs: add a doc for vmagent 2020-02-28 12:23:56 +02:00
Aliaksandr Valialkin
5701b2f7bb app/vmselect/prometheus: properly pass filter for labelName=__name__ in labelValuesWithMatches
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/343
2020-02-28 12:18:14 +02:00
Aliaksandr Valialkin
18af31a4c2 all: properly split vm_deduplicated_samples_total among cluster components
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345
2020-02-27 23:48:07 +02:00
Aliaksandr Valialkin
6819db5686 lib/envflag: typo fix in docs to -envflag.enable: envoronment->environment 2020-02-27 21:47:58 +02:00
Aliaksandr Valialkin
63a88a619b deployment/docker: update Go builder from Go1.13.8 to Go1.14.0 2020-02-26 22:15:44 +02:00
Aliaksandr Valialkin
c458b521a2 app/vmagent: allow setting -httpListenAddr to empty string in order to disable listening for http requests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/340
2020-02-26 20:58:11 +02:00
Aliaksandr Valialkin
b459919250 make vendor-update 2020-02-26 20:45:27 +02:00
Aliaksandr Valialkin
cc5fe0b315 vendor: update github.com/VictoriaMetrics/metrics from v1.10.1 to v1.11.0 2020-02-26 20:41:02 +02:00
Aliaksandr Valialkin
117c76311c app/vmagent/README.md: list service discovery mechanisms, which will be added soon
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/334
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-02-26 19:27:08 +02:00
Aliaksandr Valialkin
b63e4464f4 lib/promscrape: properly reload new configs on SIGHUP
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/335
2020-02-26 13:54:00 +02:00
Edouard Hur
3ad36134f6 Readme markdown linting (#338)
* fixed MD009/no-trailing-spaces

* fixed MD033/no-inline-html: Inline HTML

* fixed MD012/no-multiple-blanks

* fixed MD007/ul-indent

* fixed MD004/ul-style

* fixed MD031/blanks-around-fences

* fixed MD040/fenced-code-language

* fixed MD032/blanks-around-lists

* fixed MD026/no-trailing-punctuation
2020-02-26 13:21:19 +02:00
Edouard Hur
1f0007d0b1 Readme envvars (#332)
* add details about env vars config

* add env var to table of contents

* remove unnecessary words
2020-02-25 22:41:34 +02:00
Aliaksandr Valialkin
6739c2749d lib/promscrape: go fmt 2020-02-25 20:56:44 +02:00
Aliaksandr Valialkin
7a33da8fea lib/promscrape: do not add missing port to __address__ label in order to be consistent with Prometheus behavior
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/331
2020-02-25 20:49:50 +02:00
Aliaksandr Valialkin
be37d762cd app/vmagent: add -remoteWrite.maxBlockSize command-line flag for limiting the maximum size of unpacked block to send to remote storage 2020-02-25 19:57:47 +02:00
Aliaksandr Valialkin
4e24839a2c app/vmagent: do not allow sending unpacked requests with sizes exceeding -maxInsertRequestSize 2020-02-25 19:34:41 +02:00
Aliaksandr Valialkin
6386aeb1e0 app/vmagent: add ability to accept Influx line protocol data via TCP and UDP
Just set `-influxListenAddr` command-line flag

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/333
2020-02-25 19:12:49 +02:00
Aliaksandr Valialkin
e453880084 app/vmagent/README.md: mention that vmagent exposes target statuses at /targets page 2020-02-25 18:15:58 +02:00
Aliaksandr Valialkin
4c4448b66e app/vminsert: add /targets handler, which exposes Prometheus targets defined in -promscrape.config file 2020-02-25 18:13:11 +02:00
Aliaksandr Valialkin
7ef7c9368e lib/fs: typo fix: read blocks bigger than 8KB via pread() call instead of using mmap 2020-02-25 18:05:06 +02:00
Aliaksandr Valialkin
e1ef72af01 app/vmagent: logo fix 2020-02-25 00:09:19 +02:00
Aliaksandr Valialkin
56c70fe856 app/vmagent: update docs 2020-02-25 00:09:18 +02:00
Aliaksandr Valialkin
e7e4aa5243 app/vmagent/README.md: small fixes 2020-02-24 21:25:38 +02:00
Aliaksandr Valialkin
fed2959658 lib/envflag: substitute dots with underscores in env var names if -envflag.enable is set
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-24 21:14:44 +02:00
Aliaksandr Valialkin
ae51300973 app/vmselect/promql: properly take into account the first datapoint when calculating rollup_candlestick
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-24 13:24:30 +02:00
Aliaksandr Valialkin
e65ec88779 app/vmselect/promql: do not take into account values outside the current window in rollup_candlestick
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-23 18:03:57 +02:00
Yaroslav
a6d0645539 fix rollupOpen(), rollupHigh(), rollupLow() functions (#328) 2020-02-23 18:01:53 +02:00
Aliaksandr Valialkin
04762344c6 app/vmagent: initial implementation for vmagent 2020-02-23 13:36:03 +02:00
Aliaksandr Valialkin
4e905d6501 vendor: update github.com/valyala/fastjson from v1.4.5 to v1.5.0 2020-02-23 10:06:00 +02:00
kreedom
49390b8dbc [vmalert] integration with AlertManager (#325) 2020-02-21 23:15:05 +02:00
Aliaksandr Valialkin
2f55cabaa4 app/vmselect/promql: log when rollupResult cache is cleared 2020-02-21 20:07:01 +02:00
Aliaksandr Valialkin
d21cb43e48 lib/storage: add vm_ prefix to deduplicated_samples_total metric to be consistent with other metrics 2020-02-21 19:33:59 +02:00
Aliaksandr Valialkin
ec9bf39b5b app/vmselect: add -search.cacheTimestampOffset command-line flag
This flag can be used for removing gaps on graphs if the difference between the current time
and the timestamps from the ingested data exceeds 5 minutes.

This is the case when the time between data sources and VictoriaMetrics is improperly synchronized.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 13:58:06 +02:00
Aliaksandr Valialkin
539139391c app/vmselect: add /internal/resetRollupResultCache handler for resetting response cache
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 13:58:05 +02:00
Aliaksandr Valialkin
5431f9cd4e deployment/docker: update Go builder from v1.13.7 to v1.13.8 2020-02-20 19:46:20 +02:00
kreedom
3c06179184 basic vmalert backbone (#317)
* basic vmalert backbone

* Resolve code review comments for vmalert backbone

* Second review fixes for vmalert backbone
2020-02-16 20:59:02 +02:00
Aliaksandr Valialkin
71a52f5f90 lib/protoparser/prometheus: skip leading whitespace from tag names 2020-02-16 19:06:33 +02:00
Aliaksandr Valialkin
e7ba18b0d9 vendor: make vendor-update 2020-02-16 16:11:24 +02:00
Aliaksandr Valialkin
ce15cecae4 lib/storage: typo fix 2020-02-16 15:53:44 +02:00
Aliaksandr Valialkin
32e153e834 lib/storage: prevent from clobbering non-nil lastError in Storage.add 2020-02-16 15:51:26 +02:00
Aliaksandr Valialkin
7b1c7051a3 app/vmselect: add sort_by_label(q, label) and sort_by_label_desc(q, label) functions
This is implementation of https://github.com/prometheus/prometheus/pull/1533 for VictoriaMetrics.
2020-02-13 17:01:37 +02:00
Aliaksandr Valialkin
7836ad8907 lib/mergeset: skip creating temporary part objects when merging source inmemory parts
This should reduce CPU usage when adding new entries to inverted index.
This should also prevent from creating stalled cleaner goroutines for the created temporary parts,
since they were never closed.

This should fix the following issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 14:09:32 +02:00
Aliaksandr Valialkin
eceaf13e5e lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate
It turned out that time.Timer was used in places where time.Ticker must be used instead.
This could result in blocked goroutines as in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 13:10:07 +02:00
Aliaksandr Valialkin
8162d58dbd make vendor-update 2020-02-10 23:28:15 +02:00
Aliaksandr Valialkin
848d5da0be vendor: update github.com/VictoriaMetrics/metrics from v1.9.3 to v1.10.1 2020-02-10 23:08:38 +02:00
Aliaksandr Valialkin
4cc0163c7c docs: migrate ExtendedPromQL->MetricsQL in order to be more consistent 2020-02-10 23:02:43 +02:00
Aliaksandr Valialkin
a801a1a6e7 .github/ISSUE_TEMPLATE: ask for command-line flags and Prometheus logs 2020-02-10 22:56:17 +02:00
Aliaksandr Valialkin
02e852854a README.md: refer to the article about data deletion via relabeling 2020-02-10 22:46:52 +02:00
Aliaksandr Valialkin
9e6e2319b9 README.md: mention that flags may be read from env vars if -envflag.enable command-line flag is set 2020-02-10 16:20:15 +02:00
Aliaksandr Valialkin
025297f15d lib/envflag: check for incorrect flag values read from environment vars 2020-02-10 16:08:10 +02:00
Aliaksandr Valialkin
5d207b2025 lib/envflag: add -envflag.enable command-line flag for enabling reading flags from environment vars
By default flags are read only from command line. They can be read from environment vars if `-envflag.enable` is set.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-10 16:02:37 +02:00
Aliaksandr Valialkin
8466ab0034 all: allow setting flags via environment vars
Now flags can be set via environment vars with the same names as flags.
Command-line flags override flags set via env vars.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-10 13:29:13 +02:00
Aliaksandr Valialkin
e210cd9da1 lib/storage: move -dedup.minScrapeInterval flag outside lib/storage, so it doesn't show up in vminsert in cluster version 2020-02-10 13:09:51 +02:00
Aliaksandr Valialkin
6db573470c docs/Single-server-VictoriaMetrics.md: sync with README.md 2020-02-07 00:02:34 +02:00
Ryota Arai
fffe5d4ba4 Fix a typo in README (selfScrapeInterval) (#310) 2020-02-06 13:14:31 +02:00
Aliaksandr Valialkin
a6c6a2debc app/vmselect/promql: do not add step to range end, since this hack became obsolete since commit 9e1119dab8 2020-02-05 21:22:19 +02:00
Aliaksandr Valialkin
78b62dee87 app/vmselect/promql: properly adjust time range for data to select
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-05 21:22:18 +02:00
Aliaksandr Valialkin
366693b9f1 app/vmselect: unconditionally offset -step to rollup_candlestick. This makes results more consistent 2020-02-04 23:32:12 +02:00
Aliaksandr Valialkin
525101339e app/vmselect/promql: automatically apply offset -step to rollup_candlestick function in order to obtain the expected OHLC results
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-04 23:24:35 +02:00
Aliaksandr Valialkin
ada6a3da8d app/vmselect/promql: adjust rollup_candlestick calculations to the expected results
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-04 22:42:13 +02:00
Aliaksandr Valialkin
40c6ae2952 lib/logger: initialize output to os.Stderr by default 2020-02-04 22:40:44 +02:00
Aliaksandr Valialkin
cff0cb297c Do not require checking for errors returned from fmt.Fprint
This fixes `make errcheck` error found in lib/logger
2020-02-04 22:03:37 +02:00
Aliaksandr Valialkin
e0a4c37fc1 lib/logger: add -loggerOutput command-line flag
This flag allows changing log output from `stderr` to `stdout` if `-loggerOutput=stdout` is set.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/306
2020-02-04 21:47:56 +02:00
Aliaksandr Valialkin
7f3e3a6034 lib/logger: do not clutter -loggerFormat=json output with stack trace
This should improve json parsing
2020-02-04 21:37:25 +02:00
Aliaksandr Valialkin
bd4698bb7a lib/storage: do not deduplicate blocks with less than 32 samples during merge
This should improve deduplication accuracy for blocks with higher number of samples.
2020-02-04 18:41:54 +02:00
Aliaksandr Valialkin
36a1ac8360 app/vmselect: take into account the time the requests wait in the queue if -search.maxConcurrentRequests is exceeded
This will prevent from excess CPU usage for timed out queries.
2020-02-04 16:15:08 +02:00
Aliaksandr Valialkin
834051e5b2 app/vmselect: add a placeholder for /api/v1/metadata, which could be requested by Grafana
See https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata

VictoriaMetrics doesn't collect any metadata for metrics, so just return empty response.
2020-02-04 15:53:47 +02:00
Aliaksandr Valialkin
42864bb52f all: do not clash flag description with back-quoted flag types
See https://golang.org/pkg/flag/#PrintDefaults for more details.
2020-02-04 15:46:52 +02:00
Roman Khavronenko
1e023c6a72 Single dashboard (#300)
* improve description for `Pending datapoints` panel

* bump VM version requirement
2020-02-03 02:09:53 +02:00
Artem Navoiev
a47f292295 [vmalert] add vmalert.png.2 2020-02-02 12:17:19 +02:00
Artem Navoiev
354232b62b [vmalert] add vmalert.png 2020-02-02 12:16:05 +02:00
Artem Navoiev
28778be0cc [vmalert] initial 2020-02-02 12:14:09 +02:00
Aliaksandr Valialkin
90cf356ea1 app/vmselect/promql: adjust `and` and `unless` binary operator handling to be consistent with Prometheus 2020-01-31 18:52:38 +02:00
Aliaksandr Valialkin
c0b69ed06e deployment/docker: update Go builder from v1.13.6 to v1.13.7 2020-01-31 18:06:58 +02:00
Aliaksandr Valialkin
011a79da85 lib/fs: remove unused readerAt interface 2020-01-31 15:12:43 +02:00
Aliaksandr Valialkin
c3d86eef96 all: add -dedup.minScrapeInterval command-line flag for data de-duplication
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/86
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/278
2020-01-31 01:16:57 +02:00
Aliaksandr Valialkin
2152f6f0cd lib/storage: re-use indexSearch inside Storage.prefetchMetricNames 2020-01-31 01:16:53 +02:00
Aliaksandr Valialkin
d70ba7eb37 lib/fs: optimize small reads for ReaderAt.MustReadAt by reading from memory-mapped space instead of reading from file descriptor
This should improve performance when reading many small blocks.
2020-01-30 15:09:05 +02:00
Aliaksandr Valialkin
ad8af629bb all: rename ReadAt* to MustReadAt* in order not to clash with io.ReaderAt 2020-01-30 15:08:58 +02:00
Aliaksandr Valialkin
d68546aa4a lib/storage: pre-fetch metricNames for the found metricIDs in Search.Init
This should speed up Search.NextMetricBlock loop for big number of found time series.
2020-01-30 15:08:51 +02:00
Aliaksandr Valialkin
5bb9ccb6bf lib/mergeset: properly update lastAccessTime in indexBlockCache entries
This is a follow-up for 6665f10e7b
2020-01-29 21:20:47 +02:00
Aliaksandr Valialkin
a462355b2f app/vmselect/promql: add keep_next_value(q) for filling gaps with the next non-empty value 2020-01-29 00:48:04 +02:00
Aliaksandr Valialkin
bdbb463756 docs/Single-server-VictoriaMetrics.md: fix heading size for Third-party contributions section 2020-01-28 23:13:35 +02:00
Aliaksandr Valialkin
371e86194d app/vminsert: moved -maxInsertRequestSize command-line flag out of lib/prompb in order to prevent its inclusion in vmselect and vmstorage apps 2020-01-28 23:02:08 +02:00
Aliaksandr Valialkin
adbbc4fa1a app/vmselect/promql: return expected results from increase() over the beginning of time series, which start from big value
Examples for such counters: OS-level counters for network or cpu stats.
2020-01-28 16:30:11 +02:00
Aliaksandr Valialkin
75ad47a43c app/victoria-metrics: check for error arg passed to filepath.Walk callback 2020-01-27 20:56:45 +02:00
Aliaksandr Valialkin
6320a19a8c app/victoria-metrics: remove integration build tag from tests
This simplifies testing with `go test ./app/victoria-metrics` without
the need to remember to pass `-tags=integration` to Go commands.
2020-01-27 20:25:28 +02:00
Aliaksandr Valialkin
7b26db5527 docs/Single-server-VictoriaMetrics.md: update Retention section 2020-01-27 18:44:21 +02:00
Alexander Danilov
1a3626bbe1 Add description for retention and how it works (#297) 2020-01-27 18:38:22 +02:00
Aliaksandr Valialkin
8074c10590 README.md: mention https://github.com/AnchorFree/tsdb-remote-write 2020-01-27 18:35:48 +02:00
Aliaksandr Valialkin
2392a359e1 app/vmselect/promql: fix panic on a single zero vmrange bucket in prometheus_buckets() function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/296
2020-01-27 18:04:55 +02:00
Aliaksandr Valialkin
6caa9bb51b lib/logger: fix improperly set skipframes for all the logging functions
The bug has been introduced in the previous commit f6baee6efe
2020-01-26 18:34:27 +02:00
Aliaksandr Valialkin
f6baee6efe lib/httpserver: log the caller of httpserver.Errorf
Previously the log message contained `httpserver.Errorf`; now it contains the caller of `httpserver.Errorf`, which is more useful.
2020-01-25 20:17:59 +02:00
Aliaksandr Valialkin
9df5b2d1c3 app/victoria-metrics: add -selfScrapeInterval flag for self-scraping /metrics page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/30
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/180
2020-01-25 19:19:59 +02:00
Aliaksandr Valialkin
2a0a0ed14d lib/protoparser: add parser for Prometheus exposition text format
This parser will be used by vmagent
2020-01-24 20:11:02 +02:00
Aliaksandr Valialkin
6456c93dbb app/vminsert: move ingestion protocol parsers to lib/protoparser, so they could be re-used in the upcoming vmagent 2020-01-24 16:53:00 +02:00
Aliaksandr Valialkin
1efea246b7 docs/Articles.md: add a link to https://medium.com/@valyala/billy-how-victoriametrics-deals-with-more-than-500-billion-rows-e82ff8f725da 2020-01-22 19:08:35 +02:00
Aliaksandr Valialkin
680080887d all: consistently log durations in seconds with millisecond precision
This should improve logs readability
2020-01-22 18:28:27 +02:00
Aliaksandr Valialkin
3992984e10 vendor: make vendor-update 2020-01-22 18:08:39 +02:00
Aliaksandr Valialkin
9773022e50 app/vmselect: mention the original query and time range in error messages
This should simplify debugging invalid or heavy queries.
2020-01-22 17:36:36 +02:00
Aliaksandr Valialkin
f8954c7250 vendor: update github.com/klauspost/compress from v1.9.7 to v1.9.8
New version should have better gzip compression. See https://github.com/klauspost/compress#changelog
2020-01-22 16:50:15 +02:00
Aliaksandr Valialkin
0ef6f91410 docs: Mention Slack and Telegram channels for user questions 2020-01-22 16:50:14 +02:00
Aliaksandr Valialkin
efc7ad88ec app/vmselect: mention command-line flag, which could be used for adjusting query timeouts, in timeout errors 2020-01-22 15:50:48 +02:00
Aliaksandr Valialkin
ec9651e266 app/vmselect/prometheus: increase default value -maxExportDuration to 30 days, since 10 minutes beat users exporting big amounts of data 2020-01-22 15:50:47 +02:00
Aliaksandr Valialkin
a8b2f82fc6 vendor: update github.com/VictoriaMetrics/fastcache from v1.5.5 to v1.5.7 2020-01-22 12:31:32 +02:00
Aliaksandr Valialkin
582dd01f42 app/vmselect/promql: add range_over_time(m[d]) function for calculating value range for m over d 2020-01-21 19:05:17 +02:00
Aliaksandr Valialkin
36973ee975 app/vmselect/promql: add label_match(q, label, regexp) and label_mismatch(q, label, regexp) functions for filtering out time series with labels matching the given regexp 2020-01-21 15:00:20 +02:00
Aliaksandr Valialkin
6665f10e7b lib/{mergeset,storage}: properly update lastAccessTime in index and data block cache entries 2020-01-20 14:59:47 +02:00
Aliaksandr Valialkin
04363d6511 README.md: mention that delete API shouldn't be used on a regular basis due to non-zero overhead 2020-01-20 13:28:36 +02:00
Aliaksandr Valialkin
c97ade4487 docs/FAQ.md: typo fix according to comment from https://www.reddit.com/message/messages/lezkmo 2020-01-18 18:05:13 +02:00
Aliaksandr Valialkin
970f0dfbf2 docs/CaseStudies.md: add links to COLOPL talk about VictoriaMetrics 2020-01-18 17:23:33 +02:00
Aliaksandr Valialkin
227cf53ef9 app/vminsert: increase default value for -insert.maxQueueDuration from 30s to 60s
This should help catching up with high ingestion rate after VictoriaMetrics restart.
2020-01-18 14:39:36 +02:00
Aliaksandr Valialkin
257e61195a lib/uint64set: add missing bucket32.b16his values 2020-01-18 14:26:04 +02:00
Aliaksandr Valialkin
4cc0c44b9e lib/uint64set: optimize Set.Union
This should improve performance for queries over big number of time series
2020-01-18 13:47:03 +02:00
Aliaksandr Valialkin
1b5f02e293 lib/uint64set: add benchmarks for Set.Union 2020-01-18 13:47:02 +02:00
Aliaksandr Valialkin
3748fb24b6 lib/storage: skip recovering timestamps order for lossless compression (PrecisionBits=64) 2020-01-18 00:09:33 +02:00
Aliaksandr Valialkin
c9472e4f3a all: use github.com/klauspost/compress/gzip instead of compress/gzip
`github.com/klauspost/compress/gzip` is more optimized than `compress/gzip`.
This gives better gzip compression and decompression speeds.
2020-01-17 23:58:46 +02:00
Aliaksandr Valialkin
bc0f897fcb lib/uint64set: reduce memory allocations in Set.AppendTo 2020-01-17 22:33:09 +02:00
Aliaksandr Valialkin
f9289b804a lib/storage: reduce memory allocations when merging metricID sets 2020-01-17 22:10:44 +02:00
Aliaksandr Valialkin
0c8ad08578 lib/uint64set: typo fix in Set.Intersect 2020-01-17 18:10:58 +02:00
Aliaksandr Valialkin
cdcacaea6d app/vmselect/netstorage: make fmt 2020-01-17 17:47:21 +02:00
Aliaksandr Valialkin
7327adbc86 app/vmselect/netstorage: limit the maximum size for in-memory buffer for temporary blocks file
This should reduce memory usage on systems with more than 8GB RAM.
2020-01-17 16:28:21 +02:00
Aliaksandr Valialkin
9f027ec176 lib/uint64set: optimize Intersect, Subtract and Union functions
This should improve performance for queries over big number of time series.
2020-01-17 16:11:49 +02:00
Aliaksandr Valialkin
cd53f7d177 lib/uint64set: improve benchmark for Set.Intersect 2020-01-17 16:08:17 +02:00
Aliaksandr Valialkin
d0d258b314 app/vmselect: limit the default value for -search.maxConcurrentRequests, so it plays well on systems with more than 16 vCPUs
A single heavy request can saturate all the available CPUs, so let's limit the number of concurrent requests to lower value.
This will give more chances for executing insert path.
2020-01-17 15:43:54 +02:00
Aliaksandr Valialkin
d88725f133 app/{vminsert,vmselect}: improve error messages when VictoriaMetrics cannot handle too high number of concurrent inserts / selects 2020-01-17 13:24:37 +02:00
Aliaksandr Valialkin
8dbf430469 lib/uint64set: add benchmark for Set.Intersect 2020-01-17 00:31:07 +02:00
Aliaksandr Valialkin
9ef4d32a9a make vendor-update 2020-01-16 14:14:19 +02:00
Aliaksandr Valialkin
0d7505b00e all: mention command-line flags used for limiting the incoming request size in error messages
This should improve error logs usability.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/287
2020-01-16 13:03:30 +02:00
Aliaksandr Valialkin
2839f4688a app/vmselect/promql: fix panic on sum(aggr_over_time(...)) with incorrect number of args 2020-01-15 16:26:09 +02:00
Aliaksandr Valialkin
605d588ba6 lib/uint64set: reduce memory usage in Union, Intersect and Subtract methods
Iterate items with newly added Set.ForEach method instead of allocating `[]uint64`
slice for all the items before the iteration.
2020-01-15 12:12:49 +02:00
Aliaksandr Valialkin
7483deccca docs/FAQ.md: add bullet comparison with Cortex and Thanos 2020-01-15 10:47:40 +02:00
Aliaksandr Valialkin
893b62c682 lib/{mergeset,storage}: fix uint64 counters alignment for 32-bit architectures (GOARCH=386, GOARCH=arm) 2020-01-14 22:47:04 +02:00
Aliaksandr Valialkin
7830c10eb2 lib/{storage,mergeset}: gradually remove stale entries from block cache and index caches
This should reduce memory usage in the long run when old blocks and indexes
aren't accessed anymore.
2020-01-14 21:38:44 +02:00
Aliaksandr Valialkin
e4f1bfd221 deployment/docker: update Prometheus from v2.14.0 to v2.15.2 and Grafana from v6.5.0 to v6.5.2 2020-01-12 23:14:10 +02:00
Aliaksandr Valialkin
91ee1bce2e README.md: add a link to VictoriaMetrics subreddit - https://www.reddit.com/r/VictoriaMetrics/ 2020-01-12 00:06:20 +02:00
Aliaksandr Valialkin
8b14572f70 app/vmselect/promql: add hoeffding_bound_upper(phi, m[d]) and hoeffding_bound_lower(phi, m[d]) functions
These functions can be used for calculating Hoeffding bounds
for `m` over `d` time range and for the given `phi` in the range `[0..1]`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/283
2020-01-11 14:46:23 +02:00
Aliaksandr Valialkin
8eaced8cae app/vmselect/promql: return continuous values for min_over_time and max_over_time when step is smaller than scrape_interval 2020-01-11 12:47:50 +02:00
Aliaksandr Valialkin
1585dab5a3 deployment/docker: switch Go builder from v1.13.5 to v1.13.6 2020-01-11 11:06:00 +02:00
Aliaksandr Valialkin
cd66d3fc43 README.md: mention the Prometheus->VictoriaMetrics exporter https://github.com/ryotarai/prometheus-tsdb-dump
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/93
2020-01-11 01:29:09 +02:00
Aliaksandr Valialkin
ea231f8167 app/victoria-metrics: adjust integration tests after the commit 99facd71cd6ac151d512cea1df73be91c10c7f83 2020-01-11 00:58:16 +02:00
Aliaksandr Valialkin
46bfdbe6cf app/vmselect/promql: do not take into account the previous point before time window in square brackets for min_over_time, max_over_time, rollup_first and rollup_last functions
This makes the behaviour for these functions similar to Prometheus when processing broken time series with irregular data points
like `gitlab_runner_jobs`. See https://gitlab.com/gitlab-org/gitlab-exporter/issues/50 for details.
2020-01-11 00:26:26 +02:00
Aliaksandr Valialkin
4f0a645f77 vendor: update github.com/valyala/fastjson from v1.4.2 to v1.4.5
This should fix parsing Inf values in `/api/v1/import`. The previous attempt to fix this in VictoriaMetrics v1.32.1 was unsuccessful.
2020-01-10 23:15:15 +02:00
Aliaksandr Valialkin
b829fe5e39 app/vmselect/promql: properly handle aggr(aggr_over_time(...)) 2020-01-10 21:57:18 +02:00
Aliaksandr Valialkin
164278151f app/vmselect/promql: add aggr_over_time(("aggr_func1", "aggr_func2", ...), m[d]) function
This function can be used for simultaneous calculating of multiple `aggr_func*` functions
that accept range vector. For example, `aggr_over_time(("min_over_time", "max_over_time"), m[d])`
would calculate `min_over_time` and `max_over_time` for `m[d]`.
2020-01-10 21:18:06 +02:00
Aliaksandr Valialkin
c4632faa9d app/vmselect/promql: add tmin_over_time(m[d]) and tmax_over_time(m[d]) functions
These functions return timestamp in seconds for the minimum and maximum value for `m` over time range `d`
2020-01-10 19:39:28 +02:00
Aliaksandr Valialkin
a768198814 docs: fix spelling typos 2020-01-09 23:42:55 +02:00
Roman Khavronenko
57f4875024 fix spellcheck issues (#285) 2020-01-09 23:41:52 +02:00
Aliaksandr Valialkin
b8038a14e7 lib/backup/s3remote: check whether the file exists before deleting it
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/284
2020-01-09 23:20:31 +02:00
Aliaksandr Valialkin
f358fb72d1 app/{vmbackup,vmrestore}: add backup complete file to backup when it is complete and check for this file before restoring from backup
This should prevent from restoring from incomplete backups.

Add `-skipBackupCompleteCheck` command-line flag to `vmrestore` in order to be able to restore from old backups without `backup complete` file.
2020-01-09 15:35:38 +02:00
Aliaksandr Valialkin
1c436b2723 vendor: update github.com/valyala/fastjson from v1.4.1 to v1.4.2
This fixes parsing of `inf` and `nan` values in json lines passed to `/api/v1/import`
2020-01-08 20:47:21 +02:00
Aliaksandr Valialkin
a973df6d79 README.md: remove height="200px" from logo image, since it is improperly displayed on smartphones 2020-01-08 20:29:11 +02:00
Aliaksandr Valialkin
d4132a6915 docs: typo fix 2020-01-08 14:45:27 +02:00
Aliaksandr Valialkin
d5aeda0e1a app/vmselect/promql: skip rate calculation for the first point on time series 2020-01-08 14:42:53 +02:00
Aliaksandr Valialkin
bb71b6d47d docs: add references to Remote Write Storage Wars
Also mention that VictoriaMetrics uses less RAM than Thanos Store Gateway - see https://github.com/thanos-io/thanos/issues/448 for details.
2020-01-04 23:57:35 +02:00
Aliaksandr Valialkin
fc71602039 lib/storage: limit maxRawRowsPerPartition by 500K for any number of rawRowsShardsPerPartition
This should reduce write amplification for high ingestion rate on multi-CPU systems
2020-01-04 23:57:31 +02:00
Aliaksandr Valialkin
c60fdbed30 docs/CaseStudies.md: add link to Remote Write Storage Wars talk from Adidas at PromCon 2019 2020-01-04 16:51:45 +02:00
Aliaksandr Valialkin
d410c78c7e app/vmselect/promql: fix calculations for histogram_share 2020-01-04 14:44:48 +02:00
Aliaksandr Valialkin
66f3d1dac8 README.md: update Alerting section 2020-01-04 13:55:09 +02:00
Aliaksandr Valialkin
d9c4ac9978 lib/metricsql: export IsRollupFunc and IsTransformFunc, since they can be used by package users 2020-01-04 13:25:05 +02:00
Aliaksandr Valialkin
4567a59fa0 LICENSE: update year 2020-01-04 13:21:04 +02:00
Aliaksandr Valialkin
d64699bb9f app/vmselect/promql: add missing MetricName into netstorage.Result in tests 2020-01-04 12:52:39 +02:00
Aliaksandr Valialkin
f409f2d050 app/vmselect/promql: add histogram_share(le, buckets) function 2020-01-04 12:45:55 +02:00
Aliaksandr Valialkin
b1ded7cf9a app/vmselect/promql: add absent_over_time(m[d]) func similar to the function in Prometheus 2.16
See https://github.com/prometheus/prometheus/issues/2882
2020-01-04 12:45:07 +02:00
Aliaksandr Valialkin
a8360d04c0 app/vmselect/promql: add histogram_over_time(m[d]) rollup function 2020-01-04 12:44:56 +02:00
Aliaksandr Valialkin
3e09d38f29 app/vmselect/promql: fix results caching for multi-arg rollup functions such as quantile_over_time
Previously only a single arg was taken into account, so caching didn't work properly for multi-arg rollup funcs.
2020-01-03 20:49:08 +02:00
Aliaksandr Valialkin
a774120460 app/vmselect/promql: use scrapeInterval instead of window in denominator when calculating rate for the first point on the time series
This should provide better estimation for `rate` in the beginning of time series.
2020-01-03 19:01:50 +02:00
Aliaksandr Valialkin
695682232f lib/uint64set: reduce memory usage when storing big number of sparse metric_id values 2020-01-03 18:16:44 +02:00
Aliaksandr Valialkin
b5645ccbdf app/vmselect/promql: increase the estimated number of time series returned by aggr() by (something) from 100 to 1K, since 100 may result in OOM for high number of time series 2020-01-03 01:02:21 +02:00
Aliaksandr Valialkin
cb3a342882 app/vmselect/promql: add share_le_over_time and share_gt_over_time functions for SLI and SLO calculations 2020-01-03 00:41:16 +02:00
Aliaksandr Valialkin
0038365206 docs: refer to standalone MetricsQL package 2020-01-02 23:43:35 +02:00
Aliaksandr Valialkin
61c9d320ed vendor: update github.com/VictoriaMetrics/fastcache from v1.5.4 to v1.5.5 2019-12-29 18:17:49 +02:00
Aliaksandr Valialkin
a21d786d3c lib/metricsql: add example for ExpandWithExprs 2019-12-26 21:32:11 +02:00
Aliaksandr Valialkin
192b51c246 vendor: make vendor-update 2019-12-26 19:41:02 +02:00
Aliaksandr Valialkin
17a4dc9782 vendor: update github.com/valyala/gozstd from v1.6.3 to v1.6.4
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-26 19:30:08 +02:00
Aliaksandr Valialkin
6f67e0b56b lib/metricsql: add ExpandWithExprs 2019-12-25 22:20:30 +02:00
Aliaksandr Valialkin
1925ee038d Rename lib/promql to lib/metricsql and apply small fixes 2019-12-25 22:03:59 +02:00
Mike Poindexter
bec62e4e43 Split Extended PromQL parsing to a separate library 2019-12-25 22:03:51 +02:00
Aliaksandr Valialkin
d880325cf6 app/vmselect/promql: make sure AdjustStartEnd returns time range covering the same number of points as the initial time range
This should prevent from the following panic at app/vmselect/promql/binary_op.go:255:

    BUG: len(leftVaues) must match len(rightValues) and len(dstValues)
2019-12-24 22:45:56 +02:00
Aliaksandr Valialkin
c18802af59 lib/fs: typo fix in fadvise_unix.go 2019-12-24 20:59:28 +02:00
Aliaksandr Valialkin
4ba4abe666 lib/encoding: log the compressed block contents if it cannot be decompressed or unmarshaled
This should help detecting the root cause of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-24 20:48:31 +02:00
Aliaksandr Valialkin
5bb39e757b lib/encoding: mention src contents in error message returned from unmarshalInt64NearestDelta*
This should simplify detecting the root cause of the issue at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-24 20:41:52 +02:00
Aliaksandr Valialkin
d5c9841220 lib/encoding: mention unpacked block size in the error message if unparsed tail left
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-24 20:35:13 +02:00
Aliaksandr Valialkin
9e19949c6b app/vmselect/promql: adjust calculations for rate and increase for the first value
These calculations should trigger alerts on `/api/v1/query` for counters starting from values greater than 0.
2019-12-24 19:39:25 +02:00
Aliaksandr Valialkin
0455c03cb9 app/vmselect/promql: properly calculate rate on the first data point
It is calculated as `value / scrape_interval`, since the value was missing on the previous scrape,
i.e. we can assume its value was 0 at this time.
2019-12-24 15:55:52 +02:00
Aliaksandr Valialkin
5cb8d97743 all: use gozstd instead of pure Go zstd for GOARCH=amd64 2019-12-24 12:42:42 +02:00
Aliaksandr Valialkin
31d04fb5df Revert "lib/logger: prevent from blocking when log output isn't consumed in timely manner"
This reverts commit e3c462f08a.

Reason to revert: this leaves incomplete logs on app shutdown.
2019-12-24 12:21:39 +02:00
Aliaksandr Valialkin
5b75984aa9 app/vmselect/netstorage: move MustAdviseSequentialRead to lib/fs 2019-12-23 23:16:11 +02:00
Aliaksandr Valialkin
097c21931c docs: sync README.md with Single-server-VictoriaMetrics.md 2019-12-23 20:34:21 +02:00
Roman Khavronenko
85463a7199 update configuration recommendations for Prometheus remote_write (#277) 2019-12-23 20:33:10 +02:00
Aliaksandr Valialkin
6a1499efa3 lib/encoding/zstd: prevent from possible encoder leak when concurrent goroutines create encoders for the same compressionLevel
Thanks to @klauspost for the pointer to this issue. See https://github.com/klauspost/compress/issues/195 for details.
2019-12-23 18:05:41 +02:00
Aliaksandr Valialkin
bf4413e58d README.md: document how to export and import gzipped data 2019-12-23 13:40:22 +02:00
Aliaksandr Valialkin
e3c462f08a lib/logger: prevent from blocking when log output isn't consumed in timely manner
Drop log messages instead of blocking and increment `vm_log_messages_dropped_total` metric.
2019-12-20 11:49:34 +02:00
Aliaksandr Valialkin
bea5a8700a app/vmselect: add -search.maxExportDuration command-line flag for limiting /api/v1/export duration
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/275
2019-12-20 11:35:22 +02:00
Aliaksandr Valialkin
1825893eef lib/storage: scale ingestion performance by sharding rawRows on systems with more than 8 CPU cores 2019-12-19 18:18:29 +02:00
Aliaksandr Valialkin
97f70ccda7 lib/storage: optimize bulk import performance when multiple data points are inserted for the same time series
This should speed up `/api/v1/import` and make it more scalable on multi-core systems.
2019-12-19 18:18:29 +02:00
Andrii Dembitskyi
2fba7b6f35 Fix typo in log message 2019-12-19 14:33:20 +02:00
Aliaksandr Valialkin
d03827c57d app/vminsert: return StatusNoContent http response for /api/v1/import to be consistent with other insert handlers 2019-12-19 01:21:54 +02:00
Aliaksandr Valialkin
bb530a0591 lib/httpserver: inline checkAuth code to make it more clear 2019-12-18 23:06:25 +02:00
koalaty-code
aea4c80dd7 Ignore /health endpoint when checking auth 2019-12-18 23:04:31 +02:00
Aliaksandr Valialkin
5e8e0fbc80 docs/ExtendedPromQL.md: rewording regarding scalar vs instant vector difference 2019-12-18 21:47:24 +02:00
Aliaksandr Valialkin
1e8aa89a3b docs/Home.md: fix link to case studies 2019-12-18 01:04:20 +02:00
Aliaksandr Valialkin
56595ae12a docs: renaming: PromQL extensions -> MetricsQL 2019-12-18 00:56:51 +02:00
Aliaksandr Valialkin
96ff8d9adb app/vmselect: add ability to pass match[], start and end to /api/v1/labels
This makes the `/api/v1/labels` handler consistent with already existing functionality for `/api/v1/label/.../values`.

See https://github.com/prometheus/prometheus/issues/6178 for more details.
2019-12-15 00:20:50 +02:00
Aliaksandr Valialkin
02f6566ce1 app/vmbackup: mention that backups are possible to Ceph and Swift 2019-12-14 01:08:49 +02:00
Aliaksandr Valialkin
7535f20c98 docs: publish vmbackup and vmrestore docs on wiki and victoriametrics.github.io 2019-12-14 01:05:55 +02:00
Aliaksandr Valialkin
bc645152cb app/vminsert: simultaneously accept telnet put and HTTP /api/put OpenTSDB metrics at -opentsdbListenAddr
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/266
2019-12-14 00:30:12 +02:00
Aliaksandr Valialkin
f5ac9b0721 lib/logger: add -loggerFormat for choosing log message formats
Supported formats: default, json

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/265
2019-12-13 15:10:05 +02:00
Aliaksandr Valialkin
d95a43f392 docs: sync with cluster branch 2019-12-12 20:49:55 +02:00
Aliaksandr Valialkin
87a8348062 make vendor-update 2019-12-12 19:39:52 +02:00
Aliaksandr Valialkin
cea5a14853 all: rename Extended PromQL to PromQL extensions 2019-12-12 19:25:58 +02:00
Aliaksandr Valialkin
9787c228a4 docs/CaseStudies.md: add a link to VMQL 2019-12-12 14:53:48 +02:00
Aliaksandr Valialkin
c121608205 README.md: mention that {__name__!=""} selects all the time series in /api/v1/export 2019-12-12 14:48:30 +02:00
Aliaksandr Valialkin
492f032b38 docs: add Dreamteam numbers 2019-12-12 01:01:07 +02:00
Aliaksandr Valialkin
4624c060ac docs/Single-server-VictoriaMetrics.md: sync with README.md 2019-12-12 00:55:14 +02:00
Clémence Saussez
8454679d9f README.md: adds link to Grafana dashboard for clustered version (#259)
Signed-off-by: Clemence Saussez <clemence@zen.ly>
2019-12-12 00:54:24 +02:00
Aliaksandr Valialkin
440a15111e deployment/docker/Makefile: mention that the Makefile rules must be invoked from the repository root 2019-12-11 23:33:02 +02:00
Aliaksandr Valialkin
6ddcd162ed all: publish Docker images for the following GOARCH: amd64, arm, arm64, ppc64le and 386
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/258
2019-12-11 23:32:59 +02:00
Aliaksandr Valialkin
6504f78ce4 README.md: add Docker hub shield 2019-12-11 18:34:26 +02:00
Aliaksandr Valialkin
73b2a3d4b7 app/vmselect/promql: return lower and upper bounds for the estimated percentile from histogram_quantile if third arg is passed
Updates https://github.com/prometheus/prometheus/issues/5706
2019-12-11 13:57:26 +02:00
Aliaksandr Valialkin
07d5bc986b CaseStudies: clarify wording: metrics -> active time series 2019-12-11 12:05:08 +02:00
Aliaksandr Valialkin
caa4eb72d9 app/vmselect/promql: return matrix instead of vector on subqueries to /api/v1/query like Prometheus does 2019-12-11 01:00:26 +02:00
Aliaksandr Valialkin
3c076544bf app/vmselect/promql: allow negative offsets
Updates https://github.com/prometheus/prometheus/issues/6282
2019-12-11 01:00:23 +02:00
Aliaksandr Valialkin
35f5ca1def README.md: typo fixes 2019-12-09 23:30:01 +02:00
Aliaksandr Valialkin
a7d80f62be README.md: add a chapter about Prometheus querying API usage
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/253
2019-12-09 23:27:23 +02:00
Aliaksandr Valialkin
40540397c3 README.md: use relative links to README.md 2019-12-09 23:04:34 +02:00
Aliaksandr Valialkin
c107f46b0e docs: mention about /api/v1/import in Single-server-VictoriaMetrics.md 2019-12-09 23:02:07 +02:00
Aliaksandr Valialkin
8cce513a15 docs: mention about /api/v1/import in Cluster-VictoriaMetrics.md 2019-12-09 23:01:14 +02:00
Aliaksandr Valialkin
b01ddfdd76 deployment/docker: update Go builder from go1.13.4 to go1.13.5 2019-12-09 22:58:26 +02:00
Aliaksandr Valialkin
68e1cf8942 app/vminsert: add /api/v1/import handler
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
2019-12-09 20:59:04 +02:00
Aliaksandr Valialkin
8501b4a48d app/vminsert: consistency renaming for counters 2019-12-09 16:43:10 +02:00
Aliaksandr Valialkin
0ed9258545 lib/{mergeset,storage}: log info message when both source and destination part paths from txn are missing during startup
This is an expected condition after unclean shutdown (OOM, hard reset, `kill -9`) on NFS disk.
2019-12-09 15:44:53 +02:00
Roman Khavronenko
b0d88460de #251 - add Logging rate panel (#254) 2019-12-09 13:05:59 +02:00
Aliaksandr Valialkin
8db7660afe docs/CaseStudies.md: mention that additional references and reviews can be obtained from our Slack channel 2019-12-08 14:04:18 +02:00
Aliaksandr Valialkin
18369bca42 docs/ExtendedPromQL.md: add a link to https://medium.com/@valyala/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350 for histogram func 2019-12-08 13:48:33 +02:00
Aliaksandr Valialkin
95328782c3 docs/CaseStudies.md: re-wording 2019-12-08 13:43:49 +02:00
Aliaksandr Valialkin
981cb66a95 docs/CaseStudies.md: improve wording 2019-12-08 13:39:29 +02:00
Aliaksandr Valialkin
f15d89bfe0 vendor: fix broken build for GOARCH=arm64 on golang.org/x/sys/unix 2019-12-08 13:27:38 +02:00
Aliaksandr Valialkin
36feb7d3e4 docs: add draft version of case studies 2019-12-08 13:23:15 +02:00
Aliaksandr Valialkin
d900184d8d vendor: fix arm build for golang.org/x/sys/unix/zptrace_armnn_linux.go 2019-12-08 12:49:05 +02:00
Aliaksandr Valialkin
293b541784 make vendor-update 2019-12-07 23:10:16 +02:00
Aliaksandr Valialkin
84b57e8974 app/vminsert/influx: add a test case from https://community.librenms.org/t/integration-with-victoriametrics/9689 2019-12-07 23:00:40 +02:00
Aliaksandr Valialkin
b458e5a213 README.md: mention that VictoriaMetrics is built on shared nothing architecture 2019-12-05 20:39:44 +02:00
Aliaksandr Valialkin
c09472dfd9 app/vmselect/promql: add {topk|bottomk}_{min|max|avg|median} aggregate functions for returning the exact k time series on the given time range
The full list of functions added:
- `topk_min(k, q)` - returns top K time series with the max minimums on the given time range
- `topk_max(k, q)` - returns top K time series with the max maximums on the given time range
- `topk_avg(k, q)` - returns top K time series with the max averages on the given time range
- `topk_median(k, q)` - returns top K time series with the max medians on the given time range
- `bottomk_min(k, q)` - returns bottom K time series with the min minimums on the given time range
- `bottomk_max(k, q)` - returns bottom K time series with the min maximums on the given time range
- `bottomk_avg(k, q)` - returns bottom K time series with the min averages on the given time range
- `bottomk_median(k, q)` - returns bottom K time series with the min medians on the given time range
2019-12-05 19:26:47 +02:00
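As an illustration of the difference from plain `topk`, which Prometheus evaluates independently at every point and which can therefore return more than K distinct series over a range, here is a small sketch of the `topk_avg(k, q)` idea: rank whole series by their average over the range and keep exactly K of them. The `series` type and `topKAvg` function are hypothetical and are not taken from `app/vmselect/promql`.

```go
package main

import (
	"fmt"
	"sort"
)

// series is a hypothetical time series: a name plus its values on the selected time range.
type series struct {
	Name   string
	Values []float64
}

// avg returns the arithmetic mean of values, or 0 for an empty slice.
func avg(values []float64) float64 {
	if len(values) == 0 {
		return 0
	}
	sum := 0.0
	for _, v := range values {
		sum += v
	}
	return sum / float64(len(values))
}

// topKAvg returns up to k series with the highest averages over the whole range.
// The input slice is not modified.
func topKAvg(k int, in []series) []series {
	out := append([]series(nil), in...)
	sort.SliceStable(out, func(i, j int) bool {
		return avg(out[i].Values) > avg(out[j].Values)
	})
	if k < 0 {
		k = 0
	}
	if k > len(out) {
		k = len(out)
	}
	return out[:k]
}

func main() {
	in := []series{
		{Name: "a", Values: []float64{1, 2, 3}},  // average 2
		{Name: "b", Values: []float64{10, 0, 2}}, // average 4
		{Name: "c", Values: []float64{5, 5, 5}},  // average 5
	}
	for _, s := range topKAvg(2, in) {
		fmt.Println(s.Name, avg(s.Values))
	}
}
```

The other variants in the list differ only in the per-series statistic (min, max, median) and in the sort direction for `bottomk_*`.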
Aliaksandr Valialkin
72345eb5bd lib/{mergeset,storage}: make sure pending transaction deletions are finished before and after runTransactions call.
`runTransactions` call issues async deletions for transaction files. The previously issued transaction deletions
can race with the next call to `runTransactions`. Prevent this by waiting until all the pending transaction
deletions are finished at the beginning of `runTransactions`. Also make sure that all the pending transaction
deletions are finished before returning from `runTransactions`.
2019-12-04 21:40:30 +02:00
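The wait-before-and-after pattern from this commit message can be sketched with a `sync.WaitGroup`. This is a simplified stand-in, not the real `runTransactions`: `pendingDeletions`, `deleteAsync`, `runTransactionsSketch` and the counter are hypothetical names, and the deletion itself is simulated.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pendingDeletions tracks asynchronous deletions that have been issued but not finished yet.
var pendingDeletions sync.WaitGroup

// deleted counts finished deletions; it exists only to make the sketch observable.
var deleted int64

// deleteAsync simulates removing a transaction file in the background.
func deleteAsync(path string) {
	pendingDeletions.Add(1)
	go func() {
		defer pendingDeletions.Done()
		_ = path // real code would remove the file at path here
		atomic.AddInt64(&deleted, 1)
	}()
}

// deletedCount returns the number of finished deletions.
func deletedCount() int64 {
	return atomic.LoadInt64(&deleted)
}

// runTransactionsSketch waits for earlier deletions, issues new ones,
// and returns only after the new ones are finished too.
func runTransactionsSketch(paths []string) {
	// Deletions issued by a previous call must not race with this call.
	pendingDeletions.Wait()
	for _, p := range paths {
		deleteAsync(p)
	}
	// Do not return while deletions issued by this call are still in flight.
	pendingDeletions.Wait()
}

func main() {
	runTransactionsSketch([]string{"txn/0001", "txn/0002"})
	fmt.Println("finished deletions:", deletedCount())
}
```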
Aliaksandr Valialkin
1244ad810d lib/httpserver: add /ping handler for compatibility with Influx agents
Certain Influx agents check for `/ping` endpoint before starting
to send Influx line protocol data. See https://docs.influxdata.com/influxdb/v1.7/tools/api/#ping-http-endpoint
2019-12-04 19:15:52 +02:00
Aliaksandr Valialkin
359c4d6109 docs: add a link to https://medium.com/@valyala/prometheus-storage-technical-terms-for-humans-4ab4de6c3d48 2019-12-03 22:37:16 +02:00
Aliaksandr Valialkin
face3d57bf app/vmselect: add placeholders for /api/v1/rules and /api/v1/alerts 2019-12-03 19:36:33 +02:00
Aliaksandr Valialkin
a247236f61 lib/storage: fall back to global inverted index if a filter match too many time series in per-day index
Previously this resulted in an error message. The query may succeed via search in global index.
2019-12-03 14:48:31 +02:00
Aliaksandr Valialkin
54741ee578 lib/storage: fix printing tag filters in TagFilters.String 2019-12-03 14:25:13 +02:00
Aliaksandr Valialkin
efbc83a13e lib/storage: print __name__ instead of empty string in user-visible tag filters 2019-12-03 14:18:28 +02:00
Aliaksandr Valialkin
ade453847f docs: typo fixes 2019-12-03 00:44:50 +02:00
Aliaksandr Valialkin
f52874dab4 lib/storage: optimize regexp filter search 2019-12-03 00:43:12 +02:00
Artem Navoiev
652ba59ce9 [docs] update release page doc 2019-12-02 23:01:51 +02:00
Artem Navoiev
3e81ab2f75 [docs] change titles 2019-12-02 22:53:11 +02:00
Artem Navoiev
a778233877 [docs] change titles 2019-12-02 22:50:54 +02:00
Aliaksandr Valialkin
14100ed643 vendor: update github.com/VictoriaMetrics/metrics from v1.9.1 to v1.9.2
This fixes possible deadlock when metrics.WritePrometheus calls Gauge callback, which calls metrics functions with internal lock.
2019-12-02 22:33:33 +02:00
Artem Navoiev
cfc6e7df07 [docs] revert titles 2019-12-02 22:06:39 +02:00
Artem Navoiev
c07a83374c [docs] remove double titles 2019-12-02 22:02:59 +02:00
Artem Navoiev
c76b2be21f [ci] add github pages action 2019-12-02 21:53:33 +02:00
Aliaksandr Valialkin
638a5cbb16 lib/{mergeset,storage}: remove transaction files only after the mentioned dirs are really removed
This should fix the issue on NFS when incompletely removed dirs may be left
after unclean shutdown (OOM, kill -9, hard reset, etc.), while the corresponding transaction
files are already removed.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-12-02 21:36:31 +02:00
Aliaksandr Valialkin
20812008a7 lib/storage: remove metricID with missing metricID->metricName entry
The metricID->metricName entry can be missing in the indexdb after unclean shutdown
when only a part of entries for new time series is written into indexdb.

Recover from such a situation by removing the broken metricID. New metricID
will be automatically created for time series with the given metricName
when a new data point arrives for it.
2019-12-02 20:46:44 +02:00
Aliaksandr Valialkin
62a915f2b2 lib/storage: protect from time drift during indexdb rotation
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/248
2019-12-02 14:44:42 +02:00
Aliaksandr Valialkin
42da569bcd lib/logger: merge file and line labels into location="file:line"
This should improve the usability for `vm_log_messages_total` metric during practical queries
2019-12-02 14:44:40 +02:00
Aliaksandr Valialkin
70b8191fab lib/storage: generate more human-friendly result in TagFilters.String 2019-12-02 13:52:22 +02:00
Aliaksandr Valialkin
9476b73527 app/vmselect/promql: estimate per-series scrape interval as 0.6 quantile for the first 100 intervals
This should improve scrape interval estimation for time series with gaps.
2019-12-02 13:42:33 +02:00
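A rough sketch of why a 0.6 quantile over the first 100 intervals tolerates gaps is given below. It is not the code from `app/vmselect/promql`: the function name, the `maxIntervals` constant and the nearest-rank quantile definition are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// maxIntervals limits how many leading intervals are inspected (100 per the commit message).
const maxIntervals = 100

// estimateScrapeInterval returns the 0.6 quantile of the first maxIntervals
// deltas between adjacent timestamps, or 0 if there are fewer than two timestamps.
// The quantile is taken with a simple nearest-rank rule, which is an assumption of this sketch.
func estimateScrapeInterval(timestamps []int64) int64 {
	n := len(timestamps) - 1
	if n < 1 {
		return 0
	}
	if n > maxIntervals {
		n = maxIntervals
	}
	intervals := make([]int64, n)
	for i := range intervals {
		intervals[i] = timestamps[i+1] - timestamps[i]
	}
	sort.Slice(intervals, func(i, j int) bool { return intervals[i] < intervals[j] })
	idx := int(math.Round(0.6 * float64(n-1)))
	return intervals[idx]
}

func main() {
	// Samples arrive every 10 units, with one gap of 60 units between 30 and 90.
	ts := []int64{0, 10, 20, 30, 90, 100, 110}
	fmt.Println(estimateScrapeInterval(ts))
}
```

With these timestamps the sorted intervals are 10, 10, 10, 10, 10, 60, so the quantile picks 10, whereas a plain mean (110/6) would be pulled upwards by the gap.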
Aliaksandr Valialkin
542b9c2043 lib/logger: consistency renaming from vm_log_messages_count to vm_log_messages_total, since this is a counter 2019-12-02 00:49:00 +02:00
Aliaksandr Valialkin
c567919f80 lib/logger: track the number of log messages by (level, file, line) in the vm_log_messages_count metric 2019-12-01 18:37:49 +02:00
Aliaksandr Valialkin
761645b20a lib/netutil: use IPv6 for both listening and dialing if -enableTCP6 is set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/244
2019-12-01 02:57:13 +02:00
Aliaksandr Valialkin
811b7a8303 app/vminsert/influx: allow empty measurement in Influx line protocol
In this case metric names are mapped directly from field names without any prefixes.
2019-11-30 23:18:41 +02:00
Artem Navoiev
4972bd4c96 Update release guide add Wiki section. Change styling 2019-11-30 21:10:42 +02:00
Artem Navoiev
335e0f8f6a Update release guide add Wiki section 2019-11-30 21:08:48 +02:00
Artem Navoiev
505e46980a [ci] push docs/*.md file to wiki 2019-11-30 20:58:28 +02:00
Artem Navoiev
ab88b77515 rename doc to docs 2019-11-30 20:48:40 +02:00
Artem Navoiev
3d8e75e065 [ci] test wiki push 2019-11-30 20:38:37 +02:00
Artem Navoiev
74b4ccfc91 [ci] push to wiki 2019-11-30 20:36:10 +02:00
Aliaksandr Valialkin
75ff524a4e app/vmselect/promql: fix corner case for increase over time series with gaps
In this case `increase` could return invalid high value for the first point after the gap.
2019-11-30 01:34:56 +02:00
Aliaksandr Valialkin
96492348cb deployment/docker/certs: update TLS certs source from alpine:3.9 to alpine:3.10 2019-11-29 19:57:29 +02:00
Aliaksandr Valialkin
f733cb2186 lib/backup: cosmetic fixes after #243 2019-11-29 18:07:04 +02:00
glebsam
15b7406f7b Add option to provide custom endpoint for S3, add option to specify S3 config profile (#243)
* Add option to provide custom endpoint for S3 for use with s3-compatible storages, add option to specify S3 config profile

* make fmt
2019-11-29 17:59:56 +02:00
Aliaksandr Valialkin
9010c6a1d6 lib/netutil: add -enableTCP6 command-line flag for enabling listening for IPv6 additionally to IPv4 TCP ports 2019-11-29 17:32:47 +02:00
Aliaksandr Valialkin
a7125a5b7b lib/backup: remove flock.lock file in empty dirs
This fixes an issue when VictoriaMetrics doesn't see the restored data after the following operations:

1. Stop VictoriaMetrics.
2. Delete `<-storageDataPath>` dir.
3. Start VictoriaMetrics, then stop it.
4. Restore data from backup with `vmrestore`.
5. Start VictoriaMetrics.

`vmrestore` didn't delete properly empty dirs in `<-storageDataPath>/indexdb` because of the remaining `flock.lock` files in these dirs.
2019-11-28 13:38:58 +02:00
Aliaksandr Valialkin
a6d7179286 README.md: remove the unnecessary step during restoring from backups 2019-11-27 19:57:03 +02:00
Aliaksandr Valialkin
e828647d0f vendor: make vendor-update 2019-11-27 15:37:14 +02:00
Aliaksandr Valialkin
31fb6f2b07 vendor: update github.com/VictoriaMetrics/fastcache from v1.5.2 to v1.5.4 2019-11-27 15:30:33 +02:00
Aliaksandr Valialkin
2c86816950 deployment/docker: update Grafana from v6.4.4 to v6.5.0 2019-11-27 15:10:37 +02:00
Aliaksandr Valialkin
4c859d980c app/vmselect/prometheus: consistently apply nocache arg to /api/v1/query the same way as to /api/v1/query_range
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/241
2019-11-26 22:55:43 +02:00
Aliaksandr Valialkin
14bcff6015 lib/httpserver: improve docs for -tls* flags to be more clear
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/242
2019-11-26 18:08:35 +02:00
Aliaksandr Valialkin
110235f789 app/vmselect/prometheus: fix content-type for /api/v1/export responses
The correct Content-Type should be `application/stream+json` instead of `application/json`
Thanks to Joshua Ryder for pointing to this.
2019-11-26 17:45:26 +02:00
Aliaksandr Valialkin
205233d9a7 app/vmselect/promql: remove zero timeseries from prometheus_buckets output 2019-11-25 19:10:23 +02:00
Aliaksandr Valialkin
3f99f39e9b app/vmselect/prometheus: reduce default value for -search.latencyOffset from 60s to 30s
30 seconds should be enough for almost all the cases
2019-11-25 16:33:42 +02:00
Aliaksandr Valialkin
e91cb34c0e app/vmselect/promql: allow nested parens 2019-11-25 16:13:41 +02:00
Aliaksandr Valialkin
826dfd63a5 vendor: update github.com/VictoriaMetrics/metrics from v1.9.0 to v1.9.1 2019-11-25 15:23:01 +02:00
Aliaksandr Valialkin
0401969d78 app/vmselect/promql: re-use metrics.Histogram when calculating histogram function for each point on the graph
This should reduce the amount of memory allocations
2019-11-25 14:24:21 +02:00
Aliaksandr Valialkin
da98703748 app/vmselect/promql: optimize binary search over big number of samples during rollup calculations 2019-11-25 14:01:46 +02:00
Aliaksandr Valialkin
c28876172f app/vmselect/promql: adjust tests after the upgrade of github.com/VictoriaMetrics/metrics from v1.8.3 to v1.9.0 2019-11-25 13:43:57 +02:00
Aliaksandr Valialkin
66c53bf3c6 vendor: update github.com/VictoriaMetrics/metrics from v1.8.3 to v1.9.0 2019-11-25 13:19:43 +02:00
Aliaksandr Valialkin
50ae1879c6 app/vmselect/promql: add histogram aggregate function, which is useful for building heatmaps from multiple time series 2019-11-24 00:04:25 +02:00
Aliaksandr Valialkin
4ff2fbcf3f vendor: update github.com/VictoriaMetrics/metrics from v1.8.2 to v1.8.3 2019-11-24 00:04:24 +02:00
Aliaksandr Valialkin
5285acae3e lib/decimal: calculate ln2/ln10 constant during compile time 2019-11-23 15:52:58 +02:00
Aliaksandr Valialkin
8582b50360 app/vmselect/promql: do not take into account buckets with negative counters in prometheus_buckets 2019-11-23 14:19:25 +02:00
Aliaksandr Valialkin
19dfe52254 app/vmselect/promql: properly handle histogram_quantile(0, ...) with zero buckets 2019-11-23 14:02:35 +02:00
Aliaksandr Valialkin
4bb88843cf app/vmselect: add vm_per_query_{rows,series}_processed_count histograms 2019-11-23 13:23:26 +02:00
Aliaksandr Valialkin
0827bb6ce5 vendor: update github.com/VictoriaMetrics/metrics from v1.8.1 to v1.8.2 2019-11-23 11:48:54 +02:00
Aliaksandr Valialkin
7753c8c0a1 app/vmselect/promql: transparently apply prometheus_buckets in histogram_quantile 2019-11-23 11:48:51 +02:00
Aliaksandr Valialkin
ef25e1b049 vendor: update github.com/VictoriaMetrics/metrics from v1.8.0 to v1.8.1 2019-11-23 00:49:13 +02:00
Aliaksandr Valialkin
9d1fcb2be6 vendor: update github.com/VictoriaMetrics/metrics from v1.7.2 to v1.8.0. This version supports histograms 2019-11-23 00:20:27 +02:00
Aliaksandr Valialkin
c4287b3c86 app/vmselect/promql: add prometheus_buckets function for converting the upcoming histogram buckets from github.com/VictoriaMetrics/metrics to Prometheus-compatible buckets 2019-11-23 00:20:20 +02:00
Aliaksandr Valialkin
1f3fd2c910 app/vmselect: adjust end arg instead of adjusting start arg if start > end
`start` arg is more likely to be set properly than the `end` arg,
so it is expected that the `end` arg could be adjusted if it was set incorrectly.
2019-11-22 16:12:19 +02:00
Aliaksandr Valialkin
90b03309de vendor: updated github.com/valyala/gozstd from v1.6.2 to v1.6.3 2019-11-21 23:57:00 +02:00
Aliaksandr Valialkin
7a4635f853 all: remove the remaining mentions of cluster version 2019-11-21 23:18:22 +02:00
Aliaksandr Valialkin
3e9b7addb1 lib/httpserver: typo fix in -httpAuth.password command-line description 2019-11-21 21:54:26 +02:00
Aliaksandr Valialkin
f652c0f40f lib/storage: move non-matching tag filters to the top at matchTagFilters
This should reduce the amount of useless work needed for matching the next metricNames.
2019-11-21 21:35:13 +02:00
Aliaksandr Valialkin
b8cde6cce1 lib/storage: speed up time series search for queries with multiple filters
Use optimized specialized binary search for uint64 metricIDs instead of generic sort.Search.
2019-11-21 18:43:17 +02:00
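For illustration, a specialized binary search over a sorted `[]uint64` might look like the sketch below. It is written for this page and is not copied from `lib/storage`; the function name is hypothetical. Compared with the generic `sort.Search`, it avoids calling a closure on every step.

```go
package main

import (
	"fmt"
	"sort"
)

// binarySearchUint64 returns the smallest index i such that a[i] >= v,
// or len(a) if there is no such index. The slice a must be sorted in ascending order.
func binarySearchUint64(a []uint64, v uint64) int {
	lo, hi := 0, len(a)
	for lo < hi {
		mid := int(uint(lo+hi) >> 1) // avoids overflow of lo+hi
		if a[mid] < v {
			lo = mid + 1
		} else {
			hi = mid
		}
	}
	return lo
}

func main() {
	metricIDs := []uint64{1, 3, 5, 7}
	v := uint64(5)

	// The generic version calls the closure on every probe.
	generic := sort.Search(len(metricIDs), func(i int) bool { return metricIDs[i] >= v })
	specialized := binarySearchUint64(metricIDs, v)

	fmt.Println(generic == specialized, specialized)
}
```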
Aliaksandr Valialkin
aeea59e280 Makefile: create files with sha256 checksums during make release
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/19
2019-11-20 22:43:37 +02:00
Aliaksandr Valialkin
74e563ca3f README.md: added a link to https://github.com/dreamteam-gg/ansible-victoriametrics-role 2019-11-20 21:26:43 +02:00
Aliaksandr Valialkin
5c1e4143e9 lib/storage: verify the number of returned metricIDs in BenchmarkHeadPostingForMatchers 2019-11-20 15:39:28 +02:00
Aliaksandr Valialkin
52d7ca6bf0 lib/decimal: increase decimal->float speed conversion for integer numbers 2019-11-20 13:04:34 +02:00
Aliaksandr Valialkin
75eeea21ee lib/decimal: reduce rounding error when converting from decimal to float with negative exponent
While at it, slightly increase the conversion performance by moving fast path to the top of the loop.
2019-11-19 23:35:33 +02:00
Artem Navoiev
c03b87dac0 update version of codecove to 1.04 2019-11-19 22:23:14 +02:00
Aliaksandr Valialkin
259dc95366 make vendor-update 2019-11-19 21:35:07 +02:00
Aliaksandr Valialkin
cfb9fa2100 lib/backup: retrieve only the required metadata when reading GCS objects 2019-11-19 21:06:34 +02:00
Aliaksandr Valialkin
355ccba81a make vendor-update 2019-11-19 21:05:37 +02:00
Aliaksandr Valialkin
443189fb0a app/{vmbackup,vmrestore}: add -maxBytesPerSecond command-line flag for limiting the used network bandwidth during backup / restore 2019-11-19 20:31:52 +02:00
Aliaksandr Valialkin
2db06f0ef8 lib/backup: prevent from restoring to directory which is in use by VictoriaMetrics during the restore 2019-11-19 18:36:23 +02:00
Aliaksandr Valialkin
0094bc4fc9 app/vmselect/prometheus: properly adjust too big time time on /api/v1/query
Too big `time` must be adjusted to `now()-queryOffset`.
2019-11-19 00:42:00 +02:00
Aliaksandr Valialkin
b6f22a62cb lib/storage: increase the number of created time series in BenchmarkHeadPostingForMatchers in order to be on par with Prometheus
The previous commit accidentally created 10x fewer time series than Prometheus,
which led to invalid benchmark results.

The updated benchmark results:

benchmark                                                          old ns/op      new ns/op     delta
BenchmarkHeadPostingForMatchers/n="1"                              272756688      6194893       -97.73%
BenchmarkHeadPostingForMatchers/n="1",j="foo"                      138132923      10781372      -92.19%
BenchmarkHeadPostingForMatchers/j="foo",n="1"                      134723762      10632834      -92.11%
BenchmarkHeadPostingForMatchers/n="1",j!="foo"                     195823953      10679975      -94.55%
BenchmarkHeadPostingForMatchers/i=~".*"                            7962582919     100118510     -98.74%
BenchmarkHeadPostingForMatchers/i=~".+"                            7589543864     154955671     -97.96%
BenchmarkHeadPostingForMatchers/i=~""                              1142371741     258003769     -77.42%
BenchmarkHeadPostingForMatchers/i!=""                              9964150263     159783895     -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo"              216995884      10937895      -94.96%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo"       202541348      10990027      -94.57%
BenchmarkHeadPostingForMatchers/n="1",i!=""                        486285711      87004349      -82.11%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo"                350776931      53342793      -84.79%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo"              380888565      54256156      -85.76%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo"             89500296       21823279      -75.62%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo"       379529654      46671359      -87.70%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo"     424563825      53915842      -87.30%

VictoriaMetrics uses 1GB of RAM during the benchmark (vs 3.5GB of RAM for Prometheus)
2019-11-18 19:50:58 +02:00
Aliaksandr Valialkin
8a0dfc6220 lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus
See the corresponding benchmark in Prometheus - 23c0299d85/tsdb/head_bench_test.go (L52)

The benchmark allows performing apples-to-apples comparison of time series search
in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness -
contains incorrect numbers for VictoriaMetrics, since this benchmark did not exist yet. Fix this.

Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots:

- Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers
- VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers

Benchmark results:
benchmark                                                          old ns/op      new ns/op     delta
BenchmarkHeadPostingForMatchers/n="1"                              272756688      364977        -99.87%
BenchmarkHeadPostingForMatchers/n="1",j="foo"                      138132923      1181636       -99.14%
BenchmarkHeadPostingForMatchers/j="foo",n="1"                      134723762      1141578       -99.15%
BenchmarkHeadPostingForMatchers/n="1",j!="foo"                     195823953      1148056       -99.41%
BenchmarkHeadPostingForMatchers/i=~".*"                            7962582919     8716755       -99.89%
BenchmarkHeadPostingForMatchers/i=~".+"                            7589543864     12096587      -99.84%
BenchmarkHeadPostingForMatchers/i=~""                              1142371741     16164560      -98.59%
BenchmarkHeadPostingForMatchers/i!=""                              9964150263     12230021      -99.88%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo"              216995884      1173476       -99.46%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo"       202541348      1299743       -99.36%
BenchmarkHeadPostingForMatchers/n="1",i!=""                        486285711      11555193      -97.62%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo"                350776931      5607506       -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo"              380888565      6380335       -98.32%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo"             89500296       2078970       -97.68%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo"       379529654      6561368       -98.27%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo"     424563825      6757132       -98.41%

The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics.

As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark.

Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 18:45:06 +02:00
Aliaksandr Valialkin
2ab4cea5e5 lib/storage: always start using per-day inverted index on the next day after its creation
The current day could miss entries for already stopped time series before
enabling per-day index.

This fixes the issue when queries return empty results during the first hour after
upgrading to v1.29.*
2019-11-16 12:11:25 +02:00
Aliaksandr Valialkin
c050abbbad deployment/docker: update Prometheus version from v2.12.0 to v2.14.0 2019-11-16 00:13:15 +02:00
Aliaksandr Valialkin
3f1637fae8 app/vmselect/promql: properly calculate integrate(q[d]) 2019-11-13 21:10:41 +02:00
Aliaksandr Valialkin
c56b9ed03b app/victoria-metrics: add build rules for GOARCH=ppc64le
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235
2019-11-13 20:24:33 +02:00
Aliaksandr Valialkin
3fd32e331a app/vmselect/promql: use universal approach for determining maxByteSliceLen on 32-bit and 64-bit archs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235
2019-11-13 20:24:26 +02:00
Aliaksandr Valialkin
119dfd01bb lib/storage: add vm_cache_size_bytes{type="storage/hour_metric_ids"} metric 2019-11-13 20:24:21 +02:00
Aliaksandr Valialkin
86a1cd700b lib/storage: remove inmemory index for recent hour, since it uses too much memory
Production workload shows that the index requires ~4Kb of RAM per active time series.
This is too much for high number of active time series, so let's delete this index.

Now the queries should fall back to the index for the current day instead of the index
for the recent hour. The query performance for the current day index should be good enough
given the 100M rows/sec scan speed per CPU core.
2019-11-13 17:58:07 +02:00
1142 changed files with 161187 additions and 131226 deletions


@@ -26,5 +26,16 @@ $ ./victoria-metrics-prod --version
victoria-metrics-20190730-121249-heads-single-node-0-g671d9e55
```
**Used command-line flags**
Command-line flags are listed as `flag{name="httpListenAddr", value=":443"} 1` lines at `/metrics` page.
See the following docs for details:
* [monitoring for single-node VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#monitoring)
* [montioring for VictoriaMetrics cluster](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/README.md#monitoring)
**Additional context**
Add any other context about the problem here such as error logs, `/metrics` output, screenshots from [the official Grafana dashboard for VictoriaMetrics](https://grafana.com/dashboards/10229).
Add any other context about the problem here such as error logs from VictoriaMetrics and Prometheus,
`/metrics` output, screenshots from the official Grafana dashboards for VictoriaMetrics:
* [Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/dashboards/10229)
* [Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176)

30
.github/workflows/github-pages.yml vendored Normal file

@@ -0,0 +1,30 @@
name: github-pages
on:
  push:
    paths:
      - 'docs/*'
      - 'README.md'
    branches:
      - master
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - name: publish
        shell: bash
        env:
          TOKEN: ${{secrets.CI_TOKEN}}
        run: |
          git clone https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.github.io.git gpages
          cp docs/* gpages
          cp README.md gpages
          cd gpages
          git config --local user.email "info@victoriametrics.com"
          git config --local user.name "Vika"
          git add .
          git commit -m "update github pages"
          remote_repo="https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.github.io.git"
          git push "${remote_repo}"
          cd ..
          rm -rf gpages


@@ -1,7 +1,13 @@
name: main
on:
- push
- pull_request
push:
paths-ignore:
- 'docs/**'
- '**.md'
pull_request:
paths-ignore:
- 'docs/**'
- '**.md'
jobs:
build:
name: Build
@@ -24,21 +30,21 @@ jobs:
env:
GO111MODULE: on
run: |
export PATH=$PATH:$(go env GOPATH)/bin # temporary fix. See https://github.com/actions/setup-go/issues/14
make check-all
git diff --exit-code
make test-full
make test-pure
make test-full-386
make victoria-metrics
make victoria-metrics-pure
make victoria-metrics-arm
make victoria-metrics-arm64
make vmutils
GOOS=freebsd go build -mod=vendor ./app/victoria-metrics
GOOS=darwin go build -mod=vendor ./app/victoria-metrics
export PATH=$PATH:$(go env GOPATH)/bin # temporary fix. See https://github.com/actions/setup-go/issues/14
make check-all
git diff --exit-code
make test-full
make test-pure
make test-full-386
make victoria-metrics
make victoria-metrics-pure
make victoria-metrics-arm
make victoria-metrics-arm64
make vmutils
GOOS=freebsd go build -mod=vendor ./app/victoria-metrics
GOOS=darwin go build -mod=vendor ./app/victoria-metrics
- name: Publish coverage
uses: codecov/codecov-action@v1.0.0
uses: codecov/codecov-action@v1.0.6
with:
token: ${{secrets.CODECOV_TOKEN}}
file: ./coverage.txt

28
.github/workflows/wiki.yml vendored Normal file

@@ -0,0 +1,28 @@
name: wiki
on:
  push:
    paths:
      - 'docs/*'
    branches:
      - master
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - name: publish
        shell: bash
        env:
          TOKEN: ${{secrets.CI_TOKEN}}
        run: |
          git clone https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.wiki.git wiki
          cp docs/* wiki
          cd wiki
          git config --local user.email "info@victoriametrics.com"
          git config --local user.name "Vika"
          git add .
          git commit -m "update wiki pages"
          remote_repo="https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.wiki.git"
          git push "${remote_repo}"
          cd ..
          rm -rf wiki

2
.gitignore vendored

@@ -1,3 +1,4 @@
/tmp
/tags
/pkg
*.pprof
@@ -7,6 +8,7 @@
*.swp
/gocache-for-docker
/victoria-metrics-data
/vmagent-remotewrite-data
/vmstorage-data
/vmselect-cache
/package/temp-deb-*

@@ -175,7 +175,7 @@
END OF TERMS AND CONDITIONS
Copyright 2019 VictoriaMetrics, Inc.
Copyright 2019-2020 VictoriaMetrics, Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

@@ -11,7 +11,10 @@ endif
GO_BUILDINFO = -X '$(PKG_PREFIX)/lib/buildinfo.Version=$(APP_NAME)-$(shell date -u +'%Y%m%d-%H%M%S')-$(BUILDINFO_TAG)'
all: \
victoria-metrics-prod
victoria-metrics-prod \
vmagent-prod \
vmbackup-prod \
vmrestore-prod
include app/*/Makefile
include deployment/*/Makefile
@@ -21,15 +24,18 @@ clean:
publish: \
publish-victoria-metrics \
publish-vmagent \
publish-vmbackup \
publish-vmrestore
package: \
package-victoria-metrics \
package-vmagent \
package-vmbackup \
package-vmrestore
vmutils: \
vmagent \
vmbackup \
vmrestore
@@ -38,12 +44,15 @@ release: \
release-vmutils
release-victoria-metrics: victoria-metrics-prod
cd bin && tar czf victoria-metrics-$(PKG_TAG).tar.gz victoria-metrics-prod
cd bin && tar czf victoria-metrics-$(PKG_TAG).tar.gz victoria-metrics-prod && \
sha256sum victoria-metrics-$(PKG_TAG).tar.gz > victoria-metrics-$(PKG_TAG)_checksums.txt
release-vmutils: \
vmagent-prod \
vmbackup-prod \
vmrestore-prod
cd bin && tar czf vmutils-$(PKG_TAG).tar.gz vmbackup-prod vmrestore-prod
cd bin && tar czf vmutils-$(PKG_TAG).tar.gz vmagent-prod vmbackup-prod vmrestore-prod && \
sha256sum vmutils-$(PKG_TAG).tar.gz > vmutils-$(PKG_TAG)_checksums.txt
pprof-cpu:
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics@ $(PPROF_FILE)
@@ -68,8 +77,10 @@ errcheck: install-errcheck
errcheck -exclude=errcheck_excludes.txt ./app/vminsert/...
errcheck -exclude=errcheck_excludes.txt ./app/vmselect/...
errcheck -exclude=errcheck_excludes.txt ./app/vmstorage/...
errcheck -exclude=errcheck_excludes.txt ./app/vmagent/...
errcheck -exclude=errcheck_excludes.txt ./app/vmbackup/...
errcheck -exclude=errcheck_excludes.txt ./app/vmrestore/...
errcheck -exclude=errcheck_excludes.txt ./app/vmalert/...
install-errcheck:
which errcheck || GO111MODULE=off go get -u github.com/kisielk/errcheck
@@ -77,16 +88,19 @@ install-errcheck:
check-all: fmt vet lint errcheck golangci-lint
test:
GO111MODULE=on go test -tags=integration -mod=vendor ./lib/... ./app/...
GO111MODULE=on go test -mod=vendor ./lib/... ./app/...
test-race:
GO111MODULE=on go test -mod=vendor -race ./lib/... ./app/...
test-pure:
GO111MODULE=on CGO_ENABLED=0 go test -tags=integration -mod=vendor ./lib/... ./app/...
GO111MODULE=on CGO_ENABLED=0 go test -mod=vendor ./lib/... ./app/...
test-full:
GO111MODULE=on go test -tags=integration -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
GO111MODULE=on go test -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
test-full-386:
GO111MODULE=on GOARCH=386 go test -tags=integration -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
GO111MODULE=on GOARCH=386 go test -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
benchmark:
GO111MODULE=on go test -mod=vendor -bench=. ./lib/...

625
README.md

File diff suppressed because it is too large

@@ -3,12 +3,50 @@
victoria-metrics:
APP_NAME=victoria-metrics $(MAKE) app-local
victoria-metrics-race:
APP_NAME=victoria-metrics RACE=-race $(MAKE) app-local
victoria-metrics-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker
victoria-metrics-pure-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-pure
victoria-metrics-amd64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-amd64
victoria-metrics-arm-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-arm
victoria-metrics-arm64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-arm64
victoria-metrics-ppc64le-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-ppc64le
victoria-metrics-386-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-386
package-victoria-metrics:
APP_NAME=victoria-metrics \
$(MAKE) package-via-docker
APP_NAME=victoria-metrics $(MAKE) package-via-docker
package-victoria-metrics-pure:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-pure
package-victoria-metrics-amd64:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-amd64
package-victoria-metrics-arm:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-arm
package-victoria-metrics-arm64:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-arm64
package-victoria-metrics-ppc64le:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-ppc64le
package-victoria-metrics-386:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-386
publish-victoria-metrics:
APP_NAME=victoria-metrics $(MAKE) publish-via-docker
@@ -20,30 +58,24 @@ run-victoria-metrics:
ARGS='-graphiteListenAddr=:2003 -opentsdbListenAddr=:4242 -retentionPeriod=12 -search.maxUniqueTimeseries=1000000 -search.maxQueryDuration=10m' \
$(MAKE) run-via-docker
victoria-metrics-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-amd64 ./app/victoria-metrics
victoria-metrics-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-arm ./app/victoria-metrics
victoria-metrics-arm-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-arm' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm' $(MAKE) app-via-docker
victoria-metrics-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-arm64 ./app/victoria-metrics
victoria-metrics-arm64-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-arm64' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm64' $(MAKE) app-via-docker
victoria-metrics-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-ppc64le ./app/victoria-metrics
victoria-metrics-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-386 ./app/victoria-metrics
victoria-metrics-386-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-386' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=386' $(MAKE) app-via-docker
victoria-metrics-pure:
APP_NAME=victoria-metrics $(MAKE) app-local-pure
victoria-metrics-pure-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-pure' DOCKER_OPTS='--env CGO_ENABLED=0' $(MAKE) app-via-docker
### Packaging as DEB - amd64
victoria-metrics-package-deb: victoria-metrics-prod
./package/package_deb.sh amd64

@@ -1,5 +1,8 @@
FROM scratch
COPY --from=local/certs:1.0.2 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY bin/victoria-metrics-prod .
ARG base_image
FROM $base_image
EXPOSE 8428
ENTRYPOINT ["/victoria-metrics-prod"]
ARG src_binary
COPY $src_binary ./victoria-metrics-prod

@@ -9,44 +9,55 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
var httpListenAddr = flag.String("httpListenAddr", ":8428", "TCP address to listen for http connections")
var (
httpListenAddr = flag.String("httpListenAddr", ":8428", "TCP address to listen for http connections")
minScrapeInterval = flag.Duration("dedup.minScrapeInterval", 0, "Remove superfluous samples from time series if they are located closer to each other than this duration. "+
"This may be useful for reducing overhead when multiple identically configured Prometheus instances write data to the same VictoriaMetrics. "+
"Deduplication is disabled if the -dedup.minScrapeInterval is 0")
)
func main() {
flag.Parse()
envflag.Parse()
buildinfo.Init()
logger.Init()
logger.Infof("starting VictoraMetrics at %q...", *httpListenAddr)
logger.Infof("starting VictoriaMetrics at %q...", *httpListenAddr)
startTime := time.Now()
storage.SetMinScrapeIntervalForDeduplication(*minScrapeInterval)
vmstorage.Init()
vmselect.Init()
vminsert.Init()
startSelfScraper()
go httpserver.Serve(*httpListenAddr, requestHandler)
logger.Infof("started VictoriaMetrics in %s", time.Since(startTime))
logger.Infof("started VictoriaMetrics in %.3f seconds", time.Since(startTime).Seconds())
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
stopSelfScraper()
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
startTime = time.Now()
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
vminsert.Stop()
logger.Infof("successfully shut down the webservice in %s", time.Since(startTime))
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
vmstorage.Stop()
vmselect.Stop()
fs.MustStopDirRemover()
logger.Infof("the VictoriaMetrics has been stopped in %s", time.Since(startTime))
logger.Infof("the VictoriaMetrics has been stopped in %.3f seconds", time.Since(startTime).Seconds())
}
func requestHandler(w http.ResponseWriter, r *http.Request) bool {

@@ -1,5 +1,3 @@
// +build integration
package main
import (
@@ -23,6 +21,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -148,7 +147,7 @@ func setUp() {
}
func processFlags() {
flag.Parse()
envflag.Parse()
for _, fv := range []struct {
flag string
value string
@@ -302,6 +301,9 @@ func readIn(readFor string, t *testing.T, insertTime time.Time) []test {
s := newSuite(t)
var tt []test
s.noError(filepath.Walk(filepath.Join(testFixturesDir, readFor), func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
if filepath.Ext(path) != ".json" {
return nil
}

@@ -0,0 +1,103 @@
package main
import (
"flag"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
var (
selfScrapeInterval = flag.Duration("selfScrapeInterval", 0, "Interval for self-scraping own metrics at /metrics page")
selfScrapeInstance = flag.String("selfScrapeInstance", "self", "Value for 'instance' label, which is added to self-scraped metrics")
selfScrapeJob = flag.String("selfScrapeJob", "victoria-metrics", "Value for 'job' label, which is added to self-scraped metrics")
)
var selfScraperStopCh chan struct{}
var selfScraperWG sync.WaitGroup
func startSelfScraper() {
selfScraperStopCh = make(chan struct{})
selfScraperWG.Add(1)
go func() {
defer selfScraperWG.Done()
selfScraper(*selfScrapeInterval)
}()
}
func stopSelfScraper() {
close(selfScraperStopCh)
selfScraperWG.Wait()
}
func selfScraper(scrapeInterval time.Duration) {
if scrapeInterval <= 0 {
// Self-scrape is disabled.
return
}
logger.Infof("started self-scraping `/metrics` page with interval %.3f seconds", scrapeInterval.Seconds())
var bb bytesutil.ByteBuffer
var rows prometheus.Rows
var mrs []storage.MetricRow
var labels []prompb.Label
t := time.NewTicker(scrapeInterval)
var currentTimestamp int64
for {
select {
case <-selfScraperStopCh:
t.Stop()
logger.Infof("stopped self-scraping `/metrics` page")
return
case currentTime := <-t.C:
currentTimestamp = currentTime.UnixNano() / 1e6
}
bb.Reset()
httpserver.WritePrometheusMetrics(&bb)
s := bytesutil.ToUnsafeString(bb.B)
rows.Reset()
rows.Unmarshal(s)
mrs = mrs[:0]
for i := range rows.Rows {
r := &rows.Rows[i]
labels = labels[:0]
labels = addLabel(labels, "", r.Metric)
labels = addLabel(labels, "job", *selfScrapeJob)
labels = addLabel(labels, "instance", *selfScrapeInstance)
for j := range r.Tags {
t := &r.Tags[j]
labels = addLabel(labels, t.Key, t.Value)
}
if len(mrs) < cap(mrs) {
mrs = mrs[:len(mrs)+1]
} else {
mrs = append(mrs, storage.MetricRow{})
}
mr := &mrs[len(mrs)-1]
mr.MetricNameRaw = storage.MarshalMetricNameRaw(mr.MetricNameRaw[:0], labels)
mr.Timestamp = currentTimestamp
mr.Value = r.Value
}
logger.Infof("writing %d rows at timestamp %d", len(mrs), currentTimestamp)
vmstorage.AddRows(mrs)
}
}
func addLabel(dst []prompb.Label, key, value string) []prompb.Label {
if len(dst) < cap(dst) {
dst = dst[:len(dst)+1]
} else {
dst = append(dst, prompb.Label{})
}
lb := &dst[len(dst)-1]
lb.Name = bytesutil.ToUnsafeBytes(key)
lb.Value = bytesutil.ToUnsafeBytes(value)
return dst
}

@@ -1,18 +1,18 @@
// +build integration
// Source https://github.com/prometheus/prometheus/blob/master/prompb/remote.pb.go . Code is copy pasted and cleaned up
package test
// Source https://github.com/prometheus/prometheus/blob/master/prompb/remote.pb.go . Code is copy pasted and cleaned up
import (
"encoding/binary"
"math"
"math/bits"
)
// WriteRequest is write request
type WriteRequest struct {
Timeseries []TimeSeries `protobuf:"bytes,1,rep,name=timeseries,proto3" json:"timeseries"`
}
// Size returns m size in bytes after marshaling.
func (m *WriteRequest) Size() (n int) {
if m == nil {
return 0
@@ -31,6 +31,7 @@ func sovRemote(x uint64) (n int) {
return (bits.Len64(x|1) + 6) / 7
}
// Marshal marshals m.
func (m *WriteRequest) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -41,11 +42,13 @@ func (m *WriteRequest) Marshal() (dAtA []byte, err error) {
return dAtA[:n], nil
}
// MarshalTo marshals m to dAtA
func (m *WriteRequest) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to dAtA.
func (m *WriteRequest) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if len(m.Timeseries) > 0 {
@@ -77,11 +80,13 @@ func encodeVarintRemote(dAtA []byte, offset int, v uint64) int {
return base
}
// Sample is time series sample.
type Sample struct {
Value float64 `protobuf:"fixed64,1,opt,name=value,proto3" json:"value,omitempty"`
Timestamp int64 `protobuf:"varint,2,opt,name=timestamp,proto3" json:"timestamp,omitempty"`
}
// Reset resets m.
func (m *Sample) Reset() { *m = Sample{} }
// TimeSeries represents samples and labels for a single time series.
@@ -90,21 +95,27 @@ type TimeSeries struct {
Samples []Sample `protobuf:"bytes,2,rep,name=samples,proto3" json:"samples"`
}
// Reset resets m.
func (m *TimeSeries) Reset() { *m = TimeSeries{} }
// Label is time series label.
type Label struct {
Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
Value string `protobuf:"bytes,2,opt,name=value,proto3" json:"value,omitempty"`
}
// Reset resets m.
func (m *Label) Reset() { *m = Label{} }
// Labels is a set of labels.
type Labels struct {
Labels []Label `protobuf:"bytes,1,rep,name=labels,proto3" json:"labels"`
}
// Reset resets m.
func (m *Labels) Reset() { *m = Labels{} }
// Marshal marshals m.
func (m *Sample) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -115,11 +126,13 @@ func (m *Sample) Marshal() (dAtA []byte, err error) {
return dAtA[:n], nil
}
// MarshalTo marshals m to dAtA.
func (m *Sample) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to dAtA.
func (m *Sample) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if m.Timestamp != 0 {
@@ -136,6 +149,7 @@ func (m *Sample) MarshalToSizedBuffer(dAtA []byte) (int, error) {
return len(dAtA) - i, nil
}
// Marshal marshals m.
func (m *TimeSeries) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -146,11 +160,13 @@ func (m *TimeSeries) Marshal() (dAtA []byte, err error) {
return dAtA[:n], nil
}
// MarshalTo marshals m to dAtA.
func (m *TimeSeries) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to dAtA.
func (m *TimeSeries) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if len(m.Samples) > 0 {
@@ -184,6 +200,7 @@ func (m *TimeSeries) MarshalToSizedBuffer(dAtA []byte) (int, error) {
return len(dAtA) - i, nil
}
// Marshal marshals m.
func (m *Label) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -194,11 +211,13 @@ func (m *Label) Marshal() (dAtA []byte, err error) {
return dAtA[:n], nil
}
// MarshalTo marshals m to dAtA.
func (m *Label) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to dAtA.
func (m *Label) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
_ = i
@@ -221,6 +240,7 @@ func (m *Label) MarshalToSizedBuffer(dAtA []byte) (int, error) {
return len(dAtA) - i, nil
}
// Marshal marshals m.
func (m *Labels) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -231,11 +251,13 @@ func (m *Labels) Marshal() (dAtA []byte, err error) {
return dAtA[:n], nil
}
// MarshalTo marshals m to dAtA.
func (m *Labels) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to dAtA.
func (m *Labels) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if len(m.Labels) > 0 {
@@ -267,6 +289,7 @@ func encodeVarintTypes(dAtA []byte, offset int, v uint64) int {
return base
}
// Size returns the size of marshaled m.
func (m *Sample) Size() (n int) {
if m == nil {
return 0
@@ -280,6 +303,7 @@ func (m *Sample) Size() (n int) {
return n
}
// Size returns the size of marshaled m.
func (m *TimeSeries) Size() (n int) {
if m == nil {
return 0
@@ -301,6 +325,7 @@ func (m *TimeSeries) Size() (n int) {
return n
}
// Size returns the size of marshaled m.
func (m *Label) Size() (n int) {
if m == nil {
return 0
@@ -318,6 +343,7 @@ func (m *Label) Size() (n int) {
return n
}
// Size returns the size of marshaled m.
func (m *Labels) Size() (n int) {
if m == nil {
return 0

@@ -1,9 +1,8 @@
// +build integration
package test
import "github.com/golang/snappy"
// Compress marshals and compresses wr.
func Compress(wr WriteRequest) ([]byte, error) {
data, err := wr.Marshal()
if err != nil {

@@ -0,0 +1,16 @@
{
"name": "empty-label-match",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395",
"data": [
"empty_label_match 1 {TIME_S-1m}",
"empty_label_match;foo=bar 2 {TIME_S-1m}",
"empty_label_match;foo=baz 3 {TIME_S-1m}"],
"query": ["/api/v1/query_range?query=empty_label_match{foo=~'bar|'}&start={TIME_S}&end={TIME_S}&step=60"],
"result_query_range": {
"status":"success",
"data":{"resultType":"matrix",
"result":[
{"metric":{"__name__":"empty_label_match"},"values":[["{TIME_S}","1"]]},
{"metric":{"__name__":"empty_label_match","foo":"bar"},"values":[["{TIME_S}","2"]]}
]}}
}
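
The fixture above appears to exercise a regexp matcher with an empty alternative (`foo=~'bar|'`). In Prometheus-style selectors a regexp matcher is fully anchored and a missing label behaves like an empty value, so the series without `foo` and the series with `foo="bar"` are expected to match, while `foo="baz"` is not. A minimal Go sketch of that rule follows; `matchesRegexp` is a hypothetical helper written for illustration and is not the matcher used by VictoriaMetrics:

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesRegexp reports whether the label `name` of a series matches expr.
// The expression is fully anchored, as in Prometheus label matchers, and a
// missing label is treated as an empty value.
// Hypothetical helper, not part of the VictoriaMetrics code base.
func matchesRegexp(series map[string]string, name, expr string) bool {
	re := regexp.MustCompile("^(?:" + expr + ")$")
	return re.MatchString(series[name])
}

func main() {
	series := []map[string]string{
		{"__name__": "empty_label_match"},
		{"__name__": "empty_label_match", "foo": "bar"},
		{"__name__": "empty_label_match", "foo": "baz"},
	}
	for _, s := range series {
		fmt.Printf("foo=%q matches: %v\n", s["foo"], matchesRegexp(s, "foo", "bar|"))
	}
}
```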

@@ -13,11 +13,8 @@
"data":{"resultType":"matrix",
"result":[{"metric":{"__name__":"max_lookback_set"},"values":[
["{TIME_S-150s}","4"],
["{TIME_S-140s}","4"],
["{TIME_S-120s}","3"],
["{TIME_S-110s}","3"],
["{TIME_S-60s}","2"],
["{TIME_S-50s}","2"],
["{TIME_S-30s}","1"],
["{TIME_S-20s}","1"]
]}]}}

@@ -19,14 +19,12 @@
["{TIME_S-110s}","3"],
["{TIME_S-100s}","3"],
["{TIME_S-90s}","3"],
["{TIME_S-80s}","3"],
["{TIME_S-70s}","3"],
["{TIME_S-60s}","2"],
["{TIME_S-50s}","2"],
["{TIME_S-40s}","2"],
["{TIME_S-30s}","1"],
["{TIME_S-20s}","1"],
["{TIME_S-10s}","1"],
["{TIME_S}","1"]
["{TIME_S-0s}","1"]
]}]}}
}

76
app/vmagent/Makefile Normal file

@@ -0,0 +1,76 @@
# All these commands must run from repository root.
vmagent:
APP_NAME=vmagent $(MAKE) app-local
vmagent-race:
APP_NAME=vmagent RACE=-race $(MAKE) app-local
vmagent-prod:
APP_NAME=vmagent $(MAKE) app-via-docker
vmagent-pure-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-pure
vmagent-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-amd64
vmagent-arm-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-arm
vmagent-arm64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-arm64
vmagent-ppc64le-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-ppc64le
vmagent-386-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-386
package-vmagent:
APP_NAME=vmagent $(MAKE) package-via-docker
package-vmagent-pure:
APP_NAME=vmagent $(MAKE) package-via-docker-pure
package-vmagent-amd64:
APP_NAME=vmagent $(MAKE) package-via-docker-amd64
package-vmagent-arm:
APP_NAME=vmagent $(MAKE) package-via-docker-arm
package-vmagent-arm64:
APP_NAME=vmagent $(MAKE) package-via-docker-arm64
package-vmagent-ppc64le:
APP_NAME=vmagent $(MAKE) package-via-docker-ppc64le
package-vmagent-386:
APP_NAME=vmagent $(MAKE) package-via-docker-386
publish-vmagent:
APP_NAME=vmagent $(MAKE) publish-via-docker
run-vmagent:
mkdir -p vmagent-data
DOCKER_OPTS='-v $(shell pwd)/vmagent-data:/vmagent-data' \
APP_NAME=vmagent \
$(MAKE) run-via-docker
vmagent-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-amd64 ./app/vmagent
vmagent-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-arm ./app/vmagent
vmagent-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-arm64 ./app/vmagent
vmagent-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-ppc64le ./app/vmagent
vmagent-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-386 ./app/vmagent
vmagent-pure:
APP_NAME=vmagent $(MAKE) app-local-pure

226
app/vmagent/README.md Normal file

@@ -0,0 +1,226 @@
## vmagent
`vmagent` is a tiny but brave agent that helps you collect metrics from various sources
and store them in [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics)
or any other Prometheus-compatible storage system that supports the `remote_write` protocol.
<img alt="vmagent" src="vmagent.png">
### Motivation
While VictoriaMetrics provides an efficient solution to store and observe metrics, our users needed something fast
and RAM-friendly to scrape metrics from Prometheus-compatible exporters into VictoriaMetrics.
We also found that our users' infrastructure is like snowflakes: no two setups are alike. So we decided to add more flexibility
to `vmagent` (like the ability to push metrics instead of pulling them). We did our best and plan to do even more.
### Features
* Can be used as a drop-in replacement for Prometheus for scraping targets such as [node_exporter](https://github.com/prometheus/node_exporter).
See [Quick Start](#quick-start) for details.
* Can add, remove and modify labels (aka tags) via Prometheus relabeling. Can filter data before sending it to remote storage. See [these docs](#relabeling) for details.
* Accepts data via all the ingestion protocols supported by VictoriaMetrics:
* Influx line protocol via `http://<vmagent>:8429/write`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf).
* Graphite plaintext protocol if `-graphiteListenAddr` command-line flag is set. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-graphite-compatible-agents-such-as-statsd).
* OpenTSDB telnet and http protocols if `-opentsdbListenAddr` command-line flag is set. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-opentsdb-compatible-agents).
* Prometheus remote write protocol via `http://<vmagent>:8429/api/v1/write`.
* JSON lines import protocol via `http://<vmagent>:8429/api/v1/import`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-time-series-data).
* Arbitrary CSV data via `http://<vmagent>:8429/api/v1/import/csv`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-csv-data).
* Can replicate collected metrics simultaneously to multiple remote storage systems.
* Works in environments with unstable connections to remote storage. If the remote storage is unavailable, the collected metrics
are buffered at `-remoteWrite.tmpDataPath`. The buffered metrics are sent to remote storage as soon as connection
to remote storage is recovered. The maximum disk usage for the buffer can be limited with `-remoteWrite.maxDiskUsagePerURL`.
* Uses less RAM, CPU, disk IO and network bandwidth than Prometheus.
### Quick Start
Download the `vmutils-*` archive from the [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases), unpack it
and pass the following flags to the `vmagent` binary in order to start scraping Prometheus targets:
* `-promscrape.config` with the path to the Prometheus config file (it is usually located at `/etc/prometheus/prometheus.yml`)
* `-remoteWrite.url` with the remote storage endpoint such as VictoriaMetrics. Multiple `-remoteWrite.url` args can be set
in order to replicate data concurrently to multiple remote storage systems.
Example command line:
```
/path/to/vmagent -promscrape.config=/path/to/prometheus.yml -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
```
If you only need to collect Influx data, then the following command line is enough:
```
/path/to/vmagent -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
```
Then send Influx data to `http://vmagent-host:8429`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) for more details.
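
As a rough sketch of what sending Influx data could look like from a Go program: the snippet below formats one point in Influx line protocol and posts it to the `/write` endpoint mentioned in the feature list. `buildInfluxLine` and `sendInflux` are hypothetical helpers written for this example, `vmagent-host:8429` is the placeholder address from the text above, and escaping of special characters in tags is deliberately left out.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
	"strings"
)

// buildInfluxLine formats a single point in Influx line protocol:
//   measurement,tag1=v1,tag2=v2 field=value timestamp
// Tags are sorted by key so the output is deterministic.
// Escaping of spaces, commas and equals signs is omitted in this sketch.
func buildInfluxLine(measurement string, tags map[string]string, field string, value float64, tsNanos int64) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var sb strings.Builder
	sb.WriteString(measurement)
	for _, k := range keys {
		fmt.Fprintf(&sb, ",%s=%s", k, tags[k])
	}
	fmt.Fprintf(&sb, " %s=%g %d", field, value, tsNanos)
	return sb.String()
}

// sendInflux posts newline-delimited lines to the /write endpoint at baseURL.
// baseURL is an assumption, e.g. "http://vmagent-host:8429".
func sendInflux(baseURL string, lines []string) error {
	body := strings.NewReader(strings.Join(lines, "\n") + "\n")
	resp, err := http.Post(baseURL+"/write", "text/plain", body)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode/100 != 2 {
		return fmt.Errorf("unexpected status code %d", resp.StatusCode)
	}
	return nil
}

func main() {
	line := buildInfluxLine("cpu", map[string]string{"host": "edge-1"}, "usage", 12.5, 1587000000000000000)
	fmt.Println(line)
	// To actually send the point to a running vmagent, call:
	//   err := sendInflux("http://vmagent-host:8429", []string{line})
}
```

The network call is left commented out in `main` so the example runs without a live `vmagent` instance.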
`vmagent` is also available in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/).
Pass `-help` to `vmagent` in order to see the full list of supported command-line flags with their descriptions.
### Use cases
#### IoT and Edge monitoring
`vmagent` can run and collect metrics in IoT and industrial networks with unreliable or scheduled connections to the remote storage.
It buffers the collected data in local files until the connection to remote storage becomes available and then sends the buffered
data to the remote storage. It re-tries sending the data to remote storage on any errors.
The maximum buffer size can be limited with `-remoteWrite.maxDiskUsagePerURL`.
`vmagent` works on various architectures from the IoT world: 32-bit arm, 64-bit arm, ppc64le, 386 and amd64.
See [the corresponding Makefile rules](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/Makefile) for details.
#### Drop-in replacement for Prometheus
If you use Prometheus only for scraping metrics from various targets and forwarding these metrics to remote storage,
then `vmagent` can replace such a Prometheus setup. Usually `vmagent` requires less RAM, CPU and network bandwidth than Prometheus for such a setup.
See [these docs](#how-to-collect-metrics-in-prometheus-format) for details.
#### Replication and high availability
`vmagent` replicates the collected metrics among multiple remote storage instances configured via `-remoteWrite.url` args.
If a single remote storage instance temporarily goes out of service, then the collected data remains available in the other remote storage instances.
`vmagent` buffers the collected data in files at `-remoteWrite.tmpDataPath` until the remote storage becomes available again.
Then it sends the buffered data to the remote storage in order to prevent data gaps in the remote storage.
#### Relabeling and filtering
`vmagent` can add, remove or update labels on the collected data before sending it to remote storage. Additionally,
it can remove unneeded samples via Prometheus-like relabeling before sending the collected data to remote storage.
See [these docs](#relabeling) for details.
#### Splitting data streams among multiple systems
`vmagent` supports splitting the collected data among multiple destinations with the help of `-remoteWrite.urlRelabelConfig`,
which is applied independently for each configured `-remoteWrite.url` destination. For instance, it is possible to replicate or split
data among long-term remote storage, short-term remote storage and real-time analytical system [built on top of Kafka](https://github.com/Telefonica/prometheus-kafka-adapter).
Note that each destination can receive its own subset of the collected data thanks to per-destination relabeling via `-remoteWrite.urlRelabelConfig`.
#### Prometheus remote_write proxy
`vmagent` may be used as a proxy for Prometheus data sent via Prometheus `remote_write` protocol. It can accept data via `remote_write` API
at the `/api/v1/write` endpoint, apply relabeling and filtering and then proxy it to other `remote_write` systems.
`vmagent` can be configured to encrypt incoming `remote_write` requests with `-tls*` command-line flags.
Additionally, Basic Auth can be enabled for the incoming `remote_write` requests with `-httpAuth.*` command-line flags.
### How to collect metrics in Prometheus format
Pass the path to `prometheus.yml` to `-promscrape.config` command-line flag. `vmagent` takes into account the following
sections from [Prometheus config file](https://prometheus.io/docs/prometheus/latest/configuration/configuration/):
* `global`
* `scrape_configs`
All the other sections are ignored, including [remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write) section.
Use `-remoteWrite.*` command-line flags instead for configuring remote write settings.
The following scrape types in [scrape_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) section are supported:
* `static_configs` - for scraping statically defined targets. See [these docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#static_config) for details.
* `file_sd_configs` - for scraping targets defined in external files, aka file-based service discovery.
See [these docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config) for details.
* `kubernetes_sd_configs` - for scraping targets in Kubernetes (k8s).
See [kubernetes_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config) for details.
The following service discovery mechanisms will be added to `vmagent` soon:
* [ec2_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config)
* [gce_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config)
* [consul_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config)
* [dns_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config)
File feature requests at [our issue tracker](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need other service discovery mechanisms to be supported by `vmagent`.
### Adding labels to metrics
Labels can be added to metrics via the following mechanisms:
* Via `global -> external_labels` section in `-promscrape.config` file. These labels are added only to metrics scraped from targets configured in `-promscrape.config` file.
* Via `-remoteWrite.label` command-line flag. These labels are added to all the collected metrics before sending them to `-remoteWrite.url`.
### Relabeling
`vmagent` supports [Prometheus relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config).
Additionally it provides the following extra actions:
* `replace_all`: replaces all the occurrences of `regex` in the values of `source_labels` with the `replacement` and stores the result in the `target_label`.
* `labelmap_all`: replaces all the occurrences of `regex` in all the label names with the `replacement`.
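The two extra actions can be illustrated with a minimal Go sketch. It is not `vmagent`'s implementation: `replaceAll`, `labelMapAll` and the `;` separator are hypothetical names and values chosen for illustration, and unanchored regex matching is assumed.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// replaceAll is a hypothetical helper mimicking the described `replace_all` action:
// it joins the source label values with the separator, replaces every match of
// pattern with replacement and returns the value to store in the target label.
func replaceAll(sourceValues []string, separator, pattern, replacement string) string {
	re := regexp.MustCompile(pattern)
	joined := strings.Join(sourceValues, separator)
	return re.ReplaceAllString(joined, replacement)
}

// labelMapAll is a hypothetical helper mimicking the described `labelmap_all` action:
// it replaces every match of pattern in every label name and keeps the values as-is.
func labelMapAll(labels map[string]string, pattern, replacement string) map[string]string {
	re := regexp.MustCompile(pattern)
	out := make(map[string]string, len(labels))
	for name, value := range labels {
		out[re.ReplaceAllString(name, replacement)] = value
	}
	return out
}

func main() {
	// Every "-" in the source value is replaced, not only the first one.
	fmt.Println(replaceAll([]string{"foo-bar-baz"}, ";", `-`, "_"))

	// Every "." in every label name is replaced.
	fmt.Println(labelMapAll(map[string]string{"k8s.pod.name": "vmagent-0"}, `\.`, "_"))
}
```

The difference from the standard `replace` and `labelmap` actions, as described above, is that all the matches are rewritten rather than the value as a whole.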
The relabeling can be defined in the following places:
* At `scrape_config -> relabel_configs` section in `-promscrape.config` file. This relabeling is applied to targets when parsing the file during `vmagent` startup
or during config reload after sending `SIGHUP` signal to `vmagent` via `kill -HUP`.
* At `scrape_config -> metric_relabel_configs` section in `-promscrape.config` file. This relabeling is applied to metrics after each scrape for the configured targets.
* At `-remoteWrite.relabelConfig` file. This relabeling is applied to all the collected metrics before sending them to remote storage.
* At `-remoteWrite.urlRelabelConfig` files. This relabeling is applied to metrics before sending them to the corresponding `-remoteWrite.url`.
Read more about relabeling in the following articles:
* [Life of a label](https://www.robustperception.io/life-of-a-label)
* [Discarding targets and timeseries with relabeling](https://www.robustperception.io/relabelling-can-discard-targets-timeseries-and-alerts)
* [Dropping labels at scrape time](https://www.robustperception.io/dropping-metrics-at-scrape-time-with-prometheus)
* [Extracting labels from legacy metric names](https://www.robustperception.io/extracting-labels-from-legacy-metric-names)
* [relabel_configs vs metric_relabel_configs](https://www.robustperception.io/relabel_configs-vs-metric_relabel_configs)
### Monitoring
`vmagent` exports various metrics in Prometheus exposition format at `http://vmagent-host:8429/metrics` page. It is recommended to set up regular scraping of this page,
either via `vmagent` itself or via Prometheus, so that the exported metrics can be analyzed later.
`vmagent` also exports target statuses at `http://vmagent-host:8429/targets` page in plaintext format. This page also exports information on improperly configured scrape configs.
### Troubleshooting
* It is recommended to increase the maximum number of open files in the system (`ulimit -n`) when scraping a large number of targets,
since `vmagent` establishes at least one TCP connection per target.
* When `vmagent` scrapes many unreliable targets, it can flood the error log with scrape errors. These errors can be suppressed
by passing `-promscrape.suppressScrapeErrors` command-line flag to `vmagent`. The most recent scrape error for each target can be observed at `http://vmagent-host:8429/targets`.
* It is recommended to increase `-remoteWrite.queues` if `vmagent` collects more than 100K samples per second
and the `vmagent_remotewrite_pending_data_bytes` metric exported by `vmagent` at `/metrics` page grows constantly.
* `vmagent` buffers scraped data in the `-remoteWrite.tmpDataPath` directory until it is sent to `-remoteWrite.url`.
The directory can grow big when remote storage is unavailable for extended periods of time and `-remoteWrite.maxDiskUsagePerURL` isn't set.
If you don't want to send all the data from the directory to remote storage, just stop `vmagent` and delete the directory.
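One way to check whether `vmagent_remotewrite_pending_data_bytes` keeps growing is to sample the `/metrics` page periodically and compare the totals. The sketch below shows only the parsing step on an in-memory string. `sumMetric` is a hypothetical helper, the `url` label in the sample data is made up, and the parser assumes that lines carry no trailing timestamp.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sample imitates a fragment of a /metrics page. The label names and values are illustrative.
const sample = `# TYPE vmagent_remotewrite_pending_data_bytes gauge
vmagent_remotewrite_pending_data_bytes{url="1"} 1024
vmagent_remotewrite_pending_data_bytes{url="2"} 2048
vmagent_remotewrite_pending_data_bytes_other 5
`

// sumMetric returns the sum of all the samples with the given metric name
// found in Prometheus text exposition data.
// It assumes the value is the last whitespace-separated field on the line.
func sumMetric(exposition, name string) (float64, error) {
	var sum float64
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, name) {
			// Comments, empty lines and other metrics are skipped here.
			continue
		}
		rest := line[len(name):]
		// The name must be followed by '{' or a space; otherwise it is a prefix of a longer name.
		if rest == "" || (rest[0] != '{' && rest[0] != ' ') {
			continue
		}
		fields := strings.Fields(rest)
		v, err := strconv.ParseFloat(fields[len(fields)-1], 64)
		if err != nil {
			return 0, fmt.Errorf("cannot parse value in line %q: %w", line, err)
		}
		sum += v
	}
	return sum, sc.Err()
}

func main() {
	total, err := sumMetric(sample, "vmagent_remotewrite_pending_data_bytes")
	if err != nil {
		panic(err)
	}
	fmt.Println(total)
}
```

Comparing the totals from two samples taken a few minutes apart shows whether the pending data grows.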
### How to build from sources
It is recommended to use [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - `vmagent` is located in `vmutils-*` archives there.
#### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.13.
2. Run `make vmagent` from the root folder of the repository.
It builds `vmagent` binary and puts it into the `bin` folder.
#### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmagent-prod` from the root folder of the repository.
It builds `vmagent-prod` binary and puts it into the `bin` folder.
#### Building docker images
Run `make package-vmagent`. It builds the `victoriametrics/vmagent:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is an auto-generated image tag, which depends on the source code in the repository.
The `<PKG_TAG>` may be set manually via `PKG_TAG=foobar make package-vmagent`.


@@ -0,0 +1,70 @@
package common
import (
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
// PushCtx is a context used for populating WriteRequest.
type PushCtx struct {
WriteRequest prompbmarshal.WriteRequest
// Labels contains flat list of all the labels used in WriteRequest.
Labels []prompbmarshal.Label
// Samples contains flat list of all the samples used in WriteRequest.
Samples []prompbmarshal.Sample
}
// Reset resets ctx.
func (ctx *PushCtx) Reset() {
tss := ctx.WriteRequest.Timeseries
for i := range tss {
ts := &tss[i]
ts.Labels = nil
ts.Samples = nil
}
ctx.WriteRequest.Timeseries = ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels
for i := range labels {
label := &labels[i]
label.Name = ""
label.Value = ""
}
ctx.Labels = ctx.Labels[:0]
ctx.Samples = ctx.Samples[:0]
}
// GetPushCtx returns PushCtx from pool.
//
// Call PutPushCtx when the ctx is no longer needed.
func GetPushCtx() *PushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*PushCtx)
}
return &PushCtx{}
}
}
// PutPushCtx returns ctx to the pool.
//
// ctx mustn't be used after returning to the pool.
func PutPushCtx(ctx *PushCtx) {
ctx.Reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *PushCtx, runtime.GOMAXPROCS(-1))
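The `GetPushCtx` / `PutPushCtx` pair above combines a buffered channel with a `sync.Pool`. A minimal, self-contained sketch of the same two-level pooling pattern follows. The `buffer` type and the `getBuffer` / `putBuffer` names are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// buffer is a hypothetical reusable object standing in for PushCtx.
type buffer struct {
	data []byte
}

var (
	// bufPool is the fallback level. Objects stored in a sync.Pool may be dropped at any time.
	bufPool sync.Pool

	// bufPoolCh is the first level. It keeps up to GOMAXPROCS objects until they are taken.
	bufPoolCh = make(chan *buffer, runtime.GOMAXPROCS(-1))
)

// getBuffer takes an object from the channel, then from sync.Pool, and allocates only as a last resort.
func getBuffer() *buffer {
	select {
	case b := <-bufPoolCh:
		return b
	default:
		if v := bufPool.Get(); v != nil {
			return v.(*buffer)
		}
		return &buffer{}
	}
}

// putBuffer resets the object and returns it to the channel, or to sync.Pool if the channel is full.
func putBuffer(b *buffer) {
	b.data = b.data[:0]
	select {
	case bufPoolCh <- b:
	default:
		bufPool.Put(b)
	}
}

func main() {
	b := getBuffer()
	b.data = append(b.data, "hello"...)
	putBuffer(b)

	// The same object comes back with zero length and its capacity preserved.
	again := getBuffer()
	fmt.Println(again == b, len(again.data), cap(again.data) >= 5)
}
```

A plausible reason for the channel level is that it retains a small, bounded set of objects across garbage collections, while `sync.Pool` absorbs bursts above that bound.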


@@ -0,0 +1,63 @@
package csvimport
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="csvimport"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="csvimport"}`)
)
// InsertHandler processes csv data from req.
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: r.Metric,
})
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
samples = append(samples, prompbmarshal.Sample{
Value: r.Value,
Timestamp: r.Timestamp,
})
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[len(samples)-1:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil
}
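The `insertRows` function above stores all the labels in one flat slice and gives every time series a window into it via `labels[labelsLen:]`. The following sketch isolates that layout. The `label` and `series` types and the `buildSeries` function are hypothetical simplifications, not the `prompbmarshal` types.

```go
package main

import "fmt"

// label and series are simplified, hypothetical stand-ins for the protobuf types.
type label struct{ name, value string }

type series struct{ labels []label }

// buildSeries appends the labels of every row to the flat slice and lets each
// series reference its own window of that slice.
// The returned flat slice can be truncated to [:0] and reused by the next call.
func buildSeries(rows [][]label, flat []label) ([]series, []label) {
	out := make([]series, 0, len(rows))
	for _, row := range rows {
		start := len(flat)
		flat = append(flat, row...)
		// If append reallocates later, earlier windows still point to the old
		// backing array, which already holds their labels, so they stay valid.
		out = append(out, series{labels: flat[start:]})
	}
	return out, flat
}

func main() {
	rows := [][]label{
		{{"__name__", "cpu"}, {"host", "a"}},
		{{"__name__", "mem"}},
	}
	tss, flat := buildSeries(rows, nil)
	fmt.Println(len(flat), len(tss[0].labels), len(tss[1].labels))

	// Reusing the backing array avoids new allocations on the next batch.
	tss, flat = buildSeries(rows[:1], flat[:0])
	fmt.Println(tss[0].labels[0].value, len(flat))
}
```

The benefit of this layout is that a batch needs a few large slices instead of one small slice per series, and those slices can be kept between batches.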


@@ -0,0 +1,8 @@
ARG base_image
FROM $base_image
EXPOSE 8429
ENTRYPOINT ["/vmagent-prod"]
ARG src_binary
COPY $src_binary ./vmagent-prod


@@ -0,0 +1,65 @@
package graphite
import (
"io"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="graphite"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="graphite"}`)
)
// InsertHandler processes remote write for graphite plaintext protocol.
//
// See https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
func InsertHandler(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: r.Metric,
})
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
samples = append(samples, prompbmarshal.Sample{
Value: r.Value,
Timestamp: r.Timestamp,
})
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[len(samples)-1:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil
}


@@ -0,0 +1,167 @@
package influx
import (
"flag"
"io"
"net/http"
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for '{measurement}{separator}{field_name}' metric name when inserted via Influx line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metric name if Influx line contains only a single field")
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="influx"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="influx"}`)
)
// InsertHandlerForReader processes remote write for influx line protocol.
//
// See https://github.com/influxdata/telegraf/tree/master/plugins/inputs/socket_listener/
func InsertHandlerForReader(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, false, "", "", insertRows)
})
}
// InsertHandlerForHTTP processes remote write for influx line protocol.
//
// See https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/tsdb/README.md
func InsertHandlerForHTTP(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
q := req.URL.Query()
precision := q.Get("precision")
// Read db tag from https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint
db := q.Get("db")
return parser.ParseStream(req.Body, isGzipped, precision, db, insertRows)
})
}
func insertRows(db string, rows []parser.Row) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.ctx.WriteRequest.Timeseries[:0]
labels := ctx.ctx.Labels[:0]
samples := ctx.ctx.Samples[:0]
commonLabels := ctx.commonLabels[:0]
buf := ctx.buf[:0]
for i := range rows {
r := &rows[i]
commonLabels = commonLabels[:0]
hasDBLabel := false
for j := range r.Tags {
tag := &r.Tags[j]
if tag.Key == "db" {
hasDBLabel = true
}
commonLabels = append(commonLabels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
if len(db) > 0 && !hasDBLabel {
commonLabels = append(commonLabels, prompbmarshal.Label{
Name: "db",
Value: db,
})
}
ctx.metricGroupBuf = append(ctx.metricGroupBuf[:0], r.Measurement...)
skipFieldKey := len(r.Fields) == 1 && *skipSingleField
if len(ctx.metricGroupBuf) > 0 && !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, *measurementFieldSeparator...)
}
for j := range r.Fields {
f := &r.Fields[j]
bufLen := len(buf)
buf = append(buf, ctx.metricGroupBuf...)
if !skipFieldKey {
buf = append(buf, f.Key...)
}
metricGroup := bytesutil.ToUnsafeString(buf[bufLen:])
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: metricGroup,
})
labels = append(labels, commonLabels...)
samples = append(samples, prompbmarshal.Sample{
Timestamp: r.Timestamp,
Value: f.Value,
})
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[len(samples)-1:],
})
}
rowsTotal += len(r.Fields)
}
ctx.buf = buf
ctx.ctx.WriteRequest.Timeseries = tssDst
ctx.ctx.Labels = labels
ctx.ctx.Samples = samples
ctx.commonLabels = commonLabels
remotewrite.Push(&ctx.ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return nil
}
type pushCtx struct {
ctx common.PushCtx
commonLabels []prompbmarshal.Label
metricGroupBuf []byte
buf []byte
}
func (ctx *pushCtx) reset() {
ctx.ctx.Reset()
commonLabels := ctx.commonLabels
for i := range commonLabels {
label := &commonLabels[i]
label.Name = ""
label.Value = ""
}
ctx.metricGroupBuf = ctx.metricGroupBuf[:0]
ctx.buf = ctx.buf[:0]
}
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))
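The metric name construction in the Influx `insertRows` above can be restated as a small pure function. This is a sketch that mirrors the logic around `metricGroupBuf` and `skipFieldKey`; the `metricName` function and its parameters are hypothetical and replace the `-influxMeasurementFieldSeparator` and `-influxSkipSingleField` flags with plain arguments.

```go
package main

import "fmt"

// metricName builds the metric name for a single field of an Influx line.
// fieldsCount is the number of fields on the line the field belongs to.
func metricName(measurement, field, separator string, fieldsCount int, skipSingleField bool) string {
	// The field name is dropped only for single-field lines when skipping is enabled.
	skipFieldKey := fieldsCount == 1 && skipSingleField
	name := measurement
	// The separator is added only if there is a measurement to separate from.
	if len(name) > 0 && !skipFieldKey {
		name += separator
	}
	if !skipFieldKey {
		name += field
	}
	return name
}

func main() {
	fmt.Println(metricName("cpu", "usage", "_", 2, false))
	fmt.Println(metricName("cpu", "usage", "_", 1, true))
	fmt.Println(metricName("", "usage", "_", 1, false))
}
```

With the default separator, a line with measurement `cpu` and field `usage` yields `cpu_usage`. With single-field skipping enabled, the same single-field line yields `cpu`. An empty measurement yields just the field name.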

180
app/vmagent/main.go Normal file

@@ -0,0 +1,180 @@
package main
import (
"flag"
"fmt"
"net/http"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
influxserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/influx"
opentsdbserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdb"
opentsdbhttpserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
httpListenAddr = flag.String("httpListenAddr", ":8429", "TCP address to listen for http connections. "+
"Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. "+
"Note that /targets and /metrics pages aren't available if -httpListenAddr=''")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for Influx line protocol data. Usually :8189 must be set. Doesn't work if empty")
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpenTSDB metrics. "+
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
"Usually :4242 must be set. Doesn't work if empty")
opentsdbHTTPListenAddr = flag.String("opentsdbHTTPListenAddr", "", "TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty")
)
var (
influxServer *influxserver.Server
graphiteServer *graphiteserver.Server
opentsdbServer *opentsdbserver.Server
opentsdbhttpServer *opentsdbhttpserver.Server
)
func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
logger.Infof("starting vmagent at %q...", *httpListenAddr)
startTime := time.Now()
remotewrite.Init()
writeconcurrencylimiter.Init()
if len(*influxListenAddr) > 0 {
influxServer = influxserver.MustStart(*influxListenAddr, influx.InsertHandlerForReader)
}
if len(*graphiteListenAddr) > 0 {
graphiteServer = graphiteserver.MustStart(*graphiteListenAddr, graphite.InsertHandler)
}
if len(*opentsdbListenAddr) > 0 {
opentsdbServer = opentsdbserver.MustStart(*opentsdbListenAddr, opentsdb.InsertHandler, opentsdbhttp.InsertHandler)
}
if len(*opentsdbHTTPListenAddr) > 0 {
opentsdbhttpServer = opentsdbhttpserver.MustStart(*opentsdbHTTPListenAddr, opentsdbhttp.InsertHandler)
}
promscrape.Init(remotewrite.Push)
if len(*httpListenAddr) > 0 {
go httpserver.Serve(*httpListenAddr, requestHandler)
}
logger.Infof("started vmagent in %.3f seconds", time.Since(startTime).Seconds())
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
startTime = time.Now()
if len(*httpListenAddr) > 0 {
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
}
promscrape.Stop()
if len(*influxListenAddr) > 0 {
influxServer.MustStop()
}
if len(*graphiteListenAddr) > 0 {
graphiteServer.MustStop()
}
if len(*opentsdbListenAddr) > 0 {
opentsdbServer.MustStop()
}
if len(*opentsdbHTTPListenAddr) > 0 {
opentsdbhttpServer.MustStop()
}
remotewrite.Stop()
logger.Infof("successfully stopped vmagent in %.3f seconds", time.Since(startTime).Seconds())
}
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
path := strings.Replace(r.URL.Path, "//", "/", -1)
switch path {
case "/api/v1/write":
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/csv":
csvimportRequests.Inc()
if err := csvimport.InsertHandler(r); err != nil {
csvimportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/write", "/api/v2/write":
influxWriteRequests.Inc()
if err := influx.InsertHandlerForHTTP(r); err != nil {
influxWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/query":
// Emulate fake response for influx query.
// This is required for TSBS benchmark.
influxQueryRequests.Inc()
fmt.Fprintf(w, `{"results":[{"series":[{"values":[]}]}]}`)
return true
case "/targets":
promscrapeTargetsRequests.Inc()
w.Header().Set("Content-Type", "text/plain")
promscrape.WriteHumanReadableTargetsStatus(w)
return true
}
return false
}
var (
prometheusWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/write", protocol="promremotewrite"}`)
prometheusWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/api/v1/write", protocol="promremotewrite"}`)
vmimportRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/import", protocol="vmimport"}`)
vmimportErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/api/v1/import", protocol="vmimport"}`)
csvimportRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/import/csv", protocol="csvimport"}`)
csvimportErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/api/v1/import/csv", protocol="csvimport"}`)
influxWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/write", protocol="influx"}`)
influxWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/write", protocol="influx"}`)
influxQueryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/query", protocol="influx"}`)
promscrapeTargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/targets"}`)
)


@@ -0,0 +1,65 @@
package opentsdb
import (
"io"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="opentsdb"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="opentsdb"}`)
)
// InsertHandler processes remote write for OpenTSDB put protocol.
//
// See http://opentsdb.net/docs/build/html/api_telnet/put.html
func InsertHandler(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: r.Metric,
})
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
samples = append(samples, prompbmarshal.Sample{
Value: r.Value,
Timestamp: r.Timestamp,
})
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[len(samples)-1:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil
}


@@ -0,0 +1,64 @@
package opentsdbhttp
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="opentsdbhttp"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="opentsdbhttp"}`)
)
// InsertHandler processes HTTP OpenTSDB put requests.
// See http://opentsdb.net/docs/build/html/api_http/put.html
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: r.Metric,
})
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
samples = append(samples, prompbmarshal.Sample{
Value: r.Value,
Timestamp: r.Timestamp,
})
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[len(samples)-1:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil
}


@@ -0,0 +1,67 @@
package promremotewrite
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="promremotewrite"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="promremotewrite"}`)
)
// InsertHandler processes remote write for prometheus.
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(timeseries []prompb.TimeSeries) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range timeseries {
ts := &timeseries[i]
labelsLen := len(labels)
for i := range ts.Labels {
label := &ts.Labels[i]
labels = append(labels, prompbmarshal.Label{
Name: bytesutil.ToUnsafeString(label.Name),
Value: bytesutil.ToUnsafeString(label.Value),
})
}
samplesLen := len(samples)
for i := range ts.Samples {
sample := &ts.Samples[i]
samples = append(samples, prompbmarshal.Sample{
Value: sample.Value,
Timestamp: sample.Timestamp,
})
}
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
rowsTotal += len(ts.Samples)
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return nil
}


@@ -0,0 +1,269 @@
package remotewrite
import (
"crypto/tls"
"crypto/x509"
"encoding/base64"
"flag"
"fmt"
"io/ioutil"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fasthttp"
)
var (
sendTimeout = flag.Duration("remoteWrite.sendTimeout", time.Minute, "Timeout for sending a single block of data to -remoteWrite.url")
tlsInsecureSkipVerify = flag.Bool("remoteWrite.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -remoteWrite.url")
tlsCertFile = flag.String("remoteWrite.tlsCertFile", "", "Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url")
tlsKeyFile = flag.String("remoteWrite.tlsKeyFile", "", "Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url")
tlsCAFile = flag.String("remoteWrite.tlsCAFile", "", "Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. "+
"By default system CA is used")
basicAuthUsername = flag.String("remoteWrite.basicAuth.username", "", "Optional basic auth username to use for -remoteWrite.url")
basicAuthPassword = flag.String("remoteWrite.basicAuth.password", "", "Optional basic auth password to use for -remoteWrite.url")
bearerToken = flag.String("remoteWrite.bearerToken", "", "Optional bearer auth token to use for -remoteWrite.url")
)
type client struct {
urlLabelValue string
remoteWriteURL string
host string
requestURI string
authHeader string
fq *persistentqueue.FastQueue
hc *fasthttp.HostClient
requestDuration *metrics.Histogram
requestsOKCount *metrics.Counter
errorsCount *metrics.Counter
retriesCount *metrics.Counter
wg sync.WaitGroup
stopCh chan struct{}
}
func newClient(remoteWriteURL, urlLabelValue string, fq *persistentqueue.FastQueue, concurrency int) *client {
authHeader := ""
if len(*basicAuthUsername) > 0 || len(*basicAuthPassword) > 0 {
// See https://en.wikipedia.org/wiki/Basic_access_authentication
token := *basicAuthUsername + ":" + *basicAuthPassword
token64 := base64.StdEncoding.EncodeToString([]byte(token))
authHeader = "Basic " + token64
}
if len(*bearerToken) > 0 {
if authHeader != "" {
logger.Panicf("FATAL: `-remoteWrite.bearerToken`=%q cannot be set when `-remoteWrite.basicAuth.*` flags are set", *bearerToken)
}
authHeader = "Bearer " + *bearerToken
}
readTimeout := *sendTimeout
if readTimeout <= 0 {
readTimeout = time.Minute
}
writeTimeout := readTimeout
var u fasthttp.URI
u.Update(remoteWriteURL)
scheme := string(u.Scheme())
switch scheme {
case "http", "https":
default:
logger.Panicf("FATAL: unsupported scheme in -remoteWrite.url=%q: %q. It must be http or https", remoteWriteURL, scheme)
}
host := string(u.Host())
if len(host) == 0 {
logger.Panicf("FATAL: invalid -remoteWrite.url=%q: host cannot be empty. Make sure the url looks like `http://host:port/path`", remoteWriteURL)
}
requestURI := string(u.RequestURI())
isTLS := scheme == "https"
var tlsCfg *tls.Config
if isTLS {
var err error
tlsCfg, err = getTLSConfig()
if err != nil {
logger.Panicf("FATAL: cannot initialize TLS config: %s", err)
}
}
if !strings.Contains(host, ":") {
if isTLS {
host += ":443"
} else {
host += ":80"
}
}
maxConns := 2 * concurrency
hc := &fasthttp.HostClient{
Addr: host,
Name: "vmagent",
Dial: statDial,
DialDualStack: netutil.TCP6Enabled(),
IsTLS: isTLS,
TLSConfig: tlsCfg,
MaxConns: maxConns,
MaxIdleConnDuration: 10 * readTimeout,
ReadTimeout: readTimeout,
WriteTimeout: writeTimeout,
MaxResponseBodySize: 1024 * 1024,
}
c := &client{
urlLabelValue: urlLabelValue,
remoteWriteURL: remoteWriteURL,
host: host,
requestURI: requestURI,
authHeader: authHeader,
fq: fq,
hc: hc,
stopCh: make(chan struct{}),
}
c.requestDuration = metrics.GetOrCreateHistogram(fmt.Sprintf(`vmagent_remotewrite_duration_seconds{url=%q}`, c.urlLabelValue))
c.requestsOKCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="2XX"}`, c.urlLabelValue))
c.errorsCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_errors_total{url=%q}`, c.urlLabelValue))
c.retriesCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_retries_count_total{url=%q}`, c.urlLabelValue))
for i := 0; i < concurrency; i++ {
c.wg.Add(1)
go func() {
defer c.wg.Done()
c.runWorker()
}()
}
logger.Infof("initialized client for -remoteWrite.url=%q", c.remoteWriteURL)
return c
}
func (c *client) MustStop() {
close(c.stopCh)
c.wg.Wait()
logger.Infof("stopped client for -remoteWrite.url=%q", c.remoteWriteURL)
}
func getTLSConfig() (*tls.Config, error) {
var tlsRootCA *x509.CertPool
var tlsCertificate *tls.Certificate
if *tlsCertFile != "" || *tlsKeyFile != "" {
cert, err := tls.LoadX509KeyPair(*tlsCertFile, *tlsKeyFile)
if err != nil {
return nil, fmt.Errorf("cannot load TLS certificate for -remoteWrite.tlsCertFile=%q and -remoteWrite.tlsKeyFile=%q: %s", *tlsCertFile, *tlsKeyFile, err)
}
tlsCertificate = &cert
}
if *tlsCAFile != "" {
data, err := ioutil.ReadFile(*tlsCAFile)
if err != nil {
return nil, fmt.Errorf("cannot read -remoteWrite.tlsCAFile=%q: %s", *tlsCAFile, err)
}
tlsRootCA = x509.NewCertPool()
if !tlsRootCA.AppendCertsFromPEM(data) {
return nil, fmt.Errorf("cannot parse data from -remoteWrite.tlsCAFile=%q", *tlsCAFile)
}
}
tlsCfg := &tls.Config{
RootCAs: tlsRootCA,
ClientSessionCache: tls.NewLRUClientSessionCache(0),
}
if tlsCertificate != nil {
tlsCfg.Certificates = []tls.Certificate{*tlsCertificate}
}
tlsCfg.InsecureSkipVerify = *tlsInsecureSkipVerify
return tlsCfg, nil
}
func (c *client) runWorker() {
var ok bool
var block []byte
ch := make(chan struct{})
for {
block, ok = c.fq.MustReadBlock(block[:0])
if !ok {
return
}
go func() {
c.sendBlock(block)
ch <- struct{}{}
}()
select {
case <-ch:
// The block has been sent successfully
continue
case <-c.stopCh:
// c must be stopped. Wait for a while in the hope the block will be sent.
graceDuration := 5 * time.Second
select {
case <-ch:
// The block has been sent successfully.
case <-time.After(graceDuration):
logger.Errorf("couldn't send block with size %d bytes to %q in %.3f seconds during shutdown; dropping it",
len(block), c.remoteWriteURL, graceDuration.Seconds())
}
return
}
}
}
func (c *client) sendBlock(block []byte) {
req := fasthttp.AcquireRequest()
req.SetRequestURI(c.requestURI)
req.SetHost(c.host)
req.Header.SetMethod("POST")
req.Header.Add("Content-Type", "application/x-protobuf")
req.Header.Add("Content-Encoding", "snappy")
req.Header.Add("X-Prometheus-Remote-Write-Version", "0.1.0")
if c.authHeader != "" {
req.Header.Set("Authorization", c.authHeader)
}
req.SetBody(block)
retryDuration := time.Second
resp := fasthttp.AcquireResponse()
again:
select {
case <-c.stopCh:
fasthttp.ReleaseRequest(req)
fasthttp.ReleaseResponse(resp)
return
default:
}
startTime := time.Now()
// There is no need to call DoTimeout, since the timeout is set in c.hc.ReadTimeout.
err := c.hc.Do(req, resp)
c.requestDuration.UpdateDuration(startTime)
if err != nil {
c.errorsCount.Inc()
retryDuration *= 2
if retryDuration > time.Minute {
retryDuration = time.Minute
}
logger.Errorf("couldn't send a block with size %d bytes to %q: %s; re-sending the block in %.3f seconds",
len(block), c.remoteWriteURL, err, retryDuration.Seconds())
time.Sleep(retryDuration)
c.retriesCount.Inc()
goto again
}
statusCode := resp.StatusCode()
if statusCode/100 != 2 {
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="%d"}`, c.urlLabelValue, statusCode)).Inc()
retryDuration *= 2
if retryDuration > time.Minute {
retryDuration = time.Minute
}
logger.Errorf("unexpected status code received after sending a block with size %d bytes to %q: %d; response body=%q; re-sending the block in %.3f seconds",
len(block), c.remoteWriteURL, statusCode, resp.Body(), retryDuration.Seconds())
time.Sleep(retryDuration)
c.retriesCount.Inc()
goto again
}
c.requestsOKCount.Inc()
// The block has been successfully sent to the remote storage.
fasthttp.ReleaseResponse(resp)
fasthttp.ReleaseRequest(req)
}

View File

@@ -0,0 +1,199 @@
package remotewrite
import (
"flag"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
"github.com/golang/snappy"
)
var (
flushInterval = flag.Duration("remoteWrite.flushInterval", time.Second, "Interval for flushing the data to remote storage. "+
"Higher value reduces network bandwidth usage at the cost of delayed push of scraped data to remote storage")
maxUnpackedBlockSize = flag.Int("remoteWrite.maxBlockSize", 32*1024*1024, "The maximum size in bytes of unpacked request to send to remote storage. "+
"It shouldn't exceed -maxInsertRequestSize from VictoriaMetrics")
)
// the maximum number of rows to send per each block.
const maxRowsPerBlock = 10000
type pendingSeries struct {
mu sync.Mutex
wr writeRequest
stopCh chan struct{}
periodicFlusherWG sync.WaitGroup
}
func newPendingSeries(pushBlock func(block []byte)) *pendingSeries {
var ps pendingSeries
ps.wr.pushBlock = pushBlock
ps.stopCh = make(chan struct{})
ps.periodicFlusherWG.Add(1)
go func() {
defer ps.periodicFlusherWG.Done()
ps.periodicFlusher()
}()
return &ps
}
func (ps *pendingSeries) MustStop() {
close(ps.stopCh)
ps.periodicFlusherWG.Wait()
}
func (ps *pendingSeries) Push(tss []prompbmarshal.TimeSeries) {
ps.mu.Lock()
ps.wr.push(tss)
ps.mu.Unlock()
}
func (ps *pendingSeries) periodicFlusher() {
ticker := time.NewTicker(*flushInterval)
defer ticker.Stop()
mustStop := false
for !mustStop {
select {
case <-ps.stopCh:
mustStop = true
case <-ticker.C:
if time.Since(ps.wr.lastFlushTime) < *flushInterval/2 {
continue
}
}
ps.mu.Lock()
ps.wr.flush()
ps.mu.Unlock()
}
}
type writeRequest struct {
wr prompbmarshal.WriteRequest
pushBlock func(block []byte)
lastFlushTime time.Time
tss []prompbmarshal.TimeSeries
labels []prompbmarshal.Label
samples []prompbmarshal.Sample
buf []byte
}
func (wr *writeRequest) reset() {
wr.wr.Timeseries = nil
for i := range wr.tss {
ts := &wr.tss[i]
ts.Labels = nil
ts.Samples = nil
}
wr.tss = wr.tss[:0]
for i := range wr.labels {
label := &wr.labels[i]
label.Name = ""
label.Value = ""
}
wr.labels = wr.labels[:0]
wr.samples = wr.samples[:0]
wr.buf = wr.buf[:0]
}
func (wr *writeRequest) flush() {
wr.wr.Timeseries = wr.tss
wr.lastFlushTime = time.Now()
pushWriteRequest(&wr.wr, wr.pushBlock)
wr.reset()
}
func (wr *writeRequest) push(src []prompbmarshal.TimeSeries) {
tssDst := wr.tss
for i := range src {
tssDst = append(tssDst, prompbmarshal.TimeSeries{})
dst := &tssDst[len(tssDst)-1]
wr.copyTimeSeries(dst, &src[i])
if len(tssDst) >= maxRowsPerBlock {
// Publish the accumulated series before flushing, since flush() reads wr.tss.
wr.tss = tssDst
wr.flush()
tssDst = wr.tss
}
}
wr.tss = tssDst
}
func (wr *writeRequest) copyTimeSeries(dst, src *prompbmarshal.TimeSeries) {
labelsDst := wr.labels
labelsLen := len(wr.labels)
samplesDst := wr.samples
buf := wr.buf
for i := range src.Labels {
labelsDst = append(labelsDst, prompbmarshal.Label{})
dstLabel := &labelsDst[len(labelsDst)-1]
srcLabel := &src.Labels[i]
buf = append(buf, srcLabel.Name...)
dstLabel.Name = bytesutil.ToUnsafeString(buf[len(buf)-len(srcLabel.Name):])
buf = append(buf, srcLabel.Value...)
dstLabel.Value = bytesutil.ToUnsafeString(buf[len(buf)-len(srcLabel.Value):])
}
dst.Labels = labelsDst[labelsLen:]
samplesDst = append(samplesDst, prompbmarshal.Sample{})
dstSample := &samplesDst[len(samplesDst)-1]
if len(src.Samples) != 1 {
logger.Panicf("BUG: unexpected number of samples in time series; got %d; want 1", len(src.Samples))
}
*dstSample = src.Samples[0]
dst.Samples = samplesDst[len(samplesDst)-1:]
wr.samples = samplesDst
wr.labels = labelsDst
wr.buf = buf
}
func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byte)) {
if len(wr.Timeseries) == 0 {
// Nothing to push
return
}
bb := writeRequestBufPool.Get()
bb.B = prompbmarshal.MarshalWriteRequest(bb.B[:0], wr)
if len(bb.B) <= *maxUnpackedBlockSize {
zb := snappyBufPool.Get()
zb.B = snappy.Encode(zb.B[:cap(zb.B)], bb.B)
writeRequestBufPool.Put(bb)
if len(zb.B) <= persistentqueue.MaxBlockSize {
pushBlock(zb.B)
blockSizeRows.Update(float64(len(wr.Timeseries)))
blockSizeBytes.Update(float64(len(zb.B)))
snappyBufPool.Put(zb)
return
}
snappyBufPool.Put(zb)
} else {
writeRequestBufPool.Put(bb)
}
// Too big block. Recursively split it into smaller parts.
timeseries := wr.Timeseries
n := len(timeseries) / 2
wr.Timeseries = timeseries[:n]
pushWriteRequest(wr, pushBlock)
wr.Timeseries = timeseries[n:]
pushWriteRequest(wr, pushBlock)
wr.Timeseries = timeseries
}
var (
blockSizeBytes = metrics.NewHistogram(`vmagent_remotewrite_block_size_bytes`)
blockSizeRows = metrics.NewHistogram(`vmagent_remotewrite_block_size_rows`)
)
var writeRequestBufPool bytesutil.ByteBufferPool
var snappyBufPool bytesutil.ByteBufferPool

View File

@@ -0,0 +1,113 @@
package remotewrite
import (
"flag"
"strings"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
)
var (
unparsedLabelsGlobal = flagutil.NewArray("remoteWrite.label", "Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. "+
"Pass multiple -remoteWrite.label flags in order to add multiple labels to metrics before sending them to remote storage")
relabelConfigPathGlobal = flag.String("remoteWrite.relabelConfig", "", "Optional path to file with relabel_config entries. These entries are applied to all the metrics "+
"before sending them to -remoteWrite.url. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config for details")
)
var labelsGlobal []prompbmarshal.Label
var prcsGlobal []promrelabel.ParsedRelabelConfig
// initRelabelGlobal must be called after parsing command-line flags.
func initRelabelGlobal() {
// Init labelsGlobal
labelsGlobal = nil
for _, s := range *unparsedLabelsGlobal {
n := strings.IndexByte(s, '=')
if n < 0 {
logger.Panicf("FATAL: missing '=' in `-remoteWrite.label`. It must contain label in the form `name=value`; got %q", s)
}
labelsGlobal = append(labelsGlobal, prompbmarshal.Label{
Name: s[:n],
Value: s[n+1:],
})
}
// Init prcsGlobal
prcsGlobal = nil
if len(*relabelConfigPathGlobal) > 0 {
var err error
prcsGlobal, err = promrelabel.LoadRelabelConfigs(*relabelConfigPathGlobal)
if err != nil {
logger.Panicf("FATAL: cannot load relabel configs from -remoteWrite.relabelConfig=%q: %s", *relabelConfigPathGlobal, err)
}
}
}
func (rctx *relabelCtx) applyRelabeling(tss []prompbmarshal.TimeSeries, extraLabels []prompbmarshal.Label, prcs []promrelabel.ParsedRelabelConfig) []prompbmarshal.TimeSeries {
if len(extraLabels) == 0 && len(prcs) == 0 {
// Nothing to change.
return tss
}
tssDst := tss[:0]
labels := rctx.labels[:0]
for i := range tss {
ts := &tss[i]
labelsLen := len(labels)
labels = append(labels, ts.Labels...)
// extraLabels must be added before applying relabeling according to https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write
for j := range extraLabels {
extraLabel := &extraLabels[j]
tmp := promrelabel.GetLabelByName(labels[labelsLen:], extraLabel.Name)
if tmp != nil {
tmp.Value = extraLabel.Value
} else {
labels = append(labels, *extraLabel)
}
}
labels = promrelabel.ApplyRelabelConfigs(labels, labelsLen, prcs, true)
if len(labels) == labelsLen {
// Drop the current time series, since relabeling removed all the labels.
continue
}
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: ts.Samples,
})
}
rctx.labels = labels
return tssDst
}
type relabelCtx struct {
// pool for labels, which are used during the relabeling.
labels []prompbmarshal.Label
}
func (rctx *relabelCtx) reset() {
labels := rctx.labels
for i := range labels {
label := &labels[i]
label.Name = ""
label.Value = ""
}
rctx.labels = rctx.labels[:0]
}
var relabelCtxPool = &sync.Pool{
New: func() interface{} {
return &relabelCtx{}
},
}
func getRelabelCtx() *relabelCtx {
return relabelCtxPool.Get().(*relabelCtx)
}
func putRelabelCtx(rctx *relabelCtx) {
rctx.labels = rctx.labels[:0]
relabelCtxPool.Put(rctx)
}

View File

@@ -0,0 +1,195 @@
package remotewrite
import (
"flag"
"fmt"
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/metrics"
xxhash "github.com/cespare/xxhash/v2"
)
var (
remoteWriteURLs = flagutil.NewArray("remoteWrite.url", "Remote storage URL to write data to. It must support Prometheus remote_write API. "+
"It is recommended using VictoriaMetrics as remote storage. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
"Pass multiple -remoteWrite.url flags in order to write data concurrently to multiple remote storage systems")
relabelConfigPaths = flagutil.NewArray("remoteWrite.urlRelabelConfig", "Optional path to relabel config for the corresponding -remoteWrite.url")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored")
queues = flag.Int("remoteWrite.queues", 1, "The number of concurrent queues to each -remoteWrite.url. Set more queues if a single queue "+
"isn't enough for sending high volume of collected data to remote storage")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
"It is hidden by default, since it can contain sensitive auth info")
maxPendingBytesPerURL = flag.Int("remoteWrite.maxDiskUsagePerURL", 0, "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+
"for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. "+
"Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500000000. "+
"Disk usage is unlimited if the value is set to 0")
)
var rwctxs []*remoteWriteCtx
// Init initializes remotewrite.
//
// It must be called after flag.Parse().
//
// Stop must be called for graceful shutdown.
func Init() {
if len(*remoteWriteURLs) == 0 {
logger.Panicf("FATAL: at least one `-remoteWrite.url` must be set")
}
if !*showRemoteWriteURL {
// remoteWrite.url can contain authentication codes, so hide it at `/metrics` output.
httpserver.RegisterSecretFlag("remoteWrite.url")
}
initRelabelGlobal()
maxInmemoryBlocks := memory.Allowed() / len(*remoteWriteURLs) / maxRowsPerBlock / 100
if maxInmemoryBlocks > 200 {
// There is not much sense in keeping a higher number of blocks in memory,
// since this means that the producer outperforms the consumer and the queue
// will continue growing. It is better to store the queue on disk.
maxInmemoryBlocks = 200
}
if maxInmemoryBlocks < 2 {
maxInmemoryBlocks = 2
}
for i, remoteWriteURL := range *remoteWriteURLs {
relabelConfigPath := ""
if i < len(*relabelConfigPaths) {
relabelConfigPath = (*relabelConfigPaths)[i]
}
urlLabelValue := fmt.Sprintf("secret-url-%d", i+1)
if *showRemoteWriteURL {
urlLabelValue = remoteWriteURL
}
rwctx := newRemoteWriteCtx(remoteWriteURL, relabelConfigPath, maxInmemoryBlocks, urlLabelValue)
rwctxs = append(rwctxs, rwctx)
}
}
// Stop stops remotewrite.
//
// It is expected that nobody calls Push during and after the call to this func.
func Stop() {
for _, rwctx := range rwctxs {
rwctx.MustStop()
}
rwctxs = nil
}
// Push sends wr to remote storage systems set via `-remoteWrite.url`.
//
// Each timeseries in wr.Timeseries must contain one sample.
func Push(wr *prompbmarshal.WriteRequest) {
var rctx *relabelCtx
if len(prcsGlobal) > 0 || len(labelsGlobal) > 0 {
rctx = getRelabelCtx()
}
tss := wr.Timeseries
for len(tss) > 0 {
// Process big tss in smaller blocks in order to reduce maximum memory usage
tssBlock := tss
if len(tssBlock) > maxRowsPerBlock {
tssBlock = tss[:maxRowsPerBlock]
tss = tss[maxRowsPerBlock:]
} else {
tss = nil
}
if rctx != nil {
tssBlockLen := len(tssBlock)
tssBlock = rctx.applyRelabeling(tssBlock, labelsGlobal, prcsGlobal)
globalRelabelMetricsDropped.Add(tssBlockLen - len(tssBlock))
}
for _, rwctx := range rwctxs {
rwctx.Push(tssBlock)
}
if rctx != nil {
rctx.reset()
}
}
if rctx != nil {
putRelabelCtx(rctx)
}
}
var globalRelabelMetricsDropped = metrics.NewCounter("vmagent_remotewrite_global_relabel_metrics_dropped_total")
type remoteWriteCtx struct {
fq *persistentqueue.FastQueue
c *client
prcs []promrelabel.ParsedRelabelConfig
pss []*pendingSeries
pssNextIdx uint64
relabelMetricsDropped *metrics.Counter
}
func newRemoteWriteCtx(remoteWriteURL, relabelConfigPath string, maxInmemoryBlocks int, urlLabelValue string) *remoteWriteCtx {
h := xxhash.Sum64([]byte(remoteWriteURL))
path := fmt.Sprintf("%s/persistent-queue/%016X", *tmpDataPath, h)
fq := persistentqueue.MustOpenFastQueue(path, remoteWriteURL, maxInmemoryBlocks, *maxPendingBytesPerURL)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, path, urlLabelValue), func() float64 {
return float64(fq.GetPendingBytes())
})
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_inmemory_blocks{path=%q, url=%q}`, path, urlLabelValue), func() float64 {
return float64(fq.GetInmemoryQueueLen())
})
c := newClient(remoteWriteURL, urlLabelValue, fq, *queues)
var prcs []promrelabel.ParsedRelabelConfig
if len(relabelConfigPath) > 0 {
var err error
prcs, err = promrelabel.LoadRelabelConfigs(relabelConfigPath)
if err != nil {
logger.Panicf("FATAL: cannot load relabel configs from -remoteWrite.urlRelabelConfig=%q: %s", relabelConfigPath, err)
}
}
pss := make([]*pendingSeries, *queues)
for i := range pss {
pss[i] = newPendingSeries(fq.MustWriteBlock)
}
return &remoteWriteCtx{
fq: fq,
c: c,
prcs: prcs,
pss: pss,
relabelMetricsDropped: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_relabel_metrics_dropped_total{path=%q, url=%q}`, path, urlLabelValue)),
}
}
func (rwctx *remoteWriteCtx) MustStop() {
for _, ps := range rwctx.pss {
ps.MustStop()
}
rwctx.pss = nil
rwctx.fq.MustClose()
rwctx.fq = nil
rwctx.prcs = nil
rwctx.c.MustStop()
rwctx.c = nil
rwctx.relabelMetricsDropped = nil
}
func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
var rctx *relabelCtx
if len(rwctx.prcs) > 0 {
rctx = getRelabelCtx()
tssLen := len(tss)
tss = rctx.applyRelabeling(tss, nil, rwctx.prcs)
rwctx.relabelMetricsDropped.Add(tssLen - len(tss))
}
pss := rwctx.pss
idx := atomic.AddUint64(&rwctx.pssNextIdx, 1) % uint64(len(pss))
pss[idx].Push(tss)
if rctx != nil {
putRelabelCtx(rctx)
}
}

View File

@@ -0,0 +1,71 @@
package remotewrite
import (
"net"
"sync/atomic"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fasthttp"
)
func statDial(addr string) (net.Conn, error) {
conn, err := fasthttp.Dial(addr)
dialsTotal.Inc()
if err != nil {
dialErrors.Inc()
return nil, err
}
conns.Inc()
sc := &statConn{
Conn: conn,
}
return sc, nil
}
var (
dialsTotal = metrics.NewCounter(`vmagent_remotewrite_dials_total`)
dialErrors = metrics.NewCounter(`vmagent_remotewrite_dial_errors_total`)
conns = metrics.NewCounter(`vmagent_remotewrite_conns`)
)
type statConn struct {
closed uint64
net.Conn
}
func (sc *statConn) Read(p []byte) (int, error) {
n, err := sc.Conn.Read(p)
connReadsTotal.Inc()
if err != nil {
connReadErrors.Inc()
}
connBytesRead.Add(n)
return n, err
}
func (sc *statConn) Write(p []byte) (int, error) {
n, err := sc.Conn.Write(p)
connWritesTotal.Inc()
if err != nil {
connWriteErrors.Inc()
}
connBytesWritten.Add(n)
return n, err
}
func (sc *statConn) Close() error {
err := sc.Conn.Close()
if atomic.AddUint64(&sc.closed, 1) == 1 {
conns.Dec()
}
return err
}
var (
connReadsTotal = metrics.NewCounter(`vmagent_remotewrite_conn_reads_total`)
connWritesTotal = metrics.NewCounter(`vmagent_remotewrite_conn_writes_total`)
connReadErrors = metrics.NewCounter(`vmagent_remotewrite_conn_read_errors_total`)
connWriteErrors = metrics.NewCounter(`vmagent_remotewrite_conn_write_errors_total`)
connBytesRead = metrics.NewCounter(`vmagent_remotewrite_conn_bytes_read_total`)
connBytesWritten = metrics.NewCounter(`vmagent_remotewrite_conn_bytes_written_total`)
)

BIN
app/vmagent/vmagent.png Normal file

Binary file not shown.


View File

@@ -0,0 +1,70 @@
package vmimport
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="vmimport"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="vmimport"}`)
)
// InsertHandler processes `/api/v1/import` request.
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: bytesutil.ToUnsafeString(tag.Key),
Value: bytesutil.ToUnsafeString(tag.Value),
})
}
values := r.Values
timestamps := r.Timestamps
_ = timestamps[len(values)-1]
samplesLen := len(samples)
for j, value := range values {
samples = append(samples, prompbmarshal.Sample{
Value: value,
Timestamp: timestamps[j],
})
}
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
rowsTotal += len(values)
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

78
app/vmalert/Makefile Normal file
View File

@@ -0,0 +1,78 @@
# All these commands must run from repository root.
vmalert:
APP_NAME=vmalert $(MAKE) app-local
vmalert-race:
APP_NAME=vmalert RACE=-race $(MAKE) app-local
vmalert-prod:
APP_NAME=vmalert $(MAKE) app-via-docker
vmalert-pure-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-pure
vmalert-amd64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-amd64
vmalert-arm-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-arm
vmalert-arm64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-arm64
vmalert-ppc64le-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-ppc64le
vmalert-386-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-386
package-vmalert:
APP_NAME=vmalert $(MAKE) package-via-docker
package-vmalert-pure:
APP_NAME=vmalert $(MAKE) package-via-docker-pure
package-vmalert-amd64:
APP_NAME=vmalert $(MAKE) package-via-docker-amd64
package-vmalert-arm:
APP_NAME=vmalert $(MAKE) package-via-docker-arm
package-vmalert-arm64:
APP_NAME=vmalert $(MAKE) package-via-docker-arm64
package-vmalert-ppc64le:
APP_NAME=vmalert $(MAKE) package-via-docker-ppc64le
package-vmalert-386:
APP_NAME=vmalert $(MAKE) package-via-docker-386
publish-vmalert:
APP_NAME=vmalert $(MAKE) publish-via-docker
test-vmalert:
go test -race -cover ./app/vmalert
run-vmalert: vmalert
./bin/vmalert -rule=app/vmalert/testdata/rules0-good.rules \
-datasource.url=http://localhost:8428 -notifier.url=http://localhost:9093 \
-evaluationInterval=3s
vmalert-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmalert-amd64 ./app/vmalert
vmalert-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmalert-arm ./app/vmalert
vmalert-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmalert-arm64 ./app/vmalert
vmalert-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmalert-ppc64le ./app/vmalert
vmalert-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmalert-386 ./app/vmalert
vmalert-pure:
APP_NAME=vmalert $(MAKE) app-local-pure

91
app/vmalert/README.md Normal file
View File

@@ -0,0 +1,91 @@
## VM Alert
`vmalert` executes a list of given MetricsQL expressions (rules) and
sends alerts to [Alert Manager](https://github.com/prometheus/alertmanager).
NOTE: `vmalert` is in early alpha and hasn't been tested in production systems yet.
### Features:
* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
* VictoriaMetrics [MetricsQL](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/MetricsQL)
expressions validation;
* Prometheus [alerting rules definition format](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#defining-alerting-rules)
support;
* Integration with [Alertmanager](https://github.com/prometheus/alertmanager);
* Lightweight without extra dependencies.
### TODO:
* Persist alerts state as timeseries in TSDB. Currently, alerts state is stored
in process memory only and will be lost on restart;
* Configuration hot reload.
### QuickStart
To build `vmalert` from sources:
```
git clone https://github.com/VictoriaMetrics/VictoriaMetrics
cd VictoriaMetrics
make vmalert
```
The built binary will be placed in the `VictoriaMetrics/bin` folder.
To start using `vmalert` you will need the following things:
* list of alert rules - PromQL/MetricsQL expressions to execute;
* datasource address - reachable VictoriaMetrics instance for rules execution;
* notifier address - reachable Alertmanager instance for processing,
aggregating alerts and sending notifications.
Then configure `vmalert` accordingly:
```
./bin/vmalert -rule=alert.rules \
-datasource.url=http://localhost:8428 \
-notifier.url=http://localhost:9093
```
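`vmalert` reads rule results from the address given in `-datasource.url`. As a rough illustration of what such a result looks like on the wire, the sketch below parses a Prometheus-style instant-query (`/api/v1/query`) response of type `vector`. The names `sample`, `queryResponse` and `parseVector`, as well as the example payload, are hypothetical and chosen for this example; this is not `vmalert`'s actual implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// sample is a hypothetical, simplified view of a single query result.
type sample struct {
	Labels    map[string]string
	Timestamp int64
	Value     float64
}

// queryResponse mirrors the Prometheus instant-query JSON envelope.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string `json:"resultType"`
		Result     []struct {
			Metric map[string]string `json:"metric"`
			Value  [2]interface{}    `json:"value"`
		} `json:"result"`
	} `json:"data"`
}

// parseVector decodes a "vector" response. Type assertions are checked,
// so a malformed payload yields an error instead of a panic.
func parseVector(data []byte) ([]sample, error) {
	var r queryResponse
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, fmt.Errorf("cannot parse response: %w", err)
	}
	if r.Status != "success" || r.Data.ResultType != "vector" {
		return nil, fmt.Errorf("unexpected status %q or result type %q", r.Status, r.Data.ResultType)
	}
	samples := make([]sample, 0, len(r.Data.Result))
	for _, res := range r.Data.Result {
		ts, ok := res.Value[0].(float64)
		if !ok {
			return nil, fmt.Errorf("timestamp must be a number; got %T", res.Value[0])
		}
		s, ok := res.Value[1].(string)
		if !ok {
			return nil, fmt.Errorf("value must be a string; got %T", res.Value[1])
		}
		v, err := strconv.ParseFloat(s, 64)
		if err != nil {
			return nil, fmt.Errorf("cannot parse value %q: %w", s, err)
		}
		samples = append(samples, sample{Labels: res.Metric, Timestamp: int64(ts), Value: v})
	}
	return samples, nil
}

func main() {
	// Example payload made up for illustration.
	payload := []byte(`{"status":"success","data":{"resultType":"vector",` +
		`"result":[{"metric":{"__name__":"vm_rows"},"value":[1583786142,"13763"]}]}}`)
	samples, err := parseVector(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range samples {
		fmt.Println(s.Labels, s.Timestamp, s.Value)
	}
}
```

The sample value arrives as a JSON string and the timestamp as a JSON number, which is why the two elements are handled differently.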
An example `.rules` file may be found [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/testdata/rules0-good.rules).
`vmalert` runs evaluation for every group in a separate goroutine.
Rules within a group are evaluated sequentially, one by one.
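The evaluation model described above (one goroutine per group, rules inside a group processed in order) can be sketched roughly as follows. The `rule` and `group` types and the `evalGroups` function are simplified, hypothetical stand-ins and do not reflect the real `vmalert` types.

```go
package main

import (
	"fmt"
	"sync"
)

// rule and group are hypothetical, simplified stand-ins for vmalert's types.
type rule struct{ name string }

type group struct {
	name  string
	rules []rule
}

// evalGroups starts one goroutine per group. Rules inside a group are
// evaluated sequentially in their declared order. It returns the order
// in which rules were evaluated for each group.
func evalGroups(groups []group, eval func(g group, r rule)) map[string][]string {
	var mu sync.Mutex
	order := make(map[string][]string)
	var wg sync.WaitGroup
	for _, g := range groups {
		wg.Add(1)
		go func(g group) {
			defer wg.Done()
			for _, r := range g.rules {
				eval(g, r)
				// The map is shared between goroutines, so guard it.
				mu.Lock()
				order[g.name] = append(order[g.name], r.name)
				mu.Unlock()
			}
		}(g)
	}
	wg.Wait()
	return order
}

func main() {
	groups := []group{
		{name: "a", rules: []rule{{"a1"}, {"a2"}}},
		{name: "b", rules: []rule{{"b1"}}},
	}
	order := evalGroups(groups, func(g group, r rule) {
		// A real implementation would execute the rule expression here.
	})
	fmt.Println(order["a"], order["b"])
}
```

With this layout a slow rule delays only the remaining rules of its own group, while other groups keep running.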
`vmalert` also runs a web-server (`-httpListenAddr`) for serving metrics and alerts endpoints:
* `http://<vmalert-addr>/api/v1/alerts` - list of all active alerts;
* `http://<vmalert-addr>/api/v1/<groupName>/<alertID>/status` - get alert status by ID.
Used as alert source in AlertManager.
* `http://<vmalert-addr>/metrics` - application metrics.
### Configuration
The shortlist of configuration flags is the following:
```
Usage of vmalert:
-datasource.url string
Victoria Metrics or VMSelect url. Required parameter. e.g. http://127.0.0.1:8428
-datasource.basicAuth.password string
Optional basic auth password to use for -datasource.url
-datasource.basicAuth.username string
Optional basic auth username to use for -datasource.url
-evaluationInterval duration
How often to evaluate the rules. Default 1m (default 1m0s)
-external.url string
External URL is used as alert's source for sent alerts to the notifier
-notifier.url string
Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093
-rule value
Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule /path/to/file. Path to a single file with alerting rules
-rule dir/*.yaml -rule /*.yaml. Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.
-rule.validateAnnotations
Indicates to validate annotation templates (default true)
```
Pass `-help` to `vmalert` in order to see the full list of supported
command-line flags with their descriptions.
### Contributing
`vmalert` is mostly designed and built by the VictoriaMetrics community.
Feel free to share your experience and ideas for improving this
software. Please keep simplicity as the main priority.

67
app/vmalert/config.go Normal file
View File

@@ -0,0 +1,67 @@
package main
import (
"fmt"
"gopkg.in/yaml.v2"
"io/ioutil"
"path/filepath"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
)
// Parse parses rule configs from given file patterns
func Parse(pathPatterns []string, validateAnnotations bool) ([]Group, error) {
var fp []string
for _, pattern := range pathPatterns {
matches, err := filepath.Glob(pattern)
if err != nil {
return nil, fmt.Errorf("error reading file pattern %s:%v", pattern, err)
}
fp = append(fp, matches...)
}
var groups []Group
for _, file := range fp {
groupsNames := map[string]struct{}{}
gr, err := parseFile(file)
if err != nil {
return nil, fmt.Errorf("file %s: %w", file, err)
}
for _, group := range gr {
if _, ok := groupsNames[group.Name]; ok {
return nil, fmt.Errorf("one file can not contain groups with the same name %s, filepath:%s", group.Name, file)
}
groupsNames[group.Name] = struct{}{}
for _, rule := range group.Rules {
if err = rule.Validate(); err != nil {
return nil, fmt.Errorf("invalid rule filepath:%s, group %s:%w", file, group.Name, err)
}
// TODO: this init looks weird here
rule.alerts = make(map[uint64]*notifier.Alert)
if validateAnnotations {
if err = notifier.ValidateAnnotations(rule.Annotations); err != nil {
return nil, fmt.Errorf("invalid annotations filepath:%s, group %s:%w", file, group.Name, err)
}
}
rule.group = &group
}
}
groups = append(groups, gr...)
}
if len(groups) < 1 {
return nil, fmt.Errorf("no groups found in %s", strings.Join(pathPatterns, ";"))
}
return groups, nil
}
func parseFile(path string) ([]Group, error) {
data, err := ioutil.ReadFile(path)
if err != nil {
return nil, fmt.Errorf("error reading alert rule file: %w", err)
}
g := struct {
Groups []Group `yaml:"groups"`
}{}
err = yaml.Unmarshal(data, &g)
return g.Groups, err
}

View File

@@ -0,0 +1,36 @@
package main
import (
"net/url"
"os"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
)
func TestMain(m *testing.M) {
u, _ := url.Parse("https://victoriametrics.com/path")
notifier.InitTemplateFunc(u)
os.Exit(m.Run())
}
func TestParseGood(t *testing.T) {
if _, err := Parse([]string{"testdata/*good.rules", "testdata/dir/*good.*"}, true); err != nil {
t.Errorf("error parsing files %s", err)
}
}
func TestParseBad(t *testing.T) {
if _, err := Parse([]string{"testdata/rules0-bad.rules"}, true); err == nil {
t.Errorf("expected syntax error")
}
if _, err := Parse([]string{"testdata/dir/rules0-bad.rules"}, true); err == nil {
t.Errorf("expected template annotation error")
}
if _, err := Parse([]string{"testdata/dir/rules1-bad.rules"}, true); err == nil {
t.Errorf("expected same group error")
}
if _, err := Parse([]string{"testdata/*.yaml"}, true); err == nil {
t.Errorf("expected empty group")
}
}

View File

@@ -0,0 +1,24 @@
package datasource
import "context"
// Querier interface wraps Query method which
// executes given query and returns list of Metrics
// as result
type Querier interface {
Query(ctx context.Context, query string) ([]Metric, error)
}
// Metric is the basic entity which should be returned by the datasource.
// It represents a single data point with the full list of labels.
type Metric struct {
Labels []Label
Timestamp int64
Value float64
}
// Label represents metric's label
type Label struct {
Name string
Value string
}

View File

@@ -0,0 +1,103 @@
package datasource
import (
"context"
"encoding/json"
"fmt"
"io/ioutil"
"net/http"
"net/url"
"strconv"
"strings"
)
type response struct {
Status string `json:"status"`
Data struct {
ResultType string `json:"resultType"`
Result []struct {
Labels map[string]string `json:"metric"`
TV [2]interface{} `json:"value"`
} `json:"result"`
} `json:"data"`
ErrorType string `json:"errorType"`
Error string `json:"error"`
}
func (r response) metrics() ([]Metric, error) {
var ms []Metric
var m Metric
var f float64
var err error
for i, res := range r.Data.Result {
f, err = strconv.ParseFloat(res.TV[1].(string), 64)
if err != nil {
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %s", res, res.TV[1], err)
}
m.Labels = nil
for k, v := range r.Data.Result[i].Labels {
m.Labels = append(m.Labels, Label{Name: k, Value: v})
}
m.Timestamp = int64(res.TV[0].(float64))
m.Value = f
ms = append(ms, m)
}
return ms, nil
}
const queryPath = "/api/v1/query?query="
// VMStorage represents vmstorage entity with ability to read and write metrics
type VMStorage struct {
c *http.Client
queryURL string
basicAuthUser, basicAuthPass string
}
// NewVMStorage is a constructor for VMStorage
func NewVMStorage(baseURL, basicAuthUser, basicAuthPass string, c *http.Client) *VMStorage {
return &VMStorage{
c: c,
basicAuthUser: basicAuthUser,
basicAuthPass: basicAuthPass,
queryURL: strings.TrimSuffix(baseURL, "/") + queryPath,
}
}
// Query reads metrics from datasource by given query
func (s *VMStorage) Query(ctx context.Context, query string) ([]Metric, error) {
const (
statusSuccess, statusError, rtVector = "success", "error", "vector"
)
req, err := http.NewRequest("POST", s.queryURL+url.QueryEscape(query), nil)
if err != nil {
return nil, err
}
req.Header.Set("Content-Type", "application/json")
if s.basicAuthPass != "" {
req.SetBasicAuth(s.basicAuthUser, s.basicAuthPass)
}
resp, err := s.c.Do(req.WithContext(ctx))
if err != nil {
return nil, fmt.Errorf("error getting response from %s:%s", req.URL, err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, _ := ioutil.ReadAll(resp.Body)
return nil, fmt.Errorf("datasource returns unexpected response code %d for %s. Response body %s", resp.StatusCode, req.URL, body)
}
r := &response{}
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return nil, fmt.Errorf("error parsing metrics for %s:%s", req.URL, err)
}
if r.Status == statusError {
return nil, fmt.Errorf("response error, query: %s, errorType: %s, error: %s", req.URL, r.ErrorType, r.Error)
}
if r.Status != statusSuccess {
return nil, fmt.Errorf("unknown status:%s, Expected success or error ", r.Status)
}
if r.Data.ResultType != rtVector {
return nil, fmt.Errorf("unknown result type:%s. Expected vector", r.Data.ResultType)
}
return r.metrics()
}
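
The `metrics()` method above uses unchecked type assertions on the `[timestamp, "value"]` pair, so a malformed response would panic instead of returning an error. A minimal sketch of a checked alternative follows; `parseSample` is a hypothetical helper that is not part of this change, and it assumes the Prometheus-style instant-query encoding shown in the test below.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// parseSample is a hypothetical helper: it decodes a pair such as
// [1583786142,"13763"] and returns an error, rather than panicking,
// when either element has an unexpected JSON type.
func parseSample(raw []byte) (int64, float64, error) {
	var tv [2]interface{}
	if err := json.Unmarshal(raw, &tv); err != nil {
		return 0, 0, fmt.Errorf("cannot decode value pair: %w", err)
	}
	// encoding/json decodes JSON numbers into float64 for interface{} targets.
	ts, ok := tv[0].(float64)
	if !ok {
		return 0, 0, fmt.Errorf("unexpected timestamp type %T", tv[0])
	}
	s, ok := tv[1].(string)
	if !ok {
		return 0, 0, fmt.Errorf("unexpected value type %T", tv[1])
	}
	v, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, 0, fmt.Errorf("cannot parse value %q: %w", s, err)
	}
	return int64(ts), v, nil
}

func main() {
	ts, v, err := parseSample([]byte(`[1583786142,"13763"]`))
	fmt.Println(ts, v, err)

	// A numeric (non-string) value is reported as an error.
	_, _, err = parseSample([]byte(`[1583786142,13763]`))
	fmt.Println(err)
}
```

The same comma-ok checks could be applied inside the loop in `metrics()` if a panic on unexpected input is not acceptable.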

@@ -0,0 +1,93 @@
package datasource
import (
"context"
"net/http"
"net/http/httptest"
"testing"
)
var (
ctx = context.Background()
basicAuthName = "foo"
basicAuthPass = "bar"
query = "vm_rows"
)
func TestVMSelectQuery(t *testing.T) {
mux := http.NewServeMux()
mux.HandleFunc("/", func(_ http.ResponseWriter, _ *http.Request) {
t.Errorf("should not be called")
})
c := -1
mux.HandleFunc("/api/v1/query", func(w http.ResponseWriter, r *http.Request) {
c++
if r.Method != http.MethodPost {
t.Errorf("expected POST method got %s", r.Method)
}
if name, pass, _ := r.BasicAuth(); name != basicAuthName || pass != basicAuthPass {
t.Errorf("expected %s:%s as basic auth got %s:%s", basicAuthName, basicAuthPass, name, pass)
}
if r.URL.Query().Get("query") != query {
t.Errorf("expected %s in query param, got %s", query, r.URL.Query().Get("query"))
}
switch c {
case 0:
conn, _, _ := w.(http.Hijacker).Hijack()
_ = conn.Close()
case 1:
w.WriteHeader(500)
case 2:
w.Write([]byte("[]"))
case 3:
w.Write([]byte(`{"status":"error", "errorType":"type:", "error":"some error msg"}`))
case 4:
w.Write([]byte(`{"status":"unknown"}`))
case 5:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix"}}`))
case 6:
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows"},"value":[1583786142,"13763"]}]}}`))
}
})
srv := httptest.NewServer(mux)
defer srv.Close()
am := NewVMStorage(srv.URL, basicAuthName, basicAuthPass, srv.Client())
if _, err := am.Query(ctx, query); err == nil {
t.Fatalf("expected connection error got nil")
}
if _, err := am.Query(ctx, query); err == nil {
t.Fatalf("expected invalid response status error got nil")
}
if _, err := am.Query(ctx, query); err == nil {
t.Fatalf("expected response body error got nil")
}
if _, err := am.Query(ctx, query); err == nil {
t.Fatalf("expected error status got nil")
}
if _, err := am.Query(ctx, query); err == nil {
t.Fatalf("expected unknown status got nil")
}
if _, err := am.Query(ctx, query); err == nil {
t.Fatalf("expected non-vector resultType error got nil")
}
m, err := am.Query(ctx, query)
if err != nil {
t.Fatalf("unexpected %s", err)
}
if len(m) != 1 {
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
}
expected := Metric{
Labels: []Label{{Value: "vm_rows", Name: "__name__"}},
Timestamp: 1583786142,
Value: 13763,
}
if m[0].Timestamp != expected.Timestamp ||
m[0].Value != expected.Value ||
m[0].Labels[0].Value != expected.Labels[0].Value ||
m[0].Labels[0].Name != expected.Labels[0].Name {
t.Fatalf("unexpected metric %+v want %+v", m[0], expected)
}
}

app/vmalert/main.go Normal file

@@ -0,0 +1,165 @@
package main
import (
"context"
"flag"
"fmt"
"net/http"
"net/url"
"os"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/metrics"
)
var (
rulePath = flagutil.NewArray("rule", `Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule /path/to/file. Path to a single file with alerting rules
-rule dir/*.yaml -rule /*.yaml. Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.`)
validateAlertAnnotations = flag.Bool("rule.validateAnnotations", true, "Whether to validate annotation templates")
httpListenAddr = flag.String("httpListenAddr", ":8880", "Address to listen for http connections")
datasourceURL = flag.String("datasource.url", "", "Victoria Metrics or VMSelect url. Required parameter. e.g. http://127.0.0.1:8428")
basicAuthUsername = flag.String("datasource.basicAuth.username", "", "Optional basic auth username to use for -datasource.url")
basicAuthPassword = flag.String("datasource.basicAuth.password", "", "Optional basic auth password to use for -datasource.url")
evaluationInterval = flag.Duration("evaluationInterval", 1*time.Minute, "How often to evaluate the rules. Default 1m")
notifierURL = flag.String("notifier.url", "", "Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093")
externalURL = flag.String("external.url", "", "External URL used as the alert source in alerts sent to the notifier")
)
// TODO: hot configuration reload
// TODO: alerts state persistence
func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
checkFlags()
ctx, cancel := context.WithCancel(context.Background())
eu, err := getExternalURL(*externalURL, *httpListenAddr, httpserver.IsTLS())
if err != nil {
logger.Fatalf("cannot get external URL: %s", err)
}
notifier.InitTemplateFunc(eu)
logger.Infof("reading alert rules configuration file from %s", strings.Join(*rulePath, ";"))
groups, err := Parse(*rulePath, *validateAlertAnnotations)
if err != nil {
logger.Fatalf("Cannot parse configuration file: %s", err)
}
w := &watchdog{
storage: datasource.NewVMStorage(*datasourceURL, *basicAuthUsername, *basicAuthPassword, &http.Client{}),
alertProvider: notifier.NewAlertManager(*notifierURL, func(group, name string) string {
return fmt.Sprintf("%s/api/v1/%s/%s/status", eu, group, name)
}, &http.Client{}),
}
wg := sync.WaitGroup{}
for i := range groups {
wg.Add(1)
go func(group Group) {
w.run(ctx, group, *evaluationInterval)
wg.Done()
}(groups[i])
}
go httpserver.Serve(*httpListenAddr, (&requestHandler{groups: groups}).handler)
sig := procutil.WaitForSigterm()
logger.Infof("service received signal %s", sig)
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
cancel()
wg.Wait()
}
type watchdog struct {
storage *datasource.VMStorage
alertProvider notifier.Notifier
}
var (
iterationTotal = metrics.NewCounter(`vmalert_iteration_total`)
iterationDuration = metrics.NewSummary(`vmalert_iteration_duration_seconds`)
execTotal = metrics.NewCounter(`vmalert_execution_total`)
execErrors = metrics.NewCounter(`vmalert_execution_errors_total`)
execDuration = metrics.NewSummary(`vmalert_execution_duration_seconds`)
)
func (w *watchdog) run(ctx context.Context, group Group, evaluationInterval time.Duration) {
logger.Infof("watchdog for %s has been started", group.Name)
t := time.NewTicker(evaluationInterval)
defer t.Stop()
for {
select {
case <-t.C:
iterationTotal.Inc()
iterationStart := time.Now()
for _, rule := range group.Rules {
execTotal.Inc()
execStart := time.Now()
err := rule.Exec(ctx, w.storage)
execDuration.UpdateDuration(execStart)
if err != nil {
execErrors.Inc()
logger.Errorf("failed to execute rule %q.%q: %s", group.Name, rule.Name, err)
continue
}
if err := rule.Send(ctx, w.alertProvider); err != nil {
logger.Errorf("failed to send alert for rule %q.%q: %s", group.Name, rule.Name, err)
}
}
iterationDuration.UpdateDuration(iterationStart)
case <-ctx.Done():
logger.Infof("%s received stop signal", group.Name)
return
}
}
}
func getExternalURL(externalURL, httpListenAddr string, isSecure bool) (*url.URL, error) {
if externalURL != "" {
return url.Parse(externalURL)
}
hname, err := os.Hostname()
if err != nil {
return nil, err
}
port := ""
if ipport := strings.Split(httpListenAddr, ":"); len(ipport) > 1 {
port = ":" + ipport[1]
}
schema := "http://"
if isSecure {
schema = "https://"
}
return url.Parse(fmt.Sprintf("%s%s%s", schema, hname, port))
}
func checkFlags() {
if *notifierURL == "" {
flag.PrintDefaults()
logger.Fatalf("notifier.url is empty")
}
if *datasourceURL == "" {
flag.PrintDefaults()
logger.Fatalf("datasource.url is empty")
}
}
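
`getExternalURL` above derives the port by splitting `httpListenAddr` on `":"`, which gives the wrong element for an IPv6 listen address such as `[::1]:8880`. A hedged sketch of an alternative based on the standard `net.SplitHostPort` is shown here; `listenPort` is a hypothetical helper name, not something this change defines.

```go
package main

import (
	"fmt"
	"net"
)

// listenPort is a hypothetical helper: it returns ":<port>" for a
// listen address, or an empty string when the address has no port.
func listenPort(addr string) string {
	_, port, err := net.SplitHostPort(addr)
	if err != nil || port == "" {
		return ""
	}
	return ":" + port
}

func main() {
	for _, addr := range []string{":8880", "127.0.0.1:8880", "[::1]:8880", "localhost"} {
		fmt.Printf("%q -> %q\n", addr, listenPort(addr))
	}
}
```

The result could be used in place of the `port` variable when building the fallback URL from the hostname.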

@@ -0,0 +1,120 @@
package notifier
import (
"bytes"
"fmt"
"io"
"strings"
"text/template"
"time"
)
// Alert is the triggered alert
// TODO: Looks like alert name isn't unique
type Alert struct {
Group string
Name string
Labels map[string]string
Annotations map[string]string
State AlertState
Start time.Time
End time.Time
Value float64
ID uint64
}
// AlertState type indicates the Alert state
type AlertState int
const (
// StateInactive is the state of an alert that is neither firing nor pending.
StateInactive AlertState = iota
// StatePending is the state of an alert that has been active for less than
// the configured threshold duration.
StatePending
// StateFiring is the state of an alert that has been active for longer than
// the configured threshold duration.
StateFiring
)
// String implements fmt.Stringer for AlertState
func (as AlertState) String() string {
switch as {
case StateFiring:
return "firing"
case StatePending:
return "pending"
}
return "inactive"
}
type alertTplData struct {
Labels map[string]string
Value float64
}
const tplHeader = `{{ $value := .Value }}{{ $labels := .Labels }}`
// ExecTemplate executes the Alert template for the given
// map of annotations.
func (a *Alert) ExecTemplate(annotations map[string]string) (map[string]string, error) {
tplData := alertTplData{Value: a.Value, Labels: a.Labels}
return templateAnnotations(annotations, tplHeader, tplData)
}
// ValidateAnnotations validates annotations for possible template errors; it uses empty data for template population
func ValidateAnnotations(annotations map[string]string) error {
_, err := templateAnnotations(annotations, tplHeader, alertTplData{
Labels: map[string]string{},
Value: 0,
})
return err
}
func templateAnnotations(annotations map[string]string, header string, data alertTplData) (map[string]string, error) {
var builder strings.Builder
var buf bytes.Buffer
eg := errGroup{}
r := make(map[string]string, len(annotations))
for key, text := range annotations {
r[key] = text
buf.Reset()
builder.Reset()
builder.Grow(len(header) + len(text))
builder.WriteString(header)
builder.WriteString(text)
if err := templateAnnotation(&buf, builder.String(), data); err != nil {
eg.errs = append(eg.errs, fmt.Sprintf("key %s, template %s:%s", key, text, err))
continue
}
r[key] = buf.String()
}
return r, eg.err()
}
func templateAnnotation(dst io.Writer, text string, data alertTplData) error {
tpl, err := template.New("").Funcs(tmplFunc).Option("missingkey=zero").Parse(text)
if err != nil {
return fmt.Errorf("error parsing annotation:%w", err)
}
if err = tpl.Execute(dst, data); err != nil {
return fmt.Errorf("error evaluating annotation template:%w", err)
}
return nil
}
type errGroup struct {
errs []string
}
func (eg *errGroup) err() error {
if eg == nil || len(eg.errs) == 0 {
return nil
}
return eg
}
func (eg *errGroup) Error() string {
return fmt.Sprintf("errors:%s", strings.Join(eg.errs, "\n"))
}
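
The `tplHeader` constant above is what makes `$labels` and `$value` available inside annotation templates: it is prepended to every annotation text before parsing. The following stand-alone sketch illustrates that mechanism with only the standard library; `tplData`, `header`, and `render` are illustrative names, and the custom function map registered via `Funcs(tmplFunc)` is left out for brevity.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// tplData mirrors the shape of alertTplData for illustration only.
type tplData struct {
	Labels map[string]string
	Value  float64
}

// header declares the template variables that annotations may reference.
const header = `{{ $value := .Value }}{{ $labels := .Labels }}`

// render prepends the header to the annotation text and executes the result.
func render(text string, data tplData) (string, error) {
	tpl, err := template.New("").Option("missingkey=zero").Parse(header + text)
	if err != nil {
		return "", fmt.Errorf("error parsing annotation: %w", err)
	}
	var buf bytes.Buffer
	if err := tpl.Execute(&buf, data); err != nil {
		return "", fmt.Errorf("error evaluating annotation template: %w", err)
	}
	return buf.String(), nil
}

func main() {
	out, err := render(
		"It is {{ $value }} connections for {{$labels.instance}}",
		tplData{Value: 1e4, Labels: map[string]string{"instance": "localhost"}},
	)
	fmt.Println(out, err)
}
```

With `missingkey=zero`, a label that is absent from the map renders as an empty string instead of `<no value>`.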

@@ -0,0 +1,65 @@
package notifier
import (
"fmt"
"testing"
)
func TestAlert_ExecTemplate(t *testing.T) {
testCases := []struct {
alert *Alert
annotations map[string]string
expTpl map[string]string
}{
{
alert: &Alert{},
annotations: map[string]string{},
expTpl: map[string]string{},
},
{
alert: &Alert{
Value: 1e4,
Labels: map[string]string{
"instance": "localhost",
},
},
annotations: map[string]string{},
expTpl: map[string]string{},
},
{
alert: &Alert{
Value: 1e4,
Labels: map[string]string{
"job": "staging",
"instance": "localhost",
},
},
annotations: map[string]string{
"summary": "Too high connection number for {{$labels.instance}} for job {{$labels.job}}",
"description": "It is {{ $value }} connections for {{$labels.instance}}",
},
expTpl: map[string]string{
"summary": "Too high connection number for localhost for job staging",
"description": "It is 10000 connections for localhost",
},
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
tpl, err := tc.alert.ExecTemplate(tc.annotations)
if err != nil {
t.Fatal(err)
}
if len(tpl) != len(tc.expTpl) {
t.Fatalf("expected %d elements; got %d", len(tc.expTpl), len(tpl))
}
for k := range tc.expTpl {
got, exp := tpl[k], tc.expTpl[k]
if got != exp {
t.Fatalf("expected %q=%q; got %q=%q", k, exp, k, got)
}
}
})
}
}

@@ -0,0 +1,51 @@
package notifier
import (
"bytes"
"fmt"
"io/ioutil"
"net/http"
"strings"
)
// AlertManager represents integration provider with Prometheus alert manager
// https://github.com/prometheus/alertmanager
type AlertManager struct {
alertURL string
argFunc AlertURLGenerator
client *http.Client
}
// Send an alert or resolve message
func (am *AlertManager) Send(alerts []Alert) error {
b := &bytes.Buffer{}
writeamRequest(b, alerts, am.argFunc)
resp, err := am.client.Post(am.alertURL, "application/json", b)
if err != nil {
return err
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("failed to read response from %q: %s", am.alertURL, err)
}
return fmt.Errorf("invalid SC %d from %q; response body: %s", resp.StatusCode, am.alertURL, string(body))
}
return nil
}
// AlertURLGenerator returns URL to single alert by given name
type AlertURLGenerator func(group, id string) string
const alertManagerPath = "/api/v2/alerts"
// NewAlertManager is a constructor for AlertManager
func NewAlertManager(alertManagerURL string, fn AlertURLGenerator, c *http.Client) *AlertManager {
return &AlertManager{
alertURL: strings.TrimSuffix(alertManagerURL, "/") + alertManagerPath,
argFunc: fn,
client: c,
}
}

@@ -0,0 +1,34 @@
{% import (
"strconv"
"time"
) %}
{% stripspace %}
{% func amRequest(alerts []Alert, generatorURL func(string, string) string) %}
[
{% for i, alert := range alerts %}
{
"startsAt":{%q= alert.Start.Format(time.RFC3339Nano) %},
"generatorURL": {%q= generatorURL(alert.Group, strconv.FormatUint(alert.ID, 10)) %},
{% if !alert.End.IsZero() %}
"endsAt":{%q= alert.End.Format(time.RFC3339Nano) %},
{% endif %}
"labels": {
"alertname":{%q= alert.Name %}
{% for k,v := range alert.Labels %}
,{%q= k %}:{%q= v %}
{% endfor %}
},
"annotations": {
{% code c := len(alert.Annotations) %}
{% for k,v := range alert.Annotations %}
{% code c = c-1 %}
{%q= k %}:{%q= v %}{% if c > 0 %},{% endif %}
{% endfor %}
}
}
{% if i != len(alerts)-1 %},{% endif %}
{% endfor %}
]
{% endfunc %}
{% endstripspace %}

@@ -0,0 +1,131 @@
// Code generated by qtc from "alertmanager_request.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vmalert/notifier/alertmanager_request.qtpl:1
package notifier
//line app/vmalert/notifier/alertmanager_request.qtpl:1
import (
"strconv"
"time"
)
//line app/vmalert/notifier/alertmanager_request.qtpl:7
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmalert/notifier/alertmanager_request.qtpl:7
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmalert/notifier/alertmanager_request.qtpl:7
func streamamRequest(qw422016 *qt422016.Writer, alerts []Alert, generatorURL func(string, string) string) {
//line app/vmalert/notifier/alertmanager_request.qtpl:7
qw422016.N().S(`[`)
//line app/vmalert/notifier/alertmanager_request.qtpl:9
for i, alert := range alerts {
//line app/vmalert/notifier/alertmanager_request.qtpl:9
qw422016.N().S(`{"startsAt":`)
//line app/vmalert/notifier/alertmanager_request.qtpl:11
qw422016.N().Q(alert.Start.Format(time.RFC3339Nano))
//line app/vmalert/notifier/alertmanager_request.qtpl:11
qw422016.N().S(`,"generatorURL":`)
//line app/vmalert/notifier/alertmanager_request.qtpl:12
qw422016.N().Q(generatorURL(alert.Group, strconv.FormatUint(alert.ID, 10)))
//line app/vmalert/notifier/alertmanager_request.qtpl:12
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:13
if !alert.End.IsZero() {
//line app/vmalert/notifier/alertmanager_request.qtpl:13
qw422016.N().S(`"endsAt":`)
//line app/vmalert/notifier/alertmanager_request.qtpl:14
qw422016.N().Q(alert.End.Format(time.RFC3339Nano))
//line app/vmalert/notifier/alertmanager_request.qtpl:14
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:15
}
//line app/vmalert/notifier/alertmanager_request.qtpl:15
qw422016.N().S(`"labels": {"alertname":`)
//line app/vmalert/notifier/alertmanager_request.qtpl:17
qw422016.N().Q(alert.Name)
//line app/vmalert/notifier/alertmanager_request.qtpl:18
for k, v := range alert.Labels {
//line app/vmalert/notifier/alertmanager_request.qtpl:18
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:19
qw422016.N().Q(k)
//line app/vmalert/notifier/alertmanager_request.qtpl:19
qw422016.N().S(`:`)
//line app/vmalert/notifier/alertmanager_request.qtpl:19
qw422016.N().Q(v)
//line app/vmalert/notifier/alertmanager_request.qtpl:20
}
//line app/vmalert/notifier/alertmanager_request.qtpl:20
qw422016.N().S(`},"annotations": {`)
//line app/vmalert/notifier/alertmanager_request.qtpl:23
c := len(alert.Annotations)
//line app/vmalert/notifier/alertmanager_request.qtpl:24
for k, v := range alert.Annotations {
//line app/vmalert/notifier/alertmanager_request.qtpl:25
c = c - 1
//line app/vmalert/notifier/alertmanager_request.qtpl:26
qw422016.N().Q(k)
//line app/vmalert/notifier/alertmanager_request.qtpl:26
qw422016.N().S(`:`)
//line app/vmalert/notifier/alertmanager_request.qtpl:26
qw422016.N().Q(v)
//line app/vmalert/notifier/alertmanager_request.qtpl:26
if c > 0 {
//line app/vmalert/notifier/alertmanager_request.qtpl:26
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:26
}
//line app/vmalert/notifier/alertmanager_request.qtpl:27
}
//line app/vmalert/notifier/alertmanager_request.qtpl:27
qw422016.N().S(`}}`)
//line app/vmalert/notifier/alertmanager_request.qtpl:30
if i != len(alerts)-1 {
//line app/vmalert/notifier/alertmanager_request.qtpl:30
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:30
}
//line app/vmalert/notifier/alertmanager_request.qtpl:31
}
//line app/vmalert/notifier/alertmanager_request.qtpl:31
qw422016.N().S(`]`)
//line app/vmalert/notifier/alertmanager_request.qtpl:33
}
//line app/vmalert/notifier/alertmanager_request.qtpl:33
func writeamRequest(qq422016 qtio422016.Writer, alerts []Alert, generatorURL func(string, string) string) {
//line app/vmalert/notifier/alertmanager_request.qtpl:33
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/notifier/alertmanager_request.qtpl:33
streamamRequest(qw422016, alerts, generatorURL)
//line app/vmalert/notifier/alertmanager_request.qtpl:33
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/notifier/alertmanager_request.qtpl:33
}
//line app/vmalert/notifier/alertmanager_request.qtpl:33
func amRequest(alerts []Alert, generatorURL func(string, string) string) string {
//line app/vmalert/notifier/alertmanager_request.qtpl:33
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/notifier/alertmanager_request.qtpl:33
writeamRequest(qb422016, alerts, generatorURL)
//line app/vmalert/notifier/alertmanager_request.qtpl:33
qs422016 := string(qb422016.B)
//line app/vmalert/notifier/alertmanager_request.qtpl:33
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/notifier/alertmanager_request.qtpl:33
return qs422016
//line app/vmalert/notifier/alertmanager_request.qtpl:33
}

@@ -0,0 +1,80 @@
package notifier
import (
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"time"
)
func TestAlertManager_Send(t *testing.T) {
mux := http.NewServeMux()
mux.HandleFunc("/", func(_ http.ResponseWriter, _ *http.Request) {
t.Errorf("should not be called")
})
c := -1
mux.HandleFunc(alertManagerPath, func(w http.ResponseWriter, r *http.Request) {
c++
if r.Method != http.MethodPost {
t.Errorf("expected POST method got %s", r.Method)
}
switch c {
case 0:
conn, _, _ := w.(http.Hijacker).Hijack()
_ = conn.Close()
case 1:
w.WriteHeader(500)
case 2:
var a []struct {
Labels map[string]string `json:"labels"`
StartsAt time.Time `json:"startsAt"`
EndAt time.Time `json:"endsAt"`
Annotations map[string]string `json:"annotations"`
GeneratorURL string `json:"generatorURL"`
}
if err := json.NewDecoder(r.Body).Decode(&a); err != nil {
t.Errorf("can not unmarshal data into alert %s", err)
t.FailNow()
}
if len(a) != 1 {
t.Errorf("expected 1 alert in array got %d", len(a))
}
if a[0].GeneratorURL != "group0" {
t.Errorf("expected group0 as generatorURL got %s", a[0].GeneratorURL)
}
if a[0].Labels["alertname"] != "alert0" {
t.Errorf("expected alert0 as alert name got %s", a[0].Labels["alertname"])
}
if a[0].StartsAt.IsZero() {
t.Errorf("expected non-zero start time")
}
if a[0].EndAt.IsZero() {
t.Errorf("expected non-zero end time")
}
}
})
srv := httptest.NewServer(mux)
defer srv.Close()
am := NewAlertManager(srv.URL, func(group, name string) string {
return group + name
}, srv.Client())
if err := am.Send([]Alert{{}, {}}); err == nil {
t.Error("expected connection error got nil")
}
if err := am.Send([]Alert{}); err == nil {
t.Error("expected wrong http code error got nil")
}
if err := am.Send([]Alert{{
Group: "group",
Name: "alert0",
Start: time.Now().UTC(),
End: time.Now().UTC(),
Annotations: map[string]string{"a": "b", "c": "d", "e": "f"},
}}); err != nil {
t.Errorf("unexpected error %s", err)
}
if c != 2 {
t.Errorf("expected 2 calls(count from zero) to server got %d", c)
}
}

@@ -0,0 +1,6 @@
package notifier
// Notifier is common interface for alert manager provider
type Notifier interface {
Send(alerts []Alert) error
}

@@ -0,0 +1,171 @@
// Copyright 2013 The Prometheus Authors
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package notifier
import (
"fmt"
html_template "html/template"
"math"
"net/url"
"regexp"
"strings"
text_template "text/template"
"time"
)
var tmplFunc text_template.FuncMap
// InitTemplateFunc initializes template helper functions
func InitTemplateFunc(externalURL *url.URL) {
tmplFunc = text_template.FuncMap{
"args": func(args ...interface{}) map[string]interface{} {
result := make(map[string]interface{})
for i, a := range args {
result[fmt.Sprintf("arg%d", i)] = a
}
return result
},
"reReplaceAll": func(pattern, repl, text string) string {
re := regexp.MustCompile(pattern)
return re.ReplaceAllString(text, repl)
},
"safeHtml": func(text string) html_template.HTML {
return html_template.HTML(text)
},
"match": regexp.MatchString,
"title": strings.Title,
"toUpper": strings.ToUpper,
"toLower": strings.ToLower,
"humanize": func(v float64) string {
if v == 0 || math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
}
if math.Abs(v) >= 1 {
prefix := ""
for _, p := range []string{"k", "M", "G", "T", "P", "E", "Z", "Y"} {
if math.Abs(v) < 1000 {
break
}
prefix = p
v /= 1000
}
return fmt.Sprintf("%.4g%s", v, prefix)
}
prefix := ""
for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} {
if math.Abs(v) >= 1 {
break
}
prefix = p
v *= 1000
}
return fmt.Sprintf("%.4g%s", v, prefix)
},
"humanize1024": func(v float64) string {
if math.Abs(v) <= 1 || math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
}
prefix := ""
for _, p := range []string{"ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"} {
if math.Abs(v) < 1024 {
break
}
prefix = p
v /= 1024
}
return fmt.Sprintf("%.4g%s", v, prefix)
},
"humanizeDuration": func(v float64) string {
if math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
}
if v == 0 {
return fmt.Sprintf("%.4gs", v)
}
if math.Abs(v) >= 1 {
sign := ""
if v < 0 {
sign = "-"
v = -v
}
seconds := int64(v) % 60
minutes := (int64(v) / 60) % 60
hours := (int64(v) / 60 / 60) % 24
days := int64(v) / 60 / 60 / 24
// For days to minutes, we display seconds as an integer.
if days != 0 {
return fmt.Sprintf("%s%dd %dh %dm %ds", sign, days, hours, minutes, seconds)
}
if hours != 0 {
return fmt.Sprintf("%s%dh %dm %ds", sign, hours, minutes, seconds)
}
if minutes != 0 {
return fmt.Sprintf("%s%dm %ds", sign, minutes, seconds)
}
// For seconds, we display 4 significant digits.
return fmt.Sprintf("%s%.4gs", sign, v)
}
prefix := ""
for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} {
if math.Abs(v) >= 1 {
break
}
prefix = p
v *= 1000
}
return fmt.Sprintf("%.4g%ss", v, prefix)
},
"humanizePercentage": func(v float64) string {
return fmt.Sprintf("%.4g%%", v*100)
},
"humanizeTimestamp": func(v float64) string {
if math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
}
t := TimeFromUnixNano(int64(v * 1e9)).Time().UTC()
return fmt.Sprint(t)
},
"pathPrefix": func() string {
return externalURL.Path
},
"externalURL": func() string {
return externalURL.String()
},
}
}
// Time is the number of milliseconds since the epoch
// (1970-01-01 00:00 UTC) excluding leap seconds.
type Time int64
// TimeFromUnixNano returns the Time equivalent to the Unix Time
// t provided in nanoseconds.
func TimeFromUnixNano(t int64) Time {
return Time(t / nanosPerTick)
}
// The number of nanoseconds per minimum tick.
const nanosPerTick = int64(minimumTick / time.Nanosecond)
// minimumTick is the minimum supported time resolution. It must not
// exceed time.Second in order for the code below to work.
const minimumTick = time.Millisecond
// second is the Time duration equivalent to one second.
const second = int64(time.Second / minimumTick)
// Time returns the time.Time representation of t.
func (t Time) Time() time.Time {
return time.Unix(int64(t)/second, (int64(t)%second)*nanosPerTick)
}
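
To make the behaviour of the `humanize` template function above easier to follow, here is the same SI-prefix logic extracted into a stand-alone function. This is an illustrative copy for experimentation, not an additional API of the package.

```go
package main

import (
	"fmt"
	"math"
)

// humanize formats v with four significant digits and an SI prefix,
// following the logic of the "humanize" template function.
func humanize(v float64) string {
	if v == 0 || math.IsNaN(v) || math.IsInf(v, 0) {
		return fmt.Sprintf("%.4g", v)
	}
	if math.Abs(v) >= 1 {
		// Scale down by 1000 until the magnitude drops below 1000.
		prefix := ""
		for _, p := range []string{"k", "M", "G", "T", "P", "E", "Z", "Y"} {
			if math.Abs(v) < 1000 {
				break
			}
			prefix = p
			v /= 1000
		}
		return fmt.Sprintf("%.4g%s", v, prefix)
	}
	// Scale up by 1000 until the magnitude reaches at least 1.
	prefix := ""
	for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} {
		if math.Abs(v) >= 1 {
			break
		}
		prefix = p
		v *= 1000
	}
	return fmt.Sprintf("%.4g%s", v, prefix)
}

func main() {
	for _, v := range []float64{0, 1500, 1234567, 0.005} {
		fmt.Println(v, "->", humanize(v))
	}
}
```

In an annotation this would be used as `{{ $value | humanize }}`, assuming the function map has been initialized with `InitTemplateFunc`.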

app/vmalert/rule.go Normal file

@@ -0,0 +1,221 @@
package main
import (
"context"
"errors"
"fmt"
"hash/fnv"
"sort"
"strconv"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/metrics"
)
// Group is a named collection of alert rules
type Group struct {
Name string
Rules []*Rule
}
// Rule is basic alert entity
type Rule struct {
Name string `yaml:"alert"`
Expr string `yaml:"expr"`
For time.Duration `yaml:"for"`
Labels map[string]string `yaml:"labels"`
Annotations map[string]string `yaml:"annotations"`
group *Group
// guard status fields
mu sync.RWMutex
// stores list of active alerts
alerts map[uint64]*notifier.Alert
// stores last moment of time Exec was called
lastExecTime time.Time
// stores last error that happened in Exec func
// resets on every successful Exec
// may be used as Health state
lastExecError error
}
// Validate validates rule
func (r *Rule) Validate() error {
if r.Name == "" {
return errors.New("rule name can not be empty")
}
if r.Expr == "" {
return fmt.Errorf("expression for rule %q can't be empty", r.Name)
}
if _, err := metricsql.Parse(r.Expr); err != nil {
return fmt.Errorf("invalid expression for rule %q: %w", r.Name, err)
}
return nil
}
// Exec executes Rule expression via the given Querier.
// Based on the Querier results Rule maintains notifier.Alerts
func (r *Rule) Exec(ctx context.Context, q datasource.Querier) error {
qMetrics, err := q.Query(ctx, r.Expr)
r.mu.Lock()
defer r.mu.Unlock()
r.lastExecError = err
r.lastExecTime = time.Now()
if err != nil {
return fmt.Errorf("failed to execute query %q: %s", r.Expr, err)
}
for h, a := range r.alerts {
// cleanup inactive alerts from previous Eval
if a.State == notifier.StateInactive {
delete(r.alerts, h)
}
}
updated := make(map[uint64]struct{})
// update list of active alerts
for _, m := range qMetrics {
h := hash(m)
updated[h] = struct{}{}
if _, ok := r.alerts[h]; ok {
continue
}
a, err := r.newAlert(m)
if err != nil {
r.lastExecError = err
return fmt.Errorf("failed to create alert: %s", err)
}
a.ID = h
a.State = notifier.StatePending
r.alerts[h] = a
}
for h, a := range r.alerts {
// if alert wasn't updated in this iteration
// means it is resolved already
if _, ok := updated[h]; !ok {
a.State = notifier.StateInactive
// set endTime to last execution time
// so it can be sent by notifier on next step
a.End = r.lastExecTime
continue
}
if a.State == notifier.StatePending && time.Since(a.Start) >= r.For {
a.State = notifier.StateFiring
alertsFired.Inc()
}
if a.State == notifier.StateFiring {
a.End = r.lastExecTime.Add(3 * *evaluationInterval)
}
}
return nil
}
// Send sends the active alerts via given
// notifier.Notifier.
// See for reference https://prometheus.io/docs/alerting/clients/
// TODO: add tests for endAt value
func (r *Rule) Send(_ context.Context, ap notifier.Notifier) error {
// copy alerts to new list to avoid locks
var alertsCopy []notifier.Alert
r.mu.Lock()
for _, a := range r.alerts {
if a.State == notifier.StatePending {
continue
}
// it is safe to dereference instead of deep-copy
// because only simple types may be changed during rule.Exec
alertsCopy = append(alertsCopy, *a)
}
r.mu.Unlock()
if len(alertsCopy) < 1 {
return nil
}
alertsSent.Add(len(alertsCopy))
return ap.Send(alertsCopy)
}
var (
alertsFired = metrics.NewCounter(`vmalert_alerts_fired_total`)
alertsSent = metrics.NewCounter(`vmalert_alerts_sent_total`)
)
// TODO: consider hashing algorithm in VM
func hash(m datasource.Metric) uint64 {
hash := fnv.New64a()
labels := m.Labels
sort.Slice(labels, func(i, j int) bool {
return labels[i].Name < labels[j].Name
})
for _, l := range labels {
hash.Write([]byte(l.Name))
hash.Write([]byte(l.Value))
hash.Write([]byte("\xff"))
}
return hash.Sum64()
}
func (r *Rule) newAlert(m datasource.Metric) (*notifier.Alert, error) {
a := &notifier.Alert{
Group: r.group.Name,
Name: r.Name,
Labels: map[string]string{},
Value: m.Value,
Start: time.Now(),
// TODO: support End time
}
for _, l := range m.Labels {
a.Labels[l.Name] = l.Value
}
// metric labels may be overridden by
// rule labels
for k, v := range r.Labels {
a.Labels[k] = v
}
var err error
a.Annotations, err = a.ExecTemplate(r.Annotations)
return a, err
}
// AlertAPI generates APIAlert object from alert by its id(hash)
func (r *Rule) AlertAPI(id uint64) *APIAlert {
r.mu.RLock()
defer r.mu.RUnlock()
a, ok := r.alerts[id]
if !ok {
return nil
}
return r.newAlertAPI(*a)
}
// AlertsAPI generates list of APIAlert objects from existing alerts
func (r *Rule) AlertsAPI() []*APIAlert {
var alerts []*APIAlert
r.mu.RLock()
for _, a := range r.alerts {
alerts = append(alerts, r.newAlertAPI(*a))
}
r.mu.RUnlock()
return alerts
}
func (r *Rule) newAlertAPI(a notifier.Alert) *APIAlert {
return &APIAlert{
ID: a.ID,
Name: a.Name,
Group: a.Group,
Expression: r.Expr,
Labels: a.Labels,
Annotations: a.Annotations,
State: a.State.String(),
ActiveAt: a.Start,
Value: strconv.FormatFloat(a.Value, 'e', -1, 64),
}
}
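
The `hash` function above identifies an alert by its label set, so two metrics carrying the same labels in a different order should map to the same alert (this is what the "duplicate" test case relies on). A minimal sketch of that idea follows, assuming a hypothetical `Label` type standing in for `datasource.Label`; unlike the code in the diff, it sorts a copy so the caller's slice is not reordered.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Label is a hypothetical stand-in for datasource.Label.
type Label struct {
	Name, Value string
}

// hashLabels returns an FNV-1a hash that does not depend on label order.
// It sorts a copy of the input, so the caller's slice is left untouched.
func hashLabels(labels []Label) uint64 {
	sorted := append([]Label(nil), labels...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Name < sorted[j].Name
	})
	h := fnv.New64a()
	for _, l := range sorted {
		h.Write([]byte(l.Name))
		h.Write([]byte(l.Value))
		h.Write([]byte("\xff")) // separator between labels
	}
	return h.Sum64()
}

func main() {
	a := []Label{{"__name__", "foo"}, {"type", "bar"}}
	b := []Label{{"type", "bar"}, {"__name__", "foo"}}
	fmt.Println(hashLabels(a) == hashLabels(b)) // prints "true"
}
```

Sorting a copy costs one extra allocation per call; whether that trade-off is acceptable for vmalert is not something this diff settles.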

282
app/vmalert/rule_test.go Normal file

@@ -0,0 +1,282 @@
package main
import (
"context"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
)
func TestRule_Validate(t *testing.T) {
if err := (&Rule{}).Validate(); err == nil {
t.Errorf("expected empty name error")
}
if err := (&Rule{Name: "alert"}).Validate(); err == nil {
t.Errorf("expected empty expr error")
}
if err := (&Rule{Name: "alert", Expr: "test{"}).Validate(); err == nil {
t.Errorf("expected invalid expr error")
}
if err := (&Rule{Name: "alert", Expr: "test>0"}).Validate(); err != nil {
t.Errorf("expected valid rule got %s", err)
}
}
func newTestRule(name string, waitFor time.Duration) *Rule {
return &Rule{Name: name, alerts: make(map[uint64]*notifier.Alert), For: waitFor}
}
func TestRule_Exec(t *testing.T) {
testCases := []struct {
rule *Rule
steps [][]datasource.Metric
expAlerts map[uint64]*notifier.Alert
}{
{
newTestRule("empty", 0),
[][]datasource.Metric{},
map[uint64]*notifier.Alert{},
},
{
newTestRule("single-firing", 0),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StateFiring},
},
},
{
newTestRule("single-firing=>inactive", 0),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StateInactive},
},
},
{
newTestRule("single-firing=>inactive=>firing", 0),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{},
{metricWithLabels(t, "__name__", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StateFiring},
},
},
{
newTestRule("single-firing=>inactive=>firing=>inactive", 0),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{},
{metricWithLabels(t, "__name__", "foo")},
{},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StateInactive},
},
},
{
newTestRule("single-firing=>inactive=>firing=>inactive=>empty", 0),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{},
{metricWithLabels(t, "__name__", "foo")},
{},
{},
},
map[uint64]*notifier.Alert{},
},
{
newTestRule("single-firing=>inactive=>firing=>inactive=>empty=>firing", 0),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{},
{metricWithLabels(t, "__name__", "foo")},
{},
{},
{metricWithLabels(t, "__name__", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StateFiring},
},
},
{
newTestRule("multiple-firing", 0),
[][]datasource.Metric{
{
metricWithLabels(t, "__name__", "foo"),
metricWithLabels(t, "__name__", "foo1"),
metricWithLabels(t, "__name__", "foo2"),
},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StateFiring},
hash(metricWithLabels(t, "__name__", "foo1")): {State: notifier.StateFiring},
hash(metricWithLabels(t, "__name__", "foo2")): {State: notifier.StateFiring},
},
},
{
newTestRule("multiple-steps-firing", 0),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{metricWithLabels(t, "__name__", "foo1")},
{metricWithLabels(t, "__name__", "foo2")},
},
// 1: fire first alert
// 2: fire second alert, set first inactive
// 3: fire third alert, set second inactive, delete first one
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo1")): {State: notifier.StateInactive},
hash(metricWithLabels(t, "__name__", "foo2")): {State: notifier.StateFiring},
},
},
{
newTestRule("duplicate", 0),
[][]datasource.Metric{
{
// metrics with the same labelset should result in one alert
metricWithLabels(t, "__name__", "foo", "type", "bar"),
metricWithLabels(t, "type", "bar", "__name__", "foo"),
},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo", "type", "bar")): {State: notifier.StateFiring},
},
},
{
newTestRule("for-pending", time.Minute),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StatePending},
},
},
{
newTestRule("for-fired", time.Millisecond),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{metricWithLabels(t, "__name__", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StateFiring},
},
},
{
newTestRule("for-pending=>inactive", time.Millisecond),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{metricWithLabels(t, "__name__", "foo")},
// empty step to reset pending alerts
{},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StateInactive},
},
},
{
newTestRule("for-pending=>firing=>inactive", time.Millisecond),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{metricWithLabels(t, "__name__", "foo")},
// empty step to reset pending alerts
{},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StateInactive},
},
},
{
newTestRule("for-pending=>firing=>inactive=>pending", time.Millisecond),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{metricWithLabels(t, "__name__", "foo")},
// empty step to reset pending alerts
{},
{metricWithLabels(t, "__name__", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StatePending},
},
},
{
newTestRule("for-pending=>firing=>inactive=>pending=>firing", time.Millisecond),
[][]datasource.Metric{
{metricWithLabels(t, "__name__", "foo")},
{metricWithLabels(t, "__name__", "foo")},
// empty step to reset pending alerts
{},
{metricWithLabels(t, "__name__", "foo")},
{metricWithLabels(t, "__name__", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "__name__", "foo")): {State: notifier.StateFiring},
},
},
}
fakeGroup := &Group{Name: "TestRule_Exec"}
for _, tc := range testCases {
t.Run(tc.rule.Name, func(t *testing.T) {
fq := &fakeQuerier{}
tc.rule.group = fakeGroup
for _, step := range tc.steps {
fq.reset()
fq.add(t, step...)
if err := tc.rule.Exec(context.TODO(), fq); err != nil {
t.Fatalf("unexpected err: %s", err)
}
// artificial delay between applying steps
time.Sleep(time.Millisecond)
}
if len(tc.rule.alerts) != len(tc.expAlerts) {
t.Fatalf("expected %d alerts; got %d", len(tc.expAlerts), len(tc.rule.alerts))
}
for key, exp := range tc.expAlerts {
got, ok := tc.rule.alerts[key]
if !ok {
t.Fatalf("expected to have key %d", key)
}
if got.State != exp.State {
t.Fatalf("expected state %d; got %d", exp.State, got.State)
}
}
})
}
}
func metricWithLabels(t *testing.T, labels ...string) datasource.Metric {
t.Helper()
if len(labels) == 0 || len(labels)%2 != 0 {
t.Fatalf("expected to get even number of labels")
}
m := datasource.Metric{}
for i := 0; i < len(labels); i += 2 {
m.Labels = append(m.Labels, datasource.Label{
Name: labels[i],
Value: labels[i+1],
})
}
return m
}
type fakeQuerier struct {
metrics []datasource.Metric
}
func (fq *fakeQuerier) reset() {
fq.metrics = fq.metrics[:0]
}
func (fq *fakeQuerier) add(t *testing.T, metrics ...datasource.Metric) {
fq.metrics = append(fq.metrics, metrics...)
}
func (fq fakeQuerier) Query(ctx context.Context, query string) ([]datasource.Metric, error) {
return fq.metrics, nil
}
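
The table in `TestRule_Exec` exercises a small state machine: a series that appears becomes pending, turns firing once it has been active for at least `For`, becomes inactive when it disappears, and is deleted on the following evaluation. The sketch below is one way to read those transitions; it is not the `Rule.Exec` implementation, and every name in it (`alertState`, `step`, `demoFor`, `epoch`) is hypothetical. Time is passed in explicitly so the behaviour is deterministic.

```go
package main

import (
	"fmt"
	"time"
)

type alertState int

const (
	stateInactive alertState = iota
	statePending
	stateFiring
)

type alert struct {
	state alertState
	start time.Time
}

// Hypothetical fixtures for the demo below.
const demoFor = time.Minute

var epoch = time.Unix(0, 0)

// step applies one evaluation result (the set of active series IDs)
// to the tracked alerts.
func step(alerts map[uint64]*alert, active map[uint64]bool, forDur time.Duration, now time.Time) {
	// Series seen for the first time start as pending.
	for id, ok := range active {
		if !ok {
			continue
		}
		if _, exists := alerts[id]; !exists {
			alerts[id] = &alert{state: statePending, start: now}
		}
	}
	for id, a := range alerts {
		if !active[id] {
			if a.state == stateInactive {
				// Already inactive on the previous step: drop it.
				delete(alerts, id)
				continue
			}
			a.state = stateInactive
			continue
		}
		if a.state == stateInactive {
			// The series came back: restart the pending period.
			a.state = statePending
			a.start = now
		}
		if a.state == statePending && now.Sub(a.start) >= forDur {
			a.state = stateFiring
		}
	}
}

func main() {
	alerts := map[uint64]*alert{}
	seen := map[uint64]bool{1: true}

	step(alerts, seen, demoFor, epoch)
	fmt.Println(alerts[1].state == statePending)

	step(alerts, seen, demoFor, epoch.Add(demoFor))
	fmt.Println(alerts[1].state == stateFiring)

	step(alerts, nil, demoFor, epoch.Add(2*demoFor))
	fmt.Println(alerts[1].state == stateInactive)

	step(alerts, nil, demoFor, epoch.Add(3*demoFor))
	fmt.Println(len(alerts))
}
```

Passing `now` as a parameter, rather than calling `time.Now()` inside the function, is a deliberate choice for the sketch: it removes the need for the artificial `time.Sleep` between steps that the test above uses.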


@@ -0,0 +1,19 @@
groups:
- name: group
rules:
- alert: InvalidAnnotations
for: 5m
expr: vm_rows > 0
labels:
label: bar
annotations:
summary: "{{ $value }"
description: "{{$labels}}"
- alert: UnkownAnnotationsFunction
for: 5m
expr: vm_rows > 0
labels:
label: bar
annotations:
summary: "{{ value|query }}"
description: "{{$labels}}"


@@ -0,0 +1,13 @@
groups:
- name: duplicatedGroupDiffFiles
rules:
- alert: VMRows
for: 5m
expr: vm_rows > 0
labels:
label: bar
annotations:
summary: "{{ $value|humanize }}"
description: "{{$labels}}"


@@ -0,0 +1,22 @@
groups:
- name: sameGroup
rules:
- alert: alert
for: 5m
expr: vm_rows > 0
labels:
label: bar
annotations:
summary: "{{ $value }}"
description: "{{$labels}}"
- name: sameGroup
rules:
- alert: alert
for: 5m
expr: vm_rows > 0
labels:
label: bar
annotations:
summary: "{{ $value }}"
description: "{{$labels}}"


@@ -0,0 +1,13 @@
groups:
- name: duplicatedGroupDiffFiles
rules:
- alert: VMRows
for: 5m
expr: vm_rows > 0
labels:
label: bar
annotations:
summary: "{{ $value }}"
description: "{{$labels}}"

28
app/vmalert/testdata/rules0-bad.rules vendored Normal file

@@ -0,0 +1,28 @@
groups:
- name: group
rules:
- alert: InvalidExpr
for: 5m
expr: vm_rows{ > 0
labels:
label: bar
annotations:
summary: "{{ $value }}"
description: "{{$labels}}"
- alert: EmptyExpr
for: 5m
expr: ""
labels:
label: bar
annotations:
summary: "{{ $value }}"
description: "{{$labels}}"
- alert: ""
for: 5m
expr: vm_rows > 0
labels:
label: foo
annotations:
summary: "{{ $value }}"
description: "{{$labels}}"

22
app/vmalert/testdata/rules0-good.rules vendored Normal file

@@ -0,0 +1,22 @@
groups:
- name: groupGorSingleAlert
rules:
- alert: VMRows
for: 10s
expr: vm_rows > 0
labels:
label: bar
annotations:
summary: "{{ $value|humanize }}"
description: "{{$labels}}"
- name: TestGroup
rules:
- alert: Conns
expr: sum(vm_tcplistener_conns) by(instance) > 1
annotations:
summary: "Too high connection number for {{$labels.instance}}"
description: "It is {{ $value }} connections for {{$labels.instance}}"
- alert: ExampleAlertAlwaysFiring
expr: sum by(job)
(up == 1)

134
app/vmalert/web.go Normal file

@@ -0,0 +1,134 @@
package main
import (
"encoding/json"
"fmt"
"net/http"
"sort"
"strconv"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
)
// APIAlert has info for an alert.
type APIAlert struct {
ID uint64 `json:"id"`
Name string `json:"name"`
Group string `json:"group"`
Expression string `json:"expression"`
State string `json:"state"`
Value string `json:"value"`
Labels map[string]string `json:"labels"`
Annotations map[string]string `json:"annotations"`
ActiveAt time.Time `json:"activeAt"`
}
type requestHandler struct {
groups []Group
}
var pathList = [][]string{
{"/api/v1/alerts", "list all active alerts"},
{"/api/v1/groupName/alertID/status", "get alert status by ID"},
// /metrics is served by httpserver by default
{"/metrics", "list of application metrics"},
}
func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
resph := responseHandler{w}
switch r.URL.Path {
case "/":
for _, path := range pathList {
p, doc := path[0], path[1]
fmt.Fprintf(w, "<a href='%s'>%q</a> - %s<br/>", p, p, doc)
}
return true
case "/api/v1/alerts":
resph.handle(rh.list())
return true
default:
// /api/v1/<groupName>/<alertID>/status
if strings.HasSuffix(r.URL.Path, "/status") {
resph.handle(rh.alert(r.URL.Path))
return true
}
return false
}
}
type listAlertsResponse struct {
Data struct {
Alerts []*APIAlert `json:"alerts"`
} `json:"data"`
Status string `json:"status"`
}
func (rh *requestHandler) list() ([]byte, error) {
lr := listAlertsResponse{Status: "success"}
for _, g := range rh.groups {
for _, r := range g.Rules {
lr.Data.Alerts = append(lr.Data.Alerts, r.AlertsAPI()...)
}
}
// sort list of alerts for deterministic output
sort.Slice(lr.Data.Alerts, func(i, j int) bool {
return lr.Data.Alerts[i].Name < lr.Data.Alerts[j].Name
})
b, err := json.Marshal(lr)
if err != nil {
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf(`error encoding list of active alerts: %s`, err),
StatusCode: http.StatusInternalServerError,
}
}
return b, nil
}
func (rh *requestHandler) alert(path string) ([]byte, error) {
parts := strings.SplitN(strings.TrimPrefix(path, "/api/v1/"), "/", 3)
if len(parts) != 3 {
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf(`path %q contains /status suffix but doesn't match pattern "/group/alert/status"`, path),
StatusCode: http.StatusBadRequest,
}
}
group := strings.TrimRight(parts[0], "/")
idStr := strings.TrimRight(parts[1], "/")
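// alert IDs are uint64 hashes, so parse with an explicit 64-bit size;
// bitSize 0 means uint, which is only 32 bits wide on 386 and arm builds
id, err := strconv.ParseUint(idStr, 10, 64)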
if err != nil {
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf(`cannot parse int from %q`, idStr),
StatusCode: http.StatusBadRequest,
}
}
for _, g := range rh.groups {
if g.Name != group {
continue
}
for i := range g.Rules {
if apiAlert := g.Rules[i].AlertAPI(id); apiAlert != nil {
return json.Marshal(apiAlert)
}
}
}
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf(`cannot find alert %s in %q`, idStr, group),
StatusCode: http.StatusNotFound,
}
}
// responseHandler wraps http.ResponseWriter with a helper that writes
// either a JSON payload or an error response
type responseHandler struct{ http.ResponseWriter }
func (w responseHandler) handle(b []byte, err error) {
if err != nil {
httpserver.Errorf(w, "%s", err)
return
}
w.Header().Set("Content-Type", "application/json")
w.Write(b)
}
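
The `alert` handler above extracts the group name and the numeric alert ID from paths of the form `/api/v1/<groupName>/<alertID>/status`. The following sketch isolates that parsing step so it can be tried on its own. `parseStatusPath` is a hypothetical helper, not part of the diff, and it is slightly stricter than the original: it also checks that the last path segment is exactly `status`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseStatusPath splits "/api/v1/<group>/<id>/status" into group and id.
func parseStatusPath(path string) (string, uint64, error) {
	parts := strings.SplitN(strings.TrimPrefix(path, "/api/v1/"), "/", 3)
	if len(parts) != 3 || parts[2] != "status" {
		return "", 0, fmt.Errorf("path %q doesn't match /api/v1/<group>/<id>/status", path)
	}
	// IDs are 64-bit hashes, so request a 64-bit parse explicitly.
	id, err := strconv.ParseUint(parts[1], 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("cannot parse alert id from %q: %w", parts[1], err)
	}
	return parts[0], id, nil
}

func main() {
	group, id, err := parseStatusPath("/api/v1/group/18446744073709551615/status")
	fmt.Println(group, id, err)

	_, _, err = parseStatusPath("/api/v1/group/status")
	fmt.Println(err != nil)
}
```

The first call uses the largest `uint64` value to show that a full-width hash parses without overflow when the bit size is given as 64.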

72
app/vmalert/web_test.go Normal file

@@ -0,0 +1,72 @@
package main
import (
"encoding/json"
"net/http"
"net/http/httptest"
"reflect"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
)
func TestHandler(t *testing.T) {
rule := &Rule{
Name: "alert",
alerts: map[uint64]*notifier.Alert{
0: {},
},
}
rh := &requestHandler{
groups: []Group{{
Name: "group",
Rules: []*Rule{rule},
}},
}
getResp := func(url string, to interface{}, code int) {
t.Helper()
resp, err := http.Get(url)
if err != nil {
// stop here: resp is nil on error and must not be dereferenced below
t.Fatalf("unexpected err %s", err)
}
if code != resp.StatusCode {
t.Errorf("unexpected status code %d want %d", resp.StatusCode, code)
}
defer func() {
if err := resp.Body.Close(); err != nil {
t.Errorf("err closing body %s", err)
}
}()
if to != nil {
if err = json.NewDecoder(resp.Body).Decode(to); err != nil {
t.Errorf("unexpected err %s", err)
}
}
}
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { rh.handler(w, r) }))
defer ts.Close()
t.Run("/api/v1/alerts", func(t *testing.T) {
lr := listAlertsResponse{}
getResp(ts.URL+"/api/v1/alerts", &lr, 200)
if length := len(lr.Data.Alerts); length != 1 {
t.Errorf("expected 1 alert got %d", length)
}
})
t.Run("/api/v1/group/0/status", func(t *testing.T) {
alert := &APIAlert{}
getResp(ts.URL+"/api/v1/group/0/status", alert, 200)
expAlert := rule.newAlertAPI(*rule.alerts[0])
if !reflect.DeepEqual(alert, expAlert) {
t.Errorf("expected %v to be equal to %v", alert, expAlert)
}
})
t.Run("/api/v1/group/1/status", func(t *testing.T) {
getResp(ts.URL+"/api/v1/group/1/status", nil, 404)
})
t.Run("/api/v1/unknown-group/0/status", func(t *testing.T) {
getResp(ts.URL+"/api/v1/unknown-group/0/status", nil, 404)
})
t.Run("/", func(t *testing.T) {
getResp(ts.URL, nil, 200)
})
}
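
`TestHandler` starts a real listener via `httptest.NewServer`. As an alternative technique (not what the diff does), a handler that follows the same "return true when the request was handled" convention can be exercised in memory with `httptest.NewRecorder`. The sketch below assumes a simplified, hypothetical `handler` and a `statusFor` helper; mapping an unhandled path to 404 is also an assumption about what the caller would do with a `false` result.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// handler mimics the bool-returning convention: true means "handled".
func handler(w http.ResponseWriter, r *http.Request) bool {
	switch r.URL.Path {
	case "/api/v1/alerts":
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"status":"success"}`)
		return true
	default:
		return false
	}
}

// statusFor runs a single in-memory GET request and returns the status code.
// Unhandled paths are answered with 404 by the caller of handler.
func statusFor(path string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if !handler(rec, req) {
		http.NotFound(rec, req)
	}
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/api/v1/alerts"))
	fmt.Println(statusFor("/unknown"))
}
```

The recorder avoids opening a socket, which tends to make such tests faster and keeps them free of port conflicts.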


@@ -3,35 +3,68 @@
vmbackup:
APP_NAME=vmbackup $(MAKE) app-local
vmbackup-race:
APP_NAME=vmbackup RACE=-race $(MAKE) app-local
vmbackup-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker
vmbackup-pure-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-pure
vmbackup-amd64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-amd64
vmbackup-arm-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-arm
vmbackup-arm64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-arm64
vmbackup-ppc64le-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-ppc64le
vmbackup-386-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-386
package-vmbackup:
APP_NAME=vmbackup $(MAKE) package-via-docker
package-vmbackup-pure:
APP_NAME=vmbackup $(MAKE) package-via-docker-pure
package-vmbackup-amd64:
APP_NAME=vmbackup $(MAKE) package-via-docker-amd64
package-vmbackup-arm:
APP_NAME=vmbackup $(MAKE) package-via-docker-arm
package-vmbackup-arm64:
APP_NAME=vmbackup $(MAKE) package-via-docker-arm64
package-vmbackup-ppc64le:
APP_NAME=vmbackup $(MAKE) package-via-docker-ppc64le
package-vmbackup-386:
APP_NAME=vmbackup $(MAKE) package-via-docker-386
publish-vmbackup:
APP_NAME=vmbackup $(MAKE) publish-via-docker
vmbackup-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-arm ./app/vmbackup
vmbackup-arm-prod:
APP_NAME=vmbackup APP_SUFFIX='-arm' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm' $(MAKE) app-via-docker
vmbackup-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-arm64 ./app/vmbackup
vmbackup-arm64-prod:
APP_NAME=vmbackup APP_SUFFIX='-arm64' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm64' $(MAKE) app-via-docker
vmbackup-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-386 ./app/vmbackup
vmbackup-386-prod:
APP_NAME=vmbackup APP_SUFFIX='-386' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=386' $(MAKE) app-via-docker
vmbackup-pure:
APP_NAME=vmbackup $(MAKE) app-local-pure
vmbackup-pure-prod:
APP_NAME=vmbackup APP_SUFFIX='-pure' DOCKER_OPTS='--env CGO_ENABLED=0' $(MAKE) app-via-docker
vmbackup-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-amd64 ./app/vmbackup
vmbackup-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-arm ./app/vmbackup
vmbackup-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-arm64 ./app/vmbackup
vmbackup-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-ppc64le ./app/vmbackup
vmbackup-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-386 ./app/vmbackup


@@ -6,6 +6,7 @@ Supported storage systems for backups:
* [GCS](https://cloud.google.com/storage/). Example: `gcs://<bucket>/<path/to/backup>`
* [S3](https://aws.amazon.com/s3/). Example: `s3://<bucket>/<path/to/backup>`
* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio), [Ceph](https://docs.ceph.com/docs/mimic/radosgw/s3/) or [Swift](https://www.swiftstack.com/docs/admin/middleware/s3_middleware.html). See `-customS3Endpoint` command-line flag.
* Local filesystem. Example: `fs://</absolute/path/to/backup>`
Incremental backups and full backups are supported. Incremental backups are created automatically if the destination path already contains data from the previous backup.
@@ -115,7 +116,7 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
### Troubleshooting
* If the backup is slow, then try setting higher value for `-concurrency` flag. This will increase the number of concurrent workers that upload data to backup storage.
* If `vmbackup` eats all the network bandwidth, then set `-concurrency` to 1. This should reduce network bandwidth usage.
* If `vmbackup` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value.
* If `vmbackup` has been interrupted due to temporary error, then just restart it with the same args. It will resume the backup process.
@@ -129,14 +130,20 @@ Run `vmbackup -help` in order to see all the available options:
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs (default "default")
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-dst string
Where to put the backup on the remote storage. Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, ERROR, FATAL, PANIC (default "INFO")
-maxBytesPerSecond int
The maximum upload speed. There is no limit if it is set to 0
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy (default 60)
-origin string
@@ -157,7 +164,7 @@ It is recommended using [binary releases](https://github.com/VictoriaMetrics/Vic
#### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.13.
2. Run `make vmbackup` from the root folder of the repository.
It builds `vmbackup` binary and puts it into the `bin` folder.


@@ -1,5 +1,6 @@
FROM scratch
COPY --from=local/certs:1.0.2 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY bin/vmbackup-prod .
EXPOSE 8428
ARG base_image
FROM $base_image
ENTRYPOINT ["/vmbackup-prod"]
ARG src_binary
COPY $src_binary ./vmbackup-prod


@@ -9,6 +9,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fslocal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
@@ -18,13 +19,14 @@ var (
dst = flag.String("dst", "", "Where to put the backup on the remote storage. "+
"Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir\n"+
"-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded")
origin = flag.String("origin", "", "Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups")
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce backup duration")
origin = flag.String("origin", "", "Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups")
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce backup duration")
maxBytesPerSecond = flag.Int("maxBytesPerSecond", 0, "The maximum upload speed. There is no limit if it is set to 0")
)
func main() {
flag.Usage = usage
flag.Parse()
envflag.Parse()
buildinfo.Init()
srcFS, err := newSrcFS()
@@ -84,7 +86,11 @@ func newSrcFS() (*fslocal.FS, error) {
}
fs := &fslocal.FS{
Dir: snapshotPath,
Dir: snapshotPath,
MaxBytesPerSecond: *maxBytesPerSecond,
}
if err := fs.Init(); err != nil {
return nil, fmt.Errorf("cannot initialize fs: %s", err)
}
return fs, nil
}


@@ -47,7 +47,7 @@ func (ctx *InsertCtx) marshalMetricNameRaw(prefix []byte, labels []prompb.Label)
return metricNameRaw[:len(metricNameRaw):len(metricNameRaw)]
}
// WriteDataPoint writes (timestamp, value) with the given prefix and lables into ctx buffer.
// WriteDataPoint writes (timestamp, value) with the given prefix and labels into ctx buffer.
func (ctx *InsertCtx) WriteDataPoint(prefix []byte, labels []prompb.Label, timestamp int64, value float64) {
metricNameRaw := ctx.marshalMetricNameRaw(prefix, labels)
ctx.addRow(metricNameRaw, timestamp, value)
@@ -78,6 +78,26 @@ func (ctx *InsertCtx) addRow(metricNameRaw []byte, timestamp int64, value float6
mr.Value = value
}
// AddLabelBytes adds (name, value) label to ctx.Labels.
//
// name and value must exist until ctx.Labels is used.
func (ctx *InsertCtx) AddLabelBytes(name, value []byte) {
labels := ctx.Labels
if cap(labels) > len(labels) {
labels = labels[:len(labels)+1]
} else {
labels = append(labels, prompb.Label{})
}
label := &labels[len(labels)-1]
// Do not copy name and value contents for performance reasons.
// This reduces GC overhead on the number of objects and allocations.
label.Name = name
label.Value = value
ctx.Labels = labels
}
// AddLabel adds (name, value) label to ctx.Labels.
//
// name and value must exist until ctx.Labels is used.
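
`AddLabelBytes` above grows the label slice in place when spare capacity exists and stores the `name` and `value` byte slices without copying them, which is why its comment requires both to stay valid until the labels are used. A stripped-down sketch of the same pattern follows, using hypothetical `label` and `ctx` types rather than `prompb.Label` and `InsertCtx`; it shows both the capacity reuse and the aliasing caveat.

```go
package main

import "fmt"

// label and ctx are hypothetical stand-ins for prompb.Label and InsertCtx.
type label struct{ name, value []byte }

type ctx struct{ labels []label }

// addLabelBytes appends a label, reusing spare capacity when available.
// name and value are stored as-is, without copying their contents.
func (c *ctx) addLabelBytes(name, value []byte) {
	labels := c.labels
	if cap(labels) > len(labels) {
		labels = labels[:len(labels)+1]
	} else {
		labels = append(labels, label{})
	}
	l := &labels[len(labels)-1]
	l.name = name
	l.value = value
	c.labels = labels
}

func main() {
	c := &ctx{}
	name := []byte("job")
	c.addLabelBytes(name, []byte("vm"))

	// The label aliases the caller's buffer, so this change is visible.
	name[0] = 'J'
	fmt.Printf("%s\n", c.labels[0].name) // prints "Job"

	// Truncating and re-adding reuses the existing backing array.
	before := cap(c.labels)
	c.labels = c.labels[:0]
	c.addLabelBytes([]byte("a"), []byte("b"))
	fmt.Println(cap(c.labels) == before)
}
```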


@@ -0,0 +1,36 @@
package common
import (
"runtime"
"sync"
)
// GetInsertCtx returns InsertCtx from the pool.
//
// Call PutInsertCtx for returning it to the pool.
func GetInsertCtx() *InsertCtx {
select {
case ctx := <-insertCtxPoolCh:
return ctx
default:
if v := insertCtxPool.Get(); v != nil {
return v.(*InsertCtx)
}
return &InsertCtx{}
}
}
// PutInsertCtx returns ctx to the pool.
//
// ctx cannot be used after the call.
func PutInsertCtx(ctx *InsertCtx) {
ctx.Reset(0)
select {
case insertCtxPoolCh <- ctx:
default:
insertCtxPool.Put(ctx)
}
}
var insertCtxPool sync.Pool
var insertCtxPoolCh = make(chan *InsertCtx, runtime.GOMAXPROCS(-1))
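
`GetInsertCtx` and `PutInsertCtx` above combine a buffered channel with a `sync.Pool`: the channel is tried first and the pool is the fallback. The sketch below reproduces that two-level layout with a hypothetical `buf` type. The motivation given in the comments is an assumption rather than something the diff states: `sync.Pool` may drop its contents during garbage collection, whereas objects parked in the channel stay alive.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// buf is a hypothetical reusable object.
type buf struct{ data []byte }

func (b *buf) reset() { b.data = b.data[:0] }

var (
	bufPool sync.Pool
	// Up to GOMAXPROCS objects are kept in the channel; presumably this
	// protects them from being released by the garbage collector.
	bufPoolCh = make(chan *buf, runtime.GOMAXPROCS(-1))
)

// getBuf prefers the channel, then sync.Pool, then a fresh allocation.
func getBuf() *buf {
	select {
	case b := <-bufPoolCh:
		return b
	default:
		if v := bufPool.Get(); v != nil {
			return v.(*buf)
		}
		return &buf{}
	}
}

// putBuf resets b and returns it to the channel, or to sync.Pool when
// the channel is full. b must not be used after the call.
func putBuf(b *buf) {
	b.reset()
	select {
	case bufPoolCh <- b:
	default:
		bufPool.Put(b)
	}
}

func main() {
	b := getBuf()
	b.data = append(b.data, "hello"...)
	putBuf(b)

	again := getBuf()
	fmt.Println(again == b, len(again.data), cap(again.data) >= 5)
}
```

Because the channel has room for at least one object, the second `getBuf` call returns the object that was just put back, emptied but with its capacity intact.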


@@ -0,0 +1,44 @@
package csvimport
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="csvimport"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="csvimport"}`)
)
// InsertHandler processes /api/v1/import/csv requests.
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(rows)
})
})
}
func insertRows(rows []parser.Row) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
ctx.Reset(len(rows))
for i := range rows {
r := &rows[i]
ctx.Labels = ctx.Labels[:0]
ctx.AddLabel("", r.Metric)
for j := range r.Tags {
tag := &r.Tags[j]
ctx.AddLabel(tag.Key, tag.Value)
}
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ctx.FlushBufs()
}


@@ -1,161 +1,44 @@
package graphite
import (
"fmt"
"io"
"net"
"runtime"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="graphite"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="graphite"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="graphite"}`)
)
// insertHandler processes remote write for graphite plaintext protocol.
// InsertHandler processes remote write for graphite plaintext protocol.
//
// See https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
func insertHandler(r io.Reader) error {
return concurrencylimiter.Do(func() error {
return insertHandlerInternal(r)
func InsertHandler(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, insertRows)
})
}
func insertHandlerInternal(r io.Reader) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
for ctx.Read(r) {
if err := ctx.InsertRows(); err != nil {
return err
}
}
return ctx.Error()
}
func insertRows(rows []parser.Row) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
func (ctx *pushCtx) InsertRows() error {
rows := ctx.Rows.Rows
ic := &ctx.Common
ic.Reset(len(rows))
ctx.Reset(len(rows))
for i := range rows {
r := &rows[i]
ic.Labels = ic.Labels[:0]
ic.AddLabel("", r.Metric)
ctx.Labels = ctx.Labels[:0]
ctx.AddLabel("", r.Metric)
for j := range r.Tags {
tag := &r.Tags[j]
ic.AddLabel(tag.Key, tag.Value)
ctx.AddLabel(tag.Key, tag.Value)
}
ic.WriteDataPoint(nil, ic.Labels, r.Timestamp, r.Value)
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ic.FlushBufs()
return ctx.FlushBufs()
}
const flushTimeout = 3 * time.Second
func (ctx *pushCtx) Read(r io.Reader) bool {
graphiteReadCalls.Inc()
if ctx.err != nil {
return false
}
if c, ok := r.(net.Conn); ok {
if err := c.SetReadDeadline(time.Now().Add(flushTimeout)); err != nil {
graphiteReadErrors.Inc()
ctx.err = fmt.Errorf("cannot set read deadline: %s", err)
return false
}
}
ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlock(r, ctx.reqBuf, ctx.tailBuf)
if ctx.err != nil {
if ne, ok := ctx.err.(net.Error); ok && ne.Timeout() {
// Flush the read data on timeout and try reading again.
ctx.err = nil
} else {
if ctx.err != io.EOF {
graphiteReadErrors.Inc()
ctx.err = fmt.Errorf("cannot read graphite plaintext protocol data: %s", ctx.err)
}
return false
}
}
ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf))
// Fill missing timestamps with the current timestamp rounded to seconds.
currentTimestamp := time.Now().Unix()
rows := ctx.Rows.Rows
for i := range rows {
r := &rows[i]
if r.Timestamp == 0 {
r.Timestamp = currentTimestamp
}
}
// Convert timestamps from seconds to milliseconds.
for i := range rows {
rows[i].Timestamp *= 1e3
}
return true
}
type pushCtx struct {
Rows Rows
Common common.InsertCtx
reqBuf []byte
tailBuf []byte
err error
}
func (ctx *pushCtx) Error() error {
if ctx.err == io.EOF {
return nil
}
return ctx.err
}
func (ctx *pushCtx) reset() {
ctx.Rows.Reset()
ctx.Common.Reset(0)
ctx.reqBuf = ctx.reqBuf[:0]
ctx.tailBuf = ctx.tailBuf[:0]
ctx.err = nil
}
var (
graphiteReadCalls = metrics.NewCounter(`vm_read_calls_total{name="graphite"}`)
graphiteReadErrors = metrics.NewCounter(`vm_read_errors_total{name="graphite"}`)
)
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))


@@ -2,87 +2,59 @@ package influx
import (
"flag"
"fmt"
"io"
"net/http"
"runtime"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for `{measurement}{separator}{field_name}` metric name when inserted via Influx line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses `{measurement}` instead of `{measurement}{separator}{field_name}` for metric name if Influx line contains only a single field")
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for '{measurement}{separator}{field_name}' metric name when inserted via Influx line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metric name if Influx line contains only a single field")
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="influx"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="influx"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="influx"}`)
)
// InsertHandler processes remote write for influx line protocol.
// InsertHandlerForReader processes remote write for influx line protocol.
//
// See https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/tsdb/README.md
func InsertHandler(req *http.Request) error {
return concurrencylimiter.Do(func() error {
return insertHandlerInternal(req)
// See https://github.com/influxdata/telegraf/tree/master/plugins/inputs/socket_listener/
func InsertHandlerForReader(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, false, "", "", insertRows)
})
}
func insertHandlerInternal(req *http.Request) error {
influxReadCalls.Inc()
r := req.Body
if req.Header.Get("Content-Encoding") == "gzip" {
zr, err := common.GetGzipReader(r)
if err != nil {
return fmt.Errorf("cannot read gzipped influx line protocol data: %s", err)
}
defer common.PutGzipReader(zr)
r = zr
}
q := req.URL.Query()
tsMultiplier := int64(1e6)
switch q.Get("precision") {
case "ns":
tsMultiplier = 1e6
case "u":
tsMultiplier = 1e3
case "ms":
tsMultiplier = 1
case "s":
tsMultiplier = -1e3
case "m":
tsMultiplier = -1e3 * 60
case "h":
tsMultiplier = -1e3 * 3600
}
// Read db tag from https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint
db := q.Get("db")
ctx := getPushCtx()
defer putPushCtx(ctx)
for ctx.Read(r, tsMultiplier) {
if err := ctx.InsertRows(db); err != nil {
return err
}
}
return ctx.Error()
// InsertHandlerForHTTP processes remote write for influx line protocol.
//
// See https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/tsdb/README.md
func InsertHandlerForHTTP(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
q := req.URL.Query()
precision := q.Get("precision")
// Read db tag from https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint
db := q.Get("db")
return parser.ParseStream(req.Body, isGzipped, precision, db, insertRows)
})
}
func (ctx *pushCtx) InsertRows(db string) error {
rows := ctx.Rows.Rows
func insertRows(db string, rows []parser.Row) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
rowsLen := 0
for i := range rows {
rowsLen += len(rows[i].Tags)
rowsLen += len(rows[i].Fields)
}
ic := &ctx.Common
ic.Reset(rowsLen)
@@ -104,7 +76,7 @@ func (ctx *pushCtx) InsertRows(db string) error {
ctx.metricNameBuf = storage.MarshalMetricNameRaw(ctx.metricNameBuf[:0], ic.Labels)
ctx.metricGroupBuf = append(ctx.metricGroupBuf[:0], r.Measurement...)
skipFieldKey := len(r.Fields) == 1 && *skipSingleField
if !skipFieldKey {
if len(ctx.metricGroupBuf) > 0 && !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, *measurementFieldSeparator...)
}
metricGroupPrefixLen := len(ctx.metricGroupBuf)
@@ -125,80 +97,16 @@ func (ctx *pushCtx) InsertRows(db string) error {
return ic.FlushBufs()
}
func (ctx *pushCtx) Read(r io.Reader, tsMultiplier int64) bool {
if ctx.err != nil {
return false
}
ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlock(r, ctx.reqBuf, ctx.tailBuf)
if ctx.err != nil {
if ctx.err != io.EOF {
influxReadErrors.Inc()
ctx.err = fmt.Errorf("cannot read influx line protocol data: %s", ctx.err)
}
return false
}
ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf))
// Adjust timestamps according to tsMultiplier
currentTs := time.Now().UnixNano() / 1e6
if tsMultiplier >= 1 {
for i := range ctx.Rows.Rows {
row := &ctx.Rows.Rows[i]
if row.Timestamp == 0 {
row.Timestamp = currentTs
} else {
row.Timestamp /= tsMultiplier
}
}
} else if tsMultiplier < 0 {
tsMultiplier = -tsMultiplier
currentTs -= currentTs % tsMultiplier
for i := range ctx.Rows.Rows {
row := &ctx.Rows.Rows[i]
if row.Timestamp == 0 {
row.Timestamp = currentTs
} else {
row.Timestamp *= tsMultiplier
}
}
}
return true
}
var (
influxReadCalls = metrics.NewCounter(`vm_read_calls_total{name="influx"}`)
influxReadErrors = metrics.NewCounter(`vm_read_errors_total{name="influx"}`)
)
type pushCtx struct {
Rows Rows
Common common.InsertCtx
reqBuf []byte
tailBuf []byte
Common common.InsertCtx
metricNameBuf []byte
metricGroupBuf []byte
err error
}
func (ctx *pushCtx) Error() error {
if ctx.err == io.EOF {
return nil
}
return ctx.err
}
func (ctx *pushCtx) reset() {
ctx.Rows.Reset()
ctx.Common.Reset(0)
ctx.reqBuf = ctx.reqBuf[:0]
ctx.tailBuf = ctx.tailBuf[:0]
ctx.metricNameBuf = ctx.metricNameBuf[:0]
ctx.metricGroupBuf = ctx.metricGroupBuf[:0]
ctx.err = nil
}
func getPushCtx() *pushCtx {
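The removed `tsMultiplier` switch above encodes how the `precision` query arg maps Influx timestamps to milliseconds: positive multipliers divide, negative ones multiply. A minimal sketch of the same arithmetic follows. The helper name `influxToMillis` is hypothetical, the handling of zero timestamps (filled with the current time in the removed code) is left out, and how `lib/protoparser/influx` applies the precision internally is not shown in this diff.

```go
package main

import "fmt"

// influxToMillis converts a non-zero Influx timestamp to milliseconds.
// Hypothetical helper mirroring the removed tsMultiplier switch;
// it is not part of the VictoriaMetrics code base.
func influxToMillis(ts int64, precision string) int64 {
	switch precision {
	case "u":
		return ts / 1e3 // microseconds -> milliseconds
	case "ms":
		return ts
	case "s":
		return ts * 1e3
	case "m":
		return ts * 60 * 1e3
	case "h":
		return ts * 3600 * 1e3
	default:
		// "ns" and an empty precision both mean nanoseconds.
		return ts / 1e6
	}
}

func main() {
	fmt.Println(influxToMillis(1587045689, "s"))
	fmt.Println(influxToMillis(1587045689123456789, "ns"))
}
```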

@@ -6,51 +6,76 @@ import (
"net/http"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/prompush"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
influxserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/influx"
opentsdbserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdb"
opentsdbhttpserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB put messages. Usually :4242 must be set. Doesn't work if empty")
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for Influx line protocol data. Usually :8189 must be set. Doesn't work if empty")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
"Usually :4242 must be set. Doesn't work if empty")
opentsdbHTTPListenAddr = flag.String("opentsdbHTTPListenAddr", "", "TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty")
maxInsertRequestSize = flag.Int("maxInsertRequestSize", 32*1024*1024, "The maximum size of a single insert request in bytes")
maxLabelsPerTimeseries = flag.Int("maxLabelsPerTimeseries", 30, "The maximum number of labels accepted per time series. Superfluous labels are dropped")
)
var (
influxServer *influxserver.Server
graphiteServer *graphiteserver.Server
opentsdbServer *opentsdbserver.Server
opentsdbhttpServer *opentsdbhttpserver.Server
)
// Init initializes vminsert.
func Init() {
storage.SetMaxLabelsPerTimeseries(*maxLabelsPerTimeseries)
concurrencylimiter.Init()
writeconcurrencylimiter.Init()
if len(*influxListenAddr) > 0 {
influxServer = influxserver.MustStart(*influxListenAddr, influx.InsertHandlerForReader)
}
if len(*graphiteListenAddr) > 0 {
go graphite.Serve(*graphiteListenAddr)
graphiteServer = graphiteserver.MustStart(*graphiteListenAddr, graphite.InsertHandler)
}
if len(*opentsdbListenAddr) > 0 {
go opentsdb.Serve(*opentsdbListenAddr)
opentsdbServer = opentsdbserver.MustStart(*opentsdbListenAddr, opentsdb.InsertHandler, opentsdbhttp.InsertHandler)
}
if len(*opentsdbHTTPListenAddr) > 0 {
go opentsdbhttp.Serve(*opentsdbHTTPListenAddr, int64(*maxInsertRequestSize))
opentsdbhttpServer = opentsdbhttpserver.MustStart(*opentsdbHTTPListenAddr, opentsdbhttp.InsertHandler)
}
promscrape.Init(prompush.Push)
}
// Stop stops vminsert.
func Stop() {
promscrape.Stop()
if len(*influxListenAddr) > 0 {
influxServer.MustStop()
}
if len(*graphiteListenAddr) > 0 {
graphite.Stop()
graphiteServer.MustStop()
}
if len(*opentsdbListenAddr) > 0 {
opentsdb.Stop()
opentsdbServer.MustStop()
}
if len(*opentsdbHTTPListenAddr) > 0 {
opentsdbhttp.Stop()
opentsdbhttpServer.MustStop()
}
}
@@ -60,16 +85,34 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
switch path {
case "/api/v1/write":
prometheusWriteRequests.Inc()
if err := prometheus.InsertHandler(r, int64(*maxInsertRequestSize)); err != nil {
if err := promremotewrite.InsertHandler(r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/csv":
csvimportRequests.Inc()
if err := csvimport.InsertHandler(r); err != nil {
csvimportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/write", "/api/v2/write":
influxWriteRequests.Inc()
if err := influx.InsertHandler(r); err != nil {
if err := influx.InsertHandlerForHTTP(r); err != nil {
influxWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
@@ -82,6 +125,11 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
influxQueryRequests.Inc()
fmt.Fprintf(w, `{"results":[{"series":[{"values":[]}]}]}`)
return true
case "/targets":
promscrapeTargetsRequests.Inc()
w.Header().Set("Content-Type", "text/plain")
promscrape.WriteHumanReadableTargetsStatus(w)
return true
default:
// This is not our link
return false
@@ -89,11 +137,19 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
var (
prometheusWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/write", protocol="prometheus"}`)
prometheusWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/write", protocol="prometheus"}`)
prometheusWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/write", protocol="promremotewrite"}`)
prometheusWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/write", protocol="promremotewrite"}`)
vmimportRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/import", protocol="vmimport"}`)
vmimportErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/import", protocol="vmimport"}`)
csvimportRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/import/csv", protocol="csvimport"}`)
csvimportErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/import/csv", protocol="csvimport"}`)
influxWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/write", protocol="influx"}`)
influxWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/write", protocol="influx"}`)
influxQueryRequests = metrics.NewCounter(`vm_http_requests_total{path="/query", protocol="influx"}`)
promscrapeTargetsRequests = metrics.NewCounter(`vm_http_requests_total{path="/targets"}`)
)

@@ -1,160 +1,44 @@
package opentsdb
import (
"fmt"
"io"
"net"
"runtime"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentsdb"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="opentsdb"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="opentsdb"}`)
)
// insertHandler processes remote write for OpenTSDB put protocol.
// InsertHandler processes remote write for OpenTSDB put protocol.
//
// See http://opentsdb.net/docs/build/html/api_telnet/put.html
func insertHandler(r io.Reader) error {
return concurrencylimiter.Do(func() error {
return insertHandlerInternal(r)
func InsertHandler(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, insertRows)
})
}
func insertHandlerInternal(r io.Reader) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
for ctx.Read(r) {
if err := ctx.InsertRows(); err != nil {
return err
}
}
return ctx.Error()
}
func insertRows(rows []parser.Row) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
func (ctx *pushCtx) InsertRows() error {
rows := ctx.Rows.Rows
ic := &ctx.Common
ic.Reset(len(rows))
ctx.Reset(len(rows))
for i := range rows {
r := &rows[i]
ic.Labels = ic.Labels[:0]
ic.AddLabel("", r.Metric)
ctx.Labels = ctx.Labels[:0]
ctx.AddLabel("", r.Metric)
for j := range r.Tags {
tag := &r.Tags[j]
ic.AddLabel(tag.Key, tag.Value)
ctx.AddLabel(tag.Key, tag.Value)
}
ic.WriteDataPoint(nil, ic.Labels, r.Timestamp, r.Value)
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ic.FlushBufs()
return ctx.FlushBufs()
}
const flushTimeout = 3 * time.Second
func (ctx *pushCtx) Read(r io.Reader) bool {
opentsdbReadCalls.Inc()
if ctx.err != nil {
return false
}
if c, ok := r.(net.Conn); ok {
if err := c.SetReadDeadline(time.Now().Add(flushTimeout)); err != nil {
opentsdbReadErrors.Inc()
ctx.err = fmt.Errorf("cannot set read deadline: %s", err)
return false
}
}
ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlock(r, ctx.reqBuf, ctx.tailBuf)
if ctx.err != nil {
if ne, ok := ctx.err.(net.Error); ok && ne.Timeout() {
// Flush the read data on timeout and try reading again.
ctx.err = nil
} else {
if ctx.err != io.EOF {
opentsdbReadErrors.Inc()
ctx.err = fmt.Errorf("cannot read OpenTSDB put protocol data: %s", ctx.err)
}
return false
}
}
ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf))
// Fill in missing timestamps
currentTimestamp := time.Now().Unix()
rows := ctx.Rows.Rows
for i := range rows {
r := &rows[i]
if r.Timestamp == 0 {
r.Timestamp = currentTimestamp
}
}
// Convert timestamps from seconds to milliseconds
for i := range rows {
rows[i].Timestamp *= 1e3
}
return true
}
type pushCtx struct {
Rows Rows
Common common.InsertCtx
reqBuf []byte
tailBuf []byte
err error
}
func (ctx *pushCtx) Error() error {
if ctx.err == io.EOF {
return nil
}
return ctx.err
}
func (ctx *pushCtx) reset() {
ctx.Rows.Reset()
ctx.Common.Reset(0)
ctx.reqBuf = ctx.reqBuf[:0]
ctx.tailBuf = ctx.tailBuf[:0]
ctx.err = nil
}
var (
opentsdbReadCalls = metrics.NewCounter(`vm_read_calls_total{name="opentsdb"}`)
opentsdbReadErrors = metrics.NewCounter(`vm_read_errors_total{name="opentsdb"}`)
)
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

@@ -1,138 +0,0 @@
package opentsdb
import (
"net"
"runtime"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
"github.com/VictoriaMetrics/metrics"
)
var (
writeRequestsTCP = metrics.NewCounter(`vm_opentsdb_requests_total{name="write", net="tcp"}`)
writeErrorsTCP = metrics.NewCounter(`vm_opentsdb_request_errors_total{name="write", net="tcp"}`)
writeRequestsUDP = metrics.NewCounter(`vm_opentsdb_requests_total{name="write", net="udp"}`)
writeErrorsUDP = metrics.NewCounter(`vm_opentsdb_request_errors_total{name="write", net="udp"}`)
)
// Serve starts OpenTSDB collector on the given addr.
func Serve(addr string) {
logger.Infof("starting TCP OpenTSDB collector at %q", addr)
lnTCP, err := netutil.NewTCPListener("opentsdb", addr)
if err != nil {
logger.Fatalf("cannot start TCP OpenTSDB collector at %q: %s", addr, err)
}
listenerTCP = lnTCP
logger.Infof("starting UDP OpenTSDB collector at %q", addr)
lnUDP, err := net.ListenPacket("udp4", addr)
if err != nil {
logger.Fatalf("cannot start UDP OpenTSDB collector at %q: %s", addr, err)
}
listenerUDP = lnUDP
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
serveTCP(listenerTCP)
logger.Infof("stopped TCP OpenTSDB collector at %q", addr)
}()
wg.Add(1)
go func() {
defer wg.Done()
serveUDP(listenerUDP)
logger.Infof("stopped UDP OpenTSDB collector at %q", addr)
}()
wg.Wait()
}
func serveTCP(ln net.Listener) {
for {
c, err := ln.Accept()
if err != nil {
if ne, ok := err.(net.Error); ok {
if ne.Temporary() {
time.Sleep(time.Second)
continue
}
if strings.Contains(err.Error(), "use of closed network connection") {
break
}
logger.Fatalf("unrecoverable error when accepting TCP OpenTSDB connections: %s", err)
}
logger.Fatalf("unexpected error when accepting TCP OpenTSDB connections: %s", err)
}
go func() {
writeRequestsTCP.Inc()
if err := insertHandler(c); err != nil {
writeErrorsTCP.Inc()
logger.Errorf("error in TCP OpenTSDB conn %q<->%q: %s", c.LocalAddr(), c.RemoteAddr(), err)
}
_ = c.Close()
}()
}
}
func serveUDP(ln net.PacketConn) {
gomaxprocs := runtime.GOMAXPROCS(-1)
var wg sync.WaitGroup
for i := 0; i < gomaxprocs; i++ {
wg.Add(1)
go func() {
defer wg.Done()
var bb bytesutil.ByteBuffer
bb.B = bytesutil.Resize(bb.B, 64*1024)
for {
bb.Reset()
bb.B = bb.B[:cap(bb.B)]
n, addr, err := ln.ReadFrom(bb.B)
if err != nil {
writeErrorsUDP.Inc()
if ne, ok := err.(net.Error); ok {
if ne.Temporary() {
time.Sleep(time.Second)
continue
}
if strings.Contains(err.Error(), "use of closed network connection") {
break
}
}
logger.Errorf("cannot read OpenTSDB UDP data: %s", err)
continue
}
bb.B = bb.B[:n]
writeRequestsUDP.Inc()
if err := insertHandler(bb.NewReader()); err != nil {
writeErrorsUDP.Inc()
logger.Errorf("error in UDP OpenTSDB conn %q<->%q: %s", ln.LocalAddr(), addr, err)
continue
}
}
}()
}
wg.Wait()
}
var (
listenerTCP net.Listener
listenerUDP net.PacketConn
)
// Stop stops the server.
func Stop() {
logger.Infof("stopping TCP OpenTSDB server at %q...", listenerTCP.Addr())
if err := listenerTCP.Close(); err != nil {
logger.Errorf("cannot close TCP OpenTSDB server: %s", err)
}
logger.Infof("stopping UDP OpenTSDB server at %q...", listenerUDP.LocalAddr())
if err := listenerUDP.Close(); err != nil {
logger.Errorf("cannot close UDP OpenTSDB server: %s", err)
}
}

@@ -2,149 +2,49 @@ package opentsdbhttp
import (
"fmt"
"io"
"net/http"
"runtime"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fastjson"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentsdb-http"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="opentsdb-http"}`)
opentsdbReadCalls = metrics.NewCounter(`vm_read_calls_total{name="opentsdb-http"}`)
opentsdbReadErrors = metrics.NewCounter(`vm_read_errors_total{name="opentsdb-http"}`)
opentsdbUnmarshalErrors = metrics.NewCounter(`vm_unmarshal_errors_total{name="opentsdb-http"}`)
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentsdbhttp"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="opentsdbhttp"}`)
)
// insertHandler processes HTTP OpenTSDB put requests.
// InsertHandler processes HTTP OpenTSDB put requests.
// See http://opentsdb.net/docs/build/html/api_http/put.html
func insertHandler(req *http.Request, maxSize int64) error {
return concurrencylimiter.Do(func() error {
return insertHandlerInternal(req, maxSize)
})
func InsertHandler(req *http.Request) error {
path := req.URL.Path
switch path {
case "/api/put":
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
default:
return fmt.Errorf("unexpected path requested on HTTP OpenTSDB server: %q", path)
}
}
func insertHandlerInternal(req *http.Request, maxSize int64) error {
opentsdbReadCalls.Inc()
func insertRows(rows []parser.Row) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
r := req.Body
if req.Header.Get("Content-Encoding") == "gzip" {
zr, err := common.GetGzipReader(r)
if err != nil {
opentsdbReadErrors.Inc()
return fmt.Errorf("cannot read gzipped http protocol data: %s", err)
}
defer common.PutGzipReader(zr)
r = zr
}
ctx := getPushCtx()
defer putPushCtx(ctx)
// Read the request in ctx.reqBuf
lr := io.LimitReader(r, maxSize+1)
reqLen, err := ctx.reqBuf.ReadFrom(lr)
if err != nil {
opentsdbReadErrors.Inc()
return fmt.Errorf("cannot read HTTP OpenTSDB request: %s", err)
}
if reqLen > maxSize {
opentsdbReadErrors.Inc()
return fmt.Errorf("too big HTTP OpenTSDB request; mustn't exceed %d bytes", maxSize)
}
// Unmarshal the request to ctx.Rows
p := parserPool.Get()
defer parserPool.Put(p)
v, err := p.ParseBytes(ctx.reqBuf.B)
if err != nil {
opentsdbUnmarshalErrors.Inc()
return fmt.Errorf("cannot parse HTTP OpenTSDB json: %s", err)
}
ctx.Rows.Unmarshal(v)
// Fill in missing timestamps
currentTimestamp := time.Now().Unix()
rows := ctx.Rows.Rows
ctx.Reset(len(rows))
for i := range rows {
r := &rows[i]
if r.Timestamp == 0 {
r.Timestamp = currentTimestamp
}
}
// Convert timestamps in seconds to milliseconds if needed.
// See http://opentsdb.net/docs/javadoc/net/opentsdb/core/Const.html#SECOND_MASK
for i := range rows {
r := &rows[i]
if r.Timestamp&secondMask == 0 {
r.Timestamp *= 1e3
}
}
// Insert ctx.Rows to db.
ic := &ctx.Common
ic.Reset(len(rows))
for i := range rows {
r := &rows[i]
ic.Labels = ic.Labels[:0]
ic.AddLabel("", r.Metric)
ctx.Labels = ctx.Labels[:0]
ctx.AddLabel("", r.Metric)
for j := range r.Tags {
tag := &r.Tags[j]
ic.AddLabel(tag.Key, tag.Value)
ctx.AddLabel(tag.Key, tag.Value)
}
ic.WriteDataPoint(nil, ic.Labels, r.Timestamp, r.Value)
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ic.FlushBufs()
return ctx.FlushBufs()
}
const secondMask int64 = 0x7FFFFFFF00000000
var parserPool fastjson.ParserPool
type pushCtx struct {
Rows Rows
Common common.InsertCtx
reqBuf bytesutil.ByteBuffer
}
func (ctx *pushCtx) reset() {
ctx.Rows.Reset()
ctx.Common.Reset(0)
ctx.reqBuf.Reset()
}
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))
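The removed handler above decides between second and millisecond timestamps by testing the upper bits against `secondMask`. A minimal sketch of that check in isolation follows. The helper name `normalizeTimestamp` is hypothetical, and whether `lib/protoparser/opentsdbhttp` performs the same check after this change is an assumption, since that package is not part of this diff.

```go
package main

import "fmt"

// secondMask matches the constant in the removed code: if none of these
// bits are set, the value fits in 32 bits and is treated as seconds.
const secondMask int64 = 0x7FFFFFFF00000000

// normalizeTimestamp returns ts in milliseconds.
// Hypothetical helper for illustration only.
func normalizeTimestamp(ts int64) int64 {
	if ts&secondMask == 0 {
		return ts * 1e3
	}
	return ts
}

func main() {
	fmt.Println(normalizeTimestamp(1587045689))    // seconds input
	fmt.Println(normalizeTimestamp(1587045689123)) // milliseconds input
}
```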

@@ -1,70 +0,0 @@
package opentsdbhttp
import (
"context"
"net/http"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/metrics"
)
var (
writeRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/put", protocol="opentsdb-http"}`)
writeErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/put", protocol="opentsdb-http"}`)
)
var (
httpServer *http.Server
httpAddr string
maxRequestSize int64
)
// Serve starts HTTP OpenTSDB server on the given addr.
func Serve(addr string, maxReqSize int64) {
logger.Infof("starting HTTP OpenTSDB server at %q", addr)
httpAddr = addr
maxRequestSize = maxReqSize
httpServer = &http.Server{
Addr: addr,
Handler: http.HandlerFunc(requestHandler),
ReadTimeout: 30 * time.Second,
WriteTimeout: 10 * time.Second,
}
go func() {
err := httpServer.ListenAndServe()
if err == http.ErrServerClosed {
return
}
if err != nil {
logger.Fatalf("error serving HTTP OpenTSDB: %s", err)
}
}()
}
// requestHandler handles HTTP OpenTSDB insert request.
func requestHandler(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/put":
writeRequests.Inc()
if err := insertHandler(r, maxRequestSize); err != nil {
writeErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return
}
w.WriteHeader(http.StatusNoContent)
default:
httpserver.Errorf(w, "unexpected path requested on HTTP OpenTSDB server: %q", r.URL.Path)
}
}
// Stop stops HTTP OpenTSDB server.
func Stop() {
logger.Infof("stopping HTTP OpenTSDB server at %q...", httpAddr)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
if err := httpServer.Shutdown(ctx); err != nil {
logger.Fatalf("cannot close HTTP OpenTSDB server: %s", err)
}
}

@@ -1,112 +0,0 @@
package prometheus
import (
"fmt"
"net/http"
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="prometheus"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="prometheus"}`)
)
// InsertHandler processes remote write for prometheus.
func InsertHandler(r *http.Request, maxSize int64) error {
return concurrencylimiter.Do(func() error {
return insertHandlerInternal(r, maxSize)
})
}
func insertHandlerInternal(r *http.Request, maxSize int64) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
if err := ctx.Read(r, maxSize); err != nil {
return err
}
timeseries := ctx.req.Timeseries
rowsLen := 0
for i := range timeseries {
rowsLen += len(timeseries[i].Samples)
}
ic := &ctx.Common
ic.Reset(rowsLen)
rowsTotal := 0
for i := range timeseries {
ts := &timeseries[i]
var metricNameRaw []byte
for i := range ts.Samples {
r := &ts.Samples[i]
metricNameRaw = ic.WriteDataPointExt(metricNameRaw, ts.Labels, r.Timestamp, r.Value)
}
rowsTotal += len(ts.Samples)
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ic.FlushBufs()
}
type pushCtx struct {
Common common.InsertCtx
req prompb.WriteRequest
reqBuf []byte
}
func (ctx *pushCtx) reset() {
ctx.Common.Reset(0)
ctx.req.Reset()
ctx.reqBuf = ctx.reqBuf[:0]
}
func (ctx *pushCtx) Read(r *http.Request, maxSize int64) error {
prometheusReadCalls.Inc()
var err error
ctx.reqBuf, err = prompb.ReadSnappy(ctx.reqBuf[:0], r.Body, maxSize)
if err != nil {
prometheusReadErrors.Inc()
return fmt.Errorf("cannot read prompb.WriteRequest: %s", err)
}
if err = ctx.req.Unmarshal(ctx.reqBuf); err != nil {
prometheusUnmarshalErrors.Inc()
return fmt.Errorf("cannot unmarshal prompb.WriteRequest with size %d bytes: %s", len(ctx.reqBuf), err)
}
return nil
}
var (
prometheusReadCalls = metrics.NewCounter(`vm_read_calls_total{name="prometheus"}`)
prometheusReadErrors = metrics.NewCounter(`vm_read_errors_total{name="prometheus"}`)
prometheusUnmarshalErrors = metrics.NewCounter(`vm_unmarshal_errors_total{name="prometheus"}`)
)
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

@@ -0,0 +1,113 @@
package prompush
import (
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="promscrape"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="promscrape"}`)
)
const maxRowsPerBlock = 10000
// Push pushes wr to storage.
func Push(wr *prompbmarshal.WriteRequest) {
ctx := getPushCtx()
defer putPushCtx(ctx)
tss := wr.Timeseries
for len(tss) > 0 {
// Process big tss in smaller blocks in order to reduce maximum memory usage
tssBlock := tss
if len(tssBlock) > maxRowsPerBlock {
tssBlock = tss[:maxRowsPerBlock]
tss = tss[maxRowsPerBlock:]
} else {
tss = nil
}
ctx.push(tssBlock)
}
}
func (ctx *pushCtx) push(tss []prompbmarshal.TimeSeries) {
rowsLen := 0
for i := range tss {
rowsLen += len(tss[i].Samples)
}
ic := &ctx.Common
ic.Reset(rowsLen)
rowsTotal := 0
labels := ctx.labels[:0]
for i := range tss {
ts := &tss[i]
labels = labels[:0]
for j := range ts.Labels {
label := &ts.Labels[j]
labels = append(labels, prompb.Label{
Name: bytesutil.ToUnsafeBytes(label.Name),
Value: bytesutil.ToUnsafeBytes(label.Value),
})
}
var metricNameRaw []byte
for i := range ts.Samples {
r := &ts.Samples[i]
metricNameRaw = ic.WriteDataPointExt(metricNameRaw, labels, r.Timestamp, r.Value)
}
rowsTotal += len(ts.Samples)
}
ctx.labels = labels
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
if err := ic.FlushBufs(); err != nil {
logger.Errorf("cannot flush promscrape data to storage: %s", err)
}
}
type pushCtx struct {
Common common.InsertCtx
labels []prompb.Label
}
func (ctx *pushCtx) reset() {
ctx.Common.Reset(0)
for i := range ctx.labels {
label := &ctx.labels[i]
label.Name = nil
label.Value = nil
}
ctx.labels = ctx.labels[:0]
}
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))
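The loop in `Push` above caps per-call work by slicing `tss` into blocks of at most `maxRowsPerBlock` entries. The same slicing pattern can be sketched on plain integers. `splitBlocks` is a hypothetical helper that collects the blocks instead of pushing them, and it assumes `maxPerBlock` is positive.

```go
package main

import "fmt"

// splitBlocks cuts items into consecutive blocks of at most maxPerBlock
// elements, using the same slicing pattern as Push.
// maxPerBlock must be positive, otherwise the loop never terminates.
func splitBlocks(items []int, maxPerBlock int) [][]int {
	var blocks [][]int
	for len(items) > 0 {
		block := items
		if len(block) > maxPerBlock {
			block = items[:maxPerBlock]
			items = items[maxPerBlock:]
		} else {
			items = nil
		}
		blocks = append(blocks, block)
	}
	return blocks
}

func main() {
	for _, b := range splitBlocks([]int{1, 2, 3, 4, 5, 6, 7}, 3) {
		fmt.Println(len(b), b)
	}
}
```

The blocks share the backing array of the input, so no samples are copied; only slice headers are created.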

@@ -0,0 +1,47 @@
package promremotewrite
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="promremotewrite"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="promremotewrite"}`)
)
// InsertHandler processes remote write for prometheus.
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(timeseries []prompb.TimeSeries) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
rowsLen := 0
for i := range timeseries {
rowsLen += len(timeseries[i].Samples)
}
ctx.Reset(rowsLen)
rowsTotal := 0
for i := range timeseries {
ts := &timeseries[i]
var metricNameRaw []byte
for i := range ts.Samples {
r := &ts.Samples[i]
metricNameRaw = ctx.WriteDataPointExt(metricNameRaw, ts.Labels, r.Timestamp, r.Value)
}
rowsTotal += len(ts.Samples)
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ctx.FlushBufs()
}

@@ -0,0 +1,94 @@
package vmimport
import (
"net/http"
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="vmimport"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="vmimport"}`)
)
// InsertHandler processes `/api/v1/import` request.
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
rowsLen := 0
for i := range rows {
rowsLen += len(rows[i].Values)
}
ic := &ctx.Common
ic.Reset(rowsLen)
rowsTotal := 0
for i := range rows {
r := &rows[i]
ic.Labels = ic.Labels[:0]
for j := range r.Tags {
tag := &r.Tags[j]
ic.AddLabelBytes(tag.Key, tag.Value)
}
ctx.metricNameBuf = storage.MarshalMetricNameRaw(ctx.metricNameBuf[:0], ic.Labels)
values := r.Values
timestamps := r.Timestamps
_ = timestamps[len(values)-1]
for j, value := range values {
timestamp := timestamps[j]
ic.WriteDataPoint(ctx.metricNameBuf, nil, timestamp, value)
}
rowsTotal += len(values)
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ic.FlushBufs()
}
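
Note on the `_ = timestamps[len(values)-1]` line in `insertRows` above: it indexes the last element that the following loop will touch, so a length mismatch between `Timestamps` and `Values` fails once, before any data point is written, and the compiler may be able to drop per-iteration bounds checks. The sketch below shows the same idiom in isolation. `sumPairs` is a hypothetical function, not part of the repository, and the explicit empty-slice guard is an addition: the original line would index `-1` for a row with no values, so it presumably relies on the parser never producing such rows.

```go
package main

import "fmt"

// sumPairs is a hypothetical helper that multiplies each value by its
// timestamp and sums the results. It exists only to show the idiom.
func sumPairs(timestamps []int64, values []float64) float64 {
	if len(values) == 0 {
		// Guard added for the sketch: indexing len(values)-1 would panic here.
		return 0
	}
	// Touch the last index up front. This panics immediately if timestamps
	// is shorter than values, instead of failing halfway through the loop.
	_ = timestamps[len(values)-1]
	var sum float64
	for j, v := range values {
		sum += float64(timestamps[j]) * v
	}
	return sum
}

func main() {
	fmt.Println(sumPairs([]int64{1, 2}, []float64{3, 4}))
}
```

Whether the bounds checks are actually eliminated depends on the compiler version; the early failure on mismatched lengths holds regardless.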
type pushCtx struct {
Common common.InsertCtx
metricNameBuf []byte
}
func (ctx *pushCtx) reset() {
ctx.Common.Reset(0)
ctx.metricNameBuf = ctx.metricNameBuf[:0]
}
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))
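
Note on `getPushCtx` / `putPushCtx`: both new files use the same two-level pool, a buffered channel sized to `GOMAXPROCS` in front of a `sync.Pool`. A minimal, self-contained sketch of that pattern follows. The `buf` type and the `getBuf` / `putBuf` names are hypothetical stand-ins for `pushCtx` and its helpers; the rationale in the comments (the channel keeps a bounded set of objects alive across garbage collections, while `sync.Pool` absorbs the overflow) is an interpretation, not something stated in the diff.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// buf is a hypothetical stand-in for pushCtx.
type buf struct {
	data []byte
}

// reset clears the contents but keeps the allocated capacity for reuse.
func (b *buf) reset() { b.data = b.data[:0] }

var (
	// bufPool absorbs objects that do not fit into the channel.
	// Its contents may be dropped by the garbage collector.
	bufPool sync.Pool
	// bufPoolCh holds at most GOMAXPROCS objects, which are not subject
	// to sync.Pool eviction.
	bufPoolCh = make(chan *buf, runtime.GOMAXPROCS(-1))
)

// getBuf prefers the channel, then the sync.Pool, then a fresh allocation.
func getBuf() *buf {
	select {
	case b := <-bufPoolCh:
		return b
	default:
		if v := bufPool.Get(); v != nil {
			return v.(*buf)
		}
		return &buf{}
	}
}

// putBuf resets b and returns it to the channel, or to the sync.Pool
// if the channel is full.
func putBuf(b *buf) {
	b.reset()
	select {
	case bufPoolCh <- b:
	default:
		bufPool.Put(b)
	}
}

func main() {
	b := getBuf()
	b.data = append(b.data, "hello"...)
	putBuf(b)
	again := getBuf()
	fmt.Println(again == b, len(again.data))
}
```

Both `select` statements use a `default` branch, so neither getting nor putting an object ever blocks.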

@@ -3,35 +3,68 @@
vmrestore:
APP_NAME=vmrestore $(MAKE) app-local
vmrestore-race:
APP_NAME=vmrestore RACE=-race $(MAKE) app-local
vmrestore-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker
vmrestore-pure-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-pure
vmrestore-amd64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-amd64
vmrestore-arm-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-arm
vmrestore-arm64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-arm64
vmrestore-ppc64le-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-ppc64le
vmrestore-386-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-386
package-vmrestore:
APP_NAME=vmrestore $(MAKE) package-via-docker
package-vmrestore-pure:
APP_NAME=vmrestore $(MAKE) package-via-docker-pure
package-vmrestore-amd64:
APP_NAME=vmrestore $(MAKE) package-via-docker-amd64
package-vmrestore-arm:
APP_NAME=vmrestore $(MAKE) package-via-docker-arm
package-vmrestore-arm64:
APP_NAME=vmrestore $(MAKE) package-via-docker-arm64
package-vmrestore-ppc64le:
APP_NAME=vmrestore $(MAKE) package-via-docker-ppc64le
package-vmrestore-386:
APP_NAME=vmrestore $(MAKE) package-via-docker-386
publish-vmrestore:
APP_NAME=vmrestore $(MAKE) publish-via-docker
vmrestore-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-arm ./app/vmrestore
vmrestore-arm-prod:
APP_NAME=vmrestore APP_SUFFIX='-arm' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm' $(MAKE) app-via-docker
vmrestore-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-arm64 ./app/vmrestore
vmrestore-arm64-prod:
APP_NAME=vmrestore APP_SUFFIX='-arm64' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm64' $(MAKE) app-via-docker
vmrestore-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-386 ./app/vmrestore
vmrestore-386-prod:
APP_NAME=vmrestore APP_SUFFIX='-386' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=386' $(MAKE) app-via-docker
vmrestore-pure:
APP_NAME=vmrestore $(MAKE) app-local-pure
vmrestore-pure-prod:
APP_NAME=vmrestore APP_SUFFIX='-pure' DOCKER_OPTS='--env CGO_ENABLED=0' $(MAKE) app-via-docker
vmrestore-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-amd64 ./app/vmrestore
vmrestore-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-arm ./app/vmrestore
vmrestore-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-arm64 ./app/vmrestore
vmrestore-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-ppc64le ./app/vmrestore
vmrestore-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-386 ./app/vmrestore

@@ -24,25 +24,33 @@ vmrestore -src=gcs://<bucket>/<path/to/backup> -storageDataPath=<local/path/to/r
The original `-storageDataPath` directory may contain old files. They will be substituted by the files from backup.
### Troubleshooting
* If `vmrestore` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value.
* If `vmrestore` has been interrupted due to temporary error, then just restart it with the same args. It will resume the restore process.
### Advanced usage
Run `vmrestore -help` in order to see all the available options:
```
vmrestore restores VictoriaMetrics data from backups made by vmbackup.
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md .
-concurrency int
The number of concurrent workers. Higher concurrency may reduce restore duration (default 10)
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs (default "default")
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, ERROR, FATAL, PANIC (default "INFO")
-maxBytesPerSecond int
The maximum download speed. There is no limit if it is set to 0
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy (default 60)
-src string
@@ -61,7 +69,7 @@ It is recommended using [binary releases](https://github.com/VictoriaMetrics/Vic
#### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.13.
2. Run `make vmrestore` from the root folder of the repository.
It builds `vmrestore` binary and puts it into the `bin` folder.

@@ -1,5 +1,6 @@
FROM scratch
COPY --from=local/certs:1.0.2 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY bin/vmrestore-prod .
EXPOSE 8428
ARG base_image
FROM $base_image
ENTRYPOINT ["/vmrestore-prod"]
ARG src_binary
COPY $src_binary ./vmrestore-prod

@@ -8,6 +8,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fslocal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
@@ -16,12 +17,14 @@ var (
"Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir")
storageDataPath = flag.String("storageDataPath", "victoria-metrics-data", "Destination path where backup must be restored. "+
"VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case only missing data is downloaded from backup")
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce restore duration")
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce restore duration")
maxBytesPerSecond = flag.Int("maxBytesPerSecond", 0, "The maximum download speed. There is no limit if it is set to 0")
skipBackupCompleteCheck = flag.Bool("skipBackupCompleteCheck", false, "Whether to skip checking for 'backup complete' file in -src. This may be useful for restoring from old backups, which were created without 'backup complete' file")
)
func main() {
flag.Usage = usage
flag.Parse()
envflag.Parse()
buildinfo.Init()
srcFS, err := newSrcFS()
@@ -33,9 +36,10 @@ func main() {
logger.Fatalf("%s", err)
}
a := &actions.Restore{
Concurrency: *concurrency,
Src: srcFS,
Dst: dstFS,
Concurrency: *concurrency,
Src: srcFS,
Dst: dstFS,
SkipBackupCompleteCheck: *skipBackupCompleteCheck,
}
if err := a.Run(); err != nil {
logger.Fatalf("cannot restore from backup: %s", err)
@@ -59,7 +63,11 @@ func newDstFS() (*fslocal.FS, error) {
return nil, fmt.Errorf("`-storageDataPath` cannot be empty")
}
fs := &fslocal.FS{
Dir: *storageDataPath,
Dir: *storageDataPath,
MaxBytesPerSecond: *maxBytesPerSecond,
}
if err := fs.Init(); err != nil {
return nil, fmt.Errorf("cannot initialize local fs: %s", err)
}
return fs, nil
}

@@ -21,10 +21,26 @@ import (
var (
deleteAuthKey = flag.String("deleteAuthKey", "", "authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series")
maxConcurrentRequests = flag.Int("search.maxConcurrentRequests", runtime.GOMAXPROCS(-1)*2, "The maximum number of concurrent search requests. It shouldn't exceed 2*vCPUs for better performance. See also -search.maxQueueDuration")
maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached")
maxConcurrentRequests = flag.Int("search.maxConcurrentRequests", getDefaultMaxConcurrentRequests(), "The maximum number of concurrent search requests. "+
"It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration")
maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached")
resetCacheAuthKey = flag.String("search.resetCacheAuthKey", "", "Optional authKey for resetting rollup cache via /internal/resetCache call")
)
func getDefaultMaxConcurrentRequests() int {
n := runtime.GOMAXPROCS(-1)
if n <= 4 {
n *= 2
}
if n > 16 {
// A single request can saturate all the CPU cores, so there is no sense
// in allowing higher number of concurrent requests - they will just contend
// for unavailable CPU time.
n = 16
}
return n
}
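
Note on `getDefaultMaxConcurrentRequests`: the new default doubles the CPU count only on small machines and caps the result at 16. The sketch below reproduces the same arithmetic with the CPU count passed as an argument so the values can be inspected; `defaultMaxConcurrentRequests` is a hypothetical name used only for this illustration.

```go
package main

import "fmt"

// defaultMaxConcurrentRequests mirrors the arithmetic of
// getDefaultMaxConcurrentRequests, but takes the CPU count as a parameter
// instead of reading runtime.GOMAXPROCS.
func defaultMaxConcurrentRequests(cpus int) int {
	n := cpus
	if n <= 4 {
		// Small machines get twice as many slots as CPUs.
		n *= 2
	}
	if n > 16 {
		// Cap the limit, since a single request can saturate all the cores.
		n = 16
	}
	return n
}

func main() {
	for _, cpus := range []int{1, 2, 4, 5, 8, 16, 32} {
		fmt.Printf("cpus=%d limit=%d\n", cpus, defaultMaxConcurrentRequests(cpus))
	}
}
```

One consequence of the `n <= 4` threshold is that the default is not monotonic: 4 CPUs yield a limit of 8, while 5 CPUs yield 5.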
// Init initializes vmselect
func Init() {
tmpDirPath := *vmstorage.DataPath + "/tmp"
@@ -56,6 +72,7 @@ var (
// RequestHandler handles remote read API requests for Prometheus
func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
startTime := time.Now()
// Limit the number of concurrent queries.
select {
case concurrencyCh <- struct{}{}:
@@ -72,7 +89,9 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
timerpool.Put(t)
concurrencyLimitTimeout.Inc()
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot handle more than %d concurrent requests", cap(concurrencyCh)),
Err: fmt.Errorf("cannot handle more than %d concurrent search requests during %s; possible solutions: "+
"increase `-search.maxQueueDuration`, increase `-search.maxConcurrentRequests`, increase server capacity",
*maxConcurrentRequests, *maxQueueDuration),
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, "%s", err)
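
Note on the concurrency limit in `RequestHandler`: the visible hunks show a buffered channel (`concurrencyCh`) used as a semaphore, with a timer bounding how long a request waits for a free slot. A self-contained sketch of that pattern follows. The `limiter` type, `errQueueTimeout`, and the method names are hypothetical, and `time.NewTimer` is swapped in for the repository's `timerpool` package to keep the example dependency-free.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errQueueTimeout is a hypothetical sentinel error for this sketch.
var errQueueTimeout = errors.New("concurrency limit reached")

// limiter is a hypothetical type; the diff uses a package-level channel instead.
type limiter struct {
	ch       chan struct{}
	maxQueue time.Duration
}

func newLimiter(maxConcurrent int, maxQueue time.Duration) *limiter {
	return &limiter{ch: make(chan struct{}, maxConcurrent), maxQueue: maxQueue}
}

// acquire takes a slot immediately if one is free,
// otherwise it waits for up to maxQueue before giving up.
func (l *limiter) acquire() error {
	select {
	case l.ch <- struct{}{}:
		return nil
	default:
	}
	t := time.NewTimer(l.maxQueue)
	defer t.Stop()
	select {
	case l.ch <- struct{}{}:
		return nil
	case <-t.C:
		return fmt.Errorf("%w: cannot handle more than %d concurrent requests during %s",
			errQueueTimeout, cap(l.ch), l.maxQueue)
	}
}

// release frees a slot taken by a successful acquire.
func (l *limiter) release() { <-l.ch }

func main() {
	l := newLimiter(1, 10*time.Millisecond)
	fmt.Println(l.acquire()) // the slot is free
	fmt.Println(l.acquire()) // the slot is busy, so this call times out
	l.release()
	fmt.Println(l.acquire()) // the slot is free again
}
```

A caller that gets an error from `acquire` would typically respond with HTTP 503, as the handler in the diff does via `http.StatusServiceUnavailable`.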
@@ -81,13 +100,22 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
path := strings.Replace(r.URL.Path, "//", "/", -1)
if path == "/internal/resetRollupResultCache" {
if len(*resetCacheAuthKey) > 0 && r.FormValue("authKey") != *resetCacheAuthKey {
sendPrometheusError(w, r, fmt.Errorf("invalid authKey=%q for %q", r.FormValue("authKey"), path))
return true
}
promql.ResetRollupResultCache()
return true
}
if strings.HasPrefix(path, "/api/v1/label/") {
s := r.URL.Path[len("/api/v1/label/"):]
if strings.HasSuffix(s, "/values") {
labelValuesRequests.Inc()
labelName := s[:len(s)-len("/values")]
httpserver.EnableCORS(w, r)
if err := prometheus.LabelValuesHandler(labelName, w, r); err != nil {
if err := prometheus.LabelValuesHandler(startTime, labelName, w, r); err != nil {
labelValuesErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -100,7 +128,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/query":
queryRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.QueryHandler(w, r); err != nil {
if err := prometheus.QueryHandler(startTime, w, r); err != nil {
queryErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -109,7 +137,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/query_range":
queryRangeRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.QueryRangeHandler(w, r); err != nil {
if err := prometheus.QueryRangeHandler(startTime, w, r); err != nil {
queryRangeErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -118,7 +146,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/series":
seriesRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.SeriesHandler(w, r); err != nil {
if err := prometheus.SeriesHandler(startTime, w, r); err != nil {
seriesErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -127,7 +155,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/series/count":
seriesCountRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.SeriesCountHandler(w, r); err != nil {
if err := prometheus.SeriesCountHandler(startTime, w, r); err != nil {
seriesCountErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -136,7 +164,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/labels":
labelsRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.LabelsHandler(w, r); err != nil {
if err := prometheus.LabelsHandler(startTime, w, r); err != nil {
labelsErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -145,7 +173,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/labels/count":
labelsCountRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.LabelsCountHandler(w, r); err != nil {
if err := prometheus.LabelsCountHandler(startTime, w, r); err != nil {
labelsCountErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -153,7 +181,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
case "/api/v1/export":
exportRequests.Inc()
if err := prometheus.ExportHandler(w, r); err != nil {
if err := prometheus.ExportHandler(startTime, w, r); err != nil {
exportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
@@ -161,12 +189,30 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
case "/federate":
federateRequests.Inc()
if err := prometheus.FederateHandler(w, r); err != nil {
if err := prometheus.FederateHandler(startTime, w, r); err != nil {
federateErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
return true
case "/api/v1/rules":
// Return dumb placeholder
rulesRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"success","data":{"groups":[]}}`)
return true
case "/api/v1/alerts":
// Return dumb placeholder
alertsRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"success","data":{"alerts":[]}}`)
return true
case "/api/v1/metadata":
// Return dumb placeholder
metadataRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"success","data":{}}`)
return true
case "/api/v1/admin/tsdb/delete_series":
deleteRequests.Inc()
authKey := r.FormValue("authKey")
@@ -174,7 +220,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
httpserver.Errorf(w, "invalid authKey %q. It must match the value from -deleteAuthKey command line flag", authKey)
return true
}
if err := prometheus.DeleteHandler(r); err != nil {
if err := prometheus.DeleteHandler(startTime, r); err != nil {
deleteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
@@ -187,7 +233,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
func sendPrometheusError(w http.ResponseWriter, r *http.Request, err error) {
logger.Errorf("error in %q: %s", r.URL.Path, err)
logger.Warnf("error in %q: %s", r.RequestURI, err)
w.Header().Set("Content-Type", "application/json")
statusCode := http.StatusUnprocessableEntity
@@ -228,4 +274,8 @@ var (
federateRequests = metrics.NewCounter(`vm_http_requests_total{path="/federate"}`)
federateErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/federate"}`)
rulesRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/rules"}`)
alertsRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/alerts"}`)
metadataRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/metadata"}`)
)

@@ -1,9 +0,0 @@
package netstorage
import (
"os"
)
func mustFadviseSequentialRead(f *os.File) {
// Do nothing :)
}

@@ -1,15 +0,0 @@
package netstorage
import (
"os"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"golang.org/x/sys/unix"
)
func mustFadviseSequentialRead(f *os.File) {
fd := int(f.Fd())
if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_SEQUENTIAL|unix.FADV_WILLNEED); err != nil {
logger.Panicf("FATAL: error returned from unix.Fadvise(SEQUENTIAL|WILLNEED): %s", err)
}
}

@@ -1,15 +0,0 @@
package netstorage
import (
"os"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"golang.org/x/sys/unix"
)
func mustFadviseSequentialRead(f *os.File) {
fd := int(f.Fd())
if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_SEQUENTIAL|unix.FADV_WILLNEED); err != nil {
logger.Panicf("FATAL: error returned from unix.Fadvise(SEQUENTIAL|WILLNEED): %s", err)
}
}

@@ -92,6 +92,7 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
doneCh := make(chan error)
// Start workers.
rowsProcessedTotal := uint64(0)
for i := 0; i < workersCount; i++ {
go func(workerID uint) {
rs := getResult()
@@ -99,9 +100,10 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
maxWorkersCount := gomaxprocs / workersCount
var err error
rowsProcessed := 0
for pts := range workCh {
if time.Until(rss.deadline.Deadline) < 0 {
err = fmt.Errorf("timeout exceeded during query execution: %s", rss.deadline.Timeout)
err = fmt.Errorf("timeout exceeded during query execution: %s", rss.deadline.String())
break
}
if err = pts.Unpack(rss.tbf, rs, rss.tr, rss.fetchData, maxWorkersCount); err != nil {
@@ -111,8 +113,10 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
// Skip empty blocks.
continue
}
rowsProcessed += len(rs.Values)
f(rs, workerID)
}
atomic.AddUint64(&rowsProcessedTotal, uint64(rowsProcessed))
// Drain the remaining work
for range workCh {
}
@@ -124,6 +128,7 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
for i := range rss.packedTimeseries {
workCh <- &rss.packedTimeseries[i]
}
seriesProcessedTotal := len(rss.packedTimeseries)
rss.packedTimeseries = rss.packedTimeseries[:0]
close(workCh)
@@ -134,6 +139,8 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
errors = append(errors, err)
}
}
perQueryRowsProcessed.Update(float64(rowsProcessedTotal))
perQuerySeriesProcessed.Update(float64(seriesProcessedTotal))
if len(errors) > 0 {
// Return just the first error, since other errors
// are likely duplicates of the first error.
@@ -142,6 +149,9 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
return nil
}
var perQueryRowsProcessed = metrics.NewHistogram(`vm_per_query_rows_processed_count`)
var perQuerySeriesProcessed = metrics.NewHistogram(`vm_per_query_series_processed_count`)
var gomaxprocs = runtime.GOMAXPROCS(-1)
type packedTimeseries struct {
@@ -256,7 +266,7 @@ func mergeSortBlocks(dst *Result, sbh sortBlocksHeap) {
dst.Timestamps = append(dst.Timestamps, top.Timestamps[top.NextIdx:]...)
dst.Values = append(dst.Values, top.Values[top.NextIdx:]...)
putSortBlock(top)
return
break
}
sbNext := sbh[0]
tsNext := sbNext.Timestamps[sbNext.NextIdx]
@@ -277,8 +287,16 @@ func mergeSortBlocks(dst *Result, sbh sortBlocksHeap) {
putSortBlock(top)
}
}
timestamps, values := storage.DeduplicateSamples(dst.Timestamps, dst.Values)
dedups := len(dst.Timestamps) - len(timestamps)
dedupsDuringSelect.Add(dedups)
dst.Timestamps = timestamps
dst.Values = values
}
var dedupsDuringSelect = metrics.NewCounter(`vm_deduplicated_samples_total{type="select"}`)
type sortBlock struct {
// b is used as a temporary storage for unpacked rows before they
// go to Timestamps and Values.
@@ -422,13 +440,10 @@ func GetLabelEntries(deadline Deadline) ([]storage.TagEntry, error) {
// Sort labelEntries by the number of label values in each entry.
sort.Slice(labelEntries, func(i, j int) bool {
a, b := labelEntries[i].Values, labelEntries[j].Values
if len(a) < len(b) {
return true
if len(a) != len(b) {
return len(a) > len(b)
}
if len(a) > len(b) {
return false
}
return labelEntries[i].Key < labelEntries[j].Key
return labelEntries[i].Key > labelEntries[j].Key
})
return labelEntries, nil
@@ -452,16 +467,12 @@ func getStorageSearch() *storage.Search {
}
func putStorageSearch(sr *storage.Search) {
n := atomic.LoadUint64(&sr.MissingMetricNamesForMetricID)
missingMetricNamesForMetricID.Add(int(n))
sr.MustClose()
ssPool.Put(sr)
}
var ssPool sync.Pool
var missingMetricNamesForMetricID = metrics.NewCounter(`vm_missing_metric_names_for_metric_id_total`)
// ProcessSearchQuery performs sq on storage nodes until the given deadline.
func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadline) (*Results, error) {
// Setup search.
@@ -496,7 +507,7 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadli
}
if time.Until(deadline.Deadline) < 0 {
putTmpBlocksFile(tbf)
return nil, fmt.Errorf("timeout exceeded while fetching data block #%d from storage: %s", blocksRead, deadline.Timeout)
return nil, fmt.Errorf("timeout exceeded while fetching data block #%d from storage: %s", blocksRead, deadline.String())
}
metricName := sr.MetricBlock.MetricName
m[string(metricName)] = append(m[string(metricName)], addr)
@@ -565,6 +576,7 @@ func setupTfss(tagFilterss [][]storage.TagFilter) ([]*storage.TagFilters, error)
}
}
tfss = append(tfss, tfs)
tfss = append(tfss, tfs.Finalize()...)
}
return tfss, nil
}
@@ -572,13 +584,24 @@ func setupTfss(tagFilterss [][]storage.TagFilter) ([]*storage.TagFilters, error)
// Deadline contains deadline with the corresponding timeout for pretty error messages.
type Deadline struct {
Deadline time.Time
Timeout time.Duration
timeout time.Duration
flagHint string
}
// NewDeadline returns deadline for the given timeout.
func NewDeadline(timeout time.Duration) Deadline {
//
// flagHint must contain a hint for command-line flag, which could be used
// in order to increase timeout.
func NewDeadline(timeout time.Duration, flagHint string) Deadline {
return Deadline{
Deadline: time.Now().Add(timeout),
Timeout: timeout,
timeout: timeout,
flagHint: flagHint,
}
}
// String returns human-readable string representation for d.
func (d *Deadline) String() string {
return fmt.Sprintf("%.3f seconds; the timeout can be adjusted with `%s` command-line flag", d.timeout.Seconds(), d.flagHint)
}
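
Note on the reworked `Deadline`: the timeout is now unexported and paired with a `flagHint`, so timeout errors can tell the operator which flag to raise. The sketch below copies the new type into a standalone program to show the resulting message. Passing `-search.maxQueryDuration` as the hint is an assumption based on the flag defined elsewhere in this diff; the call sites that construct deadlines (`getDeadlineForQuery`, `getDeadlineForExport`) are not visible here.

```go
package main

import (
	"fmt"
	"time"
)

// Deadline contains deadline with the corresponding timeout for pretty error messages.
type Deadline struct {
	Deadline time.Time

	timeout  time.Duration
	flagHint string
}

// NewDeadline returns deadline for the given timeout.
func NewDeadline(timeout time.Duration, flagHint string) Deadline {
	return Deadline{
		Deadline: time.Now().Add(timeout),
		timeout:  timeout,
		flagHint: flagHint,
	}
}

// String returns human-readable string representation for d.
func (d *Deadline) String() string {
	return fmt.Sprintf("%.3f seconds; the timeout can be adjusted with `%s` command-line flag",
		d.timeout.Seconds(), d.flagHint)
}

func main() {
	// The flag name passed here is an assumption made for the example.
	d := NewDeadline(30*time.Second, "-search.maxQueryDuration")
	err := fmt.Errorf("timeout exceeded during query execution: %s", d.String())
	fmt.Println(err)
}
```

Because `String` formats only the stored timeout and hint, the message does not depend on when it is produced.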

@@ -36,6 +36,9 @@ func maxInmemoryTmpBlocksFile() int {
if maxLen < 64*1024 {
return 64 * 1024
}
if maxLen > 4*1024*1024 {
return 4 * 1024 * 1024
}
return maxLen
}
@@ -47,6 +50,7 @@ type tmpBlocksFile struct {
buf []byte
f *os.File
r *fs.ReaderAt
offset uint64
}
@@ -65,6 +69,7 @@ func putTmpBlocksFile(tbf *tmpBlocksFile) {
tbf.MustClose()
tbf.buf = tbf.buf[:0]
tbf.f = nil
tbf.r = nil
tbf.offset = 0
tmpBlocksFilePool.Put(tbf)
}
@@ -118,17 +123,20 @@ func (tbf *tmpBlocksFile) Finalize() error {
if tbf.f == nil {
return nil
}
fname := tbf.f.Name()
if _, err := tbf.f.Write(tbf.buf); err != nil {
return fmt.Errorf("cannot flush the remaining %d bytes to tmpBlocksFile: %s", len(tbf.buf), err)
return fmt.Errorf("cannot write the remaining %d bytes to %q: %s", len(tbf.buf), fname, err)
}
tbf.buf = tbf.buf[:0]
if _, err := tbf.f.Seek(0, 0); err != nil {
logger.Panicf("FATAL: cannot seek to the start of file: %s", err)
r, err := fs.OpenReaderAt(fname)
if err != nil {
logger.Panicf("FATAL: cannot open %q: %s", fname, err)
}
// Hint the OS that the file is read almost sequentially.
// This should reduce the number of disk seeks, which is important
// for HDDs.
mustFadviseSequentialRead(tbf.f)
r.MustFadviseSequentialRead(true)
tbf.r = r
return nil
}
@@ -140,13 +148,7 @@ func (tbf *tmpBlocksFile) MustReadBlockAt(dst *storage.Block, addr tmpBlockAddr)
bb := tmpBufPool.Get()
defer tmpBufPool.Put(bb)
bb.B = bytesutil.Resize(bb.B, addr.size)
n, err := tbf.f.ReadAt(bb.B, int64(addr.offset))
if err != nil {
logger.Panicf("FATAL: cannot read from %q at %s: %s", tbf.f.Name(), addr, err)
}
if n != len(bb.B) {
logger.Panicf("FATAL: too short number of bytes read at %s; got %d; want %d", addr, n, len(bb.B))
}
tbf.r.MustReadAt(bb.B, int64(addr.offset))
buf = bb.B
}
tail, err := storage.UnmarshalBlock(dst, buf)
@@ -164,6 +166,10 @@ func (tbf *tmpBlocksFile) MustClose() {
if tbf.f == nil {
return
}
if tbf.r != nil {
// tbf.r could be nil if Finalize wasn't called.
tbf.r.MustClose()
}
fname := tbf.f.Name()
// Remove the file at first, then close it.

@@ -3,6 +3,7 @@ package prometheus
import (
"flag"
"fmt"
"io"
"math"
"net/http"
"runtime"
@@ -15,26 +16,28 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fastjson/fastfloat"
"github.com/valyala/quicktemplate"
)
var (
latencyOffset = flag.Duration("search.latencyOffset", time.Second*60, "The time when data points become visible in query results after the collection. "+
latencyOffset = flag.Duration("search.latencyOffset", time.Second*30, "The time when data points become visible in query results after the collection. "+
"Too small value can result in incomplete last points for query results")
maxQueryDuration = flag.Duration("search.maxQueryDuration", time.Second*30, "The maximum time for search query execution")
maxQueryLen = flag.Int("search.maxQueryLen", 16*1024, "The maximum search query length in bytes")
maxLookback = flag.Duration("search.maxLookback", 0, "Synonym to `-search.lookback-delta` from Prometheus. "+
"The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via `max_lookback` arg")
maxExportDuration = flag.Duration("search.maxExportDuration", time.Hour*24*30, "The maximum duration for /api/v1/export call")
maxQueryDuration = flag.Duration("search.maxQueryDuration", time.Second*30, "The maximum duration for search query execution")
maxQueryLen = flag.Int("search.maxQueryLen", 16*1024, "The maximum search query length in bytes")
maxLookback = flag.Duration("search.maxLookback", 0, "Synonym to -search.lookback-delta from Prometheus. "+
"The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg")
)
// Default step used if not set.
const defaultStep = 5 * 60 * 1000
// FederateHandler implements /federate . See https://prometheus.io/docs/prometheus/latest/federation/
func FederateHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
func FederateHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
ct := currentTime()
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse request form values: %s", err)
@@ -58,7 +61,7 @@ func FederateHandler(w http.ResponseWriter, r *http.Request) error {
if err != nil {
return err
}
deadline := getDeadline(r)
deadline := getDeadlineForQuery(r)
if start >= end {
start = end - defaultStep
}
@@ -105,8 +108,7 @@ func FederateHandler(w http.ResponseWriter, r *http.Request) error {
var federateDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/federate"}`)
// ExportHandler exports data in raw format from /api/v1/export.
func ExportHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
func ExportHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
ct := currentTime()
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse request form values: %s", err)
@@ -129,12 +131,13 @@ func ExportHandler(w http.ResponseWriter, r *http.Request) error {
return err
}
format := r.FormValue("format")
deadline := getDeadline(r)
maxRowsPerLine := int(fastfloat.ParseInt64BestEffort(r.FormValue("max_rows_per_line")))
deadline := getDeadlineForExport(r)
if start >= end {
start = end - defaultStep
end = start + defaultStep
}
if err := exportHandler(w, matches, start, end, format, deadline); err != nil {
return err
if err := exportHandler(w, matches, start, end, format, maxRowsPerLine, deadline); err != nil {
return fmt.Errorf("error when exporting data for queries=%q on the time range (start=%d, end=%d): %s", matches, start, end, err)
}
exportDuration.UpdateDuration(startTime)
return nil
@@ -142,10 +145,38 @@ func ExportHandler(w http.ResponseWriter, r *http.Request) error {
var exportDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/export"}`)
func exportHandler(w http.ResponseWriter, matches []string, start, end int64, format string, deadline netstorage.Deadline) error {
func exportHandler(w http.ResponseWriter, matches []string, start, end int64, format string, maxRowsPerLine int, deadline netstorage.Deadline) error {
writeResponseFunc := WriteExportStdResponse
writeLineFunc := WriteExportJSONLine
contentType := "application/json"
if maxRowsPerLine > 0 {
writeLineFunc = func(w io.Writer, rs *netstorage.Result) {
valuesOrig := rs.Values
timestampsOrig := rs.Timestamps
values := valuesOrig
timestamps := timestampsOrig
for len(values) > 0 {
var valuesChunk []float64
var timestampsChunk []int64
if len(values) > maxRowsPerLine {
valuesChunk = values[:maxRowsPerLine]
timestampsChunk = timestamps[:maxRowsPerLine]
values = values[maxRowsPerLine:]
timestamps = timestamps[maxRowsPerLine:]
} else {
valuesChunk = values
timestampsChunk = timestamps
values = nil
timestamps = nil
}
rs.Values = valuesChunk
rs.Timestamps = timestampsChunk
WriteExportJSONLine(w, rs)
}
rs.Values = valuesOrig
rs.Timestamps = timestampsOrig
}
}
contentType := "application/stream+json"
if format == "prometheus" {
contentType = "text/plain"
writeLineFunc = WriteExportPrometheusLine
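
Note on `max_rows_per_line`: when the argument is positive, the new `writeLineFunc` emits each series as several lines of at most `maxRowsPerLine` samples, then restores the original slices on the result. The chunking itself can be sketched as below. `splitRows` and its `emit` callback are hypothetical names; the sketch assumes `timestamps` and `values` have equal length, as the code in the diff does.

```go
package main

import "fmt"

// splitRows is a hypothetical helper that calls emit for consecutive chunks
// of at most maxRows samples. With maxRows <= 0 it emits everything at once,
// which matches the behavior when max_rows_per_line is not set.
func splitRows(timestamps []int64, values []float64, maxRows int, emit func(ts []int64, vs []float64)) {
	if maxRows <= 0 {
		emit(timestamps, values)
		return
	}
	for len(values) > 0 {
		n := maxRows
		if len(values) < n {
			n = len(values)
		}
		// The chunks are sub-slices of the input, so no samples are copied.
		emit(timestamps[:n], values[:n])
		timestamps, values = timestamps[n:], values[n:]
	}
}

func main() {
	ts := []int64{1, 2, 3, 4, 5}
	vs := []float64{10, 20, 30, 40, 50}
	splitRows(ts, vs, 2, func(t []int64, v []float64) {
		fmt.Println(t, v)
	})
}
```

As in the diff, a series with no samples produces no output lines when chunking is enabled.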
@@ -198,8 +229,7 @@ func exportHandler(w http.ResponseWriter, matches []string, start, end int64, fo
// DeleteHandler processes /api/v1/admin/tsdb/delete_series prometheus API request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#delete-series
func DeleteHandler(r *http.Request) error {
startTime := time.Now()
func DeleteHandler(startTime time.Time, r *http.Request) error {
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse request form values: %s", err)
}
@@ -233,9 +263,8 @@ var deleteDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/
// LabelValuesHandler processes /api/v1/label/<labelName>/values request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values
func LabelValuesHandler(labelName string, w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
deadline := getDeadline(r)
func LabelValuesHandler(startTime time.Time, labelName string, w http.ResponseWriter, r *http.Request) error {
deadline := getDeadlineForQuery(r)
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse form values: %s", err)
@@ -285,8 +314,21 @@ func labelValuesWithMatches(labelName string, matches []string, start, end int64
if err != nil {
return nil, err
}
// Add `labelName!=''` tag filter in order to filter out series without the labelName.
// There is no need in adding `__name__!=''` filter, since all the time series should
// already have non-empty name.
if labelName != "__name__" {
key := []byte(labelName)
for i, tfs := range tagFilterss {
tagFilterss[i] = append(tfs, storage.TagFilter{
Key: key,
IsNegative: true,
})
}
}
if start >= end {
start = end - defaultStep
end = start + defaultStep
}
sq := &storage.SearchQuery{
MinTimestamp: start,
@@ -324,9 +366,8 @@ func labelValuesWithMatches(labelName string, matches []string, start, end int64
var labelValuesDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/label/{}/values"}`)
// LabelsCountHandler processes /api/v1/labels/count request.
func LabelsCountHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
deadline := getDeadline(r)
func LabelsCountHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
deadline := getDeadlineForQuery(r)
labelEntries, err := netstorage.GetLabelEntries(deadline)
if err != nil {
return fmt.Errorf(`cannot obtain label entries: %s`, err)
@@ -343,12 +384,39 @@ var labelsCountDuration = metrics.NewSummary(`vm_request_duration_seconds{path="
// LabelsHandler processes /api/v1/labels request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names
func LabelsHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
deadline := getDeadline(r)
labels, err := netstorage.GetLabels(deadline)
if err != nil {
return fmt.Errorf("cannot obtain labels: %s", err)
func LabelsHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
deadline := getDeadlineForQuery(r)
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse form values: %s", err)
}
var labels []string
if len(r.Form["match[]"]) == 0 && len(r.Form["start"]) == 0 && len(r.Form["end"]) == 0 {
var err error
labels, err = netstorage.GetLabels(deadline)
if err != nil {
return fmt.Errorf("cannot obtain labels: %s", err)
}
} else {
// Extended functionality that allows filtering by label filters and time range
// i.e. /api/v1/labels?match[]=foobar{baz="abc"}&start=...&end=...
matches := r.Form["match[]"]
if len(matches) == 0 {
matches = []string{"{__name__!=''}"}
}
ct := currentTime()
end, err := getTime(r, "end", ct)
if err != nil {
return err
}
start, err := getTime(r, "start", end-defaultStep)
if err != nil {
return err
}
labels, err = labelsWithMatches(matches, start, end, deadline)
if err != nil {
return fmt.Errorf("cannot obtain labels for match[]=%q, start=%d, end=%d: %s", matches, start, end, err)
}
}
w.Header().Set("Content-Type", "application/json")
@@ -357,12 +425,56 @@ func LabelsHandler(w http.ResponseWriter, r *http.Request) error {
return nil
}
func labelsWithMatches(matches []string, start, end int64, deadline netstorage.Deadline) ([]string, error) {
if len(matches) == 0 {
logger.Panicf("BUG: matches must be non-empty")
}
tagFilterss, err := getTagFilterssFromMatches(matches)
if err != nil {
return nil, err
}
if start >= end {
end = start + defaultStep
}
sq := &storage.SearchQuery{
MinTimestamp: start,
MaxTimestamp: end,
TagFilterss: tagFilterss,
}
rss, err := netstorage.ProcessSearchQuery(sq, false, deadline)
if err != nil {
return nil, fmt.Errorf("cannot fetch data for %q: %s", sq, err)
}
m := make(map[string]struct{})
var mLock sync.Mutex
err = rss.RunParallel(func(rs *netstorage.Result, workerID uint) {
mLock.Lock()
tags := rs.MetricName.Tags
for i := range tags {
t := &tags[i]
m[string(t.Key)] = struct{}{}
}
m["__name__"] = struct{}{}
mLock.Unlock()
})
if err != nil {
return nil, fmt.Errorf("error when data fetching: %s", err)
}
labels := make([]string, 0, len(m))
for label := range m {
labels = append(labels, label)
}
sort.Strings(labels)
return labels, nil
}
var labelsDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/labels"}`)
// SeriesCountHandler processes /api/v1/series/count request.
func SeriesCountHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
deadline := getDeadline(r)
func SeriesCountHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
deadline := getDeadlineForQuery(r)
n, err := netstorage.GetSeriesCount(deadline)
if err != nil {
return fmt.Errorf("cannot obtain series count: %s", err)
@@ -378,8 +490,7 @@ var seriesCountDuration = metrics.NewSummary(`vm_request_duration_seconds{path="
// SeriesHandler processes /api/v1/series request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers
func SeriesHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
func SeriesHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
ct := currentTime()
if err := r.ParseForm(); err != nil {
@@ -402,14 +513,14 @@ func SeriesHandler(w http.ResponseWriter, r *http.Request) error {
if err != nil {
return err
}
deadline := getDeadline(r)
deadline := getDeadlineForQuery(r)
tagFilterss, err := getTagFilterssFromMatches(matches)
if err != nil {
return err
}
if start >= end {
start = end - defaultStep
end = start + defaultStep
}
sq := &storage.SearchQuery{
MinTimestamp: start,
@@ -454,8 +565,7 @@ var seriesDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/
// QueryHandler processes /api/v1/query request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#instant-queries
func QueryHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
func QueryHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
ct := currentTime()
query := r.FormValue("query")
@@ -466,45 +576,67 @@ func QueryHandler(w http.ResponseWriter, r *http.Request) error {
if err != nil {
return err
}
queryOffset := getLatencyOffsetMilliseconds()
step, err := getDuration(r, "step", queryOffset)
if err != nil {
return err
}
deadline := getDeadline(r)
lookbackDelta, err := getMaxLookback(r)
if err != nil {
return err
}
step, err := getDuration(r, "step", lookbackDelta)
if err != nil {
return err
}
if step <= 0 {
step = defaultStep
}
deadline := getDeadlineForQuery(r)
if len(query) > *maxQueryLen {
return fmt.Errorf(`too long query; got %d bytes; mustn't exceed %d bytes`, len(query), *maxQueryLen)
return fmt.Errorf("too long query; got %d bytes; mustn't exceed `-search.maxQueryLen=%d` bytes", len(query), *maxQueryLen)
}
if ct-start < queryOffset {
start -= queryOffset
queryOffset := getLatencyOffsetMilliseconds()
if !getBool(r, "nocache") && ct-start < queryOffset {
// Adjust start time only if `nocache` arg isn't set.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/241
start = ct - queryOffset
}
if childQuery, windowStr, offsetStr := promql.IsMetricSelectorWithRollup(query); childQuery != "" {
var window int64
if len(windowStr) > 0 {
var err error
window, err = promql.DurationValue(windowStr, step)
if err != nil {
return err
}
window, err := parsePositiveDuration(windowStr, step)
if err != nil {
return fmt.Errorf("cannot parse window: %s", err)
}
var offset int64
if len(offsetStr) > 0 {
var err error
offset, err = promql.DurationValue(offsetStr, step)
if err != nil {
return err
}
offset, err := parseDuration(offsetStr, step)
if err != nil {
return fmt.Errorf("cannot parse offset: %s", err)
}
start -= offset
end := start
start = end - window
if err := exportHandler(w, []string{childQuery}, start, end, "promapi", deadline); err != nil {
return err
if err := exportHandler(w, []string{childQuery}, start, end, "promapi", 0, deadline); err != nil {
return fmt.Errorf("error when exporting data for query=%q on the time range (start=%d, end=%d): %s", childQuery, start, end, err)
}
queryDuration.UpdateDuration(startTime)
return nil
}
if childQuery, windowStr, stepStr, offsetStr := promql.IsRollup(query); childQuery != "" {
newStep, err := parsePositiveDuration(stepStr, step)
if err != nil {
return fmt.Errorf("cannot parse step: %s", err)
}
if newStep > 0 {
step = newStep
}
window, err := parsePositiveDuration(windowStr, step)
if err != nil {
return fmt.Errorf("cannot parse window: %s", err)
}
offset, err := parseDuration(offsetStr, step)
if err != nil {
return fmt.Errorf("cannot parse offset: %s", err)
}
start -= offset
end := start
start = end - window
if err := queryRangeHandler(w, childQuery, start, end, step, r, ct); err != nil {
return fmt.Errorf("error when executing query=%q on the time range (start=%d, end=%d, step=%d): %s", childQuery, start, end, step, err)
}
queryDuration.UpdateDuration(startTime)
return nil
@@ -519,7 +651,7 @@ func QueryHandler(w http.ResponseWriter, r *http.Request) error {
}
result, err := promql.Exec(&ec, query, true)
if err != nil {
return fmt.Errorf("cannot execute %q: %s", query, err)
return fmt.Errorf("error when executing query=%q for (time=%d, step=%d): %s", query, start, step, err)
}
w.Header().Set("Content-Type", "application/json")
@@ -530,11 +662,24 @@ func QueryHandler(w http.ResponseWriter, r *http.Request) error {
var queryDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/query"}`)
func parseDuration(s string, step int64) (int64, error) {
if len(s) == 0 {
return 0, nil
}
return metricsql.DurationValue(s, step)
}
func parsePositiveDuration(s string, step int64) (int64, error) {
if len(s) == 0 {
return 0, nil
}
return metricsql.PositiveDurationValue(s, step)
}
// QueryRangeHandler processes /api/v1/query_range request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries
func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
func QueryRangeHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
ct := currentTime()
query := r.FormValue("query")
@@ -553,7 +698,15 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
if err != nil {
return err
}
deadline := getDeadline(r)
if err := queryRangeHandler(w, query, start, end, step, r, ct); err != nil {
return fmt.Errorf("error when executing query=%q on the time range (start=%d, end=%d, step=%d): %s", query, start, end, step, err)
}
queryRangeDuration.UpdateDuration(startTime)
return nil
}
func queryRangeHandler(w http.ResponseWriter, query string, start, end, step int64, r *http.Request, ct int64) error {
deadline := getDeadlineForQuery(r)
mayCache := !getBool(r, "nocache")
lookbackDelta, err := getMaxLookback(r)
if err != nil {
@@ -562,10 +715,10 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
// Validate input args.
if len(query) > *maxQueryLen {
return fmt.Errorf(`too long query; got %d bytes; mustn't exceed %d bytes`, len(query), *maxQueryLen)
return fmt.Errorf("too long query; got %d bytes; mustn't exceed `-search.maxQueryLen=%d` bytes", len(query), *maxQueryLen)
}
if start > end {
start = end
end = start + defaultStep
}
if err := promql.ValidateMaxPointsPerTimeseries(start, end, step); err != nil {
return err
@@ -584,7 +737,7 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
}
result, err := promql.Exec(&ec, query, false)
if err != nil {
return fmt.Errorf("cannot execute %q: %s", query, err)
return fmt.Errorf("cannot execute query: %s", err)
}
queryOffset := getLatencyOffsetMilliseconds()
if ct-end < queryOffset {
@@ -597,7 +750,6 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
w.Header().Set("Content-Type", "application/json")
WriteQueryRangeResponse(w, result)
queryRangeDuration.UpdateDuration(startTime)
return nil
}
@@ -691,7 +843,15 @@ func getTime(r *http.Request, argKey string, defaultValue int64) (int64, error)
case prometheusMaxTimeFormatted:
return maxTimeMsecs, nil
}
return 0, fmt.Errorf("cannot parse %q=%q: %s", argKey, argValue, err)
// Try parsing duration relative to the current time
d, err1 := time.ParseDuration(argValue)
if err1 != nil {
return 0, fmt.Errorf("cannot parse %q=%q: %s", argKey, argValue, err)
}
if d > 0 {
d = -d
}
t = time.Now().Add(d)
}
secs = float64(t.UnixNano()) / 1e9
}
@@ -742,21 +902,30 @@ func getDuration(r *http.Request, argKey string, defaultValue int64) (int64, err
const maxDurationMsecs = 100 * 365 * 24 * 3600 * 1000
func getMaxLookback(r *http.Request) (int64, error) {
d := int64(*maxLookback / time.Millisecond)
d := maxLookback.Milliseconds()
return getDuration(r, "max_lookback", d)
}
func getDeadline(r *http.Request) netstorage.Deadline {
func getDeadlineForQuery(r *http.Request) netstorage.Deadline {
dMax := maxQueryDuration.Milliseconds()
return getDeadlineWithMaxDuration(r, dMax, "-search.maxQueryDuration")
}
func getDeadlineForExport(r *http.Request) netstorage.Deadline {
dMax := maxExportDuration.Milliseconds()
return getDeadlineWithMaxDuration(r, dMax, "-search.maxExportDuration")
}
func getDeadlineWithMaxDuration(r *http.Request, dMax int64, flagHint string) netstorage.Deadline {
d, err := getDuration(r, "timeout", 0)
if err != nil {
d = 0
}
dMax := int64(maxQueryDuration.Seconds() * 1e3)
if d <= 0 || d > dMax {
d = dMax
}
timeout := time.Duration(d) * time.Millisecond
return netstorage.NewDeadline(timeout)
return netstorage.NewDeadline(timeout, flagHint)
}
func getBool(r *http.Request, argKey string) bool {
@@ -786,7 +955,7 @@ func getTagFilterssFromMatches(matches []string) ([][]storage.TagFilter, error)
}
func getLatencyOffsetMilliseconds() int64 {
d := int64(*latencyOffset / time.Millisecond)
d := latencyOffset.Milliseconds()
if d <= 1000 {
d = 1000
}

@@ -8,7 +8,10 @@ import (
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/histogram"
)
var aggrFuncs = map[string]aggrFunc{
@@ -25,19 +28,28 @@ var aggrFuncs = map[string]aggrFunc{
"topk": newAggrFuncTopK(false),
"quantile": aggrFuncQuantile,
// Extended PromQL funcs
"median": aggrFuncMedian,
"limitk": aggrFuncLimitK,
"distinct": newAggrFunc(aggrFuncDistinct),
"sum2": newAggrFunc(aggrFuncSum2),
"geomean": newAggrFunc(aggrFuncGeomean),
// PromQL extension funcs
"median": aggrFuncMedian,
"limitk": aggrFuncLimitK,
"distinct": newAggrFunc(aggrFuncDistinct),
"sum2": newAggrFunc(aggrFuncSum2),
"geomean": newAggrFunc(aggrFuncGeomean),
"histogram": newAggrFunc(aggrFuncHistogram),
"topk_min": newAggrFuncRangeTopK(minValue, false),
"topk_max": newAggrFuncRangeTopK(maxValue, false),
"topk_avg": newAggrFuncRangeTopK(avgValue, false),
"topk_median": newAggrFuncRangeTopK(medianValue, false),
"bottomk_min": newAggrFuncRangeTopK(minValue, true),
"bottomk_max": newAggrFuncRangeTopK(maxValue, true),
"bottomk_avg": newAggrFuncRangeTopK(avgValue, true),
"bottomk_median": newAggrFuncRangeTopK(medianValue, true),
}
type aggrFunc func(afa *aggrFuncArg) ([]*timeseries, error)
type aggrFuncArg struct {
args [][]*timeseries
ae *aggrFuncExpr
ae *metricsql.AggrFuncExpr
ec *EvalConfig
}
@@ -46,20 +58,6 @@ func getAggrFunc(s string) aggrFunc {
return aggrFuncs[s]
}
func isAggrFunc(s string) bool {
return getAggrFunc(s) != nil
}
func isAggrFuncModifier(s string) bool {
s = strings.ToLower(s)
switch s {
case "by", "without":
return true
default:
return false
}
}
func newAggrFunc(afe func(tss []*timeseries) []*timeseries) aggrFunc {
return func(afa *aggrFuncArg) ([]*timeseries, error) {
args := afa.args
@@ -70,7 +68,7 @@ func newAggrFunc(afe func(tss []*timeseries) []*timeseries) aggrFunc {
}
}
func removeGroupTags(metricName *storage.MetricName, modifier *modifierExpr) {
func removeGroupTags(metricName *storage.MetricName, modifier *metricsql.ModifierExpr) {
groupOp := strings.ToLower(modifier.Op)
switch groupOp {
case "", "by":
@@ -82,7 +80,7 @@ func removeGroupTags(metricName *storage.MetricName, modifier *modifierExpr) {
}
}
func aggrFuncExt(afe func(tss []*timeseries) []*timeseries, argOrig []*timeseries, modifier *modifierExpr, keepOriginal bool) ([]*timeseries, error) {
func aggrFuncExt(afe func(tss []*timeseries) []*timeseries, argOrig []*timeseries, modifier *metricsql.ModifierExpr, keepOriginal bool) ([]*timeseries, error) {
arg := copyTimeseriesMetricNames(argOrig)
// Perform grouping.
@@ -184,6 +182,38 @@ func aggrFuncGeomean(tss []*timeseries) []*timeseries {
return tss[:1]
}
func aggrFuncHistogram(tss []*timeseries) []*timeseries {
var h metrics.Histogram
m := make(map[string]*timeseries)
for i := range tss[0].Values {
h.Reset()
for _, ts := range tss {
v := ts.Values[i]
h.Update(v)
}
h.VisitNonZeroBuckets(func(vmrange string, count uint64) {
ts := m[vmrange]
if ts == nil {
ts = &timeseries{}
ts.CopyFromShallowTimestamps(tss[0])
ts.MetricName.RemoveTag("vmrange")
ts.MetricName.AddTag("vmrange", vmrange)
values := ts.Values
for k := range values {
values[k] = 0
}
m[vmrange] = ts
}
ts.Values[i] = float64(count)
})
}
rvs := make([]*timeseries, 0, len(m))
for _, ts := range m {
rvs = append(rvs, ts)
}
return vmrangeBucketsToLE(rvs)
}
func aggrFuncMin(tss []*timeseries) []*timeseries {
if len(tss) == 1 {
// Fast path - nothing to min.
@@ -425,37 +455,138 @@ func newAggrFuncTopK(isReverse bool) aggrFunc {
return nil, err
}
afe := func(tss []*timeseries) []*timeseries {
rvs := tss
for n := range rvs[0].Values {
sort.Slice(rvs, func(i, j int) bool {
a := rvs[i].Values[n]
b := rvs[j].Values[n]
cmp := lessWithNaNs(a, b)
for n := range tss[0].Values {
sort.Slice(tss, func(i, j int) bool {
a := tss[i].Values[n]
b := tss[j].Values[n]
if isReverse {
cmp = !cmp
a, b = b, a
}
return cmp
return lessWithNaNs(a, b)
})
if math.IsNaN(ks[n]) {
ks[n] = 0
}
k := int(ks[n])
if k < 0 {
k = 0
}
if k > len(rvs) {
k = len(rvs)
}
for _, ts := range rvs[:len(rvs)-k] {
ts.Values[n] = nan
}
fillNaNsAtIdx(n, ks[n], tss)
}
return removeNaNs(rvs)
return removeNaNs(tss)
}
return aggrFuncExt(afe, args[1], &afa.ae.Modifier, true)
}
}
type tsWithValue struct {
ts *timeseries
value float64
}
func newAggrFuncRangeTopK(f func(values []float64) float64, isReverse bool) aggrFunc {
return func(afa *aggrFuncArg) ([]*timeseries, error) {
args := afa.args
if err := expectTransformArgsNum(args, 2); err != nil {
return nil, err
}
ks, err := getScalar(args[0], 0)
if err != nil {
return nil, err
}
afe := func(tss []*timeseries) []*timeseries {
maxs := make([]tsWithValue, len(tss))
for i, ts := range tss {
value := f(ts.Values)
maxs[i] = tsWithValue{
ts: ts,
value: value,
}
}
sort.Slice(maxs, func(i, j int) bool {
a := maxs[i].value
b := maxs[j].value
if isReverse {
a, b = b, a
}
return lessWithNaNs(a, b)
})
for i := range maxs {
tss[i] = maxs[i].ts
}
for i, k := range ks {
fillNaNsAtIdx(i, k, tss)
}
return removeNaNs(tss)
}
return aggrFuncExt(afe, args[1], &afa.ae.Modifier, true)
}
}
func fillNaNsAtIdx(idx int, k float64, tss []*timeseries) {
if math.IsNaN(k) {
k = 0
}
kn := int(k)
if kn < 0 {
kn = 0
}
if kn > len(tss) {
kn = len(tss)
}
for _, ts := range tss[:len(tss)-kn] {
ts.Values[idx] = nan
}
}
func minValue(values []float64) float64 {
if len(values) == 0 {
return nan
}
min := values[0]
for _, v := range values[1:] {
if v < min {
min = v
}
}
return min
}
func maxValue(values []float64) float64 {
if len(values) == 0 {
return nan
}
max := values[0]
for _, v := range values[1:] {
if v > max {
max = v
}
}
return max
}
func avgValue(values []float64) float64 {
sum := float64(0)
count := 0
for _, v := range values {
if math.IsNaN(v) {
continue
}
count++
sum += v
}
if count == 0 {
return nan
}
return sum / float64(count)
}
func medianValue(values []float64) float64 {
h := histogram.GetFast()
for _, v := range values {
if math.IsNaN(v) {
continue
}
h.Update(v)
}
value := h.Quantile(0.5)
histogram.PutFast(h)
return value
}
func aggrFuncLimitK(afa *aggrFuncArg) ([]*timeseries, error) {
args := afa.args
if err := expectTransformArgsNum(args, 2); err != nil {

@@ -4,10 +4,12 @@ import (
"math"
"strings"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
)
// callbacks for optimized incremental calculations for aggregate functions
// over rollups over metricExpr.
// over rollups over metricsql.MetricExpr.
//
// These calculations save RAM for aggregates over big number of time series.
var incrementalAggrFuncCallbacksMap = map[string]*incrementalAggrFuncCallbacks{
@@ -49,7 +51,7 @@ var incrementalAggrFuncCallbacksMap = map[string]*incrementalAggrFuncCallbacks{
}
type incrementalAggrFuncContext struct {
ae *aggrFuncExpr
ae *metricsql.AggrFuncExpr
mLock sync.Mutex
m map[uint]map[string]*incrementalAggrContext
@@ -57,7 +59,7 @@ type incrementalAggrFuncContext struct {
callbacks *incrementalAggrFuncCallbacks
}
func newIncrementalAggrFuncContext(ae *aggrFuncExpr, callbacks *incrementalAggrFuncCallbacks) *incrementalAggrFuncContext {
func newIncrementalAggrFuncContext(ae *metricsql.AggrFuncExpr, callbacks *incrementalAggrFuncCallbacks) *incrementalAggrFuncContext {
return &incrementalAggrFuncContext{
ae: ae,
m: make(map[uint]map[string]*incrementalAggrContext),

@@ -7,6 +7,8 @@ import (
"runtime"
"sync"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
)
func TestIncrementalAggr(t *testing.T) {
@@ -42,7 +44,7 @@ func TestIncrementalAggr(t *testing.T) {
f := func(name string, valuesExpected []float64) {
t.Helper()
callbacks := getIncrementalAggrFuncCallbacks(name)
ae := &aggrFuncExpr{
ae := &metricsql.AggrFuncExpr{
Name: name,
}
tssExpected := []*timeseries{{

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1<<31 - 1

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1 << 40

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1<<31 - 1

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1 << 40

@@ -6,63 +6,36 @@ import (
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql/binaryop"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
var binaryOpFuncs = map[string]binaryOpFunc{
"+": newBinaryOpArithFunc(binaryOpPlus),
"-": newBinaryOpArithFunc(binaryOpMinus),
"*": newBinaryOpArithFunc(binaryOpMul),
"/": newBinaryOpArithFunc(binaryOpDiv),
"%": newBinaryOpArithFunc(binaryOpMod),
"^": newBinaryOpArithFunc(binaryOpPow),
"+": newBinaryOpArithFunc(binaryop.Plus),
"-": newBinaryOpArithFunc(binaryop.Minus),
"*": newBinaryOpArithFunc(binaryop.Mul),
"/": newBinaryOpArithFunc(binaryop.Div),
"%": newBinaryOpArithFunc(binaryop.Mod),
"^": newBinaryOpArithFunc(binaryop.Pow),
// cmp ops
"==": newBinaryOpCmpFunc(binaryOpEq),
"!=": newBinaryOpCmpFunc(binaryOpNeq),
">": newBinaryOpCmpFunc(binaryOpGt),
"<": newBinaryOpCmpFunc(binaryOpLt),
">=": newBinaryOpCmpFunc(binaryOpGte),
"<=": newBinaryOpCmpFunc(binaryOpLte),
"==": newBinaryOpCmpFunc(binaryop.Eq),
"!=": newBinaryOpCmpFunc(binaryop.Neq),
">": newBinaryOpCmpFunc(binaryop.Gt),
"<": newBinaryOpCmpFunc(binaryop.Lt),
">=": newBinaryOpCmpFunc(binaryop.Gte),
"<=": newBinaryOpCmpFunc(binaryop.Lte),
// logical set ops
"and": binaryOpAnd,
"or": binaryOpOr,
"unless": binaryOpUnless,
// New op
"if": newBinaryOpArithFunc(binaryOpIf),
"ifnot": newBinaryOpArithFunc(binaryOpIfnot),
"default": newBinaryOpArithFunc(binaryOpDefault),
}
var binaryOpPriorities = map[string]int{
"default": -1,
"if": 0,
"ifnot": 0,
// See https://prometheus.io/docs/prometheus/latest/querying/operators/#binary-operator-precedence
"or": 1,
"and": 2,
"unless": 2,
"==": 3,
"!=": 3,
"<": 3,
">": 3,
"<=": 3,
">=": 3,
"+": 4,
"-": 4,
"*": 5,
"/": 5,
"%": 5,
"^": 6,
// New ops
"if": newBinaryOpArithFunc(binaryop.If),
"ifnot": newBinaryOpArithFunc(binaryop.Ifnot),
"default": newBinaryOpArithFunc(binaryop.Default),
}
func getBinaryOpFunc(op string) binaryOpFunc {
@@ -70,144 +43,8 @@ func getBinaryOpFunc(op string) binaryOpFunc {
return binaryOpFuncs[op]
}
func isBinaryOp(op string) bool {
return getBinaryOpFunc(op) != nil
}
func binaryOpPriority(op string) int {
op = strings.ToLower(op)
return binaryOpPriorities[op]
}
func scanBinaryOpPrefix(s string) int {
n := 0
for op := range binaryOpFuncs {
if len(s) < len(op) {
continue
}
ss := strings.ToLower(s[:len(op)])
if ss == op && len(op) > n {
n = len(op)
}
}
return n
}
func isRightAssociativeBinaryOp(op string) bool {
// See https://prometheus.io/docs/prometheus/latest/querying/operators/#binary-operator-precedence
return op == "^"
}
func isBinaryOpGroupModifier(s string) bool {
s = strings.ToLower(s)
switch s {
// See https://prometheus.io/docs/prometheus/latest/querying/operators/#vector-matching
case "on", "ignoring":
return true
default:
return false
}
}
func isBinaryOpJoinModifier(s string) bool {
s = strings.ToLower(s)
switch s {
case "group_left", "group_right":
return true
default:
return false
}
}
func isBinaryOpBoolModifier(s string) bool {
s = strings.ToLower(s)
return s == "bool"
}
func isBinaryOpCmp(op string) bool {
switch op {
case "==", "!=", ">", "<", ">=", "<=":
return true
default:
return false
}
}
func isBinaryOpLogicalSet(op string) bool {
op = strings.ToLower(op)
switch op {
case "and", "or", "unless":
return true
default:
return false
}
}
func binaryOpConstants(op string, left, right float64, isBool bool) float64 {
if isBinaryOpCmp(op) {
evalCmp := func(cf func(left, right float64) bool) float64 {
if isBool {
if cf(left, right) {
return 1
}
return 0
}
if cf(left, right) {
return left
}
return nan
}
switch op {
case "==":
left = evalCmp(binaryOpEq)
case "!=":
left = evalCmp(binaryOpNeq)
case ">":
left = evalCmp(binaryOpGt)
case "<":
left = evalCmp(binaryOpLt)
case ">=":
left = evalCmp(binaryOpGte)
case "<=":
left = evalCmp(binaryOpLte)
default:
logger.Panicf("BUG: unexpected comparison binaryOp: %q", op)
}
} else {
switch op {
case "+":
left = binaryOpPlus(left, right)
case "-":
left = binaryOpMinus(left, right)
case "*":
left = binaryOpMul(left, right)
case "/":
left = binaryOpDiv(left, right)
case "%":
left = binaryOpMod(left, right)
case "^":
left = binaryOpPow(left, right)
case "and":
// Nothing to do
case "or":
// Nothing to do
case "unless":
left = nan
case "default":
left = binaryOpDefault(left, right)
case "if":
left = binaryOpIf(left, right)
case "ifnot":
left = binaryOpIfnot(left, right)
default:
logger.Panicf("BUG: unexpected non-comparison binaryOp: %q", op)
}
}
return left
}
type binaryOpFuncArg struct {
be *binaryOpExpr
be *metricsql.BinaryOpExpr
left []*timeseries
right []*timeseries
}
@@ -267,7 +104,7 @@ func newBinaryOpFunc(bf func(left, right float64, isBool bool) float64) binaryOp
}
}
func adjustBinaryOpTags(be *binaryOpExpr, left, right []*timeseries) ([]*timeseries, []*timeseries, []*timeseries, error) {
func adjustBinaryOpTags(be *metricsql.BinaryOpExpr, left, right []*timeseries) ([]*timeseries, []*timeseries, []*timeseries, error) {
if len(be.GroupModifier.Op) == 0 && len(be.JoinModifier.Op) == 0 {
if isScalar(left) {
// Fast path: `scalar op vector`
@@ -348,7 +185,7 @@ func adjustBinaryOpTags(be *binaryOpExpr, left, right []*timeseries) ([]*timeser
return rvsLeft, rvsRight, dst, nil
}
func ensureSingleTimeseries(side string, be *binaryOpExpr, tss []*timeseries) error {
func ensureSingleTimeseries(side string, be *metricsql.BinaryOpExpr, tss []*timeseries) error {
if len(tss) == 0 {
logger.Panicf("BUG: tss must contain at least one value")
}
@@ -362,7 +199,7 @@ func ensureSingleTimeseries(side string, be *binaryOpExpr, tss []*timeseries) er
return nil
}
func groupJoin(singleTimeseriesSide string, be *binaryOpExpr, rvsLeft, rvsRight, tssLeft, tssRight []*timeseries) ([]*timeseries, []*timeseries, error) {
func groupJoin(singleTimeseriesSide string, be *metricsql.BinaryOpExpr, rvsLeft, rvsRight, tssLeft, tssRight []*timeseries) ([]*timeseries, []*timeseries, error) {
joinTags := be.JoinModifier.Args
var m map[string]*timeseries
for _, tsLeft := range tssLeft {
@@ -432,8 +269,8 @@ func mergeNonOverlappingTimeseries(dst, src *timeseries) bool {
return true
}
func resetMetricGroupIfRequired(be *binaryOpExpr, ts *timeseries) {
if isBinaryOpCmp(be.Op) && !be.Bool {
func resetMetricGroupIfRequired(be *metricsql.BinaryOpExpr, ts *timeseries) {
if metricsql.IsBinaryOpCmp(be.Op) && !be.Bool {
// Do not reset MetricGroup for non-boolean `compare` binary ops like Prometheus does.
return
}
@@ -445,97 +282,24 @@ func resetMetricGroupIfRequired(be *binaryOpExpr, ts *timeseries) {
ts.MetricName.ResetMetricGroup()
}
func binaryOpPlus(left, right float64) float64 {
return left + right
}
func binaryOpMinus(left, right float64) float64 {
return left - right
}
func binaryOpMul(left, right float64) float64 {
return left * right
}
func binaryOpDiv(left, right float64) float64 {
return left / right
}
func binaryOpMod(left, right float64) float64 {
return math.Mod(left, right)
}
func binaryOpPow(left, right float64) float64 {
return math.Pow(left, right)
}
func binaryOpDefault(left, right float64) float64 {
if math.IsNaN(left) {
return right
}
return left
}
func binaryOpIf(left, right float64) float64 {
if math.IsNaN(right) {
return nan
}
return left
}
func binaryOpIfnot(left, right float64) float64 {
if math.IsNaN(right) {
return left
}
return nan
}
func binaryOpEq(left, right float64) bool {
// Special handling for nan == nan.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150 .
if math.IsNaN(left) {
return math.IsNaN(right)
}
return left == right
}
func binaryOpNeq(left, right float64) bool {
// Special handling for comparison with nan.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150 .
if math.IsNaN(left) {
return !math.IsNaN(right)
}
if math.IsNaN(right) {
return true
}
return left != right
}
func binaryOpGt(left, right float64) bool {
return left > right
}
func binaryOpLt(left, right float64) bool {
return left < right
}
func binaryOpGte(left, right float64) bool {
return left >= right
}
func binaryOpLte(left, right float64) bool {
return left <= right
}
func binaryOpAnd(bfa *binaryOpFuncArg) ([]*timeseries, error) {
mLeft, mRight := createTimeseriesMapByTagSet(bfa.be, bfa.left, bfa.right)
var rvs []*timeseries
for k := range mRight {
if tss := mLeft[k]; tss != nil {
rvs = append(rvs, tss...)
for k, tssRight := range mRight {
tssLeft := mLeft[k]
if tssLeft == nil {
continue
}
for i := range tssLeft[0].Values {
if !isAllNaNs(tssRight, i) {
continue
}
for _, tsLeft := range tssLeft {
tsLeft.Values[i] = nan
}
}
tssLeft = removeNaNs(tssLeft)
rvs = append(rvs, tssLeft...)
}
return rvs, nil
}
@@ -557,15 +321,36 @@ func binaryOpOr(bfa *binaryOpFuncArg) ([]*timeseries, error) {
func binaryOpUnless(bfa *binaryOpFuncArg) ([]*timeseries, error) {
mLeft, mRight := createTimeseriesMapByTagSet(bfa.be, bfa.left, bfa.right)
var rvs []*timeseries
for k, tss := range mLeft {
if mRight[k] == nil {
rvs = append(rvs, tss...)
for k, tssLeft := range mLeft {
tssRight := mRight[k]
if tssRight == nil {
rvs = append(rvs, tssLeft...)
continue
}
for i := range tssLeft[0].Values {
if isAllNaNs(tssRight, i) {
continue
}
for _, tsLeft := range tssLeft {
tsLeft.Values[i] = nan
}
}
tssLeft = removeNaNs(tssLeft)
rvs = append(rvs, tssLeft...)
}
return rvs, nil
}
func createTimeseriesMapByTagSet(be *binaryOpExpr, left, right []*timeseries) (map[string][]*timeseries, map[string][]*timeseries) {
func isAllNaNs(tss []*timeseries, idx int) bool {
for _, ts := range tss {
if !math.IsNaN(ts.Values[idx]) {
return false
}
}
return true
}
func createTimeseriesMapByTagSet(be *metricsql.BinaryOpExpr, left, right []*timeseries) (map[string][]*timeseries, map[string][]*timeseries) {
groupTags := be.GroupModifier.Args
groupOp := strings.ToLower(be.GroupModifier.Op)
if len(groupOp) == 0 {


@@ -11,6 +11,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
)
@@ -57,6 +58,14 @@ func AdjustStartEnd(start, end, step int64) (int64, int64) {
if adjust > 0 {
end += step - adjust
}
// Make sure that the new number of points is the same as the initial number of points.
newPoints := (end-start)/step + 1
for newPoints > points {
end -= step
newPoints--
}
return start, end
}
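The added loop keeps the aligned range from containing more points than the original range. The sketch below illustrates the idea under assumptions: only the tail of `AdjustStartEnd` is visible in this hunk, so the way `points` is computed and the way `start` is aligned are guesses, and `alignStartEnd` is a hypothetical name.

```go
package main

// alignStartEnd rounds start down and end up to multiples of step, then trims
// end until the number of points no longer exceeds the original count.
// It assumes step > 0 and non-negative timestamps.
func alignStartEnd(start, end, step int64) (int64, int64) {
	// Number of points in the original, unaligned range (assumed formula).
	points := (end-start)/step + 1

	// Align start down to a multiple of step (assumed; not shown in the hunk).
	start -= start % step

	// Align end up to a multiple of step, as in the visible part of the diff.
	if adjust := end % step; adjust > 0 {
		end += step - adjust
	}

	// Make sure that the new number of points matches the initial number.
	newPoints := (end-start)/step + 1
	for newPoints > points {
		end -= step
		newPoints--
	}
	return start, end
}
```

For example, `alignStartEnd(1005, 1995, 10)` starts with 100 points; alignment alone would widen the range to `[1000, 2000]` with 101 points, so the loop pulls `end` back to 1990.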
@@ -144,25 +153,25 @@ func getTimestamps(start, end, step int64) []int64 {
return timestamps
}
func evalExpr(ec *EvalConfig, e expr) ([]*timeseries, error) {
if me, ok := e.(*metricExpr); ok {
re := &rollupExpr{
func evalExpr(ec *EvalConfig, e metricsql.Expr) ([]*timeseries, error) {
if me, ok := e.(*metricsql.MetricExpr); ok {
re := &metricsql.RollupExpr{
Expr: me,
}
rv, err := evalRollupFunc(ec, "default_rollup", rollupDefault, re, nil)
rv, err := evalRollupFunc(ec, "default_rollup", rollupDefault, e, re, nil)
if err != nil {
return nil, fmt.Errorf(`cannot evaluate %q: %s`, me.AppendString(nil), err)
}
return rv, nil
}
if re, ok := e.(*rollupExpr); ok {
rv, err := evalRollupFunc(ec, "default_rollup", rollupDefault, re, nil)
if re, ok := e.(*metricsql.RollupExpr); ok {
rv, err := evalRollupFunc(ec, "default_rollup", rollupDefault, e, re, nil)
if err != nil {
return nil, fmt.Errorf(`cannot evaluate %q: %s`, re.AppendString(nil), err)
}
return rv, nil
}
if fe, ok := e.(*funcExpr); ok {
if fe, ok := e.(*metricsql.FuncExpr); ok {
nrf := getRollupFunc(fe.Name)
if nrf == nil {
args, err := evalExprs(ec, fe.Args)
@@ -192,17 +201,17 @@ func evalExpr(ec *EvalConfig, e expr) ([]*timeseries, error) {
if err != nil {
return nil, err
}
rv, err := evalRollupFunc(ec, fe.Name, rf, re, nil)
rv, err := evalRollupFunc(ec, fe.Name, rf, e, re, nil)
if err != nil {
return nil, fmt.Errorf(`cannot evaluate %q: %s`, fe.AppendString(nil), err)
}
return rv, nil
}
if ae, ok := e.(*aggrFuncExpr); ok {
if ae, ok := e.(*metricsql.AggrFuncExpr); ok {
if callbacks := getIncrementalAggrFuncCallbacks(ae.Name); callbacks != nil {
fe, nrf := tryGetArgRollupFuncWithMetricExpr(ae)
if fe != nil {
// There is an optimized path for calculating aggrFuncExpr over rollupFunc over metricExpr.
// There is an optimized path for calculating metricsql.AggrFuncExpr over rollupFunc over metricsql.MetricExpr.
// The optimized path saves RAM for aggregates over a big number of time series.
args, re, err := evalRollupFuncArgs(ec, fe)
if err != nil {
@@ -213,7 +222,7 @@ func evalExpr(ec *EvalConfig, e expr) ([]*timeseries, error) {
return nil, err
}
iafc := newIncrementalAggrFuncContext(ae, callbacks)
return evalRollupFunc(ec, fe.Name, rf, re, iafc)
return evalRollupFunc(ec, fe.Name, rf, e, re, iafc)
}
}
args, err := evalExprs(ec, ae.Args)
@@ -235,7 +244,7 @@ func evalExpr(ec *EvalConfig, e expr) ([]*timeseries, error) {
}
return rv, nil
}
if be, ok := e.(*binaryOpExpr); ok {
if be, ok := e.(*metricsql.BinaryOpExpr); ok {
left, err := evalExpr(ec, be.Left)
if err != nil {
return nil, err
@@ -259,18 +268,18 @@ func evalExpr(ec *EvalConfig, e expr) ([]*timeseries, error) {
}
return rv, nil
}
if ne, ok := e.(*numberExpr); ok {
if ne, ok := e.(*metricsql.NumberExpr); ok {
rv := evalNumber(ec, ne.N)
return rv, nil
}
if se, ok := e.(*stringExpr); ok {
if se, ok := e.(*metricsql.StringExpr); ok {
rv := evalString(ec, se.S)
return rv, nil
}
return nil, fmt.Errorf("unexpected expression %q", e.AppendString(nil))
}
func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFunc) {
func tryGetArgRollupFuncWithMetricExpr(ae *metricsql.AggrFuncExpr) (*metricsql.FuncExpr, newRollupFunc) {
if len(ae.Args) != 1 {
return nil, nil
}
@@ -281,31 +290,31 @@ func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFu
// - rollupFunc(metricExpr)
// - rollupFunc(metricExpr[d])
if me, ok := e.(*metricExpr); ok {
if me, ok := e.(*metricsql.MetricExpr); ok {
// e = metricExpr
if me.IsEmpty() {
return nil, nil
}
fe := &funcExpr{
fe := &metricsql.FuncExpr{
Name: "default_rollup",
Args: []expr{me},
Args: []metricsql.Expr{me},
}
nrf := getRollupFunc(fe.Name)
return fe, nrf
}
if re, ok := e.(*rollupExpr); ok {
if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
if re, ok := e.(*metricsql.RollupExpr); ok {
if me, ok := re.Expr.(*metricsql.MetricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
return nil, nil
}
// e = metricExpr[d]
fe := &funcExpr{
fe := &metricsql.FuncExpr{
Name: "default_rollup",
Args: []expr{re},
Args: []metricsql.Expr{re},
}
nrf := getRollupFunc(fe.Name)
return fe, nrf
}
fe, ok := e.(*funcExpr)
fe, ok := e.(*metricsql.FuncExpr)
if !ok {
return nil, nil
}
@@ -314,19 +323,23 @@ func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFu
return nil, nil
}
rollupArgIdx := getRollupArgIdx(fe.Name)
if rollupArgIdx >= len(fe.Args) {
// Incorrect number of args for rollup func.
return nil, nil
}
arg := fe.Args[rollupArgIdx]
if me, ok := arg.(*metricExpr); ok {
if me, ok := arg.(*metricsql.MetricExpr); ok {
if me.IsEmpty() {
return nil, nil
}
// e = rollupFunc(metricExpr)
return &funcExpr{
return &metricsql.FuncExpr{
Name: fe.Name,
Args: []expr{me},
Args: []metricsql.Expr{me},
}, nrf
}
if re, ok := arg.(*rollupExpr); ok {
if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
if re, ok := arg.(*metricsql.RollupExpr); ok {
if me, ok := re.Expr.(*metricsql.MetricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
return nil, nil
}
// e = rollupFunc(metricExpr[d])
@@ -335,7 +348,7 @@ func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFu
return nil, nil
}
func evalExprs(ec *EvalConfig, es []expr) ([][]*timeseries, error) {
func evalExprs(ec *EvalConfig, es []metricsql.Expr) ([][]*timeseries, error) {
var rvs [][]*timeseries
for _, e := range es {
rv, err := evalExpr(ec, e)
@@ -347,9 +360,12 @@ func evalExprs(ec *EvalConfig, es []expr) ([][]*timeseries, error) {
return rvs, nil
}
func evalRollupFuncArgs(ec *EvalConfig, fe *funcExpr) ([]interface{}, *rollupExpr, error) {
var re *rollupExpr
func evalRollupFuncArgs(ec *EvalConfig, fe *metricsql.FuncExpr) ([]interface{}, *metricsql.RollupExpr, error) {
var re *metricsql.RollupExpr
rollupArgIdx := getRollupArgIdx(fe.Name)
if len(fe.Args) <= rollupArgIdx {
return nil, nil, fmt.Errorf("expecting at least %d args to %q; got %d args; expr: %q", rollupArgIdx+1, fe.Name, len(fe.Args), fe.AppendString(nil))
}
args := make([]interface{}, len(fe.Args))
for i, arg := range fe.Args {
if i == rollupArgIdx {
@@ -366,11 +382,11 @@ func evalRollupFuncArgs(ec *EvalConfig, fe *funcExpr) ([]interface{}, *rollupExp
return args, re, nil
}
func getRollupExprArg(arg expr) *rollupExpr {
re, ok := arg.(*rollupExpr)
func getRollupExprArg(arg metricsql.Expr) *metricsql.RollupExpr {
re, ok := arg.(*metricsql.RollupExpr)
if !ok {
// Wrap non-rollup arg into rollupExpr.
return &rollupExpr{
// Wrap non-rollup arg into metricsql.RollupExpr.
return &metricsql.RollupExpr{
Expr: arg,
}
}
@@ -378,45 +394,60 @@ func getRollupExprArg(arg expr) *rollupExpr {
// Return standard rollup if it doesn't contain subquery.
return re
}
me, ok := re.Expr.(*metricExpr)
me, ok := re.Expr.(*metricsql.MetricExpr)
if !ok {
// arg contains subquery.
return re
}
// Convert me[w:step] -> default_rollup(me)[w:step]
reNew := *re
reNew.Expr = &funcExpr{
reNew.Expr = &metricsql.FuncExpr{
Name: "default_rollup",
Args: []expr{
&rollupExpr{Expr: me},
Args: []metricsql.Expr{
&metricsql.RollupExpr{Expr: me},
},
}
return &reNew
}
func evalRollupFunc(ec *EvalConfig, name string, rf rollupFunc, re *rollupExpr, iafc *incrementalAggrFuncContext) ([]*timeseries, error) {
func evalRollupFunc(ec *EvalConfig, name string, rf rollupFunc, expr metricsql.Expr, re *metricsql.RollupExpr, iafc *incrementalAggrFuncContext) ([]*timeseries, error) {
ecNew := ec
var offset int64
if len(re.Offset) > 0 {
var err error
offset, err = DurationValue(re.Offset, ec.Step)
offset, err = metricsql.DurationValue(re.Offset, ec.Step)
if err != nil {
return nil, err
}
ecNew = newEvalConfig(ec)
ecNew = newEvalConfig(ecNew)
ecNew.Start -= offset
ecNew.End -= offset
ecNew.Start, ecNew.End = AdjustStartEnd(ecNew.Start, ecNew.End, ecNew.Step)
if ecNew.MayCache {
start, end := AdjustStartEnd(ecNew.Start, ecNew.End, ecNew.Step)
offset += ecNew.Start - start
ecNew.Start = start
ecNew.End = end
}
}
if name == "rollup_candlestick" {
// Automatically apply `offset -step` to `rollup_candlestick` function
// in order to obtain expected OHLC results.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309#issuecomment-582113462
step := ecNew.Step
ecNew = newEvalConfig(ecNew)
ecNew.Start += step
ecNew.End += step
offset -= step
}
var rvs []*timeseries
var err error
if me, ok := re.Expr.(*metricExpr); ok {
rvs, err = evalRollupFuncWithMetricExpr(ecNew, name, rf, me, iafc, re.Window)
if me, ok := re.Expr.(*metricsql.MetricExpr); ok {
rvs, err = evalRollupFuncWithMetricExpr(ecNew, name, rf, expr, me, iafc, re.Window)
} else {
if iafc != nil {
logger.Panicf("BUG: iafc must be nil for rollup %q over subquery %q", name, re.AppendString(nil))
}
rvs, err = evalRollupFuncWithSubquery(ecNew, name, rf, re)
rvs, err = evalRollupFuncWithSubquery(ecNew, name, rf, expr, re)
}
if err != nil {
return nil, err
@@ -435,12 +466,12 @@ func evalRollupFunc(ec *EvalConfig, name string, rf rollupFunc, re *rollupExpr,
return rvs, nil
}
func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *rollupExpr) ([]*timeseries, error) {
// Do not use rollupResultCacheV here, since it works only with metricExpr.
func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, expr metricsql.Expr, re *metricsql.RollupExpr) ([]*timeseries, error) {
// TODO: determine whether to use rollupResultCacheV here.
var step int64
if len(re.Step) > 0 {
var err error
step, err = DurationValue(re.Step, ec.Step)
step, err = metricsql.PositiveDurationValue(re.Step, ec.Step)
if err != nil {
return nil, err
}
@@ -450,7 +481,7 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *
var window int64
if len(re.Window) > 0 {
var err error
window, err = DurationValue(re.Window, ec.Step)
window, err = metricsql.PositiveDurationValue(re.Window, ec.Step)
if err != nil {
return nil, err
}
@@ -467,9 +498,19 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *
if err != nil {
return nil, err
}
if len(tssSQ) == 0 {
if name == "absent_over_time" {
tss := evalNumber(ec, 1)
return tss, nil
}
return nil, nil
}
sharedTimestamps := getTimestamps(ec.Start, ec.End, ec.Step)
preFunc, rcs := getRollupConfigs(name, rf, ec.Start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
preFunc, rcs, err := getRollupConfigs(name, rf, expr, ec.Start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
if err != nil {
return nil, err
}
tss := make([]*timeseries, 0, len(tssSQ)*len(rcs))
var tssLock sync.Mutex
removeMetricGroup := !rollupFuncsKeepMetricGroup[name]
@@ -477,6 +518,13 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *
values, timestamps = removeNanValues(values[:0], timestamps[:0], tsSQ.Values, tsSQ.Timestamps)
preFunc(values, timestamps)
for _, rc := range rcs {
if tsm := newTimeseriesMap(name, sharedTimestamps, &tsSQ.MetricName); tsm != nil {
rc.DoTimeseriesMap(tsm, values, timestamps)
tssLock.Lock()
tss = tsm.AppendTimeseriesTo(tss)
tssLock.Unlock()
continue
}
var ts timeseries
doRollupForTimeseries(rc, &ts, &tsSQ.MetricName, values, timestamps, sharedTimestamps, removeMetricGroup)
tssLock.Lock()
@@ -544,21 +592,22 @@ var (
rollupResultCacheMiss = metrics.NewCounter(`vm_rollup_result_cache_miss_total`)
)
func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me *metricExpr, iafc *incrementalAggrFuncContext, windowStr string) ([]*timeseries, error) {
func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc,
expr metricsql.Expr, me *metricsql.MetricExpr, iafc *incrementalAggrFuncContext, windowStr string) ([]*timeseries, error) {
if me.IsEmpty() {
return evalNumber(ec, nan), nil
}
var window int64
if len(windowStr) > 0 {
var err error
window, err = DurationValue(windowStr, ec.Step)
window, err = metricsql.PositiveDurationValue(windowStr, ec.Step)
if err != nil {
return nil, err
}
}
// Search for partial results in cache.
tssCached, start := rollupResultCacheV.Get(name, ec, me, iafc, window)
tssCached, start := rollupResultCacheV.Get(ec, expr, window)
if start > ec.End {
// The result is fully cached.
rollupResultCacheFullHits.Inc()
@@ -570,11 +619,26 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
rollupResultCacheMiss.Inc()
}
// Obtain rollup configs before fetching data from db,
// so type errors can be caught earlier.
sharedTimestamps := getTimestamps(start, ec.End, ec.Step)
preFunc, rcs, err := getRollupConfigs(name, rf, expr, start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
if err != nil {
return nil, err
}
// Fetch the remaining part of the result.
tfs := toTagFilters(me.LabelFilters)
minTimestamp := start - maxSilenceInterval
if window > ec.Step {
minTimestamp -= window
} else {
minTimestamp -= ec.Step
}
sq := &storage.SearchQuery{
MinTimestamp: start - window - maxSilenceInterval,
MaxTimestamp: ec.End + ec.Step,
TagFilterss: [][]storage.TagFilter{me.TagFilters},
MinTimestamp: minTimestamp,
MaxTimestamp: ec.End,
TagFilterss: [][]storage.TagFilter{tfs},
}
rss, err := netstorage.ProcessSearchQuery(sq, true, ec.Deadline)
if err != nil {
@@ -583,14 +647,16 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
rssLen := rss.Len()
if rssLen == 0 {
rss.Cancel()
var tss []*timeseries
if name == "absent_over_time" {
tss = getAbsentTimeseries(ec, me)
}
// Add missing points until ec.End.
// Do not cache the result, since missing points
// may be backfilled in the future.
tss := mergeTimeseries(tssCached, nil, start, ec)
tss = mergeTimeseries(tssCached, tss, start, ec)
return tss, nil
}
sharedTimestamps := getTimestamps(start, ec.End, ec.Step)
preFunc, rcs := getRollupConfigs(name, rf, start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
// Verify timeseries fit available memory after the rollup.
// Take into account points from tssCached.
@@ -602,8 +668,8 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
if iafc.ae.Modifier.Op != "" {
// Increase the number of timeseries for non-empty group list: `aggr() by (something)`,
// since each group can have own set of time series in memory.
// Estimate the number of such groups is lower than 100 :)
timeseriesLen *= 100
// Assume the number of such groups is lower than 1000 :)
timeseriesLen *= 1000
}
}
rollupPoints := mulNoOverflow(pointsPerTimeseries, int64(timeseriesLen*len(rcs)))
@@ -622,16 +688,15 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
removeMetricGroup := !rollupFuncsKeepMetricGroup[name]
var tss []*timeseries
if iafc != nil {
tss, err = evalRollupWithIncrementalAggregate(iafc, rss, rcs, preFunc, sharedTimestamps, removeMetricGroup)
tss, err = evalRollupWithIncrementalAggregate(name, iafc, rss, rcs, preFunc, sharedTimestamps, removeMetricGroup)
} else {
tss, err = evalRollupNoIncrementalAggregate(rss, rcs, preFunc, sharedTimestamps, removeMetricGroup)
tss, err = evalRollupNoIncrementalAggregate(name, rss, rcs, preFunc, sharedTimestamps, removeMetricGroup)
}
if err != nil {
return nil, err
}
tss = mergeTimeseries(tssCached, tss, start, ec)
rollupResultCacheV.Put(name, ec, me, iafc, window, tss)
rollupResultCacheV.Put(ec, expr, window, tss)
return tss, nil
}
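The change to `MinTimestamp` in the function above replaces `start - window - maxSilenceInterval` with a lookbehind of whichever is larger, the window or the step. A small sketch of that computation follows; `searchMinTimestamp` is a hypothetical helper, and `maxSilence` is passed as a parameter because the value of `maxSilenceInterval` is not visible in this diff.

```go
package main

// searchMinTimestamp returns the earliest timestamp to request from storage
// for a rollup that starts at start. The lookbehind is the larger of window
// and step, plus the silence interval. All values share one time unit.
func searchMinTimestamp(start, step, window, maxSilence int64) int64 {
	lookbehind := step
	if window > step {
		lookbehind = window
	}
	return start - maxSilence - lookbehind
}
```

With an unset window (`window == 0`), the old expression fetched no data before `start - maxSilenceInterval`; the new one always reaches back at least one extra step.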
@@ -647,13 +712,20 @@ func getRollupMemoryLimiter() *memoryLimiter {
return &rollupMemoryLimiter
}
func evalRollupWithIncrementalAggregate(iafc *incrementalAggrFuncContext, rss *netstorage.Results, rcs []*rollupConfig,
func evalRollupWithIncrementalAggregate(name string, iafc *incrementalAggrFuncContext, rss *netstorage.Results, rcs []*rollupConfig,
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64, removeMetricGroup bool) ([]*timeseries, error) {
err := rss.RunParallel(func(rs *netstorage.Result, workerID uint) {
preFunc(rs.Values, rs.Timestamps)
ts := getTimeseries()
defer putTimeseries(ts)
for _, rc := range rcs {
if tsm := newTimeseriesMap(name, sharedTimestamps, &rs.MetricName); tsm != nil {
rc.DoTimeseriesMap(tsm, rs.Values, rs.Timestamps)
for _, ts := range tsm.m {
iafc.updateTimeseries(ts, workerID)
}
continue
}
ts.Reset()
doRollupForTimeseries(rc, ts, &rs.MetricName, rs.Values, rs.Timestamps, sharedTimestamps, removeMetricGroup)
iafc.updateTimeseries(ts, workerID)
@@ -670,13 +742,20 @@ func evalRollupWithIncrementalAggregate(iafc *incrementalAggrFuncContext, rss *n
return tss, nil
}
func evalRollupNoIncrementalAggregate(rss *netstorage.Results, rcs []*rollupConfig,
func evalRollupNoIncrementalAggregate(name string, rss *netstorage.Results, rcs []*rollupConfig,
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64, removeMetricGroup bool) ([]*timeseries, error) {
tss := make([]*timeseries, 0, rss.Len()*len(rcs))
var tssLock sync.Mutex
err := rss.RunParallel(func(rs *netstorage.Result, workerID uint) {
preFunc(rs.Values, rs.Timestamps)
for _, rc := range rcs {
if tsm := newTimeseriesMap(name, sharedTimestamps, &rs.MetricName); tsm != nil {
rc.DoTimeseriesMap(tsm, rs.Values, rs.Timestamps)
tssLock.Lock()
tss = tsm.AppendTimeseriesTo(tss)
tssLock.Unlock()
continue
}
var ts timeseries
doRollupForTimeseries(rc, &ts, &rs.MetricName, rs.Values, rs.Timestamps, sharedTimestamps, removeMetricGroup)
tssLock.Lock()
@@ -704,62 +783,6 @@ func doRollupForTimeseries(rc *rollupConfig, tsDst *timeseries, mnSrc *storage.M
tsDst.denyReuse = true
}
func getRollupConfigs(name string, rf rollupFunc, start, end, step, window int64, lookbackDelta int64, sharedTimestamps []int64) (
func(values []float64, timestamps []int64), []*rollupConfig) {
preFunc := func(values []float64, timestamps []int64) {}
if rollupFuncsRemoveCounterResets[name] {
preFunc = func(values []float64, timestamps []int64) {
removeCounterResets(values)
}
}
newRollupConfig := func(rf rollupFunc, tagValue string) *rollupConfig {
return &rollupConfig{
TagValue: tagValue,
Func: rf,
Start: start,
End: end,
Step: step,
Window: window,
MayAdjustWindow: rollupFuncsMayAdjustWindow[name],
LookbackDelta: lookbackDelta,
Timestamps: sharedTimestamps,
}
}
appendRollupConfigs := func(dst []*rollupConfig) []*rollupConfig {
dst = append(dst, newRollupConfig(rollupMin, "min"))
dst = append(dst, newRollupConfig(rollupMax, "max"))
dst = append(dst, newRollupConfig(rollupAvg, "avg"))
return dst
}
var rcs []*rollupConfig
switch name {
case "rollup":
rcs = appendRollupConfigs(rcs)
case "rollup_rate", "rollup_deriv":
preFuncPrev := preFunc
preFunc = func(values []float64, timestamps []int64) {
preFuncPrev(values, timestamps)
derivValues(values, timestamps)
}
rcs = appendRollupConfigs(rcs)
case "rollup_increase", "rollup_delta":
preFuncPrev := preFunc
preFunc = func(values []float64, timestamps []int64) {
preFuncPrev(values, timestamps)
deltaValues(values)
}
rcs = appendRollupConfigs(rcs)
case "rollup_candlestick":
rcs = append(rcs, newRollupConfig(rollupFirst, "open"))
rcs = append(rcs, newRollupConfig(rollupLast, "close"))
rcs = append(rcs, newRollupConfig(rollupMin, "low"))
rcs = append(rcs, newRollupConfig(rollupMax, "high"))
default:
rcs = append(rcs, newRollupConfig(rf, ""))
}
return preFunc, rcs
}
var bbPool bytesutil.ByteBufferPool
func evalNumber(ec *EvalConfig, n float64) []*timeseries {
@@ -798,3 +821,23 @@ func mulNoOverflow(a, b int64) int64 {
}
return a * b
}
func toTagFilters(lfs []metricsql.LabelFilter) []storage.TagFilter {
tfs := make([]storage.TagFilter, len(lfs))
for i := range lfs {
toTagFilter(&tfs[i], &lfs[i])
}
return tfs
}
func toTagFilter(dst *storage.TagFilter, src *metricsql.LabelFilter) {
if src.Label != "__name__" {
dst.Key = []byte(src.Label)
} else {
// This is required for storage.Search.
dst.Key = nil
}
dst.Value = []byte(src.Value)
dst.IsRegexp = src.IsRegexp
dst.IsNegative = src.IsNegative
}
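The conversion done by `toTagFilters` and `toTagFilter` can be tried without the repository packages. This sketch uses simplified stand-in types: `labelFilter` and `tagFilter` are hypothetical and only mirror the fields that appear in the diff, not the real `metricsql.LabelFilter` or `storage.TagFilter` definitions.

```go
package main

// labelFilter is a stand-in for the parser-side filter (assumed shape).
type labelFilter struct {
	Label, Value         string
	IsRegexp, IsNegative bool
}

// tagFilter is a stand-in for the storage-side filter (assumed shape).
type tagFilter struct {
	Key, Value           []byte
	IsRegexp, IsNegative bool
}

// convertFilters maps label filters to tag filters one-to-one.
// The special label "__name__" is represented by a nil Key, which the
// diff's comment says is required for the storage search.
func convertFilters(lfs []labelFilter) []tagFilter {
	tfs := make([]tagFilter, len(lfs))
	for i, lf := range lfs {
		if lf.Label != "__name__" {
			tfs[i].Key = []byte(lf.Label)
		}
		tfs[i].Value = []byte(lf.Value)
		tfs[i].IsRegexp = lf.IsRegexp
		tfs[i].IsNegative = lf.IsNegative
	}
	return tfs
}
```

A selector such as `up{job=~"node.*"}` would yield two filters under this sketch: one with a nil key and value `up`, and one with key `job` and the regexp flag set.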
