Compare commits

...

354 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
e1ef72af01 app/vmagent: logo fix 2020-02-25 00:09:19 +02:00
Aliaksandr Valialkin
56c70fe856 app/vmagent: update docs 2020-02-25 00:09:18 +02:00
Aliaksandr Valialkin
e7e4aa5243 app/vmagent/README.md: small fixes 2020-02-24 21:25:38 +02:00
Aliaksandr Valialkin
fed2959658 lib/envflag: substitute dots with underscores in env var names if -envflag.enable is set
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-24 21:14:44 +02:00
Aliaksandr Valialkin
ae51300973 app/vmselect/promql: properly take into account the first datapoint when calculating rollup_candlestick
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-24 13:24:30 +02:00
Aliaksandr Valialkin
e65ec88779 app/vmselect/promql: do not take into account values outside the current window in rollup_candlestick
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-23 18:03:57 +02:00
Yaroslav
a6d0645539 fix rollupOpen(), rollupHigh(), rollupLow() functions (#328) 2020-02-23 18:01:53 +02:00
Aliaksandr Valialkin
04762344c6 app/vmagent: initial implementation for vmagent 2020-02-23 13:36:03 +02:00
Aliaksandr Valialkin
4e905d6501 vendor: update github.com/valyala/fastjson from v1.4.5 to v1.5.0 2020-02-23 10:06:00 +02:00
kreedom
49390b8dbc [vmalert] integration with AlertManager (#325) 2020-02-21 23:15:05 +02:00
Aliaksandr Valialkin
2f55cabaa4 app/vmselect/promql: log when rollupResult cache is cleared 2020-02-21 20:07:01 +02:00
Aliaksandr Valialkin
d21cb43e48 lib/storage: add vm_ prefix to deduplicated_samples_total metric to be conistent with other metrics 2020-02-21 19:33:59 +02:00
Aliaksandr Valialkin
ec9bf39b5b app/vmselect: add -search.cacheTimestampOffset command-line flag
This flag can be used for removing gaps on graphs if the difference between the current time
and the timestamps from the ingested data exceeds 5 minutes.

This is the case when the time between data sources and VictoriaMetrics is improperly synchronized.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 13:58:06 +02:00
Aliaksandr Valialkin
539139391c app/vmselect: add /internl/resetRollupResultCache handler for resetting response cache
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 13:58:05 +02:00
Aliaksandr Valialkin
5431f9cd4e deployment/docker: update Go builder from v1.13.7 to v1.13.8 2020-02-20 19:46:20 +02:00
kreedom
3c06179184 basic vmalert backbone (#317)
* basic vmalert backbone

* Resolve code review comments for vmalert backbone

* Second review fixes for vmalert backbone
2020-02-16 20:59:02 +02:00
Aliaksandr Valialkin
71a52f5f90 lib/protoparser/prometheus: skip leading whitespace from tag names 2020-02-16 19:06:33 +02:00
Aliaksandr Valialkin
e7ba18b0d9 vendor: make vendor-udpate 2020-02-16 16:11:24 +02:00
Aliaksandr Valialkin
ce15cecae4 lib/storage: typo fix 2020-02-16 15:53:44 +02:00
Aliaksandr Valialkin
32e153e834 lib/storage: prevent from clobbering nin-nil lastError in Storage.add 2020-02-16 15:51:26 +02:00
Aliaksandr Valialkin
7b1c7051a3 app/vmselect: add sort_by_label(q, label) and sort_by_label_desc(q, label) functions
This is implementation of https://github.com/prometheus/prometheus/pull/1533 for VictoriaMetrics.
2020-02-13 17:01:37 +02:00
Aliaksandr Valialkin
7836ad8907 lib/mergeset: skip createing temporary part objects when merging source inmemory parts
This should reduce CPU usage when adding new entries to inverted index.
This should alos prevent from creating stalled cleaner goroutines for the created temporary parts,
since they were never closed.

This should fix the following issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 14:09:32 +02:00
Aliaksandr Valialkin
eceaf13e5e lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate
It has been appeared that time.Timer was used in places where time.Ticker must be used instead.
This could result in blocked goroutines as in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 13:10:07 +02:00
Aliaksandr Valialkin
8162d58dbd make vendor-update 2020-02-10 23:28:15 +02:00
Aliaksandr Valialkin
848d5da0be vendor: update github.com/VictoriaMetrics/metrics from v1.9.3 to v1.10.1 2020-02-10 23:08:38 +02:00
Aliaksandr Valialkin
4cc0163c7c docs: migrate ExtendedPromQL->MetricsQL in order to be more consistent 2020-02-10 23:02:43 +02:00
Aliaksandr Valialkin
a801a1a6e7 .github/ISSUE_TEMPLATE: ask for command-line flags and Prometheus logs 2020-02-10 22:56:17 +02:00
Aliaksandr Valialkin
02e852854a README.md: refer to the article about data deletion via relabeling 2020-02-10 22:46:52 +02:00
Aliaksandr Valialkin
9e6e2319b9 README.md: mention that flags may be read from env vars if -envflag.enable command-line flag is set 2020-02-10 16:20:15 +02:00
Aliaksandr Valialkin
025297f15d lib/envflag: check for incorrect flag values read from environment vars 2020-02-10 16:08:10 +02:00
Aliaksandr Valialkin
5d207b2025 lib/envflag: add -envflag.enable command-line flag for enabling reading flags from environment vars
By default flags are read only from command line. They can be read from environment vars if `-envflag.enable` is set.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-10 16:02:37 +02:00
Aliaksandr Valialkin
8466ab0034 all: allow setting flags via environment vars
Now flags can be set via environment vars with the same names as flags.
Command-line flags override flags set via env vars.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-10 13:29:13 +02:00
Aliaksandr Valialkin
e210cd9da1 lib/storage: move -dedup.minScrapeInterval flag outside lib/storage, so it doesnt show up in vminsert in cluster version 2020-02-10 13:09:51 +02:00
Aliaksandr Valialkin
6db573470c docs/Single-server-VictoriaMetrics.md: sync with README.md 2020-02-07 00:02:34 +02:00
Ryota Arai
fffe5d4ba4 Fix a typo in README (selfScrapeInterval) (#310) 2020-02-06 13:14:31 +02:00
Aliaksandr Valialkin
a6c6a2debc app/vmselect/promql: do not add step to range end, since this hack became obsolete since commit 9e1119dab8 2020-02-05 21:22:19 +02:00
Aliaksandr Valialkin
78b62dee87 app/vmselect/promql: properly adjust time range for data to select
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-05 21:22:18 +02:00
Aliaksandr Valialkin
366693b9f1 app/vmselect: unconditionally offset -step to rollup_candlestick. This makes results more consistent 2020-02-04 23:32:12 +02:00
Aliaksandr Valialkin
525101339e app/vmselect/promql: automatically apply offset -step to rollup_candlestick function in order to obtain the expected OHLC results
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-04 23:24:35 +02:00
Aliaksandr Valialkin
ada6a3da8d app/vmselect/promql: adjust rollup_candlestick calculations to the exepcted results
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-04 22:42:13 +02:00
Aliaksandr Valialkin
40c6ae2952 lib/logger: initialize output to os.Stderr by default 2020-02-04 22:40:44 +02:00
Aliaksandr Valialkin
cff0cb297c Do not require checking for errors returned from fmt.Fprint
This fixes `make errcheck` error found in lib/logger
2020-02-04 22:03:37 +02:00
Aliaksandr Valialkin
e0a4c37fc1 lib/logger: add -loggerOutput command-line flag
This flag allows changing log output from `stderr` to `stdout` if `-loggerOutput=stdout` is set.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/306
2020-02-04 21:47:56 +02:00
Aliaksandr Valialkin
7f3e3a6034 lib/logger: do not clutter -loggerFormat=json output with stack trace
This should improve json parsing
2020-02-04 21:37:25 +02:00
Aliaksandr Valialkin
bd4698bb7a lib/storage: do not deduplicate blocks with less than 32 samples during merge
This should improve deduplication accuracy for blocks with higher number of samples.
2020-02-04 18:41:54 +02:00
Aliaksandr Valialkin
36a1ac8360 app/vmselect: take into account the time the requests wait in the queue if -search.maxConcurrentRequests is exceeded
This will prevent from excess CPU usage for timed out queries.
2020-02-04 16:15:08 +02:00
Aliaksandr Valialkin
834051e5b2 app/vmselect: add a placeholder for /api/v1/metadata, which could be requested by Grafana
See https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata

VictoriaMetrics doesn't collect any metadata for metrics, so just return empty response.
2020-02-04 15:53:47 +02:00
Aliaksandr Valialkin
42864bb52f all: do not clash flag description with back-quoted flag types
See https://golang.org/pkg/flag/#PrintDefaults for more details.
2020-02-04 15:46:52 +02:00
Roman Khavronenko
1e023c6a72 Single dashboard (#300)
* improve description for `Pending datapoints` panel

* bump VM version requirement
2020-02-03 02:09:53 +02:00
Artem Navoiev
a47f292295 [vmalert] add vmalert.png.2 2020-02-02 12:17:19 +02:00
Artem Navoiev
354232b62b [vmalert] add vmalert.png 2020-02-02 12:16:05 +02:00
Artem Navoiev
28778be0cc [vmalert] initial 2020-02-02 12:14:09 +02:00
Aliaksandr Valialkin
90cf356ea1 app/vmselect/promql: adjust and and unless binary operator handling to be consistent with Prometheus 2020-01-31 18:52:38 +02:00
Aliaksandr Valialkin
c0b69ed06e deployment/docker: update Go builder from v1.13.6 to v1.13.7 2020-01-31 18:06:58 +02:00
Aliaksandr Valialkin
011a79da85 lib/fs: remove unused readerAt interface 2020-01-31 15:12:43 +02:00
Aliaksandr Valialkin
c3d86eef96 all: add -dedup.minScrapeInterval command-line flag for data de-duplication
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/86
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/278
2020-01-31 01:16:57 +02:00
Aliaksandr Valialkin
2152f6f0cd lib/storage: re-use indexSearch inside Storage.prefetchMetricNames 2020-01-31 01:16:53 +02:00
Aliaksandr Valialkin
d70ba7eb37 lib/fs: optimize small reads for ReaderAt.MustReadAt by reading from memory-mapped space instead of reading from file descriptor
This should improve performance when reading many small blocks.
2020-01-30 15:09:05 +02:00
Aliaksandr Valialkin
ad8af629bb all: rename ReadAt* to MustReadAt* in order to dont clash with io.ReaderAt 2020-01-30 15:08:58 +02:00
Aliaksandr Valialkin
d68546aa4a lib/storage: pre-fetch metricNames for the found metricIDs in Search.Init
This should speed up Search.NextMetricBlock loop for big number of found time series.
2020-01-30 15:08:51 +02:00
Aliaksandr Valialkin
5bb9ccb6bf lib/mergeset: properly update lastAccesstime in indexBlockCache entries
This is a follow-up for 6665f10e7b
2020-01-29 21:20:47 +02:00
Aliaksandr Valialkin
a462355b2f app/vmselect/promql: add keep_next_value(q) for filling gaps with the next non-empty value 2020-01-29 00:48:04 +02:00
Aliaksandr Valialkin
bdbb463756 docs/Single-server-VictoriaMetrics.md: fix heading size for Third-party contributions section 2020-01-28 23:13:35 +02:00
Aliaksandr Valialkin
371e86194d app/vminsert: moved -maxInsertRequestSize command-line flag out of lib/prompb in order to prevent its inclusion in vmselect and vmstorage apps 2020-01-28 23:02:08 +02:00
Aliaksandr Valialkin
adbbc4fa1a app/vmselect/promql: return expected results from increase() over the beginning of time series, which start from big value
Examples for such counters: OS-level counters for network or cpu stats.
2020-01-28 16:30:11 +02:00
Aliaksandr Valialkin
75ad47a43c app/victoria-metrics: check for error arg passed to filepath.Walk callback 2020-01-27 20:56:45 +02:00
Aliaksandr Valialkin
6320a19a8c app/victoria-metrics: remove integration build tag from tests
This simplifies testing with `go test ./app/victoria-metrics` without
the need to remember to pass `-tags=integration` to Go commands.
2020-01-27 20:25:28 +02:00
Aliaksandr Valialkin
7b26db5527 docs/Single-server-VictoriaMetrics.md: update Retention section 2020-01-27 18:44:21 +02:00
Alexander Danilov
1a3626bbe1 Add description for retention and how it works (#297) 2020-01-27 18:38:22 +02:00
Aliaksandr Valialkin
8074c10590 README.md: mention https://github.com/AnchorFree/tsdb-remote-write 2020-01-27 18:35:48 +02:00
Aliaksandr Valialkin
2392a359e1 app/vmselect/promql: fix panic on a single zero vmrange bucket in prometheus_buckets() function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/296
2020-01-27 18:04:55 +02:00
Aliaksandr Valialkin
6caa9bb51b lib/logger: fix improperly set skipframes for all the logging functions
The bug has been introduced in the previous commit f6baee6efe
2020-01-26 18:34:27 +02:00
Aliaksandr Valialkin
f6baee6efe lib/httpserver: log the caller of httpserver.Errorf
Previously log message contained `httpserver.Errorf`, not it contains the caller of `httpserver.Errorf`, which is more useful.
2020-01-25 20:17:59 +02:00
Aliaksandr Valialkin
9df5b2d1c3 app/victoria-metrics: add -selfScrapeInterval flag for self-scraping /metrics page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/30
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/180
2020-01-25 19:19:59 +02:00
Aliaksandr Valialkin
2a0a0ed14d lib/protoparser: add parser for Prometheus exposition text format
This parser will be used by vmagent
2020-01-24 20:11:02 +02:00
Aliaksandr Valialkin
6456c93dbb app/vminsert: move ingestion protocol parsers to lib/protoparser, so they could be re-used in the upcoming vmagent 2020-01-24 16:53:00 +02:00
Aliaksandr Valialkin
1efea246b7 docs/Articles.md: add a link to https://medium.com/@valyala/billy-how-victoriametrics-deals-with-more-than-500-billion-rows-e82ff8f725da 2020-01-22 19:08:35 +02:00
Aliaksandr Valialkin
680080887d all: consistently log durations in seconds with millisecond precision
This should improve logs readability
2020-01-22 18:28:27 +02:00
Aliaksandr Valialkin
3992984e10 vendor: make vendor-update 2020-01-22 18:08:39 +02:00
Aliaksandr Valialkin
9773022e50 app/vmselect: mention the original query and time range in error messages
This should simplify debugging invalid or heavy queries.
2020-01-22 17:36:36 +02:00
Aliaksandr Valialkin
f8954c7250 vendor: update github.com/klauspost/compress from v1.9.7 to v1.9.8
New version should have better gzip compression. See https://github.com/klauspost/compress#changelog
2020-01-22 16:50:15 +02:00
Aliaksandr Valialkin
0ef6f91410 docs: Mention Slack and Telegram channels for user questions 2020-01-22 16:50:14 +02:00
Aliaksandr Valialkin
efc7ad88ec app/vmselect: mention command-line flag, which could be used for adjusting query timeouts, in timeout errors 2020-01-22 15:50:48 +02:00
Aliaksandr Valialkin
ec9651e266 app/vmselect/prometheus: increase default value -maxExportDuration to 30 days, since 10 minutes beat users exporting bit amounts of data 2020-01-22 15:50:47 +02:00
Aliaksandr Valialkin
a8b2f82fc6 vendor: update github.com/VictoriaMetrics/fastcache from v1.5.5 to v1.5.7 2020-01-22 12:31:32 +02:00
Aliaksandr Valialkin
582dd01f42 app/vmselect/promql: add range_over_time(m[d]) function for calculating value range for m over d 2020-01-21 19:05:17 +02:00
Aliaksandr Valialkin
36973ee975 app/vmselect/promql: add label_match(q, label, regexp) and label_mismatch(q, label, regexp) functions for filtering out time series with labels matching the given regexp 2020-01-21 15:00:20 +02:00
Aliaksandr Valialkin
6665f10e7b lib/{mergeset,storage}: properly update lastAccessTime in index and data block cache entries 2020-01-20 14:59:47 +02:00
Aliaksandr Valialkin
04363d6511 README.md: mention that delete API shouldnt be used on a regular basis due to non-zero overhead 2020-01-20 13:28:36 +02:00
Aliaksandr Valialkin
c97ade4487 docs/FAQ.md: typo fix according to comment from https://www.reddit.com/message/messages/lezkmo 2020-01-18 18:05:13 +02:00
Aliaksandr Valialkin
970f0dfbf2 docs/CaseStudies.md: add links to COLOPL talk about VictoriaMetrics 2020-01-18 17:23:33 +02:00
Aliaksandr Valialkin
227cf53ef9 app/vminsert: increase default value for -insert.maxQueueDuration from 30s to 60s
This should help catching up with high ingestion rate after VictoriaMetrics restart.
2020-01-18 14:39:36 +02:00
Aliaksandr Valialkin
257e61195a lib/uint64set: add missing bucket32.b16his values 2020-01-18 14:26:04 +02:00
Aliaksandr Valialkin
4cc0c44b9e lib/uint64set: optimize Set.Union
This should improve performance for queries over big number of time series
2020-01-18 13:47:03 +02:00
Aliaksandr Valialkin
1b5f02e293 lib/uint64set: add benchmarks for Set.Union 2020-01-18 13:47:02 +02:00
Aliaksandr Valialkin
3748fb24b6 lib/storage: skip recovering timestamps order for lossless compression (PrecisionBits=64) 2020-01-18 00:09:33 +02:00
Aliaksandr Valialkin
c9472e4f3a all: use github.com/klauspost/compress/gzip instead of compress/gzip
`github.com/klauspost/compress/gzip` is more optimized than `compress/gzip`.
This gives better gzip compression and decompression speeds.
2020-01-17 23:58:46 +02:00
Aliaksandr Valialkin
bc0f897fcb lib/uint64set: reduce memory allocations in Set.AppendTo 2020-01-17 22:33:09 +02:00
Aliaksandr Valialkin
f9289b804a lib/storage: reduce memory allocations when merging metricID sets 2020-01-17 22:10:44 +02:00
Aliaksandr Valialkin
0c8ad08578 lib/uint64set: typo fix in Set.Intersect 2020-01-17 18:10:58 +02:00
Aliaksandr Valialkin
cdcacaea6d app/vmselect/netstorage: make fmt 2020-01-17 17:47:21 +02:00
Aliaksandr Valialkin
7327adbc86 app/vmselect/netstorage: limit the maximum size for in-memory buffer for temporary blocks file
This should reduce memory usage on systems with more than 8GB RAM.
2020-01-17 16:28:21 +02:00
Aliaksandr Valialkin
9f027ec176 lib/uint64set: optimize Intersect, Subtract and Union functions
This should improve performance for queries over big number of time series.
2020-01-17 16:11:49 +02:00
Aliaksandr Valialkin
cd53f7d177 lib/uint64set: improve benchmark for Set.Intersect 2020-01-17 16:08:17 +02:00
Aliaksandr Valialkin
d0d258b314 app/vmselect: limit the default value for -search.maxConcurrentRequests, so it plays well on systems with more than 16 vCPUs
A single heavy request can saturate all the available CPUs, so let's limit the number of concurrent requests to lower value.
This will give more chances for executing insert path.
2020-01-17 15:43:54 +02:00
Aliaksandr Valialkin
d88725f133 app/{vminsert,vmselect}: improve error messages when VictoriaMetrics cannot handle too high number of concurrent inserts / selects 2020-01-17 13:24:37 +02:00
Aliaksandr Valialkin
8dbf430469 lib/uint64set: add benchmark for Set.Intersect 2020-01-17 00:31:07 +02:00
Aliaksandr Valialkin
9ef4d32a9a make vendor-update 2020-01-16 14:14:19 +02:00
Aliaksandr Valialkin
0d7505b00e all: mention command-line flags used for limiting the incoming request size in error messages
This should improve error logs usability.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/287
2020-01-16 13:03:30 +02:00
Aliaksandr Valialkin
2839f4688a app/vmselect/promql: fix panic on sum(aggr_over_time(...)) with incorrect number of args 2020-01-15 16:26:09 +02:00
Aliaksandr Valialkin
605d588ba6 lib/uint64set: reduce memory usage in Union, Intersect and Subtract methods
Iterate items with newly added Set.ForEach method instead of allocating `[]uint64`
slice for all the items before the iteration.
2020-01-15 12:12:49 +02:00
Aliaksandr Valialkin
7483deccca docs/FAQ.md: add bullet comparison with Cortex and Thanos 2020-01-15 10:47:40 +02:00
Aliaksandr Valialkin
893b62c682 lib/{mergeset,storage}: fix uint64 counters alignment for 32-bit architectures (GOARCH=386, GOARCH=arm) 2020-01-14 22:47:04 +02:00
Aliaksandr Valialkin
7830c10eb2 lib/{storage,mergeset}: gradually remove stale entries from block cache and index caches
This should reduce memory usage in the long run when old blocks and indexes
aren't accessed anymore.
2020-01-14 21:38:44 +02:00
Aliaksandr Valialkin
e4f1bfd221 deployment/docker: update Prometheus from v2.14.0 to v2.15.2 and Grafana from v6.5.0 to v6.5.2 2020-01-12 23:14:10 +02:00
Aliaksandr Valialkin
91ee1bce2e README.md: add a link to VictoriaMetrics subreddit - https://www.reddit.com/r/VictoriaMetrics/ 2020-01-12 00:06:20 +02:00
Aliaksandr Valialkin
8b14572f70 app/vmselect/promql: add hoeffding_bound_upper(phi, m[d]) and hoeffding_bound_lower(phi, m[d]) functions
These functions can be used for calculating Hoeffding bounds
for `m` over `d` time range and for the given `phi` in the range `[0..1]`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/283
2020-01-11 14:46:23 +02:00
Aliaksandr Valialkin
8eaced8cae app/vmselect/promql: return continuous values for min_over_time and max_over_time when step is smaller than scrape_interval 2020-01-11 12:47:50 +02:00
Aliaksandr Valialkin
1585dab5a3 deployment/docker: switch Go builder from v1.13.5 to v1.13.6 2020-01-11 11:06:00 +02:00
Aliaksandr Valialkin
cd66d3fc43 README.md: mention about Prometheus->VictoriaMetrics exporter https://github.com/ryotarai/prometheus-tsdb-dump
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/93
2020-01-11 01:29:09 +02:00
Aliaksandr Valialkin
ea231f8167 app/victoria-metrics: adjust integration tests after the commit 99facd71cd6ac151d512cea1df73be91c10c7f83 2020-01-11 00:58:16 +02:00
Aliaksandr Valialkin
46bfdbe6cf app/vmselect/promql: do not take into account the previous point before time window in square brackets for min_over_time, max_over_time, rollup_first and rollup_last functions
This makes the behaviour for these functions similar to Prometheus when processing broken time series with irregular data points
like `gitlab_runner_jobs`. See https://gitlab.com/gitlab-org/gitlab-exporter/issues/50 for details.
2020-01-11 00:26:26 +02:00
Aliaksandr Valialkin
4f0a645f77 vendor: update github.com/valyala/fastjson from v1.4.2 to v1.4.5
This should fix parsing Inf values in `/api/v1/import`. The previous attempt to fix this in VictoriaMetrics v1.32.1 was unsuccessful.
2020-01-10 23:15:15 +02:00
Aliaksandr Valialkin
b829fe5e39 app/vmselect/promql: properly handle aggr(aggr_over_time(...)) 2020-01-10 21:57:18 +02:00
Aliaksandr Valialkin
164278151f app/vmselect/promql: add aggr_over_time(("aggr_func1", "aggr_func2", ...), m[d]) function
This function can be used for simultaneous calculating of multiple `aggr_func*` functions
that accept range vector. For example, `aggr_over_time(("min_over_time", "max_over_time"), m[d])`
would calculate `min_over_time` and `max_over_time` for `m[d]`.
2020-01-10 21:18:06 +02:00
Aliaksandr Valialkin
c4632faa9d app/vmselect/promql: add tmin_over_time(m[d]) and tmax_over_time(m[d]) functions
These functions return timestamp in seconds for the minimum and maximum value for `m` over time range `d`
2020-01-10 19:39:28 +02:00
Aliaksandr Valialkin
a768198814 docs: fix spelling typos 2020-01-09 23:42:55 +02:00
Roman Khavronenko
57f4875024 fix spellcheck issues (#285) 2020-01-09 23:41:52 +02:00
Aliaksandr Valialkin
b8038a14e7 lib/backup/s3remote: check whether the file exists before deleting it
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/284
2020-01-09 23:20:31 +02:00
Aliaksandr Valialkin
f358fb72d1 app/{vmbackup,vmrestore}: add backup complete file to backup when it is complete and check for this file before restoring from backup
This should prevent from restoring from incomplete backups.

Add `-skipBackupCompleteCheck` command-line flag to `vmrestore` in order to be able restoring from old backups without `backup complete` file.
2020-01-09 15:35:38 +02:00
Aliaksandr Valialkin
1c436b2723 vendor: update github.com/valyala/fastjson from v1.4.1 to v1.4.2
This fixes parsing of `inf` and `nan` values in json lines passed to `/api/v1/import`
2020-01-08 20:47:21 +02:00
Aliaksandr Valialkin
a973df6d79 README.md: remove height="200px" from logo image, since it is improperly displayed on smartphones 2020-01-08 20:29:11 +02:00
Aliaksandr Valialkin
d4132a6915 docs: typo fix 2020-01-08 14:45:27 +02:00
Aliaksandr Valialkin
d5aeda0e1a app/vmselect/promql: skip rate calculation for the first point on time series 2020-01-08 14:42:53 +02:00
Aliaksandr Valialkin
bb71b6d47d docs: add references to Remote Write Storage Wars
Also mention than VictoriaMetrics uses less RAM than Thanos Store Gateway - see https://github.com/thanos-io/thanos/issues/448 for details.
2020-01-04 23:57:35 +02:00
Aliaksandr Valialkin
fc71602039 lib/storage: limit maxRaRowsPerPartition by 500K for any number of rawRowsShardsPerPartition
This should reduce write amplification for high ingestion rate on multi-CPU systems
2020-01-04 23:57:31 +02:00
Aliaksandr Valialkin
c60fdbed30 docs/CaseStudies.md: add link to Remote Write Storage Wars talk from Adidas at PromCon 2019 2020-01-04 16:51:45 +02:00
Aliaksandr Valialkin
d410c78c7e app/vmselect/promql: fix calculations for histogram_share 2020-01-04 14:44:48 +02:00
Aliaksandr Valialkin
66f3d1dac8 README.md: update Alerting section 2020-01-04 13:55:09 +02:00
Aliaksandr Valialkin
d9c4ac9978 lib/metricsql: export IsRollupFunc and IsTransformFunc, since they can be used by package users 2020-01-04 13:25:05 +02:00
Aliaksandr Valialkin
4567a59fa0 LICENSE: update year 2020-01-04 13:21:04 +02:00
Aliaksandr Valialkin
d64699bb9f app/vmselect/promql: add missing MetricName into netstorage.Result in tests 2020-01-04 12:52:39 +02:00
Aliaksandr Valialkin
f409f2d050 app/vmselect/promql: add histogram_share(le, buckets) function 2020-01-04 12:45:55 +02:00
Aliaksandr Valialkin
b1ded7cf9a app/vmselect/promql: add absent_over_time(m[d]) func similar to the function in Prometheus 2.16
See https://github.com/prometheus/prometheus/issues/2882
2020-01-04 12:45:07 +02:00
Aliaksandr Valialkin
a8360d04c0 app/vmselect/promql: add histogram_over_time(m[d]) rollup function 2020-01-04 12:44:56 +02:00
Aliaksandr Valialkin
3e09d38f29 app/vmselect/promql: fix results caching for multi-arg rollup functions such as quantile_over_time
Previosly only a single arg was taken into account, so caching didn't work properly for multi-arg rollup funcs.
2020-01-03 20:49:08 +02:00
Aliaksandr Valialkin
a774120460 app/vmselect/promql: use scrapeInterval instead of window in denominator when calculating rate for the first point on the time series
This should provide better estimation for `rate` in the beginning of time series.
2020-01-03 19:01:50 +02:00
Aliaksandr Valialkin
695682232f lib/uint64set: reduce memory usage when storing big number of sparse metric_id values 2020-01-03 18:16:44 +02:00
Aliaksandr Valialkin
b5645ccbdf app/vmselect/promql: increase the estimated number of time series returned by aggr() by (something) from 100 to 1K, since 100 may result in OOM for high number of time series 2020-01-03 01:02:21 +02:00
Aliaksandr Valialkin
cb3a342882 app/vmselect/promql: add share_le_over_time and share_gt_over_time functions for SLI and SLO calculations 2020-01-03 00:41:16 +02:00
Aliaksandr Valialkin
0038365206 docs: refer to standalone MetricsQL package 2020-01-02 23:43:35 +02:00
Aliaksandr Valialkin
61c9d320ed vendor: update github.com/VictoriaMetrics/fastcache from v1.5.4 to v1.5.5 2019-12-29 18:17:49 +02:00
Aliaksandr Valialkin
a21d786d3c lib/metricsql: add example for ExpandWithExprs 2019-12-26 21:32:11 +02:00
Aliaksandr Valialkin
192b51c246 vendor: make vendor-update 2019-12-26 19:41:02 +02:00
Aliaksandr Valialkin
17a4dc9782 vendor: update github.com/valyala/gozstd from v1.6.3 to v1.6.4
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-26 19:30:08 +02:00
Aliaksandr Valialkin
6f67e0b56b lib/metricsq: add ExpandWithExprs 2019-12-25 22:20:30 +02:00
Aliaksandr Valialkin
1925ee038d Rename lib/promql to lib/metricsql and apply small fixes 2019-12-25 22:03:59 +02:00
Mike Poindexter
bec62e4e43 Split Extended PromQL parsing to a separate library 2019-12-25 22:03:51 +02:00
Aliaksandr Valialkin
d880325cf6 app/vmselect/promql: make sure AdjustStartEnd returns time range covering the same number of points as the initial time range
This should prevent from the following panic at app/vmselect/promql/binary_op.go:255:

    BUG: len(leftVaues) must match len(rightValues) and len(dstValues)
2019-12-24 22:45:56 +02:00
Aliaksandr Valialkin
c18802af59 lib/fs: typo fix in fadvise_unix.go 2019-12-24 20:59:28 +02:00
Aliaksandr Valialkin
4ba4abe666 lib/encoding: log the compressed block contents if it cannot be decompressed or unmarshaled
This should help detecting the root cause of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-24 20:48:31 +02:00
Aliaksandr Valialkin
5bb39e757b lib/encoding: mention src contents in error message returned from unmarshalInt64NearestDelta*
This should simplify detecting the root cause of the issue at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-24 20:41:52 +02:00
Aliaksandr Valialkin
d5c9841220 lib/encoding: mention unpacked block size in the error message if unparsed tail left
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-24 20:35:13 +02:00
Aliaksandr Valialkin
9e19949c6b app/vmselect/promql: adjust calculations for rate and increase for the first value
These calculations should trigger alerts on `/api/v1/query` for counters starting from values greater than 0.
2019-12-24 19:39:25 +02:00
Aliaksandr Valialkin
0455c03cb9 app/vmselect/promql: properly calculate rate on the first data point
It is calculated as `value / scrape_interval`, since the value was missing on the previous scrape,
i.e. we can assume its value was 0 at this time.
2019-12-24 15:55:52 +02:00
Aliaksandr Valialkin
5cb8d97743 all: use gozstd instead of pure Go zstd for GOARCH=amd64 2019-12-24 12:42:42 +02:00
Aliaksandr Valialkin
31d04fb5df Revert "lib/logger: prevent from blocking when log output isn't consumed in timely manner"
This reverts commit e3c462f08a.

Reason to revert: this leaves incomplete logs on app shutdown.
2019-12-24 12:21:39 +02:00
Aliaksandr Valialkin
5b75984aa9 app/vmselect/netstorage: move MustAdviseSequentialRead to lib/fs 2019-12-23 23:16:11 +02:00
Aliaksandr Valialkin
097c21931c docs: sync README.md with Single-server-VictoriaMetrics.md 2019-12-23 20:34:21 +02:00
Roman Khavronenko
85463a7199 update configuration recommendations for Prometheus remote_write (#277) 2019-12-23 20:33:10 +02:00
Aliaksandr Valialkin
6a1499efa3 lib/encoding/zstd: prevent from possible encoder leak when concurrent goroutines create encoders for the same compressionLevel
Thanks to @klauspost for the pointer to this issue. See https://github.com/klauspost/compress/issues/195 for details.
2019-12-23 18:05:41 +02:00
Aliaksandr Valialkin
bf4413e58d README.md: document how to export and import gzipped data 2019-12-23 13:40:22 +02:00
Aliaksandr Valialkin
e3c462f08a lib/logger: prevent from blocking when log output isn't consumed in timely manner
Drop log messages instead of blocking and increment `vm_log_messages_dropped_total` metric.
2019-12-20 11:49:34 +02:00
Aliaksandr Valialkin
bea5a8700a app/vmselect: add -search.maxExportDuration command-line flag for limiting /api/v1/export duration
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/275
2019-12-20 11:35:22 +02:00
Aliaksandr Valialkin
1825893eef lib/storage: scale ingestion performance by sharding rawRows on systems with more than 8 CPU cores 2019-12-19 18:18:29 +02:00
Aliaksandr Valialkin
97f70ccda7 lib/storage: optimize bulk import performance when multiple data points are inserted for the same time series
This should speed up `/api/v1/import` and make it more scalable on multi-core systems.
2019-12-19 18:18:29 +02:00
Andrii Dembitskyi
2fba7b6f35 Fix typo in log message 2019-12-19 14:33:20 +02:00
Aliaksandr Valialkin
d03827c57d app/vminsert: return StatusNoContent http response for /api/v1/import to be consistent with other insert handlers 2019-12-19 01:21:54 +02:00
Aliaksandr Valialkin
bb530a0591 lib/httpserver: inline checkAuth code to make it more clear 2019-12-18 23:06:25 +02:00
koalaty-code
aea4c80dd7 Ignore /health endpoint when checking auth 2019-12-18 23:04:31 +02:00
Aliaksandr Valialkin
5e8e0fbc80 docs/ExtendedPromQL.md: rewording regarding scalar vs instant vector difference 2019-12-18 21:47:24 +02:00
Aliaksandr Valialkin
1e8aa89a3b docs/Home.md: fix link to case studies 2019-12-18 01:04:20 +02:00
Aliaksandr Valialkin
56595ae12a docs: renaming: PromQL extensions -> MetricsQL 2019-12-18 00:56:51 +02:00
Aliaksandr Valialkin
96ff8d9adb app/vmselect: add ability to pass match[], start and end to /api/v1/labels
This makes the `/api/v1/labels` handler consistent with already existing functionality for `/api/v1/label/.../values`.

See https://github.com/prometheus/prometheus/issues/6178 for more details.
2019-12-15 00:20:50 +02:00
Aliaksandr Valialkin
02f6566ce1 app/vmbackup: mention that backups are possible to Ceph and Swift 2019-12-14 01:08:49 +02:00
Aliaksandr Valialkin
7535f20c98 docs: publish vmbackup and vmrestore docs on wiki and victoriametrics.github.io 2019-12-14 01:05:55 +02:00
Aliaksandr Valialkin
bc645152cb app/vminsert: simultaneously accept telnet put and HTTP /api/put OpenTSDB metrics at -opentsdbListenAddr
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/266
2019-12-14 00:30:12 +02:00
Aliaksandr Valialkin
f5ac9b0721 lib/logger: add -loggerFormat for choosing log message formats
Supported formats: default, json

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/265
2019-12-13 15:10:05 +02:00
Aliaksandr Valialkin
d95a43f392 docs: sync with cluster branch 2019-12-12 20:49:55 +02:00
Aliaksandr Valialkin
87a8348062 make vendor-update 2019-12-12 19:39:52 +02:00
Aliaksandr Valialkin
cea5a14853 all: rename Extended PromQL to PromQL extensions 2019-12-12 19:25:58 +02:00
Aliaksandr Valialkin
9787c228a4 docs/CaseStudies.md: add a link to VMQL 2019-12-12 14:53:48 +02:00
Aliaksandr Valialkin
c121608205 README.md: mention that {__name__!=""} selects all the time series in /api/v1/export 2019-12-12 14:48:30 +02:00
Aliaksandr Valialkin
492f032b38 docs: add Dreamteam numbers 2019-12-12 01:01:07 +02:00
Aliaksandr Valialkin
4624c060ac docs/Single-server-VictoriaMetrics.md: sync with README.md 2019-12-12 00:55:14 +02:00
Clémence Saussez
8454679d9f README.md: adds link to Grafana dashboard for clustered version (#259)
Signed-off-by: Clemence Saussez <clemence@zen.ly>
2019-12-12 00:54:24 +02:00
Aliaksandr Valialkin
440a15111e deployment/docker/Makefile: mention that the Makefile rules must be invoked from the repository root 2019-12-11 23:33:02 +02:00
Aliaksandr Valialkin
6ddcd162ed all: publish Docker images for the following GOARCH: amd64, arm, arm64, ppc64le and 386
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/258
2019-12-11 23:32:59 +02:00
Aliaksandr Valialkin
6504f78ce4 README.md: add Docker hub shield 2019-12-11 18:34:26 +02:00
Aliaksandr Valialkin
73b2a3d4b7 app/vmselect/promql: return lower and upper bounds for the estimated percentile from histogram_quantile if third arg is passed
Updates https://github.com/prometheus/prometheus/issues/5706
2019-12-11 13:57:26 +02:00
Aliaksandr Valialkin
07d5bc986b CaseStudies: clarify wording: metrics -> active time series 2019-12-11 12:05:08 +02:00
Aliaksandr Valialkin
caa4eb72d9 app/vmselect/promql: return matrix instead of vector on subqueries to /api/v1/query like Prometheus does 2019-12-11 01:00:26 +02:00
Aliaksandr Valialkin
3c076544bf app/vmselect/promql: allow negative offsets
Updates https://github.com/prometheus/prometheus/issues/6282
2019-12-11 01:00:23 +02:00
Aliaksandr Valialkin
35f5ca1def README.md: typo fixes 2019-12-09 23:30:01 +02:00
Aliaksandr Valialkin
a7d80f62be README.md: add a chapter about Prometheus querying API usage
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/253
2019-12-09 23:27:23 +02:00
Aliaksandr Valialkin
40540397c3 README.md: use relative links to REAMDE.md 2019-12-09 23:04:34 +02:00
Aliaksandr Valialkin
c107f46b0e docs: mention about /api/v1/import in Single-server-VictoriaMetrics.md 2019-12-09 23:02:07 +02:00
Aliaksandr Valialkin
8cce513a15 docs: mention about /api/v1/import in Cluster-VictoriaMetrics.md 2019-12-09 23:01:14 +02:00
Aliaksandr Valialkin
b01ddfdd76 deployment/docker: update Go builder from go1.13.4 to go1.13.5 2019-12-09 22:58:26 +02:00
Aliaksandr Valialkin
68e1cf8942 app/vminsert: add /api/v1/import handler
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
2019-12-09 20:59:04 +02:00
Aliaksandr Valialkin
8501b4a48d app/vminsert: consistency renaming for counters 2019-12-09 16:43:10 +02:00
Aliaksandr Valialkin
0ed9258545 lib/{mergeset,storage}: log info message when both source and destination part paths from txn are missing during startup
This is expected condition after unclean shutdown (OOM, hard reset, `kill -9`) on NFS disk.
2019-12-09 15:44:53 +02:00
Roman Khavronenko
b0d88460de #251 - add Logging rate panel (#254) 2019-12-09 13:05:59 +02:00
Aliaksandr Valialkin
8db7660afe docs/CaseStudies.md: mention that additional references and reviews can be obtained from our Slack channel 2019-12-08 14:04:18 +02:00
Aliaksandr Valialkin
18369bca42 docs/ExtendedPromQL.md: add a link to https://medium.com/@valyala/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350 for histogram func 2019-12-08 13:48:33 +02:00
Aliaksandr Valialkin
95328782c3 docs/CaseStudies.md: re-wording 2019-12-08 13:43:49 +02:00
Aliaksandr Valialkin
981cb66a95 docs/CaseStudies.md: improve wording 2019-12-08 13:39:29 +02:00
Aliaksandr Valialkin
f15d89bfe0 vendor: fix broken build for GOARCH=arm64 on golang.org/x/sys/unix 2019-12-08 13:27:38 +02:00
Aliaksandr Valialkin
36feb7d3e4 docs: add draft version of case studies 2019-12-08 13:23:15 +02:00
Aliaksandr Valialkin
d900184d8d vendor: fix arm build for golang.org/x/sys/unix/zptrace_armnn_linux.go 2019-12-08 12:49:05 +02:00
Aliaksandr Valialkin
293b541784 make vendor-update 2019-12-07 23:10:16 +02:00
Aliaksandr Valialkin
84b57e8974 app/vminsert/influx: add a test case from https://community.librenms.org/t/integration-with-victoriametrics/9689 2019-12-07 23:00:40 +02:00
Aliaksandr Valialkin
b458e5a213 README.md: mention that VictoriaMetrics is built on shared nothing architecture 2019-12-05 20:39:44 +02:00
Aliaksandr Valialkin
c09472dfd9 app/vmselect/promql: add {topk|bottomk}_{min|max|avg|median} aggregate functions for returning the exact k time series on the given time range
The full list of functions added:
- `topk_min(k, q)` - returns top K time series with the max minimums on the given time range
- `topk_max(k, q)` - returns top K time series with the max maximums on the given time range
- `topk_avg(k, q)` - returns top K time series with the max averages on the given time range
- `topk_median(k, q)` - returns top K time series with the max medians on the given time range
- `bottomk_min(k, q)` - returns bottom K time series with the min minimums on the given time range
- `bottomk_max(k, q)` - returns bottom K time series with the min maximums on the given time range
- `bottomk_avg(k, q)` - returns bottom K time series with the min averages on the given time range
- `bottomk_median(k, q)` - returns bottom K time series with the min medians on the given time range
2019-12-05 19:26:47 +02:00
Aliaksandr Valialkin
72345eb5bd lib/{mergeset,storage}: make sure pending transaction deletions are finished before and after runTransactions call.
`runTransactions` call issues async deletions for transaction files. The previously issued transaction deletions
can race with the next call to `runTransactions`. Prevent this by waiting until all the pending transaction
deletions are funished in the beginning of `runTransactions`. Also make sure that all the pending transaction
deletions are finished before returning from `runTransactions`.
2019-12-04 21:40:30 +02:00
Aliaksandr Valialkin
1244ad810d lib/httpserver: add /ping handler for compatibility with Influx agents
Certain Influx agents check for `/ping` endpoint before starting
to send Influx line protocol data. See https://docs.influxdata.com/influxdb/v1.7/tools/api/#ping-http-endpoint
2019-12-04 19:15:52 +02:00
Aliaksandr Valialkin
359c4d6109 docs: add a link to https://medium.com/@valyala/prometheus-storage-technical-terms-for-humans-4ab4de6c3d48 2019-12-03 22:37:16 +02:00
Aliaksandr Valialkin
face3d57bf app/vmselect: add placeholders for /api/v1/rules and /api/v1/alerts 2019-12-03 19:36:33 +02:00
Aliaksandr Valialkin
a247236f61 lib/storage: fall back to global inverted index if a filter match too many time series in per-day index
Previously this resulted to error message. The query may succeed via search in global index.
2019-12-03 14:48:31 +02:00
Aliaksandr Valialkin
54741ee578 lib/storage: fix printing tag filters in TagFilters.String 2019-12-03 14:25:13 +02:00
Aliaksandr Valialkin
efbc83a13e lib/storage: print __name__ instead of empty string in user-visible tag filters 2019-12-03 14:18:28 +02:00
Aliaksandr Valialkin
ade453847f docs: typo fixes 2019-12-03 00:44:50 +02:00
Aliaksandr Valialkin
f52874dab4 lib/storage: optimize regexp filter search 2019-12-03 00:43:12 +02:00
Artem Navoiev
652ba59ce9 [docs] update release page doc 2019-12-02 23:01:51 +02:00
Artem Navoiev
3e81ab2f75 [docs] change titles 2019-12-02 22:53:11 +02:00
Artem Navoiev
a778233877 [docs] change titles 2019-12-02 22:50:54 +02:00
Aliaksandr Valialkin
14100ed643 vendor: update github.com/VictoriaMetrics/metrics from v1.9.1 to v1.9.2
This fixes possible deadlock when metrics.WritePrometheus calls Gauge callback, which calls metrics functions with internal lock.
2019-12-02 22:33:33 +02:00
Artem Navoiev
cfc6e7df07 [docs] revert titles 2019-12-02 22:06:39 +02:00
Artem Navoiev
c07a83374c [docs] remove double titles 2019-12-02 22:02:59 +02:00
Artem Navoiev
c76b2be21f [ci] add github pages action 2019-12-02 21:53:33 +02:00
Aliaksandr Valialkin
638a5cbb16 lib/{mergeset,storage}: remove transaction files only after the mentioned dirs are really removed
This should fix the issue on NFS when incompletely removed dirs may be left
after unclean shutdown (OOM, kill -9, hard reset, etc.), while the corresponding transaction
files are already removed.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-12-02 21:36:31 +02:00
Aliaksandr Valialkin
20812008a7 lib/storage: remove metricID with missing metricID->metricName entry
The metricID->metricName entry can be missing in the indexdb after unclean shutdown
when only a part of entries for new time series is written into indexdb.

Recover from such a situation by removing the broken metricID. New metricID
will be automatically created for time series with the given metricName
when new data point will arive to it.
2019-12-02 20:46:44 +02:00
Aliaksandr Valialkin
62a915f2b2 lib/storage: protect from time drift during indexdb rotation
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/248
2019-12-02 14:44:42 +02:00
Aliaksandr Valialkin
42da569bcd lib/logger: merge file and line labels into location="file:line"
This should improve the usability for `vm_log_messages_total` metric during practical queries
2019-12-02 14:44:40 +02:00
Aliaksandr Valialkin
70b8191fab lib/storage: generate more human-friendly result in TagFilters.String 2019-12-02 13:52:22 +02:00
Aliaksandr Valialkin
9476b73527 app/vmselect/promql: estimate per-series scrape interval as 0.6 quantile for the first 100 intervals
This should improve scrape interval estimation for tiem series with gaps.
2019-12-02 13:42:33 +02:00
Aliaksandr Valialkin
542b9c2043 lib/logger: consistency renaming from vm_log_messages_count to vm_log_messages_total, since this is a counter 2019-12-02 00:49:00 +02:00
Aliaksandr Valialkin
c567919f80 lib/logger: track the number of log messages by (level, file, line) in the vm_log_messages_count metric 2019-12-01 18:37:49 +02:00
Aliaksandr Valialkin
761645b20a lib/netutil: use IPv6 for both listening and dialing if -enabledTCP6 is set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/244
2019-12-01 02:57:13 +02:00
Aliaksandr Valialkin
811b7a8303 app/vminsert/influx: allow empty measurement in Influx line protocol
In this case metric names are mapped directly from field names without any prefixes.
2019-11-30 23:18:41 +02:00
Artem Navoiev
4972bd4c96 Update release guide add Wiki section. Change styling 2019-11-30 21:10:42 +02:00
Artem Navoiev
335e0f8f6a Update release guide add Wiki section 2019-11-30 21:08:48 +02:00
Artem Navoiev
505e46980a [ci] push docs/*.md file to wiki 2019-11-30 20:58:28 +02:00
Artem Navoiev
ab88b77515 rename doc to docs 2019-11-30 20:48:40 +02:00
Artem Navoiev
3d8e75e065 [ci] test wiki push 2019-11-30 20:38:37 +02:00
Artem Navoiev
74b4ccfc91 [ci] push to wiki 2019-11-30 20:36:10 +02:00
Aliaksandr Valialkin
75ff524a4e app/vmselect/promql: fix corner case for increase over time series with gaps
In this case `increase` could return invalid high value for the first point after the gap.
2019-11-30 01:34:56 +02:00
Aliaksandr Valialkin
96492348cb deployment/docker/certs: update TLS certs source from alpine:3.9 to alpine:3.10 2019-11-29 19:57:29 +02:00
Aliaksandr Valialkin
f733cb2186 lib/backup: cosmetic fixes after #243 2019-11-29 18:07:04 +02:00
glebsam
15b7406f7b Add option to provide custom endpoint for S3, add option to specify S3 config profile (#243)
* Add option to provide custom endpoint for S3 for use with s3-compatible storages, add option to specify S3 config profile

* make fmt
2019-11-29 17:59:56 +02:00
Aliaksandr Valialkin
9010c6a1d6 lib/netutil: add -enableTCP6 command-line flag for enabling listening for IPv6 additionally to IPv4 TCP ports 2019-11-29 17:32:47 +02:00
Aliaksandr Valialkin
a7125a5b7b lib/backup: remove flock.lock file in empty dirs
This fixes an issue when VictoriaMetrics doesn't see the restored data after the following operations:

1. Stop VictoriaMetrics.
2. Delete `<-storageDataPath>` dir.
3. Start VictoriaMetrics, then stop it.
4. Restore data from backup with `vmrestore`.
5. Start VictoriaMetrics.

`vmrestore` didn't delete properly empty dirs in `<-storageDataPath>/indexdb` because of the remaining `flock.lock` files in these dirs.
2019-11-28 13:38:58 +02:00
Aliaksandr Valialkin
a6d7179286 README.md: remove the unnecessary step during restoring from backups 2019-11-27 19:57:03 +02:00
Aliaksandr Valialkin
e828647d0f vendor: make vendor-update 2019-11-27 15:37:14 +02:00
Aliaksandr Valialkin
31fb6f2b07 vendor: update github.com/VictoriaMetrics/fastcache from v1.5.2 to v1.5.4 2019-11-27 15:30:33 +02:00
Aliaksandr Valialkin
2c86816950 deployment/docker: update Grafana from v6.4.4 to v6.5.0 2019-11-27 15:10:37 +02:00
Aliaksandr Valialkin
4c859d980c app/vmselect/prometheus: consistently apply nocache arg to /api/v1/query the same way ast to /api/v1/query_range
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/241
2019-11-26 22:55:43 +02:00
Aliaksandr Valialkin
14bcff6015 lib/httpserver: improve docs for -tls* flags to be more clear
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/242
2019-11-26 18:08:35 +02:00
Aliaksandr Valialkin
110235f789 app/vmselect/prometheus: fix content-type for /api/v1/export responses
The correct Content-Type should be `application/stream+json` instead of `application/json`
Thanks to Joshua Ryder for pointing to this.
2019-11-26 17:45:26 +02:00
Aliaksandr Valialkin
205233d9a7 app/vmselect/promql: remove zero timeseries from prometheus_buckets output 2019-11-25 19:10:23 +02:00
Aliaksandr Valialkin
3f99f39e9b app/vmselect/prometheus: reduce default value for -search.latencyOffset from 60s to 30s
30 seconds should be enough for almost all the cases
2019-11-25 16:33:42 +02:00
Aliaksandr Valialkin
e91cb34c0e app/vmselect/promql: allow nested parens 2019-11-25 16:13:41 +02:00
Aliaksandr Valialkin
826dfd63a5 vendor: update github.com/VictoriaMetrics/metrics from v1.9.0 to v1.9.1 2019-11-25 15:23:01 +02:00
Aliaksandr Valialkin
0401969d78 app/vmselect/promql: re-use metrics.Histogram when calculating histogram function for each point on the graph
This should reduce the amounts memory allocations
2019-11-25 14:24:21 +02:00
Aliaksandr Valialkin
da98703748 app/vmselect/promql: optimize binary search over big number of samples during rollup calculations 2019-11-25 14:01:46 +02:00
Aliaksandr Valialkin
c28876172f app/vmselect/promql: adjust tests after the upgrade of github.com/VictoriaMetrics/metrics from v1.8.3 to v1.9.0 2019-11-25 13:43:57 +02:00
Aliaksandr Valialkin
66c53bf3c6 vendor: update github.com/VictoriaMetrics/metrics from v1.8.3 to v1.9.0 2019-11-25 13:19:43 +02:00
Aliaksandr Valialkin
50ae1879c6 app/vmselect/promql: add histogram aggregate function, which is useful for building heatmaps from multiple time series 2019-11-24 00:04:25 +02:00
Aliaksandr Valialkin
4ff2fbcf3f vendor: update github.com/VictoriaMetrics/metrics from v1.8.2 to v1.8.3 2019-11-24 00:04:24 +02:00
Aliaksandr Valialkin
5285acae3e lib/decimal: calculate ln2/ln10 constant during compile time 2019-11-23 15:52:58 +02:00
Aliaksandr Valialkin
8582b50360 app/vmselect/promql: do not take into account buckets with negative counters in prometheus_buckets 2019-11-23 14:19:25 +02:00
Aliaksandr Valialkin
19dfe52254 app/vmselect/promql: properly handle histogram_quantile(0, ...) with zero buckets 2019-11-23 14:02:35 +02:00
Aliaksandr Valialkin
4bb88843cf app/vmselect: add vm_per_query_{rows,series}_processed_count histograms 2019-11-23 13:23:26 +02:00
Aliaksandr Valialkin
0827bb6ce5 vendor: update github.com/VictoriaMetrics/metrics from v1.8.1 to v1.8.2 2019-11-23 11:48:54 +02:00
Aliaksandr Valialkin
7753c8c0a1 app/vmselect/promql: transparently apply prometheus_buckets in histogram_quantile 2019-11-23 11:48:51 +02:00
Aliaksandr Valialkin
ef25e1b049 vendor: update github.com/VictoriaMetrics/metrics from v1.8.0 to v1.8.1 2019-11-23 00:49:13 +02:00
Aliaksandr Valialkin
9d1fcb2be6 vendor: update github.com/VictoriaMetrics/metrics from v1.7.2 to v1.8.0. This version supports histograms 2019-11-23 00:20:27 +02:00
Aliaksandr Valialkin
c4287b3c86 app/vmselect/promql: add prometheus_buckets function for converting the upcoming histogram buckets from github.com/VictoriaMetrics/metrics to Prometheus-compatible buckets 2019-11-23 00:20:20 +02:00
Aliaksandr Valialkin
1f3fd2c910 app/vmselect: adjust end arg instead of adjusting start arg if start > end
`start` arg has higher chances to be set properly comparing to `end` arg,
so it is expected that the `end` arg could be adjusted if it was set incorrectly.
2019-11-22 16:12:19 +02:00
Aliaksandr Valialkin
90b03309de vendor: updated github.com/valyala/gozstd from v1.6.2 to v1.6.3 2019-11-21 23:57:00 +02:00
Aliaksandr Valialkin
7a4635f853 all: remove the remaining mentions of cluster version 2019-11-21 23:18:22 +02:00
Aliaksandr Valialkin
3e9b7addb1 lib/httpserver: typo fix in -httpAuth.password command-line description 2019-11-21 21:54:26 +02:00
Aliaksandr Valialkin
f652c0f40f lib/storage: move non-matching tag filters to the top at matchTagFilters
This should reduce the amount of useless work needed for matching the next metricNames.
2019-11-21 21:35:13 +02:00
Aliaksandr Valialkin
b8cde6cce1 lib/storage: speed up time series search for queries with multiple filters
Use optimized specialized binary search for uint64 metricIDs instead of generic sort.Search.
2019-11-21 18:43:17 +02:00
Aliaksandr Valialkin
aeea59e280 Makefile: create files with sha256 checksums during make release
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/19
2019-11-20 22:43:37 +02:00
Aliaksandr Valialkin
74e563ca3f README.md: added a link to https://github.com/dreamteam-gg/ansible-victoriametrics-role 2019-11-20 21:26:43 +02:00
Aliaksandr Valialkin
5c1e4143e9 lib/storage: verify the number of returned metricIDs in BenchmarkHeadPostingForMatchers 2019-11-20 15:39:28 +02:00
Aliaksandr Valialkin
52d7ca6bf0 lib/decimal: increase decimal->float speed conversion for integer numbers 2019-11-20 13:04:34 +02:00
Aliaksandr Valialkin
75eeea21ee lib/decimal: reduce rounding error when converting from decimal to float with negative exponent
While at it, slightly increase the conversion performance by moving fast path to the top of the loop.
2019-11-19 23:35:33 +02:00
Artem Navoiev
c03b87dac0 update version of codecove to 1.04 2019-11-19 22:23:14 +02:00
Aliaksandr Valialkin
259dc95366 make vendor-update 2019-11-19 21:35:07 +02:00
Aliaksandr Valialkin
cfb9fa2100 lib/backup: retrieve only the required metadata when reading GCS objects 2019-11-19 21:06:34 +02:00
Aliaksandr Valialkin
355ccba81a make vendor-update 2019-11-19 21:05:37 +02:00
Aliaksandr Valialkin
443189fb0a app/{vmbackup,vmrestore}: add -maxBytesPerSecond command-line flag for limiting the used network bandwidth during backup / restore 2019-11-19 20:31:52 +02:00
Aliaksandr Valialkin
2db06f0ef8 lib/backup: prevent from restoring to directory which is in use by VictoriaMetrics during the restore 2019-11-19 18:36:23 +02:00
Aliaksandr Valialkin
0094bc4fc9 app/vmselect/prometheus: properly adjust too big time time on /api/v1/query
Too big `time` must be adjusted to `now()-queryOffset`.
2019-11-19 00:42:00 +02:00
Aliaksandr Valialkin
b6f22a62cb lib/storage: increase the number of created time series in BenchmarkHeadPostingForMatchers in order to be on par with Promethues
The previous commit was accidentally creating 10x smaller number of time series than Prometheus
and this led to invalid benchmark results.

The updated benchmark results:

benchmark                                                          old ns/op      new ns/op     delta
BenchmarkHeadPostingForMatchers/n="1"                              272756688      6194893       -97.73%
BenchmarkHeadPostingForMatchers/n="1",j="foo"                      138132923      10781372      -92.19%
BenchmarkHeadPostingForMatchers/j="foo",n="1"                      134723762      10632834      -92.11%
BenchmarkHeadPostingForMatchers/n="1",j!="foo"                     195823953      10679975      -94.55%
BenchmarkHeadPostingForMatchers/i=~".*"                            7962582919     100118510     -98.74%
BenchmarkHeadPostingForMatchers/i=~".+"                            7589543864     154955671     -97.96%
BenchmarkHeadPostingForMatchers/i=~""                              1142371741     258003769     -77.42%
BenchmarkHeadPostingForMatchers/i!=""                              9964150263     159783895     -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo"              216995884      10937895      -94.96%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo"       202541348      10990027      -94.57%
BenchmarkHeadPostingForMatchers/n="1",i!=""                        486285711      87004349      -82.11%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo"                350776931      53342793      -84.79%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo"              380888565      54256156      -85.76%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo"             89500296       21823279      -75.62%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo"       379529654      46671359      -87.70%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo"     424563825      53915842      -87.30%

VictoriaMetrics uses 1GB of RAM during the benchmark (vs 3.5GB of RAM for Prometheus)
2019-11-18 19:50:58 +02:00
Aliaksandr Valialkin
8a0dfc6220 lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus
See the corresponding benchmark in Prometheus - 23c0299d85/tsdb/head_bench_test.go (L52)

The benchmark allows performing apples-to-apples comparison of time series search
in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness -
contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this.

Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots:

- Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers
- VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers

Benchmark results:
benchmark                                                          old ns/op      new ns/op     delta
BenchmarkHeadPostingForMatchers/n="1"                              272756688      364977        -99.87%
BenchmarkHeadPostingForMatchers/n="1",j="foo"                      138132923      1181636       -99.14%
BenchmarkHeadPostingForMatchers/j="foo",n="1"                      134723762      1141578       -99.15%
BenchmarkHeadPostingForMatchers/n="1",j!="foo"                     195823953      1148056       -99.41%
BenchmarkHeadPostingForMatchers/i=~".*"                            7962582919     8716755       -99.89%
BenchmarkHeadPostingForMatchers/i=~".+"                            7589543864     12096587      -99.84%
BenchmarkHeadPostingForMatchers/i=~""                              1142371741     16164560      -98.59%
BenchmarkHeadPostingForMatchers/i!=""                              9964150263     12230021      -99.88%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo"              216995884      1173476       -99.46%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo"       202541348      1299743       -99.36%
BenchmarkHeadPostingForMatchers/n="1",i!=""                        486285711      11555193      -97.62%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo"                350776931      5607506       -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo"              380888565      6380335       -98.32%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo"             89500296       2078970       -97.68%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo"       379529654      6561368       -98.27%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo"     424563825      6757132       -98.41%

The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics.

As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark.

Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 18:45:06 +02:00
Aliaksandr Valialkin
2ab4cea5e5 lib/storage: always start using per-day inverted index on the next day after its creation
The current day could miss entries for already stopped time series before
enabling per-day index.

This fixes the issue when queries return empty results during the first hour after
upgrading to v1.29.*
2019-11-16 12:11:25 +02:00
Aliaksandr Valialkin
c050abbbad deployment/docker: update Prometheus version from v2.12.0 to v2.14.0 2019-11-16 00:13:15 +02:00
Aliaksandr Valialkin
3f1637fae8 app/vmselect/promql: properly calculate integrate(q[d]) 2019-11-13 21:10:41 +02:00
Aliaksandr Valialkin
c56b9ed03b app/victoria-metrics: add build rules for GOARCH=ppc64le
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235
2019-11-13 20:24:33 +02:00
Aliaksandr Valialkin
3fd32e331a app/vmselect/promql: use universal approach for determining maxByteSliceLen on 32-bit and 64-bit archs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235
2019-11-13 20:24:26 +02:00
Aliaksandr Valialkin
119dfd01bb lib/storage: add vm_cache_size_bytes{type="storage/hour_metric_ids"} metric 2019-11-13 20:24:21 +02:00
Aliaksandr Valialkin
86a1cd700b lib/storage: remove inmemory index for recent hour, since it uses too much memory
Production workload shows that the index requires ~4Kb of RAM per active time series.
This is too much for high number of active time series, so let's delete this index.

Now the queries should fall back to the index for the current day instead of the index
for the recent hour. The query performance for the current day index should be good enough
given the 100M rows/sec scan speed per CPU core.
2019-11-13 17:58:07 +02:00
Aliaksandr Valialkin
33895d4a0f lib/storage: add missing increment for recentHourInvertedIndexSearchCalls 2019-11-13 15:13:51 +02:00
Aliaksandr Valialkin
c57eb0ff83 lib/storage: add -disableRecentHourIndex flag for disabling inmemory index for recent hour
This may be useful for saving RAM on high number of time series aka high cardinality
2019-11-13 15:02:51 +02:00
Aliaksandr Valialkin
e14ab14e54 lib/storage: verify marshaling for iidx.pendingMetricIDs in TestInmemoryInvertedIndexMarshalUnmarshal 2019-11-13 13:35:30 +02:00
Aliaksandr Valialkin
ca259864e2 lib/storage: return back inmemory inverted index for recent hour
Issues fixed:
- Slow startup times. Now the index is loaded from cache during start.
- High memory usage related to superflouos index copies every 10 seconds.
2019-11-13 13:11:04 +02:00
Aliaksandr Valialkin
01bb3c06c7 lib/storage: remove inmemory inverted index for recent hours
Production load with >10M active time series showed it could
slow down VictoriaMetrics startup times and could eat
all the memory leading to OOM.

Remove inmemory inverted index for recent hours until thorough
testing on production data shows it works OK.
2019-11-13 10:45:53 +02:00
Aliaksandr Valialkin
66c4961ff8 README.md: mention that VictoriaMetrics executable is small 2019-11-12 16:58:15 +02:00
Aliaksandr Valialkin
3e16248ed6 README.md: small updates 2019-11-12 16:54:18 +02:00
Aliaksandr Valialkin
5e6c1cd986 README.md: typo fix 2019-11-12 16:48:40 +02:00
Aliaksandr Valialkin
6c2303764e Revert "lib/fs: do not postpone directory removal on NFS error"
This reverts commit 4c02e496f7.

Reason for revert: the commit breaks on NFS - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/234
2019-11-12 16:18:09 +02:00
Mike Poindexter
f3ad330635 Add test for invalid caching of tsids (#232)
* Add test for invalid caching of tsids

* Clean up error handling
2019-11-12 15:09:33 +02:00
Aliaksandr Valialkin
6c362d82cb README.md: mention that backups are made to S3 or GCS 2019-11-12 14:32:37 +02:00
Aliaksandr Valialkin
661dd190bb Refer to https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883 from multiple places in README.md 2019-11-12 13:02:39 +02:00
Aliaksandr Valialkin
630ba810f1 deployment/docker: upgrade Go from v1.13.4 to v1.13.4 2019-11-12 03:49:19 +02:00
Oleg Kovalov
b4f44befa3 fix misspelled words (#229) 2019-11-12 00:16:42 +02:00
Roman Khavronenko
5fc8fb1323 add churn rate panel (#230) 2019-11-12 00:14:53 +02:00
Aliaksandr Valialkin
8e8f98f712 lib/storage: add tests for dateMetricIDCache 2019-11-11 13:21:57 +02:00
Aliaksandr Valialkin
c342f5e37e lib/storage: eliminate data race when updating lastSyncTime in dateMetricIDCache.Has 2019-11-10 22:04:01 +02:00
Aliaksandr Valialkin
56d7cc8a0d app/victoria-metrics: remove deprecated fs.MustStopDirRemover from main_test.go 2019-11-10 13:37:13 +02:00
Aliaksandr Valialkin
4c02e496f7 lib/fs: do not postpone directory removal on NFS error
Continue trying to remove NFS directory on temporary errors for up to a minute.

The previous async removal process breaks in the following case during VictoriaMetrics start

- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
  and finds directories, which should be removed, but weren't still removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-11-10 13:24:51 +02:00
Aliaksandr Valialkin
3956003dd0 lib/storage: reorganize the code in getStartDateForPerDayInvertedIndex according to golangci-lint 2019-11-10 00:38:59 +02:00
Aliaksandr Valialkin
5c3fa59181 app/vmrestore: the upcoming release would be 1.29.0 2019-11-10 00:20:41 +02:00
Aliaksandr Valialkin
ee7765b10d lib/storage: implement per-day inverted index 2019-11-10 00:02:46 +02:00
Aliaksandr Valialkin
5810ba57c2 lib/storage: use specialized cache for (date, metricID) entries
This improves ingestion performance.
2019-11-09 23:06:11 +02:00
Aliaksandr Valialkin
e573ef2126 lib/storage: remove unused code from getMetricIDsForTimeRange: it is expected that time range is always non-zero 2019-11-09 19:03:34 +02:00
Aliaksandr Valialkin
823fa085ef lib/storage: properly set time range when deleting time series 2019-11-09 18:49:49 +02:00
Aliaksandr Valialkin
695c1dc5eb lib/storage: obtain all the time series ids from (tag->metricIDs) rows instead of (metricID->TSID) rows, since this much faster 2019-11-09 18:04:33 +02:00
Aliaksandr Valialkin
cdbe848102 lib/storage: small code prettifying 2019-11-09 14:19:52 +02:00
Aliaksandr Valialkin
5c25070556 lib/uint64set: remove superflouos check for item existence before deleting it in Set.Subtract 2019-11-09 14:19:47 +02:00
Aliaksandr Valialkin
bb08bab263 lib/storage: inmemoryInvertedIndex prettifying 2019-11-09 14:19:41 +02:00
Aliaksandr Valialkin
6ad7fe8eeb lib/storage: export vm_new_timeseries_created_total metric for determining time series churn rate 2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
9ea549ed24 lib/storage: sync with cluster changes 2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
63b05c0b9f app/vmselect/promql: adjust memory limits calculations for incremental aggregate functions
Incremental aggregate functions don't keep all the selected time series in memory -
they keep only up to GOMAXPROCS time series for incremental aggregations.

Take into account that the number of time series in RAM can be higher if they are split
into many groups with `by (...)` or `without (...)` modifiers.

This should reduce the number of `not enough memory for processing ... data points` false
positive errors.
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
d888b21657 lib/storage: add inmemory inverted index for the last hour
It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
1e46961d68 app/{vmbackup,vmrestore}: add vmbackup and vmrestore tools for creating backups on s3 or gcs from instant snapshots
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/203
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/38
2019-11-08 21:21:07 +02:00
Roman Khavronenko
72756ab8c7 #224: add slow_queries, on-going merges and merge speed panels to dashboard (#226) 2019-11-08 21:20:38 +02:00
Aliaksandr Valialkin
543dc8d337 lib/storage: populate partition names from both small and big directories
Certain partition directories may be missing after restoring from backups
if they had no data. Re-create such directories on start.
2019-11-06 19:49:34 +02:00
Aliaksandr Valialkin
e472f0b23b lib/storage: substitute error message about unsorted items in the index block after metricIDs merge with counter
The origin of the error has been detected and documented in the code,
so it is enough to export a counter for such errors at `vm_index_blocks_with_metric_ids_incorrect_order_total`,
so it could be monitored and alerted on high error rates.

Export also the counter for processed index blocks with metricIDs - `vm_index_blocks_with_metric_ids_processed_total`,
so its' rate could be compared to `rate(vm_index_blocks_with_metric_ids_incorrect_order_total)`.
2019-11-06 14:28:11 +02:00
Aliaksandr Valialkin
c51ca04a43 lib/storage: take into account the requested time range when caching TSIDs for the given tag filters 2019-11-06 14:28:11 +02:00
Aliaksandr Valialkin
e37f06dc52 lib/storage: dump incorrectly sorted items on a single line; this should simplify error reporting 2019-11-05 18:44:22 +02:00
1408 changed files with 444158 additions and 42281 deletions

View File

@@ -26,5 +26,16 @@ $ ./victoria-metrics-prod --version
victoria-metrics-20190730-121249-heads-single-node-0-g671d9e55
```
**Used command-line flags**
Command-line flags are listed as `flag{name="httpListenAddr", value=":443"} 1` lines at `/metrics` page.
See the following docs for details:
* [monitoring for single-node VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#monitoring)
* [montioring for VictoriaMetrics cluster](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/README.md#monitoring)
**Additional context**
Add any other context about the problem here such as error logs, `/metrics` output, screenshots from [the official Grafana dashboard for VictoriaMetrics](https://grafana.com/dashboards/10229).
Add any other context about the problem here such as error logs from VictoriaMetrics and Prometheus,
`/metrics` output, screenshots from the official Grafana dashboards for VictoriaMetrics:
* [Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/dashboards/10229)
* [Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176)

30
.github/workflows/github-pages.yml vendored Normal file
View File

@@ -0,0 +1,30 @@
name: github-pages
on:
push:
paths:
- 'docs/*.md'
- 'README.md'
branches:
- master
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@master
- name: publish
shell: bash
env:
TOKEN: ${{secrets.CI_TOKEN}}
run: |
git clone https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.github.io.git gpages
cp docs/*.md gpages
cp README.md gpages
cd gpages
git config --local user.email "info@victoriametrics.com"
git config --local user.name "Vika"
git add "*.md"
git commit -m "update github pages"
remote_repo="https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.github.io.git"
git push "${remote_repo}"
cd ..
rm -rf gpages

View File

@@ -1,7 +1,13 @@
name: main
on:
- push
- pull_request
push:
paths-ignore:
- 'docs/**'
- '**.md'
pull_request:
paths-ignore:
- 'docs/**'
- '**.md'
jobs:
build:
name: Build
@@ -24,20 +30,21 @@ jobs:
env:
GO111MODULE: on
run: |
export PATH=$PATH:$(go env GOPATH)/bin # temporary fix. See https://github.com/actions/setup-go/issues/14
make check-all
git diff --exit-code
make test-full
make test-pure
make test-full-386
make victoria-metrics
make victoria-metrics-pure
make victoria-metrics-arm
make victoria-metrics-arm64
GOOS=freebsd go build -mod=vendor ./app/victoria-metrics
GOOS=darwin go build -mod=vendor ./app/victoria-metrics
export PATH=$PATH:$(go env GOPATH)/bin # temporary fix. See https://github.com/actions/setup-go/issues/14
make check-all
git diff --exit-code
make test-full
make test-pure
make test-full-386
make victoria-metrics
make victoria-metrics-pure
make victoria-metrics-arm
make victoria-metrics-arm64
make vmutils
GOOS=freebsd go build -mod=vendor ./app/victoria-metrics
GOOS=darwin go build -mod=vendor ./app/victoria-metrics
- name: Publish coverage
uses: codecov/codecov-action@v1.0.0
uses: codecov/codecov-action@v1.0.4
with:
token: ${{secrets.CODECOV_TOKEN}}
file: ./coverage.txt

29
.github/workflows/wiki.yml vendored Normal file
View File

@@ -0,0 +1,29 @@
name: wiki
on:
push:
paths:
- 'docs/*.md'
branches:
- master
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@master
- name: publish
shell: bash
env:
TOKEN: ${{secrets.CI_TOKEN}}
run: |
cd docs
git clone https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.wiki.git wiki
find ./ -name '*.md' -exec cp -prv '{}' 'wiki' ';'
cd wiki
git config --local user.email "info@victoriametrics.com"
git config --local user.name "Vika"
git add "*.md"
git commit -m "update wiki pages"
remote_repo="https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.wiki.git"
git push "${remote_repo}"
cd ..
rm -rf wiki

2
.gitignore vendored
View File

@@ -1,3 +1,4 @@
/tmp
/tags
/pkg
*.pprof
@@ -7,6 +8,7 @@
*.swp
/gocache-for-docker
/victoria-metrics-data
/vmagent-remotewrite-data
/vmstorage-data
/vmselect-cache
/package/temp-deb-*

View File

@@ -175,7 +175,7 @@
END OF TERMS AND CONDITIONS
Copyright 2019 VictoriaMetrics, Inc.
Copyright 2019-2020 VictoriaMetrics, Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

View File

@@ -11,7 +11,10 @@ endif
GO_BUILDINFO = -X '$(PKG_PREFIX)/lib/buildinfo.Version=$(APP_NAME)-$(shell date -u +'%Y%m%d-%H%M%S')-$(BUILDINFO_TAG)'
all: \
victoria-metrics-prod
victoria-metrics-prod \
vmagent-prod \
vmbackup-prod \
vmrestore-prod
include app/*/Makefile
include deployment/*/Makefile
@@ -19,12 +22,37 @@ include deployment/*/Makefile
clean:
rm -rf bin/*
publish: publish-victoria-metrics
publish: \
publish-victoria-metrics \
publish-vmagent \
publish-vmbackup \
publish-vmrestore
package: package-victoria-metrics
package: \
package-victoria-metrics \
package-vmagent \
package-vmbackup \
package-vmrestore
release: victoria-metrics-prod
cd bin && tar czf victoria-metrics-$(PKG_TAG).tar.gz victoria-metrics-prod
vmutils: \
vmagent \
vmbackup \
vmrestore
release: \
release-victoria-metrics \
release-vmutils
release-victoria-metrics: victoria-metrics-prod
cd bin && tar czf victoria-metrics-$(PKG_TAG).tar.gz victoria-metrics-prod && \
sha256sum victoria-metrics-$(PKG_TAG).tar.gz > victoria-metrics-$(PKG_TAG)_checksums.txt
release-vmutils: \
vmagent-prod \
vmbackup-prod \
vmrestore-prod
cd bin && tar czf vmutils-$(PKG_TAG).tar.gz vmagent-prod vmbackup-prod vmrestore-prod && \
sha256sum vmutils-$(PKG_TAG).tar.gz > vmutils-$(PKG_TAG)_checksums.txt
pprof-cpu:
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics@ $(PPROF_FILE)
@@ -42,13 +70,17 @@ lint: install-golint
golint app/...
install-golint:
which golint || GO111MODULE=off go get -u github.com/golang/lint/golint
which golint || GO111MODULE=off go get -u golang.org/x/lint/golint
errcheck: install-errcheck
errcheck -exclude=errcheck_excludes.txt ./lib/...
errcheck -exclude=errcheck_excludes.txt ./app/vminsert/...
errcheck -exclude=errcheck_excludes.txt ./app/vmselect/...
errcheck -exclude=errcheck_excludes.txt ./app/vmstorage/...
errcheck -exclude=errcheck_excludes.txt ./app/vmagent/...
errcheck -exclude=errcheck_excludes.txt ./app/vmbackup/...
errcheck -exclude=errcheck_excludes.txt ./app/vmrestore/...
errcheck -exclude=errcheck_excludes.txt ./app/vmalert/...
install-errcheck:
which errcheck || GO111MODULE=off go get -u github.com/kisielk/errcheck
@@ -56,16 +88,16 @@ install-errcheck:
check-all: fmt vet lint errcheck golangci-lint
test:
GO111MODULE=on go test -tags=integration -mod=vendor ./lib/... ./app/...
GO111MODULE=on go test -mod=vendor ./lib/... ./app/...
test-pure:
GO111MODULE=on CGO_ENABLED=0 go test -tags=integration -mod=vendor ./lib/... ./app/...
GO111MODULE=on CGO_ENABLED=0 go test -mod=vendor ./lib/... ./app/...
test-full:
GO111MODULE=on go test -tags=integration -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
GO111MODULE=on go test -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
test-full-386:
GO111MODULE=on GOARCH=386 go test -tags=integration -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
GO111MODULE=on GOARCH=386 go test -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
benchmark:
GO111MODULE=on go test -mod=vendor -bench=. ./lib/...

288
README.md
View File

@@ -1,4 +1,5 @@
[![Latest Release](https://img.shields.io/github/release/VictoriaMetrics/VictoriaMetrics.svg?style=flat-square)](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest)
[![Docker Pulls](https://img.shields.io/docker/pulls/victoriametrics/victoria-metrics.svg?maxAge=604800)](https://hub.docker.com/r/victoriametrics/victoria-metrics)
[![Slack](https://img.shields.io/badge/join%20slack-%23victoriametrics-brightgreen.svg)](http://slack.victoriametrics.com/)
[![GitHub license](https://img.shields.io/github/license/VictoriaMetrics/VictoriaMetrics.svg)](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/LICENSE)
[![Go Report](https://goreportcard.com/badge/github.com/VictoriaMetrics/VictoriaMetrics)](https://goreportcard.com/report/github.com/VictoriaMetrics/VictoriaMetrics)
@@ -7,45 +8,62 @@
<img alt="Victoria Metrics" src="logo.png">
## Single-node VictoriaMetrics
## VictoriaMetrics
VictoriaMetrics is fast, cost-effective and scalable time-series database. It can be used as long-term remote storage for Prometheus.
It is available in [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases),
[docker images](https://hub.docker.com/r/victoriametrics/victoria-metrics/) and
in [source code](https://github.com/VictoriaMetrics/VictoriaMetrics).
in [source code](https://github.com/VictoriaMetrics/VictoriaMetrics). Just download VictoriaMetrics and see [how to start it](#how-to-start-victoriametrics).
Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
## Case studies and talks
* [Adidas](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/CaseStudies#adidas)
* [COLOPL](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/CaseStudies#colopl)
* [Wix.com](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/CaseStudies#wixcom)
* [Wedos.com](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/CaseStudies#wedoscom)
* [Dreamteam](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/CaseStudies#dreamteam)
## Prominent features
* Supports [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/), so it can be used as Prometheus drop-in replacement in Grafana.
Additionally, VictoriaMetrics extends PromQL with opt-in [useful features](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/ExtendedPromQL).
VictoriaMetrics implements [MetricsQL](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/MetricsQL) query language, which is inspired by PromQL.
* Supports global query view. Multiple Prometheus instances may write data into VictoriaMetrics. Later this data may be used in a single query.
* High performance and good scalability for both [inserts](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
and [selects](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4).
[Outperforms InfluxDB and TimescaleDB by up to 20x](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
* [Uses 10x less RAM than InfluxDB](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) when working with millions of unique time series (aka high cardinality).
* Optimized for time series with high churn rate. Think about [prometheus-operator](https://github.com/coreos/prometheus-operator) metrics from frequent deployments in Kubernetes.
* High data compression, so [up to 70x more data points](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
may be crammed into limited storage comparing to TimescaleDB.
* Optimized for storage with high-latency IO and low IOPS (HDD and network storage in AWS, Google Cloud, Microsoft Azure, etc). See [graphs from these benchmarks](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b).
* A single-node VictoriaMetrics may substitute moderately sized clusters built with competing solutions such as Thanos, Uber M3, Cortex, InfluxDB or TimescaleDB.
See [vertical scalability benchmarks](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
and [comparing Thanos to VictoriaMetrics cluster](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683).
* A single-node VictoriaMetrics may substitute moderately sized clusters built with competing solutions such as Thanos, M3DB, Cortex, InfluxDB or TimescaleDB.
See [vertical scalability benchmarks](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae),
[comparing Thanos to VictoriaMetrics cluster](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683)
and [Remote Write Storage Wars](https://promcon.io/2019-munich/talks/remote-write-storage-wars/) talk
from [PromCon 2019](https://promcon.io/2019-munich/talks/remote-write-storage-wars/).
* Easy operation:
* VictoriaMetrics consists of a single executable without external dependencies.
* VictoriaMetrics consists of a single [small executable](https://medium.com/@valyala/stripping-dependency-bloat-in-victoriametrics-docker-image-983fb5912b0d) without external dependencies.
* All the configuration is done via explicit command-line flags with reasonable defaults.
* All the data is stored in a single directory pointed by `-storageDataPath` flag.
* Easy backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
* Storage is protected from corruption on unclean shutdown (i.e. hardware reset or `kill -9`) thanks to [the storage architecture](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
* Supports metrics' ingestion and [backfilling](#backfilling) via the following protocols:
* Easy and fast backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
to S3 or GCS with [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md) / [vmrestore](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md).
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
* Storage is protected from corruption on unclean shutdown (i.e. OOM, hardware reset or `kill -9`) thanks to [the storage architecture](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
* Supports metrics' scraping, ingestion and [backfilling](#backfilling) via the following protocols:
* [Metrics from Prometheus exporters](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format)
such as [node_exporter](https://github.com/prometheus/node_exporter). See [these docs](#how-to-scrape-prometheus-exporters-such-as-node-exporter) for details.
* [Prometheus remote write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
* [InfluxDB line protocol](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/)
* [Graphite plaintext protocol](https://graphite.readthedocs.io/en/latest/feeding-carbon.html) with [tags](https://graphite.readthedocs.io/en/latest/tags.html#carbon)
* [InfluxDB line protocol](#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
* [Graphite plaintext protocol](#how-to-send-data-from-graphite-compatible-agents-such-as-statsd) with [tags](https://graphite.readthedocs.io/en/latest/tags.html#carbon)
if `-graphiteListenAddr` is set.
* [OpenTSDB put message](http://opentsdb.net/docs/build/html/api_telnet/put.html) if `-opentsdbListenAddr` is set.
* [HTTP OpenTSDB /api/put requests](http://opentsdb.net/docs/build/html/api_http/put.html) if `-opentsdbHTTPListenAddr` is set.
* Ideally works with big amounts of time series data from Kubernetes, IoT sensors, connected cars, industrial telemetry and various Enterprise workloads.
* [OpenTSDB put message](#sending-data-via-telnet-put-protocol) if `-opentsdbListenAddr` is set.
* [HTTP OpenTSDB /api/put requests](#sending-opentsdb-data-via-http-apiput-requests) if `-opentsdbHTTPListenAddr` is set.
* [/api/v1/import](#how-to-import-time-series-data)
* Ideally works with big amounts of time series data from Kubernetes, IoT sensors, connected cars, industrial telemetry, financial data and various Enterprise workloads.
* Has open source [cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
@@ -59,10 +77,12 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
- [Grafana setup](#grafana-setup)
- [How to upgrade VictoriaMetrics?](#how-to-upgrade-victoriametrics)
- [How to apply new config to VictoriaMetrics?](#how-to-apply-new-config-to-victoriametrics)
- [How to scrape Prometheus exporters such as node_exporter?](#how-to-scrape-prometheus-exporters-such-as-node-exporter)
- [How to send data from InfluxDB-compatible agents such as Telegraf?](#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
- [How to send data from Graphite-compatible agents such as StatsD?](#how-to-send-data-from-graphite-compatible-agents-such-as-statsd)
- [Querying Graphite data](#querying-graphite-data)
- [How to send data from OpenTSDB-compatible agents?](#how-to-send-data-from-opentsdb-compatible-agents)
- [Prometheus querying API usage](#prometheus-querying-api-usage)
- [How to build from sources](#how-to-build-from-sources)
- [Development build](#development-build)
- [Production build](#production-build)
@@ -71,13 +91,15 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
- [Building docker images](#building-docker-images)
- [Start with docker-compose](#start-with-docker-compose)
- [Setting up service](#setting-up-service)
- [Third-party contributions](#third-party-contributions)
- [How to work with snapshots?](#how-to-work-with-snapshots)
- [How to delete time series?](#how-to-delete-time-series)
- [How to export time series?](#how-to-export-time-series)
- [How to import time series data?](#how-to-import-time-series-data)
- [Federation](#federation)
- [Capacity planning](#capacity-planning)
- [High availability](#high-availability)
- [Deduplication](#deduplication)
- [Retention](#retention)
- [Multiple retentions](#multiple-retentions)
- [Downsampling](#downsampling)
- [Multi-tenancy](#multi-tenancy)
@@ -93,6 +115,7 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
- [Roadmap](#roadmap)
- [Contacts](#contacts)
- [Community and contributions](#community-and-contributions)
- [Third-party contributions](#third-party-contributions)
- [Reporting bugs](#reporting-bugs)
- [Victoria Metrics Logo](#victoria-metrics-logo)
- [Logo Usage Guidelines](#logo-usage-guidelines)
@@ -111,25 +134,25 @@ The following command-line flags are used the most:
* `-storageDataPath` - path to data directory. VictoriaMetrics stores all the data in this directory. Default path is `victoria-metrics-data` in current working directory.
* `-retentionPeriod` - retention period in months for the data. Older data is automatically deleted. Default period is 1 month.
* `-httpListenAddr` - TCP address to listen to for http requests. By default, it listens port `8428` on all the network interfaces.
* `-graphiteListenAddr` - TCP and UDP address to listen to for Graphite data. By default, it is disabled.
* `-opentsdbListenAddr` - TCP and UDP address to listen to for OpenTSDB data over telnet protocol. By default, it is disabled.
* `-opentsdbHTTPListenAddr` - TCP address to listen to for HTTP OpenTSDB data over `/api/put`. By default, it is disabled.
Pass `-help` to see all the available flags with description and default values.
Default flag values may be read from environment variables if `-envflag.enable` command-line flag is set.
Substitute dots with underscores in env var names. Alternative syntax can be used for setting repeatable flags:
`-arg=foo -arg=bar` can be written as `-arg=foo,bar`. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311) for more details.
It is recommended setting up [monitoring](#monitoring) for VictoriaMetrics.
### Prometheus setup
Add the following lines to Prometheus config file (it is usually located at `/etc/prometheus/prometheus.yml`):
Prometheus must be configured with [remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
in order to send data to VictoriaMetrics. Add the following lines
to Prometheus config file (it is usually located at `/etc/prometheus/prometheus.yml`):
```yml
remote_write:
- url: http://<victoriametrics-addr>:8428/api/v1/write
queue_config:
max_samples_per_send: 10000
max_shards: 30
```
Substitute `<victoriametrics-addr>` with the hostname or IP address of VictoriaMetrics.
@@ -156,6 +179,22 @@ This instructs Prometheus to add `datacenter=dc-123` label to each time series s
The label name may be arbitrary - `datacenter` is just an example. The label value must be unique
across Prometheus instances, so those time series may be filtered and grouped by this label.
For highly loaded Prometheus instances (400k+ samples per second)
the following tuning may be applied:
```
remote_write:
- url: http://<victoriametrics-addr>:8428/api/v1/write
queue_config:
max_samples_per_send: 10000
capacity: 20000
max_shards: 30
```
Using remote write increases memory usage for Prometheus up to ~25%
and depends on the shape of data. If you are experiencing issues with
too high memory consumption try to lower `max_samples_per_send`
and `capacity` params (keep in mind that these two params are tightly connected).
Read more about tuning remote write for Prometheus [here](https://prometheus.io/docs/practices/remote_write).
It is recommended upgrading Prometheus to [v2.12.0](https://github.com/prometheus/prometheus/releases) or newer,
since the previous versions may have issues with `remote_write`.
@@ -172,7 +211,7 @@ http://<victoriametrics-addr>:8428
Substitute `<victoriametrics-addr>` with the hostname or IP address of VictoriaMetrics.
Then build graphs with the created datasource using [Prometheus query language](https://prometheus.io/docs/prometheus/latest/querying/basics/).
VictoriaMetrics supports native PromQL and [extends it with useful features](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/ExtendedPromQL).
VictoriaMetrics supports native PromQL and [extends it with useful features](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/MetricsQL).
### How to upgrade VictoriaMetrics?
@@ -203,6 +242,20 @@ Prometheus doesn't drop data during VictoriaMetrics restart.
See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details.
### How to scrape Prometheus exporters such as [node-exporter](https://github.com/prometheus/node_exporter)?
VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping targets configured in `prometheus.yml` config file according to [the specification](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file).
Just set `-promscrape.config` command-line flag to the path to `prometheus.yml` config - and VictoriaMetrics should start scraping the configured targets.
Currently the following [scrape_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) types are supported:
* [static_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#static_config)
* [file_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config)
In the future other `*_sd_config` types will be supported.
See also [vmagent](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/README.md), which can be used as drop-in replacement for Prometheus.
### How to send data from InfluxDB-compatible agents such as [Telegraf](https://www.influxdata.com/time-series-platform/telegraf/)?
Just use `http://<victoriametric-addr>:8428` url instead of InfluxDB url in agents' configs.
@@ -220,7 +273,7 @@ VictoriaMetrics maps Influx data using the following rules:
unless `db` tag exists in the Influx line.
* Field names are mapped to time series names prefixed with `{measurement}{separator}` value,
where `{separator}` equals to `_` by default. It can be changed with `-influxMeasurementFieldSeparator` command-line flag.
See also `-influxSkipSingleField` command-line flag.
See also `-influxSkipSingleField` command-line flag. If `{measurement}` is empty, then time series names correspond to field names.
* Field values are mapped to time series values.
* Tags are mapped to Prometheus labels as-is.
@@ -248,7 +301,7 @@ An arbitrary number of lines delimited by '\n' may be sent in a single request.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
```
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__!=""}'
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"measurement_.*"}'
```
The `/api/v1/export` endpoint should return the following response:
@@ -286,7 +339,7 @@ An arbitrary number of lines delimited by `\n` may be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
```
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__!=""}'
curl -G 'http://localhost:8428/api/v1/export' -d 'match=foo.bar.baz'
```
The `/api/v1/export` endpoint should return the following response:
@@ -299,7 +352,7 @@ The `/api/v1/export` endpoint should return the following response:
### Querying Graphite data
Data sent to VictoriaMetrics via `Graphite plaintext protocol` may be read either via
[Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/)
[Prometheus querying API](#prometheus-querying-api-usage)
or via [go-graphite/carbonapi](https://github.com/go-graphite/carbonapi/blob/master/cmd/carbonapi/carbonapi.example.prometheus.yaml).
@@ -331,7 +384,7 @@ An arbitrary number of lines delimited by `\n` may be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
```
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__!=""}'
curl -G 'http://localhost:8428/api/v1/export' -d 'match=foo.bar.baz'
```
The `/api/v1/export` endpoint should return the following response:
@@ -379,6 +432,31 @@ The `/api/v1/export` endpoint should return the following response:
```
### Prometheus querying API usage
VictoriaMetrics supports the following handlers from [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/):
* [/api/v1/query](https://prometheus.io/docs/prometheus/latest/querying/api/#instant-queries)
* [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries)
* [/api/v1/series](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers)
* [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names)
* [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values)
These handlers can be queried from Prometheus-compatible clients such as Grafana or curl.
VictoriaMetrics accepts additional args for `/api/v1/labels` and `/api/v1/label/.../values` handlers.
See [this feature request](https://github.com/prometheus/prometheus/issues/6178) for details:
* Any number [time series selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) via `match[]` query arg.
* Optional `start` and `end` query args for limiting the time range for the selected labels or label values.
Additionally VictoriaMetrics provides the following handlers:
* `/api/v1/series/count` - it returns the total number of time series in the database. Note that this handler scans all the inverted index,
so it can be slow if the database contains tens of millions of time series.
* `/api/v1/labels/count` - it returns a list of `label: values_count` entries. It can be used for determining labels with the maximum number of values.
### How to build from sources
We recommend using either [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) or
@@ -444,11 +522,6 @@ More details may be found [here](https://github.com/VictoriaMetrics/VictoriaMetr
Read [these instructions](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/43) on how to set up VictoriaMetrics as a service in your OS.
### Third-party contributions
* [Unofficial yum repository](https://copr.fedorainfracloud.org/coprs/antonpatsev/VictoriaMetrics/) ([source code](https://github.com/patsevanton/victoriametrics-rpm))
### How to work with snapshots?
VictoriaMetrics can create [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
@@ -461,8 +534,8 @@ The page will return the following JSON response:
```
Snapshots are created under `<-storageDataPath>/snapshots` directory, where `<-storageDataPath>`
is the command-line flag value. Snapshots can be archived to backup storage via `cp -L`, `rsync -L`, `scp -r`
or any similar tool that follows symlinks during copying.
is the command-line flag value. Snapshots can be archived to backup storage at any time
with [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md).
The `http://<victoriametrics-addr>:8428/snapshot/list` page contains the list of available snapshots.
@@ -473,9 +546,9 @@ Navigate to `http://<victoriametrics-addr>:8428/snapshot/delete_all` in order to
Steps for restoring from a snapshot:
1. Stop VictoriaMetrics with `kill -INT`.
2. Remove the entire contents of the directory pointed by `-storageDataPath` command-line flag.
3. Copy snapshot contents to the directory pointed by `-storageDataPath`.
4. Start VictoriaMetrics.
2. Restore snapshot contents from backup with [vmrestore](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md)
to the directory pointed by `-storageDataPath`.
3. Start VictoriaMetrics.
### How to delete time series?
@@ -488,12 +561,27 @@ the deleted time series isn't freed instantly - it is freed during subsequent me
It is recommended verifying which metrics will be deleted with the call to `http://<victoria-metrics-addr>:8428/api/v1/series?match[]=<timeseries_selector_for_delete>`
before actually deleting the metrics.
The delete API is intended mainly for the following cases:
- One-off deleting of accidentally written invalid (or undesired) time series.
- One-off deleting of user data due to [GDPR](https://en.wikipedia.org/wiki/General_Data_Protection_Regulation).
It isn't recommended using delete API for the following cases, since it brings non-zero overhead:
- Regular cleanups for unneded data. Just prevent writing unneeded data into VictoriaMetrics.
See [this article](https://www.robustperception.io/relabelling-can-discard-targets-timeseries-and-alerts) for details.
- Reducing disk space usage by deleting unneded time series. This doesn't work as expected, since the deleted
time series occupy disk space until the next merge operation, which can never occur.
It is better using `-retentionPeriod` command-line flag for efficient pruning of old data.
### How to export time series?
Send a request to `http://<victoriametrics-addr>:8428/api/v1/export?match[]=<timeseries_selector_for_export>`,
where `<timeseries_selector_for_export>` may contain any [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors)
for metrics to export. The response would contain all the data for the selected time series in [JSON streaming format](https://en.wikipedia.org/wiki/JSON_streaming#Line-delimited_JSON).
for metrics to export. Use `{__name__!=""}` selector for fetching all the time series.
The response would contain all the data for the selected time series in [JSON streaming format](https://en.wikipedia.org/wiki/JSON_streaming#Line-delimited_JSON).
Each JSON line would contain data for a single time series. An example output:
```
@@ -504,6 +592,52 @@ Each JSON line would contain data for a single time series. An example output:
Optional `start` and `end` args may be added to the request in order to limit the time frame for the exported data. These args may contain either
unix timestamp in seconds or [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) values.
Pass `Accept-Encoding: gzip` HTTP header in the request to `/api/v1/export` in order to reduce network bandwidth during exporing big amounts
of time series data. This enables gzip compression for the exported data. Example for exporting gzipped data:
```
curl -H 'Accept-Encoding: gzip' http://localhost:8428/api/v1/export -d 'match[]={__name__!=""}' > data.jsonl.gz
```
The maximum duration for each request to `/api/v1/export` is limited by `-search.maxExportDuration` command-line flag.
Exported data can be imported via POST'ing it to [/api/v1/import](#how-to-import-time-series-data).
### How to import time series data?
Time series data can be imported via any supported ingestion protocol:
* [Prometheus remote_write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
* [Influx line protocol](#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
* [Graphite plaintext protocol](#how-to-send-data-from-graphite-compatible-agents-such-as-statsd)
* [OpenTSDB telnet put protocol](#sending-data-via-telnet-put-protocol)
* [OpenTSDB http /api/put](#sending-opentsdb-data-via-http-apiput-requests)
* `/api/v1/import` http POST handler, which accepts data from [/api/v1/export](#how-to-export-time-series).
The most efficient protocol for importing data into VictoriaMetrics is `/api/v1/import`. Example for importing data obtained via `/api/v1/export`:
```
# Export the data from <source-victoriametrics>:
curl http://source-victoriametrics:8428/api/v1/export -d 'match={__name__!=""}' > exported_data.jsonl
# Import the data to <destination-victoriametrics>:
curl -X POST http://destination-victoriametrics:8428/api/v1/import -T exported_data.jsonl
```
Pass `Content-Encoding: gzip` HTTP request header to `/api/v1/import` for importing gzipped data:
```
# Export gzipped data from <source-victoriametrics>:
curl -H 'Accept-Encoding: gzip' http://source-victoriametrics:8428/api/v1/export -d 'match={__name__!=""}' > exported_data.jsonl.gz
# Import gzipped data to <destination-victoriametrics>:
curl -X POST -H 'Content-Encoding: gzip' http://destination-victoriametrics:8428/api/v1/import -T exported_data.jsonl.gz
```
Each request to `/api/v1/import` can load up to a single vCPU core on VictoriaMetrics. Import speed can be improved by splitting the original file into smaller parts
and importing them concurrently. Note that the original file must be split on newlines.
### Federation
@@ -524,7 +658,7 @@ A rough estimation of the required resources for ingestion path:
* RAM size: less than 1KB per active time series. So, ~1GB of RAM is required for 1M active time series.
Time series is considered active if new data points have been added to it recently or if it has been recently queried.
The number of active time series may be obtained from `vm_cache_entries{type="storage/hour_metric_ids"}` metric
exproted on the `/metrics` page.
exported on the `/metrics` page.
VictoriaMetrics stores various caches in RAM. Memory size for these caches may be limited by `-memory.allowedPercent` flag.
* CPU cores: a CPU core per 300K inserted data points per second. So, ~4 CPU cores are required for processing
@@ -587,6 +721,29 @@ kill -HUP `pidof prometheus`
If you have Prometheus HA pairs with replicas `r1` and `r2` in each pair, then configure each `r1`
to write data to `victoriametrics-addr-1`, while each `r2` should write data to `victoriametrics-addr-2`.
Another option is to write data simultaneously from Prometheus HA pair to a pair of VictoriaMetrics instances
with the enabled de-duplication. See [this section](#deduplication) for details.
### Deduplication
VictoriaMetrics de-duplicates data points if `-dedup.minScrapeInterval` command-line flag
is set to positive duration. For example, `-dedup.minScrapeInterval=60s` would de-duplicate data points
on the same time series if they are located closer than 60s to each other.
The de-duplication reduces disk space usage if multiple identically configured Prometheus instances in HA pair
write data to the same VictoriaMetrics instance. Note that these Prometheus instances must have identical
`external_labels` section in their configs, so they write data to the same time series.
### Retention
Retention is configured with `-retentionPeriod` command-line flag. For instance, `-retentionPeriod=3` means
that the data will be stored for 3 months and then deleted.
Data is split in per-month subdirectories inside `<-storageDataPath>/data/small` and `<-storageDataPath>/data/big` folders.
Directories for months outside the configured retention are deleted on the first day of new month.
In order to keep data according to `-retentionPeriod` max disk space usage is going to be `-retentionPeriod` + 1 month.
For example if `-retentionPeriod` is set to 1, data for January is deleted on March 1st.
### Multiple retentions
@@ -605,7 +762,7 @@ There is no downsampling support at the moment, but:
- VictoriaMetrics has good compression for on-disk data. See [this article](https://medium.com/@valyala/victoriametrics-achieving-better-compression-for-time-series-data-than-gorilla-317bc1f95932)
for details.
These properties reduce the need in downsampling. We plan to implement downsampling in the future.
These properties reduce the need of downsampling. We plan to implement downsampling in the future.
See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/36) for details.
@@ -627,8 +784,10 @@ horizontally scalable long-term remote storage for really large Prometheus deplo
### Alerting
VictoriaMetrics doesn't support rule evaluation and alerting yet, so these actions must be performed either
on [Prometheus side](https://prometheus.io/docs/alerting/overview/) or on [Grafana side](https://grafana.com/docs/alerting/rules/).
VictoriaMetrics doesn't support rule evaluation and alerting yet, so these actions can be performed at the following places:
* At Prometheus - see [the corresponding docs](https://prometheus.io/docs/alerting/overview/).
* At Promxy - see [the corresponding docs](https://github.com/jacksontj/promxy/blob/master/README.md#how-do-i-use-alertingrecording-rules-in-promxy).
* At Grafana - see [the corresponding docs](https://grafana.com/docs/alerting/rules/).
### Security
@@ -648,14 +807,14 @@ For example, substitute `-graphiteListenAddr=:2003` with `-graphiteListenAddr=<i
### Tuning
* There is no need in VictoriaMetrics tuning since it uses reasonable defaults for command-line flags,
* There is no need for VictoriaMetrics tuning since it uses reasonable defaults for command-line flags,
which are automatically adjusted for the available CPU and RAM resources.
* There is no need in Operating System tuning since VictoriaMetrics is optimized for default OS settings.
* There is no need for Operating System tuning since VictoriaMetrics is optimized for default OS settings.
The only option is increasing the limit on [the number of open files in the OS](https://medium.com/@muhammadtriwibowo/set-permanently-ulimit-n-open-files-in-ubuntu-4d61064429a),
so Prometheus instances could establish more connections to VictoriaMetrics.
* The recommended filesystem is `ext4`, the recommended persistent storage is [persistent HDD-based disk on GCP](https://cloud.google.com/compute/docs/disks/#pdspecs),
since it is protected from hardware failures via internal replication and it can be [resized on the fly](https://cloud.google.com/compute/docs/disks/add-persistent-disk#resize_pd).
If you plan storing more than 1TB of data on `ext4` partition or plan extending it to more than 16TB,
If you plan to store more than 1TB of data on `ext4` partition or plan extending it to more than 16TB,
then the following options are recommended to pass to `mkfs.ext4`:
```
@@ -665,14 +824,18 @@ mkfs.ext4 ... -O 64bit,huge_file,extent -T huge
### Monitoring
VictoriaMetrics exports internal metrics in Prometheus format on the `/metrics` page.
Add this page to Prometheus' scrape config in order to collect VictoriaMetrics metrics.
There is [an official Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/dashboards/10229).
VictoriaMetrics exports internal metrics in Prometheus format at `/metrics` page.
These metrics may be collected either via Prometheus by adding the corresponding scrape config to it.
Alternatively they can be self-scraped by setting `-selfScrapeInterval` command-line flag to duration greater than 0.
For example, `-selfScrapeInterval=10s` would enable self-scraping of `/metrics` page with 10 seconds interval.
There are officials Grafana dashboards for [single-node VictoriaMetrics](https://grafana.com/dashboards/10229) and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176).
The most interesting metrics are:
* `vm_cache_entries{type="storage/hour_metric_ids"}` - the number of time series with new data points during the last hour
aka active time series.
* `rate(vm_new_timeseries_created_total[5m])` - time series churn rate.
* `vm_rows{type="indexdb"}` - the number of rows in inverted index. High value for this number usually mean high churn rate for time series.
* Sum of `vm_rows{type="storage/big"}` and `vm_rows{type="storage/small"}` - total number of `(timestamp, value)` data points
in the database.
@@ -685,7 +848,7 @@ The most interesting metrics are:
### Troubleshooting
* It is recommended to use default command-line flag values (i.e. don't set them explicitly) until the need
in tweaking these flag values arises.
of tweaking these flag values arises.
* If VictoriaMetrics works slowly and eats more than a CPU core per 100K ingested data points per second,
then it is likely you have too many active time series for the current amount of RAM.
@@ -699,19 +862,27 @@ The most interesting metrics are:
has at least 20% of free space comparing to disk size.
* If VictoriaMetrics doesn't work because of certain parts are corrupted due to disk errors,
then just remove directoreis with broken parts. This will recover VictoriaMetrics at the cost
then just remove directories with broken parts. This will recover VictoriaMetrics at the cost
of data loss stored in the broken parts. In the future, `vmrecover` tool will be created
for automatic recovering from such errors.
* If you see gaps on the graphs, try resetting the cache by sending request to `/internal/resetRollupResultCache`.
If this removes gaps on the graphs, then it is likely data with timestamps older than `-search.cacheTimestampOffset`
is ingested into VictoriaMetrics. Make sure that data sources have synchronized time with VictoriaMetrics.
### Backfilling
VictoriaMetrics accepts historical data in arbitrary order of time.
Make sure that configured `-retentionPeriod` covers timestamps for the backfilled data.
It is recommended disabling query cache with `-search.disableCache` command-line flag when writing
historical data with timestamps from the past, since the cache assumes that the data is written with
the current timestamps. Query cache can be enabled after the backfilling is complete.
An alternative solution is to query `/internal/resetRollupResultCache` url after backfilling is complete. This will reset
the query cache, which could contain incomplete data cached during the backfilling.
### Profiling
@@ -738,6 +909,7 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g
See [these docs](https://github.com/netdata/netdata#integrations).
* [go-graphite/carbonapi](https://github.com/go-graphite/carbonapi) can use VictoriaMetrics as time series backend.
See [this example](/blob/master/cmd/carbonapi/carbonapi.example.prometheus.yaml).
* [Ansible role for installing VictoriaMetrics](https://github.com/dreamteam-gg/ansible-victoriametrics-role).
## Roadmap
@@ -749,7 +921,7 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g
- [ ] CLI tool for data migration, re-balancing and adding/removing nodes [#103](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/103)
The discussion happens [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/129). Feel free to comment any item or add own one.
The discussion happens [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/129). Feel free to comment on any item or add you own one.
## Contacts
@@ -762,6 +934,7 @@ Contact us with any questions regarding VictoriaMetrics at [info@victoriametrics
Feel free asking any questions regarding VictoriaMetrics:
- [slack](http://slack.victoriametrics.com/)
- [reddit](https://www.reddit.com/r/VictoriaMetrics/)
- [telegram-en](https://t.me/VictoriaMetrics_en)
- [telegram-ru](https://t.me/VictoriaMetrics_ru1)
- [google groups](https://groups.google.com/forum/#!forum/victorametrics-users)
@@ -785,6 +958,13 @@ We are open to third-party pull requests provided they follow [KISS design princ
Adhering `KISS` principle simplifies the resulting code and architecture, so it can be reviewed, understood and verified by many people.
## Third-party contributions
* [Unofficial yum repository](https://copr.fedorainfracloud.org/coprs/antonpatsev/VictoriaMetrics/) ([source code](https://github.com/patsevanton/victoriametrics-rpm))
* [Prometheus -> VictoriaMetrics exporter #1](https://github.com/ryotarai/prometheus-tsdb-dump)
* [Prometheus -> VictoriaMetrics exporter #2](https://github.com/AnchorFree/tsdb-remote-write)
## Reporting bugs
Report bugs and propose new features [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues).
@@ -792,7 +972,7 @@ Report bugs and propose new features [here](https://github.com/VictoriaMetrics/V
## Victoria Metrics Logo
[Zip](VM_logo.zip) contains three folders with different image orientation (main color and inverted version).
[Zip](VM_logo.zip) contains three folders with different image orientations (main color and inverted version).
Files included in each folder:

View File

@@ -6,9 +6,44 @@ victoria-metrics:
victoria-metrics-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker
victoria-metrics-pure-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-pure
victoria-metrics-amd64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-amd64
victoria-metrics-arm-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-arm
victoria-metrics-arm64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-arm64
victoria-metrics-ppc64le-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-ppc64le
victoria-metrics-386-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-386
package-victoria-metrics:
APP_NAME=victoria-metrics \
$(MAKE) package-via-docker
APP_NAME=victoria-metrics $(MAKE) package-via-docker
package-victoria-metrics-pure:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-pure
package-victoria-metrics-amd64:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-amd64
package-victoria-metrics-arm:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-arm
package-victoria-metrics-arm64:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-arm64
package-victoria-metrics-ppc64le:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-ppc64le
package-victoria-metrics-386:
APP_NAME=victoria-metrics $(MAKE) package-via-docker-386
publish-victoria-metrics:
APP_NAME=victoria-metrics $(MAKE) publish-via-docker
@@ -20,30 +55,24 @@ run-victoria-metrics:
ARGS='-graphiteListenAddr=:2003 -opentsdbListenAddr=:4242 -retentionPeriod=12 -search.maxUniqueTimeseries=1000000 -search.maxQueryDuration=10m' \
$(MAKE) run-via-docker
victoria-metrics-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-amd64 ./app/victoria-metrics
victoria-metrics-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-arm ./app/victoria-metrics
victoria-metrics-arm-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-arm' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm' $(MAKE) app-via-docker
victoria-metrics-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-arm64 ./app/victoria-metrics
victoria-metrics-arm64-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-arm64' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm64' $(MAKE) app-via-docker
victoria-metrics-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-ppc64le ./app/victoria-metrics
victoria-metrics-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-386 ./app/victoria-metrics
victoria-metrics-386-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-386' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=386' $(MAKE) app-via-docker
victoria-metrics-pure:
APP_NAME=victoria-metrics $(MAKE) app-local-pure
victoria-metrics-pure-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-pure' DOCKER_OPTS='--env CGO_ENABLED=0' $(MAKE) app-via-docker
### Packaging as DEB - amd64
victoria-metrics-package-deb: victoria-metrics-prod
./package/package_deb.sh amd64

View File

@@ -1,5 +1,8 @@
ARG certs_image
FROM $certs_image AS certs
FROM scratch
COPY --from=local/certs:1.0.2 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY bin/victoria-metrics-prod .
COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
ARG src_binary
COPY $src_binary ./victoria-metrics-prod
EXPOSE 8428
ENTRYPOINT ["/victoria-metrics-prod"]

View File

@@ -9,44 +9,55 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
var httpListenAddr = flag.String("httpListenAddr", ":8428", "TCP address to listen for http connections")
var (
httpListenAddr = flag.String("httpListenAddr", ":8428", "TCP address to listen for http connections")
minScrapeInterval = flag.Duration("dedup.minScrapeInterval", 0, "Remove superflouos samples from time series if they are located closer to each other than this duration. "+
"This may be useful for reducing overhead when multiple identically configured Prometheus instances write data to the same VictoriaMetrics. "+
"Deduplication is disabled if the -dedup.minScrapeInterval is 0")
)
func main() {
flag.Parse()
envflag.Parse()
buildinfo.Init()
logger.Init()
logger.Infof("starting VictoraMetrics at %q...", *httpListenAddr)
logger.Infof("starting VictoriaMetrics at %q...", *httpListenAddr)
startTime := time.Now()
storage.SetMinScrapeIntervalForDeduplication(*minScrapeInterval)
vmstorage.Init()
vmselect.Init()
vminsert.Init()
startSelfScraper()
go httpserver.Serve(*httpListenAddr, requestHandler)
logger.Infof("started VictoriaMetrics in %s", time.Since(startTime))
logger.Infof("started VictoriaMetrics in %.3f seconds", time.Since(startTime).Seconds())
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
stopSelfScraper()
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
startTime = time.Now()
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
vminsert.Stop()
logger.Infof("successfully shut down the webservice in %s", time.Since(startTime))
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
vmstorage.Stop()
vmselect.Stop()
fs.MustStopDirRemover()
logger.Infof("the VictoriaMetrics has been stopped in %s", time.Since(startTime))
logger.Infof("the VictoriaMetrics has been stopped in %.3f seconds", time.Since(startTime).Seconds())
}
func requestHandler(w http.ResponseWriter, r *http.Request) bool {

View File

@@ -1,5 +1,3 @@
// +build integration
package main
import (
@@ -23,6 +21,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -148,7 +147,7 @@ func setUp() {
}
func processFlags() {
flag.Parse()
envflag.Parse()
for _, fv := range []struct {
flag string
value string
@@ -186,7 +185,6 @@ func tearDown() {
vmstorage.Stop()
vmselect.Stop()
fs.MustRemoveAll(storagePath)
fs.MustStopDirRemover()
}
func TestWriteRead(t *testing.T) {
@@ -303,6 +301,9 @@ func readIn(readFor string, t *testing.T, insertTime time.Time) []test {
s := newSuite(t)
var tt []test
s.noError(filepath.Walk(filepath.Join(testFixturesDir, readFor), func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
if filepath.Ext(path) != ".json" {
return nil
}

View File

@@ -0,0 +1,99 @@
package main
import (
"flag"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
var selfScrapeInterval = flag.Duration("selfScrapeInterval", 0, "Interval for self-scraping own metrics at /metrics page")
var selfScraperStopCh chan struct{}
var selfScraperWG sync.WaitGroup
func startSelfScraper() {
selfScraperStopCh = make(chan struct{})
selfScraperWG.Add(1)
go func() {
defer selfScraperWG.Done()
selfScraper(*selfScrapeInterval)
}()
}
func stopSelfScraper() {
close(selfScraperStopCh)
selfScraperWG.Wait()
}
func selfScraper(scrapeInterval time.Duration) {
if scrapeInterval <= 0 {
// Self-scrape is disabled.
return
}
logger.Infof("started self-scraping `/metrics` page with interval %.3f seconds", scrapeInterval.Seconds())
var bb bytesutil.ByteBuffer
var rows prometheus.Rows
var mrs []storage.MetricRow
var labels []prompb.Label
t := time.NewTicker(scrapeInterval)
var currentTimestamp int64
for {
select {
case <-selfScraperStopCh:
t.Stop()
logger.Infof("stopped self-scraping `/metrics` page")
return
case currentTime := <-t.C:
currentTimestamp = currentTime.UnixNano() / 1e6
}
bb.Reset()
httpserver.WritePrometheusMetrics(&bb)
s := bytesutil.ToUnsafeString(bb.B)
rows.Reset()
rows.Unmarshal(s)
mrs = mrs[:0]
for i := range rows.Rows {
r := &rows.Rows[i]
labels = labels[:0]
labels = addLabel(labels, "", r.Metric)
labels = addLabel(labels, "job", "victoria-metrics")
labels = addLabel(labels, "instance", "self")
for j := range r.Tags {
t := &r.Tags[j]
labels = addLabel(labels, t.Key, t.Value)
}
if len(mrs) < cap(mrs) {
mrs = mrs[:len(mrs)+1]
} else {
mrs = append(mrs, storage.MetricRow{})
}
mr := &mrs[len(mrs)-1]
mr.MetricNameRaw = storage.MarshalMetricNameRaw(mr.MetricNameRaw[:0], labels)
mr.Timestamp = currentTimestamp
mr.Value = r.Value
}
logger.Infof("writing %d rows at timestamp %d", len(mrs), currentTimestamp)
vmstorage.AddRows(mrs)
}
}
func addLabel(dst []prompb.Label, key, value string) []prompb.Label {
if len(dst) < cap(dst) {
dst = dst[:len(dst)+1]
} else {
dst = append(dst, prompb.Label{})
}
lb := &dst[len(dst)-1]
lb.Name = bytesutil.ToUnsafeBytes(key)
lb.Value = bytesutil.ToUnsafeBytes(value)
return dst
}

View File

@@ -41,7 +41,7 @@ func PopulateTimeTpl(b []byte, tGlobal time.Time) []byte {
case `TIME_NS`:
return []byte(fmt.Sprintf("%d", t.UnixNano()))
default:
log.Fatalf("unkown time pattern %s in %s", parts[0], repl)
log.Fatalf("unknown time pattern %s in %s", parts[0], repl)
}
return repl
})

View File

@@ -1,18 +1,18 @@
// +build integration
// Source https://github.com/prometheus/prometheus/blob/master/prompb/remote.pb.go . Code is copy pasted and cleaned up
package test
// Source https://github.com/prometheus/prometheus/blob/master/prompb/remote.pb.go . Code is copy pasted and cleaned up
import (
"encoding/binary"
"math"
"math/bits"
)
// WriteRequest is write request
type WriteRequest struct {
Timeseries []TimeSeries `protobuf:"bytes,1,rep,name=timeseries,proto3" json:"timeseries"`
}
// Size returns m size in bytes after marshaling.
func (m *WriteRequest) Size() (n int) {
if m == nil {
return 0
@@ -31,6 +31,7 @@ func sovRemote(x uint64) (n int) {
return (bits.Len64(x|1) + 6) / 7
}
// Marshal marshals m.
func (m *WriteRequest) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -41,11 +42,13 @@ func (m *WriteRequest) Marshal() (dAtA []byte, err error) {
return dAtA[:n], nil
}
// MarshalTo marshals m to dAtA
func (m *WriteRequest) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to dAtA.
func (m *WriteRequest) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if len(m.Timeseries) > 0 {
@@ -77,11 +80,13 @@ func encodeVarintRemote(dAtA []byte, offset int, v uint64) int {
return base
}
// Sample is time series sample.
type Sample struct {
Value float64 `protobuf:"fixed64,1,opt,name=value,proto3" json:"value,omitempty"`
Timestamp int64 `protobuf:"varint,2,opt,name=timestamp,proto3" json:"timestamp,omitempty"`
}
// Reset resets m.
func (m *Sample) Reset() { *m = Sample{} }
// TimeSeries represents samples and labels for a single time series.
@@ -90,21 +95,27 @@ type TimeSeries struct {
Samples []Sample `protobuf:"bytes,2,rep,name=samples,proto3" json:"samples"`
}
// Reset resets m.
func (m *TimeSeries) Reset() { *m = TimeSeries{} }
// Label is time series label.
type Label struct {
Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
Value string `protobuf:"bytes,2,opt,name=value,proto3" json:"value,omitempty"`
}
// Reset resets m.
func (m *Label) Reset() { *m = Label{} }
// Labels is a set of labels.
type Labels struct {
Labels []Label `protobuf:"bytes,1,rep,name=labels,proto3" json:"labels"`
}
// Reset resets m.
func (m *Labels) Reset() { *m = Labels{} }
// Marshal marshals m.
func (m *Sample) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -115,11 +126,13 @@ func (m *Sample) Marshal() (dAtA []byte, err error) {
return dAtA[:n], nil
}
// MarshalTo marshals m to dAtA.
func (m *Sample) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to dAtA.
func (m *Sample) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if m.Timestamp != 0 {
@@ -136,6 +149,7 @@ func (m *Sample) MarshalToSizedBuffer(dAtA []byte) (int, error) {
return len(dAtA) - i, nil
}
// Marshal marshals m.
func (m *TimeSeries) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -146,11 +160,13 @@ func (m *TimeSeries) Marshal() (dAtA []byte, err error) {
return dAtA[:n], nil
}
// MarshalTo marshals m to dAtA.
func (m *TimeSeries) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to dAtA.
func (m *TimeSeries) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if len(m.Samples) > 0 {
@@ -184,6 +200,7 @@ func (m *TimeSeries) MarshalToSizedBuffer(dAtA []byte) (int, error) {
return len(dAtA) - i, nil
}
// Marshal marshals m.
func (m *Label) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -194,11 +211,13 @@ func (m *Label) Marshal() (dAtA []byte, err error) {
return dAtA[:n], nil
}
// MarshalTo marshals m to dAtA.
func (m *Label) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to dAtA.
func (m *Label) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
_ = i
@@ -221,6 +240,7 @@ func (m *Label) MarshalToSizedBuffer(dAtA []byte) (int, error) {
return len(dAtA) - i, nil
}
// Marshal marshals m.
func (m *Labels) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -231,11 +251,13 @@ func (m *Labels) Marshal() (dAtA []byte, err error) {
return dAtA[:n], nil
}
// MarshalTo marshals m to dAtA.
func (m *Labels) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
// MarshalToSizedBuffer marshals m to dAtA.
func (m *Labels) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if len(m.Labels) > 0 {
@@ -267,6 +289,7 @@ func encodeVarintTypes(dAtA []byte, offset int, v uint64) int {
return base
}
// Size returns the size of marshaled m.
func (m *Sample) Size() (n int) {
if m == nil {
return 0
@@ -280,6 +303,7 @@ func (m *Sample) Size() (n int) {
return n
}
// Size returns the size of marshaled m.
func (m *TimeSeries) Size() (n int) {
if m == nil {
return 0
@@ -301,6 +325,7 @@ func (m *TimeSeries) Size() (n int) {
return n
}
// Size returns the size of marshaled m.
func (m *Label) Size() (n int) {
if m == nil {
return 0
@@ -318,6 +343,7 @@ func (m *Label) Size() (n int) {
return n
}
// Size returns the size of marshaled m.
func (m *Labels) Size() (n int) {
if m == nil {
return 0

View File

@@ -1,9 +1,8 @@
// +build integration
package test
import "github.com/golang/snappy"
// Compress marshals and compresses wr.
func Compress(wr WriteRequest) ([]byte, error) {
data, err := wr.Marshal()
if err != nil {

View File

@@ -13,11 +13,8 @@
"data":{"resultType":"matrix",
"result":[{"metric":{"__name__":"max_lookback_set"},"values":[
["{TIME_S-150s}","4"],
["{TIME_S-140s}","4"],
["{TIME_S-120s}","3"],
["{TIME_S-110s}","3"],
["{TIME_S-60s}","2"],
["{TIME_S-50s}","2"],
["{TIME_S-30s}","1"],
["{TIME_S-20s}","1"]
]}]}}

View File

@@ -19,14 +19,12 @@
["{TIME_S-110s}","3"],
["{TIME_S-100s}","3"],
["{TIME_S-90s}","3"],
["{TIME_S-80s}","3"],
["{TIME_S-70s}","3"],
["{TIME_S-60s}","2"],
["{TIME_S-50s}","2"],
["{TIME_S-40s}","2"],
["{TIME_S-30s}","1"],
["{TIME_S-20s}","1"],
["{TIME_S-10s}","1"],
["{TIME_S}","1"]
["{TIME_S-0s}","1"]
]}]}}
}

73
app/vmagent/Makefile Normal file
View File

@@ -0,0 +1,73 @@
# All these commands must run from repository root.
vmagent:
APP_NAME=vmagent $(MAKE) app-local
vmagent-prod:
APP_NAME=vmagent $(MAKE) app-via-docker
vmagent-pure-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-pure
vmagent-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-amd64
vmagent-arm-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-arm
vmagent-arm64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-arm64
vmagent-ppc64le-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-ppc64le
vmagent-386-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-386
package-vmagent:
APP_NAME=vmagent $(MAKE) package-via-docker
package-vmagent-pure:
APP_NAME=vmagent $(MAKE) package-via-docker-pure
package-vmagent-amd64:
APP_NAME=vmagent $(MAKE) package-via-docker-amd64
package-vmagent-arm:
APP_NAME=vmagent $(MAKE) package-via-docker-arm
package-vmagent-arm64:
APP_NAME=vmagent $(MAKE) package-via-docker-arm64
package-vmagent-ppc64le:
APP_NAME=vmagent $(MAKE) package-via-docker-ppc64le
package-vmagent-386:
APP_NAME=vmagent $(MAKE) package-via-docker-386
publish-vmagent:
APP_NAME=vmagent $(MAKE) publish-via-docker
run-vmagent:
mkdir -p vmagent-data
DOCKER_OPTS='-v $(shell pwd)/vmagent-data:/vmagent-data' \
APP_NAME=vmagent \
$(MAKE) run-via-docker
vmagent-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-amd64 ./app/vmagent
vmagent-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-arm ./app/vmagent
vmagent-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-arm64 ./app/vmagent
vmagent-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-ppc64le ./app/vmagent
vmagent-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmagent-386 ./app/vmagent
vmagent-pure:
APP_NAME=vmagent $(MAKE) app-local-pure

159
app/vmagent/README.md Normal file
View File

@@ -0,0 +1,159 @@
## vmagent
`vmagent` is a tiny but brave agent, which helps you collecting metrics from various sources
and storing them to [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics).
<img alt="vmagent" src="vmagent.png">
### Motivation
While VictoriaMetrics provides an efficient solution to store and observe metrics, our users needed something fast
and RAM friendly to scrape metrics from Prometheus-compatible exporters to VictoriaMetrics.
Also, we found that users infrastructure is like snowflakes - never alike, and we decided to add more flexibility
to `vmagent` (like the ability to push metrics instead of pulling them). We did our best and plan to do even more.
### Features
* Can be used as drop-in replacement for Prometheus for scraping targets such as [node_exporter](https://github.com/prometheus/node_exporter).
See [Quick Start](#quick-start) for details.
* Can add, remove and modify labels via Prometheus relabeling. See [these docs](#relabeling) for details.
* Accepts data via all the ingestion protocols supported by VictoriaMetrics:
* Influx line protocol via `http://<vmagent>:8429/write`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf).
* JSON lines import protocol via `http://<vmagent>:8429/api/v1/import`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-import-time-series-data).
* Graphite plaintext protocol if `-graphiteListenAddr` command-line flag is set. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-graphite-compatible-agents-such-as-statsd).
* OpenTSDB telnet and http protocols if `-opentsdbListenAddr` command-line flag is set. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-opentsdb-compatible-agents).
* Prometheus remote write protocol via `http://<vmagent>:8429/api/v1/write`.
* Can replicate collected metrics simultaneously to multiple remote storage systems.
* Works in environments with unstable connections to remote storage. If the remote storage is unavailable, the collected metrics
are buffered at `-remoteWrite.tmpDataPath`. The buffered metrics are sent to remote storage as soon as connection
to remote storage is recovered.
* Uses lower amounts of RAM, CPU, disk IO and network bandwidth comparing to Prometheus.
### Quick Start
Just download `vmutils-*` archive from [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases), unpack it
and pass the following flags to `vmagent` binary in order to start scraping Prometheus targets:
* `-promscrape.config` with the path to Prometheus config file (it is usually located at `/etc/prometheus/prometheus.yml`)
* `-remoteWrite.url` with the remote storage endpoint such as VictoriaMetrics. Multiple `-remoteWrite.url` args can be set in parallel
in order to replicate data concurrently to multiple remote storage systems.
Example command line:
```
/path/to/vmagent -promscrape.config=/path/to/prometheus.yml -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
```
If you need collecting only Influx data, then the following command line would be enough:
```
/path/to/vmagent -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
```
Then send Influx data to `http://vmagent-host:8429/write`. See [these docs](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) for more details.
`vmagent` is also available in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/).
Pass `-help` to `vmagent` in order to see the full list of supported command-line flags with their descriptions.
### How to collect metrics in Prometheus format?
Pass the path to `prometheus.yml` to `-promscrape.config` command-line flag. `vmagent` takes into account the following
sections from [Prometheus config file](https://prometheus.io/docs/prometheus/latest/configuration/configuration/):
* `global`
* `scrape_configs`
All the other section are ignored, including [remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write) section.
Use `-remoteWrite.*` command-line flags instead for configuring remote write settings:
* `-remoteWrite.url` for pointing to remote storage. Data to remote storage can be sent either via HTTP or HTTPS. See `-remoteWrite.tls*` flags for details.
* `-remoteWrite.label` for adding labels to metrics before sending them to remote storage.
* `-remoteWrite.relabelConfig` for applying relabeling to metrics before sending them to remote storage.
The following scrape types in [scrape_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) section are supported:
* `static_configs` - for scraping statically defined targets. See [these docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#static_config) for details.
* `file_sd_configs` - for scraping targets defined in external files aka file-based service discover.
See [these docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config) for details.
File feature requests at [our issue tracker](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need other service discovery mechanisms to be supported by `vmagent`.
### Adding labels to metrics
Labels can be added to metrics via the following mechanisms:
* Via `global -> external_labels` section in `-promscrape.config` file. These labels are added only to metrics scraped from targets configured in `-promscrape.config` file.
* Via `-remoteWrite.label` command-line flag. These labels are added to all the collected metrics before sending them to `-remoteWrite.url`.
### Relabeling
`vmagent` supports [Prometheus relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config).
Additionally it provides the following extra actions:
* `replace_all`: replaces all the occurences of `regex` in the values of `source_labels` with the `replacement` and stores the result in the `target_label`
* `labelmap_all`: replaces all the occurences of `regex` in all the label names with the `replacement`
The relabeling can be defined in the following places:
* At `scrape_config -> relabel_configs` section in `-promscrape.config` file. This relabeling is applied to targets when parsing the file during `vmagent` startup
or during config reload after sending `SIGHUP` signal to `vmagent` via `kill -HUP`.
* At `scrape_config -> metric_relabel_configs` section in `-promscrape.config` file. This relabeling is applied to metrics after each scrape for the configured targets.
* At `-remoteWrite.relabelConfig` file. This relabeling is aplied to all the collected metrics before sending them to `-remoteWrite.url`.
Read more about relabeling in the following articles:
* [Life of a label](https://www.robustperception.io/life-of-a-label)
* [Discarding targets and timeseries with relabeling](https://www.robustperception.io/relabelling-can-discard-targets-timeseries-and-alerts)
* [Dropping labels at scrape time](https://www.robustperception.io/dropping-metrics-at-scrape-time-with-prometheus)
* [Extracting labels from legacy metric names](https://www.robustperception.io/extracting-labels-from-legacy-metric-names)
* [relabel_configs vs metric_relabel_configs](https://www.robustperception.io/relabel_configs-vs-metric_relabel_configs)
### Monitoring
`vmagent` exports various metrics in Prometheus exposition format at `/metrics` page. It is recommended setting up regular scraping of this page
either via `vmagent` itself or via Prometheus, so the exported metrics could be analyzed later.
### Troubleshooting
* It is recommended increasing the maximum number of open files in the system (`ulimit -n`) when scraping big number of targets,
since `vmagent` establishes at least a single TCP connection per each target.
* It is recommended increasing `-remoteWrite.queues` if `vmagent` collects more than 100K samples per second
and `vmagent_remotewrite_pending_data_bytes` metric exported by `vmagent` at `/metrics` page constantly grows.
* `vmagent` buffers scraped data at `-remoteWrite.tmpDataPath` directory until it is sent to `-remoteWrite.url`.
The directory can grow big when remote storage is unvailable during extended periods of time. If you don't want
sending all the data from the directory to remote storage, just stop `vmagent` and delete the directory.
### How to build from sources
It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - `vmagent` is located in `vmutils-*` archives there.
#### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
2. Run `make vmagent` from the root folder of the repository.
It builds `vmagent` binary and puts it into the `bin` folder.
#### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmagent-prod` from the root folder of the repository.
It builds `vmagent-prod` binary and puts it into the `bin` folder.
#### Building docker images
Run `make package-vmagent`. It builds `victoriametrics/vmagent:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmagent`.

View File

@@ -0,0 +1,70 @@
package common
import (
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
// PushCtx is a context used for populating WriteRequest.
type PushCtx struct {
WriteRequest prompbmarshal.WriteRequest
// Labels contains flat list of all the labels used in WriteRequest.
Labels []prompbmarshal.Label
// Samples contains flat list of all the samples used in WriteRequest.
Samples []prompbmarshal.Sample
}
// Reset resets ctx.
func (ctx *PushCtx) Reset() {
tss := ctx.WriteRequest.Timeseries
for i := range tss {
ts := &tss[i]
ts.Labels = nil
ts.Samples = nil
}
ctx.WriteRequest.Timeseries = ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels
for i := range labels {
label := &labels[i]
label.Name = ""
label.Value = ""
}
ctx.Labels = ctx.Labels[:0]
ctx.Samples = ctx.Samples[:0]
}
// GetPushCtx returns PushCtx from pool.
//
// Call PutPushCtx when the ctx is no longer needed.
func GetPushCtx() *PushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*PushCtx)
}
return &PushCtx{}
}
}
// PutPushCtx returns ctx to the pool.
//
// ctx mustn't be used after returning to the pool.
func PutPushCtx(ctx *PushCtx) {
ctx.Reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *PushCtx, runtime.GOMAXPROCS(-1))

View File

@@ -0,0 +1,8 @@
ARG certs_image
FROM $certs_image AS certs
FROM scratch
COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
ARG src_binary
COPY $src_binary ./vmagent-prod
EXPOSE 8429
ENTRYPOINT ["/vmagent-prod"]

View File

@@ -0,0 +1,65 @@
package graphite
import (
"io"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="graphite"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="graphite"}`)
)
// InsertHandler processes remote write for graphite plaintext protocol.
//
// See https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
func InsertHandler(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: r.Metric,
})
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
samples = append(samples, prompbmarshal.Sample{
Value: r.Value,
Timestamp: r.Timestamp,
})
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[len(samples)-1:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil
}

View File

@@ -0,0 +1,152 @@
package influx
import (
"flag"
"net/http"
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for '{measurement}{separator}{field_name}' metric name when inserted via Influx line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if Influx line contains only a single field")
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="influx"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="influx"}`)
)
// InsertHandler processes remote write for influx line protocol.
//
// See https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/tsdb/README.md
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(db string, rows []parser.Row) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.ctx.WriteRequest.Timeseries[:0]
labels := ctx.ctx.Labels[:0]
samples := ctx.ctx.Samples[:0]
commonLabels := ctx.commonLabels[:0]
buf := ctx.buf[:0]
for i := range rows {
r := &rows[i]
commonLabels = commonLabels[:0]
hasDBLabel := false
for j := range r.Tags {
tag := &r.Tags[j]
if tag.Key == "db" {
hasDBLabel = true
}
commonLabels = append(commonLabels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
if len(db) > 0 && !hasDBLabel {
commonLabels = append(commonLabels, prompbmarshal.Label{
Name: "db",
Value: db,
})
}
ctx.metricGroupBuf = append(ctx.metricGroupBuf[:0], r.Measurement...)
skipFieldKey := len(r.Fields) == 1 && *skipSingleField
if len(ctx.metricGroupBuf) > 0 && !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, *measurementFieldSeparator...)
}
for j := range r.Fields {
f := &r.Fields[j]
bufLen := len(buf)
buf = append(buf, ctx.metricGroupBuf...)
if !skipFieldKey {
buf = append(buf, f.Key...)
}
metricGroup := bytesutil.ToUnsafeString(buf[bufLen:])
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: metricGroup,
})
labels = append(labels, commonLabels...)
samples = append(samples, prompbmarshal.Sample{
Timestamp: r.Timestamp,
Value: f.Value,
})
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[len(samples)-1:],
})
}
rowsTotal += len(r.Fields)
}
ctx.buf = buf
ctx.ctx.WriteRequest.Timeseries = tssDst
ctx.ctx.Labels = labels
ctx.ctx.Samples = samples
ctx.commonLabels = commonLabels
remotewrite.Push(&ctx.ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return nil
}
type pushCtx struct {
ctx common.PushCtx
commonLabels []prompbmarshal.Label
metricGroupBuf []byte
buf []byte
}
func (ctx *pushCtx) reset() {
ctx.ctx.Reset()
commonLabels := ctx.commonLabels
for i := range commonLabels {
label := &commonLabels[i]
label.Name = ""
label.Value = ""
}
ctx.metricGroupBuf = ctx.metricGroupBuf[:0]
ctx.buf = ctx.buf[:0]
}
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

149
app/vmagent/main.go Normal file
View File

@@ -0,0 +1,149 @@
package main
import (
"flag"
"fmt"
"net/http"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
opentsdbserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdb"
opentsdbhttpserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
httpListenAddr = flag.String("httpListenAddr", ":8429", "TCP address to listen for http connections")
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
"Usually :4242 must be set. Doesn't work if empty")
opentsdbHTTPListenAddr = flag.String("opentsdbHTTPListenAddr", "", "TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty")
)
var (
graphiteServer *graphiteserver.Server
opentsdbServer *opentsdbserver.Server
opentsdbhttpServer *opentsdbhttpserver.Server
)
func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
logger.Infof("starting vmagent at %q...", *httpListenAddr)
startTime := time.Now()
remotewrite.Init()
writeconcurrencylimiter.Init()
if len(*graphiteListenAddr) > 0 {
graphiteServer = graphiteserver.MustStart(*graphiteListenAddr, graphite.InsertHandler)
}
if len(*opentsdbListenAddr) > 0 {
opentsdbServer = opentsdbserver.MustStart(*opentsdbListenAddr, opentsdb.InsertHandler, opentsdbhttp.InsertHandler)
}
if len(*opentsdbHTTPListenAddr) > 0 {
opentsdbhttpServer = opentsdbhttpserver.MustStart(*opentsdbHTTPListenAddr, opentsdbhttp.InsertHandler)
}
promscrape.Init(remotewrite.Push)
go httpserver.Serve(*httpListenAddr, requestHandler)
logger.Infof("started vmagent in %.3f seconds", time.Since(startTime).Seconds())
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
startTime = time.Now()
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
promscrape.Stop()
if len(*graphiteListenAddr) > 0 {
graphiteServer.MustStop()
}
if len(*opentsdbListenAddr) > 0 {
opentsdbServer.MustStop()
}
if len(*opentsdbHTTPListenAddr) > 0 {
opentsdbhttpServer.MustStop()
}
remotewrite.Stop()
logger.Infof("successfully stopped vmagent in %.3f seconds", time.Since(startTime).Seconds())
}
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
path := strings.Replace(r.URL.Path, "//", "/", -1)
switch path {
case "/api/v1/write":
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/targets":
w.Header().Set("Content-Type", "text/plain")
promscrape.WriteHumanReadableTargetsStatus(w)
return true
case "/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/write", "/api/v2/write":
influxWriteRequests.Inc()
if err := influx.InsertHandler(r); err != nil {
influxWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/query":
// Emulate fake response for influx query.
// This is required for TSBS benchmark.
influxQueryRequests.Inc()
fmt.Fprintf(w, `{"results":[{"series":[{"values":[]}]}]}`)
return true
}
return false
}
var (
prometheusWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/write", protocol="prometheus"}`)
prometheusWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/api/v1/write", protocol="prometheus"}`)
vmimportRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/import", protocol="vm"}`)
vmimportErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/api/v1/import", protocol="vm"}`)
influxWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/write", protocol="influx"}`)
influxWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/write", protocol="influx"}`)
influxQueryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/query", protocol="influx"}`)
)

View File

@@ -0,0 +1,65 @@
package opentsdb
import (
"io"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="opentsdb"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="opentsdb"}`)
)
// InsertHandler processes remote write for OpenTSDB put protocol.
//
// See http://opentsdb.net/docs/build/html/api_telnet/put.html
func InsertHandler(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: r.Metric,
})
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
samples = append(samples, prompbmarshal.Sample{
Value: r.Value,
Timestamp: r.Timestamp,
})
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[len(samples)-1:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil
}

View File

@@ -0,0 +1,64 @@
package opentsdbhttp
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="opentsdbhttp"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="opentsdbhttp"}`)
)
// InsertHandler processes HTTP OpenTSDB put requests.
// See http://opentsdb.net/docs/build/html/api_http/put.html
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: r.Metric,
})
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: tag.Key,
Value: tag.Value,
})
}
samples = append(samples, prompbmarshal.Sample{
Value: r.Value,
Timestamp: r.Timestamp,
})
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[len(samples)-1:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil
}

View File

@@ -0,0 +1,67 @@
package promremotewrite
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="promremotewrite"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="promremotewrite"}`)
)
// InsertHandler processes remote write for prometheus.
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(timeseries []prompb.TimeSeries) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range timeseries {
ts := &timeseries[i]
labelsLen := len(labels)
for i := range ts.Labels {
label := &ts.Labels[i]
labels = append(labels, prompbmarshal.Label{
Name: bytesutil.ToUnsafeString(label.Name),
Value: bytesutil.ToUnsafeString(label.Value),
})
}
samplesLen := len(samples)
for i := range ts.Samples {
sample := &ts.Samples[i]
samples = append(samples, prompbmarshal.Sample{
Value: sample.Value,
Timestamp: sample.Timestamp,
})
}
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
rowsTotal += len(ts.Samples)
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

View File

@@ -0,0 +1,267 @@
package remotewrite
import (
"crypto/tls"
"crypto/x509"
"encoding/base64"
"flag"
"fmt"
"io/ioutil"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fasthttp"
)
var (
sendTimeout = flag.Duration("remoteWrite.sendTimeout", time.Minute, "Timeout for sending a single block of data to -remoteWrite.url")
tlsInsecureSkipVerify = flag.Bool("remoteWrite.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -remoteWrite.url")
tlsCertFile = flag.String("remoteWrite.tlsCertFile", "", "Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url")
tlsKeyFile = flag.String("remoteWrite.tlsKeyFile", "", "Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url")
tlsCAFile = flag.String("remoteWrite.tlsCAFile", "", "Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. "+
"By default system CA is used")
basicAuthUsername = flag.String("remoteWrite.basicAuth.username", "", "Optional basic auth username to use for -remoteWrite.url")
basicAuthPassword = flag.String("remoteWrite.basicAuth.password", "", "Optional basic auth password to use for -remoteWrite.url")
bearerToken = flag.String("remoteWrite.bearerToken", "", "Optional bearer auth token to use for -remoteWrite.url")
)
type client struct {
urlLabelValue string
remoteWriteURL string
host string
requestURI string
authHeader string
fq *persistentqueue.FastQueue
hc *fasthttp.HostClient
requestDuration *metrics.Histogram
requestsOKCount *metrics.Counter
errorsCount *metrics.Counter
retriesCount *metrics.Counter
wg sync.WaitGroup
stopCh chan struct{}
}
func newClient(remoteWriteURL, urlLabelValue string, fq *persistentqueue.FastQueue) *client {
authHeader := ""
if len(*basicAuthUsername) > 0 || len(*basicAuthPassword) > 0 {
// See https://en.wikipedia.org/wiki/Basic_access_authentication
token := *basicAuthUsername + ":" + *basicAuthPassword
token64 := base64.StdEncoding.EncodeToString([]byte(token))
authHeader = "Basic " + token64
}
if len(*bearerToken) > 0 {
if authHeader != "" {
logger.Panicf("FATAL: `-remoteWrite.bearerToken`=%q cannot be set when `-remoteWrite.basicAuth.*` flags are set", *bearerToken)
}
authHeader = "Bearer " + *bearerToken
}
readTimeout := *sendTimeout
if readTimeout <= 0 {
readTimeout = time.Minute
}
var u fasthttp.URI
u.Update(remoteWriteURL)
scheme := string(u.Scheme())
switch scheme {
case "http", "https":
default:
logger.Panicf("FATAL: unsupported scheme in -remoteWrite.url=%q: %q. It must be http or https", remoteWriteURL, scheme)
}
host := string(u.Host())
if len(host) == 0 {
logger.Panicf("FATAL: invalid -remoteWrite.url=%q: host cannot be empty. Make sure the url looks like `http://host:port/path`", remoteWriteURL)
}
requestURI := string(u.RequestURI())
isTLS := scheme == "https"
var tlsCfg *tls.Config
if isTLS {
var err error
tlsCfg, err = getTLSConfig()
if err != nil {
logger.Panicf("FATAL: cannot initialize TLS config: %s", err)
}
}
if !strings.Contains(host, ":") {
if isTLS {
host += ":443"
} else {
host += ":80"
}
}
maxConns := 2 * *queues
hc := &fasthttp.HostClient{
Addr: host,
Name: "vmagent",
Dial: statDial,
DialDualStack: netutil.TCP6Enabled(),
IsTLS: isTLS,
TLSConfig: tlsCfg,
MaxConns: maxConns,
MaxIdleConnDuration: 10 * readTimeout,
ReadTimeout: readTimeout,
WriteTimeout: 10 * time.Second,
MaxResponseBodySize: 1024 * 1024,
}
c := &client{
urlLabelValue: urlLabelValue,
remoteWriteURL: remoteWriteURL,
host: host,
requestURI: requestURI,
authHeader: authHeader,
fq: fq,
hc: hc,
stopCh: make(chan struct{}),
}
c.requestDuration = metrics.NewHistogram(fmt.Sprintf(`vmagent_remotewrite_duration_seconds{url=%q}`, c.urlLabelValue))
c.requestsOKCount = metrics.NewCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="2XX"}`, c.urlLabelValue))
c.errorsCount = metrics.NewCounter(fmt.Sprintf(`vmagent_remotewrite_errors_total{url=%q}`, c.urlLabelValue))
c.retriesCount = metrics.NewCounter(fmt.Sprintf(`vmagent_remotewrite_retries_count_total{url=%q}`, c.urlLabelValue))
for i := 0; i < *queues; i++ {
c.wg.Add(1)
go func() {
defer c.wg.Done()
c.runWorker()
}()
}
logger.Infof("initialized client for -remoteWrite.url=%q", c.remoteWriteURL)
return c
}
func (c *client) MustStop() {
close(c.stopCh)
c.wg.Wait()
logger.Infof("stopped client for -remoteWrite.url=%q", c.remoteWriteURL)
}
func getTLSConfig() (*tls.Config, error) {
var tlsRootCA *x509.CertPool
var tlsCertificate *tls.Certificate
if *tlsCertFile != "" || *tlsKeyFile != "" {
cert, err := tls.LoadX509KeyPair(*tlsCertFile, *tlsKeyFile)
if err != nil {
return nil, fmt.Errorf("cannot load TLS certificate for -remoteWrite.tlsCertFile=%q and -remoteWrite.tlsKeyFile=%q: %s", *tlsCertFile, *tlsKeyFile, err)
}
tlsCertificate = &cert
}
if *tlsCAFile != "" {
data, err := ioutil.ReadFile(*tlsCAFile)
if err != nil {
return nil, fmt.Errorf("cannot read -remoteWrite.tlsCAFile=%q: %s", *tlsCAFile, err)
}
tlsRootCA = x509.NewCertPool()
if !tlsRootCA.AppendCertsFromPEM(data) {
return nil, fmt.Errorf("cannot parse data -remoteWrite.tlsCAFile=%q", *tlsCAFile)
}
}
tlsCfg := &tls.Config{
RootCAs: tlsRootCA,
ClientSessionCache: tls.NewLRUClientSessionCache(0),
}
if tlsCertificate != nil {
tlsCfg.Certificates = []tls.Certificate{*tlsCertificate}
}
tlsCfg.InsecureSkipVerify = *tlsInsecureSkipVerify
return tlsCfg, nil
}
func (c *client) runWorker() {
var ok bool
var block []byte
ch := make(chan struct{})
for {
block, ok = c.fq.MustReadBlock(block[:0])
if !ok {
return
}
go func() {
c.sendBlock(block)
ch <- struct{}{}
}()
select {
case <-ch:
// The block has been sent successfully
continue
case <-c.stopCh:
// c must be stopped. Wait for a while in the hope the block will be sent.
graceDuration := 5 * time.Second
select {
case <-ch:
// The block has been sent successfully.
case <-time.After(graceDuration):
logger.Errorf("couldn't sent block with size %d bytes to %q in %.3f seconds during shutdown; dropping it",
len(block), c.remoteWriteURL, graceDuration.Seconds())
}
return
}
}
}
func (c *client) sendBlock(block []byte) {
req := fasthttp.AcquireRequest()
req.SetRequestURI(c.requestURI)
req.SetHost(c.host)
req.Header.SetMethod("POST")
req.Header.Add("Content-Type", "application/x-protobuf")
req.Header.Add("Content-Encoding", "snappy")
if c.authHeader != "" {
req.Header.Set("Authorization", c.authHeader)
}
req.SetBody(block)
retryDuration := time.Second
resp := fasthttp.AcquireResponse()
again:
select {
case <-c.stopCh:
fasthttp.ReleaseRequest(req)
fasthttp.ReleaseResponse(resp)
return
default:
}
startTime := time.Now()
// There is no need in calling DoTimeout, since the timeout is set in c.hc.ReadTimeout.
err := c.hc.Do(req, resp)
c.requestDuration.UpdateDuration(startTime)
if err != nil {
c.errorsCount.Inc()
retryDuration *= 2
if retryDuration > time.Minute {
retryDuration = time.Minute
}
logger.Errorf("couldn't send a block with size %d bytes to %q: %s; re-sending the block in %.3f seconds",
len(block), c.remoteWriteURL, err, retryDuration.Seconds())
time.Sleep(retryDuration)
c.retriesCount.Inc()
goto again
}
statusCode := resp.StatusCode()
if statusCode/100 != 2 {
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="%d"}`, c.urlLabelValue, statusCode)).Inc()
retryDuration *= 2
if retryDuration > time.Minute {
retryDuration = time.Minute
}
logger.Errorf("unexpected status code received after sending a block with size %d bytes to %q: %d; response body=%q; re-sending the block in %.3f seconds",
len(block), c.remoteWriteURL, statusCode, resp.Body(), retryDuration.Seconds())
time.Sleep(retryDuration)
c.retriesCount.Inc()
goto again
}
c.requestsOKCount.Inc()
// The block has been successfully sent to the remote storage.
fasthttp.ReleaseResponse(resp)
fasthttp.ReleaseRequest(req)
}

View File

@@ -0,0 +1,191 @@
package remotewrite
import (
"flag"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
"github.com/golang/snappy"
)
var flushInterval = flag.Duration("remoteWrite.flushInterval", time.Second, "Interval for flushing the data to remote storage. "+
"Higher value reduces network bandwidth usage at the cost of delayed push of scraped data to remote storage")
// the maximum number of rows to send per each block.
const maxRowsPerBlock = 10000
type pendingSeries struct {
mu sync.Mutex
wr writeRequest
stopCh chan struct{}
periodicFlusherWG sync.WaitGroup
}
func newPendingSeries(pushBlock func(block []byte)) *pendingSeries {
var ps pendingSeries
ps.wr.pushBlock = pushBlock
ps.stopCh = make(chan struct{})
ps.periodicFlusherWG.Add(1)
go func() {
defer ps.periodicFlusherWG.Done()
ps.periodicFlusher()
}()
return &ps
}
func (ps *pendingSeries) MustStop() {
close(ps.stopCh)
ps.periodicFlusherWG.Wait()
}
func (ps *pendingSeries) Push(tss []prompbmarshal.TimeSeries) {
ps.mu.Lock()
ps.wr.push(tss)
ps.mu.Unlock()
}
func (ps *pendingSeries) periodicFlusher() {
ticker := time.NewTicker(*flushInterval)
defer ticker.Stop()
mustStop := false
for !mustStop {
select {
case <-ps.stopCh:
mustStop = true
case <-ticker.C:
if time.Since(ps.wr.lastFlushTime) < *flushInterval/2 {
continue
}
}
ps.mu.Lock()
ps.wr.flush()
ps.mu.Unlock()
}
}
type writeRequest struct {
wr prompbmarshal.WriteRequest
pushBlock func(block []byte)
lastFlushTime time.Time
tss []prompbmarshal.TimeSeries
labels []prompbmarshal.Label
samples []prompbmarshal.Sample
buf []byte
}
func (wr *writeRequest) reset() {
wr.wr.Timeseries = nil
for i := range wr.tss {
ts := &wr.tss[i]
ts.Labels = nil
ts.Samples = nil
}
wr.tss = wr.tss[:0]
for i := range wr.labels {
label := &wr.labels[i]
label.Name = ""
label.Value = ""
}
wr.labels = wr.labels[:0]
wr.samples = wr.samples[:0]
wr.buf = wr.buf[:0]
}
func (wr *writeRequest) flush() {
wr.wr.Timeseries = wr.tss
wr.lastFlushTime = time.Now()
pushWriteRequest(&wr.wr, wr.pushBlock)
wr.reset()
}
func (wr *writeRequest) push(src []prompbmarshal.TimeSeries) {
tssDst := wr.tss
for i := range src {
tssDst = append(tssDst, prompbmarshal.TimeSeries{})
dst := &tssDst[len(tssDst)-1]
wr.copyTimeSeries(dst, &src[i])
if len(wr.tss) >= maxRowsPerBlock {
wr.flush()
tssDst = wr.tss
}
}
wr.tss = tssDst
}
func (wr *writeRequest) copyTimeSeries(dst, src *prompbmarshal.TimeSeries) {
labelsDst := wr.labels
labelsLen := len(wr.labels)
samplesDst := wr.samples
buf := wr.buf
for i := range src.Labels {
labelsDst = append(labelsDst, prompbmarshal.Label{})
dstLabel := &labelsDst[len(labelsDst)-1]
srcLabel := &src.Labels[i]
buf = append(buf, srcLabel.Name...)
dstLabel.Name = bytesutil.ToUnsafeString(buf[len(buf)-len(srcLabel.Name):])
buf = append(buf, srcLabel.Value...)
dstLabel.Value = bytesutil.ToUnsafeString(buf[len(buf)-len(srcLabel.Value):])
}
dst.Labels = labelsDst[labelsLen:]
samplesDst = append(samplesDst, prompbmarshal.Sample{})
dstSample := &samplesDst[len(samplesDst)-1]
if len(src.Samples) != 1 {
logger.Panicf("BUG: unexpected number of samples in time series; got %d; want 1", len(src.Samples))
}
*dstSample = src.Samples[0]
dst.Samples = samplesDst[len(samplesDst)-1:]
wr.samples = samplesDst
wr.labels = labelsDst
wr.buf = buf
}
func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byte)) {
if len(wr.Timeseries) == 0 {
// Nothing to push
return
}
bb := writeRequestBufPool.Get()
bb.B = prompbmarshal.MarshalWriteRequest(bb.B[:0], wr)
zb := snappyBufPool.Get()
zb.B = snappy.Encode(zb.B[:cap(zb.B)], bb.B)
writeRequestBufPool.Put(bb)
if len(zb.B) <= persistentqueue.MaxBlockSize {
pushBlock(zb.B)
blockSizeRows.Update(float64(len(wr.Timeseries)))
blockSizeBytes.Update(float64(len(zb.B)))
snappyBufPool.Put(zb)
return
}
snappyBufPool.Put(zb)
// Too big block. Recursively split it into smaller parts.
timeseries := wr.Timeseries
n := len(timeseries) / 2
wr.Timeseries = timeseries[:n]
pushWriteRequest(wr, pushBlock)
wr.Timeseries = timeseries[n:]
pushWriteRequest(wr, pushBlock)
wr.Timeseries = timeseries
}
var (
blockSizeBytes = metrics.NewHistogram(`vmagent_remotewrite_block_size_bytes`)
blockSizeRows = metrics.NewHistogram(`vmagent_remotewrite_block_size_rows`)
)
var writeRequestBufPool bytesutil.ByteBufferPool
var snappyBufPool bytesutil.ByteBufferPool

View File

@@ -0,0 +1,108 @@
package remotewrite
import (
"flag"
"strings"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
)
var (
extraLabelsUnparsed = flagutil.NewArray("remoteWrite.label", "Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. "+
"Pass multiple -remoteWrite.label flags in order to add multiple flags to metrics before sending them to remote storage")
relabelConfigPath = flag.String("remoteWrite.relabelConfig", "", "Optional path to file with relabel_config entries. These entries are applied to all the metrics "+
"before sending them to -remoteWrite.url. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config for details")
)
var extraLabels []prompbmarshal.Label
var prcs []promrelabel.ParsedRelabelConfig
// initRelabel must be called after parsing command-line flags.
func initRelabel() {
// Init extraLabels
for _, s := range *extraLabelsUnparsed {
n := strings.IndexByte(s, '=')
if n < 0 {
logger.Panicf("FATAL: missing '=' in `-remoteWrite.label`. It must contain label in the form `name=value`; got %q", s)
}
extraLabels = append(extraLabels, prompbmarshal.Label{
Name: s[:n],
Value: s[n+1:],
})
}
// Init prcs
if len(*relabelConfigPath) > 0 {
var err error
prcs, err = promrelabel.LoadRelabelConfigs(*relabelConfigPath)
if err != nil {
logger.Panicf("FATAL: cannot load relabel configs from -remoteWrite.relabelConfig=%q: %s", *relabelConfigPath, err)
}
}
}
func resetRelabel() {
extraLabels = nil
prcs = nil
}
func (rctx *relabelCtx) applyRelabeling(wr *prompbmarshal.WriteRequest) {
if len(extraLabels) == 0 && len(prcs) == 0 {
// Nothing to change.
return
}
tss := wr.Timeseries
tssDst := tss[:0]
labels := rctx.labels[:0]
for i := range tss {
ts := &tss[i]
labelsLen := len(labels)
labels = append(labels, ts.Labels...)
// extraLabels must be added before applying relabeling according to https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write
for j := range extraLabels {
extraLabel := &extraLabels[j]
tmp := promrelabel.GetLabelByName(labels[labelsLen:], extraLabel.Name)
if tmp != nil {
tmp.Value = extraLabel.Value
} else {
labels = append(labels, *extraLabel)
}
}
labels = promrelabel.ApplyRelabelConfigs(labels, labelsLen, prcs, true)
if len(labels) == labelsLen {
// Drop the current time series, since relabeling removed all the labels.
continue
}
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: ts.Samples,
})
}
rctx.labels = labels
wr.Timeseries = tssDst
}
type relabelCtx struct {
// pool for labels, which are used during the relabeling.
labels []prompbmarshal.Label
}
func (rctx *relabelCtx) reset() {
labels := rctx.labels
for i := range labels {
label := &labels[i]
label.Name = ""
label.Value = ""
}
rctx.labels = rctx.labels[:0]
}
var relabelCtxPool = &sync.Pool{
New: func() interface{} {
return &relabelCtx{}
},
}

View File

@@ -0,0 +1,127 @@
package remotewrite
import (
"flag"
"fmt"
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
xxhash "github.com/cespare/xxhash/v2"
)
var (
remoteWriteURLs = flagutil.NewArray("remoteWrite.url", "Remote storage URL to write data to. It must support Prometheus remote_write API. "+
"It is recommended using VictoriaMetrics as remote storage. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
"Pass multiple -remoteWrite.url flags in order to write data concurrently to multiple remote storage systems")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored")
queues = flag.Int("remoteWrite.queues", 1, "The number of concurrent queues to each -remoteWrite.url. Set more queues if a single queue "+
"isn't enough for sending high volume of collected data to remote storage")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
"It is hidden by default, since it can contain sensistive auth info")
)
// Init initializes remotewrite.
//
// It must be called after flag.Parse().
//
// Stop must be called for graceful shutdown.
func Init() {
if len(*remoteWriteURLs) == 0 {
logger.Panicf("FATAL: at least one `-remoteWrite.url` must be set")
}
if !*showRemoteWriteURL {
// remoteWrite.url can contain authentication codes, so hide it at `/metrics` output.
httpserver.RegisterSecretFlag("remoteWrite.url")
}
initRelabel()
maxInmemoryBlocks := memory.Allowed() / len(*remoteWriteURLs) / maxRowsPerBlock / 100
if maxInmemoryBlocks > 200 {
// There is no much sense in keeping higher number of blocks in memory,
// since this means that the producer outperforms consumer and the queue
// will continue growing. It is better storing the queue to file.
maxInmemoryBlocks = 200
}
if maxInmemoryBlocks < 2 {
maxInmemoryBlocks = 2
}
for i, remoteWriteURL := range *remoteWriteURLs {
h := xxhash.Sum64([]byte(remoteWriteURL))
path := fmt.Sprintf("%s/persistent-queue/%016X", *tmpDataPath, h)
fq := persistentqueue.MustOpenFastQueue(path, remoteWriteURL, maxInmemoryBlocks)
urlLabelValue := fmt.Sprintf("secret-url-%d", i+1)
if *showRemoteWriteURL {
urlLabelValue = remoteWriteURL
}
_ = metrics.NewGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{url=%q, hash="%016X"}`, urlLabelValue, h), func() float64 {
return float64(fq.GetPendingBytes())
})
_ = metrics.NewGauge(fmt.Sprintf(`vmagent_remotewrite_pending_inmemory_blocks{url=%q}`, urlLabelValue), func() float64 {
return float64(fq.GetInmemoryQueueLen())
})
c := newClient(remoteWriteURL, urlLabelValue, fq)
fqs = append(fqs, fq)
cs = append(cs, c)
}
pss = make([]*pendingSeries, *queues)
for i := range pss {
pss[i] = newPendingSeries(pushBlockToPersistentQueues)
}
}
// Stop stops remotewrite.
//
// It is expected that nobody calls Push during and after the call to this func.
func Stop() {
for _, ps := range pss {
ps.MustStop()
}
// Close all the persistent queues. This should unblock clients waiting in MustReadBlock.
for _, fq := range fqs {
fq.MustClose()
}
fqs = nil
// Stop all the clients
for _, c := range cs {
c.MustStop()
}
cs = nil
resetRelabel()
}
// Push sends wr to remote storage systems set via `-remoteWrite.url`.
//
// Each timeseries in wr.Timeseries must contain one sample.
func Push(wr *prompbmarshal.WriteRequest) {
rctx := relabelCtxPool.Get().(*relabelCtx)
rctx.applyRelabeling(wr)
idx := atomic.AddUint64(&pssNextIdx, 1) % uint64(len(pss))
pss[idx].Push(wr.Timeseries)
rctx.reset()
relabelCtxPool.Put(rctx)
}
func pushBlockToPersistentQueues(block []byte) {
for _, fq := range fqs {
fq.MustWriteBlock(block)
}
}
var fqs []*persistentqueue.FastQueue
var cs []*client
var pssNextIdx uint64
var pss []*pendingSeries

View File

@@ -0,0 +1,71 @@
package remotewrite
import (
"net"
"sync/atomic"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fasthttp"
)
func statDial(addr string) (net.Conn, error) {
conn, err := fasthttp.Dial(addr)
dialsTotal.Inc()
if err != nil {
dialErrors.Inc()
return nil, err
}
conns.Inc()
sc := &statConn{
Conn: conn,
}
return sc, nil
}
var (
dialsTotal = metrics.NewCounter(`vmagent_remotewrite_dials_total`)
dialErrors = metrics.NewCounter(`vmagent_remotewrite_dial_errors_total`)
conns = metrics.NewCounter(`vmagent_remotewrite_conns`)
)
type statConn struct {
closed uint64
net.Conn
}
func (sc *statConn) Read(p []byte) (int, error) {
n, err := sc.Conn.Read(p)
connReadsTotal.Inc()
if err != nil {
connReadErrors.Inc()
}
connBytesRead.Add(n)
return n, err
}
func (sc *statConn) Write(p []byte) (int, error) {
n, err := sc.Conn.Write(p)
connWritesTotal.Inc()
if err != nil {
connWriteErrors.Inc()
}
connBytesWritten.Add(n)
return n, err
}
func (sc *statConn) Close() error {
err := sc.Conn.Close()
if atomic.AddUint64(&sc.closed, 1) == 1 {
conns.Dec()
}
return err
}
var (
connReadsTotal = metrics.NewCounter(`vmagent_remotewrite_conn_reads_total`)
connWritesTotal = metrics.NewCounter(`vmagent_remotewrite_conn_writes_total`)
connReadErrors = metrics.NewCounter(`vmagent_remotewrite_conn_read_errors_total`)
connWriteErrors = metrics.NewCounter(`vmagent_remotewrite_conn_write_errors_total`)
connBytesRead = metrics.NewCounter(`vmagent_remotewrite_conn_bytes_read_total`)
connBytesWritten = metrics.NewCounter(`vmagent_remotewrite_conn_bytes_written_total`)
)

BIN
app/vmagent/vmagent.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 69 KiB

View File

@@ -0,0 +1,70 @@
package vmimport
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="vmimport"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="vmimport"}`)
)
// InsertHandler processes `/api/v1/import` request.
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range rows {
r := &rows[i]
labelsLen := len(labels)
for j := range r.Tags {
tag := &r.Tags[j]
labels = append(labels, prompbmarshal.Label{
Name: bytesutil.ToUnsafeString(tag.Key),
Value: bytesutil.ToUnsafeString(tag.Value),
})
}
values := r.Values
timestamps := r.Timestamps
_ = timestamps[len(values)-1]
samplesLen := len(samples)
for j, value := range values {
samples = append(samples, prompbmarshal.Sample{
Value: value,
Timestamp: timestamps[j],
})
}
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
rowsTotal += len(values)
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

41
app/vmalert/README.md Normal file
View File

@@ -0,0 +1,41 @@
## VM Alert
#### Abstract
The application which accepts the alert rules, executes them on given source, sends(fires) an alert to(in) alert management system
### Components
#### Alert Config Reader
It accepts yaml config as input parameter in Prometheus format, parses it into Go struct.
#### Source Caller
Create own watchdog for every alert group (goroutines), which executes alert query on given source and issues an alert if source returns non-empty result.
Source can be any service which supports PromQL (MetricsQL).
#### Alert Management System Provider
Send positive alert to alert management system, provides interface for every concrete implementation.
Should be ingratiated with Prometheus alertmanager.
open questions:
- do we really need alert group or can just run every alert in own goroutine?
#### Web Server
Expose metrics
open questions:
- should the tool provide API or UI for managing alerting rules? Where to store config updated via the API or UI?
- should the tool provide “alerting rules validation mode” for validating and debugging alerting rules? This mode is useful when creating and debugging alerting rules.
#### Requirements:
- Stateless
- Avoid external dependencies if possible
- Reuse existing code from VictoriaMetrics repo
- Makefile rules for common tasks see Makefiles for other apps in the app/ dir
- Every package should be covered by tests
- Dockerfile
- Graceful shutdown
- Helm template
- Application uses command line flags for configuration
<img alt="VM Alert" src="vmalert.png">

View File

@@ -0,0 +1,32 @@
package config
import "time"
// Annotations basic annotation for alert rule
type Annotations struct {
Summary string
Description string
}
// Alert basic alert entity rule
type Alert struct {
Name string
Expr string
For time.Duration
Labels map[string]string
Annotations Annotations
Start time.Time
End time.Time
}
// Group grouping array of alert
type Group struct {
Name string
Rules []Alert
}
// Parse parses config from given file
func Parse(filepath string) ([]Group, error) {
return []Group{}, nil
}

View File

@@ -0,0 +1,14 @@
package datasource
import "context"
// Metrics the data returns from storage
type Metrics struct{}
// VMStorage represents vmstorage entity with ability to read and write metrics
type VMStorage struct{}
//Query basic query to the datasource
func (s *VMStorage) Query(ctx context.Context, query string) ([]Metrics, error) {
return nil, nil
}

60
app/vmalert/main.go Normal file
View File

@@ -0,0 +1,60 @@
package main
import (
"flag"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
)
var (
configPath = flag.String("config", "config.yaml", "Path to alert configuration file")
httpListenAddr = flag.String("httpListenAddr", ":8880", "Address to listen for http connections")
)
func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
logger.Infof("reading alert rules configuration file from %s", *configPath)
alertGroups, err := config.Parse(*configPath)
if err != nil {
logger.Fatalf("Cannot parse configuration file %s", err)
}
w := &watchdog{storage: &datasource.VMStorage{}}
for id := range alertGroups {
go func(group config.Group) {
w.run(group)
}(alertGroups[id])
}
go func() {
httpserver.Serve(*httpListenAddr, func(w http.ResponseWriter, r *http.Request) bool {
panic("not implemented")
})
}()
sig := procutil.WaitForSigterm()
logger.Infof("service received signal %s", sig)
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
w.stop()
}
type watchdog struct {
storage *datasource.VMStorage
}
func (w *watchdog) run(a config.Group) {
}
func (w *watchdog) stop() {
panic("not implemented")
}

View File

@@ -0,0 +1,26 @@
{% import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
) %}
{% stripspace %}
{% func amRequest(alert *config.Alert, generatorURL string) %}
{
"startsAt":{%q= alert.Start.Format(time.RFC3339Nano) %},
"generatorURL": {%q= generatorURL %},
{% if !alert.End.IsZero() %}
"endsAt":{%q= alert.End.Format(time.RFC3339Nano) %},
{% endif %}
"labels": {
"alertname":{%q= alert.Name %}
{% for k,v := range alert.Labels %}
,{%q= k %}:{%q= v %}
{% endfor %}
},
"annotations": {
"summary": {%q= alert.Annotations.Summary %},
"description": {%q= alert.Annotations.Description %}
}
}
{% endfunc %}
{% endstripspace %}

View File

@@ -0,0 +1,101 @@
// Code generated by qtc from "alert_manager_request.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vmalert/provider/alert_manager_request.qtpl:1
package provider
//line app/vmalert/provider/alert_manager_request.qtpl:1
import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"time"
)
//line app/vmalert/provider/alert_manager_request.qtpl:7
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmalert/provider/alert_manager_request.qtpl:7
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmalert/provider/alert_manager_request.qtpl:7
func streamamRequest(qw422016 *qt422016.Writer, alert *config.Alert, generatorURL string) {
//line app/vmalert/provider/alert_manager_request.qtpl:7
qw422016.N().S(`{"startsAt":`)
//line app/vmalert/provider/alert_manager_request.qtpl:9
qw422016.N().Q(alert.Start.Format(time.RFC3339Nano))
//line app/vmalert/provider/alert_manager_request.qtpl:9
qw422016.N().S(`,"generatorURL":`)
//line app/vmalert/provider/alert_manager_request.qtpl:10
qw422016.N().Q(generatorURL)
//line app/vmalert/provider/alert_manager_request.qtpl:10
qw422016.N().S(`,`)
//line app/vmalert/provider/alert_manager_request.qtpl:11
if !alert.End.IsZero() {
//line app/vmalert/provider/alert_manager_request.qtpl:11
qw422016.N().S(`"endsAt":`)
//line app/vmalert/provider/alert_manager_request.qtpl:12
qw422016.N().Q(alert.End.Format(time.RFC3339Nano))
//line app/vmalert/provider/alert_manager_request.qtpl:12
qw422016.N().S(`,`)
//line app/vmalert/provider/alert_manager_request.qtpl:13
}
//line app/vmalert/provider/alert_manager_request.qtpl:13
qw422016.N().S(`"labels": {"alertname":`)
//line app/vmalert/provider/alert_manager_request.qtpl:15
qw422016.N().Q(alert.Name)
//line app/vmalert/provider/alert_manager_request.qtpl:16
for k, v := range alert.Labels {
//line app/vmalert/provider/alert_manager_request.qtpl:16
qw422016.N().S(`,`)
//line app/vmalert/provider/alert_manager_request.qtpl:17
qw422016.N().Q(k)
//line app/vmalert/provider/alert_manager_request.qtpl:17
qw422016.N().S(`:`)
//line app/vmalert/provider/alert_manager_request.qtpl:17
qw422016.N().Q(v)
//line app/vmalert/provider/alert_manager_request.qtpl:18
}
//line app/vmalert/provider/alert_manager_request.qtpl:18
qw422016.N().S(`},"annotations": {"summary":`)
//line app/vmalert/provider/alert_manager_request.qtpl:21
qw422016.N().Q(alert.Annotations.Summary)
//line app/vmalert/provider/alert_manager_request.qtpl:21
qw422016.N().S(`,"description":`)
//line app/vmalert/provider/alert_manager_request.qtpl:22
qw422016.N().Q(alert.Annotations.Description)
//line app/vmalert/provider/alert_manager_request.qtpl:22
qw422016.N().S(`}}`)
//line app/vmalert/provider/alert_manager_request.qtpl:25
}
//line app/vmalert/provider/alert_manager_request.qtpl:25
func writeamRequest(qq422016 qtio422016.Writer, alert *config.Alert, generatorURL string) {
//line app/vmalert/provider/alert_manager_request.qtpl:25
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/provider/alert_manager_request.qtpl:25
streamamRequest(qw422016, alert, generatorURL)
//line app/vmalert/provider/alert_manager_request.qtpl:25
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/provider/alert_manager_request.qtpl:25
}
//line app/vmalert/provider/alert_manager_request.qtpl:25
func amRequest(alert *config.Alert, generatorURL string) string {
//line app/vmalert/provider/alert_manager_request.qtpl:25
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/provider/alert_manager_request.qtpl:25
writeamRequest(qb422016, alert, generatorURL)
//line app/vmalert/provider/alert_manager_request.qtpl:25
qs422016 := string(qb422016.B)
//line app/vmalert/provider/alert_manager_request.qtpl:25
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/provider/alert_manager_request.qtpl:25
return qs422016
//line app/vmalert/provider/alert_manager_request.qtpl:25
}

View File

@@ -0,0 +1,66 @@
package provider
import (
"bytes"
"fmt"
"io"
"net/http"
"strings"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
const alertsPath = "/api/v2/alerts"
var pool = sync.Pool{New: func() interface{} {
return &bytes.Buffer{}
}}
// AlertManager represents integration provider with Prometheus alert manager
type AlertManager struct {
alertURL string
argFunc AlertURLGenerator
client *http.Client
}
// AlertURLGenerator returns URL to single alert by given name
type AlertURLGenerator func(name string) string
// NewAlertManager is a constructor for AlertManager
func NewAlertManager(alertManagerURL string, fn AlertURLGenerator, c *http.Client) *AlertManager {
return &AlertManager{
alertURL: strings.TrimSuffix(alertManagerURL, "/") + alertsPath,
argFunc: fn,
client: c,
}
}
const (
jsonArrayOpen byte = 91 // [
jsonArrayClose byte = 93 // ]
)
// Send an alert or resolve message
func (am *AlertManager) Send(alert *config.Alert) error {
b := pool.Get().(*bytes.Buffer)
b.Reset()
defer pool.Put(b)
b.WriteByte(jsonArrayOpen)
writeamRequest(b, alert, am.argFunc(alert.Name))
b.WriteByte(jsonArrayClose)
resp, err := am.client.Post(am.alertURL, "application/json", b)
if err != nil {
return err
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
b.Reset()
if _, err := io.Copy(b, resp.Body); err != nil {
logger.Errorf("unable to copy error response body to buffer %s", err)
}
return fmt.Errorf("invalid response from alertmanager %s", b)
}
return nil
}

View File

@@ -0,0 +1,80 @@
package provider
import (
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
)
func TestAlertManager_Send(t *testing.T) {
mux := http.NewServeMux()
mux.HandleFunc("/", func(_ http.ResponseWriter, _ *http.Request) {
t.Errorf("should not be called")
})
c := -1
mux.HandleFunc(alertsPath, func(w http.ResponseWriter, r *http.Request) {
c++
if r.Method != http.MethodPost {
t.Errorf("expected POST method got %s", r.Method)
}
switch c {
case 0:
conn, _, _ := w.(http.Hijacker).Hijack()
_ = conn.Close()
case 1:
w.WriteHeader(500)
case 2:
var a []struct {
Labels map[string]string `json:"labels"`
StartsAt time.Time `json:"startsAt"`
EndAt time.Time `json:"endsAt"`
Annotations map[string]string `json:"annotations"`
GeneratorURL string `json:"generatorURL"`
}
if err := json.NewDecoder(r.Body).Decode(&a); err != nil {
t.Errorf("can not unmarshal data into alert %s", err)
t.FailNow()
}
if len(a) != 1 {
t.Errorf("expected 1 alert in array got %d", len(a))
}
if a[0].GeneratorURL != "alert0" {
t.Errorf("exptected alert0 as generatorURL got %s", a[0].GeneratorURL)
}
if a[0].Labels["alertname"] != "alert0" {
t.Errorf("exptected alert0 as alert name got %s", a[0].Labels["alertname"])
}
if a[0].StartsAt.IsZero() {
t.Errorf("exptected non-zero start time")
}
if a[0].EndAt.IsZero() {
t.Errorf("exptected non-zero end time")
}
}
})
srv := httptest.NewServer(mux)
defer srv.Close()
am := NewAlertManager(srv.URL, func(name string) string {
return name
}, srv.Client())
if err := am.Send(&config.Alert{}); err == nil {
t.Error("expected connection error got nil")
}
if err := am.Send(&config.Alert{}); err == nil {
t.Error("expected wrong http code error got nil")
}
if err := am.Send(&config.Alert{
Name: "alert0",
Start: time.Now().UTC(),
End: time.Now().UTC(),
}); err != nil {
t.Errorf("unexpected error %s", err)
}
if c != 2 {
t.Errorf("expected 2 calls(count from zero) to server got %d", c)
}
}

View File

@@ -0,0 +1,8 @@
package provider
import "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
// AlertProvider is common interface for alert manager provider
type AlertProvider interface {
Send(rule config.Alert) error
}

BIN
app/vmalert/vmalert.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 25 KiB

67
app/vmbackup/Makefile Normal file
View File

@@ -0,0 +1,67 @@
# All these commands must run from repository root.
vmbackup:
APP_NAME=vmbackup $(MAKE) app-local
vmbackup-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker
vmbackup-pure-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-pure
vmbackup-amd64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-amd64
vmbackup-arm-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-arm
vmbackup-arm64-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-arm64
vmbackup-ppc64le-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-ppc64le
vmbackup-386-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker-386
package-vmbackup:
APP_NAME=vmbackup $(MAKE) package-via-docker
package-vmbackup-pure:
APP_NAME=vmbackup $(MAKE) package-via-docker-pure
package-vmbackup-amd64:
APP_NAME=vmbackup $(MAKE) package-via-docker-amd64
package-vmbackup-arm:
APP_NAME=vmbackup $(MAKE) package-via-docker-arm
package-vmbackup-arm64:
APP_NAME=vmbackup $(MAKE) package-via-docker-arm64
package-vmbackup-ppc64le:
APP_NAME=vmbackup $(MAKE) package-via-docker-ppc64le
package-vmbackup-386:
APP_NAME=vmbackup $(MAKE) package-via-docker-386
publish-vmbackup:
APP_NAME=vmbackup $(MAKE) publish-via-docker
vmbackup-pure:
APP_NAME=vmbackup $(MAKE) app-local-pure
vmbackup-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-amd64 ./app/vmbackup
vmbackup-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-arm ./app/vmbackup
vmbackup-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-arm64 ./app/vmbackup
vmbackup-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-ppc64le ./app/vmbackup
vmbackup-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-386 ./app/vmbackup

181
app/vmbackup/README.md Normal file
View File

@@ -0,0 +1,181 @@
## vmbackup
`vmbackup` creates VictoriaMetrics data backups from [instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
Supported storage systems for backups:
* [GCS](https://cloud.google.com/storage/). Example: `gcs://<bucket>/<path/to/backup>`
* [S3](https://aws.amazon.com/s3/). Example: `s3://<bucket>/<path/to/backup>`
* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio), [Ceph](https://docs.ceph.com/docs/mimic/radosgw/s3/) or [Swift](https://www.swiftstack.com/docs/admin/middleware/s3_middleware.html). See `-customS3Endpoint` command-line flag.
* Local filesystem. Example: `fs://</absolute/path/to/backup>`
Incremental backups and full backups are supported. Incremental backups are created automatically if the destination path already contains data from the previous backup.
Full backups can be sped up with `-origin` pointing to already existing backup on the same remote storage. In this case `vmbackup` makes server-side copy for the shared
data between the existing backup and new backup. This saves time and costs on data transfer.
Backup process can be interrupted at any time. It is automatically resumed from the interruption point when restarting `vmbackup` with the same args.
Backed up data can be restored with [vmrestore](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md).
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
### Use cases
#### Regular backups
Regular backup can be performed with the following command:
```
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gcs://<bucket>/<path/to/new/backup>
```
* `</path/to/victoria-metrics-data>` - path to VictoriaMetrics data pointed by `-storageDataPath` command-line flag in single-node VictoriaMetrics or in cluster `vmstorage`.
There is no need to stop VictoriaMetrics for creating backups, since they are performed from immutable [instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
* `<local-snapshot>` is the snapshot to backup. See [how to create instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
* `<bucket>` is already existing name for [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets).
* `<path/to/new/backup>` is the destination path where new backup will be placed.
#### Regular backups with server-side copy from existing backup
If the destination GCS bucket already contains the previous backup at `-origin` path, then new backup can be sped up
with the following command:
```
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gcs://<bucket>/<path/to/new/backup> -origin=gcs://<bucket>/<path/to/existing/backup>
```
This saves time and network bandwidth costs by performing server-side copy for the shared data from the `-origin` to `-dst`.
#### Incremental backups
Incremental backups are performed if `-dst` points to already existing backup. In this case only new data is uploaded to remote storage.
This saves time and network bandwidth costs when working with big backups:
```
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gcs://<bucket>/<path/to/existing/backup>
```
#### Smart backups
Smart backups mean storing full daily backups into `YYYYMMDD` folders and creating incremental hourly backup into `latest` folder:
* Run the following command every hour:
```
vmbackup -snapshotName=<latest-snapshot> -dst=gcs://<bucket>/latest
```
Where `<latest-snapshot>` is the latest [snapshot](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
The command will upload only changed data to `gcs://<bucket>/latest`.
* Run the following command once a day:
```
vmbackup -snapshotName=<daily-snapshot> -dst=gcs://<bucket>/<YYYYMMDD> -origin=gcs://<bucket>/latest
```
Where `<daily-snapshot>` is the snapshot for the last day `<YYYYMMDD>`.
This apporach saves network bandwidth costs on hourly backups (since they are incremental) and allows recovering data from either the last hour (`latest` backup)
or from any day (`YYYYMMDD` backups). Note that hourly backup shouldn't run when creating daily backup.
Do not forget removing old snapshots and backups when they are no longer needed for saving storage costs.
### How does it work?
The backup algorithm is the following:
1. Collect information about files in the `-snapshotName`, in the `-dst` and in the `-origin`.
2. Determine files in `-dst`, which are missing in `-snapshotName`, and delete them. These are usually small files, which are already merged into bigger files in the snapshot.
3. Determine files from `-snapshotName`, which are missing in `-dst`. These are usually small new files and bigger merged files.
4. Determine files from step 3, which exist in the `-origin`, and perform server-side copy of these files from `-origin` to `-dst`.
This are usually the biggest and the oldest files, which are shared between backups.
5. Upload the remaining files from setp 3 from `-snapshotName` to `-dst`.
The algorithm splits source files into 100MB chunks in the backup. Each chunk is stored as a separate file in the backup.
Such splitting minimizes the amounts of data to re-transfer after temporary errors.
`vmbackup` relies on [instant snapshot](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) properties:
- All the files in the snapshot are immutable.
- Old files are periodically merged into new files.
- Smaller files have higher probability to be merged.
- Consecutive snapshots share many identical files.
These properties allow performing fast and cheap incremental backups and server-side copying from `-origin` paths.
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
`vmbackup` can work improperly or slowly when these properties are violated.
### Troubleshooting
* If the backup is slow, then try setting higher value for `-concurrency` flag. This will increase the number of concurrent workers that upload data to backup storage.
* If `vmbackup` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value.
* If `vmbackup` has been interrupted due to temporary error, then just restart it with the same args. It will resume the backup process.
### Advanced usage
Run `vmbackup -help` in order to see all the available options:
```
-concurrency int
The number of concurrent workers. Higher concurrency may reduce backup duration (default 10)
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs (default "default")
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-dst string
Where to put the backup on the remote storage. Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, ERROR, FATAL, PANIC (default "INFO")
-maxBytesPerSecond int
The maximum upload speed. There is no limit if it is set to 0
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy (default 60)
-origin string
Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups
-snapshotName string
Name for the snapshot to backup. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots
-storageDataPath string
Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data")
-version
Show VictoriaMetrics version
```
### How to build from sources
It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - see `vmutils-*` archives there.
#### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
2. Run `make vmbackup` from the root folder of the repository.
It builds `vmbackup` binary and puts it into the `bin` folder.
#### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmbackup-prod` from the root folder of the repository.
It builds `vmbackup-prod` binary and puts it into the `bin` folder.
#### Building docker images
Run `make package-vmbackup`. It builds `victoriametrics/vmbackup:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmbackup`.

View File

@@ -0,0 +1,7 @@
ARG certs_image
FROM $certs_image AS certs
FROM scratch
COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
ARG src_binary
COPY $src_binary ./vmbackup-prod
ENTRYPOINT ["/vmbackup-prod"]

115
app/vmbackup/main.go Normal file
View File

@@ -0,0 +1,115 @@
package main
import (
"flag"
"fmt"
"os"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/actions"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fslocal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
var (
storageDataPath = flag.String("storageDataPath", "victoria-metrics-data", "Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage")
snapshotName = flag.String("snapshotName", "", "Name for the snapshot to backup. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots")
dst = flag.String("dst", "", "Where to put the backup on the remote storage. "+
"Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir\n"+
"-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded")
origin = flag.String("origin", "", "Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups")
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce backup duration")
maxBytesPerSecond = flag.Int("maxBytesPerSecond", 0, "The maximum upload speed. There is no limit if it is set to 0")
)
func main() {
flag.Usage = usage
envflag.Parse()
buildinfo.Init()
srcFS, err := newSrcFS()
if err != nil {
logger.Fatalf("%s", err)
}
dstFS, err := newDstFS()
if err != nil {
logger.Fatalf("%s", err)
}
originFS, err := newOriginFS()
if err != nil {
logger.Fatalf("%s", err)
}
a := &actions.Backup{
Concurrency: *concurrency,
Src: srcFS,
Dst: dstFS,
Origin: originFS,
}
if err := a.Run(); err != nil {
logger.Fatalf("cannot create backup: %s", err)
}
}
func usage() {
const s = `
vmbackup performs backups for VictoriaMetrics data from instant snapshots to gcs, s3
or local filesystem. Backed up data can be restored with vmrestore.
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md .
`
f := flag.CommandLine.Output()
fmt.Fprintf(f, "%s\n", s)
flag.PrintDefaults()
}
func newSrcFS() (*fslocal.FS, error) {
if len(*snapshotName) == 0 {
return nil, fmt.Errorf("`-snapshotName` cannot be empty")
}
snapshotPath := *storageDataPath + "/snapshots/" + *snapshotName
// Verify the snapshot exists.
f, err := os.Open(snapshotPath)
if err != nil {
return nil, fmt.Errorf("cannot open snapshot at %q: %s", snapshotPath, err)
}
fi, err := f.Stat()
_ = f.Close()
if err != nil {
return nil, fmt.Errorf("cannot stat %q: %s", snapshotPath, err)
}
if !fi.IsDir() {
return nil, fmt.Errorf("snapshot %q must be a directory", snapshotPath)
}
fs := &fslocal.FS{
Dir: snapshotPath,
MaxBytesPerSecond: *maxBytesPerSecond,
}
if err := fs.Init(); err != nil {
return nil, fmt.Errorf("cannot initialize fs: %s", err)
}
return fs, nil
}
func newDstFS() (common.RemoteFS, error) {
fs, err := actions.NewRemoteFS(*dst)
if err != nil {
return nil, fmt.Errorf("cannot parse `-dst`=%q: %s", *dst, err)
}
return fs, nil
}
func newOriginFS() (common.RemoteFS, error) {
if len(*origin) == 0 {
return nil, nil
}
fs, err := actions.NewRemoteFS(*origin)
if err != nil {
return nil, fmt.Errorf("cannot parse `-origin`=%q: %s", *origin, err)
}
return fs, nil
}

View File

@@ -47,7 +47,7 @@ func (ctx *InsertCtx) marshalMetricNameRaw(prefix []byte, labels []prompb.Label)
return metricNameRaw[:len(metricNameRaw):len(metricNameRaw)]
}
// WriteDataPoint writes (timestamp, value) with the given prefix and lables into ctx buffer.
// WriteDataPoint writes (timestamp, value) with the given prefix and labels into ctx buffer.
func (ctx *InsertCtx) WriteDataPoint(prefix []byte, labels []prompb.Label, timestamp int64, value float64) {
metricNameRaw := ctx.marshalMetricNameRaw(prefix, labels)
ctx.addRow(metricNameRaw, timestamp, value)
@@ -78,6 +78,26 @@ func (ctx *InsertCtx) addRow(metricNameRaw []byte, timestamp int64, value float6
mr.Value = value
}
// AddLabelBytes adds (name, value) label to ctx.Labels.
//
// name and value must exist until ctx.Labels is used.
func (ctx *InsertCtx) AddLabelBytes(name, value []byte) {
labels := ctx.Labels
if cap(labels) > len(labels) {
labels = labels[:len(labels)+1]
} else {
labels = append(labels, prompb.Label{})
}
label := &labels[len(labels)-1]
// Do not copy name and value contents for performance reasons.
// This reduces GC overhead on the number of objects and allocations.
label.Name = name
label.Value = value
ctx.Labels = labels
}
// AddLabel adds (name, value) label to ctx.Labels.
//
// name and value must exist until ctx.Labels is used.

View File

@@ -0,0 +1,36 @@
package common
import (
"runtime"
"sync"
)
// GetInsertCtx returns InsertCtx from the pool.
//
// Call PutInsertCtx for returning it to the pool.
func GetInsertCtx() *InsertCtx {
select {
case ctx := <-insertCtxPoolCh:
return ctx
default:
if v := insertCtxPool.Get(); v != nil {
return v.(*InsertCtx)
}
return &InsertCtx{}
}
}
// PutInsertCtx returns ctx to the pool.
//
// ctx cannot be used after the call.
func PutInsertCtx(ctx *InsertCtx) {
ctx.Reset(0)
select {
case insertCtxPoolCh <- ctx:
default:
insertCtxPool.Put(ctx)
}
}
var insertCtxPool sync.Pool
var insertCtxPoolCh = make(chan *InsertCtx, runtime.GOMAXPROCS(-1))

View File

@@ -1,161 +1,44 @@
package graphite
import (
"fmt"
"io"
"net"
"runtime"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="graphite"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="graphite"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="graphite"}`)
)
// insertHandler processes remote write for graphite plaintext protocol.
// InsertHandler processes remote write for graphite plaintext protocol.
//
// See https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
func insertHandler(r io.Reader) error {
return concurrencylimiter.Do(func() error {
return insertHandlerInternal(r)
func InsertHandler(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, insertRows)
})
}
func insertHandlerInternal(r io.Reader) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
for ctx.Read(r) {
if err := ctx.InsertRows(); err != nil {
return err
}
}
return ctx.Error()
}
func insertRows(rows []parser.Row) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
func (ctx *pushCtx) InsertRows() error {
rows := ctx.Rows.Rows
ic := &ctx.Common
ic.Reset(len(rows))
ctx.Reset(len(rows))
for i := range rows {
r := &rows[i]
ic.Labels = ic.Labels[:0]
ic.AddLabel("", r.Metric)
ctx.Labels = ctx.Labels[:0]
ctx.AddLabel("", r.Metric)
for j := range r.Tags {
tag := &r.Tags[j]
ic.AddLabel(tag.Key, tag.Value)
ctx.AddLabel(tag.Key, tag.Value)
}
ic.WriteDataPoint(nil, ic.Labels, r.Timestamp, r.Value)
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ic.FlushBufs()
return ctx.FlushBufs()
}
const flushTimeout = 3 * time.Second
func (ctx *pushCtx) Read(r io.Reader) bool {
graphiteReadCalls.Inc()
if ctx.err != nil {
return false
}
if c, ok := r.(net.Conn); ok {
if err := c.SetReadDeadline(time.Now().Add(flushTimeout)); err != nil {
graphiteReadErrors.Inc()
ctx.err = fmt.Errorf("cannot set read deadline: %s", err)
return false
}
}
ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlock(r, ctx.reqBuf, ctx.tailBuf)
if ctx.err != nil {
if ne, ok := ctx.err.(net.Error); ok && ne.Timeout() {
// Flush the read data on timeout and try reading again.
ctx.err = nil
} else {
if ctx.err != io.EOF {
graphiteReadErrors.Inc()
ctx.err = fmt.Errorf("cannot read graphite plaintext protocol data: %s", ctx.err)
}
return false
}
}
ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf))
// Fill missing timestamps with the current timestamp rounded to seconds.
currentTimestamp := time.Now().Unix()
rows := ctx.Rows.Rows
for i := range rows {
r := &rows[i]
if r.Timestamp == 0 {
r.Timestamp = currentTimestamp
}
}
// Convert timestamps from seconds to milliseconds.
for i := range rows {
rows[i].Timestamp *= 1e3
}
return true
}
type pushCtx struct {
Rows Rows
Common common.InsertCtx
reqBuf []byte
tailBuf []byte
err error
}
func (ctx *pushCtx) Error() error {
if ctx.err == io.EOF {
return nil
}
return ctx.err
}
func (ctx *pushCtx) reset() {
ctx.Rows.Reset()
ctx.Common.Reset(0)
ctx.reqBuf = ctx.reqBuf[:0]
ctx.tailBuf = ctx.tailBuf[:0]
ctx.err = nil
}
var (
graphiteReadCalls = metrics.NewCounter(`vm_read_calls_total{name="graphite"}`)
graphiteReadErrors = metrics.NewCounter(`vm_read_errors_total{name="graphite"}`)
)
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

View File

@@ -2,87 +2,44 @@ package influx
import (
"flag"
"fmt"
"io"
"net/http"
"runtime"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for `{measurement}{separator}{field_name}` metric name when inserted via Influx line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses `{measurement}` instead of `{measurement}{separator}{field_name}` for metic name if Influx line contains only a single field")
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for '{measurement}{separator}{field_name}' metric name when inserted via Influx line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if Influx line contains only a single field")
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="influx"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="influx"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="influx"}`)
)
// InsertHandler processes remote write for influx line protocol.
//
// See https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/tsdb/README.md
func InsertHandler(req *http.Request) error {
return concurrencylimiter.Do(func() error {
return insertHandlerInternal(req)
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertHandlerInternal(req *http.Request) error {
influxReadCalls.Inc()
r := req.Body
if req.Header.Get("Content-Encoding") == "gzip" {
zr, err := common.GetGzipReader(r)
if err != nil {
return fmt.Errorf("cannot read gzipped influx line protocol data: %s", err)
}
defer common.PutGzipReader(zr)
r = zr
}
q := req.URL.Query()
tsMultiplier := int64(1e6)
switch q.Get("precision") {
case "ns":
tsMultiplier = 1e6
case "u":
tsMultiplier = 1e3
case "ms":
tsMultiplier = 1
case "s":
tsMultiplier = -1e3
case "m":
tsMultiplier = -1e3 * 60
case "h":
tsMultiplier = -1e3 * 3600
}
// Read db tag from https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint
db := q.Get("db")
func insertRows(db string, rows []parser.Row) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
for ctx.Read(r, tsMultiplier) {
if err := ctx.InsertRows(db); err != nil {
return err
}
}
return ctx.Error()
}
func (ctx *pushCtx) InsertRows(db string) error {
rows := ctx.Rows.Rows
rowsLen := 0
for i := range rows {
rowsLen += len(rows[i].Tags)
rowsLen += len(rows[i].Fields)
}
ic := &ctx.Common
ic.Reset(rowsLen)
@@ -104,7 +61,7 @@ func (ctx *pushCtx) InsertRows(db string) error {
ctx.metricNameBuf = storage.MarshalMetricNameRaw(ctx.metricNameBuf[:0], ic.Labels)
ctx.metricGroupBuf = append(ctx.metricGroupBuf[:0], r.Measurement...)
skipFieldKey := len(r.Fields) == 1 && *skipSingleField
if !skipFieldKey {
if len(ctx.metricGroupBuf) > 0 && !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, *measurementFieldSeparator...)
}
metricGroupPrefixLen := len(ctx.metricGroupBuf)
@@ -125,80 +82,16 @@ func (ctx *pushCtx) InsertRows(db string) error {
return ic.FlushBufs()
}
func (ctx *pushCtx) Read(r io.Reader, tsMultiplier int64) bool {
if ctx.err != nil {
return false
}
ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlock(r, ctx.reqBuf, ctx.tailBuf)
if ctx.err != nil {
if ctx.err != io.EOF {
influxReadErrors.Inc()
ctx.err = fmt.Errorf("cannot read influx line protocol data: %s", ctx.err)
}
return false
}
ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf))
// Adjust timestamps according to tsMultiplier
currentTs := time.Now().UnixNano() / 1e6
if tsMultiplier >= 1 {
for i := range ctx.Rows.Rows {
row := &ctx.Rows.Rows[i]
if row.Timestamp == 0 {
row.Timestamp = currentTs
} else {
row.Timestamp /= tsMultiplier
}
}
} else if tsMultiplier < 0 {
tsMultiplier = -tsMultiplier
currentTs -= currentTs % tsMultiplier
for i := range ctx.Rows.Rows {
row := &ctx.Rows.Rows[i]
if row.Timestamp == 0 {
row.Timestamp = currentTs
} else {
row.Timestamp *= tsMultiplier
}
}
}
return true
}
var (
influxReadCalls = metrics.NewCounter(`vm_read_calls_total{name="influx"}`)
influxReadErrors = metrics.NewCounter(`vm_read_errors_total{name="influx"}`)
)
type pushCtx struct {
Rows Rows
Common common.InsertCtx
reqBuf []byte
tailBuf []byte
Common common.InsertCtx
metricNameBuf []byte
metricGroupBuf []byte
err error
}
func (ctx *pushCtx) Error() error {
if ctx.err == io.EOF {
return nil
}
return ctx.err
}
func (ctx *pushCtx) reset() {
ctx.Rows.Reset()
ctx.Common.Reset(0)
ctx.reqBuf = ctx.reqBuf[:0]
ctx.tailBuf = ctx.tailBuf[:0]
ctx.metricNameBuf = ctx.metricNameBuf[:0]
ctx.metricGroupBuf = ctx.metricGroupBuf[:0]
ctx.err = nil
}
func getPushCtx() *pushCtx {

View File

@@ -6,51 +6,66 @@ import (
"net/http"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/prompush"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
opentsdbserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdb"
opentsdbhttpserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB put messages. Usually :4242 must be set. Doesn't work if empty")
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
"Usually :4242 must be set. Doesn't work if empty")
opentsdbHTTPListenAddr = flag.String("opentsdbHTTPListenAddr", "", "TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty")
maxInsertRequestSize = flag.Int("maxInsertRequestSize", 32*1024*1024, "The maximum size of a single insert request in bytes")
maxLabelsPerTimeseries = flag.Int("maxLabelsPerTimeseries", 30, "The maximum number of labels accepted per time series. Superflouos labels are dropped")
)
var (
graphiteServer *graphiteserver.Server
opentsdbServer *opentsdbserver.Server
opentsdbhttpServer *opentsdbhttpserver.Server
)
// Init initializes vminsert.
func Init() {
storage.SetMaxLabelsPerTimeseries(*maxLabelsPerTimeseries)
concurrencylimiter.Init()
writeconcurrencylimiter.Init()
if len(*graphiteListenAddr) > 0 {
go graphite.Serve(*graphiteListenAddr)
graphiteServer = graphiteserver.MustStart(*graphiteListenAddr, graphite.InsertHandler)
}
if len(*opentsdbListenAddr) > 0 {
go opentsdb.Serve(*opentsdbListenAddr)
opentsdbServer = opentsdbserver.MustStart(*opentsdbListenAddr, opentsdb.InsertHandler, opentsdbhttp.InsertHandler)
}
if len(*opentsdbHTTPListenAddr) > 0 {
go opentsdbhttp.Serve(*opentsdbHTTPListenAddr, int64(*maxInsertRequestSize))
opentsdbhttpServer = opentsdbhttpserver.MustStart(*opentsdbHTTPListenAddr, opentsdbhttp.InsertHandler)
}
promscrape.Init(prompush.Push)
}
// Stop stops vminsert.
func Stop() {
promscrape.Stop()
if len(*graphiteListenAddr) > 0 {
graphite.Stop()
graphiteServer.MustStop()
}
if len(*opentsdbListenAddr) > 0 {
opentsdb.Stop()
opentsdbServer.MustStop()
}
if len(*opentsdbHTTPListenAddr) > 0 {
opentsdbhttp.Stop()
opentsdbhttpServer.MustStop()
}
}
@@ -60,13 +75,22 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
switch path {
case "/api/v1/write":
prometheusWriteRequests.Inc()
if err := prometheus.InsertHandler(r, int64(*maxInsertRequestSize)); err != nil {
if err := promremotewrite.InsertHandler(r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/write", "/api/v2/write":
influxWriteRequests.Inc()
if err := influx.InsertHandler(r); err != nil {
@@ -92,6 +116,9 @@ var (
prometheusWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/write", protocol="prometheus"}`)
prometheusWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/write", protocol="prometheus"}`)
vmimportRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/import", protocol="vm"}`)
vmimportErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/v1/import", protocol="vm"}`)
influxWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/write", protocol="influx"}`)
influxWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/write", protocol="influx"}`)

View File

@@ -1,160 +1,44 @@
package opentsdb
import (
"fmt"
"io"
"net"
"runtime"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentsdb"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="opentsdb"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="opentsdb"}`)
)
// insertHandler processes remote write for OpenTSDB put protocol.
// InsertHandler processes remote write for OpenTSDB put protocol.
//
// See http://opentsdb.net/docs/build/html/api_telnet/put.html
func insertHandler(r io.Reader) error {
return concurrencylimiter.Do(func() error {
return insertHandlerInternal(r)
func InsertHandler(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, insertRows)
})
}
func insertHandlerInternal(r io.Reader) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
for ctx.Read(r) {
if err := ctx.InsertRows(); err != nil {
return err
}
}
return ctx.Error()
}
func insertRows(rows []parser.Row) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
func (ctx *pushCtx) InsertRows() error {
rows := ctx.Rows.Rows
ic := &ctx.Common
ic.Reset(len(rows))
ctx.Reset(len(rows))
for i := range rows {
r := &rows[i]
ic.Labels = ic.Labels[:0]
ic.AddLabel("", r.Metric)
ctx.Labels = ctx.Labels[:0]
ctx.AddLabel("", r.Metric)
for j := range r.Tags {
tag := &r.Tags[j]
ic.AddLabel(tag.Key, tag.Value)
ctx.AddLabel(tag.Key, tag.Value)
}
ic.WriteDataPoint(nil, ic.Labels, r.Timestamp, r.Value)
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ic.FlushBufs()
return ctx.FlushBufs()
}
const flushTimeout = 3 * time.Second
func (ctx *pushCtx) Read(r io.Reader) bool {
opentsdbReadCalls.Inc()
if ctx.err != nil {
return false
}
if c, ok := r.(net.Conn); ok {
if err := c.SetReadDeadline(time.Now().Add(flushTimeout)); err != nil {
opentsdbReadErrors.Inc()
ctx.err = fmt.Errorf("cannot set read deadline: %s", err)
return false
}
}
ctx.reqBuf, ctx.tailBuf, ctx.err = common.ReadLinesBlock(r, ctx.reqBuf, ctx.tailBuf)
if ctx.err != nil {
if ne, ok := ctx.err.(net.Error); ok && ne.Timeout() {
// Flush the read data on timeout and try reading again.
ctx.err = nil
} else {
if ctx.err != io.EOF {
opentsdbReadErrors.Inc()
ctx.err = fmt.Errorf("cannot read OpenTSDB put protocol data: %s", ctx.err)
}
return false
}
}
ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf))
// Fill in missing timestamps
currentTimestamp := time.Now().Unix()
rows := ctx.Rows.Rows
for i := range rows {
r := &rows[i]
if r.Timestamp == 0 {
r.Timestamp = currentTimestamp
}
}
// Convert timestamps from seconds to milliseconds
for i := range rows {
rows[i].Timestamp *= 1e3
}
return true
}
type pushCtx struct {
Rows Rows
Common common.InsertCtx
reqBuf []byte
tailBuf []byte
err error
}
func (ctx *pushCtx) Error() error {
if ctx.err == io.EOF {
return nil
}
return ctx.err
}
func (ctx *pushCtx) reset() {
ctx.Rows.Reset()
ctx.Common.Reset(0)
ctx.reqBuf = ctx.reqBuf[:0]
ctx.tailBuf = ctx.tailBuf[:0]
ctx.err = nil
}
var (
opentsdbReadCalls = metrics.NewCounter(`vm_read_calls_total{name="opentsdb"}`)
opentsdbReadErrors = metrics.NewCounter(`vm_read_errors_total{name="opentsdb"}`)
)
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

View File

@@ -2,149 +2,49 @@ package opentsdbhttp
import (
"fmt"
"io"
"net/http"
"runtime"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fastjson"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentsdb-http"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="opentsdb-http"}`)
opentsdbReadCalls = metrics.NewCounter(`vm_read_calls_total{name="opentsdb-http"}`)
opentsdbReadErrors = metrics.NewCounter(`vm_read_errors_total{name="opentsdb-http"}`)
opentsdbUnmarshalErrors = metrics.NewCounter(`vm_unmarshal_errors_total{name="opentsdb-http"}`)
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentsdbhttp"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="opentsdbhttp"}`)
)
// insertHandler processes HTTP OpenTSDB put requests.
// InsertHandler processes HTTP OpenTSDB put requests.
// See http://opentsdb.net/docs/build/html/api_http/put.html
func insertHandler(req *http.Request, maxSize int64) error {
return concurrencylimiter.Do(func() error {
return insertHandlerInternal(req, maxSize)
})
func InsertHandler(req *http.Request) error {
path := req.URL.Path
switch path {
case "/api/put":
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
default:
return fmt.Errorf("unexpected path requested on HTTP OpenTSDB server: %q", path)
}
}
func insertHandlerInternal(req *http.Request, maxSize int64) error {
opentsdbReadCalls.Inc()
func insertRows(rows []parser.Row) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
r := req.Body
if req.Header.Get("Content-Encoding") == "gzip" {
zr, err := common.GetGzipReader(r)
if err != nil {
opentsdbReadErrors.Inc()
return fmt.Errorf("cannot read gzipped http protocol data: %s", err)
}
defer common.PutGzipReader(zr)
r = zr
}
ctx := getPushCtx()
defer putPushCtx(ctx)
// Read the request in ctx.reqBuf
lr := io.LimitReader(r, maxSize+1)
reqLen, err := ctx.reqBuf.ReadFrom(lr)
if err != nil {
opentsdbReadErrors.Inc()
return fmt.Errorf("cannot read HTTP OpenTSDB request: %s", err)
}
if reqLen > maxSize {
opentsdbReadErrors.Inc()
return fmt.Errorf("too big HTTP OpenTSDB request; mustn't exceed %d bytes", maxSize)
}
// Unmarshal the request to ctx.Rows
p := parserPool.Get()
defer parserPool.Put(p)
v, err := p.ParseBytes(ctx.reqBuf.B)
if err != nil {
opentsdbUnmarshalErrors.Inc()
return fmt.Errorf("cannot parse HTTP OpenTSDB json: %s", err)
}
ctx.Rows.Unmarshal(v)
// Fill in missing timestamps
currentTimestamp := time.Now().Unix()
rows := ctx.Rows.Rows
ctx.Reset(len(rows))
for i := range rows {
r := &rows[i]
if r.Timestamp == 0 {
r.Timestamp = currentTimestamp
}
}
// Convert timestamps in seconds to milliseconds if needed.
// See http://opentsdb.net/docs/javadoc/net/opentsdb/core/Const.html#SECOND_MASK
for i := range rows {
r := &rows[i]
if r.Timestamp&secondMask == 0 {
r.Timestamp *= 1e3
}
}
// Insert ctx.Rows to db.
ic := &ctx.Common
ic.Reset(len(rows))
for i := range rows {
r := &rows[i]
ic.Labels = ic.Labels[:0]
ic.AddLabel("", r.Metric)
ctx.Labels = ctx.Labels[:0]
ctx.AddLabel("", r.Metric)
for j := range r.Tags {
tag := &r.Tags[j]
ic.AddLabel(tag.Key, tag.Value)
ctx.AddLabel(tag.Key, tag.Value)
}
ic.WriteDataPoint(nil, ic.Labels, r.Timestamp, r.Value)
ctx.WriteDataPoint(nil, ctx.Labels, r.Timestamp, r.Value)
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ic.FlushBufs()
return ctx.FlushBufs()
}
const secondMask int64 = 0x7FFFFFFF00000000
var parserPool fastjson.ParserPool
type pushCtx struct {
Rows Rows
Common common.InsertCtx
reqBuf bytesutil.ByteBuffer
}
func (ctx *pushCtx) reset() {
ctx.Rows.Reset()
ctx.Common.Reset(0)
ctx.reqBuf.Reset()
}
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

View File

@@ -1,70 +0,0 @@
package opentsdbhttp
import (
"context"
"net/http"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/metrics"
)
var (
writeRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/put", protocol="opentsdb-http"}`)
writeErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/api/put", protocol="opentsdb-http"}`)
)
var (
httpServer *http.Server
httpAddr string
maxRequestSize int64
)
// Serve starts HTTP OpenTSDB server on the given addr.
func Serve(addr string, maxReqSize int64) {
logger.Infof("starting HTTP OpenTSDB server at %q", addr)
httpAddr = addr
maxRequestSize = maxReqSize
httpServer = &http.Server{
Addr: addr,
Handler: http.HandlerFunc(requestHandler),
ReadTimeout: 30 * time.Second,
WriteTimeout: 10 * time.Second,
}
go func() {
err := httpServer.ListenAndServe()
if err == http.ErrServerClosed {
return
}
if err != nil {
logger.Fatalf("error serving HTTP OpenTSDB: %s", err)
}
}()
}
// requestHandler handles HTTP OpenTSDB insert request.
func requestHandler(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/put":
writeRequests.Inc()
if err := insertHandler(r, maxRequestSize); err != nil {
writeErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return
}
w.WriteHeader(http.StatusNoContent)
default:
httpserver.Errorf(w, "unexpected path requested on HTTP OpenTSDB server: %q", r.URL.Path)
}
}
// Stop stops HTTP OpenTSDB server.
func Stop() {
logger.Infof("stopping HTTP OpenTSDB server at %q...", httpAddr)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
if err := httpServer.Shutdown(ctx); err != nil {
logger.Fatalf("cannot close HTTP OpenTSDB server: %s", err)
}
}

View File

@@ -1,112 +0,0 @@
package prometheus
import (
"fmt"
"net/http"
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/concurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="prometheus"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="prometheus"}`)
)
// InsertHandler processes remote write for prometheus.
func InsertHandler(r *http.Request, maxSize int64) error {
return concurrencylimiter.Do(func() error {
return insertHandlerInternal(r, maxSize)
})
}
func insertHandlerInternal(r *http.Request, maxSize int64) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
if err := ctx.Read(r, maxSize); err != nil {
return err
}
timeseries := ctx.req.Timeseries
rowsLen := 0
for i := range timeseries {
rowsLen += len(timeseries[i].Samples)
}
ic := &ctx.Common
ic.Reset(rowsLen)
rowsTotal := 0
for i := range timeseries {
ts := &timeseries[i]
var metricNameRaw []byte
for i := range ts.Samples {
r := &ts.Samples[i]
metricNameRaw = ic.WriteDataPointExt(metricNameRaw, ts.Labels, r.Timestamp, r.Value)
}
rowsTotal += len(ts.Samples)
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ic.FlushBufs()
}
type pushCtx struct {
Common common.InsertCtx
req prompb.WriteRequest
reqBuf []byte
}
func (ctx *pushCtx) reset() {
ctx.Common.Reset(0)
ctx.req.Reset()
ctx.reqBuf = ctx.reqBuf[:0]
}
func (ctx *pushCtx) Read(r *http.Request, maxSize int64) error {
prometheusReadCalls.Inc()
var err error
ctx.reqBuf, err = prompb.ReadSnappy(ctx.reqBuf[:0], r.Body, maxSize)
if err != nil {
prometheusReadErrors.Inc()
return fmt.Errorf("cannot read prompb.WriteRequest: %s", err)
}
if err = ctx.req.Unmarshal(ctx.reqBuf); err != nil {
prometheusUnmarshalErrors.Inc()
return fmt.Errorf("cannot unmarshal prompb.WriteRequest with size %d bytes: %s", len(ctx.reqBuf), err)
}
return nil
}
var (
prometheusReadCalls = metrics.NewCounter(`vm_read_calls_total{name="prometheus"}`)
prometheusReadErrors = metrics.NewCounter(`vm_read_errors_total{name="prometheus"}`)
prometheusUnmarshalErrors = metrics.NewCounter(`vm_unmarshal_errors_total{name="prometheus"}`)
)
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

View File

@@ -0,0 +1,97 @@
package prompush
import (
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="promscrape"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="promscrape"}`)
)
// Push pushes wr to to storage.
func Push(wr *prompbmarshal.WriteRequest) {
ctx := getPushCtx()
defer putPushCtx(ctx)
timeseries := wr.Timeseries
rowsLen := 0
for i := range timeseries {
rowsLen += len(timeseries[i].Samples)
}
ic := &ctx.Common
ic.Reset(rowsLen)
rowsTotal := 0
labels := ctx.labels[:0]
for i := range timeseries {
ts := &timeseries[i]
labels = labels[:0]
for j := range ts.Labels {
label := &ts.Labels[j]
labels = append(labels, prompb.Label{
Name: bytesutil.ToUnsafeBytes(label.Name),
Value: bytesutil.ToUnsafeBytes(label.Value),
})
}
var metricNameRaw []byte
for i := range ts.Samples {
r := &ts.Samples[i]
metricNameRaw = ic.WriteDataPointExt(metricNameRaw, labels, r.Timestamp, r.Value)
}
rowsTotal += len(ts.Samples)
}
ctx.labels = labels
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
if err := ic.FlushBufs(); err != nil {
logger.Errorf("cannot flush promscrape data to storage: %s", err)
}
}
type pushCtx struct {
Common common.InsertCtx
labels []prompb.Label
}
func (ctx *pushCtx) reset() {
ctx.Common.Reset(0)
for i := range ctx.labels {
label := &ctx.labels[i]
label.Name = nil
label.Value = nil
}
ctx.labels = ctx.labels[:0]
}
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

View File

@@ -0,0 +1,47 @@
package promremotewrite
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="promremotewrite"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="promremotewrite"}`)
)
// InsertHandler processes remote write for prometheus.
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(timeseries []prompb.TimeSeries) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
rowsLen := 0
for i := range timeseries {
rowsLen += len(timeseries[i].Samples)
}
ctx.Reset(rowsLen)
rowsTotal := 0
for i := range timeseries {
ts := &timeseries[i]
var metricNameRaw []byte
for i := range ts.Samples {
r := &ts.Samples[i]
metricNameRaw = ctx.WriteDataPointExt(metricNameRaw, ts.Labels, r.Timestamp, r.Value)
}
rowsTotal += len(ts.Samples)
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ctx.FlushBufs()
}

View File

@@ -0,0 +1,94 @@
package vmimport
import (
"net/http"
"runtime"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="vmimport"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="vmimport"}`)
)
// InsertHandler processes `/api/v1/import` request.
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, insertRows)
})
}
func insertRows(rows []parser.Row) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
rowsLen := 0
for i := range rows {
rowsLen += len(rows[i].Values)
}
ic := &ctx.Common
ic.Reset(rowsLen)
rowsTotal := 0
for i := range rows {
r := &rows[i]
ic.Labels = ic.Labels[:0]
for j := range r.Tags {
tag := &r.Tags[j]
ic.AddLabelBytes(tag.Key, tag.Value)
}
ctx.metricNameBuf = storage.MarshalMetricNameRaw(ctx.metricNameBuf[:0], ic.Labels)
values := r.Values
timestamps := r.Timestamps
_ = timestamps[len(values)-1]
for j, value := range values {
timestamp := timestamps[j]
ic.WriteDataPoint(ctx.metricNameBuf, nil, timestamp, value)
}
rowsTotal += len(values)
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ic.FlushBufs()
}
type pushCtx struct {
Common common.InsertCtx
metricNameBuf []byte
}
func (ctx *pushCtx) reset() {
ctx.Common.Reset(0)
ctx.metricNameBuf = ctx.metricNameBuf[:0]
}
func getPushCtx() *pushCtx {
select {
case ctx := <-pushCtxPoolCh:
return ctx
default:
if v := pushCtxPool.Get(); v != nil {
return v.(*pushCtx)
}
return &pushCtx{}
}
}
func putPushCtx(ctx *pushCtx) {
ctx.reset()
select {
case pushCtxPoolCh <- ctx:
default:
pushCtxPool.Put(ctx)
}
}
var pushCtxPool sync.Pool
var pushCtxPoolCh = make(chan *pushCtx, runtime.GOMAXPROCS(-1))

67
app/vmrestore/Makefile Normal file
View File

@@ -0,0 +1,67 @@
# All these commands must run from repository root.
vmrestore:
APP_NAME=vmrestore $(MAKE) app-local
vmrestore-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker
vmrestore-pure-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-pure
vmrestore-amd64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-amd64
vmrestore-arm-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-arm
vmrestore-arm64-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-arm64
vmrestore-ppc64le-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-ppc64le
vmrestore-386-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker-386
package-vmrestore:
APP_NAME=vmrestore $(MAKE) package-via-docker
package-vmrestore-pure:
APP_NAME=vmrestore $(MAKE) package-via-docker-pure
package-vmrestore-amd64:
APP_NAME=vmrestore $(MAKE) package-via-docker-amd64
package-vmrestore-arm:
APP_NAME=vmrestore $(MAKE) package-via-docker-arm
package-vmrestore-arm64:
APP_NAME=vmrestore $(MAKE) package-via-docker-arm64
package-vmrestore-ppc64le:
APP_NAME=vmrestore $(MAKE) package-via-docker-ppc64le
package-vmrestore-386:
APP_NAME=vmrestore $(MAKE) package-via-docker-386
publish-vmrestore:
APP_NAME=vmrestore $(MAKE) publish-via-docker
vmrestore-pure:
APP_NAME=vmrestore $(MAKE) app-local-pure
vmrestore-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-amd64 ./app/vmrestore
vmrestore-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-arm ./app/vmrestore
vmrestore-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-arm64 ./app/vmrestore
vmrestore-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-ppc64le ./app/vmrestore
vmrestore-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-386 ./app/vmrestore

86
app/vmrestore/README.md Normal file
View File

@@ -0,0 +1,86 @@
## vmrestore
`vmrestore` restores data from backups created by [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md).
VictoriaMetrics `v1.29.0` and newer versions must be used for working with the restored data.
Restore process can be interrupted at any time. It is automatically resumed from the inerruption point
when restarting `vmrestore` with the same args.
### Usage
VictoriaMetrics must be stopped during the restore process.
```
vmrestore -src=gcs://<bucket>/<path/to/backup> -storageDataPath=<local/path/to/restore>
```
* `<bucket>` is [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets) name.
* `<path/to/backup>` is the path to backup made with [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md) on GCS bucket.
* `<local/path/to/restore>` is the path to folder where data will be restored. This folder must be passed
to VictoriaMetrics in `-storageDataPath` command-line flag after the restore process is complete.
The original `-storageDataPath` directory may contain old files. They will be susbstituted by the files from backup.
### Troubleshooting
* If `vmrestore` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value.
* If `vmrestore` has been interrupted due to temporary error, then just restart it with the same args. It will resume the restore process.
### Advanced usage
Run `vmrestore -help` in order to see all the available options:
```
-concurrency int
The number of concurrent workers. Higher concurrency may reduce restore duration (default 10)
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs (default "default")
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, ERROR, FATAL, PANIC (default "INFO")
-maxBytesPerSecond int
The maximum download speed. There is no limit if it is set to 0
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy (default 60)
-src string
Source path with backup on the remote storage. Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-storageDataPath string
Destination path where backup must be restored. VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case only missing data is downloaded from backup (default "victoria-metrics-data")
-version
Show VictoriaMetrics version
```
### How to build from sources
It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - see `vmutils-*` archives there.
#### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
2. Run `make vmrestore` from the root folder of the repository.
It builds `vmrestore` binary and puts it into the `bin` folder.
#### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmrestore-prod` from the root folder of the repository.
It builds `vmrestore-prod` binary and puts it into the `bin` folder.
#### Building docker images
Run `make package-vmrestore`. It builds `victoriametrics/vmrestore:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmrestore`.

View File

@@ -0,0 +1,7 @@
ARG certs_image
FROM $certs_image AS certs
FROM scratch
COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
ARG src_binary
COPY $src_binary ./vmrestore-prod
ENTRYPOINT ["/vmrestore-prod"]

81
app/vmrestore/main.go Normal file
View File

@@ -0,0 +1,81 @@
package main
import (
"flag"
"fmt"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/actions"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fslocal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
var (
src = flag.String("src", "", "Source path with backup on the remote storage. "+
"Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir")
storageDataPath = flag.String("storageDataPath", "victoria-metrics-data", "Destination path where backup must be restored. "+
"VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case only missing data is downloaded from backup")
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce restore duration")
maxBytesPerSecond = flag.Int("maxBytesPerSecond", 0, "The maximum download speed. There is no limit if it is set to 0")
skipBackupCompleteCheck = flag.Bool("skipBackupCompleteCheck", false, "Whether to skip checking for 'backup complete' file in -src. This may be useful for restoring from old backups, which were created without 'backup complete' file")
)
func main() {
flag.Usage = usage
envflag.Parse()
buildinfo.Init()
srcFS, err := newSrcFS()
if err != nil {
logger.Fatalf("%s", err)
}
dstFS, err := newDstFS()
if err != nil {
logger.Fatalf("%s", err)
}
a := &actions.Restore{
Concurrency: *concurrency,
Src: srcFS,
Dst: dstFS,
SkipBackupCompleteCheck: *skipBackupCompleteCheck,
}
if err := a.Run(); err != nil {
logger.Fatalf("cannot restore from backup: %s", err)
}
}
func usage() {
const s = `
vmrestore restores VictoriaMetrics data from backups made by vmbackup.
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md .
`
f := flag.CommandLine.Output()
fmt.Fprintf(f, "%s\n", s)
flag.PrintDefaults()
}
func newDstFS() (*fslocal.FS, error) {
if len(*storageDataPath) == 0 {
return nil, fmt.Errorf("`-storageDataPath` cannot be empty")
}
fs := &fslocal.FS{
Dir: *storageDataPath,
MaxBytesPerSecond: *maxBytesPerSecond,
}
if err := fs.Init(); err != nil {
return nil, fmt.Errorf("cannot initialize local fs: %s", err)
}
return fs, nil
}
func newSrcFS() (common.RemoteFS, error) {
fs, err := actions.NewRemoteFS(*src)
if err != nil {
return nil, fmt.Errorf("cannot parse `-src`=%q: %s", *src, err)
}
return fs, nil
}

View File

@@ -21,10 +21,26 @@ import (
var (
deleteAuthKey = flag.String("deleteAuthKey", "", "authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series")
maxConcurrentRequests = flag.Int("search.maxConcurrentRequests", runtime.GOMAXPROCS(-1)*2, "The maximum number of concurrent search requests. It shouldn't exceed 2*vCPUs for better performance. See also -search.maxQueueDuration")
maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached")
maxConcurrentRequests = flag.Int("search.maxConcurrentRequests", getDefaultMaxConcurrentRequests(), "The maximum number of concurrent search requests. "+
"It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration")
maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached")
resetCacheAuthKey = flag.String("search.resetCacheAuthKey", "", "Optional authKey for resetting rollup cache via /internal/resetCache call")
)
func getDefaultMaxConcurrentRequests() int {
n := runtime.GOMAXPROCS(-1)
if n <= 4 {
n *= 2
}
if n > 16 {
// A single request can saturate all the CPU cores, so there is no sense
// in allowing higher number of concurrent requests - they will just contend
// for unavailable CPU time.
n = 16
}
return n
}
// Init initializes vmselect
func Init() {
tmpDirPath := *vmstorage.DataPath + "/tmp"
@@ -56,6 +72,7 @@ var (
// RequestHandler handles remote read API requests for Prometheus
func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
startTime := time.Now()
// Limit the number of concurrent queries.
select {
case concurrencyCh <- struct{}{}:
@@ -72,7 +89,9 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
timerpool.Put(t)
concurrencyLimitTimeout.Inc()
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot handle more than %d concurrent requests", cap(concurrencyCh)),
Err: fmt.Errorf("cannot handle more than %d concurrent search requests during %s; possible solutions: "+
"increase `-search.maxQueueDuration`, increase `-search.maxConcurrentRequests`, increase server capacity",
*maxConcurrentRequests, *maxQueueDuration),
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, "%s", err)
@@ -81,13 +100,22 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
path := strings.Replace(r.URL.Path, "//", "/", -1)
if path == "/internal/resetRollupResultCache" {
if len(*resetCacheAuthKey) > 0 && r.FormValue("authKey") != *resetCacheAuthKey {
sendPrometheusError(w, r, fmt.Errorf("invalid authKey=%q for %q", r.FormValue("authKey"), path))
return true
}
promql.ResetRollupResultCache()
return true
}
if strings.HasPrefix(path, "/api/v1/label/") {
s := r.URL.Path[len("/api/v1/label/"):]
if strings.HasSuffix(s, "/values") {
labelValuesRequests.Inc()
labelName := s[:len(s)-len("/values")]
httpserver.EnableCORS(w, r)
if err := prometheus.LabelValuesHandler(labelName, w, r); err != nil {
if err := prometheus.LabelValuesHandler(startTime, labelName, w, r); err != nil {
labelValuesErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -100,7 +128,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/query":
queryRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.QueryHandler(w, r); err != nil {
if err := prometheus.QueryHandler(startTime, w, r); err != nil {
queryErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -109,7 +137,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/query_range":
queryRangeRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.QueryRangeHandler(w, r); err != nil {
if err := prometheus.QueryRangeHandler(startTime, w, r); err != nil {
queryRangeErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -118,7 +146,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/series":
seriesRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.SeriesHandler(w, r); err != nil {
if err := prometheus.SeriesHandler(startTime, w, r); err != nil {
seriesErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -127,7 +155,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/series/count":
seriesCountRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.SeriesCountHandler(w, r); err != nil {
if err := prometheus.SeriesCountHandler(startTime, w, r); err != nil {
seriesCountErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -136,7 +164,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/labels":
labelsRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.LabelsHandler(w, r); err != nil {
if err := prometheus.LabelsHandler(startTime, w, r); err != nil {
labelsErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -145,7 +173,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case "/api/v1/labels/count":
labelsCountRequests.Inc()
httpserver.EnableCORS(w, r)
if err := prometheus.LabelsCountHandler(w, r); err != nil {
if err := prometheus.LabelsCountHandler(startTime, w, r); err != nil {
labelsCountErrors.Inc()
sendPrometheusError(w, r, err)
return true
@@ -153,7 +181,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
case "/api/v1/export":
exportRequests.Inc()
if err := prometheus.ExportHandler(w, r); err != nil {
if err := prometheus.ExportHandler(startTime, w, r); err != nil {
exportErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
@@ -161,12 +189,30 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
case "/federate":
federateRequests.Inc()
if err := prometheus.FederateHandler(w, r); err != nil {
if err := prometheus.FederateHandler(startTime, w, r); err != nil {
federateErrors.Inc()
httpserver.Errorf(w, "error int %q: %s", r.URL.Path, err)
return true
}
return true
case "/api/v1/rules":
// Return dumb placeholder
rulesRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"success","data":{"groups":[]}}`)
return true
case "/api/v1/alerts":
// Return dumb placehloder
alertsRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"success","data":{"alerts":[]}}`)
return true
case "/api/v1/metadata":
// Return dumb placeholder
metadataRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"success","data":{}}`)
return true
case "/api/v1/admin/tsdb/delete_series":
deleteRequests.Inc()
authKey := r.FormValue("authKey")
@@ -174,7 +220,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
httpserver.Errorf(w, "invalid authKey %q. It must match the value from -deleteAuthKey command line flag", authKey)
return true
}
if err := prometheus.DeleteHandler(r); err != nil {
if err := prometheus.DeleteHandler(startTime, r); err != nil {
deleteErrors.Inc()
httpserver.Errorf(w, "error in %q: %s", r.URL.Path, err)
return true
@@ -187,7 +233,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
func sendPrometheusError(w http.ResponseWriter, r *http.Request, err error) {
logger.Errorf("error in %q: %s", r.URL.Path, err)
logger.Errorf("error in %q: %s", r.RequestURI, err)
w.Header().Set("Content-Type", "application/json")
statusCode := http.StatusUnprocessableEntity
@@ -228,4 +274,8 @@ var (
federateRequests = metrics.NewCounter(`vm_http_requests_total{path="/federate"}`)
federateErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/federate"}`)
rulesRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/rules"}`)
alertsRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/alerts"}`)
metadataRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/metadata"}`)
)

View File

@@ -1,9 +0,0 @@
package netstorage
import (
"os"
)
func mustFadviseSequentialRead(f *os.File) {
// Do nothing :)
}

View File

@@ -1,15 +0,0 @@
package netstorage
import (
"os"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"golang.org/x/sys/unix"
)
func mustFadviseSequentialRead(f *os.File) {
fd := int(f.Fd())
if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_SEQUENTIAL|unix.FADV_WILLNEED); err != nil {
logger.Panicf("FATAL: error returned from unix.Fadvise(SEQUENTIAL|WILLNEED): %s", err)
}
}

View File

@@ -1,15 +0,0 @@
package netstorage
import (
"os"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"golang.org/x/sys/unix"
)
func mustFadviseSequentialRead(f *os.File) {
fd := int(f.Fd())
if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_SEQUENTIAL|unix.FADV_WILLNEED); err != nil {
logger.Panicf("FATAL: error returned from unix.Fadvise(SEQUENTIAL|WILLNEED): %s", err)
}
}

View File

@@ -92,6 +92,7 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
doneCh := make(chan error)
// Start workers.
rowsProcessedTotal := uint64(0)
for i := 0; i < workersCount; i++ {
go func(workerID uint) {
rs := getResult()
@@ -99,9 +100,10 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
maxWorkersCount := gomaxprocs / workersCount
var err error
rowsProcessed := 0
for pts := range workCh {
if time.Until(rss.deadline.Deadline) < 0 {
err = fmt.Errorf("timeout exceeded during query execution: %s", rss.deadline.Timeout)
err = fmt.Errorf("timeout exceeded during query execution: %s", rss.deadline.String())
break
}
if err = pts.Unpack(rss.tbf, rs, rss.tr, rss.fetchData, maxWorkersCount); err != nil {
@@ -111,8 +113,10 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
// Skip empty blocks.
continue
}
rowsProcessed += len(rs.Values)
f(rs, workerID)
}
atomic.AddUint64(&rowsProcessedTotal, uint64(rowsProcessed))
// Drain the remaining work
for range workCh {
}
@@ -124,6 +128,7 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
for i := range rss.packedTimeseries {
workCh <- &rss.packedTimeseries[i]
}
seriesProcessedTotal := len(rss.packedTimeseries)
rss.packedTimeseries = rss.packedTimeseries[:0]
close(workCh)
@@ -134,6 +139,8 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
errors = append(errors, err)
}
}
perQueryRowsProcessed.Update(float64(rowsProcessedTotal))
perQuerySeriesProcessed.Update(float64(seriesProcessedTotal))
if len(errors) > 0 {
// Return just the first error, since other errors
// is likely duplicate the first error.
@@ -142,6 +149,9 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
return nil
}
var perQueryRowsProcessed = metrics.NewHistogram(`vm_per_query_rows_processed_count`)
var perQuerySeriesProcessed = metrics.NewHistogram(`vm_per_query_series_processed_count`)
var gomaxprocs = runtime.GOMAXPROCS(-1)
type packedTimeseries struct {
@@ -256,7 +266,7 @@ func mergeSortBlocks(dst *Result, sbh sortBlocksHeap) {
dst.Timestamps = append(dst.Timestamps, top.Timestamps[top.NextIdx:]...)
dst.Values = append(dst.Values, top.Values[top.NextIdx:]...)
putSortBlock(top)
return
break
}
sbNext := sbh[0]
tsNext := sbNext.Timestamps[sbNext.NextIdx]
@@ -277,6 +287,8 @@ func mergeSortBlocks(dst *Result, sbh sortBlocksHeap) {
putSortBlock(top)
}
}
dst.Timestamps, dst.Values = storage.DeduplicateSamples(dst.Timestamps, dst.Values)
}
type sortBlock struct {
@@ -422,13 +434,10 @@ func GetLabelEntries(deadline Deadline) ([]storage.TagEntry, error) {
// Sort labelEntries by the number of label values in each entry.
sort.Slice(labelEntries, func(i, j int) bool {
a, b := labelEntries[i].Values, labelEntries[j].Values
if len(a) < len(b) {
return true
if len(a) != len(b) {
return len(a) > len(b)
}
if len(a) > len(b) {
return false
}
return labelEntries[i].Key < labelEntries[j].Key
return labelEntries[i].Key > labelEntries[j].Key
})
return labelEntries, nil
@@ -452,16 +461,12 @@ func getStorageSearch() *storage.Search {
}
func putStorageSearch(sr *storage.Search) {
n := atomic.LoadUint64(&sr.MissingMetricNamesForMetricID)
missingMetricNamesForMetricID.Add(int(n))
sr.MustClose()
ssPool.Put(sr)
}
var ssPool sync.Pool
var missingMetricNamesForMetricID = metrics.NewCounter(`vm_missing_metric_names_for_metric_id_total`)
// ProcessSearchQuery performs sq on storage nodes until the given deadline.
func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadline) (*Results, error) {
// Setup search.
@@ -496,7 +501,7 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadli
}
if time.Until(deadline.Deadline) < 0 {
putTmpBlocksFile(tbf)
return nil, fmt.Errorf("timeout exceeded while fetching data block #%d from storage: %s", blocksRead, deadline.Timeout)
return nil, fmt.Errorf("timeout exceeded while fetching data block #%d from storage: %s", blocksRead, deadline.String())
}
metricName := sr.MetricBlock.MetricName
m[string(metricName)] = append(m[string(metricName)], addr)
@@ -572,13 +577,24 @@ func setupTfss(tagFilterss [][]storage.TagFilter) ([]*storage.TagFilters, error)
// Deadline contains deadline with the corresponding timeout for pretty error messages.
type Deadline struct {
Deadline time.Time
Timeout time.Duration
timeout time.Duration
flagHint string
}
// NewDeadline returns deadline for the given timeout.
func NewDeadline(timeout time.Duration) Deadline {
//
// flagHint must contain a hit for command-line flag, which could be used
// in order to increase timeout.
func NewDeadline(timeout time.Duration, flagHint string) Deadline {
return Deadline{
Deadline: time.Now().Add(timeout),
Timeout: timeout,
timeout: timeout,
flagHint: flagHint,
}
}
// String returns human-readable string representation for d.
func (d *Deadline) String() string {
return fmt.Sprintf("%.3f seconds; the timeout can be adjusted with `%s` command-line flag", d.timeout.Seconds(), d.flagHint)
}

View File

@@ -36,6 +36,9 @@ func maxInmemoryTmpBlocksFile() int {
if maxLen < 64*1024 {
return 64 * 1024
}
if maxLen > 4*1024*1024 {
return 4 * 1024 * 1024
}
return maxLen
}
@@ -47,6 +50,7 @@ type tmpBlocksFile struct {
buf []byte
f *os.File
r *fs.ReaderAt
offset uint64
}
@@ -65,6 +69,7 @@ func putTmpBlocksFile(tbf *tmpBlocksFile) {
tbf.MustClose()
tbf.buf = tbf.buf[:0]
tbf.f = nil
tbf.r = nil
tbf.offset = 0
tmpBlocksFilePool.Put(tbf)
}
@@ -118,17 +123,20 @@ func (tbf *tmpBlocksFile) Finalize() error {
if tbf.f == nil {
return nil
}
fname := tbf.f.Name()
if _, err := tbf.f.Write(tbf.buf); err != nil {
return fmt.Errorf("cannot flush the remaining %d bytes to tmpBlocksFile: %s", len(tbf.buf), err)
return fmt.Errorf("cannot write the remaining %d bytes to %q: %s", len(tbf.buf), fname, err)
}
tbf.buf = tbf.buf[:0]
if _, err := tbf.f.Seek(0, 0); err != nil {
logger.Panicf("FATAL: cannot seek to the start of file: %s", err)
r, err := fs.OpenReaderAt(fname)
if err != nil {
logger.Panicf("FATAL: cannot open %q: %s", fname, err)
}
// Hint the OS that the file is read almost sequentiallly.
// This should reduce the number of disk seeks, which is important
// for HDDs.
mustFadviseSequentialRead(tbf.f)
r.MustFadviseSequentialRead(true)
tbf.r = r
return nil
}
@@ -140,13 +148,7 @@ func (tbf *tmpBlocksFile) MustReadBlockAt(dst *storage.Block, addr tmpBlockAddr)
bb := tmpBufPool.Get()
defer tmpBufPool.Put(bb)
bb.B = bytesutil.Resize(bb.B, addr.size)
n, err := tbf.f.ReadAt(bb.B, int64(addr.offset))
if err != nil {
logger.Panicf("FATAL: cannot read from %q at %s: %s", tbf.f.Name(), addr, err)
}
if n != len(bb.B) {
logger.Panicf("FATAL: too short number of bytes read at %s; got %d; want %d", addr, n, len(bb.B))
}
tbf.r.MustReadAt(bb.B, int64(addr.offset))
buf = bb.B
}
tail, err := storage.UnmarshalBlock(dst, buf)
@@ -164,6 +166,10 @@ func (tbf *tmpBlocksFile) MustClose() {
if tbf.f == nil {
return
}
if tbf.r != nil {
// tbf.r could be nil if Finalize wasn't called.
tbf.r.MustClose()
}
fname := tbf.f.Name()
// Remove the file at first, then close it.

View File

@@ -15,26 +15,27 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/quicktemplate"
)
var (
latencyOffset = flag.Duration("search.latencyOffset", time.Second*60, "The time when data points become visible in query results after the colection. "+
latencyOffset = flag.Duration("search.latencyOffset", time.Second*30, "The time when data points become visible in query results after the colection. "+
"Too small value can result in incomplete last points for query results")
maxQueryDuration = flag.Duration("search.maxQueryDuration", time.Second*30, "The maximum time for search query execution")
maxQueryLen = flag.Int("search.maxQueryLen", 16*1024, "The maximum search query length in bytes")
maxLookback = flag.Duration("search.maxLookback", 0, "Synonim to `-search.lookback-delta` from Prometheus. "+
"The value is dynamically detected from interval between time series datapoints if not set. It can be overriden on per-query basis via `max_lookback` arg")
maxExportDuration = flag.Duration("search.maxExportDuration", time.Hour*24*30, "The maximum duration for /api/v1/export call")
maxQueryDuration = flag.Duration("search.maxQueryDuration", time.Second*30, "The maximum duration for search query execution")
maxQueryLen = flag.Int("search.maxQueryLen", 16*1024, "The maximum search query length in bytes")
maxLookback = flag.Duration("search.maxLookback", 0, "Synonim to -search.lookback-delta from Prometheus. "+
"The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg")
)
// Default step used if not set.
const defaultStep = 5 * 60 * 1000
// FederateHandler implements /federate . See https://prometheus.io/docs/prometheus/latest/federation/
func FederateHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
func FederateHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
ct := currentTime()
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse request form values: %s", err)
@@ -58,7 +59,7 @@ func FederateHandler(w http.ResponseWriter, r *http.Request) error {
if err != nil {
return err
}
deadline := getDeadline(r)
deadline := getDeadlineForQuery(r)
if start >= end {
start = end - defaultStep
}
@@ -105,8 +106,7 @@ func FederateHandler(w http.ResponseWriter, r *http.Request) error {
var federateDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/federate"}`)
// ExportHandler exports data in raw format from /api/v1/export.
func ExportHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
func ExportHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
ct := currentTime()
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse request form values: %s", err)
@@ -129,12 +129,12 @@ func ExportHandler(w http.ResponseWriter, r *http.Request) error {
return err
}
format := r.FormValue("format")
deadline := getDeadline(r)
deadline := getDeadlineForExport(r)
if start >= end {
start = end - defaultStep
end = start + defaultStep
}
if err := exportHandler(w, matches, start, end, format, deadline); err != nil {
return err
return fmt.Errorf("error when exporting data for queries=%q on the time range (start=%d, end=%d): %s", matches, start, end, err)
}
exportDuration.UpdateDuration(startTime)
return nil
@@ -145,7 +145,7 @@ var exportDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/
func exportHandler(w http.ResponseWriter, matches []string, start, end int64, format string, deadline netstorage.Deadline) error {
writeResponseFunc := WriteExportStdResponse
writeLineFunc := WriteExportJSONLine
contentType := "application/json"
contentType := "application/stream+json"
if format == "prometheus" {
contentType = "text/plain"
writeLineFunc = WriteExportPrometheusLine
@@ -198,8 +198,7 @@ func exportHandler(w http.ResponseWriter, matches []string, start, end int64, fo
// DeleteHandler processes /api/v1/admin/tsdb/delete_series prometheus API request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#delete-series
func DeleteHandler(r *http.Request) error {
startTime := time.Now()
func DeleteHandler(startTime time.Time, r *http.Request) error {
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse request form values: %s", err)
}
@@ -233,9 +232,8 @@ var deleteDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/
// LabelValuesHandler processes /api/v1/label/<labelName>/values request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values
func LabelValuesHandler(labelName string, w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
deadline := getDeadline(r)
func LabelValuesHandler(startTime time.Time, labelName string, w http.ResponseWriter, r *http.Request) error {
deadline := getDeadlineForQuery(r)
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse form values: %s", err)
@@ -285,8 +283,15 @@ func labelValuesWithMatches(labelName string, matches []string, start, end int64
if err != nil {
return nil, err
}
for i, tfs := range tagFilterss {
// Add `labelName!=''` tag filter in order to filter out series without the labelName.
tagFilterss[i] = append(tfs, storage.TagFilter{
Key: []byte(labelName),
IsNegative: true,
})
}
if start >= end {
start = end - defaultStep
end = start + defaultStep
}
sq := &storage.SearchQuery{
MinTimestamp: start,
@@ -324,9 +329,8 @@ func labelValuesWithMatches(labelName string, matches []string, start, end int64
var labelValuesDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/label/{}/values"}`)
// LabelsCountHandler processes /api/v1/labels/count request.
func LabelsCountHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
deadline := getDeadline(r)
func LabelsCountHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
deadline := getDeadlineForQuery(r)
labelEntries, err := netstorage.GetLabelEntries(deadline)
if err != nil {
return fmt.Errorf(`cannot obtain label entries: %s`, err)
@@ -343,12 +347,39 @@ var labelsCountDuration = metrics.NewSummary(`vm_request_duration_seconds{path="
// LabelsHandler processes /api/v1/labels request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names
func LabelsHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
deadline := getDeadline(r)
labels, err := netstorage.GetLabels(deadline)
if err != nil {
return fmt.Errorf("cannot obtain labels: %s", err)
func LabelsHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
deadline := getDeadlineForQuery(r)
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse form values: %s", err)
}
var labels []string
if len(r.Form["match[]"]) == 0 && len(r.Form["start"]) == 0 && len(r.Form["end"]) == 0 {
var err error
labels, err = netstorage.GetLabels(deadline)
if err != nil {
return fmt.Errorf("cannot obtain labels: %s", err)
}
} else {
// Extended functionality that allows filtering by label filters and time range
// i.e. /api/v1/labels?match[]=foobar{baz="abc"}&start=...&end=...
matches := r.Form["match[]"]
if len(matches) == 0 {
matches = []string{"{__name__!=''}"}
}
ct := currentTime()
end, err := getTime(r, "end", ct)
if err != nil {
return err
}
start, err := getTime(r, "start", end-defaultStep)
if err != nil {
return err
}
labels, err = labelsWithMatches(matches, start, end, deadline)
if err != nil {
return fmt.Errorf("cannot obtain labels for match[]=%q, start=%d, end=%d: %s", matches, start, end, err)
}
}
w.Header().Set("Content-Type", "application/json")
@@ -357,12 +388,56 @@ func LabelsHandler(w http.ResponseWriter, r *http.Request) error {
return nil
}
func labelsWithMatches(matches []string, start, end int64, deadline netstorage.Deadline) ([]string, error) {
if len(matches) == 0 {
logger.Panicf("BUG: matches must be non-empty")
}
tagFilterss, err := getTagFilterssFromMatches(matches)
if err != nil {
return nil, err
}
if start >= end {
end = start + defaultStep
}
sq := &storage.SearchQuery{
MinTimestamp: start,
MaxTimestamp: end,
TagFilterss: tagFilterss,
}
rss, err := netstorage.ProcessSearchQuery(sq, false, deadline)
if err != nil {
return nil, fmt.Errorf("cannot fetch data for %q: %s", sq, err)
}
m := make(map[string]struct{})
var mLock sync.Mutex
err = rss.RunParallel(func(rs *netstorage.Result, workerID uint) {
mLock.Lock()
tags := rs.MetricName.Tags
for i := range tags {
t := &tags[i]
m[string(t.Key)] = struct{}{}
}
m["__name__"] = struct{}{}
mLock.Unlock()
})
if err != nil {
return nil, fmt.Errorf("error when data fetching: %s", err)
}
labels := make([]string, 0, len(m))
for label := range m {
labels = append(labels, label)
}
sort.Strings(labels)
return labels, nil
}
var labelsDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/labels"}`)
// SeriesCountHandler processes /api/v1/series/count request.
func SeriesCountHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
deadline := getDeadline(r)
func SeriesCountHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
deadline := getDeadlineForQuery(r)
n, err := netstorage.GetSeriesCount(deadline)
if err != nil {
return fmt.Errorf("cannot obtain series count: %s", err)
@@ -378,8 +453,7 @@ var seriesCountDuration = metrics.NewSummary(`vm_request_duration_seconds{path="
// SeriesHandler processes /api/v1/series request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers
func SeriesHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
func SeriesHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
ct := currentTime()
if err := r.ParseForm(); err != nil {
@@ -402,14 +476,14 @@ func SeriesHandler(w http.ResponseWriter, r *http.Request) error {
if err != nil {
return err
}
deadline := getDeadline(r)
deadline := getDeadlineForQuery(r)
tagFilterss, err := getTagFilterssFromMatches(matches)
if err != nil {
return err
}
if start >= end {
start = end - defaultStep
end = start + defaultStep
}
sq := &storage.SearchQuery{
MinTimestamp: start,
@@ -454,8 +528,7 @@ var seriesDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/
// QueryHandler processes /api/v1/query request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#instant-queries
func QueryHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
func QueryHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
ct := currentTime()
query := r.FormValue("query")
@@ -471,40 +544,59 @@ func QueryHandler(w http.ResponseWriter, r *http.Request) error {
if err != nil {
return err
}
deadline := getDeadline(r)
deadline := getDeadlineForQuery(r)
lookbackDelta, err := getMaxLookback(r)
if err != nil {
return err
}
if len(query) > *maxQueryLen {
return fmt.Errorf(`too long query; got %d bytes; mustn't exceed %d bytes`, len(query), *maxQueryLen)
return fmt.Errorf("too long query; got %d bytes; mustn't exceed `-search.maxQueryLen=%d` bytes", len(query), *maxQueryLen)
}
if ct-start < queryOffset {
start -= queryOffset
if !getBool(r, "nocache") && ct-start < queryOffset {
// Adjust start time only if `nocache` arg isn't set.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/241
start = ct - queryOffset
}
if childQuery, windowStr, offsetStr := promql.IsMetricSelectorWithRollup(query); childQuery != "" {
var window int64
if len(windowStr) > 0 {
var err error
window, err = promql.DurationValue(windowStr, step)
if err != nil {
return err
}
window, err := parsePositiveDuration(windowStr, step)
if err != nil {
return fmt.Errorf("cannot parse window: %s", err)
}
var offset int64
if len(offsetStr) > 0 {
var err error
offset, err = promql.DurationValue(offsetStr, step)
if err != nil {
return err
}
offset, err := parseDuration(offsetStr, step)
if err != nil {
return fmt.Errorf("cannot parse offset: %s", err)
}
start -= offset
end := start
start = end - window
if err := exportHandler(w, []string{childQuery}, start, end, "promapi", deadline); err != nil {
return err
return fmt.Errorf("error when exporting data for query=%q on the time range (start=%d, end=%d): %s", childQuery, start, end, err)
}
queryDuration.UpdateDuration(startTime)
return nil
}
if childQuery, windowStr, stepStr, offsetStr := promql.IsRollup(query); childQuery != "" {
newStep, err := parsePositiveDuration(stepStr, step)
if err != nil {
return fmt.Errorf("cannot parse step: %s", err)
}
if newStep > 0 {
step = newStep
}
window, err := parsePositiveDuration(windowStr, step)
if err != nil {
return fmt.Errorf("cannot parse window: %s", err)
}
offset, err := parseDuration(offsetStr, step)
if err != nil {
return fmt.Errorf("cannot parse offset: %s", err)
}
start -= offset
end := start
start = end - window
if err := queryRangeHandler(w, childQuery, start, end, step, r, ct); err != nil {
return fmt.Errorf("error when executing query=%q on the time range (start=%d, end=%d, step=%d): %s", childQuery, start, end, step, err)
}
queryDuration.UpdateDuration(startTime)
return nil
@@ -519,7 +611,7 @@ func QueryHandler(w http.ResponseWriter, r *http.Request) error {
}
result, err := promql.Exec(&ec, query, true)
if err != nil {
return fmt.Errorf("cannot execute %q: %s", query, err)
return fmt.Errorf("error when executing query=%q for (time=%d, step=%d): %s", query, start, step, err)
}
w.Header().Set("Content-Type", "application/json")
@@ -530,11 +622,24 @@ func QueryHandler(w http.ResponseWriter, r *http.Request) error {
var queryDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/query"}`)
func parseDuration(s string, step int64) (int64, error) {
if len(s) == 0 {
return 0, nil
}
return metricsql.DurationValue(s, step)
}
func parsePositiveDuration(s string, step int64) (int64, error) {
if len(s) == 0 {
return 0, nil
}
return metricsql.PositiveDurationValue(s, step)
}
// QueryRangeHandler processes /api/v1/query_range request.
//
// See https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries
func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
func QueryRangeHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) error {
ct := currentTime()
query := r.FormValue("query")
@@ -553,7 +658,15 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
if err != nil {
return err
}
deadline := getDeadline(r)
if err := queryRangeHandler(w, query, start, end, step, r, ct); err != nil {
return fmt.Errorf("error when executing query=%q on the time range (start=%d, end=%d, step=%d): %s", query, start, end, step, err)
}
queryRangeDuration.UpdateDuration(startTime)
return nil
}
func queryRangeHandler(w http.ResponseWriter, query string, start, end, step int64, r *http.Request, ct int64) error {
deadline := getDeadlineForQuery(r)
mayCache := !getBool(r, "nocache")
lookbackDelta, err := getMaxLookback(r)
if err != nil {
@@ -562,10 +675,10 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
// Validate input args.
if len(query) > *maxQueryLen {
return fmt.Errorf(`too long query; got %d bytes; mustn't exceed %d bytes`, len(query), *maxQueryLen)
return fmt.Errorf("too long query; got %d bytes; mustn't exceed `-search.maxQueryLen=%d` bytes", len(query), *maxQueryLen)
}
if start > end {
start = end
end = start + defaultStep
}
if err := promql.ValidateMaxPointsPerTimeseries(start, end, step); err != nil {
return err
@@ -584,7 +697,7 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
}
result, err := promql.Exec(&ec, query, false)
if err != nil {
return fmt.Errorf("cannot execute %q: %s", query, err)
return fmt.Errorf("cannot execute query: %s", err)
}
queryOffset := getLatencyOffsetMilliseconds()
if ct-end < queryOffset {
@@ -597,7 +710,6 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
w.Header().Set("Content-Type", "application/json")
WriteQueryRangeResponse(w, result)
queryRangeDuration.UpdateDuration(startTime)
return nil
}
@@ -746,17 +858,26 @@ func getMaxLookback(r *http.Request) (int64, error) {
return getDuration(r, "max_lookback", d)
}
func getDeadline(r *http.Request) netstorage.Deadline {
func getDeadlineForQuery(r *http.Request) netstorage.Deadline {
dMax := int64(maxQueryDuration.Seconds() * 1e3)
return getDeadlineWithMaxDuration(r, dMax, "-search.maxQueryDuration")
}
func getDeadlineForExport(r *http.Request) netstorage.Deadline {
dMax := int64(maxExportDuration.Seconds() * 1e3)
return getDeadlineWithMaxDuration(r, dMax, "-search.maxExportDuration")
}
func getDeadlineWithMaxDuration(r *http.Request, dMax int64, flagHint string) netstorage.Deadline {
d, err := getDuration(r, "timeout", 0)
if err != nil {
d = 0
}
dMax := int64(maxQueryDuration.Seconds() * 1e3)
if d <= 0 || d > dMax {
d = dMax
}
timeout := time.Duration(d) * time.Millisecond
return netstorage.NewDeadline(timeout)
return netstorage.NewDeadline(timeout, flagHint)
}
func getBool(r *http.Request, argKey string) bool {

View File

@@ -8,7 +8,10 @@ import (
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/histogram"
)
var aggrFuncs = map[string]aggrFunc{
@@ -25,19 +28,28 @@ var aggrFuncs = map[string]aggrFunc{
"topk": newAggrFuncTopK(false),
"quantile": aggrFuncQuantile,
// Extended PromQL funcs
"median": aggrFuncMedian,
"limitk": aggrFuncLimitK,
"distinct": newAggrFunc(aggrFuncDistinct),
"sum2": newAggrFunc(aggrFuncSum2),
"geomean": newAggrFunc(aggrFuncGeomean),
// PromQL extension funcs
"median": aggrFuncMedian,
"limitk": aggrFuncLimitK,
"distinct": newAggrFunc(aggrFuncDistinct),
"sum2": newAggrFunc(aggrFuncSum2),
"geomean": newAggrFunc(aggrFuncGeomean),
"histogram": newAggrFunc(aggrFuncHistogram),
"topk_min": newAggrFuncRangeTopK(minValue, false),
"topk_max": newAggrFuncRangeTopK(maxValue, false),
"topk_avg": newAggrFuncRangeTopK(avgValue, false),
"topk_median": newAggrFuncRangeTopK(medianValue, false),
"bottomk_min": newAggrFuncRangeTopK(minValue, true),
"bottomk_max": newAggrFuncRangeTopK(maxValue, true),
"bottomk_avg": newAggrFuncRangeTopK(avgValue, true),
"bottomk_median": newAggrFuncRangeTopK(medianValue, true),
}
type aggrFunc func(afa *aggrFuncArg) ([]*timeseries, error)
type aggrFuncArg struct {
args [][]*timeseries
ae *aggrFuncExpr
ae *metricsql.AggrFuncExpr
ec *EvalConfig
}
@@ -46,20 +58,6 @@ func getAggrFunc(s string) aggrFunc {
return aggrFuncs[s]
}
func isAggrFunc(s string) bool {
return getAggrFunc(s) != nil
}
func isAggrFuncModifier(s string) bool {
s = strings.ToLower(s)
switch s {
case "by", "without":
return true
default:
return false
}
}
func newAggrFunc(afe func(tss []*timeseries) []*timeseries) aggrFunc {
return func(afa *aggrFuncArg) ([]*timeseries, error) {
args := afa.args
@@ -70,7 +68,7 @@ func newAggrFunc(afe func(tss []*timeseries) []*timeseries) aggrFunc {
}
}
func removeGroupTags(metricName *storage.MetricName, modifier *modifierExpr) {
func removeGroupTags(metricName *storage.MetricName, modifier *metricsql.ModifierExpr) {
groupOp := strings.ToLower(modifier.Op)
switch groupOp {
case "", "by":
@@ -82,7 +80,7 @@ func removeGroupTags(metricName *storage.MetricName, modifier *modifierExpr) {
}
}
func aggrFuncExt(afe func(tss []*timeseries) []*timeseries, argOrig []*timeseries, modifier *modifierExpr, keepOriginal bool) ([]*timeseries, error) {
func aggrFuncExt(afe func(tss []*timeseries) []*timeseries, argOrig []*timeseries, modifier *metricsql.ModifierExpr, keepOriginal bool) ([]*timeseries, error) {
arg := copyTimeseriesMetricNames(argOrig)
// Perform grouping.
@@ -184,6 +182,38 @@ func aggrFuncGeomean(tss []*timeseries) []*timeseries {
return tss[:1]
}
func aggrFuncHistogram(tss []*timeseries) []*timeseries {
var h metrics.Histogram
m := make(map[string]*timeseries)
for i := range tss[0].Values {
h.Reset()
for _, ts := range tss {
v := ts.Values[i]
h.Update(v)
}
h.VisitNonZeroBuckets(func(vmrange string, count uint64) {
ts := m[vmrange]
if ts == nil {
ts = &timeseries{}
ts.CopyFromShallowTimestamps(tss[0])
ts.MetricName.RemoveTag("vmrange")
ts.MetricName.AddTag("vmrange", vmrange)
values := ts.Values
for k := range values {
values[k] = 0
}
m[vmrange] = ts
}
ts.Values[i] = float64(count)
})
}
rvs := make([]*timeseries, 0, len(m))
for _, ts := range m {
rvs = append(rvs, ts)
}
return vmrangeBucketsToLE(rvs)
}
func aggrFuncMin(tss []*timeseries) []*timeseries {
if len(tss) == 1 {
// Fast path - nothing to min.
@@ -425,37 +455,138 @@ func newAggrFuncTopK(isReverse bool) aggrFunc {
return nil, err
}
afe := func(tss []*timeseries) []*timeseries {
rvs := tss
for n := range rvs[0].Values {
sort.Slice(rvs, func(i, j int) bool {
a := rvs[i].Values[n]
b := rvs[j].Values[n]
cmp := lessWithNaNs(a, b)
for n := range tss[0].Values {
sort.Slice(tss, func(i, j int) bool {
a := tss[i].Values[n]
b := tss[j].Values[n]
if isReverse {
cmp = !cmp
a, b = b, a
}
return cmp
return lessWithNaNs(a, b)
})
if math.IsNaN(ks[n]) {
ks[n] = 0
}
k := int(ks[n])
if k < 0 {
k = 0
}
if k > len(rvs) {
k = len(rvs)
}
for _, ts := range rvs[:len(rvs)-k] {
ts.Values[n] = nan
}
fillNaNsAtIdx(n, ks[n], tss)
}
return removeNaNs(rvs)
return removeNaNs(tss)
}
return aggrFuncExt(afe, args[1], &afa.ae.Modifier, true)
}
}
type tsWithValue struct {
ts *timeseries
value float64
}
func newAggrFuncRangeTopK(f func(values []float64) float64, isReverse bool) aggrFunc {
return func(afa *aggrFuncArg) ([]*timeseries, error) {
args := afa.args
if err := expectTransformArgsNum(args, 2); err != nil {
return nil, err
}
ks, err := getScalar(args[0], 0)
if err != nil {
return nil, err
}
afe := func(tss []*timeseries) []*timeseries {
maxs := make([]tsWithValue, len(tss))
for i, ts := range tss {
value := f(ts.Values)
maxs[i] = tsWithValue{
ts: ts,
value: value,
}
}
sort.Slice(maxs, func(i, j int) bool {
a := maxs[i].value
b := maxs[j].value
if isReverse {
a, b = b, a
}
return lessWithNaNs(a, b)
})
for i := range maxs {
tss[i] = maxs[i].ts
}
for i, k := range ks {
fillNaNsAtIdx(i, k, tss)
}
return removeNaNs(tss)
}
return aggrFuncExt(afe, args[1], &afa.ae.Modifier, true)
}
}
func fillNaNsAtIdx(idx int, k float64, tss []*timeseries) {
if math.IsNaN(k) {
k = 0
}
kn := int(k)
if kn < 0 {
kn = 0
}
if kn > len(tss) {
kn = len(tss)
}
for _, ts := range tss[:len(tss)-kn] {
ts.Values[idx] = nan
}
}
func minValue(values []float64) float64 {
if len(values) == 0 {
return nan
}
min := values[0]
for _, v := range values[1:] {
if v < min {
min = v
}
}
return min
}
func maxValue(values []float64) float64 {
if len(values) == 0 {
return nan
}
max := values[0]
for _, v := range values[1:] {
if v > max {
max = v
}
}
return max
}
func avgValue(values []float64) float64 {
sum := float64(0)
count := 0
for _, v := range values {
if math.IsNaN(v) {
continue
}
count++
sum += v
}
if count == 0 {
return nan
}
return sum / float64(count)
}
func medianValue(values []float64) float64 {
h := histogram.GetFast()
for _, v := range values {
if math.IsNaN(v) {
continue
}
h.Update(v)
}
value := h.Quantile(0.5)
histogram.PutFast(h)
return value
}
func aggrFuncLimitK(afa *aggrFuncArg) ([]*timeseries, error) {
args := afa.args
if err := expectTransformArgsNum(args, 2); err != nil {

View File

@@ -4,10 +4,12 @@ import (
"math"
"strings"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
)
// callbacks for optimized incremental calculations for aggregate functions
// over rollups over metricExpr.
// over rollups over metricsql.MetricExpr.
//
// These calculations save RAM for aggregates over big number of time series.
var incrementalAggrFuncCallbacksMap = map[string]*incrementalAggrFuncCallbacks{
@@ -49,7 +51,7 @@ var incrementalAggrFuncCallbacksMap = map[string]*incrementalAggrFuncCallbacks{
}
type incrementalAggrFuncContext struct {
ae *aggrFuncExpr
ae *metricsql.AggrFuncExpr
mLock sync.Mutex
m map[uint]map[string]*incrementalAggrContext
@@ -57,7 +59,7 @@ type incrementalAggrFuncContext struct {
callbacks *incrementalAggrFuncCallbacks
}
func newIncrementalAggrFuncContext(ae *aggrFuncExpr, callbacks *incrementalAggrFuncCallbacks) *incrementalAggrFuncContext {
func newIncrementalAggrFuncContext(ae *metricsql.AggrFuncExpr, callbacks *incrementalAggrFuncCallbacks) *incrementalAggrFuncContext {
return &incrementalAggrFuncContext{
ae: ae,
m: make(map[uint]map[string]*incrementalAggrContext),

View File

@@ -7,6 +7,8 @@ import (
"runtime"
"sync"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
)
func TestIncrementalAggr(t *testing.T) {
@@ -42,7 +44,7 @@ func TestIncrementalAggr(t *testing.T) {
f := func(name string, valuesExpected []float64) {
t.Helper()
callbacks := getIncrementalAggrFuncCallbacks(name)
ae := &aggrFuncExpr{
ae := &metricsql.AggrFuncExpr{
Name: name,
}
tssExpected := []*timeseries{{

View File

@@ -0,0 +1,5 @@
package promql
import "unsafe"
const maxByteSliceLen = 1<<(31+9*(unsafe.Sizeof(int(0))/8)) - 1

View File

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1<<31 - 1

View File

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1 << 40

View File

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1<<31 - 1

View File

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1 << 40

View File

@@ -6,63 +6,36 @@ import (
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql/binaryop"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
var binaryOpFuncs = map[string]binaryOpFunc{
"+": newBinaryOpArithFunc(binaryOpPlus),
"-": newBinaryOpArithFunc(binaryOpMinus),
"*": newBinaryOpArithFunc(binaryOpMul),
"/": newBinaryOpArithFunc(binaryOpDiv),
"%": newBinaryOpArithFunc(binaryOpMod),
"^": newBinaryOpArithFunc(binaryOpPow),
"+": newBinaryOpArithFunc(binaryop.Plus),
"-": newBinaryOpArithFunc(binaryop.Minus),
"*": newBinaryOpArithFunc(binaryop.Mul),
"/": newBinaryOpArithFunc(binaryop.Div),
"%": newBinaryOpArithFunc(binaryop.Mod),
"^": newBinaryOpArithFunc(binaryop.Pow),
// cmp ops
"==": newBinaryOpCmpFunc(binaryOpEq),
"!=": newBinaryOpCmpFunc(binaryOpNeq),
">": newBinaryOpCmpFunc(binaryOpGt),
"<": newBinaryOpCmpFunc(binaryOpLt),
">=": newBinaryOpCmpFunc(binaryOpGte),
"<=": newBinaryOpCmpFunc(binaryOpLte),
"==": newBinaryOpCmpFunc(binaryop.Eq),
"!=": newBinaryOpCmpFunc(binaryop.Neq),
">": newBinaryOpCmpFunc(binaryop.Gt),
"<": newBinaryOpCmpFunc(binaryop.Lt),
">=": newBinaryOpCmpFunc(binaryop.Gte),
"<=": newBinaryOpCmpFunc(binaryop.Lte),
// logical set ops
"and": binaryOpAnd,
"or": binaryOpOr,
"unless": binaryOpUnless,
// New op
"if": newBinaryOpArithFunc(binaryOpIf),
"ifnot": newBinaryOpArithFunc(binaryOpIfnot),
"default": newBinaryOpArithFunc(binaryOpDefault),
}
var binaryOpPriorities = map[string]int{
"default": -1,
"if": 0,
"ifnot": 0,
// See https://prometheus.io/docs/prometheus/latest/querying/operators/#binary-operator-precedence
"or": 1,
"and": 2,
"unless": 2,
"==": 3,
"!=": 3,
"<": 3,
">": 3,
"<=": 3,
">=": 3,
"+": 4,
"-": 4,
"*": 5,
"/": 5,
"%": 5,
"^": 6,
// New ops
"if": newBinaryOpArithFunc(binaryop.If),
"ifnot": newBinaryOpArithFunc(binaryop.Ifnot),
"default": newBinaryOpArithFunc(binaryop.Default),
}
func getBinaryOpFunc(op string) binaryOpFunc {
@@ -70,144 +43,8 @@ func getBinaryOpFunc(op string) binaryOpFunc {
return binaryOpFuncs[op]
}
func isBinaryOp(op string) bool {
return getBinaryOpFunc(op) != nil
}
func binaryOpPriority(op string) int {
op = strings.ToLower(op)
return binaryOpPriorities[op]
}
func scanBinaryOpPrefix(s string) int {
n := 0
for op := range binaryOpFuncs {
if len(s) < len(op) {
continue
}
ss := strings.ToLower(s[:len(op)])
if ss == op && len(op) > n {
n = len(op)
}
}
return n
}
func isRightAssociativeBinaryOp(op string) bool {
// See https://prometheus.io/docs/prometheus/latest/querying/operators/#binary-operator-precedence
return op == "^"
}
func isBinaryOpGroupModifier(s string) bool {
s = strings.ToLower(s)
switch s {
// See https://prometheus.io/docs/prometheus/latest/querying/operators/#vector-matching
case "on", "ignoring":
return true
default:
return false
}
}
func isBinaryOpJoinModifier(s string) bool {
s = strings.ToLower(s)
switch s {
case "group_left", "group_right":
return true
default:
return false
}
}
func isBinaryOpBoolModifier(s string) bool {
s = strings.ToLower(s)
return s == "bool"
}
func isBinaryOpCmp(op string) bool {
switch op {
case "==", "!=", ">", "<", ">=", "<=":
return true
default:
return false
}
}
func isBinaryOpLogicalSet(op string) bool {
op = strings.ToLower(op)
switch op {
case "and", "or", "unless":
return true
default:
return false
}
}
func binaryOpConstants(op string, left, right float64, isBool bool) float64 {
if isBinaryOpCmp(op) {
evalCmp := func(cf func(left, right float64) bool) float64 {
if isBool {
if cf(left, right) {
return 1
}
return 0
}
if cf(left, right) {
return left
}
return nan
}
switch op {
case "==":
left = evalCmp(binaryOpEq)
case "!=":
left = evalCmp(binaryOpNeq)
case ">":
left = evalCmp(binaryOpGt)
case "<":
left = evalCmp(binaryOpLt)
case ">=":
left = evalCmp(binaryOpGte)
case "<=":
left = evalCmp(binaryOpLte)
default:
logger.Panicf("BUG: unexpected comparison binaryOp: %q", op)
}
} else {
switch op {
case "+":
left = binaryOpPlus(left, right)
case "-":
left = binaryOpMinus(left, right)
case "*":
left = binaryOpMul(left, right)
case "/":
left = binaryOpDiv(left, right)
case "%":
left = binaryOpMod(left, right)
case "^":
left = binaryOpPow(left, right)
case "and":
// Nothing to do
case "or":
// Nothing to do
case "unless":
left = nan
case "default":
left = binaryOpDefault(left, right)
case "if":
left = binaryOpIf(left, right)
case "ifnot":
left = binaryOpIfnot(left, right)
default:
logger.Panicf("BUG: unexpected non-comparison binaryOp: %q", op)
}
}
return left
}
type binaryOpFuncArg struct {
be *binaryOpExpr
be *metricsql.BinaryOpExpr
left []*timeseries
right []*timeseries
}
@@ -267,7 +104,7 @@ func newBinaryOpFunc(bf func(left, right float64, isBool bool) float64) binaryOp
}
}
func adjustBinaryOpTags(be *binaryOpExpr, left, right []*timeseries) ([]*timeseries, []*timeseries, []*timeseries, error) {
func adjustBinaryOpTags(be *metricsql.BinaryOpExpr, left, right []*timeseries) ([]*timeseries, []*timeseries, []*timeseries, error) {
if len(be.GroupModifier.Op) == 0 && len(be.JoinModifier.Op) == 0 {
if isScalar(left) {
// Fast path: `scalar op vector`
@@ -348,7 +185,7 @@ func adjustBinaryOpTags(be *binaryOpExpr, left, right []*timeseries) ([]*timeser
return rvsLeft, rvsRight, dst, nil
}
func ensureSingleTimeseries(side string, be *binaryOpExpr, tss []*timeseries) error {
func ensureSingleTimeseries(side string, be *metricsql.BinaryOpExpr, tss []*timeseries) error {
if len(tss) == 0 {
logger.Panicf("BUG: tss must contain at least one value")
}
@@ -362,7 +199,7 @@ func ensureSingleTimeseries(side string, be *binaryOpExpr, tss []*timeseries) er
return nil
}
func groupJoin(singleTimeseriesSide string, be *binaryOpExpr, rvsLeft, rvsRight, tssLeft, tssRight []*timeseries) ([]*timeseries, []*timeseries, error) {
func groupJoin(singleTimeseriesSide string, be *metricsql.BinaryOpExpr, rvsLeft, rvsRight, tssLeft, tssRight []*timeseries) ([]*timeseries, []*timeseries, error) {
joinTags := be.JoinModifier.Args
var m map[string]*timeseries
for _, tsLeft := range tssLeft {
@@ -432,8 +269,8 @@ func mergeNonOverlappingTimeseries(dst, src *timeseries) bool {
return true
}
func resetMetricGroupIfRequired(be *binaryOpExpr, ts *timeseries) {
if isBinaryOpCmp(be.Op) && !be.Bool {
func resetMetricGroupIfRequired(be *metricsql.BinaryOpExpr, ts *timeseries) {
if metricsql.IsBinaryOpCmp(be.Op) && !be.Bool {
// Do not reset MetricGroup for non-boolean `compare` binary ops like Prometheus does.
return
}
@@ -445,97 +282,24 @@ func resetMetricGroupIfRequired(be *binaryOpExpr, ts *timeseries) {
ts.MetricName.ResetMetricGroup()
}
func binaryOpPlus(left, right float64) float64 {
return left + right
}
func binaryOpMinus(left, right float64) float64 {
return left - right
}
func binaryOpMul(left, right float64) float64 {
return left * right
}
func binaryOpDiv(left, right float64) float64 {
return left / right
}
func binaryOpMod(left, right float64) float64 {
return math.Mod(left, right)
}
func binaryOpPow(left, right float64) float64 {
return math.Pow(left, right)
}
func binaryOpDefault(left, right float64) float64 {
if math.IsNaN(left) {
return right
}
return left
}
func binaryOpIf(left, right float64) float64 {
if math.IsNaN(right) {
return nan
}
return left
}
func binaryOpIfnot(left, right float64) float64 {
if math.IsNaN(right) {
return left
}
return nan
}
func binaryOpEq(left, right float64) bool {
// Special handling for nan == nan.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150 .
if math.IsNaN(left) {
return math.IsNaN(right)
}
return left == right
}
func binaryOpNeq(left, right float64) bool {
// Special handling for comparison with nan.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150 .
if math.IsNaN(left) {
return !math.IsNaN(right)
}
if math.IsNaN(right) {
return true
}
return left != right
}
func binaryOpGt(left, right float64) bool {
return left > right
}
func binaryOpLt(left, right float64) bool {
return left < right
}
func binaryOpGte(left, right float64) bool {
return left >= right
}
func binaryOpLte(left, right float64) bool {
return left <= right
}
func binaryOpAnd(bfa *binaryOpFuncArg) ([]*timeseries, error) {
mLeft, mRight := createTimeseriesMapByTagSet(bfa.be, bfa.left, bfa.right)
var rvs []*timeseries
for k := range mRight {
if tss := mLeft[k]; tss != nil {
rvs = append(rvs, tss...)
for k, tssRight := range mRight {
tssLeft := mLeft[k]
if tssLeft == nil {
continue
}
for i := range tssLeft[0].Values {
if !isAllNaNs(tssRight, i) {
continue
}
for _, tsLeft := range tssLeft {
tsLeft.Values[i] = nan
}
}
tssLeft = removeNaNs(tssLeft)
rvs = append(rvs, tssLeft...)
}
return rvs, nil
}
@@ -557,15 +321,36 @@ func binaryOpOr(bfa *binaryOpFuncArg) ([]*timeseries, error) {
func binaryOpUnless(bfa *binaryOpFuncArg) ([]*timeseries, error) {
mLeft, mRight := createTimeseriesMapByTagSet(bfa.be, bfa.left, bfa.right)
var rvs []*timeseries
for k, tss := range mLeft {
if mRight[k] == nil {
rvs = append(rvs, tss...)
for k, tssLeft := range mLeft {
tssRight := mRight[k]
if tssRight == nil {
rvs = append(rvs, tssLeft...)
continue
}
for i := range tssLeft[0].Values {
if isAllNaNs(tssRight, i) {
continue
}
for _, tsLeft := range tssLeft {
tsLeft.Values[i] = nan
}
}
tssLeft = removeNaNs(tssLeft)
rvs = append(rvs, tssLeft...)
}
return rvs, nil
}
func createTimeseriesMapByTagSet(be *binaryOpExpr, left, right []*timeseries) (map[string][]*timeseries, map[string][]*timeseries) {
func isAllNaNs(tss []*timeseries, idx int) bool {
for _, ts := range tss {
if !math.IsNaN(ts.Values[idx]) {
return false
}
}
return true
}
func createTimeseriesMapByTagSet(be *metricsql.BinaryOpExpr, left, right []*timeseries) (map[string][]*timeseries, map[string][]*timeseries) {
groupTags := be.GroupModifier.Args
groupOp := strings.ToLower(be.GroupModifier.Op)
if len(groupOp) == 0 {

View File

@@ -11,6 +11,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
)
@@ -57,6 +58,14 @@ func AdjustStartEnd(start, end, step int64) (int64, int64) {
if adjust > 0 {
end += step - adjust
}
// Make sure that the new number of points is the same as the initial number of points.
newPoints := (end-start)/step + 1
for newPoints > points {
end -= step
newPoints--
}
return start, end
}
@@ -144,25 +153,25 @@ func getTimestamps(start, end, step int64) []int64 {
return timestamps
}
func evalExpr(ec *EvalConfig, e expr) ([]*timeseries, error) {
if me, ok := e.(*metricExpr); ok {
re := &rollupExpr{
func evalExpr(ec *EvalConfig, e metricsql.Expr) ([]*timeseries, error) {
if me, ok := e.(*metricsql.MetricExpr); ok {
re := &metricsql.RollupExpr{
Expr: me,
}
rv, err := evalRollupFunc(ec, "default_rollup", rollupDefault, re, nil)
rv, err := evalRollupFunc(ec, "default_rollup", rollupDefault, e, re, nil)
if err != nil {
return nil, fmt.Errorf(`cannot evaluate %q: %s`, me.AppendString(nil), err)
}
return rv, nil
}
if re, ok := e.(*rollupExpr); ok {
rv, err := evalRollupFunc(ec, "default_rollup", rollupDefault, re, nil)
if re, ok := e.(*metricsql.RollupExpr); ok {
rv, err := evalRollupFunc(ec, "default_rollup", rollupDefault, e, re, nil)
if err != nil {
return nil, fmt.Errorf(`cannot evaluate %q: %s`, re.AppendString(nil), err)
}
return rv, nil
}
if fe, ok := e.(*funcExpr); ok {
if fe, ok := e.(*metricsql.FuncExpr); ok {
nrf := getRollupFunc(fe.Name)
if nrf == nil {
args, err := evalExprs(ec, fe.Args)
@@ -192,17 +201,17 @@ func evalExpr(ec *EvalConfig, e expr) ([]*timeseries, error) {
if err != nil {
return nil, err
}
rv, err := evalRollupFunc(ec, fe.Name, rf, re, nil)
rv, err := evalRollupFunc(ec, fe.Name, rf, e, re, nil)
if err != nil {
return nil, fmt.Errorf(`cannot evaluate %q: %s`, fe.AppendString(nil), err)
}
return rv, nil
}
if ae, ok := e.(*aggrFuncExpr); ok {
if ae, ok := e.(*metricsql.AggrFuncExpr); ok {
if callbacks := getIncrementalAggrFuncCallbacks(ae.Name); callbacks != nil {
fe, nrf := tryGetArgRollupFuncWithMetricExpr(ae)
if fe != nil {
// There is an optimized path for calculating aggrFuncExpr over rollupFunc over metricExpr.
// There is an optimized path for calculating metricsql.AggrFuncExpr over rollupFunc over metricsql.MetricExpr.
// The optimized path saves RAM for aggregates over big number of time series.
args, re, err := evalRollupFuncArgs(ec, fe)
if err != nil {
@@ -213,7 +222,7 @@ func evalExpr(ec *EvalConfig, e expr) ([]*timeseries, error) {
return nil, err
}
iafc := newIncrementalAggrFuncContext(ae, callbacks)
return evalRollupFunc(ec, fe.Name, rf, re, iafc)
return evalRollupFunc(ec, fe.Name, rf, e, re, iafc)
}
}
args, err := evalExprs(ec, ae.Args)
@@ -235,7 +244,7 @@ func evalExpr(ec *EvalConfig, e expr) ([]*timeseries, error) {
}
return rv, nil
}
if be, ok := e.(*binaryOpExpr); ok {
if be, ok := e.(*metricsql.BinaryOpExpr); ok {
left, err := evalExpr(ec, be.Left)
if err != nil {
return nil, err
@@ -259,18 +268,18 @@ func evalExpr(ec *EvalConfig, e expr) ([]*timeseries, error) {
}
return rv, nil
}
if ne, ok := e.(*numberExpr); ok {
if ne, ok := e.(*metricsql.NumberExpr); ok {
rv := evalNumber(ec, ne.N)
return rv, nil
}
if se, ok := e.(*stringExpr); ok {
if se, ok := e.(*metricsql.StringExpr); ok {
rv := evalString(ec, se.S)
return rv, nil
}
return nil, fmt.Errorf("unexpected expression %q", e.AppendString(nil))
}
func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFunc) {
func tryGetArgRollupFuncWithMetricExpr(ae *metricsql.AggrFuncExpr) (*metricsql.FuncExpr, newRollupFunc) {
if len(ae.Args) != 1 {
return nil, nil
}
@@ -281,31 +290,31 @@ func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFu
// - rollupFunc(metricExpr)
// - rollupFunc(metricExpr[d])
if me, ok := e.(*metricExpr); ok {
if me, ok := e.(*metricsql.MetricExpr); ok {
// e = metricExpr
if me.IsEmpty() {
return nil, nil
}
fe := &funcExpr{
fe := &metricsql.FuncExpr{
Name: "default_rollup",
Args: []expr{me},
Args: []metricsql.Expr{me},
}
nrf := getRollupFunc(fe.Name)
return fe, nrf
}
if re, ok := e.(*rollupExpr); ok {
if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
if re, ok := e.(*metricsql.RollupExpr); ok {
if me, ok := re.Expr.(*metricsql.MetricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
return nil, nil
}
// e = metricExpr[d]
fe := &funcExpr{
fe := &metricsql.FuncExpr{
Name: "default_rollup",
Args: []expr{re},
Args: []metricsql.Expr{re},
}
nrf := getRollupFunc(fe.Name)
return fe, nrf
}
fe, ok := e.(*funcExpr)
fe, ok := e.(*metricsql.FuncExpr)
if !ok {
return nil, nil
}
@@ -314,19 +323,23 @@ func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFu
return nil, nil
}
rollupArgIdx := getRollupArgIdx(fe.Name)
if rollupArgIdx >= len(fe.Args) {
// Incorrect number of args for rollup func.
return nil, nil
}
arg := fe.Args[rollupArgIdx]
if me, ok := arg.(*metricExpr); ok {
if me, ok := arg.(*metricsql.MetricExpr); ok {
if me.IsEmpty() {
return nil, nil
}
// e = rollupFunc(metricExpr)
return &funcExpr{
return &metricsql.FuncExpr{
Name: fe.Name,
Args: []expr{me},
Args: []metricsql.Expr{me},
}, nrf
}
if re, ok := arg.(*rollupExpr); ok {
if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
if re, ok := arg.(*metricsql.RollupExpr); ok {
if me, ok := re.Expr.(*metricsql.MetricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
return nil, nil
}
// e = rollupFunc(metricExpr[d])
@@ -335,7 +348,7 @@ func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFu
return nil, nil
}
func evalExprs(ec *EvalConfig, es []expr) ([][]*timeseries, error) {
func evalExprs(ec *EvalConfig, es []metricsql.Expr) ([][]*timeseries, error) {
var rvs [][]*timeseries
for _, e := range es {
rv, err := evalExpr(ec, e)
@@ -347,9 +360,12 @@ func evalExprs(ec *EvalConfig, es []expr) ([][]*timeseries, error) {
return rvs, nil
}
func evalRollupFuncArgs(ec *EvalConfig, fe *funcExpr) ([]interface{}, *rollupExpr, error) {
var re *rollupExpr
func evalRollupFuncArgs(ec *EvalConfig, fe *metricsql.FuncExpr) ([]interface{}, *metricsql.RollupExpr, error) {
var re *metricsql.RollupExpr
rollupArgIdx := getRollupArgIdx(fe.Name)
if len(fe.Args) <= rollupArgIdx {
return nil, nil, fmt.Errorf("expecting at least %d args to %q; got %d args; expr: %q", rollupArgIdx+1, fe.Name, len(fe.Args), fe.AppendString(nil))
}
args := make([]interface{}, len(fe.Args))
for i, arg := range fe.Args {
if i == rollupArgIdx {
@@ -366,11 +382,11 @@ func evalRollupFuncArgs(ec *EvalConfig, fe *funcExpr) ([]interface{}, *rollupExp
return args, re, nil
}
func getRollupExprArg(arg expr) *rollupExpr {
re, ok := arg.(*rollupExpr)
func getRollupExprArg(arg metricsql.Expr) *metricsql.RollupExpr {
re, ok := arg.(*metricsql.RollupExpr)
if !ok {
// Wrap non-rollup arg into rollupExpr.
return &rollupExpr{
// Wrap non-rollup arg into metricsql.RollupExpr.
return &metricsql.RollupExpr{
Expr: arg,
}
}
@@ -378,45 +394,60 @@ func getRollupExprArg(arg expr) *rollupExpr {
// Return standard rollup if it doesn't contain subquery.
return re
}
me, ok := re.Expr.(*metricExpr)
me, ok := re.Expr.(*metricsql.MetricExpr)
if !ok {
// arg contains subquery.
return re
}
// Convert me[w:step] -> default_rollup(me)[w:step]
reNew := *re
reNew.Expr = &funcExpr{
reNew.Expr = &metricsql.FuncExpr{
Name: "default_rollup",
Args: []expr{
&rollupExpr{Expr: me},
Args: []metricsql.Expr{
&metricsql.RollupExpr{Expr: me},
},
}
return &reNew
}
func evalRollupFunc(ec *EvalConfig, name string, rf rollupFunc, re *rollupExpr, iafc *incrementalAggrFuncContext) ([]*timeseries, error) {
func evalRollupFunc(ec *EvalConfig, name string, rf rollupFunc, expr metricsql.Expr, re *metricsql.RollupExpr, iafc *incrementalAggrFuncContext) ([]*timeseries, error) {
ecNew := ec
var offset int64
if len(re.Offset) > 0 {
var err error
offset, err = DurationValue(re.Offset, ec.Step)
offset, err = metricsql.DurationValue(re.Offset, ec.Step)
if err != nil {
return nil, err
}
ecNew = newEvalConfig(ec)
ecNew = newEvalConfig(ecNew)
ecNew.Start -= offset
ecNew.End -= offset
ecNew.Start, ecNew.End = AdjustStartEnd(ecNew.Start, ecNew.End, ecNew.Step)
if ecNew.MayCache {
start, end := AdjustStartEnd(ecNew.Start, ecNew.End, ecNew.Step)
offset += ecNew.Start - start
ecNew.Start = start
ecNew.End = end
}
}
if name == "rollup_candlestick" {
// Automatically apply `offset -step` to `rollup_candlestick` function
// in order to obtain expected OHLC results.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309#issuecomment-582113462
step := ecNew.Step
ecNew = newEvalConfig(ecNew)
ecNew.Start += step
ecNew.End += step
offset -= step
}
var rvs []*timeseries
var err error
if me, ok := re.Expr.(*metricExpr); ok {
rvs, err = evalRollupFuncWithMetricExpr(ecNew, name, rf, me, iafc, re.Window)
if me, ok := re.Expr.(*metricsql.MetricExpr); ok {
rvs, err = evalRollupFuncWithMetricExpr(ecNew, name, rf, expr, me, iafc, re.Window)
} else {
if iafc != nil {
logger.Panicf("BUG: iafc must be nil for rollup %q over subquery %q", name, re.AppendString(nil))
}
rvs, err = evalRollupFuncWithSubquery(ecNew, name, rf, re)
rvs, err = evalRollupFuncWithSubquery(ecNew, name, rf, expr, re)
}
if err != nil {
return nil, err
@@ -435,12 +466,12 @@ func evalRollupFunc(ec *EvalConfig, name string, rf rollupFunc, re *rollupExpr,
return rvs, nil
}
func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *rollupExpr) ([]*timeseries, error) {
// Do not use rollupResultCacheV here, since it works only with metricExpr.
func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, expr metricsql.Expr, re *metricsql.RollupExpr) ([]*timeseries, error) {
// TODO: determine whether to use rollupResultCacheV here.
var step int64
if len(re.Step) > 0 {
var err error
step, err = DurationValue(re.Step, ec.Step)
step, err = metricsql.PositiveDurationValue(re.Step, ec.Step)
if err != nil {
return nil, err
}
@@ -450,7 +481,7 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *
var window int64
if len(re.Window) > 0 {
var err error
window, err = DurationValue(re.Window, ec.Step)
window, err = metricsql.PositiveDurationValue(re.Window, ec.Step)
if err != nil {
return nil, err
}
@@ -467,9 +498,19 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *
if err != nil {
return nil, err
}
if len(tssSQ) == 0 {
if name == "absent_over_time" {
tss := evalNumber(ec, 1)
return tss, nil
}
return nil, nil
}
sharedTimestamps := getTimestamps(ec.Start, ec.End, ec.Step)
preFunc, rcs := getRollupConfigs(name, rf, ec.Start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
preFunc, rcs, err := getRollupConfigs(name, rf, expr, ec.Start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
if err != nil {
return nil, err
}
tss := make([]*timeseries, 0, len(tssSQ)*len(rcs))
var tssLock sync.Mutex
removeMetricGroup := !rollupFuncsKeepMetricGroup[name]
@@ -477,6 +518,13 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *
values, timestamps = removeNanValues(values[:0], timestamps[:0], tsSQ.Values, tsSQ.Timestamps)
preFunc(values, timestamps)
for _, rc := range rcs {
if tsm := newTimeseriesMap(name, sharedTimestamps, &tsSQ.MetricName); tsm != nil {
rc.DoTimeseriesMap(tsm, values, timestamps)
tssLock.Lock()
tss = tsm.AppendTimeseriesTo(tss)
tssLock.Unlock()
continue
}
var ts timeseries
doRollupForTimeseries(rc, &ts, &tsSQ.MetricName, values, timestamps, sharedTimestamps, removeMetricGroup)
tssLock.Lock()
@@ -544,21 +592,22 @@ var (
rollupResultCacheMiss = metrics.NewCounter(`vm_rollup_result_cache_miss_total`)
)
func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me *metricExpr, iafc *incrementalAggrFuncContext, windowStr string) ([]*timeseries, error) {
func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc,
expr metricsql.Expr, me *metricsql.MetricExpr, iafc *incrementalAggrFuncContext, windowStr string) ([]*timeseries, error) {
if me.IsEmpty() {
return evalNumber(ec, nan), nil
}
var window int64
if len(windowStr) > 0 {
var err error
window, err = DurationValue(windowStr, ec.Step)
window, err = metricsql.PositiveDurationValue(windowStr, ec.Step)
if err != nil {
return nil, err
}
}
// Search for partial results in cache.
tssCached, start := rollupResultCacheV.Get(name, ec, me, iafc, window)
tssCached, start := rollupResultCacheV.Get(ec, expr, window)
if start > ec.End {
// The result is fully cached.
rollupResultCacheFullHits.Inc()
@@ -570,11 +619,26 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
rollupResultCacheMiss.Inc()
}
// Obtain rollup configs before fetching data from db,
// so type errors can be caught earlier.
sharedTimestamps := getTimestamps(start, ec.End, ec.Step)
preFunc, rcs, err := getRollupConfigs(name, rf, expr, start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
if err != nil {
return nil, err
}
// Fetch the remaining part of the result.
tfs := toTagFilters(me.LabelFilters)
minTimestamp := start - maxSilenceInterval
if window > ec.Step {
minTimestamp -= window
} else {
minTimestamp -= ec.Step
}
sq := &storage.SearchQuery{
MinTimestamp: start - window - maxSilenceInterval,
MaxTimestamp: ec.End + ec.Step,
TagFilterss: [][]storage.TagFilter{me.TagFilters},
MinTimestamp: minTimestamp,
MaxTimestamp: ec.End,
TagFilterss: [][]storage.TagFilter{tfs},
}
rss, err := netstorage.ProcessSearchQuery(sq, true, ec.Deadline)
if err != nil {
@@ -583,19 +647,32 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
rssLen := rss.Len()
if rssLen == 0 {
rss.Cancel()
var tss []*timeseries
if name == "absent_over_time" {
tss = getAbsentTimeseries(ec, me)
}
// Add missing points until ec.End.
// Do not cache the result, since missing points
// may be backfilled in the future.
tss := mergeTimeseries(tssCached, nil, start, ec)
tss = mergeTimeseries(tssCached, tss, start, ec)
return tss, nil
}
sharedTimestamps := getTimestamps(start, ec.End, ec.Step)
preFunc, rcs := getRollupConfigs(name, rf, start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
// Verify timeseries fit available memory after the rollup.
// Take into account points from tssCached.
pointsPerTimeseries := 1 + (ec.End-ec.Start)/ec.Step
rollupPoints := mulNoOverflow(pointsPerTimeseries, int64(rssLen*len(rcs)))
timeseriesLen := rssLen
if iafc != nil {
// Incremental aggregates require hold only GOMAXPROCS timeseries in memory.
timeseriesLen = runtime.GOMAXPROCS(-1)
if iafc.ae.Modifier.Op != "" {
// Increase the number of timeseries for non-empty group list: `aggr() by (something)`,
// since each group can have own set of time series in memory.
// Estimate the number of such groups is lower than 1000 :)
timeseriesLen *= 1000
}
}
rollupPoints := mulNoOverflow(pointsPerTimeseries, int64(timeseriesLen*len(rcs)))
rollupMemorySize := mulNoOverflow(rollupPoints, 16)
rml := getRollupMemoryLimiter()
if !rml.Get(uint64(rollupMemorySize)) {
@@ -611,16 +688,15 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
removeMetricGroup := !rollupFuncsKeepMetricGroup[name]
var tss []*timeseries
if iafc != nil {
tss, err = evalRollupWithIncrementalAggregate(iafc, rss, rcs, preFunc, sharedTimestamps, removeMetricGroup)
tss, err = evalRollupWithIncrementalAggregate(name, iafc, rss, rcs, preFunc, sharedTimestamps, removeMetricGroup)
} else {
tss, err = evalRollupNoIncrementalAggregate(rss, rcs, preFunc, sharedTimestamps, removeMetricGroup)
tss, err = evalRollupNoIncrementalAggregate(name, rss, rcs, preFunc, sharedTimestamps, removeMetricGroup)
}
if err != nil {
return nil, err
}
tss = mergeTimeseries(tssCached, tss, start, ec)
rollupResultCacheV.Put(name, ec, me, iafc, window, tss)
rollupResultCacheV.Put(ec, expr, window, tss)
return tss, nil
}
@@ -636,13 +712,20 @@ func getRollupMemoryLimiter() *memoryLimiter {
return &rollupMemoryLimiter
}
func evalRollupWithIncrementalAggregate(iafc *incrementalAggrFuncContext, rss *netstorage.Results, rcs []*rollupConfig,
func evalRollupWithIncrementalAggregate(name string, iafc *incrementalAggrFuncContext, rss *netstorage.Results, rcs []*rollupConfig,
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64, removeMetricGroup bool) ([]*timeseries, error) {
err := rss.RunParallel(func(rs *netstorage.Result, workerID uint) {
preFunc(rs.Values, rs.Timestamps)
ts := getTimeseries()
defer putTimeseries(ts)
for _, rc := range rcs {
if tsm := newTimeseriesMap(name, sharedTimestamps, &rs.MetricName); tsm != nil {
rc.DoTimeseriesMap(tsm, rs.Values, rs.Timestamps)
for _, ts := range tsm.m {
iafc.updateTimeseries(ts, workerID)
}
continue
}
ts.Reset()
doRollupForTimeseries(rc, ts, &rs.MetricName, rs.Values, rs.Timestamps, sharedTimestamps, removeMetricGroup)
iafc.updateTimeseries(ts, workerID)
@@ -659,13 +742,20 @@ func evalRollupWithIncrementalAggregate(iafc *incrementalAggrFuncContext, rss *n
return tss, nil
}
func evalRollupNoIncrementalAggregate(rss *netstorage.Results, rcs []*rollupConfig,
func evalRollupNoIncrementalAggregate(name string, rss *netstorage.Results, rcs []*rollupConfig,
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64, removeMetricGroup bool) ([]*timeseries, error) {
tss := make([]*timeseries, 0, rss.Len()*len(rcs))
var tssLock sync.Mutex
err := rss.RunParallel(func(rs *netstorage.Result, workerID uint) {
preFunc(rs.Values, rs.Timestamps)
for _, rc := range rcs {
if tsm := newTimeseriesMap(name, sharedTimestamps, &rs.MetricName); tsm != nil {
rc.DoTimeseriesMap(tsm, rs.Values, rs.Timestamps)
tssLock.Lock()
tss = tsm.AppendTimeseriesTo(tss)
tssLock.Unlock()
continue
}
var ts timeseries
doRollupForTimeseries(rc, &ts, &rs.MetricName, rs.Values, rs.Timestamps, sharedTimestamps, removeMetricGroup)
tssLock.Lock()
@@ -693,62 +783,6 @@ func doRollupForTimeseries(rc *rollupConfig, tsDst *timeseries, mnSrc *storage.M
tsDst.denyReuse = true
}
func getRollupConfigs(name string, rf rollupFunc, start, end, step, window int64, lookbackDelta int64, sharedTimestamps []int64) (
func(values []float64, timestamps []int64), []*rollupConfig) {
preFunc := func(values []float64, timestamps []int64) {}
if rollupFuncsRemoveCounterResets[name] {
preFunc = func(values []float64, timestamps []int64) {
removeCounterResets(values)
}
}
newRollupConfig := func(rf rollupFunc, tagValue string) *rollupConfig {
return &rollupConfig{
TagValue: tagValue,
Func: rf,
Start: start,
End: end,
Step: step,
Window: window,
MayAdjustWindow: rollupFuncsMayAdjustWindow[name],
LookbackDelta: lookbackDelta,
Timestamps: sharedTimestamps,
}
}
appendRollupConfigs := func(dst []*rollupConfig) []*rollupConfig {
dst = append(dst, newRollupConfig(rollupMin, "min"))
dst = append(dst, newRollupConfig(rollupMax, "max"))
dst = append(dst, newRollupConfig(rollupAvg, "avg"))
return dst
}
var rcs []*rollupConfig
switch name {
case "rollup":
rcs = appendRollupConfigs(rcs)
case "rollup_rate", "rollup_deriv":
preFuncPrev := preFunc
preFunc = func(values []float64, timestamps []int64) {
preFuncPrev(values, timestamps)
derivValues(values, timestamps)
}
rcs = appendRollupConfigs(rcs)
case "rollup_increase", "rollup_delta":
preFuncPrev := preFunc
preFunc = func(values []float64, timestamps []int64) {
preFuncPrev(values, timestamps)
deltaValues(values)
}
rcs = appendRollupConfigs(rcs)
case "rollup_candlestick":
rcs = append(rcs, newRollupConfig(rollupFirst, "open"))
rcs = append(rcs, newRollupConfig(rollupLast, "close"))
rcs = append(rcs, newRollupConfig(rollupMin, "low"))
rcs = append(rcs, newRollupConfig(rollupMax, "high"))
default:
rcs = append(rcs, newRollupConfig(rf, ""))
}
return preFunc, rcs
}
var bbPool bytesutil.ByteBufferPool
func evalNumber(ec *EvalConfig, n float64) []*timeseries {
@@ -787,3 +821,23 @@ func mulNoOverflow(a, b int64) int64 {
}
return a * b
}
func toTagFilters(lfs []metricsql.LabelFilter) []storage.TagFilter {
tfs := make([]storage.TagFilter, len(lfs))
for i := range lfs {
toTagFilter(&tfs[i], &lfs[i])
}
return tfs
}
func toTagFilter(dst *storage.TagFilter, src *metricsql.LabelFilter) {
if src.Label != "__name__" {
dst.Key = []byte(src.Label)
} else {
// This is required for storage.Search.
dst.Key = nil
}
dst.Value = []byte(src.Value)
dst.IsRegexp = src.IsRegexp
dst.IsNegative = src.IsNegative
}

View File

@@ -11,6 +11,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/metrics"
)
@@ -18,17 +19,6 @@ var logSlowQueryDuration = flag.Duration("search.logSlowQueryDuration", 5*time.S
var slowQueries = metrics.NewCounter(`vm_slow_queries_total`)
// ExpandWithExprs expands WITH expressions inside q and returns the resulting
// PromQL without WITH expressions.
func ExpandWithExprs(q string) (string, error) {
e, err := parsePromQLWithCache(q)
if err != nil {
return "", err
}
buf := e.AppendString(nil)
return string(buf), nil
}
// Exec executes q for the given ec.
func Exec(ec *EvalConfig, q string, isFirstPointOnly bool) ([]netstorage.Result, error) {
if *logSlowQueryDuration > 0 {
@@ -36,8 +26,8 @@ func Exec(ec *EvalConfig, q string, isFirstPointOnly bool) ([]netstorage.Result,
defer func() {
d := time.Since(startTime)
if d >= *logSlowQueryDuration {
logger.Infof("slow query according to -search.logSlowQueryDuration=%s: duration=%s, start=%d, end=%d, step=%d, query=%q",
*logSlowQueryDuration, d, ec.Start/1000, ec.End/1000, ec.Step/1000, q)
logger.Infof("slow query according to -search.logSlowQueryDuration=%s: duration=%.3f seconds, start=%d, end=%d, step=%d, query=%q",
*logSlowQueryDuration, d.Seconds(), ec.Start/1000, ec.End/1000, ec.Step/1000, q)
slowQueries.Inc()
}
}()
@@ -50,25 +40,11 @@ func Exec(ec *EvalConfig, q string, isFirstPointOnly bool) ([]netstorage.Result,
return nil, err
}
// Add an additional point to the end. This point is used
// in calculating the last value for rate, deriv, increase
// and delta funcs.
ec.End += ec.Step
rv, err := evalExpr(ec, e)
if err != nil {
return nil, err
}
// Remove the additional point at the end.
for _, ts := range rv {
ts.Values = ts.Values[:len(ts.Values)-1]
// ts.Timestamps may be shared between timeseries, so truncate it with len(ts.Values) instead of len(ts.Timestamps)-1
ts.Timestamps = ts.Timestamps[:len(ts.Values)]
}
ec.End -= ec.Step
if isFirstPointOnly {
// Remove all the points except the first one from every time series.
for _, ts := range rv {
@@ -85,17 +61,18 @@ func Exec(ec *EvalConfig, q string, isFirstPointOnly bool) ([]netstorage.Result,
return result, err
}
func maySortResults(e expr, tss []*timeseries) bool {
func maySortResults(e metricsql.Expr, tss []*timeseries) bool {
if len(tss) > 100 {
// There is no sense in sorting a lot of results
return false
}
fe, ok := e.(*funcExpr)
fe, ok := e.(*metricsql.FuncExpr)
if !ok {
return true
}
switch fe.Name {
case "sort", "sort_desc":
case "sort", "sort_desc",
"sort_by_label", "sort_by_label_desc":
return false
default:
return true
@@ -110,7 +87,7 @@ func timeseriesToResult(tss []*timeseries, maySort bool) ([]netstorage.Result, e
for i, ts := range tss {
bb.B = marshalMetricNameSorted(bb.B[:0], &ts.MetricName)
if _, ok := m[string(bb.B)]; ok {
return nil, fmt.Errorf(`duplicate output timeseries: %s%s`, ts.MetricName.MetricGroup, stringMetricName(&ts.MetricName))
return nil, fmt.Errorf(`duplicate output timeseries: %s`, stringMetricName(&ts.MetricName))
}
m[string(bb.B)] = struct{}{}
@@ -154,10 +131,10 @@ func removeNaNs(tss []*timeseries) []*timeseries {
return rvs
}
func parsePromQLWithCache(q string) (expr, error) {
func parsePromQLWithCache(q string) (metricsql.Expr, error) {
pcv := parseCacheV.Get(q)
if pcv == nil {
e, err := parsePromQL(q)
e, err := metricsql.Parse(q)
pcv = &parseCacheValue{
e: e,
err: err,
@@ -189,7 +166,7 @@ var parseCacheV = func() *parseCache {
const parseCacheMaxLen = 10e3
type parseCacheValue struct {
e expr
e metricsql.Expr
err error
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -47,670 +47,3 @@ func TestParseMetricSelectorError(t *testing.T) {
f(`foo[5m]`)
f(`foo offset 5m`)
}
func TestParsePromQLSuccess(t *testing.T) {
another := func(s string, sExpected string) {
t.Helper()
e, err := parsePromQL(s)
if err != nil {
t.Fatalf("unexpected error when parsing %q: %s", s, err)
}
res := e.AppendString(nil)
if string(res) != sExpected {
t.Fatalf("unexpected string constructed;\ngot\n%q\nwant\n%q", res, sExpected)
}
}
same := func(s string) {
t.Helper()
another(s, s)
}
// metricExpr
same(`{}`)
same(`{}[5m]`)
same(`{}[5m:]`)
same(`{}[:]`)
another(`{}[: ]`, `{}[:]`)
same(`{}[:3s]`)
another(`{}[: 3s ]`, `{}[:3s]`)
same(`{}[5m:3s]`)
another(`{}[ 5m : 3s ]`, `{}[5m:3s]`)
same(`{} offset 5m`)
same(`{}[5m] offset 10y`)
same(`{}[5.3m:3.4s] offset 10y`)
same(`{}[:3.4s] offset 10y`)
same(`{Foo="bAR"}`)
same(`{foo="bar"}`)
same(`{foo="bar"}[5m]`)
same(`{foo="bar"}[5m:]`)
same(`{foo="bar"}[5m:3s]`)
same(`{foo="bar"} offset 10y`)
same(`{foo="bar"}[5m] offset 10y`)
same(`{foo="bar"}[5m:3s] offset 10y`)
another(`{foo="bar"}[5m] oFFSEt 10y`, `{foo="bar"}[5m] offset 10y`)
same("METRIC")
same("metric")
same("m_e:tri44:_c123")
another("-metric", "0 - metric")
same(`metric offset 10h`)
same("metric[5m]")
same("metric[5m:3s]")
same("metric[5m] offset 10h")
same("metric[5m:3s] offset 10h")
same("metric[5i:3i] offset 10i")
same(`metric{foo="bar"}`)
same(`metric{foo="bar"} offset 10h`)
same(`metric{foo!="bar"}[2d]`)
same(`metric{foo="bar"}[2d] offset 10h`)
same(`metric{foo="bar", b="sdfsdf"}[2d:3h] offset 10h`)
another(` metric { foo = "bar" } [ 2d ] offset 10h `, `metric{foo="bar"}[2d] offset 10h`)
// metric name matching keywords
same("rate")
same("RATE")
same("by")
same("BY")
same("bool")
same("BOOL")
same("unless")
same("UNLESS")
same("Ignoring")
same("with")
same("WITH")
same("With")
same("alias")
same(`alias{foo="bar"}`)
same(`aLIas{alias="aa"}`)
another(`al\ias`, `alias`)
// identifiers with with escape chars
same(`foo\ bar`)
same(`foo\-bar\{{baz\+bar="aa"}`)
another(`\x2E\x2ef\oo{b\xEF\ar="aa"}`, `\x2e.foo{b\xefar="aa"}`)
// Duplicate filters
same(`foo{__name__="bar"}`)
same(`foo{a="b", a="c", __name__="aaa", b="d"}`)
// Metric filters ending with comma
another(`m{foo="bar",}`, `m{foo="bar"}`)
// String concat in tag value
another(`m{foo="bar" + "baz"}`, `m{foo="barbaz"}`)
// Valid regexp
same(`foo{bar=~"x"}`)
same(`foo{bar=~"^x"}`)
same(`foo{bar=~"^x$"}`)
same(`foo{bar=~"^(a[bc]|d)$"}`)
same(`foo{bar!~"x"}`)
same(`foo{bar!~"^x"}`)
same(`foo{bar!~"^x$"}`)
same(`foo{bar!~"^(a[bc]|d)$"}`)
// stringExpr
same(`""`)
same(`"\n\t\r 12:{}[]()44"`)
another(`''`, `""`)
another("``", `""`)
another(" `foo\"b'ar` ", "\"foo\\\"b'ar\"")
another(` 'foo\'bar"BAZ' `, `"foo'bar\"BAZ"`)
// string concat
another(`"foo"+'bar'`, `"foobar"`)
// numberExpr
same(`1`)
same(`1.23`)
same(`0.23`)
same(`1.2e+45`)
same(`1.2e-45`)
same(`-1`)
same(`-1.23`)
same(`-0.23`)
same(`-1.2e+45`)
same(`-1.2e-45`)
same(`-1.2e-45`)
another(`12.5E34`, `1.25e+35`)
another(`-.2`, `-0.2`)
another(`-.2E-2`, `-0.002`)
same(`NaN`)
another(`nan`, `NaN`)
another(`NAN`, `NaN`)
another(`nAN`, `NaN`)
another(`Inf`, `+Inf`)
another(`INF`, `+Inf`)
another(`inf`, `+Inf`)
another(`+Inf`, `+Inf`)
another(`-Inf`, `-Inf`)
another(`-inF`, `-Inf`)
// binaryOpExpr
another(`nan == nan`, `NaN`)
another(`nan ==bool nan`, `1`)
another(`nan !=bool nan`, `0`)
another(`nan !=bool 2`, `1`)
another(`2 !=bool nan`, `1`)
another(`nan >bool nan`, `0`)
another(`nan <bool nan`, `0`)
another(`1 ==bool nan`, `0`)
another(`NaN !=bool 1`, `1`)
another(`inf >=bool 2`, `1`)
another(`-1 >bool -inf`, `1`)
another(`-1 <bool -inf`, `0`)
another(`nan + 2 *3 * inf`, `NaN`)
another(`INF - Inf`, `NaN`)
another(`Inf + inf`, `+Inf`)
another(`1/0`, `+Inf`)
another(`0/0`, `NaN`)
another(`-m`, `0 - m`)
same(`m + ignoring () n[5m]`)
another(`M + IGNORING () N[5m]`, `M + ignoring () N[5m]`)
same(`m + on (foo) n[5m]`)
another(`m + ON (Foo) n[5m]`, `m + on (Foo) n[5m]`)
same(`m + ignoring (a, b) n[5m]`)
another(`1 or 2`, `1`)
another(`1 and 2`, `1`)
another(`1 unless 2`, `NaN`)
another(`1 default 2`, `1`)
another(`1 default NaN`, `1`)
another(`NaN default 2`, `2`)
another(`1 > 2`, `NaN`)
another(`1 > bool 2`, `0`)
another(`3 >= 2`, `3`)
another(`3 <= bool 2`, `0`)
another(`1 + -2 - 3`, `-4`)
another(`1 / 0 + 2`, `+Inf`)
another(`2 + -1 / 0`, `-Inf`)
another(`-1 ^ 0.5`, `NaN`)
another(`512.5 - (1 + 3) * (2 ^ 2) ^ 3`, `256.5`)
another(`1 == bool 1 != bool 24 < bool 4 > bool -1`, `1`)
another(`1 == bOOl 1 != BOOL 24 < Bool 4 > booL -1`, `1`)
another(`m1+on(foo)group_left m2`, `m1 + on (foo) group_left () m2`)
another(`M1+ON(FOO)GROUP_left M2`, `M1 + on (FOO) group_left () M2`)
same(`m1 + on (foo) group_right () m2`)
same(`m1 + on (foo, bar) group_right (x, y) m2`)
another(`m1 + on (foo, bar,) group_right (x, y,) m2`, `m1 + on (foo, bar) group_right (x, y) m2`)
same(`m1 == bool on (foo, bar) group_right (x, y) m2`)
another(`5 - 1 + 3 * 2 ^ 2 ^ 3 - 2 OR Metric {Bar= "Baz", aaa!="bb",cc=~"dd" ,zz !~"ff" } `,
`770 or Metric{Bar="Baz", aaa!="bb", cc=~"dd", zz!~"ff"}`)
same(`"foo" + bar()`)
same(`"foo" + bar{x="y"}`)
same(`("foo"[3s] + bar{x="y"})[5m:3s] offset 10s`)
same(`("foo"[3s] + bar{x="y"})[5i:3i] offset 10i`)
same(`bar + "foo" offset 3s`)
same(`bar + "foo" offset 3i`)
another(`1+2 if 2>3`, `NaN`)
another(`1+4 if 2<3`, `5`)
another(`2+6 default 3 if 2>3`, `8`)
another(`2+6 if 2>3 default NaN`, `NaN`)
another(`42 if 3>2 if 2+2<5`, `42`)
another(`42 if 3>2 if 2+2>=5`, `NaN`)
another(`1+2 ifnot 2>3`, `3`)
another(`1+4 ifnot 2<3`, `NaN`)
another(`2+6 default 3 ifnot 2>3`, `8`)
another(`2+6 ifnot 2>3 default NaN`, `8`)
another(`42 if 3>2 ifnot 2+2<5`, `NaN`)
another(`42 if 3>2 ifnot 2+2>=5`, `42`)
// parensExpr
another(`(-foo + ((bar) / (baz))) + ((23))`, `((0 - foo) + (bar / baz)) + 23`)
another(`(FOO + ((Bar) / (baZ))) + ((23))`, `(FOO + (Bar / baZ)) + 23`)
same(`(foo, bar)`)
another(`1+(foo, bar,)`, `1 + (foo, bar)`)
another(`((foo(bar,baz)), (1+(2)+(3,4)+()))`, `(foo(bar, baz), (3 + (3, 4)) + ())`)
same(`()`)
// funcExpr
same(`f()`)
another(`f(x,)`, `f(x)`)
another(`-f()-Ff()`, `(0 - f()) - Ff()`)
same(`F()`)
another(`+F()`, `F()`)
another(`++F()`, `F()`)
another(`--F()`, `0 - (0 - F())`)
same(`f(http_server_request)`)
same(`f(http_server_request)[4s:5m] offset 10m`)
same(`f(http_server_request)[4i:5i] offset 10i`)
same(`F(HttpServerRequest)`)
same(`f(job, foo)`)
same(`F(Job, Foo)`)
another(` FOO (bar) + f ( m ( ),ff(1 + ( 2.5)) ,M[5m ] , "ff" )`, `FOO(bar) + f(m(), ff(3.5), M[5m], "ff")`)
// funcName matching keywords
same(`by(2)`)
same(`BY(2)`)
same(`or(2)`)
same(`OR(2)`)
same(`bool(2)`)
same(`BOOL(2)`)
same(`rate(rate(m))`)
same(`rate(rate(m[5m]))`)
same(`rate(rate(m[5m])[1h:])`)
same(`rate(rate(m[5m])[1h:3s])`)
// funcName with escape chars
same(`foo\(ba\-r()`)
// aggrFuncExpr
same(`sum(http_server_request) by ()`)
same(`sum(http_server_request) by (job)`)
same(`sum(http_server_request) without (job, foo)`)
another(`sum(x,y,) without (a,b,)`, `sum(x, y) without (a, b)`)
another(`sum by () (xx)`, `sum(xx) by ()`)
another(`sum by (s) (xx)[5s]`, `(sum(xx) by (s))[5s]`)
another(`SUM BY (ZZ, aa) (XX)`, `sum(XX) by (ZZ, aa)`)
another(`sum without (a, b) (xx,2+2)`, `sum(xx, 4) without (a, b)`)
another(`Sum WIthout (a, B) (XX,2+2)`, `sum(XX, 4) without (a, B)`)
same(`sum(a) or sum(b)`)
same(`sum(a) by () or sum(b) without (x, y)`)
same(`sum(a) + sum(b)`)
same(`sum(x) * (1 + sum(a))`)
// All the above
another(`Sum(Ff(M) * M{X=""}[5m] Offset 7m - 123, 35) BY (X, y) * F2("Test")`,
`sum((Ff(M) * M{X=""}[5m] offset 7m) - 123, 35) by (X, y) * F2("Test")`)
another(`# comment
Sum(Ff(M) * M{X=""}[5m] Offset 7m - 123, 35) BY (X, y) # yet another comment
* F2("Test")`,
`sum((Ff(M) * M{X=""}[5m] offset 7m) - 123, 35) by (X, y) * F2("Test")`)
// withExpr
another(`with () x`, `x`)
another(`with (x=1,) x`, `1`)
another(`with (x = m offset 5h) x + x`, `m offset 5h + m offset 5h`)
another(`with (x = m offset 5i) x + x`, `m offset 5i + m offset 5i`)
another(`with (foo = bar{x="x"}) 1`, `1`)
another(`with (foo = bar{x="x"}) "x"`, `"x"`)
another(`with (f="x") f`, `"x"`)
another(`with (foo = bar{x="x"}) x{x="y"}`, `x{x="y"}`)
another(`with (foo = bar{x="x"}) 1+1`, `2`)
another(`with (foo = bar{x="x"}) f()`, `f()`)
another(`with (foo = bar{x="x"}) sum(x)`, `sum(x)`)
another(`with (foo = bar{x="x"}) baz{foo="bar"}`, `baz{foo="bar"}`)
another(`with (foo = bar) baz`, `baz`)
another(`with (foo = bar) foo + foo{a="b"}`, `bar + bar{a="b"}`)
another(`with (foo = bar, bar=baz + f()) test`, `test`)
another(`with (ct={job="test"}) a{ct} + ct() + f({ct="x"})`, `(a{job="test"} + {job="test"}) + f({ct="x"})`)
another(`with (ct={job="test", i="bar"}) ct + {ct, x="d"} + foo{ct, ct} + ctx(1)`,
`(({job="test", i="bar"} + {job="test", i="bar", x="d"}) + foo{job="test", i="bar"}) + ctx(1)`)
another(`with (foo = bar) {__name__=~"foo"}`, `{__name__=~"foo"}`)
another(`with (foo = bar) foo{__name__="foo"}`, `bar`)
another(`with (foo = bar) {__name__="foo", x="y"}`, `bar{x="y"}`)
another(`with (foo(bar) = {__name__!="bar"}) foo(x)`, `{__name__!="bar"}`)
another(`with (foo(bar) = bar{__name__="bar"}) foo(x)`, `x`)
another(`with (foo\-bar(baz) = baz + baz) foo\-bar((x,y))`, `(x, y) + (x, y)`)
another(`with (foo\-bar(baz) = baz + baz) foo\-bar(x*y)`, `(x * y) + (x * y)`)
another(`with (foo\-bar(baz) = baz + baz) foo\-bar(x\*y)`, `x\*y + x\*y`)
another(`with (foo\-bar(b\ az) = b\ az + b\ az) foo\-bar(x\*y)`, `x\*y + x\*y`)
// override ttf to something new.
another(`with (ttf = a) ttf + b`, `a + b`)
// override ttf to ru
another(`with (ttf = ru(m, n)) ttf`, `(clamp_min(n - clamp_min(m, 0), 0) / clamp_min(n, 0)) * 100`)
// Verify withExpr recursion and forward reference
another(`with (x = x+y, y = x+x) y ^ 2`, `((x + y) + (x + y)) ^ 2`)
another(`with (f1(x)=f2(x), f2(x)=f1(x)^2) f1(foobar)`, `f2(foobar)`)
another(`with (f1(x)=f2(x), f2(x)=f1(x)^2) f2(foobar)`, `f2(foobar) ^ 2`)
// Verify withExpr funcs
another(`with (x() = y+1) x`, `y + 1`)
another(`with (x(foo) = foo+1) x(a)`, `a + 1`)
another(`with (x(a, b) = a + b) x(foo, bar)`, `foo + bar`)
another(`with (x(a, b) = a + b) x(foo, x(1, 2))`, `foo + 3`)
another(`with (x(a) = sum(a) by (b)) x(xx) / x(y)`, `sum(xx) by (b) / sum(y) by (b)`)
another(`with (f(a,f,x)=ff(x,f,a)) f(f(x,y,z),1,2)`, `ff(2, 1, ff(z, y, x))`)
another(`with (f(x)=1+f(x)) f(foo{bar="baz"})`, `1 + f(foo{bar="baz"})`)
another(`with (a=foo, y=bar, f(a)= a+a+y) f(x)`, `(x + x) + bar`)
another(`with (f(a, b) = m{a, b}) f({a="x", b="y"}, {c="d"})`, `m{a="x", b="y", c="d"}`)
another(`with (xx={a="x"}, f(a, b) = m{a, b}) f({xx, b="y"}, {c="d"})`, `m{a="x", b="y", c="d"}`)
another(`with (x() = {b="c"}) foo{x}`, `foo{b="c"}`)
another(`with (f(x)=x{foo="bar"} offset 5m) f(m offset 10m)`, `(m{foo="bar"} offset 10m) offset 5m`)
another(`with (f(x)=x{foo="bar",bas="a"}[5m]) f(m[10m] offset 3s)`, `(m{foo="bar", bas="a"}[10m] offset 3s)[5m]`)
another(`with (f(x)=x{foo="bar"}[5m] offset 10m) f(m{x="y"})`, `m{x="y", foo="bar"}[5m] offset 10m`)
another(`with (f(x)=x{foo="bar"}[5m] offset 10m) f({x="y", foo="bar", foo="bar"})`, `{x="y", foo="bar"}[5m] offset 10m`)
another(`with (f(m, x)=m{x}[5m] offset 10m) f(foo, {})`, `foo[5m] offset 10m`)
another(`with (f(m, x)=m{x, bar="baz"}[5m] offset 10m) f(foo, {})`, `foo{bar="baz"}[5m] offset 10m`)
another(`with (f(x)=x[5m] offset 3s) f(foo[3m]+bar)`, `(foo[3m] + bar)[5m] offset 3s`)
another(`with (f(x)=x[5m:3s] oFFsEt 1.5m) f(sum(s) by (a,b))`, `(sum(s) by (a, b))[5m:3s] offset 1.5m`)
another(`with (x="a", y=x) y+"bc"`, `"abc"`)
another(`with (x="a", y="b"+x) "we"+y+"z"+f()`, `"webaz" + f()`)
another(`with (f(x) = m{foo=x+"y", bar="y"+x, baz=x} + x) f("qwe")`, `m{foo="qwey", bar="yqwe", baz="qwe"} + "qwe"`)
another(`with (f(a)=a) f`, `f`)
another(`with (f\q(a)=a) f\q`, `fq`)
// Verify withExpr for aggr func modifiers
another(`with (f(x) = x, y = sum(m) by (f)) y`, `sum(m) by (f)`)
another(`with (f(x) = sum(m) by (x)) f(foo)`, `sum(m) by (foo)`)
another(`with (f(x) = sum(m) by (x)) f((foo, bar, foo))`, `sum(m) by (foo, bar)`)
another(`with (f(x) = sum(m) without (x,y)) f((a, b))`, `sum(m) without (a, b, y)`)
another(`with (f(x) = sum(m) without (y,x)) f((a, y))`, `sum(m) without (y, a)`)
another(`with (f(x,y) = a + on (x,y) group_left (y,bar) b) f(foo,())`, `a + on (foo) group_left (bar) b`)
another(`with (f(x,y) = a + on (x,y) group_left (y,bar) b) f((foo),())`, `a + on (foo) group_left (bar) b`)
another(`with (f(x,y) = a + on (x,y) group_left (y,bar) b) f((foo,xx),())`, `a + on (foo, xx) group_left (bar) b`)
// Verify nested with exprs
another(`with (f(x) = (with(x=y) x) + x) f(z)`, `y + z`)
another(`with (x=foo) f(a, with (y=x) y)`, `f(a, foo)`)
another(`with (x=foo) a * x + (with (y=x) y) / y`, `(a * foo) + (foo / y)`)
another(`with (x = with (y = foo) y + x) x/x`, `(foo + x) / (foo + x)`)
another(`with (
x = {foo="bar"},
q = m{x, y="1"},
f(x) =
with (
z(y) = x + y * q
)
z(foo) / f(x)
)
f(a)`, `(a + (foo * m{foo="bar", y="1"})) / f(a)`)
// complex withExpr
another(`WITH (
treshold = (0.9),
commonFilters = {job="cacher", instance=~"1.2.3.4"},
hits = rate(cache{type="hit", commonFilters}[5m]),
miss = rate(cache{type="miss", commonFilters}[5m]),
sumByInstance(arg) = sum(arg) by (instance),
hitRatio = sumByInstance(hits) / sumByInstance(hits + miss)
)
hitRatio < treshold`,
`(sum(rate(cache{type="hit", job="cacher", instance=~"1.2.3.4"}[5m])) by (instance) / sum(rate(cache{type="hit", job="cacher", instance=~"1.2.3.4"}[5m]) + rate(cache{type="miss", job="cacher", instance=~"1.2.3.4"}[5m])) by (instance)) < 0.9`)
another(`WITH (
x2(x) = x^2,
f(x, y) = x2(x) + x*y + x2(y)
)
f(a, 3)
`, `((a ^ 2) + (a * 3)) + 9`)
another(`WITH (
x2(x) = x^2,
f(x, y) = x2(x) + x*y + x2(y)
)
f(2, 3)
`, `19`)
another(`WITH (
commonFilters = {instance="foo"},
timeToFuckup(currv, maxv) = (maxv - currv) / rate(currv)
)
timeToFuckup(diskUsage{commonFilters}, maxDiskSize{commonFilters})`,
`(maxDiskSize{instance="foo"} - diskUsage{instance="foo"}) / rate(diskUsage{instance="foo"})`)
another(`WITH (
commonFilters = {job="foo", instance="bar"},
sumRate(m, cf) = sum(rate(m{cf})) by (job, instance),
hitRate(hits, misses) = sumRate(hits, commonFilters) / (sumRate(hits, commonFilters) + sumRate(misses, commonFilters))
)
hitRate(cacheHits, cacheMisses)`,
`sum(rate(cacheHits{job="foo", instance="bar"})) by (job, instance) / (sum(rate(cacheHits{job="foo", instance="bar"})) by (job, instance) + sum(rate(cacheMisses{job="foo", instance="bar"})) by (job, instance))`)
another(`with(y=123,z=5) union(with(y=3,f(x)=x*y) f(2) + f(3), with(x=5,y=2) x*y*z)`, `union(15, 50)`)
}
func TestParsePromQLError(t *testing.T) {
f := func(s string) {
t.Helper()
e, err := parsePromQL(s)
if err == nil {
t.Fatalf("expecting non-nil error when parsing %q", s)
}
if e != nil {
t.Fatalf("expecting nil expr when parsing %q", s)
}
}
// an empty string
f("")
f(" \t\b\r\n ")
// invalid metricExpr
f(`{__name__="ff"} offset 55`)
f(`{__name__="ff"} offset -5m`)
f(`foo[55]`)
f(`m[-5m]`)
f(`{`)
f(`foo{`)
f(`foo{bar`)
f(`foo{bar=`)
f(`foo{bar="baz"`)
f(`foo{bar="baz", `)
f(`foo{123="23"}`)
f(`foo{foo}`)
f(`foo{,}`)
f(`foo{,foo="bar"}`)
f(`foo{foo=}`)
f(`foo{foo="ba}`)
f(`foo{"foo"="bar"}`)
f(`foo{$`)
f(`foo{a $`)
f(`foo{a="b",$`)
f(`foo{a="b"}$`)
f(`[`)
f(`[]`)
f(`f[5m]$`)
f(`[5m]`)
f(`[5m] offset 4h`)
f(`m[5m] offset $`)
f(`m[5m] offset 5h $`)
f(`m[]`)
f(`m[-5m]`)
f(`m[5m:`)
f(`m[5m:-`)
f(`m[5m:-1`)
f(`m[5m:-1]`)
f(`m[:`)
f(`m[:-`)
f(`m[:1]`)
f(`m[:-1m]`)
f(`m[5]`)
f(`m[[5m]]`)
f(`m[foo]`)
f(`m["ff"]`)
f(`m[10m`)
f(`m[123`)
f(`m["ff`)
f(`m[(f`)
f(`fd}`)
f(`]`)
f(`m $`)
f(`m{,}`)
f(`m{x=y}`)
f(`m{x=y/5}`)
f(`m{x=y+5}`)
// Invalid regexp
f(`foo{bar=~"x["}`)
f(`foo{bar=~"x("}`)
f(`foo{bar=~"x)"}`)
f(`foo{bar!~"x["}`)
f(`foo{bar!~"x("}`)
f(`foo{bar!~"x)"}`)
// invalid stringExpr
f(`'`)
f(`"`)
f("`")
f(`"foo`)
f(`'foo`)
f("`foo")
f(`"foo\"bar`)
f(`'foo\'bar`)
f("`foo\\`bar")
f(`"" $`)
f(`"foo" +`)
f(`n{"foo" + m`)
// invalid numberExpr
f(`12.`)
f(`1.2e`)
f(`23e-`)
f(`23E+`)
f(`.`)
f(`-12.`)
f(`-1.2e`)
f(`-23e-`)
f(`-23E+`)
f(`-.`)
f(`-1$$`)
f(`-$$`)
f(`+$$`)
f(`23 $$`)
// invalid binaryOpExpr
f(`+`)
f(`1 +`)
f(`1 + 2.`)
f(`3 unless`)
f(`23 + on (foo)`)
f(`m + on (,) m`)
f(`3 * ignoring`)
f(`m * on (`)
f(`m * on (foo`)
f(`m * on (foo,`)
f(`m * on (foo,)`)
f(`m * on (,foo)`)
f(`m * on (,)`)
f(`m == bool (bar) baz`)
f(`m == bool () baz`)
f(`m * by (baz) n`)
f(`m + bool group_left m2`)
f(`m + on () group_left (`)
f(`m + on () group_left (,`)
f(`m + on () group_left (,foo`)
f(`m + on () group_left (foo,)`)
f(`m + on () group_left (,foo)`)
f(`m + on () group_left (foo)`)
f(`m + on () group_right (foo) (m`)
f(`m or ignoring () group_left () n`)
f(`1 + bool 2`)
f(`m % bool n`)
f(`m * bool baz`)
f(`M * BOoL BaZ`)
f(`foo unless ignoring (bar) group_left xxx`)
f(`foo or bool bar`)
f(`foo == bool $$`)
f(`"foo" + bar`)
// invalid parensExpr
f(`(`)
f(`($`)
f(`(+`)
f(`(1`)
f(`(m+`)
f(`1)`)
f(`(,)`)
f(`(1)$`)
// invalid funcExpr
f(`f $`)
f(`f($)`)
f(`f[`)
f(`f()$`)
f(`f(`)
f(`f(foo`)
f(`f(f,`)
f(`f(,`)
f(`f(,)`)
f(`f(,foo)`)
f(`f(,foo`)
f(`f(foo,$`)
f(`f() by (a)`)
f(`f without (x) (y)`)
f(`f() foo (a)`)
f(`f bar (x) (b)`)
f(`f bar (x)`)
// invalid aggrFuncExpr
f(`sum(`)
f(`sum $`)
f(`sum [`)
f(`sum($)`)
f(`sum()$`)
f(`sum(foo) ba`)
f(`sum(foo) ba()`)
f(`sum(foo) by`)
f(`sum(foo) without x`)
f(`sum(foo) aaa`)
f(`sum(foo) aaa x`)
f(`sum() by $`)
f(`sum() by (`)
f(`sum() by ($`)
f(`sum() by (a`)
f(`sum() by (a $`)
f(`sum() by (a ]`)
f(`sum() by (a)$`)
f(`sum() by (,`)
f(`sum() by (a,$`)
f(`sum() by (,)`)
f(`sum() by (,a`)
f(`sum() by (,a)`)
f(`sum() on (b)`)
f(`sum() bool`)
f(`sum() group_left`)
f(`sum() group_right(x)`)
f(`sum ba`)
f(`sum ba ()`)
f(`sum by (`)
f(`sum by (a`)
f(`sum by (,`)
f(`sum by (,)`)
f(`sum by (,a`)
f(`sum by (,a)`)
f(`sum by (a)`)
f(`sum by (a) (`)
f(`sum by (a) [`)
f(`sum by (a) {`)
f(`sum by (a) (b`)
f(`sum by (a) (b,`)
f(`sum by (a) (,)`)
f(`avg by (a) (,b)`)
f(`sum by (x) (y) by (z)`)
f(`sum(m) by (1)`)
// invalid withExpr
f(`with $`)
f(`with a`)
f(`with a=b c`)
f(`with (`)
f(`with (x=b)$`)
f(`with ($`)
f(`with (foo`)
f(`with (foo $`)
f(`with (x y`)
f(`with (x =`)
f(`with (x = $`)
f(`with (x= y`)
f(`with (x= y $`)
f(`with (x= y)`)
f(`with (x=(`)
f(`with (x=[)`)
f(`with (x=() x)`)
f(`with ($$)`)
f(`with (x $$`)
f(`with (x = $$)`)
f(`with (x = foo) bar{x}`)
f(`with (x = {foo="bar"}[5m]) bar{x}`)
f(`with (x = {foo="bar"} offset 5m) bar{x}`)
f(`with (x = a, x = b) c`)
f(`with (x(a, a) = b) c`)
f(`with (x=m{f="x"}) foo{x}`)
f(`with (sum = x) y`)
f(`with (rate(a) = b) c`)
f(`with (clamp_min=x) y`)
f(`with (f()`)
f(`with (a=b c=d) e`)
f(`with (f(x)=x^2) m{x}`)
f(`with (f(x)=ff()) m{x}`)
f(`with (f(x`)
f(`with (x=m) a{x} + b`)
f(`with (x=m) b + a{x}`)
f(`with (x=m) f(b, a{x})`)
f(`with (x=m) sum(a{x})`)
f(`with (x=m) (a{x})`)
f(`with (f(a)=a) f(1, 2)`)
f(`with (f(x)=x{foo="bar"}) f(1)`)
f(`with (f(x)=x{foo="bar"}) f(m + n)`)
f(`with (f = with`)
f(`with (,)`)
f(`with (1) 2`)
f(`with (f(1)=2) 3`)
f(`with (f(,)=x) x`)
f(`with (x(a) = {b="c"}) foo{x}`)
f(`with (f(x) = m{foo=xx}) f("qwe")`)
f(`a + with(f(x)=x) f(1,2)`)
f(`with (f(x) = sum(m) by (x)) f({foo="bar"})`)
f(`with (f(x) = sum(m) by (x)) f((xx(), {foo="bar"}))`)
f(`with (f(x) = m + on (x) n) f(xx())`)
f(`with (f(x) = m + on (a) group_right (x) n) f(xx())`)
}

File diff suppressed because it is too large Load Diff

View File

@@ -12,12 +12,18 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/workingsetcache"
"github.com/VictoriaMetrics/fastcache"
"github.com/VictoriaMetrics/metrics"
)
var disableCache = flag.Bool("search.disableCache", false, "Whether to disable response caching. This may be useful during data backfilling")
var (
disableCache = flag.Bool("search.disableCache", false, "Whether to disable response caching. This may be useful during data backfilling")
cacheTimestampOffset = flag.Duration("search.cacheTimestampOffset", 5*time.Minute, "The maximum duration since the current time for response data, "+
"which is always queried from the original raw data, without using the response cache. Increase this value if you see gaps in responses "+
"due to time synchronization issues between VictoriaMetrics and data sources")
)
var rollupResultCacheV = &rollupResultCache{
c: workingsetcache.New(1024*1024, time.Hour), // This is a cache for testing.
@@ -73,8 +79,8 @@ func InitRollupResultCache(cachePath string) {
return stats
}
if len(rollupResultCachePath) > 0 {
logger.Infof("loaded rollupResult cache from %q in %s; entriesCount: %d, sizeBytes: %d",
rollupResultCachePath, time.Since(startTime), fcs().EntriesCount, fcs().BytesSize)
logger.Infof("loaded rollupResult cache from %q in %.3f seconds; entriesCount: %d, sizeBytes: %d",
rollupResultCachePath, time.Since(startTime).Seconds(), fcs().EntriesCount, fcs().BytesSize)
}
metrics.NewGauge(`vm_cache_entries{type="promql/rollupResult"}`, func() float64 {
@@ -112,8 +118,8 @@ func StopRollupResultCache() {
rollupResultCacheV.c.UpdateStats(&fcs)
rollupResultCacheV.c.Stop()
rollupResultCacheV.c = nil
logger.Infof("saved rollupResult cache to %q in %s; entriesCount: %d, sizeBytes: %d",
rollupResultCachePath, time.Since(startTime), fcs.EntriesCount, fcs.BytesSize)
logger.Infof("saved rollupResult cache to %q in %.3f seconds; entriesCount: %d, sizeBytes: %d",
rollupResultCachePath, time.Since(startTime).Seconds(), fcs.EntriesCount, fcs.BytesSize)
}
type rollupResultCache struct {
@@ -126,9 +132,10 @@ var rollupResultCacheResets = metrics.NewCounter(`vm_cache_resets_total{type="pr
func ResetRollupResultCache() {
rollupResultCacheResets.Inc()
rollupResultCacheV.c.Reset()
logger.Infof("rollupResult cache has been cleared")
}
func (rrc *rollupResultCache) Get(funcName string, ec *EvalConfig, me *metricExpr, iafc *incrementalAggrFuncContext, window int64) (tss []*timeseries, newStart int64) {
func (rrc *rollupResultCache) Get(ec *EvalConfig, expr metricsql.Expr, window int64) (tss []*timeseries, newStart int64) {
if *disableCache || !ec.mayCache() {
return nil, ec.Start
}
@@ -137,7 +144,7 @@ func (rrc *rollupResultCache) Get(funcName string, ec *EvalConfig, me *metricExp
bb := bbPool.Get()
defer bbPool.Put(bb)
bb.B = marshalRollupResultCacheKey(bb.B[:0], funcName, me, iafc, window, ec.Step)
bb.B = marshalRollupResultCacheKey(bb.B[:0], expr, window, ec.Step)
metainfoBuf := rrc.c.Get(nil, bb.B)
if len(metainfoBuf) == 0 {
return nil, ec.Start
@@ -157,7 +164,7 @@ func (rrc *rollupResultCache) Get(funcName string, ec *EvalConfig, me *metricExp
if len(compressedResultBuf.B) == 0 {
mi.RemoveKey(key)
metainfoBuf = mi.Marshal(metainfoBuf[:0])
bb.B = marshalRollupResultCacheKey(bb.B[:0], funcName, me, iafc, window, ec.Step)
bb.B = marshalRollupResultCacheKey(bb.B[:0], expr, window, ec.Step)
rrc.c.Set(bb.B, metainfoBuf)
return nil, ec.Start
}
@@ -209,15 +216,15 @@ func (rrc *rollupResultCache) Get(funcName string, ec *EvalConfig, me *metricExp
var resultBufPool bytesutil.ByteBufferPool
func (rrc *rollupResultCache) Put(funcName string, ec *EvalConfig, me *metricExpr, iafc *incrementalAggrFuncContext, window int64, tss []*timeseries) {
func (rrc *rollupResultCache) Put(ec *EvalConfig, expr metricsql.Expr, window int64, tss []*timeseries) {
if *disableCache || len(tss) == 0 || !ec.mayCache() {
return
}
// Remove values up to currentTime - step - maxSilenceInterval,
// Remove values up to currentTime - step - cacheTimestampOffset,
// since these values may be added later.
timestamps := tss[0].Timestamps
deadline := (time.Now().UnixNano() / 1e6) - ec.Step - maxSilenceInterval
deadline := (time.Now().UnixNano() / 1e6) - ec.Step - cacheTimestampOffset.Milliseconds()
i := len(timestamps) - 1
for i >= 0 && timestamps[i] > deadline {
i--
@@ -260,7 +267,7 @@ func (rrc *rollupResultCache) Put(funcName string, ec *EvalConfig, me *metricExp
bb.B = key.Marshal(bb.B[:0])
rrc.c.SetBig(bb.B, compressedResultBuf.B)
bb.B = marshalRollupResultCacheKey(bb.B[:0], funcName, me, iafc, window, ec.Step)
bb.B = marshalRollupResultCacheKey(bb.B[:0], expr, window, ec.Step)
metainfoBuf := rrc.c.Get(nil, bb.B)
var mi rollupResultCacheMetainfo
if len(metainfoBuf) > 0 {
@@ -288,23 +295,13 @@ var (
var tooBigRollupResults = metrics.NewCounter("vm_too_big_rollup_results_total")
// Increment this value every time the format of the cache changes.
const rollupResultCacheVersion = 6
const rollupResultCacheVersion = 7
func marshalRollupResultCacheKey(dst []byte, funcName string, me *metricExpr, iafc *incrementalAggrFuncContext, window, step int64) []byte {
func marshalRollupResultCacheKey(dst []byte, expr metricsql.Expr, window, step int64) []byte {
dst = append(dst, rollupResultCacheVersion)
if iafc == nil {
dst = append(dst, 0)
} else {
dst = append(dst, 1)
dst = iafc.ae.AppendString(dst)
}
dst = encoding.MarshalUint64(dst, uint64(len(funcName)))
dst = append(dst, funcName...)
dst = encoding.MarshalInt64(dst, window)
dst = encoding.MarshalInt64(dst, step)
for i := range me.TagFilters {
dst = me.TagFilters[i].Marshal(dst)
}
dst = expr.AppendString(dst)
return dst
}

View File

@@ -3,12 +3,12 @@ package promql
import (
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
func TestRollupResultCache(t *testing.T) {
ResetRollupResultCache()
funcName := "foo"
window := int64(456)
ec := &EvalConfig{
Start: 1000,
@@ -17,21 +17,24 @@ func TestRollupResultCache(t *testing.T) {
MayCache: true,
}
me := &metricExpr{
TagFilters: []storage.TagFilter{{
Key: []byte("aaa"),
Value: []byte("xxx"),
me := &metricsql.MetricExpr{
LabelFilters: []metricsql.LabelFilter{{
Label: "aaa",
Value: "xxx",
}},
}
iafc := &incrementalAggrFuncContext{
ae: &aggrFuncExpr{
Name: "foobar",
},
fe := &metricsql.FuncExpr{
Name: "foo",
Args: []metricsql.Expr{me},
}
ae := &metricsql.AggrFuncExpr{
Name: "foobar",
Args: []metricsql.Expr{fe},
}
// Try obtaining an empty value.
t.Run("empty", func(t *testing.T) {
tss, newStart := rollupResultCacheV.Get(funcName, ec, me, nil, window)
tss, newStart := rollupResultCacheV.Get(ec, fe, window)
if newStart != ec.Start {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, ec.Start)
}
@@ -41,7 +44,7 @@ func TestRollupResultCache(t *testing.T) {
})
// Store timeseries overlapping with start
t.Run("start-overlap-no-iafc", func(t *testing.T) {
t.Run("start-overlap-no-ae", func(t *testing.T) {
ResetRollupResultCache()
tss := []*timeseries{
{
@@ -49,8 +52,8 @@ func TestRollupResultCache(t *testing.T) {
Values: []float64{0, 1, 2},
},
}
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss)
tss, newStart := rollupResultCacheV.Get(funcName, ec, me, nil, window)
rollupResultCacheV.Put(ec, fe, window, tss)
tss, newStart := rollupResultCacheV.Get(ec, fe, window)
if newStart != 1400 {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, 1400)
}
@@ -62,7 +65,7 @@ func TestRollupResultCache(t *testing.T) {
}
testTimeseriesEqual(t, tss, tssExpected)
})
t.Run("start-overlap-with-iafc", func(t *testing.T) {
t.Run("start-overlap-with-ae", func(t *testing.T) {
ResetRollupResultCache()
tss := []*timeseries{
{
@@ -70,8 +73,8 @@ func TestRollupResultCache(t *testing.T) {
Values: []float64{0, 1, 2},
},
}
rollupResultCacheV.Put(funcName, ec, me, iafc, window, tss)
tss, newStart := rollupResultCacheV.Get(funcName, ec, me, iafc, window)
rollupResultCacheV.Put(ec, ae, window, tss)
tss, newStart := rollupResultCacheV.Get(ec, ae, window)
if newStart != 1400 {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, 1400)
}
@@ -93,8 +96,8 @@ func TestRollupResultCache(t *testing.T) {
Values: []float64{333, 0, 1, 2},
},
}
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss)
tss, newStart := rollupResultCacheV.Get(funcName, ec, me, nil, window)
rollupResultCacheV.Put(ec, fe, window, tss)
tss, newStart := rollupResultCacheV.Get(ec, fe, window)
if newStart != 1000 {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, 1000)
}
@@ -112,8 +115,8 @@ func TestRollupResultCache(t *testing.T) {
Values: []float64{0, 1, 2},
},
}
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss)
tss, newStart := rollupResultCacheV.Get(funcName, ec, me, nil, window)
rollupResultCacheV.Put(ec, fe, window, tss)
tss, newStart := rollupResultCacheV.Get(ec, fe, window)
if newStart != 1000 {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, 1000)
}
@@ -131,8 +134,8 @@ func TestRollupResultCache(t *testing.T) {
Values: []float64{0, 1, 2},
},
}
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss)
tss, newStart := rollupResultCacheV.Get(funcName, ec, me, nil, window)
rollupResultCacheV.Put(ec, fe, window, tss)
tss, newStart := rollupResultCacheV.Get(ec, fe, window)
if newStart != 1000 {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, 1000)
}
@@ -150,8 +153,8 @@ func TestRollupResultCache(t *testing.T) {
Values: []float64{0, 1, 2},
},
}
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss)
tss, newStart := rollupResultCacheV.Get(funcName, ec, me, nil, window)
rollupResultCacheV.Put(ec, fe, window, tss)
tss, newStart := rollupResultCacheV.Get(ec, fe, window)
if newStart != 1000 {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, 1000)
}
@@ -169,8 +172,8 @@ func TestRollupResultCache(t *testing.T) {
Values: []float64{0, 1, 2, 3, 4, 5, 6, 7},
},
}
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss)
tss, newStart := rollupResultCacheV.Get(funcName, ec, me, nil, window)
rollupResultCacheV.Put(ec, fe, window, tss)
tss, newStart := rollupResultCacheV.Get(ec, fe, window)
if newStart != 2200 {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, 2200)
}
@@ -192,8 +195,8 @@ func TestRollupResultCache(t *testing.T) {
Values: []float64{1, 2, 3, 4, 5, 6},
},
}
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss)
tss, newStart := rollupResultCacheV.Get(funcName, ec, me, nil, window)
rollupResultCacheV.Put(ec, fe, window, tss)
tss, newStart := rollupResultCacheV.Get(ec, fe, window)
if newStart != 2200 {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, 2200)
}
@@ -217,8 +220,8 @@ func TestRollupResultCache(t *testing.T) {
}
tss = append(tss, ts)
}
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss)
tssResult, newStart := rollupResultCacheV.Get(funcName, ec, me, nil, window)
rollupResultCacheV.Put(ec, fe, window, tss)
tssResult, newStart := rollupResultCacheV.Get(ec, fe, window)
if newStart != 2200 {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, 2200)
}
@@ -246,10 +249,10 @@ func TestRollupResultCache(t *testing.T) {
Values: []float64{0, 1, 2},
},
}
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss1)
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss2)
rollupResultCacheV.Put(funcName, ec, me, nil, window, tss3)
tss, newStart := rollupResultCacheV.Get(funcName, ec, me, nil, window)
rollupResultCacheV.Put(ec, fe, window, tss1)
rollupResultCacheV.Put(ec, fe, window, tss2)
rollupResultCacheV.Put(ec, fe, window, tss3)
tss, newStart := rollupResultCacheV.Get(ec, fe, window)
if newStart != 1400 {
t.Fatalf("unexpected newStart; got %d; want %d", newStart, 1400)
}
@@ -388,7 +391,7 @@ func testTimeseriesEqual(t *testing.T, tss, tssExpected []*timeseries) {
}
for i, ts := range tss {
tsExpected := tssExpected[i]
testMetricNamesEqual(t, &ts.MetricName, &tsExpected.MetricName)
testMetricNamesEqual(t, &ts.MetricName, &tsExpected.MetricName, i)
testRowsEqual(t, ts.Values, ts.Timestamps, tsExpected.Values, tsExpected.Timestamps)
}
}

View File

@@ -3,6 +3,8 @@ package promql
import (
"math"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
)
var (
@@ -57,7 +59,7 @@ func TestRollupIderivDuplicateTimestamps(t *testing.T) {
}
n = rollupIderiv(rfa)
if n != 500 {
t.Fatalf("unexpected value; got %v; want %v", n, 0.5)
t.Fatalf("unexpected value; got %v; want %v", n, 500)
}
rfa = &rollupFuncArg{
@@ -157,7 +159,7 @@ func TestDerivValues(t *testing.T) {
testRowsEqual(t, values, timestamps, valuesExpected, timestamps)
}
func testRollupFunc(t *testing.T, funcName string, args []interface{}, meExpected *metricExpr, vExpected float64) {
func testRollupFunc(t *testing.T, funcName string, args []interface{}, meExpected *metricsql.MetricExpr, vExpected float64) {
t.Helper()
nrf := getRollupFunc(funcName)
if nrf == nil {
@@ -190,6 +192,52 @@ func testRollupFunc(t *testing.T, funcName string, args []interface{}, meExpecte
}
}
func TestRollupShareLEOverTime(t *testing.T) {
f := func(le, vExpected float64) {
t.Helper()
les := []*timeseries{{
Values: []float64{le},
Timestamps: []int64{123},
}}
var me metricsql.MetricExpr
args := []interface{}{&metricsql.RollupExpr{Expr: &me}, les}
testRollupFunc(t, "share_le_over_time", args, &me, vExpected)
}
f(-123, 0)
f(0, 0)
f(10, 0)
f(12, 0.08333333333333333)
f(30, 0.16666666666666666)
f(50, 0.75)
f(100, 0.9166666666666666)
f(123, 1)
f(1000, 1)
}
func TestRollupShareGTOverTime(t *testing.T) {
f := func(gt, vExpected float64) {
t.Helper()
gts := []*timeseries{{
Values: []float64{gt},
Timestamps: []int64{123},
}}
var me metricsql.MetricExpr
args := []interface{}{&metricsql.RollupExpr{Expr: &me}, gts}
testRollupFunc(t, "share_gt_over_time", args, &me, vExpected)
}
f(-123, 1)
f(0, 1)
f(10, 1)
f(12, 0.9166666666666666)
f(30, 0.8333333333333334)
f(50, 0.25)
f(100, 0.08333333333333333)
f(123, 0)
f(1000, 0)
}
func TestRollupQuantileOverTime(t *testing.T) {
f := func(phi, vExpected float64) {
t.Helper()
@@ -197,8 +245,8 @@ func TestRollupQuantileOverTime(t *testing.T) {
Values: []float64{phi},
Timestamps: []int64{123},
}}
var me metricExpr
args := []interface{}{phis, &rollupExpr{Expr: &me}}
var me metricsql.MetricExpr
args := []interface{}{phis, &metricsql.RollupExpr{Expr: &me}}
testRollupFunc(t, "quantile_over_time", args, &me, vExpected)
}
@@ -219,8 +267,8 @@ func TestRollupPredictLinear(t *testing.T) {
Values: []float64{sec},
Timestamps: []int64{123},
}}
var me metricExpr
args := []interface{}{&rollupExpr{Expr: &me}, secs}
var me metricsql.MetricExpr
args := []interface{}{&metricsql.RollupExpr{Expr: &me}, secs}
testRollupFunc(t, "predict_linear", args, &me, vExpected)
}
@@ -241,8 +289,8 @@ func TestRollupHoltWinters(t *testing.T) {
Values: []float64{tf},
Timestamps: []int64{123},
}}
var me metricExpr
args := []interface{}{&rollupExpr{Expr: &me}, sfs, tfs}
var me metricsql.MetricExpr
args := []interface{}{&metricsql.RollupExpr{Expr: &me}, sfs, tfs}
testRollupFunc(t, "holt_winters", args, &me, vExpected)
}
@@ -262,27 +310,72 @@ func TestRollupHoltWinters(t *testing.T) {
f(0.9, 0.9, 33.99637566941818)
}
func TestRollupHoeffdingBoundLower(t *testing.T) {
f := func(phi, vExpected float64) {
t.Helper()
phis := []*timeseries{{
Values: []float64{phi},
Timestamps: []int64{123},
}}
var me metricsql.MetricExpr
args := []interface{}{phis, &metricsql.RollupExpr{Expr: &me}}
testRollupFunc(t, "hoeffding_bound_lower", args, &me, vExpected)
}
f(0.5, 28.21949401521037)
f(-1, 47.083333333333336)
f(0, 47.083333333333336)
f(1, -inf)
f(2, -inf)
f(0.1, 39.72878000047643)
f(0.9, 12.701803086472331)
}
func TestRollupHoeffdingBoundUpper(t *testing.T) {
f := func(phi, vExpected float64) {
t.Helper()
phis := []*timeseries{{
Values: []float64{phi},
Timestamps: []int64{123},
}}
var me metricsql.MetricExpr
args := []interface{}{phis, &metricsql.RollupExpr{Expr: &me}}
testRollupFunc(t, "hoeffding_bound_upper", args, &me, vExpected)
}
f(0.5, 65.9471726514563)
f(-1, 47.083333333333336)
f(0, 47.083333333333336)
f(1, inf)
f(2, inf)
f(0.1, 54.43788666619024)
f(0.9, 81.46486358019433)
}
func TestRollupNewRollupFuncSuccess(t *testing.T) {
f := func(funcName string, vExpected float64) {
t.Helper()
var me metricExpr
args := []interface{}{&rollupExpr{Expr: &me}}
var me metricsql.MetricExpr
args := []interface{}{&metricsql.RollupExpr{Expr: &me}}
testRollupFunc(t, funcName, args, &me, vExpected)
}
f("default_rollup", 34)
f("changes", 11)
f("delta", -89)
f("delta", 34)
f("deriv", -266.85860231406065)
f("deriv_fast", -712)
f("idelta", 0)
f("increase", 275)
f("increase", 398)
f("irate", 0)
f("rate", 2200)
f("resets", 5)
f("range_over_time", 111)
f("avg_over_time", 47.083333333333336)
f("min_over_time", 12)
f("max_over_time", 123)
f("tmin_over_time", 0.08)
f("tmax_over_time", 0.005)
f("sum_over_time", 565)
f("sum2_over_time", 37951)
f("geomean_over_time", 39.33466603189148)
@@ -291,7 +384,7 @@ func TestRollupNewRollupFuncSuccess(t *testing.T) {
f("stdvar_over_time", 945.7430555555555)
f("first_over_time", 123)
f("last_over_time", 34)
f("integrate", 61.0275)
f("integrate", 5.4705)
f("distinct_over_time", 8)
f("ideriv", 0)
f("decreases_over_time", 5)
@@ -327,7 +420,7 @@ func TestRollupNewRollupFuncError(t *testing.T) {
Values: []float64{321},
Timestamps: []int64{123},
}}
me := &metricExpr{}
me := &metricsql.MetricExpr{}
f("holt_winters", []interface{}{123, 123, 321})
f("holt_winters", []interface{}{me, 123, 321})
f("holt_winters", []interface{}{me, scalarTs, 321})
@@ -409,7 +502,7 @@ func TestRollupNoWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, 123, 123, 123, 34, 34}
valuesExpected := []float64{nan, 123, nan, 34, nan, 44}
timestampsExpected := []int64{0, 5, 10, 15, 20, 25}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -423,7 +516,7 @@ func TestRollupNoWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{12, 44, 34, nan}
valuesExpected := []float64{44, 32, 34, nan}
timestampsExpected := []int64{100, 120, 140, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -437,7 +530,7 @@ func TestRollupNoWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, nan, 123, 54, 44}
valuesExpected := []float64{nan, nan, 123, 34, 32}
timestampsExpected := []int64{-50, 0, 50, 100, 150}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -499,7 +592,7 @@ func TestRollupFuncsLookbackDelta(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{99, 12, 44, nan, 32, 34, nan}
valuesExpected := []float64{99, nan, 44, nan, 32, 34, nan}
timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -513,7 +606,7 @@ func TestRollupFuncsLookbackDelta(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{99, 12, 44, 44, 32, 34, nan}
valuesExpected := []float64{99, nan, 44, nan, 32, 34, nan}
timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -527,7 +620,7 @@ func TestRollupFuncsLookbackDelta(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{34, 12, 12, 44, 44, 34, nan}
valuesExpected := []float64{99, nan, 44, nan, 32, 34, nan}
timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -544,7 +637,7 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, 123, 21, 12, 34}
valuesExpected := []float64{nan, 123, 54, 44, 34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -572,7 +665,7 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, 21, 12, 12, 34}
valuesExpected := []float64{nan, 21, 12, 32, 34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -614,7 +707,7 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, -102, -9, 22, 0}
valuesExpected := []float64{nan, nan, -9, 22, 0}
timestampsExpected := []int64{0, 40, 80, 120, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -772,6 +865,20 @@ func TestRollupFuncsNoWindow(t *testing.T) {
timestampsExpected := []int64{0, 40, 80, 120, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
t.Run("deriv_fast", func(t *testing.T) {
rc := rollupConfig{
Func: rollupDerivFast,
Start: 0,
End: 20,
Step: 4,
Window: 0,
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, nan, nan, 0, -8900, 0}
timestampsExpected := []int64{0, 4, 8, 12, 16, 20}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
t.Run("ideriv", func(t *testing.T) {
rc := rollupConfig{
Func: rollupIderiv,
@@ -810,7 +917,7 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, 4.6035, 4.3934999999999995, 2.166, 0.34}
valuesExpected := []float64{nan, 1.526, 2.2795, 1.325, 0.34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -844,6 +951,27 @@ func TestRollupFuncsNoWindow(t *testing.T) {
})
}
func TestRollupBigNumberOfValues(t *testing.T) {
const srcValuesCount = 1e4
rc := rollupConfig{
Func: rollupDefault,
End: srcValuesCount,
Step: srcValuesCount / 5,
Window: srcValuesCount / 4,
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
srcValues := make([]float64, srcValuesCount)
srcTimestamps := make([]int64, srcValuesCount)
for i := 0; i < srcValuesCount; i++ {
srcValues[i] = float64(i)
srcTimestamps[i] = int64(i / 2)
}
values := rc.Do(nil, srcValues, srcTimestamps)
valuesExpected := []float64{1, 4001, 8001, 9999, nan, nan}
timestampsExpected := []int64{0, 2000, 4000, 6000, 8000, 10000}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
}
func testRowsEqual(t *testing.T, values []float64, timestamps []int64, valuesExpected []float64, timestampsExpected []int64) {
t.Helper()
if len(values) != len(valuesExpected) {

View File

@@ -288,7 +288,6 @@ func marshalMetricTagsFast(dst []byte, tags []storage.Tag) []byte {
}
func marshalMetricNameSorted(dst []byte, mn *storage.MetricName) []byte {
// Do not marshal AccountID and ProjectID, since they are unused.
dst = marshalBytesFast(dst, mn.MetricGroup)
sortMetricTags(mn.Tags)
dst = marshalMetricTagsFast(dst, mn.Tags)

View File

@@ -12,6 +12,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/metricsql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/valyala/histogram"
)
@@ -64,9 +65,12 @@ var transformFuncs = map[string]transformFunc{
"label_move": transformLabelMove,
"label_transform": transformLabelTransform,
"label_value": transformLabelValue,
"label_match": transformLabelMatch,
"label_mismatch": transformLabelMismatch,
"union": transformUnion,
"": transformUnion, // empty func is a synonim to union
"keep_last_value": transformKeepLastValue,
"keep_next_value": transformKeepNextValue,
"start": newTransformFuncZeroArgs(transformStart),
"end": newTransformFuncZeroArgs(transformEnd),
"step": newTransformFuncZeroArgs(transformStep),
@@ -91,6 +95,10 @@ var transformFuncs = map[string]transformFunc{
"cos": newTransformFuncOneArg(transformCos),
"asin": newTransformFuncOneArg(transformAsin),
"acos": newTransformFuncOneArg(transformAcos),
"prometheus_buckets": transformPrometheusBuckets,
"histogram_share": transformHistogramShare,
"sort_by_label": newTransformFuncSortByLabel(false),
"sort_by_label_desc": newTransformFuncSortByLabel(true),
}
func getTransformFunc(s string) transformFunc {
@@ -98,13 +106,9 @@ func getTransformFunc(s string) transformFunc {
return transformFuncs[s]
}
func isTransformFunc(s string) bool {
return getTransformFunc(s) != nil
}
type transformFuncArg struct {
ec *EvalConfig
fe *funcExpr
fe *metricsql.FuncExpr
args [][]*timeseries
}
@@ -125,7 +129,7 @@ func newTransformFuncOneArg(tf func(v float64) float64) transformFunc {
}
}
func doTransformValues(arg []*timeseries, tf func(values []float64), fe *funcExpr) ([]*timeseries, error) {
func doTransformValues(arg []*timeseries, tf func(values []float64), fe *metricsql.FuncExpr) ([]*timeseries, error) {
name := strings.ToLower(fe.Name)
keepMetricGroup := transformFuncsKeepMetricGroup[name]
for _, ts := range arg {
@@ -148,28 +152,10 @@ func transformAbsent(tfa *transformFuncArg) ([]*timeseries, error) {
return nil, err
}
arg := args[0]
if len(arg) == 0 {
// Copy tags from arg
rvs := evalNumber(tfa.ec, 1)
rv := rvs[0]
me, ok := tfa.fe.Args[0].(*metricExpr)
if !ok {
return rvs, nil
}
for i := range me.TagFilters {
tf := &me.TagFilters[i]
if len(tf.Key) == 0 {
continue
}
if tf.IsRegexp || tf.IsNegative {
continue
}
rv.MetricName.AddTagBytes(tf.Key, tf.Value)
}
rvs := getAbsentTimeseries(tfa.ec, tfa.fe.Args[0])
return rvs, nil
}
for _, ts := range arg {
ts.MetricName.ResetMetricGroup()
for i, v := range ts.Values {
@@ -184,6 +170,28 @@ func transformAbsent(tfa *transformFuncArg) ([]*timeseries, error) {
return arg, nil
}
func getAbsentTimeseries(ec *EvalConfig, arg metricsql.Expr) []*timeseries {
// Copy tags from arg
rvs := evalNumber(ec, 1)
rv := rvs[0]
me, ok := arg.(*metricsql.MetricExpr)
if !ok {
return rvs
}
tfs := toTagFilters(me.LabelFilters)
for i := range tfs {
tf := &tfs[i]
if len(tf.Key) == 0 {
continue
}
if tf.IsRegexp || tf.IsNegative {
continue
}
rv.MetricName.AddTagBytes(tf.Key, tf.Value)
}
return rvs
}
func transformCeil(v float64) float64 {
return math.Ceil(v)
}
@@ -272,24 +280,364 @@ func transformFloor(v float64) float64 {
return math.Floor(v)
}
func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
func transformPrometheusBuckets(tfa *transformFuncArg) ([]*timeseries, error) {
args := tfa.args
if err := expectTransformArgsNum(args, 2); err != nil {
return nil, err
}
phis, err := getScalar(args[0], 0)
if err != nil {
if err := expectTransformArgsNum(args, 1); err != nil {
return nil, err
}
rvs := vmrangeBucketsToLE(args[0])
return rvs, nil
}
// Group metrics by all tags excluding "le"
func vmrangeBucketsToLE(tss []*timeseries) []*timeseries {
rvs := make([]*timeseries, 0, len(tss))
// Group timeseries by MetricGroup+tags excluding `vmrange` tag.
type x struct {
le float64
ts *timeseries
startStr string
endStr string
start float64
end float64
ts *timeseries
}
m := make(map[string][]x)
bb := bbPool.Get()
for _, ts := range args[1] {
defer bbPool.Put(bb)
for _, ts := range tss {
vmrange := ts.MetricName.GetTagValue("vmrange")
if len(vmrange) == 0 {
if le := ts.MetricName.GetTagValue("le"); len(le) > 0 {
// Keep Prometheus-compatible buckets.
rvs = append(rvs, ts)
}
continue
}
n := strings.Index(bytesutil.ToUnsafeString(vmrange), "...")
if n < 0 {
continue
}
startStr := string(vmrange[:n])
start, err := strconv.ParseFloat(startStr, 64)
if err != nil {
continue
}
endStr := string(vmrange[n+len("..."):])
end, err := strconv.ParseFloat(endStr, 64)
if err != nil {
continue
}
ts.MetricName.RemoveTag("le")
ts.MetricName.RemoveTag("vmrange")
bb.B = marshalMetricNameSorted(bb.B[:0], &ts.MetricName)
m[string(bb.B)] = append(m[string(bb.B)], x{
startStr: startStr,
endStr: endStr,
start: start,
end: end,
ts: ts,
})
}
// Convert `vmrange` label in each group of time series to `le` label.
copyTS := func(src *timeseries, leStr string) *timeseries {
var ts timeseries
ts.CopyFromShallowTimestamps(src)
values := ts.Values
for i := range values {
values[i] = 0
}
ts.MetricName.RemoveTag("le")
ts.MetricName.AddTag("le", leStr)
return &ts
}
isZeroTS := func(ts *timeseries) bool {
for _, v := range ts.Values {
if v > 0 {
return false
}
}
return true
}
for _, xss := range m {
sort.Slice(xss, func(i, j int) bool { return xss[i].end < xss[j].end })
xssNew := make([]x, 0, len(xss)+2)
var xsPrev x
for _, xs := range xss {
ts := xs.ts
if isZeroTS(ts) {
// Skip time series with zeros. They are substituted by xssNew below.
xsPrev = xs
continue
}
if xs.start != xsPrev.end {
xssNew = append(xssNew, x{
endStr: xs.startStr,
end: xs.start,
ts: copyTS(ts, xs.startStr),
})
}
ts.MetricName.AddTag("le", xs.endStr)
xssNew = append(xssNew, xs)
xsPrev = xs
}
if !math.IsInf(xsPrev.end, 1) {
xssNew = append(xssNew, x{
endStr: "+Inf",
end: math.Inf(1),
ts: copyTS(xsPrev.ts, "+Inf"),
})
}
xss = xssNew
for i := range xss[0].ts.Values {
count := float64(0)
for _, xs := range xss {
ts := xs.ts
v := ts.Values[i]
if !math.IsNaN(v) && v > 0 {
count += v
}
ts.Values[i] = count
}
}
for _, xs := range xss {
rvs = append(rvs, xs.ts)
}
}
return rvs
}
func transformHistogramShare(tfa *transformFuncArg) ([]*timeseries, error) {
args := tfa.args
if len(args) < 2 || len(args) > 3 {
return nil, fmt.Errorf("unexpected number of args; got %d; want 2...3", len(args))
}
les, err := getScalar(args[0], 0)
if err != nil {
return nil, fmt.Errorf("cannot parse le: %s", err)
}
// Convert buckets with `vmrange` labels to buckets with `le` labels.
tss := vmrangeBucketsToLE(args[1])
// Parse boundsLabel. See https://github.com/prometheus/prometheus/issues/5706 for details.
var boundsLabel string
if len(args) > 2 {
s, err := getString(args[2], 2)
if err != nil {
return nil, fmt.Errorf("cannot parse boundsLabel (arg #3): %s", err)
}
boundsLabel = s
}
// Group metrics by all tags excluding "le"
m := groupLeTimeseries(tss)
// Calculate share for les
share := func(i int, les []float64, xss []leTimeseries) (q, lower, upper float64) {
leReq := les[i]
if math.IsNaN(leReq) || len(xss) == 0 {
return nan, nan, nan
}
fixBrokenBuckets(i, xss)
if leReq < 0 {
return 0, 0, 0
}
if math.IsInf(leReq, 1) {
return 1, 1, 1
}
var vPrev, lePrev float64
for _, xs := range xss {
v := xs.ts.Values[i]
le := xs.le
if leReq >= le {
vPrev = v
lePrev = le
continue
}
// precondition: lePrev <= leReq < le
vLast := xss[len(xss)-1].ts.Values[i]
lower = vPrev / vLast
if math.IsInf(le, 1) {
return lower, lower, 1
}
if lePrev == leReq {
return lower, lower, lower
}
upper = v / vLast
q = lower + (v-vPrev)/vLast*(leReq-lePrev)/(le-lePrev)
return q, lower, upper
}
// precondition: leReq > leLast
return 1, 1, 1
}
rvs := make([]*timeseries, 0, len(m))
for _, xss := range m {
sort.Slice(xss, func(i, j int) bool {
return xss[i].le < xss[j].le
})
dst := xss[0].ts
var tsLower, tsUpper *timeseries
if len(boundsLabel) > 0 {
tsLower = &timeseries{}
tsLower.CopyFromShallowTimestamps(dst)
tsLower.MetricName.RemoveTag(boundsLabel)
tsLower.MetricName.AddTag(boundsLabel, "lower")
tsUpper = &timeseries{}
tsUpper.CopyFromShallowTimestamps(dst)
tsUpper.MetricName.RemoveTag(boundsLabel)
tsUpper.MetricName.AddTag(boundsLabel, "upper")
}
for i := range dst.Values {
q, lower, upper := share(i, les, xss)
dst.Values[i] = q
if len(boundsLabel) > 0 {
tsLower.Values[i] = lower
tsUpper.Values[i] = upper
}
}
rvs = append(rvs, dst)
if len(boundsLabel) > 0 {
rvs = append(rvs, tsLower)
rvs = append(rvs, tsUpper)
}
}
return rvs, nil
}
func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
args := tfa.args
if len(args) < 2 || len(args) > 3 {
return nil, fmt.Errorf("unexpected number of args; got %d; want 2...3", len(args))
}
phis, err := getScalar(args[0], 0)
if err != nil {
return nil, fmt.Errorf("cannot parse phi: %s", err)
}
// Convert buckets with `vmrange` labels to buckets with `le` labels.
tss := vmrangeBucketsToLE(args[1])
// Parse boundsLabel. See https://github.com/prometheus/prometheus/issues/5706 for details.
var boundsLabel string
if len(args) > 2 {
s, err := getString(args[2], 2)
if err != nil {
return nil, fmt.Errorf("cannot parse boundsLabel (arg #3): %s", err)
}
boundsLabel = s
}
// Group metrics by all tags excluding "le"
m := groupLeTimeseries(tss)
// Calculate quantile for each group in m
lastNonInf := func(i int, xss []leTimeseries) float64 {
for len(xss) > 0 {
xsLast := xss[len(xss)-1]
v := xsLast.ts.Values[i]
if v == 0 {
return nan
}
if !math.IsInf(xsLast.le, 0) {
return xsLast.le
}
xss = xss[:len(xss)-1]
}
return nan
}
quantile := func(i int, phis []float64, xss []leTimeseries) (q, lower, upper float64) {
phi := phis[i]
if math.IsNaN(phi) {
return nan, nan, nan
}
fixBrokenBuckets(i, xss)
vLast := float64(0)
if len(xss) > 0 {
vLast = xss[len(xss)-1].ts.Values[i]
}
if vLast == 0 {
return nan, nan, nan
}
if phi < 0 {
return -inf, -inf, xss[0].ts.Values[i]
}
if phi > 1 {
return inf, vLast, inf
}
vReq := vLast * phi
vPrev := float64(0)
lePrev := float64(0)
for _, xs := range xss {
v := xs.ts.Values[i]
le := xs.le
if v <= 0 {
// Skip zero buckets.
lePrev = le
continue
}
if v < vReq {
vPrev = v
lePrev = le
continue
}
if math.IsInf(le, 0) {
vv := lastNonInf(i, xss)
return vv, vv, inf
}
if v == vPrev {
return lePrev, lePrev, v
}
vv := lePrev + (le-lePrev)*(vReq-vPrev)/(v-vPrev)
return vv, lePrev, le
}
vv := lastNonInf(i, xss)
return vv, vv, inf
}
rvs := make([]*timeseries, 0, len(m))
for _, xss := range m {
sort.Slice(xss, func(i, j int) bool {
return xss[i].le < xss[j].le
})
dst := xss[0].ts
var tsLower, tsUpper *timeseries
if len(boundsLabel) > 0 {
tsLower = &timeseries{}
tsLower.CopyFromShallowTimestamps(dst)
tsLower.MetricName.RemoveTag(boundsLabel)
tsLower.MetricName.AddTag(boundsLabel, "lower")
tsUpper = &timeseries{}
tsUpper.CopyFromShallowTimestamps(dst)
tsUpper.MetricName.RemoveTag(boundsLabel)
tsUpper.MetricName.AddTag(boundsLabel, "upper")
}
for i := range dst.Values {
v, lower, upper := quantile(i, phis, xss)
dst.Values[i] = v
if len(boundsLabel) > 0 {
tsLower.Values[i] = lower
tsUpper.Values[i] = upper
}
}
rvs = append(rvs, dst)
if len(boundsLabel) > 0 {
rvs = append(rvs, tsLower)
rvs = append(rvs, tsUpper)
}
}
return rvs, nil
}
type leTimeseries struct {
le float64
ts *timeseries
}
func groupLeTimeseries(tss []*timeseries) map[string][]leTimeseries {
m := make(map[string][]leTimeseries)
bb := bbPool.Get()
for _, ts := range tss {
tagValue := ts.MetricName.GetTagValue("le")
if len(tagValue) == 0 {
continue
@@ -301,95 +649,28 @@ func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
ts.MetricName.ResetMetricGroup()
ts.MetricName.RemoveTag("le")
bb.B = marshalMetricTagsSorted(bb.B[:0], &ts.MetricName)
m[string(bb.B)] = append(m[string(bb.B)], x{
m[string(bb.B)] = append(m[string(bb.B)], leTimeseries{
le: le,
ts: ts,
})
}
bbPool.Put(bb)
return m
}
// Calculate quantile for each group in m
lastNonInf := func(i int, xss []x) float64 {
for len(xss) > 0 {
xsLast := xss[len(xss)-1]
if xsLast.ts.Values[i] == 0 {
return nan
}
if !math.IsInf(xsLast.le, 0) {
break
}
xss = xss[:len(xss)-1]
func fixBrokenBuckets(i int, xss []leTimeseries) {
// Fix broken buckets.
// They are already sorted by le, so their values must be in ascending order,
// since the next bucket includes all the previous buckets.
vPrev := float64(0)
for _, xs := range xss {
v := xs.ts.Values[i]
if v < vPrev || math.IsNaN(v) {
xs.ts.Values[i] = vPrev
} else {
vPrev = v
}
if len(xss) == 0 {
return nan
}
return xss[len(xss)-1].le
}
quantile := func(i int, phis []float64, xss []x) float64 {
phi := phis[i]
if math.IsNaN(phi) {
return nan
}
// Fix broken buckets.
// They are already sorted by le, so their values must be in ascending order,
// since the next bucket value includes all the previous buckets.
vPrev := float64(0)
for _, xs := range xss {
v := xs.ts.Values[i]
if math.IsNaN(v) || v < vPrev {
xs.ts.Values[i] = vPrev
} else {
vPrev = v
}
}
if len(xss) == 0 {
return nan
}
if phi < 0 {
return -inf
}
if phi > 1 {
return inf
}
vLast := xss[len(xss)-1].ts.Values[i]
if vLast == 0 {
return nan
}
vReq := vLast * phi
vPrev = 0
lePrev := float64(0)
for _, xs := range xss {
v := xs.ts.Values[i]
le := xs.le
if v < vReq {
vPrev = v
lePrev = le
continue
}
if math.IsInf(le, 0) {
return lastNonInf(i, xss)
}
if v == vPrev {
return lePrev
}
return lePrev + (le-lePrev)*(vReq-vPrev)/(v-vPrev)
}
return lastNonInf(i, xss)
}
rvs := make([]*timeseries, 0, len(m))
for _, xss := range m {
sort.Slice(xss, func(i, j int) bool {
return xss[i].le < xss[j].le
})
dst := xss[0].ts
for i := range dst.Values {
dst.Values[i] = quantile(i, phis, xss)
}
rvs = append(rvs, dst)
}
return rvs, nil
}
func transformHour(t time.Time) int {
@@ -446,13 +727,37 @@ func transformKeepLastValue(tfa *transformFuncArg) ([]*timeseries, error) {
if len(values) == 0 {
continue
}
prevValue := values[0]
lastValue := values[0]
for i, v := range values {
if math.IsNaN(v) {
v = prevValue
if !math.IsNaN(v) {
lastValue = v
continue
}
values[i] = v
prevValue = v
values[i] = lastValue
}
}
return rvs, nil
}
func transformKeepNextValue(tfa *transformFuncArg) ([]*timeseries, error) {
args := tfa.args
if err := expectTransformArgsNum(args, 1); err != nil {
return nil, err
}
rvs := args[0]
for _, ts := range rvs {
values := ts.Values
if len(values) == 0 {
continue
}
nextValue := values[len(values)-1]
for i := len(values) - 1; i >= 0; i-- {
v := values[i]
if !math.IsNaN(v) {
nextValue = v
continue
}
values[i] = nextValue
}
}
return rvs, nil
@@ -517,10 +822,6 @@ func transformRangeQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
hf.Reset()
lastIdx := -1
values := ts.Values
if len(values) > 0 {
// Ignore the last value. See Exec func for details.
values = values[:len(values)-1]
}
for i, v := range values {
if math.IsNaN(v) {
continue
@@ -571,14 +872,7 @@ func transformRangeLast(tfa *transformFuncArg) ([]*timeseries, error) {
func setLastValues(tss []*timeseries) {
for _, ts := range tss {
values := ts.Values
if len(values) < 2 {
continue
}
// Do not take into account the last value, since it shouldn't be included
// in the range. See Exec func for details.
values = values[:len(values)-1]
values = skipTrailingNaNs(values)
values := skipTrailingNaNs(ts.Values)
if len(values) == 0 {
continue
}
@@ -850,7 +1144,7 @@ func transformLabelTransform(tfa *transformFuncArg) ([]*timeseries, error) {
return nil, err
}
r, err := compileRegexp(regex)
r, err := metricsql.CompileRegexp(regex)
if err != nil {
return nil, fmt.Errorf(`cannot compile regex %q: %s`, regex, err)
}
@@ -879,7 +1173,7 @@ func transformLabelReplace(tfa *transformFuncArg) ([]*timeseries, error) {
return nil, err
}
r, err := compileRegexpAnchored(regex)
r, err := metricsql.CompileRegexpAnchored(regex)
if err != nil {
return nil, fmt.Errorf(`cannot compile regex %q: %s`, regex, err)
}
@@ -928,6 +1222,62 @@ func transformLabelValue(tfa *transformFuncArg) ([]*timeseries, error) {
return rvs, nil
}
func transformLabelMatch(tfa *transformFuncArg) ([]*timeseries, error) {
args := tfa.args
if err := expectTransformArgsNum(args, 3); err != nil {
return nil, err
}
labelName, err := getString(args[1], 1)
if err != nil {
return nil, fmt.Errorf("cannot get label name: %s", err)
}
labelRe, err := getString(args[2], 2)
if err != nil {
return nil, fmt.Errorf("cannot get regexp: %s", err)
}
r, err := metricsql.CompileRegexpAnchored(labelRe)
if err != nil {
return nil, fmt.Errorf(`cannot compile regexp %q: %s`, labelRe, err)
}
tss := args[0]
rvs := tss[:0]
for _, ts := range tss {
labelValue := ts.MetricName.GetTagValue(labelName)
if r.Match(labelValue) {
rvs = append(rvs, ts)
}
}
return rvs, nil
}
func transformLabelMismatch(tfa *transformFuncArg) ([]*timeseries, error) {
args := tfa.args
if err := expectTransformArgsNum(args, 3); err != nil {
return nil, err
}
labelName, err := getString(args[1], 1)
if err != nil {
return nil, fmt.Errorf("cannot get label name: %s", err)
}
labelRe, err := getString(args[2], 2)
if err != nil {
return nil, fmt.Errorf("cannot get regexp: %s", err)
}
r, err := metricsql.CompileRegexpAnchored(labelRe)
if err != nil {
return nil, fmt.Errorf(`cannot compile regexp %q: %s`, labelRe, err)
}
tss := args[0]
rvs := tss[:0]
for _, ts := range tss {
labelValue := ts.MetricName.GetTagValue(labelName)
if !r.Match(labelValue) {
rvs = append(rvs, ts)
}
}
return rvs, nil
}
func transformLn(v float64) float64 {
return math.Log(v)
}
@@ -990,7 +1340,7 @@ func transformScalar(tfa *transformFuncArg) ([]*timeseries, error) {
// Verify whether the arg is a string.
// Then try converting the string to number.
if se, ok := tfa.fe.Args[0].(*stringExpr); ok {
if se, ok := tfa.fe.Args[0].(*metricsql.StringExpr); ok {
n, err := strconv.ParseFloat(se.S, 64)
if err != nil {
n = nan
@@ -1007,6 +1357,29 @@ func transformScalar(tfa *transformFuncArg) ([]*timeseries, error) {
return arg, nil
}
func newTransformFuncSortByLabel(isDesc bool) transformFunc {
return func(tfa *transformFuncArg) ([]*timeseries, error) {
args := tfa.args
if err := expectTransformArgsNum(args, 2); err != nil {
return nil, err
}
label, err := getString(args[1], 1)
if err != nil {
return nil, fmt.Errorf("cannot parse label name for sorting: %s", err)
}
rvs := args[0]
sort.SliceStable(rvs, func(i, j int) bool {
a := rvs[i].MetricName.GetTagValue(label)
b := rvs[j].MetricName.GetTagValue(label)
if isDesc {
return string(b) < string(a)
}
return string(a) < string(b)
})
return rvs, nil
}
}
func newTransformFuncSort(isDesc bool) transformFunc {
return func(tfa *transformFuncArg) ([]*timeseries, error) {
args := tfa.args
@@ -1019,7 +1392,7 @@ func newTransformFuncSort(isDesc bool) transformFunc {
b := rvs[j].Values
n := len(a) - 1
for n >= 0 {
if !math.IsNaN(a[n]) && !math.IsNaN(b[n]) {
if !math.IsNaN(a[n]) && !math.IsNaN(b[n]) && a[n] != b[n] {
break
}
n--
@@ -1027,11 +1400,10 @@ func newTransformFuncSort(isDesc bool) transformFunc {
if n < 0 {
return false
}
cmp := a[n] < b[n]
if isDesc {
cmp = !cmp
return b[n] < a[n]
}
return cmp
return a[n] < b[n]
})
return rvs, nil
}
@@ -1162,9 +1534,7 @@ func transformStart(tfa *transformFuncArg) float64 {
}
func transformEnd(tfa *transformFuncArg) float64 {
// Subtract step from end, since it shouldn't go to the range.
// See Exec func for details.
return float64(tfa.ec.End-tfa.ec.Step) * 1e-3
return float64(tfa.ec.End) * 1e-3
}
// copyTimeseriesMetricNames returns a copy of arg with real copy of MetricNames,

View File

@@ -62,8 +62,8 @@ func InitWithoutMetrics() {
blocksCount := tm.SmallBlocksCount + tm.BigBlocksCount
rowsCount := tm.SmallRowsCount + tm.BigRowsCount
sizeBytes := tm.SmallSizeBytes + tm.BigSizeBytes
logger.Infof("successfully opened storage %q in %s; partsCount: %d; blocksCount: %d; rowsCount: %d; sizeBytes: %d",
*DataPath, time.Since(startTime), partsCount, blocksCount, rowsCount, sizeBytes)
logger.Infof("successfully opened storage %q in %.3f seconds; partsCount: %d; blocksCount: %d; rowsCount: %d; sizeBytes: %d",
*DataPath, time.Since(startTime).Seconds(), partsCount, blocksCount, rowsCount, sizeBytes)
}
// Storage is a storage.
@@ -133,7 +133,7 @@ func Stop() {
startTime := time.Now()
WG.WaitAndBlock()
Storage.MustClose()
logger.Infof("successfully closed the storage in %s", time.Since(startTime))
logger.Infof("successfully closed the storage in %.3f seconds", time.Since(startTime).Seconds())
logger.Infof("the storage has been stopped")
}
@@ -305,6 +305,9 @@ func registerStorageMetrics() {
return float64(idbm().PartsRefCount)
})
metrics.NewGauge(`vm_new_timeseries_created_total`, func() float64 {
return float64(idbm().NewTimeseriesCreated)
})
metrics.NewGauge(`vm_missing_tsids_for_metric_id_total`, func() float64 {
return float64(idbm().MissingTSIDsForMetricID)
})
@@ -320,6 +323,12 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_date_metric_ids_search_hits_total`, func() float64 {
return float64(idbm().DateMetricIDsSearchHits)
})
metrics.NewGauge(`vm_index_blocks_with_metric_ids_processed_total`, func() float64 {
return float64(idbm().IndexBlocksWithMetricIDsProcessed)
})
metrics.NewGauge(`vm_index_blocks_with_metric_ids_incorrect_order_total`, func() float64 {
return float64(idbm().IndexBlocksWithMetricIDsIncorrectOrder)
})
metrics.NewGauge(`vm_assisted_merges_total{type="storage/small"}`, func() float64 {
return float64(tm().SmallAssistedMerges)
@@ -398,6 +407,24 @@ func registerStorageMetrics() {
return float64(idbm().ItemsCount)
})
metrics.NewGauge(`vm_date_range_search_calls_total`, func() float64 {
return float64(idbm().DateRangeSearchCalls)
})
metrics.NewGauge(`vm_date_range_hits_total`, func() float64 {
return float64(idbm().DateRangeSearchHits)
})
metrics.NewGauge(`vm_missing_metric_names_for_metric_id_total`, func() float64 {
return float64(idbm().MissingMetricNamesForMetricID)
})
metrics.NewGauge(`vm_date_metric_id_cache_syncs_total`, func() float64 {
return float64(m().DateMetricIDCacheSyncsCount)
})
metrics.NewGauge(`vm_date_metric_id_cache_resets_total`, func() float64 {
return float64(m().DateMetricIDCacheResetsCount)
})
metrics.NewGauge(`vm_cache_entries{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheSize)
})
@@ -434,6 +461,9 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_entries{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheSize())
})
metrics.NewGauge(`vm_cache_size_entries{type="storage/prefetchedMetricIDs"}`, func() float64 {
return float64(m().PrefetchedMetricIDsSize)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheSizeBytes)
@@ -447,12 +477,18 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_size_bytes{type="storage/date_metricID"}`, func() float64 {
return float64(m().DateMetricIDCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/hour_metric_ids"}`, func() float64 {
return float64(m().HourMetricIDCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="indexdb/tagFilters"}`, func() float64 {
return float64(idbm().TagCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="indexdb/uselessTagFilters"}`, func() float64 {
return float64(idbm().UselessTagFiltersCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/prefetchedMetricIDs"}`, func() float64 {
return float64(m().PrefetchedMetricIDsSizeBytes)
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheRequests)
@@ -463,9 +499,6 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_requests_total{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/date_metricID"}`, func() float64 {
return float64(m().DateMetricIDCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/bigIndexBlocks"}`, func() float64 {
return float64(tm().BigIndexBlocksCacheRequests)
})
@@ -497,9 +530,6 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_misses_total{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/date_metricID"}`, func() float64 {
return float64(m().DateMetricIDCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/bigIndexBlocks"}`, func() float64 {
return float64(tm().BigIndexBlocksCacheMisses)
})
@@ -532,7 +562,4 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_collisions_total{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheCollisions)
})
metrics.NewGauge(`vm_cache_collisions_total{type="storage/date_metricID"}`, func() float64 {
return float64(m().DateMetricIDCacheCollisions)
})
}

View File

@@ -14,7 +14,7 @@
"type": "grafana",
"id": "grafana",
"name": "Grafana",
"version": "6.3.5"
"version": "6.5.2"
},
{
"type": "panel",
@@ -60,12 +60,12 @@
}
]
},
"description": "Overview for single node VictoriaMetrics v1.28.0 or higher",
"description": "Overview for single node VictoriaMetrics v1.32.8 or higher",
"editable": true,
"gnetId": 10229,
"graphTooltip": 0,
"id": null,
"iteration": 1572208904768,
"iteration": 1580634216692,
"links": [
{
"icon": "doc",
@@ -96,6 +96,7 @@
"panels": [
{
"collapsed": false,
"datasource": "${DS_PROMETHEUS}",
"gridPos": {
"h": 1,
"w": 24,
@@ -109,6 +110,7 @@
},
{
"content": "<div style=\"text-align: center; font-size: 2em\">$version</div>",
"datasource": "${DS_PROMETHEUS}",
"description": "",
"gridPos": {
"h": 3,
@@ -470,6 +472,7 @@
},
{
"collapsed": false,
"datasource": "${DS_PROMETHEUS}",
"gridPos": {
"h": 1,
"w": 24,
@@ -496,6 +499,7 @@
"x": 0,
"y": 11
},
"hiddenSeries": false,
"id": 12,
"legend": {
"alignAsTable": true,
@@ -589,6 +593,7 @@
"x": 12,
"y": 11
},
"hiddenSeries": false,
"id": 22,
"legend": {
"alignAsTable": true,
@@ -682,6 +687,7 @@
"x": 0,
"y": 19
},
"hiddenSeries": false,
"id": 51,
"legend": {
"avg": false,
@@ -778,6 +784,7 @@
"x": 12,
"y": 19
},
"hiddenSeries": false,
"id": 33,
"legend": {
"avg": false,
@@ -873,7 +880,7 @@
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"description": "* `*` - unsupported query path\n* `/write` - insert into VM\n* `/metrics` - query VM system metrics\n* `/query` - query instant values\n* `/query_range` - query over a range of time\n* `/series` - match a certain label set\n* `/label/{}/values` - query a list of label values (variables mostly)",
"description": "Shows how many of new time-series are created every second. High churn rate tightly connected with database performance and may result in unexpected OOM's or slow queries. It is recommended to always keep an eye on this metric to avoid unexpected cardinality \"explosions\".\n\nGood references to read:\n* https://www.robustperception.io/cardinality-is-key\n* https://www.robustperception.io/using-tsdb-analyze-to-investigate-churn-and-cardinality",
"fill": 1,
"fillGradient": 0,
"gridPos": {
@@ -882,6 +889,95 @@
"x": 0,
"y": 27
},
"hiddenSeries": false,
"id": 66,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"options": {
"dataLinks": []
},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(vm_new_timeseries_created_total{job=\"$job\"}[5m]))",
"legendFormat": "churn rate",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Churn rate",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"description": "* `*` - unsupported query path\n* `/write` - insert into VM\n* `/metrics` - query VM system metrics\n* `/query` - query instant values\n* `/query_range` - query over a range of time\n* `/series` - match a certain label set\n* `/label/{}/values` - query a list of label values (variables mostly)",
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 27
},
"hiddenSeries": false,
"id": 35,
"legend": {
"alignAsTable": true,
@@ -960,6 +1056,99 @@
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"description": "Slow queries according to `search.logSlowQueryDuration` flag, which is `5s` by default.",
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 35
},
"hiddenSeries": false,
"id": 60,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": false,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"dataLinks": []
},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(vm_slow_queries_total{job=\"$job\"}[5m]))",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
"legendFormat": "slow queries rate",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Slow queries",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"decimals": null,
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
@@ -973,8 +1162,9 @@
"h": 8,
"w": 12,
"x": 12,
"y": 27
"y": 35
},
"hiddenSeries": false,
"id": 59,
"legend": {
"alignAsTable": true,
@@ -1082,8 +1272,9 @@
"h": 8,
"w": 12,
"x": 0,
"y": 35
"y": 43
},
"hiddenSeries": false,
"id": 37,
"legend": {
"avg": false,
@@ -1174,8 +1365,9 @@
"h": 8,
"w": 12,
"x": 12,
"y": 35
"y": 43
},
"hiddenSeries": false,
"id": 49,
"legend": {
"avg": false,
@@ -1255,11 +1447,12 @@
},
{
"collapsed": false,
"datasource": "${DS_PROMETHEUS}",
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 43
"y": 51
},
"id": 14,
"panels": [],
@@ -1279,8 +1472,9 @@
"h": 8,
"w": 12,
"x": 0,
"y": 44
"y": 52
},
"hiddenSeries": false,
"id": 10,
"legend": {
"alignAsTable": true,
@@ -1371,15 +1565,16 @@
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"description": "How many datapoints are in RAM queue waiting to be written into storage. The less is better.",
"description": "How many datapoints are in RAM queue waiting to be written into storage. The number of pending data points should be in the range from 0 to `2*<ingestion_rate>`, since VictoriaMetrics pushes pending data to persistent storage every second.",
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 44
"y": 52
},
"hiddenSeries": false,
"id": 34,
"legend": {
"avg": false,
@@ -1483,8 +1678,9 @@
"h": 8,
"w": 12,
"x": 0,
"y": 52
"y": 60
},
"hiddenSeries": false,
"id": 30,
"legend": {
"avg": false,
@@ -1573,8 +1769,9 @@
"h": 8,
"w": 12,
"x": 12,
"y": 52
"y": 60
},
"hiddenSeries": false,
"id": 36,
"legend": {
"avg": false,
@@ -1663,8 +1860,9 @@
"h": 8,
"w": 12,
"x": 0,
"y": 60
"y": 68
},
"hiddenSeries": false,
"id": 53,
"legend": {
"avg": false,
@@ -1753,8 +1951,9 @@
"h": 8,
"w": 12,
"x": 12,
"y": 60
"y": 68
},
"hiddenSeries": false,
"id": 55,
"legend": {
"avg": false,
@@ -1829,6 +2028,184 @@
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"description": "The number of on-going merges in storage nodes. It is expected to have high numbers for `storage/small` metric.",
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 76
},
"hiddenSeries": false,
"id": 62,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"options": {
"dataLinks": []
},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(vm_active_merges{job=\"$job\"}) by(type)",
"legendFormat": "{{type}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Active merges",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"decimals": 0,
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"description": "The number of rows merged per second by storage nodes.",
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 76
},
"hiddenSeries": false,
"id": 64,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"options": {
"dataLinks": []
},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(vm_rows_merged_total{job=\"$job\"}[5m])) by(type)",
"legendFormat": "{{type}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Merge speed",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"decimals": 0,
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
@@ -1842,8 +2219,9 @@
"h": 8,
"w": 12,
"x": 0,
"y": 68
"y": 84
},
"hiddenSeries": false,
"id": 58,
"legend": {
"avg": false,
@@ -1921,13 +2299,107 @@
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"description": "Shows the rate of logging the messages by their level. Unexpected spike in rate is a good reason to check logs.",
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 84
},
"hiddenSeries": false,
"id": 67,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"dataLinks": []
},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(vm_log_messages_total{job=\"$job\"}[5m])) by (level) ",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
"legendFormat": "{{level}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Logging rate",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"decimals": null,
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"collapsed": false,
"datasource": "${DS_PROMETHEUS}",
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 76
"y": 92
},
"id": 46,
"panels": [],
@@ -1947,8 +2419,9 @@
"h": 8,
"w": 12,
"x": 0,
"y": 77
"y": 93
},
"hiddenSeries": false,
"id": 44,
"legend": {
"avg": false,
@@ -2046,14 +2519,16 @@
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 77
"y": 93
},
"hiddenSeries": false,
"id": 57,
"legend": {
"avg": false,
@@ -2141,8 +2616,9 @@
"h": 8,
"w": 12,
"x": 0,
"y": 85
"y": 101
},
"hiddenSeries": false,
"id": 47,
"legend": {
"avg": false,
@@ -2231,8 +2707,9 @@
"h": 8,
"w": 12,
"x": 12,
"y": 85
"y": 101
},
"hiddenSeries": false,
"id": 42,
"legend": {
"avg": false,
@@ -2320,8 +2797,9 @@
"h": 8,
"w": 12,
"x": 0,
"y": 93
"y": 109
},
"hiddenSeries": false,
"id": 48,
"legend": {
"avg": false,
@@ -2400,7 +2878,7 @@
}
],
"refresh": "30s",
"schemaVersion": 19,
"schemaVersion": 21,
"style": "dark",
"tags": [],
"templating": {
@@ -2452,7 +2930,7 @@
]
},
"time": {
"from": "now-1h",
"from": "now-30m",
"to": "now"
},
"timepicker": {
@@ -2483,5 +2961,5 @@
"timezone": "",
"title": "VictoriaMetrics",
"uid": "wNf0q_kZk",
"version": 3
}
"version": 2
}

View File

@@ -1,13 +1,15 @@
DOCKER_NAMESPACE := victoriametrics
BUILDER_IMAGE := local/builder:go1.13.3
CERTS_IMAGE := local/certs:1.0.2
# All these commands must run from repository root.
DOCKER_NAMESPACE := docker.io/victoriametrics
BUILDER_IMAGE := local/builder:go1.13.8
CERTS_IMAGE := local/certs:1.0.3
package-certs:
(docker image ls --format '{{.Repository}}:{{.Tag}}' | grep -q '$(CERTS_IMAGE)') \
(docker image ls --format '{{.Repository}}:{{.Tag}}' | grep -q '$(CERTS_IMAGE)$$') \
|| docker build -t $(CERTS_IMAGE) deployment/docker/certs
package-builder:
(docker image ls --format '{{.Repository}}:{{.Tag}}' | grep -q '$(BUILDER_IMAGE)') \
(docker image ls --format '{{.Repository}}:{{.Tag}}' | grep -q '$(BUILDER_IMAGE)$$') \
|| docker build -t $(BUILDER_IMAGE) deployment/docker/builder
app-via-docker: package-certs package-builder
@@ -25,21 +27,118 @@ app-via-docker: package-certs package-builder
-o bin/$(APP_NAME)$(APP_SUFFIX)-prod $(PKG_PREFIX)/app/$(APP_NAME)
package-via-docker:
(docker image ls --format '{{.Repository}}:{{.Tag}}' | grep -q '$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE)') || (\
(docker image ls --format '{{.Repository}}:{{.Tag}}' | grep -q '$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(APP_SUFFIX)$(RACE)$$') || (\
$(MAKE) app-via-docker && \
docker build -t $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE) -f app/$(APP_NAME)/deployment/Dockerfile .)
docker build \
--build-arg src_binary=$(APP_NAME)$(APP_SUFFIX)-prod \
--build-arg certs_image=$(CERTS_IMAGE) \
-t $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(APP_SUFFIX)$(RACE) \
-f app/$(APP_NAME)/deployment/Dockerfile bin)
publish-via-docker: package-via-docker
docker push $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE)
docker tag $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE) $(DOCKER_NAMESPACE)/$(APP_NAME):latest
docker push $(DOCKER_NAMESPACE)/$(APP_NAME):latest
package-manifest: \
package-via-docker-amd64 \
package-via-docker-arm \
package-via-docker-arm64 \
package-via-docker-ppc64le \
package-via-docker-386
$(MAKE) package-manifest-internal
package-manifest-internal:
docker push $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-amd64$(RACE)
docker push $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-arm$(RACE)
docker push $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-arm64$(RACE)
docker push $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-ppc64le$(RACE)
docker push $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-386$(RACE)
DOCKER_CLI_EXPERIMENTAL=enabled docker manifest create --amend $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE) \
$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-amd64$(RACE) \
$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-arm$(RACE) \
$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-arm64$(RACE) \
$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-ppc64le$(RACE) \
$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-386$(RACE)
GOARCH=amd64 $(MAKE) package-manifest-annotate-goarch
GOARCH=arm $(MAKE) package-manifest-annotate-goarch
GOARCH=arm64 $(MAKE) package-manifest-annotate-goarch
GOARCH=ppc64le $(MAKE) package-manifest-annotate-goarch
GOARCH=386 $(MAKE) package-manifest-annotate-goarch
package-manifest-annotate-goarch:
DOCKER_CLI_EXPERIMENTAL=enabled docker manifest annotate $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE) \
$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-$(GOARCH)$(RACE) --os linux --arch $(GOARCH)
publish-via-docker: package-manifest
docker tag $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-amd64$(RACE) $(DOCKER_NAMESPACE)/$(APP_NAME):latest-amd64$(RACE)
docker tag $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-arm$(RACE) $(DOCKER_NAMESPACE)/$(APP_NAME):latest-arm$(RACE)
docker tag $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-arm64$(RACE) $(DOCKER_NAMESPACE)/$(APP_NAME):latest-arm64$(RACE)
docker tag $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-ppc64le$(RACE) $(DOCKER_NAMESPACE)/$(APP_NAME):latest-ppc64le$(RACE)
docker tag $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)-386$(RACE) $(DOCKER_NAMESPACE)/$(APP_NAME):latest-386$(RACE)
PKG_TAG=latest $(MAKE) package-manifest-internal
DOCKER_CLI_EXPERIMENTAL=enabled docker manifest push --purge $(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE)
DOCKER_CLI_EXPERIMENTAL=enabled docker manifest push --purge $(DOCKER_NAMESPACE)/$(APP_NAME):latest$(RACE)
run-via-docker: package-via-docker
docker run -it --rm \
--user $(shell id -u):$(shell id -g) \
--net host \
$(DOCKER_OPTS) \
$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE) $(ARGS)
$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(APP_SUFFIX)$(RACE) $(ARGS)
app-via-docker-goarch:
APP_SUFFIX='-$(GOARCH)' \
DOCKER_OPTS='--env CGO_ENABLED=$(CGO_ENABLED) --env GOOS=linux --env GOARCH=$(GOARCH)' \
$(MAKE) app-via-docker
app-via-docker-goarch-cgo:
CGO_ENABLED=1 $(MAKE) app-via-docker-goarch
app-via-docker-goarch-nocgo:
CGO_ENABLED=0 $(MAKE) app-via-docker-goarch
app-via-docker-pure:
APP_SUFFIX='-pure' DOCKER_OPTS='--env CGO_ENABLED=0' $(MAKE) app-via-docker
app-via-docker-amd64:
GOARCH=amd64 $(MAKE) app-via-docker-goarch-cgo
app-via-docker-arm:
GOARCH=arm $(MAKE) app-via-docker-goarch-nocgo
app-via-docker-arm64:
GOARCH=arm64 $(MAKE) app-via-docker-goarch-nocgo
app-via-docker-ppc64le:
GOARCH=ppc64le $(MAKE) app-via-docker-goarch-nocgo
app-via-docker-386:
GOARCH=386 $(MAKE) app-via-docker-goarch-nocgo
package-via-docker-goarch:
APP_SUFFIX='-$(GOARCH)' \
DOCKER_OPTS='--env CGO_ENABLED=$(CGO_ENABLED) --env GOOS=linux --env GOARCH=$(GOARCH)' \
$(MAKE) package-via-docker
package-via-docker-goarch-cgo:
CGO_ENABLED=1 $(MAKE) package-via-docker-goarch
package-via-docker-goarch-nocgo:
CGO_ENABLED=0 $(MAKE) package-via-docker-goarch
package-via-docker-pure:
APP_SUFFIX='-pure' DOCKER_OPTS='--env CGO_ENABLED=0' $(MAKE) package-via-docker
package-via-docker-amd64:
GOARCH=amd64 $(MAKE) package-via-docker-goarch-cgo
package-via-docker-arm:
GOARCH=arm $(MAKE) package-via-docker-goarch-nocgo
package-via-docker-arm64:
GOARCH=arm64 $(MAKE) package-via-docker-goarch-nocgo
package-via-docker-ppc64le:
GOARCH=ppc64le $(MAKE) package-via-docker-goarch-nocgo
package-via-docker-386:
GOARCH=386 $(MAKE) package-via-docker-goarch-nocgo
remove-docker-images:
docker image ls --format '{{.Repository}}\t{{.ID}}' | grep $(DOCKER_NAMESPACE)/ | grep -v /builder | awk '{print $$2}' | xargs docker image rm -f

View File

@@ -1,2 +1,2 @@
FROM golang:1.13.3
FROM golang:1.13.8
STOPSIGNAL SIGINT

View File

@@ -1,3 +1,3 @@
# See https://medium.com/on-docker/use-multi-stage-builds-to-inject-ca-certs-ad1e8f01de1b
FROM alpine:3.9 as certs
FROM alpine:3.10 as certs
RUN apk --update add ca-certificates

View File

@@ -2,7 +2,7 @@ version: '3.5'
services:
prometheus:
container_name: prometheus
image: prom/prometheus:v2.12.0
image: prom/prometheus:v2.15.2
depends_on:
- "victoriametrics"
ports:
@@ -35,7 +35,7 @@ services:
restart: always
grafana:
container_name: grafana
image: grafana/grafana:6.3.5
image: grafana/grafana:6.5.2
entrypoint: >
/bin/sh -c "
cd /var/lib/grafana &&

View File

@@ -5,10 +5,10 @@ datasources:
type: prometheus
access: proxy
url: http://prometheus:9090
isDefault: false
isDefault: true
- name: VictoriaMetrics
type: prometheus
access: proxy
url: http://victoriametrics:8428
isDefault: true
isDefault: false

23
docs/Articles.md Normal file
View File

@@ -0,0 +1,23 @@
# Articles
* [Open-sourcing VictoriaMetrics](https://medium.com/@valyala/open-sourcing-victoriametrics-f31e34485c2b)
* [How we created VictoriaMetrics](https://medium.com/devopslinks/victoriametrics-creating-the-best-remote-storage-for-prometheus-5d92d66787ac)
* [VictoriaMetrics vs TimescaleDB vs InfluxDB benchmarks on 40K unique time series](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
* [VictoriaMetrics vs TimescaleDB vs InfluxDB benchmarks on 400K, 4M and 40M unique time series](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
* [Insert benchmarks for VictoriaMetrics vs InfluxDB on high-cardinality data](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893)
* [Measuring vertical scalability for time series databases in Google Cloud](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
* [How VictoriaMetrics creates instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
* [Prometheus Subqueries in VictoriaMetrics](https://medium.com/@valyala/prometheus-subqueries-in-victoriametrics-9b1492b720b3)
* [Why irate from Prometheus doesn't capture spikes](https://medium.com/@valyala/why-irate-from-prometheus-doesnt-capture-spikes-45f9896d7832)
* [Why mmap'ed files in Go may hurt performance](https://medium.com/@valyala/mmap-in-go-considered-harmful-d92a25cb161d)
* [WAL Usage Looks Broken in Modern TSDBs](https://medium.com/@valyala/wal-usage-looks-broken-in-modern-time-series-databases-b62a627ab704)
* [Analyzing Prometheus data with external tools](https://medium.com/@valyala/analyzing-prometheus-data-with-external-tools-5f3e5e147639)
* [Stripping dependency bloat in VictoriaMetrics Docker image](https://medium.com/@valyala/stripping-dependency-bloat-in-victoriametrics-docker-image-983fb5912b0d)
* [PromQL tutorial for beginners](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085)
* [Achieving better compression for time series data than Gorilla](https://medium.com/@valyala/victoriametrics-achieving-better-compression-for-time-series-data-than-gorilla-317bc1f95932)
* [Comparing Thanos to VictoriaMetrics cluster](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683)
* [Speeding up backups for big time series databases](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883)
* [Evaluation performance and correctness: VictoriaMetrics response](https://medium.com/@valyala/evaluating-performance-and-correctness-victoriametrics-response-e27315627e87)
* [Improving histogram usability for Prometheus and Grafana](https://medium.com/@valyala/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350)
* [Prometheus storage: tech terms for humans](https://medium.com/@valyala/prometheus-storage-technical-terms-for-humans-4ab4de6c3d48)
* [Billy: how VictoriaMetrics deals with more than 500 billion rows](https://medium.com/@valyala/billy-how-victoriametrics-deals-with-more-than-500-billion-rows-e82ff8f725da)

Some files were not shown because too many files have changed in this diff Show More