Compare commits

285 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
359c4d6109 docs: add a link to https://medium.com/@valyala/prometheus-storage-technical-terms-for-humans-4ab4de6c3d48 2019-12-03 22:37:16 +02:00
Aliaksandr Valialkin
face3d57bf app/vmselect: add placeholders for /api/v1/rules and /api/v1/alerts 2019-12-03 19:36:33 +02:00
Aliaksandr Valialkin
a247236f61 lib/storage: fall back to global inverted index if a filter matches too many time series in per-day index
Previously this resulted in an error message. The query may succeed via search in global index.
2019-12-03 14:48:31 +02:00
Aliaksandr Valialkin
54741ee578 lib/storage: fix printing tag filters in TagFilters.String 2019-12-03 14:25:13 +02:00
Aliaksandr Valialkin
efbc83a13e lib/storage: print __name__ instead of empty string in user-visible tag filters 2019-12-03 14:18:28 +02:00
Aliaksandr Valialkin
ade453847f docs: typo fixes 2019-12-03 00:44:50 +02:00
Aliaksandr Valialkin
f52874dab4 lib/storage: optimize regexp filter search 2019-12-03 00:43:12 +02:00
Artem Navoiev
652ba59ce9 [docs] update release page doc 2019-12-02 23:01:51 +02:00
Artem Navoiev
3e81ab2f75 [docs] change titles 2019-12-02 22:53:11 +02:00
Artem Navoiev
a778233877 [docs] change titles 2019-12-02 22:50:54 +02:00
Aliaksandr Valialkin
14100ed643 vendor: update github.com/VictoriaMetrics/metrics from v1.9.1 to v1.9.2
This fixes possible deadlock when metrics.WritePrometheus calls Gauge callback, which calls metrics functions with internal lock.
2019-12-02 22:33:33 +02:00
Artem Navoiev
cfc6e7df07 [docs] revert titles 2019-12-02 22:06:39 +02:00
Artem Navoiev
c07a83374c [docs] remove double titles 2019-12-02 22:02:59 +02:00
Artem Navoiev
c76b2be21f [ci] add github pages action 2019-12-02 21:53:33 +02:00
Aliaksandr Valialkin
638a5cbb16 lib/{mergeset,storage}: remove transaction files only after the mentioned dirs are really removed
This should fix the issue on NFS when incompletely removed dirs may be left
after unclean shutdown (OOM, kill -9, hard reset, etc.), while the corresponding transaction
files are already removed.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-12-02 21:36:31 +02:00
Aliaksandr Valialkin
20812008a7 lib/storage: remove metricID with missing metricID->metricName entry
The metricID->metricName entry can be missing in the indexdb after unclean shutdown
when only a part of entries for new time series is written into indexdb.

Recover from such a situation by removing the broken metricID. A new metricID
will be automatically created for the time series with the given metricName
when a new data point arrives for it.
2019-12-02 20:46:44 +02:00
Aliaksandr Valialkin
62a915f2b2 lib/storage: protect from time drift during indexdb rotation
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/248
2019-12-02 14:44:42 +02:00
Aliaksandr Valialkin
42da569bcd lib/logger: merge file and line labels into location="file:line"
This should improve the usability for `vm_log_messages_total` metric during practical queries
2019-12-02 14:44:40 +02:00
Aliaksandr Valialkin
70b8191fab lib/storage: generate more human-friendly result in TagFilters.String 2019-12-02 13:52:22 +02:00
Aliaksandr Valialkin
9476b73527 app/vmselect/promql: estimate per-series scrape interval as 0.6 quantile for the first 100 intervals
This should improve scrape interval estimation for time series with gaps.
2019-12-02 13:42:33 +02:00
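The estimation described in commit 9476b73527 could look roughly like the following sketch. It is not the code from app/vmselect/promql: the function name `estimateScrapeInterval`, the millisecond timestamps and the rounding rule used to pick the 0.6 quantile are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// estimateScrapeInterval returns the 0.6 quantile of the first maxIntervals
// intervals between consecutive timestamps (assumed to be in milliseconds).
// It returns 0 if there are fewer than two timestamps.
func estimateScrapeInterval(timestamps []int64, maxIntervals int) int64 {
	if len(timestamps) < 2 || maxIntervals < 1 {
		return 0
	}
	n := len(timestamps) - 1
	if n > maxIntervals {
		n = maxIntervals
	}
	intervals := make([]int64, n)
	for i := 0; i < n; i++ {
		intervals[i] = timestamps[i+1] - timestamps[i]
	}
	sort.Slice(intervals, func(i, j int) bool { return intervals[i] < intervals[j] })
	// Index of the 0.6 quantile; truncating the index is an assumption of this sketch.
	idx := int(0.6 * float64(n-1))
	return intervals[idx]
}

func main() {
	// A 10s scrape interval with a single 60s gap in the middle.
	ts := []int64{0, 10000, 20000, 80000, 90000, 100000}
	fmt.Println(estimateScrapeInterval(ts, 100))
}
```

A quantile tolerates a few large gaps better than a mean does: in the example the single 60s gap does not pull the estimate away from 10s.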
Aliaksandr Valialkin
542b9c2043 lib/logger: consistency renaming from vm_log_messages_count to vm_log_messages_total, since this is a counter 2019-12-02 00:49:00 +02:00
Aliaksandr Valialkin
c567919f80 lib/logger: track the number of log messages by (level, file, line) in the vm_log_messages_count metric 2019-12-01 18:37:49 +02:00
Aliaksandr Valialkin
761645b20a lib/netutil: use IPv6 for both listening and dialing if -enableTCP6 is set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/244
2019-12-01 02:57:13 +02:00
Aliaksandr Valialkin
811b7a8303 app/vminsert/influx: allow empty measurement in Influx line protocol
In this case metric names are mapped directly from field names without any prefixes.
2019-11-30 23:18:41 +02:00
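The mapping described in commit 811b7a8303 can be illustrated with a small sketch. The helper `metricName` is hypothetical, and the `_` separator between measurement and field name is an assumption; neither is taken from app/vminsert/influx.

```go
package main

import "fmt"

// metricName builds a metric name from an Influx measurement and a field name.
// With an empty measurement the field name is used as-is, without any prefix.
// The "_" separator is an assumption of this sketch.
func metricName(measurement, field string) string {
	if measurement == "" {
		return field
	}
	return measurement + "_" + field
}

func main() {
	fmt.Println(metricName("cpu", "usage_idle"))
	fmt.Println(metricName("", "usage_idle"))
}
```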
Artem Navoiev
4972bd4c96 Update release guide add Wiki section. Change styling 2019-11-30 21:10:42 +02:00
Artem Navoiev
335e0f8f6a Update release guide add Wiki section 2019-11-30 21:08:48 +02:00
Artem Navoiev
505e46980a [ci] push docs/*.md file to wiki 2019-11-30 20:58:28 +02:00
Artem Navoiev
ab88b77515 rename doc to docs 2019-11-30 20:48:40 +02:00
Artem Navoiev
3d8e75e065 [ci] test wiki push 2019-11-30 20:38:37 +02:00
Artem Navoiev
74b4ccfc91 [ci] push to wiki 2019-11-30 20:36:10 +02:00
Aliaksandr Valialkin
75ff524a4e app/vmselect/promql: fix corner case for increase over time series with gaps
In this case `increase` could return invalid high value for the first point after the gap.
2019-11-30 01:34:56 +02:00
Aliaksandr Valialkin
96492348cb deployment/docker/certs: update TLS certs source from alpine:3.9 to alpine:3.10 2019-11-29 19:57:29 +02:00
Aliaksandr Valialkin
f733cb2186 lib/backup: cosmetic fixes after #243 2019-11-29 18:07:04 +02:00
glebsam
15b7406f7b Add option to provide custom endpoint for S3, add option to specify S3 config profile (#243)
* Add option to provide custom endpoint for S3 for use with s3-compatible storages, add option to specify S3 config profile

* make fmt
2019-11-29 17:59:56 +02:00
Aliaksandr Valialkin
9010c6a1d6 lib/netutil: add -enableTCP6 command-line flag for enabling listening for IPv6 in addition to IPv4 TCP ports 2019-11-29 17:32:47 +02:00
Aliaksandr Valialkin
a7125a5b7b lib/backup: remove flock.lock file in empty dirs
This fixes an issue when VictoriaMetrics doesn't see the restored data after the following operations:

1. Stop VictoriaMetrics.
2. Delete `<-storageDataPath>` dir.
3. Start VictoriaMetrics, then stop it.
4. Restore data from backup with `vmrestore`.
5. Start VictoriaMetrics.

`vmrestore` didn't properly delete empty dirs in `<-storageDataPath>/indexdb` because of the remaining `flock.lock` files in these dirs.
2019-11-28 13:38:58 +02:00
Aliaksandr Valialkin
a6d7179286 README.md: remove the unnecessary step during restoring from backups 2019-11-27 19:57:03 +02:00
Aliaksandr Valialkin
e828647d0f vendor: make vendor-update 2019-11-27 15:37:14 +02:00
Aliaksandr Valialkin
31fb6f2b07 vendor: update github.com/VictoriaMetrics/fastcache from v1.5.2 to v1.5.4 2019-11-27 15:30:33 +02:00
Aliaksandr Valialkin
2c86816950 deployment/docker: update Grafana from v6.4.4 to v6.5.0 2019-11-27 15:10:37 +02:00
Aliaksandr Valialkin
4c859d980c app/vmselect/prometheus: consistently apply nocache arg to /api/v1/query the same way as to /api/v1/query_range
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/241
2019-11-26 22:55:43 +02:00
Aliaksandr Valialkin
14bcff6015 lib/httpserver: improve docs for -tls* flags to be more clear
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/242
2019-11-26 18:08:35 +02:00
Aliaksandr Valialkin
110235f789 app/vmselect/prometheus: fix content-type for /api/v1/export responses
The correct Content-Type should be `application/stream+json` instead of `application/json`
Thanks to Joshua Ryder for pointing to this.
2019-11-26 17:45:26 +02:00
Aliaksandr Valialkin
205233d9a7 app/vmselect/promql: remove zero timeseries from prometheus_buckets output 2019-11-25 19:10:23 +02:00
Aliaksandr Valialkin
3f99f39e9b app/vmselect/prometheus: reduce default value for -search.latencyOffset from 60s to 30s
30 seconds should be enough for almost all the cases
2019-11-25 16:33:42 +02:00
Aliaksandr Valialkin
e91cb34c0e app/vmselect/promql: allow nested parens 2019-11-25 16:13:41 +02:00
Aliaksandr Valialkin
826dfd63a5 vendor: update github.com/VictoriaMetrics/metrics from v1.9.0 to v1.9.1 2019-11-25 15:23:01 +02:00
Aliaksandr Valialkin
0401969d78 app/vmselect/promql: re-use metrics.Histogram when calculating histogram function for each point on the graph
This should reduce the number of memory allocations
2019-11-25 14:24:21 +02:00
Aliaksandr Valialkin
da98703748 app/vmselect/promql: optimize binary search over big number of samples during rollup calculations 2019-11-25 14:01:46 +02:00
Aliaksandr Valialkin
c28876172f app/vmselect/promql: adjust tests after the upgrade of github.com/VictoriaMetrics/metrics from v1.8.3 to v1.9.0 2019-11-25 13:43:57 +02:00
Aliaksandr Valialkin
66c53bf3c6 vendor: update github.com/VictoriaMetrics/metrics from v1.8.3 to v1.9.0 2019-11-25 13:19:43 +02:00
Aliaksandr Valialkin
50ae1879c6 app/vmselect/promql: add histogram aggregate function, which is useful for building heatmaps from multiple time series 2019-11-24 00:04:25 +02:00
Aliaksandr Valialkin
4ff2fbcf3f vendor: update github.com/VictoriaMetrics/metrics from v1.8.2 to v1.8.3 2019-11-24 00:04:24 +02:00
Aliaksandr Valialkin
5285acae3e lib/decimal: calculate ln2/ln10 constant during compile time 2019-11-23 15:52:58 +02:00
Aliaksandr Valialkin
8582b50360 app/vmselect/promql: do not take into account buckets with negative counters in prometheus_buckets 2019-11-23 14:19:25 +02:00
Aliaksandr Valialkin
19dfe52254 app/vmselect/promql: properly handle histogram_quantile(0, ...) with zero buckets 2019-11-23 14:02:35 +02:00
Aliaksandr Valialkin
4bb88843cf app/vmselect: add vm_per_query_{rows,series}_processed_count histograms 2019-11-23 13:23:26 +02:00
Aliaksandr Valialkin
0827bb6ce5 vendor: update github.com/VictoriaMetrics/metrics from v1.8.1 to v1.8.2 2019-11-23 11:48:54 +02:00
Aliaksandr Valialkin
7753c8c0a1 app/vmselect/promql: transparently apply prometheus_buckets in histogram_quantile 2019-11-23 11:48:51 +02:00
Aliaksandr Valialkin
ef25e1b049 vendor: update github.com/VictoriaMetrics/metrics from v1.8.0 to v1.8.1 2019-11-23 00:49:13 +02:00
Aliaksandr Valialkin
9d1fcb2be6 vendor: update github.com/VictoriaMetrics/metrics from v1.7.2 to v1.8.0. This version supports histograms 2019-11-23 00:20:27 +02:00
Aliaksandr Valialkin
c4287b3c86 app/vmselect/promql: add prometheus_buckets function for converting the upcoming histogram buckets from github.com/VictoriaMetrics/metrics to Prometheus-compatible buckets 2019-11-23 00:20:20 +02:00
Aliaksandr Valialkin
1f3fd2c910 app/vmselect: adjust end arg instead of adjusting start arg if start > end
`start` arg has higher chances to be set properly compared to `end` arg,
so it is expected that the `end` arg could be adjusted if it was set incorrectly.
2019-11-22 16:12:19 +02:00
Aliaksandr Valialkin
90b03309de vendor: updated github.com/valyala/gozstd from v1.6.2 to v1.6.3 2019-11-21 23:57:00 +02:00
Aliaksandr Valialkin
7a4635f853 all: remove the remaining mentions of cluster version 2019-11-21 23:18:22 +02:00
Aliaksandr Valialkin
3e9b7addb1 lib/httpserver: typo fix in -httpAuth.password command-line description 2019-11-21 21:54:26 +02:00
Aliaksandr Valialkin
f652c0f40f lib/storage: move non-matching tag filters to the top at matchTagFilters
This should reduce the amount of useless work needed for matching the next metricNames.
2019-11-21 21:35:13 +02:00
Aliaksandr Valialkin
b8cde6cce1 lib/storage: speed up time series search for queries with multiple filters
Use optimized specialized binary search for uint64 metricIDs instead of generic sort.Search.
2019-11-21 18:43:17 +02:00
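A specialized binary search of the kind mentioned in commit b8cde6cce1 might look like the sketch below. The function name `binarySearchUint64` is an assumption, not a confirmed identifier from lib/storage. The likely gain over `sort.Search` is that the comparison is done inline instead of through a closure call on every step.

```go
package main

import (
	"fmt"
	"sort"
)

// binarySearchUint64 returns the smallest index i in the sorted slice a
// such that a[i] >= v, or len(a) if there is no such index.
// This is the same contract as sort.Search with the predicate a[i] >= v.
func binarySearchUint64(a []uint64, v uint64) int {
	lo, hi := 0, len(a)
	for lo < hi {
		mid := int(uint(lo+hi) >> 1) // avoids overflow when computing the midpoint
		if a[mid] < v {
			lo = mid + 1
		} else {
			hi = mid
		}
	}
	return lo
}

func main() {
	ids := []uint64{3, 7, 7, 12, 40}
	for _, v := range []uint64{0, 7, 13, 41} {
		generic := sort.Search(len(ids), func(i int) bool { return ids[i] >= v })
		fmt.Println(v, binarySearchUint64(ids, v), generic)
	}
}
```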
Aliaksandr Valialkin
aeea59e280 Makefile: create files with sha256 checksums during make release
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/19
2019-11-20 22:43:37 +02:00
Aliaksandr Valialkin
74e563ca3f README.md: added a link to https://github.com/dreamteam-gg/ansible-victoriametrics-role 2019-11-20 21:26:43 +02:00
Aliaksandr Valialkin
5c1e4143e9 lib/storage: verify the number of returned metricIDs in BenchmarkHeadPostingForMatchers 2019-11-20 15:39:28 +02:00
Aliaksandr Valialkin
52d7ca6bf0 lib/decimal: increase decimal->float speed conversion for integer numbers 2019-11-20 13:04:34 +02:00
Aliaksandr Valialkin
75eeea21ee lib/decimal: reduce rounding error when converting from decimal to float with negative exponent
While at it, slightly increase the conversion performance by moving fast path to the top of the loop.
2019-11-19 23:35:33 +02:00
Artem Navoiev
c03b87dac0 update version of codecov to 1.04 2019-11-19 22:23:14 +02:00
Aliaksandr Valialkin
259dc95366 make vendor-update 2019-11-19 21:35:07 +02:00
Aliaksandr Valialkin
cfb9fa2100 lib/backup: retrieve only the required metadata when reading GCS objects 2019-11-19 21:06:34 +02:00
Aliaksandr Valialkin
355ccba81a make vendor-update 2019-11-19 21:05:37 +02:00
Aliaksandr Valialkin
443189fb0a app/{vmbackup,vmrestore}: add -maxBytesPerSecond command-line flag for limiting the used network bandwidth during backup / restore 2019-11-19 20:31:52 +02:00
Aliaksandr Valialkin
2db06f0ef8 lib/backup: prevent from restoring to directory which is in use by VictoriaMetrics during the restore 2019-11-19 18:36:23 +02:00
Aliaksandr Valialkin
0094bc4fc9 app/vmselect/prometheus: properly adjust too big time on /api/v1/query
Too big `time` must be adjusted to `now()-queryOffset`.
2019-11-19 00:42:00 +02:00
Aliaksandr Valialkin
b6f22a62cb lib/storage: increase the number of created time series in BenchmarkHeadPostingForMatchers in order to be on par with Prometheus
The previous commit was accidentally creating 10x smaller number of time series than Prometheus
and this led to invalid benchmark results.

The updated benchmark results:

benchmark                                                          old ns/op      new ns/op     delta
BenchmarkHeadPostingForMatchers/n="1"                              272756688      6194893       -97.73%
BenchmarkHeadPostingForMatchers/n="1",j="foo"                      138132923      10781372      -92.19%
BenchmarkHeadPostingForMatchers/j="foo",n="1"                      134723762      10632834      -92.11%
BenchmarkHeadPostingForMatchers/n="1",j!="foo"                     195823953      10679975      -94.55%
BenchmarkHeadPostingForMatchers/i=~".*"                            7962582919     100118510     -98.74%
BenchmarkHeadPostingForMatchers/i=~".+"                            7589543864     154955671     -97.96%
BenchmarkHeadPostingForMatchers/i=~""                              1142371741     258003769     -77.42%
BenchmarkHeadPostingForMatchers/i!=""                              9964150263     159783895     -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo"              216995884      10937895      -94.96%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo"       202541348      10990027      -94.57%
BenchmarkHeadPostingForMatchers/n="1",i!=""                        486285711      87004349      -82.11%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo"                350776931      53342793      -84.79%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo"              380888565      54256156      -85.76%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo"             89500296       21823279      -75.62%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo"       379529654      46671359      -87.70%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo"     424563825      53915842      -87.30%

VictoriaMetrics uses 1GB of RAM during the benchmark (vs 3.5GB of RAM for Prometheus)
2019-11-18 19:50:58 +02:00
Aliaksandr Valialkin
8a0dfc6220 lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus
See the corresponding benchmark in Prometheus - 23c0299d85/tsdb/head_bench_test.go (L52)

The benchmark allows performing apples-to-apples comparison of time series search
in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness -
contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this.

Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots:

- Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers
- VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers

Benchmark results:
benchmark                                                          old ns/op      new ns/op     delta
BenchmarkHeadPostingForMatchers/n="1"                              272756688      364977        -99.87%
BenchmarkHeadPostingForMatchers/n="1",j="foo"                      138132923      1181636       -99.14%
BenchmarkHeadPostingForMatchers/j="foo",n="1"                      134723762      1141578       -99.15%
BenchmarkHeadPostingForMatchers/n="1",j!="foo"                     195823953      1148056       -99.41%
BenchmarkHeadPostingForMatchers/i=~".*"                            7962582919     8716755       -99.89%
BenchmarkHeadPostingForMatchers/i=~".+"                            7589543864     12096587      -99.84%
BenchmarkHeadPostingForMatchers/i=~""                              1142371741     16164560      -98.59%
BenchmarkHeadPostingForMatchers/i!=""                              9964150263     12230021      -99.88%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo"              216995884      1173476       -99.46%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo"       202541348      1299743       -99.36%
BenchmarkHeadPostingForMatchers/n="1",i!=""                        486285711      11555193      -97.62%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo"                350776931      5607506       -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo"              380888565      6380335       -98.32%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo"             89500296       2078970       -97.68%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo"       379529654      6561368       -98.27%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo"     424563825      6757132       -98.41%

The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics.

As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark.

Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 18:45:06 +02:00
Aliaksandr Valialkin
2ab4cea5e5 lib/storage: always start using per-day inverted index on the next day after its creation
The current day could miss entries for already stopped time series before
enabling per-day index.

This fixes the issue when queries return empty results during the first hour after
upgrading to v1.29.*
2019-11-16 12:11:25 +02:00
Aliaksandr Valialkin
c050abbbad deployment/docker: update Prometheus version from v2.12.0 to v2.14.0 2019-11-16 00:13:15 +02:00
Aliaksandr Valialkin
3f1637fae8 app/vmselect/promql: properly calculate integrate(q[d]) 2019-11-13 21:10:41 +02:00
Aliaksandr Valialkin
c56b9ed03b app/victoria-metrics: add build rules for GOARCH=ppc64le
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235
2019-11-13 20:24:33 +02:00
Aliaksandr Valialkin
3fd32e331a app/vmselect/promql: use universal approach for determining maxByteSliceLen on 32-bit and 64-bit archs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235
2019-11-13 20:24:26 +02:00
Aliaksandr Valialkin
119dfd01bb lib/storage: add vm_cache_size_bytes{type="storage/hour_metric_ids"} metric 2019-11-13 20:24:21 +02:00
Aliaksandr Valialkin
86a1cd700b lib/storage: remove inmemory index for recent hour, since it uses too much memory
Production workload shows that the index requires ~4Kb of RAM per active time series.
This is too much for high number of active time series, so let's delete this index.

Now the queries should fall back to the index for the current day instead of the index
for the recent hour. The query performance for the current day index should be good enough
given the 100M rows/sec scan speed per CPU core.
2019-11-13 17:58:07 +02:00
Aliaksandr Valialkin
33895d4a0f lib/storage: add missing increment for recentHourInvertedIndexSearchCalls 2019-11-13 15:13:51 +02:00
Aliaksandr Valialkin
c57eb0ff83 lib/storage: add -disableRecentHourIndex flag for disabling inmemory index for recent hour
This may be useful for saving RAM on high number of time series aka high cardinality
2019-11-13 15:02:51 +02:00
Aliaksandr Valialkin
e14ab14e54 lib/storage: verify marshaling for iidx.pendingMetricIDs in TestInmemoryInvertedIndexMarshalUnmarshal 2019-11-13 13:35:30 +02:00
Aliaksandr Valialkin
ca259864e2 lib/storage: return back inmemory inverted index for recent hour
Issues fixed:
- Slow startup times. Now the index is loaded from cache during start.
- High memory usage related to superfluous index copies every 10 seconds.
2019-11-13 13:11:04 +02:00
Aliaksandr Valialkin
01bb3c06c7 lib/storage: remove inmemory inverted index for recent hours
Production load with >10M active time series showed it could
slow down VictoriaMetrics startup times and could eat
all the memory leading to OOM.

Remove inmemory inverted index for recent hours until thorough
testing on production data shows it works OK.
2019-11-13 10:45:53 +02:00
Aliaksandr Valialkin
66c4961ff8 README.md: mention that VictoriaMetrics executable is small 2019-11-12 16:58:15 +02:00
Aliaksandr Valialkin
3e16248ed6 README.md: small updates 2019-11-12 16:54:18 +02:00
Aliaksandr Valialkin
5e6c1cd986 README.md: typo fix 2019-11-12 16:48:40 +02:00
Aliaksandr Valialkin
6c2303764e Revert "lib/fs: do not postpone directory removal on NFS error"
This reverts commit 4c02e496f7.

Reason for revert: the commit breaks on NFS - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/234
2019-11-12 16:18:09 +02:00
Mike Poindexter
f3ad330635 Add test for invalid caching of tsids (#232)
* Add test for invalid caching of tsids

* Clean up error handling
2019-11-12 15:09:33 +02:00
Aliaksandr Valialkin
6c362d82cb README.md: mention that backups are made to S3 or GCS 2019-11-12 14:32:37 +02:00
Aliaksandr Valialkin
661dd190bb Refer to https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883 from multiple places in README.md 2019-11-12 13:02:39 +02:00
Aliaksandr Valialkin
630ba810f1 deployment/docker: upgrade Go from v1.13.4 to v1.13.4 2019-11-12 03:49:19 +02:00
Oleg Kovalov
b4f44befa3 fix misspelled words (#229) 2019-11-12 00:16:42 +02:00
Roman Khavronenko
5fc8fb1323 add churn rate panel (#230) 2019-11-12 00:14:53 +02:00
Aliaksandr Valialkin
8e8f98f712 lib/storage: add tests for dateMetricIDCache 2019-11-11 13:21:57 +02:00
Aliaksandr Valialkin
c342f5e37e lib/storage: eliminate data race when updating lastSyncTime in dateMetricIDCache.Has 2019-11-10 22:04:01 +02:00
Aliaksandr Valialkin
56d7cc8a0d app/victoria-metrics: remove deprecated fs.MustStopDirRemover from main_test.go 2019-11-10 13:37:13 +02:00
Aliaksandr Valialkin
4c02e496f7 lib/fs: do not postpone directory removal on NFS error
Continue trying to remove NFS directory on temporary errors for up to a minute.

The previous async removal process breaks in the following case during VictoriaMetrics start

- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
  and finds directories, which should be removed, but still weren't removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-11-10 13:24:51 +02:00
Aliaksandr Valialkin
3956003dd0 lib/storage: reorganize the code in getStartDateForPerDayInvertedIndex according to golangci-lint 2019-11-10 00:38:59 +02:00
Aliaksandr Valialkin
5c3fa59181 app/vmrestore: the upcoming release would be 1.29.0 2019-11-10 00:20:41 +02:00
Aliaksandr Valialkin
ee7765b10d lib/storage: implement per-day inverted index 2019-11-10 00:02:46 +02:00
Aliaksandr Valialkin
5810ba57c2 lib/storage: use specialized cache for (date, metricID) entries
This improves ingestion performance.
2019-11-09 23:06:11 +02:00
Aliaksandr Valialkin
e573ef2126 lib/storage: remove unused code from getMetricIDsForTimeRange: it is expected that time range is always non-zero 2019-11-09 19:03:34 +02:00
Aliaksandr Valialkin
823fa085ef lib/storage: properly set time range when deleting time series 2019-11-09 18:49:49 +02:00
Aliaksandr Valialkin
695c1dc5eb lib/storage: obtain all the time series ids from (tag->metricIDs) rows instead of (metricID->TSID) rows, since this is much faster 2019-11-09 18:04:33 +02:00
Aliaksandr Valialkin
cdbe848102 lib/storage: small code prettifying 2019-11-09 14:19:52 +02:00
Aliaksandr Valialkin
5c25070556 lib/uint64set: remove superfluous check for item existence before deleting it in Set.Subtract 2019-11-09 14:19:47 +02:00
Aliaksandr Valialkin
bb08bab263 lib/storage: inmemoryInvertedIndex prettifying 2019-11-09 14:19:41 +02:00
Aliaksandr Valialkin
6ad7fe8eeb lib/storage: export vm_new_timeseries_created_total metric for determining time series churn rate 2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
9ea549ed24 lib/storage: sync with cluster changes 2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
63b05c0b9f app/vmselect/promql: adjust memory limits calculations for incremental aggregate functions
Incremental aggregate functions don't keep all the selected time series in memory -
they keep only up to GOMAXPROCS time series for incremental aggregations.

Take into account that the number of time series in RAM can be higher if they are split
into many groups with `by (...)` or `without (...)` modifiers.

This should reduce the number of `not enough memory for processing ... data points` false
positive errors.
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
d888b21657 lib/storage: add inmemory inverted index for the last hour
It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
1e46961d68 app/{vmbackup,vmrestore}: add vmbackup and vmrestore tools for creating backups on s3 or gcs from instant snapshots
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/203
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/38
2019-11-08 21:21:07 +02:00
Roman Khavronenko
72756ab8c7 #224: add slow_queries, on-going merges and merge speed panels to dashboard (#226) 2019-11-08 21:20:38 +02:00
Aliaksandr Valialkin
543dc8d337 lib/storage: populate partition names from both small and big directories
Certain partition directories may be missing after restoring from backups
if they had no data. Re-create such directories on start.
2019-11-06 19:49:34 +02:00
Aliaksandr Valialkin
e472f0b23b lib/storage: substitute error message about unsorted items in the index block after metricIDs merge with counter
The origin of the error has been detected and documented in the code,
so it is enough to export a counter for such errors at `vm_index_blocks_with_metric_ids_incorrect_order_total`,
so it could be monitored and alerted on high error rates.

Export also the counter for processed index blocks with metricIDs - `vm_index_blocks_with_metric_ids_processed_total`,
so its rate could be compared to `rate(vm_index_blocks_with_metric_ids_incorrect_order_total)`.
2019-11-06 14:28:11 +02:00
Aliaksandr Valialkin
c51ca04a43 lib/storage: take into account the requested time range when caching TSIDs for the given tag filters 2019-11-06 14:28:11 +02:00
Aliaksandr Valialkin
e37f06dc52 lib/storage: dump incorrectly sorted items on a single line; this should simplify error reporting 2019-11-05 18:44:22 +02:00
Aliaksandr Valialkin
5c2099ecfe lib/storage: return back finalPartsToMerge from 2 to 3 in order to prevent from excessive merges in old partitions 2019-11-05 17:27:48 +02:00
Aliaksandr Valialkin
885ba17905 lib/storage: separate the max inverted index scan loops per metric into fast and slow loops
Slow loops could require seeks and expensive regexp matching, while fast loops just scan
all the metricIDs for the given `tag=value` prefix. So these operations must have separate
max loops multiplier.
2019-11-05 17:27:48 +02:00
Aliaksandr Valialkin
b9a06e8e74 lib/storage: skip repeated useless work when intersection of metricIDs with the given filter is too expensive
This should improve performance for query filters over big number of time series.
2019-11-05 14:19:13 +02:00
Aliaksandr Valialkin
30c8301b11 lib/storage: reduce the maximum inverted index scans before giving up to label filters matching by metric name
The new value reduces the amount of wasted work during index scans over big number of time series.
2019-11-05 14:19:06 +02:00
Aliaksandr Valialkin
e53f9e553d lib/storage: try potentially faster tag filters at first, then apply slower tag filters
The fastest tag filters are non-negative non-regexp, since they are the most specific.
The slowest tag filters are negative regexp, since they require scanning
all the entries for the given label.
2019-11-05 14:19:01 +02:00
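The ordering described in commit e53f9e553d can be sketched as a sort by estimated cost. The `tagFilter` type and the `cost` function are hypothetical. The commit message only fixes the two extremes (non-negative non-regexp is fastest, negative regexp is slowest), so the relative order of the two middle classes is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// tagFilter is a hypothetical, simplified filter description for this sketch.
type tagFilter struct {
	Key, Value string
	IsNegative bool
	IsRegexp   bool
}

// cost ranks filters: a lower value means the filter is expected to be faster.
// 0: key="v", 1: key=~"re", 2: key!="v", 3: key!~"re".
func cost(tf tagFilter) int {
	c := 0
	if tf.IsRegexp {
		c++
	}
	if tf.IsNegative {
		c += 2
	}
	return c
}

// sortByCost reorders filters in place so that cheaper filters are applied first.
// The sort is stable, so filters of equal cost keep their original order.
func sortByCost(tfs []tagFilter) {
	sort.SliceStable(tfs, func(i, j int) bool { return cost(tfs[i]) < cost(tfs[j]) })
}

func main() {
	tfs := []tagFilter{
		{Key: "i", Value: "2.*", IsNegative: true, IsRegexp: true},
		{Key: "i", Value: ".+", IsRegexp: true},
		{Key: "n", Value: "1"},
		{Key: "j", Value: "foo", IsNegative: true},
	}
	sortByCost(tfs)
	for _, tf := range tfs {
		fmt.Printf("%s %q negative=%v regexp=%v\n", tf.Key, tf.Value, tf.IsNegative, tf.IsRegexp)
	}
}
```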
Aliaksandr Valialkin
d6ade02fd3 Makefile: add pprof-cpu rule for inspecting CPU profiles with PPROF_FILE=/path/to/cpu.pprof make pprof-cpu 2019-11-04 12:44:09 +02:00
Aliaksandr Valialkin
3c90d77858 lib/storage: pass pointer to MetricName in Fatalf, so it is properly detected as an interface with String() method
This fixes lint errors
2019-11-04 01:07:19 +02:00
Artem Navoiev
478767d0ed add unittests for bytesutil and storage (#221) 2019-11-04 00:54:46 +02:00
Aliaksandr Valialkin
02e0b19a62 lib/storage: tune the returned value from adjustMaxMetricsAdaptive 2019-11-04 00:44:37 +02:00
Aliaksandr Valialkin
6be4456d88 lib/{storage,uint64set}: add Set.Union() function and use it 2019-11-04 00:44:37 +02:00
Aliaksandr Valialkin
9becc26f4b lib/storage: remove interface conversion in hot path during block merging
This should improve merge speed a bit for parts with big number of small blocks.
2019-11-03 12:33:34 +02:00
Aliaksandr Valialkin
c62399eb3e lib/{storage,mergeset}: create missing partition directories after restoring from backups
Backup tools could skip empty directories. So re-create such directories on the first run.
2019-11-02 02:27:11 +02:00
Aliaksandr Valialkin
55d728c849 lib/{decimal,encoding}: optimize float64<->decimal conversion for arrays with zeros or ones
Time series with only zeros or ones frequently occur in monitoring, so it is worth optimizing their handling.
2019-11-01 16:48:12 +02:00
Aliaksandr Valialkin
808fc0971f lib/{encoding,decimal}: add benchmarks for blocks containing zeros or ones
Time series with such values are quite common in monitoring space,
so it would be great to have benchmarks for them.
2019-11-01 16:48:12 +02:00
Aliaksandr Valialkin
370cfbb365 lib/uint64set: return an empty set instead of nil set from Set.Clone, since the caller may add data to the cloned set
This fixes the following panic in v1.28.1:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x783a7e]

goroutine 1155 [running]:
github.com/VictoriaMetrics/VictoriaMetrics/lib/uint64set.(*Set).Add(0x0, 0x15b3bfb41e8b71ec)
  github.com/VictoriaMetrics/VictoriaMetrics@/lib/uint64set/uint64set.go:57 +0x2e
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).getMetricIDsForRecentHours(0xc5bdc0dd40, 0x16e273f6b50, 0x16e2745d3f0, 0x5b8d95, 0x10, 0x4a2f51, 0xaa01000000000000)
  github.com/VictoriaMetrics/VictoriaMetrics@/lib/storage/index_db.go:1951 +0x260
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).getMetricIDsForTimeRange(0xc5bdc0dd40, 0x16e273f6b50, 0x16e2745d3f0, 0x5b8d95, 0x10, 0xb296c0, 0xc00009cd80, 0x9bc640)
2019-11-01 16:12:44 +02:00
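The failure mode behind commit 370cfbb365 can be reproduced in miniature. The `Set` type below is a hypothetical map-based stand-in, not the real lib/uint64set implementation. It only shows why `Clone` should return a non-nil set when the caller goes on to call `Add` on the result.

```go
package main

import "fmt"

// Set is a hypothetical minimal uint64 set used only for this sketch.
type Set struct {
	items map[uint64]struct{}
}

// Add adds x to s. It dereferences s, so calling it on a nil *Set panics
// with a nil pointer dereference, as in the stack trace above.
func (s *Set) Add(x uint64) {
	if s.items == nil {
		s.items = make(map[uint64]struct{})
	}
	s.items[x] = struct{}{}
}

// Len returns the number of items in s; it is safe to call on a nil *Set.
func (s *Set) Len() int {
	if s == nil {
		return 0
	}
	return len(s.items)
}

// Clone returns an independent copy of s.
// For a nil or empty s it returns an empty, non-nil set,
// so the caller may safely add data to the result.
func (s *Set) Clone() *Set {
	dst := &Set{}
	if s == nil {
		return dst
	}
	for x := range s.items {
		dst.Add(x)
	}
	return dst
}

func main() {
	var src *Set // nil source set
	c := src.Clone()
	c.Add(42) // would panic if Clone returned nil
	fmt.Println(c.Len())
}
```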
Aliaksandr Valialkin
2f58f37f07 app/vmselect/promql: add lag(q[d]) function, which returns the lag between the current timestamp and the timestamp for the last data point in q 2019-11-01 12:21:33 +02:00
Aliaksandr Valialkin
d18ea0c95b app/vmstorage: add -bigMergeConcurrency and -smallMergeConcurrency flags for tuning the maximum number of CPU cores used during merges 2019-10-31 16:19:13 +02:00
Aliaksandr Valialkin
e0b292c6de lib/storage: small cleanup in Storage.add 2019-10-31 14:30:34 +02:00
Aliaksandr Valialkin
86f6be40db README.md: update information about vm_rows{type="indexdb"} metric
The previous information became outdated after v1.28.0, since now each row in the inverted index
can refer to multiple time series.
2019-10-31 13:30:29 +02:00
Aliaksandr Valialkin
e76e21e4c7 lib/decimal: speed up FromFloat for common case with integers 2019-10-31 13:24:59 +02:00
Aliaksandr Valialkin
cfa5e279c2 lib/decimal: increase float64->decimal conversion precision a bit
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/213
2019-10-30 02:04:56 +02:00
Aliaksandr Valialkin
fa7c3ab93a README.md: fix delimiter between {measurement} and {field_name} in the Influx line protocol example 2019-10-30 02:04:56 +02:00
Aliaksandr Valialkin
26d570bb3a lib/storage: get parts to merge after applying the limit on the number of concurrent merges
This should reduce write amplification under high ingestion rate.
2019-10-30 02:04:56 +02:00
Roman Khavronenko
62ed508546 Bump version requirements in description 2019-10-29 22:29:48 +00:00
Aliaksandr Valialkin
2e2eff90d5 lib/{mergeset,storage}: limit the maximum number of concurrent merges; leave smaller number of parts during final merge 2019-10-29 12:45:28 +02:00
Aliaksandr Valialkin
855e5c8963 vendor: update github.com/VictoriaMetrics/fastcache from v1.5.1 to v1.5.2 2019-10-29 11:31:29 +02:00
Aliaksandr Valialkin
04e48ef064 lib/fs: typo fix in comment to WriteFileAtomically 2019-10-29 11:31:26 +02:00
Roman Khavronenko
971206b514 update single-version dashboard with panels: (#219)
* concurrent inserts
* rows ignored
2019-10-28 13:54:10 +02:00
Aliaksandr Valialkin
d063bfaf83 vendor: make vendor-update 2019-10-28 13:39:05 +02:00
Roman Khavronenko
6ab48838bf #215: update klauspost/compress lib (#217)
* #215: update klauspost/compress lib

* #215: bump klauspost/compress lib to 1.9.1
2019-10-28 13:36:35 +02:00
Aliaksandr Valialkin
a42b5db39f lib/decimal: increase float->decimal conversion precision for big numbers
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/213
2019-10-28 13:23:44 +02:00
Aliaksandr Valialkin
b0295dbf2e app/vmselect: add -search.latencyOffset flag for tuning the time after data collection when data points become visible in query results
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/218
2019-10-28 12:31:07 +02:00
Petr Mikusek
3cea200309 Fix typo s/telergam/telegram/ in README.md 2019-10-23 19:30:36 +03:00
Aliaksandr Valialkin
32600ba4fc deployment/docker: upgrade Go builder from go1.13.1 to go1.13.3 2019-10-20 23:50:05 +03:00
hanzai
b3c946e35a warns during rows addition (#214) 2019-10-20 23:41:07 +03:00
Aliaksandr Valialkin
e83fe938c8 all: make fmt 2019-10-17 20:04:34 +03:00
Aliaksandr Valialkin
f708aa7003 Makefile: disable structcheck in golangci-lint, since it gives false positive on embedded structs 2019-10-17 19:59:10 +03:00
Aliaksandr Valialkin
97ce4e03a5 all: add support for GOARCH=386 and fix all the issues related to 32-bit architectures such as GOARCH=arm
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
2019-10-17 18:23:23 +03:00
Aliaksandr Valialkin
a398343bb6 vendor: update github.com/valyala/quicktemplate from v1.2.0 to v1.3.1 2019-10-17 18:23:19 +03:00
Aliaksandr Valialkin
6ebf537153 lib/memory: properly handle int overflow in sysTotalMemory
This should fix builds on 32-bit architectures such as arm.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
2019-10-17 00:50:48 +03:00
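The kind of overflow mentioned in the commit above can be sketched as follows. This is an illustration only, not the actual `sysTotalMemory` code: `clampToInt` is a hypothetical helper showing how a 64-bit byte count can be converted to `int` without wrapping on 32-bit platforms.

```go
package main

import "fmt"

// maxInt is the largest value representable by int on the current platform
// (2^31-1 on GOARCH=386 or arm, 2^63-1 on amd64).
const maxInt = int(^uint(0) >> 1)

// clampToInt is a hypothetical helper: it converts n to int
// and saturates at maxInt instead of overflowing.
func clampToInt(n uint64) int {
	if n > uint64(maxInt) {
		return maxInt
	}
	return int(n)
}

func main() {
	// 8 GiB does not fit into a 32-bit int; on such platforms the result is maxInt.
	fmt.Println(clampToInt(8 << 30))
}
```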
Aliaksandr Valialkin
f752479cb8 app/victoria-metrics/test: add missing docs to public funcs PopulateTimeTplString and PopulateTimeTpl 2019-10-17 00:50:46 +03:00
Aliaksandr Valialkin
61e956e175 app/victoria-metrics: add a test for max_lookback=<duration> query arg
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209
2019-10-15 21:31:48 +03:00
Aliaksandr Valialkin
c66a691593 app/vmselect/prometheus: add -search.maxLookback command-line flag for overriding dynamic calculations for max lookback interval
This flag is similar to `-search.lookback-delta` if set. The max lookback interval is determined dynamically
from interval between datapoints for each input time series if the flag isn't set.

The interval can be overridden on a per-query basis by passing the `max_lookback=<duration>` query arg to `/api/v1/query` and `/api/v1/query_range`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209
2019-10-15 21:31:48 +03:00
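A per-query override as described in the commit above could be built like this. It is a sketch under assumptions: the base address `http://localhost:8428`, the `up` query and the timestamps are illustrative values, and `queryRangeURL` is a hypothetical helper, not part of VictoriaMetrics.

```go
package main

import (
	"fmt"
	"net/url"
)

// queryRangeURL builds a /api/v1/query_range URL carrying the max_lookback arg.
// url.Values.Encode sorts the args by key, so the output is deterministic.
func queryRangeURL(base, query, start, end, step, maxLookback string) string {
	v := url.Values{}
	v.Set("query", query)
	v.Set("start", start)
	v.Set("end", end)
	v.Set("step", step)
	v.Set("max_lookback", maxLookback)
	return base + "/api/v1/query_range?" + v.Encode()
}

func main() {
	// Assumed address and values, for illustration only.
	fmt.Println(queryRangeURL("http://localhost:8428", "up", "1570000000", "1570003600", "60s", "5m"))
}
```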
Aliaksandr Valialkin
cc21b31502 app/victoria-metrics/test: add a test for PopulateTimeTplString 2019-10-15 21:31:48 +03:00
Aliaksandr Valialkin
195cefd81a lib/prompb: removed outdated README.md 2019-10-14 22:12:57 +03:00
Aliaksandr Valialkin
c1581c3810 vendor: make vendor-update 2019-10-13 23:17:47 +03:00
Aliaksandr Valialkin
16cae15c45 README.md: add integrations section 2019-10-11 19:14:28 +03:00
Aliaksandr Valialkin
f6334bffa1 lib/storage: harden the check that the original items are sorted after mergeTagToMetricIDsRows fails to preserve sort order 2019-10-09 12:13:17 +03:00
Aliaksandr Valialkin
2abd5154e0 lib/storage: typo fix in comment to maxRowsPerSmallPart. 2019-10-08 18:51:20 +03:00
Aliaksandr Valialkin
c1cf7d9f93 lib/storage: add tests for mergeTagToMetricIDsRows and return the original items if the function breaks items' ordering.
This should protect against the data corruption issues revealed in the previous releases up to v1.28.0-beta5.
2019-10-08 16:27:35 +03:00
Aliaksandr Valialkin
956fdd89d3 app/vmselect/promql: take into account the previous point when calculating max_over_time and min_over_time
This lines up with `first_over_time` function used in `rollup_candlestick`, so `rollup=low` always returns
the minimum value.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/204
2019-10-08 12:30:05 +03:00
Alexander Danilov
1bc6377863 Improve documentation a little bit 2019-10-07 22:18:40 +03:00
Artem Navoiev
1e2c511747 Add regression test for query API
Part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/187
cover:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/153
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150
2019-10-07 22:18:04 +03:00
Aliaksandr Valialkin
0eeffb910f vendor: make vendor-update 2019-10-06 15:47:23 +03:00
Aliaksandr Valialkin
4ba86f501a vendor: update github.com/VictoriaMetrics/metrics from v1.7.1 to v1.7.2 2019-10-06 11:20:45 +03:00
Aliaksandr Valialkin
fdc5cfd838 lib/mergeset: reduce the maximum number of cached blocks, since there are reports on OOMs due to too big caches
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/189
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/195
2019-09-30 12:25:40 +03:00
Artem Navoiev
a116f5e7c1 Add regression test for query API (#194)
Part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/187
cover:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/184
2019-09-30 11:25:54 +03:00
Aliaksandr Valialkin
4e9e1ca0f7 app/vmselect/netstorage: hint the OS that tmpBlocksFile is read almost sequentially
This became the case after b7ee2e7af2 .
2019-09-30 00:11:14 +03:00
Aliaksandr Valialkin
c1d3705be0 app/vmselect/netstorage: marshal block outside tmpBlocksFile.WriteBlock
This allows re-using the destination buffer for marshaling in the outer loop.
2019-09-28 21:07:13 +03:00
Aliaksandr Valialkin
b7ee2e7af2 app/vmselect/netstorage: reduce the number of disk seeks when the query processes big number of time series 2019-09-28 21:07:09 +03:00
Aliaksandr Valialkin
67d44b0845 app/vmselect/promql: do not generate timestamps for NaN values in timestamp function according to Prometheus logic 2019-09-27 18:54:43 +03:00
Artem Navoiev
1e6ae9eff4 Add regression test for duplicated labels and series
Part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/187
cover:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/155
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/172
2019-09-27 16:52:16 +03:00
Aliaksandr Valialkin
fa81f82714 deployment/docker: switch Go builder image from v1.13.0 to v1.13.1 2019-09-26 17:09:40 +03:00
Aliaksandr Valialkin
0fa6df94a2 lib/storage: optimize TSID comparison 2019-09-26 14:16:02 +03:00
Aliaksandr Valialkin
c39355921e lib/storage: verify whether items are sorted in the end of call to mergeTagToMetricIDsRows
This should prevent inverted index corruption if a bug in mergeTagToMetricIDsRows is discovered.
2019-09-26 13:13:41 +03:00
Artem Navoiev
cf4786f34a add test for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/161 2019-09-26 12:45:19 +03:00
Aliaksandr Valialkin
3e67862676 README.md: typo fix 2019-09-26 11:03:14 +03:00
Aliaksandr Valialkin
0db9fcedd5 lib/storage: properly match labels against regexp with (?i) flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/161
2019-09-26 11:03:10 +03:00
Aliaksandr Valialkin
391530bb74 README.md: mention recommended ext4 options for mkfs.ext4 when creating multi-TB partition 2019-09-25 23:52:43 +03:00
Aliaksandr Valialkin
60c5b368bc README.md: tiny updates 2019-09-25 23:29:55 +03:00
Aliaksandr Valialkin
26dc21cf64 app/vmselect/promql: add increases_over_time and decreases_over_time functions
`increases_over_time(q[d])` returns the number of `q` increases during the given duration `d`.
`decreases_over_time(q[d])` returns the number of `q` decreases during the given duration `d`.
2019-09-25 20:38:44 +03:00
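The semantics described in the commit above can be approximated with a short sketch. It assumes the functions simply compare adjacent samples inside the window `d`; the real rollup code may treat window boundaries differently, and the function names below are illustrative rather than taken from the source.

```go
package main

import "fmt"

// increasesOverTime counts how many times a sample is greater than the previous one.
func increasesOverTime(values []float64) int {
	n := 0
	for i := 1; i < len(values); i++ {
		if values[i] > values[i-1] {
			n++
		}
	}
	return n
}

// decreasesOverTime counts how many times a sample is smaller than the previous one.
func decreasesOverTime(values []float64) int {
	n := 0
	for i := 1; i < len(values); i++ {
		if values[i] < values[i-1] {
			n++
		}
	}
	return n
}

func main() {
	window := []float64{1, 2, 2, 5, 3, 4} // samples of q within duration d
	fmt.Println(increasesOverTime(window), decreasesOverTime(window))
}
```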
Aliaksandr Valialkin
2444433d83 lib/storage: add missing break in removeDuplicateMetricIDs 2019-09-25 18:23:43 +03:00
Aliaksandr Valialkin
ea4c828bae lib/storage: remove duplicate MetricIDs in tag->metricIDs items before writing them into inverted index 2019-09-25 17:55:13 +03:00
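Removing duplicates before writing, as in the commit above, could look like the sketch below. It assumes the metricIDs are already sorted, which the commit message does not state; `removeDuplicates` is a hypothetical name and not the actual `removeDuplicateMetricIDs` implementation.

```go
package main

import "fmt"

// removeDuplicates drops adjacent duplicates from a sorted slice in place
// and returns the shortened slice.
func removeDuplicates(sorted []uint64) []uint64 {
	if len(sorted) < 2 {
		return sorted
	}
	dst := sorted[:1]
	for _, id := range sorted[1:] {
		if id != dst[len(dst)-1] {
			dst = append(dst, id)
		}
	}
	return dst
}

func main() {
	fmt.Println(removeDuplicates([]uint64{1, 1, 2, 3, 3, 3, 7}))
}
```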
Aliaksandr Valialkin
aebc45ad26 lib/{mergeset,storage}: do not cache inverted index blocks containing tag->metricIDs items
This should reduce the amounts of used RAM during queries with filters over big number of time series.
2019-09-25 14:02:15 +03:00
Aliaksandr Valialkin
2cb811b42f lib/uint64set: optimize Set.AppendTo 2019-09-25 00:34:17 +03:00
Aliaksandr Valialkin
b986516fbe lib/storage: create and use lib/uint64set instead of map[uint64]struct{}
This should improve inverted index search performance for filters matching big number of time series,
since `lib/uint64set.Set` is faster than `map[uint64]struct{}` for both `Add` and `Has` calls.
See the corresponding benchmarks in `lib/uint64set`.
2019-09-24 21:17:55 +03:00
Aliaksandr Valialkin
ef2296e420 lib/storage: typo fix: return dstData instead of data from mergeTagToMetricIDsRows 2019-09-24 19:32:34 +03:00
Aliaksandr Valialkin
a6086cde78 lib/storage: limit the number of metricIDs in tag->metricIDs row
This reduces the overhead on index and metaindex in lib/mergeset
2019-09-24 00:49:51 +03:00
Aliaksandr Valialkin
c9063ece66 lib/storage: share tsids across all the partSearch instances
This should reduce memory usage when big number of time series matches the given query.
2019-09-23 22:35:15 +03:00
Aliaksandr Valialkin
4e26ad869b lib/{storage,mergeset}: verify PrepareBlock callback results
Do not touch the first and the last item passed to PrepareBlock
in order to preserve sort order of mergeset blocks.
2019-09-23 20:43:13 +03:00
Aliaksandr Valialkin
0772191975 lib/mergeset: detect whether we are in test by executable suffix 2019-09-22 23:12:15 +03:00
Aliaksandr Valialkin
48999e5396 lib/workingsetcache: remove data race when resetting c.misses 2019-09-22 19:36:49 +03:00
Aliaksandr Valialkin
0adebae1f8 lib/storage: generate the first tag->metricIDs item in a mergeset block with a single metricID
The first item from each mergeset block goes into index (lib/mergeset.blockHeader),
so it must be short in order to reduce index size.
2019-09-22 19:21:33 +03:00
Aliaksandr Valialkin
267efde5ae README.md: update troubleshooting and tuning sections according to recent questions from our users 2019-09-22 19:12:24 +03:00
Aliaksandr Valialkin
0686ac52c3 lib/{storage,mergeset}: merge tag->metricID rows into tag->metricIDs rows for common tag values
This should improve lookup performance if the same `label=value` pair exists
in big number of time series.
This should also reduce memory usage for mergeset data cache, since `tag->metricIDs` rows
occupy less space than the original `tag->metricID` rows.
2019-09-20 22:06:41 +03:00
Aliaksandr Valialkin
68722c3c74 lib/encoding: optimize UnmarshalUint* and UnmarshalInt* 2019-09-20 13:08:16 +03:00
Aliaksandr Valialkin
a544f49c2b lib/storage: optimize selecting all the metricIDs by scanning MetricID->TSID entries instead of tag->MetricID entries
The number of MetricID->TSID entries is smaller than the number of tag->MetricID entries
and MetricID->TSID entries are usually shorter than tag->MetricID entries.
This should improve performance when selecting all the metricIDs.
2019-09-20 11:54:10 +03:00
Aliaksandr Valialkin
d32f88c378 app/vminsert/opentsdbhttp: remove FATAL prefix from logger.Fatalf errors for the sake of consistency with other logger.Fatalf calls 2019-09-19 22:15:59 +03:00
Aliaksandr Valialkin
00cfb2d2b9 lib/mergeset: rename misleading mergeSmallParts to mergeExistingParts 2019-09-19 21:48:20 +03:00
Aliaksandr Valialkin
37dc223e25 lib/mergeset: use sort.IsSorted instead of sort.SliceIsSorted in inmemoryBlock.isSorted in order to reduce memory allocations 2019-09-19 20:13:08 +03:00
Aliaksandr Valialkin
a84fe76677 lib/storage: use sort.Sort instead of sort.Slice in getSortedMetricIDs 2019-09-19 20:07:22 +03:00
Aliaksandr Valialkin
3a697a935a lib/storage: skip duplicate call to intersectMetricIDsWithTagFilter on zero successful intersects 2019-09-19 17:49:56 +03:00
Aliaksandr Valialkin
51a21c7d4b lib/mergeset: fill partHeader.firstItem on first block flush 2019-09-19 17:48:09 +03:00
Aliaksandr Valialkin
3d83f5d334 lib/storage: mark tag filter returning errFallbackToMetricNameMatch as useless
This will save CPU on subsequent calls for this filter
2019-09-18 19:10:32 +03:00
Aliaksandr Valialkin
6f3b2fd600 deployment/docker/docker-compose.yml: update Prometheus and Grafana image tags
Prometheus: from v2.10.0 to v2.12.0
Grafana: from v6.2.1 to v6.3.5
2019-09-18 18:29:09 +03:00
Aliaksandr Valialkin
8d35718dc6 lib/storage: properly construct keys for uselessTagFiltersCache and register useless negative tag filters there 2019-09-17 23:20:27 +03:00
Aliaksandr Valialkin
33975513d0 vendor: update github.com/valyala/gozstd from v1.6.1 to v1.6.2 2019-09-16 21:50:49 +03:00
Aliaksandr Valialkin
63f2b539df vendor: make vendor-update 2019-09-13 22:48:56 +03:00
Aliaksandr Valialkin
9428ec9c9f deployment/docker: remove file system paths from the compiled binary 2019-09-13 22:45:59 +03:00
Aliaksandr Valialkin
0c8057924f lib/mergeset: properly check for sorted block headers
Fix a typo for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/181
2019-09-13 21:59:29 +03:00
Aliaksandr Valialkin
d4218d27e6 app/vmselect/promql: properly handle subqueries like aggr_func(rollup_func(metric[window:step]))
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/184
2019-09-13 21:41:04 +03:00
hanzai
e2274714b1 lib/workingsetcache: adjust switching from mode=split to mode=whole smoothly and load cachefile successfully 2019-09-13 19:13:01 +03:00
Aliaksandr Valialkin
4d636c244d app/vmselect/promql: binary operation fixes according to Prometheus behaviour
The following issues were fixed:
- VictoriaMetrics could leave superfluous labels when using `on` or `ignoring` modifiers
- VictoriaMetrics could return `duplicate timeseries` error when using `group_left` or `group_right` with non-empty label list
2019-09-13 17:42:52 +03:00
Aliaksandr Valialkin
bad53e4207 lib/mergeset: dynamically calculate the maximum number of items per part, which can be cached in OS page cache 2019-09-11 14:53:45 +03:00
Artem Navoiev
3f581a9860 [ci] github actions - run pipeline on pull request. Fix running of test in external PR from forks 2019-09-11 09:30:11 +03:00
sundy-li
398e00aa54 README.md: fix ExtendedPromQL link url 2019-09-10 14:56:19 +03:00
Artem Navoiev
4fd741f40d [tests] check timestamp in tests (#177) 2019-09-08 19:48:38 +03:00
Artem Navoiev
4a2cd85b92 [ci] bump version of go to 1.13 in github actions config 2019-09-08 14:02:23 +03:00
Aliaksandr Valialkin
6c46afb087 vendor: update github.com/klauspost/compress from v1.7.6 to v1.8.2 2019-09-06 00:47:31 +03:00
Aliaksandr Valialkin
7343e8b408 vendor: update golang.org/x/sys 2019-09-06 00:47:31 +03:00
Artem Navoiev
22e3fabefd Add OpenTSDB and Prometheus integration tests (#168)
* [WIP] open tsdb and prometheus integration tests

* app/victoria-metrics: fix race condition on parallel tests
2019-09-05 17:55:38 +03:00
Aliaksandr Valialkin
88f8670ede lib/fs: add MustStopDirRemover for waiting until pending directories are removed on graceful shutdown
This patch is mainly required for laggy NFS. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-09-05 11:13:17 +03:00
Aliaksandr Valialkin
9eb5de334f lib/storage: typo fix 2019-09-04 19:58:01 +03:00
Aliaksandr Valialkin
6954e126fc app/vmselect/promql: ignore grouping by destination label in count_values, since such a grouping is performed automatically 2019-09-04 19:58:01 +03:00
Aliaksandr Valialkin
bce35b8dd9 README.md: mention that Prometheus doesn't drop data when VictoriaMetrics restarts 2019-09-04 18:40:39 +03:00
Aliaksandr Valialkin
16dd145586 lib/storage: remove duplicate tag keys on MetricName.Marshal call
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/172
2019-09-04 18:13:45 +03:00
Aliaksandr Valialkin
cd2c9e39da deployment/docker: switch Go builder from Go 1.12.9 to Go 1.13.0 2019-09-04 17:17:23 +03:00
Aliaksandr Valialkin
305e7bc981 app/vmselect/promql: do not return artificial points beyond the last point in time series 2019-09-04 16:35:34 +03:00
Aliaksandr Valialkin
9721d06c6a app/vmselect/prometheus: do not adjust start and end args in /api/v1/query_range if nocache=1 arg is set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/171
2019-09-04 13:10:09 +03:00
Aliaksandr Valialkin
4862e93024 lib/fs: try harder with directory removal on NFS in the event of temporary lock
Do not give up after 11 attempts of directory removal on laggy NFS.

Add `vm_nfs_dir_remove_failed_attempts_total` metric for counting the number of failed attempts
on directory removal.

Log failed attempts on directory removal after long sleep times.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-09-04 12:24:50 +03:00
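The retry behaviour described in the commit above can be sketched as a loop that never gives up and counts failed attempts. This is not the actual `lib/fs` code: the backoff values are assumptions, and `remove` and `sleep` are injected so the sketch runs without a real NFS mount.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errBusy simulates a temporary NFS lock on the directory.
var errBusy = errors.New("directory is temporarily locked")

// noSleep replaces time.Sleep so the example finishes instantly.
func noSleep(time.Duration) {}

// removeWithRetries calls remove until it succeeds and returns the number
// of failed attempts (the role a counter such as
// vm_nfs_dir_remove_failed_attempts_total plays in the commit).
func removeWithRetries(remove func() error, sleep func(time.Duration)) int {
	failed := 0
	delay := 100 * time.Millisecond // assumed initial delay
	const maxDelay = 10 * time.Second // assumed cap
	for {
		if err := remove(); err == nil {
			return failed
		}
		failed++
		sleep(delay)
		delay *= 2
		if delay > maxDelay {
			delay = maxDelay
		}
	}
}

func main() {
	calls := 0
	remove := func() error {
		calls++
		if calls <= 3 {
			return errBusy
		}
		return nil
	}
	fmt.Println("failed attempts:", removeWithRetries(remove, noSleep))
}
```

In real use, `remove` would wrap a call such as `os.RemoveAll(path)` and `sleep` would be `time.Sleep`.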
Aliaksandr Valialkin
db4560ca31 app/vmselect/promql: reset timeseries name on group_left and group_right as Prometheus does 2019-09-03 20:42:54 +03:00
Aliaksandr Valialkin
1575a560f0 app/vmselect/netstorage: adaptively adjust the maximum inmemory file size for storing temporary blocks
The maximum inmemory file size now depends on `-memory.allowedPercent`.
This should improve performance and reduce the number of filesystem calls
on machines with big amounts of RAM when performing heavy queries
over big number of samples and time series.
2019-09-03 13:32:09 +03:00
Aliaksandr Valialkin
e1d76ec1f3 lib/storage: invalidate tagFilters -> TSIDS cache when newly added index data becomes visible to search
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/163
2019-08-29 15:08:35 +03:00
Aliaksandr Valialkin
aeaa5de5fe lib/prompb: apply ba06b47c16
The following commands used:

gofmt -r '(uint64(x)&0x7F)<<shift -> uint64(x&0x7F)<<shift' -w ./lib/prompb/
gofmt -r '(int64(x)&0x7F)<<shift -> int64(x&0x7F)<<shift' -w ./lib/prompb/
2019-08-29 13:35:27 +03:00
Aliaksandr Valialkin
4c0a262a2e .github/workflows: verify builds on freebsd and darwin 2019-08-28 23:05:15 +03:00
Aliaksandr Valialkin
3685fc18d5 Makefile: extract app-local and app-local-pure build rules 2019-08-28 01:34:58 +03:00
Aliaksandr Valialkin
ede7ad3703 app/victoria-metrics: add missing victoria-metrics prefix to --version output when building with make victoria-metrics 2019-08-28 01:28:08 +03:00
Aliaksandr Valialkin
9196c085a7 all: port to FreeBSD on GOARCH=amd64 2019-08-28 01:19:23 +03:00
Aliaksandr Valialkin
3802ae9269 README.md: recommend checking which metrics will be deleted before deleting them 2019-08-27 15:01:16 +03:00
Artem Navoiev
b0090dbd86 add github actions (#160) 2019-08-27 14:42:46 +03:00
Aliaksandr Valialkin
603a79b357 app/vmstorage: increase default values for search.maxTagKeys, search.maxTagValues and search.maxUniqueTimeseries 2019-08-27 14:29:53 +03:00
Aliaksandr Valialkin
2655220c58 lib/storage: go fmt 2019-08-27 14:29:51 +03:00
Aliaksandr Valialkin
bf915fc0db lib/storage: report proper maxMetrics limit when more than -search.maxUniqueTimeseries series match the given filters 2019-08-27 14:21:42 +03:00
Aliaksandr Valialkin
2fc157ff7a lib/storage: properly handle (?i) in the tag filter regexp
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/161
2019-08-26 00:44:45 +03:00
Aliaksandr Valialkin
0dc0006f34 lib/storage: calculate the maximum number of rows per small part from -memory.allowedPercent
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/159

This simplifies error detection additionally to the `vm_rows_ignored_total` counters.
2019-08-25 15:31:47 +03:00
Aliaksandr Valialkin
4b688fffee lib/storage: calculate the maximum number of rows per small part from -memory.allowedPercent
This should improve query speed over recent data on machines with big amounts of RAM
2019-08-25 14:41:12 +03:00
Aliaksandr Valialkin
1402a6b981 lib/storage: properly limit the number of output rows in small and big parts storage
Previously small parts storage didn't take into account the available disk space for big parts.
2019-08-25 14:41:12 +03:00
Aliaksandr Valialkin
3308279c4e lib/storage: remove outdated comment on maxRowsPerSmallPart
The comment became outdated after the commit ed6ac1a5df027f0dfc22448e3b27c26b6f77c67a,
which stops merging of small parts on graceful shutdown instead of waiting
for their completion.
2019-08-25 13:47:32 +03:00
Aliaksandr Valialkin
fb909cf710 app/vminsert/influx: set db label only if Influx line doesn't have db tag 2019-08-24 13:52:48 +03:00
Aliaksandr Valialkin
c4e75f09dc README.md: mention that -retentionPeriod must cover the backfilled data 2019-08-24 13:52:48 +03:00
Aliaksandr Valialkin
fb8840ac38 vendor: update github.com/valyala/quicktemplate from v1.1.1 to v1.2.0 2019-08-24 13:41:15 +03:00
Aliaksandr Valialkin
9c9221d1b2 app/vminsert: skip empty tags 2019-08-24 13:36:29 +03:00
Aliaksandr Valialkin
70ca018a57 app/vminsert/opentsdbhttp: skip invalid rows and continue parsing the remaining rows
Invalid rows are logged and counted in `vm_rows_invalid_total{type="opentsdb-http"}` metric
2019-08-24 13:36:29 +03:00
Aliaksandr Valialkin
4266091e4f app/vminsert/opentsdb: skip invalid rows and continue parsing the remaining rows
Invalid rows are logged and counted in `vm_rows_invalid_total{type="opentsdb"}` metric
2019-08-24 13:36:29 +03:00
Aliaksandr Valialkin
8001d29b6e app/vminsert/graphite: skip invalid rows and continue parsing the remaining rows
Invalid rows are logged and counted in `vm_rows_invalid_total{type="graphite"}` metric
2019-08-24 13:36:29 +03:00
Aliaksandr Valialkin
9d3f1fcbb9 app/vminsert/influx: skip invalid rows and continue parsing the remaining rows
Invalid influx lines are logged and counted in `vm_rows_invalid_total{type="influx"}` metric.
2019-08-24 13:36:29 +03:00
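The "skip invalid rows and continue" behaviour shared by the four parser commits above can be sketched as follows. The `<name> <value>` line format and the `parseRows` helper are hypothetical simplifications, not any of the real protocols; the `invalid` counter stands in for a metric such as `vm_rows_invalid_total`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// row is a parsed "<name> <value>" line (hypothetical simplified format).
type row struct {
	name  string
	value float64
}

// parseRows parses every line it can and counts the ones it cannot,
// instead of aborting the whole batch on the first error.
func parseRows(data string) (rows []row, invalid int) {
	for _, line := range strings.Split(data, "\n") {
		if line == "" {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) != 2 {
			invalid++ // wrong number of fields
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			invalid++ // value is not a number
			continue
		}
		rows = append(rows, row{name: fields[0], value: v})
	}
	return rows, invalid
}

func main() {
	rows, invalid := parseRows("cpu 0.5\nbroken\nmem 1024\ndisk abc\n")
	fmt.Println(len(rows), invalid)
}
```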
Aliaksandr Valialkin
ba7b3806be app/vminsert/influx: do not allow escaping newline char, since they don't occur in real life
The previous report with escaped newline chars in influx line protocol was a false alarm.
2019-08-23 18:42:05 +03:00
Aliaksandr Valialkin
7fa88c6efc app/vminsert/opentsdbhttp: allow timestamp as float64 and as string, since it occurs in real life 2019-08-23 18:35:41 +03:00
Aliaksandr Valialkin
4da34b11f8 app/vminsert/influx: handle \r\n aka crlf influx line endings from windows world
Such lines exist in real life.
2019-08-23 18:28:49 +03:00
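Handling CRLF line endings, as in the commit above, can be sketched by splitting on `\n` and trimming a trailing `\r`. `splitLines` is a hypothetical helper and skipping empty lines is an assumption; the real parser may work differently.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines splits data into lines, accepting both "\n" and "\r\n" endings.
func splitLines(data string) []string {
	var lines []string
	for _, line := range strings.Split(data, "\n") {
		line = strings.TrimSuffix(line, "\r") // drop the CR left over from CRLF
		if line == "" {
			continue
		}
		lines = append(lines, line)
	}
	return lines
}

func main() {
	for _, line := range splitLines("cpu value=1\r\nmem value=2\n") {
		fmt.Printf("%q\n", line)
	}
}
```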
Aliaksandr Valialkin
a18317adbc app/vminsert/influx: allow escaping newline char
Though newline char isn't mentioned in escape rules at https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/ ,
there are reports that such chars occur in real life
2019-08-23 15:14:46 +03:00
Aliaksandr Valialkin
44d7fc599d app/vminsert/influx: skip comments starting with # in influx line protocol 2019-08-23 14:43:09 +03:00
Aliaksandr Valialkin
dce6079379 README.md: add a section about Go profiling 2019-08-23 13:37:09 +03:00
Aliaksandr Valialkin
98419c00ef vendor: make vendor-update 2019-08-23 10:02:10 +03:00
Aliaksandr Valialkin
ac004665b5 all: return 503 http error if service is temporarily unavailable
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/156
2019-08-23 09:55:07 +03:00
Aliaksandr Valialkin
8c03a8c4b4 app/vminsert: allow setting the maximum number of labels per time series via -maxLabelsPerTimeseries 2019-08-23 08:45:26 +03:00
Aliaksandr Valialkin
8a126c2865 README.md: mention that VictoriaMetrics supports enterprise workloads 2019-08-22 18:00:47 +03:00
Aliaksandr Valialkin
380cae23a0 lib/storage: add benchmarks for regexp filter match / mismatch
These benchmarks allow estimating the performance of regexp filters in promql
2019-08-22 16:36:42 +03:00
1276 changed files with 391338 additions and 38274 deletions

30
.github/workflows/github-pages.yml vendored Normal file

@@ -0,0 +1,30 @@
name: github-pages
on:
  push:
    paths:
      - 'docs/*.md'
      - 'README.md'
    branches:
      - master
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - name: publish
        shell: bash
        env:
          TOKEN: ${{secrets.CI_TOKEN}}
        run: |
          git clone https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.github.io.git gpages
          cp docs/*.md gpages
          cp README.md gpages
          cd gpages
          git config --local user.email "info@victoriametrics.com"
          git config --local user.name "Vika"
          git add "*.md"
          git commit -m "update github pages"
          remote_repo="https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.github.io.git"
          git push "${remote_repo}"
          cd ..
          rm -rf gpages

51
.github/workflows/main.yml vendored Normal file

@@ -0,0 +1,51 @@
name: main
on:
  push:
    paths-ignore:
      - 'docs/**'
      - '**.md'
  pull_request:
    paths-ignore:
      - 'docs/**'
      - '**.md'
jobs:
  build:
    name: Build
    runs-on: ubuntu-latest
    steps:
      - name: Setup Go
        uses: actions/setup-go@v1
        with:
          go-version: 1.13
        id: go
      - name: Code checkout
        uses: actions/checkout@v1
      - name: Dependencies
        env:
          GO111MODULE: off
        run: |
          go get -v golang.org/x/lint/golint
          go get -u github.com/kisielk/errcheck
      - name: Build
        env:
          GO111MODULE: on
        run: |
          export PATH=$PATH:$(go env GOPATH)/bin # temporary fix. See https://github.com/actions/setup-go/issues/14
          make check-all
          git diff --exit-code
          make test-full
          make test-pure
          make test-full-386
          make victoria-metrics
          make victoria-metrics-pure
          make victoria-metrics-arm
          make victoria-metrics-arm64
          make vmutils
          GOOS=freebsd go build -mod=vendor ./app/victoria-metrics
          GOOS=darwin go build -mod=vendor ./app/victoria-metrics
      - name: Publish coverage
        uses: codecov/codecov-action@v1.0.4
        with:
          token: ${{secrets.CODECOV_TOKEN}}
          file: ./coverage.txt

29
.github/workflows/wiki.yml vendored Normal file

@@ -0,0 +1,29 @@
name: wiki
on:
  push:
    paths:
      - 'docs/*.md'
    branches:
      - master
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - name: publish
        shell: bash
        env:
          TOKEN: ${{secrets.CI_TOKEN}}
        run: |
          cd docs
          git clone https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.wiki.git wiki
          find ./ -name '*.md' -exec cp -prv '{}' 'wiki' ';'
          cd wiki
          git config --local user.email "info@victoriametrics.com"
          git config --local user.name "Vika"
          git add "*.md"
          git commit -m "update wiki pages"
          remote_repo="https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.wiki.git"
          git push "${remote_repo}"
          cd ..
          rm -rf wiki

1
.gitignore vendored

@@ -1,3 +1,4 @@
/tmp
/tags
/pkg
*.pprof

View File

@@ -1,26 +0,0 @@
language: go
go:
- 1.12.x
install: make
env:
- GO111MODULE=on
before_install:
- GO111MODULE=off go get -v golang.org/x/lint/golint
- GO111MODULE=off go get -u github.com/kisielk/errcheck
script:
- make check-all
- git diff --exit-code
- make test-full
- make test-pure
- make victoria-metrics
- make victoria-metrics-pure
- make victoria-metrics-arm
- make victoria-metrics-arm64
after_success:
- bash <(curl -s https://codecov.io/bash)

View File

@@ -1,7 +1,7 @@
PKG_PREFIX := github.com/VictoriaMetrics/VictoriaMetrics
BUILDINFO_TAG ?= $(shell echo $$(git describe --long --all | tr '/' '-')$$( \
git diff-index --quiet HEAD -- || echo '-dirty-'$$(git diff-index -u HEAD | sha1sum | grep -oP '^.{8}')))
git diff-index --quiet HEAD -- || echo '-dirty-'$$(git diff-index -u HEAD | openssl sha1 | cut -c 10-17)))
PKG_TAG ?= $(shell git tag -l --points-at HEAD)
ifeq ($(PKG_TAG),)
@@ -19,12 +19,36 @@ include deployment/*/Makefile
clean:
rm -rf bin/*
publish: publish-victoria-metrics
publish: \
publish-victoria-metrics \
publish-vmbackup \
publish-vmrestore
package: package-victoria-metrics
package: \
package-victoria-metrics \
package-vmbackup \
package-vmrestore
release: victoria-metrics-prod
cd bin && tar czf victoria-metrics-$(PKG_TAG).tar.gz victoria-metrics-prod
vmutils: \
vmbackup \
vmrestore
release: \
release-victoria-metrics \
release-vmutils
release-victoria-metrics: victoria-metrics-prod
cd bin && tar czf victoria-metrics-$(PKG_TAG).tar.gz victoria-metrics-prod && \
sha256sum victoria-metrics-$(PKG_TAG).tar.gz > victoria-metrics-$(PKG_TAG)_checksums.txt
release-vmutils: \
vmbackup-prod \
vmrestore-prod
cd bin && tar czf vmutils-$(PKG_TAG).tar.gz vmbackup-prod vmrestore-prod && \
sha256sum vmutils-$(PKG_TAG).tar.gz > vmutils-$(PKG_TAG)_checksums.txt
pprof-cpu:
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics@ $(PPROF_FILE)
fmt:
GO111MODULE=on gofmt -l -w -s ./lib
@@ -39,13 +63,15 @@ lint: install-golint
golint app/...
install-golint:
which golint || GO111MODULE=off go get -u github.com/golang/lint/golint
which golint || GO111MODULE=off go get -u golang.org/x/lint/golint
errcheck: install-errcheck
errcheck -exclude=errcheck_excludes.txt ./lib/...
errcheck -exclude=errcheck_excludes.txt ./app/vminsert/...
errcheck -exclude=errcheck_excludes.txt ./app/vmselect/...
errcheck -exclude=errcheck_excludes.txt ./app/vmstorage/...
errcheck -exclude=errcheck_excludes.txt ./app/vmbackup/...
errcheck -exclude=errcheck_excludes.txt ./app/vmrestore/...
install-errcheck:
which errcheck || GO111MODULE=off go get -u github.com/kisielk/errcheck
@@ -61,6 +87,9 @@ test-pure:
test-full:
GO111MODULE=on go test -tags=integration -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
test-full-386:
GO111MODULE=on GOARCH=386 go test -tags=integration -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
benchmark:
GO111MODULE=on go test -mod=vendor -bench=. ./lib/...
GO111MODULE=on go test -mod=vendor -bench=. ./app/...
@@ -75,6 +104,12 @@ vendor-update:
GO111MODULE=on go mod tidy
GO111MODULE=on go mod vendor
app-local:
CGO_ENABLED=1 GO111MODULE=on go build $(RACE) -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
app-local-pure:
CGO_ENABLED=0 GO111MODULE=on go build $(RACE) -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-pure$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
quicktemplate-gen: install-qtc
qtc
@@ -83,7 +118,7 @@ install-qtc:
golangci-lint: install-golangci-lint
golangci-lint run --exclude '(SA4003|SA1019):' -D errcheck
golangci-lint run --exclude '(SA4003|SA1019):' -D errcheck -D structcheck
install-golangci-lint:
which golangci-lint || GO111MODULE=off go get -u github.com/golangci/golangci-lint/cmd/golangci-lint

README.md

@@ -2,7 +2,7 @@
[![Slack](https://img.shields.io/badge/join%20slack-%23victoriametrics-brightgreen.svg)](http://slack.victoriametrics.com/)
[![GitHub license](https://img.shields.io/github/license/VictoriaMetrics/VictoriaMetrics.svg)](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/LICENSE)
[![Go Report](https://goreportcard.com/badge/github.com/VictoriaMetrics/VictoriaMetrics)](https://goreportcard.com/report/github.com/VictoriaMetrics/VictoriaMetrics)
[![Build Status](https://travis-ci.org/VictoriaMetrics/VictoriaMetrics.svg?branch=master)](https://travis-ci.org/VictoriaMetrics/VictoriaMetrics)
[![Build Status](https://github.com/VictoriaMetrics/VictoriaMetrics/workflows/main/badge.svg)](https://github.com/VictoriaMetrics/VictoriaMetrics/actions)
[![codecov](https://codecov.io/gh/VictoriaMetrics/VictoriaMetrics/branch/master/graph/badge.svg)](https://codecov.io/gh/VictoriaMetrics/VictoriaMetrics)
<img alt="Victoria Metrics" src="logo.png">
@@ -21,11 +21,12 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
* Supports [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/), so it can be used as Prometheus drop-in replacement in Grafana.
Additionally, VictoriaMetrics extends PromQL with opt-in [useful features](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/ExtendedPromQL).
* Global query view. Multiple Prometheus instances may write data into VictoriaMetrics. Later this data may be used in a single query.
* Supports global query view. Multiple Prometheus instances may write data into VictoriaMetrics. Later this data may be used in a single query.
* High performance and good scalability for both [inserts](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
and [selects](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4).
[Outperforms InfluxDB and TimescaleDB by up to 20x](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
* [Uses 10x less RAM than InfluxDB](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) when working with millions of unique time series (aka high cardinality).
* Optimized for time series with high churn rate. Think about [prometheus-operator](https://github.com/coreos/prometheus-operator) metrics from frequent deployments in Kubernetes.
* High data compression, so [up to 70x more data points](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
may be crammed into limited storage compared to TimescaleDB.
* Optimized for storage with high-latency IO and low IOPS (HDD and network storage in AWS, Google Cloud, Microsoft Azure, etc). See [graphs from these benchmarks](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b).
@@ -33,11 +34,13 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
See [vertical scalability benchmarks](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
and [comparing Thanos to VictoriaMetrics cluster](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683).
* Easy operation:
* VictoriaMetrics consists of a single executable without external dependencies.
* VictoriaMetrics consists of a single [small executable](https://medium.com/@valyala/stripping-dependency-bloat-in-victoriametrics-docker-image-983fb5912b0d) without external dependencies.
* All the configuration is done via explicit command-line flags with reasonable defaults.
* All the data is stored in a single directory pointed by `-storageDataPath` flag.
* Easy backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
* Storage is protected from corruption on unclean shutdown (i.e. hardware reset or `kill -9`) thanks to [the storage architecture](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
* Easy and fast backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
to S3 or GCS with [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md) / [vmrestore](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md).
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
* Storage is protected from corruption on unclean shutdown (i.e. OOM, hardware reset or `kill -9`) thanks to [the storage architecture](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
* Supports metrics' ingestion and [backfilling](#backfilling) via the following protocols:
* [Prometheus remote write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
* [InfluxDB line protocol](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/)
@@ -45,7 +48,7 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
if `-graphiteListenAddr` is set.
* [OpenTSDB put message](http://opentsdb.net/docs/build/html/api_telnet/put.html) if `-opentsdbListenAddr` is set.
* [HTTP OpenTSDB /api/put requests](http://opentsdb.net/docs/build/html/api_http/put.html) if `-opentsdbHTTPListenAddr` is set.
* Ideally works with big amounts of time series data from Kubernetes, IoT sensors, connected cars and industrial telemetry.
* Ideally works with big amounts of time series data from Kubernetes, IoT sensors, connected cars, industrial telemetry, financial data and various Enterprise workloads.
* Has open source [cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
@@ -88,6 +91,8 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
- [Monitoring](#monitoring)
- [Troubleshooting](#troubleshooting)
- [Backfilling](#backfilling)
- [Profiling](#profiling)
- [Integrations](#integrations)
- [Roadmap](#roadmap)
- [Contacts](#contacts)
- [Community and contributions](#community-and-contributions)
@@ -106,8 +111,8 @@ or [docker image](https://hub.docker.com/r/victoriametrics/victoria-metrics/) wi
The following command-line flags are used the most:
* `-storageDataPath` - path to data directory. VictoriaMetrics stores all the data in this directory.
* `-retentionPeriod` - retention period in months for the data. Older data is automatically deleted.
* `-storageDataPath` - path to data directory. VictoriaMetrics stores all the data in this directory. Default path is `victoria-metrics-data` in current working directory.
* `-retentionPeriod` - retention period in months for the data. Older data is automatically deleted. Default period is 1 month.
* `-httpListenAddr` - TCP address to listen to for http requests. By default, it listens on port `8428` on all the network interfaces.
* `-graphiteListenAddr` - TCP and UDP address to listen to for Graphite data. By default, it is disabled.
* `-opentsdbListenAddr` - TCP and UDP address to listen to for OpenTSDB data over telnet protocol. By default, it is disabled.
@@ -155,7 +160,7 @@ The label name may be arbitrary - `datacenter` is just an example. The label val
across Prometheus instances, so those time series may be filtered and grouped by this label.
It is recommended upgrading Prometheus to [v2.10.0](https://github.com/prometheus/prometheus/releases) or newer,
It is recommended upgrading Prometheus to [v2.12.0](https://github.com/prometheus/prometheus/releases) or newer,
since the previous versions may have issues with `remote_write`.
@@ -170,7 +175,7 @@ http://<victoriametrics-addr>:8428
Substitute `<victoriametrics-addr>` with the hostname or IP address of VictoriaMetrics.
Then build graphs with the created datasource using [Prometheus query language](https://prometheus.io/docs/prometheus/latest/querying/basics/).
VictoriaMetrics supports native PromQL and [extends it with useful features](ExtendedPromQL).
VictoriaMetrics supports native PromQL and [extends it with useful features](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/ExtendedPromQL).
### How to upgrade VictoriaMetrics?
@@ -185,6 +190,9 @@ Follow the following steps during the upgrade:
2) Wait until the process stops. This can take a few seconds.
3) Start the upgraded VictoriaMetrics.
Prometheus doesn't drop data during VictoriaMetrics restart.
See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details.
### How to apply new config to VictoriaMetrics?
@@ -194,6 +202,9 @@ VictoriaMetrics must be restarted for applying new config:
2) Wait until the process stops. This can take a few seconds.
3) Start VictoriaMetrics with the new config.
Prometheus doesn't drop data during VictoriaMetrics restart.
See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details.
### How to send data from InfluxDB-compatible agents such as [Telegraf](https://www.influxdata.com/time-series-platform/telegraf/)?
@@ -208,10 +219,11 @@ For instance, put the following lines into `Telegraf` config, so it sends data t
Do not forget substituting `<victoriametrics-addr>` with the real address where VictoriaMetrics runs.
VictoriaMetrics maps Influx data using the following rules:
* [`db` query arg](https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint) is mapped into `db` label value.
* [`db` query arg](https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint) is mapped into `db` label value
unless `db` tag exists in the Influx line.
* Field names are mapped to time series names prefixed with `{measurement}{separator}` value,
where `{separator}` equals to `_` by default. It can be changed with `-influxMeasurementFieldSeparator` command-line flag.
See also `-influxSkipSingleField` command-line flag.
See also `-influxSkipSingleField` command-line flag. If `{measurement}` is empty, then time series names correspond to field names.
* Field values are mapped to time series values.
* Tags are mapped to Prometheus labels as-is.
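The field-name rule above can be illustrated with a minimal Go sketch. It only shows the naming rule as described here and is not VictoriaMetrics' actual implementation; `seriesName` is a hypothetical helper, and the sketch deliberately ignores `-influxSkipSingleField`.

```go
package main

import "fmt"

// seriesName is a hypothetical helper that builds a time series name from
// an Influx measurement and field name, following the rule described above:
// {measurement}{separator}{field}, or just {field} when the measurement is empty.
func seriesName(measurement, field, separator string) string {
	if measurement == "" {
		return field
	}
	return measurement + separator + field
}

func main() {
	// "_" is the default separator according to the text above.
	fmt.Println(seriesName("measurement", "field1", "_"))
	fmt.Println(seriesName("", "field1", "_"))
}
```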
@@ -239,14 +251,14 @@ An arbitrary number of lines delimited by '\n' may be sent in a single request.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
```
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__!=""}'
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"measurement_.*"}'
```
The `/api/v1/export` endpoint should return the following response:
```
{"metric":{"__name__":"measurement.field1","tag1":"value1","tag2":"value2"},"values":[123],"timestamps":[1560272508147]}
{"metric":{"__name__":"measurement.field2","tag1":"value1","tag2":"value2"},"values":[1.23],"timestamps":[1560272508147]}
{"metric":{"__name__":"measurement_field1","tag1":"value1","tag2":"value2"},"values":[123],"timestamps":[1560272508147]}
{"metric":{"__name__":"measurement_field2","tag1":"value1","tag2":"value2"},"values":[1.23],"timestamps":[1560272508147]}
```
Note that Influx line protocol expects [timestamps in *nanoseconds* by default](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/#timestamp),
@@ -277,7 +289,7 @@ An arbitrary number of lines delimited by `\n` may be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
```
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__!=""}'
curl -G 'http://localhost:8428/api/v1/export' -d 'match=foo.bar.baz'
```
The `/api/v1/export` endpoint should return the following response:
@@ -322,7 +334,7 @@ An arbitrary number of lines delimited by `\n` may be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
```
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__!=""}'
curl -G 'http://localhost:8428/api/v1/export' -d 'match=foo.bar.baz'
```
The `/api/v1/export` endpoint should return the following response:
@@ -452,8 +464,8 @@ The page will return the following JSON response:
```
Snapshots are created under `<-storageDataPath>/snapshots` directory, where `<-storageDataPath>`
is the command-line flag value. Snapshots can be archived to backup storage via `cp -L`, `rsync -L`, `scp -r`
or any similar tool that follows symlinks during copying.
is the command-line flag value. Snapshots can be archived to backup storage at any time
with [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md).
The `http://<victoriametrics-addr>:8428/snapshot/list` page contains the list of available snapshots.
@@ -464,9 +476,9 @@ Navigate to `http://<victoriametrics-addr>:8428/snapshot/delete_all` in order to
Steps for restoring from a snapshot:
1. Stop VictoriaMetrics with `kill -INT`.
2. Remove the entire contents of the directory pointed by `-storageDataPath` command-line flag.
3. Copy snapshot contents to the directory pointed by `-storageDataPath`.
4. Start VictoriaMetrics.
2. Restore snapshot contents from backup with [vmrestore](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md)
to the directory pointed by `-storageDataPath`.
3. Start VictoriaMetrics.
### How to delete time series?
@@ -476,6 +488,9 @@ where `<timeseries_selector_for_delete>` may contain any [time series selector](
for metrics to delete. After that all the time series matching the given selector are deleted. Storage space for
the deleted time series isn't freed instantly - it is freed during subsequent merges of data files.
It is recommended to verify which metrics will be deleted by calling `http://<victoria-metrics-addr>:8428/api/v1/series?match[]=<timeseries_selector_for_delete>`
before actually deleting the metrics.
### How to export time series?
@@ -500,7 +515,7 @@ at `http://<victoriametrics-addr>:8428/federate?match[]=<timeseries_selector_for
Optional `start` and `end` args may be added to the request in order to scrape the last point for each selected time series on the `[start ... end]` interval.
`start` and `end` may contain either unix timestamp in seconds or [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) values. By default, the last point
on the interval `[now - max_lookback ... now]` is scraped for each time series. The default value for `max_lookback` is `5m` (5 minutes), but can be overridden.
on the interval `[now - max_lookback ... now]` is scraped for each time series. The default value for `max_lookback` is `5m` (5 minutes), but it can be overridden.
For instance, `/federate?match[]=up&max_lookback=1h` would return last points on the `[now - 1h ... now]` interval. This may be useful for time series federation
with scrape intervals exceeding `5m`.
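A minimal Go sketch of building such a federation URL is shown below. `federateURL` is a hypothetical helper; only the `/federate` path and the `match[]` and `max_lookback` query args come from the text above.

```go
package main

import (
	"fmt"
	"net/url"
)

// federateURL is a hypothetical helper that builds a /federate URL for the
// given VictoriaMetrics address and time series selector.
// maxLookback is optional; pass "" to keep the server-side default.
func federateURL(addr, selector, maxLookback string) string {
	q := url.Values{}
	q.Add("match[]", selector)
	if maxLookback != "" {
		q.Set("max_lookback", maxLookback)
	}
	u := url.URL{Scheme: "http", Host: addr, Path: "/federate", RawQuery: q.Encode()}
	return u.String()
}

func main() {
	// The square brackets in match[] are percent-encoded by url.Values.Encode.
	fmt.Println(federateURL("localhost:8428", "up", "1h"))
}
```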
@@ -516,7 +531,7 @@ A rough estimation of the required resources for ingestion path:
VictoriaMetrics stores various caches in RAM. Memory size for these caches may be limited by `-memory.allowedPercent` flag.
* CPU cores: a CPU core per 300K inserted data points per second. So, ~4 CPU cores are required for processing
the insert stream of 1M data points per second. The ingestion rate may be lower for high cardinality data.
the insert stream of 1M data points per second. The ingestion rate may be lower for high cardinality data or for time series with a high number of labels.
See [this article](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) for details.
If you see lower numbers per CPU core, then it is likely active time series info doesn't fit caches,
so you need more RAM for lowering CPU usage.
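The CPU estimate above is simple arithmetic, sketched here in Go. The 300K figure is the rough per-core rate quoted above; `estimateCPUCores` is a hypothetical helper, and real requirements depend on cardinality, label count and available RAM.

```go
package main

import (
	"fmt"
	"math"
)

// pointsPerCorePerSecond is the rough ingestion rate per CPU core quoted above.
const pointsPerCorePerSecond = 300000

// estimateCPUCores is a hypothetical helper that rounds the required number
// of CPU cores up to the next whole core for the given ingestion rate.
func estimateCPUCores(pointsPerSecond float64) int {
	return int(math.Ceil(pointsPerSecond / pointsPerCorePerSecond))
}

func main() {
	// 1M data points per second / 300K per core = 3.33..., rounded up to 4 cores.
	fmt.Println(estimateCPUCores(1000000))
}
```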
@@ -641,6 +656,14 @@ For example, substitute `-graphiteListenAddr=:2003` with `-graphiteListenAddr=<i
* There is no need in Operating System tuning since VictoriaMetrics is optimized for default OS settings.
The only option is increasing the limit on [the number of open files in the OS](https://medium.com/@muhammadtriwibowo/set-permanently-ulimit-n-open-files-in-ubuntu-4d61064429a),
so Prometheus instances could establish more connections to VictoriaMetrics.
* The recommended filesystem is `ext4`, the recommended persistent storage is [persistent HDD-based disk on GCP](https://cloud.google.com/compute/docs/disks/#pdspecs),
since it is protected from hardware failures via internal replication and it can be [resized on the fly](https://cloud.google.com/compute/docs/disks/add-persistent-disk#resize_pd).
If you plan to store more than 1TB of data on an `ext4` partition or plan to extend it to more than 16TB,
then it is recommended to pass the following options to `mkfs.ext4`:
```
mkfs.ext4 ... -O 64bit,huge_file,extent -T huge
```
### Monitoring
@@ -653,10 +676,8 @@ The most interesting metrics are:
* `vm_cache_entries{type="storage/hour_metric_ids"}` - the number of time series with new data points during the last hour
aka active time series.
* `vm_rows{type="indexdb"}` - the number of rows in inverted index. Each label in each unique time series adds a single
row into the inverted index. An approximate number of time series in the database may be calculated as
`vm_rows{type="indexdb"} / (avg_labels_per_series + 1)`, where `avg_labels_per_series` is the average number of labels
per each time series.
* `rate(vm_new_timeseries_created_total[5m])` - time series churn rate.
* `vm_rows{type="indexdb"}` - the number of rows in inverted index. A high value for this number usually means a high churn rate for time series.
* Sum of `vm_rows{type="storage/big"}` and `vm_rows{type="storage/small"}` - total number of `(timestamp, value)` data points
in the database.
* Sum of all the `vm_cache_size_bytes` metrics - the total size of all the caches in the database.
@@ -667,6 +688,9 @@ The most interesting metrics are:
### Troubleshooting
* It is recommended to use default command-line flag values (i.e. don't set them explicitly) until the need
to tweak these flag values arises.
* If VictoriaMetrics works slowly and eats more than a CPU core per 100K ingested data points per second,
then it is likely you have too many active time series for the current amount of RAM.
It is recommended increasing the amount of RAM on the node with VictoriaMetrics in order to improve
@@ -686,11 +710,41 @@ The most interesting metrics are:
### Backfilling
Make sure that configured `-retentionPeriod` covers timestamps for the backfilled data.
It is recommended to disable the query cache with `-search.disableCache` command-line flag when writing
historical data with timestamps from the past, since the cache assumes that the data is written with
the current timestamps. The query cache can be enabled again after the backfilling is complete.
### Profiling
VictoriaMetrics provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs):
- Memory profile. It can be collected with the following command:
```
curl -s http://<victoria-metrics-host>:8428/debug/pprof/heap > mem.pprof
```
- CPU profile. It can be collected with the following command:
```
curl -s http://<victoria-metrics-host>:8428/debug/pprof/profile > cpu.pprof
```
The command for collecting CPU profile waits for 30 seconds before returning.
The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
## Integrations
* [netdata](https://github.com/netdata/netdata) can push data into VictoriaMetrics via `Prometheus remote_write API`.
See [these docs](https://github.com/netdata/netdata#integrations).
* [go-graphite/carbonapi](https://github.com/go-graphite/carbonapi) can use VictoriaMetrics as time series backend.
See [this example](/blob/master/cmd/carbonapi/carbonapi.example.prometheus.yaml).
* [Ansible role for installing VictoriaMetrics](https://github.com/dreamteam-gg/ansible-victoriametrics-role).
## Roadmap
- [ ] Replication [#118](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/118)
@@ -713,8 +767,8 @@ Contact us with any questions regarding VictoriaMetrics at [info@victoriametrics
Feel free to ask any questions regarding VictoriaMetrics:
- [slack](http://slack.victoriametrics.com/)
- [telergam-en](https://t.me/VictoriaMetrics_en)
- [telergam-ru](https://t.me/VictoriaMetrics_ru1)
- [telegram-en](https://t.me/VictoriaMetrics_en)
- [telegram-ru](https://t.me/VictoriaMetrics_ru1)
- [google groups](https://groups.google.com/forum/#!forum/victorametrics-users)


@@ -1,7 +1,7 @@
# All these commands must run from repository root.
victoria-metrics:
GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics ./app/victoria-metrics
APP_NAME=victoria-metrics $(MAKE) app-local
victoria-metrics-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker
@@ -32,8 +32,20 @@ victoria-metrics-arm64:
victoria-metrics-arm64-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-arm64' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm64' $(MAKE) app-via-docker
victoria-metrics-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-ppc64le ./app/victoria-metrics
victoria-metrics-ppc64le-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-ppc64le' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=ppc64le' $(MAKE) app-via-docker
victoria-metrics-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-386 ./app/victoria-metrics
victoria-metrics-386-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-386' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=386' $(MAKE) app-via-docker
victoria-metrics-pure:
GO111MODULE=on CGO_ENABLED=0 go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-pure ./app/victoria-metrics
APP_NAME=victoria-metrics $(MAKE) app-local-pure
victoria-metrics-pure-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-pure' DOCKER_OPTS='--env CGO_ENABLED=0' $(MAKE) app-via-docker


@@ -1,5 +1,5 @@
FROM scratch
COPY --from=local/certs:1.0.2 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY --from=local/certs:1.0.3 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY bin/victoria-metrics-prod .
EXPOSE 8428
ENTRYPOINT ["/victoria-metrics-prod"]


@@ -9,6 +9,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
@@ -43,6 +44,8 @@ func main() {
vmstorage.Stop()
vmselect.Stop()
fs.MustStopDirRemover()
logger.Infof("the VictoriaMetrics has been stopped in %s", time.Since(startTime))
}


@@ -7,6 +7,7 @@ import (
"encoding/json"
"flag"
"fmt"
"io"
"io/ioutil"
"log"
"net"
@@ -18,26 +19,31 @@ import (
"testing"
"time"
testutil "github.com/VictoriaMetrics/VictoriaMetrics/app/victoria-metrics/test"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
const (
testFixturesDir = "testdata"
testStorageSuffix = "vm-test-storage"
testHTTPListenAddr = ":7654"
testStatsDListenAddr = ":2003"
testOpenTSDBListenAddr = ":4242"
testLogLevel = "INFO"
testFixturesDir = "testdata"
testStorageSuffix = "vm-test-storage"
testHTTPListenAddr = ":7654"
testStatsDListenAddr = ":2003"
testOpenTSDBListenAddr = ":4242"
testOpenTSDBHTTPListenAddr = ":4243"
testLogLevel = "INFO"
)
const (
testReadHTTPPath = "http://127.0.0.1" + testHTTPListenAddr
testWriteHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/write"
testHealthHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/health"
testReadHTTPPath = "http://127.0.0.1" + testHTTPListenAddr
testWriteHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/write"
testOpenTSDBWriteHTTPPath = "http://127.0.0.1" + testOpenTSDBHTTPListenAddr + "/api/put"
testPromWriteHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/api/v1/write"
testHealthHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/health"
)
const (
@@ -50,18 +56,69 @@ var (
)
type test struct {
Name string `json:"name"`
Data string `json:"data"`
Query string `json:"query"`
Result []Row `json:"result"`
Name string `json:"name"`
Data []string `json:"data"`
Query []string `json:"query"`
ResultMetrics []Metric `json:"result_metrics"`
ResultSeries Series `json:"result_series"`
ResultQuery Query `json:"result_query"`
ResultQueryRange QueryRange `json:"result_query_range"`
Issue string `json:"issue"`
}
type Row struct {
type Metric struct {
Metric map[string]string `json:"metric"`
Values []float64 `json:"values"`
Timestamps []int64 `json:"timestamps"`
}
func (r *Metric) UnmarshalJSON(b []byte) error {
type plain Metric
return json.Unmarshal(testutil.PopulateTimeTpl(b, insertionTime), (*plain)(r))
}
type Series struct {
Status string `json:"status"`
Data []map[string]string `json:"data"`
}
type Query struct {
Status string `json:"status"`
Data QueryData `json:"data"`
}
type QueryData struct {
ResultType string `json:"resultType"`
Result []QueryDataResult `json:"result"`
}
type QueryDataResult struct {
Metric map[string]string `json:"metric"`
Value []interface{} `json:"value"`
}
func (r *QueryDataResult) UnmarshalJSON(b []byte) error {
type plain QueryDataResult
return json.Unmarshal(testutil.PopulateTimeTpl(b, insertionTime), (*plain)(r))
}
type QueryRange struct {
Status string `json:"status"`
Data QueryRangeData `json:"data"`
}
type QueryRangeData struct {
ResultType string `json:"resultType"`
Result []QueryRangeDataResult `json:"result"`
}
type QueryRangeDataResult struct {
Metric map[string]string `json:"metric"`
Values [][]interface{} `json:"values"`
}
func (r *QueryRangeDataResult) UnmarshalJSON(b []byte) error {
type plain QueryRangeDataResult
return json.Unmarshal(testutil.PopulateTimeTpl(b, insertionTime), (*plain)(r))
}
func TestMain(m *testing.M) {
setUp()
code := m.Run()
@@ -92,7 +149,7 @@ func setUp() {
func processFlags() {
flag.Parse()
for _, fs := range []struct {
for _, fv := range []struct {
flag string
value string
}{
@@ -101,10 +158,11 @@ func processFlags() {
{flag: "graphiteListenAddr", value: testStatsDListenAddr},
{flag: "opentsdbListenAddr", value: testOpenTSDBListenAddr},
{flag: "loggerLevel", value: testLogLevel},
{flag: "opentsdbHTTPListenAddr", value: testOpenTSDBHTTPListenAddr},
} {
// panics if flag doesn't exist
if err := flag.Lookup(fs.flag).Value.Set(fs.value); err != nil {
log.Fatalf("unable to set %q with value %q, err: %v", fs.flag, fs.value, err)
if err := flag.Lookup(fv.flag).Value.Set(fv.value); err != nil {
log.Fatalf("unable to set %q with value %q, err: %v", fv.flag, fv.value, err)
}
}
}
@@ -121,67 +179,125 @@ func waitFor(timeout time.Duration, f func() bool) error {
}
func tearDown() {
if err := httpserver.Stop(*httpListenAddr); err != nil {
log.Printf("cannot stop the webservice: %s", err)
}
vminsert.Stop()
vmstorage.Stop()
vmselect.Stop()
if err := httpserver.Stop(*httpListenAddr); err != nil {
log.Fatalf("cannot stop the webservice: %s", err)
}
os.RemoveAll(storagePath)
fs.MustRemoveAll(storagePath)
}
func TestWriteRead(t *testing.T) {
t.Run("write", testWrite)
time.Sleep(1 * time.Second)
vmstorage.Stop()
// open storage after stop in write
vmstorage.InitWithoutMetrics()
t.Run("read", testRead)
}
func testWrite(t *testing.T) {
t.Run("prometheus", func(t *testing.T) {
for _, test := range readIn("prometheus", t, insertionTime) {
s := newSuite(t)
r := testutil.WriteRequest{}
s.noError(json.Unmarshal([]byte(strings.Join(test.Data, "\n")), &r.Timeseries))
data, err := testutil.Compress(r)
s.greaterThan(len(r.Timeseries), 0)
if err != nil {
t.Errorf("error compressing %v %s", r, err)
t.Fail()
}
httpWrite(t, testPromWriteHTTPPath, bytes.NewBuffer(data))
}
})
t.Run("influxdb", func(t *testing.T) {
for _, test := range readIn("influxdb", t, fmt.Sprintf("%d", insertionTime.UnixNano())) {
for _, x := range readIn("influxdb", t, insertionTime) {
test := x
t.Run(test.Name, func(t *testing.T) {
t.Parallel()
httpWrite(t, testWriteHTTPPath, test.Data)
httpWrite(t, testWriteHTTPPath, bytes.NewBufferString(strings.Join(test.Data, "\n")))
})
}
})
t.Run("graphite", func(t *testing.T) {
for _, test := range readIn("graphite", t, fmt.Sprintf("%d", insertionTime.Unix())) {
for _, x := range readIn("graphite", t, insertionTime) {
test := x
t.Run(test.Name, func(t *testing.T) {
t.Parallel()
tcpWrite(t, "127.0.0.1"+testStatsDListenAddr, test.Data)
tcpWrite(t, "127.0.0.1"+testStatsDListenAddr, strings.Join(test.Data, "\n"))
})
}
})
t.Run("opentsdb", func(t *testing.T) {
for _, test := range readIn("opentsdb", t, fmt.Sprintf("%d", insertionTime.Unix())) {
for _, x := range readIn("opentsdb", t, insertionTime) {
test := x
t.Run(test.Name, func(t *testing.T) {
t.Parallel()
tcpWrite(t, "127.0.0.1"+testOpenTSDBListenAddr, test.Data)
tcpWrite(t, "127.0.0.1"+testOpenTSDBListenAddr, strings.Join(test.Data, "\n"))
})
}
})
t.Run("opentsdbhttp", func(t *testing.T) {
for _, x := range readIn("opentsdbhttp", t, insertionTime) {
test := x
t.Run(test.Name, func(t *testing.T) {
t.Parallel()
logger.Infof("writing %s", test.Data)
httpWrite(t, testOpenTSDBWriteHTTPPath, bytes.NewBufferString(strings.Join(test.Data, "\n")))
})
}
})
}
func testRead(t *testing.T) {
for _, engine := range []string{"graphite", "opentsdb", "influxdb"} {
for _, engine := range []string{"prometheus", "graphite", "opentsdb", "influxdb", "opentsdbhttp"} {
t.Run(engine, func(t *testing.T) {
for _, test := range readIn(engine, t, fmt.Sprintf("%d", insertionTime.UnixNano())) {
test := test
for _, x := range readIn(engine, t, insertionTime) {
test := x
t.Run(test.Name, func(t *testing.T) {
t.Parallel()
rowContains(t, httpRead(t, testReadHTTPPath, test.Query), test.Result)
for _, q := range test.Query {
q = testutil.PopulateTimeTplString(q, insertionTime)
if test.Issue != "" {
test.Issue = "Regression in " + test.Issue
}
switch {
case strings.HasPrefix(q, "/api/v1/export"):
if err := checkMetricsResult(httpReadMetrics(t, testReadHTTPPath, q), test.ResultMetrics); err != nil {
t.Fatalf("Export. %s fails with error %s.%s", q, err, test.Issue)
}
case strings.HasPrefix(q, "/api/v1/series"):
s := Series{}
httpReadStruct(t, testReadHTTPPath, q, &s)
if err := checkSeriesResult(s, test.ResultSeries); err != nil {
t.Fatalf("Series. %s fails with error %s.%s", q, err, test.Issue)
}
case strings.HasPrefix(q, "/api/v1/query_range"):
queryResult := QueryRange{}
httpReadStruct(t, testReadHTTPPath, q, &queryResult)
if err := checkQueryRangeResult(queryResult, test.ResultQueryRange); err != nil {
t.Fatalf("Query Range. %s fails with error %s.%s", q, err, test.Issue)
}
case strings.HasPrefix(q, "/api/v1/query"):
queryResult := Query{}
httpReadStruct(t, testReadHTTPPath, q, &queryResult)
if err := checkQueryResult(queryResult, test.ResultQuery); err != nil {
t.Fatalf("Query. %s fails with error %s.%s", q, err, test.Issue)
}
default:
t.Fatalf("unsupported read query %s", q)
}
}
})
}
})
}
}
func readIn(readFor string, t *testing.T, timeStr string) []test {
func readIn(readFor string, t *testing.T, insertTime time.Time) []test {
t.Helper()
s := newSuite(t)
var tt []test
@@ -193,7 +309,9 @@ func readIn(readFor string, t *testing.T, timeStr string) []test {
s.noError(err)
item := test{}
s.noError(json.Unmarshal(b, &item))
item.Data = strings.Replace(item.Data, "{TIME}", timeStr, 1)
for i := range item.Data {
item.Data[i] = testutil.PopulateTimeTplString(item.Data[i], insertTime)
}
tt = append(tt, item)
return nil
}))
@@ -203,10 +321,10 @@ func readIn(readFor string, t *testing.T, timeStr string) []test {
return tt
}
func httpWrite(t *testing.T, address string, data string) {
func httpWrite(t *testing.T, address string, r io.Reader) {
t.Helper()
s := newSuite(t)
resp, err := http.Post(address, "", bytes.NewBufferString(data))
resp, err := http.Post(address, "", r)
s.noError(err)
s.noError(resp.Body.Close())
s.equalInt(resp.StatusCode, 204)
@@ -223,35 +341,122 @@ func tcpWrite(t *testing.T, address string, data string) {
s.equalInt(n, len(data))
}
func httpRead(t *testing.T, address, query string) []Row {
func httpReadMetrics(t *testing.T, address, query string) []Metric {
t.Helper()
s := newSuite(t)
resp, err := http.Get(address + query)
s.noError(err)
defer resp.Body.Close()
s.equalInt(resp.StatusCode, 200)
var rows []Row
var rows []Metric
for dec := json.NewDecoder(resp.Body); dec.More(); {
var row Row
var row Metric
s.noError(dec.Decode(&row))
rows = append(rows, row)
}
return rows
}
func rowContains(t *testing.T, rows, contains []Row) {
func httpReadStruct(t *testing.T, address, query string, dst interface{}) {
t.Helper()
for _, r := range rows {
contains = removeIfFound(r, contains)
}
if len(contains) > 0 {
t.Fatalf("result rows %+v not found in %+v", contains, rows)
}
s := newSuite(t)
resp, err := http.Get(address + query)
s.noError(err)
defer resp.Body.Close()
s.equalInt(resp.StatusCode, 200)
s.noError(json.NewDecoder(resp.Body).Decode(dst))
}
func removeIfFound(r Row, contains []Row) []Row {
func checkMetricsResult(got, want []Metric) error {
for _, r := range append([]Metric(nil), got...) {
want = removeIfFoundMetrics(r, want)
}
if len(want) > 0 {
return fmt.Errorf("expected metrics %+v not found in %+v", want, got)
}
return nil
}
func removeIfFoundMetrics(r Metric, contains []Metric) []Metric {
for i, item := range contains {
if reflect.DeepEqual(r.Metric, item.Metric) && reflect.DeepEqual(r.Values, item.Values) &&
reflect.DeepEqual(r.Timestamps, item.Timestamps) {
contains[i] = contains[len(contains)-1]
return contains[:len(contains)-1]
}
}
return contains
}
func checkSeriesResult(got, want Series) error {
if got.Status != want.Status {
return fmt.Errorf("status mismatch %q - %q", want.Status, got.Status)
}
wantData := append([]map[string]string(nil), want.Data...)
for _, r := range got.Data {
wantData = removeIfFoundSeries(r, wantData)
}
if len(wantData) > 0 {
return fmt.Errorf("expected series %+v not found in %+v", wantData, got.Data)
}
return nil
}
func removeIfFoundSeries(r map[string]string, contains []map[string]string) []map[string]string {
for i, item := range contains {
if reflect.DeepEqual(r, item) {
contains[i] = contains[len(contains)-1]
return contains[:len(contains)-1]
}
}
return contains
}
func checkQueryResult(got, want Query) error {
if got.Status != want.Status {
return fmt.Errorf("status mismatch %q - %q", want.Status, got.Status)
}
if got.Data.ResultType != want.Data.ResultType {
return fmt.Errorf("result type mismatch %q - %q", want.Data.ResultType, got.Data.ResultType)
}
wantData := append([]QueryDataResult(nil), want.Data.Result...)
for _, r := range got.Data.Result {
wantData = removeIfFoundQueryData(r, wantData)
}
if len(wantData) > 0 {
return fmt.Errorf("expected query result %+v not found in %+v", wantData, got.Data.Result)
}
return nil
}
func removeIfFoundQueryData(r QueryDataResult, contains []QueryDataResult) []QueryDataResult {
for i, item := range contains {
if reflect.DeepEqual(r.Metric, item.Metric) && reflect.DeepEqual(r.Value[0], item.Value[0]) && reflect.DeepEqual(r.Value[1], item.Value[1]) {
contains[i] = contains[len(contains)-1]
return contains[:len(contains)-1]
}
}
return contains
}
func checkQueryRangeResult(got, want QueryRange) error {
if got.Status != want.Status {
return fmt.Errorf("status mismatch %q - %q", want.Status, got.Status)
}
if got.Data.ResultType != want.Data.ResultType {
return fmt.Errorf("result type mismatch %q - %q", want.Data.ResultType, got.Data.ResultType)
}
wantData := append([]QueryRangeDataResult(nil), want.Data.Result...)
for _, r := range got.Data.Result {
wantData = removeIfFoundQueryRangeData(r, wantData)
}
if len(wantData) > 0 {
return fmt.Errorf("expected query range result %+v not found in %+v", wantData, got.Data.Result)
}
return nil
}
func removeIfFoundQueryRangeData(r QueryRangeDataResult, contains []QueryRangeDataResult) []QueryRangeDataResult {
for i, item := range contains {
// todo check time
if reflect.DeepEqual(r.Metric, item.Metric) && reflect.DeepEqual(r.Values, item.Values) {
contains[i] = contains[len(contains)-1]
return contains[:len(contains)-1]
@@ -279,3 +484,11 @@ func (s *suite) equalInt(a, b int) {
s.t.FailNow()
}
}
func (s *suite) greaterThan(a, b int) {
s.t.Helper()
if a <= b {
s.t.Errorf("%d is less than or equal to %d", a, b)
s.t.FailNow()
}
}


@@ -0,0 +1,52 @@
package test
import (
"fmt"
"log"
"regexp"
"strings"
"time"
)
var (
parseTimeExpRegex = regexp.MustCompile(`"?{TIME[^}]*}"?`)
extractRegex = regexp.MustCompile(`"?{([^}]*)}"?`)
)
// PopulateTimeTplString substitutes {TIME_*} with t in s and returns the result.
func PopulateTimeTplString(s string, t time.Time) string {
return string(PopulateTimeTpl([]byte(s), t))
}
// PopulateTimeTpl substitutes {TIME_*} with tGlobal in b and returns the result.
func PopulateTimeTpl(b []byte, tGlobal time.Time) []byte {
return parseTimeExpRegex.ReplaceAllFunc(b, func(repl []byte) []byte {
t := tGlobal
repl = extractRegex.FindSubmatch(repl)[1]
parts := strings.SplitN(string(repl), "-", 2)
if len(parts) == 2 {
duration, err := time.ParseDuration(strings.TrimSpace(parts[1]))
if err != nil {
log.Fatalf("error %s parsing duration %s in %s", err, parts[1], repl)
}
t = t.Add(-duration)
}
switch strings.TrimSpace(parts[0]) {
case `TIME_S`:
return []byte(fmt.Sprintf("%d", t.Unix()))
case `TIME_MSZ`:
return []byte(fmt.Sprintf("%d", t.Unix()*1e3))
case `TIME_MS`:
return []byte(fmt.Sprintf("%d", timeToMillis(t)))
case `TIME_NS`:
return []byte(fmt.Sprintf("%d", t.UnixNano()))
default:
log.Fatalf("unknown time pattern %s in %s", parts[0], repl)
}
return repl
})
}
func timeToMillis(t time.Time) int64 {
return t.UnixNano() / 1e6
}


@@ -0,0 +1,24 @@
package test
import (
"testing"
"time"
)
func TestPopulateTimeTplString(t *testing.T) {
now, err := time.Parse(time.RFC3339, "2006-01-02T15:04:05Z")
if err != nil {
t.Fatalf("unexpected error when parsing time: %s", err)
}
f := func(s, resultExpected string) {
t.Helper()
result := PopulateTimeTplString(s, now)
if result != resultExpected {
t.Fatalf("unexpected result; got %q; want %q", result, resultExpected)
}
}
f("", "")
f("{TIME_S}", "1136214245")
f("now: {TIME_S}, past 30s: {TIME_MS-30s}, now: {TIME_S}", "now: 1136214245, past 30s: 1136214215000, now: 1136214245")
f("now: {TIME_MS}, past 30m: {TIME_MSZ-30m}, past 2h: {TIME_NS-2h}", "now: 1136214245000, past 30m: 1136212445000, past 2h: 1136207045000000000")
}


@@ -0,0 +1,338 @@
// +build integration
// Source https://github.com/prometheus/prometheus/blob/master/prompb/remote.pb.go . Code is copy pasted and cleaned up
package test
import (
"encoding/binary"
"math"
"math/bits"
)
type WriteRequest struct {
Timeseries []TimeSeries `protobuf:"bytes,1,rep,name=timeseries,proto3" json:"timeseries"`
}
func (m *WriteRequest) Size() (n int) {
if m == nil {
return 0
}
var l int
_ = l
if len(m.Timeseries) > 0 {
for _, e := range m.Timeseries {
l = e.Size()
n += 1 + l + sovRemote(uint64(l))
}
}
return n
}
func sovRemote(x uint64) (n int) {
return (bits.Len64(x|1) + 6) / 7
}
func (m *WriteRequest) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
n, err := m.MarshalToSizedBuffer(dAtA[:size])
if err != nil {
return nil, err
}
return dAtA[:n], nil
}
func (m *WriteRequest) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
func (m *WriteRequest) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if len(m.Timeseries) > 0 {
for iNdEx := len(m.Timeseries) - 1; iNdEx >= 0; iNdEx-- {
{
size, err := m.Timeseries[iNdEx].MarshalToSizedBuffer(dAtA[:i])
if err != nil {
return 0, err
}
i -= size
i = encodeVarintRemote(dAtA, i, uint64(size))
}
i--
dAtA[i] = 0xa
}
}
return len(dAtA) - i, nil
}
func encodeVarintRemote(dAtA []byte, offset int, v uint64) int {
offset -= sovRemote(v)
base := offset
for v >= 1<<7 {
dAtA[offset] = uint8(v&0x7f | 0x80)
v >>= 7
offset++
}
dAtA[offset] = uint8(v)
return base
}
type Sample struct {
Value float64 `protobuf:"fixed64,1,opt,name=value,proto3" json:"value,omitempty"`
Timestamp int64 `protobuf:"varint,2,opt,name=timestamp,proto3" json:"timestamp,omitempty"`
}
func (m *Sample) Reset() { *m = Sample{} }
// TimeSeries represents samples and labels for a single time series.
type TimeSeries struct {
Labels []Label `protobuf:"bytes,1,rep,name=labels,proto3" json:"labels"`
Samples []Sample `protobuf:"bytes,2,rep,name=samples,proto3" json:"samples"`
}
func (m *TimeSeries) Reset() { *m = TimeSeries{} }
type Label struct {
Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
Value string `protobuf:"bytes,2,opt,name=value,proto3" json:"value,omitempty"`
}
func (m *Label) Reset() { *m = Label{} }
type Labels struct {
Labels []Label `protobuf:"bytes,1,rep,name=labels,proto3" json:"labels"`
}
func (m *Labels) Reset() { *m = Labels{} }
func (m *Sample) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
n, err := m.MarshalToSizedBuffer(dAtA[:size])
if err != nil {
return nil, err
}
return dAtA[:n], nil
}
func (m *Sample) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
func (m *Sample) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if m.Timestamp != 0 {
i = encodeVarintTypes(dAtA, i, uint64(m.Timestamp))
i--
dAtA[i] = 0x10
}
if m.Value != 0 {
i -= 8
binary.LittleEndian.PutUint64(dAtA[i:], uint64(math.Float64bits(float64(m.Value))))
i--
dAtA[i] = 0x9
}
return len(dAtA) - i, nil
}
func (m *TimeSeries) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
n, err := m.MarshalToSizedBuffer(dAtA[:size])
if err != nil {
return nil, err
}
return dAtA[:n], nil
}
func (m *TimeSeries) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
func (m *TimeSeries) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if len(m.Samples) > 0 {
for iNdEx := len(m.Samples) - 1; iNdEx >= 0; iNdEx-- {
{
size, err := m.Samples[iNdEx].MarshalToSizedBuffer(dAtA[:i])
if err != nil {
return 0, err
}
i -= size
i = encodeVarintTypes(dAtA, i, uint64(size))
}
i--
dAtA[i] = 0x12
}
}
if len(m.Labels) > 0 {
for iNdEx := len(m.Labels) - 1; iNdEx >= 0; iNdEx-- {
{
size, err := m.Labels[iNdEx].MarshalToSizedBuffer(dAtA[:i])
if err != nil {
return 0, err
}
i -= size
i = encodeVarintTypes(dAtA, i, uint64(size))
}
i--
dAtA[i] = 0xa
}
}
return len(dAtA) - i, nil
}
func (m *Label) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
n, err := m.MarshalToSizedBuffer(dAtA[:size])
if err != nil {
return nil, err
}
return dAtA[:n], nil
}
func (m *Label) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
func (m *Label) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
_ = i
var l int
_ = l
if len(m.Value) > 0 {
i -= len(m.Value)
copy(dAtA[i:], m.Value)
i = encodeVarintTypes(dAtA, i, uint64(len(m.Value)))
i--
dAtA[i] = 0x12
}
if len(m.Name) > 0 {
i -= len(m.Name)
copy(dAtA[i:], m.Name)
i = encodeVarintTypes(dAtA, i, uint64(len(m.Name)))
i--
dAtA[i] = 0xa
}
return len(dAtA) - i, nil
}
func (m *Labels) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
n, err := m.MarshalToSizedBuffer(dAtA[:size])
if err != nil {
return nil, err
}
return dAtA[:n], nil
}
func (m *Labels) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
func (m *Labels) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
if len(m.Labels) > 0 {
for iNdEx := len(m.Labels) - 1; iNdEx >= 0; iNdEx-- {
{
size, err := m.Labels[iNdEx].MarshalToSizedBuffer(dAtA[:i])
if err != nil {
return 0, err
}
i -= size
i = encodeVarintTypes(dAtA, i, uint64(size))
}
i--
dAtA[i] = 0xa
}
}
return len(dAtA) - i, nil
}
func encodeVarintTypes(dAtA []byte, offset int, v uint64) int {
offset -= sovTypes(v)
base := offset
for v >= 1<<7 {
dAtA[offset] = uint8(v&0x7f | 0x80)
v >>= 7
offset++
}
dAtA[offset] = uint8(v)
return base
}
func (m *Sample) Size() (n int) {
if m == nil {
return 0
}
if m.Value != 0 {
n += 9
}
if m.Timestamp != 0 {
n += 1 + sovTypes(uint64(m.Timestamp))
}
return n
}
func (m *TimeSeries) Size() (n int) {
if m == nil {
return 0
}
var l int
_ = l
if len(m.Labels) > 0 {
for _, e := range m.Labels {
l = e.Size()
n += 1 + l + sovTypes(uint64(l))
}
}
if len(m.Samples) > 0 {
for _, e := range m.Samples {
l = e.Size()
n += 1 + l + sovTypes(uint64(l))
}
}
return n
}
func (m *Label) Size() (n int) {
if m == nil {
return 0
}
var l int
_ = l
l = len(m.Name)
if l > 0 {
n += 1 + l + sovTypes(uint64(l))
}
l = len(m.Value)
if l > 0 {
n += 1 + l + sovTypes(uint64(l))
}
return n
}
func (m *Labels) Size() (n int) {
if m == nil {
return 0
}
var l int
_ = l
if len(m.Labels) > 0 {
for _, e := range m.Labels {
l = e.Size()
n += 1 + l + sovTypes(uint64(l))
}
}
return n
}
func sovTypes(x uint64) (n int) {
return (bits.Len64(x|1) + 6) / 7
}


@@ -0,0 +1,13 @@
// +build integration
package test
import "github.com/golang/snappy"
func Compress(wr WriteRequest) ([]byte, error) {
data, err := wr.Marshal()
if err != nil {
return nil, err
}
return snappy.Encode(nil, data), nil
}


@@ -1,8 +1,8 @@
{
"name": "basic_insertion",
"data": "graphite.foo.bar.baz;tag1=value1;tag2=value2 123 {TIME}",
"query": "/api/v1/export?match={__name__!=\"\"}",
"result": [
{"metric":{"__name__":"graphite.foo.bar.baz","tag1":"value1","tag2":"value2"},"values":[123]}
"data": ["graphite.foo.bar.baz;tag1=value1;tag2=value2 123 {TIME_S}"],
"query": ["/api/v1/export?match={__name__!=''}"],
"result_metrics": [
{"metric":{"__name__":"graphite.foo.bar.baz","tag1":"value1","tag2":"value2"},"values":[123], "timestamps": ["{TIME_MSZ}"]}
]
}


@@ -0,0 +1,16 @@
{
"name": "comparison-not-inf-not-nan",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150",
"data": [
"not_nan_not_inf;item=x 1 {TIME_S-1m}",
"not_nan_not_inf;item=x 1 {TIME_S-2m}",
"not_nan_not_inf;item=y 3 {TIME_S-1m}",
"not_nan_not_inf;item=y 1 {TIME_S-2m}"],
"query": ["/api/v1/query_range?query=1/(not_nan_not_inf-1)!=inf!=nan&start={TIME_S-3m}&end={TIME_S}&step=60"],
"result_query_range": {
"status":"success",
"data":{"resultType":"matrix",
"result":[
{"metric":{"item":"y"},"values":[["{TIME_S-1m}","0.5"],["{TIME_S}","0.5"]]}
]}}
}


@@ -0,0 +1,24 @@
{
"name": "max_lookback_set",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209",
"data": [
"max_lookback_set 1 {TIME_S-30s}",
"max_lookback_set 2 {TIME_S-60s}",
"max_lookback_set 3 {TIME_S-120s}",
"max_lookback_set 4 {TIME_S-150s}"
],
"query": ["/api/v1/query_range?query=max_lookback_set&start={TIME_S-150s}&end={TIME_S}&step=10s&max_lookback=1s"],
"result_query_range": {
"status":"success",
"data":{"resultType":"matrix",
"result":[{"metric":{"__name__":"max_lookback_set"},"values":[
["{TIME_S-150s}","4"],
["{TIME_S-140s}","4"],
["{TIME_S-120s}","3"],
["{TIME_S-110s}","3"],
["{TIME_S-60s}","2"],
["{TIME_S-50s}","2"],
["{TIME_S-30s}","1"],
["{TIME_S-20s}","1"]
]}]}}
}


@@ -0,0 +1,32 @@
{
"name": "max_lookback_unset",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209",
"data": [
"max_lookback_unset 1 {TIME_S-30s}",
"max_lookback_unset 2 {TIME_S-60s}",
"max_lookback_unset 3 {TIME_S-120s}",
"max_lookback_unset 4 {TIME_S-150s}"
],
"query": ["/api/v1/query_range?query=max_lookback_unset&start={TIME_S-150s}&end={TIME_S}&step=10s"],
"result_query_range": {
"status":"success",
"data":{"resultType":"matrix",
"result":[{"metric":{"__name__":"max_lookback_unset"},"values":[
["{TIME_S-150s}","4"],
["{TIME_S-140s}","4"],
["{TIME_S-130s}","4"],
["{TIME_S-120s}","3"],
["{TIME_S-110s}","3"],
["{TIME_S-100s}","3"],
["{TIME_S-90s}","3"],
["{TIME_S-80s}","3"],
["{TIME_S-70s}","3"],
["{TIME_S-60s}","2"],
["{TIME_S-50s}","2"],
["{TIME_S-40s}","2"],
["{TIME_S-30s}","1"],
["{TIME_S-20s}","1"],
["{TIME_S-10s}","1"],
["{TIME_S}","1"]
]}]}}
}


@@ -0,0 +1,18 @@
{
"name": "not-nan-as-missing-data",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/153",
"data": [
"not_nan_as_missing_data;item=x 2 {TIME_S-2m}",
"not_nan_as_missing_data;item=x 1 {TIME_S-1m}",
"not_nan_as_missing_data;item=y 4 {TIME_S-2m}",
"not_nan_as_missing_data;item=y 3 {TIME_S-1m}"
],
"query": ["/api/v1/query_range?query=not_nan_as_missing_data>1&start={TIME_S-2m}&end={TIME_S}&step=60"],
"result_query_range": {
"status":"success",
"data":{"resultType":"matrix",
"result":[
{"metric":{"__name__":"not_nan_as_missing_data","item":"x"},"values":[["{TIME_S-2m}","2"]]},
{"metric":{"__name__":"not_nan_as_missing_data","item":"y"},"values":[["{TIME_S-2m}","4"],["{TIME_S-1m}","3"],["{TIME_S}","3"]]}
]}}
}


@@ -0,0 +1,14 @@
{
"name": "subquery-aggregation",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/184",
"data": [
"forms_daily_count;item=x 1 {TIME_S-1m}",
"forms_daily_count;item=x 2 {TIME_S-2m}",
"forms_daily_count;item=y 3 {TIME_S-1m}",
"forms_daily_count;item=y 4 {TIME_S-2m}"],
"query": ["/api/v1/query?query=min%20by%20(item)%20(min_over_time(forms_daily_count[10m:1m]))&time={TIME_S-1m}"],
"result_query": {
"status":"success",
"data":{"resultType":"vector","result":[{"metric":{"item":"x"},"value":["{TIME_S-1m}","1"]},{"metric":{"item":"y"},"value":["{TIME_S-1m}","3"]}]}
}
}


@@ -1,9 +1,9 @@
{
"name": "basic_insertion",
"data": "measurement,tag1=value1,tag2=value2 field1=1.23,field2=123",
"query": "/api/v1/export?match={__name__!=\"\"}",
"result": [
{"metric":{"__name__":"measurement_field2","tag1":"value1","tag2":"value2"},"values":[123]},
{"metric":{"__name__":"measurement_field1","tag1":"value1","tag2":"value2"},"values":[1.23]}
"data": ["measurement,tag1=value1,tag2=value2 field1=1.23,field2=123 {TIME_NS}"],
"query": ["/api/v1/export?match={__name__!=''}"],
"result_metrics": [
{"metric":{"__name__":"measurement_field2","tag1":"value1","tag2":"value2"},"values":[123], "timestamps": ["{TIME_MS}"]},
{"metric":{"__name__":"measurement_field1","tag1":"value1","tag2":"value2"},"values":[1.23], "timestamps": ["{TIME_MS}"]}
]
}


@@ -1,8 +1,8 @@
{
"name": "basic_insertion",
"data": "put openstdb.foo.bar.baz {TIME} 123 tag1=value1 tag2=value2",
"query": "/api/v1/export?match={__name__!=\"\"}",
"result": [
{"metric":{"__name__":"openstdb.foo.bar.baz","tag1":"value1","tag2":"value2"},"values":[123]}
"data": ["put openstdb.foo.bar.baz {TIME_S} 123 tag1=value1 tag2=value2"],
"query": ["/api/v1/export?match={__name__!=''}"],
"result_metrics": [
{"metric":{"__name__":"openstdb.foo.bar.baz","tag1":"value1","tag2":"value2"},"values":[123], "timestamps": ["{TIME_MSZ}"]}
]
}


@@ -0,0 +1,8 @@
{
"name": "basic_insertion",
"data": ["{\"metric\": \"opentsdbhttp.foo\", \"value\": 1001, \"timestamp\": {TIME_S}, \"tags\": {\"bar\":\"baz\", \"x\": \"y\"}}"],
"query": ["/api/v1/export?match={__name__!=''}"],
"result_metrics": [
{"metric":{"__name__":"opentsdbhttp.foo","bar":"baz","x":"y"},"values":[1001], "timestamps": ["{TIME_MSZ}"]}
]
}


@@ -0,0 +1,9 @@
{
"name": "multiline",
"data": ["[{\"metric\": \"opentsdbhttp.multiline1\", \"value\": 1001, \"timestamp\": \"{TIME_S}\", \"tags\": {\"bar\":\"baz\", \"x\": \"y\"}}, {\"metric\": \"opentsdbhttp.multiline2\", \"value\": 1002, \"timestamp\": {TIME_S}}]"],
"query": ["/api/v1/export?match={__name__!=''}"],
"result_metrics": [
{"metric":{"__name__":"opentsdbhttp.multiline1","bar":"baz","x":"y"},"values":[1001], "timestamps": ["{TIME_MSZ}"]},
{"metric":{"__name__":"opentsdbhttp.multiline2"},"values":[1002], "timestamps": ["{TIME_MSZ}"]}
]
}


@@ -0,0 +1,8 @@
{
"name": "basic_insertion",
"data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"prometheus.bar\"},{\"name\":\"baz\",\"value\":\"qux\"}],\"samples\":[{\"value\":100000,\"timestamp\":\"{TIME_MS}\"}]}]"],
"query": ["/api/v1/export?match={__name__!=''}"],
"result_metrics": [
{"metric":{"__name__":"prometheus.bar","baz":"qux"},"values":[100000], "timestamps": ["{TIME_MS}"]}
]
}


@@ -0,0 +1,10 @@
{
"name": "case-sensitive-regex",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/161",
"data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"prometheus.sensitiveRegex\"},{\"name\":\"label\",\"value\":\"sensitiveRegex\"}],\"samples\":[{\"value\":2,\"timestamp\":\"{TIME_MS}\"}]},{\"labels\":[{\"name\":\"__name__\",\"value\":\"prometheus.sensitiveRegex\"},{\"name\":\"label\",\"value\":\"SensitiveRegex\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]}]"],
"query": ["/api/v1/export?match={label=~'(?i)sensitiveregex'}"],
"result_metrics": [
{"metric":{"__name__":"prometheus.sensitiveRegex","label":"sensitiveRegex"},"values":[2], "timestamps": ["{TIME_MS}"]},
{"metric":{"__name__":"prometheus.sensitiveRegex","label":"SensitiveRegex"},"values":[1], "timestamps": ["{TIME_MS}"]}
]
}


@@ -0,0 +1,9 @@
{
"name": "duplicate_label",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/172",
"data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"prometheus.duplicate_label\"},{\"name\":\"duplicate\",\"value\":\"label\"},{\"name\":\"duplicate\",\"value\":\"label\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]}]"],
"query": ["/api/v1/export?match={__name__!=''}"],
"result_metrics": [
{"metric":{"__name__":"prometheus.duplicate_label","duplicate":"label"},"values":[1], "timestamps": ["{TIME_MS}"]}
]
}


@@ -0,0 +1,15 @@
{
"name": "match_series",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/155",
"data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"MatchSeries\"},{\"name\":\"db\",\"value\":\"TenMinute\"},{\"name\":\"TurbineType\",\"value\":\"V112\"},{\"name\":\"Park\",\"value\":\"1\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]},{\"labels\":[{\"name\":\"__name__\",\"value\":\"MatchSeries\"},{\"name\":\"db\",\"value\":\"TenMinute\"},{\"name\":\"TurbineType\",\"value\":\"V112\"},{\"name\":\"Park\",\"value\":\"2\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]},{\"labels\":[{\"name\":\"__name__\",\"value\":\"MatchSeries\"},{\"name\":\"db\",\"value\":\"TenMinute\"},{\"name\":\"TurbineType\",\"value\":\"V112\"},{\"name\":\"Park\",\"value\":\"3\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]},{\"labels\":[{\"name\":\"__name__\",\"value\":\"MatchSeries\"},{\"name\":\"db\",\"value\":\"TenMinute\"},{\"name\":\"TurbineType\",\"value\":\"V112\"},{\"name\":\"Park\",\"value\":\"4\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]}]"],
"query": ["/api/v1/series?match[]={__name__='MatchSeries'}", "/api/v1/series?match[]={__name__=~'MatchSeries.*'}"],
"result_series": {
"status": "success",
"data": [
{"__name__":"MatchSeries","db":"TenMinute","Park":"1","TurbineType":"V112"},
{"__name__":"MatchSeries","db":"TenMinute","Park":"2","TurbineType":"V112"},
{"__name__":"MatchSeries","db":"TenMinute","Park":"3","TurbineType":"V112"},
{"__name__":"MatchSeries","db":"TenMinute","Park":"4","TurbineType":"V112"}
]
}
}

app/vmbackup/Makefile Normal file

@@ -0,0 +1,37 @@
# All these commands must run from repository root.
vmbackup:
APP_NAME=vmbackup $(MAKE) app-local
vmbackup-prod:
APP_NAME=vmbackup $(MAKE) app-via-docker
package-vmbackup:
APP_NAME=vmbackup $(MAKE) package-via-docker
publish-vmbackup:
APP_NAME=vmbackup $(MAKE) publish-via-docker
vmbackup-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-arm ./app/vmbackup
vmbackup-arm-prod:
APP_NAME=vmbackup APP_SUFFIX='-arm' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm' $(MAKE) app-via-docker
vmbackup-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-arm64 ./app/vmbackup
vmbackup-arm64-prod:
APP_NAME=vmbackup APP_SUFFIX='-arm64' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm64' $(MAKE) app-via-docker
vmbackup-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmbackup-386 ./app/vmbackup
vmbackup-386-prod:
APP_NAME=vmbackup APP_SUFFIX='-386' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=386' $(MAKE) app-via-docker
vmbackup-pure:
APP_NAME=vmbackup $(MAKE) app-local-pure
vmbackup-pure-prod:
APP_NAME=vmbackup APP_SUFFIX='-pure' DOCKER_OPTS='--env CGO_ENABLED=0' $(MAKE) app-via-docker

app/vmbackup/README.md Normal file

@@ -0,0 +1,181 @@
## vmbackup
`vmbackup` creates VictoriaMetrics data backups from [instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
Supported storage systems for backups:
* [GCS](https://cloud.google.com/storage/). Example: `gcs://<bucket>/<path/to/backup>`
* [S3](https://aws.amazon.com/s3/). Example: `s3://<bucket>/<path/to/backup>`
* Any S3-compatible storage such as [MinIO](https://github.com/minio/minio). See `-customS3Endpoint` command-line flag.
* Local filesystem. Example: `fs://</absolute/path/to/backup>`
Incremental backups and full backups are supported. Incremental backups are created automatically if the destination path already contains data from the previous backup.
Full backups can be sped up with `-origin` pointing to an already existing backup on the same remote storage. In this case `vmbackup` makes a server-side copy for the shared
data between the existing backup and new backup. This saves time and costs on data transfer.
Backup process can be interrupted at any time. It is automatically resumed from the interruption point when restarting `vmbackup` with the same args.
Backed up data can be restored with [vmrestore](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md).
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
### Use cases
#### Regular backups
Regular backup can be performed with the following command:
```
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gcs://<bucket>/<path/to/new/backup>
```
* `</path/to/victoria-metrics-data>` - path to VictoriaMetrics data pointed by `-storageDataPath` command-line flag in single-node VictoriaMetrics or in cluster `vmstorage`.
There is no need to stop VictoriaMetrics for creating backups, since they are performed from immutable [instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
* `<local-snapshot>` is the snapshot to backup. See [how to create instant snapshots](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
* `<bucket>` is the name of an already existing [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets).
* `<path/to/new/backup>` is the destination path where new backup will be placed.
#### Regular backups with server-side copy from existing backup
If the destination GCS bucket already contains the previous backup at `-origin` path, then new backup can be sped up
with the following command:
```
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gcs://<bucket>/<path/to/new/backup> -origin=gcs://<bucket>/<path/to/existing/backup>
```
This saves time and network bandwidth costs by performing server-side copy for the shared data from the `-origin` to `-dst`.
#### Incremental backups
Incremental backups are performed if `-dst` points to an already existing backup. In this case only new data is uploaded to remote storage.
This saves time and network bandwidth costs when working with big backups:
```
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gcs://<bucket>/<path/to/existing/backup>
```
#### Smart backups
Smart backups mean storing full daily backups into `YYYYMMDD` folders and creating incremental hourly backup into `latest` folder:
* Run the following command every hour:
```
vmbackup -snapshotName=<latest-snapshot> -dst=gcs://<bucket>/latest
```
Where `<latest-snapshot>` is the latest [snapshot](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots).
The command will upload only changed data to `gcs://<bucket>/latest`.
* Run the following command once a day:
```
vmbackup -snapshotName=<daily-snapshot> -dst=gcs://<bucket>/<YYYYMMDD> -origin=gcs://<bucket>/latest
```
Where `<daily-snapshot>` is the snapshot for the last day `<YYYYMMDD>`.
This approach saves network bandwidth costs on hourly backups (since they are incremental) and allows recovering data from either the last hour (`latest` backup)
or from any day (`YYYYMMDD` backups). Note that hourly backup shouldn't run when creating daily backup.
Do not forget to remove old snapshots and backups when they are no longer needed, in order to save storage costs.
### How does it work?
The backup algorithm is the following:
1. Collect information about files in the `-snapshotName`, in the `-dst` and in the `-origin`.
2. Determine files in `-dst`, which are missing in `-snapshotName`, and delete them. These are usually small files, which are already merged into bigger files in the snapshot.
3. Determine files from `-snapshotName`, which are missing in `-dst`. These are usually small new files and bigger merged files.
4. Determine files from step 3, which exist in the `-origin`, and perform server-side copy of these files from `-origin` to `-dst`.
These are usually the biggest and the oldest files, which are shared between backups.
5. Upload the remaining files from step 3 from `-snapshotName` to `-dst`.
The algorithm splits source files into 100MB chunks in the backup. Each chunk is stored as a separate file in the backup.
Such splitting minimizes the amounts of data to re-transfer after temporary errors.
`vmbackup` relies on [instant snapshot](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) properties:
- All the files in the snapshot are immutable.
- Old files are periodically merged into new files.
- Smaller files have higher probability to be merged.
- Consecutive snapshots share many identical files.
These properties allow performing fast and cheap incremental backups and server-side copying from `-origin` paths.
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
`vmbackup` can work improperly or slowly when these properties are violated.
### Troubleshooting
* If the backup is slow, then try setting higher value for `-concurrency` flag. This will increase the number of concurrent workers that upload data to backup storage.
* If `vmbackup` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value.
* If `vmbackup` has been interrupted due to temporary error, then just restart it with the same args. It will resume the backup process.
### Advanced usage
Run `vmbackup -help` in order to see all the available options:
```
  -concurrency int
        The number of concurrent workers. Higher concurrency may reduce backup duration (default 10)
  -configFilePath string
        Path to file with S3 configs. Configs are loaded from default location if not set.
        See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
  -configProfile string
        Profile name for S3 configs (default "default")
  -credsFilePath string
        Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
        See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
  -customS3Endpoint string
        Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
  -dst string
        Where to put the backup on the remote storage. Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
        -dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded
  -loggerLevel string
        Minimum level of errors to log. Possible values: INFO, ERROR, FATAL, PANIC (default "INFO")
  -maxBytesPerSecond int
        The maximum upload speed. There is no limit if it is set to 0
  -memory.allowedPercent float
        Allowed percent of system memory VictoriaMetrics caches may occupy (default 60)
  -origin string
        Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups
  -snapshotName string
        Name for the snapshot to backup. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots
  -storageDataPath string
        Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data")
  -version
        Show VictoriaMetrics version
```
### How to build from sources
It is recommended to use [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - see `vmutils-*` archives there.
#### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
2. Run `make vmbackup` from the root folder of the repository.
It builds the `vmbackup` binary and puts it into the `bin` folder.
#### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmbackup-prod` from the root folder of the repository.
It builds the `vmbackup-prod` binary and puts it into the `bin` folder.
#### Building docker images
Run `make package-vmbackup`. It builds the `victoriametrics/vmbackup:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is an auto-generated image tag, which depends on the source code in the repository.
The `<PKG_TAG>` may be set manually via `PKG_TAG=foobar make package-vmbackup`.


@@ -0,0 +1,5 @@
FROM scratch
COPY --from=local/certs:1.0.3 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY bin/vmbackup-prod .
EXPOSE 8428
ENTRYPOINT ["/vmbackup-prod"]

app/vmbackup/main.go Normal file

@@ -0,0 +1,114 @@
package main

import (
	"flag"
	"fmt"
	"os"

	"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/actions"
	"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/common"
	"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fslocal"
	"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
	"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)

var (
	storageDataPath = flag.String("storageDataPath", "victoria-metrics-data", "Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage")
	snapshotName    = flag.String("snapshotName", "", "Name for the snapshot to backup. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-work-with-snapshots")
	dst             = flag.String("dst", "", "Where to put the backup on the remote storage. "+
		"Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir\n"+
		"-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded")
	origin            = flag.String("origin", "", "Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups")
	concurrency       = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce backup duration")
	maxBytesPerSecond = flag.Int("maxBytesPerSecond", 0, "The maximum upload speed. There is no limit if it is set to 0")
)

func main() {
	flag.Usage = usage
	flag.Parse()
	buildinfo.Init()

	srcFS, err := newSrcFS()
	if err != nil {
		logger.Fatalf("%s", err)
	}
	dstFS, err := newDstFS()
	if err != nil {
		logger.Fatalf("%s", err)
	}
	originFS, err := newOriginFS()
	if err != nil {
		logger.Fatalf("%s", err)
	}
	a := &actions.Backup{
		Concurrency: *concurrency,
		Src:         srcFS,
		Dst:         dstFS,
		Origin:      originFS,
	}
	if err := a.Run(); err != nil {
		logger.Fatalf("cannot create backup: %s", err)
	}
}

func usage() {
	const s = `
vmbackup performs backups for VictoriaMetrics data from instant snapshots to gcs, s3
or local filesystem. Backed up data can be restored with vmrestore.
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md .
`

	f := flag.CommandLine.Output()
	fmt.Fprintf(f, "%s\n", s)
	flag.PrintDefaults()
}

func newSrcFS() (*fslocal.FS, error) {
	if len(*snapshotName) == 0 {
		return nil, fmt.Errorf("`-snapshotName` cannot be empty")
	}
	snapshotPath := *storageDataPath + "/snapshots/" + *snapshotName

	// Verify the snapshot exists.
	f, err := os.Open(snapshotPath)
	if err != nil {
		return nil, fmt.Errorf("cannot open snapshot at %q: %s", snapshotPath, err)
	}
	fi, err := f.Stat()
	_ = f.Close()
	if err != nil {
		return nil, fmt.Errorf("cannot stat %q: %s", snapshotPath, err)
	}
	if !fi.IsDir() {
		return nil, fmt.Errorf("snapshot %q must be a directory", snapshotPath)
	}

	fs := &fslocal.FS{
		Dir:               snapshotPath,
		MaxBytesPerSecond: *maxBytesPerSecond,
	}
	if err := fs.Init(); err != nil {
		return nil, fmt.Errorf("cannot initialize fs: %s", err)
	}
	return fs, nil
}

func newDstFS() (common.RemoteFS, error) {
	fs, err := actions.NewRemoteFS(*dst)
	if err != nil {
		return nil, fmt.Errorf("cannot parse `-dst`=%q: %s", *dst, err)
	}
	return fs, nil
}

func newOriginFS() (common.RemoteFS, error) {
	if len(*origin) == 0 {
		return nil, nil
	}
	fs, err := actions.NewRemoteFS(*origin)
	if err != nil {
		return nil, fmt.Errorf("cannot parse `-origin`=%q: %s", *origin, err)
	}
	return fs, nil
}


@@ -2,9 +2,11 @@ package common
import (
"fmt"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
@@ -99,7 +101,10 @@ func (ctx *InsertCtx) AddLabel(name, value string) {
// FlushBufs flushes buffered rows to the underlying storage.
func (ctx *InsertCtx) FlushBufs() error {
if err := vmstorage.AddRows(ctx.mrs); err != nil {
return fmt.Errorf("cannot store metrics: %s", err)
return &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot store metrics: %s", err),
StatusCode: http.StatusServiceUnavailable,
}
}
return nil
}


@@ -3,9 +3,11 @@ package concurrencylimiter
import (
"flag"
"fmt"
"net/http"
"runtime"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timerpool"
"github.com/VictoriaMetrics/metrics"
)
@@ -53,7 +55,10 @@ func Do(f func() error) error {
case <-t.C:
timerpool.Put(t)
concurrencyLimitTimeout.Inc()
return fmt.Errorf("the server is overloaded with %d concurrent inserts; either increase -maxConcurrentInserts or reduce the load", cap(ch))
return &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("the server is overloaded with %d concurrent inserts; either increase -maxConcurrentInserts or reduce the load", cap(ch)),
StatusCode: http.StatusServiceUnavailable,
}
}
}


@@ -4,6 +4,8 @@ import (
"fmt"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fastjson/fastfloat"
)
@@ -34,10 +36,8 @@ func (rs *Rows) Reset() {
// See https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
//
// s must be unchanged until rs is in use.
func (rs *Rows) Unmarshal(s string) error {
var err error
rs.Rows, rs.tagsPool, err = unmarshalRows(rs.Rows[:0], s, rs.tagsPool[:0])
return err
func (rs *Rows) Unmarshal(s string) {
rs.Rows, rs.tagsPool = unmarshalRows(rs.Rows[:0], s, rs.tagsPool[:0])
}
// Row is a single graphite row.
@@ -80,6 +80,9 @@ func (r *Row) unmarshal(s string, tagsPool []Tag) ([]Tag, error) {
tags := tagsPool[tagsStart:]
r.Tags = tags[:len(tags):len(tags)]
}
if len(r.Metric) == 0 {
return tagsPool, fmt.Errorf("metric cannot be empty")
}
n = strings.IndexByte(tail, ' ')
if n < 0 {
@@ -92,41 +95,46 @@ func (r *Row) unmarshal(s string, tagsPool []Tag) ([]Tag, error) {
return tagsPool, nil
}
func unmarshalRows(dst []Row, s string, tagsPool []Tag) ([]Row, []Tag, error) {
func unmarshalRows(dst []Row, s string, tagsPool []Tag) ([]Row, []Tag) {
for len(s) > 0 {
n := strings.IndexByte(s, '\n')
if n == 0 {
// Skip empty line
s = s[1:]
continue
}
if cap(dst) > len(dst) {
dst = dst[:len(dst)+1]
} else {
dst = append(dst, Row{})
}
r := &dst[len(dst)-1]
if n < 0 {
// The last line.
var err error
tagsPool, err = r.unmarshal(s, tagsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal Graphite line %q: %s", s, err)
return dst, tagsPool, err
}
return dst, tagsPool, nil
}
var err error
tagsPool, err = r.unmarshal(s[:n], tagsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal Graphite line %q: %s", s[:n], err)
return dst, tagsPool, err
return unmarshalRow(dst, s, tagsPool)
}
dst, tagsPool = unmarshalRow(dst, s[:n], tagsPool)
s = s[n+1:]
}
return dst, tagsPool, nil
return dst, tagsPool
}
func unmarshalRow(dst []Row, s string, tagsPool []Tag) ([]Row, []Tag) {
if len(s) > 0 && s[len(s)-1] == '\r' {
s = s[:len(s)-1]
}
if len(s) == 0 {
// Skip empty line
return dst, tagsPool
}
if cap(dst) > len(dst) {
dst = dst[:len(dst)+1]
} else {
dst = append(dst, Row{})
}
r := &dst[len(dst)-1]
var err error
tagsPool, err = r.unmarshal(s, tagsPool)
if err != nil {
dst = dst[:len(dst)-1]
logger.Errorf("cannot unmarshal Graphite line %q: %s", s, err)
invalidLines.Inc()
}
return dst, tagsPool
}
var invalidLines = metrics.NewCounter(`vm_rows_invalid_total{type="graphite"}`)
func unmarshalTags(dst []Tag, s string) ([]Tag, error) {
for {
if cap(dst) > len(dst) {
@@ -142,12 +150,20 @@ func unmarshalTags(dst []Tag, s string) ([]Tag, error) {
if err := tag.unmarshal(s); err != nil {
return dst[:len(dst)-1], err
}
if len(tag.Key) == 0 || len(tag.Value) == 0 {
// Skip empty tag
dst = dst[:len(dst)-1]
}
return dst, nil
}
if err := tag.unmarshal(s[:n]); err != nil {
return dst[:len(dst)-1], err
}
s = s[n+1:]
if len(tag.Key) == 0 || len(tag.Value) == 0 {
// Skip empty tag
dst = dst[:len(dst)-1]
}
}
}
@@ -169,9 +185,6 @@ func (t *Tag) unmarshal(s string) error {
return fmt.Errorf("missing tag value for %q", s)
}
t.Key = s[:n]
if len(t.Key) == 0 {
return fmt.Errorf("tag key cannot be empty for %q", s)
}
t.Value = s[n+1:]
return nil
}


@@ -9,45 +9,42 @@ func TestRowsUnmarshalFailure(t *testing.T) {
f := func(s string) {
t.Helper()
var rows Rows
if err := rows.Unmarshal(s); err == nil {
t.Fatalf("expecting non-nil error when parsing %q", s)
rows.Unmarshal(s)
if len(rows.Rows) != 0 {
t.Fatalf("unexpected number of rows parsed; got %d; want 0", len(rows.Rows))
}
// Try again
if err := rows.Unmarshal(s); err == nil {
t.Fatalf("expecting non-nil error when parsing %q", s)
rows.Unmarshal(s)
if len(rows.Rows) != 0 {
t.Fatalf("unexpected number of rows parsed; got %d; want 0", len(rows.Rows))
}
}
// Missing metric
f(" 123 455")
// Missing value
f("aaa")
// Invalid multiline
f("aaa\nbbb 123 34")
// missing tag
f("aa; 12 34")
// missing tag value
f("aa;bb 23 34")
f("aa;=dsd 234 45")
}
func TestRowsUnmarshalSuccess(t *testing.T) {
f := func(s string, rowsExpected *Rows) {
t.Helper()
var rows Rows
if err := rows.Unmarshal(s); err != nil {
t.Fatalf("cannot unmarshal %q: %s", s, err)
}
rows.Unmarshal(s)
if !reflect.DeepEqual(rows.Rows, rowsExpected.Rows) {
t.Fatalf("unexpected rows;\ngot\n%+v;\nwant\n%+v", rows.Rows, rowsExpected.Rows)
}
// Try unmarshaling again
if err := rows.Unmarshal(s); err != nil {
t.Fatalf("cannot unmarshal %q: %s", s, err)
}
rows.Unmarshal(s)
if !reflect.DeepEqual(rows.Rows, rowsExpected.Rows) {
t.Fatalf("unexpected rows;\ngot\n%+v;\nwant\n%+v", rows.Rows, rowsExpected.Rows)
}
@@ -60,7 +57,9 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
// Empty line
f("", &Rows{})
f("\r", &Rows{})
f("\n\n", &Rows{})
f("\n\r\n", &Rows{})
// Single line
f("foobar -123.456 789", &Rows{
@@ -86,6 +85,15 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
}},
})
// Timestamp bigger than 1<<31
f("aaa 1123 429496729600", &Rows{
Rows: []Row{{
Metric: "aaa",
Value: 1123,
Timestamp: 429496729600,
}},
})
// Tags
f("foo;bar=baz 1 2", &Rows{
Rows: []Row{{
@@ -98,7 +106,8 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
Timestamp: 2,
}},
})
f("foo;bar=baz;aa=;x=y 1 2", &Rows{
// Empty tags
f("foo;bar=baz;aa=;x=y;=z 1 2", &Rows{
Rows: []Row{{
Metric: "foo",
Tags: []Tag{
@@ -106,10 +115,6 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
Key: "bar",
Value: "baz",
},
{
Key: "aa",
Value: "",
},
{
Key: "x",
Value: "y",
@@ -139,4 +144,20 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
},
},
})
// Multi lines with invalid line
f("foo 0.3 2\naaa\nbar.baz 0.34 43\n", &Rows{
Rows: []Row{
{
Metric: "foo",
Value: 0.3,
Timestamp: 2,
},
{
Metric: "bar.baz",
Value: 0.34,
Timestamp: 43,
},
},
})
}


@@ -16,8 +16,9 @@ cpu.usage_irq 0.34432 1234556768
b.RunParallel(func(pb *testing.PB) {
var rows Rows
for pb.Next() {
if err := rows.Unmarshal(s); err != nil {
panic(fmt.Errorf("cannot unmarshal %q: %s", s, err))
rows.Unmarshal(s)
if len(rows.Rows) != 4 {
panic(fmt.Errorf("unexpected number of rows unmarshaled: got %d; want 4", len(rows.Rows)))
}
}
})


@@ -85,11 +85,7 @@ func (ctx *pushCtx) Read(r io.Reader) bool {
return false
}
}
if err := ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf)); err != nil {
graphiteUnmarshalErrors.Inc()
ctx.err = fmt.Errorf("cannot unmarshal graphite plaintext protocol data with size %d: %s", len(ctx.reqBuf), err)
return false
}
ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf))
// Fill missing timestamps with the current timestamp rounded to seconds.
currentTimestamp := time.Now().Unix()
@@ -136,9 +132,8 @@ func (ctx *pushCtx) reset() {
}
var (
graphiteReadCalls = metrics.NewCounter(`vm_read_calls_total{name="graphite"}`)
graphiteReadErrors = metrics.NewCounter(`vm_read_errors_total{name="graphite"}`)
graphiteUnmarshalErrors = metrics.NewCounter(`vm_unmarshal_errors_total{name="graphite"}`)
graphiteReadCalls = metrics.NewCounter(`vm_read_calls_total{name="graphite"}`)
graphiteReadErrors = metrics.NewCounter(`vm_read_errors_total{name="graphite"}`)
)
func getPushCtx() *pushCtx {


@@ -4,6 +4,8 @@ import (
"fmt"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fastjson/fastfloat"
)
@@ -41,10 +43,8 @@ func (rs *Rows) Reset() {
// See https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/
//
// s must be unchanged until rs is in use.
func (rs *Rows) Unmarshal(s string) error {
var err error
rs.Rows, rs.tagsPool, rs.fieldsPool, err = unmarshalRows(rs.Rows[:0], s, rs.tagsPool[:0], rs.fieldsPool[:0])
return err
func (rs *Rows) Unmarshal(s string) {
rs.Rows, rs.tagsPool, rs.fieldsPool = unmarshalRows(rs.Rows[:0], s, rs.tagsPool[:0], rs.fieldsPool[:0])
}
// Row is a single influx row.
@@ -62,9 +62,8 @@ func (r *Row) reset() {
r.Timestamp = 0
}
func (r *Row) unmarshal(s string, tagsPool []Tag, fieldsPool []Field) ([]Tag, []Field, error) {
func (r *Row) unmarshal(s string, tagsPool []Tag, fieldsPool []Field, noEscapeChars bool) ([]Tag, []Field, error) {
r.reset()
noEscapeChars := strings.IndexByte(s, '\\') < 0
n := nextUnescapedChar(s, ' ', noEscapeChars)
if n < 0 {
return tagsPool, fieldsPool, fmt.Errorf("cannot find Whitespace I in %q", s)
@@ -86,9 +85,7 @@ func (r *Row) unmarshal(s string, tagsPool []Tag, fieldsPool []Field) ([]Tag, []
measurementTags = measurementTags[:n]
}
r.Measurement = unescapeTagValue(measurementTags, noEscapeChars)
if len(r.Measurement) == 0 {
return tagsPool, fieldsPool, fmt.Errorf("measurement cannot be empty. measurementTags=%q", s)
}
// Allow empty r.Measurement. In this case metric name is constructed directly from field keys.
// Parse fields
fieldsStart := len(fieldsPool)
@@ -138,9 +135,6 @@ func (tag *Tag) unmarshal(s string, noEscapeChars bool) error {
return fmt.Errorf("missing tag value for %q", s)
}
tag.Key = unescapeTagValue(s[:n], noEscapeChars)
if len(tag.Key) == 0 {
return fmt.Errorf("tag key cannot be empty")
}
tag.Value = unescapeTagValue(s[n+1:], noEscapeChars)
return nil
}
@@ -174,41 +168,51 @@ func (f *Field) unmarshal(s string, noEscapeChars, hasQuotedFields bool) error {
return nil
}
func unmarshalRows(dst []Row, s string, tagsPool []Tag, fieldsPool []Field) ([]Row, []Tag, []Field, error) {
func unmarshalRows(dst []Row, s string, tagsPool []Tag, fieldsPool []Field) ([]Row, []Tag, []Field) {
noEscapeChars := strings.IndexByte(s, '\\') < 0
for len(s) > 0 {
n := strings.IndexByte(s, '\n')
if n == 0 {
// Skip empty line
s = s[1:]
continue
}
if cap(dst) > len(dst) {
dst = dst[:len(dst)+1]
} else {
dst = append(dst, Row{})
}
r := &dst[len(dst)-1]
if n < 0 {
// The last line.
var err error
tagsPool, fieldsPool, err = r.unmarshal(s, tagsPool, fieldsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal Influx line %q: %s", s, err)
return dst, tagsPool, fieldsPool, err
}
return dst, tagsPool, fieldsPool, nil
}
var err error
tagsPool, fieldsPool, err = r.unmarshal(s[:n], tagsPool, fieldsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal Influx line %q: %s", s[:n], err)
return dst, tagsPool, fieldsPool, err
return unmarshalRow(dst, s, tagsPool, fieldsPool, noEscapeChars)
}
dst, tagsPool, fieldsPool = unmarshalRow(dst, s[:n], tagsPool, fieldsPool, noEscapeChars)
s = s[n+1:]
}
return dst, tagsPool, fieldsPool, nil
return dst, tagsPool, fieldsPool
}
func unmarshalRow(dst []Row, s string, tagsPool []Tag, fieldsPool []Field, noEscapeChars bool) ([]Row, []Tag, []Field) {
if len(s) > 0 && s[len(s)-1] == '\r' {
s = s[:len(s)-1]
}
if len(s) == 0 {
// Skip empty line
return dst, tagsPool, fieldsPool
}
if s[0] == '#' {
// Skip comment
return dst, tagsPool, fieldsPool
}
if cap(dst) > len(dst) {
dst = dst[:len(dst)+1]
} else {
dst = append(dst, Row{})
}
r := &dst[len(dst)-1]
var err error
tagsPool, fieldsPool, err = r.unmarshal(s, tagsPool, fieldsPool, noEscapeChars)
if err != nil {
dst = dst[:len(dst)-1]
logger.Errorf("cannot unmarshal Influx line %q: %s; skipping it", s, err)
invalidLines.Inc()
}
return dst, tagsPool, fieldsPool
}
var invalidLines = metrics.NewCounter(`vm_rows_invalid_total{type="influx"}`)
func unmarshalTags(dst []Tag, s string, noEscapeChars bool) ([]Tag, error) {
for {
if cap(dst) > len(dst) {
@@ -220,14 +224,22 @@ func unmarshalTags(dst []Tag, s string, noEscapeChars bool) ([]Tag, error) {
n := nextUnescapedChar(s, ',', noEscapeChars)
if n < 0 {
if err := tag.unmarshal(s, noEscapeChars); err != nil {
return dst, err
return dst[:len(dst)-1], err
}
if len(tag.Key) == 0 || len(tag.Value) == 0 {
// Skip empty tag
dst = dst[:len(dst)-1]
}
return dst, nil
}
if err := tag.unmarshal(s[:n], noEscapeChars); err != nil {
return dst, err
return dst[:len(dst)-1], err
}
s = s[n+1:]
if len(tag.Key) == 0 || len(tag.Value) == 0 {
// Skip empty tag
dst = dst[:len(dst)-1]
}
}
}


@@ -74,19 +74,18 @@ func TestRowsUnmarshalFailure(t *testing.T) {
f := func(s string) {
t.Helper()
var rows Rows
if err := rows.Unmarshal(s); err == nil {
t.Fatalf("expecting non-nil error when parsing %q", s)
rows.Unmarshal(s)
if len(rows.Rows) != 0 {
t.Fatalf("expecting zero rows; got %d rows", len(rows.Rows))
}
// Try again
if err := rows.Unmarshal(s); err == nil {
t.Fatalf("expecting non-nil error when parsing %q", s)
rows.Unmarshal(s)
if len(rows.Rows) != 0 {
t.Fatalf("expecting zero rows; got %d rows", len(rows.Rows))
}
}
// Missing measurement
f(",foo=bar baz=123")
// No fields
f("foo")
f("foo,bar=baz 1234")
@@ -94,12 +93,8 @@ func TestRowsUnmarshalFailure(t *testing.T) {
// Missing tag value
f("foo,bar")
f("foo,bar baz")
f("foo,bar= baz")
f("foo,bar=123, 123")
// Missing tag name
f("foo,=bar baz=234")
// Missing field value
f("foo bar")
f("foo bar=")
@@ -122,17 +117,13 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
f := func(s string, rowsExpected *Rows) {
t.Helper()
var rows Rows
if err := rows.Unmarshal(s); err != nil {
t.Fatalf("cannot unmarshal %q: %s", s, err)
}
rows.Unmarshal(s)
if !reflect.DeepEqual(rows.Rows, rowsExpected.Rows) {
t.Fatalf("unexpected rows;\ngot\n%+v;\nwant\n%+v", rows.Rows, rowsExpected.Rows)
}
// Try unmarshaling again
if err := rows.Unmarshal(s); err != nil {
t.Fatalf("cannot unmarshal %q: %s", s, err)
}
rows.Unmarshal(s)
if !reflect.DeepEqual(rows.Rows, rowsExpected.Rows) {
t.Fatalf("unexpected rows;\ngot\n%+v;\nwant\n%+v", rows.Rows, rowsExpected.Rows)
}
@@ -146,6 +137,36 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
// Empty line
f("", &Rows{})
f("\n\n", &Rows{})
f("\n\r\n", &Rows{})
// Comment
f("\n# foobar\n", &Rows{})
f("#foobar baz", &Rows{})
f("#foobar baz\n#sss", &Rows{})
// Missing measurement
f(" baz=123", &Rows{
Rows: []Row{{
Measurement: "",
Fields: []Field{{
Key: "baz",
Value: 123,
}},
}},
})
f(",foo=bar baz=123", &Rows{
Rows: []Row{{
Measurement: "",
Tags: []Tag{{
Key: "foo",
Value: "bar",
}},
Fields: []Field{{
Key: "baz",
Value: 123,
}},
}},
})
// Minimal line without tags and timestamp
f("foo bar=123", &Rows{
@@ -157,6 +178,15 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
}},
}},
})
f("# comment\nfoo bar=123\r\n#comment2 sdsf dsf", &Rows{
Rows: []Row{{
Measurement: "foo",
Fields: []Field{{
Key: "bar",
Value: 123,
}},
}},
})
f("foo bar=123\n", &Rows{
Rows: []Row{{
Measurement: "foo",
@@ -216,7 +246,7 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
})
// Line with empty tag values
f("foo,tag1=xyz,tagN=,tag2=43as bar=123", &Rows{
f("foo,tag1=xyz,tagN=,tag2=43as,=xxx bar=123", &Rows{
Rows: []Row{{
Measurement: "foo",
Tags: []Tag{
@@ -224,10 +254,6 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
Key: "tag1",
Value: "xyz",
},
{
Key: "tagN",
Value: "",
},
{
Key: "tag2",
Value: "43as",
@@ -309,11 +335,11 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
})
// Escape chars
f(`fo\,bar\=baz,x\==\\a\,\=\q\ \\\a\=\,=4.34`, &Rows{
f(`fo\,bar\=baz,x\=\b=\\a\,\=\q\ \\\a\=\,=4.34`, &Rows{
Rows: []Row{{
Measurement: `fo,bar=baz`,
Tags: []Tag{{
Key: `x=`,
Key: `x=\b`,
Value: `\a,=\q `,
}},
Fields: []Field{{
@@ -348,6 +374,34 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
},
},
})
// Multiple lines with invalid line in the middle.
f("foo,tag=xyz field=1.23 48934\n"+
"invalid line\n"+
"bar x=-1i\n\n", &Rows{
Rows: []Row{
{
Measurement: "foo",
Tags: []Tag{{
Key: "tag",
Value: "xyz",
}},
Fields: []Field{{
Key: "field",
Value: 1.23,
}},
Timestamp: 48934,
},
{
Measurement: "bar",
Fields: []Field{{
Key: "x",
Value: -1,
}},
},
},
})
// No newline after the second line.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/82
f("foo,tag=xyz field=1.23 48934\n"+
@@ -374,4 +428,24 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
},
},
})
f("x,y=z,g=p:\\ \\ 5432\\,\\ gp\\ mon\\ [lol]\\ con10\\ cmd5\\ SELECT f=1", &Rows{
Rows: []Row{{
Measurement: "x",
Tags: []Tag{
{
Key: "y",
Value: "z",
},
{
Key: "g",
Value: "p: 5432, gp mon [lol] con10 cmd5 SELECT",
},
},
Fields: []Field{{
Key: "f",
Value: 1,
}},
}},
})
}


@@ -6,14 +6,19 @@ import (
)
func BenchmarkRowsUnmarshal(b *testing.B) {
s := `cpu usage_user=1.23,usage_system=4.34,usage_iowait=0.1112 1234556768`
s := `cpu usage_user=1.23,usage_system=4.34,usage_iowait=0.1112 1234556768
cpu usage_user=1.23,usage_system=4.34,usage_iowait=0.1112 123455676344
aaa usage_user=1.23,usage_system=4.34,usage_iowait=0.1112 123455676344
bbb usage_user=1.23,usage_system=4.34,usage_iowait=0.1112 123455676344
`
b.SetBytes(int64(len(s)))
b.ReportAllocs()
b.RunParallel(func(pb *testing.PB) {
var rows Rows
for pb.Next() {
if err := rows.Unmarshal(s); err != nil {
panic(fmt.Errorf("cannot unmarshal %q: %s", s, err))
rows.Unmarshal(s)
if len(rows.Rows) != 4 {
panic(fmt.Errorf("unexpected number of rows parsed; got %d; want 4", len(rows.Rows)))
}
}
})


@@ -90,15 +90,21 @@ func (ctx *pushCtx) InsertRows(db string) error {
for i := range rows {
r := &rows[i]
ic.Labels = ic.Labels[:0]
ic.AddLabel("db", db)
hasDBLabel := false
for j := range r.Tags {
tag := &r.Tags[j]
if tag.Key == "db" {
hasDBLabel = true
}
ic.AddLabel(tag.Key, tag.Value)
}
if len(db) > 0 && !hasDBLabel {
ic.AddLabel("db", db)
}
ctx.metricNameBuf = storage.MarshalMetricNameRaw(ctx.metricNameBuf[:0], ic.Labels)
ctx.metricGroupBuf = append(ctx.metricGroupBuf[:0], r.Measurement...)
skipFieldKey := len(r.Fields) == 1 && *skipSingleField
if !skipFieldKey {
if len(ctx.metricGroupBuf) > 0 && !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, *measurementFieldSeparator...)
}
metricGroupPrefixLen := len(ctx.metricGroupBuf)
@@ -131,11 +137,7 @@ func (ctx *pushCtx) Read(r io.Reader, tsMultiplier int64) bool {
}
return false
}
if err := ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf)); err != nil {
influxUnmarshalErrors.Inc()
ctx.err = fmt.Errorf("cannot unmarshal influx line protocol data with size %d: %s", len(ctx.reqBuf), err)
return false
}
ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf))
// Adjust timestamps according to tsMultiplier
currentTs := time.Now().UnixNano() / 1e6
@@ -164,9 +166,8 @@ func (ctx *pushCtx) Read(r io.Reader, tsMultiplier int64) bool {
}
var (
influxReadCalls = metrics.NewCounter(`vm_read_calls_total{name="influx"}`)
influxReadErrors = metrics.NewCounter(`vm_read_errors_total{name="influx"}`)
influxUnmarshalErrors = metrics.NewCounter(`vm_unmarshal_errors_total{name="influx"}`)
influxReadCalls = metrics.NewCounter(`vm_read_calls_total{name="influx"}`)
influxReadErrors = metrics.NewCounter(`vm_read_errors_total{name="influx"}`)
)
type pushCtx struct {


@@ -13,6 +13,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
)
@@ -21,10 +22,13 @@ var (
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpenTSDB put messages. Usually :4242 must be set. Doesn't work if empty")
opentsdbHTTPListenAddr = flag.String("opentsdbHTTPListenAddr", "", "TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty")
maxInsertRequestSize = flag.Int("maxInsertRequestSize", 32*1024*1024, "The maximum size of a single insert request in bytes")
maxLabelsPerTimeseries = flag.Int("maxLabelsPerTimeseries", 30, "The maximum number of labels accepted per time series. Superfluous labels are dropped")
)
// Init initializes vminsert.
func Init() {
storage.SetMaxLabelsPerTimeseries(*maxLabelsPerTimeseries)
concurrencylimiter.Init()
if len(*graphiteListenAddr) > 0 {
go graphite.Serve(*graphiteListenAddr)


@@ -4,6 +4,8 @@ import (
"fmt"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fastjson/fastfloat"
)
@@ -34,10 +36,8 @@ func (rs *Rows) Reset() {
// See http://opentsdb.net/docs/build/html/api_telnet/put.html
//
// s must be unchanged until rs is in use.
func (rs *Rows) Unmarshal(s string) error {
var err error
rs.Rows, rs.tagsPool, err = unmarshalRows(rs.Rows[:0], s, rs.tagsPool[:0])
return err
func (rs *Rows) Unmarshal(s string) {
rs.Rows, rs.tagsPool = unmarshalRows(rs.Rows[:0], s, rs.tagsPool[:0])
}
// Row is a single OpenTSDB row.
@@ -66,6 +66,9 @@ func (r *Row) unmarshal(s string, tagsPool []Tag) ([]Tag, error) {
return tagsPool, fmt.Errorf("cannot find whitespace between metric and timestamp in %q", s)
}
r.Metric = s[:n]
if len(r.Metric) == 0 {
return tagsPool, fmt.Errorf("metric cannot be empty")
}
tail := s[n+1:]
n = strings.IndexByte(tail, ' ')
if n < 0 {
@@ -89,41 +92,46 @@ func (r *Row) unmarshal(s string, tagsPool []Tag) ([]Tag, error) {
return tagsPool, nil
}
func unmarshalRows(dst []Row, s string, tagsPool []Tag) ([]Row, []Tag, error) {
func unmarshalRows(dst []Row, s string, tagsPool []Tag) ([]Row, []Tag) {
for len(s) > 0 {
n := strings.IndexByte(s, '\n')
if n == 0 {
// Skip empty line
s = s[1:]
continue
}
if cap(dst) > len(dst) {
dst = dst[:len(dst)+1]
} else {
dst = append(dst, Row{})
}
r := &dst[len(dst)-1]
if n < 0 {
// The last line.
var err error
tagsPool, err = r.unmarshal(s, tagsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal OpenTSDB line %q: %s", s, err)
return dst, tagsPool, err
}
return dst, tagsPool, nil
}
var err error
tagsPool, err = r.unmarshal(s[:n], tagsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal OpenTSDB line %q: %s", s[:n], err)
return dst, tagsPool, err
return unmarshalRow(dst, s, tagsPool)
}
dst, tagsPool = unmarshalRow(dst, s[:n], tagsPool)
s = s[n+1:]
}
return dst, tagsPool, nil
return dst, tagsPool
}
func unmarshalRow(dst []Row, s string, tagsPool []Tag) ([]Row, []Tag) {
if len(s) > 0 && s[len(s)-1] == '\r' {
s = s[:len(s)-1]
}
if len(s) == 0 {
// Skip empty line
return dst, tagsPool
}
if cap(dst) > len(dst) {
dst = dst[:len(dst)+1]
} else {
dst = append(dst, Row{})
}
r := &dst[len(dst)-1]
var err error
tagsPool, err = r.unmarshal(s, tagsPool)
if err != nil {
dst = dst[:len(dst)-1]
logger.Errorf("cannot unmarshal OpenTSDB line %q: %s", s, err)
invalidLines.Inc()
}
return dst, tagsPool
}
var invalidLines = metrics.NewCounter(`vm_rows_invalid_total{type="opentsdb"}`)
func unmarshalTags(dst []Tag, s string) ([]Tag, error) {
for {
if cap(dst) > len(dst) {
@@ -139,12 +147,20 @@ func unmarshalTags(dst []Tag, s string) ([]Tag, error) {
if err := tag.unmarshal(s); err != nil {
return dst[:len(dst)-1], err
}
if len(tag.Key) == 0 || len(tag.Value) == 0 {
// Skip empty tag
dst = dst[:len(dst)-1]
}
return dst, nil
}
if err := tag.unmarshal(s[:n]); err != nil {
return dst[:len(dst)-1], err
}
s = s[n+1:]
if len(tag.Key) == 0 || len(tag.Value) == 0 {
// Skip empty tag
dst = dst[:len(dst)-1]
}
}
}
@@ -166,9 +182,6 @@ func (t *Tag) unmarshal(s string) error {
return fmt.Errorf("missing tag value for %q", s)
}
t.Key = s[:n]
if len(t.Key) == 0 {
return fmt.Errorf("tag key cannot be empty for %q", s)
}
t.Value = s[n+1:]
return nil
}


@@ -9,19 +9,24 @@ func TestRowsUnmarshalFailure(t *testing.T) {
f := func(s string) {
t.Helper()
var rows Rows
if err := rows.Unmarshal(s); err == nil {
t.Fatalf("expecting non-nil error when parsing %q", s)
rows.Unmarshal(s)
if len(rows.Rows) != 0 {
t.Fatalf("unexpected number of rows parsed; got %d; want 0", len(rows.Rows))
}
// Try again
if err := rows.Unmarshal(s); err == nil {
t.Fatalf("expecting non-nil error when parsing %q", s)
rows.Unmarshal(s)
if len(rows.Rows) != 0 {
t.Fatalf("unexpected number of rows parsed; got %d; want 0", len(rows.Rows))
}
}
// Missing put prefix
f("xx")
// Missing metric
f("put 111 34")
// Missing timestamp
f("put aaa")
@@ -42,26 +47,19 @@ func TestRowsUnmarshalFailure(t *testing.T) {
// Invalid tag
f("put aaa 123 4.5 foo")
f("put aaa 123 4.5 =")
f("put aaa 123 4.5 =foo")
f("put aaa 123 4.5 =foo a=b")
}
func TestRowsUnmarshalSuccess(t *testing.T) {
f := func(s string, rowsExpected *Rows) {
t.Helper()
var rows Rows
if err := rows.Unmarshal(s); err != nil {
t.Fatalf("cannot unmarshal %q: %s", s, err)
}
rows.Unmarshal(s)
if !reflect.DeepEqual(rows.Rows, rowsExpected.Rows) {
t.Fatalf("unexpected rows;\ngot\n%+v;\nwant\n%+v", rows.Rows, rowsExpected.Rows)
}
// Try unmarshaling again
if err := rows.Unmarshal(s); err != nil {
t.Fatalf("cannot unmarshal %q: %s", s, err)
}
rows.Unmarshal(s)
if !reflect.DeepEqual(rows.Rows, rowsExpected.Rows) {
t.Fatalf("unexpected rows;\ngot\n%+v;\nwant\n%+v", rows.Rows, rowsExpected.Rows)
}
@@ -74,7 +72,9 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
// Empty line
f("", &Rows{})
f("\r", &Rows{})
f("\n\n", &Rows{})
f("\n\r\n", &Rows{})
// Single line
f("put foobar 789 -123.456 a=b", &Rows{
@@ -88,17 +88,13 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
}},
}},
})
// Empty tag value
f("put foobar 789 -123.456 a= b=c", &Rows{
// Empty tag
f("put foobar 789 -123.456 a= b=c =d", &Rows{
Rows: []Row{{
Metric: "foobar",
Value: -123.456,
Timestamp: 789,
Tags: []Tag{
{
Key: "a",
Value: "",
},
{
Key: "b",
Value: "c",
@@ -200,4 +196,27 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
},
},
})
// Multi lines with invalid line
f("put foo 2 0.3 a=b\naaa bbb\nput bar.baz 43 0.34 a=b\n", &Rows{
Rows: []Row{
{
Metric: "foo",
Value: 0.3,
Timestamp: 2,
Tags: []Tag{{
Key: "a",
Value: "b",
}},
},
{
Metric: "bar.baz",
Value: 0.34,
Timestamp: 43,
Tags: []Tag{{
Key: "a",
Value: "b",
}},
},
},
})
}


@@ -16,8 +16,9 @@ put cpu.usage_irq 1234556768 0.34432 a=b
b.RunParallel(func(pb *testing.PB) {
var rows Rows
for pb.Next() {
if err := rows.Unmarshal(s); err != nil {
panic(fmt.Errorf("cannot unmarshal %q: %s", s, err))
rows.Unmarshal(s)
if len(rows.Rows) != 4 {
panic(fmt.Errorf("unexpected number of parsed rows; got %d; want 4", len(rows.Rows)))
}
}
})


@@ -85,11 +85,7 @@ func (ctx *pushCtx) Read(r io.Reader) bool {
return false
}
}
if err := ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf)); err != nil {
opentsdbUnmarshalErrors.Inc()
ctx.err = fmt.Errorf("cannot unmarshal OpenTSDB put protocol data with size %d: %s", len(ctx.reqBuf), err)
return false
}
ctx.Rows.Unmarshal(bytesutil.ToUnsafeString(ctx.reqBuf))
// Fill in missing timestamps
currentTimestamp := time.Now().Unix()
@@ -135,9 +131,8 @@ func (ctx *pushCtx) reset() {
}
var (
opentsdbReadCalls = metrics.NewCounter(`vm_read_calls_total{name="opentsdb"}`)
opentsdbReadErrors = metrics.NewCounter(`vm_read_errors_total{name="opentsdb"}`)
opentsdbUnmarshalErrors = metrics.NewCounter(`vm_unmarshal_errors_total{name="opentsdb"}`)
opentsdbReadCalls = metrics.NewCounter(`vm_read_calls_total{name="opentsdb"}`)
opentsdbReadErrors = metrics.NewCounter(`vm_read_errors_total{name="opentsdb"}`)
)
func getPushCtx() *pushCtx {


@@ -4,6 +4,8 @@ import (
"fmt"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/fastjson"
"github.com/valyala/fastjson/fastfloat"
)
@@ -34,10 +36,8 @@ func (rs *Rows) Reset() {
// See http://opentsdb.net/docs/build/html/api_http/put.html
//
// s must be unchanged until rs is in use.
func (rs *Rows) Unmarshal(av *fastjson.Value) error {
var err error
rs.Rows, rs.tagsPool, err = unmarshalRows(rs.Rows[:0], av, rs.tagsPool[:0])
return err
func (rs *Rows) Unmarshal(av *fastjson.Value) {
rs.Rows, rs.tagsPool = unmarshalRows(rs.Rows[:0], av, rs.tagsPool[:0])
}
// Row is a single OpenTSDB row.
@@ -58,14 +58,14 @@ func (r *Row) reset() {
func (r *Row) unmarshal(o *fastjson.Value, tagsPool []Tag) ([]Tag, error) {
r.reset()
m := o.GetStringBytes("metric")
if m == nil {
if len(m) == 0 {
return tagsPool, fmt.Errorf("missing `metric` in %s", o)
}
r.Metric = bytesutil.ToUnsafeString(m)
rawTs := o.Get("timestamp")
if rawTs != nil {
ts, err := rawTs.Int64()
ts, err := getFloat64(rawTs)
if err != nil {
return tagsPool, fmt.Errorf("invalid `timestamp` in %s: %s", o, err)
}
@@ -80,7 +80,7 @@ func (r *Row) unmarshal(o *fastjson.Value, tagsPool []Tag) ([]Tag, error) {
if rawV == nil {
return tagsPool, fmt.Errorf("missing `value` in %s", o)
}
v, err := getValue(rawV)
v, err := getFloat64(rawV)
if err != nil {
return tagsPool, fmt.Errorf("invalid `value` in %s: %s", o, err)
}
@@ -106,7 +106,7 @@ func (r *Row) unmarshal(o *fastjson.Value, tagsPool []Tag) ([]Tag, error) {
return tagsPool, nil
}
func getValue(v *fastjson.Value) (float64, error) {
func getFloat64(v *fastjson.Value) (float64, error) {
switch v.Type() {
case fastjson.TypeNumber:
return v.Float64()
@@ -122,26 +122,24 @@ func getValue(v *fastjson.Value) (float64, error) {
}
}
func unmarshalRows(dst []Row, av *fastjson.Value, tagsPool []Tag) ([]Row, []Tag, error) {
func unmarshalRows(dst []Row, av *fastjson.Value, tagsPool []Tag) ([]Row, []Tag) {
switch av.Type() {
case fastjson.TypeObject:
return unmarshalRow(dst, av, tagsPool)
case fastjson.TypeArray:
a, _ := av.Array()
for i, o := range a {
var err error
dst, tagsPool, err = unmarshalRow(dst, o, tagsPool)
if err != nil {
return dst, tagsPool, fmt.Errorf("cannot unmarshal %d object out of %d objects: %s", i, len(a), err)
}
for _, o := range a {
dst, tagsPool = unmarshalRow(dst, o, tagsPool)
}
return dst, tagsPool, nil
return dst, tagsPool
default:
return dst, tagsPool, fmt.Errorf("OpenTSDB body must be either object or array; got %s; body=%s", av.Type(), av)
logger.Errorf("OpenTSDB JSON must be either object or array; got %s; body=%s", av.Type(), av)
invalidLines.Inc()
return dst, tagsPool
}
}
func unmarshalRow(dst []Row, o *fastjson.Value, tagsPool []Tag) ([]Row, []Tag, error) {
func unmarshalRow(dst []Row, o *fastjson.Value, tagsPool []Tag) ([]Row, []Tag) {
if cap(dst) > len(dst) {
dst = dst[:len(dst)+1]
} else {
@@ -151,11 +149,15 @@ func unmarshalRow(dst []Row, o *fastjson.Value, tagsPool []Tag) ([]Row, []Tag, e
var err error
tagsPool, err = r.unmarshal(o, tagsPool)
if err != nil {
return dst, tagsPool, fmt.Errorf("cannot unmarshal OpenTSDB object %s: %s", o, err)
dst = dst[:len(dst)-1]
logger.Errorf("cannot unmarshal OpenTSDB object %s: %s", o, err)
invalidLines.Inc()
}
return dst, tagsPool, nil
return dst, tagsPool
}
var invalidLines = metrics.NewCounter(`vm_rows_invalid_total{type="opentsdb-http"}`)
func unmarshalTags(dst []Tag, o *fastjson.Object) ([]Tag, error) {
var err error
o.Visit(func(k []byte, v *fastjson.Value) {
@@ -163,6 +165,10 @@ func unmarshalTags(dst []Tag, o *fastjson.Object) ([]Tag, error) {
err = fmt.Errorf("tag value must be string; got %s; value=%s", v.Type(), v)
return
}
if len(k) == 0 {
// Skip empty tags
return
}
vStr, _ := v.StringBytes()
if len(vStr) == 0 {
// Skip empty tags
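
The switch from `rawTs.Int64()` to `getFloat64(rawTs)` means a timestamp may now arrive as a JSON number or as a JSON string, and fractional values are truncated to an integer, as the new success tests show. The sketch below illustrates that behavior with the standard `encoding/json` package instead of `fastjson`. The `parseTimestamp` name is hypothetical and the code is an approximation, not the repository's implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// parseTimestamp accepts a JSON number or a JSON string holding a number
// and truncates the result to an integer. Other JSON types are rejected.
func parseTimestamp(raw []byte) (int64, error) {
	var v interface{}
	if err := json.Unmarshal(raw, &v); err != nil {
		return 0, err
	}
	switch x := v.(type) {
	case float64:
		return int64(x), nil
	case string:
		f, err := strconv.ParseFloat(x, 64)
		if err != nil {
			return 0, fmt.Errorf("invalid timestamp %q: %s", x, err)
		}
		return int64(f), nil
	default:
		return 0, fmt.Errorf("timestamp must be a number or a string; got %T", v)
	}
}

func main() {
	for _, s := range []string{`1789`, `"1789"`, `17.89`, `[1,2]`} {
		ts, err := parseTimestamp([]byte(s))
		fmt.Println(s, ts, err)
	}
}
```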


@@ -17,12 +17,14 @@ func TestRowsUnmarshalFailure(t *testing.T) {
return
}
// Verify OpenTSDB body parsing error
if err := rows.Unmarshal(v); err == nil {
t.Fatalf("expecting non-nil error when parsing %q", s)
rows.Unmarshal(v)
if len(rows.Rows) != 0 {
t.Fatalf("unexpected number of rows parsed; got %d; want 0", len(rows.Rows))
}
// Try again
if err := rows.Unmarshal(v); err == nil {
t.Fatalf("expecting non-nil error when parsing %q", s)
rows.Unmarshal(v)
if len(rows.Rows) != 0 {
t.Fatalf("unexpected number of rows parsed; got %d; want 0", len(rows.Rows))
}
}
@@ -48,14 +50,15 @@ func TestRowsUnmarshalFailure(t *testing.T) {
f(`{"metric": "aaa", "timestamp": 1122, "value": "0.0.0"}`)
// Invalid metric type
f(`{"metric": "", "timestamp": 1122, "value": 0.45, "tags": {"foo": "bar"}}`)
f(`{"metric": ["aaa"], "timestamp": 1122, "value": 0.45, "tags": {"foo": "bar"}}`)
f(`{"metric": {"aaa":1}, "timestamp": 1122, "value": 0.45, "tags": {"foo": "bar"}}`)
f(`{"metric": 1, "timestamp": 1122, "value": 0.45, "tags": {"foo": "bar"}}`)
// Invalid timestamp type
f(`{"metric": "aaa", "timestamp": "foobar", "value": 0.45, "tags": {"foo": "bar"}}`)
f(`{"metric": "aaa", "timestamp": 123.456, "value": 0.45, "tags": {"foo": "bar"}}`)
f(`{"metric": "aaa", "timestamp": "123", "value": 0.45, "tags": {"foo": "bar"}}`)
f(`{"metric": "aaa", "timestamp": [1,2], "value": 0.45, "tags": {"foo": "bar"}}`)
f(`{"metric": "aaa", "timestamp": {"a":1}, "value": 0.45, "tags": {"foo": "bar"}}`)
// Invalid value type
f(`{"metric": "aaa", "timestamp": 1122, "value": [0,1], "tags": {"foo":"bar"}}`)
@@ -73,7 +76,7 @@ func TestRowsUnmarshalFailure(t *testing.T) {
f(`{"metric": "aaa", "timestamp": 1122, "value": 0.45, "tags": {"foo": 1}}`)
// Invalid multiline
f(`[{"metric": "aaa", "timestamp": 1122, "value": "trt", "tags":{"foo":"bar"}}, {"metric": "aaa", "timestamp": 1122, "value": 111}]`)
f(`[{"metric": "aaa", "timestamp": 1122, "value": "trt", "tags":{"foo":"bar"}}, {"metric": "aaa", "timestamp": [1122], "value": 111}]`)
}
func TestRowsUnmarshalSuccess(t *testing.T) {
@@ -87,17 +90,13 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
if err != nil {
t.Fatalf("cannot parse json %s: %s", s, err)
}
if err := rows.Unmarshal(v); err != nil {
t.Fatalf("cannot unmarshal %s: %s", v, err)
}
rows.Unmarshal(v)
if !reflect.DeepEqual(rows.Rows, rowsExpected.Rows) {
t.Fatalf("unexpected rows;\ngot\n%+v;\nwant\n%+v", rows.Rows, rowsExpected.Rows)
}
// Try unmarshaling again
if err := rows.Unmarshal(v); err != nil {
t.Fatalf("cannot unmarshal %s: %s", v, err)
}
rows.Unmarshal(v)
if !reflect.DeepEqual(rows.Rows, rowsExpected.Rows) {
t.Fatalf("unexpected rows;\ngot\n%+v;\nwant\n%+v", rows.Rows, rowsExpected.Rows)
}
@@ -120,6 +119,30 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
}},
}},
})
// Timestamp as string
f(`{"metric": "foobar", "timestamp": "1789", "value": -123.456, "tags": {"a":"b"}}`, &Rows{
Rows: []Row{{
Metric: "foobar",
Value: -123.456,
Timestamp: 1789,
Tags: []Tag{{
Key: "a",
Value: "b",
}},
}},
})
// Timestamp as float64 (it is truncated to integer)
f(`{"metric": "foobar", "timestamp": 17.89, "value": -123.456, "tags": {"a":"b"}}`, &Rows{
Rows: []Row{{
Metric: "foobar",
Value: -123.456,
Timestamp: 17,
Tags: []Tag{{
Key: "a",
Value: "b",
}},
}},
})
// Empty tags
f(`{"metric": "foobar", "timestamp": 789, "value": -123.456, "tags": {}}`, &Rows{
Rows: []Row{{
@@ -139,7 +162,7 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
}},
})
// Empty tag value
f(`{"metric": "foobar", "timestamp": 123, "value": -123.456, "tags": {"a":"", "b":"c"}}`, &Rows{
f(`{"metric": "foobar", "timestamp": 123, "value": -123.456, "tags": {"a":"", "b":"c", "": "d"}}`, &Rows{
Rows: []Row{{
Metric: "foobar",
Value: -123.456,


@@ -24,8 +24,9 @@ func BenchmarkRowsUnmarshal(b *testing.B) {
if err != nil {
panic(fmt.Errorf("cannot parse %q: %s", s, err))
}
if err := rows.Unmarshal(v); err != nil {
panic(fmt.Errorf("cannot unmarshal %q: %s", s, err))
rows.Unmarshal(v)
if len(rows.Rows) != 4 {
panic(fmt.Errorf("unexpected number of rows unmarshaled; got %d; want 4", len(rows.Rows)))
}
}
})


@@ -69,10 +69,7 @@ func insertHandlerInternal(req *http.Request, maxSize int64) error {
opentsdbUnmarshalErrors.Inc()
return fmt.Errorf("cannot parse HTTP OpenTSDB json: %s", err)
}
if err := ctx.Rows.Unmarshal(v); err != nil {
opentsdbUnmarshalErrors.Inc()
return fmt.Errorf("cannot unmarshal HTTP OpenTSDB json %s, %s", err, v)
}
ctx.Rows.Unmarshal(v)
// Fill in missing timestamps
currentTimestamp := time.Now().Unix()


@@ -38,7 +38,7 @@ func Serve(addr string, maxReqSize int64) {
return
}
if err != nil {
logger.Fatalf("FATAL: error serving HTTP OpenTSDB: %s", err)
logger.Fatalf("error serving HTTP OpenTSDB: %s", err)
}
}()
}
@@ -65,6 +65,6 @@ func Stop() {
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
if err := httpServer.Shutdown(ctx); err != nil {
logger.Fatalf("FATAL: cannot close HTTP OpenTSDB server: %s", err)
logger.Fatalf("cannot close HTTP OpenTSDB server: %s", err)
}
}

app/vmrestore/Makefile Normal file

@@ -0,0 +1,37 @@
# All these commands must run from repository root.
vmrestore:
APP_NAME=vmrestore $(MAKE) app-local
vmrestore-prod:
APP_NAME=vmrestore $(MAKE) app-via-docker
package-vmrestore:
APP_NAME=vmrestore $(MAKE) package-via-docker
publish-vmrestore:
APP_NAME=vmrestore $(MAKE) publish-via-docker
vmrestore-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-arm ./app/vmrestore
vmrestore-arm-prod:
APP_NAME=vmrestore APP_SUFFIX='-arm' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm' $(MAKE) app-via-docker
vmrestore-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-arm64 ./app/vmrestore
vmrestore-arm64-prod:
APP_NAME=vmrestore APP_SUFFIX='-arm64' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm64' $(MAKE) app-via-docker
vmrestore-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/vmrestore-386 ./app/vmrestore
vmrestore-386-prod:
APP_NAME=vmrestore APP_SUFFIX='-386' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=386' $(MAKE) app-via-docker
vmrestore-pure:
APP_NAME=vmrestore $(MAKE) app-local-pure
vmrestore-pure-prod:
APP_NAME=vmrestore APP_SUFFIX='-pure' DOCKER_OPTS='--env CGO_ENABLED=0' $(MAKE) app-via-docker

app/vmrestore/README.md Normal file

@@ -0,0 +1,86 @@
## vmrestore
`vmrestore` restores data from backups created by [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md).
VictoriaMetrics `v1.29.0` and newer versions must be used for working with the restored data.
The restore process can be interrupted at any time. It is automatically resumed from the interruption point
when restarting `vmrestore` with the same args.
### Usage
VictoriaMetrics must be stopped during the restore process.
```
vmrestore -src=gcs://<bucket>/<path/to/backup> -storageDataPath=<local/path/to/restore>
```
* `<bucket>` is [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets) name.
* `<path/to/backup>` is the path to backup made with [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md) on GCS bucket.
* `<local/path/to/restore>` is the path to folder where data will be restored. This folder must be passed
to VictoriaMetrics in `-storageDataPath` command-line flag after the restore process is complete.
The original `-storageDataPath` directory may contain old files. They will be substituted by the files from backup.
### Troubleshooting
* If `vmrestore` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value.
* If `vmrestore` has been interrupted due to temporary error, then just restart it with the same args. It will resume the restore process.
### Advanced usage
Run `vmrestore -help` in order to see all the available options:
```
-concurrency int
The number of concurrent workers. Higher concurrency may reduce restore duration (default 10)
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs (default "default")
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, ERROR, FATAL, PANIC (default "INFO")
-maxBytesPerSecond int
The maximum download speed. There is no limit if it is set to 0
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy (default 60)
-src string
Source path with backup on the remote storage. Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-storageDataPath string
Destination path where backup must be restored. VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case only missing data is downloaded from backup (default "victoria-metrics-data")
-version
Show VictoriaMetrics version
```
### How to build from sources
It is recommended to use [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - see `vmutils-*` archives there.
#### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
2. Run `make vmrestore` from the root folder of the repository.
It builds `vmrestore` binary and puts it into the `bin` folder.
#### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmrestore-prod` from the root folder of the repository.
It builds `vmrestore-prod` binary and puts it into the `bin` folder.
#### Building docker images
Run `make package-vmrestore`. It builds `victoriametrics/vmrestore:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmrestore`.


@@ -0,0 +1,5 @@
FROM scratch
COPY --from=local/certs:1.0.3 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY bin/vmrestore-prod .
EXPOSE 8428
ENTRYPOINT ["/vmrestore-prod"]

app/vmrestore/main.go Normal file

@@ -0,0 +1,78 @@
package main
import (
"flag"
"fmt"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/actions"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/backup/fslocal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
var (
src = flag.String("src", "", "Source path with backup on the remote storage. "+
"Example: gcs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir")
storageDataPath = flag.String("storageDataPath", "victoria-metrics-data", "Destination path where backup must be restored. "+
"VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case only missing data is downloaded from backup")
concurrency = flag.Int("concurrency", 10, "The number of concurrent workers. Higher concurrency may reduce restore duration")
maxBytesPerSecond = flag.Int("maxBytesPerSecond", 0, "The maximum download speed. There is no limit if it is set to 0")
)
func main() {
flag.Usage = usage
flag.Parse()
buildinfo.Init()
srcFS, err := newSrcFS()
if err != nil {
logger.Fatalf("%s", err)
}
dstFS, err := newDstFS()
if err != nil {
logger.Fatalf("%s", err)
}
a := &actions.Restore{
Concurrency: *concurrency,
Src: srcFS,
Dst: dstFS,
}
if err := a.Run(); err != nil {
logger.Fatalf("cannot restore from backup: %s", err)
}
}
func usage() {
const s = `
vmrestore restores VictoriaMetrics data from backups made by vmbackup.
See the docs at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmrestore/README.md .
`
f := flag.CommandLine.Output()
fmt.Fprintf(f, "%s\n", s)
flag.PrintDefaults()
}
func newDstFS() (*fslocal.FS, error) {
if len(*storageDataPath) == 0 {
return nil, fmt.Errorf("`-storageDataPath` cannot be empty")
}
fs := &fslocal.FS{
Dir: *storageDataPath,
MaxBytesPerSecond: *maxBytesPerSecond,
}
if err := fs.Init(); err != nil {
return nil, fmt.Errorf("cannot initialize local fs: %s", err)
}
return fs, nil
}
func newSrcFS() (common.RemoteFS, error) {
fs, err := actions.NewRemoteFS(*src)
if err != nil {
return nil, fmt.Errorf("cannot parse `-src`=%q: %s", *src, err)
}
return fs, nil
}


@@ -2,6 +2,7 @@ package vmselect
import (
"flag"
"fmt"
"net/http"
"runtime"
"strings"
@@ -70,7 +71,11 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
case <-t.C:
timerpool.Put(t)
concurrencyLimitTimeout.Inc()
httpserver.Errorf(w, "cannot handle more than %d concurrent requests", cap(concurrencyCh))
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot handle more than %d concurrent requests", cap(concurrencyCh)),
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, "%s", err)
return true
}
}
@@ -162,6 +167,18 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
}
return true
case "/api/v1/rules":
// Return dumb placeholder
rulesRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"success","data":{"groups":[]}}`)
return true
case "/api/v1/alerts":
// Return dumb placeholder
alertsRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, "%s", `{"status":"success","data":{"alerts":[]}}`)
return true
case "/api/v1/admin/tsdb/delete_series":
deleteRequests.Inc()
authKey := r.FormValue("authKey")
@@ -185,7 +202,10 @@ func sendPrometheusError(w http.ResponseWriter, r *http.Request, err error) {
logger.Errorf("error in %q: %s", r.URL.Path, err)
w.Header().Set("Content-Type", "application/json")
statusCode := 422
statusCode := http.StatusUnprocessableEntity
if esc, ok := err.(*httpserver.ErrorWithStatusCode); ok {
statusCode = esc.StatusCode
}
w.WriteHeader(statusCode)
prometheus.WriteErrorResponse(w, statusCode, err)
}
@@ -220,4 +240,7 @@ var (
federateRequests = metrics.NewCounter(`vm_http_requests_total{path="/federate"}`)
federateErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/federate"}`)
rulesRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/rules"}`)
alertsRequests = metrics.NewCounter(`vm_http_requests_total{path="/api/v1/alerts"}`)
)
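
The hunks above carry an HTTP status code inside the error value so that the response writer can pick it up, falling back to 422 when the error has no code attached. A minimal sketch of the pattern follows. The `errorWithStatusCode` type and the `statusCodeFor` helper are local, hypothetical analogues of `httpserver.ErrorWithStatusCode` and the logic in `sendPrometheusError`; they are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"net/http"
)

// errorWithStatusCode is a hypothetical local analogue of
// httpserver.ErrorWithStatusCode: an error plus the HTTP status to return.
type errorWithStatusCode struct {
	Err        error
	StatusCode int
}

// Error implements the error interface by delegating to the wrapped error.
func (e *errorWithStatusCode) Error() string {
	return e.Err.Error()
}

// statusCodeFor returns the status code attached to err,
// or 422 Unprocessable Entity when no code is attached.
func statusCodeFor(err error) int {
	statusCode := http.StatusUnprocessableEntity
	if esc, ok := err.(*errorWithStatusCode); ok {
		statusCode = esc.StatusCode
	}
	return statusCode
}

func main() {
	limitErr := &errorWithStatusCode{
		Err:        fmt.Errorf("cannot handle more than %d concurrent requests", 8),
		StatusCode: http.StatusServiceUnavailable,
	}
	fmt.Println(statusCodeFor(limitErr), limitErr)
	fmt.Println(statusCodeFor(fmt.Errorf("plain error")))
}
```

Returning 503 for the concurrency limit lets clients distinguish an overloaded server from a malformed query.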


@@ -4,6 +4,6 @@ import (
"os"
)
func mustFadviseRandomRead(f *os.File) {
func mustFadviseSequentialRead(f *os.File) {
// Do nothing :)
}


@@ -0,0 +1,15 @@
package netstorage
import (
"os"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"golang.org/x/sys/unix"
)
func mustFadviseSequentialRead(f *os.File) {
fd := int(f.Fd())
if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_SEQUENTIAL|unix.FADV_WILLNEED); err != nil {
logger.Panicf("FATAL: error returned from unix.Fadvise(SEQUENTIAL|WILLNEED): %s", err)
}
}


@@ -7,9 +7,9 @@ import (
"golang.org/x/sys/unix"
)
func mustFadviseRandomRead(f *os.File) {
func mustFadviseSequentialRead(f *os.File) {
fd := int(f.Fd())
if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_RANDOM|unix.FADV_WILLNEED); err != nil {
logger.Panicf("FATAL: error returned from unix.Fadvise(RANDOM|WILLNEED): %s", err)
if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_SEQUENTIAL|unix.FADV_WILLNEED); err != nil {
logger.Panicf("FATAL: error returned from unix.Fadvise(SEQUENTIAL|WILLNEED): %s", err)
}
}


@@ -19,9 +19,9 @@ import (
)
var (
maxTagKeysPerSearch = flag.Int("search.maxTagKeys", 10e3, "The maximum number of tag keys returned per search")
maxTagValuesPerSearch = flag.Int("search.maxTagValues", 10e3, "The maximum number of tag values returned per search")
maxMetricsPerSearch = flag.Int("search.maxUniqueTimeseries", 100e3, "The maximum number of unique time series each search can scan")
maxTagKeysPerSearch = flag.Int("search.maxTagKeys", 100e3, "The maximum number of tag keys returned per search")
maxTagValuesPerSearch = flag.Int("search.maxTagValues", 100e3, "The maximum number of tag values returned per search")
maxMetricsPerSearch = flag.Int("search.maxUniqueTimeseries", 300e3, "The maximum number of unique time series each search can scan")
)
// Result is a single timeseries result.
@@ -92,6 +92,7 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
doneCh := make(chan error)
// Start workers.
rowsProcessedTotal := uint64(0)
for i := 0; i < workersCount; i++ {
go func(workerID uint) {
rs := getResult()
@@ -99,6 +100,7 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
maxWorkersCount := gomaxprocs / workersCount
var err error
rowsProcessed := 0
for pts := range workCh {
if time.Until(rss.deadline.Deadline) < 0 {
err = fmt.Errorf("timeout exceeded during query execution: %s", rss.deadline.Timeout)
@@ -111,8 +113,10 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
// Skip empty blocks.
continue
}
rowsProcessed += len(rs.Values)
f(rs, workerID)
}
atomic.AddUint64(&rowsProcessedTotal, uint64(rowsProcessed))
// Drain the remaining work
for range workCh {
}
@@ -124,6 +128,7 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
for i := range rss.packedTimeseries {
workCh <- &rss.packedTimeseries[i]
}
seriesProcessedTotal := len(rss.packedTimeseries)
rss.packedTimeseries = rss.packedTimeseries[:0]
close(workCh)
@@ -134,6 +139,8 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
errors = append(errors, err)
}
}
perQueryRowsProcessed.Update(float64(rowsProcessedTotal))
perQuerySeriesProcessed.Update(float64(seriesProcessedTotal))
if len(errors) > 0 {
// Return just the first error, since the other errors
// likely duplicate it.
@@ -142,6 +149,9 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
return nil
}
var perQueryRowsProcessed = metrics.NewHistogram(`vm_per_query_rows_processed_count`)
var perQuerySeriesProcessed = metrics.NewHistogram(`vm_per_query_series_processed_count`)
var gomaxprocs = runtime.GOMAXPROCS(-1)
type packedTimeseries struct {
@@ -452,16 +462,12 @@ func getStorageSearch() *storage.Search {
}
func putStorageSearch(sr *storage.Search) {
n := atomic.LoadUint64(&sr.MissingMetricNamesForMetricID)
missingMetricNamesForMetricID.Add(int(n))
sr.MustClose()
ssPool.Put(sr)
}
var ssPool sync.Pool
var missingMetricNamesForMetricID = metrics.NewCounter(`vm_missing_metric_names_for_metric_id_total`)
// ProcessSearchQuery performs sq on storage nodes until the given deadline.
func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadline) (*Results, error) {
// Setup search.
@@ -484,9 +490,12 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadli
tbf := getTmpBlocksFile()
m := make(map[string][]tmpBlockAddr)
blocksRead := 0
bb := tmpBufPool.Get()
defer tmpBufPool.Put(bb)
for sr.NextMetricBlock() {
blocksRead++
addr, err := tbf.WriteBlock(sr.MetricBlock.Block)
bb.B = storage.MarshalBlock(bb.B[:0], sr.MetricBlock.Block)
addr, err := tbf.WriteBlockData(bb.B)
if err != nil {
putTmpBlocksFile(tbf)
return nil, fmt.Errorf("cannot write data block #%d to temporary blocks file: %s", blocksRead, err)
@@ -520,6 +529,15 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadli
pts.metricName = metricName
pts.addrs = addrs
}
// Sort rss.packedTimeseries by the first addr offset in order
// to reduce the number of disk seeks during unpacking in RunParallel.
// In this case tmpBlocksFile must be read almost sequentially.
sort.Slice(rss.packedTimeseries, func(i, j int) bool {
pts := rss.packedTimeseries
return pts[i].addrs[0].offset < pts[j].addrs[0].offset
})
return &rss, nil
}
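
The new `sort.Slice` call orders the packed time series by the offset of their first block, so that later reads walk the temporary file roughly front to back. A standalone sketch of that ordering follows. The `blockAddr` and `series` types are hypothetical stand-ins for `tmpBlockAddr` and `packedTimeseries`, and the sketch assumes every series has at least one address.

```go
package main

import (
	"fmt"
	"sort"
)

// blockAddr is a hypothetical stand-in for tmpBlockAddr.
type blockAddr struct {
	offset uint64
	size   int
}

// series is a hypothetical stand-in for packedTimeseries.
type series struct {
	name  string
	addrs []blockAddr
}

// sortByFirstOffset orders series by the offset of their first block.
// Every series must contain at least one address.
func sortByFirstOffset(ss []series) {
	sort.Slice(ss, func(i, j int) bool {
		return ss[i].addrs[0].offset < ss[j].addrs[0].offset
	})
}

func main() {
	ss := []series{
		{name: "a", addrs: []blockAddr{{offset: 300, size: 10}}},
		{name: "b", addrs: []blockAddr{{offset: 0, size: 120}}},
		{name: "c", addrs: []blockAddr{{offset: 120, size: 180}}},
	}
	sortByFirstOffset(ss)
	for _, s := range ss {
		fmt.Println(s.name, s.addrs[0].offset)
	}
}
```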


@@ -1,7 +1,6 @@
package netstorage
import (
"bufio"
"fmt"
"io/ioutil"
"os"
@@ -10,6 +9,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
)
@@ -30,13 +30,23 @@ func InitTmpBlocksDir(tmpDirPath string) {
var tmpBlocksDir string
const maxInmemoryTmpBlocksFile = 512 * 1024
func maxInmemoryTmpBlocksFile() int {
mem := memory.Allowed()
maxLen := mem / 1024
if maxLen < 64*1024 {
return 64 * 1024
}
return maxLen
}
var _ = metrics.NewGauge(`vm_tmp_blocks_max_inmemory_file_size_bytes`, func() float64 {
return float64(maxInmemoryTmpBlocksFile())
})
type tmpBlocksFile struct {
buf []byte
f *os.File
bw *bufio.Writer
f *os.File
offset uint64
}
@@ -44,7 +54,9 @@ type tmpBlocksFile struct {
func getTmpBlocksFile() *tmpBlocksFile {
v := tmpBlocksFilePool.Get()
if v == nil {
return &tmpBlocksFile{}
return &tmpBlocksFile{
buf: make([]byte, 0, maxInmemoryTmpBlocksFile()),
}
}
return v.(*tmpBlocksFile)
}
@@ -53,7 +65,6 @@ func putTmpBlocksFile(tbf *tmpBlocksFile) {
tbf.MustClose()
tbf.buf = tbf.buf[:0]
tbf.f = nil
tbf.bw = nil
tbf.offset = 0
tmpBlocksFilePool.Put(tbf)
}
@@ -69,51 +80,34 @@ func (addr tmpBlockAddr) String() string {
return fmt.Sprintf("offset %d, size %d", addr.offset, addr.size)
}
func getBufioWriter(f *os.File) *bufio.Writer {
v := bufioWriterPool.Get()
if v == nil {
return bufio.NewWriterSize(f, maxInmemoryTmpBlocksFile*2)
}
bw := v.(*bufio.Writer)
bw.Reset(f)
return bw
}
func putBufioWriter(bw *bufio.Writer) {
bufioWriterPool.Put(bw)
}
var bufioWriterPool sync.Pool
var tmpBlocksFilesCreated = metrics.NewCounter(`vm_tmp_blocks_files_created_total`)
// WriteBlock writes b to tbf.
// WriteBlockData writes b to tbf.
//
// It returns errors since the operation may fail on space shortage
// and this must be handled.
func (tbf *tmpBlocksFile) WriteBlock(b *storage.Block) (tmpBlockAddr, error) {
func (tbf *tmpBlocksFile) WriteBlockData(b []byte) (tmpBlockAddr, error) {
var addr tmpBlockAddr
addr.offset = tbf.offset
tbfBufLen := len(tbf.buf)
tbf.buf = storage.MarshalBlock(tbf.buf, b)
addr.size = len(tbf.buf) - tbfBufLen
addr.size = len(b)
tbf.offset += uint64(addr.size)
if tbf.offset <= maxInmemoryTmpBlocksFile {
if len(tbf.buf)+len(b) <= cap(tbf.buf) {
// Fast path - the data fits tbf.buf
tbf.buf = append(tbf.buf, b...)
return addr, nil
}
// Slow path: flush the data from tbf.buf to file.
if tbf.f == nil {
f, err := ioutil.TempFile(tmpBlocksDir, "")
if err != nil {
return addr, err
}
tbf.f = f
tbf.bw = getBufioWriter(f)
tmpBlocksFilesCreated.Inc()
}
_, err := tbf.bw.Write(tbf.buf)
tbf.buf = tbf.buf[:0]
_, err := tbf.f.Write(tbf.buf)
tbf.buf = append(tbf.buf[:0], b...)
if err != nil {
return addr, fmt.Errorf("cannot write block to %q: %s", tbf.f.Name(), err)
}
@@ -124,15 +118,18 @@ func (tbf *tmpBlocksFile) Finalize() error {
if tbf.f == nil {
return nil
}
err := tbf.bw.Flush()
putBufioWriter(tbf.bw)
tbf.bw = nil
if _, err := tbf.f.Write(tbf.buf); err != nil {
return fmt.Errorf("cannot flush the remaining %d bytes to tmpBlocksFile: %s", len(tbf.buf), err)
}
tbf.buf = tbf.buf[:0]
if _, err := tbf.f.Seek(0, 0); err != nil {
logger.Panicf("FATAL: cannot seek to the start of file: %s", err)
}
mustFadviseRandomRead(tbf.f)
return err
// Hint the OS that the file is read almost sequentially.
// This should reduce the number of disk seeks, which is important
// for HDDs.
mustFadviseSequentialRead(tbf.f)
return nil
}
func (tbf *tmpBlocksFile) MustReadBlockAt(dst *storage.Block, addr tmpBlockAddr) {
@@ -167,10 +164,6 @@ func (tbf *tmpBlocksFile) MustClose() {
if tbf.f == nil {
return
}
if tbf.bw != nil {
putBufioWriter(tbf.bw)
tbf.bw = nil
}
fname := tbf.f.Name()
// Remove the file at first, then close it.
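
The rewritten `WriteBlockData` keeps data in a fixed-capacity buffer while it fits and flushes the buffer to the file only when the next block would overflow it; `Finalize` then writes whatever remains. The sketch below reproduces that buffer-then-spill flow against an in-memory sink so it can run without touching disk. The `spillBuffer` and `memSink` types are hypothetical and are not part of the repository.

```go
package main

import (
	"fmt"
	"io"
)

// memSink is a hypothetical in-memory stand-in for the temporary file.
type memSink struct {
	data []byte
}

func (s *memSink) Write(p []byte) (int, error) {
	s.data = append(s.data, p...)
	return len(p), nil
}

// spillBuffer buffers writes in memory and spills to w on overflow.
type spillBuffer struct {
	buf     []byte
	w       io.Writer
	offset  uint64
	spilled bool
}

func newSpillBuffer(capacity int, w io.Writer) *spillBuffer {
	return &spillBuffer{buf: make([]byte, 0, capacity), w: w}
}

// write stores b and returns the logical offset at which it starts.
func (sb *spillBuffer) write(b []byte) (uint64, error) {
	offset := sb.offset
	sb.offset += uint64(len(b))
	if len(sb.buf)+len(b) <= cap(sb.buf) {
		// Fast path: the data fits the in-memory buffer.
		sb.buf = append(sb.buf, b...)
		return offset, nil
	}
	// Slow path: flush the buffered data, then start over with b.
	_, err := sb.w.Write(sb.buf)
	sb.buf = append(sb.buf[:0], b...)
	sb.spilled = true
	return offset, err
}

// finalize flushes the remaining bytes if anything was spilled before.
func (sb *spillBuffer) finalize() error {
	if !sb.spilled {
		return nil
	}
	_, err := sb.w.Write(sb.buf)
	sb.buf = sb.buf[:0]
	return err
}

func main() {
	sink := &memSink{}
	sb := newSpillBuffer(8, sink)
	first, _ := sb.write([]byte("abcde"))
	second, _ := sb.write([]byte("fghij"))
	if err := sb.finalize(); err != nil {
		panic(err)
	}
	fmt.Println(first, second, string(sink.data))
}
```

Small result sets never leave memory in this scheme, which is the reason the diff sizes the buffer from the allowed memory rather than a fixed constant.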


@@ -30,7 +30,7 @@ func TestTmpBlocksFileSerial(t *testing.T) {
}
func TestTmpBlocksFileConcurrent(t *testing.T) {
concurrency := 4
concurrency := 3
ch := make(chan error, concurrency)
for i := 0; i < concurrency; i++ {
go func() {
@@ -69,7 +69,7 @@ func testTmpBlocksFile() error {
_, _, _ = b.MarshalData(0, 0)
return &b
}
for _, size := range []int{1024, 16 * 1024, maxInmemoryTmpBlocksFile / 2, 2 * maxInmemoryTmpBlocksFile} {
for _, size := range []int{1024, 16 * 1024, maxInmemoryTmpBlocksFile() / 2, 2 * maxInmemoryTmpBlocksFile()} {
err := func() error {
tbf := getTmpBlocksFile()
defer putTmpBlocksFile(tbf)
@@ -77,9 +77,12 @@ func testTmpBlocksFile() error {
// Write blocks until their summary size exceeds `size`.
var addrs []tmpBlockAddr
var blocks []*storage.Block
bb := tmpBufPool.Get()
defer tmpBufPool.Put(bb)
for tbf.offset < uint64(size) {
b := createBlock()
addr, err := tbf.WriteBlock(b)
bb.B = storage.MarshalBlock(bb.B[:0], b)
addr, err := tbf.WriteBlockData(bb.B)
if err != nil {
return fmt.Errorf("cannot write block at offset %d: %s", tbf.offset, err)
}
@@ -94,7 +97,7 @@ func testTmpBlocksFile() error {
}
// Read blocks in parallel and verify them
concurrency := 3
concurrency := 2
workCh := make(chan int)
doneCh := make(chan error)
for i := 0; i < concurrency; i++ {


@@ -13,7 +13,7 @@
{% for i, ts := range rs.Timestamps %}
{%z= bb.B %}{% space %}
{%f= rs.Values[i] %}{% space %}
{%d= int(ts) %}{% newline %}
{%dl= ts %}{% newline %}
{% endfor %}
{% code quicktemplate.ReleaseByteBuffer(bb) %}
{% endfunc %}
@@ -35,10 +35,10 @@
"timestamps":[
{% if len(rs.Timestamps) > 0 %}
{% code timestamps := rs.Timestamps %}
{%d= int(timestamps[0]) %}
{%dl= timestamps[0] %}
{% code timestamps = timestamps[1:] %}
{% for _, ts := range timestamps %}
,{%d= int(ts) %}
,{%dl= ts %}
{% endfor %}
{% endif %}
]

View File

@@ -49,7 +49,7 @@ func StreamExportPrometheusLine(qw422016 *qt422016.Writer, rs *netstorage.Result
//line app/vmselect/prometheus/export.qtpl:15
qw422016.N().S(` `)
//line app/vmselect/prometheus/export.qtpl:16
qw422016.N().D(int(ts))
qw422016.N().DL(ts)
//line app/vmselect/prometheus/export.qtpl:16
qw422016.N().S(`
`)
@@ -129,7 +129,7 @@ func StreamExportJSONLine(qw422016 *qt422016.Writer, rs *netstorage.Result) {
timestamps := rs.Timestamps
//line app/vmselect/prometheus/export.qtpl:38
qw422016.N().D(int(timestamps[0]))
qw422016.N().DL(timestamps[0])
//line app/vmselect/prometheus/export.qtpl:39
timestamps = timestamps[1:]
@@ -138,7 +138,7 @@ func StreamExportJSONLine(qw422016 *qt422016.Writer, rs *netstorage.Result) {
//line app/vmselect/prometheus/export.qtpl:40
qw422016.N().S(`,`)
//line app/vmselect/prometheus/export.qtpl:41
qw422016.N().D(int(ts))
qw422016.N().DL(ts)
//line app/vmselect/prometheus/export.qtpl:42
}
//line app/vmselect/prometheus/export.qtpl:43
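The hunks above replace `D(int(ts))` with `DL(ts)` when writing timestamps. A plausible motivation (an assumption; the diff does not state it) is that `int` is only 32 bits wide on 32-bit platforms, where a millisecond Unix timestamp no longer fits. Below is a minimal standalone sketch of that truncation using only the standard library; `formatTimestamp` and `truncatesOn32Bit` are hypothetical helpers and not part of the diff.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatTimestamp is a hypothetical helper (not from the diff) that
// formats a millisecond timestamp without narrowing it first.
func formatTimestamp(ts int64) string {
	return strconv.FormatInt(ts, 10)
}

// truncatesOn32Bit reports whether ts changes when squeezed through a
// 32-bit integer, which is what int(ts) would do on a 32-bit architecture.
func truncatesOn32Bit(ts int64) bool {
	return int64(int32(ts)) != ts
}

func main() {
	ts := int64(1575405436000) // a millisecond timestamp from December 2019
	fmt.Println(formatTimestamp(ts))
	fmt.Println(truncatesOn32Bit(ts))
}
```

Passing the `int64` value straight through, as `DL` does, avoids the conversion entirely.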

View File

@@ -10,7 +10,7 @@
{% if len(rs.Timestamps) == 0 || len(rs.Values) == 0 %}{% return %}{% endif %}
{%= prometheusMetricName(&rs.MetricName) %}{% space %}
{%f= rs.Values[len(rs.Values)-1] %}{% space %}
{%d= int(rs.Timestamps[len(rs.Timestamps)-1]) %}{% newline %}
{%dl= rs.Timestamps[len(rs.Timestamps)-1] %}{% newline %}
{% endfunc %}
{% endstripspace %}

View File

@@ -41,7 +41,7 @@ func StreamFederate(qw422016 *qt422016.Writer, rs *netstorage.Result) {
//line app/vmselect/prometheus/federate.qtpl:12
qw422016.N().S(` `)
//line app/vmselect/prometheus/federate.qtpl:13
qw422016.N().D(int(rs.Timestamps[len(rs.Timestamps)-1]))
qw422016.N().DL(rs.Timestamps[len(rs.Timestamps)-1])
//line app/vmselect/prometheus/federate.qtpl:13
qw422016.N().S(`
`)

View File

@@ -21,17 +21,17 @@ import (
)
var (
latencyOffset = flag.Duration("search.latencyOffset", time.Second*30, "The time when data points become visible in query results after the collection. "+
"Too small value can result in incomplete last points for query results")
maxQueryDuration = flag.Duration("search.maxQueryDuration", time.Second*30, "The maximum time for search query execution")
maxQueryLen = flag.Int("search.maxQueryLen", 16*1024, "The maximum search query length in bytes")
maxLookback = flag.Duration("search.maxLookback", 0, "Synonym to `-search.lookback-delta` from Prometheus. "+
"The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via `max_lookback` arg")
)
// Default step used if not set.
const defaultStep = 5 * 60 * 1000
// Latency for data processing pipeline, i.e. the time between the moment data is ingested
// into the system and the moment it becomes visible to search.
const latencyOffset = 60 * 1000
// FederateHandler implements /federate . See https://prometheus.io/docs/prometheus/latest/federation/
func FederateHandler(w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
@@ -43,11 +43,14 @@ func FederateHandler(w http.ResponseWriter, r *http.Request) error {
if len(matches) == 0 {
return fmt.Errorf("missing `match[]` arg")
}
maxLookback, err := getDuration(r, "max_lookback", defaultStep)
lookbackDelta, err := getMaxLookback(r)
if err != nil {
return err
}
start, err := getTime(r, "start", ct-maxLookback)
if lookbackDelta <= 0 {
lookbackDelta = defaultStep
}
start, err := getTime(r, "start", ct-lookbackDelta)
if err != nil {
return err
}
@@ -128,7 +131,7 @@ func ExportHandler(w http.ResponseWriter, r *http.Request) error {
format := r.FormValue("format")
deadline := getDeadline(r)
if start >= end {
start = end - defaultStep
end = start + defaultStep
}
if err := exportHandler(w, matches, start, end, format, deadline); err != nil {
return err
@@ -142,7 +145,7 @@ var exportDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/
func exportHandler(w http.ResponseWriter, matches []string, start, end int64, format string, deadline netstorage.Deadline) error {
writeResponseFunc := WriteExportStdResponse
writeLineFunc := WriteExportJSONLine
contentType := "application/json"
contentType := "application/stream+json"
if format == "prometheus" {
contentType = "text/plain"
writeLineFunc = WriteExportPrometheusLine
@@ -283,7 +286,7 @@ func labelValuesWithMatches(labelName string, matches []string, start, end int64
return nil, err
}
if start >= end {
start = end - defaultStep
end = start + defaultStep
}
sq := &storage.SearchQuery{
MinTimestamp: start,
@@ -406,7 +409,7 @@ func SeriesHandler(w http.ResponseWriter, r *http.Request) error {
return err
}
if start >= end {
start = end - defaultStep
end = start + defaultStep
}
sq := &storage.SearchQuery{
MinTimestamp: start,
@@ -463,17 +466,24 @@ func QueryHandler(w http.ResponseWriter, r *http.Request) error {
if err != nil {
return err
}
step, err := getDuration(r, "step", latencyOffset)
queryOffset := getLatencyOffsetMilliseconds()
step, err := getDuration(r, "step", queryOffset)
if err != nil {
return err
}
deadline := getDeadline(r)
lookbackDelta, err := getMaxLookback(r)
if err != nil {
return err
}
if len(query) > *maxQueryLen {
return fmt.Errorf(`too long query; got %d bytes; mustn't exceed %d bytes`, len(query), *maxQueryLen)
}
if ct-start < latencyOffset {
start -= latencyOffset
if !getBool(r, "nocache") && ct-start < queryOffset {
// Adjust start time only if `nocache` arg isn't set.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/241
start = ct - queryOffset
}
if childQuery, windowStr, offsetStr := promql.IsMetricSelectorWithRollup(query); childQuery != "" {
var window int64
@@ -503,10 +513,11 @@ func QueryHandler(w http.ResponseWriter, r *http.Request) error {
}
ec := promql.EvalConfig{
Start: start,
End: start,
Step: step,
Deadline: deadline,
Start: start,
End: start,
Step: step,
Deadline: deadline,
LookbackDelta: lookbackDelta,
}
result, err := promql.Exec(&ec, query, true)
if err != nil {
@@ -546,31 +557,39 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
}
deadline := getDeadline(r)
mayCache := !getBool(r, "nocache")
lookbackDelta, err := getMaxLookback(r)
if err != nil {
return err
}
// Validate input args.
if len(query) > *maxQueryLen {
return fmt.Errorf(`too long query; got %d bytes; mustn't exceed %d bytes`, len(query), *maxQueryLen)
}
if start > end {
start = end
end = start + defaultStep
}
if err := promql.ValidateMaxPointsPerTimeseries(start, end, step); err != nil {
return err
}
start, end = promql.AdjustStartEnd(start, end, step)
if mayCache {
start, end = promql.AdjustStartEnd(start, end, step)
}
ec := promql.EvalConfig{
Start: start,
End: end,
Step: step,
Deadline: deadline,
MayCache: mayCache,
Start: start,
End: end,
Step: step,
Deadline: deadline,
MayCache: mayCache,
LookbackDelta: lookbackDelta,
}
result, err := promql.Exec(&ec, query, false)
if err != nil {
return fmt.Errorf("cannot execute %q: %s", query, err)
}
if ct-end < latencyOffset {
queryOffset := getLatencyOffsetMilliseconds()
if ct-end < queryOffset {
result = adjustLastPoints(result)
}
@@ -724,6 +743,11 @@ func getDuration(r *http.Request, argKey string, defaultValue int64) (int64, err
const maxDurationMsecs = 100 * 365 * 24 * 3600 * 1000
func getMaxLookback(r *http.Request) (int64, error) {
d := int64(*maxLookback / time.Millisecond)
return getDuration(r, "max_lookback", d)
}
func getDeadline(r *http.Request) netstorage.Deadline {
d, err := getDuration(r, "timeout", 0)
if err != nil {
@@ -762,3 +786,11 @@ func getTagFilterssFromMatches(matches []string) ([][]storage.TagFilter, error)
}
return tagFilterss, nil
}
func getLatencyOffsetMilliseconds() int64 {
d := int64(*latencyOffset / time.Millisecond)
if d <= 1000 {
d = 1000
}
return d
}
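The helper above converts the `-search.latencyOffset` duration to milliseconds and never returns less than one second. A minimal standalone sketch of the same clamp, taking the duration as a parameter instead of reading the flag (`latencyOffsetMillis` is a hypothetical name used only for illustration):

```go
package main

import (
	"fmt"
	"time"
)

// latencyOffsetMillis converts d to milliseconds and clamps the result
// to at least 1000 ms, mirroring the floor applied in the diff.
func latencyOffsetMillis(d time.Duration) int64 {
	ms := int64(d / time.Millisecond)
	if ms < 1000 {
		ms = 1000
	}
	return ms
}

func main() {
	fmt.Println(latencyOffsetMillis(30 * time.Second))
	fmt.Println(latencyOffsetMillis(200 * time.Millisecond))
}
```

The floor keeps a zero or very small flag value from disabling the offset altogether.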

View File

@@ -3,7 +3,7 @@ SeriesCountResponse generates response for /api/v1/series/count .
{% func SeriesCountResponse(n uint64) %}
{
"status":"success",
"data":[{%d int(n) %}]
"data":[{%dl int64(n) %}]
}
{% endfunc %}
{% endstripspace %}

View File

@@ -24,7 +24,7 @@ func StreamSeriesCountResponse(qw422016 *qt422016.Writer, n uint64) {
//line app/vmselect/prometheus/series_count_response.qtpl:3
qw422016.N().S(`{"status":"success","data":[`)
//line app/vmselect/prometheus/series_count_response.qtpl:6
qw422016.N().D(int(n))
qw422016.N().DL(int64(n))
//line app/vmselect/prometheus/series_count_response.qtpl:6
qw422016.N().S(`]}`)
//line app/vmselect/prometheus/series_count_response.qtpl:8

View File

@@ -9,6 +9,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
)
var aggrFuncs = map[string]aggrFunc{
@@ -26,11 +27,12 @@ var aggrFuncs = map[string]aggrFunc{
"quantile": aggrFuncQuantile,
// Extended PromQL funcs
"median": aggrFuncMedian,
"limitk": aggrFuncLimitK,
"distinct": newAggrFunc(aggrFuncDistinct),
"sum2": newAggrFunc(aggrFuncSum2),
"geomean": newAggrFunc(aggrFuncGeomean),
"median": aggrFuncMedian,
"limitk": aggrFuncLimitK,
"distinct": newAggrFunc(aggrFuncDistinct),
"sum2": newAggrFunc(aggrFuncSum2),
"geomean": newAggrFunc(aggrFuncGeomean),
"histogram": newAggrFunc(aggrFuncHistogram),
}
type aggrFunc func(afa *aggrFuncArg) ([]*timeseries, error)
@@ -184,6 +186,38 @@ func aggrFuncGeomean(tss []*timeseries) []*timeseries {
return tss[:1]
}
func aggrFuncHistogram(tss []*timeseries) []*timeseries {
var h metrics.Histogram
m := make(map[string]*timeseries)
for i := range tss[0].Values {
h.Reset()
for _, ts := range tss {
v := ts.Values[i]
h.Update(v)
}
h.VisitNonZeroBuckets(func(vmrange string, count uint64) {
ts := m[vmrange]
if ts == nil {
ts = &timeseries{}
ts.CopyFromShallowTimestamps(tss[0])
ts.MetricName.RemoveTag("vmrange")
ts.MetricName.AddTag("vmrange", vmrange)
values := ts.Values
for k := range values {
values[k] = 0
}
m[vmrange] = ts
}
ts.Values[i] = float64(count)
})
}
rvs := make([]*timeseries, 0, len(m))
for _, ts := range m {
rvs = append(rvs, ts)
}
return vmrangeBucketsToLE(rvs)
}
func aggrFuncMin(tss []*timeseries) []*timeseries {
if len(tss) == 1 {
// Fast path - nothing to min.
@@ -353,6 +387,25 @@ func aggrFuncCountValues(afa *aggrFuncArg) ([]*timeseries, error) {
if err != nil {
return nil, err
}
// Remove dstLabel from grouping like Prometheus does.
modifier := &afa.ae.Modifier
switch strings.ToLower(modifier.Op) {
case "without":
modifier.Args = append(modifier.Args, dstLabel)
case "by":
dstArgs := modifier.Args[:0]
for _, arg := range modifier.Args {
if arg == dstLabel {
continue
}
dstArgs = append(dstArgs, arg)
}
modifier.Args = dstArgs
default:
// Do nothing
}
afe := func(tss []*timeseries) []*timeseries {
m := make(map[float64]bool)
for _, ts := range tss {

View File

@@ -179,7 +179,8 @@ func compareValues(vs1, vs2 []float64) error {
}
continue
}
if v1 != v2 {
eps := math.Abs(v1 - v2)
if eps > 1e-14 {
return fmt.Errorf("unexpected value; got %v; want %v", v1, v2)
}
}
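The hunk above relaxes the exact `v1 != v2` check to an absolute tolerance of `1e-14`. A minimal sketch of why exact comparison is fragile for floating-point results; `approxEqual` is a hypothetical helper, not the test code from the diff:

```go
package main

import (
	"fmt"
	"math"
)

// approxEqual reports whether a and b differ by at most eps.
func approxEqual(a, b, eps float64) bool {
	return math.Abs(a-b) <= eps
}

func main() {
	// Variables are used so that the sum is computed in float64 at run time
	// rather than as an exact compile-time constant.
	x, y := 0.1, 0.2
	sum := x + y
	fmt.Println(sum == 0.3)
	fmt.Println(approxEqual(sum, 0.3, 1e-14))
}
```

An absolute tolerance suits values of moderate magnitude; for very large or very small values a relative tolerance may be a better fit.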

View File

@@ -0,0 +1,5 @@
package promql
import "unsafe"
const maxByteSliceLen = 1<<(31+9*(unsafe.Sizeof(int(0))/8)) - 1
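The single constant above appears to replace the per-architecture files deleted in the following hunks. The expression relies on integer division: `unsafe.Sizeof(int(0))/8` is 1 when `int` is 8 bytes and 0 when it is 4 bytes, so the shift is either 40 or 31. A minimal sketch that evaluates the same formula for an explicit size (`maxLenFor` is a hypothetical helper):

```go
package main

import (
	"fmt"
	"unsafe"
)

// maxLenFor evaluates 1<<(31+9*(intSize/8)) - 1 for a given size of int in bytes.
func maxLenFor(intSize uintptr) uint64 {
	return uint64(1)<<(31+9*(intSize/8)) - 1
}

func main() {
	fmt.Println(maxLenFor(4))                     // size of int on 32-bit platforms
	fmt.Println(maxLenFor(8))                     // size of int on 64-bit platforms
	fmt.Println(maxLenFor(unsafe.Sizeof(int(0)))) // the platform running this program
}
```

Note that the 64-bit result is `1<<40 - 1`, one less than the `1 << 40` value in the removed files.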

View File

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1 << 40

View File

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1<<31 - 1

View File

@@ -1,3 +0,0 @@
package promql
const maxByteSliceLen = 1 << 40

View File

@@ -292,24 +292,14 @@ func adjustBinaryOpTags(be *binaryOpExpr, left, right []*timeseries) ([]*timeser
}
// Slow path: `vector op vector` or `a op {on|ignoring} {group_left|group_right} b`
ensureOneX := func(side string, tss []*timeseries) error {
if len(tss) == 0 {
logger.Panicf("BUG: tss must contain at least one value")
}
if len(tss) == 1 {
return nil
}
if mergeNonOverlappingTimeseries(tss) {
return nil
}
return fmt.Errorf(`duplicate timeseries on the %s side of %s %s: %s and %s`, side, be.Op, be.GroupModifier.AppendString(nil),
stringMetricTags(&tss[0].MetricName), stringMetricTags(&tss[1].MetricName))
}
var rvsLeft, rvsRight []*timeseries
mLeft, mRight := createTimeseriesMapByTagSet(be, left, right)
joinOp := strings.ToLower(be.JoinModifier.Op)
joinTags := be.JoinModifier.Args
groupOp := strings.ToLower(be.GroupModifier.Op)
if len(groupOp) == 0 {
groupOp = "ignoring"
}
groupTags := be.GroupModifier.Args
for k, tssLeft := range mLeft {
tssRight := mRight[k]
if len(tssRight) == 0 {
@@ -317,37 +307,38 @@ func adjustBinaryOpTags(be *binaryOpExpr, left, right []*timeseries) ([]*timeser
}
switch joinOp {
case "group_left":
if err := ensureOneX("right", tssRight); err != nil {
var err error
rvsLeft, rvsRight, err = groupJoin("right", be, rvsLeft, rvsRight, tssLeft, tssRight)
if err != nil {
return nil, nil, nil, err
}
src := tssRight[0]
for _, ts := range tssLeft {
ts.MetricName.AddMissingTags(joinTags, &src.MetricName)
rvsLeft = append(rvsLeft, ts)
rvsRight = append(rvsRight, src)
}
case "group_right":
if err := ensureOneX("left", tssLeft); err != nil {
var err error
rvsRight, rvsLeft, err = groupJoin("left", be, rvsRight, rvsLeft, tssRight, tssLeft)
if err != nil {
return nil, nil, nil, err
}
src := tssLeft[0]
for _, ts := range tssRight {
ts.MetricName.AddMissingTags(joinTags, &src.MetricName)
rvsLeft = append(rvsLeft, src)
rvsRight = append(rvsRight, ts)
}
case "":
if err := ensureOneX("left", tssLeft); err != nil {
if err := ensureSingleTimeseries("left", be, tssLeft); err != nil {
return nil, nil, nil, err
}
if err := ensureOneX("right", tssRight); err != nil {
if err := ensureSingleTimeseries("right", be, tssRight); err != nil {
return nil, nil, nil, err
}
resetMetricGroupIfRequired(be, tssLeft[0])
rvsLeft = append(rvsLeft, tssLeft[0])
tsLeft := tssLeft[0]
resetMetricGroupIfRequired(be, tsLeft)
switch groupOp {
case "on":
tsLeft.MetricName.RemoveTagsOn(groupTags)
case "ignoring":
tsLeft.MetricName.RemoveTagsIgnoring(groupTags)
default:
logger.Panicf("BUG: unexpected binary op modifier %q", groupOp)
}
rvsLeft = append(rvsLeft, tsLeft)
rvsRight = append(rvsRight, tssRight[0])
default:
return nil, nil, nil, fmt.Errorf(`unexpected join modifier %q`, joinOp)
logger.Panicf("BUG: unexpected join modifier %q", joinOp)
}
}
dst := rvsLeft
@@ -357,6 +348,90 @@ func adjustBinaryOpTags(be *binaryOpExpr, left, right []*timeseries) ([]*timeser
return rvsLeft, rvsRight, dst, nil
}
func ensureSingleTimeseries(side string, be *binaryOpExpr, tss []*timeseries) error {
if len(tss) == 0 {
logger.Panicf("BUG: tss must contain at least one value")
}
for len(tss) > 1 {
if !mergeNonOverlappingTimeseries(tss[0], tss[len(tss)-1]) {
return fmt.Errorf(`duplicate time series on the %s side of %s %s: %s and %s`, side, be.Op, be.GroupModifier.AppendString(nil),
stringMetricTags(&tss[0].MetricName), stringMetricTags(&tss[len(tss)-1].MetricName))
}
tss = tss[:len(tss)-1]
}
return nil
}
func groupJoin(singleTimeseriesSide string, be *binaryOpExpr, rvsLeft, rvsRight, tssLeft, tssRight []*timeseries) ([]*timeseries, []*timeseries, error) {
joinTags := be.JoinModifier.Args
var m map[string]*timeseries
for _, tsLeft := range tssLeft {
resetMetricGroupIfRequired(be, tsLeft)
if len(tssRight) == 1 {
// Easy case - right part contains only a single matching time series.
tsLeft.MetricName.AddMissingTags(joinTags, &tssRight[0].MetricName)
rvsLeft = append(rvsLeft, tsLeft)
rvsRight = append(rvsRight, tssRight[0])
continue
}
// Hard case - right part contains multiple matching time series.
// Verify it doesn't result in duplicate MetricName values after adding missing tags.
if m == nil {
m = make(map[string]*timeseries, len(tssRight))
} else {
for k := range m {
delete(m, k)
}
}
bb := bbPool.Get()
for _, tsRight := range tssRight {
var tsCopy timeseries
tsCopy.CopyFromShallowTimestamps(tsLeft)
tsCopy.MetricName.AddMissingTags(joinTags, &tsRight.MetricName)
bb.B = marshalMetricTagsSorted(bb.B[:0], &tsCopy.MetricName)
if tsExisting := m[string(bb.B)]; tsExisting != nil {
// Try merging tsExisting with tsRight if they don't overlap.
if mergeNonOverlappingTimeseries(tsExisting, tsRight) {
continue
}
return nil, nil, fmt.Errorf("duplicate time series on the %s side of `%s %s %s`: %s and %s",
singleTimeseriesSide, be.Op, be.GroupModifier.AppendString(nil), be.JoinModifier.AppendString(nil),
stringMetricTags(&tsExisting.MetricName), stringMetricTags(&tsRight.MetricName))
}
m[string(bb.B)] = tsRight
rvsLeft = append(rvsLeft, &tsCopy)
rvsRight = append(rvsRight, tsRight)
}
bbPool.Put(bb)
}
return rvsLeft, rvsRight, nil
}
func mergeNonOverlappingTimeseries(dst, src *timeseries) bool {
// Verify whether the time series can be merged.
srcValues := src.Values
dstValues := dst.Values
_ = dstValues[len(srcValues)-1]
for i, v := range srcValues {
if math.IsNaN(v) {
continue
}
if !math.IsNaN(dstValues[i]) {
return false
}
}
// Time series can be merged. Merge them.
for i, v := range srcValues {
if math.IsNaN(v) {
continue
}
dstValues[i] = v
}
return true
}
func resetMetricGroupIfRequired(be *binaryOpExpr, ts *timeseries) {
if isBinaryOpCmp(be.Op) && !be.Bool {
// Do not reset MetricGroup for non-boolean `compare` binary ops like Prometheus does.
@@ -533,26 +608,3 @@ func isScalar(arg []*timeseries) bool {
}
return len(mn.Tags) == 0
}
func mergeNonOverlappingTimeseries(tss []*timeseries) bool {
if len(tss) < 2 {
logger.Panicf("BUG: expecting at least two timeseries. Got %d", len(tss))
}
// Check whether time series in tss overlap.
var dst timeseries
dst.CopyFromShallowTimestamps(tss[0])
dstValues := dst.Values
for _, ts := range tss[1:] {
for i, value := range ts.Values {
if math.IsNaN(dstValues[i]) {
dstValues[i] = value
} else if !math.IsNaN(value) {
// Time series overlap.
return false
}
}
}
tss[0].CopyFromShallowTimestamps(&dst)
return true
}
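The new pairwise `mergeNonOverlappingTimeseries(dst, src)` added earlier in this file works in two passes: it first verifies that no position holds a non-NaN value in both series, and only then copies values, so `dst` is left untouched when the merge is rejected. A minimal sketch of that idea on plain `[]float64` slices; `mergeNonOverlapping` is a hypothetical simplification and assumes `len(dst) >= len(src)`:

```go
package main

import (
	"fmt"
	"math"
)

// mergeNonOverlapping copies non-NaN values from src into dst, but only if
// no index holds a non-NaN value in both slices. It reports whether the
// merge happened. It assumes len(dst) >= len(src).
func mergeNonOverlapping(dst, src []float64) bool {
	// First pass: detect overlaps without modifying dst.
	for i, v := range src {
		if !math.IsNaN(v) && !math.IsNaN(dst[i]) {
			return false
		}
	}
	// Second pass: the series do not overlap, so fill the gaps in dst.
	for i, v := range src {
		if !math.IsNaN(v) {
			dst[i] = v
		}
	}
	return true
}

func main() {
	nan := math.NaN()
	dst := []float64{1, nan, nan}
	fmt.Println(mergeNonOverlapping(dst, []float64{nan, 2, nan}), dst)
	fmt.Println(mergeNonOverlapping(dst, []float64{5, nan, nan}), dst)
}
```

Checking before copying is what makes a failed merge free of side effects.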

View File

@@ -70,6 +70,9 @@ type EvalConfig struct {
MayCache bool
// LookbackDelta is analogous to `-query.lookback-delta` from Prometheus.
LookbackDelta int64
timestamps []int64
timestampsOnce sync.Once
}
@@ -82,6 +85,7 @@ func newEvalConfig(src *EvalConfig) *EvalConfig {
ec.Step = src.Step
ec.Deadline = src.Deadline
ec.MayCache = src.MayCache
ec.LookbackDelta = src.LookbackDelta
// do not copy src.timestamps - they must be generated again.
return &ec
@@ -290,10 +294,10 @@ func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFu
return fe, nrf
}
if re, ok := e.(*rollupExpr); ok {
if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() {
if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
return nil, nil
}
// e = rollupExpr(metricExpr)
// e = metricExpr[d]
fe := &funcExpr{
Name: "default_rollup",
Args: []expr{re},
@@ -315,15 +319,17 @@ func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFu
if me.IsEmpty() {
return nil, nil
}
// e = rollupFunc(metricExpr)
return &funcExpr{
Name: fe.Name,
Args: []expr{me},
}, nrf
}
if re, ok := arg.(*rollupExpr); ok {
if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() {
if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
return nil, nil
}
// e = rollupFunc(metricExpr[d])
return fe, nrf
}
return nil, nil
@@ -368,8 +374,8 @@ func getRollupExprArg(arg expr) *rollupExpr {
Expr: arg,
}
}
if len(re.Step) == 0 && !re.InheritStep {
// Return standard rollup if it doesn't set step.
if !re.ForSubquery() {
// Return standard rollup if it doesn't contain subquery.
return re
}
me, ok := re.Expr.(*metricExpr)
@@ -463,7 +469,7 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *
}
sharedTimestamps := getTimestamps(ec.Start, ec.End, ec.Step)
preFunc, rcs := getRollupConfigs(name, rf, ec.Start, ec.End, ec.Step, window, sharedTimestamps)
preFunc, rcs := getRollupConfigs(name, rf, ec.Start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
tss := make([]*timeseries, 0, len(tssSQ)*len(rcs))
var tssLock sync.Mutex
removeMetricGroup := !rollupFuncsKeepMetricGroup[name]
@@ -584,12 +590,23 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
return tss, nil
}
sharedTimestamps := getTimestamps(start, ec.End, ec.Step)
preFunc, rcs := getRollupConfigs(name, rf, start, ec.End, ec.Step, window, sharedTimestamps)
preFunc, rcs := getRollupConfigs(name, rf, start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
// Verify timeseries fit available memory after the rollup.
// Take into account points from tssCached.
pointsPerTimeseries := 1 + (ec.End-ec.Start)/ec.Step
rollupPoints := mulNoOverflow(pointsPerTimeseries, int64(rssLen*len(rcs)))
timeseriesLen := rssLen
if iafc != nil {
// Incremental aggregates require holding only GOMAXPROCS timeseries in memory.
timeseriesLen = runtime.GOMAXPROCS(-1)
if iafc.ae.Modifier.Op != "" {
// Increase the number of timeseries for a non-empty group list: `aggr() by (something)`,
// since each group can have its own set of time series in memory.
// Assume the number of such groups is lower than 100 :)
timeseriesLen *= 100
}
}
rollupPoints := mulNoOverflow(pointsPerTimeseries, int64(timeseriesLen*len(rcs)))
rollupMemorySize := mulNoOverflow(rollupPoints, 16)
rml := getRollupMemoryLimiter()
if !rml.Get(uint64(rollupMemorySize)) {
@@ -687,7 +704,8 @@ func doRollupForTimeseries(rc *rollupConfig, tsDst *timeseries, mnSrc *storage.M
tsDst.denyReuse = true
}
func getRollupConfigs(name string, rf rollupFunc, start, end, step, window int64, sharedTimestamps []int64) (func(values []float64, timestamps []int64), []*rollupConfig) {
func getRollupConfigs(name string, rf rollupFunc, start, end, step, window int64, lookbackDelta int64, sharedTimestamps []int64) (
func(values []float64, timestamps []int64), []*rollupConfig) {
preFunc := func(values []float64, timestamps []int64) {}
if rollupFuncsRemoveCounterResets[name] {
preFunc = func(values []float64, timestamps []int64) {
@@ -703,6 +721,7 @@ func getRollupConfigs(name string, rf rollupFunc, start, end, step, window int64
Step: step,
Window: window,
MayAdjustWindow: rollupFuncsMayAdjustWindow[name],
LookbackDelta: lookbackDelta,
Timestamps: sharedTimestamps,
}
}

View File

@@ -110,7 +110,7 @@ func timeseriesToResult(tss []*timeseries, maySort bool) ([]netstorage.Result, e
for i, ts := range tss {
bb.B = marshalMetricNameSorted(bb.B[:0], &ts.MetricName)
if _, ok := m[string(bb.B)]; ok {
return nil, fmt.Errorf(`duplicate output timeseries: %s%s`, ts.MetricName.MetricGroup, stringMetricName(&ts.MetricName))
return nil, fmt.Errorf(`duplicate output timeseries: %s`, stringMetricName(&ts.MetricName))
}
m[string(bb.B)] = struct{}{}
@@ -194,11 +194,14 @@ type parseCacheValue struct {
}
type parseCache struct {
m map[string]*parseCacheValue
mu sync.RWMutex
// Move atomic counters to the top of struct for 8-byte alignment on 32-bit arch.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
requests uint64
misses uint64
m map[string]*parseCacheValue
mu sync.RWMutex
}
func (pc *parseCache) Requests() uint64 {
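The `parseCache` change above moves the `uint64` counters to the start of the struct. The `sync/atomic` documentation notes that on 32-bit ARM, x86 and MIPS the caller must ensure 64-bit alignment of words accessed atomically, and that the first word of an allocated struct can be relied upon to be aligned. A minimal sketch of the layout with a hypothetical `cache` type standing in for `parseCache`:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"unsafe"
)

// cache is a hypothetical stand-in for parseCache.
type cache struct {
	// 64-bit counters accessed via sync/atomic come first, so they are
	// 8-byte aligned even on 32-bit architectures.
	requests uint64
	misses   uint64

	m  map[string]string
	mu sync.RWMutex
}

// get looks up key and updates the counters atomically.
func (c *cache) get(key string) (string, bool) {
	atomic.AddUint64(&c.requests, 1)
	c.mu.RLock()
	v, ok := c.m[key]
	c.mu.RUnlock()
	if !ok {
		atomic.AddUint64(&c.misses, 1)
	}
	return v, ok
}

// requestsOffset returns the byte offset of the first counter in the struct.
func requestsOffset() uintptr {
	var c cache
	return unsafe.Offsetof(c.requests)
}

func main() {
	c := &cache{m: map[string]string{"a": "b"}}
	c.get("a")
	c.get("zzz")
	fmt.Println(atomic.LoadUint64(&c.requests), atomic.LoadUint64(&c.misses), requestsOffset())
}
```

Since both counters are 8 bytes wide, placing them back to back at offset 0 keeps the second one aligned as well.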

View File

@@ -369,6 +369,17 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run("timestamp(time()>=1600)", func(t *testing.T) {
t.Parallel()
q := `timestamp(time()>=1600)`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{nan, nan, nan, 1600, 1800, 2000},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run("time()/100", func(t *testing.T) {
t.Parallel()
q := `time()/100`
@@ -1826,10 +1837,6 @@ func TestExecSuccess(t *testing.T) {
Timestamps: timestampsExpected,
}
r.MetricName.Tags = []storage.Tag{
{
Key: []byte("aa"),
Value: []byte("bb"),
},
{
Key: []byte("foo"),
Value: []byte("bar"),
@@ -1851,17 +1858,60 @@ func TestExecSuccess(t *testing.T) {
Key: []byte("foo"),
Value: []byte("bar"),
},
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`vector * on(foo) group_left(additional_tag) duplicate_timeseries_differ_by_additional_tag`, func(t *testing.T) {
t.Parallel()
q := `sort(label_set(time()/10, "foo", "bar", "xx", "yy", "__name__", "qwert") + on(foo) group_left(op) (
label_set(time() < 1400, "foo", "bar", "op", "le"),
label_set(time() >= 1400, "foo", "bar", "op", "ge"),
))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1100, 1320, nan, nan, nan, nan},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("op"),
Value: []byte("le"),
},
{
Key: []byte("xx"),
Value: []byte("yy"),
},
}
resultExpected := []netstorage.Result{r}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{nan, nan, 1540, 1760, 1980, 2200},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("op"),
Value: []byte("ge"),
},
{
Key: []byte("xx"),
Value: []byte("yy"),
},
}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`vector * on(foo) group_left() duplicate_timeseries`, func(t *testing.T) {
t.Run(`vector * on(foo) duplicate_nonoverlapping_timeseries`, func(t *testing.T) {
t.Parallel()
q := `label_set(time()/10, "foo", "bar") + on(foo) group_left() (
q := `label_set(time()/10, "foo", "bar", "xx", "yy", "__name__", "qwert") + on(foo) (
label_set(time() < 1400, "foo", "bar", "op", "le"),
label_set(time() >= 1400, "foo", "bar", "op", "ge"),
)`
@@ -1870,13 +1920,105 @@ func TestExecSuccess(t *testing.T) {
Values: []float64{1100, 1320, 1540, 1760, 1980, 2200},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{{
Key: []byte("foo"),
Value: []byte("bar"),
}}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
}
resultExpected := []netstorage.Result{r1}
f(q, resultExpected)
})
t.Run(`vector * on(foo) group_left() duplicate_nonoverlapping_timeseries`, func(t *testing.T) {
t.Parallel()
q := `label_set(time()/10, "foo", "bar", "xx", "yy", "__name__", "qwert") + on(foo) group_left() (
label_set(time() < 1400, "foo", "bar", "op", "le"),
label_set(time() >= 1400, "foo", "bar", "op", "ge"),
)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1100, 1320, 1540, 1760, 1980, 2200},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("xx"),
Value: []byte("yy"),
},
}
resultExpected := []netstorage.Result{r1}
f(q, resultExpected)
})
t.Run(`vector * on(foo) group_left(__name__)`, func(t *testing.T) {
t.Parallel()
q := `label_set(time()/10, "foo", "bar", "xx", "yy", "__name__", "qwert") + on(foo) group_left(__name__)
label_set(time(), "foo", "bar", "__name__", "aaa")`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1100, 1320, 1540, 1760, 1980, 2200},
Timestamps: timestampsExpected,
}
r1.MetricName.MetricGroup = []byte("aaa")
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("xx"),
Value: []byte("yy"),
},
}
resultExpected := []netstorage.Result{r1}
f(q, resultExpected)
})
t.Run(`vector * on(foo) group_right()`, func(t *testing.T) {
t.Parallel()
q := `sort(label_set(time()/10, "foo", "bar", "xx", "yy", "__name__", "qwert") + on(foo) group_right(xx) (
label_set(time(), "foo", "bar", "__name__", "aaa"),
label_set(time()+3, "foo", "bar", "__name__", "yyy","ppp", "123"),
))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1100, 1320, 1540, 1760, 1980, 2200},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("xx"),
Value: []byte("yy"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1103, 1323, 1543, 1763, 1983, 2203},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("ppp"),
Value: []byte("123"),
},
{
Key: []byte("xx"),
Value: []byte("yy"),
},
}
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`vector * on() group_left scalar`, func(t *testing.T) {
t.Parallel()
q := `sort_desc((label_set(time(), "foo", "bar") or label_set(10, "foo", "qwert")) * on() group_left 2)`
@@ -1971,10 +2113,6 @@ func TestExecSuccess(t *testing.T) {
Timestamps: timestampsExpected,
}
r.MetricName.Tags = []storage.Tag{
{
Key: []byte("t1"),
Value: []byte("v123"),
},
{
Key: []byte("t2"),
Value: []byte("v3"),
@@ -2080,10 +2218,6 @@ func TestExecSuccess(t *testing.T) {
Timestamps: timestampsExpected,
}
r.MetricName.Tags = []storage.Tag{
{
Key: []byte("t1"),
Value: []byte("v123"),
},
{
Key: []byte("t2"),
Value: []byte("v3"),
@@ -2155,6 +2289,45 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`histogram_quantile(single-value-valid-le-max-phi)`, func(t *testing.T) {
t.Parallel()
q := `histogram_quantile(1, (
label_set(100, "le", "200"),
label_set(0, "le", "55"),
))`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{200, 200, 200, 200, 200, 200},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`histogram_quantile(single-value-valid-le-min-phi)`, func(t *testing.T) {
t.Parallel()
q := `histogram_quantile(0, (
label_set(100, "le", "200"),
label_set(0, "le", "55"),
))`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{55, 55, 55, 55, 55, 55},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`histogram_quantile(single-value-valid-le-min-phi-no-zero-bucket)`, func(t *testing.T) {
t.Parallel()
q := `histogram_quantile(0, label_set(100, "le", "200"))`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{0, 0, 0, 0, 0, 0},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`histogram_quantile(scalar-phi)`, func(t *testing.T) {
t.Parallel()
q := `histogram_quantile(time() / 2 / 1e3, label_set(100, "le", "200"))`
@@ -2215,7 +2388,7 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`histogram_quantile(nan-bucket-count)`, func(t *testing.T) {
t.Run(`histogram_quantile(nan-bucket-count-some)`, func(t *testing.T) {
t.Parallel()
q := `histogram_quantile(0.6,
label_set(90, "foo", "bar", "le", "10")
@@ -2224,7 +2397,7 @@ func TestExecSuccess(t *testing.T) {
)`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{30, 30, 30, 30, 30, 30},
Values: []float64{10, 10, 10, 10, 10, 10},
Timestamps: timestampsExpected,
}
r.MetricName.Tags = []storage.Tag{{
@@ -2234,7 +2407,7 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`histogram_quantile(nan-bucket-count)`, func(t *testing.T) {
t.Run(`histogram_quantile(normal-bucket-count)`, func(t *testing.T) {
t.Parallel()
q := `histogram_quantile(0.2,
label_set(0, "foo", "bar", "le", "10")
@@ -2263,7 +2436,7 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{}
f(q, resultExpected)
})
t.Run(`histogram_quantile(nan-bucket-count)`, func(t *testing.T) {
t.Run(`histogram_quantile(nan-bucket-count-all)`, func(t *testing.T) {
t.Parallel()
q := `histogram_quantile(0.6,
label_set(nan, "foo", "bar", "le", "10")
@@ -2273,6 +2446,190 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{}
f(q, resultExpected)
})
t.Run(`prometheus_buckets(missing-vmrange)`, func(t *testing.T) {
t.Parallel()
q := `sort(prometheus_buckets((
alias(label_set(time()/20, "foo", "bar", "le", "0.2"), "xyz"),
alias(label_set(time()/100, "foo", "bar", "vmrange", "foobar"), "xxx"),
alias(label_set(time()/100, "foo", "bar", "vmrange", "30...foobar"), "xxx"),
alias(label_set(time()/100, "foo", "bar", "vmrange", "30...40"), "xxx"),
alias(label_set(time()/80, "foo", "bar", "vmrange", "0...900", "le", "54"), "yyy"),
alias(label_set(time()/40, "foo", "bar", "vmrange", "900...+Inf", "le", "2343"), "yyy"),
)))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{0, 0, 0, 0, 0, 0},
Timestamps: timestampsExpected,
}
r1.MetricName.MetricGroup = []byte("xxx")
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("le"),
Value: []byte("30"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{10, 12, 14, 16, 18, 20},
Timestamps: timestampsExpected,
}
r2.MetricName.MetricGroup = []byte("xxx")
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("le"),
Value: []byte("40"),
},
}
r3 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{10, 12, 14, 16, 18, 20},
Timestamps: timestampsExpected,
}
r3.MetricName.MetricGroup = []byte("xxx")
r3.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("le"),
Value: []byte("+Inf"),
},
}
r4 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{12.5, 15, 17.5, 20, 22.5, 25},
Timestamps: timestampsExpected,
}
r4.MetricName.MetricGroup = []byte("yyy")
r4.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("le"),
Value: []byte("900"),
},
}
r5 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{37.5, 45, 52.5, 60, 67.5, 75},
Timestamps: timestampsExpected,
}
r5.MetricName.MetricGroup = []byte("yyy")
r5.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("le"),
Value: []byte("+Inf"),
},
}
r6 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{50, 60, 70, 80, 90, 100},
Timestamps: timestampsExpected,
}
r6.MetricName.MetricGroup = []byte("xyz")
r6.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("le"),
Value: []byte("0.2"),
},
}
resultExpected := []netstorage.Result{r1, r2, r3, r4, r5, r6}
f(q, resultExpected)
})
t.Run(`prometheus_buckets(valid)`, func(t *testing.T) {
t.Parallel()
q := `sort(prometheus_buckets((
alias(label_set(90, "foo", "bar", "vmrange", "0...0"), "xxx"),
alias(label_set(time()/20, "foo", "bar", "vmrange", "0...0.2"), "xxx"),
alias(label_set(time()/100, "foo", "bar", "vmrange", "0.2...40"), "xxx"),
alias(label_set(time()/10, "foo", "bar", "vmrange", "40...Inf"), "xxx"),
)))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{90, 90, 90, 90, 90, 90},
Timestamps: timestampsExpected,
}
r1.MetricName.MetricGroup = []byte("xxx")
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("le"),
Value: []byte("0"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{140, 150, 160, 170, 180, 190},
Timestamps: timestampsExpected,
}
r2.MetricName.MetricGroup = []byte("xxx")
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("le"),
Value: []byte("0.2"),
},
}
r3 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{150, 162, 174, 186, 198, 210},
Timestamps: timestampsExpected,
}
r3.MetricName.MetricGroup = []byte("xxx")
r3.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("le"),
Value: []byte("40"),
},
}
r4 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{250, 282, 314, 346, 378, 410},
Timestamps: timestampsExpected,
}
r4.MetricName.MetricGroup = []byte("xxx")
r4.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("le"),
Value: []byte("Inf"),
},
}
resultExpected := []netstorage.Result{r1, r2, r3, r4}
f(q, resultExpected)
})
t.Run(`median_over_time()`, func(t *testing.T) {
t.Parallel()
q := `median_over_time({})`
@@ -2323,6 +2680,108 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`histogram(scalar)`, func(t *testing.T) {
t.Parallel()
q := `sort(histogram(123)+(
label_set(0, "le", "1.0e2"),
label_set(0, "le", "1.5e2"),
label_set(1, "le", "+Inf"),
))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{0, 0, 0, 0, 0, 0},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("le"),
Value: []byte("1.0e2"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, 1, 1, 1, 1, 1},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("le"),
Value: []byte("1.5e2"),
},
}
r3 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2, 2, 2, 2, 2, 2},
Timestamps: timestampsExpected,
}
r3.MetricName.Tags = []storage.Tag{
{
Key: []byte("le"),
Value: []byte("+Inf"),
},
}
resultExpected := []netstorage.Result{r1, r2, r3}
f(q, resultExpected)
})
t.Run(`histogram(vector)`, func(t *testing.T) {
t.Parallel()
q := `sort(histogram((
label_set(1, "foo", "bar"),
label_set(1.1, "xx", "yy"),
alias(1.15, "foobar"),
))+(
label_set(0, "le", "9.5e-1"),
label_set(0, "le", "1.0e0"),
label_set(0, "le", "1.5e0"),
label_set(1, "le", "+Inf"),
))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{0, 0, 0, 0, 0, 0},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("le"),
Value: []byte("9.5e-1"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, 1, 1, 1, 1, 1},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("le"),
Value: []byte("1.0e0"),
},
}
r3 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{3, 3, 3, 3, 3, 3},
Timestamps: timestampsExpected,
}
r3.MetricName.Tags = []storage.Tag{
{
Key: []byte("le"),
Value: []byte("1.5e0"),
},
}
r4 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{4, 4, 4, 4, 4, 4},
Timestamps: timestampsExpected,
}
r4.MetricName.Tags = []storage.Tag{
{
Key: []byte("le"),
Value: []byte("+Inf"),
},
}
resultExpected := []netstorage.Result{r1, r2, r3, r4}
f(q, resultExpected)
})
t.Run(`avg(scalar) wiTHout (xx, yy)`, func(t *testing.T) {
t.Parallel()
q := `avg wiTHout (xx, yy) (123)`
@@ -2548,6 +3007,28 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`increases_over_time`, func(t *testing.T) {
t.Parallel()
q := `increases_over_time(rand(0)[200s:10s])`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{11, 9, 9, 12, 9, 8},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`decreases_over_time`, func(t *testing.T) {
t.Parallel()
q := `decreases_over_time(rand(0)[200s:10s])`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{9, 11, 11, 8, 11, 12},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`limitk(-1)`, func(t *testing.T) {
t.Parallel()
q := `limitk(-1, label_set(10, "foo", "bar") or label_set(time()/150, "baz", "sss"))`
@@ -3400,7 +3881,7 @@ func TestExecSuccess(t *testing.T) {
}}
r4 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{0.85, 0.94, 0.97, 0.93, 0.98, 0.92},
Values: []float64{0.9, 0.94, 0.97, 0.93, 0.98, 0.92},
Timestamps: timestampsExpected,
}
r4.MetricName.Tags = []storage.Tag{{
@@ -3448,7 +3929,7 @@ func TestExecSuccess(t *testing.T) {
q := `sort(rollup(time()[:50s]))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{850, 1050, 1250, 1450, 1650, 1850},
Values: []float64{800, 1000, 1200, 1400, 1600, 1800},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{{
@@ -3554,6 +4035,17 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`lag()`, func(t *testing.T) {
t.Parallel()
q := `lag(time()[60s:17s])`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{14, 10, 6, 2, 15, 11},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`()`, func(t *testing.T) {
t.Parallel()
q := `()`
@@ -3702,6 +4194,35 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`((1),(2,3))`, func(t *testing.T) {
t.Parallel()
q := `((
alias(1, "x1"),
),(
alias(2, "x2"),
alias(3, "x3"),
))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, 1, 1, 1, 1, 1},
Timestamps: timestampsExpected,
}
r1.MetricName.MetricGroup = []byte("x1")
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2, 2, 2, 2, 2, 2},
Timestamps: timestampsExpected,
}
r2.MetricName.MetricGroup = []byte("x2")
r3 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{3, 3, 3, 3, 3, 3},
Timestamps: timestampsExpected,
}
r3.MetricName.MetricGroup = []byte("x3")
resultExpected := []netstorage.Result{r1, r2, r3}
f(q, resultExpected)
})
t.Run(`union(more-than-two)`, func(t *testing.T) {
t.Parallel()
q := `union(
@@ -3818,6 +4339,107 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r1, r2, r3, r4, r5, r6}
f(q, resultExpected)
})
t.Run(`count_values by (xxx)`, func(t *testing.T) {
t.Parallel()
q := `count_values("xxx", label_set(10, "foo", "bar", "xxx", "aaa") or label_set(floor(time()/600), "foo", "bar", "baz", "xx")) by (xxx)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, nan, nan, nan, nan, nan},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("xxx"),
Value: []byte("1"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{nan, 1, 1, 1, nan, nan},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("xxx"),
Value: []byte("2"),
},
}
r3 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{nan, nan, nan, nan, 1, 1},
Timestamps: timestampsExpected,
}
r3.MetricName.Tags = []storage.Tag{
{
Key: []byte("xxx"),
Value: []byte("3"),
},
}
r4 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, 1, 1, 1, 1, 1},
Timestamps: timestampsExpected,
}
r4.MetricName.Tags = []storage.Tag{
{
Key: []byte("xxx"),
Value: []byte("10"),
},
}
resultExpected := []netstorage.Result{r1, r2, r3, r4}
f(q, resultExpected)
})
t.Run(`count_values without (baz)`, func(t *testing.T) {
t.Parallel()
q := `count_values("xxx", label_set(floor(time()/600), "foo", "bar")) without (baz)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1, nan, nan, nan, nan, nan},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("xxx"),
Value: []byte("1"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{nan, 1, 1, 1, nan, nan},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("xxx"),
Value: []byte("2"),
},
}
r3 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{nan, nan, nan, nan, 1, 1},
Timestamps: timestampsExpected,
}
r3.MetricName.Tags = []storage.Tag{
{
Key: []byte("foo"),
Value: []byte("bar"),
},
{
Key: []byte("xxx"),
Value: []byte("3"),
},
}
resultExpected := []netstorage.Result{r1, r2, r3}
f(q, resultExpected)
})
}
func TestExecError(t *testing.T) {
@@ -3918,6 +4540,8 @@ func TestExecError(t *testing.T) {
f(`alias()`)
f(`alias(1)`)
f(`alias(1, "foo", "bar")`)
f(`lifetime()`)
f(`lag()`)
// Invalid argument type
f(`median_over_time({}, 2)`)
@@ -4003,27 +4627,27 @@ func testResultsEqual(t *testing.T, result, resultExpected []netstorage.Result)
for i := range result {
r := &result[i]
rExpected := &resultExpected[i]
testMetricNamesEqual(t, &r.MetricName, &rExpected.MetricName)
testMetricNamesEqual(t, &r.MetricName, &rExpected.MetricName, i)
testRowsEqual(t, r.Values, r.Timestamps, rExpected.Values, rExpected.Timestamps)
}
}
func testMetricNamesEqual(t *testing.T, mn, mnExpected *storage.MetricName) {
func testMetricNamesEqual(t *testing.T, mn, mnExpected *storage.MetricName, pos int) {
t.Helper()
if string(mn.MetricGroup) != string(mnExpected.MetricGroup) {
t.Fatalf(`unexpected MetricGroup; got %q; want %q`, mn.MetricGroup, mnExpected.MetricGroup)
t.Fatalf(`unexpected MetricGroup at #%d; got %q; want %q`, pos, mn.MetricGroup, mnExpected.MetricGroup)
}
if len(mn.Tags) != len(mnExpected.Tags) {
t.Fatalf(`unexpected tags count; got %d; want %d`, len(mn.Tags), len(mnExpected.Tags))
t.Fatalf(`unexpected tags count at #%d; got %d; want %d`, pos, len(mn.Tags), len(mnExpected.Tags))
}
for i := range mn.Tags {
tag := &mn.Tags[i]
tagExpected := &mnExpected.Tags[i]
if string(tag.Key) != string(tagExpected.Key) {
t.Fatalf(`unexpected tag key; got %q; want %q`, tag.Key, tagExpected.Key)
t.Fatalf(`unexpected tag key at #%d,%d; got %q; want %q`, pos, i, tag.Key, tagExpected.Key)
}
if string(tag.Value) != string(tagExpected.Value) {
t.Fatalf(`unexpected tag value; got %q; want %q`, tag.Value, tagExpected.Value)
t.Fatalf(`unexpected tag value for key %q at #%d,%d; got %q; want %q`, tag.Key, pos, i, tag.Value, tagExpected.Value)
}
}
}
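
The expected values in the `lag()` test case above (`14, 10, 6, 2, 15, 11`) can be reproduced by hand. The sketch below assumes the test evaluates the query at 1000s, 1200s, ..., 2000s (the evaluation range is not visible in this diff) and that subquery points for `[60s:17s]` land on multiples of 17s. `lagSeconds` is a hypothetical helper, not part of the codebase.

```go
package main

import "fmt"

// lagSeconds returns the distance in seconds from t to the most recent
// multiple of step at or before t. It models lag() for a subquery whose
// points are assumed to be aligned to multiples of step.
func lagSeconds(t, step int64) int64 {
	return t % step
}

func main() {
	// Assumed evaluation timestamps: 1000s..2000s with a 200s step.
	for t := int64(1000); t <= 2000; t += 200 {
		fmt.Print(lagSeconds(t, 17), " ")
	}
	fmt.Println()
}
```

Under these assumptions the program prints the same six values the test expects.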


@@ -116,13 +116,17 @@ func removeParensExpr(e expr) expr {
return fe
}
if pe, ok := e.(*parensExpr); ok {
args := *pe
for i, arg := range args {
args[i] = removeParensExpr(arg)
}
if len(*pe) == 1 {
return removeParensExpr((*pe)[0])
return args[0]
}
// Treat parensExpr as a function with empty name, i.e. union()
fe := &funcExpr{
Name: "",
Args: *pe,
Args: args,
}
return fe
}
@@ -1550,6 +1554,10 @@ type rollupExpr struct {
InheritStep bool
}
func (re *rollupExpr) ForSubquery() bool {
return len(re.Step) > 0 || re.InheritStep
}
func (re *rollupExpr) AppendString(dst []byte) []byte {
needParens := func() bool {
if _, ok := re.Expr.(*rollupExpr); ok {


@@ -252,6 +252,8 @@ func TestParsePromQLSuccess(t *testing.T) {
another(`(-foo + ((bar) / (baz))) + ((23))`, `((0 - foo) + (bar / baz)) + 23`)
another(`(FOO + ((Bar) / (baZ))) + ((23))`, `(FOO + (Bar / baZ)) + 23`)
same(`(foo, bar)`)
another(`((foo, bar),(baz))`, `((foo, bar), baz)`)
same(`(foo, (bar, baz), ((x, y), (z, y), xx))`)
another(`1+(foo, bar,)`, `1 + (foo, bar)`)
another(`((foo(bar,baz)), (1+(2)+(3,4)+()))`, `(foo(bar, baz), (3 + (3, 4)) + ())`)
same(`()`)
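
The new parser expectations above (for example `((foo, bar),(baz))` becoming `((foo, bar), baz)`) follow from stripping redundant parentheses at every nesting level, not only at the top. A minimal sketch of that idea, using a hypothetical `node` type in place of the parser's real `expr` types:

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical stand-in for a parsed expression:
// a leaf when name is non-empty, otherwise a parenthesized list of args.
type node struct {
	name string
	args []*node
}

func leaf(s string) *node        { return &node{name: s} }
func parens(args ...*node) *node { return &node{args: args} }

// stripParens recursively processes every argument first and then
// unwraps a parenthesized list that holds exactly one element.
func stripParens(n *node) *node {
	if n.name != "" {
		return n
	}
	args := make([]*node, len(n.args))
	for i, a := range n.args {
		args[i] = stripParens(a)
	}
	if len(args) == 1 {
		return args[0]
	}
	return &node{args: args}
}

// String renders the tree back to text.
func (n *node) String() string {
	if n.name != "" {
		return n.name
	}
	parts := make([]string, len(n.args))
	for i, a := range n.args {
		parts[i] = a.String()
	}
	return "(" + strings.Join(parts, ", ") + ")"
}

func main() {
	// Corresponds to the input ((foo, bar),(baz)).
	e := parens(parens(leaf("foo"), leaf("bar")), parens(leaf("baz")))
	fmt.Println(stripParens(e)) // ((foo, bar), baz)
}
```

Processing the arguments before checking the length is what lets inner single-element groups such as `(baz)` collapse even when the outer group has several elements.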


@@ -51,11 +51,14 @@ type regexpCacheValue struct {
}
type regexpCache struct {
m map[string]*regexpCacheValue
mu sync.RWMutex
// Move atomic counters to the top of struct for 8-byte alignment on 32-bit arch.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
requests uint64
misses uint64
m map[string]*regexpCacheValue
mu sync.RWMutex
}
func (rc *regexpCache) Requests() uint64 {
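
The field reordering above relies on a guarantee documented in `sync/atomic`: on 32-bit platforms the caller must keep atomically accessed 64-bit words 64-bit aligned, and the first word of an allocated struct can be relied upon to be aligned. A minimal sketch of the same layout, with a hypothetical `cacheStats` type that is not part of the codebase:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// cacheStats is a hypothetical cache with atomically updated counters.
type cacheStats struct {
	// Keep atomically accessed uint64 fields at the top of the struct,
	// so they stay 8-byte aligned on 32-bit architectures.
	requests uint64
	misses   uint64

	mu sync.RWMutex
	m  map[string]string
}

// get looks up k and updates the request and miss counters.
func (c *cacheStats) get(k string) (string, bool) {
	atomic.AddUint64(&c.requests, 1)
	c.mu.RLock()
	v, ok := c.m[k]
	c.mu.RUnlock()
	if !ok {
		atomic.AddUint64(&c.misses, 1)
	}
	return v, ok
}

func main() {
	c := &cacheStats{m: map[string]string{"a": "b"}}
	c.get("a")
	c.get("x")
	fmt.Println(atomic.LoadUint64(&c.requests), atomic.LoadUint64(&c.misses)) // 2 1
}
```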


@@ -3,7 +3,6 @@ package promql
import (
"fmt"
"math"
"sort"
"strings"
"sync"
@@ -23,8 +22,8 @@ var rollupFuncs = map[string]newRollupFunc{
"deriv_fast": newRollupFuncOneArg(rollupDerivFast),
"holt_winters": newRollupHoltWinters,
"idelta": newRollupFuncOneArg(rollupIdelta),
"increase": newRollupFuncOneArg(rollupDelta), // + rollupFuncsRemoveCounterResets
"irate": newRollupFuncOneArg(rollupIderiv), // + rollupFuncsRemoveCounterResets
"increase": newRollupFuncOneArg(rollupIncrease), // + rollupFuncsRemoveCounterResets
"irate": newRollupFuncOneArg(rollupIderiv), // + rollupFuncsRemoveCounterResets
"predict_linear": newRollupPredictLinear,
"rate": newRollupFuncOneArg(rollupDerivFast), // + rollupFuncsRemoveCounterResets
"resets": newRollupFuncOneArg(rollupResets),
@@ -38,21 +37,24 @@ var rollupFuncs = map[string]newRollupFunc{
"stdvar_over_time": newRollupFuncOneArg(rollupStdvar),
// Additional rollup funcs.
"sum2_over_time": newRollupFuncOneArg(rollupSum2),
"geomean_over_time": newRollupFuncOneArg(rollupGeomean),
"first_over_time": newRollupFuncOneArg(rollupFirst),
"last_over_time": newRollupFuncOneArg(rollupLast),
"distinct_over_time": newRollupFuncOneArg(rollupDistinct),
"integrate": newRollupFuncOneArg(rollupIntegrate),
"ideriv": newRollupFuncOneArg(rollupIderiv),
"lifetime": newRollupFuncOneArg(rollupLifetime),
"scrape_interval": newRollupFuncOneArg(rollupScrapeInterval),
"rollup": newRollupFuncOneArg(rollupFake),
"rollup_rate": newRollupFuncOneArg(rollupFake), // + rollupFuncsRemoveCounterResets
"rollup_deriv": newRollupFuncOneArg(rollupFake),
"rollup_delta": newRollupFuncOneArg(rollupFake),
"rollup_increase": newRollupFuncOneArg(rollupFake), // + rollupFuncsRemoveCounterResets
"rollup_candlestick": newRollupFuncOneArg(rollupFake),
"sum2_over_time": newRollupFuncOneArg(rollupSum2),
"geomean_over_time": newRollupFuncOneArg(rollupGeomean),
"first_over_time": newRollupFuncOneArg(rollupFirst),
"last_over_time": newRollupFuncOneArg(rollupLast),
"distinct_over_time": newRollupFuncOneArg(rollupDistinct),
"increases_over_time": newRollupFuncOneArg(rollupIncreases),
"decreases_over_time": newRollupFuncOneArg(rollupDecreases),
"integrate": newRollupFuncOneArg(rollupIntegrate),
"ideriv": newRollupFuncOneArg(rollupIderiv),
"lifetime": newRollupFuncOneArg(rollupLifetime),
"lag": newRollupFuncOneArg(rollupLag),
"scrape_interval": newRollupFuncOneArg(rollupScrapeInterval),
"rollup": newRollupFuncOneArg(rollupFake),
"rollup_rate": newRollupFuncOneArg(rollupFake), // + rollupFuncsRemoveCounterResets
"rollup_deriv": newRollupFuncOneArg(rollupFake),
"rollup_delta": newRollupFuncOneArg(rollupFake),
"rollup_increase": newRollupFuncOneArg(rollupFake), // + rollupFuncsRemoveCounterResets
"rollup_candlestick": newRollupFuncOneArg(rollupFake),
}
var rollupFuncsMayAdjustWindow = map[string]bool{
@@ -111,8 +113,10 @@ type rollupFuncArg struct {
values []float64
timestamps []int64
idx int
step int64
currTimestamp int64
idx int
step int64
realPrevValue float64
}
func (rfa *rollupFuncArg) reset() {
@@ -120,8 +124,10 @@ func (rfa *rollupFuncArg) reset() {
rfa.prevTimestamp = 0
rfa.values = nil
rfa.timestamps = nil
rfa.currTimestamp = 0
rfa.idx = 0
rfa.step = 0
rfa.realPrevValue = nan
}
// rollupFunc must return rollup value for the given rfa.
@@ -147,6 +153,9 @@ type rollupConfig struct {
MayAdjustWindow bool
Timestamps []int64
// LookbackDelta is the analog to `-query.lookback-delta` from Prometheus world.
LookbackDelta int64
}
var (
@@ -184,6 +193,9 @@ func (rc *rollupConfig) Do(dstValues []float64, values []float64, timestamps []i
dstValues = decimal.ExtendFloat64sCapacity(dstValues, len(rc.Timestamps))
maxPrevInterval := getMaxPrevInterval(timestamps)
if rc.LookbackDelta > 0 && maxPrevInterval > rc.LookbackDelta {
maxPrevInterval = rc.LookbackDelta
}
window := rc.Window
if window <= 0 {
window = rc.Step
@@ -194,6 +206,7 @@ func (rc *rollupConfig) Do(dstValues []float64, values []float64, timestamps []i
rfa := getRollupFuncArg()
rfa.idx = 0
rfa.step = rc.Step
rfa.realPrevValue = nan
i := 0
j := 0
@@ -211,13 +224,17 @@ func (rc *rollupConfig) Do(dstValues []float64, values []float64, timestamps []i
rfa.prevValue = nan
rfa.prevTimestamp = tStart - maxPrevInterval
if i > 0 && timestamps[i-1] > rfa.prevTimestamp {
if i < len(timestamps) && i > 0 && timestamps[i-1] > rfa.prevTimestamp {
rfa.prevValue = values[i-1]
rfa.prevTimestamp = timestamps[i-1]
}
rfa.values = values[i:j]
rfa.timestamps = timestamps[i:j]
rfa.currTimestamp = tEnd
if i > 0 {
rfa.realPrevValue = values[i-1]
}
value := rc.Func(rfa)
rfa.idx++
dstValues = append(dstValues, value)
@@ -261,17 +278,42 @@ func seekFirstTimestampIdxAfter(timestamps []int64, seekTimestamp int64, nHint i
return startIdx + len(timestamps)
}
// Slow path: too big len(timestamps), so use binary search.
i := sort.Search(len(timestamps), func(n int) bool {
return n >= 0 && n < len(timestamps) && timestamps[n] > seekTimestamp
})
return startIdx + i
i := binarySearchInt64(timestamps, seekTimestamp+1)
return startIdx + int(i)
}
func binarySearchInt64(a []int64, v int64) uint {
// Copy-pasted sort.Search from https://golang.org/src/sort/search.go?s=2246:2286#L49
i, j := uint(0), uint(len(a))
for i < j {
h := (i + j) >> 1
if h < uint(len(a)) && a[h] < v {
i = h + 1
} else {
j = h
}
}
return i
}
func getMaxPrevInterval(timestamps []int64) int64 {
if len(timestamps) < 2 {
return int64(maxSilenceInterval)
}
d := (timestamps[len(timestamps)-1] - timestamps[0]) / int64(len(timestamps)-1)
// Estimate scrape interval as 0.6 quantile for the first 100 intervals.
h := histogram.GetFast()
tsPrev := timestamps[0]
timestamps = timestamps[1:]
if len(timestamps) > 100 {
timestamps = timestamps[:100]
}
for _, ts := range timestamps {
h.Update(float64(ts - tsPrev))
tsPrev = ts
}
d := int64(h.Quantile(0.6))
histogram.PutFast(h)
if d <= 0 {
return int64(maxSilenceInterval)
}
@@ -531,11 +573,14 @@ func rollupAvg(rfa *rollupFuncArg) float64 {
func rollupMin(rfa *rollupFuncArg) float64 {
// There is no need to handle NaNs here, since they must be cleaned up
// before calling rollup funcs.
minValue := rfa.prevValue
values := rfa.values
if len(values) == 0 {
return rfa.prevValue
if math.IsNaN(minValue) {
if len(values) == 0 {
return nan
}
minValue = values[0]
}
minValue := values[0]
for _, v := range values {
if v < minValue {
minValue = v
@@ -547,11 +592,14 @@ func rollupMin(rfa *rollupFuncArg) float64 {
func rollupMax(rfa *rollupFuncArg) float64 {
// There is no need to handle NaNs here, since they must be cleaned up
// before calling rollup funcs.
maxValue := rfa.prevValue
values := rfa.values
if len(values) == 0 {
return rfa.prevValue
if math.IsNaN(maxValue) {
if len(values) == 0 {
return nan
}
maxValue = values[0]
}
maxValue := values[0]
for _, v := range values {
if v > maxValue {
maxValue = v
@@ -565,7 +613,10 @@ func rollupSum(rfa *rollupFuncArg) float64 {
// before calling rollup funcs.
values := rfa.values
if len(values) == 0 {
return rfa.prevValue
if math.IsNaN(rfa.prevValue) {
return nan
}
return 0
}
var sum float64
for _, v := range values {
@@ -649,6 +700,14 @@ func rollupStdvar(rfa *rollupFuncArg) float64 {
}
func rollupDelta(rfa *rollupFuncArg) float64 {
return rollupDeltaInternal(rfa, false)
}
func rollupIncrease(rfa *rollupFuncArg) float64 {
return rollupDeltaInternal(rfa, true)
}
func rollupDeltaInternal(rfa *rollupFuncArg, canUseRealPrevValue bool) float64 {
// There is no need to handle NaNs here, since they must be cleaned up
// before calling rollup funcs.
values := rfa.values
@@ -658,6 +717,10 @@ func rollupDelta(rfa *rollupFuncArg) float64 {
return nan
}
if len(values) == 1 {
if canUseRealPrevValue && !math.IsNaN(rfa.realPrevValue) {
// Fix against removeCounterResets.
return values[0] - rfa.realPrevValue
}
// Assume that the previous non-existing value was 0.
return values[0]
}
@@ -782,6 +845,18 @@ func rollupLifetime(rfa *rollupFuncArg) float64 {
return float64(timestamps[len(timestamps)-1]-rfa.prevTimestamp) * 1e-3
}
func rollupLag(rfa *rollupFuncArg) float64 {
// Calculate the duration between the current timestamp and the last data point.
timestamps := rfa.timestamps
if len(timestamps) == 0 {
if math.IsNaN(rfa.prevValue) {
return nan
}
return float64(rfa.currTimestamp-rfa.prevTimestamp) * 1e-3
}
return float64(rfa.currTimestamp-timestamps[len(timestamps)-1]) * 1e-3
}
func rollupScrapeInterval(rfa *rollupFuncArg) float64 {
// Calculate the average interval between data points.
timestamps := rfa.timestamps
@@ -820,6 +895,37 @@ func rollupChanges(rfa *rollupFuncArg) float64 {
return float64(n)
}
func rollupIncreases(rfa *rollupFuncArg) float64 {
// There is no need to handle NaNs here, since they must be cleaned up
// before calling rollup funcs.
values := rfa.values
if len(values) == 0 {
if math.IsNaN(rfa.prevValue) {
return nan
}
return 0
}
prevValue := rfa.prevValue
if math.IsNaN(prevValue) {
prevValue = values[0]
values = values[1:]
}
if len(values) == 0 {
return 0
}
n := 0
for _, v := range values {
if v > prevValue {
n++
}
prevValue = v
}
return float64(n)
}
// `decreases_over_time` logic is the same as `resets` logic.
var rollupDecreases = rollupResets
func rollupResets(rfa *rollupFuncArg) float64 {
// There is no need to handle NaNs here, since they must be cleaned up
// before calling rollup funcs.
@@ -922,6 +1028,8 @@ func rollupIntegrate(rfa *rollupFuncArg) float64 {
timestamp := timestamps[i]
dt := float64(timestamp-prevTimestamp) * 1e-3
sum += 0.5 * (v + prevValue) * dt
prevTimestamp = timestamp
prevValue = v
}
return sum
}
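
The last hunk above adds the two assignments that advance `prevTimestamp` and `prevValue` inside the integration loop. A standalone sketch of the trapezoidal rule with that step in place follows; the `integrate` signature is hypothetical and only mirrors the loop body shown in the hunk, with timestamps assumed to be in milliseconds.

```go
package main

import "fmt"

// integrate approximates the area under a series with the trapezoidal rule.
// (prevValue, prevTimestamp) is the point preceding the window.
// Timestamps are in milliseconds; the result is in value*seconds.
func integrate(prevValue float64, prevTimestamp int64, values []float64, timestamps []int64) float64 {
	var sum float64
	for i, v := range values {
		timestamp := timestamps[i]
		dt := float64(timestamp-prevTimestamp) * 1e-3
		sum += 0.5 * (v + prevValue) * dt
		// Advance the previous point; without this every trapezoid
		// would be measured from the very first point.
		prevTimestamp = timestamp
		prevValue = v
	}
	return sum
}

func main() {
	// Points: (0s, 0), (1s, 2), (3s, 2).
	// Trapezoids: 0.5*(0+2)*1 and 0.5*(2+2)*2.
	fmt.Println(integrate(0, 0, []float64{2, 2}, []int64{1000, 3000}))
}
```

Omitting the two assignments would make the second trapezoid span the whole interval from 0s to 3s with the wrong left edge, which is consistent with the changed `integrate` expectations in the test diff.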


@@ -388,7 +388,7 @@ func testTimeseriesEqual(t *testing.T, tss, tssExpected []*timeseries) {
}
for i, ts := range tss {
tsExpected := tssExpected[i]
testMetricNamesEqual(t, &ts.MetricName, &tsExpected.MetricName)
testMetricNamesEqual(t, &ts.MetricName, &tsExpected.MetricName, i)
testRowsEqual(t, ts.Values, ts.Timestamps, tsExpected.Values, tsExpected.Timestamps)
}
}


@@ -182,7 +182,8 @@ func testRollupFunc(t *testing.T, funcName string, args []interface{}, meExpecte
t.Fatalf("unexpected value; got %v; want %v", v, vExpected)
}
} else {
if v != vExpected {
eps := math.Abs(v - vExpected)
if eps > 1e-14 {
t.Fatalf("unexpected value; got %v; want %v", v, vExpected)
}
}
@@ -290,9 +291,11 @@ func TestRollupNewRollupFuncSuccess(t *testing.T) {
f("stdvar_over_time", 945.7430555555555)
f("first_over_time", 123)
f("last_over_time", 34)
f("integrate", 61.0275)
f("integrate", 5.4705)
f("distinct_over_time", 8)
f("ideriv", 0)
f("decreases_over_time", 5)
f("increases_over_time", 5)
}
func TestRollupNewRollupFuncError(t *testing.T) {
@@ -358,7 +361,7 @@ func TestRollupNoWindowNoPoints(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{2, 0, 0, 0, 0, 0, 0, 0}
valuesExpected := []float64{2, 0, 0, 0, nan, nan, nan, nan}
timestampsExpected := []int64{120, 124, 128, 132, 136, 140, 144, 148}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -389,7 +392,7 @@ func TestRollupWindowNoPoints(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{34, 34, 34, nan}
valuesExpected := []float64{nan, nan, nan, nan}
timestampsExpected := []int64{161, 171, 181, 191}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -420,7 +423,7 @@ func TestRollupNoWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{12, 44, 34, 34}
valuesExpected := []float64{12, 44, 34, nan}
timestampsExpected := []int64{100, 120, 140, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -465,7 +468,7 @@ func TestRollupWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{44, 34, 34, 34}
valuesExpected := []float64{44, 34, 34, nan}
timestampsExpected := []int64{100, 120, 140, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -479,12 +482,57 @@ func TestRollupWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, 54, 44, 34}
valuesExpected := []float64{nan, 54, 44, nan}
timestampsExpected := []int64{0, 50, 100, 150}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
}
func TestRollupFuncsLookbackDelta(t *testing.T) {
t.Run("1", func(t *testing.T) {
rc := rollupConfig{
Func: rollupFirst,
Start: 80,
End: 140,
Step: 10,
LookbackDelta: 1,
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{99, 12, 44, nan, 32, 34, nan}
timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
t.Run("7", func(t *testing.T) {
rc := rollupConfig{
Func: rollupFirst,
Start: 80,
End: 140,
Step: 10,
LookbackDelta: 7,
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{99, 12, 44, 44, 32, 34, nan}
timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
t.Run("0", func(t *testing.T) {
rc := rollupConfig{
Func: rollupFirst,
Start: 80,
End: 140,
Step: 10,
LookbackDelta: 0,
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{34, 12, 12, 44, 44, 34, nan}
timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
}
func TestRollupFuncsNoWindow(t *testing.T) {
t.Run("first", func(t *testing.T) {
rc := rollupConfig{
@@ -524,7 +572,7 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, 21, 12, 32, 34}
valuesExpected := []float64{nan, 21, 12, 12, 34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -584,6 +632,20 @@ func TestRollupFuncsNoWindow(t *testing.T) {
timestampsExpected := []int64{10, 50, 90, 130}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
t.Run("lag", func(t *testing.T) {
rc := rollupConfig{
Func: rollupLag,
Start: 0,
End: 160,
Step: 40,
Window: 0,
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, 0.004, 0, 0, 0.03}
timestampsExpected := []int64{0, 40, 80, 120, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
t.Run("lifetime_1", func(t *testing.T) {
rc := rollupConfig{
Func: rollupLifetime,
@@ -748,7 +810,7 @@ func TestRollupFuncsNoWindow(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{nan, 4.6035, 4.3934999999999995, 2.166, 0.34}
valuesExpected := []float64{nan, 1.526, 2.2795, 1.325, 0.34}
timestampsExpected := []int64{0, 40, 80, 120, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -782,6 +844,27 @@ func TestRollupFuncsNoWindow(t *testing.T) {
})
}
func TestRollupBigNumberOfValues(t *testing.T) {
const srcValuesCount = 1e4
rc := rollupConfig{
Func: rollupDefault,
End: srcValuesCount,
Step: srcValuesCount / 5,
Window: srcValuesCount / 4,
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
srcValues := make([]float64, srcValuesCount)
srcTimestamps := make([]int64, srcValuesCount)
for i := 0; i < srcValuesCount; i++ {
srcValues[i] = float64(i)
srcTimestamps[i] = int64(i / 2)
}
values := rc.Do(nil, srcValues, srcTimestamps)
valuesExpected := []float64{1, 4001, 8001, 9999, nan, nan}
timestampsExpected := []int64{0, 2000, 4000, 6000, 8000, 10000}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
}
func testRowsEqual(t *testing.T, values []float64, timestamps []int64, valuesExpected []float64, timestampsExpected []int64) {
t.Helper()
if len(values) != len(valuesExpected) {
@@ -810,7 +893,7 @@ func testRowsEqual(t *testing.T, values []float64, timestamps []int64, valuesExp
}
continue
}
if v != vExpected {
if math.Abs(v-vExpected) > 1e-15 {
t.Fatalf("unexpected value at values[%d]; got %f; want %f\nvalues=\n%v\nvaluesExpected=\n%v",
i, v, vExpected, values, valuesExpected)
}
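
`TestRollupBigNumberOfValues` above feeds 10000 samples with duplicate timestamps, which presumably exercises the binary-search path that this change set rewrites in `seekFirstTimestampIdxAfter`. The sketch below shows the underlying trick in isolation: for integers, "first element greater than `seek`" is the same as "first element not less than `seek+1`". `firstIdxAfter` is a hypothetical name, not the function from the diff.

```go
package main

import "fmt"

// firstIdxAfter returns the index of the first element in the sorted
// slice a that is greater than seek, or len(a) if there is none.
func firstIdxAfter(a []int64, seek int64) int {
	// Lower-bound search for seek+1, which equals "> seek" for integers.
	v := seek + 1
	i, j := 0, len(a)
	for i < j {
		h := int(uint(i+j) >> 1) // midpoint without overflow
		if a[h] < v {
			i = h + 1
		} else {
			j = h
		}
	}
	return i
}

func main() {
	// Duplicate timestamps, similar in shape to the test data above.
	ts := []int64{0, 0, 1, 1, 2, 2, 3, 3}
	fmt.Println(firstIdxAfter(ts, 1))  // 4
	fmt.Println(firstIdxAfter(ts, 3))  // 8
	fmt.Println(firstIdxAfter(ts, -1)) // 0
}
```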


@@ -288,7 +288,6 @@ func marshalMetricTagsFast(dst []byte, tags []storage.Tag) []byte {
}
func marshalMetricNameSorted(dst []byte, mn *storage.MetricName) []byte {
// Do not marshal AccountID and ProjectID, since they are unused.
dst = marshalBytesFast(dst, mn.MetricGroup)
sortMetricTags(mn.Tags)
dst = marshalMetricTagsFast(dst, mn.Tags)

View File

@@ -91,6 +91,7 @@ var transformFuncs = map[string]transformFunc{
"cos": newTransformFuncOneArg(transformCos),
"asin": newTransformFuncOneArg(transformAsin),
"acos": newTransformFuncOneArg(transformAcos),
"prometheus_buckets": transformPrometheusBuckets,
}
func getTransformFunc(s string) transformFunc {
@@ -272,6 +273,131 @@ func transformFloor(v float64) float64 {
return math.Floor(v)
}
func transformPrometheusBuckets(tfa *transformFuncArg) ([]*timeseries, error) {
args := tfa.args
if err := expectTransformArgsNum(args, 1); err != nil {
return nil, err
}
rvs := vmrangeBucketsToLE(args[0])
return rvs, nil
}
func vmrangeBucketsToLE(tss []*timeseries) []*timeseries {
rvs := make([]*timeseries, 0, len(tss))
// Group timeseries by MetricGroup+tags excluding `vmrange` tag.
type x struct {
startStr string
endStr string
start float64
end float64
ts *timeseries
}
m := make(map[string][]x)
bb := bbPool.Get()
defer bbPool.Put(bb)
for _, ts := range tss {
vmrange := ts.MetricName.GetTagValue("vmrange")
if len(vmrange) == 0 {
if le := ts.MetricName.GetTagValue("le"); len(le) > 0 {
// Keep Prometheus-compatible buckets.
rvs = append(rvs, ts)
}
continue
}
n := strings.Index(bytesutil.ToUnsafeString(vmrange), "...")
if n < 0 {
continue
}
startStr := string(vmrange[:n])
start, err := strconv.ParseFloat(startStr, 64)
if err != nil {
continue
}
endStr := string(vmrange[n+len("..."):])
end, err := strconv.ParseFloat(endStr, 64)
if err != nil {
continue
}
ts.MetricName.RemoveTag("le")
ts.MetricName.RemoveTag("vmrange")
bb.B = marshalMetricNameSorted(bb.B[:0], &ts.MetricName)
m[string(bb.B)] = append(m[string(bb.B)], x{
startStr: startStr,
endStr: endStr,
start: start,
end: end,
ts: ts,
})
}
// Convert `vmrange` label in each group of time series to `le` label.
copyTS := func(src *timeseries, leStr string) *timeseries {
var ts timeseries
ts.CopyFromShallowTimestamps(src)
values := ts.Values
for i := range values {
values[i] = 0
}
ts.MetricName.RemoveTag("le")
ts.MetricName.AddTag("le", leStr)
return &ts
}
isZeroTS := func(ts *timeseries) bool {
for _, v := range ts.Values {
if v > 0 {
return false
}
}
return true
}
for _, xss := range m {
sort.Slice(xss, func(i, j int) bool { return xss[i].end < xss[j].end })
xssNew := make([]x, 0, len(xss)+2)
var xsPrev x
for _, xs := range xss {
ts := xs.ts
if isZeroTS(ts) {
// Skip time series with zeros. They are substituted by xssNew below.
continue
}
if xs.start != xsPrev.end {
xssNew = append(xssNew, x{
endStr: xs.startStr,
end: xs.start,
ts: copyTS(ts, xs.startStr),
})
}
ts.MetricName.AddTag("le", xs.endStr)
xssNew = append(xssNew, xs)
xsPrev = xs
}
if !math.IsInf(xsPrev.end, 1) {
xssNew = append(xssNew, x{
endStr: "+Inf",
end: math.Inf(1),
ts: copyTS(xsPrev.ts, "+Inf"),
})
}
xss = xssNew
for i := range xss[0].ts.Values {
count := float64(0)
for _, xs := range xss {
ts := xs.ts
v := ts.Values[i]
if !math.IsNaN(v) && v > 0 {
count += v
}
ts.Values[i] = count
}
}
for _, xs := range xss {
rvs = append(rvs, xs.ts)
}
}
return rvs
}
func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
args := tfa.args
if err := expectTransformArgsNum(args, 2); err != nil {
@@ -282,6 +408,9 @@ func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
return nil, err
}
// Convert buckets with `vmrange` labels to buckets with `le` labels.
tss := vmrangeBucketsToLE(args[1])
// Group metrics by all tags excluding "le"
type x struct {
le float64
@@ -289,7 +418,7 @@ func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
}
m := make(map[string][]x)
bb := bbPool.Get()
for _, ts := range args[1] {
for _, ts := range tss {
tagValue := ts.MetricName.GetTagValue("le")
if len(tagValue) == 0 {
continue
@@ -313,18 +442,16 @@ func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
lastNonInf := func(i int, xss []x) float64 {
for len(xss) > 0 {
xsLast := xss[len(xss)-1]
if xsLast.ts.Values[i] == 0 {
v := xsLast.ts.Values[i]
if v == 0 {
return nan
}
if !math.IsInf(xsLast.le, 0) {
break
if !math.IsNaN(v) && !math.IsInf(xsLast.le, 0) {
return xsLast.le
}
xss = xss[:len(xss)-1]
}
if len(xss) == 0 {
return nan
}
return xss[len(xss)-1].le
return nan
}
quantile := func(i int, phis []float64, xss []x) float64 {
phi := phis[i]
@@ -337,13 +464,21 @@ func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
vPrev := float64(0)
for _, xs := range xss {
v := xs.ts.Values[i]
if math.IsNaN(v) || v < vPrev {
if v < vPrev {
xs.ts.Values[i] = vPrev
} else {
} else if !math.IsNaN(v) {
vPrev = v
}
}
if len(xss) == 0 {
vLast := nan
for len(xss) > 0 {
vLast = xss[len(xss)-1].ts.Values[i]
if !math.IsNaN(vLast) {
break
}
xss = xss[:len(xss)-1]
}
if vLast == 0 || math.IsNaN(vLast) {
return nan
}
if phi < 0 {
@@ -352,16 +487,22 @@ func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
if phi > 1 {
return inf
}
vLast := xss[len(xss)-1].ts.Values[i]
if vLast == 0 {
return nan
}
vReq := vLast * phi
vPrev = 0
lePrev := float64(0)
for _, xs := range xss {
v := xs.ts.Values[i]
if math.IsNaN(v) {
// Skip NaNs - they may appear if the selected time range
// contains multiple different bucket sets.
continue
}
le := xs.le
if v <= 0 {
// Skip zero buckets.
lePrev = le
continue
}
if v < vReq {
vPrev = v
lePrev = le
@@ -388,7 +529,6 @@ func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
}
rvs = append(rvs, dst)
}
return rvs, nil
}
@@ -1121,7 +1261,10 @@ func transformTimestamp(tfa *transformFuncArg) ([]*timeseries, error) {
ts.MetricName.ResetMetricGroup()
values := ts.Values
for i, t := range ts.Timestamps {
values[i] = float64(t) / 1e3
v := values[i]
if !math.IsNaN(v) {
values[i] = float64(t) / 1e3
}
}
}
return rvs, nil

View File

@@ -24,6 +24,9 @@ var (
// DataPath is a path to storage data.
DataPath = flag.String("storageDataPath", "victoria-metrics-data", "Path to storage data")
bigMergeConcurrency = flag.Int("bigMergeConcurrency", 0, "The maximum number of CPU cores to use for big merges. Default value is used if set to 0")
smallMergeConcurrency = flag.Int("smallMergeConcurrency", 0, "The maximum number of CPU cores to use for small merges. Default value is used if set to 0")
)
// Init initializes vmstorage.
@@ -39,6 +42,10 @@ func InitWithoutMetrics() {
if err := encoding.CheckPrecisionBits(uint8(*precisionBits)); err != nil {
logger.Fatalf("invalid `-precisionBits`: %s", err)
}
storage.SetBigMergeWorkersCount(*bigMergeConcurrency)
storage.SetSmallMergeWorkersCount(*smallMergeConcurrency)
logger.Infof("opening storage at %q with retention period %d months", *DataPath, *retentionPeriod)
startTime := time.Now()
WG = syncwg.WaitGroup{}
@@ -298,6 +305,9 @@ func registerStorageMetrics() {
return float64(idbm().PartsRefCount)
})
metrics.NewGauge(`vm_new_timeseries_created_total`, func() float64 {
return float64(idbm().NewTimeseriesCreated)
})
metrics.NewGauge(`vm_missing_tsids_for_metric_id_total`, func() float64 {
return float64(idbm().MissingTSIDsForMetricID)
})
@@ -313,6 +323,12 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_date_metric_ids_search_hits_total`, func() float64 {
return float64(idbm().DateMetricIDsSearchHits)
})
metrics.NewGauge(`vm_index_blocks_with_metric_ids_processed_total`, func() float64 {
return float64(idbm().IndexBlocksWithMetricIDsProcessed)
})
metrics.NewGauge(`vm_index_blocks_with_metric_ids_incorrect_order_total`, func() float64 {
return float64(idbm().IndexBlocksWithMetricIDsIncorrectOrder)
})
metrics.NewGauge(`vm_assisted_merges_total{type="storage/small"}`, func() float64 {
return float64(tm().SmallAssistedMerges)
@@ -391,6 +407,24 @@ func registerStorageMetrics() {
return float64(idbm().ItemsCount)
})
metrics.NewGauge(`vm_date_range_search_calls_total`, func() float64 {
return float64(idbm().DateRangeSearchCalls)
})
metrics.NewGauge(`vm_date_range_hits_total`, func() float64 {
return float64(idbm().DateRangeSearchHits)
})
metrics.NewGauge(`vm_missing_metric_names_for_metric_id_total`, func() float64 {
return float64(idbm().MissingMetricNamesForMetricID)
})
metrics.NewGauge(`vm_date_metric_id_cache_syncs_total`, func() float64 {
return float64(m().DateMetricIDCacheSyncsCount)
})
metrics.NewGauge(`vm_date_metric_id_cache_resets_total`, func() float64 {
return float64(m().DateMetricIDCacheResetsCount)
})
metrics.NewGauge(`vm_cache_entries{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheSize)
})
@@ -440,6 +474,9 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_size_bytes{type="storage/date_metricID"}`, func() float64 {
return float64(m().DateMetricIDCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/hour_metric_ids"}`, func() float64 {
return float64(m().HourMetricIDCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="indexdb/tagFilters"}`, func() float64 {
return float64(idbm().TagCacheSizeBytes)
})
@@ -456,9 +493,6 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_requests_total{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/date_metricID"}`, func() float64 {
return float64(m().DateMetricIDCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/bigIndexBlocks"}`, func() float64 {
return float64(tm().BigIndexBlocksCacheRequests)
})
@@ -490,9 +524,6 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_misses_total{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/date_metricID"}`, func() float64 {
return float64(m().DateMetricIDCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/bigIndexBlocks"}`, func() float64 {
return float64(tm().BigIndexBlocksCacheMisses)
})
@@ -525,7 +556,4 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_collisions_total{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheCollisions)
})
metrics.NewGauge(`vm_cache_collisions_total{type="storage/date_metricID"}`, func() float64 {
return float64(m().DateMetricIDCacheCollisions)
})
}

File diff suppressed because it is too large

View File

@@ -1,6 +1,6 @@
DOCKER_NAMESPACE := victoriametrics
BUILDER_IMAGE := local/builder:go1.12.9
CERTS_IMAGE := local/certs:1.0.2
BUILDER_IMAGE := local/builder:go1.13.4
CERTS_IMAGE := local/certs:1.0.3
package-certs:
(docker image ls --format '{{.Repository}}:{{.Tag}}' | grep -q '$(CERTS_IMAGE)') \
@@ -21,7 +21,8 @@ app-via-docker: package-certs package-builder
--env GO111MODULE=on \
$(DOCKER_OPTS) \
$(BUILDER_IMAGE) \
go build $(RACE) -mod=vendor -ldflags "-s -w -extldflags '-static' $(GO_BUILDINFO)" -tags 'netgo osusergo' -o bin/$(APP_NAME)$(APP_SUFFIX)-prod $(PKG_PREFIX)/app/$(APP_NAME)
go build $(RACE) -mod=vendor -trimpath -ldflags "-s -w -extldflags '-static' $(GO_BUILDINFO)" -tags 'netgo osusergo' \
-o bin/$(APP_NAME)$(APP_SUFFIX)-prod $(PKG_PREFIX)/app/$(APP_NAME)
package-via-docker:
(docker image ls --format '{{.Repository}}:{{.Tag}}' | grep -q '$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE)') || (\

View File

@@ -1,2 +1,2 @@
FROM golang:1.12.9
FROM golang:1.13.4
STOPSIGNAL SIGINT

View File

@@ -1,3 +1,3 @@
# See https://medium.com/on-docker/use-multi-stage-builds-to-inject-ca-certs-ad1e8f01de1b
FROM alpine:3.9 as certs
FROM alpine:3.10 as certs
RUN apk --update add ca-certificates

View File

@@ -2,7 +2,7 @@ version: '3.5'
services:
prometheus:
container_name: prometheus
image: prom/prometheus:v2.10.0
image: prom/prometheus:v2.14.0
depends_on:
- "victoriametrics"
ports:
@@ -35,7 +35,7 @@ services:
restart: always
grafana:
container_name: grafana
image: grafana/grafana:6.2.1
image: grafana/grafana:6.5.0
entrypoint: >
/bin/sh -c "
cd /var/lib/grafana &&

View File

@@ -5,10 +5,10 @@ datasources:
type: prometheus
access: proxy
url: http://prometheus:9090
isDefault: false
isDefault: true
- name: VictoriaMetrics
type: prometheus
access: proxy
url: http://victoriametrics:8428
isDefault: true
isDefault: false

22
docs/Articles.md Normal file
View File

@@ -0,0 +1,22 @@
# Articles
* [Open-sourcing VictoriaMetrics](https://medium.com/@valyala/open-sourcing-victoriametrics-f31e34485c2b)
* [How we created VictoriaMetrics](https://medium.com/devopslinks/victoriametrics-creating-the-best-remote-storage-for-prometheus-5d92d66787ac)
* [VictoriaMetrics vs TimescaleDB vs InfluxDB benchmarks on 40K unique time series](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
* [VictoriaMetrics vs TimescaleDB vs InfluxDB benchmarks on 400K, 4M and 40M unique time series](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
* [Insert benchmarks for VictoriaMetrics vs InfluxDB on high-cardinality data](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893)
* [Measuring vertical scalability for time series databases in Google Cloud](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
* [How VictoriaMetrics creates instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
* [Prometheus Subqueries in VictoriaMetrics](https://medium.com/@valyala/prometheus-subqueries-in-victoriametrics-9b1492b720b3)
* [Why irate from Prometheus doesn't capture spikes](https://medium.com/@valyala/why-irate-from-prometheus-doesnt-capture-spikes-45f9896d7832)
* [Why mmap'ed files in Go may hurt performance](https://medium.com/@valyala/mmap-in-go-considered-harmful-d92a25cb161d)
* [WAL Usage Looks Broken in Modern TSDBs](https://medium.com/@valyala/wal-usage-looks-broken-in-modern-time-series-databases-b62a627ab704)
* [Analyzing Prometheus data with external tools](https://medium.com/@valyala/analyzing-prometheus-data-with-external-tools-5f3e5e147639)
* [Stripping dependency bloat in VictoriaMetrics Docker image](https://medium.com/@valyala/stripping-dependency-bloat-in-victoriametrics-docker-image-983fb5912b0d)
* [PromQL tutorial for beginners](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085)
* [Achieving better compression for time series data than Gorilla](https://medium.com/@valyala/victoriametrics-achieving-better-compression-for-time-series-data-than-gorilla-317bc1f95932)
* [Comparing Thanos to VictoriaMetrics cluster](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683)
* [Speeding up backups for big time series databases](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883)
* [Evaluating performance and correctness: VictoriaMetrics response](https://medium.com/@valyala/evaluating-performance-and-correctness-victoriametrics-response-e27315627e87)
* [Improving histogram usability for Prometheus and Grafana](https://medium.com/@valyala/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350)
* [Prometheus storage: tech terms for humans](https://medium.com/@valyala/prometheus-storage-technical-terms-for-humans-4ab4de6c3d48)

View File

@@ -0,0 +1,326 @@
# Cluster version
VictoriaMetrics is a fast, cost-effective and scalable time series database. It can be used as a long-term remote storage for Prometheus.
It is recommended to use the [single-node version](https://github.com/VictoriaMetrics/VictoriaMetrics) instead of the cluster version
for ingestion rates lower than 10 million data points per second.
The single-node version [scales perfectly](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
with the number of CPU cores, RAM and available storage space.
The single-node version is easier to configure and operate compared to the cluster version, so think twice before choosing the cluster version.
Join [our Slack](http://slack.victoriametrics.com/) or [contact us](mailto:info@victoriametrics.com) with consulting and support questions.
## Prominent features
- Supports all the features of [single-node version](https://github.com/VictoriaMetrics/VictoriaMetrics).
- Performance and capacity scale horizontally.
- Supports multiple independent namespaces for time series data (aka multi-tenancy).
## Architecture overview
VictoriaMetrics cluster consists of the following services:
- `vmstorage` - stores the data
- `vminsert` - proxies the ingested data to `vmstorage` shards using consistent hashing
- `vmselect` - performs incoming queries using the data from `vmstorage`
Each service may scale independently and may run on the most suitable hardware.
<img src="https://docs.google.com/drawings/d/e/2PACX-1vTvk2raU9kFgZ84oF-OKolrGwHaePhHRsZEcfQ1I_EC5AB_XPWwB392XshxPramLJ8E4bqptTnFn5LL/pub?w=1104&amp;h=746">
## Binaries
Compiled binaries for the cluster version are available in the `assets` section of the [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases).
Look for archives containing the word `cluster`.
Docker images for cluster version are available here:
- `vminsert` - https://hub.docker.com/r/victoriametrics/vminsert/tags
- `vmselect` - https://hub.docker.com/r/victoriametrics/vmselect/tags
- `vmstorage` - https://hub.docker.com/r/victoriametrics/vmstorage/tags
## Building from sources
Source code for cluster version is available at [cluster branch](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
### Development builds
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
2. Run `make` from the repository root. It should build `vmstorage`, `vmselect`
and `vminsert` binaries and put them into the `bin` folder.
### Production builds
There is no need to install Go on the host system, since binaries are built
inside [the official docker container for Go](https://hub.docker.com/_/golang).
This makes builds reproducible.
So [install docker](https://docs.docker.com/install/) and run the following command:
```
make vminsert-prod vmselect-prod vmstorage-prod
```
Production binaries are built into statically linked binaries for `GOARCH=amd64`, `GOOS=linux`.
They are put into `bin` folder with `-prod` suffixes:
```
$ make vminsert-prod vmselect-prod vmstorage-prod
$ ls -1 bin
vminsert-prod
vmselect-prod
vmstorage-prod
```
### Building docker images
Run `make package`. It will build the following docker images locally:
* `victoriametrics/vminsert:<PKG_TAG>`
* `victoriametrics/vmselect:<PKG_TAG>`
* `victoriametrics/vmstorage:<PKG_TAG>`
`<PKG_TAG>` is an auto-generated image tag, which depends on the source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package`.
## Operation
### Cluster setup
A minimal cluster must contain the following nodes:
* a single `vmstorage` node with `-retentionPeriod` and `-storageDataPath` flags
* a single `vminsert` node with `-storageNode=<vmstorage_host>:8400`
* a single `vmselect` node with `-storageNode=<vmstorage_host>:8401`
It is recommended to run at least two nodes for each service
for high availability purposes.
An HTTP load balancer must be put in front of `vminsert` and `vmselect` nodes:
- requests starting with `/insert` must be routed to port `8480` on `vminsert` nodes.
- requests starting with `/select` must be routed to port `8481` on `vmselect` nodes.
Ports may be altered by setting `-httpListenAddr` on the corresponding nodes.
It is recommended to set up [monitoring](#monitoring) for the cluster.
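The routing rule above can be sketched in Go as a small lookup from request path to default backend port. This is only an illustration of the prefix rule, not a load balancer; the function name `backendPort` is hypothetical, and the ports are the defaults mentioned above (they change if `-httpListenAddr` is altered).

```go
package main

import (
	"fmt"
	"strings"
)

// backendPort returns the default port a request path should be routed to.
// The name is hypothetical; the prefixes and ports come from the text above.
// The second return value is false when the path matches neither prefix.
func backendPort(path string) (int, bool) {
	switch {
	case strings.HasPrefix(path, "/insert"):
		return 8480, true // vminsert nodes
	case strings.HasPrefix(path, "/select"):
		return 8481, true // vmselect nodes
	default:
		return 0, false
	}
}

func main() {
	paths := []string{
		"/insert/1/prometheus",
		"/select/1/prometheus/api/v1/query",
		"/metrics",
	}
	for _, p := range paths {
		port, ok := backendPort(p)
		fmt.Println(p, port, ok)
	}
}
```

A real setup would normally express the same rule in the load balancer's own configuration rather than in application code.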
### Monitoring
All the cluster components expose various metrics in Prometheus-compatible format at the `/metrics` page on the TCP port set in the `-httpListenAddr` command-line flag.
By default the following TCP ports are used:
- `vminsert` - 8480
- `vmselect` - 8481
- `vmstorage` - 8482
It is recommended to set up Prometheus to scrape `/metrics` pages from all the cluster components, so they can be monitored and analyzed
with [the official Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176).
### URL format
* URLs for data ingestion: `http://<vminsert>:8480/insert/<accountID>/<suffix>`, where:
- `<accountID>` is an arbitrary number identifying namespace for data ingestion (aka tenant)
- `<suffix>` may have the following values:
- `prometheus` - for inserting data with [Prometheus remote write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
- `influx/write` or `influx/api/v2/write` - for inserting data with [Influx line protocol](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/)
* URLs for querying: `http://<vmselect>:8481/select/<accountID>/prometheus/<suffix>`, where:
- `<accountID>` is an arbitrary number identifying data namespace for the query (aka tenant)
- `<suffix>` may have the following values:
- `api/v1/query` - performs [PromQL instant query](https://prometheus.io/docs/prometheus/latest/querying/api/#instant-queries)
- `api/v1/query_range` - performs [PromQL range query](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries)
- `api/v1/series` - performs [series query](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers)
- `api/v1/labels` - returns a [list of label names](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names)
- `api/v1/label/<label_name>/values` - returns values for the given `<label_name>` according [to API](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values)
- `federate` - returns [federated metrics](https://prometheus.io/docs/prometheus/latest/federation/)
- `api/v1/export` - exports raw data. See [this article](https://medium.com/@valyala/analyzing-prometheus-data-with-external-tools-5f3e5e147639) for details
* URL for time series deletion: `http://<vmselect>:8481/delete/<accountID>/prometheus/api/v1/admin/tsdb/delete_series?match[]=<timeseries_selector_for_delete>`.
Note that the `delete_series` handler should be used only in exceptional cases such as deletion of accidentally ingested incorrect time series. It shouldn't
be used on a regular basis, since it carries non-zero overhead.
* `vmstorage` nodes provide the following HTTP endpoints on `8482` port:
- `/snapshot/create` - create [instant snapshot](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282),
which can be used for backups in background. Snapshots are created in `<storageDataPath>/snapshots` folder, where `<storageDataPath>` is the corresponding
command-line flag value.
- `/snapshot/list` - list available snapshots.
- `/snapshot/delete?snapshot=<id>` - delete the given snapshot.
- `/snapshot/delete_all` - delete all the snapshots.
Snapshots may be created independently on each `vmstorage` node. There is no need to synchronize snapshot creation
across `vmstorage` nodes.
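As a minimal sketch of the URL layout described above, the following Go helpers assemble ingestion and query URLs from a host, a tenant and a suffix. The helper names, the host names and the `uint32` type for `<accountID>` are assumptions made for illustration; the default ports `8480` and `8481` are taken from the text.

```go
package main

import "fmt"

// insertURL builds a data ingestion URL for the given tenant.
// suffix is one of the ingestion suffixes listed above, e.g. "prometheus".
func insertURL(vminsertHost string, accountID uint32, suffix string) string {
	return fmt.Sprintf("http://%s:8480/insert/%d/%s", vminsertHost, accountID, suffix)
}

// selectURL builds a query URL for the given tenant.
// suffix is one of the query suffixes listed above, e.g. "api/v1/query".
func selectURL(vmselectHost string, accountID uint32, suffix string) string {
	return fmt.Sprintf("http://%s:8481/select/%d/prometheus/%s", vmselectHost, accountID, suffix)
}

func main() {
	// "vminsert-1" and "vmselect-1" are placeholder host names.
	fmt.Println(insertURL("vminsert-1", 42, "prometheus"))
	fmt.Println(selectURL("vmselect-1", 42, "api/v1/query_range"))
}
```

With these helpers, tenant `42` writes to `/insert/42/prometheus` and reads from `/select/42/prometheus/api/v1/query_range`, so data from different tenants stays in separate namespaces.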
### Cluster resizing and scalability
Cluster performance and capacity scale as new nodes are added.
* `vminsert` and `vmselect` nodes are stateless and may be added / removed at any time.
Do not forget to update the list of these nodes on the HTTP load balancer.
Adding more `vminsert` nodes scales data ingestion rate. See [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175#issuecomment-536925841)
about ingestion rate scalability.
Adding more `vmselect` nodes scales the rate of select queries.
* `vmstorage` nodes own the ingested data, so they cannot be removed without data loss.
Adding more `vmstorage` nodes scales cluster capacity.
Steps to add `vmstorage` node:
1. Start new `vmstorage` node with the same `-retentionPeriod` as existing nodes in the cluster.
2. Gradually restart all the `vmselect` nodes with new `-storageNode` arg containing `<new_vmstorage_host>:8401`.
3. Gradually restart all the `vminsert` nodes with new `-storageNode` arg containing `<new_vmstorage_host>:8400`.
### Cluster availability
* HTTP load balancer must stop routing requests to unavailable `vminsert` and `vmselect` nodes.
* The cluster remains available if at least a single `vmstorage` node exists:
- `vminsert` re-routes incoming data from unavailable `vmstorage` nodes to healthy `vmstorage` nodes
- `vmselect` continues serving partial responses if at least a single `vmstorage` node is available.
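The re-routing behaviour can be illustrated with a deliberately simplified Go sketch: pick a node by hashing the series key, then fall back to the next healthy node if the preferred one is down. This uses plain modulo hashing with linear probing, which is an assumption chosen for brevity and is not the consistent hashing scheme `vminsert` actually uses; `pickNode` and the key format are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickNode returns the index of the storage node that should receive the
// series identified by key. healthy[i] reports whether node i is available.
// It returns -1 when no node is healthy.
//
// Simplified illustration: modulo hashing plus linear probing,
// NOT the algorithm used by vminsert.
func pickNode(key string, healthy []bool) int {
	n := len(healthy)
	if n == 0 {
		return -1
	}
	h := fnv.New64a()
	h.Write([]byte(key))
	start := int(h.Sum64() % uint64(n))
	// Probe nodes starting from the preferred one until a healthy node is found.
	for i := 0; i < n; i++ {
		idx := (start + i) % n
		if healthy[idx] {
			return idx
		}
	}
	return -1
}

func main() {
	nodes := []bool{true, true, true}
	preferred := pickNode(`cpu_usage{host="a"}`, nodes)
	nodes[preferred] = false // simulate an outage of the preferred node
	fallback := pickNode(`cpu_usage{host="a"}`, nodes)
	fmt.Println("preferred:", preferred, "fallback:", fallback)
}
```

As long as at least one entry in `healthy` is true, every key still maps to some node, which mirrors the availability property stated above.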
### Updating / reconfiguring cluster nodes
All the node types - `vminsert`, `vmselect` and `vmstorage` - may be updated via graceful shutdown.
Send `SIGINT` signal to the corresponding process, wait until it finishes and then start new version
with new configs.
Cluster should remain in working state if at least a single node of each type remains available during
the update process. See [cluster availability](#cluster-availability) section for details.
### Capacity planning
Each instance type - `vminsert`, `vmselect` and `vmstorage` - can run on the most suitable hardware.
#### vminsert
* The recommended total number of vCPU cores for all the `vminsert` instances can be calculated from the ingestion rate: `vCPUs = ingestion_rate / 150K`.
* The recommended number of vCPU cores per `vminsert` instance should equal the number of `vmstorage` instances in the cluster.
* The amount of RAM per each `vminsert` instance should be 1GB or more. RAM is used as a buffer for spikes in ingestion rate.
* Sometimes `-rpc.disableCompression` command-line flag on `vminsert` instances could increase ingestion capacity at the cost
of higher network bandwidth usage between `vminsert` and `vmstorage`.
#### vmstorage
* The recommended total number of vCPU cores for all the `vmstorage` instances can be calculated from the ingestion rate: `vCPUs = ingestion_rate / 150K`.
* The recommended total amount of RAM for all the `vmstorage` instances can be calculated from the number of active time series: `RAM = active_time_series * 1KB`.
A time series is active if it received at least a single data point during the last hour or if it has been queried during the last hour.
* The recommended total amount of storage space for all the `vmstorage` instances can be calculated
from the ingestion rate and retention: `storage_space = ingestion_rate * retention_seconds`.
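The `vmstorage` rules of thumb above can be turned into a small Go calculator. This is a sketch under stated assumptions: the type and function names are hypothetical, `1KB` is taken as 1024 bytes, and the storage formula is reported as a number of stored data points, because the text gives no bytes-per-data-point factor for converting it to bytes.

```go
package main

import (
	"fmt"
	"math"
)

// storageEstimate holds totals across all vmstorage instances.
type storageEstimate struct {
	VCPUs      int     // ingestion_rate / 150K, rounded up
	RAMBytes   float64 // active_time_series * 1KB (assumed to be 1024 bytes)
	DataPoints float64 // ingestion_rate * retention_seconds
}

// estimateVMStorage applies the formulas from the text.
// ingestionRate is in data points per second.
func estimateVMStorage(ingestionRate, activeSeries, retentionSeconds float64) storageEstimate {
	return storageEstimate{
		VCPUs:      int(math.Ceil(ingestionRate / 150e3)),
		RAMBytes:   activeSeries * 1024,
		DataPoints: ingestionRate * retentionSeconds,
	}
}

func main() {
	// Example input: 1.5M data points/s, 1M active series, 30 days of retention.
	e := estimateVMStorage(1.5e6, 1e6, 30*24*3600)
	fmt.Printf("vCPUs=%d RAM=%.0f bytes dataPoints=%.0f\n", e.VCPUs, e.RAMBytes, e.DataPoints)
}
```

To get a disk size, multiply `DataPoints` by the average on-disk size of a data point measured on your own workload.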
#### vmselect
The recommended hardware for `vmselect` instances highly depends on the type of queries. Lightweight queries over small number of time series usually require
small number of vCPU cores and small amount of RAM on `vmselect`, while heavy queries over big number of time series (>10K) usually require
bigger number of vCPU cores and bigger amounts of RAM.
### Helm
Helm chart simplifies managing cluster version of VictoriaMetrics in Kubernetes.
It is available in the [helm-charts](https://github.com/VictoriaMetrics/helm-charts) repository.
Upgrades follow the procedure from the `Cluster resizing and scalability` section under the hood.
### Replication and data safety
VictoriaMetrics offloads replication to the underlying storage pointed to by `-storageDataPath`.
It is recommended to store data on [Google Compute Engine persistent disks](https://cloud.google.com/compute/docs/disks/#pdspecs),
since they are protected from data loss and data corruption. They also provide consistently high performance
and [may be resized](https://cloud.google.com/compute/docs/disks/add-persistent-disk) without downtime.
HDD-based persistent disks should be enough for the majority of use cases.
It is recommended to use durable replicated persistent volumes in Kubernetes.
Note that [replication doesn't save from disaster](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883).
### Backups
It is recommended to perform periodic backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
for protecting from user errors such as accidental data deletion.
The following steps must be performed for each `vmstorage` node for creating a backup:
1. Create an instant snapshot by requesting the `/snapshot/create` HTTP handler. It creates a snapshot and returns its name.
2. Archive the created snapshot from `<-storageDataPath>/snapshots/<snapshot_name>` folder using [vmbackup](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/app/vmbackup/README.md).
The archival process doesn't interfere with `vmstorage` work, so it may be performed at any suitable time.
3. Delete unused snapshots via `/snapshot/delete?snapshot=<snapshot_name>` or `/snapshot/delete_all` in order to free up occupied storage space.
There is no need to synchronize backups among all the `vmstorage` nodes.
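The addresses and paths involved in the backup steps above can be sketched in Go as follows. The helper names, the host name, the data path and the snapshot name are all hypothetical; only the handler paths, the default port `8482` and the `snapshots` folder layout come from the text. The sketch builds strings only and does not contact any server.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// snapshotCreateURL returns the URL that creates an instant snapshot (step 1).
func snapshotCreateURL(vmstorageHost string) string {
	return fmt.Sprintf("http://%s:8482/snapshot/create", vmstorageHost)
}

// snapshotDir returns the folder to archive with vmbackup (step 2).
func snapshotDir(storageDataPath, snapshotName string) string {
	return path.Join(storageDataPath, "snapshots", snapshotName)
}

// snapshotDeleteURL returns the URL that deletes the given snapshot (step 3).
func snapshotDeleteURL(vmstorageHost, snapshotName string) string {
	return fmt.Sprintf("http://%s:8482/snapshot/delete?snapshot=%s",
		vmstorageHost, url.QueryEscape(snapshotName))
}

func main() {
	// Placeholder values for illustration only.
	host, dataPath, name := "vmstorage-1", "/var/lib/vmstorage", "example-snapshot"
	fmt.Println(snapshotCreateURL(host))
	fmt.Println(snapshotDir(dataPath, name))
	fmt.Println(snapshotDeleteURL(host, name))
}
```

In practice the snapshot name is whatever `/snapshot/create` returns, and the same three steps are repeated for each `vmstorage` node.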
Restoring from backup:
1. Stop `vmstorage` node with `kill -INT`.
2. Restore data from backup using [vmrestore](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/app/vmrestore/README.md) into `-storageDataPath` directory.
3. Start `vmstorage` node.
## Community and contributions
We are open to third-party pull requests provided they follow [KISS design principle](https://en.wikipedia.org/wiki/KISS_principle):
- Prefer simple code and architecture.
- Avoid complex abstractions.
- Avoid magic code and fancy algorithms.
- Avoid [big external dependencies](https://medium.com/@valyala/stripping-dependency-bloat-in-victoriametrics-docker-image-983fb5912b0d).
- Minimize the number of moving parts in the distributed system.
- Avoid automated decisions, which may hurt cluster availability, consistency or performance.
Adhering to the `KISS` principle simplifies the resulting code and architecture, so it can be reviewed, understood and verified by many people.
Due to `KISS`, the cluster version of VictoriaMetrics lacks the following "features" popular in the distributed computing world:
- Fragile gossip protocols. See [failed attempt in Thanos](https://github.com/improbable-eng/thanos/blob/030bc345c12c446962225221795f4973848caab5/docs/proposals/completed/201809_gossip-removal.md).
- Hard-to-understand-and-implement-properly [Paxos protocols](https://www.quora.com/In-distributed-systems-what-is-a-simple-explanation-of-the-Paxos-algorithm).
- Complex replication schemes, which may go nuts in unforeseen edge cases. The replication is offloaded to the underlying durable replicated storage
such as [persistent disks in Google Compute Engine](https://cloud.google.com/compute/docs/disks/#pdspecs).
- Automatic data reshuffling between storage nodes, which may hurt cluster performance and availability.
- Automatic cluster resizing, which may cost you a lot of money if improperly configured.
- Automatic discovery and addition of new nodes in the cluster, which may mix data between dev and prod clusters :)
- Automatic leader election, which may result in split brain disaster on network errors.
## Reporting bugs
Report bugs and propose new features [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues).
## Victoria Metrics Logo
[Zip](VM_logo.zip) contains three folders with different image orientation (main color and inverted version).
Files included in each folder:
* 2 JPEG Preview files
* 2 PNG Preview files with transparent background
* 2 EPS Adobe Illustrator EPS10 files
### Logo Usage Guidelines
#### Font used:
* Lato Black
* Lato Regular
#### Color Palette:
* HEX [#110f0f](https://www.color-hex.com/color/110f0f)
* HEX [#ffffff](https://www.color-hex.com/color/ffffff)
### We kindly ask:
- Please don't use any font other than the suggested ones.
- There should be sufficient clear space around the logo.
- Do not change spacing, alignment, or relative locations of the design elements.
- Do not change the proportions of any of the design elements or the design itself. You may resize as needed but must retain all proportions.

63
docs/ExtendedPromQL.md Normal file
View File

@@ -0,0 +1,63 @@
# Extended PromQL
VictoriaMetrics supports [standard PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/)
including [subqueries](https://prometheus.io/blog/2019/01/28/subquery-support/).
Additionally it supports useful extensions mentioned below.
Try these extensions on [an editable Grafana dashboard](http://play-grafana.victoriametrics.com:3000/d/4ome8yJmz/node-exporter-on-victoriametrics-demo).
- [`WITH` templates](https://play.victoriametrics.com/promql/expand-with-exprs). This feature simplifies writing and managing complex queries. Go to [`WITH` templates playground](https://victoriametrics.com/promql/expand-with-exprs) and try it.
- Metric names and metric labels may contain escaped chars. For instance, `foo\-bar{baz\=aa="b"}` is a valid expression. It returns time series with the name `foo-bar` containing the label `baz=aa` with the value `b`. Additionally, the `\xXX` escape sequence is supported, where `XX` is the hexadecimal representation of the escaped char.
- `offset`, range duration and step value for range vector may refer to the current step aka `$__interval` value from Grafana.
For instance, `rate(metric[10i] offset 5i)` would return per-second rate over a range covering 10 previous steps with the offset of 5 steps.
- `default` binary operator. `q1 default q2` substitutes `NaN` values from `q1` with the corresponding values from `q2`.
- `if` binary operator. `q1 if q2` removes values from `q1` for `NaN` values from `q2`.
- `ifnot` binary operator. `q1 ifnot q2` removes values from `q1` for non-`NaN` values from `q2`.
- `offset` may be put anywhere in the query. For instance, `sum(foo) offset 24h`.
- Trailing commas on all the lists are allowed - label filters, function args and with expressions. For instance, the following queries are valid: `m{foo="bar",}`, `f(a, b,)`, `WITH (x=y,) x`. This simplifies maintenance of multi-line queries.
- String literals may be concatenated. This is useful with `WITH` templates: `WITH (commonPrefix="long_metric_prefix_") {__name__=commonPrefix+"suffix1"} / {__name__=commonPrefix+"suffix2"}`.
- Range duration in functions such as [rate](https://prometheus.io/docs/prometheus/latest/querying/functions/#rate()) may be omitted. VictoriaMetrics automatically selects range duration depending on the current step used for building the graph. For instance, the following query is valid in VictoriaMetrics: `rate(node_network_receive_bytes_total)`.
- [Range duration](https://prometheus.io/docs/prometheus/latest/querying/basics/#range-vector-selectors) and [offset](https://prometheus.io/docs/prometheus/latest/querying/basics/#offset-modifier) may be fractional. For instance, `rate(node_network_receive_bytes_total[1.5m] offset 0.5d)`.
- Comments starting with `#` and ending with newline. For instance, `up # this is a comment for 'up' metric`.
- Rollup functions - `rollup(m[d])`, `rollup_rate(m[d])`, `rollup_deriv(m[d])`, `rollup_increase(m[d])`, `rollup_delta(m[d])` - return `min`, `max` and `avg`
values for all the `m` data points over `d` duration.
- `rollup_candlestick(m[d])` - returns `open`, `close`, `low` and `high` values (OHLC) for all the `m` data points over `d` duration. This function is useful for financial applications.
- `union(q1, ... qN)` function for building multiple graphs for `q1`, ... `qN` subqueries with a single query. The `union` function name may be skipped -
the following queries are equivalent: `union(q1, q2)` and `(q1, q2)`.
- `ru(freeResources, maxResources)` function for returning resource utilization percentage in the range `0% - 100%`. For instance, `ru(node_memory_MemFree_bytes, node_memory_MemTotal_bytes)` returns memory utilization over [node_exporter](https://github.com/prometheus/node_exporter) metrics.
- `ttf(slowlyChangingFreeResources)` function for returning the time in seconds until the given `slowlyChangingFreeResources` expression reaches zero. For instance, `ttf(node_filesystem_avail_bytes)` returns the time to storage space exhaustion. This function may be useful for capacity planning.
- Functions for label manipulation:
  - `alias(q, name)` for setting metric name across all the time series `q`.
  - `label_set(q, label1, value1, ... labelN, valueN)` for setting the given values for the given labels on `q`.
  - `label_del(q, label1, ... labelN)` for deleting the given labels from `q`.
  - `label_keep(q, label1, ... labelN)` for deleting all the labels except the given labels from `q`.
  - `label_copy(q, src_label1, dst_label1, ... src_labelN, dst_labelN)` for copying label values from `src_*` to `dst_*`.
  - `label_move(q, src_label1, dst_label1, ... src_labelN, dst_labelN)` for moving label values from `src_*` to `dst_*`.
  - `label_transform(q, label, regexp, replacement)` for replacing all the `regexp` occurrences with `replacement` in the `label` values from `q`.
  - `label_value(q, label)` - returns numeric values for the given `label` from `q`.
- `step()` function for returning the step in seconds used in the query.
- `start()` and `end()` functions for returning the start and end timestamps of the `[start ... end]` range used in the query.
- `integrate(m[d])` for returning integral over the given duration `d` for the given metric `m`.
- `ideriv(m)` - for calculating `instant` derivative for `m`.
- `deriv_fast(m[d])` - for calculating `fast` derivative for `m` based on the first and the last points from duration `d`.
- `running_` functions - `running_sum`, `running_min`, `running_max`, `running_avg` - for calculating [running values](https://en.wikipedia.org/wiki/Running_total) on the selected time range.
- `range_` functions - `range_sum`, `range_min`, `range_max`, `range_avg`, `range_first`, `range_last`, `range_median`, `range_quantile` - for calculating global value over the selected time range.
- `smooth_exponential(q, sf)` - smooths `q` using [exponential moving average](https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average) with the given smooth factor `sf`.
- `remove_resets(q)` - removes counter resets from `q`.
- `lag(q[d])` - returns lag between the current timestamp and the timestamp from the previous data point in `q` over `d`.
- `lifetime(q[d])` - returns lifetime of `q` over `d` in seconds. It is expected that `d` exceeds the lifetime of `q`.
- `scrape_interval(q[d])` - returns the average interval in seconds between data points of `q` over `d` aka `scrape interval`.
- Trigonometric functions - `sin(q)`, `cos(q)`, `asin(q)`, `acos(q)` and `pi()`.
- `median_over_time(m[d])` - calculates median values for `m` over `d` time window. Shorthand for `quantile_over_time(0.5, m[d])`.
- `median(q)` - median aggregate. Shorthand for `quantile(0.5, q)`.
- `limitk(k, q)` - limits the number of time series returned from `q` to `k`.
- `keep_last_value(q)` - fills missing data (gaps) in `q` with the previous value.
- `distinct_over_time(m[d])` - returns distinct number of values for `m` data points over `d` duration.
- `distinct(q)` - returns a time series with the number of unique values for each timestamp in `q`.
- `sum2_over_time(m[d])` - returns sum of squares for all the `m` values over `d` duration.
- `sum2(q)` - returns a time series with sum of square values for each timestamp in `q`.
- `geomean_over_time(m[d])` - returns [geomean](https://en.wikipedia.org/wiki/Geometric_mean) value for all the `m` values over `d` duration.
- `geomean(q)` - returns a time series with [geomean](https://en.wikipedia.org/wiki/Geometric_mean) value for each timestamp in `q`.
- `rand()`, `rand_normal()` and `rand_exponential()` functions - for generating pseudo-random series with uniform, normal and exponential distribution.
- `increases_over_time(m[d])` and `decreases_over_time(m[d])` - return the number of `m` increases or decreases over the given duration `d`.
- `prometheus_buckets(q)` - converts [VictoriaMetrics histogram](https://godoc.org/github.com/VictoriaMetrics/metrics#Histogram) buckets to Prometheus buckets with `le` labels.
- `histogram(q)` - calculates aggregate histogram over `q` time series for each point on the graph.
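
The `default`, `if` and `ifnot` operators described above can be illustrated with a minimal Python sketch. It assumes two already aligned series represented as plain lists of floats with `NaN` marking missing points; the function names are hypothetical and the label matching that a real query engine performs is left out.

```python
import math

NaN = float("nan")


def default(q1, q2):
    """q1 default q2: replace NaN points in q1 with the matching points from q2."""
    return [b if math.isnan(a) else a for a, b in zip(q1, q2)]


def if_(q1, q2):
    """q1 if q2: drop points from q1 where q2 is NaN."""
    return [NaN if math.isnan(b) else a for a, b in zip(q1, q2)]


def ifnot(q1, q2):
    """q1 ifnot q2: drop points from q1 where q2 is not NaN."""
    return [a if math.isnan(b) else NaN for a, b in zip(q1, q2)]


q1 = [1.0, NaN, 3.0]
q2 = [NaN, 20.0, 30.0]
print(default(q1, q2))
print(if_(q1, q2))
print(ifnot(q1, q2))
```

Dropped points are shown as `NaN` here, which mirrors how gaps are described in the operator definitions.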
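
The difference between `rollup(m[d])` and `rollup_candlestick(m[d])` may be easier to see on concrete numbers. The sketch below is an illustration only: it assumes the raw values of `m` inside a single window `d` are given as a time-ordered Python list, and the helper names are hypothetical rather than taken from the VictoriaMetrics code.

```python
from statistics import mean


def rollup(values):
    """Return min, max and avg for the raw values inside one window."""
    return {"min": min(values), "max": max(values), "avg": mean(values)}


def rollup_candlestick(values):
    """Return open, close, low and high (OHLC) for time-ordered values in one window."""
    return {
        "open": values[0],    # first value in the window
        "close": values[-1],  # last value in the window
        "low": min(values),
        "high": max(values),
    }


window = [2.0, 5.0, 3.0, 4.0]
print(rollup(window))
print(rollup_candlestick(window))
```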

160
docs/FAQ.md Normal file

@@ -0,0 +1,160 @@
# FAQ
### What is the main purpose of VictoriaMetrics?
To provide the best long-term [remote storage](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage) solution for [Prometheus](https://prometheus.io/).
### Which features does VictoriaMetrics have?
* Supports [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/), so it can be used as Prometheus drop-in replacement in Grafana.
Additionally, VictoriaMetrics extends PromQL with opt-in [useful features](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/ExtendedPromQL).
* High performance and good scalability for both [inserts](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
and [selects](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4).
[Outperforms InfluxDB and TimescaleDB by up to 20x](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
* [Uses 10x less RAM than InfluxDB](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) when working with millions of unique time series (aka high cardinality).
* High data compression, so [up to 70x more data points](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
may be crammed into limited storage compared to TimescaleDB.
* Optimized for storage with high-latency IO and low iops (HDD and network storage in AWS, Google Cloud, Microsoft Azure, etc). See [graphs from these benchmarks](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b).
* A single-node VictoriaMetrics may substitute moderately sized clusters built with competing solutions such as Thanos, Uber M3, Cortex, InfluxDB or TimescaleDB.
See [vertical scalability benchmarks](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
and [comparing Thanos to VictoriaMetrics](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683).
* Easy operation:
  * VictoriaMetrics consists of a single executable without external dependencies.
  * All the configuration is done via explicit command-line flags with reasonable defaults.
  * All the data is stored in a single directory pointed to by the `-storageDataPath` flag.
  * Easy backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
  * Storage is protected from corruption on unclean shutdown (e.g. hardware reset or `kill -9`) thanks to [the storage architecture](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
* Supports metrics' ingestion and backfilling via the following protocols:
  * [Prometheus remote write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
  * [InfluxDB line protocol](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/)
  * [Graphite plaintext protocol](https://graphite.readthedocs.io/en/latest/feeding-carbon.html) with [tags](https://graphite.readthedocs.io/en/latest/tags.html#carbon)
    if `-graphiteListenAddr` is set.
  * [OpenTSDB put message](http://opentsdb.net/docs/build/html/api_telnet/put.html) if `-opentsdbListenAddr` is set.
* Works well with large amounts of time series data from IoT sensors, connected car sensors and industrial sensors.
* Has open source [cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
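
The text-based ingestion protocols listed above all carry the same information: a metric name, optional tags, a value and a timestamp. The following Python sketch formats one sample in each of the three text formats. The helper names are hypothetical, escaping of special characters is not handled, and the metric and tag names are made up for illustration.

```python
def influx_line(measurement, tags, fields, ts_ns):
    """InfluxDB line protocol: measurement,tag=v field=v timestamp_in_nanoseconds."""
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {ts_ns}"


def graphite_line(metric, tags, value, ts_s):
    """Graphite plaintext protocol with tags: metric;tag=v value timestamp_in_seconds."""
    tag_part = "".join(f";{k}={v}" for k, v in sorted(tags.items()))
    return f"{metric}{tag_part} {value} {ts_s}"


def opentsdb_put(metric, tags, value, ts_s):
    """OpenTSDB telnet put: put metric timestamp value tag=v ..."""
    tag_part = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {ts_s} {value} {tag_part}"


tags = {"host": "a"}
print(influx_line("cpu", tags, {"usage": 0.5}, 1575400000000000000))
print(graphite_line("cpu.usage", tags, 0.5, 1575400000))
print(opentsdb_put("cpu.usage", tags, 0.5, 1575400000))
```

Each resulting line would be sent to the corresponding listener; the Graphite and OpenTSDB listeners are enabled only when `-graphiteListenAddr` and `-opentsdbListenAddr` are set.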
### Which clients do you target?
The following Prometheus users may be interested in VictoriaMetrics:
- Users who don't want to bother with Prometheus' local storage operational burden - backups, replication, capacity planning, scalability, etc.
- Users with multiple Prometheus instances who want to perform arbitrary queries over all the metrics collected by their Prometheus instances (aka `global querying view`).
- Users who want to reduce costs for storing huge amounts of time series data.
### How to start using VictoriaMetrics?
Start with [single-node version](Single-server-VictoriaMetrics). It is easy to configure and operate. It should fit the majority of use cases.
### Is it safe to enable [remote write storage](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage) in Prometheus?
Yes. Prometheus continues writing data to local storage after enabling remote storage write, so all the existing local storage data
and new data is available for querying via Prometheus as usual.
### How does VictoriaMetrics compare to other clustered TSDBs on top of Prometheus such as [M3 from Uber](https://eng.uber.com/m3/), [Thanos](https://github.com/improbable-eng/thanos), [Cortex](https://github.com/cortexproject/cortex), etc.?
VictoriaMetrics is simpler, faster, more cost-effective and it provides [useful extensions for PromQL](ExtendedPromQL). The simplicity is twofold:
- It is simpler to configure and operate. There is no need to configure third-party [sidecars](https://github.com/improbable-eng/thanos/blob/master/docs/components/sidecar.md)
or fight with [gossip protocol](https://github.com/improbable-eng/thanos/blob/master/docs/proposals/completed/201809_gossip-removal.md).
- VictoriaMetrics has a simpler architecture, which means fewer bugs and more useful features in the long run compared to competing TSDBs.
See [comparing Thanos to VictoriaMetrics cluster](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683).
### How does VictoriaMetrics compare to [InfluxDB](https://www.influxdata.com/time-series-platform/influxdb/)?
VictoriaMetrics requires [10x less RAM](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) and it [works faster](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
It is easier to configure and operate. It provides [better query language](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085) than InfluxQL or Flux.
### How does VictoriaMetrics compare to [TimescaleDB](https://www.timescale.com/)?
TimescaleDB insists on using SQL as a query language. While SQL is more powerful than PromQL, this power is rarely required during typical TSDB usage. Real-world queries usually [look clearer and simpler when written in PromQL than in SQL](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085).
Additionally, VictoriaMetrics requires [up to 70x less storage space compared to TimescaleDB](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4) for storing the same amount of time series data.
### Does VictoriaMetrics use Prometheus technologies like other clustered TSDBs built on top of Prometheus such as [M3 from Uber](https://eng.uber.com/m3/), [Thanos](https://github.com/improbable-eng/thanos), [Cortex](https://github.com/cortexproject/cortex)?
No. The VictoriaMetrics core is written in Go from scratch by the [fasthttp](https://github.com/valyala/fasthttp) [author](https://github.com/valyala).
The architecture is [optimized for storing and querying large amounts of time series data with high cardinality](https://medium.com/devopslinks/victoriametrics-creating-the-best-remote-storage-for-prometheus-5d92d66787ac). VictoriaMetrics storage uses [certain ideas from ClickHouse](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282). Special thanks to [Alexey Milovidov](https://github.com/alexey-milovidov).
### Are there performance comparisons with other solutions?
Yes:
* [Measuring vertical scalability for time series databases: VictoriaMetrics vs InfluxDB vs TimescaleDB](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
* [Measuring insert performance on high-cardinality time series: VictoriaMetrics vs InfluxDB](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893)
* [TSBS benchmark on high-cardinality time series: VictoriaMetrics vs InfluxDB vs TimescaleDB](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
* [Standard TSBS benchmark: VictoriaMetrics vs InfluxDB vs TimescaleDB](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
### What is the pricing for VictoriaMetrics?
The following versions are open source and free:
* [Single-node version](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/Single-server-VictoriaMetrics).
* [Cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
We provide commercial support for both versions. [Contact us](mailto:info@victoriametrics.com) for the pricing.
The following versions are commercial:
* Managed cluster in the Cloud.
* SaaS version.
[Contact us](mailto:info@victoriametrics.com) for the pricing.
### Why doesn't VictoriaMetrics support [Prometheus remote read API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cremote_read%3E)?
Remote read API requires transferring all the raw data for all the requested metrics over the given time range. For instance,
if a query covers 1000 metrics with 10K values each, then the remote read API has to return `1000*10K`=10M metric values to Prometheus.
This is slow and expensive.
Prometheus remote read API isn't intended for querying foreign data aka `global query view`. See [this issue](https://github.com/prometheus/prometheus/issues/4456) for details.
So just query VictoriaMetrics directly via [Prometheus Querying API](https://prometheus.io/docs/prometheus/latest/querying/api/)
or via [Prometheus datasource in Grafana](http://docs.grafana.org/features/datasources/prometheus/).
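
As a rough illustration of the answer above, the Python sketch below first reproduces the raw sample count from the remote read example and then builds a `query_range` URL for the Prometheus Querying API. The base URL `http://localhost:8428`, the query text and the helper names are assumptions made for this example; substitute the address of your own VictoriaMetrics instance.

```python
from urllib.parse import urlencode


def raw_samples(series_count, samples_per_series):
    """Number of raw samples remote read would have to transfer."""
    return series_count * samples_per_series


def query_range_url(base_url, query, start, end, step):
    """Build a /api/v1/query_range URL; start and end are unix timestamps in seconds."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url}/api/v1/query_range?{params}"


print(raw_samples(1000, 10_000))
print(query_range_url(
    "http://localhost:8428",               # assumed address, adjust as needed
    "sum(rate(http_requests_total[5m]))",  # hypothetical query
    1575400000,
    1575403600,
    "60s",
))
```

With the querying API the aggregation runs on the VictoriaMetrics side, so only the points needed for the graph are returned instead of every raw sample.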
### Does VictoriaMetrics deduplicate data from Prometheus instances scraping the same targets (aka `HA pairs`)?
Data from all the Prometheus instances is saved in VictoriaMetrics without deduplication.
The deduplication for Prometheus HA pair may be easily implemented on top of VictoriaMetrics with the following steps:
1) Run multiple VictoriaMetrics instances in multiple availability zones (datacenters).
2) Configure each Prometheus from each HA pair to write data to VictoriaMetrics in a distinct availability zone.
3) Put [Promxy](https://github.com/jacksontj/promxy) in front of all the VictoriaMetrics instances.
4) Send queries to Promxy - it will deduplicate data from VictoriaMetrics instances behind it.
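
The idea behind step 4 can be sketched in a few lines of Python. This is a deliberately simplified illustration and not Promxy's actual algorithm: it assumes each VictoriaMetrics instance returns the same series as a list of `(timestamp, value)` pairs and that duplicates share exactly the same timestamp. The function name is hypothetical.

```python
def merge_replicas(*replicas):
    """Merge (timestamp, value) samples of one series collected from several replicas."""
    merged = {}
    for samples in replicas:
        for ts, value in samples:
            # The first replica that reports a timestamp wins; later duplicates are ignored.
            merged.setdefault(ts, value)
    return sorted(merged.items())


zone_a = [(10, 1.0), (20, 2.0)]            # replica that missed the sample at t=30
zone_b = [(20, 2.0), (30, 3.0)]            # replica that missed the sample at t=10
print(merge_replicas(zone_a, zone_b))
```

The merged result contains each timestamp once and fills the gaps that either replica has on its own, which is the practical benefit of running the HA pair.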
### Where is the source code of VictoriaMetrics?
Source code for the following versions is available in the following places:
* [Single-node version](https://github.com/VictoriaMetrics/VictoriaMetrics).
* [Cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
### Is VictoriaMetrics suitable for data from IoT sensors and industrial sensors?
VictoriaMetrics is able to handle data from hundreds of millions of IoT sensors and industrial sensors.
It supports [high cardinality data](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b),
perfectly [scales up on a single node](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
and scales horizontally to multiple nodes.
### Where can I ask questions about VictoriaMetrics?
See [VictoriaMetrics-users group](https://groups.google.com/forum/#!forum/victorametrics-users).
### Where can I file bugs and feature requests regarding VictoriaMetrics?
File bugs and feature requests [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues).
### Are you looking for investors?
Yes. [Mail us](mailto:info@victoriametrics.com) if you are interested.
