Compare commits

...

1407 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
28df7c2a96 docs/CHANGELOG.md: cut v1.86.0 2023-01-10 16:22:26 -08:00
Aliaksandr Valialkin
2d294cca59 deployment/docker: update Alpine base image from v3.17.0 to v3.17.1
See https://alpinelinux.org/posts/Alpine-3.17.1-released.html
2023-01-10 16:14:40 -08:00
Aliaksandr Valialkin
95ce1ba6ce lib/httpserver: directly pass flag value to CheckAuthFlag()
There is no sense in passing a pointer to flag value there.

This is a follow-up for 4225a0bd75
2023-01-10 15:52:23 -08:00
Zakhar Bessarab
4225a0bd75 Use httpAuth.* flags as a fallback for endpoints protected by *AuthKey flags (#3582)
* {lib/server, app/}: use `httpAuth.*` flag as fallback for `*AuthKey` if it is not set

* lib/ingestserver/opentsdbhttp: fix opentdb HTTP handler not respecting `httpAuth.*` flags

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-10 15:46:13 -08:00
Dmytro Kozlov
0811000bb0 app/vmctl: Add insecure skip verify flag for remote read protocol (#3611)
* app/vmctl: Add insecure skip verify flag for remote read protocol
2023-01-10 23:18:49 +01:00
Artem Navoiev
37c52ccaf4 vmagent: add minimal scrape file exampe for vmagent quick start (#3627)
* vmagent: add minimal scrape file exampe for vmagent quick start

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* replace example with link to your prometheus.yml in docker

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-01-10 23:16:10 +01:00
Aliaksandr Valialkin
cbe62f23ba lib/promscrape/discovery/gce: follow-up for b2ccdaaa2f
- Use promutils.Labels.GetLabels() instead of comparing promutils.Labels.Labels to nil.
  This make the code more consistent with other places.

- Mention the release where the issue has been introduced at docs/CHANGELOG.md.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3624
2023-01-10 13:51:03 -08:00
Aliaksandr Valialkin
53d871d0b1 app/vmselect/netstorage: reduce tail latency during query processing
Previously the selected time series were split evenly among available CPU cores
for further processing - e.g unpacking the data and applying the given rollup
function to the unpacked data.
Some time series could be processed slower than others.
This could result in uneven work distribution among available CPU cores,
e.g. some CPU cores could complete their work sooner than others.
This could slow down query execution.

The new algorithm allows stealing time series to process from other CPU cores
when all the local work is done. This should reduce the maximum time
needed for query execution (aka tail latency).

The new algorithm should also scale better on systems with many CPU cores,
since every CPU processes locally assigned time series without inter-CPU communications.

The inter-CPU communications are used only when all the local work is finished
and the pending work from other CPUs needs to be stealed.
2023-01-10 13:43:14 -08:00
Zakhar Bessarab
b2ccdaaa2f lib/promscrape/discovery/gce: fix crash in case instance does not have any labels set (#3625)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-10 11:07:11 +01:00
Denys Holius
b06e795a1e docs/Release-Guide.md: added missed link to rpm repository (#3623) 2023-01-10 10:14:56 +01:00
Aliaksandr Valialkin
e640ff72f1 app/vmselect/netstorage: reduce memory allocations when unpacking time series
Unpack time series with less than 400K samples in the currently running goroutine.
Previously a new goroutine was being started for unpacking the samples.
This was requiring additional memory allocations.
2023-01-09 23:18:17 -08:00
Aliaksandr Valialkin
9a563a6aef app/vmselect/promql: eliminate memory allocation when sorting values inside float64s 2023-01-09 23:06:46 -08:00
Aliaksandr Valialkin
30ed33fae0 app/vmselect/promql: pre-allocate memory for values to be merged in mergeTimeseries()
This should reduce the number of memory re-allocations
2023-01-09 22:51:17 -08:00
Aliaksandr Valialkin
645c24dc5f app/vmselect/promql: consistently intern series names obtained from marshalMetricNameSorted
This reduces memory allocations when the returned series names are used as map keys later
2023-01-09 22:45:40 -08:00
Aliaksandr Valialkin
2f3ddd4884 app/vmselect/promql: avoid memory allocations and copying from source timeseries to the returned result at timeseriesToResult() 2023-01-09 22:38:59 -08:00
Aliaksandr Valialkin
26cf680468 app/vmselect/promql: remove memory allocations from sortMetricTags() 2023-01-09 22:22:15 -08:00
Aliaksandr Valialkin
4f0c11ee93 app/vmselect/promql: intern output series names inside timeseriesToResult()
This reduces the number of memory allocations for repeated queries,
which return (almost) the same set of time series.
2023-01-09 22:19:56 -08:00
Aliaksandr Valialkin
562d6bca08 app/vmselect/promql: intern output series names during normal aggregation 2023-01-09 22:15:24 -08:00
Aliaksandr Valialkin
21ee9a1fab app/vmselect/promql: intern output series names during incremental aggregation
This should reduce the number of memory allocations for repeated queries
2023-01-09 22:11:36 -08:00
Aliaksandr Valialkin
df2a494a7c app/vmselect/netstorage: pre-allocate 4 block references per each time series during querying
Usually the number of blocks returned per each time series during queries is around 4.
So it is a good idea to pre-allocate 4 block references per time series
in order to reduce the number of memory allocations.
2023-01-09 22:03:23 -08:00
Aliaksandr Valialkin
c5e0f527bc app/vmselect/netstorage: cache canonical MetricName for time series returned from the storage
This reduces memory allocations for repeated queries, which return (almost) the same set of time series.
2023-01-09 21:53:10 -08:00
Aliaksandr Valialkin
7afcca0c51 all: use metricsql.CompileRegexp instead of regexp.Compile for compiling regexps used in graphite queries
This should speed up repeated queries, since metricsql.CompileRegexp returns regexps from the cache
on subsequent calls for the same input regexp.
2023-01-09 21:43:08 -08:00
Aliaksandr Valialkin
67ab49baa9 vendor: make vendor-update 2023-01-09 21:34:34 -08:00
Aliaksandr Valialkin
e5eca54951 lib/promscrape/discovery/nomad: sync nomad_sd_configs fields with the Prometheus implementation
See the list of configs supported by Prometheus at f88a0a7d83/discovery/nomad/nomad.go (L76-L84)

- Removed "token" option. In can be set either via NOMAD_TOKEN env var or via `bearer_token` config option.
- Removed "scheme" option. It is automatically detected depending on whether the `tls_config` is set.
- Removed "services" and "tags" options, since they aren't supported by Prometheus.
- Added "region" option. If it is missing, then the region is read from NOMAD_REGION env var.
  If this var is empty, then it is set to "global" in the same way as Nomad client does.
  See 865ee8d37c/api/api.go (L297)
  and 865ee8d37c/api/api.go (L555-L556)
- If the "server" option is missing, then it is read from NOMAD_ADDR in the same way
  as Nomad client does - see 865ee8d37c/api/api.go (L294-L296)

This is a follow-up for 8aee209c53

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-09 21:14:48 -08:00
Aliaksandr Valialkin
c38a10e143 app/vmselect/netstorage: eliminate memory allocation for sortBlocksHeap arg when calling mergeSortBlocks() 2023-01-09 21:08:51 -08:00
Aliaksandr Valialkin
1f9d605988 app/vmselect/netstorage: consistently select the sample with the biggest value out of samples with identical timestamps
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333

This fix is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3620 ,
but doesn't slow down the common case with merging replicated data blocks so significantly.

Benchmark results:

Before the change:

BenchmarkMergeSortBlocks/replicationFactor-1-4         	   13968	     85643 ns/op	 956.53 MB/s	    1700 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-2-4         	   10806	    109171 ns/op	1500.77 MB/s	    2191 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-3-4         	    8887	    130623 ns/op	1881.45 MB/s	    2660 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-4-4         	    7440	    157348 ns/op	2082.52 MB/s	    3174 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-5-4         	    6534	    184473 ns/op	2220.38 MB/s	    3612 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4  	   13419	     85205 ns/op	 961.44 MB/s	    2213 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 	     579	   1894900 ns/op	  43.23 MB/s	   46760 B/op	       1 allocs/op

After the change:

BenchmarkMergeSortBlocks/replicationFactor-1-4         	   13832	     85298 ns/op	 960.40 MB/s	    1716 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-2-4         	    8833	    134222 ns/op	1220.66 MB/s	    2675 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-3-4         	    6487	    184830 ns/op	1329.65 MB/s	    3636 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-4-4         	    4977	    236318 ns/op	1386.61 MB/s	    4733 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/replicationFactor-5-4         	    4088	    296734 ns/op	1380.36 MB/s	    5761 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/overlapped-blocks-bestcase-4  	   14083	     84067 ns/op	 974.47 MB/s	    2110 B/op	       1 allocs/op
BenchmarkMergeSortBlocks/overlapped-blocks-worstcase-4 	     536	   2043534 ns/op	  40.09 MB/s	   50511 B/op	       1 allocs/op
2023-01-09 13:01:48 -08:00
Denys Holius
fe0e199859 deployment/docker: update Alertmanager tag from v0.24.0 to v0.25.0 in docker-compose files (#3619)
deployment/docker: bump alertmanager to latest v0.25.0
2023-01-09 12:37:14 +01:00
Roman Khavronenko
8aee209c53 lib/promscrape: remove datacenter field from nomad_sd_config (#3612)
Looks like `datacenter` field isn't part of `/v1/services` API.
See https://developer.hashicorp.com/nomad/api-docs/services#list-services
and https://developer.hashicorp.com/nomad/api-docs/services#read-service

Related issues:
https://github.com/traefik/traefik/issues/9109
https://github.com/prometheus/prometheus/issues/11776

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-09 09:07:40 +01:00
Aliaksandr Valialkin
28f8dc41b0 lib/promscrape/discoveryutils: cleanup after 5df9fddaf2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
2023-01-07 01:26:54 -08:00
Zakhar Bessarab
5df9fddaf2 lib/promscrape/discoveryutils: use correct timeout for blocking requests (#3609)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-07 01:13:03 -08:00
Aliaksandr Valialkin
41e00a0df7 lib/storage: simplify the fix from 488940502c
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3566
2023-01-07 01:04:43 -08:00
Dmytro Kozlov
488940502c lib/storage: fix returning camelcase label names (#3608)
* lib/storage: fix returning camelcase label names

* doc: add change log

* Update docs/CHANGELOG.md

* Update docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-07 00:50:14 -08:00
Aliaksandr Valialkin
5fe7ff24c2 lib/streamaggr: limit the the number of concurrent flushes of the aggregate data to the exact number of available CPUs
This should reduce the maximum memory usage during concurrent flushes of the aggregate data
2023-01-07 00:18:51 -08:00
Aliaksandr Valialkin
ad5bfe3089 lib/promscrape: reduce the number of concurrently executed processScrapedData calls from 2x of the number of CPUs to the number of CPUs
This should reduce the maximum memory usage for processScrapedData() function by 2x.
The only part, which can be IO-bound in the processScrapedData() is pushData() call,
when it buffers data to persistent queue if the remote storage cannot keep up
with the data ingestion speed. In this case it is OK if the scrape pace will be limited.
2023-01-07 00:14:30 -08:00
Aliaksandr Valialkin
af263fe881 all: small improvements in error messages and command-line flag descriptions related to concurrency limiters 2023-01-07 00:11:44 -08:00
Aliaksandr Valialkin
45f39e291e lib/writeconcurrencylimiter: moved the error generation from incConcurrency() to the caller place 2023-01-06 23:45:58 -08:00
Aliaksandr Valialkin
986a05e18d lib/promscrape: limit the concurrency during parsing and relabeling the scraped samples
This should reduce memory usage when scraping big number of targets,
since this limits the summary memory usage during concurrent parsing and relabeling
by the number of available CPU cores.
2023-01-06 22:59:17 -08:00
Aliaksandr Valialkin
293e4dc77b app/{vminsert,vmstorage}: add comments on why storage.AddRows() is called without limiting the number of concurrent calls 2023-01-06 22:40:07 -08:00
Aliaksandr Valialkin
5c4bd4f7c1 lib/streamaggr: limit the number of concurrent flushes of aggregate metrics in order to limit memory usage 2023-01-06 22:39:13 -08:00
Aliaksandr Valialkin
c63755c316 lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit
Previously the -maxConcurrentInserts was limiting the number of established client connections,
which write data to VictoriaMetrics. Some of these connections could be idle.
Such connections do not consume big amounts of CPU and RAM, so there is a little sense in limiting
the number of such connections. So now the -maxConcurrentInserts command-line option
limits the number of concurrently executed insert requests, not including idle connections.

It is recommended removing -maxConcurrentInserts command-line option, since the default value
for this option should work good for most cases.
2023-01-06 22:20:19 -08:00
Aliaksandr Valialkin
f299d2ca1a lib/vmselectapi: limit the number of concurrently executed requests
This should prevent from out of memory errors when big number of vmselect
nodes send many concurrent requests to vmstorage

The limit can be controlled at vmstorage via the following command-line flags:
- search.maxConcurrentRequests
- search.maxQueueDuration

See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#resource-usage-limits
2023-01-06 22:11:34 -08:00
Aliaksandr Valialkin
e7637885a6 app/vmselect: improve error message when the request cannot be started because too many concurrent requests are already executed 2023-01-06 22:10:42 -08:00
Aliaksandr Valialkin
463b957e54 lib/promscrape/discovery/{consul,nomad}: wait until the deleted serviceWatchers are stopped inside updateServices() call
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-05 21:52:33 -08:00
Aliaksandr Valialkin
f392913d00 lib/promscrape: follow-up after bced9fb978
- Document the bugfix at docs/CHANGELOG.md
- Wait until all the worker goroutines are done in consulWatcher.mustStop()
- Do not log `context canceled` errors when discovering consul serviceNames
- Removed explicit handling of gzipped responses at lib/promscrape/discoveryutils.Client,
  since this handling is automatically performed by net/http.Transport.
  See DisableCompression option at https://pkg.go.dev/net/http#Transport .
- Remove explicit handling of the proxyURL, since it is automatically handled
  by net/http.Transport. See Proxy option at https://pkg.go.dev/net/http#Transport .
- Expliticly set MaxIdleConnsPerHost, since its default value equals to 2.
  Such a small value may result in excess tcp connection churn
  when more than 2 concurrent requests are processed by lib/promscrape/discoveryutils.Client.
- Do not set explicitly the `Host` request header, since it is automatically set by net/http.Client.
- Backport the bugfix to the recently added nomad_sd_configs - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
2023-01-05 21:13:06 -08:00
Zakhar Bessarab
bced9fb978 lib/promscrape/discoveryutils: switch to native http client from fasthttp (#3568) 2023-01-05 19:34:47 -08:00
Roman Khavronenko
5bdd880142 vmstorage: add more context to the flock acquiring msg (#3584)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3578

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-05 18:30:42 -08:00
Aliaksandr Valialkin
9f348cf8a1 lib/promscrape/discovery/nomad: follow-up after 48f371a46c
- Remove undocumented `username` and `password` config options from `nomad_sd_config`.
  TODO: probably, remove these options from `consul_sd_config` too?
  These options exist there for backwards compatibility purposes.

- Add __meta_nomad_service_alloc_id and __meta_nomad_service_job_id meta-labels
  These labels contain AllocID and JobID fields for the discovered Nomad services.

- Various typo fixes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-05 18:07:20 -08:00
Aliaksandr Valialkin
cad8553c01 Makefile: remove trailing space after golangci-lint run command
It is left after ec2c82e800
2023-01-05 16:59:07 -08:00
Aliaksandr Valialkin
1a28f0e5b3 lib/promrelabel: pass query args via query string at /metric-relabel-debug and /target-relabel-debug pages if their length doesnt exceed 1000
This allows copy-n-pasting the url to another browser window and seeing the same result.

The limit in 1000 chars is selected in order to prevent from potential issues with systems
which limit the url length such as Internet Explorer - see https://stackoverflow.com/questions/812925/what-is-the-maximum-possible-length-of-a-query-string

If the limit is exceeded, then query args are sent via POST method and aren't visible in the url.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3580
2023-01-05 16:48:04 -08:00
Karan Sharma
48f371a46c lib/promscrape: add Prometheus-compatible service discovery for Nomad (#3549)
Add nomad_sd_config support for service discovery
2023-01-05 23:03:58 +01:00
Denys Holius
043b28c725 .github/workflows/nightly-build.yml: added dockerhub login (#3594) 2023-01-05 16:54:14 +01:00
Luke Palmer
ec2c82e800 Lint and errcheck using golangci-lint (#3558) 2023-01-05 16:12:46 +01:00
Zakhar Bessarab
01bc0c94ab doc: add vmbackupmanager monitoring section (#3605)
* doc: add vmbackupmanager monitoring section

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-05 16:03:06 +01:00
Thomas Danielsson
9d1104d812 dashboards: fix operator datasource variable (#3604)
Got "Failed to upgrade legacy queries Datasource $ds was not found" in
Grafana on operator dashboard.
It's datasource variable was incorrectly named `datasource`.

Also made the rest of the dashboards have homogeneous datasource-variable
names and selections, matching vmagent dashboard.
2023-01-05 14:59:56 +01:00
Artem Navoiev
8b763175ff Add Understand Your Setup Size Guide (#3572)
docs: add Understand Your Setup Size Guide

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-01-05 14:56:50 +01:00
Aliaksandr Valialkin
2ee81a5dbb docs/CHANGELOG.md: add missing dot 2023-01-05 03:35:02 -08:00
Zakhar Bessarab
185cdcd813 lib/promscrape/discovery/dockerswarm: fix query encoding of filters (#3586)
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-05 03:34:25 -08:00
Aliaksandr Valialkin
0dea3b71da lib/promscrape: pre-fetch metric_relabel_configs rules when debugging metric relabeling for a particular target
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407
2023-01-05 03:26:49 -08:00
Aliaksandr Valialkin
a1076abcbf lib/promscrape: follow-up for a7e29c38bc
- Document the bugfix at docs/CHANGELOG.md
- Make the fix more durable against future changes when droppedTargetsMap.Register may be called from other places.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3580
2023-01-05 02:52:08 -08:00
Zakhar Bessarab
a7e29c38bc lib/promscrape/targetstatus: fix crash during droppedTarget registration (#3595)
* lib/promscrape/targetstatus: fix crash during droppedTarget registration in case original labels are not present

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape/targetstatus: address review comment

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-05 02:39:31 -08:00
Yury Molodov
2460e0f51e vmui: improve Explore metrics (#3598)
* feat: add multiple select

* feat: improve explore interface

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-05 02:23:04 -08:00
Aliaksandr Valialkin
0e1f0ade31 lib/streamaggr: sort by and without labels in the aggregate output metric name
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3460
2023-01-05 02:08:44 -08:00
Aliaksandr Valialkin
04dff34de4 vendor: update github.com/VictoriaMetrics/metricsql from v0.50.0 to v0.51.0
Updates https://github.com/VictoriaMetrics/metricsql/pull/7
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3589
2023-01-05 01:50:10 -08:00
Aliaksandr Valialkin
66947ee5a2 lib/streamaggr: remove unused fields 2023-01-04 13:33:46 -08:00
Roman Khavronenko
9d0e1f8e68 dashboards: add backupmanager dashboard (#3599)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-04 17:26:15 +01:00
Roman Khavronenko
63bf583b3c github: rm plaintext render (#3597)
`render: plain text` makes fields not formattable and prevents
 from pasting screenshots.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-04 14:48:17 +01:00
Denys Holius
09fe346d18 docs/Release-Guide.md: adds release scenario for RPM LTS packages (#3588) 2023-01-04 14:36:14 +01:00
Zakhar Bessarab
59f20c1034 github: use github templates for filling in feature requests or bug reports (#3587)
github: use github templates for filling in feature requests or bug reports
2023-01-04 14:34:19 +01:00
Artem Navoiev
0ff85d00a4 update year in License
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-01-04 09:46:38 +01:00
Aliaksandr Valialkin
fafece1af8 vendor: make vendor-update 2023-01-03 23:36:42 -08:00
Aliaksandr Valialkin
5bca3a5be2 app/vmselect: remove dependency on lib/promscrape from app/vmselect 2023-01-03 23:28:27 -08:00
Aliaksandr Valialkin
fd175ad80b docs: update -help outputs for vm* tools 2023-01-03 23:27:06 -08:00
Aliaksandr Valialkin
fa13bbc48a app/{vmagent,vminsert}: add support for streaming aggregation
See https://docs.victoriametrics.com/stream-aggregation.html

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3460
2023-01-03 22:19:21 -08:00
Aliaksandr Valialkin
add2c4bf07 lib/bytesutil: add InternBytes() function as a shortcut to InternString(ToUnsafeString(..)) 2023-01-03 22:16:22 -08:00
Aliaksandr Valialkin
f33e687723 app/vmagent/remotewrite: improve descriptions for -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig
- Cross-reference these command-line flags.
- Add a link to https://docs.victoriametrics.com/vmagent.html#relabeling
2023-01-03 22:04:49 -08:00
Aliaksandr Valialkin
d3a1c36842 docs/Single-server-VictoriaMetrics.md: impromve formatting for prominent features chapter 2023-01-03 21:58:21 -08:00
Aliaksandr Valialkin
6db5c3801e app/vmbackup: remove superflouos whitespace after 8fe21ec707 2023-01-03 21:52:40 -08:00
Aliaksandr Valialkin
189f85b24c docs/vmbackupmanager.md: run make docs-sync after fa842d6534 2023-01-03 21:50:28 -08:00
Aliaksandr Valialkin
7b264b0c23 lib/promrelabel: allow calling Match on nil IfExpression
This simplifies the caller side of IfExpression
2023-01-03 21:44:03 -08:00
yanggang
8fe21ec707 Fix flag help message for the backups types. (#3577)
Signed-off-by: yanggang <gang.yang@daocloud.io>
2023-01-03 10:56:42 +01:00
yanggang
fa842d6534 fix typo for json values. (#3576)
Signed-off-by: yanggang <gang.yang@daocloud.io>
2023-01-03 10:55:38 +01:00
yanggang
93e935bcaa Fix vmctl command hint for vm-native-step-interval (#3575)
Signed-off-by: yanggang <gang.yang@daocloud.io>
2023-01-03 10:54:53 +01:00
Artem Navoiev
0a519c93ef run checks only for master/cluster branches (#3581)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-01-03 11:08:44 +04:00
Roman Khavronenko
4b3d8eb573 vmalert: mention specifics of Alertmanager HA mode (#3573)
Stress the importance of specifying of all Alertmanager
URLs in vmalert's `-notifier.url` or `notifier.config`
if it runs in cluster mode.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3547

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-30 09:54:03 +01:00
Aliaksandr Valialkin
05fa11b296 app/vmui: reuse the series color in line tooltip 2022-12-29 15:32:41 -08:00
Aliaksandr Valialkin
1794f3d46e app/vmui: small usability improvements
- Show in the line tooltip the number of the query which generates the given line.
  This simplifies comparison of lines generated by multiple queries.

- Show metric name as __name__ label in the line tooltip in the same way as other labels are shown there.
  This makes the label information in the tooltip more consistent.

- Properly quote label values with JSON.stringify(). This prevents from improper formatting
  when label values contain doublequote chars.

- Remove double curly braces artifact at graph legend for lines without names and labels.

- Properly use modifier for regular expressions across the code.
2022-12-29 14:52:51 -08:00
Aliaksandr Valialkin
59e1e84a92 docs/CHANGELOG.md: document 1720bddb4f
Updats https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3530
2022-12-29 12:31:48 -08:00
Aliaksandr Valialkin
b8bc62431a app/vmselect/vmui: make vmui-update after 1720bddb4f 2022-12-29 12:20:06 -08:00
Yury Molodov
1720bddb4f fix: display correct graph tooltip title (#3562) 2022-12-29 12:06:48 -08:00
Roman Khavronenko
2cedb3e883 csvimport: support empty values (#3565)
Before, if the imported line contained multiple metrics and one
or more of them had an empty values - the whole line was ignored.

Now, only metrics with empty values are ignored, and the rest
of the metrics are accepted successfully.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3540

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-12-29 11:52:10 -08:00
Roman Khavronenko
83870aeb8d tests: attempt to fix flaky graphite test (#3567)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-29 11:48:47 -08:00
Aliaksandr Valialkin
7af8857b68 docs/CHANGELOG.md: add a link to the docs and the pull request for update_entries_limit option in vmalert
This is a follow up for 6588fcbfca
2022-12-29 10:38:26 -08:00
Aliaksandr Valialkin
c90752a8be vendor: update github.com/valyala/fastjson/fastfloat from v1.6.3 to v1.6.4
This should properly parse floating-point numbers with missing integer or fractional parts.
For example, 123. or .123

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3544
2022-12-29 10:34:11 -08:00
ChenyuanHu
0a9e1d64fe app/vmselect/prometheus: no need manually call queryDuration.UpdateDuration (#3564)
There is no need to manually call `queryDuration.UpdateDuration(startTime)`, because `defer queryDuration.UpdateDuration(startTime)` is executed at the beginning of the function(L660).
2022-12-29 14:18:00 +01:00
Roman Khavronenko
6588fcbfca vmalert: allow configuring the default number of stored rule's update states (#3556)
Allow configuring the default number of stored rule's update states in memory
 via global `-rule.updateEntriesLimit` command-line flag or per-rule via rule's
 `update_entries_limit` configuration param.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-29 12:36:44 +01:00
Aliaksandr Valialkin
3dc684634e app/vmselect/searchutils: accept partial RFC3339 values at time, start and end query args
This simplifies manual usage of the APIs. For example, the following query
would return the results over the 2022 year.

  /api/v1/query_range?start=2022&end=2023&step=1d&query=...

This is equivalent to:

  /api/v1/query_range?start=2022-01-01T00:00:00Z&end=2023-01-01T00:00:00Z&step=1d&query=...
2022-12-28 19:41:54 -08:00
Yury Molodov
d2f89b55b7 vmui: fix step field (#3561)
* feat: use a unit next to the step value

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-12-28 16:00:51 -08:00
Aliaksandr Valialkin
3d082ed6db vendor: make vendor-update 2022-12-28 15:00:02 -08:00
Aliaksandr Valialkin
c4229a1bba lib/promscrape: log the actual response size in the error message when the response size exceeds -promscrape.maxScrapeSize
This is a follow-up for 7ad9fff7e5
2022-12-28 14:42:11 -08:00
Aliaksandr Valialkin
1b16118e17 lib/{storage,mergeset}: tune the threshold for assisted merge
The https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425#issuecomment-1359117221
reveals that CPU usage for incoming queries may significantly increase when the number
of in-memory parts becomes too big.

This commit reduces the maximum number of in-memory parts before starting the assisted merge
during data ingestion. This should reduce CPU usage for incoming queries,
since they need to inspect lower number of in-memory parts.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-28 14:39:24 -08:00
Clément Nussbaumer
7ad9fff7e5 fix(promscrape): check MaxScrapeSize after gzip decompression (#3550) 2022-12-28 12:19:41 -08:00
Aliaksandr Valialkin
293dda7169 lib/snapshot: improve log message on unexpected status code during attempts to create or delete snapshots
Use "unexpected status code returned from %q: %d; expecting %d" log message format
instead of less clear format "unexpected status code returned from %q; expecting %d; got %d"

This is a follow-up for c612bb165e
2022-12-28 11:41:50 -08:00
Aliaksandr Valialkin
ed9161a04f docs: remove a case study for Dreamteam.gg, since it looks like the company no longer exists 2022-12-28 11:34:40 -08:00
Zakhar Bessarab
c612bb165e lib/snapshot: fix error message format for failed HTTP request (#3559) 2022-12-28 18:04:11 +01:00
Artem Navoiev
34c705988e fix broken links
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-12-27 13:17:22 +02:00
Artem Navoiev
7d9c4bebc0 update links to grafana dashboards (#3534)
docs: update links to grafana dashboards

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-12-25 17:36:20 +01:00
Aliaksandr Valialkin
b11c806d1c app/vmui: show min, max and avg lines at Explore metrics graphs when instance is selected in the same way as when only the job is selected
This improves consistency of the graphs.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3386
2022-12-23 23:21:11 -08:00
Aliaksandr Valialkin
2a7e392bb3 app/vmselect/vmui: make vmui-update after 0dca224ec3 2022-12-23 22:25:04 -08:00
Aliaksandr Valialkin
0dca224ec3 app/vmui: small improvements for header panel
- Rename `Custom panel` tab to more clear `Query` tab
- Rename `Cardinality` tab to `Explore cardinality`, so it becomes consistent with `Explore metrics` tab
- Move `Dashboards` tab to the end, since it isn't used too much
2022-12-23 22:24:31 -08:00
Aliaksandr Valialkin
8ef1fe2047 app/vmui: move the Explore metrics tab closer to Custom panel tab
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3386
2022-12-23 22:16:21 -08:00
Aliaksandr Valialkin
f599e1bd34 app/vmui: show less lines at metrics explorer when the instance isn't selected
Show min, max and avg graphs across instances for the selected job.
This should improve usability of such a graphs when the job contains many instances.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3386
2022-12-23 22:15:01 -08:00
Aliaksandr Valialkin
4deea604bf app/vmui: follow-up after f6d31f5216
- Document the feature at docs/CHANGELOG.md.
- Document the metrics explorer at https://docs.victoriametrics.com/#metrics-explorer .
- Properly set `start` and `end` args for the selected time range
  when performing the request, which returns metric names.
- Improve queries, so they return lower number of lines and labels.
  This should improve metrics' exploration.
- Properly encode label filters and query args before passing them to VictoriaMetrics.
- Various cosmetic fixes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3386
2022-12-22 17:17:01 -08:00
Yury Molodov
f6d31f5216 vmui: add explore tab for exploration of metrics, which belong to a particular job/instance (#3470)
* feat: add "Explore" page

* feat: add graphs for explore page

* vmui: add explore tab for exploration of metrics, which belong to a particular job/instance

* refactor: rename variables

* refactor: extract graph to ExploreMetricItemGraph.tsx

* feat: add searchable for Select.tsx

* feat: improve metrics explorer

* feat: set document title by page

* feat: add page to view icons

* fix: improve styles

* fix: add encodeURIComponent to query
2022-12-22 15:24:40 -08:00
Aliaksandr Valialkin
f6ac045933 docs/CHANGELOG.md: document 1df87807fd
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3513
2022-12-22 15:22:09 -08:00
Yury Molodov
1df87807fd vmui: step (#3521)
* feat: add step rounding

* fix: change step in URL parameters

* refactor: change comment for roundStep
2022-12-22 14:54:28 -08:00
Aliaksandr Valialkin
0076422350 lib/promscrape/discovery/azure: typo fix 2022-12-21 21:25:16 -08:00
Aliaksandr Valialkin
f1441a598f app/vmselect/promql: add tests for d3de110070 2022-12-21 20:25:21 -08:00
Aliaksandr Valialkin
fa236c5a84 lib/promrelabel: make fmt after d3de110070 2022-12-21 20:24:57 -08:00
Aliaksandr Valialkin
d3de110070 app/vmselect/promql: make sure that label_replace() doesn't create an empty dst_label if the src_label doesn't match regex 2022-12-21 20:20:01 -08:00
Aliaksandr Valialkin
31886aef3d lib/promrelabel: add support for keepequal and dropequal relabeling actions
These actions are supported by Prometheus starting from v2.41.0

See https://github.com/prometheus/prometheus/pull/11564 ,
https://github.com/prometheus/prometheus/issues/11556
and https://github.com/prometheus/prometheus/issues/3756

Side note:

It's a pity that Prometheus developers decided inventing `keepequal` and `dropequal`
relabeling actions instead of adding support for `keep_if_equal` and `drop_if_equal` relabeling
actions supported by VictoriaMetrics since June 2020 - see 2a39ba639d .
2022-12-21 20:04:55 -08:00
Aliaksandr Valialkin
3300546eab lib/bytesutil: make sure that the cleanup code is performed only by a single goroutine out of many concurrently running goroutines
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466
2022-12-21 13:07:24 -08:00
Yury Molodov
0f8ffc7df9 vmui: fix change timezone (#3519)
vmui: fix time picker with changed time zone
2022-12-21 17:22:37 +01:00
Aliaksandr Valialkin
192db0f0b1 deployment/docker: update VictoriaMetrics tag from v1.85.2 to v1.85.3 in docker-compose files 2022-12-20 15:34:23 -08:00
Aliaksandr Valialkin
f443fad56d docs/CHANGELOG.md: cut v1.85.3 2022-12-20 14:51:23 -08:00
Aliaksandr Valialkin
77874d6055 app/vmselect/vmui: make vmui-update after 731d189fa9 2022-12-20 14:51:07 -08:00
Roman Khavronenko
7ca49290b6 docs: fix the image link (#3509)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3495
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-20 14:36:25 -08:00
Roman Khavronenko
e4e9dfb785 vmagent: respect -usePromCompatibleNaming if no relabeling is set (#3511)
* vmagent: respect `-usePromCompatibleNaming` if no relabeling is set

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3493

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmagent: upd test

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-12-20 14:32:45 -08:00
Yury Molodov
731d189fa9 fix: change the logic for hide query (#3514) 2022-12-20 14:29:46 -08:00
Zakhar Bessarab
4be4645142 app/vmbackupmanager: add metrics for better observability (#488)
* app/vmbackupmanager: add metrics for better observability, include more information to `/api/v1/backups` API call response

* app/vmbackupmanager: drop old metrics before creating new ones

* app/vmbackupmanager: use `_total` postfix for counter metrics

* app/vmbackupmanager: remove `_total` postfix for gauge-like metrics

* app/vmbackupmanager: add `_last_run_failed` metrics for backups and retention

* app/vmbackupmanager: address review feedback

* app/vmbackupmanager: fix metric name

* app/vmbackupmanager: address review feedback, remove background updates of metrics, add restoring state of `_last_run_failed` metric from remote storage

* app/vmbackupmanager: improve performance for backup size calculation

* app/vmbackupmanager: refactor backup and retention runs to deduplicate each run logic

* {app/vmbackupmanager,lib/formatutil}: move HumanizeBytes into lib package

* app/vmbackupmanager: fix creating new metrics instead of reusing existing ones

* lit/formatutil: add comment to make linter happy

* app/vmbackupmanager: address review feedback
2022-12-20 14:18:06 -08:00
Aliaksandr Valialkin
4e55b67a44 lib/storage: clear the err if it is set to io.EOF when searching for the TSID by metricID
This is expected error after when recently added indexdb data isn't available for search yet
or wasn't flushed to disk after unclean shutdown of VictoriaMetrics.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3515
2022-12-20 14:05:29 -08:00
Aliaksandr Valialkin
8f5e822565 Makefile: update golangci-lint version from v1.48.0 to v1.50.1 2022-12-20 13:09:40 -08:00
Aliaksandr Valialkin
cad90c7ac1 Makefile: publish release docker images at DockerHub with the stable tag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2911
2022-12-20 12:27:06 -08:00
Aliaksandr Valialkin
a48510573e Revert "docs/Release-Guide.md: add LATEST_TAG=stable env var for make publish-release in order to create stable tag for the published components at DockerHub"
This reverts commit 8afa7ef837.
2022-12-20 12:24:27 -08:00
Aliaksandr Valialkin
8afa7ef837 docs/Release-Guide.md: add LATEST_TAG=stable env var for make publish-release in order to create stable tag for the published components at DockerHub
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2911
2022-12-20 12:22:42 -08:00
Aliaksandr Valialkin
1d7b5cb83c docs/CHANGELOG.md: document the change at 547f07463b29c09c62c9af35eac9cee6764b3286
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2612
2022-12-20 10:22:07 -08:00
Zakhar Bessarab
18e55d14c6 app/vmbackupmanager: update doc to include cluster to cluster restore example (#3506) 2022-12-20 14:54:56 +01:00
Roman Khavronenko
8122191368 docs: fix link typo in operator docs (#3508) 2022-12-20 14:52:47 +01:00
Aliaksandr Valialkin
6bf46c7bf5 docs/CHANGELOG.md: formatting fix 2022-12-20 01:06:34 -08:00
Aliaksandr Valialkin
9fa3f1dc57 app/vmselect/promql: do not extend too short lookbehind window for rate() function if it is set explicitly
Previously too short lookbehind window d for rate(m[d]) could be automatically extended
if it didn't cover at least two raw samples. This was needed in order to guarantee
non-empty results from rate(m[d]) on short time ranges.

Now the lookbehind window isn't extended if it is set explicitly,
since it is expected that the user knows what he is doing.

The lookbehind window continues to be extended when needed if it isn't set explicitly.
For example, in the case of rate(m).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3483
2022-12-20 00:18:20 -08:00
Aliaksandr Valialkin
eabb8762ee docs/Articles.md: add a link to https://www.youtube.com/watch?v=Mesc6JBFNhQ 2022-12-19 21:40:51 -08:00
Michal Kralik
fd53f86c84 build: fix issue with missing docker scan (#3501) 2022-12-19 15:22:45 -08:00
Aliaksandr Valialkin
680925f872 docs/CHANGELOG.md: add a warning for releases between v1.83.0 and v1.85.1 that it is recommended upgrading to v1.85.2 because of the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502
2022-12-19 13:36:49 -08:00
Aliaksandr Valialkin
d2ad184377 docs/CHANGELOG.md: consistently use YYYY-MM-DD format for release dates
The previously used DD-MM-YYYY format could be confused with the MM-DD-YYYY format.
The YYYY-MM-DD format reduces this confusion.
2022-12-19 13:33:05 -08:00
Aliaksandr Valialkin
944effca54 lib/storage: do not check for the result returned by db.doExtDB() where this isn't necessary
This simplifies the code a bit
2022-12-19 13:23:13 -08:00
Aliaksandr Valialkin
6530344a8f docs/CHANGELOG.md: cut v1.85.2 2022-12-19 13:09:29 -08:00
Aliaksandr Valialkin
9a0308ab32 vendor: make vendor-update 2022-12-19 13:08:13 -08:00
Aliaksandr Valialkin
0bf3ae9559 lib/promscrape/discovery/consul: expose service tags in individual labels __meta_consul_tag_<tagname>
This simplifies copying service tags to target labels with the following relabeling rule:

- action: labelmap
  regex: __meta_consul_tag_(.+)

See https://stackoverflow.com/questions/44339461/relabeling-in-prometheus
2022-12-19 13:08:11 -08:00
Aliaksandr Valialkin
6c98b56935 lib/storage: search for TSIDs for the given metricIDs in the previous indexdb if they aren't found in the current indexdb
The issue triggers after the indexdb rotation for time series, which stop receiving new samples.
This results in missing data for such time series in query responses.

This commit should address the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502

The issue has been introduced in 2dd93449d8
2022-12-19 12:03:09 -08:00
Aliaksandr Valialkin
dc0b08efb0 lib/storage: optimize partSearch.searchBHS() for common case when the TSID for the current block header is bigger or equal to the current tsid
This should help improving performance at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-19 10:28:03 -08:00
Aliaksandr Valialkin
057fb2120b lib/storage: properly set buf capacity inside marshalMetricID
Previously it was always set to 0. In theory this could result into incorrect marshaling
of metricIDs.

The issue has been introduced in 5e4dfe50c6
2022-12-19 10:14:38 -08:00
Aliaksandr Valialkin
4cb83f0f4a lib/logger: follow-up for 72f8fce107
- Document the change at docs/CHANELOG.md
- Log fatal errors if the -loggerJSONFields contains unexpected values
- Rename -loggerJsonFields to -loggerJSONFields for the sake of consistency naming commonly used in Go

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2348
2022-12-16 17:42:07 -08:00
Michal Kralik
72f8fce107 lib/logger: support for renaming json fields (#3488) 2022-12-16 17:26:32 -08:00
Aliaksandr Valialkin
56a9ea3753 docs/vmalert.md: mention latency_offset query arg, which has been added in 86dae56bd0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3481
2022-12-16 17:20:37 -08:00
Aliaksandr Valialkin
285841bcce app/vmselect/prometheus: follow-up after 86dae56bd0
Return error if the provided latency_offset query arg cannot be parsed.
This should simplify debugging in production.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3481
2022-12-16 17:13:20 -08:00
Aliaksandr Valialkin
65f8fc527f lib/promscrape: stop dropping metric name if relabeling rules do not instruct to do this on the /metric-relabel-debug page 2022-12-16 17:02:41 -08:00
Roman Khavronenko
86dae56bd0 vmselect: support overriding of -search.latencyOffset (#3489)
support overriding of `-search.latencyOffset` value via
URL param `latency_offset`.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3481

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-16 16:54:57 -08:00
Michal Kralik
07e9322157 build: nightly builds at 2:48am (#3490) 2022-12-16 16:46:24 -08:00
Roman Khavronenko
e40c7d6efa dashboards: respect $job var in sub-vars for cluster dash (#3487)
Previously, $job_select, $job_storage and $job_insert
didn't respect the $job filter. This change updates
the variable queries to account for set $job variable.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-16 09:53:32 +01:00
Aliaksandr Valialkin
17a244571b docs/keyConcepts.md: update the list of supported data ingestion protocols
- Add DataDog protocol
- Remove native protocol, since it isn't intended for general-purpose usage by external clients
2022-12-15 12:01:59 -08:00
Aliaksandr Valialkin
ad8852759d lib/storage: skip missing tsids in the current block header by using binary search
This improves performance by up to 10x when big number of the requested TSIDs
are missing in the searched parts.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-14 22:06:51 -08:00
Aliaksandr Valialkin
4de9d35458 lib/flagutil/bytes.go: properly handle values bigger than 2GiB on 32-bit architectures
This fixes handling of values bigger than 2GiB for the following command-line flags:

- -storage.minFreeDiskSpaceBytes
- -remoteWrite.maxDiskUsagePerURL
2022-12-14 19:26:31 -08:00
Aliaksandr Valialkin
5d30080555 lib/flagutil: support for TB and TiB suffixes for command-line flags, which accept byte sizes 2022-12-14 17:52:32 -08:00
Aliaksandr Valialkin
38341802c2 app/vmselect: add /expand-with-exprs page 2022-12-14 16:18:55 -08:00
Aliaksandr Valialkin
1896c47fee .wwhrd.yml: add ISC license, which is used by github.com/davecgh/go-spew
This license is compatible with Apache2

See https://github.com/davecgh/go-spew/blob/master/LICENSE
2022-12-14 14:55:46 -08:00
Aliaksandr Valialkin
231569f89b docs/vmagent.md: small formatting fix 2022-12-14 14:25:35 -08:00
Aliaksandr Valialkin
0c54ff20eb docs/CHANGELOG.md: fix the link to the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466 2022-12-14 14:17:17 -08:00
Aliaksandr Valialkin
76445bfd98 docs/CHANGELOG.md: add release date for v1.85.1 2022-12-14 14:12:07 -08:00
Aliaksandr Valialkin
09a70d3e90 vendor: make vendor-update 2022-12-14 12:13:54 -08:00
Aliaksandr Valialkin
ac5948b3f3 app/vmselect/vmui: make vmui-update 2022-12-14 12:01:48 -08:00
Aliaksandr Valialkin
ea44c39377 docs/CHANGELOG.md: cut v1.85.1 2022-12-14 11:57:49 -08:00
Aliaksandr Valialkin
e0009ec466 docs/Cluster-VictoriaMetrics.md: mention the vm_storage_is_read_only metric, which can help debugging readonly mode at vmstorage 2022-12-14 09:31:28 -08:00
Aliaksandr Valialkin
7a61bafe59 docs/vmagent.md: clarify that relabeling is actually a debugging at relabel debug section 2022-12-13 15:46:21 -08:00
Aliaksandr Valialkin
b89d862aa5 docs/CHANGELOG.md: document the bugfix at a50120a212 2022-12-13 09:36:24 -08:00
Zakhar Bessarab
a50120a212 lib/backup/azremote: fix copying for parts larger than 256M by using async copy (#3479)
* lib/backup/azremote: fix copying for parts larger than 256M by using async copy

* lib/backup/azremote: add description of an error for log message
2022-12-13 09:32:57 -08:00
Yury Molodov
d3418bafc0 fix: prevent run query when selecting autocomplete option (#3480) 2022-12-13 09:30:05 -08:00
Aliaksandr Valialkin
0d41d933e9 lib/mergeset: reduce the parts threshold before starting assisted merges
This should improve query speed in general case.

This is a follow-up for d1af6046c7
2022-12-13 09:13:49 -08:00
Aliaksandr Valialkin
33bff00890 app/vmselect/vmui: make vmui-update 2022-12-12 17:46:30 -08:00
Yury Molodov
cca3bc756b vmui: minor enhancements (#3471)
* update package-lock.json

* fix: correct handle click by "action" on cardinality page

* fix: correct styles for icons width

* feat: add layout with copyright

* feat: add website and issue to footer
2022-12-12 17:44:13 -08:00
Dima Lazerka
bde93844de Add Anomaly Detection to enterprise features list (#3476)
* Add Anomaly Detection to enterprise features list

* Update docs/enterprise.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-12-12 17:42:44 -08:00
Aliaksandr Valialkin
d1af6046c7 lib/{mergeset,storage}: do not block small merges by pending big merges - assist with small merges instead
Blocked small merges may result into big number of small parts, which, in turn,
may result in increased CPU and memory usage during queries, since queries need to inspect
all the existing small parts.

The issue has been introduced in 8189770c50

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-12 17:00:50 -08:00
Aliaksandr Valialkin
3b18931050 lib/bytesutil: cache results for all the input strings, which were passed during the last 5 minutes from FastStringMatcher.Match(), FastStringTransformer.Transform() and InternString()
Previously only up to 100K results were cached.
This could result in sub-optimal performance when more than 100K unique strings were actually used.
For example, when the relabeling rule was applied to a million of unique Graphite metric names
like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466

This commit should reduce the long-term CPU usage for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466
after all the unique Graphite metrics are registered in the FastStringMatcher.Transform() cache.

It is expected that the number of unique strings, which are passed to FastStringMatcher.Match(),
FastStringTransformer.Transform() and to InternString() during the last 5 minutes,
is limited, so the function results fit memory. Otherwise OOM crash can occur.
This should be the case for typical production workloads.
2022-12-12 14:41:13 -08:00
Zakhar Bessarab
ebcb4ab617 {app/{vmbackup/vmrestore}: update path example to use Azure terminology for consistency (#3475) 2022-12-12 22:17:22 +03:00
Roman Khavronenko
8286f9608f vmalert: support $for or .For template variables (#3474)
support `$for` or `.For` template variables  in alert's annotations.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3246

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-12 22:16:10 +03:00
Roman Khavronenko
eb275be99d dashboards: add VersionChange annotation (#3473)
The new annotation is hidden by default and suppose to show
component `short_version` label change on the panels.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-12 16:32:26 +01:00
Aliaksandr Valialkin
7ae744fce6 lib/protoparser/datadog: do not re-use previously parsed field values if they are missing in the currently parsed message
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3432
2022-12-11 13:09:25 -08:00
Aliaksandr Valialkin
4a4b3c2462 vendor: update github.com/klauspost/compress from v1.15.12 to v1.15.13 2022-12-11 02:10:51 -08:00
Aliaksandr Valialkin
9f642d10ff docs/CHANGELOG.md: cut v1.85.0 2022-12-11 02:01:11 -08:00
Aliaksandr Valialkin
38f8e8adc3 docs/CHANGELOG.md: document changes at v1.79.6 2022-12-11 01:51:36 -08:00
Aliaksandr Valialkin
d141cb28a6 docs/CHANGELOG.md: document 461158a437
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3427
2022-12-10 23:42:16 -08:00
Aliaksandr Valialkin
88597f187b app/{vmagent,vminsert}/datadog: make the host label optional in DataDog data ingestion protocol
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3432
2022-12-10 23:32:31 -08:00
Aliaksandr Valialkin
e272a0ec78 app/vmselect/promql: allow passing inf arg into functions, which accept numeric limit on the number of output time series
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3461
2022-12-10 22:47:47 -08:00
Aliaksandr Valialkin
b7aec1be4d docs/CHANGELOG.md: add a link to the issue related to reduced CPU and memory usage at vmalert
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3464

The related commit - b97bd01605
2022-12-10 22:21:19 -08:00
Aliaksandr Valialkin
19f20c0f4e vendor: make vendor-update 2022-12-10 21:46:16 -08:00
Aliaksandr Valialkin
b01607e3fb docs: clarify that single-node VictoriaMetrics also provides functionality for relabel debugging
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407
2022-12-10 20:49:52 -08:00
Aliaksandr Valialkin
a30ae502ef lib/promscrape: allow editing relabeling configs and labels at /target-relabel-debug page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407
2022-12-10 12:44:45 -08:00
Aliaksandr Valialkin
3e7276639e app/{vminsert,vmselect}: move the handler for /metric-relabel-debug from vminsert to vmselect to be consistent with the cluster version 2022-12-10 02:45:40 -08:00
Aliaksandr Valialkin
3f4cb9a142 docs: sync with cluster branch after 97b41e727c 2022-12-10 02:32:24 -08:00
Aliaksandr Valialkin
a8b8e23d68 lib/promscrape: implement target-level and metric-level relabel debugging
Target-level debugging is performed by clicking the 'debug' link at the corresponding target
on either http://vmagent:8429/targets page or on http://vmagent:8428/service-discovery page.

Metric-level debugging is perfromed at http://vmagent:8429/metric-relabel-debug page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407

See https://docs.victoriametrics.com/vmagent.html#relabel-debug
2022-12-10 02:09:44 -08:00
Aliaksandr Valialkin
6f0179405a app/vmalert: properly handle nil req passed to requestToCurl()
This fixes a panic in the TestAlertingRule_Exec_Negative test.
The panic has been introduced in the commit b97bd01605
2022-12-10 02:04:17 -08:00
Aliaksandr Valialkin
c5dd973f9c docs/url-examples.md: add missing whitespace for proper heading for the example on how to send data via OpenTSDB protocol 2022-12-09 17:33:44 -08:00
Aliaksandr Valialkin
765ee5f7ba docs/CHANGELOG.md: document b97bd01605 2022-12-09 11:49:29 -08:00
Aliaksandr Valialkin
ca59d3de59 app/vmalert: do not show system links at http://vmalert:8880/ page when it is requested via proxy
The system links are absolute, e.g. they start from `/`, so there are high chances
they won't work as expected when requested via proxy such as vmselect with -vmalert.proxyURL
command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3424
2022-12-09 11:46:28 -08:00
Roman Khavronenko
b97bd01605 vmalert: do not hold pointer to http.Request (#3467)
http.Request was used as a part of state struct
for generating the curl command when viewing the rule's
state changes.
It appears, that holding a referencing is far more expensive
than generating the curl command immediately.
On the test with 40k rules, this change reduces memory
and CPU usage by 50%.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-09 18:13:29 +03:00
Aliaksandr Valialkin
2406c0dcfd docs/CHANGELOG.md: document the bugfix at 05b42601c3
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3247
2022-12-08 18:35:28 -08:00
Zakhar Bessarab
05b42601c3 lib/promscrape/discovery/azure: remove API server from URL returned by azure (#3403)
* lib/promscrape/discovery/azure: remove API server from URL returned by azure

* lib/promscrape/discovery/azure: validate nextLink contains same URL as apiServer
2022-12-08 18:29:10 -08:00
Aliaksandr Valialkin
8434aa142d lib/querytracer: fix remaining tests after 49ebc48809 2022-12-08 18:18:06 -08:00
Aliaksandr Valialkin
5b9e6b9d24 lib/storage: follow-up after 7c0ae3a86a
- Update docs at https://docs.victoriametrics.com/#deduplication
- Optimize the deduplication loop a bit

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333
2022-12-08 18:16:57 -08:00
Roman Khavronenko
7c0ae3a86a lib/storage: keep sample with the biggest value on timestamp conflict (#3421)
The change leaves raw sample with the biggest value for identical
timestamps per each `-dedup.minScrapeInterval` discrete interval
when the deduplication is enabled.

```
benchstat old.txt new.txt
name                                         old time/op    new time/op    delta
DeduplicateSamples/minScrapeInterval=1s-10      817ns ± 2%     832ns ± 3%      ~     (p=0.052 n=10+10)
DeduplicateSamples/minScrapeInterval=2s-10     1.56µs ± 1%    2.12µs ± 0%   +35.19%  (p=0.000 n=9+7)
DeduplicateSamples/minScrapeInterval=5s-10     1.32µs ± 3%    1.65µs ± 2%   +25.57%  (p=0.000 n=10+10)
DeduplicateSamples/minScrapeInterval=10s-10    1.13µs ± 2%    1.50µs ± 1%   +32.85%  (p=0.000 n=10+10)

name                                         old speed      new speed      delta
DeduplicateSamples/minScrapeInterval=1s-10   10.0GB/s ± 2%   9.9GB/s ± 3%      ~     (p=0.052 n=10+10)
DeduplicateSamples/minScrapeInterval=2s-10   5.24GB/s ± 1%  3.87GB/s ± 0%   -26.03%  (p=0.000 n=9+7)
DeduplicateSamples/minScrapeInterval=5s-10   6.22GB/s ± 3%  4.96GB/s ± 2%   -20.37%  (p=0.000 n=10+10)
DeduplicateSamples/minScrapeInterval=10s-10  7.28GB/s ± 2%  5.48GB/s ± 1%   -24.74%  (p=0.000 n=10+10)
```

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-08 18:06:11 -08:00
Aliaksandr Valialkin
3019ec3da6 lib/querytracer: fix tests after 49ebc48809 2022-12-08 17:21:38 -08:00
Aliaksandr Valialkin
eeacbaf0b6 all: update Go builder from v1.19.3 to v1.19.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.19.4+label%3ACherryPickApproved
2022-12-08 16:41:24 -08:00
Aliaksandr Valialkin
56b8980915 lib/promscrape: allow using sample_limit and series_limit options in stream parsing mode
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3458
2022-12-08 16:33:38 -08:00
Aliaksandr Valialkin
f9730676d9 app/vmselect/searchutils: do not print flag name responsible for query timeout if the timeout isn't reached
This should make the log message more clear
2022-12-08 13:07:33 -08:00
Aliaksandr Valialkin
bce1c5d572 docs/Cluster-VictoriaMetrics.md: typo fix 2022-12-07 12:26:34 -08:00
Aliaksandr Valialkin
189217a069 docs/CHANGELOG.md: document the addition of file-based discovery of vmstorage nodes 2022-12-07 12:04:57 -08:00
Aliaksandr Valialkin
59430e4274 docs/Cluster-VictoriaMetrics.md: make docs-sync after 5de8330ce00adfc5ac794070d30a2617ddc14bf2 2022-12-07 12:02:38 -08:00
Aliaksandr Valialkin
49ebc48809 lib/querytracer: put the version of VictoriaMetrics in the first message of query trace
This should simplify further debugging, since the first thing to start the debugging by query trace
is to know the version of VictoriaMetrics, which produced this trace.
2022-12-07 09:46:39 -08:00
Roman Khavronenko
0b6b6d52bf dashboards: remove DataLinks from single version (#3456)
Those data links were copy&paste artifact from cluster version
and aren't needed on the dash.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-07 14:35:52 +01:00
Roman Khavronenko
9f1403db38 dashboards: add non-default flags panel for vmagent (#3453)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-07 12:22:20 +01:00
Roman Khavronenko
b9dc11612e alerts: remove show_at label for RequestErrorsToAPI alert (#3455)
Alert `RequestErrorsToAPI` could be permanently triggered due to
mistakes in clients configuration. However, such requests are unlikely
to cause VM health state change. So there is no need in displaying
this alert because there will be no correlation caused by it.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-07 14:19:50 +03:00
Denys Holius
f12dae130a deployment/docker: bump Grafana version to v9.2.7 (#3454)
see https://grafana.com/blog/2022/11/29/grafana-security-release-new-versions-with-high-severity-security-fix-for-cve-2022-31097/
2022-12-07 10:23:19 +01:00
Aliaksandr Valialkin
6183975d45 .github/ISSUE_TEMPLATE/bug_report.md: update the link to troubleshooting docs 2022-12-06 21:11:15 -08:00
Aliaksandr Valialkin
3f82e3fa36 docs: follow-up after e1bf2a85d0559d112908ce81597f3261d3a085c0
- Document the change at docs/CHANGELOG.md
- Run `make docs-sync` for copying app/vmgateway/README.md to docs/vmgateway.md
  in order to propagate docs' changes to https://docs.victoriametrics.com/vmgateway.html
2022-12-06 21:05:22 -08:00
Aliaksandr Valialkin
758e8a15fd app/vmselect: typo fixes in code comments 2022-12-06 20:58:16 -08:00
Aliaksandr Valialkin
35a3170d97 docs: document the addition of -storageNode.discoveryInterval command-line flag in VictoriaMetrics cluster enterprise
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3417
2022-12-06 19:57:06 -08:00
Aliaksandr Valialkin
5cc19b1f7e docs/Articles.md: change the link to How we tried using VictoriaMetrics and Thanos at the same time from Russian to English article 2022-12-06 16:30:10 -08:00
Roman Khavronenko
3dec847c93 vmalert: correctly return error for RW failures (#3452)
* vmalert: correctly return error for RW failures

By mistake, in 0989649ad0 the error
for remote write failures weren't return to user.
This change fixes it.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-06 15:36:46 +01:00
Aliaksandr Valialkin
9a7c36e645 docs/Articles.md: add a link to the talk "How do We Keep Metrics for a Long Time in VictoriaMetrics" 2022-12-06 00:55:38 -08:00
Aliaksandr Valialkin
50ea632bfe docs/Articles.md: link to a video for the talk 'VictoriaMetrics: scaling to 100 million metrics per second' 2022-12-06 00:53:05 -08:00
Aliaksandr Valialkin
06758650bf vendor: make vendor-update 2022-12-05 23:28:14 -08:00
Aliaksandr Valialkin
a40c50f4fe docs/CHANGELOG.md: document 1e0666abb4 2022-12-05 23:10:17 -08:00
Aliaksandr Valialkin
e2e341da9f app/vmselect/vmui: make vmui-update after 7645d9ae00 2022-12-05 23:07:08 -08:00
Pedro Gonçalves
1e0666abb4 Datadog - Add device as a tag if it's present as a field in the series object (#3431)
* Datadog - Add device as a tag if it's present as a field in the series object

* address PR comments
2022-12-05 23:06:03 -08:00
Aliaksandr Valialkin
caa1c43166 docs: follow-up for 7645d9ae00
- Document the change at docs/CHANGELOG.md
- Document the feature at https://docs.victoriametrics.com/#vmui

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3446
2022-12-05 22:50:41 -08:00
Yury Molodov
7645d9ae00 feat: add toggle query display by Ctrl (#3449) 2022-12-05 22:45:15 -08:00
Yury Molodov
01a9b36a95 vmui: timezone select (#3414)
* feat: add timezone selection

* vmui: provide feature timezone select

* fix: correct timezone with relative time

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-12-05 22:44:31 -08:00
Roman Khavronenko
71f0bbbe39 deployment: update the README (#3447)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-05 22:05:31 -08:00
Aliaksandr Valialkin
718d1d90b6 docs/CHANGELOG.md: document fd43b5bad0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3444
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3445
2022-12-05 22:01:42 -08:00
Yury Molodov
fd43b5bad0 vmui: fix multi-line query (#3448)
* fix: remove prevent nav by up/down keys for multi-line query

* fix: add query params encode in URL
2022-12-05 21:56:54 -08:00
Aliaksandr Valialkin
5eae9a9914 app/vmselect/promql: add range_trim_spikes(phi, q) function for trimming phi percent of largest spikes per each time series returned by q 2022-12-05 21:55:01 -08:00
Aliaksandr Valialkin
d99d222f0a lib/{storage,mergeset}: log the duration for flushing in-memory parts on graceful shutdown 2022-12-05 21:30:48 -08:00
Aliaksandr Valialkin
eed32b368c docs/vmctl.md: make docs-sync after 86c31f2955
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2930
2022-12-05 17:24:10 -08:00
Zakhar Bessarab
86c31f2955 app/vmctl: add option to migrate between clusters with automatic tenants discovery (#3450) 2022-12-05 17:18:09 -08:00
Aliaksandr Valialkin
f3e84b4dea {dashboards,alerts}: subtitute {type="indexdb"} with {type=~"indexdb.*"} inside queries after 8189770c50
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-05 16:00:22 -08:00
Aliaksandr Valialkin
8189770c50 all: add -inmemoryDataFlushInterval command-line flag for controlling the frequency of saving in-memory data to disk
The main purpose of this command-line flag is to increase the lifetime of low-end flash storage
with the limited number of write operations it can perform. Such flash storage is usually
installed on Raspberry PI or similar appliances.

For example, `-inmemoryDataFlushInterval=1h` reduces the frequency of disk write operations
to up to once per hour if the ingested one-hour worth of data fits the limit for in-memory data.

The in-memory data is searchable in the same way as the data stored on disk.
VictoriaMetrics automatically flushes the in-memory data to disk on graceful shutdown via SIGINT signal.
The in-memory data is lost on unclean shutdown (hardware power loss, OOM crash, SIGKILL).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-05 15:16:14 -08:00
Aliaksandr Valialkin
e509552e92 vendor: make vendor-update 2022-12-05 01:01:57 -08:00
Yury Molodov
461158a437 fix: add word-break for tooltip (#3437) 2022-12-05 08:50:34 +01:00
Roman Khavronenko
6801b37e53 dashboards: add Disk space usage % and Disk space usage % by type panels (#3436)
The new panels have been added to the vmstorage and drilldown rows.

`Disk space usage %` is supposed to show disk space usage percentage.
This panel is now also referred by `DiskRunsOutOfSpace` alerting rule.
This panel has Drilldown option to show absolute values.

`Disk space usage % by type` shows the relation between datapoints
and indexdb size. It supposed to help identify cases when indexdb
starts to take too much disk space.
This panel has Drilldown option to show absolute values.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-05 08:35:33 +01:00
Roman Khavronenko
91a8afa172 vmalert: reduce allocations for Prometheus resp parse (#3435)
Method `metrics()` now pre-allocates slices for labels
and results from query responses. This reduces the number 
of allocations on the hot path for instant requests.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-05 08:34:54 +01:00
Aliaksandr Valialkin
544ea89f91 lib/{mergeset,storage}: add start background workers via startBackgroundWorkers() function 2022-12-04 00:01:04 -08:00
Aliaksandr Valialkin
33dda2809b lib/mergeset: panic when too long item is passed to Table.AddItems() 2022-12-03 23:32:16 -08:00
Aliaksandr Valialkin
932c1f90ae lib/storage: remove duplicate logging for filepath on errors 2022-12-03 23:15:22 -08:00
Aliaksandr Valialkin
044a304adb lib/storage: pass a single arg - rowsPerBlock - to getCompressLevel() function instead of two args 2022-12-03 23:10:16 -08:00
Aliaksandr Valialkin
cb44976716 lib/{storage,mergeset}: use a single sync.WaitGroup for all background workers
This simplifies the code
2022-12-03 23:03:08 -08:00
Aliaksandr Valialkin
28e6d9e1ff lib/storage: properly pass retentionMsecs to OpenStorage() at TestIndexDBRepopulateAfterRotation 2022-12-03 23:02:10 -08:00
Aliaksandr Valialkin
343c69fc15 lib/{mergeset,storage}: pass compressLevel to blockStreamWriter.InitFromInmemoryPart
This allows packing in-memory blocks with different compression levels
depending on its contents. This may save memory usage.
2022-12-03 22:46:48 -08:00
Aliaksandr Valialkin
6d87462f4b lib/mergeset: use the given compressLevel for index and metaindex compression in in-memory part
Previously only data was compressed with the given compressLevel
2022-12-03 22:34:54 -08:00
Aliaksandr Valialkin
f3e3a3daeb lib/{mergeset,storage}: take into account byte slice capacity when returning the size of in-memory part
This results in more correct reporting of memory usage for in-memory parts
2022-12-03 22:30:36 -08:00
Aliaksandr Valialkin
c4150995ad lib/mergeset: reduce the time needed for the slowest tests 2022-12-03 22:26:33 -08:00
Aliaksandr Valialkin
45299efe22 lib/{storage,mergeset}: consistency rename: `flushRaw{Rows,Items} -> flushPending{Rows,Items} 2022-12-03 22:17:46 -08:00
Aliaksandr Valialkin
5ca58cc4fb lib/storage: optimization: do not scan block for rows outside retention if it is covered by the retention 2022-12-03 22:14:12 -08:00
Aliaksandr Valialkin
152ac564ab lib/storage: remove logging redundant path values in a single error message 2022-12-03 22:13:13 -08:00
Aliaksandr Valialkin
93764746c2 lib/filestream: remove logging redundant path values in a single error message 2022-12-03 22:01:51 -08:00
Aliaksandr Valialkin
4f28513b1a lib/fs: remove logging redundant path values in a single error message 2022-12-03 22:00:20 -08:00
Aliaksandr Valialkin
7c3c08d102 lib/backup: remove logging duplicate path values in a single error message 2022-12-03 21:55:06 -08:00
Aliaksandr Valialkin
14660d4df5 all: typo fix: the the -> the 2022-12-03 21:53:01 -08:00
Aliaksandr Valialkin
ddc3d6b5c3 lib/mergeset: drop the crufty code responsible for direct upgrade from releases prior v1.28.0
Upgrade to v1.84.0, wait until the "finished round 2 of background conversion" message
appears in the log and then upgrade to newer release.
2022-12-03 21:17:31 -08:00
Aliaksandr Valialkin
05c65bd83f lib/storage: speed up search for data block for the given tsids
Use binary search instead of linear scan for looking up the needed
data block inside index block.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-03 20:58:32 -08:00
Aliaksandr Valialkin
c1cd4a9101 docs/CHANGELOG.md: consistently add - prefix in front of command-line flags
This is a follow-up for bcba5d2a78
2022-12-02 19:08:26 -08:00
Aliaksandr Valialkin
b6712ac08e docs: follow-up after 30fea30685
- Run `make docs-sync`, so app/vmalert/README.md is copied to docs/vmalert.md
- Clarify the feature description in the docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3408
2022-12-02 19:03:11 -08:00
Aliaksandr Valialkin
299285b147 lib/storage: fix TestUpdateCurrHourMetricIDs test when it runs on the first hour of the day by UTC 2022-12-02 18:52:37 -08:00
Aliaksandr Valialkin
e9636b4c69 lib/{mergeset,storage}: re-use the code for removing isInMerge flag at parts
Move the common code into releasePartsToMerge() method and consistently use it throughout the code.
2022-12-02 18:52:37 -08:00
Denys Holius
54741f6f38 docs/Articles.md: fir broken link (#3433) 2022-12-02 10:35:40 +01:00
Roman Khavronenko
cd5c451ea3 docs: fix typo in cluster's README (#3430)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-01 16:24:16 +01:00
Zakhar Bessarab
cd2ac07195 make: make openssl output parsing symbol number agnostic (#3429) 2022-12-01 16:09:36 +01:00
Roman Khavronenko
bcba5d2a78 vmalert: fix replay step param (#3428)
The recent change in modifying default value
of `datasource.queryStep` flag resulted in situation
where replay mode was always running queries with
step=`datasource.queryStep`. When it should always
use rule's evaluation interval.

The fix is related not to replay mode only, but
for all Range requests. Now step param is set
individually for each mode.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-01 13:57:53 +01:00
Roman Khavronenko
f989c20dd7 dashboards: fix typo in data link (#3426)
Fixes a missing `&` char in data link for ETA panel
on cluster dashboards. Without `&` char it generates
wrong link when click on Drilldown menu.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-01 13:21:14 +01:00
Zakhar Bessarab
30fea30685 app/vmalert: add remoteWrite.sendTimeout command-line flag to configure timeout for sending data to remoteWrite.url (#3423)
* app/vmalert: add `remoteWrite.sendTimeout` command-line flag to configure timeout for sending data to `remoteWrite.url`

* vmalert: remove WriteTimeout from clients Cfg
No need to have it as a part of configuration struct:
* the client isn't used by other packages;
* there are no internal tests to check the WriteTimeout.

* vmalert: remove DisablePathAppend from clients Cfg
No need to have it as a part of configuration struct:
* the client isn't used by other packages;
* there are no internal tests to check the DisablePathAppend.

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2022-12-01 09:57:19 +01:00
Roman Khavronenko
8cc4f7eac6 vmalert: properly pass headers during the restore procedure (#3420)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3418

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-01 09:27:39 +01:00
Aliaksandr Valialkin
3cdff3de23 docs/vmagent.md: update after 959f06d175 2022-11-29 21:33:49 -08:00
Aliaksandr Valialkin
f325410c26 lib/promscrape: optimize service discovery speed
- Return meta-labels for the discovered targets via promutils.Labels
  instead of map[string]string. This improves the speed of generating
  meta-labels for discovered targets by up to 5x.

- Remove memory allocations in hot paths during ScrapeWork generation.
  The ScrapeWork contains scrape settings for a single discovered target.
  This improves the service discovery speed by up to 2x.
2022-11-29 21:26:00 -08:00
Aliaksandr Valialkin
c7ce4979ec all: follow-up after 05cf8a6ecc 2022-11-29 21:03:59 -08:00
Aliaksandr Valialkin
4822406b64 app/vmalert: substitute -datasource.disablePathAppend with -remoteRead.disablePathAppend in the description for -datasource.url command-line flag
This is a follow-up for 959f06d175
2022-11-29 20:36:41 -08:00
Aliaksandr Valialkin
295c84df66 lib/promscrape/discovery: add a benchmark for measuring the performance of creating pod meta-labels 2022-11-29 20:27:48 -08:00
Dmytro Kozlov
05cf8a6ecc vmctl: support of the remote read protocol (#3232)
vmctl: support of the remote read protocol

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2022-11-29 22:53:28 +01:00
Roman Khavronenko
bdd0683c4a dashboards: update VM single dash (#3400)
The change list is the following:
* bump Grafana version to 9.2.6;
* replace old "Graph" panel with "TimeSeries" panel;
* show % usage of Mem and CPU additionally to of absolute values;
* `Caches` row was removed. All needed info for caches is now part of `Troubleshooting`;
* add Annotations for Alert triggers. Not all alerts are supposed to be displayed
on the dashboard, but only those with label `show_at: dashboard`.
See `alerts.yml` change.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-29 19:28:22 +01:00
Roman Khavronenko
5d835a6d64 dashboards: update vmalert dash (#3404)
The change list is the following:
* bump Grafana version to 9.2.6;
* replace old Graph panel with TimeSeries panel;
* add RemoteWrite section;
* allow configuring topK elements for some of the panels;
* Preer grouping by job instead of grouping by instance.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-29 19:26:31 +01:00
Max Golionko
959f06d175 vmalert: flag reference update (#3415)
* flag reference update

there is no flag `-datasource.disablePathAppend` and datasource actually checking for `-remoteRead.disablePathAppend`

* update source for doc as well
2022-11-29 19:22:57 +01:00
Roman Khavronenko
7dfb01bd7b dashboards: update vmagent dash (#3411)
The change list is the following:
* bump Grafana version to 9.2.6;
* add version change annotations;
* switch to per-job panels instead of per-instance;
* add drilldown option for resource usage panels.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-29 19:22:13 +01:00
Zakhar Bessarab
2f837c1b23 doc/operator: update formatting for backup section, add FAQ section (#3416)
* doc/operator: update formatting for backup section, add FAQ section

* doc/operator: address review feedback

* doc/operator: add note about difference between `VMRestore` and `VMBackupmanager` as init containers

* Update docs/operator/backups.MD

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-11-29 19:19:46 +01:00
Aliaksandr Valialkin
0002de937b docs/CHANGELOG.md: document 027ab74efb
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3402
2022-11-28 18:55:01 -08:00
Aliaksandr Valialkin
5c906beea2 docs/url-examples.md: allow linking to how to send OpenTSDB/Graphite data chapters 2022-11-28 18:42:49 -08:00
Aliaksandr Valialkin
654e94f420 lib/promscrape: add exported_ prefix to metric names exported by scrape targets if they clash with automatically generated metrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3406
2022-11-28 18:37:09 -08:00
匠心零度
fa0ce10275 lib/storage: remove extra error check (#3396) 2022-11-28 16:43:31 -08:00
Aliaksandr Valialkin
090343ff50 docs/Articles.md: add a link to slides about scaling to 100 million metrics per second 2022-11-28 16:41:14 -08:00
Roman Khavronenko
31ff26065b dashboards: update VM cluster dash (#3401)
The change list is the following:
* bump Grafana version to 9.2.6;
* remove artifacts in data links.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-28 14:13:00 +01:00
Denys Holius
4b3479c003 deployment/docker: bump grafana version to latest v9.2.6 (#3398) 2022-11-28 10:31:10 +01:00
Timur Bakeyev
9ad578214e Update datasource entries consistently contain type prometheus and uid $ds. (#3393)
Co-authored-by: Timour I. Bakeev <tbakeev@ripe.net>
2022-11-28 08:37:39 +01:00
Aliaksandr Valialkin
fa308ae9f8 deployment/docker: update VictoriaMetrics tag from v1.83.1 to v1.84.0 2022-11-25 22:19:29 -08:00
Aliaksandr Valialkin
ad105147dd docs/CHANGELOG.md: cut v1.84.0 2022-11-25 19:53:29 -08:00
Aliaksandr Valialkin
e2a061b6a3 vendor: make vendor-update 2022-11-25 19:52:00 -08:00
Aliaksandr Valialkin
e014467f42 docs/README.md: make docs-sync after 58d459e8a8 2022-11-25 16:55:47 -08:00
Aliaksandr Valialkin
58d459e8a8 app/{vminsert,vmagent}: follow-up after 53a63c6c4c
Extend /api/v1/import/prometheus with the support for Pushgateway way of specifying additional labels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1415
2022-11-25 16:48:14 -08:00
Pedro Gonçalves
53a63c6c4c Adding pushgateway basic capabilities to vmagent (#3360)
* init pushgateway implementation

* Initial implementation of pushgateway in vmagent

* Initial implementation of pushgateway in vmagent
2022-11-25 16:35:01 -08:00
Zakhar Bessarab
8b6d528fbd {app/vmstorage,app/vmselect}: add API to get list of existing tenants (#3348)
* {app/vmstorage,app/vmselect}: add API to get list of existing tenants

* {app/vmstorage,app/vmselect}: add API to get list of existing tenants

* app/vmselect: fix error message

* {app/vmstorage,app/vmselect}: fix error messages

* app/vmselect: change log level for error handling

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-25 11:05:47 -08:00
Aliaksandr Valialkin
4ca44cfe9c app/vmselect/vmui: make vmui-update after 37cda9abd0 2022-11-25 07:33:09 -08:00
Yury Molodov
37cda9abd0 fix: change header settings (#3391) 2022-11-25 07:26:35 -08:00
Yury Molodov
1ab66186ca refactor: create Autocomplete component (#3390) 2022-11-25 07:25:35 -08:00
Roman Khavronenko
42e63fe0fd dashboards: cleanup & remove artifacts (#3387)
* some unexpected DS UIDs were removed;
* replace `$instance.*` filter with `$instance` since we respect
the instance port anyway;
* remove predefined datasource for `clusterbytenant`
in favour of datasource variable `ds`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-25 09:28:14 +01:00
Aliaksandr Valialkin
da13d36af9 app/vmselect/vmui: make vmui-update after eb772aa50e 2022-11-24 17:37:18 -08:00
Yury Molodov
eb772aa50e vmui: improve table view (#3377)
* vmui: add compact table view (#3365)

* feat: add compact table view

* fix: add overflow table

* fix: change table styles

* vmui: compact table view

* Update docs/CHANGELOG.md

Co-authored-by: Michal Kralik <michal.kralik@percona.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-24 17:33:07 -08:00
Aliaksandr Valialkin
399ed9a3b9 docs/MetricsQL.md: document that histogram_share() accepts optional boundsLabel arg 2022-11-24 17:27:30 -08:00
Aliaksandr Valialkin
045fec631b docs/vmagent.md: typo fix 2022-11-24 17:19:45 -08:00
Roman Khavronenko
3407006cdb dashboards: cluster dashboard update (#3380)
The purpose of the update is to make the dash more usable
for large installations with many instances. Panels which showed
metrics per-instance (Mem, CPU) now are showing metrics per-job or min/max/avg
aggregations in % instead. This supposed to help immediately to identify
resource shortage and remain usable for small and big installations.

For cases when detailed info is needed, to the bottom of the dashboard
a new row `Drilldown` was added. Panels like Mem or CPU now contain
a `data-link` named `Drilldown` (cis shown on line click) which takes
user to more detailed panel.

The change list is the following:
* bump Grafana version to 9.1.0;
* replace old "Graph" panel with "TimeSeries" panel;
* improve Uptime panel to show number of instances per job;
* show % usage of Mem and CPU instead of absolute values;
* `Caches` row was removed. All needed info for caches is now part of `Troubleshooting`;
* add `Drilldown` section for detailed resource usage;
* add Annotations for Alert triggers. Not all alerts are supposed to be displayed
on the dashboard, but only those with label `show_at: dashboard`.
See `alerts-cluster.yml` change.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-23 18:03:25 -08:00
Aliaksandr Valialkin
04bb2e14dd docs/Articles.md: add a link to "The cost of scale in Prometheus ecosystem" talk 2022-11-23 00:11:51 +02:00
Aliaksandr Valialkin
ccf9bb32ac app/vmselect/vmui: make vmui-update after 7dc2349913 2022-11-22 15:36:03 +02:00
Yury Molodov
7dc2349913 vmui: add set up series custom limits (#3368)
* feat: add set up series custom limits

* feat: add button for show series without limits

* fix: resolve merge conflicts
2022-11-22 15:31:17 +02:00
Aliaksandr Valialkin
633ad34eb7 vendor: make vendor-update 2022-11-22 11:26:16 +02:00
Aliaksandr Valialkin
b1622ad63e docs/Articles.md: add a link to https://www.youtube.com/watch?v=_zORxrgLtec (OSA Con 2022: Specifics of data analysis in Time Series Databases) 2022-11-22 01:08:21 +02:00
Aliaksandr Valialkin
9498f871e7 app/vminsert: add missing vm_relabel_config_* metrics after 03d88bc066
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3345
2022-11-22 00:47:49 +02:00
Roman Khavronenko
03d88bc066 vmagent: expose metrics for tracking config state (#3375)
Expose `vm_relabel_config_*` and `vm_promscrape_config_*` metrics
for tracking relabel and scrape configuration hot-reloads.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3345
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-22 00:38:43 +02:00
Aliaksandr Valialkin
2ddfde78c3 app/vmselect/vmui: make vmui-update after 7d1b3e7e14 2022-11-22 00:34:33 +02:00
Yury Molodov
7d1b3e7e14 vmui: add copy button to row on Table view (#3363)
* feat: add copy button to row on Table view

* vmui: add copy button to row on Table view

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-22 00:31:33 +02:00
Yury Molodov
f81072f9a7 vmui: minor fixes (#3361)
* fix: reset the value of the switches trace and cache

* fix: add cursor text for inputs

* fix: solve the Infinite loop of useFetchQuery.ts

* fix: change condition for show/hide autocomplete

* fix: add limit error length for input
2022-11-22 00:28:03 +02:00
Yury Molodov
82d254af08 vmui: sticky tooltip (#3376)
* feat: add ability to make tooltip "sticky"

* vmui: add ability to make tooltip "sticky"
2022-11-22 00:26:53 +02:00
Aliaksandr Valialkin
ee1479bac6 docs/CHANGELOG.md: link to the related issue for range_normalize() function 2022-11-21 23:27:00 +02:00
Aliaksandr Valialkin
d9c3a2b605 app/vmselect/promql: add range_normalize(q1, ..., qN) function for normalizing query results into [0..1] value range
This may be useful for analyzing correlation between time series with different value ranges
2022-11-21 23:25:00 +02:00
Aliaksandr Valialkin
95f0266558 lib/promscrape/discovery/gce: do not pass filter arg when discovering zones
The filter arg isn't supported by zones API in GCE.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3202
2022-11-21 22:32:05 +02:00
Aliaksandr Valialkin
05ed98c98b app/vmselect/promql: allow using SI and IEC suffixes in numeric values inside queries
For example, 10Ki is equivalent to 10*1024, while 5.3M is equivalent to 5.3*1000*1000
2022-11-21 21:27:55 +02:00
Aliaksandr Valialkin
2c9e403d5f app/vmselect/promql: properly return an empty result from limit_offset() if offset exceeds the number of inner time series
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3312
2022-11-21 16:47:37 +02:00
Roman Khavronenko
0b6f439b11 vmalert: bump alerting rules evaluation interval to reasonable 30s (#3374)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-21 15:23:23 +01:00
Aliaksandr Valialkin
b796a0dc3f app/vmselect/promql: optimize e1 op e2 when e1 returns an empty result
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3349
2022-11-21 16:09:10 +02:00
Roman Khavronenko
84742f229a vmalert: add default list of alerting rules (#3373)
The default list of alerting rules contains the basic
rules for checking vmalert's health state and is recommended
to use for monitoring vmalert deployments.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-21 14:45:45 +01:00
Aliaksandr Valialkin
20d758e3e4 all: add a link to https://docs.victoriametrics.com/enterprise.html into description for enterprise flags 2022-11-21 15:42:01 +02:00
Aliaksandr Valialkin
cb1a621d63 app/{vminsert,vmselect}: add -storageNode.filter command-line flag for filtering the discovered storage nodes
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3353
2022-11-21 15:20:14 +02:00
Aliaksandr Valialkin
65b4e96a80 docs/Cluster-VictoriaMetrics.md: sync with cluster branch 2022-11-18 14:06:23 +02:00
Aliaksandr Valialkin
a061d33400 app/vmselect: clarify that it isnt recommended setting -replicationFactor at vmselect nodes even if the replication is enabled at vminsert nodes 2022-11-18 14:04:53 +02:00
Aliaksandr Valialkin
cae0f37edd app/vmselect/netstorage: remove superflouos map lookup at ProcessSearchQuery
This should reduce CPU usage a bit during querying
2022-11-18 13:40:04 +02:00
Yury Molodov
519bd2af7b vmui: add trace analyzer (#3310)
* refactor: change structure project

* refactor: change structure project

* fix: add hooks for set query params

* refactor: add index for pages

* docs: add TESTCASES.md

* refactor: restructure components

* feat: add page with trace analyzer

* fix: change detect trace data

* Update app/vmui/packages/vmui/src/pages/TracePage/index.tsx

Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update app/vmui/packages/vmui/src/pages/TracePage/index.tsx

Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* fix: change descriptions on trace page

* Update app/vmui/packages/vmui/src/pages/TracePage/index.tsx

Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* feat: add base components

* feat: add reset styles

* docs: add description about trace analyzer

* feat: add styles for custom panel page

* feat: add styles for predefined panels

* feat: add style for TracingsView.tsx

* feat: add Alerts

* feat: add Tooltip.tsx

* fix: correct styles

* feat: add DatePicker.tsx

* feat: add tables

* feat: add theme provider

* fix: replace using callbacks as props to handlers

* fix: correct update time

* fix: change TimePicker.tsx

* fix: correct styles

* fix: update packages

* vmui: refactor code, remove material-ui

* feat: add paste json for trace analyzer

* vmui: update trace analyzer docs

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-17 22:22:01 +02:00
Aliaksandr Valialkin
e79bfdf4b8 docs/CHANGELOG.md: document the fix for CPU usage spikes
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-17 22:02:02 +02:00
Aliaksandr Valialkin
353396aa23 lib/workingsetcache: expose -cacheExpireDuration command-line flag for fine-tuning of the cache expiration
While at it, decrease -prevCacheRemovalPercent from 0.2 to 0.1 and increase -cacheExpireDuration from 20 minutes to 30 minutes.

This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-17 19:59:13 +02:00
Aliaksandr Valialkin
578bb58ea9 app/vmselect/vmui: make vmui-update after 51bfd1ab80 2022-11-17 18:54:55 +02:00
Yury Molodov
51bfd1ab80 vmui: add ability hide query (#3359)
* feat: add ability hide query

* fix: change logic hide query

* fix: remove console.log
2022-11-17 18:42:33 +02:00
dependabot[bot]
3ed238b75b build(deps): bump loader-utils in /app/vmui/packages/vmui (#3350)
Bumps [loader-utils](https://github.com/webpack/loader-utils) from 2.0.3 to 2.0.4.
- [Release notes](https://github.com/webpack/loader-utils/releases)
- [Changelog](https://github.com/webpack/loader-utils/blob/v2.0.4/CHANGELOG.md)
- [Commits](https://github.com/webpack/loader-utils/compare/v2.0.3...v2.0.4)

---
updated-dependencies:
- dependency-name: loader-utils
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-17 12:49:09 +01:00
Aliaksandr Valialkin
2c9017f6df vendor: make vendor-update 2022-11-17 01:38:29 +02:00
Dmytro Kozlov
fb65fb39d2 app/vmstorage: fix potential file inclusion via variable (#3339)
* app/vmstorage: fix potential file inclusion via variable

* app/vmstorage: cleanup
2022-11-17 01:29:43 +02:00
Aliaksandr Valialkin
a21c8e7b9a app/vmselect/vmui: make vmui-update after bc8a782f74 2022-11-17 01:15:45 +02:00
Yury Molodov
bc8a782f74 vmui/refactor (#3298)
* refactor: change structure project

* refactor: change structure project

* fix: add hooks for set query params

* refactor: add index for pages

* docs: add TESTCASES.md

* refactor: restructure components

* feat: add base components

* feat: add reset styles

* feat: add styles for custom panel page

* feat: add styles for predefined panels

* feat: add style for TracingsView.tsx

* feat: add Alerts

* feat: add Tooltip.tsx

* fix: correct styles

* feat: add DatePicker.tsx

* feat: add tables

* feat: add theme provider

* fix: replace using callbacks as props to handlers

* fix: correct update time

* fix: change TimePicker.tsx

* fix: correct styles

* fix: update packages

* vmui: refactor code, remove material-ui

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-17 01:09:14 +02:00
Aliaksandr Valialkin
a260e2659e app/vmselect/promql: add range_stdvar() and range_stddev() functions for calculating variance and deviation over time series on the selected time range 2022-11-17 01:03:40 +02:00
Aliaksandr Valialkin
c1a3192d8b app/vmselect/promql: add range_linear_regression(q) function for calculating simple linear regression for the selected time series on the selected time range 2022-11-17 00:38:48 +02:00
Aliaksandr Valialkin
5955d23232 lib/promscrape: add a benchmark for internLabelStrings() 2022-11-16 23:02:49 +02:00
Aliaksandr Valialkin
a75137c1c2 lib/mergeset: properly reset bsr.bhIdx after the call to blockStreamReader.readNextBHS()
The issue has been introduced in 58b40f514c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 21:23:35 +02:00
Aliaksandr Valialkin
c3362e3db4 lib/workingsetcache: add -prevCacheRemovalPercent command-line flag for tuning memory usage vs CPU usage ratio
Reduce the default value of this flag from 1% to 0.2% after 71335e6024

This flag should help determining the best ratio for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 12:39:39 +02:00
Aliaksandr Valialkin
4106f197f2 lib/mergeset: retain the buffer with the data used by indexBlock.bhs, inside indexBlock.buf
Previously indexBlock.bhs pointed to the buffer, which could be changed over time.
This could result in incorrect time series search over time.

This is a follow-up for 58b40f514c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 12:09:23 +02:00
Aliaksandr Valialkin
58b40f514c lib/mergeset: remove string allocation and copying when unmarshaling blockHeader
This should reduce CPU usage for the case from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-15 16:30:54 +02:00
Aliaksandr Valialkin
09b79d74a7 docs/CHANGELOG.md: document changes in v1.79.5 release 2022-11-11 01:27:45 +02:00
Aliaksandr Valialkin
99f187d9bc deployment/docker: update VictoriaMetrics version from v1.83.0 to v1.83.1 2022-11-11 01:24:40 +02:00
Aliaksandr Valialkin
bbe1a1472c docs/CHANGELOG.md: cut v1.83.1 2022-11-10 14:06:57 +02:00
Aliaksandr Valialkin
1b9dff133a docs/CHANGELOG.md: document the fix at 71335e6024 2022-11-10 13:46:51 +02:00
Aliaksandr Valialkin
2bcafbef25 vendor: make vendor-update 2022-11-10 13:46:33 +02:00
Aliaksandr Valialkin
71335e6024 lib/workingsetcache: tune cache miss threshold for resetting the previous cache from 5% to 1%
It has been appeared that some production workloads could suffer for some time
after every reset of the previous cache when it gets less than 5% of requests
after the needed item isn't found in the current cache. This could result
in reduced cache hit rates, which, in turn, could increase CPU, disk IO and RAM
usage needed for reading, unpacking and caching the missed data from disk.

This commit reduces the cache miss threshold for resetting the previous cache from 5% to 1%.
This should reduce the possible negative impact after each cache reset by at least 5x,
while reducing the total memory used by caches.

This is a follow-up for d906d8573e
2022-11-10 13:31:54 +02:00
Dmytro Kozlov
5ff6e0fb02 vmui: fix vmui vulnerability (#3336)
* vmui: fix vmui vulnerability

* vmui: code cleanup
2022-11-10 02:28:37 +01:00
Aliaksandr Valialkin
6c7361b1c5 app/vmselect/vmui: make vmui-update after 7130af7fd2 2022-11-09 16:43:18 +02:00
Aliaksandr Valialkin
86bce7f5f9 lib/promscrape: add more cases to TestAddRowToTimeseries
This is a follow-up for 16fdd2af8a
2022-11-09 16:13:56 +02:00
Jeremy PLANCKEEL
16fdd2af8a test(golang): add test to function addRowToTimeseries (#3282)
Co-authored-by: jplanckeel-externe <jplanckeel.externe@bedrockstreaming.com>
2022-11-09 15:41:26 +02:00
Aliaksandr Valialkin
b8839df32c lib/protoparser/opentsdb: follow-up after 04b0e4e7bf
- Simplify the parser code to be less error prone
- Document the change
- Add a test for OpenTSDB put line with trailing whitespace without tags

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3290
2022-11-09 15:35:05 +02:00
Roman Khavronenko
04b0e4e7bf protoparser/opentsdb: allow lines without tags (#3303)
According to http://opentsdb.net/docs/build/html/api_telnet/put.html
"At least one tag pair must be present".
However, in VictoriaMetrics datamodel tags aren't required.
This could be confusing for users. Allowing accept lines without
tags seems to do no harm.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3290
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-09 15:32:47 +02:00
Aliaksandr Valialkin
e17a1acf4a docs/CHANGELOG.md: document 7130af7fd2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2814
2022-11-09 12:15:49 +02:00
Michal Kralik
7130af7fd2 vmui: show tracing in json view (#3316)
* vmui: show tracing in json view

* vmui: refactor tracing view
2022-11-09 12:12:28 +02:00
dependabot[bot]
10791bf224 build(deps): bump loader-utils in /app/vmui/packages/vmui (#3328)
Bumps [loader-utils](https://github.com/webpack/loader-utils) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/webpack/loader-utils/releases)
- [Changelog](https://github.com/webpack/loader-utils/blob/v2.0.3/CHANGELOG.md)
- [Commits](https://github.com/webpack/loader-utils/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: loader-utils
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-09 12:11:13 +02:00
Denys Holius
aebe21e2c8 guides/README.md: fix link to guide-delete-or-replace-metrics.html (#3331) 2022-11-09 12:00:46 +02:00
Aliaksandr Valialkin
34aa3f6404 README.md: sync changes with 9f8bf524ad 2022-11-09 11:55:50 +02:00
Aliaksandr Valialkin
20046dab6e app/vmui/packages/vmui: return back accidental changes at 9f8bf524ad 2022-11-09 11:55:34 +02:00
Aliaksandr Valialkin
c973aca617 app/vminsert/netstorage: move nodesHash from global state to storageNodesBucket
This should prevent from panics when the list of discovered vmstorage nodes changes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3329
2022-11-09 11:51:10 +02:00
Roman Khavronenko
9f8bf524ad bump go version to 1.19.3 (#3327)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-08 16:43:59 +01:00
Michal Kralik
9b540bba6f vmui: change graph legend label format (#3315) 2022-11-08 15:16:24 +01:00
Zakhar Bessarab
91dd79f40f docs/operator: fix description for SA at VMClusterSpec (#3313) 2022-11-07 16:51:01 +01:00
Aliaksandr Valialkin
7fa5d043f5 lib/promscrape/discovery/consul: add __meta_consul_partition label in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/11482
2022-11-07 15:25:53 +02:00
Aliaksandr Valialkin
8332622037 vendor: update github.com/urfave/cli/v2 from v2.23.2 to v2.23.4 2022-11-07 14:58:35 +02:00
Aliaksandr Valialkin
daa70e6560 lib/storage: follow-up for 790768f20b
- Document the bugfix at docs/CHANGELOG.md
- Simplify the bugfix a bit
2022-11-07 14:04:08 +02:00
Aliaksandr Valialkin
f9dc3da9e2 lib/storage: typo fix after 32d48f8dfbb03174858c00bdfe6d9d22431dc8d8 2022-11-07 13:58:27 +02:00
Aliaksandr Valialkin
116811d761 lib/envtemplate: allow non-env var names inside "%{ ... }" 2022-11-07 13:58:27 +02:00
Aliaksandr Valialkin
dd88c628aa lib/storage: remove unused isFull field from hourMetricIDs struct 2022-11-07 13:58:26 +02:00
Łukasz Marszał
790768f20b Fix issue-3309 - currHourMetricIDs shouldn't contain metrics from prev hour (#3320)
* fix issue-3309 currHourMetricIDs shouldn't contain metrics from prev hour

* Update storage.go
2022-11-07 13:55:37 +02:00
Aliaksandr Valialkin
63d4cf661b vendor: make vendor-update 2022-11-05 10:34:35 +02:00
Aliaksandr Valialkin
d61691d5fa deployment/docker: update Go builder from v1.19.2 to v1.19.3
See https://github.com/golang/go/issues?q=milestone%3AGo1.19.3+label%3ACherryPickApproved
2022-11-05 10:19:54 +02:00
Aliaksandr Valialkin
23c79e2e49 docs/Single-server-VictoriaMetrics.md: mention about security certifications at Security chapter
This is a follow-up for 94bd49402e
2022-11-05 10:07:29 +02:00
Aliaksandr Valialkin
4ef5fe1317 docs/guides/guide-vmcluster-multiple-retention-setup.md: clarify docs after a75d85b11e 2022-11-05 10:02:48 +02:00
Artem Navoiev
94bd49402e docs: Add link to security page from Readme (#3286)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-11-04 09:49:12 +01:00
Dmytro Kozlov
99d8fcb332 docs/operator: change VMAgentRemoteWriteSettings.MaxDiskUsagePerURL type from int32 to int64 (#3307)
* docs/operator: change VMAgentRemoteWriteSettings.MaxDiskUsagePerURL type from int32 to int64

* docs: operator updates api description

Co-authored-by: f41gh7 <nik@victoriametrics.com>
2022-11-04 01:12:57 +01:00
Roman Khavronenko
ac4e23de39 vmctl: fix panic on start (#3300)
The change disables initing the `-version` flag in new
`urfave/cli/v2` update. The `-version` flag conflicts
with the identical flag from `lib/buildinfo` and causes panic.

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3299

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-01 19:50:22 +01:00
Artem Navoiev
a75d85b11e update multi-tenancy guide. Add infromation about Enterprise and Rete… (#3285)
docs: update multi-tenancy guide

Add information about Enterprise and Retention Filters

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-11-01 15:19:09 +01:00
Aliaksandr Valialkin
df88832c86 docs/Single-server-VictoriaMetrics.md: follow-up after a1a97b9321
- Remove trailing whitespace at the end of lines
- Remove redundant sentence stating that time series matching the given selector will be deleted.
  It should be clear from the surrounding context.
2022-11-01 10:49:10 +02:00
Aliaksandr Valialkin
a3dc324b19 vendor: update github.com/urfave/cli/v2 from 2.20.3 to 2.23.0 2022-11-01 10:46:26 +02:00
Dmytro Kozlov
a1a97b9321 docs: clarify information about usage of the api/v1/admin/tsdb/delete_series API (#3287)
docs: clarify information about usage of the `api/v1/admin/tsdb/delete_series` API
2022-11-01 09:36:28 +01:00
Aliaksandr Valialkin
a1011931ac docs/Release-Guide.md: instruct to update VictoriaMetrics version in deployment/docker/docker-compose*.yml files after creating new release
This is a follow-up for d1509f4559
2022-11-01 10:31:02 +02:00
Aliaksandr Valialkin
619b3c926d docs/enterprise.md: mention that feature requests from enterprise customers are prioritized 2022-11-01 10:28:12 +02:00
Denys Holius
d1509f4559 docker-compose: bump version of container tags for VictoriaMetrics components (#3294)
* deployment/docker/docker-compose-cluster.yml: bump VictoriaMetrics Cluster components to the latest v1.83.0 version

* deployment/docker/docker-compose.yml: bump VictoriaMetrics Single node and vmutils to the latest v1.83.0 version
2022-11-01 09:25:07 +01:00
Aliaksandr Valialkin
869e0f9f85 lib/promrelabel: go fmt after 5cec9706dc 2022-10-29 05:17:10 +03:00
Aliaksandr Valialkin
0f8f36de24 docs: typo fixes 2022-10-29 04:52:18 +03:00
Aliaksandr Valialkin
5cec9706dc lib/promrelabel: add a test from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3251
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3251
2022-10-29 04:33:38 +03:00
Aliaksandr Valialkin
740f7ac5e0 docs/CHANGELOG.md: cut v1.83.0 2022-10-29 02:54:54 +03:00
Aliaksandr Valialkin
6ac7b088b2 vendor: make vendor-update 2022-10-29 02:53:48 +03:00
Aliaksandr Valialkin
cdd3443806 app/vmbackupmanager: add functionality for automated restore from backup 2022-10-29 02:30:52 +03:00
Aliaksandr Valialkin
320ae1c60a lib/envflag: small refactoring after 518c340ae3 and 02096e06d0 2022-10-29 02:28:58 +03:00
Aliaksandr Valialkin
76e8888272 lib/promscrape: properly add exported_ prefix to labels, which clash with target labels if honor_labels: true option isn't set.
The issue was in the `labels := dst[offset:]` line in the beginning of appendExtraLabels() function.
The `dst` may be re-allocated when adding extra labels to it. In this case the addition of `exported_`
prefix to labels inside `labels` slice become invisible in the returned `dst` labels.

While at it, properly handle some corner cases:

- Add additional `exported_` prefix to clashing metric labels with already existing `exported_` prefix.
- Store scraped metric names in `exported___name__` label if scrape target contains `__name__` label.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3278

Thanks to @jplanckeel for the initial attempt to fix this issue
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3281
2022-10-28 22:14:26 +03:00
Aliaksandr Valialkin
454baf84d6 lib/promscrape/discovery/kubernetes: do not print an empty kubeconfig_file option in yaml at /config page 2022-10-28 22:14:25 +03:00
Aliaksandr Valialkin
9565dbed34 app/vmselect/vmui: make vmui-update after 54e1865d17 2022-10-28 14:51:23 +03:00
Yury Molodov
54e1865d17 vmui: minor fixes (#3276)
* feat: apply serverURL on down Enter

* fix: change method of set time range

* fix: remove prevent run fetch without changes

* fix: prevent reset timerange when autorefresh
2022-10-28 14:47:50 +03:00
Aliaksandr Valialkin
9aee303ca1 docs/Cluster-VictoriaMetrics.md: improve docs for dns+srv service discovery 2022-10-28 14:24:48 +03:00
Aliaksandr Valialkin
a72c5f76eb app/{vminsert,vmselect}: add support for automatic discovery and update of vmstorage nodes
Thanks to @dmitryk-dk for the initial implemenation at https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/446
2022-10-28 13:13:45 +03:00
Aliaksandr Valialkin
ad76a54c87 vendor: make vendor-update 2022-10-28 00:18:15 +03:00
Aliaksandr Valialkin
a018b1d75e app/vmalert/templates: properly escape all the special chars in quotesEscape function
Previously the `quotesEscape` function was escaping only double quotes.
This wasn't enough, since the input string could contain other special chars,
which must be escaped when put inside JSON string. For example, carriage return and line feed chars (\n\r),
backslash char, etc. This led to the following issues, which were improperly fixed:

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890 - this issue
  was "fixed" by introducing the `crlfEscape` function, which led to unnecessary
  complications in user templates, while not fixing various corner cases
  such as backslash chars in the input string.
  See 1de15ad490

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 - this issue
  was "fixed" by urlencoding the whole string passed to -external.alert.source
  command-line flag. This led to invalid urls, which couldn't be parsed by Grafana.
  See 00c838353d
  and 4bd0244599

This commit properly encodes the input string passed to `quotesEscape`, so it can be safely embedded inside JSON strings.

This commit deprecates crlfEscape template function and adds the following new template functions:

- strvalue and stripDomain - these functions are supported by Prometheus, so they were added
  for compatibility purposes.
- jsonEscape and htmlEscape for converting the input string to valid quoted JSON string
  and for html-escaping the input string, so it could be safely embedded as a plaintext
  into html.

This commit also documents all supported template functions at https://docs.victoriametrics.com/vmalert.html#template-functions
The deprecated crlfEscape function isn't documented on purpose, since its usefulness is negative in general case.
2022-10-28 00:01:16 +03:00
Aliaksandr Valialkin
4bd0244599 Revert "vmalert: escape query params if external alert source defined (#3267)"
This reverts commit 00c838353d.

Reason for revert: it incorrectly fixes the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3139 .
Now `-external.alert.source=explore?orgId=1&left=...` is converted to the following invalid url, which cannot be handled by Grafana:

https://grafana.example.com/explore%3ForgId%3D1%26left%3D...

The next commit will contain the correct fix of the issue - the `quotesEscape` function must
properly escape the string, so it could be embedded into JSON string. This function must
properly escape \n\r chars too. In this case the `crlfEscape` function becomes unnecessary.
Actually, the next commit makes the `crlfEscape` function deprecated.
2022-10-27 22:30:27 +03:00
Aliaksandr Valialkin
75e22ed3a4 vendor: make vendor-update 2022-10-27 20:21:13 +03:00
Denys Holius
b4e6460d2f .github/workflows/codeql-analysis.yml: specifically setting the Go version (#3277)
see https://github.com/github/codeql-action/issues/1059
2022-10-27 10:06:33 +02:00
Dmytro Kozlov
00c838353d vmalert: escape query params if external alert source defined (#3267)
vmalert: escape query args if external alert source defined
2022-10-26 10:00:14 -04:00
Aliaksandr Valialkin
518c340ae3 lib/envtemplate: allow referring env vars from other env vars via %{ENV_VAR} syntax
This is a follow-up for 02096e06d0
2022-10-26 14:49:33 +03:00
Aliaksandr Valialkin
3c66e45ef0 app/vmselect/vmui: make vmui-update after eae6063450 2022-10-26 02:50:46 +03:00
Yury Molodov
eae6063450 fix: change step setting field (#3270)
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-26 02:46:46 +03:00
Yury Molodov
794bd8b424 vmui: limit the number of decimal places to 10 characters (#3258)
* fix: limit the number of decimal places to 10 characters

* fix: add float numbers stabilizer
2022-10-26 02:43:13 +03:00
Yury Molodov
bc7456841f vmui: add responsive styles for small screens (#3256)
* fix: add responsive styles for small screens

* fix: correct additional settings margins

* docs/CHANGELOG.md: add responsive styles
2022-10-26 02:39:54 +03:00
Aliaksandr Valialkin
685433b8da docs/Cluster-VictoriaMetrics.md: update docs about env vars usage in command-line flags
This is a follow-up for 02096e06d0
2022-10-26 01:54:15 +03:00
Aliaksandr Valialkin
02096e06d0 lib/envflag: allow referring environment variables in command-line flags 2022-10-26 01:52:05 +03:00
Aliaksandr Valialkin
5d82e7d64a docs: run make docs-sync after 2ed3d49c26 2022-10-26 01:14:31 +03:00
omahs
2ed3d49c26 Fix: typos (#3269)
Fix: typos
2022-10-26 01:12:54 +03:00
Aliaksandr Valialkin
c4265322f4 lib/fs: add canOverwrite arg to WriteFileAtomically when it is allowed to overwrite the file atomically if it already exists 2022-10-26 01:07:34 +03:00
Aliaksandr Valialkin
db8abd000e docs/Makefile: fix docs-up Makefile command after 9ccd22c1f6 2022-10-25 17:53:24 +03:00
Aliaksandr Valialkin
d9bbf24183 app/{vminsert,vmselect}/netstorage: allow calling Init()+MustStop() in a loop
Previously netstorage.MustStop() call didn't free up all the resources,
so the subsequent call to nestorage.Init() would panic.

This allows writing tests, which call nestorage.Init() + nestorage.MustStop() in a loop.
2022-10-25 17:47:17 +03:00
Denys Holius
9ccd22c1f6 Docs: add guide "How to delete and replace metrics" (#2829)
docs: add guide how to delete and replace metrics
2022-10-25 09:02:14 -04:00
Aliaksandr Valialkin
b7882dc9af app/vmselect/vmui: make vmui-update after 274e235bf7 2022-10-24 21:29:13 +03:00
Aliaksandr Valialkin
15849cb571 docs/guides/migrate-from-influx.md: properly display images at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/guides/migrate-from-influx.md
This is a follow-up for 42375679db
2022-10-24 21:28:31 +03:00
Nguyen Van Duc
42375679db Fix images not display on key concepts document (#3266) 2022-10-24 21:22:41 +03:00
Aliaksandr Valialkin
b1324631b1 docs/CHANGELOG.md: document 274e235bf7
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3240
2022-10-24 21:02:17 +03:00
Yury Molodov
274e235bf7 vmui: optimize memory (#3255)
* fix: change series limit logic

* fix: remove spread operators
2022-10-24 20:51:24 +03:00
Yury Molodov
59199a98dd vmui: extend options for app mode (#3252)
* feat: add vmui customization for dbaas

* feat: extends vmui customization for dbaas

* fix: move input tenandId after query

* fix: change serverURL when changing tenantID

* fix: remove options

* docs: add options description
2022-10-24 20:45:41 +03:00
Aliaksandr Valialkin
c52c23c272 docs/enterprise.md: describe all the enteprise features in a short doc at https://docs.victoriametrics.com/enterprise.html 2022-10-24 18:02:03 +03:00
Aliaksandr Valialkin
cac28ae0ae docs/CHANGELOG.md: typo fixes 2022-10-24 16:59:55 +03:00
Aliaksandr Valialkin
8e998aa1a1 lib/storage: add support for retention filters (aka multiple retentions for distinct sets of time series)
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289
2022-10-24 16:40:20 +03:00
Aliaksandr Valialkin
d7e657e5f9 docs/CHANGELOG.md: document 69b27275d2b2bf1bdae0d8b887b44bde2a787649 2022-10-24 16:08:15 +03:00
Aliaksandr Valialkin
692af2d17c vendor: make vendor-update 2022-10-24 15:47:01 +03:00
Aliaksandr Valialkin
dba218a8ce lib/storage: skip blocks outside the configured retention during search
Blocks outside the configured retention are eventually deleted during background merge.
But such blocks may reside in the storage for long time until background merge.
Previously VictoriaMetrics could spend additional CPU time on processing such blocks
during search queries. Now these blocks are skipped.
2022-10-24 02:52:44 +03:00
Aliaksandr Valialkin
e2f0b76ebf lib/storage: do not pass retentionMsecs and isReadOnly args explicitly - access them via Storage arg
This makes code easier to read.

This is a follow-up after d2d30581a0
2022-10-24 01:31:04 +03:00
Aliaksandr Valialkin
89a1108b1a lib/storage: small code cleanups 2022-10-24 01:17:47 +03:00
Aliaksandr Valialkin
05512fdd74 lib/storage: re-use newTestStorage() instead of manually initializing Storage mock
This is a follow-up for d2d30581a0
2022-10-23 16:24:00 +03:00
Aliaksandr Valialkin
d2d30581a0 lib/storage: pass Storage to table and partition instead of getDeletedMetricIDs callback
This improves code readability a bit.
2022-10-23 16:10:04 +03:00
Aliaksandr Valialkin
54f35c175c lib/storage: small refactoring: move retentionDeadline to blockStreamMerger
This allows defining per-block retention in the future by updating the getRetentionDeadline function
2022-10-23 16:10:02 +03:00
Aliaksandr Valialkin
187e294a53 lib/storage: use a single reference to the currently merged block - bsm.Block during the block merge loop 2022-10-23 14:08:57 +03:00
Aliaksandr Valialkin
d0a9ca1bc2 lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms 2022-10-23 12:48:00 +03:00
Aliaksandr Valialkin
5e4dfe50c6 lib/storage: subsitute searchTSIDs functions with more lightweight searchMetricIDs function
The searchTSIDs function was searching for metricIDs matching the the given tag filters
and then was locating the corresponding TSID entries for the found metricIDs.

The TSID entries aren't needed when searching for time series names (aka MetricName),
so this commit removes the uneeded TSID search from the implementation of /api/v1/series API.
This improves perfromance of /api/v1/series calls.

This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls,
since now these calls cache small metricIDs instead of big TSID entries
in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs)
without the need to compress the saved entries in order to save cache space.

This commit also removes concurrency limiter during searching for matching time series,
which was introduced in 8f16388428, since the concurrency
for all the read queries is already limited with -search.maxConcurrentRequests command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2022-10-23 12:23:47 +03:00
Aliaksandr Valialkin
a10647e0bf app/vmselect/promql: expose missing metric vm_cache_size_max_bytes{type="promql/rollupResult"} 2022-10-23 12:13:47 +03:00
Aliaksandr Valialkin
4128ad71e2 lib/storage: move common code to newRawRowsBlock() function 2022-10-21 14:46:55 +03:00
Aliaksandr Valialkin
b5674164c6 lib/storage: simplify code a bit after 3f5959c053 2022-10-21 14:39:27 +03:00
Aliaksandr Valialkin
fd7c86ae25 lib/{mergeset,storage}: simplify the code a bit after ae55ad8749 2022-10-21 14:33:03 +03:00
Aliaksandr Valialkin
99d67ac8ad lib/storage: validate timestamps in the block only if they use encoding, which needs validation
This reduces CPU usage when there is no sense in validating timestamps.

This is a follow-up for 5fa9525498

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011
2022-10-21 00:52:32 +03:00
Aliaksandr Valialkin
3f5959c053 lib/storage: try generating initial parts from inmemory rows with identical sizes under high ingestion rate
This should improve background merge rate under high load a bit
2022-10-20 23:28:24 +03:00
Aliaksandr Valialkin
891ff6af2a lib/workingsetcache: increase default cache expiration from 10 minutes to 20 minutes
This increases the maximum time for cache population with new entries from 20 minutes to 40 minutes.
This

This change shouldn't increase memory usage for caches, since the prev cache cleaner
should free up memory by deleting unused prev cache as soon as possible.
See 08ca45d238 for details on prev cache cleaner.
2022-10-20 21:48:25 +03:00
Aliaksandr Valialkin
08ca45d238 lib/workingsetcache: move the cleaner for the prev cache into a separate goroutine
This makes the code more clear after d906d8573e
2022-10-20 21:45:29 +03:00
Aliaksandr Valialkin
4cd173bbaa lib/procutil: stop immediately after receiving the second SIGINT or SIGTERM signal
Previously VictoriaMetrics apps could stop responding to SIGINT and SIGTERM signals
if they hang for some reason in graceful shutdown procedure.
2022-10-20 21:40:20 +03:00
Aliaksandr Valialkin
150e99d403 lib/{mergeset,storage}: avoid unaligned 64-bit atomic operation panic on 32-bit platforms
The panic has been introduced in 68f3a02589

While at it, add padding to shard structs in order to avoid false sharing on mordern CPUs

This should improve scalability on systems with many CPU cores
2022-10-20 16:25:43 +03:00
Aliaksandr Valialkin
d906d8573e lib/workingsetcache: drop the previous cache whenever it recieves less than 5% of requests comparing to the current cache
This means that the majority of requests are successfully served from the current cache,
so the previous cache can be reset in order to free up memory.
2022-10-20 10:47:58 +03:00
Aliaksandr Valialkin
817aeafd69 lib/workingsetcache: use per-bucket stats counters instead of global stats counters for cache hits/misses
This should improve cache scalability on systems with many CPU cores.
2022-10-20 09:12:17 +03:00
Aliaksandr Valialkin
9c02c39487 lib/workingsetcache: randomize interval for swapping curr and prev caches
This should make CPU usage smoother over time, since different caches
will be swapped at different times.
2022-10-20 08:42:43 +03:00
Aliaksandr Valialkin
cba9696a14 docs/CHANGELOG.md: move the BUGFIX line for 1059c4d84a into correct place 2022-10-18 20:38:57 +03:00
Nikolay
1059c4d84a lib/promscrape/discovery/kubernetes: correctly wrap error (#3250)
* lib/promscrape/discovery/kubernetes: correctly wrap error
follow-up after 1304824201

* Update docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-18 20:37:42 +03:00
Roman Khavronenko
4e0ea95f26 vmalert: lower severity level for RW retries (#3237)
The message about dropped data still remains at `error` level.
The change supposed to make log message more clear about how
serious it is.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-18 14:27:20 +02:00
Aliaksandr Valialkin
b4f7243110 vendor: make vendor-update 2022-10-18 10:56:38 +03:00
Aliaksandr Valialkin
3bb3893b2d app/vmselect/vmui: make vmui-update after 080562030d 2022-10-18 10:50:35 +03:00
Yury Molodov
080562030d fix: remove rounding of axis limits (#3238) 2022-10-18 10:47:58 +03:00
Aliaksandr Valialkin
069401a304 all: log error when environment variables referred from -promscrape.config are missing
This should prevent from using incorrect config files
2022-10-18 10:47:16 +03:00
Aliaksandr Valialkin
fb50730ba7 lib/storage: double the number of rawRows shards on multi-core systems
This should increase data ingestion scalability on multi-core systems at the cost of slightly higher memory usage
2022-10-17 18:19:51 +03:00
Aliaksandr Valialkin
ae55ad8749 lib/{storage,mergeset}: do not hold per-shard lock in fast path when adding per-shard items to the flush list 2022-10-17 18:01:26 +03:00
Aliaksandr Valialkin
b6e8c1403a lib/promrelabel: add relabeling tests when the source label is missing 2022-10-17 14:47:52 +03:00
Aliaksandr Valialkin
2b2c58ecf8 vendor: make vendor-update 2022-10-14 15:16:03 +03:00
Aliaksandr Valialkin
646fb17237 docs/MetricsQL.md: document that input histograms must have the same set of buckets when calculating the quantile over them
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3231
2022-10-14 13:59:58 +03:00
Aliaksandr Valialkin
adf3419699 docs/CHANGELOG.md: add a note that it is recommended to use v1.82.1 instead of v1.82.0 2022-10-14 13:41:01 +03:00
Aliaksandr Valialkin
3a0c69651a docs/CHANGELOG.md: release v1.82.1 2022-10-14 11:33:25 +03:00
Aliaksandr Valialkin
2e3be68617 lib/bytesutil: make sure that the string passed to FastStringMather.Match() is copied before using it as a key in the internal cache map
This prevents from possible corruption of the internal cache map
when the underlying byte slice used by the string key is modified.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3227
2022-10-14 09:51:19 +03:00
Denys Holius
b1de74a6ba k8s-monitoring-via-vm-cluster.md: remove extra spaces (#3234) 2022-10-13 17:25:35 +02:00
Yury Molodov
ff6151fa49 vmui: limit number of plotted series (#3229)
* feat: add maximum display series by tabs

* feat: add warning on PredefinedPanels.tsx

* docs/CHANGELOG.md: vmui limit number of plotted series

* docs/CHANGELOG.md: vmui limit number of plotted series

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-13 12:13:47 +03:00
Aliaksandr Valialkin
92f7fe306e app/vmagent/remotewrite: typo fix after c914e4dace 2022-10-13 12:04:10 +03:00
Aliaksandr Valialkin
e6fd33044f app/vmselect/promql: follow-up for 930f1ee153
Document the change at docs/CHANGELOG.md
Apply it to histogram_quantile() in the same way as to histogram_share()

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3225
2022-10-13 12:03:08 +03:00
Siqi Liu
930f1ee153 BUGFIX: properly calculate histogram_quantile with the same value and different string le (#3225)
Co-authored-by: 647(siki.liu) <siki.liu@huolala.cn>
2022-10-13 11:57:16 +03:00
Aliaksandr Valialkin
76e275ddef docs/CHANGELOG.md: document b856581ad3 2022-10-13 10:35:48 +03:00
Nikolay
b856581ad3 lib/backup: set s3 default region to us-west-2 (#3224)
* lib/backup: set s3 default region to us-west-2
it should fix an error with region detection for bucket, if AWS_REGION env var is not set

* Update lib/backup/s3remote/s3.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-13 10:30:07 +03:00
Dan Fredell
42ce4364fc Multi retention standardization of docs (#3223)
* Multi retention standardization of docs

While reading through the docs the implementation details had different formatting for storageNode. This standardizes them and adds a link to the retention docs.

* Update docs/guides/guide-vmcluster-multiple-retention-setup.md

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2022-10-13 10:27:16 +03:00
Aliaksandr Valialkin
c914e4dace app/vmagent/remotewrite: typo fix after 50f5eae0e0 2022-10-13 10:19:02 +03:00
Roman Khavronenko
96a106eab2 vmalert: update troubleshooting docs (#3228)
The default value of `-datasource.queryStep` has changed, so we update
the troubleshooting docs accordingly.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-12 12:12:37 +02:00
Aliaksandr Valialkin
185cff307b lib/mergeset: mention in the error message the path to the part, which triggered the error
This should improve debuggability
2022-10-12 09:54:21 +03:00
Aliaksandr Valialkin
cfae887c75 docs/CHANGELOG.md: document a4975ace86
The original commit, which led to the issue - 877940a131
2022-10-12 09:30:36 +03:00
Aliaksandr Valialkin
b8da90b893 app/vmselect/promql: properly handle zero and negative values for -search.maxMemoryPerQuery
This is a follow-up for 04a05f161c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-12 09:25:17 +03:00
Aliaksandr Valialkin
41925a9500 docs/CHANGELOG.md: document 9544b5cacf7446203fac19eb7c575779dc9b280e 2022-10-12 09:25:17 +03:00
Roman Khavronenko
a4975ace86 vmalert: revert unexpected fileds rename during refactoring (#3222)
Due to auto-refactoring, the filed `state` was automatically
renamed to `ruleState` when the entity with the same name
was renamed in other file. Reverting the change.

https://github.com/VictoriaMetrics/helm-charts/issues/391
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-11 12:37:47 +02:00
Aliaksandr Valialkin
b7887c426b vendor: make vendor-update 2022-10-10 22:03:45 +03:00
Aliaksandr Valialkin
a79272db1f docs/vmbackup.md: run make docs-sync after 076e721c22 2022-10-10 21:57:46 +03:00
Zakhar Bessarab
076e721c22 doc: describe usage of env variables for obtaining credentials (#3219) 2022-10-10 21:56:46 +03:00
Aliaksandr Valialkin
875abf0ef4 docs/CHANGELOG.md: document e384d88abf 2022-10-10 21:52:06 +03:00
Aliaksandr Valialkin
04a05f161c app/vmselect: return back the logic for limits the amounts of memory occupied by concurrently executed queries if -search.maxMemoryPerQuery isn't set
This is needed for preserving backwards compatibility with the previous releases of VictoriaMetrics.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-10 21:45:13 +03:00
Howie
e384d88abf fix issue#3053 (#3182)
vmalert: prevent duplicating label `alertname` for notifications

The issue has no impact on alerting procedure. But still needs to be fixed
for clarity. 

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3053

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-10-10 09:44:58 +02:00
Aliaksandr Valialkin
921918cb49 docs/sd_configs.md: document __scrape_timeout__, __scrape_interval__ and __series_limit__ labels 2022-10-09 15:05:30 +03:00
Aliaksandr Valialkin
50f5eae0e0 lib/promrelabel: remove unconditional sorting of the labels in ParsedConfigs.Apply(), since the sorting isnt needed in many places
Sort labels explicitly after calling the ParsedConfigs.Apply() when needed.

This reduces CPU usage when performing metric-level relabeling, where labels' sorting isn't needed.
2022-10-09 14:51:16 +03:00
Aliaksandr Valialkin
8e1ccecd97 docs: mention -search.maxMemoryPerQuery in the description to -search.maxConcurrentQueries command-line flag
This is a follow-up for 5138eaeea0

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-09 13:56:59 +03:00
Aliaksandr Valialkin
4f4d591ccb docs/vmbackupmanager.md: update docs after adding the support to make backups to Azure blob storage
This is a follow-up for 262ce77e2d
2022-10-08 10:30:39 +03:00
Aliaksandr Valialkin
8f8ce5e238 docs: add description for -search.maxMemoryPerQuery command-line flag 2022-10-08 01:16:55 +03:00
Aliaksandr Valialkin
5138eaeea0 app/vmselect: allow limiting per-query memory usage via -search.maxMemoryPerQuery command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-08 01:08:05 +03:00
Aliaksandr Valialkin
3aafbc3624 docs/CHANGELOG.md: add a link to a feature request for the feature, which allows specifying full scrape urls in targets list and in the __address__ label
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3208
2022-10-07 23:45:57 +03:00
Aliaksandr Valialkin
5269b1ad77 lib/promscrape: allow controlling staleness tracking on a per-scrape_config basis
Add support for no_stale_markers option at scrape_config section.
See https://docs.victoriametrics.com/sd_configs.html#scrape_configs and
https://docs.victoriametrics.com/vmagent.html#prometheus-staleness-markers
2022-10-07 23:36:14 +03:00
Aliaksandr Valialkin
93811da76d docs/CHANGELOG.md: document the 27ed4b853e
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3169
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3196#issuecomment-1269765205
2022-10-07 23:06:32 +03:00
Yury Molodov
27ed4b853e vmui: auto-update chart after query field removed (#3210)
* feat: run query after query field removed

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-07 23:02:41 +03:00
Aliaksandr Valialkin
b47caa86db all: update the minimum required Go verson from 1.19.1 to 1.19.2
This is needed because of security vulnerabilities found in Go 1.19.1
See https://go.dev/doc/devel/release#go1.19.2
2022-10-07 22:43:37 +03:00
Aliaksandr Valialkin
f9df0cae16 lib/promscrape: allow specifying full target url in __address__ label
Previously the `__address__` label could contain only `host:port` part of the target url,
while the scheme and metrics path were obtained from `__scheme__` and `__metrics_path__`
labels. Now it is possible to set the full url in `__address__` label.

This makes valid the following scrape config, which is frequently used by novice users:

scrape_configs:
- job_name: foo
  static_configs:
  - targets:
    - http://host1/metrics1
    - https://host2/metrics2
2022-10-07 22:43:04 +03:00
Aliaksandr Valialkin
8322760647 app/vmselect/vmui: make vmui-update after a54987f671 2022-10-07 03:31:17 +03:00
Aliaksandr Valialkin
8bc840358f docs/CHANGELOG.md: cut v1.82.0 2022-10-07 03:15:23 +03:00
Aliaksandr Valialkin
7b6ce3f75e docs/CHANGELOG.md: typo fix 2022-10-07 03:11:22 +03:00
Aliaksandr Valialkin
ba4050ab1f docs/CHANGELOG.md: add a changelog for v1.79.4 LTS release (copied from lts-1.79 branch) 2022-10-07 02:50:59 +03:00
Aliaksandr Valialkin
897e9ef427 go.mod: go mod tidy 2022-10-07 01:23:52 +03:00
Aliaksandr Valialkin
711698b858 lib/backup/azremote: typo fixes after 03872025b747fcc4ee98710ad10fc98764328511 2022-10-07 01:02:06 +03:00
Zakhar Bessarab
176f10f5b2 app/vmbackup: fix compatibility with latest azure sdk (#461) 2022-10-07 01:02:03 +03:00
Aliaksandr Valialkin
0cea525456 vendor: make vendor-update 2022-10-07 01:01:21 +03:00
Aliaksandr Valialkin
285e92706d docs/MetricsQL.md: formatting improvements
Also put a link to the function type in docs for every function.
This should simplify understanding of MetricsQL functions for novice users.
2022-10-07 00:53:14 +03:00
Aliaksandr Valialkin
f452c84579 app/vmselect/promql: properly calculate vm_rows_scanned_per_query histogram for rollup functions, which take into account only a few samples on the provided lookbehind window 2022-10-06 23:22:24 +03:00
Aliaksandr Valialkin
40e899fd67 app/vmselect/promql: properly calculate quantiles_over_time() over a single raw sample 2022-10-06 22:37:21 +03:00
Aliaksandr Valialkin
8ae713253e docs/CHANGELOG.md: remove duplicate description of the bugfix, which has been included in v1.81.2 2022-10-06 15:53:05 +03:00
Aliaksandr Valialkin
ecd2f7451b Makefile: remove docs/*.tmp files after running sed command there 2022-10-06 15:24:56 +03:00
Aliaksandr Valialkin
9acf1845f4 Makefile: add missing "=" char between "-i" flag and its value for sed
This is a follow-up after 78af27f955
2022-10-06 15:06:42 +03:00
Roman Khavronenko
78af27f955 docs: when modifying docs in place allow storing backups (#3205)
The stored backups would help to identify docs corruption
but aren't needed for commiting. So `.tmp` backup files
are also git-ignored.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-06 15:04:25 +03:00
Roman Khavronenko
5b10fa87b2 vmalert: fix misleading line regarding multitenancy (#3206)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-06 15:03:45 +03:00
Aliaksandr Valialkin
8d7910a463 docs/CHANGELOG.md: add missing link to /api/v1/export docs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3161
2022-10-06 15:02:13 +03:00
Aliaksandr Valialkin
d9282027e6 app: follow-up after ec04fcac93
* Optimize fast path for /api/v1/import when importing numeric values
* Move the docs about the change from features to bugfixes at docs/CHANGELOG.md
* Update tests at lib/protoparser/vmimport

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3161
2022-10-06 14:52:02 +03:00
Dmytro Kozlov
ec04fcac93 Properly parse json when export import metric (#3180)
* app/vmselect: properly work when export import json from `api/v1/{export, import}` API

* app/vmselect: update convert function

* app/vmselect: export null if `math.IsNaN(v)`

* app/vmselect: get float from json

* lib/protoparser: add test

* docs: add change log

* lib/protoparser: make export import api compatible
2022-10-06 13:54:20 +03:00
Zakhar Bessarab
97239e05ce lib/backup/s3remote: fix error checking for alternative S3 providers (#3191) 2022-10-06 13:36:40 +03:00
Aliaksandr Valialkin
dba49943d3 docs/Single-server-VictoriaMetrics.md: update docs after the a54987f671 2022-10-06 13:29:38 +03:00
Aliaksandr Valialkin
1e93ad84e3 lib/backup/azremote: remove unused methods after the 262ce77e2d 2022-10-06 13:08:58 +03:00
Yury Molodov
a54987f671 vmui: maximum queries (#3196)
vmui: allow using up 4 queries at the same time

The change also introduces UI updates to make using 
multiple queries more conveniently.
2022-10-06 07:10:21 +02:00
Aliaksandr Valialkin
cdf385f9e4 deployment/docker: update Go builder from v1.19.1 to v1.19.2
See https://github.com/golang/go/issues?q=milestone%3AGo1.19.2+label%3ACherryPickApproved
2022-10-06 02:01:32 +03:00
Aliaksandr Valialkin
5307cf068f docs: follow-up after 262ce77e2d
* Document the addition of Azure blob storage support in vmbackup / vmrestore
* List the supported storage system types at docs/vmrestore.md
* Mention about azblob storage system support at -src and -dst command-line flags
  for vmbackup / vmrestore tools.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1029
2022-10-06 00:35:34 +03:00
Zakhar Bessarab
262ce77e2d lib/backup: add support of Azure Blob Storage (#460)
* lib/backup: add support of Azure Blob Storage

* lib/backup: add enterprise support of Azure Blob Storage
2022-10-06 00:32:46 +03:00
Aliaksandr Valialkin
f596e49881 docs/CHANGELOG.md: document f8ac55d70ada9ef8490b322abefb05f28f75e2e9 2022-10-06 00:05:37 +03:00
Aliaksandr Valialkin
c45c61cf93 app/vmalert: follow-up after f8ac55d70ada9ef8490b322abefb05f28f75e2e9
* Use vm_account_id and vm_project_id labels to be consistent with https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels
* Document the feature that vmalert now exposes vm_account_id and vm_project_id
  labels if -clusterMode is set.
* Use literal strings instead of string constants for vm_account_id and vm_project_id.
  This improves code readability.
2022-10-06 00:05:33 +03:00
Aliaksandr Valialkin
703094a37a app/vmbackupmanager: expose vm_backup_in_flight metrics (follow-up after c6bca4a0b47b4f5626e1913d3480e62d657ed4cf) 2022-10-05 23:30:21 +03:00
Aliaksandr Valialkin
f9fc838b7b app/vmalert: update -external.alert.source command-line flag description after 61544e13ad 2022-10-05 22:52:38 +03:00
Aliaksandr Valialkin
36584ef52c README.md: sync with docs/README.md after e0ea76db62 2022-10-05 22:52:36 +03:00
Aliaksandr Valialkin
440495df52 Makefile: fix sed command and if condition for Linux bash after 2d11896486 2022-10-05 22:40:04 +03:00
Roman Khavronenko
61544e13ad vmalert: allow using {{$labels}} for templating in -external.alert.source (#3194)
The change is supposed to provide additional flexibility for generating alert's
source link based on label values.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-05 19:25:03 +02:00
Zakhar Bessarab
6711eec109 docker-compose: move TooManyLogs into vm-health alerts set (#3199) 2022-10-05 19:23:36 +02:00
Dmytro Kozlov
e0ea76db62 docs: add cardinality explorer blog post (#3198)
docs: add cardinality explorer blog post
2022-10-05 13:19:41 +02:00
Roman Khavronenko
2d11896486 docs: udpate Datadog section (#3190)
The `docs-sync` command was updated to modify images path
for assets in `docs/` folder. The change allows to refer images from `docs`
without copying them to the root folder.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-10-03 13:13:07 +02:00
Yurii Kravets
272f00dbb6 docs: Update How to send data from DataDog agent (#3168)
docs: Update How to send data from DataDog agent
2022-10-03 11:45:05 +02:00
Aliaksandr Valialkin
c53b7e66ef app/vmselect: improve performance scalability on multi-CPU systems for /api/v1/export/... endpoints 2022-10-01 22:05:43 +03:00
Aliaksandr Valialkin
49311ae977 app/vmselect/prometheus: improve scalability of /federate endpoint on systems with many CPU cores
Minimize usage of global lock inside bufferedwriter.Write() when processing `/federate` data
on systems with many CPU cores
2022-10-01 20:13:24 +03:00
Aliaksandr Valialkin
fb1cc3cc94 app/vmselect/promql: increase scalability of incremental aggregate calculations on systems with many CPU cores
Use sync.Map instead of a global mutex there. This should lift scalability limits
on systems with many CPU cores.
2022-10-01 20:00:03 +03:00
Aliaksandr Valialkin
fcc7ab71b3 app/vmselect: do not export NaN values for stale metrics at /federate endpoint
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3185
2022-10-01 19:47:37 +03:00
Aliaksandr Valialkin
7812761ab4 docs/Cluster-VictoriaMetrics.md: add a security note for multitenant support via labels 2022-10-01 18:57:28 +03:00
Aliaksandr Valialkin
d7327d2f02 docs/vmagent.md: fix incorrect url for multitenant writes to VictoriaMetrics cluster 2022-10-01 18:51:37 +03:00
Aliaksandr Valialkin
0dc93cca7f app/vmagent/remotewrite: allow specifying per--remoteWrite.url disk limits for persistent queue with pending data
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3071

Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2970
2022-10-01 18:40:59 +03:00
Aliaksandr Valialkin
c1fa9828b3 lib/flagutil: rename Array to ArrayString
This makes the ArrayString more consistent with other Array* types.

While at it, add ArrayBytes type, which will be used for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3071
2022-10-01 18:26:36 +03:00
Aliaksandr Valialkin
366f04001b vendor: make vendor-update 2022-10-01 17:20:11 +03:00
Zakhar Bessarab
87c77727e4 vmbackup: update AWS SDK to v2 (#3174)
* lib/backup/s3remote: update AWS SDK to v2

* Update lib/backup/s3remote/s3.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>

* lib/backup/s3remote: refactor error handling

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-01 17:12:07 +03:00
Aliaksandr Valialkin
725dfb0ed6 lib/httpserver: use 302 redirects instead of 301 redirects
Incorrect 301 redirects can be cached by user agents such as web browsers.
This can complicate recovery procedure after the incorrect redirect is fixed,
e.g. web browser cache must be reset.

The related issue - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1752
2022-10-01 16:53:35 +03:00
Aliaksandr Valialkin
a296994fed app/vmauth: do not remove trailing slash from the proxied path
This should fix the issue with opening VMUI at /vmui/ page.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1752
2022-10-01 16:52:30 +03:00
Aliaksandr Valialkin
4998402004 lib/promscrape: add external_labels from global section of -promscrape.config after the relabeling is applied to the scraped metrics
This aligns with Prometheus behaviour.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3137
2022-10-01 16:13:19 +03:00
Aliaksandr Valialkin
3a98ef2f5f lib/promrelabel: export MustParseMetricWithLabels function, which can be used for simplifying tests 2022-10-01 16:05:51 +03:00
Aliaksandr Valialkin
f86070169d lib/promscrape/discovery/azure: remove unneeded conversion to string 2022-10-01 16:04:37 +03:00
Aliaksandr Valialkin
973ce4b561 app/vmagent: accept requests with /prometheus and /influx path prefixes in the same way as VictoriaMetrics does
This allows using vmagent as a drop-in replacement for VictoriaMetrics
for push protocols.
2022-10-01 14:10:48 +03:00
Aliaksandr Valialkin
63a94b1d54 docs/vmagent.md: formatting fixes 2022-10-01 13:36:24 +03:00
Aliaksandr Valialkin
5b83e6e14e app/vmagent/README.md: apply typo fix from 1fda517af9 2022-10-01 12:31:06 +03:00
lionelee
1fda517af9 docs/vmagent.md: fix typo (#3186) 2022-10-01 12:30:01 +03:00
Aliaksandr Valialkin
db16759c68 lib/storage: optimize matching speed for non-trivial regexp filters
Wrap re.Match into bytesutil.FastStringMatcher.

This increases performance for `{foo=~"complex_regex_here"}` filters
by up to 4x.
2022-10-01 12:06:06 +03:00
Aliaksandr Valialkin
9e8fbef27e docs/CHANGELOG.md: clarify the description of the improvement in relabeling performance 2022-10-01 11:54:48 +03:00
Aliaksandr Valialkin
e8a64f6e7a lib/promrelabel: remove redundant memory allocations by using interned strings 2022-10-01 11:50:21 +03:00
Aliaksandr Valialkin
73dc17ef64 lib/promrelabel: add a benchmark for realistic Kubernetes relabeling
The benchmark name is BenchmarkApplyRelabelConfigs/kubernetes

This benchmark has been copied from d521933053/model/relabel/relabel_test.go (L505)

See also https://github.com/prometheus/prometheus/pull/11147
2022-10-01 10:38:22 +03:00
Aliaksandr Valialkin
c54e14cdec lib/promscrape/discovery/ec2: expose __meta_ec2_region label in the same way as Prometheus 2.39 does
See https://github.com/prometheus/prometheus/pull/11326
2022-09-30 20:48:32 +03:00
Aliaksandr Valialkin
4d27fa41c8 Makefile: run errcheck for all the app/... subdirs 2022-09-30 18:35:53 +03:00
Aliaksandr Valialkin
d0b7172316 app/vminsert: remove support for undocumented VictoriaMetrics_AccountID and VictoriaMetrics_ProjectID labels in tcp-based data ingestion endpoints
These labels are substituted by documented vm_account_id and vm_project_id labels.

See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels

This is a follow up for 505d359b39
2022-09-30 18:35:53 +03:00
Nikolay
33f40f4a5f app/vminsert: allows parsing tenant id from labels (#3009)
* app/vminsert: allows parsing tenant id from labels
it should help mitigate issues with vmagent's multiTenant mode, which works incorrectly at heavy load
and it cannot handle more then 100 different tenants.
This functional hidden with flag and do not change vminsert default behaviour
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2970

* Update docs/Cluster-VictoriaMetrics.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* wip

* app/vminsert/netstorage: clean remaining labels in order to free up GC

* docs/Cluster-VictoriaMetrics.md: typo fix

* wip

* wip

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-30 18:35:53 +03:00
Roman Khavronenko
740bb2cc00 vmalert: support auth configs per static_target (#3188)
Allow configuring authorization params per list of targets
in vmalert's notifier config for `static_configs`.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2690

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-30 17:10:17 +02:00
Aliaksandr Valialkin
171dd14aa3 lib/promrelabel: go fmt 2022-09-30 12:28:55 +03:00
Aliaksandr Valialkin
a18d6d5ccc lib/promrelabel: optimize action: replace for non-trivial regex values
Cache `action: replace` results for non-trivial regexs and return them next time
instead of performing CPU-intensive regex replacement.

Optimize also `action: labelmap_all` and `action: replace_all` in the same way.
2022-09-30 12:25:05 +03:00
Aliaksandr Valialkin
146021a076 lib/promrelabel: there is no need in calling regex.HasPrefix() after the optimization at 17289ff481 2022-09-30 10:49:18 +03:00
Aliaksandr Valialkin
899d2c40fb lib/promrelabel: optimize action: labelmap for non-trivial regexs 2022-09-30 10:43:31 +03:00
Aliaksandr Valialkin
17289ff481 lib/regexutil: cache MatchString results for unoptimized regexps
This increases relabeling performance by 3x for unoptimized regexs
2022-09-30 10:41:29 +03:00
Dmytro Kozlov
e220bc3cd5 docs, app/vmgateway: add description about new auth.httpHeader flag (#3134)
* docs, app/vmgateway: add description about new `auth.httpHeader` flag

* Update docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-30 08:55:12 +03:00
Aliaksandr Valialkin
b70f815dc4 app/vmselect/promql: remove empty series before applying aggregate function
Previously empty series (e.g. series with all NaN samples) were passed to aggregate functions.
Such series must be ingored by all the aggregate functions.
So it is better from consistency PoV filtering out empty series before applying aggregate functions.
2022-09-30 08:39:54 +03:00
Roman Khavronenko
b64b9b9fec app/vmselect: ignore empty series for limit_offset (#3178)
* app/vmselect: ignore empty series for `limit_offset`

VictoriaMetrics doesn't return empty series (with all NaN values) to
the user. But such series are filtered after transform functions.
It means `limit_offset` will account for empty series as well.

For example, let's consider following data set:
```
time series:
foo{label="1"} NaN, NaN, NaN, NaN // empty series
foo{label="2"} 1, 2, 3, 4
foo{label="3"} 4, 3, 2, 1
```

When user requests all series for metric `foo` the empty series
will be filtered out:
```
/query=foo:
foo{label="v2"} 1, 2, 3, 4
foo{label="v3"} 4, 3, 2, 1
```

But `limit_offset(1, 1, foo)` is applied to original series, not filtered yet.
So it will return `foo{label="v2"}` (skips the first in list)
```
/query=limit_offset(1, 1, foo):
foo{label="v2"} 1, 2, 3, 4
```

Expected result would be to apply `limit_offset` to already filtered list,
so in result we receive `foo{label="v3"}`:
```
/query=limit_offset(1, 1, foo):
foo{label="v3"} 4, 3, 2, 1
```

The change does exactly that - filters empty series before applying `limit_offset`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmselect: ignore empty series for `limit_offset`

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-30 08:20:34 +03:00
Aliaksandr Valialkin
fda60b3d4d lib/promrelabel: properly parse regex with escaped $ at the end
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3131

Thanks to @dmitryk-dk for the initial fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3179
2022-09-30 08:15:43 +03:00
Aliaksandr Valialkin
bf2f14a3a6 docs/CHANGELOG.md: document 39f559d22b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3013
2022-09-30 07:48:25 +03:00
Aliaksandr Valialkin
593da3603e lib/bytesutil: move InternString() from lib/promscrape/discoverytutils to lib/bytesutil
lib/bytesutil is more appropriate place for InternString() function
2022-09-30 07:44:35 +03:00
Nikolay
f61b8cec69 lib/awsapi: fixes sign encoding (#3183)
* lib/awsapi: fixes sign encoding

previously white spaces at filter were incorrectly encoded
encoding tip was copied from aws signing lib
For example, the space character must be encoded as %20 (not using '+', as some encoding schemes do)
https://docs.aws.amazon.com/general/latest/gr/sigv4-create-canonical-request.html
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3171

* Update lib/awsapi/sign.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-30 07:43:44 +03:00
Roman Khavronenko
39f559d22b vmalert: allow using extra labels in annotations (#3181)
According to Ruler specification, only labels returned within time series
should be available for use in annotations.

For long time, vmalert didn't respect this rule. And in PR
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2403
this was fixed for the sake of compatibility. However, this resulted
into users confusion, as they expected all configured and extra labels
to be available - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3013

This fix allows to use extra labels in Annotations. But in the case of conflicts
the original labels (extracted from time series) are preferred.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-29 18:22:50 +02:00
Aliaksandr Valialkin
6a32a64073 lib/bytesutil: add FastStringTransformer and use it in the rest of the code where needed 2022-09-28 10:41:00 +03:00
Aliaksandr Valialkin
92b3622253 lib/protoparser/datadog: optimize sanitizeName() function by using result cache for input strings
This is a follow-up for 7c2474dac7

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3105
2022-09-28 10:40:59 +03:00
Aliaksandr Valialkin
ef435f8cc4 lib/promrelabel: add SanitizeName() function for sanitizing Prometheus metric names and label names
Optimize this function by using results cache for input strings.
Use this function all over the code.

This is a follow-up for fcffdba9dc

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113
2022-09-28 10:40:59 +03:00
nemobis
6a818dbddf docs: fix typo in Cluster-VictoriaMetrics.md (#3172) 2022-09-28 09:18:29 +02:00
panguicai
fbc85e654c docs: fix typo for vmalert docs (#3173)
Signed-off-by: panguicai008 <1121906548@qq.com>
2022-09-28 09:15:05 +02:00
Roman Khavronenko
4ad3b36630 docs: update img for key concepts (#3176)
Make dots bigger for range_query, so it will be easier
to read the graph.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-27 21:22:59 +02:00
Aliaksandr Valialkin
6411bbcce7 lib/netutil/tls.go: consistently use tlsMinVersion name across source code
This should simplify further code maintenance and refactoring

This is a follow-up after 6ab1cede62
2022-09-26 17:58:01 +03:00
Dmytro Kozlov
6ab1cede62 lib/{httpserver,netutil}: allow to define min and max TLS version of the http server (#3109)
* lib/{httpserver,netutil}: allow to define min and max TLS version of the http server

* lib/httpserver: added descriptions about tls supported versions

* lib/netutil: check minimal tls version, added supported tls versions to error

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-26 17:35:45 +03:00
Denys Holius
d63410bf6f docs: managed_victoriametrics/quickstart.md fix image link (#3165) 2022-09-26 16:35:18 +02:00
Roman Khavronenko
36a9a834b3 docs: clarify numeric label values for label_value (#3129)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3114
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-26 16:48:38 +03:00
Max Golionko
d67948b8e3 added security policy (#3140)
* added security policy

* Update SECURITY.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-26 16:47:02 +03:00
Denys Holius
32be84fc75 Adds packer build for server with VM Single node in vultr.com marketplace (#3142)
* adds packer build for server with VM Single node in vultr.com marketplace

* fix missed varibale
2022-09-26 16:44:36 +03:00
Roman Khavronenko
e96ccf3f71 lib/mergeset: follow-up after a0e7432e42 (#3145)
* lib/mergeset: follow-up after a0e7432e42

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-26 16:39:56 +03:00
Aliaksandr Valialkin
72c29d762e docs/CHANGELOG.md: document f022296d96
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3147
2022-09-26 16:32:19 +03:00
Zakhar Bessarab
f022296d96 vmbackup: configure retries for GCS remote FS (#3156) 2022-09-26 16:28:20 +03:00
Aliaksandr Valialkin
1bac96dfce vendor: make vendor-update 2022-09-26 15:44:55 +03:00
Aliaksandr Valialkin
a2431c2a88 docs/CHANGELOG.md: document 166d444159 2022-09-26 15:39:14 +03:00
Roman Khavronenko
166d444159 vmselect/rollup: rm workaround for slow-changing counters (#3163)
The workaround was introduced to fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/962.
However, it didn't prove itself useful. Instead, it is recommended using `increase_pure` function.

Removing the workaround makes VM to produce accurate results when calculating
`delta` or `increase` functions over slow-changing counters with vary intervals
between data points.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-26 15:33:25 +03:00
Aliaksandr Valialkin
41f8c2987d lib/protoparser/graphite: accept whitespace in metric names and tags according to the specification
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/99
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3102

See the specification https://graphite.readthedocs.io/en/latest/tags.html
2022-09-26 15:17:25 +03:00
Aliaksandr Valialkin
5983ecf4d1 docs/Single-server-VictoriaMetrics.md: mention that VictoriaMetrics sanitizes metric names in DataDog data by default
This is a follow-up for 7c2474dac7
2022-09-26 14:02:43 +03:00
Aliaksandr Valialkin
7c2474dac7 lib/protoparser/datadog: sanitize metric names by default in the same way as DataDog does
This commit is based on the pull request https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3105

Thanks to @PerGon for the idea and initial implementation.
2022-09-26 13:57:23 +03:00
Aliaksandr Valialkin
fcffdba9dc app/{vmagent,vminsert}: add -usePromCompatibleNaming command-line flag for normalizing metric names and label names in the ingested samples
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113

Thanks to @erkexzcx for the idea and the initial pull request at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3146
2022-09-26 13:11:39 +03:00
Aliaksandr Valialkin
819aa95552 docs/vmalert.md: follow-up for 0c95f928ae
- Clarify the description for -datasource.queryStep command-line flag
- Consistently use a single dash in front of -datasource.queryStep command-line flag
- Update -help output at docs/vmalert.md
2022-09-26 08:47:31 +03:00
Aliaksandr Valialkin
6b71f33c8b docs/vmalert.md: follow-up after 7748a9d629
- Consistently use single dash in front of command-line flags instead of double dashes.
- Add a warning that too small -search.latencyOffset may lead to incomplete query results.
2022-09-26 08:36:24 +03:00
Aliaksandr Valialkin
b68cd810fc docs: follow-up for 03d54ac890
- Typo fix: 'adn' -> 'and'
- Remove duplicate link to https://victoriametrics.com/blog/victoriametrics-monitoring/
  from docs/Articles.md . The initial link has been already added in 27254096b2
2022-09-26 08:26:38 +03:00
Roman Khavronenko
908fe6a623 dashboards: replace Index size panel with Active series (#3157)
Panel `Index size` showed itself impractical for users. So
replacing it with `Active series` panel.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/776#issuecomment-1255823734
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-25 21:49:18 +02:00
Roman Khavronenko
0c95f928ae vmalert: set default value for datasource.queryStep to 5m (#3149)
Change default value for command-line flag `datasource.queryStep` from `0s` to `5m`.
Param `step` is added by vmalert to every rule evaluation request sent to datasource.
Before this change, `step` was equal to group's evaluation interval by default.
Param `step` for instant queries defines how far VM can look back for the last written data point.
The change supposed to improve reliability of the rules evaluation when evaluation interval
is lower than scraping interval.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-23 10:22:53 +02:00
Roman Khavronenko
7748a9d629 vmalert: add info about search.latencyOffset to Troubleshooting (#3151)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-23 09:49:14 +02:00
Roman Khavronenko
fc9caa6738 deployment/docker: fix image versions for cluster components (#3150)
Cluster components always have `-cluster` suffix. The change fixes
incorrect image tag in docker-compose manifest.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-23 09:48:46 +02:00
Roman Khavronenko
03d54ac890 docs: mention VictoriaMetrics Monitoring blog post (#3152)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-23 07:12:15 +02:00
匠心零度
3d5509a720 lib/querytracer: fix comment (#3135) 2022-09-22 19:19:48 +03:00
Aliaksandr Valialkin
27254096b2 docs/Articles.md: add a link to https://victoriametrics.com/blog/victoriametrics-monitoring/ 2022-09-22 19:17:10 +03:00
Aliaksandr Valialkin
a3e536c0e7 docs/Articles.md: add a link to https://dataswamp.org/~solene/2022-09-11-exploring-monitoring-stacks.html 2022-09-22 19:17:10 +03:00
Denys Holius
e347dd7cc6 deployment/docker: add version tag for docker containers (#3141)
* deployment/docker/docker-compose.yml: adds version tags for VictoriaMetrics containers

* deployment/docker/docker-compose-cluster.yml: adds version tags for VictoriaMetrics containers
2022-09-21 14:38:09 +02:00
Aliaksandr Valialkin
dc4b87621f vendor: make vendor-update 2022-09-21 11:54:32 +03:00
Roman Khavronenko
5714a68ac6 deployment/docker: move cluster compose env to master branch (#3130)
* deployment/docker: move cluster compose env to master branch

The change supposed to simplify the process of maintaining for
single/cluster docker-compose envs, alerts, dashboards. It also
supposes to reduce confusion for users when looking for cluster
related alerts/configs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* deployment/docker: move cluster compose env to master branch

Review updates.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-21 11:48:38 +03:00
Dmytro Kozlov
6a794ec5d5 app/{vmctl,vmalert}: update progress bar library (make vendor-update) (#3138)
* app/{vmctl,vmalert}: update progress bar library (make vendor-update)

* app/{vmctl,vmalert}: make vendor-update
2022-09-21 11:08:33 +03:00
Roman Khavronenko
d61cce06fd vmalert: prodvide more details on duplicates (#3136)
Now vmalert will print the following messages on dupliсates:
```
"recording rule \"record\"; expr: \"up == 1\"; labels: summary={{ value|query }}" is a duplicate within the group "test"
"alerting rule \"alert\"; expr: \"up == 1\"; labels: description={{ value|query }}" is a duplicate within the group "test"
```

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3127
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-20 12:52:46 +02:00
Aliaksandr Valialkin
5e2fcd455d app/vmagent/remotewrite: go fmt after 2b55d167d7 2022-09-19 15:14:35 +03:00
Aliaksandr Valialkin
310d0caec2 vendor: make vendor-update 2022-09-19 15:12:22 +03:00
Aliaksandr Valialkin
fd98ec8ba3 Makefile: fix -compat value passed to go mod tidy 2022-09-19 15:12:03 +03:00
Aliaksandr Valialkin
1aef635de4 docs/CHANGELOG.md: clarify the change at 622bbedbe1
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3119
2022-09-19 15:02:05 +03:00
Aliaksandr Valialkin
584a5d6b1f docs/managed_victoriametrics/quickstart.md: small fixes after 606166ef68 2022-09-19 14:52:46 +03:00
Aliaksandr Valialkin
2b55d167d7 app/vmagent/remotewrite: add benchmarks for comparing the performance of standard Snappy encoder with github.com/klauspost/compress/s2 encoder
The standard Snappy encoder from github.com/golang/snappy shows quite good performance number
for compressing the Prometheus remote_write proto messages according to the added benchmarks,
so there is no need in switching to github.com/klauspost/compress/s2 yet.
2022-09-19 14:28:09 +03:00
Roman Khavronenko
b4410b1c63 Dashboards (#3120)
* dashboards/cluster: few updates

* apply consistent formatting across panels;
* make resource usage panels per component more detailed;
* add extra panels to vmselect for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/single: few updates

* apply consistent formatting across panels;
* add extra panels to Performance for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmagent: few updates

* apply consistent formatting across panels;
* add panels for showing number of samples ingested
or scraped;
* adapt resource usage panels for multiple selected jobs/instances;
* add adhoc variable;
* display vmagent's version in Stats.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmalert: few updates

* apply consistent formatting across panels;
* adapt resource usage panels for multiple selected jobs/instances;
* show vmalert version in Stats section.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-16 21:24:32 +02:00
Roman Khavronenko
622bbedbe1 vmalert: always re-evaluate Annotations (#3119)
* vmalert: always re-evaluate Annotations

Previously, Annotations were evaluated only:
1. On alert creating.
2. On alert's value change.

This is premature optimization. It was assumed that since annotations
could contain only text with alert's labels or value - there is no need
in spending resources to re-compile Annotations.

Later, template function `query` was added, which can execute
arbitrary queries and return different results on every evaluation.
So if it was used in annotations, it would be executed only on init
or value change.

Another case when optimization caused an issue - annotations hot reload.
In this case, annotations of the active alert won't change even if Rule's
annotations were changed.

This fix enables Annotations re-evaluation on each iteration to resolve
issues above. It would have some impact on performance, but it is unlikely
it will be noticeable.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: add tp Changelog

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-16 16:19:10 +02:00
Roman Khavronenko
3484673566 vmalert: add Troubleshooting section to docs (#3115)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-15 16:15:39 +02:00
Dmytro Kozlov
606166ef68 docs/managed_victoriametrics: add how to restore password section (#3116)
docs/managed_victoriametrics: add how to restore password section
2022-09-15 16:02:31 +02:00
Roman Khavronenko
9c95c81534 vmalert: print example of curl command for rule's state (#3112)
The change adds an example of `curl` command to the Rule's page.
The command is generated for each recorded state. It is supposed
user can just copy&execute the command to see what was returned
to vmalert.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-15 12:40:22 +02:00
Aliaksandr Valialkin
455002922e docs/Cluster-VictoriaMetrics.md: update -help output after explicit marking of enterprise flags 2022-09-15 13:22:58 +03:00
Aliaksandr Valialkin
e7995375b5 docs: update -help output after explicit mentioning of enterprise flags 2022-09-15 13:22:57 +03:00
Aliaksandr Valialkin
4193af4571 docs/vmauth.md: update -help output after explicit marking of enterprise flags 2022-09-15 13:22:57 +03:00
Aliaksandr Valialkin
3b2599c659 docs/vmalert.md: update -help output after explicit marking of enterprise flags 2022-09-15 13:22:57 +03:00
Aliaksandr Valialkin
bbecd27557 docs: update docs with explicitly marked enterprise command-line flags for VictoriaMetrics and vmagent 2022-09-15 13:22:56 +03:00
Dmytro Kozlov
a9629cc32d app/vmctl: add description about influx-skip-database-label flag (#3111) 2022-09-15 10:46:54 +02:00
Aliaksandr Valialkin
b869c757a9 docs/FAQ.md: add an answer for What is the difference between single-node and cluster versions of VictoriaMetrics? 2022-09-15 09:56:41 +03:00
Dmytro Kozlov
b75f1854c5 vmselect/promql: add alphanumeric sort by label (sort_by_label_numeric) (#2982)
* vmselect/promql: add alphanumeric sort by label (sort_by_label_numeric)

* vmselect/promql: fix tests, add documentation

* vmselect/promql: update test

* vmselect/promql: update for alphanumeric sorting, fix tests

* vmselect/promql: remove comments

* vmselect/promql: cleanup

* vmselect/promql: avoid memory allocations, update functions descriptions

* vmselect/promql: make linter happy (remove ineffectual assigment)

* vmselect/promql: add test case, fix behavior when strings are equal

* vendor: update github.com/VictoriaMetrics/metricsql from v0.44.1 to v0.45.0

this adds support for sort_by_label_numeric and sort_by_label_numeric_desc functions

* wip

* lib/promscrape: read response body into memory in stream parsing mode before parsing it

This reduces scrape duration for targets returning big responses.

The response body was already read into memory in stream parsing mode before this change,
so this commit shouldn't increase memory usage.

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-14 17:41:09 +03:00
Aliaksandr Valialkin
5306f79fd1 docs/Articles.md: add a link to https://www.groundcover.com/blog/prometheus-grafana-kubernetes 2022-09-14 15:11:19 +03:00
Aliaksandr Valialkin
56ce7ce85b lib/promscrape: typo fix after 74c00a8762 2022-09-14 15:06:50 +03:00
Roman Khavronenko
877940a131 vmalert: add experimental feature of storing Rule's evaluation state (#3106)
vmalert: add experimental feature of storing Rule's evaluation state

The new feature keeps last 20 state changes of each Rule
in memory. The state are available for view on the Rule's
view page. The page can be opened by clicking on `Details`
link next to Rule's name on the `/groups` page.

States change suppose to help in investigating cases when Rule
doesn't generate alerts or records.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-14 14:04:24 +02:00
Aliaksandr Valialkin
99bc18774c app/vmselect/vmui: make vmui-update after 1304824201 2022-09-14 13:33:06 +03:00
Roman Khavronenko
efea51a9ee bump Go version to 1.19.1 (#3108)
The reason is to cover vulnerability GO-2022-0969
Found in: net/http@go1.18.5
Fixed in: net/http@go1.19.1
More info: https://pkg.go.dev/vuln/GO-2022-0969

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-14 12:29:19 +02:00
Aliaksandr Valialkin
74c00a8762 lib/promscrape: read response body into memory in stream parsing mode before parsing it
This reduces scrape duration for targets returning big responses.

The response body was already read into memory in stream parsing mode before this change,
so this commit shouldn't increase memory usage.
2022-09-14 13:15:29 +03:00
Aliaksandr Valialkin
ccad651a61 lib/promscrape/discovery/kubernetes: add more context on WatchEvent parse error
This should improve debugging issues with Kubernetes API server
2022-09-13 19:36:55 +03:00
Yury Molodov
1304824201 vmui: fix query params saving in URL (#3104) 2022-09-13 17:05:26 +02:00
Aliaksandr Valialkin
3af24a5b7c docs: consistently use docs.victoriametrics.com instead of victoriametrics.github.io in all the links 2022-09-13 16:50:13 +03:00
Aliaksandr Valialkin
523ff25077 vendor: make vendor-update 2022-09-13 16:44:44 +03:00
Aliaksandr Valialkin
0ead64b6cf app/vmalert: follow-up after 8441375da2
- Rename logDebug() to logDebugf() and pass format string together
  with format args directly to logDebugf(). This eliminates fmt.Sprintf()
  overhead at logDebug() call site when debugging is disabled.

- Format labels in debug message in Prometheus format, e.g. {label1="value1",...labelN="valueN"}

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3025
2022-09-13 16:35:14 +03:00
Roman Khavronenko
8441375da2 vmalert: add debug mode for alerting rules (#3055)
* vmalert: add `debug` mode for alerting rules

Debug information includes alerts state changes and requests
sent to the datasource. Debug can be enabled only on rule's
level. It might be useful for debugging unexpected
behaviour of alerting rule.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3025

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: review fixes

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update app/vmalert/alerting.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>

* vmalert: go fmt

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-13 16:25:43 +03:00
Aliaksandr Valialkin
25c9a1604a docs/CHANGELOG.md: document atomic directory deletion
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:39 +03:00
Aliaksandr Valialkin
ce2c07c5a7 lib/mergeset: atomically remove part dirs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
042a532f70 lib/storage: substitute remaining calls to fs.MustRemoveAll with fs.MustRemoveDirAtomic
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
68e32b0764 lib/storage: atomically remove parts inside partitions
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
db4f0fe6fc docs/ExtendedPromQL.md: move the page to the bottom of the contents list at http://docs.victoriametrics.com 2022-09-13 16:17:37 +03:00
Aliaksandr Valialkin
e041c913bd Revert "docs/ExtendedPromQL.md: remove outdated doc"
This reverts commit 971e3d83f7.

Reason for revert: the ExtendedPromQL doc is still referred from third-party sites.
2022-09-13 16:17:37 +03:00
Aliaksandr Valialkin
340ada871d lib/storage: atomically remove partitions, which went outside the configured retention
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:37 +03:00
Aliaksandr Valialkin
978dcb4574 lib/storage: properly remove cache directory contents if reset_cache_on_startup file is located there
Previously the cache directory was removed. This could result in error when the cache directory
is mounted to a separate filesystem.
2022-09-13 16:17:36 +03:00
Aliaksandr Valialkin
5f28ca1f42 lib/storage: atomically remove snapshot directories
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:36 +03:00
Craig Rodrigues
57ea8dbb36 docs: Use headings in DataDog section to make things easier to read (#3089)
Signed-off-by: Craig Rodrigues <rodrigc@crodrigues.org>

Signed-off-by: Craig Rodrigues <rodrigc@crodrigues.org>
2022-09-12 10:58:25 +03:00
Yury Molodov
1b41169415 vmui: fix data processing (#3092)
* fix: change data processing

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-12 10:55:11 +03:00
Yury Molodov
b5f4060520 fix: change columns for Top Queries (#3093) 2022-09-12 10:44:35 +03:00
Aliaksandr Valialkin
c48ff746c6 docs/FAQ.md: add a link to https://victoriametrics.com/blog/mimir-benchmark/ to VictoriaMetrics vs Mimir chapter 2022-09-12 10:37:28 +03:00
Aliaksandr Valialkin
c4af0e833a docs/Articles.md: add a link to https://victoriametrics.com/blog/mimir-benchmark/ 2022-09-11 15:23:35 +03:00
Yury Molodov
9541ef2e9e vmui: add lists of top queries (#3065)
* feat: add lists of top queries

* fix: change the field label

* refactor: add handlers for readability

* app/vmselect: `make vmui-update`

* docs: document `top queries` tab

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-08 21:43:37 +03:00
John Belmonte
defced2599 MetricsQL doc spellcheck (#3080) 2022-09-08 21:28:03 +03:00
Aliaksandr Valialkin
53b0c2eee4 docs/CHANGELOG.md: document ec273eafef
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3076
2022-09-08 21:25:30 +03:00
Aliaksandr Valialkin
e7635e1c83 Makefile: remove github-create-release and github-upload-assets commands from publish-release
This is a follow-up for b9231c715a
2022-09-08 21:02:41 +03:00
Aliaksandr Valialkin
7c2fa1bc48 vendor: make vendor-update 2022-09-08 18:51:49 +03:00
Aliaksandr Valialkin
aa0c6ed27f Makefile: consistently use go install instead of go get for installing various binaries needed during build/test/check of the code
`go install` is the preferred way for installing go binaries starting
from the minimum supported Go version for VictoriaMetrics - Go1.18 -
see https://tip.golang.org/doc/go1.18#go-command
2022-09-08 18:43:05 +03:00
Aliaksandr Valialkin
28b6dec1f4 .github/workflows/main.yml: stop setting GO111MODULE=on env var, since it is unnecessary in Go1.18 and newer versions 2022-09-08 18:41:56 +03:00
Aliaksandr Valialkin
5dad557868 Makefile: check for vulnerabilities in used Go packages with govulncheck when running make check-all
See https://go.dev/blog/vuln
2022-09-08 18:35:34 +03:00
Aliaksandr Valialkin
f81dfaf20d deployment/docker: update Go builder for prod binaries from Go1.19.0 to Go1.19.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.19.1+label%3ACherryPickApproved
2022-09-08 18:27:10 +03:00
Aliaksandr Valialkin
b9231c715a docs/Release-Guide.md: require manual push of the created release tags to public Github
The automated push of release tags to Github require specifying the remote repository name
when doing `git push <remote-repo-name> v1.xx.y`.
The remote repository name can differ in different environments,
so it cannot be put into Makefile rule.

TODO: create a Makefile rule, which generates standard remote names for public
and private repositories in Git, so `git push` for release tags could be automated then.
2022-09-08 15:00:18 +03:00
Aliaksandr Valialkin
07441b1cee app/vmselect/netstorage: fix a typo, which leads to incorrect query results in VictoriaMetrics cluster
The typo has been introduced in the commit 1a254ea20c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3067
2022-09-08 13:48:20 +03:00
Aliaksandr Valialkin
9f20d01a81 Revert "docs: add Monitoring at scale with Victoria Metrics (#3078)"
This reverts commit f24572fa65,
because the article https://tech.bedrockstreaming.com/2022/09/06/monitoring-at-scale-with-victoriametrics.html
has been already added at 09ff3f1928
2022-09-08 11:05:25 +03:00
Roman Khavronenko
f24572fa65 docs: add Monitoring at scale with Victoria Metrics (#3078)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-06 16:29:48 +02:00
Max Golionko
7da9443686 moved cluster dashboard to master (#3074)
dashboards: move cluster dashboard to master branch

This change should simplify dashboards management.
2022-09-06 16:19:43 +02:00
Aliaksandr Valialkin
ef7fdbb63c docs/Single-server-VictoriaMetrics.md: add a note that cardinality explorer at cluster version of VictoriaMetrics may return lower than expected number of unique label values
See the corresponding comment in the code:

5a6e617b5e/app/vmselect/netstorage/netstorage.go (L1039-L1045)
2022-09-06 14:46:35 +03:00
Aliaksandr Valialkin
09ff3f1928 docs/Articles.md: add https://tech.bedrockstreaming.com/2022/09/06/monitoring-at-scale-with-victoriametrics.html 2022-09-06 13:32:41 +03:00
Dmytro Kozlov
4415c71a2b vmselect/{promql, prometheus}: show flag names which user can update in error message (#3049)
* vmselect/{promql, prometheus}: show flag names which user can update in error message

* vmselect/{promql, prometheus}: fix typo
2022-09-06 13:25:59 +03:00
Aliaksandr Valialkin
651ace6ce4 docs/vmctl.md: make docs-sync after c5261d5f56
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2733
2022-09-06 13:19:48 +03:00
Aliaksandr Valialkin
5fa9525498 lib/storage: verify that timestamps in block are in the range specified by blockHeader.{Min,Max}Timestamp when upacking the block
This should reduce chances of unnoticed on-disk data corruption.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011

This change modifies the format for data exported via /api/v1/export/native -
now this data contains MaxTimestamp and PrecisionBits fields from blockHeader.

This is OK, since the native export format is undocumented.
2022-09-06 13:08:09 +03:00
Zakhar Bessarab
c5261d5f56 vmctl: implement support of chunking data for vm-native export (#3044)
vmctl: implement support of chunking data for vm-native export process

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2733
2022-09-06 09:09:34 +02:00
Craig Rodrigues
462fc7b394 docs: Clarify DataDog examples for VictoriaMetrics cluster (#3048)
Signed-off-by: Craig Rodrigues <rodrigc@crodrigues.org>
2022-09-05 16:25:05 +02:00
Yurii Kravets
7b04112352 docs: Update keyConcepts (#3040)
docs: update keyConcepts

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-09-05 13:59:28 +02:00
Aliaksandr Valialkin
ae31b2363f app/vmselect/prometheus: follow-up after 50e2524bc2
- Add getCommonParamsWithDefaultDuration function and use it at /api/v1/series, /api/v1/labels and /api/v1/label/.../values
- Document the default behaviour for setting 5 minutes time range if start arg isn't passed to /api/v1/series, /api/v1/labels and /api/v1/label/.../values
- Document the change at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3052
2022-09-05 11:55:48 +03:00
匠心零度
50e2524bc2 api prometheus/api/v1/label/../values time not specified, (#3052)
modify default start values
2022-09-05 11:52:19 +03:00
Aliaksandr Valialkin
7dc632719d app/vmselect/promql: consistently calculate rate_over_sum(m[d]) as sum_over_time(m[d])/d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3045
2022-09-02 23:18:04 +03:00
Aliaksandr Valialkin
2d4619c9a0 Makefile: properly push public tags 2022-09-02 22:42:13 +03:00
Aliaksandr Valialkin
ac98ecdc1d docs/CHANGELOG.md: cut v1.81.1 2022-09-02 21:58:28 +03:00
Aliaksandr Valialkin
298b3c7f45 Makefile: push v1.xx.y and v1.xx.y-cluster tags to github before creating the v1.xx.y release at github
Otherwise Github creates the v1.xx.y tag on itself when creating the release
2022-09-02 21:48:07 +03:00
Aliaksandr Valialkin
9a1ede0977 vendor: make vendor-update 2022-09-02 21:42:41 +03:00
Aliaksandr Valialkin
08538ff82a app/vmselect/netstorage: fix potential panic under high load
The panic may trigger during data blocks' processing received
from vmstorage nodes when some of vmstorage nodes return an error
or when `-replicationFactor` is set to values higher than 2 at `vmselect`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3058
2022-09-02 21:38:27 +03:00
Aliaksandr Valialkin
4076277cf0 app/vmselect/promql: evaluate union() args in parallel in order to increase query performance
Note that the parallel execution of `union()` args may take more memory and CPU time
than the sequential execution if args contain heavy queries, which may load all the available CPU,
disk and memory resources and vmselect and vmstorage levels.
2022-09-02 19:46:27 +03:00
Aliaksandr Valialkin
f9d4ade35a docs/Articles.md: add a link to https://aatarasoff.medium.com/optimizing-linkerd-metrics-in-prometheus-de607ec10f6b 2022-08-31 05:03:22 +03:00
Aliaksandr Valialkin
b26de84b4a docs/FAQ.md: mention that single-node vm is easier to setup and operate than the cluster version of vm 2022-08-31 05:02:15 +03:00
Aliaksandr Valialkin
6b0050a028 docs/CHANGELOG.md: cut v1.81.0 2022-08-31 02:33:23 +03:00
Max Golionko
c685afebb2 simplify release process (#3012)
* simplify release process

* address comments

* address comments

* wip

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-31 02:27:24 +03:00
Aliaksandr Valialkin
6c5d5bc7e6 docs/CHANGELOG.md: add missing link to vmagent 2022-08-31 02:21:44 +03:00
Aliaksandr Valialkin
c84e6429a0 docs/CHANGELOG.md: document v1.79.3 LTS release 2022-08-31 00:22:31 +03:00
Aliaksandr Valialkin
022bb62fa1 docs: clarify why cluster version is recommended to use for serving high loads 2022-08-31 00:22:30 +03:00
Craig Rodrigues
a54d7c24ff docs: Add DataDog to list of ingestion methods (#3047)
Signed-off-by: Craig Rodrigues <rodrigc@crodrigues.org>

Signed-off-by: Craig Rodrigues <rodrigc@crodrigues.org>
2022-08-31 00:21:53 +03:00
Craig Rodrigues
e68a97f8d4 docs: add https://github.com/VictoriaMetrics/ansible-playbooks to README.md (#3046)
Signed-off-by: Craig Rodrigues <rodrigc@crodrigues.org>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2022-08-30 20:52:24 +02:00
Aliaksandr Valialkin
bcdba8be18 docs/FAQ.md: add chapters explaining main differences between single-node and cluster versions of VictoriaMetrics
- Which type of VictoriaMetrics is better to use in production?
- How to migrate data from single-node VictoriaMetrics to cluster?
2022-08-30 13:12:03 +03:00
匠心零度
6d81584d2a reduce unnecessary vmstorage query (#3031)
* reduce unnecessary vmstorage query

* reduce unnecessary vmstorage query

* rollback limit logic /api/v1/label/*
2022-08-30 12:36:54 +03:00
Artem Navoiev
5d4b1bc742 update multi-region guide, specify multi-level vmselect option (#3039)
* update multi-region guide, specify multi-level vmselect option

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* add more references

* Apply suggestions from code review

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-30 12:26:23 +03:00
Denys Holius
eb9098ce12 snap/snapcraft.yml: yet one fix for broken links 2022-08-30 11:17:47 +03:00
Denys Holius
7d8a2c0481 snap/snapcraft.yml: fixed broken link 2022-08-30 11:17:47 +03:00
Denys Holius
3d6ae60d58 04-install-victoriametrics.sh: update download link for tar.gz
see Update note 5 at https://docs.victoriametrics.com/CHANGELOG.html#v1790
2022-08-30 11:16:59 +03:00
Denys Holius
9878a93428 RELEASE_GUIDE.md: updated the document considering the fixes 2022-08-30 11:16:59 +03:00
Denys Holius
9a90da4545 remove not needed template.json 2022-08-30 11:16:59 +03:00
Denys Holius
cb26de726e template.pkr.hcl: fix wrong variable name 2022-08-30 11:16:59 +03:00
Aliaksandr Valialkin
1eec2460ba docs: fix links to DogStatsD
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3042
2022-08-30 10:49:26 +03:00
Aliaksandr Valialkin
6ba93bf2dc vendor: make vendor-update 2022-08-30 09:45:26 +03:00
Aliaksandr Valialkin
d2e94ee91a docs/CHANGELOG.md: document the 044d51b668 2022-08-30 09:42:19 +03:00
Denys Holius
044d51b668 deployment/docker/Makefile: bump version of Alpine linux to latest 3.16.2 to fix CVE-2022-37434 (#3035)
see https://alpinelinux.org/posts/Alpine-3.13.12-3.14.8-3.15.6-3.16.2-released.html
2022-08-29 17:29:16 +02:00
Bryce Lampe
74f8e12e87 Support "HTTP" and "HTTPS" schemes (#3019)
* Support "HTTP" and "HTTPS" schemes

* Update lib/promscrape/config.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2022-08-27 02:22:37 +03:00
Aliaksandr Valialkin
ad11b8d83d app/vmselect/promql: follow-up after 2d71b4859c
- Use getScalar() function for obtaining the expected scalar from phi arg
- Reduce the error message returned to the user when incorrect phi is passed to histogram_quantiles
- Improve the description of this bugfix in the docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3026
2022-08-27 01:35:49 +03:00
Dmytro Kozlov
2d71b4859c vmselect/promql: fix panic in histogram_quantiles function (#3029)
* vmselect/promql: fix panic in histogram_quantiles function

* Update docs/MetricsQL.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-08-27 01:33:56 +03:00
Aliaksandr Valialkin
30b8d91727 lib/promscrape/discoveryutils: always store just allocated string to sanitized label names cache
This is a follow-up for c06e7a142c
2022-08-27 00:28:39 +03:00
Aliaksandr Valialkin
c06e7a142c lib/promscrape: optimize discoveryutils.SanitizeLabelName()
Cache sanitized label names and return them next time.
This reduces the number of allocations and speeds up the SanitizeLabelName()
function for common case when the number of unique label names is smaller than 100k
2022-08-27 00:17:45 +03:00
Aliaksandr Valialkin
a2cd79576f lib/promrelabel: call PromRegex.MatchString() on a slow path only if it contains non-empty literal prefix
This should improve slow path speed for regexps without literal prefixes
2022-08-26 21:48:30 +03:00
Zakhar Bessarab
30fb4b948e vmctl: fix progress bar not being stopped on error during import process (#3023)
vmctl: fix progress bar not being stopped on error during import process
2022-08-26 18:20:27 +02:00
Aliaksandr Valialkin
f49c9bb700 lib/promrelabel: optimize common regex mismatch cases for action: replace and action: labelmap 2022-08-26 15:45:31 +03:00
Aliaksandr Valialkin
4c6916f32a lib/promrelabel: use regexutil.PromRegex for regex matching in actions labeldrop,labelkeep,drop and keep
This makes possible optimizing additional cases inside regexutil.PromRegex
2022-08-26 15:23:45 +03:00
Aliaksandr Valialkin
7afe8450fc lib/promrelabel: optimize matching for commonly used regex patterns in if option
The following regex patterns are optimized:

- literal string match, e.g. "foo"
- prefix match, e.g. "foo.*" and "foo.+"
- substring match, e.g. ".*foo.*" and ".+foo.+"
- alternate values match, e.g. "foo|bar|baz"
2022-08-26 14:53:06 +03:00
Aliaksandr Valialkin
0ad3bbadd3 lib/regexutil: add Simplify() function for simplifying the regular expression 2022-08-26 11:57:12 +03:00
Aliaksandr Valialkin
b373661988 lib/promrelabel: optimize action: {drop,keep,labeldrop,labelkeep} with anchored regex prefix
The following commonly used relabeling rules must work faster now:

- action: labeldrop
  regex: "^foo.+$"

- action: labeldrop
  regex: "^bar.*"
2022-08-25 23:23:55 +03:00
Aliaksandr Valialkin
0d4ea03a73 lib/promrelabel: optimize action: {labeldrop,labelkeep,keep,drop} with regex containing alternate values
For example, the following relabeling rule must work much faster now:

- action: labeldrop
  regex: "foo|bar|baz"
2022-08-24 17:54:29 +03:00
Aliaksandr Valialkin
0d46e24af5 lib/storage: increase the maximum possible or values extracted from regexp from 20 to 100
This should improve time series search speed for regexp filters with big number of `or` values.
2022-08-24 17:15:25 +03:00
Aliaksandr Valialkin
fdbf5b5795 lib/storage: ignore start text and end text anchors in getOrValues(regexp) function
This is OK, since the anchors are implicitly applied to the whole regexp.
This optimization should improve the speed for regexp series filters with explicit $ and ^ anchors.
For example, `{label="^(foo|bar)$"}`
2022-08-24 17:12:52 +03:00
Aliaksandr Valialkin
cdffe401e4 app/vmagent: follow-up after 2b22aa1537
- Document the change at docs/CHANGELOG.md
- Move auth token parsing from app/vmagent/opentsdbhttp/ to app/vmagent/main.go,
  since it must be parsed only when multitenancy support is enabled at vmagent side.
  See https://docs.victoriametrics.com/vmagent.html#multitenancy
2022-08-24 16:18:59 +03:00
Jianyun Cheng
2b22aa1537 [vmagent] make opentsdb insert url support multitenant (#3015) 2022-08-24 16:17:44 +03:00
Dmytro Kozlov
463ea6897b vmselect/promql: enable search.maxPointsSubqueryPerTimeseries for sub-queries (#2963)
* vmselect/promql: enable search.maxPointsPerTimeSeriesSubquery for sub-queries

* vmselect/promql: cleanup

* vmselect/promql: rename config flag

* vmselect/promql: add tests

* vmselect/promql: use test object instead of log

* vmselect/promql: fix posible panic is subquery has more points. add description

* vmselect/promql: update tests descriptions

* vmselect/promql: update doInternal validation

* vmselect/promql: fix linter

* vmselect/promql: fix linter

* vmselect/promql: update documentation and release notes

* wip

- Properly apply -search.maxPointsSubqueryPerTimeseries limit to subqueries.
  Previously the -search.maxPointsPerTimeseries limit was unexpectedly applied to subqueries
  if it was smaller than the -search.maxPointsSubqueryPerTimeseries .
- Clarify docs for -search.maxPointsSubqueryPerTimeseries command-line flag .
- Document -search.maxPointsPerTimeseries and -search.maxPointsSubqueryPerTimeseries flags at https://docs.victoriametrics.com/#resource-usage-limits .
- Update docs/CHANGELOG.md .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2922

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-24 15:25:18 +03:00
Aliaksandr Valialkin
3d12ee47f9 docs: mention that it is safe sharing the collected profiles from security PoV
The collected profiles do not contain sensitive information
2022-08-24 14:07:36 +03:00
Aliaksandr Valialkin
796aa310c2 app/vmstorage: expose vm_{hourly,daily}_series_limit_{max,current}_series metrics if -storage.max{Hourly,Daily}Series limits are set
These metrics allow alerting when the number of unique series approach the limit.
For example, the following query alerts when the number of series reaches 90% of the configured limit:

    vm_hourly_series_limit_current_series / vm_hourly_series_limit_max_series > 0.9
2022-08-24 13:44:04 +03:00
Aliaksandr Valialkin
b1e1c50627 all: update Google Analytics tracking code from Unversal Analytics to v4
This is needed because Google Analytics devs decided to force their users
to update all their tracking codes to GA v4.

See https://support.google.com/analytics/answer/9744165

The link to the new tracking property ( G-N9SVT8S3HK ) - https://analytics.google.com/analytics/web/?authuser=1#/a129683199p328513681/admin/streams/table/4004610168
2022-08-24 12:16:37 +03:00
Aliaksandr Valialkin
c011fb0f30 docs/vmagent.md: fix alerting query when scraped samples are dropped because of exceeded series limit
This is a follow-up after 7d26414b2e
2022-08-24 01:18:31 +03:00
Roman Khavronenko
8d0f5b9e60 docs: follow-up after 88425bb285 (#3007)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-22 13:49:17 +02:00
laixintao
88425bb285 vmalert: add $activeAt into template variables. (#3000)
vmalert: add `$activeAt` template variable for annotations

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2999
2022-08-22 13:32:36 +02:00
Aliaksandr Valialkin
343241680b all: remove the remaining bits of io/ioutil
The io/ioutil package is deprecated since Go1.16 - see https://tip.golang.org/doc/go1.16#ioutil

VictoriaMetrics requires at least Go1.18, so it is time to remove the io/ioutil from source code

This is a follow-up for 02ca2342ab
2022-08-22 00:20:58 +03:00
Aliaksandr Valialkin
1f89278d88 all: subsitute ioutil.ReadAll with io.ReadAll
ioutil.ReadAll is deprecated since Go1.16 - see https://tip.golang.org/doc/go1.16#ioutil
VictoriaMetrics requires at least Go1.18, so it is OK to switch from ioutil.ReadAll to io.ReadAll.

This is a follow-up for 02ca2342ab
2022-08-22 00:16:37 +03:00
Aliaksandr Valialkin
2c3a89339d all: use os.ReadDir instead of ioutil.ReadDir
The ioutil.ReadDir is deprecated since Go1.16 - see https://tip.golang.org/doc/go1.16#ioutil
VictoriaMetrics requires at least Go1.18, so it is time to switch from io.ReadDir to os.ReadDir

This is a follow-up for 02ca2342ab
2022-08-22 00:02:25 +03:00
Aliaksandr Valialkin
9f94c295ab all: use os.{Read|Write}File instead of ioutil.{Read|Write}File
The ioutil.{Read|Write}File is deprecated since Go1.16 -
see https://tip.golang.org/doc/go1.16#ioutil

VictoriaMetrics needs at least Go1.18, so it is safe to remove ioutil usage
from source code.

This is a follow-up for 02ca2342ab
2022-08-21 23:52:35 +03:00
Cosrider
02ca2342ab app/victoria-metrics: replace ioutil package with os package (#2993)
Signed-off-by: Cosrider <cosrider7@gmail.com>

Signed-off-by: Cosrider <cosrider7@gmail.com>
2022-08-21 23:41:31 +03:00
Aliaksandr Valialkin
6e4e3fae63 docs/CHANGELOG.md: document d59d829cdb
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2673
2022-08-21 23:35:47 +03:00
Roman Khavronenko
d59d829cdb lib/storage: bump max merge concurrency for small parts to 15 (#2997)
* lib/storage: bump max merge concurrency for small parts to 15

The change is based on the feedback from users on github.
Thier examples show, that limit of 8 sometimes become a
bottleneck. Users report that without limit concurrency
can climb up to 15-20 merges at once.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update lib/storage/partition.go

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-21 23:32:08 +03:00
Ivan Dudin
f69990ba10 Update Cluster-VictoriaMetrics.md (#3005)
Flags `-search.maxTagKeys` and `-search.maxTagValues` are present at vmstorage and not at vmselect
```
# ./vmselect-prod -h | egrep 'search.maxTagKeys|search.maxTagValues'
# ./vmstorage-prod -h | egrep 'search.maxTagKeys|search.maxTagValues'
  -search.maxTagKeys int
  -search.maxTagValues int
```
2022-08-21 23:28:40 +03:00
Aliaksandr Valialkin
22aa07b8c9 docs/Cluster-VictoriaMetrics.md: document the best strategies for cluster update / upgrade 2022-08-21 23:21:02 +03:00
Aliaksandr Valialkin
8550c44e31 app/vmagent: add ability to construct a label from multiple existing labels by referring them in the replacement field during relabeling
For example:

- target_label: composite-label
  replacement: "{{source_label1}}-{{source_label2}}"
2022-08-21 22:50:01 +03:00
Aliaksandr Valialkin
278481f71d vendor: make vendor-update 2022-08-21 19:06:28 +03:00
Aliaksandr Valialkin
f2043d53ad docs: change links to Prometheus docs about instant and range queries to links to VictoriaMetrics docs 2022-08-21 19:01:54 +03:00
Aliaksandr Valialkin
e94b4622f3 docs/keyConcepts.md: more fixes 2022-08-21 18:53:37 +03:00
Aliaksandr Valialkin
af8201cbcc docs/FAQ.md: add a question on differences between Grafana Mimir and VictoriaMetrics 2022-08-21 17:00:33 +03:00
Aliaksandr Valialkin
1b6f66d566 docs/keyConcepts.md - clarify docs a bit 2022-08-21 12:00:21 +03:00
Aliaksandr Valialkin
c6736a3ad2 docs/Cluster-VictoriaMetrics.md: clarify required conditions for cluster availability 2022-08-20 09:40:34 +03:00
Yurii Kravets
4ca189bf94 doc: udpate Cluster-VictoriaMetrics (#3003)
added note about resource usage for data re-routing
2022-08-20 09:37:27 +03:00
Aliaksandr Valialkin
3a1eb471a3 app/vmalert/README.md: sync with docs/vmalert.md after a229182dbe 2022-08-20 08:54:01 +03:00
laixintao
a229182dbe doc: fix broken url of template function for vmalert. (#3002) 2022-08-20 08:50:26 +03:00
Aliaksandr Valialkin
b90103d950 docs/CHANGELOG.md: document 10402459d8 2022-08-19 11:43:59 +03:00
Denys Holius
202ff2216f deployment: bump Grafana version to latest 9.1.0 (#2996)
see more at https://grafana.com/blog/2022/08/16/grafana-9.1-release/
2022-08-18 12:21:22 +02:00
Roman Khavronenko
f451e0eabe docs: fix docs formatting related to vmalert (#2994)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-18 09:53:46 +02:00
Aliaksandr Valialkin
dff6314c87 docs/relabeling.md: fix heading for useful tips for target relabeling 2022-08-18 01:32:49 +03:00
Aliaksandr Valialkin
149eb59546 docs/relabeling.md: typo fixes 2022-08-18 01:30:03 +03:00
Aliaksandr Valialkin
c1b43afdcd docs/relabeling.md: improve relabeling tips for scrape targets 2022-08-18 01:24:50 +03:00
Aliaksandr Valialkin
21e63e1518 docs/relabeling.md: add a link to VictoriaMetrics enhancements for relabeling 2022-08-18 01:20:01 +03:00
Aliaksandr Valialkin
421aca7b2f docs: fix ordering after adding the docs/relabeling.md 2022-08-18 01:17:27 +03:00
Aliaksandr Valialkin
43c6f81b90 docs/relabeling.md: add a cookbook for common relabeling tasks 2022-08-18 01:13:52 +03:00
Roman Khavronenko
31f922944e lib/storage: fix the search for empty label name (#2991)
* lib/storage: fix the search for empty label name

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-17 21:32:25 +03:00
Aliaksandr Valialkin
22c47e97a5 docs: follow-up after 68e56b6fc5 2022-08-17 21:24:00 +03:00
Aliaksandr Valialkin
28a7a19a94 go.mod: update github.com/VictoriaMetrics/metrics from v1.22.1 to v1.22.2 2022-08-17 21:13:06 +03:00
Roman Khavronenko
68e56b6fc5 vmalert: set alert's source link to UI instead of JSON source (#2986)
We switch default alert's source link to redirect user
to vmalert's UI instead of previous JSON object. While it breaks
compatibility, it also supposed to improve user's experience.
The old behavior can be achieved by updating `-external.alert.source`
command-line flag.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-17 14:46:28 +02:00
Aliaksandr Valialkin
eaa5d6cbd7 docs/sd_configs.md: clarify a bit honor_labels config option 2022-08-17 14:25:55 +03:00
Aliaksandr Valialkin
7d26414b2e lib/promscrape: automatically generate additional per-target labels for targets with non-zero series limit
The following metrics are generated:

- scrape_series_limit
- scrape_series_current
- scrape_series_limit_samples_dropped

These metrics simplify alerting on targets, which expose too many time series

See https://docs.victoriametrics.com/vmagent.html#automatically-generated-metrics
and https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more details
2022-08-17 13:19:33 +03:00
Aliaksandr Valialkin
5b449649b6 docs/CHANGELOG.md: group vmalert features closer to each other 2022-08-17 11:50:23 +03:00
Roman Khavronenko
04c174a11e docs: add new article link (#2989)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-17 10:01:01 +02:00
Aliaksandr Valialkin
bb68ab99fa lib/promscrape: retry http requests if the server returns 429 status code
The 429 status code means that the server is overwhelmed with requests.
The client can retry the request after some wait time.
Implement this strategy for service discovery and scrape requests.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2940
2022-08-16 15:01:08 +03:00
Aliaksandr Valialkin
b0e1bb517e lib/storage: typo fix in comments after f830edc0bc 2022-08-16 13:44:45 +03:00
Aliaksandr Valialkin
f830edc0bc lib/storage: improve performance for /api/v1/labels and /api/v1/label/.../values endpoints when match[] filter matches small number of time series
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978
2022-08-16 13:32:40 +03:00
Aliaksandr Valialkin
e0e7c14788 docs/MetricsQL.md: add the list of supported ... functions lines just before the corresponding lists
This improves the readability a bit
2022-08-16 12:08:57 +03:00
Roman Khavronenko
8b3989ba39 docs: update vmalert docs (#2987)
* mention recently added `$alertID` and `$groupID` variables in the changelog
* properly escape template examples in the vmalert's README

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-16 11:44:08 +03:00
Roman Khavronenko
f1b2273d13 vmalert: support $alertID and $groupID in template variables (#2983)
Support of these two variables allows building custom URLs with
alert's ID and group ID params.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/517#issuecomment-1207141432

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-16 08:08:27 +02:00
Artem Navoiev
edd7a92e8b fix typo in influx guide (#2984) 2022-08-15 22:59:35 +02:00
Aliaksandr Valialkin
4ac79d29ad app/vmselect: follow-up after 63e0f16062
* Explicitly store a pointer to UserReadableError in the error interface.
  Previously Go automatically converted the value to a pointer before storing in the error interface.

* Add Unwrap() method to UserReadableError, so it can be used transparently with the other code,
  which calls errors.Is() and errors.As().

* Document the change in docs/CHANGELOG.md
2022-08-15 13:50:16 +03:00
Roman Khavronenko
63e0f16062 vmselect: introduce UserReadableError type of error (#2894)
When read query fails, VM returns rich error message with
all the details. While these details might be useful
for debugging specific cases, they're usually too verbose
for users.
Introducing a new error type `UserReadableError` is supposed
to allow to return to user only the most important parts
of the error trace. This supposed to improve error readability
in web interfaces such as VMUI or Grafana.

The full error trace is still logged with the full context
and can be found in vmselect logs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-15 13:38:47 +03:00
Roman Khavronenko
5792cae0ab docs: mention default vmalert's behavior change in Update notes (#2981)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-15 13:31:24 +03:00
Yury Molodov
1513866d51 vmui: shortcut keys legend (#2971)
* feat: add shortcut modal

* feat: add shortcut descriptions

* app/vmselect/vmui: `make vmui-update`

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-15 11:54:47 +03:00
Yurii Kravets
88c4f29ea5 doc: add upgrade/downgrade without downtime to FAQ (#2973)
* doc: add upgrade/downgrade without downtime to FAQ

Added info on how to upgrade or downgrade VictoriaMetrics without downtime to FAQ
Based on reply https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2944#issuecomment-1207502038

* Udate pr

Fixing format
2022-08-15 11:28:39 +03:00
Aliaksandr Valialkin
c3f8481011 lib/promscrape: update links to sd_configs from Prometheus site to https://docs.victoriametrics.com/sd_configs.html 2022-08-15 01:40:20 +03:00
Aliaksandr Valialkin
95d36da358 lib/promscrape/discovery/kubernetes: add __meta_kubernetes_pod_container_image label in the same way as Prometheus 2.38 does
See https://github.com/prometheus/prometheus/pull/11034
2022-08-15 01:18:23 +03:00
Aliaksandr Valialkin
c4fcd9f1c5 lib/promscrape/discovery/kubernetes: add __meta_kubernetes_service_port_number label to role: service in the same way as Prometheus 2.38 does
See https://github.com/prometheus/prometheus/pull/11002
2022-08-15 01:06:34 +03:00
Aliaksandr Valialkin
308f29f674 vendor: make vendor-update 2022-08-15 00:53:41 +03:00
Aliaksandr Valialkin
d335694add app/vmalert/templates: add toTime() template function in the same way as Prometheus 2.38 does
See https://github.com/prometheus/prometheus/pull/10993
2022-08-15 00:49:31 +03:00
Aliaksandr Valialkin
511805d88d lib/promscrape/discovery/dns: add support for resolving MX records
See https://github.com/prometheus/prometheus/pull/10099
2022-08-15 00:32:34 +03:00
Aliaksandr Valialkin
e1cb15807e app/vmselect/netstorage: improve scalability of blocks processing on systems with multiple CPU cores
Previously a single syncwg.WaitGroup was used for tracking the lifetime of processBlock callbacks
across all the per-vmstorage goroutines. This could be slow on systems with many CPU cores
because of inter-CPU synchronization overhead.

Use a separate per-vmstorage sync.WaitGroup instead in order to reduce inter-CPU synchronization overhead.
This should imrpove performance for heavy queries over big number of blocks on multi-CPU systems.
2022-08-11 23:53:58 +03:00
Aliaksandr Valialkin
1e4fc20486 docs/CHANGELOG.md: clarify the change at 28441711e6 2022-08-11 23:49:22 +03:00
Roman Khavronenko
fa51c76ef9 vmalert: follow-up after 28441711e6 (#2972)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-11 13:30:32 +02:00
Matthew Blewitt
28441711e6 vmalert: mark some url flags as sensitive (#2965)
Other components, such as `vmagent`, mark these flags as sensitive and
hide them from the `/metrics` endpoint by default. This commit adds
similar handling to the `vmalert` component, hiding them by default, to
prevent logging of secrets inappropriately.

Showing of these values is controlled by an additional flag.

Follow up to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2947
2022-08-11 09:56:40 +02:00
Aliaksandr Valialkin
45d94d12ba deployment/docker: specify docker image tags for all the docker images for reproducible docker-compose up runs 2022-08-09 12:26:28 +03:00
Roman Khavronenko
a0e7432e42 lib/storage: prevent excessive loops when storage is in RO (#2962)
* lib/storage: prevent excessive loops when storage is in RO

Returning nil error when storage is in RO mode results
into excessive loops and function calls which could
result into CPU exhaustion. Returning an err instead
will trigger delays in the for loop and save some resources.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-09 12:17:00 +03:00
Aliaksandr Valialkin
968688f4f6 snap/local/Makefile: upgrade Go builder for snap package from Go1.18.1 to Go1.19.0 2022-08-09 12:07:53 +03:00
Roman Khavronenko
ef095a9350 vmalert: sort groups at /alerts page (#2968)
Sorting will produce deterministic output
of grops on the page.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-09 09:51:29 +02:00
Aliaksandr Valialkin
ad00f4aaaa docs/CHANGELOG.md: cut v1.80.0 2022-08-08 19:58:50 +03:00
Roman Khavronenko
289a4862ba dashboards: add Cache usage % panel to Caches row (#2964)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2941
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-08 19:37:34 +03:00
Aliaksandr Valialkin
901300aea2 docs/CHANGELOG.md: add changes for v1.79.2 2022-08-08 18:00:00 +03:00
Aliaksandr Valialkin
1a851b14c9 vendor: update github.com/VictoriaMetrics/metrics from v1.21.0 to v1.22.1 2022-08-08 17:18:42 +03:00
Aliaksandr Valialkin
06b6063a36 docs/CHANGELOG.md: link to the issue related to vmselect panic in multi-level cluster setup
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2961
2022-08-08 15:09:09 +03:00
Aliaksandr Valialkin
44c4c1a8cb docs/vmagent.md: typo fix in __meta_kubernetes_annotation_prometheus_io_tenant label name
It must be __meta_kubernetes_pod_annotation_prometheus_io_tenant
2022-08-08 14:52:49 +03:00
Aliaksandr Valialkin
46d7792b72 lib/promscrape: follow-up after 2c553d5a2f
- fix broken tests
- cosmetic code cleanup
- document the change at https://docs.victoriametrics.com/vmagent.html#multitenancy
- document the change at https://docs.victoriametrics.com/CHANGELOG.html
2022-08-08 14:46:26 +03:00
Fury
2c553d5a2f add support to scrape multi tenant metrics (#2950)
* add support to scrape multi tenant metrics

* add support to scrape multi tenant metrics

Co-authored-by: 赵福玉 <zhaofuyu@zhaofuyudeMac-mini.local>
2022-08-08 14:10:18 +03:00
Aliaksandr Valialkin
7a87251ff5 docs: sync -help output with the latest changes 2022-08-08 14:04:30 +03:00
Roman Khavronenko
d3f13ab85b lib/promrelabel: fix expected test result (#2957)
follow-up after 68c4ec9472

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-08 13:47:29 +03:00
Aliaksandr Valialkin
7b8bc8ad59 all: bump the minimum supported version of Go from 1.17 to 1.18
This is needed because some dependencies uses generics, which have been appeared in Go1.18

This is a follow-up for caf3dd4fa2
2022-08-08 13:39:38 +03:00
Aliaksandr Valialkin
730c39876e app/vmselect/netstorage: prevent from calling processBlocks callback after the exit from ProcessBlocks function
This should prevent from panic at multi-level vmselect
when the top-level vmselect is configured with -replicationFactor > 1
2022-08-08 13:35:25 +03:00
Roman Khavronenko
a086e48964 vmalert: remove notions of vmalert being compatible with VM only (#2954)
vmalert can be successfully used with datasources
compatible with Prometheus HTTP API. So we remove comments or
notes in Readme which are saying opposite.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-08 09:45:21 +02:00
Roman Khavronenko
caf3dd4fa2 workflows: bump go version (#2955)
Some new dependencies contain generics, so we bump go version for CI.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-08 09:24:15 +02:00
Aliaksandr Valialkin
34f0341b36 docs/CHANGELOG.md: link to the issue with improper handling of enpoint option at ec2_sd_configs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2917
2022-08-08 03:32:50 +03:00
Aliaksandr Valialkin
68c4ec9472 lib/promrelabel: do not split regex into multiple lines if it contains groups
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2928
2022-08-08 03:15:26 +03:00
Aliaksandr Valialkin
51debcdf6d docs/CHANGELOG.md: link to the issue regarding the increased load on Consul
This is a follow-up for 68de1f4e4a

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2940
2022-08-08 02:22:31 +03:00
Aliaksandr Valialkin
566d12d9ef docs/_includes/img.html: fix https://github.com/VictoriaMetrics/VictoriaMetrics/security/code-scanning/28 2022-08-08 01:03:47 +03:00
laixintao
97abb601f2 bugfix: fix vmalert navbar url. (#2949)
the doc url should not be joined by `prefix` because it's an abs url.
2022-08-08 00:28:52 +03:00
Yury Molodov
3cb013aeb8 vmui: graph action on moush hold and move (#2915)
* fix: change event for graph panning

* fix: change detect key

* feat: add zoom in with mouse selection

* - document the change
- run `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-07 23:55:09 +03:00
Aliaksandr Valialkin
892c97e350 lib/auth: follow-up after b6a6a659f4 2022-08-07 23:14:39 +03:00
Dmytro Kozlov
b6a6a659f4 lib/auth: add tests for NewToken function (#2921)
* lib/auth: add tests from NewToken function

* lib/auth: update test, fix problem with type conversion

* lib/auth: update test description

* lib/auth: simplify failure tests
2022-08-07 23:07:57 +03:00
Aliaksandr Valialkin
9fa6b25fb2 lib/logger: prettify logging the defined command-line flags 2022-08-07 22:58:29 +03:00
Aliaksandr Valialkin
ebd59e17df vendor: make vendor-update 2022-08-07 22:38:01 +03:00
Aliaksandr Valialkin
8338776ed0 Makefile: update golangci-lint from v1.47.1 to v1.48.0
This is needed for adding support for Go 1.19
2022-08-07 22:33:05 +03:00
Aliaksandr Valialkin
2105c43982 deployment/docker: update Go builder from Go1.18.5 to Go1.19.0
See https://tip.golang.org/doc/go1.19

Notable changes:

* GOMEMLIMIT environment variable - see https://tip.golang.org/doc/gc-guide
* Faster CPU profiler
* Faster sort algorithm
2022-08-07 21:06:07 +03:00
Aliaksandr Valialkin
9f37935819 Makefile: remove redundant -mod=vendor option when running Go tools
The `-mod=vendor` is automatically set when there is a `vendor` directory
starting from Go1.14 - see https://go.dev/doc/go1.14#go-command

Since the minimum supported Go version for VictoriaMetrics is Go1.17,
then the `-mod=vendor` option is no longer needed.
2022-08-07 20:39:08 +03:00
Aliaksandr Valialkin
004e683c55 docs/sd_configs.md: mention that http client configs can contain headers and proxy_headers options 2022-08-07 18:27:26 +03:00
Aliaksandr Valialkin
0e3f21bc25 docs/sd_configs.md: mention that http client options can be specified in scrape_configs section 2022-08-07 00:21:00 +03:00
Aliaksandr Valialkin
f36de0ecc9 docs: add docs for scrape_configs section 2022-08-07 00:04:31 +03:00
Aliaksandr Valialkin
46ed82b894 docs/sd_configs.md: add docs for static_configs 2022-08-06 23:17:17 +03:00
Aliaksandr Valialkin
b8fc2d356f docs/sd_configs.md: add docs for openstack_sd_configs 2022-08-06 23:07:01 +03:00
Aliaksandr Valialkin
34d5eda904 docs/sd_configs.md: document kubernetes_sd_configs 2022-08-06 22:38:39 +03:00
Aliaksandr Valialkin
0ef29ceb14 lib/promscrape/discovery/kubernetes: add missing __meta_kubernetes_ingress_class_name label for role: ingress
See 7e65ad3e43
and 7e1111ff14
2022-08-05 20:55:00 +03:00
Aliaksandr Valialkin
b8fdac4bd7 docs/sd_configs.md: document http_sd_configs 2022-08-05 19:50:06 +03:00
Aliaksandr Valialkin
d8b9cb909a docs/sd_configs.md: document gce_sd_configs 2022-08-05 19:36:57 +03:00
Aliaksandr Valialkin
e3b427ea54 docs/sd_configs.md: document file_sd_configs 2022-08-05 19:16:54 +03:00
Aliaksandr Valialkin
fad2e79747 docs/sd_configs.md: document eureka_sd_configs 2022-08-05 19:04:36 +03:00
Aliaksandr Valialkin
417d3baab0 docs/sd_configs.md: document ec2_sd_configs 2022-08-05 18:51:31 +03:00
Aliaksandr Valialkin
f2816ef031 lib/promscrape/discovery/ec2: properly handle custom endpoint option in ec2_sd_configs
This option was ignored since d289ecded1

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287
2022-08-05 18:50:02 +03:00
Aliaksandr Valialkin
60e5005d17 docs/sd_configs.md: add docs for dockerswarm_sd_configs 2022-08-05 16:19:57 +03:00
Aliaksandr Valialkin
3e8890e71b lib/promscrape/discovery/dockerswarm: properly set __meta_dockerswarm_container_label_* labels instead of __meta_dockerswarm_task_label_* labels
See https://github.com/prometheus/prometheus/issues/9187
2022-08-05 16:11:28 +03:00
Aliaksandr Valialkin
5760c68dd7 docs/sd_configs.md: document docker_sd_configs 2022-08-05 15:36:27 +03:00
Aliaksandr Valialkin
aa374af910 docs/sd_configs.md: document dns_sd_configs 2022-08-05 15:26:40 +03:00
Aliaksandr Valialkin
2d4a6a2237 docs/sd_configs.md: document digitalocean_sd_configs 2022-08-05 15:16:45 +03:00
Aliaksandr Valialkin
ca4f5eac0e docs/sd_configs.md: document consul_sd_configs 2022-08-05 15:03:22 +03:00
Aliaksandr Valialkin
68de1f4e4a lib/promscrape/discovery/consul: allow stale responses from Consul service discovery by default
This aligns with Prometheus behaviour.

See `allow_stale` option description at https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
2022-08-05 14:41:40 +03:00
Aliaksandr Valialkin
01a380deb8 docs/sd_configs.md: document azure_sd_configs 2022-08-05 14:15:40 +03:00
Aliaksandr Valialkin
02de848c88 lib/promscrape/discovery/yandexcloud: further code cleanup after 83a4abda3f 2022-08-05 10:30:47 +03:00
Aliaksandr Valialkin
0c95d87abd docs: fixes after 83a4abda3f 2022-08-05 10:15:00 +03:00
Aliaksandr Valialkin
83a4abda3f lib/promscrape/discovery/yandexcloud: follow-up after 6e5ac32fba
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1386
2022-08-04 22:26:43 +03:00
Igor Tiunov
6e5ac32fba YC service discovery (#2923)
* YC service discovery

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1386

* Fixed linter suggestions

* fixed golint errors
2022-08-04 20:44:16 +03:00
Aliaksandr Valialkin
4e3e9b667e docs: exporing -> exporting typo fix
This is a follow-up after ccb6cb6501
2022-08-04 20:30:48 +03:00
Bastien Dronneau
ccb6cb6501 docs(README): just a small typo (#2934) 2022-08-04 20:29:48 +03:00
kevinflynn387
56c117b558 Docs: Fix typo from "HA paris" to "HA pairs" (#2935)
* Update README.md:  fix typo to "HA pairs"

Update README.md to fix typo from "HA paris" to "HA pairs"

* Update docs/README.md : Fix of typo from "HA paris"

Update docs/README.md : Fix of typo from "HA paris" to "HA pairs"

* Update of Single-server-VictoriaMetrics.md to fix typo "HA paris" to "HA pairs"

Update of Single-server-VictoriaMetrics.md to fix typo "HA paris" to "HA pairs"
2022-08-04 20:28:02 +03:00
Aliaksandr Valialkin
7478d423c5 app/vmselect/netstorage: cleanup after 92630c1ab4
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2896
2022-08-04 18:28:11 +03:00
Aliaksandr Valialkin
d5df08e9c2 lib/mergeset: cleanup after de6dd1cd5a
Remove unused getInmemoryPart and putInmemoryPart functions

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2249
2022-08-04 18:23:01 +03:00
Aliaksandr Valialkin
b5b13e48a3 vendor: update github.com/VictoriaMetrics/metrics from v1.20.1 to v1.21.0
This adds the following push-related metrics when -pushmetrics.url is set:

- metrics_push_interval_seconds
- metrics_push_total
- metrics_push_errors_total
- metrics_push_bytes_pushed_total
- metrics_push_duration_seconds
- metrics_push_block_size_bytes

Updates https://github.com/VictoriaMetrics/metrics/issues/35
2022-08-04 18:15:50 +03:00
Aliaksandr Valialkin
752a3008b4 docs: fix the recommended url for -vmalert.proxyURL accroding to 8667307d73 2022-08-04 17:56:07 +03:00
Aliaksandr Valialkin
7c99b9eaad lib/backup/actions: rename removeLockFile -> removeRestoreLock to have consistent naming with createRestoreLock function 2022-08-04 17:42:43 +03:00
Aliaksandr Valialkin
75c7170624 docs/CHANGELOG.md: document v1.79.1 security fix 2022-08-02 13:30:50 +03:00
Aliaksandr Valialkin
3844e904d0 deployment/docker: update Go builder from v1.18.4 to v1.18.5
See https://github.com/golang/go/issues?q=milestone%3AGo1.18.5+label%3ACherryPickApproved
2022-08-02 13:11:37 +03:00
Aliaksandr Valialkin
6b0550c023 app/{vmselect,vmalert}: properly generate http redirects if -http.pathPrefix command-line flag is set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2918
2022-08-02 12:59:07 +03:00
Denys Holius
5d364545bd deployment/docker/Makefile: added docker-scan (#2916)
* deployment/docker/Makefile: added docker-scan

docker-scan based on native 'docker scan' function that use snyk.io, see https://docs.docker.com/engine/scan/

* set to call 'docker-scan after release binaries but before publishing
2022-08-02 09:54:39 +03:00
Aliaksandr Valialkin
bf65709540 vendor: make vendor-update 2022-08-02 09:19:38 +03:00
Aliaksandr Valialkin
5a4c58f9a2 lib/storage: explain why the GetOrCreateTSIDByName function doesnt check whether the per-day entry for the given date exists if TSID is found in global index 2022-08-02 09:12:29 +03:00
Aliaksandr Valialkin
c2bd75926b app/vmselect/netstorage: initialize tsw.rowsProcessed before calling tsw.f, since tsw.f can modify r.Timestamps and r.Values lengths 2022-07-30 00:39:36 +03:00
Aliaksandr Valialkin
19a0b4679a app/vmselect/netstorage: re-use random generator used for series shuffle in Result.RunParallel
This should reduce CPU usage needed for rand.Rand initialization
2022-07-30 00:30:37 +03:00
Aliaksandr Valialkin
90649de0c4 docs/Release-Guide.md: document that feature docs must be updated with the release where the feature appeared 2022-07-30 00:15:49 +03:00
Aliaksandr Valialkin
78520f2702 lib/storage: do not compress small number of tsids when storing them in tagFiltersCache
This speeds up tsids retreival from the cache for 0-2 tsids
2022-07-30 00:08:51 +03:00
Aliaksandr Valialkin
de6dd1cd5a lib/mergeset: optimize mergeInmemoryBlocks() function
Do not spend CPU time on converting inmemoryBlock structs to inmemoryPart structs.
Just merge inmemoryBlock structs directly.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2249
2022-07-27 23:58:05 +03:00
Aliaksandr Valialkin
a3f5822dc2 lib/mergeset: do not update blockStreamReader.bh.firstItem during the merge
Just read the current item directly from blockStreamReader.Block.Items
with the helper method - blockStreamReader.CurrItem()
2022-07-27 23:05:02 +03:00
Aliaksandr Valialkin
be1c82beb1 benchmark inmemoryBlock.{Marshal,Unmarshal} for different prefix length
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2254

This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2913
2022-07-27 22:20:27 +03:00
Aliaksandr Valialkin
5c84f09762 lib/mergeset: add tests and benchmarks for commonPrefixLen function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2254

This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2913
2022-07-27 21:24:51 +03:00
Dmytro Kozlov
a927814e7b vmselect/promql: add tests for vmrangeBucketsToLE (#2907)
* vmselect/promql: add tests for vmrangeBucketsToLE

* vmselect/promql: cleanup

* vmselect/promql: cleanup

* vmselect/promql: fix panic tests want result

* vmselect/promql: cleanup

* vmselect/promql: update test name

* vmselect/promql: fix linter error

* vmselect/promql: refactor testcases

* vmselect/promql: cleanup

* vmselect/promql: remove unused reassign to workers, fix typo

* wip

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-07-26 20:42:41 +03:00
Aliaksandr Valialkin
f5676123cc lib/pushmetrics: make fmt 2022-07-26 20:40:19 +03:00
Aliaksandr Valialkin
1f6f883016 docs/CHANGELOG.md: clarify docs about the ability to configure multiple sections for vmauth with identical username and different password values 2022-07-26 19:38:46 +03:00
Aliaksandr Valialkin
ef85f45998 docs/Single-server-VictoriaMetrics.md: improve "Push metrics" docs 2022-07-26 19:34:07 +03:00
Aliaksandr Valialkin
da11056d85 all: rename -pushmetrics.extraLabels to -pushmetrics.extraLabel for the sake of consistency 2022-07-26 19:24:24 +03:00
Aliaksandr Valialkin
c888e6b9be app/vmselect/promql: reduce the diff for f148cffc8a
This is a follow-up for c826f06366
2022-07-26 19:20:48 +03:00
Alan Liang
c826f06366 vmselect: fix vmrangeBucketsToLE func may panic when ts value equal zero (#2902)
Co-authored-by: alanwzliang <alanwzliang@tencent.com>
2022-07-25 10:55:13 +03:00
Aliaksandr Valialkin
f148cffc8a vendor: make vendor-update 2022-07-25 10:49:33 +03:00
Aliaksandr Valialkin
7afcc42454 all: push metrics to -pusmetrics.url in gzip-compressed form in order to reduce the needed network bandwidth 2022-07-25 10:42:26 +03:00
Aliaksandr Valialkin
2e0b6d680e app/vmalert/config: add missing docs for ValidateTplFn 2022-07-25 09:23:13 +03:00
Aliaksandr Valialkin
92630c1ab4 app/vmselect/netstorage: improve the speed of queries over big number of time series on multi-CPU system
Reduce inter-CPU communications when processing the query over big number of time series.
This should improve performance for queries over big number of time series
on systems with many CPU cores.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2896

Based on b596ac3745
Thanks to @zqyzyq for the idea.
2022-07-25 09:18:44 +03:00
Roman Khavronenko
1aa5112771 vmalert: remove notifier dependency from config (#2906)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-22 13:50:41 +02:00
Aliaksandr Valialkin
ad6b3cd47d lib/pushmetrics: properly handle errors when initializing pushmetrics 2022-07-22 13:36:06 +03:00
Aliaksandr Valialkin
4c2f9a1a2e lib/promscrape: set up=0 for partially failed scrape in stream parsing mode
This behaviour aligns with Prometheus behavior
2022-07-22 13:29:44 +03:00
Roman Khavronenko
2914ce5ca5 vmalert: remove dependency on datasource pkg from config (#2905)
* vmalert: remove dependency on datasource pkg from config

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-22 10:44:55 +02:00
Aliaksandr Valialkin
89890eab5d docs/vmgateway: typo fix in regexp filter: ~= is substituted with =~ 2022-07-21 22:37:12 +03:00
Aliaksandr Valialkin
2db4e79a03 docs/BestPractices.md: do not mention NFS
Though VictoriaMetrics works OK on NFS-like filesystems, it is not a best practice
2022-07-21 21:29:50 +03:00
Aliaksandr Valialkin
2a78975447 vendor: make vendor-update 2022-07-21 21:10:25 +03:00
Aliaksandr Valialkin
79d967d35a app/vmselect/vmui: make vmui-update after edecd2493c 2022-07-21 20:59:52 +03:00
Aliaksandr Valialkin
f2326f953b docs/CHANGELOG.md: document bugfix for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2874
This is a follow-up for edecd2493c
2022-07-21 20:57:04 +03:00
Aliaksandr Valialkin
2095421905 docs/CHANGELOG.md: follow-up after 88edb3f6cf 2022-07-21 20:53:06 +03:00
Aliaksandr Valialkin
67bca5a6f6 app/vmalert/utils: add missing docs to WithHeaders func added at 70a822f3a0 2022-07-21 20:40:14 +03:00
Aliaksandr Valialkin
d19a368aff docs/Cluster-VictoriaMetrics.md: update after fe68bb3ba7 2022-07-21 20:36:28 +03:00
Aliaksandr Valialkin
5ced032d66 all: follow-up after 46f803fa7a
Add -pushmetrics.* command-line flags to all the VictoriaMetrics apps
2022-07-21 20:36:27 +03:00
Aliaksandr Valialkin
4ce5875fa8 all: add ability to push internal metrics to remote storage system specified via -pushmetrics.url 2022-07-21 20:36:27 +03:00
Roman Khavronenko
88edb3f6cf vmalert: allow configuring custom headers per group (#2901)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2860

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-21 15:59:55 +02:00
Roman Khavronenko
70a822f3a0 vmalert: allow configuring custom headers for URLs (#2897)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2860

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-21 13:57:53 +02:00
dependabot[bot]
579cc4e122 build(deps): bump terser in /app/vmui/packages/vmui (#2895)
Bumps [terser](https://github.com/terser/terser) from 5.13.1 to 5.14.2.
- [Release notes](https://github.com/terser/terser/releases)
- [Changelog](https://github.com/terser/terser/blob/master/CHANGELOG.md)
- [Commits](https://github.com/terser/terser/commits)

---
updated-dependencies:
- dependency-name: terser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-21 11:45:39 +02:00
Yury Molodov
edecd2493c fix: change the z-index of the datepicker (#2891) 2022-07-21 07:46:28 +02:00
Roman Khavronenko
9ccf695d57 vmselect: return correct error for second part of expression (#2893)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-20 16:44:28 +02:00
Aliaksandr Valialkin
6a079df33f deployment/docker: update alpine base image from 3.16.0 to 3.16.1
See https://alpinelinux.org/posts/Alpine-3.16.1-released.html
2022-07-19 19:50:54 +03:00
Aliaksandr Valialkin
22fc7e0e04 docs/vmauth.md: mention that multiple recrods for the same username are supported
This is a follow-up for 88029c521c
2022-07-19 19:42:42 +03:00
Nikolay
88029c521c app/vmauth: allow duplicate usernames (#2888)
Usernames could be duplicate if it has uniq password.
vmauth makes routing based on auth token and username + password combination must be unique for this case.
2022-07-19 19:33:17 +03:00
Denys Holius
8e79d16dc9 Update golangci version to latest v1.47.1 (#2890)
See https://github.com/golangci/golangci-lint/releases/tag/v1.47.1
2022-07-19 19:30:39 +03:00
Denys Holius
58b64246e2 docs: faq update (#2889)
* docs: faq update

add links for OpenBSD ports & pre built binaries

* Update docs/FAQ.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-07-19 17:49:18 +03:00
Aliaksandr Valialkin
cc7d499bbd app/vmselect/promql: execute q1 and q2 from q1 op q2 in parallel if labels pushdown cannot be applied
This should improve query performance if VictoriaMetrics has enough resources for processing `q1` and `q2` in parallel.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2886
2022-07-19 14:27:48 +03:00
Aliaksandr Valialkin
daa0b604f9 docs/Articles.md: add a link to "How do We Keep Metrics for a Long Time in VictoriaMetrics" talk 2022-07-19 13:00:38 +03:00
Aliaksandr Valialkin
0fd86e2364 lib/promscrape: reload all the scrape configs when the global section is changed inside -promscrape.config
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2884
2022-07-18 17:15:07 +03:00
Roman Khavronenko
27f1c65074 vmagent: expose metric vmagent_remotewrite_queues (#2871)
The new metric `vmagent_remotewrite_queues` exports a static value of
number of configured remote write queus. This metric is useful to
calculate total saturation per each configured URL with given number
of queues. See corresponding changes to vmagent alerts and dashboard.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-18 14:31:35 +03:00
Aliaksandr Valialkin
bf59511c96 deployment/docker: update Grafana from v9.0.2 to v9.0.3
See https://grafana.com/blog/2022/07/14/grafana-v9-0-3-8-5-9-8-4-10-and-8-3-10-released-with-high-severity-security-fix/
2022-07-18 14:26:25 +03:00
Aliaksandr Valialkin
b5da47bfaf app/vmselect/promql: properly return q1 series from q1 ifnot q2 when q2 returns nothing 2022-07-18 14:24:54 +03:00
Aliaksandr Valialkin
0792c4ca90 app/vmselect/promql/transform.go: reuse evalNumber() function for constructing timezone_offset() results 2022-07-18 14:24:53 +03:00
Aliaksandr Valialkin
a4424174bb docs/CHANGELOG.md: document 2f9668eba5 2022-07-18 12:36:53 +03:00
Boris Petersen
2f9668eba5 fix assume role when running in ECS. (#2876)
This fixes #2875

Signed-off-by: Boris Petersen <boris.petersen@idealo.de>
2022-07-18 12:33:52 +03:00
Aliaksandr Valialkin
814bb1685f all: fix other typos in the same way as 6f4d9b2a48 does 2022-07-18 12:08:15 +03:00
cui fliter
6f4d9b2a48 fix some typos (#2882)
Signed-off-by: cui fliter <imcusg@gmail.com>
2022-07-18 12:02:51 +03:00
Aliaksandr Valialkin
429369f028 docs/CHANGELOG.md: mention that v1.79.0 changes binary names to $(APP_NAME)-$(GOOS)-$(GOARCH)... naming 2022-07-18 11:59:45 +03:00
zhenyuxie
f3ea7823f3 fix inmemoryBlock's Less method (#2881) 2022-07-18 11:56:17 +03:00
Aliaksandr Valialkin
87cdb58bc3 vendor: make vendor-update 2022-07-18 11:13:54 +03:00
Aliaksandr Valialkin
41d0502d99 docs/Single-server-VictoriaMetrics.md: add a link on docs how to send signals to processes in Linux 2022-07-18 10:58:43 +03:00
Aliaksandr Valialkin
15b421435a docs/Troubleshooting.md: clarify some texts there 2022-07-15 12:23:11 +03:00
Aliaksandr Valialkin
f9f73c0255 docs/Troubleshooting.md: mention about -search.setLookbackToStep command-line flag in the section explaining why /api/v1/query_range returns calculated data 2022-07-15 11:54:51 +03:00
Aliaksandr Valialkin
12ec8a7ae7 docs/Single-server-VictoriaMetrics.md: mention that the delete API doesnt delete the associated entries from inverted index 2022-07-14 21:28:58 +03:00
Aliaksandr Valialkin
d3116d9862 docs/CHANGELOG.md: cut v1.79.0 2022-07-14 16:15:24 +03:00
Roman Khavronenko
f07bfcf0c9 vmalert: drop support of deprecated extra_filter_labels param (#2870)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-14 15:45:08 +03:00
Aliaksandr Valialkin
cb79c3e765 docs/CHANGELOG.md: formatting fixes for update notes 2022-07-14 12:47:19 +03:00
Aliaksandr Valialkin
f9500abfe0 app: fix make publish-* after ed93330e66
Add missing `-linux` substring to built binary names for copying into Docker images
2022-07-14 10:59:11 +03:00
Aliaksandr Valialkin
9c435d7a9d docs: update descriptions for command-line flags according to the latest changes 2022-07-14 00:57:07 +03:00
Yury Molodov
17c33132df vmui: optimize table view (#2867)
* feat: optimize table view

* fix: add column display setting

* app/vmselect: `make vmui-update`

Also document the change at docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-07-14 00:15:43 +03:00
Dmytro Kozlov
a0d0ba7219 vmui: update time picker behavior (#2847)
* vmui: update time picker behavior

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-07-14 00:05:47 +03:00
Yury Molodov
e4efebf4a4 vmui: change selection from autocomplete (#2862)
* fix: change selection from autocomplete

* update docs/CHANELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-07-13 23:54:56 +03:00
Nikolay
7301aa678c lib/promscrape: adds azure service discovery (#2743)
* lib/promscrape: adds azure service discovery
Adds azure service discovery mechanism
implements authorization with oauth and msi
lists virtual machines and virtual machines managed by scaleSet

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1364

* makes linter happy

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* wip

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-07-13 23:43:18 +03:00
Aliaksandr Valialkin
2d9dbaf75d deployment/docker: update Go builder from go1.18.3 to go1.18.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.18.4+label%3ACherryPickApproved
2022-07-13 18:46:34 +03:00
Aliaksandr Valialkin
ed93330e66 all: follow-up for d99ba3481b 2022-07-13 16:44:39 +03:00
Aliaksandr Valialkin
5f7b6bedce vendor: make vendor-update 2022-07-13 16:43:53 +03:00
Dmytro Kozlov
d99ba3481b Rename release packages (#2810)
* makefile: add os to each release file

* makefile: update vmutils arm64

* makefile: update victoria-metrics release process

* makefile: update publish with os

* makefile: update publish with os

* makefile: change tar library

* update release logic

* copy all releases

* sort command by GOOS

* rollback commands

* rollback OSARCH

* fix commands

* cleanup

* fix windows build

* sort build by GOOS, update README.md
2022-07-13 15:42:48 +03:00
Boris Petersen
41e9702698 fix typo introduced in pr #2604 (#2866)
Signed-off-by: Boris Petersen <boris.petersen@idealo.de>
2022-07-13 15:40:47 +03:00
Aliaksandr Valialkin
765278243b docs/CHANGELOG.md: document 91faa152a5 2022-07-13 12:40:35 +03:00
guidao
91faa152a5 add next retention metric (#2863)
Co-authored-by: wangfeng <wangfeng@zhihu.com>
2022-07-13 12:37:04 +03:00
Aliaksandr Valialkin
29e53b9f55 app/vmselect/promql: consistency update after 93fbd0c54b 2022-07-13 12:33:14 +03:00
Dmytro Kozlov
306ec10c39 lib/mergeset: fix linter error (#2864) 2022-07-13 12:31:35 +03:00
Roman Khavronenko
93fbd0c54b promql: return step as scrapeInterval when it can't be calculated (#2865)
The change allows to specify default value for `getScrapeInterval`
function when actual interval can't be calculated.

Before the change, function were returning `maxSilenceInterval` (5m)
in such cases, which may be not correct for instant queries processing.
The specific scenario where using `maxSilenceInterval` caused issues
is the following:
1. Series becomes stale;
2. Client (in this case vmalert) continues to request series every 15s;
3. Database returns empty results as expected;
4. But at some specific moment of time database returns datapoints from `now()-5m`,
because lookback window was extended to `maxSilenceInterval`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-13 12:27:38 +03:00
Aliaksandr Valialkin
f7eda4a73c docs/CHANGELOG.md: mention that the communication protocol between vmselect and vmstorage nodes is updated in the new release 2022-07-13 12:16:00 +03:00
Denys Holius
37a0f5705e deployment/docker/docker-compose.yml: update Grafana from v8.5.1 to v9.0.2 (#2858)
See https://grafana.com/blog/2022/06/14/grafana-9.0-release-oss-and-cloud-features
2022-07-12 20:03:02 +03:00
Aliaksandr Valialkin
8a6fb5ef2b deployment/docker/alerts.yml: backport a42063909f 2022-07-12 19:53:06 +03:00
Aliaksandr Valialkin
c2197ad139 app/vmselect/promql: validate function name before evaluating its arguments
This avoids unneeded evaluation of args for unknown functions
2022-07-12 19:48:26 +03:00
Aliaksandr Valialkin
17b5ac1608 lib/mergeset: optimize merge speed a bit
Use heap.Fix instead of heap.Pop + heap.Push when merging blocks
2022-07-12 12:50:26 +03:00
Aliaksandr Valialkin
159c2e15e3 app/vmselect/netstorage: optimize mergeSortBlocks() for the worst case when blocks contain interleaved samples 2022-07-12 12:31:38 +03:00
Aliaksandr Valialkin
8429d4af5a app/vmselect/netstorage: add mergeSortBlocks benchmark for the worstcase 2022-07-12 12:31:36 +03:00
Aliaksandr Valialkin
076799ae29 docs: add links to https://docs.victoriametrics.com/CHANGELOG.html in relevant docs 2022-07-11 21:23:19 +03:00
Aliaksandr Valialkin
80c084df02 docs/Articles.md: add a link to https://habr.com/ru/company/sravni/blog/672908/ 2022-07-11 21:11:32 +03:00
Aliaksandr Valialkin
cad471037a app/vmselect/prometheus: follow-up after 3efe33b917 2022-07-11 20:35:28 +03:00
Dmytro Kozlov
3efe33b917 vmselect/prometeus: Add limit param to api/v1/series api endpoint (#2851)
* issue-2841: Add limit param to api/v1/series api endpoint

* issue-2841: add change log

* issue-2841: update logic

* issue-2841: simplify logic

* issue-2841: simplify logic, add information to documentation
2022-07-11 20:18:30 +03:00
Aliaksandr Valialkin
ce68e76d62 app/vmselect: follow-up after 8667307d73 2022-07-11 20:14:34 +03:00
Roman Khavronenko
8667307d73 vmselect: cover special cases for vmalert's routing in single-node version (#2845)
* vmselect: cover special cases for vmalert's routing in single-node version

* remove trailing `/` from requests
* redirect to vmalert's home page when `/vmalert` is requested.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: fix review comments

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update app/vmselect/main.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-07-11 19:52:22 +03:00
Aliaksandr Valialkin
5c8eee26bf all: make fmt via the upcoming Go1.19 2022-07-11 19:22:15 +03:00
Aliaksandr Valialkin
8851cf68e1 docs: make more clear the relation between replication and deduplication
This is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2852
2022-07-11 19:22:15 +03:00
Aliaksandr Valialkin
4b67231097 vendor: make vendor-update 2022-07-11 18:13:38 +03:00
Aliaksandr Valialkin
cd09f583fe app/vmselect/netstorage: add benchmarks for mergeSortBlocks
This is a follow-up for 743ff84863
2022-07-11 12:54:48 +03:00
Aliaksandr Valialkin
743ff84863 app/vmselect/netstorage: optimize mergeSortBlocks function
- Use binary search instead of linear scan when locating the run of smallest timestamps
  in blocks with intersected time ranges. This should improve performance
  when merging blocks with big number of samples

- Skip samples with duplicate timestamps. This should increase query performance
  in cluster version of VictoriaMetrics with the enabled replication.
2022-07-09 00:34:42 +03:00
Roman Khavronenko
e1a41cfab5 metricsql: properly evaluate timezone_offset over time interval (#2842)
* metricsql: properly evaluate `timezone_offset` over time interval

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2771
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-07-08 14:03:56 +03:00
Aliaksandr Valialkin
338fd115d9 app/vmalert/utils/links.go: document Prefix function, which has been added in b29fafa86b 2022-07-08 13:27:15 +03:00
Aliaksandr Valialkin
2e9ae40d56 app/vmselect/vmui: follow-up after 0bf6841140
* Document the bugfix in docs/CHANGELOG.md
* Run `make vmui-update` for updating static js files for vmui, which are included into vmselect
2022-07-08 13:14:17 +03:00
Dmytro Kozlov
0bf6841140 vmui: fix query for json and table tabs (#2846)
* vmui: fix query for json and table

* vmui: add step param
2022-07-08 13:09:31 +03:00
Aliaksandr Valialkin
126e32f79a docs/CHANGELOG.md: clarifications after e9b977859b 2022-07-08 13:02:41 +03:00
Aliaksandr Valialkin
08db70fa3e docs/CHANGELOG.md: recommend clearing caches after the upgrade from v1.78.0 to v1.78.1 2022-07-08 12:49:42 +03:00
Roman Khavronenko
e9b977859b vmalert: deprecate alert's status link (#2840)
* vmalert: deprecate alert's status link

Deprecate alert's status link `/api/v1/<groupID>/<alertID>/status` in favour of
`api/v1/alerts?group_id=<group_id>&alert_id=<alert_id>"`.

The change was needed for simplifying logic in vmselect for proxying vmalert's requests.

The old alert's status link will be still supported for a few versions but will be removed in the future.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2825
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: fix review comments

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-08 10:26:13 +02:00
Aliaksandr Valialkin
0713da9e7a docs/CHANGELOG.md: link to the issue related to multi-level vmselect 2022-07-08 05:23:18 +03:00
Aliaksandr Valialkin
b206bd0aee docs/CHANGELOG.md: recommend upgrading from v1.78.0 to v1.78.1 because of the bug in v1.78.0 2022-07-08 03:44:26 +03:00
Aliaksandr Valialkin
138ae99602 docs/CHANGELOG.md: cut v1.78.1 2022-07-08 01:02:42 +03:00
Aliaksandr Valialkin
d3711e66fd docs/vmagent.md: clarify that vmagent supports Prometheus-compatible service discovery 2022-07-07 21:21:28 +03:00
Aliaksandr Valialkin
ec4d39e893 docs/vmagent.md: typo fixes after ef2eeeb642 2022-07-07 21:16:37 +03:00
Aliaksandr Valialkin
9cb8838b30 vendor: make vendor-update 2022-07-07 20:47:17 +03:00
Aliaksandr Valialkin
ef2eeeb642 docs/vmagent.md: actualize docs 2022-07-07 20:35:47 +03:00
Aliaksandr Valialkin
0f0525c208 docs/Single-server-VictoriaMetrics.md: actualize vmui docs according to recent changes 2022-07-07 20:35:12 +03:00
Aliaksandr Valialkin
1f1be61b78 docs/CHANGELOG.md: typo fixes 2022-07-07 20:34:08 +03:00
Aliaksandr Valialkin
20df81f1aa docs: sync after recent changes 2022-07-07 02:44:17 +03:00
Aliaksandr Valialkin
1828665a64 docs/CHANGELOG.md: link another bugreport related to the bug with per-day inverted index in v1.78.0 2022-07-07 02:34:48 +03:00
Aliaksandr Valialkin
f97355d9fb lib/promscrape: properly set Host header when sending requests via http proxy
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2794
2022-07-07 02:27:52 +03:00
Aliaksandr Valialkin
893ca6f87e docs/CHANGELOG.md: link to the issue about incorrect per-day index handling in v1.78.0 2022-07-07 02:00:07 +03:00
Aliaksandr Valialkin
10cb67adb5 app/{vmagent,vminsert}: follow-up after d19e46de55
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2839
2022-07-07 01:30:58 +03:00
Pedro Gonçalves
d19e46de55 app/vminsert: allow to ingest datadog metrics with simpler tags - not enforcing key:value (#2839) 2022-07-07 01:18:00 +03:00
Roman Khavronenko
7eb519b92e docs: mention deduplication issue for HA vmalert topology (#2838)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-07 01:13:05 +03:00
Aliaksandr Valialkin
01f55bc66b lib/promscrape/discovery/kubernetes: properly populate service-level labels for role: endpointslice targets
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2823
2022-07-07 00:32:26 +03:00
Aliaksandr Valialkin
b186b63e07 lib/promscrape/discovery/kubernetes: allow attaching node-level labels to role: endpoints and role: endpointlice targets in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/10759
2022-07-06 23:18:59 +03:00
Aliaksandr Valialkin
a6bd442ef9 app/vmui: tune visual presentation of trace view 2022-07-06 14:07:12 +03:00
Aliaksandr Valialkin
ef62da3750 app/vmselect: add -clusternative.tls* options for mTLS setup in multi-level clusters 2022-07-06 13:49:07 +03:00
Aliaksandr Valialkin
bd5b20445e app/vmselect: add ability to query vmselect from another vmselect 2022-07-06 13:30:12 +03:00
Aliaksandr Valialkin
e6ba2af7a1 lib/promscrape: fix a test after c66f676f3b 2022-07-06 13:26:35 +03:00
Aliaksandr Valialkin
c030d920dd docs/Cluster-VictoriaMetrics.md: sync with cluster branch after f51bc07d97 2022-07-06 13:00:34 +03:00
Aliaksandr Valialkin
17de8a41c2 all: follow-up after ed89106274 2022-07-06 12:44:46 +03:00
Aliaksandr Valialkin
c66f676f3b lib/promscrape: push scrape_samples_limit metric to remote storage if sample_limit option is set in scrape_config for this target
See https://github.com/VictoriaMetrics/operator/issues/497
2022-07-06 12:37:55 +03:00
Aliaksandr Valialkin
77cbbacfdb lib/vmselectapi: pass storage.SearchQuery to API calls instead of []*storage.TagFilters + storage.TimeRange + maxMetrics
This reduces the number of args to vmselectapi calls
2022-07-06 12:37:54 +03:00
Aliaksandr Valialkin
f435924ab3 lib/vmselectapi: pass maxSuffixes arg to tagValueSuffixes RPC call 2022-07-06 12:37:54 +03:00
Aliaksandr Valialkin
e1b8059086 lib/vmselectapi: rename deleteMetrics to more correct deleteSeries 2022-07-06 12:37:54 +03:00
Aliaksandr Valialkin
a60e03b3a7 lib/vmselectapi: use string type for tagKey and tagValuePrefix args at TagValueSuffixes()
This improves the API consistency
2022-07-06 12:37:53 +03:00
Roman Khavronenko
ed89106274 vmselect: allow proxying requests to vmalert from single-node (#2834)
The change allows to proxy requests with prefix `/vmalert`
to the vmalert component if `-vmalert.proxyURL` is set.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2825
and https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2831

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-06 10:47:26 +02:00
Roman Khavronenko
84e7c517d3 vmalert: make UI and assets links relative (#2831)
* make all links in vmalert relative, so links continue to work even if vmalert sits behind the proxy;
* update vmalert's routing to always have component-unique path prefix, e.g. /vmalert;

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2825

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-06 10:46:01 +02:00
Aliaksandr Valialkin
edc76286ac lib/storage: put the (date, metricID) entry in dateMetricIDCache just after the corresponding series is registered in the per-day inverted index
Previously the time series could be put into dateMetricIDCache without
registering in the per-day inverted index if GetOrCreateTSIDByName
finds TSID entry in the global index. This could lead to missing
series in query results.

The issue has been introduced in the commit 55e7afae3a,
which has been included in VictoriaMetrics v1.78.0
2022-07-05 14:54:03 +03:00
Aliaksandr Valialkin
ae80cf76e0 app/vmselect: make fmt after f3ece83e67 2022-07-05 14:35:24 +03:00
Aliaksandr Valialkin
f3ece83e67 app/vmselect/promql: properly calculate histogram_quantile over unexpected le buckets
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2819
2022-07-05 13:19:24 +03:00
Artem Navoiev
9c763490b7 Docs: Operator Additional Scrape Configuration - update docs (#2826)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-07-05 11:06:23 +02:00
Roman Khavronenko
3960fecac2 dashboards: small visual tweaks for vmagent's dashboard (#2828)
* remove lines filling
* filter series with zero values
* update descriptions

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-05 11:05:35 +02:00
Aliaksandr Valialkin
855436efd2 lib/promauth: refactor NewConfig in order to improve maintainability
1. Split NewConfig into smaller functions
2. Introduce Options struct for simplifying construction of the Config with various options

This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2684
2022-07-04 14:31:12 +03:00
Aliaksandr Valialkin
611434ce81 vendor: make vendor-update 2022-07-04 12:00:27 +03:00
Aliaksandr Valialkin
17dc3dbd72 docs/Troubleshooting.md: add a link to Monitoring chapter added at 41d1834a99 2022-07-04 11:56:11 +03:00
Roman Khavronenko
41d1834a99 docs: update Troubleshooting guide (#2809)
Follow-up after a4d9388ecb

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-04 11:54:40 +03:00
Roman Khavronenko
c7acf36e39 docs: warn about potential issue with read queries for 1.78.0 (#2818)
* docs: warn about potential issue with read queries for 1.78.0

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs: warn about potential issue with read queries for 1.78.0

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-04 11:03:27 +03:00
Dmytro Kozlov
9f6cfea31d vmui: disable ripple on nested component, enable copy text on button (#2808) 2022-07-01 09:52:45 +02:00
Aliaksandr Valialkin
234901b36c docs/Cluster-VictoriaMetrics.md: mention about -storage.maxDailySeries and -storage.maxHourlySeries options in resource usage limits chapter 2022-06-30 23:17:50 +03:00
Aliaksandr Valialkin
84e373e5c7 app/vmselect/promql: properly handle partial counter resets in rate(), irate(), increase() and remove_resets() functions 2022-06-30 22:39:38 +03:00
Aliaksandr Valialkin
2a877a2a3c app/vmagent/remotewrite: do not shadow headers global variable in getAuthConfig 2022-06-30 20:18:12 +03:00
Aliaksandr Valialkin
fcc4258404 app/vmagent/remotewrite: clarify descriptions for -remoteWrite.* options, which must be set per each -remoteWrite.url 2022-06-30 20:18:11 +03:00
Aliaksandr Valialkin
c392d6d173 app/vmagent/remotewrite: add -remoteWrite.header command-line flag for setting additional http headers to send to -remoteWrite.url
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2805
2022-06-30 20:00:23 +03:00
Aliaksandr Valialkin
e40b40afe6 Revert "lib/promscrape, vmagent: fix path to files (#2801)"
This reverts commit 0a8e35835c.

Reason for revert: it incorrectly fixes the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2799

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2799#issuecomment-1171392005
2022-06-30 18:23:56 +03:00
Aliaksandr Valialkin
3e2dd85f7d all: readability improvements for query traces
- show dates in human-readable format, e.g. 2022-05-07, instead of a numeric value
- limit the maximum length of queries and filters shown in trace messages
2022-06-30 18:20:33 +03:00
Aliaksandr Valialkin
32ac6b5ed8 vendor: update github.com/VictoriaMetrics/metricsql from v0.44.0 to v0.44.1 2022-06-30 18:20:33 +03:00
Dmytro Kozlov
0a8e35835c lib/promscrape, vmagent: fix path to files (#2801)
vmagent: respect `-pathPrefix` flag for static files and links
2022-06-30 16:22:54 +02:00
Dmytro Kozlov
4d9715f5a8 vmui: update render logic for nested component (#2795)
* vmui: update render logic for nested component, avoid rerender, remove local storage usage for tracing flag

* docs/url-examples.md: fix various documentation issues there

* docs: add Troubleshooting doc

This doc contains troubleshooting guides for typical problems with VictoriaMetrics.

* docs/Troubleshooting.md: add troubleshooting guide for cluster instability

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-30 15:47:12 +03:00
Aliaksandr Valialkin
ee42f18dcb docs/Troubleshooting.md: refer to subqueries doc, since this is the most frequent source of improperly consutructed queries 2022-06-30 15:09:35 +03:00
Aliaksandr Valialkin
921ff3f49d docs/Troubleshooting.md: refer to capacity planning docs 2022-06-30 14:59:52 +03:00
Aliaksandr Valialkin
ca263371a6 docs/Troubleshooting.md: various typo fixes and clarifications 2022-06-30 14:56:35 +03:00
Aliaksandr Valialkin
bcb1175162 docs/Troubleshooting.md: formatting fixes 2022-06-30 14:38:42 +03:00
Aliaksandr Valialkin
a4d9388ecb docs/Troubleshooting.md: add troubleshooting guide for cluster instability 2022-06-30 14:35:39 +03:00
Aliaksandr Valialkin
836c19f7ba Revert "follow-up after bdf9f4669a (#2803)"
This reverts commit ec5d3253ff, because there was a similar commit already - 119dc333e1
2022-06-30 13:54:33 +03:00
Aliaksandr Valialkin
119dc333e1 docs/CHANGELOG.md: document bdf9f4669a 2022-06-30 13:53:59 +03:00
Aliaksandr Valialkin
56622bff73 docs: add Troubleshooting doc
This doc contains troubleshooting guides for typical problems with VictoriaMetrics.
2022-06-30 13:53:59 +03:00
Roman Khavronenko
ec5d3253ff follow-up after bdf9f4669a (#2803)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-30 12:46:57 +02:00
ttyv
bdf9f4669a lib/promscrape: fix vmagent tickerCh reload behaviour (#2786)
Co-authored-by: Dmitriy <dab@ttyv.ru>
2022-06-30 12:33:01 +02:00
Aliaksandr Valialkin
32ddc90ec1 docs/url-examples.md: fix various documentation issues there 2022-06-29 11:57:54 +03:00
Aliaksandr Valialkin
a14188dd8e app/vmselect: expose additional histograms at /metrics page, which may help get more insights for the query workload
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2792
2022-06-28 20:18:13 +03:00
Aliaksandr Valialkin
a43f2d0bc5 app/vmselect/promql: show the number of scanned samples in the query trace 2022-06-28 19:26:17 +03:00
Aliaksandr Valialkin
a5181703b1 app/vmselect/prometheus: reduce the default value for -search.maxSeries from 100k to 30k
Production experience shows that 100k is too big for /api/v1/series .
It leads to increased CPU usage when Grafana queries /api/v1/series over VictoriaMetrics
with big number of time series during auto-completion and when modifying template variables.
2022-06-28 18:22:30 +03:00
Aliaksandr Valialkin
a350d1e81c lib/storage: return marshaled metric names from SearchMetricNames
Previously SearchMetricNames was returning unmarshaled metric names.
This wasn't great for vmstorage, which should spend additional CPU time
for marshaling the metric names before sending them to vmselect.

While at it, remove possible duplicate metric names, which could occur when
multiple samples for new time series are ingested via concurrent requests.

Also sort the metric names before returning them to the client.
This simplifies debugging of the returned metric names across repeated requests to /api/v1/series
2022-06-28 18:17:15 +03:00
Aliaksandr Valialkin
eefa1e24f8 vendor: make vendor-update 2022-06-28 14:51:45 +03:00
Aliaksandr Valialkin
2c836bd398 lib/storage: put into query trace the number of found entries in SearchMetricNames 2022-06-28 14:50:53 +03:00
Aliaksandr Valialkin
e578549b8a app/vmselect: optimize /api/v1/series a bit for time ranges smaller than one day 2022-06-28 13:02:47 +03:00
Aliaksandr Valialkin
bffd72e9a9 docs/Single-server-VictoriaMetrics.md: mention about -search.maxTagValueSuffixesPerSearch command-line flag in resource limits docs 2022-06-27 14:03:54 +03:00
Aliaksandr Valialkin
741dd47273 docs/CHANGELOG.md: document 45f20ad1aa 2022-06-27 13:52:59 +03:00
Aliaksandr Valialkin
a963b2a0aa all: show timeRange in traces in human-readable format instead of timestamps in milliseconds 2022-06-27 13:45:51 +03:00
Aliaksandr Valialkin
d502426d7c app/vmalert: load static js and css from proper paths if -http.pathPrefix command-line flag is set
This is a follow-up for b104f67beb
2022-06-27 13:45:51 +03:00
Aliaksandr Valialkin
ba514284f1 lib/storage: add querytracer to more contexts
querytracer has been added to the following storage.Storage methods:
- RegisterMetricNames
- DeleteMetrics
- SearchTagValueSuffixes
- SearchGraphitePaths
2022-06-27 13:45:51 +03:00
Aliaksandr Valialkin
134751e43e all: locate throttled loggers via logger.WithThrottler() only once and then use them
This reduces the contention on logThrottlerRegistryMu mutex when logger.WithThrottler()
is called frequently from concurrent goroutines.
2022-06-27 13:45:50 +03:00
Roman Khavronenko
45f20ad1aa vmalert: make __name__ available for templating in alerts (#2783)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-27 09:57:56 +02:00
Aliaksandr Valialkin
9a314106ca app/vmselect/netstorage: remove Get prefix from netstorage functions
This makes these function names more consistent with the server side
2022-06-27 00:45:05 +03:00
Roman Khavronenko
b104f67beb vmalert: use absolute path for assets (#2784)
Using relative path breaks assets loading on alert view page.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-24 20:06:12 +02:00
Aliaksandr Valialkin
94445e8bd1 docs/CHANGELOG.md: update after e40d015e9a 2022-06-24 18:04:55 +03:00
Aliaksandr Valialkin
d2bbbf147c all: limit the maximum memory usage for regexp cache, which stores parsed regular expressions in MetricsQL queries
Previously the cache could store 10K unique regexps. When every regexp is huge (e.g. hundreds of kilobytes),
then the total cache size could grow to multiples of gigabytes. Now the cache size is limited by the total length
of all cached regexps. So huge regexps won't result in high memory usage for the cache.
2022-06-24 17:57:43 +03:00
Dmytro Kozlov
bb7f31541f vmui: added query tracing (#2748)
* vmui: added query tracing

* vmui: updated ui

* vmui: update tracing logic, fix bugs, disable tracing by default

* vmui: use empty message as props

* vmui: fixed ui, added delete for each tacing data, show query in header

* vmui: added timelines

* vmui: speedup render

* vmui: use memo for sorting

* vmui: use Trace model, remove unused functions, simplify part of code

* vmui: update recursive logic

* vmui: fix set query to header

* vmui: code cleanup, remove unused code

* vmui: remove unused type, rename component

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-23 22:59:20 +03:00
Nikolay
7f1c73bdaf app/vmselect: fixes partial response with replicationFactor (#2777)
* app/vmselect: fixes partial response with replicationFactor
Allow partial response if it meets replicationFactor configured at vmselect
https://t.me/VictoriaMetrics_ru1/38490

* docs/CHANGELOG.md: document this change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-23 20:19:35 +03:00
Yurii Kravets
86e80428d5 docs: Update CHANGELOG Update notes (#2776)
* docs: Update CHANGELOG Update notes

Specified the reason why `vmselect` and `vmstorage` nodes may log communication errors.
2022-06-23 15:46:51 +02:00
Aliaksandr Valialkin
52eadb729e lib/promscrape: always send stale markers with the real scrape timestamp
This guarantees that query won't return data just after the series is disappeared.
2022-06-23 11:34:18 +03:00
Denys Holius
668d67a3d3 Adds a list of supported architectures (#2769)
* add list of supported architectures

* Update docs/BestPractices.md

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2022-06-22 21:56:57 +03:00
Aliaksandr Valialkin
1c4f67c5d2 lib/promauth: add ability to send additional http headers in requests to scrape targets
This solves https://stackoverflow.com/questions/66032498/prometheus-scrape-metric-with-custom-header
2022-06-22 20:39:43 +03:00
Aliaksandr Valialkin
51362f9333 app/vmselect: add -search.setLookbackToStep command-line flag for making the gap filling algorithm similar to InfluxDB data model
This option should override `-search.maxStalenessInterval` for most cases when users migrate from InfluxDB to VictoriaMetrics
2022-06-22 14:19:30 +03:00
Aliaksandr Valialkin
6a1e0692f6 docs/Cluster-VictoriaMetrics.md: small fixes 2022-06-22 13:42:43 +03:00
Aliaksandr Valialkin
7bf75c7e61 app/vmselect: typo fix in the exported metric name: vm_http_request_total -> vm_http_requests_total 2022-06-22 13:15:31 +03:00
Roman Khavronenko
75dd7542e5 docs: follow-up for 197d3cdd74 (#2766)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-22 08:53:54 +02:00
云原生驿站
197d3cdd74 docs: supplement vmalert downsampling docs (#2765)
Co-authored-by: 吴典秋 <muti_kube@163.com>
2022-06-22 07:43:41 +02:00
Aliaksandr Valialkin
e6ed92529b all: remove explicit "xxhash" name when importing github.com/cespare/xxhash/v2 package
This package already has the same name, so there is no need in explicit name
2022-06-21 20:23:32 +03:00
Denys Holius
f456e486b7 url-examples: added curl output after deleting metrics (#2764)
docs: add more details to url-examples for series deleting
2022-06-21 16:20:08 +02:00
Loki's Wager
ac411be904 BugFix part_header.go (#2763)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2757

Co-authored-by: haotingyi <haotingyi@corp.netease.com>
2022-06-21 15:56:41 +03:00
Aliaksandr Valialkin
f88c642464 docs: update -help output for vmbackup, vmbackupmanager, vmgateway and vmrestore components 2022-06-21 15:49:01 +03:00
Aliaksandr Valialkin
cfc99e12da docs: update docs after e4d6b750f6
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2753
2022-06-21 14:01:12 +03:00
Aliaksandr Valialkin
091408be62 docs/CHANGELOG.md: cut v1.78.0 2022-06-20 18:10:38 +03:00
Yurii Kravets
aeeaf877ac Changed the level type in alerts.yml for TooManyLogs alert (#2760)
alerts: filter out non error log messages for `TooManyLogs`

Info and Warn error levels aren't always a result of malfunctioning
or faulty state. So we filter them out.
2022-06-20 16:44:47 +02:00
Aliaksandr Valialkin
3837b50f37 lib/netutil.ConnPool: skip dialing remote address if the previous dial attempt was unsuccessful
If the previous dial attempt was unsuccessful, then all the new dial attempts are skipped
until the background goroutine determines that the given address can be successfully dialed.

This reduces query latency when some of vmstorage nodes are unavailable and dialing them is slow.

This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711

This commit is based on ideas from the https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2756

The main differences are:

- The check for healthy/unhealthy storage nodes is moved one level lower from app/vmselect/netstorage to lib/netutil.ConnPool.
  This makes possible re-using this feature everywhere lib/netutil.ConnPool is used.
- The check doesn't take into account handshake errors for already established connections.
  Handshake errors usually mean improperly configured VictoriaMetrics cluster, so they shouldn't be ignored.
2022-06-20 17:36:41 +03:00
Aliaksandr Valialkin
49586566a3 docs: follow-up after e4d6b750f6 2022-06-20 17:14:43 +03:00
Nikolay
e4d6b750f6 lib/httpserver: adds flagsAuthKey command-line flag (#2758)
* lib/httpserver: adds flagsAuthKey command-line flag
It protects /flags endpoint with authKey.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2753O

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-20 17:09:32 +03:00
Aliaksandr Valialkin
9f6a19a904 docs/Articles.md: add a link to https://www.sobyte.net/post/2022-05/victoriametrics-bloomfilter/ 2022-06-20 14:43:46 +03:00
Aliaksandr Valialkin
418f40f7fa vendor: make vendor-update 2022-06-20 14:30:23 +03:00
Aliaksandr Valialkin
81d1497b4c all: update Go builder for production builds from 1.18.2 to 1.18.3
See https://github.com/golang/go/issues?q=milestone%3AGo1.18.3+label%3ACherryPickApproved
2022-06-20 14:26:41 +03:00
Aliaksandr Valialkin
b958fc7846 lib/storage: properly take into account already registered series when -storage.maxHourlySeries or -storage.maxDailySeries limits are enabled
The commit 5fb45173ae takes into account only newly registered series
when applying cardinality limits. This means that the cardinality limit could be exceeded with already registered series.
This commit returns back accounting for already registered series when applying cardinality limits.
2022-06-20 13:47:47 +03:00
Roman Khavronenko
4b4f03fa1f docs: reference links from key concepts (#2745)
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-19 23:12:10 +03:00
Aliaksandr Valialkin
afc26c57cc all: replace bash with console blocks in all the *.md files
This is a follow-up for 954a7a6fc6
2022-06-19 23:00:39 +03:00
Artem Navoiev
954a7a6fc6 docs: replace bash code block type with console (#2746) 2022-06-19 22:57:53 +03:00
Aliaksandr Valialkin
c022c4af0a docs/CHANGELOG.md: document ef7f52e0e6 2022-06-19 22:48:39 +03:00
Aliaksandr Valialkin
55e7afae3a lib/storage: create per-day indexes together with global indexes when registering new time series
Previously the creation of per-day indexes and global indexes
for the newly registered time series was decoupled.

Now global indexes and per-day indexes for the current day are created toghether for new time series.
This should speed up registering new time series a bit.
2022-06-19 22:42:10 +03:00
Aliaksandr Valialkin
5fb45173ae lib/storage: do not register new series if -storage.maxHourlySeries or -storage.maxDailySeries limits are exceeded
Previously samples for new series weren't added as expected when series limits were reached,
but new series were still registered in indexdb.
2022-06-19 22:42:09 +03:00
Aliaksandr Valialkin
62e2371a67 lib/storage: reset metric id caches for the previous and the current hour
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698
2022-06-19 22:42:09 +03:00
Roman Khavronenko
ef7f52e0e6 Vmalert notifiers (#2744)
* vmalert: remove head of line blocking for sending alerts

This change makes sending alerts to notifiers concurrent instead
of sequential. This eliminates head of line blocking, where first
faulty notifier address prevents the rest of notifiers from
receiving notifications.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: make default timeout for sending alerts 10s

Previous value of 1m was too high and was inconsistent
with default timeout defined for notifiers via
configuration file.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: linter checks fix

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-18 09:11:37 +02:00
Aliaksandr Valialkin
00831e0ee5 docs: update docs on how to add tags to metrics collected by DataDog agent
Follow-up for f16072c3c1
2022-06-17 13:11:43 +03:00
Dmytro Kozlov
10454d1735 vmui: added focusLabel, enable cardinality app configuratior (#2736)
* vmui: added focusLabel, enable app configuratior

* vmui: set focusLabel if {labelName!=""}

* wip

* docs/CHANGELOG.md: mention about focusLabel feature in cardinality explorer

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2730

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-17 13:03:02 +03:00
Dmytro Kozlov
f16072c3c1 doc: added workaround for datadog agent (#2712)
* Added workaround for datadog agent

* docs: update datadog workaround

* doc: update doc description

* Apply suggestions from code review

* docs: `make docs-sync`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-16 22:22:55 +03:00
Roman Khavronenko
b875628ae6 docs: mention sandbox update in release procedure (#2724)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-16 20:48:01 +03:00
Roman Khavronenko
723d90536c vmselect: limit end param max value by 2d in future (#2729)
* vmselect: limit `end` param max value by 2d in future

The change is applied only to service handlers like `/labels` or `/series`
and limits the `end` param by max value <= now() + 2 days. The same limit
is applied for the ingested data, so no reason to allow to request data
in future far than that.

The change is also needed for corner cases like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2669
where too high `end` value triggers inefficient global index search.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs/CHANGELOG.md: document the bugfix

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-16 20:46:31 +03:00
Aliaksandr Valialkin
eabd2e2320 docs/vmagent.md: typo fix: configued -> configured 2022-06-16 20:30:30 +03:00
Aliaksandr Valialkin
c18f8cccfa lib/promrelabel: support action: graphite relabeling
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2737
2022-06-16 20:24:22 +03:00
Roman Khavronenko
45c1e27937 docs: add multiple-remote-writes topology to vmalert (#2738)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-16 13:30:28 +02:00
Aliaksandr Valialkin
0f889497b5 docs/CHANGELOG.md: document dd327bfa9e2c69fe21ab1d92c14636733d7c5620 2022-06-15 18:40:59 +03:00
Aliaksandr Valialkin
e8214ed4e8 docs/CHANGELOG.md: document 00719e5779a3e4eeedb74cb3d25a9ecfe0e16063 2022-06-15 18:09:04 +03:00
Aliaksandr Valialkin
ec7963208d app/vmselect: accept focusLabel query arg at /api/v1/status/tsdb
This allows filling the seriesCountByFocusLabelValue list in the /api/v1/status/tsdb response
with label values for the specified focusLabel, which contain the highest number of time series.

TODO: add this to Cardinality explorer at VMUI - https://docs.victoriametrics.com/#cardinality-explorer
2022-06-14 18:36:54 +03:00
Aliaksandr Valialkin
b6c1ca12b7 lib/storage: show top labels with the highest number of series in cardinality explorer 2022-06-14 16:32:38 +03:00
Dmytro Kozlov
af3dc91a51 vmui: refactor Cardinality panel (#2726)
* vmui: refactor Cardinality panel

* vmui: change width of the search panel

* vmui: code cleanup

* vmui: code cleanup

* vmui: fixed vulnerability (npm audit fix)

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-14 14:39:47 +03:00
Aliaksandr Valialkin
7b3c9c50a8 docs: update command-line flags' descriptions according to recent changes 2022-06-14 13:28:00 +03:00
Aliaksandr Valialkin
a75e59700f lib/storage: improve error message when -search.max* command-line flag values are exceeded 2022-06-14 13:27:59 +03:00
Roman Khavronenko
97183e4ec5 docs: reduce free disk recommendation from 30% to 20% (#2728)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-14 12:54:39 +03:00
Aliaksandr Valialkin
29b7c0b4a6 docs/guides/migrate-from-influx.md: suggest more real-world value for -search.maxStalenessInterval
Suggest `-search.maxStalenessInterval=10s` instead of `-search.maxStalenessInterval=1ms`,
since `1ms` would result in empty graphs in most cases, since the interval between data points
on the graph is usually much higher than 1ms. For example, if the graph shows time range of one hour
and it contains 1000 points, then the interval between points on the graph would equal to
3600s/1000=3.6 seconds.
2022-06-14 12:32:02 +03:00
Roman Khavronenko
aecac75ec7 docs: migrating from influx (#2720)
Mention `-search.maxStalenessInterval` flag and its effect on query engine.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-14 09:10:44 +02:00
Aliaksandr Valialkin
86da001963 docs/Single-server-VictoriaMetrics.md: recommend running all the VictoriaMetrics components behind auth proxy in Security chapter 2022-06-13 10:30:01 +03:00
Aliaksandr Valialkin
c7555ab635 vendor: make vendor-update 2022-06-13 10:02:21 +03:00
Aliaksandr Valialkin
de2be31275 docs/CHANGELOG.md: document 99dbe7f9d4 2022-06-13 10:01:48 +03:00
Wataru Manji
99dbe7f9d4 Add remote-write headers (#2701)
Co-authored-by: Wataru Manji <wataru.manji@linecorp.com>
2022-06-13 09:59:03 +03:00
Aliaksandr Valialkin
1041f395cc app/vmagent: follow-up after 4583ed23a8 2022-06-13 09:54:11 +03:00
Dmytro Kozlov
4583ed23a8 Added a stub for datadog endpoint (#2710)
* Added a stub for datadog endpoint

* Update app/vmagent/main.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-13 09:52:13 +03:00
Roman Khavronenko
cd2f0e0760 Readme cleanup (#2715)
* docs: minor styling and wording changes

Changes made after reading https://developers.google.com/tech-writing

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs: set proper types for code blocks

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs: add `copy` wrapper for some commands

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs: sync

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs: resolve conflicts

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-13 09:48:26 +03:00
Yury Molodov
879670418f vmui: enhancements (#2638) (#2717)
* feat: make datepicker to be set to last 30 min by default

* fix: correct spinner while loading data

* feat: change legend style

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-13 09:43:37 +03:00
Yury Molodov
7979e5cd26 vmui: fix relative time and query params (#2716)
* fix: correct update query params

* fix: change select relative time
2022-06-13 09:32:09 +03:00
Aliaksandr Valialkin
55a0d34be5 docs/CHANGELOG.md: refer to the issue, which should be solved after the optimization to /api/v1/labels and /api/v1/label/.../values is added
The optimization has been added in 374beb350e

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1533
2022-06-12 14:32:45 +03:00
Aliaksandr Valialkin
52cf05c6d2 lib/storage: test GetTSDBStatusWithFiltersForDate on a global time range 2022-06-12 14:27:40 +03:00
Aliaksandr Valialkin
374beb350e app/vmselect: optimize /api/v1/labels and /api/v1/label/.../values handlers when match[] query arg is passed to them 2022-06-12 04:32:13 +03:00
Aliaksandr Valialkin
89b778902b app/vmselect: add optional limit query arg to /api/v1/labels and /api/v1/label_values endpoints
This arg allows limiting the number of sample values returned from these APIs
2022-06-10 09:50:33 +03:00
Aliaksandr Valialkin
483b402bb2 app/vmselect/prometheus: extract common code for obtaining common query args into getCommonParams() function 2022-06-09 20:34:18 +03:00
Aliaksandr Valialkin
2bcb960f17 all: improve query tracing coverage for indexdb search
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-09 20:07:07 +03:00
Aliaksandr Valialkin
a30333a79e app/vmselect/graphite: remove additional redundant Request.ParseForm() calls after 38c785b851 2022-06-09 13:28:57 +03:00
Aliaksandr Valialkin
38c785b851 app/vmselect: remove redundant calls to Request.ParseForm()
Request.ParseForm() is implicitly called by the first call to Request.FormValue()
2022-06-09 13:11:26 +03:00
Artem Navoiev
cd7fb05b7c dashboards: update cluster by tenant dashboard (#2695)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-06-09 10:39:30 +02:00
Roman Khavronenko
48a60eb593 vmalert: followup for 76f05f8670 (#2706)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-09 08:58:25 +02:00
Howie
76f05f8670 feat: rule limit (#2676)
vmalert: support `limit` param in groups definition

`limit` param limits number of time series samples produced by a single rule
during execution.
On reaching the limit rule will return an err.

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-06-09 08:21:30 +02:00
Aliaksandr Valialkin
12ac255dae lib/querytracer: make it easier to use by passing trace context message to New and NewChild
The context message can be extended by calling Donef.
If there is no need to extend the message, then just call Done.
2022-06-08 21:06:52 +03:00
Aliaksandr Valialkin
a072a061a2 docs/Single-server-VictoriaMetrics.md: explain why free disk space shortage may negatively impact VictoriaMetrics performance 2022-06-08 20:03:53 +03:00
Aliaksandr Valialkin
8888e2b955 docs: add a link to cardinality explorer playground 2022-06-08 19:39:36 +03:00
Aliaksandr Valialkin
2c2418d079 docs: refer to cardinality explorer
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233
2022-06-08 19:17:41 +03:00
Dmytro Kozlov
018d2303c4 Cardinality explorer (#2625)
* Cardinality explorer

* vmui, vmselect: updated field name, added description to spinner

* make vmui-update

* updated const name, make vmui-update

* lib/storage: changes calculation for totalSeries values

* added static files

* wip

* wip

* wip

* wip

* docs/CHANGELOG.md: document cardinality explorer feature

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233

Co-authored-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-08 18:43:05 +03:00
Aliaksandr Valialkin
46d8fb03d1 docs/CHANGELOG.md: document 63b538ecd1 2022-06-07 15:52:32 +03:00
Roman Khavronenko
63b538ecd1 vmagent: update SD duration histogram metric if SD is active (#2677)
The change updates histogram for registering SD update duration
only SD is considered as `active`. SD is active if at least
one scraper for this SD has started.

This change supposed to reduce metrics cardinality produced
by duration histogram which gets updated even if SD isn't configured.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2671

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-07 15:46:44 +03:00
Aliaksandr Valialkin
a93deb307f docs/CHANGELOG.md: document https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2685 2022-06-07 15:39:13 +03:00
Wataru Manji
6564dc6c16 add Content-Encoding Header (#2685)
Co-authored-by: Wataru Manji <wataru.manji@linecorp.com>
2022-06-07 15:33:21 +03:00
Aliaksandr Valialkin
cbb64c824d docs/CHANGELOG.md: document backwards-incompatible changes in communication protocol between vmselect and vmstorage
The changes are related to the added query tracing in afced37c0b

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-07 15:25:35 +03:00
Aliaksandr Valialkin
638ba4614a docs/CHANGELOG.md: document e755d0ec3f 2022-06-07 15:16:48 +03:00
elProxy
e755d0ec3f Support legacy datadog agent (#2670)
dd-agent v5 can issue some requests with trailing slashes.
(e.g.
526559be73/ddagent.py (L303))
Trim trailing slashes for request on /datadog/ paths to accomodate for
that.

Co-authored-by: Pierre Rossi <pierre.rossi@schibsted.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-07 15:06:18 +03:00
Aliaksandr Valialkin
b022f1f113 docs/CHANGELOG.md: document 1ee1e986da 2022-06-07 15:02:22 +03:00
Roman Khavronenko
1ee1e986da lib/storage: limit max mergeConcurrency value for systems with high number of CPUs (#2673)
Workers count for merges affects the max part size during merges. Such behaviour
protects storage from running out of disk space for scenario when all workers
are merging parts with the max size.

This works very well for most cases. But for systems where high number of CPUs
is allocated for vmstorage components this could significantly impact the max
part size and result in more unmerged parts than expected.

While checking multiple production highly loaded setups it was discovered that
`max_over_time(vm_active_merges{type="storage/big}[1h]}"` rarely exceeds 2,
and `max_over_time(vm_active_merges{type="storage/small}[1h]}"` rarely exceeds 4.
The change in this commit limits the max value for concurrency accordingly.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-07 14:55:09 +03:00
Aliaksandr Valialkin
194258c7b4 docs: run make docs-sync after fda8da297e 2022-06-07 14:50:39 +03:00
Dmytro Kozlov
fda8da297e docs: fixed typos (#2680)
* docs: fixed typos

* Update README.md

* Update docs/README.md

* Update docs/Single-server-VictoriaMetrics.md

* docs: added examples with start and end params in request

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-07 14:47:15 +03:00
Aliaksandr Valialkin
f9d22e2ad3 docs/Cluster-VictoriaMetrics.md: run make docs-sync after 1c96dce367 2022-06-07 14:30:43 +03:00
Aliaksandr Valialkin
f04b997a3d docs: run make docs-sync after a439e887a3 2022-06-07 14:26:56 +03:00
Luckz
a439e887a3 README.md: add a tiny amount of articles (#2688)
* README.md: add a tiny amount of articles 

Signed-off-by: Luckz <224748+Luckz@users.noreply.github.com>

* Update README.md

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2022-06-07 14:25:53 +03:00
Aliaksandr Valialkin
a5814fe16a lib/promscrape/discovery/kubernetes: use unsupportedFieldError() function instead of errContext string
This improves code readability and maintainability a bit, since the format string
is passed as string literal into fmt.Errorf.
2022-06-07 01:22:07 +03:00
Aliaksandr Valialkin
8608dd093c all: follow-up after 8edb390e21
- Remove unused js bloatware from /targets page. This strips down binary size by more than 100Kb
- Add /service-discovery page for API compatibility with Prometheus
- Properly load bootstrap.min.css from /prometheus/targets
- Serve static contents for /targets page from app/vminsert instead of app/vmselect, because /targets page is served from there
2022-06-07 00:57:09 +03:00
Aliaksandr Valialkin
6f0a0e3072 lib/promscrape/discovery/kubernetes: follow-up after 006b8c7534
- make more clear error logs
- simplify testing for newKubeConfig by passing only the path to kube_config file instead of SDConfig struct
2022-06-06 14:40:52 +03:00
Aliaksandr Valialkin
b3b6cf345a vendor: make vendor-update 2022-06-06 13:19:34 +03:00
Aliaksandr Valialkin
6c5372c694 docs/Articles.md: add a link to https://percona.community/blog/2022/06/02/long-time-keeping-metrics-victoriametrics/ 2022-06-06 13:15:11 +03:00
Aliaksandr Valialkin
cfefdde042 lib/promauth: follow-up after 006b8c7534
- Take into account `ca`, `key` and `cert` values when generating string representation of TLSConfig.
  Print hashes instead of real values because of security considerations.
- Properly update Config.tlsCertDigets when `key` and `cert` values are set.
  This allows properly updating scrape targets after these values are updated in configs.
- Do not re-generate certificate from `key` and `cert` values per each call to getTLSCert,
  because these values are immutable.
- Do not set `ca` value from `ca_file` value, so it isn't exposed at `/config` page.
- Generate proper error messages on incorrect `key`, `cert` or `ca` values.
2022-06-04 01:01:16 +03:00
Aliaksandr Valialkin
0922ed2b7e lib/promscrape: add -promscrape.cluster.name command-line flag
This flag is used for proper data de-duplication when the same target is scraped
from multiple vmagent clusters.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679
2022-06-04 00:37:01 +03:00
Dmytro Kozlov
8edb390e21 lib/promscrape: adds service discovery visualization for /targets page(#2675)
* lib/promscrape: updated template

* lib/promscrape: fixed click on unhealthy and all btns

* app/vmselect: jquery scripts into static folder

Co-authored-by: f41gh7 <nik@victoriametrics.com>
2022-06-03 15:38:45 +02:00
Nikolay
a18914abee lib/promscrape/discovery/kubernetes: follow-up after 0b5c874911 (#2672) 2022-06-01 20:44:45 +02:00
hadesy
006b8c7534 promscrape/discovery: support kubeconfig (#2533) 2022-06-01 20:34:00 +02:00
Aliaksandr Valialkin
3aee7751b3 docs/Single-server-VictoriaMetrics.md: small clarification about custom cache size tuning 2022-06-01 14:55:54 +03:00
Aliaksandr Valialkin
ca689fec54 docs/CHANGELOG.md: follow-up after 2177089f94 2022-06-01 14:51:26 +03:00
Aliaksandr Valialkin
ea06d2fd3c lib/storage: stop background merge when storage enters read-only mode
This should prevent from `no space left on device` errors when VictoriaMetrics
under-estimates the additional disk space needed for background merge.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2603
2022-06-01 14:36:45 +03:00
Roman Khavronenko
642eb1c534 lib/storage: make indexdb/tagFilters cache size configurable (#2667)
The default size of `indexdb/tagFilters` now can be overridden via
`storage.cacheSizeIndexDBTagFilters` flag.
Please, be careful with changing default size since it may
lead to inefficient work of the vmstorage or OOM exceptions.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2663
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Co-authored-by: Nikolay <nik@victoriametrics.com>
2022-06-01 10:07:53 +02:00
Roman Khavronenko
2177089f94 promrelabel: add support of lowercase and uppercase relabeling actions (#2665)
* promrelabel: add support of `lowercase` and `uppercase` relabeling actions

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2664
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/storage: make golangci-lint happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Co-authored-by: Nikolay <nik@victoriametrics.com>
2022-06-01 10:02:37 +02:00
Aliaksandr Valialkin
b97ad42b6e deployment/docker: update base image from alpine:3.15.4 to alpine:3.16.0 2022-06-01 02:55:21 +03:00
Aliaksandr Valialkin
41958ed5dd all: add initial support for query tracing
See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#query-tracing

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-01 02:29:23 +03:00
Aliaksandr Valialkin
d2567ccdd6 lib/promscrape: use strconv.Atoi instead of strconv.ParseInt for parsing -promscrape.cluster.memberNum
In this case there is no need in converting int64 to int
2022-06-01 01:42:34 +03:00
Dima Lazerka
f5ef3806c9 Fix nth-check version for css-select dep (#2666)
Fixes security vulnerability in nth-check version <=1.0.2
My previous version pin was insufficient, as it was imported again through a different (svgo -> css-select).
2022-05-31 16:29:23 +02:00
Aliaksandr Valialkin
b6af13ae94 vendor: make vendor-update 2022-05-31 12:57:04 +03:00
Aliaksandr Valialkin
a1add5c2c7 lib/storage: make fmt 2022-05-31 12:54:37 +03:00
Aliaksandr Valialkin
89c0172778 docs/CHANGELOG.md: follow-up after 11f91532c5
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2594
2022-05-31 12:27:56 +03:00
Aliaksandr Valialkin
bac75ea8a2 lib/storage: do not take into account series from the next day when match[] filter is passed to /api/v1/status/tsdb 2022-05-31 12:15:26 +03:00
Dmytro Kozlov
11f91532c5 issue-2594: use embedded for static files (#2650)
embed static js and css files from CDN into vmalert, vmagent and vmsingle binaries.

Co-authored-by: f41gh7 <nik@victoriametrics.com>

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2594
2022-05-31 01:55:28 +02:00
Dima Lazerka
4c3b35a5ca Change img urls to the same domain (#2649) 2022-05-30 12:23:48 +03:00
Howie
d12614c0a0 chore: remove duplicated code (#2657)
Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-05-30 08:17:40 +02:00
Howie
7c3d43fa7f fix: docs (#2658)
Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-05-30 08:16:07 +02:00
Roman Khavronenko
af5bf8fada docs: add more details to the key concepts (#2648)
- data modification
- more applications for gauges
- recommendations for instrumenting app with metrics

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-27 16:55:40 +02:00
Dmytro Kozlov
1eb29794e6 removed redundant return (fixed linter) (#2647)
* removed redundant return

* updated lint package version
2022-05-26 16:24:01 +02:00
Aliaksandr Valialkin
69b9cf7161 app/vmselect/vmui: make vmui-update after 492a615a88 2022-05-26 09:40:57 +03:00
Yury Molodov
492a615a88 vmui: import dashboards (#2642)
* fix: switch dashboards import to fetch

* make vmui-update
2022-05-26 09:39:10 +03:00
Aliaksandr Valialkin
796804e4b0 lib/promscrape: add -promscrape.suppressScrapeErrorsDelay command-line flag
This flag can be used for reducing the amounts of logs when scraping unreliable scrape targets.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2575

The patch is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2576 .
Thanks to @jelmd .
2022-05-25 22:59:36 +03:00
Roman Khavronenko
1e15ff5320 docs: update single version docs by adding extra links and formatting (#2643)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-25 21:49:43 +02:00
Aliaksandr Valialkin
31c6cfe3fb vendor: make vendor-update 2022-05-25 21:49:12 +03:00
Aliaksandr Valialkin
f6d11a49aa lib/storage: add ability to change the indexdb rotation time offset with -retentionTimezoneOffset command-line flag
This is a follow-up for 0fbf59199a

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2574
2022-05-25 16:05:29 +03:00
阳明
0fbf59199a lib/storage: Remove the effect of time zone on next retention period (#2568) (#2574) 2022-05-25 15:08:24 +03:00
Aliaksandr Valialkin
bfe96a3cb4 docs/CHANGELOG.md: document 9e343faa41 2022-05-25 15:03:46 +03:00
Roman Khavronenko
5bf5caab93 docs: mention guide for migrating from inlfux in vmctl readme (#2640)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-25 13:28:56 +02:00
spectvtor
9e343faa41 fix alert relabeling (#2633) 2022-05-25 09:36:04 +02:00
Aliaksandr Valialkin
7747708ca7 app: expose /api/v1/status/config endpoint in the same way as Prometheus does
This endpoint is needed for third-party tools.

See https://prometheus.io/docs/prometheus/latest/querying/api/#config
2022-05-25 09:57:13 +03:00
Nikolay
cbfc1b7eb8 dashboards: adds dashboard for operator (#2621)
Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

Adds proper interval to rate functions
2022-05-23 11:32:51 +03:00
Aliaksandr Valialkin
1fe0657828 deployment/docker: update Go builder from 1.18.1 to 1.18.2
See https://github.com/golang/go/issues?q=milestone%3AGo1.18.2+label%3ACherryPickApproved
2022-05-23 10:57:16 +03:00
Aliaksandr Valialkin
1e745416aa docs/Single-server-VictoriaMetrics.md: mention Cardinality limiter docs at Resource usage limits section 2022-05-23 10:51:50 +03:00
Nikolay
ce644e9942 Updates operator docs (#2622)
* docs/operator: adds information about VMAgent statefulMode
* docs/operator: adds description for alertmanager configuration
* docs/operator: adds description for configuration syncronization

https://github.com/VictoriaMetrics/operator/issues/124

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-05-23 07:46:53 +02:00
Roman Khavronenko
113301308a vmalert: mention how to build a custom image (#2626)
Thanks to @f41gh7

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-23 00:59:34 +02:00
Aliaksandr Valialkin
2d2d15b0d0 docs/CHANGELOG.md: cut v1.77.2 2022-05-21 02:26:15 +03:00
Aliaksandr Valialkin
9854fc4dd5 docs/CHANGELOG.md: group vmalert features together 2022-05-21 01:52:01 +03:00
Aliaksandr Valialkin
cf05750d40 docs/CHANGELOG.md: document 2cf586da78 2022-05-21 01:12:32 +03:00
Roman Khavronenko
d5eb6afe26 lib/promscrape/discovery/kubernetes: fixes kubernetes service discovery (#2615)
* lib/promscrape/discovery/kubernetes: properly updates discovered scrape works
previously, added or updated scrapeworks may override previuosly
discovered.
it happens because swosByKey may contain small subset of kubernetes
objects with it's labels.
It happens for objectsUpdated and objectsAdded maps, which include only changed elements

* Properly calculate vm_promscrape_discovery_kubernetes_scrape_works

Co-authored-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-21 01:01:37 +03:00
Roman Khavronenko
2cf586da78 vmalert: add new metric vmalert_iteration_interval_seconds (#2623)
The new metric shows the configured evaluation interval per group.
Metric updates its value when group's interval is changed during
hot reload.
The new metric can be used to estimate how close group
is to start missing evaluation rounds. The following query
will show the % of used time by the group to evaluate all rules
before the next round:
```
(max(vmalert_iteration_duration_seconds{quantile="0.99"}) / vmalert_iteration_interval_seconds) * 100
```

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2618
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-20 17:31:16 +02:00
Yurii Kravets
ac55ca052c Update README.md (#2624) 2022-05-20 15:48:32 +02:00
Aliaksandr Valialkin
1731fe2ada docs: update the description for command-line flags according to recent changes 2022-05-20 15:09:43 +03:00
Aliaksandr Valialkin
c6d543e2f9 app/vmselect/vmui: make vmui-update 2022-05-20 14:53:38 +03:00
Aliaksandr Valialkin
d87733fe1c vendor: make vendor-update 2022-05-20 14:45:24 +03:00
Aliaksandr Valialkin
a175a57084 Makefile: explicitly specify go1.17 compatibility when running go mod tidy at make vendor-update
This is needed because go1.17 is the minimum supported version of Go,
which is needed for building VictoriaMetrics
2022-05-20 14:41:55 +03:00
Aliaksandr Valialkin
6b5979cd76 docs/CHANGELOG.md: document 3df8caca15 2022-05-20 14:22:48 +03:00
Aliaksandr Valialkin
667e018a7e docs/CHANGELOG.md: formatting fixes 2022-05-20 14:22:47 +03:00
Aliaksandr Valialkin
832623516b docs/CHANGELOG.md: formatting fix for the issue url https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2607 2022-05-20 14:22:47 +03:00
Aliaksandr Valialkin
65227b88a6 docs/CHANGELOG.md: link to the feature request about reusable templates in vmalert
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2510
2022-05-20 14:22:46 +03:00
Roman Khavronenko
b9d7a66800 docs: add migration guide for influxdb (#2619)
docs: add migration guide for influxdb

Co-authored by @denisgolius 

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-20 11:20:36 +02:00
Roman Khavronenko
e79a1d1476 docs: follow-up after 4b3eb40658 (#2602)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-19 13:24:15 +02:00
Roman Khavronenko
f056121128 docs: fix a typo of mentioning scrape support for vmselect (#2617)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-19 13:23:55 +02:00
Boris Petersen
3df8caca15 Add ability to sign requests for all AWS services (#2604)
This adds the ability to utilize sigv4 signing for all AWS services not
just "aps". When the newly introduced property "service" is not set it
will default to "aps".

Signed-off-by: Boris Petersen <boris.petersen@idealo.de>
2022-05-18 14:58:31 +02:00
Roman Khavronenko
5111d850e2 vmalert: remove a line added for debug (#2611)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-18 14:57:58 +02:00
Roman Khavronenko
34116882b4 vmalert: support scalar type in response (#2610)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2607

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-18 09:50:46 +02:00
Roman Khavronenko
1fad4dc919 vmalert: support strings in humanize.* templates (#2606)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2569

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-17 15:38:54 +02:00
Yurii Kravets
5c42c1218a Update vmalert.md (#2580)
docs: update vmalert/README.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-05-17 14:14:18 +02:00
Roman Khavronenko
808d0d8ffe docs: quickstart update (#2572)
docs: update docs for beginners

QuickStart page was updated with more relevant information.
Key Concepts was added to cover basics for the VictoriaMetrics.
2022-05-17 14:06:58 +02:00
Yurii Bychenok
b7536f2a0a Updated vmctl documentation, migration from OpenTSDB section (#2595)
Co-authored-by: Yurii Bychenok <ipeacocks@pm.me>
2022-05-17 13:20:35 +02:00
dependabot[bot]
baf1ec4639 build(deps): bump github.com/influxdata/influxdb from 1.9.6 to 1.9.7 (#2589)
Bumps [github.com/influxdata/influxdb](https://github.com/influxdata/influxdb) from 1.9.6 to 1.9.7.
- [Release notes](https://github.com/influxdata/influxdb/releases)
- [Changelog](https://github.com/influxdata/influxdb/blob/master/CHANGELOG_OLD.md)
- [Commits](https://github.com/influxdata/influxdb/commits)

---
updated-dependencies:
- dependency-name: github.com/influxdata/influxdb
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-17 13:14:30 +02:00
Yury Molodov
c97c1fc1bf vmui: optimize data fetching (#2584) 2022-05-17 13:13:45 +02:00
Yury Molodov
fcf4190d0b vmui: add option to customize url params for individual pages (#2582) 2022-05-17 13:13:15 +02:00
Roman Khavronenko
b74c001c92 vmalert: support /rules path for Grafana's ngalert requests (#2593)
Unexpectedly, Grafana makes an extra request to `/rules`
handler in addition to `/api/v1/rules` calls in alerts UI.
This happens only for Grafana versions older than 8.5.*.
Apparently, this is related to support of other monitoring
systems.
Prometheus responds with `text/html` content for UI page `/rules`
to such requests. Actually, returning just a blank page with
SC=200 works as well.

Returning actual response of `/api/v1/rules`
results in error in Grafana since it expects a `yaml` (?) in response.
So we add a placeholder to `vmalert`.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2583
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-16 10:00:24 +02:00
Yury Molodov
4e6b483ef1 fix: change get display type (#2553) 2022-05-16 09:44:13 +02:00
Yury Molodov
ff74472621 vmui: setup predefined dashboards without build (#2541)
vmui: support predefined dashboards in json format

See https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmselect/vmui/dashboards
2022-05-16 09:42:37 +02:00
Roman Khavronenko
284bda8746 docs: fix liquid syntax errors (#2592)
For liquid text processor double braces `{{` `}}`
are special chars for templating.
Since we use them in some of our docs with different purpose,
we must escape them to avoid syntax errors from liquid.

For escaping curly braces we use bult-in plugin which helps
to enclose sections of text via `{% raw %}` and `{% endraw %}`.
This approach prevents liquid syntax errors and makes render correct.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-16 09:27:19 +02:00
Dima Lazerka
8402231d40 Force up nth-check version 2022-05-15 00:35:20 +03:00
Roman Khavronenko
0d07166eed vmalert: fix readme formatting (#2587)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-14 19:29:09 +02:00
Roman Khavronenko
9bc03f6b04 vmalert: follow-up after 0ac1cdfff5 (#2586)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-14 18:56:31 +02:00
Andrii Chubatiuk
a531a96193 added reusable templates support (#2532)
Signed-off-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
2022-05-14 11:38:44 +02:00
Aliaksandr Valialkin
9d7da130b5 docs/CHANGELOG.md: document 3f0ecee128
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2577
2022-05-13 16:56:38 +03:00
Aliaksandr Valialkin
ce47d18052 app/{vmagent,vminsert}: mention port 8089 instead of 8189 in the description for -influxListenAddr flag
InfluxDB uses 8089 port for sending plain Influx line protocol data over TCP and UDP.
See https://docs.influxdata.com/influxdb/v1.8/administration/ports/

This is a follow-up for 20cef877a1
2022-05-13 16:50:45 +03:00
Aliaksandr Valialkin
c448d2fcbb app/vmalert: apply -remoteRead.disablePathAppend to -datasource.url in the same way as for the -remoteRead.url
This is a follow-up for 0e2486df56

The related pull requests:
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1536
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1712
2022-05-13 16:44:43 +03:00
Denys Holius
b68f0fe741 Update golangci version to latest v1.46.1 (#2579) 2022-05-13 14:07:23 +02:00
Roman Khavronenko
3f0ecee128 vmalert: properly cleanup stale series tracker on rules update (#2577)
Rules executor within group tracks series sent to remote write
in order to mark them as stale if they had disappeared in next
evaluation round.
The executor uses rules ID as a key to identifies series which belong to rule.
On config reload, executor remains active but the set of rules could change.
Hence, we need to properly cleanup the tracker for rules which has been disappeared
on config reload.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-13 10:04:49 +02:00
Roman Khavronenko
071f7c24d4 vmctl: make linter happy (#2578)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-13 09:55:27 +02:00
Yurii Kravets
20cef877a1 Update docs (#2566)
* deployment/docker: pass `-buildvs=false` to `go build` for production builds

This should resolve the `error obtaining VCS status: exit status 128` error
when the environment contains incorrect version of git or has incorrect access rights
to the directory with VictoriaMetrics source code.

See the following links for additional info:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2508#issuecomment-1117126702 ,
- https://github.com/google/ko/issues/672
- https://github.com/golang/go/issues/49004

* lib/netutil: limit the number of concurrently established connections when calling ConnPool.Get()

This should reduce potential spikes in the number of established connections in the following cases:
- when the connection establishing procedure becomes temporarily slow
- after a temporary spike in the rate of ConnPool.Get() calls

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2552

* docs/CHANGELOG.md: document c8af625bcc

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1322#issuecomment-1120276146

* docs/Cluster-VictoriaMetrics.md: typo fix: `by by` -> `by`

* docs: add `resource usage limits` docs, which describe fine-grained tuning for various resource usage limits

* docs/Cluster-VictoriaMetrics.md: the `/api/v1/label/.../values` query can take CPU and ram at both vmstorage and vmselect

* Update root Readme and root vmagent readme

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-12 19:22:13 +02:00
Aliaksandr Valialkin
a0727ab1b1 docs/vmagent.md: typo fix in the description for -promscrape.cluster.replicationFactor command-line flag 2022-05-12 18:50:29 +03:00
Aliaksandr Valialkin
7ddf9f0700 docs/Cluster-VictoriaMetrics.md: the /api/v1/label/.../values query can take CPU and ram at both vmstorage and vmselect 2022-05-11 20:01:24 +03:00
Aliaksandr Valialkin
73cbc87dbb docs: add resource usage limits docs, which describe fine-grained tuning for various resource usage limits 2022-05-11 19:52:03 +03:00
Aliaksandr Valialkin
b828c6e1ff docs/Cluster-VictoriaMetrics.md: typo fix: by by -> by 2022-05-11 18:13:07 +03:00
Aliaksandr Valialkin
3f6a7bff85 docs/CHANGELOG.md: document c8af625bcc
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1322#issuecomment-1120276146
2022-05-11 14:33:27 +03:00
Aliaksandr Valialkin
19f019d0d5 lib/netutil: limit the number of concurrently established connections when calling ConnPool.Get()
This should reduce potential spikes in the number of established connections in the following cases:
- when the connection establishing procedure becomes temporarily slow
- after a temporary spike in the rate of ConnPool.Get() calls

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2552
2022-05-11 14:17:22 +03:00
Aliaksandr Valialkin
991688ea65 deployment/docker: pass -buildvs=false to go build for production builds
This should resolve the `error obtaining VCS status: exit status 128` error
when the environment contains incorrect version of git or has incorrect access rights
to the directory with VictoriaMetrics source code.

See the following links for additional info:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2508#issuecomment-1117126702 ,
- https://github.com/google/ko/issues/672
- https://github.com/golang/go/issues/49004
2022-05-11 14:16:14 +03:00
Denys Holius
3dbdd4ef8a docs: fixed typos in CHANGELOG.md (#2565) 2022-05-10 13:16:17 +02:00
Dmytro Kozlov
c8af625bcc vmctl: fix build for solaris os (#2555)
* vmctl: fix build for solaris os

* vmctl: updated dependency (using Syscall instead of Syscall6)

* vmctl: updated dependency

* vmctl: updated dependency
2022-05-09 21:36:18 +02:00
Aliaksandr Valialkin
a7f18f8cb2 app/vmselect/promql: do not return values from label_value() if the original time series has no values at the selected timestamps 2022-05-09 17:57:39 +03:00
Denys Holius
e4bbcc29c2 Update golangci version to latest v1.46.0 (#2560)
Update golangci version to latest https://github.com/golangci/golangci-lint/tree/v1.46.0
2022-05-09 16:52:36 +02:00
Aliaksandr Valialkin
f901788c7f docs/CHANGELOG.md: document 8f4f5f1d68 2022-05-09 17:32:51 +03:00
Aliaksandr Valialkin
9ea3f0c0d3 lib/awsapi: remove whitelist arg from GetFiltersQueryString(), since it may break new filters in the future
Let users decide which filters to use. If users start using disallowed filters, then AWS will return an error.
2022-05-09 15:33:22 +03:00
Aliaksandr Valialkin
84326eacd6 docs/Release-Guide.md: typo fix: signle->single 2022-05-09 15:27:44 +03:00
Roman Khavronenko
331a5d9a17 Code check (#2558)
* vmstorage: make gofmt happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: make linter happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-09 10:11:56 +02:00
Roman Khavronenko
e9fa363480 Vmalert fix bugs in alerting evaluation (#2557)
* vmalert: calculate time for firing alert based on the given timestamp

Previously, current time was used for checking the `firing` threshold.
This is not correct, since alerts are evaluated at specific timestamps.
Hence, this specific timestamp supposed to be used in the calculation.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: properly calculate evaluation timestamp for rules

Timestamp for rules evaluation should be calculated after
the artifical delay for groups start. Otherwise, evaluation
timestamp can fall back too far in time.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-09 10:11:06 +02:00
Manuel Polo
58c1472394 docs: fix typo in quickstart 2022-05-08 19:43:42 +00:00
Artem Navoiev
3f78a609ac docs: add flags list to vmbackupmanager (#2554)
docs: add flags list to vmbackupmanager docs

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: tenmozes <artem@victoriametrics.com>
2022-05-08 12:18:39 +00:00
Marc Hörsken
8f4f5f1d68 app/vmctl: add flag to handle Prometheus remote_write to InfluxDB (#2545)
Make it possible to migrate timeseries while restoring the
original timeseries name previously written from Prometheus
to InfluxDB v1 via remote_write.

Fixes: https://github.com/VictoriaMetrics/vmctl/issues/8
2022-05-07 19:52:42 +00:00
Aliaksandr Valialkin
043363750a vendor: make vendor-update 2022-05-07 01:48:35 +03:00
Aliaksandr Valialkin
da9e96bc55 docs/CHANGELOG.md: cut v1.77.1 2022-05-07 01:46:27 +03:00
Aliaksandr Valialkin
f26daecc8d app/vminsert/netstorage: re-route samples from readonly vmstorage nodes to healthy nodes if -dropSamplesOnOverload command-line flag is set 2022-05-07 01:41:00 +03:00
Aliaksandr Valialkin
d6ad8d090d app/vmstorage: do not allow to set -retentionPeriod smaller than one day
VictoriaMetrics doesn't support retention periods smaller than one day,
so do not allow to set it to small values.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2496
2022-05-07 00:53:01 +03:00
Aliaksandr Valialkin
123aa4c79e lib/promscrape: properly implement ScrapeConfig.clone()
Previously ScrapeConfig.clone() was improperly copying promauth.Secret fields -
their contents was replaced with `<secret>` value.

This led to inability to use passwords and secrets in `-promscrape.config` file.
The bug has been introduced in v1.77.0 in the commit 67b10896d2

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2551
2022-05-07 00:05:40 +03:00
Aliaksandr Valialkin
38629b53ef docs/CHANGELOG.md: document e726340914 2022-05-06 18:10:40 +03:00
Marc Hörsken
e726340914 app/vmctl: add flag to skip adding the InfluxDB 'db' label (#2544)
Make it possible to migrate timeseries without changing labels
at all, including not adding the now optional 'db' label.
2022-05-06 18:06:54 +03:00
Dmytro Kozlov
9a63f6c1b8 vmbackup: Prevent save backups to the same folder where TSDB data is (#2547)
* {vmbackup, vmbackup/snapshot}: validate snapshot name

* vmbackup/snapshot: added another checks

* backup/actions: added check that we ignore backup_complete.ignore file

* vmbackup: moved snapshot to lib directory

* lib/snapshot: added functions description

* lib/snapshot: fixed typo

* vmbackup: code cleanup

* wip

* vmbackup: Prevent save backups to the same folder where TSDB data is

* Apply suggestions from code review

* wip

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-06 18:04:09 +03:00
Dmytro Kozlov
1235e754a3 deployment/docker: added vmalert.proxyURL flag (#2549) 2022-05-06 17:30:41 +03:00
Aliaksandr Valialkin
80092f087e docs/Cluster-VictoriaMetrics.md: typo fix: bandidth -> bandwidth 2022-05-06 16:35:36 +03:00
Aliaksandr Valialkin
8100c9a301 docs/Cluster-VictoriaMetrics.md: make the description for -rpc.disableCompression command-line flag more clear 2022-05-06 16:27:33 +03:00
Roman Khavronenko
20ccf0ba81 vmctl: add tip about safety flags during for native data export (#2540)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-06 15:12:29 +02:00
Aliaksandr Valialkin
8d81703635 app/vmagent: add missing _total suffix to vmagent_remotewrite_global_rows_pushed_before_relabel_total counter
This is a follow up for c536139d0b
2022-05-06 15:50:57 +03:00
Aliaksandr Valialkin
1dc4cc243b lib/promscrape: rename promscrape_stale_samples_created_total metric to vm_promscrape_stale_samples_created_total, so its name is consistent with the rest of vm_promscrape_ metrics 2022-05-06 15:33:13 +03:00
Aliaksandr Valialkin
c536139d0b app/vmagent: expose vmagent_remotewrite_global_rows_pushed_before_relabel and vmagent_remotewrite_rows_pushed_after_relabel_total metrics 2022-05-06 15:28:59 +03:00
Aliaksandr Valialkin
51e36fd533 app/vmagent: rename vmagent_remote_write_rate_limit_reached_total to vmagent_remotewrite_rate_limit_reached_total for the sake of consistency with other vmagent_remotewrite_ metrics 2022-05-06 15:01:54 +03:00
Aliaksandr Valialkin
d5b55fe22d lib/promscrape/discovery/ec2: add ability to filter Availability Zones in ec2_sd_config via az_filters section 2022-05-06 12:43:29 +03:00
Aliaksandr Valialkin
ca4ca4630b app/vmselect/vmui: make vmui-update after 450d879eaa 2022-05-05 21:26:01 +03:00
Yury Molodov
450d879eaa vmui: prevent reset relative time (#2543)
* fix: prevent time picker reset to previous time

* fix: add default display type
2022-05-05 21:21:02 +03:00
Yury Molodov
a580efa26a fix: remove react @types (#2539) 2022-05-05 21:17:41 +03:00
Denis Fondras
928728807c Fix typo (#2538) 2022-05-05 21:16:03 +03:00
Aliaksandr Valialkin
6343b20943 docs/Cluster-VictoriaMetrics.md: update cluster scalability tips 2022-05-05 21:11:51 +03:00
Roman Khavronenko
13efaa42d5 vmstorage: switch to rich duration parser for flag snapshotsMaxAge (#2542)
The switch suppose to allow setting `d`, `w`, `y` duration units.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-05 18:43:21 +02:00
Aliaksandr Valialkin
be76d49150 docs/CHANGELOG.md: document bf5e3774cc 2022-05-05 13:38:17 +03:00
Marc Hörsken
bf5e3774cc app/vmctl: fix empty/skipped labels after db label (#2536)
Do not assume the db label to be the last one and also
make sure we are not skipping it and everything afterwards.
Breaking the loop would cause following labels to be empty.
2022-05-05 13:35:08 +03:00
Aliaksandr Valialkin
97f9c2f667 lib/promscrape/discovery/ec2: properly pass filters to DescribeAvailabilityZones API call
Previously filters wheren't passed to this call after the commit 0e09fdb8b0

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
2022-05-05 11:00:23 +03:00
Aliaksandr Valialkin
d285c2fea7 lib/awsapi: pass filtersQueryString arg to GetEC2APIResponse() function, so the caller could decide whether to use the filters during the AWS API query
The filters shouldn't be passed to DescribeAvailabilityZones API call.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287

Related commits:
0e09fdb8b0
d289ecded1
2022-05-05 10:29:34 +03:00
Aliaksandr Valialkin
6bb32ab9de docs/CHANGELOG.md: cut v1.77.0 2022-05-05 00:16:16 +03:00
Aliaksandr Valialkin
c9e57d8000 app/vmstorage/main.go: reduce the difference with cluster version 2022-05-04 23:56:19 +03:00
Aliaksandr Valialkin
8be5c0ab16 vendor: make vendor-update 2022-05-04 23:50:38 +03:00
Aliaksandr Valialkin
fc7c7237e3 app/vmselect: follow-up after 8639e79d38 2022-05-04 23:35:57 +03:00
Aliaksandr Valialkin
2c037ae0d3 docs/vmbackup.md: added missing -storageDataPath argument in the command for creating daily backups 2022-05-04 22:49:37 +03:00
Aliaksandr Valialkin
910f715ffe docs/vmbackup.md: acutalize docs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2503
2022-05-04 22:44:15 +03:00
Dmytro Kozlov
7dd9f3b98e {vmbackup, vmbackup/snapshot}: fixed problem with snapshot backup in another snapshot folder (#2535)
* {vmbackup, vmbackup/snapshot}: validate snapshot name

* vmbackup/snapshot: added another checks

* backup/actions: added check that we ignore backup_complete.ignore file

* vmbackup: moved snapshot to lib directory

* lib/snapshot: added functions description

* lib/snapshot: fixed typo

* vmbackup: code cleanup

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-04 22:12:03 +03:00
Aliaksandr Valialkin
e761d9449c app/vmagent: rename -remoteWrite.useSigv4 command-line flag to -remoteWrite.aws.useSigv4, so its name is consistent with the other -remoteWrite.aws.* command-line flags 2022-05-04 20:41:17 +03:00
Aliaksandr Valialkin
381e2de59c app/vmalert: run make quicktemplate-gen from the root directory after the commit f6dcfbcdd6 2022-05-04 20:27:36 +03:00
Nikolay
d289ecded1 {lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite (#2458)
* {lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite
moves aws related code into separate lib from lib/promscrape
it allows to write data from vmagent to the AWS managed prometheus (cortex)

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287

* Apply suggestions from code review

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-04 20:24:19 +03:00
Dmytro Kozlov
f6dcfbcdd6 vmalert/tpl: fixed truncating alerts expression in table (#2494)
vmalert: improve `/groups` UI visual 

The change also fixes truncated rules expressions in UI
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2484
2022-05-04 18:02:18 +02:00
Aliaksandr Valialkin
242856ce97 docs/guides/multi-regional-setup-dedicated-regions.md: clarify wording on vmagent configuration 2022-05-04 18:49:56 +03:00
Aliaksandr Valialkin
4e4ca1b6db docs/Cluster-VictoriaMetrics.md: move environment variables entry closer to cluster setup section 2022-05-04 18:43:57 +03:00
Aliaksandr Valialkin
bab9670d69 docs/CHANGELOG.md: yet another typo fix: present -> pressed 2022-05-04 18:20:39 +03:00
Aliaksandr Valialkin
554008bb4e docs/CHANGELOG.md: typo fixes 2022-05-04 18:18:37 +03:00
Aliaksandr Valialkin
2e6827ff04 docs/CHANGELOG.md: document 8639e79d38 2022-05-04 10:46:03 +03:00
Nikolay
8639e79d38 app/vmselect: adds proxy for rules API (#2516)
* app/vmselect: adds proxy for rules and alerts API
It allows to visualization for rules at grafana
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1739

* Update app/vmselect/main.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-05-03 20:55:15 +03:00
Dima Lazerka
1e26dd1f82 Code cleanup (#343)
* Small code cleanup: remove Request from params

* Extract common params to all export handlers

* Renamed ExportParams -> exportParams

* wip

Co-authored-by: Dzmitry Lazerka <dlazerka@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-03 16:04:16 +03:00
Aliaksandr Valialkin
1c7a541247 docs/Cluster-VictoriaMetrics.md: typo fix: serparated -> separated 2022-05-03 15:19:15 +03:00
Aliaksandr Valialkin
2ced6746a7 docs/CHANGELOG.md: document 3575aabeaf 2022-05-03 14:01:15 +03:00
Nikolay
3575aabeaf lib/promscrape: adds correct http status codes for redirect (#2530)
standard http client accepts multiple http status codes as redirect
it should fix issue with incorrect redirects
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2482
2022-05-03 13:31:31 +03:00
Aliaksandr Valialkin
b5fedfd3bb all: add -cluster.tlsInsecureSkipVerify command-line option to vminsert, vmselect and vmstorage components in order to be able to disable TLS certificate verification in mTLS mode
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2490
2022-05-03 13:12:50 +03:00
Aliaksandr Valialkin
528b332fe0 docs/Cluster-VictoriaMetrics.md: refer to the doc on how to set up mTLS 2022-05-03 11:55:06 +03:00
Aliaksandr Valialkin
53cef612b0 docs/CHANGELOG.md: document 488c34f5e1 2022-05-03 11:01:02 +03:00
Aliaksandr Valialkin
bca4737fcf app/vmui: fix up/down arrow keys on multi-line query after a186434b50 2022-05-03 10:50:16 +03:00
Dmytro Kozlov
488c34f5e1 vmctl: fixed blocking when aborting import process (#2509)
vmctl: fix vmctl blocking on process interrupt

This change prevents vmctl from indefinite blocking on
receiving the interrupt signal. The update touches all
import modes and suppose to improve tool reliability.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2491
2022-05-03 07:03:41 +02:00
Aliaksandr Valialkin
aa02719d86 docs/CHANGELOG.md: document d0706c8c95 2022-05-02 22:24:45 +03:00
Gard Rimestad
d0706c8c95 app/vmagent add metric for rate limit (#2521)
This adds a metric for the rate limit.
The limit is present as a flag currently:
`flag{name="remoteWrite.rateLimit", value="500000", is_set="true"} 1`

We are running many instances of vmagent and when creating alerts it is harder than it needs to be when extracting the value from the flag.

With this change it should be easier to monitor how close to the limit we are.

`((100/vmagent_remotewrite_rate_limit{account="account"})*sum (rate(vmagent_remotewrite_conn_bytes_written_total{account="account"}))) and ON (account) flag{name="remoteWrite.rateLimit"} == 1`
2022-05-02 22:20:05 +03:00
Aliaksandr Valialkin
0d86644d65 lib/storage: leave the last sample per each discrete interval during the deduplicaton
This aligns better with staleness logic in Prometheus - https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness
2022-05-02 21:50:45 +03:00
Aliaksandr Valialkin
a186434b50 app/vmui: execute query by pressing enter in the same way as Prometheus does
Multi-line query can be entered via `shift-enter` in the query input field
2022-05-02 19:49:29 +03:00
Yury Molodov
87693754d5 vmui: support node v.18 (#2529)
* fix: add support vmui node18

* fix: remove @mui/styles (legacy styling solution)

* Update app/vmui/Dockerfile-build

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-02 18:41:53 +03:00
Aliaksandr Valialkin
67977e2b55 vendor: make vendor-update 2022-05-02 16:00:32 +03:00
Aliaksandr Valialkin
2c4565bb3d docs/Cluster-VictoriaMetrics.md: move here the deduplication docs related to cluster version 2022-05-02 15:53:34 +03:00
Max Golionko
e79aa037b0 added details about deduplication (#2527)
* added details about deduplication

* Update docs/README.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Apply suggestions from code review

* Update docs/README.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-05-02 15:45:57 +03:00
Aliaksandr Valialkin
bae7e8b16b docs/CHANGELOG.md: document 3616337812
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2514
2022-05-02 15:36:00 +03:00
Aliaksandr Valialkin
70d9e7346b docs/CHANGELOG.md: document 32a6b67e6c
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1761
2022-05-02 15:26:21 +03:00
Aliaksandr Valialkin
6039640a26 docs/CHANGELOG.md: document b2294d1cf1 2022-05-02 15:21:24 +03:00
Aliaksandr Valialkin
58390192c1 app/vmalert: run make quicktemplate-gen from the repository root
This is a follow-up after b2294d1cf1
2022-05-02 15:17:03 +03:00
Aliaksandr Valialkin
7bc6595b45 lib/netutil: close connections in ConnPool if they are idle for more than 30 seconds
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2508
2022-05-02 15:14:05 +03:00
Roman Khavronenko
3616337812 vmalert: do not execute templates during validation (#2528)
Function `ValidateTemplates`, used on the vmalert startup,
is supposed to check whether used templates and functions
in loaded rules are correct. The function was parsing
and executing loaded templates.
However, rules may contain functions which can't be executed
without values (label values or query results), like `slice`.
Because of this, validation for completely valid expression
`{{ slice $labels.job 9 }}` will fail since `$labels.job`
is empty during validation.

This PR updates `ValidateTemplates` function to only parse
templates without executing them.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2514
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-02 10:16:16 +02:00
Dmytro Kozlov
32a6b67e6c vmalert: added disableProgressBar flag which disable progressbar (#2506)
vmalert: added disableProgressBar flag which disable progressbar

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1761
2022-05-02 10:08:24 +02:00
Artem Navoiev
37cf509c3a lib/{storage,flagutil} - Add option for snapshot autoremoval (#2487)
* lib/{storage,flagutil} - Add option for snapshot autoremoval

- add prometheus-like duration as command flag
- add option to delete stale snapshots
- update duration.go flag to re-use own code

* wip

* lib/flagutil: re-use Duration.Set() call in NewDuration

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-02 11:00:15 +03:00
Aliaksandr Valialkin
20bc2a2c44 lib/flagutil: re-use Duration.Set() call in NewDuration 2022-05-02 10:56:39 +03:00
Dmytro Kozlov
b2294d1cf1 vmctl/vm: added datapoints collection bar (#2486)
add progress bars to the VM importer

The new progress bars supposed to display the processing speed per each
VM importer worker. This info should help to identify if there is a bottleneck
on the VM side during the import process, without waiting for its finish.
The new progress bars can be disabled by passing `vm-disable-progress-bar` flag.

Plotting multiple progress bars requires using experimental progress bar pool
from github.com/cheggaaa/pb/v3. Switch to progress bar pool required changes
in all import modes.

The openTSDB mode wasn't changed due to its implementation, which implies individual progress
bars per each series. Because of this, using the pool wasn't possible.

Signed-off-by: dmitryk-dk <kozlovdmitriyy@gmail.com>

Co-authored-by: hagen1778 <roman@victoriametrics.com>
2022-05-02 09:06:34 +02:00
Peter Dupej
8688ea8aa8 Update MetricsQL.md (#2519)
remove typo from label_replace doc.
2022-05-02 09:42:58 +03:00
Aliaksandr Valialkin
0efeba13a7 vendor: update github.com/valyala/gozstd from v1.16.0 to v1.17.0 2022-04-29 19:30:41 +03:00
Aliaksandr Valialkin
7aa5167996 app/vmselect/main.go: move the code /api/v1/status/buildinfo handler to the same location as in the cluster branch 2022-04-29 13:02:14 +03:00
Aliaksandr Valialkin
464325a24b app/vmselect/vmui: make vmui-update after da04e9d1de 2022-04-29 12:54:10 +03:00
Vitaliy Vasilenko
da04e9d1de vmui: fix default server path (#2511) 2022-04-29 12:51:48 +03:00
Dima Lazerka
ed8e88af11 Export "null" in jsonl instead of NaN (#2518)
* Export "null" in jsonl instead of NaN

The NaN appeared because of staleness markers that were added for compatibility. I think it's better to use json `null`, implemented here.

Also maybe it also makes sense to add a flag like `?skip-staleness-markers=true` to `/export`, to skip nulls at all?

* Update app/vmselect/prometheus/export.qtpl

* app/vmselect/prometheus/export.qtpl.go: `make quicktemplate-gen`

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-29 12:49:47 +03:00
Aliaksandr Valialkin
e33232bce3 deployment/docker/docker-compose.yml: update Grafana version from v8.3.5 to v8.5.1 2022-04-29 12:01:53 +03:00
Aliaksandr Valialkin
f5635d6920 docs/CHANGELOG.md: document c7aad8d441 2022-04-29 11:39:23 +03:00
Nikolay
c7aad8d441 app/vmselect: adds API /api/v1/status/buildinfo (#2515)
* app/vmselect: adds API /api/v1/status/buildinfo
it should fix an compability error with grafana 8.5 prometheus datasource
https://github.com/grafana/grafana/pull/46771

* Update main.go
2022-04-29 11:36:28 +03:00
Aliaksandr Valialkin
a266d50bed app/vmui/Dockerfile-build: fix dependency to nodejs v17, since vmui doesnt work with nodejs v18 2022-04-29 11:17:26 +03:00
Dima Lazerka
84683b8569 Fix targetstatus qtpl paths (#2517)
Ran `make quicktemplate-gen` from the root directory
2022-04-29 10:36:03 +03:00
Aliaksandr Valialkin
9bb35779d2 docs/MetricsQL.md: clarify keep_metric_names docs 2022-04-27 11:24:32 +03:00
Aliaksandr Valialkin
cce1b6d7f9 app/vmselect/promql: add tlast_change_over_time(m[d]) function, which returns the timestamp for the last change of m on the given lookbehind window d 2022-04-27 10:59:03 +03:00
Yury Molodov
c7693e8bc1 vmui: expression alias (#2495)
* feat: add alias for queries

* docs: update docs for predefined dashboards

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-26 15:59:37 +03:00
Aliaksandr Valialkin
4176be38c4 app/vmagent: substitute hard-to-read 500000000 with 500MB in -remoteWrite.maxDiskUsagePerURL description 2022-04-26 15:48:20 +03:00
Yury Molodov
9b4bff67e0 vmui: add support relative time (#2504)
* feat: add support relative time

* app/vmselect: `make vmui-update`

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-26 15:46:06 +03:00
Aliaksandr Valialkin
6be07e8c25 lib/promscrape/discovery/kubernetes: do not drop pod meta-labels even if the corresponding node objects are missing
This reflects the logic used in Prometheus.

See https://github.com/prometheus/prometheus/pull/10080
2022-04-26 15:26:01 +03:00
Aliaksandr Valialkin
e0195558c9 vendor: make vendor-update 2022-04-26 15:24:27 +03:00
Aliaksandr Valialkin
ce1190974c docs/Cluster-VictoriaMetrics.md: remove incorrect and misleading instructions for passing -replicationFactor flag to vmselect nodes in multi-level setup.
The `-replicationFactor` passed to top-level `vmselect` nodes mustn't exceed the `-replicationFactor` passed to top-level `vminsert` nodes
2022-04-26 15:13:56 +03:00
Aliaksandr Valialkin
8594609385 docs/CHANGELOG.md: document 4c1fbcd6b0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2368
2022-04-26 15:10:00 +03:00
Roman Khavronenko
4c1fbcd6b0 Single dashboards (#2492)
* dashboards: remove index filter from stats panel for DiskUsage

The diskUsage stats panel was showing disk usage without including
size of the index, which is not correct. The filter was removed
to reflect the total disk usage.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2368

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards: add adhoc filter to dasbhoard variables

The adhoc filter allows to quickly apply global filters without
modifying the panels.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards: add new panel `IndexDB items rate`

The new panel supposed to reflect the pressure on indexDB
caused by churn rate or new series registration.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards: rm "Deferred merges" panel since it could be misleading

See more context here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1682#issuecomment-938608067

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards: replace fixed interval of `5m` for `rate` expressions

Before we used fixed `5m` interval for expressions with `rate` func.
Unfortunately, this interval wasn't a fit for all the cases. So we
switch to `$__rate_interval` instead.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards: bump version requirement

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards: rm `vm_indexdb_items_added_size_bytes_total` expression

Rate over `vm_indexdb_items_added_size_bytes_total` doesn't seem to be useful
on the dasbhoard panel.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-24 23:27:56 +03:00
Aliaksandr Valialkin
5d87744ba2 docs/CHANGELOG.md: typo fix: may result -> could result 2022-04-23 00:31:22 +03:00
Aliaksandr Valialkin
9fe1bf5d53 lib/promauth: take into account tls_config and proxy_url when serializing OAuth2Config to string 2022-04-23 00:23:19 +03:00
Aliaksandr Valialkin
eb5d7ad089 lib/promauth: add support for min_version option at tls_config section in the same way as Prometheus does 2022-04-23 00:16:39 +03:00
Aliaksandr Valialkin
174431e31b lib/promauth: add support for proxy_url option at oauth2 section in the same way as Prometheus does 2022-04-23 00:00:44 +03:00
Aliaksandr Valialkin
18b14aad8e lib/promauth: add support for tls_config section at oauth2 config in the same way as Prometheus does 2022-04-22 23:51:07 +03:00
Aliaksandr Valialkin
6f79b2b68b lib/promscrape/discovery/kubernetes: limit the minimum sleep time between updating dependent ScrapeWork objects
Previously the sleep time could be dropped to nanoseconds, which could result in CPU time waste
2022-04-22 23:14:17 +03:00
Aliaksandr Valialkin
15190fcdae lib/promscrape/discovery/kubernetes: allow attaching node-level labels and annotations to discovered pod targets in the same way as Prometheus 2.35 does
See https://github.com/prometheus/prometheus/issues/9510
and https://github.com/prometheus/prometheus/pull/10080
2022-04-22 20:15:41 +03:00
Aliaksandr Valialkin
57a0aa204d lib/promscrape/discovery/kubernetes: improve the performance of urlWatcher.reloadObjects() on multi-CPU systems
Parallelize the generation of ScrapeWork objects there. Previously they were generated in a single goroutine.
2022-04-22 13:22:01 +03:00
Aliaksandr Valialkin
67b10896d2 lib/promscrape: prevent from memory leaks on -promscrape.config reload when only a small part of scrape jobs is updated
This is a follow-up after 26b78ad707
2022-04-22 13:19:43 +03:00
Aliaksandr Valialkin
8d0fb4d69d vendor: make vendor-update 2022-04-21 16:00:47 +03:00
Aliaksandr Valialkin
1970b3db34 docs/Cluster-VictoriaMetrics.md: mention that enterprise binaries are available for evaluation 2022-04-21 15:58:03 +03:00
Aliaksandr Valialkin
912a5a6f28 docs/Single-server-VictoriaMetrics.md: refer to the docs on how to set up multiple vmagent instances for scraping the same set of targets 2022-04-21 15:51:52 +03:00
Aliaksandr Valialkin
25fe83577d app/vmselect/promql: properly handle scalar default vector, scalar if vector and scalar ifnot vector queries
Previously `vector` time series could be unexpectedly returned from such queries
2022-04-21 15:34:36 +03:00
Aliaksandr Valialkin
d1a9fac894 app/vmselect/promql: fix comparison to nan
The comparison to nan has been broken in d335cc886c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150
2022-04-21 14:55:37 +03:00
Aliaksandr Valialkin
de892239a9 app/vmselect/promql: add drop_common_labels() function 2022-04-21 14:20:20 +03:00
Aliaksandr Valialkin
98129d4a8e app/vmstorage: expose vm_indexdb_items_added_total and vm_indexdb_items_added_size_bytes_total counters at /metrics page
These counters can be used for monitoring the rate of addition of new entries in indexdb (aka inverted index).

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2471
2022-04-21 13:18:39 +03:00
Aliaksandr Valialkin
167d1bea8f lib/promscrape/discovery/kubernetes: properly update endpoints and endpointslice objects when the related pod or service objects are updated
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240

This is a follow-up for 2341bd48d7
2022-04-21 13:06:22 +03:00
Aliaksandr Valialkin
7a44ba1234 app/vmselect/promql: fix q default b where b may have empty time series 2022-04-21 11:42:42 +03:00
Aliaksandr Valialkin
d335cc886c app/vmselect/promql: fix duplicate time series error on joins against time series filtered by values
This should prevent from `duplicate time series` errors when executing the following query:

kube_pod_container_resource_requests{resource="cpu"} * on (namespace,pod) group_left() (kube_pod_status_phase{phase=~"Pending|Running"}==1)

where `kube_pod_status_phase{phase=~"Pending|Running"}==1` filters out diplicate time series
2022-04-20 22:18:44 +03:00
Aliaksandr Valialkin
ed97908ca9 app/vmselect/promql: rename removeNaNs() to more clear removeEmptySeries() 2022-04-20 19:53:46 +03:00
Aliaksandr Valialkin
c75d0095f5 lib/promscrape: remove possible data race when cleaning up internStringsMap 2022-04-20 18:40:53 +03:00
Aliaksandr Valialkin
82e34984dd lib/promscrape: zero out labels after duplicate removal inside mergeLabels() 2022-04-20 18:35:33 +03:00
Aliaksandr Valialkin
a2de31f8d3 lib/promscrape/discovery/kubernetes: do not pre-allocate memory for ScrapeWork objects
There is high chance that ScrapeWork objects won't be generated because of relabeling
2022-04-20 16:40:25 +03:00
Aliaksandr Valialkin
694887cea8 docs/CHANGELOG.md: document that the service discovery speed now scales with the number of CPU cores 2022-04-20 16:22:18 +03:00
Aliaksandr Valialkin
2341bd48d7 lib/promscrape: follow-up after 91e290a8ff 2022-04-20 16:11:37 +03:00
Nikolay
91e290a8ff lib/promscrape: reduce latency for k8s GetLabels (#2454)
replaces internStringMap with sync.Map - it greatly reduces lock contention
concurently reload scrape work for api watcher - each object labels added by dedicated CPU

changes can be tested with following script https://gist.github.com/f41gh7/6f8f8d8719786aff1f18a85c23aebf70
2022-04-20 16:09:40 +03:00
Aliaksandr Valialkin
3d0549c982 lib/promscrape: optimize getScrapeWork() function
Reduce the number of memory allocations in this function. This improves its performance by up to 50%.
This should improve service discovery speed when big number of potential targets with big number of meta-labels
are generated by service discovery.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2270
2022-04-20 15:37:00 +03:00
Aliaksandr Valialkin
4513893ead lib/promscrape: use a hash over target labels as a key for dropped targets' map
This reduces the number of allocations and improves the performance for updating dropped targets' map.
This map is exposed at /api/v1/targets as in droppedTargets list.
2022-04-20 15:37:00 +03:00
Dmytro Kozlov
136a44bcfc lib/promscrape: simply update UI (#2479)
* lib/promscrape: simply update UI

* lib/promscrape: added vm icon
2022-04-20 10:25:04 +02:00
Aliaksandr Valialkin
f6d0e5e74a all: typo fix: Kuberntes -> Kubernetes 2022-04-20 10:50:49 +03:00
Dmytro Kozlov
a3ee275149 lib/promscrape: Enable filters for endpoint and labels (#2466)
* lib/promscrape: Enable filters for endpoint and labels

* lib/promscrape: cleanup

* lib/promscrape: update template

* lib/promscrape: move logic filter logic to backend

* lib/promscrape: updated placeholder

* lib/promscrape: updated placeholder

* lib/promscrape: use two different fields for filters, updated form, added error on parsing queries

* lib/promscrape: rename functions

* lib/promscrape: removed unused values

* wip

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-19 18:26:21 +03:00
Aliaksandr Valialkin
ea349660cf vendor: make vendor-update 2022-04-19 11:40:41 +03:00
naveensrinivasan
cb1ded8d9f chore: Set permissions for GitHub actions
Restrict the GitHub token permissions only to the required ones; this way, even if the attackers will succeed in compromising your workflow, they won’t be able to do much.

- Included permissions for the action. https://github.com/ossf/scorecard/blob/main/docs/checks.md#token-permissions

https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#permissions

https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs

[Keeping your GitHub Actions and workflows secure Part 1: Preventing pwn requests](https://securitylab.github.com/research/github-actions-preventing-pwn-requests/)

Signed-off-by: naveensrinivasan <172697+naveensrinivasan@users.noreply.github.com>
2022-04-17 17:04:04 +03:00
Nikolay
26b78ad707 lib/promscrape: adds job restart method (#2455)
* lib/promscrape: adds job restart method
it must restart only ScrapeConfig with changed content
this change greatly reduce time, that needed for job restart
and it should decrease possible data loss when config frequently changed at kubernetes based deployments

Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* wip

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-16 20:28:46 +03:00
Dima Lazerka
60ad8c74bc Add GitHub workflow for code scanning (#2453)
Add pre-generated workflow definition for GitHub's CodeQL code scanning.
2022-04-16 19:00:49 +03:00
Yury Molodov
514e3660e2 fix: prevent graph hiding without data (#2456)
* fix: prevent graph hiding without data

* fix: add yaxis labels default

* app/vmselect: `make vmui-update`

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-16 17:15:31 +03:00
Aliaksandr Valialkin
f62fc2f318 docs/Cluster-VictoriaMetrics.md: sync docs 2022-04-16 16:59:27 +03:00
Aliaksandr Valialkin
1097ebebe6 lib/httpserver: clarify that -tls flag enables TLS for http requests to -httpListenAddr 2022-04-16 16:59:26 +03:00
Aliaksandr Valialkin
cad488fe7e app/vmstorage: add support for mTLS cipher suites via -cluster.tlsCipherSuites command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2404
2022-04-16 16:39:21 +03:00
Aliaksandr Valialkin
54bb8c2bc6 docs/Cluster-VictoriaMetrics.md: update docs after 26ae50ec26 2022-04-16 16:04:58 +03:00
Aliaksandr Valialkin
b49b8020d6 docs: sync docs with the latest changes 2022-04-16 15:59:53 +03:00
Aliaksandr Valialkin
7810375c5f lib/httpserver: move the code, which creates tls.Config, into lib/netutil/tls.go
This syncs the corresponding code with cluster branch
2022-04-16 15:52:36 +03:00
Aliaksandr Valialkin
7e4bdf31ba lib/httpserver: follow up after def0032c7d 2022-04-16 15:27:21 +03:00
Dmytro Kozlov
def0032c7d lib/httpserver: added tlsCipherSuites flag (#2468)
* lib/httpserver: added tlsCipherSuites flag

* lib/httpserver: compare lower case strings

* lib/httpserver: use EqualFold

* lib/httpserver: used flagutil.NewArray, supported only strings cipher suites

* lib/httpserver: updated flag description, added flag to documentation

* Update lib/httpserver/httpserver.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-16 15:07:07 +03:00
Aliaksandr Valialkin
ebaa1c7ad5 lib/promscrape: follow-up after baa1c24b36 2022-04-16 14:25:54 +03:00
Nikolay
baa1c24b36 lib/promscrape: removes omitempty for ScrapeConfig (#2457)
This change fixes incorrect marshalling for ScrapeConfig
it affects http endpoint and ScrapeConfig checksum.

With omitempty, custom Marshaller is not called if field is not a pointer.

Previously this issue happened at vmalert
2022-04-16 13:22:11 +03:00
Ted Robertson
5e9afcceaa Fix typo in bug report template (#2472) 2022-04-16 13:19:36 +03:00
Aliaksandr Valialkin
057c3a745f docs/Cluster-VictoriaMetrics.md: clarify the issue with multi-level cluster, which may result in data gaps at some AZs 2022-04-16 13:04:29 +03:00
Aliaksandr Valialkin
4169b97af9 docs/Cluster-VictoriaMetrics.md: improve docs on cluster setup, resizing and scalability 2022-04-15 15:05:47 +03:00
Aliaksandr Valialkin
b11e9de386 docs/Cluster-VictoriaMetrics.md: mention about possible issues with multi-level cluster setup 2022-04-15 15:05:45 +03:00
Aliaksandr Valialkin
e6535a75f7 docs/CHANGELOG.md: document 45fcaa33e8 2022-04-13 14:12:17 +03:00
Aliaksandr Valialkin
77ffa4e447 docs/CHANGELOG.md: document f7e4c5a628 2022-04-13 14:09:11 +03:00
Aliaksandr Valialkin
28f21f5f17 deployment/docker: update Go builder from go1.18.0 to go1.18.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.18.1+label%3ACherryPickApproved
2022-04-13 14:06:49 +03:00
Dmytro Kozlov
f7e4c5a628 vmctl: Return non zero error code if validation or subcommand fails (#2462) 2022-04-13 11:52:55 +02:00
Roman Khavronenko
45fcaa33e8 vmalert: add DNS service discovery (#2465)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2460
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-13 11:50:26 +03:00
Anton Bystrov
9307fe3c04 Update CHANGELOG.md (#2463)
May be mispint here?
2022-04-13 11:18:42 +03:00
Aliaksandr Valialkin
783eb690a1 snap: update Go builder for snap 2022-04-12 20:24:36 +03:00
3055 changed files with 527669 additions and 138511 deletions

View File

@@ -1,46 +0,0 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
**Describe the bug**
A clear and concise description of what the bug is.
It would be great to [upgrade](https://docs.victoriametrics.com/#how-to-upgrade)
to [the latest available release](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
and verify whether the bug is reproducible there.
It's also recommended to read the [troubleshooting docs](https://docs.victoriametrics.com/#troubleshooting).
**To Reproduce**
Steps to reproduce the behavior.
**Expected behavior**
A clear and concise description of what you expected to happen.
**Logs**
Check if any warnings or errors were logged by VictoriaMetrics components
or components in communication with VictoriaMetrics (e.g. Prometheus, Grafana).
**Screenshots**
If applicable, add screenshots to help explain your problem.
For VictoriaMetrics health-state issues please provide full-length screenshots
of Grafana dashboards if possible:
* [Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/dashboards/10229)
* [Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176)
See how to setup monitoring here:
* [monitoring for single-node VictoriaMetrics](https://docs.victoriametrics.com/#monitoring)
* [montioring for VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#monitoring)
**Version**
The line returned when passing `--version` command line flag to the binary. For example:
```
$ ./victoria-metrics-prod --version
victoria-metrics-20190730-121249-heads-single-node-0-g671d9e55
```
**Used command-line flags**
Please provide the command-line flags used for running VictoriaMetrics and its components.

86
.github/ISSUE_TEMPLATE/bug_report.yml vendored Normal file
View File

@@ -0,0 +1,86 @@
name: Bug report
description: Create a report to help us improve
labels: [bug]
body:
- type: markdown
attributes:
value: |
Before filling a bug report it would be great to [upgrade](https://docs.victoriametrics.com/#how-to-upgrade)
to [the latest available release](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
and verify whether the bug is reproducible there.
It's also recommended to read the [troubleshooting docs](https://docs.victoriametrics.com/Troubleshooting.html) first.
- type: textarea
id: describe-the-bug
attributes:
label: Describe the bug
description: |
A clear and concise description of what the bug is.
placeholder: |
When I do `A` VictoriaMetrics does `B`. I expect it to do `C`.
validations:
required: true
- type: textarea
id: to-reproduce
attributes:
label: To Reproduce
description: |
Steps to reproduce the behavior.
If reproducing an issue requires some specific configuration file, please paste it here.
placeholder: |
Steps to reproduce the behavior.
validations:
required: true
- type: textarea
id: version
attributes:
label: Version
description: |
The line returned when passing `--version` command line flag to the binary. For example:
```
$ ./victoria-metrics-prod --version
victoria-metrics-20190730-121249-heads-single-node-0-g671d9e55
```
validations:
required: true
- type: textarea
id: logs
attributes:
label: Logs
description: |
Check if any warnings or errors were logged by VictoriaMetrics components
or components in communication with VictoriaMetrics (e.g. Prometheus, Grafana).
validations:
required: false
- type: textarea
id: screenshots
attributes:
label: Screenshots
description: |
If applicable, add screenshots to help explain your problem.
For VictoriaMetrics health-state issues please provide full-length screenshots
of Grafana dashboards if possible:
* [Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/grafana/dashboards/10229-victoriametrics/)
* [Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176-victoriametrics-cluster/)
See how to setup monitoring here:
* [monitoring for single-node VictoriaMetrics](https://docs.victoriametrics.com/#monitoring)
* [monitoring for VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#monitoring)
validations:
required: false
- type: textarea
id: flags
attributes:
label: Used command-line flags
description: |
Please provide the command-line flags used for running VictoriaMetrics and its components.
validations:
required: false
- type: textarea
id: additional-info
attributes:
label: Additional information
placeholder: |
Additional information that doesn't fit elsewhere
validations:
required: false

View File

@@ -0,0 +1,5 @@
blank_issues_enabled: true
contact_links:
- name: Ask on Slack
url: https://slack.victoriametrics.com/
about: You can ask for help here!

View File

@@ -1,20 +0,0 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**
A clear and concise description of what you want to happen.
**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.
**Additional context**
Add any other context or screenshots about the feature request here.

View File

@@ -0,0 +1,43 @@
name: Feature request
description: Suggest an idea for this project
labels: [enhancement]
body:
- type: textarea
id: describe-the-problem
attributes:
label: Is your feature request related to a problem? Please describe
description: |
A clear and concise description of what the problem is.
placeholder: |
Ex. I'm always frustrated when [...]
validations:
required: false
- type: textarea
id: describe-the-solution
attributes:
label: Describe the solution you'd like
description: |
A clear and concise description of what you want to happen.
validations:
required: true
- type: textarea
id: alternative-solutions
attributes:
label: Describe alternatives you've considered
description: |
A clear and concise description of any alternative solutions or features you've considered.
placeholder: |
I have tried to do `A`, but that doesn't solve a problem completely.
I have tried to do `A` and `B`, but implementing this would be better.
validations:
required: false
- type: textarea
id: feature-additional-info
attributes:
label: Additional information
description: |
Additional information which you consider helpful for implementing this feature.
placeholder: |
Add any other context or screenshots about the feature request here.
validations:
required: false

View File

@@ -6,6 +6,9 @@ on:
pull_request:
paths:
- 'vendor'
permissions:
contents: read
jobs:
build:
name: Build
@@ -14,7 +17,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@main
with:
go-version: 1.17
go-version: 1.19.4
id: go
- name: Code checkout
uses: actions/checkout@master

76
.github/workflows/codeql-analysis.yml vendored Normal file
View File

@@ -0,0 +1,76 @@
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"
on:
push:
branches: [ master, cluster ]
pull_request:
# The branches below must be a subset of the branches above
branches: [ master, cluster ]
schedule:
- cron: '30 18 * * 2'
jobs:
analyze:
name: Analyze
runs-on: ubuntu-latest
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
language: [ 'go', 'javascript' ]
# CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
# Learn more about CodeQL language support at https://git.io/codeql-language-support
steps:
- name: Checkout repository
uses: actions/checkout@v3
- name: Set up Go
uses: actions/setup-go@v2
with:
go-version: 1.19
if: ${{ matrix.language == 'go' }}
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v2
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# queries: ./path/to/local/query, your-org/your-repo/queries@main
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v2
# Command-line programs to run using the OS shell.
# 📚 https://git.io/JvXDl
# ✏️ If the Autobuild fails above, remove it and uncomment the following three lines
# and modify them (or add more) to build your code if your project
# uses a compiled language
#- run: |
# make bootstrap
# make release
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v2

View File

@@ -1,13 +1,22 @@
name: main
on:
push:
branches:
- master
- cluster
paths-ignore:
- 'docs/**'
- '**.md'
pull_request:
branches:
- master
- cluster
paths-ignore:
- 'docs/**'
- '**.md'
permissions:
contents: read
jobs:
build:
name: Build
@@ -16,7 +25,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@main
with:
go-version: 1.17
go-version: 1.19.4
id: go
- name: Code checkout
uses: actions/checkout@master
@@ -26,8 +35,6 @@ jobs:
make install-errcheck
make install-golangci-lint
- name: Build
env:
GO111MODULE: on
run: |
export PATH=$PATH:$(go env GOPATH)/bin # temporary fix. See https://github.com/actions/setup-go/issues/14
make check-all
@@ -35,30 +42,8 @@ jobs:
make test-full
make test-pure
make test-full-386
make victoria-metrics
make victoria-metrics-pure
make victoria-metrics-arm
make victoria-metrics-arm64
make vmutils
GOOS=freebsd go build -mod=vendor ./app/victoria-metrics
GOOS=freebsd go build -mod=vendor ./app/vmagent
GOOS=freebsd go build -mod=vendor ./app/vmalert
GOOS=freebsd go build -mod=vendor ./app/vmbackup
GOOS=freebsd go build -mod=vendor ./app/vmrestore
GOOS=freebsd go build -mod=vendor ./app/vmctl
GOOS=openbsd go build -mod=vendor ./app/victoria-metrics
GOOS=openbsd go build -mod=vendor ./app/vmagent
GOOS=openbsd go build -mod=vendor ./app/vmalert
GOOS=openbsd go build -mod=vendor ./app/vmbackup
GOOS=openbsd go build -mod=vendor ./app/vmrestore
GOOS=openbsd go build -mod=vendor ./app/vmctl
GOOS=darwin go build -mod=vendor ./app/victoria-metrics
GOOS=darwin go build -mod=vendor ./app/vmagent
GOOS=darwin go build -mod=vendor ./app/vmalert
GOOS=darwin go build -mod=vendor ./app/vmbackup
GOOS=darwin go build -mod=vendor ./app/vmrestore
GOOS=darwin go build -mod=vendor ./app/vmctl
CGO_ENABLED=0 GOOS=windows go build -mod=vendor ./app/vmagent
make victoria-metrics-crossbuild
make vmuitils-crossbuild
- name: Publish coverage
uses: codecov/codecov-action@v3
with:

39
.github/workflows/nightly-build.yml vendored Normal file
View File

@@ -0,0 +1,39 @@
name: nightly-build
on:
schedule:
# Daily at 2:48am
- cron: '48 2 * * *'
permissions:
contents: read
jobs:
build:
name: Build
runs-on: ubuntu-latest
steps:
-
name: Login to Docker Hub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
-
name: Setup Go
uses: actions/setup-go@main
with:
go-version: 1.19.4
id: go
-
name: Setup docker scan
run: |
mkdir -p ~/.docker/cli-plugins && \
curl https://github.com/docker/scan-cli-plugin/releases/latest/download/docker-scan_linux_amd64 -L -s -S -o ~/.docker/cli-plugins/docker-scan &&\
chmod +x ~/.docker/cli-plugins/docker-scan
-
name: Code checkout
uses: actions/checkout@master
-
name: Publish
run: |
LATEST_TAG=nightly PKG_TAG=nightly make publish

View File

@@ -5,8 +5,13 @@ on:
- 'docs/*'
branches:
- master
permissions:
contents: read
jobs:
build:
permissions:
contents: write # for Git to git push
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@master

3
.gitignore vendored
View File

@@ -19,4 +19,5 @@
.DS_store
Gemfile.lock
/_site
_site
_site
*.tmp

15
.golangci.yml Normal file
View File

@@ -0,0 +1,15 @@
run:
timeout: 2m
enable:
- revive
issues:
exclude-rules:
- linters:
- staticcheck
text: "SA(4003|1019|5011):"
linters-settings:
errcheck:
exclude: ./errcheck_excludes.txt

View File

@@ -3,3 +3,4 @@ allowlist:
- MIT
- BSD-3-Clause
- BSD-2-Clause
- ISC

View File

@@ -175,7 +175,7 @@
END OF TERMS AND CONDITIONS
Copyright 2019-2022 VictoriaMetrics, Inc.
Copyright 2019-2023 VictoriaMetrics, Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

363
Makefile
View File

@@ -2,7 +2,8 @@ PKG_PREFIX := github.com/VictoriaMetrics/VictoriaMetrics
DATEINFO_TAG ?= $(shell date -u +'%Y%m%d-%H%M%S')
BUILDINFO_TAG ?= $(shell echo $$(git describe --long --all | tr '/' '-')$$( \
git diff-index --quiet HEAD -- || echo '-dirty-'$$(git diff-index -u HEAD | openssl sha1 | cut -c 10-17)))
git diff-index --quiet HEAD -- || echo '-dirty-'$$(git diff-index -u HEAD | openssl sha1 | cut -d' ' -f2 | cut -c 1-8)))
LATEST_TAG ?= latest
PKG_TAG ?= $(shell git tag -l --points-at HEAD)
ifeq ($(PKG_TAG),)
@@ -13,6 +14,11 @@ GO_BUILDINFO = -X '$(PKG_PREFIX)/lib/buildinfo.Version=$(APP_NAME)-$(DATEINFO_TA
.PHONY: $(MAKECMDGOALS)
include app/*/Makefile
include deployment/*/Makefile
include snap/local/Makefile
include package/release/Makefile
all: \
victoria-metrics-prod \
vmagent-prod \
@@ -22,15 +28,10 @@ all: \
vmrestore-prod \
vmctl-prod
include app/*/Makefile
include deployment/*/Makefile
include snap/local/Makefile
clean:
rm -rf bin/*
publish: \
publish: docker-scan \
publish-victoria-metrics \
publish-vmagent \
publish-vmalert \
@@ -64,21 +65,77 @@ vmutils-pure: \
vmrestore-pure \
vmctl-pure
vmutils-arm64: \
vmagent-arm64 \
vmalert-arm64 \
vmauth-arm64 \
vmbackup-arm64 \
vmrestore-arm64 \
vmctl-arm64
vmutils-linux-amd64: \
vmagent-linux-amd64 \
vmalert-linux-amd64 \
vmauth-linux-amd64 \
vmbackup-linux-amd64 \
vmrestore-linux-amd64 \
vmctl-linux-amd64
vmutils-arm: \
vmagent-arm \
vmalert-arm \
vmauth-arm \
vmbackup-arm \
vmrestore-arm \
vmctl-arm
vmutils-linux-arm64: \
vmagent-linux-arm64 \
vmalert-linux-arm64 \
vmauth-linux-arm64 \
vmbackup-linux-arm64 \
vmrestore-linux-arm64 \
vmctl-linux-arm64
vmutils-linux-arm: \
vmagent-linux-arm \
vmalert-linux-arm \
vmauth-linux-arm \
vmbackup-linux-arm \
vmrestore-linux-arm \
vmctl-linux-arm
vmutils-linux-386: \
vmagent-linux-386 \
vmalert-linux-386 \
vmauth-linux-386 \
vmbackup-linux-386 \
vmrestore-linux-386 \
vmctl-linux-386
vmutils-linux-ppc64le: \
vmagent-linux-ppc64le \
vmalert-linux-ppc64le \
vmauth-linux-ppc64le \
vmbackup-linux-ppc64le \
vmrestore-linux-ppc64le \
vmctl-linux-ppc64le
vmutils-darwin-amd64: \
vmagent-darwin-amd64 \
vmalert-darwin-amd64 \
vmauth-darwin-amd64 \
vmbackup-darwin-amd64 \
vmrestore-darwin-amd64 \
vmctl-darwin-amd64
vmutils-darwin-arm64: \
vmagent-darwin-arm64 \
vmalert-darwin-arm64 \
vmauth-darwin-arm64 \
vmbackup-darwin-arm64 \
vmrestore-darwin-arm64 \
vmctl-darwin-arm64
vmutils-freebsd-amd64: \
vmagent-freebsd-amd64 \
vmalert-freebsd-amd64 \
vmauth-freebsd-amd64 \
vmbackup-freebsd-amd64 \
vmrestore-freebsd-amd64 \
vmctl-freebsd-amd64
vmutils-openbsd-amd64: \
vmagent-openbsd-amd64 \
vmalert-openbsd-amd64 \
vmauth-openbsd-amd64 \
vmbackup-openbsd-amd64 \
vmrestore-openbsd-amd64 \
vmctl-openbsd-amd64
vmutils-windows-amd64: \
vmagent-windows-amd64 \
@@ -86,98 +143,144 @@ vmutils-windows-amd64: \
vmauth-windows-amd64 \
vmctl-windows-amd64
victoria-metrics-crossbuild: \
victoria-metrics-linux-amd64 \
victoria-metrics-linux-arm64 \
victoria-metrics-linux-arm \
victoria-metrics-linux-386 \
victoria-metrics-linux-ppc64le \
victoria-metrics-darwin-amd64 \
victoria-metrics-darwin-arm64 \
victoria-metrics-freebsd-amd64 \
victoria-metrics-openbsd-amd64
vmutils-crossbuild: \
vmutils-linux-amd64 \
vmutils-linux-arm64 \
vmutils-linux-arm \
vmutils-linux-386 \
vmutils-linux-ppc64le \
vmutils-darwin-amd64 \
vmutils-darwin-arm64 \
vmutils-freebsd-amd64 \
vmutils-openbsd-amd64 \
vmutils-windows-amd64
publish-release:
git checkout $(TAG) && $(MAKE) release publish && \
git checkout $(TAG)-cluster && $(MAKE) release publish && \
git checkout $(TAG)-enterprise && $(MAKE) release publish && \
git checkout $(TAG)-enterprise-cluster && $(MAKE) release publish
git checkout $(TAG) && LATEST_TAG=stable $(MAKE) release publish && \
git checkout $(TAG)-cluster && LATEST_TAG=cluster-stable $(MAKE) release publish && \
git checkout $(TAG)-enterprise && LATEST_TAG=enterprise-stable $(MAKE) release publish && \
git checkout $(TAG)-enterprise-cluster && LATEST_TAG=enterprise-cluster-stable $(MAKE) release publish
release: \
release-victoria-metrics \
release-vmutils
release-victoria-metrics: \
release-victoria-metrics-amd64 \
release-victoria-metrics-arm \
release-victoria-metrics-arm64 \
release-victoria-metrics-linux-amd64 \
release-victoria-metrics-linux-arm \
release-victoria-metrics-linux-arm64 \
release-victoria-metrics-darwin-amd64 \
release-victoria-metrics-darwin-arm64
release-victoria-metrics-darwin-arm64 \
release-victoria-metrics-freebsd-amd64 \
release-victoria-metrics-openbsd-amd64
release-victoria-metrics-amd64:
OSARCH=amd64 $(MAKE) release-victoria-metrics-generic
release-victoria-metrics-linux-amd64:
GOOS=linux GOARCH=amd64 $(MAKE) release-victoria-metrics-goos-goarch
release-victoria-metrics-arm:
OSARCH=arm $(MAKE) release-victoria-metrics-generic
release-victoria-metrics-linux-arm:
GOOS=linux GOARCH=arm $(MAKE) release-victoria-metrics-goos-goarch
release-victoria-metrics-arm64:
OSARCH=arm64 $(MAKE) release-victoria-metrics-generic
release-victoria-metrics-linux-arm64:
GOOS=linux GOARCH=arm64 $(MAKE) release-victoria-metrics-goos-goarch
release-victoria-metrics-darwin-amd64:
OSARCH=darwin-amd64 $(MAKE) release-victoria-metrics-generic
GOOS=darwin GOARCH=amd64 $(MAKE) release-victoria-metrics-goos-goarch
release-victoria-metrics-darwin-arm64:
OSARCH=darwin-arm64 $(MAKE) release-victoria-metrics-generic
GOOS=darwin GOARCH=arm64 $(MAKE) release-victoria-metrics-goos-goarch
release-victoria-metrics-generic: victoria-metrics-$(OSARCH)-prod
release-victoria-metrics-freebsd-amd64:
GOOS=freebsd GOARCH=amd64 $(MAKE) release-victoria-metrics-goos-goarch
release-victoria-metrics-openbsd-amd64:
GOOS=openbsd GOARCH=amd64 $(MAKE) release-victoria-metrics-goos-goarch
release-victoria-metrics-goos-goarch: victoria-metrics-$(GOOS)-$(GOARCH)-prod
cd bin && \
tar --transform="flags=r;s|-$(OSARCH)||" -czf victoria-metrics-$(OSARCH)-$(PKG_TAG).tar.gz \
victoria-metrics-$(OSARCH)-prod \
&& sha256sum victoria-metrics-$(OSARCH)-$(PKG_TAG).tar.gz \
victoria-metrics-$(OSARCH)-prod \
| sed s/-$(OSARCH)-prod/-prod/ > victoria-metrics-$(OSARCH)-$(PKG_TAG)_checksums.txt
tar --transform="flags=r;s|-$(GOOS)-$(GOARCH)||" -czf victoria-metrics-$(GOOS)-$(GOARCH)-$(PKG_TAG).tar.gz \
victoria-metrics-$(GOOS)-$(GOARCH)-prod \
&& sha256sum victoria-metrics-$(GOOS)-$(GOARCH)-$(PKG_TAG).tar.gz \
victoria-metrics-$(GOOS)-$(GOARCH)-prod \
| sed s/-$(GOOS)-$(GOARCH)-prod/-prod/ > victoria-metrics-$(GOOS)-$(GOARCH)-$(PKG_TAG)_checksums.txt
cd bin && rm -rf victoria-metrics-$(GOOS)-$(GOARCH)-prod
release-vmutils: \
release-vmutils-amd64 \
release-vmutils-arm64 \
release-vmutils-arm \
release-vmutils-darwin-amd64 \
release-vmutils-linux-amd64 \
release-vmutils-linux-arm64 \
release-vmutils-linux-arm \
release-vmutils-darwin-amd64 \
release-vmutils-darwin-arm64 \
release-vmutils-freebsd-amd64 \
release-vmutils-openbsd-amd64 \
release-vmutils-windows-amd64
release-vmutils-amd64:
OSARCH=amd64 $(MAKE) release-vmutils-generic
release-vmutils-linux-amd64:
GOOS=linux GOARCH=amd64 $(MAKE) release-vmutils-goos-goarch
release-vmutils-arm64:
OSARCH=arm64 $(MAKE) release-vmutils-generic
release-vmutils-linux-arm64:
GOOS=linux GOARCH=arm64 $(MAKE) release-vmutils-goos-goarch
release-vmutils-arm:
OSARCH=arm $(MAKE) release-vmutils-generic
release-vmutils-linux-arm:
GOOS=linux GOARCH=arm $(MAKE) release-vmutils-goos-goarch
release-vmutils-darwin-amd64:
OSARCH=darwin-amd64 $(MAKE) release-vmutils-generic
GOOS=darwin GOARCH=amd64 $(MAKE) release-vmutils-goos-goarch
release-vmutils-darwin-arm64:
OSARCH=darwin-arm64 $(MAKE) release-vmutils-generic
GOOS=darwin GOARCH=arm64 $(MAKE) release-vmutils-goos-goarch
release-vmutils-freebsd-amd64:
GOOS=freebsd GOARCH=amd64 $(MAKE) release-vmutils-goos-goarch
release-vmutils-openbsd-amd64:
GOOS=openbsd GOARCH=amd64 $(MAKE) release-vmutils-goos-goarch
release-vmutils-windows-amd64:
GOARCH=amd64 $(MAKE) release-vmutils-windows-generic
GOARCH=amd64 $(MAKE) release-vmutils-windows-goarch
release-vmutils-generic: \
vmagent-$(OSARCH)-prod \
vmalert-$(OSARCH)-prod \
vmauth-$(OSARCH)-prod \
vmbackup-$(OSARCH)-prod \
vmrestore-$(OSARCH)-prod \
vmctl-$(OSARCH)-prod
release-vmutils-goos-goarch: \
vmagent-$(GOOS)-$(GOARCH)-prod \
vmalert-$(GOOS)-$(GOARCH)-prod \
vmauth-$(GOOS)-$(GOARCH)-prod \
vmbackup-$(GOOS)-$(GOARCH)-prod \
vmrestore-$(GOOS)-$(GOARCH)-prod \
vmctl-$(GOOS)-$(GOARCH)-prod
cd bin && \
tar --transform="flags=r;s|-$(OSARCH)||" -czf vmutils-$(OSARCH)-$(PKG_TAG).tar.gz \
vmagent-$(OSARCH)-prod \
vmalert-$(OSARCH)-prod \
vmauth-$(OSARCH)-prod \
vmbackup-$(OSARCH)-prod \
vmrestore-$(OSARCH)-prod \
vmctl-$(OSARCH)-prod \
&& sha256sum vmutils-$(OSARCH)-$(PKG_TAG).tar.gz \
vmagent-$(OSARCH)-prod \
vmalert-$(OSARCH)-prod \
vmauth-$(OSARCH)-prod \
vmbackup-$(OSARCH)-prod \
vmrestore-$(OSARCH)-prod \
vmctl-$(OSARCH)-prod \
| sed s/-$(OSARCH)-prod/-prod/ > vmutils-$(OSARCH)-$(PKG_TAG)_checksums.txt
tar --transform="flags=r;s|-$(GOOS)-$(GOARCH)||" -czf vmutils-$(GOOS)-$(GOARCH)-$(PKG_TAG).tar.gz \
vmagent-$(GOOS)-$(GOARCH)-prod \
vmalert-$(GOOS)-$(GOARCH)-prod \
vmauth-$(GOOS)-$(GOARCH)-prod \
vmbackup-$(GOOS)-$(GOARCH)-prod \
vmrestore-$(GOOS)-$(GOARCH)-prod \
vmctl-$(GOOS)-$(GOARCH)-prod \
&& sha256sum vmutils-$(GOOS)-$(GOARCH)-$(PKG_TAG).tar.gz \
vmagent-$(GOOS)-$(GOARCH)-prod \
vmalert-$(GOOS)-$(GOARCH)-prod \
vmauth-$(GOOS)-$(GOARCH)-prod \
vmbackup-$(GOOS)-$(GOARCH)-prod \
vmrestore-$(GOOS)-$(GOARCH)-prod \
vmctl-$(GOOS)-$(GOARCH)-prod \
| sed s/-$(GOOS)-$(GOARCH)-prod/-prod/ > vmutils-$(GOOS)-$(GOARCH)-$(PKG_TAG)_checksums.txt
cd bin && rm -rf \
vmagent-$(GOOS)-$(GOARCH)-prod \
vmalert-$(GOOS)-$(GOARCH)-prod \
vmauth-$(GOOS)-$(GOARCH)-prod \
vmbackup-$(GOOS)-$(GOARCH)-prod \
vmrestore-$(GOOS)-$(GOARCH)-prod \
vmctl-$(GOOS)-$(GOARCH)-prod
release-vmutils-windows-generic: \
release-vmutils-windows-goarch: \
vmagent-windows-$(GOARCH)-prod \
vmalert-windows-$(GOARCH)-prod \
vmauth-windows-$(GOARCH)-prod \
@@ -194,112 +297,108 @@ release-vmutils-windows-generic: \
vmauth-windows-$(GOARCH)-prod.exe \
vmctl-windows-$(GOARCH)-prod.exe \
> vmutils-windows-$(GOARCH)-$(PKG_TAG)_checksums.txt
cd bin && rm -rf \
vmagent-windows-$(GOARCH)-prod.exe \
vmalert-windows-$(GOARCH)-prod.exe \
vmauth-windows-$(GOARCH)-prod.exe \
vmctl-windows-$(GOARCH)-prod.exe
pprof-cpu:
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics@ $(PPROF_FILE)
fmt:
GO111MODULE=on gofmt -l -w -s ./lib
GO111MODULE=on gofmt -l -w -s ./app
gofmt -l -w -s ./lib
gofmt -l -w -s ./app
vet:
GO111MODULE=on go vet -mod=vendor ./lib/...
GO111MODULE=on go vet -mod=vendor ./app/...
go vet ./lib/...
go vet ./app/...
lint: install-golint
golint lib/...
golint app/...
install-golint:
which golint || GO111MODULE=off go get golang.org/x/lint/golint
errcheck: install-errcheck
errcheck -exclude=errcheck_excludes.txt ./lib/...
errcheck -exclude=errcheck_excludes.txt ./app/vminsert/...
errcheck -exclude=errcheck_excludes.txt ./app/vmselect/...
errcheck -exclude=errcheck_excludes.txt ./app/vmstorage/...
errcheck -exclude=errcheck_excludes.txt ./app/vmagent/...
errcheck -exclude=errcheck_excludes.txt ./app/vmalert/...
errcheck -exclude=errcheck_excludes.txt ./app/vmauth/...
errcheck -exclude=errcheck_excludes.txt ./app/vmbackup/...
errcheck -exclude=errcheck_excludes.txt ./app/vmrestore/...
errcheck -exclude=errcheck_excludes.txt ./app/vmctl/...
install-errcheck:
which errcheck || GO111MODULE=off go get github.com/kisielk/errcheck
check-all: fmt vet lint errcheck golangci-lint
check-all: fmt vet golangci-lint govulncheck
test:
GO111MODULE=on go test -mod=vendor ./lib/... ./app/...
go test ./lib/... ./app/...
test-race:
GO111MODULE=on go test -mod=vendor -race ./lib/... ./app/...
go test -race ./lib/... ./app/...
test-pure:
GO111MODULE=on CGO_ENABLED=0 go test -mod=vendor ./lib/... ./app/...
CGO_ENABLED=0 go test ./lib/... ./app/...
test-full:
GO111MODULE=on go test -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
go test -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
test-full-386:
GO111MODULE=on GOARCH=386 go test -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
GOARCH=386 go test -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
benchmark:
GO111MODULE=on go test -mod=vendor -bench=. ./lib/...
GO111MODULE=on go test -mod=vendor -bench=. ./app/...
go test -bench=. ./lib/...
go test -bench=. ./app/...
benchmark-pure:
GO111MODULE=on CGO_ENABLED=0 go test -mod=vendor -bench=. ./lib/...
GO111MODULE=on CGO_ENABLED=0 go test -mod=vendor -bench=. ./app/...
CGO_ENABLED=0 go test -bench=. ./lib/...
CGO_ENABLED=0 go test -bench=. ./app/...
vendor-update:
GO111MODULE=on go get -u -d ./lib/...
GO111MODULE=on go get -u -d ./app/...
GO111MODULE=on go mod tidy
GO111MODULE=on go mod vendor
go get -u -d ./lib/...
go get -u -d ./app/...
go mod tidy -compat=1.19
go mod vendor
app-local:
CGO_ENABLED=1 GO111MODULE=on go build $(RACE) -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
CGO_ENABLED=1 go build $(RACE) -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
app-local-pure:
CGO_ENABLED=0 GO111MODULE=on go build $(RACE) -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-pure$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
CGO_ENABLED=0 go build $(RACE) -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-pure$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
app-local-with-goarch:
GO111MODULE=on go build $(RACE) -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-$(GOARCH)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
app-local-goos-goarch:
CGO_ENABLED=$(CGO_ENABLED) GOOS=$(GOOS) GOARCH=$(GOARCH) go build $(RACE) -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-$(GOOS)-$(GOARCH)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
app-local-windows-with-goarch:
CGO_ENABLED=0 GO111MODULE=on go build $(RACE) -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-windows-$(GOARCH)$(RACE).exe $(PKG_PREFIX)/app/$(APP_NAME)
app-local-windows-goarch:
CGO_ENABLED=0 GOOS=windows GOARCH=$(GOARCH) go build $(RACE) -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-windows-$(GOARCH)$(RACE).exe $(PKG_PREFIX)/app/$(APP_NAME)
quicktemplate-gen: install-qtc
qtc
install-qtc:
which qtc || GO111MODULE=off go get github.com/valyala/quicktemplate/qtc
which qtc || go install github.com/valyala/quicktemplate/qtc@latest
golangci-lint: install-golangci-lint
golangci-lint run --exclude '(SA4003|SA1019|SA5011):' -D errcheck -D structcheck --timeout 2m
golangci-lint run
install-golangci-lint:
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.45.1
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.50.1
govulncheck: install-govulncheck
govulncheck ./...
install-govulncheck:
which govulncheck || go install golang.org/x/vuln/cmd/govulncheck@latest
install-wwhrd:
which wwhrd || GO111MODULE=off go get github.com/frapposelli/wwhrd
which wwhrd || go install github.com/frapposelli/wwhrd@latest
check-licenses: install-wwhrd
wwhrd check -f .wwhrd.yml
copy-docs:
echo "---\nsort: ${ORDER}\n---\n" > ${DST}
echo '' > ${DST}
@if [ ${ORDER} -ne 0 ]; then \
echo "---\nsort: ${ORDER}\n---\n" > ${DST}; \
fi
cat ${SRC} >> ${DST}
sed -i='.tmp' 's/<img src=\"docs\//<img src=\"/' ${DST}
rm -rf docs/*.tmp
# Copies docs for all components and adds the order tag.
# For ORDER=0 it adds no order tag.
# Images starting with <img src="docs/ are replaced with <img src="
# Cluster docs are supposed to be ordered as 9th.
# For The rest of docs is ordered manually.t
# The rest of docs is ordered manually.
docs-sync:
cp README.md docs/README.md
SRC=README.md DST=docs/README.md ORDER=0 $(MAKE) copy-docs
SRC=README.md DST=docs/Single-server-VictoriaMetrics.md ORDER=1 $(MAKE) copy-docs
SRC=app/vmagent/README.md DST=docs/vmagent.md ORDER=3 $(MAKE) copy-docs
SRC=app/vmalert/README.md DST=docs/vmalert.md ORDER=4 $(MAKE) copy-docs

1208
README.md

File diff suppressed because it is too large Load Diff

14
SECURITY.md Normal file
View File

@@ -0,0 +1,14 @@
# Security Policy
## Supported Versions
| Version | Supported |
|---------|--------------------|
| 1.81.x | :white_check_mark: |
| 1.80.x | :x: |
| 1.79.x | :white_check_mark: |
| < 1.78 | :x: |
## Reporting a Vulnerability
Please report any security issues to security@victoriametrics.com

View File

@@ -12,20 +12,20 @@ victoria-metrics-prod:
victoria-metrics-pure-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-pure
victoria-metrics-amd64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-amd64
victoria-metrics-linux-amd64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-linux-amd64
victoria-metrics-arm-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-arm
victoria-metrics-linux-arm-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-linux-arm
victoria-metrics-arm64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-arm64
victoria-metrics-linux-arm64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-linux-arm64
victoria-metrics-ppc64le-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-ppc64le
victoria-metrics-linux-ppc64le-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-linux-ppc64le
victoria-metrics-386-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-386
victoria-metrics-linux-386-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-linux-386
victoria-metrics-darwin-amd64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-darwin-amd64
@@ -33,6 +33,12 @@ victoria-metrics-darwin-amd64-prod:
victoria-metrics-darwin-arm64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-darwin-arm64
victoria-metrics-freebsd-amd64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-freebsd-amd64
victoria-metrics-openbsd-amd64-prod:
APP_NAME=victoria-metrics $(MAKE) app-via-docker-openbsd-amd64
package-victoria-metrics:
APP_NAME=victoria-metrics $(MAKE) package-via-docker
@@ -64,60 +70,68 @@ run-victoria-metrics:
ARGS='-graphiteListenAddr=:2003 -opentsdbListenAddr=:4242 -retentionPeriod=12 -search.maxUniqueTimeseries=1000000 -search.maxQueryDuration=10m' \
$(MAKE) run-via-docker
victoria-metrics-amd64:
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-amd64 ./app/victoria-metrics
victoria-metrics-linux-amd64:
APP_NAME=victoria-metrics CGO_ENABLED=1 GOOS=linux GOARCH=amd64 $(MAKE) app-local-goos-goarch
victoria-metrics-arm:
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-arm ./app/victoria-metrics
victoria-metrics-linux-arm:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=linux GOARCH=arm $(MAKE) app-local-goos-goarch
victoria-metrics-arm64:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-arm64 ./app/victoria-metrics
victoria-metrics-linux-arm64:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=linux GOARCH=arm64 $(MAKE) app-local-goos-goarch
victoria-metrics-ppc64le:
CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-ppc64le ./app/victoria-metrics
victoria-metrics-linux-ppc64le:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
victoria-metrics-386:
CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-386 ./app/victoria-metrics
victoria-metrics-linux-386:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch
victoria-metrics-darwin-amd64:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 $(MAKE) app-local-goos-goarch
victoria-metrics-darwin-arm64:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 $(MAKE) app-local-goos-goarch
victoria-metrics-freebsd-amd64:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=freebsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
victoria-metrics-openbsd-amd64:
APP_NAME=victoria-metrics CGO_ENABLED=0 GOOS=openbsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
victoria-metrics-pure:
APP_NAME=victoria-metrics $(MAKE) app-local-pure
### Packaging as DEB - amd64
victoria-metrics-package-deb: victoria-metrics-prod
victoria-metrics-package-deb-amd64: victoria-metrics-linux-amd64-prod
./package/package_deb.sh amd64
### Packaging as DEB - arm64
victoria-metrics-package-deb-arm64: victoria-metrics-arm64-prod
victoria-metrics-package-deb-arm: victoria-metrics-linux-arm-prod
./package/package_deb.sh arm
### Packaging as DEB - arm64
victoria-metrics-package-deb-arm64: victoria-metrics-linux-arm64-prod
./package/package_deb.sh arm64
### Packaging as DEB - all
victoria-metrics-package-deb-all: \
victoria-metrics-package-deb \
victoria-metrics-package-deb: \
victoria-metrics-package-deb-amd64 \
victoria-metrics-package-deb-arm \
victoria-metrics-package-deb-arm64
### Packaging as RPM - amd64
victoria-metrics-package-rpm: victoria-metrics-prod
victoria-metrics-package-rpm-amd64: victoria-metrics-linux-amd64-prod
./package/package_rpm.sh amd64
### Packaging as RPM - arm64
victoria-metrics-package-rpm-arm64: victoria-metrics-arm64-prod
victoria-metrics-package-rpm-arm64: victoria-metrics-linux-arm64-prod
./package/package_rpm.sh arm64
### Packaging as RPM - all
victoria-metrics-package-rpm-all: \
victoria-metrics-package-rpm \
victoria-metrics-package-rpm: \
victoria-metrics-package-rpm-amd64 \
victoria-metrics-package-rpm-arm64
### Packaging as both DEB and RPM - all
victoria-metrics-package-deb-rpm-all: \
victoria-metrics-package-deb-rpm: \
victoria-metrics-package-deb \
victoria-metrics-package-deb-arm64 \
victoria-metrics-package-rpm \
victoria-metrics-package-rpm-arm64
### Packaging as snap
victoria-metrics-package-snap:
which snapcraft || snap install snapcraft
which multipass || snap install multipass
snapcraft
victoria-metrics-package-rpm

View File

@@ -19,15 +19,20 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/pushmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
)
var (
httpListenAddr = flag.String("httpListenAddr", ":8428", "TCP address to listen for http connections")
minScrapeInterval = flag.Duration("dedup.minScrapeInterval", 0, "Leave only the first sample in every time series per each discrete interval "+
minScrapeInterval = flag.Duration("dedup.minScrapeInterval", 0, "Leave only the last sample in every time series per each discrete interval "+
"equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling")
dryRun = flag.Bool("dryRun", false, "Whether to check only -promscrape.config and then exit. "+
"Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag")
inmemoryDataFlushInterval = flag.Duration("inmemoryDataFlushInterval", 5*time.Second, "The interval for guaranteed saving of in-memory data to disk. "+
"The saved data survives unclean shutdown such as OOM crash, hardware reset, SIGKILL, etc. "+
"Bigger intervals may help increasing lifetime of flash storage with limited write cycles (e.g. Raspberry PI). "+
"Smaller intervals increase disk IO load. Minimum supported value is 1s")
)
func main() {
@@ -37,6 +42,7 @@ func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
if promscrape.IsDryRun() {
*dryRun = true
@@ -52,6 +58,7 @@ func main() {
logger.Infof("starting VictoriaMetrics at %q...", *httpListenAddr)
startTime := time.Now()
storage.SetDedupInterval(*minScrapeInterval)
storage.SetDataFlushInterval(*inmemoryDataFlushInterval)
vmstorage.Init(promql.ResetRollupResultCacheIfNeeded)
vmselect.Init()
vminsert.Init()
@@ -92,7 +99,10 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
fmt.Fprintf(w, "Useful endpoints:</br>")
httpserver.WriteAPIHelp(w, [][2]string{
{"vmui", "Web UI"},
{"targets", "discovered targets list"},
{"targets", "status for discovered active targets"},
{"service-discovery", "labels before and after relabeling for discovered targets"},
{"metric-relabel-debug", "debug metric relabeling"},
{"expand-with-exprs", "WITH expressions' tutorial"},
{"api/v1/targets", "advanced information about discovered targets in JSON format"},
{"config", "-promscrape.config contents"},
{"metrics", "available service metrics"},

View File

@@ -6,7 +6,6 @@ import (
"flag"
"fmt"
"io"
"io/ioutil"
"log"
"net"
"net/http"
@@ -139,7 +138,7 @@ func setUp() {
if err != nil {
return false
}
resp.Body.Close()
_ = resp.Body.Close()
return resp.StatusCode == 200
}
if err := waitFor(testStorageInitTimeout, readyStorageCheckFunc); err != nil {
@@ -262,7 +261,7 @@ func testRead(t *testing.T) {
for _, q := range test.Query {
q = testutil.PopulateTimeTplString(q, insertionTime)
if test.Issue != "" {
test.Issue = "Regression in " + test.Issue
test.Issue = "\nRegression in " + test.Issue
}
switch true {
case strings.HasPrefix(q, "/api/v1/export"):
@@ -285,7 +284,7 @@ func testRead(t *testing.T) {
queryResult := Query{}
httpReadStruct(t, testReadHTTPPath, q, &queryResult)
if err := checkQueryResult(queryResult, test.ResultQuery); err != nil {
t.Fatalf("Query. %s fails with error %s.%s", q, err, test.Issue)
t.Fatalf("Query. %s fails with error: %s.%s", q, err, test.Issue)
}
default:
t.Fatalf("unsupported read query %s", q)
@@ -308,7 +307,7 @@ func readIn(readFor string, t *testing.T, insertTime time.Time) []test {
if filepath.Ext(path) != ".json" {
return nil
}
b, err := ioutil.ReadFile(path)
b, err := os.ReadFile(path)
s.noError(err)
item := test{}
s.noError(json.Unmarshal(b, &item))
@@ -338,7 +337,9 @@ func tcpWrite(t *testing.T, address string, data string) {
s := newSuite(t)
conn, err := net.Dial("tcp", address)
s.noError(err)
defer conn.Close()
defer func() {
_ = conn.Close()
}()
n, err := conn.Write([]byte(data))
s.noError(err)
s.equalInt(n, len(data))
@@ -349,7 +350,9 @@ func httpReadMetrics(t *testing.T, address, query string) []Metric {
s := newSuite(t)
resp, err := http.Get(address + query)
s.noError(err)
defer resp.Body.Close()
defer func() {
_ = resp.Body.Close()
}()
s.equalInt(resp.StatusCode, 200)
var rows []Metric
for dec := json.NewDecoder(resp.Body); dec.More(); {
@@ -364,7 +367,9 @@ func httpReadStruct(t *testing.T, address, query string, dst interface{}) {
s := newSuite(t)
resp, err := http.Get(address + query)
s.noError(err)
defer resp.Body.Close()
defer func() {
_ = resp.Body.Close()
}()
s.equalInt(resp.StatusCode, 200)
s.noError(json.NewDecoder(resp.Body).Decode(dst))
}

View File

@@ -9,4 +9,4 @@ COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certifica
EXPOSE 8428
ENTRYPOINT ["/victoria-metrics-prod"]
ARG TARGETARCH
COPY victoria-metrics-${TARGETARCH}-prod ./victoria-metrics-prod
COPY victoria-metrics-linux-${TARGETARCH}-prod ./victoria-metrics-prod

View File

@@ -6,8 +6,8 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/appmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
@@ -60,7 +60,7 @@ func selfScraper(scrapeInterval time.Duration) {
currentTimestamp = currentTime.UnixNano() / 1e6
}
bb.Reset()
httpserver.WritePrometheusMetrics(&bb)
appmetrics.WritePrometheusMetrics(&bb)
s := bytesutil.ToUnsafeString(bb.B)
rows.Reset()
rows.Unmarshal(s)
@@ -85,7 +85,9 @@ func selfScraper(scrapeInterval time.Duration) {
mr.Timestamp = currentTimestamp
mr.Value = r.Value
}
vmstorage.AddRows(mrs)
if err := vmstorage.AddRows(mrs); err != nil {
logger.Errorf("cannot store self-scraped metrics: %s", err)
}
}
}

View File

@@ -12,7 +12,7 @@ func TestPopulateTimeTplString(t *testing.T) {
}
f := func(s, resultExpected string) {
t.Helper()
result := PopulateTimeTplString(s, now)
result := PopulateTimeTplString(s, now.UTC())
if result != resultExpected {
t.Fatalf("unexpected result; got %q; want %q", result, resultExpected)
}

View File

@@ -6,7 +6,7 @@
"forms_daily_count;item=x 2 {TIME_S-2m}",
"forms_daily_count;item=y 3 {TIME_S-1m}",
"forms_daily_count;item=y 4 {TIME_S-2m}"],
"query": ["/api/v1/query?query=min%20by%20(item)%20(min_over_time(forms_daily_count[10m:1m]))&time={TIME_S-1m}"],
"query": ["/api/v1/query?query=min%20by%20(item)%20(min_over_time(forms_daily_count[10m:1m]))&time={TIME_S-1m}&latency_offset=1ms"],
"result_query": {
"status":"success",
"data":{"resultType":"vector","result":[{"metric":{"item":"x"},"value":["{TIME_S-1m}","2"]},{"metric":{"item":"y"},"value":["{TIME_S-1m}","4"]}]}

View File

@@ -12,20 +12,20 @@ vmagent-prod:
vmagent-pure-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-pure
vmagent-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-amd64
vmagent-linux-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-linux-amd64
vmagent-arm-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-arm
vmagent-linux-arm-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-linux-arm
vmagent-arm64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-arm64
vmagent-linux-arm64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-linux-arm64
vmagent-ppc64le-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-ppc64le
vmagent-linux-ppc64le-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-linux-ppc64le
vmagent-386-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-386
vmagent-linux-386-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-linux-386
vmagent-darwin-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-darwin-amd64
@@ -33,6 +33,12 @@ vmagent-darwin-amd64-prod:
vmagent-darwin-arm64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-darwin-arm64
vmagent-freebsd-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-freebsd-amd64
vmagent-openbsd-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-openbsd-amd64
vmagent-windows-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-windows-amd64
@@ -67,26 +73,35 @@ run-vmagent:
APP_NAME=vmagent \
$(MAKE) run-via-docker
vmagent-amd64:
CGO_ENABLED=1 GOARCH=amd64 $(MAKE) vmagent-local-with-goarch
vmagent-linux-amd64:
APP_NAME=vmagent CGO_ENABLED=1 GOOS=linux GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmagent-arm:
CGO_ENABLED=0 GOARCH=arm $(MAKE) vmagent-local-with-goarch
vmagent-linux-arm:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=linux GOARCH=arm $(MAKE) app-local-goos-goarch
vmagent-arm64:
CGO_ENABLED=0 GOARCH=arm64 $(MAKE) vmagent-local-with-goarch
vmagent-linux-arm64:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=linux GOARCH=arm64 $(MAKE) app-local-goos-goarch
vmagent-ppc64le:
CGO_ENABLED=0 GOARCH=ppc64le $(MAKE) vmagent-local-with-goarch
vmagent-linux-ppc64le:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
vmagent-386:
CGO_ENABLED=0 GOARCH=386 $(MAKE) vmagent-local-with-goarch
vmagent-linux-386:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch
vmagent-local-with-goarch:
APP_NAME=vmagent $(MAKE) app-local-with-goarch
vmagent-darwin-amd64:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmagent-darwin-arm64:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 $(MAKE) app-local-goos-goarch
vmagent-freebsd-amd64:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=freebsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmagent-openbsd-amd64:
APP_NAME=vmagent CGO_ENABLED=0 GOOS=openbsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmagent-windows-amd64:
GOARCH=amd64 APP_NAME=vmagent $(MAKE) app-local-windows-goarch
vmagent-pure:
APP_NAME=vmagent $(MAKE) app-local-pure
vmagent-windows-amd64:
GOARCH=amd64 APP_NAME=vmagent $(MAKE) app-local-windows-with-goarch

File diff suppressed because it is too large Load Diff

View File

@@ -10,7 +10,6 @@ import (
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -26,10 +25,8 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
})
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
})
}
@@ -67,7 +64,7 @@ func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.L
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
remotewrite.Push(at, &ctx.WriteRequest)
rowsInserted.Add(len(rows))
if at != nil {
rowsTenantInserted.Get(at).Add(len(rows))

View File

@@ -1,9 +1,7 @@
package datadog
import (
"fmt"
"net/http"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
@@ -12,7 +10,6 @@ import (
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadog"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -30,11 +27,9 @@ func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
ce := req.Header.Get("Content-Encoding")
return parser.ParseStream(req.Body, ce, func(series []parser.Series) error {
return insertRows(at, series, extraLabels)
})
ce := req.Header.Get("Content-Encoding")
return parser.ParseStream(req.Body, ce, func(series []parser.Series) error {
return insertRows(at, series, extraLabels)
})
}
@@ -54,17 +49,20 @@ func insertRows(at *auth.Token, series []parser.Series, extraLabels []prompbmars
Name: "__name__",
Value: ss.Metric,
})
labels = append(labels, prompbmarshal.Label{
Name: "host",
Value: ss.Host,
})
if ss.Host != "" {
labels = append(labels, prompbmarshal.Label{
Name: "host",
Value: ss.Host,
})
}
if ss.Device != "" {
labels = append(labels, prompbmarshal.Label{
Name: "device",
Value: ss.Device,
})
}
for _, tag := range ss.Tags {
n := strings.IndexByte(tag, ':')
if n < 0 {
return fmt.Errorf("cannot find ':' in tag %q", tag)
}
name := tag[:n]
value := tag[n+1:]
name, value := parser.SplitTag(tag)
if name == "host" {
name = "exported_host"
}
@@ -89,7 +87,7 @@ func insertRows(at *auth.Token, series []parser.Series, extraLabels []prompbmars
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
remotewrite.Push(at, &ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)

View File

@@ -7,7 +7,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -20,9 +19,7 @@ var (
//
// See https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
func InsertHandler(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, insertRows)
})
return parser.ParseStream(r, insertRows)
}
func insertRows(rows []parser.Row) error {
@@ -58,7 +55,7 @@ func insertRows(rows []parser.Row) error {
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
remotewrite.Push(nil, &ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil

View File

@@ -16,7 +16,6 @@ import (
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -37,10 +36,8 @@ var (
//
// See https://github.com/influxdata/telegraf/tree/master/plugins/inputs/socket_listener/
func InsertHandlerForReader(r io.Reader, isGzipped bool) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, isGzipped, "", "", func(db string, rows []parser.Row) error {
return insertRows(nil, db, rows, nil)
})
return parser.ParseStream(r, isGzipped, "", "", func(db string, rows []parser.Row) error {
return insertRows(nil, db, rows, nil)
})
}
@@ -52,15 +49,13 @@ func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
q := req.URL.Query()
precision := q.Get("precision")
// Read db tag from https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint
db := q.Get("db")
return parser.ParseStream(req.Body, isGzipped, precision, db, func(db string, rows []parser.Row) error {
return insertRows(at, db, rows, extraLabels)
})
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
q := req.URL.Query()
precision := q.Get("precision")
// Read db tag from https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint
db := q.Get("db")
return parser.ParseStream(req.Body, isGzipped, precision, db, func(db string, rows []parser.Row) error {
return insertRows(at, db, rows, extraLabels)
})
}
@@ -134,7 +129,7 @@ func insertRows(at *auth.Token, db string, rows []parser.Row, extraLabels []prom
ctx.ctx.Labels = labels
ctx.ctx.Samples = samples
ctx.commonLabels = commonLabels
remotewrite.PushWithAuthToken(at, &ctx.ctx.WriteRequest)
remotewrite.Push(at, &ctx.ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)

View File

@@ -1,6 +1,7 @@
package main
import (
"embed"
"flag"
"fmt"
"io"
@@ -23,6 +24,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
@@ -35,7 +37,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/pushmetrics"
"github.com/VictoriaMetrics/metrics"
)
@@ -43,7 +45,7 @@ var (
httpListenAddr = flag.String("httpListenAddr", ":8429", "TCP address to listen for http connections. "+
"Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. "+
"Note that /targets and /metrics pages aren't available if -httpListenAddr=''")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. "+
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. "+
"This flag isn't needed when ingesting data over HTTP - just send it to http://<vmagent>:8429/write")
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
@@ -63,6 +65,12 @@ var (
opentsdbhttpServer *opentsdbhttpserver.Server
)
var (
//go:embed static
staticFiles embed.FS
staticServer = http.FileServer(http.FS(staticFiles))
)
func main() {
// Write flags and help message to stdout, since it is easier to grep or pipe.
flag.CommandLine.SetOutput(os.Stdout)
@@ -71,6 +79,7 @@ func main() {
remotewrite.InitSecretFlags()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
if promscrape.IsDryRun() {
if err := promscrape.CheckConfig(); err != nil {
@@ -94,7 +103,6 @@ func main() {
startTime := time.Now()
remotewrite.Init()
common.StartUnmarshalWorkers()
writeconcurrencylimiter.Init()
if len(*influxListenAddr) > 0 {
influxServer = influxserver.MustStart(*influxListenAddr, func(r io.Reader) error {
return influx.InsertHandlerForReader(r, false)
@@ -104,10 +112,12 @@ func main() {
graphiteServer = graphiteserver.MustStart(*graphiteListenAddr, graphite.InsertHandler)
}
if len(*opentsdbListenAddr) > 0 {
opentsdbServer = opentsdbserver.MustStart(*opentsdbListenAddr, opentsdb.InsertHandler, opentsdbhttp.InsertHandler)
httpInsertHandler := getOpenTSDBHTTPInsertHandler()
opentsdbServer = opentsdbserver.MustStart(*opentsdbListenAddr, opentsdb.InsertHandler, httpInsertHandler)
}
if len(*opentsdbHTTPListenAddr) > 0 {
opentsdbhttpServer = opentsdbhttpserver.MustStart(*opentsdbHTTPListenAddr, opentsdbhttp.InsertHandler)
httpInsertHandler := getOpenTSDBHTTPInsertHandler()
opentsdbhttpServer = opentsdbhttpserver.MustStart(*opentsdbHTTPListenAddr, httpInsertHandler)
}
promscrape.Init(remotewrite.Push)
@@ -149,6 +159,40 @@ func main() {
logger.Infof("successfully stopped vmagent in %.3f seconds", time.Since(startTime).Seconds())
}
func getOpenTSDBHTTPInsertHandler() func(req *http.Request) error {
if !remotewrite.MultitenancyEnabled() {
return func(req *http.Request) error {
path := strings.Replace(req.URL.Path, "//", "/", -1)
if path != "/api/put" {
return fmt.Errorf("unsupported path requested: %q; expecting '/api/put'", path)
}
return opentsdbhttp.InsertHandler(nil, req)
}
}
return func(req *http.Request) error {
path := strings.Replace(req.URL.Path, "//", "/", -1)
at, err := getAuthTokenFromPath(path)
if err != nil {
return fmt.Errorf("cannot obtain auth token from path %q: %w", path, err)
}
return opentsdbhttp.InsertHandler(at, req)
}
}
func getAuthTokenFromPath(path string) (*auth.Token, error) {
p, err := httpserver.ParsePath(path)
if err != nil {
return nil, fmt.Errorf("cannot parse multitenant path: %w", err)
}
if p.Prefix != "insert" {
return nil, fmt.Errorf(`unsupported multitenant prefix: %q; expected "insert"`, p.Prefix)
}
if p.Suffix != "opentsdb/api/put" {
return nil, fmt.Errorf("unsupported path requested: %q; expecting 'opentsdb/api/put'", p.Suffix)
}
return auth.NewToken(p.AuthToken)
}
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.URL.Path == "/" {
if r.Method != "GET" {
@@ -159,7 +203,9 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
fmt.Fprintf(w, "See docs at <a href='https://docs.victoriametrics.com/vmagent.html'>https://docs.victoriametrics.com/vmagent.html</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
httpserver.WriteAPIHelp(w, [][2]string{
{"targets", "discovered targets list"},
{"targets", "status for discovered active targets"},
{"service-discovery", "labels before and after relabeling for discovered targets"},
{"metric-relabel-debug", "debug metric relabeling"},
{"api/v1/targets", "advanced information about discovered targets in JSON format"},
{"config", "-promscrape.config contents"},
{"metrics", "available service metrics"},
@@ -170,35 +216,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
path := strings.Replace(r.URL.Path, "//", "/", -1)
switch path {
case "/api/v1/write":
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(nil, r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(nil, r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/csv":
csvimportRequests.Inc()
if err := csvimport.InsertHandler(nil, r); err != nil {
csvimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/prometheus":
if strings.HasPrefix(path, "/prometheus/api/v1/import/prometheus") || strings.HasPrefix(path, "/api/v1/import/prometheus") {
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(nil, r); err != nil {
prometheusimportErrors.Inc()
@@ -207,7 +225,41 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/native":
}
if strings.HasPrefix(path, "datadog/") {
// Trim suffix from paths starting from /datadog/ in order to support legacy DataDog agent.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2670
path = strings.TrimSuffix(path, "/")
}
switch path {
case "/prometheus/api/v1/write", "/api/v1/write":
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(nil, r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/prometheus/api/v1/import", "/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(nil, r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/prometheus/api/v1/import/csv", "/api/v1/import/csv":
csvimportRequests.Inc()
if err := csvimport.InsertHandler(nil, r); err != nil {
csvimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/prometheus/api/v1/import/native", "/api/v1/import/native":
nativeimportRequests.Inc()
if err := native.InsertHandler(nil, r); err != nil {
nativeimportErrors.Inc()
@@ -216,7 +268,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.WriteHeader(http.StatusNoContent)
return true
case "/write", "/api/v2/write":
case "/influx/write", "/influx/api/v2/write", "/write", "/api/v2/write":
influxWriteRequests.Inc()
if err := influx.InsertHandlerForHTTP(nil, r); err != nil {
influxWriteErrors.Inc()
@@ -225,7 +277,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
w.WriteHeader(http.StatusNoContent)
return true
case "/query":
case "/influx/query", "/query":
influxQueryRequests.Inc()
influxutils.WriteDatabaseNames(w)
return true
@@ -254,16 +306,39 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "/datadog/intake/":
case "/datadog/intake":
datadogIntakeRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{}`)
return true
case "/targets":
case "/datadog/api/v1/metadata":
datadogMetadataRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{}`)
return true
case "/prometheus/targets", "/targets":
promscrapeTargetsRequests.Inc()
promscrape.WriteHumanReadableTargetsStatus(w, r)
return true
case "/target_response":
case "/prometheus/service-discovery", "/service-discovery":
promscrapeServiceDiscoveryRequests.Inc()
promscrape.WriteServiceDiscovery(w, r)
return true
case "/prometheus/metric-relabel-debug", "/metric-relabel-debug":
promscrapeMetricRelabelDebugRequests.Inc()
promscrape.WriteMetricRelabelDebug(w, r)
return true
case "/prometheus/target-relabel-debug", "/target-relabel-debug":
promscrapeTargetRelabelDebugRequests.Inc()
promscrape.WriteTargetRelabelDebug(w, r)
return true
case "/prometheus/api/v1/targets", "/api/v1/targets":
promscrapeAPIV1TargetsRequests.Inc()
w.Header().Set("Content-Type", "application/json")
state := r.FormValue("state")
promscrape.WriteAPIV1Targets(w, state)
return true
case "/prometheus/target_response", "/target_response":
promscrapeTargetResponseRequests.Inc()
if err := promscrape.WriteTargetResponse(w, r); err != nil {
promscrapeTargetResponseErrors.Inc()
@@ -271,26 +346,26 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
}
return true
case "/config":
if *configAuthKey != "" && r.FormValue("authKey") != *configAuthKey {
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("The provided authKey doesn't match -configAuthKey"),
StatusCode: http.StatusUnauthorized,
}
httpserver.Errorf(w, r, "%s", err)
case "/prometheus/config", "/config":
if !httpserver.CheckAuthFlag(w, r, *configAuthKey, "configAuthKey") {
return true
}
promscrapeConfigRequests.Inc()
w.Header().Set("Content-Type", "text/plain; charset=utf-8")
promscrape.WriteConfigData(w)
return true
case "/api/v1/targets":
promscrapeAPIV1TargetsRequests.Inc()
case "/prometheus/api/v1/status/config", "/api/v1/status/config":
// See https://prometheus.io/docs/prometheus/latest/querying/api/#config
if !httpserver.CheckAuthFlag(w, r, *configAuthKey, "configAuthKey") {
return true
}
promscrapeStatusConfigRequests.Inc()
w.Header().Set("Content-Type", "application/json")
state := r.FormValue("state")
promscrape.WriteAPIV1Targets(w, state)
var bb bytesutil.ByteBuffer
promscrape.WriteConfigData(&bb)
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%q}}`, bb.B)
return true
case "/-/reload":
case "/prometheus/-/reload", "/-/reload":
promscrapeConfigReloadRequests.Inc()
procutil.SelfSIGHUP()
w.WriteHeader(http.StatusOK)
@@ -305,11 +380,16 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
w.Write([]byte("OK"))
}
return true
default:
if strings.HasPrefix(r.URL.Path, "/static") {
staticServer.ServeHTTP(w, r)
return true
}
if remotewrite.MultitenancyEnabled() {
return processMultitenantRequest(w, r, path)
}
return false
}
if remotewrite.MultitenancyEnabled() {
return processMultitenantRequest(w, r, path)
}
return false
}
func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path string) bool {
@@ -327,6 +407,21 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
httpserver.Errorf(w, r, "cannot obtain auth token: %s", err)
return true
}
if strings.HasPrefix(p.Suffix, "prometheus/api/v1/import/prometheus") {
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(at, r); err != nil {
prometheusimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
}
if strings.HasPrefix(p.Suffix, "datadog/") {
// Trim suffix from paths starting from /datadog/ in order to support legacy DataDog agent.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2670
p.Suffix = strings.TrimSuffix(p.Suffix, "/")
}
switch p.Suffix {
case "prometheus/", "prometheus", "prometheus/api/v1/write":
prometheusWriteRequests.Inc()
@@ -355,15 +450,6 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
}
w.WriteHeader(http.StatusNoContent)
return true
case "prometheus/api/v1/import/prometheus":
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(at, r); err != nil {
prometheusimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "prometheus/api/v1/import/native":
nativeimportRequests.Inc()
if err := native.InsertHandler(at, r); err != nil {
@@ -410,11 +496,16 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "datadog/intake/":
case "datadog/intake":
datadogIntakeRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{}`)
return true
case "datadog/api/v1/metadata":
datadogMetadataRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{}`)
return true
default:
httpserver.Errorf(w, r, "unsupported multitenant path suffix: %q", p.Suffix)
return true
@@ -447,15 +538,22 @@ var (
datadogValidateRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/validate", protocol="datadog"}`)
datadogCheckRunRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/check_run", protocol="datadog"}`)
datadogIntakeRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/intake/", protocol="datadog"}`)
datadogIntakeRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/intake", protocol="datadog"}`)
datadogMetadataRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/metadata", protocol="datadog"}`)
promscrapeTargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/targets"}`)
promscrapeServiceDiscoveryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/service-discovery"}`)
promscrapeMetricRelabelDebugRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/metric-relabel-debug"}`)
promscrapeTargetRelabelDebugRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/target-relabel-debug"}`)
promscrapeTargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/targets"}`)
promscrapeAPIV1TargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/targets"}`)
promscrapeTargetResponseRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/target_response"}`)
promscrapeTargetResponseErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/target_response"}`)
promscrapeConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/config"}`)
promscrapeConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/config"}`)
promscrapeStatusConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/status/config"}`)
promscrapeConfigReloadRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/-/reload"}`)
)

View File

@@ -9,4 +9,4 @@ COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certifica
EXPOSE 8429
ENTRYPOINT ["/vmagent-prod"]
ARG TARGETARCH
COPY vmagent-${TARGETARCH}-prod ./vmagent-prod
COPY vmagent-linux-${TARGETARCH}-prod ./vmagent-prod

View File

@@ -12,7 +12,6 @@ import (
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/native"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -31,10 +30,8 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return err
}
isGzip := req.Header.Get("Content-Encoding") == "gzip"
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req.Body, isGzip, func(block *parser.Block) error {
return insertRows(at, block, extraLabels)
})
return parser.ParseStream(req.Body, isGzip, func(block *parser.Block) error {
return insertRows(at, block, extraLabels)
})
}
@@ -87,6 +84,6 @@ func insertRows(at *auth.Token, block *parser.Block, extraLabels []prompbmarshal
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
remotewrite.Push(at, &ctx.WriteRequest)
return nil
}

View File

@@ -7,7 +7,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -20,9 +19,7 @@ var (
//
// See http://opentsdb.net/docs/build/html/api_telnet/put.html
func InsertHandler(r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, insertRows)
})
return parser.ParseStream(r, insertRows)
}
func insertRows(rows []parser.Row) error {
@@ -58,7 +55,7 @@ func insertRows(rows []parser.Row) error {
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
remotewrite.Push(nil, &ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil

View File

@@ -5,10 +5,10 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentsdbhttp"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -19,19 +19,17 @@ var (
// InsertHandler processes HTTP OpenTSDB put requests.
// See http://opentsdb.net/docs/build/html/api_http/put.html
func InsertHandler(req *http.Request) error {
func InsertHandler(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(rows, extraLabels)
})
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
})
}
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -65,7 +63,7 @@ func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
remotewrite.Push(at, &ctx.WriteRequest)
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return nil

View File

@@ -11,7 +11,6 @@ import (
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -31,21 +30,17 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return parser.ParseStream(req.Body, defaultTimestamp, isGzipped, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
}, nil)
})
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return parser.ParseStream(req.Body, defaultTimestamp, isGzipped, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
}, nil)
}
// InsertHandlerForReader processes metrics from given reader with optional gzip format
func InsertHandlerForReader(r io.Reader, isGzipped bool) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, 0, isGzipped, func(rows []parser.Row) error {
return insertRows(nil, rows, nil)
}, nil)
})
return parser.ParseStream(r, 0, isGzipped, func(rows []parser.Row) error {
return insertRows(nil, rows, nil)
}, nil)
}
func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.Label) error {
@@ -82,7 +77,7 @@ func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.L
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
remotewrite.Push(at, &ctx.WriteRequest)
rowsInserted.Add(len(rows))
if at != nil {
rowsTenantInserted.Get(at).Add(len(rows))

View File

@@ -13,7 +13,6 @@ import (
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -29,19 +28,15 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req.Body, func(tss []prompb.TimeSeries) error {
return insertRows(at, tss, extraLabels)
})
return parser.ParseStream(req.Body, func(tss []prompb.TimeSeries) error {
return insertRows(at, tss, extraLabels)
})
}
// InsertHandlerForReader processes metrics from given reader
func InsertHandlerForReader(at *auth.Token, r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, func(tss []prompb.TimeSeries) error {
return insertRows(at, tss, nil)
})
return parser.ParseStream(r, func(tss []prompb.TimeSeries) error {
return insertRows(at, tss, nil)
})
}
@@ -81,7 +76,7 @@ func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, extraLabels []pr
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
remotewrite.Push(at, &ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)

View File

@@ -3,13 +3,14 @@ package remotewrite
import (
"bytes"
"fmt"
"io/ioutil"
"io"
"net/http"
"net/url"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/awsapi"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
@@ -19,46 +20,50 @@ import (
)
var (
rateLimit = flagutil.NewArrayInt("remoteWrite.rateLimit", "Optional rate limit in bytes per second for data sent to -remoteWrite.url. "+
rateLimit = flagutil.NewArrayInt("remoteWrite.rateLimit", "Optional rate limit in bytes per second for data sent to the corresponding -remoteWrite.url. "+
"By default the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data "+
"is sent after temporary unavailability of the remote storage")
sendTimeout = flagutil.NewArrayDuration("remoteWrite.sendTimeout", "Timeout for sending a single block of data to -remoteWrite.url")
proxyURL = flagutil.NewArray("remoteWrite.proxyURL", "Optional proxy URL for writing data to -remoteWrite.url. Supported proxies: http, https, socks5. "+
"Example: -remoteWrite.proxyURL=socks5://proxy:1234")
sendTimeout = flagutil.NewArrayDuration("remoteWrite.sendTimeout", "Timeout for sending a single block of data to the corresponding -remoteWrite.url")
proxyURL = flagutil.NewArrayString("remoteWrite.proxyURL", "Optional proxy URL for writing data to the corresponding -remoteWrite.url. "+
"Supported proxies: http, https, socks5. Example: -remoteWrite.proxyURL=socks5://proxy:1234")
tlsInsecureSkipVerify = flagutil.NewArrayBool("remoteWrite.tlsInsecureSkipVerify", "Whether to skip tls verification when connecting to -remoteWrite.url")
tlsCertFile = flagutil.NewArray("remoteWrite.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
tlsKeyFile = flagutil.NewArray("remoteWrite.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
tlsCAFile = flagutil.NewArray("remoteWrite.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. "+
"By default system CA is used. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
tlsServerName = flagutil.NewArray("remoteWrite.tlsServerName", "Optional TLS server name to use for connections to -remoteWrite.url. "+
"By default the server name from -remoteWrite.url is used. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
tlsInsecureSkipVerify = flagutil.NewArrayBool("remoteWrite.tlsInsecureSkipVerify", "Whether to skip tls verification when connecting to the corresponding -remoteWrite.url")
tlsCertFile = flagutil.NewArrayString("remoteWrite.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting "+
"to the corresponding -remoteWrite.url")
tlsKeyFile = flagutil.NewArrayString("remoteWrite.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to the corresponding -remoteWrite.url")
tlsCAFile = flagutil.NewArrayString("remoteWrite.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to the corresponding -remoteWrite.url. "+
"By default system CA is used")
tlsServerName = flagutil.NewArrayString("remoteWrite.tlsServerName", "Optional TLS server name to use for connections to the corresponding -remoteWrite.url. "+
"By default the server name from -remoteWrite.url is used")
basicAuthUsername = flagutil.NewArray("remoteWrite.basicAuth.username", "Optional basic auth username to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
basicAuthPassword = flagutil.NewArray("remoteWrite.basicAuth.password", "Optional basic auth password to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
basicAuthPasswordFile = flagutil.NewArray("remoteWrite.basicAuth.passwordFile", "Optional path to basic auth password to use for -remoteWrite.url. "+
"The file is re-read every second. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
bearerToken = flagutil.NewArray("remoteWrite.bearerToken", "Optional bearer auth token to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
bearerTokenFile = flagutil.NewArray("remoteWrite.bearerTokenFile", "Optional path to bearer token file to use for -remoteWrite.url. "+
"The token is re-read from the file every second. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
headers = flagutil.NewArrayString("remoteWrite.headers", "Optional HTTP headers to send with each request to the corresponding -remoteWrite.url. "+
"For example, -remoteWrite.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -remoteWrite.url. "+
"Multiple headers must be delimited by '^^': -remoteWrite.headers='header1:value1^^header2:value2'")
oauth2ClientID = flagutil.NewArray("remoteWrite.oauth2.clientID", "Optional OAuth2 clientID to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
oauth2ClientSecret = flagutil.NewArray("remoteWrite.oauth2.clientSecret", "Optional OAuth2 clientSecret to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
oauth2ClientSecretFile = flagutil.NewArray("remoteWrite.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
oauth2TokenURL = flagutil.NewArray("remoteWrite.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
oauth2Scopes = flagutil.NewArray("remoteWrite.oauth2.scopes", "Optional OAuth2 scopes to use for -remoteWrite.url. Scopes must be delimited by ';'. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
basicAuthUsername = flagutil.NewArrayString("remoteWrite.basicAuth.username", "Optional basic auth username to use for the corresponding -remoteWrite.url")
basicAuthPassword = flagutil.NewArrayString("remoteWrite.basicAuth.password", "Optional basic auth password to use for the corresponding -remoteWrite.url")
basicAuthPasswordFile = flagutil.NewArrayString("remoteWrite.basicAuth.passwordFile", "Optional path to basic auth password to use for the corresponding -remoteWrite.url. "+
"The file is re-read every second")
bearerToken = flagutil.NewArrayString("remoteWrite.bearerToken", "Optional bearer auth token to use for the corresponding -remoteWrite.url")
bearerTokenFile = flagutil.NewArrayString("remoteWrite.bearerTokenFile", "Optional path to bearer token file to use for the corresponding -remoteWrite.url. "+
"The token is re-read from the file every second")
oauth2ClientID = flagutil.NewArrayString("remoteWrite.oauth2.clientID", "Optional OAuth2 clientID to use for the corresponding -remoteWrite.url")
oauth2ClientSecret = flagutil.NewArrayString("remoteWrite.oauth2.clientSecret", "Optional OAuth2 clientSecret to use for the corresponding -remoteWrite.url")
oauth2ClientSecretFile = flagutil.NewArrayString("remoteWrite.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for the corresponding -remoteWrite.url")
oauth2TokenURL = flagutil.NewArrayString("remoteWrite.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for the corresponding -remoteWrite.url")
oauth2Scopes = flagutil.NewArrayString("remoteWrite.oauth2.scopes", "Optional OAuth2 scopes to use for the corresponding -remoteWrite.url. Scopes must be delimited by ';'")
awsUseSigv4 = flagutil.NewArrayBool("remoteWrite.aws.useSigv4", "Enables SigV4 request signing for the corresponding -remoteWrite.url. "+
"It is expected that other -remoteWrite.aws.* command-line flags are set if sigv4 request signing is enabled")
awsEC2Endpoint = flagutil.NewArrayString("remoteWrite.aws.ec2Endpoint", "Optional AWS EC2 API endpoint to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsSTSEndpoint = flagutil.NewArrayString("remoteWrite.aws.stsEndpoint", "Optional AWS STS API endpoint to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsRegion = flagutil.NewArrayString("remoteWrite.aws.region", "Optional AWS region to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsRoleARN = flagutil.NewArrayString("remoteWrite.aws.roleARN", "Optional AWS roleARN to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsAccessKey = flagutil.NewArrayString("remoteWrite.aws.accessKey", "Optional AWS AccessKey to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
awsService = flagutil.NewArrayString("remoteWrite.aws.service", "Optional AWS Service to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set. "+
"Defaults to \"aps\"")
awsSecretKey = flagutil.NewArrayString("remoteWrite.aws.secretKey", "Optional AWS SecretKey to use for the corresponding -remoteWrite.url if -remoteWrite.aws.useSigv4 is set")
)
type client struct {
@@ -69,6 +74,7 @@ type client struct {
sendBlock func(block []byte) bool
authCfg *promauth.Config
awsCfg *awsapi.Config
rl rateLimiter
@@ -78,6 +84,7 @@ type client struct {
requestsOKCount *metrics.Counter
errorsCount *metrics.Counter
packetsDropped *metrics.Counter
rateLimit *metrics.Gauge
retriesCount *metrics.Counter
sendDuration *metrics.FloatCounter
@@ -88,9 +95,13 @@ type client struct {
func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persistentqueue.FastQueue, concurrency int) *client {
authCfg, err := getAuthConfig(argIdx)
if err != nil {
logger.Panicf("FATAL: cannot initialize auth config: %s", err)
logger.Panicf("FATAL: cannot initialize auth config for remoteWrite.url=%q: %s", remoteWriteURL, err)
}
tlsCfg := authCfg.NewTLSConfig()
awsCfg, err := getAWSAPIConfig(argIdx)
if err != nil {
logger.Fatalf("FATAL: cannot initialize AWS Config for remoteWrite.url=%q: %s", remoteWriteURL, err)
}
tr := &http.Transport{
DialContext: statDial,
TLSClientConfig: tlsCfg,
@@ -115,6 +126,7 @@ func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persiste
sanitizedURL: sanitizedURL,
remoteWriteURL: remoteWriteURL,
authCfg: authCfg,
awsCfg: awsCfg,
fq: fq,
hc: &http.Client{
Transport: tr,
@@ -131,16 +143,22 @@ func (c *client) init(argIdx, concurrency int, sanitizedURL string) {
logger.Infof("applying %d bytes per second rate limit for -remoteWrite.url=%q", bytesPerSec, sanitizedURL)
c.rl.perSecondLimit = int64(bytesPerSec)
}
c.rl.limitReached = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remote_write_rate_limit_reached_total{url=%q}`, c.sanitizedURL))
c.rl.limitReached = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_rate_limit_reached_total{url=%q}`, c.sanitizedURL))
c.bytesSent = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_bytes_sent_total{url=%q}`, c.sanitizedURL))
c.blocksSent = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_blocks_sent_total{url=%q}`, c.sanitizedURL))
c.rateLimit = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_rate_limit{url=%q}`, c.sanitizedURL), func() float64 {
return float64(rateLimit.GetOptionalArgOrDefault(argIdx, 0))
})
c.requestDuration = metrics.GetOrCreateHistogram(fmt.Sprintf(`vmagent_remotewrite_duration_seconds{url=%q}`, c.sanitizedURL))
c.requestsOKCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="2XX"}`, c.sanitizedURL))
c.errorsCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_errors_total{url=%q}`, c.sanitizedURL))
c.packetsDropped = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_packets_dropped_total{url=%q}`, c.sanitizedURL))
c.retriesCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_retries_count_total{url=%q}`, c.sanitizedURL))
c.sendDuration = metrics.GetOrCreateFloatCounter(fmt.Sprintf(`vmagent_remotewrite_send_duration_seconds_total{url=%q}`, c.sanitizedURL))
metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_queues{url=%q}`, c.sanitizedURL), func() float64 {
return float64(*queues)
})
for i := 0; i < concurrency; i++ {
c.wg.Add(1)
go func() {
@@ -158,6 +176,11 @@ func (c *client) MustStop() {
}
func getAuthConfig(argIdx int) (*promauth.Config, error) {
headersValue := headers.GetOptionalArg(argIdx)
var hdrs []string
if headersValue != "" {
hdrs = strings.Split(headersValue, "^^")
}
username := basicAuthUsername.GetOptionalArg(argIdx)
password := basicAuthPassword.GetOptionalArg(argIdx)
passwordFile := basicAuthPasswordFile.GetOptionalArg(argIdx)
@@ -194,13 +217,39 @@ func getAuthConfig(argIdx int) (*promauth.Config, error) {
InsecureSkipVerify: tlsInsecureSkipVerify.GetOptionalArg(argIdx),
}
authCfg, err := promauth.NewConfig(".", nil, basicAuthCfg, token, tokenFile, oauth2Cfg, tlsCfg)
opts := &promauth.Options{
BasicAuth: basicAuthCfg,
BearerToken: token,
BearerTokenFile: tokenFile,
OAuth2: oauth2Cfg,
TLSConfig: tlsCfg,
Headers: hdrs,
}
authCfg, err := opts.NewConfig()
if err != nil {
return nil, fmt.Errorf("cannot populate OAuth2 config for remoteWrite idx: %d, err: %w", argIdx, err)
}
return authCfg, nil
}
func getAWSAPIConfig(argIdx int) (*awsapi.Config, error) {
if !awsUseSigv4.GetOptionalArg(argIdx) {
return nil, nil
}
ec2Endpoint := awsEC2Endpoint.GetOptionalArg(argIdx)
stsEndpoint := awsSTSEndpoint.GetOptionalArg(argIdx)
region := awsRegion.GetOptionalArg(argIdx)
roleARN := awsRoleARN.GetOptionalArg(argIdx)
accessKey := awsAccessKey.GetOptionalArg(argIdx)
secretKey := awsSecretKey.GetOptionalArg(argIdx)
service := awsService.GetOptionalArg(argIdx)
cfg, err := awsapi.NewConfig(ec2Endpoint, stsEndpoint, region, roleARN, accessKey, secretKey, service)
if err != nil {
return nil, err
}
return cfg, nil
}
func (c *client) runWorker() {
var ok bool
var block []byte
@@ -250,21 +299,28 @@ func (c *client) sendBlockHTTP(block []byte) bool {
retriesCount := 0
c.bytesSent.Add(len(block))
c.blocksSent.Inc()
sigv4Hash := ""
if c.awsCfg != nil {
sigv4Hash = awsapi.HashHex(block)
}
again:
req, err := http.NewRequest("POST", c.remoteWriteURL, bytes.NewBuffer(block))
if err != nil {
logger.Panicf("BUG: unexpected error from http.NewRequest(%q): %s", c.sanitizedURL, err)
}
c.authCfg.SetHeaders(req, true)
h := req.Header
h.Set("User-Agent", "vmagent")
h.Set("Content-Type", "application/x-protobuf")
h.Set("Content-Encoding", "snappy")
h.Set("X-Prometheus-Remote-Write-Version", "0.1.0")
if ah := c.authCfg.GetAuthHeader(); ah != "" {
req.Header.Set("Authorization", ah)
if c.awsCfg != nil {
if err := c.awsCfg.SignRequest(req, sigv4Hash); err != nil {
// there is no need in retry, request will be rejected by client.Do and retried by code below
logger.Warnf("cannot sign remoteWrite request with AWS sigv4: %s", err)
}
}
startTime := time.Now()
resp, err := c.hc.Do(req)
c.requestDuration.UpdateDuration(startTime)
@@ -295,15 +351,14 @@ again:
}
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="%d"}`, c.sanitizedURL, statusCode)).Inc()
if statusCode == 409 || statusCode == 400 {
body, err := ioutil.ReadAll(resp.Body)
body, err := io.ReadAll(resp.Body)
_ = resp.Body.Close()
l := logger.WithThrottler("remoteWriteRejected", 5*time.Second)
if err != nil {
l.Errorf("sending a block with size %d bytes to %q was rejected (skipping the block): status code %d; "+
remoteWriteRejectedLogger.Errorf("sending a block with size %d bytes to %q was rejected (skipping the block): status code %d; "+
"failed to read response body: %s",
len(block), c.sanitizedURL, statusCode, err)
} else {
l.Errorf("sending a block with size %d bytes to %q was rejected (skipping the block): status code %d; response body: %s",
remoteWriteRejectedLogger.Errorf("sending a block with size %d bytes to %q was rejected (skipping the block): status code %d; response body: %s",
len(block), c.sanitizedURL, statusCode, string(body))
}
// Just drop block on 409 and 400 status codes like Prometheus does.
@@ -320,7 +375,7 @@ again:
if retryDuration > time.Minute {
retryDuration = time.Minute
}
body, err := ioutil.ReadAll(resp.Body)
body, err := io.ReadAll(resp.Body)
_ = resp.Body.Close()
if err != nil {
logger.Errorf("cannot read response body from %q during retry #%d: %s", c.sanitizedURL, retriesCount, err)
@@ -340,6 +395,8 @@ again:
goto again
}
var remoteWriteRejectedLogger = logger.WithThrottler("remoteWriteRejected", 5*time.Second)
type rateLimiter struct {
perSecondLimit int64

View File

@@ -195,7 +195,7 @@ func pushWriteRequest(wr *prompbmarshal.WriteRequest, pushBlock func(block []byt
}
bb := writeRequestBufPool.Get()
bb.B = prompbmarshal.MarshalWriteRequest(bb.B[:0], wr)
if len(bb.B) <= maxUnpackedBlockSize.N {
if len(bb.B) <= maxUnpackedBlockSize.IntN() {
zb := snappyBufPool.Get()
zb.B = snappy.Encode(zb.B[:cap(zb.B)], bb.B)
writeRequestBufPool.Put(bb)

View File

@@ -0,0 +1,62 @@
package remotewrite
import (
"fmt"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/golang/snappy"
)
func TestPushWriteRequest(t *testing.T) {
for _, rowsCount := range []int{1, 10, 100, 1e3, 1e4} {
t.Run(fmt.Sprintf("%d", rowsCount), func(t *testing.T) {
testPushWriteRequest(t, rowsCount)
})
}
}
func testPushWriteRequest(t *testing.T, rowsCount int) {
wr := newTestWriteRequest(rowsCount, 10)
pushBlockLen := 0
pushBlock := func(block []byte) {
if pushBlockLen > 0 {
panic(fmt.Errorf("BUG: pushBlock called multiple times; pushBlockLen=%d at first call, len(block)=%d at second call", pushBlockLen, len(block)))
}
pushBlockLen = len(block)
}
pushWriteRequest(wr, pushBlock)
b := prompbmarshal.MarshalWriteRequest(nil, wr)
zb := snappy.Encode(nil, b)
maxPushBlockLen := len(zb)
minPushBlockLen := maxPushBlockLen / 2
if pushBlockLen < minPushBlockLen {
t.Fatalf("unexpected block len after pushWriteRequest; got %d bytes; must be at least %d bytes", pushBlockLen, minPushBlockLen)
}
if pushBlockLen > maxPushBlockLen {
t.Fatalf("unexpected block len after pushWriteRequest; got %d bytes; must be smaller or equal to %d bytes", pushBlockLen, maxPushBlockLen)
}
}
func newTestWriteRequest(seriesCount, labelsCount int) *prompbmarshal.WriteRequest {
var wr prompbmarshal.WriteRequest
for i := 0; i < seriesCount; i++ {
var labels []prompbmarshal.Label
for j := 0; j < labelsCount; j++ {
labels = append(labels, prompbmarshal.Label{
Name: fmt.Sprintf("label_%d_%d", i, j),
Value: fmt.Sprintf("value_%d_%d", i, j),
})
}
wr.Timeseries = append(wr.Timeseries, prompbmarshal.TimeSeries{
Labels: labels,
Samples: []prompbmarshal.Sample{
{
Value: float64(i),
Timestamp: 1000 * int64(i),
},
},
})
}
return &wr
}

View File

@@ -0,0 +1,36 @@
package remotewrite
import (
"fmt"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/golang/snappy"
"github.com/klauspost/compress/s2"
)
func BenchmarkCompressWriteRequestSnappy(b *testing.B) {
b.Run("snappy", func(b *testing.B) {
benchmarkCompressWriteRequest(b, snappy.Encode)
})
b.Run("s2", func(b *testing.B) {
benchmarkCompressWriteRequest(b, s2.EncodeSnappy)
})
}
func benchmarkCompressWriteRequest(b *testing.B, compressFunc func(dst, src []byte) []byte) {
for _, rowsCount := range []int{1, 10, 100, 1e3, 1e4} {
b.Run(fmt.Sprintf("rows_%d", rowsCount), func(b *testing.B) {
wr := newTestWriteRequest(rowsCount, 10)
data := prompbmarshal.MarshalWriteRequest(nil, wr)
b.ReportAllocs()
b.SetBytes(int64(rowsCount))
b.RunParallel(func(pb *testing.PB) {
var zb []byte
for pb.Next() {
zb = compressFunc(zb[:cap(zb)], data)
}
})
})
}
}

View File

@@ -13,18 +13,19 @@ import (
)
var (
unparsedLabelsGlobal = flagutil.NewArray("remoteWrite.label", "Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. "+
unparsedLabelsGlobal = flagutil.NewArrayString("remoteWrite.label", "Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. "+
"Pass multiple -remoteWrite.label flags in order to add multiple labels to metrics before sending them to remote storage")
relabelConfigPathGlobal = flag.String("remoteWrite.relabelConfig", "", "Optional path to file with relabel_config entries. "+
"The path can point either to local file or to http url. These entries are applied to all the metrics "+
"before sending them to -remoteWrite.url. See https://docs.victoriametrics.com/vmagent.html#relabeling for details")
relabelDebugGlobal = flag.Bool("remoteWrite.relabelDebug", false, "Whether to log metrics before and after relabeling with -remoteWrite.relabelConfig. "+
"If the -remoteWrite.relabelDebug is enabled, then the metrics aren't sent to remote storage. This is useful for debugging the relabeling configs")
relabelConfigPaths = flagutil.NewArray("remoteWrite.urlRelabelConfig", "Optional path to relabel config for the corresponding -remoteWrite.url. "+
"The path can point either to local file or to http url")
relabelDebug = flagutil.NewArrayBool("remoteWrite.urlRelabelDebug", "Whether to log metrics before and after relabeling with -remoteWrite.urlRelabelConfig. "+
"If the -remoteWrite.urlRelabelDebug is enabled, then the metrics aren't sent to the corresponding -remoteWrite.url. "+
"This is useful for debugging the relabeling configs")
relabelConfigPathGlobal = flag.String("remoteWrite.relabelConfig", "", "Optional path to file with relabeling configs, which are applied "+
"to all the metrics before sending them to -remoteWrite.url. See also -remoteWrite.urlRelabelConfig. "+
"The path can point either to local file or to http url. "+
"See https://docs.victoriametrics.com/vmagent.html#relabeling")
relabelConfigPaths = flagutil.NewArrayString("remoteWrite.urlRelabelConfig", "Optional path to relabel configs for the corresponding -remoteWrite.url. "+
"See also -remoteWrite.relabelConfig. The path can point either to local file or to http url. "+
"See https://docs.victoriametrics.com/vmagent.html#relabeling")
usePromCompatibleNaming = flag.Bool("usePromCompatibleNaming", false, "Whether to replace characters unsupported by Prometheus with underscores "+
"in the ingested metric names and label names. For example, foo.bar{a.b='c'} is transformed into foo_bar{a_b='c'} during data ingestion if this flag is set. "+
"See https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels")
)
var labelsGlobal []prompbmarshal.Label
@@ -38,7 +39,7 @@ func CheckRelabelConfigs() error {
func loadRelabelConfigs() (*relabelConfigs, error) {
var rcs relabelConfigs
if *relabelConfigPathGlobal != "" {
global, err := promrelabel.LoadRelabelConfigs(*relabelConfigPathGlobal, *relabelDebugGlobal)
global, err := promrelabel.LoadRelabelConfigs(*relabelConfigPathGlobal)
if err != nil {
return nil, fmt.Errorf("cannot load -remoteWrite.relabelConfig=%q: %w", *relabelConfigPathGlobal, err)
}
@@ -54,7 +55,7 @@ func loadRelabelConfigs() (*relabelConfigs, error) {
// Skip empty relabel config.
continue
}
prc, err := promrelabel.LoadRelabelConfigs(path, relabelDebug.GetOptionalArg(i))
prc, err := promrelabel.LoadRelabelConfigs(path)
if err != nil {
return nil, fmt.Errorf("cannot load relabel configs from -remoteWrite.urlRelabelConfig=%q: %w", path, err)
}
@@ -87,7 +88,7 @@ func initLabelsGlobal() {
}
func (rctx *relabelCtx) applyRelabeling(tss []prompbmarshal.TimeSeries, extraLabels []prompbmarshal.Label, pcs *promrelabel.ParsedConfigs) []prompbmarshal.TimeSeries {
if len(extraLabels) == 0 && pcs.Len() == 0 {
if len(extraLabels) == 0 && pcs.Len() == 0 && !*usePromCompatibleNaming {
// Nothing to change.
return tss
}
@@ -107,7 +108,20 @@ func (rctx *relabelCtx) applyRelabeling(tss []prompbmarshal.TimeSeries, extraLab
labels = append(labels, *extraLabel)
}
}
labels = pcs.Apply(labels, labelsLen, true)
if *usePromCompatibleNaming {
// Replace unsupported Prometheus chars in label names and metric names with underscores.
tmpLabels := labels[labelsLen:]
for j := range tmpLabels {
label := &tmpLabels[j]
if label.Name == "__name__" {
label.Value = promrelabel.SanitizeName(label.Value)
} else {
label.Name = promrelabel.SanitizeName(label.Name)
}
}
}
labels = pcs.Apply(labels, labelsLen)
labels = promrelabel.FinalizeLabels(labels[:labelsLen], labels[labelsLen:])
if len(labels) == labelsLen {
// Drop the current time series, since relabeling removed all the labels.
continue

View File

@@ -0,0 +1,49 @@
package remotewrite
import (
"reflect"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
)
func TestApplyRelabeling(t *testing.T) {
f := func(extraLabels []prompbmarshal.Label, pcs *promrelabel.ParsedConfigs, sTss, sExpTss string) {
rctx := &relabelCtx{}
tss, expTss := parseSeries(sTss), parseSeries(sExpTss)
gotTss := rctx.applyRelabeling(tss, extraLabels, pcs)
if !reflect.DeepEqual(gotTss, expTss) {
t.Fatalf("expected to have: \n%v;\ngot: \n%v", expTss, gotTss)
}
}
f(nil, nil, "up", "up")
f([]prompbmarshal.Label{{Name: "foo", Value: "bar"}}, nil, "up", `up{foo="bar"}`)
f([]prompbmarshal.Label{{Name: "foo", Value: "bar"}}, nil, `up{foo="baz"}`, `up{foo="bar"}`)
pcs, err := promrelabel.ParseRelabelConfigsData([]byte(`
- target_label: "foo"
replacement: "aaa"
- action: labeldrop
regex: "env.*"
`))
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
f(nil, pcs, `up{foo="baz", env="prod"}`, `up{foo="aaa"}`)
oldVal := *usePromCompatibleNaming
*usePromCompatibleNaming = true
f(nil, nil, `foo.bar`, `foo_bar`)
*usePromCompatibleNaming = oldVal
}
func parseSeries(data string) []prompbmarshal.TimeSeries {
var tss []prompbmarshal.TimeSeries
tss = append(tss, prompbmarshal.TimeSeries{
Labels: promutils.MustNewLabelsFromString(data).GetLabels(),
})
return tss
}

View File

@@ -13,6 +13,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bloomfilter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
@@ -20,16 +21,17 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/streamaggr"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
xxhash "github.com/cespare/xxhash/v2"
"github.com/cespare/xxhash/v2"
)
var (
remoteWriteURLs = flagutil.NewArray("remoteWrite.url", "Remote storage URL to write data to. It must support Prometheus remote_write API. "+
remoteWriteURLs = flagutil.NewArrayString("remoteWrite.url", "Remote storage URL to write data to. It must support Prometheus remote_write API. "+
"It is recommended using VictoriaMetrics as remote storage. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
"Pass multiple -remoteWrite.url flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.multitenantURL")
remoteWriteMultitenantURLs = flagutil.NewArray("remoteWrite.multitenantURL", "Base path for multitenant remote storage URL to write data to. "+
remoteWriteMultitenantURLs = flagutil.NewArrayString("remoteWrite.multitenantURL", "Base path for multitenant remote storage URL to write data to. "+
"See https://docs.victoriametrics.com/vmagent.html#multitenancy for details. Example url: http://<vminsert>:8480 . "+
"Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.url")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored. "+
@@ -38,9 +40,9 @@ var (
"isn't enough for sending high volume of collected data to remote storage. Default value is 2 * numberOfAvailableCPUs")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
"It is hidden by default, since it can contain sensitive info such as auth key")
maxPendingBytesPerURL = flagutil.NewBytes("remoteWrite.maxDiskUsagePerURL", 0, "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+
maxPendingBytesPerURL = flagutil.NewArrayBytes("remoteWrite.maxDiskUsagePerURL", "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+
"for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. "+
"Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500000000. "+
"Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500MB. "+
"Disk usage is unlimited if the value is set to 0")
significantFigures = flagutil.NewArrayInt("remoteWrite.significantFigures", "The number of significant figures to leave in metric values before writing them "+
"to remote storage. See https://en.wikipedia.org/wiki/Significant_figures . Zero value saves all the significant figures. "+
@@ -57,6 +59,13 @@ var (
"Excess series are logged and dropped. This can be useful for limiting series cardinality. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter")
maxDailySeries = flag.Int("remoteWrite.maxDailySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. "+
"Excess series are logged and dropped. This can be useful for limiting series churn rate. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter")
streamAggrConfig = flagutil.NewArrayString("remoteWrite.streamAggr.config", "Optional path to file with stream aggregation config. "+
"See https://docs.victoriametrics.com/stream-aggregation.html . "+
"See also -remoteWrite.streamAggr.keepInput")
streamAggrKeepInput = flagutil.NewArrayBool("remoteWrite.streamAggr.keepInput", "Whether to keep input samples after the aggregation with -remoteWrite.streamAggr.config. "+
"By default the input is dropped after the aggregation, so only the aggregate data is sent to the -remoteWrite.url. "+
"See https://docs.victoriametrics.com/stream-aggregation.html")
)
var (
@@ -140,6 +149,9 @@ func Init() {
}
allRelabelConfigs.Store(rcs)
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
if len(*remoteWriteURLs) > 0 {
rwctxsDefault = newRemoteWriteCtxs(nil, *remoteWriteURLs)
}
@@ -154,18 +166,31 @@ func Init() {
case <-stopCh:
return
}
configReloads.Inc()
logger.Infof("SIGHUP received; reloading relabel configs pointed by -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig")
rcs, err := loadRelabelConfigs()
if err != nil {
configReloadErrors.Inc()
configSuccess.Set(0)
logger.Errorf("cannot reload relabel configs; preserving the previous configs; error: %s", err)
continue
}
allRelabelConfigs.Store(rcs)
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
logger.Infof("Successfully reloaded relabel configs")
}
}()
}
var (
configReloads = metrics.NewCounter(`vmagent_relabel_config_reloads_total`)
configReloadErrors = metrics.NewCounter(`vmagent_relabel_config_reloads_errors_total`)
configSuccess = metrics.NewCounter(`vmagent_relabel_config_last_reload_successful`)
configTimestamp = metrics.NewCounter(`vmagent_relabel_config_last_reload_success_timestamp_seconds`)
)
func newRemoteWriteCtxs(at *auth.Token, urls []string) []*remoteWriteCtx {
if len(urls) == 0 {
logger.Panicf("BUG: urls must be non-empty")
@@ -234,15 +259,11 @@ func Stop() {
// Push sends wr to remote storage systems set via `-remoteWrite.url`.
//
// Note that wr may be modified by Push due to relabeling and rounding.
func Push(wr *prompbmarshal.WriteRequest) {
PushWithAuthToken(nil, wr)
}
// PushWithAuthToken sends wr to remote storage systems set via `-remoteWrite.multitenantURL`.
// If at is nil, then the data is pushed to the configured `-remoteWrite.url`.
// If at isn't nil, the data is pushed to the configured `-remoteWrite.multitenantURL`.
//
// Note that wr may be modified by Push due to relabeling and rounding.
func PushWithAuthToken(at *auth.Token, wr *prompbmarshal.WriteRequest) {
func Push(at *auth.Token, wr *prompbmarshal.WriteRequest) {
if at == nil && len(*remoteWriteMultitenantURLs) > 0 {
// Write data to default tenant if at isn't set while -remoteWrite.multitenantURL is set.
at = defaultAuthToken
@@ -252,7 +273,7 @@ func PushWithAuthToken(at *auth.Token, wr *prompbmarshal.WriteRequest) {
rwctxs = rwctxsDefault
} else {
if len(*remoteWriteMultitenantURLs) == 0 {
logger.Panicf("BUG: remoteWriteMultitenantURLs must be non-empty for non-nil at")
logger.Panicf("BUG: -remoteWrite.multitenantURL command-line flag must be set when __tenant_id__=%q label is set", at)
}
rwctxsMapLock.Lock()
tenantID := tenantmetrics.TenantID{
@@ -274,6 +295,8 @@ func PushWithAuthToken(at *auth.Token, wr *prompbmarshal.WriteRequest) {
rctx = getRelabelCtx()
}
tss := wr.Timeseries
rowsCount := getRowsCount(tss)
globalRowsPushedBeforeRelabel.Add(rowsCount)
maxSamplesPerBlock := *maxRowsPerBlock
// Allow up to 10x of labels per each block on average.
maxLabelsPerBlock := 10 * maxSamplesPerBlock
@@ -298,9 +321,10 @@ func PushWithAuthToken(at *auth.Token, wr *prompbmarshal.WriteRequest) {
tss = nil
}
if rctx != nil {
tssBlockLen := len(tssBlock)
rowsCountBeforeRelabel := getRowsCount(tssBlock)
tssBlock = rctx.applyRelabeling(tssBlock, labelsGlobal, pcsGlobal)
globalRelabelMetricsDropped.Add(tssBlockLen - len(tssBlock))
rowsCountAfterRelabel := getRowsCount(tssBlock)
rowsDroppedByGlobalRelabel.Add(rowsCountBeforeRelabel - rowsCountAfterRelabel)
}
sortLabelsIfNeeded(tssBlock)
tssBlock = limitSeriesCardinality(tssBlock)
@@ -414,16 +438,24 @@ func labelsToString(labels []prompbmarshal.Label) string {
return string(b)
}
var globalRelabelMetricsDropped = metrics.NewCounter("vmagent_remotewrite_global_relabel_metrics_dropped_total")
var (
globalRowsPushedBeforeRelabel = metrics.NewCounter("vmagent_remotewrite_global_rows_pushed_before_relabel_total")
rowsDroppedByGlobalRelabel = metrics.NewCounter("vmagent_remotewrite_global_relabel_metrics_dropped_total")
)
type remoteWriteCtx struct {
idx int
fq *persistentqueue.FastQueue
c *client
idx int
fq *persistentqueue.FastQueue
c *client
sas *streamaggr.Aggregators
streamAggrKeepInput bool
pss []*pendingSeries
pssNextIdx uint64
relabelMetricsDropped *metrics.Counter
rowsPushedAfterRelabel *metrics.Counter
rowsDroppedByRelabel *metrics.Counter
}
func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxInmemoryBlocks int, sanitizedURL string) *remoteWriteCtx {
@@ -433,7 +465,8 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
pqURL.Fragment = ""
h := xxhash.Sum64([]byte(pqURL.String()))
queuePath := fmt.Sprintf("%s/persistent-queue/%d_%016X", *tmpDataPath, argIdx+1, h)
fq := persistentqueue.MustOpenFastQueue(queuePath, sanitizedURL, maxInmemoryBlocks, maxPendingBytesPerURL.N)
maxPendingBytes := maxPendingBytesPerURL.GetOptionalArgOrDefault(argIdx, 0)
fq := persistentqueue.MustOpenFastQueue(queuePath, sanitizedURL, maxInmemoryBlocks, maxPendingBytes)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
return float64(fq.GetPendingBytes())
})
@@ -449,6 +482,7 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
}
c.init(argIdx, *queues, sanitizedURL)
// Initialize pss
sf := significantFigures.GetOptionalArgOrDefault(argIdx, 0)
rd := roundDigits.GetOptionalArgOrDefault(argIdx, 100)
pssLen := *queues
@@ -461,14 +495,29 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
for i := range pss {
pss[i] = newPendingSeries(fq.MustWriteBlock, sf, rd)
}
return &remoteWriteCtx{
rwctx := &remoteWriteCtx{
idx: argIdx,
fq: fq,
c: c,
pss: pss,
relabelMetricsDropped: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_relabel_metrics_dropped_total{path=%q, url=%q}`, queuePath, sanitizedURL)),
rowsPushedAfterRelabel: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_rows_pushed_after_relabel_total{path=%q, url=%q}`, queuePath, sanitizedURL)),
rowsDroppedByRelabel: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_relabel_metrics_dropped_total{path=%q, url=%q}`, queuePath, sanitizedURL)),
}
// Initialize sas
sasFile := streamAggrConfig.GetOptionalArg(argIdx)
if sasFile != "" {
sas, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternal)
if err != nil {
logger.Fatalf("cannot initialize stream aggregators from -remoteWrite.streamAggrFile=%q: %s", sasFile, err)
}
rwctx.sas = sas
rwctx.streamAggrKeepInput = streamAggrKeepInput.GetOptionalArg(argIdx)
}
return rwctx
}
func (rwctx *remoteWriteCtx) MustStop() {
@@ -480,13 +529,17 @@ func (rwctx *remoteWriteCtx) MustStop() {
rwctx.fq.UnblockAllReaders()
rwctx.c.MustStop()
rwctx.c = nil
rwctx.sas.MustStop()
rwctx.sas = nil
rwctx.fq.MustClose()
rwctx.fq = nil
rwctx.relabelMetricsDropped = nil
rwctx.rowsPushedAfterRelabel = nil
rwctx.rowsDroppedByRelabel = nil
}
func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
// Apply relabeling
var rctx *relabelCtx
var v *[]prompbmarshal.TimeSeries
rcs := allRelabelConfigs.Load().(*relabelConfigs)
@@ -499,13 +552,22 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
// and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
v = tssRelabelPool.Get().(*[]prompbmarshal.TimeSeries)
tss = append(*v, tss...)
tssLen := len(tss)
rowsCountBeforeRelabel := getRowsCount(tss)
tss = rctx.applyRelabeling(tss, nil, pcs)
rwctx.relabelMetricsDropped.Add(tssLen - len(tss))
rowsCountAfterRelabel := getRowsCount(tss)
rwctx.rowsDroppedByRelabel.Add(rowsCountBeforeRelabel - rowsCountAfterRelabel)
}
pss := rwctx.pss
idx := atomic.AddUint64(&rwctx.pssNextIdx, 1) % uint64(len(pss))
pss[idx].Push(tss)
rowsCount := getRowsCount(tss)
rwctx.rowsPushedAfterRelabel.Add(rowsCount)
// Apply stream aggregation if any
rwctx.sas.Push(tss)
if rwctx.sas == nil || rwctx.streamAggrKeepInput {
// Push samples to the remote storage
rwctx.pushInternal(tss)
}
// Return back relabeling contexts to the pool
if rctx != nil {
*v = prompbmarshal.ResetTimeSeries(tss)
tssRelabelPool.Put(v)
@@ -513,9 +575,23 @@ func (rwctx *remoteWriteCtx) Push(tss []prompbmarshal.TimeSeries) {
}
}
func (rwctx *remoteWriteCtx) pushInternal(tss []prompbmarshal.TimeSeries) {
pss := rwctx.pss
idx := atomic.AddUint64(&rwctx.pssNextIdx, 1) % uint64(len(pss))
pss[idx].Push(tss)
}
var tssRelabelPool = &sync.Pool{
New: func() interface{} {
a := []prompbmarshal.TimeSeries{}
return &a
},
}
func getRowsCount(tss []prompbmarshal.TimeSeries) int {
rowsCount := 0
for _, ts := range tss {
rowsCount += len(ts.Samples)
}
return rowsCount
}

File diff suppressed because one or more lines are too long

View File

@@ -13,7 +13,6 @@ import (
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
@@ -31,20 +30,16 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return parser.ParseStream(req.Body, isGzipped, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
})
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return parser.ParseStream(req.Body, isGzipped, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
})
}
// InsertHandlerForReader processes metrics from given reader
func InsertHandlerForReader(r io.Reader, isGzipped bool) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, isGzipped, func(rows []parser.Row) error {
return insertRows(nil, rows, nil)
})
return parser.ParseStream(r, isGzipped, func(rows []parser.Row) error {
return insertRows(nil, rows, nil)
})
}
@@ -88,7 +83,7 @@ func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.L
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
remotewrite.Push(at, &ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)

View File

@@ -12,20 +12,20 @@ vmalert-prod:
vmalert-pure-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-pure
vmalert-amd64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-amd64
vmalert-linux-amd64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-linux-amd64
vmalert-arm-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-arm
vmalert-linux-arm-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-linux-arm
vmalert-arm64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-arm64
vmalert-linux-arm64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-linux-arm64
vmalert-ppc64le-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-ppc64le
vmalert-linux-ppc64le-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-linux-ppc64le
vmalert-386-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-386
vmalert-linux-386-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-linux-386
vmalert-darwin-amd64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-darwin-amd64
@@ -33,6 +33,12 @@ vmalert-darwin-amd64-prod:
vmalert-darwin-arm64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-darwin-arm64
vmalert-freebsd-amd64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-freebsd-amd64
vmalert-openbsd-amd64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-openbsd-amd64
vmalert-windows-amd64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-windows-amd64
@@ -62,13 +68,14 @@ publish-vmalert:
test-vmalert:
go test -v -race -cover ./app/vmalert -loggerLevel=ERROR
go test -v -race -cover ./app/vmalert/templates
go test -v -race -cover ./app/vmalert/datasource
go test -v -race -cover ./app/vmalert/notifier
go test -v -race -cover ./app/vmalert/config
go test -v -race -cover ./app/vmalert/remotewrite
run-vmalert: vmalert
./bin/vmalert -rule=app/vmalert/config/testdata/rules2-good.rules \
./bin/vmalert -rule=app/vmalert/config/testdata/rules/rules2-good.rules \
-datasource.url=http://localhost:8428 \
-notifier.url=http://localhost:9093 \
-notifier.url=http://127.0.0.1:9093 \
@@ -77,17 +84,17 @@ run-vmalert: vmalert
-external.label=cluster=east-1 \
-external.label=replica=a \
-evaluationInterval=3s \
-rule.configCheckInterval=10s
-configCheckInterval=10s
run-vmalert-sd: vmalert
./bin/vmalert -rule=app/vmalert/config/testdata/rules2-good.rules \
-datasource.url=http://localhost:8428 \
-remoteWrite.url=http://localhost:8428 \
-notifier.config=app/vmalert/notifier/testdata/consul.good.yaml \
-notifier.config=app/vmalert/notifier/testdata/mixed.good.yaml \
-configCheckInterval=10s
replay-vmalert: vmalert
./bin/vmalert -rule=app/vmalert/config/testdata/rules-replay-good.rules \
./bin/vmalert -rule=app/vmalert/config/testdata/rules/rules-replay-good.rules \
-datasource.url=http://localhost:8428 \
-remoteWrite.url=http://localhost:8428 \
-external.label=cluster=east-1 \
@@ -95,26 +102,35 @@ replay-vmalert: vmalert
-replay.timeFrom=2021-05-11T07:21:43Z \
-replay.timeTo=2021-05-29T18:40:43Z
vmalert-amd64:
CGO_ENABLED=1 GOARCH=amd64 $(MAKE) vmalert-local-with-goarch
vmalert-linux-amd64:
APP_NAME=vmalert CGO_ENABLED=1 GOOS=linux GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmalert-arm:
CGO_ENABLED=0 GOARCH=arm $(MAKE) vmalert-local-with-goarch
vmalert-linux-arm:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=linux GOARCH=arm $(MAKE) app-local-goos-goarch
vmalert-arm64:
CGO_ENABLED=0 GOARCH=arm64 $(MAKE) vmalert-local-with-goarch
vmalert-linux-arm64:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=linux GOARCH=arm64 $(MAKE) app-local-goos-goarch
vmalert-ppc64le:
CGO_ENABLED=0 GOARCH=ppc64le $(MAKE) vmalert-local-with-goarch
vmalert-linux-ppc64le:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le $(MAKE) app-local-goos-goarch
vmalert-386:
CGO_ENABLED=0 GOARCH=386 $(MAKE) vmalert-local-with-goarch
vmalert-linux-386:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=linux GOARCH=386 $(MAKE) app-local-goos-goarch
vmalert-local-with-goarch:
APP_NAME=vmalert $(MAKE) app-local-with-goarch
vmalert-darwin-amd64:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmalert-darwin-arm64:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 $(MAKE) app-local-goos-goarch
vmalert-freebsd-amd64:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=freebsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmalert-openbsd-amd64:
APP_NAME=vmalert CGO_ENABLED=0 GOOS=openbsd GOARCH=amd64 $(MAKE) app-local-goos-goarch
vmalert-windows-amd64:
GOARCH=amd64 APP_NAME=vmalert $(MAKE) app-local-windows-goarch
vmalert-pure:
APP_NAME=vmalert $(MAKE) app-local-pure
vmalert-windows-amd64:
GOARCH=amd64 APP_NAME=vmalert $(MAKE) app-local-windows-with-goarch

View File

@@ -2,7 +2,7 @@
`vmalert` executes a list of the given [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/)
or [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
rules against configured `-datasource.url`. For sending alerting notifications
rules against configured `-datasource.url` compatible with Prometheus HTTP API. For sending alerting notifications
vmalert relies on [Alertmanager](https://github.com/prometheus/alertmanager) configured via `-notifier.url` flag.
Recording rules results are persisted via [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
protocol and require `-remoteWrite.url` to be configured.
@@ -13,29 +13,30 @@ implementation and aims to be compatible with its syntax.
* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
* VictoriaMetrics [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html)
support and expressions validation;
support and expressions validation;
* Prometheus [alerting rules definition format](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#defining-alerting-rules)
support;
support;
* Integration with [Alertmanager](https://github.com/prometheus/alertmanager) starting from [Alertmanager v0.16.0-aplha](https://github.com/prometheus/alertmanager/releases/tag/v0.16.0-alpha.0);
* Keeps the alerts [state on restarts](#alerts-state-on-restarts);
* Graphite datasource can be used for alerting and recording rules. See [these docs](#graphite);
* Recording and Alerting rules backfilling (aka `replay`). See [these docs](#rules-backfilling);
* Lightweight without extra dependencies.
* Lightweight and without extra dependencies.
* Supports [reusable templates](#reusable-templates) for annotations.
## Limitations
* `vmalert` execute queries against remote datasource which has reliability risks because of the network.
It is recommended to configure alerts thresholds and rules expressions with the understanding that network
requests may fail;
It is recommended to configure alerts thresholds and rules expressions with the understanding that network
requests may fail;
* by default, rules execution is sequential within one group, but persistence of execution results to remote
storage is asynchronous. Hence, user shouldn't rely on chaining of recording rules when result of previous
recording rule is reused in the next one;
storage is asynchronous. Hence, user shouldn't rely on chaining of recording rules when result of previous
recording rule is reused in the next one;
## QuickStart
To build `vmalert` from sources:
```bash
```console
git clone https://github.com/VictoriaMetrics/VictoriaMetrics
cd VictoriaMetrics
make vmalert
@@ -46,37 +47,39 @@ The build binary will be placed in `VictoriaMetrics/bin` folder.
To start using `vmalert` you will need the following things:
* list of rules - PromQL/MetricsQL expressions to execute;
* datasource address - reachable MetricsQL endpoint to run queries against;
* datasource address - reachable endpoint with [Prometheus HTTP API](https://prometheus.io/docs/prometheus/latest/querying/api/#http-api) support for running queries against;
* notifier address [optional] - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
aggregating alerts, and sending notifications. Please note, notifier address also supports Consul Service Discovery via
[config file](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go).
aggregating alerts, and sending notifications. Please note, notifier address also supports Consul and DNS Service Discovery via
[config file](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go).
* remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
compatible storage to persist rules and alerts state info;
compatible storage to persist rules and alerts state info. To persist results to multiple destinations use vmagent
configured with multiple remote writes as a proxy;
* remote read address [optional] - MetricsQL compatible datasource to restore alerts state from.
Then configure `vmalert` accordingly:
```bash
```console
./bin/vmalert -rule=alert.rules \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://localhost:8428 \ # PromQL compatible datasource
-datasource.url=http://localhost:8428 \ # Prometheus HTTP API compatible datasource
-notifier.url=http://localhost:9093 \ # AlertManager URL (required if alerting rules are used)
-notifier.url=http://127.0.0.1:9093 \ # AlertManager replica URL
-remoteWrite.url=http://localhost:8428 \ # Remote write compatible storage to persist rules and alerts state info (required if recording rules are used)
-remoteRead.url=http://localhost:8428 \ # MetricsQL compatible datasource to restore alerts state from
-remoteRead.url=http://localhost:8428 \ # Prometheus HTTP API compatible datasource to restore alerts state from
-external.label=cluster=east-1 \ # External label to be applied for each rule
-external.label=replica=a # Multiple external labels may be set
```
Note there's a separate `remoteRead.url` to allow writing results of
Note there's a separate `-remoteWrite.url` command-line flag to allow writing results of
alerting/recording rules into a different storage than the initial data that's
queried. This allows using `vmalert` to aggregate data from a short-term,
high-frequency, high-cardinality storage into a long-term storage with
decreased cardinality and a bigger interval between samples.
See also [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html).
See the full list of configuration flags in [configuration](#configuration) section.
If you run multiple `vmalert` services for the same datastore or AlertManager - do not forget
to specify different `external.label` flags in order to define which `vmalert` generated rules or alerts.
to specify different `-external.label` command-line flags in order to define which `vmalert` generated rules or alerts.
Configuration for [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very
@@ -100,6 +103,10 @@ name: <string>
# How often rules in the group are evaluated.
[ interval: <duration> | default = -evaluationInterval flag ]
# Limit the number of alerts an alerting rule and series a recording
# rule can produce. 0 is no limit.
[ limit: <int> | default = 0 ]
# How many rules execute at once within a group. Increasing concurrency may speed
# up round execution speed.
[ concurrency: <integer> | default = 1 ]
@@ -108,24 +115,27 @@ name: <string>
# By default "prometheus" type is used.
[ type: <string> ]
# Warning: DEPRECATED
# Please use `params` instead:
# params:
# extra_label: ["job=nodeexporter", "env=prod"]
extra_filter_labels:
[ <labelname>: <labelvalue> ... ]
# Optional list of HTTP URL parameters
# applied for all rules requests within a group
# For example:
# params:
# nocache: ["1"] # disable caching for vmselect
# denyPartialResponse: ["true"] # fail if one or more vmstorage nodes returned an error
# extra_label: ["env=dev"] # apply additional label filter "env=dev" for all requests
# extra_label: ["env=dev"] # apply additional label filter "env=dev" for all requests
# see more details at https://docs.victoriametrics.com#prometheus-querying-api-enhancements
params:
[ <string>: [<string>, ...]]
# Optional list of HTTP headers in form `header-name: value`
# applied for all rules requests within a group
# For example:
# headers:
# - "CustomHeader: foo"
# - "CustomHeader2: bar"
# Headers set via this param have priority over headers set via `-datasource.headers` flag.
headers:
[ <string>, ...]
# Optional list of labels added to every rule within a group.
# It has priority over the external labels.
# Labels are commonly used for adding environment
@@ -140,18 +150,18 @@ rules:
### Rules
Every rule contains `expr` field for [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/)
or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expression. Vmalert will execute the configured
or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expression. `vmalert` will execute the configured
expression and then act according to the Rule type.
There are two types of Rules:
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
Alerting rules allow defining alert conditions via `expr` field and to send notifications to
[Alertmanager](https://github.com/prometheus/alertmanager) if execution result is not empty.
Alerting rules allow defining alert conditions via `expr` field and to send notifications to
[Alertmanager](https://github.com/prometheus/alertmanager) if execution result is not empty.
* [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
Recording rules allow defining `expr` which result will be then backfilled to configured
`-remoteWrite.url`. Recording rules are used to precompute frequently needed or computationally
expensive expressions and save their result as a new set of time series.
Recording rules allow defining `expr` which result will be then backfilled to configured
`-remoteWrite.url`. Recording rules are used to precompute frequently needed or computationally
expensive expressions and save their result as a new set of time series.
`vmalert` forbids defining duplicates - rules with the same combination of name, expression, and labels
within one group.
@@ -165,7 +175,7 @@ The syntax for alerting rule is the following:
alert: <string>
# The expression to evaluate. The expression language depends on the type value.
# By default PromQL/MetricsQL expression is used. If group.type="graphite", then the expression
# By default, PromQL/MetricsQL expression is used. If group.type="graphite", then the expression
# must contain valid Graphite expression.
expr: <string>
@@ -175,6 +185,18 @@ expr: <string>
# as firing once they return.
[ for: <duration> | default = 0s ]
# Whether to print debug information into logs.
# Information includes alerts state changes and requests sent to the datasource.
# Please note, that if rule's query params contain sensitive
# information - it will be printed to logs.
# Is applicable to alerting rules only.
[ debug: <bool> | default = false ]
# Defines the number of rule's updates entries stored in memory
# and available for view on rule's Details page.
# Overrides `rule.updateEntriesLimit` value for this specific rule.
[ update_entries_limit: <integer> | default 0 ]
# Labels to add or overwrite for each alert.
labels:
[ <labelname>: <tmpl_string> ]
@@ -184,10 +206,108 @@ annotations:
[ <labelname>: <tmpl_string> ]
```
It is allowed to use [Go templating](https://golang.org/pkg/text/template/) in annotations
to format data, iterate over it or execute expressions.
Additionally, `vmalert` provides some extra templating functions
listed [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/template_func.go).
#### Templating
It is allowed to use [Go templating](https://golang.org/pkg/text/template/) in annotations to format data, iterate over
or execute expressions.
The following variables are available in templating:
| Variable | Description | Example |
|------------------------------------|-----------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| $value or .Value | The current alert's value. Avoid using value in labels, it may cause unexpected issues. | {% raw %}Number of connections is {{ $value }}{% endraw %} |
| $activeAt or .ActiveAt | The moment of [time](https://pkg.go.dev/time) when alert became active (`pending` or `firing`). | {% raw %}http://vm-grafana.com/panelId=xx?from={{($activeAt.Add (parseDurationTime \"1h\")).Unix}}&to={{($activeAt.Add (parseDurationTime \"-1h\")).Unix}}{% endraw %} |
| $labels or .Labels | The list of labels of the current alert. Use as ".Labels.<label_name>". | {% raw %}Too high number of connections for {{ .Labels.instance }}{% endraw %} |
| $alertID or .AlertID | The current alert's ID generated by vmalert. | {% raw %}Link: vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}{% endraw %} |
| $groupID or .GroupID | The current alert's group ID generated by vmalert. | {% raw %}Link: vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}{% endraw %} |
| $expr or .Expr | Alert's expression. Can be used for generating links to Grafana or other systems. | {% raw %}/api/v1/query?query={{ $expr&#124;queryEscape }}{% endraw %} |
| $for or .For | Alert's configured for param. | {% raw %}Number of connections is too high for more than {{ .For }}{% endraw %} |
| $externalLabels or .ExternalLabels | List of labels configured via `-external.label` command-line flag. | {% raw %}Issues with {{ $labels.instance }} (datacenter-{{ $externalLabels.dc }}){% endraw %} |
| $externalURL or .ExternalURL | URL configured via `-external.url` command-line flag. Used for cases when vmalert is hidden behind proxy. | {% raw %}Visit {{ $externalURL }} for more details{% endraw %} |
Additionally, `vmalert` provides some extra templating functions listed [here](#template-functions) and [reusable templates](#reusable-templates).
#### Template functions
`vmalert` provides the following template functions, which can be used during [templating](#templating):
- `args arg0 ... argN` - converts the input args into a map with `arg0`, ..., `argN` keys.
- `externalURL` - returns the value of `-external.url` command-line flag.
- `first` - returns the first result from the input query results returned by `query` function.
- `htmlEscape` - escapes special chars in input string, so it can be safely embedded as a plaintext into HTML.
- `humanize` - converts the input number into human-readable format by adding [metric prefixes](https://en.wikipedia.org/wiki/Metric_prefix).
For example, `100000` is converted into `100K`.
- `humanize1024` - converts the input number into human-readable format with 1024 base.
For example, `1024` is converted into 1ki`.
- `humanizeDuration` - converts the input number in seconds into human-readable duration.
- `humanizePercentage` - converts the input number to percentage. For example, `0.123` is converted into `12.3%`.
- `humanizeTimestamp` - converts the input unix timestamp into human-readable time.
- `jsonEscape` - JSON-encodes the input string.
- `label name` - returns the value of the label with the given `name` from the input query result.
- `match regex` - matches the input string against the provided `regex`.
- `parseDuration` - parses the input string into duration in seconds. For example, `1h` is parsed into `3600`.
- `parseDurationTime` - parses the input string into [time.Duration](https://pkg.go.dev/time#Duration).
- `pathEscape` - escapes the input string, so it can be safely put inside path part of URL.
- `pathPrefix` - returns the path part of the `-external.url` command-line flag.
- `query` - executes the [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) query against `-datasource.url` and returns the query result.
For example, {% raw %}`{{ query "sort_desc(process_resident_memory_bytes)" | first | value }}`{% endraw %} executes the `sort_desc(process_resident_memory_bytes)`
query at `-datasource.url` and returns the first result.
- `queryEscape` - escapes the input string, so it can be safely put inside [query arg](https://en.wikipedia.org/wiki/Percent-encoding) part of URL.
- `quotesEscape` - escapes the input string, so it can be safely embedded into JSON string.
- `reReplaceAll regex repl` - replaces all the occurences of the `regex` in input string with the `repl`.
- `safeHtml` - marks the input string as safe to use in HTML context without the need to html-escape it.
- `sortByLabel name` - sorts the input query results by the label with the given `name`.
- `stripDomain` - leaves the first part of the domain. For example, `foo.bar.baz` is converted to `foo`.
The port part is left in the output string. E.g. `foo.bar:1234` is converted into `foo:1234`.
- `stripPort` - strips `port` part from `host:port` input string.
- `strvalue` - returns the metric name from the input query result.
- `title` - converts the first letters of every input word to uppercase.
- `toLower` - converts all the chars in the input string to lowercase.
- `toTime` - converts the input unix timestamp to [time.Time](https://pkg.go.dev/time#Time).
- `toUpper` - converts all the chars in the input string to uppercase.
- `value` - returns the numeric value from the input query result.
#### Reusable templates
Like in Alertmanager you can define [reusable templates](https://prometheus.io/docs/prometheus/latest/configuration/template_examples/#defining-reusable-templates)
to share same templates across annotations. Just define the templates in a file and
set the path via `-rule.templates` flag.
For example, template `grafana.filter` can be defined as following:
{% raw %}
```
{{ define "grafana.filter" -}}
{{- $labels := .arg0 -}}
{{- range $name, $label := . -}}
{{- if (ne $name "arg0") -}}
{{- ( or (index $labels $label) "All" ) | printf "&var-%s=%s" $label -}}
{{- end -}}
{{- end -}}
{{- end -}}
```
{% endraw %}
And then used in annotations:
{% raw %}
```yaml
groups:
- name: AlertGroupName
rules:
- alert: AlertName
expr: any_metric > 100
for: 30s
labels:
alertname: 'Any metric is too high'
severity: 'warning'
annotations:
dashboard: '{{ $externalURL }}/d/dashboard?orgId=1{{ template "grafana.filter" (args .CommonLabels "account_id" "any_label") }}'
```
{% endraw %}
The `-rule.templates` flag supports wildcards so multiple files with templates can be loaded.
The content of `-rule.templates` can be also [hot reloaded](#hot-config-reload).
#### Recording rules
@@ -198,13 +318,19 @@ The syntax for recording rules is following:
record: <string>
# The expression to evaluate. The expression language depends on the type value.
# By default MetricsQL expression is used. If group.type="graphite", then the expression
# By default, MetricsQL expression is used. If group.type="graphite", then the expression
# must contain valid Graphite expression.
expr: <string>
# Labels to add or overwrite before storing the result.
labels:
[ <labelname>: <labelvalue> ]
# Defines the number of rule's updates entries stored in memory
# and available for view on rule's Details page.
# Overrides `rule.updateEntriesLimit` value for this specific rule.
[ update_entries_limit: <integer> | default 0 ]
```
For recording rules to work `-remoteWrite.url` must be specified.
@@ -215,11 +341,11 @@ For recording rules to work `-remoteWrite.url` must be specified.
the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
These are regular time series and maybe queried from VM just as any other time series.
The state is stored to the configured address on every rule evaluation.
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
These are regular time series and maybe queried from VM just as any other time series.
The state is stored to the configured address on every rule evaluation.
* `-remoteRead.url` - URL to VictoriaMetrics (Single) or vmselect (Cluster). `vmalert` will try to restore alerts state
from configured address by querying time series with name `ALERTS_FOR_STATE`.
from configured address by querying time series with name `ALERTS_FOR_STATE`.
Both flags are required for proper state restoration. Restore process may fail if time series are missing
in configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`)
@@ -240,7 +366,7 @@ There are the following approaches exist for alerting and recording rules across
rules to `AccountID=123`.
* To specify `tenant` parameter per each alerting and recording group if
[enterprise version of vmalert](https://victoriametrics.com/products/enterprise/) is used
[enterprise version of vmalert](https://docs.victoriametrics.com/enterprise.html) is used
with `-clusterMode` command-line flag. For example:
```yaml
@@ -256,6 +382,10 @@ groups:
# Rules for accountID=456, projectID=789
```
The results of alerting and recording rules contain `vm_account_id` and `vm_project_id` labels
if `-clusterMode` is enabled. These labels can be used during [templating](https://docs.victoriametrics.com/vmalert.html#templating),
and help to identify to which account or project the triggered alert or produced recording belongs.
If `-clusterMode` is enabled, then `-datasource.url`, `-remoteRead.url` and `-remoteWrite.url` must
contain only the hostname without tenant id. For example: `-datasource.url=http://vmselect:8481`.
`vmalert` automatically adds the specified tenant to urls per each recording rule in this case.
@@ -275,7 +405,7 @@ for different scenarios.
Please note, not all flags in examples are required:
* `-remoteWrite.url` and `-remoteRead.url` are optional and are needed only if
you have recording rules or want to store [alerts state](#alerts-state-on-restarts) on `vmalert` restarts;
you have recording rules or want to store [alerts state](#alerts-state-on-restarts) on `vmalert` restarts;
* `-notifier.url` is optional and is needed only if you have alerting rules.
#### Single-node VictoriaMetrics
@@ -341,18 +471,54 @@ Alertmanagers.
To avoid recording rules results and alerts state duplication in VictoriaMetrics server
don't forget to configure [deduplication](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#deduplication).
The recommended value for `-dedup.minScrapeInterval` must be greater or equal to vmalert's `evaluation_interval`.
If you observe inconsistent or "jumping" values in series produced by vmalert, try disabling `-datasource.queryTimeAlignment`
command line flag. Because of alignment, two or more vmalert HA pairs will produce results with the same timestamps.
But due of backfilling (data delivered to the datasource with some delay) values of such results may differ,
which would affect deduplication logic and result into "jumping" datapoints.
Alertmanager will automatically deduplicate alerts with identical labels, so ensure that
all `vmalert`s are having the same config.
Don't forget to configure [cluster mode](https://prometheus.io/docs/alerting/latest/alertmanager/)
for Alertmanagers for better reliability.
for Alertmanagers for better reliability. List all Alertmanager URLs in vmalert's `-notifier.url`
to ensure [high availability](https://github.com/prometheus/alertmanager#high-availability).
This example uses single-node VM server for the sake of simplicity.
Check how to replace it with [cluster VictoriaMetrics](#cluster-victoriametrics) if needed.
#### Downsampling and aggregation via vmalert
`vmalert` can't modify existing data. But it can run arbitrary PromQL/MetricsQL queries
via [recording rules](#recording-rules) and backfill results to the configured `-remoteWrite.url`.
This ability allows to aggregate data. For example, the following rule will calculate the average value for
metric `http_requests` on the `5m` interval:
```yaml
- record: http_requests:avg5m
expr: avg_over_time(http_requests[5m])
```
Every time this rule will be evaluated, `vmalert` will backfill its results as a new time series `http_requests:avg5m`
to the configured `-remoteWrite.url`.
`vmalert` executes rules with specified interval (configured via flag `-evaluationInterval`
or as [group's](#groups) `interval` param). The interval helps to control "resolution" of the produced series.
This ability allows to downsample data. For example, the following config will execute the rule only once every `5m`:
```yaml
groups:
- name: my_group
interval: 5m
rules:
- record: http_requests:avg5m
expr: avg_over_time(http_requests[5m])
```
Ability of `vmalert` to be configured with different `-datasource.url` and `-remoteWrite.url` command-line flags
allows reading data from one data source and backfilling results to another. This helps to build a system
for aggregating and downsampling the data.
The following example shows how to build a topology where `vmalert` will process data from one cluster
and write results into another. Such clusters may be called as "hot" (low retention,
high-speed disks, used for operative monitoring) and "cold" (long term retention,
@@ -364,7 +530,7 @@ or reducing resolution) and push results to "cold" cluster.
```
./bin/vmalert -rule=downsampling-rules.yml \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://raw-cluster-vmselect:8481/select/0/prometheus # vmselect addr for executing recordi ng rules expressions
-datasource.url=http://raw-cluster-vmselect:8481/select/0/prometheus # vmselect addr for executing recording rules expressions
-remoteWrite.url=http://aggregated-cluster-vminsert:8480/insert/0/prometheus # vminsert addr to persist recording rules results
```
@@ -374,7 +540,22 @@ Please note, [replay](#rules-backfilling) feature may be used for transforming h
Flags `-remoteRead.url` and `-notifier.url` are omitted since we assume only recording rules are used.
See also [downsampling docs](https://docs.victoriametrics.com/#downsampling).
See also [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) and [downsampling](https://docs.victoriametrics.com/#downsampling).
#### Multiple remote writes
For persisting recording or alerting rule results `vmalert` requires `-remoteWrite.url` to be set.
But this flag supports only one destination. To persist rule results to multiple destinations
we recommend using [vmagent](https://docs.victoriametrics.com/vmagent.html) as fan-out proxy:
<img alt="vmalert multiple remote write destinations" src="vmalert_multiple_rw.png">
In this topology, `vmalert` is configured to persist rule results to `vmagent`. And `vmagent`
is configured to fan-out received data to two or more destinations.
Using `vmagent` as a proxy provides additional benefits such as
[data persisting when storage is unreachable](https://docs.victoriametrics.com/vmagent.html#replication-and-high-availability),
or time series modification via [relabeling](https://docs.victoriametrics.com/vmagent.html#relabeling).
### Web
@@ -383,11 +564,21 @@ See also [downsampling docs](https://docs.victoriametrics.com/#downsampling).
* `http://<vmalert-addr>` - UI;
* `http://<vmalert-addr>/api/v1/rules` - list of all loaded groups and rules;
* `http://<vmalert-addr>/api/v1/alerts` - list of all active alerts;
* `http://<vmalert-addr>/api/v1/<groupID>/<alertID>/status"` - get alert status by ID.
Used as alert source in AlertManager.
* `http://<vmalert-addr>/vmalert/api/v1/alert?group_id=<group_id>&alert_id=<alert_id>` - get alert status in JSON format.
Used as alert source in AlertManager.
* `http://<vmalert-addr>/vmalert/alert?group_id=<group_id>&alert_id=<alert_id>` - get alert status in web UI.
* `http://<vmalert-addr>/vmalert/rule?group_id=<group_id>&rule_id=<rule_id>` - get rule status in web UI.
* `http://<vmalert-addr>/metrics` - application metrics.
* `http://<vmalert-addr>/-/reload` - hot configuration reload.
`vmalert` web UI can be accessed from [single-node version of VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html)
and from [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html).
This may be used for better integraion with Grafana unified alerting system. See the following docs for details:
* [How to query vmalert from single-node VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmalert)
* [How to query vmalert from VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#vmalert)
## Graphite
vmalert sends requests to `<-datasource.url>/render?format=json` during evaluation of alerting and recording rules
@@ -410,7 +601,7 @@ To run vmalert in `replay` mode:
```
./bin/vmalert -rule=path/to/your.rules \ # path to files with rules you usually use with vmalert
-datasource.url=http://localhost:8428 \ # PromQL/MetricsQL compatible datasource
-datasource.url=http://localhost:8428 \ # Prometheus HTTP API compatible datasource
-remoteWrite.url=http://localhost:8428 \ # remote write compatible storage to persist results
-replay.timeFrom=2021-05-11T07:21:43Z \ # time from begin replay
-replay.timeTo=2021-05-29T18:40:43Z # time to finish replay
@@ -446,7 +637,7 @@ max range per request: 8h20m0s
In `replay` mode all groups are executed sequentially one-by-one. Rules within the group are
executed sequentially as well (`concurrency` setting is ignored). Vmalert sends rule's expression
to [/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries) endpoint
to [/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query) endpoint
of the configured `-datasource.url`. Returned data is then processed according to the rule type and
backfilled to `-remoteWrite.url` via [remote Write protocol](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations).
Vmalert respects `evaluationInterval` value set by flag or per-group during the replay.
@@ -473,32 +664,129 @@ Execute the query against storage which was used for `-remoteWrite.url` during t
There are following non-required `replay` flags:
* `-replay.maxDatapointsPerQuery` - the max number of data points expected to receive in one request.
In two words, it affects the max time range for every `/query_range` request. The higher the value,
the fewer requests will be issued during `replay`.
In two words, it affects the max time range for every `/query_range` request. The higher the value,
the fewer requests will be issued during `replay`.
* `-replay.ruleRetryAttempts` - when datasource fails to respond vmalert will make this number of retries
per rule before giving up.
per rule before giving up.
* `-replay.rulesDelay` - delay between sequential rules execution. Important in cases if there are chaining
(rules which depend on each other) rules. It is expected, that remote storage will be able to persist
previously accepted data during the delay, so data will be available for the subsequent queries.
Keep it equal or bigger than `-remoteWrite.flushInterval`.
(rules which depend on each other) rules. It is expected, that remote storage will be able to persist
previously accepted data during the delay, so data will be available for the subsequent queries.
Keep it equal or bigger than `-remoteWrite.flushInterval`.
* `-replay.disableProgressBar` - whether to disable progress bar which shows progress work.
Progress bar may generate a lot of log records, which is not formatted as standard VictoriaMetrics logger.
It could break logs parsing by external system and generate additional load on it.
See full description for these flags in `./vmalert --help`.
See full description for these flags in `./vmalert -help`.
### Limitations
* Graphite engine isn't supported yet;
* `query` template function is disabled for performance reasons (might be changed in future);
* `limit` group's param has no effect during replay (might be changed in future);
## Monitoring
`vmalert` exports various metrics in Prometheus exposition format at `http://vmalert-host:8880/metrics` page.
The default list of alerting rules for these metric can be found [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker).
We recommend setting up regular scraping of this page either through `vmagent` or by Prometheus so that the exported
metrics may be analyzed later.
Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/14950) for `vmalert` overview. Graphs on this dashboard contain useful hints - hover the `i` icon at the top left corner of each graph in order to read it.
Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/14950-victoriametrics-vmalert/) for `vmalert` overview.
Graphs on this dashboard contain useful hints - hover the `i` icon in the top left corner of each graph in order to read it.
If you have suggestions for improvements or have found a bug - please open an issue on github or add
a review to the dashboard.
## Troubleshooting
vmalert executes configured rules within certain intervals. It is expected that at the moment when rule is executed,
the data is already present in configured `-datasource.url`:
<img alt="vmalert expected evaluation" src="vmalert_ts_normal.gif">
Usually, troubles start to appear when data in `-datasource.url` is delayed or absent. In such cases, evaluations
may get empty response from datasource and produce empty recording rules or reset alerts state:
<img alt="vmalert evaluation when data is delayed" src="vmalert_ts_data_delay.gif">
By default, recently written samples to VictoriaMetrics aren't visible for queries for up to 30s.
This behavior is controlled by `-search.latencyOffset` command-line flag and the `latency_offset` query ag at `vmselect`.
Usually, this results into a 30s shift for recording rules results.
Note that too small value passed to `-search.latencyOffset` or to `latency_offest` query arg may lead to incomplete query results.
Try the following recommendations in such cases:
* Always configure group's `evaluationInterval` to be bigger or equal to `scrape_interval` at which metrics
are delivered to the datasource;
* If you know in advance, that data in datasource is delayed - try changing vmalert's `-datasource.lookback`
command-line flag to add a time shift for evaluations;
* If time intervals between datapoints in datasource are irregular or `>=5min` - try changing vmalert's
`-datasource.queryStep` command-line flag to specify how far search query can lookback for the recent datapoint.
The recommendation is to have the step at least two times bigger than `scrape_interval`, since
there are no guarantees that scrape will not fail.
Sometimes, it is not clear why some specific alert fired or didn't fire. It is very important to remember, that
alerts with `for: 0` fire immediately when their expression becomes true. And alerts with `for > 0` will fire only
after multiple consecutive evaluations, and at each evaluation their expression must be true. If at least one evaluation
becomes false, then alert's state resets to the initial state.
If `-remoteWrite.url` command-line flag is configured, vmalert will persist alert's state in form of time series
`ALERTS` and `ALERTS_FOR_STATE` to the specified destination. Such time series can be then queried via
[vmui](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmui) or Grafana to track how alerts state
changed in time.
vmalert stores last `-rule.updateEntriesLimit` (or `update_entries_limit` [per-rule config](https://docs.victoriametrics.com/vmalert.html#alerting-rules))
state updates for each rule. To check updates, click on `Details` link next to rule's name on `/vmalert/groups` page
and check the `Last updates` section:
<img alt="vmalert state" src="vmalert_state.png">
Rows in the section represent ordered rule evaluations and their results. The column `curl` contains an example of
HTTP request sent by vmalert to the `-datasource.url` during evaluation. If specific state shows that there were
no samples returned and curl command returns data - then it is very likely there was no data in datasource on the
moment when rule was evaluated.
vmalert allows configuring more detailed logging for specific alerting rule. Just set `debug: true` in rule's configuration
and vmalert will start printing additional log messages:
```terminal
2022-09-15T13:35:41.155Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:41+02:00: query returned 0 samples (elapsed: 5.896041ms)
2022-09-15T13:35:56.149Z DEBUG datasource request: executing POST request with params "denyPartialResponse=true&query=sum%28vm_tcplistener_conns%7Binstance%3D%22localhost%3A8429%22%7D%29+by%28instance%29+%3E+0&step=15s&time=1663248945"
2022-09-15T13:35:56.178Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:56+02:00: query returned 1 samples (elapsed: 28.368208ms)
2022-09-15T13:35:56.178Z DEBUG datasource request: executing POST request with params "denyPartialResponse=true&query=sum%28vm_tcplistener_conns%7Binstance%3D%22localhost%3A8429%22%7D%29&step=15s&time=1663248945"
2022-09-15T13:35:56.179Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:35:56+02:00: alert 10705778000901301787 {alertgroup="TestGroup",alertname="Conns",cluster="east-1",instance="localhost:8429",replica="a"} created in state PENDING
...
2022-09-15T13:36:56.153Z DEBUG rule "TestGroup":"Conns" (2601299393013563564) at 2022-09-15T15:36:56+02:00: alert 10705778000901301787 {alertgroup="TestGroup",alertname="Conns",cluster="east-1",instance="localhost:8429",replica="a"} PENDING => FIRING: 1m0s since becoming active at 2022-09-15 15:35:56.126006 +0200 CEST m=+39.384575417
```
## Profiling
`vmalert` provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs):
* Memory profile. It can be collected with the following command (replace `0.0.0.0` with hostname if needed):
<div class="with-copy" markdown="1">
```console
curl http://0.0.0.0:8880/debug/pprof/heap > mem.pprof
```
</div>
* CPU profile. It can be collected with the following command (replace `0.0.0.0` with hostname if needed):
<div class="with-copy" markdown="1">
```console
curl http://0.0.0.0:8880/debug/pprof/profile > cpu.pprof
```
</div>
The command for collecting CPU profile waits for 30 seconds before returning.
The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
It is safe sharing the collected profiles from security point of view, since they do not contain sensitive information.
## Configuration
### Flags
@@ -507,10 +795,10 @@ Pass `-help` to `vmalert` in order to see the full list of supported
command-line flags with their descriptions.
The shortlist of configuration flags is the following:
{% raw %}
```
-clusterMode
If clusterMode is enabled, then vmalert automatically adds the tenant specified in config groups to -datasource.url, -remoteWrite.url and -remoteRead.url. See https://docs.victoriametrics.com/vmalert.html#multitenancy
If clusterMode is enabled, then vmalert automatically adds the tenant specified in config groups to -datasource.url, -remoteWrite.url and -remoteRead.url. See https://docs.victoriametrics.com/vmalert.html#multitenancy . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-configCheckInterval duration
Interval for checking for changes in '-rule' or '-notifier.config' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes.
-datasource.appendTypePrefix
@@ -527,6 +815,8 @@ The shortlist of configuration flags is the following:
Optional path to bearer token file to use for -datasource.url.
-datasource.disableKeepAlive
Whether to disable long-lived connections to the datasource. If true, disables HTTP keep-alives and will only use the connection to the server for a single HTTP request.
-datasource.headers string
Optional HTTP extraHeaders to send with each request to the corresponding -datasource.url. For example, -datasource.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -datasource.url. Multiple headers must be delimited by '^^': -datasource.headers='header1:value1^^header2:value2'
-datasource.lookback duration
Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query.
-datasource.maxIdleConnections int
@@ -542,11 +832,13 @@ The shortlist of configuration flags is the following:
-datasource.oauth2.tokenUrl string
Optional OAuth2 tokenURL to use for -datasource.url.
-datasource.queryStep duration
queryStep defines how far a value can fallback to when evaluating queries. For example, if datasource.queryStep=15s then param "step" with value "15s" will be added to every query.If queryStep isn't specified, rule's evaluationInterval will be used instead.
How far a value can fallback to when evaluating queries. For example, if -datasource.queryStep=15s then param "step" with value "15s" will be added to every query. If set to 0, rule's evaluation interval will be used instead. (default 5m0s)
-datasource.queryTimeAlignment
Whether to align "time" parameter with evaluation interval.Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257 (default true)
-datasource.roundDigits int
Adds "round_digits" GET param to datasource requests. In VM "round_digits" limits the number of digits after the decimal point in response values.
-datasource.showURL
Whether to show -datasource.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key
-datasource.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -datasource.url. By default, system CA is used
-datasource.tlsCertFile string
@@ -558,14 +850,14 @@ The shortlist of configuration flags is the following:
-datasource.tlsServerName string
Optional TLS server name to use for connections to -datasource.url. By default, the server name from -datasource.url is used
-datasource.url string
VictoriaMetrics or vmselect url. Required parameter. E.g. http://127.0.0.1:8428
Datasource compatible with Prometheus HTTP API. It can be single node VictoriaMetrics or vmselect URL. Required parameter. E.g. http://127.0.0.1:8428 . See also -remoteRead.disablePathAppend and -datasource.showURL
-defaultTenant.graphite string
Default tenant for Graphite alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy
Default tenant for Graphite alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy .This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-defaultTenant.prometheus string
Default tenant for Prometheus alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy
Default tenant for Prometheus alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-disableAlertgroupLabel
Whether to disable adding group's Name as label to generated alerts and time series.
-dryRun -rule
-dryRun
Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
@@ -574,17 +866,18 @@ The shortlist of configuration flags is the following:
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf . This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
-evaluationInterval duration
How often to evaluate the rules (default 1m0s)
-external.alert.source string
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|queryEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service. Supports templating - see https://docs.victoriametrics.com/vmalert.html#templating . For example, link to Grafana: -external.alert.source='explore?orgId=1&left=["now-1h","now","VictoriaMetrics",{"expr":{{$expr|jsonEscape|queryEscape}} },{"mode":"Metrics"},{"ui":[true,true,true,"none"]}]' . If empty 'vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}' is used.
-external.label array
Optional label in the form 'Name=value' to add to all generated recording rules and alerts. Pass multiple -label flags in order to add multiple label sets.
Supports an array of values separated by comma or specified via multiple flags.
-external.url string
External URL is used as alert's source for sent alerts to the notifier
-flagsAuthKey string
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
@@ -611,6 +904,8 @@ The shortlist of configuration flags is the following:
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerJSONFields string
Allows renaming fields in JSON formatted logs. Example: "ts:timestamp,msg:message" renames "ts" to "timestamp" and "msg" to "message". Supported fields: ts, level, caller, msg
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
@@ -621,11 +916,11 @@ The shortlist of configuration flags is the following:
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-notifier.basicAuth.password array
Optional basic auth password for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
@@ -676,18 +971,28 @@ The shortlist of configuration flags is the following:
Optional TLS server name to use for connections to -notifier.url. By default the server name from -notifier.url is used
Supports an array of values separated by comma or specified via multiple flags.
-notifier.url array
Prometheus alertmanager URL, e.g. http://127.0.0.1:9093
Prometheus Alertmanager URL, e.g. http://127.0.0.1:9093. List all Alertmanager URLs if it runs in the cluster mode to ensure high availability.
Supports an array of values separated by comma or specified via multiple flags.
-pprofAuthKey string
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
-promscrape.consul.waitTime duration
Wait time used by Consul service discovery. Default value is used if not set
-promscrape.consulSDCheckInterval duration
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#consul_sd_configs for details (default 30s)
-promscrape.discovery.concurrency int
The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100)
-promscrape.discovery.concurrentWaitTime duration
The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s)
-promscrape.dnsSDCheckInterval duration
Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#dns_sd_configs for details (default 30s)
-pushmetrics.extraLabel array
Optional labels to add to metrics pushed to -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
-pushmetrics.interval duration
Interval for pushing metrics to -pushmetrics.url (default 10s)
-pushmetrics.url array
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default metrics exposed at /metrics page aren't pushed to any remote storage
Supports an array of values separated by comma or specified via multiple flags.
-remoteRead.basicAuth.password string
Optional basic auth password for -remoteRead.url
-remoteRead.basicAuth.passwordFile string
@@ -699,7 +1004,9 @@ The shortlist of configuration flags is the following:
-remoteRead.bearerTokenFile string
Optional path to bearer token file to use for -remoteRead.url.
-remoteRead.disablePathAppend
Whether to disable automatic appending of '/api/v1/query' path to the configured -remoteRead.url.
Whether to disable automatic appending of '/api/v1/query' path to the configured -datasource.url and -remoteRead.url
-remoteRead.headers string
Optional HTTP headers to send with each request to the corresponding -remoteRead.url. For example, -remoteRead.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -remoteRead.url. Multiple headers must be delimited by '^^': -remoteRead.headers='header1:value1^^header2:value2'
-remoteRead.ignoreRestoreErrors
Whether to ignore errors from remote storage when restoring alerts state on startup. (default true)
-remoteRead.lookback duration
@@ -714,6 +1021,8 @@ The shortlist of configuration flags is the following:
Optional OAuth2 scopes to use for -remoteRead.url. Scopes must be delimited by ';'.
-remoteRead.oauth2.tokenUrl string
Optional OAuth2 tokenURL to use for -remoteRead.url.
-remoteRead.showURL
Whether to show -remoteRead.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key
-remoteRead.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -remoteRead.url. By default system CA is used
-remoteRead.tlsCertFile string
@@ -725,7 +1034,7 @@ The shortlist of configuration flags is the following:
-remoteRead.tlsServerName string
Optional TLS server name to use for connections to -remoteRead.url. By default the server name from -remoteRead.url is used
-remoteRead.url vmalert
Optional URL to VictoriaMetrics or vmselect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428. See also -remoteRead.disablePathAppend
Optional URL to datasource compatible with Prometheus HTTP API. It can be single node VictoriaMetrics or vmselect.Remote read is used to restore alerts state.This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428. See also '-remoteRead.disablePathAppend', '-remoteRead.showURL'.
-remoteWrite.basicAuth.password string
Optional basic auth password for -remoteWrite.url
-remoteWrite.basicAuth.passwordFile string
@@ -742,6 +1051,8 @@ The shortlist of configuration flags is the following:
Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.
-remoteWrite.flushInterval duration
Defines interval of flushes to remote write endpoint (default 5s)
-remoteWrite.headers string
Optional HTTP headers to send with each request to the corresponding -remoteWrite.url. For example, -remoteWrite.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -remoteWrite.url. Multiple headers must be delimited by '^^': -remoteWrite.headers='header1:value1^^header2:value2'
-remoteWrite.maxBatchSize int
Defines defines max number of timeseries to be flushed at once (default 1000)
-remoteWrite.maxQueueSize int
@@ -756,6 +1067,10 @@ The shortlist of configuration flags is the following:
Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'.
-remoteWrite.oauth2.tokenUrl string
Optional OAuth2 tokenURL to use for -notifier.url.
-remoteWrite.sendTimeout duration
Timeout for sending data to the configured -remoteWrite.url. (default 30s)
-remoteWrite.showURL
Whether to show -remoteWrite.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key
-remoteWrite.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. By default system CA is used
-remoteWrite.tlsCertFile string
@@ -767,7 +1082,9 @@ The shortlist of configuration flags is the following:
-remoteWrite.tlsServerName string
Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used
-remoteWrite.url string
Optional URL to VictoriaMetrics or vminsert where to persist alerts state and recording rules results in form of timeseries. For example, if -remoteWrite.url=http://127.0.0.1:8428 is specified, then the alerts state will be written to http://127.0.0.1:8428/api/v1/write . See also -remoteWrite.disablePathAppend
Optional URL to VictoriaMetrics or vminsert where to persist alerts state and recording rules results in form of timeseries. For example, if -remoteWrite.url=http://127.0.0.1:8428 is specified, then the alerts state will be written to http://127.0.0.1:8428/api/v1/write . See also -remoteWrite.disablePathAppend, '-remoteWrite.showURL'.
-replay.disableProgressBar
Whether to disable rendering progress bars during the replay. Progress bar rendering might be verbose or break the logs parsing, so it is recommended to be disabled when not used in interactive mode.
-replay.maxDatapointsPerQuery int
Max number of data points expected in one request. The higher the value, the less requests will be made during replay. (default 1000)
-replay.ruleRetryAttempts int
@@ -793,19 +1110,35 @@ The shortlist of configuration flags is the following:
Limits the maximum duration for automatic alert expiration, which is by default equal to 3 evaluation intervals of the parent group.
-rule.resendDelay duration
Minimum amount of time to wait before resending an alert to notifier
-rule.templates array
Path or glob pattern to location with go template definitions
for rules annotations templating. Flag can be specified multiple times.
Examples:
-rule.templates="/path/to/file". Path to a single file with go templates
-rule.templates="dir/*.tpl" -rule.templates="/*.tpl". Relative path to all .tpl files in "dir" folder,
absolute path to all .tpl files in root.
Supports an array of values separated by comma or specified via multiple flags.
-rule.updateEntriesLimit int
Defines the max number of rule's state updates stored in-memory. Rule's updates are available on rule's Details page and are used for debugging purposes. The number of stored updates can be overriden per rule via update_entries_limit param. (default 20)
-rule.validateExpressions
Whether to validate rules expressions via MetricsQL engine (default true)
-rule.validateTemplates
Whether to validate annotation and label templates (default true)
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
Path to file with TLS certificate if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
-tlsCipherSuites array
Optional list of TLS cipher suites for incoming requests over HTTPS if -tls is set. See the list of supported cipher suites at https://pkg.go.dev/crypto/tls#pkg-constants
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-tlsMinVersion string
Optional minimum TLS version to use for incoming requests over HTTPS if -tls is set. Supported values: TLS10, TLS11, TLS12, TLS13
-version
Show VictoriaMetrics version
```
{% endraw %}
### Hot config reload
@@ -814,7 +1147,7 @@ The shortlist of configuration flags is the following:
* send SIGHUP signal to `vmalert` process;
* send GET request to `/-/reload` endpoint;
* configure `-configCheckInterval` flag for periodic reload
on config change.
on config change.
### URL params
@@ -846,12 +1179,13 @@ Notifier also supports configuration via file specified with flag `notifier.conf
-notifier.config=app/vmalert/notifier/testdata/consul.good.yaml
```
The configuration file allows to configure static notifiers or discover notifiers via
[Consul](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config).
The configuration file allows to configure static notifiers, discover notifiers via
[Consul](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config)
and [DNS](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config):
For example:
```
static_configs:
static_configs:
- targets:
- localhost:9093
- localhost:9095
@@ -860,9 +1194,17 @@ consul_sd_configs:
- server: localhost:8500
services:
- alertmanager
dns_sd_configs:
- names:
- my.domain.com
type: 'A'
port: 9093
```
The list of configured or discovered Notifiers can be explored via [UI](#Web).
If Alertmanager runs in cluster mode then all its URLs needs to be available during discovery
to ensure [high availability](https://github.com/prometheus/alertmanager#high-availability).
The configuration file [specification](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go)
is the following:
@@ -882,7 +1224,7 @@ is the following:
# password and password_file are mutually exclusive.
basic_auth:
[ username: <string> ]
[ password: <secret> ]
[ password: <string> ]
[ password_file: <string> ]
# Optional `Authorization` header configuration.
@@ -901,16 +1243,52 @@ authorization:
tls_config:
[ <tls_config> ]
# Configures Bearer authentication token via string
bearer_token: <string>
# or by passing path to the file with token.
bearer_token_file: <string>
# Configures OAuth 2.0 authentication
# see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#oauth2
oauth2:
[ <oauth2_config> ]
# Optional list of HTTP headers in form `header-name: value`
# applied for all requests to notifiers
# For example:
# headers:
# - "CustomHeader: foo"
# - "CustomHeader2: bar"
headers:
[ <string>, ...]
# List of labeled statically configured Notifiers.
#
# Each list of targets may be additionally instructed with
# authorization params. Target's authorization params will
# inherit params from global authorization params if there
# are no conflicts.
static_configs:
targets:
[ - '<host>' ]
[ - targets: ]
[ - '<host>' ]
[ oauth2 ]
[ basic_auth ]
[ authorization ]
[ tls_config ]
[ bearer_token ]
[ bearer_token_file ]
[ headers ]
# List of Consul service discovery configurations.
# See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
consul_sd_configs:
[ - <consul_sd_config> ... ]
# List of DNS service discovery configurations.
# See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config
dns_sd_configs:
[ - <dns_sd_config> ... ]
# List of relabel configurations for entities discovered via service discovery.
# Supports the same relabeling features as the rest of VictoriaMetrics components.
# See https://docs.victoriametrics.com/vmagent.html#relabeling
@@ -939,9 +1317,28 @@ It is recommended using
* `vmalert` is located in `vmutils-*` archives there.
### Docker image
You can build `vmalert` docker image from source and push it to your own docker repository.
Run the following commands from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics):
```console
make package-vmalert
docker tag victoria-metrics/vmalert:version my-repo:my-version-name
docker push my-repo:my-version-name
```
To run the built image in `victoria-metrics-k8s-stack` or `VMAlert` CR object apply the following config change:
```yaml
kind: VMAlert
spec:
image:
repository: my-repo
tag: my-version-name
```
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmalert` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert` binary and puts it into the `bin` folder.
@@ -957,12 +1354,12 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17.
2. Run `make vmalert-arm` or `make vmalert-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-arm` or `vmalert-arm64` binary respectively and puts it into the `bin` folder.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.19.
2. Run `make vmalert-linux-arm` or `make vmalert-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-linux-arm` or `vmalert-linux-arm64` binary respectively and puts it into the `bin` folder.
### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmalert-arm-prod` or `make vmalert-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-arm-prod` or `vmalert-arm64-prod` binary respectively and puts it into the `bin` folder.
2. Run `make vmalert-linux-arm-prod` or `make vmalert-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-linux-arm-prod` or `vmalert-linux-arm64-prod` binary respectively and puts it into the `bin` folder.

View File

@@ -6,12 +6,14 @@ import (
"hash/fnv"
"sort"
"strconv"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
@@ -19,7 +21,7 @@ import (
// AlertingRule is basic alert entity
type AlertingRule struct {
Type datasource.Type
Type config.Type
RuleID uint64
Name string
Expr string
@@ -29,24 +31,17 @@ type AlertingRule struct {
GroupID uint64
GroupName string
EvalInterval time.Duration
Debug bool
q datasource.Querier
// guard status fields
mu sync.RWMutex
alertsMu sync.RWMutex
// stores list of active alerts
alerts map[uint64]*notifier.Alert
// stores last moment of time Exec was called
lastExecTime time.Time
// stores the duration of the last Exec call
lastExecDuration time.Duration
// stores last error that happened in Exec func
// resets on every successful Exec
// may be used as Health state
lastExecError error
// stores the number of samples returned during
// the last evaluation
lastExecSamples int
// state stores recent state changes
// during evaluations
state *ruleState
metrics *alertingRuleMetrics
}
@@ -70,20 +65,29 @@ func newAlertingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule
GroupID: group.ID(),
GroupName: group.Name,
EvalInterval: group.Interval,
Debug: cfg.Debug,
q: qb.BuildWithParams(datasource.QuerierParams{
DataSourceType: &group.Type,
DataSourceType: group.Type.String(),
EvaluationInterval: group.Interval,
QueryParams: group.Params,
Headers: group.Headers,
Debug: cfg.Debug,
}),
alerts: make(map[uint64]*notifier.Alert),
metrics: &alertingRuleMetrics{},
}
if cfg.UpdateEntriesLimit != nil {
ar.state = newRuleState(*cfg.UpdateEntriesLimit)
} else {
ar.state = newRuleState(*ruleUpdateEntriesLimit)
}
labels := fmt.Sprintf(`alertname=%q, group=%q, id="%d"`, ar.Name, group.Name, ar.ID())
ar.metrics.pending = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_alerts_pending{%s}`, labels),
func() float64 {
ar.mu.RLock()
defer ar.mu.RUnlock()
ar.alertsMu.RLock()
defer ar.alertsMu.RUnlock()
var num int
for _, a := range ar.alerts {
if a.State == notifier.StatePending {
@@ -94,8 +98,8 @@ func newAlertingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule
})
ar.metrics.active = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_alerts_firing{%s}`, labels),
func() float64 {
ar.mu.RLock()
defer ar.mu.RUnlock()
ar.alertsMu.RLock()
defer ar.alertsMu.RUnlock()
var num int
for _, a := range ar.alerts {
if a.State == notifier.StateFiring {
@@ -106,18 +110,16 @@ func newAlertingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule
})
ar.metrics.errors = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_alerting_rules_error{%s}`, labels),
func() float64 {
ar.mu.RLock()
defer ar.mu.RUnlock()
if ar.lastExecError == nil {
e := ar.state.getLast()
if e.err == nil {
return 0
}
return 1
})
ar.metrics.samples = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_alerting_rules_last_evaluation_samples{%s}`, labels),
func() float64 {
ar.mu.RLock()
defer ar.mu.RUnlock()
return float64(ar.lastExecSamples)
e := ar.state.getLast()
return float64(e.samples)
})
return ar
}
@@ -141,28 +143,58 @@ func (ar *AlertingRule) ID() uint64 {
return ar.RuleID
}
func (ar *AlertingRule) logDebugf(at time.Time, a *notifier.Alert, format string, args ...interface{}) {
if !ar.Debug {
return
}
prefix := fmt.Sprintf("DEBUG rule %q:%q (%d) at %v: ",
ar.GroupName, ar.Name, ar.RuleID, at.Format(time.RFC3339))
if a != nil {
labelKeys := make([]string, len(a.Labels))
var i int
for k := range a.Labels {
labelKeys[i] = k
i++
}
sort.Strings(labelKeys)
labels := make([]string, len(labelKeys))
for i, l := range labelKeys {
labels[i] = fmt.Sprintf("%s=%q", l, a.Labels[l])
}
labelsStr := strings.Join(labels, ",")
prefix += fmt.Sprintf("alert %d {%s} ", a.ID, labelsStr)
}
msg := fmt.Sprintf(format, args...)
logger.Infof("%s", prefix+msg)
}
type labelSet struct {
// origin labels from series
// used for templating
// origin labels extracted from received time series
// plus extra labels (group labels, service labels like alertNameLabel).
// in case of conflicts, origin labels from time series preferred.
// used for templating annotations
origin map[string]string
// processed labels with additional data
// used as Alert labels
// processed labels includes origin labels
// plus extra labels (group labels, service labels like alertNameLabel).
// in case of conflicts, extra labels are preferred.
// used as labels attached to notifier.Alert and ALERTS series written to remote storage.
processed map[string]string
}
// toLabels converts labels from given Metric
// to labelSet which contains original and processed labels.
func (ar *AlertingRule) toLabels(m datasource.Metric, qFn notifier.QueryFn) (*labelSet, error) {
func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*labelSet, error) {
ls := &labelSet{
origin: make(map[string]string, len(m.Labels)),
origin: make(map[string]string),
processed: make(map[string]string),
}
for _, l := range m.Labels {
ls.origin[l.Name] = l.Value
// drop __name__ to be consistent with Prometheus alerting
if l.Name == "__name__" {
continue
}
ls.origin[l.Name] = l.Value
ls.processed[l.Name] = l.Value
}
@@ -176,14 +208,23 @@ func (ar *AlertingRule) toLabels(m datasource.Metric, qFn notifier.QueryFn) (*la
}
for k, v := range extraLabels {
ls.processed[k] = v
if _, ok := ls.origin[k]; !ok {
ls.origin[k] = v
}
}
// set additional labels to identify group and rule name
if ar.Name != "" {
ls.processed[alertNameLabel] = ar.Name
if _, ok := ls.origin[alertNameLabel]; !ok {
ls.origin[alertNameLabel] = ar.Name
}
}
if !*disableAlertGroupLabel && ar.GroupName != "" {
ls.processed[alertGroupNameLabel] = ar.GroupName
if _, ok := ls.origin[alertGroupNameLabel]; !ok {
ls.origin[alertGroupNameLabel] = ar.GroupName
}
}
return ls, nil
}
@@ -239,41 +280,57 @@ const resolvedRetention = 15 * time.Minute
// Exec executes AlertingRule expression via the given Querier.
// Based on the Querier results AlertingRule maintains notifier.Alerts
func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time) ([]prompbmarshal.TimeSeries, error) {
func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time, limit int) ([]prompbmarshal.TimeSeries, error) {
start := time.Now()
qMetrics, err := ar.q.Query(ctx, ar.Expr, ts)
ar.mu.Lock()
defer ar.mu.Unlock()
qMetrics, req, err := ar.q.Query(ctx, ar.Expr, ts)
curState := ruleStateEntry{
time: start,
at: ts,
duration: time.Since(start),
samples: len(qMetrics),
err: err,
curl: requestToCurl(req),
}
defer func() {
ar.state.add(curState)
}()
ar.alertsMu.Lock()
defer ar.alertsMu.Unlock()
ar.lastExecTime = start
ar.lastExecDuration = time.Since(start)
ar.lastExecError = err
ar.lastExecSamples = len(qMetrics)
if err != nil {
return nil, fmt.Errorf("failed to execute query %q: %w", ar.Expr, err)
}
ar.logDebugf(ts, nil, "query returned %d samples (elapsed: %s)", curState.samples, curState.duration)
for h, a := range ar.alerts {
// cleanup inactive alerts from previous Exec
if a.State == notifier.StateInactive && ts.Sub(a.ResolvedAt) > resolvedRetention {
ar.logDebugf(ts, a, "deleted as inactive")
delete(ar.alerts, h)
}
}
qFn := func(query string) ([]datasource.Metric, error) { return ar.q.Query(ctx, query, ts) }
qFn := func(query string) ([]datasource.Metric, error) {
res, _, err := ar.q.Query(ctx, query, ts)
return res, err
}
updated := make(map[uint64]struct{})
// update list of active alerts
for _, m := range qMetrics {
ls, err := ar.toLabels(m, qFn)
if err != nil {
return nil, fmt.Errorf("failed to expand labels: %s", err)
curState.err = fmt.Errorf("failed to expand labels: %s", err)
return nil, curState.err
}
h := hash(ls.processed)
if _, ok := updated[h]; ok {
// duplicate may be caused by extra labels
// conflicting with the metric labels
ar.lastExecError = fmt.Errorf("labels %v: %w", ls.processed, errDuplicate)
return nil, ar.lastExecError
curState.err = fmt.Errorf("labels %v: %w", ls.processed, errDuplicate)
return nil, curState.err
}
updated[h] = struct{}{}
if a, ok := ar.alerts[h]; ok {
@@ -283,30 +340,28 @@ func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time) ([]prompbmarshal
// back to notifier.StatePending
a.State = notifier.StatePending
a.ActiveAt = ts
ar.logDebugf(ts, a, "INACTIVE => PENDING")
}
if a.Value != m.Values[0] {
// update Value field with latest value
a.Value = m.Values[0]
// and re-exec template since Value can be used
// in annotations
a.Annotations, err = a.ExecTemplate(qFn, ls.origin, ar.Annotations)
if err != nil {
return nil, err
}
a.Value = m.Values[0]
// re-exec template since Value or query can be used in annotations
a.Annotations, err = a.ExecTemplate(qFn, ls.origin, ar.Annotations)
if err != nil {
return nil, err
}
continue
}
a, err := ar.newAlert(m, ls, ar.lastExecTime, qFn)
a, err := ar.newAlert(m, ls, start, qFn)
if err != nil {
ar.lastExecError = err
return nil, fmt.Errorf("failed to create alert: %w", err)
curState.err = fmt.Errorf("failed to create alert: %w", err)
return nil, curState.err
}
a.ID = h
a.State = notifier.StatePending
a.ActiveAt = ts
ar.alerts[h] = a
ar.logDebugf(ts, a, "created in state PENDING")
}
var numActivePending int
for h, a := range ar.alerts {
// if alert wasn't updated in this iteration
// means it is resolved already
@@ -315,20 +370,29 @@ func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time) ([]prompbmarshal
// alert was in Pending state - it is not
// active anymore
delete(ar.alerts, h)
ar.logDebugf(ts, a, "PENDING => DELETED: is absent in current evaluation round")
continue
}
if a.State == notifier.StateFiring {
a.State = notifier.StateInactive
a.ResolvedAt = ts
ar.logDebugf(ts, a, "FIRING => INACTIVE: is absent in current evaluation round")
}
continue
}
if a.State == notifier.StatePending && time.Since(a.ActiveAt) >= ar.For {
numActivePending++
if a.State == notifier.StatePending && ts.Sub(a.ActiveAt) >= ar.For {
a.State = notifier.StateFiring
a.Start = ts
alertsFired.Inc()
ar.logDebugf(ts, a, "PENDING => FIRING: %s since becoming active at %v", ts.Sub(a.ActiveAt), a.ActiveAt)
}
}
if limit > 0 && numActivePending > limit {
ar.alerts = map[uint64]*notifier.Alert{}
curState.err = fmt.Errorf("exec exceeded limit of %d with %d alerts", limit, numActivePending)
return nil, curState.err
}
return ar.toTimeSeries(ts.Unix()), nil
}
@@ -382,7 +446,7 @@ func hash(labels map[string]string) uint64 {
return hash.Sum64()
}
func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.Time, qFn notifier.QueryFn) (*notifier.Alert, error) {
func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.Time, qFn templates.QueryFn) (*notifier.Alert, error) {
var err error
if ls == nil {
ls, err = ar.toLabels(m, qFn)
@@ -397,6 +461,7 @@ func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.T
Value: m.Values[0],
ActiveAt: start,
Expr: ar.Expr,
For: ar.For,
}
a.Annotations, err = a.ExecTemplate(qFn, ls.origin, ar.Annotations)
return a, err
@@ -404,8 +469,8 @@ func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.T
// AlertAPI generates APIAlert object from alert by its id(hash)
func (ar *AlertingRule) AlertAPI(id uint64) *APIAlert {
ar.mu.RLock()
defer ar.mu.RUnlock()
ar.alertsMu.RLock()
defer ar.alertsMu.RUnlock()
a, ok := ar.alerts[id]
if !ok {
return nil
@@ -413,9 +478,10 @@ func (ar *AlertingRule) AlertAPI(id uint64) *APIAlert {
return ar.newAlertAPI(*a)
}
// ToAPI returns Rule representation in form
// of APIRule
// ToAPI returns Rule representation in form of APIRule
// Isn't thread-safe. Call must be protected by AlertingRule mutex.
func (ar *AlertingRule) ToAPI() APIRule {
lastState := ar.state.getLast()
r := APIRule{
Type: "alerting",
DatasourceType: ar.Type.String(),
@@ -424,19 +490,21 @@ func (ar *AlertingRule) ToAPI() APIRule {
Duration: ar.For.Seconds(),
Labels: ar.Labels,
Annotations: ar.Annotations,
LastEvaluation: ar.lastExecTime,
EvaluationTime: ar.lastExecDuration.Seconds(),
LastEvaluation: lastState.time,
EvaluationTime: lastState.duration.Seconds(),
Health: "ok",
State: "inactive",
Alerts: ar.AlertsToAPI(),
LastSamples: ar.lastExecSamples,
LastSamples: lastState.samples,
MaxUpdates: ar.state.size(),
Updates: ar.state.getAll(),
// encode as strings to avoid rounding in JSON
ID: fmt.Sprintf("%d", ar.ID()),
GroupID: fmt.Sprintf("%d", ar.GroupID),
}
if ar.lastExecError != nil {
r.LastError = ar.lastExecError.Error()
if lastState.err != nil {
r.LastError = lastState.err.Error()
r.Health = "err"
}
// satisfy APIRule.State logic
@@ -456,14 +524,14 @@ func (ar *AlertingRule) ToAPI() APIRule {
// AlertsToAPI generates list of APIAlert objects from existing alerts
func (ar *AlertingRule) AlertsToAPI() []*APIAlert {
var alerts []*APIAlert
ar.mu.RLock()
ar.alertsMu.RLock()
for _, a := range ar.alerts {
if a.State == notifier.StateInactive {
continue
}
alerts = append(alerts, ar.newAlertAPI(*a))
}
ar.mu.RUnlock()
ar.alertsMu.RUnlock()
return alerts
}
@@ -546,7 +614,10 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
}
ts := time.Now()
qFn := func(query string) ([]datasource.Metric, error) { return ar.q.Query(ctx, query, ts) }
qFn := func(query string) ([]datasource.Metric, error) {
res, _, err := ar.q.Query(ctx, query, ts)
return res, err
}
// account for external labels in filter
var labelsFilter string
@@ -554,12 +625,9 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
labelsFilter += fmt.Sprintf(",%s=%q", k, v)
}
// Get the last data point in range via MetricsQL `last_over_time`.
// We don't use plain PromQL since Prometheus doesn't support
// remote write protocol which is used for state persistence in vmalert.
expr := fmt.Sprintf("last_over_time(%s{alertname=%q%s}[%ds])",
alertForStateMetricName, ar.Name, labelsFilter, int(lookback.Seconds()))
qMetrics, err := q.Query(ctx, expr, ts)
qMetrics, _, err := q.Query(ctx, expr, ts)
if err != nil {
return err
}

View File

@@ -304,7 +304,7 @@ func TestAlertingRule_Exec(t *testing.T) {
for _, step := range tc.steps {
fq.reset()
fq.add(step...)
if _, err := tc.rule.Exec(context.TODO(), time.Now()); err != nil {
if _, err := tc.rule.Exec(context.TODO(), time.Now(), 0); err != nil {
t.Fatalf("unexpected err: %s", err)
}
// artificial delay between applying steps
@@ -624,14 +624,14 @@ func TestAlertingRule_Exec_Negative(t *testing.T) {
// successful attempt
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "bar"))
_, err := ar.Exec(context.TODO(), time.Now())
_, err := ar.Exec(context.TODO(), time.Now(), 0)
if err != nil {
t.Fatal(err)
}
// label `job` will collide with rule extra label and will make both time series equal
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "baz"))
_, err = ar.Exec(context.TODO(), time.Now())
_, err = ar.Exec(context.TODO(), time.Now(), 0)
if !errors.Is(err, errDuplicate) {
t.Fatalf("expected to have %s error; got %s", errDuplicate, err)
}
@@ -640,7 +640,7 @@ func TestAlertingRule_Exec_Negative(t *testing.T) {
expErr := "connection reset by peer"
fq.setErr(errors.New(expErr))
_, err = ar.Exec(context.TODO(), time.Now())
_, err = ar.Exec(context.TODO(), time.Now(), 0)
if err == nil {
t.Fatalf("expected to get err; got nil")
}
@@ -649,6 +649,50 @@ func TestAlertingRule_Exec_Negative(t *testing.T) {
}
}
func TestAlertingRuleLimit(t *testing.T) {
fq := &fakeQuerier{}
ar := newTestAlertingRule("test", 0)
ar.Labels = map[string]string{"job": "test"}
ar.q = fq
ar.For = time.Minute
testCases := []struct {
limit int
err string
tssNum int
}{
{
limit: 0,
tssNum: 4,
},
{
limit: -1,
tssNum: 4,
},
{
limit: 1,
err: "exec exceeded limit of 1 with 2 alerts",
tssNum: 0,
},
{
limit: 4,
tssNum: 4,
},
}
var (
err error
timestamp = time.Now()
)
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "bar"))
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "bar", "job"))
for _, testCase := range testCases {
_, err = ar.Exec(context.TODO(), timestamp, testCase.limit)
if err != nil && !strings.EqualFold(err.Error(), testCase.err) {
t.Fatal(err)
}
}
fq.reset()
}
func TestAlertingRule_Template(t *testing.T) {
testCases := []struct {
rule *AlertingRule
@@ -656,14 +700,25 @@ func TestAlertingRule_Template(t *testing.T) {
expAlerts map[uint64]*notifier.Alert
}{
{
newTestRuleWithLabels("common", "region", "east"),
&AlertingRule{
Name: "common",
Labels: map[string]string{
"region": "east",
},
Annotations: map[string]string{
"summary": `{{ $labels.alertname }}: Too high connection number for "{{ $labels.instance }}"`,
},
alerts: make(map[uint64]*notifier.Alert),
},
[]datasource.Metric{
metricWithValueAndLabels(t, 1, "instance", "foo"),
metricWithValueAndLabels(t, 1, "instance", "bar"),
},
map[uint64]*notifier.Alert{
hash(map[string]string{alertNameLabel: "common", "region": "east", "instance": "foo"}): {
Annotations: map[string]string{},
Annotations: map[string]string{
"summary": `common: Too high connection number for "foo"`,
},
Labels: map[string]string{
alertNameLabel: "common",
"region": "east",
@@ -671,7 +726,9 @@ func TestAlertingRule_Template(t *testing.T) {
},
},
hash(map[string]string{alertNameLabel: "common", "region": "east", "instance": "bar"}): {
Annotations: map[string]string{},
Annotations: map[string]string{
"summary": `common: Too high connection number for "bar"`,
},
Labels: map[string]string{
alertNameLabel: "common",
"region": "east",
@@ -687,14 +744,14 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "{{ $labels.instance }}",
},
Annotations: map[string]string{
"summary": `Too high connection number for "{{ $labels.instance }}"`,
"summary": `{{ $labels.__name__ }}: Too high connection number for "{{ $labels.instance }}"`,
"description": `{{ $labels.alertname}}: It is {{ $value }} connections for "{{ $labels.instance }}"`,
},
alerts: make(map[uint64]*notifier.Alert),
},
[]datasource.Metric{
metricWithValueAndLabels(t, 2, "instance", "foo", alertNameLabel, "override"),
metricWithValueAndLabels(t, 10, "instance", "bar", alertNameLabel, "override"),
metricWithValueAndLabels(t, 2, "__name__", "first", "instance", "foo", alertNameLabel, "override"),
metricWithValueAndLabels(t, 10, "__name__", "second", "instance", "bar", alertNameLabel, "override"),
},
map[uint64]*notifier.Alert{
hash(map[string]string{alertNameLabel: "override label", "instance": "foo"}): {
@@ -703,7 +760,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "foo",
},
Annotations: map[string]string{
"summary": `Too high connection number for "foo"`,
"summary": `first: Too high connection number for "foo"`,
"description": `override: It is 2 connections for "foo"`,
},
},
@@ -713,7 +770,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "bar",
},
Annotations: map[string]string{
"summary": `Too high connection number for "bar"`,
"summary": `second: Too high connection number for "bar"`,
"description": `override: It is 10 connections for "bar"`,
},
},
@@ -760,8 +817,9 @@ func TestAlertingRule_Template(t *testing.T) {
fq := &fakeQuerier{}
tc.rule.GroupID = fakeGroup.ID()
tc.rule.q = fq
tc.rule.state = newRuleState(10)
fq.add(tc.metrics...)
if _, err := tc.rule.Exec(context.TODO(), time.Now()); err != nil {
if _, err := tc.rule.Exec(context.TODO(), time.Now(), 0); err != nil {
t.Fatalf("unexpected err: %s", err)
}
for hash, expAlert := range tc.expAlerts {
@@ -871,5 +929,11 @@ func newTestRuleWithLabels(name string, labels ...string) *AlertingRule {
}
func newTestAlertingRule(name string, waitFor time.Duration) *AlertingRule {
return &AlertingRule{Name: name, alerts: make(map[uint64]*notifier.Alert), For: waitFor, EvalInterval: waitFor}
return &AlertingRule{
Name: name,
For: waitFor,
EvalInterval: waitFor,
alerts: make(map[uint64]*notifier.Alert),
state: newRuleState(10),
}
}

View File

@@ -4,16 +4,14 @@ import (
"crypto/md5"
"fmt"
"hash/fnv"
"io/ioutil"
"net/url"
"os"
"path/filepath"
"sort"
"strings"
"gopkg.in/yaml.v2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envtemplate"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -23,17 +21,13 @@ import (
// Group contains list of Rules grouped into
// entity with one name and evaluation interval
type Group struct {
Type datasource.Type `yaml:"type,omitempty"`
Type Type `yaml:"type,omitempty"`
File string
Name string `yaml:"name"`
Interval promutils.Duration `yaml:"interval"`
Rules []Rule `yaml:"rules"`
Concurrency int `yaml:"concurrency"`
// ExtraFilterLabels is a list label filters applied to every rule
// request withing a group. Is compatible only with VM datasources.
// See https://docs.victoriametrics.com#prometheus-querying-api-enhancements
// DEPRECATED: use Params field instead
ExtraFilterLabels map[string]string `yaml:"extra_filter_labels"`
Name string `yaml:"name"`
Interval *promutils.Duration `yaml:"interval,omitempty"`
Limit int `yaml:"limit,omitempty"`
Rules []Rule `yaml:"rules"`
Concurrency int `yaml:"concurrency"`
// Labels is a set of label value pairs, that will be added to every rule.
// It has priority over the external labels.
Labels map[string]string `yaml:"labels"`
@@ -42,6 +36,8 @@ type Group struct {
Checksum string
// Optional HTTP URL parameters added to each rule request
Params url.Values `yaml:"params"`
// Headers contains optional HTTP headers added to each rule request
Headers []Header `yaml:"headers,omitempty"`
// Catches all undefined fields and must be empty after parsing.
XXX map[string]interface{} `yaml:",inline"`
@@ -59,23 +55,7 @@ func (g *Group) UnmarshalYAML(unmarshal func(interface{}) error) error {
}
// change default value to prometheus datasource.
if g.Type.Get() == "" {
g.Type.Set(datasource.NewPrometheusType())
}
// backward compatibility with deprecated `ExtraFilterLabels` param
if len(g.ExtraFilterLabels) > 0 {
if g.Params == nil {
g.Params = url.Values{}
}
// Sort extraFilters for consistent order for query args across runs.
extraFilters := make([]string, 0, len(g.ExtraFilterLabels))
for k, v := range g.ExtraFilterLabels {
extraFilters = append(extraFilters, fmt.Sprintf("%s=%s", k, v))
}
sort.Strings(extraFilters)
for _, extraFilter := range extraFilters {
g.Params.Add("extra_label", extraFilter)
}
g.Type.Set(NewPrometheusType())
}
h := md5.New()
@@ -85,7 +65,7 @@ func (g *Group) UnmarshalYAML(unmarshal func(interface{}) error) error {
}
// Validate check for internal Group or Rule configuration errors
func (g *Group) Validate(validateAnnotations, validateExpressions bool) error {
func (g *Group) Validate(validateTplFn ValidateTplFn, validateExpressions bool) error {
if g.Name == "" {
return fmt.Errorf("group name must be set")
}
@@ -97,7 +77,7 @@ func (g *Group) Validate(validateAnnotations, validateExpressions bool) error {
ruleName = r.Alert
}
if _, ok := uniqueRules[r.ID]; ok {
return fmt.Errorf("rule %q duplicate", ruleName)
return fmt.Errorf("%q is a duplicate within the group %q", r.String(), g.Name)
}
uniqueRules[r.ID] = struct{}{}
if err := r.Validate(); err != nil {
@@ -111,11 +91,11 @@ func (g *Group) Validate(validateAnnotations, validateExpressions bool) error {
return fmt.Errorf("invalid expression for rule %q.%q: %w", g.Name, ruleName, err)
}
}
if validateAnnotations {
if err := notifier.ValidateTemplates(r.Annotations); err != nil {
if validateTplFn != nil {
if err := validateTplFn(r.Annotations); err != nil {
return fmt.Errorf("invalid annotations for rule %q.%q: %w", g.Name, ruleName, err)
}
if err := notifier.ValidateTemplates(r.Labels); err != nil {
if err := validateTplFn(r.Labels); err != nil {
return fmt.Errorf("invalid labels for rule %q.%q: %w", g.Name, ruleName, err)
}
}
@@ -127,12 +107,16 @@ func (g *Group) Validate(validateAnnotations, validateExpressions bool) error {
// recording rule or alerting rule.
type Rule struct {
ID uint64
Record string `yaml:"record,omitempty"`
Alert string `yaml:"alert,omitempty"`
Expr string `yaml:"expr"`
For promutils.Duration `yaml:"for"`
Labels map[string]string `yaml:"labels,omitempty"`
Annotations map[string]string `yaml:"annotations,omitempty"`
Record string `yaml:"record,omitempty"`
Alert string `yaml:"alert,omitempty"`
Expr string `yaml:"expr"`
For *promutils.Duration `yaml:"for,omitempty"`
Labels map[string]string `yaml:"labels,omitempty"`
Annotations map[string]string `yaml:"annotations,omitempty"`
Debug bool `yaml:"debug,omitempty"`
// UpdateEntriesLimit defines max number of rule's state updates stored in memory.
// Overrides `-rule.updateEntriesLimit`.
UpdateEntriesLimit *int `yaml:"update_entries_limit,omitempty"`
// Catches all undefined fields and must be empty after parsing.
XXX map[string]interface{} `yaml:",inline"`
@@ -156,6 +140,32 @@ func (r *Rule) Name() string {
return r.Alert
}
// String implements Stringer interface
func (r *Rule) String() string {
ruleType := "recording"
if r.Alert != "" {
ruleType = "alerting"
}
b := strings.Builder{}
b.WriteString(fmt.Sprintf("%s rule %q", ruleType, r.Name()))
b.WriteString(fmt.Sprintf("; expr: %q", r.Expr))
kv := sortMap(r.Labels)
for i := range kv {
if i == 0 {
b.WriteString("; labels:")
}
b.WriteString(" ")
b.WriteString(kv[i].key)
b.WriteString("=")
b.WriteString(kv[i].value)
if i < len(kv)-1 {
b.WriteString(",")
}
}
return b.String()
}
// HashRule hashes significant Rule fields into
// unique hash that supposed to define Rule uniqueness
func HashRule(r Rule) uint64 {
@@ -188,8 +198,11 @@ func (r *Rule) Validate() error {
return checkOverflow(r.XXX, "rule")
}
// ValidateTplFn must validate the given annotations
type ValidateTplFn func(annotations map[string]string) error
// Parse parses rule configs from given file patterns
func Parse(pathPatterns []string, validateAnnotations, validateExpressions bool) ([]Group, error) {
func Parse(pathPatterns []string, validateTplFn ValidateTplFn, validateExpressions bool) ([]Group, error) {
var fp []string
for _, pattern := range pathPatterns {
matches, err := filepath.Glob(pattern)
@@ -199,7 +212,6 @@ func Parse(pathPatterns []string, validateAnnotations, validateExpressions bool)
fp = append(fp, matches...)
}
errGroup := new(utils.ErrGroup)
var isExtraFilterLabelsUsed bool
var groups []Group
for _, file := range fp {
uniqueGroups := map[string]struct{}{}
@@ -209,7 +221,7 @@ func Parse(pathPatterns []string, validateAnnotations, validateExpressions bool)
continue
}
for _, g := range gr {
if err := g.Validate(validateAnnotations, validateExpressions); err != nil {
if err := g.Validate(validateTplFn, validateExpressions); err != nil {
errGroup.Add(fmt.Errorf("invalid group %q in file %q: %w", g.Name, file, err))
continue
}
@@ -219,9 +231,6 @@ func Parse(pathPatterns []string, validateAnnotations, validateExpressions bool)
}
uniqueGroups[g.Name] = struct{}{}
g.File = file
if len(g.ExtraFilterLabels) > 0 {
isExtraFilterLabelsUsed = true
}
groups = append(groups, g)
}
}
@@ -231,18 +240,18 @@ func Parse(pathPatterns []string, validateAnnotations, validateExpressions bool)
if len(groups) < 1 {
logger.Warnf("no groups found in %s", strings.Join(pathPatterns, ";"))
}
if isExtraFilterLabelsUsed {
logger.Warnf("field `extra_filter_labels` is deprecated - use `params` instead")
}
return groups, nil
}
func parseFile(path string) ([]Group, error) {
data, err := ioutil.ReadFile(path)
data, err := os.ReadFile(path)
if err != nil {
return nil, fmt.Errorf("error reading alert rule file: %w", err)
return nil, fmt.Errorf("error reading alert rule file %q: %w", path, err)
}
data, err = envtemplate.ReplaceBytes(data)
if err != nil {
return nil, fmt.Errorf("cannot expand environment vars in %q: %w", path, err)
}
data = envtemplate.Replace(data)
g := struct {
Groups []Group `yaml:"groups"`
// Catches all undefined fields and must be empty after parsing.

View File

@@ -9,19 +9,20 @@ import (
"gopkg.in/yaml.v2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
)
func TestMain(m *testing.M) {
u, _ := url.Parse("https://victoriametrics.com/path")
notifier.InitTemplateFunc(u)
if err := templates.Load([]string{"testdata/templates/*good.tmpl"}, true); err != nil {
os.Exit(1)
}
os.Exit(m.Run())
}
func TestParseGood(t *testing.T) {
if _, err := Parse([]string{"testdata/*good.rules", "testdata/dir/*good.*"}, true, true); err != nil {
if _, err := Parse([]string{"testdata/rules/*good.rules", "testdata/dir/*good.*"}, notifier.ValidateTemplates, true); err != nil {
t.Errorf("error parsing files %s", err)
}
}
@@ -32,7 +33,7 @@ func TestParseBad(t *testing.T) {
expErr string
}{
{
[]string{"testdata/rules0-bad.rules"},
[]string{"testdata/rules/rules0-bad.rules"},
"unexpected token",
},
{
@@ -56,12 +57,16 @@ func TestParseBad(t *testing.T) {
"either `record` or `alert` must be set",
},
{
[]string{"testdata/rules1-bad.rules"},
[]string{"testdata/rules/rules1-bad.rules"},
"bad graphite expr",
},
{
[]string{"testdata/dir/rules6-bad.rules"},
"missing ':' in header",
},
}
for _, tc := range testCases {
_, err := Parse(tc.path, true, true)
_, err := Parse(tc.path, notifier.ValidateTemplates, true)
if err == nil {
t.Errorf("expected to get error")
return
@@ -219,7 +224,7 @@ func TestGroup_Validate(t *testing.T) {
},
{
group: &Group{Name: "test thanos",
Type: datasource.NewRawType("thanos"),
Type: NewRawType("thanos"),
Rules: []Rule{
{Alert: "alert", Expr: "up == 1", Labels: map[string]string{
"description": "{{ value|query }}",
@@ -231,7 +236,7 @@ func TestGroup_Validate(t *testing.T) {
},
{
group: &Group{Name: "test graphite",
Type: datasource.NewGraphiteType(),
Type: NewGraphiteType(),
Rules: []Rule{
{Alert: "alert", Expr: "up == 1", Labels: map[string]string{
"description": "some-description",
@@ -243,7 +248,7 @@ func TestGroup_Validate(t *testing.T) {
},
{
group: &Group{Name: "test prometheus",
Type: datasource.NewPrometheusType(),
Type: NewPrometheusType(),
Rules: []Rule{
{Alert: "alert", Expr: "up == 1", Labels: map[string]string{
"description": "{{ value|query }}",
@@ -256,7 +261,7 @@ func TestGroup_Validate(t *testing.T) {
{
group: &Group{
Name: "test graphite inherit",
Type: datasource.NewGraphiteType(),
Type: NewGraphiteType(),
Rules: []Rule{
{
Expr: "sumSeries(time('foo.bar',10))",
@@ -271,7 +276,7 @@ func TestGroup_Validate(t *testing.T) {
{
group: &Group{
Name: "test graphite prometheus bad expr",
Type: datasource.NewGraphiteType(),
Type: NewGraphiteType(),
Rules: []Rule{
{
Expr: "sum(up == 0 ) by (host)",
@@ -285,8 +290,13 @@ func TestGroup_Validate(t *testing.T) {
expErr: "invalid rule",
},
}
for _, tc := range testCases {
err := tc.group.Validate(tc.validateAnnotations, tc.validateExpressions)
var validateTplFn ValidateTplFn
if tc.validateAnnotations {
validateTplFn = notifier.ValidateTemplates
}
err := tc.group.Validate(validateTplFn, tc.validateExpressions)
if err == nil {
if tc.expErr != "" {
t.Errorf("expected to get err %q; got nil insted", tc.expErr)
@@ -491,6 +501,69 @@ params:
rules:
- alert: foo
expr: sum by(job) (up == 1)
`)
})
t.Run("`limit` change", func(t *testing.T) {
f(t, `
name: TestGroup
limit: 5
rules:
- alert: foo
expr: sum by(job) (up == 1)
`, `
name: TestGroup
limit: 10
rules:
- alert: foo
expr: sum by(job) (up == 1)
`)
})
t.Run("`headers` change", func(t *testing.T) {
f(t, `
name: TestGroup
headers:
- "TenantID: foo"
rules:
- alert: foo
expr: sum by(job) (up == 1)
`, `
name: TestGroup
headers:
- "TenantID: bar"
rules:
- alert: foo
expr: sum by(job) (up == 1)
`)
})
t.Run("`debug` change", func(t *testing.T) {
f(t, `
name: TestGroup
rules:
- alert: foo
expr: sum by(job) (up == 1)
`, `
name: TestGroup
rules:
- alert: foo
expr: sum by(job) (up == 1)
debug: true
`)
})
t.Run("`update_entries_limit` change", func(t *testing.T) {
f(t, `
name: TestGroup
rules:
- alert: foo
expr: sum by(job) (up == 1)
`, `
name: TestGroup
rules:
- alert: foo
expr: sum by(job) (up == 1)
update_entries_limit: 33
`)
})
}
@@ -528,30 +601,4 @@ rules:
expr: sum by(job) (up == 1)
`, url.Values{"nocache": {"1"}, "denyPartialResponse": {"true"}})
})
t.Run("extra labels", func(t *testing.T) {
f(t, `
name: TestGroup
extra_filter_labels:
job: victoriametrics
env: prod
rules:
- alert: ExampleAlertAlwaysFiring
expr: sum by(job) (up == 1)
`, url.Values{"extra_label": {"env=prod", "job=victoriametrics"}})
})
t.Run("extra labels and params", func(t *testing.T) {
f(t, `
name: TestGroup
extra_filter_labels:
job: victoriametrics
params:
nocache: ["1"]
extra_label: ["env=prod"]
rules:
- alert: ExampleAlertAlwaysFiring
expr: sum by(job) (up == 1)
`, url.Values{"nocache": {"1"}, "extra_label": {"env=prod", "job=victoriametrics"}})
})
}

View File

@@ -0,0 +1,7 @@
groups:
- name: group
headers:
- 'foobar'
rules:
- alert: rows
expr: vm_rows > 0

View File

@@ -2,6 +2,7 @@ groups:
- name: ReplayGroup
interval: 1m
concurrency: 1
limit: 1000
rules:
- record: type:vm_cache_entries:rate5m
expr: sum(rate(vm_cache_entries[5m])) by (type)

View File

@@ -1,23 +1,27 @@
groups:
- name: TestGroup
interval: 2s
interval: 5s
concurrency: 2
extra_filter_labels: # deprecated param, use `params` instead
job: victoriametrics
limit: 1000
headers:
- "MyHeader: foo"
params:
denyPartialResponse: ["true"]
extra_label: ["env=dev"]
rules:
- alert: Conns
expr: sum(vm_tcplistener_conns) by(instance) > 1
expr: vm_tcplistener_conns > 0
for: 3m
debug: true
update_entries_limit: 40
annotations:
summary: Too high connection number for {{$labels.instance}}
labels: "Available labels: {{ $labels }}"
summary: Too high connection number for {{ $labels.instance }}
{{ with printf "sum(vm_tcplistener_conns{instance=%q})" .Labels.instance | query }}
{{ . | first | value }}
{{ end }}
description: "It is {{ $value }} connections for {{$labels.instance}}"
- alert: ExampleAlertAlwaysFiring
update_entries_limit: -1
expr: sum by(job)
(up == 1)
labels:
@@ -49,4 +53,12 @@ groups:
expr: |2
sum(code:requests:rate5m{code="200"})
/
sum(code:requests:rate5m)
sum(code:requests:rate5m)
- record: code:requests:slo
labels:
recording: true
expr: 0.95
- record: time:current
labels:
recording: true
expr: time()

View File

@@ -7,6 +7,7 @@ groups:
- alert: Conns
expr: filterSeries(sumSeries(host.receiver.interface.cons),'last','>', 500)
for: 3m
annotations:
summary: Too high connection number for {{$labels.instance}}
description: "It is {{ $value }} connections for {{$labels.instance}}"

View File

@@ -0,0 +1,3 @@
{{ define "template0" }}
Visit {{ externalURL }}
{{ end }}

View File

@@ -0,0 +1,3 @@
{{ define "template1" }}
{{ 1048576 | humanize1024 }}
{{ end }}

View File

@@ -0,0 +1,3 @@
{{ define "template2" }}
{{ 1048576 | humanize1024 }}
{{ end }}

View File

@@ -0,0 +1,3 @@
{{ define "template3" }}
{{ printf "%s to %s!" "welcome" "hell" | toUpper }}
{{ end }}

View File

@@ -0,0 +1,3 @@
{{ define "template3" }}
{{ 1230912039102391023.0 | humanizeDuration }}
{{ end }}

View File

@@ -1,7 +1,8 @@
package datasource
package config
import (
"fmt"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/graphiteql"
"github.com/VictoriaMetrics/metricsql"
@@ -9,45 +10,45 @@ import (
// Type represents data source type
type Type struct {
name string
Name string
}
// NewPrometheusType returns prometheus datasource type
func NewPrometheusType() Type {
return Type{
name: "prometheus",
Name: "prometheus",
}
}
// NewGraphiteType returns graphite datasource type
func NewGraphiteType() Type {
return Type{
name: "graphite",
Name: "graphite",
}
}
// NewRawType returns datasource type from raw string
// without validation.
func NewRawType(d string) Type {
return Type{name: d}
return Type{Name: d}
}
// Get returns datasource type
func (t *Type) Get() string {
return t.name
return t.Name
}
// Set changes datasource type
func (t *Type) Set(d Type) {
t.name = d.name
t.Name = d.Name
}
// String implements String interface with default value.
func (t Type) String() string {
if t.name == "" {
if t.Name == "" {
return "prometheus"
}
return t.name
return t.Name
}
// ValidateExpr validates query expression with datasource ql.
@@ -62,7 +63,7 @@ func (t *Type) ValidateExpr(expr string) error {
return fmt.Errorf("bad prometheus expr: %q, err: %w", expr, err)
}
default:
return fmt.Errorf("unknown datasource type=%q", t.name)
return fmt.Errorf("unknown datasource type=%q", t.Name)
}
return nil
}
@@ -81,11 +82,35 @@ func (t *Type) UnmarshalYAML(unmarshal func(interface{}) error) error {
default:
return fmt.Errorf("unknown datasource type=%q, want %q or %q", s, "prometheus", "graphite")
}
t.name = s
t.Name = s
return nil
}
// MarshalYAML implements the yaml.Unmarshaler interface.
func (t Type) MarshalYAML() (interface{}, error) {
return t.name, nil
return t.Name, nil
}
// Header is a Key - Value struct for holding an HTTP header.
type Header struct {
Key string
Value string
}
// UnmarshalYAML implements the yaml.Unmarshaler interface.
func (h *Header) UnmarshalYAML(unmarshal func(interface{}) error) error {
var s string
if err := unmarshal(&s); err != nil {
return err
}
if s == "" {
return nil
}
n := strings.IndexByte(s, ':')
if n < 0 {
return fmt.Errorf(`missing ':' in header %q; expecting "key: value" format`, s)
}
h.Key = strings.TrimSpace(s[:n])
h.Value = strings.TrimSpace(s[n+1:])
return nil
}

View File

@@ -2,26 +2,37 @@ package datasource
import (
"context"
"net/http"
"net/url"
"time"
)
// Querier interface wraps Query and QueryRange methods
type Querier interface {
Query(ctx context.Context, query string, ts time.Time) ([]Metric, error)
// Query executes instant request with the given query at the given ts.
// It returns list of Metric in response, the http.Request used for sending query
// and error if any. Returned http.Request can't be reused and its body is already read.
// Query should stop once ctx is cancelled.
Query(ctx context.Context, query string, ts time.Time) ([]Metric, *http.Request, error)
// QueryRange executes range request with the given query on the given time range.
// It returns list of Metric in response and error if any.
// QueryRange should stop once ctx is cancelled.
QueryRange(ctx context.Context, query string, from, to time.Time) ([]Metric, error)
}
// QuerierBuilder builds Querier with given params.
type QuerierBuilder interface {
// BuildWithParams creates a new Querier object with the given params
BuildWithParams(params QuerierParams) Querier
}
// QuerierParams params for Querier.
type QuerierParams struct {
DataSourceType *Type
DataSourceType string
EvaluationInterval time.Duration
QueryParams url.Values
Headers map[string]string
Debug bool
}
// Metric is the basic entity which should be return by datasource
@@ -43,6 +54,19 @@ func (m *Metric) SetLabel(key, value string) {
m.AddLabel(key, value)
}
// SetLabels sets the given map as Metric labels
func (m *Metric) SetLabels(ls map[string]string) {
var i int
m.Labels = make([]Label, len(ls))
for k, v := range ls {
m.Labels[i] = Label{
Name: k,
Value: v,
}
i++
}
}
// AddLabel appends the given label to the label set
func (m *Metric) AddLabel(key, value string) {
m.Labels = append(m.Labels, Label{Name: key, Value: value})

View File

@@ -6,14 +6,22 @@ import (
"net/http"
"net/url"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
)
var (
addr = flag.String("datasource.url", "", "VictoriaMetrics or vmselect url. Required parameter. "+
"E.g. http://127.0.0.1:8428")
appendTypePrefix = flag.Bool("datasource.appendTypePrefix", false, "Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.")
addr = flag.String("datasource.url", "", "Datasource compatible with Prometheus HTTP API. It can be single node VictoriaMetrics or vmselect URL. Required parameter. "+
"E.g. http://127.0.0.1:8428 . See also -remoteRead.disablePathAppend and -datasource.showURL")
appendTypePrefix = flag.Bool("datasource.appendTypePrefix", false, "Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.")
showDatasourceURL = flag.Bool("datasource.showURL", false, "Whether to show -datasource.url in the exported metrics. "+
"It is hidden by default, since it can contain sensitive info such as auth key")
headers = flag.String("datasource.headers", "", "Optional HTTP extraHeaders to send with each request to the corresponding -datasource.url. "+
"For example, -datasource.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -datasource.url. "+
"Multiple headers must be delimited by '^^': -datasource.headers='header1:value1^^header2:value2'")
basicAuthUsername = flag.String("datasource.basicAuth.username", "", "Optional basic auth username for -datasource.url")
basicAuthPassword = flag.String("datasource.basicAuth.password", "", "Optional basic auth password for -datasource.url")
@@ -35,9 +43,9 @@ var (
oauth2Scopes = flag.String("datasource.oauth2.scopes", "", "Optional OAuth2 scopes to use for -datasource.url. Scopes must be delimited by ';'")
lookBack = flag.Duration("datasource.lookback", 0, `Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query.`)
queryStep = flag.Duration("datasource.queryStep", 0, "queryStep defines how far a value can fallback to when evaluating queries. "+
"For example, if datasource.queryStep=15s then param \"step\" with value \"15s\" will be added to every query."+
"If queryStep isn't specified, rule's evaluationInterval will be used instead.")
queryStep = flag.Duration("datasource.queryStep", 5*time.Minute, "How far a value can fallback to when evaluating queries. "+
"For example, if -datasource.queryStep=15s then param \"step\" with value \"15s\" will be added to every query. "+
"If set to 0, rule's evaluation interval will be used instead.")
queryTimeAlignment = flag.Bool("datasource.queryTimeAlignment", true, `Whether to align "time" parameter with evaluation interval.`+
"Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257")
maxIdleConnections = flag.Int("datasource.maxIdleConnections", 100, `Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state.`)
@@ -47,6 +55,13 @@ var (
`In VM "round_digits" limits the number of digits after the decimal point in response values.`)
)
// InitSecretFlags must be called after flag.Parse and before any logging
func InitSecretFlags() {
if !*showDatasourceURL {
flagutil.RegisterSecretFlag("datasource.url")
}
}
// Param represents an HTTP GET param
type Param struct {
Key, Value string
@@ -80,7 +95,8 @@ func Init(extraParams url.Values) (QuerierBuilder, error) {
authCfg, err := utils.AuthConfig(
utils.WithBasicAuth(*basicAuthUsername, *basicAuthPassword, *basicAuthPasswordFile),
utils.WithBearer(*bearerToken, *bearerTokenFile),
utils.WithOAuth(*oauth2ClientID, *oauth2ClientSecret, *oauth2ClientSecretFile, *oauth2TokenURL, *oauth2Scopes))
utils.WithOAuth(*oauth2ClientID, *oauth2ClientSecret, *oauth2ClientSecretFile, *oauth2TokenURL, *oauth2Scopes),
utils.WithHeaders(*headers))
if err != nil {
return nil, fmt.Errorf("failed to configure auth: %w", err)
}
@@ -92,7 +108,7 @@ func Init(extraParams url.Values) (QuerierBuilder, error) {
appendTypePrefix: *appendTypePrefix,
lookBack: *lookBack,
queryStep: *queryStep,
dataSourceType: NewPrometheusType(),
dataSourceType: datasourcePrometheus,
extraParams: extraParams,
}, nil
}

View File

@@ -3,15 +3,30 @@ package datasource
import (
"context"
"fmt"
"io/ioutil"
"io"
"net/http"
"net/url"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
)
type datasourceType string
const (
datasourcePrometheus datasourceType = "prometheus"
datasourceGraphite datasourceType = "graphite"
)
func toDatasourceType(s string) datasourceType {
if s == string(datasourceGraphite) {
return datasourceGraphite
}
return datasourcePrometheus
}
// VMStorage represents vmstorage entity with ability to read and write metrics
type VMStorage struct {
c *http.Client
@@ -21,33 +36,46 @@ type VMStorage struct {
lookBack time.Duration
queryStep time.Duration
dataSourceType Type
dataSourceType datasourceType
evaluationInterval time.Duration
extraParams url.Values
disablePathAppend bool
extraHeaders []keyValue
// whether to print additional log messages
// for each sent request
debug bool
}
type keyValue struct {
key string
value string
}
// Clone makes clone of VMStorage, shares http client.
func (s *VMStorage) Clone() *VMStorage {
return &VMStorage{
c: s.c,
authCfg: s.authCfg,
datasourceURL: s.datasourceURL,
lookBack: s.lookBack,
queryStep: s.queryStep,
appendTypePrefix: s.appendTypePrefix,
dataSourceType: s.dataSourceType,
disablePathAppend: s.disablePathAppend,
c: s.c,
authCfg: s.authCfg,
datasourceURL: s.datasourceURL,
lookBack: s.lookBack,
queryStep: s.queryStep,
appendTypePrefix: s.appendTypePrefix,
dataSourceType: s.dataSourceType,
}
}
// ApplyParams - changes given querier params.
func (s *VMStorage) ApplyParams(params QuerierParams) *VMStorage {
if params.DataSourceType != nil {
s.dataSourceType = *params.DataSourceType
}
s.dataSourceType = toDatasourceType(params.DataSourceType)
s.evaluationInterval = params.EvaluationInterval
s.extraParams = params.QueryParams
s.debug = params.Debug
if params.Headers != nil {
for key, value := range params.Headers {
kv := keyValue{key: key, value: value}
s.extraHeaders = append(s.extraHeaders, kv)
}
}
return s
}
@@ -57,56 +85,56 @@ func (s *VMStorage) BuildWithParams(params QuerierParams) Querier {
}
// NewVMStorage is a constructor for VMStorage
func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Duration, queryStep time.Duration, appendTypePrefix bool, c *http.Client, disablePathAppend bool) *VMStorage {
func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Duration, queryStep time.Duration, appendTypePrefix bool, c *http.Client) *VMStorage {
return &VMStorage{
c: c,
authCfg: authCfg,
datasourceURL: strings.TrimSuffix(baseURL, "/"),
appendTypePrefix: appendTypePrefix,
lookBack: lookBack,
queryStep: queryStep,
dataSourceType: NewPrometheusType(),
disablePathAppend: disablePathAppend,
c: c,
authCfg: authCfg,
datasourceURL: strings.TrimSuffix(baseURL, "/"),
appendTypePrefix: appendTypePrefix,
lookBack: lookBack,
queryStep: queryStep,
dataSourceType: datasourcePrometheus,
}
}
// Query executes the given query and returns parsed response
func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) ([]Metric, error) {
func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) ([]Metric, *http.Request, error) {
req, err := s.newRequestPOST()
if err != nil {
return nil, err
return nil, nil, err
}
switch s.dataSourceType.String() {
case "prometheus":
switch s.dataSourceType {
case "", datasourcePrometheus:
s.setPrometheusInstantReqParams(req, query, ts)
case "graphite":
case datasourceGraphite:
s.setGraphiteReqParams(req, query, ts)
default:
return nil, fmt.Errorf("engine not found: %q", s.dataSourceType.name)
return nil, nil, fmt.Errorf("engine not found: %q", s.dataSourceType)
}
resp, err := s.do(ctx, req)
if err != nil {
return nil, err
return nil, req, err
}
defer func() {
_ = resp.Body.Close()
}()
parseFn := parsePrometheusResponse
if s.dataSourceType.name != "prometheus" {
if s.dataSourceType != datasourcePrometheus {
parseFn = parseGraphiteResponse
}
return parseFn(req, resp)
result, err := parseFn(req, resp)
return result, req, err
}
// QueryRange executes the given query on the given time range.
// For Prometheus type see https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries
// Graphite type isn't supported.
func (s *VMStorage) QueryRange(ctx context.Context, query string, start, end time.Time) ([]Metric, error) {
if s.dataSourceType.name != "prometheus" {
return nil, fmt.Errorf("%q is not supported for QueryRange", s.dataSourceType.name)
if s.dataSourceType != datasourcePrometheus {
return nil, fmt.Errorf("%q is not supported for QueryRange", s.dataSourceType)
}
req, err := s.newRequestPOST()
if err != nil {
@@ -130,12 +158,15 @@ func (s *VMStorage) QueryRange(ctx context.Context, query string, start, end tim
}
func (s *VMStorage) do(ctx context.Context, req *http.Request) (*http.Response, error) {
if s.debug {
logger.Infof("DEBUG datasource request: executing %s request with params %q", req.Method, req.URL.RawQuery)
}
resp, err := s.c.Do(req.WithContext(ctx))
if err != nil {
return nil, fmt.Errorf("error getting response from %s: %w", req.URL.Redacted(), err)
}
if resp.StatusCode != http.StatusOK {
body, _ := ioutil.ReadAll(resp.Body)
body, _ := io.ReadAll(resp.Body)
_ = resp.Body.Close()
return nil, fmt.Errorf("unexpected response code %d for %s. Response body %s", resp.StatusCode, req.URL.Redacted(), body)
}
@@ -149,9 +180,10 @@ func (s *VMStorage) newRequestPOST() (*http.Request, error) {
}
req.Header.Set("Content-Type", "application/json")
if s.authCfg != nil {
if auth := s.authCfg.GetAuthHeader(); auth != "" {
req.Header.Set("Authorization", auth)
}
s.authCfg.SetHeaders(req, true)
}
for _, h := range s.extraHeaders {
req.Header.Set(h.key, h.value)
}
return req, nil
}

View File

@@ -2,12 +2,18 @@ package datasource
import (
"encoding/json"
"flag"
"fmt"
"net/http"
"strconv"
"time"
)
var (
disablePathAppend = flag.Bool("remoteRead.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/query' path "+
"to the configured -datasource.url and -remoteRead.url")
)
type promResponse struct {
Status string `json:"status"`
ErrorType string `json:"errorType"`
@@ -25,31 +31,29 @@ type promInstant struct {
} `json:"result"`
}
type promRange struct {
Result []struct {
Labels map[string]string `json:"metric"`
TVs [][2]interface{} `json:"values"`
} `json:"result"`
}
func (r promInstant) metrics() ([]Metric, error) {
var result []Metric
result := make([]Metric, len(r.Result))
for i, res := range r.Result {
f, err := strconv.ParseFloat(res.TV[1].(string), 64)
if err != nil {
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %w", res, res.TV[1], err)
}
var m Metric
for k, v := range r.Result[i].Labels {
m.AddLabel(k, v)
}
m.SetLabels(res.Labels)
m.Timestamps = append(m.Timestamps, int64(res.TV[0].(float64)))
m.Values = append(m.Values, f)
result = append(result, m)
result[i] = m
}
return result, nil
}
type promRange struct {
Result []struct {
Labels map[string]string `json:"metric"`
TVs [][2]interface{} `json:"values"`
} `json:"result"`
}
func (r promRange) metrics() ([]Metric, error) {
var result []Metric
for i, res := range r.Result {
@@ -74,9 +78,22 @@ func (r promRange) metrics() ([]Metric, error) {
return result, nil
}
type promScalar [2]interface{}
func (r promScalar) metrics() ([]Metric, error) {
var m Metric
f, err := strconv.ParseFloat(r[1].(string), 64)
if err != nil {
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %w", r, r[1], err)
}
m.Values = append(m.Values, f)
m.Timestamps = append(m.Timestamps, int64(r[0].(float64)))
return []Metric{m}, nil
}
const (
statusSuccess, statusError = "success", "error"
rtVector, rtMatrix = "vector", "matrix"
statusSuccess, statusError = "success", "error"
rtVector, rtMatrix, rScalar = "vector", "matrix", "scalar"
)
func parsePrometheusResponse(req *http.Request, resp *http.Response) ([]Metric, error) {
@@ -103,23 +120,23 @@ func parsePrometheusResponse(req *http.Request, resp *http.Response) ([]Metric,
return nil, err
}
return pr.metrics()
case rScalar:
var ps promScalar
if err := json.Unmarshal(r.Data.Result, &ps); err != nil {
return nil, err
}
return ps.metrics()
default:
return nil, fmt.Errorf("unknown result type %q", r.Data.ResultType)
}
}
const (
prometheusInstantPath = "/api/v1/query"
prometheusRangePath = "/api/v1/query_range"
prometheusPrefix = "/prometheus"
)
func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string, timestamp time.Time) {
if s.appendTypePrefix {
r.URL.Path += prometheusPrefix
r.URL.Path += "/prometheus"
}
if !s.disablePathAppend {
r.URL.Path += prometheusInstantPath
if !*disablePathAppend {
r.URL.Path += "/api/v1/query"
}
q := r.URL.Query()
if s.lookBack > 0 {
@@ -130,20 +147,35 @@ func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string,
timestamp = timestamp.Truncate(s.evaluationInterval)
}
q.Set("time", fmt.Sprintf("%d", timestamp.Unix()))
if s.evaluationInterval > 0 { // set step as evaluationInterval by default
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
q.Set("step", fmt.Sprintf("%ds", int(s.evaluationInterval.Seconds())))
}
if s.queryStep > 0 { // override step with user-specified value
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
q.Set("step", fmt.Sprintf("%ds", int(s.queryStep.Seconds())))
}
r.URL.RawQuery = q.Encode()
s.setPrometheusReqParams(r, query)
}
func (s *VMStorage) setPrometheusRangeReqParams(r *http.Request, query string, start, end time.Time) {
if s.appendTypePrefix {
r.URL.Path += prometheusPrefix
r.URL.Path += "/prometheus"
}
if !s.disablePathAppend {
r.URL.Path += prometheusRangePath
if !*disablePathAppend {
r.URL.Path += "/api/v1/query_range"
}
q := r.URL.Query()
q.Add("start", fmt.Sprintf("%d", start.Unix()))
q.Add("end", fmt.Sprintf("%d", end.Unix()))
if s.evaluationInterval > 0 { // set step as evaluationInterval by default
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
q.Set("step", fmt.Sprintf("%ds", int(s.evaluationInterval.Seconds())))
}
r.URL.RawQuery = q.Encode()
s.setPrometheusReqParams(r, query)
}
@@ -159,15 +191,5 @@ func (s *VMStorage) setPrometheusReqParams(r *http.Request, query string) {
}
}
q.Set("query", query)
if s.evaluationInterval > 0 { // set step as evaluationInterval by default
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
q.Set("step", fmt.Sprintf("%ds", int(s.evaluationInterval.Seconds())))
}
if s.queryStep > 0 { // override step with user-specified value
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
q.Set("step", fmt.Sprintf("%ds", int(s.queryStep.Seconds())))
}
r.URL.RawQuery = q.Encode()
}

View File

@@ -0,0 +1,20 @@
package datasource
import (
"encoding/json"
"testing"
)
func BenchmarkMetrics(b *testing.B) {
payload := []byte(`[{"metric":{"__name__":"vm_rows"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests", "foo":"bar", "baz": "qux"},"value":[1583786140,"2000"]}]`)
var pi promInstant
if err := json.Unmarshal(payload, &pi.Result); err != nil {
b.Fatalf(err.Error())
}
b.Run("Instant", func(b *testing.B) {
for i := 0; i < b.N; i++ {
_, _ = pi.metrics()
}
})
}

View File

@@ -7,6 +7,7 @@ import (
"net/http/httptest"
"net/url"
"reflect"
"sort"
"strconv"
"strings"
"testing"
@@ -37,7 +38,7 @@ func TestVMInstantQuery(t *testing.T) {
mux.HandleFunc("/render", func(w http.ResponseWriter, request *http.Request) {
c++
switch c {
case 7:
case 8:
w.Write([]byte(`[{"target":"constantLine(10)","tags":{"name":"constantLine(10)"},"datapoints":[[10,1611758343],[10,1611758373],[10,1611758403]]}]`))
}
})
@@ -74,42 +75,39 @@ func TestVMInstantQuery(t *testing.T) {
case 5:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix"}}`))
case 6:
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests"},"value":[1583786140,"2000"]}]}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows","foo":"bar"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests","foo":"baz"},"value":[1583786140,"2000"]}]}}`))
case 7:
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
}
})
srv := httptest.NewServer(mux)
defer srv.Close()
authCfg, err := promauth.NewConfig(".", nil, baCfg, "", "", nil, nil)
authCfg, err := baCfg.NewConfig(".")
if err != nil {
t.Fatalf("unexpected: %s", err)
}
s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client(), false)
s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client())
p := NewPrometheusType()
pq := s.BuildWithParams(QuerierParams{DataSourceType: &p, EvaluationInterval: 15 * time.Second})
p := datasourcePrometheus
pq := s.BuildWithParams(QuerierParams{DataSourceType: string(p), EvaluationInterval: 15 * time.Second})
ts := time.Now()
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected connection error got nil")
expErr := func(err string) {
if _, _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected %q got nil", err)
}
}
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected invalid response status error got nil")
}
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected response body error got nil")
}
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected error status got nil")
}
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected unknown status got nil")
}
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected non-vector resultType error got nil")
}
m, err := pq.Query(ctx, query, ts)
expErr("connection error") // 0
expErr("invalid response status error") // 1
expErr("response body error") // 2
expErr("error status") // 3
expErr("unknown status") // 4
expErr("non-vector resultType error") // 5
m, _, err := pq.Query(ctx, query, ts) // 6 - vector
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -118,37 +116,77 @@ func TestVMInstantQuery(t *testing.T) {
}
expected := []Metric{
{
Labels: []Label{{Value: "vm_rows", Name: "__name__"}},
Labels: []Label{{Value: "vm_rows", Name: "__name__"}, {Value: "bar", Name: "foo"}},
Timestamps: []int64{1583786142},
Values: []float64{13763},
},
{
Labels: []Label{{Value: "vm_requests", Name: "__name__"}},
Labels: []Label{{Value: "vm_requests", Name: "__name__"}, {Value: "baz", Name: "foo"}},
Timestamps: []int64{1583786140},
Values: []float64{2000},
},
}
metricsEqual(t, m, expected)
m, req, err := pq.Query(ctx, query, ts) // 7 - scalar
if err != nil {
t.Fatalf("unexpected %s", err)
}
if req == nil {
t.Fatalf("expected request to be non-nil")
}
if len(m) != 1 {
t.Fatalf("expected 1 metrics got %d in %+v", len(m), m)
}
expected = []Metric{
{
Timestamps: []int64{1583786142},
Values: []float64{1},
},
}
if !reflect.DeepEqual(m, expected) {
t.Fatalf("unexpected metric %+v want %+v", m, expected)
}
g := NewGraphiteType()
gq := s.BuildWithParams(QuerierParams{DataSourceType: &g})
gq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourceGraphite)})
m, err = gq.Query(ctx, queryRender, ts)
m, _, err = gq.Query(ctx, queryRender, ts) // 8 - graphite
if err != nil {
t.Fatalf("unexpected %s", err)
}
if len(m) != 1 {
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
}
exp := Metric{
Labels: []Label{{Value: "constantLine(10)", Name: "name"}},
Timestamps: []int64{1611758403},
Values: []float64{10},
exp := []Metric{
{
Labels: []Label{{Value: "constantLine(10)", Name: "name"}},
Timestamps: []int64{1611758403},
Values: []float64{10},
},
}
if !reflect.DeepEqual(m[0], exp) {
t.Fatalf("unexpected metric %+v want %+v", m[0], expected)
metricsEqual(t, m, exp)
}
func metricsEqual(t *testing.T, gotM, expectedM []Metric) {
for i, exp := range expectedM {
got := gotM[i]
gotTS, expTS := got.Timestamps, exp.Timestamps
if !reflect.DeepEqual(gotTS, expTS) {
t.Fatalf("unexpected timestamps %+v want %+v", gotTS, expTS)
}
gotV, expV := got.Values, exp.Values
if !reflect.DeepEqual(gotV, expV) {
t.Fatalf("unexpected values %+v want %+v", gotV, expV)
}
sort.Slice(got.Labels, func(i, j int) bool {
return got.Labels[i].Name < got.Labels[j].Name
})
sort.Slice(exp.Labels, func(i, j int) bool {
return exp.Labels[i].Name < exp.Labels[j].Name
})
if !reflect.DeepEqual(exp.Labels, got.Labels) {
t.Fatalf("unexpected labels %+v want %+v", got.Labels, exp.Labels)
}
}
}
@@ -183,6 +221,10 @@ func TestVMRangeQuery(t *testing.T) {
if _, err := strconv.ParseInt(endTS, 10, 64); err != nil {
t.Errorf("failed to parse 'end' query param: %s", err)
}
step := r.URL.Query().Get("step")
if step != "15s" {
t.Errorf("expected 'step' query param to be 15s; got %q instead", step)
}
switch c {
case 0:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"vm_rows"},"values":[[1583786142,"13763"]]}]}}`))
@@ -192,14 +234,13 @@ func TestVMRangeQuery(t *testing.T) {
srv := httptest.NewServer(mux)
defer srv.Close()
authCfg, err := promauth.NewConfig(".", nil, baCfg, "", "", nil, nil)
authCfg, err := baCfg.NewConfig(".")
if err != nil {
t.Fatalf("unexpected: %s", err)
}
s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client(), false)
s := NewVMStorage(srv.URL, authCfg, time.Minute, *queryStep, false, srv.Client())
p := NewPrometheusType()
pq := s.BuildWithParams(QuerierParams{DataSourceType: &p, EvaluationInterval: 15 * time.Second})
pq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourcePrometheus), EvaluationInterval: 15 * time.Second})
_, err = pq.QueryRange(ctx, query, time.Now(), time.Time{})
expectError(t, err, "is missing")
@@ -225,15 +266,14 @@ func TestVMRangeQuery(t *testing.T) {
t.Fatalf("unexpected metric %+v want %+v", m[0], expected)
}
g := NewGraphiteType()
gq := s.BuildWithParams(QuerierParams{DataSourceType: &g})
gq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourceGraphite)})
_, err = gq.QueryRange(ctx, queryRender, start, end)
expectError(t, err, "is not supported")
}
func TestRequestParams(t *testing.T) {
authCfg, err := promauth.NewConfig(".", nil, baCfg, "", "", nil, nil)
authCfg, err := baCfg.NewConfig(".")
if err != nil {
t.Fatalf("unexpected: %s", err)
}
@@ -249,95 +289,49 @@ func TestRequestParams(t *testing.T) {
"prometheus path",
false,
&VMStorage{
dataSourceType: NewPrometheusType(),
dataSourceType: datasourcePrometheus,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusInstantPath, r.URL.Path)
},
},
{
"prometheus path with disablePathAppend",
false,
&VMStorage{
dataSourceType: NewPrometheusType(),
disablePathAppend: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, "", r.URL.Path)
checkEqualString(t, "/api/v1/query", r.URL.Path)
},
},
{
"prometheus prefix",
false,
&VMStorage{
dataSourceType: NewPrometheusType(),
dataSourceType: datasourcePrometheus,
appendTypePrefix: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusPrefix+prometheusInstantPath, r.URL.Path)
},
},
{
"prometheus prefix with disablePathAppend",
false,
&VMStorage{
dataSourceType: NewPrometheusType(),
appendTypePrefix: true,
disablePathAppend: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusPrefix, r.URL.Path)
checkEqualString(t, "/prometheus/api/v1/query", r.URL.Path)
},
},
{
"prometheus range path",
true,
&VMStorage{
dataSourceType: NewPrometheusType(),
dataSourceType: datasourcePrometheus,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusRangePath, r.URL.Path)
},
},
{
"prometheus range path with disablePathAppend",
true,
&VMStorage{
dataSourceType: NewPrometheusType(),
disablePathAppend: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, "", r.URL.Path)
checkEqualString(t, "/api/v1/query_range", r.URL.Path)
},
},
{
"prometheus range prefix",
true,
&VMStorage{
dataSourceType: NewPrometheusType(),
dataSourceType: datasourcePrometheus,
appendTypePrefix: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusPrefix+prometheusRangePath, r.URL.Path)
},
},
{
"prometheus range prefix with disablePathAppend",
true,
&VMStorage{
dataSourceType: NewPrometheusType(),
appendTypePrefix: true,
disablePathAppend: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusPrefix, r.URL.Path)
checkEqualString(t, "/prometheus/api/v1/query_range", r.URL.Path)
},
},
{
"graphite path",
false,
&VMStorage{
dataSourceType: NewGraphiteType(),
dataSourceType: datasourceGraphite,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, graphitePath, r.URL.Path)
@@ -347,7 +341,7 @@ func TestRequestParams(t *testing.T) {
"graphite prefix",
false,
&VMStorage{
dataSourceType: NewGraphiteType(),
dataSourceType: datasourceGraphite,
appendTypePrefix: true,
},
func(t *testing.T, r *http.Request) {
@@ -485,7 +479,7 @@ func TestRequestParams(t *testing.T) {
"graphite extra params",
false,
&VMStorage{
dataSourceType: NewGraphiteType(),
dataSourceType: datasourceGraphite,
extraParams: url.Values{
"nocache": {"1"},
"max_lookback": {"1h"},
@@ -504,14 +498,14 @@ func TestRequestParams(t *testing.T) {
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
switch tc.vm.dataSourceType.String() {
case "prometheus":
switch tc.vm.dataSourceType {
case "", datasourcePrometheus:
if tc.queryRange {
tc.vm.setPrometheusRangeReqParams(req, query, timestamp, timestamp)
} else {
tc.vm.setPrometheusInstantReqParams(req, query, timestamp)
}
case "graphite":
case datasourceGraphite:
tc.vm.setGraphiteReqParams(req, query, timestamp)
}
tc.checkFn(t, req)
@@ -519,7 +513,7 @@ func TestRequestParams(t *testing.T) {
}
}
func TestAuthConfig(t *testing.T) {
func TestHeaders(t *testing.T) {
var testCases = []struct {
name string
vmFn func() *VMStorage
@@ -559,6 +553,40 @@ func TestAuthConfig(t *testing.T) {
checkEqualString(t, "foo", token)
},
},
{
name: "custom extraHeaders",
vmFn: func() *VMStorage {
return &VMStorage{extraHeaders: []keyValue{
{key: "Foo", value: "bar"},
{key: "Baz", value: "qux"},
}}
},
checkFn: func(t *testing.T, r *http.Request) {
h1 := r.Header.Get("Foo")
checkEqualString(t, "bar", h1)
h2 := r.Header.Get("Baz")
checkEqualString(t, "qux", h2)
},
},
{
name: "custom header overrides basic auth",
vmFn: func() *VMStorage {
cfg, err := utils.AuthConfig(utils.WithBasicAuth("foo", "bar", ""))
if err != nil {
t.Errorf("Error get auth config: %s", err)
}
return &VMStorage{
authCfg: cfg,
extraHeaders: []keyValue{
{key: "Authorization", value: "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="},
}}
},
checkFn: func(t *testing.T, r *http.Request) {
u, p, _ := r.BasicAuth()
checkEqualString(t, "Aladdin", u)
checkEqualString(t, "open sesame", p)
},
},
}
for _, tt := range testCases {
t.Run(tt.name, func(t *testing.T) {

View File

@@ -10,6 +10,8 @@ import (
"sync"
"time"
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
@@ -18,7 +20,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
)
// Group is an entity for grouping rules
@@ -27,14 +28,16 @@ type Group struct {
Name string
File string
Rules []Rule
Type datasource.Type
Type config.Type
Interval time.Duration
Limit int
Concurrency int
Checksum string
LastEvaluation time.Time
Labels map[string]string
Params url.Values
Labels map[string]string
Params url.Values
Headers map[string]string
doneCh chan struct{}
finishedCh chan struct{}
@@ -49,14 +52,21 @@ type groupMetrics struct {
iterationTotal *utils.Counter
iterationDuration *utils.Summary
iterationMissed *utils.Counter
iterationInterval *utils.Gauge
}
func newGroupMetrics(name, file string) *groupMetrics {
func newGroupMetrics(g *Group) *groupMetrics {
m := &groupMetrics{}
labels := fmt.Sprintf(`group=%q, file=%q`, name, file)
labels := fmt.Sprintf(`group=%q, file=%q`, g.Name, g.File)
m.iterationTotal = utils.GetOrCreateCounter(fmt.Sprintf(`vmalert_iteration_total{%s}`, labels))
m.iterationDuration = utils.GetOrCreateSummary(fmt.Sprintf(`vmalert_iteration_duration_seconds{%s}`, labels))
m.iterationMissed = utils.GetOrCreateCounter(fmt.Sprintf(`vmalert_iteration_missed_total{%s}`, labels))
m.iterationInterval = utils.GetOrCreateGauge(fmt.Sprintf(`vmalert_iteration_interval_seconds{%s}`, labels), func() float64 {
g.mu.RLock()
i := g.Interval.Seconds()
g.mu.RUnlock()
return i
})
return m
}
@@ -83,22 +93,27 @@ func newGroup(cfg config.Group, qb datasource.QuerierBuilder, defaultInterval ti
Name: cfg.Name,
File: cfg.File,
Interval: cfg.Interval.Duration(),
Limit: cfg.Limit,
Concurrency: cfg.Concurrency,
Checksum: cfg.Checksum,
Params: cfg.Params,
Headers: make(map[string]string),
Labels: cfg.Labels,
doneCh: make(chan struct{}),
finishedCh: make(chan struct{}),
updateCh: make(chan *Group),
}
g.metrics = newGroupMetrics(g.Name, g.File)
if g.Interval == 0 {
g.Interval = defaultInterval
}
if g.Concurrency < 1 {
g.Concurrency = 1
}
for _, h := range cfg.Headers {
g.Headers[h.Key] = h.Value
}
g.metrics = newGroupMetrics(g)
rules := make([]Rule, len(cfg.Rules))
for i, r := range cfg.Rules {
var extraLabels map[string]string
@@ -153,9 +168,12 @@ func (g *Group) Restore(ctx context.Context, qb datasource.QuerierBuilder, lookb
if rr.For < 1 {
continue
}
// ignore g.ExtraFilterLabels on purpose, so it
// won't affect the restore procedure.
q := qb.BuildWithParams(datasource.QuerierParams{})
// ignore QueryParams on purpose, because they could contain
// query filters. This may affect the restore procedure.
q := qb.BuildWithParams(datasource.QuerierParams{
DataSourceType: g.Type.String(),
Headers: g.Headers,
})
if err := rr.Restore(ctx, q, lookback, labels); err != nil {
return fmt.Errorf("error while restoring rule %q: %w", rule, err)
}
@@ -207,7 +225,9 @@ func (g *Group) updateWith(newGroup *Group) error {
g.Type = newGroup.Type
g.Concurrency = newGroup.Concurrency
g.Params = newGroup.Params
g.Headers = newGroup.Headers
g.Labels = newGroup.Labels
g.Limit = newGroup.Limit
g.Checksum = newGroup.Checksum
g.Rules = newRules
return nil
@@ -222,6 +242,8 @@ func (g *Group) close() {
g.metrics.iterationDuration.Unregister()
g.metrics.iterationTotal.Unregister()
g.metrics.iterationMissed.Unregister()
g.metrics.iterationInterval.Unregister()
for _, rule := range g.Rules {
rule.Close()
}
@@ -237,8 +259,6 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
notifiers: nts,
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label)}
evalTS := time.Now()
// Spread group rules evaluation over time in order to reduce load on VictoriaMetrics.
if !skipRandSleepOnGroupStart {
randSleep := uint64(float64(g.Interval) * (float64(g.ID()) / (1 << 64)))
@@ -259,6 +279,8 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
}
}
evalTS := time.Now()
logger.Infof("group %q started; interval=%v; concurrency=%d", g.Name, g.Interval, g.Concurrency)
eval := func(ts time.Time) {
@@ -273,7 +295,7 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
}
resolveDuration := getResolveDuration(g.Interval, *resendDelay, *maxResolveDuration)
errs := e.execConcurrently(ctx, g.Rules, ts, g.Concurrency, resolveDuration)
errs := e.execConcurrently(ctx, g.Rules, ts, g.Concurrency, resolveDuration, g.Limit)
for err := range errs {
if err != nil {
logger.Errorf("group %q: %s", g.Name, err)
@@ -303,6 +325,10 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
g.mu.Unlock()
continue
}
// ensure that staleness is tracked or existing rules only
e.purgeStaleSeries(g.Rules)
if g.Interval != ng.Interval {
g.Interval = ng.Interval
t.Stop()
@@ -347,12 +373,12 @@ type executor struct {
previouslySentSeriesToRW map[uint64]map[string][]prompbmarshal.Label
}
func (e *executor) execConcurrently(ctx context.Context, rules []Rule, ts time.Time, concurrency int, resolveDuration time.Duration) chan error {
func (e *executor) execConcurrently(ctx context.Context, rules []Rule, ts time.Time, concurrency int, resolveDuration time.Duration, limit int) chan error {
res := make(chan error, len(rules))
if concurrency == 1 {
// fast path
for _, rule := range rules {
res <- e.exec(ctx, rule, ts, resolveDuration)
res <- e.exec(ctx, rule, ts, resolveDuration, limit)
}
close(res)
return res
@@ -365,7 +391,7 @@ func (e *executor) execConcurrently(ctx context.Context, rules []Rule, ts time.T
sem <- struct{}{}
wg.Add(1)
go func(r Rule) {
res <- e.exec(ctx, r, ts, resolveDuration)
res <- e.exec(ctx, r, ts, resolveDuration, limit)
<-sem
wg.Done()
}(rule)
@@ -386,29 +412,35 @@ var (
remoteWriteTotal = metrics.NewCounter(`vmalert_remotewrite_total`)
)
func (e *executor) exec(ctx context.Context, rule Rule, ts time.Time, resolveDuration time.Duration) error {
func (e *executor) exec(ctx context.Context, rule Rule, ts time.Time, resolveDuration time.Duration, limit int) error {
execTotal.Inc()
tss, err := rule.Exec(ctx, ts)
tss, err := rule.Exec(ctx, ts, limit)
if err != nil {
execErrors.Inc()
return fmt.Errorf("rule %q: failed to execute: %w", rule, err)
}
errGr := new(utils.ErrGroup)
if e.rw != nil {
pushToRW := func(tss []prompbmarshal.TimeSeries) {
pushToRW := func(tss []prompbmarshal.TimeSeries) error {
var lastErr error
for _, ts := range tss {
remoteWriteTotal.Inc()
if err := e.rw.Push(ts); err != nil {
remoteWriteErrors.Inc()
errGr.Add(fmt.Errorf("rule %q: remote write failure: %w", rule, err))
lastErr = fmt.Errorf("rule %q: remote write failure: %w", rule, err)
}
}
return lastErr
}
pushToRW(tss)
if err := pushToRW(tss); err != nil {
return err
}
staleSeries := e.getStaleSeries(rule, tss, ts)
pushToRW(staleSeries)
if err := pushToRW(staleSeries); err != nil {
return err
}
}
ar, ok := rule.(*AlertingRule)
@@ -421,11 +453,18 @@ func (e *executor) exec(ctx context.Context, rule Rule, ts time.Time, resolveDur
return nil
}
wg := sync.WaitGroup{}
errGr := new(utils.ErrGroup)
for _, nt := range e.notifiers() {
if err := nt.Send(ctx, alerts); err != nil {
errGr.Add(fmt.Errorf("rule %q: failed to send alerts to addr %q: %w", rule, nt.Addr(), err))
}
wg.Add(1)
go func(nt notifier.Notifier) {
if err := nt.Send(ctx, alerts); err != nil {
errGr.Add(fmt.Errorf("rule %q: failed to send alerts to addr %q: %w", rule, nt.Addr(), err))
}
wg.Done()
}(nt)
}
wg.Wait()
return errGr.Err()
}
@@ -457,6 +496,30 @@ func (e *executor) getStaleSeries(rule Rule, tss []prompbmarshal.TimeSeries, tim
return staleS
}
// purgeStaleSeries deletes references in tracked
// previouslySentSeriesToRW list to Rules which aren't present
// in the given activeRules list. The method is used when the list
// of loaded rules has changed and executor has to remove
// references to non-existing rules.
func (e *executor) purgeStaleSeries(activeRules []Rule) {
newPreviouslySentSeriesToRW := make(map[uint64]map[string][]prompbmarshal.Label)
e.previouslySentSeriesToRWMu.Lock()
for _, rule := range activeRules {
id := rule.ID()
prev, ok := e.previouslySentSeriesToRW[id]
if ok {
// keep previous series for staleness detection
newPreviouslySentSeriesToRW[id] = prev
}
}
e.previouslySentSeriesToRW = nil
e.previouslySentSeriesToRW = newPreviouslySentSeriesToRW
e.previouslySentSeriesToRWMu.Unlock()
}
func labelsToString(labels []prompbmarshal.Label) string {
var b strings.Builder
b.WriteRune('{')

View File

@@ -3,8 +3,6 @@ package main
import (
"context"
"fmt"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"reflect"
"sort"
"testing"
@@ -12,6 +10,9 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
)
@@ -148,7 +149,7 @@ func TestUpdateWith(t *testing.T) {
t.Fatalf("expected to have rule %q; got %q", want, got)
}
if err := compareRules(t, got, want); err != nil {
t.Fatalf("comparsion error: %s", err)
t.Fatalf("comparison error: %s", err)
}
}
})
@@ -157,7 +158,7 @@ func TestUpdateWith(t *testing.T) {
func TestGroupStart(t *testing.T) {
// TODO: make parsing from string instead of file
groups, err := config.Parse([]string{"config/testdata/rules1-good.rules"}, true, true)
groups, err := config.Parse([]string{"config/testdata/rules/rules1-good.rules"}, notifier.ValidateTemplates, true)
if err != nil {
t.Fatalf("failed to parse rules: %s", err)
}
@@ -355,3 +356,121 @@ func TestGetStaleSeries(t *testing.T) {
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "bar")},
nil)
}
func TestPurgeStaleSeries(t *testing.T) {
ts := time.Now()
labels := toPromLabels(t, "__name__", "job:foo", "job", "foo")
tss := []prompbmarshal.TimeSeries{newTimeSeriesPB([]float64{1}, []int64{ts.Unix()}, labels)}
f := func(curRules, newRules, expStaleRules []Rule) {
t.Helper()
e := &executor{
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label),
}
// seed executor with series for
// current rules
for _, rule := range curRules {
e.getStaleSeries(rule, tss, ts)
}
e.purgeStaleSeries(newRules)
if len(e.previouslySentSeriesToRW) != len(expStaleRules) {
t.Fatalf("expected to get %d stale series, got %d",
len(expStaleRules), len(e.previouslySentSeriesToRW))
}
for _, exp := range expStaleRules {
if _, ok := e.previouslySentSeriesToRW[exp.ID()]; !ok {
t.Fatalf("expected to have rule %d; got nil instead", exp.ID())
}
}
}
f(nil, nil, nil)
f(
nil,
[]Rule{&AlertingRule{RuleID: 1}},
nil,
)
f(
[]Rule{&AlertingRule{RuleID: 1}},
nil,
nil,
)
f(
[]Rule{&AlertingRule{RuleID: 1}},
[]Rule{&AlertingRule{RuleID: 2}},
nil,
)
f(
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
[]Rule{&AlertingRule{RuleID: 2}},
[]Rule{&AlertingRule{RuleID: 2}},
)
f(
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
)
}
func TestFaultyNotifier(t *testing.T) {
fq := &fakeQuerier{}
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "bar"))
r := newTestAlertingRule("instant", 0)
r.q = fq
fn := &fakeNotifier{}
e := &executor{
notifiers: func() []notifier.Notifier {
return []notifier.Notifier{
&faultyNotifier{},
fn,
}
},
}
delay := 5 * time.Second
ctx, cancel := context.WithTimeout(context.Background(), delay)
defer cancel()
go func() {
_ = e.exec(ctx, r, time.Now(), 0, 10)
}()
tn := time.Now()
deadline := tn.Add(delay / 2)
for {
if fn.getCounter() > 0 {
return
}
if tn.After(deadline) {
break
}
tn = time.Now()
time.Sleep(time.Millisecond * 100)
}
t.Fatalf("alive notifier didn't receive notification by %v", deadline)
}
func TestFaultyRW(t *testing.T) {
fq := &fakeQuerier{}
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "bar"))
r := &RecordingRule{
Name: "test",
state: newRuleState(10),
q: fq,
}
e := &executor{
rw: &remotewrite.Client{},
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label),
}
err := e.exec(context.Background(), r, time.Now(), 0, 10)
if err == nil {
t.Fatalf("expected to get an error from faulty RW client, got nil instead")
}
}

View File

@@ -3,6 +3,7 @@ package main
import (
"context"
"fmt"
"net/http"
"reflect"
"sort"
"sync"
@@ -44,18 +45,20 @@ func (fq *fakeQuerier) BuildWithParams(_ datasource.QuerierParams) datasource.Qu
}
func (fq *fakeQuerier) QueryRange(ctx context.Context, q string, _, _ time.Time) ([]datasource.Metric, error) {
return fq.Query(ctx, q, time.Now())
req, _, err := fq.Query(ctx, q, time.Now())
return req, err
}
func (fq *fakeQuerier) Query(_ context.Context, _ string, _ time.Time) ([]datasource.Metric, error) {
func (fq *fakeQuerier) Query(_ context.Context, _ string, _ time.Time) ([]datasource.Metric, *http.Request, error) {
fq.Lock()
defer fq.Unlock()
if fq.err != nil {
return nil, fq.err
return nil, nil, fq.err
}
cp := make([]datasource.Metric, len(fq.metrics))
copy(cp, fq.metrics)
return cp, nil
req, _ := http.NewRequest(http.MethodPost, "foo.com", nil)
return cp, req, nil
}
type fakeNotifier struct {
@@ -87,6 +90,18 @@ func (fn *fakeNotifier) getAlerts() []notifier.Alert {
return fn.alerts
}
type faultyNotifier struct {
fakeNotifier
}
func (fn *faultyNotifier) Send(ctx context.Context, _ []notifier.Alert) error {
d, ok := ctx.Deadline()
if ok {
time.Sleep(time.Until(d))
}
return fmt.Errorf("send failed")
}
func metricWithValueAndLabels(t *testing.T, value float64, labels ...string) datasource.Metric {
return metricWithValuesAndLabels(t, []float64{value}, labels...)
}
@@ -152,7 +167,7 @@ func compareGroups(t *testing.T, a, b *Group) {
t.Fatalf("expected to have rule %q; got %q", want.ID(), got.ID())
}
if err := compareRules(t, want, got); err != nil {
t.Fatalf("comparsion error: %s", err)
t.Fatalf("comparison error: %s", err)
}
}
}

View File

@@ -15,6 +15,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remoteread"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
@@ -22,11 +23,12 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/pushmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
rulePath = flagutil.NewArray("rule", `Path to the file with alert rules.
rulePath = flagutil.NewArrayString("rule", `Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule="/path/to/file". Path to a single file with alerting rules
@@ -34,6 +36,13 @@ Examples:
absolute path to all .yaml files in root.
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.`)
ruleTemplatesPath = flagutil.NewArrayString("rule.templates", `Path or glob pattern to location with go template definitions
for rules annotations templating. Flag can be specified multiple times.
Examples:
-rule.templates="/path/to/file". Path to a single file with go templates
-rule.templates="dir/*.tpl" -rule.templates="/*.tpl". Relative path to all .tpl files in "dir" folder,
absolute path to all .tpl files in root.`)
rulesCheckInterval = flag.Duration("rule.configCheckInterval", 0, "Interval for checking for changes in '-rule' files. "+
"By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead")
@@ -47,12 +56,17 @@ Rule files may contain %{ENV_VAR} placeholders, which are substituted by the cor
validateExpressions = flag.Bool("rule.validateExpressions", true, "Whether to validate rules expressions via MetricsQL engine")
maxResolveDuration = flag.Duration("rule.maxResolveDuration", 0, "Limits the maximum duration for automatic alert expiration, "+
"which is by default equal to 3 evaluation intervals of the parent group.")
resendDelay = flag.Duration("rule.resendDelay", 0, "Minimum amount of time to wait before resending an alert to notifier")
resendDelay = flag.Duration("rule.resendDelay", 0, "Minimum amount of time to wait before resending an alert to notifier")
ruleUpdateEntriesLimit = flag.Int("rule.updateEntriesLimit", 20, "Defines the max number of rule's state updates stored in-memory. "+
"Rule's updates are available on rule's Details page and are used for debugging purposes. The number of stored updates can be overriden per rule via update_entries_limit param.")
externalURL = flag.String("external.url", "", "External URL is used as alert's source for sent alerts to the notifier")
externalAlertSource = flag.String("external.alert.source", "", `External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|queryEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used`)
externalLabels = flagutil.NewArray("external.label", "Optional label in the form 'Name=value' to add to all generated recording rules and alerts. "+
externalAlertSource = flag.String("external.alert.source", "", `External Alert Source allows to override the Source link for alerts sent to AlertManager `+
`for cases where you want to build a custom link to Grafana, Prometheus or any other service. `+
`Supports templating - see https://docs.victoriametrics.com/vmalert.html#templating . `+
`For example, link to Grafana: -external.alert.source='explore?orgId=1&left=["now-1h","now","VictoriaMetrics",{"expr":{{$expr|jsonEscape|queryEscape}} },{"mode":"Metrics"},{"ui":[true,true,true,"none"]}]' . `+
`If empty 'vmalert/alert?group_id={{.GroupID}}&alert_id={{.AlertID}}' is used.`)
externalLabels = flagutil.NewArrayString("external.label", "Optional label in the form 'Name=value' to add to all generated recording rules and alerts. "+
"Pass multiple -label flags in order to add multiple label sets.")
remoteReadLookBack = flag.Duration("remoteRead.lookback", time.Hour, "Lookback defines how far to look into past for alerts timeseries."+
@@ -61,7 +75,7 @@ eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{
disableAlertGroupLabel = flag.Bool("disableAlertgroupLabel", false, "Whether to disable adding group's Name as label to generated alerts and time series.")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The `-rule` flag must be specified.")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.")
)
var alertURLGeneratorFn notifier.AlertURLGenerator
@@ -71,13 +85,20 @@ func main() {
flag.CommandLine.SetOutput(os.Stdout)
flag.Usage = usage
envflag.Parse()
remoteread.InitSecretFlags()
remotewrite.InitSecretFlags()
datasource.InitSecretFlags()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
err := templates.Load(*ruleTemplatesPath, true)
if err != nil {
logger.Fatalf("failed to parse %q: %s", *ruleTemplatesPath, err)
}
if *dryRun {
u, _ := url.Parse("https://victoriametrics.com/")
notifier.InitTemplateFunc(u)
groups, err := config.Parse(*rulePath, true, true)
groups, err := config.Parse(*rulePath, notifier.ValidateTemplates, true)
if err != nil {
logger.Fatalf("failed to parse %q: %s", *rulePath, err)
}
@@ -91,12 +112,17 @@ func main() {
if err != nil {
logger.Fatalf("failed to init `external.url`: %s", err)
}
notifier.InitTemplateFunc(eu)
alertURLGeneratorFn, err = getAlertURLGenerator(eu, *externalAlertSource, *validateTemplates)
if err != nil {
logger.Fatalf("failed to init `external.alert.source`: %s", err)
}
var validateTplFn config.ValidateTplFn
if *validateTemplates {
validateTplFn = notifier.ValidateTemplates
}
if *replayFrom != "" || *replayTo != "" {
rw, err := remotewrite.Init(context.Background())
if err != nil {
@@ -105,8 +131,7 @@ func main() {
if rw == nil {
logger.Fatalf("remoteWrite.url can't be empty in replay mode")
}
notifier.InitTemplateFunc(eu)
groupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
groupsCfg, err := config.Parse(*rulePath, validateTplFn, *validateExpressions)
if err != nil {
logger.Fatalf("cannot parse configuration file: %s", err)
}
@@ -127,9 +152,8 @@ func main() {
if err != nil {
logger.Fatalf("failed to init: %s", err)
}
logger.Infof("reading rules configuration file from %q", strings.Join(*rulePath, ";"))
groupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
groupsCfg, err := config.Parse(*rulePath, validateTplFn, *validateExpressions)
if err != nil {
logger.Fatalf("cannot parse configuration file: %s", err)
}
@@ -170,7 +194,7 @@ func newManager(ctx context.Context) (*manager, error) {
return nil, fmt.Errorf("failed to init datasource: %w", err)
}
labels := make(map[string]string, 0)
labels := make(map[string]string)
for _, s := range *externalLabels {
if len(s) == 0 {
continue
@@ -228,8 +252,9 @@ func getExternalURL(externalURL, httpListenAddr string, isSecure bool) (*url.URL
func getAlertURLGenerator(externalURL *url.URL, externalAlertSource string, validateTemplate bool) (notifier.AlertURLGenerator, error) {
if externalAlertSource == "" {
return func(alert notifier.Alert) string {
return fmt.Sprintf("%s/api/v1/%s/%s/status", externalURL, strconv.FormatUint(alert.GroupID, 10), strconv.FormatUint(alert.ID, 10))
return func(a notifier.Alert) string {
gID, aID := strconv.FormatUint(a.GroupID, 10), strconv.FormatUint(a.ID, 10)
return fmt.Sprintf("%s/vmalert/alert?%s=%s&%s=%s", externalURL, paramGroupID, gID, paramAlertID, aID)
}, nil
}
if validateTemplate {
@@ -243,7 +268,7 @@ func getAlertURLGenerator(externalURL *url.URL, externalAlertSource string, vali
"tpl": externalAlertSource,
}
return func(alert notifier.Alert) string {
templated, err := alert.ExecTemplate(nil, nil, m)
templated, err := alert.ExecTemplate(nil, alert.Labels, m)
if err != nil {
logger.Errorf("can not exec source template %s", err)
}
@@ -273,6 +298,11 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
defer ticker.Stop()
}
var validateTplFn config.ValidateTplFn
if *validateTemplates {
validateTplFn = notifier.ValidateTemplates
}
// init reload metrics with positive values to improve alerting conditions
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
@@ -281,7 +311,11 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
case <-ctx.Done():
return
case <-sighupCh:
logger.Infof("SIGHUP received. Going to reload rules %q ...", *rulePath)
tmplMsg := ""
if len(*ruleTemplatesPath) > 0 {
tmplMsg = fmt.Sprintf("and templates %q ", *ruleTemplatesPath)
}
logger.Infof("SIGHUP received. Going to reload rules %q %s...", *rulePath, tmplMsg)
configReloads.Inc()
case <-configCheckCh:
}
@@ -291,7 +325,14 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
logger.Errorf("failed to reload notifier config: %s", err)
continue
}
newGroupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
err := templates.Load(*ruleTemplatesPath, false)
if err != nil {
configReloadErrors.Inc()
configSuccess.Set(0)
logger.Errorf("failed to load new templates: %s", err)
continue
}
newGroupsCfg, err := config.Parse(*rulePath, validateTplFn, *validateExpressions)
if err != nil {
configReloadErrors.Inc()
configSuccess.Set(0)
@@ -299,6 +340,7 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
continue
}
if configsEqual(newGroupsCfg, groupsCfg) {
templates.Reload()
// set success to 1 since previous reload
// could have been unsuccessful
configSuccess.Set(1)
@@ -311,6 +353,7 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
logger.Errorf("error while reloading rules: %s", err)
continue
}
templates.Reload()
groupsCfg = newGroupsCfg
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())

View File

@@ -3,7 +3,6 @@ package main
import (
"context"
"fmt"
"io/ioutil"
"net/url"
"os"
"testing"
@@ -35,24 +34,25 @@ func TestGetExternalURL(t *testing.T) {
}
func TestGetAlertURLGenerator(t *testing.T) {
testAlert := notifier.Alert{GroupID: 42, ID: 2, Value: 4}
testAlert := notifier.Alert{GroupID: 42, ID: 2, Value: 4, Labels: map[string]string{"tenant": "baz"}}
u, _ := url.Parse("https://victoriametrics.com/path")
fn, err := getAlertURLGenerator(u, "", false)
if err != nil {
t.Errorf("unexpected error %s", err)
}
if exp := "https://victoriametrics.com/path/api/v1/42/2/status"; exp != fn(testAlert) {
exp := fmt.Sprintf("https://victoriametrics.com/path/vmalert/alert?%s=42&%s=2", paramGroupID, paramAlertID)
if exp != fn(testAlert) {
t.Errorf("unexpected url want %s, got %s", exp, fn(testAlert))
}
_, err = getAlertURLGenerator(nil, "foo?{{invalid}}", true)
if err == nil {
t.Errorf("expected tempalte validation error got nil")
t.Errorf("expected template validation error got nil")
}
fn, err = getAlertURLGenerator(u, "foo?query={{$value}}", true)
fn, err = getAlertURLGenerator(u, "foo?query={{$value}}&ds={{ $labels.tenant }}", true)
if err != nil {
t.Errorf("unexpected error %s", err)
}
if exp := "https://victoriametrics.com/path/foo?query=4"; exp != fn(testAlert) {
if exp := "https://victoriametrics.com/path/foo?query=4&ds=baz"; exp != fn(testAlert) {
t.Errorf("unexpected url want %s, got %s", exp, fn(testAlert))
}
}
@@ -86,7 +86,7 @@ groups:
`
)
f, err := ioutil.TempFile("", "")
f, err := os.CreateTemp("", "")
if err != nil {
t.Fatal(err)
}
@@ -153,7 +153,7 @@ groups:
func writeToFile(t *testing.T, file, b string) {
t.Helper()
err := ioutil.WriteFile(file, []byte(b), 0644)
err := os.WriteFile(file, []byte(b), 0644)
if err != nil {
t.Fatal(err)
}

View File

@@ -30,6 +30,23 @@ type manager struct {
groups map[uint64]*Group
}
// RuleAPI generates APIRule object from alert by its ID(hash)
func (m *manager) RuleAPI(gID, rID uint64) (APIRule, error) {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
g, ok := m.groups[gID]
if !ok {
return APIRule{}, fmt.Errorf("can't find group with id %d", gID)
}
for _, rule := range g.Rules {
if rule.ID() == rID {
return rule.ToAPI(), nil
}
}
return APIRule{}, fmt.Errorf("can't find rule with id %d in group %q", rID, g.Name)
}
// AlertAPI generates APIAlert object from alert by its ID(hash)
func (m *manager) AlertAPI(gID, aID uint64) (*APIAlert, error) {
m.groupsMu.RLock()
@@ -70,9 +87,9 @@ func (m *manager) startGroup(ctx context.Context, group *Group, restore bool) er
err := group.Restore(ctx, m.rr, *remoteReadLookBack, m.labels)
if err != nil {
if !*remoteReadIgnoreRestoreErrors {
return fmt.Errorf("failed to restore state for group %q: %w", group.Name, err)
return fmt.Errorf("failed to restore ruleState for group %q: %w", group.Name, err)
}
logger.Errorf("error while restoring state for group %q: %s", group.Name, err)
logger.Errorf("error while restoring ruleState for group %q: %s", group.Name, err)
}
}
@@ -170,6 +187,7 @@ func (g *Group) toAPI() APIGroup {
LastEvaluation: g.LastEvaluation,
Concurrency: g.Concurrency,
Params: urlValuesToStrings(g.Params),
Headers: headersToStrings(g.Headers),
Labels: g.Labels,
}
for _, r := range g.Rules {
@@ -198,3 +216,23 @@ func urlValuesToStrings(values url.Values) []string {
}
return res
}
func headersToStrings(headers map[string]string) []string {
if len(headers) < 1 {
return nil
}
keys := make([]string, 0, len(headers))
for k := range headers {
keys = append(keys, k)
}
sort.Strings(keys)
var res []string
for _, k := range keys {
v := headers[k]
res = append(res, fmt.Sprintf("%s: %s", k, v))
}
return res
}

View File

@@ -3,7 +3,6 @@ package main
import (
"context"
"math/rand"
"net/url"
"os"
"strings"
"sync"
@@ -11,14 +10,15 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
)
func TestMain(m *testing.M) {
u, _ := url.Parse("https://victoriametrics.com/path")
notifier.InitTemplateFunc(u)
if err := templates.Load([]string{"testdata/templates/*good.tmpl"}, true); err != nil {
os.Exit(1)
}
os.Exit(m.Run())
}
@@ -29,7 +29,7 @@ func TestManagerEmptyRulesDir(t *testing.T) {
m := &manager{groups: make(map[uint64]*Group)}
cfg := loadCfg(t, []string{"foo/bar"}, true, true)
if err := m.update(context.Background(), cfg, false); err != nil {
t.Fatalf("expected to load succesfully with empty rules dir; got err instead: %v", err)
t.Fatalf("expected to load successfully with empty rules dir; got err instead: %v", err)
}
}
@@ -47,9 +47,9 @@ func TestManagerUpdateConcurrent(t *testing.T) {
"config/testdata/dir/rules0-bad.rules",
"config/testdata/dir/rules1-good.rules",
"config/testdata/dir/rules1-bad.rules",
"config/testdata/rules0-good.rules",
"config/testdata/rules1-good.rules",
"config/testdata/rules2-good.rules",
"config/testdata/rules/rules0-good.rules",
"config/testdata/rules/rules1-good.rules",
"config/testdata/rules/rules2-good.rules",
}
evalInterval := *evaluationInterval
defer func() { *evaluationInterval = evalInterval }()
@@ -68,7 +68,7 @@ func TestManagerUpdateConcurrent(t *testing.T) {
defer wg.Done()
for i := 0; i < iterations; i++ {
rnd := rand.Intn(len(paths))
cfg, err := config.Parse([]string{paths[rnd]}, true, true)
cfg, err := config.Parse([]string{paths[rnd]}, notifier.ValidateTemplates, true)
if err != nil { // update can fail and this is expected
continue
}
@@ -125,13 +125,13 @@ func TestManagerUpdate(t *testing.T) {
}{
{
name: "update good rules",
initPath: "config/testdata/rules0-good.rules",
initPath: "config/testdata/rules/rules0-good.rules",
updatePath: "config/testdata/dir/rules1-good.rules",
want: []*Group{
{
File: "config/testdata/dir/rules1-good.rules",
Name: "duplicatedGroupDiffFiles",
Type: datasource.NewPrometheusType(),
Type: config.NewPrometheusType(),
Interval: defaultEvalInterval,
Rules: []Rule{
&AlertingRule{
@@ -150,20 +150,20 @@ func TestManagerUpdate(t *testing.T) {
},
{
name: "update good rules from 1 to 2 groups",
initPath: "config/testdata/dir/rules1-good.rules",
updatePath: "config/testdata/rules0-good.rules",
initPath: "config/testdata/dir/rules/rules1-good.rules",
updatePath: "config/testdata/rules/rules0-good.rules",
want: []*Group{
{
File: "config/testdata/rules0-good.rules",
File: "config/testdata/rules/rules0-good.rules",
Name: "groupGorSingleAlert",
Type: datasource.NewPrometheusType(),
Type: config.NewPrometheusType(),
Rules: []Rule{VMRows},
Interval: defaultEvalInterval,
},
{
File: "config/testdata/rules0-good.rules",
File: "config/testdata/rules/rules0-good.rules",
Interval: defaultEvalInterval,
Type: datasource.NewPrometheusType(),
Type: config.NewPrometheusType(),
Name: "TestGroup", Rules: []Rule{
Conns,
ExampleAlertAlwaysFiring,
@@ -172,21 +172,21 @@ func TestManagerUpdate(t *testing.T) {
},
{
name: "update with one bad rule file",
initPath: "config/testdata/rules0-good.rules",
initPath: "config/testdata/rules/rules0-good.rules",
updatePath: "config/testdata/dir/rules2-bad.rules",
want: []*Group{
{
File: "config/testdata/rules0-good.rules",
File: "config/testdata/rules/rules0-good.rules",
Name: "groupGorSingleAlert",
Type: datasource.NewPrometheusType(),
Type: config.NewPrometheusType(),
Interval: defaultEvalInterval,
Rules: []Rule{VMRows},
},
{
File: "config/testdata/rules0-good.rules",
File: "config/testdata/rules/rules0-good.rules",
Interval: defaultEvalInterval,
Name: "TestGroup",
Type: datasource.NewPrometheusType(),
Type: config.NewPrometheusType(),
Rules: []Rule{
Conns,
ExampleAlertAlwaysFiring,
@@ -196,19 +196,19 @@ func TestManagerUpdate(t *testing.T) {
{
name: "update empty dir rules from 0 to 2 groups",
initPath: "config/testdata/empty/*",
updatePath: "config/testdata/rules0-good.rules",
updatePath: "config/testdata/rules/rules0-good.rules",
want: []*Group{
{
File: "config/testdata/rules0-good.rules",
File: "config/testdata/rules/rules0-good.rules",
Name: "groupGorSingleAlert",
Type: datasource.NewPrometheusType(),
Type: config.NewPrometheusType(),
Interval: defaultEvalInterval,
Rules: []Rule{VMRows},
},
{
File: "config/testdata/rules0-good.rules",
File: "config/testdata/rules/rules0-good.rules",
Interval: defaultEvalInterval,
Type: datasource.NewPrometheusType(),
Type: config.NewPrometheusType(),
Name: "TestGroup", Rules: []Rule{
Conns,
ExampleAlertAlwaysFiring,
@@ -231,7 +231,7 @@ func TestManagerUpdate(t *testing.T) {
t.Fatalf("failed to complete initial rules update: %s", err)
}
cfgUpdate, err := config.Parse([]string{tc.updatePath}, true, true)
cfgUpdate, err := config.Parse([]string{tc.updatePath}, notifier.ValidateTemplates, true)
if err == nil { // update can fail and that's expected
_ = m.update(ctx, cfgUpdate, false)
}
@@ -329,7 +329,11 @@ func TestManagerUpdateNegative(t *testing.T) {
func loadCfg(t *testing.T, path []string, validateAnnotations, validateExpressions bool) []config.Group {
t.Helper()
cfg, err := config.Parse(path, validateAnnotations, validateExpressions)
var validateTplFn config.ValidateTplFn
if validateAnnotations {
validateTplFn = notifier.ValidateTemplates
}
cfg, err := config.Parse(path, validateTplFn, validateExpressions)
if err != nil {
t.Fatal(err)
}

View File

@@ -9,4 +9,4 @@ COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certifica
EXPOSE 8880
ENTRYPOINT ["/vmalert-prod"]
ARG TARGETARCH
COPY vmalert-${TARGETARCH}-prod ./vmalert-prod
COPY vmalert-linux-${TARGETARCH}-prod ./vmalert-prod

View File

@@ -5,9 +5,10 @@ import (
"fmt"
"io"
"strings"
"text/template"
textTpl "text/template"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
@@ -44,6 +45,8 @@ type Alert struct {
ID uint64
// Restored is true if Alert was restored after restart
Restored bool
// For defines for how long Alert needs to be active to become StateFiring
For time.Duration
}
// AlertState type indicates the Alert state
@@ -73,9 +76,13 @@ func (as AlertState) String() string {
// AlertTplData is used to execute templating
type AlertTplData struct {
Labels map[string]string
Value float64
Expr string
Labels map[string]string
Value float64
Expr string
AlertID uint64
GroupID uint64
ActiveAt time.Time
For time.Duration
}
var tplHeaders = []string{
@@ -84,32 +91,56 @@ var tplHeaders = []string{
"{{ $expr := .Expr }}",
"{{ $externalLabels := .ExternalLabels }}",
"{{ $externalURL := .ExternalURL }}",
"{{ $alertID := .AlertID }}",
"{{ $groupID := .GroupID }}",
"{{ $activeAt := .ActiveAt }}",
"{{ $for := .For }}",
}
// ExecTemplate executes the Alert template for given
// map of annotations.
// Every alert could have a different datasource, so function
// requires a queryFunction as an argument.
func (a *Alert) ExecTemplate(q QueryFn, labels, annotations map[string]string) (map[string]string, error) {
tplData := AlertTplData{Value: a.Value, Labels: labels, Expr: a.Expr}
return templateAnnotations(annotations, tplData, funcsWithQuery(q))
func (a *Alert) ExecTemplate(q templates.QueryFn, labels, annotations map[string]string) (map[string]string, error) {
tplData := AlertTplData{
Value: a.Value,
Labels: labels,
Expr: a.Expr,
AlertID: a.ID,
GroupID: a.GroupID,
ActiveAt: a.ActiveAt,
For: a.For,
}
tmpl, err := templates.GetWithFuncs(templates.FuncsWithQuery(q))
if err != nil {
return nil, fmt.Errorf("error getting a template: %w", err)
}
return templateAnnotations(annotations, tplData, tmpl, true)
}
// ExecTemplate executes the given template for given annotations map.
func ExecTemplate(q QueryFn, annotations map[string]string, tpl AlertTplData) (map[string]string, error) {
return templateAnnotations(annotations, tpl, funcsWithQuery(q))
func ExecTemplate(q templates.QueryFn, annotations map[string]string, tplData AlertTplData) (map[string]string, error) {
tmpl, err := templates.GetWithFuncs(templates.FuncsWithQuery(q))
if err != nil {
return nil, fmt.Errorf("error cloning template: %w", err)
}
return templateAnnotations(annotations, tplData, tmpl, true)
}
// ValidateTemplates validate annotations for possible template error, uses empty data for template population
func ValidateTemplates(annotations map[string]string) error {
_, err := templateAnnotations(annotations, AlertTplData{
tmpl, err := templates.Get()
if err != nil {
return err
}
_, err = templateAnnotations(annotations, AlertTplData{
Labels: map[string]string{},
Value: 0,
}, tmplFunc)
}, tmpl, false)
return err
}
func templateAnnotations(annotations map[string]string, data AlertTplData, funcs template.FuncMap) (map[string]string, error) {
func templateAnnotations(annotations map[string]string, data AlertTplData, tmpl *textTpl.Template, execute bool) (map[string]string, error) {
var builder strings.Builder
var buf bytes.Buffer
eg := new(utils.ErrGroup)
@@ -122,7 +153,7 @@ func templateAnnotations(annotations map[string]string, data AlertTplData, funcs
builder.Grow(len(header) + len(text))
builder.WriteString(header)
builder.WriteString(text)
if err := templateAnnotation(&buf, builder.String(), tData, funcs); err != nil {
if err := templateAnnotation(&buf, builder.String(), tData, tmpl, execute); err != nil {
r[key] = text
eg.Add(fmt.Errorf("key %q, template %q: %w", key, text, err))
continue
@@ -138,11 +169,17 @@ type tplData struct {
ExternalURL string
}
func templateAnnotation(dst io.Writer, text string, data tplData, funcs template.FuncMap) error {
t := template.New("").Funcs(funcs).Option("missingkey=zero")
tpl, err := t.Parse(text)
func templateAnnotation(dst io.Writer, text string, data tplData, tmpl *textTpl.Template, execute bool) error {
tpl, err := tmpl.Clone()
if err != nil {
return fmt.Errorf("error parsing annotation: %w", err)
return fmt.Errorf("error cloning template before parse annotation: %w", err)
}
tpl, err = tpl.Parse(text)
if err != nil {
return fmt.Errorf("error parsing annotation template: %w", err)
}
if !execute {
return nil
}
if err = tpl.Execute(dst, data); err != nil {
return fmt.Errorf("error evaluating annotation template: %w", err)
@@ -158,9 +195,9 @@ func (a Alert) toPromLabels(relabelCfg *promrelabel.ParsedConfigs) []prompbmarsh
Value: v,
})
}
promrelabel.SortLabels(labels)
if relabelCfg != nil {
return relabelCfg.Apply(labels, 0, false)
labels = relabelCfg.Apply(labels, 0)
}
promrelabel.SortLabels(labels)
return labels
}

View File

@@ -4,6 +4,7 @@ import (
"fmt"
"reflect"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
@@ -11,7 +12,7 @@ import (
)
func TestAlert_ExecTemplate(t *testing.T) {
extLabels := make(map[string]string, 0)
extLabels := make(map[string]string)
const (
extCluster = "prod"
extDC = "east"
@@ -53,28 +54,35 @@ func TestAlert_ExecTemplate(t *testing.T) {
"job": "staging",
"instance": "localhost",
},
For: 5 * time.Minute,
},
annotations: map[string]string{
"summary": "Too high connection number for {{$labels.instance}} for job {{$labels.job}}",
"description": "It is {{ $value }} connections for {{$labels.instance}}",
"description": "It is {{ $value }} connections for {{$labels.instance}} for more than {{ .For }}",
},
expTpl: map[string]string{
"summary": "Too high connection number for localhost for job staging",
"description": "It is 10000 connections for localhost",
"description": "It is 10000 connections for localhost for more than 5m0s",
},
},
{
name: "expression-template",
alert: &Alert{
Expr: `vm_rows{"label"="bar"}>0`,
Expr: `vm_rows{"label"="bar"}<0`,
},
annotations: map[string]string{
"exprEscapedQuery": "{{ $expr|quotesEscape|queryEscape }}",
"exprEscapedPath": "{{ $expr|quotesEscape|pathEscape }}",
"exprEscapedQuery": "{{ $expr|queryEscape }}",
"exprEscapedPath": "{{ $expr|pathEscape }}",
"exprEscapedJSON": "{{ $expr|jsonEscape }}",
"exprEscapedQuotes": "{{ $expr|quotesEscape }}",
"exprEscapedHTML": "{{ $expr|htmlEscape }}",
},
expTpl: map[string]string{
"exprEscapedQuery": "vm_rows%7B%5C%22label%5C%22%3D%5C%22bar%5C%22%7D%3E0",
"exprEscapedPath": "vm_rows%7B%5C%22label%5C%22=%5C%22bar%5C%22%7D%3E0",
"exprEscapedQuery": "vm_rows%7B%22label%22%3D%22bar%22%7D%3C0",
"exprEscapedPath": "vm_rows%7B%22label%22=%22bar%22%7D%3C0",
"exprEscapedJSON": `"vm_rows{\"label\"=\"bar\"}\u003c0"`,
"exprEscapedQuotes": `vm_rows{\"label\"=\"bar\"}\u003c0`,
"exprEscapedHTML": "vm_rows{&quot;label&quot;=&quot;bar&quot;}&lt;0",
},
},
{
@@ -109,6 +117,65 @@ func TestAlert_ExecTemplate(t *testing.T) {
"description": fmt.Sprintf("It is 10000 connections for localhost (cluster-%s)", extCluster),
},
},
{
name: "alert and group IDs",
alert: &Alert{
ID: 42,
GroupID: 24,
},
annotations: map[string]string{
"url": "/api/v1/alert?alertID={{$alertID}}&groupID={{$groupID}}",
},
expTpl: map[string]string{
"url": "/api/v1/alert?alertID=42&groupID=24",
},
},
{
name: "ActiveAt time",
alert: &Alert{
ActiveAt: time.Date(2022, 8, 19, 20, 34, 58, 651387237, time.UTC),
},
annotations: map[string]string{
"diagram": "![](http://example.com?render={{$activeAt.Unix}}",
},
expTpl: map[string]string{
"diagram": "![](http://example.com?render=1660941298",
},
},
{
name: "ActiveAt time is nil",
alert: &Alert{},
annotations: map[string]string{
"default_time": "{{$activeAt}}",
},
expTpl: map[string]string{
"default_time": "0001-01-01 00:00:00 +0000 UTC",
},
},
{
name: "ActiveAt custom format",
alert: &Alert{
ActiveAt: time.Date(2022, 8, 19, 20, 34, 58, 651387237, time.UTC),
},
annotations: map[string]string{
"fire_time": `{{$activeAt.Format "2006/01/02 15:04:05"}}`,
},
expTpl: map[string]string{
"fire_time": "2022/08/19 20:34:58",
},
},
{
name: "ActiveAt query range",
alert: &Alert{
ActiveAt: time.Date(2022, 8, 19, 20, 34, 58, 651387237, time.UTC),
},
annotations: map[string]string{
"grafana_url": `vm-grafana.com?from={{($activeAt.Add (parseDurationTime "1h")).Unix}}&to={{($activeAt.Add (parseDurationTime "-1h")).Unix}}`,
},
expTpl: map[string]string{
"grafana_url": "vm-grafana.com?from=1660944898&to=1660937698",
},
},
}
qFn := func(q string) ([]datasource.Metric, error) {
@@ -173,7 +240,7 @@ func TestAlert_toPromLabels(t *testing.T) {
replacement: "aaa"
- action: labeldrop
regex: "env.*"
`), false)
`))
if err != nil {
t.Fatalf("unexpected error: %s", err)
}

View File

@@ -4,7 +4,7 @@ import (
"bytes"
"context"
"fmt"
"io/ioutil"
"io"
"net/http"
"strings"
"time"
@@ -79,9 +79,7 @@ func (am *AlertManager) send(ctx context.Context, alerts []Alert) error {
req = req.WithContext(ctx)
if am.authCfg != nil {
if auth := am.authCfg.GetAuthHeader(); auth != "" {
req.Header.Set("Authorization", auth)
}
am.authCfg.SetHeaders(req, true)
}
resp, err := am.client.Do(req)
if err != nil {
@@ -91,7 +89,7 @@ func (am *AlertManager) send(ctx context.Context, alerts []Alert) error {
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, err := ioutil.ReadAll(resp.Body)
body, err := io.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("failed to read response from %q: %w", am.addr, err)
}

View File

@@ -15,10 +15,10 @@
"endsAt":{%q= alert.End.Format(time.RFC3339Nano) %},
{% endif %}
"labels": {
"alertname":{%q= alert.Name %}
{% code lbls := alert.toPromLabels(relabelCfg) %}
{% for _, l := range lbls %}
,{%q= l.Name %}:{%q= l.Value %}
{% code ll := len(lbls) %}
{% for idx, l := range lbls %}
{%q= l.Name %}:{%q= l.Value %}{% if idx != ll-1 %}, {% endif %}
{% endfor %}
},
"annotations": {

View File

@@ -51,22 +51,27 @@ func streamamRequest(qw422016 *qt422016.Writer, alerts []Alert, generatorURL fun
//line app/vmalert/notifier/alertmanager_request.qtpl:16
}
//line app/vmalert/notifier/alertmanager_request.qtpl:16
qw422016.N().S(`"labels": {"alertname":`)
qw422016.N().S(`"labels": {`)
//line app/vmalert/notifier/alertmanager_request.qtpl:18
qw422016.N().Q(alert.Name)
//line app/vmalert/notifier/alertmanager_request.qtpl:19
lbls := alert.toPromLabels(relabelCfg)
//line app/vmalert/notifier/alertmanager_request.qtpl:19
ll := len(lbls)
//line app/vmalert/notifier/alertmanager_request.qtpl:20
for _, l := range lbls {
//line app/vmalert/notifier/alertmanager_request.qtpl:20
qw422016.N().S(`,`)
for idx, l := range lbls {
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().Q(l.Name)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().S(`:`)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().Q(l.Value)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
if idx != ll-1 {
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
}
//line app/vmalert/notifier/alertmanager_request.qtpl:22
}
//line app/vmalert/notifier/alertmanager_request.qtpl:22

View File

@@ -67,9 +67,6 @@ func TestAlertManager_Send(t *testing.T) {
if a[0].GeneratorURL != "0/0" {
t.Errorf("expected 0/0 as generatorURL got %s", a[0].GeneratorURL)
}
if a[0].Labels["alertname"] != "alert0" {
t.Errorf("expected alert0 as alert name got %s", a[0].Labels["alertname"])
}
if a[0].StartsAt.IsZero() {
t.Errorf("expected non-zero start time")
}

View File

@@ -4,17 +4,17 @@ import (
"crypto/md5"
"fmt"
"gopkg.in/yaml.v2"
"io/ioutil"
"net/url"
"os"
"path"
"path/filepath"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/consul"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/dns"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
)
@@ -29,6 +29,10 @@ type Config struct {
// ConsulSDConfigs contains list of settings for service discovery via Consul
// see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
ConsulSDConfigs []consul.SDConfig `yaml:"consul_sd_configs,omitempty"`
// DNSSDConfigs ontains list of settings for service discovery via DNS.
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config
DNSSDConfigs []dns.SDConfig `yaml:"dns_sd_configs,omitempty"`
// StaticConfigs contains list of static targets
StaticConfigs []StaticConfig `yaml:"static_configs,omitempty"`
@@ -39,7 +43,7 @@ type Config struct {
// AlertRelabelConfigs contains list of relabeling rules alert labels
AlertRelabelConfigs []promrelabel.RelabelConfig `yaml:"alert_relabel_configs,omitempty"`
// The timeout used when sending alerts.
Timeout promutils.Duration `yaml:"timeout,omitempty"`
Timeout *promutils.Duration `yaml:"timeout,omitempty"`
// Checksum stores the hash of yaml definition for the config.
// May be used to detect any changes to the config file.
@@ -58,10 +62,13 @@ type Config struct {
}
// StaticConfig contains list of static targets in the following form:
// targets:
// [ - '<host>' ]
//
// targets:
// [ - '<host>' ]
type StaticConfig struct {
Targets []string `yaml:"targets"`
// HTTPClientConfig contains HTTP configuration for the Targets
HTTPClientConfig promauth.HTTPClientConfig `yaml:",inline"`
}
// UnmarshalYAML implements the yaml.Unmarshaler interface.
@@ -76,12 +83,12 @@ func (cfg *Config) UnmarshalYAML(unmarshal func(interface{}) error) error {
if cfg.Timeout.Duration() == 0 {
cfg.Timeout = promutils.NewDuration(time.Second * 10)
}
rCfg, err := promrelabel.ParseRelabelConfigs(cfg.RelabelConfigs, false)
rCfg, err := promrelabel.ParseRelabelConfigs(cfg.RelabelConfigs)
if err != nil {
return fmt.Errorf("failed to parse relabeling config: %w", err)
}
cfg.parsedRelabelConfigs = rCfg
arCfg, err := promrelabel.ParseRelabelConfigs(cfg.AlertRelabelConfigs, false)
arCfg, err := promrelabel.ParseRelabelConfigs(cfg.AlertRelabelConfigs)
if err != nil {
return fmt.Errorf("failed to parse alert relabeling config: %w", err)
}
@@ -98,7 +105,7 @@ func (cfg *Config) UnmarshalYAML(unmarshal func(interface{}) error) error {
}
func parseConfig(path string) (*Config, error) {
data, err := ioutil.ReadFile(path)
data, err := os.ReadFile(path)
if err != nil {
return nil, fmt.Errorf("error reading config file: %w", err)
}
@@ -122,23 +129,24 @@ func parseConfig(path string) (*Config, error) {
return cfg, nil
}
func parseLabels(target string, metaLabels map[string]string, cfg *Config) (string, []prompbmarshal.Label, error) {
func parseLabels(target string, metaLabels *promutils.Labels, cfg *Config) (string, *promutils.Labels, error) {
labels := mergeLabels(target, metaLabels, cfg)
labels = cfg.parsedRelabelConfigs.Apply(labels, 0, false)
labels = promrelabel.RemoveMetaLabels(labels[:0], labels)
labels.Labels = cfg.parsedRelabelConfigs.Apply(labels.Labels, 0)
labels.RemoveMetaLabels()
labels.Sort()
// Remove references to already deleted labels, so GC could clean strings for label name and label value past len(labels).
// This should reduce memory usage when relabeling creates big number of temporary labels with long names and/or values.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825 for details.
labels = append([]prompbmarshal.Label{}, labels...)
labels = labels.Clone()
if len(labels) == 0 {
if labels.Len() == 0 {
return "", nil, nil
}
schemeRelabeled := promrelabel.GetLabelValueByName(labels, "__scheme__")
schemeRelabeled := labels.Get("__scheme__")
if len(schemeRelabeled) == 0 {
schemeRelabeled = "http"
}
addressRelabeled := promrelabel.GetLabelValueByName(labels, "__address__")
addressRelabeled := labels.Get("__address__")
if len(addressRelabeled) == 0 {
return "", nil, nil
}
@@ -146,7 +154,7 @@ func parseLabels(target string, metaLabels map[string]string, cfg *Config) (stri
return "", nil, nil
}
addressRelabeled = addMissingPort(schemeRelabeled, addressRelabeled)
alertsPathRelabeled := promrelabel.GetLabelValueByName(labels, "__alerts_path__")
alertsPathRelabeled := labels.Get("__alerts_path__")
if !strings.HasPrefix(alertsPathRelabeled, "/") {
alertsPathRelabeled = "/" + alertsPathRelabeled
}
@@ -170,21 +178,12 @@ func addMissingPort(scheme, target string) string {
return target
}
func mergeLabels(target string, metaLabels map[string]string, cfg *Config) []prompbmarshal.Label {
func mergeLabels(target string, metaLabels *promutils.Labels, cfg *Config) *promutils.Labels {
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
m := make(map[string]string)
m["__address__"] = target
m["__scheme__"] = cfg.Scheme
m["__alerts_path__"] = path.Join("/", cfg.PathPrefix, alertManagerPath)
for k, v := range metaLabels {
m[k] = v
}
result := make([]prompbmarshal.Label, 0, len(m))
for k, v := range m {
result = append(result, prompbmarshal.Label{
Name: k,
Value: v,
})
}
return result
m := promutils.NewLabels(3 + metaLabels.Len())
m.Add("__address__", target)
m.Add("__scheme__", cfg.Scheme)
m.Add("__alerts_path__", path.Join("/", cfg.PathPrefix, alertManagerPath))
m.AddFrom(metaLabels)
return m
}

View File

@@ -12,6 +12,7 @@ func TestConfigParseGood(t *testing.T) {
}
f("testdata/mixed.good.yaml")
f("testdata/consul.good.yaml")
f("testdata/dns.good.yaml")
f("testdata/static.good.yaml")
}

View File

@@ -6,7 +6,10 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/consul"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape/discovery/dns"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
)
// configWatcher supports dynamic reload of Notifier objects
@@ -121,7 +124,7 @@ func targetsFromLabels(labelsFn getLabels, cfg *Config, genFn AlertURLGenerator)
var errors []error
duplicates := make(map[string]struct{})
for _, labels := range metaLabels {
target := labels["__address__"]
target := labels.Get("__address__")
u, processedLabels, err := parseLabels(target, labels, cfg)
if err != nil {
errors = append(errors, err)
@@ -154,18 +157,19 @@ func targetsFromLabels(labelsFn getLabels, cfg *Config, genFn AlertURLGenerator)
return targets, errors
}
type getLabels func() ([]map[string]string, error)
type getLabels func() ([]*promutils.Labels, error)
func (cw *configWatcher) start() error {
if len(cw.cfg.StaticConfigs) > 0 {
var targets []Target
for _, cfg := range cw.cfg.StaticConfigs {
httpCfg := mergeHTTPClientConfigs(cw.cfg.HTTPClientConfig, cfg.HTTPClientConfig)
for _, target := range cfg.Targets {
address, labels, err := parseLabels(target, nil, cw.cfg)
if err != nil {
return fmt.Errorf("failed to parse labels for target %q: %s", target, err)
}
notifier, err := NewAlertManager(address, cw.genFn, cw.cfg.HTTPClientConfig, cw.cfg.parsedRelabelConfigs, cw.cfg.Timeout.Duration())
notifier, err := NewAlertManager(address, cw.genFn, httpCfg, cw.cfg.parsedAlertRelabelConfigs, cw.cfg.Timeout.Duration())
if err != nil {
return fmt.Errorf("failed to init alertmanager for addr %q: %s", address, err)
}
@@ -179,8 +183,8 @@ func (cw *configWatcher) start() error {
}
if len(cw.cfg.ConsulSDConfigs) > 0 {
err := cw.add(TargetConsul, *consul.SDCheckInterval, func() ([]map[string]string, error) {
var labels []map[string]string
err := cw.add(TargetConsul, *consul.SDCheckInterval, func() ([]*promutils.Labels, error) {
var labels []*promutils.Labels
for i := range cw.cfg.ConsulSDConfigs {
sdc := &cw.cfg.ConsulSDConfigs[i]
targetLabels, err := sdc.GetLabels(cw.cfg.baseDir)
@@ -195,6 +199,24 @@ func (cw *configWatcher) start() error {
return fmt.Errorf("failed to start consulSD discovery: %s", err)
}
}
if len(cw.cfg.DNSSDConfigs) > 0 {
err := cw.add(TargetDNS, *dns.SDCheckInterval, func() ([]*promutils.Labels, error) {
var labels []*promutils.Labels
for i := range cw.cfg.DNSSDConfigs {
sdc := &cw.cfg.DNSSDConfigs[i]
targetLabels, err := sdc.GetLabels(cw.cfg.baseDir)
if err != nil {
return nil, fmt.Errorf("got labels err: %s", err)
}
labels = append(labels, targetLabels...)
}
return labels, nil
})
if err != nil {
return fmt.Errorf("failed to start DNSSD discovery: %s", err)
}
}
return nil
}
@@ -233,3 +255,30 @@ func (cw *configWatcher) setTargets(key TargetType, targets []Target) {
cw.targets[key] = targets
cw.targetsMu.Unlock()
}
// mergeHTTPClientConfigs merges fields between child and parent params
// by populating child from parent params if they're missing.
func mergeHTTPClientConfigs(parent, child promauth.HTTPClientConfig) promauth.HTTPClientConfig {
if child.Authorization == nil {
child.Authorization = parent.Authorization
}
if child.BasicAuth == nil {
child.BasicAuth = parent.BasicAuth
}
if child.BearerToken == nil {
child.BearerToken = parent.BearerToken
}
if child.BearerTokenFile == "" {
child.BearerTokenFile = parent.BearerTokenFile
}
if child.OAuth2 == nil {
child.OAuth2 = parent.OAuth2
}
if child.TLSConfig == nil {
child.TLSConfig = parent.TLSConfig
}
if child.Headers == nil {
child.Headers = parent.Headers
}
return child
}

View File

@@ -2,17 +2,18 @@ package notifier
import (
"fmt"
"io/ioutil"
"math/rand"
"net/http"
"net/http/httptest"
"os"
"sync"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
)
func TestConfigWatcherReload(t *testing.T) {
f, err := ioutil.TempFile("", "")
f, err := os.CreateTemp("", "")
if err != nil {
t.Fatal(err)
}
@@ -34,7 +35,7 @@ static_configs:
t.Fatalf("expected to have 2 notifiers; got %d %#v", len(ns), ns)
}
f2, err := ioutil.TempFile("", "")
f2, err := os.CreateTemp("", "")
if err != nil {
t.Fatal(err)
}
@@ -61,7 +62,7 @@ func TestConfigWatcherStart(t *testing.T) {
consulSDServer := newFakeConsulServer()
defer consulSDServer.Close()
consulSDFile, err := ioutil.TempFile("", "")
consulSDFile, err := os.CreateTemp("", "")
if err != nil {
t.Fatal(err)
}
@@ -107,7 +108,7 @@ func TestConfigWatcherReloadConcurrent(t *testing.T) {
consulSDServer2 := newFakeConsulServer()
defer consulSDServer2.Close()
consulSDFile, err := ioutil.TempFile("", "")
consulSDFile, err := os.CreateTemp("", "")
if err != nil {
t.Fatal(err)
}
@@ -123,7 +124,7 @@ consul_sd_configs:
- consul
`, consulSDServer1.URL, consulSDServer2.URL))
staticAndConsulSDFile, err := ioutil.TempFile("", "")
staticAndConsulSDFile, err := os.CreateTemp("", "")
if err != nil {
t.Fatal(err)
}
@@ -175,7 +176,7 @@ consul_sd_configs:
func writeToFile(t *testing.T, file, b string) {
t.Helper()
checkErr(t, ioutil.WriteFile(file, []byte(b), 0644))
checkErr(t, os.WriteFile(file, []byte(b), 0644))
}
func checkErr(t *testing.T, err error) {
@@ -299,3 +300,20 @@ func newFakeConsulServer() *httptest.Server {
return httptest.NewServer(mux)
}
func TestMergeHTTPClientConfigs(t *testing.T) {
cfg1 := promauth.HTTPClientConfig{Headers: []string{"Header:Foo"}}
cfg2 := promauth.HTTPClientConfig{BasicAuth: &promauth.BasicAuthConfig{
Username: "foo",
Password: promauth.NewSecret("bar"),
}}
result := mergeHTTPClientConfigs(cfg1, cfg2)
if result.Headers == nil {
t.Fatalf("expected Headers to be inherited")
}
if result.BasicAuth == nil {
t.Fatalf("expected BasicAuth tp be present")
}
}

View File

@@ -3,44 +3,47 @@ package notifier
import (
"flag"
"fmt"
"net/url"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
)
var (
configPath = flag.String("notifier.config", "", "Path to configuration file for notifiers")
suppressDuplicateTargetErrors = flag.Bool("notifier.suppressDuplicateTargetErrors", false, "Whether to suppress 'duplicate target' errors during discovery")
addrs = flagutil.NewArray("notifier.url", "Prometheus alertmanager URL, e.g. http://127.0.0.1:9093")
addrs = flagutil.NewArrayString("notifier.url", "Prometheus Alertmanager URL, e.g. http://127.0.0.1:9093. "+
"List all Alertmanager URLs if it runs in the cluster mode to ensure high availability.")
basicAuthUsername = flagutil.NewArray("notifier.basicAuth.username", "Optional basic auth username for -notifier.url")
basicAuthPassword = flagutil.NewArray("notifier.basicAuth.password", "Optional basic auth password for -notifier.url")
basicAuthPasswordFile = flagutil.NewArray("notifier.basicAuth.passwordFile", "Optional path to basic auth password file for -notifier.url")
basicAuthUsername = flagutil.NewArrayString("notifier.basicAuth.username", "Optional basic auth username for -notifier.url")
basicAuthPassword = flagutil.NewArrayString("notifier.basicAuth.password", "Optional basic auth password for -notifier.url")
basicAuthPasswordFile = flagutil.NewArrayString("notifier.basicAuth.passwordFile", "Optional path to basic auth password file for -notifier.url")
bearerToken = flagutil.NewArray("notifier.bearerToken", "Optional bearer token for -notifier.url")
bearerTokenFile = flagutil.NewArray("notifier.bearerTokenFile", "Optional path to bearer token file for -notifier.url")
bearerToken = flagutil.NewArrayString("notifier.bearerToken", "Optional bearer token for -notifier.url")
bearerTokenFile = flagutil.NewArrayString("notifier.bearerTokenFile", "Optional path to bearer token file for -notifier.url")
tlsInsecureSkipVerify = flagutil.NewArrayBool("notifier.tlsInsecureSkipVerify", "Whether to skip tls verification when connecting to -notifier.url")
tlsCertFile = flagutil.NewArray("notifier.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting to -notifier.url")
tlsKeyFile = flagutil.NewArray("notifier.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to -notifier.url")
tlsCAFile = flagutil.NewArray("notifier.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to -notifier.url. "+
tlsCertFile = flagutil.NewArrayString("notifier.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting to -notifier.url")
tlsKeyFile = flagutil.NewArrayString("notifier.tlsKeyFile", "Optional path to client-side TLS certificate key to use when connecting to -notifier.url")
tlsCAFile = flagutil.NewArrayString("notifier.tlsCAFile", "Optional path to TLS CA file to use for verifying connections to -notifier.url. "+
"By default system CA is used")
tlsServerName = flagutil.NewArray("notifier.tlsServerName", "Optional TLS server name to use for connections to -notifier.url. "+
tlsServerName = flagutil.NewArrayString("notifier.tlsServerName", "Optional TLS server name to use for connections to -notifier.url. "+
"By default the server name from -notifier.url is used")
oauth2ClientID = flagutil.NewArray("notifier.oauth2.clientID", "Optional OAuth2 clientID to use for -notifier.url. "+
oauth2ClientID = flagutil.NewArrayString("notifier.oauth2.clientID", "Optional OAuth2 clientID to use for -notifier.url. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2ClientSecret = flagutil.NewArray("notifier.oauth2.clientSecret", "Optional OAuth2 clientSecret to use for -notifier.url. "+
oauth2ClientSecret = flagutil.NewArrayString("notifier.oauth2.clientSecret", "Optional OAuth2 clientSecret to use for -notifier.url. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2ClientSecretFile = flagutil.NewArray("notifier.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for -notifier.url. "+
oauth2ClientSecretFile = flagutil.NewArrayString("notifier.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for -notifier.url. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2TokenURL = flagutil.NewArray("notifier.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for -notifier.url. "+
oauth2TokenURL = flagutil.NewArrayString("notifier.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for -notifier.url. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2Scopes = flagutil.NewArray("notifier.oauth2.scopes", "Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. "+
oauth2Scopes = flagutil.NewArrayString("notifier.oauth2.scopes", "Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
)
@@ -72,9 +75,10 @@ var (
// Init returns a function for retrieving actual list of Notifier objects.
// Init works in two mods:
// * configuration via flags (for backward compatibility). Is always static
// - configuration via flags (for backward compatibility). Is always static
// and don't support live reloads.
// * configuration via file. Supports live reloads and service discovery.
// - configuration via file. Supports live reloads and service discovery.
//
// Init returns an error if both mods are used.
func Init(gen AlertURLGenerator, extLabels map[string]string, extURL string) (func() []Notifier, error) {
if externalLabels != nil || externalURL != "" {
@@ -83,6 +87,12 @@ func Init(gen AlertURLGenerator, extLabels map[string]string, extURL string) (fu
externalURL = extURL
externalLabels = extLabels
eu, err := url.Parse(externalURL)
if err != nil {
return nil, fmt.Errorf("failed to parse external URL: %s", err)
}
templates.UpdateWithFuncs(templates.FuncsWithExternalURL(eu))
if *configPath == "" && len(*addrs) == 0 {
return nil, nil
@@ -102,7 +112,6 @@ func Init(gen AlertURLGenerator, extLabels map[string]string, extURL string) (fu
return staticNotifiersFn, nil
}
var err error
cw, err = newWatcher(*configPath, gen)
if err != nil {
return nil, fmt.Errorf("failed to init config watcher: %s", err)
@@ -138,7 +147,7 @@ func notifiersFromFlags(gen AlertURLGenerator) ([]Notifier, error) {
}
addr = strings.TrimSuffix(addr, "/")
am, err := NewAlertManager(addr+alertManagerPath, gen, authCfg, nil, time.Minute)
am, err := NewAlertManager(addr+alertManagerPath, gen, authCfg, nil, time.Second*10)
if err != nil {
return nil, err
}
@@ -151,7 +160,7 @@ func notifiersFromFlags(gen AlertURLGenerator) ([]Notifier, error) {
// list of labels added during discovery.
type Target struct {
Notifier
Labels []prompbmarshal.Label
Labels *promutils.Labels
}
// TargetType defines how the Target was discovered
@@ -162,6 +171,8 @@ const (
TargetStatic TargetType = "static"
// TargetConsul is for targets discovered via Consul
TargetConsul TargetType = "consulSD"
// TargetDNS is for targets discovered via DNS
TargetDNS TargetType = "DNSSD"
)
// GetTargets returns list of static or discovered targets

View File

@@ -1,13 +1,14 @@
package notifier
import (
"net/url"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
"os"
"testing"
)
func TestMain(m *testing.M) {
u, _ := url.Parse("https://victoriametrics.com/path")
InitTemplateFunc(u)
if err := templates.Load([]string{"testdata/templates/*good.tmpl"}, true); err != nil {
os.Exit(1)
}
os.Exit(m.Run())
}

View File

@@ -1,357 +0,0 @@
// Copyright 2013 The Prometheus Authors
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package notifier
import (
"errors"
"fmt"
"math"
"net"
"net/url"
"regexp"
"sort"
"strings"
"time"
htmlTpl "html/template"
textTpl "text/template"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
)
// metric is private copy of datasource.Metric,
// it is used for templating annotations,
// Labels as map simplifies templates evaluation.
type metric struct {
Labels map[string]string
Timestamp int64
Value float64
}
// datasourceMetricsToTemplateMetrics converts Metrics from datasource package to private copy for templating.
func datasourceMetricsToTemplateMetrics(ms []datasource.Metric) []metric {
mss := make([]metric, 0, len(ms))
for _, m := range ms {
labelsMap := make(map[string]string, len(m.Labels))
for _, labelValue := range m.Labels {
labelsMap[labelValue.Name] = labelValue.Value
}
mss = append(mss, metric{
Labels: labelsMap,
Timestamp: m.Timestamps[0],
Value: m.Values[0]})
}
return mss
}
// QueryFn is used to wrap a call to datasource into simple-to-use function
// for templating functions.
type QueryFn func(query string) ([]datasource.Metric, error)
var tmplFunc textTpl.FuncMap
// InitTemplateFunc initiates template helper functions
func InitTemplateFunc(externalURL *url.URL) {
// See https://prometheus.io/docs/prometheus/latest/configuration/template_reference/
tmplFunc = textTpl.FuncMap{
/* Strings */
// reReplaceAll ReplaceAllString returns a copy of src, replacing matches of the Regexp with
// the replacement string repl. Inside repl, $ signs are interpreted as in Expand,
// so for instance $1 represents the text of the first submatch.
// alias for https://golang.org/pkg/regexp/#Regexp.ReplaceAllString
"reReplaceAll": func(pattern, repl, text string) string {
re := regexp.MustCompile(pattern)
return re.ReplaceAllString(text, repl)
},
// match reports whether the string s
// contains any match of the regular expression pattern.
// alias for https://golang.org/pkg/regexp/#MatchString
"match": regexp.MatchString,
// title returns a copy of the string s with all Unicode letters
// that begin words mapped to their Unicode title case.
// alias for https://golang.org/pkg/strings/#Title
"title": strings.Title,
// toUpper returns s with all Unicode letters mapped to their upper case.
// alias for https://golang.org/pkg/strings/#ToUpper
"toUpper": strings.ToUpper,
// toLower returns s with all Unicode letters mapped to their lower case.
// alias for https://golang.org/pkg/strings/#ToLower
"toLower": strings.ToLower,
// stripPort splits string into host and port, then returns only host.
"stripPort": func(hostPort string) string {
host, _, err := net.SplitHostPort(hostPort)
if err != nil {
return hostPort
}
return host
},
// parseDuration parses a duration string such as "1h" into the number of seconds it represents
"parseDuration": func(s string) (float64, error) {
d, err := promutils.ParseDuration(s)
if err != nil {
return 0, err
}
return d.Seconds(), nil
},
/* Numbers */
// humanize converts given number to a human readable format
// by adding metric prefixes https://en.wikipedia.org/wiki/Metric_prefix
"humanize": func(v float64) string {
if v == 0 || math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
}
if math.Abs(v) >= 1 {
prefix := ""
for _, p := range []string{"k", "M", "G", "T", "P", "E", "Z", "Y"} {
if math.Abs(v) < 1000 {
break
}
prefix = p
v /= 1000
}
return fmt.Sprintf("%.4g%s", v, prefix)
}
prefix := ""
for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} {
if math.Abs(v) >= 1 {
break
}
prefix = p
v *= 1000
}
return fmt.Sprintf("%.4g%s", v, prefix)
},
// humanize1024 converts given number to a human readable format with 1024 as base
"humanize1024": func(v float64) string {
if math.Abs(v) <= 1 || math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
}
prefix := ""
for _, p := range []string{"ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"} {
if math.Abs(v) < 1024 {
break
}
prefix = p
v /= 1024
}
return fmt.Sprintf("%.4g%s", v, prefix)
},
// humanizeDuration converts given seconds to a human readable duration
"humanizeDuration": func(v float64) string {
if math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
}
if v == 0 {
return fmt.Sprintf("%.4gs", v)
}
if math.Abs(v) >= 1 {
sign := ""
if v < 0 {
sign = "-"
v = -v
}
seconds := int64(v) % 60
minutes := (int64(v) / 60) % 60
hours := (int64(v) / 60 / 60) % 24
days := int64(v) / 60 / 60 / 24
// For days to minutes, we display seconds as an integer.
if days != 0 {
return fmt.Sprintf("%s%dd %dh %dm %ds", sign, days, hours, minutes, seconds)
}
if hours != 0 {
return fmt.Sprintf("%s%dh %dm %ds", sign, hours, minutes, seconds)
}
if minutes != 0 {
return fmt.Sprintf("%s%dm %ds", sign, minutes, seconds)
}
// For seconds, we display 4 significant digits.
return fmt.Sprintf("%s%.4gs", sign, v)
}
prefix := ""
for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} {
if math.Abs(v) >= 1 {
break
}
prefix = p
v *= 1000
}
return fmt.Sprintf("%.4g%ss", v, prefix)
},
// humanizePercentage converts given ratio value to a fraction of 100
"humanizePercentage": func(v float64) string {
return fmt.Sprintf("%.4g%%", v*100)
},
// humanizeTimestamp converts given timestamp to a human readable time equivalent
"humanizeTimestamp": func(v float64) string {
if math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
}
t := TimeFromUnixNano(int64(v * 1e9)).Time().UTC()
return fmt.Sprint(t)
},
/* URLs */
// externalURL returns value of `external.url` flag
"externalURL": func() string {
return externalURL.String()
},
// pathPrefix returns a Path segment from the URL value in `external.url` flag
"pathPrefix": func() string {
return externalURL.Path
},
// pathEscape escapes the string so it can be safely placed inside a URL path segment,
// replacing special characters (including /) with %XX sequences as needed.
// alias for https://golang.org/pkg/net/url/#PathEscape
"pathEscape": func(u string) string {
return url.PathEscape(u)
},
// queryEscape escapes the string so it can be safely placed
// inside a URL query.
// alias for https://golang.org/pkg/net/url/#QueryEscape
"queryEscape": func(q string) string {
return url.QueryEscape(q)
},
// crlfEscape replaces new line chars to skip URL encoding.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890
"crlfEscape": func(q string) string {
q = strings.Replace(q, "\n", `\n`, -1)
return strings.Replace(q, "\r", `\r`, -1)
},
// quotesEscape escapes quote char
"quotesEscape": func(q string) string {
return strings.Replace(q, `"`, `\"`, -1)
},
// query executes the MetricsQL/PromQL query against
// configured `datasource.url` address.
// For example, {{ query "foo" | first | value }} will
// execute "/api/v1/query?query=foo" request and will return
// the first value in response.
"query": func(q string) ([]metric, error) {
// query function supposed to be substituted at funcsWithQuery().
// it is present here only for validation purposes, when there is no
// provided datasource.
//
// return non-empty slice to pass validation with chained functions in template
// see issue #989 for details
return []metric{{}}, nil
},
// first returns the first by order element from the given metrics list.
// usually used alongside with `query` template function.
"first": func(metrics []metric) (metric, error) {
if len(metrics) > 0 {
return metrics[0], nil
}
return metric{}, errors.New("first() called on vector with no elements")
},
// label returns the value of the given label name for the given metric.
// usually used alongside with `query` template function.
"label": func(label string, m metric) string {
return m.Labels[label]
},
// sortByLabel sorts the given metrics by provided label key
"sortByLabel": func(label string, metrics []metric) []metric {
sort.SliceStable(metrics, func(i, j int) bool {
return metrics[i].Labels[label] < metrics[j].Labels[label]
})
return metrics
},
// value returns the value of the given metric.
// usually used alongside with `query` template function.
"value": func(m metric) float64 {
return m.Value
},
/* Helpers */
// Converts a list of objects to a map with keys arg0, arg1 etc.
// This is intended to allow multiple arguments to be passed to templates.
"args": func(args ...interface{}) map[string]interface{} {
result := make(map[string]interface{})
for i, a := range args {
result[fmt.Sprintf("arg%d", i)] = a
}
return result
},
// safeHtml marks string as HTML not requiring auto-escaping.
"safeHtml": func(text string) htmlTpl.HTML {
return htmlTpl.HTML(text)
},
}
}
func funcsWithQuery(query QueryFn) textTpl.FuncMap {
fm := make(textTpl.FuncMap)
for k, fn := range tmplFunc {
fm[k] = fn
}
fm["query"] = func(q string) ([]metric, error) {
result, err := query(q)
if err != nil {
return nil, err
}
return datasourceMetricsToTemplateMetrics(result), nil
}
return fm
}
// Time is the number of milliseconds since the epoch
// (1970-01-01 00:00 UTC) excluding leap seconds.
type Time int64
// TimeFromUnixNano returns the Time equivalent to the Unix Time
// t provided in nanoseconds.
func TimeFromUnixNano(t int64) Time {
return Time(t / nanosPerTick)
}
// The number of nanoseconds per minimum tick.
const nanosPerTick = int64(minimumTick / time.Nanosecond)
// MinimumTick is the minimum supported time resolution. This has to be
// at least time.Second in order for the code below to work.
const minimumTick = time.Millisecond
// second is the Time duration equivalent to one second.
const second = int64(time.Second / minimumTick)
// Time returns the time.Time representation of t.
func (t Time) Time() time.Time {
return time.Unix(int64(t)/second, (int64(t)%second)*nanosPerTick)
}

View File

@@ -0,0 +1,12 @@
dns_sd_configs:
- names:
- cloudflare.com
type: 'A'
port: 9093
relabel_configs:
- source_labels: [__meta_dns_name]
replacement: '${1}'
target_label: dns_name
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"

View File

@@ -11,8 +11,18 @@ consul_sd_configs:
- server: localhost:8500
services:
- consul
dns_sd_configs:
- names:
- cloudflare.com
type: 'A'
port: 9093
relabel_configs:
- source_labels: [__meta_consul_tags]
regex: .*,__scheme__=([^,]+),.*
replacement: '${1}'
target_label: __scheme__
target_label: __scheme__
- source_labels: [__meta_dns_name]
replacement: '${1}'
target_label: dns_name

View File

@@ -1,7 +1,21 @@
headers:
- 'CustomHeader: foo'
static_configs:
- targets:
- localhost:9093
- localhost:9095
basic_auth:
username: foo
password: bar
- targets:
- localhost:9096
- localhost:9097
basic_auth:
username: foo
password: baz
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"
replacement: "aaa"

Some files were not shown because too many files have changed in this diff Show More