Compare commits

...

279 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
fcc8b14f86 deployment/docker: upgrade base Docker image from Alpine 3.19.0 to 3.19.1
See https://www.alpinelinux.org/posts/Alpine-3.19.1-released.html
2024-01-30 22:47:18 +02:00
Aliaksandr Valialkin
26488726a8 docs/CHANGELOG.md: cut v1.97.0 2024-01-30 22:45:04 +02:00
Dan Dascalescu
a090de492c docs: t...over_time functions return fractional seconds (#5715)
* docs: t...over_time functions return fractional seconds

* Apply suggestions from code review

---------

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-30 20:18:53 +00:00
Roman Khavronenko
6939c53e48 app/vmselect: set proper timestamp for cached instant responses (#5723)
* app/vmselect: set proper timestamp for cached instant responses

The change updates `getSumInstantValues` to prefer timestamp
from the most recent results. Before, timestamp from cached series
was used.

The old behavior had negative impact on recording rules as they
were getting responses with shifted timestamps in past.
Subsequent recording or alerting rules fetching results of these
recording rules could get no result due to staleness interval.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5659
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-30 20:03:34 +00:00
Aliaksandr Valialkin
c12bdd6c28 app/vmselect/vmui: run make vmui-update after 81b5db04f6 2024-01-30 21:13:01 +02:00
Yury Molodov
81b5db04f6 vmui: add the ability to expand all tracing entries (#5677) (#5726) 2024-01-30 19:10:10 +00:00
Github Actions
300d701df0 Automatic update Grafana datasource docs from VictoriaMetrics/grafana-datasource@40e4e15 (#5729) 2024-01-30 19:07:31 +00:00
Aliaksandr Valialkin
f768d5d797 docs/CHANGELOG.md: document the enhancement, which reduces initial memory usage when vmagent scrapes targets with large responses
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5567
2024-01-30 20:51:13 +02:00
Aliaksandr Valialkin
17f8ed8948 docs/CHANGELOG.md: refer to the related pull request for the bugfix for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945 2024-01-30 20:21:44 +02:00
Aliaksandr Valialkin
ea2752ce62 docs/CHANGELOG.md: document the bugfix addressed by the commit bc7cf4950b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945
2024-01-30 20:16:22 +02:00
Aliaksandr Valialkin
32e60fe09d vendor: run make vendor-update 2024-01-30 18:47:01 +02:00
Aliaksandr Valialkin
adf585f7ed app/vmselect/vmui: run make vmui-update after 6e8995cfb92fb5a87fc6ad78609bf9ea5e0e712f 2024-01-30 18:45:57 +02:00
Aliaksandr Valialkin
bc7cf4950b lib/promscrape: use the standard net/http.Client instead of fasthttp.Client for scraping targets in non-streaming mode
While fasthttp.Client uses less CPU and RAM when scraping targets with small responses (up to 10K metrics),
it doesn't work well when scraping targets with big responses such as kube-state-metrics.
In this case it could use big amounts of additional memory comparing to net/http.Client,
since fasthttp.Client reads the full response in memory and then tries re-using the large buffer
for further scrapes.

Additionally, fasthttp.Client-based scraping had various issues with proxying, redirects
and scrape timeouts like the following ones:

- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5425
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2794
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017

This should help reducing memory usage for the case when target returns big response
and this response is scraped by fasthttp.Client at first before switching to stream parsing mode
for subsequent scrapes. Now the switch to stream parsing mode is performed on the first scrape
after reading the response body in memory and noticing that its size exceeds the value passed
to -promscrape.minResponseSizeForStreamParse command-line flag.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5567

Overrides https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4931
2024-01-30 18:39:10 +02:00
Artem Navoiev
a20c289228 docs: add alias for keyconcepts
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-30 17:05:58 +01:00
Aliaksandr Valialkin
c2373a8109 lib/promscrape: fix BenchmarkScrapeWorkScrapeInternal, which has been broken by the commit 65bc460323 2024-01-30 16:06:06 +02:00
Yury Molodov
7007c6a760 vmui: fix Enter key in query field (#5667) (#5717) 2024-01-30 14:36:19 +01:00
Aliaksandr Valialkin
583b6fe1e7 app/vmagent/remotewrite: limit the concurrency for marshaling time series before sending them to remote storage
There is no sense in running more than GOMAXPROCS concurrent marshalers,
since they are CPU-bound. More concurrent marshalers do not increase the marshaling bandwidth,
but they may result in more RAM usage.
2024-01-30 12:18:19 +02:00
Aliaksandr Valialkin
431aa16c8d lib/storage: keep (date, metricID) entries only for the last two dates
Entries for the previous dates is usually not used, so there is little sense in keeping them in memory.

This should reduce the size of storage/date_metricID cache, which can be monitored
via vm_cache_entries{type="storage/date_metricID"} metric.
2024-01-29 18:43:59 +01:00
Aliaksandr Valialkin
e7844f2efd docs/keyConcepts.md: clarify the information about which data is returned by instant and range queries
Do not use `raw samples` term there, since it adds more confusion than clarity:
the `raw samples` refers to real samples stored in the database, while neither range nor instant queries
do not return raw samples - they both return *calculated* samples at *the given* timestamps.

This is a follow-up for b5978ed8f9

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5710
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5708
2024-01-29 18:19:46 +01:00
Fred Navruzov
b2434ec340 - fix link/version of helm chart in update request (#5716) 2024-01-29 18:55:07 +02:00
Aliaksandr Valialkin
5d66ee88bd lib/storage: do not check the limit for -search.maxUniqueTimeseries when performing /api/v1/labels and /api/v1/label/.../values requests
This limit has little sense for these APIs, since:

- Thses APIs frequently result in scanning of all the time series on the given time range.
  For example, if extra_filters={datacenter="some_dc"} .

- Users expect these APIs shouldn't hit the -search.maxUniqueTimeseries limit,
  which is intended for limiting resource usage at /api/v1/query and /api/v1/query_range requests.

Also limit the concurrency for /api/v1/labels, /api/v1/label/.../values
and /api/v1/series requests in order to limit the maximum memory usage and CPU usage for these API.
This limit shouldn't affect typical use cases for these APIs:

- Grafana dashboard load when dashboard labels should be loaded
- Auto-suggestion list load when editing the query in Grafana or vmui

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055
2024-01-29 16:45:12 +01:00
Artem Navoiev
b9b18b5fd8 docs: add backward compaitble redicrt for url examples page
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-29 16:01:32 +01:00
Fred Navruzov
b4aef0c141 - update versions to 1.9.2 (#5714)
- update guide asset urls to flat
2024-01-29 15:47:27 +02:00
hagen1778
b5978ed8f9 docs: specify results of Instant and Range queries
Mention explicitly what are value and timestamp field in returned
results from Instant and Range queries.

Updates
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5710
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5708

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-29 14:00:14 +01:00
Roman Khavronenko
24eb1ad0c8 vmalert: set ActiveAt to evaluation timestamp in newAlert fn (#5657)
The change fixes flaky test `TestAlertingRule_Exec` which has dependency on the actual timestamps,
which resulted into inaccurate test states:
https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/7608452967/job/20717699688

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-29 12:02:02 +01:00
hagen1778
98b805544e lib/streamaggr: fix incorrect err message for min interval value
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-29 09:53:05 +01:00
hagen1778
c23e8bee89 dashboards: specify where to see details about dropped labels
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-29 07:37:51 +01:00
Fred Navruzov
9b555a0034 update guide and changelog to 1.9.1 (#5706) 2024-01-28 09:43:28 +02:00
hagen1778
6c6c2c185f docs: follow-up after 491287ed15
* port un-synced changed from docs/readme to readme
* consistently use `sh` instead of `console` highlight, as it looks like
a more appropriate syntax highlight
* consistently use `sh` instead of `bash`, as it is shorter
* consistently use `yaml` instead of `yml`

See syntax codes here https://gohugo.io/content-management/syntax-highlighting/

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-27 19:29:11 +01:00
hagen1778
c20d68e28d docs: follow-up after 491287ed15
491287ed15
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-27 19:11:38 +01:00
Artem Navoiev
491287ed15 docs: remove witdh from images, remove <p>, remove <div> (#5705)
* docs: remove witdh from images, remove <p>, remove <div>

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* docs: remove <div> clarify language in code blocks

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

---------

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-27 10:08:07 -08:00
Daria Karavaieva
4a9f8f4cb0 version 1.9.1 update, dashboard viz flag (#5704) 2024-01-27 14:16:02 +01:00
Aliaksandr Valialkin
0ed291102d lib/decimal: follow-up for e6bad5174f
- Add a benchmark for CalbirateAndScale.
- Reduce the decimal multipliers table size from 256Kb to 192bytes.
- Use more clear naming for variables.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5672
2024-01-27 00:08:57 +01:00
Fuchun Zhang
64780f4f02 Optimize the performance of data merge: decimal.CalibrateScale() (#5672)
* Optimize the performance of data merge: decimal.CalibrateScale() from 49633 ns/op to 9146 ns/op

* Optimize the performance of data merge: decimal.CalibrateScale()
2024-01-27 00:08:56 +01:00
Aliaksandr Valialkin
1a6c3370bf vendor: run make vendor-update 2024-01-26 22:56:37 +01:00
Aliaksandr Valialkin
b9dcaaa7f8 app/vmui: run make vmui-update after a7b11eff7c 2024-01-26 22:53:46 +01:00
Hui Wang
6ee1bfeb3c add inserting comma inside value instruction to flag description (#5666) 2024-01-26 22:46:49 +01:00
Roman Khavronenko
aaa526e8ff lib/streamaggr: skip unfinished aggregation state on shutdown by default (#5689)
Sending unfinished aggregate states tend to produce unexpected anomalies with lower values than expected.
The old behavior can be restored by specifying `flush_on_shutdown: true` setting in streaming aggregation config

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-26 22:45:23 +01:00
Roman Khavronenko
df59ac7f0e app/vmalert: fix data race during hot-config reload (#5698)
* app/vmalert: fix data race during hot-config reload

During hot-reload, the logic evokes the group update and rules evaluation
interruption simultaneously. Falsely assuming that interruption happens before
the update. However, it could happen that group will be updated first and only
after the rules evaluation will be cancelled. Which will result in permanent
interruption for all rules within the group.

The fix caches the cancel context function into local variable first. And only after
performs the group update. With cached cancel function we can safely call it without
worrying that we cancel the evaluation for already updated group.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Revert "app/vmalert: fix data race during hot-config reload"

This reverts commit a4bb7e8932.

* app/vmalert: fix data race during hot-config reload

During hot-reload, the logic evokes the group update and rules evaluation
interruption simultaneously. Falsely assuming that interruption happens before
the update. However, it could happen that group will be updated first and only
after the rules evaluation will be cancelled. Which will result in permanent
interruption for all rules within the group.

The fix cancels the evaulation context before applying the update, making sure
that the context will be cancelled for old group always.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-26 22:42:21 +01:00
Yury Molodov
a7b11eff7c vmui: fix Enter key in query field (#5667) (#5681) 2024-01-26 22:38:32 +01:00
Aliaksandr Valialkin
3bce55be0c docs: update -help output after bb7a419cc3 2024-01-26 22:28:40 +01:00
Aliaksandr Valialkin
bb7a419cc3 lib/{mergeset,storage}: make background merge more responsive and scalable
- Maintain a separate worker pool per each part type (in-memory, file, big and small).
  Previously a shared pool was used for merging all the part types.
  A single merge worker could merge parts with mixed types at once. For example,
  it could merge simultaneously an in-memory part plus a big file part.
  Such a merge could take hours for big file part. During the duration of this merge
  the in-memory part was pinned in memory and couldn't be persisted to disk
  under the configured -inmemoryDataFlushInterval .

  Another common issue, which could happen when parts with mixed types are merged,
  is uncontrolled growth of in-memory parts or small parts when all the merge workers
  were busy with merging big files. Such growth could lead to significant performance
  degradataion for queries, since every query needs to check ever growing list of parts.
  This could also slow down the registration of new time series, since VictoriaMetrics
  searches for the internal series_id in the indexdb for every new time series.

  The third issue is graceful shutdown duration, which could be very long when a background
  merge is running on in-memory parts plus big file parts. This merge couldn't be interrupted,
  since it merges in-memory parts.

  A separate pool of merge workers per every part type elegantly resolves both issues:
  - In-memory parts are merged to file-based parts in a timely manner, since the maximum
    size of in-memory parts is limited.
  - Long-running merges for big parts do not block merges for in-memory parts and small parts.
  - Graceful shutdown duration is now limited by the time needed for flushing in-memory parts to files.
    Merging for file parts is instantly canceled on graceful shutdown now.

- Deprecate -smallMergeConcurrency command-line flag, since the new background merge algorithm
  should automatically self-tune according to the number of available CPU cores.

- Deprecate -finalMergeDelay command-line flag, since it wasn't working correctly.
  It is better to run forced merge when needed - https://docs.victoriametrics.com/#forced-merge

- Tune the number of shards for pending rows and items before the data goes to in-memory parts
  and becomes visible for search. This improves the maximum data ingestion rate and the maximum rate
  for registration of new time series. This should reduce the duration of data ingestion slowdown
  in VictoriaMetrics cluster on e.g. re-routing events, when some of vmstorage nodes become temporarily
  unavailable.

- Prevent from possible "sync: WaitGroup misuse" panic on graceful shutdown.

This is a follow-up for fa566c68a6 .
Thanks @misutoth to for the inspiration at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2024-01-26 22:27:47 +01:00
Artem Navoiev
97937d58c4 docs: remove <p> for imanges (#5702)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-26 13:06:48 -08:00
Artem Navoiev
3e0a117ddf remove all <div> as far they obsolete and can break markdown (#5701)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-26 12:52:21 -08:00
Aliaksandr Valialkin
e84c877503 lib/mergeset: remove inmemoryBlock pooling, since it wasn't effecitve
This should reduce memory usage a bit when new time series are ingested at high rate (aka high churn rate)
2024-01-26 21:34:57 +01:00
Artem Navoiev
7de19c3748 docs: delete docs/provision_datasources.png as we support webp
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-26 21:25:39 +01:00
Github Actions
5a8daa725e Automatic update Grafana datasource docs from VictoriaMetrics/grafana-datasource@ef5cfe6 (#5700) 2024-01-26 12:22:17 -08:00
Aliaksandr Valialkin
2655c02d5e lib/logstorage: make sure that WaitGroup.Add isnt called after stopCh is closed and WaitGroup.Wait is called
This protects from rare panic, which may occur during graceful shutdown of VictoriaLogs
2024-01-26 21:17:02 +01:00
Aliaksandr Valialkin
8e03bc6b53 app/vmselect/promql: do not spend CPU time on verifying whether the rollup cache needs to be reset for the given metric rows when it has been already instructed to reset 2024-01-26 21:13:38 +01:00
Aliaksandr Valialkin
4ac7e3a355 docs/Makefile: mention that the Makefile rules must be run from VictoriaMetrics repository root 2024-01-26 21:01:40 +01:00
Aliaksandr Valialkin
32c064a401 app/vmauth: return 503 service unavailable status code when the backend returns response with unsupported status code, but the request cannot be re-tried.
While at it, properly close response body. This should prevent from possible http keep-alive connection leak to backends because of unclosed response bodies.

This is a follow-up for 3c0aa14b5b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5688
2024-01-26 20:43:11 +01:00
Artem Navoiev
60fc2da6c1 docs: fix key concepts image and links
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-26 20:30:29 +01:00
Artem Navoiev
25165656bb docs: change [image] to img as far we support it in release guide
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-26 19:01:20 +01:00
Artem Navoiev
41e99765cc docs: remoev vmanomaly as far we have dedicated section with alredy exists redirects
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-26 18:38:01 +01:00
Artem Navoiev
bc033a2b30 docs: vmanomaly fix images
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-26 17:59:37 +01:00
Daria Karavaieva
105cb44884 Vmanomaly Guide dashboard provisioning (#5679)
* dashboard provisioning

* delete dashboard filter, new query

* dashboard screens, guide fixes
2024-01-26 17:12:58 +01:00
Artem Navoiev
9ded04e643 docs: remove raw and endraw tags as they are not needed for the new v… (#5696)
* docs: remove raw and endraw tags as they are not needed for the new version of site

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* revert formating in vmaler

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

---------

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-26 07:30:45 -08:00
Github Actions
fae801edd3 Automatic update operator docs from VictoriaMetrics/operator@0628def (#5694) 2024-01-26 10:21:52 +01:00
Github Actions
2582b1e15a Automatic update Grafana datasource docs from VictoriaMetrics/grafana-datasource@c644bec (#5691) 2024-01-26 11:44:02 +04:00
Roman Khavronenko
b11f4ef5ea app/vmalert: autogenerate ALERTS_FOR_STATE time series for alerting rules with for: 0 (#5680)
* app/vmalert: autogenerate `ALERTS_FOR_STATE` time series for alerting rules with `for: 0`

 Previously, `ALERTS_FOR_STATE` was generated only for alerts with `for > 0`.
 This behavior differs from Prometheus behavior - it generates ALERTS_FOR_STATE
 time series for alerting rules with `for: 0` as well. Such time series can
 be useful for tracking the moment when alerting rule became active.

 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5648
 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3056

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmalert: support ALERTS_FOR_STATE in `replay` mode

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-25 15:42:57 +01:00
Github Actions
a95246d885 Automatic update operator docs from VictoriaMetrics/operator@e75a096 (#5690) 2024-01-25 15:33:01 +01:00
hagen1778
d71218d6ce docs: simplify instructions for spinning up docker env
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-25 15:27:57 +01:00
Github Actions
e29fe0933b Automatic update operator docs from VictoriaMetrics/operator@f6b9c08 (#5676) 2024-01-25 15:10:23 +01:00
Alexander Marshalov
3c0aa14b5b vmauth: fix vmauth_user_request_backend_errors_total metric calc logic for use case when only one backend is available - if we get an error from the retry_status_codes list, but cannot execute retry, we increment vmauth_user_request_backend_errors_total as well (#5688) 2024-01-25 14:04:20 +01:00
hagen1778
56310ffb47 docs: fix the issue link
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-25 13:35:41 +01:00
Alexander Marshalov
495fa9800a Follow up after https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5685 (#5686) 2024-01-25 10:03:46 +01:00
Alexander Marshalov
d5682858c0 Make a makefile work on a MacOS and Linux (#5685) 2024-01-25 09:58:01 +01:00
Aliaksandr Valialkin
c3a585cfe5 lib/storage: rename *AssistedMerges to *AssistedMergesCount in order to make these field names less misleading
These fields are counters, not gauges, so adding Count suffix to them makes easier to understand this while reading the code
2024-01-25 10:19:32 +02:00
Alexander Marshalov
806c07ddd5 vmsingle/vmselect returns http status 429 (TooManyRequests) instead of 503 (ServiceUnavailable) when max concurrent requests limit is reached. (#5682) 2024-01-24 17:55:06 +01:00
Aliaksandr Valialkin
18df07e824 lib/mergeset: start assisted merge for file parts only if the number of file parts is bigger than maxFileParts
The maxFileParts usage has been accidentally removed in fa566c68a6

While at it, add Count suffix to *AssistedMerges counter names in order to make them less misleading.
Previously their names were falsely suggesting that these are gauges, which show the number of concurrently
executed assisted merges.
2024-01-24 15:08:42 +02:00
Aliaksandr Valialkin
ac5b740750 lib/promscrape/discovery/kubernetes: typo fix in the comment for ContainerStateTerminated struct
This is a follow-up for ef12598ad4
2024-01-24 15:06:46 +02:00
Aliaksandr Valialkin
ef12598ad4 lib/promscrape/discovery/kubernetes: do not generate targets for already terminated pods and containers
Already terminated pods and containers cannot be scraped and will never resurrect,
so there is zero sense in creating scrape targets for them.
2024-01-24 14:57:53 +02:00
Aliaksandr Valialkin
4d961c70f7 app/{vmselect,vmstorage}: return compression of the data passed from vmstorage to vmselect
This reverts cd4f641d32 , since it has been appeared that the disabled compression
for vmstorage->vmselect data increase network bandwidth usage by more than 10x on typical production workloads,
while it decreases CPU usage at vmstorage by up to 10% and improves query latency by up to 10%.

The 10x increase in network usage is too high price for 10% improvements on query latency and vmstorage CPU usage.
This may result in network bandwidth bottlenecks, which can reduce the overall performance and stability
of VictoriaMetrics cluster. That's why return back the vmstorage->vmselect data compression by default.

The vmstorage->vmselect compression can be disabled by passing -rpc.disableCompression command-line flag to vmstorage.
The vmselect->vmselect compression in multi-level cluster setup can be disabled by passing -clusternative.disableCompression command-line flag.
2024-01-24 13:39:28 +02:00
Aliaksandr Valialkin
f888a019fe lib/streamaggr: expand %{ENV} placeholders in stream aggregation configs 2024-01-24 12:31:27 +02:00
Aliaksandr Valialkin
fa566c68a6 lib/mergeset: really limit the number of in-memory parts to 15
It has been appeared that the registration of new time series slows down linearly
with the number of indexdb parts, since VictoriaMetrics needs to check every indexdb part
when it searches for TSID by newly ingested metric name.

The number of in-memory parts grows when new time series are registered
at high rate. The number of in-memory parts grows faster on systems with big number
of CPU cores, because the mergeset maintains per-CPU buffers with newly added entries
for the indexdb, and every such entry is transformed eventually into a separate in-memory part.

The solution has been suggested in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212
by @misutoth - to limit the number of in-memory parts with buffered channel.
This solution is implemented in this commit. Additionally, this commit merges per-CPU parts
into a single part before adding it to the list of in-memory parts. This reduces CPU load
when searching for TSID by newly ingested metric name.

The https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212 recommends setting the limit on the number
of in-memory parts to 100, but my internal testing shows that much lower limit 15 works with the same efficiency
on a system with 16 CPU cores while reducing memory usage for `indexdb/dataBlocks` cache by up to 50%.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190
2024-01-24 03:38:12 +02:00
Aliaksandr Valialkin
5543c04061 docs/Cluster-VictoriaMetrics.md: document that vmstorage doesnt compress data it sends to vmselect by default
This is a follow-up for cd4f641d32
2024-01-23 23:22:18 +02:00
Aliaksandr Valialkin
8fb8b71295 lib/encoding: remove uneeded re-slicing of byte slice before passing it to binary.BigEndian.Uint* 2024-01-23 22:50:29 +02:00
Aliaksandr Valialkin
1c58c00618 app/vmselect/netstorage: limit the initial size for brsPoolCap with 32Kb
This should reduce the number of expensive memory allocations with sizes bigger than 32Kb
2024-01-23 22:29:39 +02:00
Aliaksandr Valialkin
43ecd5d258 app/vmselect/netstorage: pre-allocate memory for metricNamesBuf
This should reduce the number of metricNamesBuf re-allocations in append()
2024-01-23 21:34:16 +02:00
Aliaksandr Valialkin
ae643ef1f1 lib/{storage,mergeset}: reduce the maxium compression level for the stored data
This reduces CPU usage a bit, while doesn't increase resulting file sizes according to synthetic tests.
2024-01-23 17:46:50 +02:00
Github Actions
05c9a4d7ce Automatic update operator docs from VictoriaMetrics/operator@1470569 (#5668) 2024-01-23 16:22:16 +01:00
Aliaksandr Valialkin
6c214397ed lib/storage: compress metricIDs, which match the given filters, before storing them in tagFiltersToMetricIDsCache
This allows reducing the indexdb/tagFiltersToMetricIDs cache size by 8 on average.
The cache size can be checked via vm_cache_size_bytes{type="indexdb/tagFiltersToMetricIDs"} metric exposed at /metrics page.
2024-01-23 16:09:55 +02:00
Aliaksandr Valialkin
4d78954158 lib/storage: do not sort metricIDs passed to Storage.prefetchMetricNames, since the caller is responsible for the sorting 2024-01-23 16:08:38 +02:00
Aliaksandr Valialkin
6d84b1beef lib/filestream: do not measure read / write duration from / to in-memory buffers
Measuring read / write duration from / to in-memory buffers has little sense,
since it will be always fast. It is better to measure read / write duration from / to
real files at vm_filestream_write_duration_seconds_total and vm_filestream_read_duration_seconds_total metrics.
This also reduces overhead on time.Now() and Histogram.UpdateDuration() calls
per each filestream.Reader.Read() and filestream.Writer.Write() call when the data is read / written from / to in-memory buffers.

This is a follow-up for 2f63dec2e3
2024-01-23 14:52:22 +02:00
Aliaksandr Valialkin
41456d9569 app/vmselect/netstorage: limit the maximum brsPool size to 32Kb at ProcessSearchQuery()
This avoids slow path in Go runtime for allocating objects bigger than 32Kb -
see 704401ffa0/src/runtime/malloc.go (L11)

This also reduces memory usage a bit for vmselect and single-node VictoriaMetrics
after the commit 5dd37ad836 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527
2024-01-23 14:04:49 +02:00
Aliaksandr Valialkin
1f1768d7af app/vmselect/netstorage: limit the size of metricNamesBuf to 32Kb in order to avoid slow path at Go runtime for allocating a byte slice of bigger size
See 704401ffa0/src/runtime/malloc.go (L11)

This also reduces the average memory usage a bit for vmselect and single-node VictoriaMetrics
after the commit 508c608062

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527
2024-01-23 13:46:37 +02:00
Aliaksandr Valialkin
fac7c30f4e docs/vmagent.md: clarify how -promscrape.seriesLimitPerTarget command-line flag, series_limit config option and __series_limit__ label interact with each other
This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5663
See also 89e3c70ccd
2024-01-23 13:14:50 +02:00
Roman Khavronenko
89e3c70ccd lib/promscrape: respect 0 value for series_limit param (#5663)
* lib/promscrape: respect `0` value for `series_limit` param

Respect `0` value for `series_limit` param in `scrape_config`
even if global limit was set via `-promscrape.seriesLimitPerTarget`.
Previously, `0` value will be ignored in favor of `-promscrape.seriesLimitPerTarget`.

This behavior aligns with possibility to override `series_limit` value via
relabeling with `__series_limit__` label.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update docs/CHANGELOG.md

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-23 13:09:14 +02:00
Aliaksandr Valialkin
1c5163ae51 lib/mergeset: make sure that the first and the last items are in the original range after prepareBlock()
Previously the checks were to strict by requiring to leave the same first and last items by prepareBlock()

Thanks to @ahfuzhang for the suggestion at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5655
2024-01-23 12:58:32 +02:00
Fred Navruzov
2adb38a9c4 - fix 404 errors after page remaning (#5664)
- slight text fixes
2024-01-23 01:56:42 -08:00
Aliaksandr Valialkin
15a15e5b99 app/vmselect/vmui: run make vmui-update in order to sync recent changes in app/vmui 2024-01-23 04:31:44 +02:00
Aliaksandr Valialkin
114822d585 app/{vmstorage,vmselect}: disable vmstorage->vmselect RPC compression by default in order to improve query performance 2024-01-23 04:24:57 +02:00
Zakhar Bessarab
bf4742526d lib/storage: print tenant ID in log when discarding or truncating labels (#5658)
Previously, it was not possible to determine which tenant sends metrics with excessive amount of labels of label values.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-23 04:24:56 +02:00
Yury Molodov
38231d5994 vmui: query report (#5497)
* vmui: add query analyzer page

* vmui: fix tabs for query analyzer

* vmui: add help to export query

* vmui: add time params to query analyzer

* docs/vmui: add query analyzer

* vmui: fix validation JSON form

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-23 04:23:26 +02:00
Yury Molodov
eb6def0695 vmui: add flag for default timezone setting (#5611)
* vmui: add flag for default timezone setting #5375

* vmui: validate timezone before client return

* Update app/vmselect/vmui.go

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-23 04:11:19 +02:00
Yury Molodov
633e6b48ad vmui: fix cache autocomplete (#5591)
* vmui: fix the logic of closing the popper #5470

* vmui: fix the logic of caching autocomplete results #5472

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-23 04:06:14 +02:00
Aliaksandr Valialkin
980338861f lib/mergeset: skip comparison for every item in the block during merge if the last item in the block is smaller than the first item in the next block
Thanks to @ahfuzhang for the suggestion at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5651
2024-01-23 03:15:52 +02:00
Aliaksandr Valialkin
bc7d19c8ca app/vmselect/promql: remove superflouos memory allocations at aggrPrepareSeries()
While at it, also remove unneeded map lookup
2024-01-23 02:28:31 +02:00
Aliaksandr Valialkin
9240bc36a3 app/vmselect/promql/aggr_incremental.go: eliminate unnecessary memory allocation in incrementalAggrFuncContext.updateTimeseries 2024-01-23 02:28:30 +02:00
Aliaksandr Valialkin
e0399ec29a app/vmselect/netstorage: remove tswPool, since it isnt efficient 2024-01-23 02:28:30 +02:00
Aliaksandr Valialkin
72a838a2a1 app/vmselect/netstorage: avoid metricName->blockRef lookup when processing multiple blocks for the same time series
This saves a few CPU cycles for common case
2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin
5dd37ad836 app/vmselect/netstorage: use []blockRef from blockRefPool in order to reduce memory allocations 2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin
7345567c29 app/vmselect/netstorage: substitute pointer to blockRefs by brssPool index at the metricName->blockRefs map
This should reduce the pressure on Go GC, since it will see lower number of pointers.

This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527
2024-01-23 02:28:29 +02:00
Aliaksandr Valialkin
678234e9f0 app/vmselect/netstorage: reduce the number of allocations for blockRefs objects in ProcessSearchQuery()
This should reduce pressure on Go GC at vmselect

The change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527
2024-01-23 02:28:28 +02:00
Aliaksandr Valialkin
508c608062 app/vmselect/netstorage: reduce the number of memory allocations in ProcessSearchQuery() by storing all the metric names in a single byte slice
This reduces the number of memory allocations at the cost of possible memory usage increase,
since now different metric name strings may hold references to the previous byte slice.
This is good tradeoff, since ProcessSearchQuery is called in vmselect, and vmselect isn't usually limited by memory.

This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527
2024-01-23 02:28:28 +02:00
Daria Karavaieva
ffaf48b99e add 1.8.0 notes to changelog (#5616)
* add 1.8.0 notes to changelog

* added release date

* MAD internal link

* monitoring health deprecation
2024-01-22 23:51:12 +01:00
Jaskeerat Singh Randhawa
b606521745 custom-resources: fix link text for alertmanager (#5660) 2024-01-22 18:06:40 +01:00
Aliaksandr Valialkin
3449d563bd all: add up to 10% random jitter to the interval between periodic tasks performed by various components
This should smooth CPU and RAM usage spikes related to these periodic tasks,
by reducing the probability that multiple concurrent periodic tasks are performed at the same time.
2024-01-22 18:40:32 +02:00
Aliaksandr Valialkin
9b4294e53e lib/storage: reduce the contention on dateMetricIDCache mutex when new time series are registered at high rate
The dateMetricIDCache puts recently registered (date, metricID) entries into mutable cache protected by the mutex.
The dateMetricIDCache.Has() checks for the entry in the mutable cache when it isn't found in the immutable cache.
Access to the mutable cache is protected by the mutex. This means this access is slow on systems with many CPU cores.
The mutabe cache was merged into immutable cache every 10 seconds in order to avoid slow access to mutable cache.
This means that ingestion of new time series to VictoriaMetrics could result in significant slowdown for up to 10 seconds
because of bottleneck at the mutex.

Fix this by merging the mutable cache into immutable cache after len(cacheItems) / 2
cache hits under the mutex, e.g. when the entry is found in the mutable cache.
This should automatically adjust intervals between merges depending on the addition rate
for new time series (aka churn rate):

- The interval will be much smaller than 10 seconds under high churn rate.
  This should reduce the mutex contention for mutable cache.
- The interval will be bigger than 10 seconds under low churn rate.
  This should reduce the uneeded work on merging of mutable cache into immutable cache.
2024-01-22 18:40:32 +02:00
hagen1778
8b8d0e3677 deployment/docker: fix typo in commands example
Follow up after 38b2a5bc44

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-22 16:56:27 +01:00
hagen1778
b25ef138ce dashboards: reflect dashboard rename in copy script
This is a follow-up for ff33e60a3d

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-22 16:51:24 +01:00
hagen1778
0e5e502b3c deployment/docker: follow-up 38b2a5bc44
* Simplify folder structure
* mention datasource in README

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-22 16:05:44 +01:00
Dmytro Kozlov
38b2a5bc44 deployment/docker: add grafana datasource to the docker-compose files (#5363)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3920
https://github.com/VictoriaMetrics/grafana-datasource/issues/113
2024-01-22 15:45:31 +01:00
hagen1778
1075fcfc8c app/vmctl/backoff: fix flaky test
The change removes artificial delay before returning error, which sometimes
caused less retry events than expected.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-22 12:21:14 +01:00
hagen1778
da556cc329 docs: fix Grafana link example for vmalert
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-22 09:35:18 +01:00
dependabot[bot]
df197723ae build(deps): bump github/codeql-action from 2 to 3 (#5462)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2 to 3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-22 01:49:17 +02:00
Aliaksandr Valialkin
d3ee3e0ef5 Revert "lib/promscrape: do not store last scrape response when stale markers … (#5577)"
This reverts commit cfec258803.

Reason for revert: the original code already doesn't store the last scrape response when stale markers are disabled.
The scrapeWork.areIdenticalSeries() function always returns true is stale markers are disabled.
This prevents from storing the last response at scrapeWork.processScrapedData().

It looks like the reverted commit could also return back the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3660

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5577
2024-01-22 00:43:48 +02:00
Aliaksandr Valialkin
9c0863babc docs: use persistent links to Grafana dashboards
These links do not depend on the dashboard name, so they do not break after the renaming of the dashboard.

This is a follow-up for ff33e60a3d
2024-01-22 00:17:17 +02:00
Aliaksandr Valialkin
1c7f990fad app/vmselect: handle negative time range start in a generic manner inside NewSearchQuery()
This is a follow-up for cf03e11d89

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5630
2024-01-21 23:45:31 +02:00
Artem Navoiev
3f7ed7e6b2 docs vmanomaly fix anchor
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-21 22:21:37 +01:00
Hui Wang
4e3242b02d lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557)
* lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice

Previously the groupWatcher could be mistakenly stopped when requests for pod or services resources take too long.

* remove mislead comment

* docs/sd_configs.md: mention -promscrape.kubernetes.attachNodeMetadataAll flag in the description for attach_metadata section

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640

* wip

* lib/promscrape/kubernetes: prevent from stopping groupWatcher when there are in-flight apiWatcher.mustStart() calls

groupWatcher is stopped if it has zero registered apiWatchers during 14 seconds.
But such a groupWatcher can be still in use if apiWatcher for `role: endpoints` or `role: endpointslice`
is being registered and the discovery of the associated `pod` and/or `service` objects takes longer
than 14 seconds - see the beginning of groupWatcher.startWatchersForRole() function for details.

Track the number of in-flight calls to apiWatcher.mustStart() and prevent from stopping the associated groupWatcher
if the number of in-flight calls is non-zero.

P.S. postponing the discovery of `pod` and/or `service` objects associated with `endpoints` or `endpointslice` roles
isn't the best solution, since it slows down initial discovery of `endpoints` and `endpointslice` targets.

* typo fix

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-21 23:13:15 +02:00
Aliaksandr Valialkin
1f105dde98 all: allow dynamically reading *AuthKey flag values from files and urls
Examples:

1) -metricsAuthKey=file:///abs/path/to/file - reads flag value from the given absolute filepath
2) -metricsAuthKey=file://./relative/path/to/file - reads flag value from the given relative filepath
3) -metricsAuthKey=http://some-host/some/path?query_arg=abc - reads flag value from the given url

The flag value is automatically updated when the file contents changes.
2024-01-21 22:03:38 +02:00
Fred Navruzov
7e68722686 - fix 404 links after renaming (#5653)
- improve wording on diagram
- swap enterprise/about chapters for page clarity
2024-01-21 21:24:29 +02:00
Fred Navruzov
0038102b98 docs: vmanomaly - add component interaction diagram (#5652)
* add interaction diagram for vmanomaly components
* small docs fixes
* resolve suggestions
2024-01-21 19:33:59 +02:00
Aliaksandr Valialkin
0b2ea1a7c7 all: call atomic.Load* in front of atomic.CompareAndSwap* at places where the atomic.CompareAndSwap* returns false most of the time
This allows avoiding slow inter-CPU synchornization induced by atomic.CompareAndSwap*
2024-01-21 14:04:54 +02:00
Aliaksandr Valialkin
3d83f3347d docs/sd_configs.md: mention -promscrape.kubernetes.attachNodeMetadataAll flag in the description for attach_metadata section
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640
2024-01-21 13:26:03 +02:00
Aliaksandr Valialkin
4eb9926125 lib/promscrape: code cleanup: send stale markers immediately after generating automatic metrics
This cleanup has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5557/files#diff-6b205cf6637d7b65a5c45d9417d08822d4efad94227268cb196f61aa2a0fc0f7
2024-01-21 05:18:22 +02:00
Aliaksandr Valialkin
12f2c5679b all: consistently clear prompbmarshal.Label by assigning an empty struct instead of zeroing Name and Value individually 2024-01-21 05:11:05 +02:00
Aliaksandr Valialkin
90768aa418 lib/storage/partition.go: remove misleading comment, which falsely states that inmemoryParts isn't visible to search
Thanks to @satjd for raising attention to this comment at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5410
2024-01-21 04:49:35 +02:00
Nikolay
b3598ba2c1 app/vmauth: adds metric_labels and backend_errors counter (#5585)
* app/vmauth: adds metric_labels and backend_errors counter
it must improve observability for user requests with new metric - per user backend errors counter.
it's needed to calculate requests fail rate to the configured backends.
metric_labels configuration allows to perform additional aggregations on top of multiple users from configuration section.
It could be multiple clients or clients with separate read/write tokens
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5565

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-21 04:40:52 +02:00
Yury Molodov
3ea1294ad2 vmui: add autofocus to input for desktop version #5479 (#5592) 2024-01-21 03:24:16 +02:00
Aliaksandr Valialkin
7fba73ce11 lib/promscrape/discovery/kubernetes: add -promscrape.kubernetes.attachNodeMetadataAll command-line flag
This flag allows setting attach_metadata.node=true for all the kubernetes_sd_configs defined at -promscrape.config

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640

Thanks to wasim-nihal for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5593
2024-01-21 03:13:56 +02:00
Hui Wang
fad212c39c app/vmselect/promql: properly handle possible negative results caused… (#5608)
* app/vmselect/promql: properly handle possible negative results caused by float operations precision error in rollup functions like rate() or increase()

* fix test
2024-01-21 02:53:29 +02:00
Nikolay
c9f39fd51f app/vmselect/netstorage (#5649)
* app/vmselect/netstorage

correctly handle errGlobal set

* wip

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5649

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-21 02:47:29 +02:00
Nikolay
8ab0ce3ded app/vmselect: abort streaming connections for vmselect (#5650)
* app/vmselect: abort streaming connections for vmselect
due to streaming nature of export APIs, curl and simmilr tools cannot
detect errors that happened after http.Header with status 200 was
written to it.

This PR tracks if body write was already started and closes connection.

It allows client to detect not expected chunk sequence and return error
to the caller.

Mostly it affects vmselect at cluster version

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5645

* wip

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5645
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5650

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-21 02:12:51 +02:00
Aliaksandr Valialkin
74448a7e57 lib/promscrape/discovery/hetzner: follow-up after 03a97dc678
- docs/sd_configs.md: moved hetzner_sd_configs docs to the correct place according to alphabetical order of SD names,
  document missing __meta_hetzner_role label.
- lib/promscrape/config.go: added missing MustStop() call for Hetzner SD,
  and moved the code to the correct place according to alphabetical order of SD names.
- lib/promscrape/discovery/hetzner: properly handle pagination for hloud API responses,
  populate missing __meta_hetzner_role label like Prometheus does.
- Properly populate __meta_hetzner_public_ipv6_network label like Prometheus does.
- Remove unused SDConfig.Token.
- Remove "omitempty" annotation from SDConfig.Role field, since this field is mandatory.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3154
2024-01-20 17:01:53 +02:00
Artem Navoiev
873483a782 vmanomly use proper title for overiview doc
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-19 19:59:17 +01:00
Hui Wang
cfec258803 lib/promscrape: do not store last scrape response when stale markers … (#5577)
* lib/promscrape: do not store last scrape response when stale markers are disabled

* update changelog
2024-01-20 00:53:41 +08:00
Artem Navoiev
6a2a8cd426 docs add docs titles
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-19 17:02:20 +01:00
Artem Navoiev
dfa43da1a2 vmanonamly docs fix sorting for jekill as far it doesnt support of sorting the folders
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-19 16:55:44 +01:00
Artem Navoiev
1af5faa4af vmanomaly remove unused pages from menu
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-19 16:50:12 +01:00
Artem Navoiev
5e17636994 vmanomly specify the right menu parent for overview page
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-19 16:43:18 +01:00
Artem Navoiev
c425ec3088 docs vmanmoly fix sorting
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-19 16:15:40 +01:00
Artem Navoiev
ec85d32e21 Move vmanomaly page to its own section (#5646)
* docs: move vmanomaly overview page to its section

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* add alias for backward compatibility

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* fix links

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* change title

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

---------

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-19 07:00:41 -08:00
Roman Khavronenko
7e374c227f app/vmui: send step param for instant queries (#5639)
The change reverts https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3896
 due to reasons explained in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3896#issuecomment-1896704401

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-19 08:48:16 +01:00
Fred Navruzov
69ae1d30bf docs: vmanomaly slight improvements (#5637)
* - better messaging
- update links to dockerhub in guides
- added anomaly_score to FAQ
- improve model section (sort + use cases)
- slight refactor of a guide

* rename guide & change refs

* change wording in installation options

* - update remaining text reference
- add cross-link to component sections in guide

* add docs/.jekyll-metadata to .gitignore
2024-01-18 02:37:36 -08:00
hagen1778
0a5ffb3bc1 docs: remove slug from Grafana dashboard URLs
Each Grafana dashboard has unique ID which can be used to fetch the dashboard
from grafana.com: https://grafana.com/grafana/dashboards/11176
The same dashboard can be accessed via URL with slug: https://grafana.com/grafana/dashboards/11176-victoriametrics-cluster/
But using slug implies that any change to dashboard name will break the link.
So it is better to just use ID, so the dashboard URL will never break.

This is follow-up for ff33e60a3d

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-18 11:19:53 +01:00
Artem Navoiev
f89d16fc4c docs: vmanomaly update vmanomaly + vmalert guide (#5636)
* docs: vmanomaly update vmanomaly + vmalert guide

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* docs: vmanomaly update vmanomaly + vmalert guide. Update docker compose and monitoring section

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* typos and fixes

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

---------

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-17 11:49:51 -08:00
Artem Navoiev
ff33e60a3d fix link for grafana dashbaord for single node after its renaming
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-17 16:00:33 +01:00
Artem Navoiev
dab160cd74 docs: changelog fix the link to cluster
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-17 15:44:56 +01:00
Artem Navoiev
a3b3ea4d73 vmanomaly docs fix broken relative links
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-17 15:41:03 +01:00
Artem Navoiev
9a353ee695 docs/anomaly-detection/components/models.md add sort:1
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-17 15:26:38 +01:00
Artem Navoiev
0c06934a59 vmanonaly docs add .html for the section document models
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-17 15:25:30 +01:00
Artem Navoiev
5b419cfb2b vmanomaly docs simplify the strcuture (#5634)
* vmanomaly docs simplify the strcuture

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* fix links

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

---------

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-17 06:20:27 -08:00
hagen1778
71681fd1ca docs/setup-size: rm tolerable churn rate %
It is likely this value was borrowed from panel `Slow inserts` panel
from Grafana dasbhoard for VM single/cluster installations. This is a mistake.
There is no such thing as "tolerable churn rate", as tolerancy depends on the amount
of allocated resources.

Although, it is unclear what is meant by 5%. If it refers to 5% of new time series per second,
then such churn rate is extremely high. It would mean that the avg life of a time series is 20s.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-17 14:44:28 +01:00
Roman Khavronenko
cf03e11d89 app/vmselect: properly calculate start param for queries with too big look-behind window (#5630)
Properly determine time range search for instant queries with too big look-behind window like `foo[100y]`.
 Previously, such queries could return empty responses even if `foo` is present in database.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-17 13:48:06 +01:00
Aliaksandr Valialkin
5bdf62de5b lib/storage: do not prefetch metric names for small number of metricIDs
This eliminates prefetchedMetricIDsLock lock contention for queries, which return less than 500 time series.

This is a follow-up for 9d886a2eb0
2024-01-17 13:48:15 +02:00
Aliaksandr Valialkin
3c3450fc53 docs/keyConcepts.md: typo fixes after b7ffee2644
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5555
2024-01-17 13:27:47 +02:00
Aliaksandr Valialkin
befcd93305 docs/keyConcepts.md: document /internal/force_flush handler
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5555
2024-01-17 13:23:32 +02:00
Aliaksandr Valialkin
cc6819869a docs/CHANGELOG*: move changes for 2023 year to docs/CHANGELOG_2023.md 2024-01-17 13:10:32 +02:00
hagen1778
8040bdc1d6 docs/troubleshooting: mention query latency in unexpected query results
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5555

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-17 11:59:47 +01:00
Roman Khavronenko
3a1ef3184d app/victoriametrics: update the test suite (#5627)
app/victoriametrics: update the test suite

* simplify /query and /query_range test cases configuration and tests
* support instant queries with lookbehind window like `query=foo[5m]`
* support instant queries selecting scalar value like `query=42`
* add query_range test for prometheus

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-17 09:51:10 +01:00
Aliaksandr Valialkin
f3c5687a04 LICENSE: update the current year from 2023 to 2024 2024-01-17 01:48:04 +02:00
Aliaksandr Valialkin
1683df11f0 docs/CHANGELOG.md: document v1.93.10 LTS release
See https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.93.10
2024-01-17 01:45:06 +02:00
Aliaksandr Valialkin
f2f0468ae7 docs/keyConcepts.md: clarify which values can be stored in VictoriaMetrics without precision loss
This is a follow-up for 43d7de4afe

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5485
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5503
2024-01-17 01:09:16 +02:00
Aliaksandr Valialkin
ecce2d6db1 docs/CHANGELOG.md: document v1.87.13 LTS release
See https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.87.13
2024-01-17 01:04:05 +02:00
Aliaksandr Valialkin
0f39c0e897 snap/local/Makefile: update Go builder from Go1.21.5 to Go1.21.6 2024-01-17 00:07:05 +02:00
Aliaksandr Valialkin
41932db848 lib/promscrape: cosmetic changes after 3ac44baebe
- Rename mustLoadScrapeConfigFiles() to loadScrapeConfigFiles(), since now it may return error.
- Split too long line with the error message into two lines in order to improve readability a bit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5508
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5560
2024-01-16 22:29:09 +02:00
Aliaksandr Valialkin
c830064c2f docs/{vmbackup,vmbackupmanager}.md: clarify why storing backups to S3 Glacier can be time-consuming and expensive
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5614
This is a follow-up for e14e3d9c8c
2024-01-16 21:50:43 +02:00
hagen1778
b0287867fe deployment/dashboards: change title VictoriaMetrics to VictoriaMetrics - single-node
The new title should provide better understanding of this dashboard purpose.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-16 20:39:52 +01:00
Aliaksandr Valialkin
a9fd130980 app/vmselect/graphite: properly handle -N index for the array of N items
This is a follow-up for 70cd09e736
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5581
2024-01-16 21:05:01 +02:00
Aliaksandr Valialkin
30d77393a5 docs/CHANGELOG.md: fix a link in the description of 70cd09e736
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5581
2024-01-16 20:54:44 +02:00
Aliaksandr Valialkin
4e4d7f4cbe app/vmctl: disallow insecure https connections to vm-native-dst-addr and vm-native-src-addr by default
It is better from security PoV to disallow insecure https connections
to vm-native-dst-addr and vm-native-src-addr . This also maintains backwards compatibility
with vmctl before the commit 828aca82e9

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5595
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5606
2024-01-16 20:20:33 +02:00
Aliaksandr Valialkin
19c04549a5 docs/Single-server-VictoriaMetrics.md: explain why staleness markers are treated as an ordinary values during de-duplication
This is a follow-up for d374595e31
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5587
2024-01-16 20:10:12 +02:00
Aliaksandr Valialkin
db6495560c docs/vmagent.md: mention the minumum value for -remoteWrite.vmProtoCompressLevel
Recommend using the default value for -remoteWrite.vmProtoCompressLevel

This is a follow-up for 095d982976
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5575
2024-01-16 19:46:13 +02:00
Aliaksandr Valialkin
4073bb3303 lib/httputils: handle step=undefined query arg as an empty value
This is needed for Grafana, which may send step=undefined
when working with alerting rules and instant queries.
2024-01-16 18:59:26 +02:00
Yury Molodov
d365157381 vmui/vmanomaly: add support models that produce only anomaly_score (#5594)
* vmui/vmanomaly: add support models that produce only `anomaly_score`

* vmui/vmanomaly: fix display legend

* vmui/vmanomaly: update comment on anomaly threshold
2024-01-16 18:50:19 +02:00
Artem Navoiev
846d5a3ab8 docs: vmanomaly fix font matter
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-16 16:58:48 +01:00
Artem Navoiev
481471b872 docs: vmanomaly remove commented text
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-16 16:43:59 +01:00
Aliaksandr Valialkin
a74f6d63e0 deployment/docker: update Go builder from Go1.21.5 to Go1.21.6 2024-01-16 17:00:16 +02:00
Aliaksandr Valialkin
6eae3f6c8a vendor: run make vendor-update 2024-01-16 16:57:30 +02:00
Aliaksandr Valialkin
9d886a2eb0 lib/storage: follow-up for 4b8088e377
- Clarify the bugfix description at docs/CHANGELOG.md
- Simplify the code by accessing prefetchedMetricIDs struct under the lock
  instead of using lockless access to immutable struct.
  This shouldn't worsen code scalability too much on busy systems with many CPU cores,
  since the code executed under the lock is quite small and fast.
  This allows removing cloning of prefetchedMetricIDs struct every time
  new metric names are pre-fetched. This should reduce load on Go GC,
  since the cloning of uin64set.Set struct allocates many new objects.
2024-01-16 15:29:57 +02:00
Aliaksandr Valialkin
b49b8fed3c app/vmselect/promql: simplify the code after 388d020b7c
Add a test, which verifies the correct sorting of float64 slices with NaNs.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5506
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509
2024-01-16 15:06:33 +02:00
Hui Wang
3ac44baebe exit vmagent if there is config syntax error in scrape_config_files when -promscrape.config.strictParse=true (#5560) 2024-01-16 17:30:02 +08:00
hagen1778
d0e4190969 deployment/alerts: add job label to DiskRunsOutOfSpace alerting rule
So it is easier to understand to which installation the triggered instance belongs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-16 09:49:39 +01:00
Aliaksandr Valialkin
388d020b7c app/vmselect/promql: follow-up for ce4f26db02
- Document the bugfix at docs/CHANGELOG.md
- Filter out NaN values before sorting as suggested at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509#discussion_r1447369218
- Revert unrelated changes in lib/filestream and lib/fs
- Use simpler test at app/vmselect/promql/exec_test.go

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5506
2024-01-16 02:55:08 +02:00
Zongyang
ce4f26db02 FIX bottomk doesn't return any data when there are no time range overlap between timeseries (#5509)
* FIX sort order in bottomk

* Add lessWithNaNsReversed for bottomk

* Add ut for TopK

* Move lt from loop

* FIX lint

* FIX lint

* FIX lint

* Mod log format

---------

Co-authored-by: xiaozongyang <xiaozngyang@kanyun.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-01-16 02:19:36 +02:00
Aliaksandr Valialkin
190a6565ae app/vmselect/promql: consistently sort results of a or b query
Previously the order of results returned from `a or b` query could change with each request
because the sorting for such query has been disabled in order to satisfy
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4763 .

This commit executes `a or b` query as `sortByMetricName(a) or sortByMetricName(b)`.
This makes the order of returned time series consistent across requests,
while maintaining the requirement from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4763 ,
e.g. `b` results are consistently put after `a` results.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5393
2024-01-16 01:30:10 +02:00
Aliaksandr Valialkin
7fc2bd0412 app/vmstorage: expose proper types for storage metrics when -metrics.exposeMetadata command-line flag is set
This is a follow-up for 326a77c697
2024-01-16 00:20:37 +02:00
Aliaksandr Valialkin
bfa73ebdf3 lib/prompbmarshal: move WriteRequest proto definition to the correct place 2024-01-16 00:20:37 +02:00
Artem Navoiev
51cdf3676b docs: vmanomaly fix image size
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 23:10:52 +01:00
Artem Navoiev
74219a1727 vmanomly: guide add diagramm
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 23:07:49 +01:00
Artem Navoiev
041a1966c5 docs: vmanomaly fix formatting and remove unnecessary elements
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 22:45:02 +01:00
Artem Navoiev
191e322879 vmanomaly - models a bit pretify docs (#5618)
* vmanomaly - models a bit pretify docs

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* typi

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

* fix formatting

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>

---------

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 13:26:38 -08:00
Artem Navoiev
544da241e8 vmanomaly guides - fix formating, add missing piece, clarify statement, use common languagefor VM ecosystem (#5620)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 13:26:26 -08:00
Artem Navoiev
0c78b891b0 clarify cluster multitenacy in vmanomaly docs (#5619)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 12:36:28 -08:00
Artem Navoiev
f51b7fda8e docs: fix markdown in how-to-monitor-k8s
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-15 18:16:26 +01:00
Aliaksandr Valialkin
4b42c8abbb lib/promscrape/discovery/hetzner: fix golangci-lint warnings after 03a97dc678
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550
2024-01-15 17:12:40 +02:00
Roman Khavronenko
e14e3d9c8c docs: mention requirements for latest backups (#5614)
Add docs to explain that `latest` backup folder can be accessed
more frequently than others. Hence, it is recommended to keep it
on storages with low access price.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-15 15:30:55 +01:00
Aliaksandr Valialkin
5106045048 app/vmstorage: deregister storage metrics before stopping the storage
This prevents from possible nil pointer dereference issues when the storage metrics
are read after the storage is stopped.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
2024-01-15 16:11:45 +02:00
Aliaksandr Valialkin
be509b3995 lib/pushmetrics: wait until the background goroutines, which push metrics, are stopped at pushmetrics.Stop()
Previously the was a race condition when the background goroutine still could try collecting metrics
from already stopped resources after returning from pushmetrics.Stop().
Now the pushmetrics.Stop() waits until the background goroutine is stopped before returning.

This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5549
and the commit fe2d9f6646 .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
2024-01-15 13:50:36 +02:00
Aliaksandr Valialkin
9fd20202e1 vendor/github.com/VictoriaMetrics/easyproto: update from v0.1.3 to v0.1.4
This fixes vet error for 32-bit architectures:

https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/7521709384/job/20472882877
2024-01-15 11:31:30 +02:00
Aleksandr Stepanov
03a97dc678 vmagent: added hetzner sd config (#5550)
* added hetzner robot and hetzner cloud sd configs

* remove gettoken fun and update docs

* Updated CHANGELOG and vmagent docs

* Updated CHANGELOG and vmagent docs

---------

Co-authored-by: Nikolay <nik@victoriametrics.com>
2024-01-15 10:13:22 +01:00
Roman Khavronenko
4b8088e377 lib/storage: properly check for storage/prefetchedMetricIDs cache expiration deadline (#5607)
Before, this cache was limited only by size.
Cache invalidation by time happens with jitter to prevent thundering herd problem.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-15 10:03:06 +01:00
rbizos
70cd09e736 Handling negative index in Graphite groupByNode/aliasByNode (#5581)
Handeling the error case with -1

Signed-off-by: Raphael Bizos <r.bizos@criteo.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2024-01-15 09:57:15 +01:00
Aliaksandr Valialkin
d2c94a0663 lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto 2024-01-14 23:04:45 +02:00
Aliaksandr Valialkin
a47127c1a6 app/vmalert/remotewrite: properly calculate vmalert_remotewrite_dropped_rows_total
It was calculating the number of dropped time series instead of the number of dropped samples.

While at it, drop vmalert_remotewrite_dropped_bytes_total metric, since it was inconsistently calculated -
at one place it was calculating raw protobuf-encoded sample sizes, while at another place it was calculating
the size of snappy-compressed prompbmarshal.WriteRequest protobuf message.
Additionally, this metric has zero practical sense, so just drop it in order to reduce the level of confusion.
2024-01-14 22:55:11 +02:00
Aliaksandr Valialkin
c005245741 lib/prompb: switch to github.com/VictoriaMetrics/easyproto 2024-01-14 22:46:06 +02:00
Aliaksandr Valialkin
f2229c2e42 lib/prompb: change type of Label.Name and Label.Value from []byte to string
This makes it more consistent with lib/prompbmarshal.Label
2024-01-14 22:33:21 +02:00
Aliaksandr Valialkin
f405384c8c lib/protoparser/datadogv2: simplify code for parsing protobuf messages after 0597718435
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451
2024-01-14 21:43:01 +02:00
Aliaksandr Valialkin
dd25049858 lib/protoparser/opentelemetry: use github.com/VictoriaMetrics/easyproto for protobuf message unmarshaling and marshaling
This reduces VictoriaMetrics binary size by 100KB.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2570
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2424
2024-01-14 21:19:03 +02:00
Aliaksandr Valialkin
0597718435 lib/protoparser/datadogv2: add support for reading protobuf-encoded requests at /api/v2/series endpoint
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094
2024-01-14 21:09:05 +02:00
Dmytro Kozlov
828aca82e9 app/vmctl: add insecure skip verify flags for source and destination addresses for native protocol (#5606)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5595
2024-01-11 14:04:32 +01:00
Zakhar Bessarab
e5a767cff8 docs: explicitly mention "delete_series" endpoint accepts any HTTP method (#5605)
See: #5552

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-01-11 15:39:11 +04:00
hagen1778
eae585e8de docs: make docs-sync
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-11 11:54:40 +01:00
Artem Navoiev
d374595e31 docs: mention staleNaN handling during deduplication
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5587

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-11 11:53:58 +01:00
hagen1778
91ccea236f app/all: follow-up after 84d710beab
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-09 13:34:54 +01:00
zhdd99
fe2d9f6646 lib/pushmetrics: fix a panic caused by pushing metrics during the graceful shutdown process of vmstorage nodes. (#5549)
Co-authored-by: zhangdongdong <zhangdongdong@kuaishou.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-09 13:24:34 +01:00
Dan Dascalescu
b79d4cc988 docs: fix English in keyConcepts.md, add instant query use case (#5547) 2024-01-09 11:29:21 +01:00
Fred Navruzov
e34f77aed4 docs: vmanomaly chapter hotfixes (#5583)
* update link to book a demo for AD & RCA

* fix invalid refs in components/model

* - fix staging -> prod links
- replace capitalized FAQ headers
- change the section order on main page
- replace :latest tag with current stable
2024-01-09 11:04:24 +01:00
Fred Navruzov
bbea02f82b - fix 404 in /guides page (#5582)
- change back AD section title
2024-01-08 11:27:16 -08:00
Artem Navoiev
fdefc8a816 docs: properly close diff
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-08 20:20:09 +01:00
Artem Navoiev
095d982976 docs: vmagent specify default value for the compression level and its maximum. The question appers too often in our channel (#5575)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-08 20:14:23 +01:00
Dmytro Kozlov
105c6b2eb7 app/vmui: fix broken link for the statistic inaccuracy explanation (#5568) 2024-01-08 20:13:45 +01:00
Dan Dascalescu
33df9bee22 Link to start/end timestamp formats in url-examples (#5531) 2024-01-08 18:23:38 +04:00
hagen1778
463455665b dashboards: update cluster dashboard
* add panels for detailed visualization of traffic usage between vmstorage, vminsert, vmselect
components and their clients. New panels are available in the rows dedicated to specific components.

* update "Slow Queries" panel to show percentage of the slow queries to the total number of read queries
served by vmselect. The percentage value should make it more clear for users whether there is a service degradation.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-08 11:58:31 +01:00
Denys Holius
eb08f5c7e5 docs/anomaly-detection/guides/README.md: fixed markdown link to the vmanomaly guide (#5578) 2024-01-08 02:41:52 -08:00
Artem Navoiev
1aa39efec1 docs: vmanomaly add list of guides to guides page
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-08 10:44:48 +01:00
Artem Navoiev
07e5d6f0fb docs: change sorting of anomaly deteciton
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-08 10:39:38 +01:00
Fred Navruzov
f75874f5df docs: vmanomaly part 1 (#5558)
* add `AD` section, fix links, release docs and changelog

* - connect sections, refactor structure

* - resolve suggestions
- add FAQ section
- fix dead links

* - fix incorrect render of tables for Writer
- comment out internal readers/writers
- fix page ordering to some extent

* - link licensing requirements from v1.5.0 to main page

---------

Co-authored-by: Artem Navoiev <tenmozes@gmail.com>
2024-01-08 01:31:36 -08:00
Denys Holius
aecfabe318 CHANGELOG.md: fixed wrong links to vmalert-tool documentation page (#5570) 2024-01-05 07:16:46 -08:00
Artem Navoiev
47307c7a37 docs: specify right link to grafana operator docs
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-12-24 22:51:43 +01:00
hagen1778
d0ca448093 docs: fix typo in VictoriaLogs upgrading procedure
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-22 21:41:27 +01:00
hagen1778
35dd6e5e8e docs: docs-sync after 52692d001a
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-22 21:34:26 +01:00
Dan Dascalescu
52692d001a docs: fix English and rm dupe sentence in README (#5523) 2023-12-22 11:02:14 -08:00
hagen1778
95edeffbc6 docs: add link to sandbox to the Grafana section
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-22 16:42:33 +01:00
hagen1778
910a39ad72 vendor: go mod tidy & go mod vendor
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-22 16:10:01 +01:00
Hui Wang
1f477aba41 vmalert: automatically add exported_ prefix for original evaluation… (#5398)
automatically add `exported_` prefix for original evaluation result label if it's conflicted with external or reserved one,
previously it was overridden.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5161

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-12-22 16:07:47 +01:00
Daria Karavaieva
be20501376 docs: vmanomaly guide v1.7.0 changes (#5505) 2023-12-22 16:06:41 +01:00
Dmytro Kozlov
43d7de4afe docs: clarify information about values (#5503) 2023-12-22 16:05:16 +01:00
hagen1778
34b69dcf58 vendor/metrics: fix unaligned 64-bit atomic operation panic on 32-bit arch
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-22 16:02:35 +01:00
hagen1778
8ba483eca3 docs: fix unauthorizedAccessConfig filed names for operator
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5520

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-22 10:50:21 +01:00
Aliaksandr Valialkin
7575f5c501 lib/protoparser/datadogv2: take into account source_type_name field, since it contains useful value such as kubernetes, docker, system, etc. 2023-12-21 23:05:41 +02:00
Aliaksandr Valialkin
b4ba8d0d76 lib/protoparser: add missing /datadog/ prefix to the /api/v2/series path in the description for -datadog.maxInsertRequestSize command-line flag 2023-12-21 21:04:53 +02:00
Aliaksandr Valialkin
9678235eea docs/CHANGELOG.md: typo fix after fb90a56de2: supperted -> supported 2023-12-21 21:01:42 +02:00
Aliaksandr Valialkin
fb90a56de2 app/{vminsert,vmagent}: preliminary support for /api/v2/series ingestion from new versions of DataDog Agent
This commit adds only JSON support - https://docs.datadoghq.com/api/latest/metrics/#submit-metrics ,
while recent versions of DataDog Agent send data to /api/v2/series in undocumented Protobuf format.
The support for this format will be added later.

Thanks to @AndrewChubatiuk for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451
2023-12-21 20:50:55 +02:00
Roman Khavronenko
8c1dcf4743 app/vmselect: drop rollupDefault function as duplicate (#5502)
* app/vmselect: drop `rollupDefault` function as duplicate

It is unclear why there are two identical fns `rollupDefault`
and `rollupDistinct`. Dropping one of them.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update app/vmselect/promql/rollup.go

* Update app/vmselect/promql/rollup.go

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-12-21 11:22:19 +02:00
Aliaksandr Valialkin
01f9edda64 lib/promauth: add more context to errors returned by Options.NewConfig() in order to simplify troubleshooting 2023-12-20 21:58:12 +02:00
Aliaksandr Valialkin
160cc9debd app/{vmagent,vmalert}: add the ability to set OAuth2 endpoint params via the corresponding *.oauth2.endpointParams command-line flags
This is a follow-up for 5ebd5a0d7b

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5427
2023-12-20 21:35:28 +02:00
Aliaksandr Valialkin
67160d08a2 docs/Articles.md: add two articles about VictoriaMetrics from zetablogs.medium.com 2023-12-20 21:24:51 +02:00
Morgan
5ebd5a0d7b Expose OAuth2 Endpoint Parameters to cli (#5427)
The user may which to control the endpoint parameters for instance to
set the audience when requesting an access token. Exposing the
parameters as a map allows for additional use cases without requiring
modification.
2023-12-20 20:16:43 +02:00
Aliaksandr Valialkin
7a31f8a6c9 app/vmselect/netstorage: make sure that at least a single result is collected from every storage group before deciding whether it is OK to skip results from the remaining storage nodes 2023-12-20 19:55:13 +02:00
Nikolay
7cfde237ec lib/awsapi: properly assume role with webIdentity token (#5495)
* lib/awsapi: properly assume role with webIdentity token
introduce new irsaRoleArn param for config. It's only needed for authorization with webIdentity token.
First credentials obtained with irsa role and the next sts assume call for an actual roleArn made with those credentials.
Common use case for it - cross AWS accounts authorization
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3822

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-12-20 19:05:39 +02:00
Aliaksandr Valialkin
5a88bc973f all: use Gauge instead of Counter for *_config_last_reload_successful metrics
This allows exposing the correct TYPE metadata for these labels when the app runs with -metrics.exposeMetadata command-line flag.
See https://github.com/VictoriaMetrics/metrics/pull/61#issuecomment-1860085508 for more details.

This is follow-up for 326a77c697
2023-12-20 14:23:42 +02:00
Yury Molodov
a35e52114b vmui: add vmanomaly explorer (#5401) 2023-12-19 17:20:54 +01:00
Dmytro Kozlov
df012f1553 docs: remove default value from the maxConcurrentInserts flag (#5494) 2023-12-19 15:57:33 +01:00
Aliaksandr Valialkin
326a77c697 all: add -metrics.exposeMetadata command-line flag, which can be used for adding TYPE and HELP metadata for metrics exposed at /metrics page
This may be needed for systems, which require this metadata such as Google Cloud Managed Prometheus.
See https://cloud.google.com/stackdriver/docs/managed-prometheus/troubleshooting#missing-metric-type
2023-12-19 03:20:40 +02:00
Artem Navoiev
bc3feebf69 docs: vmalert exapand HA abbreviation
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-12-18 15:46:40 +01:00
hagen1778
11e2d41c77 docs: mention Splitting data streams in Multi-retention guide
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-18 13:23:16 +01:00
hagen1778
afaf7f0b74 docs: add example for Splitting data streams among multiple systems
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-18 13:19:24 +01:00
Aliaksandr Valialkin
4b529562ce lib/pushmetrics: add -pushmetrics.header and -pushmetrics.disableCompression command-line flags 2023-12-17 19:56:46 +02:00
Aliaksandr Valialkin
873f0deaa6 docs/CHANGELOG.md: typo fix: use proper backtick closing quote instead of single quote 2023-12-17 19:28:15 +02:00
Aliaksandr Valialkin
0379a0eb82 lib/protoparser/opentelemetry: allow ingesting metrics without resource labels
Some clients may ingest samples via OpenTelemetry protocol without Resource labels.
Previously VictoriaMetrics was silently dropping such samples.

The commit 317834f876 added vm_protoparser_rows_dropped_total{type="opentelemetry",reason="resource_not_set"}
counter for tracking of such dropped samples. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5459

It is better from usability PoV to accept such samples instead of dropping them and incrementing the corresponding counter.
2023-12-17 19:12:58 +02:00
Aliaksandr Valialkin
5ddccbc2b9 docs/CHANGELOG.md: move the description of the bugfix from 9253c24dd6 into correct place
The description of the bugfix was incorrectly placed in already released v1.96.0,
while it should be placed in tip.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5450
2023-12-17 18:58:42 +02:00
Roman Khavronenko
779bbc2e91 vmctl: rename vm-native-disable-retries to vm-native-disable-per-metric-migration (#5476)
The change supposed to better reflect the meaning of this flag.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-15 12:36:28 +01:00
Roman Khavronenko
664fa5cb78 vmctl: retry requests that failed in the very end for vm-native (#5475)
Before, retries happened only on writes into a network connection
between source and destination. But errors returned by server after
all the data was transmitted were logged, but not retried.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-12-15 11:43:41 +01:00
Zakhar Bessarab
317834f876 lib/protoparser/opentelemetry: add metric to track skipped rows without resource (#5459)
Currently, it is impossible to understand why metrics are not ingested when resource is not set by OTEL exporter. Adding metric should simplify debugging and make it improve debuggability.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-12-15 11:16:25 +01:00
Hui Wang
9253c24dd6 vmalert: validate schema for -external.url (#5450)
Requests with wrong or no schema in  `-external.url` could be rejected by alertmanager.
So we validate schema on start up.
2023-12-15 11:13:56 +01:00
Dima Lazerka
cd277e3f84 VMUI: Handle unknown query error response type (#5451)
* VMUI: Handle unknown query error response type

* vmui: add error text for unknown error type

* Simplify nested `if`s for unknown error

Accepting @Loori-R's suggestion

Co-authored-by: Yury Molodov <yurymolodov@gmail.com>

---------

Co-authored-by: Yury Moladau <yurymolodov@gmail.com>
2023-12-14 21:19:54 -08:00
Aliaksandr Valialkin
0a6a2e455d app/vmstorage: addd missing -inmemoryDataFlushInterval command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2023-12-14 20:52:39 +02:00
Artem Navoiev
5896fb129d add link to RELEX solutions case study to home and signle pages
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-12-14 19:11:38 +01:00
nemobis
0b0f565c31 Add RELEX Solutions to case studies (#5469)
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-12-14 10:07:22 -08:00
Aliaksandr Valialkin
72dbd24b22 lib/fs: remove unused IsEmptyDir()
This function became unused after the commit 43b24164ef

The unused function has been found with deadode tool - https://go.dev/blog/deadcode
2023-12-14 19:38:53 +02:00
Aliaksandr Valialkin
df88baef07 docs/CHANGELOG.md: document the bugfix at 66c76a4d4d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5414
2023-12-14 12:48:37 +02:00
Anton Tykhyy
66c76a4d4d Fix sum(aggr_over_time) 'got 1 args' error (#3028) (#5414)
app/vmselect/promql/eval.go:evalAggrFunc shunts evaluation
of AggrFuncExpr over rollupFunc over MetricsExpr to an optimized
path. tryGetArgRollupFuncWithMetricExpr() checks whether expression
can be shunted, but it mangles the AggrFuncExpr when the aggregation
function has more than one argument. This results in queries like
`sum(aggr_over_time("avg_over_time",m))` failing with error message
'expecting at least 2 args to "aggr_over_time"; got 1 args' while
the analogous query `sum(avg_over_time(m))` executes successfully.
This fix removes the unnecessary mangling.

Signed-off-by: Anton Tykhyy <atykhyy@gmail.com>
2023-12-14 12:38:54 +02:00
Aliaksandr Valialkin
2afb068f0f app/vmauth: allow specifying an empty retry_status_codes and and zero drop_src_path_prefix_parts in order to override user-level setting
Previously `retry_status_codes: []` and `drop_src_path_prefix_parts: 0` at `url_map` were equivalent to missing values.
This was resulting in using the user-level values instead.
2023-12-14 01:04:56 +02:00
Aliaksandr Valialkin
68be182075 app/vmauth: add ability to route requests to different backends depending on the request host 2023-12-14 00:46:36 +02:00
Aliaksandr Valialkin
6f15ca4a16 docs/Single-server-VictoriaMetrics.md: cross-link HA docs with the corresponding chapter at vmauth docs 2023-12-13 23:06:46 +02:00
Aliaksandr Valialkin
463a6e9ac6 deployment: update VictoriaMetrics images from v1.95.1 to v1.96.0
See https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.96.0
2023-12-13 01:20:51 +02:00
952 changed files with 49545 additions and 40078 deletions

View File

@@ -60,8 +60,8 @@ body:
For VictoriaMetrics health-state issues please provide full-length screenshots
of Grafana dashboards if possible:
* [Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/grafana/dashboards/10229-victoriametrics/)
* [Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176-victoriametrics-cluster/)
* [Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/grafana/dashboards/10229/)
* [Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176/)
See how to setup monitoring here:
* [monitoring for single-node VictoriaMetrics](https://docs.victoriametrics.com/#monitoring)

View File

@@ -36,11 +36,11 @@ jobs:
uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v2
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v2
uses: github/codeql-action/analyze@v3
with:
category: "javascript"

View File

@@ -75,7 +75,7 @@ jobs:
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v2
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
@@ -86,7 +86,7 @@ jobs:
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v2
uses: github/codeql-action/autobuild@v3
# Command-line programs to run using the OS shell.
# 📚 https://git.io/JvXDl
@@ -100,4 +100,4 @@ jobs:
# make release
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v2
uses: github/codeql-action/analyze@v3

1
.gitignore vendored
View File

@@ -22,3 +22,4 @@ Gemfile.lock
/_site
_site
*.tmp
/docs/.jekyll-metadata

View File

@@ -175,7 +175,7 @@
END OF TERMS AND CONDITIONS
Copyright 2019-2023 VictoriaMetrics, Inc.
Copyright 2019-2024 VictoriaMetrics, Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

View File

@@ -1,6 +1,6 @@
PKG_PREFIX := github.com/VictoriaMetrics/VictoriaMetrics
MAKE_CONCURRENCY ?= $(shell cat /proc/cpuinfo | grep -c processor)
MAKE_CONCURRENCY ?= $(shell getconf _NPROCESSORS_ONLN)
MAKE_PARALLEL := $(MAKE) -j $(MAKE_CONCURRENCY)
DATEINFO_TAG ?= $(shell date -u +'%Y%m%d-%H%M%S')
BUILDINFO_TAG ?= $(shell echo $$(git describe --long --all | tr '/' '-')$$( \

317
README.md
View File

@@ -22,17 +22,17 @@ The cluster version of VictoriaMetrics is available [here](https://docs.victoria
Learn more about [key concepts](https://docs.victoriametrics.com/keyConcepts.html) of VictoriaMetrics and follow the
[quick start guide](https://docs.victoriametrics.com/Quick-Start.html) for a better experience.
There is also user-friendly database for logs - [VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/).
There is also a user-friendly database for logs - [VictoriaLogs](https://docs.victoriametrics.com/VictoriaLogs/).
If you have questions about VictoriaMetrics, then feel free asking them at [VictoriaMetrics community Slack chat](https://slack.victoriametrics.com/).
If you have questions about VictoriaMetrics, then feel free asking them in the [VictoriaMetrics community Slack chat](https://slack.victoriametrics.com/).
[Contact us](mailto:info@victoriametrics.com) if you need enterprise support for VictoriaMetrics.
See [features available in enterprise package](https://docs.victoriametrics.com/enterprise.html).
Enterprise binaries can be downloaded and evaluated for free
from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).
You can also [request a free trial license](https://victoriametrics.com/products/enterprise/trial/).
VictoriaMetrics is developed at a fast pace, so it is recommended periodically checking the [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) and performing [regular upgrades](#how-to-upgrade-victoriametrics).
VictoriaMetrics is developed at a fast pace, so it is recommended to check the [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) periodically, and to perform [regular upgrades](#how-to-upgrade-victoriametrics).
VictoriaMetrics has achieved security certifications for Database Software Development and Software-Based Monitoring Services. We apply strict security measures in everything we do. See our [Security page](https://victoriametrics.com/security/) for more details.
@@ -41,19 +41,19 @@ VictoriaMetrics has achieved security certifications for Database Software Devel
VictoriaMetrics has the following prominent features:
* It can be used as long-term storage for Prometheus. See [these docs](#prometheus-setup) for details.
* It can be used as a drop-in replacement for Prometheus in Grafana, because it supports [Prometheus querying API](#prometheus-querying-api-usage).
* It can be used as a drop-in replacement for Graphite in Grafana, because it supports [Graphite API](#graphite-api-usage).
* It can be used as a drop-in replacement for Prometheus in Grafana, because it supports the [Prometheus querying API](#prometheus-querying-api-usage).
* It can be used as a drop-in replacement for Graphite in Grafana, because it supports the [Graphite API](#graphite-api-usage).
VictoriaMetrics allows reducing infrastructure costs by more than 10x comparing to Graphite - see [this case study](https://docs.victoriametrics.com/CaseStudies.html#grammarly).
* It is easy to setup and operate:
* VictoriaMetrics consists of a single [small executable](https://medium.com/@valyala/stripping-dependency-bloat-in-victoriametrics-docker-image-983fb5912b0d)
without external dependencies.
* All the configuration is done via explicit command-line flags with reasonable defaults.
* All the data is stored in a single directory pointed by `-storageDataPath` command-line flag.
* All the data is stored in a single directory specified by the `-storageDataPath` command-line flag.
* Easy and fast backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
can be done with [vmbackup](https://docs.victoriametrics.com/vmbackup.html) / [vmrestore](https://docs.victoriametrics.com/vmrestore.html) tools.
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
* It implements PromQL-like query language - [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html), which provides improved functionality on top of PromQL.
* It provides global query view. Multiple Prometheus instances or any other data sources may ingest data into VictoriaMetrics. Later this data may be queried via a single query.
* It implements a PromQL-like query language - [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html), which provides improved functionality on top of PromQL.
* It provides a global query view. Multiple Prometheus instances or any other data sources may ingest data into VictoriaMetrics. Later this data may be queried via a single query.
* It provides high performance and good vertical and horizontal scalability for both
[data ingestion](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b)
and [data querying](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4).
@@ -62,9 +62,9 @@ VictoriaMetrics has the following prominent features:
and [up to 7x less RAM than Prometheus, Thanos or Cortex](https://valyala.medium.com/prometheus-vs-victoriametrics-benchmark-on-node-exporter-metrics-4ca29c75590f)
when dealing with millions of unique time series (aka [high cardinality](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality)).
* It is optimized for time series with [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate).
* It provides high data compression, so up to 70x more data points may be stored into limited storage comparing to TimescaleDB
according to [these benchmarks](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
and up to 7x less storage space is required compared to Prometheus, Thanos or Cortex
* It provides high data compression: up to 70x more data points may be stored into limited storage compared with TimescaleDB
according to [these benchmarks](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4),
and up to 7x less storage space is required compared to Prometheus, Thanos or Cortex.
according to [this benchmark](https://valyala.medium.com/prometheus-vs-victoriametrics-benchmark-on-node-exporter-metrics-4ca29c75590f).
* It is optimized for storage with high-latency IO and low IOPS (HDD and network storage in AWS, Google Cloud, Microsoft Azure, etc).
See [disk IO graphs from these benchmarks](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b).
@@ -75,7 +75,7 @@ VictoriaMetrics has the following prominent features:
from [PromCon 2019](https://promcon.io/2019-munich/talks/remote-write-storage-wars/).
* It protects the storage from data corruption on unclean shutdown (i.e. OOM, hardware reset or `kill -9`) thanks to
[the storage architecture](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
* It supports metrics' scraping, ingestion and [backfilling](#backfilling) via the following protocols:
* It supports metrics scraping, ingestion and [backfilling](#backfilling) via the following protocols:
* [Metrics scraping from Prometheus exporters](#how-to-scrape-prometheus-exporters-such-as-node-exporter).
* [Prometheus remote write API](#prometheus-setup).
* [Prometheus exposition format](#how-to-import-data-in-prometheus-exposition-format).
@@ -95,7 +95,7 @@ VictoriaMetrics has the following prominent features:
[high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate) issues via [series limiter](#cardinality-limiter).
* It ideally works with big amounts of time series data from APM, Kubernetes, IoT sensors, connected cars, industrial telemetry, financial data
and various [Enterprise workloads](https://docs.victoriametrics.com/enterprise.html).
* It has open source [cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
* It has an open source [cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
* It can store data on [NFS-based storages](https://en.wikipedia.org/wiki/Network_File_System) such as [Amazon EFS](https://aws.amazon.com/efs/)
and [Google Filestore](https://cloud.google.com/filestore).
@@ -138,7 +138,7 @@ See also [articles and slides about VictoriaMetrics from our users](https://docs
### Install
To quickly try VictoriaMetrics, just download [VictoriaMetrics executable](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest)
To quickly try VictoriaMetrics, just download the [VictoriaMetrics executable](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest)
or [Docker image](https://hub.docker.com/r/victoriametrics/victoria-metrics/) and start it with the desired command-line flags.
See also [QuickStart guide](https://docs.victoriametrics.com/Quick-Start.html) for additional information.
@@ -155,10 +155,10 @@ VictoriaMetrics can also be installed via these installation methods:
The following command-line flags are used the most:
* `-storageDataPath` - VictoriaMetrics stores all the data in this directory. Default path is `victoria-metrics-data` in the current working directory.
* `-storageDataPath` - VictoriaMetrics stores all the data in this directory. The default path is `victoria-metrics-data` in the current working directory.
* `-retentionPeriod` - retention for stored data. Older data is automatically deleted. Default retention is 1 month (31 days). The minimum retention period is 24h or 1d. See [these docs](#retention) for more details.
Other flags have good enough default values, so set them only if you really need this. Pass `-help` to see [all the available flags with description and default values](#list-of-command-line-flags).
Other flags have good enough default values, so set them only if you really need to. Pass `-help` to see [all the available flags with description and default values](#list-of-command-line-flags).
The following docs may be useful during initial VictoriaMetrics setup:
* [How to set up scraping of Prometheus-compatible targets](https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter)
@@ -172,9 +172,6 @@ VictoriaMetrics accepts [Prometheus querying API requests](#prometheus-querying-
It is recommended setting up [monitoring](#monitoring) for VictoriaMetrics.
VictoriaMetrics is developed at a fast pace, so it is recommended periodically checking the [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) and performing [regular upgrades](#how-to-upgrade-victoriametrics).
### Environment variables
All the VictoriaMetrics components allow referring environment variables in `yaml` configuration files (such as `-promscrape.config`)
@@ -199,7 +196,7 @@ Snap package for VictoriaMetrics is available [here](https://snapcraft.io/victor
Command-line flags for Snap package can be set with following command:
```text
```sh
echo 'FLAGS="-selfScrapeInterval=10s -search.logSlowQueryDuration=20s"' > $SNAP_DATA/var/snap/victoriametrics/current/extra_flags
snap restart victoriametrics
```
@@ -208,7 +205,7 @@ Do not change value for `-storageDataPath` flag, because snap package has limite
Changing scrape configuration is possible with text editor:
```text
```sh
vi $SNAP_DATA/var/snap/victoriametrics/current/etc/victoriametrics-scrape-config.yaml
```
@@ -261,7 +258,7 @@ and then install it as a service according to the following guide:
1. Install VictoriaMetrics as a service by running the following from elevated PowerShell:
```console
```sh
winsw install VictoriaMetrics.xml
Get-Service VictoriaMetrics | Start-Service
```
@@ -273,25 +270,21 @@ See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3781)
Add the following lines to Prometheus config file (it is usually located at `/etc/prometheus/prometheus.yml`) in order to send data to VictoriaMetrics:
<div class="with-copy" markdown="1">
```yml
```yaml
remote_write:
- url: http://<victoriametrics-addr>:8428/api/v1/write
```
</div>
Substitute `<victoriametrics-addr>` with hostname or IP address of VictoriaMetrics.
Then apply new config via the following command:
<div class="with-copy" markdown="1">
```console
```sh
kill -HUP `pidof prometheus`
```
</div>
Prometheus writes incoming data to local storage and replicates it to remote storage in parallel.
This means that data remains available in local storage for `--storage.tsdb.retention.time` duration
@@ -300,7 +293,7 @@ even if remote storage is unavailable.
If you plan sending data to VictoriaMetrics from multiple Prometheus instances, then add the following lines into `global` section
of [Prometheus config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file):
```yml
```yaml
global:
external_labels:
datacenter: dc-123
@@ -312,7 +305,6 @@ across Prometheus instances, so time series could be filtered and grouped by thi
For highly loaded Prometheus instances (200k+ samples per second) the following tuning may be applied:
<div class="with-copy" markdown="1">
```yaml
remote_write:
@@ -323,7 +315,6 @@ remote_write:
max_shards: 30
```
</div>
Using remote write increases memory usage for Prometheus by up to ~25%. If you are experiencing issues with
too high memory consumption of Prometheus, then try to lower `max_samples_per_send` and `capacity` params.
@@ -363,6 +354,8 @@ See more in [description](https://github.com/VictoriaMetrics/grafana-datasource#
Creating a datasource may require [specific permissions](https://grafana.com/docs/grafana/latest/administration/data-source-management/).
If you don't see an option to create a data source - try contacting system administrator.
Grafana playground is available for viewing at our [sandbox](https://play-grafana.victoriametrics.com).
## How to upgrade VictoriaMetrics
VictoriaMetrics is developed at a fast pace, so it is recommended periodically checking [the CHANGELOG page](https://docs.victoriametrics.com/CHANGELOG.html) and performing regular upgrades.
@@ -516,43 +509,34 @@ See also [vmagent](https://docs.victoriametrics.com/vmagent.html), which can be
## How to send data from DataDog agent
VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/)
or [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/)
via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics)
at `/datadog/api/v1/series` path.
VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/) or [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/)
via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics) at `/datadog/api/v2/series` path.
### Sending metrics to VictoriaMetrics
DataDog agent allows configuring destinations for metrics sending via ENV variable `DD_DD_URL`
or via [configuration file](https://docs.datadoghq.com/agent/guide/agent-configuration-files/) in section `dd_url`.
<p align="center">
<img src="docs/Single-server-VictoriaMetrics-sending_DD_metrics_to_VM.webp" width="800">
</p>
<img src="docs/Single-server-VictoriaMetrics-sending_DD_metrics_to_VM.webp">
To configure DataDog agent via ENV variable add the following prefix:
<div class="with-copy" markdown="1">
```
```sh
DD_DD_URL=http://victoriametrics:8428/datadog
```
</div>
_Choose correct URL for VictoriaMetrics [here](https://docs.victoriametrics.com/url-examples.html#datadog)._
To configure DataDog agent via [configuration file](https://github.com/DataDog/datadog-agent/blob/878600ef7a55c5ef0efb41ed0915f020cf7e3bd0/pkg/config/config_template.yaml#L33)
add the following line:
<div class="with-copy" markdown="1">
```
```yaml
dd_url: http://victoriametrics:8428/datadog
```
</div>
vmagent also can accept Datadog metrics format. Depending on where vmagent will forward data,
[vmagent](https://docs.victoriametrics.com/vmagent.html) also can accept Datadog metrics format. Depending on where vmagent will forward data,
pick [single-node or cluster URL](https://docs.victoriametrics.com/url-examples.html#datadog) formats.
### Sending metrics to Datadog and VictoriaMetrics
@@ -560,41 +544,30 @@ pick [single-node or cluster URL](https://docs.victoriametrics.com/url-examples.
DataDog allows configuring [Dual Shipping](https://docs.datadoghq.com/agent/guide/dual-shipping/) for metrics
sending via ENV variable `DD_ADDITIONAL_ENDPOINTS` or via configuration file `additional_endpoints`.
<p align="center">
<img src="docs/Single-server-VictoriaMetrics-sending_DD_metrics_to_VM_and_DD.webp" width="800">
</p>
<img src="docs/Single-server-VictoriaMetrics-sending_DD_metrics_to_VM_and_DD.webp">
Run DataDog using the following ENV variable with VictoriaMetrics as additional metrics receiver:
<div class="with-copy" markdown="1">
```
```sh
DD_ADDITIONAL_ENDPOINTS='{\"http://victoriametrics:8428/datadog\": [\"apikey\"]}'
```
</div>
_Choose correct URL for VictoriaMetrics [here](https://docs.victoriametrics.com/url-examples.html#datadog)._
To configure DataDog Dual Shipping via [configuration file](https://docs.datadoghq.com/agent/guide/agent-configuration-files)
add the following line:
<div class="with-copy" markdown="1">
```
```yaml
additional_endpoints:
"http://victoriametrics:8428/datadog":
- apikey
```
</div>
### Send via cURL
See how to send data to VictoriaMetrics via
[DataDog "submit metrics"](https://docs.victoriametrics.com/url-examples.html#datadogapiv1series) from command line.
See how to send data to VictoriaMetrics via DataDog "submit metrics" API [here](https://docs.victoriametrics.com/url-examples.html#datadogapiv2series).
The imported data can be read via [export API](https://docs.victoriametrics.com/url-examples.html#apiv1export).
@@ -605,7 +578,7 @@ according to [DataDog metric naming recommendations](https://docs.datadoghq.com/
If you need accepting metric names as is without sanitizing, then pass `-datadog.sanitizeMetricName=false` command-line flag to VictoriaMetrics.
Extra labels may be added to all the written time series by passing `extra_label=name=value` query args.
For example, `/datadog/api/v1/series?extra_label=foo=bar` would add `{foo="bar"}` label to all the ingested metrics.
For example, `/datadog/api/v2/series?extra_label=foo=bar` would add `{foo="bar"}` label to all the ingested metrics.
DataDog agent sends the [configured tags](https://docs.datadoghq.com/getting_started/tagging/) to
undocumented endpoint - `/datadog/intake`. This endpoint isn't supported by VictoriaMetrics yet.
@@ -659,24 +632,20 @@ foo_field2{tag1="value1", tag2="value2"} 40
Example for writing data with [InfluxDB line protocol](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/)
to local VictoriaMetrics using `curl`:
<div class="with-copy" markdown="1">
```console
```sh
curl -d 'measurement,tag1=value1,tag2=value2 field1=123,field2=1.23' -X POST 'http://localhost:8428/write'
```
</div>
An arbitrary number of lines delimited by '\n' (aka newline char) can be sent in a single request.
After that the data may be read via [/api/v1/export](#how-to-export-data-in-json-line-format) endpoint:
<div class="with-copy" markdown="1">
```console
```sh
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"measurement_.*"}'
```
</div>
The `/api/v1/export` endpoint should return the following response:
@@ -703,13 +672,11 @@ VictoriaMetrics exposes endpoint for InfluxDB v2 HTTP API at `/influx/api/v2/wri
In order to write data with InfluxDB line protocol to local VictoriaMetrics using `curl`:
<div class="with-copy" markdown="1">
```console
```sh
curl -d 'measurement,tag1=value1,tag2=value2 field1=123,field2=1.23' -X POST 'http://localhost:8428/api/v2/write'
```
</div>
The `/api/v1/export` endpoint should return the following response:
@@ -723,7 +690,7 @@ The `/api/v1/export` endpoint should return the following response:
Enable Graphite receiver in VictoriaMetrics by setting `-graphiteListenAddr` command line flag. For instance,
the following command will enable Graphite receiver in VictoriaMetrics on TCP and UDP port `2003`:
```console
```sh
/path/to/victoria-metrics-prod -graphiteListenAddr=:2003
```
@@ -732,7 +699,7 @@ to the VictoriaMetrics host in `StatsD` configs.
Example for writing data with Graphite plaintext protocol to local VictoriaMetrics using `nc`:
```console
```sh
echo "foo.bar.baz;tag1=value1;tag2=value2 123 `date +%s`" | nc -N localhost 2003
```
@@ -740,13 +707,11 @@ VictoriaMetrics sets the current time if the timestamp is omitted.
An arbitrary number of lines delimited by `\n` (aka newline char) can be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-data-in-json-line-format) endpoint:
<div class="with-copy" markdown="1">
```console
```sh
curl -G 'http://localhost:8428/api/v1/export' -d 'match=foo.bar.baz'
```
</div>
The `/api/v1/export` endpoint should return the following response:
@@ -783,7 +748,7 @@ The same protocol is used for [ingesting data in KairosDB](https://kairosdb.gith
Enable OpenTSDB receiver in VictoriaMetrics by setting `-opentsdbListenAddr` command line flag. For instance,
the following command enables OpenTSDB receiver in VictoriaMetrics on TCP and UDP port `4242`:
```console
```sh
/path/to/victoria-metrics-prod -opentsdbListenAddr=:4242
```
@@ -791,24 +756,20 @@ Send data to the given address from OpenTSDB-compatible agents.
Example for writing data with OpenTSDB protocol to local VictoriaMetrics using `nc`:
<div class="with-copy" markdown="1">
```console
```sh
echo "put foo.bar.baz `date +%s` 123 tag1=value1 tag2=value2" | nc -N localhost 4242
```
</div>
An arbitrary number of lines delimited by `\n` (aka newline char) can be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-data-in-json-line-format) endpoint:
<div class="with-copy" markdown="1">
```console
```sh
curl -G 'http://localhost:8428/api/v1/export' -d 'match=foo.bar.baz'
```
</div>
The `/api/v1/export` endpoint should return the following response:
@@ -821,7 +782,7 @@ The `/api/v1/export` endpoint should return the following response:
Enable HTTP server for OpenTSDB `/api/put` requests by setting `-opentsdbHTTPListenAddr` command line flag. For instance,
the following command enables OpenTSDB HTTP server on port `4242`:
```console
```sh
/path/to/victoria-metrics-prod -opentsdbHTTPListenAddr=:4242
```
@@ -829,33 +790,26 @@ Send data to the given address from OpenTSDB-compatible agents.
Example for writing a single data point:
<div class="with-copy" markdown="1">
```console
```sh
curl -H 'Content-Type: application/json' -d '{"metric":"x.y.z","value":45.34,"tags":{"t1":"v1","t2":"v2"}}' http://localhost:4242/api/put
```
</div>
Example for writing multiple data points in a single request:
<div class="with-copy" markdown="1">
```console
```sh
curl -H 'Content-Type: application/json' -d '[{"metric":"foo","value":45.34},{"metric":"bar","value":43}]' http://localhost:4242/api/put
```
</div>
After that the data may be read via [/api/v1/export](#how-to-export-data-in-json-line-format) endpoint:
<div class="with-copy" markdown="1">
```console
```sh
curl -G 'http://localhost:8428/api/v1/export' -d 'match[]=x.y.z' -d 'match[]=foo' -d 'match[]=bar'
```
</div>
The `/api/v1/export` endpoint should return the following response:
@@ -881,7 +835,7 @@ The `COLLECTOR_URL` must point to `/newrelic` HTTP endpoint at VictoriaMetrics,
which can be obtained [here](https://newrelic.com/signup).
For example, if VictoriaMetrics runs at `localhost:8428`, then the following command can be used for running NewRelic infrastructure agent:
```console
```sh
COLLECTOR_URL="http://localhost:8428/newrelic" NRIA_LICENSE_KEY="NEWRELIC_LICENSE_KEY" ./newrelic-infra
```
@@ -922,13 +876,13 @@ For example, let's import the following NewRelic Events request to VictoriaMetri
Save this JSON into `newrelic.json` file and then use the following command in order to import it into VictoriaMetrics:
```console
```sh
curl -X POST -H 'Content-Type: application/json' --data-binary @newrelic.json http://localhost:8428/newrelic/infra/v2/metrics/events/bulk
```
Let's fetch the ingested data via [data export API](#how-to-export-data-in-json-line-format):
```console
```sh
curl http://localhost:8428/api/v1/export -d 'match={eventType="SystemSample"}'
{"metric":{"__name__":"cpuStealPercent","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[0],"timestamps":[1697407970000]}
{"metric":{"__name__":"loadAverageFiveMinute","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[4.099609375],"timestamps":[1697407970000]}
@@ -1136,7 +1090,7 @@ The base docker image is [alpine](https://hub.docker.com/_/alpine) but it is pos
by setting it via `<ROOT_IMAGE>` environment variable.
For example, the following command builds the image on top of [scratch](https://hub.docker.com/_/scratch) image:
```console
```sh
ROOT_IMAGE=scratch make package-victoria-metrics
```
@@ -1217,6 +1171,7 @@ before actually deleting the metrics. By default, this query will only scan seri
adjust `start` and `end` to a suitable range to achieve match hits.
The `/api/v1/admin/tsdb/delete_series` handler may be protected with `authKey` if `-deleteAuthKey` command-line flag is set.
Note that handler accepts any HTTP method, so sending a `GET` request to `/api/v1/admin/tsdb/delete_series` will result in deletion of time series.
The delete API is intended mainly for the following cases:
@@ -1278,7 +1233,7 @@ Optional `start` and `end` args may be added to the request in order to limit th
See [allowed formats](#timestamp-formats) for these args.
For example:
```console
```sh
curl http://<victoriametrics-addr>:8428/api/v1/export -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/api/v1/export -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'
```
@@ -1290,13 +1245,11 @@ In this case the output may contain multiple lines with samples for the same tim
Pass `Accept-Encoding: gzip` HTTP header in the request to `/api/v1/export` in order to reduce network bandwidth during exporting big amounts
of time series data. This enables gzip compression for the exported data. Example for exporting gzipped data:
<div class="with-copy" markdown="1">
```console
```sh
curl -H 'Accept-Encoding: gzip' http://localhost:8428/api/v1/export -d 'match[]={__name__!=""}' > data.jsonl.gz
```
</div>
The maximum duration for each request to `/api/v1/export` is limited by `-search.maxExportDuration` command-line flag.
@@ -1327,7 +1280,7 @@ Optional `start` and `end` args may be added to the request in order to limit th
See [allowed formats](#timestamp-formats) for these args.
For example:
```console
```sh
curl http://<victoriametrics-addr>:8428/api/v1/export/csv -d 'format=<format>' -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/api/v1/export/csv -d 'format=<format>' -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'
```
@@ -1344,7 +1297,7 @@ for metrics to export. Use `{__name__=~".*"}` selector for fetching all the time
On large databases you may experience problems with limit on the number of time series, which can be exported. In this case you need to adjust `-search.maxExportSeries` command-line flag:
```console
```sh
# count unique time series in database
wget -O- -q 'http://your_victoriametrics_instance:8428/api/v1/series/count' | jq '.data[0]'
@@ -1355,7 +1308,7 @@ Optional `start` and `end` args may be added to the request in order to limit th
See [allowed formats](#timestamp-formats) for these args.
For example:
```console
```sh
curl http://<victoriametrics-addr>:8428/api/v1/export/native -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/api/v1/export/native -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'
```
@@ -1400,7 +1353,7 @@ VictoriaMetrics accepts metrics data in JSON line format at `/api/v1/import` end
Example for importing data obtained via [/api/v1/export](#how-to-export-data-in-json-line-format):
```console
```sh
# Export the data from <source-victoriametrics>:
curl http://source-victoriametrics:8428/api/v1/export -d 'match={__name__!=""}' > exported_data.jsonl
@@ -1410,7 +1363,7 @@ curl -X POST http://destination-victoriametrics:8428/api/v1/import -T exported_d
Pass `Content-Encoding: gzip` HTTP request header to `/api/v1/import` for importing gzipped data:
```console
```sh
# Export gzipped data from <source-victoriametrics>:
curl -H 'Accept-Encoding: gzip' http://source-victoriametrics:8428/api/v1/export -d 'match={__name__!=""}' > exported_data.jsonl.gz
@@ -1436,7 +1389,7 @@ The specification of VictoriaMetrics' native format may yet change and is not fo
If you have a native format file obtained via [/api/v1/export/native](#how-to-export-data-in-native-format) however this is the most efficient protocol for importing data in.
```console
```sh
# Export the data from <source-victoriametrics>:
curl http://source-victoriametrics:8428/api/v1/export/native -d 'match={__name__!=""}' > exported_data.bin
@@ -1454,7 +1407,7 @@ Note that it could be required to flush response cache after importing historica
Arbitrary CSV data can be imported via `/api/v1/import/csv`. The CSV data is imported according to the provided `format` query arg.
The `format` query arg must contain comma-separated list of parsing rules for CSV fields. Each rule consists of three parts delimited by a colon:
```
```text
<column_pos>:<type>:<context>
```
@@ -1477,14 +1430,14 @@ Each request to `/api/v1/import/csv` may contain arbitrary number of CSV lines.
Example for importing CSV data via `/api/v1/import/csv`:
```console
```sh
curl -d "GOOG,1.23,4.56,NYSE" 'http://localhost:8428/api/v1/import/csv?format=2:metric:ask,3:metric:bid,1:label:ticker,4:label:market'
curl -d "MSFT,3.21,1.67,NASDAQ" 'http://localhost:8428/api/v1/import/csv?format=2:metric:ask,3:metric:bid,1:label:ticker,4:label:market'
```
After that the data may be read via [/api/v1/export](#how-to-export-data-in-json-line-format) endpoint:
```console
```sh
curl -G 'http://localhost:8428/api/v1/export' -d 'match[]={ticker!=""}'
```
@@ -1510,23 +1463,19 @@ and in [Pushgateway format](https://github.com/prometheus/pushgateway#url) via `
For example, the following command imports a single line in Prometheus exposition format into VictoriaMetrics:
<div class="with-copy" markdown="1">
```console
```sh
curl -d 'foo{bar="baz"} 123' -X POST 'http://localhost:8428/api/v1/import/prometheus'
```
</div>
The following command may be used for verifying the imported data:
<div class="with-copy" markdown="1">
```console
```sh
curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"foo"}'
```
</div>
It should return something like the following:
@@ -1536,24 +1485,20 @@ It should return something like the following:
The following command imports a single metric via [Pushgateway format](https://github.com/prometheus/pushgateway#url) with `{job="my_app",instance="host123"}` labels:
<div class="with-copy" markdown="1">
```console
```sh
curl -d 'metric{label="abc"} 123' -X POST 'http://localhost:8428/api/v1/import/prometheus/metrics/job/my_app/instance/host123'
```
</div>
Pass `Content-Encoding: gzip` HTTP request header to `/api/v1/import/prometheus` for importing gzipped data:
<div class="with-copy" markdown="1">
```console
```sh
# Import gzipped data to <destination-victoriametrics>:
curl -X POST -H 'Content-Encoding: gzip' http://destination-victoriametrics:8428/api/v1/import/prometheus -T prometheus_data.gz
```
</div>
Extra labels may be added to all the imported metrics either via [Pushgateway format](https://github.com/prometheus/pushgateway#url)
or by passing `extra_label=name=value` query args. For example, `/api/v1/import/prometheus?extra_label=foo=bar` would add `{foo="bar"}` label to all the imported metrics.
@@ -1581,7 +1526,7 @@ and exports data in this format at [/api/v1/export](#how-to-export-data-in-json-
The format follows [JSON streaming concept](http://ndjson.org/), e.g. each line contains JSON object with metrics data in the following format:
```
```json
{
// metric contans metric name plus labels for a particular time series
"metric":{
@@ -1633,7 +1578,7 @@ The `-relabelConfig` files can contain special placeholders in the form `%{ENV_V
Example contents for `-relabelConfig` file:
```yml
```yaml
# Add {cluster="dev"} label.
- target_label: cluster
replacement: dev
@@ -1661,7 +1606,7 @@ Optional `start` and `end` args may be added to the request in order to scrape t
See [allowed formats](#timestamp-formats) for these args.
For example:
```console
```sh
curl http://<victoriametrics-addr>:8428/federate -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/federate -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'
```
@@ -1722,9 +1667,10 @@ See also [resource usage limits at VictoriaMetrics cluster](https://docs.victori
The general approach for achieving high availability is the following:
- to run two identically configured VictoriaMetrics instances in distinct datacenters (availability zones)
- to store the collected data simultaneously into these instances via [vmagent](https://docs.victoriametrics.com/vmagent.html) or Prometheus
- to query the first VictoriaMetrics instance and to fail over to the second instance when the first instance becomes temporarily unavailable.
- To run two identically configured VictoriaMetrics instances in distinct datacenters (availability zones);
- To store the collected data simultaneously into these instances via [vmagent](https://docs.victoriametrics.com/vmagent.html) or Prometheus.
- To query the first VictoriaMetrics instance and to fail over to the second instance when the first instance becomes temporarily unavailable.
This can be done via [vmauth](https://docs.victoriametrics.com/vmauth.html) according to [these docs](https://docs.victoriametrics.com/vmauth.html#high-availability).
Such a setup guarantees that the collected data isn't lost when one of VictoriaMetrics instance becomes unavailable.
The collected data continues to be written to the available VictoriaMetrics instance, so it should be available for querying.
@@ -1737,7 +1683,7 @@ then it can be configured with multiple `-remoteWrite.url` command-line flags, w
instance in a particular availability zone, in order to replicate the collected data to all the VictoriaMetrics instances.
For example, the following command instructs `vmagent` to replicate data to `vm-az1` and `vm-az2` instances of VictoriaMetrics:
```console
```sh
/path/to/vmagent \
-remoteWrite.url=http://<vm-az1>:8428/api/v1/write \
-remoteWrite.url=http://<vm-az2>:8428/api/v1/write
@@ -1747,7 +1693,7 @@ If you use Prometheus for collecting and writing the data to VictoriaMetrics,
then the following [`remote_write`](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write) section
in Prometheus config can be used for replicating the collected data to `vm-az1` and `vm-az2` VictoriaMetrics instances:
```yml
```yaml
remote_write:
- url: http://<vm-az1>:8428/api/v1/write
- url: http://<vm-az2>:8428/api/v1/write
@@ -1771,6 +1717,10 @@ This aligns with the [staleness rules in Prometheus](https://prometheus.io/docs/
If multiple raw samples have **the same timestamp** on the given `-dedup.minScrapeInterval` discrete interval,
then the sample with **the biggest value** is kept.
[Prometheus stalenes markers](https://docs.victoriametrics.com/vmagent.html#prometheus-staleness-markers) are processed as any other value during de-duplication.
If raw sample with the biggest timestamp on `-dedup.minScrapeInterval` contains a stale marker, then it is kept after the deduplication.
This allows properly preserving staleness markers during the de-duplication.
Please note, [labels](https://docs.victoriametrics.com/keyConcepts.html#labels) of raw samples should be identical
in order to be deduplicated. For example, this is why [HA pair of vmagents](https://docs.victoriametrics.com/vmagent.html#high-availability)
needs to be identically configured.
@@ -1854,8 +1804,8 @@ This increases overhead during data querying, since VictoriaMetrics needs to rea
bigger number of parts per each request. That's why it is recommended to have at least 20%
of free disk space under directory pointed by `-storageDataPath` command-line flag.
Information about merging process is available in [the dashboard for single-node VictoriaMetrics](https://grafana.com/grafana/dashboards/10229-victoriametrics/)
and [the dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176-victoriametrics-cluster/).
Information about merging process is available in [the dashboard for single-node VictoriaMetrics](https://grafana.com/grafana/dashboards/10229)
and [the dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176).
See more details in [monitoring docs](#monitoring).
See [this article](https://valyala.medium.com/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) for more details.
@@ -1917,7 +1867,7 @@ command-line flag is applied to it. If series matches multiple configured retent
For example, the following config sets 3 days retention for time series with `team="juniors"` label,
30 days retention for time series with `env="dev"` or `env="staging"` label and 1 year retention for the remaining time series:
```
```sh
-retentionFilter='{team="juniors"}:3d' -retentionFilter='{env=~"dev|staging"}:30d' -retentionPeriod=1y
```
@@ -2012,8 +1962,10 @@ VictoriaMetrics provides the following security-related command-line flags:
with [HTTP Basic Authentication](https://en.wikipedia.org/wiki/Basic_access_authentication).
* `-deleteAuthKey` for protecting `/api/v1/admin/tsdb/delete_series` endpoint. See [how to delete time series](#how-to-delete-time-series).
* `-snapshotAuthKey` for protecting `/snapshot*` endpoints. See [how to work with snapshots](#how-to-work-with-snapshots).
* `-forceFlushAuthKey` for protecting `/internal/force_flush` endpoint. See [these docs](#troubleshooting).
* `-forceMergeAuthKey` for protecting `/internal/force_merge` endpoint. See [force merge docs](#forced-merge).
* `-search.resetCacheAuthKey` for protecting `/internal/resetRollupResultCache` endpoint. See [backfilling](#backfilling) for more details.
* `-reloadAuthKey` for protecting `/-/reload` endpoint, which is used for force reloading of [`-promscrape.config`](#how-to-scrape-prometheus-exporters-such-as-node-exporter).
* `-configAuthKey` for protecting `/config` endpoint, since it may contain sensitive information such as passwords.
* `-flagsAuthKey` for protecting `/flags` endpoint.
* `-pprofAuthKey` for protecting `/debug/pprof/*` endpoints, which can be used for [profiling](#profiling).
@@ -2043,7 +1995,7 @@ and [the general security page at VictoriaMetrics website](https://victoriametri
If you plan to store more than 1TB of data on `ext4` partition or plan extending it to more than 16TB,
then the following options are recommended to pass to `mkfs.ext4`:
```console
```sh
mkfs.ext4 ... -O 64bit,huge_file,extent -T huge
```
@@ -2057,9 +2009,9 @@ with 10 seconds interval.
_Please note, never use loadbalancer address for scraping metrics. All monitored components should be scraped directly by their address._
Official Grafana dashboards available for [single-node](https://grafana.com/grafana/dashboards/10229-victoriametrics/)
and [clustered](https://grafana.com/grafana/dashboards/11176-victoriametrics-cluster/) VictoriaMetrics.
See an [alternative dashboard for clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11831)
Official Grafana dashboards available for [single-node](https://grafana.com/grafana/dashboards/10229)
and [clustered](https://grafana.com/grafana/dashboards/11176) VictoriaMetrics.
See an [alternative dashboard for clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11831)
created by community.
Graphs on the dashboards contain useful hints - hover the `i` icon in the top left corner of each graph to read it.
@@ -2101,7 +2053,7 @@ In this case VictoriaMetrics puts query trace into `trace` field in the output J
For example, the following command:
```console
```sh
curl http://localhost:8428/api/v1/query_range -d 'query=2*rand()' -d 'start=-1h' -d 'step=1m' -d 'trace=1' | jq '.trace'
```
@@ -2287,17 +2239,20 @@ The following command-line flags are related to pushing metrics from VictoriaMet
The `-pushmetrics.url` can be specified multiple times. In this case metrics are pushed to all the specified urls.
The url can contain basic auth params in the form `http://user:pass@hostname/api/v1/import/prometheus`.
Metrics are pushed to the provided `-pushmetrics.url` in a compressed form with `Content-Encoding: gzip` request header.
This allows reducing the required network bandwidth for metrics push.
* `-pushmetrics.extraLabel` - labels to add to all the metrics before sending them to `-pushmetrics.url`. Each label must be specified in the format `label="value"`.
This allows reducing the required network bandwidth for metrics push. The compression can be disabled by passing `-pushmetrics.disableCompression` command-line flag.
* `-pushmetrics.extraLabel` - labels to add to all the metrics before sending them to every `-pushmetrics.url`. Each label must be specified in the format `label="value"`.
It is OK to specify multiple `-pushmetrics.extraLabel` command-line flags. In this case all the specified labels
are added to all the metrics before sending them to all the configured `-pushmetrics.url` addresses.
* `-pushmetrics.interval` - the interval between pushes. By default it is set to 10 seconds.
* `-pushmetrics.header` - an optional HTTP header to send to every `-pushmetrics.url`. For example, `-pushmetrics.header='Authorization: Basic foo'` instructs to send
`Authorization: Basic foo` HTTP header with every request to every `-pushmetrics.url`. It is possible to set multiple `-pushmetrics.header` command-line flags
for sending multiple different HTTP headers to `-pushmetrics.url`.
For example, the following command instructs VictoriaMetrics to push metrics from `/metrics` page to `https://maas.victoriametrics.com/api/v1/import/prometheus`
with `user:pass` [Basic auth](https://en.wikipedia.org/wiki/Basic_access_authentication). The `instance="foobar"` and `job="vm"` labels
are added to all the metrics before sending them to the remote storage:
```console
```sh
/path/to/victoria-metrics \
-pushmetrics.url=https://user:pass@maas.victoriametrics.com/api/v1/import/prometheus \
-pushmetrics.extraLabel='instance="foobar"' \
@@ -2325,8 +2280,8 @@ The following metrics for each type of cache are exported at [`/metrics` page](#
* `vm_cache_misses_total` - the number of cache misses
* `vm_cache_entries` - the number of entries in the cache
Both Grafana dashboards for [single-node VictoriaMetrics](https://grafana.com/grafana/dashboards/10229-victoriametrics/)
and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176-victoriametrics-cluster/)
Both Grafana dashboards for [single-node VictoriaMetrics](https://grafana.com/grafana/dashboards/10229)
and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176)
contain `Caches` section with cache metrics visualized. The panels show the current
memory usage by each type of cache, and also a cache hit rate. If hit rate is close to 100%
then cache efficiency is already very high and does not need any tuning.
@@ -2440,23 +2395,19 @@ VictoriaMetrics provides handlers for collecting the following [Go profiles](htt
* Memory profile. It can be collected with the following command (replace `0.0.0.0` with hostname if needed):
<div class="with-copy" markdown="1">
```console
```sh
curl http://0.0.0.0:8428/debug/pprof/heap > mem.pprof
```
</div>
* CPU profile. It can be collected with the following command (replace `0.0.0.0` with hostname if needed):
<div class="with-copy" markdown="1">
```console
```sh
curl http://0.0.0.0:8428/debug/pprof/profile > cpu.pprof
```
</div>
The command for collecting CPU profile waits for 30 seconds before returning.
@@ -2524,7 +2475,7 @@ If the page needs to have many images, consider using WEB-optimized image format
When adding a new doc with many images use `webp` format right away. Or use a Makefile command below to
convert already existing images at `docs` folder automatically to `web` format:
```console
```sh
make docs-images-to-webp
```
@@ -2564,26 +2515,28 @@ Files included in each folder:
Pass `-help` to VictoriaMetrics in order to see the list of supported command-line flags with their description:
```
```sh
-bigMergeConcurrency int
Deprecated: this flag does nothing. Please use -smallMergeConcurrency for controlling the concurrency of background merges. See https://docs.victoriametrics.com/#storage
Deprecated: this flag does nothing
-blockcache.missesBeforeCaching int
The number of cache misses before putting the block into cache. Higher values may reduce indexdb/dataBlocks cache size at the cost of higher CPU and disk read usage (default 2)
-cacheExpireDuration duration
Items are removed from in-memory caches after they aren't accessed for this duration. Lower values may reduce memory usage at the cost of higher CPU usage. See also -prevCacheRemovalPercent (default 30m0s)
-configAuthKey string
-configAuthKey value
Authorization key for accessing /config page. It must be passed via authKey query arg
Flag value can be read from the given file when using -configAuthKey=file:///abs/path/to/file or -configAuthKey=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -configAuthKey=http://host/path or -configAuthKey=https://host/path
-csvTrimTimestamp duration
Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-datadog.maxInsertRequestSize size
The maximum size in bytes of a single DataDog POST request to /api/v1/series
The maximum size in bytes of a single DataDog POST request to /datadog/api/v2/series
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 67108864)
-datadog.sanitizeMetricName
Sanitize metric names for the ingested DataDog data to comply with DataDog behaviour described at https://docs.datadoghq.com/metrics/custom_metrics/#naming-custom-metrics (default true)
-dedup.minScrapeInterval duration
Leave only the last sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling
-deleteAuthKey string
-deleteAuthKey value
authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries
Flag value can be read from the given file when using -deleteAuthKey=file:///abs/path/to/file or -deleteAuthKey=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -deleteAuthKey=http://host/path or -deleteAuthKey=https://host/path
-denyQueriesOutsideRetention
Whether to deny queries outside the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee
-denyQueryTracing
@@ -2604,13 +2557,16 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-filestream.disableFadvise
Whether to disable fadvise() syscall when reading large data files. The fadvise() syscall prevents from eviction of recently accessed data from OS page cache during background merges and backups. In some rare cases it is better to disable the syscall if it uses too much CPU
-finalMergeDelay duration
The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge
-flagsAuthKey string
Deprecated: this flag does nothing
-flagsAuthKey value
Auth key for /flags endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
-forceFlushAuthKey string
Flag value can be read from the given file when using -flagsAuthKey=file:///abs/path/to/file or -flagsAuthKey=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -flagsAuthKey=http://host/path or -flagsAuthKey=https://host/path
-forceFlushAuthKey value
authKey, which must be passed in query string to /internal/force_flush pages
-forceMergeAuthKey string
Flag value can be read from the given file when using -forceFlushAuthKey=file:///abs/path/to/file or -forceFlushAuthKey=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -forceFlushAuthKey=http://host/path or -forceFlushAuthKey=https://host/path
-forceMergeAuthKey value
authKey, which must be passed in query string to /internal/force_merge pages
Flag value can be read from the given file when using -forceMergeAuthKey=file:///abs/path/to/file or -forceMergeAuthKey=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -forceMergeAuthKey=http://host/path or -forceMergeAuthKey=https://host/path
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default, mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-graphiteListenAddr string
@@ -2637,8 +2593,9 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
-httpAuth.password value
Password for HTTP server's Basic Auth. The authentication is disabled if -httpAuth.username is empty
Flag value can be read from the given file when using -httpAuth.password=file:///abs/path/to/file or -httpAuth.password=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -httpAuth.password=http://host/path or -httpAuth.password=https://host/path
-httpAuth.username string
Username for HTTP server's Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
@@ -2705,7 +2662,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxConcurrentInserts int
The maximum number of concurrent insert requests. Default value should work for most cases, since it minimizes the memory usage. The default value can be increased when clients send data over slow networks. See also -insert.maxQueueDuration (default 32)
The maximum number of concurrent insert requests. Default value should work for most cases, since it minimizes the memory usage. The default value can be increased when clients send data over slow networks. See also -insert.maxQueueDuration.
-maxInsertRequestSize size
The maximum size in bytes of a single Prometheus remote_write API request
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 33554432)
@@ -2718,8 +2675,11 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from the OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
-metrics.exposeMetadata
Whether to expose TYPE and HELP metadata at the /metrics page, which is exposed at -httpListenAddr . The metadata may be needed when the /metrics page is consumed by systems, which require this information. For example, Managed Prometheus in Google Cloud - https://cloud.google.com/stackdriver/docs/managed-prometheus/troubleshooting#missing-metric-type
-metricsAuthKey value
Auth key for /metrics endpoint. It must be passed via authKey query arg. It overrides httpAuth.* settings
Flag value can be read from the given file when using -metricsAuthKey=file:///abs/path/to/file or -metricsAuthKey=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -metricsAuthKey=http://host/path or -metricsAuthKey=https://host/path
-newrelic.maxInsertRequestSize size
The maximum size in bytes of a single NewRelic request to /newrelic/infra/v2/metrics/events/bulk
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 67108864)
@@ -2738,8 +2698,9 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 33554432)
-opentsdbhttpTrimTimestamp duration
Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-pprofAuthKey string
-pprofAuthKey value
Auth key for /debug/pprof/* endpoints. It must be passed via authKey query arg. It overrides httpAuth.* settings
Flag value can be read from the given file when using -pprofAuthKey=file:///abs/path/to/file or -pprofAuthKey=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -pprofAuthKey=http://host/path or -pprofAuthKey=https://host/path
-precisionBits int
The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64)
-prevCacheRemovalPercent float
@@ -2802,6 +2763,8 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Interval for checking for changes in http endpoint service discovery. This works only if http_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#http_sd_configs for details (default 1m0s)
-promscrape.kubernetes.apiServerTimeout duration
How frequently to reload the full state from Kubernetes API server (default 30m0s)
-promscrape.kubernetes.attachNodeMetadataAll
Whether to set attach_metadata.node=true for all the kubernetes_sd_configs at -promscrape.config . It is possible to set attach_metadata.node=false individually per each kubernetes_sd_configs . See https://docs.victoriametrics.com/sd_configs.html#kubernetes_sd_configs
-promscrape.kubernetesSDCheckInterval duration
Interval for checking for changes in Kubernetes API server. This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#kubernetes_sd_configs for details (default 30s)
-promscrape.kumaSDCheckInterval duration
@@ -2837,16 +2800,24 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
The delay for suppressing repeated scrape errors logging per each scrape targets. This may be used for reducing the number of log lines related to scrape errors. See also -promscrape.suppressScrapeErrors
-promscrape.yandexcloudSDCheckInterval duration
Interval for checking for changes in Yandex Cloud API. This works only if yandexcloud_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#yandexcloud_sd_configs for details (default 30s)
-pushmetrics.disableCompression
Whether to disable request body compression when pushing metrics to every -pushmetrics.url
-pushmetrics.extraLabel array
Optional labels to add to metrics pushed to -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to -pushmetrics.url
Optional labels to add to metrics pushed to every -pushmetrics.url . For example, -pushmetrics.extraLabel='instance="foo"' adds instance="foo" label to all the metrics pushed to every -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
-pushmetrics.header array
Optional HTTP request header to send to every -pushmetrics.url . For example, -pushmetrics.header='Authorization: Basic foobar' adds 'Authorization: Basic foobar' header to every request to every -pushmetrics.url
Supports an array of values separated by comma or specified via multiple flags.
-pushmetrics.interval duration
Interval for pushing metrics to -pushmetrics.url (default 10s)
Interval for pushing metrics to every -pushmetrics.url (default 10s)
-pushmetrics.url array
Optional URL to push metrics exposed at /metrics page. See https://docs.victoriametrics.com/#push-metrics . By default, metrics exposed at /metrics page aren't pushed to any remote storage
Supports an array of values separated by comma or specified via multiple flags.
-relabelConfig string
Optional path to a file with relabeling rules, which are applied to all the ingested metrics. The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal
-reloadAuthKey value
Auth key for /-/reload http endpoint. It must be passed as authKey=...
Flag value can be read from the given file when using -reloadAuthKey=file:///abs/path/to/file or -reloadAuthKey=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -reloadAuthKey=http://host/path or -reloadAuthKey=https://host/path
-retentionFilter array
Retention filter in the format 'filter:retention'. For example, '{env="dev"}:3d' configures the retention for time series with env="dev" label to 3 days. See https://docs.victoriametrics.com/#retention-filters for details. This flag is available only in VictoriaMetrics enterprise. See https://docs.victoriametrics.com/enterprise.html
Supports an array of values separated by comma or specified via multiple flags.
@@ -2941,8 +2912,9 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Query stats for /api/v1/status/top_queries is tracked on this number of last queries. Zero value disables query stats tracking (default 20000)
-search.queryStats.minQueryDuration duration
The minimum duration for queries to track in query stats at /api/v1/status/top_queries. Queries with lower duration are ignored in query stats (default 1ms)
-search.resetCacheAuthKey string
-search.resetCacheAuthKey value
Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call
Flag value can be read from the given file when using -search.resetCacheAuthKey=file:///abs/path/to/file or -search.resetCacheAuthKey=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -search.resetCacheAuthKey=http://host/path or -search.resetCacheAuthKey=https://host/path
-search.setLookbackToStep
Whether to fix lookback interval to 'step' query arg value. If set to true, the query model becomes closer to InfluxDB data model. If set to true, then -search.maxLookback and -search.maxStalenessInterval are ignored
-search.treatDotsAsIsInRegexps
@@ -2954,9 +2926,10 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-selfScrapeJob string
Value for 'job' label, which is added to self-scraped metrics (default "victoria-metrics")
-smallMergeConcurrency int
The maximum number of workers for background merges. See https://docs.victoriametrics.com/#storage . It isn't recommended tuning this flag in general case, since this may lead to uncontrolled increase in the number of parts and increased CPU usage during queries
-snapshotAuthKey string
Deprecated: this flag does nothing
-snapshotAuthKey value
authKey, which must be passed in query string to /snapshot* pages
Flag value can be read from the given file when using -snapshotAuthKey=file:///abs/path/to/file or -snapshotAuthKey=file://./relative/path/to/file . Flag value can be read from the given http/https url when using -snapshotAuthKey=http://host/path or -snapshotAuthKey=https://host/path
-snapshotCreateTimeout duration
The timeout for creating new snapshot. If set, make sure that timeout is lower than backup period
-snapshotsMaxAge value
@@ -3012,4 +2985,6 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
Optional URL for proxying requests to vmalert. For example, if -vmalert.proxyURL=http://vmalert:8880 , then alerting API requests such as /api/v1/rules from Grafana will be proxied to http://vmalert:8880/api/v1/rules
-vmui.customDashboardsPath string
Optional path to vmui dashboards. See https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmui/packages/vmui/public/dashboards
-vmui.defaultTimezone string
The default timezone to be used in vmui. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local. See https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmui#timezone-configuration
```

View File

@@ -37,7 +37,6 @@ func main() {
cgroup.SetGOGC(*gogc)
buildinfo.Init()
logger.Init()
pushmetrics.Init()
logger.Infof("starting VictoriaLogs at %q...", *httpListenAddr)
startTime := time.Now()
@@ -49,8 +48,10 @@ func main() {
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, requestHandler)
logger.Infof("started VictoriaLogs in %.3f seconds; see https://docs.victoriametrics.com/VictoriaLogs/", time.Since(startTime).Seconds())
pushmetrics.Init()
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
pushmetrics.Stop()
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
startTime = time.Now()

View File

@@ -48,7 +48,6 @@ func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
if promscrape.IsDryRun() {
*dryRun = true
@@ -74,13 +73,16 @@ func main() {
vmstorage.Init(promql.ResetRollupResultCacheIfNeeded)
vmselect.Init()
vminsert.Init()
startSelfScraper()
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, requestHandler)
logger.Infof("started VictoriaMetrics in %.3f seconds", time.Since(startTime).Seconds())
pushmetrics.Init()
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
pushmetrics.Stop()
stopSelfScraper()
@@ -89,8 +91,8 @@ func main() {
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
vminsert.Stop()
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
vminsert.Stop()
vmstorage.Stop()
vmselect.Stop()

View File

@@ -12,6 +12,7 @@ import (
"os"
"path/filepath"
"reflect"
"strconv"
"strings"
"testing"
"time"
@@ -54,15 +55,14 @@ var (
)
type test struct {
Name string `json:"name"`
Data []string `json:"data"`
InsertQuery string `json:"insert_query"`
Query []string `json:"query"`
ResultMetrics []Metric `json:"result_metrics"`
ResultSeries Series `json:"result_series"`
ResultQuery Query `json:"result_query"`
ResultQueryRange QueryRange `json:"result_query_range"`
Issue string `json:"issue"`
Name string `json:"name"`
Data []string `json:"data"`
InsertQuery string `json:"insert_query"`
Query []string `json:"query"`
ResultMetrics []Metric `json:"result_metrics"`
ResultSeries Series `json:"result_series"`
ResultQuery Query `json:"result_query"`
Issue string `json:"issue"`
}
type Metric struct {
@@ -80,42 +80,90 @@ type Series struct {
Status string `json:"status"`
Data []map[string]string `json:"data"`
}
type Query struct {
Status string `json:"status"`
Data QueryData `json:"data"`
}
type QueryData struct {
ResultType string `json:"resultType"`
Result []QueryDataResult `json:"result"`
Status string `json:"status"`
Data struct {
ResultType string `json:"resultType"`
Result json.RawMessage `json:"result"`
} `json:"data"`
}
type QueryDataResult struct {
Metric map[string]string `json:"metric"`
Value []interface{} `json:"value"`
const rtVector, rtMatrix = "vector", "matrix"
func (q *Query) metrics() ([]Metric, error) {
switch q.Data.ResultType {
case rtVector:
var r QueryInstant
if err := json.Unmarshal(q.Data.Result, &r.Result); err != nil {
return nil, err
}
return r.metrics()
case rtMatrix:
var r QueryRange
if err := json.Unmarshal(q.Data.Result, &r.Result); err != nil {
return nil, err
}
return r.metrics()
default:
return nil, fmt.Errorf("unknown result type %q", q.Data.ResultType)
}
}
func (r *QueryDataResult) UnmarshalJSON(b []byte) error {
type plain QueryDataResult
return json.Unmarshal(testutil.PopulateTimeTpl(b, insertionTime), (*plain)(r))
type QueryInstant struct {
Result []struct {
Labels map[string]string `json:"metric"`
TV [2]interface{} `json:"value"`
} `json:"result"`
}
func (q QueryInstant) metrics() ([]Metric, error) {
result := make([]Metric, len(q.Result))
for i, res := range q.Result {
f, err := strconv.ParseFloat(res.TV[1].(string), 64)
if err != nil {
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %w", res, res.TV[1], err)
}
var m Metric
m.Metric = res.Labels
m.Timestamps = append(m.Timestamps, int64(res.TV[0].(float64)))
m.Values = append(m.Values, f)
result[i] = m
}
return result, nil
}
type QueryRange struct {
Status string `json:"status"`
Data QueryRangeData `json:"data"`
}
type QueryRangeData struct {
ResultType string `json:"resultType"`
Result []QueryRangeDataResult `json:"result"`
Result []struct {
Metric map[string]string `json:"metric"`
Values [][]interface{} `json:"values"`
} `json:"result"`
}
type QueryRangeDataResult struct {
Metric map[string]string `json:"metric"`
Values [][]interface{} `json:"values"`
func (q QueryRange) metrics() ([]Metric, error) {
var result []Metric
for i, res := range q.Result {
var m Metric
for _, tv := range res.Values {
f, err := strconv.ParseFloat(tv[1].(string), 64)
if err != nil {
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %w", res, tv[1], err)
}
m.Values = append(m.Values, f)
m.Timestamps = append(m.Timestamps, int64(tv[0].(float64)))
}
if len(m.Values) < 1 || len(m.Timestamps) < 1 {
return nil, fmt.Errorf("metric %v contains no values", res)
}
m.Metric = q.Result[i].Metric
result = append(result, m)
}
return result, nil
}
func (r *QueryRangeDataResult) UnmarshalJSON(b []byte) error {
type plain QueryRangeDataResult
return json.Unmarshal(testutil.PopulateTimeTpl(b, insertionTime), (*plain)(r))
func (q *Query) UnmarshalJSON(b []byte) error {
type plain Query
return json.Unmarshal(testutil.PopulateTimeTpl(b, insertionTime), (*plain)(q))
}
func TestMain(m *testing.M) {
@@ -197,6 +245,9 @@ func TestWriteRead(t *testing.T) {
func testWrite(t *testing.T) {
t.Run("prometheus", func(t *testing.T) {
for _, test := range readIn("prometheus", t, insertionTime) {
if test.Data == nil {
continue
}
s := newSuite(t)
r := testutil.WriteRequest{}
s.noError(json.Unmarshal([]byte(strings.Join(test.Data, "\n")), &r.Timeseries))
@@ -272,17 +323,19 @@ func testRead(t *testing.T) {
if err := checkSeriesResult(s, test.ResultSeries); err != nil {
t.Fatalf("Series. %s fails with error %s.%s", q, err, test.Issue)
}
case strings.HasPrefix(q, "/api/v1/query_range"):
queryResult := QueryRange{}
httpReadStruct(t, testReadHTTPPath, q, &queryResult)
if err := checkQueryRangeResult(queryResult, test.ResultQueryRange); err != nil {
t.Fatalf("Query Range. %s fails with error %s.%s", q, err, test.Issue)
}
case strings.HasPrefix(q, "/api/v1/query"):
queryResult := Query{}
httpReadStruct(t, testReadHTTPPath, q, &queryResult)
if err := checkQueryResult(queryResult, test.ResultQuery); err != nil {
t.Fatalf("Query. %s fails with error: %s.%s", q, err, test.Issue)
gotMetrics, err := queryResult.metrics()
if err != nil {
t.Fatalf("failed to parse query response: %s", err)
}
expMetrics, err := test.ResultQuery.metrics()
if err != nil {
t.Fatalf("failed to parse expected response: %s", err)
}
if err := checkMetricsResult(gotMetrics, expMetrics); err != nil {
t.Fatalf("%q fails with error %s.%s", q, err, test.Issue)
}
default:
t.Fatalf("unsupported read query %s", q)
@@ -417,60 +470,6 @@ func removeIfFoundSeries(r map[string]string, contains []map[string]string) []ma
return contains
}
func checkQueryResult(got, want Query) error {
if got.Status != want.Status {
return fmt.Errorf("status mismatch %q - %q", want.Status, got.Status)
}
if got.Data.ResultType != want.Data.ResultType {
return fmt.Errorf("result type mismatch %q - %q", want.Data.ResultType, got.Data.ResultType)
}
wantData := append([]QueryDataResult(nil), want.Data.Result...)
for _, r := range got.Data.Result {
wantData = removeIfFoundQueryData(r, wantData)
}
if len(wantData) > 0 {
return fmt.Errorf("expected query result %+v not found in %+v", wantData, got.Data.Result)
}
return nil
}
func removeIfFoundQueryData(r QueryDataResult, contains []QueryDataResult) []QueryDataResult {
for i, item := range contains {
if reflect.DeepEqual(r.Metric, item.Metric) && reflect.DeepEqual(r.Value[0], item.Value[0]) && reflect.DeepEqual(r.Value[1], item.Value[1]) {
contains[i] = contains[len(contains)-1]
return contains[:len(contains)-1]
}
}
return contains
}
func checkQueryRangeResult(got, want QueryRange) error {
if got.Status != want.Status {
return fmt.Errorf("status mismatch %q - %q", want.Status, got.Status)
}
if got.Data.ResultType != want.Data.ResultType {
return fmt.Errorf("result type mismatch %q - %q", want.Data.ResultType, got.Data.ResultType)
}
wantData := append([]QueryRangeDataResult(nil), want.Data.Result...)
for _, r := range got.Data.Result {
wantData = removeIfFoundQueryRangeData(r, wantData)
}
if len(wantData) > 0 {
return fmt.Errorf("expected query range result %+v not found in %+v", wantData, got.Data.Result)
}
return nil
}
func removeIfFoundQueryRangeData(r QueryRangeDataResult, contains []QueryRangeDataResult) []QueryRangeDataResult {
for i, item := range contains {
if reflect.DeepEqual(r.Metric, item.Metric) && reflect.DeepEqual(r.Values, item.Values) {
contains[i] = contains[len(contains)-1]
return contains[:len(contains)-1]
}
}
return contains
}
type suite struct{ t *testing.T }
func newSuite(t *testing.T) *suite { return &suite{t: t} }

View File

@@ -98,7 +98,7 @@ func addLabel(dst []prompb.Label, key, value string) []prompb.Label {
dst = append(dst, prompb.Label{})
}
lb := &dst[len(dst)-1]
lb.Name = bytesutil.ToUnsafeBytes(key)
lb.Value = bytesutil.ToUnsafeBytes(value)
lb.Name = key
lb.Value = value
return dst
}

View File

@@ -7,7 +7,7 @@
"not_nan_not_inf;item=y 3 {TIME_S-1m}",
"not_nan_not_inf;item=y 1 {TIME_S-2m}"],
"query": ["/api/v1/query_range?query=1/(not_nan_not_inf-1)!=inf!=nan&start={TIME_S-3m}&end={TIME_S}&step=60"],
"result_query_range": {
"result_query": {
"status":"success",
"data":{"resultType":"matrix",
"result":[

View File

@@ -6,7 +6,7 @@
"empty_label_match;foo=bar 2 {TIME_S-1m}",
"empty_label_match;foo=baz 3 {TIME_S-1m}"],
"query": ["/api/v1/query_range?query=empty_label_match{foo=~'bar|'}&start={TIME_S-1m}&end={TIME_S}&step=60"],
"result_query_range": {
"result_query": {
"status":"success",
"data":{"resultType":"matrix",
"result":[

View File

@@ -8,7 +8,7 @@
"max_lookback_set 4 {TIME_S-150s}"
],
"query": ["/api/v1/query_range?query=max_lookback_set&start={TIME_S-150s}&end={TIME_S}&step=10s&max_lookback=1s"],
"result_query_range": {
"result_query": {
"status":"success",
"data":{"resultType":"matrix",
"result":[{"metric":{"__name__":"max_lookback_set"},"values":[

View File

@@ -8,7 +8,7 @@
"max_lookback_unset 4 {TIME_S-150s}"
],
"query": ["/api/v1/query_range?query=max_lookback_unset&start={TIME_S-150s}&end={TIME_S}&step=10s"],
"result_query_range": {
"result_query": {
"status":"success",
"data":{"resultType":"matrix",
"result":[{"metric":{"__name__":"max_lookback_unset"},"values":[

View File

@@ -8,7 +8,7 @@
"not_nan_as_missing_data;item=y 3 {TIME_S-1m}"
],
"query": ["/api/v1/query_range?query=not_nan_as_missing_data>1&start={TIME_S-2m}&end={TIME_S}&step=60"],
"result_query_range": {
"result_query": {
"status":"success",
"data":{"resultType":"matrix",
"result":[

View File

@@ -0,0 +1,12 @@
{
"name": "instant query with look-behind window",
"data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"foo\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS-60s}\"}]}]"],
"query": ["/api/v1/query?query=foo[5m]"],
"result_query": {
"status": "success",
"data":{
"resultType":"matrix",
"result":[{"metric":{"__name__":"foo"},"values":[["{TIME_S-60s}", "1"]]}]
}
}
}

View File

@@ -0,0 +1,11 @@
{
"name": "instant scalar query",
"query": ["/api/v1/query?query=42&time={TIME_S}"],
"result_query": {
"status": "success",
"data":{
"resultType":"vector",
"result":[{"metric":{},"value":["{TIME_S}", "42"]}]
}
}
}

View File

@@ -0,0 +1,13 @@
{
"name": "too big look-behind window",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553",
"data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"foo\"},{\"name\":\"issue\",\"value\":\"5553\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS-60s}\"}]}]"],
"query": ["/api/v1/query?query=foo{issue=\"5553\"}[100y]"],
"result_query": {
"status": "success",
"data":{
"resultType":"matrix",
"result":[{"metric":{"__name__":"foo", "issue": "5553"},"values":[["{TIME_S-60s}", "1"]]}]
}
}
}

View File

@@ -0,0 +1,18 @@
{
"name": "query range",
"issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553",
"data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"bar\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS-60s}\"}, {\"value\":2,\"timestamp\":\"{TIME_MS-120s}\"}, {\"value\":1,\"timestamp\":\"{TIME_MS-180s}\"}]}]"],
"query": ["/api/v1/query_range?query=bar&step=30s&start={TIME_MS-180s}"],
"result_query": {
"status": "success",
"data":{
"resultType":"matrix",
"result":[
{
"metric":{"__name__":"bar"},
"values":[["{TIME_S-180s}", "1"],["{TIME_S-150s}", "1"],["{TIME_S-120s}", "2"],["{TIME_S-90s}", "2"], ["{TIME_S-60s}", "1"], ["{TIME_S-30s}", "1"], ["{TIME_S}", "1"]]
}
]
}
}
}

View File

@@ -1,4 +1,4 @@
package datadog
package datadogv1
import (
"net/http"
@@ -8,33 +8,32 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadog"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadog/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogv1"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogv1/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="datadog"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="datadog"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="datadog"}`)
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="datadogv1"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="datadogv1"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="datadogv1"}`)
)
// InsertHandlerForHTTP processes remote write for DataDog POST /api/v1/series request.
//
// See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
ce := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, ce, func(series []datadog.Series) error {
return stream.Parse(req.Body, ce, func(series []datadogv1.Series) error {
return insertRows(at, series, extraLabels)
})
}
func insertRows(at *auth.Token, series []datadog.Series, extraLabels []prompbmarshal.Label) error {
func insertRows(at *auth.Token, series []datadogv1.Series, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -63,7 +62,7 @@ func insertRows(at *auth.Token, series []datadog.Series, extraLabels []prompbmar
})
}
for _, tag := range ss.Tags {
name, value := datadog.SplitTag(tag)
name, value := datadogutils.SplitTag(tag)
if name == "host" {
name = "exported_host"
}

View File

@@ -0,0 +1,102 @@
package datadogv2
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogv2"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogv2/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="datadogv2"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="datadogv2"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="datadogv2"}`)
)
// InsertHandlerForHTTP processes remote write for DataDog POST /api/v2/series request.
//
// See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
ct := req.Header.Get("Content-Type")
ce := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, ce, ct, func(series []datadogv2.Series) error {
return insertRows(at, series, extraLabels)
})
}
func insertRows(at *auth.Token, series []datadogv2.Series, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range series {
ss := &series[i]
rowsTotal += len(ss.Points)
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: ss.Metric,
})
for _, rs := range ss.Resources {
labels = append(labels, prompbmarshal.Label{
Name: rs.Type,
Value: rs.Name,
})
}
if ss.SourceTypeName != "" {
labels = append(labels, prompbmarshal.Label{
Name: "source_type_name",
Value: ss.SourceTypeName,
})
}
for _, tag := range ss.Tags {
name, value := datadogutils.SplitTag(tag)
if name == "host" {
name = "exported_host"
}
labels = append(labels, prompbmarshal.Label{
Name: name,
Value: value,
})
}
labels = append(labels, extraLabels...)
samplesLen := len(samples)
for _, pt := range ss.Points {
samples = append(samples, prompbmarshal.Sample{
Timestamp: pt.Timestamp * 1000,
Value: pt.Value,
})
}
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
}
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

View File

@@ -12,7 +12,8 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/datadog"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/datadogv1"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/datadogv2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/native"
@@ -68,7 +69,8 @@ var (
"See also -opentsdbHTTPListenAddr.useProxyProtocol")
opentsdbHTTPUseProxyProtocol = flag.Bool("opentsdbHTTPListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted "+
"at -opentsdbHTTPListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt")
configAuthKey = flag.String("configAuthKey", "", "Authorization key for accessing /config page. It must be passed via authKey query arg")
configAuthKey = flagutil.NewPassword("configAuthKey", "Authorization key for accessing /config page. It must be passed via authKey query arg")
reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed as authKey=...")
dryRun = flag.Bool("dryRun", false, "Whether to check config files without running vmagent. The following files are checked: "+
"-promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig, -remoteWrite.streamAggr.config . "+
"Unknown config entries aren't allowed in -promscrape.config by default. This can be changed by passing -promscrape.config.strictParse=false command-line flag")
@@ -95,7 +97,6 @@ func main() {
remotewrite.InitSecretFlags()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
if promscrape.IsDryRun() {
if err := promscrape.CheckConfig(); err != nil {
@@ -146,8 +147,10 @@ func main() {
}
logger.Infof("started vmagent in %.3f seconds", time.Since(startTime).Seconds())
pushmetrics.Init()
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
pushmetrics.Stop()
startTime = time.Now()
if len(*httpListenAddr) > 0 {
@@ -343,9 +346,20 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "/datadog/api/v1/series":
datadogWriteRequests.Inc()
if err := datadog.InsertHandlerForHTTP(nil, r); err != nil {
datadogWriteErrors.Inc()
datadogv1WriteRequests.Inc()
if err := datadogv1.InsertHandlerForHTTP(nil, r); err != nil {
datadogv1WriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "/datadog/api/v2/series":
datadogv2WriteRequests.Inc()
if err := datadogv2.InsertHandlerForHTTP(nil, r); err != nil {
datadogv2WriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
@@ -408,7 +422,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
return true
case "/prometheus/config", "/config":
if !httpserver.CheckAuthFlag(w, r, *configAuthKey, "configAuthKey") {
if !httpserver.CheckAuthFlag(w, r, configAuthKey.Get(), "configAuthKey") {
return true
}
promscrapeConfigRequests.Inc()
@@ -417,7 +431,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
case "/prometheus/api/v1/status/config", "/api/v1/status/config":
// See https://prometheus.io/docs/prometheus/latest/querying/api/#config
if !httpserver.CheckAuthFlag(w, r, *configAuthKey, "configAuthKey") {
if !httpserver.CheckAuthFlag(w, r, configAuthKey.Get(), "configAuthKey") {
return true
}
promscrapeStatusConfigRequests.Inc()
@@ -427,6 +441,9 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%q}}`, bb.B)
return true
case "/prometheus/-/reload", "/-/reload":
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey.Get(), "reloadAuthKey") {
return true
}
promscrapeConfigReloadRequests.Inc()
procutil.SelfSIGHUP()
w.WriteHeader(http.StatusOK)
@@ -566,9 +583,19 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "datadog/api/v1/series":
datadogWriteRequests.Inc()
if err := datadog.InsertHandlerForHTTP(at, r); err != nil {
datadogWriteErrors.Inc()
datadogv1WriteRequests.Inc()
if err := datadogv1.InsertHandlerForHTTP(at, r); err != nil {
datadogv1WriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "datadog/api/v2/series":
datadogv2WriteRequests.Inc()
if err := datadogv2.InsertHandlerForHTTP(at, r); err != nil {
datadogv2WriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
@@ -626,8 +653,11 @@ var (
influxQueryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/influx/query", protocol="influx"}`)
datadogWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/series", protocol="datadog"}`)
datadogWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/datadog/api/v1/series", protocol="datadog"}`)
datadogv1WriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/series", protocol="datadog"}`)
datadogv1WriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/datadog/api/v1/series", protocol="datadog"}`)
datadogv2WriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v2/series", protocol="datadog"}`)
datadogv2WriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/datadog/api/v2/series", protocol="datadog"}`)
datadogValidateRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/validate", protocol="datadog"}`)
datadogCheckRunRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/check_run", protocol="datadog"}`)

View File

@@ -6,7 +6,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
@@ -48,8 +47,8 @@ func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, extraLabels []pr
for i := range ts.Labels {
label := &ts.Labels[i]
labels = append(labels, prompbmarshal.Label{
Name: bytesutil.ToUnsafeString(label.Name),
Value: bytesutil.ToUnsafeString(label.Value),
Name: label.Name,
Value: label.Value,
})
}
labels = append(labels, extraLabels...)

View File

@@ -18,6 +18,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timerpool"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeutil"
"github.com/VictoriaMetrics/metrics"
)
@@ -58,8 +59,10 @@ var (
oauth2ClientID = flagutil.NewArrayString("remoteWrite.oauth2.clientID", "Optional OAuth2 clientID to use for the corresponding -remoteWrite.url")
oauth2ClientSecret = flagutil.NewArrayString("remoteWrite.oauth2.clientSecret", "Optional OAuth2 clientSecret to use for the corresponding -remoteWrite.url")
oauth2ClientSecretFile = flagutil.NewArrayString("remoteWrite.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for the corresponding -remoteWrite.url")
oauth2TokenURL = flagutil.NewArrayString("remoteWrite.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for the corresponding -remoteWrite.url")
oauth2Scopes = flagutil.NewArrayString("remoteWrite.oauth2.scopes", "Optional OAuth2 scopes to use for the corresponding -remoteWrite.url. Scopes must be delimited by ';'")
oauth2EndpointParams = flagutil.NewArrayString("remoteWrite.oauth2.endpointParams", "Optional OAuth2 endpoint parameters to use for the corresponding -remoteWrite.url . "+
`The endpoint parameters must be set in JSON format: {"param1":"value1",...,"paramN":"valueN"}`)
oauth2TokenURL = flagutil.NewArrayString("remoteWrite.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for the corresponding -remoteWrite.url")
oauth2Scopes = flagutil.NewArrayString("remoteWrite.oauth2.scopes", "Optional OAuth2 scopes to use for the corresponding -remoteWrite.url. Scopes must be delimited by ';'")
awsUseSigv4 = flagutil.NewArrayBool("remoteWrite.aws.useSigv4", "Enables SigV4 request signing for the corresponding -remoteWrite.url. "+
"It is expected that other -remoteWrite.aws.* command-line flags are set if sigv4 request signing is enabled")
@@ -234,10 +237,16 @@ func getAuthConfig(argIdx int) (*promauth.Config, error) {
clientSecret := oauth2ClientSecret.GetOptionalArg(argIdx)
clientSecretFile := oauth2ClientSecretFile.GetOptionalArg(argIdx)
if clientSecretFile != "" || clientSecret != "" {
endpointParamsJSON := oauth2EndpointParams.GetOptionalArg(argIdx)
endpointParams, err := flagutil.ParseJSONMap(endpointParamsJSON)
if err != nil {
return nil, fmt.Errorf("cannot parse JSON for -remoteWrite.oauth2.endpointParams=%s: %w", endpointParamsJSON, err)
}
oauth2Cfg = &promauth.OAuth2Config{
ClientID: oauth2ClientID.GetOptionalArg(argIdx),
ClientSecret: promauth.NewSecret(clientSecret),
ClientSecretFile: clientSecretFile,
EndpointParams: endpointParams,
TokenURL: oauth2TokenURL.GetOptionalArg(argIdx),
Scopes: strings.Split(oauth2Scopes.GetOptionalArg(argIdx), ";"),
}
@@ -387,7 +396,8 @@ func (c *client) newRequest(url string, body []byte) (*http.Request, error) {
// Otherwise it tries sending the block to remote storage indefinitely.
func (c *client) sendBlockHTTP(block []byte) bool {
c.rl.register(len(block), c.stopCh)
retryDuration := time.Second
maxRetryDuration := timeutil.AddJitterToDuration(time.Minute)
retryDuration := timeutil.AddJitterToDuration(time.Second)
retriesCount := 0
again:
@@ -397,8 +407,8 @@ again:
if err != nil {
c.errorsCount.Inc()
retryDuration *= 2
if retryDuration > time.Minute {
retryDuration = time.Minute
if retryDuration > maxRetryDuration {
retryDuration = maxRetryDuration
}
logger.Warnf("couldn't send a block with size %d bytes to %q: %s; re-sending the block in %.3f seconds",
len(block), c.sanitizedURL, err, retryDuration.Seconds())
@@ -444,8 +454,8 @@ again:
// Unexpected status code returned
retriesCount++
retryDuration *= 2
if retryDuration > time.Minute {
retryDuration = time.Minute
if retryDuration > maxRetryDuration {
retryDuration = maxRetryDuration
}
body, err := io.ReadAll(resp.Body)
_ = resp.Body.Close()

View File

@@ -7,6 +7,7 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding/zstd"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
@@ -15,6 +16,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeutil"
"github.com/VictoriaMetrics/metrics"
"github.com/golang/snappy"
)
@@ -69,7 +71,8 @@ func (ps *pendingSeries) periodicFlusher() {
if flushSeconds <= 0 {
flushSeconds = 1
}
ticker := time.NewTicker(*flushInterval)
d := timeutil.AddJitterToDuration(*flushInterval)
ticker := time.NewTicker(d)
defer ticker.Stop()
for {
select {
@@ -107,11 +110,12 @@ type writeRequest struct {
wr prompbmarshal.WriteRequest
tss []prompbmarshal.TimeSeries
tss []prompbmarshal.TimeSeries
labels []prompbmarshal.Label
samples []prompbmarshal.Sample
buf []byte
// buf holds labels data
buf []byte
}
func (wr *writeRequest) reset() {
@@ -222,33 +226,45 @@ func (wr *writeRequest) copyTimeSeries(dst, src *prompbmarshal.TimeSeries) {
wr.buf = buf
}
// marshalConcurrency limits the maximum number of concurrent workers, which marshal and compress WriteRequest.
var marshalConcurrencyCh = make(chan struct{}, cgroup.AvailableCPUs())
func tryPushWriteRequest(wr *prompbmarshal.WriteRequest, tryPushBlock func(block []byte) bool, isVMRemoteWrite bool) bool {
if len(wr.Timeseries) == 0 {
// Nothing to push
return true
}
marshalConcurrencyCh <- struct{}{}
bb := writeRequestBufPool.Get()
bb.B = prompbmarshal.MarshalWriteRequest(bb.B[:0], wr)
bb.B = wr.MarshalProtobuf(bb.B[:0])
if len(bb.B) <= maxUnpackedBlockSize.IntN() {
zb := snappyBufPool.Get()
zb := compressBufPool.Get()
if isVMRemoteWrite {
zb.B = zstd.CompressLevel(zb.B[:0], bb.B, *vmProtoCompressLevel)
} else {
zb.B = snappy.Encode(zb.B[:cap(zb.B)], bb.B)
}
writeRequestBufPool.Put(bb)
<-marshalConcurrencyCh
if len(zb.B) <= persistentqueue.MaxBlockSize {
if !tryPushBlock(zb.B) {
return false
zbLen := len(zb.B)
ok := tryPushBlock(zb.B)
compressBufPool.Put(zb)
if ok {
blockSizeRows.Update(float64(len(wr.Timeseries)))
blockSizeBytes.Update(float64(zbLen))
}
blockSizeRows.Update(float64(len(wr.Timeseries)))
blockSizeBytes.Update(float64(len(zb.B)))
snappyBufPool.Put(zb)
return true
return ok
}
snappyBufPool.Put(zb)
compressBufPool.Put(zb)
} else {
writeRequestBufPool.Put(bb)
<-marshalConcurrencyCh
}
// Too big block. Recursively split it into smaller parts if possible.
@@ -294,5 +310,7 @@ var (
blockSizeRows = metrics.NewHistogram(`vmagent_remotewrite_block_size_rows`)
)
var writeRequestBufPool bytesutil.ByteBufferPool
var snappyBufPool bytesutil.ByteBufferPool
var (
writeRequestBufPool bytesutil.ByteBufferPool
compressBufPool bytesutil.ByteBufferPool
)

View File

@@ -43,7 +43,7 @@ func testPushWriteRequest(t *testing.T, rowsCount, expectedBlockLenProm, expecte
}
// Check Prometheus remote write
f(false, expectedBlockLenProm, 0)
f(false, expectedBlockLenProm, 3)
// Check VictoriaMetrics remote write
f(true, expectedBlockLenVM, 15)

View File

@@ -4,7 +4,6 @@ import (
"fmt"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/golang/snappy"
"github.com/klauspost/compress/s2"
)
@@ -22,7 +21,7 @@ func benchmarkCompressWriteRequest(b *testing.B, compressFunc func(dst, src []by
for _, rowsCount := range []int{1, 10, 100, 1e3, 1e4} {
b.Run(fmt.Sprintf("rows_%d", rowsCount), func(b *testing.B) {
wr := newTestWriteRequest(rowsCount, 10)
data := prompbmarshal.MarshalWriteRequest(nil, wr)
data := wr.MarshalProtobuf(nil)
b.ReportAllocs()
b.SetBytes(int64(rowsCount))
b.RunParallel(func(pb *testing.PB) {

View File

@@ -276,7 +276,7 @@ func reloadRelabelConfigs() {
var (
relabelConfigReloads = metrics.NewCounter(`vmagent_relabel_config_reloads_total`)
relabelConfigReloadErrors = metrics.NewCounter(`vmagent_relabel_config_reloads_errors_total`)
relabelConfigSuccess = metrics.NewCounter(`vmagent_relabel_config_last_reload_successful`)
relabelConfigSuccess = metrics.NewGauge(`vmagent_relabel_config_last_reload_successful`, nil)
relabelConfigTimestamp = metrics.NewCounter(`vmagent_relabel_config_last_reload_success_timestamp_seconds`)
)

View File

@@ -68,6 +68,7 @@ publish-vmalert:
test-vmalert:
go test -v -race -cover ./app/vmalert -loggerLevel=ERROR
go test -v -race -cover ./app/vmalert/rule
go test -v -race -cover ./app/vmalert/templates
go test -v -race -cover ./app/vmalert/datasource
go test -v -race -cover ./app/vmalert/notifier

View File

@@ -22,6 +22,7 @@ groups:
{{ . | first | value }}
{{ end }}
description: "It is {{ $value }} connections for {{$labels.instance}}"
link: http://localhost:3000/d/wNf0q_kZk?viewPanel=51&from={{($activeAt.Add (parseDurationTime "1h")).UnixMilli}}&to={{($activeAt.Add (parseDurationTime "-1h")).UnixMilli}}
- alert: ExampleAlertAlwaysFiring
update_entries_limit: -1
expr: sum by(job)

View File

@@ -37,11 +37,13 @@ var (
tlsCAFile = flag.String("datasource.tlsCAFile", "", `Optional path to TLS CA file to use for verifying connections to -datasource.url. By default, system CA is used`)
tlsServerName = flag.String("datasource.tlsServerName", "", `Optional TLS server name to use for connections to -datasource.url. By default, the server name from -datasource.url is used`)
oauth2ClientID = flag.String("datasource.oauth2.clientID", "", "Optional OAuth2 clientID to use for -datasource.url. ")
oauth2ClientSecret = flag.String("datasource.oauth2.clientSecret", "", "Optional OAuth2 clientSecret to use for -datasource.url.")
oauth2ClientSecretFile = flag.String("datasource.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -datasource.url. ")
oauth2TokenURL = flag.String("datasource.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -datasource.url.")
oauth2Scopes = flag.String("datasource.oauth2.scopes", "", "Optional OAuth2 scopes to use for -datasource.url. Scopes must be delimited by ';'")
oauth2ClientID = flag.String("datasource.oauth2.clientID", "", "Optional OAuth2 clientID to use for -datasource.url")
oauth2ClientSecret = flag.String("datasource.oauth2.clientSecret", "", "Optional OAuth2 clientSecret to use for -datasource.url")
oauth2ClientSecretFile = flag.String("datasource.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -datasource.url")
oauth2EndpointParams = flag.String("datasource.oauth2.endpointParams", "", "Optional OAuth2 endpoint parameters to use for -datasource.url . "+
`The endpoint parameters must be set in JSON format: {"param1":"value1",...,"paramN":"valueN"}`)
oauth2TokenURL = flag.String("datasource.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -datasource.url")
oauth2Scopes = flag.String("datasource.oauth2.scopes", "", "Optional OAuth2 scopes to use for -datasource.url. Scopes must be delimited by ';'")
lookBack = flag.Duration("datasource.lookback", 0, `Will be deprecated soon, please adjust "-search.latencyOffset" at datasource side `+
`or specify "latency_offset" in rule group's params. Lookback defines how far into the past to look when evaluating queries. `+
@@ -108,10 +110,14 @@ func Init(extraParams url.Values) (QuerierBuilder, error) {
extraParams.Set("round_digits", fmt.Sprintf("%d", *roundDigits))
}
endpointParams, err := flagutil.ParseJSONMap(*oauth2EndpointParams)
if err != nil {
return nil, fmt.Errorf("cannot parse JSON for -datasource.oauth2.endpointParams=%s: %w", *oauth2EndpointParams, err)
}
authCfg, err := utils.AuthConfig(
utils.WithBasicAuth(*basicAuthUsername, *basicAuthPassword, *basicAuthPasswordFile),
utils.WithBearer(*bearerToken, *bearerTokenFile),
utils.WithOAuth(*oauth2ClientID, *oauth2ClientSecret, *oauth2ClientSecretFile, *oauth2TokenURL, *oauth2Scopes),
utils.WithOAuth(*oauth2ClientID, *oauth2ClientSecret, *oauth2ClientSecretFile, *oauth2TokenURL, *oauth2Scopes, endpointParams),
utils.WithHeaders(*headers))
if err != nil {
return nil, fmt.Errorf("failed to configure auth: %w", err)

View File

@@ -96,7 +96,6 @@ func main() {
notifier.InitSecretFlags()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
if !*remoteReadIgnoreRestoreErrors {
logger.Warnf("flag `remoteRead.ignoreRestoreErrors` is deprecated and will be removed in next releases.")
@@ -118,9 +117,9 @@ func main() {
return
}
eu, err := getExternalURL(*externalURL, *httpListenAddr, httpserver.IsTLS())
eu, err := getExternalURL(*externalURL)
if err != nil {
logger.Fatalf("failed to init `external.url`: %s", err)
logger.Fatalf("failed to init `-external.url`: %s", err)
}
alertURLGeneratorFn, err = getAlertURLGenerator(eu, *externalAlertSource, *validateTemplates)
@@ -182,8 +181,11 @@ func main() {
rh := &requestHandler{m: manager}
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, rh.handler)
pushmetrics.Init()
sig := procutil.WaitForSigterm()
logger.Infof("service received signal %s", sig)
pushmetrics.Stop()
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
@@ -194,7 +196,7 @@ func main() {
var (
configReloads = metrics.NewCounter(`vmalert_config_last_reload_total`)
configReloadErrors = metrics.NewCounter(`vmalert_config_last_reload_errors_total`)
configSuccess = metrics.NewCounter(`vmalert_config_last_reload_successful`)
configSuccess = metrics.NewGauge(`vmalert_config_last_reload_successful`, nil)
configTimestamp = metrics.NewCounter(`vmalert_config_last_reload_success_timestamp_seconds`)
)
@@ -243,14 +245,26 @@ func newManager(ctx context.Context) (*manager, error) {
return manager, nil
}
func getExternalURL(externalURL, httpListenAddr string, isSecure bool) (*url.URL, error) {
if externalURL != "" {
return url.Parse(externalURL)
func getExternalURL(customURL string) (*url.URL, error) {
if customURL == "" {
// use local hostname as external URL
return getHostnameAsExternalURL(*httpListenAddr, httpserver.IsTLS())
}
hname, err := os.Hostname()
u, err := url.Parse(customURL)
if err != nil {
return nil, err
}
if u.Scheme != "http" && u.Scheme != "https" {
return nil, fmt.Errorf("invalid scheme %q in url %q, only 'http' and 'https' are supported", u.Scheme, u.String())
}
return u, nil
}
func getHostnameAsExternalURL(httpListenAddr string, isSecure bool) (*url.URL, error) {
hname, err := os.Hostname()
if err != nil {
return nil, fmt.Errorf("failed to get hostname: %w", err)
}
port := ""
if ipport := strings.Split(httpListenAddr, ":"); len(ipport) > 1 {
port = ":" + ipport[1]

View File

@@ -22,22 +22,29 @@ func init() {
}
func TestGetExternalURL(t *testing.T) {
expURL := "https://vicotriametrics.com/path"
u, err := getExternalURL(expURL, "", false)
invalidURL := "victoriametrics.com/path"
_, err := getExternalURL(invalidURL)
if err == nil {
t.Errorf("expected error, got nil")
}
expURL := "https://victoriametrics.com/path"
u, err := getExternalURL(expURL)
if err != nil {
t.Errorf("unexpected error %s", err)
}
if u.String() != expURL {
t.Errorf("unexpected url want %s, got %s", expURL, u.String())
t.Errorf("unexpected url: want %q, got %s", expURL, u.String())
}
h, _ := os.Hostname()
expURL = fmt.Sprintf("https://%s:4242", h)
u, err = getExternalURL("", "0.0.0.0:4242", true)
expURL = fmt.Sprintf("http://%s:8880", h)
u, err = getExternalURL("")
if err != nil {
t.Errorf("unexpected error %s", err)
}
if u.String() != expURL {
t.Errorf("unexpected url want %s, got %s", expURL, u.String())
t.Errorf("unexpected url: want %s, got %s", expURL, u.String())
}
}
@@ -134,7 +141,7 @@ groups:
t.Fatalf("expected to have config error %s; got nil instead", cErr)
}
if cfgSuc != 0 {
t.Fatalf("expected to have metric configSuccess to be set to 0; got %d instead", cfgSuc)
t.Fatalf("expected to have metric configSuccess to be set to 0; got %v instead", cfgSuc)
}
return
}
@@ -143,7 +150,7 @@ groups:
t.Fatalf("unexpected config error: %s", cErr)
}
if cfgSuc != 1 {
t.Fatalf("expected to have metric configSuccess to be set to 1; got %d instead", cfgSuc)
t.Fatalf("expected to have metric configSuccess to be set to 1; got %v instead", cfgSuc)
}
}

View File

@@ -156,11 +156,14 @@ func (m *manager) update(ctx context.Context, groupsCfg []config.Group, restore
var wg sync.WaitGroup
for _, item := range toUpdate {
wg.Add(1)
// cancel evaluation so the Update will be applied as fast as possible.
// it is important to call InterruptEval before the update, because cancel fn
// can be re-assigned during the update.
item.old.InterruptEval()
go func(old *rule.Group, new *rule.Group) {
old.UpdateWith(new)
wg.Done()
}(item.old, item.new)
item.old.InterruptEval()
}
wg.Wait()
}

View File

@@ -144,7 +144,7 @@ func NewAlertManager(alertManagerURL string, fn AlertURLGenerator, authCfg proma
aCfg, err := utils.AuthConfig(
utils.WithBasicAuth(ba.Username, ba.Password.String(), ba.PasswordFile),
utils.WithBearer(authCfg.BearerToken.String(), authCfg.BearerTokenFile),
utils.WithOAuth(oauth.ClientID, oauth.ClientSecretFile, oauth.ClientSecretFile, oauth.TokenURL, strings.Join(oauth.Scopes, ";")))
utils.WithOAuth(oauth.ClientID, oauth.ClientSecretFile, oauth.ClientSecretFile, oauth.TokenURL, strings.Join(oauth.Scopes, ";"), oauth.EndpointParams))
if err != nil {
return nil, fmt.Errorf("failed to configure auth: %w", err)
}

View File

@@ -46,6 +46,8 @@ var (
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2ClientSecretFile = flagutil.NewArrayString("notifier.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for -notifier.url. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2EndpointParams = flagutil.NewArrayString("notifier.oauth2.endpointParams", "Optional OAuth2 endpoint parameters to use for the corresponding -notifier.url . "+
`The endpoint parameters must be set in JSON format: {"param1":"value1",...,"paramN":"valueN"}`)
oauth2TokenURL = flagutil.NewArrayString("notifier.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for -notifier.url. "+
"If multiple args are set, then they are applied independently for the corresponding -notifier.url")
oauth2Scopes = flagutil.NewArrayString("notifier.oauth2.scopes", "Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. "+
@@ -141,6 +143,11 @@ func InitSecretFlags() {
func notifiersFromFlags(gen AlertURLGenerator) ([]Notifier, error) {
var notifiers []Notifier
for i, addr := range *addrs {
endpointParamsJSON := oauth2EndpointParams.GetOptionalArg(i)
endpointParams, err := flagutil.ParseJSONMap(endpointParamsJSON)
if err != nil {
return nil, fmt.Errorf("cannot parse JSON for -notifier.oauth2.endpointParams=%s: %w", endpointParamsJSON, err)
}
authCfg := promauth.HTTPClientConfig{
TLSConfig: &promauth.TLSConfig{
CAFile: tlsCAFile.GetOptionalArg(i),
@@ -160,6 +167,7 @@ func notifiersFromFlags(gen AlertURLGenerator) ([]Notifier, error) {
ClientID: oauth2ClientID.GetOptionalArg(i),
ClientSecret: promauth.NewSecret(oauth2ClientSecret.GetOptionalArg(i)),
ClientSecretFile: oauth2ClientSecretFile.GetOptionalArg(i),
EndpointParams: endpointParams,
Scopes: strings.Split(oauth2Scopes.GetOptionalArg(i), ";"),
TokenURL: oauth2TokenURL.GetOptionalArg(i),
},

View File

@@ -41,8 +41,10 @@ var (
oauth2ClientID = flag.String("remoteRead.oauth2.clientID", "", "Optional OAuth2 clientID to use for -remoteRead.url.")
oauth2ClientSecret = flag.String("remoteRead.oauth2.clientSecret", "", "Optional OAuth2 clientSecret to use for -remoteRead.url.")
oauth2ClientSecretFile = flag.String("remoteRead.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -remoteRead.url.")
oauth2TokenURL = flag.String("remoteRead.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -remoteRead.url. ")
oauth2Scopes = flag.String("remoteRead.oauth2.scopes", "", "Optional OAuth2 scopes to use for -remoteRead.url. Scopes must be delimited by ';'.")
oauth2EndpointParams = flag.String("remoteRead.oauth2.endpointParams", "", "Optional OAuth2 endpoint parameters to use for -remoteRead.url . "+
`The endpoint parameters must be set in JSON format: {"param1":"value1",...,"paramN":"valueN"}`)
oauth2TokenURL = flag.String("remoteRead.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -remoteRead.url. ")
oauth2Scopes = flag.String("remoteRead.oauth2.scopes", "", "Optional OAuth2 scopes to use for -remoteRead.url. Scopes must be delimited by ';'.")
)
// InitSecretFlags must be called after flag.Parse and before any logging
@@ -63,10 +65,14 @@ func Init() (datasource.QuerierBuilder, error) {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
endpointParams, err := flagutil.ParseJSONMap(*oauth2EndpointParams)
if err != nil {
return nil, fmt.Errorf("cannot parse JSON for -remoteRead.oauth2.endpointParams=%s: %w", *oauth2EndpointParams, err)
}
authCfg, err := utils.AuthConfig(
utils.WithBasicAuth(*basicAuthUsername, *basicAuthPassword, *basicAuthPasswordFile),
utils.WithBearer(*bearerToken, *bearerTokenFile),
utils.WithOAuth(*oauth2ClientID, *oauth2ClientSecret, *oauth2ClientSecretFile, *oauth2TokenURL, *oauth2Scopes),
utils.WithOAuth(*oauth2ClientID, *oauth2ClientSecret, *oauth2ClientSecretFile, *oauth2TokenURL, *oauth2Scopes, endpointParams),
utils.WithHeaders(*headers))
if err != nil {
return nil, fmt.Errorf("failed to configure auth: %w", err)

View File

@@ -123,14 +123,12 @@ func (c *Client) Push(s prompbmarshal.TimeSeries) error {
case <-c.doneCh:
rwErrors.Inc()
droppedRows.Add(len(s.Samples))
droppedBytes.Add(s.Size())
return fmt.Errorf("client is closed")
case c.input <- s:
return nil
default:
rwErrors.Inc()
droppedRows.Add(len(s.Samples))
droppedBytes.Add(s.Size())
return fmt.Errorf("failed to push timeseries - queue is full (%d entries). "+
"Queue size is controlled by -remoteWrite.maxQueueSize flag",
c.maxQueueSize)
@@ -195,7 +193,6 @@ var (
sentRows = metrics.NewCounter(`vmalert_remotewrite_sent_rows_total`)
sentBytes = metrics.NewCounter(`vmalert_remotewrite_sent_bytes_total`)
droppedRows = metrics.NewCounter(`vmalert_remotewrite_dropped_rows_total`)
droppedBytes = metrics.NewCounter(`vmalert_remotewrite_dropped_bytes_total`)
sendDuration = metrics.NewFloatCounter(`vmalert_remotewrite_send_duration_seconds_total`)
bufferFlushDuration = metrics.NewHistogram(`vmalert_remotewrite_flush_duration_seconds`)
@@ -211,15 +208,10 @@ func (c *Client) flush(ctx context.Context, wr *prompbmarshal.WriteRequest) {
if len(wr.Timeseries) < 1 {
return
}
defer prompbmarshal.ResetWriteRequest(wr)
defer wr.Reset()
defer bufferFlushDuration.UpdateDuration(time.Now())
data, err := wr.Marshal()
if err != nil {
logger.Errorf("failed to marshal WriteRequest: %s", err)
return
}
data := wr.MarshalProtobuf(nil)
b := snappy.Encode(nil, data)
retryInterval, maxRetryInterval := *retryMinInterval, *retryMaxTime
@@ -276,8 +268,11 @@ L:
}
rwErrors.Inc()
droppedRows.Add(len(wr.Timeseries))
droppedBytes.Add(len(b))
rows := 0
for _, ts := range wr.Timeseries {
rows += len(ts.Samples)
}
droppedRows.Add(rows)
logger.Errorf("attempts to send remote-write request failed - dropping %d time series",
len(wr.Timeseries))
}

View File

@@ -140,7 +140,7 @@ func (rw *rwServer) handler(w http.ResponseWriter, r *http.Request) {
return
}
wr := &prompb.WriteRequest{}
if err := wr.Unmarshal(b); err != nil {
if err := wr.UnmarshalProtobuf(b); err != nil {
rw.err(w, fmt.Errorf("unmarhsal err: %w", err))
return
}

View File

@@ -49,10 +49,7 @@ func (c *DebugClient) Push(s prompbmarshal.TimeSeries) error {
c.wg.Add(1)
defer c.wg.Done()
wr := &prompbmarshal.WriteRequest{Timeseries: []prompbmarshal.TimeSeries{s}}
data, err := wr.Marshal()
if err != nil {
return fmt.Errorf("failed to marshal the given time series: %w", err)
}
data := wr.MarshalProtobuf(nil)
return c.send(data)
}

View File

@@ -41,11 +41,13 @@ var (
tlsServerName = flag.String("remoteWrite.tlsServerName", "", "Optional TLS server name to use for connections to -remoteWrite.url. "+
"By default, the server name from -remoteWrite.url is used")
oauth2ClientID = flag.String("remoteWrite.oauth2.clientID", "", "Optional OAuth2 clientID to use for -remoteWrite.url.")
oauth2ClientSecret = flag.String("remoteWrite.oauth2.clientSecret", "", "Optional OAuth2 clientSecret to use for -remoteWrite.url.")
oauth2ClientSecretFile = flag.String("remoteWrite.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -remoteWrite.url.")
oauth2TokenURL = flag.String("remoteWrite.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -notifier.url.")
oauth2Scopes = flag.String("remoteWrite.oauth2.scopes", "", "Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'.")
oauth2ClientID = flag.String("remoteWrite.oauth2.clientID", "", "Optional OAuth2 clientID to use for -remoteWrite.url")
oauth2ClientSecret = flag.String("remoteWrite.oauth2.clientSecret", "", "Optional OAuth2 clientSecret to use for -remoteWrite.url")
oauth2ClientSecretFile = flag.String("remoteWrite.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -remoteWrite.url")
oauth2EndpointParams = flag.String("remoteWrite.oauth2.endpointParams", "", "Optional OAuth2 endpoint parameters to use for -remoteWrite.url . "+
`The endpoint parameters must be set in JSON format: {"param1":"value1",...,"paramN":"valueN"}`)
oauth2TokenURL = flag.String("remoteWrite.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -notifier.url.")
oauth2Scopes = flag.String("remoteWrite.oauth2.scopes", "", "Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'.")
)
// InitSecretFlags must be called after flag.Parse and before any logging
@@ -67,10 +69,14 @@ func Init(ctx context.Context) (*Client, error) {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
endpointParams, err := flagutil.ParseJSONMap(*oauth2EndpointParams)
if err != nil {
return nil, fmt.Errorf("cannot parse JSON for -remoteWrite.oauth2.endpointParams=%s: %w", *oauth2EndpointParams, err)
}
authCfg, err := utils.AuthConfig(
utils.WithBasicAuth(*basicAuthUsername, *basicAuthPassword, *basicAuthPasswordFile),
utils.WithBearer(*bearerToken, *bearerTokenFile),
utils.WithOAuth(*oauth2ClientID, *oauth2ClientSecret, *oauth2ClientSecretFile, *oauth2TokenURL, *oauth2Scopes),
utils.WithOAuth(*oauth2ClientID, *oauth2ClientSecret, *oauth2ClientSecretFile, *oauth2TokenURL, *oauth2Scopes, endpointParams),
utils.WithHeaders(*headers))
if err != nil {
return nil, fmt.Errorf("failed to configure auth: %w", err)

View File

@@ -237,11 +237,30 @@ type labelSet struct {
origin map[string]string
// processed labels includes origin labels
// plus extra labels (group labels, service labels like alertNameLabel).
// in case of conflicts, extra labels are preferred.
// in case of key conflicts, origin labels are renamed with prefix `exported_` and extra labels are preferred.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5161
// used as labels attached to notifier.Alert and ALERTS series written to remote storage.
processed map[string]string
}
// add adds a value v with key k to origin and processed label sets.
// On k conflicts in processed set, the passed v is preferred.
// On k conflicts in origin set, the original value is preferred and copied
// to processed with `exported_%k` key. The copy happens only if passed v isn't equal to origin[k] value.
func (ls *labelSet) add(k, v string) {
ls.processed[k] = v
ov, ok := ls.origin[k]
if !ok {
ls.origin[k] = v
return
}
if ov != v {
// copy value only if v and ov are different
key := fmt.Sprintf("exported_%s", k)
ls.processed[key] = ov
}
}
// toLabels converts labels from given Metric
// to labelSet which contains original and processed labels.
func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*labelSet, error) {
@@ -267,24 +286,14 @@ func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*l
return nil, fmt.Errorf("failed to expand labels: %w", err)
}
for k, v := range extraLabels {
ls.processed[k] = v
if _, ok := ls.origin[k]; !ok {
ls.origin[k] = v
}
ls.add(k, v)
}
// set additional labels to identify group and rule name
if ar.Name != "" {
ls.processed[alertNameLabel] = ar.Name
if _, ok := ls.origin[alertNameLabel]; !ok {
ls.origin[alertNameLabel] = ar.Name
}
ls.add(alertNameLabel, ar.Name)
}
if !*disableAlertGroupLabel && ar.GroupName != "" {
ls.processed[alertGroupNameLabel] = ar.GroupName
if _, ok := ls.origin[alertGroupNameLabel]; !ok {
ls.origin[alertGroupNameLabel] = ar.GroupName
}
ls.add(alertGroupNameLabel, ar.GroupName)
}
return ls, nil
}
@@ -315,16 +324,6 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
return nil, fmt.Errorf("failed to create alert: %w", err)
}
// if alert is instant, For: 0
if ar.For == 0 {
a.State = notifier.StateFiring
for i := range s.Values {
result = append(result, ar.alertToTimeSeries(a, s.Timestamps[i])...)
}
continue
}
// if alert with For > 0
prevT := time.Time{}
for i := range s.Values {
at := time.Unix(s.Timestamps[i], 0)
@@ -345,6 +344,10 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
a.Start = at
}
prevT = at
if ar.For == 0 {
// rules with `for: 0` are always firing when they have Value
a.State = notifier.StateFiring
}
result = append(result, ar.alertToTimeSeries(a, s.Timestamps[i])...)
// save alert's state on last iteration, so it can be used on the next execRange call
@@ -414,8 +417,7 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
}
h := hash(ls.processed)
if _, ok := updated[h]; ok {
// duplicate may be caused by extra labels
// conflicting with the metric labels
// duplicate may be caused the removal of `__name__` label
curState.Err = fmt.Errorf("labels %v: %w", ls.processed, errDuplicate)
return nil, curState.Err
}
@@ -438,14 +440,13 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
a.KeepFiringSince = time.Time{}
continue
}
a, err := ar.newAlert(m, ls, start, qFn)
a, err := ar.newAlert(m, ls, ts, qFn)
if err != nil {
curState.Err = fmt.Errorf("failed to create alert: %w", err)
return nil, curState.Err
}
a.ID = h
a.State = notifier.StatePending
a.ActiveAt = ts
ar.alerts[h] = a
ar.logDebugf(ts, a, "created in state PENDING")
}
@@ -471,7 +472,7 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
}
// alerts with ar.KeepFiringFor>0 may remain FIRING
// even if their expression isn't true anymore
if ts.Sub(a.KeepFiringSince) > ar.KeepFiringFor {
if ts.Sub(a.KeepFiringSince) >= ar.KeepFiringFor {
a.State = notifier.StateInactive
a.ResolvedAt = ts
ar.logDebugf(ts, a, "FIRING => INACTIVE: is absent in current evaluation round")
@@ -551,9 +552,9 @@ func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.T
}
const (
// alertMetricName is the metric name for synthetic alert timeseries.
// alertMetricName is the metric name for time series reflecting the alert state.
alertMetricName = "ALERTS"
// alertForStateMetricName is the metric name for 'for' state of alert.
// alertForStateMetricName is the metric name for time series reflecting the moment of time when alert became active.
alertForStateMetricName = "ALERTS_FOR_STATE"
// alertNameLabel is the label name indicating the name of an alert.
@@ -568,12 +569,10 @@ const (
// alertToTimeSeries converts the given alert with the given timestamp to time series
func (ar *AlertingRule) alertToTimeSeries(a *notifier.Alert, timestamp int64) []prompbmarshal.TimeSeries {
var tss []prompbmarshal.TimeSeries
tss = append(tss, alertToTimeSeries(a, timestamp))
if ar.For > 0 {
tss = append(tss, alertForToTimeSeries(a, timestamp))
return []prompbmarshal.TimeSeries{
alertToTimeSeries(a, timestamp),
alertForToTimeSeries(a, timestamp),
}
return tss
}
func alertToTimeSeries(a *notifier.Alert, timestamp int64) prompbmarshal.TimeSeries {

View File

@@ -28,20 +28,26 @@ func TestAlertingRule_ToTimeSeries(t *testing.T) {
}{
{
newTestAlertingRule("instant", 0),
&notifier.Alert{State: notifier.StateFiring},
&notifier.Alert{State: notifier.StateFiring, ActiveAt: timestamp.Add(time.Second)},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
alertStateLabel: notifier.StateFiring.String(),
}),
newTimeSeries([]float64{float64(timestamp.Add(time.Second).Unix())},
[]int64{timestamp.UnixNano()},
map[string]string{
"__name__": alertForStateMetricName,
}),
},
},
{
newTestAlertingRule("instant extra labels", 0),
&notifier.Alert{State: notifier.StateFiring, Labels: map[string]string{
"job": "foo",
"instance": "bar",
}},
&notifier.Alert{State: notifier.StateFiring, ActiveAt: timestamp.Add(time.Second),
Labels: map[string]string{
"job": "foo",
"instance": "bar",
}},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
@@ -49,19 +55,33 @@ func TestAlertingRule_ToTimeSeries(t *testing.T) {
"job": "foo",
"instance": "bar",
}),
newTimeSeries([]float64{float64(timestamp.Add(time.Second).Unix())},
[]int64{timestamp.UnixNano()},
map[string]string{
"__name__": alertForStateMetricName,
"job": "foo",
"instance": "bar",
}),
},
},
{
newTestAlertingRule("instant labels override", 0),
&notifier.Alert{State: notifier.StateFiring, Labels: map[string]string{
alertStateLabel: "foo",
"__name__": "bar",
}},
&notifier.Alert{State: notifier.StateFiring, ActiveAt: timestamp.Add(time.Second),
Labels: map[string]string{
alertStateLabel: "foo",
"__name__": "bar",
}},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
alertStateLabel: notifier.StateFiring.String(),
}),
newTimeSeries([]float64{float64(timestamp.Add(time.Second).Unix())},
[]int64{timestamp.UnixNano()},
map[string]string{
"__name__": alertForStateMetricName,
alertStateLabel: "foo",
}),
},
},
{
@@ -308,14 +328,17 @@ func TestAlertingRule_Exec(t *testing.T) {
fq := &datasource.FakeQuerier{}
tc.rule.q = fq
tc.rule.GroupID = fakeGroup.ID()
ts := time.Now()
for i, step := range tc.steps {
fq.Reset()
fq.Add(step...)
if _, err := tc.rule.exec(context.TODO(), time.Now(), 0); err != nil {
if _, err := tc.rule.exec(context.TODO(), ts, 0); err != nil {
t.Fatalf("unexpected err: %s", err)
}
// artificial delay between applying steps
time.Sleep(defaultStep)
// shift the execution timestamp before the next iteration
ts = ts.Add(defaultStep)
if _, ok := tc.expAlerts[i]; !ok {
continue
}
@@ -367,7 +390,7 @@ func TestAlertingRule_ExecRange(t *testing.T) {
{Values: []float64{1}, Timestamps: []int64{1}},
},
[]*notifier.Alert{
{State: notifier.StateFiring},
{State: notifier.StateFiring, ActiveAt: time.Unix(1, 0)},
},
nil,
},
@@ -378,8 +401,9 @@ func TestAlertingRule_ExecRange(t *testing.T) {
},
[]*notifier.Alert{
{
Labels: map[string]string{"name": "foo"},
State: notifier.StateFiring,
Labels: map[string]string{"name": "foo"},
State: notifier.StateFiring,
ActiveAt: time.Unix(1, 0),
},
},
nil,
@@ -390,9 +414,9 @@ func TestAlertingRule_ExecRange(t *testing.T) {
{Values: []float64{1, 1, 1}, Timestamps: []int64{1e3, 2e3, 3e3}},
},
[]*notifier.Alert{
{State: notifier.StateFiring},
{State: notifier.StateFiring},
{State: notifier.StateFiring},
{State: notifier.StateFiring, ActiveAt: time.Unix(1e3, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(2e3, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(3e3, 0)},
},
nil,
},
@@ -460,6 +484,20 @@ func TestAlertingRule_ExecRange(t *testing.T) {
For: time.Second,
}},
},
{
newTestAlertingRuleWithEvalInterval("firing=>inactive=>inactive=>firing=>firing", 0, time.Second),
[]datasource.Metric{
{Values: []float64{1, 1, 1, 1}, Timestamps: []int64{1, 4, 5, 6}},
},
[]*notifier.Alert{
{State: notifier.StateFiring, ActiveAt: time.Unix(1, 0)},
// It is expected for ActiveAT to remain the same while rule continues to fire in each iteration
{State: notifier.StateFiring, ActiveAt: time.Unix(4, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(4, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(4, 0)},
},
nil,
},
{
newTestAlertingRule("for=>pending=>firing=>pending=>firing=>pending", time.Second),
[]datasource.Metric{
@@ -534,21 +572,25 @@ func TestAlertingRule_ExecRange(t *testing.T) {
},
},
[]*notifier.Alert{
{State: notifier.StateFiring, Labels: map[string]string{
"source": "vm",
}},
{State: notifier.StateFiring, Labels: map[string]string{
"source": "vm",
}},
{State: notifier.StateFiring, ActiveAt: time.Unix(1, 0),
Labels: map[string]string{
"source": "vm",
}},
{State: notifier.StateFiring, ActiveAt: time.Unix(100, 0),
Labels: map[string]string{
"source": "vm",
}},
//
{State: notifier.StateFiring, Labels: map[string]string{
"foo": "bar",
"source": "vm",
}},
{State: notifier.StateFiring, Labels: map[string]string{
"foo": "bar",
"source": "vm",
}},
{State: notifier.StateFiring, ActiveAt: time.Unix(1, 0),
Labels: map[string]string{
"foo": "bar",
"source": "vm",
}},
{State: notifier.StateFiring, ActiveAt: time.Unix(5, 0),
Labels: map[string]string{
"foo": "bar",
"source": "vm",
}},
},
nil,
},
@@ -768,14 +810,16 @@ func TestAlertingRule_Exec_Negative(t *testing.T) {
ar.q = fq
// successful attempt
// label `job` will be overridden by rule extra label, the original value will be reserved by "exported_job"
fq.Add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "bar"))
fq.Add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "baz"))
_, err := ar.exec(context.TODO(), time.Now(), 0)
if err != nil {
t.Fatal(err)
}
// label `job` will collide with rule extra label and will make both time series equal
fq.Add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "baz"))
// label `__name__` will be omitted and get duplicated results here
fq.Add(metricWithValueAndLabels(t, 1, "__name__", "foo_1", "job", "bar"))
_, err = ar.exec(context.TODO(), time.Now(), 0)
if !errors.Is(err, errDuplicate) {
t.Fatalf("expected to have %s error; got %s", errDuplicate, err)
@@ -899,20 +943,22 @@ func TestAlertingRule_Template(t *testing.T) {
metricWithValueAndLabels(t, 10, "__name__", "second", "instance", "bar", alertNameLabel, "override"),
},
map[uint64]*notifier.Alert{
hash(map[string]string{alertNameLabel: "override label", "instance": "foo"}): {
hash(map[string]string{alertNameLabel: "override label", "exported_alertname": "override", "instance": "foo"}): {
Labels: map[string]string{
alertNameLabel: "override label",
"instance": "foo",
alertNameLabel: "override label",
"exported_alertname": "override",
"instance": "foo",
},
Annotations: map[string]string{
"summary": `first: Too high connection number for "foo"`,
"description": `override: It is 2 connections for "foo"`,
},
},
hash(map[string]string{alertNameLabel: "override label", "instance": "bar"}): {
hash(map[string]string{alertNameLabel: "override label", "exported_alertname": "override", "instance": "bar"}): {
Labels: map[string]string{
alertNameLabel: "override label",
"instance": "bar",
alertNameLabel: "override label",
"exported_alertname": "override",
"instance": "bar",
},
Annotations: map[string]string{
"summary": `second: Too high connection number for "bar"`,
@@ -941,14 +987,18 @@ func TestAlertingRule_Template(t *testing.T) {
},
map[uint64]*notifier.Alert{
hash(map[string]string{
alertNameLabel: "OriginLabels",
alertGroupNameLabel: "Testing",
"instance": "foo",
alertNameLabel: "OriginLabels",
"exported_alertname": "originAlertname",
alertGroupNameLabel: "Testing",
"exported_alertgroup": "originGroupname",
"instance": "foo",
}): {
Labels: map[string]string{
alertNameLabel: "OriginLabels",
alertGroupNameLabel: "Testing",
"instance": "foo",
alertNameLabel: "OriginLabels",
"exported_alertname": "originAlertname",
alertGroupNameLabel: "Testing",
"exported_alertgroup": "originGroupname",
"instance": "foo",
},
Annotations: map[string]string{
"summary": `Alert "originAlertname(originGroupname)" for instance foo`,
@@ -1087,8 +1137,65 @@ func newTestAlertingRule(name string, waitFor time.Duration) *AlertingRule {
return &rule
}
func newTestAlertingRuleWithEvalInterval(name string, waitFor, evalInterval time.Duration) *AlertingRule {
rule := newTestAlertingRule(name, waitFor)
rule.EvalInterval = evalInterval
return rule
}
func newTestAlertingRuleWithKeepFiring(name string, waitFor, keepFiringFor time.Duration) *AlertingRule {
rule := newTestAlertingRule(name, waitFor)
rule.KeepFiringFor = keepFiringFor
return rule
}
func TestAlertingRule_ToLabels(t *testing.T) {
metric := datasource.Metric{
Labels: []datasource.Label{
{Name: "instance", Value: "0.0.0.0:8800"},
{Name: "group", Value: "vmalert"},
{Name: "alertname", Value: "ConfigurationReloadFailure"},
},
Values: []float64{1},
Timestamps: []int64{time.Now().UnixNano()},
}
ar := &AlertingRule{
Labels: map[string]string{
"instance": "override", // this should override instance with new value
"group": "vmalert", // this shouldn't have effect since value in metric is equal
},
Expr: "sum(vmalert_alerting_rules_error) by(instance, group, alertname) > 0",
Name: "AlertingRulesError",
GroupName: "vmalert",
}
expectedOriginLabels := map[string]string{
"instance": "0.0.0.0:8800",
"group": "vmalert",
"alertname": "ConfigurationReloadFailure",
"alertgroup": "vmalert",
}
expectedProcessedLabels := map[string]string{
"instance": "override",
"exported_instance": "0.0.0.0:8800",
"alertname": "AlertingRulesError",
"exported_alertname": "ConfigurationReloadFailure",
"group": "vmalert",
"alertgroup": "vmalert",
}
ls, err := ar.toLabels(metric, nil)
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if !reflect.DeepEqual(ls.origin, expectedOriginLabels) {
t.Errorf("origin labels mismatch, got: %v, want: %v", ls.origin, expectedOriginLabels)
}
if !reflect.DeepEqual(ls.processed, expectedProcessedLabels) {
t.Errorf("processed labels mismatch, got: %v, want: %v", ls.processed, expectedProcessedLabels)
}
}

View File

@@ -194,6 +194,9 @@ func (rr *RecordingRule) toTimeSeries(m datasource.Metric) prompbmarshal.TimeSer
labels["__name__"] = rr.Name
// override existing labels with configured ones
for k, v := range rr.Labels {
if _, ok := labels[k]; ok && labels[k] != v {
labels[fmt.Sprintf("exported_%s", k)] = labels[k]
}
labels[k] = v
}
return newTimeSeries(m.Values, m.Timestamps, labels)
@@ -203,7 +206,7 @@ func (rr *RecordingRule) toTimeSeries(m datasource.Metric) prompbmarshal.TimeSer
func (rr *RecordingRule) updateWith(r Rule) error {
nr, ok := r.(*RecordingRule)
if !ok {
return fmt.Errorf("BUG: attempt to update recroding rule with wrong type %#v", r)
return fmt.Errorf("BUG: attempt to update recording rule with wrong type %#v", r)
}
rr.Expr = nr.Expr
rr.Labels = nr.Labels

View File

@@ -61,7 +61,7 @@ func TestRecordingRule_Exec(t *testing.T) {
},
[]datasource.Metric{
metricWithValueAndLabels(t, 2, "__name__", "foo", "job", "foo"),
metricWithValueAndLabels(t, 1, "__name__", "bar", "job", "bar"),
metricWithValueAndLabels(t, 1, "__name__", "bar", "job", "bar", "source", "origin"),
},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{2}, []int64{timestamp.UnixNano()}, map[string]string{
@@ -70,9 +70,10 @@ func TestRecordingRule_Exec(t *testing.T) {
"source": "test",
}),
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": "job:foo",
"job": "bar",
"source": "test",
"__name__": "job:foo",
"job": "bar",
"source": "test",
"exported_source": "origin",
}),
},
},
@@ -254,10 +255,7 @@ func TestRecordingRule_ExecNegative(t *testing.T) {
fq.Add(metricWithValueAndLabels(t, 2, "__name__", "foo", "job", "bar"))
_, err = rr.exec(context.TODO(), time.Now(), 0)
if err == nil {
t.Fatalf("expected to get err; got nil")
}
if !strings.Contains(err.Error(), errDuplicate.Error()) {
t.Fatalf("expected to get err %q; got %q insterad", errDuplicate, err)
if err != nil {
t.Fatal(err)
}
}

View File

@@ -45,13 +45,14 @@ func WithBearer(token, tokenFile string) AuthConfigOptions {
}
// WithOAuth returns AuthConfigOptions and set OAuth params based on given params
func WithOAuth(clientID, clientSecret, clientSecretFile, tokenURL, scopes string) AuthConfigOptions {
func WithOAuth(clientID, clientSecret, clientSecretFile, tokenURL, scopes string, endpointParams map[string]string) AuthConfigOptions {
return func(config *promauth.HTTPClientConfig) {
if clientSecretFile != "" || clientSecret != "" {
config.OAuth2 = &promauth.OAuth2Config{
ClientID: clientID,
ClientSecret: promauth.NewSecret(clientSecret),
ClientSecretFile: clientSecretFile,
EndpointParams: endpointParams,
TokenURL: tokenURL,
Scopes: strings.Split(scopes, ";"),
}

View File

@@ -12,11 +12,14 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/rule"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/tpl"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
)
var reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed as authKey=...")
var (
apiLinks = [][2]string{
// api links are relative since they can be used by external clients,
@@ -151,6 +154,9 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
w.Write(data)
return true
case "/-/reload":
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey.Get(), "reloadAuthKey") {
return true
}
logger.Infof("api config reload was called, sending sighup")
procutil.SelfSIGHUP()
w.WriteHeader(http.StatusOK)

View File

@@ -9,6 +9,7 @@ import (
"net/url"
"os"
"regexp"
"sort"
"strconv"
"strings"
"sync"
@@ -16,12 +17,13 @@ import (
"time"
"github.com/VictoriaMetrics/metrics"
"github.com/cespare/xxhash/v2"
"gopkg.in/yaml.v2"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envtemplate"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs/fscore"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
)
@@ -56,16 +58,19 @@ type UserInfo struct {
DefaultURL *URLPrefix `yaml:"default_url,omitempty"`
RetryStatusCodes []int `yaml:"retry_status_codes,omitempty"`
LoadBalancingPolicy string `yaml:"load_balancing_policy,omitempty"`
DropSrcPathPrefixParts int `yaml:"drop_src_path_prefix_parts,omitempty"`
DropSrcPathPrefixParts *int `yaml:"drop_src_path_prefix_parts,omitempty"`
TLSInsecureSkipVerify *bool `yaml:"tls_insecure_skip_verify,omitempty"`
TLSCAFile string `yaml:"tls_ca_file,omitempty"`
MetricLabels map[string]string `yaml:"metric_labels,omitempty"`
concurrencyLimitCh chan struct{}
concurrencyLimitReached *metrics.Counter
httpTransport *http.Transport
requests *metrics.Counter
backendErrors *metrics.Counter
requestsDuration *metrics.Summary
}
@@ -126,16 +131,30 @@ func (h *Header) MarshalYAML() (interface{}, error) {
// URLMap is a mapping from source paths to target urls.
type URLMap struct {
SrcPaths []*SrcPath `yaml:"src_paths,omitempty"`
URLPrefix *URLPrefix `yaml:"url_prefix,omitempty"`
HeadersConf HeadersConf `yaml:",inline"`
RetryStatusCodes []int `yaml:"retry_status_codes,omitempty"`
LoadBalancingPolicy string `yaml:"load_balancing_policy,omitempty"`
DropSrcPathPrefixParts int `yaml:"drop_src_path_prefix_parts,omitempty"`
// SrcHosts is the list of regular expressions, which match the request hostname.
SrcHosts []*Regex `yaml:"src_hosts,omitempty"`
// SrcPaths is the list of regular expressions, which match the request path.
SrcPaths []*Regex `yaml:"src_paths,omitempty"`
// UrlPrefix contains backend url prefixes for the proxied request url.
URLPrefix *URLPrefix `yaml:"url_prefix,omitempty"`
// HeadersConf is the config for augumenting request and response headers.
HeadersConf HeadersConf `yaml:",inline"`
// RetryStatusCodes is the list of response status codes used for retrying requests.
RetryStatusCodes []int `yaml:"retry_status_codes,omitempty"`
// LoadBalancingPolicy is load balancing policy among UrlPrefix backends.
LoadBalancingPolicy string `yaml:"load_balancing_policy,omitempty"`
// DropSrcPathPrefixParts is the number of `/`-delimited request path prefix parts to drop before proxying the request to backend.
DropSrcPathPrefixParts *int `yaml:"drop_src_path_prefix_parts,omitempty"`
}
// SrcPath represents an src path
type SrcPath struct {
// Regex represents a regex
type Regex struct {
sOriginal string
re *regexp.Regexp
}
@@ -152,6 +171,9 @@ type URLPrefix struct {
// load balancing policy used
loadBalancingPolicy string
// how many request path prefix parts to drop before routing the request to backendURL.
dropSrcPathPrefixParts int
}
func (up *URLPrefix) setLoadBalancingPolicy(loadBalancingPolicy string) error {
@@ -249,8 +271,10 @@ func (up *URLPrefix) getLeastLoadedBackendURL() *backendURL {
if bu.isBroken() {
continue
}
if atomic.CompareAndSwapInt32(&bu.concurrentRequests, 0, 1) {
if atomic.LoadInt32(&bu.concurrentRequests) == 0 {
// Fast path - return the backend with zero concurrently executed requests.
// Do not use atomic.CompareAndSwapInt32(), since it is much slower on systems with many CPU cores.
atomic.AddInt32(&bu.concurrentRequests, 1)
return bu
}
}
@@ -333,8 +357,8 @@ func (up *URLPrefix) MarshalYAML() (interface{}, error) {
return string(b), nil
}
func (sp *SrcPath) match(s string) bool {
prefix, ok := sp.re.LiteralPrefix()
func (r *Regex) match(s string) bool {
prefix, ok := r.re.LiteralPrefix()
if ok {
// Fast path - literal match
return s == prefix
@@ -342,11 +366,11 @@ func (sp *SrcPath) match(s string) bool {
if !strings.HasPrefix(s, prefix) {
return false
}
return sp.re.MatchString(s)
return r.re.MatchString(s)
}
// UnmarshalYAML implements yaml.Unmarshaler
func (sp *SrcPath) UnmarshalYAML(f func(interface{}) error) error {
func (r *Regex) UnmarshalYAML(f func(interface{}) error) error {
var s string
if err := f(&s); err != nil {
return err
@@ -356,20 +380,20 @@ func (sp *SrcPath) UnmarshalYAML(f func(interface{}) error) error {
if err != nil {
return fmt.Errorf("cannot build regexp from %q: %w", s, err)
}
sp.sOriginal = s
sp.re = re
r.sOriginal = s
r.re = re
return nil
}
// MarshalYAML implements yaml.Marshaler.
func (sp *SrcPath) MarshalYAML() (interface{}, error) {
return sp.sOriginal, nil
func (r *Regex) MarshalYAML() (interface{}, error) {
return r.sOriginal, nil
}
var (
configReloads = metrics.NewCounter(`vmauth_config_last_reload_total`)
configReloadErrors = metrics.NewCounter(`vmauth_config_last_reload_errors_total`)
configSuccess = metrics.NewCounter(`vmauth_config_last_reload_successful`)
configSuccess = metrics.NewGauge(`vmauth_config_last_reload_successful`, nil)
configTimestamp = metrics.NewCounter(`vmauth_config_last_reload_success_timestamp_seconds`)
)
@@ -445,17 +469,19 @@ func authConfigReloader(sighupCh <-chan os.Signal) {
// authConfigData needs to be updated each time authConfig is updated.
var authConfigData atomic.Pointer[[]byte]
var authConfig atomic.Pointer[AuthConfig]
var authUsers atomic.Pointer[map[string]*UserInfo]
var authConfigWG sync.WaitGroup
var stopCh chan struct{}
var (
authConfig atomic.Pointer[AuthConfig]
authUsers atomic.Pointer[map[string]*UserInfo]
authConfigWG sync.WaitGroup
stopCh chan struct{}
)
// loadAuthConfig loads and applies the config from *authConfigPath.
// It returns bool value to identify if new config was applied.
// The config can be not applied if there is a parsing error
// or if there are no changes to the current authConfig.
func loadAuthConfig() (bool, error) {
data, err := fs.ReadFileOrHTTP(*authConfigPath)
data, err := fscore.ReadFileOrHTTP(*authConfigPath)
if err != nil {
return false, fmt.Errorf("failed to read -auth.config=%q: %w", *authConfigPath, err)
}
@@ -510,16 +536,23 @@ func parseAuthConfig(data []byte) (*AuthConfig, error) {
if err := ui.initURLs(); err != nil {
return nil, err
}
ui.requests = metrics.GetOrCreateCounter(`vmauth_unauthorized_user_requests_total`)
ui.requestsDuration = metrics.GetOrCreateSummary(`vmauth_unauthorized_user_request_duration_seconds`)
metricLabels, err := ui.getMetricLabels()
if err != nil {
return nil, fmt.Errorf("cannot parse metric_labels for unauthorized_user: %w", err)
}
ui.requests = metrics.GetOrCreateCounter(`vmauth_unauthorized_user_requests_total` + metricLabels)
ui.backendErrors = metrics.GetOrCreateCounter(`vmauth_unauthorized_user_request_backend_errors_total` + metricLabels)
ui.requestsDuration = metrics.GetOrCreateSummary(`vmauth_unauthorized_user_request_duration_seconds` + metricLabels)
ui.concurrencyLimitCh = make(chan struct{}, ui.getMaxConcurrentRequests())
ui.concurrencyLimitReached = metrics.GetOrCreateCounter(`vmauth_unauthorized_user_concurrent_requests_limit_reached_total`)
_ = metrics.GetOrCreateGauge(`vmauth_unauthorized_user_concurrent_requests_capacity`, func() float64 {
ui.concurrencyLimitReached = metrics.GetOrCreateCounter(`vmauth_unauthorized_user_concurrent_requests_limit_reached_total` + metricLabels)
_ = metrics.GetOrCreateGauge(`vmauth_unauthorized_user_concurrent_requests_capacity`+metricLabels, func() float64 {
return float64(cap(ui.concurrencyLimitCh))
})
_ = metrics.GetOrCreateGauge(`vmauth_unauthorized_user_concurrent_requests_current`, func() float64 {
_ = metrics.GetOrCreateGauge(`vmauth_unauthorized_user_concurrent_requests_current`+metricLabels, func() float64 {
return float64(len(ui.concurrencyLimitCh))
})
tr, err := getTransport(ui.TLSInsecureSkipVerify, ui.TLSCAFile)
if err != nil {
return nil, fmt.Errorf("cannot initialize HTTP transport: %w", err)
@@ -555,25 +588,24 @@ func parseAuthConfigUsers(ac *AuthConfig) (map[string]*UserInfo, error) {
return nil, err
}
name := ui.name()
if ui.BearerToken != "" {
if ui.Password != "" {
return nil, fmt.Errorf("password shouldn't be set for bearer_token %q", ui.BearerToken)
}
ui.requests = metrics.GetOrCreateCounter(fmt.Sprintf(`vmauth_user_requests_total{username=%q}`, name))
ui.requestsDuration = metrics.GetOrCreateSummary(fmt.Sprintf(`vmauth_user_request_duration_seconds{username=%q}`, name))
if ui.BearerToken != "" && ui.Password != "" {
return nil, fmt.Errorf("password shouldn't be set for bearer_token %q", ui.BearerToken)
}
if ui.Username != "" {
ui.requests = metrics.GetOrCreateCounter(fmt.Sprintf(`vmauth_user_requests_total{username=%q}`, name))
ui.requestsDuration = metrics.GetOrCreateSummary(fmt.Sprintf(`vmauth_user_request_duration_seconds{username=%q}`, name))
metricLabels, err := ui.getMetricLabels()
if err != nil {
return nil, fmt.Errorf("cannot parse metric_labels: %w", err)
}
ui.requests = metrics.GetOrCreateCounter(`vmauth_user_requests_total` + metricLabels)
ui.backendErrors = metrics.GetOrCreateCounter(`vmauth_user_request_backend_errors_total` + metricLabels)
ui.requestsDuration = metrics.GetOrCreateSummary(`vmauth_user_request_duration_seconds` + metricLabels)
mcr := ui.getMaxConcurrentRequests()
ui.concurrencyLimitCh = make(chan struct{}, mcr)
ui.concurrencyLimitReached = metrics.GetOrCreateCounter(fmt.Sprintf(`vmauth_user_concurrent_requests_limit_reached_total{username=%q}`, name))
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmauth_user_concurrent_requests_capacity{username=%q}`, name), func() float64 {
ui.concurrencyLimitReached = metrics.GetOrCreateCounter(`vmauth_user_concurrent_requests_limit_reached_total` + metricLabels)
_ = metrics.GetOrCreateGauge(`vmauth_user_concurrent_requests_capacity`+metricLabels, func() float64 {
return float64(cap(ui.concurrencyLimitCh))
})
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmauth_user_concurrent_requests_current{username=%q}`, name), func() float64 {
_ = metrics.GetOrCreateGauge(`vmauth_user_concurrent_requests_current`+metricLabels, func() float64 {
return float64(len(ui.concurrencyLimitCh))
})
@@ -589,20 +621,48 @@ func parseAuthConfigUsers(ac *AuthConfig) (map[string]*UserInfo, error) {
return byAuthToken, nil
}
var labelNameRegexp = regexp.MustCompile("^[a-zA-Z_:.][a-zA-Z0-9_:.]*$")
func (ui *UserInfo) getMetricLabels() (string, error) {
name := ui.name()
if len(name) == 0 && len(ui.MetricLabels) == 0 {
// fast path
return "", nil
}
labels := make([]string, 0, len(ui.MetricLabels)+1)
if len(name) > 0 {
labels = append(labels, fmt.Sprintf(`username=%q`, name))
}
for k, v := range ui.MetricLabels {
if !labelNameRegexp.MatchString(k) {
return "", fmt.Errorf("incorrect label name=%q, it must match regex=%q for user=%q", k, labelNameRegexp, name)
}
labels = append(labels, fmt.Sprintf(`%s=%q`, k, v))
}
sort.Strings(labels)
labelsStr := "{" + strings.Join(labels, ",") + "}"
return labelsStr, nil
}
func (ui *UserInfo) initURLs() error {
retryStatusCodes := defaultRetryStatusCodes.Values()
loadBalancingPolicy := *defaultLoadBalancingPolicy
dropSrcPathPrefixParts := 0
if ui.URLPrefix != nil {
if err := ui.URLPrefix.sanitize(); err != nil {
return err
}
if len(ui.RetryStatusCodes) > 0 {
if ui.RetryStatusCodes != nil {
retryStatusCodes = ui.RetryStatusCodes
}
if ui.LoadBalancingPolicy != "" {
loadBalancingPolicy = ui.LoadBalancingPolicy
}
if ui.DropSrcPathPrefixParts != nil {
dropSrcPathPrefixParts = *ui.DropSrcPathPrefixParts
}
ui.URLPrefix.retryStatusCodes = retryStatusCodes
ui.URLPrefix.dropSrcPathPrefixParts = dropSrcPathPrefixParts
if err := ui.URLPrefix.setLoadBalancingPolicy(loadBalancingPolicy); err != nil {
return err
}
@@ -613,8 +673,8 @@ func (ui *UserInfo) initURLs() error {
}
}
for _, e := range ui.URLMaps {
if len(e.SrcPaths) == 0 {
return fmt.Errorf("missing `src_paths` in `url_map`")
if len(e.SrcPaths) == 0 && len(e.SrcHosts) == 0 {
return fmt.Errorf("missing `src_paths` and `src_hosts` in `url_map`")
}
if e.URLPrefix == nil {
return fmt.Errorf("missing `url_prefix` in `url_map`")
@@ -624,16 +684,21 @@ func (ui *UserInfo) initURLs() error {
}
rscs := retryStatusCodes
lbp := loadBalancingPolicy
if len(e.RetryStatusCodes) > 0 {
dsp := dropSrcPathPrefixParts
if e.RetryStatusCodes != nil {
rscs = e.RetryStatusCodes
}
if e.LoadBalancingPolicy != "" {
lbp = e.LoadBalancingPolicy
}
if e.DropSrcPathPrefixParts != nil {
dsp = *e.DropSrcPathPrefixParts
}
e.URLPrefix.retryStatusCodes = rscs
if err := e.URLPrefix.setLoadBalancingPolicy(lbp); err != nil {
return err
}
e.URLPrefix.dropSrcPathPrefixParts = dsp
}
if len(ui.URLMaps) == 0 && ui.URLPrefix == nil {
return fmt.Errorf("missing `url_prefix`")
@@ -649,7 +714,8 @@ func (ui *UserInfo) name() string {
return ui.Username
}
if ui.BearerToken != "" {
return "bearer_token"
h := xxhash.Sum64([]byte(ui.BearerToken))
return fmt.Sprintf("bearer_token:hash:%016X", h)
}
return ""
}

View File

@@ -145,6 +145,12 @@ users:
url_map:
- src_paths: ["/foo/bar"]
`)
f(`
users:
- username: a
url_map:
- src_hosts: ["foobar"]
`)
// Invalid url_prefix in url_map
f(`
@@ -154,6 +160,13 @@ users:
- src_paths: ["/foo/bar"]
url_prefix: foo.bar
`)
f(`
users:
- username: a
url_map:
- src_hosts: ["foobar"]
url_prefix: foo.bar
`)
// empty url_prefix in url_map
f(`
@@ -163,8 +176,15 @@ users:
- src_paths: ['/foo/bar']
url_prefix: []
`)
f(`
users:
- username: a
url_map:
- src_phosts: ['foobar']
url_prefix: []
`)
// Missing src_paths in url_map
// Missing src_paths and src_hosts in url_map
f(`
users:
- username: a
@@ -181,6 +201,15 @@ users:
url_prefix: http://foobar
`)
// Invalid regexp in src_hosts
f(`
users:
- username: a
url_map:
- src_hosts: ['fo[obar']
url_prefix: http://foobar
`)
// Invalid headers in url_map (missing ':')
f(`
users:
@@ -200,6 +229,14 @@ users:
url_prefix: http://foobar
headers:
aaa: bbb
`)
// Invalid metric label name
f(`
users:
- username: foo
url_prefix: http://foo.bar
metric_labels:
not-prometheus-compatible: value
`)
}
@@ -263,7 +300,7 @@ users:
TLSInsecureSkipVerify: &insecureSkipVerifyFalse,
RetryStatusCodes: []int{500, 501},
LoadBalancingPolicy: "first_available",
DropSrcPathPrefixParts: 1,
DropSrcPathPrefixParts: intp(1),
},
})
@@ -293,6 +330,7 @@ users:
- src_paths: ["/api/v1/query","/api/v1/query_range","/api/v1/label/[^./]+/.+"]
url_prefix: http://vmselect/select/0/prometheus
- src_paths: ["/api/v1/write"]
src_hosts: ["foo\\.bar", "baz:1234"]
url_prefix: ["http://vminsert1/insert/0/prometheus","http://vminsert2/insert/0/prometheus"]
headers:
- "foo: bar"
@@ -302,11 +340,12 @@ users:
BearerToken: "foo",
URLMaps: []URLMap{
{
SrcPaths: getSrcPaths([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
SrcPaths: getRegexs([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
URLPrefix: mustParseURL("http://vmselect/select/0/prometheus"),
},
{
SrcPaths: getSrcPaths([]string{"/api/v1/write"}),
SrcHosts: getRegexs([]string{"foo\\.bar", "baz:1234"}),
SrcPaths: getRegexs([]string{"/api/v1/write"}),
URLPrefix: mustParseURLs([]string{
"http://vminsert1/insert/0/prometheus",
"http://vminsert2/insert/0/prometheus",
@@ -330,11 +369,12 @@ users:
BearerToken: "foo",
URLMaps: []URLMap{
{
SrcPaths: getSrcPaths([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
SrcPaths: getRegexs([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
URLPrefix: mustParseURL("http://vmselect/select/0/prometheus"),
},
{
SrcPaths: getSrcPaths([]string{"/api/v1/write"}),
SrcHosts: getRegexs([]string{"foo\\.bar", "baz:1234"}),
SrcPaths: getRegexs([]string{"/api/v1/write"}),
URLPrefix: mustParseURLs([]string{
"http://vminsert1/insert/0/prometheus",
"http://vminsert2/insert/0/prometheus",
@@ -396,11 +436,11 @@ users:
BearerToken: "foo",
URLMaps: []URLMap{
{
SrcPaths: getSrcPaths([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
SrcPaths: getRegexs([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
URLPrefix: mustParseURL("http://vmselect/select/0/prometheus"),
},
{
SrcPaths: getSrcPaths([]string{"/api/v1/write"}),
SrcPaths: getRegexs([]string{"/api/v1/write"}),
URLPrefix: mustParseURLs([]string{
"http://vminsert1/insert/0/prometheus",
"http://vminsert2/insert/0/prometheus",
@@ -428,11 +468,11 @@ users:
BearerToken: "foo",
URLMaps: []URLMap{
{
SrcPaths: getSrcPaths([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
SrcPaths: getRegexs([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
URLPrefix: mustParseURL("http://vmselect/select/0/prometheus"),
},
{
SrcPaths: getSrcPaths([]string{"/api/v1/write"}),
SrcPaths: getRegexs([]string{"/api/v1/write"}),
URLPrefix: mustParseURLs([]string{
"http://vminsert1/insert/0/prometheus",
"http://vminsert2/insert/0/prometheus",
@@ -457,7 +497,41 @@ users:
}),
},
})
// With metric_labels
f(`
users:
- username: foo-same
password: baz
url_prefix: http://foo
metric_labels:
dc: eu
team: dev
- username: foo-same
password: bar
url_prefix: https://bar/x///
metric_labels:
backend_env: test
team: accounting
`, map[string]*UserInfo{
getAuthToken("", "foo-same", "baz"): {
Username: "foo-same",
Password: "baz",
URLPrefix: mustParseURL("http://foo"),
MetricLabels: map[string]string{
"dc": "eu",
"team": "dev",
},
},
getAuthToken("", "foo-same", "bar"): {
Username: "foo-same",
Password: "bar",
URLPrefix: mustParseURL("https://bar/x"),
MetricLabels: map[string]string{
"backend_env": "test",
"team": "accounting",
},
},
})
}
func TestParseAuthConfigPassesTLSVerificationConfig(t *testing.T) {
@@ -494,6 +568,86 @@ unauthorized_user:
}
}
func TestUserInfoGetMetricLabels(t *testing.T) {
t.Run("empty-labels", func(t *testing.T) {
ui := &UserInfo{
Username: "user1",
}
labels, err := ui.getMetricLabels()
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
labelsExpected := `{username="user1"}`
if labels != labelsExpected {
t.Fatalf("unexpected labels; got %s; want %s", labels, labelsExpected)
}
})
t.Run("non-empty-username", func(t *testing.T) {
ui := &UserInfo{
Username: "user1",
MetricLabels: map[string]string{
"env": "prod",
"datacenter": "dc1",
},
}
labels, err := ui.getMetricLabels()
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
labelsExpected := `{datacenter="dc1",env="prod",username="user1"}`
if labels != labelsExpected {
t.Fatalf("unexpected labels; got %s; want %s", labels, labelsExpected)
}
})
t.Run("non-empty-name", func(t *testing.T) {
ui := &UserInfo{
Name: "user1",
BearerToken: "abc",
MetricLabels: map[string]string{
"env": "prod",
"datacenter": "dc1",
},
}
labels, err := ui.getMetricLabels()
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
labelsExpected := `{datacenter="dc1",env="prod",username="user1"}`
if labels != labelsExpected {
t.Fatalf("unexpected labels; got %s; want %s", labels, labelsExpected)
}
})
t.Run("non-empty-bearer-token", func(t *testing.T) {
ui := &UserInfo{
BearerToken: "abc",
MetricLabels: map[string]string{
"env": "prod",
"datacenter": "dc1",
},
}
labels, err := ui.getMetricLabels()
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
labelsExpected := `{datacenter="dc1",env="prod",username="bearer_token:hash:44BC2CF5AD770999"}`
if labels != labelsExpected {
t.Fatalf("unexpected labels; got %s; want %s", labels, labelsExpected)
}
})
t.Run("invalid-label", func(t *testing.T) {
ui := &UserInfo{
Username: "foo",
MetricLabels: map[string]string{
",{": "aaaa",
},
}
_, err := ui.getMetricLabels()
if err == nil {
t.Fatalf("expecting non-nil error")
}
})
}
func isSetBool(boolP *bool, expectedValue bool) bool {
if boolP == nil {
return false
@@ -501,10 +655,10 @@ func isSetBool(boolP *bool, expectedValue bool) bool {
return *boolP == expectedValue
}
func getSrcPaths(paths []string) []*SrcPath {
var sps []*SrcPath
func getRegexs(paths []string) []*Regex {
var sps []*Regex
for _, path := range paths {
sps = append(sps, &SrcPath{
sps = append(sps, &Regex{
sOriginal: path,
re: regexp.MustCompile("^(?:" + path + ")$"),
})
@@ -552,3 +706,7 @@ func mustParseURLs(us []string) *URLPrefix {
bus: bus,
}
}
func intp(n int) *int {
return &n
}

View File

@@ -10,6 +10,11 @@ users:
- bearer_token: "XXXX"
url_prefix: "http://localhost:8428"
# Adds labels to the exported metrics for given user section
# label name must be prometheus compatible and match regex: `^[a-zA-Z_:.][a-zA-Z0-9_:.]*$`
metric_labels:
backend_dc: eu
access_team: dev
# Requests with the 'Authorization: Bearer YYY' header are proxied to http://localhost:8428 ,
# The `X-Scope-OrgID: foobar` http header is added to every proxied request.
# The `X-Server-Hostname:` http header is removed from the proxied response.
@@ -92,7 +97,7 @@ users:
# - to http://default1:8888/unsupported_url_handler?request_path=/non/existing/path
# - or http://default2:8888/unsupported_url_handler?request_path=/non/existing/path
#
# Regular expressions are allowed in `src_paths` entries.
# Regular expressions are allowed in `src_paths` and `src_hosts` entries.
- username: "foobar"
url_map:
- src_paths:

View File

@@ -24,7 +24,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs/fscore"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
@@ -45,7 +45,7 @@ var (
maxConcurrentPerUserRequests = flag.Int("maxConcurrentPerUserRequests", 300, "The maximum number of concurrent requests vmauth can process per each configured user. "+
"Other requests are rejected with '429 Too Many Requests' http status code. See also -maxConcurrentRequests command-line option and max_concurrent_requests option "+
"in per-user config")
reloadAuthKey = flag.String("reloadAuthKey", "", "Auth key for /-/reload http endpoint. It must be passed as authKey=...")
reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed as authKey=...")
logInvalidAuthTokens = flag.Bool("logInvalidAuthTokens", false, "Whether to log requests with invalid auth tokens. "+
`Such requests are always counted at vmauth_http_request_errors_total{reason="invalid_auth_token"} metric, which is exposed at /metrics page`)
failTimeout = flag.Duration("failTimeout", 3*time.Second, "Sets a delay period for load balancing to skip a malfunctioning backend")
@@ -64,7 +64,6 @@ func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
logger.Infof("starting vmauth at %q...", *httpListenAddr)
startTime := time.Now()
@@ -72,8 +71,10 @@ func main() {
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, requestHandler)
logger.Infof("started vmauth in %.3f seconds", time.Since(startTime).Seconds())
pushmetrics.Init()
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
pushmetrics.Stop()
startTime = time.Now()
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
@@ -88,7 +89,7 @@ func main() {
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
switch r.URL.Path {
case "/-/reload":
if !httpserver.CheckAuthFlag(w, r, *reloadAuthKey, "reloadAuthKey") {
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey.Get(), "reloadAuthKey") {
return true
}
configReloadRequests.Inc()
@@ -149,12 +150,20 @@ func processUserRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
if err := ui.beginConcurrencyLimit(); err != nil {
handleConcurrencyLimitError(w, r, err)
<-concurrencyLimitCh
// Requests failed because of concurrency limit must be counted as errors,
// since this usually means the backend cannot keep up with the current load.
ui.backendErrors.Inc()
return
}
default:
concurrentRequestsLimitReached.Inc()
err := fmt.Errorf("cannot serve more than -maxConcurrentRequests=%d concurrent requests", cap(concurrencyLimitCh))
handleConcurrencyLimitError(w, r, err)
// Requests failed because of concurrency limit must be counted as errors,
// since this usually means the backend cannot keep up with the current load.
ui.backendErrors.Inc()
return
}
processRequest(w, r, ui)
@@ -164,7 +173,7 @@ func processUserRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
u := normalizeURL(r.URL)
up, hc, dropSrcPathPrefixParts := ui.getURLPrefixAndHeaders(u)
up, hc := ui.getURLPrefixAndHeaders(u)
isDefault := false
if up == nil {
if ui.DefaultURL == nil {
@@ -198,9 +207,9 @@ func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
query.Set("request_path", u.String())
targetURL.RawQuery = query.Encode()
} else { // Update path for regular routes.
targetURL = mergeURLs(targetURL, u, dropSrcPathPrefixParts)
targetURL = mergeURLs(targetURL, u, up.dropSrcPathPrefixParts)
}
ok := tryProcessingRequest(w, r, targetURL, hc, up.retryStatusCodes, ui.httpTransport)
ok := tryProcessingRequest(w, r, targetURL, hc, up.retryStatusCodes, ui)
bu.put()
if ok {
return
@@ -212,15 +221,16 @@ func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, r, "%s", err)
ui.backendErrors.Inc()
}
func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url.URL, hc HeadersConf, retryStatusCodes []int, transport *http.Transport) bool {
func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url.URL, hc HeadersConf, retryStatusCodes []int, ui *UserInfo) bool {
// This code has been copied from net/http/httputil/reverseproxy.go
req := sanitizeRequestHeaders(r)
req.URL = targetURL
req.Host = targetURL.Host
updateHeadersByConfig(req.Header, hc.RequestHeaders)
res, err := transport.RoundTrip(req)
res, err := ui.httpTransport.RoundTrip(req)
rtb, rtbOK := req.Body.(*readTrackingBody)
if err != nil {
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
@@ -228,15 +238,20 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
requestURI := httpserver.GetRequestURI(r)
logger.Warnf("remoteAddr: %s; requestURI: %s; error when proxying response body from %s: %s", remoteAddr, requestURI, targetURL, err)
if errors.Is(err, context.DeadlineExceeded) {
// Timed out request must be counted as errors, since this usually means that the backend is slow.
ui.backendErrors.Inc()
}
return true
}
if !rtbOK || !rtb.canRetry() {
// Request body cannot be re-sent to another backend. Return the error to the client then.
err = &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot proxy the request to %q: %w", targetURL, err),
Err: fmt.Errorf("cannot proxy the request to %s: %w", targetURL, err),
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, r, "%s", err)
ui.backendErrors.Inc()
return true
}
// Retry the request if its body wasn't read yet. This usually means that the backend isn't reachable.
@@ -246,7 +261,20 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
logger.Warnf("remoteAddr: %s; requestURI: %s; retrying the request to %s because of response error: %s", remoteAddr, req.URL, targetURL, err)
return false
}
if (rtbOK && rtb.canRetry()) && hasInt(retryStatusCodes, res.StatusCode) {
if hasInt(retryStatusCodes, res.StatusCode) {
_ = res.Body.Close()
if !rtbOK || !rtb.canRetry() {
// If we get an error from the retry_status_codes list, but cannot execute retry,
// we consider such a request an error as well.
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("got response status code=%d from %s, but cannot retry the request on another backend, because the request has been already consumed",
res.StatusCode, targetURL),
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, r, "%s", err)
ui.backendErrors.Inc()
return true
}
// Retry requests at other backends if it matches retryStatusCodes.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4893
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
@@ -265,6 +293,7 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
copyBuf.B = bytesutil.ResizeNoCopyNoOverallocate(copyBuf.B, 16*1024)
_, err = io.CopyBuffer(w, res.Body, copyBuf.B)
copyBufPool.Put(copyBuf)
_ = res.Body.Close()
if err != nil && !netutil.IsTrivialNetworkError(err) {
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
requestURI := httpserver.GetRequestURI(r)
@@ -392,8 +421,10 @@ func getTransport(insecureSkipVerifyP *bool, caFile string) (*http.Transport, er
return tr, nil
}
var transportMap = make(map[string]*http.Transport)
var transportMapLock sync.Mutex
var (
transportMap = make(map[string]*http.Transport)
transportMapLock sync.Mutex
)
func appendTransportKey(dst []byte, insecureSkipVerify bool, caFile string) []byte {
dst = encoding.MarshalBool(dst, insecureSkipVerify)
@@ -421,7 +452,7 @@ func newTransport(insecureSkipVerify bool, caFile string) (*http.Transport, erro
tlsCfg.ClientSessionCache = tls.NewLRUClientSessionCache(0)
tlsCfg.InsecureSkipVerify = insecureSkipVerify
if caFile != "" {
data, err := fs.ReadFileOrHTTP(caFile)
data, err := fscore.ReadFileOrHTTP(caFile)
if err != nil {
return nil, fmt.Errorf("cannot read tls_ca_file: %w", err)
}

View File

@@ -49,18 +49,28 @@ func dropPrefixParts(path string, parts int) string {
return path
}
func (ui *UserInfo) getURLPrefixAndHeaders(u *url.URL) (*URLPrefix, HeadersConf, int) {
func (ui *UserInfo) getURLPrefixAndHeaders(u *url.URL) (*URLPrefix, HeadersConf) {
for _, e := range ui.URLMaps {
for _, sp := range e.SrcPaths {
if sp.match(u.Path) {
return e.URLPrefix, e.HeadersConf, e.DropSrcPathPrefixParts
}
if matchAnyRegex(e.SrcHosts, u.Host) && matchAnyRegex(e.SrcPaths, u.Path) {
return e.URLPrefix, e.HeadersConf
}
}
if ui.URLPrefix != nil {
return ui.URLPrefix, ui.HeadersConf, ui.DropSrcPathPrefixParts
return ui.URLPrefix, ui.HeadersConf
}
return nil, HeadersConf{}, 0
return nil, HeadersConf{}
}
func matchAnyRegex(rs []*Regex, s string) bool {
if len(rs) == 0 {
return true
}
for _, r := range rs {
if r.match(s) {
return true
}
}
return false
}
func normalizeURL(uOrig *url.URL) *url.URL {

View File

@@ -89,12 +89,12 @@ func TestCreateTargetURLSuccess(t *testing.T) {
t.Fatalf("cannot parse %q: %s", requestURI, err)
}
u = normalizeURL(u)
up, hc, dropSrcPathPrefixParts := ui.getURLPrefixAndHeaders(u)
up, hc := ui.getURLPrefixAndHeaders(u)
if up == nil {
t.Fatalf("cannot determie backend: %s", err)
}
bu := up.getLeastLoadedBackendURL()
target := mergeURLs(bu.url, u, dropSrcPathPrefixParts)
target := mergeURLs(bu.url, u, up.dropSrcPathPrefixParts)
bu.put()
if target.String() != expectedTarget {
t.Fatalf("unexpected target; got %q; want %q", target, expectedTarget)
@@ -109,8 +109,8 @@ func TestCreateTargetURLSuccess(t *testing.T) {
if up.loadBalancingPolicy != expectedLoadBalancingPolicy {
t.Fatalf("unexpected loadBalancingPolicy; got %q; want %q", up.loadBalancingPolicy, expectedLoadBalancingPolicy)
}
if dropSrcPathPrefixParts != expectedDropSrcPathPrefixParts {
t.Fatalf("unexpected dropSrcPathPrefixParts; got %d; want %d", dropSrcPathPrefixParts, expectedDropSrcPathPrefixParts)
if up.dropSrcPathPrefixParts != expectedDropSrcPathPrefixParts {
t.Fatalf("unexpected dropSrcPathPrefixParts; got %d; want %d", up.dropSrcPathPrefixParts, expectedDropSrcPathPrefixParts)
}
}
// Simple routing with `url_prefix`
@@ -127,7 +127,7 @@ func TestCreateTargetURLSuccess(t *testing.T) {
},
RetryStatusCodes: []int{503, 501},
LoadBalancingPolicy: "first_available",
DropSrcPathPrefixParts: 2,
DropSrcPathPrefixParts: intp(2),
}, "/a/b/c", "http://foo.bar/c", `[{"bb" "aaa"}]`, `[]`, []int{503, 501}, "first_available", 2)
f(&UserInfo{
URLPrefix: mustParseURL("http://foo.bar/federate"),
@@ -149,7 +149,8 @@ func TestCreateTargetURLSuccess(t *testing.T) {
ui := &UserInfo{
URLMaps: []URLMap{
{
SrcPaths: getSrcPaths([]string{"/vmsingle/api/v1/query"}),
SrcHosts: getRegexs([]string{"host42"}),
SrcPaths: getRegexs([]string{"/vmsingle/api/v1/query"}),
URLPrefix: mustParseURL("http://vmselect/0/prometheus"),
HeadersConf: HeadersConf{
RequestHeaders: []Header{
@@ -171,11 +172,13 @@ func TestCreateTargetURLSuccess(t *testing.T) {
},
RetryStatusCodes: []int{503, 500, 501},
LoadBalancingPolicy: "first_available",
DropSrcPathPrefixParts: 1,
DropSrcPathPrefixParts: intp(1),
},
{
SrcPaths: getSrcPaths([]string{"/api/v1/write"}),
URLPrefix: mustParseURL("http://vminsert/0/prometheus"),
SrcPaths: getRegexs([]string{"/api/v1/write"}),
URLPrefix: mustParseURL("http://vminsert/0/prometheus"),
RetryStatusCodes: []int{},
DropSrcPathPrefixParts: intp(0),
},
},
URLPrefix: mustParseURL("http://default-server"),
@@ -190,23 +193,30 @@ func TestCreateTargetURLSuccess(t *testing.T) {
}},
},
RetryStatusCodes: []int{502},
DropSrcPathPrefixParts: 2,
DropSrcPathPrefixParts: intp(2),
}
f(ui, "/vmsingle/api/v1/query?query=up", "http://vmselect/0/prometheus/api/v1/query?query=up", `[{"xx" "aa"} {"yy" "asdf"}]`, `[{"qwe" "rty"}]`, []int{503, 500, 501}, "first_available", 1)
f(ui, "/api/v1/write", "http://vminsert/0/prometheus/api/v1/write", "[]", "[]", []int{502}, "least_loaded", 0)
f(ui, "/foo/bar/api/v1/query_range", "http://default-server/api/v1/query_range", `[{"bb" "aaa"}]`, `[{"x" "y"}]`, []int{502}, "least_loaded", 2)
f(ui, "http://host42/vmsingle/api/v1/query?query=up", "http://vmselect/0/prometheus/api/v1/query?query=up",
`[{"xx" "aa"} {"yy" "asdf"}]`, `[{"qwe" "rty"}]`, []int{503, 500, 501}, "first_available", 1)
f(ui, "http://host123/vmsingle/api/v1/query?query=up", "http://default-server/v1/query?query=up",
`[{"bb" "aaa"}]`, `[{"x" "y"}]`, []int{502}, "least_loaded", 2)
f(ui, "https://foo-host/api/v1/write", "http://vminsert/0/prometheus/api/v1/write", "[]", "[]", []int{}, "least_loaded", 0)
f(ui, "https://foo-host/foo/bar/api/v1/query_range", "http://default-server/api/v1/query_range", `[{"bb" "aaa"}]`, `[{"x" "y"}]`, []int{502}, "least_loaded", 2)
// Complex routing regexp paths in `url_map`
ui = &UserInfo{
URLMaps: []URLMap{
{
SrcPaths: getSrcPaths([]string{"/api/v1/query(_range)?", "/api/v1/label/[^/]+/values"}),
SrcPaths: getRegexs([]string{"/api/v1/query(_range)?", "/api/v1/label/[^/]+/values"}),
URLPrefix: mustParseURL("http://vmselect/0/prometheus"),
},
{
SrcPaths: getSrcPaths([]string{"/api/v1/write"}),
SrcPaths: getRegexs([]string{"/api/v1/write"}),
URLPrefix: mustParseURL("http://vminsert/0/prometheus"),
},
{
SrcHosts: getRegexs([]string{"vmui\\..+"}),
URLPrefix: mustParseURL("http://vmui.host:1234/vmui/"),
},
},
URLPrefix: mustParseURL("http://default-server"),
}
@@ -215,6 +225,8 @@ func TestCreateTargetURLSuccess(t *testing.T) {
f(ui, "/api/v1/label/foo/values", "http://vmselect/0/prometheus/api/v1/label/foo/values", "[]", "[]", nil, "least_loaded", 0)
f(ui, "/api/v1/write", "http://vminsert/0/prometheus/api/v1/write", "[]", "[]", nil, "least_loaded", 0)
f(ui, "/api/v1/foo/bar", "http://default-server/api/v1/foo/bar", "[]", "[]", nil, "least_loaded", 0)
f(ui, "https://vmui.foobar.com/a/b?c=d", "http://vmui.host:1234/vmui/a/b?c=d", "[]", "[]", nil, "least_loaded", 0)
f(&UserInfo{
URLPrefix: mustParseURL("http://foo.bar?extra_label=team=dev"),
}, "/api/v1/query", "http://foo.bar/api/v1/query?extra_label=team=dev", "[]", "[]", nil, "least_loaded", 0)
@@ -231,7 +243,7 @@ func TestCreateTargetURLFailure(t *testing.T) {
t.Fatalf("cannot parse %q: %s", requestURI, err)
}
u = normalizeURL(u)
up, hc, dropSrcPathPrefixParts := ui.getURLPrefixAndHeaders(u)
up, hc := ui.getURLPrefixAndHeaders(u)
if up != nil {
t.Fatalf("unexpected non-empty up=%#v", up)
}
@@ -241,15 +253,12 @@ func TestCreateTargetURLFailure(t *testing.T) {
if hc.ResponseHeaders != nil {
t.Fatalf("unexpected non-empty response headers=%q", hc.ResponseHeaders)
}
if dropSrcPathPrefixParts != 0 {
t.Fatalf("unexpected non-zero dropSrcPathPrefixParts=%d", dropSrcPathPrefixParts)
}
}
f(&UserInfo{}, "/foo/bar")
f(&UserInfo{
URLMaps: []URLMap{
{
SrcPaths: getSrcPaths([]string{"/api/v1/query"}),
SrcPaths: getRegexs([]string{"/api/v1/query"}),
URLPrefix: mustParseURL("http://foobar/baz"),
},
},

View File

@@ -47,7 +47,6 @@ func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
// Storing snapshot delete function to be able to call it in case
// of error since logger.Fatal will exit the program without
@@ -96,11 +95,13 @@ func main() {
go httpserver.Serve(*httpListenAddr, false, nil)
pushmetrics.Init()
err := makeBackup()
deleteSnapshot()
if err != nil {
logger.Fatalf("cannot create backup: %s", err)
}
pushmetrics.Stop()
startTime := time.Now()
logger.Infof("gracefully shutting down http server for metrics at %q", *httpListenAddr)

View File

@@ -15,7 +15,6 @@ func TestRetry_Do(t *testing.T) {
backoffFactor float64
backoffMinDuration time.Duration
retryableFunc retryableFunc
ctx context.Context
cancelTimeout time.Duration
want uint64
wantErr bool
@@ -25,7 +24,6 @@ func TestRetry_Do(t *testing.T) {
retryableFunc: func() error {
return ErrBadRequest
},
ctx: context.Background(),
want: 0,
wantErr: true,
},
@@ -35,7 +33,6 @@ func TestRetry_Do(t *testing.T) {
time.Sleep(time.Millisecond * 100)
return nil
},
ctx: context.Background(),
want: 0,
wantErr: true,
},
@@ -58,7 +55,6 @@ func TestRetry_Do(t *testing.T) {
}
return nil
},
ctx: context.Background(),
want: 1,
wantErr: false,
},
@@ -75,7 +71,6 @@ func TestRetry_Do(t *testing.T) {
}
return nil
},
ctx: context.Background(),
want: 5,
wantErr: true,
},
@@ -85,14 +80,8 @@ func TestRetry_Do(t *testing.T) {
backoffFactor: 1.7,
backoffMinDuration: time.Millisecond * 10,
retryableFunc: func() error {
t := time.NewTicker(time.Millisecond * 5)
defer t.Stop()
for range t.C {
return fmt.Errorf("got some error")
}
return nil
return fmt.Errorf("got some error")
},
ctx: context.Background(),
cancelTimeout: time.Millisecond * 40,
want: 3,
wantErr: true,
@@ -101,12 +90,13 @@ func TestRetry_Do(t *testing.T) {
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
r := &Backoff{retries: tt.backoffRetries, factor: tt.backoffFactor, minDuration: tt.backoffMinDuration}
ctx := context.Background()
if tt.cancelTimeout != 0 {
newCtx, cancelFn := context.WithTimeout(tt.ctx, tt.cancelTimeout)
tt.ctx = newCtx
newCtx, cancelFn := context.WithTimeout(context.Background(), tt.cancelTimeout)
ctx = newCtx
defer cancelFn()
}
got, err := r.Retry(tt.ctx, tt.retryableFunc)
got, err := r.Retry(ctx, tt.retryableFunc)
if (err != nil) != tt.wantErr {
t.Errorf("Retry() error = %v, wantErr %v", err, tt.wantErr)
return

View File

@@ -326,21 +326,23 @@ const (
vmNativeFilterTimeReverse = "vm-native-filter-time-reverse"
vmNativeStepInterval = "vm-native-step-interval"
vmNativeDisableBinaryProtocol = "vm-native-disable-binary-protocol"
vmNativeDisableHTTPKeepAlive = "vm-native-disable-http-keep-alive"
vmNativeDisableRetries = "vm-native-disable-retries"
vmNativeDisableBinaryProtocol = "vm-native-disable-binary-protocol"
vmNativeDisableHTTPKeepAlive = "vm-native-disable-http-keep-alive"
vmNativeDisablePerMetricMigration = "vm-native-disable-per-metric-migration"
vmNativeSrcAddr = "vm-native-src-addr"
vmNativeSrcUser = "vm-native-src-user"
vmNativeSrcPassword = "vm-native-src-password"
vmNativeSrcHeaders = "vm-native-src-headers"
vmNativeSrcBearerToken = "vm-native-src-bearer-token"
vmNativeSrcAddr = "vm-native-src-addr"
vmNativeSrcUser = "vm-native-src-user"
vmNativeSrcPassword = "vm-native-src-password"
vmNativeSrcHeaders = "vm-native-src-headers"
vmNativeSrcBearerToken = "vm-native-src-bearer-token"
vmNativeSrcInsecureSkipVerify = "vm-native-src-insecure-skip-verify"
vmNativeDstAddr = "vm-native-dst-addr"
vmNativeDstUser = "vm-native-dst-user"
vmNativeDstPassword = "vm-native-dst-password"
vmNativeDstHeaders = "vm-native-dst-headers"
vmNativeDstBearerToken = "vm-native-dst-bearer-token"
vmNativeDstAddr = "vm-native-dst-addr"
vmNativeDstUser = "vm-native-dst-user"
vmNativeDstPassword = "vm-native-dst-password"
vmNativeDstHeaders = "vm-native-dst-headers"
vmNativeDstBearerToken = "vm-native-dst-bearer-token"
vmNativeDstInsecureSkipVerify = "vm-native-dst-insecure-skip-verify"
)
var (
@@ -454,8 +456,8 @@ var (
Value: 2,
},
&cli.BoolFlag{
Name: vmNativeDisableRetries,
Usage: "Defines whether to disable retries with backoff policy for migration process",
Name: vmNativeDisablePerMetricMigration,
Usage: "Defines whether to disable per-metric migration and migrate all data via one connection. In this mode, vmctl makes less export/import requests, but can't provide a progress bar or retry failed requests.",
Value: false,
},
&cli.BoolFlag{
@@ -466,6 +468,16 @@ var (
"Non-binary export/import API is less efficient, but supports deduplication if it is configured on vm-native-src-addr side.",
Value: false,
},
&cli.BoolFlag{
Name: vmNativeSrcInsecureSkipVerify,
Usage: "Whether to skip TLS certificate verification when connecting to the source address",
Value: false,
},
&cli.BoolFlag{
Name: vmNativeDstInsecureSkipVerify,
Usage: "Whether to skip TLS certificate verification when connecting to the destination address",
Value: false,
},
}
)

View File

@@ -2,6 +2,7 @@ package main
import (
"context"
"crypto/tls"
"fmt"
"log"
"net/http"
@@ -212,6 +213,7 @@ func main() {
var srcExtraLabels []string
srcAddr := strings.Trim(c.String(vmNativeSrcAddr), "/")
srcInsecureSkipVerify := c.Bool(vmNativeSrcInsecureSkipVerify)
srcAuthConfig, err := auth.Generate(
auth.WithBasicAuth(c.String(vmNativeSrcUser), c.String(vmNativeSrcPassword)),
auth.WithBearer(c.String(vmNativeSrcBearerToken)),
@@ -219,10 +221,16 @@ func main() {
if err != nil {
return fmt.Errorf("error initilize auth config for source: %s", srcAddr)
}
srcHTTPClient := &http.Client{Transport: &http.Transport{DisableKeepAlives: disableKeepAlive}}
srcHTTPClient := &http.Client{Transport: &http.Transport{
DisableKeepAlives: disableKeepAlive,
TLSClientConfig: &tls.Config{
InsecureSkipVerify: srcInsecureSkipVerify,
},
}}
dstAddr := strings.Trim(c.String(vmNativeDstAddr), "/")
dstExtraLabels := c.StringSlice(vmExtraLabel)
dstInsecureSkipVerify := c.Bool(vmNativeDstInsecureSkipVerify)
dstAuthConfig, err := auth.Generate(
auth.WithBasicAuth(c.String(vmNativeDstUser), c.String(vmNativeDstPassword)),
auth.WithBearer(c.String(vmNativeDstBearerToken)),
@@ -230,7 +238,12 @@ func main() {
if err != nil {
return fmt.Errorf("error initilize auth config for destination: %s", dstAddr)
}
dstHTTPClient := &http.Client{Transport: &http.Transport{DisableKeepAlives: disableKeepAlive}}
dstHTTPClient := &http.Client{Transport: &http.Transport{
DisableKeepAlives: disableKeepAlive,
TLSClientConfig: &tls.Config{
InsecureSkipVerify: dstInsecureSkipVerify,
},
}}
p := vmNativeProcessor{
rateLimit: c.Int64(vmRateLimit),
@@ -254,11 +267,11 @@ func main() {
ExtraLabels: dstExtraLabels,
HTTPClient: dstHTTPClient,
},
backoff: backoff.New(),
cc: c.Int(vmConcurrency),
disableRetries: c.Bool(vmNativeDisableRetries),
isSilent: c.Bool(globalSilent),
isNative: !c.Bool(vmNativeDisableBinaryProtocol),
backoff: backoff.New(),
cc: c.Int(vmConcurrency),
disablePerMetricRequests: c.Bool(vmNativeDisablePerMetricMigration),
isSilent: c.Bool(globalSilent),
isNative: !c.Bool(vmNativeDisableBinaryProtocol),
}
return p.run(ctx)
},

View File

@@ -29,13 +29,14 @@ type vmNativeProcessor struct {
src *native.Client
backoff *backoff.Backoff
s *stats
rateLimit int64
interCluster bool
cc int
disableRetries bool
isSilent bool
isNative bool
s *stats
rateLimit int64
interCluster bool
cc int
isSilent bool
isNative bool
disablePerMetricRequests bool
}
const (
@@ -119,19 +120,16 @@ func (p *vmNativeProcessor) runSingle(ctx context.Context, f native.Filter, srcU
return fmt.Errorf("failed to init export pipe: %w", err)
}
if p.disableRetries && bar != nil {
if p.disablePerMetricRequests && bar != nil {
fmt.Printf("Continue import process with filter %s:\n", f.String())
reader = bar.NewProxyReader(reader)
}
pr, pw := io.Pipe()
done := make(chan struct{})
importCh := make(chan error)
go func() {
defer func() { close(done) }()
if err := p.dst.ImportPipe(ctx, dstURL, pr); err != nil {
logger.Errorf("error initialize import pipe: %s", err)
return
}
importCh <- p.dst.ImportPipe(ctx, dstURL, pr)
close(importCh)
}()
w := io.Writer(pw)
@@ -153,9 +151,8 @@ func (p *vmNativeProcessor) runSingle(ctx context.Context, f native.Filter, srcU
if err := pw.Close(); err != nil {
return err
}
<-done
return nil
return <-importCh
}
func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string, ranges [][]time.Time, silent bool) error {
@@ -193,7 +190,7 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
var foundSeriesMsg string
metrics := []string{p.filter.Match}
if !p.disableRetries {
if !p.disablePerMetricRequests {
log.Printf("Exploring metrics...")
metrics, err = p.src.Explore(ctx, p.filter, tenantID)
if err != nil {
@@ -231,7 +228,7 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
var bar *pb.ProgressBar
if !silent {
bar = barpool.NewSingleProgress(fmt.Sprintf(nativeWithBackoffTpl, barPrefix), len(metrics)*len(ranges))
if p.disableRetries {
if p.disablePerMetricRequests {
bar = barpool.NewSingleProgress(nativeSingleProcessTpl, 0)
}
bar.Start()
@@ -247,7 +244,7 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
go func() {
defer wg.Done()
for f := range filterCh {
if !p.disableRetries {
if !p.disablePerMetricRequests {
if err := p.do(ctx, f, srcURL, dstURL, nil); err != nil {
errCh <- err
return

View File

@@ -266,10 +266,16 @@ func fillStorage(series []vm.TimeSeries) error {
for _, series := range series {
var labels []prompb.Label
for _, lp := range series.LabelPairs {
labels = append(labels, prompb.Label{Name: []byte(lp.Name), Value: []byte(lp.Value)})
labels = append(labels, prompb.Label{
Name: lp.Name,
Value: lp.Value,
})
}
if series.Name != "" {
labels = append(labels, prompb.Label{Name: []byte("__name__"), Value: []byte(series.Name)})
labels = append(labels, prompb.Label{
Name: "__name__",
Value: series.Name,
})
}
mr := storage.MetricRow{}
mr.MetricNameRaw = storage.MarshalMetricNameRaw(mr.MetricNameRaw[:0], labels)

View File

@@ -27,12 +27,11 @@ type InsertCtx struct {
// Reset resets ctx for future fill with rowsLen rows.
func (ctx *InsertCtx) Reset(rowsLen int) {
for i := range ctx.Labels {
label := &ctx.Labels[i]
label.Name = nil
label.Value = nil
labels := ctx.Labels
for i := range labels {
labels[i] = prompb.Label{}
}
ctx.Labels = ctx.Labels[:0]
ctx.Labels = labels[:0]
mrs := ctx.mrs
for i := range mrs {
@@ -112,8 +111,8 @@ func (ctx *InsertCtx) AddLabelBytes(name, value []byte) {
ctx.Labels = append(ctx.Labels, prompb.Label{
// Do not copy name and value contents for performance reasons.
// This reduces GC overhead on the number of objects and allocations.
Name: name,
Value: value,
Name: bytesutil.ToUnsafeString(name),
Value: bytesutil.ToUnsafeString(value),
})
}
@@ -130,8 +129,8 @@ func (ctx *InsertCtx) AddLabel(name, value string) {
ctx.Labels = append(ctx.Labels, prompb.Label{
// Do not copy name and value contents for performance reasons.
// This reduces GC overhead on the number of objects and allocations.
Name: bytesutil.ToUnsafeBytes(name),
Value: bytesutil.ToUnsafeBytes(value),
Name: name,
Value: value,
})
}

View File

@@ -38,7 +38,7 @@ var (
saCfgReloads = metrics.NewCounter(`vminsert_streamagg_config_reloads_total`)
saCfgReloadErr = metrics.NewCounter(`vminsert_streamagg_config_reloads_errors_total`)
saCfgSuccess = metrics.NewCounter(`vminsert_streamagg_config_last_reload_successful`)
saCfgSuccess = metrics.NewGauge(`vminsert_streamagg_config_last_reload_successful`, nil)
saCfgTimestamp = metrics.NewCounter(`vminsert_streamagg_config_last_reload_success_timestamp_seconds`)
sasGlobal atomic.Pointer[streamaggr.Aggregators]

View File

@@ -1,4 +1,4 @@
package datadog
package datadogv1
import (
"net/http"
@@ -7,31 +7,30 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadog"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadog/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogv1"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogv1/stream"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="datadog"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="datadog"}`)
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="datadogv1"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="datadogv1"}`)
)
// InsertHandlerForHTTP processes remote write for DataDog POST /api/v1/series request.
//
// See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
func InsertHandlerForHTTP(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
ce := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, ce, func(series []parser.Series) error {
return stream.Parse(req.Body, ce, func(series []datadogv1.Series) error {
return insertRows(series, extraLabels)
})
}
func insertRows(series []parser.Series, extraLabels []prompbmarshal.Label) error {
func insertRows(series []datadogv1.Series, extraLabels []prompbmarshal.Label) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
@@ -54,7 +53,7 @@ func insertRows(series []parser.Series, extraLabels []prompbmarshal.Label) error
ctx.AddLabel("device", ss.Device)
}
for _, tag := range ss.Tags {
name, value := parser.SplitTag(tag)
name, value := datadogutils.SplitTag(tag)
if name == "host" {
name = "exported_host"
}

View File

@@ -0,0 +1,91 @@
package datadogv2
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogv2"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogv2/stream"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="datadogv2"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="datadogv2"}`)
)
// InsertHandlerForHTTP processes remote write for DataDog POST /api/v2/series request.
//
// See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
func InsertHandlerForHTTP(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
ct := req.Header.Get("Content-Type")
ce := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, ce, ct, func(series []datadogv2.Series) error {
return insertRows(series, extraLabels)
})
}
func insertRows(series []datadogv2.Series, extraLabels []prompbmarshal.Label) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
rowsLen := 0
for i := range series {
rowsLen += len(series[i].Points)
}
ctx.Reset(rowsLen)
rowsTotal := 0
hasRelabeling := relabel.HasRelabeling()
for i := range series {
ss := &series[i]
rowsTotal += len(ss.Points)
ctx.Labels = ctx.Labels[:0]
ctx.AddLabel("", ss.Metric)
for _, rs := range ss.Resources {
ctx.AddLabel(rs.Type, rs.Name)
}
for _, tag := range ss.Tags {
name, value := datadogutils.SplitTag(tag)
if name == "host" {
name = "exported_host"
}
ctx.AddLabel(name, value)
}
if ss.SourceTypeName != "" {
ctx.AddLabel("source_type_name", ss.SourceTypeName)
}
for j := range extraLabels {
label := &extraLabels[j]
ctx.AddLabel(label.Name, label.Value)
}
if hasRelabeling {
ctx.ApplyRelabeling()
}
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
ctx.SortLabelsIfNeeded()
var metricNameRaw []byte
var err error
for _, pt := range ss.Points {
timestamp := pt.Timestamp * 1000
value := pt.Value
metricNameRaw, err = ctx.WriteDataPointExt(metricNameRaw, ctx.Labels, timestamp, value)
if err != nil {
return err
}
}
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ctx.FlushBufs()
}

View File

@@ -160,11 +160,9 @@ func (ctx *pushCtx) reset() {
originLabels := ctx.originLabels
for i := range originLabels {
label := &originLabels[i]
label.Name = nil
label.Value = nil
originLabels[i] = prompb.Label{}
}
ctx.originLabels = ctx.originLabels[:0]
ctx.originLabels = originLabels[:0]
}
func getPushCtx() *pushCtx {

View File

@@ -13,7 +13,8 @@ import (
vminsertCommon "github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/datadog"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/datadogv1"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/datadogv2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/native"
@@ -28,6 +29,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/influxutils"
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
@@ -61,7 +63,8 @@ var (
"See also -opentsdbHTTPListenAddr.useProxyProtocol")
opentsdbHTTPUseProxyProtocol = flag.Bool("opentsdbHTTPListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted "+
"at -opentsdbHTTPListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt")
configAuthKey = flag.String("configAuthKey", "", "Authorization key for accessing /config page. It must be passed via authKey query arg")
configAuthKey = flagutil.NewPassword("configAuthKey", "Authorization key for accessing /config page. It must be passed via authKey query arg")
reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed as authKey=...")
maxLabelsPerTimeseries = flag.Int("maxLabelsPerTimeseries", 30, "The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented")
maxLabelValueLen = flag.Int("maxLabelValueLen", 16*1024, "The maximum length of label values in the accepted time series. Longer label values are truncated. In this case the vm_too_long_label_values_total metric at /metrics page is incremented")
)
@@ -246,9 +249,20 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "/datadog/api/v1/series":
datadogWriteRequests.Inc()
if err := datadog.InsertHandlerForHTTP(r); err != nil {
datadogWriteErrors.Inc()
datadogv1WriteRequests.Inc()
if err := datadogv1.InsertHandlerForHTTP(r); err != nil {
datadogv1WriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "/datadog/api/v2/series":
datadogv2WriteRequests.Inc()
if err := datadogv2.InsertHandlerForHTTP(r); err != nil {
datadogv2WriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
@@ -303,7 +317,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
return true
case "/prometheus/config", "/config":
if !httpserver.CheckAuthFlag(w, r, *configAuthKey, "configAuthKey") {
if !httpserver.CheckAuthFlag(w, r, configAuthKey.Get(), "configAuthKey") {
return true
}
promscrapeConfigRequests.Inc()
@@ -312,7 +326,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
case "/prometheus/api/v1/status/config", "/api/v1/status/config":
// See https://prometheus.io/docs/prometheus/latest/querying/api/#config
if !httpserver.CheckAuthFlag(w, r, *configAuthKey, "configAuthKey") {
if !httpserver.CheckAuthFlag(w, r, configAuthKey.Get(), "configAuthKey") {
return true
}
promscrapeStatusConfigRequests.Inc()
@@ -322,6 +336,9 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%q}}`, bb.B)
return true
case "/prometheus/-/reload", "/-/reload":
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey.Get(), "reloadAuthKey") {
return true
}
promscrapeConfigReloadRequests.Inc()
procutil.SelfSIGHUP()
w.WriteHeader(http.StatusNoContent)
@@ -371,8 +388,11 @@ var (
influxQueryRequests = metrics.NewCounter(`vm_http_requests_total{path="/influx/query", protocol="influx"}`)
datadogWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/api/v1/series", protocol="datadog"}`)
datadogWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/datadog/api/v1/series", protocol="datadog"}`)
datadogv1WriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/api/v1/series", protocol="datadog"}`)
datadogv1WriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/datadog/api/v1/series", protocol="datadog"}`)
datadogv2WriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/api/v2/series", protocol="datadog"}`)
datadogv2WriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/datadog/api/v2/series", protocol="datadog"}`)
datadogValidateRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/api/v1/validate", protocol="datadog"}`)
datadogCheckRunRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/api/v1/check_run", protocol="datadog"}`)

View File

@@ -46,7 +46,7 @@ func insertRows(timeseries []prompb.TimeSeries, extraLabels []prompbmarshal.Labe
ctx.Labels = ctx.Labels[:0]
srcLabels := ts.Labels
for _, srcLabel := range srcLabels {
ctx.AddLabelBytes(srcLabel.Name, srcLabel.Value)
ctx.AddLabel(srcLabel.Name, srcLabel.Value)
}
for j := range extraLabels {
label := &extraLabels[j]

View File

@@ -5,7 +5,6 @@ import (
"fmt"
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
@@ -65,7 +64,7 @@ func Init() {
var (
configReloads = metrics.NewCounter(`vm_relabel_config_reloads_total`)
configReloadErrors = metrics.NewCounter(`vm_relabel_config_reloads_errors_total`)
configSuccess = metrics.NewCounter(`vm_relabel_config_last_reload_successful`)
configSuccess = metrics.NewGauge(`vm_relabel_config_last_reload_successful`, nil)
configTimestamp = metrics.NewCounter(`vm_relabel_config_last_reload_success_timestamp_seconds`)
)
@@ -118,11 +117,11 @@ func (ctx *Ctx) ApplyRelabeling(labels []prompb.Label) []prompb.Label {
// Convert labels to prompbmarshal.Label format suitable for relabeling.
tmpLabels := ctx.tmpLabels[:0]
for _, label := range labels {
name := bytesutil.ToUnsafeString(label.Name)
if len(name) == 0 {
name := label.Name
if name == "" {
name = "__name__"
}
value := bytesutil.ToUnsafeString(label.Value)
value := label.Value
tmpLabels = append(tmpLabels, prompbmarshal.Label{
Name: name,
Value: value,
@@ -155,11 +154,11 @@ func (ctx *Ctx) ApplyRelabeling(labels []prompb.Label) []prompb.Label {
// Return back labels to the desired format.
dst := labels[:0]
for _, label := range tmpLabels {
name := bytesutil.ToUnsafeBytes(label.Name)
name := label.Name
if label.Name == "__name__" {
name = nil
name = ""
}
value := bytesutil.ToUnsafeBytes(label.Value)
value := label.Value
dst = append(dst, prompb.Label{
Name: name,
Value: value,

View File

@@ -36,7 +36,6 @@ func main() {
envflag.Parse()
buildinfo.Init()
logger.Init()
pushmetrics.Init()
go httpserver.Serve(*httpListenAddr, false, nil)
@@ -54,9 +53,11 @@ func main() {
Dst: dstFS,
SkipBackupCompleteCheck: *skipBackupCompleteCheck,
}
pushmetrics.Init()
if err := a.Run(); err != nil {
logger.Fatalf("cannot restore from backup: %s", err)
}
pushmetrics.Stop()
srcFS.MustStop()
dstFS.MustStop()

View File

@@ -123,13 +123,13 @@ func registerMetrics(startTime time.Time, w http.ResponseWriter, r *http.Request
// Convert parsed metric and tags to labels.
labels = append(labels[:0], prompb.Label{
Name: []byte("__name__"),
Value: []byte(row.Metric),
Name: "__name__",
Value: row.Metric,
})
for _, tag := range row.Tags {
labels = append(labels, prompb.Label{
Name: []byte(tag.Key),
Value: []byte(tag.Value),
Name: tag.Key,
Value: tag.Value,
})
}

View File

@@ -3599,6 +3599,17 @@ func groupSeriesByNodes(ss []*series, nodes []graphiteql.Expr) map[string][]*ser
return m
}
func getAbsoluteNodeIndex(index, size int) int {
// Handle the negative index case as Python does
if index < 0 {
index = size + index
}
if index < 0 || index >= size {
return -1
}
return index
}
func getNameFromNodes(name string, tags map[string]string, nodes []graphiteql.Expr) string {
if len(nodes) == 0 {
return ""
@@ -3609,7 +3620,7 @@ func getNameFromNodes(name string, tags map[string]string, nodes []graphiteql.Ex
for _, node := range nodes {
switch t := node.(type) {
case *graphiteql.NumberExpr:
if n := int(t.N); n >= 0 && n < len(parts) {
if n := getAbsoluteNodeIndex(int(t.N), len(parts)); n >= 0 {
dstParts = append(dstParts, parts[n])
}
case *graphiteql.StringExpr:

View File

@@ -79,3 +79,31 @@ func TestGraphiteToGolangRegexpReplace(t *testing.T) {
f(`a\d+`, `a\d+`)
f(`\1f\\oo\2`, `$1f\\oo$2`)
}
func TestGetAbsoluteNodeIndex(t *testing.T) {
f := func(index, size, expectedIndex int) {
t.Helper()
absoluteIndex := getAbsoluteNodeIndex(index, size)
if absoluteIndex != expectedIndex {
t.Fatalf("unexpected result for getAbsoluteNodeIndex(%d, %d); got %d; want %d", index, size, expectedIndex, absoluteIndex)
}
}
f(1, 1, -1)
f(0, 1, 0)
f(-1, 3, 2)
f(-3, 1, -1)
f(-1, 1, 0)
f(-2, 1, -1)
f(3, 2, -1)
f(2, 2, -1)
f(1, 2, 1)
f(0, 2, 0)
f(-1, 2, 1)
f(-2, 2, 0)
f(-3, 2, -1)
f(-5, 2, -1)
f(-1, 100, 99)
f(-99, 100, 1)
f(-100, 100, 0)
f(-101, 100, -1)
}

View File

@@ -18,6 +18,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
@@ -29,13 +30,13 @@ import (
)
var (
deleteAuthKey = flag.String("deleteAuthKey", "", "authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries")
deleteAuthKey = flagutil.NewPassword("deleteAuthKey", "authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries")
maxConcurrentRequests = flag.Int("search.maxConcurrentRequests", getDefaultMaxConcurrentRequests(), "The maximum number of concurrent search requests. "+
"It shouldn't be high, since a single request can saturate all the CPU cores, while many concurrently executed requests may require high amounts of memory. "+
"See also -search.maxQueueDuration and -search.maxMemoryPerQuery")
maxQueueDuration = flag.Duration("search.maxQueueDuration", 10*time.Second, "The maximum time the request waits for execution when -search.maxConcurrentRequests "+
"limit is reached; see also -search.maxQueryDuration")
resetCacheAuthKey = flag.String("search.resetCacheAuthKey", "", "Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call")
resetCacheAuthKey = flagutil.NewPassword("search.resetCacheAuthKey", "Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call")
logSlowQueryDuration = flag.Duration("search.logSlowQueryDuration", 5*time.Second, "Log queries with execution time exceeding this value. Zero disables slow query logging. "+
"See also -search.logQueryMemoryUsage")
vmalertProxyURL = flag.String("vmalert.proxyURL", "", "Optional URL for proxying requests to vmalert. For example, if -vmalert.proxyURL=http://vmalert:8880 , then alerting API requests such as /api/v1/rules from Grafana will be proxied to http://vmalert:8880/api/v1/rules")
@@ -148,8 +149,9 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
"are executed. Possible solutions: to reduce query load; to add more compute resources to the server; "+
"to increase -search.maxQueueDuration=%s; to increase -search.maxQueryDuration; to increase -search.maxConcurrentRequests",
d.Seconds(), *maxConcurrentRequests, maxQueueDuration),
StatusCode: http.StatusServiceUnavailable,
StatusCode: http.StatusTooManyRequests,
}
w.Header().Add("Retry-After", "10")
httpserver.Errorf(w, r, "%s", err)
return true
}
@@ -170,7 +172,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
if path == "/internal/resetRollupResultCache" {
if !httpserver.CheckAuthFlag(w, r, *resetCacheAuthKey, "resetCacheAuthKey") {
if !httpserver.CheckAuthFlag(w, r, resetCacheAuthKey.Get(), "resetCacheAuthKey") {
return true
}
promql.ResetRollupResultCache()
@@ -367,7 +369,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
return true
case "/tags/delSeries":
if !httpserver.CheckAuthFlag(w, r, *deleteAuthKey, "deleteAuthKey") {
if !httpserver.CheckAuthFlag(w, r, deleteAuthKey.Get(), "deleteAuthKey") {
return true
}
graphiteTagsDelSeriesRequests.Inc()
@@ -386,7 +388,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
return true
case "/api/v1/admin/tsdb/delete_series":
if !httpserver.CheckAuthFlag(w, r, *deleteAuthKey, "deleteAuthKey") {
if !httpserver.CheckAuthFlag(w, r, deleteAuthKey.Get(), "deleteAuthKey") {
return true
}
deleteRequests.Inc()
@@ -425,6 +427,14 @@ func handleStaticAndSimpleRequests(w http.ResponseWriter, r *http.Request, path
}
return true
}
if path == "/vmui/timezone" {
httpserver.EnableCORS(w, r)
if err := handleVMUITimezone(w); err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
}
return true
}
if strings.HasPrefix(path, "/vmui/") {
if strings.HasPrefix(path, "/vmui/static/") {
// Allow clients caching static contents for long period of time, since it shouldn't change over time.

View File

@@ -5,10 +5,12 @@ import (
"errors"
"flag"
"fmt"
"reflect"
"sort"
"sync"
"sync/atomic"
"time"
"unsafe"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
@@ -88,30 +90,6 @@ type timeseriesWork struct {
rowsProcessed int
}
func (tsw *timeseriesWork) reset() {
tsw.mustStop = nil
tsw.rss = nil
tsw.pts = nil
tsw.f = nil
tsw.err = nil
tsw.rowsProcessed = 0
}
func getTimeseriesWork() *timeseriesWork {
v := tswPool.Get()
if v == nil {
v = &timeseriesWork{}
}
return v.(*timeseriesWork)
}
func putTimeseriesWork(tsw *timeseriesWork) {
tsw.reset()
tswPool.Put(tsw)
}
var tswPool sync.Pool
func (tsw *timeseriesWork) do(r *Result, workerID uint) error {
if atomic.LoadUint32(tsw.mustStop) != 0 {
return nil
@@ -270,22 +248,20 @@ func (rss *Results) runParallel(qt *querytracer.Tracer, f func(rs *Result, worke
maxWorkers := MaxWorkers()
if maxWorkers == 1 || tswsLen == 1 {
// It is faster to process time series in the current goroutine.
tsw := getTimeseriesWork()
var tsw timeseriesWork
tmpResult := getTmpResult()
rowsProcessedTotal := 0
var err error
for i := range rss.packedTimeseries {
initTimeseriesWork(tsw, &rss.packedTimeseries[i])
initTimeseriesWork(&tsw, &rss.packedTimeseries[i])
err = tsw.do(&tmpResult.rs, 0)
rowsReadPerSeries.Update(float64(tsw.rowsProcessed))
rowsProcessedTotal += tsw.rowsProcessed
if err != nil {
break
}
tsw.reset()
}
putTmpResult(tmpResult)
putTimeseriesWork(tsw)
return rowsProcessedTotal, err
}
@@ -295,11 +271,9 @@ func (rss *Results) runParallel(qt *querytracer.Tracer, f func(rs *Result, worke
// which reduces the scalability on systems with many CPU cores.
// Prepare the work for workers.
tsws := make([]*timeseriesWork, len(rss.packedTimeseries))
tsws := make([]timeseriesWork, len(rss.packedTimeseries))
for i := range rss.packedTimeseries {
tsw := getTimeseriesWork()
initTimeseriesWork(tsw, &rss.packedTimeseries[i])
tsws[i] = tsw
initTimeseriesWork(&tsws[i], &rss.packedTimeseries[i])
}
// Prepare worker channels.
@@ -314,9 +288,9 @@ func (rss *Results) runParallel(qt *querytracer.Tracer, f func(rs *Result, worke
}
// Spread work among workers.
for i, tsw := range tsws {
for i := range tsws {
idx := i % len(workChs)
workChs[idx] <- tsw
workChs[idx] <- &tsws[i]
}
// Mark worker channels as closed.
for _, workCh := range workChs {
@@ -339,14 +313,14 @@ func (rss *Results) runParallel(qt *querytracer.Tracer, f func(rs *Result, worke
// Collect results.
var firstErr error
rowsProcessedTotal := 0
for _, tsw := range tsws {
for i := range tsws {
tsw := &tsws[i]
if tsw.err != nil && firstErr == nil {
// Return just the first error, since other errors are likely duplicate the first error.
firstErr = tsw.err
}
rowsReadPerSeries.Update(float64(tsw.rowsProcessed))
rowsProcessedTotal += tsw.rowsProcessed
putTimeseriesWork(tsw)
}
return rowsProcessedTotal, firstErr
}
@@ -1044,7 +1018,7 @@ func ExportBlocks(qt *querytracer.Tracer, sq *storage.SearchQuery, deadline sear
for xw := range workCh {
if err := f(&xw.mn, &xw.b, tr, workerID); err != nil {
errGlobalLock.Lock()
if errGlobal != nil {
if errGlobal == nil {
errGlobal = err
atomic.StoreUint32(&mustStop, 1)
}
@@ -1169,15 +1143,44 @@ func ProcessSearchQuery(qt *querytracer.Tracer, sq *storage.SearchQuery, deadlin
maxSeriesCount := sr.Init(qt, vmstorage.Storage, tfss, tr, sq.MaxMetrics, deadline.Deadline())
indexSearchDuration.UpdateDuration(startTime)
type blockRefs struct {
brsPrealloc [4]blockRef
brs []blockRef
brs []blockRef
}
m := make(map[string]*blockRefs, maxSeriesCount)
orderedMetricNames := make([]string, 0, maxSeriesCount)
blocksRead := 0
samples := 0
tbf := getTmpBlocksFile()
var buf []byte
var metricNamePrev []byte
// metricNamesBuf is used for holding all the loaded unique metric names at m and orderedMetricNames.
// It should reduce pressure on Go GC by reducing the number of string allocations
// when constructing metricName string from byte slice.
metricNamesBufCap := maxSeriesCount * 100
if metricNamesBufCap > maxFastAllocBlockSize {
metricNamesBufCap = maxFastAllocBlockSize
}
metricNamesBuf := make([]byte, 0, metricNamesBufCap)
// brssPool is used for holding all the blockRefs objects across all the loaded time series.
// It should reduce pressure on Go GC by reducing the number of blockRefs allocations.
brssPool := make([]blockRefs, 0, maxSeriesCount)
// brsPool is used for holding the most of blockRefs.brs slices across all the loaded time series.
// It should reduce pressure on Go GC by reducing the number of allocations for blockRefs.brs slices.
brsPoolCap := uintptr(maxSeriesCount)
if brsPoolCap > maxFastAllocBlockSize/unsafe.Sizeof(blockRef{}) {
brsPoolCap = maxFastAllocBlockSize / unsafe.Sizeof(blockRef{})
}
brsPool := make([]blockRef, 0, brsPoolCap)
// m maps from metricName to the index of blockRefs inside brssPool
m := make(map[string]int, maxSeriesCount)
// orderedMetricNames contains the list of loaded unique metric names
// in the load order. This order is important for triggering sequential data reading.
orderedMetricNames := make([]string, 0, maxSeriesCount)
var brsIdx int
for sr.NextMetricBlock() {
blocksRead++
if deadline.Exceeded() {
@@ -1190,8 +1193,10 @@ func ProcessSearchQuery(qt *querytracer.Tracer, sq *storage.SearchQuery, deadlin
if *maxSamplesPerQuery > 0 && samples > *maxSamplesPerQuery {
putTmpBlocksFile(tbf)
putStorageSearch(sr)
return nil, fmt.Errorf("cannot select more than -search.maxSamplesPerQuery=%d samples; possible solutions: to increase the -search.maxSamplesPerQuery; to reduce time range for the query; to use more specific label filters in order to select lower number of series", *maxSamplesPerQuery)
return nil, fmt.Errorf("cannot select more than -search.maxSamplesPerQuery=%d samples; possible solutions: to increase the -search.maxSamplesPerQuery; "+
"to reduce time range for the query; to use more specific label filters in order to select lower number of series", *maxSamplesPerQuery)
}
buf = br.Marshal(buf[:0])
addr, err := tbf.WriteBlockRefData(buf)
if err != nil {
@@ -1199,24 +1204,59 @@ func ProcessSearchQuery(qt *querytracer.Tracer, sq *storage.SearchQuery, deadlin
putStorageSearch(sr)
return nil, fmt.Errorf("cannot write %d bytes to temporary file: %w", len(buf), err)
}
// Do not intern mb.MetricName, since it leads to increased memory usage.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692
metricName := sr.MetricBlockRef.MetricName
brs := m[string(metricName)]
if brs == nil {
brs = &blockRefs{}
brs.brs = brs.brsPrealloc[:0]
if metricNamePrev == nil || string(metricName) != string(metricNamePrev) {
idx, ok := m[string(metricName)]
if !ok {
if cap(brssPool) > len(brssPool) {
brssPool = brssPool[:len(brssPool)+1]
} else {
brssPool = append(brssPool, blockRefs{})
}
idx = len(brssPool) - 1
}
brsIdx = idx
metricNamePrev = append(metricNamePrev[:0], metricName...)
}
brs := &brssPool[brsIdx]
partRef := br.PartRef()
if uintptr(cap(brsPool)) >= maxFastAllocBlockSize/unsafe.Sizeof(blockRef{}) && len(brsPool) == cap(brsPool) {
// Allocate a new brsPool in order to avoid slow allocation of an object
// bigger than maxFastAllocBlockSize bytes at append() below.
brsPool = make([]blockRef, 0, maxFastAllocBlockSize/unsafe.Sizeof(blockRef{}))
}
if brs.brs == nil || haveSameBlockRefTails(brs.brs, brsPool) {
// It is safe appending blockRef to brsPool, since there are no other items added there yet.
brsPool = append(brsPool, blockRef{
partRef: partRef,
addr: addr,
})
brs.brs = brsPool[len(brsPool)-len(brs.brs)-1 : len(brsPool) : len(brsPool)]
} else {
// It is unsafe appending blockRef to brsPool, since there are other items added there.
// So just append it to brs.brs.
brs.brs = append(brs.brs, blockRef{
partRef: partRef,
addr: addr,
})
}
brs.brs = append(brs.brs, blockRef{
partRef: br.PartRef(),
addr: addr,
})
if len(brs.brs) == 1 {
metricNameStr := string(metricName)
if cap(metricNamesBuf) >= maxFastAllocBlockSize && len(metricNamesBuf)+len(metricName) > cap(metricNamesBuf) {
// Allocate a new metricNamesBuf in order to avoid slow allocation of byte slice
// bigger than maxFastAllocBlockSize bytes at append() below.
metricNamesBuf = make([]byte, 0, maxFastAllocBlockSize)
}
metricNamesBufLen := len(metricNamesBuf)
metricNamesBuf = append(metricNamesBuf, metricName...)
metricNameStr := bytesutil.ToUnsafeString(metricNamesBuf[metricNamesBufLen:])
orderedMetricNames = append(orderedMetricNames, metricNameStr)
m[metricNameStr] = brs
m[metricNameStr] = brsIdx
}
}
if err := sr.Error(); err != nil {
putTmpBlocksFile(tbf)
putStorageSearch(sr)
@@ -1239,7 +1279,7 @@ func ProcessSearchQuery(qt *querytracer.Tracer, sq *storage.SearchQuery, deadlin
for i, metricName := range orderedMetricNames {
pts[i] = packedTimeseries{
metricName: metricName,
brs: m[metricName].brs,
brs: brssPool[m[metricName]].brs,
}
}
rss.packedTimeseries = pts
@@ -1255,6 +1295,12 @@ type blockRef struct {
addr tmpBlockAddr
}
func haveSameBlockRefTails(a, b []blockRef) bool {
sha := (*reflect.SliceHeader)(unsafe.Pointer(&a))
shb := (*reflect.SliceHeader)(unsafe.Pointer(&b))
return sha.Data+uintptr(sha.Len)*unsafe.Sizeof(blockRef{}) == shb.Data+uintptr(shb.Len)*unsafe.Sizeof(blockRef{})
}
func setupTfss(qt *querytracer.Tracer, tr storage.TimeRange, tagFilterss [][]storage.TagFilter, maxMetrics int, deadline searchutils.Deadline) ([]*storage.TagFilters, error) {
tfss := make([]*storage.TagFilters, 0, len(tagFilterss))
for _, tagFilters := range tagFilterss {
@@ -1300,3 +1346,8 @@ func applyGraphiteRegexpFilter(filter string, ss []string) ([]string, error) {
}
return dst, nil
}
// Go uses fast allocations for block sizes up to 32Kb.
//
// See https://github.com/golang/go/blob/704401ffa06c60e059c9e6e4048045b4ff42530a/src/runtime/malloc.go#L11
const maxFastAllocBlockSize = 32 * 1024

View File

@@ -82,7 +82,7 @@ textarea { margin: 1em }
<h3>Tutorial for WITH expressions in <a href="https://docs.victoriametrics.com/MetricsQL.html">MetricsQL</a></h3>
<p>
Let's look at the following real query from <a href="https://grafana.com/grafana/dashboards/1860-node-exporter-full/">Node Exporter Full</a> dashboard:
Let's look at the following real query from <a href="https://grafana.com/grafana/dashboards/1860">Node Exporter Full</a> dashboard:
</p>
<pre>
@@ -146,7 +146,7 @@ my_resource_utilization(
</p>
<p>
Let's take another nice query from <a href="https://grafana.com/grafana/dashboards/1860-node-exporter-full/">Node Exporter Full</a> dashboard:
Let's take another nice query from <a href="https://grafana.com/grafana/dashboards/1860">Node Exporter Full</a> dashboard:
</p>
<pre>

View File

@@ -195,7 +195,7 @@ func streamwithExprsTutorial(qw422016 *qt422016.Writer) {
<h3>Tutorial for WITH expressions in <a href="https://docs.victoriametrics.com/MetricsQL.html">MetricsQL</a></h3>
<p>
Let's look at the following real query from <a href="https://grafana.com/grafana/dashboards/1860-node-exporter-full/">Node Exporter Full</a> dashboard:
Let's look at the following real query from <a href="https://grafana.com/grafana/dashboards/1860">Node Exporter Full</a> dashboard:
</p>
<pre>
@@ -259,7 +259,7 @@ my_resource_utilization(
</p>
<p>
Let's take another nice query from <a href="https://grafana.com/grafana/dashboards/1860-node-exporter-full/">Node Exporter Full</a> dashboard:
Let's take another nice query from <a href="https://grafana.com/grafana/dashboards/1860">Node Exporter Full</a> dashboard:
</p>
<pre>

View File

@@ -321,7 +321,9 @@ func exportHandler(qt *querytracer.Tracer, w http.ResponseWriter, cp *commonPara
firstLineSent := uint32(0)
writeLineFunc = func(xb *exportBlock, workerID uint) error {
bb := sw.getBuffer(workerID)
if atomic.CompareAndSwapUint32(&firstLineOnce, 0, 1) {
// Use atomic.LoadUint32() in front of atomic.CompareAndSwapUint32() in order to avoid slow inter-CPU synchronization
// in fast path after the first line has been already sent.
if atomic.LoadUint32(&firstLineOnce) == 0 && atomic.CompareAndSwapUint32(&firstLineOnce, 0, 1) {
// Send the first line to sw.bw
WriteExportPromAPILine(bb, xb)
_, err := sw.bw.Write(bb.B)
@@ -497,7 +499,10 @@ func LabelValuesHandler(qt *querytracer.Tracer, startTime time.Time, labelName s
if err != nil {
return err
}
sq := storage.NewSearchQuery(cp.start, cp.end, cp.filterss, *maxUniqueTimeseries)
// Do not limit the number of unique time series, which could be scanned
// during the search for matching label values, since users expect this API
// must always work.
sq := storage.NewSearchQuery(cp.start, cp.end, cp.filterss, -1)
labelValues, err := netstorage.LabelValues(qt, labelName, sq, limit, cp.deadline)
if err != nil {
return fmt.Errorf("cannot obtain values for label %q: %w", labelName, err)
@@ -594,7 +599,10 @@ func LabelsHandler(qt *querytracer.Tracer, startTime time.Time, w http.ResponseW
if err != nil {
return err
}
sq := storage.NewSearchQuery(cp.start, cp.end, cp.filterss, *maxUniqueTimeseries)
// Do not limit the number of unique time series, which could be scanned
// during the search for matching label values, since users expect this API
// must always work.
sq := storage.NewSearchQuery(cp.start, cp.end, cp.filterss, -1)
labels, err := netstorage.LabelNames(qt, sq, limit, cp.deadline)
if err != nil {
return fmt.Errorf("cannot obtain labels: %w", err)

View File

@@ -111,41 +111,48 @@ func aggrFuncExt(afe func(tss []*timeseries, modifier *metricsql.ModifierExpr) [
modifier *metricsql.ModifierExpr, maxSeries int, keepOriginal bool) ([]*timeseries, error) {
m := aggrPrepareSeries(argOrig, modifier, maxSeries, keepOriginal)
rvs := make([]*timeseries, 0, len(m))
for _, tss := range m {
rv := afe(tss, modifier)
for _, tssl := range m {
rv := afe(tssl.tss, modifier)
rvs = append(rvs, rv...)
}
return rvs, nil
}
func aggrPrepareSeries(argOrig []*timeseries, modifier *metricsql.ModifierExpr, maxSeries int, keepOriginal bool) map[string][]*timeseries {
func aggrPrepareSeries(argOrig []*timeseries, modifier *metricsql.ModifierExpr, maxSeries int, keepOriginal bool) map[string]*tssList {
// Remove empty time series, e.g. series with all NaN samples,
// since such series are ignored by aggregate functions.
argOrig = removeEmptySeries(argOrig)
arg := copyTimeseriesMetricNames(argOrig, keepOriginal)
// Perform grouping.
m := make(map[string][]*timeseries)
m := make(map[string]*tssList)
bb := bbPool.Get()
for i, ts := range arg {
removeGroupTags(&ts.MetricName, modifier)
bb.B = marshalMetricNameSorted(bb.B[:0], &ts.MetricName)
k := string(bb.B)
k := bb.B
if keepOriginal {
ts = argOrig[i]
}
tss := m[k]
if tss == nil && maxSeries > 0 && len(m) >= maxSeries {
// We already reached time series limit after grouping. Skip other time series.
continue
tssl := m[string(k)]
if tssl == nil {
if maxSeries > 0 && len(m) >= maxSeries {
// We already reached time series limit after grouping. Skip other time series.
continue
}
tssl = &tssList{}
m[string(k)] = tssl
}
tss = append(tss, ts)
m[k] = tss
tssl.tss = append(tssl.tss, ts)
}
bbPool.Put(bb)
return m
}
type tssList struct {
tss []*timeseries
}
func aggrFuncAny(afa *aggrFuncArg) ([]*timeseries, error) {
tss, err := getAggrTimeseries(afa.args)
if err != nil {
@@ -626,8 +633,8 @@ func aggrFuncCountValues(afa *aggrFuncArg) ([]*timeseries, error) {
m := aggrPrepareSeries(args[1], &afa.ae.Modifier, afa.ae.Limit, false)
rvs := make([]*timeseries, 0, len(m))
for _, tss := range m {
rv, err := afe(tss, modifier)
for _, tssl := range m {
rv, err := afe(tssl.tss, modifier)
if err != nil {
return nil, err
}
@@ -651,13 +658,14 @@ func newAggrFuncTopK(isReverse bool) aggrFunc {
}
afe := func(tss []*timeseries, modififer *metricsql.ModifierExpr) []*timeseries {
for n := range tss[0].Values {
lessFunc := lessWithNaNs
if isReverse {
lessFunc = greaterWithNaNs
}
sort.Slice(tss, func(i, j int) bool {
a := tss[i].Values[n]
b := tss[j].Values[n]
if isReverse {
a, b = b, a
}
return lessWithNaNs(a, b)
return lessFunc(a, b)
})
fillNaNsAtIdx(n, ks[n], tss)
}
@@ -710,17 +718,19 @@ func getRangeTopKTimeseries(tss []*timeseries, modifier *metricsql.ModifierExpr,
value: value,
}
}
lessFunc := lessWithNaNs
if isReverse {
lessFunc = greaterWithNaNs
}
sort.Slice(maxs, func(i, j int) bool {
a := maxs[i].value
b := maxs[j].value
if isReverse {
a, b = b, a
}
return lessWithNaNs(a, b)
return lessFunc(a, b)
})
for i := range maxs {
tss[i] = maxs[i].ts
}
remainingSumTS := getRemainingSumTimeseries(tss, modifier, ks, remainingSumTagName)
for i, k := range ks {
fillNaNsAtIdx(i, k, tss)
@@ -1253,12 +1263,27 @@ func newAggrQuantileFunc(phis []float64) func(tss []*timeseries, modifier *metri
}
func lessWithNaNs(a, b float64) bool {
// consider NaNs are smaller than non-NaNs
if math.IsNaN(a) {
return !math.IsNaN(b)
}
if math.IsNaN(b) {
return false
}
return a < b
}
func greaterWithNaNs(a, b float64) bool {
// consider NaNs are bigger than non-NaNs
if math.IsNaN(a) {
return !math.IsNaN(b)
}
if math.IsNaN(b) {
return false
}
return a > b
}
func floatToIntBounded(f float64) int {
if f > math.MaxInt {
return math.MaxInt

View File

@@ -111,8 +111,8 @@ func (iafc *incrementalAggrFuncContext) updateTimeseries(tsOrig *timeseries, wor
removeGroupTags(&ts.MetricName, &iafc.ae.Modifier)
bb := bbPool.Get()
bb.B = marshalMetricNameSorted(bb.B[:0], &ts.MetricName)
k := string(bb.B)
iac := m[k]
k := bb.B
iac := m[string(k)]
if iac == nil {
if iafc.ae.Limit > 0 && len(m) >= iafc.ae.Limit {
// Skip this time series, since the limit on the number of output time series has been already reached.
@@ -131,7 +131,7 @@ func (iafc *incrementalAggrFuncContext) updateTimeseries(tsOrig *timeseries, wor
ts: tsAggr,
values: make([]float64, len(ts.Values)),
}
m[k] = iac
m[string(k)] = iac
}
bbPool.Put(bb)
iafc.callbacks.updateAggrFunc(iac, ts.Values)

View File

@@ -2,9 +2,57 @@ package promql
import (
"math"
"sort"
"testing"
)
func TestSortWithNaNs(t *testing.T) {
f := func(a []float64, ascExpected, descExpected []float64) {
t.Helper()
equalSlices := func(a, b []float64) bool {
for i := range a {
x := a[i]
y := b[i]
if math.IsNaN(x) {
return math.IsNaN(y)
}
if math.IsNaN(y) {
return false
}
if x != y {
return false
}
}
return true
}
aCopy := append([]float64{}, a...)
sort.Slice(aCopy, func(i, j int) bool {
return lessWithNaNs(aCopy[i], aCopy[j])
})
if !equalSlices(aCopy, ascExpected) {
t.Fatalf("unexpected slice after asc sorting; got\n%v\nwant\n%v", aCopy, ascExpected)
}
aCopy = append(aCopy[:0], a...)
sort.Slice(aCopy, func(i, j int) bool {
return greaterWithNaNs(aCopy[i], aCopy[j])
})
if !equalSlices(aCopy, descExpected) {
t.Fatalf("unexpected slice after desc sorting; got\n%v\nwant\n%v", aCopy, descExpected)
}
}
f(nil, nil, nil)
f([]float64{1}, []float64{1}, []float64{1})
f([]float64{1, nan, 3, 2}, []float64{nan, 1, 2, 3}, []float64{nan, 3, 2, 1})
f([]float64{nan}, []float64{nan}, []float64{nan})
f([]float64{nan, nan, nan}, []float64{nan, nan, nan}, []float64{nan, nan, nan})
f([]float64{nan, 1, nan}, []float64{nan, nan, 1}, []float64{nan, nan, 1})
f([]float64{nan, 1, 0, 2, nan}, []float64{nan, nan, 0, 1, 2}, []float64{nan, nan, 2, 1, 0})
}
func TestModeNoNaNs(t *testing.T) {
f := func(prevValue float64, a []float64, expectedResult float64) {
t.Helper()

View File

@@ -404,9 +404,15 @@ func binaryOpDefault(bfa *binaryOpFuncArg) ([]*timeseries, error) {
func binaryOpOr(bfa *binaryOpFuncArg) ([]*timeseries, error) {
mLeft, mRight := createTimeseriesMapByTagSet(bfa.be, bfa.left, bfa.right)
var rvs []*timeseries
for _, tss := range mLeft {
rvs = append(rvs, tss...)
}
// Sort left-hand-side series by metric name as Prometheus does.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5393
sortSeriesByMetricName(rvs)
rvsLen := len(rvs)
for k, tssRight := range mRight {
tssLeft := mLeft[k]
if tssLeft == nil {
@@ -415,6 +421,10 @@ func binaryOpOr(bfa *binaryOpFuncArg) ([]*timeseries, error) {
}
fillLeftNaNsWithRightValues(tssLeft, tssRight)
}
// Sort the added right-hand-side series by metric name as Prometheus does.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5393
sortSeriesByMetricName(rvs[rvsLen:])
return rvs, nil
}

View File

@@ -686,10 +686,7 @@ func tryGetArgRollupFuncWithMetricExpr(ae *metricsql.AggrFuncExpr) (*metricsql.F
return nil, nil
}
// e = rollupFunc(metricExpr)
return &metricsql.FuncExpr{
Name: fe.Name,
Args: []metricsql.Expr{me},
}, nrf
return fe, nrf
}
if re, ok := arg.(*metricsql.RollupExpr); ok {
if me, ok := re.Expr.(*metricsql.MetricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
@@ -1268,7 +1265,7 @@ func evalInstantRollup(qt *querytracer.Tracer, ec *EvalConfig, funcName string,
return evalAt(qtChild, timestamp, window)
}
// Calculate the result
tss, ok := getMaxInstantValues(qtChild, tssCached, tssStart, tssEnd)
tss, ok := getMaxInstantValues(qtChild, tssCached, tssStart, tssEnd, timestamp)
if !ok {
qtChild.Printf("cannot apply instant rollup optimization, since tssEnd contains bigger values than tssCached")
deleteCachedSeries(qtChild)
@@ -1330,7 +1327,7 @@ func evalInstantRollup(qt *querytracer.Tracer, ec *EvalConfig, funcName string,
return evalAt(qtChild, timestamp, window)
}
// Calculate the result
tss, ok := getMinInstantValues(qtChild, tssCached, tssStart, tssEnd)
tss, ok := getMinInstantValues(qtChild, tssCached, tssStart, tssEnd, timestamp)
if !ok {
qtChild.Printf("cannot apply instant rollup optimization, since tssEnd contains smaller values than tssCached")
deleteCachedSeries(qtChild)
@@ -1395,7 +1392,7 @@ func evalInstantRollup(qt *querytracer.Tracer, ec *EvalConfig, funcName string,
return evalAt(qtChild, timestamp, window)
}
// Calculate the result
tss := getSumInstantValues(qtChild, tssCached, tssStart, tssEnd)
tss := getSumInstantValues(qtChild, tssCached, tssStart, tssEnd, timestamp)
return tss, nil
default:
qt.Printf("instant rollup optimization isn't implemented for %s()", funcName)
@@ -1422,8 +1419,8 @@ func hasDuplicateSeries(tss []*timeseries) bool {
return false
}
func getMinInstantValues(qt *querytracer.Tracer, tssCached, tssStart, tssEnd []*timeseries) ([]*timeseries, bool) {
qt = qt.NewChild("calculate the minimum for instant values across series; cached=%d, start=%d, end=%d", len(tssCached), len(tssStart), len(tssEnd))
func getMinInstantValues(qt *querytracer.Tracer, tssCached, tssStart, tssEnd []*timeseries, timestamp int64) ([]*timeseries, bool) {
qt = qt.NewChild("calculate the minimum for instant values across series; cached=%d, start=%d, end=%d, timestamp=%d", len(tssCached), len(tssStart), len(tssEnd), timestamp)
defer qt.Done()
getMin := func(a, b float64) float64 {
@@ -1432,13 +1429,13 @@ func getMinInstantValues(qt *querytracer.Tracer, tssCached, tssStart, tssEnd []*
}
return b
}
tss, ok := getMinMaxInstantValues(tssCached, tssStart, tssEnd, getMin)
tss, ok := getMinMaxInstantValues(tssCached, tssStart, tssEnd, timestamp, getMin)
qt.Printf("resulting series=%d; ok=%v", len(tss), ok)
return tss, ok
}
func getMaxInstantValues(qt *querytracer.Tracer, tssCached, tssStart, tssEnd []*timeseries) ([]*timeseries, bool) {
qt = qt.NewChild("calculate the maximum for instant values across series; cached=%d, start=%d, end=%d", len(tssCached), len(tssStart), len(tssEnd))
func getMaxInstantValues(qt *querytracer.Tracer, tssCached, tssStart, tssEnd []*timeseries, timestamp int64) ([]*timeseries, bool) {
qt = qt.NewChild("calculate the maximum for instant values across series; cached=%d, start=%d, end=%d, timestamp=%d", len(tssCached), len(tssStart), len(tssEnd), timestamp)
defer qt.Done()
getMax := func(a, b float64) float64 {
@@ -1447,12 +1444,12 @@ func getMaxInstantValues(qt *querytracer.Tracer, tssCached, tssStart, tssEnd []*
}
return b
}
tss, ok := getMinMaxInstantValues(tssCached, tssStart, tssEnd, getMax)
tss, ok := getMinMaxInstantValues(tssCached, tssStart, tssEnd, timestamp, getMax)
qt.Printf("resulting series=%d", len(tss))
return tss, ok
}
func getMinMaxInstantValues(tssCached, tssStart, tssEnd []*timeseries, f func(a, b float64) float64) ([]*timeseries, bool) {
func getMinMaxInstantValues(tssCached, tssStart, tssEnd []*timeseries, timestamp int64, f func(a, b float64) float64) ([]*timeseries, bool) {
assertInstantValues(tssCached)
assertInstantValues(tssStart)
assertInstantValues(tssEnd)
@@ -1503,12 +1500,16 @@ func getMinMaxInstantValues(tssCached, tssStart, tssEnd []*timeseries, f func(a,
for _, ts := range m {
rvs = append(rvs, ts)
}
setInstantTimestamp(rvs, timestamp)
return rvs, true
}
// getSumInstantValues calculates tssCached + tssStart - tssEnd
func getSumInstantValues(qt *querytracer.Tracer, tssCached, tssStart, tssEnd []*timeseries) []*timeseries {
qt = qt.NewChild("calculate the sum for instant values across series; cached=%d, start=%d, end=%d", len(tssCached), len(tssStart), len(tssEnd))
// getSumInstantValues aggregates tssCached, tssStart, tssEnd time series
// into a new time series with value = tssCached + tssStart - tssEnd
func getSumInstantValues(qt *querytracer.Tracer, tssCached, tssStart, tssEnd []*timeseries, timestamp int64) []*timeseries {
qt = qt.NewChild("calculate the sum for instant values across series; cached=%d, start=%d, end=%d, timestamp=%d", len(tssCached), len(tssStart), len(tssEnd), timestamp)
defer qt.Done()
assertInstantValues(tssCached)
@@ -1553,10 +1554,19 @@ func getSumInstantValues(qt *querytracer.Tracer, tssCached, tssStart, tssEnd []*
for _, ts := range m {
rvs = append(rvs, ts)
}
setInstantTimestamp(rvs, timestamp)
qt.Printf("resulting series=%d", len(rvs))
return rvs
}
func setInstantTimestamp(tss []*timeseries, timestamp int64) {
for _, ts := range tss {
ts.Timestamps[0] = timestamp
}
}
func assertInstantValues(tss []*timeseries) {
for _, ts := range tss {
if len(ts.Values) != 1 {
@@ -1684,9 +1694,6 @@ func evalRollupFuncNoCache(qt *querytracer.Tracer, ec *EvalConfig, funcName stri
} else {
minTimestamp -= ec.Step
}
if minTimestamp < 0 {
minTimestamp = 0
}
sq := storage.NewSearchQuery(minTimestamp, ec.End, tfss, ec.MaxSeries)
rss, err := netstorage.ProcessSearchQuery(qt, sq, ec.Deadline)
if err != nil {

View File

@@ -1,6 +1,7 @@
package promql
import (
"reflect"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
@@ -95,3 +96,77 @@ func TestQueryStats_addSeriesFetched(t *testing.T) {
t.Fatalf("expected to get 4; got %d instead", qs.SeriesFetched)
}
}
func TestGetSumInstantValues(t *testing.T) {
f := func(cached, start, end []*timeseries, timestamp int64, expectedResult []*timeseries) {
t.Helper()
result := getSumInstantValues(nil, cached, start, end, timestamp)
if !reflect.DeepEqual(result, expectedResult) {
t.Errorf("unexpected result; got\n%v\nwant\n%v", result, expectedResult)
}
}
ts := func(name string, timestamp int64, value float64) *timeseries {
return &timeseries{
MetricName: storage.MetricName{
MetricGroup: []byte(name),
},
Timestamps: []int64{timestamp},
Values: []float64{value},
}
}
// start - end + cached = 1
f(
nil,
[]*timeseries{ts("foo", 42, 1)},
nil,
100,
[]*timeseries{ts("foo", 100, 1)},
)
// start - end + cached = 0
f(
nil,
[]*timeseries{ts("foo", 100, 1)},
[]*timeseries{ts("foo", 10, 1)},
100,
[]*timeseries{ts("foo", 100, 0)},
)
// start - end + cached = 2
f(
[]*timeseries{ts("foo", 10, 1)},
[]*timeseries{ts("foo", 100, 1)},
nil,
100,
[]*timeseries{ts("foo", 100, 2)},
)
// start - end + cached = 1
f(
[]*timeseries{ts("foo", 50, 1)},
[]*timeseries{ts("foo", 100, 1)},
[]*timeseries{ts("foo", 10, 1)},
100,
[]*timeseries{ts("foo", 100, 1)},
)
// start - end + cached = 0
f(
[]*timeseries{ts("foo", 50, 1)},
nil,
[]*timeseries{ts("foo", 10, 1)},
100,
[]*timeseries{ts("foo", 100, 0)},
)
// start - end + cached = 1
f(
[]*timeseries{ts("foo", 50, 1)},
nil,
nil,
100,
[]*timeseries{ts("foo", 100, 1)},
)
}

View File

@@ -110,6 +110,7 @@ func maySortResults(e metricsql.Expr) bool {
case "sort", "sort_desc",
"sort_by_label", "sort_by_label_desc",
"sort_by_label_numeric", "sort_by_label_numeric_desc":
// Results already sorted
return false
}
case *metricsql.AggrFuncExpr:
@@ -117,6 +118,7 @@ func maySortResults(e metricsql.Expr) bool {
case "topk", "bottomk", "outliersk",
"topk_max", "topk_min", "topk_avg", "topk_median", "topk_last",
"bottomk_max", "bottomk_min", "bottomk_avg", "bottomk_median", "bottomk_last":
// Results already sorted
return false
}
case *metricsql.BinaryOpExpr:
@@ -131,6 +133,10 @@ func maySortResults(e metricsql.Expr) bool {
func timeseriesToResult(tss []*timeseries, maySort bool) ([]netstorage.Result, error) {
tss = removeEmptySeries(tss)
if maySort {
sortSeriesByMetricName(tss)
}
result := make([]netstorage.Result, len(tss))
m := make(map[string]struct{}, len(tss))
bb := bbPool.Get()
@@ -151,15 +157,15 @@ func timeseriesToResult(tss []*timeseries, maySort bool) ([]netstorage.Result, e
}
bbPool.Put(bb)
if maySort {
sort.Slice(result, func(i, j int) bool {
return metricNameLess(&result[i].MetricName, &result[j].MetricName)
})
}
return result, nil
}
func sortSeriesByMetricName(tss []*timeseries) {
sort.Slice(tss, func(i, j int) bool {
return metricNameLess(&tss[i].MetricName, &tss[j].MetricName)
})
}
func metricNameLess(a, b *storage.MetricName) bool {
if string(a.MetricGroup) != string(b.MetricGroup) {
return string(a.MetricGroup) < string(b.MetricGroup)

View File

@@ -3049,6 +3049,51 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r}
f(q, resultExpected)
})
t.Run(`series or series`, func(t *testing.T) {
t.Parallel()
q := `(
label_set(time(), "x", "foo"),
label_set(time()+1, "x", "bar"),
) or (
label_set(time()+2, "x", "foo"),
label_set(time()+3, "x", "baz"),
)`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1001, 1201, 1401, 1601, 1801, 2001},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{
{
Key: []byte("x"),
Value: []byte("bar"),
},
}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1000, 1200, 1400, 1600, 1800, 2000},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{
{
Key: []byte("x"),
Value: []byte("foo"),
},
}
r3 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{1003, 1203, 1403, 1603, 1803, 2003},
Timestamps: timestampsExpected,
}
r3.MetricName.Tags = []storage.Tag{
{
Key: []byte("x"),
Value: []byte("baz"),
},
}
resultExpected := []netstorage.Result{r1, r2, r3}
f(q, resultExpected)
})
t.Run(`scalar or scalar`, func(t *testing.T) {
t.Parallel()
q := `time() > 1400 or 123`
@@ -6545,7 +6590,7 @@ func TestExecSuccess(t *testing.T) {
})
t.Run(`bottomk(1)`, func(t *testing.T) {
t.Parallel()
q := `bottomk(1, label_set(10, "foo", "bar") or label_set(time()/150, "baz", "sss"))`
q := `bottomk(1, label_set(10, "foo", "bar") or label_set(time()/150, "baz", "sss") or label_set(time()<100, "a", "b"))`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{nan, nan, nan, 10, 10, 10},
@@ -8005,10 +8050,10 @@ func TestExecSuccess(t *testing.T) {
})
t.Run(`rollup_rate()`, func(t *testing.T) {
t.Parallel()
q := `rollup_rate((2000-time())[600s])`
q := `rollup_rate((2200-time())[600s])`
r1 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{5, 4, 3, 2, 1, 0},
Values: []float64{6, 5, 4, 3, 2, 1},
Timestamps: timestampsExpected,
}
r1.MetricName.Tags = []storage.Tag{{
@@ -8017,7 +8062,7 @@ func TestExecSuccess(t *testing.T) {
}}
r2 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{6, 5, 4, 3, 2, 1},
Values: []float64{7, 6, 5, 4, 3, 2},
Timestamps: timestampsExpected,
}
r2.MetricName.Tags = []storage.Tag{{
@@ -8026,7 +8071,7 @@ func TestExecSuccess(t *testing.T) {
}}
r3 := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{4, 3, 2, 1, 0, -1},
Values: []float64{5, 4, 3, 2, 1, 0},
Timestamps: timestampsExpected,
}
r3.MetricName.Tags = []storage.Tag{{
@@ -8038,10 +8083,10 @@ func TestExecSuccess(t *testing.T) {
})
t.Run(`rollup_rate(q, "max")`, func(t *testing.T) {
t.Parallel()
q := `rollup_rate((2000-time())[600s], "max")`
q := `rollup_rate((2200-time())[600s], "max")`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{6, 5, 4, 3, 2, 1},
Values: []float64{7, 6, 5, 4, 3, 2},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}
@@ -8049,10 +8094,10 @@ func TestExecSuccess(t *testing.T) {
})
t.Run(`rollup_rate(q, "avg")`, func(t *testing.T) {
t.Parallel()
q := `rollup_rate((2000-time())[600s], "avg")`
q := `rollup_rate((2200-time())[600s], "avg")`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{5, 4, 3, 2, 1, 0},
Values: []float64{6, 5, 4, 3, 2, 1},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}

View File

@@ -859,6 +859,11 @@ func removeCounterResets(values []float64) {
}
prevValue = v
values[i] = v + correction
// Check again, there could be precision error in float operations,
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5571
if i > 0 && values[i] < values[i-1] {
values[i] = values[i-1]
}
}
}
@@ -2182,6 +2187,8 @@ func rollupFirst(rfa *rollupFuncArg) float64 {
return values[0]
}
var rollupLast = rollupDefault
func rollupDefault(rfa *rollupFuncArg) float64 {
values := rfa.values
if len(values) == 0 {
@@ -2195,17 +2202,6 @@ func rollupDefault(rfa *rollupFuncArg) float64 {
return values[len(values)-1]
}
func rollupLast(rfa *rollupFuncArg) float64 {
values := rfa.values
if len(values) == 0 {
// Do not take into account rfa.prevValue, since it may lead
// to inconsistent results comparing to Prometheus on broken time series
// with irregular data points.
return nan
}
return values[len(values)-1]
}
func rollupDistinct(rfa *rollupFuncArg) float64 {
// There is no need in handling NaNs here, since they must be cleaned up
// before calling rollup funcs.

View File

@@ -43,6 +43,11 @@ func ResetRollupResultCacheIfNeeded(mrs []storage.MetricRow) {
rollupResultResetMetricRowSample.Store(&storage.MetricRow{})
go checkRollupResultCacheReset()
})
if atomic.LoadUint32(&needRollupResultCacheReset) != 0 {
// The cache has been already instructed to reset.
return
}
minTimestamp := int64(fasttime.UnixTimestamp()*1000) - cacheTimestampOffset.Milliseconds() + checkRollupResultCacheResetInterval.Milliseconds()
needCacheReset := false
for i := range mrs {

View File

@@ -125,7 +125,7 @@ func TestRemoveCounterResets(t *testing.T) {
// removeCounterResets doesn't expect negative values, so it doesn't work properly with them.
values = []float64{-100, -200, -300, -400}
removeCounterResets(values)
valuesExpected = []float64{-100, -300, -600, -1000}
valuesExpected = []float64{-100, -100, -100, -100}
timestampsExpected := []int64{0, 1, 2, 3}
testRowsEqual(t, values, timestampsExpected, valuesExpected, timestampsExpected)
@@ -136,6 +136,17 @@ func TestRemoveCounterResets(t *testing.T) {
valuesExpected = []float64{100, 100, 125, 125, 145, 195}
timestampsExpected = []int64{0, 1, 2, 3, 4, 5}
testRowsEqual(t, values, timestampsExpected, valuesExpected, timestampsExpected)
// verify results always increase monotonically with possible float operations precision error
values = []float64{34.094223, 2.7518, 2.140669, 0.044878, 1.887095, 2.546569, 2.490149, 0.045, 0.035684, 0.062454, 0.058296}
removeCounterResets(values)
var prev float64
for i, v := range values {
if v < prev {
t.Fatalf("error: unexpected value keep getting bigger %d; cur %v; pre %v\n", i, v, prev)
}
prev = v
}
}
func TestDeltaValues(t *testing.T) {

View File

@@ -7,6 +7,7 @@ import (
"net/http"
"os"
"path/filepath"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
)
@@ -14,6 +15,8 @@ import (
var (
vmuiCustomDashboardsPath = flag.String("vmui.customDashboardsPath", "", "Optional path to vmui dashboards. "+
"See https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmui/packages/vmui/public/dashboards")
vmuiDefaultTimezone = flag.String("vmui.defaultTimezone", "", "The default timezone to be used in vmui."+
"Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local")
)
// dashboardSettings represents dashboard settings file struct.
@@ -65,6 +68,16 @@ func handleVMUICustomDashboards(w http.ResponseWriter) error {
return nil
}
func handleVMUITimezone(w http.ResponseWriter) error {
tz, err := time.LoadLocation(*vmuiDefaultTimezone)
if err != nil {
return fmt.Errorf("cannot load timezone %q: %w", *vmuiDefaultTimezone, err)
}
response := fmt.Sprintf(`{"timezone": %q}`, tz)
writeSuccessResponse(w, []byte(response))
return nil
}
func writeSuccessResponse(w http.ResponseWriter, data []byte) {
w.WriteHeader(http.StatusOK)
w.Header().Set("Content-Type", "application/json")

View File

@@ -1,13 +1,13 @@
{
"files": {
"main.css": "./static/css/main.fb353c1e.css",
"main.js": "./static/js/main.5bcddddc.js",
"main.css": "./static/css/main.2f6793ba.css",
"main.js": "./static/js/main.bcd188b5.js",
"static/js/522.da77e7b3.chunk.js": "./static/js/522.da77e7b3.chunk.js",
"static/media/MetricsQL.md": "./static/media/MetricsQL.b64c4dbf91f4fa581621.md",
"index.html": "./index.html"
},
"entrypoints": [
"static/css/main.fb353c1e.css",
"static/js/main.5bcddddc.js"
"static/css/main.2f6793ba.css",
"static/js/main.bcd188b5.js"
]
}

View File

@@ -1 +1 @@
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=5"/><meta name="theme-color" content="#000000"/><meta name="description" content="UI for VictoriaMetrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><script src="./dashboards/index.js" type="module"></script><meta name="twitter:card" content="summary_large_image"><meta name="twitter:image" content="./preview.jpg"><meta name="twitter:title" content="UI for VictoriaMetrics"><meta name="twitter:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta name="twitter:site" content="@VictoriaMetrics"><meta property="og:title" content="Metric explorer for VictoriaMetrics"><meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta property="og:image" content="./preview.jpg"><meta property="og:type" content="website"><script defer="defer" src="./static/js/main.5bcddddc.js"></script><link href="./static/css/main.fb353c1e.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=5"/><meta name="theme-color" content="#000000"/><meta name="description" content="UI for VictoriaMetrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><script src="./dashboards/index.js" type="module"></script><meta name="twitter:card" content="summary_large_image"><meta name="twitter:image" content="./preview.jpg"><meta name="twitter:title" content="UI for VictoriaMetrics"><meta name="twitter:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta name="twitter:site" content="@VictoriaMetrics"><meta property="og:title" content="Metric explorer for VictoriaMetrics"><meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta property="og:image" content="./preview.jpg"><meta property="og:type" content="website"><script defer="defer" src="./static/js/main.bcd188b5.js"></script><link href="./static/css/main.2f6793ba.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -4,11 +4,14 @@ import (
"errors"
"flag"
"fmt"
"io"
"net/http"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
@@ -19,14 +22,14 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/querytracer"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/syncwg"
"github.com/VictoriaMetrics/metrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeutil"
)
var (
retentionPeriod = flagutil.NewDuration("retentionPeriod", "1", "Data with timestamps outside the retentionPeriod is automatically deleted. The minimum retentionPeriod is 24h or 1d. See also -retentionFilter")
snapshotAuthKey = flag.String("snapshotAuthKey", "", "authKey, which must be passed in query string to /snapshot* pages")
forceMergeAuthKey = flag.String("forceMergeAuthKey", "", "authKey, which must be passed in query string to /internal/force_merge pages")
forceFlushAuthKey = flag.String("forceFlushAuthKey", "", "authKey, which must be passed in query string to /internal/force_flush pages")
snapshotAuthKey = flagutil.NewPassword("snapshotAuthKey", "authKey, which must be passed in query string to /snapshot* pages")
forceMergeAuthKey = flagutil.NewPassword("forceMergeAuthKey", "authKey, which must be passed in query string to /internal/force_merge pages")
forceFlushAuthKey = flagutil.NewPassword("forceFlushAuthKey", "authKey, which must be passed in query string to /internal/force_flush pages")
snapshotsMaxAge = flagutil.NewDuration("snapshotsMaxAge", "0", "Automatically delete snapshots older than -snapshotsMaxAge if it is set to non-zero duration. Make sure that backup process has enough time to finish the backup before the corresponding snapshot is automatically deleted")
snapshotCreateTimeout = flag.Duration("snapshotCreateTimeout", 0, "The timeout for creating new snapshot. If set, make sure that timeout is lower than backup period")
@@ -35,13 +38,10 @@ var (
// DataPath is a path to storage data.
DataPath = flag.String("storageDataPath", "victoria-metrics-data", "Path to storage data")
finalMergeDelay = flag.Duration("finalMergeDelay", 0, "The delay before starting final merge for per-month partition after no new data is ingested into it. "+
"Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. "+
"Zero value disables final merge")
_ = flag.Int("bigMergeConcurrency", 0, "Deprecated: this flag does nothing. Please use -smallMergeConcurrency "+
"for controlling the concurrency of background merges. See https://docs.victoriametrics.com/#storage")
smallMergeConcurrency = flag.Int("smallMergeConcurrency", 0, "The maximum number of workers for background merges. See https://docs.victoriametrics.com/#storage . "+
"It isn't recommended tuning this flag in general case, since this may lead to uncontrolled increase in the number of parts and increased CPU usage during queries")
_ = flag.Duration("finalMergeDelay", 0, "Deprecated: this flag does nothing")
_ = flag.Int("bigMergeConcurrency", 0, "Deprecated: this flag does nothing")
_ = flag.Int("smallMergeConcurrency", 0, "Deprecated: this flag does nothing")
retentionTimezoneOffset = flag.Duration("retentionTimezoneOffset", 0, "The offset for performing indexdb rotation. "+
"If set to 0, then the indexdb rotation is performed at 4am UTC time per each -retentionPeriod. "+
"If set to 2h, then the indexdb rotation is performed at 4am EET time (the timezone with +2h offset)")
@@ -93,8 +93,6 @@ func Init(resetCacheIfNeeded func(mrs []storage.MetricRow)) {
resetResponseCacheIfNeeded = resetCacheIfNeeded
storage.SetLogNewSeries(*logNewSeries)
storage.SetFinalMergeDelay(*finalMergeDelay)
storage.SetMergeWorkersCount(*smallMergeConcurrency)
storage.SetRetentionTimezoneOffset(*retentionTimezoneOffset)
storage.SetFreeDiskSpaceLimit(minFreeDiskSpaceBytes.N)
storage.SetTSIDCacheSize(cacheSizeStorageTSID.IntN())
@@ -121,9 +119,17 @@ func Init(resetCacheIfNeeded func(mrs []storage.MetricRow)) {
sizeBytes := tm.SmallSizeBytes + tm.BigSizeBytes
logger.Infof("successfully opened storage %q in %.3f seconds; partsCount: %d; blocksCount: %d; rowsCount: %d; sizeBytes: %d",
*DataPath, time.Since(startTime).Seconds(), partsCount, blocksCount, rowsCount, sizeBytes)
registerStorageMetrics(Storage)
// register storage metrics
storageMetrics = metrics.NewSet()
storageMetrics.RegisterMetricsWriter(func(w io.Writer) {
writeStorageMetrics(w, strg)
})
metrics.RegisterSet(storageMetrics)
}
var storageMetrics *metrics.Set
// Storage is a storage.
//
// Every storage call must be wrapped into WG.Add(1) ... WG.Done()
@@ -232,6 +238,10 @@ func GetSeriesCount(deadline uint64) (uint64, error) {
// Stop stops the vmstorage
func Stop() {
// deregister storage metrics
metrics.UnregisterSet(storageMetrics)
storageMetrics = nil
logger.Infof("gracefully closing the storage at %s", *DataPath)
startTime := time.Now()
WG.WaitAndBlock()
@@ -246,7 +256,7 @@ func Stop() {
func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
path := r.URL.Path
if path == "/internal/force_merge" {
if !httpserver.CheckAuthFlag(w, r, *forceMergeAuthKey, "forceMergeAuthKey") {
if !httpserver.CheckAuthFlag(w, r, forceMergeAuthKey.Get(), "forceMergeAuthKey") {
return true
}
// Run force merge in background
@@ -264,7 +274,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
}
if path == "/internal/force_flush" {
if !httpserver.CheckAuthFlag(w, r, *forceFlushAuthKey, "forceFlushAuthKey") {
if !httpserver.CheckAuthFlag(w, r, forceFlushAuthKey.Get(), "forceFlushAuthKey") {
return true
}
logger.Infof("flushing storage to make pending data available for reading")
@@ -280,7 +290,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
if !strings.HasPrefix(path, "/snapshot") {
return false
}
if !httpserver.CheckAuthFlag(w, r, *snapshotAuthKey, "snapshotAuthKey") {
if !httpserver.CheckAuthFlag(w, r, snapshotAuthKey.Get(), "snapshotAuthKey") {
return true
}
path = path[len("/snapshot"):]
@@ -387,7 +397,8 @@ func initStaleSnapshotsRemover(strg *storage.Storage) {
staleSnapshotsRemoverWG.Add(1)
go func() {
defer staleSnapshotsRemoverWG.Done()
t := time.NewTicker(11 * time.Second)
d := timeutil.AddJitterToDuration(time.Second * 11)
t := time.NewTicker(d)
defer t.Stop()
for {
select {
@@ -429,497 +440,191 @@ var (
snapshotsDeleteAllErrorsTotal = metrics.NewCounter(`vm_http_request_errors_total{path="/snapshot/delete_all"}`)
)
func registerStorageMetrics(strg *storage.Storage) {
mCache := &storage.Metrics{}
var mCacheLock sync.Mutex
var lastUpdateTime time.Time
func writeStorageMetrics(w io.Writer, strg *storage.Storage) {
var m storage.Metrics
strg.UpdateMetrics(&m)
tm := &m.TableMetrics
idbm := &m.IndexDBMetrics
m := func() *storage.Metrics {
mCacheLock.Lock()
defer mCacheLock.Unlock()
if time.Since(lastUpdateTime) < time.Second {
return mCache
}
var mc storage.Metrics
strg.UpdateMetrics(&mc)
mCache = &mc
lastUpdateTime = time.Now()
return mCache
}
tm := func() *storage.TableMetrics {
sm := m()
return &sm.TableMetrics
}
idbm := func() *storage.IndexDBMetrics {
sm := m()
return &sm.IndexDBMetrics
metrics.WriteGaugeUint64(w, fmt.Sprintf(`vm_free_disk_space_bytes{path=%q}`, *DataPath), fs.MustGetFreeSpace(*DataPath))
metrics.WriteGaugeUint64(w, fmt.Sprintf(`vm_free_disk_space_limit_bytes{path=%q}`, *DataPath), uint64(minFreeDiskSpaceBytes.N))
isReadOnly := 0
if strg.IsReadOnly() {
isReadOnly = 1
}
metrics.WriteGaugeUint64(w, fmt.Sprintf(`vm_storage_is_read_only{path=%q}`, *DataPath), uint64(isReadOnly))
metrics.NewGauge(fmt.Sprintf(`vm_free_disk_space_bytes{path=%q}`, *DataPath), func() float64 {
return float64(fs.MustGetFreeSpace(*DataPath))
})
metrics.NewGauge(fmt.Sprintf(`vm_free_disk_space_limit_bytes{path=%q}`, *DataPath), func() float64 {
return float64(minFreeDiskSpaceBytes.N)
})
metrics.NewGauge(fmt.Sprintf(`vm_storage_is_read_only{path=%q}`, *DataPath), func() float64 {
if strg.IsReadOnly() {
return 1
}
return 0
})
metrics.WriteGaugeUint64(w, `vm_active_merges{type="storage/inmemory"}`, tm.ActiveInmemoryMerges)
metrics.WriteGaugeUint64(w, `vm_active_merges{type="storage/small"}`, tm.ActiveSmallMerges)
metrics.WriteGaugeUint64(w, `vm_active_merges{type="storage/big"}`, tm.ActiveBigMerges)
metrics.WriteGaugeUint64(w, `vm_active_merges{type="indexdb/inmemory"}`, idbm.ActiveInmemoryMerges)
metrics.WriteGaugeUint64(w, `vm_active_merges{type="indexdb/file"}`, idbm.ActiveFileMerges)
metrics.NewGauge(`vm_active_merges{type="storage/inmemory"}`, func() float64 {
return float64(tm().ActiveInmemoryMerges)
})
metrics.NewGauge(`vm_active_merges{type="storage/small"}`, func() float64 {
return float64(tm().ActiveSmallMerges)
})
metrics.NewGauge(`vm_active_merges{type="storage/big"}`, func() float64 {
return float64(tm().ActiveBigMerges)
})
metrics.NewGauge(`vm_active_merges{type="indexdb/inmemory"}`, func() float64 {
return float64(idbm().ActiveInmemoryMerges)
})
metrics.NewGauge(`vm_active_merges{type="indexdb/file"}`, func() float64 {
return float64(idbm().ActiveFileMerges)
})
metrics.WriteCounterUint64(w, `vm_merges_total{type="storage/inmemory"}`, tm.InmemoryMergesCount)
metrics.WriteCounterUint64(w, `vm_merges_total{type="storage/small"}`, tm.SmallMergesCount)
metrics.WriteCounterUint64(w, `vm_merges_total{type="storage/big"}`, tm.BigMergesCount)
metrics.WriteCounterUint64(w, `vm_merges_total{type="indexdb/inmemory"}`, idbm.InmemoryMergesCount)
metrics.WriteCounterUint64(w, `vm_merges_total{type="indexdb/file"}`, idbm.FileMergesCount)
metrics.NewGauge(`vm_merges_total{type="storage/inmemory"}`, func() float64 {
return float64(tm().InmemoryMergesCount)
})
metrics.NewGauge(`vm_merges_total{type="storage/small"}`, func() float64 {
return float64(tm().SmallMergesCount)
})
metrics.NewGauge(`vm_merges_total{type="storage/big"}`, func() float64 {
return float64(tm().BigMergesCount)
})
metrics.NewGauge(`vm_merges_total{type="indexdb/inmemory"}`, func() float64 {
return float64(idbm().InmemoryMergesCount)
})
metrics.NewGauge(`vm_merges_total{type="indexdb/file"}`, func() float64 {
return float64(idbm().FileMergesCount)
})
metrics.WriteCounterUint64(w, `vm_rows_merged_total{type="storage/inmemory"}`, tm.InmemoryRowsMerged)
metrics.WriteCounterUint64(w, `vm_rows_merged_total{type="storage/small"}`, tm.SmallRowsMerged)
metrics.WriteCounterUint64(w, `vm_rows_merged_total{type="storage/big"}`, tm.BigRowsMerged)
metrics.WriteCounterUint64(w, `vm_rows_merged_total{type="indexdb/inmemory"}`, idbm.InmemoryItemsMerged)
metrics.WriteCounterUint64(w, `vm_rows_merged_total{type="indexdb/file"}`, idbm.FileItemsMerged)
metrics.NewGauge(`vm_rows_merged_total{type="storage/inmemory"}`, func() float64 {
return float64(tm().InmemoryRowsMerged)
})
metrics.NewGauge(`vm_rows_merged_total{type="storage/small"}`, func() float64 {
return float64(tm().SmallRowsMerged)
})
metrics.NewGauge(`vm_rows_merged_total{type="storage/big"}`, func() float64 {
return float64(tm().BigRowsMerged)
})
metrics.NewGauge(`vm_rows_merged_total{type="indexdb/inmemory"}`, func() float64 {
return float64(idbm().InmemoryItemsMerged)
})
metrics.NewGauge(`vm_rows_merged_total{type="indexdb/file"}`, func() float64 {
return float64(idbm().FileItemsMerged)
})
metrics.WriteCounterUint64(w, `vm_rows_deleted_total{type="storage/inmemory"}`, tm.InmemoryRowsDeleted)
metrics.WriteCounterUint64(w, `vm_rows_deleted_total{type="storage/small"}`, tm.SmallRowsDeleted)
metrics.WriteCounterUint64(w, `vm_rows_deleted_total{type="storage/big"}`, tm.BigRowsDeleted)
metrics.NewGauge(`vm_rows_deleted_total{type="storage/inmemory"}`, func() float64 {
return float64(tm().InmemoryRowsDeleted)
})
metrics.NewGauge(`vm_rows_deleted_total{type="storage/small"}`, func() float64 {
return float64(tm().SmallRowsDeleted)
})
metrics.NewGauge(`vm_rows_deleted_total{type="storage/big"}`, func() float64 {
return float64(tm().BigRowsDeleted)
})
metrics.WriteGaugeUint64(w, `vm_part_references{type="storage/inmemory"}`, tm.InmemoryPartsRefCount)
metrics.WriteGaugeUint64(w, `vm_part_references{type="storage/small"}`, tm.SmallPartsRefCount)
metrics.WriteGaugeUint64(w, `vm_part_references{type="storage/big"}`, tm.BigPartsRefCount)
metrics.WriteGaugeUint64(w, `vm_partition_references{type="storage"}`, tm.PartitionsRefCount)
metrics.WriteGaugeUint64(w, `vm_object_references{type="indexdb"}`, idbm.IndexDBRefCount)
metrics.WriteGaugeUint64(w, `vm_part_references{type="indexdb"}`, idbm.PartsRefCount)
metrics.NewGauge(`vm_part_references{type="storage/inmemory"}`, func() float64 {
return float64(tm().InmemoryPartsRefCount)
})
metrics.NewGauge(`vm_part_references{type="storage/small"}`, func() float64 {
return float64(tm().SmallPartsRefCount)
})
metrics.NewGauge(`vm_part_references{type="storage/big"}`, func() float64 {
return float64(tm().BigPartsRefCount)
})
metrics.NewGauge(`vm_partition_references{type="storage"}`, func() float64 {
return float64(tm().PartitionsRefCount)
})
metrics.NewGauge(`vm_object_references{type="indexdb"}`, func() float64 {
return float64(idbm().IndexDBRefCount)
})
metrics.NewGauge(`vm_part_references{type="indexdb"}`, func() float64 {
return float64(idbm().PartsRefCount)
})
metrics.WriteCounterUint64(w, `vm_missing_tsids_for_metric_id_total`, idbm.MissingTSIDsForMetricID)
metrics.WriteCounterUint64(w, `vm_index_blocks_with_metric_ids_processed_total`, idbm.IndexBlocksWithMetricIDsProcessed)
metrics.WriteCounterUint64(w, `vm_index_blocks_with_metric_ids_incorrect_order_total`, idbm.IndexBlocksWithMetricIDsIncorrectOrder)
metrics.WriteGaugeUint64(w, `vm_composite_index_min_timestamp`, idbm.MinTimestampForCompositeIndex/1e3)
metrics.WriteCounterUint64(w, `vm_composite_filter_success_conversions_total`, idbm.CompositeFilterSuccessConversions)
metrics.WriteCounterUint64(w, `vm_composite_filter_missing_conversions_total`, idbm.CompositeFilterMissingConversions)
metrics.NewGauge(`vm_missing_tsids_for_metric_id_total`, func() float64 {
return float64(idbm().MissingTSIDsForMetricID)
})
metrics.NewGauge(`vm_index_blocks_with_metric_ids_processed_total`, func() float64 {
return float64(idbm().IndexBlocksWithMetricIDsProcessed)
})
metrics.NewGauge(`vm_index_blocks_with_metric_ids_incorrect_order_total`, func() float64 {
return float64(idbm().IndexBlocksWithMetricIDsIncorrectOrder)
})
metrics.NewGauge(`vm_composite_index_min_timestamp`, func() float64 {
return float64(idbm().MinTimestampForCompositeIndex) / 1e3
})
metrics.NewGauge(`vm_composite_filter_success_conversions_total`, func() float64 {
return float64(idbm().CompositeFilterSuccessConversions)
})
metrics.NewGauge(`vm_composite_filter_missing_conversions_total`, func() float64 {
return float64(idbm().CompositeFilterMissingConversions)
})
// vm_assisted_merges_total name is used for backwards compatibility.
metrics.WriteCounterUint64(w, `vm_assisted_merges_total{type="indexdb/inmemory"}`, idbm.InmemoryPartsLimitReachedCount)
metrics.NewGauge(`vm_assisted_merges_total{type="storage/inmemory"}`, func() float64 {
return float64(tm().InmemoryAssistedMerges)
})
metrics.NewGauge(`vm_assisted_merges_total{type="storage/small"}`, func() float64 {
return float64(tm().SmallAssistedMerges)
})
metrics.WriteCounterUint64(w, `vm_indexdb_items_added_total`, idbm.ItemsAdded)
metrics.WriteCounterUint64(w, `vm_indexdb_items_added_size_bytes_total`, idbm.ItemsAddedSizeBytes)
metrics.NewGauge(`vm_assisted_merges_total{type="indexdb/inmemory"}`, func() float64 {
return float64(idbm().InmemoryAssistedMerges)
})
metrics.NewGauge(`vm_assisted_merges_total{type="indexdb/file"}`, func() float64 {
return float64(idbm().FileAssistedMerges)
})
metrics.WriteGaugeUint64(w, `vm_pending_rows{type="storage"}`, tm.PendingRows)
metrics.WriteGaugeUint64(w, `vm_pending_rows{type="indexdb"}`, idbm.PendingItems)
metrics.NewGauge(`vm_indexdb_items_added_total`, func() float64 {
return float64(idbm().ItemsAdded)
})
metrics.NewGauge(`vm_indexdb_items_added_size_bytes_total`, func() float64 {
return float64(idbm().ItemsAddedSizeBytes)
})
metrics.WriteGaugeUint64(w, `vm_parts{type="storage/inmemory"}`, tm.InmemoryPartsCount)
metrics.WriteGaugeUint64(w, `vm_parts{type="storage/small"}`, tm.SmallPartsCount)
metrics.WriteGaugeUint64(w, `vm_parts{type="storage/big"}`, tm.BigPartsCount)
metrics.WriteGaugeUint64(w, `vm_parts{type="indexdb/inmemory"}`, idbm.InmemoryPartsCount)
metrics.WriteGaugeUint64(w, `vm_parts{type="indexdb/file"}`, idbm.FilePartsCount)
metrics.NewGauge(`vm_pending_rows{type="storage"}`, func() float64 {
return float64(tm().PendingRows)
})
metrics.NewGauge(`vm_pending_rows{type="indexdb"}`, func() float64 {
return float64(idbm().PendingItems)
})
metrics.WriteGaugeUint64(w, `vm_blocks{type="storage/inmemory"}`, tm.InmemoryBlocksCount)
metrics.WriteGaugeUint64(w, `vm_blocks{type="storage/small"}`, tm.SmallBlocksCount)
metrics.WriteGaugeUint64(w, `vm_blocks{type="storage/big"}`, tm.BigBlocksCount)
metrics.WriteGaugeUint64(w, `vm_blocks{type="indexdb/inmemory"}`, idbm.InmemoryBlocksCount)
metrics.WriteGaugeUint64(w, `vm_blocks{type="indexdb/file"}`, idbm.FileBlocksCount)
metrics.NewGauge(`vm_parts{type="storage/inmemory"}`, func() float64 {
return float64(tm().InmemoryPartsCount)
})
metrics.NewGauge(`vm_parts{type="storage/small"}`, func() float64 {
return float64(tm().SmallPartsCount)
})
metrics.NewGauge(`vm_parts{type="storage/big"}`, func() float64 {
return float64(tm().BigPartsCount)
})
metrics.NewGauge(`vm_parts{type="indexdb/inmemory"}`, func() float64 {
return float64(idbm().InmemoryPartsCount)
})
metrics.NewGauge(`vm_parts{type="indexdb/file"}`, func() float64 {
return float64(idbm().FilePartsCount)
})
metrics.WriteGaugeUint64(w, `vm_data_size_bytes{type="storage/inmemory"}`, tm.InmemorySizeBytes)
metrics.WriteGaugeUint64(w, `vm_data_size_bytes{type="storage/small"}`, tm.SmallSizeBytes)
metrics.WriteGaugeUint64(w, `vm_data_size_bytes{type="storage/big"}`, tm.BigSizeBytes)
metrics.WriteGaugeUint64(w, `vm_data_size_bytes{type="indexdb/inmemory"}`, idbm.InmemorySizeBytes)
metrics.WriteGaugeUint64(w, `vm_data_size_bytes{type="indexdb/file"}`, idbm.FileSizeBytes)
metrics.NewGauge(`vm_blocks{type="storage/inmemory"}`, func() float64 {
return float64(tm().InmemoryBlocksCount)
})
metrics.NewGauge(`vm_blocks{type="storage/small"}`, func() float64 {
return float64(tm().SmallBlocksCount)
})
metrics.NewGauge(`vm_blocks{type="storage/big"}`, func() float64 {
return float64(tm().BigBlocksCount)
})
metrics.NewGauge(`vm_blocks{type="indexdb/inmemory"}`, func() float64 {
return float64(idbm().InmemoryBlocksCount)
})
metrics.NewGauge(`vm_blocks{type="indexdb/file"}`, func() float64 {
return float64(idbm().FileBlocksCount)
})
metrics.WriteCounterUint64(w, `vm_rows_added_to_storage_total`, m.RowsAddedTotal)
metrics.WriteCounterUint64(w, `vm_deduplicated_samples_total{type="merge"}`, m.DedupsDuringMerge)
metrics.NewGauge(`vm_data_size_bytes{type="storage/inmemory"}`, func() float64 {
return float64(tm().InmemorySizeBytes)
})
metrics.NewGauge(`vm_data_size_bytes{type="storage/small"}`, func() float64 {
return float64(tm().SmallSizeBytes)
})
metrics.NewGauge(`vm_data_size_bytes{type="storage/big"}`, func() float64 {
return float64(tm().BigSizeBytes)
})
metrics.NewGauge(`vm_data_size_bytes{type="indexdb/inmemory"}`, func() float64 {
return float64(idbm().InmemorySizeBytes)
})
metrics.NewGauge(`vm_data_size_bytes{type="indexdb/file"}`, func() float64 {
return float64(idbm().FileSizeBytes)
})
metrics.WriteCounterUint64(w, `vm_rows_ignored_total{reason="big_timestamp"}`, m.TooBigTimestampRows)
metrics.WriteCounterUint64(w, `vm_rows_ignored_total{reason="small_timestamp"}`, m.TooSmallTimestampRows)
metrics.NewGauge(`vm_rows_added_to_storage_total`, func() float64 {
return float64(m().RowsAddedTotal)
})
metrics.NewGauge(`vm_deduplicated_samples_total{type="merge"}`, func() float64 {
return float64(m().DedupsDuringMerge)
})
metrics.NewGauge(`vm_rows_ignored_total{reason="big_timestamp"}`, func() float64 {
return float64(m().TooBigTimestampRows)
})
metrics.NewGauge(`vm_rows_ignored_total{reason="small_timestamp"}`, func() float64 {
return float64(m().TooSmallTimestampRows)
})
metrics.NewGauge(`vm_timeseries_repopulated_total`, func() float64 {
return float64(m().TimeseriesRepopulated)
})
metrics.NewGauge(`vm_timeseries_precreated_total`, func() float64 {
return float64(m().TimeseriesPreCreated)
})
metrics.NewGauge(`vm_new_timeseries_created_total`, func() float64 {
return float64(m().NewTimeseriesCreated)
})
metrics.NewGauge(`vm_slow_row_inserts_total`, func() float64 {
return float64(m().SlowRowInserts)
})
metrics.NewGauge(`vm_slow_per_day_index_inserts_total`, func() float64 {
return float64(m().SlowPerDayIndexInserts)
})
metrics.NewGauge(`vm_slow_metric_name_loads_total`, func() float64 {
return float64(m().SlowMetricNameLoads)
})
metrics.WriteCounterUint64(w, `vm_timeseries_repopulated_total`, m.TimeseriesRepopulated)
metrics.WriteCounterUint64(w, `vm_timeseries_precreated_total`, m.TimeseriesPreCreated)
metrics.WriteCounterUint64(w, `vm_new_timeseries_created_total`, m.NewTimeseriesCreated)
metrics.WriteCounterUint64(w, `vm_slow_row_inserts_total`, m.SlowRowInserts)
metrics.WriteCounterUint64(w, `vm_slow_per_day_index_inserts_total`, m.SlowPerDayIndexInserts)
metrics.WriteCounterUint64(w, `vm_slow_metric_name_loads_total`, m.SlowMetricNameLoads)
if *maxHourlySeries > 0 {
metrics.NewGauge(`vm_hourly_series_limit_current_series`, func() float64 {
return float64(m().HourlySeriesLimitCurrentSeries)
})
metrics.NewGauge(`vm_hourly_series_limit_max_series`, func() float64 {
return float64(m().HourlySeriesLimitMaxSeries)
})
metrics.NewGauge(`vm_hourly_series_limit_rows_dropped_total`, func() float64 {
return float64(m().HourlySeriesLimitRowsDropped)
})
metrics.WriteGaugeUint64(w, `vm_hourly_series_limit_current_series`, m.HourlySeriesLimitCurrentSeries)
metrics.WriteGaugeUint64(w, `vm_hourly_series_limit_max_series`, m.HourlySeriesLimitMaxSeries)
metrics.WriteCounterUint64(w, `vm_hourly_series_limit_rows_dropped_total`, m.HourlySeriesLimitRowsDropped)
}
if *maxDailySeries > 0 {
metrics.NewGauge(`vm_daily_series_limit_current_series`, func() float64 {
return float64(m().DailySeriesLimitCurrentSeries)
})
metrics.NewGauge(`vm_daily_series_limit_max_series`, func() float64 {
return float64(m().DailySeriesLimitMaxSeries)
})
metrics.NewGauge(`vm_daily_series_limit_rows_dropped_total`, func() float64 {
return float64(m().DailySeriesLimitRowsDropped)
})
metrics.WriteGaugeUint64(w, `vm_daily_series_limit_current_series`, m.DailySeriesLimitCurrentSeries)
metrics.WriteGaugeUint64(w, `vm_daily_series_limit_max_series`, m.DailySeriesLimitMaxSeries)
metrics.WriteCounterUint64(w, `vm_daily_series_limit_rows_dropped_total`, m.DailySeriesLimitRowsDropped)
}
metrics.NewGauge(`vm_timestamps_blocks_merged_total`, func() float64 {
return float64(m().TimestampsBlocksMerged)
})
metrics.NewGauge(`vm_timestamps_bytes_saved_total`, func() float64 {
return float64(m().TimestampsBytesSaved)
})
metrics.WriteCounterUint64(w, `vm_timestamps_blocks_merged_total`, m.TimestampsBlocksMerged)
metrics.WriteCounterUint64(w, `vm_timestamps_bytes_saved_total`, m.TimestampsBytesSaved)
metrics.NewGauge(`vm_rows{type="storage/inmemory"}`, func() float64 {
return float64(tm().InmemoryRowsCount)
})
metrics.NewGauge(`vm_rows{type="storage/small"}`, func() float64 {
return float64(tm().SmallRowsCount)
})
metrics.NewGauge(`vm_rows{type="storage/big"}`, func() float64 {
return float64(tm().BigRowsCount)
})
metrics.NewGauge(`vm_rows{type="indexdb/inmemory"}`, func() float64 {
return float64(idbm().InmemoryItemsCount)
})
metrics.NewGauge(`vm_rows{type="indexdb/file"}`, func() float64 {
return float64(idbm().FileItemsCount)
})
metrics.WriteGaugeUint64(w, `vm_rows{type="storage/inmemory"}`, tm.InmemoryRowsCount)
metrics.WriteGaugeUint64(w, `vm_rows{type="storage/small"}`, tm.SmallRowsCount)
metrics.WriteGaugeUint64(w, `vm_rows{type="storage/big"}`, tm.BigRowsCount)
metrics.WriteGaugeUint64(w, `vm_rows{type="indexdb/inmemory"}`, idbm.InmemoryItemsCount)
metrics.WriteGaugeUint64(w, `vm_rows{type="indexdb/file"}`, idbm.FileItemsCount)
metrics.NewGauge(`vm_date_range_search_calls_total`, func() float64 {
return float64(idbm().DateRangeSearchCalls)
})
metrics.NewGauge(`vm_date_range_hits_total`, func() float64 {
return float64(idbm().DateRangeSearchHits)
})
metrics.NewGauge(`vm_global_search_calls_total`, func() float64 {
return float64(idbm().GlobalSearchCalls)
})
metrics.WriteCounterUint64(w, `vm_date_range_search_calls_total`, idbm.DateRangeSearchCalls)
metrics.WriteCounterUint64(w, `vm_date_range_hits_total`, idbm.DateRangeSearchHits)
metrics.WriteCounterUint64(w, `vm_global_search_calls_total`, idbm.GlobalSearchCalls)
metrics.NewGauge(`vm_missing_metric_names_for_metric_id_total`, func() float64 {
return float64(idbm().MissingMetricNamesForMetricID)
})
metrics.WriteCounterUint64(w, `vm_missing_metric_names_for_metric_id_total`, idbm.MissingMetricNamesForMetricID)
metrics.NewGauge(`vm_date_metric_id_cache_syncs_total`, func() float64 {
return float64(m().DateMetricIDCacheSyncsCount)
})
metrics.NewGauge(`vm_date_metric_id_cache_resets_total`, func() float64 {
return float64(m().DateMetricIDCacheResetsCount)
})
metrics.WriteCounterUint64(w, `vm_date_metric_id_cache_syncs_total`, m.DateMetricIDCacheSyncsCount)
metrics.WriteCounterUint64(w, `vm_date_metric_id_cache_resets_total`, m.DateMetricIDCacheResetsCount)
metrics.NewGauge(`vm_cache_entries{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheSize)
})
metrics.NewGauge(`vm_cache_entries{type="storage/metricIDs"}`, func() float64 {
return float64(m().MetricIDCacheSize)
})
metrics.NewGauge(`vm_cache_entries{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheSize)
})
metrics.NewGauge(`vm_cache_entries{type="storage/date_metricID"}`, func() float64 {
return float64(m().DateMetricIDCacheSize)
})
metrics.NewGauge(`vm_cache_entries{type="storage/hour_metric_ids"}`, func() float64 {
return float64(m().HourMetricIDCacheSize)
})
metrics.NewGauge(`vm_cache_entries{type="storage/next_day_metric_ids"}`, func() float64 {
return float64(m().NextDayMetricIDCacheSize)
})
metrics.NewGauge(`vm_cache_entries{type="storage/indexBlocks"}`, func() float64 {
return float64(tm().IndexBlocksCacheSize)
})
metrics.NewGauge(`vm_cache_entries{type="indexdb/dataBlocks"}`, func() float64 {
return float64(idbm().DataBlocksCacheSize)
})
metrics.NewGauge(`vm_cache_entries{type="indexdb/indexBlocks"}`, func() float64 {
return float64(idbm().IndexBlocksCacheSize)
})
metrics.NewGauge(`vm_cache_entries{type="indexdb/tagFiltersToMetricIDs"}`, func() float64 {
return float64(idbm().TagFiltersToMetricIDsCacheSize)
})
metrics.NewGauge(`vm_cache_entries{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheSize())
})
metrics.NewGauge(`vm_cache_entries{type="storage/regexpPrefixes"}`, func() float64 {
return float64(storage.RegexpPrefixesCacheSize())
})
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/tsid"}`, m.TSIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/metricIDs"}`, m.MetricIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/metricName"}`, m.MetricNameCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/date_metricID"}`, m.DateMetricIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/hour_metric_ids"}`, m.HourMetricIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/next_day_metric_ids"}`, m.NextDayMetricIDCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/indexBlocks"}`, tm.IndexBlocksCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="indexdb/dataBlocks"}`, idbm.DataBlocksCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="indexdb/indexBlocks"}`, idbm.IndexBlocksCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheSize)
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/regexps"}`, uint64(storage.RegexpCacheSize()))
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/regexpPrefixes"}`, uint64(storage.RegexpPrefixesCacheSize()))
metrics.WriteGaugeUint64(w, `vm_cache_entries{type="storage/prefetchedMetricIDs"}`, m.PrefetchedMetricIDsSize)
metrics.NewGauge(`vm_cache_entries{type="storage/prefetchedMetricIDs"}`, func() float64 {
return float64(m().PrefetchedMetricIDsSize)
})
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/tsid"}`, m.TSIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/metricIDs"}`, m.MetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/metricName"}`, m.MetricNameCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/indexBlocks"}`, tm.IndexBlocksCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="indexdb/dataBlocks"}`, idbm.DataBlocksCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="indexdb/indexBlocks"}`, idbm.IndexBlocksCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/date_metricID"}`, m.DateMetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/hour_metric_ids"}`, m.HourMetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/next_day_metric_ids"}`, m.NextDayMetricIDCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheSizeBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/regexps"}`, uint64(storage.RegexpCacheSizeBytes()))
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/regexpPrefixes"}`, uint64(storage.RegexpPrefixesCacheSizeBytes()))
metrics.WriteGaugeUint64(w, `vm_cache_size_bytes{type="storage/prefetchedMetricIDs"}`, m.PrefetchedMetricIDsSizeBytes)
metrics.NewGauge(`vm_cache_size_bytes{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/metricIDs"}`, func() float64 {
return float64(m().MetricIDCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/indexBlocks"}`, func() float64 {
return float64(tm().IndexBlocksCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="indexdb/dataBlocks"}`, func() float64 {
return float64(idbm().DataBlocksCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="indexdb/indexBlocks"}`, func() float64 {
return float64(idbm().IndexBlocksCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/date_metricID"}`, func() float64 {
return float64(m().DateMetricIDCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/hour_metric_ids"}`, func() float64 {
return float64(m().HourMetricIDCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/next_day_metric_ids"}`, func() float64 {
return float64(m().NextDayMetricIDCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="indexdb/tagFiltersToMetricIDs"}`, func() float64 {
return float64(idbm().TagFiltersToMetricIDsCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheSizeBytes())
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/regexpPrefixes"}`, func() float64 {
return float64(storage.RegexpPrefixesCacheSizeBytes())
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/prefetchedMetricIDs"}`, func() float64 {
return float64(m().PrefetchedMetricIDsSizeBytes)
})
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/tsid"}`, m.TSIDCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/metricIDs"}`, m.MetricIDCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/metricName"}`, m.MetricNameCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/indexBlocks"}`, tm.IndexBlocksCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="indexdb/dataBlocks"}`, idbm.DataBlocksCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="indexdb/indexBlocks"}`, idbm.IndexBlocksCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheSizeMaxBytes)
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/regexps"}`, uint64(storage.RegexpCacheMaxSizeBytes()))
metrics.WriteGaugeUint64(w, `vm_cache_size_max_bytes{type="storage/regexpPrefixes"}`, uint64(storage.RegexpPrefixesCacheMaxSizeBytes()))
metrics.NewGauge(`vm_cache_size_max_bytes{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheSizeMaxBytes)
})
metrics.NewGauge(`vm_cache_size_max_bytes{type="storage/metricIDs"}`, func() float64 {
return float64(m().MetricIDCacheSizeMaxBytes)
})
metrics.NewGauge(`vm_cache_size_max_bytes{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheSizeMaxBytes)
})
metrics.NewGauge(`vm_cache_size_max_bytes{type="storage/indexBlocks"}`, func() float64 {
return float64(tm().IndexBlocksCacheSizeMaxBytes)
})
metrics.NewGauge(`vm_cache_size_max_bytes{type="indexdb/dataBlocks"}`, func() float64 {
return float64(idbm().DataBlocksCacheSizeMaxBytes)
})
metrics.NewGauge(`vm_cache_size_max_bytes{type="indexdb/indexBlocks"}`, func() float64 {
return float64(idbm().IndexBlocksCacheSizeMaxBytes)
})
metrics.NewGauge(`vm_cache_size_max_bytes{type="indexdb/tagFiltersToMetricIDs"}`, func() float64 {
return float64(idbm().TagFiltersToMetricIDsCacheSizeMaxBytes)
})
metrics.NewGauge(`vm_cache_size_max_bytes{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheMaxSizeBytes())
})
metrics.NewGauge(`vm_cache_size_max_bytes{type="storage/regexpPrefixes"}`, func() float64 {
return float64(storage.RegexpPrefixesCacheMaxSizeBytes())
})
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/tsid"}`, m.TSIDCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/metricIDs"}`, m.MetricIDCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/metricName"}`, m.MetricNameCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/indexBlocks"}`, tm.IndexBlocksCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="indexdb/dataBlocks"}`, idbm.DataBlocksCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="indexdb/indexBlocks"}`, idbm.IndexBlocksCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheRequests)
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/regexps"}`, storage.RegexpCacheRequests())
metrics.WriteCounterUint64(w, `vm_cache_requests_total{type="storage/regexpPrefixes"}`, storage.RegexpPrefixesCacheRequests())
metrics.NewGauge(`vm_cache_requests_total{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/metricIDs"}`, func() float64 {
return float64(m().MetricIDCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/indexBlocks"}`, func() float64 {
return float64(tm().IndexBlocksCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="indexdb/dataBlocks"}`, func() float64 {
return float64(idbm().DataBlocksCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="indexdb/indexBlocks"}`, func() float64 {
return float64(idbm().IndexBlocksCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="indexdb/tagFiltersToMetricIDs"}`, func() float64 {
return float64(idbm().TagFiltersToMetricIDsCacheRequests)
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheRequests())
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/regexpPrefixes"}`, func() float64 {
return float64(storage.RegexpPrefixesCacheRequests())
})
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/tsid"}`, m.TSIDCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/metricIDs"}`, m.MetricIDCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/metricName"}`, m.MetricNameCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/indexBlocks"}`, tm.IndexBlocksCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="indexdb/dataBlocks"}`, idbm.DataBlocksCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="indexdb/indexBlocks"}`, idbm.IndexBlocksCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="indexdb/tagFiltersToMetricIDs"}`, idbm.TagFiltersToMetricIDsCacheMisses)
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/regexps"}`, storage.RegexpCacheMisses())
metrics.WriteCounterUint64(w, `vm_cache_misses_total{type="storage/regexpPrefixes"}`, storage.RegexpPrefixesCacheMisses())
metrics.NewGauge(`vm_cache_misses_total{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/metricIDs"}`, func() float64 {
return float64(m().MetricIDCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/indexBlocks"}`, func() float64 {
return float64(tm().IndexBlocksCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="indexdb/dataBlocks"}`, func() float64 {
return float64(idbm().DataBlocksCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="indexdb/indexBlocks"}`, func() float64 {
return float64(idbm().IndexBlocksCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="indexdb/tagFiltersToMetricIDs"}`, func() float64 {
return float64(idbm().TagFiltersToMetricIDsCacheMisses)
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheMisses())
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/regexpPrefixes"}`, func() float64 {
return float64(storage.RegexpPrefixesCacheMisses())
})
metrics.WriteCounterUint64(w, `vm_deleted_metrics_total{type="indexdb"}`, idbm.DeletedMetricsCount)
metrics.NewGauge(`vm_deleted_metrics_total{type="indexdb"}`, func() float64 {
return float64(idbm().DeletedMetricsCount)
})
metrics.WriteCounterUint64(w, `vm_cache_collisions_total{type="storage/tsid"}`, m.TSIDCacheCollisions)
metrics.WriteCounterUint64(w, `vm_cache_collisions_total{type="storage/metricName"}`, m.MetricNameCacheCollisions)
metrics.NewGauge(`vm_cache_collisions_total{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheCollisions)
})
metrics.NewGauge(`vm_cache_collisions_total{type="storage/metricName"}`, func() float64 {
return float64(m().MetricNameCacheCollisions)
})
metrics.NewGauge(`vm_next_retention_seconds`, func() float64 {
return float64(m().NextRetentionSeconds)
})
metrics.WriteGaugeUint64(w, `vm_next_retention_seconds`, m.NextRetentionSeconds)
}
func jsonResponseError(w http.ResponseWriter, err error) {

View File

@@ -1,4 +1,4 @@
FROM golang:1.21.5 as build-web-stage
FROM golang:1.21.6 as build-web-stage
COPY build /build
WORKDIR /build

View File

@@ -22,6 +22,14 @@ vmui-logs-build: vmui-package-base-image
--entrypoint=/bin/bash \
vmui-builder-image -c "npm install && npm run build:logs"
vmui-anomaly-build: vmui-package-base-image
docker run --rm \
--user $(shell id -u):$(shell id -g) \
--mount type=bind,src="$(shell pwd)/app/vmui",dst=/build \
-w /build/packages/vmui \
--entrypoint=/bin/bash \
vmui-builder-image -c "npm install && npm run build:anomaly"
vmui-release: vmui-build
docker build -t ${DOCKER_NAMESPACE}/vmui:latest -f app/vmui/Dockerfile-web ./app/vmui/packages/vmui
docker tag ${DOCKER_NAMESPACE}/vmui:latest ${DOCKER_NAMESPACE}/vmui:${PKG_TAG}

View File

@@ -7,6 +7,7 @@ Web UI for VictoriaMetrics
* [Updating vmui embedded into VictoriaMetrics](#updating-vmui-embedded-into-victoriametrics)
* [Predefined dashboards](#predefined-dashboards)
* [App mode config options](#app-mode-config-options)
* [Timezone configuration](#timezone-configuration)
----
@@ -246,3 +247,39 @@ vmui can be used to paste into other applications
```html
<div id="root" data-params='{"serverURL":"http://localhost:8428","useTenantID":true,"headerStyles":{"background":"#FFFFFF","color":"#538DE8"},"palette":{"primary":"#538DE8","secondary":"#F76F8E","error":"#FD151B","warning":"#FFB30F","success":"#7BE622","info":"#0F5BFF"}}'></div>
```
----
## Timezone configuration
vmui's timezone setting offers flexibility in displaying time data. It can be set through a configuration flag and is adjustable within the vmui interface. This feature caters to various user preferences and time zones.
### Default Timezone Setting
#### Via Configuration Flag
- Set the default timezone using the `--vmui.defaultTimezone` flag.
- Accepts a valid IANA Time Zone string (e.g., `America/New_York`, `Europe/Berlin`, `Etc/GMT+3`).
- If the flag is unset or invalid, vmui defaults to the browser's local timezone.
#### User Interface Adjustments
- Users can change the timezone in the vmui interface.
- Any changed setting in the interface overrides the flag's default, persisting for the user.
- The timezone specified in the `--vmui.defaultTimezone` flag is included in the vmui's timezone selection dropdown, aiding user choice.
### Key Points
- **Fallback to Browser's Local Timezone**: If the flag is not set or an invalid timezone is specified, vmui uses the local timezone of the user's browser.
- **User Preference Priority**: User-selected timezones in vmui take precedence over the default set by the flag.
- **Cluster Consistency**: Ensure uniform timezone settings across cluster nodes, but individual user interface selections will always override these defaults.
### Examples
Setting a default timezone, with user options to change:
```
./victoria-metrics --vmui.defaultTimezone="America/New_York"
```
In this scenario, if a user in Berlin accesses vmui without changing settings, it will default to their browser's local timezone (CET). If they select a different timezone in vmui, this choice will override the `"America/New_York"` setting for that user.

Some files were not shown because too many files have changed in this diff Show More