115 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
f8de318bfc docs/CHANGELOG.md: cut v1.76.1 2022-04-12 16:20:55 +03:00
Aliaksandr Valialkin
ef66b048c9 app/vmui: further improvements for number display on graphs
This is a follow-up for c4d2cd8336

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2409
2022-04-12 16:01:27 +03:00
Aliaksandr Valialkin
52cb80ed4f docs/CHANGELOG.md: link to the bug related to improper handling of maxSeries limit passed from vmselect to vmstorage 2022-04-12 16:00:25 +03:00
Yury Molodov
49eaa29b91 fix: change display labels yaxis (#2452) 2022-04-12 15:30:59 +03:00
Dmytro Kozlov
64179b7cc5 vmui: changed function (#2451)
* vmui: fixed yaxis labels

* vmui: changed function
2022-04-12 15:17:13 +03:00
Dmytro Kozlov
c4d2cd8336 vmui: fixed yaxis labels (#2448) 2022-04-12 15:12:06 +03:00
Aliaksandr Valialkin
7f83dc06c4 app/vmselect: make vmui-update 2022-04-12 14:35:19 +03:00
Roman Khavronenko
453df02e0a github/dependabot.yml: disable versions update for vmui (#2449)
The change disables version auto-updates for the vmui package.
The change has no impact on security updates, which have a separate,
internal limit of ten open pull requests.

See https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#open-pull-requests-limit

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-12 14:26:14 +03:00
Yurii Kravets
38383c0bec Update Quick-Start (#2422)
* Update Quick-Start

* Update docs/Quick-Start.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Update docs/Quick-Start.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Update docs/Quick-Start.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Update docs/Quick-Start.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* update Quick-Start.md

added "Starting VM-Cluster via Docker" + Anchor fixes

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-12 14:22:53 +03:00
Aliaksandr Valialkin
2973b7c634 app/vmui: revert back incompatible changes proposed by dependabot at da6a1642e0 and further commits 2022-04-12 14:03:24 +03:00
dependabot[bot]
f174f0880d build(deps-dev): bump @typescript-eslint/eslint-plugin (#2447)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 5.17.0 to 5.19.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.19.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:26:37 +03:00
dependabot[bot]
c87b39610e build(deps): bump @mui/styles in /app/vmui/packages/vmui (#2446)
Bumps [@mui/styles](https://github.com/mui/material-ui/tree/HEAD/packages/mui-styles) from 5.5.3 to 5.6.1.
- [Release notes](https://github.com/mui/material-ui/releases)
- [Changelog](https://github.com/mui/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui/material-ui/commits/v5.6.1/packages/mui-styles)

---
updated-dependencies:
- dependency-name: "@mui/styles"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:23:28 +03:00
dependabot[bot]
638b25028d build(deps): bump @testing-library/jest-dom in /app/vmui/packages/vmui (#2445)
Bumps [@testing-library/jest-dom](https://github.com/testing-library/jest-dom) from 5.16.3 to 5.16.4.
- [Release notes](https://github.com/testing-library/jest-dom/releases)
- [Changelog](https://github.com/testing-library/jest-dom/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/jest-dom/compare/v5.16.3...v5.16.4)

---
updated-dependencies:
- dependency-name: "@testing-library/jest-dom"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:21:44 +03:00
dependabot[bot]
40b2cb469b build(deps): bump @testing-library/user-event in /app/vmui/packages/vmui (#2442)
Bumps [@testing-library/user-event](https://github.com/testing-library/user-event) from 14.0.4 to 14.1.0.
- [Release notes](https://github.com/testing-library/user-event/releases)
- [Changelog](https://github.com/testing-library/user-event/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/user-event/compare/v14.0.4...v14.1)

---
updated-dependencies:
- dependency-name: "@testing-library/user-event"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:19:53 +03:00
dependabot[bot]
37e74b76e9 build(deps): bump @mui/icons-material in /app/vmui/packages/vmui (#2441)
Bumps [@mui/icons-material](https://github.com/mui/material-ui/tree/HEAD/packages/mui-icons-material) from 5.5.1 to 5.6.1.
- [Release notes](https://github.com/mui/material-ui/releases)
- [Changelog](https://github.com/mui/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui/material-ui/commits/v5.6.1/packages/mui-icons-material)

---
updated-dependencies:
- dependency-name: "@mui/icons-material"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:19:39 +03:00
dependabot[bot]
741973fd56 build(deps): bump @mui/material in /app/vmui/packages/vmui (#2444)
Bumps [@mui/material](https://github.com/mui/material-ui/tree/HEAD/packages/mui-material) from 5.5.3 to 5.6.1.
- [Release notes](https://github.com/mui/material-ui/releases)
- [Changelog](https://github.com/mui/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui/material-ui/commits/v5.6.1/packages/mui-material)

---
updated-dependencies:
- dependency-name: "@mui/material"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:19:02 +03:00
dependabot[bot]
170491ed3a build(deps): bump @mui/lab in /app/vmui/packages/vmui (#2440)
Bumps [@mui/lab](https://github.com/mui/material-ui/tree/HEAD/packages/mui-lab) from 5.0.0-alpha.75 to 5.0.0-alpha.77.
- [Release notes](https://github.com/mui/material-ui/releases)
- [Changelog](https://github.com/mui/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui/material-ui/commits/HEAD/packages/mui-lab)

---
updated-dependencies:
- dependency-name: "@mui/lab"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:15:36 +03:00
dependabot[bot]
e8d0c1ac4c build(deps): bump @testing-library/react in /app/vmui/packages/vmui (#2438)
Bumps [@testing-library/react](https://github.com/testing-library/react-testing-library) from 13.0.0 to 13.0.1.
- [Release notes](https://github.com/testing-library/react-testing-library/releases)
- [Changelog](https://github.com/testing-library/react-testing-library/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/react-testing-library/compare/v13.0.0...v13.0.1)

---
updated-dependencies:
- dependency-name: "@testing-library/react"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:15:24 +03:00
dependabot[bot]
d0f351b0b1 build(deps-dev): bump @typescript-eslint/parser (#2443)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 5.17.0 to 5.19.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.19.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:15:05 +03:00
dependabot[bot]
da6a1642e0 build(deps): bump @types/react in /app/vmui/packages/vmui (#2439)
Bumps [@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react) from 17.0.43 to 18.0.2.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react)

---
updated-dependencies:
- dependency-name: "@types/react"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:11:26 +03:00
dependabot[bot]
f5011fda4c build(deps): bump preact in /app/vmui/packages/vmui (#2431)
Bumps [preact](https://github.com/preactjs/preact) from 10.7.0 to 10.7.1.
- [Release notes](https://github.com/preactjs/preact/releases)
- [Commits](https://github.com/preactjs/preact/compare/10.7.0...10.7.1)

---
updated-dependencies:
- dependency-name: preact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:10:24 +03:00
dependabot[bot]
39fa3aecc0 build(deps): bump @mui/icons-material in /app/vmui/packages/vmui (#2433)
Bumps [@mui/icons-material](https://github.com/mui/material-ui/tree/HEAD/packages/mui-icons-material) from 5.5.1 to 5.6.0.
- [Release notes](https://github.com/mui/material-ui/releases)
- [Changelog](https://github.com/mui/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui/material-ui/commits/v5.6.0/packages/mui-icons-material)

---
updated-dependencies:
- dependency-name: "@mui/icons-material"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:07:53 +03:00
dependabot[bot]
5d414aae3d build(deps): bump @types/react-dom in /app/vmui/packages/vmui (#2430)
Bumps [@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom) from 17.0.14 to 18.0.0.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom)

---
updated-dependencies:
- dependency-name: "@types/react-dom"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:06:40 +03:00
dependabot[bot]
e0f91ad548 build(deps): bump marked in /app/vmui/packages/vmui (#2429)
Bumps [marked](https://github.com/markedjs/marked) from 4.0.12 to 4.0.14.
- [Release notes](https://github.com/markedjs/marked/releases)
- [Changelog](https://github.com/markedjs/marked/blob/master/.releaserc.json)
- [Commits](https://github.com/markedjs/marked/compare/v4.0.12...v4.0.14)

---
updated-dependencies:
- dependency-name: marked
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:06:15 +03:00
Aliaksandr Valialkin
ae3017d3a6 deployment/docker: update base Docker image from Alpine 3.15.3 to Alpine 3.15.4
See https://alpinelinux.org/posts/Alpine-3.12.12-3.13.10-3.14.6-3.15.4-released.html
2022-04-12 13:03:42 +03:00
Roman Khavronenko
dbbacc8847 docs: add managed vm documentation section (#2437)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-12 13:03:25 +03:00
Aliaksandr Valialkin
c6eb404c69 lib/encoding: explicitly set slice length passed to binary.BigEndian.Uint*
This allows the Go compiler to generate more optimal code without bounds checks
2022-04-12 12:55:21 +03:00
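The idea behind c6eb404c69 can be illustrated with a minimal sketch. The helper name `unmarshalUint64` is hypothetical and is not taken from lib/encoding; the only assumption is that the caller guarantees at least 8 bytes of input. Re-slicing to a fixed length gives the compiler a slice whose length it knows, so the byte accesses inside `binary.BigEndian.Uint64` need no further checks.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// unmarshalUint64 decodes a big-endian uint64 from the start of src.
// The explicit src[:8] re-slice tells the compiler the exact slice length,
// so it can prove that the accesses inside Uint64 are in bounds.
// The caller must guarantee len(src) >= 8, otherwise the re-slice panics.
func unmarshalUint64(src []byte) uint64 {
	return binary.BigEndian.Uint64(src[:8])
}

func main() {
	// The trailing byte is ignored because only the first 8 bytes are decoded.
	buf := []byte{0, 0, 0, 0, 0, 0, 1, 2, 0xff}
	fmt.Println(unmarshalUint64(buf)) // prints 258
}
```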
Aliaksandr Valialkin
a91c2a4377 vendor: make vendor-update 2022-04-12 12:51:54 +03:00
Aliaksandr Valialkin
f3d4671bb6 lib/promscrape: follow-up after 7e79adfb55 2022-04-12 12:36:17 +03:00
Nikolay
7e79adfb55 lib/promscrape: allows to use k8s pod name as clusterMemberNum (#2436)
* lib/promscrape: allows to use k8s pod name as clusterMemberNum
This should improve user experience and simplify clustering of scrapers.
It should also allow using a vmagent cluster with distroless images.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2359

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-12 12:24:11 +03:00
Yurii Kravets
708a3ef276 url-examples.md (#2435)
* url-examples.md

added 1 more example

* Update url-examples.md
2022-04-12 11:27:17 +03:00
Aliaksandr Valialkin
54de0531a4 app/vmstorage: properly handle maxSeries limit passed from vmselect to vmstorage 2022-04-12 11:23:04 +03:00
Aliaksandr Valialkin
deaa8c1ffa lib/protoparser/native: follow-up after fe01f4803d 2022-04-11 19:27:07 +03:00
Nikolay
fe01f4803d lib/protoparser/native: fixes parseStream dead-lock (#2423)
Previously, if a native block could not be unmarshaled, wg.Done wasn't called by the unmarshal work.
This led to connection blocking and a possible deadlock on the client side.
2022-04-11 19:22:24 +03:00
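The class of bug fixed in fe01f4803d can be sketched as follows. This is not the actual lib/protoparser/native code: the `unmarshalWork` type and `processBlocks` function are simplified, hypothetical stand-ins. The point is that `wg.Done` must run on every return path, which `defer` guarantees, so `wg.Wait` cannot hang when a block fails to unmarshal.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// unmarshalWork is a hypothetical unit of work that parses one block.
type unmarshalWork struct {
	wg   *sync.WaitGroup
	data []byte
	err  error
}

// Unmarshal parses the block. The deferred wg.Done runs on the error path too;
// calling Done only after a successful parse would leave wg.Wait blocked forever.
func (uw *unmarshalWork) Unmarshal() {
	defer uw.wg.Done()
	if len(uw.data) == 0 {
		uw.err = errors.New("cannot unmarshal empty block")
		return
	}
	// Successful parsing would happen here.
}

// processBlocks unmarshals all blocks concurrently and returns the number of failures.
func processBlocks(blocks [][]byte) int {
	var wg sync.WaitGroup
	works := make([]*unmarshalWork, len(blocks))
	for i, b := range blocks {
		wg.Add(1)
		works[i] = &unmarshalWork{wg: &wg, data: b}
		go works[i].Unmarshal()
	}
	wg.Wait() // returns even if some blocks are broken
	failed := 0
	for _, w := range works {
		if w.err != nil {
			failed++
		}
	}
	return failed
}

func main() {
	fmt.Println(processBlocks([][]byte{{1}, {}, {2}})) // prints 1
}
```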
Aliaksandr Valialkin
d7bf0a7348 vendor: update github.com/VictoriaMetrics/metricsql from v0.40.0 to v0.41.0
This allows using built-in function names as with template names
2022-04-11 18:31:44 +03:00
Aliaksandr Valialkin
e27dac25b9 docs/Single-server-VictoriaMetrics.md: clarify that ingestion protocol means data ingestion protocol 2022-04-11 12:57:26 +03:00
Aliaksandr Valialkin
61c7f6beae app/vmselect/promql: allow calling InitRollupResultCache+StopRollupResultCache multiple times during tests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2406
2022-04-11 12:34:43 +03:00
Aliaksandr Valialkin
b89e846ce3 docs/CHANGELOG.md: document ed364a42e3 2022-04-11 12:11:32 +03:00
hagen1778
ed364a42e3 vmalert: support relabeling for alert labels sent via notifier
Before, relabeling for a notifier configured via file was supported
only for target labels discovered via SD.
With this change, a new config field `alert_relabel_configs` is introduced
for applying relabeling to the labels of sent alerts.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-11 11:09:14 +03:00
Aliaksandr Valialkin
3121085e8f docs/Single-server-VictoriaMetrics.md: mention that VictoriaMetrics supports data ingestion in Graphite protocol at Graphite API usage chapter 2022-04-10 16:20:41 +03:00
Aliaksandr Valialkin
f1ad5b6857 docs/Cluster-VictoriaMetrics.md: update docs after b843f0e229 2022-04-10 16:18:21 +03:00
dependabot[bot]
d7f86f111b build(deps): bump codecov/codecov-action from 2.1.0 to 3 (#2407)
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 2.1.0 to 3.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/master/CHANGELOG.md)
- [Commits](https://github.com/codecov/codecov-action/compare/v2.1.0...v3)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-08 13:08:09 +03:00
Yurii Kravets
ed8c6f69e4 Update url-examples (#2410) 2022-04-08 13:05:46 +03:00
Ted Robertson
fae2b36b58 Fix English in the bug report template (#2413) 2022-04-08 13:05:08 +03:00
Aliaksandr Valialkin
a0e77744d4 docs/Cluster-VictoriaMetrics.md: clarify high availability docs 2022-04-08 12:51:57 +03:00
Aliaksandr Valialkin
fbd71f3083 docs/CHANGELOG.md: document backwards-incompatible changes in cluster version of v1.76.0 2022-04-08 12:05:45 +03:00
Dmytro Kozlov
66a03a7fa9 docs/guides: Multi-regional setup with VictoriaMetrics (#2416)
* docs/guides: Multi-regional setup with VictoriaMetrics

* docs/guides: cleanup
2022-04-08 11:39:40 +03:00
Aliaksandr Valialkin
dc60e99e94 docs/CHANGELOG.md: document the bugfix in hitCount function 2022-04-08 11:31:52 +03:00
Aliaksandr Valialkin
978f6d0f89 docs/CHANGELOG.md: typo fix 2022-04-07 17:19:59 +03:00
Aliaksandr Valialkin
ef690932ee docs/CHANGELOG.md: cut v1.76.0 2022-04-07 15:33:55 +03:00
Aliaksandr Valialkin
a95b96979c vendor: make vendor-update 2022-04-07 15:28:27 +03:00
Aliaksandr Valialkin
a96eb16329 lib/memory: export process_memory_limit_bytes metric, which shows the amount of memory the current process has access to
This metric is equivalent to `vm_available_memory_bytes`, but it has a better name,
since the metric is related to a process, not VictoriaMetrics itself.

Leave `vm_available_memory_bytes` for backwards compatibility.
2022-04-07 15:23:00 +03:00
Roman Khavronenko
2b59fff526 vmalert: fix labels and annotations processing for alerts (#2403)
To improve compatibility with Prometheus alerting the order of
templates processing has changed.
Before, vmalert did all labels processing beforehand. It meant
all extra labels (such as `alertname`, `alertgroup` or rule labels)
were available in templating. All collisions were resolved in favour
of extra labels.
In Prometheus, only labels from the received metric are available in
templating, so no collisions are possible.
This change makes vmalert's behaviour similar to Prometheus.

For example, consider an alerting rule which is triggered by a time series
with an `alertname` label. In vmalert, this label would be overridden
by the alerting rule's name everywhere: for alert labels, for annotations, etc.
In Prometheus, it would be overridden for the alert's labels only, but in annotations
the original label value would be available.

See more details here https://github.com/prometheus/compliance/issues/80

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-06 20:24:45 +02:00
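The processing order described in 2b59fff526 can be sketched with Go's `text/template`. This is an illustrative model, not vmalert's code: `buildAlert`, `tplData` and the `.Labels` template field are hypothetical names. Annotations are templated from the labels of the received series first, and only afterwards is `alertname` overridden in the alert's own labels.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// tplData is the hypothetical data passed to annotation templates.
// It exposes only the labels of the received series.
type tplData struct {
	Labels map[string]string
}

// buildAlert renders the annotation from the series labels and then
// builds the alert labels, where alertname is set to the rule name.
func buildAlert(ruleName string, seriesLabels map[string]string, annotationTpl string) (map[string]string, string, error) {
	tpl, err := template.New("annotation").Parse(annotationTpl)
	if err != nil {
		return nil, "", err
	}
	var sb strings.Builder
	if err := tpl.Execute(&sb, tplData{Labels: seriesLabels}); err != nil {
		return nil, "", err
	}
	alertLabels := make(map[string]string, len(seriesLabels)+1)
	for k, v := range seriesLabels {
		alertLabels[k] = v
	}
	// The rule name overrides alertname in the alert labels only.
	alertLabels["alertname"] = ruleName
	return alertLabels, sb.String(), nil
}

func main() {
	series := map[string]string{"alertname": "fromSeries", "job": "api"}
	labels, annotation, err := buildAlert("HighLatency", series, "original: {{ .Labels.alertname }}")
	if err != nil {
		panic(err)
	}
	fmt.Println(labels["alertname"]) // prints HighLatency
	fmt.Println(annotation)          // prints original: fromSeries
}
```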
Aliaksandr Valialkin
57143e9435 lib/storage: increase the number of rawRowsShard shards on systems with more than 4 CPU cores
This should improve data ingestion scalability on systems with many CPU cores
2022-04-06 19:49:20 +03:00
Aliaksandr Valialkin
7bad7133bc lib/mergeset: use more rawItemsShard shards on multi-CPU systems
This should improve the scalability of registering new time series on multi-CPU systems
2022-04-06 19:35:55 +03:00
Aliaksandr Valialkin
ad35068c3a lib/mergeset: skip common prefixes when comparing inmemoryBlock items
This should improve the performance for items sorting inside inmemoryBlock.MarshalUnsortedData
if they have common prefix.

While at it, improve the performance for inmemoryBlock.updateCommonPrefix for sorted items.
This should improve performance for inmemoryBlock.MarshalSortedData during background merge.
2022-04-06 18:51:36 +03:00
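Both optimizations in ad35068c3a rest on simple properties of byte-slice comparison, sketched below. The function names are hypothetical and this is not the lib/mergeset implementation. If all items share a known prefix, comparisons can skip it; and when items are already sorted lexicographically, the prefix common to all of them equals the common prefix of the first and the last item.

```go
package main

import (
	"bytes"
	"fmt"
)

// commonPrefixLen returns the length of the longest common prefix of a and b.
func commonPrefixLen(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	return n
}

// commonPrefixSorted returns the prefix shared by all items.
// items must be sorted lexicographically; then every item lies between
// the first and the last one, so comparing those two is enough.
func commonPrefixSorted(items [][]byte) []byte {
	if len(items) == 0 {
		return nil
	}
	first, last := items[0], items[len(items)-1]
	return first[:commonPrefixLen(first, last)]
}

// lessSkippingPrefix reports whether a < b, assuming both are known
// to share their first prefixLen bytes, which therefore are not compared.
func lessSkippingPrefix(a, b []byte, prefixLen int) bool {
	return bytes.Compare(a[prefixLen:], b[prefixLen:]) < 0
}

func main() {
	items := [][]byte{[]byte("metric_a"), []byte("metric_b"), []byte("metric_c")}
	prefix := commonPrefixSorted(items)
	fmt.Printf("%s\n", prefix) // prints metric_
	fmt.Println(lessSkippingPrefix(items[0], items[1], len(prefix))) // prints true
}
```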
Aliaksandr Valialkin
5acd70109b lib/protoparser: remove superfluous memory allocations during protocol parsing 2022-04-06 14:00:08 +03:00
Aliaksandr Valialkin
569b0d444c app/vmagent: properly initialize stdDialer
This is a follow-up commit for 7da20a4b3f

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1699
2022-04-06 13:57:20 +03:00
Aliaksandr Valialkin
50cf74ce4b lib/storage: reuse sync.WaitGroup objects
This reduces GC load by up to 10% according to memory profiling
2022-04-06 13:34:04 +03:00
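One common way to reuse `sync.WaitGroup` objects is a `sync.Pool`; the sketch below assumes that approach and uses hypothetical helper names (`getWaitGroup`, `putWaitGroup`, `sumSquares`), so it may differ from what 50cf74ce4b actually does. A WaitGroup may be reused once its counter is back at zero and all `Wait` calls have returned.

```go
package main

import (
	"fmt"
	"sync"
)

// wgPool holds idle WaitGroup objects so they are not reallocated on every call.
var wgPool sync.Pool

// getWaitGroup returns a pooled WaitGroup or allocates a new one.
func getWaitGroup() *sync.WaitGroup {
	if v := wgPool.Get(); v != nil {
		return v.(*sync.WaitGroup)
	}
	return &sync.WaitGroup{}
}

// putWaitGroup returns wg to the pool. It must be called only after
// wg.Wait has returned, when the counter is zero again.
func putWaitGroup(wg *sync.WaitGroup) {
	wgPool.Put(wg)
}

// sumSquares squares every number in its own goroutine and sums the results.
func sumSquares(nums []int) int {
	wg := getWaitGroup()
	results := make([]int, len(nums))
	for i, n := range nums {
		wg.Add(1)
		go func(i, n int) {
			defer wg.Done()
			results[i] = n * n
		}(i, n)
	}
	wg.Wait()
	putWaitGroup(wg)

	total := 0
	for _, r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3})) // prints 14
}
```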
Aliaksandr Valialkin
077193d87c lib/cgroup: reduce the default GOGC value from 50% to 30%
This reduces memory usage under production workloads by up to 10%,
while CPU spent on GC remains roughly the same.

The CPU spent on GC can be monitored with go_memstats_gc_cpu_fraction metric
2022-04-06 13:32:07 +03:00
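A lower default GC target like the one in 077193d87c can be applied at startup with `runtime/debug.SetGCPercent`. The sketch below is an assumption about how such a default might be wired, not the lib/cgroup code: the `gcPercent` helper is hypothetical, and it lets an explicitly set `GOGC` environment variable take precedence over the 30% default.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
)

// defaultGCPercent is the default GOGC value named in the commit message.
const defaultGCPercent = 30

// gcPercent returns the GC percent to use for the given GOGC env value.
// An empty or unparsable value falls back to the default; "off" disables GC.
func gcPercent(envValue string) int {
	if envValue == "" {
		return defaultGCPercent
	}
	if envValue == "off" {
		return -1
	}
	n, err := strconv.Atoi(envValue)
	if err != nil {
		return defaultGCPercent
	}
	return n
}

func main() {
	p := gcPercent(os.Getenv("GOGC"))
	prev := debug.SetGCPercent(p) // returns the previous setting
	fmt.Printf("GC percent changed from %d to %d\n", prev, p)
}
```

A smaller percentage triggers collections after less heap growth, which trades some GC CPU time for lower memory usage; that is why the commit suggests watching `go_memstats_gc_cpu_fraction`.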
Aliaksandr Valialkin
7da20a4b3f app/vmagent: reduce the probability of TLS handshake timeout when dialing the remote storage
The following actions are taken:

- Increase the TLS handshake timeout from 5 seconds to 10 seconds
- Increase dial timeout from 5 seconds to 30 seconds
- Specify DialContext instead of Dial in http.Transport. This allows properly handling
  the Context arg during dialing the remote storage

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1699
2022-04-06 12:34:25 +03:00
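The three actions listed in 7da20a4b3f map onto standard `net/http` and `net` settings. The following is a minimal sketch with a hypothetical constructor name, not the vmagent code itself; the timeout values are the ones given in the commit message.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newRemoteStorageTransport builds a transport with the timeouts from the commit message.
func newRemoteStorageTransport() *http.Transport {
	dialer := &net.Dialer{
		Timeout: 30 * time.Second, // dial timeout
	}
	return &http.Transport{
		// DialContext (rather than the deprecated Dial) receives the request
		// context, so cancellation and deadlines are honored while dialing.
		DialContext:         dialer.DialContext,
		TLSHandshakeTimeout: 10 * time.Second,
	}
}

func main() {
	tr := newRemoteStorageTransport()
	client := &http.Client{Transport: tr}
	_ = client // the client would be used for sending data to the remote storage
	fmt.Println(tr.TLSHandshakeTimeout) // prints 10s
}
```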
Aliaksandr Valialkin
cde1e2ec93 docs/Release-Guide.md: add missing steps 2022-04-06 11:41:09 +03:00
Aliaksandr Valialkin
319e910897 lib/workingsetcache: reuse prev cache after its reset
This should reduce memory churn rate
2022-04-05 20:37:45 +03:00
Aliaksandr Valialkin
cae61c85d4 vendor: update github.com/VictoriaMetrics/fastcache from v1.9.0 to v1.10.0 2022-04-05 20:32:50 +03:00
Aliaksandr Valialkin
7ecb72648d docs/CHANGELOG.md: document 0c0efc7781 2022-04-05 19:21:49 +03:00
Aliaksandr Valialkin
29cebb3d95 lib/workingsetcache: check more frequently for cache size overflow
This should reduce the probability of cache size limit overflow
2022-04-05 18:05:43 +03:00
Aliaksandr Valialkin
4785d04312 lib/workingsetcache: reduce the expiration duration from 20 minutes to 10 minutes
This should reduce memory usage for the cache under high churn rate
2022-04-05 17:12:13 +03:00
Nikolay
0c0efc7781 vmctl verify-blocks command (#2390)
* lib/protoparser: changes ParseStream for native format
uses reader instead of http.Request
updates app/vmagent and app/vmagent method usage

* app/vmctl: add verify-block subcommand
It allows checking a data block exported from VictoriaMetrics in native format
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2362

Update app/vmctl/README.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-04-05 16:01:32 +02:00
Aliaksandr Valialkin
4ecb86c179 app/vminsert: reduce the max packet size, which vminsert can send to vmstorage
This reduces the max memory usage for vminsert and vmstorage under heavy ingestion rate
by up to 50% on production workload
2022-04-05 15:43:07 +03:00
Aliaksandr Valialkin
d4f14f4879 vendor: make vendor-update 2022-04-04 13:05:04 +03:00
Aliaksandr Valialkin
d011446f6f docs/CHANGELOG.md: document 70bb0d2708 2022-04-04 13:02:27 +03:00
Roman Khavronenko
70bb0d2708 vmalert: add flag for disabling long-lived connections (keepalive) (#2395)
The new flag `datasource.disableKeepAlive` allows disabling keepalive
connections. This may be useful if there are multiple datasource
replicas (e.g. vmselects) behind the HTTP balancer to avoid uneven
load spread because of long-lived connections.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-04 12:59:04 +03:00
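In Go's standard HTTP client, keepalive is controlled by the `DisableKeepAlives` field of `http.Transport`. The sketch below shows how a boolean such as the value of `datasource.disableKeepAlive` could be applied; the function name is hypothetical and the wiring to the flag is an assumption, not vmalert's code.

```go
package main

import (
	"fmt"
	"net/http"
)

// newDatasourceTransport returns a copy of the default transport with
// keepalive connections optionally disabled. With keepalive disabled,
// every request opens a new connection, so an HTTP balancer can spread
// requests evenly across datasource replicas.
func newDatasourceTransport(disableKeepAlive bool) *http.Transport {
	tr := http.DefaultTransport.(*http.Transport).Clone()
	tr.DisableKeepAlives = disableKeepAlive
	return tr
}

func main() {
	tr := newDatasourceTransport(true)
	client := &http.Client{Transport: tr}
	_ = client // the client would be used for querying the datasource
	fmt.Println(tr.DisableKeepAlives) // prints true
}
```

The trade-off is extra connection setup cost per request, which is why the commit makes this opt-in.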
Aliaksandr Valialkin
43df19a742 docs/CHANGELOG.md: document 173073364e1bb1e0259ddc873dbd96ce62b07543 2022-04-04 12:55:43 +03:00
Artem Navoiev
3d3b9e3b59 Update release process. Actualize helm-charts release process. Add re… (#2397)
* Update release process. Actualize helm-charts release process. Add release guide for ansible playbooks

* add step to Ansible steps
2022-04-04 12:14:27 +03:00
Aliaksandr Valialkin
19ecc4b2c3 app/vmselect: make vmui-update 2022-04-01 12:55:21 +03:00
dependabot[bot]
f47d67d836 build(deps): bump react-router-dom in /app/vmui/packages/vmui (#2394)
Bumps [react-router-dom](https://github.com/remix-run/react-router/tree/HEAD/packages/react-router-dom) from 6.2.2 to 6.3.0.
- [Release notes](https://github.com/remix-run/react-router/releases)
- [Commits](https://github.com/remix-run/react-router/commits/v6.3.0/packages/react-router-dom)

---
updated-dependencies:
- dependency-name: react-router-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-01 12:53:50 +03:00
dependabot[bot]
4aa5f70f21 build(deps): bump @testing-library/react in /app/vmui/packages/vmui (#2392)
Bumps [@testing-library/react](https://github.com/testing-library/react-testing-library) from 12.1.4 to 13.0.0.
- [Release notes](https://github.com/testing-library/react-testing-library/releases)
- [Changelog](https://github.com/testing-library/react-testing-library/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/react-testing-library/compare/v12.1.4...v13.0.0)

---
updated-dependencies:
- dependency-name: "@testing-library/react"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-01 12:52:54 +03:00
dependabot[bot]
73789b333f build(deps): bump @types/react in /app/vmui/packages/vmui (#2375)
Bumps [@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react) from 17.0.41 to 17.0.43.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react)

---
updated-dependencies:
- dependency-name: "@types/react"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-01 12:51:28 +03:00
dependabot[bot]
72fd976cb3 build(deps): bump @testing-library/user-event in /app/vmui/packages/vmui (#2393)
Bumps [@testing-library/user-event](https://github.com/testing-library/user-event) from 13.5.0 to 14.0.4.
- [Release notes](https://github.com/testing-library/user-event/releases)
- [Changelog](https://github.com/testing-library/user-event/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/user-event/compare/v13.5.0...v14.0.4)

---
updated-dependencies:
- dependency-name: "@testing-library/user-event"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-01 12:50:30 +03:00
Yury Molodov
f166f80f15 vmui: grid support for predefined panels (#2386)
* update packages

* feat: add setting width for predefined panels

* docs: update doc by predefined dashboards

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-01 12:48:17 +03:00
Ross Dougherty
0fd4c48568 fix object selectors link (#2391)
* fix object selectors link

* update kustomize url
2022-04-01 12:40:41 +03:00
Dima Lazerka
e2b1097545 Fix typo "vmanomapy" 2022-04-01 12:26:59 +03:00
Aliaksandr Valialkin
f977ca8eaf docs/CHANGELOG.md: document a57e3807537914396ee3eb378648a464fa9e1b97 2022-04-01 12:24:49 +03:00
Aliaksandr Valialkin
1c38ff6f48 docs/CHANGELOG.md: document 0989649ad0 2022-04-01 12:01:34 +03:00
Yurii Kravets
a9b6cf53a2 url-examples (#2389) 2022-04-01 11:23:18 +03:00
Roman Khavronenko
1354e6d712 vmalert: protect executor's field from concurrent access (#2387)
The executor recently gained a field for storing previously sent series.
Since the same executor object can be used in multiple goroutines,
the access to this field should be serialized.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-03-30 12:37:27 +02:00
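Serializing access to a shared field, as described in 1354e6d712, is typically done with a `sync.Mutex`. The sketch below uses a hypothetical `executor` type with made-up field and method names; it only illustrates the pattern and does not reproduce vmalert's executor.

```go
package main

import (
	"fmt"
	"sync"
)

// executor is a hypothetical object shared between goroutines.
type executor struct {
	mu             sync.Mutex          // protects previouslySent
	previouslySent map[string]struct{} // series sent during earlier evaluations
}

// remember records a sent series under the mutex.
func (e *executor) remember(series string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.previouslySent == nil {
		e.previouslySent = make(map[string]struct{})
	}
	e.previouslySent[series] = struct{}{}
}

// sentCount returns the number of remembered series.
func (e *executor) sentCount() int {
	e.mu.Lock()
	defer e.mu.Unlock()
	return len(e.previouslySent)
}

// runConcurrently calls remember from n goroutines at once.
// Without the mutex, the concurrent map writes would be a data race.
func runConcurrently(e *executor, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			e.remember(fmt.Sprintf("series_%d", i))
		}(i)
	}
	wg.Wait()
}

func main() {
	e := &executor{}
	runConcurrently(e, 100)
	fmt.Println(e.sentCount()) // prints 100
}
```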
Roman Khavronenko
0989649ad0 Vmalert compliance 2 (#2340)
* vmalert: split alert's `Start` field into `ActiveAt` and `Start`

The `ActiveAt` field identifies when alert becomes active for rules
with `for > 0`. Previously, this value was stored in field `Start`.

The field `Start` now identifies the moment alert became `FIRING`.

The split is needed in order to distinguish these two moments
in the API responses for alerts.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: support specific moment of time for rules evaluation

The Querier interface was extended to accept a new argument
used as a timestamp at which evaluation should be made.

It is needed to align rules execution time within the group.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: mark disappeared series as stale

Series generated by alerting rules and sent to remote write
will now be marked as stale if they disappear on the next
evaluation. This makes the ALERTS and ALERTS_FOR_TIME series
more precise.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: evaluate rules at fixed timestamp

Before, the time at which rules were evaluated was calculated
right before rule execution. The change makes sure
that the timestamp is calculated only once per evaluation round
and all rules use the same timestamp.

It also updates the logic of resending of already resolved
alert notification.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: allow overriding `alertname` label value if it is present in response

Previously, `alertname` was always equal to the Alerting Rule name. Now,
its value can be overridden if a series in the response contains a different value
for this label.

The change is needed for improving compatibility with Prometheus.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: align rules evaluation in time

Now, the evaluation timestamp for rules is calculated as if
there was no delay in rules evaluation. It means that
rules will be evaluated at fixed timestamps+group_interval.
This provides more consistent evaluation results and
improves compatibility with Prometheus.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: add metric for missed iterations

New metric `vmalert_iteration_missed_total` will show
whether rules evaluation round was missed.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: reduce delay before the initial rule evaluation in group

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: rollback alertname override

According to the spec:
```
The alert name from the alerting rule (HighRequestLatency from the example above) MUST be added to the labels of the alert with the label name as alertname. It MUST override any existing alertname label.
```

https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#step-3
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: throw err immediately on dedup detection

```
The execution of an alerting rule MUST error out immediately and MUST NOT send any alerts
or add samples to samples receiver if there is more than one alert with the same labels
```

https://github.com/prometheus/compliance/blob/main/alert_generator/specification.md#step-4
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: cleanup

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: use strings builder to reduce allocs

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-03-29 15:09:07 +02:00
Denys Holius
0123295d50 Update alpine linux base image to the latest v3.15.3 (#2384)
Updated alpine linux base image to the latest v3.15.3 which has fix for [CVE-2018-25032](https://security.alpinelinux.org/vuln/CVE-2018-25032).
See https://alpinelinux.org/posts/Alpine-3.12.11-3.13.9-3.14.5-3.15.3-released.html
2022-03-29 12:48:11 +02:00
Roman Khavronenko
56de8f0356 docs: fix typo in vmalert's API (#2380)
The API handler was changed in 1.75 but docs
still contain the old address.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2366
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-03-28 12:07:02 +02:00
Aliaksandr Valialkin
e210384f7e docs/CHANGELOG.md: cut v1.75.1 2022-03-28 12:28:48 +03:00
Roman Khavronenko
cb878d50fc docs: update increase_prometheus desc (#2381)
Remove note that this func is supported by PromQL.
2022-03-28 11:57:03 +03:00
Roman Khavronenko
3a2a60cb08 docs: escape chars for /label/values endpoint (#2379)
Without escaping the part wrapped with `<` `>` chars
won't be rendered properly.
2022-03-28 10:18:37 +02:00
Aliaksandr Valialkin
2ea540a5aa vendor: make vendor-update 2022-03-26 13:07:56 +02:00
Yury Molodov
c8d29ed78e vmui: predefined panels (#2243)
* feat: add basic components for predefined dashboards

* fix: change display alert

* feat: add autosize and unit for axes

* feat: add component for CircularProgress

* feat: change layout for predefined dashboards

* feat: add override step for predefined panels

* feat: add override step for predefined panels

* feat: change yaxis limits for predefined panels

* fix: rename flag for hide legend

* feat: add formatted panel description

* feat: add README.md for dashboard setup

* feat: validate dashboard settings

* feat: add unit for y-ticks

* fix: correct display error for dashboards

* fix: disable auto refresh after route change

* update package-lock.json

* fix: add basename for BrowserRouter

* fix: add dynamic basename for routing

* update packages

* feat: add a pre-defined dashboard "per-job resource usage"

* feat: display unit in the hover-tooltip

* fix: change routing and home layout

* fix: change axis width calc

* updated packages

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-03-26 13:03:11 +02:00
dependabot[bot]
afc2e73948 build(deps): bump node-forge in /app/vmui/packages/vmui (#2371)
Bumps [node-forge](https://github.com/digitalbazaar/forge) from 1.2.1 to 1.3.0.
- [Release notes](https://github.com/digitalbazaar/forge/releases)
- [Changelog](https://github.com/digitalbazaar/forge/blob/main/CHANGELOG.md)
- [Commits](https://github.com/digitalbazaar/forge/compare/v1.2.1...v1.3.0)

---
updated-dependencies:
- dependency-name: node-forge
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-26 12:57:11 +02:00
Nikolay
9a88c1a91e lib/{storage,regexpcache}: replaces regexpCacheMap with LRU cache (#2293)
* lib/{storage,regexpcache}: replaces regexpCacheMap with LRU cache

It should decrease memory usage for regexp caching.
With cacheEntry stored by pointer, the Go map should be able to effectively shrink its size.
The original issue in this case was unexpected map growth and storage OOM.
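
For readers unfamiliar with the technique, here is a generic LRU cache sketch built on `container/list`, with entries stored by pointer. It is an illustration under assumptions, not the implementation from this commit: the type and method names are hypothetical, values are plain `int`, and there is no locking, so it is not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is stored by pointer inside the list elements.
type entry struct {
	key   string
	value int
}

// lruCache is a hypothetical, non-thread-safe LRU cache for illustration.
type lruCache struct {
	maxEntries int                      // must be at least 1
	order      *list.List               // front = most recently used
	items      map[string]*list.Element // key -> element in order
}

func newLRUCache(maxEntries int) *lruCache {
	return &lruCache{
		maxEntries: maxEntries,
		order:      list.New(),
		items:      make(map[string]*list.Element),
	}
}

// Get returns the cached value and marks the key as recently used.
func (c *lruCache) Get(key string) (int, bool) {
	el, ok := c.items[key]
	if !ok {
		return 0, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put stores the value and evicts the least recently used entry when the
// cache grows beyond maxEntries.
func (c *lruCache) Put(key string, value int) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
	if c.order.Len() > c.maxEntries {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := newLRUCache(2)
	c.Put("a", 1)
	c.Put("b", 2)
	c.Get("a")    // "a" becomes the most recently used key
	c.Put("c", 3) // evicts "b"
	_, ok := c.Get("b")
	fmt.Println("b cached:", ok)
}
```

Bounding the number of entries keeps the backing map from growing without limit, which is the failure mode the commit message describes.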

Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

Adds missing metrics for regexp cache and regexpPrefixes cache

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-03-26 12:54:50 +02:00
Aliaksandr Valialkin
6e364e19ef app/vmselect: add fine-grained limits for the number of returned/scanned time series for various APIs 2022-03-26 11:29:49 +02:00
Denys Holius
a462b97859 Update alpine linux base image to the latest v3.15.2
Update alpine linux base image to the latest v3.15.2 which has fix for CVE-2022-0778.
See https://alpinelinux.org/posts/Alpine-3.15.2-released.html
2022-03-25 17:05:55 +01:00
Dima Lazerka
1fa0f3ec89 VMAnomaly docs fixes (#2361)
* Added docs for vmanomaly

* Add example images

* Stylistic fixes

* Move images to root

* Update docs/vmanomaly.md

* Update docs/vmanomaly.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Squeeze vmanomaly after vmbackupmanager before Case Studies

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-03-25 12:08:17 +02:00
Denys Holius
7ce40d74d7 Update golangci version to latest v1.45.1 (#2360) 2022-03-24 19:16:24 +02:00
Dima Lazerka
7377163659 Add vmanomaly docs section (#2356)
* Added docs for vmanomaly

* Add example images

* Stylistic fixes

* Move images to root

* Update docs/vmanomaly.md

* Update docs/vmanomaly.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-03-24 11:56:23 +02:00
Yurii Kravets
c46d9be108 Update url-examples (#2358)
* Update url-examples

Add federate example

* Update docs/url-examples.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-03-24 11:53:28 +02:00
Aliaksandr Valialkin
f8dfc22350 docs: a follow-up after 76a477c609 2022-03-23 19:39:08 +02:00
Yurii Kravets
76a477c609 Update Single-server-VictoriaMetrics.md (#2357)
* Update Single-server-VictoriaMetrics.md

Adding /federate link

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-03-23 19:37:23 +02:00
Yurii Kravets
23d0fc220d Update url-examples.md (#2297)
* Update url-examples.md

+additional examples

* Update

* Update url-examples

Some changes requested by Roman.

* Update url-examples.md

* Update url-example

* Update url-examples

Additional info and marking for /labels part

* Update url-example

Added example with complex query which needs encoding:
 How to execute the query similar to - sum(increase(foo{status="bar"}[5m])) by (status)

* Update url-samples

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-03-22 15:11:43 +02:00
Aliaksandr Valialkin
c8f356a6a8 app: sync Markdown changes from a8de1ab000 2022-03-22 14:11:18 +02:00
Aliaksandr Valialkin
b421a1f57b docs/Cluster-VictoriaMetrics.md: clarify mTLS protection docs 2022-03-22 13:55:40 +02:00
Arash Hatami
a8de1ab000 A good change for MD files (#2353)
* Lint YAML

* Remove extra comment

* Fix command problem

* Format MD files

* Format & fix problem of MD files for docs

* Another fix for MD files
2022-03-22 13:40:55 +02:00
Aliaksandr Valialkin
e1311409db vendor: make vendor-update 2022-03-21 17:02:12 +02:00
dependabot[bot]
f36e8debc7 build(deps): bump @types/react in /app/vmui/packages/vmui (#2346)
Bumps [@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react) from 17.0.40 to 17.0.41.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react)

---
updated-dependencies:
- dependency-name: "@types/react"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-21 15:44:54 +02:00
dependabot[bot]
c8c6f5b15e build(deps): bump @types/react-dom in /app/vmui/packages/vmui (#2347)
Bumps [@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom) from 17.0.13 to 17.0.14.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom)

---
updated-dependencies:
- dependency-name: "@types/react-dom"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-21 15:42:14 +02:00
Roman Khavronenko
f367ff086c docs: update release notes (#2349)
Warn about memory issue introduced in releases 1.73 - 1.74
2022-03-21 15:40:50 +02:00
Aliaksandr Valialkin
5ab6c350ec docs/CHANGELOG.md: document a1e17e91f8 2022-03-21 15:34:49 +02:00
Dmytro Kozlov
a1e17e91f8 issue-2323: Fixed Incorrect Content-Type header 'text/plain' for root path (#2343)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2323
2022-03-21 08:13:28 +00:00
hagen1778
82659ab5b6 docs: add update note to v1.75.0 release note
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-03-19 17:55:38 +01:00
328 changed files with 15320 additions and 8352 deletions

View File

@@ -8,10 +8,10 @@ assignees: ''
**Describe the bug**
A clear and concise description of what the bug is.
It would be a great [upgrading](https://docs.victoriametrics.com/#how-to-upgrade)
It would be great to [upgrade](https://docs.victoriametrics.com/#how-to-upgrade)
to [the latest available release](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
and verifying whether the bug is reproducible there.
It is also recommended reading [troubleshooting docs](https://docs.victoriametrics.com/#troubleshooting).
and verify whether the bug is reproducible there.
It's also recommended to read the [troubleshooting docs](https://docs.victoriametrics.com/#troubleshooting).
**To Reproduce**
Steps to reproduce the behavior.
@@ -36,12 +36,11 @@ See how to setup monitoring here:
* [monitoring for VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#monitoring)
**Version**
The line returned when passing `--version` command line flag to binary. For example:
The line returned when passing `--version` command line flag to the binary. For example:
```
$ ./victoria-metrics-prod --version
victoria-metrics-20190730-121249-heads-single-node-0-g671d9e55
```
**Used command-line flags**
Please provide applied command-line flags used for running VictoriaMetrics and its components.
Please provide the command-line flags used for running VictoriaMetrics and its components.

View File

@@ -16,6 +16,7 @@ updates:
directory: "/app/vmui/packages/vmui/web"
schedule:
interval: "weekly"
open-pull-requests-limit: 0
- package-ecosystem: "docker"
directory: "/"
schedule:
@@ -24,3 +25,4 @@ updates:
directory: "/app/vmui/packages/vmui"
schedule:
interval: "weekly"
open-pull-requests-limit: 0

View File

@@ -60,7 +60,7 @@ jobs:
GOOS=darwin go build -mod=vendor ./app/vmctl
CGO_ENABLED=0 GOOS=windows go build -mod=vendor ./app/vmagent
- name: Publish coverage
uses: codecov/codecov-action@v2.1.0
uses: codecov/codecov-action@v3
with:
file: ./coverage.txt

View File

@@ -68,9 +68,9 @@ members of the project's leadership.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
available at <https://www.contributor-covenant.org/version/1/4/code-of-conduct.html>
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq
<https://www.contributor-covenant.org/faq>

View File

@@ -79,7 +79,7 @@
**Последствия**: Предупреждение о последствиях в случае продолжающегося неуместного поведения.
На определенное время не допускается взаимодействие с людьми, вовлеченными в инцидент,
включая незапрошенное взаимодействие
включая незапрошенное взаимодействие
с теми, кто обеспечивает соблюдение Кодекса. Это включает в себя избегание взаимодействия
в публичных пространствах, а так же во внешних каналах,
таких как социальные сети. Нарушение этих правил влечет за собой временный или вечный бан.
@@ -89,10 +89,10 @@
**Общественное влияние**: Серьёзное нарушение стандартов сообщества,
включая продолжительное неуместное поведение.
**Последствия**: Временный запрет (бан) на любое взаимодействие
**Последствия**: Временный запрет (бан) на любое взаимодействие
или публичное общение с сообществом на определенный период времени.
На этот период не допускается публичное или личное взаимодействие с людьми,
вовлеченными в инцидент, включая незапрошенное взаимодействие
вовлеченными в инцидент, включая незапрошенное взаимодействие
с теми, кто обеспечивает соблюдение Кодекса.
Нарушение этих правил влечет за собой вечный бан.
@@ -108,7 +108,7 @@
Данный Кодекс Поведения основан на [Кодекс Поведения участника][homepage],
версии 2.0, доступной по адресу
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
<https://www.contributor-covenant.org/version/2/0/code_of_conduct.html>.
Принципы Воздействия в Сообществе были вдохновлены [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).
@@ -116,5 +116,5 @@ enforcement ladder](https://github.com/mozilla/diversity).
[homepage]: https://www.contributor-covenant.org
Ответы на общие вопросы о данном кодексе поведения ищите на странице FAQ:
https://www.contributor-covenant.org/faq. Переводы доступны по адресу
https://www.contributor-covenant.org/translations.
<https://www.contributor-covenant.org/faq>. Переводы доступны по адресу
<https://www.contributor-covenant.org/translations>.

View File

@@ -283,7 +283,7 @@ golangci-lint: install-golangci-lint
golangci-lint run --exclude '(SA4003|SA1019|SA5011):' -D errcheck -D structcheck --timeout 2m
install-golangci-lint:
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.44.1
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.45.1
install-wwhrd:
which wwhrd || GO111MODULE=off go get github.com/frapposelli/wwhrd

479
README.md

File diff suppressed because it is too large

View File

@@ -86,6 +86,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.Method != "GET" {
return false
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, "<h2>Single-node VictoriaMetrics</h2></br>")
fmt.Fprintf(w, "See docs at <a href='https://docs.victoriametrics.com/'>https://docs.victoriametrics.com/</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
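
A small sketch of how the effect of the added `Content-Type` line could be checked with `net/http/httptest`. The handler below is a simplified stand-in for the root-path branch shown in this hunk, and the names `rootHandler` and `contentTypeOf` are hypothetical:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// rootHandler is a simplified stand-in for the root-path branch above.
// The header is set before the first write to the response body.
func rootHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Add("Content-Type", "text/html; charset=utf-8")
	fmt.Fprintf(w, "<h2>Single-node VictoriaMetrics</h2></br>")
}

// contentTypeOf calls the handler with a GET request for "/" and returns
// the Content-Type header of the recorded response.
func contentTypeOf(h http.HandlerFunc) string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	h(rec, req)
	return rec.Result().Header.Get("Content-Type")
}

func main() {
	fmt.Println(contentTypeOf(rootHandler))
}
```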

View File

@@ -6,7 +6,6 @@ or any other Prometheus-compatible storage systems that support the `remote_writ
<img alt="vmagent" src="vmagent.png">
## Motivation
While VictoriaMetrics provides an efficient solution to store and observe metrics, our users needed something fast
@@ -14,7 +13,6 @@ and RAM friendly to scrape metrics from Prometheus-compatible exporters into Vic
Also, we found that our users' infrastructures are like snowflakes in that no two are alike. Therefore we decided to add more flexibility
to `vmagent` such as the ability to push metrics in addition to pulling them. We did our best and will continue to improve `vmagent`.
## Features
* Can be used as a drop-in replacement for Prometheus for scraping targets such as [node_exporter](https://github.com/prometheus/node_exporter). See [Quick Start](#quick-start) for details.
@@ -67,7 +65,6 @@ Then send InfluxDB data to `http://vmagent-host:8429`. See [these docs](https://
Pass `-help` to `vmagent` in order to see [the full list of supported command-line flags with their descriptions](#advanced-usage).
## Configuration update
`vmagent` should be restarted in order to update config options set via command-line args.
@@ -75,6 +72,7 @@ Pass `-help` to `vmagent` in order to see [the full list of supported command-li
`vmagent` supports multiple approaches for reloading configs from updated config files such as `-promscrape.config`, `-remoteWrite.relabelConfig` and `-remoteWrite.urlRelabelConfig`:
* Sending `SIGHUP` signal to `vmagent` process:
```bash
kill -SIGHUP `pidof vmagent`
```
@@ -83,10 +81,8 @@ Pass `-help` to `vmagent` in order to see [the full list of supported command-li
There is also `-promscrape.configCheckInterval` command-line option, which can be used for automatic reloading configs from updated `-promscrape.config` file.
## Use cases
### IoT and Edge monitoring
`vmagent` can run and collect metrics in IoT and industrial networks with unreliable or scheduled connections to their remote storage.
@@ -97,28 +93,24 @@ The maximum buffer size can be limited with `-remoteWrite.maxDiskUsagePerURL`.
`vmagent` works on various architectures from the IoT world - 32-bit arm, 64-bit arm, ppc64, 386, amd64.
See [the corresponding Makefile rules](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/Makefile) for details.
### Drop-in replacement for Prometheus
If you use Prometheus only for scraping metrics from various targets and forwarding those metrics to remote storage
then `vmagent` can replace Prometheus. Typically, `vmagent` requires lower amounts of RAM, CPU and network bandwidth compared with Prometheus.
See [these docs](#how-to-collect-metrics-in-prometheus-format) for details.
### Replication and high availability
`vmagent` replicates the collected metrics among multiple remote storage instances configured via `-remoteWrite.url` args.
If a single remote storage instance temporarily is out of service, then the collected data remains available in another remote storage instance.
`vmagent` buffers the collected data in files at `-remoteWrite.tmpDataPath` until the remote storage becomes available again and then it sends the buffered data to the remote storage in order to prevent data gaps.
### Relabeling and filtering
`vmagent` can add, remove or update labels on the collected data before sending it to the remote storage. Additionally,
it can remove unwanted samples via Prometheus-like relabeling before sending the collected data to remote storage.
Please see [these docs](#relabeling) for details.
### Splitting data streams among multiple systems
`vmagent` supports splitting the collected data between multiple destinations with the help of `-remoteWrite.urlRelabelConfig`,
@@ -126,7 +118,6 @@ which is applied independently for each configured `-remoteWrite.url` destinatio
data among long-term remote storage, short-term remote storage and a real-time analytical system [built on top of Kafka](https://github.com/Telefonica/prometheus-kafka-adapter).
Note that each destination can receive its own subset of the collected data due to per-destination relabeling via `-remoteWrite.urlRelabelConfig`.
### Prometheus remote_write proxy
`vmagent` can be used as a proxy for Prometheus data sent via Prometheus `remote_write` protocol. It can accept data via the `remote_write` API
@@ -134,7 +125,6 @@ at the`/api/v1/write` endpoint. Then apply relabeling and filtering and proxy it
The `vmagent` can be configured to encrypt the incoming `remote_write` requests with `-tls*` command-line flags.
Also, Basic Auth can be enabled for the incoming `remote_write` requests with `-httpAuth.*` command-line flags.
### remote_write for clustered version
While `vmagent` can accept data in several supported protocols (OpenTSDB, Influx, Prometheus, Graphite) and scrape data from various targets, writes are always performed in Prometheus remote_write protocol. Therefore for the [clustered version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html), the `-remoteWrite.url` command-line flag should be configured as `<schema>://<vminsert-host>:8480/insert/<accountID>/prometheus/api/v1/write` according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format). There is also support for multitenant writes. See [these docs](#multitenancy).
@@ -143,7 +133,6 @@ While `vmagent` can accept data in several supported protocols (OpenTSDB, Influx
By default `vmagent` collects the data without tenant identifiers and routes it to the configured `-remoteWrite.url`. But it can accept multitenant data if `-remoteWrite.multitenantURL` is set. In this case it accepts multitenant data at `http://vmagent:8429/insert/<accountID>/...` in the same way as cluster version of VictoriaMetrics does according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format) and routes it to `<-remoteWrite.multitenantURL>/insert/<accountID>/prometheus/api/v1/write`. If multiple `-remoteWrite.multitenantURL` command-line options are set, then `vmagent` replicates the collected data across all the configured urls. This allows using a single `vmagent` instance in front of VictoriaMetrics clusters for processing the data from all the tenants.
## How to collect metrics in Prometheus format
Specify the path to `prometheus.yml` file via `-promscrape.config` command-line flag. `vmagent` takes into account the following
@@ -211,7 +200,6 @@ entries to 60s. Run `vmagent -help` in order to see default values for the `-pro
The file pointed by `-promscrape.config` may contain `%{ENV_VAR}` placeholders which are substituted by the corresponding `ENV_VAR` environment variable values.
## Loading scrape configs from multiple files
`vmagent` supports loading scrape configs from multiple files specified in the `scrape_config_files` section of `-promscrape.config` file. For example, the following `-promscrape.config` instructs `vmagent` to load scrape configs from all the `*.yml` files under `configs` directory, from `single_scrape_config.yml` local file and from `https://config-server/scrape_config.yml` url:
@@ -236,7 +224,6 @@ Every referred file can contain arbitrary number of [supported scrape configs](#
`vmagent` dynamically reloads these files on `SIGHUP` signal or on the request to `http://vmagent:8429/-/reload`.
## Unsupported Prometheus config sections
`vmagent` doesn't support the following sections in Prometheus config file passed to `-promscrape.config` command-line flag:
@@ -249,7 +236,6 @@ The list of supported service discovery types is available [here](#how-to-collec
Additionally `vmagent` doesn't support `refresh_interval` option at service discovery sections. This option is substituted with `-promscrape.*CheckInterval` command-line options, which are specific per each service discovery type. See [the full list of command-line flags for vmagent](#advanced-usage).
## Adding labels to metrics
Labels can be added to metrics by the following mechanisms:
@@ -261,7 +247,6 @@ Labels can be added to metrics by the following mechanisms:
/path/to/vmagent -remoteWrite.label=datacenter=foobar ...
```
## Relabeling
VictoriaMetrics components (including `vmagent`) support Prometheus-compatible relabeling.
@@ -320,7 +305,6 @@ You can read more about relabeling in the following articles:
* [Extracting labels from legacy metric names](https://www.robustperception.io/extracting-labels-from-legacy-metric-names)
* [relabel_configs vs metric_relabel_configs](https://www.robustperception.io/relabel_configs-vs-metric_relabel_configs)
## Prometheus staleness markers
`vmagent` sends [Prometheus staleness markers](https://www.robustperception.io/staleness-and-promql) to `-remoteWrite.url` in the following cases:
@@ -332,16 +316,15 @@ You can read more about relabeling in the following articles:
Prometheus staleness markers' tracking needs additional memory, since it must store the previous response body per each scrape target in order to compare it to the current response body. The memory usage may be reduced by passing `-promscrape.noStaleMarkers` command-line flag to `vmagent`. This disables staleness tracking. This also disables tracking the number of new time series per each scrape with the auto-generated `scrape_series_added` metric. See [these docs](https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series) for details.
## Stream parsing mode
By default `vmagent` reads the full response body from scrape target into memory, then parses it, applies [relabeling](#relabeling) and then pushes the resulting metrics to the configured `-remoteWrite.url`. This mode works well for the majority of cases when the scrape target exposes a small number of metrics (e.g. less than 10 thousand). But this mode may take large amounts of memory when the scrape target exposes a big number of metrics. In this case it is recommended to enable stream parsing mode. When this mode is enabled, `vmagent` reads the response from the scrape target in chunks, then immediately processes every chunk and pushes the processed metrics to remote storage. This allows saving memory when scraping targets that expose millions of metrics.
Stream parsing mode is automatically enabled for scrape targets returning response bodies with sizes bigger than the `-promscrape.minResponseSizeForStreamParse` command-line flag value. Additionally, the stream parsing mode can be explicitly enabled in the following places:
- Via `-promscrape.streamParse` command-line flag. In this case all the scrape targets defined in the file pointed by `-promscrape.config` are scraped in stream parsing mode.
- Via `stream_parse: true` option at `scrape_configs` section. In this case all the scrape targets defined in this section are scraped in stream parsing mode.
- Via `__stream_parse__=true` label, which can be set via [relabeling](#relabeling) at `relabel_configs` section. In this case stream parsing mode is enabled for the corresponding scrape targets. Typical use case: to set the label via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets exposing big number of metrics.
* Via `-promscrape.streamParse` command-line flag. In this case all the scrape targets defined in the file pointed by `-promscrape.config` are scraped in stream parsing mode.
* Via `stream_parse: true` option at `scrape_configs` section. In this case all the scrape targets defined in this section are scraped in stream parsing mode.
* Via `__stream_parse__=true` label, which can be set via [relabeling](#relabeling) at `relabel_configs` section. In this case stream parsing mode is enabled for the corresponding scrape targets. Typical use case: to set the label via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets exposing big number of metrics.
Examples:
@@ -361,7 +344,6 @@ scrape_configs:
Note that `sample_limit` and `series_limit` options cannot be used in stream parsing mode because the parsed data is pushed to remote storage as soon as it is parsed.
## Scraping big number of targets
A single `vmagent` instance can scrape tens of thousands of scrape targets. Sometimes this isn't enough due to limitations on CPU, network, RAM, etc.
@@ -376,6 +358,8 @@ spread scrape targets among a cluster of two `vmagent` instances:
/path/to/vmagent -promscrape.cluster.membersCount=2 -promscrape.cluster.memberNum=1 -promscrape.config=/path/to/config.yml ...
```
The `-promscrape.cluster.memberNum` can be set to a StatefulSet pod name when `vmagent` runs in Kubernetes. The pod name must end with a number in the range `0 ... promscrape.cluster.membersCount-1`. For example, `-promscrape.cluster.memberNum=vmagent-0`.
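
A sketch of how a pod name such as `vmagent-0` could be turned into a member number, assuming the valid range is bounded by `-promscrape.cluster.membersCount`. The function name `memberNum` is hypothetical and the code is an illustration rather than `vmagent`'s own parsing logic:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// memberNum extracts the numeric suffix after the last '-' in value
// (or parses value as a plain number) and checks that it falls into
// the range 0 ... membersCount-1.
func memberNum(value string, membersCount int) (int, error) {
	s := value
	if idx := strings.LastIndexByte(s, '-'); idx >= 0 {
		s = s[idx+1:]
	}
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("cannot parse member number from %q: %w", value, err)
	}
	if n < 0 || n >= membersCount {
		return 0, fmt.Errorf("member number %d from %q is outside the range 0 ... %d", n, value, membersCount-1)
	}
	return n, nil
}

func main() {
	for _, v := range []string{"vmagent-0", "1", "vmagent-5"} {
		n, err := memberNum(v, 2)
		fmt.Println(v, n, err)
	}
}
```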
By default each scrape target is scraped only by a single `vmagent` instance in the cluster. If there is a need for replicating scrape targets among multiple `vmagent` instances,
then `-promscrape.cluster.replicationFactor` command-line flag must be set to the desired number of replicas. For example, the following commands
start a cluster of three `vmagent` instances, where each target is scraped by two `vmagent` instances:
@@ -389,7 +373,6 @@ start a cluster of three `vmagent` instances, where each target is scraped by tw
If each target is scraped by multiple `vmagent` instances, then data deduplication must be enabled at remote storage pointed by `-remoteWrite.url`.
See [these docs](https://docs.victoriametrics.com/#deduplication) for details.
## Scraping targets via a proxy
`vmagent` supports scraping targets via http, https and socks5 proxies. Proxy address must be specified in `proxy_url` option. For example, the following scrape config instructs
@@ -429,9 +412,9 @@ scrape_configs:
By default `vmagent` doesn't limit the number of time series each scrape target can expose. The limit can be enforced in the following places:
- Via `-promscrape.seriesLimitPerTarget` command-line option. This limit is applied individually to all the scrape targets defined in the file pointed by `-promscrape.config`.
- Via `series_limit` config option at `scrape_config` section. This limit is applied individually to all the scrape targets defined in the given `scrape_config`.
- Via `__series_limit__` label, which can be set with [relabeling](#relabeling) at `relabel_configs` section. This limit is applied to the corresponding scrape targets. Typical use case: to set the limit via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets, which may expose too high number of time series.
* Via `-promscrape.seriesLimitPerTarget` command-line option. This limit is applied individually to all the scrape targets defined in the file pointed by `-promscrape.config`.
* Via `series_limit` config option at `scrape_config` section. This limit is applied individually to all the scrape targets defined in the given `scrape_config`.
* Via `__series_limit__` label, which can be set with [relabeling](#relabeling) at `relabel_configs` section. This limit is applied to the corresponding scrape targets. Typical use case: to set the limit via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets, which may expose too high number of time series.
All the scraped metrics are dropped for time series exceeding the given limit. The exceeded limit can be [monitored](#monitoring) via `promscrape_series_limit_rows_dropped_total` metric.
@@ -451,7 +434,6 @@ The exceeded limits can be [monitored](#monitoring) with the following metrics:
These limits are approximate, so `vmagent` can underflow/overflow the limit by a small percentage (usually less than 1%).
## Monitoring
`vmagent` exports various metrics in Prometheus exposition format at `http://vmagent-host:8429/metrics` page. We recommend setting up regular scraping of this page
@@ -470,7 +452,6 @@ This information may be useful for debugging target relabeling.
* `http://vmagent-host:8429/ready`. This handler returns http 200 status code when `vmagent` finishes its initialization for all service_discovery configs.
It may be useful to perform `vmagent` rolling update without any scrape loss.
## Troubleshooting
* We recommend you [set up the official Grafana dashboard](#monitoring) in order to monitor the state of `vmagent`.
@@ -534,12 +515,14 @@ It may be useful to perform `vmagent` rolling update without any scrape loss.
See the available options below if you prefer fixing the root cause of the error:
The following relabeling rule may be added to `relabel_configs` section in order to filter out pods with unneeded ports:
```yml
- action: keep_if_equal
source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]
```
The following relabeling rule may be added to `relabel_configs` section in order to filter out init container pods:
```yml
- action: drop
source_labels: [__meta_kubernetes_pod_container_init]
@@ -555,7 +538,6 @@ It may be useful to perform `vmagent` rolling update without any scrape loss.
The enterprise version of vmagent is available for evaluation at [releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) page in `vmutils-*-enterprise.tar.gz` archives and in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix.
### Reading metrics from Kafka
[Enterprise version](https://victoriametrics.com/products/enterprise/) of `vmagent` can read metrics in various formats from Kafka messages. These formats can be configured with `-kafka.consumer.topic.defaultFormat` or `-kafka.consumer.topic.format` command-line options. The following formats are supported:
@@ -591,7 +573,6 @@ topic = "influx"
data_format = "influx"
```
#### Command-line flags for Kafka consumer
These command-line flags are available only in the [enterprise](https://victoriametrics.com/products/enterprise/) version of `vmagent`, which can be downloaded for evaluation from the [releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) page (see `vmutils-*-enterprise.tar.gz` archives) and from [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing the `enterprise` suffix.
@@ -631,7 +612,6 @@ These command-line flags are available only in [enterprise](https://victoriametr
Additional Kafka options can be passed as query params to `-remoteWrite.url`. For instance, `kafka://localhost:9092/?topic=prom-rw&client.id=my-favorite-id` sets `client.id` Kafka option to `my-favorite-id`. The full list of Kafka options is available [here](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md).
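
To make the URL structure concrete, the sketch below splits such a `kafka://` URL into the broker address, the topic and the remaining Kafka options. It only illustrates how the query string is laid out. It does not reproduce how `vmagent` parses the URL, and the helper name `split_kafka_url` is hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def split_kafka_url(url):
    """Split a kafka:// URL into (broker, topic, remaining query options)."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    # `topic` selects the Kafka topic; the other params are treated as Kafka options.
    topic = params.pop("topic", None)
    return parts.netloc, topic, params


broker, topic, options = split_kafka_url(
    "kafka://localhost:9092/?topic=prom-rw&client.id=my-favorite-id"
)
```

Here `broker` is `localhost:9092`, `topic` is `prom-rw` and `options` is `{"client.id": "my-favorite-id"}`.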
#### Kafka broker authorization and authentication
Two types of auth are supported:
@@ -648,12 +628,10 @@ Two types of auth are supported:
./bin/vmagent -remoteWrite.url=kafka://localhost:9092/?topic=prom-rw&security.protocol=SSL -remoteWrite.tlsCAFile=/opt/ca.pem -remoteWrite.tlsCertFile=/opt/cert.pem -remoteWrite.tlsKeyFile=/opt/key.pem
```
## How to build from sources
We recommend using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - `vmagent` is located in the `vmutils-*` archives.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17.
@@ -695,7 +673,6 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
2. Run `make vmagent-arm-prod` or `make vmagent-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmagent-arm-prod` or `vmagent-arm64-prod` binary respectively and puts it into the `bin` folder.
## Profiling
`vmagent` provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs):
@@ -724,7 +701,6 @@ The command for collecting CPU profile waits for 30 seconds before returning.
The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).
## Advanced usage
`vmagent` can be fine-tuned with various command-line flags. Run `./vmagent -help` in order to see the full list of these flags with their descriptions and default values:
@@ -737,314 +713,314 @@ vmagent collects metrics data via popular data ingestion protocols and routes th
See the docs at https://docs.victoriametrics.com/vmagent.html .
-configAuthKey string
Authorization key for accessing /config page. It must be passed via authKey query arg
Authorization key for accessing /config page. It must be passed via authKey query arg
-csvTrimTimestamp duration
Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-datadog.maxInsertRequestSize size
The maximum size in bytes of a single DataDog POST request to /api/v1/series
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864)
The maximum size in bytes of a single DataDog POST request to /api/v1/series
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864)
-dryRun
Whether to check only config files without running vmagent. The following files are checked: -promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig . Unknown config entries aren't allowed in -promscrape.config by default. This can be changed by passing -promscrape.config.strictParse=false command-line flag
Whether to check only config files without running vmagent. The following files are checked: -promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig . Unknown config entries aren't allowed in -promscrape.config by default. This can be changed by passing -promscrape.config.strictParse=false command-line flag
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-graphiteListenAddr string
TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty
TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty
-graphiteTrimTimestamp duration
Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s)
Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s)
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections. Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. Note that /targets and /metrics pages aren't available if -httpListenAddr='' (default ":8429")
TCP address to listen for http connections. Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. Note that /targets and /metrics pages aren't available if -httpListenAddr='' (default ":8429")
-import.maxLineLen size
The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600)
The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600)
-influx.databaseNames array
Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb
Supports an array of values separated by comma or specified via multiple flags.
Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb
Supports an array of values separated by comma or specified via multiple flags.
-influx.maxLineSize size
The maximum size in bytes for a single InfluxDB line during parsing
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144)
The maximum size in bytes for a single InfluxDB line during parsing
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144)
-influxDBLabel string
Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
-influxListenAddr string
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<vmagent>:8429/write
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<vmagent>:8429/write
-influxMeasurementFieldSeparator string
Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
-influxSkipMeasurement
Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator'
Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator'
-influxSkipSingleField
Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metric name if InfluxDB line contains only a single field
Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metric name if InfluxDB line contains only a single field
-influxTrimTimestamp duration
Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-insert.maxQueueDuration duration
The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s)
The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s)
-kafka.consumer.topic array
Kafka topic names for data consumption.
Supports an array of values separated by comma or specified via multiple flags.
Kafka topic names for data consumption.
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.basicAuth.password array
Optional basic auth password for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN'
Supports an array of values separated by comma or specified via multiple flags.
Optional basic auth password for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN'
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.basicAuth.username array
Optional basic auth username for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN'
Supports an array of values separated by comma or specified via multiple flags.
Optional basic auth username for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN'
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.brokers array
List of brokers to connect for given topic, e.g. -kafka.consumer.topic.broker=host-1:9092;host-2:9092
Supports an array of values separated by comma or specified via multiple flags.
List of brokers to connect for given topic, e.g. -kafka.consumer.topic.broker=host-1:9092;host-2:9092
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.defaultFormat string
Expected data format in the topic if -kafka.consumer.topic.format is skipped. (default "promremotewrite")
Expected data format in the topic if -kafka.consumer.topic.format is skipped. (default "promremotewrite")
-kafka.consumer.topic.format array
data format for corresponding kafka topic. Valid formats: influx, prometheus, promremotewrite, graphite, jsonline
Supports an array of values separated by comma or specified via multiple flags.
data format for corresponding kafka topic. Valid formats: influx, prometheus, promremotewrite, graphite, jsonline
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.groupID array
Defines group.id for topic
Supports an array of values separated by comma or specified via multiple flags.
Defines group.id for topic
Supports an array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.isGzipped array
Enables gzip setting for topic messages payload. Only prometheus, jsonline and influx formats accept gzipped messages.
Supports array of values separated by comma or specified via multiple flags.
Enables gzip setting for topic messages payload. Only prometheus, jsonline and influx formats accept gzipped messages.
Supports array of values separated by comma or specified via multiple flags.
-kafka.consumer.topic.options array
Optional key=value;key1=value2 settings for topic consumer. See full configuration options at https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md.
Supports an array of values separated by comma or specified via multiple flags.
Optional key=value;key1=value2 settings for topic consumer. See full configuration options at https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md.
Supports an array of values separated by comma or specified via multiple flags.
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxConcurrentInserts int
The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tightly coupled with -insert.maxQueueDuration (default 16)
The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tightly coupled with -insert.maxQueueDuration (default 16)
-maxInsertRequestSize size
The maximum size in bytes of a single Prometheus remote_write API request
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432)
The maximum size in bytes of a single Prometheus remote_write API request
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
-opentsdbHTTPListenAddr string
TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty
TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty
-opentsdbListenAddr string
TCP and UDP address to listen for OpenTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty
TCP and UDP address to listen for OpenTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty
-opentsdbTrimTimestamp duration
Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s)
Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s)
-opentsdbhttp.maxInsertRequestSize size
The maximum size of OpenTSDB HTTP put request
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432)
The maximum size of OpenTSDB HTTP put request
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432)
-opentsdbhttpTrimTimestamp duration
Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms)
-pprofAuthKey string
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
-promscrape.cluster.memberNum int
The number of this member in the cluster of scrapers. It must be a unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster
The number of this member in the cluster of scrapers. It must be a unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster
-promscrape.cluster.membersCount int
The number of members in a cluster of scrapers. Each member must have a unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
The number of members in a cluster of scrapers. Each member must have a unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
-promscrape.cluster.replicationFactor int
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
-promscrape.config string
Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details
Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details
-promscrape.config.dryRun
Checks -promscrape.config file for errors and unsupported fields and then exits. Returns non-zero exit code on parsing errors and emits these errors to stderr. See also -promscrape.config.strictParse command-line flag. Pass -loggerLevel=ERROR if you don't need to see info messages in the output.
Checks -promscrape.config file for errors and unsupported fields and then exits. Returns non-zero exit code on parsing errors and emits these errors to stderr. See also -promscrape.config.strictParse command-line flag. Pass -loggerLevel=ERROR if you don't need to see info messages in the output.
-promscrape.config.strictParse
Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true)
Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true)
-promscrape.configCheckInterval duration
Interval for checking for changes in '-promscrape.config' file. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes
Interval for checking for changes in '-promscrape.config' file. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes
-promscrape.consul.waitTime duration
Wait time used by Consul service discovery. Default value is used if not set
Wait time used by Consul service discovery. Default value is used if not set
-promscrape.consulSDCheckInterval duration
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
-promscrape.digitaloceanSDCheckInterval duration
Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s)
Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s)
-promscrape.disableCompression
Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control
Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control
-promscrape.disableKeepAlive
Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets have no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets
Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets have no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets
-promscrape.discovery.concurrency int
The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100)
The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100)
-promscrape.discovery.concurrentWaitTime duration
The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s)
The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s)
-promscrape.dnsSDCheckInterval duration
Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s)
Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s)
-promscrape.dockerSDCheckInterval duration
Interval for checking for changes in docker. This works only if docker_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#docker_sd_config for details (default 30s)
Interval for checking for changes in docker. This works only if docker_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#docker_sd_config for details (default 30s)
-promscrape.dockerswarmSDCheckInterval duration
Interval for checking for changes in dockerswarm. This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details (default 30s)
-promscrape.dropOriginalLabels
Whether to drop original labels for scrape targets at /targets and /api/v1/targets pages. This may be needed for reducing memory usage when original labels for big number of scrape targets occupy big amounts of memory. Note that this reduces debuggability for improper per-target relabeling configs
-promscrape.ec2SDCheckInterval duration
Interval for checking for changes in ec2. This works only if ec2_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config for details (default 1m0s)
-promscrape.eurekaSDCheckInterval duration
Interval for checking for changes in eureka. This works only if eureka_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config for details (default 30s)
-promscrape.fileSDCheckInterval duration
Interval for checking for changes in 'file_sd_config'. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config for details (default 5m0s)
-promscrape.gceSDCheckInterval duration
Interval for checking for changes in gce. This works only if gce_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config for details (default 1m0s)
-promscrape.httpSDCheckInterval duration
Interval for checking for changes in http endpoint service discovery. This works only if http_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config for details (default 1m0s)
-promscrape.kubernetes.apiServerTimeout duration
How frequently to reload the full state from Kubernetes API server (default 30m0s)
-promscrape.kubernetesSDCheckInterval duration
Interval for checking for changes in Kubernetes API server. This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config for details (default 30s)
-promscrape.maxDroppedTargets int
The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need to investigate labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000)
-promscrape.maxResponseHeadersSize size
The maximum size of http response headers from Prometheus scrape targets
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 4096)
-promscrape.maxScrapeSize size
The maximum size of scrape response in bytes to process from Prometheus targets. Bigger responses are rejected
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16777216)
-promscrape.minResponseSizeForStreamParse size
The minimum target response size for automatic switching to stream parsing mode, which can reduce memory usage. See https://docs.victoriametrics.com/vmagent.html#stream-parsing-mode
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 1000000)
-promscrape.noStaleMarkers
Whether to disable sending Prometheus stale markers for metrics when scrape target disappears. This option may reduce memory usage if stale markers aren't needed for your setup. This option also disables populating the scrape_series_added metric. See https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series
-promscrape.openstackSDCheckInterval duration
Interval for checking for changes in openstack API server. This works only if openstack_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config for details (default 30s)
-promscrape.seriesLimitPerTarget int
Optional limit on the number of unique time series a single scrape target can expose. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info
-promscrape.streamParse
Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is possible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control
-promscrape.suppressDuplicateScrapeTargetErrors
Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details
-promscrape.suppressScrapeErrors
Whether to suppress scrape errors logging. The last error for each target is always available at '/targets' page even if scrape errors logging is suppressed
-remoteWrite.basicAuth.password array
Optional basic auth password to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.basicAuth.passwordFile array
Optional path to basic auth password to use for -remoteWrite.url. The file is re-read every second. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.basicAuth.username array
Optional basic auth username to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.bearerToken array
Optional bearer auth token to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.bearerTokenFile array
Optional path to bearer token file to use for -remoteWrite.url. The token is re-read from the file every second. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.flushInterval duration
Interval for flushing the data to remote storage. This option takes effect only when less than 10K data points per second are pushed to -remoteWrite.url (default 1s)
-remoteWrite.label array
Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. Pass multiple -remoteWrite.label flags in order to add multiple labels to metrics before sending them to remote storage
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.maxBlockSize size
The maximum block size to send to remote storage. Bigger blocks may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxRowsPerBlock
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 8388608)
-remoteWrite.maxDailySeries int
The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter
-remoteWrite.maxDiskUsagePerURL size
The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500000000. Disk usage is unlimited if the value is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-remoteWrite.maxHourlySeries int
The maximum number of unique series vmagent can send to remote storage systems during the last hour. Excess series are logged and dropped. This can be useful for limiting series cardinality. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter
-remoteWrite.maxRowsPerBlock int
The maximum number of samples to send in each block to remote storage. Higher number may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxBlockSize (default 10000)
-remoteWrite.multitenantURL array
Base path for multitenant remote storage URL to write data to. See https://docs.victoriametrics.com/vmagent.html#multitenancy for details. Example url: http://<vminsert>:8480 . Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.oauth2.clientID array
Optional OAuth2 clientID to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.oauth2.clientSecret array
Optional OAuth2 clientSecret to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.oauth2.clientSecretFile array
Optional OAuth2 clientSecretFile to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.oauth2.scopes array
Optional OAuth2 scopes to use for -remoteWrite.url. Scopes must be delimited by ';'. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.oauth2.tokenUrl array
Optional OAuth2 tokenURL to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.proxyURL array
Optional proxy URL for writing data to -remoteWrite.url. Supported proxies: http, https, socks5. Example: -remoteWrite.proxyURL=socks5://proxy:1234
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.queues int
The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues isn't enough for sending high volume of collected data to remote storage. Default value is 2 * numberOfAvailableCPUs (default 8)
-remoteWrite.rateLimit array
Optional rate limit in bytes per second for data sent to -remoteWrite.url. By default the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data is sent after temporary unavailability of the remote storage
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.relabelConfig string
Optional path to file with relabel_config entries. The path can point either to local file or to http url. These entries are applied to all the metrics before sending them to -remoteWrite.url. See https://docs.victoriametrics.com/vmagent.html#relabeling for details
-remoteWrite.relabelDebug
Whether to log metrics before and after relabeling with -remoteWrite.relabelConfig. If the -remoteWrite.relabelDebug is enabled, then the metrics aren't sent to remote storage. This is useful for debugging the relabeling configs
-remoteWrite.roundDigits array
Round metric values to this number of decimal digits after the point before writing them to remote storage. Examples: -remoteWrite.roundDigits=2 would round 1.236 to 1.24, while -remoteWrite.roundDigits=-1 would round 126.78 to 130. By default digits rounding is disabled. Set it to 100 for disabling it for a particular remote storage. This option may be used for improving data compression for the stored metrics
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.sendTimeout array
Timeout for sending a single block of data to -remoteWrite.url
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.showURL
Whether to show -remoteWrite.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key
-remoteWrite.significantFigures array
The number of significant figures to leave in metric values before writing them to remote storage. See https://en.wikipedia.org/wiki/Significant_figures . Zero value saves all the significant figures. This option may be used for improving data compression for the stored metrics. See also -remoteWrite.roundDigits
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.tlsCAFile array
Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. By default system CA is used. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.tlsCertFile array
Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.tlsInsecureSkipVerify array
Whether to skip tls verification when connecting to -remoteWrite.url
Supports array of values separated by comma or specified via multiple flags.
-remoteWrite.tlsKeyFile array
Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.tlsServerName array
Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.tmpDataPath string
Path to directory where temporary data for remote write component is stored. See also -remoteWrite.maxDiskUsagePerURL (default "vmagent-remotewrite-data")
-remoteWrite.url array
Remote storage URL to write data to. It must support Prometheus remote_write API. It is recommended using VictoriaMetrics as remote storage. Example url: http://<victoriametrics-host>:8428/api/v1/write . Pass multiple -remoteWrite.url flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.multitenantURL
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.urlRelabelConfig array
Optional path to relabel config for the corresponding -remoteWrite.url. The path can point either to local file or to http url
Supports an array of values separated by comma or specified via multiple flags.
-remoteWrite.urlRelabelDebug array
Whether to log metrics before and after relabeling with -remoteWrite.urlRelabelConfig. If the -remoteWrite.urlRelabelDebug is enabled, then the metrics aren't sent to the corresponding -remoteWrite.url. This is useful for debugging the relabeling configs
Supports array of values separated by comma or specified via multiple flags.
-sortLabels
Whether to sort labels for incoming samples before writing them to all the configured remote storage systems. This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-version
Show VictoriaMetrics version
Show VictoriaMetrics version
```
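The effect described for `-sortLabels` in the flag list above can be illustrated with a small Go sketch. It is not vmagent's implementation: the `Label` type and the `canonicalKey` helper are hypothetical names introduced only for this example. It shows that two label orderings of the same series collapse to one key once labels are sorted by name.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Label is a hypothetical name/value pair used only in this sketch;
// it is not vmagent's internal label type.
type Label struct {
	Name, Value string
}

// canonicalKey copies the labels, sorts the copy by label name and
// renders the series in a stable textual form.
func canonicalKey(metric string, labels []Label) string {
	sorted := append([]Label(nil), labels...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
	parts := make([]string, len(sorted))
	for i, l := range sorted {
		parts[i] = fmt.Sprintf("%s=%q", l.Name, l.Value)
	}
	return metric + "{" + strings.Join(parts, ",") + "}"
}

func main() {
	a := canonicalKey("m", []Label{{"k1", "v1"}, {"k2", "v2"}})
	b := canonicalKey("m", []Label{{"k2", "v2"}, {"k1", "v1"}})
	fmt.Println(a)      // the series rendered with sorted labels
	fmt.Println(a == b) // both orderings map to the same key
}
```

Without sorting, a storage backend that keys series by their raw label order could treat the two inputs as different series, which is the memory overhead the flag description refers to.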


@@ -154,6 +154,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.Method != "GET" {
return false
}
w.Header().Add("Content-Type", "text/html; charset=utf-8")
fmt.Fprintf(w, "<h2>vmagent</h2>")
fmt.Fprintf(w, "See docs at <a href='https://docs.victoriametrics.com/vmagent.html'>https://docs.victoriametrics.com/vmagent.html</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")


@@ -30,8 +30,9 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
if err != nil {
return err
}
isGzip := req.Header.Get("Content-Encoding") == "gzip"
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(block *parser.Block) error {
return parser.ParseStream(req.Body, isGzip, func(block *parser.Block) error {
return insertRows(at, block, extraLabels)
})
})


@@ -92,9 +92,9 @@ func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persiste
}
tlsCfg := authCfg.NewTLSConfig()
tr := &http.Transport{
Dial: statDial,
DialContext: statDial,
TLSClientConfig: tlsCfg,
TLSHandshakeTimeout: 5 * time.Second,
TLSHandshakeTimeout: 10 * time.Second,
MaxConnsPerHost: 2 * concurrency,
MaxIdleConnsPerHost: 2 * concurrency,
IdleConnTimeout: time.Minute,


@@ -1,7 +1,9 @@
package remotewrite
import (
"context"
"net"
"sync"
"sync/atomic"
"time"
@@ -9,9 +11,26 @@ import (
"github.com/VictoriaMetrics/metrics"
)
func statDial(networkUnused, addr string) (conn net.Conn, err error) {
func getStdDialer() *net.Dialer {
stdDialerOnce.Do(func() {
stdDialer = &net.Dialer{
Timeout: 30 * time.Second,
KeepAlive: 30 * time.Second,
DualStack: netutil.TCP6Enabled(),
}
})
return stdDialer
}
var (
stdDialer *net.Dialer
stdDialerOnce sync.Once
)
func statDial(ctx context.Context, networkUnused, addr string) (conn net.Conn, err error) {
network := netutil.GetTCPNetwork()
conn, err = net.DialTimeout(network, addr, 5*time.Second)
d := getStdDialer()
conn, err = d.DialContext(ctx, network, addr)
dialsTotal.Inc()
if err != nil {
dialErrors.Inc()


@@ -10,6 +10,7 @@ Vmalert is heavily inspired by [Prometheus](https://prometheus.io/docs/alerting/
implementation and aims to be compatible with its syntax.
## Features
* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
* VictoriaMetrics [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html)
support and expressions validation;
@@ -22,8 +23,9 @@ implementation and aims to be compatible with its syntax.
* Lightweight without extra dependencies.
## Limitations
* `vmalert` executes queries against a remote datasource, which carries reliability risks because of the network.
It is recommended to configure alerts thresholds and rules expressions with the understanding that network
It is recommended to configure alerts thresholds and rules expressions with the understanding that network
requests may fail;
* by default, rules execution is sequential within one group, but persistence of execution results to remote
storage is asynchronous. Hence, user shouldn't rely on chaining of recording rules when result of previous
@@ -32,25 +34,29 @@ recording rule is reused in the next one;
## QuickStart
To build `vmalert` from sources:
```
```bash
git clone https://github.com/VictoriaMetrics/VictoriaMetrics
cd VictoriaMetrics
make vmalert
```
The built binary will be placed in the `VictoriaMetrics/bin` folder.
To start using `vmalert` you will need the following things:
* list of rules - PromQL/MetricsQL expressions to execute;
* datasource address - reachable MetricsQL endpoint to run queries against;
* notifier address [optional] - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
aggregating alerts, and sending notifications. Please note, notifier address also supports Consul Service Discovery via
aggregating alerts, and sending notifications. Please note, notifier address also supports Consul Service Discovery via
[config file](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go).
* remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
compatible storage to persist rules and alerts state info;
* remote read address [optional] - MetricsQL compatible datasource to restore alerts state from.
Then configure `vmalert` accordingly:
```
```bash
./bin/vmalert -rule=alert.rules \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://localhost:8428 \ # PromQL compatible datasource
-notifier.url=http://localhost:9093 \ # AlertManager URL (required if alerting rules are used)
@@ -77,6 +83,7 @@ and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerti
similar to Prometheus rules and configured using YAML. Configuration examples may be found
in [testdata](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/config/testdata) folder.
Every `rule` belongs to a `group` and every configuration file may contain arbitrary number of groups:
```yaml
groups:
[ - <rule_group> ]
@@ -85,6 +92,7 @@ groups:
### Groups
Each group has the following attributes:
```yaml
# The name of the group. Must be unique within a file.
name: <string>
@@ -136,6 +144,7 @@ or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expression. Vmal
expression and then act according to the Rule type.
There are two types of Rules:
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
Alerting rules allow defining alert conditions via `expr` field and to send notifications to
[Alertmanager](https://github.com/prometheus/alertmanager) if execution result is not empty.
@@ -150,6 +159,7 @@ within one group.
#### Alerting rules
The syntax for alerting rule is the following:
```yaml
# The name of the alert. Must be a valid metric name.
alert: <string>
@@ -182,6 +192,7 @@ listed [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app
#### Recording rules
The syntax for recording rules is the following:
```yaml
# The name of the time series to output to. Must be a valid metric name.
record: <string>
@@ -198,11 +209,11 @@ labels:
For recording rules to work `-remoteWrite.url` must be specified.
### Alerts state on restarts
`vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after restart of `vmalert`
the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
These are regular time series and may be queried from VM just as any other time series.
@@ -214,7 +225,6 @@ Both flags are required for proper state restoration. Restore process may fail i
in configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`)
or received state doesn't match current `vmalert` rules configuration.
### Multitenancy
The following approaches exist for alerting and recording rules across
@@ -259,10 +269,11 @@ tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags).
### Topology examples
The following sections are showing how `vmalert` may be used and configured
for different scenarios.
The following sections are showing how `vmalert` may be used and configured
for different scenarios.
Please note, not all flags in examples are required:
Please note, not all flags in examples are required:
* `-remoteWrite.url` and `-remoteRead.url` are optional and are needed only if
you have recording rules or want to store [alerts state](#alerts-state-on-restarts) on `vmalert` restarts;
* `-notifier.url` is optional and is needed only if you have alerting rules.
@@ -273,6 +284,7 @@ The simplest configuration where one single-node VM server is used for
rules execution, storing recording rules results and alerts state.
`vmalert` configuration flags:
```
./bin/vmalert -rule=rules.yml \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://victoriametrics:8428 \ # VM-single addr for executing rules expressions
@@ -283,16 +295,16 @@ rules execution, storing recording rules results and alerts state.
<img alt="vmalert single" width="500" src="vmalert_single.png">
#### Cluster VictoriaMetrics
In [cluster mode](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html)
VictoriaMetrics has separate components for writing and reading path:
`vminsert` and `vmselect` components respectively. `vmselect` is used for executing rules expressions
and `vminsert` is used to persist recording rules results and alerts state.
Cluster mode could have multiple `vminsert` and `vmselect` components.
Cluster mode could have multiple `vminsert` and `vmselect` components.
`vmalert` configuration flags:
```
./bin/vmalert -rule=rules.yml \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://vmselect:8481/select/0/prometheus # vmselect addr for executing rules expressions
@@ -315,6 +327,7 @@ the same destinations, and send alert notifications to multiple configured
Alertmanagers.
`vmalert` configuration flags:
```
./bin/vmalert -rule=rules.yml \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://victoriametrics:8428 \ # VM-single addr for executing rules expressions
@@ -335,9 +348,8 @@ all `vmalert`s are having the same config.
Don't forget to configure [cluster mode](https://prometheus.io/docs/alerting/latest/alertmanager/)
for Alertmanagers for better reliability.
This example uses single-node VM server for the sake of simplicity.
Check how to replace it with [cluster VictoriaMetrics](#cluster-victoriametrics) if needed.
This example uses single-node VM server for the sake of simplicity.
Check how to replace it with [cluster VictoriaMetrics](#cluster-victoriametrics) if needed.
#### Downsampling and aggregation via vmalert
@@ -349,6 +361,7 @@ recording rules to process raw data from "hot" cluster (by applying additional t
or reducing resolution) and push results to "cold" cluster.
`vmalert` configuration flags:
```
./bin/vmalert -rule=downsampling-rules.yml \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://raw-cluster-vmselect:8481/select/0/prometheus # vmselect addr for executing recording rules expressions
@@ -363,19 +376,18 @@ Flags `-remoteRead.url` and `-notifier.url` are omitted since we assume only rec
See also [downsampling docs](https://docs.victoriametrics.com/#downsampling).
### Web
`vmalert` runs a web-server (`-httpListenAddr`) for serving metrics and alerts endpoints:
* `http://<vmalert-addr>` - UI;
* `http://<vmalert-addr>/api/v1/groups` - list of all loaded groups and rules;
* `http://<vmalert-addr>/api/v1/rules` - list of all loaded groups and rules;
* `http://<vmalert-addr>/api/v1/alerts` - list of all active alerts;
* `http://<vmalert-addr>/api/v1/<groupID>/<alertID>/status" ` - get alert status by ID.
* `http://<vmalert-addr>/api/v1/<groupID>/<alertID>/status` - get alert status by ID.
Used as alert source in AlertManager.
* `http://<vmalert-addr>/metrics` - application metrics.
* `http://<vmalert-addr>/-/reload` - hot configuration reload.
## Graphite
vmalert sends requests to `<-datasource.url>/render?format=json` during evaluation of alerting and recording rules
@@ -395,6 +407,7 @@ data source for backfilling.
In `replay` mode vmalert works as a cli-tool and exits immediately after work is done.
To run vmalert in `replay` mode:
```
./bin/vmalert -rule=path/to/your.rules \ # path to files with rules you usually use with vmalert
-datasource.url=http://localhost:8428 \ # PromQL/MetricsQL compatible datasource
@@ -404,6 +417,7 @@ To run vmalert in `replay` mode:
```
The output of the command will look like the following:
```
Replay mode:
from: 2021-05-11 07:21:43 +0000 UTC # set by -replay.timeFrom
@@ -447,9 +461,11 @@ The result of recording rules `replay` should match with results of normal rules
The result of alerting rules `replay` is time series reflecting [alert's state](#alerts-state-on-restarts).
To see if `replayed` alert has fired in the past use the following PromQL/MetricsQL expression:
```
ALERTS{alertname="your_alertname", alertstate="firing"}
```
Execute the query against storage which was used for `-remoteWrite.url` during the `replay`.
### Additional configuration
@@ -473,7 +489,6 @@ See full description for these flags in `./vmalert --help`.
* Graphite engine isn't supported yet;
* `query` template function is disabled for performance reasons (might be changed in future);
## Monitoring
`vmalert` exports various metrics in Prometheus exposition format at `http://vmalert-host:8880/metrics` page.
@@ -484,7 +499,6 @@ Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/1495
If you have suggestions for improvements or have found a bug - please open an issue on github or add
a review to the dashboard.
## Configuration
### Flags
@@ -493,305 +507,310 @@ Pass `-help` to `vmalert` in order to see the full list of supported
command-line flags with their descriptions.
The shortlist of configuration flags is the following:
```
-clusterMode
If clusterMode is enabled, then vmalert automatically adds the tenant specified in config groups to -datasource.url, -remoteWrite.url and -remoteRead.url. See https://docs.victoriametrics.com/vmalert.html#multitenancy
If clusterMode is enabled, then vmalert automatically adds the tenant specified in config groups to -datasource.url, -remoteWrite.url and -remoteRead.url. See https://docs.victoriametrics.com/vmalert.html#multitenancy
-configCheckInterval duration
Interval for checking for changes in '-rule' or '-notifier.config' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes.
Interval for checking for changes in '-rule' or '-notifier.config' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes.
-datasource.appendTypePrefix
Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.
Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.
-datasource.basicAuth.password string
Optional basic auth password for -datasource.url
Optional basic auth password for -datasource.url
-datasource.basicAuth.passwordFile string
Optional path to basic auth password to use for -datasource.url
Optional path to basic auth password to use for -datasource.url
-datasource.basicAuth.username string
Optional basic auth username for -datasource.url
Optional basic auth username for -datasource.url
-datasource.bearerToken string
Optional bearer auth token to use for -datasource.url.
Optional bearer auth token to use for -datasource.url.
-datasource.bearerTokenFile string
Optional path to bearer token file to use for -datasource.url.
Optional path to bearer token file to use for -datasource.url.
-datasource.disableKeepAlive
Whether to disable long-lived connections to the datasource. If true, disables HTTP keep-alives and will only use the connection to the server for a single HTTP request.
-datasource.lookback duration
Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query.
Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query.
-datasource.maxIdleConnections int
Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state. (default 100)
Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state. (default 100)
-datasource.oauth2.clientID string
Optional OAuth2 clientID to use for -datasource.url.
Optional OAuth2 clientID to use for -datasource.url.
-datasource.oauth2.clientSecret string
Optional OAuth2 clientSecret to use for -datasource.url.
Optional OAuth2 clientSecret to use for -datasource.url.
-datasource.oauth2.clientSecretFile string
Optional OAuth2 clientSecretFile to use for -datasource.url.
Optional OAuth2 clientSecretFile to use for -datasource.url.
-datasource.oauth2.scopes string
Optional OAuth2 scopes to use for -datasource.url. Scopes must be delimited by ';'
Optional OAuth2 scopes to use for -datasource.url. Scopes must be delimited by ';'
-datasource.oauth2.tokenUrl string
Optional OAuth2 tokenURL to use for -datasource.url.
Optional OAuth2 tokenURL to use for -datasource.url.
-datasource.queryStep duration
queryStep defines how far a value can fallback to when evaluating queries. For example, if datasource.queryStep=15s then param "step" with value "15s" will be added to every query. If queryStep isn't specified, rule's evaluationInterval will be used instead.
queryStep defines how far a value can fallback to when evaluating queries. For example, if datasource.queryStep=15s then param "step" with value "15s" will be added to every query. If queryStep isn't specified, rule's evaluationInterval will be used instead.
-datasource.queryTimeAlignment
Whether to align "time" parameter with evaluation interval.Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257 (default true)
Whether to align "time" parameter with evaluation interval.Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257 (default true)
-datasource.roundDigits int
Adds "round_digits" GET param to datasource requests. In VM "round_digits" limits the number of digits after the decimal point in response values.
Adds "round_digits" GET param to datasource requests. In VM "round_digits" limits the number of digits after the decimal point in response values.
-datasource.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -datasource.url. By default, system CA is used
Optional path to TLS CA file to use for verifying connections to -datasource.url. By default, system CA is used
-datasource.tlsCertFile string
Optional path to client-side TLS certificate file to use when connecting to -datasource.url
Optional path to client-side TLS certificate file to use when connecting to -datasource.url
-datasource.tlsInsecureSkipVerify
Whether to skip tls verification when connecting to -datasource.url
Whether to skip tls verification when connecting to -datasource.url
-datasource.tlsKeyFile string
Optional path to client-side TLS certificate key to use when connecting to -datasource.url
Optional path to client-side TLS certificate key to use when connecting to -datasource.url
-datasource.tlsServerName string
Optional TLS server name to use for connections to -datasource.url. By default, the server name from -datasource.url is used
Optional TLS server name to use for connections to -datasource.url. By default, the server name from -datasource.url is used
-datasource.url string
VictoriaMetrics or vmselect url. Required parameter. E.g. http://127.0.0.1:8428
VictoriaMetrics or vmselect url. Required parameter. E.g. http://127.0.0.1:8428
-defaultTenant.graphite string
Default tenant for Graphite alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy
Default tenant for Graphite alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy
-defaultTenant.prometheus string
Default tenant for Prometheus alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy
Default tenant for Prometheus alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy
-disableAlertgroupLabel
Whether to disable adding group's Name as label to generated alerts and time series.
Whether to disable adding group's Name as label to generated alerts and time series.
-dryRun -rule
Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.
Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
-evaluationInterval duration
How often to evaluate the rules (default 1m0s)
How often to evaluate the rules (default 1m0s)
-external.alert.source string
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|queryEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|queryEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used
-external.label array
Optional label in the form 'Name=value' to add to all generated recording rules and alerts. Pass multiple -label flags in order to add multiple label sets.
Supports an array of values separated by comma or specified via multiple flags.
Optional label in the form 'Name=value' to add to all generated recording rules and alerts. Pass multiple -label flags in order to add multiple label sets.
Supports an array of values separated by comma or specified via multiple flags.
-external.url string
External URL is used as alert's source for sent alerts to the notifier
External URL is used as alert's source for sent alerts to the notifier
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
Address to listen for http connections (default ":8880")
Address to listen for http connections (default ":8880")
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
-notifier.basicAuth.password array
Optional basic auth password for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional basic auth password for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.basicAuth.passwordFile array
Optional path to basic auth password file for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional path to basic auth password file for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.basicAuth.username array
Optional basic auth username for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional basic auth username for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.bearerToken array
Optional bearer token for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional bearer token for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.bearerTokenFile array
Optional path to bearer token file for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional path to bearer token file for -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.config string
Path to configuration file for notifiers
Path to configuration file for notifiers
-notifier.oauth2.clientID array
Optional OAuth2 clientID to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional OAuth2 clientID to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.oauth2.clientSecret array
Optional OAuth2 clientSecret to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional OAuth2 clientSecret to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.oauth2.clientSecretFile array
Optional OAuth2 clientSecretFile to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional OAuth2 clientSecretFile to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.oauth2.scopes array
Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. If multiple args are set, then they are applied independently for the corresponding -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. If multiple args are set, then they are applied independently for the corresponding -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.oauth2.tokenUrl array
Optional OAuth2 tokenURL to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional OAuth2 tokenURL to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.suppressDuplicateTargetErrors
Whether to suppress 'duplicate target' errors during discovery
Whether to suppress 'duplicate target' errors during discovery
-notifier.tlsCAFile array
Optional path to TLS CA file to use for verifying connections to -notifier.url. By default system CA is used
Supports an array of values separated by comma or specified via multiple flags.
Optional path to TLS CA file to use for verifying connections to -notifier.url. By default system CA is used
Supports an array of values separated by comma or specified via multiple flags.
-notifier.tlsCertFile array
Optional path to client-side TLS certificate file to use when connecting to -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional path to client-side TLS certificate file to use when connecting to -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.tlsInsecureSkipVerify array
Whether to skip tls verification when connecting to -notifier.url
Supports array of values separated by comma or specified via multiple flags.
Whether to skip tls verification when connecting to -notifier.url
Supports array of values separated by comma or specified via multiple flags.
-notifier.tlsKeyFile array
Optional path to client-side TLS certificate key to use when connecting to -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
Optional path to client-side TLS certificate key to use when connecting to -notifier.url
Supports an array of values separated by comma or specified via multiple flags.
-notifier.tlsServerName array
Optional TLS server name to use for connections to -notifier.url. By default the server name from -notifier.url is used
Supports an array of values separated by comma or specified via multiple flags.
Optional TLS server name to use for connections to -notifier.url. By default the server name from -notifier.url is used
Supports an array of values separated by comma or specified via multiple flags.
-notifier.url array
Prometheus alertmanager URL, e.g. http://127.0.0.1:9093
Supports an array of values separated by comma or specified via multiple flags.
Prometheus alertmanager URL, e.g. http://127.0.0.1:9093
Supports an array of values separated by comma or specified via multiple flags.
-pprofAuthKey string
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
-promscrape.consul.waitTime duration
Wait time used by Consul service discovery. Default value is used if not set
Wait time used by Consul service discovery. Default value is used if not set
-promscrape.consulSDCheckInterval duration
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
-promscrape.discovery.concurrency int
The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100)
The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100)
-promscrape.discovery.concurrentWaitTime duration
The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s)
The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s)
-remoteRead.basicAuth.password string
Optional basic auth password for -remoteRead.url
Optional basic auth password for -remoteRead.url
-remoteRead.basicAuth.passwordFile string
Optional path to basic auth password to use for -remoteRead.url
Optional path to basic auth password to use for -remoteRead.url
-remoteRead.basicAuth.username string
Optional basic auth username for -remoteRead.url
Optional basic auth username for -remoteRead.url
-remoteRead.bearerToken string
Optional bearer auth token to use for -remoteRead.url.
Optional bearer auth token to use for -remoteRead.url.
-remoteRead.bearerTokenFile string
Optional path to bearer token file to use for -remoteRead.url.
Optional path to bearer token file to use for -remoteRead.url.
-remoteRead.disablePathAppend
Whether to disable automatic appending of '/api/v1/query' path to the configured -remoteRead.url.
Whether to disable automatic appending of '/api/v1/query' path to the configured -remoteRead.url.
-remoteRead.ignoreRestoreErrors
Whether to ignore errors from remote storage when restoring alerts state on startup. (default true)
Whether to ignore errors from remote storage when restoring alerts state on startup. (default true)
-remoteRead.lookback duration
Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. (default 1h0m0s)
Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. (default 1h0m0s)
-remoteRead.oauth2.clientID string
Optional OAuth2 clientID to use for -remoteRead.url.
Optional OAuth2 clientID to use for -remoteRead.url.
-remoteRead.oauth2.clientSecret string
Optional OAuth2 clientSecret to use for -remoteRead.url.
Optional OAuth2 clientSecret to use for -remoteRead.url.
-remoteRead.oauth2.clientSecretFile string
Optional OAuth2 clientSecretFile to use for -remoteRead.url.
Optional OAuth2 clientSecretFile to use for -remoteRead.url.
-remoteRead.oauth2.scopes string
Optional OAuth2 scopes to use for -remoteRead.url. Scopes must be delimited by ';'.
Optional OAuth2 scopes to use for -remoteRead.url. Scopes must be delimited by ';'.
-remoteRead.oauth2.tokenUrl string
Optional OAuth2 tokenURL to use for -remoteRead.url.
Optional OAuth2 tokenURL to use for -remoteRead.url.
-remoteRead.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -remoteRead.url. By default system CA is used
Optional path to TLS CA file to use for verifying connections to -remoteRead.url. By default system CA is used
-remoteRead.tlsCertFile string
Optional path to client-side TLS certificate file to use when connecting to -remoteRead.url
Optional path to client-side TLS certificate file to use when connecting to -remoteRead.url
-remoteRead.tlsInsecureSkipVerify
Whether to skip tls verification when connecting to -remoteRead.url
Whether to skip tls verification when connecting to -remoteRead.url
-remoteRead.tlsKeyFile string
Optional path to client-side TLS certificate key to use when connecting to -remoteRead.url
Optional path to client-side TLS certificate key to use when connecting to -remoteRead.url
-remoteRead.tlsServerName string
Optional TLS server name to use for connections to -remoteRead.url. By default the server name from -remoteRead.url is used
Optional TLS server name to use for connections to -remoteRead.url. By default the server name from -remoteRead.url is used
-remoteRead.url vmalert
Optional URL to VictoriaMetrics or vmselect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has successfully persisted its state. E.g. http://127.0.0.1:8428. See also -remoteRead.disablePathAppend
Optional URL to VictoriaMetrics or vmselect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has successfully persisted its state. E.g. http://127.0.0.1:8428. See also -remoteRead.disablePathAppend
-remoteWrite.basicAuth.password string
Optional basic auth password for -remoteWrite.url
Optional basic auth password for -remoteWrite.url
-remoteWrite.basicAuth.passwordFile string
Optional path to basic auth password to use for -remoteWrite.url
Optional path to basic auth password to use for -remoteWrite.url
-remoteWrite.basicAuth.username string
Optional basic auth username for -remoteWrite.url
Optional basic auth username for -remoteWrite.url
-remoteWrite.bearerToken string
Optional bearer auth token to use for -remoteWrite.url.
Optional bearer auth token to use for -remoteWrite.url.
-remoteWrite.bearerTokenFile string
Optional path to bearer token file to use for -remoteWrite.url.
Optional path to bearer token file to use for -remoteWrite.url.
-remoteWrite.concurrency int
Defines number of writers for concurrent writing into remote querier (default 1)
Defines number of writers for concurrent writing into remote querier (default 1)
-remoteWrite.disablePathAppend
Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.
Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.
-remoteWrite.flushInterval duration
Defines interval of flushes to remote write endpoint (default 5s)
Defines interval of flushes to remote write endpoint (default 5s)
-remoteWrite.maxBatchSize int
Defines the max number of timeseries to be flushed at once (default 1000)
Defines the max number of timeseries to be flushed at once (default 1000)
-remoteWrite.maxQueueSize int
Defines the max number of pending datapoints to remote write endpoint (default 100000)
Defines the max number of pending datapoints to remote write endpoint (default 100000)
-remoteWrite.oauth2.clientID string
Optional OAuth2 clientID to use for -remoteWrite.url.
Optional OAuth2 clientID to use for -remoteWrite.url.
-remoteWrite.oauth2.clientSecret string
Optional OAuth2 clientSecret to use for -remoteWrite.url.
Optional OAuth2 clientSecret to use for -remoteWrite.url.
-remoteWrite.oauth2.clientSecretFile string
Optional OAuth2 clientSecretFile to use for -remoteWrite.url.
Optional OAuth2 clientSecretFile to use for -remoteWrite.url.
-remoteWrite.oauth2.scopes string
Optional OAuth2 scopes to use for -remoteWrite.url. Scopes must be delimited by ';'.
Optional OAuth2 scopes to use for -remoteWrite.url. Scopes must be delimited by ';'.
-remoteWrite.oauth2.tokenUrl string
Optional OAuth2 tokenURL to use for -remoteWrite.url.
Optional OAuth2 tokenURL to use for -remoteWrite.url.
-remoteWrite.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. By default system CA is used
Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. By default system CA is used
-remoteWrite.tlsCertFile string
Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url
Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url
-remoteWrite.tlsInsecureSkipVerify
Whether to skip tls verification when connecting to -remoteWrite.url
Whether to skip tls verification when connecting to -remoteWrite.url
-remoteWrite.tlsKeyFile string
Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url
Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url
-remoteWrite.tlsServerName string
Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used
Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used
-remoteWrite.url string
Optional URL to VictoriaMetrics or vminsert where to persist alerts state and recording rules results in form of timeseries. For example, if -remoteWrite.url=http://127.0.0.1:8428 is specified, then the alerts state will be written to http://127.0.0.1:8428/api/v1/write . See also -remoteWrite.disablePathAppend
Optional URL to VictoriaMetrics or vminsert where to persist alerts state and recording rules results in form of timeseries. For example, if -remoteWrite.url=http://127.0.0.1:8428 is specified, then the alerts state will be written to http://127.0.0.1:8428/api/v1/write . See also -remoteWrite.disablePathAppend
-replay.maxDatapointsPerQuery int
Max number of data points expected in one request. The higher the value, the fewer requests will be made during replay. (default 1000)
Max number of data points expected in one request. The higher the value, the fewer requests will be made during replay. (default 1000)
-replay.ruleRetryAttempts int
Defines how many retries to make before giving up on rule if request for it returns an error. (default 5)
Defines how many retries to make before giving up on rule if request for it returns an error. (default 5)
-replay.rulesDelay duration
Delay between rules evaluation within the group. Could be important if there are chained rules inside the group and processing needs to wait for previous rule results to be persisted by remote storage before evaluating the next rule. Keep it equal to or bigger than -remoteWrite.flushInterval. (default 1s)
Delay between rules evaluation within the group. Could be important if there are chained rules inside the group and processing needs to wait for previous rule results to be persisted by remote storage before evaluating the next rule. Keep it equal to or bigger than -remoteWrite.flushInterval. (default 1s)
-replay.timeFrom string
The time filter in RFC3339 format to select time series with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z'
The time filter in RFC3339 format to select time series with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z'
-replay.timeTo string
The time filter in RFC3339 format to select timeseries with timestamp equal or lower than provided value. E.g. '2020-01-01T20:07:00Z'
The time filter in RFC3339 format to select timeseries with timestamp equal or lower than provided value. E.g. '2020-01-01T20:07:00Z'
-rule array
Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule="/path/to/file". Path to a single file with alerting rules
-rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.
Supports an array of values separated by comma or specified via multiple flags.
Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule="/path/to/file". Path to a single file with alerting rules
-rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.
Supports an array of values separated by comma or specified via multiple flags.
-rule.configCheckInterval duration
Interval for checking for changes in '-rule' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead
Interval for checking for changes in '-rule' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead
-rule.maxResolveDuration duration
Limits the maximum duration for automatic alert expiration, which is by default equal to 3 evaluation intervals of the parent group.
Limits the maximum duration for automatic alert expiration, which is by default equal to 3 evaluation intervals of the parent group.
-rule.resendDelay duration
Minimum amount of time to wait before resending an alert to notifier
Minimum amount of time to wait before resending an alert to notifier
-rule.validateExpressions
Whether to validate rules expressions via MetricsQL engine (default true)
Whether to validate rules expressions via MetricsQL engine (default true)
-rule.validateTemplates
Whether to validate annotation and label templates (default true)
Whether to validate annotation and label templates (default true)
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-version
Show VictoriaMetrics version
Show VictoriaMetrics version
```
### Hot config reload
`vmalert` supports "hot" config reload via the following methods:
* send SIGHUP signal to `vmalert` process;
* send GET request to `/-/reload` endpoint;
* configure `-configCheckInterval` flag for periodic reload
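The HTTP method from the list above can be scripted. The following is a minimal Go sketch, not part of `vmalert` itself: the helper names `reloadURL` and `triggerReload` are hypothetical, and the stand-in test server replaces a real `vmalert` instance so the example runs on its own. Only the `/-/reload` path and the GET method come from the documentation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// reloadURL builds the reload endpoint URL for the given base address.
// The helper name is hypothetical; only the /-/reload path is documented.
func reloadURL(base string) string {
	return strings.TrimRight(base, "/") + "/-/reload"
}

// triggerReload sends a GET request to the reload endpoint and
// returns an error unless the response status is 2xx.
func triggerReload(client *http.Client, base string) error {
	resp, err := client.Get(reloadURL(base))
	if err != nil {
		return fmt.Errorf("reload request failed: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("unexpected status %d from reload endpoint", resp.StatusCode)
	}
	return nil
}

func main() {
	// Stand-in server (an assumption for the sketch), so that the example
	// runs without a real vmalert process.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodGet && r.URL.Path == "/-/reload" {
			w.WriteHeader(http.StatusOK)
			return
		}
		http.NotFound(w, r)
	}))
	defer srv.Close()

	if err := triggerReload(srv.Client(), srv.URL); err != nil {
		fmt.Println("reload failed:", err)
		return
	}
	fmt.Println("reload requested")
}
```

Against a real deployment, the base address passed to `triggerReload` would be the address `vmalert` listens on.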
@@ -804,6 +823,7 @@ just add them in address: `-datasource.url=http://localhost:8428?nocache=1`.
To set additional URL params for specific [group of rules](#Groups) modify
the `params` group:
```yaml
groups:
- name: TestGroup
@@ -811,6 +831,7 @@ groups:
denyPartialResponse: ["true"]
extra_label: ["env=dev"]
```
Please note, `params` are used only for executing rules expressions (requests to `datasource.url`).
If URL params set in the `datasource.url` flag conflict with params in the group definition,
the latter take priority.
@@ -818,15 +839,17 @@ the latter will have higher priority.
### Notifier configuration file
Notifier also supports configuration via file specified with flag `notifier.config`:
```
./bin/vmalert -rule=app/vmalert/config/testdata/rules.good.rules \
-datasource.url=http://localhost:8428 \
-notifier.config=app/vmalert/notifier/testdata/consul.good.yaml
-datasource.url=http://localhost:8428 \
-notifier.config=app/vmalert/notifier/testdata/consul.good.yaml
```
The configuration file allows configuring static notifiers or discovering notifiers via
The configuration file allows configuring static notifiers or discovering notifiers via
[Consul](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config).
For example:
```
static_configs:
- targets:
@@ -843,6 +866,7 @@ The list of configured or discovered Notifiers can be explored via [UI](#Web).
The configuration file [specification](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go)
is the following:
```
# Per-target Notifier timeout when pushing alerts.
[ timeout: <duration> | default = 10s ]
@@ -887,17 +911,21 @@ static_configs:
consul_sd_configs:
[ - <consul_sd_config> ... ]
# List of relabel configurations.
# List of relabel configurations for entities discovered via service discovery.
# Supports the same relabeling features as the rest of VictoriaMetrics components.
# See https://docs.victoriametrics.com/vmagent.html#relabeling
relabel_configs:
[ - <relabel_config> ... ]
# List of relabel configurations for alert labels sent via Notifier.
# Supports the same relabeling features as the rest of VictoriaMetrics components.
# See https://docs.victoriametrics.com/vmagent.html#relabeling
alert_relabel_configs:
[ - <relabel_config> ... ]
```
The configuration file can be [hot-reloaded](#hot-config-reload).
## Contributing
`vmalert` is mostly designed and built by the VictoriaMetrics community.
@@ -908,8 +936,8 @@ software. Please keep simplicity as the main priority.
It is recommended to use
[binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
- `vmalert` is located in `vmutils-*` archives there.
* `vmalert` is located in `vmutils-*` archives there.
### Development build
@@ -923,7 +951,6 @@ It is recommended using
2. Run `make vmalert-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-prod` binary and puts it into the `bin` folder.
### ARM build
ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://blog.cloudflare.com/arm-takes-wing/).

@@ -141,6 +141,53 @@ func (ar *AlertingRule) ID() uint64 {
return ar.RuleID
}
type labelSet struct {
// origin labels from series
// used for templating
origin map[string]string
// processed labels with additional data
// used as Alert labels
processed map[string]string
}
// toLabels converts labels from given Metric
// to labelSet which contains original and processed labels.
func (ar *AlertingRule) toLabels(m datasource.Metric, qFn notifier.QueryFn) (*labelSet, error) {
ls := &labelSet{
origin: make(map[string]string, len(m.Labels)),
processed: make(map[string]string),
}
for _, l := range m.Labels {
// drop __name__ to be consistent with Prometheus alerting
if l.Name == "__name__" {
continue
}
ls.origin[l.Name] = l.Value
ls.processed[l.Name] = l.Value
}
extraLabels, err := notifier.ExecTemplate(qFn, ar.Labels, notifier.AlertTplData{
Labels: ls.origin,
Value: m.Values[0],
Expr: ar.Expr,
})
if err != nil {
return nil, fmt.Errorf("failed to expand labels: %s", err)
}
for k, v := range extraLabels {
ls.processed[k] = v
}
// set additional labels to identify group and rule name
if ar.Name != "" {
ls.processed[alertNameLabel] = ar.Name
}
if !*disableAlertGroupLabel && ar.GroupName != "" {
ls.processed[alertGroupNameLabel] = ar.GroupName
}
return ls, nil
}
// ExecRange executes alerting rule on the given time range similarly to Exec.
// It doesn't update internal states of the Rule and meant to be used just
// to get time series for backfilling.
@@ -155,24 +202,7 @@ func (ar *AlertingRule) ExecRange(ctx context.Context, start, end time.Time) ([]
return nil, fmt.Errorf("`query` template isn't supported in replay mode")
}
for _, s := range series {
// set additional labels to identify group and rule Name
if ar.Name != "" {
s.SetLabel(alertNameLabel, ar.Name)
}
if !*disableAlertGroupLabel && ar.GroupName != "" {
s.SetLabel(alertGroupNameLabel, ar.GroupName)
}
// extra labels could contain templates, so we expand them first
labels, err := expandLabels(s, qFn, ar)
if err != nil {
return nil, fmt.Errorf("failed to expand labels: %s", err)
}
for k, v := range labels {
// apply extra labels to datasource
// so the hash key will be consistent on restore
s.SetLabel(k, v)
}
a, err := ar.newAlert(s, time.Time{}, qFn) // initial alert
a, err := ar.newAlert(s, nil, time.Time{}, qFn) // initial alert
if err != nil {
return nil, fmt.Errorf("failed to create alert: %s", err)
}
@@ -191,9 +221,10 @@ func (ar *AlertingRule) ExecRange(ctx context.Context, start, end time.Time) ([]
if at.Sub(prevT) > ar.EvalInterval {
// reset to Pending if there are gaps > EvalInterval between DPs
a.State = notifier.StatePending
a.Start = at
} else if at.Sub(a.Start) >= ar.For {
a.ActiveAt = at
} else if at.Sub(a.ActiveAt) >= ar.For {
a.State = notifier.StateFiring
a.Start = at
}
prevT = at
result = append(result, ar.alertToTimeSeries(a, s.Timestamps[i])...)
@@ -202,11 +233,15 @@ func (ar *AlertingRule) ExecRange(ctx context.Context, start, end time.Time) ([]
return result, nil
}
// resolvedRetention is the duration for which a resolved alert instance
// is kept in memory state and consequently repeatedly sent to the AlertManager.
const resolvedRetention = 15 * time.Minute
// Exec executes AlertingRule expression via the given Querier.
// Based on the Querier results AlertingRule maintains notifier.Alerts
func (ar *AlertingRule) Exec(ctx context.Context) ([]prompbmarshal.TimeSeries, error) {
func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time) ([]prompbmarshal.TimeSeries, error) {
start := time.Now()
qMetrics, err := ar.q.Query(ctx, ar.Expr)
qMetrics, err := ar.q.Query(ctx, ar.Expr, ts)
ar.mu.Lock()
defer ar.mu.Unlock()
@@ -220,59 +255,55 @@ func (ar *AlertingRule) Exec(ctx context.Context) ([]prompbmarshal.TimeSeries, e
for h, a := range ar.alerts {
// cleanup inactive alerts from previous Exec
if a.State == notifier.StateInactive {
if a.State == notifier.StateInactive && ts.Sub(a.ResolvedAt) > resolvedRetention {
delete(ar.alerts, h)
}
}
qFn := func(query string) ([]datasource.Metric, error) { return ar.q.Query(ctx, query) }
qFn := func(query string) ([]datasource.Metric, error) { return ar.q.Query(ctx, query, ts) }
updated := make(map[uint64]struct{})
// update list of active alerts
for _, m := range qMetrics {
// set additional labels to identify group and rule name
if ar.Name != "" {
m.SetLabel(alertNameLabel, ar.Name)
}
if !*disableAlertGroupLabel && ar.GroupName != "" {
m.SetLabel(alertGroupNameLabel, ar.GroupName)
}
// extra labels could contain templates, so we expand them first
labels, err := expandLabels(m, qFn, ar)
ls, err := ar.toLabels(m, qFn)
if err != nil {
return nil, fmt.Errorf("failed to expand labels: %s", err)
}
for k, v := range labels {
// apply extra labels to datasource
// so the hash key will be consistent on restore
m.SetLabel(k, v)
}
h := hash(m)
h := hash(ls.processed)
if _, ok := updated[h]; ok {
// duplicate may be caused by extra labels
// conflicting with the metric labels
return nil, fmt.Errorf("labels %v: %w", m.Labels, errDuplicate)
ar.lastExecError = fmt.Errorf("labels %v: %w", ls.processed, errDuplicate)
return nil, ar.lastExecError
}
updated[h] = struct{}{}
if a, ok := ar.alerts[h]; ok {
if a.State == notifier.StateInactive {
// alert could be in inactive state for resolvedRetention
// so when we again receive metrics for it - we switch it
// back to notifier.StatePending
a.State = notifier.StatePending
a.ActiveAt = ts
}
if a.Value != m.Values[0] {
// update Value field with latest value
a.Value = m.Values[0]
// and re-exec template since Value can be used
// in annotations
a.Annotations, err = a.ExecTemplate(qFn, ar.Annotations)
a.Annotations, err = a.ExecTemplate(qFn, ls.origin, ar.Annotations)
if err != nil {
return nil, err
}
}
continue
}
a, err := ar.newAlert(m, ar.lastExecTime, qFn)
a, err := ar.newAlert(m, ls, ar.lastExecTime, qFn)
if err != nil {
ar.lastExecError = err
return nil, fmt.Errorf("failed to create alert: %w", err)
}
a.ID = h
a.State = notifier.StatePending
a.ActiveAt = ts
ar.alerts[h] = a
}
@@ -286,28 +317,19 @@ func (ar *AlertingRule) Exec(ctx context.Context) ([]prompbmarshal.TimeSeries, e
delete(ar.alerts, h)
continue
}
a.State = notifier.StateInactive
if a.State == notifier.StateFiring {
a.State = notifier.StateInactive
a.ResolvedAt = ts
}
continue
}
if a.State == notifier.StatePending && time.Since(a.Start) >= ar.For {
if a.State == notifier.StatePending && time.Since(a.ActiveAt) >= ar.For {
a.State = notifier.StateFiring
a.Start = ts
alertsFired.Inc()
}
}
return ar.toTimeSeries(ar.lastExecTime.Unix()), nil
}
func expandLabels(m datasource.Metric, q notifier.QueryFn, ar *AlertingRule) (map[string]string, error) {
metricLabels := make(map[string]string)
for _, l := range m.Labels {
metricLabels[l.Name] = l.Value
}
tpl := notifier.AlertTplData{
Labels: metricLabels,
Value: m.Values[0],
Expr: ar.Expr,
}
return notifier.ExecTemplate(q, ar.Labels, tpl)
return ar.toTimeSeries(ts.Unix()), nil
}
func (ar *AlertingRule) toTimeSeries(timestamp int64) []prompbmarshal.TimeSeries {
@@ -340,42 +362,43 @@ func (ar *AlertingRule) UpdateWith(r Rule) error {
}
// TODO: consider hashing algorithm in VM
func hash(m datasource.Metric) uint64 {
func hash(labels map[string]string) uint64 {
hash := fnv.New64a()
labels := m.Labels
sort.Slice(labels, func(i, j int) bool {
return labels[i].Name < labels[j].Name
})
for _, l := range labels {
keys := make([]string, 0, len(labels))
for k := range labels {
keys = append(keys, k)
}
sort.Strings(keys)
for _, k := range keys {
// drop __name__ to be consistent with Prometheus alerting
if l.Name == "__name__" {
if k == "__name__" {
continue
}
hash.Write([]byte(l.Name))
hash.Write([]byte(l.Value))
name, value := k, labels[k]
hash.Write([]byte(name))
hash.Write([]byte(value))
hash.Write([]byte("\xff"))
}
return hash.Sum64()
}
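The reworked `hash` takes a label map rather than a metric, so the keys have to be sorted first to get a stable result, since Go map iteration order is not defined. A self-contained sketch of the same idea follows; `hashLabels` is a hypothetical name, and the code is a simplified restatement of the diff rather than the exact source.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// hashLabels computes an FNV-1a hash over the labels in sorted key order.
// The __name__ label is skipped, as in the diff, and a 0xff byte is
// written after each name/value pair.
func hashLabels(labels map[string]string) uint64 {
	h := fnv.New64a()
	keys := make([]string, 0, len(labels))
	for k := range labels {
		if k == "__name__" {
			continue
		}
		keys = append(keys, k)
	}
	// sorting makes the result independent of map iteration order
	sort.Strings(keys)
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte(labels[k]))
		h.Write([]byte("\xff"))
	}
	return h.Sum64()
}

func main() {
	withName := map[string]string{"__name__": "up", "job": "a", "instance": "x"}
	withoutName := map[string]string{"instance": "x", "job": "a"}
	// Both maps hash to the same value because __name__ is ignored.
	fmt.Println(hashLabels(withName) == hashLabels(withoutName))
}
```

Note that the separator is written only after the value, so the scheme does not distinguish where a name ends and its value begins.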
func (ar *AlertingRule) newAlert(m datasource.Metric, start time.Time, qFn notifier.QueryFn) (*notifier.Alert, error) {
a := &notifier.Alert{
GroupID: ar.GroupID,
Name: ar.Name,
Labels: map[string]string{},
Value: m.Values[0],
Start: start,
Expr: ar.Expr,
}
for _, l := range m.Labels {
// drop __name__ to be consistent with Prometheus alerting
if l.Name == "__name__" {
continue
}
a.Labels[l.Name] = l.Value
}
func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.Time, qFn notifier.QueryFn) (*notifier.Alert, error) {
var err error
a.Annotations, err = a.ExecTemplate(qFn, ar.Annotations)
if ls == nil {
ls, err = ar.toLabels(m, qFn)
if err != nil {
return nil, fmt.Errorf("failed to expand labels: %s", err)
}
}
a := &notifier.Alert{
GroupID: ar.GroupID,
Name: ar.Name,
Labels: ls.processed,
Value: m.Values[0],
ActiveAt: start,
Expr: ar.Expr,
}
a.Annotations, err = a.ExecTemplate(qFn, ls.origin, ar.Annotations)
return a, err
}
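A minimal standalone sketch of the map-based `hash` introduced in this hunk. Keys are sorted so that Go's randomized map iteration cannot change the result, `__name__` is skipped, and a `\xff` byte terminates each pair. The name `hashLabels` and the sample labels are illustrative assumptions, not part of the commit.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// hashLabels returns an order-independent FNV-1a hash of a label map.
// It mirrors the structure of the reworked hash() in the diff above.
func hashLabels(labels map[string]string) uint64 {
	h := fnv.New64a()
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	// Sorting makes the result independent of map iteration order.
	sort.Strings(keys)
	for _, k := range keys {
		// The metric name does not take part in alert identity.
		if k == "__name__" {
			continue
		}
		h.Write([]byte(k))
		h.Write([]byte(labels[k]))
		h.Write([]byte("\xff")) // terminator after each name/value pair
	}
	return h.Sum64()
}

func main() {
	a := hashLabels(map[string]string{"job": "api", "instance": "foo"})
	b := hashLabels(map[string]string{"instance": "foo", "job": "api", "__name__": "up"})
	fmt.Println(a == b) // true: same labels once __name__ is dropped
}
```

Because the function takes a plain map, a nil map is also a valid input, which is what the updated tests rely on when they call `hash(nil)`.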
@@ -435,6 +458,9 @@ func (ar *AlertingRule) AlertsToAPI() []*APIAlert {
var alerts []*APIAlert
ar.mu.RLock()
for _, a := range ar.alerts {
if a.State == notifier.StateInactive {
continue
}
alerts = append(alerts, ar.newAlertAPI(*a))
}
ar.mu.RUnlock()
@@ -453,7 +479,7 @@ func (ar *AlertingRule) newAlertAPI(a notifier.Alert) *APIAlert {
Labels: a.Labels,
Annotations: a.Annotations,
State: a.State.String(),
ActiveAt: a.Start,
ActiveAt: a.ActiveAt,
Restored: a.Restored,
Value: strconv.FormatFloat(a.Value, 'f', -1, 32),
}
@@ -479,7 +505,7 @@ const (
alertGroupNameLabel = "alertgroup"
)
// alertToTimeSeries converts the given alert with the given timestamp to timeseries
// alertToTimeSeries converts the given alert with the given timestamp to time series
func (ar *AlertingRule) alertToTimeSeries(a *notifier.Alert, timestamp int64) []prompbmarshal.TimeSeries {
var tss []prompbmarshal.TimeSeries
tss = append(tss, alertToTimeSeries(a, timestamp))
@@ -507,11 +533,11 @@ func alertForToTimeSeries(a *notifier.Alert, timestamp int64) prompbmarshal.Time
labels[k] = v
}
labels["__name__"] = alertForStateMetricName
return newTimeSeries([]float64{float64(a.Start.Unix())}, []int64{timestamp}, labels)
return newTimeSeries([]float64{float64(a.ActiveAt.Unix())}, []int64{timestamp}, labels)
}
// Restore restores the state of active alerts based on previously written time series.
// Restore restores only Start field. Field State will be always Pending and supposed
// Restore restores only ActiveAt field. Field State will be always Pending and supposed
// to be updated on next Exec, as well as Value field.
// Only rules with For > 0 will be restored.
func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookback time.Duration, labels map[string]string) error {
@@ -519,7 +545,8 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
return fmt.Errorf("querier is nil")
}
qFn := func(query string) ([]datasource.Metric, error) { return ar.q.Query(ctx, query) }
ts := time.Now()
qFn := func(query string) ([]datasource.Metric, error) { return ar.q.Query(ctx, query, ts) }
// account for external labels in filter
var labelsFilter string
@@ -532,21 +559,32 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
// remote write protocol which is used for state persistence in vmalert.
expr := fmt.Sprintf("last_over_time(%s{alertname=%q%s}[%ds])",
alertForStateMetricName, ar.Name, labelsFilter, int(lookback.Seconds()))
qMetrics, err := q.Query(ctx, expr)
qMetrics, err := q.Query(ctx, expr, ts)
if err != nil {
return err
}
for _, m := range qMetrics {
a, err := ar.newAlert(m, time.Unix(int64(m.Values[0]), 0), qFn)
ls := &labelSet{
origin: make(map[string]string, len(m.Labels)),
processed: make(map[string]string, len(m.Labels)),
}
for _, l := range m.Labels {
if l.Name == "__name__" {
continue
}
ls.origin[l.Name] = l.Value
ls.processed[l.Name] = l.Value
}
a, err := ar.newAlert(m, ls, time.Unix(int64(m.Values[0]), 0), qFn)
if err != nil {
return fmt.Errorf("failed to create alert: %w", err)
}
a.ID = hash(m)
a.ID = hash(ls.processed)
a.State = notifier.StatePending
a.Restored = true
ar.alerts[a.ID] = a
logger.Infof("alert %q (%d) restored to state at %v", a.Name, a.ID, a.Start)
logger.Infof("alert %q (%d) restored to state at %v", a.Name, a.ID, a.ActiveAt)
}
return nil
}
@@ -555,21 +593,27 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
// and returns only those which should be sent to notifier.
// Not safe for concurrent use.
func (ar *AlertingRule) alertsToSend(ts time.Time, resolveDuration, resendDelay time.Duration) []notifier.Alert {
needsSending := func(a *notifier.Alert) bool {
if a.State == notifier.StatePending {
return false
}
if a.ResolvedAt.After(a.LastSent) {
return true
}
return a.LastSent.Add(resendDelay).Before(ts)
}
var alerts []notifier.Alert
for _, a := range ar.alerts {
switch a.State {
case notifier.StateFiring:
if time.Since(a.LastSent) < resendDelay {
continue
}
a.End = ts.Add(resolveDuration)
a.LastSent = ts
alerts = append(alerts, *a)
case notifier.StateInactive:
a.End = ts
a.LastSent = ts
alerts = append(alerts, *a)
if !needsSending(a) {
continue
}
a.End = ts.Add(resolveDuration)
if a.State == notifier.StateInactive {
a.End = a.ResolvedAt
}
a.LastSent = ts
alerts = append(alerts, *a)
}
return alerts
}
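The `needsSending` closure above decides per alert whether a notification is due. The sketch below isolates that decision with simplified stand-in types; `state`, `alert`, `now` and `delay` are assumptions made for illustration, while the three checks follow the closure in the diff.

```go
package main

import (
	"fmt"
	"time"
)

type state int

const (
	inactive state = iota
	pending
	firing
)

// alert is a reduced stand-in for notifier.Alert with only the fields
// that the sending decision reads.
type alert struct {
	State      state
	LastSent   time.Time
	ResolvedAt time.Time
}

var now = time.Unix(1_700_000_000, 0) // arbitrary fixed evaluation time

const delay = time.Minute // stand-in for the resend delay

// needsSending reports whether the alert should be sent at time ts.
func needsSending(a alert, ts time.Time, resendDelay time.Duration) bool {
	if a.State == pending {
		return false // pending alerts are never sent
	}
	if a.ResolvedAt.After(a.LastSent) {
		return true // resolved after the last notification: send once more
	}
	// Otherwise send only when the resend delay has elapsed.
	return a.LastSent.Add(resendDelay).Before(ts)
}

func main() {
	sent := alert{State: inactive, LastSent: now.Add(-time.Second), ResolvedAt: now.Add(-2 * time.Second)}
	fresh := alert{State: inactive, LastSent: now.Add(-time.Second), ResolvedAt: now}
	fmt.Println(needsSending(sent, now, delay), needsSending(fresh, now, delay))
}
```

The two alerts in `main` correspond to the two test cases added to `TestAlertsToSend` in this commit: an inactive alert that was already notified is skipped, and one resolved after its last send is delivered.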


@@ -61,7 +61,7 @@ func TestAlertingRule_ToTimeSeries(t *testing.T) {
},
{
newTestAlertingRule("for", time.Second),
&notifier.Alert{State: notifier.StateFiring, Start: timestamp.Add(time.Second)},
&notifier.Alert{State: notifier.StateFiring, ActiveAt: timestamp.Add(time.Second)},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
@@ -76,7 +76,7 @@ func TestAlertingRule_ToTimeSeries(t *testing.T) {
},
{
newTestAlertingRule("for pending", 10*time.Second),
&notifier.Alert{State: notifier.StatePending, Start: timestamp.Add(time.Second)},
&notifier.Alert{State: notifier.StatePending, ActiveAt: timestamp.Add(time.Second)},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
@@ -169,7 +169,7 @@ func TestAlertingRule_Exec(t *testing.T) {
},
},
{
newTestAlertingRule("single-firing=>inactive=>firing=>inactive=>empty", 0),
newTestAlertingRule("single-firing=>inactive=>firing=>inactive=>inactive", 0),
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
{},
@@ -177,7 +177,9 @@ func TestAlertingRule_Exec(t *testing.T) {
{},
{},
},
nil,
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
},
},
{
newTestAlertingRule("single-firing=>inactive=>firing=>inactive=>empty=>firing", 0),
@@ -217,8 +219,9 @@ func TestAlertingRule_Exec(t *testing.T) {
},
// 1: fire first alert
// 2: fire second alert, set first inactive
// 3: fire third alert, set second inactive, delete first one
// 3: fire third alert, set second inactive
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
{labels: []string{"name", "foo1"}, alert: &notifier.Alert{State: notifier.StateInactive}},
{labels: []string{"name", "foo2"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
@@ -301,7 +304,7 @@ func TestAlertingRule_Exec(t *testing.T) {
for _, step := range tc.steps {
fq.reset()
fq.add(step...)
if _, err := tc.rule.Exec(context.TODO()); err != nil {
if _, err := tc.rule.Exec(context.TODO(), time.Now()); err != nil {
t.Fatalf("unexpected err: %s", err)
}
// artificial delay between applying steps
@@ -312,10 +315,13 @@ func TestAlertingRule_Exec(t *testing.T) {
}
expAlerts := make(map[uint64]*notifier.Alert)
for _, ta := range tc.expAlerts {
labels := ta.labels
labels = append(labels, alertNameLabel)
labels = append(labels, tc.rule.Name)
h := hash(metricWithLabels(t, labels...))
labels := make(map[string]string)
for i := 0; i < len(ta.labels); i += 2 {
k, v := ta.labels[i], ta.labels[i+1]
labels[k] = v
}
labels[alertNameLabel] = tc.rule.Name
h := hash(labels)
expAlerts[h] = ta.alert
}
for key, exp := range expAlerts {
@@ -380,9 +386,9 @@ func TestAlertingRule_ExecRange(t *testing.T) {
{Values: []float64{1, 1, 1}, Timestamps: []int64{1, 3, 5}},
},
[]*notifier.Alert{
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StatePending, Start: time.Unix(3, 0)},
{State: notifier.StatePending, Start: time.Unix(5, 0)},
{State: notifier.StatePending, ActiveAt: time.Unix(1, 0)},
{State: notifier.StatePending, ActiveAt: time.Unix(3, 0)},
{State: notifier.StatePending, ActiveAt: time.Unix(5, 0)},
},
},
{
@@ -391,9 +397,9 @@ func TestAlertingRule_ExecRange(t *testing.T) {
{Values: []float64{1, 1, 1}, Timestamps: []int64{1, 3, 5}},
},
[]*notifier.Alert{
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StateFiring, Start: time.Unix(1, 0)},
{State: notifier.StatePending, ActiveAt: time.Unix(1, 0)},
{State: notifier.StatePending, ActiveAt: time.Unix(1, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(1, 0)},
},
},
{
@@ -402,11 +408,11 @@ func TestAlertingRule_ExecRange(t *testing.T) {
{Values: []float64{1, 1, 1, 1, 1}, Timestamps: []int64{1, 2, 5, 6, 20}},
},
[]*notifier.Alert{
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StateFiring, Start: time.Unix(1, 0)},
{State: notifier.StatePending, Start: time.Unix(5, 0)},
{State: notifier.StateFiring, Start: time.Unix(5, 0)},
{State: notifier.StatePending, Start: time.Unix(20, 0)},
{State: notifier.StatePending, ActiveAt: time.Unix(1, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(1, 0)},
{State: notifier.StatePending, ActiveAt: time.Unix(5, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(5, 0)},
{State: notifier.StatePending, ActiveAt: time.Unix(20, 0)},
},
},
{
@@ -418,15 +424,15 @@ func TestAlertingRule_ExecRange(t *testing.T) {
},
},
[]*notifier.Alert{
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StateFiring, Start: time.Unix(1, 0)},
{State: notifier.StatePending, ActiveAt: time.Unix(1, 0)},
{State: notifier.StatePending, ActiveAt: time.Unix(1, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(1, 0)},
//
{State: notifier.StatePending, Start: time.Unix(1, 0),
{State: notifier.StatePending, ActiveAt: time.Unix(1, 0),
Labels: map[string]string{
"foo": "bar",
}},
{State: notifier.StatePending, Start: time.Unix(5, 0),
{State: notifier.StatePending, ActiveAt: time.Unix(5, 0),
Labels: map[string]string{
"foo": "bar",
}},
@@ -479,7 +485,7 @@ func TestAlertingRule_ExecRange(t *testing.T) {
a.Labels = make(map[string]string)
}
a.Labels[alertNameLabel] = tc.rule.Name
expTS = append(expTS, tc.rule.alertToTimeSeries(tc.expAlerts[j], timestamp)...)
expTS = append(expTS, tc.rule.alertToTimeSeries(a, timestamp)...)
j++
}
}
@@ -510,8 +516,8 @@ func TestAlertingRule_Restore(t *testing.T) {
),
},
map[uint64]*notifier.Alert{
hash(datasource.Metric{}): {State: notifier.StatePending,
Start: time.Now().Truncate(time.Hour)},
hash(nil): {State: notifier.StatePending,
ActiveAt: time.Now().Truncate(time.Hour)},
},
},
{
@@ -526,13 +532,13 @@ func TestAlertingRule_Restore(t *testing.T) {
),
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t,
alertNameLabel, "metric labels",
alertGroupNameLabel, "groupID",
"foo", "bar",
"namespace", "baz",
)): {State: notifier.StatePending,
Start: time.Now().Truncate(time.Hour)},
hash(map[string]string{
alertNameLabel: "metric labels",
alertGroupNameLabel: "groupID",
"foo": "bar",
"namespace": "baz",
}): {State: notifier.StatePending,
ActiveAt: time.Now().Truncate(time.Hour)},
},
},
{
@@ -547,12 +553,12 @@ func TestAlertingRule_Restore(t *testing.T) {
),
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t,
"foo", "bar",
"namespace", "baz",
"source", "vm",
)): {State: notifier.StatePending,
Start: time.Now().Truncate(time.Hour)},
hash(map[string]string{
"foo": "bar",
"namespace": "baz",
"source": "vm",
}): {State: notifier.StatePending,
ActiveAt: time.Now().Truncate(time.Hour)},
},
},
{
@@ -572,12 +578,12 @@ func TestAlertingRule_Restore(t *testing.T) {
),
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "host", "localhost-1")): {State: notifier.StatePending,
Start: time.Now().Truncate(time.Hour)},
hash(metricWithLabels(t, "host", "localhost-2")): {State: notifier.StatePending,
Start: time.Now().Truncate(2 * time.Hour)},
hash(metricWithLabels(t, "host", "localhost-3")): {State: notifier.StatePending,
Start: time.Now().Truncate(3 * time.Hour)},
hash(map[string]string{"host": "localhost-1"}): {State: notifier.StatePending,
ActiveAt: time.Now().Truncate(time.Hour)},
hash(map[string]string{"host": "localhost-2"}): {State: notifier.StatePending,
ActiveAt: time.Now().Truncate(2 * time.Hour)},
hash(map[string]string{"host": "localhost-3"}): {State: notifier.StatePending,
ActiveAt: time.Now().Truncate(3 * time.Hour)},
},
},
}
@@ -602,8 +608,8 @@ func TestAlertingRule_Restore(t *testing.T) {
if got.State != exp.State {
t.Fatalf("expected state %d; got %d", exp.State, got.State)
}
if got.Start != exp.Start {
t.Fatalf("expected Start %v; got %v", exp.Start, got.Start)
if got.ActiveAt != exp.ActiveAt {
t.Fatalf("expected ActiveAt %v; got %v", exp.ActiveAt, got.ActiveAt)
}
}
})
@@ -618,14 +624,14 @@ func TestAlertingRule_Exec_Negative(t *testing.T) {
// successful attempt
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "bar"))
_, err := ar.Exec(context.TODO())
_, err := ar.Exec(context.TODO(), time.Now())
if err != nil {
t.Fatal(err)
}
// label `job` will collide with rule extra label and will make both time series equal
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "baz"))
_, err = ar.Exec(context.TODO())
_, err = ar.Exec(context.TODO(), time.Now())
if !errors.Is(err, errDuplicate) {
t.Fatalf("expected to have %s error; got %s", errDuplicate, err)
}
@@ -634,7 +640,7 @@ func TestAlertingRule_Exec_Negative(t *testing.T) {
expErr := "connection reset by peer"
fq.setErr(errors.New(expErr))
_, err = ar.Exec(context.TODO())
_, err = ar.Exec(context.TODO(), time.Now())
if err == nil {
t.Fatalf("expected to get err; got nil")
}
@@ -656,7 +662,7 @@ func TestAlertingRule_Template(t *testing.T) {
metricWithValueAndLabels(t, 1, "instance", "bar"),
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, alertNameLabel, "common", "region", "east", "instance", "foo")): {
hash(map[string]string{alertNameLabel: "common", "region": "east", "instance": "foo"}): {
Annotations: map[string]string{},
Labels: map[string]string{
alertNameLabel: "common",
@@ -664,7 +670,7 @@ func TestAlertingRule_Template(t *testing.T) {
"instance": "foo",
},
},
hash(metricWithLabels(t, alertNameLabel, "common", "region", "east", "instance", "bar")): {
hash(map[string]string{alertNameLabel: "common", "region": "east", "instance": "bar"}): {
Annotations: map[string]string{},
Labels: map[string]string{
alertNameLabel: "common",
@@ -679,77 +685,70 @@ func TestAlertingRule_Template(t *testing.T) {
Name: "override label",
Labels: map[string]string{
"instance": "{{ $labels.instance }}",
"region": "east",
},
Annotations: map[string]string{
"summary": `Too high connection number for "{{ $labels.instance }}" for region {{ $labels.region }}`,
"description": `It is {{ $value }} connections for "{{ $labels.instance }}"`,
"summary": `Too high connection number for "{{ $labels.instance }}"`,
"description": `{{ $labels.alertname}}: It is {{ $value }} connections for "{{ $labels.instance }}"`,
},
alerts: make(map[uint64]*notifier.Alert),
},
[]datasource.Metric{
metricWithValueAndLabels(t, 2, "instance", "foo"),
metricWithValueAndLabels(t, 10, "instance", "bar"),
metricWithValueAndLabels(t, 2, "instance", "foo", alertNameLabel, "override"),
metricWithValueAndLabels(t, 10, "instance", "bar", alertNameLabel, "override"),
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, alertNameLabel, "override label", "region", "east", "instance", "foo")): {
hash(map[string]string{alertNameLabel: "override label", "instance": "foo"}): {
Labels: map[string]string{
alertNameLabel: "override label",
"instance": "foo",
"region": "east",
},
Annotations: map[string]string{
"summary": `Too high connection number for "foo" for region east`,
"description": `It is 2 connections for "foo"`,
"summary": `Too high connection number for "foo"`,
"description": `override: It is 2 connections for "foo"`,
},
},
hash(metricWithLabels(t, alertNameLabel, "override label", "region", "east", "instance", "bar")): {
hash(map[string]string{alertNameLabel: "override label", "instance": "bar"}): {
Labels: map[string]string{
alertNameLabel: "override label",
"instance": "bar",
"region": "east",
},
Annotations: map[string]string{
"summary": `Too high connection number for "bar" for region east`,
"description": `It is 10 connections for "bar"`,
"summary": `Too high connection number for "bar"`,
"description": `override: It is 10 connections for "bar"`,
},
},
},
},
{
&AlertingRule{
Name: "ExtraTemplating",
Name: "OriginLabels",
GroupName: "Testing",
Labels: map[string]string{
"name": "alert_{{ $labels.alertname }}",
"group": "group_{{ $labels.alertgroup }}",
"instance": "{{ $labels.instance }}",
},
Annotations: map[string]string{
"summary": `Alert "{{ $labels.alertname }}({{ $labels.alertgroup }})" for instance {{ $labels.instance }}`,
"description": `Alert "{{ $labels.name }}({{ $labels.group }})" for instance {{ $labels.instance }}`,
"summary": `Alert "{{ $labels.alertname }}({{ $labels.alertgroup }})" for instance {{ $labels.instance }}`,
},
alerts: make(map[uint64]*notifier.Alert),
},
[]datasource.Metric{
metricWithValueAndLabels(t, 1, "instance", "foo"),
metricWithValueAndLabels(t, 1,
alertNameLabel, "originAlertname",
alertGroupNameLabel, "originGroupname",
"instance", "foo"),
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, alertNameLabel, "ExtraTemplating",
"name", "alert_ExtraTemplating",
alertGroupNameLabel, "Testing",
"group", "group_Testing",
"instance", "foo")): {
hash(map[string]string{
alertNameLabel: "OriginLabels",
alertGroupNameLabel: "Testing",
"instance": "foo"}): {
Labels: map[string]string{
alertNameLabel: "ExtraTemplating",
"name": "alert_ExtraTemplating",
alertNameLabel: "OriginLabels",
alertGroupNameLabel: "Testing",
"group": "group_Testing",
"instance": "foo",
},
Annotations: map[string]string{
"summary": `Alert "ExtraTemplating(Testing)" for instance foo`,
"description": `Alert "alert_ExtraTemplating(group_Testing)" for instance foo`,
"summary": `Alert "originAlertname(originGroupname)" for instance foo`,
},
},
},
@@ -762,7 +761,7 @@ func TestAlertingRule_Template(t *testing.T) {
tc.rule.GroupID = fakeGroup.ID()
tc.rule.q = fq
fq.add(tc.metrics...)
if _, err := tc.rule.Exec(context.TODO()); err != nil {
if _, err := tc.rule.Exec(context.TODO(), time.Now()); err != nil {
t.Fatalf("unexpected err: %s", err)
}
for hash, expAlert := range tc.expAlerts {
@@ -821,17 +820,17 @@ func TestAlertsToSend(t *testing.T) {
5*time.Minute, time.Minute,
)
f( // resolve inactive alert at the current timestamp
[]*notifier.Alert{{State: notifier.StateInactive}},
[]*notifier.Alert{{State: notifier.StateInactive, ResolvedAt: ts}},
[]*notifier.Alert{{LastSent: ts, End: ts}},
time.Minute, time.Minute,
)
f( // mixed case of firing and resolved alerts. Names are added for deterministic sorting
[]*notifier.Alert{{Name: "a", State: notifier.StateFiring}, {Name: "b", State: notifier.StateInactive}},
[]*notifier.Alert{{Name: "a", State: notifier.StateFiring}, {Name: "b", State: notifier.StateInactive, ResolvedAt: ts}},
[]*notifier.Alert{{Name: "a", LastSent: ts, End: ts.Add(5 * time.Minute)}, {Name: "b", LastSent: ts, End: ts}},
5*time.Minute, time.Minute,
)
f( // mixed case of pending and resolved alerts. Names are added for deterministic sorting
[]*notifier.Alert{{Name: "a", State: notifier.StatePending}, {Name: "b", State: notifier.StateInactive}},
[]*notifier.Alert{{Name: "a", State: notifier.StatePending}, {Name: "b", State: notifier.StateInactive, ResolvedAt: ts}},
[]*notifier.Alert{{Name: "b", LastSent: ts, End: ts}},
5*time.Minute, time.Minute,
)
@@ -850,6 +849,16 @@ func TestAlertsToSend(t *testing.T) {
[]*notifier.Alert{{LastSent: ts, End: ts.Add(time.Minute)}},
time.Minute, 0,
)
f( // inactive alert which has been sent already
[]*notifier.Alert{{State: notifier.StateInactive, LastSent: ts.Add(-time.Second), ResolvedAt: ts.Add(-2 * time.Second)}},
nil,
time.Minute, time.Minute,
)
f( // inactive alert which has been resolved after last send
[]*notifier.Alert{{State: notifier.StateInactive, LastSent: ts.Add(-time.Second), ResolvedAt: ts}},
[]*notifier.Alert{{LastSent: ts, End: ts}},
time.Minute, time.Minute,
)
}
func newTestRuleWithLabels(name string, labels ...string) *AlertingRule {


@@ -8,7 +8,7 @@ import (
// Querier interface wraps Query and QueryRange methods
type Querier interface {
Query(ctx context.Context, query string) ([]Metric, error)
Query(ctx context.Context, query string, ts time.Time) ([]Metric, error)
QueryRange(ctx context.Context, query string, from, to time.Time) ([]Metric, error)
}


@@ -41,7 +41,9 @@ var (
queryTimeAlignment = flag.Bool("datasource.queryTimeAlignment", true, `Whether to align "time" parameter with evaluation interval.`+
" Alignment is supposed to produce deterministic results regardless of the number of vmalert replicas or the time they were started. See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257")
maxIdleConnections = flag.Int("datasource.maxIdleConnections", 100, `Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state.`)
roundDigits = flag.Int("datasource.roundDigits", 0, `Adds "round_digits" GET param to datasource requests. `+
disableKeepAlive = flag.Bool("datasource.disableKeepAlive", false, `Whether to disable long-lived connections to the datasource. `+
`If true, disables HTTP keep-alives and will only use the connection to the server for a single HTTP request.`)
roundDigits = flag.Int("datasource.roundDigits", 0, `Adds "round_digits" GET param to datasource requests. `+
`In VM "round_digits" limits the number of digits after the decimal point in response values.`)
)
@@ -62,6 +64,7 @@ func Init(extraParams url.Values) (QuerierBuilder, error) {
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
tr.DisableKeepAlives = *disableKeepAlive
tr.MaxIdleConnsPerHost = *maxIdleConnections
if tr.MaxIdleConns != 0 && tr.MaxIdleConns < tr.MaxIdleConnsPerHost {
tr.MaxIdleConns = tr.MaxIdleConnsPerHost


@@ -71,13 +71,12 @@ func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Durati
}
// Query executes the given query and returns parsed response
func (s *VMStorage) Query(ctx context.Context, query string) ([]Metric, error) {
func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) ([]Metric, error) {
req, err := s.newRequestPOST()
if err != nil {
return nil, err
}
ts := time.Now()
switch s.dataSourceType.String() {
case "prometheus":
s.setPrometheusInstantReqParams(req, query, ts)


@@ -89,26 +89,27 @@ func TestVMInstantQuery(t *testing.T) {
p := NewPrometheusType()
pq := s.BuildWithParams(QuerierParams{DataSourceType: &p, EvaluationInterval: 15 * time.Second})
ts := time.Now()
if _, err := pq.Query(ctx, query); err == nil {
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected connection error got nil")
}
if _, err := pq.Query(ctx, query); err == nil {
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected invalid response status error got nil")
}
if _, err := pq.Query(ctx, query); err == nil {
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected response body error got nil")
}
if _, err := pq.Query(ctx, query); err == nil {
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected error status got nil")
}
if _, err := pq.Query(ctx, query); err == nil {
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected unknown status got nil")
}
if _, err := pq.Query(ctx, query); err == nil {
if _, err := pq.Query(ctx, query, ts); err == nil {
t.Fatalf("expected non-vector resultType error got nil")
}
m, err := pq.Query(ctx, query)
m, err := pq.Query(ctx, query, ts)
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -134,7 +135,7 @@ func TestVMInstantQuery(t *testing.T) {
g := NewGraphiteType()
gq := s.BuildWithParams(QuerierParams{DataSourceType: &g})
m, err = gq.Query(ctx, queryRender)
m, err = gq.Query(ctx, queryRender, ts)
if err != nil {
t.Fatalf("unexpected %s", err)
}


@@ -5,6 +5,8 @@ import (
"fmt"
"hash/fnv"
"net/url"
"strconv"
"strings"
"sync"
"time"
@@ -13,7 +15,9 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
)
@@ -44,6 +48,7 @@ type Group struct {
type groupMetrics struct {
iterationTotal *utils.Counter
iterationDuration *utils.Summary
iterationMissed *utils.Counter
}
func newGroupMetrics(name, file string) *groupMetrics {
@@ -51,6 +56,7 @@ func newGroupMetrics(name, file string) *groupMetrics {
labels := fmt.Sprintf(`group=%q, file=%q`, name, file)
m.iterationTotal = utils.GetOrCreateCounter(fmt.Sprintf(`vmalert_iteration_total{%s}`, labels))
m.iterationDuration = utils.GetOrCreateSummary(fmt.Sprintf(`vmalert_iteration_duration_seconds{%s}`, labels))
m.iterationMissed = utils.GetOrCreateCounter(fmt.Sprintf(`vmalert_iteration_missed_total{%s}`, labels))
return m
}
@@ -226,6 +232,13 @@ var skipRandSleepOnGroupStart bool
func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *remotewrite.Client) {
defer func() { close(g.finishedCh) }()
e := &executor{
rw: rw,
notifiers: nts,
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label)}
evalTS := time.Now()
// Spread group rules evaluation over time in order to reduce load on VictoriaMetrics.
if !skipRandSleepOnGroupStart {
randSleep := uint64(float64(g.Interval) * (float64(g.ID()) / (1 << 64)))
@@ -247,7 +260,31 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
}
logger.Infof("group %q started; interval=%v; concurrency=%d", g.Name, g.Interval, g.Concurrency)
e := &executor{rw: rw, notifiers: nts}
eval := func(ts time.Time) {
g.metrics.iterationTotal.Inc()
start := time.Now()
if len(g.Rules) < 1 {
g.metrics.iterationDuration.UpdateDuration(start)
g.LastEvaluation = start
return
}
resolveDuration := getResolveDuration(g.Interval, *resendDelay, *maxResolveDuration)
errs := e.execConcurrently(ctx, g.Rules, ts, g.Concurrency, resolveDuration)
for err := range errs {
if err != nil {
logger.Errorf("group %q: %s", g.Name, err)
}
}
g.metrics.iterationDuration.UpdateDuration(start)
g.LastEvaluation = start
}
eval(evalTS)
t := time.NewTicker(g.Interval)
defer t.Stop()
for {
@@ -274,32 +311,26 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
g.mu.Unlock()
logger.Infof("group %q re-started; interval=%v; concurrency=%d", g.Name, g.Interval, g.Concurrency)
case <-t.C:
g.metrics.iterationTotal.Inc()
iterationStart := time.Now()
if len(g.Rules) > 0 {
errs := e.execConcurrently(ctx, g.Rules, g.Concurrency, getResolveDuration(g.Interval))
for err := range errs {
if err != nil {
logger.Errorf("group %q: %s", g.Name, err)
}
}
g.LastEvaluation = iterationStart
missed := (time.Since(evalTS) / g.Interval) - 1
if missed > 0 {
g.metrics.iterationMissed.Inc()
}
g.metrics.iterationDuration.UpdateDuration(iterationStart)
evalTS = evalTS.Add((missed + 1) * g.Interval)
eval(evalTS)
}
}
}
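The ticker branch above advances `evalTS` by whole intervals instead of reading the wall clock, so evaluation timestamps stay on a fixed grid even when a tick arrives late. A small sketch of that arithmetic follows; the function name `nextEval`, the `start` variable and the sample durations are assumptions for illustration.

```go
package main

import (
	"fmt"
	"time"
)

var start = time.Unix(0, 0) // arbitrary fixed timestamp of the previous evaluation

// nextEval returns the timestamp for the next evaluation and the number of
// whole intervals that were skipped since prev, using the same formula as
// the ticker branch in the diff.
func nextEval(prev, now time.Time, interval time.Duration) (time.Time, int64) {
	// Integer division of durations counts the whole intervals elapsed.
	missed := now.Sub(prev)/interval - 1
	return prev.Add((missed + 1) * interval), int64(missed)
}

func main() {
	// The tick arrives 35s after the previous evaluation with a 10s interval.
	next, missed := nextEval(start, start.Add(35*time.Second), 10*time.Second)
	fmt.Println(next.Sub(start), missed) // 30s 2
}
```

In the commit, a positive `missed` value increments the new `vmalert_iteration_missed_total` counter once per late tick, not once per skipped interval.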
// getResolveDuration returns the duration after which firing alert
// can be considered as resolved.
func getResolveDuration(groupInterval time.Duration) time.Duration {
delta := *resendDelay
func getResolveDuration(groupInterval, delta, maxDuration time.Duration) time.Duration {
if groupInterval > delta {
delta = groupInterval
}
resolveDuration := delta * 4
if *maxResolveDuration > 0 && resolveDuration > *maxResolveDuration {
resolveDuration = *maxResolveDuration
if maxDuration > 0 && resolveDuration > maxDuration {
resolveDuration = maxDuration
}
return resolveDuration
}
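With its inputs passed as parameters, `getResolveDuration` becomes a pure function that can be checked in isolation. The sketch below restates it under the name `resolveDuration` (an assumption, as are the sample values) and prints a few worked cases.

```go
package main

import (
	"fmt"
	"time"
)

// resolveDuration returns four times the larger of the group interval and
// the resend delay, capped by maxDuration when maxDuration is positive.
func resolveDuration(groupInterval, delta, maxDuration time.Duration) time.Duration {
	if groupInterval > delta {
		delta = groupInterval
	}
	d := delta * 4
	if maxDuration > 0 && d > maxDuration {
		d = maxDuration
	}
	return d
}

func main() {
	fmt.Println(resolveDuration(time.Minute, 0, 0))             // 4m0s
	fmt.Println(resolveDuration(30*time.Second, time.Minute, 0)) // 4m0s
	fmt.Println(resolveDuration(time.Minute, 0, 2*time.Minute))  // 2m0s
}
```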
@@ -307,14 +338,21 @@ func getResolveDuration(groupInterval time.Duration) time.Duration {
type executor struct {
notifiers func() []notifier.Notifier
rw *remotewrite.Client
previouslySentSeriesToRWMu sync.Mutex
// previouslySentSeriesToRW stores series sent to RW on previous iteration
// map[ruleID]map[ruleLabels][]prompb.Label
// where `ruleID` is ID of the Rule within a Group
// and `ruleLabels` is []prompb.Label marshalled to a string
previouslySentSeriesToRW map[uint64]map[string][]prompbmarshal.Label
}
func (e *executor) execConcurrently(ctx context.Context, rules []Rule, concurrency int, resolveDuration time.Duration) chan error {
func (e *executor) execConcurrently(ctx context.Context, rules []Rule, ts time.Time, concurrency int, resolveDuration time.Duration) chan error {
res := make(chan error, len(rules))
if concurrency == 1 {
// fast path
for _, rule := range rules {
res <- e.exec(ctx, rule, resolveDuration)
res <- e.exec(ctx, rule, ts, resolveDuration)
}
close(res)
return res
@@ -327,7 +365,7 @@ func (e *executor) execConcurrently(ctx context.Context, rules []Rule, concurren
sem <- struct{}{}
wg.Add(1)
go func(r Rule) {
res <- e.exec(ctx, r, resolveDuration)
res <- e.exec(ctx, r, ts, resolveDuration)
<-sem
wg.Done()
}(rule)
@@ -348,24 +386,29 @@ var (
remoteWriteTotal = metrics.NewCounter(`vmalert_remotewrite_total`)
)
func (e *executor) exec(ctx context.Context, rule Rule, resolveDuration time.Duration) error {
func (e *executor) exec(ctx context.Context, rule Rule, ts time.Time, resolveDuration time.Duration) error {
execTotal.Inc()
now := time.Now()
tss, err := rule.Exec(ctx)
tss, err := rule.Exec(ctx, ts)
if err != nil {
execErrors.Inc()
return fmt.Errorf("rule %q: failed to execute: %w", rule, err)
}
if len(tss) > 0 && e.rw != nil {
for _, ts := range tss {
remoteWriteTotal.Inc()
if err := e.rw.Push(ts); err != nil {
remoteWriteErrors.Inc()
return fmt.Errorf("rule %q: remote write failure: %w", rule, err)
errGr := new(utils.ErrGroup)
if e.rw != nil {
pushToRW := func(tss []prompbmarshal.TimeSeries) {
for _, ts := range tss {
remoteWriteTotal.Inc()
if err := e.rw.Push(ts); err != nil {
remoteWriteErrors.Inc()
errGr.Add(fmt.Errorf("rule %q: remote write failure: %w", rule, err))
}
}
}
pushToRW(tss)
staleSeries := e.getStaleSeries(rule, tss, ts)
pushToRW(staleSeries)
}
ar, ok := rule.(*AlertingRule)
@@ -373,12 +416,11 @@ func (e *executor) exec(ctx context.Context, rule Rule, resolveDuration time.Dur
return nil
}
alerts := ar.alertsToSend(now, resolveDuration, *resendDelay)
alerts := ar.alertsToSend(ts, resolveDuration, *resendDelay)
if len(alerts) < 1 {
return nil
}
errGr := new(utils.ErrGroup)
for _, nt := range e.notifiers() {
if err := nt.Send(ctx, alerts); err != nil {
errGr.Add(fmt.Errorf("rule %q: failed to send alerts to addr %q: %w", rule, nt.Addr(), err))
@@ -386,3 +428,50 @@ func (e *executor) exec(ctx context.Context, rule Rule, resolveDuration time.Dur
}
return errGr.Err()
}
// getStaleSeries checks whether there are stale series from previously sent ones.
func (e *executor) getStaleSeries(rule Rule, tss []prompbmarshal.TimeSeries, timestamp time.Time) []prompbmarshal.TimeSeries {
ruleLabels := make(map[string][]prompbmarshal.Label, len(tss))
for _, ts := range tss {
// convert labels to strings so we can compare with previously sent series
key := labelsToString(ts.Labels)
ruleLabels[key] = ts.Labels
}
rID := rule.ID()
var staleS []prompbmarshal.TimeSeries
// check whether there are series which disappeared and need to be marked as stale
e.previouslySentSeriesToRWMu.Lock()
for key, labels := range e.previouslySentSeriesToRW[rID] {
if _, ok := ruleLabels[key]; ok {
continue
}
// previously sent series are missing in current series, so we mark them as stale
ss := newTimeSeriesPB([]float64{decimal.StaleNaN}, []int64{timestamp.Unix()}, labels)
staleS = append(staleS, ss)
}
// set previous series to current
e.previouslySentSeriesToRW[rID] = ruleLabels
e.previouslySentSeriesToRWMu.Unlock()
return staleS
}
func labelsToString(labels []prompbmarshal.Label) string {
var b strings.Builder
b.WriteRune('{')
for i, label := range labels {
if len(label.Name) == 0 {
b.WriteString("__name__")
} else {
b.WriteString(label.Name)
}
b.WriteRune('=')
b.WriteString(strconv.Quote(label.Value))
if i < len(labels)-1 {
b.WriteRune(',')
}
}
b.WriteRune('}')
return b.String()
}
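
The key-comparison idea behind the stale series detection above can be sketched in isolation. This is a minimal illustration, not the vmalert implementation: `Label` is a simplified stand-in for `prompbmarshal.Label`, and `staleKeys` is a hypothetical helper that only reports which previously seen keys are missing from the current evaluation.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Label is a simplified stand-in for prompbmarshal.Label (assumption).
type Label struct {
	Name  string
	Value string
}

// labelsToString renders labels as {name="value",...}, mirroring the helper in the diff.
// An empty label name is rendered as __name__.
func labelsToString(labels []Label) string {
	var b strings.Builder
	b.WriteRune('{')
	for i, l := range labels {
		if len(l.Name) == 0 {
			b.WriteString("__name__")
		} else {
			b.WriteString(l.Name)
		}
		b.WriteRune('=')
		b.WriteString(strconv.Quote(l.Value))
		if i < len(labels)-1 {
			b.WriteRune(',')
		}
	}
	b.WriteRune('}')
	return b.String()
}

// staleKeys (hypothetical) returns the sorted keys present in prev but missing from cur.
func staleKeys(prev, cur map[string][]Label) []string {
	var stale []string
	for k := range prev {
		if _, ok := cur[k]; !ok {
			stale = append(stale, k)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	foo := []Label{{Name: "__name__", Value: "job:foo"}, {Name: "job", Value: "foo"}}
	bar := []Label{{Name: "__name__", Value: "job:foo"}, {Name: "job", Value: "bar"}}
	// Previous evaluation produced both series; the current one produced only "bar".
	prev := map[string][]Label{labelsToString(foo): foo, labelsToString(bar): bar}
	cur := map[string][]Label{labelsToString(bar): bar}
	fmt.Println(staleKeys(prev, cur))
}
```

In the diff itself, each missing key is turned into a series carrying a single `decimal.StaleNaN` sample; the sketch stops at identifying the keys.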

View File

@@ -3,6 +3,9 @@ package main
import (
"context"
"fmt"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"reflect"
"sort"
"testing"
"time"
@@ -171,7 +174,7 @@ func TestGroupStart(t *testing.T) {
m2 := metricWithLabels(t, "instance", inst2, "job", job)
r := g.Rules[0].(*AlertingRule)
alert1, err := r.newAlert(m1, time.Now(), nil)
alert1, err := r.newAlert(m1, nil, time.Now(), nil)
if err != nil {
t.Fatalf("failed to create alert: %s", err)
}
@@ -184,13 +187,9 @@ func TestGroupStart(t *testing.T) {
// add service labels
alert1.Labels[alertNameLabel] = alert1.Name
alert1.Labels[alertGroupNameLabel] = g.Name
var labels1 []string
for k, v := range alert1.Labels {
labels1 = append(labels1, k, v)
}
alert1.ID = hash(metricWithLabels(t, labels1...))
alert1.ID = hash(alert1.Labels)
alert2, err := r.newAlert(m2, time.Now(), nil)
alert2, err := r.newAlert(m2, nil, time.Now(), nil)
if err != nil {
t.Fatalf("failed to create alert: %s", err)
}
@@ -203,11 +202,7 @@ func TestGroupStart(t *testing.T) {
// add service labels
alert2.Labels[alertNameLabel] = alert2.Name
alert2.Labels[alertGroupNameLabel] = g.Name
var labels2 []string
for k, v := range alert2.Labels {
labels2 = append(labels2, k, v)
}
alert2.ID = hash(metricWithLabels(t, labels2...))
alert2.ID = hash(alert2.Labels)
finished := make(chan struct{})
fs.add(m1)
@@ -239,7 +234,8 @@ func TestGroupStart(t *testing.T) {
time.Sleep(20 * evalInterval)
gotAlerts = fn.getAlerts()
expectedAlerts = []notifier.Alert{*alert1}
alert2.State = notifier.StateInactive
expectedAlerts = []notifier.Alert{*alert1, *alert2}
compareAlerts(t, expectedAlerts, gotAlerts)
g.close()
@@ -262,21 +258,100 @@ func TestResolveDuration(t *testing.T) {
{0, 0, 0, 0},
}
defaultResolveDuration := *maxResolveDuration
defaultResendDelay := *resendDelay
defer func() {
*maxResolveDuration = defaultResolveDuration
*resendDelay = defaultResendDelay
}()
for _, tc := range testCases {
t.Run(fmt.Sprintf("%v-%v-%v", tc.groupInterval, tc.expected, tc.maxDuration), func(t *testing.T) {
*maxResolveDuration = tc.maxDuration
*resendDelay = tc.resendDelay
got := getResolveDuration(tc.groupInterval)
got := getResolveDuration(tc.groupInterval, tc.resendDelay, tc.maxDuration)
if got != tc.expected {
t.Errorf("expected to have %v; got %v", tc.expected, got)
}
})
}
}
func TestGetStaleSeries(t *testing.T) {
ts := time.Now()
e := &executor{
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label),
}
f := func(rule Rule, labels, expLabels [][]prompbmarshal.Label) {
t.Helper()
var tss []prompbmarshal.TimeSeries
for _, l := range labels {
tss = append(tss, newTimeSeriesPB([]float64{1}, []int64{ts.Unix()}, l))
}
staleS := e.getStaleSeries(rule, tss, ts)
if staleS == nil && expLabels == nil {
return
}
if len(staleS) != len(expLabels) {
t.Fatalf("expected to get %d stale series, got %d",
len(expLabels), len(staleS))
}
for i, exp := range expLabels {
got := staleS[i]
if !reflect.DeepEqual(exp, got.Labels) {
t.Fatalf("expected to get labels: \n%v;\ngot instead: \n%v",
exp, got.Labels)
}
if len(got.Samples) != 1 {
t.Fatalf("expected to have 1 sample; got %d", len(got.Samples))
}
if !decimal.IsStaleNaN(got.Samples[0].Value) {
t.Fatalf("expected sample value to be %v; got %v", decimal.StaleNaN, got.Samples[0].Value)
}
}
}
// warning: the executor holds state, so the order of f calls matters
// single series
f(&AlertingRule{RuleID: 1},
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "foo")},
nil)
f(&AlertingRule{RuleID: 1},
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "foo")},
nil)
f(&AlertingRule{RuleID: 1},
nil,
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "foo")})
f(&AlertingRule{RuleID: 1},
nil,
nil)
// multiple series
f(&AlertingRule{RuleID: 1},
[][]prompbmarshal.Label{
toPromLabels(t, "__name__", "job:foo", "job", "foo"),
toPromLabels(t, "__name__", "job:foo", "job", "bar"),
},
nil)
f(&AlertingRule{RuleID: 1},
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "bar")},
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "foo")})
f(&AlertingRule{RuleID: 1},
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "bar")},
nil)
f(&AlertingRule{RuleID: 1},
nil,
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "bar")})
// multiple rules and series
f(&AlertingRule{RuleID: 1},
[][]prompbmarshal.Label{
toPromLabels(t, "__name__", "job:foo", "job", "foo"),
toPromLabels(t, "__name__", "job:foo", "job", "bar"),
},
nil)
f(&AlertingRule{RuleID: 2},
[][]prompbmarshal.Label{
toPromLabels(t, "__name__", "job:foo", "job", "foo"),
toPromLabels(t, "__name__", "job:foo", "job", "bar"),
},
nil)
f(&AlertingRule{RuleID: 1},
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "bar")},
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "foo")})
f(&AlertingRule{RuleID: 1},
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "bar")},
nil)
}

View File

@@ -44,10 +44,10 @@ func (fq *fakeQuerier) BuildWithParams(_ datasource.QuerierParams) datasource.Qu
}
func (fq *fakeQuerier) QueryRange(ctx context.Context, q string, _, _ time.Time) ([]datasource.Metric, error) {
return fq.Query(ctx, q)
return fq.Query(ctx, q, time.Now())
}
func (fq *fakeQuerier) Query(_ context.Context, _ string) ([]datasource.Metric, error) {
func (fq *fakeQuerier) Query(_ context.Context, _ string, _ time.Time) ([]datasource.Metric, error) {
fq.Lock()
defer fq.Unlock()
if fq.err != nil {
@@ -116,6 +116,21 @@ func metricWithLabels(t *testing.T, labels ...string) datasource.Metric {
return m
}
func toPromLabels(t *testing.T, labels ...string) []prompbmarshal.Label {
t.Helper()
if len(labels) == 0 || len(labels)%2 != 0 {
t.Fatalf("expected to get even number of labels")
}
var ls []prompbmarshal.Label
for i := 0; i < len(labels); i += 2 {
ls = append(ls, prompbmarshal.Label{
Name: labels[i],
Value: labels[i+1],
})
}
return ls
}
func compareGroups(t *testing.T, a, b *Group) {
t.Helper()
if a.Name != b.Name {

View File

@@ -243,7 +243,7 @@ func getAlertURLGenerator(externalURL *url.URL, externalAlertSource string, vali
"tpl": externalAlertSource,
}
return func(alert notifier.Alert) string {
templated, err := alert.ExecTemplate(nil, m)
templated, err := alert.ExecTemplate(nil, nil, m)
if err != nil {
logger.Errorf("can not exec source template %s", err)
}

View File

@@ -37,7 +37,7 @@ func (m *manager) AlertAPI(gID, aID uint64) (*APIAlert, error) {
g, ok := m.groups[gID]
if !ok {
return nil, fmt.Errorf("can't find group with id %q", gID)
return nil, fmt.Errorf("can't find group with id %d", gID)
}
for _, rule := range g.Rules {
ar, ok := rule.(*AlertingRule)
@@ -48,7 +48,7 @@ func (m *manager) AlertAPI(gID, aID uint64) (*APIAlert, error) {
return apiAlert, nil
}
}
return nil, fmt.Errorf("can't find alert with id %q in group %q", aID, g.Name)
return nil, fmt.Errorf("can't find alert with id %d in group %q", aID, g.Name)
}
func (m *manager) start(ctx context.Context, groupsCfg []config.Group) error {

View File

@@ -9,6 +9,8 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
)
// Alert the triggered alert
@@ -26,10 +28,14 @@ type Alert struct {
State AlertState
// Expr contains expression that was executed to generate the Alert
Expr string
// Start defines the moment of time when Alert has triggered
// ActiveAt defines the moment of time when Alert has become active
ActiveAt time.Time
// Start defines the moment of time when Alert has become firing
Start time.Time
// End defines the moment of time when Alert is supposed to expire
End time.Time
// ResolvedAt defines the moment when Alert was switched from Firing to Inactive
ResolvedAt time.Time
// LastSent defines the moment when Alert was sent last time
LastSent time.Time
// Value stores the value returned from evaluating expression from Expr field
@@ -84,8 +90,8 @@ var tplHeaders = []string{
// map of annotations.
// Every alert could have a different datasource, so function
// requires a queryFunction as an argument.
func (a *Alert) ExecTemplate(q QueryFn, annotations map[string]string) (map[string]string, error) {
tplData := AlertTplData{Value: a.Value, Labels: a.Labels, Expr: a.Expr}
func (a *Alert) ExecTemplate(q QueryFn, labels, annotations map[string]string) (map[string]string, error) {
tplData := AlertTplData{Value: a.Value, Labels: labels, Expr: a.Expr}
return templateAnnotations(annotations, tplData, funcsWithQuery(q))
}
@@ -143,3 +149,18 @@ func templateAnnotation(dst io.Writer, text string, data tplData, funcs template
}
return nil
}
func (a Alert) toPromLabels(relabelCfg *promrelabel.ParsedConfigs) []prompbmarshal.Label {
var labels []prompbmarshal.Label
for k, v := range a.Labels {
labels = append(labels, prompbmarshal.Label{
Name: k,
Value: v,
})
}
promrelabel.SortLabels(labels)
if relabelCfg != nil {
return relabelCfg.Apply(labels, 0, false)
}
return labels
}

View File

@@ -2,9 +2,12 @@ package notifier
import (
"fmt"
"reflect"
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
)
func TestAlert_ExecTemplate(t *testing.T) {
@@ -130,7 +133,7 @@ func TestAlert_ExecTemplate(t *testing.T) {
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
tpl, err := tc.alert.ExecTemplate(qFn, tc.annotations)
tpl, err := tc.alert.ExecTemplate(qFn, tc.alert.Labels, tc.annotations)
if err != nil {
t.Fatal(err)
}
@@ -146,3 +149,48 @@ func TestAlert_ExecTemplate(t *testing.T) {
})
}
}
func TestAlert_toPromLabels(t *testing.T) {
fn := func(labels map[string]string, exp []prompbmarshal.Label, relabel *promrelabel.ParsedConfigs) {
t.Helper()
a := Alert{Labels: labels}
got := a.toPromLabels(relabel)
if !reflect.DeepEqual(got, exp) {
t.Fatalf("expected to have: \n%v;\ngot:\n%v",
exp, got)
}
}
fn(nil, nil, nil)
fn(
map[string]string{"foo": "bar", "a": "baz"}, // unsorted
[]prompbmarshal.Label{{Name: "a", Value: "baz"}, {Name: "foo", Value: "bar"}},
nil,
)
pcs, err := promrelabel.ParseRelabelConfigsData([]byte(`
- target_label: "foo"
replacement: "aaa"
- action: labeldrop
regex: "env.*"
`), false)
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
fn(
map[string]string{"a": "baz"},
[]prompbmarshal.Label{{Name: "a", Value: "baz"}, {Name: "foo", Value: "aaa"}},
pcs,
)
fn(
map[string]string{"foo": "bar", "a": "baz"},
[]prompbmarshal.Label{{Name: "a", Value: "baz"}, {Name: "foo", Value: "aaa"}},
pcs,
)
fn(
map[string]string{"qux": "bar", "env": "prod", "environment": "production"},
[]prompbmarshal.Label{{Name: "foo", Value: "aaa"}, {Name: "qux", Value: "bar"}},
pcs,
)
}
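
Go map iteration order is randomized, which is why `toPromLabels` sorts the labels before use. The following is a minimal sketch of that map-to-sorted-slice step only, under stated assumptions: `Label` stands in for `prompbmarshal.Label`, `sortedLabels` is a hypothetical name, sorting is by label name, and the relabeling step is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// Label is a simplified stand-in for prompbmarshal.Label (assumption).
type Label struct {
	Name  string
	Value string
}

// sortedLabels (hypothetical) converts a label map into a slice ordered by name,
// so the result is deterministic regardless of map iteration order.
// A nil or empty map yields a nil slice.
func sortedLabels(m map[string]string) []Label {
	var labels []Label
	for k, v := range m {
		labels = append(labels, Label{Name: k, Value: v})
	}
	sort.Slice(labels, func(i, j int) bool {
		return labels[i].Name < labels[j].Name
	})
	return labels
}

func main() {
	fmt.Println(sortedLabels(map[string]string{"foo": "bar", "a": "baz"}))
}
```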

View File

@@ -11,6 +11,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
)
// AlertManager represents integration provider with Prometheus alert manager
@@ -22,6 +23,8 @@ type AlertManager struct {
timeout time.Duration
authCfg *promauth.Config
// stores already parsed RelabelConfigs object
relabelConfigs *promrelabel.ParsedConfigs
metrics *metrics
}
@@ -59,7 +62,7 @@ func (am *AlertManager) Send(ctx context.Context, alerts []Alert) error {
func (am *AlertManager) send(ctx context.Context, alerts []Alert) error {
b := &bytes.Buffer{}
writeamRequest(b, alerts, am.argFunc)
writeamRequest(b, alerts, am.argFunc, am.relabelConfigs)
req, err := http.NewRequest("POST", am.addr, b)
if err != nil {
@@ -103,7 +106,8 @@ type AlertURLGenerator func(Alert) string
const alertManagerPath = "/api/v2/alerts"
// NewAlertManager is a constructor for AlertManager
func NewAlertManager(alertManagerURL string, fn AlertURLGenerator, authCfg promauth.HTTPClientConfig, timeout time.Duration) (*AlertManager, error) {
func NewAlertManager(alertManagerURL string, fn AlertURLGenerator, authCfg promauth.HTTPClientConfig,
relabelCfg *promrelabel.ParsedConfigs, timeout time.Duration) (*AlertManager, error) {
tls := &promauth.TLSConfig{}
if authCfg.TLSConfig != nil {
tls = authCfg.TLSConfig
@@ -131,11 +135,12 @@ func NewAlertManager(alertManagerURL string, fn AlertURLGenerator, authCfg proma
}
return &AlertManager{
addr: alertManagerURL,
argFunc: fn,
authCfg: aCfg,
client: &http.Client{Transport: tr},
timeout: timeout,
metrics: newMetrics(alertManagerURL),
addr: alertManagerURL,
argFunc: fn,
authCfg: aCfg,
relabelConfigs: relabelCfg,
client: &http.Client{Transport: tr},
timeout: timeout,
metrics: newMetrics(alertManagerURL),
}, nil
}

View File

@@ -1,9 +1,11 @@
{% import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
) %}
{% stripspace %}
{% func amRequest(alerts []Alert, generatorURL func(Alert) string) %}
{% func amRequest(alerts []Alert, generatorURL func(Alert) string, relabelCfg *promrelabel.ParsedConfigs) %}
[
{% for i, alert := range alerts %}
{
@@ -14,8 +16,9 @@
{% endif %}
"labels": {
"alertname":{%q= alert.Name %}
{% for k,v := range alert.Labels %}
,{%q= k %}:{%q= v %}
{% code lbls := alert.toPromLabels(relabelCfg) %}
{% for _, l := range lbls %}
,{%q= l.Name %}:{%q= l.Value %}
{% endfor %}
},
"annotations": {

View File

@@ -7,124 +7,129 @@ package notifier
//line app/vmalert/notifier/alertmanager_request.qtpl:1
import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
)
//line app/vmalert/notifier/alertmanager_request.qtpl:6
//line app/vmalert/notifier/alertmanager_request.qtpl:8
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmalert/notifier/alertmanager_request.qtpl:6
//line app/vmalert/notifier/alertmanager_request.qtpl:8
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmalert/notifier/alertmanager_request.qtpl:6
func streamamRequest(qw422016 *qt422016.Writer, alerts []Alert, generatorURL func(Alert) string) {
//line app/vmalert/notifier/alertmanager_request.qtpl:6
//line app/vmalert/notifier/alertmanager_request.qtpl:8
func streamamRequest(qw422016 *qt422016.Writer, alerts []Alert, generatorURL func(Alert) string, relabelCfg *promrelabel.ParsedConfigs) {
//line app/vmalert/notifier/alertmanager_request.qtpl:8
qw422016.N().S(`[`)
//line app/vmalert/notifier/alertmanager_request.qtpl:8
//line app/vmalert/notifier/alertmanager_request.qtpl:10
for i, alert := range alerts {
//line app/vmalert/notifier/alertmanager_request.qtpl:8
//line app/vmalert/notifier/alertmanager_request.qtpl:10
qw422016.N().S(`{"startsAt":`)
//line app/vmalert/notifier/alertmanager_request.qtpl:10
//line app/vmalert/notifier/alertmanager_request.qtpl:12
qw422016.N().Q(alert.Start.Format(time.RFC3339Nano))
//line app/vmalert/notifier/alertmanager_request.qtpl:10
//line app/vmalert/notifier/alertmanager_request.qtpl:12
qw422016.N().S(`,"generatorURL":`)
//line app/vmalert/notifier/alertmanager_request.qtpl:11
//line app/vmalert/notifier/alertmanager_request.qtpl:13
qw422016.N().Q(generatorURL(alert))
//line app/vmalert/notifier/alertmanager_request.qtpl:11
//line app/vmalert/notifier/alertmanager_request.qtpl:13
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:12
//line app/vmalert/notifier/alertmanager_request.qtpl:14
if !alert.End.IsZero() {
//line app/vmalert/notifier/alertmanager_request.qtpl:12
//line app/vmalert/notifier/alertmanager_request.qtpl:14
qw422016.N().S(`"endsAt":`)
//line app/vmalert/notifier/alertmanager_request.qtpl:13
//line app/vmalert/notifier/alertmanager_request.qtpl:15
qw422016.N().Q(alert.End.Format(time.RFC3339Nano))
//line app/vmalert/notifier/alertmanager_request.qtpl:13
//line app/vmalert/notifier/alertmanager_request.qtpl:15
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:14
}
//line app/vmalert/notifier/alertmanager_request.qtpl:14
qw422016.N().S(`"labels": {"alertname":`)
//line app/vmalert/notifier/alertmanager_request.qtpl:16
qw422016.N().Q(alert.Name)
//line app/vmalert/notifier/alertmanager_request.qtpl:17
for k, v := range alert.Labels {
//line app/vmalert/notifier/alertmanager_request.qtpl:17
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:18
qw422016.N().Q(k)
//line app/vmalert/notifier/alertmanager_request.qtpl:18
qw422016.N().S(`:`)
//line app/vmalert/notifier/alertmanager_request.qtpl:18
qw422016.N().Q(v)
//line app/vmalert/notifier/alertmanager_request.qtpl:19
}
//line app/vmalert/notifier/alertmanager_request.qtpl:16
qw422016.N().S(`"labels": {"alertname":`)
//line app/vmalert/notifier/alertmanager_request.qtpl:18
qw422016.N().Q(alert.Name)
//line app/vmalert/notifier/alertmanager_request.qtpl:19
qw422016.N().S(`},"annotations": {`)
lbls := alert.toPromLabels(relabelCfg)
//line app/vmalert/notifier/alertmanager_request.qtpl:20
for _, l := range lbls {
//line app/vmalert/notifier/alertmanager_request.qtpl:20
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().Q(l.Name)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().S(`:`)
//line app/vmalert/notifier/alertmanager_request.qtpl:21
qw422016.N().Q(l.Value)
//line app/vmalert/notifier/alertmanager_request.qtpl:22
}
//line app/vmalert/notifier/alertmanager_request.qtpl:22
qw422016.N().S(`},"annotations": {`)
//line app/vmalert/notifier/alertmanager_request.qtpl:25
c := len(alert.Annotations)
//line app/vmalert/notifier/alertmanager_request.qtpl:23
//line app/vmalert/notifier/alertmanager_request.qtpl:26
for k, v := range alert.Annotations {
//line app/vmalert/notifier/alertmanager_request.qtpl:24
//line app/vmalert/notifier/alertmanager_request.qtpl:27
c = c - 1
//line app/vmalert/notifier/alertmanager_request.qtpl:25
//line app/vmalert/notifier/alertmanager_request.qtpl:28
qw422016.N().Q(k)
//line app/vmalert/notifier/alertmanager_request.qtpl:25
//line app/vmalert/notifier/alertmanager_request.qtpl:28
qw422016.N().S(`:`)
//line app/vmalert/notifier/alertmanager_request.qtpl:25
//line app/vmalert/notifier/alertmanager_request.qtpl:28
qw422016.N().Q(v)
//line app/vmalert/notifier/alertmanager_request.qtpl:25
//line app/vmalert/notifier/alertmanager_request.qtpl:28
if c > 0 {
//line app/vmalert/notifier/alertmanager_request.qtpl:25
//line app/vmalert/notifier/alertmanager_request.qtpl:28
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:25
//line app/vmalert/notifier/alertmanager_request.qtpl:28
}
//line app/vmalert/notifier/alertmanager_request.qtpl:26
//line app/vmalert/notifier/alertmanager_request.qtpl:29
}
//line app/vmalert/notifier/alertmanager_request.qtpl:26
//line app/vmalert/notifier/alertmanager_request.qtpl:29
qw422016.N().S(`}}`)
//line app/vmalert/notifier/alertmanager_request.qtpl:29
//line app/vmalert/notifier/alertmanager_request.qtpl:32
if i != len(alerts)-1 {
//line app/vmalert/notifier/alertmanager_request.qtpl:29
//line app/vmalert/notifier/alertmanager_request.qtpl:32
qw422016.N().S(`,`)
//line app/vmalert/notifier/alertmanager_request.qtpl:29
//line app/vmalert/notifier/alertmanager_request.qtpl:32
}
//line app/vmalert/notifier/alertmanager_request.qtpl:30
//line app/vmalert/notifier/alertmanager_request.qtpl:33
}
//line app/vmalert/notifier/alertmanager_request.qtpl:30
//line app/vmalert/notifier/alertmanager_request.qtpl:33
qw422016.N().S(`]`)
//line app/vmalert/notifier/alertmanager_request.qtpl:32
//line app/vmalert/notifier/alertmanager_request.qtpl:35
}
//line app/vmalert/notifier/alertmanager_request.qtpl:32
func writeamRequest(qq422016 qtio422016.Writer, alerts []Alert, generatorURL func(Alert) string) {
//line app/vmalert/notifier/alertmanager_request.qtpl:32
//line app/vmalert/notifier/alertmanager_request.qtpl:35
func writeamRequest(qq422016 qtio422016.Writer, alerts []Alert, generatorURL func(Alert) string, relabelCfg *promrelabel.ParsedConfigs) {
//line app/vmalert/notifier/alertmanager_request.qtpl:35
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/notifier/alertmanager_request.qtpl:32
streamamRequest(qw422016, alerts, generatorURL)
//line app/vmalert/notifier/alertmanager_request.qtpl:32
//line app/vmalert/notifier/alertmanager_request.qtpl:35
streamamRequest(qw422016, alerts, generatorURL, relabelCfg)
//line app/vmalert/notifier/alertmanager_request.qtpl:35
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/notifier/alertmanager_request.qtpl:32
//line app/vmalert/notifier/alertmanager_request.qtpl:35
}
//line app/vmalert/notifier/alertmanager_request.qtpl:32
func amRequest(alerts []Alert, generatorURL func(Alert) string) string {
//line app/vmalert/notifier/alertmanager_request.qtpl:32
//line app/vmalert/notifier/alertmanager_request.qtpl:35
func amRequest(alerts []Alert, generatorURL func(Alert) string, relabelCfg *promrelabel.ParsedConfigs) string {
//line app/vmalert/notifier/alertmanager_request.qtpl:35
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/notifier/alertmanager_request.qtpl:32
writeamRequest(qb422016, alerts, generatorURL)
//line app/vmalert/notifier/alertmanager_request.qtpl:32
//line app/vmalert/notifier/alertmanager_request.qtpl:35
writeamRequest(qb422016, alerts, generatorURL, relabelCfg)
//line app/vmalert/notifier/alertmanager_request.qtpl:35
qs422016 := string(qb422016.B)
//line app/vmalert/notifier/alertmanager_request.qtpl:32
//line app/vmalert/notifier/alertmanager_request.qtpl:35
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/notifier/alertmanager_request.qtpl:32
//line app/vmalert/notifier/alertmanager_request.qtpl:35
return qs422016
//line app/vmalert/notifier/alertmanager_request.qtpl:32
//line app/vmalert/notifier/alertmanager_request.qtpl:35
}

View File

@@ -14,7 +14,7 @@ import (
func TestAlertManager_Addr(t *testing.T) {
const addr = "http://localhost"
am, err := NewAlertManager(addr, nil, promauth.HTTPClientConfig{}, 0)
am, err := NewAlertManager(addr, nil, promauth.HTTPClientConfig{}, nil, 0)
if err != nil {
t.Errorf("unexpected error: %s", err)
}
@@ -89,7 +89,7 @@ func TestAlertManager_Send(t *testing.T) {
}
am, err := NewAlertManager(srv.URL+alertManagerPath, func(alert Alert) string {
return strconv.FormatUint(alert.GroupID, 10) + "/" + strconv.FormatUint(alert.ID, 10)
}, aCfg, 0)
}, aCfg, nil, 0)
if err != nil {
t.Errorf("unexpected error: %s", err)
}

View File

@@ -34,9 +34,10 @@ type Config struct {
// HTTPClientConfig contains HTTP configuration for Notifier clients
HTTPClientConfig promauth.HTTPClientConfig `yaml:",inline"`
// RelabelConfigs contains list of relabeling rules
// RelabelConfigs contains list of relabeling rules for entities discovered via SD
RelabelConfigs []promrelabel.RelabelConfig `yaml:"relabel_configs,omitempty"`
// AlertRelabelConfigs contains list of relabeling rules for alert labels
AlertRelabelConfigs []promrelabel.RelabelConfig `yaml:"alert_relabel_configs,omitempty"`
// The timeout used when sending alerts.
Timeout promutils.Duration `yaml:"timeout,omitempty"`
@@ -52,6 +53,8 @@ type Config struct {
// stores already parsed RelabelConfigs object
parsedRelabelConfigs *promrelabel.ParsedConfigs
// stores already parsed AlertRelabelConfigs object
parsedAlertRelabelConfigs *promrelabel.ParsedConfigs
}
// StaticConfig contains list of static targets in the following form:
@@ -78,6 +81,11 @@ func (cfg *Config) UnmarshalYAML(unmarshal func(interface{}) error) error {
return fmt.Errorf("failed to parse relabeling config: %w", err)
}
cfg.parsedRelabelConfigs = rCfg
arCfg, err := promrelabel.ParseRelabelConfigs(cfg.AlertRelabelConfigs, false)
if err != nil {
return fmt.Errorf("failed to parse alert relabeling config: %w", err)
}
cfg.parsedAlertRelabelConfigs = arCfg
b, err := yaml.Marshal(cfg)
if err != nil {

View File

@@ -141,7 +141,7 @@ func targetsFromLabels(labelsFn getLabels, cfg *Config, genFn AlertURLGenerator)
}
duplicates[u] = struct{}{}
am, err := NewAlertManager(u, genFn, cfg.HTTPClientConfig, cfg.Timeout.Duration())
am, err := NewAlertManager(u, genFn, cfg.HTTPClientConfig, cfg.parsedAlertRelabelConfigs, cfg.Timeout.Duration())
if err != nil {
errors = append(errors, err)
continue
@@ -165,7 +165,7 @@ func (cw *configWatcher) start() error {
if err != nil {
return fmt.Errorf("failed to parse labels for target %q: %s", target, err)
}
notifier, err := NewAlertManager(address, cw.genFn, cw.cfg.HTTPClientConfig, cw.cfg.Timeout.Duration())
notifier, err := NewAlertManager(address, cw.genFn, cw.cfg.HTTPClientConfig, cw.cfg.parsedAlertRelabelConfigs, cw.cfg.Timeout.Duration())
if err != nil {
return fmt.Errorf("failed to init alertmanager for addr %q: %s", address, err)
}

View File

@@ -138,7 +138,7 @@ func notifiersFromFlags(gen AlertURLGenerator) ([]Notifier, error) {
}
addr = strings.TrimSuffix(addr, "/")
am, err := NewAlertManager(addr+alertManagerPath, gen, authCfg, time.Minute)
am, err := NewAlertManager(addr+alertManagerPath, gen, authCfg, nil, time.Minute)
if err != nil {
return nil, err
}

View File

@@ -10,4 +10,7 @@ relabel_configs:
- source_labels: [__meta_consul_tags]
regex: .*,__scheme__=([^,]+),.*
replacement: '${1}'
target_label: __scheme__
target_label: __scheme__
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"

View File

@@ -2,3 +2,6 @@ static_configs:
- targets:
- localhost:9093
- localhost:9095
alert_relabel_configs:
- target_label: "foo"
replacement: "aaa"

View File

@@ -124,14 +124,13 @@ func (rr *RecordingRule) ExecRange(ctx context.Context, start, end time.Time) ([
}
// Exec executes RecordingRule expression via the given Querier.
func (rr *RecordingRule) Exec(ctx context.Context) ([]prompbmarshal.TimeSeries, error) {
start := time.Now()
qMetrics, err := rr.q.Query(ctx, rr.Expr)
func (rr *RecordingRule) Exec(ctx context.Context, ts time.Time) ([]prompbmarshal.TimeSeries, error) {
qMetrics, err := rr.q.Query(ctx, rr.Expr, ts)
rr.mu.Lock()
defer rr.mu.Unlock()
rr.lastExecTime = start
rr.lastExecDuration = time.Since(start)
rr.lastExecTime = ts
rr.lastExecDuration = time.Since(ts)
rr.lastExecError = err
rr.lastExecSamples = len(qMetrics)
if err != nil {

View File

@@ -77,7 +77,7 @@ func TestRecoridngRule_Exec(t *testing.T) {
fq := &fakeQuerier{}
fq.add(tc.metrics...)
tc.rule.q = fq
tss, err := tc.rule.Exec(context.TODO())
tss, err := tc.rule.Exec(context.TODO(), time.Now())
if err != nil {
t.Fatalf("unexpected Exec err: %s", err)
}
@@ -178,7 +178,7 @@ func TestRecoridngRule_ExecNegative(t *testing.T) {
expErr := "connection reset by peer"
fq.setErr(errors.New(expErr))
rr.q = fq
_, err := rr.Exec(context.TODO())
_, err := rr.Exec(context.TODO(), time.Now())
if err == nil {
t.Fatalf("expected to get err; got nil")
}
@@ -193,7 +193,7 @@ func TestRecoridngRule_ExecNegative(t *testing.T) {
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "foo"))
fq.add(metricWithValueAndLabels(t, 2, "__name__", "foo", "job", "bar"))
_, err = rr.Exec(context.TODO())
_, err = rr.Exec(context.TODO(), time.Now())
if err == nil {
t.Fatalf("expected to get err; got nil")
}

View File

@@ -225,7 +225,7 @@ func (c *Client) flush(ctx context.Context, wr *prompbmarshal.WriteRequest) {
droppedRows.Add(len(wr.Timeseries))
droppedBytes.Add(len(b))
logger.Errorf("all %d attempts to send request failed - dropping %d timeseries",
logger.Errorf("all %d attempts to send request failed - dropping %d time series",
attempts, len(wr.Timeseries))
}

View File

@@ -3,8 +3,9 @@ package main
import (
"context"
"errors"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
// Rule represents alerting or recording rule
@@ -14,8 +15,8 @@ type Rule interface {
// ID returns unique ID that may be used for
// identifying this Rule among others.
ID() uint64
// Exec executes the rule with given context
Exec(ctx context.Context) ([]prompbmarshal.TimeSeries, error)
// Exec executes the rule with given context at the given timestamp
Exec(ctx context.Context, ts time.Time) ([]prompbmarshal.TimeSeries, error)
// ExecRange executes the rule on the given time range
ExecRange(ctx context.Context, start, end time.Time) ([]prompbmarshal.TimeSeries, error)
// UpdateWith performs modification of current Rule

View File

@@ -30,3 +30,20 @@ func newTimeSeries(values []float64, timestamps []int64, labels map[string]strin
}
return ts
}
// newTimeSeriesPB creates prompbmarshal.TimeSeries with given
// values, timestamps and labels.
// It expects that labels are already sorted.
func newTimeSeriesPB(values []float64, timestamps []int64, labels []prompbmarshal.Label) prompbmarshal.TimeSeries {
ts := prompbmarshal.TimeSeries{
Samples: make([]prompbmarshal.Sample, len(values)),
}
for i := range values {
ts.Samples[i] = prompbmarshal.Sample{
Value: values[i],
Timestamp: time.Unix(timestamps[i], 0).UnixNano() / 1e6,
}
}
ts.Labels = labels
return ts
}
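
`newTimeSeriesPB` takes timestamps in Unix seconds and stores them in milliseconds. A minimal sketch of that conversion, with `secondsToMillis` as a hypothetical helper name used only for illustration:

```go
package main

import (
	"fmt"
	"time"
)

// secondsToMillis (hypothetical) converts Unix seconds to Unix milliseconds
// using the same expression as newTimeSeriesPB in the diff.
func secondsToMillis(sec int64) int64 {
	return time.Unix(sec, 0).UnixNano() / 1e6
}

func main() {
	now := time.Now()
	// Sub-second precision is dropped because the input is whole seconds.
	fmt.Println(now.Unix(), secondsToMillis(now.Unix()))
}
```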

View File

@@ -14,7 +14,7 @@ func TestHandler(t *testing.T) {
ar := &AlertingRule{
Name: "alert",
alerts: map[uint64]*notifier.Alert{
0: {},
0: {State: notifier.StateFiring},
},
}
g := &Group{

View File

@@ -10,7 +10,7 @@ The `-auth.config` can point to either local file or to http url.
Just download `vmutils-*` archive from [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases), unpack it
and pass the following flag to `vmauth` binary in order to start authorizing and routing requests:
```
```bash
/path/to/vmauth -auth.config=/path/to/auth/config.yml
```
@@ -129,13 +129,13 @@ It is expected that all the backend services protected by `vmauth` are located i
Do not transfer Basic Auth headers in plaintext over untrusted networks. Enable https. This can be done by passing the following `-tls*` command-line flags to `vmauth`:
```
```bash
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs, since RSA certs are slow
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs, since RSA certs are slow
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set
Path to file with TLS key. Used only if -tls is set
```
Alternatively, [https termination proxy](https://en.wikipedia.org/wiki/TLS_termination_proxy) may be put in front of `vmauth`.
@@ -217,7 +217,7 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g
Pass `-help` command-line arg to `vmauth` in order to see all the configuration options:
```
```bash
./vmauth -help
vmauth authenticates and authorizes incoming requests and proxies them to VictoriaMetrics.
@@ -225,70 +225,70 @@ vmauth authenticates and authorizes incoming requests and proxies them to Victor
See the docs at https://docs.victoriametrics.com/vmauth.html .
-auth.config string
Path to auth config. It can point either to local file or to http url. See https://docs.victoriametrics.com/vmauth.html for details on the format of this auth config
Path to auth config. It can point either to local file or to http url. See https://docs.victoriametrics.com/vmauth.html for details on the format of this auth config
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address to listen for http connections (default ":8427")
TCP address to listen for http connections (default ":8427")
-logInvalidAuthTokens
Whether to log requests with invalid auth tokens. Such requests are always counted at vmauth_http_request_errors_total{reason="invalid_auth_token"} metric, which is exposed at /metrics page
Whether to log requests with invalid auth tokens. Such requests are always counted at vmauth_http_request_errors_total{reason="invalid_auth_token"} metric, which is exposed at /metrics page
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxIdleConnsPerBackend int
The maximum number of idle connections vmauth can open per each backend host (default 100)
The maximum number of idle connections vmauth can open per each backend host (default 100)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
-reloadAuthKey string
Auth key for /-/reload http endpoint. It must be passed as authKey=...
Auth key for /-/reload http endpoint. It must be passed as authKey=...
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-version
Show VictoriaMetrics version
Show VictoriaMetrics version
```


@@ -22,14 +22,13 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
See also [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html) tool built on top of `vmbackup`. This tool simplifies
creation of hourly, daily, weekly and monthly backups.
## Use cases
### Regular backups
Regular backup can be performed with the following command:
```
```bash
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gs://<bucket>/<path/to/new/backup>
```
@@ -39,36 +38,33 @@ vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-
* `<bucket>` is an already existing name for [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets).
* `<path/to/new/backup>` is the destination path where new backup will be placed.
### Regular backups with server-side copy from existing backup
If the destination GCS bucket already contains the previous backup at `-origin` path, then new backup can be sped up
with the following command:
```
```bash
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gs://<bucket>/<path/to/new/backup> -origin=gs://<bucket>/<path/to/existing/backup>
```
It saves time and network bandwidth costs by performing server-side copy for the shared data from the `-origin` to `-dst`.
### Incremental backups
Incremental backups are performed if `-dst` points to an already existing backup. In this case only new data is uploaded to remote storage.
It saves time and network bandwidth costs when working with big backups:
```
```bash
vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshotName=<local-snapshot> -dst=gs://<bucket>/<path/to/existing/backup>
```
### Smart backups
Smart backups mean storing full daily backups into `YYYYMMDD` folders and creating incremental hourly backup into `latest` folder:
* Run the following command every hour:
```
```bash
vmbackup -snapshotName=<latest-snapshot> -dst=gs://<bucket>/latest
```
@@ -77,13 +73,12 @@ The command will upload only changed data to `gs://<bucket>/latest`.
* Run the following command once a day:
```
```bash
vmbackup -snapshotName=<daily-snapshot> -dst=gs://<bucket>/<YYYYMMDD> -origin=gs://<bucket>/latest
```
Where `<daily-snapshot>` is the snapshot for the last day `<YYYYMMDD>`.
This approach saves network bandwidth costs on hourly backups (since they are incremental) and allows recovering data from either the last hour (`latest` backup)
or from any day (`YYYYMMDD` backups). Note that hourly backup shouldn't run when creating daily backup.
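The hourly and daily steps above can be sketched as two small shell helpers that only print the corresponding `vmbackup` command lines, so they can be reviewed before being wired into cron or another scheduler. The function names, the bucket name and the snapshot names are assumptions made for illustration; the flags are the ones used in the commands above.

```bash
# Hypothetical helpers for the "smart backups" schedule.
# They print the commands instead of running them (dry run).

# Hourly incremental backup into the `latest` folder.
# $1 - bucket name, $2 - snapshot name.
hourly_cmd() {
  printf 'vmbackup -snapshotName=%s -dst=gs://%s/latest\n' "$2" "$1"
}

# Daily full backup into a YYYYMMDD folder, using `latest` as -origin
# so that shared data is copied server-side.
# $1 - bucket name, $2 - snapshot name, $3 - day in YYYYMMDD format.
daily_cmd() {
  printf 'vmbackup -snapshotName=%s -dst=gs://%s/%s -origin=gs://%s/latest\n' "$2" "$1" "$3" "$1"
}

# Placeholder values; replace with a real bucket and real snapshot names.
hourly_cmd my-bucket latest-snapshot
daily_cmd my-bucket daily-snapshot "$(date +%Y%m%d)"
```

Because the hourly backup shouldn't run while the daily backup is being created, a scheduler built on these helpers would need to serialize the two jobs, for example by running both from a single script.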
@@ -91,7 +86,6 @@ Do not forget to remove old snapshots and backups when they are no longer needed
See also [vmbackupmanager tool](https://docs.victoriametrics.com/vmbackupmanager.html) for automating smart backups.
## How does it work?
The backup algorithm is the following:
@@ -108,16 +102,15 @@ Such splitting minimizes the amounts of data to re-transfer after temporary erro
`vmbackup` relies on [instant snapshot](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) properties:
- All the files in the snapshot are immutable.
- Old files are periodically merged into new files.
- Smaller files have higher probability to be merged.
- Consecutive snapshots share many identical files.
* All the files in the snapshot are immutable.
* Old files are periodically merged into new files.
* Smaller files have higher probability to be merged.
* Consecutive snapshots share many identical files.
These properties allow performing fast and cheap incremental backups and server-side copying from `-origin` paths.
See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details.
`vmbackup` can work improperly or slowly when these properties are violated.
## Troubleshooting
* If the backup is slow, then try setting higher value for `-concurrency` flag. This will increase the number of concurrent workers that upload data to backup storage.
@@ -126,15 +119,14 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
* Backups created from [single-node VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html) cannot be restored
at [cluster VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) and vice versa.
## Advanced usage
* Obtaining credentials from a file.
Add flag `-credsFilePath=/etc/credentials` with the following content:
for s3 (aws, minio or other s3 compatible storages):
```bash
[default]
aws_access_key_id=theaccesskey
@@ -142,6 +134,7 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
```
for gce cloud storage:
```json
{
"type": "service_account",
@@ -159,7 +152,8 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
* Usage with s3 custom url endpoint. It is possible to use `vmbackup` with s3 compatible storages like minio, cloudian, etc.
You have to add a custom url endpoint via flag:
```
```bash
# for minio
-customS3Endpoint=http://localhost:9000
@@ -169,102 +163,100 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-
* Run `vmbackup -help` in order to see all the available options:
```
```bash
-concurrency int
The number of concurrent workers. Higher concurrency may reduce backup duration (default 10)
The number of concurrent workers. Higher concurrency may reduce backup duration (default 10)
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-dst string
Where to put the backup on the remote storage. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded
Where to put the backup on the remote storage. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
Prefix for environment variables if -envflag.enable is set
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address for exporting metrics at /metrics page (default ":8420")
TCP address for exporting metrics at /metrics page (default ":8420")
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxBytesPerSecond size
The maximum upload speed. There is no limit if it is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
The maximum upload speed. There is no limit if it is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
-origin string
Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups
Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups
-pprofAuthKey string
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
-s3ForcePathStyle
Prefixing endpoint with bucket name when set false, true by default. (default true)
Prefixing endpoint with bucket name when set false, true by default. (default true)
-snapshot.createURL string
VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup. Example: http://victoriametrics:8428/snapshot/create . There is no need in setting -snapshotName if -snapshot.createURL is set
VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup. Example: http://victoriametrics:8428/snapshot/create . There is no need in setting -snapshotName if -snapshot.createURL is set
-snapshot.deleteURL string
VictoriaMetrics delete snapshot url. Optional. Will be generated from -snapshot.createURL if not provided. All created snapshots will be automatically deleted. Example: http://victoriametrics:8428/snapshot/delete
VictoriaMetrics delete snapshot url. Optional. Will be generated from -snapshot.createURL if not provided. All created snapshots will be automatically deleted. Example: http://victoriametrics:8428/snapshot/delete
-snapshotName string
Name for the snapshot to backup. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots. There is no need in setting -snapshotName if -snapshot.createURL is set
Name for the snapshot to backup. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots. There is no need in setting -snapshotName if -snapshot.createURL is set
-storageDataPath string
Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data")
Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data")
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set
Path to file with TLS key. Used only if -tls is set
-version
Show VictoriaMetrics version
Show VictoriaMetrics version
```
## How to build from sources
It is recommended to use [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - see `vmutils-*` archives there.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17.


@@ -9,11 +9,10 @@ The required flags for running the service are as follows:
* -eula - should be true and means that you have the legal right to run a backup manager. That can either be a signed contract or an email with confirmation to run the service in a trial period
* -storageDataPath - path to VictoriaMetrics or vmstorage data path to make backup from
* -snapshot.createURL - VictoriaMetrics creates snapshot URL which will automatically be created during backup. Example: http://victoriametrics:8428/snapshot/create
* -snapshot.createURL - VictoriaMetrics creates snapshot URL which will automatically be created during backup. Example: <http://victoriametrics:8428/snapshot/create>
* -dst - backup destination at s3, gcs or local filesystem
* -credsFilePath - path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set. See [https://cloud.google.com/iam/docs/creating-managing-service-account-keys](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) and [https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html](https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html)
Backup schedule is controlled by the following flags:
* -disableHourly - disable hourly run. Default false
@@ -23,7 +22,6 @@ Backup schedule is controlled by the following flags:
By default, all flags are turned on and Backup Manager backups data every hour for every interval (hourly, daily, weekly and monthly).
The backup manager creates the following directory hierarchy at **-dst**:
* /latest/ - contains the latest backup
@@ -32,7 +30,6 @@ The backup manager creates the following directory hierarchy at **-dst**:
* /weekly/ - contains weekly backups. Each backup is named as *YYYY-WW*
* /monthly/ - contains monthly backups. Each backup is named as *YYYY-MM*
To get the full list of supported flags please run the following command:
```console
@@ -48,7 +45,6 @@ There are two flags which could help with performance tuning:
* -maxBytesPerSecond - the maximum upload speed. There is no limit if it is set to 0
* -concurrency - The number of concurrent workers. Higher concurrency may improve upload speed (default 10)
## Example of Usage
GCS and cluster version. You need to have a credentials file in json format with following structure
@@ -96,11 +92,11 @@ info VictoriaMetrics/lib/storage/storage.go:319 deleted snapshot "/vmstora
The result on the GCS bucket
- The root folder
* The root folder
![root](vmbackupmanager_root_folder.png)
- The latest folder
* The latest folder
![latest](vmbackupmanager_latest_folder.png)
@@ -119,7 +115,6 @@ Lets assume we have a backup manager collecting daily backups for the past 10
![daily](vmbackupmanager_rp_daily_1.png)
We enable backup retention policy for backup manager by using following configuration:
```console


@@ -2,19 +2,20 @@
VictoriaMetrics command-line tool
Features:
- [x] Prometheus: migrate data from Prometheus to VictoriaMetrics using snapshot API
- [x] Thanos: migrate data from Thanos to VictoriaMetrics
- [ ] ~~Prometheus: migrate data from Prometheus to VictoriaMetrics by query~~(discarded)
- [x] InfluxDB: migrate data from InfluxDB to VictoriaMetrics
- [x] OpenTSDB: migrate data from OpenTSDB to VictoriaMetrics
- [ ] Storage Management: data re-balancing between nodes
vmctl provides various useful actions with VictoriaMetrics components.
vmctl acts as a proxy between data source ([Prometheus](#migrating-data-from-prometheus),
[InfluxDB](#migrating-data-from-influxdb-1x), [VictoriaMetrics](##migrating-data-from-victoriametrics), etc.)
and destination - VictoriaMetrics single or cluster version. To see the full list of supported modes
Features:
- migrate data from [Prometheus](#migrating-data-from-prometheus) to VictoriaMetrics using snapshot API
- migrate data from [Thanos](#migrating-data-from-thanos) to VictoriaMetrics
- migrate data from [InfluxDB](#migrating-data-from-influxdb-1x) to VictoriaMetrics
- migrate data from [OpenTSDB](#migrating-data-from-opentsdb) to VictoriaMetrics
- migrate data between [VictoriaMetrics](#migrating-data-from-victoriametrics) single or cluster version.
- [verify](#verifying-exported-blocks-from-victoriametrics) exported blocks from VictoriaMetrics single or cluster version.
To see the full list of supported modes
run the following command:
```
```bash
./vmctl --help
NAME:
vmctl - VictoriaMetrics command-line tool
@@ -27,10 +28,12 @@ COMMANDS:
influx Migrate timeseries from InfluxDB
prometheus Migrate timeseries from Prometheus
vm-native Migrate time series between VictoriaMetrics installations via native binary format
verify-block Verifies correctness of data blocks exported via VictoriaMetrics Native format. See https://docs.victoriametrics.com/#how-to-export-data-in-native-format
```
Each mode has its own unique set of flags specific (e.g. prefixed with `influx` for influx mode)
to the data source and common list of flags for destination (prefixed with `vm` for VictoriaMetrics):
```
./vmctl influx --help
OPTIONS:
@@ -46,10 +49,11 @@ Please note, that vmctl performs initial readiness check for the given address b
```
When doing a migration user needs to specify flags for source (where and how to fetch data) and for
destination (where to migrate data). Every mode has additional details and nuances, please see
destination (where to migrate data). Every mode has additional details and nuances, please see
them below in corresponding sections.
For the destination flags see the full description by running the following command:
```
./vmctl influx --help | grep vm-
```
@@ -59,14 +63,13 @@ has additional sections with description below. Details about tweaking and adjus
are explained in [Tuning](#tuning) section.
Please note, that if you're going to import data into VictoriaMetrics cluster do not
forget to specify the `--vm-account-id` flag. See more details for cluster version
forget to specify the `--vm-account-id` flag. See more details for cluster version
[here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
## Articles
* [How to migrate data from Prometheus](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-d44a6728f043)
* [How to migrate data from Prometheus. Filtering and modifying time series](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-filtering-and-modifying-time-series-6d40cea4bf21)
- [How to migrate data from Prometheus](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-d44a6728f043)
- [How to migrate data from Prometheus. Filtering and modifying time series](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-filtering-and-modifying-time-series-6d40cea4bf21)
## Migrating data from OpenTSDB
@@ -79,16 +82,21 @@ See `./vmctl opentsdb --help` for details and full list of flags.
OpenTSDB migration works like so:
1. Find metrics based on selected filters (or the default filter set ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'])
* e.g. `curl -Ss "http://opentsdb:4242/api/suggest?type=metrics&q=sys"`
- e.g. `curl -Ss "http://opentsdb:4242/api/suggest?type=metrics&q=sys"`
2. Find series associated with each returned metric
* e.g. `curl -Ss "http://opentsdb:4242/api/search/lookup?m=system.load5&limit=1000000"`
- e.g. `curl -Ss "http://opentsdb:4242/api/search/lookup?m=system.load5&limit=1000000"`
3. Download data for each series in chunks defined in the CLI switches
* e.g. `-retention=sum-1m-avg:1h:90d` ==
* `curl -Ss "http://opentsdb:4242/api/query?start=1h-ago&end=now&m=sum:1m-avg-none:system.load5\{host=host1\}"`
* `curl -Ss "http://opentsdb:4242/api/query?start=2h-ago&end=1h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
* `curl -Ss "http://opentsdb:4242/api/query?start=3h-ago&end=2h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
* ...
* `curl -Ss "http://opentsdb:4242/api/query?start=2160h-ago&end=2159h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- e.g. `-retention=sum-1m-avg:1h:90d` ==
- `curl -Ss "http://opentsdb:4242/api/query?start=1h-ago&end=now&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- `curl -Ss "http://opentsdb:4242/api/query?start=2h-ago&end=1h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- `curl -Ss "http://opentsdb:4242/api/query?start=3h-ago&end=2h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- ...
- `curl -Ss "http://opentsdb:4242/api/query?start=2160h-ago&end=2159h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
This means that we must stream data from OpenTSDB to VictoriaMetrics in chunks. This is where concurrency for OpenTSDB comes in. We can query multiple chunks at once, but we shouldn't perform too many chunks at a time to avoid overloading the OpenTSDB cluster.
@@ -107,6 +115,7 @@ Found 9 metrics to import. Continue? [Y/n]
Starting with a relatively simple retention string (`sum-1m-avg:1h:30d`), let's describe how this is converted into actual queries.
There are two essential parts of a retention string:
1. [aggregation](#aggregation)
2. [windows/time ranges](#windows)
@@ -115,8 +124,9 @@ There are two essential parts of a retention string:
Retention strings essentially define the two levels of aggregation for our collected series.
`sum-1m-avg` would become:
* First order: `sum`
* Second order: `1m-avg-none`
- First order: `sum`
- Second order: `1m-avg-none`
##### First Order Aggregations
@@ -137,6 +147,7 @@ We do not allow for defining the "null value" portion of the rollup window (e.g.
#### Windows
There are two important windows we define in a retention string:
1. the "chunk" range of each query
2. The time range we will be querying on with that "chunk"
@@ -182,8 +193,8 @@ See `./vmctl influx --help` for details and full list of flags.
To use migration tool please specify the InfluxDB address `--influx-addr`, the database `--influx-database` and VictoriaMetrics address `--vm-addr`.
Flag `--vm-addr` for single-node VM is usually equal to `--httpListenAddr`, and for cluster version
is equal to `--httpListenAddr` flag of vminsert component. Please note, that vmctl performs initial readiness check for the given address
by checking `/health` endpoint. For cluster version it is additionally required to specify the `--vm-account-id` flag.
is equal to `--httpListenAddr` flag of vminsert component. Please note, that vmctl performs initial readiness check for the given address
by checking `/health` endpoint. For cluster version it is additionally required to specify the `--vm-account-id` flag.
See more details for cluster version [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
As soon as required flags are provided and all endpoints are accessible, `vmctl` will start the InfluxDB scheme exploration.
@@ -191,8 +202,9 @@ Basically, it just fetches all fields and timeseries from the provided database
Then `vmctl` sends fetch requests for each timeseries to InfluxDB one by one and passes the results to the VM importer.
VM importer then accumulates received samples in batches and sends import requests to VM.
The importing process example for local installation of InfluxDB(`http://localhost:8086`)
The importing process example for local installation of InfluxDB(`http://localhost:8086`)
and single-node VictoriaMetrics(`http://localhost:8428`):
```
./vmctl influx --influx-database benchmark
InfluxDB import mode
@@ -212,25 +224,27 @@ Found 40000 timeseries to import. Continue? [Y/n] y
bytes/s: 5.4 MB;
import requests: 40001;
2020/01/18 21:19:00 Total time: 31m48.467044016s
```
```
### Data mapping
Vmctl maps InfluxDB data the same way as VictoriaMetrics does by using the following rules:
* `influx-database` arg is mapped into `db` label value unless `db` tag exists in the InfluxDB line.
* Field names are mapped to time series names prefixed with {measurement}{separator} value,
where {separator} equals to _ by default.
- `influx-database` arg is mapped into `db` label value unless `db` tag exists in the InfluxDB line.
- Field names are mapped to time series names prefixed with {measurement}{separator} value,
where {separator} equals to _ by default.
It can be changed with `--influx-measurement-field-separator` command-line flag.
* Field values are mapped to time series values.
* Tags are mapped to Prometheus labels format as-is.
- Field values are mapped to time series values.
- Tags are mapped to Prometheus labels format as-is.
For example, the following InfluxDB line:
```
foo,tag1=value1,tag2=value2 field1=12,field2=40
```
is converted into the following Prometheus format data points:
```
foo_field1{tag1="value1", tag2="value2"} 12
foo_field2{tag1="value1", tag2="value2"} 40
@@ -238,7 +252,7 @@ foo_field2{tag1="value1", tag2="value2"} 40
### Configuration
The configuration flags should contain self-explanatory descriptions.
The configuration flags should contain self-explanatory descriptions.
### Filtering
@@ -246,6 +260,7 @@ The filtering consists of two parts: timeseries and time.
The first step of application is to select all available timeseries
for given database and retention. User may specify additional filtering
condition via `--influx-filter-series` flag. For example:
```
./vmctl influx --influx-database benchmark \
--influx-filter-series "on benchmark from cpu where hostname='host_1703'"
@@ -256,13 +271,15 @@ InfluxDB import mode
2020/01/26 14:23:29 fetching series: command: "show series on benchmark from cpu where hostname='host_1703'"; database: "benchmark"; retention: "autogen"
Found 10 timeseries to import. Continue? [Y/n]
```
The timeseries select query would be following:
`fetching series: command: "show series on benchmark from cpu where hostname='host_1703'"; database: "benchmark"; retention: "autogen"`
The second step of filtering is a time filter and it applies when fetching the datapoints from Influx.
Time filtering may be configured with two flags:
* --influx-filter-time-start
* --influx-filter-time-end
- --influx-filter-time-start
- --influx-filter-time-end
Here's an example of importing timeseries for one day only:
`./vmctl influx --influx-database benchmark --influx-filter-series "where hostname='host_1703'" --influx-filter-time-start "2020-01-01T10:07:00Z" --influx-filter-time-end "2020-01-01T15:07:00Z"`
@@ -271,36 +288,36 @@ Please see more about time filtering [here](https://docs.influxdata.com/influxdb
## Migrating data from InfluxDB (2.x)
Migrating data from InfluxDB v2.x is not supported yet ([#32](https://github.com/VictoriaMetrics/vmctl/issues/32)).
You may find useful a 3rd party solution for this - https://github.com/jonppe/influx_to_victoriametrics.
You may find useful a 3rd party solution for this - <https://github.com/jonppe/influx_to_victoriametrics>.
## Migrating data from Prometheus
`vmctl` supports the `prometheus` mode for migrating data from Prometheus to VictoriaMetrics time-series database.
Migration is based on reading Prometheus snapshot, which is basically a hard-link to Prometheus data files.
Migration is based on reading Prometheus snapshot, which is basically a hard-link to Prometheus data files.
See `./vmctl prometheus --help` for details and full list of flags. Also see Prometheus related articles [here](#articles).
To use migration tool please specify the file path to Prometheus snapshot `--prom-snapshot` (see how to make a snapshot [here](https://www.robustperception.io/taking-snapshots-of-prometheus-data)) and VictoriaMetrics address `--vm-addr`.
Please note, that `vmctl` *does not make a snapshot from Prometheus*; it uses an already prepared snapshot. More about Prometheus snapshots may be found [here](https://www.robustperception.io/taking-snapshots-of-prometheus-data) and [here](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-d44a6728f043).
Flag `--vm-addr` for single-node VM is usually equal to `--httpListenAddr`, and for cluster version
is equal to `--httpListenAddr` flag of vminsert component. Please note, that vmctl performs initial readiness check for the given address
by checking `/health` endpoint. For cluster version it is additionally required to specify the `--vm-account-id` flag.
is equal to `--httpListenAddr` flag of vminsert component. Please note, that vmctl performs initial readiness check for the given address
by checking `/health` endpoint. For cluster version it is additionally required to specify the `--vm-account-id` flag.
See more details for cluster version [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster).
As soon as required flags are provided and all endpoints are accessible, `vmctl` will start the Prometheus snapshot exploration.
Basically, it just fetches all available blocks in the provided snapshot and reads the metadata. It also does initial filtering by time
if flags `--prom-filter-time-start` or `--prom-filter-time-end` were set. The exploration procedure prints some stats from read blocks.
Please note that stats do not take into account timeseries or samples filtering. This will be done during the importing process.
The importing process takes the snapshot blocks revealed from Explore procedure and processes them one by one
accumulating timeseries and samples. Please note, that `vmctl` relies on responses from InfluxDB on this stage,
so ensure that Explore queries are executed without errors or limits. Please see this
so ensure that Explore queries are executed without errors or limits. Please see this
[issue](https://github.com/VictoriaMetrics/vmctl/issues/30) for details.
The data is processed in chunks and then sent to VM.
The importing process example for local installation of Prometheus
The importing process example for local installation of Prometheus
and single-node VictoriaMetrics(`http://localhost:8428`):
```
./vmctl prometheus --prom-snapshot=/path/to/snapshot \
--vm-concurrency=1 \
@@ -327,7 +344,7 @@ Found 14 blocks to import. Continue? [Y/n] y
import requests: 323;
import requests retries: 0;
2020/02/23 15:50:03 Total time: 51.077451066s
```
```
### Data mapping
@@ -336,7 +353,7 @@ So no data changes will be applied.
### Configuration
The configuration flags should contain self-explanatory descriptions.
The configuration flags should contain self-explanatory descriptions.
### Filtering
@@ -347,6 +364,7 @@ in in RFC3339 format. This filter applied twice: to drop blocks out of range and
overlapping time range.
Example of applying time filter:
```
./vmctl prometheus --prom-snapshot=/path/to/snapshot \
--prom-filter-time-start=2020-02-07T00:07:01Z \
@@ -366,12 +384,13 @@ Please notice, that total amount of blocks in provided snapshot is 14, but only
time range. So other 12 blocks were marked as `skipped`. The amount of samples and series is not taken into account,
since this is heavy operation and will be done during import process.
Filtering by timeseries is configured with following flags:
Filtering by timeseries is configured with following flags:
* `--prom-filter-label` - the label name, e.g. `__name__` or `instance`;
* `--prom-filter-label-value` - the regular expression to filter the label value. By default matches all `.*`
- `--prom-filter-label` - the label name, e.g. `__name__` or `instance`;
- `--prom-filter-label-value` - the regular expression to filter the label value. By default matches all `.*`
For example:
```
./vmctl prometheus --prom-snapshot=/path/to/snapshot \
--prom-filter-label="__name__" \
@@ -405,38 +424,44 @@ Found 2 blocks to import. Continue? [Y/n] y
Thanos uses the same storage engine as Prometheus and the data layout on-disk should be the same. That means
`vmctl` in mode `prometheus` may be used for Thanos historical data migration as well.
These instructions may vary based on the details of your Thanos configuration.
Please read carefully and verify as you go. We assume you're using Thanos Sidecar on your Prometheus pods,
These instructions may vary based on the details of your Thanos configuration.
Please read carefully and verify as you go. We assume you're using Thanos Sidecar on your Prometheus pods,
and that you have a separate Thanos Store installation.
### Current data
1. For now, keep your Thanos Sidecar and Thanos-related Prometheus configuration, but add this to also stream
1. For now, keep your Thanos Sidecar and Thanos-related Prometheus configuration, but add this to also stream
metrics to VictoriaMetrics:
```
remote_write:
- url: http://victoria-metrics:8428/api/v1/write
```
2. Make sure VM is running, of course. Now check the logs to make sure that Prometheus is sending and VM is receiving.
2. Make sure VM is running, of course. Now check the logs to make sure that Prometheus is sending and VM is receiving.
In Prometheus, make sure there are no errors. On the VM side, you should see messages like this:
```
2020-04-27T18:38:46.474Z info VictoriaMetrics/lib/storage/partition.go:207 creating a partition "2020_04" with smallPartsPath="/victoria-metrics-data/data/small/2020_04", bigPartsPath="/victoria-metrics-data/data/big/2020_04"
2020-04-27T18:38:46.506Z info VictoriaMetrics/lib/storage/partition.go:222 partition "2020_04" has been created
2020-04-27T18:38:46.474Z info VictoriaMetrics/lib/storage/partition.go:207 creating a partition "2020_04" with smallPartsPath="/victoria-metrics-data/data/small/2020_04", bigPartsPath="/victoria-metrics-data/data/big/2020_04"
2020-04-27T18:38:46.506Z info VictoriaMetrics/lib/storage/partition.go:222 partition "2020_04" has been created
```
3. Now just wait. Within two hours, Prometheus should finish its current data file and hand it off to Thanos Store for long term
storage.
### Historical data
Let's assume your data is stored on S3 served by minio. You first need to copy that out to a local filesystem,
Let's assume your data is stored on S3 served by minio. You first need to copy that out to a local filesystem,
then import it into VM using `vmctl` in `prometheus` mode.
1. Copy data from minio.
1. Run the `minio/mc` Docker container.
1. `mc config host add minio http://minio:9000 accessKey secretKey`, substituting appropriate values for the last 3 items.
1. `mc cp -r minio/prometheus thanos-data`
1. Import using `vmctl`.
1. Follow the [instructions](#how-to-build) to compile `vmctl` on your machine.
1. Use [prometheus](#migrating-data-from-prometheus) mode to import data:
1. Use [prometheus](#migrating-data-from-prometheus) mode to import data:
```
vmctl prometheus --prom-snapshot thanos-data --vm-addr http://victoria-metrics:8428
```
@@ -453,8 +478,8 @@ or higher.
See `./vmctl vm-native --help` for details and full list of flags.
In this mode `vmctl` acts as a proxy between two VM instances, where time series filtering is done by "source" (`src`)
and processing is done by "destination" (`dst`). Because of that, `vmctl` doesn't actually know how much data will be
In this mode `vmctl` acts as a proxy between two VM instances, where time series filtering is done by "source" (`src`)
and processing is done by "destination" (`dst`). Because of that, `vmctl` doesn't actually know how much data will be
processed and can't show the progress bar. It will show the current processing speed and total number of processed bytes:
```
@@ -468,20 +493,36 @@ Initing export pipe from "http://localhost:8528" with filters:
Initing import process to "http://localhost:8428":
Total: 336.75 KiB ↖ Speed: 454.46 KiB p/s
2020/10/13 17:04:59 Total time: 952.143376ms
```
```
Importing tips:
1. Migrating all the metrics from one VM to another may collide with existing application metrics
(prefixed with `vm_`) at destination and lead to confusion when using
[official Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards).
1. Migrating all the metrics from one VM to another may collide with existing application metrics
(prefixed with `vm_`) at destination and lead to confusion when using
[official Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards).
To avoid such situation try to filter out VM process metrics via `--vm-native-filter-match` flag.
2. Migration is a backfilling process, so it is recommended to read
2. Migration is a backfilling process, so it is recommended to read
[Backfilling tips](https://github.com/VictoriaMetrics/VictoriaMetrics#backfilling) section.
3. `vmctl` doesn't provide relabeling or other types of labels management in this mode.
Instead, use [relabeling in VictoriaMetrics](https://github.com/VictoriaMetrics/vmctl/issues/4#issuecomment-683424375).
4. When importing in or from cluster version remember to use correct [URL format](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format)
and specify `accountID` param.
## Verifying exported blocks from VictoriaMetrics
In this mode, `vmctl` allows verifying correctness and integrity of data exported via [native format](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-export-data-in-native-format) from VictoriaMetrics.
You can verify exported data at disk before uploading it by `vmctl verify-block` command:
```bash
# export blocks from VictoriaMetrics
curl localhost:8428/api/v1/export/native -g -d 'match[]={__name__!=""}' -o exported_data_block
# verify block content
./vmctl verify-block exported_data_block
2022/03/30 18:04:50 verifying block at path="exported_data_block"
2022/03/30 18:04:50 successfully verified block at path="exported_data_block", blockCount=123786
2022/03/30 18:04:50 Total time: 100.108ms
```
## Tuning
### InfluxDB mode
@@ -491,7 +532,7 @@ timeseries. Please set it wisely to avoid InfluxDB overwhelming.
The flag `--influx-chunk-size` controls the max amount of datapoints to return in single chunk from fetch requests.
Please see more details [here](https://docs.influxdata.com/influxdb/v1.7/guides/querying_data/#chunking).
The chunk size is used to control InfluxDB memory usage, so it won't OOM on processing large timeseries with
The chunk size is used to control InfluxDB memory usage, so it won't OOM on processing large timeseries with
billions of datapoints.
### Prometheus mode
@@ -507,17 +548,18 @@ Please note that each import request can load up to a single vCPU core on Victor
to allocated CPU resources of your VictoriaMetrics installation.
The flag `--vm-batch-size` controls max amount of samples collected before sending the import request.
For example, if `--influx-chunk-size=500` and `--vm-batch-size=2000` then importer will process not more
than 4 chunks before sending the request.
For example, if `--influx-chunk-size=500` and `--vm-batch-size=2000` then importer will process not more
than 4 chunks before sending the request.
### Importer stats
After successful import `vmctl` prints some statistics for details.
After successful import `vmctl` prints some statistics for details.
The important numbers to watch are following:
- `idle duration` - shows time that importer spent while waiting for data from InfluxDB/Prometheus
- `idle duration` - shows time that importer spent while waiting for data from InfluxDB/Prometheus
to fill up `--vm-batch-size` batch size. Value shows total duration across all workers configured
via `--vm-concurrency`. High value may be a sign of too slow InfluxDB/Prometheus fetches or too
high `--vm-concurrency` value. Try to improve it by increasing `--<mode>-concurrency` value or
high `--vm-concurrency` value. Try to improve it by increasing `--<mode>-concurrency` value or
decreasing `--vm-concurrency` value.
- `import requests` - shows how many import requests were issued to VM server.
The import request is issued once the batch size(`--vm-batch-size`) is full and ready to be sent.
@@ -529,6 +571,7 @@ a sign of network issues or VM being overloaded. See the logs during import for
By default `vmctl` waits for confirmation from the user before starting the import. If this behavior
is unwanted and no user interaction is required, pass the `-s` flag to enable "silence" mode:
```
-s Whether to run in silent mode. If set to true no confirmation prompts will appear. (default: false)
```
@@ -537,18 +580,18 @@ behavior and no user interaction required - pass `-s` flag to enable "silence" m
`vmctl` allows to limit the number of [significant figures](https://en.wikipedia.org/wiki/Significant_figures)
before importing. For example, the average value for response size is `102.342305` bytes and it has 9 significant figures.
If you ask a human to pronounce this value then with high probability value will be rounded to first 4 or 5 figures
because the rest aren't really that important to mention. In most cases, such a high precision is too much.
Moreover, such values may be just a result of [floating point arithmetic](https://en.wikipedia.org/wiki/Floating-point_arithmetic),
create a [false precision](https://en.wikipedia.org/wiki/False_precision) and result into bad compression ratio
according to [information theory](https://en.wikipedia.org/wiki/Information_theory).
If you ask a human to pronounce this value then with high probability value will be rounded to first 4 or 5 figures
because the rest aren't really that important to mention. In most cases, such a high precision is too much.
Moreover, such values may be just a result of [floating point arithmetic](https://en.wikipedia.org/wiki/Floating-point_arithmetic),
create a [false precision](https://en.wikipedia.org/wiki/False_precision) and result into bad compression ratio
according to [information theory](https://en.wikipedia.org/wiki/Information_theory).
`vmctl` provides the following flags for improving data compression:
* `--vm-round-digits` flag for rounding processed values to the given number of decimal digits after the point.
- `--vm-round-digits` flag for rounding processed values to the given number of decimal digits after the point.
For example, `--vm-round-digits=2` would round `1.2345` to `1.23`. By default the rounding is disabled.
* `--vm-significant-figures` flag for limiting the number of significant figures in processed values. It takes no effect if set
- `--vm-significant-figures` flag for limiting the number of significant figures in processed values. It takes no effect if set
to 0 (by default), but set `--vm-significant-figures=5` and `102.342305` will be rounded to `102.34`.
The most common case for using these flags is to improve data compression for time series storing aggregation
@@ -556,7 +599,7 @@ results such as `average`, `rate`, etc.
### Adding extra labels
`vmctl` allows adding extra labels to all imported series. It can be achieved with flag `--vm-extra-label label=value`.
`vmctl` allows adding extra labels to all imported series. It can be achieved with flag `--vm-extra-label label=value`.
If multiple labels need to be added, set the flag for each label, for example, `--vm-extra-label label1=value1 --vm-extra-label label2=value2`.
If a time series already has a label that is also set via the `--vm-extra-label` flag, the flag takes priority and overrides the label value from the time series.
@@ -565,15 +608,13 @@ results such as `average`, `rate`, etc.
Limiting the rate of data transfer could help to reduce pressure on disk or on destination database.
The rate limit may be set in bytes-per-second via `--vm-rate-limit` flag.
Please note, you can also use [vmagent](https://docs.victoriametrics.com/vmagent.html)
Please note, you can also use [vmagent](https://docs.victoriametrics.com/vmagent.html)
as a proxy between `vmctl` and destination with `-remoteWrite.rateLimit` flag enabled.
## How to build
It is recommended to use [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - `vmctl` is located in `vmutils-*` archives there.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17.


@@ -6,6 +6,7 @@ import (
"os"
"os/signal"
"strings"
"sync/atomic"
"syscall"
"time"
@@ -14,6 +15,8 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/native"
"github.com/urfave/cli/v2"
)
@@ -164,6 +167,39 @@ func main() {
return p.run()
},
},
{
Name: "verify-block",
Usage: "Verifies exported block with VictoriaMetrics Native format",
Flags: []cli.Flag{
&cli.BoolFlag{
Name: "gunzip",
Usage: "Use GNU zip decompression for exported block",
Value: false,
},
},
Action: func(c *cli.Context) error {
common.StartUnmarshalWorkers()
blockPath := c.Args().First()
isBlockGzipped := c.Bool("gunzip")
if len(blockPath) == 0 {
return cli.Exit("you must provide path for exported data block", 1)
}
log.Printf("verifying block at path=%q", blockPath)
f, err := os.OpenFile(blockPath, os.O_RDONLY, 0600)
if err != nil {
return cli.Exit(fmt.Errorf("cannot open exported block at path=%q err=%w", blockPath, err), 1)
}
var blocksCount uint64
if err := parser.ParseStream(f, isBlockGzipped, func(block *parser.Block) error {
atomic.AddUint64(&blocksCount, 1)
return nil
}); err != nil {
return cli.Exit(fmt.Errorf("cannot parse block at path=%q, blocksCount=%d, err=%w", blockPath, blocksCount, err), 1)
}
log.Printf("successfully verified block at path=%q, blockCount=%d", blockPath, blocksCount)
return nil
},
},
},
}


@@ -2,7 +2,6 @@
***vmgateway is a part of [enterprise package](https://victoriametrics.com/products/enterprise/). It is available for download and evaluation at [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)***
<img alt="vmgateway" src="vmgateway-overview.jpeg">
`vmgateway` is a proxy for the VictoriaMetrics Time Series Database (TSDB). It provides the following features:
@@ -16,7 +15,6 @@
`vmgateway` is included in our [enterprise packages](https://victoriametrics.com/products/enterprise/).
## Access Control
<img alt="vmgateway-ac" src="vmgateway-access-control.jpg">
@@ -24,6 +22,7 @@
`vmgateway` supports JWT-based authentication. The JWT payload can be configured to grant access to specific tenants and labels, as well as read/write permissions.
The JWT token must have the following format:
```json
{
"exp": 1617304574,
@@ -41,13 +40,15 @@ jwt token must be in following format:
}
}
```
Where:
- `exp` - required, expiration time as a Unix timestamp. If the token has expired, `vmgateway` rejects the request.
- `vm_access` - required, dict with claim info, minimum form: `{"vm_access": {"tenant_id": {}}}`
- `tenant_id` - optional, for cluster mode, routes requests to the corresponding tenant.
- `extra_labels` - optional, key-value pairs for label filters added to the ingested or selected metrics. Multiple filters are combined with the `and` operation. If defined, `extra_label` from the original request is removed.
- `extra_filters` - optional, [series selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) added to the select query requests. Multiple selectors are combined with the `or` operation. If defined, `extra_filter` from the original request is removed.
- `mode` - optional, access mode for the API: read, write, or full. Supported values: 0 - full (default value), 1 - read, 2 - write.
* `exp` - required, expiration time as a Unix timestamp. If the token has expired, `vmgateway` rejects the request.
* `vm_access` - required, dict with claim info, minimum form: `{"vm_access": {"tenant_id": {}}}`
* `tenant_id` - optional, for cluster mode, routes requests to the corresponding tenant.
* `extra_labels` - optional, key-value pairs for label filters added to the ingested or selected metrics. Multiple filters are combined with the `and` operation. If defined, `extra_label` from the original request is removed.
* `extra_filters` - optional, [series selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) added to the select query requests. Multiple selectors are combined with the `or` operation. If defined, `extra_filter` from the original request is removed.
* `mode` - optional, access mode for the API: read, write, or full. Supported values: 0 - full (default value), 1 - read, 2 - write.
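The claim rules listed above can be sketched as a small validator over the already-decoded JWT payload. This is only an illustration, not `vmgateway`'s implementation: the Go type names, the `parseClaims` helper, and the `account_id`/`project_id` fields inside `tenant_id` are assumptions, and signature verification is deliberately left out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// tenantID holds the tenant routing info; the field names are assumed.
type tenantID struct {
	AccountID uint32 `json:"account_id"`
	ProjectID uint32 `json:"project_id"`
}

// vmAccess mirrors the documented `vm_access` claim.
type vmAccess struct {
	TenantID     *tenantID         `json:"tenant_id"`
	ExtraLabels  map[string]string `json:"extra_labels"`
	ExtraFilters []string          `json:"extra_filters"`
	Mode         int               `json:"mode"` // 0 - full (default), 1 - read, 2 - write
}

// claims is the decoded JWT payload.
type claims struct {
	Exp      int64     `json:"exp"`
	VMAccess *vmAccess `json:"vm_access"`
}

// parseClaims decodes the JSON payload and applies the documented rules:
// `exp` and `vm_access` are required, the token must not be expired at
// nowUnix, and `mode` must be one of 0, 1 or 2.
func parseClaims(payload string, nowUnix int64) (*claims, error) {
	var c claims
	if err := json.Unmarshal([]byte(payload), &c); err != nil {
		return nil, fmt.Errorf("cannot parse jwt payload: %w", err)
	}
	if c.Exp == 0 {
		return nil, errors.New("missing required `exp` claim")
	}
	if nowUnix >= c.Exp {
		return nil, fmt.Errorf("token expired at %d", c.Exp)
	}
	if c.VMAccess == nil {
		return nil, errors.New("missing required `vm_access` claim")
	}
	if c.VMAccess.Mode < 0 || c.VMAccess.Mode > 2 {
		return nil, fmt.Errorf("unsupported mode %d; want 0, 1 or 2", c.VMAccess.Mode)
	}
	return &c, nil
}
```

Passing the current time in as `nowUnix` keeps the expiry check deterministic, which makes the function easy to exercise with fixed timestamps.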
## QuickStart
@@ -66,18 +67,19 @@ Start vmgateway
```
Retrieve data from the database
```bash
curl 'http://localhost:8431/api/v1/series/count' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ2bV9hY2Nlc3MiOnsidGVuYW50X2lkIjp7fSwicm9sZSI6MX0sImV4cCI6MTkzOTM0NjIxMH0.5WUxEfdcV9hKo4CtQdtuZYOGpGXWwaqM9VuVivMMrVg'
```
A request with an incorrect token or without any token will be rejected:
```bash
curl 'http://localhost:8431/api/v1/series/count'
curl 'http://localhost:8431/api/v1/series/count' -H 'Authorization: Bearer incorrect-token'
```
## Rate Limiter
<img alt="vmgateway-rl" src="vmgateway-rate-limiting.jpg">
@@ -88,14 +90,16 @@ Limits incoming requests by given, pre-configured limits. It supports read and w
The metrics that you want to rate limit must be scraped from the cluster.
List of supported limit types:
- `queries` - number of read API requests made by a tenant, such as `/api/v1/query`, `/api/v1/series` and others.
- `active_series` - number of currently active series for a given tenant.
- `new_series` - number of newly created series, aka churn rate.
- `rows_inserted` - number of inserted rows per tenant.
* `queries` - number of read API requests made by a tenant, such as `/api/v1/query`, `/api/v1/series` and others.
* `active_series` - number of currently active series for a given tenant.
* `new_series` - number of newly created series, aka churn rate.
* `rows_inserted` - number of inserted rows per tenant.
List of supported time windows:
- `minute`
- `hour`
* `minute`
* `hour`
Limits can be specified per tenant or at a global level if you omit `project_id` and `account_id`.
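The per-tenant versus global lookup described above might look roughly like the following. The `tenant`, `limitKey` and `limiter` types are hypothetical, and the rule that a tenant-specific limit takes precedence over a global one is an assumption of this sketch rather than documented `vmgateway` behaviour.

```go
package main

// tenant identifies a tenant by account and project.
type tenant struct {
	AccountID uint32
	ProjectID uint32
}

// limitKey combines a limit type with a time window.
type limitKey struct {
	Type   string // queries, active_series, new_series or rows_inserted
	Window string // minute or hour
}

// limiter stores global limits and per-tenant limits.
type limiter struct {
	global    map[limitKey]uint64
	perTenant map[tenant]map[limitKey]uint64
}

// limitFor returns the limit that applies to tenant t for key k.
// A tenant-specific limit wins; otherwise the global limit is used.
func (l *limiter) limitFor(t tenant, k limitKey) (uint64, bool) {
	if v, ok := l.perTenant[t][k]; ok {
		return v, true
	}
	v, ok := l.global[k]
	return v, ok
}

// exceeded reports whether the observed value is above the applicable limit.
// It returns false when no limit is configured for the key.
func (l *limiter) exceeded(t tenant, k limitKey, observed uint64) bool {
	limit, ok := l.limitFor(t, k)
	return ok && observed > limit
}
```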
@@ -119,6 +123,7 @@ limits:
## QuickStart
The cluster version of VictoriaMetrics is required for rate limiting.
```bash
# start datasource for cluster metrics
@@ -169,6 +174,7 @@ curl 'http://localhost:8431/api/v1/labels' -H 'Authorization: Bearer eyJhbGciOiJ
## Configuration
The shortlist of configuration flags includes the following:
```console
-clusterMode
enable this for the cluster version
@@ -276,12 +282,11 @@ The shortlist of configuration flags include the following:
## TroubleShooting
* Access control:
* incorrect `jwt` format, try https://jwt.io/#debugger-io with our tokens
* incorrect `jwt` format, try <https://jwt.io/#debugger-io> with our tokens
* expired token, check `exp` field.
* Rate Limiting:
* `scrape_interval` at datasource, reduce it to apply limits faster.
## Limitations
* Access Control:


@@ -27,8 +27,9 @@ func InsertHandler(req *http.Request) error {
if err != nil {
return err
}
isGzip := req.Header.Get("Content-Encoding") == "gzip"
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(block *parser.Block) error {
return parser.ParseStream(req.Body, isGzip, func(block *parser.Block) error {
return insertRows(block, extraLabels)
})
})


@@ -6,12 +6,11 @@ VictoriaMetrics `v1.29.0` and newer versions must be used for working with the r
Restore process can be interrupted at any time. It is automatically resumed from the interruption point
when restarting `vmrestore` with the same args.
## Usage
VictoriaMetrics must be stopped during the restore process.
```
```bash
vmrestore -src=gs://<bucket>/<path/to/backup> -storageDataPath=<local/path/to/restore>
```
@@ -24,13 +23,11 @@ vmrestore -src=gs://<bucket>/<path/to/backup> -storageDataPath=<local/path/to/re
The original `-storageDataPath` directory may contain old files. They will be substituted by the files from backup,
i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/questions/476041/how-do-i-make-rsync-delete-files-that-have-been-deleted-from-the-source-folder).
## Troubleshooting
* If `vmrestore` eats all the network bandwidth, then set `-maxBytesPerSecond` to the desired value.
* If `vmrestore` has been interrupted due to temporary error, then just restart it with the same args. It will resume the restore process.
## Advanced usage
* Obtaining credentials from a file.
@@ -38,6 +35,7 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
Add the flag `-credsFilePath=/etc/credentials` with the following content:
for s3 (aws, minio or other s3 compatible storages):
```bash
[default]
aws_access_key_id=theaccesskey
@@ -45,6 +43,7 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
```
for gce cloud storage:
```json
{
"type": "service_account",
@@ -62,7 +61,8 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
* Usage with s3 custom url endpoint. It is possible to use `vmrestore` with s3 api compatible storages, such as minio, cloudian and others.
You have to add custom url endpoint with a flag:
```
```bash
# for minio:
-customS3Endpoint=http://localhost:9000
@@ -70,97 +70,95 @@ i.e. the end result would be similar to [rsync --delete](https://askubuntu.com/q
-customS3Endpoint=https://s3-fips.us-gov-west-1.amazonaws.com
```
* Run `vmrestore -help` in order to see all the available options:
* Run `vmrestore -help` in order to see all the available options:
```
```bash
-concurrency int
The number of concurrent workers. Higher concurrency may reduce restore duration (default 10)
The number of concurrent workers. Higher concurrency may reduce restore duration (default 10)
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
Prefix for environment variables if -envflag.enable is set
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
TCP address for exporting metrics at /metrics page (default ":8421")
TCP address for exporting metrics at /metrics page (default ":8421")
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxBytesPerSecond size
The maximum download speed. There is no limit if it is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
The maximum download speed. There is no limit if it is set to 0
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
-s3ForcePathStyle
Prefixing endpoint with bucket name when set false, true by default. (default true)
Prefixing endpoint with bucket name when set false, true by default. (default true)
-skipBackupCompleteCheck
Whether to skip checking for 'backup complete' file in -src. This may be useful for restoring from old backups, which were created without 'backup complete' file
Whether to skip checking for 'backup complete' file in -src. This may be useful for restoring from old backups, which were created without 'backup complete' file
-src string
Source path with backup on the remote storage. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
Source path with backup on the remote storage. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-storageDataPath string
Destination path where backup must be restored. VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case the contents of -storageDataPath dir is synchronized with -src contents, i.e. it works like 'rsync --delete' (default "victoria-metrics-data")
Destination path where backup must be restored. VictoriaMetrics must be stopped when restoring from backup. -storageDataPath dir can be non-empty. In this case the contents of -storageDataPath dir is synchronized with -src contents, i.e. it works like 'rsync --delete' (default "victoria-metrics-data")
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set
Path to file with TLS key. Used only if -tls is set
-version
Show VictoriaMetrics version
Show VictoriaMetrics version
```
## How to build from sources
It is recommended to use [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - see `vmutils-*` archives there.
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17.


@@ -54,7 +54,7 @@ func TagsDelSeriesHandler(startTime time.Time, w http.ResponseWriter, r *http.Re
})
}
tfss := joinTagFilterss(tfs, etfs)
sq := storage.NewSearchQuery(0, ct, tfss)
sq := storage.NewSearchQuery(0, ct, tfss, 0)
n, err := netstorage.DeleteSeries(sq, deadline)
if err != nil {
return fmt.Errorf("cannot delete series for %q: %w", sq, err)
@@ -196,7 +196,7 @@ func TagsAutoCompleteValuesHandler(startTime time.Time, w http.ResponseWriter, r
}
} else {
// Slow path: use netstorage.SearchMetricNames for applying `expr` filters.
sq, err := getSearchQueryForExprs(startTime, etfs, exprs)
sq, err := getSearchQueryForExprs(startTime, etfs, exprs, limit*10)
if err != nil {
return err
}
@@ -282,7 +282,7 @@ func TagsAutoCompleteTagsHandler(startTime time.Time, w http.ResponseWriter, r *
}
} else {
// Slow path: use netstorage.SearchMetricNames for applying `expr` filters.
sq, err := getSearchQueryForExprs(startTime, etfs, exprs)
sq, err := getSearchQueryForExprs(startTime, etfs, exprs, limit*10)
if err != nil {
return err
}
@@ -349,7 +349,7 @@ func TagsFindSeriesHandler(startTime time.Time, w http.ResponseWriter, r *http.R
if err != nil {
return fmt.Errorf("cannot setup tag filters: %w", err)
}
sq, err := getSearchQueryForExprs(startTime, etfs, exprs)
sq, err := getSearchQueryForExprs(startTime, etfs, exprs, limit*10)
if err != nil {
return err
}
@@ -474,14 +474,14 @@ func getInt(r *http.Request, argName string) (int, error) {
return n, nil
}
func getSearchQueryForExprs(startTime time.Time, etfs [][]storage.TagFilter, exprs []string) (*storage.SearchQuery, error) {
func getSearchQueryForExprs(startTime time.Time, etfs [][]storage.TagFilter, exprs []string, maxMetrics int) (*storage.SearchQuery, error) {
tfs, err := exprsToTagFilters(exprs)
if err != nil {
return nil, err
}
ct := startTime.UnixNano() / 1e6
tfss := joinTagFilterss(tfs, etfs)
sq := storage.NewSearchQuery(0, ct, tfss)
sq := storage.NewSearchQuery(0, ct, tfss, maxMetrics)
return sq, nil
}
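The hunks above thread a per-call series limit into `storage.NewSearchQuery` instead of reading a single global flag. A simplified sketch of that idea follows. The `searchQuery` type, `newSearchQuery` and `checkSeriesLimit` are stand-ins invented for illustration, and treating a non-positive limit (such as the `0` passed by the delete handler) as effectively unlimited is an assumption that this diff does not confirm.

```go
package main

import "fmt"

// unlimitedMetrics is the assumed stand-in for "no limit".
const unlimitedMetrics = 2_000_000_000

// searchQuery is a simplified stand-in for storage.SearchQuery.
type searchQuery struct {
	MinTimestamp int64
	MaxTimestamp int64
	MaxMetrics   int // per-query limit on the number of matching series
}

// newSearchQuery stores the caller-supplied limit in the query, so each
// API handler can pass its own limit down to the storage layer.
func newSearchQuery(start, end int64, maxMetrics int) *searchQuery {
	if maxMetrics <= 0 {
		maxMetrics = unlimitedMetrics // assumption: non-positive means no limit
	}
	return &searchQuery{
		MinTimestamp: start,
		MaxTimestamp: end,
		MaxMetrics:   maxMetrics,
	}
}

// checkSeriesLimit mirrors the `len(paths) >= maxMetrics` check in the
// diff: reaching the limit is treated as exceeding it.
func checkSeriesLimit(sq *searchQuery, matched int) error {
	if matched >= sq.MaxMetrics {
		return fmt.Errorf("more than %d time series match the query; "+
			"either narrow down the query or increase the corresponding limit", sq.MaxMetrics)
	}
	return nil
}
```

Carrying the limit inside the query keeps the storage-side code free of handler-specific flags: a handler such as the Graphite tags autocomplete can pass `limit*10` while an export handler passes its own value.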


@@ -26,7 +26,6 @@ var (
maxTagKeysPerSearch = flag.Int("search.maxTagKeys", 100e3, "The maximum number of tag keys returned from /api/v1/labels")
maxTagValuesPerSearch = flag.Int("search.maxTagValues", 100e3, "The maximum number of tag values returned from /api/v1/label/<label_name>/values")
maxTagValueSuffixesPerSearch = flag.Int("search.maxTagValueSuffixesPerSearch", 100e3, "The maximum number of tag value suffixes returned from /metrics/find")
maxMetricsPerSearch = flag.Int("search.maxUniqueTimeseries", 300e3, "The maximum number of unique time series each search can scan. This option allows limiting memory usage")
maxSamplesPerSeries = flag.Int("search.maxSamplesPerSeries", 30e6, "The maximum number of raw samples a single query can scan per each time series. This option allows limiting memory usage")
maxSamplesPerQuery = flag.Int("search.maxSamplesPerQuery", 1e9, "The maximum number of raw samples a single query can process across all time series. This protects from heavy queries, which select unexpectedly high number of raw samples. See also -search.maxSamplesPerSeries")
)
@@ -600,7 +599,7 @@ func DeleteSeries(sq *storage.SearchQuery, deadline searchutils.Deadline) (int,
MinTimestamp: sq.MinTimestamp,
MaxTimestamp: sq.MaxTimestamp,
}
tfss, err := setupTfss(tr, sq.TagFilterss, deadline)
tfss, err := setupTfss(tr, sq.TagFilterss, sq.MaxMetrics, deadline)
if err != nil {
return 0, err
}
@@ -805,11 +804,11 @@ func GetLabelEntries(deadline searchutils.Deadline) ([]storage.TagEntry, error)
}
// GetTSDBStatusForDate returns tsdb status according to https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats
func GetTSDBStatusForDate(deadline searchutils.Deadline, date uint64, topN int) (*storage.TSDBStatus, error) {
func GetTSDBStatusForDate(deadline searchutils.Deadline, date uint64, topN, maxMetrics int) (*storage.TSDBStatus, error) {
if deadline.Exceeded() {
return nil, fmt.Errorf("timeout exceeded before starting the query processing: %s", deadline.String())
}
status, err := vmstorage.GetTSDBStatusForDate(date, topN, deadline.Deadline())
status, err := vmstorage.GetTSDBStatusForDate(date, topN, maxMetrics, deadline.Deadline())
if err != nil {
return nil, fmt.Errorf("error during tsdb status request: %w", err)
}
@@ -827,12 +826,12 @@ func GetTSDBStatusWithFilters(deadline searchutils.Deadline, sq *storage.SearchQ
MinTimestamp: sq.MinTimestamp,
MaxTimestamp: sq.MaxTimestamp,
}
tfss, err := setupTfss(tr, sq.TagFilterss, deadline)
tfss, err := setupTfss(tr, sq.TagFilterss, sq.MaxMetrics, deadline)
if err != nil {
return nil, err
}
date := uint64(tr.MinTimestamp) / (3600 * 24 * 1000)
status, err := vmstorage.GetTSDBStatusWithFiltersForDate(tfss, date, topN, deadline.Deadline())
status, err := vmstorage.GetTSDBStatusWithFiltersForDate(tfss, date, topN, sq.MaxMetrics, deadline.Deadline())
if err != nil {
return nil, fmt.Errorf("error during tsdb status with filters request: %w", err)
}
@@ -883,7 +882,7 @@ func ExportBlocks(sq *storage.SearchQuery, deadline searchutils.Deadline, f func
if err := vmstorage.CheckTimeRange(tr); err != nil {
return err
}
tfss, err := setupTfss(tr, sq.TagFilterss, deadline)
tfss, err := setupTfss(tr, sq.TagFilterss, sq.MaxMetrics, deadline)
if err != nil {
return err
}
@@ -894,7 +893,7 @@ func ExportBlocks(sq *storage.SearchQuery, deadline searchutils.Deadline, f func
sr := getStorageSearch()
defer putStorageSearch(sr)
startTime := time.Now()
sr.Init(vmstorage.Storage, tfss, tr, *maxMetricsPerSearch, deadline.Deadline())
sr.Init(vmstorage.Storage, tfss, tr, sq.MaxMetrics, deadline.Deadline())
indexSearchDuration.UpdateDuration(startTime)
// Start workers that call f in parallel on available CPU cores.
@@ -991,12 +990,12 @@ func SearchMetricNames(sq *storage.SearchQuery, deadline searchutils.Deadline) (
if err := vmstorage.CheckTimeRange(tr); err != nil {
return nil, err
}
tfss, err := setupTfss(tr, sq.TagFilterss, deadline)
tfss, err := setupTfss(tr, sq.TagFilterss, sq.MaxMetrics, deadline)
if err != nil {
return nil, err
}
mns, err := vmstorage.SearchMetricNames(tfss, tr, *maxMetricsPerSearch, deadline.Deadline())
mns, err := vmstorage.SearchMetricNames(tfss, tr, sq.MaxMetrics, deadline.Deadline())
if err != nil {
return nil, fmt.Errorf("cannot find metric names: %w", err)
}
@@ -1019,7 +1018,7 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline search
if err := vmstorage.CheckTimeRange(tr); err != nil {
return nil, err
}
tfss, err := setupTfss(tr, sq.TagFilterss, deadline)
tfss, err := setupTfss(tr, sq.TagFilterss, sq.MaxMetrics, deadline)
if err != nil {
return nil, err
}
@@ -1029,7 +1028,7 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline search
sr := getStorageSearch()
startTime := time.Now()
maxSeriesCount := sr.Init(vmstorage.Storage, tfss, tr, *maxMetricsPerSearch, deadline.Deadline())
maxSeriesCount := sr.Init(vmstorage.Storage, tfss, tr, sq.MaxMetrics, deadline.Deadline())
indexSearchDuration.UpdateDuration(startTime)
m := make(map[string][]blockRef, maxSeriesCount)
orderedMetricNames := make([]string, 0, maxSeriesCount)
@@ -1111,7 +1110,7 @@ type blockRef struct {
addr tmpBlockAddr
}
func setupTfss(tr storage.TimeRange, tagFilterss [][]storage.TagFilter, deadline searchutils.Deadline) ([]*storage.TagFilters, error) {
func setupTfss(tr storage.TimeRange, tagFilterss [][]storage.TagFilter, maxMetrics int, deadline searchutils.Deadline) ([]*storage.TagFilters, error) {
tfss := make([]*storage.TagFilters, 0, len(tagFilterss))
for _, tagFilters := range tagFilterss {
tfs := storage.NewTagFilters()
@@ -1119,13 +1118,13 @@ func setupTfss(tr storage.TimeRange, tagFilterss [][]storage.TagFilter, deadline
tf := &tagFilters[i]
if string(tf.Key) == "__graphite__" {
query := tf.Value
paths, err := vmstorage.SearchGraphitePaths(tr, query, *maxMetricsPerSearch, deadline.Deadline())
paths, err := vmstorage.SearchGraphitePaths(tr, query, maxMetrics, deadline.Deadline())
if err != nil {
return nil, fmt.Errorf("error when searching for Graphite paths for query %q: %w", query, err)
}
if len(paths) >= *maxMetricsPerSearch {
return nil, fmt.Errorf("more than -search.maxUniqueTimeseries=%d time series match Graphite query %q; "+
"either narrow down the query or increase -search.maxUniqueTimeseries command-line flag value", *maxMetricsPerSearch, query)
if len(paths) >= maxMetrics {
return nil, fmt.Errorf("more than %d time series match Graphite query %q; "+
"either narrow down the query or increase the corresponding -search.max* command-line flag value", maxMetrics, query)
}
tfs.AddGraphiteQuery(query, paths, tf.IsNegative)
continue


@@ -42,6 +42,12 @@ var (
"See also '-search.maxLookback' flag, which has the same meaning due to historical reasons")
maxStepForPointsAdjustment = flag.Duration("search.maxStepForPointsAdjustment", time.Minute, "The maximum step when /api/v1/query_range handler adjusts "+
"points with timestamps closer than -search.latencyOffset to the current time. The adjustment is needed because such points may contain incomplete data")
maxUniqueTimeseries = flag.Int("search.maxUniqueTimeseries", 300e3, "The maximum number of unique time series, which can be selected during /api/v1/query and /api/v1/query_range queries. This option allows limiting memory usage")
maxFederateSeries = flag.Int("search.maxFederateSeries", 300e3, "The maximum number of time series, which can be returned from /federate. This option allows limiting memory usage")
maxExportSeries = flag.Int("search.maxExportSeries", 1e6, "The maximum number of time series, which can be returned from /api/v1/export* APIs. This option allows limiting memory usage")
maxTSDBStatusSeries = flag.Int("search.maxTSDBStatusSeries", 1e6, "The maximum number of time series, which can be processed during the call to /api/v1/status/tsdb. This option allows limiting memory usage")
maxSeriesLimit = flag.Int("search.maxSeries", 10e3, "The maximum number of time series, which can be returned from /api/v1/series. This option allows limiting memory usage")
)
// Default step used if not set.
@@ -78,7 +84,7 @@ func FederateHandler(startTime time.Time, w http.ResponseWriter, r *http.Request
if err != nil {
return err
}
sq := storage.NewSearchQuery(start, end, tagFilterss)
sq := storage.NewSearchQuery(start, end, tagFilterss, *maxFederateSeries)
rss, err := netstorage.ProcessSearchQuery(sq, true, deadline)
if err != nil {
return fmt.Errorf("cannot fetch data for %q: %w", sq, err)
@@ -135,7 +141,7 @@ func ExportCSVHandler(startTime time.Time, w http.ResponseWriter, r *http.Reques
if err != nil {
return err
}
sq := storage.NewSearchQuery(start, end, tagFilterss)
sq := storage.NewSearchQuery(start, end, tagFilterss, *maxExportSeries)
w.Header().Set("Content-Type", "text/csv; charset=utf-8")
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
@@ -232,7 +238,7 @@ func ExportNativeHandler(startTime time.Time, w http.ResponseWriter, r *http.Req
if err != nil {
return err
}
sq := storage.NewSearchQuery(start, end, tagFilterss)
sq := storage.NewSearchQuery(start, end, tagFilterss, *maxExportSeries)
w.Header().Set("Content-Type", "VictoriaMetrics/native")
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
@@ -383,7 +389,7 @@ func exportHandler(w http.ResponseWriter, matches []string, etfs [][]storage.Tag
}
tagFilterss = searchutils.JoinTagFilterss(tagFilterss, etfs)
sq := storage.NewSearchQuery(start, end, tagFilterss)
sq := storage.NewSearchQuery(start, end, tagFilterss, *maxExportSeries)
w.Header().Set("Content-Type", contentType)
bw := bufferedwriter.Get(w)
defer bufferedwriter.Put(bw)
@@ -484,7 +490,7 @@ func DeleteHandler(startTime time.Time, r *http.Request) error {
return err
}
ct := startTime.UnixNano() / 1e6
sq := storage.NewSearchQuery(0, ct, tagFilterss)
sq := storage.NewSearchQuery(0, ct, tagFilterss, 0)
deletedCount, err := netstorage.DeleteSeries(sq, deadline)
if err != nil {
return fmt.Errorf("cannot delete time series: %w", err)
@@ -597,7 +603,7 @@ func labelValuesWithMatches(labelName string, matches []string, etfs [][]storage
if len(tagFilterss) == 0 {
logger.Panicf("BUG: tagFilterss must be non-empty")
}
sq := storage.NewSearchQuery(start, end, tagFilterss)
sq := storage.NewSearchQuery(start, end, tagFilterss, *maxSeriesLimit)
m := make(map[string]struct{})
if end-start > 24*3600*1000 {
// It is cheaper to call SearchMetricNames on time ranges exceeding a day.
@@ -709,12 +715,12 @@ func TSDBStatusHandler(startTime time.Time, w http.ResponseWriter, r *http.Reque
}
var status *storage.TSDBStatus
if len(matches) == 0 && len(etfs) == 0 {
status, err = netstorage.GetTSDBStatusForDate(deadline, date, topN)
status, err = netstorage.GetTSDBStatusForDate(deadline, date, topN, *maxTSDBStatusSeries)
if err != nil {
return fmt.Errorf(`cannot obtain tsdb status for date=%d, topN=%d: %w`, date, topN, err)
}
} else {
status, err = tsdbStatusWithMatches(matches, etfs, date, topN, deadline)
status, err = tsdbStatusWithMatches(matches, etfs, date, topN, *maxTSDBStatusSeries, deadline)
if err != nil {
return fmt.Errorf("cannot obtain tsdb status with matches for date=%d, topN=%d: %w", date, topN, err)
}
@@ -729,7 +735,7 @@ func TSDBStatusHandler(startTime time.Time, w http.ResponseWriter, r *http.Reque
return nil
}
func tsdbStatusWithMatches(matches []string, etfs [][]storage.TagFilter, date uint64, topN int, deadline searchutils.Deadline) (*storage.TSDBStatus, error) {
func tsdbStatusWithMatches(matches []string, etfs [][]storage.TagFilter, date uint64, topN, maxMetrics int, deadline searchutils.Deadline) (*storage.TSDBStatus, error) {
tagFilterss, err := getTagFilterssFromMatches(matches)
if err != nil {
return nil, err
@@ -740,7 +746,7 @@ func tsdbStatusWithMatches(matches []string, etfs [][]storage.TagFilter, date ui
}
start := int64(date*secsPerDay) * 1000
end := int64(date*secsPerDay+secsPerDay) * 1000
sq := storage.NewSearchQuery(start, end, tagFilterss)
sq := storage.NewSearchQuery(start, end, tagFilterss, maxMetrics)
status, err := netstorage.GetTSDBStatusWithFilters(deadline, sq, topN)
if err != nil {
return nil, err
@@ -835,7 +841,7 @@ func labelsWithMatches(matches []string, etfs [][]storage.TagFilter, start, end
if len(tagFilterss) == 0 {
logger.Panicf("BUG: tagFilterss must be non-empty")
}
sq := storage.NewSearchQuery(start, end, tagFilterss)
sq := storage.NewSearchQuery(start, end, tagFilterss, *maxSeriesLimit)
m := make(map[string]struct{})
if end-start > 24*3600*1000 {
// It is cheaper to call SearchMetricNames on time ranges exceeding a day.
@@ -933,7 +939,7 @@ func SeriesHandler(startTime time.Time, w http.ResponseWriter, r *http.Request)
if start >= end {
end = start + defaultStep
}
sq := storage.NewSearchQuery(start, end, tagFilterss)
sq := storage.NewSearchQuery(start, end, tagFilterss, *maxSeriesLimit)
if end-start > 24*3600*1000 {
// It is cheaper to call SearchMetricNames on time ranges exceeding a day.
mns, err := netstorage.SearchMetricNames(sq, deadline)
@@ -1080,6 +1086,7 @@ func QueryHandler(startTime time.Time, w http.ResponseWriter, r *http.Request) e
Start: start,
End: start,
Step: step,
MaxSeries: *maxUniqueTimeseries,
QuotedRemoteAddr: httpserver.GetQuotedRemoteAddr(r),
Deadline: deadline,
LookbackDelta: lookbackDelta,
@@ -1170,6 +1177,7 @@ func queryRangeHandler(startTime time.Time, w http.ResponseWriter, query string,
Start: start,
End: end,
Step: step,
MaxSeries: *maxUniqueTimeseries,
QuotedRemoteAddr: httpserver.GetQuotedRemoteAddr(r),
Deadline: deadline,
MayCache: mayCache,


@@ -93,6 +93,10 @@ type EvalConfig struct {
End int64
Step int64
// MaxSeries is the maximum number of time series, which can be scanned by the query.
// Zero means 'no limit'
MaxSeries int
// QuotedRemoteAddr contains quoted remote address.
QuotedRemoteAddr string
@@ -113,12 +117,13 @@ type EvalConfig struct {
timestampsOnce sync.Once
}
// newEvalConfig returns new EvalConfig copy from src.
func newEvalConfig(src *EvalConfig) *EvalConfig {
// copyEvalConfig returns src copy.
func copyEvalConfig(src *EvalConfig) *EvalConfig {
var ec EvalConfig
ec.Start = src.Start
ec.End = src.End
ec.Step = src.Step
ec.MaxSeries = src.MaxSeries
ec.Deadline = src.Deadline
ec.MayCache = src.MayCache
ec.LookbackDelta = src.LookbackDelta
@@ -575,7 +580,7 @@ func evalRollupFunc(ec *EvalConfig, funcName string, rf rollupFunc, expr metrics
return nil, fmt.Errorf("`@` modifier must return a single series; it returns %d series instead", len(tssAt))
}
atTimestamp := int64(tssAt[0].Values[0] * 1000)
ecNew := newEvalConfig(ec)
ecNew := copyEvalConfig(ec)
ecNew.Start = atTimestamp
ecNew.End = atTimestamp
tss, err := evalRollupFuncWithoutAt(ecNew, funcName, rf, expr, re, iafc)
@@ -602,7 +607,7 @@ func evalRollupFuncWithoutAt(ec *EvalConfig, funcName string, rf rollupFunc, exp
var offset int64
if re.Offset != nil {
offset = re.Offset.Duration(ec.Step)
ecNew = newEvalConfig(ecNew)
ecNew = copyEvalConfig(ecNew)
ecNew.Start -= offset
ecNew.End -= offset
// There is no need in calling AdjustStartEnd() on ecNew if ecNew.MayCache is set to true,
@@ -615,7 +620,7 @@ func evalRollupFuncWithoutAt(ec *EvalConfig, funcName string, rf rollupFunc, exp
// in order to obtain expected OHLC results.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309#issuecomment-582113462
step := ecNew.Step
ecNew = newEvalConfig(ecNew)
ecNew = copyEvalConfig(ecNew)
ecNew.Start += step
ecNew.End += step
offset -= step
@@ -679,7 +684,7 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, funcName string, rf rollupFunc,
}
window := re.Window.Duration(ec.Step)
ecSQ := newEvalConfig(ec)
ecSQ := copyEvalConfig(ec)
ecSQ.Start -= window + maxSilenceInterval + step
ecSQ.End += step
ecSQ.Step = step
@@ -834,7 +839,7 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, funcName string, rf rollupFunc
} else {
minTimestamp -= ec.Step
}
sq := storage.NewSearchQuery(minTimestamp, ec.End, tfss)
sq := storage.NewSearchQuery(minTimestamp, ec.End, tfss, ec.MaxSeries)
rss, err := netstorage.ProcessSearchQuery(sq, true, ec.Deadline)
if err != nil {
return nil, err
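
The hunks above thread `MaxSeries` from `EvalConfig` into each search query and make the copy helper carry it over. A minimal sketch of that pattern follows. It uses simplified, hypothetical types (`evalConfig`, `checkSeriesLimit`) rather than the real `promql` and `storage` packages, and it assumes the limit is enforced by comparing a series count against the configured value, with zero meaning "no limit" as the field comment states. Where the real check happens is not shown in this diff.

```go
package main

import (
	"errors"
	"fmt"
)

// evalConfig is a simplified, hypothetical stand-in for EvalConfig.
type evalConfig struct {
	Start     int64
	End       int64
	Step      int64
	MaxSeries int // zero means "no limit"
}

// copyEvalConfig copies every field explicitly, mirroring the helper in the diff.
// Omitting MaxSeries here would silently drop the limit in sub-evaluations.
func copyEvalConfig(src *evalConfig) *evalConfig {
	var ec evalConfig
	ec.Start = src.Start
	ec.End = src.End
	ec.Step = src.Step
	ec.MaxSeries = src.MaxSeries
	return &ec
}

var errTooManySeries = errors.New("the query matches more series than allowed")

// checkSeriesLimit is a hypothetical enforcement point (an assumption, not
// VictoriaMetrics code): it rejects a result only when a positive limit is exceeded.
func checkSeriesLimit(found, maxSeries int) error {
	if maxSeries > 0 && found > maxSeries {
		return fmt.Errorf("%w: found=%d, limit=%d", errTooManySeries, found, maxSeries)
	}
	return nil
}

func main() {
	ec := &evalConfig{Start: 1000, End: 2000, Step: 100, MaxSeries: 1000}

	// Shift the time range for a sub-evaluation; the limit travels with the copy.
	shifted := copyEvalConfig(ec)
	shifted.Start -= 300
	shifted.End -= 300

	fmt.Println(shifted.MaxSeries == ec.MaxSeries)
	fmt.Println(checkSeriesLimit(1001, shifted.MaxSeries))
}
```

Copying field by field keeps the original config untouched when the copy's time range is shifted, which is why the new field has to be added to the copy helper as well.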


@@ -61,6 +61,7 @@ func TestExecSuccess(t *testing.T) {
Start: start,
End: end,
Step: step,
MaxSeries: 1000,
Deadline: searchutils.NewDeadline(time.Now(), time.Minute, ""),
RoundDigits: 100,
}
@@ -7496,6 +7497,7 @@ func TestExecError(t *testing.T) {
Start: 1000,
End: 2000,
Step: 100,
MaxSeries: 1000,
Deadline: searchutils.NewDeadline(time.Now(), time.Minute, ""),
RoundDigits: 100,
}


@@ -97,6 +97,11 @@ var (
)
// InitRollupResultCache initializes the rollupResult cache
//
// if cachePath is empty, then the cache isn't stored to persistent disk.
//
// ResetRollupResultCache must be called when the cache must be reset.
// StopRollupResultCache must be called when the cache isn't needed anymore.
func InitRollupResultCache(cachePath string) {
rollupResultCachePath = cachePath
startTime := time.Now()
@@ -133,16 +138,19 @@ func InitRollupResultCache(cachePath string) {
rollupResultCachePath, time.Since(startTime).Seconds(), fcs().EntriesCount, fcs().BytesSize)
}
metrics.NewGauge(`vm_cache_entries{type="promql/rollupResult"}`, func() float64 {
// Use metrics.GetOrCreateGauge instead of metrics.NewGauge,
// so InitRollupResultCache+StopRollupResultCache could be called multiple times in tests.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2406
metrics.GetOrCreateGauge(`vm_cache_entries{type="promql/rollupResult"}`, func() float64 {
return float64(fcs().EntriesCount)
})
metrics.NewGauge(`vm_cache_size_bytes{type="promql/rollupResult"}`, func() float64 {
metrics.GetOrCreateGauge(`vm_cache_size_bytes{type="promql/rollupResult"}`, func() float64 {
return float64(fcs().BytesSize)
})
metrics.NewGauge(`vm_cache_requests_total{type="promql/rollupResult"}`, func() float64 {
metrics.GetOrCreateGauge(`vm_cache_requests_total{type="promql/rollupResult"}`, func() float64 {
return float64(fcs().GetCalls)
})
metrics.NewGauge(`vm_cache_misses_total{type="promql/rollupResult"}`, func() float64 {
metrics.GetOrCreateGauge(`vm_cache_misses_total{type="promql/rollupResult"}`, func() float64 {
return float64(fcs().Misses)
})
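
The comment in this hunk gives the motivation: get-or-create registration lets `InitRollupResultCache` and `StopRollupResultCache` run several times in one process. A minimal stdlib-only sketch of the two registration styles follows. The `registry` type and its methods are hypothetical and are not the `github.com/VictoriaMetrics/metrics` implementation; the sketch assumes strict registration rejects a duplicate name.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for a metrics set.
type registry struct {
	mu     sync.Mutex
	gauges map[string]func() float64
}

func newRegistry() *registry {
	return &registry{gauges: make(map[string]func() float64)}
}

// register is the strict style: a duplicate name is reported as an error.
func (r *registry) register(name string, f func() float64) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.gauges[name]; ok {
		return fmt.Errorf("metric %q is already registered", name)
	}
	r.gauges[name] = f
	return nil
}

// getOrCreate is the idempotent style: the callback registered first under
// name is kept and returned on every later call.
func (r *registry) getOrCreate(name string, f func() float64) func() float64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	if g, ok := r.gauges[name]; ok {
		return g
	}
	r.gauges[name] = f
	return f
}

func main() {
	r := newRegistry()
	name := `vm_cache_entries{type="promql/rollupResult"}`

	// Simulate several init/stop cycles that each register the same gauge.
	for i := 0; i < 3; i++ {
		r.getOrCreate(name, func() float64 { return 1 })
	}
	fmt.Println("registered gauges:", len(r.gauges))

	// The strict style fails on the second registration of the same name.
	fmt.Println("strict error:", r.register(name, func() float64 { return 2 }))
}
```

One consequence of the idempotent style in this sketch is that later callbacks are ignored, so it only suits callbacks that read shared state rather than state captured per initialization.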


@@ -3,11 +3,32 @@ package promql
import (
"testing"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metricsql"
)
func TestRollupResultCacheInitStop(t *testing.T) {
t.Run("inmemory", func(t *testing.T) {
for i := 0; i < 5; i++ {
InitRollupResultCache("")
StopRollupResultCache()
}
})
t.Run("file-based", func(t *testing.T) {
cacheFilePath := "test-rollup-result-cache"
for i := 0; i < 3; i++ {
InitRollupResultCache(cacheFilePath)
StopRollupResultCache()
}
fs.MustRemoveAll(cacheFilePath)
})
}
func TestRollupResultCache(t *testing.T) {
InitRollupResultCache("")
defer StopRollupResultCache()
ResetRollupResultCache()
window := int64(456)
ec := &EvalConfig{


@@ -1,12 +1,14 @@
{
"files": {
"main.css": "./static/css/main.098d452b.css",
"main.js": "./static/js/main.523bd341.js",
"main.css": "./static/css/main.d8362c27.css",
"main.js": "./static/js/main.d940c8c2.js",
"static/js/362.1a2113d4.chunk.js": "./static/js/362.1a2113d4.chunk.js",
"static/js/27.939f971b.chunk.js": "./static/js/27.939f971b.chunk.js",
"static/media/README.md": "./static/media/README.5e5724daf3ee333540a3.md",
"index.html": "./index.html"
},
"entrypoints": [
"static/css/main.098d452b.css",
"static/js/main.523bd341.js"
"static/css/main.d8362c27.css",
"static/js/main.d940c8c2.js"
]
}


@@ -1 +1 @@
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1"/><meta name="theme-color" content="#000000"/><meta name="description" content="VM-UI is a metric explorer for Victoria Metrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap"/><script defer="defer" src="./static/js/main.523bd341.js"></script><link href="./static/css/main.098d452b.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1"/><meta name="theme-color" content="#000000"/><meta name="description" content="VM-UI is a metric explorer for Victoria Metrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap"/><script defer="defer" src="./static/js/main.d940c8c2.js"></script><link href="./static/css/main.d8362c27.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>


@@ -1 +1 @@
body{-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale;font-family:-apple-system,BlinkMacSystemFont,Segoe UI,Roboto,Oxygen,Ubuntu,Cantarell,Fira Sans,Droid Sans,Helvetica Neue,sans-serif}code{font-family:source-code-pro,Menlo,Monaco,Consolas,Courier New,monospace}.MuiAccordionSummary-content{margin:0!important}.uplot,.uplot *,.uplot :after,.uplot :before{box-sizing:border-box}.uplot{font-family:system-ui,-apple-system,Segoe UI,Roboto,Helvetica Neue,Arial,Noto Sans,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol,Noto Color Emoji;line-height:1.5;width:-webkit-min-content;width:min-content}.u-title{font-size:18px;font-weight:700;text-align:center}.u-wrap{position:relative;-webkit-user-select:none;-ms-user-select:none;user-select:none}.u-over,.u-under{position:absolute}.u-under{overflow:hidden}.uplot canvas{display:block;height:100%;position:relative;width:100%}.u-axis{position:absolute}.u-legend{font-size:14px;margin:auto;text-align:center}.u-inline{display:block}.u-inline *{display:inline-block}.u-inline tr{margin-right:16px}.u-legend th{font-weight:600}.u-legend th>*{display:inline-block;vertical-align:middle}.u-legend .u-marker{background-clip:padding-box!important;height:1em;margin-right:4px;width:1em}.u-inline.u-live th:after{content:":";vertical-align:middle}.u-inline:not(.u-live) .u-value{display:none}.u-series>*{padding:4px}.u-series th{cursor:pointer}.u-legend .u-off>*{opacity:.3}.u-select{background:rgba(0,0,0,.07)}.u-cursor-x,.u-cursor-y,.u-select{pointer-events:none;position:absolute}.u-cursor-x,.u-cursor-y{left:0;top:0;will-change:transform;z-index:100}.u-hz .u-cursor-x,.u-vt .u-cursor-y{border-right:1px dashed #607d8b;height:100%}.u-hz .u-cursor-y,.u-vt .u-cursor-x{border-bottom:1px dashed #607d8b;width:100%}.u-cursor-pt{background-clip:padding-box!important;border:0 
solid;border-radius:50%;left:0;pointer-events:none;position:absolute;top:0;will-change:transform;z-index:100}.u-axis.u-off,.u-cursor-pt.u-off,.u-cursor-x.u-off,.u-cursor-y.u-off,.u-select.u-off,.u-tooltip{display:none}.u-tooltip{grid-gap:12px;word-wrap:break-word;background:rgba(57,57,57,.9);border-radius:4px;color:#fff;font-family:monospace;font-size:10px;font-weight:500;line-height:1.4em;max-width:300px;padding:8px;pointer-events:none;position:absolute;z-index:100}.u-tooltip-data{align-items:center;display:flex;flex-wrap:wrap;font-size:11px;line-height:150%}.u-tooltip-data__value{font-weight:700;padding:4px}.u-tooltip__info{grid-gap:4px;display:grid}.u-tooltip__marker{height:12px;margin-right:4px;width:12px}.legendWrapper{grid-gap:20px;cursor:default;display:grid;grid-template-columns:repeat(auto-fit,minmax(400px,1fr));margin-top:20px;position:relative}.legendGroup{margin-bottom:24px}.legendGroupTitle{align-items:center;display:grid;font-size:11px;grid-template-columns:43px auto;padding:10px}.legendGroupQuery{grid-column:1/3;opacity:.6}.legendGroupLine{margin-right:10px}.legendItem{grid-gap:6px;align-items:start;background-color:#fff;cursor:pointer;display:inline-grid;grid-template-columns:auto auto;justify-content:start;padding:7px 50px 7px 10px;transition:.2s ease}.legendItemHide{opacity:.5;text-decoration:line-through}.legendItem:hover{background-color:rgba(0,0,0,.1)}.legendMarker{border-style:solid;border-width:2px;box-sizing:border-box;height:12px;transition:.2s ease;width:12px}.legendLabel{font-size:11px;font-weight:400;line-height:12px}.legendFreeFields{cursor:pointer;padding:3px}.legendFreeFields:hover{text-decoration:underline}.legendFreeFields:not(:last-child):after{content:","}.legendWrapperHotkey{align-items:center;display:flex;font-size:11px}.legendWrapperHotkey p{margin-right:20px}.legendWrapperHotkey code{word-wrap:break-word;background-color:#f2f2f2;border:1px solid 
#dedede;border-radius:2px;color:#0a0a0a;display:inline;font-size:10px;font-weight:400;max-width:100%;padding:4px 6px}
body{-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale;font-family:-apple-system,BlinkMacSystemFont,Segoe UI,Roboto,Oxygen,Ubuntu,Cantarell,Fira Sans,Droid Sans,Helvetica Neue,sans-serif}code{font-family:source-code-pro,Menlo,Monaco,Consolas,Courier New,monospace}.MuiAccordionSummary-content{margin:0!important}.uplot,.uplot *,.uplot :after,.uplot :before{box-sizing:border-box}.uplot{font-family:system-ui,-apple-system,Segoe UI,Roboto,Helvetica Neue,Arial,Noto Sans,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol,Noto Color Emoji;line-height:1.5;width:-webkit-min-content;width:min-content}.u-title{font-size:18px;font-weight:700;text-align:center}.u-wrap{position:relative;-webkit-user-select:none;-ms-user-select:none;user-select:none}.u-over,.u-under{position:absolute}.u-under{overflow:hidden}.uplot canvas{display:block;height:100%;position:relative;width:100%}.u-axis{position:absolute}.u-legend{font-size:14px;margin:auto;text-align:center}.u-inline{display:block}.u-inline *{display:inline-block}.u-inline tr{margin-right:16px}.u-legend th{font-weight:600}.u-legend th>*{display:inline-block;vertical-align:middle}.u-legend .u-marker{background-clip:padding-box!important;height:1em;margin-right:4px;width:1em}.u-inline.u-live th:after{content:":";vertical-align:middle}.u-inline:not(.u-live) .u-value{display:none}.u-series>*{padding:4px}.u-series th{cursor:pointer}.u-legend .u-off>*{opacity:.3}.u-select{background:rgba(0,0,0,.07)}.u-cursor-x,.u-cursor-y,.u-select{pointer-events:none;position:absolute}.u-cursor-x,.u-cursor-y{left:0;top:0;will-change:transform;z-index:100}.u-hz .u-cursor-x,.u-vt .u-cursor-y{border-right:1px dashed #607d8b;height:100%}.u-hz .u-cursor-y,.u-vt .u-cursor-x{border-bottom:1px dashed #607d8b;width:100%}.u-cursor-pt{background-clip:padding-box!important;border:0 
solid;border-radius:50%;left:0;pointer-events:none;position:absolute;top:0;will-change:transform;z-index:100}.u-axis.u-off,.u-cursor-pt.u-off,.u-cursor-x.u-off,.u-cursor-y.u-off,.u-select.u-off,.u-tooltip{display:none}.u-tooltip{grid-gap:12px;word-wrap:break-word;background:rgba(57,57,57,.9);border-radius:4px;color:#fff;font-family:monospace;font-size:10px;font-weight:500;line-height:1.4em;max-width:300px;padding:8px;pointer-events:none;position:absolute;z-index:100}.u-tooltip-data{align-items:center;display:flex;flex-wrap:wrap;font-size:11px;line-height:150%}.u-tooltip-data__value{font-weight:700;padding:4px}.u-tooltip__info{grid-gap:4px;display:grid}.u-tooltip__marker{height:12px;margin-right:4px;width:12px}.legendWrapper{cursor:default;display:flex;flex-wrap:wrap;margin-top:20px;position:relative}.legendGroup{margin:0 12px 24px 0}.legendGroupTitle{align-items:center;display:grid;font-size:11px;grid-template-columns:43px auto;padding:10px}.legendGroupQuery{grid-column:1/3;opacity:.6}.legendGroupLine{margin-right:10px}.legendItem{grid-gap:6px;align-items:start;background-color:#fff;cursor:pointer;display:grid;grid-template-columns:auto auto;justify-content:start;padding:7px 50px 7px 10px;transition:.2s ease}.legendItemHide{opacity:.5;text-decoration:line-through}.legendItem:hover{background-color:rgba(0,0,0,.1)}.legendMarker{border-style:solid;border-width:2px;box-sizing:border-box;height:12px;transition:.2s ease;width:12px}.legendLabel{font-size:11px;font-weight:400;line-height:12px}.legendFreeFields{cursor:pointer;padding:3px}.legendFreeFields:hover{text-decoration:underline}.legendFreeFields:not(:last-child):after{content:","}.legendWrapperHotkey{align-items:center;display:flex;font-size:11px}.legendWrapperHotkey p{margin-right:20px}.legendWrapperHotkey code{word-wrap:break-word;background-color:#f2f2f2;border:1px solid #dedede;border-radius:2px;color:#0a0a0a;display:inline;font-size:10px;font-weight:400;max-width:100%;padding:4px 6px}.panelDescription 
ul{line-height:2.2}.panelDescription a{color:#fff}.panelDescription code{background-color:rgba(0,0,0,.3);border-radius:2px;color:#fff;display:inline;font-size:inherit;font-weight:400;max-width:100%;padding:4px 6px}


@@ -0,0 +1 @@
"use strict";(self.webpackChunkvmui=self.webpackChunkvmui||[]).push([[362],{8362:function(e,s,u){e.exports=u.p+"static/media/README.5e5724daf3ee333540a3.md"}}]);

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@@ -6,7 +6,29 @@
* @license MIT
*/
/** @license MUI v5.4.4
/**
* React Router DOM v6.3.0
*
* Copyright (c) Remix Software Inc.
*
* This source code is licensed under the MIT license found in the
* LICENSE.md file in the root directory of this source tree.
*
* @license MIT
*/
/**
* React Router v6.3.0
*
* Copyright (c) Remix Software Inc.
*
* This source code is licensed under the MIT license found in the
* LICENSE.md file in the root directory of this source tree.
*
* @license MIT
*/
/** @license MUI v5.5.2
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.


@@ -0,0 +1,77 @@
### Configuration options
<br/>
DashboardSettings:
| Name | Type | Description |
|:----------|:----------------:|---------------------------:|
| rows* | `DashboardRow[]` | Sections containing panels |
| title | `string` | Dashboard title |
<br/>
DashboardRow:
| Name | Type | Description |
|:-----------|:-----------------:|---------------------------:|
| panels* | `PanelSettings[]` | List of panels (charts) |
| title | `string` | Row title |
<br/>
PanelSettings:
| Name | Type | Description |
|:------------|:----------:|-------------------------------------------------------------------------------------------:|
| expr* | `string[]` | Data source queries |
| title | `string` | Panel title |
| description | `string` | Additional information about the panel |
| unit | `string` | Y-axis unit |
| showLegend | `boolean` | If `false`, the legend is hidden. Default value: `true` |
| width | `number` | The number of columns the panel uses.<br/> From 1 (minimum width) to 12 (full width). |
---
### Example JSON
```json
{
"title": "Example",
"rows": [
{
"title": "Performance",
"panels": [
{
"title": "Query duration",
"description": "Lower is better.\n* `*` - unsupported query path\n* `/write` - insert into VM\n* `/metrics` - query VM system metrics\n* `/query` - query instant values\n* `/query_range` - query over a range of time\n* `/series` - match a certain label set\n* `/label/{}/values` - query a list of label values (variables mostly)",
"unit": "ms",
"showLegend": false,
"expr": [
"max(vm_request_duration_seconds{quantile=~\"(0.5|0.99)\"}) by (path, quantile) > 0"
]
},
{
"title": "Concurrent flushes on disk",
"description": "Shows how many insertions to disk (not API /write calls) are in progress, where:\n* `max` - equal to the number of CPUs;\n* `current` - current number of goroutines busy inserting rows into the underlying storage.\n\nEvery successful API /write call results in a flush to disk. However, these two actions are separate and controlled by different concurrency limiters. The `max` on this panel can't be changed and is always equal to the number of CPUs.\n\nWhen `current` constantly hits `max`, the storage is overloaded and requires more CPU.\n\n",
"expr": [
"sum(vm_concurrent_addrows_capacity)",
"sum(vm_concurrent_addrows_current)"
]
}
]
},
{
"title": "Troubleshooting",
"panels": [
{
"title": "Churn rate",
"description": "Shows the rate and total number of new series created over the last 24h.\n\nA high churn rate is tightly connected with database performance and may result in unexpected OOMs or slow queries. It is recommended to always keep an eye on this metric to avoid unexpected cardinality \"explosions\".\n\nThe higher the churn rate, the more resources are required to handle it. Consider keeping the churn rate as low as possible.\n\nGood references to read:\n* https://www.robustperception.io/cardinality-is-key\n* https://www.robustperception.io/using-tsdb-analyze-to-investigate-churn-and-cardinality",
"expr": [
"sum(rate(vm_new_timeseries_created_total[5m]))",
"sum(increase(vm_new_timeseries_created_total[24h]))"
]
}
]
}
]
}
```
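
A minimal sketch of how a consumer could decode and validate this configuration follows. It is written in Go only to match the backend code in this change set; vmui itself is TypeScript. The struct names mirror the tables above. The `parseDashboard` function, the `LegendVisible` helper, and the specific checks (fields marked `*` must be non-empty, `width` must be between 1 and 12 when set) are illustrative assumptions, not vmui code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// PanelSettings mirrors the PanelSettings table; only expr is required.
type PanelSettings struct {
	Expr        []string `json:"expr"`
	Title       string   `json:"title,omitempty"`
	Description string   `json:"description,omitempty"`
	Unit        string   `json:"unit,omitempty"`
	ShowLegend  *bool    `json:"showLegend,omitempty"` // nil means "not set"
	Width       int      `json:"width,omitempty"`      // 0 means "not set"
}

// DashboardRow mirrors the DashboardRow table; only panels is required.
type DashboardRow struct {
	Panels []PanelSettings `json:"panels"`
	Title  string          `json:"title,omitempty"`
}

// DashboardSettings mirrors the DashboardSettings table; only rows is required.
type DashboardSettings struct {
	Rows  []DashboardRow `json:"rows"`
	Title string         `json:"title,omitempty"`
}

// LegendVisible applies the documented default: the legend is shown unless
// showLegend is explicitly set to false.
func (p PanelSettings) LegendVisible() bool {
	return p.ShowLegend == nil || *p.ShowLegend
}

// parseDashboard decodes the JSON and applies the assumed validation rules.
func parseDashboard(data []byte) (*DashboardSettings, error) {
	var d DashboardSettings
	if err := json.Unmarshal(data, &d); err != nil {
		return nil, fmt.Errorf("cannot parse dashboard: %w", err)
	}
	if len(d.Rows) == 0 {
		return nil, errors.New("rows is required")
	}
	for i, row := range d.Rows {
		if len(row.Panels) == 0 {
			return nil, fmt.Errorf("rows[%d]: panels is required", i)
		}
		for j, p := range row.Panels {
			if len(p.Expr) == 0 {
				return nil, fmt.Errorf("rows[%d].panels[%d]: expr is required", i, j)
			}
			if p.Width != 0 && (p.Width < 1 || p.Width > 12) {
				return nil, fmt.Errorf("rows[%d].panels[%d]: width must be in [1, 12]", i, j)
			}
		}
	}
	return &d, nil
}

func main() {
	cfg := `{"title":"Example","rows":[{"title":"Performance","panels":[
		{"title":"Query duration","unit":"ms","showLegend":false,"expr":["up"]}]}]}`
	d, err := parseDashboard([]byte(cfg))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	p := d.Rows[0].Panels[0]
	fmt.Println(d.Title, p.Title, p.Unit, p.LegendVisible())
}
```

A pointer is used for `showLegend` so that an omitted field can be told apart from an explicit `false`, which is what allows the default of `true` to be applied.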


@@ -227,17 +227,17 @@ func SearchTagEntries(maxTagKeys, maxTagValues int, deadline uint64) ([]storage.
}
// GetTSDBStatusForDate returns TSDB status for the given date.
func GetTSDBStatusForDate(date uint64, topN int, deadline uint64) (*storage.TSDBStatus, error) {
func GetTSDBStatusForDate(date uint64, topN, maxMetrics int, deadline uint64) (*storage.TSDBStatus, error) {
WG.Add(1)
status, err := Storage.GetTSDBStatusWithFiltersForDate(nil, date, topN, deadline)
status, err := Storage.GetTSDBStatusWithFiltersForDate(nil, date, topN, maxMetrics, deadline)
WG.Done()
return status, err
}
// GetTSDBStatusWithFiltersForDate returns TSDB status for given filters on the given date.
func GetTSDBStatusWithFiltersForDate(tfss []*storage.TagFilters, date uint64, topN int, deadline uint64) (*storage.TSDBStatus, error) {
func GetTSDBStatusWithFiltersForDate(tfss []*storage.TagFilters, date uint64, topN, maxMetrics int, deadline uint64) (*storage.TSDBStatus, error) {
WG.Add(1)
status, err := Storage.GetTSDBStatusWithFiltersForDate(tfss, date, topN, deadline)
status, err := Storage.GetTSDBStatusWithFiltersForDate(tfss, date, topN, maxMetrics, deadline)
WG.Done()
return status, err
}
@@ -680,6 +680,10 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_entries{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheSize())
})
metrics.NewGauge(`vm_cache_entries{type="storage/regexpPrefixes"}`, func() float64 {
return float64(storage.RegexpPrefixesCacheSize())
})
metrics.NewGauge(`vm_cache_entries{type="storage/prefetchedMetricIDs"}`, func() float64 {
return float64(m().PrefetchedMetricIDsSize)
})
@@ -714,6 +718,12 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_size_bytes{type="indexdb/tagFilters"}`, func() float64 {
return float64(idbm().TagFiltersCacheSizeBytes)
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheSizeBytes())
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/regexpPrefixes"}`, func() float64 {
return float64(storage.RegexpPrefixesCacheSizeBytes())
})
metrics.NewGauge(`vm_cache_size_bytes{type="storage/prefetchedMetricIDs"}`, func() float64 {
return float64(m().PrefetchedMetricIDsSizeBytes)
})
@@ -739,6 +749,12 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_size_max_bytes{type="indexdb/tagFilters"}`, func() float64 {
return float64(idbm().TagFiltersCacheSizeMaxBytes)
})
metrics.NewGauge(`vm_cache_size_max_bytes{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheMaxSizeBytes())
})
metrics.NewGauge(`vm_cache_size_max_bytes{type="storage/regexpPrefixes"}`, func() float64 {
return float64(storage.RegexpPrefixesCacheMaxSizeBytes())
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheRequests)
@@ -764,6 +780,9 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_requests_total{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheRequests())
})
metrics.NewGauge(`vm_cache_requests_total{type="storage/regexpPrefixes"}`, func() float64 {
return float64(storage.RegexpPrefixesCacheRequests())
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/tsid"}`, func() float64 {
return float64(m().TSIDCacheMisses)
@@ -789,6 +808,9 @@ func registerStorageMetrics() {
metrics.NewGauge(`vm_cache_misses_total{type="storage/regexps"}`, func() float64 {
return float64(storage.RegexpCacheMisses())
})
metrics.NewGauge(`vm_cache_misses_total{type="storage/regexpPrefixes"}`, func() float64 {
return float64(storage.RegexpPrefixesCacheMisses())
})
metrics.NewGauge(`vm_deleted_metrics_total{type="indexdb"}`, func() float64 {
return float64(idbm().DeletedMetricsCount)


@@ -41,7 +41,11 @@ module.exports = {
"react/prop-types": 0,
"max-lines": [
"error",
{"max": 150}
{
"max": 150,
"skipBlankLines": true,
"skipComments": true,
}
]
},
"settings": {

File diff suppressed because it is too large


@@ -6,28 +6,33 @@
"dependencies": {
"@date-io/dayjs": "^2.13.1",
"@emotion/styled": "^11.8.1",
"@mui/icons-material": "^5.5.1",
"@mui/icons-material": "^5.6.0",
"@mui/lab": "^5.0.0-alpha.73",
"@mui/material": "^5.5.1",
"@mui/styles": "^5.5.1",
"@testing-library/jest-dom": "^5.16.2",
"@testing-library/react": "^12.1.4",
"@testing-library/user-event": "^13.5.0",
"@testing-library/react": "^13.0.0",
"@testing-library/user-event": "^14.0.4",
"@types/jest": "^27.4.1",
"@types/lodash.debounce": "^4.0.6",
"@types/lodash.get": "^4.4.6",
"@types/lodash.throttle": "^4.1.6",
"@types/marked": "^4.0.2",
"@types/node": "^17.0.21",
"@types/qs": "^6.9.7",
"@types/react": "^17.0.40",
"@types/react-dom": "^17.0.11",
"@types/react": "^17.0.43",
"@types/react-dom": "^18.0.0",
"@types/react-measure": "^2.0.8",
"@types/react-router-dom": "^5.3.3",
"@types/webpack-env": "^1.16.3",
"dayjs": "^1.11.0",
"lodash.debounce": "^4.0.8",
"lodash.get": "^4.4.2",
"lodash.throttle": "^4.1.1",
"preact": "^10.6.6",
"marked": "^4.0.14",
"preact": "^10.7.1",
"qs": "^6.10.3",
"react-router-dom": "^6.3.0",
"typescript": "~4.6.2",
"uplot": "^1.6.19",
"web-vitals": "^2.1.4"


@@ -1,6 +1,6 @@
import React, {FC} from "preact/compat";
import {HashRouter, Route, Routes} from "react-router-dom";
import {SnackbarProvider} from "./contexts/Snackbar";
import HomeLayout from "./components/Home/HomeLayout";
import {StateProvider} from "./state/common/StateContext";
import {AuthStateProvider} from "./state/auth/AuthStateContext";
import {GraphStateProvider} from "./state/graph/GraphStateContext";
@@ -9,6 +9,11 @@ import { ThemeProvider, StyledEngineProvider } from "@mui/material/styles";
import CssBaseline from "@mui/material/CssBaseline";
import LocalizationProvider from "@mui/lab/LocalizationProvider";
import DayjsUtils from "@date-io/dayjs";
import router from "./router/index";
import CustomPanel from "./components/CustomPanel/CustomPanel";
import HomeLayout from "./components/Home/HomeLayout";
import DashboardsLayout from "./components/PredefinedPanels/DashboardsLayout";
const App: FC = () => {
@@ -22,7 +27,14 @@ const App: FC = () => {
<AuthStateProvider> {/* Auth related info - optionally persisted to Local Storage */}
<GraphStateProvider> {/* Graph settings */}
<SnackbarProvider> {/* Display various snackbars */}
<HomeLayout/>
<HashRouter>
<Routes>
<Route path={"/"} element={<HomeLayout/>}>
<Route path={router.home} element={<CustomPanel/>}/>
<Route path={router.dashboards} element={<DashboardsLayout/>}/>
</Route>
</Routes>
</HashRouter>
</SnackbarProvider>
</GraphStateProvider>
</AuthStateProvider>


@@ -3,29 +3,31 @@ import {ChangeEvent} from "react";
import Box from "@mui/material/Box";
import FormControlLabel from "@mui/material/FormControlLabel";
import TextField from "@mui/material/TextField";
import {useGraphDispatch, useGraphState} from "../../../../state/graph/GraphStateContext";
import debounce from "lodash.debounce";
import BasicSwitch from "../../../../theme/switch";
import {AxisRange, YaxisState} from "../../../../state/graph/reducer";
const AxesLimitsConfigurator: FC = () => {
interface AxesLimitsConfiguratorProps {
yaxis: YaxisState,
setYaxisLimits: (limits: AxisRange) => void,
toggleEnableLimits: () => void
}
const AxesLimitsConfigurator: FC<AxesLimitsConfiguratorProps> = ({yaxis, setYaxisLimits, toggleEnableLimits}) => {
const { yaxis } = useGraphState();
const graphDispatch = useGraphDispatch();
const axes = useMemo(() => Object.keys(yaxis.limits.range), [yaxis.limits.range]);
const onChangeYaxisLimits = () => { graphDispatch({type: "TOGGLE_ENABLE_YAXIS_LIMITS"}); };
const onChangeLimit = (e: ChangeEvent<HTMLInputElement | HTMLTextAreaElement>, axis: string, index: number) => {
const newLimits = yaxis.limits.range;
newLimits[axis][index] = +e.target.value;
if (newLimits[axis][0] === newLimits[axis][1] || newLimits[axis][0] > newLimits[axis][1]) return;
graphDispatch({type: "SET_YAXIS_LIMITS", payload: newLimits});
setYaxisLimits(newLimits);
};
const debouncedOnChangeLimit = useCallback(debounce(onChangeLimit, 500), [yaxis.limits.range]);
return <Box display="grid" alignItems="center" gap={2}>
<FormControlLabel
control={<BasicSwitch checked={yaxis.limits.enable} onChange={onChangeYaxisLimits}/>}
control={<BasicSwitch checked={yaxis.limits.enable} onChange={toggleEnableLimits}/>}
label="Fix the limits for y-axis"
/>
<Box display="grid" alignItems="center" gap={2}>


@@ -10,6 +10,7 @@ import Typography from "@mui/material/Typography";
import makeStyles from "@mui/styles/makeStyles";
import CloseIcon from "@mui/icons-material/Close";
import ClickAwayListener from "@mui/material/ClickAwayListener";
import {AxisRange, YaxisState} from "../../../../state/graph/reducer";
const useStyles = makeStyles({
popover: {
@@ -35,7 +36,13 @@ const useStyles = makeStyles({
const title = "Axes Settings";
const GraphSettings: FC = () => {
interface GraphSettingsProps {
yaxis: YaxisState,
setYaxisLimits: (limits: AxisRange) => void,
toggleEnableLimits: () => void
}
const GraphSettings: FC<GraphSettingsProps> = ({yaxis, setYaxisLimits, toggleEnableLimits}) => {
const [anchorEl, setAnchorEl] = useState<HTMLButtonElement | null>(null);
const open = Boolean(anchorEl);
@@ -61,7 +68,11 @@ const GraphSettings: FC = () => {
</IconButton>
</div>
<Box className={classes.popoverBody}>
<AxesLimitsConfigurator/>
<AxesLimitsConfigurator
yaxis={yaxis}
setYaxisLimits={setYaxisLimits}
toggleEnableLimits={toggleEnableLimits}
/>
</Box>
</Paper>
</ClickAwayListener>


@@ -5,10 +5,14 @@ import {saveToStorage} from "../../../../utils/storage";
import {useAppDispatch, useAppState} from "../../../../state/common/StateContext";
import BasicSwitch from "../../../../theme/switch";
import StepConfigurator from "./StepConfigurator";
import {useGraphDispatch, useGraphState} from "../../../../state/graph/GraphStateContext";
const AdditionalSettings: FC = () => {
const {queryControls: {autocomplete, nocache}} = useAppState();
const {customStep} = useGraphState();
const graphDispatch = useGraphDispatch();
const {queryControls: {autocomplete, nocache}, time: {period: {step}}} = useAppState();
const dispatch = useAppDispatch();
const onChangeAutocomplete = () => {
@@ -33,7 +37,13 @@ const AdditionalSettings: FC = () => {
/>
</Box>
<Box ml={2}>
<StepConfigurator/>
<StepConfigurator defaultStep={step} customStepEnable={customStep.enable}
setStep={(value) => {
graphDispatch({type: "SET_CUSTOM_STEP", payload: value});
}}
toggleEnableStep={() => {
graphDispatch({type: "TOGGLE_CUSTOM_STEP"});
}}/>
</Box>
</Box>;
};


@@ -0,0 +1,60 @@
import React, {FC, useEffect, useState} from "preact/compat";
import {ChangeEvent} from "react";
import Box from "@mui/material/Box";
import FormControlLabel from "@mui/material/FormControlLabel";
import TextField from "@mui/material/TextField";
import BasicSwitch from "../../../../theme/switch";
interface StepConfiguratorProps {
defaultStep?: number,
customStepEnable: boolean,
setStep: (step: number) => void,
toggleEnableStep: () => void
}
const StepConfigurator: FC<StepConfiguratorProps> = ({
defaultStep, customStepEnable, setStep, toggleEnableStep
}) => {
const [customStep, setCustomStep] = useState(defaultStep);
const [error, setError] = useState(false);
useEffect(() => {
setStep(customStep || 1);
}, [customStep]);
const onChangeStep = (e: ChangeEvent<HTMLInputElement | HTMLTextAreaElement>) => {
if (!customStepEnable) return;
const value = +e.target.value;
if (value > 0) {
setCustomStep(value);
setError(false);
} else {
setError(true);
}
};
const onChangeEnableStep = () => {
setError(false);
toggleEnableStep();
};
return <Box display="grid" gridTemplateColumns="auto 120px" alignItems="center">
<FormControlLabel
control={<BasicSwitch checked={customStepEnable} onChange={onChangeEnableStep}/>}
label="Override step value"
/>
<TextField
label="Step value"
type="number"
size="small"
variant="outlined"
value={customStep}
disabled={!customStepEnable}
error={error}
helperText={error ? "step is out of allowed range" : " "}
onChange={onChangeStep}/>
</Box>;
};
export default StepConfigurator;
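
The `onChangeStep` handler above accepts only positive numeric input and flags everything else as an error. A minimal sketch of that rule as a standalone function, assuming a hypothetical helper name `parseStep` that does not appear in the diff:

```typescript
// Hypothetical helper (not part of the diff): mirrors the `value > 0` check in onChangeStep.
const parseStep = (raw: string): number | null => {
  // Unary plus converts the input as the component does: "" becomes 0, "abc" becomes NaN.
  const value = +raw;
  // NaN, zero and negative numbers all fail the comparison and are rejected.
  return value > 0 ? value : null;
};

// Usage: a null result corresponds to setError(true) in the component.
const accepted = parseStep("15");
const rejected = parseStep("abc");
console.log(accepted, rejected);
```

Returning `null` instead of setting component state keeps the check easy to test in isolation; the component itself keeps the last valid step when the input is rejected.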


@@ -10,6 +10,7 @@ import KeyboardArrowDownIcon from "@mui/icons-material/KeyboardArrowDown";
import List from "@mui/material/List";
import ListItem from "@mui/material/ListItem";
import ListItemText from "@mui/material/ListItemText";
import {useLocation} from "react-router-dom";
interface AutoRefreshOption {
seconds: number
@@ -36,6 +37,12 @@ export const ExecutionControls: FC = () => {
const dispatch = useAppDispatch();
const {queryControls: {autoRefresh}} = useAppState();
const location = useLocation();
useEffect(() => {
if (autoRefresh) dispatch({type: "TOGGLE_AUTOREFRESH"});
}, [location]);
const [selectedDelay, setSelectedDelay] = useState<AutoRefreshOption>(delayOptions[0]);
const handleChange = (d: AutoRefreshOption) => {


@@ -0,0 +1,68 @@
import React, {FC} from "preact/compat";
import Alert from "@mui/material/Alert";
import Box from "@mui/material/Box";
import GraphView from "./Views/GraphView";
import TableView from "./Views/TableView";
import {useAppDispatch, useAppState} from "../../state/common/StateContext";
import QueryConfigurator from "./Configurator/Query/QueryConfigurator";
import {useFetchQuery} from "../../hooks/useFetchQuery";
import JsonView from "./Views/JsonView";
import {DisplayTypeSwitch} from "./Configurator/DisplayTypeSwitch";
import GraphSettings from "./Configurator/Graph/GraphSettings";
import {useGraphDispatch, useGraphState} from "../../state/graph/GraphStateContext";
import {AxisRange} from "../../state/graph/reducer";
import Spinner from "../common/Spinner";
const CustomPanel: FC = () => {
const {displayType, time: {period}, query} = useAppState();
const { customStep, yaxis } = useGraphState();
const dispatch = useAppDispatch();
const graphDispatch = useGraphDispatch();
const setYaxisLimits = (limits: AxisRange) => {
graphDispatch({type: "SET_YAXIS_LIMITS", payload: limits});
};
const toggleEnableLimits = () => {
graphDispatch({type: "TOGGLE_ENABLE_YAXIS_LIMITS"});
};
const setPeriod = ({from, to}: {from: Date, to: Date}) => {
dispatch({type: "SET_PERIOD", payload: {from, to}});
};
const {isLoading, liveData, graphData, error, queryOptions} = useFetchQuery({
visible: true,
customStep
});
return (
<Box p={4} display="grid" gridTemplateRows="auto 1fr" style={{minHeight: "calc(100vh - 64px)"}}>
<QueryConfigurator error={error} queryOptions={queryOptions}/>
<Box height="100%">
{isLoading && <Spinner isLoading={isLoading} height={"500px"}/>}
{<Box height={"100%"} bgcolor={"#fff"}>
<Box display="grid" gridTemplateColumns="1fr auto" alignItems="center" mx={-4} px={4} mb={2}
borderBottom={1} borderColor="divider">
<DisplayTypeSwitch/>
{displayType === "chart" && <GraphSettings
yaxis={yaxis}
setYaxisLimits={setYaxisLimits}
toggleEnableLimits={toggleEnableLimits}
/>}
</Box>
{error && <Alert color="error" severity="error" sx={{whiteSpace: "pre-wrap", mt: 2}}>{error}</Alert>}
{graphData && period && (displayType === "chart") &&
<GraphView data={graphData} period={period} customStep={customStep} query={query} yaxis={yaxis}
setYaxisLimits={setYaxisLimits} setPeriod={setPeriod}/>}
{liveData && (displayType === "code") && <JsonView data={liveData}/>}
{liveData && (displayType === "table") && <TableView data={liveData}/>}
</Box>}
</Box>
</Box>
);
};
export default CustomPanel;


@@ -1,16 +1,25 @@
import React, {FC, useEffect, useMemo, useState} from "preact/compat";
import React, {FC, useEffect, useMemo, useRef, useState} from "preact/compat";
import {MetricResult} from "../../../api/types";
import LineChart from "../../LineChart/LineChart";
import {AlignedData as uPlotData, Series as uPlotSeries} from "uplot";
import Legend from "../../Legend/Legend";
import {useGraphDispatch, useGraphState} from "../../../state/graph/GraphStateContext";
import {getHideSeries, getLegendItem, getSeriesItem} from "../../../utils/uplot/series";
import {getLimitsYAxis, getTimeSeries} from "../../../utils/uplot/axes";
import {LegendItem} from "../../../utils/uplot/types";
import {useAppState} from "../../../state/common/StateContext";
import {TimeParams} from "../../../types";
import {AxisRange, CustomStep, YaxisState} from "../../../state/graph/reducer";
import Alert from "@mui/material/Alert";
export interface GraphViewProps {
data?: MetricResult[];
period: TimeParams;
customStep: CustomStep;
query: string[];
yaxis: YaxisState;
unit?: string;
showLegend?: boolean;
setYaxisLimits: (val: AxisRange) => void
setPeriod: ({from, to}: {from: Date, to: Date}) => void
}
const promValueToNumber = (s: string): number => {
@@ -28,10 +37,17 @@ const promValueToNumber = (s: string): number => {
}
};
const GraphView: FC<GraphViewProps> = ({data = []}) => {
const graphDispatch = useGraphDispatch();
const {time: {period}} = useAppState();
const { customStep } = useGraphState();
const GraphView: FC<GraphViewProps> = ({
data = [],
period,
customStep,
query,
yaxis,
unit,
showLegend = true,
setYaxisLimits,
setPeriod
}) => {
const currentStep = useMemo(() => customStep.enable ? customStep.value : period.step || 1, [period.step, customStep]);
const [dataChart, setDataChart] = useState<uPlotData>([[]]);
@@ -41,7 +57,7 @@ const GraphView: FC<GraphViewProps> = ({data = []}) => {
const setLimitsYaxis = (values: {[key: string]: number[]}) => {
const limits = getLimitsYAxis(values);
graphDispatch({type: "SET_YAXIS_LIMITS", payload: limits});
setYaxisLimits(limits);
};
const onChangeLegend = (legend: LegendItem, metaKey: boolean) => {
@@ -110,13 +126,17 @@ const GraphView: FC<GraphViewProps> = ({data = []}) => {
setLegend(tempLegend);
}, [hideSeries]);
const containerRef = useRef<HTMLDivElement>(null);
return <>
{(data.length > 0)
? <div>
<LineChart data={dataChart} series={series} metrics={data}/>
<Legend labels={legend} onChange={onChangeLegend}/>
{(data.length > 0) ?
<div style={{width: "100%"}} ref={containerRef}>
{containerRef?.current &&
<LineChart data={dataChart} series={series} metrics={data} period={period} yaxis={yaxis} unit={unit}
setPeriod={setPeriod} container={containerRef?.current}/>}
{showLegend && <Legend labels={legend} query={query} onChange={onChangeLegend}/>}
</div>
: <div style={{textAlign: "center"}}>No data to show</div>}
: <Alert color="warning" severity="warning" sx={{mt: 2}}>No data to show</Alert>}
</>;
};
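
The `currentStep` memo in `GraphView` picks the custom step when it is enabled and otherwise falls back to the period step, then to `1`. A small sketch of that precedence, assuming a hypothetical function name `resolveStep` and a local `CustomStep` interface with the `enable` and `value` fields used in the diff:

```typescript
// Local stand-in for the CustomStep type imported from state/graph/reducer in the diff.
interface CustomStep {
  enable: boolean;
  value: number;
}

// Hypothetical helper (not part of the diff). The parentheses make explicit that
// `|| 1` applies only to the period step, which is how the original ternary parses.
const resolveStep = (customStep: CustomStep, periodStep?: number): number =>
  customStep.enable ? customStep.value : (periodStep || 1);

// Usage: a disabled custom step with no period step falls back to 1.
console.log(resolveStep({enable: false, value: 30}, undefined));
```

Note that an enabled custom step is returned as-is, so the fallback to `1` never applies to it.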


@@ -10,6 +10,7 @@ import TableRow from "@mui/material/TableRow";
import TableSortLabel from "@mui/material/TableSortLabel";
import makeStyles from "@mui/styles/makeStyles";
import {useSortedCategories} from "../../../hooks/useSortedCategories";
import Alert from "@mui/material/Alert";
export interface GraphViewProps {
data: InstantMetricResult[];
@@ -98,7 +99,7 @@ const TableView: FC<GraphViewProps> = ({data}) => {
</TableBody>
</Table>
</TableContainer>
: <div style={{textAlign: "center"}}>No data to show</div>}
: <Alert color="warning" severity="warning" sx={{mt: 2}}>No data to show</Alert>}
</>
);
};


@@ -1,15 +1,19 @@
import React, {FC} from "preact/compat";
import React, {FC, useState} from "preact/compat";
import AppBar from "@mui/material/AppBar";
import Box from "@mui/material/Box";
import Link from "@mui/material/Link";
import Toolbar from "@mui/material/Toolbar";
import Typography from "@mui/material/Typography";
import {ExecutionControls} from "../Home/Configurator/Time/ExecutionControls";
import {ExecutionControls} from "../CustomPanel/Configurator/Time/ExecutionControls";
import Logo from "../common/Logo";
import makeStyles from "@mui/styles/makeStyles";
import {setQueryStringWithoutPageReload} from "../../utils/query-string";
import {TimeSelector} from "../Home/Configurator/Time/TimeSelector";
import GlobalSettings from "../Home/Configurator/Settings/GlobalSettings";
import {TimeSelector} from "../CustomPanel/Configurator/Time/TimeSelector";
import GlobalSettings from "../CustomPanel/Configurator/Settings/GlobalSettings";
import {Link as RouterLink, useLocation, useNavigate} from "react-router-dom";
import Tabs from "@mui/material/Tabs";
import Tab from "@mui/material/Tab";
import router from "../../router/index";
const useStyles = makeStyles({
logo: {
@@ -32,18 +36,41 @@ const useStyles = makeStyles({
"&:hover": {
opacity: ".8",
}
},
menuLink: {
display: "block",
padding: "16px 8px",
color: "white",
fontSize: "11px",
textDecoration: "none",
cursor: "pointer",
textTransform: "uppercase",
borderRadius: "4px",
transition: ".2s background",
"&:hover": {
boxShadow: "rgba(0, 0, 0, 0.15) 0px 2px 8px"
}
}
});
const Header: FC = () => {
const classes = useStyles();
const {search, pathname} = useLocation();
const navigate = useNavigate();
const [activeMenu, setActiveMenu] = useState(pathname);
const onClickLogo = () => {
navigateHandler(router.home);
setQueryStringWithoutPageReload("");
window.location.reload();
};
const navigateHandler = (pathname: string) => {
navigate({pathname, search: search});
};
return <AppBar position="static" sx={{px: 1, boxShadow: "none"}}>
<Toolbar>
<Box display="grid" alignItems="center" justifyContent="center">
@@ -59,6 +86,13 @@ const Header: FC = () => {
create an issue
</Link>
</Box>
<Box sx={{ml: 8}}>
<Tabs value={activeMenu} textColor="inherit" TabIndicatorProps={{style: {background: "white"}}}
onChange={(e, val) => setActiveMenu(val)}>
<Tab label="Custom panel" value={router.home} component={RouterLink} to={`${router.home}${search}`}/>
<Tab label="Dashboards" value={router.dashboards} component={RouterLink} to={`${router.dashboards}${search}`}/>
</Tabs>
</Box>
<Box display="grid" gridTemplateColumns="repeat(3, auto)" gap={1} alignItems="center" ml="auto" mr={0}>
<TimeSelector/>
<ExecutionControls/>
@@ -68,4 +102,4 @@ const Header: FC = () => {
</AppBar>;
};
export default Header;
export default Header;


@@ -1,53 +0,0 @@
import React, {FC, useCallback, useEffect, useState} from "preact/compat";
import {ChangeEvent} from "react";
import Box from "@mui/material/Box";
import FormControlLabel from "@mui/material/FormControlLabel";
import TextField from "@mui/material/TextField";
import BasicSwitch from "../../../../theme/switch";
import {useGraphDispatch, useGraphState} from "../../../../state/graph/GraphStateContext";
import {useAppState} from "../../../../state/common/StateContext";
import debounce from "lodash.debounce";
const StepConfigurator: FC = () => {
const {customStep} = useGraphState();
const graphDispatch = useGraphDispatch();
const [error, setError] = useState(false);
const {time: {period: {step}}} = useAppState();
const onChangeStep = (e: ChangeEvent<HTMLInputElement | HTMLTextAreaElement>) => {
const value = +e.target.value;
if (value > 0) {
graphDispatch({type: "SET_CUSTOM_STEP", payload: value});
setError(false);
} else {
setError(true);
}
};
const debouncedOnChangeStep = useCallback(debounce(onChangeStep, 500), [customStep.value]);
const onChangeEnableStep = () => {
setError(false);
graphDispatch({type: "TOGGLE_CUSTOM_STEP"});
};
useEffect(() => {
if (!customStep.enable) graphDispatch({type: "SET_CUSTOM_STEP", payload: step || 1});
}, [step]);
return <Box display="grid" gridTemplateColumns="auto 120px" alignItems="center">
<FormControlLabel
control={<BasicSwitch checked={customStep.enable} onChange={onChangeEnableStep}/>}
label="Override step value"
/>
{customStep.enable &&
<TextField label="Step value" type="number" size="small" variant="outlined"
defaultValue={customStep.value}
error={error}
helperText={error ? "step is out of allowed range" : " "}
onChange={debouncedOnChangeStep}/>
}
</Box>;
};
export default StepConfigurator;


@@ -1,62 +1,13 @@
import React, {FC} from "preact/compat";
import Alert from "@mui/material/Alert";
import Box from "@mui/material/Box";
import CircularProgress from "@mui/material/CircularProgress";
import Fade from "@mui/material/Fade";
import GraphView from "./Views/GraphView";
import TableView from "./Views/TableView";
import {useAppState} from "../../state/common/StateContext";
import QueryConfigurator from "./Configurator/Query/QueryConfigurator";
import {useFetchQuery} from "./Configurator/Query/useFetchQuery";
import JsonView from "./Views/JsonView";
import Header from "../Header/Header";
import {DisplayTypeSwitch} from "./Configurator/DisplayTypeSwitch";
import GraphSettings from "./Configurator/Graph/GraphSettings";
import React, {FC} from "preact/compat";
import Box from "@mui/material/Box";
import { Outlet } from "react-router-dom";
const HomeLayout: FC = () => {
const {displayType, time: {period}} = useAppState();
const {isLoading, liveData, graphData, error, queryOptions} = useFetchQuery();
return (
<Box id="homeLayout">
<Header/>
<Box p={4} display="grid" gridTemplateRows="auto 1fr" style={{minHeight: "calc(100vh - 64px)"}}>
<QueryConfigurator error={error} queryOptions={queryOptions}/>
<Box height="100%">
{isLoading && <Fade in={isLoading} style={{
transitionDelay: isLoading ? "300ms" : "0ms",
}}>
<Box alignItems="center" justifyContent="center" flexDirection="column" display="flex"
style={{
width: "100%",
maxWidth: "calc(100vw - 64px)",
position: "absolute",
height: "50%",
background: "linear-gradient(rgba(255,255,255,.7), rgba(255,255,255,.7), rgba(255,255,255,0))"
}}>
<CircularProgress/>
</Box>
</Fade>}
{<Box height={"100%"} bgcolor={"#fff"}>
<Box display="grid" gridTemplateColumns="1fr auto" alignItems="center" mx={-4} px={4} mb={2}
borderBottom={1} borderColor="divider">
<DisplayTypeSwitch/>
{displayType === "chart" && <GraphSettings/>}
</Box>
{error && <Alert color="error" severity="error"
style={{fontSize: "14px", whiteSpace: "pre-wrap", marginTop: "20px"}}>
{error}
</Alert>}
{graphData && period && (displayType === "chart") && <GraphView data={graphData}/>}
{liveData && (displayType === "code") && <JsonView data={liveData}/>}
{liveData && (displayType === "table") && <TableView data={liveData}/>}
</Box>}
</Box>
</Box>
</Box>
);
return <Box>
<Header/>
<Outlet/>
</Box>;
};
export default HomeLayout;
export default HomeLayout;


@@ -1,6 +1,5 @@
import React, {FC, useMemo, useState} from "preact/compat";
import {hexToRGB} from "../../utils/color";
import {useAppState} from "../../state/common/StateContext";
import {LegendItem} from "../../utils/uplot/types";
import "./legend.css";
import {getDashLine} from "../../utils/uplot/helpers";
@@ -8,12 +7,11 @@ import Tooltip from "@mui/material/Tooltip";
export interface LegendProps {
labels: LegendItem[];
query: string[];
onChange: (item: LegendItem, metaKey: boolean) => void;
}
const Legend: FC<LegendProps> = ({labels, onChange}) => {
const {query} = useAppState();
const Legend: FC<LegendProps> = ({labels, query, onChange}) => {
const [copiedValue, setCopiedValue] = useState("");
const groups = useMemo(() => {


@@ -1,14 +1,13 @@
.legendWrapper {
position: relative;
display: grid;
grid-template-columns: repeat(auto-fit, minmax(400px, 1fr));
grid-gap: 20px;
display: flex;
flex-wrap: wrap;
margin-top: 20px;
cursor: default;
}
.legendGroup {
margin-bottom: 24px;
margin: 0 12px 24px 0;
}
.legendGroupTitle {
@@ -29,7 +28,7 @@
}
.legendItem {
display: inline-grid;
display: grid;
grid-template-columns: auto auto;
grid-gap: 6px;
align-items: start;


@@ -1,7 +1,5 @@
import React, {FC, useCallback, useEffect, useRef, useState} from "preact/compat";
import {useAppDispatch, useAppState} from "../../state/common/StateContext";
import uPlot, {AlignedData as uPlotData, Options as uPlotOptions, Series as uPlotSeries, Range, Scales, Scale} from "uplot";
import {useGraphState} from "../../state/graph/GraphStateContext";
import {defaultOptions} from "../../utils/uplot/helpers";
import {dragChart} from "../../utils/uplot/events";
import {getAxes, getMinMaxBuffer} from "../../utils/uplot/axes";
@@ -12,23 +10,29 @@ import throttle from "lodash.throttle";
import "uplot/dist/uPlot.min.css";
import "./tooltip.css";
import useResize from "../../hooks/useResize";
import {TimeParams} from "../../types";
import {YaxisState} from "../../state/graph/reducer";
export interface LineChartProps {
metrics: MetricResult[];
data: uPlotData;
series: uPlotSeries[];
metrics: MetricResult[];
data: uPlotData;
period: TimeParams;
yaxis: YaxisState;
series: uPlotSeries[];
unit?: string;
setPeriod: ({from, to}: {from: Date, to: Date}) => void;
container: HTMLDivElement | null
}
enum typeChartUpdate {xRange = "xRange", yRange = "yRange", data = "data"}
const LineChart: FC<LineChartProps> = ({data, series, metrics = []}) => {
const dispatch = useAppDispatch();
const {time: {period}} = useAppState();
const {yaxis} = useGraphState();
const LineChart: FC<LineChartProps> = ({data, series, metrics = [],
period, yaxis, unit, setPeriod, container}) => {
const uPlotRef = useRef<HTMLDivElement>(null);
const [isPanning, setPanning] = useState(false);
const [xRange, setXRange] = useState({min: period.start, max: period.end});
const [uPlotInst, setUPlotInst] = useState<uPlot>();
const layoutSize = useResize(document.getElementById("homeLayout"));
const layoutSize = useResize(container);
const tooltip = document.createElement("div");
tooltip.className = "u-tooltip";
@@ -36,7 +40,7 @@ const LineChart: FC<LineChartProps> = ({data, series, metrics = []}) => {
const tooltipOffset = {left: 0, top: 0};
const setScale = ({min, max}: { min: number, max: number }): void => {
dispatch({type: "SET_PERIOD", payload: {from: new Date(min * 1000), to: new Date(max * 1000)}});
setPeriod({from: new Date(min * 1000), to: new Date(max * 1000)});
};
const throttledSetScale = useCallback(throttle(setScale, 500), []);
const setPlotScale = ({u, min, max}: { u: uPlot, min: number, max: number }) => {
@@ -73,7 +77,7 @@ const LineChart: FC<LineChartProps> = ({data, series, metrics = []}) => {
if (tooltipIdx.dataIdx === u.cursor.idx) return;
tooltipIdx.dataIdx = u.cursor.idx || 0;
if (tooltipIdx.seriesIdx !== null && tooltipIdx.dataIdx !== undefined) {
setTooltip({u, tooltipIdx, metrics, series, tooltip, tooltipOffset});
setTooltip({u, tooltipIdx, metrics, series, tooltip, tooltipOffset, unit});
}
};
@@ -81,7 +85,7 @@ const LineChart: FC<LineChartProps> = ({data, series, metrics = []}) => {
if (tooltipIdx.seriesIdx === sidx) return;
tooltipIdx.seriesIdx = sidx;
sidx && tooltipIdx.dataIdx !== undefined
? setTooltip({u, tooltipIdx, metrics, series, tooltip, tooltipOffset})
? setTooltip({u, tooltipIdx, metrics, series, tooltip, tooltipOffset, unit})
: tooltip.style.display = "none";
};
const getRangeX = (): Range.MinMax => [xRange.min, xRange.max];
@@ -101,9 +105,9 @@ const LineChart: FC<LineChartProps> = ({data, series, metrics = []}) => {
const options: uPlotOptions = {
...defaultOptions,
series,
axes: getAxes(series),
axes: getAxes(series, unit),
scales: {...getScales()},
width: layoutSize.width ? layoutSize.width - 64 : 400,
width: layoutSize.width || 400,
plugins: [{hooks: {ready: onReadyChart, setCursor, setSeries: seriesFocus}}],
};
@@ -123,7 +127,7 @@ const LineChart: FC<LineChartProps> = ({data, series, metrics = []}) => {
uPlotInst.setData(data);
break;
}
uPlotInst.redraw();
if (!isPanning) uPlotInst.redraw();
};
useEffect(() => setXRange({min: period.start, max: period.end}), [period]);


@@ -0,0 +1,53 @@
import React, {FC, useEffect, useMemo, useState} from "preact/compat";
import getDashboardSettings from "./getDashboardSettings";
import {DashboardRow, DashboardSettings} from "../../types";
import Box from "@mui/material/Box";
import Alert from "@mui/material/Alert";
import Tabs from "@mui/material/Tabs";
import Tab from "@mui/material/Tab";
import PredefinedDashboard from "./PredefinedDashboard";
import get from "lodash.get";
const DashboardLayout: FC = () => {
const [dashboards, setDashboards] = useState<DashboardSettings[]>();
const [tab, setTab] = useState(0);
const filename = useMemo(() => get(dashboards, [tab, "filename"], ""), [dashboards, tab]);
const rows = useMemo(() => {
return get(dashboards, [tab, "rows"], []) as DashboardRow[];
}, [dashboards, tab]);
useEffect(() => {
getDashboardSettings().then(d => d.length && setDashboards(d));
}, []);
return <>
{!dashboards && <Alert color="info" severity="info" sx={{m: 4}}>Dashboards not found</Alert>}
{dashboards && <>
<Box sx={{ borderBottom: 1, borderColor: "divider" }}>
<Tabs value={tab} onChange={(e, val) => setTab(val)} aria-label="dashboard-tabs">
{dashboards && dashboards.map((d, i) =>
<Tab key={i} label={d.title || d.filename} id={`tab-${i}`} aria-controls={`tabpanel-${i}`}/>
)}
</Tabs>
</Box>
<Box>
{Array.isArray(rows) && !!rows.length
? rows.map((r, i) =>
<PredefinedDashboard
key={`${tab}_${i}`}
index={i}
filename={filename}
title={r.title}
panels={r.panels}/>)
: <Alert color="error" severity="error" sx={{m: 4}}>
<code>&quot;rows&quot;</code> not found. Check the configuration file <b>{filename}</b>.
</Alert>}
</Box>
</>}
</>;
};
export default DashboardLayout;


@@ -0,0 +1,117 @@
import React, {FC, useEffect, useMemo, useState} from "preact/compat";
import {MouseEvent as ReactMouseEvent} from "react";
import {DashboardRow} from "../../types";
import Box from "@mui/material/Box";
import Accordion from "@mui/material/Accordion";
import AccordionSummary from "@mui/material/AccordionSummary";
import AccordionDetails from "@mui/material/AccordionDetails";
import Grid from "@mui/material/Grid";
import ExpandMoreIcon from "@mui/icons-material/ExpandMore";
import Typography from "@mui/material/Typography";
import PredefinedPanels from "./PredefinedPanels";
import Alert from "@mui/material/Alert";
import {CSSProperties} from "@mui/styles";
import useResize from "../../hooks/useResize";
export interface PredefinedDashboardProps extends DashboardRow {
filename: string;
index: number;
}
const resizerStyle: CSSProperties = {
position: "absolute",
top: 0,
bottom: 0,
width: "10px",
opacity: 0,
cursor: "ew-resize",
};
const PredefinedDashboard: FC<PredefinedDashboardProps> = ({index, title, panels, filename}) => {
const windowSize = useResize(document.body);
const sizeSection = useMemo(() => {
return windowSize.width / 12;
}, [windowSize]);
const [panelsWidth, setPanelsWidth] = useState<number[]>([]);
useEffect(() => {
setPanelsWidth(panels.map(p => p.width || 12));
}, [panels]);
const [resize, setResize] = useState({start: 0, target: 0, enable: false});
const handleMouseMove = (e: MouseEvent) => {
if (!resize.enable) return;
const {start} = resize;
const sectionCount = Math.ceil((start - e.clientX)/sizeSection);
if (Math.abs(sectionCount) >= 12) return;
const width = panelsWidth.map((p, i) => {
return p - (i === resize.target ? sectionCount : 0);
});
setPanelsWidth(width);
};
const handleMouseDown = (e: ReactMouseEvent<HTMLButtonElement, MouseEvent>, i: number) => {
setResize({
start: e.clientX,
target: i,
enable: true,
});
};
const handleMouseUp = () => {
setResize({
...resize,
enable: false
});
};
useEffect(() => {
window.addEventListener("mousemove", handleMouseMove);
window.addEventListener("mouseup", handleMouseUp);
return () => {
window.removeEventListener("mousemove", handleMouseMove);
window.removeEventListener("mouseup", handleMouseUp);
};
}, [resize]);
return <Accordion defaultExpanded={!index} sx={{boxShadow: "none"}}>
<AccordionSummary
sx={{px: 3, bgcolor: "rgba(227, 242, 253, 0.6)"}}
aria-controls={`panel${index}-content`}
id={`panel${index}-header`}
expandIcon={<ExpandMoreIcon />}
>
<Box display="flex" alignItems="center" width={"100%"}>
{title && <Typography variant="h6" fontWeight="bold" sx={{mr: 2}}>{title}</Typography>}
{panels && <Typography variant="body2" fontStyle="italic">({panels.length} panels)</Typography>}
</Box>
</AccordionSummary>
<AccordionDetails sx={{display: "grid", gridGap: "10px"}}>
<Grid container spacing={2}>
{Array.isArray(panels) && !!panels.length
? panels.map((p, i) =>
<Grid key={i} item xs={panelsWidth[i]} sx={{transition: "200ms"}}>
<Box position={"relative"} height={"100%"}>
<PredefinedPanels
title={p.title}
description={p.description}
unit={p.unit}
expr={p.expr}
filename={filename}
showLegend={p.showLegend}/>
<button style={{...resizerStyle, right: 0}}
onMouseDown={(e) => handleMouseDown(e, i)}/>
</Box>
</Grid>)
: <Alert color="error" severity="error" sx={{m: 4}}>
<code>&quot;panels&quot;</code> not found. Check the configuration file <b>{filename}</b>.
</Alert>
}
</Grid>
</AccordionDetails>
</Accordion>;
};
export default PredefinedDashboard;
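
The resize logic in `handleMouseMove` converts the horizontal drag distance into a number of 12-column grid sections and subtracts it from the width of the dragged panel. A minimal sketch of that arithmetic as a pure function, assuming hypothetical names (`resizePanels`, `GRID_COLUMNS`) that are not part of the diff:

```typescript
// Hypothetical constant: the diff hard-codes 12, matching the MUI grid column count.
const GRID_COLUMNS = 12;

// Hypothetical pure version of the width update performed in handleMouseMove.
const resizePanels = (
  widths: number[],    // current width of each panel, in grid columns
  target: number,      // index of the panel being resized
  startX: number,      // clientX recorded on mousedown
  currentX: number,    // clientX of the current mousemove event
  windowWidth: number  // width used to derive the size of one grid section
): number[] => {
  const sizeSection = windowWidth / GRID_COLUMNS;
  const sectionCount = Math.ceil((startX - currentX) / sizeSection);
  // Ignore drags that span the whole grid or more, as the component does.
  if (Math.abs(sectionCount) >= GRID_COLUMNS) return widths;
  return widths.map((w, i) => (i === target ? w - sectionCount : w));
};

// Usage: dragging the first panel's handle 200px to the left in a 1200px window.
console.log(resizePanels([6, 6], 0, 600, 400, 1200));
```

Like the code in the diff, this sketch does not clamp the resulting width, so a caller that needs values strictly within the grid range would have to add that check.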


@@ -0,0 +1,132 @@
import React, {FC, useEffect, useMemo, useRef, useState} from "preact/compat";
import Box from "@mui/material/Box";
import {PanelSettings} from "../../types";
import Tooltip from "@mui/material/Tooltip";
import InfoIcon from "@mui/icons-material/Info";
import Typography from "@mui/material/Typography";
import {useAppDispatch, useAppState} from "../../state/common/StateContext";
import {AxisRange, YaxisState} from "../../state/graph/reducer";
import GraphView from "../CustomPanel/Views/GraphView";
import Alert from "@mui/material/Alert";
import {useFetchQuery} from "../../hooks/useFetchQuery";
import Spinner from "../common/Spinner";
import StepConfigurator from "../CustomPanel/Configurator/Query/StepConfigurator";
import GraphSettings from "../CustomPanel/Configurator/Graph/GraphSettings";
import {CustomStep} from "../../state/graph/reducer";
import {marked} from "marked";
import "./dashboard.css";
export interface PredefinedPanelsProps extends PanelSettings {
filename: string;
}
const PredefinedPanels: FC<PredefinedPanelsProps> = ({
title,
description,
unit,
expr,
showLegend,
filename
}) => {
const {time: {period}} = useAppState();
const dispatch = useAppDispatch();
const containerRef = useRef<HTMLDivElement>(null);
const [visible, setVisible] = useState(true);
const [customStep, setCustomStep] = useState<CustomStep>({enable: false, value: period.step || 1});
const [yaxis, setYaxis] = useState<YaxisState>({
limits: {
enable: false,
range: {"1": [0, 0]}
}
});
const validExpr = useMemo(() => Array.isArray(expr) && expr.every(q => typeof q === "string"), [expr]);
const {isLoading, graphData, error} = useFetchQuery({
predefinedQuery: validExpr ? expr : [],
display: "chart",
visible,
customStep,
});
const setYaxisLimits = (limits: AxisRange) => {
// Copy the nested "limits" object too, so the previous state is never mutated.
setYaxis({...yaxis, limits: {...yaxis.limits, range: limits}});
};
const toggleEnableLimits = () => {
setYaxis({...yaxis, limits: {...yaxis.limits, enable: !yaxis.limits.enable}});
};
const setPeriod = ({from, to}: {from: Date, to: Date}) => {
dispatch({type: "SET_PERIOD", payload: {from, to}});
};
useEffect(() => {
const observer = new IntersectionObserver((entries) => {
entries.forEach(entry => setVisible(entry.isIntersecting));
}, { threshold: 0.1 });
if (containerRef.current) observer.observe(containerRef.current);
return () => {
// The ref may already be null on unmount, so disconnect the observer itself.
observer.disconnect();
};
}, []);
if (!validExpr) return <Alert color="error" severity="error" sx={{m: 4}}>
<code>&quot;expr&quot;</code> not found. Check the configuration file <b>{filename}</b>.
</Alert>;
return <Box border="1px solid" borderRadius="2px" borderColor="divider" width={"100%"} height={"100%"} ref={containerRef}>
<Box px={2} py={1} display="flex" flexWrap={"wrap"}
width={"100%"}
alignItems="center" justifyContent="space-between" borderBottom={"1px solid"} borderColor={"divider"}>
<Tooltip arrow componentsProps={{tooltip: {sx: {maxWidth: "100%"}}}}
title={<Box sx={{p: 1}}>
{description && <Box mb={2}>
<Typography fontWeight={"500"} sx={{mb: 0.5, textDecoration: "underline"}}>Description:</Typography>
<div className="panelDescription" dangerouslySetInnerHTML={{__html: marked.parse(description)}}/>
</Box>}
<Box>
<Typography fontWeight={"500"} sx={{mb: 0.5, textDecoration: "underline"}}>Queries:</Typography>
<div>
{expr.map((e, i) => <Box key={`${i}_${e}`} mb={0.5}>{e}</Box>)}
</div>
</Box>
</Box>}>
<InfoIcon color="info" sx={{mr: 1}}/>
</Tooltip>
<Typography component={"div"} variant="subtitle1" fontWeight={500} sx={{mr: 2, py: 1, flexGrow: "1"}}>
{title || ""}
</Typography>
<Box mr={2} py={1}>
<StepConfigurator defaultStep={period.step} customStepEnable={customStep.enable}
setStep={(value) => setCustomStep({...customStep, value: value})}
toggleEnableStep={() => setCustomStep({...customStep, enable: !customStep.enable})}/>
</Box>
<GraphSettings yaxis={yaxis} setYaxisLimits={setYaxisLimits} toggleEnableLimits={toggleEnableLimits}/>
</Box>
<Box px={2} pb={2}>
{isLoading && <Spinner isLoading={true} height={"500px"}/>}
{error && <Alert color="error" severity="error" sx={{whiteSpace: "pre-wrap", mt: 2}}>{error}</Alert>}
{graphData && <GraphView
data={graphData}
period={period}
customStep={customStep}
query={expr}
yaxis={yaxis}
unit={unit}
showLegend={showLegend}
setYaxisLimits={setYaxisLimits}
setPeriod={setPeriod}/>
}
</Box>
</Box>;
};
export default PredefinedPanels;
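
The `validExpr` check in `PredefinedPanels` guards both the query hook and the rendering of the `expr` list. A short sketch of the same check written as a type guard, assuming a hypothetical name `isValidExpr` that does not appear in the diff:

```typescript
// Hypothetical type guard (not part of the diff), equivalent to the `validExpr` memo:
// the value must be an array and every element must be a string.
const isValidExpr = (expr: unknown): expr is string[] =>
  Array.isArray(expr) && expr.every(q => typeof q === "string");

// Usage: values read from a dashboard config file can be narrowed before use.
const fromConfig: unknown = ["sum(rate(http_requests_total[5m]))"];
if (isValidExpr(fromConfig)) {
  // Inside this branch TypeScript treats fromConfig as string[].
  console.log(fromConfig.length);
}
```

An empty array passes this check, just as it does in the original expression, so a panel with `expr: []` would render without queries rather than show the error alert.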


@@ -0,0 +1,18 @@
.panelDescription ul {
line-height: 2.2;
}
.panelDescription a {
color: #FFFFFF;
}
.panelDescription code {
display: inline;
max-width: 100%;
padding: 4px 6px;
background-color: rgba(0, 0, 0, 0.3);
border-radius: 2px;
font-weight: 400;
font-size: inherit;
color: #FFFFFF;
}

Some files were not shown because too many files have changed in this diff.